HANDBOOK OF MATERIALS MODELING
HANDBOOK OF MATERIALS MODELING
Part A. Methods
Editor: Sidney Yip, Massachusetts Institute of Technology
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN-10 1-4020-3287-0 (HB)
ISBN-10 1-4020-3286-2 (e-book)
ISBN-13 978-1-4020-3287-5 (HB)
ISBN-13 978-1-4020-3286-8 (e-book)
Springer Dordrecht, Berlin, Heidelberg, New York
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
Printed on acid-free paper
All Rights Reserved
© 2005 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Printed in The Netherlands
CONTENTS

PART A – METHODS

Preface vii
List of Subject Editors ix
List of Contributors xi
Detailed Table of Contents xxix
Introduction 1
Chapter 1. Electronic Scale 7
Chapter 2. Atomistic Scale 449
Chapter 3. Mesoscale/Continuum Methods 1069
Chapter 4. Mathematical Methods 1215

PART B – MODELS

Preface vii
List of Subject Editors ix
List of Contributors xi
Detailed Table of Contents xxix
Chapter 5. Rate Processes 1565
Chapter 6. Crystal Defects 1849
Chapter 7. Microstructure 2081
Chapter 8. Fluids 2409
Chapter 9. Polymers and Soft Matter 2553
Plenary Perspectives 2657
Index of Contributors 2943
Index of Keywords 2947
PREFACE

This Handbook contains a set of articles introducing the modeling and simulation of materials from the standpoint of basic methods and studies. The intent is to provide a compendium that is foundational to an emerging field of computational research, a new discipline that may now be called Computational Materials. This area has become sufficiently diverse that any attempt to cover all the pertinent topics would be futile. Even with a limited scope, the present undertaking has required the dedicated efforts of 13 Subject Editors to set the scope of nine chapters, solicit authors, and collect the manuscripts. The contributors were asked to target students and non-specialists as the primary audience, to provide an accessible entry into the field, and to offer references for further reading. With no precedents to follow, the editors and authors were guided only by a common goal – to produce a volume that would set a standard toward defining the broad community and stimulating its growth.

The idea of a reference work on materials modeling surfaced in conversations with Peter Binfield, then the Reference Works Editor at Kluwer Academic Publishers, in the spring of 1999. The rationale at the time already seemed quite clear – the field of computational materials research was taking off, powerful computer capabilities were becoming increasingly available, and many sectors of the scientific community were getting involved in the enterprise. It was felt that a volume that could articulate the broad foundations of computational materials and connect with the established fields of computational physics and computational chemistry through common fundamental scientific challenges would be timely. After five years, none of the conditions have changed: the need remains for a defining reference volume, interest in materials modeling and simulation is further intensifying, and the community continues to grow.

In this work materials modeling is treated in nine chapters, loosely grouped into two parts. Part A, emphasizing foundations and methodology, consists of three chapters describing theory and simulation at the electronic, atomistic, and mesoscale levels, and a chapter on analysis-based methods. Part B is more concerned with models and basic applications. There are five chapters describing basic problems in materials modeling and simulation: rate processes, crystal defects, microstructure, fluids, and polymers and soft matter.
In addition, this part contains a collection of commentaries on a range of issues in materials modeling, written in a free-style format by experienced individuals with definite views that could enlighten the future members of the community. See the opening Introduction for further comments on modeling and simulation and an overview of the Handbook contents.

Any organizational undertaking of this magnitude can only be a collective effort. Yet the fate of this volume would not be so certain without the critical contributions from a few individuals. My gratitude goes to Liesbeth Mol, Peter Binfield's successor at Springer Science + Business Media, for continued faith and support, Ju Li and Xiaofeng Qian for managing the websites and manuscript files, and Tim Kaxiras for stepping in at a critical stage of the project. To all the authors who found time in your hectic schedules to write the contributions, I am deeply appreciative and trust you are not disappointed. To the Subject Editors I say the Handbook is a reality only because of your perseverance and sacrifices. It has been my good fortune to have colleagues who were generous with advice and assistance. I hope this work motivates them even more to continue sharing their knowledge and insights in the work ahead.

Sidney Yip
Department of Nuclear Science and Engineering, Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
LIST OF SUBJECT EDITORS

Martin Bazant, Massachusetts Institute of Technology (Chapter 4)
Bruce Boghosian, Tufts University (Chapter 8)
Richard Catlow, Royal Institution, UK (Chapter 6)
Long-Qing Chen, Pennsylvania State University (Chapter 7)
William Curtin, Brown University (Chapter 1, Chapter 2, Chapter 4)
Tomas Diaz de la Rubia, Lawrence Livermore National Laboratory (Chapter 6)
Nicolas Hadjiconstantinou, Massachusetts Institute of Technology (Chapter 8)
Mark F. Horstemeyer, Mississippi State University (Chapter 3)
Efthimios Kaxiras, Harvard University (Chapter 1, Chapter 2)
L. Mahadevan, Harvard University (Chapter 9)
Dimitrios Maroudas, University of Massachusetts (Chapter 4)
Nicola Marzari, Massachusetts Institute of Technology (Chapter 1)
Horia Metiu, University of California Santa Barbara (Chapter 5)
Gregory C. Rutledge, Massachusetts Institute of Technology (Chapter 9)
David J. Srolovitz, Princeton University (Chapter 7)
Bernhardt L. Trout, Massachusetts Institute of Technology (Chapter 1)
Dieter Wolf, Argonne National Laboratory (Chapter 6)
Sidney Yip, Massachusetts Institute of Technology (Chapter 1, Chapter 2, Chapter 6, Plenary Perspectives)
LIST OF CONTRIBUTORS

Farid F. Abraham IBM Almaden Research Center, San Jose, California
[email protected] P20
Robert Averback Accelerator Laboratory, P.O. Box 43 (Pietari Kalmin k. 2), 00014, University of Helsinki, Finland; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Illinois, USA
[email protected] 6.2
Francis J. Alexander Los Alamos National Laboratory, Los Alamos, NM, USA
[email protected] 8.7
D.J. Bammann Sandia National Laboratories, Livermore, CA, USA
[email protected] 3.2
N.R. Aluru Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 8.3
K. Barmak Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected] 7.19
Filippo de Angelis Istituto CNR di Scienze e Tecnologie Molecolari ISTM, Dipartimento di Chimica, Università di Perugia, Via Elce di Sotto, I-06123, Perugia, Italy
[email protected] 1.4
Stefano Baroni DEMOCRITOS-INFM, SISSA-ISAS, Trieste, Italy
[email protected] 1.10
Emilio Artacho University of Cambridge, Cambridge, UK
[email protected] 1.5
Rodney J. Bartlett Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
[email protected] 1.3
Mark Asta Northwestern University, Evanston, IL, USA
[email protected] 1.16
Corbett Battaile Sandia National Laboratories, Albuquerque, NM, USA
[email protected] 7.17
Martin Z. Bazant Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 4.1, 4.10 Noam Bernstein Naval Research Laboratory, Washington, DC, USA
[email protected] 2.24 Kurt Binder Institut fuer Physik, Johannes Gutenberg Universitaet Mainz, Staudinger Weg 7, 55099 Mainz, Germany
[email protected] P19 Peter E. Blöchl Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
[email protected] 1.6 Bruce M. Boghosian Department of Mathematics, Tufts University, Bromfield-Pearson Hall, Medford, MA 02155, USA
[email protected] 8.1 Jean Pierre Boon Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles, 1050-Bruxelles, Belgium
[email protected] P21
Russel Caflisch University of California at Los Angeles, Los Angeles, CA, USA
[email protected] 7.15 Wei Cai Department of Mechanical Engineering, Stanford University, Stanford, CA 94305-4040, USA
[email protected] 2.21 Roberto Car Department of Chemistry and Princeton Materials Institute, Princeton University, Princeton, NJ, USA
[email protected] 1.4 Paolo Carloni International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy
[email protected] 1.13 Emily A. Carter Department of Mechanical and Aerospace Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
[email protected] 1.8
Iain D. Boyd University of Michigan, Ann Arbor, MI, USA
[email protected] P22
C.R.A. Catlow Davy Faraday Laboratory, The Royal Institution, 21 Albemarle Street, London W1S 4BS, UK; Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
[email protected] 2.7, 6.1
Vasily V. Bulatov Lawrence Livermore National Laboratory, University of California, Livermore, CA 94550, USA
[email protected] P7
Gerbrand Ceder Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 1.17, 1.18
Alan V. Chadwick Functional Materials Group, School of Physical Sciences, University of Kent, Canterbury, Kent CT2 7NR, UK
[email protected] 6.5
Marvin L. Cohen University of California at Berkeley and Lawrence Berkeley National Laboratory, Berkeley, CA, USA
[email protected] 1.2
Hue Sun Chan University of Toronto, Toronto, Ont., Canada
[email protected] 5.16
John Corish Department of Chemistry, Trinity College, University of Dublin, Dublin 2, Ireland
[email protected] 6.4
James R. Chelikowsky University of Minnesota, Minneapolis, MN, USA
[email protected] 1.7 Long-Qing Chen Department of Materials Science and Engineering, Penn State University, University Park, PA 16802, USA
[email protected] 7.1 I-Wei Chen Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104-6282, USA
[email protected] P27 Sow-Hsin Chen Department of Nuclear Engineering, MIT, Cambridge, MA 02139, USA
[email protected] P28 Christophe Chipot Equipe de dynamique des assemblages membranaires, Unit´e mixte de recherche CNRS/UHP 7565, Institut nanc´een de chimie mol´eculaire, Universit´e Henri Poincar´e, BP 239, 54506 Vanduvre-l`es-Nancy cedex, France 2.26 Giovanni Ciccotti INFM and Dipartimento di Fisica, Universit`a “La Sapienza,” Piazzale Aldo Moro, 2, 00185 Roma, Italy
[email protected] 2.17, 5.4
Peter V. Coveney Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
[email protected] 8.5 Jean-Paul Crocombette CEA Saclay, DEN-SRMP, 91191 Gif/Yvette cedex, France
[email protected] 2.28 Darren Crowdy Department of Mathematics, Imperial College, London, UK
[email protected] 4.10 Gábor Csányi Cavendish Laboratory, University of Cambridge, UK
[email protected] P16 Nguyen Ngoc Cuong Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 4.15 Christoph Dellago Institute of Experimental Physics, University of Vienna, Vienna, Austria
[email protected] 5.3
J.D. Doll Department of Chemistry, Brown University, Providence, RI, USA
[email protected] 5.2 Patrick S. Doyle Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 9.7
Diana Farkas Department of Materials Science and Engineering, Virginia Tech, Blacksburg, VA 24061, USA
[email protected] 2.23 Clemens J. Först Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
[email protected] 1.6
Weinan E Department of Mathematics, Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544-1000, USA
[email protected] 4.13
Glenn H. Fredrickson Department of Chemical Engineering & Materials, The University of California at Santa Barbara, Santa Barbara, CA, USA
[email protected] 9.9
Jens Eggers School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK
[email protected] 4.9
Daan Frenkel FOM Institute for Atomic and Molecular Physics, Amsterdam, The Netherlands
[email protected] 2.14
Pep Español Dept. Física Fundamental, Universidad Nacional de Educación a Distancia, Aptdo. 60141, E-28080 Madrid, Spain
[email protected] 8.6 J.W. Evans Ames Laboratory - USDOE, and Department of Mathematics, Iowa State University, Ames, Iowa, 50011, USA
[email protected] 5.12 Denis J. Evans Research School of Chemistry, Australian National University, Canberra, ACT, Australia
[email protected] P17 Michael L. Falk University of Michigan, Ann Arbor, MI, USA
[email protected] 4.3
Julian D. Gale Nanochemistry Research Institute, Department of Applied Chemistry, Curtin University of Technology, Perth, 6845, Western Australia
[email protected] 1.5, 2.3 Giulia Galli Lawrence Livermore National Laboratory, CA, USA
[email protected] P8 Venkat Ganesan Department of Chemical Engineering, The University of Texas at Austin, Austin, TX, USA
[email protected] 9.9 Alberto García Universidad del País Vasco, Bilbao, Spain
[email protected] 1.5
C. William Gear Princeton University, Princeton, NJ, USA
[email protected] 4.11 Timothy C. Germann Applied Physics Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
[email protected] 2.11 Eitan Geva Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, USA
[email protected] 5.9 Nasr M. Ghoniem Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, CA 90095-1597, USA
[email protected] 7.11, P11, P30 Paolo Giannozzi Scuola Normale Superiore and National Simulation Center, INFM-DEMOCRITOS, Pisa, Italy
[email protected] 1.4, 1.10 E. Van der Giessen University of Groningen, Groningen, The Netherlands
[email protected] 3.4 Daniel T. Gillespie Dan T Gillespie Consulting, 30504 Cordoba Place, Castaic, CA 91384, USA
[email protected] 5.11 George Gilmer Lawrence Livermore National Laboratory, P.O. box 808, Livermore, CA 94550, USA
[email protected] 2.10
William A. Goddard III Materials and Process Simulation Center, California Institute of Technology, Pasadena, CA 91125, USA
[email protected] P9 Axel Groß Physik-Department T30, TU München, 85747 Garching, Germany
[email protected] 5.10 Peter Gumbsch Institut für Zuverlässigkeit von Bauteilen und Systemen (izbs), Universität Karlsruhe (TH), Kaiserstr. 12, 76131 Karlsruhe, Germany and Fraunhofer Institut für Werkstoffmechanik IWM, Wöhlerstr. 11, D-79194 Freiburg, Germany
[email protected] P10 François Gygi Lawrence Livermore National Laboratory, CA, USA P8 Nicolas G. Hadjiconstantinou Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
[email protected] 8.1, 8.8 J.P. Hirth Ohio State and Washington State Universities, 114 E. Ramsey Canyon Rd., Hereford, AZ 85615, USA
[email protected] P31 K.M. Ho Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA 1.15
xvi
List of contributors
Wesley P. Hoffman Air Force Research Laboratory, Edwards, CA, USA
[email protected] P37
C.S. Jayanthi Department of Physics, University of Louisville, Louisville, KY 40292
[email protected] P39
Wm.G. Hoover Department of Applied Science, University of California at Davis/Livermore and Lawrence Livermore National Laboratory, Livermore, California, 94551-7808
[email protected] P34
Raymond Jeanloz University of California, Berkeley, CA, USA
[email protected] P25
M.F. Horstemeyer Mississippi State University, Mississippi State, MS, USA
[email protected] 3.1, 3.5 Thomas Y. Hou California Institute of Technology, Pasadena, CA, USA
[email protected] 4.14 Hanchen Huang Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590, USA
[email protected] 2.30 Gerhard Hummer National Institutes of Health, Bethesda, MD, USA
[email protected] 4.11 M. Saiful Islam Chemistry Division, SBMS, University of Surrey, Guildford GU2 7XH, UK
[email protected] 6.6 Seogjoo Jang Chemistry Department, Brookhaven National Laboratory, Upton, New York 11973-5000, USA
[email protected] 5.9
Pablo Jensen Laboratoire de Physique de la Matière Condensée et des Nanostructures, CNRS and Université Claude Bernard Lyon-1, 69622 Villeurbanne Cedex, France
[email protected] 5.13 Yongmei M. Jin Department of Ceramic and Materials Engineering, Rutgers University, 607 Taylor Road, Piscataway, NJ 08854, USA
[email protected] 7.12 Xiaozhong Jin Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 8.3 J.D. Joannopoulos Francis Wright Davis Professor of Physics, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
[email protected] P4 Javier Junquera Rutgers University, New Jersey, USA
[email protected] 1.5 Jo˜ao F. Justo Escola Polit´ecnica, Universidade de S˜ao Paulo, S˜ao Paulo, Brazil
[email protected] 2.4
Hideo Kaburaki Japan Atomic Energy Research Institute, Tokai, Ibaraki, Japan
[email protected] 2.18 Rajiv K. Kalia Collaboratory for Advanced Computing and Simulations, Department of Physics & Astronomy, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
[email protected] 2.25 Raymond Kapral Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
[email protected] 2.17, 5.4 Alain Karma Northeastern University, Boston, MA, USA
[email protected] 7.2 Johannes Kästner Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
[email protected] 1.6 Markos A. Katsoulakis Department of Mathematics and Statistics, University of Massachusetts - Amherst, Amherst, MA 01002, USA
[email protected] 4.12 Efthimios Kaxiras Department of Physics, Harvard University, Cambridge, MA 02138, USA
[email protected] 2.1, 8.4
Ronald J. Kerans Air Force Research Laboratory, Materials and Manufacturing Directorate, Wright-Patterson Air Force Base, Ohio, USA
[email protected] P38 Ioannis G. Kevrekidis Princeton University, Princeton, NJ, USA
[email protected] 4.11 Armen G. Khachaturyan Department of Ceramic and Materials Engineering, Rutgers University, 607 Taylor Road, Piscataway, NJ 08854, USA
[email protected] 7.12 T.A. Khraishi University of New Mexico, Albuquerque, NM, USA
[email protected] 3.3 Seong Gyoon Kim Kunsan National University, Kunsan 573-701, Korea
[email protected] 7.3 Won Tae Kim Chongju University, Chongju 360-764, Korea
[email protected] 7.3 Michael L. Klein Center for Molecular Modeling, Chemistry Department, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104-6323, USA
[email protected] 2.26 Walter Kob Laboratoire des Verres, Université Montpellier 2, 34095 Montpellier, France
[email protected] P24
David A. Kofke University at Buffalo, The State University of New York, Buffalo, New York, USA
[email protected] 2.14 Maurice de Koning University of São Paulo, São Paulo, Brazil
[email protected] 2.15 Anatoli Korkin Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA 1.3 Kurt Kremer MPI for Polymer Research, D-55021 Mainz, Germany
[email protected] P5
C. Leahy Department of Physics, University of Louisville, Louisville, KY 40292, USA P39 R. LeSar Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
[email protected] 7.14 Ju Li Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
[email protected] 2.8, 2.19, 2.31 Xiantao Li Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
[email protected] 4.13
Carl E. Krill III Materials Division, University of Ulm, Albert-Einstein-Allee 47, D-89081 Ulm, Germany
[email protected] 7.6
Gang Li Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 8.3
Ladislas P. Kubin LEM, CNRS-ONERA, 29 Av. de la Division Leclerc, BP 72, 92322 Chatillon Cedex, France
[email protected] P33
Vincent L. Lignères Department of Chemistry, Princeton University, Princeton, NJ 08544, USA 1.8
D.P. Landau Center for Simulational Physics, The University of Georgia, Athens, GA 30602, USA
[email protected] P2 James S. Langer Department of Physics, University of California, Santa Barbara, CA 93106-9530, USA
[email protected] 4.3, P14
Turab Lookman Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
[email protected] 7.5 Steven G. Louie Department of Physics, University of California at Berkeley and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
[email protected] 1.11
John Lowengrub University of California, Irvine, California, USA
[email protected] 7.8
Richard M. Martin University of Illinois at Urbana, Urbana, IL, USA
[email protected] 1.5
Gang Lu Division of Engineering and Applied Science, Harvard University, Cambridge, Massachusetts, USA
[email protected] 2.20
Georges Martin Commissariat à l'Énergie Atomique, Cab. H.C., 33 rue de la Fédération, 75752 Paris Cedex 15, France
[email protected] 7.9
Alexander D. MacKerell, Jr. Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, MD, 21201, USA
[email protected] 2.5
Nicola Marzari Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 1.1, 1.4
Alessandra Magistrato International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy 1.13
Wayne L. Mattice Department of Polymer Science, The University of Akron, Akron, OH 44325-3909
[email protected] 9.3
L. Mahadevan Division of Engineering and Applied Sciences, Department of Organismic and Evolutionary Biology, Department of Systems Biology, Harvard University Cambridge, MA 02138, USA
[email protected] Dionisios Margetis Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
[email protected] 4.8
V.G. Mavrantzas Department of Chemical Engineering, University of Patras, Patras, GR 26500, Greece
[email protected] 9.4 D.L. McDowell Georgia Institute of Technology, Atlanta, GA, USA
[email protected] 3.6, 3.9
E.B. Marin Sandia National Laboratories, Livermore, CA, USA
[email protected] 3.5
Michael J. Mehl Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
[email protected] 1.14
Dimitrios Maroudas University of Massachusetts, Amherst, MA, USA
[email protected] 4.1
Horia Metiu University of California, Santa Barbara, CA, USA
[email protected] 5.1
R.E. Miller Carleton University, Ottawa, ON, Canada
[email protected] 2.13 Frederick Milstein Mechanical Engineering and Materials Depts., University of California, Santa Barbara, CA, USA
[email protected] 4.2 Y. Mishin George Mason University, Fairfax, VA, USA
[email protected] 2.2 Francesco Montalenti INFM, L-NESS, and Dipartimento di Scienza dei Materiali, Università degli Studi di Milano-Bicocca, Via Cozzi 53, I-20125 Milan, Italy
[email protected] 2.11 Dane Morgan Massachusetts Institute of Technology, Cambridge MA, USA
[email protected] 1.18 John A. Moriarty Lawrence Livermore National Laboratory, University of California, Livermore, CA 94551-0808
[email protected] P13 J.W. Morris, Jr. Department of Materials Science and Engineering, University of California, Berkeley, CA, USA
[email protected] P18 Raymond D. Mountain Physical and Chemical Properties Division, Chemical Science and Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899-8380, USA
[email protected] P23
Marcus Müller Department of Physics, University of Wisconsin, Madison, WI 53706-1390, USA
[email protected] 9.5 Aiichiro Nakano Collaboratory for Advanced Computing and Simulations, Department of Computer Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
[email protected] 2.25 A. Needleman Brown University, Providence, RI, USA
[email protected] 3.4 Abraham Nitzan Tel Aviv University, Tel Aviv, 69978, Israel
[email protected] 5.7 Kai Nordlund Accelerator Laboratory, P.O. Box 43 (Pietari Kalmin k. 2), 00014, University of Helsinki, Finland; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Illinois, USA 6.2 G. Robert Odette Department of Mechanical Engineering and Department of Materials, University of California, Santa Barbara, CA, USA
[email protected] 2.29 Shigenobu Ogata Osaka University, Osaka, Japan
[email protected] 1.20
Gregory B. Olson Department of Materials Science and Engineering, Northwestern University, Evanston, IL, USA
[email protected] P3 Pablo Ordejón Instituto de Materiales, CSIC, Barcelona, Spain
[email protected] 1.5 Tadeusz Pakula Max Planck Institute for Polymer Research, Mainz, Germany and Department of Molecular Physics, Technical University, Lodz, Poland
[email protected] P35 Vijay Pande Department of Chemistry and of Structural Biology, Stanford University, Stanford, CA 94305-5080, USA
[email protected] 5.17 I.R. Pankratov Russian Research Centre, “Kurchatov Institute”, Moscow 123182, Russia
[email protected] 7.10 D.A. Papaconstantopoulos Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
[email protected] 1.14 J.E. Pask Lawrence Livermore National Laboratory, Livermore, CA, USA
[email protected] 1.19 Anthony T. Patera Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 4.15
Mike Payne Cavendish Laboratory, University of Cambridge, UK
[email protected] P16 Leonid Pechenik University of California, Santa Barbara, CA, USA
[email protected] 4.3 Joaquim Peiró Department of Aeronautics, Imperial College, London, UK
[email protected] 8.2 Simon R. Phillpot Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
[email protected] 2.6, 6.11 G.P. Potirniche Mississippi State University, Mississippi State, MS, USA
[email protected] 3.5 Thomas R. Powers Division of Engineering, Brown University, Providence, RI, USA
[email protected] 9.8 Dierk Raabe Max-Planck-Institut für Eisenforschung, Max-Planck-Str. 1, D-40237 Düsseldorf, Germany
[email protected] 7.7, P6 Ravi Radhakrishnan Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
[email protected] 5.5
Christian Ratsch University of California at Los Angeles, Los Angeles, CA, USA
[email protected] 7.15 John R. Ray 1190 Old Seneca Road, Central, SC 29630, USA
[email protected] 2.16 William P. Reinhardt University of Washington Seattle, Washington, USA
[email protected] 2.15 Karsten Reuter Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
[email protected] 1.9 J.M. Rickman Department of Materials Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
[email protected] 7.14, 7.19
Tomonori Sakai Centre for Computational Science, Queen Mary, University of London, Mile End Road, London E1 4NS, UK 8.5 Daniel Sánchez-Portal Donostia International Physics Center, Donostia, Spain
[email protected] 1.5 Joachim Sauer Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany 1.12 Avadh Saxena Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
[email protected] 7.5 Matthias Scheffler Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
[email protected] 1.9
Angel Rubio Departamento Física de Materiales and Unidad de Física de Materiales Centro Mixto CSIC-UPV, Universidad del País Vasco and Donostia International Physics Center (DIPC), Spain
[email protected] 1.11
Klaus Schulten Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 5.15
Robert E. Rudd Lawrence Livermore National Laboratory, University of California, L-045 Livermore, CA 94551, USA
[email protected] 2.12
Steven D. Schwartz Departments of Biophysics and Biochemistry, Albert Einstein College of Medicine, New York, USA
[email protected] 5.8
Gregory C. Rutledge Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
[email protected] 9.1
Robin L.B. Selinger Physics Department, Catholic University, Washington, DC 20064, USA
[email protected] 2.23
Marcelo Sepliarsky Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
[email protected] 2.6 Alessandro Sergi Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
[email protected] 2.17, 5.4 J.A. Sethian Department of Mathematics, University of California, Berkeley, CA, USA
[email protected] 4.6 Michael J. Shelley Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
[email protected] 4.7 C. Shen The Ohio State University, Columbus, Ohio, USA
[email protected] 7.4 Spencer Sherwin Department of Aeronautics, Imperial College, London, UK
[email protected] 8.2 Marek Sierka Institut für Physikalische Chemie, Lehrstuhl für Theoretische Chemie, Universität Karlsruhe, Kaiserstraße 12, D-76128 Karlsruhe, Germany
[email protected] 1.12 Asimina Sierou University of Cambridge, Cambridge, UK
[email protected] 9.6
Grant D. Smith Department of Materials Science and Engineering, Department of Chemical Engineering, University of Utah, Salt Lake City, Utah, USA
[email protected] 9.2 Frédéric Soisson CEA Saclay, DMN-SRMP, 91191 Gif-sur-Yvette, France
[email protected] 7.9 José M. Soler Universidad Autónoma de Madrid, Madrid, Spain
[email protected] 1.5 Didier Sornette Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California, USA and CNRS and Université des Sciences, Nice, France
[email protected] 4.4 David J. Srolovitz Princeton Materials Institute and Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, USA
[email protected] 7.1, 7.13 Marcelo G. Stachiotti Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
[email protected] 2.6 Catherine Stampfl Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany; School of Physics, The University of Sydney, Sydney 2006, Australia
[email protected] 1.9
H. Eugene Stanley Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA
[email protected] P36
Meijie Tang Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550
[email protected] 2.22
P.A. Sterne Lawrence Livermore National Laboratory, Livermore, CA, USA
[email protected] 1.19
Mounir Tarek Equipe de dynamique des assemblages membranaires, Unité mixte de recherche CNRS/UHP 7565, Institut nancéien de chimie moléculaire, Université Henri Poincaré, BP 239, 54506 Vandœuvre-lès-Nancy cedex, France 2.26
Howard A. Stone Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
[email protected] 4.8 Marshall Stoneham Centre for Materials Research, and London Centre for Nanotechnology, Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, UK
[email protected] P12 Sauro Succi Istituto Applicazioni Calcolo, National Research Council, viale del Policlinico, 137, 00161, Rome, Italy
[email protected] 8.4 E.B. Tadmor Technion-Israel Institute of Technology, Haifa, Israel
[email protected] 2.13 Emad Tajkhorshid Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 5.15
DeCarlos E. Taylor Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
[email protected] 1.3 Doros N. Theodorou School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechniou Street, Zografou Campus, 157 80 Athens, Greece
[email protected] P15 Carl V. Thompson Department of Materials Science and Engineering, M.I.T., Cambridge, MA 02139, USA
[email protected] P26 Anna-Karin Tornberg Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
[email protected] 4.7 S. Torquato Department of Chemistry, PRISM, and Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
[email protected] 4.5, 7.18
Bernhardt L. Trout Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 5.5 Mark E. Tuckerman Department of Chemistry, Courant Institute of Mathematical Science, New York University, New York, NY 10003, USA
[email protected] 2.9 Blas P. Uberuaga Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
[email protected] 2.11, 5.6 Patrick T. Underhill Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA 9.7 V.G. Vaks Russian Research Centre, “Kurchatov Institute”, Moscow 123182, Russia
[email protected] 7.10 Priya Vashishta Collaboratory for Advanced Computing and Simulations, Department of Chemical Engineering and Materials Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
[email protected] 2.25 A. Van der Ven Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA 1.17 Karen Veroy Massachusetts Institute of Technology, Cambridge, MA, USA
[email protected] 4.15
Alessandro De Vita King's College London, UK, Center for Nanostructured Materials (CENMAT) and DEMOCRITOS National Simulation Center, Trieste, Italy
[email protected] P16 V. Vitek Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA
[email protected] P32 Dionisios G. Vlachos Department of Chemical Engineering, Center for Catalytic Science and Technology, University of Delaware, Newark, DE 19716, USA
[email protected] 4.12 Arthur F. Voter Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
[email protected] 2.11, 5.6 Gregory A. Voth Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112-0850, USA
[email protected] 5.9 G.Z. Voyiadjis Louisiana State University, Baton Rouge, LA, USA
[email protected] 3.8 Dimitri D. Vvedensky Imperial College, London, United Kingdom
[email protected] 7.16 Göran Wahnström Chalmers University of Technology and Göteborg University, Materials and Surface Theory, SE-412 96 Göteborg, Sweden
[email protected] 5.14
Duane C. Wallace Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
[email protected] P1
Brian D. Wirth Department of Nuclear Engineering, University of California, Berkeley, CA, USA
[email protected] 2.29
Axel van de Walle Northwestern University, Evanston, IL, USA
[email protected] 1.16
Dieter Wolf Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
[email protected] 6.7, 6.9, 6.10, 6.11, 6.12, 6.13
Chris G. Van de Walle Materials Department, University of California, Santa Barbara, California, USA
[email protected] 6.3
C.Z. Wang Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA
[email protected] 1.15
Y. Wang The Ohio State University, Columbus, Ohio, USA
[email protected] 7.4
Yu U. Wang Department of Materials Science and Engineering, Virginia Tech., Blacksburg, VA 24061, USA
[email protected] 7.12
Hettithanthrige S. Wijesinghe Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
[email protected] 8.8
Chung H. Woo The Hong Kong Polytechnic University, Hong Kong SAR, China
[email protected] 2.27 Christopher Woodward Northwestern University, Evanston, Illinois, USA
[email protected] P29 S.Y. Wu Department of Physics, University of Louisville, Louisville, KY 40292, USA
[email protected] P39 Yang Xiang Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
[email protected] 7.13 Sidney Yip Department of Nuclear Science and Engineering and Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
[email protected] 2.1, 2.10, 6.7, 6.8, 6.11 M. Yu Department of Physics, University of Louisville, Louisville, KY 40292, USA P39
H.M. Zbib Washington State University, Pullman, WA, USA
[email protected] 3.3 Fangqiang Zhu Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
[email protected] 5.15
M. Zikry North Carolina State University, Raleigh, NC, USA
[email protected] 3.7
DETAILED TABLE OF CONTENTS

PART A – METHODS

Chapter 1. Electronic Scale

1.1 Understand, Predict, and Design (Nicola Marzari) 9
1.2 Concepts for Modeling Electrons in Solids: A Perspective (Marvin L. Cohen) 13
1.3 Achieving Predictive Simulations with Quantum Mechanical Forces Via the Transfer Hamiltonian: Problems and Prospects (Rodney J. Bartlett, DeCarlos E. Taylor, and Anatoli Korkin) 27
1.4 First-Principles Molecular Dynamics (Roberto Car, Filippo de Angelis, Paolo Giannozzi, and Nicola Marzari) 59
1.5 Electronic Structure Calculations with Localized Orbitals: The Siesta Method (Emilio Artacho, Julian D. Gale, Alberto García, Javier Junquera, Richard M. Martin, Pablo Ordejón, Daniel Sánchez-Portal, and José M. Soler) 77
1.6 Electronic Structure Methods: Augmented Waves, Pseudopotentials and the Projector Augmented Wave Method (Peter E. Blöchl, Johannes Kästner, and Clemens J. Först) 93
1.7 Electronic Scale (James R. Chelikowsky) 121
1.8 An Introduction to Orbital-Free Density Functional Theory (Vincent L. Lignères and Emily A. Carter) 137
1.9 Ab Initio Atomistic Thermodynamics and Statistical Mechanics of Surface Properties and Functions (Karsten Reuter, Catherine Stampfl, and Matthias Scheffler) 149
1.10 Density-Functional Perturbation Theory (Paolo Giannozzi and Stefano Baroni) 195
1.11 Quasiparticle and Optical Properties of Solids and Nanostructures: The GW-BSE Approach (Steven G. Louie and Angel Rubio) 215
1.12 Hybrid Quantum Mechanics/Molecular Mechanics Methods and their Application (Marek Sierka and Joachim Sauer) 241
1.13 Ab Initio Molecular Dynamics Simulations of Biologically Relevant Systems (Alessandra Magistrato and Paolo Carloni) 259
1.14 Tight-Binding Total Energy Methods for Magnetic Materials and Multi-Element Systems (Michael J. Mehl and D.A. Papaconstantopoulos) 275
1.15 Environment-Dependent Tight-Binding Potential Models (C.Z. Wang and K.M. Ho) 307
1.16 First-Principles Modeling of Phase Equilibria (Axel van de Walle and Mark Asta) 349
1.17 Diffusion and Configurational Disorder in Multicomponent Solids (A. Van der Ven and G. Ceder) 367
1.18 Data Mining in Materials Development (Dane Morgan and Gerbrand Ceder) 395
1.19 Finite Elements in Ab Initio Electronic-Structure Calculations (J.E. Pask and P.A. Sterne) 423
1.20 Ab Initio Study of Mechanical Deformation (Shigenobu Ogata) 439
Chapter 2. Atomistic Scale

2.1 Introduction: Atomistic Nature of Materials (Efthimios Kaxiras and Sidney Yip) 451
2.2 Interatomic Potentials for Metals (Y. Mishin) 459
2.3 Interatomic Potential Models for Ionic Materials (Julian D. Gale) 479
2.4 Modeling Covalent Bond with Interatomic Potentials (João F. Justo) 499
2.5 Interatomic Potentials: Molecules (Alexander D. MacKerell, Jr.) 509
2.6 Interatomic Potentials: Ferroelectrics (Marcelo Sepliarsky, Marcelo G. Stachiotti, and Simon R. Phillpot) 527
2.7 Energy Minimization Techniques in Materials Modeling (C.R.A. Catlow) 547
2.8 Basic Molecular Dynamics (Ju Li) 565
2.9 Generating Equilibrium Ensembles Via Molecular Dynamics (Mark E. Tuckerman) 589
2.10 Basic Monte Carlo Models: Equilibrium and Kinetics (George Gilmer and Sidney Yip) 613
2.11 Accelerated Molecular Dynamics Methods (Blas P. Uberuaga, Francesco Montalenti, Timothy C. Germann, and Arthur F. Voter) 629
2.12 Concurrent Multiscale Simulation at Finite Temperature: Coarse-Grained Molecular Dynamics (Robert E. Rudd) 649
2.13 The Theory and Implementation of the Quasicontinuum Method (E.B. Tadmor and R.E. Miller) 663
2.14 Perspective: Free Energies and Phase Equilibria (David A. Kofke and Daan Frenkel) 683
2.15 Free-Energy Calculation Using Nonequilibrium Simulations (Maurice de Koning and William P. Reinhardt) 707
2.16 Ensembles and Computer Simulation Calculation of Response Functions (John R. Ray) 729
2.17 Non-Equilibrium Molecular Dynamics (Giovanni Ciccotti, Raymond Kapral, and Alessandro Sergi) 745
2.18 Thermal Transport Process by the Molecular Dynamics Method (Hideo Kaburaki) 763
2.19 Atomistic Calculation of Mechanical Behavior (Ju Li) 773
2.20 The Peierls–Nabarro Model of Dislocations: A Venerable Theory and its Current Development (Gang Lu) 793
2.21 Modeling Dislocations Using a Periodic Cell (Wei Cai) 813
2.22 A Lattice Based Screw-Edge Dislocation Dynamics Simulation of Body Center Cubic Single Crystals (Meijie Tang) 827
2.23 Atomistics of Fracture (Diana Farkas and Robin L.B. Selinger) 839
2.24 Atomistic Simulations of Fracture in Semiconductors (Noam Bernstein) 855
2.25 Multimillion Atom Molecular-Dynamics Simulations of Nanostructured Materials and Processes on Parallel Computers (Priya Vashishta, Rajiv K. Kalia, and Aiichiro Nakano) 875
2.26 Modeling Lipid Membranes (Christophe Chipot, Michael L. Klein, and Mounir Tarek) 929
2.27 Modeling Irradiation Damage Accumulation in Crystals (Chung H. Woo) 959
2.28 Cascade Modeling (Jean-Paul Crocombette) 987
2.29 Radiation Effects in Fission and Fusion Reactors (G. Robert Odette and Brian D. Wirth) 999
2.30 Texture Evolution During Thin Film Deposition (Hanchen Huang) 1039
2.31 Atomistic Visualization (Ju Li) 1051
Chapter 3. Mesoscale/Continuum Methods

3.1 Mesoscale/Macroscale Computational Methods (M.F. Horstemeyer) 1071
3.2 Perspective on Continuum Modeling of Mesoscale/Macroscale Phenomena (D.J. Bammann) 1077
3.3 Dislocation Dynamics (H.M. Zbib and T.A. Khraishi) 1097
3.4 Discrete Dislocation Plasticity (E. Van der Giessen and A. Needleman) 1115
3.5 Crystal Plasticity (M.F. Horstemeyer, G.P. Potirniche, and E.B. Marin) 1133
3.6 Internal State Variable Theory (D.L. McDowell) 1151
3.7 Ductile Fracture (M. Zikry) 1171
3.8 Continuum Damage Mechanics (G.Z. Voyiadjis) 1183
3.9 Microstructure-Sensitive Computational Fatigue Analysis (D.L. McDowell) 1193
Chapter 4. Mathematical Methods

4.1 Overview of Chapter 4: Mathematical Methods (Martin Z. Bazant and Dimitrios Maroudas) 1217
4.2 Elastic Stability Criteria and Structural Bifurcations in Crystals Under Load (Frederick Milstein) 1223
4.3 Toward a Shear-Transformation-Zone Theory of Amorphous Plasticity (Michael L. Falk, James S. Langer, and Leonid Pechenik) 1281
4.4 Statistical Physics of Rupture in Heterogeneous Media (Didier Sornette) 1313
4.5 Theory of Random Heterogeneous Materials (S. Torquato) 1333
4.6 Modern Interface Methods for Semiconductor Process Simulation (J.A. Sethian) 1359
4.7 Computing Microstructural Dynamics for Complex Fluids (Michael J. Shelley and Anna-Karin Tornberg) 1371
4.8 Continuum Descriptions of Crystal Surface Evolution (Howard A. Stone and Dionisios Margetis) 1389
4.9 Breakup and Coalescence of Free Surface Flows (Jens Eggers) 1403
4.10 Conformal Mapping Methods for Interfacial Dynamics (Martin Z. Bazant and Darren Crowdy) 1417
4.11 Equation-Free Modeling for Complex Systems (Ioannis G. Kevrekidis, C. William Gear, and Gerhard Hummer) 1453
4.12 Mathematical Strategies for the Coarse-Graining of Microscopic Models (Markos A. Katsoulakis and Dionisios G. Vlachos) 1477
4.13 Multiscale Modeling of Crystalline Solids (Weinan E and Xiantao Li) 1491
4.14 Multiscale Computation of Fluid Flow in Heterogeneous Media (Thomas Y. Hou) 1507
4.15 Certified Real-Time Solution of Parametrized Partial Differential Equations (Nguyen Ngoc Cuong, Karen Veroy, and Anthony T. Patera) 1529
PART B – MODELS

Chapter 5. Rate Processes

5.1 Introduction: Rate Processes (Horia Metiu) 1567
5.2 A Modern Perspective on Transition State Theory (J.D. Doll) 1573
5.3 Transition Path Sampling (Christoph Dellago) 1585
5.4 Simulating Reactions that Occur Once in a Blue Moon (Giovanni Ciccotti, Raymond Kapral, and Alessandro Sergi) 1597
5.5 Order Parameter Approach to Understanding and Quantifying the Physico-Chemical Behavior of Complex Systems (Ravi Radhakrishnan and Bernhardt L. Trout) 1613
5.6 Determining Reaction Mechanisms (Blas P. Uberuaga and Arthur F. Voter) 1627
5.7 Stochastic Theory of Rate Processes (Abraham Nitzan) 1635
5.8 Approximate Quantum Mechanical Methods for Rate Computation in Complex Systems (Steven D. Schwartz) 1673
5.9 Quantum Rate Theory: A Path Integral Centroid Perspective (Eitan Geva, Seogjoo Jang, and Gregory A. Voth) 1691
5.10 Quantum Theory of Reactive Scattering and Adsorption at Surfaces (Axel Groß) 1713
5.11 Stochastic Chemical Kinetics (Daniel T. Gillespie) 1735
5.12 Kinetic Monte Carlo Simulation of Non-Equilibrium Lattice-Gas Models: Basic and Refined Algorithms Applied to Surface Adsorption Processes (J.W. Evans) 1753
5.13 Simple Models for Nanocrystal Growth (Pablo Jensen) 1769
5.14 Diffusion in Solids (Göran Wahnström) 1787
5.15 Kinetic Theory and Simulation of Single-Channel Water Transport (Emad Tajkhorshid, Fangqiang Zhu, and Klaus Schulten) 1797
5.16 Simplified Models of Protein Folding (Hue Sun Chan) 1823
5.17 Protein Folding: Detailed Models (Vijay Pande) 1837
Chapter 6. Crystal Defects

6.1 Point Defects (C.R.A. Catlow) 1851
6.2 Point Defects in Metals (Kai Nordlund and Robert Averback) 1855
6.3 Defects and Impurities in Semiconductors (Chris G. Van de Walle) 1877
6.4 Point Defects in Simple Ionic Solids (John Corish) 1889
6.5 Fast Ion Conductors (Alan V. Chadwick) 1901
6.6 Defects and Ion Migration in Complex Oxides (M. Saiful Islam) 1915
6.7 Introduction: Modeling Crystal Interfaces (Sidney Yip and Dieter Wolf) 1925
6.8 Atomistic Methods for Structure–Property Correlations (Sidney Yip) 1931
6.9 Structure and Energy of Grain Boundaries (Dieter Wolf) 1953
6.10 High-Temperature Structure and Properties of Grain Boundaries (Dieter Wolf) 1985
6.11 Crystal Disordering in Melting and Amorphization (Sidney Yip, Simon R. Phillpot, and Dieter Wolf) 2009
6.12 Elastic Behavior of Interfaces (Dieter Wolf) 2025
6.13 Grain Boundaries in Nanocrystalline Materials (Dieter Wolf) 2055
Chapter 7. Microstructure

7.1 Introduction: Microstructure (David J. Srolovitz and Long-Qing Chen) 2083
7.2 Phase-Field Modeling (Alain Karma) 2087
7.3 Phase-Field Modeling of Solidification (Seong Gyoon Kim and Won Tae Kim) 2105
7.4 Coherent Precipitation – Phase Field Method (C. Shen and Y. Wang) 2117
7.5 Ferroic Domain Structures using Ginzburg–Landau Methods (Avadh Saxena and Turab Lookman) 2143
7.6 Phase-Field Modeling of Grain Growth (Carl E. Krill III) 2157
7.7 Recrystallization Simulation by Use of Cellular Automata (Dierk Raabe) 2173
7.8 Modeling Coarsening Dynamics using Interface Tracking Methods (John Lowengrub) 2205
7.9 Kinetic Monte Carlo Method to Model Diffusion Controlled Phase Transformations in the Solid State (Georges Martin and Frédéric Soisson) 2223
7.10 Diffusional Transformations: Microscopic Kinetic Approach (I.R. Pankratov and V.G. Vaks) 2249
7.11 Modeling the Dynamics of Dislocation Ensembles (Nasr M. Ghoniem) 2269
7.12 Dislocation Dynamics – Phase Field (Yu U. Wang, Yongmei M. Jin, and Armen G. Khachaturyan) 2287
7.13 Level Set Dislocation Dynamics Method (Yang Xiang and David J. Srolovitz) 2307
7.14 Coarse-Graining Methodologies for Dislocation Energetics and Dynamics (J.M. Rickman and R. LeSar) 2325
7.15 Level Set Methods for Simulation of Thin Film Growth (Russel Caflisch and Christian Ratsch) 2337
7.16 Stochastic Equations for Thin Film Morphology (Dimitri D. Vvedensky) 2351
7.17 Monte Carlo Methods for Simulating Thin Film Deposition (Corbett Battaile) 2363
7.18 Microstructure Optimization (S. Torquato) 2379
7.19 Microstructural Characterization Associated with Solid–Solid Transformations (J.M. Rickman and K. Barmak) 2397
Chapter 8. Fluids

8.1 Mesoscale Models of Fluid Dynamics (Bruce M. Boghosian and Nicolas G. Hadjiconstantinou) 2411
8.2 Finite Difference, Finite Element and Finite Volume Methods for Partial Differential Equations (Joaquim Peiró and Spencer Sherwin) 2415
8.3 Meshless Methods for Numerical Solution of Partial Differential Equations (Gang Li, Xiaozhong Jin, and N.R. Aluru) 2447
8.4 Lattice Boltzmann Methods for Multiscale Fluid Problems (Sauro Succi, Weinan E, and Efthimios Kaxiras) 2475
8.5 Discrete Simulation Automata: Mesoscopic Fluid Models Endowed with Thermal Fluctuations (Tomonori Sakai and Peter V. Coveney) 2487
8.6 Dissipative Particle Dynamics (Pep Español) 2503
8.7 The Direct Simulation Monte Carlo Method: Going Beyond Continuum Hydrodynamics (Francis J. Alexander) 2513
8.8 Hybrid Atomistic–Continuum Formulations for Multiscale Hydrodynamics (Hettithanthrige S. Wijesinghe and Nicolas G. Hadjiconstantinou) 2523
Chapter 9. Polymers and Soft Matter

9.1 Polymers and Soft Matter (L. Mahadevan and Gregory C. Rutledge) 2555
9.2 Atomistic Potentials for Polymers and Organic Materials (Grant D. Smith) 2561
9.3 Rotational Isomeric State Methods (Wayne L. Mattice) 2575
9.4 Monte Carlo Simulation of Chain Molecules (V.G. Mavrantzas) 2583
9.5 The Bond Fluctuation Model and Other Lattice Models (Marcus Müller) 2599
9.6 Stokesian Dynamics Simulations for Particle Laden Flows (Asimina Sierou) 2607
9.7 Brownian Dynamics Simulations of Polymers and Soft Matter (Patrick S. Doyle and Patrick T. Underhill) 2619
9.8 Mechanics of Lipid Bilayer Membranes (Thomas R. Powers) 2631
9.9 Field-Theoretic Simulations (Venkat Ganesan and Glenn H. Fredrickson) 2645
Plenary Perspectives

P1 Progress in Unifying Condensed Matter Theory (Duane C. Wallace) 2659
P2 The Future of Simulations in Materials Science (D.P. Landau) 2663
P3 Materials by Design (Gregory B. Olson) 2667
P4 Modeling at the Speed of Light (J.D. Joannopoulos) 2671
P5 Modeling Soft Matter (Kurt Kremer) 2675
P6 Drowning in Data – A Viewpoint on Strategies for Doing Science with Simulations (Dierk Raabe) 2687
P7 Dangers of “Common Knowledge” in Materials Simulations (Vasily V. Bulatov) 2695
P8 Quantum Simulations as a Tool for Predictive Nanoscience (Giulia Galli and François Gygi) 2701
P9 A Perspective of Materials Modeling (William A. Goddard III) 2707
P10 An Application Oriented View on Materials Modeling (Peter Gumbsch) 2713
P11 The Role of Theory and Modeling in the Development of Materials for Fusion Energy (Nasr M. Ghoniem) 2719
P12 Where are the Gaps? (Marshall Stoneham) 2731
P13 Bridging the Gap between Quantum Mechanics and Large-Scale Atomistic Simulation (John A. Moriarty) 2737
P14 Bridging the Gap between Atomistics and Structural Engineering (J.S. Langer) 2749
P15 Multiscale Modeling of Polymers (Doros N. Theodorou) 2757
P16 Hybrid Atomistic Modelling of Materials Processes (Mike Payne, Gábor Csányi, and Alessandro De Vita) 2763
P17 The Fluctuation Theorem and its Implications for Materials Processing and Modeling (Denis J. Evans) 2773
P18 The Limits of Strength (J.W. Morris, Jr.) 2777
P19 Simulations of Interfaces between Coexisting Phases: What Do They Tell us? (Kurt Binder) 2787
P20 How Fast Can Cracks Move? (Farid F. Abraham) 2793
P21 Lattice Gas Automaton Methods (Jean Pierre Boon) 2805
P22 Multi-Scale Modeling of Hypersonic Gas Flow (Iain D. Boyd) 2811
P23 Commentary on Liquid Simulations and Industrial Applications (Raymond D. Mountain) 2819
P24 Computer Simulations of Supercooled Liquids and Glasses (Walter Kob) 2823
P25 Interplay between Materials Theory and High-Pressure Experiments (Raymond Jeanloz) 2829
P26 Perspectives on Experiments, Modeling and Simulations of Grain Growth (Carl V. Thompson) 2837
P27 Atomistic Simulation of Ferroelectric Domain Walls (I-Wei Chen) 2843
P28 Measurements of Interfacial Curvatures and Characterization of Bicontinuous Morphologies (Sow-Hsin Chen) 2849
P29 Plasticity at the Atomic Scale: Parametric, Atomistic, and Electronic Structure Methods (Christopher Woodward) 2865
P30 A Perspective on Dislocation Dynamics (Nasr M. Ghoniem) 2871
P31 Dislocation-Pressure Interactions (J.P. Hirth) 2879
P32 Dislocation Cores and Unconventional Properties of Plastic Behavior (V. Vitek) 2883
P33 3-D Mesoscale Plasticity and its Connections to Other Scales (Ladislas P. Kubin) 2897
P34 Simulating Fluid and Solid Particles and Continua with SPH and SPAM (Wm.G. Hoover) 2903
P35 Modeling of Complex Polymers and Processes (Tadeusz Pakula) 2907
P36 Liquid and Glassy Water: Two Materials of Interdisciplinary Interest (H. Eugene Stanley) 2917
P37 Material Science of Carbon (Wesley P. Hoffman) 2923
P38 Concurrent Lifetime-Design of Emerging High Temperature Materials and Components (Ronald J. Kerans) 2929
P39 Towards a Coherent Treatment of the Self-Consistency and the Environment-Dependency in a Semi-Empirical Hamiltonian for Materials Simulation (S.Y. Wu, C.S. Jayanthi, C. Leahy, and M. Yu) 2935
INTRODUCTION

Sidney Yip
Department of Nuclear Science and Engineering, Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
The way a scientist looks at the materials world is changing dramatically. Advances in the synthesis of nanostructures and in high-resolution microscopy are allowing us to create and probe assemblies of atoms and molecules at a level that was unimagined only a short time ago – the prospect of manipulating materials for device applications, one atom at a time, is no longer a fantasy. Being able to see and touch the materials up close means that we are more interested than ever in understanding their properties and behavior at the atomic level. Another factor which contributes to the present state of affairs is the advent of large-scale computation, once a rare and highly sophisticated resource accessible only to a few privileged scientists. In the past few years materials modeling, in the broad sense of theory and simulation in integration with experiments, has emerged as a field of research with unique capabilities, most notably the ability to analyze and predict a very wide range of physical structures and phenomena. Some would now say the modeling approach is becoming an equal partner to theory and experiment, the traditional methods of scientific inquiry. Certain problems in the fundamental description of matter, previously regarded as intractable, are now amenable to simulation and analysis. The ab initio calculation of solid-state properties using electronic-structure methods and the direct estimation of free energies based on statistical mechanical formulations are just two examples where predictions are being made without input from experiments. Because materials modeling draws from all the disciplines in science and engineering, it greatly benefits from cross-fertilization within a multidisciplinary community. There is recognition that Computational Materials is just as much a field as Computational Physics or Chemistry; it offers a robust framework for focused scientific studies and exchanges, from the introduction of new university curricula to the formation of centers for collaborative research among academia, corporate and government laboratories. A basic appeal to all members of the growing community
is the challenge and opportunity of solving problems that are fundamental in nature and yet have great technological impact, problems spanning the disciplines of physics, chemistry, engineering and biology. Multiscale modeling has come to symbolize the emerging field of computational materials research. The idea is to link simulation models and techniques across the micro-to-macro length and time scales, with the goal of analyzing and eventually controlling the outcome of critical materials processes. Invariably these phenomena are highly nonlinear, inhomogeneous, or out of equilibrium. In this paradigm, electronic structure would be treated by quantum mechanical calculations, atomistic processes by molecular dynamics or Monte Carlo simulations, mesoscale microstructure evolution by methods such as finite elements, dislocation dynamics, or kinetic Monte Carlo, and continuum behavior by field equations central to continuum elasticity and computational fluid dynamics. The vision of multiscale modeling is that by combining these different methods, one can deal with complex problems in a much more comprehensive manner than when the methods are used individually [1].
“Modeling is the physicalization of a concept, simulation is its computational realization.”
This is an oversimplified statement. On the other hand, it is a way to articulate the intellectual character of the present volume. This Handbook is certainly about modeling and simulation. Many would agree that conceptually the process of modeling ought to be distinguished from the act of simulation. Yet there seems to be no consensus on how the two terms should be used to show that each plays an essential role in computational research. Here we suggest a brief all-purpose definition (admittedly lacking specificity). By concept we have in mind an idea, an idealization, or a picture of a system (a scenario of a process) which has the connotation of functionality. As an example, consider the subway map of Boston. Although it gives no information about the city streets, its purpose is to display the connectivity of the stations – few would dispute that for the given purpose it is a superb physical construct enabling any person to navigate from point A to point B [2]. So it is with our two-part definition: modeling is first a thoughtfully simplified representation of an object to be studied, a phenomenon, or a process, and simulation is then the means with which to investigate the model. Notice also that, used together, modeling and simulation imply an element of coordination between what is to be studied and how the study is to be conducted.
Length/Time Scales in Materials Modeling

Many physical phenomena have significant manifestations on more than one level of length or time scale. For example, wave propagation and
attenuation in a fluid can be described at the continuum level using the equations of fluid dynamics, while the determination of shear viscosity and thermal conductivity is best treated at the level of molecular dynamics. While each level has its own set of relevant phenomena, an even more powerful description would result if the microscopic treatment of transport could be integrated into the calculation of macroscopic flows. Generally speaking, one can identify four distinct length (and corresponding time) scales where materials phenomena are typically studied. As illustrated in Fig. 1, the four regions may be referred to as electronic structure, atomistic, microstructure, and continuum. Imagine a piece of material, say a crystalline solid. The smallest length scale of interest is about a few angstroms (10⁻⁸ cm). On this scale one deals directly with the electrons in the system, which are governed by the Schrödinger equation of quantum mechanics. The techniques that have been developed for solving this equation are extremely computationally intensive; as a result, they can be applied only to small simulation systems, at present no more than about 300 atoms. On the other hand, these calculations are theoretically the most rigorous; they are particularly valuable for developing and validating more approximate but computationally more efficient descriptions. The scale at the next level, spanning from tens to about a thousand angstroms, is called atomistic. Here discrete particle simulation techniques, molecular dynamics (MD) and Monte Carlo (MC), are well developed,
Figure 1. Length scales in materials modeling showing that many applications in our physical world take place on the micron scale and higher, while our basic understanding and predictive ability lie at the microscopic levels.
requiring the specification of an empirical classical interatomic potential function with parameters fitted to experimental data and electronic-structure calculations. The most important feature of atomistic simulation is that one can now study a system with a large number of atoms, at present as many as 10⁹. On the other hand, because the electrons are ignored, atomistic simulations are not as reliable as ab initio calculations.

Above the atomistic level the relevant length scale is a micron (10⁴ angstroms). Whether this level should be called microscale or mesoscale is a matter for which convention has not been clearly established. The simulation technique commonly in use is the finite-element method (FEM). Because many useful properties of materials are governed by the microstructure in the system, this is perhaps the most critical level for materials design. However, the information required to carry out such calculations, for example, the stiffness matrix, or any material-specific physical parameters, has to be provided from either experiment or calculations at the atomistic or ab initio level. To a large extent, the same can be said for the continuum-level methods, such as computational fluid dynamics (CFD) and continuum elasticity (CE). The parameters needed to perform these calculations have to be supplied externally.

There are definite benefits when simulation techniques at different scales can be linked. Continuum or finite-element methods are often most practical for design calculations. They require parameters or properties which cannot be generated within the methods themselves, and they cannot provide the atomic-level insights needed for design. For these reasons continuum and finite-element calculations should be coupled to atomistic and ab initio methods. It is only when methods at different scales are effectively integrated that one can expect materials modeling to give fundamental insight as well as reliable predictions across the scales. The efficient bridging of the scales in Fig. 1 is a significant challenge in the further development of multiscale modeling.

The classification of materials modeling and simulation in terms of length and time scales is but one way of approaching the subject. The point of Fig. 1 is to emphasize the theoretical and computational methods that have been developed to describe the properties and behavior of physical systems, but it does not address other equally important issues, those of applications. One might imagine discussing materials modeling through a matrix of methods and applications which could be useful for displaying their connection and particular suitability. This would be quite difficult to carry out at present because there are not enough clear-cut case studies in the literature to make the construction of such a matrix meaningful. From the standpoint of knowing what methods are best suited for certain problems, materials modeling is a field still in its infancy.
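Before moving on, the atomistic simulation techniques mentioned above are compact enough at their core to sketch. The following minimal Python illustration integrates a small Lennard-Jones cluster with the velocity-Verlet algorithm in reduced units; the cluster size, time step, and step count are arbitrary illustrative choices, and production MD codes add neighbor lists, periodic boundary conditions, and thermostats.

    import numpy as np

    def lj_forces(pos):
        """Pairwise Lennard-Jones forces and potential energy (reduced units)."""
        forces = np.zeros_like(pos)
        energy = 0.0
        n = len(pos)
        for i in range(n):
            for j in range(i + 1, n):
                rij = pos[i] - pos[j]
                r2 = rij @ rij
                inv6 = 1.0 / r2**3                   # (1/r)^6
                energy += 4.0 * inv6 * (inv6 - 1.0)  # U = 4(r^-12 - r^-6)
                fmag = 24.0 * inv6 * (2.0 * inv6 - 1.0) / r2
                forces[i] += fmag * rij
                forces[j] -= fmag * rij
        return forces, energy

    # Eight atoms on a small cube near the LJ equilibrium spacing 2^(1/6)
    pos = 1.12 * np.array([[i, j, k] for i in (0, 1)
                           for j in (0, 1) for k in (0, 1)], dtype=float)
    vel = np.zeros_like(pos)
    dt = 0.002
    f, _ = lj_forces(pos)
    for step in range(2000):             # velocity-Verlet time stepping
        pos += vel * dt + 0.5 * f * dt**2
        f_new, e_pot = lj_forces(pos)
        vel += 0.5 * (f + f_new) * dt
        f = f_new
    e_kin = 0.5 * np.sum(vel**2)
    print(f"E_pot = {e_pot:.4f}  E_kin = {e_kin:.4f}  E_tot = {e_pot + e_kin:.4f}")

The same double loop, applied to 10⁹ atoms instead of eight, is what the linked-cell and parallel-decomposition machinery of large-scale MD codes exists to make tractable.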
An Overview of the Handbook

The Handbook is laid out in 9 chapters, dealing with modeling and simulation methods (Part A) and models for specific areas of study (Part B). In Part A the first three chapters describe modeling concepts and simulation techniques at the electronic (Chapter 1), atomistic (Chapter 2), and mesoscale (Chapter 3) levels, in the spirit of Fig. 1. In contrast, Chapter 4 describes a variety of methods based on mathematical analysis. The chapters in Part B focus on systems in which basic studies have been carried out. Chapter 5 treats rate processes, where time-scale problems are just as important and challenging as length-scale problems. The next four chapters cover a range of physical structures: crystal defects (Chapter 6) and microstructure (Chapter 7) in solids, various models and methods for fluid simulation (Chapter 8), and models of polymers and soft matter (Chapter 9). In each chapter there are other significant topics which have not been included; for these, we recommend that readers consult the references given in each article. Each chapter begins with an introduction which serves to connect the individual articles in the chapter with the broad themes that are relevant to our growing community. While no single chapter attempts to be inclusive in treating the many important aspects of materials modeling, even with restrictions to fundamental methods and models, we hope the entire Handbook is a first step in that direction.

The Handbook also has a special section which we call Plenary Perspectives. This is a collection of commentaries by recognized authorities in materials modeling or related fields. Each author was invited to write briefly on a topic that would give the readers, especially students, insight on different issues in materials modeling. Together with the 9 chapters, these perspectives are meant to inform the future workers coming into this exciting field.
References

[1] S. Yip, “Synergistic science,” Nature Mater., 3, 1–3, 2003.
[2] M. Ashby, “Modelling of materials problems,” J. Comput.-Aided Mater. Des., 3, 95–99, 1996.
Chapter 1 ELECTRONIC SCALE
Let us, as nature directs, begin first with first principles. [Aristotle, Poetics I]
1.1 UNDERSTAND, PREDICT, AND DESIGN
Nicola Marzari
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Electronic-structure approaches are changing dramatically the way much theoretical and computational research is done. This success derives from the ability to characterize from first principles many material properties, with an accuracy that complements or even augments experimental observations. This accuracy can extend beyond the properties for which a real-life experiment is either feasible or just cost-effective, and it is based on our ability to compute and understand the quantum-mechanical behavior of interacting electrons and nuclei. Density-functional theory, for which the Nobel prize in chemistry was awarded in 1998, has been instrumental to this success, together with the availability of computers that are now routinely able to deal with the complexity of realistic problems. The extent of this revolution should not be underestimated, notwithstanding the many algorithmic and theoretical bottlenecks that await resolution, and the existence of hard problems rarely amenable to direct simulations. Since ab initio methods combine fundamental predictive power with atomic resolution, they provide a quantitatively accurate first step in the study and characterization of new materials, and the ability to describe with unprecedented control molecular architectures exactly at those scales (hundreds to thousands of atoms) where some of the most promising and undiscovered properties are to be engineered. In the current effort to control and design the properties of novel molecules, materials, and devices, first-principles approaches thus constitute a unique and very powerful instrument. Complementary strategies emerge:
• Insight: First-principles simulations provide a unique connection between microscopic and macroscopic properties. When partnered with experimental tools – from spectroscopies to microscopies – they can deliver unique insight and understanding on the detailed arrangements of atoms
and molecules, and on their relation to the observed phenomena. Gedanken computational experiments can be used to prove or probe cause-effect relationships in ways that are different, and novel, compared with our established approaches.
• Control: Microscopic simulations provide an unprecedented degree of control over the systems studied. While macroscopic behavior often emerges from complexity – thus explaining all the ongoing efforts in overcoming the time- and length-scale limitations – fundamental understanding needs to be built from the bottom up, under the carefully controlled conditions of a computational experiment. Simulations can offer early and accurate insights on complex materials that are challenging to control or characterize.
• Design: Quantitatively accurate predictions of materials’ properties provide us with an unprecedented freedom, a “magic wand” that can be used with ingenuity to try and engineer novel material properties. Intuitions can often be rapidly validated, shifting and focusing the synthetic challenge appropriately to the later stages, once a promising class of materials has been identified.
• Optimization: Finally, the systematic exploration of material properties inside or across different classes of materials can highlight the potential for absolute or differential improvements. Stochastic techniques such as data mining and optimization can then identify the most promising candidates, narrowing down the field of structures to be targeted in real-life testing.
While the extent and scope of this emerging discipline are nothing short of revolutionary, researchers in the field face key challenges that are worth remembering: achieving thermodynamical accuracy, bridging length scales, and overcoming time-scale limitations. It is unlikely that an overarching solution to these problems will appear, and much of the art of modeling goes into solving these challenges for the problem at hand. It is nevertheless important to remark on the role of correlations: whenever the typical correlation lengths become smaller than the size of the simulation box (e.g., for a liquid studied in periodic-boundary conditions), the system studied becomes virtually infinite, and the finite-size bias irrelevant. The articles presented in this volume offer a glimpse of the panorama of electronic-structure modeling; in such distinguished company, it would be inappropriate for me to condense such diverse and exciting contributions into a few sentences. I will leave the science to the authors, and conclude with a few statements on future developments. The continuous improvement in the price vs. performance ratio for commodity CPUs is now widely apparent. While computational resources never seem enough, and the desire for a longer and bigger simulation is always looming, we are now in the position where even a single desktop is sufficient to
sustain research of world-class quality (of course, human resources are even more precious, and human ingenuity can sometimes be light-heartedly traded for sheer computational power). This availability of computer power is now combined with the availability of state-of-the-art computer packages – some of them freely distributed and developed under a shared-community, public-license model akin to that, e.g., of Linux. The net result has been that “computational laboratories” around the world have been increasing in capability with a speed comparable to Moore’s law, their hardware and software infrastructures replicated almost at the flick of a switch. Some conclusions can be attempted:
• The geographic distribution of researchers in this field might change significantly. World-class science can now be done inexpensively and extensively, and know-how and human resources become almost exclusively the most precious commodities.
• Publicly available electronic-structure packages take the role of internationally shared infrastructures, in perfect analogy with the way brick-and-mortar facilities (such as synchrotrons) serve many groups in different countries. It could even be argued that investment in “computational infrastructures” (electronic-structure packages) can have comparable benefits, and a remarkable cost structure.
• While these technologies become faster, more robust, and prettier, they also become more and more complex, often requiring years of training to be mastered – content and expertise could also be developed and freely shared following similar public-license models.
The last point brings us back to one of the greatest challenges, and one for which we hope this Handbook will bring a positive contribution: how to avoid trading content for form, critical thinking for indiscriminate simulations. In T.S. Eliot’s words: “The last temptation is the greatest treason: To do the right deed for the wrong reason.”
1.2 CONCEPTS FOR MODELING ELECTRONS IN SOLIDS: A PERSPECTIVE
Marvin L. Cohen
University of California at Berkeley and Lawrence Berkeley National Laboratory, Berkeley, CA, USA
1. The Electron’s Central Role
It’s clear that an understanding of the behavior of electrons in solids is essential for explaining and predicting solid state properties. Electrons provide the glue holding solids together, and hence they are central in determining structural, mechanical and vibrational properties. In most solids, electrical current under applied electromagnetic fields is carried by electrons. Optical properties for many ranges of frequency are dominated by electronic transitions. Understanding superconductivity, magnetism, dielectric properties, ferroelectricity, and most properties of solids requires a detailed knowledge of “electronic structure,” the term associated with the study of electronic energy levels, and more broadly a general label for the subfield of condensed matter physics which is focused on the properties of electrons in solids. In the end, modeling, simulating, calculating, and computing refer to producing equations, numbers or pictures which describe, explain, and predict properties. So this general area has always had a mixed set of goals. Theoretical researchers vary in their emphasis on these goals. For example, some theorists are focused on explaining phenomena with the simplest possible models containing the fundamental physics. A good example is the Bardeen–Cooper–Schrieffer (BCS) [1] theory of superconductivity, which is one of the great achievements of 20th century physics. This theory brought new concepts, but the modeling of the electrons forming Cooper pairs considered electrons in free electron states, because calculating normal-state properties for particular solids was not very far along in 1957. As a result, computing transition
temperatures for specific solids using BCS theory was, and still is, difficult; and, for some researchers, this was viewed at the time as a defect in the fundamental theory, which it was not. There are theorists interested in numerical precision. They continually push at the forefront of computer science and applied mathematics to develop consistent approaches that can deal with properties of clusters, molecules, and complex solids with many atoms in a unit cell. Sometimes these researchers have strong overlap with computer scientists and engineers and even get involved in hardware development. Perhaps the largest and most dominant group of researchers in modeling solids at this time are theorists motivated by particular experimental properties or phenomena. Unlike the researchers interested only in phenomena, they are trying to calculate these properties for “real materials.” For these theorists, it is essential that interactions among electrons and ionic cores not be replaced by a constant (as in the BCS model), and electrons are not viewed as completely free or as atomic states. They want the appropriate description of the electronic states for the material at hand and a computational approach to calculate measured properties. Successful comparison with experiment is the goal, and it is the degree of accuracy in these comparisons which measures the worth of the calculation, rather than numerical precision. In the papers presented in this volume, the reader will find authors whose research goals weight “accuracy for explaining and predicting properties” against “calculational precision” to varying degrees. Irrespective of motivation, an essential component for modeling is the conceptual base – in other words, the way we picture solids on a microscopic or nanoscopic level.
2. Conceptual Base
Under pressure, gases made of atoms can condense to become liquids with molecular units of clusters or atoms, and then, with more pressure, they generally transform into solids. So most models of solids involve a picture of atoms interacting to form a periodic array of ions with electrons in various geometric configurations. Modern electron charge density plots [2] have influenced our mental images of covalent, ionic and metallic bonding: contour maps and pictures of dense dots represent electrons confined in bonds, as appropriate for covalent or ionic semiconductors, while spread-out charge maps represent electrons in metals. As an example, Fig. 1 shows the electronic charge density in the (110) plane for carbon and silicon, both in the diamond structure. The bond lengths are 1.54 Å and 2.35 Å, respectively. It has been said that carbon is the basis of biology while silicon is the basis of geology, and it is the nature of the covalent bonds in these two systems which determines these properties. As
Figure 1. Contour maps of the valence electron charge density of C and Si in the diamond structure to illustrate a visual perception of covalent bonding.
shown in the figure, the carbon bond has two maxima while there is essentially one for silicon. The electrons in carbon can form sp² hybrids for three-fold coordination and multiple bonds, while elemental silicon at ambient pressures and temperatures forms sp³ bonds and is tetrahedrally coordinated. If solids are made of atoms, then it is the job of those modeling electronic behavior to illustrate this evolution of electrons from being localized around ions to the formation of covalent and metallic bonds. For this purpose, the old atomic models of Thomson and Newton work well pictorially. Thomson’s plum pudding model resembled our modern picture of jellium, with a positive smeared-out background representing the ions and electrons existing in this background. Unlike jellium, where the electrons are smeared out, Thomson’s electrons were plums. Hence, the essential difference is that the electrons in the jellium model are treated quantum mechanically; despite the fact that they can be excited out of the metal and look like Thomson’s plums, inside the metal they are itinerant. The resulting jellium model works for many properties of metals. In contrast to Thomson’s atomic model, Newton’s atoms had hooks, and it takes little imagination to see how these atoms with interlocking hooks can be used to form the basis of covalent and ionic crystals. However, again we need to show how the electrons can become hooks and form covalent or ionic bonds, and this requires quantum mechanics.
Our modern quantum atom description is based on wavefunctions which yield probabilities for electron density. So, we can determine “exactly where an electron probably is.” This brings up the challenge Dirac [3] posed after the development of quantum theory: “The underlying physical laws necessary for a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.” It is probably safe to say that to some extent we have answered Dirac’s challenge and we can now model electrons in some solids. Modern computing machines and new algorithms for solving complex equations have been an important ingredient, but just as important, and probably more so, is the conceptual base or modern “picture” of a solid that is inherently quantum mechanical.
3. Standard Model
Since solids are made of atoms, why not start with atomic wavefunctions and perturb them? This works; it is the tight-binding model, which has had great success especially for systems where electrons are not “too itinerant.” Methods like this represent a natural path for quantum chemists, who start from atoms and study molecules. This is also a logical path for doing computations of finite small systems like clusters or nanostructures. Another approach is to think of the free electron metal, where each atom contributes its electrons to the soup of electrons in a solid. Perturbations on this model, such as the nearly free electron model, represent a very successful approach. Both of these very different paths will be represented in this volume and both are useful. The latter approach is conceptually the more difficult because in some sense it starts from a gas of electrons instead of electrons bound to atoms, but it has had widespread use and leads to very useful methods. One generally restricts the basis set to plane waves, which are appropriate for free electrons, but there are other approaches. So in this model, sometimes referred to as the “Standard Model,” one can visualize an array of positive cores in a background sea of valence electrons coming from the atoms. In the plane wave pseudopotential version of this model, there are two types of particles: valence electrons and positive cores. For a study of a particular solid, one arranges the cores in a periodic array and uses a plane wave basis set for the quantum mechanical calculations. The particles interact in the following way. Core–core interactions can be viewed as interactions between point-like Coulombic objects and represented by Madelung-type sums, which give accurate descriptions. The electron–core
interaction is modeled using pseudopotentials [4, 5], and the electron–electron interactions are dealt with using density functional theory [6]. It is amazing how robust this model is when one considers that, for over 50 years, beginning with approaches like the OPW [7] and APW [8] methods, researchers struggled with the band structure dilemma of how to describe electrons which are atomic-like near the cores and free electron-like between the cores. The conceptual breakthrough was the pseudopotential, which accounted for the Pauli forces near the cores and led to weak effective potentials. Early versions were empirical [9] and fit to optical data, but eventually it became possible to construct pseudopotentials from first principles. Further discussion of pseudopotentials will be given in this volume. A convenient approach using the standard model is to calculate total energies [10] for model solids where the atoms are arranged in different configurations and only atomic information, such as atomic numbers and masses, is used as input. Hence different candidate crystal structures can be compared at varying volumes or pressures to explain the stability of observed structures or predict new ones. Here we find a major application of this method, since in addition to structural stability, properties such as lattice constants, elastic constants, bulk moduli, vibrational spectra, and even electron–phonon and anharmonic properties of solids can be evaluated. The techniques connected to this method have evolved, and they too will be discussed in this volume. Using plane waves or other basis sets and even tight-binding schemes, there appears to be consensus in this area. Particularly dramatic early successes were the predictions of new high-pressure crystalline phases of Si and Ge, and of superconductivity in high-pressure phases of Si [11]. A more recent success is a detailed explanation of the unusual superconducting properties of MgB2 [12].
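The total-energy workflow just described reduces, in its simplest form, to computing E at several volumes, fitting an equation of state, and reading off equilibrium properties; comparing the fitted minima of competing structures identifies the stable phase. A minimal sketch follows, using the Murnaghan form as one standard choice; the E(V) numbers below are synthetic placeholders, not results from any calculation.

    import numpy as np
    from scipy.optimize import curve_fit

    def murnaghan(V, E0, V0, B0, B0p):
        """Murnaghan equation of state E(V)."""
        return (E0 + B0 * V / B0p * ((V0 / V)**B0p / (B0p - 1.0) + 1.0)
                - B0 * V0 / (B0p - 1.0))

    # Hypothetical (volume, energy) pairs, as would come from total-energy
    # calculations at several lattice constants; arbitrary units.
    V = np.linspace(18.0, 23.0, 8)
    rng = np.random.default_rng(1)
    E = murnaghan(V, -10.80, 20.4, 0.55, 4.2) + 1e-4 * rng.normal(size=V.size)

    popt, _ = curve_fit(murnaghan, V, E, p0=(E.min(), V.mean(), 0.5, 4.0))
    E0, V0, B0, B0p = popt
    print(f"V0 = {V0:.2f}, B0 = {B0:.3f}, B0' = {B0p:.2f}")
    # The common tangent between two structures' E(V) curves gives the
    # pressure of a structural phase transition.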
4. Now and Later
So what are the modern challenges? If in fact we have to some extent answered Dirac’s challenge of 75 years ago, what’s next? A few obvious areas for future exploration and development are: studies of electron behavior and transport in confined or small systems; development of better order-N methods for calculating electronic properties so that more complex systems can be addressed; further development of theories designed to study excited states for optical and related properties; and the evaluation of the effects of strong electron correlation. In addition, more semi-empirical models should be developed; they were important in the past, and there is reason to believe they will contribute to future developments.
5. Confinement
It is clear that confinement sets the energy scale, whether we are considering protons in nuclei, electrons in atoms or clusters, or, to some extent, electrons in nano- and macro-scale materials. In the latter case, there are confinement scales set by the overall object size and by the components, such as atoms or unit cells. One gets a good sense of how this works when considering shell models for nuclei or for alkali metal clusters [13, 14]: so-called magic numbers of atoms in a cluster emerge from the stability of filled energy shells. The energy shell structure can influence overall structure and properties. For macrosystems, it is the atoms, their spacings, and the unit cell which set the energy scales. For confinement in macrosystems, their large sizes lead to such small energy splittings that the available energy states appear continuous even at the lowest attainable temperatures. However, size effects for small systems and surfaces can bring in a new scale, and methods such as the supercell method [15] can be used to address situations like these, where translational symmetry is lost. Clusters are good examples of systems where confinement effects can be dominant. Here, supercell techniques can be used, but real space methods, such as those described in this volume, can cover a wide range of situations where size matters. Nanotubes, peapods, atomic chains, quantum dots, large molecules, network systems, polymers, fullerenes, etc. are all examples of systems where electron confinement can lead to significant alterations in wavefunctions and hence properties. Transport is a particularly interesting field of study on the nanoscale. There are a number of research groups focused on the formulation of a transport theory for electron conduction through molecules and nanosystems. Here the vexing problem of contacts must be dealt with, and, for chains of atoms, questions related to even and odd numbers of atoms are relevant. Because the nanoscale is of interest to physicists, chemists, biologists, engineers, materials scientists, and computer scientists, there has been a great deal of synergy between these disciplines and surprising demonstrations of the commonality of the problems facing researchers in these fields. One example is molecular motors. The problem of understanding friction in molecular motors with nanotube bearings is not very different from similar questions posed by biologists studying friction in biomotors. Another example is the application of nanostructures for devices. Figure 2 shows the merging of an (8,0) semiconducting carbon nanotube with a (7,1) metallic carbon nanotube. This is achieved by inserting a defect between them with adjacent five-membered and seven-membered rings of carbon atoms. The result is a Schottky barrier whose properties are determined just by the action of a handful of atoms at the interface.
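The energy-scale argument can be made quantitative with the simplest possible model. The sketch below evaluates the particle-in-a-box level spacing for an electron (the box widths are chosen only for illustration), and then applies the standard band-folding rule that decides whether an (n, m) nanotube, such as those in Fig. 2, is metallic.

    import math

    HBAR = 1.054571817e-34   # J*s
    M_E = 9.1093837015e-31   # kg
    EV = 1.602176634e-19     # J

    def box_spacing_ev(L):
        """E2 - E1 for an electron in a 1-D infinite well of width L (m).
        E_n = n^2 pi^2 hbar^2 / (2 m L^2), so E2 - E1 = 3 E_1."""
        e1 = (math.pi * HBAR)**2 / (2.0 * M_E * L**2)
        return 3.0 * e1 / EV

    for L in (1e-9, 1e-6, 1e-2):   # 1 nm, 1 micron, 1 cm
        print(f"L = {L:.0e} m: level spacing = {box_spacing_ev(L):.3e} eV")
    # ~1 eV at 1 nm, ~1e-14 eV at 1 cm: effectively a continuum.

    # Band folding makes an (n, m) carbon nanotube metallic when n - m
    # is a multiple of 3, and semiconducting otherwise:
    for n, m in ((8, 0), (7, 1)):
        kind = "metallic" if (n - m) % 3 == 0 else "semiconducting"
        print(f"({n},{m}) nanotube: {kind}")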
Figure 2. A schematic drawing of a Schottky barrier composed of semiconducting (8,0) and metallic (7,1) carbon nanotubes.
6. Methods
Many researchers are exploring so-called “order-N” methods for attacking large or complex systems. As mentioned before, real space methods also appear promising. Researchers have developed new schemes for inverting matrices that resemble a “divide and conquer” approach: schematically, a large matrix can be cut down, through different point sampling, into smaller units. The developments in this area are encouraging, and the collaborations between mathematicians doing numerical analysis and theoretical physicists and chemists appear to be productive. Another approach is to acknowledge that most problems on solids are multi-scale problems. A multi-scale approach can be most simply illustrated
by an example where one calculates microscopic parameters and uses them along with semi-empirical models at a larger scale. Many sophisticated versions of this approach have been developed in recent years. Some of this very interesting research is described in detail in this volume.
7. Excited States
Generally the problem which arises when excited states of solids are considered is that many of the standard methods used to compute the effects of electron–electron interactions use the local density approximation (LDA), which is not directly applicable to excited-state properties. For example, in the total energy LDA approach [10], ground state properties such as lattice constants and mechanical properties are determined quite accurately. However, in an optical process, photons create electron–hole pairs in the solid which influence the excited state properties of the many-electron system. When band gaps of semiconductors are evaluated from energy bands obtained using LDA methods, the band gap is typically underestimated by a factor of about two. In some cases metallic behavior is predicted for systems known to be semiconductors. The so-called “band gap” problem was of central concern when applications of the “standard model,” which were so successful for ground state properties, became clearly unusable for computing band gaps. The overall topology of the energy bands was approximately right and in agreement with empirical models and experimental data where checks were possible, but the details were wrong. Early suggestions such as the “scissors model,” where levels were artificially shifted by adding a constant energy to the calculated band gap, were considered to be “band aids” and not cures. Although this is still an active area of research, there are methods for evaluating quasiparticle energies. One of the most successful is the GW method [16], which works for a broad class of solids. Two major ingredients in this approach are the inclusion of electron self-energy effects and the modulation of the charge density in the crystal. This latter feature allows for the effects on exchange and correlation energies arising from the concentration of electrons into bonds, as an example. Another feature of the excited state which must be addressed is the role of electron–hole interactions. Two of the most dramatic effects are the formation of excitons and the alteration of oscillator strengths arising from electron–hole interactions. Again, this is an active area of research, but a workable theory is available [17] where the Bethe–Salpeter approach for two-particle scattering is adopted and applied along with the GW machinery. Forces in the excited state and other special features arising
from considering these interactions can be calculated. Comparisons between this method and others, such as time-dependent density-functional approaches [18], quantum Monte Carlo methods, and more quantum-chemistry-oriented approaches, are yielding new insights into this area. It appears that research in this field will remain active for some time, as there are many possible applications.
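For concreteness, the “scissors” shift dismissed above as a band-aid is at least easy to state: rigidly raise all empty bands until the minimum gap matches experiment. A minimal sketch follows; the band energies below are invented placeholders, and the gap values are merely representative of Si (an LDA gap of roughly half the experimental 1.17 eV), not results from this chapter.

    import numpy as np

    # Hypothetical band energies (eV); rows are k-points, columns are bands.
    lda_bands = np.array([[-12.0, -6.2, 0.0, 0.52, 3.4],
                          [-11.5, -5.8, -1.1, 1.20, 3.9]])
    n_occ = 3                                  # bands 0..2 occupied
    exp_gap = 1.17                             # target experimental gap (eV)

    lda_gap = lda_bands[:, n_occ:].min() - lda_bands[:, :n_occ].max()
    delta = exp_gap - lda_gap                  # the rigid "scissors" shift
    corrected = lda_bands.copy()
    corrected[:, n_occ:] += delta              # shift only the empty states
    print(f"LDA gap {lda_gap:.2f} eV, scissors shift {delta:.2f} eV")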
8. Strongly Correlated Electrons
At this time, it is commonly believed that a forefront field of condensed matter theory is the study of strongly correlated electrons. However, as in the case of defining biophysics, the image of what is meant by this field of study varies with individuals. As was described at the beginning of this article, there are theorists attempting to use simplified models to get the essence of the physics associated with problems related to strongly correlated electron systems. A prime example is the large amount of research devoted to the study of superconductivity in copper oxide systems. Here it is clear why theorists are motivated: electron correlation effects are important, there is no consensus yet on the underlying electron-pairing mechanism, and the normal-state and superconducting properties are very interesting. So the application of models such as the Hubbard model has attracted a large number of theoretical researchers. Many interesting proposals for explaining the electronic properties of the oxides using Hubbard-like models have been advanced. At present, this is an active field, but as mentioned before, there is still no general agreement on “the” appropriate description of these systems, and in general, there is a lack of definitive proof of good theoretical–experimental agreement. The more ab initio approaches designed for specific materials are beginning to make some impact on this area. Despite the known shortcomings of applying band structure calculations based on a density functional approach to materials of this kind, these were among the most useful calculations for interpreting experiments, like photoelectron spectroscopy, aimed at determining electronic structure. The Fermi surface topology and other electronic characteristics were explored with considerable success through experimental–theoretical comparisons, along with reasonable empirical adjustments to the electronic structure calculations. Currently, efforts are underway for a more frontal assault on this problem. By combining local spin density calculations with Hubbard-like terms to account for electron–electron repulsion, more realistic electronic structure calculations are being done. Variations and improvements on these “LSDA + U” approaches [19], including the use of pseudopotentials, appear to be promising. And it is possible that the more first-
principles, materials-motivated approach may make important contributions to the conceptual development of this field.
9. Empirical Models
Just as the atomic models of Thomson and Newton described earlier help to form a basis for the conceptual picture of electronic behavior, other empirical and semi-empirical models had a considerable effect on the development of this field of study. The Thomas–Fermi model, which allowed calculation of electron screening effects, and Slater’s and Wigner’s formulas for evaluating the effects of exchange and correlation gave important insight into the role of these many-body effects. Free electron and nearly free electron models were extremely important, as were empirical tight-binding models for estimating band structure effects. An example which illustrates the transition from an empirical model designed to explain experimental data into a first-principles approach is the Empirical Pseudopotential Method (EPM). In this approach [9], a few form factors (usually three per atom) of the potential in the unit cell are fit to yield band structures consistent with experimental measurements. For example, three band gaps in the optical spectrum of Si or Ge can be used to fix the potential for these atoms, and then the electronic band structure and other properties can be computed with a high degree of accuracy. When applying the EPM, the pseudopotential is taken to be the total potential a valence electron experiences; it combines the electron–ion and electron–electron interactions. In the course of fitting these potentials, the problem of how the optical properties of semiconductors were related to interband transitions was solved in the 1960s and 1970s. In addition, a great deal was learned about the pseudopotential. It was found that pseudopotentials were “transferable”: pseudopotentials constructed for InAs, InSb and GaAs could be used to extract As, In, Sb and Ga pseudopotentials. In fact, the extracted In, Ga, As, and Sb pseudopotentials were transferable between compounds and even worked well to give the electronic structure of these metals and semi-metals. So it became clear that each atom had its own transferable potential, and at least to a first approximation, these could be extracted from experiment and applied widely. In addition to learning about the transferability of the pseudopotentials, their general form and properties gave a great deal of information which was used when first-principles potentials were developed. So this empirical approach, which is still used, not only provided an accessible and flexible calculational tool, it also provided ideas and facts for use in developing the fundamental theory. The resulting band structures were also accurate.
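The EPM recipe is compact enough to sketch directly. In the illustration below, the three silicon form factors (−0.21, 0.04, and 0.08 Ry) are the values reported in [9]; the plane-wave cutoff, unit conversions, and sample k-points are our own illustrative choices, so the eigenvalues are qualitative rather than converged.

    import itertools
    import numpy as np

    A = 5.43                       # Si lattice constant (angstrom)
    TPA = 2.0 * np.pi / A
    RY = 13.606                    # rydberg in eV
    HB2M = 3.81                    # hbar^2/2m in eV*angstrom^2
    # Symmetric form factors V(|G|^2), with |G|^2 in units of (2*pi/a)^2 [9]
    VFF = {3: -0.21 * RY, 8: 0.04 * RY, 11: 0.08 * RY}

    # fcc reciprocal-lattice vectors inside a kinetic-energy cutoff
    b = TPA * np.array([[-1, 1, 1], [1, -1, 1], [1, 1, -1]])
    G = np.array([i * b[0] + j * b[1] + k * b[2]
                  for i, j, k in itertools.product(range(-4, 5), repeat=3)])
    G = G[np.einsum('ij,ij->i', G, G) / TPA**2 <= 16.01]

    tau = (A / 8.0) * np.ones(3)   # diamond structure: atoms at +/- tau

    def bands(k, nbands=8):
        """Diagonalize the plane-wave EPM Hamiltonian at wavevector k."""
        n = len(G)
        H = np.zeros((n, n))
        for p in range(n):
            kG = k + G[p]
            H[p, p] = HB2M * (kG @ kG)          # kinetic energy
            for q in range(p + 1, n):
                dG = G[p] - G[q]
                g2 = int(round((dG @ dG) / TPA**2))
                if g2 in VFF:                   # form factor * structure factor
                    H[p, q] = H[q, p] = VFF[g2] * np.cos(dG @ tau)
        return np.linalg.eigvalsh(H)[:nbands]

    for label, k in [("Gamma", np.zeros(3)), ("X", TPA * np.array([0.0, 0.0, 1.0]))]:
        print(label, np.round(bands(k), 2), "eV")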
Figure 3. A comparison of the predicted pseudopotential band structure for occupied energy bands in GaAs together with the experimental bands determined by angle-resolved photoemission spectroscopy.
Figure 3 shows a comparison between the predicted EPM band structure of GaAs and the subsequent experimentally determined data using angle-resolved photoemission spectroscopy. Another example involved bulk moduli of semiconductors and insulators. The first-principles approach using total energy calculations as a function of volume E(V) allows the determination of elastic constants and, in particular, the bulk modulus B. These calculations are fairly extensive and hence costly. Another approach, based on concepts introduced by Phillips [20], yields a connection between spectral properties of semiconductors and insulators and their structural or bonding properties. By exploiting [21] these concepts, a simple formula can be derived for B (in GPa) which requires only the bond length d (in angstroms) and an integer I = 0, 1, 2 indicating a group IV, III–V, or II–VI compound. The resulting formula, B = (1972 − 220I)d^(−3.5), gives calculated values of B to within a few percent of the experimental values. Again, this semi-empirical approach is valuable not only because the calculation can be done on a hand calculator in a few seconds; it also gives insight into the nature of compressibilities. For example, one can make estimates and explore limits of B as an aid in predicting the existence of superhard solids [22].
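The hand-calculator claim is easy to honor. In the sketch below, the bond lengths for C and Si are those cited earlier for Fig. 1, while the GaAs bond length of 2.45 Å is an assumed value added for illustration.

    def bulk_modulus_gpa(d, I):
        """Empirical bulk modulus B = (1972 - 220*I) / d**3.5 [21];
        d in angstroms, I = 0, 1, 2 for group IV, III-V, II-VI."""
        return (1972.0 - 220.0 * I) * d**-3.5

    for name, d, I in [("C (diamond)", 1.54, 0), ("Si", 2.35, 0), ("GaAs", 2.45, 1)]:
        print(f"{name:12s} B ~ {bulk_modulus_gpa(d, I):4.0f} GPa")
    # Yields roughly 435, 99, and 76 GPa, within a few percent of measured
    # values, consistent with the accuracy claimed in [21].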
10. Future
As Yogi Berra stated, “Predictions are hard to make, especially about the future.” However, it is clear that this area of physics will expand. Multi-scale methods [23] for studying materials assembled from fundamental building blocks understood at the micro or nano level will remain an active field, with interest coming from materials science, chemistry, and physics. Problems like understanding the nature of growth, diffusion, amorphous materials, and even non-equilibrium processes can be addressed. Molecular dynamics [24] can also be used to attack problems of this kind [25]. Real space methods [26, 27] will also continue to impact this area of research. The general interest in clusters, how they develop properties associated with bulk matter, and the study of the evolution of material properties as size changes will demand new methods and concepts. As mentioned in the section on excited states, there has been considerable progress in determining optical properties from first-principles theory for solids. There has also been progress on the calculation of optical properties for clusters and nanocrystals. These approaches [18] are sometimes labeled time-dependent LDA, or TDLDA. Growth in this area is also expected. A frontier has always been the study of increasingly more complex solids. Many materials can be described in terms of unit cells with a finite number of atoms. Computational problems arise as the number of atoms increases. Here hardware development helps, and it is impressive how much progress continues to be made in extending the complexity of systems that can be studied. However, the appetite for considering more complex systems is large, particularly at the border where this field of science merges with biophysics. Complex molecules and systems like DNA are coming into the range of study where researchers expect precision on the level of what has been achieved for crystals. Clearly this is an area of important research with a bright future, as are nanoscience and quantum computation, where we may possibly learn new things about quantum mechanics. As mentioned earlier, the frontier of correlated electrons remains, and many feel that present theory is up to the challenge. If success is achieved in this area and our ability to treat more complex systems is enhanced, it may be possible to predict new states of matter. I would expect that this phase of discovery, if it is in the cards for theorists, will be preceded by the development of semi-empirical theories like the EPM. With good models and general knowledge of effects such as polarizability [28], one may be able to predict phenomena
on the level of magnetism, superconductivity, and the quantum Hall effects. However, this may be a long way off, so we still need experimentalists.
Acknowledgments

This work was supported by National Science Foundation Grant No. DMR00-87088 and by the Director, Office of Science, Office of Basic Energy Sciences, Division of Materials Sciences and Engineering, US Department of Energy under contract No. DE-AC03-76SF00098.
References

[1] J. Bardeen, L.N. Cooper, and J.R. Schrieffer, “Theory of superconductivity,” Phys. Rev., 108, 1175–1204, 1957.
[2] J.P. Walter and M.L. Cohen, “Electronic charge densities in semiconductors,” Phys. Rev. Lett., 26, 17–19, 1971.
[3] P.A.M. Dirac, “Quantum mechanics of many-electron systems,” Proc. R. Soc. (London), A123, 714–733, 1929.
[4] E. Fermi, “On the pressure shift of the higher levels of a spectral line series,” Nuovo Cimento, 11, 157, 1934.
[5] J.C. Phillips and L. Kleinman, “New method for calculating wave functions in crystals and molecules,” Phys. Rev., 116, 287–294, 1959.
[6] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133–A1138, 1965.
[7] C. Herring, “A new method for calculating wave functions in crystals,” Phys. Rev., 57, 1169–1177, 1940.
[8] J.C. Slater, “Wave functions in a periodic potential,” Phys. Rev., 51, 846–851, 1937.
[9] M.L. Cohen and T.K. Bergstresser, “Band structures and pseudopotential form factors for fourteen semiconductors of the diamond and zincblende structures,” Phys. Rev., 141, 789–796, 1966.
[10] M.L. Cohen, “Pseudopotentials and total energy calculations,” Phys. Scripta, T1, 5–10, 1982.
[11] K.J. Chang, M.L. Cohen, J.M. Mignot, G. Chouteau, and G. Martinez, “Superconductivity in high-pressure metallic phases of Si,” Phys. Rev. Lett., 54, 2375–2378, 1985.
[12] H.J. Choi, D. Roundy, H. Sun, M.L. Cohen, and S.G. Louie, “The origin of the anomalous superconducting properties of MgB2,” Nature, 418, 758, 2002.
[13] W.D. Knight, K. Clemenger, W.A. de Heer, W.A. Saunders, M.Y. Chou, and M.L. Cohen, “Electronic shell structure and abundances of sodium clusters,” Phys. Rev. Lett., 52, 2141–2143, 1984.
[14] W.A. de Heer, W.D. Knight, M.Y. Chou, and M.L. Cohen, “Electronic shell structure and metal clusters,” In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, vol. 40, Academic Press, New York, p. 93, 1987.
[15] M.L. Cohen, M. Schlüter, J.R. Chelikowsky, and S.G. Louie, “Self-consistent pseudopotential method for localized configurations: molecules,” Phys. Rev. B, 12, 5575–5579, 1975.
[16] M.S. Hybertsen and S.G. Louie, “First-principles theory of quasiparticles: calculation of band gaps in semiconductors and insulators,” Phys. Rev. Lett., 55, 1418–1421, 1985; Phys. Rev. B, 34, 5390–5413, 1986.
[17] M. Rohlfing and S.G. Louie, “Electron–hole excitations in semiconductors and insulators,” Phys. Rev. Lett., 81, 2312–2315, 1998; Phys. Rev. B, 62, 4927–4944, 2000.
[18] I. Vasiliev, S. Ogut, and J.R. Chelikowsky, “First-principles density-functional calculations for optical spectra of clusters and nanocrystals,” Phys. Rev. B, 65, 115416, 2002.
[19] V.I. Anisimov, J. Zaanen, and O.K. Andersen, “Band theory and Mott insulators: Hubbard U instead of Stoner I,” Phys. Rev. B, 44, 943–954, 1991.
[20] J.C. Phillips, Bonds and Bands in Semiconductors, Academic Press, New York, 1973.
[21] M.L. Cohen, “Calculation of bulk moduli of diamond and zinc-blende solids,” Phys. Rev. B, 32, 7988–7991, 1985.
[22] A.Y. Liu and M.L. Cohen, “Prediction of new low compressibility solids,” Science, 245, 841, 1989.
[23] N. Choly and E. Kaxiras, “Fast method for force computations in electronic structure calculations,” Phys. Rev. B, 67, 155101, 2003.
[24] R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett., 55, 2471–2474, 1985.
[25] S. Yip, “Nanocrystalline metals – Mapping plasticity,” Nature Mater., 3, 11, 2004.
[26] J.R. Chelikowsky, N. Troullier, and Y. Saad, “The finite-difference-pseudopotential method: electronic structure calculations without a basis,” Phys. Rev. Lett., 72, 1240–1243, 1994.
[27] M.M.G. Alemany, M. Jain, J.R. Chelikowsky, and L. Kronik, “A real space pseudopotential method for computing the electronic properties of periodic systems,” Phys. Rev. B, 69, 075101, 2004.
[28] I. Souza, J. Iniguez, and D. Vanderbilt, “Dynamics of Berry-phase polarization in time-dependent electric fields,” Phys. Rev. B, 69, 085106, 2004.
[29] M.L. Cohen and J.R. Chelikowsky, Electronic Structure and Optical Properties of Semiconductors, Springer-Verlag, Berlin, 1988.
[30] C. Kittel, Introduction to Solid State Physics, seventh edition, Wiley, New York, 1996.
[31] J.C. Phillips, Bonds and Bands in Semiconductors, Academic Press, New York, 1973.
[32] P.Y. Yu and M. Cardona, Fundamentals of Semiconductors, Springer, Berlin, 1996.
1.3 ACHIEVING PREDICTIVE SIMULATIONS WITH QUANTUM MECHANICAL FORCES VIA THE TRANSFER HAMILTONIAN: PROBLEMS AND PROSPECTS
Rodney J. Bartlett, DeCarlos E. Taylor, and Anatoli Korkin
Quantum Theory Project, Departments of Chemistry and Physics, University of Florida, Gainesville, FL 32611, USA
1. Prologue
According to the Westmoreland report [1], “in the next ten years, molecularly based modeling will profoundly affect how new chemistry, biology, and materials physics are understood, communicated, and transformed to technology, both intellectually and in commercial applications. It creates new ways of thinking – and of achieving.” Computer modeling of materials can potentially have an enormous impact in designing or identifying new materials, how they fracture or decompose, what their optical properties are, and how these and other properties can be modified. However, materials simulations can be no better than the forces provided by the potentials of interaction among the atoms involved in the material. Today, these are almost invariably classical, analytical, two- or three-body potentials, because only such potentials permit the very rapid generation of forces required by large-scale molecular dynamics. Furthermore, while such potentials have been laboriously developed over many years, adding new species frequently demands another long-term effort to generate potentials for the new interactions. Most simulations also depend upon idealized crystalline (periodic) symmetry, making it more difficult to describe the often more technologically important amorphous materials. If we also want to observe bond breaking and formation, optical properties, and chemical reactions, we must have a quantum mechanical basis for our simulations. This requires a multi-scale philosophy, where a quantum mechanical core is tied to a classical
atomistic region, which in turn is embedded in a continuum of some sort, like a reaction field or a finite-element region. It is now well known that ab initio quantum chemistry has achieved the quality of being “predictive” to within established small error bars for most properties of isolated, relatively small molecules, making it far easier to obtain requisite information about molecules from applications of theory than to attempt complicated and expensive experimental observation. In fact, applied quantum chemistry as implemented in many widely used computer programs – ACES II [2], GAUSSIAN, MOLPRO, MOLCAS, QCHEM, etc. – has now attained the status of a tool that is complementary to X-ray structure determination and NMR and IR spectra in the routine determination of the structure and spectra of molecules. However, there is an even greater need for the computer simulations of complex materials to be equally predictive. Unlike molecules, which can usually be characterized in detail by spectral and other means, materials are far more complex and cannot usually be investigated experimentally under similarly controlled conditions. They have to be studied at elevated temperatures and under non-equilibrium conditions. Frequently, the application of the material might be meant for extreme situations that might not even be accessible in a laboratory. Hence, if we use more economical computer models to learn how to suitably modify a material to achieve an objective, our materials simulations must be “predictive,” so that we can trust both the qualitative and quantitative consequences of the simulations. Besides the predictive aspect, another theme that permeates our work with materials is “chemistry.” By chemistry we mean that, unlike the idealized systems that have been the focus of most of the simulation work in materials science, we want to consider the essential interactions among many different molecular species, in particular under stress. As an example, a long unsolved problem in materials is why water will cause forms of silica to weaken by several orders of magnitude compared to their dry forms [3–5], while ammonia with silica shows a different behavior. A proper, quantum mechanically based simulation should reflect these differences, qualitatively and quantitatively. The third theme of our work is that by virtue of using a quantum mechanical (QM) core in multi-scale simulations, unlike all the simulations based upon classical potentials, we have quantum state specificity. In a problem like etching silica with CF4, which generates the etching agent CF3, a classical potential cannot distinguish between CF3+, CF3−, and CF3· (cation, anion, and radical), yet obviously the chemistry will be very different. Furthermore, we also need the capability to use excited electronic states in our simulations, to include species like CF3*, for example, or to distinguish between different modes of fracture of the silica target, such as radical dissociation as opposed to ionic dissociation. Conventionally, the only quantum mechanically based multi-scale dynamics simulations that would permit as many as 500–1000 atoms in the QM region were based upon the tight-binding (TB) method, density functional theory
(DFT) being used only for smaller QM regions. TB is a pervasive term that covers everything from crude, non-self-consistent descriptions like extended Hückel theory [6], to quasi-self-consistent schemes based upon Mulliken or other point charges [7], to a long history of solid state efforts [8, 9], to TB with three-body terms [10]. The poorest of these do not introduce overlap, self-consistency, or explicit consideration of the nuclear–nuclear repulsion terms that would be essential in any ab initio approach; so in general such methods cannot correctly describe bond breaking, where charge transfer is absolutely essential. However, there have been significant improvements on several fronts in the recent TB literature [11, 12] which are helping to rectify these failings. The alternative approach to TB is that based upon the semi-empirical quantum chemistry tradition starting with Pariser and Parr [13, 14], Dewar et al. [15, 16] and Pople et al. [17, 18], and being extended on several fronts by Stewart [19–21], Thiel [22], Merz [23], Repasky et al. [24], and Tubert-Brohman et al. [25]. These “neglect of differential overlap” methods, of which the most flexible is the NDDO (“neglect of diatomic differential overlap”) method, will be our initial focus. As in TB methods, the Hamiltonian is greatly simplified, not necessarily by limiting all interactions to nearest neighbors, but by operationally limiting interactions mostly to diatomic units in molecules. We will address some of the details later, but for most of our purposes, the particular form for the “transfer Hamiltonian” will be at our disposal, and suitable forms with rigorous justification are a prime objective of our research. It might be asked: why a “Hamiltonian” instead of a potential energy surface? Fitting the latter, especially while including the plethora of bond-breaking regions, is virtually impossible even for simple molecules. Highly parameterized molecular mechanics (MM) methods [26] can do a good job of generating a potential energy surface near equilibrium for well-defined and unmodified molecular units, but bond breaking and formation are outside the scope of MM. So our objective, instead of the PES (potential energy surface), is to create a “transfer Hamiltonian” that permits the very rapid determination of, in principle, all the properties of a molecule, and especially the forces on a PES for the steps of the MD. The transfer Hamiltonian gives us a way to subsume most of the complications of a PES in a very convenient package that will yield the energy and first and second derivatives upon command. This has been done to some degree in rate constant applications for molecules of several atoms, where the complication is the need for multi-dimensional PES information [27–29]. Here, we conceive of the transfer Hamiltonian as a way to get all the relevant properties of a molecule, including its electronic density, related properties like dipole moments, and its photoelectron, electronic, and vibrational spectra. Except for the latter, these are purely “electronic” properties, which depend solely on the electronic Schrödinger equation. These should be distinguished from forces and the PES itself, which are properties of the total energy.
The distinction between the two has been at the heart of the principal dilemma in simplified or semi-empirical theory, where a set of parameters that gives the total energy is not able to describe electronic properties equally well. It is also critical that the Hamiltonian be computed very rapidly, to accommodate MD applications, and a form for it needs to be determined such that we retain the accuracy of the forces and other properties that would come from ab initio correlated theory. This is more an objective than a fait accompli, but we will discuss how to try to accomplish this in this contribution. Our approach is to appeal to the highest level of ab initio quantum chemistry, namely coupled-cluster (CC) theory, as the basis for a "transfer Hamiltonian" that embeds the accurate, predictive-quality CC forces taken from suitable clusters, but in an operator that is of very low rank, making it possible to do fully self-consistent calculations on ∼500–1000 atoms undergoing MD. Hence, as long as a phenomenon is accessible to MD, and if the transfer Hamiltonian forces retain the accuracy of CC theory, we should be able to retain the predictive quality of the CC method in materials simulations; and if we can also describe the electronic properties accurately, we have everything that the Schrödinger equation could tell us about our system. In addition, we have no problem with changing atoms or adding new molecules to our simulations, as our transfer Hamiltonian is applicable to any system once trained to ensure its proper description. We will also develop the transfer Hamiltonian approach from DFT considerations in the following, to show the essential consistency between the wavefunction and density functional methods. Our emphasis on predictability, chemistry, and state specificity offers a novel perspective in the field; and the tools we are developing, all tied together with highly flexible software, set the stage for the kinds of simulations that will lead to reliable materials design. As the Westmoreland report further states, 'The top needs required by industry are methods that are "bigger, better, faster;" (with) more extensive validation, and multiscale techniques.'
2. Introduction
Our objective is predictive simulations of materials. The critical element in any such simulation is the forces that drive the molecular dynamics. For a reliable description of bond breaking, as in fracture or chemical reaction; to distinguish between a free radical and a cation or anion, i.e., to be electronic-state specific; or to account for optical spectra, the forces must be obtained from a quantum mechanical method. Today's entirely first-principles, quantum chemical methods are "predictive" for small molecules in the sense that, with a suitable level of electron correlation, notably with coupled-cluster (CC) theory [30] and large enough basis sets [30, 31], or to a lesser extent with density functional theory (DFT) [32–34], the results for molecular structure, spectra,
energetics, and the associated atomic forces required for these quantities and for reaction paths are competitive with experiment. In particular, these highly correlated methods offer accurate results for transient molecules and other experimentally inaccessible species, and particularly reaction paths, which can seldom be known from solely experimental considerations. In terms of ab initio theory, the established paradigm of results from converging, correlated methods is MP2, CCSD, CCSD(T), CCSDT, . . . , approaching the full CI limit (Fig. 1), combined with systematically improvable correlation-consistent basis sets, cc-pVXZ [35].
Figure 1. Comparison of CI, MBPT, and CC results with full CI. Results based on a DZP basis for BH and H2O at Re, 1.5Re, and 2.0Re.
When we go to X = 3, we get a third s, a third set of p functions, a second set of d functions, and a set of f functions. Clearly, we rapidly go to quite large basis sets when X ≥ 3. The fundamental problem with using these methods for large molecules is that after MP2 (∼n^5) the above CC calculations scale steeply with the number of basis functions n, as ∼n^6 for CCSD, ∼n^7 for (T), ∼n^8 for CCSDT, etc. The CC methods now in wide use were developed by the Bartlett group from 1978 to the present [36], and have now been implemented numerous times by independent researchers. As for benchmarks toward experiment, many studies of expected error bars exist in the literature for various levels of CC. Notably, the book by Helgaker et al. [31] shows many comparisons. We plot the normal distributions of their results for HF, CCSD, and CCSD(T) in Figs. 2–4 for the pVDZ and pVTZ bases. All ab initio results depend upon the quality of the basis set as well as the correlation corrections. We can summarize the results in Table 1. With a triply polarized basis like cc-pVTZ, the CCSD(T) standard deviations are: for structure, ∼0.0023 Å; dissociation energies for single bonds, ∼3.5 kcal/mol; harmonic vibrational frequencies, ∼5–20 cm−1; excitation energies, ∼0.1 eV for singly excited states; and NMR coupling constants, ∼5 Hz; with similar ones for other properties. From the normal distributions of errors for bond lengths, dissociation energies, and heats of atomization in small molecules at various levels of theory, there is a dramatic improvement of CC methods over SCF, CISD, and MP2. There can also be a significant difference between CCSD and CCSD(T), where the triple excitations are added in a non-iterative form to CCSD [36]. There is an inadequate database for transition states and activation barriers, since few are known experimentally. For complex systems of the type addressed by modern multi-scale simulations [39, 40], maintaining a chain of approximations built upon the quantum mechanical core, like the paradigm above, to retain the predictability of the underlying forces is even more important, as there is seldom the extent and quality of molecule-specific experimental data available to test the theory that there is for small molecules.
Figure 2. Normal distributions of the errors in calculated bond distances for a set of 28 molecules containing first row atoms.
Figure 3. Normal distributions of the errors in calculated atomization energies for a set of 16 molecules containing first row atoms.
Figure 4. Normal distributions for the errors in calculated reaction enthalpies for a set of 13 reactions containing first row atoms.

Table 1. Errors in bond lengths (Å) and dissociation energies (kcal/mol) as a function of basis set and method [37, 38]

                Bond length           Dissociation energy
    Method      DZ        TZ          DZ        TZ
    HF          0.021     0.028       7.12      6.85
    MP2         0.013     0.006       7.41      3.28
    CCSD(T)     0.016     0.002       8.78      2.88
Hence, evolving toward predictive simulations is critical to obtaining accurate qualitative and quantitative conclusions. So how can we achieve the predictability we need for materials simulations? The problem is illustrated in Fig. 5. We can do highly accurate studies of molecular structure, spectra, and bond breaking for ∼20 atoms at the CC level; ∼50–200 at the MP2 level; and ∼100–300 at the DFT level.
Figure 5. Computational accuracy and efficiency of available potential forms compared to the transfer Hamiltonian.
In isolated cases, for the energy at a single geometry (not necessarily forces), with additional tricks we can go much further, to ∼1000 atoms [41, 42]. But here we are only concerned with methods for the forces that can be computed on a time scale that can be reasonably tied to MD. This imposes a severe limitation on the size of system that can be addressed. The "transfer Hamiltonian" concept [43, 44] is meant to be a way to retain much of the accuracy of ab initio quantum chemistry, like that from CCSD, but in a way that permits ∼500–1000 atoms to be described by QM forces within a time frame that can be tied to dynamics. We will first consider the wavefunction viewpoint and then that from DFT. After discussing the formal structure, we will specialize to a particular form for the transfer Hamiltonian and illustrate its application with numerical results.
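To make the steep polynomial scaling concrete, the following minimal sketch (in Python; the ∼20 basis functions per atom is an assumed round figure and all prefactors are ignored) shows how the operation counts quoted above grow in going from a ∼20-atom cluster to a ∼500-atom system:

    # Order-of-magnitude illustration only: formal scaling ratios, not timings.
    def cost_ratio(n_small, n_large, power):
        return (n_large / n_small) ** power

    n_small = 20 * 20   # ~20 atoms at an assumed ~20 basis functions per atom
    n_large = 20 * 500  # ~500 atoms at the same basis quality
    for name, p in (("MP2", 5), ("CCSD", 6), ("(T)", 7), ("CCSDT", 8)):
        print(f"{name:7s} ~n^{p}: {cost_ratio(n_small, n_large, p):.1e}x more work")

The ∼10^10 growth for the (T) step makes clear why CC-quality forces cannot simply be scaled up, and why a low-rank operator carrying the same information is attractive.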
3. Transfer Hamiltonian: Wavefunction Approach
In the correlated CC theory we start with the time-independent Schrödinger equation,

$$H\Psi = E\Psi \qquad (1)$$
$$\Psi = \exp(T)\,|0\rangle \qquad (2)$$
and introduce the CC ansatz by writing the wavefunction in the exponential form of Eq. (2). The operator

$$T = T_1 + T_2 + T_3 + \cdots \qquad (3)$$
$$T_1 = \sum_{a,i} t_i^a\,\{a^\dagger i\} \qquad (4)$$
$$T_2 = \sum_{i>j,\;a>b} t_{ij}^{ab}\,\{a^\dagger i\, b^\dagger j\} \qquad (5)$$
$$T_3 = \sum_{i>j>k,\;a>b>c} t_{ijk}^{abc}\,\{a^\dagger i\, b^\dagger j\, c^\dagger k\} \qquad (6)$$
$T_1$ generates all single excitations from the vacuum, i.e., $T_1|0\rangle = \sum_{a,i} t_i^a\,\{a^\dagger i\}|0\rangle$, the vacuum usually being HF (but it could equally well be the Kohn–Sham determinant), meaning excitation of an electron from an occupied orbital to an unoccupied one. We use the convention that $i, j, k, l$ represent orbitals occupied in the Fermi vacuum, while $a, b, c, d$ are unoccupied, and $p, q, r, s$ are unspecified. $T_2$ does the same for the double excitations, and $T_3$ the triple excitations. Continuation through $T_n$ for $n$ electrons will give the full CI solution. Multiplying the Schrödinger equation from the left by $\exp(-T)$, the critical quantity in CC theory is the similarity-transformed Hamiltonian,

$$\exp(-T)\,H\,\exp(T) = \bar{H} \qquad (7)$$
where the Schrödinger equation becomes

$$\bar{H}|0\rangle = E|0\rangle \qquad (8)$$
$|0\rangle$ is the Fermi vacuum, an independent-particle wavefunction, but $E(R) = \langle 0|\bar{H}|0\rangle$ is the exact energy at a given geometry, and the exact forces subject to atomic displacements are

$$F(R) = -\nabla E(R) \qquad (9)$$
The effects of electron correlation are contained in the cluster amplitudes, whose equations at a given $R$ are

$$Q_n \bar{H}|0\rangle = 0$$

where $Q_1 = \sum|\Phi_i^a\rangle\langle\Phi_i^a|$, $Q_2 = \sum|\Phi_{ij}^{ab}\rangle\langle\Phi_{ij}^{ab}|$, $Q_3 = \sum|\Phi_{ijk}^{abc}\rangle\langle\Phi_{ijk}^{abc}|$, etc. The $Q_1$ projections give the equations for $\{t_i^a\}$, and similarly for the other amplitudes. Limiting ourselves to single and double excitations, we have CCSD, which is a highly correlated, accurate wavefunction. Consideration of triples provides CCSDT, the state of the art; while for practical application, its non-iterative forms, CCSD[T] and the improved modification CCSD(T), are currently considered the "gold standard" for most molecular studies [36, 43].
Regardless of the choice of excitation, $\bar{H}$ may be written in second quantization as

$$\bar{H} = \bar{h}^p_q\,\{p^\dagger q\} + \tfrac{1}{2}\,\bar{g}^{pq}_{rs}\,\{p^\dagger q^\dagger s r\} + \mathrm{III} + \mathrm{IV} + \cdots \qquad (10)$$
where summation over repeated indices is assumed and III and IV indicate three- and four-body operators. The indices can indicate either atomic or molecular orbitals. More explicitly,

$$\bar{g}^{pq}_{rs} = \langle pq|rs\rangle = (pr|qs) = \int d1\, d2\;\phi_p^*(1)\phi_r(1)\,\bar{g}_{12}\,\phi_q^*(2)\phi_s(2)$$

where the latter two-electron integral indicates the interaction between the electron distributions associated with electrons 1 and 2, respectively. We use $\bar{g}_{12}$ instead of $r_{12}^{-1}$ because in the generalized form for $\bar{H}$ there may be additional operators of two-electron type besides just the familiar integrals. Such one- and two-electron quantities, further separated into one, two, and more atomic centers, are the quantities that will have to be computed or, in the case of simplified theories, approximated, to provide the results we require. At this point, we have an explicitly correlated, many-particle theory. It is important to distinguish this from an effective one-particle theory as in DFT or Hartree–Fock, which are much easier to apply to complicated systems. To make this connection, we choose to reformulate the many-particle theory into an effective one-particle form. This is accomplished by insisting that the energy variation $\delta E = 0$, which means the derivative of $E$ with respect to the orbitals that compose the single determinant, $|\Phi\rangle$, must vanish. As our expressions for $t_{ij\ldots}^{ab\ldots}$, the CC equations, depend upon the integrals over these orbitals, and consequently upon $\bar{H}$, this procedure is iterative. As any such variation of a determinant can be written in the form $|\Phi\rangle = \exp(T_1)|0\rangle$, the single-excitation projection of $\bar{H}$ has to vanish,

$$\langle\Phi_i^a|\bar{H}|0\rangle = 0 = \langle a|h^{\mathrm T}|i\rangle \qquad (11),\ (12)$$
where we introduce the "transfer Hamiltonian" operator, $h^{\mathrm T}$. Since this matrix element vanishes between the occupied orbital $i$ and the unoccupied orbital $a$, we can use the resolution of the identity $1 = \sum_j |j\rangle\langle j| + \sum_b |b\rangle\langle b|$ to rewrite this equation in the familiar form,

$$h^{\mathrm T}|i\rangle = \sum_j \lambda_{ji}\,|j\rangle = \epsilon_i\,|i\rangle \qquad (13)$$
where the first form retains the off-diagonal Lagrange multipliers, while the second is canonical. The above can equally well be done for HF-SCF theory, except $h^{\mathrm T} = f = t + v + J - K = h + J - K$, where we have the kinetic-energy operator and the electron–nuclear attraction term $-\sum_A Z_A/|\mathbf{r} - \mathbf{R}_A|$, combined together into the one-particle element $h$ of Eq. (13); and the Coulomb repulsion and
the non-local exchange operator, respectively. The Hartree–Fock effective one-particle operator contains $J - K = \sum_j \int d2\;\phi_j^*(2)\, r_{12}^{-1}(1 - P_{12})\,\phi_j(2)$, and there would be no correlation in the Fock operator. In that case, $\epsilon_i$ provides the negative of the Koopmans' estimate of the ionization potentials, and $\epsilon_a$ the Koopmans' approximation to the electron affinities. For the correlated $h^{\mathrm T}$, which is the one-particle theory originally due to Brueckner [45, 46], all single excitations vanish from the exact wavefunction, and as a consequence we have maximum overlap of the Brueckner determinant with the exact wavefunction, $\langle\Phi_B|\Psi\rangle$. In general, Brueckner theory is not Hermitian, but in any order of perturbation theory we can insist upon its hermiticity, i.e., $\langle i|h^{\mathrm T}|a\rangle = 0$, and that will be sufficient for our purposes. The specific form for the transfer Hamiltonian matrix element is

$$\langle a|h^{\mathrm T}|i\rangle = \langle a|f|i\rangle + \tfrac{1}{2}\langle aj||cb\rangle\, t_{ij}^{cb} - \langle kj||ib\rangle\, t_{kj}^{ab} \qquad (14)$$
where summation over repeated indices is implied. Keeping the form of the $h^{\mathrm T}$ operator in the $\langle a|h^{\mathrm T}|i\rangle$ matrix element the same, when $a$ is replaced by an occupied orbital $m$ we have

$$\langle m|h^{\mathrm T}|i\rangle = \langle m|f|i\rangle + \tfrac{1}{2}\langle mj||cb\rangle\, t_{ij}^{cb} - \langle kj||ib\rangle\, t_{kj}^{mb} \qquad (15)$$
Then we have the Hartree–Fock-like equations, but now for the correlated one-particle operator $h^{\mathrm T}$, represented in the basis set $|\chi\rangle$, where $\mathbf{S} = \langle\chi|\chi\rangle$ is the overlap matrix,

$$\mathbf{h}^{\mathrm T}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\epsilon} \qquad (16)$$
and the (molecular) orbitals are $|\phi\rangle = |\chi\rangle\mathbf{C}$. The Brueckner determinant, $\Phi_B$, is composed of the lowest $n$ occupied MOs, $|\phi_0\rangle = |\chi\rangle\mathbf{C}_0$. In particular, the matrix elements of the transfer Hamiltonian in terms of the atomic orbital basis set are
$$\langle\mu|h^{\mathrm T}|\nu\rangle = \bar{h}^\mu_\nu + P_{\alpha\beta}\left(\bar{g}^{\mu\alpha}_{\nu\beta} - \bar{g}^{\mu\alpha}_{\beta\nu}\right) \qquad (17)$$
$$P_{\mu\nu} = c_{\mu i}\, c_{i\nu} \qquad (18)$$
(summation over repeated indices is assumed), where $P_{\mu\nu}$ is the density matrix for the Brueckner determinant. Hence, subject to modified definitions for $\bar{h}^\mu_\nu$ and $\bar{g}^{\mu\alpha}_{\nu\beta}$, which we will assume are renormalized to include the critical parts of the three- and higher-electron effects, we have the matrix which contains the exact ionization potentials for the system.
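As an illustration of how Eqs. (16)–(18) would be solved in practice, the following is a minimal self-consistency sketch in Python, assuming NumPy/SciPy; build_hT stands in for whatever parameterized construction of the matrix elements of Eq. (17) is adopted, and is not an interface from this work:

    import numpy as np
    from scipy.linalg import eigh

    def scf_cycle(build_hT, S, n_occ, P0, tol=1e-8, max_iter=200):
        """Iterate hT C = S C eps (Eq. 16) to self-consistency in P (Eq. 18)."""
        P = P0
        for _ in range(max_iter):
            hT = build_hT(P)            # hT depends on P through Eq. (17)
            eps, C = eigh(hT, S)        # generalized symmetric eigenproblem
            C_occ = C[:, :n_occ]        # lowest n occupied MOs
            P_new = C_occ @ C_occ.T     # density matrix of the determinant
            if np.linalg.norm(P_new - P) < tol:
                return eps, C, P_new
            P = P_new
        raise RuntimeError("SCF did not converge")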
The total energy,

$$E = \langle\Phi_B|\bar{H}|\Phi_B\rangle = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{i,j}\langle ij||ij\rangle + \frac{1}{4}\sum_{i,j,a,b}\langle ij||ab\rangle\, t_{ij}^{ab} \qquad (19)$$

$$\phantom{E} = \sum_i \langle i|h|i\rangle + \frac{1}{2}\sum_{i,j}\langle ij|\bar{g}_{12}|ij\rangle \qquad (20)$$

$$\phantom{E} = \frac{1}{2}\,\mathrm{Tr}\,\mathbf{P}\left(\mathbf{h} + \bar{\mathbf{h}}^{\mathrm T}\right) \qquad (21)$$

$$\bar{g}_{12} = r_{12}^{-1} + \frac{1}{2}\, r_{12}^{-1}\sum_{a,b}|ab\rangle\langle ab|\, T_2 \qquad (22)$$
is also written in terms of the reference density matrix $\mathbf{P} = \mathbf{C}_0\mathbf{C}_0^\dagger$, evaluated from the occupied orbital coefficients, $\mathbf{C}_0$. The quantity $\bar{\mathbf{h}}^{\mathrm T}$ differs from the form in Eq. (15) because of the absence of the third term on the RHS. That term is an orbital relaxation term that pertains only to the ionization potentials, as there we would need to allow the system to relax after the ionization. Hence, it cannot contribute to the ground-state energy, and a manifestation of that is that the total energy cannot be written in terms of the exact ionization potentials of Eq. (13), but can be written in terms of the approximation introduced by $\bar{\mathbf{h}}^{\mathrm T}$. The analytical forces for MD can be written easily as well. Notice $\bar{\mathbf{h}}^{\mathrm T}$ includes all electron correlation. Once $\bar{h}^\mu_\nu$ and $\bar{g}^{\mu\alpha}_{\nu\beta}$ are specified, which need to be viewed as quantities to be determined to reproduce the reference results from ab initio correlated calculations, we obtain self-consistent solutions for the correlated, effective one-particle Hamiltonian. The self-consistency is essential in accounting for bond breaking and the associated charge rearrangement. The overlap matrix is included for generality, but, as is often done in NDDO-type theories, enforcing the ZDO approximation removes it. Another way to view this is to assume the parameters are based upon using the orthonormal expansion basis $|\chi'\rangle = |\chi\rangle\mathbf{S}^{-1/2}$, which gives $\mathbf{h}^{\mathrm T\prime} = \mathbf{S}^{-1/2}\mathbf{h}^{\mathrm T}\mathbf{S}^{-1/2}$. Developing this expression to include terms of low order in $\mathbf{S}$ permits us to still retain the simpler and computationally faster orthogonal form of the eigenvalue equation, yet introduce what is sometimes called "Pauli repulsion" in the semi-empirical community [22]. A self-consistent solution provides the coefficients $\mathbf{C}$ and the reference orbital energies $\boldsymbol{\epsilon}$, which, as we discussed, are not the exact IPs that would come from including the contributions of the $t_{jk}^{mb}$ amplitudes, which contain three hole lines and one particle line. Such terms arise in the generalized EOM or Fock-space CC theory for ionized, electron-attached, and excited states. In lowest order, $t_{jk}^{mb} = \langle mb||jk\rangle/(\epsilon_j + \epsilon_k - \epsilon_b - \epsilon_m)$.
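The symmetric orthogonalization just mentioned is, in a short sketch (assuming NumPy and a symmetric, positive-definite overlap matrix):

    import numpy as np

    def lowdin_transform(hT, S):
        """Return hT' = S^(-1/2) hT S^(-1/2), turning Eq. (16) into an
        ordinary (overlap-free) eigenvalue problem."""
        s, U = np.linalg.eigh(S)                  # spectral decomposition of S
        S_inv_half = U @ np.diag(s ** -0.5) @ U.T
        return S_inv_half @ hT @ S_inv_half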
4. Transfer Hamiltonian: Density Functional Viewpoint
The DFT approach to $h^{\mathrm T}$ starts from a different premise that is actually simpler, since DFT is already exact in an independent-particle form, unlike the usual many-particle theory above. As is well known, we have the Kohn–Sham one-particle Hamiltonian [32] whose first $n$ eigenvectors give the exact density,

$$h_S = t + v + J + V_{xc} \qquad (23)$$
$$h_S|i\rangle = \epsilon_i|i\rangle \qquad (24)$$
$$\mathbf{h}_S\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\epsilon} \qquad (25)$$
$$\rho(1) = \sum_i \phi_i(1)\phi_i^*(1) = \sum_{\mu,\nu} \chi_\mu(1)\, P_{\mu\nu}\, \chi_\nu^*(1) \qquad (26)$$
and, like the above, the density matrix is $\mathbf{P} = \mathbf{C}_0\mathbf{C}_0^\dagger$. The highest occupied MO, $n$, has the property that $\epsilon_n = -I_p(n)$. However, solving these equations does not provide an energy until we know the functional $E_{xc}[\rho]$, from which we know that $\delta E_{xc}[\rho]/\delta\rho(1) = V_{xc}(1)$, to close the cycle. The objective of DFT is to get the density, $\rho$, first; then all other ground-state properties follow, in particular the energy and forces we need for MD. The transfer Hamiltonian in this case will be defined by the condition that $\rho_{\mathrm{CCSD}} = \rho_{\mathrm{KS}}$. Satisfying this condition means that we could obtain a $V_{xc}$ from this density by using the ZMP method [47], but our approach is simply to parameterize the elements in $h_S = h^{\mathrm T}$, in analogy with semi-empirical quantum chemistry or TB, such that the density condition is satisfied. This should specify $V_{xc}$ and, indeed, the other terms in $h^{\mathrm T}$, which is then sufficient to obtain the forces, $\{\partial E(R)/\partial X_A\}$. Note this bypasses the need to use an explicit $E_{xc}[\rho]$; but, of course, that would always be an option. We can also bypass any explicit treatment of the kinetic-energy operator by virtue of the parameterization of $h = t + v$, as in the semi-empirical approach discussed below. Besides the density condition, we also have the option to use the force condition, in the sense that the forces can be obtained from CC theory and their values then used directly to obtain the parameterized version of $h_S = h^{\mathrm T}$. Ideally, the parameters will be able to describe both the densities and the forces, although this raises the issue of the long-term inability of semi-empirical methods to describe structures and spectra with the same parameters, discussed further in the last section. As our objective is to be able to define an $h^{\mathrm T}$ that will satisfy many of the essential elements of ab initio theory, some quantities of interest besides the forces are the density, the ionization potential, and the electron affinity. The latter two define the Mulliken electronegativity, $EN = (I - A)/2$, which should help to ensure that our calculations correctly describe the charge distribution in a system and the density. We also know that the correct long-range behavior of the density is determined by the HOMO ionization potential, $\rho(r) \propto \exp(-2\sqrt{2I}\, r)$, which is a property of exact DFT. If the density is right, then we also know
that we will get the correct dipole moments for the molecules involved, and this is likely to be critical if we hope to correctly describe polar systems like water, along with their hydrogen bonding.
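As a toy numerical check of the two exact conditions just quoted, using assumed round-number values for water (I ≈ 12.6 eV and A ≈ −0.1 eV; these are illustrative, not fitted quantities from this work):

    import numpy as np

    HARTREE_EV = 27.2114
    I_ev, A_ev = 12.6, -0.1                  # assumed values for illustration
    print("Mulliken EN =", (I_ev - A_ev) / 2, "eV")   # -> 6.35 eV

    # Long-range decay rho(r) ~ exp(-2*sqrt(2 I) r), in atomic units:
    I_au = I_ev / HARTREE_EV
    for r in (2.0, 5.0, 10.0):               # distances in bohr
        print(r, np.exp(-2.0 * np.sqrt(2.0 * I_au) * r))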
5. What About Semi-Empirical Methods?
Before embarking upon a particular form for the transfer Hamiltonian, which must inevitably be of semi-empirical or TB type, we can ask what kind of accuracy is possible with such methods. In a recent paper on PM5, a parameterized NDDO Hamiltonian [20, 21], Stewart reports that the PM5 heats of formation for over ∼1000 molecules composed of H, C, N, O, F, S, Cl, Br, and I have a mean absolute deviation (MAD) of 4.6 kcal/mol, nearly the same as DFT using BLYP or BPW91. The errors of PM3 (5.2) and AM1 (7.2) are slightly larger. The largest errors are 27.2 (PM5), 35.1 (PM3), 54.8 (AM1), 55.7 (BLYP), and 34.5 (BPW91). Using a TZ instead of a DZ basis for the DFT methods gives some improvement in the worst cases. For Jorgensen's reparameterized PM3 and MNDO methods, referred to as PDDG [22, 25], the MAD in heats of formation for 662 molecules limited to H, C, N, and O is reduced from 8.4 to 5.2 and, with some extra PDDG additions, from 4.4 to 3.2 kcal/mol. For geometries, PDDG gets bond lengths to a MAD of 0.016 Å, bond angles to 2.3°, and dihedral angles to 29.0°. The principal IP is typically within ∼0.5 eV – though it can be off by several – which is some 3% more accurate than PM3 and 12% less accurate than PM5. For dipole moments, the MAD is 0.24 Debye. There is less information about transition states and activation barriers, but these methods have seen extensive use for such problems in chemistry. Recent TB work termed SCC-DFTB, for self-consistent-charge density-functional TB [11], is based upon DFT rather than HF and is less empirical, but still simplified, using similar approximations for two-center interactions as in NDDO, discussed below. It is developed for solids as well as molecules. For the latter, in 63 organic examples the MADs in bond lengths are 0.012 Å and, in angles, 1.80°. For heats of reaction, in 36 example molecules composed of H, C, N, and O, the MAD is 12.5 kcal/mol, compared to 11.1 for DFT-LSD. On the other hand, we can have dramatic failures. None of these new semi-empirical methods yet even treats Si, much less heavier elements of the sort that are important in many materials applications. To quote just one example, in comparisons of nine Zn complexes with B3LYP and CCSD(T), "MNDO/d failed the case study" and the errors compared to ab initio or DFT were "dramatic." The authors [48] say, "No one semiempirical model is applicable for the calculations of the whole variety of structures found in Zn chemistry."
6. Forms for the Transfer Hamiltonian
Our objective is to model $h^{\mathrm T}$ for the particular phenomena of interest and for chosen representative systems (i.e., unlike normal semi-empirical theory, we do not expect the parameters to describe many elements at once) in a way that permits the routine, self-consistent treatment of a very large number of the same kinds of atoms. We also recognize that the traditional approaches are built upon approximating the HF-SCF one-particle Hamiltonian, $f$, not the more exact DFT or Brueckner approach discussed above. Also, traditionally, only a minimum basis set is used: an s orbital on H, and one s and a set of p orbitals on the other atoms, until d orbitals are occupied. Thinking more like ab initio theory, we do not presuppose such restrictions, but will use polarization functions and potentially double-zeta sets of s and p orbitals on all atoms. We recognize the attraction of a transfer Hamiltonian that (1) consists solely of atomic parameters and (2) is essentially two-atom in form, as all three- and four-center contributions are excluded. This is the fundamental premise of all neglect-of-differential-overlap approximations [15, 17, 19]. Hence, as a first realization, guided by many years of semi-empirical quantum chemistry, we choose the "neglect of diatomic differential overlap" (NDDO) Hamiltonian,

$$\langle\mu|h^{\mathrm T}|\nu\rangle = \alpha_{\mu\nu}\,\delta_{\mu\nu} + \sum_{\alpha,\beta\in A} P_{\alpha\beta}\left[(\mu\alpha|\nu\beta) - (\mu\beta|\nu\alpha)\right], \qquad \mu,\nu\in A$$
$$\langle\mu|h^{\mathrm T}|\nu\rangle = \tfrac{1}{2}\left(\beta_\mu + \beta_\nu\right)S_{\mu\nu} + \sum_{\alpha\in A,\;\beta\in B} P_{\alpha\beta}\left[(\mu\alpha|\nu\beta) - (\mu\beta|\nu\alpha)\right], \qquad \mu\in A,\ \nu\in B \qquad (27)$$
consisting of atomic and diatomic units. $\alpha_{\mu\mu}$ is a purely atomic quantity that represents the one-particle part of the energy of an electron in its atomic orbital. We would have different values for s, p, d, . . . orbitals, collectively indicated as $\alpha_A$. The one-center, two-electron terms for atom A are separated into Coulomb and exchange terms and weighted by the density matrix. No explicit correlation operator as in DFT is yet considered; instead, modifications (parameterizations) of the Coulomb and exchange terms are viewed as potentially accomplishing the same objective. $\beta_\mu$ is an atomic parameter indicative of each orbital type (s, p, d) on atom A, and $S_{\mu\nu}$ is the overlap integral between, formally, two atomic orbitals on atoms A and B. A Slater-type orbital on atom A is $\chi_A = r_A^{n-1}\exp(-\zeta_A r_A)\,Y_{l,m}(\vartheta_A, \varphi_A)$, and the overlap integral $S_{\mu\nu}(\zeta_A, \zeta_B)$ depends upon $\zeta_A$ and $\zeta_B$, so it is entirely determined by what the atoms are. So it, too, consists of atomic parameters.
The terms which include density matrix elements account for the two-electron repulsion terms, which depend upon the purely one-center two-electron integral type, $(\mu_A\nu_A|\mu_A\nu_A) = \gamma^{\mu\nu}_{AA}$. A typical choice for the two-center, two-electron term then becomes [49, 50]

$$(\mu_A\nu_A|\mu_B\nu_B) \propto \left[\,r_{AB}^2 + (c_A^{\mu\nu} + c_B^{\mu\nu})^2\,\right]^{-1/2} \qquad (28)$$
where $r_{AB} = R_{AB} + q_i$ and the additive terms $c^{\mu\nu}$ are numerically determined such that the two-center repulsion integral goes to the proper one-center limiting value. $R_{AB}$ is the distance between atoms A and B, but differs from $r_{AB}$ due to the multipole method used to compute the two-electron integral. For $(s_A s_A|p_B p_B)$, a monopole and a quadrupole are used for the p electron distribution, while a monopole is used for the s distribution. The radial extent of the multipoles is given by $q_i = q_{p_B}$ and is a function of the atomic orbital exponent $\zeta_B$ on atom B. This form for the two-electron integrals assures the correct long-range (1/R) behavior. More general forms for the two-center, two-electron integrals combine such contributions together from several multipoles to distinguish (ss|ss) from (ss|dd), etc. [19, 51].
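A minimal sketch of the monopole–monopole case of Eq. (28) follows, in atomic units; the choice $c_X = 1/(2\gamma_{XX})$ is one simple way of enforcing the one-center limit and is an assumption for illustration, not the multipole machinery of [19, 51]:

    import numpy as np

    def two_center_ss(r_AB, gamma_AA, gamma_BB):
        """(s_A s_A | s_B s_B) with correct one-center and 1/R limits."""
        c_A = 1.0 / (2.0 * gamma_AA)
        c_B = 1.0 / (2.0 * gamma_BB)
        return 1.0 / np.sqrt(r_AB**2 + (c_A + c_B)**2)

    print(two_center_ss(0.0, 0.6, 0.6))    # -> 0.6, the one-center value gamma
    print(two_center_ss(50.0, 0.6, 0.6))   # -> ~0.02, i.e., ~1/R at long range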
This set of approximations defines the NDDO form of the matrix elements of $h^{\mathrm T}$ between two atomic orbitals. Now we have to consider the nuclear repulsion contribution to the energy, $\sum_{A,B} Z_A Z_B / R_{AB}$. Importantly, and unlike in ab initio theory, the effective atomic number $Z_A$, which is chosen initially to be equal to the number of valence electrons contributed by atom A, is also made a function of all $R_{AB}$ in the system. This introduces several new parameters into the calculation, justified roughly by some ideas of electron screening. The AM1 choice [16] for the latter reflects screening of the effective nuclear charge with the parameterized form

$$E_{\mathrm{CR}} = Z_A Z_B (s_A s_A|s_B s_B)\left[1 + e^{-d_A R_{AB}} + e^{-d_B R_{AB}}\right] + \frac{Z_A Z_B}{R_{AB}}\left[\sum_k a_A^k\, e^{-b_A (R_{AB} - C_A^k)^2} + \sum_k a_B^k\, e^{-b_B (R_{AB} - C_B^k)^2}\right] \qquad (29)$$
These core repulsion (CR) parameters, d, b, a, and C account for the nuclear repulsion, which means they contribute to total energies and forces, but not to purely electronic results. The latter depend upon the electronic parameters $\beta_A$, $\gamma_{AA}$, $\alpha_A, \ldots$. In our work, both sets are specified via a genetic algorithm to ensure that correlated CCSD results are obtained for representative systems, tailored to the phenomena of interest.
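For concreteness, Eq. (29) translates directly into a short function; the parameter values are placeholders (they are precisely what the genetic-algorithm fit determines), and gauss_X packs the (a, b, C) triples for each atom:

    import numpy as np

    def core_repulsion(Z_A, Z_B, R, g_ss, d_A, d_B, gauss_A, gauss_B):
        """AM1-style E_CR of Eq. (29); gauss_X is a list of (a, b, C) triples."""
        screened = Z_A * Z_B * g_ss * (1.0 + np.exp(-d_A * R) + np.exp(-d_B * R))
        gaussians = sum(a * np.exp(-b * (R - C) ** 2) for a, b, C in gauss_A)
        gaussians += sum(a * np.exp(-b * (R - C) ** 2) for a, b, C in gauss_B)
        return screened + (Z_A * Z_B / R) * gaussians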
Looking at the above approximations, we see that we retain only one- and two-center two-electron integrals. In principle, we could have a three-center one-electron integral from $\langle\mu_A|\,Z_C/|\mathbf{r} - \mathbf{R}_C|\,|\nu_B\rangle$, but in NDDO such terms are excluded as well. Any approximation of $h^{\mathrm T}$ that is to be tied to ab initio results has to have the property of "saturation." To achieve this, we insist that our form for $h^{\mathrm T}$ be fundamentally short range. We see from the above that our $h^{\mathrm T}$ depends on two-center interactions, but, unlike TB, not just those for the nearest-neighbor atoms but all the two-body interactions in the system. This short-range character helps to saturate the atomic parameters for comparatively small example systems that are amenable to ab initio correlated methods. Then, once the atomic parameters are obtained, and found to be unchanged to within a suitable tolerance when redetermined for larger clusters, they define a saturated, self-consistent, correlated, effective one-particle Hamiltonian that can be readily solved for quite large systems, to rapidly determine the forces required for MD. We also have easy access to the second derivatives (Hessians) for definitive saddle-point determination, vibrational frequencies, and interpolation between calculations at different points for MD.
Using H2O as an example for saturation, we can obtain the Cartesian force matrix for the monomer by insisting that our simplified Hamiltonian provide the same force curves as a function of interatomic separation for breaking the O–H bond, with the other degrees of freedom kept optimum (i.e., a distinguished reaction path). Call this matrix $F_A$. From $F_A$ we use a GA to obtain the Hamiltonian parameters that, in turn, determine the $h$ and $g$ elements that make our transfer Hamiltonian reproduce these values. The more meaningful gradient norm $|F|$ is used in practice rather than the individual Cartesian elements. Now consider two water molecules interacting. The principal new element is the dihedral angle that orients one monomer relative to the other, but the H-bonding and dipole–dipole interactions will cause some small change when we break an O–H bond in the dimer. Our first approximation is $F_{AB} = F_A + F_B + V_{AB}$. Then, by changing our parameters to accommodate the dimer bond breaking, we get slightly modified $h$ and $g$ elements in the transfer Hamiltonian; this makes $F_{AB} = F_A + F_B$. Going to a third unit, we would add the $V_{ABC}$, $V_{AC}$, $V_{BC}$ perturbations and repeat the process to define $F_{ABC} = F_A + F_B + F_C$. Since these atom-based interactions will rapidly fall off with distance, we expect that relatively quickly we would have a saturated set of parameters for the bond breaking in water with a relatively small number of clusters; a sketch of this loop follows below. We can obviously look at other properties, too, such as dipole moments, cluster structures, etc., to assess their degree of saturation with our $h^{\mathrm T}$ parameters. If we fail to achieve a satisfactory saturation, then we have to pursue more flexible, or more accurate, forms of the transfer Hamiltonian. It is essential to identify the terms that matter, and the DFT form provides complementary input to the wavefunction approach in this regard. Also, unlike most semi-empirical methods, we do not limit ourselves to a minimum basis set. The general level we would anticipate is CCSD with a double-zeta plus polarization basis, while dropping the core electrons. This is viewed as the quality of ab initio result that we would pursue for complicated molecules.
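The saturation procedure just described can be summarized in a high-level loop; fit_parameters and reference_forces are placeholders standing in for the GA fit and the CCSD force scans, not interfaces from this work, and the parameters are assumed to be returned as a NumPy vector:

    import numpy as np

    def saturate(clusters, fit_parameters, reference_forces, tol=1e-3):
        """Refit on growing clusters (monomer, dimer, trimer, ...) until
        the parameter vector stops changing to within tol."""
        params = None
        for cluster in clusters:
            F_ref = reference_forces(cluster)             # CCSD force curves
            new_params = fit_parameters(cluster, F_ref, start=params)
            if params is not None and np.max(np.abs(new_params - params)) < tol:
                return new_params                         # saturated
            params = new_params
        return params   # not yet saturated; larger clusters (or forms) needed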
In addition, following the equation-of-motion (EOM) CC approach [52], we insist that

$$\bar{H} R_k|0\rangle = \omega_k R_k|0\rangle \qquad (30)$$

where $R_k \exp(T)|0\rangle = \Psi_k$ and $\omega_k$ is the excitation energy for any ionized ($I_k$), electron-attached ($A_k$), or excited state. In other words, this provides IPs and EAs that tie to the Mulliken electronegativity, to help ensure that our transfer Hamiltonian represents the correct charge distribution and density size. Furthermore, whereas forces and geometries are highly sensitive to the core-repulsion parameters, properties like I and A are sensitive to the electronic parameters in the transfer Hamiltonian. The transfer Hamiltonian procedure is far more general than the particular choice of Hamiltonian made here, since we can choose any expansion of $\bar{H}$ or $h^{\mathrm T}$ that is formally correct, and include elements to be computed or parameters to be determined, to define a transfer Hamiltonian. Furthermore, we can insist that it satisfy suitable exact and consistency conditions, such as having the correct asymptotic or scaling behavior. Other desirable conditions might include the satisfaction of the virial and Hellmann–Feynman theorems. We can also choose to do many of the terms, like the one-center ones, ab initio, and keep those values fixed subsequently. Then our simplified forms $\tfrac{1}{2}(\beta_\mu + \beta_\nu)S_{\mu\nu}$ and that of Eq. (29) are the only ones in which there is an electronic dependence upon geometry. Adding this dependence to that from the core–core repulsions has to provide the forces that drive the MD. We can explore many other practical approximations, such as suppressing self-consistency by setting $\mathbf{P} = \mathbf{1}$ and imposing the restriction that only nearest-neighbor two-atom interactions be retained, to extract a non-self-consistent TB Hamiltonian that should be very fast in application. We can obviously make many other choices and create, perhaps, a series of improving approximations to the ab initio results that parallel their computational demands.
7. Numerical Illustrations
As an illustration of the procedure, consider the prototype system for an Si–O–Si bond as in silica: pyrosilicic acid (Fig. 6). This molecule has been frequently used as a simple model for silica. We are interested in the Si–O bond rupture. Hence, we perform a series of CCSD calculations as a function of the Si–O distance, all the way to the separated radical units, ·Si(OH)3 and ·O–Si(OH)3, relaxing all other degrees of freedom at each point (while avoiding any hydrogen bonding, which would be artificial for silica), using now well-known CC analytical gradient techniques [36].
Figure 6. Structure of pyrosilicic acid.
Figure 7. Comparison of forces from standard semi-empirical theory (AM1) and the transfer Hamiltonian (TH-CCSD) with coupled-cluster (CCSD) results for dissociation of pyrosilicic acid into neutral fragments.
For each point we compute the gradient norm of the forces for the 3N Cartesian coordinates $q_I$ (3 per atom A), $|F| = \left[\sum_I^{3N} (\partial E/\partial q_I)^2\right]^{1/2}$, and use the genetic algorithm PIKAIA [53] to minimize the difference $|F(\mathrm{CCSD}) - F(h^{\mathrm T})|$ between the transfer Hamiltonian and the CCSD solution. This is shown in Fig. 7. Since forces drive the MD, their determination is more relevant for the problem than the potential energy curves themselves.
For this case, we find that fixing the parameters in our transfer Hamiltonian that are associated with the core-repulsion function is sufficient, leaving the electronic parameters at the standard values of the AM1 method. As seen in Fig. 7, these new parameters are responsible for removing AM1's too-large repulsion at short Si–O distances and its erroneous behavior shortly beyond the equilibrium point. Hence, to a small tolerance, the transfer Hamiltonian provides the same forces as the highly sophisticated ab initio CCSD method.
In a second study, QM forces permit the description of different electronic states. As an example, for this system we can also separate pyrosilicic acid into charged fragments, Si(OH)3+ and O–Si(OH)3−, and in a material undergoing bond breaking we would expect multiple paths such as this to be taken. A classical potential has no such capability. Figure 8 shows the curve, and once again we obtain a highly accurate representation from the transfer Hamiltonian, with the same parameters obtained for the radical dissociation. Hence, our transfer Hamiltonian has the capability of describing the effects of these different electronic states in simulations, which, besides enabling reliable descriptions of bond breaking, should have an essential role if a material's optical properties are of interest. Figure 9 shows the integrated force curves to illustrate that, even though the parameters were determined from the forces, the associated potential energy surfaces are also accurate compared to the reference CCSD results, and more accurate than the conventional AM1 results. The latter have an error of ∼0.4 eV between the neutral and charged paths compared to the CCSD results. We have also investigated the parameter saturation: moving to trisilicic acid, we obtain the reference results without any further change in our parameters.
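The objective function driving the fit can be sketched as follows; this is a stand-in for the PIKAIA fitness evaluation, and hT_forces is a placeholder for a force call with a trial parameter set:

    import numpy as np

    def grad_norm(F):
        """|F| over the 3N Cartesian force components."""
        return np.sqrt(np.sum(np.asarray(F) ** 2))

    def fitness(params, geometries, F_ccsd, hT_forces):
        """Accumulate the |F(CCSD)| - |F(hT)| mismatch over the scan points."""
        err = sum(abs(grad_norm(F_ref) - grad_norm(hT_forces(params, R)))
                  for R, F_ref in zip(geometries, F_ccsd))
        return -err   # GA codes such as PIKAIA maximize the fitness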
Figure 8. Comparison of forces from standard semi-empirical theory (AM1) and the Transfer Hamiltonian (TH-CCSD) with coupled-cluster (CCSD) results for dissociation of pyrosilicic acid into charged fragments.
Figure 9. Comparison of PES for dissociation of pyrosilicic acid. Each curve is labeled by the Hamiltonian used and the dissociation path followed.
The correct description of complicated phenomena in materials requires that the approach be able to describe, accurately, a wealth of different valence states and coordination states of the relevant atoms involved. For example, the surface structure of silica is known to show three-, four-, and five-coordinate Si atoms. Hence, a critical test of the $h^{\mathrm T}$ is how well its form can account for the observed structure of such species with the same parameters already determined for bond breaking. In Figs. 10 and 11, we show comparisons of the $h^{\mathrm T}$ results for some $\mathrm{Si}_x\mathrm{O}_y$ molecules with DFT (B3LYP); various two-body classical potentials [54, 55] and a three-body potential [56] frequently used in simulations; and molecular mechanics [26]. The reference values are from CCSD(T), which are virtually the same as the experimental values when available. The $h^{\mathrm T}$ results are competitive with DFT and superior to all classical forms, including even MM with standard parameterization. The latter is usually quite accurate for molecular structures at equilibrium geometries, but not necessarily for SiO2; and MM methods do not attempt to describe bond breaking.
The comparative timings using the various methods are shown in Table 2 for two different-sized systems, pyrosilicic acid and a 108-atom SiO2 nanorod [57]; the 216-atom version is shown in Fig. 12. The $h^{\mathrm T}$ procedure is about 3.5 orders of magnitude faster than the Gaussian-basis B3LYP DFT calculation, which is in turn another ∼3.5 orders of magnitude faster than CCSD (ACES II). The 108-atom nanorod is clearly well beyond the capacity of CCSD ab initio calculations, but even the DFT result (in this case with a plane-wave basis, using the BO-LSD-MD (GGA) program) is excessive, while the $h^{\mathrm T}$ is again three to four orders of magnitude faster. With streamlining of programs, we expect that this can still be significantly improved.
Figure 10. Error in computed $\mathrm{Si}_x\mathrm{O}_y$ equilibrium bond lengths relative to CCSD(T) using various potentials.
Figure 11. Error in computed $\mathrm{Si}_x\mathrm{O}_y$ equilibrium bond angles relative to CCSD(T) using various potentials.
Table 2. Comparative timings for electronic structure calculations (IBM RS/6000)

    Pyrosilicic acid                 108-atom nanorod
    Method    CPU time (s)           Method    CPU time (s)
    CCSD      8656                   CCSD      N/A
    DFT       375                    DFT       85,019
    hT        0.17                   hT        43
    BKS       0.001                  BKS       0.02
Finally, to illustrate the results of a simulation, we consider the 216-atom SiO2 system of Fig. 12, subject to a uniaxial stress, using various classical potentials and our QM transfer Hamiltonian. The equilibrated nanorod was subjected to uniaxial tension by assigning a fixed velocity (25 m/s) in the loading direction to the 15 atoms in the caps at each end of the rod. The stress was computed by summing the forces in the end caps and dividing by the projected cross-sectional area at each time step. The simulations evolved for approximately 10 ps, during which the system temperature was maintained at 1 K by velocity rescaling. Figure 13 shows the computed stress–strain curves. The main differences between the classical potentials and the QM potential seem to be the behavior at the maximum and the long tail indicating surface reconstruction. The QM potential shows the expected brittle fracture, perhaps a little more so than the classical potentials. The transfer Hamiltonian retains self-consistency and state specificity, and permits readily adding other molecules to simulations after ensuring that they, too, reflect the reference ab initio values for their various interactions. Hence, the transfer Hamiltonian, built upon NDDO or more general forms, would seem to offer a practical approach to moving toward the objective of predictive simulations.
In Fig. 14 we show the same kind of information for bond breaking in water, showing the substantial superiority of the $h^{\mathrm T}$ results compared to standard AM1. A well-known failing of semi-empirical methods is their inability to correctly describe H-bonding. In Fig. 15 we compare the equilibrium structure of the water dimer obtained from the $h^{\mathrm T}$, ab initio MBPT(2), and standard semi-empirical theory. The $h^{\mathrm T}$ provides the quite-hard-to-describe water dimer in excellent agreement with the first-principles calculations, contrary to AM1, which leads to errors in the donor–acceptor O–H bond of 0.15 Å. In this example, we have to change the electronic parameters along with the core–core repulsion, and we would expect this to be the case for most applications. In the future, we hope to develop the $h^{\mathrm T}$ to the point that we will have an accurate QM description of water and its interactions with other species.
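The stress evaluation described for the nanorod pull reduces to a few lines; all names are illustrative, and the force array is whatever the transfer-Hamiltonian (or classical) force call returns at each MD step:

    import numpy as np

    def axial_stress(forces, cap_indices, area, axis=2):
        """Sum the loading-direction forces on the end-cap atoms and divide
        by the projected cross-sectional area (one value per time step)."""
        return forces[np.asarray(cap_indices), axis].sum() / area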
Figure 12. Silica nanorod containing 216 atoms.
Figure 13. Stress–strain curve for 216-atom silica nanorod with classical and quantum potentials.
Figure 14. Comparison of forces for O–H bond breaking in water monomer.
Figure 15. Structure of water dimer using transfer Hamiltonian, MBPT(2), and standard AM1 Hamiltonian. Bond lengths in angstroms and angles in degrees.
8. Future
This article calls for some expectations for the future. We have little doubt that the future will demand QM potentials and forces in simulations. This seems to be the single most critical, unsolved requirement if we aspire toward "predictive" quality. If we could use high-level CC forces in simulations for realistic systems, we would be as confident of our results – as long as the phenomenon of interest is amenable to classical MD – as we are for the determination of molecular properties at that level of theory and basis. Of course, in many cases we cannot run MD for long enough time periods to allow some phenomena to manifest themselves, perhaps forcing more of a kinetic Monte Carlo time extension at that point. We clearly also need much-accelerated MD methods, regardless of the choice of forces. Like the above NDDO and TB methods, DFT as used in practice is also a "semi-empirical" theory, as methods like B3LYP now use many parameters to define their functionals and potentials. Even the bastion of state-of-the-art ab initio correlated methods – coupled-cluster theory – is not exact, because it depends upon a basis set, as shown in the examples in the introduction. Since even DFT cannot generally be used in MD simulations involving more than ∼300 atoms, making progress in this field demands that we have "simplified" methods that we can argue retain ab initio or DFT accuracy, but now for
>1000 atoms, and that can be readily tied to simulations. In this article, we have suggested a procedure for doing so. We showed that the many-electron CC theory could be reformulated into a single-determinant form, but at the cost of having a procedure to reliably introduce the quantities we called $\bar{g}^{\mu\alpha}_{\nu\beta}$, $\bar{g}^{\lambda\delta}_{\mu\nu}$, $\bar{g}^{\lambda\delta\eta}_{\mu\nu}$, etc. These are complicated quantities that in an ab initio calculation would depend upon one- and two-electron integrals over the basis functions and the cluster amplitudes in $T$. We could directly compute these elements from ab initio CC methods, to assess their more detailed importance and behavior, and expect to do so. But we prefer, initially, to obtain most of these elements from consideration of a smaller set of quantities and parameters like those in NDDO, or perhaps in TB, and investigate whether those limited numbers of parameters will be capable of fixing $h^{\mathrm T} = \sum_{\mu,\nu}|\mu\rangle\langle\mu|h^{\mathrm T}|\nu\rangle\langle\nu|$ to the required accuracy. We believe in ensuring that $h^{\mathrm T}$ has the correct long- and short-range behavior, including the united-atom and separated-atom limits. We also want to make sure that the proper balance between the core–core repulsions and the electronic energy is maintained. In our opinion, this is the origin of the age-old problem in semi-empirical theory that there need to be different parameters for the total energy, forces, and transition states than for purely electronic properties like the electronic density, or the photoelectron or electronic spectra. The same features are observed in solid-state applications, where the accuracy of cohesive energies and lattice parameters does not transfer to the band structure. Such electronic properties do not depend upon the core–core repulsion at all, yet for many of the total-energy properties, as we saw for SiO2, only the core-repulsion parameters need to be changed to get agreement with CCSD. This is not surprising: for total energies and forces, we are fitting the difference between two large numbers, which is much easier than fitting the much larger electronic energy itself. It would be nice to develop a method that fully accounts for the appropriate cancellation of the core–core effects with the electronic effects from the beginning. Only an ability to describe both reliably will pay the dividends of a truly predictive theory. DFT, MP2, and even higher-level methods will continue to progress using local criteria [41], linear scaling, various density-fitting tricks [58], and a wealth of other schemes; but regardless, if we can make a transfer Hamiltonian that is already ∼4–5 orders of magnitude faster than DFT, and that retains and transfers the predictive quality of ab initio or DFT results for clusters to very large molecules, there will always be a need to describe much larger systems accurately and smaller systems faster. In fact, it might be argued that, if such a procedure can be created that correctly reproduces high-level ab initio results for representative clusters – and fulfills the saturation property we emphasized – the final results might well exceed those from a purely ab initio or DFT method for ∼1000 atoms.
The compromises made to render such large-molecule applications possible, even at one geometry, force restrictions on the basis sets, the number of grid points, or other assorted elements to accommodate the size of the system. In principle, the transfer Hamiltonian would not be similarly compromised; its compromises lie elsewhere.
Acknowledgments

This work was supported by the National Science Foundation under grant numbers DMR-9980015 and DMR-0325553.
References

[1] P.R. Westmoreland, P.A. Kollman, A.M. Chaka, P.T. Cummings, K. Morokuma, M. Neurock, E.B. Stechel, and P. Vashishta, "Applications of molecular and materials modeling," NSF, DOE, NIST, DARPA, AFOSR, NIH, 2002.
[2] ACES II is a program product of the Quantum Theory Project, University of Florida. Authors: J.F. Stanton, J. Gauss, J.D. Watts, M. Nooijen, N. Oliphant, S.A. Perera, P.G. Szalay, W.J. Lauderdale, S.A. Kucharski, S.R. Gwaltney, S. Beck, A. Balkova, D.E. Bernholdt, K.K. Baeck, P. Rozyczko, H. Sekino, C. Hober, and R.J. Bartlett. Integral packages included are VMOL (J. Almlof and P.R. Taylor); VPROPS (P. Taylor); ABACUS (T. Helgaker, H.J. Aa. Jensen, P. Jorgensen, J. Olsen, and P.R. Taylor).
[3] D.T. Griggs and J.D. Blacic, "Quartz – anomalous weakness of synthetic crystals," Science, 147, 292, 1965.
[4] G.V. Gibbs, "Molecules as models for bonding in silicates," Am. Mineral., 67, 421, 1982.
[5] A. Post and J. Tullis, "The rate of water penetration in experimentally deformed quartzite, implications for hydrolytic weakening," Tectonophysics, 295, 117, 1998.
[6] R. Hoffmann, "An extended Hückel theory. I. Hydrocarbons," J. Chem. Phys., 39, 1397, 1963.
[7] M. Wolfsberg and L. Helmholtz, "The spectra and electronic structure of the tetrahedral ions MnO4, CrO4, and ClO4," J. Chem. Phys., 20, 837, 1952.
[8] J.C. Slater and G.F. Koster, "Simplified LCAO method for the periodic potential problem," Phys. Rev., 94, 1167, 1954.
[9] W.A. Harrison, "Coulomb interactions in semiconductors and insulators," Phys. Rev. B, 31, 2121, 1985.
[10] O.F. Sankey and D.J. Niklewski, "Ab initio multicenter tight binding model for molecular dynamics simulations and other applications in covalent systems," Phys. Rev. B, 40, 3979, 1989.
[11] M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S. Suhai, and G. Seifert, "Self-consistent charge density functional tight binding method for simulations of complex materials properties," Phys. Rev. B, 58, 7260, 1998.
[12] M.W. Finnis, A.T. Paxton, M. Methfessel, and M. van Schilfgaarde, "Crystal structures of zirconia from first principles and self-consistent tight binding," Phys. Rev. Lett., 81, 5149, 1998.
[13] R. Pariser, "Theory of the electronic spectra and structure of the polyacenes and of alternant hydrocarbons," J. Chem. Phys., 24, 250, 1956.
[14] R. Pariser and R.G. Parr, "A semi-empirical theory of electronic spectra and electronic structure of complex unsaturated molecules," J. Chem. Phys., 21, 466, 1953.
[15] M.J.S. Dewar and G. Klopman, "Ground states of sigma bonded molecules. I. A semi-empirical SCF MO treatment of hydrocarbons," J. Am. Chem. Soc., 89, 3089, 1967.
[16] M.J.S. Dewar, J. Friedheim, G. Grady, E.F. Healy, and J.J.P. Stewart, "Revised MNDO parameters for silicon," Organometallics, 5, 375, 1986.
[17] J.A. Pople, D.P. Santry, and G.A. Segal, "Approximate self-consistent molecular orbital theory. I. Invariant procedures," J. Chem. Phys., 43, S129, 1965.
[18] J.A. Pople, D.L. Beveridge, and P.A. Dobosh, "Approximate self-consistent molecular orbital theory. 5. Intermediate neglect of differential overlap," J. Chem. Phys., 47, 2026, 1967.
[19] J.J.P. Stewart, In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, VCH Publishers, Weinheim, 1990.
[20] J.J.P. Stewart, "Comparison of the accuracy of semiempirical and some DFT methods for predicting heats of formation," J. Mol. Model., 10, 6, 2004.
[21] J.J.P. Stewart, "Optimization of parameters for semiempirical methods. IV. Extension of MNDO, AM1, and PM3 to more main group elements," J. Mol. Model., 10, 155, 2004.
[22] W. Thiel, "Perspectives on semiempirical molecular orbital theory," Adv. Chem. Phys., 93, 703, 1996.
[23] K.M. Merz, "Semiempirical quantum chemistry: where we are and where we are going," Abstr. Pap. Am. Chem. Soc., 224, 205, 2002.
[24] M.P. Repasky, J. Chandrasekhar, and W.L. Jorgensen, "PDDG/PM3 and PDDG/MNDO: improved semiempirical methods," J. Comput. Chem., 23, 1601, 2002.
[25] I. Tubert-Brohman, C.R.W. Guimaraes, M.P. Repasky, and W.L. Jorgensen, "Extension of the PDDG/PM3 and PDDG/MNDO semiempirical molecular orbital methods to the halogens," J. Comput. Chem., 25, 138, 2003.
[26] M.R. Frierson and N.L. Allinger, "Molecular mechanics (MM2) calculations on siloxanes," J. Phys. Org. Chem., 2, 573, 1989.
[27] I. Rossi and D.G. Truhlar, "Parameterization of NDDO wavefunctions using genetic algorithms – an evolutionary approach to parameterizing potential energy surfaces and direct dynamics for organic reactions," Chem. Phys. Lett., 233, 231, 1995.
[28] K. Runge, M.G. Cory, and R.J. Bartlett, "The calculation of thermal rate constants for gas phase reactions: the quasi-classical flux–flux autocorrelation function (QCFFAF) approach," J. Chem. Phys., 114, 5141, 2001.
[29] S. Sekusak, M.G. Cory, R.J. Bartlett, and A. Sabljic, "Dual-level direct dynamics of the hydroxyl radical reaction with ethane and haloethanes: toward a general reaction parameter method," J. Phys. Chem. A, 103, 11394, 1999.
[30] R.J. Bartlett, "Coupled-cluster approach to molecular structure and spectra – a step toward predictive quantum chemistry," J. Phys. Chem., 93, 1697, 1989.
[31] T. Helgaker, P. Jorgensen, and J. Olsen, Molecular Electronic Structure Theory, John Wiley and Sons, West Sussex, England, 2000.
[32] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, 1133, 1965.
[33] J.P. Perdew and W. Yue, "Accurate and simple density functional for the electronic exchange energy – generalized gradient approximation," Phys. Rev. B, 33, 8800, 1986.
[34] A. Becke, "Density functional thermochemistry. 3. The role of exact exchange," J. Chem. Phys., 98, 5648, 1993.
[35] D.E. Woon and T.H. Dunning, Jr., "Gaussian basis sets for use in correlated molecular calculations. 4. Calculation of static electrical response properties," J. Chem. Phys., 100, 2975, 1994.
[36] R.J. Bartlett, "Coupled-cluster theory: an overview of recent developments," In: D. Yarkony (ed.), Modern Electronic Structure, II, World Scientific, Singapore, pp. 1047–1131, 1995.
[37] K. Bak, P. Jorgensen, J. Olsen, T. Helgaker, and W. Klopper, "Accuracy of atomization energies and reaction enthalpies in standard and extrapolated electronic wave function/basis set calculations," J. Chem. Phys., 112, 9229, 2000.
[38] T. Helgaker, J. Gauss, P. Jorgensen, and J. Olsen, "The prediction of molecular equilibrium structures by the standard electronic wave functions," J. Chem. Phys., 106, 6430, 1997.
[39] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, "Concurrent coupling of length scales: methodology and application," Phys. Rev. B, 60, 2391, 1999.
[40] F. Abraham, J. Broughton, N. Bernstein, and E. Kaxiras, "Spanning the length scales in dynamic simulation," Computers in Phys., 12, 538, 1998.
[41] M. Schutz and H.J. Werner, "Local perturbative triples correction (T) with linear cost scaling," Chem. Phys. Lett., 318, 370, 2000.
[42] J. Cioslowski, S. Patchkovskii, and W. Thiel, "Electronic structures, geometries, and energetics of highly charged cations of the C-60 fullerene," Chem. Phys. Lett., 248, 116, 1996.
[43] R.J. Bartlett, "Electron correlation from molecules to materials," In: A. Gonis, N. Kioussis, and M. Ciftan (eds.), Electron Correlations and Materials Properties 2, Kluwer/Plenum, Dordrecht, pp. 219–236, 2003.
[44] C.E. Taylor, M.G. Cory, R.J. Bartlett, and W. Thiel, "The transfer Hamiltonian: a tool for large scale simulations with quantum mechanical forces," Comput. Mater. Sci., 27, 204, 2003.
[45] K.A. Brueckner, "Many body problem for strongly interacting particles. 2. Linked cluster expansion," Phys. Rev., 100, 36, 1955.
[46] P.O. Lowdin, "Studies in perturbation theory. 5. Some aspects on exact self-consistent field theory," J. Math. Phys., 3, 1171, 1962.
[47] Q. Zhao, R.C. Morrison, and R.G. Parr, "From electron densities to Kohn–Sham kinetic energies, orbital energies, exchange-correlation potentials, and exchange correlation energies," Phys. Rev. A, 50, 2138, 1994.
[48] M. Brauer, M. Kunert, E. Dinjus, M. Klussmann, M. Doring, H. Gorls, and E. Anders, "Evaluation of the accuracy of PM3, AM1 and MNDO/d as applied to zinc compounds," J. Mol. Struct. (Theochem), 505, 289, 2000.
[49] G. Klopman, "Semiempirical treatment of molecular structures. 2. Molecular terms and application to diatomic molecules," J. Am. Chem. Soc., 86, 4550, 1964.
[50] K. Ohno, "Some remarks on the Pariser–Parr–Pople method," Theor. Chim. Acta, 2, 219, 1964.
[51] M.J.S. Dewar and W. Thiel, "A semiempirical model for the two-center repulsion integrals in the NDDO approximation," Theor. Chim. Acta, 46, 89, 1977.
[52] J.F. Stanton and R.J. Bartlett, "The equation of motion coupled-cluster method – a systematic biorthogonal approach to molecular excitation energies, transition probabilities and excited state properties," J. Chem. Phys., 98, 7029, 1993.
[53] P. Charbonneau, "Genetic algorithms in astronomy and astrophysics," Astrophys. J. (Suppl.), 101, 309, 1995.
[54] S. Tsuneyuki, H. Aoki, M. Tsukada, and Y. Matsui, "First-principle interatomic potential of silica applied to molecular dynamics," Phys. Rev. Lett., 61, 869, 1988.
[55] B.W.H. van Beest, G.J. Kramer, and R.A. van Santen, "Force fields for silicas and aluminophosphates based on ab initio calculations," Phys. Rev. Lett., 64, 1955, 1990.
[56] P. Vashishta, R.K. Kalia, J.P. Rino, and I. Ebbsjo, "Interaction potential for SiO2 – a molecular dynamics study of structural correlations," Phys. Rev. B, 41, 12197, 1990.
[57] T. Zhu, J. Li, S. Yip, R.J. Bartlett, S.B. Trickey, and N.H. de Leeuw, "Deformation and fracture of a SiO2 nanorod," Mol. Simul., 29, 671, 2003.
[58] M. Schutz and F.R. Manby, "Linear scaling local coupled cluster theory with density fitting. Part I: 4-external integrals," Phys. Chem. Chem. Phys., 5, 3349, 2003.
1.4 FIRST-PRINCIPLES MOLECULAR DYNAMICS Roberto Car1, Filippo de Angelis2, Paolo Giannozzi3, and Nicola Marzari4 1 Department of Chemistry and Princeton Materials Institute, Princeton University, Princeton, NJ, USA 2 Istituto CNR di Scienze e Tecnologie Molecolari ISTM, Dipartimento di Chimica, Università di Perugia, Via Elce di Sotto 8, I-06123, Perugia, Italy 3 Scuola Normale Superiore and National Simulation Center, INFM-DEMOCRITOS, Pisa, Italy 4 Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Ab initio or first-principles methods have emerged in the last two decades as a powerful tool to probe the properties of matter at the microscopic scale. These approaches are used to derive macroscopic observables under the controlled conditions of a "computational experiment," and with a predictive power rooted in the quantum-mechanical description of interacting atoms and electrons. Density-functional theory (DFT) has become de facto the method of choice for most applications, due to its combination of reasonable scaling with system size and good accuracy in reproducing most ground state properties. Such an electronic-structure approach can then be combined with classical molecular dynamics to provide an accurate description of thermodynamic properties and phase stability, atomic dynamics, and chemical reactions, or as a tool to sample the features of a potential energy surface. In a molecular-dynamics (MD) simulation the microscopic trajectory of each individual atom in the system is determined by integration of Newton's equations of motion. In classical MD, the system is considered to be composed of massive, point-like nuclei, with forces acting between them derived from empirical effective potentials. Ab initio MD maintains the same assumption of treating atomic nuclei as classical particles; however, the forces acting on them are considered quantum mechanical in nature, and are derived from an electronic-structure calculation. The approximation of treating quantum-mechanically only the electronic subsystem is usually perfectly appropriate, due to the large difference in mass between electrons and nuclei. Nevertheless, nuclear quantum effects can sometimes be relevant, especially for light
elements such as hydrogen; classical or ab initio path integral approaches can then be applied, albeit at a higher computational cost. The use of Newton's equations of motion for the nuclear evolution implies that vibrational degrees of freedom are not quantized, and will follow Boltzmann statistics. This approximation becomes fully justified only for temperatures comparable to or higher than the highest vibrational frequency in the system considered. In the following, we will describe the combined approach of Car and Parrinello to determine the simultaneous "on-the-fly" evolution of the (Newtonian) nuclear degrees of freedom and of the electronic wavefunctions, as implemented in a modern density-functional code [1] based on plane-wave basis sets, and with the electron–ion interactions described by ultrasoft pseudopotentials [2].
1. Total Energies and the Ultrasoft Pseudopotential Method
Within DFT, the ground-state energy of a system of $N_v$ electrons, whose one-electron Kohn–Sham (KS) orbitals are $\phi_i$, is given by

$$E_{\rm tot}[\{\phi_i\},\{\mathbf{R}_I\}] = \sum_i \left\langle \phi_i \left| -\frac{\hbar^2}{2m}\nabla^2 + V_{\rm NL} \right| \phi_i \right\rangle + E_H[n] + E_{xc}[n] + \int d\mathbf{r}\, V_{\rm loc}^{\rm ion}(\mathbf{r})\, n(\mathbf{r}) + U(\{\mathbf{R}_I\}), \qquad (1)$$
where the $i$ index runs over occupied KS orbitals ($N_v/2$ for closed-shell systems) and $n(\mathbf{r})$ is the electron density. $E_H[n]$ is the Hartree energy, defined as

$$E_H[n] = \frac{e^2}{2} \int d\mathbf{r}\, d\mathbf{r}'\, \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}, \qquad (2)$$
$E_{xc}[n]$ is the exchange and correlation energy, $\mathbf{R}_I$ are the coordinates of the $I$th nucleus, $\{\mathbf{R}_I\}$ is the set of all nuclear coordinates, and $U(\{\mathbf{R}_I\})$ is the nuclear Coulomb interaction energy. In typical first-principles MD implementations, pseudopotentials (PPs) are used to describe the interaction between the valence electrons and the ionic core, which includes the nucleus and the core electrons. The use of PPs simplifies the many-body electronic problem by avoiding an explicit description of the core electrons, which in turn results in a greatly reduced number of orbitals and allows the use of plane waves as a basis set. In the following, we will consider the general case of ultrasoft PPs [2], which includes as a special case norm-conserving PPs [3] in separable form. The PP is composed of a local part $V_{\rm loc}^{\rm ion}$, given by a sum of atom-centred radial potentials,

$$V_{\rm loc}^{\rm ion}(\mathbf{r}) = \sum_I V_{\rm loc}^I(|\mathbf{r}-\mathbf{R}_I|), \qquad (3)$$
and a nonlocal part $V_{\rm NL}$:

$$V_{\rm NL} = \sum_{nm,I} D_{nm}^{(0)}\, |\beta_n^I\rangle\langle\beta_m^I|, \qquad (4)$$

where the functions $\beta_n^I$ and the coefficients $D_{nm}^{(0)}$ characterize the PP and are specific for each atomic species. For simplicity, we will consider only a single atomic species in the following. The $\beta_n^I$ functions, centred at site $\mathbf{R}_I$, depend on the nuclear positions via

$$\beta_n^I(\mathbf{r}) = \beta_n(\mathbf{r}-\mathbf{R}_I). \qquad (5)$$
$\beta_n$ here is the product of an angular momentum eigenfunction in the angular variables and a radial function which vanishes outside the core region; the indices $n$ and $m$ in Eq. (4) run over the total number $N_\beta$ of these functions. The electron density entering Eq. (1) is given by

$$n(\mathbf{r}) = \sum_i \left[ |\phi_i(\mathbf{r})|^2 + \sum_{nm,I} Q_{nm}^I(\mathbf{r})\, \langle\phi_i|\beta_n^I\rangle\langle\beta_m^I|\phi_i\rangle \right], \qquad (6)$$
where the sum runs over occupied KS orbitals. The augmentation functions $Q_{nm}^I(\mathbf{r}) = Q_{nm}(\mathbf{r}-\mathbf{R}_I)$ are localized in the core region. The ultrasoft PP is fully determined by the quantities $V_{\rm loc}^I(r)$, $D_{nm}^{(0)}$, $Q_{nm}(\mathbf{r})$, and $\beta_n(\mathbf{r})$. The functions $Q_{nm}(\mathbf{r})$ are related to atomic orbitals via $Q_{nm}(\mathbf{r}) = \psi_n^{\rm ae*}(\mathbf{r})\psi_m^{\rm ae}(\mathbf{r}) - \psi_n^{\rm ps*}(\mathbf{r})\psi_m^{\rm ps}(\mathbf{r})$, where the $\psi^{\rm ae}$ are the all-electron atomic orbitals (not necessarily bound) and the $\psi^{\rm ps}$ are the corresponding pseudo-orbitals. The $Q_{nm}(\mathbf{r})$ themselves can be smoothed for computational convenience by taking a truncated multipole expansion [4]. For the case of norm-conserving PPs the $Q_{nm}(\mathbf{r})$ are identically zero. The KS orbitals obey generalized orthonormality conditions

$$\langle\phi_i|\, S(\{\mathbf{R}_I\})\, |\phi_j\rangle = \delta_{ij}, \qquad (7)$$
where $S$ is a Hermitian overlap operator given by

$$S = 1 + \sum_{nm,I} q_{nm}\, |\beta_n^I\rangle\langle\beta_m^I|, \qquad (8)$$
and

$$q_{nm} = \int d\mathbf{r}\, Q_{nm}(\mathbf{r}). \qquad (9)$$
The orthonormality condition (7) is consistent with the conservation of the charge, $\int d\mathbf{r}\, n(\mathbf{r}) = N_v$. Note that the overlap operator $S$ depends on the nuclear positions through the $|\beta_n^I\rangle$.
The ground-state orbitals $\phi_i$ that minimize the total energy (1) subject to the constraints (7) are given by

$$\frac{\delta E_{\rm tot}}{\delta\phi_i^*(\mathbf{r})} = \epsilon_i\, S\phi_i(\mathbf{r}), \qquad (10)$$
where the $\epsilon_i$ are Lagrange multipliers. Equation (10) yields the KS equations

$$H\, |\phi_i\rangle = \epsilon_i\, S\, |\phi_i\rangle, \qquad (11)$$
where $H$, the KS Hamiltonian, is defined as

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V_{\rm eff} + \sum_{nm,I} D_{nm}^I\, |\beta_n^I\rangle\langle\beta_m^I|. \qquad (12)$$
Here, $V_{\rm eff}$ is a screened effective local potential,

$$V_{\rm eff}(\mathbf{r}) = V_{\rm loc}^{\rm ion}(\mathbf{r}) + V_H(\mathbf{r}) + \mu_{xc}(\mathbf{r}), \qquad (13)$$
$\mu_{xc}(\mathbf{r})$ is the exchange-correlation potential,

$$\mu_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[n]}{\delta n(\mathbf{r})}, \qquad (14)$$
and $V_H(\mathbf{r})$ is the Hartree potential,

$$V_H(\mathbf{r}) = e^2 \int d\mathbf{r}'\, \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}. \qquad (15)$$
The "screened" coefficients $D_{nm}^I$ appearing in Eq. (12) are defined as

$$D_{nm}^I = D_{nm}^{(0)} + \int d\mathbf{r}\, V_{\rm eff}(\mathbf{r})\, Q_{nm}^I(\mathbf{r}). \qquad (16)$$
The $D_{nm}^I$ depend on the KS orbitals through $V_{\rm eff}$ (Eq. (13)) and the charge density $n(\mathbf{r})$ (Eq. (6)). Since the KS Hamiltonian in Eq. (11) depends on the KS orbitals $\phi_i$ via the charge density, the solution of Eq. (11) is achieved by an iterative self-consistent field procedure.
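The structure of this cycle can be illustrated with a small runnable toy. The matrices, the density dependence of the "Hamiltonian", and the mixing scheme below are invented purely for illustration and are not the implementation of Ref. [1]:

```python
import numpy as np
from scipy.linalg import eigh

# Toy self-consistency cycle: a small model "Hamiltonian" whose diagonal
# depends on the occupied density, solved as the generalized eigenproblem
# H phi = eps S phi (cf. Eq. (11)), with simple linear density mixing.
rng = np.random.default_rng(0)
nbasis, nocc = 8, 2
T = np.diag(np.arange(1.0, nbasis + 1.0))            # fixed one-body part
S = np.eye(nbasis) + 0.05 * rng.standard_normal((nbasis, nbasis))
S = 0.5 * (S + S.T) + nbasis * np.eye(nbasis)        # well-conditioned overlap

def hamiltonian(n):                                  # stands for Eqs. (12)-(16)
    return T + np.diag(0.5 * n)

n = np.zeros(nbasis)
for it in range(200):
    eps, C = eigh(hamiltonian(n), S)                 # cf. Eq. (11)
    n_new = 2.0 * np.sum(C[:, :nocc] ** 2, axis=1)   # occupied "density"
    if np.max(np.abs(n_new - n)) < 1e-10:            # self-consistency reached
        break
    n = 0.7 * n + 0.3 * n_new                        # linear mixing
print("converged in", it, "iterations")
```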
2. First-Principles Molecular Dynamics: Born–Oppenheimer and Car–Parrinello
We will assume here that all nuclei (together with their core electrons) can be treated as classical particles; furthermore, we consider only systems for which a separation between the classical motion of the atoms and the quantum motion of the electrons can be achieved, i.e., systems satisfying the
Born–Oppenheimer adiabatic approximation. For any given ionic configuration, it is possible to calculate the self-consistent electronic ground state and, consequently, the forces acting on the ions by virtue of the Hellmann–Feynman theorem. The knowledge of the ionic forces then allows one to evolve the nuclear trajectories in time, using any of the algorithms developed in classical mechanics for the finite-difference solution of Newton's equations of motion (two of the most popular choices are Verlet algorithms and Gear predictor–corrector approaches). Born–Oppenheimer MD strives for an accurate evolution of the ions by alternately converging the electronic wavefunctions to full self-consistency, for a given set of nuclear coordinates, and then evolving the ions by a finite time step according to the quantum mechanical forces acting on them. A practical algorithm can be summarized as follows:
• self-consistent solution of the KS equations for a given ionic configuration $\{\mathbf{R}_I\}$;
• calculation of the forces acting on the nuclei via the Hellmann–Feynman theorem;
• integration of Newton's equations of motion for the nuclei;
• update of the ionic configuration.
This way, the nuclei move on the Born–Oppenheimer surface, i.e., with the electrons in their ground state for any instantaneous configuration of the $\{\mathbf{R}_I\}$ (a schematic loop of this kind is sketched below). An efficient implementation of this class of algorithms relies on efficient self-consistent minimization schemes for the electronic wavefunctions and on accurate extrapolations of the electronic ground state from one step to the next. The time step itself will only be limited by the need to integrate accurately the highest ionic frequencies. In addition, due to the impossibility of reaching perfect electronic self-consistency, a drift of the constant of motion is unavoidable, and long simulations require the use of a thermostat to compensate.
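A minimal runnable sketch of such a loop, for a single ionic degree of freedom, is the following; the Morse curve and all parameters are invented and stand in for the full self-consistent solution of the KS equations (velocity Verlet is used for simplicity):

```python
import numpy as np

# Schematic Born-Oppenheimer MD for one "bond length" (atomic units).
D, a, r0, M, dt = 0.2, 1.0, 2.0, 1836.0, 10.0    # invented toy parameters

def ground_state_energy(r):       # placeholder for the SCF electronic problem
    return D * (1.0 - np.exp(-a * (r - r0))) ** 2

def force(r):                     # Hellmann-Feynman-like force, -dE/dR
    e = np.exp(-a * (r - r0))
    return -2.0 * D * a * (1.0 - e) * e

r, v = 2.5, 0.0
f = force(r)
for step in range(1000):
    r += v * dt + 0.5 * (f / M) * dt ** 2        # move the nucleus
    f_new = force(r)                             # new "SCF" + new forces
    v += 0.5 * (f + f_new) / M * dt              # velocity Verlet
    f = f_new
print("final bond length:", r)
```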
On the other hand, the Car–Parrinello approach [5] combines "on-the-fly" the simultaneous classical MD evolution of the atomic nuclei with the determination of the ground-state wavefunction for the electrons. A (fictitious) dynamics for the electronic degrees of freedom is introduced, defining a classical Lagrangian for the combined electronic and ionic degrees of freedom,

$$L = \mu \sum_i \int d\mathbf{r}\, |\dot\phi_i(\mathbf{r})|^2 + \frac{1}{2}\sum_I M_I \dot{\mathbf{R}}_I^2 - E_{\rm tot}(\{\phi_i\},\{\mathbf{R}_I\}); \qquad (17)$$
the wavefunctions above are subject to the set of orthonormality constraints

$$N_{ij}(\{\phi_i\},\{\mathbf{R}_I\}) = \langle\phi_i|S(\{\mathbf{R}_I\})|\phi_j\rangle - \delta_{ij} = 0. \qquad (18)$$
Here, $\mu$ is a mass parameter coupled to the electronic degrees of freedom, $M_I$ are the masses of the atoms, and $E_{\rm tot}$ and $S$ were given in Eqs. (1) and (8),
respectively. The first term in Eq. (17) plays the role of a kinetic energy associated with the electronic degrees of freedom. The orthonormality constraints (18) are holonomic and do not lead to energy dissipation in an MD run. The Euler equations of motion generated by the Lagrangian of Eq. (17) under the constraints (18) are

$$\mu\ddot\phi_i = -\frac{\delta E_{\rm tot}}{\delta\phi_i^*} + \sum_j \Lambda_{ij}\, S\phi_j, \qquad (19)$$

$$\mathbf{F}_I = M_I \ddot{\mathbf{R}}_I = -\frac{\partial E_{\rm tot}}{\partial\mathbf{R}_I} + \sum_{ij} \Lambda_{ij} \left\langle \phi_i \left| \frac{\partial S}{\partial\mathbf{R}_I} \right| \phi_j \right\rangle, \qquad (20)$$
where the $\Lambda_{ij}$ are Lagrange multipliers enforcing orthogonality. If the system is in the electronic ground state corresponding to the nuclear configuration at that time step, the forces acting on the electronic degrees of freedom vanish ($\mu\ddot\phi_i = 0$) and Eq. (19) reduces to the KS equations (10) or (11). A unitary rotation brings the $\Lambda$ matrix into diagonal form, $\Lambda_{ij} = \epsilon_i\delta_{ij}$. Similarly, the equilibrium nuclear configuration is achieved when the atomic forces $\mathbf{F}_I$ in Eq. (20) vanish. In deriving explicit expressions for the forces, Eq. (20), one should keep in mind that the electron density also depends on $\mathbf{R}_I$ through $Q_{nm}^I$ and $\beta_n^I$. Introducing the quantities

$$\rho_{nm}^I = \sum_i \langle\phi_i|\beta_n^I\rangle\langle\beta_m^I|\phi_i\rangle \qquad (21)$$
and

$$\omega_{nm}^I = \sum_{ij} \Lambda_{ij}\, \langle\phi_j|\beta_n^I\rangle\langle\beta_m^I|\phi_i\rangle, \qquad (22)$$
we arrive at the expression

$$\mathbf{F}_I = -\frac{\partial U}{\partial\mathbf{R}_I} - \int d\mathbf{r}\, \frac{\partial V_{\rm loc}^{\rm ion}}{\partial\mathbf{R}_I}\, n(\mathbf{r}) - \sum_{nm} \int d\mathbf{r}\, V_{\rm eff}(\mathbf{r})\, \frac{\partial Q_{nm}^I(\mathbf{r})}{\partial\mathbf{R}_I}\, \rho_{nm}^I - \sum_{nm} \left( D_{nm}^I\, \frac{\partial\rho_{nm}^I}{\partial\mathbf{R}_I} + q_{nm}\, \frac{\partial\omega_{nm}^I}{\partial\mathbf{R}_I} \right), \qquad (23)$$
where $D_{nm}^I$ and $V_{\rm eff}$ have been defined in Eqs. (16) and (13), respectively. The last term of Eq. (23) gives the constraint contribution to the forces. We underline that the dynamical evolution of the electronic degrees of freedom should not be construed as representing the true electron dynamics; rather, it represents a dynamical system of fictitious degrees of freedom, adiabatically decoupled from the moving ions but driven to follow closely the ionic dynamics, with small and oscillatory departures from what would be the exact Born–Oppenheimer ground-state energy. As a consequence, even
the Car–Parrinello dynamics for the nuclei becomes in principle inequivalent to the Born–Oppenheimer dynamics. However, suitable choices for the computational parameters used in the simulation exist, such that the two dynamics give the same macroscopic observables. The full self-consistency cycle of the Born–Oppenheimer dynamics can then be dispensed with, at a great computational advantage only marginally offset by the need to use shorter time steps to integrate the fast electronic degrees of freedom. The adiabatic separation can be understood on the basis of the following argument [6, 7]. The fictitious electronic dynamics, once close to the ground state, can be described as a superposition of harmonic oscillators whose frequencies are given by

$$\omega_{ij} = \left( \frac{2(\epsilon_j - \epsilon_i)}{\mu} \right)^{1/2}, \qquad (24)$$
where $\epsilon_i$ is the KS eigenvalue of the $i$th occupied orbital and $\epsilon_j$ is the KS eigenvalue of the $j$th unoccupied orbital. For a system with an energy gap $E_g$, the lowest frequency can be estimated as $\omega_{\rm min} = (2E_g/\mu)^{1/2}$. If $\omega_{\rm min}$ is much larger than the highest frequency appearing in the nuclear motion, there is a large separation between electronic and nuclear frequencies. Under such conditions, the electronic motion is adiabatically decoupled from the nuclear motion and there is negligible energy transfer from nuclear to electronic degrees of freedom. This is a nonobvious result, since both dynamics are classical and subject to the equipartition of energy, and it is the key to understanding when and why the Car–Parrinello dynamics works. For typical $E_g$ values, of the order of a few electronvolts, the electronic mass parameter $\mu$ can be chosen relatively large, of the order of 300–500 atomic units or even more, without any loss of adiabaticity. The time step of the simulation can be chosen as the largest compatible with the resulting electronic dynamics. Larger values of $\mu$ allow the use of larger time steps, but the requirement of adiabaticity sets an upper limit to $\mu$. Time steps of a fraction of a femtosecond are typically accessible. The electronic dynamics is faster than the nuclear dynamics and averages out the error on the forces that is present because the system is never at the instantaneous electronic ground state, but only close to it (the system has to be brought close to the electronic ground state at the beginning of the dynamics). In such conditions, the resulting nuclear dynamics is very close to the true Born–Oppenheimer dynamics, and the electronic dynamics is stable (with negligible energy transfer from the nuclei) even for long simulation times. Moreover, the Car–Parrinello dynamics is computationally more convenient than the Born–Oppenheimer dynamics, because the latter requires a high accuracy in self-consistency in order to provide the needed accuracy on the forces. The Car–Parrinello dynamics does not provide accurate instantaneous forces, but it provides accurate average nuclear trajectories.
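As an order-of-magnitude illustration of Eq. (24) (the gap and the vibrational frequency below are illustrative values, not taken from any specific system):

```python
import numpy as np

# Frequency-separation estimate, Eq. (24), in Hartree atomic units.
Ha_eV, cm1_au, fs_au = 27.2114, 1.0 / 219474.63, 0.02418884   # conversions

E_gap = 3.0 / Ha_eV                 # a 3 eV gap, in Hartree
mu = 400.0                          # fictitious electron mass (a.u.)
omega_min = np.sqrt(2.0 * E_gap / mu)        # lowest electronic frequency
omega_ion = 1000.0 * cm1_au         # a stiff 1000 cm^-1 vibration

print("omega_min / omega_ion =", omega_min / omega_ion)      # ~5: adiabatic
print("electronic period:", 2 * np.pi / omega_min * fs_au, "fs")
# A stable integration needs dt well below this period, i.e. a fraction
# of a femtosecond, consistent with the text.
```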
2.1. Equations of Motion and Orthonormality Constraints
In Car–Parrinello implementations, the equations of motion (19) and (20) are discretized using the standard Verlet or the velocity-Verlet algorithm. The following discussion, including the treatment of the $\mathbf{R}_I$-dependence of the orthonormality constraints, applies to the standard Verlet algorithm combined with the Fourier acceleration scheme of Tassone et al. [8]. (In this approach the fictitious electronic mass is generally represented by an operator $\hat\mu$, chosen in such a way as to reduce the highest electronic frequencies.*) From the knowledge of the electronic orbitals at times $t$ and $t-\Delta t$, the orbitals at $t+\Delta t$ are given, in the standard Verlet, by

$$\phi_i(t+\Delta t) = 2\phi_i(t) - \phi_i(t-\Delta t) - (\Delta t)^2\, \hat\mu^{-1} \left[ \frac{\delta E_{\rm tot}}{\delta\phi_i^*} - \sum_j \Lambda_{ij}(t+\Delta t)\, S(t)\phi_j(t) \right], \qquad (25)$$
where $\Delta t$ is the time step and $S(t)$ indicates the operator $S$ evaluated for nuclear positions $\mathbf{R}_I(t)$. Similarly, the nuclear coordinates at time $t+\Delta t$ are given by

$$\mathbf{R}_I(t+\Delta t) = 2\mathbf{R}_I(t) - \mathbf{R}_I(t-\Delta t) - \frac{(\Delta t)^2}{M_I} \left[ \frac{\partial E_{\rm tot}}{\partial\mathbf{R}_I} - \sum_{ij} \Lambda_{ij}(t+\Delta t) \left\langle \phi_i(t) \left| \frac{\partial S(t)}{\partial\mathbf{R}_I} \right| \phi_j(t) \right\rangle \right]. \qquad (26)$$
The orthonormality conditions must be imposed at each time step,

$$\langle\phi_i(t+\Delta t)|S(t+\Delta t)|\phi_j(t+\Delta t)\rangle = \delta_{ij}, \qquad (27)$$

leading to the following matrix equation:

$$A + \lambda B + B^\dagger\lambda^\dagger + \lambda C\lambda^\dagger = 1, \qquad (28)$$
where the unknown matrix $\lambda$ is related to the matrix of Lagrange multipliers at time $t+\Delta t$ via $\lambda = (\Delta t)^2\, \Lambda(t+\Delta t)$, and the dagger in Eq. (28) indicates the Hermitian conjugate ($\lambda = \lambda^\dagger$).

*When using plane waves, a convenient choice for the matrix elements of such an operator is $\hat\mu_{\mathbf{G},\mathbf{G}'} = \max\!\left(\mu,\ \mu\,\hbar^2 G^2/(2m E_c)\right)\delta_{\mathbf{G},\mathbf{G}'}$, where $\mathbf{G}$, $\mathbf{G}'$ are the wave vectors of the PWs and $E_c$ is a cutoff (typically a few Ry) which defines the threshold for Fourier acceleration. The fictitious electron mass depends on $G$ as the kinetic energy for large $G$, and is constant for small $G$. This scheme allows the use of larger time steps with negligible computational overhead.
The matrices $A$, $B$, and $C$ are given by

$$A_{ij} = \langle\bar\phi_i|S(t+\Delta t)|\bar\phi_j\rangle, \quad B_{ij} = \langle\hat\mu^{-1}S(t)\phi_i(t)|S(t+\Delta t)|\bar\phi_j\rangle, \quad C_{ij} = \langle\hat\mu^{-1}S(t)\phi_i(t)|S(t+\Delta t)|\hat\mu^{-1}S(t)\phi_j(t)\rangle, \qquad (29)$$

with

$$\bar\phi_i = 2\phi_i(t) - \phi_i(t-\Delta t) - (\Delta t)^2\, \hat\mu^{-1}\, \frac{\delta E_{\rm tot}(t)}{\delta\phi_i^*}. \qquad (30)$$
The solution of Eq. (28) in the ultrasoft PP case is not obvious, because Eq. (26) is not a closed expression for $\mathbf{R}_I(t+\Delta t)$. The problem is that $\Lambda(t+\Delta t)$, appearing in Eq. (26), depends implicitly on $\mathbf{R}_I(t+\Delta t)$ through $S(t+\Delta t)$. Consequently, it is in principle necessary to solve iteratively for $\mathbf{R}_I(t+\Delta t)$ in Eq. (26). A simple solution to this problem was provided in Laasonen et al. [4]: $\Lambda(t+\Delta t)$ is extrapolated using the two previous values,

$$\Lambda_{ij}^{(0)}(t+\Delta t) = 2\Lambda_{ij}(t) - \Lambda_{ij}(t-\Delta t). \qquad (31)$$

Equation (26) is used to find $\mathbf{R}_I^{(0)}(t+\Delta t)$, which is correct to $O(\Delta t^4)$. From $\mathbf{R}_I^{(0)}(t+\Delta t)$ we can obtain a new set $\Lambda_{ij}^{(1)}(t+\Delta t)$ and repeat the procedure until convergence is achieved. It turns out that in most practical applications the procedure converges at the very first iteration. Thus, the operations described above are generally executed only once per time step. The solution of Eq. (28) is found using a modified version [4, 9] of the iterative procedure of Car and Parrinello [10]. The matrix $B$ is decomposed into Hermitian ($B_h$) and antihermitian ($B_a$) parts,
$$B = B_h + B_a, \qquad (32)$$
and the solution is obtained by iteration:

$$\lambda^{(n+1)} B_h + B_h \lambda^{(n+1)} = 1 - A - \lambda^{(n)} B_a - B_a^\dagger \lambda^{(n)} - \lambda^{(n)} C \lambda^{(n)}. \qquad (33)$$
The initial guess $\lambda^{(0)}$ can be obtained from

$$\lambda^{(0)} B_h + B_h \lambda^{(0)} = 1 - A. \qquad (34)$$
Here, the $B_a$- and $C$-dependent terms are neglected because they are of higher order in $\Delta t$ ($B_a$ vanishes for vanishing $\Delta t$). Equations (34) and (33) have the same structure,

$$\lambda B_h + B_h \lambda = X, \qquad (35)$$
where $X$ is a Hermitian matrix. Equation (35) can be solved exactly by finding the unitary matrix $U$ that diagonalizes $B_h$: $U^\dagger B_h U = D$, where $D_{ij} = d_i\delta_{ij}$. The solution is obtained from

$$(U^\dagger \lambda U)_{ij} = (U^\dagger X U)_{ij} / (d_i + d_j). \qquad (36)$$
When $X = 1 - A$, Eq. (36) yields the starting $\lambda^{(0)}$, while $\lambda^{(n+1)}$ is obtained from $\lambda^{(n)}$ by solving Eq. (36) with $X$ given by Eq. (33). This iterative procedure usually converges in very few steps (ten or less).
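A minimal runnable sketch of this iterative solution, Eqs. (32)–(36), on randomly generated matrices with the structure of Eq. (29); the sizes and perturbation strengths are invented:

```python
import numpy as np

def sylvester(Bh, X):
    """Solve lam @ Bh + Bh @ lam = X (Eqs. 35-36) by diagonalizing Bh."""
    d, U = np.linalg.eigh(Bh)
    Y = U.T @ X @ U
    return U @ (Y / (d[:, None] + d[None, :])) @ U.T

def sym(M):
    return 0.5 * (M + M.T)

rng = np.random.default_rng(1)
N = 6
# A and C Hermitian and close to the identity; B close to the identity
# with a small antihermitian part (B_a vanishes as dt -> 0).
A = np.eye(N) + 0.05 * sym(rng.standard_normal((N, N)))
C = np.eye(N) + 0.05 * sym(rng.standard_normal((N, N)))
B = np.eye(N) + 0.02 * rng.standard_normal((N, N))
Bh, Ba = sym(B), 0.5 * (B - B.T)                    # Eq. (32)

lam = sylvester(Bh, np.eye(N) - A)                  # initial guess, Eq. (34)
for it in range(100):                               # iteration, Eq. (33)
    X = np.eye(N) - A - lam @ Ba - Ba.T @ lam - lam @ C @ lam
    lam_new = sylvester(Bh, X)
    if np.max(np.abs(lam_new - lam)) < 1e-14:
        break
    lam = lam_new
# Verify Eq. (28): A + lam B + B^T lam^T + lam C lam^T = 1
res = A + lam @ B + B.T @ lam.T + lam @ C @ lam.T - np.eye(N)
print("converged in", it, "iterations; residual", np.max(np.abs(res)))
```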
3. Plane-Wave Implementation
In most standard implementations, first-principles MD schemes employ a plane-wave (PW) basis set. An advantage of PWs is that they do not depend on the atomic positions and are free of basis-set superposition errors. Total energies and forces on the atoms can be calculated using computationally efficient fast Fourier transform (FFT) techniques, and Pulay forces [11] vanish because PWs do not depend on atomic positions. Finally, the convergence of a calculation can be controlled in a simple way, since it depends only upon the number of PWs included in the expansion of the electron density. The dimension of a PW basis set is controlled by a cutoff in the kinetic energy of the PWs. A disadvantage of PWs is their extremely slow convergence in describing core states, which can however be circumvented by the use of PPs. Ultrasoft PPs allow one to deal efficiently with this difficulty also in systems containing transition metals or the first-row elements O, N, F, whose 3d and 2p orbitals, respectively, are very contracted. The use of a PW basis set implies that periodic boundary conditions are imposed. Systems not having translational symmetry in one or more directions have to be placed into a suitable periodically repeated box (a "supercell"). Let $\{\mathbf{R}\}$ be the translation vectors of the periodically repeated supercell. The corresponding reciprocal lattice vectors $\{\mathbf{G}\}$ obey the conditions $\mathbf{R}_i \cdot \mathbf{G}_j = 2\pi n$, with $n$ an integer number. The KS orbitals can be expanded in a plane-wave basis up to a kinetic energy cutoff $E_c^{\rm wf}$:

$$\phi_{j,\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{\Omega}} \sum_{\mathbf{G}\in\{\mathbf{G}_c^{\rm wf}\}} \phi_{j,\mathbf{k}}(\mathbf{G})\, e^{-i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}}, \qquad (37)$$

where $\Omega$ is the volume of the cell and $\{\mathbf{G}_c^{\rm wf}\}$ is the set of $\mathbf{G}$ vectors satisfying the condition

$$\frac{\hbar^2 |\mathbf{k}+\mathbf{G}|^2}{2m} < E_c^{\rm wf}, \qquad (38)$$
and k is the Bloch vector of the electronic states. In crystals, one must use a grid of k-points dense enough to sample the Brillouin zone (the unit cell of the
reciprocal lattice). In molecules, liquids, and in general if the simulation cell is large enough, the Brillouin zone can be sampled using only the $\mathbf{k}=0$ ($\Gamma$) point. An advantage of this choice is that the orbitals can be taken to be real in r-space. In the following we will drop the $\mathbf{k}$ vector index. Functions in real space and their Fourier transforms will be denoted by the same symbols, when this does not originate ambiguity. The $\phi_j(\mathbf{G})$s are the actual electronic variables in the fictitious dynamics. The calculation of $H\phi_j$ and of the forces acting on the ions are the basic ingredients of the computation. Scalar products $\langle\phi_j|\beta_n^I\rangle$ and their spatial derivatives are typically evaluated in G-space. An important advantage of working in G-space is that atom-centred functions like $\beta_n^I$ and $Q_{nm}^I$ are easily evaluated at any atomic position:

$$\beta_n^I(\mathbf{G}) = \beta_n(\mathbf{G})\, e^{-i\mathbf{G}\cdot\mathbf{R}_I}. \qquad (39)$$
φ j |βnI =
φ ∗j (G)βn (G)e−iG·R I
(40)
G∈{Gcwf }
and
∂β I n φj = −i ∂R I
Gφ ∗j (G)βn (G)e−iG·R I .
(41)
G∈{Gcwf }
The kinetic energy term is diagonal in G-space and is easily calculated:

$$(-\nabla^2\phi_j)(\mathbf{G}) = G^2\, \phi_j(\mathbf{G}). \qquad (42)$$
In summary, the kinetic and nonlocal PP terms in $H\phi_j$ are calculated in G-space, while the local potential term $V_{\rm eff}\phi_j$, which could be calculated in G-space, is more conveniently determined using a "dual space" technique: switching from G- to r-space with FFTs, and performing the calculation in the space where it is least expensive. In practice, the KS orbitals are first Fourier-transformed to r-space; then $(V_{\rm eff}\phi_j)(\mathbf{r}) = V_{\rm eff}(\mathbf{r})\phi_j(\mathbf{r})$ is calculated in r-space, where $V_{\rm eff}$ is diagonal; finally, $(V_{\rm eff}\phi_j)(\mathbf{r})$ is Fourier-transformed back to $(V_{\rm eff}\phi_j)(\mathbf{G})$. In order to use FFTs, r-space is discretized by a uniform grid spanning the unit cell:

$$f(m_1,m_2,m_3) \equiv f(\mathbf{r}_{m_1,m_2,m_3}), \qquad \mathbf{r}_{m_1,m_2,m_3} = m_1\frac{\mathbf{a}_1}{N_1} + m_2\frac{\mathbf{a}_2}{N_2} + m_3\frac{\mathbf{a}_3}{N_3}, \qquad (43)$$
where $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ are lattice basis vectors, the integer index $m_1$ runs from 0 to $N_1-1$, and similarly for $m_2$ and $m_3$. In the following we will assume
for simplicity that $N_1$, $N_2$, $N_3$ are even numbers. The FFT maps a discrete periodic function in real space, $f(m_1,m_2,m_3)$, into a discrete periodic function in reciprocal space, $\tilde f(n_1,n_2,n_3)$ (where $n_1$ runs from 0 to $N_1-1$, and similarly for $n_2$ and $n_3$), and vice versa. The link between G-space components and FFT indices is

$$\tilde f(n_1,n_2,n_3) \equiv f(\mathbf{G}_{n_1,n_2,n_3}), \qquad \mathbf{G}_{n_1,n_2,n_3} = n_1'\,\mathbf{b}_1 + n_2'\,\mathbf{b}_2 + n_3'\,\mathbf{b}_3, \qquad (44)$$

where $n_1 = n_1'$ if $n_1' \geq 0$, $n_1 = n_1' + N_1$ if $n_1' < 0$, and similarly for $n_2$ and $n_3$. The FFT dimensions $N_1$, $N_2$, $N_3$ must be big enough to include all non-negligible Fourier components of the function to be transformed: ideally, the Fourier component corresponding to $n_1 = N_1/2$ (and similarly for $n_2$ and $n_3$) should vanish. In the following, we will refer to the set of indices $n_1$, $n_2$, $n_3$ and to the corresponding Fourier components as the "FFT grid". The soft part of the charge density, $n_{\rm soft}(\mathbf{r}) = \sum_j |\phi_j(\mathbf{r})|^2$, contains Fourier components up to a kinetic energy cutoff $E_c^{\rm soft} = 4E_c^{\rm wf}$. This is evident from the formula

$$n_{\rm soft}(\mathbf{G}) = \sum_j \sum_{\mathbf{G}'\in\{\mathbf{G}_c^{\rm wf}\}} \phi_j^*(\mathbf{G}-\mathbf{G}')\, \phi_j(\mathbf{G}'). \qquad (45)$$
In the case of norm-conserving PPs, the entire charge density is given by $n_{\rm soft}(\mathbf{r})$. $V_{\rm eff}$ should be expanded up to the same $E_c^{\rm soft}$ cutoff, since all the Fourier components of $V_{\rm eff}\phi_j$ up to $E_c^{\rm wf}$ are required. Let us call $\{\mathbf{G}_c^{\rm soft}\}$ the set of G-vectors such that

$$\frac{\hbar^2 G^2}{2m} < E_c^{\rm soft}. \qquad (46)$$

The soft part of the charge density is calculated in r-space, by Fourier-transforming $\phi_j(\mathbf{G})$ into $\phi_j(\mathbf{r})$ and summing over the occupied states. The exchange-correlation potential $\mu_{xc}(\mathbf{r})$, Eq. (14), is a function of the local charge density and, for gradient-corrected functionals, of its gradient at point $\mathbf{r}$:

$$\mu_{xc}(\mathbf{r}) = V_{xc}(n(\mathbf{r}), |\nabla n(\mathbf{r})|). \qquad (47)$$
The gradient $\nabla n(\mathbf{r})$ is conveniently calculated from the charge density in G-space, using $(\nabla n)(\mathbf{G}) = -i\mathbf{G}\, n(\mathbf{G})$. The Hartree potential $V_H(\mathbf{r})$, Eq. (15), is also conveniently calculated in G-space:

$$V_H(\mathbf{G}) = 4\pi e^2\, \frac{n(\mathbf{G})}{G^2}. \qquad (48)$$
Thus, in the case of norm-conserving PPs, a single FFT grid, large enough to accommodate the {Gcsoft} set, can be used for orbitals, charge density, and potential.
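The G-space solution of the Poisson equation, Eq. (48), can be sketched on the same kind of toy grid (one-dimensional purely to show the bookkeeping; $e^2 = 1$ and the divergent $\mathbf{G}=0$ component is set to zero, as appropriate for a neutral cell):

```python
import numpy as np

L, N = 10.0, 64
x = np.arange(N) * L / N
G = 2 * np.pi * np.fft.fftfreq(N, d=L / N)

n_r = np.exp(-((x - L / 2) ** 2))
n_r -= n_r.mean()                            # neutral cell: n(G=0) = 0
n_G = np.fft.fft(n_r)

VH_G = np.zeros_like(n_G)
VH_G[1:] = 4 * np.pi * n_G[1:] / G[1:] ** 2  # Eq. (48) with e^2 = 1
VH_r = np.fft.ifft(VH_G).real                # Hartree potential in r-space

dn_r = np.fft.ifft(1j * G * n_G).real        # gradient: +iG n(G) in NumPy's
                                             # convention (-iG in the text's)
```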
The use of FFTs is mathematically equivalent to a pure G-space description (we neglect here a small inconsistency in the exchange-correlation potential and energy density, due to the presence of a small amount of components beyond the $\{\mathbf{G}_c^{\rm soft}\}$ set). This has important consequences: working in G-space means that translational invariance is exactly conserved and that forces are analytical derivatives of the energy (apart from the effect of the small inconsistency mentioned above). Forces that are analytical derivatives of the energy ensure that the constant of motion (i.e., the sum of kinetic and potential energy of the ions in Newtonian dynamics) is conserved during the evolution.
3.1. Double-Grid Technique
Let us focus on ultrasoft PPs. In G-space the charge density is

$$n(\mathbf{G}) = n_{\rm soft}(\mathbf{G}) + \sum_{i,nm,I} Q_{nm}^I(\mathbf{G})\, \langle\phi_i|\beta_n^I\rangle\langle\beta_m^I|\phi_i\rangle. \qquad (49)$$
The augmentation term often requires a cutoff higher than $E_c^{\rm soft}$ and, as a consequence, a larger set of G-vectors. Let us call $\{\mathbf{G}_c^{\rm dens}\}$ the set of G-vectors that are needed for the augmented part:

$$\frac{\hbar^2 G^2}{2m} < E_c^{\rm dens}. \qquad (50)$$
In typical situations, using pseudized augmented charges, $E_c^{\rm dens}$ ranges from $E_c^{\rm soft}$ to $\sim$2–3 $E_c^{\rm soft}$. The same FFT grid could be used both for the augmented charge density and for the KS orbitals. This, however, would imply using an oversized FFT grid in the most expensive part of the calculation, dramatically increasing computer time. A better solution is to introduce two FFT grids:
• a coarser grid (in r-space) for the KS orbitals and the soft part of the charge density. The FFT dimensions $N_1$, $N_2$, $N_3$ of this grid are big enough to accommodate all G-vectors in $\{\mathbf{G}_c^{\rm soft}\}$;
• a denser grid (in r-space) for the total charge density and the exchange-correlation and Hartree potentials. The FFT dimensions $M_1 \geq N_1$, $M_2 \geq N_2$, $M_3 \geq N_3$ of this grid are big enough to accommodate all G-vectors in $\{\mathbf{G}_c^{\rm dens}\}$.
In this framework, the soft part of the electron density, $n_{\rm soft}$, is calculated in r-space using FFTs on the coarse grid and transformed to G-space using a coarse-grid FFT on the $\{\mathbf{G}_c^{\rm soft}\}$ grid. The augmented charge density is calculated in G-space on the $\{\mathbf{G}_c^{\rm dens}\}$ grid, using Eq. (49), as described in the next section. $n(\mathbf{G})$ is used to evaluate the Hartree potential, Eq. (48). Then
$n(\mathbf{G})$ is Fourier-transformed to r-space on the dense grid, where the exchange-correlation potential, Eq. (47), is evaluated. In real space, the two grids are not necessarily commensurate. Whenever the need arises to go from the coarse to the dense grid, or vice versa, this is done in G-space. For instance, the potential $V_{\rm eff}$, Eq. (13), is needed both on the dense grid, to calculate quantities such as the $D_{nm}^I$, Eq. (16), and on the coarse grid, to calculate $V_{\rm eff}\phi_j$, Eq. (11). The connection between the two grids occurs in G-space, where Fourier filtering is performed: $V_{\rm eff}$ is first transformed to G-space on the dense grid, then transferred to the coarse G-space grid by eliminating the components incompatible with $E_c^{\rm soft}$, and then back-transformed to r-space using a coarse-grid FFT. We remark that for each time step only a few dense-grid FFTs are performed, while the number of necessary coarse-grid FFTs is much larger, proportional to the number of KS states $N_{ks}$.
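A one-dimensional sketch of this Fourier filtering between the dense and coarse grids (the grid sizes and the filtered function are invented):

```python
import numpy as np

L, Nc, Nd = 10.0, 32, 96                     # cell, coarse and dense grids
xd = np.arange(Nd) * L / Nd
Veff_dense = np.exp(np.cos(2 * np.pi * xd / L))       # dense-grid function

VG_dense = np.fft.fft(Veff_dense) / Nd       # dense-grid Fourier coefficients
half = Nc // 2
VG_coarse = np.zeros(Nc, dtype=complex)
VG_coarse[:half] = VG_dense[:half]           # components with n' >= 0
VG_coarse[-half:] = VG_dense[-half:]         # components with n' < 0 (Eq. 44)
Veff_coarse = np.fft.ifft(VG_coarse * Nc).real        # back on the coarse grid
```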
3.2. Augmentation Boxes
Let us consider the augmentation functions $Q_{nm}$, which appear in the calculation of the electron density, Eq. (49), in the calculation of $D_{nm}^I$, Eq. (16), and in the integrals involving $\partial Q_{nm}^I/\partial\mathbf{R}_I$ needed to compute the forces acting on the nuclei, Eq. (23). The calculation of the $Q_{nm}$ in G-space has a large computational cost, because the cutoff for the $Q_{nm}$ is the large cutoff $E_c^{\rm dens}$. The computational cost can be significantly reduced if we take advantage of the localization of the $Q_{nm}$ in the core region. We call an "augmentation box" a fraction of the supercell containing a small portion of the dense grid in real space. An augmentation box is defined only for atoms described by ultrasoft PPs. The augmentation box for atom $I$ is centred at the point of the dense grid that is closest to the position $\mathbf{R}_I$. During an MD run, the centre of the $I$th augmentation box makes discontinuous jumps to one of the neighbouring grid points whenever the position vector $\mathbf{R}_I$ gets closer to such a grid point. In an MD run, the augmentation box must always completely contain the augmented charge belonging to the $I$th atom; at the same time, it should be as small as possible. The volume of the augmentation box is much smaller than the volume of the supercell. The number of G-vectors in the reciprocal space of the augmentation box is smaller than the number of G-vectors in the dense grid by the ratio of the volumes of the augmentation box and of the supercell. As a consequence, the cost of calculations on the augmentation boxes increases linearly with the number of atoms described by ultrasoft PPs. Augmentation boxes are used (i) to construct the augmented charge density, Eq. (6), and (ii) to calculate the self-consistent contribution to the
coefficients of the nonlocal PP, Eq. (16). In case (i), the augmented charge is conveniently calculated in G-space, following [4], and Fourier-transformed to r-space. All these calculations are done on the augmentation box grid. The calculated contribution at each r-point of the augmentation box grid is then added to the charge density at the same point of the dense grid. In case (ii), it is convenient to calculate $D_{nm}^I$ as follows: for every atom described by an ultrasoft PP, take the Fourier transform of $V_{\rm eff}(\mathbf{r})$ on the corresponding augmentation box grid and evaluate the integral of Eq. (16) in G-space.
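A minimal sketch of the augmentation-box bookkeeping, i.e., locating a small box of the dense grid around an atom (with periodic wrap-around) and accumulating a localized contribution into the dense grid; all shapes and the content of the box are invented:

```python
import numpy as np

M = (96, 96, 96)                         # dense-grid dimensions (invented)
box = (20, 20, 20)                       # augmentation-box dimensions
n_dense = np.zeros(M)

def box_indices(center_frac, M, box):
    """Dense-grid indices of a box centred at the grid point nearest to R_I,
    with periodic wrap-around at the supercell boundaries."""
    c = [int(round(f * m)) % m for f, m in zip(center_frac, M)]
    return [(np.arange(b) - b // 2 + ci) % m
            for b, ci, m in zip(box, c, M)]

ix, iy, iz = box_indices((0.97, 0.50, 0.02), M, box)   # an atom near the edge
n_aug = np.ones(box)                     # stands for the Q_nm^I contribution
n_dense[np.ix_(ix, iy, iz)] += n_aug     # accumulate box into the dense grid
```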
3.3. Parallelization
Various parallelization strategies for PW–PP calculations have been described in the literature. A strategy that ensures excellent scalability in terms of both computer time and memory consists in distributing the PW basis set and the FFT grid points in real and reciprocal space across processors. A crucial issue for the success of this approach is the FFT algorithm, which must be capable of performing three-dimensional FFT on data shared across different processors with good load balancing. The parallelization in the case of ultrasoft PPs is described in detail in Giannozzi et al. [12].
4. Applications
Presently, systems described by supercells containing up to a few hundred atoms are within the reach of first-principles MD. A large body of techniques developed for classical MD, such as simulated annealing, finite-temperature simulations, free-energy calculations, etc., can be straightforwardly extended to first-principles MD. Typical applications include the study of aperiodic systems: liquids, atomic clusters, large molecules, including biologically active sites; complex solid-state systems: defects in solids, defect diffusion, surface reconstructions; and dynamical processes: chemical reactions, catalysis, and finite-temperature studies. The use of ultrasoft PPs is especially convenient in the simulation of systems containing first-row atoms (C, N, O, F) and transition metal elements, such as, e.g., biologically active sites involving Fe, Mn, Ni as catalytic centres. A good example of an application of first-principles MD is the investigation of a complex organometallic reaction: the migratory insertion of carbon monoxide (CO) into zirconium–carbon bonds anchored to a calix[4]arene moiety, shown in Fig. 1 [13]. The investigated reactivity is representative of the large class of migratory insertions of carbon monoxide and alkyl-isocyanides into metal–alkyl bonds observed for most of the early d-block metals, leading to the formation of a new carbon–carbon bond [14].
Figure 1. Geometry of calix[4]arene.

Figure 2. Insertion of CO into the Zr–CH3 bond of a calix[4]arene.
The CO migratory insertion is believed to be initiated by the coordination of the nucleophilic CO species to the electron-deficient zirconium centre of [p-But calix[4](OMe)2(O)2–Zr(Me)2], 1 in Fig. 2, to form the relatively stable adduct 2. MD simulations were started by heating the structure of 2 up to a temperature of 300 K in small steps (via rescaling of atomic velocities). Both electronic and nuclear degrees of freedom were then allowed to evolve without any constraint for 2.4 ps. The migratory CO insertion can be followed by studying the time evolution of the carbon–carbon CH3–CO, metal–carbon Zr–CH3, and metal–oxygen Zr–O distances. Figure 3 clearly shows that the reactive CO migration takes place within ca. 0.4 ps: the fast decrease in the CH3–CO distance from ca. 2.7 Å to ca. 1.5 Å corresponds to the formation of the new CH3–CO carbon–carbon bond. At the same time the Zr–CH3 distance follows an almost complementary trajectory with respect to the CH3–CO distance and grows from ca. 2.4 up to ca. 3.7 Å, reflecting the methyl detachment from the metal centre upon CO insertion.
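A minimal sketch of this kind of analysis is given below; the trajectory is synthetic, whereas in an actual study it would be the array of nuclear positions $\mathbf{R}_I(t)$ produced by the simulation:

```python
import numpy as np

# Bond-distance time series from an MD trajectory (cf. Fig. 3).  Here the
# trajectory is a synthetic random walk with shape (nsteps, natoms, 3).
nsteps, natoms = 1000, 4
rng = np.random.default_rng(3)
traj = np.cumsum(0.01 * rng.standard_normal((nsteps, natoms, 3)), axis=0)

def distance_series(traj, i, j):
    """|R_i(t) - R_j(t)| for every step (no minimum-image handling)."""
    return np.linalg.norm(traj[:, i, :] - traj[:, j, :], axis=-1)

d_cc = distance_series(traj, 0, 1)    # e.g. the CH3-CO carbon-carbon pair
print(d_cc[:5])
```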
Figure 3. Evolution of the carbon–carbon CH3–CO, metal–carbon Zr–CH3, and metal–oxygen Zr–O distances (in Å) versus time (in ps) during the simulation of CO insertion into calix[4]arene.
The Zr–O distance is found to decrease from its initial value of ca. 3.5 Å in 2, to ca. 2.2 Å, corresponding to the Zr–O bond in 4, within 1.0 ps. The 0.6 ps delay between the formation of the CH3 –CO bond and the formation of the Zr–O bond suggests the initial formation of a transient species, 3 in Fig. 2, characterized by an η1 -coordination of the OC–CH3 acyl group with a formed CH3 –CO bond and still a long Zr–O bond; this η1 -acyl subsequently evolves to the corresponding η2 -bound acyl species. The short time stability of the η1 -acyl isomer (ca. 0.6 ps) suggests a negligible barrier for the conversion of the η1 into the more stable η2 -isomer, as confirmed by static DFT calculations.
Acknowledgments

Algorithms and codes presented in this work were originally developed at EPFL Lausanne by Alfredo Pasquarello and Roberto Car, and then at Princeton University by Paolo Giannozzi and Roberto Car. Several people have also contributed or are contributing to the current development and distribution under the GPL License: Kari Laasonen, Andrea Trave, Carlo Cavazzoni, and Nicola Marzari.
References

[1] A. Pasquarello, P. Giannozzi, K. Laasonen, A. Trave, N. Marzari, and R. Car, The Car–Parrinello molecular dynamics code described in this paper is freely available in the Quantum-ESPRESSO distribution, released under the GNU Public License at http://www.democritos.it/scientific.php, 2004.
[2] D. Vanderbilt, "Soft Self-Consistent Pseudopotentials in a Generalized Eigenvalue Formalism," Physical Review B, 41, 7892, 1990.
[3] D.R. Hamann, M. Schlüter, and C. Chiang, "Norm-Conserving Pseudopotentials," Physical Review Letters, 43, 1494, 1979.
[4] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, "Car–Parrinello Molecular Dynamics with Vanderbilt Ultrasoft Pseudopotentials," Physical Review B, 47, 10142, 1993.
[5] R. Car and M. Parrinello, "Unified Approach for Molecular Dynamics and Density-Functional Theory," Physical Review Letters, 55, 2471, 1985.
[6] G. Pastore, E. Smargiassi, and F. Buda, "Theory of Ab Initio Molecular-Dynamics Calculations," Physical Review A, 44, 6334, 1991.
[7] D. Marx and J. Hutter, "Ab-Initio Molecular Dynamics: Theory and Implementation," In: Modern Methods and Algorithms of Quantum Chemistry, John von Neumann Institute for Computing, FZ Jülich, pp. 301–449, 2000.
[8] F. Tassone, F. Mauri, and R. Car, "Acceleration Schemes for Ab Initio Molecular-Dynamics Simulations and Electronic-Structure Calculations," Physical Review B, 50, 10561, 1994.
[9] C. Cavazzoni and G.L. Chiarotti, "A Parallel and Modular Deformable Cell Car–Parrinello Code," Computer Physics Communications, 123, 56, 1999.
[10] R. Car and M. Parrinello, "The Unified Approach for Molecular Dynamics and Density Functional Theory," In: A. Polian, P. Loubeyre, and N. Boccara (eds.), Simple Molecular Systems at Very High Density, Plenum, New York, p. 455, 1989.
[11] P. Pulay, "Ab Initio Calculation of Force Constants and Equilibrium Geometries," Molecular Physics, 17, 197, 1969.
[12] P. Giannozzi, F. De Angelis, and R. Car, "First-Principle Molecular Dynamics with Ultrasoft Pseudopotentials: Parallel Implementation and Application to Extended Bio-Inorganic Systems," Journal of Chemical Physics, 120, 5903–5915, 2004.
[13] S. Fantacci, F. De Angelis, A. Sgamellotti, and N. Re, "Dynamical Density Functional Study of the Multistep CO Insertion into Zirconium–Carbon Bonds Anchored to a Calix[4]arene Moiety," Organometallics, 20, 4031, 2001.
[14] L.D. Durfee and I.P. Rothwell, "Chemistry of Eta-2-acyl, Eta-2-iminoacyl, and Related Functional Groups," Chemical Reviews, 88, 1059, 1988.
1.5 ELECTRONIC STRUCTURE CALCULATIONS WITH LOCALIZED ORBITALS: THE SIESTA METHOD Emilio Artacho1, Julian D. Gale2, Alberto García3, Javier Junquera4, Richard M. Martin5, Pablo Ordejón6, Daniel Sánchez-Portal7, and José M. Soler8 1 University of Cambridge, Cambridge, UK 2 Curtin University of Technology, Perth, Western Australia, Australia 3 Universidad del País Vasco, Bilbao, Spain 4 Rutgers University, New Jersey, USA 5 University of Illinois at Urbana, Urbana, IL, USA 6 Instituto de Materiales, CSIC, Barcelona, Spain 7 Donostia International Physics Center, Donostia, Spain 8 Universidad Autónoma de Madrid, Madrid, Spain
Practical quantum mechanical simulations of materials, which take into account explicitly the electronic degrees of freedom, are presently limited to about 1000 atoms. In contrast, the largest classical simulations, using empirical interatomic potentials, involve over $10^9$ atoms. Much of this $10^6$-factor difference is due to the existence of well-developed order-N algorithms for the classical problem, in which the computer time and memory scale linearly with the number of atoms $N$ of the simulated system. Furthermore, such algorithms are well suited for execution on parallel computers, using rather small interprocessor communications. In contrast, nearly all quantum mechanical simulations involve a computational effort which scales as $O(N^3)$, that is, as the cube of the number of atoms simulated. Such an intrinsically more expensive dependence is due to the delocalized character of the electron wavefunctions. Since the electrons are fermions, every one of the $\sim N$ occupied wavefunctions must be kept orthogonal to every other one, thus requiring $\sim N^2$ constraints, each involving an integral over the whole system, whose size is also proportional to $N$. Despite such intrinsic difficulties, the last decade has seen an intense advance in algorithms that allow quantum mechanical simulations with an
O(N) computational effort. Such algorithms are based on avoiding the spatially extended electron eigenfunctions and using instead magnitudes, such as the one-electron density matrix, that are spatially localized, thus allowing for a spatial decomposition of the electronic problem. This strategy exploits what has been called by Walter Kohn the "nearsightedness" of the electron gas [1]. Its implementation requires, or is greatly facilitated by, the use of a spatially localized basis set, such as a linear combination of atomic orbitals (LCAO). This paper gives a brief overview of such methods and describes in some detail one of them, the Spanish Initiative for Electronic Simulations with Thousands of Atoms (SIESTA).
1. Order-N Algorithms
Despite their relatively recent development, there are already good reviews of O(N) methods for the electronic structure problem, such as those of Ordejón [2] and Goedecker [3]. Here we will only explain briefly the basic difficulties and lines of solution, emphasizing the more practical aspects. Although some methods, such as that of Car and Parrinello, use a direct minimization approach, it is pedagogically convenient to consider the solution of the electronic problem as a two-step process. First, one needs to find the Hamiltonian (and eventually the overlap) matrix in some convenient basis. Second, one has to find the solution of Schrödinger's equation in that representation, that is, the electron wavefunctions or density matrix as a linear combination of basis functions. Since the effective electron potential, and therefore the Hamiltonian, depends on the electron density, this two-step process has to be iterated to self-consistency. Although both steps require highly nontrivial algorithms to be performed with O(N) effort, from a physical point of view the second one involves more fundamental problems and solutions. We will therefore first give, in this section, an overview of the second step, and leave for the next section the technical solution of the first step (the construction of the Hamiltonian), in the context of SIESTA. Although O(N) methods have been developed for Hartree–Fock calculations as well, here we will restrict ourselves to density functional theory (DFT), because the methods are more mature and easier to understand in this context. There are numerous good introductory reviews of DFT, such as Ref. [4]. A central magnitude in most O(N) methods is the one-electron density operator,

$$\hat\rho = \sum_i |\psi_i\rangle\, f(\epsilon_i)\, \langle\psi_i|. \qquad (1)$$
Its representation in real space is the density matrix,

$$\rho(\mathbf{r},\mathbf{r}') = \sum_i f(\epsilon_i)\, \psi_i(\mathbf{r})\, \psi_i^*(\mathbf{r}'), \qquad (2)$$
where $\psi_i(\mathbf{r})$ is the $i$th eigenfunction of the Kohn–Sham one-electron Hamiltonian of DFT, $\epsilon_i$ is its corresponding eigenvalue, and $f(\epsilon_i)$ is its Fermi–Dirac occupation factor. Such a representation is appropriate for recent schemes that use finite difference formulae, on a real space grid of points, to solve the Kohn–Sham equations. We will assume, however, that a basis set of some kind of localized orbitals $\phi_\mu(\mathbf{r})$ is used to expand the electron wavefunctions: $\psi_i(\mathbf{r}) = \sum_\mu c_{i\mu}\phi_\mu(\mathbf{r})$. In this case the density matrix takes the form $\rho(\mathbf{r},\mathbf{r}') = \sum_{\mu\nu} \rho_{\mu\nu}\, \phi_\mu(\mathbf{r})\, \phi_\nu^*(\mathbf{r}')$, where $\rho_{\mu\nu} = \sum_i f(\epsilon_i)\, c_{i\mu} c_{i\nu}^*$. The density matrix allows one to generate all the magnitudes required for a self-consistent DFT calculation. The electron density is simply its diagonal, $\rho(\mathbf{r}) = \rho(\mathbf{r},\mathbf{r})$, and it allows one to calculate the Hartree (electrostatic) and exchange-correlation potentials. The electronic kinetic energy is given by

$$E_{\rm kin} = -\frac{1}{2} \int \left[ \nabla_{\mathbf{r}'}^2\, \rho(\mathbf{r},\mathbf{r}') \right]_{\mathbf{r}'=\mathbf{r}} d^3r = \sum_{\mu\nu} \rho_{\mu\nu}\, T_{\nu\mu}, \qquad (3)$$
where, using atomic units ($e = m_e = \hbar = 1$),

$$T_{\nu\mu} = -\frac{1}{2} \int \phi_\nu^*(\mathbf{r})\, \nabla^2 \phi_\mu(\mathbf{r})\, d^3r. \qquad (4)$$

Notice from Eq. (2) that the electron eigenstates $\psi_i(\mathbf{r})$ are also eigenvectors of the density matrix, whose corresponding eigenvalues are the occupation factors $f(\epsilon_i)$. However, diagonalizing $\rho_{\mu\nu}$ is an $O(N^3)$ operation, no cheaper than diagonalizing the Hamiltonian, so that magnitudes that depend on the eigenvectors, like the band structure or the density of states, are not usually obtained in O(N) calculations (although there are special O(N) techniques to obtain some of these magnitudes partially [3]). The central role of $\rho(\mathbf{r},\mathbf{r}')$ in O(N) methods stems from the fact that it is sparse: when $\mathbf{r}$ and $\mathbf{r}'$ are far away, $\rho(\mathbf{r},\mathbf{r}')$ becomes negligibly small. To see this, it suffices to consider a uniform electron gas. In this case, the one-electron eigenfunctions become plane waves of the form $\psi_{\mathbf{k}}(\mathbf{r}) = \exp(i\mathbf{k}\cdot\mathbf{r})/\sqrt{\Omega}$, where $\mathbf{k}$ is a wave vector and $\Omega$ is the system volume. By substitution into Eq. (2), it is easy to see that $\rho(\mathbf{r},\mathbf{r}')$, which in this case depends only on $|\mathbf{r}'-\mathbf{r}|$, is simply the Fourier transform of the Fermi function in k-space: $f(\mathbf{k}) = 1$ if $|\mathbf{k}| \leq k_F$, and $f(\mathbf{k}) = 0$ otherwise, at zero temperature. Its Fourier transform $\rho(|\mathbf{r}'-\mathbf{r}|)$ decays as $\cos(k_F|\mathbf{r}'-\mathbf{r}|)/|\mathbf{r}'-\mathbf{r}|^2$. Furthermore, it turns out that the free electron gas at $T=0$ is the worst possible case: at finite temperature the decay is exponential, with a decay constant proportional to the temperature. For an insulator, the decay is also exponential, even at zero temperature, with a decay constant that increases with the energy gap [3]. Therefore, the number of non-negligible values of $\rho(\mathbf{r},\mathbf{r}')$ increases only linearly with the size of the system, with a prefactor that depends on its bonding character, and particularly on whether it is metallic or insulating. We will see that the computational effort (execution time and memory) is directly related to the number of those
non-negligible matrix elements. In practice, for metallic systems, the prefactor is so large that the crossover system size, at which O(N) methods become computationally competitive over traditional $O(N^3)$ methods, has not yet been reached. We will therefore assume that the systems we are considering are insulators, even though some (but not all) of the methods described could in principle be applied to metals as well. Chronologically, the first quantum mechanical O(N) method, the divide and conquer (DC) scheme of Weitao Yang et al., is also conceptually the simplest from a physical point of view (recursion and other methods based on Green's functions were developed in the 1970s that were also linear scaling; their linear-scaling character was not the driving force behind them, though, and they are not so well suited for self-consistent studies). It is based on dividing the whole system into smaller pieces, each surrounded by a buffer region, that are then treated (including the buffer) by conventional quantum mechanical methods, i.e., by diagonalizing the local Hamiltonian. Using a common value for the chemical potential (Fermi energy) allows for charge transfer among the different regions. From this treatment, the density (in the first proposal) or the density matrix (in a subsequent development) of the different pieces are combined to generate that of the entire system. The matrix elements between points (or orbitals) in different spatial pieces are obtained from those between the pieces themselves and their buffer regions (the elements between two buffer points are not used). Thus, the width of the buffer regions must account fully for the decay of $\rho(\mathbf{r},\mathbf{r}')$. Beyond this width, usually called the localization radius, the matrix elements are neglected. In practice, this implies rather large buffer regions, making the method more expensive than other, more recent, O(N) methods. The second O(N) method to be mentioned, the Fermi operator expansion (FOE), constructs the whole (though sparse) density matrix as an expansion in the Hamiltonian. To this end, one expands the Fermi–Dirac function (conveniently smoothed) as a polynomial within some energy range: $f(\epsilon) = \sum_{n=0}^{n_{\rm max}} a_n \epsilon^n$, for $\epsilon_{\rm min} < \epsilon < \epsilon_{\rm max}$. In practice, one uses $n_{\rm max}+1$ Chebyshev polynomials rather than powers of $\epsilon$ for stability reasons, but this is just a technical point [3]. Then one constructs the density matrix (by performing $n_{\rm max}$ multiplications of the Hamiltonian) as
$$\hat\rho = \sum_{n=0}^{n_{\rm max}} a_n H^n, \qquad (5)$$
where the coefficients $a_n$ are the same as before. To keep the O(N) scaling of the computation, one needs to restrict the spatial range, within the required localization radius, after each matrix multiplication. To understand the effect of this operator, consider its application to an eigenvector of the Hamiltonian. Provided that the eigenvalue is within the range of the expansion, the result
will be $\hat\rho\psi = \sum_{n=0}^{n_{\rm max}} a_n \epsilon^n \psi = f(\epsilon)\psi$. This is exactly the effect of the density matrix operator of Eq. (1). A closely related method is the Fermi operator projection (FOP), in which one starts from a trial set of electron wavefunctions, each constrained within a different localization region (usually around atoms), and applies the expansion (5) of the density matrix operator (without constructing it) to the trial functions, projecting them onto the occupied subspace. One still needs to make them orthogonal but, since they are spatially localized by construction, the process can be performed in O(N) operations. The resulting functions are a complete representation of the density matrix, of size $N_{\rm el} \times N_{\rm loc}$, with $N_{\rm el}$ the number of electrons and $N_{\rm loc}$ the number of basis orbitals within a localization region. In contrast, the normal representation of the density matrix, used in the FOE method, has $N_{\rm basis} \times N_{\rm loc}$ nonzero matrix elements, where $N_{\rm basis}$ is the number of basis orbitals, which is substantially larger than $N_{\rm el}$. Therefore, the FOP method is more efficient than the FOE. In the density matrix minimization (DMM) method of Li, Nunes, and Vanderbilt, the entire sparse density matrix is also obtained, by minimizing the total energy as a function of its matrix elements in a localized basis set of atomic orbitals [5], grid points, or some other kind of support functions [6]. Again, matrix elements separated by more than a pre-established localization radius are neglected. A complication is that, in performing the minimization, one must impose the constraint that the eigenvalues of the density matrix (i.e., the occupation weights) must be between zero and one, as required by the Fermi exclusion principle (for simplicity, we will consider combined spin–orbital indices $\mu$ and $i$, so that each basis orbital or electron state has a defined spin and contains a single electron). At zero temperature, the constrained energy minimization will make all the eigenvalues either zero (above the Fermi energy) or one, which amounts to making the matrix $\rho$ idempotent: $\rho^2 = \rho$ (since all the eigenvalues of $\rho^2$ will then be identical to those of $\rho$). To impose this constraint, one introduces an auxiliary matrix $\tilde\rho_{\mu\nu}$, with the same dimensions, and defines the density matrix using the McWeeny "purification" transformation $\rho = 3\tilde\rho^2 - 2\tilde\rho^3$. Thus, the eigenvalues of $\rho$ and $\tilde\rho$ are related by $f_i = 3\tilde f_i^2 - 2\tilde f_i^3$. It can be easily seen that, if $\tilde f_i$ is between $-1/2$ and $3/2$, then $f_i$ is within the required range $0 \leq f_i \leq 1$. And if $\tilde f_i$ is close to either 0 or 1, then $f_i$ is even closer to these values. This allows for an unconstrained minimization of the energy as a function of the auxiliary matrix: $E_{\rm tot}(\rho(\tilde\rho)) = \min$. A practical problem is that the spatial range of $\tilde\rho^3$ is three times larger than the localization radius of $\tilde\rho$. To improve efficiency, one may truncate $\rho$ further, although this degrades its exact idempotency, introducing extra errors. If the basis set is not orthonormal, $\tilde\rho^3$ becomes $(\tilde\rho S)^3$ and the problem worsens.
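The following runnable sketch illustrates the Fermi operator expansion, Eq. (5), using Chebyshev polynomials on a small dense matrix; in a true O(N) implementation each matrix product would additionally be truncated to the localization radius, which is omitted here, and all sizes, the temperature, and the chemical potential are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
N, mu, kT, nmax = 30, 0.0, 0.05, 200
H = rng.standard_normal((N, N))
H = 0.5 * (H + H.T) / np.sqrt(N)     # toy "Hamiltonian", spectrum ~[-1.5, 1.5]

a, b = 2.0, 0.0                      # map [-2, 2] (contains spec(H)) to [-1, 1]
Hs = (H - b * np.eye(N)) / a

def fermi(e):
    return 1.0 / (1.0 + np.exp((e - mu) / kT))

# Chebyshev coefficients of fermi(a*x + b) on [-1, 1] (Gauss-Chebyshev nodes)
K = nmax + 1
theta = np.pi * (np.arange(K) + 0.5) / K
fk = fermi(a * np.cos(theta) + b)
c = np.array([2.0 / K * np.sum(fk * np.cos(n * theta)) for n in range(K)])
c[0] *= 0.5

# rho = sum_n c_n T_n(Hs), built with the Chebyshev recurrence (cf. Eq. (5))
T0, T1 = np.eye(N), Hs
rho = c[0] * T0 + c[1] * T1
for n in range(2, K):
    T0, T1 = T1, 2.0 * Hs @ T1 - T0
    rho += c[n] * T1

# Check: rho shares H's eigenvectors, with eigenvalues f(eps) (cf. Eq. (1))
eps, U = np.linalg.eigh(H)
print(np.max(np.abs(np.diag(U.T @ rho @ U) - fermi(eps))))   # small
```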
Like the FOP method, the orbital minimization (OM) approach uses a set of $\sim N_{\rm el}$ localized wavefunctions, conventionally called Wannier functions. These wavefunctions are optimized, within their respective localization regions, by minimizing a modified total energy functional proposed by Kim, Mauri, and Galli, which has the form

$$E = {\rm Tr}[(H - \mu I)(2S - I)], \qquad (6)$$
where $H_{ij}$ and $S_{ij}$ are, respectively, the Hamiltonian and overlap matrix elements between the localized states $i$ and $j$, $I_{ij} \equiv \delta_{ij}$ is the identity matrix, and $\mu$ is the chemical potential (Fermi energy). Although not immediately obvious, it has been shown that this functional form has very convenient properties. Initially, the localized orbitals need not be orthonormal, but the functional penalizes them for not being so, in such a way that they become orthogonal as a result of the unconstrained minimization. Furthermore, although more localized orbitals are used than the number of electrons, the minimization retains only $N_{\rm el}$ of them with norm equal to one, while the rest become normless. A problem with this method is that it usually requires a very large number (frequently over 1000) of iterations in the first functional minimization (for the first Hamiltonian). This is a consequence of the minimization problem becoming ill-conditioned when the localization regions are imposed on the wavefunctions. Subsequent minimizations, during the self-consistency process and geometry relaxation, require many fewer iterations (typically of the order of ten), so that the initial minimization problem is not so important in most practical calculation projects. Another practical problem is to choose the chemical potential $\mu$, which must lie within the energy gap to ensure charge conservation. Furthermore, the self-consistency process and geometry relaxation may result in a shift of the gap, thus requiring cumbersome changes of $\mu$ along the way. There are also hybrid methods. Gillan et al. use the DMM method, optimizing a density matrix expanded in a rather small basis of localized orbitals. These orbitals are in turn optimized by expanding them in terms of a much richer basis of finite elements called "blips" [6]. Bernholc et al. use a similar approach, sometimes called the quasi-O(N) method [7], in which a conventional diagonalization, rather than DMM, is used to find the eigenvectors (and the density matrix) in terms of the small basis of localized orbitals, which are then optimized on a fine real space grid. Although the diagonalization step is $O(N^3)$, the small size of the localized orbital basis, and thus of the Hamiltonian, implies a small prefactor, allowing in practice for simulations of rather large systems, including metallic ones.
2. The SIESTA Method
The O(N ) methods, described in the previous section, were developed initially in the context of tight binding calculations, in which the Hamiltonian
matrix elements, between atomic orbitals of a minimal basis set, are given by empirical formulae for any atomic positions. This allows one to concentrate on the more fundamental problem of finding the electron states, given a Hamiltonian of minimum size, without caring about how to obtain such a Hamiltonian self-consistently. This latter problem, although more prosaic and technical, involves a large number of small sub-problems, such as finding good and efficient pseudopotentials and basis sets, calculating the electron density from the electron wavefunctions, the Hartree and exchange-correlation potentials from the density, the matrix elements of the kinetic and potential operators, the atomic forces, etc. Although none of these problems poses essential difficulties, solving all of them with an O(N) effort is a major enterprise that involves tens or hundreds of thousands of code lines. Therefore, there are not many well-developed codes able to perform practical O(N) DFT simulations. In this respect, we may cite, apart from SIESTA: the implementation of the DMM method in the GAUSSIAN code [5]; the CONQUEST code, which uses the hybrid approach mentioned in the last section [6]; and the recent ONETEP code [8], using finite-cutoff representations of Dirac delta functions as a basis set. Although not using strictly O(N) methodology, we will also mention the FIREBALL code of Lewis et al. [9], which was the precursor of SIESTA in many technical aspects, as well as that of Lippert et al. [10], which also employs a very similar approach. The first major decision of any DFT implementation concerns the choice of the basis set. Traditionally, most codes developed in the condensed matter community employ plane waves (PWs). They are conceptually simple and asymptotically complete. Most importantly, this completeness is very easy to approach in a systematic way, which greatly simplifies their practical use. Not depending on the atomic positions, plane waves are also spatially unbiased, which simplifies many developments and eliminates spurious effects like Pulay forces, even when the basis is far from converged. In addition, there are some very efficient techniques, particularly the fast Fourier transform (FFT), that greatly help and simplify the implementation of an efficient plane wave code. PWs also have disadvantages: being unbiased, they can equally represent any function, but they are not especially well suited to represent any one in particular. In comparison, the atomic orbitals traditionally used in quantum chemistry are very specially suited to represent the electron wavefunctions, and therefore they are much more efficient. Thus, one frequently needs tens or even hundreds of PWs per atom to achieve the same accuracy as a minimal basis of just four atomic orbitals. However, when comparing basis set efficiency, it is essential to consider the target accuracy of the calculations. LCAO bases are very efficient initially (i.e., for low accuracies). They can also achieve very high accuracies, but they are much harder to improve systematically than PWs. Therefore, in terms of both human and computational effort, LCAO basis sets become less and less convenient, compared to PWs, as the required accuracy increases.
In practice, most simulation projects involve a huge number of trial calculations, to check the importance and the convergence of many effects and parameters, to explore candidate geometries and compositions, etc. To perform this initial exploration efficiently, it is extremely useful to have a method (and a basis set in particular) that allows a uniform transition from very fast "quick and dirty" calculations to very accurate ones. LCAO bases allow precisely that. Apart from the pros and cons of PWs mentioned before, their main disadvantage for us is their intrinsic inadequacy for O(N) calculations. This is because each plane wave extends over the whole system, making PWs unsuitable for expanding localized wavefunctions. Partly for this reason, the last decade has seen a renaissance of real-space methods, in which the electron wavefunctions are represented directly on a grid of points [11]. Such a "basis" has many of the advantages of PWs, especially their systematic completeness, while it is also perfectly adequate to represent localized wavefunctions. It also allows for a variety of boundary conditions, apart from the periodic ones imposed by PWs. In practice, considerably more real-space points are required than the already numerous PWs to achieve a similar precision, which imposes important limitations, especially on computer memory. The other main alternative basis for implementing O(N) methods is LCAO. This is the traditional workhorse basis of quantum chemistry methods, in most of which the atomic orbitals are in turn expanded as a linear combination of Gaussian orbitals. This Gaussian expansion greatly facilitates the calculation of the three- and four-center integrals required in Hartree–Fock and configuration-interaction methods. However, it is not especially useful for calculating the matrix elements of the nonlinear exchange and correlation potential needed in DFT. In this case, it is better to use numerical orbitals, given by the product of a spherical harmonic times a radial function represented on a fine radial grid. Furthermore, in order to expand the localized electron states and density matrices used in O(N) methods, it is conceptually and practically useful that the basis functions be strictly localized, i.e., defined to be zero beyond a specified radius. Such orbitals were proposed by Sankey and Niklewski and implemented in the codes FIREBALL [9] and SIESTA [12, 13]. They are generated by solving, for each angular momentum, the radial Schrödinger equation for the corresponding nonlocal pseudopotential. At the atomic orbital eigenvalue, the wavefunction decays exponentially for r → ∞. Shifting the energy to a slightly higher value, the wavefunction has a node at some radius r_c, and may be considered as the solution under the constraint of a hard wall at r_c. Using a common "energy shift" for all atoms and angular momenta (which implies a different r_c for each one) provides a balanced basis, avoiding or mitigating spurious charge transfers. This scheme has the disadvantage of generating orbitals with a discontinuous derivative (kink) at r_c, which has been shown to have only a small effect on the energy of condensed systems.
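The energy-shift construction is easy to reproduce in a toy setting. The sketch below is only a schematic illustration, not the SIESTA implementation (SIESTA uses the atomic pseudopotential, not the bare Coulomb potential assumed here): it integrates the l = 0 radial equation outward at the shifted energy and reads off r_c as the first node.

import numpy as np

# Sketch: confinement radius r_c of a Sankey-Niklewski-type orbital.
# Integrate the radial equation u'' = 2 [V(r) - E] u (Hartree atomic
# units, l = 0) outward at E = E_atom + energy_shift; the first node of
# u(r) defines r_c. V(r) is a hydrogen-like stand-in potential.

def first_node_radius(E, V, r):
    h = r[1] - r[0]
    u = np.zeros_like(r)
    u[1] = r[1]                           # regular behavior u ~ r near the origin
    for i in range(1, len(r) - 1):        # simple 3-point integration
        u[i + 1] = 2*u[i] - u[i - 1] + h*h * 2.0*(V[i] - E)*u[i]
    sign_change = np.where(u[1:-1] * u[2:] < 0)[0]
    return r[sign_change[0] + 1] if len(sign_change) else np.inf

r = np.linspace(1e-6, 30.0, 6000)         # radial grid (bohr)
V = -1.0 / r                              # hydrogen potential (assumption)
E_atom = -0.5                             # hydrogen 1s eigenvalue (hartree)
for shift in (0.005, 0.01, 0.02):         # "energy shift" values (hartree)
    print(shift, first_node_radius(E_atom + shift, V, r))
# a smaller energy shift gives a larger confinement radius r_c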
To generate a richer basis set, SIESTA splits these numerical atomic orbitals (NAOs) into the sum of a smooth part with even shorter range plus a remainder, treating both parts as variationally independent basis orbitals and producing in this way a radial flexibilization of the basis set. This splitting, inspired by the "split-valence" procedure used with Gaussian-expanded orbitals in quantum chemistry, can be repeated to generate multiple-ζ bases for each valence orbital. In order to also introduce angular flexibilization, polarization orbitals with higher angular momentum can be included. To generate them, SIESTA finds the perturbation created in the valence orbitals by an applied electric field. These polarization orbitals can also be "split," using the previously described method, to create arbitrarily rich basis sets. It is well known that the optimal atomic basis orbitals are environment dependent. The simplest example is the hydrogen molecule, in which the optimal exponential atomic orbitals decay as e^{−r} (in atomic units) for large interatomic separations (isolated atoms) and as e^{−2r} for zero separation (helium atom). To account for this effect, the basis orbitals can be optimized variationally (i.e., by minimizing the total energy) within an environment similar to (but simpler than) that in which they will be used. The transferability improves as the number of atomic orbitals in the basis set increases. To eliminate the kink present at r_c in the orbitals of Sankey and Niklewski, it is convenient to use as variational parameters those defining a soft confinement potential that diverges at r_c. As with the "energy shift" of the hard-potential orbitals, it is important to use a common "pressure" parameter, for all the atoms and angular momenta, that controls the range of the orbitals during the optimization process [14]. To handle the core electrons efficiently, SIESTA uses the norm-conserving pseudopotentials of Troullier and Martins, in the fully nonlocal form of Kleinman and Bylander:

$$\hat V^{\mathrm{PS}} = \int d^3r \, |\mathbf{r}\rangle V_{\mathrm{local}}(r) \langle\mathbf{r}| + \sum_{l,m}^{l_{\max}} |\chi_{lm}\rangle V_l \langle\chi_{lm}|, \tag{7}$$
where V_local(r) decays as −Z_val/r when r → ∞. Since these pseudopotentials have become standard in condensed-matter electronic structure codes, and they have been covered in other chapters of this handbook, we will only mention that, in SIESTA, V_local(r) is optimized for smoothness, rather than taken as the semilocal pseudopotential of a given angular momentum. The Hamiltonian and overlap matrix elements contain several terms. The simplest ones to calculate in O(N) operations are those involving two-center integrals between overlapping orbitals, because each orbital overlaps only with a small number of other orbitals, independent of the system size. These matrix elements are the overlap elements themselves, S_µν = ⟨φ_µ|φ_ν⟩, the integrals ⟨χ_lm|φ_µ⟩ involved in the second term of Eq. (7), and the kinetic matrix elements T_µν = ⟨φ_µ|−½∇²|φ_ν⟩. All of these are calculated in Fourier
space, using convolution techniques, and stored as a product of spherical harmonics times numerical radial functions, interpolated in a fine radial grid [13]. To compute the matrix elements of the local potentials, we first find the electron density ρ(r), on a regular three-dimensional grid of points r, from the density matrix:

$$\rho(\mathbf{r}) = \sum_{\mu\nu} \rho_{\mu\nu}\, \phi_\mu(\mathbf{r})\, \phi_\nu(\mathbf{r}). \tag{8}$$
Notice that, for a given point r, only a few orbitals are nonzero and contribute to the sum, so that the evaluation of ρ(r) is an O(N) operation, given that the number of grid points scales linearly with the volume, which in turn is proportional to N. From ρ(r) we calculate the Hartree potential V_H(r) (the electrostatic potential created by ρ(r)) using FFTs. This step scales as N log(N) and is therefore not strictly O(N). In practice it represents only a very minor part of the whole calculation, even for the largest systems considered up to now. Should this step become dominant, we may switch to other methods, like fast multipole or multigrid algorithms, that are strictly O(N). The exchange and correlation potential V_xc(r) is computed in the local density (LDA) or generalized gradient (GGA) approximations, the latter using finite-difference derivatives. We then find the total effective potential V_eff(r) by adding the local pseudopotentials of all the atoms to V_H(r) + V_xc(r). Since both V_local and V_H have long-range parts with opposite signs, we subtract from each of them the electrostatic potential created by a reference density, the sum of the electron densities of the free atoms. We then find the matrix elements ⟨φ_µ|V_eff|φ_ν⟩ by direct integration on the grid points. Like the evaluation of ρ(r), this step has O(N) scaling, because the number of nonzero orbitals at each grid point is independent of the system size. The evaluation of the total energy, atomic forces, and stress tensor proceeds simultaneously with that of the Hamiltonian matrix elements, using the last density matrix available during the self-consistency process. For example, the kinetic and Hartree energies are given by E_kin = Σ_µν ρ_µν T_νµ and E_H = ½ Σ_µν ρ_µν ⟨φ_ν|V_H|φ_µ⟩, respectively. The factor 1/2 prevents double counting of the electron–electron interactions. For the forces and stress we directly use the analytic derivatives of each term of the total energy. For each term, energy, forces, and stresses are computed simultaneously, in the same places of the code. This ensures exact compatibility between the computed total energy and its derivatives, including all corrections like Pulay forces. Once the Hamiltonian and overlap matrices have been calculated, a new density matrix is obtained either by: (i) solving the generalized eigenvalue problem by conventional O(N³) linear-algebra methods, or (ii) using the O(N) orbital minimization method of Kim, Mauri, and Galli, described in the previous section. The first must be used for systems that are metallic or suffer bond breakings that create partially occupied states during the
simulation. Apart from those, systems below a threshold size actually run faster with the conventional O(N³) methods. This threshold depends on the bonding nature of the system, on the size of the basis set used, on the spatial range of the basis orbitals, and on other calculation parameters, but it is typically around ∼100 atoms. Even for sizes above this threshold, it may be more efficient, especially in terms of human investment, to use plain diagonalization. This is because the O(N) method is intrinsically more limited (especially for bond breaking) and more difficult to use, with more parameters to adjust: the localization radius of the Wannier orbitals and, especially, the chemical potential. As a rule of thumb, the O(N) method is practical for long geometry relaxations or molecular dynamics of systems with more than ∼300 atoms, or for short calculations with more than ∼500 atoms. With conventional diagonalization, an important efficiency consideration is whether the computational effort is dominated by the diagonalization itself or by the construction of the Hamiltonian. In the first case, which occurs above ∼100 atoms, the only relevant efficiency parameter is the basis set size, while other parameters, like the spatial range of the basis orbitals or the fineness of the integration grid, can be increased at negligible cost to improve the accuracy. In fact, it may be advantageous to increase the grid fineness even for efficiency reasons, since this decreases the so-called "eggbox effect": a spurious rippling of the potential, due to the dependence of the total energy on the atomic positions relative to the integration grid. Though slight in the energy, the effect is larger on the atomic forces, and may considerably increase the number of iterations required to relax the geometry. We will finish this section by briefly mentioning some capabilities of SIESTA to perform a variety of calculations:
• For very fast "quick and dirty" calculations, it is possible to use the non-self-consistent Harris–Foulkes functional, in which the only Hamiltonian calculated derives from a superposition of free-atom densities. For diagonalization-dominated systems, with more than ∼100 atoms, and used in combination with a minimal basis set, this is essentially as fast as a tight binding calculation.
• SIESTA contains algorithms for a large variety of geometry relaxations and dynamics, including the simultaneous relaxation of the lattice vectors and atomic positions, Parrinello–Rahman molecular dynamics, dynamics at constant pressure and/or temperature, etcetera.
• The SIESTA program itself does not consider symmetries because it is designed for large and/or dynamical systems, which generally have low or no symmetry. However, an accompanying package contains several tools to facilitate the evaluation of phonon modes and spectra, which prepare data files with the required geometries (considering the system symmetry) and process the resulting forces to calculate the phonons.
• SIESTA is able to apply an external electric field to systems like molecules, clusters, chains and slabs, as well as to calculate the spontaneous polarization of a solid, using the Berry phase formalism of King-Smith and Vanderbilt [4].
• It is also possible to simulate magnetic systems, using spin-dependent DFT, including the ability to impose the total magnetic moment, to start with antiferromagnetic configurations, and to allow noncollinear spin solutions.
• A forthcoming version will also include time-dependent DFT, using the method of Yabana and Bertsch [13].
3. DNA: A Prototype Application
SIESTA has been applied to hundreds of different systems, including solid metals, semiconductors and insulators, liquids, molecules, surfaces, nanotubes, and biological systems [15]. Of all these, for the reasons explained in the previous section, only a minority have been studied using the O(N) methodology to solve Schrödinger's equation (although the Hamiltonian is always generated in O(N) operations). A good representative of this minority is the study of the electronic structure of DNA by Artacho et al. [16]. Apart from its obvious biological interest, DNA has recently generated much interest as a candidate for the controlled self-assembly of molecular electronic devices. In this respect, its ability to conduct electricity is of major interest, but very contradictory experimental results have been obtained on this point. Furthermore, in such devices, DNA is normally found in a dry environment, very different from its conditions in vivo, which might strongly affect its structure. Thus, the goal of the calculations was to study the structural stability and the electrical conductivity of dry DNA. A preliminary calculation used the B conformation, but later studies used the A conformation, which is known experimentally to be more stable under dry conditions. The poly(C)–poly(G) sequence (only guanines in one of the strands and only cytosines in the other) was chosen because guanine has the smallest ionization energy (and therefore the highest appetite for electron holes, which are suspected to be the relevant carriers) and because a uniform sequence is optimal for band conductivity. The CG base pair contains 65 atoms, including those in the sugar-phosphate side chains. Since the A conformation has a helix pitch of 11 base pairs, the total number of atoms per unit cell was 715. In solution, DNA is negatively ionized, losing a proton in each phosphate group (two per base pair). This negative charge is neutralized by positive ions in solution around the DNA chain. In dried DNA, like that deposited on surfaces, it is uncertain how the charge will be distributed, but a reasonable approximation was to restore the phosphate protons (acidic form). It must be kept in mind, however, that in reality some of
these protons (or whatever countercations) may be missing, in which case the charge must be compensated by electron holes, as in a doped semiconductor. The calculations were done with a double-ζ basis set, with additional polarization orbitals on the hydrogen atoms involved in hydrogen bonds and on the phosphorus atoms, for a total basis set size of 4510 orbitals. To find the chemical potential, an initial self-consistent calculation was performed using standard diagonalizations. Then, the geometry relaxation proceeded for ∼800 steps using the O(N) method of Kim, Mauri, and Galli, with a localization radius of 4 Å for the Wannier orbitals. A final calculation, using standard diagonalization, was performed for the relaxed coordinates, to find the electron eigenfunctions and to compare the total energy and forces. The total energy with the extended eigenfunctions was only 5 meV/atom lower than with the localized Wannier functions, and the average residual force was 6 meV/Å with diagonalization, versus 2 meV/Å with linear scaling. While a geometry relaxation step takes only about one hour with the O(N) method, it takes 20 h using standard diagonalization, on a single 1 GHz Intel Pentium III processor. Despite the large number of relaxation steps, the relaxed geometry was rather close to the initial one, taken from X-ray diffraction experiments. Its structural parameters are typical of the A conformation, showing that this structure is indeed stable (at least metastable) for dry DNA. The electronic structure shows clear bands, as expected for a periodic system. The highest valence band is formed by the guanine HOMO states and has a width of only 40 meV. The lowest conduction band is formed by the cytosine LUMO states, with a width of 270 meV. Between them, there is a wide band gap of 2.0 eV, showing that nondoped poly(C)–poly(G) must be an insulator. Even for DNA doped with holes, the extremely narrow HOMO band suggests that the holes will become localized by any lattice disorder, according to Anderson's model. To check this, we performed two calculations for "perturbed" systems. The first system had one of the base pairs inverted (GC instead of CG) as the simplest realization of sequence disorder, after which the geometry was relaxed again. As a result, the band structure of the system changed dramatically, and the extended Bloch states changed to states localized over two to three base pairs in particular sections of the 11-base-pair periodic cell. The second "perturbed" system was one of the intermediate geometries during the relaxation process, with "random" changes in the atomic coordinates relative to the final relaxed positions. These coordinate changes led to a total energy difference compatible with thermal fluctuations at 300 K. Though not as dramatic as those of the base-pair inversion, the changes in the electronic band structure were also substantial, and the electron states became localized as well, indicating in this case a strong electron–phonon interaction. These results ruled out band-like conduction of holes in doped DNA, suggesting also that holes would become localized by polaronic effects (structure deformations around
the hole). Such a suggestion was confirmed by later calculations of the hole polaron in poly(C)–poly(G) [17].
4. Outlook
Besides the differences in scaling with system size, a large part of the advantage of classical potentials for large systems stems from the ease of parallelizing the algorithms involved in their use. In the case of quantum simulations, there are codes, like CONQUEST, which have been designed from the beginning to run on massively parallel computers, and which have demonstrated their ability to run simulations with over ten thousand atoms on them. This was not the case for SIESTA, which was designed to run on modest workstations and PCs, and was only parallelized later. The initial parallel versions were not very efficient, although demonstration runs with over one hundred thousand atoms were performed. Recent versions have improved the parallel scaling considerably and now aim at one-million-atom demonstration runs. Much progress has also been made in a variety of acceleration techniques, from hybrid quantum mechanics–molecular mechanics to accelerated molecular dynamics. All this combined may lead very soon to unprecedented simulations of materials properties and devices with quantum mechanical methods. The major obstacle on this path, however, will be to find practical O(N) methods for metals and systems with broken bonds. This is a subject of very active research in which much progress is expected in the coming years.
References
[1] W. Kohn, "Density functional and density matrix method scaling linearly with the number of atoms," Phys. Rev. Lett., 76, 3168–3171, 1996.
[2] P. Ordejón, "Order-N tight-binding methods for electronic-structure and molecular dynamics," Comp. Mat. Sci., 12, 157–191, 1998.
[3] S. Goedecker, "Linear scaling electronic structure methods," Rev. Mod. Phys., 71, 1085–1123, 1999.
[4] R.M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, Cambridge, 2004.
[5] G.E. Scuseria, "Linear scaling density functional calculations with Gaussian orbitals," J. Phys. Chem. A, 103, 4782–4790, 1999.
[6] D.R. Bowler, T. Miyazaki, and M.J. Gillan, "Recent progress in linear scaling ab initio electronic structure techniques," J. Phys. Condens. Matter, 14, 2781–2798, 2002.
[7] J.L. Fattebert and J. Bernholc, "Towards grid-based O(N) density-functional theory methods: optimized nonorthogonal orbitals and multigrid acceleration," Phys. Rev. B, 62, 1713–1722, 2000.
[8] A.A. Mostofi, C.-K. Skylaris, P.D. Haynes, and M.C. Payne, "Total-energy calculations on a real space grid with localized functions and a plane-wave basis," Comput. Phys. Commun., 147, 788–802, 2002.
[9] J.P. Lewis, K.R. Glaesemann, G.A. Voth, J. Fritsch, A.A. Demkov, J. Ortega, and O.F. Sankey, "Further developments in the local-orbital density-functional-theory tight-binding method," Phys. Rev. B, 64, 195103.1–10, 2001.
[10] G. Lippert, J. Hutter, P. Ballone, and M. Parrinello, "A hybrid Gaussian and plane wave density functional scheme," Mol. Phys., 92, 477–487, 1997.
[11] T.L. Beck, "Real-space mesh techniques in density-functional theory," Rev. Mod. Phys., 72, 1041–1080, 2000.
[12] P. Ordejón, E. Artacho, and J.M. Soler, "Selfconsistent order-N density-functional calculations for very large systems," Phys. Rev. B, 53, R10441–R10444, 1996.
[13] J.M. Soler, E. Artacho, J.D. Gale, A. García, J. Junquera, P. Ordejón, and D. Sánchez-Portal, "The SIESTA method for ab initio order-N materials simulation," J. Phys. Condens. Matter, 14, 2745–2779, 2002.
[14] E. Anglada, J.M. Soler, J. Junquera, and E. Artacho, "Systematic generation of finite-range atomic basis sets for linear-scaling calculations," Phys. Rev. B, 66, 205101.1–4, 2002.
[15] D. Sánchez-Portal, P. Ordejón, and E. Canadell, "Computing the properties of materials from first principles with SIESTA," Struct. Bonding, 113, 103–170, 2004. See also http://www.uam.es/siesta.
[16] E. Artacho, M. Machado, D. Sánchez-Portal, P. Ordejón, and J.M. Soler, "Electrons in dry DNA from density functional calculations," Mol. Phys., 101, 1587–1594, 2003.
[17] S.S. Alexandre, E. Artacho, J.M. Soler, and H. Chacham, "Small polarons in dry DNA," Phys. Rev. Lett., 91, 108105, 2003.
1.6 ELECTRONIC STRUCTURE METHODS: AUGMENTED WAVES, PSEUDOPOTENTIALS AND THE PROJECTOR AUGMENTED WAVE METHOD
Peter E. Blöchl, Johannes Kästner, and Clemens J. Först
Institute for Theoretical Physics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany
The main goal of electronic structure methods is to solve the Schrödinger equation for the electrons in a molecule or solid, and to evaluate the resulting total energies, forces, response functions, and other quantities of interest. In this paper we describe the basic ideas behind the main electronic structure methods, such as the pseudopotential and augmented wave methods, and provide selected pointers to contributions that are relevant for a beginner. We give particular emphasis to the projector augmented wave (PAW) method developed by one of us, an electronic structure method for ab initio molecular dynamics with full wavefunctions. We feel that it best shows the common conceptual basis of the most widespread electronic structure methods in materials science. The methods described below require as input only the charge and mass of the nuclei, the number of electrons, and an initial atomic geometry. They predict binding energies accurate to within a few tenths of an electron volt and bond lengths in the 1–2% range. Currently, systems with a few hundred atoms per unit cell can be handled. The dynamics of atoms can be studied up to tens of picoseconds. Quantities related to energetics, the atomic structure, and the ground-state electronic structure can be extracted. In order to lay a common ground and to define some of the symbols, let us briefly touch upon density functional theory [1, 2]. It maps a description of interacting electrons, a nearly intractable problem, onto one of non-interacting electrons in an effective potential. Within density functional theory, the total
energy is written as

$$E[\Psi_n(\mathbf{r}), \mathbf{R}_R] = \sum_n f_n \left\langle \Psi_n \middle| \frac{-\hbar^2}{2m_e}\nabla^2 \middle| \Psi_n \right\rangle + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0} \int d^3r \int d^3r' \, \frac{[n(\mathbf{r}) + Z(\mathbf{r})]\,[n(\mathbf{r}') + Z(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|} + E_{xc}[n(\mathbf{r})] \tag{1}$$
Here, |Ψ_n⟩ are one-particle electron states, f_n are the state occupations, n(r) = Σ_n f_n Ψ_n*(r) Ψ_n(r) is the electron density, and Z(r) = −Σ_R Z_R δ(r − R_R) is the nuclear charge density expressed in electron charges. Z_R is the atomic number of a nucleus at position R_R. It is implicitly assumed that the infinite self-interaction of the nuclei is removed. The exchange and correlation functional contains all the difficulties of the many-electron problem. The main conclusion of density functional theory is that E_xc is a functional of the density. We use Dirac's bra and ket notation. A wavefunction Ψ_n corresponds to a ket |Ψ_n⟩, the complex conjugate wavefunction Ψ_n* corresponds to a bra ⟨Ψ_n|, and a scalar product ∫d³r Ψ_n*(r)Ψ_m(r) is written as ⟨Ψ_n|Ψ_m⟩. Vectors in the three-dimensional coordinate space are indicated by boldfaced symbols. Note that we use R as position vector and R as atom index. In current implementations, the exchange and correlation functional E_xc[n(r)] has the form
$$E_{xc}[n(\mathbf{r})] = \int d^3r \, F_{xc}\big(n(\mathbf{r}), |\nabla n(\mathbf{r})|\big),$$
where F_xc is a parameterized function of the density and its gradients. Such functionals are called gradient corrected. In local spin density functional theory, F_xc furthermore depends on the spin density and its derivatives. A review of the earlier developments has been given by Parr and Yang [3]. The electronic ground state is determined by minimizing the total energy functional E[Ψ_n] of Eq. (1) at a fixed ionic geometry. The one-particle wavefunctions have to be orthogonal. This constraint is implemented with the method of Lagrange multipliers. We obtain the ground state wavefunctions from the extremum condition for

$$F[\Psi_n(\mathbf{r}), \Lambda_{m,n}] = E[\Psi_n(\mathbf{r})] - \sum_{n,m} \big[\langle\Psi_n|\Psi_m\rangle - \delta_{n,m}\big]\Lambda_{m,n} \tag{2}$$
with respect to the wavefunctions and the Lagrange multipliers Λ_{m,n}. The extremum condition for the wavefunctions has the form

$$H|\Psi_n\rangle f_n = \sum_m |\Psi_m\rangle \Lambda_{m,n} \tag{3}$$
where H = −(ℏ²/2m_e)∇² + v_eff(r) is the effective one-particle Hamilton operator. The effective potential itself depends on the electron density via

$$v_{\mathrm{eff}}(\mathbf{r}) = \frac{e^2}{4\pi\epsilon_0} \int d^3r' \, \frac{n(\mathbf{r}') + Z(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} + \mu_{xc}(\mathbf{r}),$$

where µ_xc(r) = δE_xc[n(r)]/δn(r) is the functional derivative of the exchange and correlation functional. After a unitary transformation that diagonalizes the matrix of Lagrange multipliers Λ_{m,n}, we obtain the Kohn–Sham equations:

$$H|\Psi_n\rangle = |\Psi_n\rangle \epsilon_n. \tag{4}$$
The one-particle energies ε_n are the eigenvalues of Λ_{n,m}(f_n + f_m)/(2 f_n f_m) [4]. The remaining one-electron Schrödinger equations, namely the Kohn–Sham equations given above, still pose substantial numerical difficulties: (1) In the atomic region near the nucleus, the kinetic energy of the electrons is large, resulting in rapid oscillations of the wavefunction that require fine grids for an accurate numerical representation. On the other hand, the large kinetic energy makes the Schrödinger equation stiff, so that a change of the chemical environment has little effect on the shape of the wavefunction. Therefore, the wavefunction in the atomic region can already be represented well by a small basis set. (2) In the bonding region between the atoms the situation is the opposite. The kinetic energy is small and the wavefunction is smooth. However, the wavefunction is flexible and responds strongly to the environment. This requires large and nearly complete basis sets. Combining these different requirements is nontrivial and various strategies have been developed.
• The atomic point of view has been most appealing to quantum chemists. Basis functions that resemble atomic orbitals are chosen. They exploit the fact that the wavefunction in the atomic region can be described by a few basis functions, while the chemical bond is described by the overlapping tails of these atomic orbitals. Most techniques in this class are a compromise between, on the one hand, a well-adapted basis set, where the basis functions are difficult to handle, and, on the other hand, numerically convenient basis functions such as Gaussians, where the inadequacies are compensated by larger basis sets.
• Pseudopotentials regard an atom as a perturbation of the free electron gas. The most natural basis functions are plane waves. Plane-wave basis sets are, in principle, complete and suitable for sufficiently smooth wavefunctions. The disadvantage of the comparably large basis sets required is offset by their extreme numerical simplicity. Finite plane-wave expansions are, however, absolutely inadequate to describe the strong
oscillations of the wavefunctions near the nucleus. In the pseudopotential approach, the Pauli repulsion of the core electrons is therefore described by an effective potential that expels the valence electrons from the core region. The resulting wavefunctions are smooth and can be represented well by plane waves. The price to pay is that all information on the charge density and wavefunctions near the nucleus is lost.
• Augmented wave methods compose their basis functions from atom-like wavefunctions in the atomic regions and a set of functions, called envelope functions, appropriate for the bonding in between. Space is divided accordingly into atom-centered spheres, defining the atomic regions, and an interstitial region in between. The partial solutions of the different regions are matched at the interface between atomic and interstitial regions.
The PAW method is an extension of augmented wave methods and the pseudopotential approach, which combines their traditions into a unified electronic structure method. After describing the underlying ideas of the various approaches, let us briefly review the history of augmented wave methods and the pseudopotential approach. We do not discuss the atomic-orbital-based methods, because our focus is the PAW method and its ancestors.
1. Augmented Wave Methods
Augmented wave methods were introduced in 1937 by Slater [5] and were later modified by Korringa [6] and by Kohn and Rostoker [7]. They approached the electronic structure as a scattered-electron problem. Consider an electron beam, represented by a plane wave, traveling through a solid. It undergoes multiple scattering at the atoms. If, for some energy, the outgoing scattered waves interfere destructively, a bound state has been determined. This approach can be translated into a basis set method with energy- and potential-dependent basis functions. In order to make the scattered wave problem tractable, a model potential had to be chosen: the so-called muffin-tin potential approximates the true potential by a constant in the interstitial region and by a spherically symmetric potential in the atomic region. Augmented wave methods reached adulthood in the 1970s: Andersen [8] showed that the energy-dependent basis set of Slater's APW method can be mapped onto one with energy-independent basis functions, by linearizing the partial waves for the atomic regions in energy. In the original APW approach, one had to determine the zeros of the determinant of an energy-dependent matrix, a nearly intractable numerical problem for complex systems. With the new energy-independent basis functions, however, the problem is reduced to
the much simpler generalized eigenvalue problem, which can be solved using efficient numerical techniques. Furthermore, the introduction of well-defined basis sets paved the way for full-potential calculations [9]. In that case the muffin-tin approximation is used solely to define the basis set |χ_i⟩, while the matrix elements ⟨χ_i|H|χ_j⟩ of the Hamiltonian are evaluated with the full potential. In the augmented wave methods one constructs the basis set for the atomic region by solving the Schrödinger equation for the sphericalized effective potential

$$\left[\frac{-\hbar^2}{2m_e}\nabla^2 + v_{\mathrm{eff}}(\mathbf{r}) - \epsilon\right] \phi_{\ell,m}(\epsilon, \mathbf{r}) = 0$$

as a function of energy. Note that a partial wave φ_{ℓ,m}(ε, r) is an angular momentum eigenstate and can be expressed as a product of a radial function and a spherical harmonic. The energy-dependent partial wave is expanded in a Taylor expansion about some reference energy ε_{ν,ℓ},

$$\phi_{\ell,m}(\epsilon, \mathbf{r}) = \phi_{\nu,\ell,m}(\mathbf{r}) + (\epsilon - \epsilon_{\nu,\ell})\,\dot\phi_{\nu,\ell,m}(\mathbf{r}) + O\big((\epsilon - \epsilon_{\nu,\ell})^2\big),$$

where φ_{ν,ℓ,m}(r) = φ_{ℓ,m}(ε_{ν,ℓ}, r). The energy derivative of the partial wave, φ̇_ν(r) = ∂φ(ε, r)/∂ε|_{ε_{ν,ℓ}}, solves the equation

$$\left[\frac{-\hbar^2}{2m_e}\nabla^2 + v_{\mathrm{eff}}(\mathbf{r}) - \epsilon_{\nu,\ell}\right] \dot\phi_{\nu,\ell,m}(\mathbf{r}) = \phi_{\nu,\ell,m}(\mathbf{r}).$$
Next, one starts from a regular basis set, such as plane waves, Gaussians or Hankel functions. These basis functions are called envelope functions |χ̃_i⟩. Within the atomic region they are replaced by the partial waves and their energy derivatives, such that the resulting wavefunction is continuous and differentiable:

$$\chi_i(\mathbf{r}) = \tilde\chi_i(\mathbf{r}) - \sum_R \theta_R(\mathbf{r})\,\tilde\chi_i(\mathbf{r}) + \sum_{R,\ell,m} \theta_R(\mathbf{r}) \big[\phi_{\nu,R,\ell,m}(\mathbf{r})\, a_{R,\ell,m,i} + \dot\phi_{\nu,R,\ell,m}(\mathbf{r})\, b_{R,\ell,m,i}\big]. \tag{5}$$
θ_R(r) is a step function that is unity within the augmentation sphere centered at R_R and zero elsewhere. The augmentation sphere is atom-centered and has a radius about equal to the covalent radius. This radius is called the muffin-tin radius if the spheres of neighboring atoms touch. These basis functions describe only the valence states; the core states are localized within the augmentation sphere and are obtained directly by radial integration of the Schrödinger equation within the augmentation sphere.
The coefficients a_{R,ℓ,m,i} and b_{R,ℓ,m,i} are obtained for each |χ̃_i⟩ as follows: the envelope function is decomposed around each atomic site into spherical harmonics multiplied by radial functions,

$$\tilde\chi_i(\mathbf{r}) = \sum_{\ell,m} u_{R,\ell,m,i}(|\mathbf{r}-\mathbf{R}_R|)\, Y_{\ell,m}(\mathbf{r}-\mathbf{R}_R). \tag{6}$$
Analytical expansions of this kind exist for plane waves, Hankel functions and Gaussians. The radial parts of the partial waves φ_{ν,R,ℓ,m} and φ̇_{ν,R,ℓ,m} are matched with value and derivative to u_{R,ℓ,m,i}(|r|), which yields the expansion coefficients a_{R,ℓ,m,i} and b_{R,ℓ,m,i} (a worked form of this matching condition is sketched at the end of this section). If the envelope functions are plane waves, the resulting method is called the linear augmented plane wave (LAPW) method. If the envelope functions are Hankel functions, the method is called the linear muffin-tin orbital (LMTO) method. A good review of the LAPW method [8] has been given by Singh [10]. Let us now briefly mention the major developments of the LAPW method: Soler and Williams [11] introduced the idea of additive augmentation: while augmented plane waves are discontinuous at the surface of the augmentation sphere if the expansion in spherical harmonics in Eq. (5) is truncated, Soler replaced the second term in Eq. (5) by an expansion of the plane wave with the same angular momentum truncation as in the third term. This dramatically improved the convergence of the angular momentum expansion. Singh [12] introduced so-called local orbitals, which are nonzero only within a muffin-tin sphere, where they are superpositions of φ and φ̇ functions from different expansion energies. Local orbitals substantially increase the energy transferability. Sjöstedt et al. [13] relaxed the condition that the basis functions be differentiable at the sphere radius. In addition they introduced local orbitals, which are confined inside the sphere, and which also have a kink at the sphere boundary. Due to the large energy cost of kinks, they will cancel once the total energy is minimized. The increased variational degree of freedom in the basis leads to a dramatically improved plane-wave convergence [14]. The second variant of the linear methods is the LMTO method [8]. A good introduction to the LMTO method is the book by Skriver [15]. The LMTO method uses Hankel functions as envelope functions. The atomic spheres approximation (ASA) provides a particularly simple and efficient approach to the electronic structure of very large systems. In the ASA, the augmentation spheres are blown up so that their total volume equals the cell volume, and the first two terms in Eq. (5) are ignored. The main deficiency of the LMTO-ASA method is the limitation to structures that can be converted into a close-packed arrangement of atomic and empty spheres. Furthermore, energy differences due to structural distortions are often qualitatively incorrect. Full-potential versions of the LMTO method, which avoid these deficiencies of the ASA, have been developed. The construction of tight
binding orbitals as superpositions of muffin-tin orbitals [16] showed the underlying principles of the empirical tight-binding method and prepared the ground for electronic structure methods that scale linearly instead of with the third power of the number of atoms. The third-generation LMTO [17] allows one to construct true minimal basis sets, which require only one orbital per electron pair for insulators. In addition, they can be made arbitrarily accurate in the valence band region, so that a matrix diagonalization becomes unnecessary. The first steps towards a full-potential implementation, which promises good accuracy while maintaining the simplicity of the LMTO-ASA method, are currently under way. Through the minimal basis-set construction, the LMTO method offers unrivaled tools for the analysis of the electronic structure and has been extensively used in hybrid methods combining density functional theory with model Hamiltonians for materials with strong electron correlations [18].
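For concreteness, the value-and-derivative matching that determines the coefficients in Eq. (5) can be written, per channel (R, ℓ, m) and with r_s denoting the augmentation sphere radius, as the 2×2 linear system below. This is a sketch in the notation introduced above, not a quotation from any particular code:

$$\begin{pmatrix} \phi_{\nu,R,\ell,m}(r_s) & \dot\phi_{\nu,R,\ell,m}(r_s)\\ \partial_r \phi_{\nu,R,\ell,m}(r_s) & \partial_r \dot\phi_{\nu,R,\ell,m}(r_s) \end{pmatrix} \begin{pmatrix} a_{R,\ell,m,i}\\ b_{R,\ell,m,i} \end{pmatrix} = \begin{pmatrix} u_{R,\ell,m,i}(r_s)\\ \partial_r u_{R,\ell,m,i}(r_s) \end{pmatrix}$$

Solving this system for each envelope function |χ̃_i⟩ makes χ_i(r) continuous and differentiable across the sphere boundary, as required.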
2. Pseudopotentials
Pseudopotentials have been introduced (1) to avoid describing the core electrons explicitly and (2) to avoid the rapid oscillations of the wavefunction near the nucleus, which normally require either complicated or large basis sets. The pseudopotential approach traces back to 1940, when Herring [19] invented the orthogonalized plane-wave method. Later, Phillips and Kleinman [20] and Antoncik [21] replaced the orthogonality condition by an effective potential, which mimics the Pauli repulsion by the core electrons and thus compensates the electrostatic attraction by the nucleus. In practice, the potential was modified, for example, by cutting off the singular potential of the nucleus at a certain value. This was done with a few parameters that were adjusted to reproduce the measured electronic band structure of the corresponding solid. Hamann et al. [22] showed in 1979 how pseudopotentials can be constructed in such a way that their scattering properties are identical to those of an atom to first order in energy. These first-principles pseudopotentials relieved the calculations from the restrictions of empirical parameters. Highly accurate calculations have become possible, especially for semiconductors and simple metals. An alternative approach towards first-principles pseudopotentials [23] preceded the one mentioned above.
2.1. The Idea Behind Pseudopotential Construction
In order to construct a first-principles pseudopotential, one starts out with an all-electron density-functional calculation for a spherical atom. Such
calculations can be performed efficiently on radial grids. They yield the atomic potential and wavefunctions φ_{ℓ,m}(r). Due to the spherical symmetry, the radial parts of the wavefunctions for different magnetic quantum numbers m are identical. For the valence wavefunctions one constructs pseudo-wavefunctions |φ̃_{ℓ,m}⟩. There are numerous ways [24–27] to construct the pseudo-wavefunctions. They must be identical to the true wavefunctions outside the augmentation region, which is called the core region in the context of the pseudopotential approach. Inside the augmentation region the pseudo-wavefunction should be nodeless and have the same norm as the true wavefunction, that is, ⟨φ̃_{ℓ,m}|φ̃_{ℓ,m}⟩ = ⟨φ_{ℓ,m}|φ_{ℓ,m}⟩ (compare Fig. 1). From the pseudo-wavefunction, a potential u_ℓ(r) can be reconstructed by inverting the respective Schrödinger equation:
$$\left[-\frac{\hbar^2}{2m_e}\nabla^2 + u_\ell(r) - \epsilon_{\ell,m}\right] \tilde\phi_{\ell,m}(\mathbf{r}) = 0 \;\;\Rightarrow\;\; u_\ell(r) = \epsilon_{\ell,m} + \frac{1}{\tilde\phi_{\ell,m}(\mathbf{r})}\cdot\frac{\hbar^2}{2m_e}\nabla^2 \tilde\phi_{\ell,m}(\mathbf{r}).$$

Figure 1. Illustration of the pseudopotential concept, using the example of the 3s wavefunction of Si. The solid line shows the radial part of the pseudo-wavefunction φ̃_{ℓ,m}. The dashed line corresponds to the all-electron wavefunction φ_{ℓ,m}, which exhibits strong oscillations at small radii. The angular-momentum-dependent pseudopotential u_ℓ (dash-dotted line) deviates from the all-electron potential v_eff (dotted line) inside the augmentation region. The data were generated by the fhi98PP code [28].
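The inversion formula above is straightforward to evaluate numerically. The following sketch applies it to a hypothetical nodeless Gaussian-shaped radial function (an ℓ = 0, ℏ = m_e = 1 toy case, not the output of an actual pseudopotential generator such as fhi98PP):

import numpy as np

# Invert the radial Schroedinger equation: given a nodeless pseudo-
# wavefunction phi(r) and its eigenvalue eps, recover the potential
# u(r) = eps + (1/2) laplacian(phi) / phi (atomic units, l = 0).
# For a radial function, laplacian(phi) = (1/r) d^2(r*phi)/dr^2.

r = np.linspace(0.01, 6.0, 600)
eps = -0.5                         # assumed eigenvalue (hartree)
phi = np.exp(-0.7 * r**2)          # hypothetical nodeless pseudo-wavefunction

P = r * phi
d2P = np.gradient(np.gradient(P, r), r)
u = eps + 0.5 * d2P / P            # reconstructed potential u(r)
print(u[:5])                       # smooth and finite at small r, by construction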
This potential u_ℓ(r) (compare Fig. 1), which is also spherically symmetric, differs from one main angular momentum to the other. Next we define an effective pseudo-Hamiltonian

$$\tilde H_\ell = -\frac{\hbar^2}{2m_e}\nabla^2 + v_\ell^{\mathrm{ps}}(\mathbf{r}) + \frac{e^2}{4\pi\epsilon_0}\int d^3r' \, \frac{\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} + \mu_{xc}([\tilde n(\mathbf{r})], \mathbf{r})$$

and determine the pseudopotentials v_ℓ^ps such that the pseudo-Hamiltonian produces the pseudo-wavefunctions, that is,

$$v_\ell^{\mathrm{ps}}(\mathbf{r}) = u_\ell(\mathbf{r}) - \frac{e^2}{4\pi\epsilon_0}\int d^3r' \, \frac{\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} - \mu_{xc}([\tilde n(\mathbf{r})], \mathbf{r}). \tag{7}$$
This process is called "unscreening." Z̃(r) mimics the charge density of the nucleus and the core electrons. It is usually an atom-centered, spherical Gaussian that is normalized to the charge of the nucleus and core of that atom. In the pseudopotential approach, Z̃_R(r) does not change with the potential. The pseudo density ñ(r) = Σ_n f_n Ψ̃_n*(r)Ψ̃_n(r) is constructed from the pseudo-wavefunctions. In this way we obtain a different potential for each angular momentum channel. In order to apply these potentials to a given wavefunction, the wavefunction must first be decomposed into angular momenta. Then the pseudopotential v_ℓ^ps for the corresponding angular momentum is applied to each component. The pseudopotential defined in this way can be expressed in a semilocal form:
$$v^{\mathrm{ps}}(\mathbf{r}, \mathbf{r}') = \bar v(\mathbf{r})\,\delta(\mathbf{r}-\mathbf{r}') + \sum_{\ell,m} Y_{\ell,m}(\mathbf{r})\,\big[v_\ell^{\mathrm{ps}}(\mathbf{r}) - \bar v(\mathbf{r})\big]\, \frac{\delta(|\mathbf{r}| - |\mathbf{r}'|)}{|\mathbf{r}|^2}\, Y_{\ell,m}^*(\mathbf{r}'). \tag{8}$$
The local potential v̄(r) acts only on those angular momentum components not included in the expansion of the pseudopotential construction. Typically, it is chosen to cancel the most expensive nonlocal terms, the one corresponding to the highest physically relevant angular momentum. The pseudopotential is nonlocal, as it depends on two position arguments, r and r'. Its expectation values are evaluated as a double integral:

$$\langle\tilde\Psi|v^{\mathrm{ps}}|\tilde\Psi\rangle = \int d^3r \int d^3r' \, \tilde\Psi^*(\mathbf{r})\, v^{\mathrm{ps}}(\mathbf{r}, \mathbf{r}')\, \tilde\Psi(\mathbf{r}').$$
The semilocal form of the pseudopotential given in Eq. (8) is computationally expensive. Therefore, in practice, one uses a separable form of the pseudopotential [29–31]:

$$v^{\mathrm{ps}} \approx \sum_{i,j} v^{\mathrm{ps}}|\tilde\phi_i\rangle \left[\langle\tilde\phi_j|v^{\mathrm{ps}}|\tilde\phi_i\rangle\right]^{-1}_{i,j} \langle\tilde\phi_j|v^{\mathrm{ps}}. \tag{9}$$
Thus, the projection onto spherical harmonics used in the semilocal form of Eq. (8) is replaced by a projection onto angular-momentum-dependent functions |v^ps φ̃_i⟩. The indices i and j are composite indices containing the atomic-site index R, the angular momentum quantum numbers ℓ, m, and an additional index α. The index α distinguishes partial waves with otherwise identical indices R, ℓ, m, as more than one partial wave per site and angular momentum is allowed. The partial waves may be constructed as eigenstates of the pseudopotential v_ℓ^ps for a set of energies. One can show that the identity of Eq. (9) holds by applying a wavefunction |Ψ̃⟩ = Σ_i |φ̃_i⟩c_i to both sides. If the set of pseudo partial waves |φ̃_i⟩ in Eq. (9) is complete, the identity is exact. The advantage of the separable form is that ⟨φ̃_i v^ps| is treated as one function, so that expectation values are reduced to combinations of simple scalar products ⟨φ̃_i v^ps|Ψ̃⟩. The total energy of the pseudopotential method can be written in the form

$$E = \sum_n f_n \left\langle \tilde\Psi_n \middle| \frac{-\hbar^2}{2m_e}\nabla^2 \middle| \tilde\Psi_n \right\rangle + \sum_n f_n \langle\tilde\Psi_n|v^{\mathrm{ps}}|\tilde\Psi_n\rangle + E_{\mathrm{self}} + \frac{1}{2}\cdot\frac{e^2}{4\pi\epsilon_0} \int d^3r \int d^3r' \, \frac{[\tilde n(\mathbf{r}) + \tilde Z(\mathbf{r})]\,[\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|} + E_{xc}[\tilde n(\mathbf{r})]. \tag{10}$$
The constant E_self is adjusted such that the total energy of the atom is the same for an all-electron calculation and for the pseudopotential calculation. For the atom from which it has been constructed, this construction guarantees that the pseudopotential method produces the correct one-particle energies for the valence states and that the wavefunctions have the desired shape. While pseudopotentials have proven to be accurate for a large variety of systems, there is no strict guarantee that they produce the same results as an all-electron calculation if they are used in a molecule or solid. The error sources can be divided into two classes:
• Energy transferability problems: even for the potential of the reference atom, the scattering properties are accurate only in a given energy window.
• Charge transferability problems: in a molecule or crystal, the potential differs from that of the isolated atom. The pseudopotential, however, is strictly valid only for the isolated atom.
The plane-wave basis set for the pseudo-wavefunctions is defined by the shortest wavelength λ_min = 2π/|G_max| via the so-called plane-wave cutoff E_PW = ℏ²G²_max/(2m_e). It is often specified in Rydberg (1 Ry = ½ H ≈ 13.6 eV). The plane-wave cutoff is the highest kinetic energy of all basis functions. The basis-set convergence can be controlled systematically by increasing the plane-wave cutoff. The charge transferability is substantially improved by including a nonlinear core correction [32] in the exchange-correlation term of Eq. (10). Hamann [33] showed how to construct pseudopotentials from unbound wavefunctions as well. Vanderbilt [31] and Laasonen et al. [34] generalized the pseudopotential method to non-norm-conserving pseudopotentials, so-called ultrasoft pseudopotentials, which dramatically improve the basis-set convergence. The formulation of ultrasoft pseudopotentials already has many similarities with the projector augmented wave method. Truncated separable pseudopotentials sometimes suffer from so-called ghost states. These are unphysical core-like states, which render the pseudopotential useless. These problems have been discussed by Gonze et al. [35]. Quantities such as hyperfine parameters, which depend on the full wavefunctions near the nucleus, can be extracted approximately [36]. A good review of pseudopotential methodology has been written by Payne et al. [37] and Singh [10]. In 1985, Car and Parrinello [38] published the ab initio molecular dynamics method. Simulations of the atomic motion have become possible on the basis of state-of-the-art electronic structure methods. Besides making dynamical phenomena and finite-temperature effects accessible to electronic structure calculations, the ab initio molecular dynamics method also introduced a radically new way of thinking into electronic structure methods. Diagonalization of a Hamilton matrix has been replaced by classical equations of motion for the wavefunction coefficients. If one applies friction, the system is quenched to the ground state. Without friction, truly dynamical simulations of the atomic structure are performed. Using thermostats [39–42], simulations at constant temperature can be performed. The Car–Parrinello method treats electronic wavefunctions and atomic positions on an equal footing.
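To see how the cutoff defined above controls the basis-set size in practice, the sketch below counts, for a hypothetical cubic cell, the reciprocal lattice vectors whose kinetic energy lies below E_PW; the cell size and cutoffs are arbitrary illustrative choices.

import numpy as np

# Count plane waves G = (2*pi/L)(i, j, k) with hbar^2 G^2 / (2 m_e) <= E_PW
# for a cubic cell of side L. Hartree atomic units: hbar = m_e = 1, 1 Ha = 2 Ry.

def n_planewaves(L_bohr, ecut_ry):
    ecut_ha = ecut_ry / 2.0
    gmax = np.sqrt(2.0 * ecut_ha)                 # kinetic cutoff -> |G| <= gmax
    nmax = int(np.ceil(gmax * L_bohr / (2*np.pi)))
    n = np.arange(-nmax, nmax + 1)
    i, j, k = np.meshgrid(n, n, n, indexing="ij")
    G2 = (2*np.pi / L_bohr)**2 * (i**2 + j**2 + k**2)
    return int(np.sum(0.5 * G2 <= ecut_ha))

for ecut in (10, 20, 40):                          # cutoffs in Ry
    print(ecut, "Ry:", n_planewaves(10.0, ecut))   # 10-bohr cubic cell
# the count grows as E_PW^(3/2): doubling the cutoff costs ~2.8x more basis functions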
3. Projector Augmented Wave Method
The Car–Parrinello method was first implemented for the pseudopotential approach. There seemed to be insurmountable barriers against combining the new technique with augmented wave methods. The main problem was related to the potential-dependent basis set used in augmented wave methods: the Car–Parrinello method requires a well-defined and unique total energy functional of atomic positions and basis set coefficients. Furthermore, the analytic evaluation of the first partial derivatives of the total energy with respect
to the wavefunctions, H|Ψ_n⟩, and to the atomic positions, the forces, must be possible. Therefore, it was one of the main goals of the PAW method to introduce energy- and potential-independent basis sets that are as accurate as the previously used augmented basis sets. Other requirements have been: (1) the method should at least match the efficiency of the pseudopotential approach for Car–Parrinello simulations; (2) it should become an exact theory when converged; and (3) its convergence should be easily controlled. We believe that these criteria have been met, which explains why the PAW method is becoming increasingly widespread today.
3.1. Transformation Theory
At the root of the PAW method lies a transformation that maps the true wavefunctions, with their complete nodal structure, onto auxiliary wavefunctions that are numerically convenient. We aim for smooth auxiliary wavefunctions, which have a rapidly convergent plane-wave expansion. With such a transformation we can expand the auxiliary wavefunctions into a convenient basis set such as plane waves, and evaluate all physical properties after reconstructing the related physical (true) wavefunctions. Let us denote the physical one-particle wavefunctions as |Ψ_n⟩ and the auxiliary wavefunctions as |Ψ̃_n⟩. Note that the tilde refers to the representation of smooth auxiliary wavefunctions, and n is the label for a one-particle state, containing a band index, a k-point and a spin index. The transformation from the auxiliary to the physical wavefunctions is denoted by T:

$$|\Psi_n\rangle = T|\tilde\Psi_n\rangle. \tag{11}$$
Now we express the constrained density functional F of Eq. (2) in terms of our auxiliary wavefunctions:

$$F[\tilde\Psi_n, \Lambda_{m,n}] = E[T\tilde\Psi_n] - \sum_{n,m} \big[\langle\tilde\Psi_n|T^\dagger T|\tilde\Psi_m\rangle - \delta_{n,m}\big]\Lambda_{m,n}. \tag{12}$$
The variational principle with respect to the auxiliary wavefunctions yields

$$T^\dagger H T|\tilde\Psi_n\rangle = T^\dagger T|\tilde\Psi_n\rangle\, \epsilon_n. \tag{13}$$
Again we obtain a Schrödinger-like equation (see the derivation of Eq. (4)), but now the Hamilton operator has a different form, H̃ = T†HT, an overlap operator Õ = T†T occurs, and the resulting auxiliary wavefunctions are smooth. When we evaluate physical quantities, we need to evaluate expectation values of an operator A, which can be expressed in terms of either the true or the auxiliary wavefunctions:

$$\langle A\rangle = \sum_n f_n \langle\Psi_n|A|\Psi_n\rangle = \sum_n f_n \langle\tilde\Psi_n|T^\dagger A T|\tilde\Psi_n\rangle. \tag{14}$$
In the representation of auxiliary wavefunctions we need to use transformed operators Ã = T†AT. As it stands, this equation only holds for the valence electrons. The core electrons are treated differently, as will be shown below. The transformation takes us conceptually from the world of pseudopotentials to that of augmented wave methods, which deal with the full wavefunctions. We will see that our auxiliary wavefunctions, which are simply the plane-wave parts of the full wavefunctions, translate into the wavefunctions of the pseudopotential approach. In the PAW method, the auxiliary wavefunctions are used to construct the true wavefunctions, and the total energy functional is evaluated from the latter. Thus it provides the missing link between augmented wave methods and the pseudopotential method, which can be derived as a well-defined approximation of the PAW method. In the original paper [4], the auxiliary wavefunctions were termed pseudo-wavefunctions and the true wavefunctions were termed all-electron wavefunctions, in order to make the connection more evident. We avoid this notation here, because it resulted in confusion in cases where the correspondence is not clear-cut.
3.2. Transformation Operator
So far, we have described how we can determine the auxiliary wavefunctions of the ground state and how to obtain physical information from them. What is missing is a definition of the transformation operator T. The operator T has to modify the smooth auxiliary wavefunction in each atomic region, so that the resulting wavefunction has the correct nodal structure. Therefore, it makes sense to write the transformation as the identity plus a sum of atomic contributions S_R:

$$T = 1 + \sum_R S_R. \tag{15}$$
For every atom, S_R adds the difference between the true and the auxiliary wavefunction. The local terms S_R are defined in terms of solutions |φ_i⟩ of the Schrödinger equation for the isolated atoms. This set of partial waves |φ_i⟩ will serve as a basis set so that, near the nucleus, all relevant valence wavefunctions can be expressed as a superposition of the partial waves with yet unknown coefficients:

$$\Psi(\mathbf{r}) = \sum_{i\in R} \phi_i(\mathbf{r})\, c_i \qquad \text{for } |\mathbf{r}-\mathbf{R}_R| < r_{c,R}, \tag{16}$$
where by i ∈ R we indicate those partial waves that belong to site R. Since the core wavefunctions do not spread out into the neighboring atoms, we will treat them differently. Currently we use the frozen-core approximation, which imports the density and the energy of the core electrons from
the corresponding isolated atoms. The transformation T shall produce only wavefunctions orthogonal to the core electrons, while the core electrons are treated separately. Therefore, the set of atomic partial waves |φ_i⟩ includes only valence states that are orthogonal to the core wavefunctions of the atom. For each of the partial waves we choose an auxiliary partial wave |φ̃_i⟩. The identity

$$|\phi_i\rangle = (1 + S_R)|\tilde\phi_i\rangle \quad \text{for } i \in R, \qquad S_R|\tilde\phi_i\rangle = |\phi_i\rangle - |\tilde\phi_i\rangle \tag{17}$$
defines the local contribution S_R to the transformation operator. Since 1 + S_R shall change the wavefunction only locally, we require that the partial waves |φ_i⟩ and their auxiliary counterparts |φ̃_i⟩ be pairwise identical beyond a certain radius r_{c,R}:

$$\phi_i(\mathbf{r}) = \tilde\phi_i(\mathbf{r}) \qquad \text{for } i \in R \text{ and } |\mathbf{r}-\mathbf{R}_R| > r_{c,R}. \tag{18}$$
Note that the partial waves are not necessarily bound states and are therefore not normalizable, unless we truncate them beyond a certain radius r_{c,R}. The PAW method is formulated such that the final results do not depend on the location where the partial waves are truncated, as long as this is not done too close to the nucleus and is identical for auxiliary and all-electron partial waves. In order to be able to apply the transformation operator to an arbitrary auxiliary wavefunction, we need to be able to expand the auxiliary wavefunction locally into the auxiliary partial waves:

$$\tilde\Psi(\mathbf{r}) = \sum_{i\in R} \tilde\phi_i(\mathbf{r})\, c_i = \sum_{i\in R} \tilde\phi_i(\mathbf{r})\, \langle\tilde p_i|\tilde\Psi\rangle \qquad \text{for } |\mathbf{r}-\mathbf{R}_R| < r_{c,R}, \tag{19}$$
which defines the projector functions |p̃_i⟩. The projector functions probe the local character of the auxiliary wavefunction in the atomic region. Examples of projector functions are shown in Fig. 2. From Eq. (19) we can derive Σ_{i∈R} |φ̃_i⟩⟨p̃_i| = 1, which is valid within r_{c,R}. It can be shown by insertion that the identity of Eq. (19) holds for any auxiliary wavefunction |Ψ̃⟩ that can be expanded locally into auxiliary partial waves |φ̃_i⟩, if

$$\langle\tilde p_i|\tilde\phi_j\rangle = \delta_{i,j} \qquad \text{for } i, j \in R. \tag{20}$$
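One standard way to obtain projector functions obeying Eq. (20) is to start from an arbitrary set of localized trial functions and bi-orthogonalize them against the auxiliary partial waves. The sketch below does this for two hypothetical 1D functions on a grid; it illustrates the bi-orthogonality condition only, not the closure-based construction used in practice:

import numpy as np

# Bi-orthogonalize localized trial functions g_i against auxiliary partial
# waves phi_t_j so that <p_i | phi_t_j> = delta_ij (Eq. (20)).

x = np.linspace(-3.0, 3.0, 1201)
dx = x[1] - x[0]

phi_t = np.array([np.exp(-x**2), x * np.exp(-x**2)])      # auxiliary partial waves (toy)
g     = np.array([np.exp(-2*x**2), x * np.exp(-2*x**2)])  # localized trial functions (toy)

A = g @ phi_t.T * dx                 # A_ij = <g_i | phi_t_j>
p_t = np.linalg.inv(A) @ g           # p_i = sum_j (A^-1)_ij g_j

print(np.round(p_t @ phi_t.T * dx, 10))   # identity matrix: Eq. (20) holds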
Note that neither the projector functions nor the partial waves need to be orthogonal among themselves. The projector functions are fully determined by the above conditions and a closure relation, which is related to the unscreening of the pseudopotentials (see Eq. 90 in Ref. [4]). By combining Eqs. (17) and (19), we can apply S_R to any auxiliary wavefunction:

$$S_R|\tilde\Psi\rangle = \sum_{i\in R} S_R|\tilde\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle = \sum_{i\in R} \big(|\phi_i\rangle - |\tilde\phi_i\rangle\big)\langle\tilde p_i|\tilde\Psi\rangle. \tag{21}$$
Figure 2. Projector functions of the chlorine atom. Top: two s-type projector functions, middle: p-type, bottom: d-type.
Hence, the transformation operator is

$$T = 1 + \sum_i \big(|\phi_i\rangle - |\tilde\phi_i\rangle\big)\langle\tilde p_i|, \tag{22}$$
where the sum runs over all partial waves of all atoms. The true wavefunction can be expressed as

$$|\Psi\rangle = |\tilde\Psi\rangle + \sum_i \big(|\phi_i\rangle - |\tilde\phi_i\rangle\big)\langle\tilde p_i|\tilde\Psi\rangle = |\tilde\Psi\rangle + \sum_R \big(|\Psi_R^1\rangle - |\tilde\Psi_R^1\rangle\big) \tag{23}$$
with

$$|\Psi_R^1\rangle = \sum_{i\in R} |\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle \tag{24}$$

$$|\tilde\Psi_R^1\rangle = \sum_{i\in R} |\tilde\phi_i\rangle\langle\tilde p_i|\tilde\Psi\rangle. \tag{25}$$
In Fig. 3, the decomposition of Eq. (23) is shown for the example of the bonding p-σ state of the Cl₂ molecule. To understand the expression of Eq. (23) for the true wavefunction, let us concentrate on different regions in space. (1) Far from the atoms, the partial waves are, according to Eq. (18), pairwise identical, so that the auxiliary wavefunction is identical to the true wavefunction, that is, Ψ(r) = Ψ̃(r). (2) Close to an atom R, however, the auxiliary wavefunction is, according to Eq. (19), identical to its one-center expansion, that is, Ψ̃(r) = Ψ̃_R¹(r). Hence, the true
Figure 3. Bonding p-σ orbital of the Cl₂ molecule and the decomposition of its wavefunction into the auxiliary wavefunction and the two one-center expansions. Top-left: true and auxiliary wavefunction; top-right: auxiliary wavefunction and its partial wave expansion; bottom-left: the two partial wave expansions; bottom-right: true wavefunction and its partial wave expansion.
Hence, the true wavefunction $\Psi(\mathbf{r})$ is identical to $\Psi^1_R(\mathbf{r})$, which is built up from partial waves that contain the proper nodal structure. In practice, the partial wave expansions are truncated. Therefore, the identity of Eq. (19) does not hold strictly. As a result, the plane waves also contribute to the true wavefunction inside the atomic region. This has the advantage that the missing terms in a truncated partial wave expansion are partly accounted for by plane waves, which explains the rapid convergence of
the partial wave expansions. This idea is related to the additive augmentation of the LAPW method of Soler and Williams [11]. Frequently the question comes up whether the transformation Eq. (22) of the auxiliary wavefunctions indeed provides the true wavefunction. The transformation should be considered merely as a change of representation, analogous to a coordinate transform. If the total energy functional is transformed consistently, its minimum will yield auxiliary wavefunctions that produce the correct wavefunctions $|\Psi\rangle$.
3.3. Expectation Values
Expectation values can be obtained either from the reconstructed true wavefunctions or directly from the auxiliary wavefunctions:

$$\langle A\rangle = \sum_n f_n \langle\Psi_n|A|\Psi_n\rangle + \sum_{n=1}^{N_c} \langle\phi^c_n|A|\phi^c_n\rangle = \sum_n f_n \langle\tilde\Psi_n|T^\dagger A T|\tilde\Psi_n\rangle + \sum_{n=1}^{N_c} \langle\phi^c_n|A|\phi^c_n\rangle, \qquad (26)$$

where $f_n$ are the occupations of the valence states and $N_c$ is the number of core states. The first sum runs over the valence states, the second over the core states $|\phi^c_n\rangle$. Now we can decompose the matrix element for a wavefunction into its individual contributions according to Eq. (23):
$$\begin{aligned}
\langle\Psi|A|\Psi\rangle &= \Bigl\langle\tilde\Psi + \sum_R\bigl(\Psi^1_R - \tilde\Psi^1_R\bigr)\Bigr|A\Bigl|\tilde\Psi + \sum_{R'}\bigl(\Psi^1_{R'} - \tilde\Psi^1_{R'}\bigr)\Bigr\rangle \\
&= \underbrace{\langle\tilde\Psi|A|\tilde\Psi\rangle + \sum_R \bigl(\langle\Psi^1_R|A|\Psi^1_R\rangle - \langle\tilde\Psi^1_R|A|\tilde\Psi^1_R\rangle\bigr)}_{\text{part 1}} \\
&\quad + \underbrace{\sum_R \bigl(\langle\Psi^1_R - \tilde\Psi^1_R|A|\tilde\Psi - \tilde\Psi^1_R\rangle + \langle\tilde\Psi - \tilde\Psi^1_R|A|\Psi^1_R - \tilde\Psi^1_R\rangle\bigr)}_{\text{part 2}} \\
&\quad + \underbrace{\sum_{R\neq R'} \langle\Psi^1_R - \tilde\Psi^1_R|A|\Psi^1_{R'} - \tilde\Psi^1_{R'}\rangle}_{\text{part 3}}. \qquad (27)
\end{aligned}$$
Only the first part of Eq. (27) is evaluated explicitly, while the second and third parts are neglected, because they vanish for sufficiently local operators as long as the partial wave expansion is converged: the function $\Psi^1_R - \tilde\Psi^1_R$ vanishes by construction beyond its augmentation region, because the partial waves are pairwise identical beyond that region. The function $\tilde\Psi - \tilde\Psi^1_R$ vanishes inside its augmentation region if the partial wave expansion is sufficiently converged. In no region of space are both functions $\Psi^1_R - \tilde\Psi^1_R$ and $\tilde\Psi - \tilde\Psi^1_R$ simultaneously nonzero. Similarly, the functions $\Psi^1_R - \tilde\Psi^1_R$ from different sites are never nonzero in the same region of space. Hence, the second and third parts of Eq. (27) vanish for operators such as the kinetic energy $-\frac{\hbar^2}{2m_e}\nabla^2$ and the real-space projection operator $|\mathbf{r}\rangle\langle\mathbf{r}|$, which produces the electron density. For truly nonlocal operators, parts 2 and 3 of Eq. (27) would have to be considered explicitly. With the help of Eq. (27), the expression Eq. (26) for the expectation value can therefore be written as
$$\begin{aligned}
\langle A\rangle &= \sum_n f_n \bigl(\langle\tilde\Psi_n|A|\tilde\Psi_n\rangle + \langle\Psi^1_n|A|\Psi^1_n\rangle - \langle\tilde\Psi^1_n|A|\tilde\Psi^1_n\rangle\bigr) + \sum_{n=1}^{N_c}\langle\phi^c_n|A|\phi^c_n\rangle \\
&= \sum_n f_n \langle\tilde\Psi_n|A|\tilde\Psi_n\rangle + \sum_{n=1}^{N_c}\langle\tilde\phi^c_n|A|\tilde\phi^c_n\rangle \\
&\quad + \sum_R \Bigl(\sum_{i,j\in R} D_{i,j}\langle\phi_j|A|\phi_i\rangle + \sum_{n\in R}^{N_{c,R}}\langle\phi^c_n|A|\phi^c_n\rangle\Bigr) \\
&\quad - \sum_R \Bigl(\sum_{i,j\in R} D_{i,j}\langle\tilde\phi_j|A|\tilde\phi_i\rangle + \sum_{n\in R}^{N_{c,R}}\langle\tilde\phi^c_n|A|\tilde\phi^c_n\rangle\Bigr), \qquad (28)
\end{aligned}$$
where $D_{i,j}$ is the one-center density matrix, defined as

$$D_{i,j} = \sum_n f_n \langle\tilde\Psi_n|\tilde p_j\rangle\langle\tilde p_i|\tilde\Psi_n\rangle. \qquad (29)$$
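Equations (29) and (31) below are simple bookkeeping, as the following Python/NumPy sketch illustrates. The projections $\langle\tilde p_i|\tilde\Psi_n\rangle$ are replaced by random numbers and the partial waves by synthetic radial functions; only the index structure is meaningful.

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_pw, n_r = 4, 3, 500         # valence states, partial waves, radial points

    f = np.array([2.0, 2.0, 1.0, 0.0])      # occupations f_n
    P = rng.normal(size=(n_states, n_pw))   # P[n, i] stands in for <p_tilde_i|Psi_tilde_n>

    # Eq. (29): D_ij = sum_n f_n <Psi_n|p_j><p_i|Psi_n>  (real arithmetic here)
    D = P.T @ (f[:, None] * P)

    # One-center valence density (cf. Eq. 31) from synthetic partial waves
    r = np.linspace(1e-3, 3.0, n_r)
    phi = np.array([r**(i + 1) * np.exp(-r) for i in range(n_pw)])
    n1 = np.einsum("ij,jr,ir->r", D, phi, phi)   # sum_ij D_ij phi_j(r) phi_i(r)

The core contribution of Eq. (31) would simply be added to n1.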
The auxiliary core states $|\tilde\phi^c_n\rangle$ allow the tails of the core wavefunctions to be incorporated into the plane-wave part, and therefore ensure that the integrations of the partial wave contributions cancel strictly beyond $r_c$. They are identical to the true core states in the tails, but are a smooth continuation inside the atomic sphere. It is not required that the auxiliary wavefunctions be normalized.
Following this scheme, the electron density is given by

$$n(\mathbf{r}) = \tilde n(\mathbf{r}) + \sum_R \bigl(n^1_R(\mathbf{r}) - \tilde n^1_R(\mathbf{r})\bigr) \qquad (30)$$

$$\tilde n(\mathbf{r}) = \sum_n f_n\,\tilde\Psi_n^*(\mathbf{r})\,\tilde\Psi_n(\mathbf{r}) + \tilde n_c(\mathbf{r})$$

$$n^1_R(\mathbf{r}) = \sum_{i,j\in R} D_{i,j}\,\phi_j^*(\mathbf{r})\,\phi_i(\mathbf{r}) + n_{c,R}(\mathbf{r})$$

$$\tilde n^1_R(\mathbf{r}) = \sum_{i,j\in R} D_{i,j}\,\tilde\phi_j^*(\mathbf{r})\,\tilde\phi_i(\mathbf{r}) + \tilde n_{c,R}(\mathbf{r}), \qquad (31)$$
where $n_{c,R}$ is the core density of the corresponding atom and $\tilde n_{c,R}$ is the auxiliary core density, which is identical to $n_{c,R}$ outside the atomic region but smooth inside. Before we continue, let us discuss a special point: the matrix element of a general operator with the auxiliary wavefunctions may be slowly converging with the plane-wave expansion, because the operator A may not be well behaved. An example of such an operator is the singular electrostatic potential of a nucleus. This problem can be alleviated by adding an "intelligent zero": if an operator B is purely localized within an atomic region, we can use the identity between the auxiliary wavefunction and its own partial wave expansion,

$$0 = \langle\tilde\Psi_n|B|\tilde\Psi_n\rangle - \langle\tilde\Psi^1_n|B|\tilde\Psi^1_n\rangle. \qquad (32)$$
Now we choose an operator B such that it cancels the problematic behavior of the operator A, but is localized in a single atomic region. By adding B to the plane-wave part and to the matrix elements with its one-center expansions, the plane-wave convergence can be improved without affecting the converged result. A term of this type, namely $\bar v$, will be introduced in the next section to cancel the Coulomb singularity of the potential at the nucleus.
4. Total Energy
Like the wavefunctions and expectation values, the total energy can be divided into three parts:

$$E[\tilde\Psi_n, \mathbf{R}_R] = \tilde E + \sum_R \bigl(E^1_R - \tilde E^1_R\bigr). \qquad (33)$$
The plane-wave part $\tilde E$ involves only smooth functions and is evaluated on equispaced grids in real and reciprocal space. This part is computationally
most demanding, and is similar to the expressions in the pseudopotential approach:

$$\tilde E = \sum_n f_n \Bigl\langle\tilde\Psi_n\Bigl|-\frac{\hbar^2}{2m_e}\nabla^2\Bigr|\tilde\Psi_n\Bigr\rangle + \frac{1}{2}\,\frac{e^2}{4\pi\varepsilon_0}\int d^3r \int d^3r'\, \frac{[\tilde n(\mathbf{r}) + \tilde Z(\mathbf{r})]\,[\tilde n(\mathbf{r}') + \tilde Z(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|} + \int d^3r\, \bar v(\mathbf{r})\,\tilde n(\mathbf{r}) + E_{xc}[\tilde n(\mathbf{r})]. \qquad (34)$$
$\tilde Z(\mathbf{r})$ is an angular-momentum-dependent core-like density that will be described in detail below. The remaining parts can be evaluated on radial grids in a spherical harmonics expansion. The nodal structure of the wavefunctions can be properly described on a logarithmic radial grid that becomes very fine near the nucleus:

$$\begin{aligned}
E^1_R &= \sum_{i,j\in R} D_{i,j}\Bigl\langle\phi_j\Bigl|-\frac{\hbar^2}{2m_e}\nabla^2\Bigr|\phi_i\Bigr\rangle + \sum_{n\in R}^{N_{c,R}}\Bigl\langle\phi^c_n\Bigl|-\frac{\hbar^2}{2m_e}\nabla^2\Bigr|\phi^c_n\Bigr\rangle \\
&\quad + \frac{1}{2}\,\frac{e^2}{4\pi\varepsilon_0}\int d^3r\int d^3r'\,\frac{[n^1(\mathbf{r}) + Z(\mathbf{r})]\,[n^1(\mathbf{r}') + Z(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|} + E_{xc}[n^1(\mathbf{r})] \qquad (35)
\end{aligned}$$

$$\begin{aligned}
\tilde E^1_R &= \sum_{i,j\in R} D_{i,j}\Bigl\langle\tilde\phi_j\Bigl|-\frac{\hbar^2}{2m_e}\nabla^2\Bigr|\tilde\phi_i\Bigr\rangle \\
&\quad + \frac{1}{2}\,\frac{e^2}{4\pi\varepsilon_0}\int d^3r\int d^3r'\,\frac{[\tilde n^1(\mathbf{r}) + \tilde Z(\mathbf{r})]\,[\tilde n^1(\mathbf{r}') + \tilde Z(\mathbf{r}')]}{|\mathbf{r}-\mathbf{r}'|} + \int d^3r\,\bar v(\mathbf{r})\,\tilde n^1(\mathbf{r}) + E_{xc}[\tilde n^1(\mathbf{r})]. \qquad (36)
\end{aligned}$$
The compensation charge density $\tilde Z(\mathbf{r}) = \sum_R \tilde Z_R(\mathbf{r})$ is given as a sum of angular-momentum-dependent Gauss functions, which have an analytical plane-wave expansion. A similar term occurs also in the pseudopotential approach. In contrast to the norm-conserving pseudopotential approach, however, the compensation charge of an atom, $\tilde Z_R$, is nonspherical and constantly adapts to the instantaneous environment. It is constructed such that

$$n^1_R(\mathbf{r}) + Z_R(\mathbf{r}) - \tilde n^1_R(\mathbf{r}) - \tilde Z_R(\mathbf{r}) \qquad (37)$$
has vanishing electrostatic multipole moments for each atomic site. With this choice, the electrostatic potentials of the augmentation densities vanish outside their spheres. This is the reason that there is no electrostatic interaction of the one-center parts between different sites.
The compensation charge density as given here is still localized within the atomic regions. A technique similar to an Ewald summation, however, allows it to be replaced by a very extended charge density. Thus we can ensure that the plane-wave convergence of the total energy is not affected by the auxiliary density. The potential $\bar v = \sum_R \bar v_R$, which occurs in Eqs. (34) and (36), enters the total energy in the form of the "intelligent zeros" described in Eq. (32):

$$0 = \sum_n f_n \bigl(\langle\tilde\Psi_n|\bar v_R|\tilde\Psi_n\rangle - \langle\tilde\Psi^1_n|\bar v_R|\tilde\Psi^1_n\rangle\bigr) = \sum_n f_n \langle\tilde\Psi_n|\bar v_R|\tilde\Psi_n\rangle - \sum_{i,j\in R} D_{i,j}\langle\tilde\phi_i|\bar v_R|\tilde\phi_j\rangle. \qquad (38)$$
The main reason for introducing this potential is to cancel the Coulomb singularity of the potential in the plane-wave part. The potential $\bar v$ makes it possible to influence the plane-wave convergence beneficially without changing the converged result; $\bar v$ must be localized within the augmentation region, where Eq. (19) holds.
5. Approximations
Once the total energy functional of the previous section has been defined, everything else follows: forces are partial derivatives with respect to atomic positions; the potential is the derivative of the nonkinetic energy contributions to the total energy with respect to the density; and the auxiliary Hamiltonian follows from derivatives $\tilde H|\tilde\Psi_n\rangle$ with respect to the auxiliary wavefunctions. The fictitious Lagrangian approach of Car and Parrinello [38] does not allow any freedom in the way these derivatives are obtained: anything other than analytic derivatives will violate energy conservation in a dynamical simulation. Since the expressions are straightforward, even though rather involved, we will not discuss them here. All approximations are incorporated already in the total energy functional of the PAW method. What are those approximations?
• First, we use the frozen-core approximation. In principle, this approximation can be overcome.
• The plane-wave expansion for the auxiliary wavefunctions must be complete. The plane-wave expansion is easily controlled by increasing the plane-wave cutoff, defined as $E_{PW} = \frac{1}{2}\hbar^2 G^2_{max}$. Typically, we use a plane-wave cutoff of 30 Ry.
• The partial wave expansions must be converged. Typically we use one or two partial waves per angular momentum $(\ell, m)$ and site. It should be noted that the partial wave expansion is not variational, because it changes the total energy functional and not the basis set for the auxiliary wavefunctions.
We do not discuss here numerical approximations, such as the choice of the radial grid, since those are easily controlled; a schematic convergence check along these lines is sketched below.
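To make the convergence control concrete, the following Python sketch monitors the total energy as the plane-wave cutoff is raised. The function paw_total_energy is a hypothetical stand-in for a call into a PAW code; here it merely mimics the typical exponential convergence, so the numbers are illustrative only.

    import numpy as np

    def paw_total_energy(ecut_ry):
        # Hypothetical stand-in for a real PAW total-energy call; the model
        # E(ecut) = E_inf + A * exp(-b * ecut) only mimics typical behavior.
        return -100.0 + 5.0 * np.exp(-0.15 * ecut_ry)

    e_prev = None
    for ecut in (20, 30, 40, 50, 60):          # plane-wave cutoffs E_PW in Ry
        e = paw_total_energy(ecut)
        if e_prev is not None:
            print(f"E_PW = {ecut:2d} Ry  E = {e:10.6f}  change = {e - e_prev:+.6f}")
        e_prev = e

The same loop, run over the number of partial waves per angular momentum instead of the cutoff, probes the second convergence parameter.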
6. Relation to the Pseudopotentials
We mentioned earlier that the pseudopotential approach can be derived as a well-defined approximation from the PAW method: the augmentation part of the total energy, $\Delta E = E^1 - \tilde E^1$, for one atom is a functional of the one-center density matrix $D_{i,j\in R}$ defined in Eq. (29). The pseudopotential approach can be recovered if we truncate a Taylor expansion of $\Delta E$ about the atomic density matrix after the linear term. The term linear in $D_{i,j}$ is the energy related to the nonlocal pseudopotential:

$$\begin{aligned}
\Delta E(D_{i,j}) &= \Delta E(D^{at}_{i,j}) + \sum_{i,j}\bigl(D_{i,j} - D^{at}_{i,j}\bigr)\frac{\partial \Delta E}{\partial D_{i,j}} + O\bigl(D_{i,j} - D^{at}_{i,j}\bigr)^2 \\
&= E_{self} + \sum_n f_n \langle\tilde\Psi_n|v^{ps}|\tilde\Psi_n\rangle - \int d^3r\,\bar v(\mathbf{r})\,\tilde n(\mathbf{r}) + O\bigl(D_{i,j} - D^{at}_{i,j}\bigr)^2, \qquad (39)
\end{aligned}$$

which can be compared directly to the total energy expression, Eq. (10), of the pseudopotential method. The local potential $\bar v(\mathbf{r})$ of the pseudopotential approach is identical to the corresponding potential of the projector augmented-wave method. The remaining contributions in the PAW total energy, namely $\tilde E$, differ from the corresponding terms in Eq. (10) in only two features: our auxiliary density also contains an auxiliary core density, reflecting the nonlinear core correction of the pseudopotential approach, and the compensation density $\tilde Z(\mathbf{r})$ is nonspherical and depends on the wavefunction. Thus, we can also view the PAW method as a pseudopotential method with a pseudopotential that adapts to the instantaneous electronic environment. In the PAW method, the explicit nonlinear dependence of the total energy on the one-center density matrix is properly taken into account. What are the main advantages of the PAW method compared to the pseudopotential approach? First, all errors can be systematically controlled, so that there are no transferability errors. As shown by Watson and Carter [43] and Kresse and Joubert [44], most pseudopotentials fail for high-spin atoms such as Cr. While it is probably true that pseudopotentials can be constructed that cope even with this situation, a failure cannot be known beforehand, so that some empiricism remains in practice: a pseudopotential constructed from an isolated atom is
not guaranteed to be accurate for a molecule. In contrast, the converged results of the PAW method do not depend on a reference system such as an isolated atom, because PAW uses the full density and potential. Like other all-electron methods, the PAW method provides access to the full charge and spin density, which is relevant, for example, for hyperfine parameters. Hyperfine parameters are sensitive probes of the electron density near the nucleus. In many situations they are the only information available that allows one to deduce the atomic structure and chemical environment of an atom from experiment. The plane-wave convergence is more rapid than for norm-conserving pseudopotentials and should in principle be equivalent to that of ultrasoft pseudopotentials [31]. Compared to ultrasoft pseudopotentials, however, the PAW method has the advantage that the total energy expression is less complex and can therefore be expected to be more efficient. The construction of pseudopotentials requires one to determine a number of parameters. As they influence the results, their choice is critical. The PAW method also provides some flexibility in the choice of auxiliary partial waves; however, this choice does not influence the converged results.
7. Recent Developments
Since the first implementation of the PAW method in the CP-PAW code, a number of groups have adopted the PAW method. The second implementation was done by the group of Holzwarth [45]. The resulting PWPAW code is freely available [46]. This code is also used as a basis for the PAW implementation in the AbInit project. An independent PAW code has been developed by Valiev and Weare [47]. Recently, the PAW method has been implemented in the VASP code [44]. The PAW method has also been implemented by Kromen into the ESTCoMPP code of Blügel and Schröder. Another branch of methods uses the reconstruction of the PAW method without taking the full wavefunctions into account in the energy minimization. Following chemists' notation, this approach could be termed "post-pseudopotential PAW." This development began with the evaluation of hyperfine parameters from a pseudopotential calculation using the PAW reconstruction operator [36] and is now used in the pseudopotential approach to calculate properties that require the correct wavefunctions, such as hyperfine parameters. The implementation by Kresse and Joubert [44] has been particularly useful, as they had an implementation of PAW in the same code as the ultrasoft pseudopotentials, so that they could critically compare the two approaches with each other. Their conclusion is that both methods compare well in most
cases, but they found that magnetic energies are seriously in error – by a factor of two – in the pseudopotential approach, while the results of the PAW method were in line with other all-electron calculations using the linear augmented plane-wave method. As a short note, Kresse and Joubert incorrectly claim that their implementation is superior because it includes a term analogous to the nonlinear core correction of pseudopotentials [32]; this term, however, is already included in the original version in the form of the pseudized core density. Several extensions of the PAW method have been made in recent years. For applications in chemistry, truly isolated systems are often of great interest. As any plane-wave-based method introduces periodic images, the electrostatic interaction between these images can cause serious errors. The problem has been solved by mapping the charge density onto a point-charge model, so that the electrostatic interaction could be subtracted out in a self-consistent manner [48]. In order to include the influence of the environment, the latter was simulated by simpler force fields using the quantum-mechanics–molecular-mechanics (QM–MM) approach [49]. In order to overcome the limitations of density functional theory, several extensions have been performed. Bengone et al. [50] implemented the LDA+U approach in the CP-PAW code. Soon after this, Arnaud and Alouani [51] accomplished the implementation of the GW approximation in the CP-PAW code. The VASP version of PAW [52] and the CP-PAW code have now been extended to include a noncollinear description of the magnetic moments. In a noncollinear description, the Schrödinger equation is replaced by the Pauli equation with two-component spinor wavefunctions. The PAW method has proven useful for evaluating electric field gradients [53] and magnetic hyperfine parameters with high accuracy [54]. The prediction of NMR chemical shifts using the GIPAW method of Pickard and Mauri [55], which is based on their earlier work [56], will be invaluable. While GIPAW is implemented in a post-pseudopotential manner, the extension to a self-consistent PAW calculation should be straightforward. A post-pseudopotential approach has also been used to evaluate core-level spectra [57] and momentum matrix elements [58].
Acknowledgments
We are grateful to S. Boeck, J. Noffke, and A. Poddey for carefully reading the manuscript, and to K. Schwarz for his continuous support. This work has benefited from the collaborations within the ESF Programme on "Electronic Structure Calculations for Elucidating the Complex Atomistic Behavior of Solids and Surfaces."
References
[1] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864, 1964.
[2] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133, 1965.
[3] R.G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989.
[4] P.E. Blöchl, "Projector augmented-wave method," Phys. Rev. B, 50, 17953, 1994.
[5] J.C. Slater, "Wave functions in a periodic potential," Phys. Rev., 51, 846, 1937.
[6] J. Korringa, "On the calculation of the energy of a Bloch wave in a metal," Physica (Utrecht), 13, 392, 1947.
[7] W. Kohn and N. Rostoker, "Solution of the Schrödinger equation in periodic lattices with an application to metallic lithium," Phys. Rev., 94, 1111, 1954.
[8] O.K. Andersen, "Linear methods in band theory," Phys. Rev. B, 12, 3060, 1975.
[9] H. Krakauer, M. Posternak, and A.J. Freeman, "Linearized augmented plane-wave method for the electronic band structure of thin films," Phys. Rev. B, 19, 1706, 1979.
[10] D. Singh, Planewaves, Pseudopotentials and the LAPW Method, Kluwer Academic, Dordrecht, 1994.
[11] J.M. Soler and A.R. Williams, "Simple formula for the atomic forces in the augmented-plane-wave method," Phys. Rev. B, 40, 1560, 1989.
[12] D. Singh, "Ground-state properties of lanthanum: treatment of extended-core states," Phys. Rev. B, 43, 6388, 1991.
[13] E. Sjöstedt, L. Nordström, and D.J. Singh, "An alternative way of linearizing the augmented plane-wave method," Solid State Commun., 114, 15, 2000.
[14] G.K.H. Madsen, P. Blaha, K. Schwarz, E. Sjöstedt, and L. Nordström, "Efficient linearization of the augmented plane-wave method," Phys. Rev. B, 64, 195134, 2001.
[15] H.L. Skriver, The LMTO Method, Springer, New York, 1984.
[16] O.K. Andersen and O. Jepsen, "Explicit, first-principles tight-binding theory," Phys. Rev. Lett., 53, 2571, 1984.
[17] O.K. Andersen, T. Saha-Dasgupta, and S. Ezhof, "Third-generation muffin-tin orbitals," Bull. Mater. Sci., 26, 19, 2003.
[18] K. Held, I.A. Nekrasov, G. Keller, V. Eyert, N. Blümer, A.K. McMahan, R.T. Scalettar, T. Pruschke, V.I. Anisimov, and D. Vollhardt, "The LDA+DMFT approach to materials with strong electronic correlations," in J. Grotendorst, D. Marx, and A. Muramatsu (eds.), Quantum Simulations of Complex Many-Body Systems: From Theory to Algorithms, Lecture Notes, vol. 10, NIC Series, John von Neumann Institute for Computing, Jülich, p. 175, 2002.
[19] C. Herring, "A new method for calculating wave functions in crystals," Phys. Rev., 57, 1169, 1940.
[20] J.C. Phillips and L. Kleinman, "New method for calculating wave functions in crystals and molecules," Phys. Rev., 116, 287, 1959.
[21] E. Antoncik, "Approximate formulation of the orthogonalized plane-wave method," J. Phys. Chem. Solids, 10, 314, 1959.
[22] D.R. Hamann, M. Schlüter, and C. Chiang, "Norm-conserving pseudopotentials," Phys. Rev. Lett., 43, 1494, 1979.
[23] A. Zunger and M. Cohen, "First-principles nonlocal-pseudopotential approach in the density-functional formalism: development and application to atoms," Phys. Rev. B, 18, 5449, 1978.
[24] G.P. Kerker, "Non-singular atomic pseudopotentials for solid state applications," J. Phys. C, 13, L189, 1980.
[25] G.B. Bachelet, D.R. Hamann, and M. Schlüter, "Pseudopotentials that work: from H to Pu," Phys. Rev. B, 26, 4199, 1982.
[26] N. Troullier and J.L. Martins, "Efficient pseudopotentials for plane-wave calculations," Phys. Rev. B, 43, 1993, 1991.
[27] J.S. Lin, A. Qteish, M.C. Payne, and V. Heine, "Optimized and transferable nonlocal separable ab initio pseudopotentials," Phys. Rev. B, 47, 4174, 1993.
[28] M. Fuchs and M. Scheffler, "Ab initio pseudopotentials for electronic structure calculations of poly-atomic systems using density-functional theory," Comput. Phys. Commun., 119, 67, 1999.
[29] L. Kleinman and D.M. Bylander, "Efficacious form for model pseudopotentials," Phys. Rev. Lett., 48, 1425, 1982.
[30] P.E. Blöchl, "Generalized separable potentials for electronic structure calculations," Phys. Rev. B, 41, 5414, 1990.
[31] D. Vanderbilt, "Soft self-consistent pseudopotentials in a generalized eigenvalue formalism," Phys. Rev. B, 41, 7892, 1990.
[32] S.G. Louie, S. Froyen, and M.L. Cohen, "Nonlinear ionic pseudopotentials in spin-density-functional calculations," Phys. Rev. B, 26, 1738, 1982.
[33] D.R. Hamann, "Generalized norm-conserving pseudopotentials," Phys. Rev. B, 40, 2980, 1989.
[34] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, "Implementation of ultrasoft pseudopotentials in ab initio molecular dynamics," Phys. Rev. B, 47, 10142, 1993.
[35] X. Gonze, R. Stumpf, and M. Scheffler, "Analysis of separable potentials," Phys. Rev. B, 44, 8503, 1991.
[36] C.G. Van de Walle and P.E. Blöchl, "First-principles calculations of hyperfine parameters," Phys. Rev. B, 47, 4244, 1993.
[37] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, "Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients," Rev. Mod. Phys., 64, 1045, 1992.
[38] R. Car and M. Parrinello, "Unified approach for molecular dynamics and density-functional theory," Phys. Rev. Lett., 55, 2471, 1985.
[39] S. Nosé, "A unified formulation of the constant temperature molecular-dynamics methods," Mol. Phys., 52, 255, 1984.
[40] W.G. Hoover, "Canonical dynamics: equilibrium phase-space distributions," Phys. Rev. A, 31, 1695, 1985.
[41] P.E. Blöchl and M. Parrinello, "Adiabaticity in first-principles molecular dynamics," Phys. Rev. B, 45, 9413, 1992.
[42] P.E. Blöchl, "Second generation wave function thermostat for ab initio molecular dynamics," Phys. Rev. B, 65, 104303, 2002.
[43] S.C. Watson and E.A. Carter, "Spin-dependent pseudopotentials," Phys. Rev. B, 58, R13309, 1998.
[44] G. Kresse and D. Joubert, "From ultrasoft pseudopotentials to the projector augmented-wave method," Phys. Rev. B, 59, 1758, 1999.
[45] N.A.W. Holzwarth, G.E. Matthews, R.B. Dunning, A.R. Tackett, and Y. Zheng, "Comparison of the projector augmented-wave, pseudopotential, and linearized augmented-plane-wave formalisms for density-functional calculations of solids," Phys. Rev. B, 55, 2005, 1997.
[46] A.R. Tackett, N.A.W. Holzwarth, and G.E. Matthews, "A projector augmented wave (PAW) code for electronic structure calculations. Part I: atompaw for generating atom-centered functions. Part II: pwpaw for periodic solids in a plane wave basis," Comput. Phys. Commun., 135, 329–347, 2001. See also pp. 348–376.
[47] M. Valiev and J.H. Weare, "The projector-augmented plane wave method applied to molecular bonding," J. Phys. Chem. A, 103, 10588, 1999.
[48] P.E. Blöchl, "Electrostatic decoupling of periodic images of plane-wave-expanded densities and derived atomic point charges," J. Chem. Phys., 103, 7422, 1995.
[49] T.K. Woo, P.M. Margl, P.E. Blöchl, and T. Ziegler, "A combined Car–Parrinello QM/MM implementation for ab initio molecular dynamics simulations of extended systems: application to transition metal catalysis," J. Phys. Chem. B, 101, 7877, 1997.
[50] O. Bengone, M. Alouani, P.E. Blöchl, and J. Hugel, "Implementation of the projector augmented-wave LDA+U method: application to the electronic structure of NiO," Phys. Rev. B, 62, 16392, 2000.
[51] B. Arnaud and M. Alouani, "All-electron projector-augmented-wave GW approximation: application to the electronic properties of semiconductors," Phys. Rev. B, 62, 4464, 2000.
[52] D. Hobbs, G. Kresse, and J. Hafner, "Fully unconstrained noncollinear magnetism within the projector augmented-wave method," Phys. Rev. B, 62, 11556, 2000.
[53] H.M. Petrilli, P.E. Blöchl, P. Blaha, and K. Schwarz, "Electric-field-gradient calculations using the projector augmented wave method," Phys. Rev. B, 57, 14690, 1998.
[54] P.E. Blöchl, "First-principles calculations of defects in oxygen-deficient silica exposed to hydrogen," Phys. Rev. B, 62, 6158, 2000.
[55] C.J. Pickard and F. Mauri, "All-electron magnetic response with pseudopotentials: NMR chemical shifts," Phys. Rev. B, 63, 245101, 2001.
[56] F. Mauri, B.G. Pfrommer, and S.G. Louie, "Ab initio theory of NMR chemical shifts in solids and liquids," Phys. Rev. Lett., 77, 5300, 1996.
[57] D.N. Jayawardane, C.J. Pickard, L.M. Brown, and M.C. Payne, "Cubic boron nitride: experimental and theoretical energy-loss near-edge structure," Phys. Rev. B, 64, 115107, 2001.
[58] H. Kageshima and K. Shiraishi, "Momentum-matrix-element calculation using pseudopotentials," Phys. Rev. B, 56, 14985, 1997.
1.7 ELECTRONIC SCALE
James R. Chelikowsky
University of Minnesota, Minneapolis, MN, USA
1. Real-space Methods for Ab Initio Calculations
Major computational advances in predicting the electronic and structural properties of matter come from two sources: improved performance of hardware and the creation of new algorithms, i.e., software. Improved hardware follows technical advances in computer design and electronic components. Such advances are frequently characterized by Moore's Law, which states that computer power will double every two years or so. This law has held true for the past 20 or 30 years, and most workers expect it to hold for the next decade, suggesting that such technical advances can be predicted. In clear contrast, the creation of new high-performance algorithms defies characterization by a similar law, as creativity is clearly not a predictable activity. Nonetheless, over the past half century, most advances in the theory of the electronic structure of matter have been made with new algorithms as opposed to better hardware. One may reasonably expect these advances to continue. Physical concepts such as pseudopotentials and density functional theory, coupled with numerical methods such as iterative diagonalization, have permitted very large systems to be examined, much larger than could be handled solely by the increase allowed by computational hardware advances. Systems with hundreds, if not thousands, of atoms can now be examined, whereas methods of a generation ago might handle only tens of atoms. The development of real-space methods for the electronic structure over the past ten years is a notable advance in high-performance algorithms for solving the electronic structure problem. Real-space methods do not require an explicit basis. The convergence of the method, assuming a uniform grid, can be tested by varying only one parameter: the grid spacing. The method can easily be applied to neutral or charged systems, to extended or localized systems, and to diverse materials such as simple metals, semiconductors,
and transition metals. These methods are also well suited for highly parallel computing platforms as few global communications are required. Review articles on these approaches can be found in Refs. [1–3].
2. The Electronic Structure Problem
Most contemporary descriptions of the electronic structure problem for large systems cast the problem within density functional theory [4]. The many-body problem is mapped onto a one-electron Schrödinger equation called the Kohn–Sham equation [5]. For an atom, this equation can be written as

$$\left[\frac{-\hbar^2\nabla^2}{2m} - \frac{Ze^2}{r} + V_H(\mathbf{r}) + V_{xc}[\mathbf{r}, \rho(\mathbf{r})]\right]\psi_n(\mathbf{r}) = E_n\,\psi_n(\mathbf{r}), \qquad (1)$$

where there are Z electrons in the atom, $V_H$ is the Hartree or Coulomb potential, and $V_{xc}$ is the exchange-correlation potential. The Hartree and exchange-correlation potentials can be determined from the electronic charge density. The eigenvalues and eigenfunctions, $(E_n, \psi_n(\mathbf{r}))$, can be used to determine the total electronic energy of the atom. The density is given by

$$\rho(\mathbf{r}) = -e \sum_{n,\,\mathrm{occup}} |\psi_n(\mathbf{r})|^2. \qquad (2)$$
The summation is over all occupied states. The Hartree potential is then determined by

$$\nabla^2 V_H(\mathbf{r}) = -4\pi e\,\rho(\mathbf{r}). \qquad (3)$$
This term can be interpreted as the electrostatic interaction of an electron with the charge density of the system. The exchange-correlation potential is more problematic. Within density functional theory, one can define an exchange-correlation potential as a functional of the charge density. The central tenet of the local density approximation [5] is that the total exchange-correlation energy may be written as

$$E_{xc}[\rho] = \int \rho(\mathbf{r})\,\epsilon_{xc}(\rho(\mathbf{r}))\,d^3r, \qquad (4)$$
where $\epsilon_{xc}$ is the exchange-correlation energy density. If one has knowledge of the exchange-correlation energy density, one can extract the potential and the total electronic energy of the system. As a first approximation, the exchange-correlation energy density can be extracted from a homogeneous electron gas. It is common practice to separate the exchange and correlation contributions to $\epsilon_{xc}$: $\epsilon_{xc} = \epsilon_x + \epsilon_c$ [4]. It is not difficult to solve the Kohn–Sham equation (Eq. 1) for an atom. The potential, and the charge density, are assumed to be spherically symmetric,
and the Kohn–Sham problem reduces to solving a one-dimensional problem. The Hartree and exchange-correlation potentials can be iterated to form a self-consistent field. Usually the process is so quick for an atom that it can be done on a desktop or laptop computer in a matter of seconds. In three dimensions, as for a complex atomic cluster, liquid, or crystal, the problem is highly nontrivial. One major difficulty is the range of length scales involved. For example, in the case of a multielectron atom, the most tightly bound core electrons can be confined to within ∼0.1 Å, whereas the outer valence electrons may extend over ∼1–5 Å. In addition, the nodal structure of the atomic wave functions is difficult to replicate with a simple basis, especially the cusp in a wave function at the nuclear site, where the Coulomb potential diverges. One approach to this problem is to form a basis combining highly localized functions with extended functions. This approach enormously complicates the electronic structure problem, as valence and core states are treated on an equal footing, whereas such states are not equivalent in terms of their chemical activity. Consider the physical content of the periodic table, i.e., arranging the elements into columns with similar chemical properties. The Group IV elements such as C, Si, and Ge have similar properties because they share an outer $s^2p^2$ configuration. This chemical similarity of the valence electrons is recognized by the pseudopotential approximation [6, 7]. The pseudopotential replaces the "all-electron" potential by one that reproduces only the chemically active, or valence, electrons. Usually, the pseudopotential subsumes the nuclear potential with those of the core electrons to generate an "ion-core potential." As an example, consider a sodium atom, whose core electron configuration is $1s^22s^22p^6$ and whose valence electron configuration is $3s^1$. The charge on the ion-core pseudopotential is +1 (the nuclear charge minus the number of core electrons). Such a pseudopotential will bind only one electron. The length scale of the pseudopotential is now set by the valence electrons alone. This permits a great simplification of the Kohn–Sham problem in terms of choosing a basis. For the purposes of designing an ab initio pseudopotential, let us consider a sodium atom. By solving for the Na atom, we know the eigenvalue, $\epsilon_{3s}$, and the corresponding wave function, $\psi_{3s}(r)$, for the valence electron. We demand several conditions of the Na pseudopotential: (1) the potential binds only the valence electron, the 3s electron in the case at hand; (2) the eigenvalue of the corresponding valence electron is identical to the full-potential eigenvalue (the full potential is also called the all-electron potential); (3) the wave function is nodeless and identical to the "all-electron" wave function outside the core region. For example, we construct a pseudo-wave function, $\phi_{3s}(r)$, such that $\phi_{3s}(r) = \psi_{3s}(r)$ for $r > r_c$, where $r_c$ defines the size spanned by the ion core, i.e., the nucleus and core electrons. For Na, this means the "size" of the $1s^22s^22p^6$
states. Typically, the core radius is taken to be less than the distance corresponding to the maximum of the valence wave function, but greater than the distance of the outermost node. If the eigenvalue, $\epsilon_p$, and the wave function, $\phi_p(r)$, are known from solving the atom, it is possible to invert the Kohn–Sham equation to yield an ion-core pseudopotential, i.e., a pseudopotential that, when screened, will yield the exact eigenvalue and wave function by construction:

$$V^p_{ion}(r) = \epsilon_p + \frac{\hbar^2\nabla^2\phi_p}{2m\,\phi_p} - V_H(r) - V_{xc}[r, \rho(r)]. \qquad (5)$$
Within this construction, the pseudo-wave function, $\phi_p(r)$, should be identical to the all-electron wave function, $\psi_{AE}(r)$, outside the core: $\phi_p(r) = \psi_{AE}(r)$ for $r > r_c$ will guarantee that the pseudo-wave function yields chemical properties similar to those of the all-electron wave function. For $r < r_c$, one may alter the all-electron wave function as one wishes, within certain limitations, and retain the chemical accuracy of the problem. For computational simplicity, we take the wave function in this region to be smooth and nodeless. Another very important criterion is mandated: the integral of the pseudocharge density, i.e., the square of the wave function $|\phi_p(r)|^2$, within the core should be equal to the integral of the all-electron charge density. Without this condition, the pseudo-wave function can differ by a scaling factor from the all-electron wave function, that is, $\phi_p(r) = C \times \psi_{AE}(r)$ for $r > r_c$, where the constant C may differ from unity. Since we expect the chemical bonding of an atom to be highly dependent on the tails of the valence wave functions, it is imperative that the normalized pseudo-wave function be identical to the all-electron wave function. The criterion by which one ensures C = 1 is called norm conservation [2]. An example of a pseudopotential, in this case the Na pseudopotential, is presented in Fig. 1. The ion-core pseudopotential is dependent on the angular momentum component of the wave function. This is apparent from Eq. (5), where $V^p_{ion}$ is "state dependent" or nonlocal. This nonlocal behavior is pronounced for first-row elements, which lack p-states in the core, and for first-row transition metals, which lack d-states in the core. A physical explanation for this behavior can be traced to the orthogonality requirement of the valence wave functions to the core states. This may be illustrated by considering the carbon atom. The 2s of carbon is orthogonal to the 1s state, whereas the 2p state is not required to be orthogonal to a 1p state. As such, the 2s state has a node; the 2p does not. In transforming these states to nodeless pseudo-wave functions, more kinetic energy is associated with the 2s than with the 2p state. The additional kinetic energy cancels the strong Coulombic potential better for the 2s state than for the 2p. In terms of the ion-core pseudopotential, the 2s potential is weaker than the 2p.
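The inversion in Eq. (5) is a one-line operation once the wave function is tabulated. The sketch below (Python/NumPy, Hartree atomic units) uses a synthetic nodeless radial function u(r) = rφ(r) and an assumed eigenvalue, not real Na data, and forms only the screened potential; unscreening would subtract V_H and V_xc constructed from the pseudo-density.

    import numpy as np

    # Synthetic nodeless pseudo radial function u(r) = r*phi(r); NOT real Na data.
    r = np.linspace(1e-3, 12.0, 4000)
    a, eps = 0.6, -0.10                     # decay constant and assumed eigenvalue (a.u.)
    u = r * np.exp(-a * r)

    # Invert the radial equation -u''/2 + V*u = eps*u  =>  V(r) = eps + u''/(2u).
    u2 = np.gradient(np.gradient(u, r), r)  # second derivative by central differences
    v_screened = eps + u2 / (2.0 * u)       # analytically eps + a**2/2 - a/r here

    # Eq. (5): the ion-core pseudopotential then follows by unscreening,
    # V_ion(r) = v_screened(r) - V_H(r) - V_xc(r), both built from the pseudo-density.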
Electronic scale
125
2 1
s-pseudopotential
Potential (Ry)
0 ⫺1 p-pseudopotential
⫺2 d-pseudopotential
⫺3 ⫺4
all electron
⫺5
0
1
2 r (a.u.)
3
4
Figure 1. Pseudopotential compared to the all-electron potential for the sodium atom. This pseudopotential was constructed using the method of Troullier and Martins [8].
In the case of sodium, only three significant components (s, p, and d) are required for an accurate pseudopotential. Note that the d component is the strongest, consistent with the argument that no core states of similar angular momentum exist within the Na core. For more complex systems such as the rare-earth metals, one might need four or more components. In Fig. 2, the 3s state for the all-electron potential is illustrated. It is compared to the lowest s-state for the pseudopotential illustrated in Fig. 1. The Kohn–Sham equation can be rewritten for a pseudopotential as
$$\left[\frac{-\hbar^2\nabla^2}{2m} + V^p_{ion}(\mathbf{r}) + V_H(\mathbf{r}) + V_{xc}[\mathbf{r},\rho(\mathbf{r})]\right]\psi_n(\mathbf{r}) = E_n\,\psi_n(\mathbf{r}), \qquad (6)$$

where $V^p_{ion}$ can be expressed as

$$V^p_{ion}(\mathbf{r}) = \sum_i V^p_{i,ion}(\mathbf{r} - \mathbf{R}_i), \qquad (7)$$

where $V^p_{i,ion}$ is the ionic pseudopotential for the ith atomic species located at position $\mathbf{R}_i$. The charge density in Eq. (6) corresponds to a sum over the wave functions of the occupied valence states.
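Equation (7) amounts to a superposition over atomic sites, as in the sketch below (Python/NumPy with SciPy). The error-function-screened Coulomb form is a generic stand-in for a real local ionic pseudopotential; the positions, valence charge, and core radius are arbitrary illustration values.

    import numpy as np
    from scipy.special import erf

    L, N = 16.0, 48                               # box size (a.u.) and grid points per side
    x = np.linspace(0.0, L, N, endpoint=False)
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")

    atoms = [(6.0, 8.0, 8.0), (10.0, 8.0, 8.0)]   # two atomic positions R_i
    z_val, r_core = 1.0, 1.2                      # valence charge and core radius

    # Eq. (7): total local ionic pseudopotential as a sum over atomic sites.
    V_ion = np.zeros((N, N, N))
    for (ax, ay, az) in atoms:
        d = np.maximum(np.sqrt((X - ax)**2 + (Y - ay)**2 + (Z - az)**2), 1e-12)
        V_ion += -z_val * erf(d / r_core) / d     # finite at the nucleus, Coulombic tail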
Figure 2. Pseudopotential wave functions compared to all-electron wave functions for the sodium atom. The all-electron wave functions are indicated by the dashed lines.
Since the pseudopotential and corresponding wave functions vary slowly in space, a number of simple basis sets are possible, e.g., one could use Gaussians [9] or plane waves [6, 7]. Both methods often work quite well, although each has its limitations. Owing in part to their simplicity and ease of implementation, plane-wave methods have become the method of choice for electronic structure work, especially for simple metals and semiconductors like silicon [7, 10]. Methods based on plane-wave bases are often called "momentum" or "reciprocal" space approaches to the electronic structure problem. Plane-wave approaches utilize a basis of "infinite extent." The extended basis requires special techniques to describe localized systems. For example, suppose one wishes to examine a cluster of silicon atoms. A common approach is to use a "supercell method." The cluster would be placed in a large cell, which is periodically repeated to fill up all space. The electronic structure of this system corresponds to an isolated cluster, provided sufficient "vacuum" surrounds each cluster. This method is very successful and has been used to consider localized systems such as clusters as well as extended systems such as surfaces or liquids [10]. In contrast, one can take a rather dramatic alternative view, eliminate an explicit basis altogether, and solve Eq. (6) completely in real space using
a grid. Real-space or grid methods are typically used for engineering problems, e.g., one might solve for the strain field in an airplane wing using finite element methods. Such methods have not been commonly used for the electronic structure problem. There are at least two reasons for this situation. First, without the pseudopotential method, a nonlinear grid would be needed to describe the singular Coulombic potential near the atomic nucleus and the corresponding cusp in the wave function. This would enormously complicate the problem and destroy the simplicity of the method. Second, the nonlocal nature of the pseudopotential can be easily addressed in grid methods, but until recently the formalism for this task has not been available. Real-space approaches overcome many of the complications involved with an explicit basis, especially for describing nonperiodic systems such as molecules, clusters, and quantum dots. Unlike localized orbitals such as Gaussians, the basis is unbiased. One need not specify whether the basis contains particular angular momentum components. Moreover, the basis is not "attached" to the atomic positions, and no Pulay forces need to be considered [11]. Pulay forces arise from an incomplete basis: as atoms are moved, the basis needs to be recomputed, since the convergence changes with the atomic configuration. Unlike an extended basis such as those based on plane waves, the vacuum is easily described by grid points. In contrast to plane waves, grids are efficient and easy to implement on parallel platforms. Real-space algorithms avoid the use of fast Fourier transforms by performing all calculations in physical space instead of Fourier space. A benefit of avoiding Fourier transforms is that very few global communications are required. Different numerical methods can be used to implement real-space methods, such as finite element or finite difference methods. Both approaches have advantages and liabilities. Finite element methods can easily accommodate nonuniform grids and can reflect the variational principle as the mesh is refined [1]. This is an appropriate approach for systems in which complex boundary conditions exist. For systems where the boundary conditions are simple, e.g., outside a domain the wave function is set to zero, this is not an important consideration. Finite differencing methods are easier to implement than finite element methods, especially with uniform grids. Both approaches have been extensively utilized; however, owing to the ease of implementation, finite differencing methods have been applied to a wider range of materials and properties. For this reason, we will illustrate the finite differencing method. A key aspect of the success of the finite difference method is the availability of higher-order finite difference expansions for the kinetic energy operator, i.e., expansions of the Laplacian [12]. Higher-order finite difference methods significantly improve convergence of the eigenvalue problem when compared with standard finite difference methods. If one imposes a simple, uniform grid
on our system, where the points are described in a finite domain by $(x_i, y_j, z_k)$, one may approximate the Laplacian operator at $(x_i, y_j, z_k)$ by

$$\frac{\partial^2\psi}{\partial x^2} = \sum_{n=-M}^{M} C_n\,\psi(x_i + nh, y_j, z_k) + O(h^{2M+2}), \qquad (8)$$

where h is the grid spacing and M is a positive integer. This approximation is accurate to $O(h^{2M+2})$ under the assumption that ψ can be approximated accurately by a power series in h. Algorithms are available to compute the coefficients $C_n$ for arbitrary order in h [12]. With the kinetic energy operator expanded as in Eq. (8), one can set up the Kohn–Sham equation over a grid. For simplicity, let us assume a uniform grid, but this is not a necessary requirement. $\psi(x_i, y_j, z_k)$ is computed on the grid by solving the eigenvalue problem:
$$\begin{aligned}
&-\frac{\hbar^2}{2m}\Biggl[\sum_{n_1=-M}^{M} C_{n_1}\psi_n(x_i + n_1 h, y_j, z_k) + \sum_{n_2=-M}^{M} C_{n_2}\psi_n(x_i, y_j + n_2 h, z_k) + \sum_{n_3=-M}^{M} C_{n_3}\psi_n(x_i, y_j, z_k + n_3 h)\Biggr] \\
&\quad + \bigl[V_{ion}(x_i, y_j, z_k) + V_H(x_i, y_j, z_k) + V_{xc}(x_i, y_j, z_k)\bigr]\,\psi_n(x_i, y_j, z_k) = E_n\,\psi_n(x_i, y_j, z_k) \qquad (9)
\end{aligned}$$
For L grid points, the size of the full matrix is $L^2$. A uniformly spaced grid in a three-dimensional cube is shown in Fig. 3. Each grid point corresponds to a row in the matrix. However, many points in the cube are far from any atoms in the system and the wave function on these points may be replaced by zero. Special data structures may be used to discard these points and retain only those having a nonzero value for the wave function. The size of the Hamiltonian matrix is usually reduced by a factor of two to three with this strategy, which is quite important considering the large number of eigenvectors which must be saved. Further, since the Laplacian can be represented by a simple stencil, and since all local potentials sum up to a simple diagonal matrix, the Hamiltonian need not be stored. Nonlocality in the pseudopotential, i.e., the "state dependence" of the potential as illustrated in Fig. 1, is easily treated using a plane wave basis in Fourier space, but it may also be calculated in real space. The nonlocality appears only in the angular dependence of the potential and not in the radial coordinate. It is often advantageous to use a more advanced projection scheme, due to Kleinman and Bylander [13]. The interactions between valence electrons and pseudo-ionic cores in the Kleinman–Bylander form may be separated into a local potential and a nonlocal pseudopotential in real space [8], which differs from zero only inside the small core region around each atom.
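The two practical points above (computable high-order coefficients and a Hamiltonian that is applied rather than stored) are captured by the following Python/NumPy sketch. The coefficients C_n of Eq. (8) are obtained from the Taylor-expansion conditions; the Hamiltonian of Eq. (9) is then applied matrix-free. The periodic wrap-around of np.roll is a simplification; a localized system would instead zero the wave function outside the domain.

    import numpy as np
    from math import factorial

    def second_derivative_coeffs(M, h):
        # Impose sum_n C_n (n*h)**k / k! = delta_{k,2} for k = 0..2M,
        # which yields the O(h**(2M+2)) coefficients C_n of Eq. (8).
        n = np.arange(-M, M + 1)
        A = np.array([(n * h) ** k / factorial(k) for k in range(2 * M + 1)])
        b = np.zeros(2 * M + 1)
        b[2] = 1.0
        return n, np.linalg.solve(A, b)

    def apply_hamiltonian(psi, offsets, C, v_local):
        # H*psi = -(1/2)*Laplacian(psi) + V_local*psi (a.u.), stencil applied
        # axis by axis; np.roll imposes periodic boundaries (see lead-in).
        hpsi = v_local * psi
        for axis in range(3):
            for n, c in zip(offsets, C):
                hpsi -= 0.5 * c * np.roll(psi, -n, axis=axis)
        return hpsi

    offsets, C = second_derivative_coeffs(M=4, h=0.5)  # M = 4-6 works well in practice
    psi = np.random.rand(24, 24, 24)
    v_local = np.zeros_like(psi)                       # placeholder local potential
    hpsi = apply_hamiltonian(psi, offsets, C, v_local)

Such a matrix-free apply is exactly what iterative eigensolvers need, which is why the Hamiltonian never has to be stored explicitly.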
Figure 3. Uniform grid illustrating a typical configuration for examining the electronic structure of a localized system. The gray sphere represents the domain where the wave functions are allowed to be nonzero. The light spheres within the domain are atoms.
One can write the Kleinman–Bylander form in real space as

$$V^p_{ion}(\mathbf{r})\,\phi_n(\mathbf{r}) = V_{loc}(|\mathbf{r}_a|)\,\phi_n(\mathbf{r}) + \sum_{a,\,lm} G^a_{n,lm}\, u_{lm}(\mathbf{r}_a)\,\Delta V_l(r_a), \qquad (10)$$

$$G^a_{n,lm} = \frac{1}{\langle \Delta V^a_{lm}\rangle}\int u_{lm}(\mathbf{r}_a)\,\Delta V_l(r_a)\,\psi_n(\mathbf{r})\,d^3r, \qquad (11)$$

where $\langle\Delta V^a_{lm}\rangle$ is the normalization factor,

$$\langle\Delta V^a_{lm}\rangle = \int u_{lm}(\mathbf{r}_a)\,\Delta V_l(r_a)\,u_{lm}(\mathbf{r}_a)\,d^3r, \qquad (12)$$

and $\mathbf{r}_a = \mathbf{r} - \mathbf{R}_a$; the $u_{lm}$ are the atomic pseudopotential wave functions of angular momentum quantum numbers $(l, m)$ from which the $l$-dependent ionic pseudopotential, $V_l(r)$, is generated, and $\Delta V_l(r) = V_l(r) - V_{loc}(r)$ is the difference between the $l$ component of the ionic pseudopotential and the local ionic potential. As a specific example, in the case of Na we might choose the local part of the potential to replicate only the l = 0 component, as defined by the 3s state. The nonlocal parts of the potential would then contain only the l = 1 and l = 2 components. The choice of which angular component is taken as the local part of the potential is somewhat arbitrary. It is often convenient to choose the local potential to correspond to the highest l-component of interest. This
reduces the computational effort associated with the higher l-components [3]. The choice of the local potential can be tested by utilizing different components for the local potential. There are several difficulties with the eigenproblems generated in this application, in addition to the size of the matrices. First, the number of required eigenvectors is proportional to the number of atoms in the system and can grow to thousands. Besides storage, maintaining the orthogonality of these vectors can be a formidable task. Second, the relative separation of the eigenvalues becomes increasingly poor as the matrix size increases, and this has an adverse effect on the rate of convergence of the eigenvalue solvers. Preconditioning techniques attempt to alleviate this problem. A brief review of these approaches can be found in Ref. [3]. The architecture of the Hamiltonian matrix is illustrated in Fig. 4 for a diatomic molecule. Although the details of the matrix structure will be a function of the geometry of the system, the essential elements remain the same. The off-diagonal elements arise from the expansion coefficients in Eq. (8) and the nonlocal potential in Eq. (10). These elements are not updated during the self-consistency cycle. The on-diagonal matrix elements consist of the local ion-core pseudopotential, the Hartree potential, and the exchange-correlation potential. These terms are updated each self-consistency cycle.
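To make the Kleinman–Bylander expressions (10)-(12) concrete, the sketch below (Python/NumPy) applies a single-atom, single-channel nonlocal term on a uniform grid. The reference function u_lm and the potential difference ΔV_l are synthetic, core-localized stand-ins; a real calculation would take them from the pseudopotential generation step.

    import numpy as np

    N, h = 32, 0.5
    x = (np.arange(N) - N // 2) * h
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
    r2 = X**2 + Y**2 + Z**2
    dv = h**3                                    # volume element

    # Synthetic stand-ins: a p-like reference function u_lm and a
    # core-localized potential difference dV_l (both vanish outside the core).
    u_lm = X * np.exp(-r2)
    dV_l = np.exp(-r2)
    psi = (1.0 + 0.5 * X) * np.exp(-0.2 * r2)    # sample wavefunction on the grid

    denom = np.sum(u_lm * dV_l * u_lm) * dv      # normalization <dV_lm>, Eq. (12)
    G = np.sum(u_lm * dV_l * psi) * dv / denom   # projection coefficient, Eq. (11)
    vnl_psi = G * u_lm * dV_l                    # nonlocal contribution to V*psi, Eq. (10)

Because u_lm and dV_l vanish outside the core, vnl_psi is nonzero only near the atom, which is what keeps the nonlocal term cheap in real space.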
Figure 4. Hamiltonian matrix for a diatomic molecule in real space. Nonzero matrix elements are indicated by black dots. The diagonal matrix elements consist of the local ionic pseudopotential, the Hartree potential, and the local density exchange-correlation potential. The off-diagonal matrix elements consist of the coefficients in the finite difference expansion and the nonlocal matrix elements of the pseudopotential. The system contains about 4000 grid points, or 16 million matrix elements.
Figure 5. Potentials and wave functions for the oxygen dimer molecule. The total electronic potential is shown on the left along a ray connecting the two oxygen atoms. The Kohn–Sham molecular orbitals are shown on the right side of the figure. The orbitals on the left are from a real space calculation and the ones on the right from a plane wave calculation.
While the Hamiltonian matrix in real space can be large, it never needs to be explicitly saved. Also, the matrix is sparse; the sparsity is a function of M (see Eq. 8), the order of the higher-order difference expansion. For larger values of M, the grid can be made coarser; however, this reduces the sparsity of the matrix. Conversely, if we use standard finite difference methods, the matrix is sparser, but the grid must be finer to retain the same accuracy. In practice, a value of M = 4–6 appears to work very well. There is a close relationship between the plane-wave method and real-space methods. For example, one can always perform a Fourier transform on a real-space method and obtain results in reciprocal space, or perform the operation in reverse to go from Fourier space to real space. In this sense, higher-order finite differences can be considered an abridged Fourier transform, as one does not sum over all grid points in the mesh. As a rough measure of the convergence of real-space methods, one can consider a Fourier-component or plane-wave cutoff of $(\pi/h)^2$ for a grid spacing h. Using this criterion, a grid spacing of h = 0.5 a.u. (1 a.u. = 0.529 Å, or one bohr unit of length) would correspond to a plane-wave cutoff of approximately 40 Ry. In Fig. 5, a comparison is made between the plane-wave supercell method and a real-space method for the oxygen dimer. The oxygen dimer is a difficult
molecular species for pseudopotentials, as the potential is rather deep and quite nonlocal compared to second-row elements such as silicon. The total local electronic potential is depicted along a ray containing the oxygen atoms [14]. Also shown are the Kohn–Sham one-electron orbitals. The agreement between the two methods is quite good; the differences are certainly smaller than the uncertainties involved in the local density approximation. The most noticeable difference in the potential occurs at the nuclear positions. At these points, the atomic pseudopotentials are quite strong and the variation in the wave function requires a fine mesh. However, it is important to note that this spatial regime is removed from the bonding region of the molecule. A survey of cluster and molecular species using both plane-wave and real-space methods confirms that the accuracy of the two methods is comparable, but the real-space method is easier to implement [14].
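The rough grid-spacing/cutoff correspondence used above is a one-line computation (Rydberg units, where a plane wave of wavevector k in inverse bohr carries kinetic energy k^2 Ry):

    import numpy as np

    for h in (0.7, 0.5, 0.3):        # grid spacings in a.u.
        print(f"h = {h:.1f} a.u.  ->  E_cut ~ (pi/h)^2 = {(np.pi / h)**2:5.1f} Ry")

    # h = 0.5 a.u. gives ~39.5 Ry, the approximately 40 Ry quoted in the text.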
3. Outlook
The focus of the electronic structure problem will likely not reside in solving for the energy bands of ordered solids. The energy band structure of crystalline matter, especially of elemental solids, has largely been exhausted. This is not to say that elemental solids are no longer of interest; certainly, interest in these materials will continue as testing grounds for new electronic structure methods. However, interest in nonperiodic systems such as amorphous solids, liquids, glasses, clusters, and nanoscale quantum dots is now a major focus of the electronic structure problem. Perhaps this is the greatest challenge for electronic structure methods: systems with many electronic and nuclear degrees of freedom and little or no symmetry. Often the structure of these materials is unknown, and the materials properties may be a strong function of temperature. Real-space methods offer a new avenue for these large and complex systems. As an illustration of the potential of these methods, consider the example of quantum dots. In Fig. 6, we illustrate hydrogenated Ge clusters. These clusters are composed of bulk fragments of Ge whose dangling bonds are capped with hydrogen. The hydrogen passivates any electronically active dangling bonds. The larger clusters correspond to quantum dots, i.e., semiconductor fragments whose surface properties have been removed but whose optical properties are dramatically altered by quantum confinement. It is well known that these systems have optical gaps much larger than that of the bulk crystal. The optical spectra of such clusters are shown in Fig. 7. The largest cluster illustrated contains over 800 atoms, although even larger clusters have been examined. Clusters of this size would be difficult to examine with traditional methods. Although these calculations were done with a ground-state method, the general shape of the spectra is correct, and the evolution of the
Figure 6. Hydrogenated germanium clusters ranging from germane (GeH4) to Ge147H100.
Figure 7. Photoabsorption spectra for hydrogenated germanium quantum dots. The labels $E_0$, $E_1$, and $E_2$ refer to optical features.
spectra appears bulk-like by a few hundred atoms. Surfaces, clusters, magnetic systems, and complex solids have also been treated with real-space methods [1, 15]. Finally, as systems approach the macroscopic limit, it is common to employ finite element or finite difference methods to describe material properties. One would like to couple these methods to those appropriate at the quantum (or nano) limit. The use of real-space methods at these opposite limits would be a natural choice, and some attempts along these lines exist. For example, fracture methods often divide up a problem by treating the fracture tip with quantum mechanical methods, the surrounding area by molecular dynamics, and the medium away from the tip by continuum mechanics [16].
References
[1] T.L. Beck, "Real-space mesh techniques in density functional theory," Rev. Mod. Phys., 74, 1041, 2000.
[2] J.R. Chelikowsky, "The pseudopotential-density functional method applied to nanostructures," J. Phys. D: Appl. Phys., 33, R33, 2000.
[3] C. Le Bris (ed.), Handbook of Numerical Analysis (Devoted to Computational Chemistry), vol. X, Elsevier, Amsterdam, 2003.
[4] S. Lundqvist and N.H. March (eds.), Theory of the Inhomogeneous Electron Gas, Plenum, New York, 1983.
[5] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133, 1965.
[6] W. Pickett, "Pseudopotential methods in condensed matter applications," Comput. Phys. Rep., 9, 115, 1989.
[7] J.R. Chelikowsky and M.L. Cohen, "Ab initio pseudopotentials for semiconductors," in T.S. Moss and P.T. Landsberg (eds.), Handbook of Semiconductors, 2nd edn., Elsevier, Amsterdam, 1992.
[8] N. Troullier and J.L. Martins, "Efficient pseudopotentials for plane-wave calculations," Phys. Rev. B, 43, 1993, 1991.
[9] J.R. Chelikowsky and S.G. Louie, "First principles linear combination of atomic orbitals method for the cohesive and structural properties of solids: application to diamond," Phys. Rev. B, 29, 3470, 1984.
[10] J.R. Chelikowsky and S.G. Louie (eds.), Quantum Theory of Materials, Kluwer, Dordrecht, 1996.
[11] P. Pulay, "Ab initio calculation of force constants and equilibrium geometries," Mol. Phys., 17, 197, 1969.
[12] B. Fornberg and D.M. Sloan, "A review of pseudospectral methods for solving partial differential equations," Acta Numerica, 94, 203, 1994.
[13] L. Kleinman and D.M. Bylander, "Efficacious form for model pseudopotentials," Phys. Rev. Lett., 48, 1425, 1982.
[14] J.R. Chelikowsky, N. Troullier, and Y. Saad, "The finite-difference-pseudopotential method: electronic structure calculations without a basis," Phys. Rev. Lett., 72, 1240, 1994.
[15] J. Bernholc, "Computational materials science: the era of applied quantum mechanics," Phys. Today, 52, 30, 1999.
[16] A. Nakano, M.E. Bachlechner, R.K. Kalia, E. Lidorikis, P. Vashishta, G.Z. Voyiadjis, T.J. Campbell, S. Ogata, and F. Shimojo, "Multiscale simulation of nanosystems," Comput. Sci. Eng., 3, 56, 2001.
1.8 AN INTRODUCTION TO ORBITAL-FREE DENSITY FUNCTIONAL THEORY
Vincent L. Lignères¹ and Emily A. Carter²
¹ Department of Chemistry, Princeton University, Princeton, NJ 08544, USA
² Department of Mechanical and Aerospace Engineering and Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
Given a quantum mechanical system of N electrons and an external potential (which typically consists of the potential due to a collection of nuclei), the traditional approach to determining its ground-state energy involves the optimization of the corresponding wavefunction, a function of 3N variables (not counting spin). As the number of particles increases, the computation quickly becomes prohibitively expensive. Electrons are indistinguishable, however, so one could intuitively expect that the electron density – N times the probability of finding any one electron in a given region of space – might be enough to obtain all properties of interest about the system. Using the electron density as the sole variable would reduce the dimensionality of the problem from 3N to 3, thus drastically simplifying quantum mechanical calculations. This is in fact possible, and it is the goal of orbital-free density functional theory (OF-DFT). For a system of N electrons in an external potential V_ext, the total energy E can be expressed as a functional of the density ρ [1], taking on the following form:

$$E[\rho] = F[\rho] + \int_\Omega V_{\rm ext}(\mathbf{r})\,\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} \qquad (1)$$
Here, Ω denotes the system volume considered, while F is the universal functional that contains all the information about how the electrons behave and interact with one another. The actual form of F is currently unknown and one has to resort to approximations in order to evaluate it. Traditionally, it is split into kinetic and potential energy contributions, the exact forms of which are also unknown. Kohn and Sham first proposed replacing the exact kinetic energy of an interacting electron system with that of a noninteracting system described by a single determinantal wavefunction that gives rise to the same density [2]. This approach is general and remarkably accurate but involves the introduction of one-electron orbitals.

$$E[\rho] = T_{\rm KS}[\phi_1,\ldots,\phi_N] + \int_\Omega V_{\rm ext}(\mathbf{r})\,\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} + J[\rho] + E_{\rm xc}[\rho] \qquad (2)$$

T_KS denotes the Kohn–Sham (KS) kinetic energy for a system of N noninteracting electrons (i.e., for the case of noninteracting electrons, a single-determinantal wavefunction is the exact solution), the φ_i are the corresponding one-electron orbitals, J is the classical electron–electron repulsion, and E_xc is a correction term that should account for electron exchange, electron correlation, and the difference in kinetic energy between the interacting and noninteracting systems. If the φ_i are orthonormal, T_KS has the following explicit form:

$$T_{\rm KS} = -\frac{1}{2}\sum_{i=1}^{N}\int_\Omega \phi_i^*(\mathbf{r})\,\nabla^2\phi_i(\mathbf{r})\,\mathrm{d}\mathbf{r} \qquad (3)$$
Unfortunately, the required orthogonalization of these orbitals makes the computational time scale cubically with the number of electrons. Although linear-scaling KS algorithms exist, they require some degree of localization in the orbitals and, for this reason, are not applicable to metallic systems [3]. For condensed matter systems, the KS method has another bottleneck: the need to sample the Brillouin zone for the wavefunction (also called "k-point sampling") can add several orders of magnitude in cost to the computation. Thus, a further advantage of OF-DFT is that, without a wavefunction, this very expensive computational prefactor of the number of k-points is completely absent from the calculation. At this point, many general, efficient, and often accurate functionals are available to handle every term in Eq. (2) as a functional of the electron density alone, except for the kinetic energy. The development of a generally applicable, accurate, linear-scaling kinetic-energy density functional (KEDF) would remove the last bottleneck in DFT computations and enable researchers to study much larger systems than are currently accessible. In the following, we will focus our discussion on such functionals.
1. General Overview

Historically, the first attempt at approximating the kinetic energy assumes a uniform, noninteracting electron gas [4, 5] and is known as the Thomas–Fermi (TF) model for a slowly varying electron gas.

$$T_{\rm TF} = \frac{3}{10}\,(3\pi^2)^{2/3}\int_\Omega \rho(\mathbf{r})^{5/3}\,\mathrm{d}\mathbf{r} \qquad (4)$$
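To make Eq. (4) concrete, the following minimal Python sketch (our own illustration, not part of the original chapter; the box, grid, and test density are arbitrary choices) evaluates T_TF in hartree atomic units for a density sampled on a uniform real-space grid:

import numpy as np

C_TF = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)   # 3/10 (3 pi^2)^(2/3), atomic units

def t_thomas_fermi(rho, dv):
    # Eq. (4): T_TF = (3/10)(3 pi^2)^(2/3) * integral of rho^(5/3) over the volume
    return C_TF * np.sum(rho ** (5.0 / 3.0)) * dv

# Example: a homogeneous gas in a cubic box, for which TF is exact by construction
L, n = 10.0, 32                      # box length (bohr) and grid points per axis
dv = (L / n) ** 3                    # volume element of one grid cell
rho = np.full((n, n, n), 0.01)       # constant density, electrons/bohr^3
print(t_thomas_fermi(rho, dv))       # equals C_TF * 0.01**(5/3) * L**3

For the uniform gas the sum reduces exactly to C_TF ρ₀^(5/3) L³; for a nonuniform ρ the same two lines implement the local-density quadrature of Eq. (4), at a cost strictly linear in the number of grid points.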
The model, although crude, constitutes a reasonable first approximation to the kinetic energy of periodic systems. It fails for atoms and molecules, however, as it predicts no shell structure, no interatomic bonding, and the wrong behavior for ρ at the r = 0 and r = +∞ limits. We will discuss some ways to improve this model later. A deeper look at Eq. (3) reveals another approach to describing the kinetic energy as a functional of the density. Within the Hartree–Fock (HF) approximation [6], we have

$$\rho(\mathbf{r}) = \sum_{i=1}^{N}\phi_i^*(\mathbf{r})\,\phi_i(\mathbf{r}) \qquad (5a)$$

$$\rho(\mathbf{r}) = \sum_{i=1}^{N}\rho_i(\mathbf{r}) \qquad (5b)$$
so that, using the hermiticity of the gradient operator and acting on Eq. (5), we obtain

$$\nabla^2\rho(\mathbf{r}) = 2\sum_{i=1}^{N}\left[\phi_i^*(\mathbf{r})\,\nabla^2\phi_i(\mathbf{r}) + \nabla\phi_i^*(\mathbf{r})\cdot\nabla\phi_i(\mathbf{r})\right] \qquad (6)$$
Rearranging Eq. (6), integrating over Ω, and substituting Eq. (3) into Eq. (6) yields

$$T_{\rm KS} = -\frac{1}{4}\int_\Omega \nabla^2\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} + \frac{1}{2}\sum_{i=1}^{N}\int_\Omega \nabla\phi_i^*(\mathbf{r})\cdot\nabla\phi_i(\mathbf{r})\,\mathrm{d}\mathbf{r} \qquad (7)$$
Multiplying and dividing every term of the sum by ρ_i naturally introduces ∇ρ_i,

$$T_{\rm KS} = -\frac{1}{4}\int_\Omega \nabla^2\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} + \frac{1}{8}\sum_{i=1}^{N}\int_\Omega \frac{|\nabla\rho_i(\mathbf{r})|^2}{\rho_i(\mathbf{r})}\,\mathrm{d}\mathbf{r} \qquad (8)$$
but does not provide a form for which the sum can be evaluated simply. Nevertheless, the first term can be rewritten, via the divergence theorem, as the integral of the gradient of the density over the boundary of space.

$$\int_\Omega \nabla^2\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} = \oint_{\partial\Omega} \nabla\rho(\mathbf{r})\cdot\mathrm{d}\mathbf{S} \qquad (9)$$
For a finite system, the gradient of the density vanishes at large distances, and for a periodic system the gradients on opposite sides of a periodic cell cancel each other out, so that this integral evaluates to zero in both cases. Finally, for a one-orbital system, we obtain the following exact expression for the kinetic energy [7].

$$T_{\rm VW} = \frac{1}{8}\int_\Omega \frac{|\nabla\rho(\mathbf{r})|^2}{\rho(\mathbf{r})}\,\mathrm{d}\mathbf{r} \qquad (10)$$
Although only exact for up to two electrons, the von Weizsäcker (VW) functional is an essential component of the true kinetic energy and provides a good first approximation in the case of quickly varying densities such as those of atoms and molecules. Unfortunately, the total energy corresponding to the ground-state electron density has the same magnitude as the exact kinetic energy. Consequently, errors made in approximating the kinetic energy have a dramatic impact on the total energy and, by extension, on the ground-state electron density computed by minimization. Unlike the exchange-correlation energy functionals, which represent a much smaller component of the total energy, kinetic-energy functionals must be highly accurate in order to achieve consistently accurate energy predictions.
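Since Eq. (10) is exact for a one-orbital density, a useful sanity check (our own example, not from the original text; hartree atomic units throughout) is the hydrogen 1s density ρ(r) = e^(−2r)/π, whose exact kinetic energy is 0.5 hartree. A simple radial-grid evaluation in Python:

import numpy as np

r = np.linspace(1e-6, 30.0, 200_000)        # uniform radial grid (bohr)
dr = r[1] - r[0]
rho = np.exp(-2.0 * r) / np.pi              # hydrogen 1s density
drho = np.gradient(rho, r)                  # |grad rho| for a spherical density
w = 4.0 * np.pi * r**2                      # spherical shell volume element

t_vw = np.sum(w * drho**2 / rho) * dr / 8.0                  # Eq. (10)
c_tf = 0.3 * (3.0 * np.pi**2) ** (2.0 / 3.0)
t_tf = c_tf * np.sum(w * rho ** (5.0 / 3.0)) * dr            # Eq. (4)

print(t_vw)   # ~0.5, the exact value for this one-orbital system
print(t_tf)   # ~0.29, illustrating how badly TF fails for an atom

The same few lines, applied to any single-orbital density, reproduce the exact kinetic energy; the TF value printed alongside quantifies the failure for atoms discussed above.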
2. KEDFs for Finite Systems
In the case of a finite system such as a single atom, a few molecules in the gas phase, or a cluster, the electron density varies extremely rapidly near the nuclei, making the TF functional inadequate. Although many corrections have been suggested to improve upon the TF results for atoms, these modifications only yield acceptable results when densities obtained from a different method are used, usually HF. Left to determine their own densities self-consistently, these corrections still predict no shell structure for atoms. Nevertheless, the TF functional, or some fraction of it, may still be useful as a corrective term, as we will see later. Going back to the KS expression from Eq. (8), we introduce

$$n_i(\mathbf{r}) = \frac{\rho_i(\mathbf{r})}{\rho(\mathbf{r})} \qquad (11)$$

which, when multiplying both sides by ρ(r) and taking the gradient, yields

$$\nabla\rho_i(\mathbf{r}) = n_i(\mathbf{r})\,\nabla\rho(\mathbf{r}) + \rho(\mathbf{r})\,\nabla n_i(\mathbf{r}) \qquad (12)$$

Substituting Eq. (12) into Eq. (8) gives the following expression:

$$T_{\rm KS} = \frac{1}{8}\sum_{i=1}^{N}\int_\Omega \frac{\left(n_i(\mathbf{r})\,\nabla\rho(\mathbf{r}) + \rho(\mathbf{r})\,\nabla n_i(\mathbf{r})\right)^2}{n_i(\mathbf{r})\,\rho(\mathbf{r})}\,\mathrm{d}\mathbf{r} \qquad (13)$$
The product is expanded into three sums and reorganized as

$$T_{\rm KS} = \frac{1}{8}\int_\Omega \left[\frac{|\nabla\rho(\mathbf{r})|^2}{\rho(\mathbf{r})}\sum_{i=1}^{N} n_i(\mathbf{r}) + 2\,\nabla\rho(\mathbf{r})\cdot\sum_{i=1}^{N}\nabla n_i(\mathbf{r}) + \rho(\mathbf{r})\sum_{i=1}^{N}\frac{|\nabla n_i(\mathbf{r})|^2}{n_i(\mathbf{r})}\right]\mathrm{d}\mathbf{r} \qquad (14)$$
From Eq. (11), it follows immediately that

$$\sum_{i=1}^{N} n_i(\mathbf{r}) = 1 \qquad (15)$$

and so, making use of the linearity of the gradient operator in the second term of Eq. (14),

$$\sum_{i=1}^{N}\nabla n_i(\mathbf{r}) = \nabla\sum_{i=1}^{N} n_i(\mathbf{r}) = \nabla(1) = 0 \qquad (16)$$

the expression further simplifies to

$$T_{\rm KS} = \int_\Omega \frac{|\nabla\rho(\mathbf{r})|^2}{8\,\rho(\mathbf{r})}\,\mathrm{d}\mathbf{r} + \int_\Omega \rho(\mathbf{r})\sum_{i=1}^{N}\frac{|\nabla n_i(\mathbf{r})|^2}{8\,n_i(\mathbf{r})}\,\mathrm{d}\mathbf{r} \qquad (17)$$
As every quantity in the second integral is positive, we can conclude that the VW functional (the first term in Eq. 17) constitutes a lower bound on the noninteracting kinetic energy. This makes physical sense anyway, as we know that the VW kinetic energy is exact for any one-orbital system (one or two electrons, or any number of bosons). Any other orbital introduced will have to be orthogonal to the first. This introduces nodes in the wavefunction, which raises the kinetic energy of the entire system. Therefore, further improvements upon the VW model involve adding an extra term to take into account the larger kinetic energy in the regions of space in which more than one orbital is significant. Far away from the molecule, only one orbital tends to dominate the picture and the VW functional is accurate enough to account for the relatively small contribution of these regions to the total kinetic energy. Most of the deviation from the exact, noninteracting kinetic energy is located close to the nuclei, in the core region of atoms. Corrections based on adding some fraction of the TF functional to the VW have been proposed (see, for instance, Ref. [8]), but only when nonlocal functionals (those depending on more than one point in space, e.g., r and r′) are introduced is a convincing shell structure observed for atomic densities [9]. Even without such correction terms, the TF and VW functionals may still be enough to obtain an accurate description of the system in some limited cases. For instance, Wesolowski and Warshel used a simple, orbital-free KEDF to describe water molecules as a solvent for a quantum-chemically treated water molecule solute [10]. They were able to reproduce the solvation free energy of water accurately using this method. Although this result is encouraging, the ultimate goal of OF-DFT is to determine a KEDF that would be accurate even without the backup provided by the traditional quantum-mechanical method. One key to judging the quality of a given functional is to express it in terms of its kinetic-energy density.

$$T[\rho] = \int_\Omega t(\rho(\mathbf{r}))\,\mathrm{d}\mathbf{r} \qquad (18)$$

The KS functional as it is expressed in Eq. (3) uniquely defines its kinetic-energy density. Certainly, if a given functional can reproduce the KS kinetic-energy density faithfully, it must reproduce the total energy also. Any functional that differs from it by a function that integrates to 0 over the entire system – like, for instance, the Laplacian of the density – will match the KS energy just as well but not the KS kinetic-energy density. For the VW functional, for instance, the corresponding kinetic-energy density should include a Laplacian contribution:

$$T_{\rm VW} = \int_\Omega t_{\rm VW}(\rho)\,\mathrm{d}\mathbf{r} \qquad (19)$$

$$t_{\rm VW}(\rho) = -\frac{1}{4}\,\nabla^2\rho(\mathbf{r}) + \frac{|\nabla\rho(\mathbf{r})|^2}{8\,\rho(\mathbf{r})} \qquad (20)$$
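The statement that the Laplacian term changes the kinetic-energy density but not the total (Eqs. (9) and (20)) is easy to verify numerically; the following sketch (our own illustration, with an arbitrary Gaussian test density) integrates both forms on a radial grid:

import numpy as np

r = np.linspace(1e-6, 20.0, 100_000)
dr = r[1] - r[0]
rho = np.exp(-r**2)                          # Gaussian test density (arbitrary)
d1 = np.gradient(rho, r)
d2 = np.gradient(d1, r)
lap = d2 + 2.0 * d1 / r                      # Laplacian of a spherical function
w = 4.0 * np.pi * r**2                       # shell volume element

print(np.sum(w * lap) * dr)                                   # ~0, cf. Eq. (9)
print(np.sum(w * d1**2 / (8.0 * rho)) * dr)                   # Eq. (10) form
print(np.sum(w * (-0.25 * lap + d1**2 / (8.0 * rho))) * dr)   # Eq. (20) form

Both kinetic-energy densities integrate to the same total; only the local values differ, which is precisely why matching the total energy alone cannot discriminate between them.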
OF-DFT has experienced its most encouraging successes for periodic systems using a different class of kinetic energy functionals described below. These achievements led to attempts to use this alternative class of functionals for nonperiodic systems as well. Choly and Kaxiras recently proposed a method to approximate such functionals and adapt them for nonperiodic systems [11]. If successful, their method may further enlarge the range of applications where currently available functionals yield physically reasonable results.
3. KEDFs for Periodic Systems
If the system exhibits translational invariance, or can be approximated using a system that does, it becomes advantageous to introduce periodic boundary conditions and thus reduce the size of the infinite system to a small number of atoms in a finite volume. A plane-wave basis set expansion most naturally describes the electron density under these conditions. As an additional advantage, quantities can be computed either in real or reciprocal space, by performing fast Fourier transforms (FFTs) on the density represented on a uniform grid. The number of functions necessary to describe the electron density in a given system is highly dependent upon the rate of fluctuation of said density. Densities that vary quickly in real space need more plane waves, which translates into larger reciprocal-space grids and, consequently, into finer real-space meshes. Unfortunately, in real systems, electrons tend to stay mostly
around atomic nuclei and only occasionally venture in the interatomic regions of space. This makes the total electron density vary extremely rapidly close to the nuclei, in the core region of space. Consequently, an extremely large number of plane waves would be necessary to describe the total electron density. One can get around this problem by realizing that the core region density is often practically invariant upon physical and chemical change. This observation is similar to the realization that only valence shell electrons are involved in chemical bonding. The valence electron density varies a lot less rapidly than the total density, so that if the core electrons could be removed, one could drastically reduce the total number of plane waves required in the basis set. Of course, the influence of the core electrons on the geometry and energy of the system must still be accounted for. This is done by introducing pseudopotentials that mimic the presence of core electrons and the nuclei. Obviously, if one is interested in any properties that require an accurate description of the electron density near the nuclei of a system, such pseudopotential-based methods will be inappropriate. Each chemical element present in the system must be represented by its own unique pseudopotential, which is typically constructed as follows. First, an all-electron calculation on an atom is performed to obtain the valence eigenvalues and wavefunctions that one seeks to reproduce within a pseudopotential calculation. Then, the oscillations of the valence wavefunction in the core region are smoothed out to create a “pseudowavefunction,” which is then used to invert the KS equations for the atom to obtain the pseudopotential that corresponds to the pseudowavefunction, subject to the constraint that the allelectron eigenvalues are reproduced. Typically, this is done for each angular momentum channel, so that one obtains a pseudopotential that has an angular dependence, usually expressed as projection operators involving the atomic pseudowavefunctions. Such a pseudopotential is referred to as “nonlocal,” because it is not simply a function of the distance from the nucleus, but also depends on the angular nature of the wavefunction it acts upon. In other words, when a nonlocal pseudopotential acts on a wavefunction, s-symmetry orbitals will be subject to a different potential than p-symmetry orbitals, etc. (as in the exact solution to the Schroedinger equation for a one-electron atom or ion). This affords a nonlocal pseudopotential enough flexibility so that it is quite accurate and transferable to a diverse set of environments. The above discussion presents a second significant challenge for OF-DFT beyond kinetic energy density functionals, since nonlocal pseudopotentials cannot be employed in OF-DFT, because no wavefunction exists to be acted upon by the orbital-based projection operators intrinsic to nonlocal pseudopotentials. In the case of an orbital-free description of the density, the pseudopotentials must be local (depending only on one point in space) and spherically symmetrical around the atomic nucleus. Thus, in OF-DFT, the challenge is to
construct accurate and transferable local pseudopotentials for each element. An attempt in this direction specifically for OF-DFT was made by Madden and coworkers, where the OF-DFT equation

$$\frac{\delta T_{\rm KS}}{\delta\rho} + V_{\rm ext} + \frac{\delta J}{\delta\rho} + \frac{\delta E_{\rm xc}}{\delta\rho} = \mu \qquad (21)$$
is inverted to find a local pseudopotential (the second term on the left-hand side of Eq. (21)) that reproduces a crystalline density derived from a KS calculation using a nonlocal pseudopotential [12]. Here the terms on the left-hand side of Eq. (21) are the density functional variations of the same terms given in Eq. (2), except that in OF-DFT, TKS will be a functional of the density only and not of the orbitals. On the right-hand side is µ, the chemical potential. This method yielded promising results for alkali and alkaline earth metals, but was not extended beyond such elements because inherent to the method was the assumption and use of a given approximate kinetic energy density functional. Hence the pseudopotential had built into it the success and/or failure associated with any given choice of kinetic energy functional. A related approach for constructing local pseudopotentials based on embedding an ion in an electron gas was proposed by Anta and Madden; this method yielded improved results for liquid Li, for example [13]. More recently, Zhou et al. proposed that improved local pseudopotentials for condensed matter could be obtained by inverting not the OF-DFT equations but instead the KS equations so that the exact kinetic energy could be used in the inversion procedure. This was done subject to the constraint of reproducing accurate crystalline electron densities, using a modified version of the method developed by Wang and Parr for the inversion procedure [14]. Zhou et al. showed that a local pseudopotential could be constructed in this way that, e.g., for silicon, yielded bulk properties for both semiconducting and metallic phases in excellent agreement with predictions by a nonlocal pseudopotential within the KS theory. This bulk-derived local pseudopotential also exhibited improved transferability over those derived from a single atomic density. In principle, Zhou et al.’s approach is a general scheme applicable to all elements, since the exact kinetic energy is utilized [15]. With local pseudopotentials now in hand, we turn our attention back to calculating accurate valence electron densities via kinetic-energy density functionals within OF-DFT. The valence electron density in condensed matter can be viewed as fluctuating around an average value that corresponds to the total number of electrons spread homogeneously over the system. If this were exactly the case, we would have a uniform electron gas for which the kinetic energy is described exactly by the TF functional in Eq. (4) with a constant density. For an inhomogeneous density, the TF functional still constitutes an
appropriate starting point and is the zeroth-order term of the conventional gradient expansion (CGE) [16].

$$T_{\rm KS}[\rho] = T_{\rm TF}[\rho] + T^{(2)}[\rho] + T^{(4)}[\rho] + T^{(6)}[\rho] + \cdots \qquad (22)$$

Here, T^(2), T^(4), and T^(6) correspond to the second-, fourth-, and sixth-order corrections, respectively. All odd-order corrections are zero. The second-order correction is found to be one ninth of the VW kinetic energy, while the fourth-order term is [17]:

$$T^{(4)}[\rho] = \frac{1}{540\,(3\pi^2)^{2/3}}\int_\Omega \rho^{1/3}\left[\frac{(\nabla^2\rho)^2}{\rho^2} - \frac{9}{8}\,\frac{\nabla^2\rho\,(\nabla\rho)^2}{\rho^3} + \frac{(\nabla\rho)^4}{3\,\rho^4}\right]\mathrm{d}\mathbf{r} \qquad (23)$$

Starting with the sixth-order term, all further corrections diverge for quickly varying or exponentially decaying densities [18]. Moreover, the fourth-order correction constitutes only a minor improvement over the second-order term, and its potential δT^(4)[ρ]/δρ also diverges for quickly varying or exponentially decaying densities. Usually, then, the CGE is truncated at second order as

$$T_{\rm CGE}[\rho] = T_{\rm TF}[\rho] + \tfrac{1}{9}\,T_{\rm VW}[\rho] \qquad (24)$$
For slowly varying densities, this truncation is reasonable. For the nearly-free electron gas, linear response theory can provide an additional constraint on the kinetic-energy functional [19].
$$\hat{F}\!\left[\left.\frac{\delta^2 T[\rho]}{\delta\rho^2}\right|_{\rho_0}\right] = -\frac{1}{\chi_{\rm Lind}} = \left[\frac{1}{2} + \frac{1-\eta^2}{4\eta}\,\ln\left|\frac{1+\eta}{1-\eta}\right|\right]^{-1} \qquad (25)$$
Here F̂ denotes the Fourier transform, the functional derivative is evaluated at a reference density ρ0, and χ_Lind is the Lindhard susceptibility function, the expression for which is detailed on the right-hand side (given here in units of the Thomas–Fermi response π²/k_F), where η = q/2k_F, q is the reciprocal-space wave vector, and k_F = (3π²ρ0)^(1/3). Although the exact susceptibility is known in this case, the actual kinetic-energy functional is not. Its behavior at the small and large q limits can be evaluated, however. The exact linear response matches the CGE only for very slowly varying densities, which correspond to small values of q.
$$\lim_{\eta\to 0}\hat{F}\!\left[\left.\frac{\delta^2 T[\rho]}{\delta\rho^2}\right|_{\rho_0}\right] = \lim_{\eta\to 0}\hat{F}\!\left[\left.\frac{\delta^2\left(T_{\rm TF}[\rho] + \tfrac{1}{9}T_{\rm VW}[\rho]\right)}{\delta\rho^2}\right|_{\rho_0}\right] \qquad (26)$$
In the limit of infinitely quickly varying densities or the large q limit (LQL), the linear response behavior is very different.
$$\lim_{\eta\to +\infty}\hat{F}\!\left[\left.\frac{\delta^2 T[\rho]}{\delta\rho^2}\right|_{\rho_0}\right] = \lim_{\eta\to +\infty}\hat{F}\!\left[\left.\frac{\delta^2\left(-\tfrac{3}{5}T_{\rm TF}[\rho] + T_{\rm VW}[\rho]\right)}{\delta\rho^2}\right|_{\rho_0}\right] \qquad (27)$$
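The two limits in Eqs. (26) and (27) can be checked directly from the dimensionless Lindhard function of Eq. (25). In the sketch below (our own illustration) everything is expressed in units of the TF response, in which T_TF contributes a constant 1 to F̂[δ²T/δρ²]|ρ0 and T_VW contributes 3η²:

import numpy as np

def f_lind(eta):
    # dimensionless Lindhard function from the right-hand side of Eq. (25)
    return 0.5 + (1.0 - eta**2) / (4.0 * eta) * np.log(abs((1.0 + eta) / (1.0 - eta)))

for eta in (1e-3, 1e3):
    inv_chi = 1.0 / f_lind(eta)           # exact -1/chi_Lind in TF units
    small_q = 1.0 + eta**2 / 3.0          # TF + (1/9) VW, cf. Eq. (26)
    large_q = 3.0 * eta**2 - 0.6          # VW - (3/5) TF, cf. Eq. (27)
    print(eta, inv_chi, small_q, large_q)

At η = 10⁻³ the exact inverse response agrees with TF + (1/9)VW to better than one part in 10⁶, while at η = 10³ it agrees with VW − (3/5)TF; this is the origin of the 1/9 coefficient in Eq. (24) and of the −3/5 coefficient in Eq. (27).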
As we saw before, though, the VW kinetic energy constitutes a lower bound to the kinetic energy. Therefore, here the linear response behavior cannot be correct (we are far from the small perturbations away from the uniform gas limit required in linear response theory) and we can conclude that linear response theory inadequately describes quickly varying densities. Nevertheless, a lot of effort has been made to determine the corresponding kinetic-energy functional. Bridging the gap between the small and large q limits to obtain the linear response kinetic-energy functional involves explicitly enforcing the correct linear response behavior. Pioneering work in this direction by Wang and Teter [20], Perrot [21], and Smargiassi and Madden [22] produced impressive results for many main group metals. A correction term is added to the TF and VW functionals to enforce the linear response.

$$T[\rho] = T_{\rm TF}[\rho] + T_{\rm VW}[\rho] + T_{\rm X}[\rho] \qquad (28)$$
Here TX is the correction, usually a nonlocal functional of the density that can be expressed as a double integral
$$T_{\rm X}[\rho] = \int_\Omega\!\!\int_\Omega \rho^{\alpha}(\mathbf{r})\,w(\mathbf{r}-\mathbf{r}')\,\rho^{\beta}(\mathbf{r}')\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}' \qquad (29)$$
where w is called the response kernel and is adjusted to produce the global linear response behavior, while α and β are functional-dependent parameters. More complex functionals, based either on higher-order response theories (that of Foley and Madden [23], for instance) or on density-dependent kernels (like those of Chacón and coworkers [24] or Wang et al. [25]), can produce more general and transferable results. However, their excellent performance comes with increased computational costs and, in the case of the Chacón functional, with quadratic scaling of the computational time with system size. Nevertheless, computations using these functionals are several orders of magnitude faster than those using the KS kinetic energy. For example, Jesson and Madden performed DFT molecular dynamics simulations of solid and liquid aluminum using the Foley and Madden KEDF, on systems four times larger and for simulation times twice as long [26] as previous KS molecular dynamics studies [27] could consider. Although the melting temperature they predicted was much lower than the experimental value and previous predictions, it appears that their pseudopotential, not their KEDF, was the main source of error. It is important to emphasize that even the best of today's functionals do not exactly match the accuracy of the KS method, exhibiting non-negligible deviations from the KS densities and energies in many cases. This should spur further developments of kinetic-energy density functionals.
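Part of the appeal of the form of Eq. (29) is computational: for a density-independent kernel the double integral is a convolution, so it can be evaluated with FFTs at O(N log N) cost. A schematic Python sketch follows (our own illustration; w_q stands for the kernel already tabulated in reciprocal space on the FFT grid, with the discrete normalization folded in, and α = β = 5/6 is the choice made, e.g., in the Wang–Teter functional):

import numpy as np

def t_x(rho, w_q, dv, alpha=5.0/6.0, beta=5.0/6.0):
    # Eq. (29) with a density-independent kernel:
    # T_X = integral of rho^alpha(r) [w * rho^beta](r) dr, convolution via FFT
    conv = np.fft.ifftn(w_q * np.fft.fftn(rho ** beta)).real
    return dv * np.sum(rho ** alpha * conv)

In a real implementation w_q is constructed once, by requiring that the second functional derivative of T_TF + T_VW + T_X reproduce the Lindhard response of Eq. (25) at the reference density; density-dependent kernels [24, 25] complicate this step but follow the same convolution pattern.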
4. Conclusions and Outlook
Despite more than seventy years of research in this field and some tremendous progress, kinetic-energy density functionals have not yet reached a degree of sophistication that allows their reliable and transferable use for all elements in the periodic table and for all phases of matter. One could easily view the development of accurate descriptions of the kinetic energy in terms of the density alone as the last great frontier of density functional theory. Currently, OF-DFT research is moving from the development of new, approximate functionals to attempting to determine the properties of the exact one [28]. Also, it is becoming clearer that reproducing the KS energy for a given system is not a guarantee of functional accuracy. More efforts have been devoted to trying to reproduce the kinetic-energy density predicted by the KS method at every point in space [29]; one can expect this type of effort to intensify in the future. If highly accurate and general forms for the kinetic-energy density functional are discovered that retain the linear-scaling efficiency of current functionals, OF-DFT will undoubtedly become the quantum-based method of choice for investigating wavefunction-independent properties of large numbers of atoms. Aside from spectroscopic quantities, most properties of interest (e.g., vibrations, forces, dynamical evolution, structure, etc.) do not depend on knowledge of the electronic wavefunction, and hence OF-DFT can be employed. For further reading about advanced technical details in kinetic-energy density functional theory, see Wang and Carter [30].
References

[1] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864–B871, 1964.
[2] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133–A1138, 1965.
[3] S. Goedecker, "Linear scaling electronic structure methods," Rev. Mod. Phys., 71(4), 1085–1123, 1999.
[4] E. Fermi, "Un metodo statistico per la determinazione di alcune proprietà dell'atomo," Rend. Accad. Naz. Lincei, 6, 602–607, 1927.
[5] L.H. Thomas, "The calculation of atomic fields," Proc. Camb. Phil. Soc., 23, 542–548, 1927.
[6] C.C.J. Roothaan, "New developments in molecular orbital theory," Rev. Mod. Phys., 23, 69–89, 1951.
[7] C.F. von Weizsäcker, "Zur Theorie der Kernmassen," Z. Phys., 96, 431–458, 1935.
[8] P.K. Acharya, L.J. Bartolotti, S.B. Sears, and R.G. Parr, "An atomic kinetic energy functional with full Weizsäcker correction," Proc. Natl. Acad. Sci. USA, 77, 6978–6982, 1980.
[9] P. García-González, J.E. Alvarellos, and E. Chacón, "Kinetic-energy density functional: atoms and shell structure," Phys. Rev. A, 54, 1897–1905, 1996.
[10] T. Wesolowski and A. Warshel, "Ab initio free-energy perturbation calculations of solvation free-energy using the frozen density-functional approach," J. Phys. Chem., 98, 5183–5187, 1994.
[11] N. Choly and E. Kaxiras, "Kinetic energy density functionals for non-periodic systems," Solid State Commun., 121, 281–286, 2002.
[12] S. Watson, B.J. Jesson, E.A. Carter, and P.A. Madden, "Ab initio pseudopotentials for orbital-free density functionals," Europhys. Lett., 41, 37–42, 1998.
[13] J.A. Anta and P.A. Madden, "Structure and dynamics of liquid lithium: comparison of ab initio molecular dynamics predictions with scattering experiments," J. Phys.: Condens. Matter, 11, 6099–6111, 1999.
[14] Y. Wang and R.G. Parr, "Construction of exact Kohn–Sham orbitals from a given electron density," Phys. Rev. A, 47, R1591–R1593, 1993.
[15] B. Zhou, Y.A. Wang, and E.A. Carter, "Transferable local pseudopotentials derived via inversion of the Kohn–Sham equations in a bulk environment," Phys. Rev. B, 69, 125109, 2004.
[16] D.A. Kirzhnits, "Quantum corrections to the Thomas–Fermi equation," Sov. Phys. JETP, 5, 64–71, 1957.
[17] C.H. Hodges, "Quantum corrections to the Thomas–Fermi approximation – the Kirzhnits method," Can. J. Phys., 51, 1428–1437, 1973.
[18] D.R. Murphy, "The sixth-order term of the gradient expansion of the kinetic energy density functional," Phys. Rev. A, 24, 1682–1688, 1981.
[19] J. Lindhard, K. Dan. Vidensk. Selsk. Mat.-Fys. Medd., 28, 8, 1954.
[20] L.-W. Wang and M.P. Teter, "Kinetic-energy functional of the electron density," Phys. Rev. B, 45, 13196–13220, 1992.
[21] F. Perrot, "Hydrogen–hydrogen interaction in an electron gas," J. Phys.: Condens. Matter, 6, 431–446, 1994.
[22] E. Smargiassi and P.A. Madden, "Orbital-free kinetic-energy functionals for first-principles molecular dynamics," Phys. Rev. B, 49, 5220–5226, 1994.
[23] M. Foley and P.A. Madden, "Further orbital-free kinetic-energy functionals for ab initio molecular dynamics," Phys. Rev. B, 53, 10589–10598, 1996.
[24] P. García-González, J.E. Alvarellos, and E. Chacón, "Nonlocal symmetrized kinetic-energy density functional: application to simple surfaces," Phys. Rev. B, 57, 4857–4862, 1998.
[25] Y.A. Wang, N. Govind, and E.A. Carter, "Orbital-free kinetic-energy density functionals with a density-dependent kernel," Phys. Rev. B, 60, 16350–16358, 1999.
[26] B.J. Jesson and P.A. Madden, "Ab initio determination of the melting point of aluminum by thermodynamic integration," J. Chem. Phys., 113, 5924–5934, 2000.
[27] G.A. de Wijs, G. Kresse, and M.J. Gillan, "First-order phase transitions by first-principles free-energy calculations: the melting of Al," Phys. Rev. B, 57, 8223–8234, 1998.
[28] T. Gál and Á. Nagy, "A method to get an analytical expression for the non-interacting kinetic energy density functional," J. Mol. Struct., 501–502, 167–171, 2000.
[29] E. Sim, J. Larkin, and K. Burke, "Testing the kinetic energy functional: kinetic energy density as a density functional," J. Chem. Phys., 118, 8140–8148, 2003.
[30] Y.A. Wang and E.A. Carter, "Orbital-free kinetic-energy density functional theory," In: S.D. Schwartz (ed.), Theoretical Methods in Condensed Phase Chemistry, Kluwer, Dordrecht, pp. 117–184, 2000.
1.9 AB INITIO ATOMISTIC THERMODYNAMICS AND STATISTICAL MECHANICS OF SURFACE PROPERTIES AND FUNCTIONS

Karsten Reuter¹, Catherine Stampfl¹,², and Matthias Scheffler¹
¹Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, D-14195 Berlin, Germany
²School of Physics, The University of Sydney, Sydney 2006, Australia
Previous and present "academic" research aiming at atomic-scale understanding is mainly concerned with the study of individual molecular processes possibly underlying materials science applications. In investigations of crystal growth one would, for example, study the diffusion of adsorbed atoms at surfaces, and in the field of heterogeneous catalysis it is the reaction path of adsorbed species that is analyzed. Appealing properties of an individual process are then frequently discussed in terms of their direct importance for the envisioned material function, or reciprocally, the function of materials is often believed to be understandable by essentially one prominent elementary process only. What is often overlooked in this approach is that in macroscopic systems of technological relevance typically a large number of distinct atomic-scale processes take place. Which of them are decisive for observable system properties and functions is then determined not by the detailed individual properties of each process alone; in many, if not most, cases the interplay of all processes, i.e., how they act together, also plays a crucial role. For a predictive materials science modeling with microscopic understanding, a description that treats the statistical interplay of a large number of microscopically well-described elementary processes must therefore be applied. Modern electronic structure theory methods such as density-functional theory (DFT) have become a standard tool for the accurate description of the individual atomic and molecular processes. In what follows we discuss the present status of emerging methodologies that attempt to achieve a (hopefully seamless) match of DFT with concepts from statistical mechanics or thermodynamics, in order to also address the interplay of the various molecular processes. The
new quality of, and the novel insights that can be gained by, such techniques are illustrated by how they allow the description of crystal surfaces in contact with realistic gas-phase environments, which is of critical importance for the manufacture and performance of advanced materials such as electronic, magnetic and optical devices, sensors, lubricants, catalysts, and hard coatings. For obtaining an understanding, and for the design, advancement or refinement of modern technology that controls many (most) aspects of our life, a large range of time and length scales needs to be described, namely, from the electronic (or microscopic/atomistic) to the macroscopic, as illustrated in Fig. 1. Obviously, this calls for a multiscale modeling, where corresponding theories (i.e., from the electronic, mesoscopic, and macroscopic regimes) and their results need to be linked appropriately. For each length and time scale regime alone, a number of methodologies are well established. It is, however, the appropriate linking of the methodologies that is only now evolving. Conceptually quite challenging in this hierarchy of scales are the transitions from what is often called a micro- to a mesoscopic system description, and from a meso- to a macroscopic system description. Due to the rapidly increasing number of particles and possible processes, the former transition is methodologically primarily characterized by the rapidly increasing importance of statistics, while in the latter, the atomic substructure is finally discarded in favor of a continuum modeling.
[Figure 1 shows a schematic plot of length (m, from 10⁻⁹ to 1) versus time (s, from 10⁻¹⁵ to 1), with the electronic, mesoscopic, and macroscopic regimes arranged along the diagonal; density-functional theory covers the electronic regime, while statistical mechanics or thermodynamics bridges toward the macroscopic regime.]
Figure 1. Schematic presentation of the time and length scales relevant for most materials science applications. The elementary molecular processes, which rule the behavior of a system, take place in the so-called "electronic regime". Their interplay, which frequently determines the functionalities, however, only develops after meso- and macroscopic lengths or times.
In this contribution we will concentrate on the micro- to mesoscopic system transition, and correspondingly discuss some possibilities of how atomistic electronic structure theory can be linked with concepts and techniques from statistical mechanics and thermodynamics. Our aim is a materials science modeling that is based on understanding, predictive, and applicable to a wide range of realistic conditions (e.g., realistic environmental situations of varying temperatures and pressures). This then mostly excludes the use of empirical or fitted parameters – both at the electronic and at the mesoscopic level, as well as in the matching procedure itself. Electronic theories that do not rely on such parameters are often referred to as first-principles (or in Latin: ab initio) techniques, and we will maintain this classification also for the linked electronic-statistical methods. Correspondingly, our discussion will mainly (nearly exclusively) focus on such ab initio studies, although mentioning some other work dealing with important (general) concepts. Furthermore, this chapter does not (or only briefly) discuss equations; instead the concepts are demonstrated (and illustrated) by selected, typical examples. Since many (possibly most) aspects of modern materials science deal with surface or interface phenomena, the examples are from this area, addressing in particular surfaces of semiconductors, metals, and metal oxides. Apart from sketching the present status and achievements, we also find it important to mention the difficulties and problems (or open challenges) of the discussed approaches. This can, however, only be done in a qualitative and rough manner, since the problems lie mostly in the details, the explanations of which are not appropriate for such a chapter. To understand the elementary processes ruling the materials science context, microscopic theories need to address the behavior of electrons and the resulting interactions between atoms and molecules (often expressed in the terminology of chemical bonds). Electrons move and adjust to perturbations on a time scale of femtoseconds (1 fs = 10⁻¹⁵ s), atoms vibrate on a time scale of picoseconds (1 ps = 10⁻¹² s), and individual molecular processes take place on a length scale of 0.1 nanometer (1 nm = 10⁻⁹ m). Because of the central importance of the electronic interactions, this time and length scale regime is also often called the "electronic regime", and we will use this term here in particular, in order to emphasize the difference between ab initio electronic and semi-empirical microscopic theories. The former explicitly treat the electronic degrees of freedom, while the latter already coarse-grain over them and directly describe the atomic scale interactions by means of interatomic potentials. Many materials science applications depend sensitively on intricate details of bond breaking and making, which on the other hand are often not well (if at all) captured by existing semi-empirical classical potential schemes. A predictive first-principles modeling as outlined above must therefore be based on a proper description of molecular processes in the "electronic regime", which is much harder to accomplish than just a microscopic description employing more or
less guessed potentials. In this respect we find it also appropriate to distinguish the electronic regime from the currently frequently cited “nanophysics” (or better “nanometer-scale physics”). The latter deals with structures or objects of which at least one dimension is in the range 1–100 nm, and which due to this confinement exhibit properties that are not simply scalable from the ones of larger systems. Although already quite involved, the detailed understanding of individual molecular processes arising from electronic structure theories is, however, often still not enough. As mentioned above, in many cases the system functionalities are determined by the concerted interplay of many elementary processes, not only by the detailed individual properties of each process alone. It can, for example, very well be that an individual process exhibits very appealing properties for a desired application, yet the process may still be irrelevant in practice, because it hardly ever occurs within the “full concert” of all possible molecular processes. Evaluating this “concert” of elementary processes one obviously has to go beyond separate studies of each microscopic process. However, taking the interplay into account, naturally requires the treatment of larger system sizes, as well as an averaging over much longer time scales. The latter point is especially pronounced, since many elementary processes in materials science are activated (i.e., an energy barrier must be overcome) and thus rare. This means that the time between consecutive events can be orders of magnitude longer than the actual event time itself. Instead of the above mentioned electronic time regime, it may therefore be necessary to follow the time evolution of the system up to seconds and longer in order to arrive at meaningful conclusions concerning the effect of the statistical interplay. Apart from the system size, there is thus possibly the need to bridge some twelve orders of magnitude in time which puts new demands on theories that are to operate in the corresponding mesoscopic regime. And also at this level, the ab initio approach is much more involved than an empirical one because it is not possible to simply “lump together” several not further specified processes into one effective parameter. Each individual elementary step must be treated separately, and then combined with all the others within an appropriate framework. Methodologically, the physics in the electronic regime is best described by electronic structure theories, among which density-functional theory [1–4] has become one of the most successful and widespread approaches. Apart from detailed information about the electronic structure itself, the typical output of such DFT calculations, that is of relevance for the present discussion, is the energetics, e.g., total energies, as well as the forces acting on the nuclei for a given atomic configuration. If this energetic information is provided as function of the atomic configuration {R I }, one talks about a potential energy surface (PES) E({R I }). Obviously, a (meta)stable atomic configuration corresponds to a (local) minimum of the PES. The forces acting on the given atomic configuration are just the local gradient of the PES, and the vibrational
modes of a (local) minimum are given by the local PES curvature around it. Although DFT mostly does not meet the frequent demand for “chemical accuracy” (1 kcal/mol ≈ 0.04 eV/atom) in the energetics, it is still often sufficiently accurate to allow for the aspired modeling with predictive character. In fact, we will see throughout this chapter that error cancellation at the statistical interplay level may give DFT-based approaches a much higher accuracy than may be expected on the basis of the PES alone. With the computed DFT forces it is possible to directly follow the motion of the atoms according to Newton’s laws [5, 6]. With the resulting ab initio molecular dynamics (MD) [7–11] only time scales up to the order of 50 ps are, however, currently accessible. Longer times may, e.g., be reached by so-called accelerated MD techniques [12], but for the desired description of a truly mesoscopic scale system which treats the statistical interplay of a large number of elementary processes over some seconds or longer, a match or combination of DFT with concepts from statistical mechanics or thermodynamics must be found. In the latter approaches, bridging of the time scale is achieved by either a suitable “coarse-graining” in time (to be specified below) or by only considering thermodynamically stable (or metastable) states. We will discuss how such a description, appropriate for a mesoscopic-scale system, can be achieved starting from electronic structure theory, as well as ensuing concepts like atomistic thermodynamics, lattice-gas Hamiltonians (LGH), equilibrium Monte Carlo simulations, or kinetic Monte Carlo simulations (kMC). Which of these approaches (or a combination) is most suitable depends on the particular type of problem. Table 1 lists the different theoretical approaches and the time and length scales that they treat. While the concepts are general, we find it instructive to illustrate their power and limitations on the basis of a particular issue that is central to the field of surface-related studies including applications as important as crystal growth and heterogeneous catalysis, namely to treat the effect of a finite gas-phase. With surfaces forming the interface to the surrounding environment, a critical dependence of their
Table 1. The time and length scales typically handled by different theoretical approaches to study chemical reactions and crystal growth.

Approach                              Information            Time scale         Length scale
Density-functional theory             Microscopic            –                  ≲ 10³ atoms
Ab initio molecular dynamics          Microscopic            t ≲ 50 ps          ≲ 10³ atoms
Semi-empirical molecular dynamics     Microscopic            t ≲ 1 ns           ≲ 10³ atoms
Kinetic Monte Carlo simulations       Micro- to mesoscopic   1 ps ≲ t ≲ 1 h     ≲ 1 µm
Ab initio atomistic thermodynamics    Meso- to macroscopic   Averaged           ≳ 10 nm
Rate equations                        Averaged               0.1 s ≲ t ≲ ∞      ≳ 10 nm
Continuum equations                   Macroscopic            1 s ≲ t ≲ ∞        ≳ 10 nm
properties on the species in this gas-phase, on their partial pressures and on the temperature can be intuitively expected [13, 14]. After all, we recall that for example in our oxygen-rich atmosphere, each atomic site of a close-packed crystal surface at room temperature is hit by of the order of 10⁹ O2 molecules per second. That this may have profound consequences on the surface structure and composition is already highlighted by the everyday phenomena of oxide formation, and in humid oxygen-rich environments, eventually corrosion with rust and verdigris as two visible examples [15]. In fact, what is typically called a stable surface structure is nothing but the statistical average over all elementary adsorption processes from, and desorption processes to, the surrounding gas-phase. If atoms or molecules of a given species adsorb more frequently from the gas-phase than they desorb to it, the species' concentration in the surface structure will be enriched with time, thus also increasing the total number of desorption processes. Eventually this total number of desorption processes will (averaged over time) equal the number of adsorption processes. Then the (average) surface composition and structure will remain constant, and the surface has attained its thermodynamic equilibrium with the surrounding environment. Within this context we may be interested in different aspects; for example, on the microscopic level, the first goal would be to separately study elementary processes such as adsorption and desorption in detail. With DFT one could, e.g., address the energetics of the binding of the gas-phase species to the surface in a variety of atomic configurations [16], and MD simulations could shed light on the possibly intricate gas-surface dynamics during one individual adsorption process [10, 11, 17]. Already the search for the most stable surface structure under given gas-phase conditions, however, requires the consideration of the interplay between the elementary processes (of at least adsorption and desorption) at the mesoscopic scale. If we are only interested in the equilibrated system, i.e., when the system has reached its thermodynamic ground (or a metastable) state, the natural choice would then be to combine DFT data with thermodynamic concepts. How this can be done will be exemplified in the first part of this chapter. On the other hand, the processes altering the surface geometry and composition from a known initial state to the final ground state can be very slow. And coming back to the above example of oxygen–metal interaction, corrosion is a prime example, where such a kinetic hindrance significantly slows down (and practically stops) further oxidation after an oxide film of certain thickness has formed at the surface. In such circumstances, a thermodynamic description will not be satisfactory and one would want to follow the explicit kinetics of the surface in the given gas-phase. Then the combination of DFT with concepts from statistical mechanics explicitly treating the kinetics is required, and we will illustrate some corresponding attempts in the last section entitled "First-principles kinetic Monte Carlo simulations".
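The quoted impingement rate follows from kinetic gas theory: the flux of molecules hitting a surface is Φ = p/√(2πm k_B T), the Hertz–Knudsen formula. A short Python estimate (our own illustration; the ~7.6 Å² site area is an assumed, typical value for a close-packed transition metal surface):

import numpy as np

def impingement_rate(p_pa, mass_kg, temp_k):
    # Hertz-Knudsen flux p / sqrt(2 pi m kB T), in molecules m^-2 s^-1
    kb = 1.380649e-23
    return p_pa / np.sqrt(2.0 * np.pi * mass_kg * kb * temp_k)

m_o2 = 31.999 * 1.66054e-27                   # mass of one O2 molecule, kg
site_area = 7.6e-20                           # assumed area per surface site, m^2
flux = impingement_rate(101325.0, m_o2, 300.0)
print(flux * site_area)                       # hits per site per second

At 1 atm of O2 and room temperature this gives of the order of 10⁸–10⁹ hits per site per second, consistent with the estimate quoted above.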
1. Ab Initio Atomistic Thermodynamics
First, let us discuss the matching of electronic structure theory data with thermodynamics. Although this approach applies “only” to systems in equilibrium (or in a metastable state), we note that at least, at not too low temperatures, a surface is likely to rapidly attain thermodynamic equilibrium with the ambient atmosphere. And even if it has not yet equilibrated, at some later stage it will have and we can nevertheless learn something by knowing about this final state. Thermodynamic considerations also have the virtue of requiring comparably less microscopic information, typically only about the minima of the PES and the local curvatures around them. As such, it is often advantageous to first resort to a thermodynamic description, before embarking upon the more demanding kinetic modeling described in the last section. The goal of the thermodynamic approach is to use the data from electronic structure theory, i.e., the information on the PES, to calculate appropriate thermodynamic potential functions like the Gibbs free energy G [18–21]. Once such a quantity is known, one is immediately in the position to evaluate macroscopic system properties. Of particular relevance for the spatial aspect of our multiscale endeavor is further that within a thermodynamic description larger systems may readily be divided into smaller subsystems that are mutually in equilibrium with each other. Each of the smaller and thus potentially simpler subsystems can then first be treated separately, and the contact between the subsystems is thereafter established by relating their corresponding thermodynamic potentials. Such a “divide and conquer” type of approach can be especially efficient, if infinite, but homogeneous parts of the system like bulk or surrounding gas-phase can be separated off [22–27].
1.1. Free Energy Plots for Surface Oxide Formation
How this quite general concept works and what it can contribute in practice may be illustrated with the case of oxide formation at late transition metal (TM) surfaces sketched in Fig. 2 [28, 29]. These materials have widespread technological use, for example, in the area of oxidation catalysis [30]. Although they are likely to form oxidic structures (i.e., ordered oxygen–metal compounds) in technologically-relevant high oxygen pressure environments, it is difficult to address this issue at the atomic scale with the corresponding experimental techniques of surface science because they often require Ultra-High Vacuum (UHV) [31]. Instead of direct, so-called in situ measurements, the surfaces are usually first exposed to a defined oxygen dosage, and the produced oxygen-enriched surface structures are then cooled down and analyzed in UHV. Due to the low temperatures, it is hoped that the surfaces do not attain their equilibrium structure in UHV during the time of the measurement, and
Figure 2. Cartoon sideviews illustrating the effect of an increasingly oxygen-rich atmosphere on a metal surface. Whereas in perfect vacuum (left) the clean surface prevails, finite O2 pressures in the environment lead to an oxygen enrichment in the solid and its surface. Apart from some bulk-dissolved oxygen, frequently observed stages in this oxidation process comprise (from left to right) on-surface adsorbed O, the formation of thin (surface) oxide films, and eventually the transformation to an ordered bulk oxide compound. Note that all stages can be strongly kinetically inhibited. It is, e.g., not clear whether the observation of a thin surface oxide film means that this is the stable surface composition and structure at the given gas-phase pressure and temperature, or whether the system has simply not yet attained its real equilibrium structure (possibly in form of the full bulk oxide). Such limitations can be due to quite different microscopic reasons: adsorption from or desorption to the gas-phase could be slow/hindered, or (bulk) oxide growth may be inhibited because metal diffusion through the oxide to its surface or oxygen diffusion from the surface to the oxide/metal interface is very slow.
thus provide information about the corresponding surface structure at higher oxygen pressures. This is, however, not fully certain, and it is also not guaranteed that the surface has reached its equilibrium structure during the time of oxygen exposure. Typically, a large variety of potentially kinetically-limited surface structures can be produced this way. Even though it can be academically very interesting to study all of them in detail, one would still like to have some guidance as to which of them would ultimately correspond to an equilibrium structure under which environmental conditions. Furthermore, the knowledge of a corresponding, so-called surface phase diagram as a function of, in this case, the temperature T and oxygen pressure pO2 can also provide useful information to the now surging in situ techniques, as to which phase to expect. The task for an ab initio atomistic thermodynamic approach would therefore be to screen a number of known (or possibly relevant) oxygen-containing surface structures, and evaluate which of them turns out to be the most stable one under which (T, pO2) conditions [24–27]. "Most stable", translated into the thermodynamic language, means that the corresponding structure minimizes an appropriate thermodynamic function, which would in this case be the Gibbs free energy of adsorption ∆G [32, 33]. In other words, one has to compute ∆G as a function of the environmental variables for each structural model,
and the one with the lowest ∆G is identified as most stable. What needs to be computed are all thermodynamic potentials entering into the thermodynamic function to be minimized. In the present case of the Gibbs free energy of adsorption these are for example the Gibbs free energies of the bulk and surface structural models, as well as the chemical potential of the O2 gas phase. The latter may, at the accuracy level necessary for the surface phase stability issue, well be approximated by an ideal gas. The calculation of the chemical potential µO(T, pO2) is then straightforward and can be found in standard statistical mechanics textbooks (e.g., Ref. [34]). Required input from a microscopic theory like DFT are properties like bond lengths and vibrational frequencies of the gas-phase species. Alternatively, the chemical potential may be directly obtained from thermochemical tables [35]. Compared to this, the evaluation of the Gibbs free energies of the solid bulk and surface is more involved. While in principle contributions from the total energy, vibrational free energy or configurational entropy have to be calculated [24–26], a key point to notice here is that not the absolute Gibbs free energies enter into the computation of ∆G, but only the difference of the Gibbs free energies of bulk and surface. This often implies some error cancellation in the DFT total energies. It also leads to quite some (partial) cancellation in the free energy contributions like the vibrational energy. In a physical picture, it is thus not the effect of the absolute vibrations that matters for our considerations, but only the changes of vibrational modes at the surface as compared to the bulk. Under such circumstances it may result that the difference between the bulk and surface Gibbs free energies is already well approximated by the difference of their leading total energy terms, i.e., the direct output of the DFT calculations [24]. Although this is of course appealing from a computational point of view, and one would always want to formulate the thermodynamic equations in a way that they contain such differences, we stress that it is not a general result and needs to be carefully checked for every specific system. Once the Gibbs free energies of adsorption ∆G(T, pO2) are calculated for each surface structural model, they can be plotted as a function of the environmental conditions. In fact, under the imposed equilibrium the two-dimensional dependence on T and pO2 can be summarized into a one-dimensional dependence on the gas-phase chemical potential µO(T, pO2) [24]. This is done in Fig. 3(a) for the Pd(100) surface including, apart from the clean surface, a number of previously characterized oxygen-containing surface structures. These are two structures with ordered on-surface O adsorbate layers of different density (p(2 × 2) and c(2 × 2)), a so-called (√5 × √5)R27° surface oxide containing one layer of PdO on top of Pd(100), and finally the infinitely thick PdO bulk oxide [37]. If we start at very low oxygen chemical potential, corresponding to a low oxygen concentration in the gas-phase, we expectedly find the clean Pd(100) surface to yield the lowest ∆G line, which in fact is used here as the reference zero.
[Figure 3 consists of two panels: (a) ∆G (meV/Å²) versus the oxygen chemical potential µO (eV), with corresponding pO2 (atm) pressure scales for 300 K and 600 K along the top axis, showing lines for the clean surface, the p(2 × 2) and c(2 × 2) adlayers, the (√5 × √5)R27° surface oxide, and the bulk oxide; (b) the resulting phase diagram, pO2 (atm) versus T (K), with the stability ranges of the metal, the surface oxide, and the bulk oxide.]
Figure 3. (a) Computed Gibbs free energy of adsorption ∆G for the clean Pd(100) surface and several oxygen-containing surface structures. Depending on the chemical potential µO of the surrounding gas-phase, either the clean surface, or a surface oxide film (labeled here according to its two-dimensional periodicity as (√5 × √5)R27°), or the infinite PdO bulk oxide exhibits the lowest ∆G and results as the stable phase under the corresponding environmental conditions (as indicated by the different background shadings). Note that a tiny reduction of its surface energy would suffice to make the p(2 × 2) adlayer structure most stable in an intermediate range of chemical potential between the clean surface and the surface oxide. Within the present computational uncertainty, no conclusion can therefore be made regarding the stability of this structure. (b) The stability range of the three phases, evaluated in (a) as a function of µO, plotted directly in (T, pO2)-space. Note the extended stability range of the surface oxide compared to the PdO bulk oxide (after Refs. [28, 36]).
Upon increasing µO in the gas-phase, the Gibbs free energies of adsorption of the other oxygen-containing surfaces decrease gradually, as it becomes more favorable to stabilize such structures with more and more oxygen atoms being present in the gas-phase. The more oxygen the structural models contain, the steeper the slope of their ∆G curves becomes, and above a critical µO we eventually find the surface oxide to be more stable than the clean surface. Since the PdO bulk oxide contains a macroscopic (or at least mesoscopic) number of oxygen atoms, its ∆G line exhibits an infinite slope and cuts the other lines vertically at µO ≈ −0.8 eV. For any higher oxygen chemical potential in the gas-phase, the bulk PdO phase will then always result as most stable.
With the clean surface, the surface oxide, and the bulk oxide, the thermodynamic analysis therefore yields three equilibrium phases for Pd(100), depending on the chemical potential of the O2 environment. Exploiting ideal gas laws, this one-dimensional dependence can be translated into the physically more intuitive dependence on temperature and oxygen pressure. For two fixed temperatures, this is also indicated by the resulting pressure scales at the top axis of Fig. 3(a). Alternatively, the stability range of the three phases can be directly plotted in (T, pO2)-space, as shown in Fig. 3(b). A most intriguing result is that the thermodynamic stability range of the recently identified surface oxide extends well beyond the one of the common PdO bulk oxide, i.e., the surface oxide could well be present under environmental conditions where the PdO bulk oxide is known to be unstable. This result is somewhat unexpected, in two ways: First, it had hitherto been believed that it is the slow growth kinetics (not the thermodynamics) that exclusively controls the thickness of oxide films at surfaces. Second, the possibility of (surface) oxides only a few atomic layers thick, with structures not necessarily related to the known bulk oxides, was traditionally not perceived. The additional stabilization of the (√5 × √5)R27° surface oxide is attributed to the strong coupling of the ultrathin film to the Pd(100) substrate [37]. Similar findings have recently been obtained at the Pd(111) [28, 38] and Ag(111) [33, 39] surfaces. Interestingly, the low stability of the bulk oxide phases of these more noble TMs had hitherto often been used as an argument against the relevance of oxide formation in technological environments like in oxidation catalysis [30]. It remains to be seen whether the surface oxide phases and their extended stability range, which have recently been intensively discussed, will change this common perception.
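The translation between the µO axis of Fig. 3(a) and the (T, pO2) axes of Fig. 3(b) uses nothing more than the ideal-gas expression µO(T, pO2) = µO(T, p°) + ½ k_B T ln(pO2/p°). A minimal Python sketch (our own illustration; the reference value below is only an indicative placeholder for what one would take from thermochemical tables [35]):

import numpy as np

KB_EV = 8.617333e-5                     # Boltzmann constant, eV/K

def mu_o(temp_k, p_atm, mu_ref_ev):
    # mu_O(T, p) = mu_O(T, 1 atm) + (1/2) kB T ln(p / 1 atm), per O atom
    return mu_ref_ev + 0.5 * KB_EV * temp_k * np.log(p_atm)

mu_ref_600k = -0.61                     # placeholder for Delta mu_O(600 K, 1 atm)
for p in (1e-10, 1.0, 1e2):             # pressures in atm
    print(p, mu_o(600.0, p, mu_ref_600k))

Scanning such (T, p) pairs for the conditions at which µO crosses the critical potentials of Fig. 3(a) (e.g., µO ≈ −0.8 eV for bulk PdO formation) directly generates the phase boundaries of Fig. 3(b).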
1.2.
Free Energy Plots of Semiconductor Surfaces
Already in the introduction we had mentioned that the concepts discussed here are general and applicable to a wide range of problems. To illustrate this, we supplement the discussion by an example from the field of semiconductors, where the concepts of ab initio atomistic thermodynamics had in fact been developed first [18–21, 40]. Semiconductor surfaces exhibit complex reconstructions, i.e., surface structures that differ significantly in their atomic composition and geometry from that of the bulk-truncated structure [13]. Knowledge of the correct surface atomic structure is, on the other hand, a prerequisite to understanding and controlling the surface or interface electronic properties, as well as the detailed growth characteristics. While the number of possible configurations with complex surface unit-cell reconstructions is already large, searching for possible structural models becomes even more involved for surfaces of compound semiconductors.
In order to minimize the number of dangling bonds, the surface may exchange atoms with the surrounding gas-phase, which in molecular beam epitaxy (MBE) growth is composed of the substrate species at elevated temperatures and varying partial pressures. As a consequence of the interaction with this gas-phase, the surface stoichiometry may be altered and surface atoms may be displaced to assume a more favorable bonding geometry. The resulting surface structure thus depends on the environment, and atomistic thermodynamics may again be employed to compare the stability of existing (or newly suggested) structural models as a function of the conditions in the surrounding gas-phase. The thermodynamic quantity that is minimized by the most stable structure is in this case the surface free energy, which in turn depends on the Gibbs free energies of the bulk and surface of the compound, as well as on the chemical potentials in the gas-phase. The procedure for evaluating these quantities goes exactly along the lines described above, where in addition one frequently assumes the surface fringe not only to be in thermodynamic equilibrium with the surrounding gas-phase, but also with the underlying compound bulk [24]. With this additional constraint, the dependence of the surface structure and composition on the environment can, even for the two-component gas-phase in MBE, be discussed as a function of the chemical potential of only one of the compound species alone. Figure 4 shows as an example the dependence on the As content in the gas-phase for a number of surface structural models of the GaAs(001) surface.
Figure 4. Surface energies for GaAs(001) terminations as a function of the As chemical potential, µAs . The thermodynamically allowed range of µAs is bounded by the formation of Ga droplets at the surface (As-poor limit at −0.58 eV) and the condensation of arsenic at the surface (As-rich limit at 0.00 eV). The ζ (4 × 2) geometry is significantly lower in energy than the previously proposed β2(4 × 2) model for the c(8 × 2) surface reconstruction observed under As-poor growth conditions (from Ref. [41]).
A reasonable lower limit for this content is given when there is so little As2 in the gas-phase that it becomes thermodynamically more favorable for the arsenic to leave the compound. The resulting GaAs decomposition and formation of Ga droplets at the surface denotes the lower limit of As chemical potentials considered (As-poor limit), while the condensation of arsenic on the surface forms an appropriate upper bound (As-rich limit). Depending on the As to Ga stoichiometry at the surface, the surface free energies of the individual models have either a positive slope (As-poor terminations), a negative slope (As-rich terminations), or remain constant (stoichiometric termination). While the detailed atomic geometries behind the considered models in Fig. 4 are not relevant here, most of them may roughly be characterized as different ways of forming dimers at the surface in order to reduce the number of dangling orbitals [42]. In fact, it is this general “rule” of dangling bond minimization by dimer formation that has hitherto mainly served as inspiration in the creation of new structural models for the (001) surfaces of III–V zinc-blende semiconductors, thereby leading to some prejudice in the type of structures considered. In contrast, the at first only theoretically proposed so-called ζ(4 × 2) structure is instead stabilized by the filling of all As dangling orbitals and the emptying of all Ga dangling orbitals, as well as by a favorable electrostatic (Ewald) interaction between the surface atoms [41]. The virtue of the atomistic thermodynamics approach is now that such a new structural model can be directly compared in its stability against all existing ones. And indeed, the ζ(4 × 2) phase was found to be more stable than all previously proposed reconstructions at low As pressure. Returning to the methodological discussion, the results shown in Figs. 3 and 4 nicely summarize the contribution that can be made by such an analysis. While ab initio atomistic thermodynamics has a much wider applicability (see Sections 1.3–1.5), the approach followed for obtaining Figs. 3 and 4 has some limitations. Most prominently, one has to be aware that the reliability is restricted by the set of considered configurations, or in other words, that only the stability of those structures plugged in can be compared. Had, for example, the surface oxide structure not been considered in Fig. 3, the p(2 × 2) adlayer structure would have yielded the lowest Gibbs free energy of adsorption in a range of µO intermediate between the stability ranges of the clean surface and the bulk oxide, changing the resulting surface phase diagram accordingly. Alternatively, it is at present not completely clear whether the (√5 × √5)R27° structure is really the only surface oxide on Pd(100). If another, yet unknown surface oxide exists and exhibits a sufficiently low G for some oxygen chemical potential, it will similarly affect the surface phase diagram, as would another novel and hitherto unconsidered surface reconstruction with sufficiently low surface free energy in the GaAs example. As such, appropriate care should be taken when addressing systems for which only limited information about surface structures is available.
With this in mind, even in such systems the atomistic thermodynamics approach can still be a particularly valuable tool, since it allows one, for example, to rapidly compare the stability of newly devised structural models against existing ones. In this way, it gives tutorial insight into which structural motifs may be particularly important. This may even yield ideas about other structures that one should test as well, and the theoretical identification of the ζ(4 × 2) structure in Fig. 4 by Lee et al. [41] is a prominent example. In Section 1.4 we will discuss an approach that is able to overcome this limitation. This comes, unfortunately, at a significantly higher computational demand, so that it has up to now only been used to study simple adsorption layers on surfaces. It will then also provide more detailed insight into the transitions between stable phases. In Figs. 3 and 4, the transitions are simply drawn as abrupt, and no reference is made to the finite phase coexistence regions that should occur at finite temperatures, i.e., regions in which, with changing pressure or temperature, one phase gradually becomes populated and the other one depopulated. That this is not the case in the discussed examples is not a general deficiency of the approach, but has to do with the fact that the configurational entropy contribution to the Gibbs free energy of the surface phases has been deliberately neglected in the two corresponding studies. This is justified, since for the well-ordered surface structural models considered, this contribution is indeed small and will affect only a narrow region close to the phase boundaries. The width of this affected phase coexistence region can even be estimated [26], but if more detailed insight into this very region is desired, or if disorder becomes more important, e.g., at more elevated temperatures, then an explicit calculation of the configurational entropy contribution becomes necessary. For this, equilibrium MC simulations as described below are the method of choice, but before we turn to them there is yet another twist to free energy plots that deserves mentioning.
1.3.
“Constrained Equilibrium”
Although a thermodynamic approach can strictly describe only the situation where the surface is in equilibrium with the surrounding gas-phase (or in a metastable state), the idea is that it can still give some insight when the system is close to thermodynamic equilibrium, or even when it is only close to thermodynamic equilibrium with some of the present gas-phase species [25]. For such situations it can be useful to consider “constrained equilibria,” and one would expect to get some ideas as to where in (T, p)-space thermodynamic phases may still exist, but also to identify those regions where kinetics may control the material function.
We will discuss heterogeneous catalysis as a prominent example. Here, a constant stream of reactants is fed over the catalyst surface and the formed products are rapidly carried away. If we take the CO oxidation reaction to further specify our example, the surface would be exposed to an environment composed of O2 and CO molecules, while the produced CO2 desorbs from the catalyst surface at the technologically employed temperatures and is then transported away. Neglecting the presence of the CO2, one could therefore model the effect of an O2/CO gas-phase on the surface, in order to get some first ideas of the structure and composition of the catalyst under steady-state operation conditions. Under the assumption that the adsorption and desorption processes of the reactants occur much faster than the CO2 formation reaction, the latter would not significantly disturb the average surface population, i.e., the surface could be close to maintaining its equilibrium with the reactant gas-phase. If at all, this equilibrium holds, however, only with each gas-phase species separately. Were the latter fully equilibrated among each other, too, only the products would be present under all environmental conditions of interest. It is in fact particularly the high free energy barrier for the direct gas-phase reaction that prevents such an equilibration on a reasonable time scale, and that necessitates the use of a catalyst in the first place. The situation that is correspondingly modeled in an atomistic thermodynamics approach to heterogeneous catalysis is thus a surface in “constrained equilibrium” with independent reservoirs representing all reactant gas-phase species, namely O2 and CO in the present example [25]. It should immediately be stressed, though, that such a setup should only be viewed as a thought construct to get a first idea about the catalyst surface structure in a high-pressure environment. Whereas we could write before that the surface will sooner or later necessarily equilibrate with the gas-phase in the case of a pure O2 atmosphere, this is no longer guaranteed for a “constrained equilibrium”. The on-going catalytic reaction at the surface consumes adsorbed reactant species, i.e., it continuously drives the surface populations away from their equilibrium value, and even more so in the interesting regions of high catalytic activity. That the “constrained equilibrium” concept can still yield valuable insight is nicely exemplified for the CO oxidation over a “Ru” catalyst [43]. For ruthenium, the tendency described above to oxidize under oxygen-rich environmental conditions is much more pronounced than for the nobler metals Pd and Ag discussed above [28]. While for the latter the relevance of (surface) oxide formation under the conditions of technological oxidation catalysis is still under discussion [28, 33, 39, 44], it is by now established that a film of bulk-like oxide forms on the Ru(0001) model catalyst during high-pressure CO oxidation, and that this RuO2(110) is the active surface for the reaction [45]. When evaluating its surface structure in “constrained equilibrium” with an O2 and CO environment, four different “surface phases” result depending on the gas-phase conditions, which are now described by the chemical potentials of both reactants, cf. Fig. 5.
The “phases” differ from each other in the occupation of two prominent adsorption site types exhibited by this surface, called bridge (br) and coordinatively unsaturated (cus) sites. At very low µCO, i.e., a very low CO concentration in the gas-phase, either only the bridge sites, or both bridge and cus sites, are occupied by oxygen, depending on the O2 pressure. Under increased CO concentration in the gas-phase, both the corresponding Obr/− and the Obr/Ocus phase have to compete with CO that would also like to adsorb at the cus sites, and eventually the Obr/COcus phase develops. Finally, under very reducing gas-phase conditions with a lot of CO and essentially no oxygen, a completely CO covered surface results (CObr/COcus). Under these conditions the RuO2(110) surface can at best be metastable, however, as above the white-dotted line in Fig. 5 the RuO2 bulk oxide is already unstable against CO-induced decomposition. With the already described difficulty of operating the atomic-resolution experimental techniques of surface science at high pressures, the possibility of reliably bridging the so-called pressure gap is of key interest in heterogeneous catalysis research [30, 43, 46]. The hope is that the atomic-scale understanding gained in experiments under some suitably chosen low pressure conditions would also be representative of the technological ambient pressure situation. Surface phase diagrams like the one shown in Fig. 5 could give some valuable guidance in this endeavor. If the (T, pO2, pCO) conditions of the low pressure experiment are chosen such that they lie within the stability region of the same surface phase as at high pressures, the same surface structure and composition will be present and scalable results may be expected. If, however, temperature and pressure are varied in such a way that one crosses from one stability region to another, different surfaces are exposed and there is no reason to hope for comparable functionality. This would, e.g., also hold for a naive bridging of the pressure gap by simply maintaining a constant partial pressure ratio. In fact, the comparability holds not only within the regions of the stable phases themselves, but, with the same argument, also for the phase coexistence regions along the phase boundaries. The extent of these configurational entropy induced phase coexistence regions is indicated in Fig. 5 by white regions. Although, as already discussed, the above mentioned approach gives no insight into the detailed surface structure under these conditions, pronounced fluctuations due to an enhanced dynamics of the involved elementary processes can generally be expected in the vicinity of a phase transition. Since catalytic activity is based on the same dynamics, these regions are therefore likely candidates for efficient catalyst functionality [25]. And indeed, very high and comparable reaction rates have recently been noticed for different environmental conditions that all lie close to the white region between the Obr/Ocus and Obr/COcus phases.
Figure 5. Top panel: Top view of the RuO2 (110) surface explaining the location of the two prominent adsorption sites (coordinatively unsaturated, cus, and bridge, br). Also shown are perspective views of the four stable phases present in the phase diagram shown below (Ru = light large spheres, O = dark medium spheres, C = white small spheres). Bottom panel: Surface phase diagram for RuO2 (110) in “constrained equilibrium” with an oxygen and CO environment. Depending on the gas-phase chemical potentials (µO , µCO ), br and cus sites are either occupied by O or CO, or empty (–), yielding a total of four different surface phases. For T = 300 and 600 K, this dependence is also given in the corresponding pressure scales. Regions that are expected to be particularly strongly affected by phase coexistence or kinetics are marked by white hatching (see text). Note that conditions representative for technological CO oxidation catalysis (ambient pressures, 300–600 K) fall exactly into one of these ranges (after Refs. [25, 26]).
It must be stressed, however, that exactly in this region of high catalytic activity one would similarly expect the breakdown of the “constrained equilibrium” assumption of a negligible effect of the on-going reaction on the average surface structure and stoichiometry. At least everywhere in the corresponding hatched regions in Fig. 5, such kinetic effects will lead to significant deviations from the surface phases obtained within the approach described above, even at “infinite” times after steady-state has been reached. Atomistic thermodynamics may therefore be employed to identify interesting regions in phase space. The surface coverage and structure in these regions, i.e., the actual dynamic behavior, must then, however, be modeled by statistical mechanics explicitly accounting for the kinetics, and the corresponding kMC simulations will be discussed towards the end of the chapter.
1.4.
Ab Initio Lattice-gas Hamiltonian
The predictive power of the approach discussed in the previous sections extends only to the structures that are directly considered, i.e., it cannot predict the existence of unanticipated geometries or stoichiometries. To overcome this limitation, and to include a more general and systematic way of treating phase coexistence and order–disorder transitions, a proper sampling of configuration space must be achieved, instead of considering only a set of plausible structural models. Modern statistical mechanics methods like Monte Carlo (MC) simulations are particularly designed to efficiently fulfill this purpose [6, 47]. The straightforward matching with electronic structure theories would thus be to determine with DFT the energetics of all system configurations generated in the course of the statistical simulation. Unfortunately, this direct linking is currently, and also in the foreseeable future, computationally unfeasible. The exceedingly large configuration spaces of most materials science problems require a prohibitively large number of free energy evaluations (which can easily go beyond 10^6 for moderately complex systems), including also disordered configurations. With the direct matching impossible, an efficient alternative is to map the real system onto a simpler, typically discretized model system, the Hamiltonian of which is sufficiently fast to evaluate. This then enables us to evaluate the extensive number of free energies required by the statistical mechanics. Obvious uncertainties of this approach are how appropriately the model system represents the real system, and how its parameters can be determined from the first-principles calculations. The advantage, on the other hand, is that such a detour via an appropriate (“coarse-grained”) model system often provides deeper insight and understanding of the ruling mechanisms. If the considered problem can be described by a lattice defining the possible sites for the species in the system, a prominent example of such a mapping approach is given by the concept of a LGH (known in other contexts as an “Ising-type model” [48] or a “cluster expansion” [49, 50]).
Here, any system state is defined by the occupation of the sites in the lattice, and the total energy of any configuration is expanded into a sum of discrete interactions between these lattice sites. For a one-component system with only one site type, the LGH would then for example read (with obvious generalizations to multicomponent, multi-site systems):

H = F Σ_i n_i + Σ_{m=1}^{p} V_m^{pair} Σ_{(ij)_m} n_i n_j + Σ_{m=1}^{q} V_m^{trio} Σ_{(ijk)_m} n_i n_j n_k + … ,   (1)
where the site occupation numbers n_l = 0 or 1 tell whether site l in the lattice is empty or occupied, and F is the free energy of an isolated species at this lattice site, including static and vibrational contributions. There are p pair interactions with two-body (or pair) interaction energies V_m^{pair} between species at mth nearest neighbor sites, and q trio interactions with three-body interaction energies V_m^{trio}. The sum labels (ij)_m (and (ijk)_m) indicate that the sums run over all pairs of sites (ij) (and triplets of sites (ijk)) that are separated by m lattice constants. Formally, higher and higher order interaction terms (quarto, quinto, ...) would follow in this infinite expansion. In practice, the series must obviously (and can) be truncated after a finite number of terms though.
Figure 6. (a) Illustration of some types of lateral interactions for the case of a two-dimensional adsorbate layer (small dark spheres) that can occupy the two distinct threefold hollow sites of a (111) close-packed surface. V_m^{pair} (m = 1, 2, 3) are two-body (or pair) interactions at first, second, and third nearest neighbor distances of like hollow sites (i.e., fcc–fcc or hcp–hcp). V_m^{trio} (m = 1, 2, 3) are the three possible three-body (or trio) interactions between three atoms in like nearest neighbor hollow sites, and V_m^{pair(h,f)} (m = 1, 2, 3) represent pair interactions between atoms that occupy unlike hollow sites (i.e., one in fcc and the other in hcp or vice versa). (b) Example of an adsorbate arrangement from which an expression can be obtained for use in solving for interaction parameters. The (3 × 3) periodic surface unit-cell is indicated by the large darker spheres. The arrows indicate interactions between the adatoms. Apart from the obvious first nearest-neighbor interactions (short arrows), also third nearest-neighbor two-body interactions (long arrows) exist, due to the periodic images outside of the unit cell.
Figure 6 illustrates some of these interactions for the case of a two-dimensional adsorbate layer that can occupy the two distinct threefold hollow sites of a (111) close-packed surface. In particular, the pair interactions up to third nearest neighbor between like and unlike hollow sites are shown, as well as three possible trio interactions between adsorbates in like sites. It is apparent that such a LGH is very general. The Hamiltonian can be equally well evaluated for any lattice occupation, be it dense or sparse, periodic or disordered. And in all cases it merely comprises performing an algebraic sum over a finite number of terms, i.e., it is computationally very fast. The disadvantage is, on the other hand, that for more complex systems with multiple sites and several species, the number of interaction terms in the expansion increases rapidly. Which of these (far-reaching or multi-body) interaction terms need to be considered, i.e., where the sum in Eq. (1) may be truncated, and how the interaction energies in these terms may be determined, is the really sensitive part of such a LGH approach that must be carefully checked. The methodology in itself is not new, and traditionally the interatomic interactions have often been assumed to be just pairwise additive (i.e., higher-order terms beyond pair interactions were neglected); the interaction energies were then obtained by simply fitting to experimental data (see, e.g., [51–53]). This procedure obviously results in “effective parameters” with an unclear microscopic basis, “hiding” or “masking” the effect and possible importance of three-body (trio) and higher-order interactions. This has the consequence that while the Hamiltonian may be able to reproduce certain specific experimental data to which the parameters were fitted, it is questionable and unlikely that it will be general and transferable to calculations of other properties of the system. Indeed, the decisive contribution of higher-order, many-atom interactions to the observed behavior of adparticles has meanwhile been pointed out by a number of studies (see, e.g., [54–58]). As an alternative to this empirical procedure, the lateral interactions between the particles in the lattice can be deduced from detailed DFT calculations, and it is this approach, in combination with the statistical mechanics methods, that is of interest for this chapter. The straightforward way to do this is to directly compute these interactions as differences between calculations with different occupations at the corresponding lattice sites. For the example of a pair interaction between two adsorbates at a surface, this would translate into two DFT calculations where only either one of the adsorbates sits at its lattice site, and one calculation where both are present simultaneously. Unfortunately, this type of approach is hard to combine with the periodic boundary conditions that are typically required to describe the electronic structure of solids and surfaces [16]. In order to avoid interactions with the periodic images of the considered lattice species, huge (actually often prohibitively large) supercells would be required. A more efficient and intelligent way of addressing the problem is instead to specifically exploit the interaction with the periodic images.
For this, different configurations in various (feasible) supercells are computed with DFT, and the obtained energies are expressed in terms of the corresponding interatomic interactions. Figure 6(b) illustrates this for the case of two adsorbed atoms in a laterally periodic surface unit-cell. Due to this periodicity, each atom has images in the neighboring cells. Because of these images, each of the atoms in the unit-cell experiences not only the obvious pair interaction at the first neighbor distance, but also a pair interaction at the third neighbor distance (neglecting higher pairwise or multi-body interactions for the moment). The computed DFT binding energy for this configuration i can therefore be written as E_DFT^{(3×3),i} = 2E + 2V_1^{pair} + 2V_3^{pair}. Doing this for a set of different configurations thus generates a system of linear equations that can be solved for the interaction energies, either by direct inversion or by fitting techniques, if more configurations than interaction parameters were determined. The crucial aspect in this procedure is the number and type of interactions to include in the LGH expansion, and the number and type of configurations that are computed to determine them. We note that there is no a priori way of knowing at how many, and what type of, interactions the expansion may be terminated. While there are some attempts to automatize this procedure [59–61], it is probably fair to say that the actual implementation remains to date a delicate task. Some guidelines to judge the convergence of the constructed Hamiltonian include its ability to predict the energies of a number of DFT-computed configurations that were not employed in the fit, or to reproduce the correct lowest-energy configurations at T = 0 K (the so-called “ground-state line”) [50].
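To make this step concrete, the following minimal sketch sets up and solves such a linear system with numpy. The first row of the count matrix encodes the (3 × 3) configuration of Fig. 6(b) (two isolated-adsorbate terms, two first-neighbor and two third-neighbor pair interactions); the remaining rows and all energies are invented placeholders:

import numpy as np

# Each row: multiplicities of (E, V1_pair, V3_pair) in one DFT-computed
# ordered configuration; E_dft holds the corresponding binding energies.
A = np.array([[2.0, 2.0, 2.0],    # the (3x3) cell of Fig. 6(b)
              [1.0, 0.0, 0.0],    # further hypothetical configurations
              [4.0, 8.0, 4.0],
              [2.0, 0.0, 2.0]])
E_dft = np.array([-3.10, -1.50, -6.80, -3.05])  # hypothetical energies (eV)

# With more configurations than parameters this is a least-squares fit;
# with equal numbers it reduces to a direct inversion.
params, *_ = np.linalg.lstsq(A, E_dft, rcond=None)
E_iso, V1, V3 = params
print(f"E = {E_iso:.3f} eV, V1 = {V1:.3f} eV, V3 = {V3:.3f} eV")

Cross-validation against DFT configurations left out of the fit, as mentioned above, then provides a check on the truncation of the expansion.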
1.5.
Equilibrium Monte Carlo Simulations
Once an accurate LGH has been constructed, one has at hand a very fast and flexible tool to provide the energies of arbitrary system configurations. This may in turn be used in MC simulations to obtain a good sampling of the available configuration space, i.e., to determine the partition function of the system. An important aspect of modern MC techniques is that this sampling is done very efficiently by concentrating on those parts of the configuration space that contribute significantly to the latter. The Metropolis algorithm [62], as a famous example of such so-called importance sampling schemes, proceeds by generating new system configurations at random. If the new configuration exhibits a lower energy than the previous one, it is automatically “accepted” into a gradually built-up sequence of configurations. And even if the configuration has a higher energy, it still has an appropriately Boltzmann-weighted probability to make it into the considered set; otherwise it is “rejected” and the last configuration is copied anew to the sequence. This way, the algorithm preferentially samples low energy configurations, which contribute most to the partition function.
The acceptance criteria of the Metropolis algorithm, and of other importance sampling schemes, furthermore fulfill detailed balance. This means that the forward probability of accepting a new configuration j from state i is related to the backward probability of accepting configuration i from state j by the free energy difference of the two configurations. Taking averages of system observables over the thus generated configurations then yields their correct thermodynamic average for the considered ensemble. Technical issues finally concern how new trial configurations are generated, and how long and for what system size the simulation must be run in order to obtain good statistical averages [6, 47]. The kind of insight that can be gained by such a first-principles LGH + MC approach is nicely exemplified by the problem of on-surface adsorption at a close-packed surface, when the latter is in equilibrium with a surrounding gas-phase. If this environment consists of oxygen, this would, e.g., contribute to the understanding of one of the early oxidation stages sketched in Fig. 2. What would be of interest is, for instance, to know how much oxygen is adsorbed at the surface given a certain temperature and pressure in the gas-phase, and whether the adsorbate forms ordered or disordered phases. As outlined above, the approach proceeds by first determining a LGH from a number of DFT-computed ordered adsorbate configurations. This is followed by grand-canonical MC simulations, in which new trial system configurations are generated by randomly adding or removing adsorbates from the lattice positions, and in which the energies of these configurations are provided by the LGH; a skeleton of such a simulation is sketched below. Evaluating appropriate order parameters that check on prevailing lateral periodicities in the generated sequence of configurations, one may finally plot the phase diagram, i.e., which phase exists under which (T, p)-conditions (or, equivalently, (T, µ)-conditions) in the gas-phase. The result of one of the first studies of this kind is shown in Fig. 7 for the system O/Ru(0001). The employed LGH comprised two types of adsorption sites, namely the hcp and fcc hollows, lateral pair interactions up to the third neighbor, and three types of trio interactions between like and unlike sites, thus amounting to a total of fifteen independent interaction parameters. At low temperature, the simulations yield a number of ordered phases corresponding to different periodicities and oxygen coverages. Two of these ordered phases had already been reported experimentally at the time the work was carried out. The prediction of two new (higher coverage) periodic structures, namely a 3/4 and a 1 monolayer phase, has meanwhile been confirmed by various experimental studies. This example thus demonstrates the predictive nature of the first-principles approach, and the stimulating and synergetic interplay between theory and experiment. It is also worth pointing out that these new phases and their coexistence in certain coverage regions were not obtained in early MC calculations of this system based on an empirical LGH, which was determined by simply fitting a minimal number of pair interactions to the then available experimental phase diagram [51].
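A minimal Python sketch of such a grand-canonical Metropolis sampling, for a toy lattice gas with a single repulsive first-neighbor pair interaction on a periodic square lattice; all parameter values are hypothetical, and a real study would evaluate the full first-principles LGH on the appropriate site lattice and use local energy differences instead of recomputing the total energy:

import numpy as np

rng = np.random.default_rng(0)
K_B = 8.617333e-5  # eV/K

def lgh_energy(occ, V1=0.05):
    # Toy LGH: repulsive first-neighbor pair interaction (eV) on a
    # periodic square lattice; each bond is counted exactly once.
    nn = np.roll(occ, 1, axis=0) + np.roll(occ, 1, axis=1)
    return V1 * np.sum(occ * nn)

def gcmc(L=20, T=300.0, mu=-0.05, nsteps=50_000):
    # Grand-canonical Metropolis: trial moves randomly add or remove an
    # adsorbate; mu is the gas-phase chemical potential (eV).
    occ = np.zeros((L, L), dtype=int)
    E = lgh_energy(occ)
    for _ in range(nsteps):
        i, j = rng.integers(L, size=2)
        occ[i, j] ^= 1                       # trial: flip site occupation
        E_new = lgh_energy(occ)
        dN = 1 if occ[i, j] else -1
        if rng.random() < np.exp(-((E_new - E) - mu * dN) / (K_B * T)):
            E = E_new                        # accept
        else:
            occ[i, j] ^= 1                   # reject: restore configuration
    return occ.mean()                        # average coverage at (T, mu)

print(gcmc())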
Figure 7. Phase diagram for O/Ru(0001) as obtained using the ab initio LGH approach in combination with MC calculations. The triangles indicate first order transitions and the circles second order transitions. The identified ordered structures are labeled as: (2 × 2)-O (A), (2 × 1)-O (B), (√3 × √3)R30° (C), (2 × 2)-3O (D), and disordered lattice-gas (l.g.) (from Ref. [63]).
We would also like to stress the superior transferability of the first-principles interaction parameters. As an example we mention simulations of temperature programmed desorption (TPD) spectra, which can, among other possibilities, be obtained by combining the LGH with a transfer-matrix approach and kinetic rate equations [61]. Figure 8 shows the result obtained with exactly the same LGH that also underlies the phase diagram of Fig. 7. Although empirical fits of TPD spectra may give better agreement between calculated and experimental results, we note that the agreement visible in Fig. 8 is in fact quite good. The advantage, on the other hand, is that no empirical parameters were used in the LGH, which allows one to unambiguously trace the TPD features back to lateral interactions with well-defined microscopic meaning. The results summarized in Fig. 7 also serve quite well to illustrate the already mentioned differences between the initially described free energy plots and the LGH + MC method. In the first approach, the stability of a fixed set of configurations is compared in order to arrive at the phase diagram. Consider, for example, that we had restricted our free energy analysis of the O/Ru(0001) system to only the O(2 × 2) and O(2 × 1) adlayer structures, the two experimentally known ordered phases before 1995. The stability region of the former phase, bounded at lower chemical potentials by the clean surface and at higher chemical potentials by the O(2 × 1) phase, then comes out just as much as in Fig. 7.
Figure 8. Theoretical (left panel) and experimental (right panel) temperature programmed desorption curves. Each curve shows the rate of oxygen molecules that desorb from the Ru(0001) surface as a function of temperature, when the system is prepared with a given initial oxygen coverage θ ranging from 0.1 to 1 monolayer (ML). The first-principles LGH employed in the calculations is exactly the same as the one underlying the phase diagram of Fig. 7 (from Refs. [57, 58]).
This stability range will, however, be independent of temperature, i.e., there is no order–disorder transition at higher temperatures, due to the neglect of configurational entropy. More importantly, since the two higher-coverage phases would not have been explicitly considered, the stability of the O(2 × 1) phase would falsely extend over the whole higher chemical potential range. One would have to include these two configurations in the analysis to obtain the correct result shown in Fig. 7, whereas the LGH + MC method yields them automatically. While this emphasizes the deeper insight and increased predictive power that is achieved by the proper sampling of configuration space in the LGH + MC technique, one must also recognize that the computational cost of the latter is significantly higher. It is, in particular, straightforward to directly compare the stability of qualitatively different geometries like the on-surface adsorption and the surface oxide phases in Fig. 3 in a free energy plot (or the various surface reconstructions entering Fig. 4). Setting up an LGH that would equally describe both systems, on the other hand, is far from trivial. Even if it were feasible to find a generalized lattice that would be able to treat all system states, disentangling and determining the manifold of interaction energies in such a lattice would be very involved. The required discretization of the real system, i.e., the mapping onto a lattice, is therefore to date the major limitation of the LGH + MC technique – be it applied to two-dimensional pure surface systems or
even worse to three-dimensional problems addressing a surface fringe of finite width. Still, it is also precisely this mapping and the resulting very fast evaluation of the LGH that allow for an extensive and reliable sampling of the configuration space of complex systems that is hitherto unparalleled by other approaches. Having highlighted the importance of this sampling for the determination of unanticipated new ordered phases at lower temperatures, the final example in this section illustrates specifically the decisive role it also plays for the simulation and understanding of order–disorder transitions at elevated temperatures. A particularly intriguing transition of this kind is observed for Na on Al(001). The interest in such alkali metal adsorption systems has been intense, especially since in the early 1990s it was found (first for Na on Al(111) and then on Al(100)) that the alkali metal atoms may kick out surface Al atoms and adsorb substitutionally [65–67]. This was in sharp contrast to the “experimental evidence” and the generally accepted understanding of the time, which was that alkali-metal atoms adsorb in the highest coordinated on-surface hollow site and cause little disturbance to a close-packed metal surface. For the specific system Na on Al(001) at a coverage of 0.2 monolayer, recent low energy electron diffraction experiments furthermore observed a reversible phase transition in the temperature range 220–300 K. Below this range, an ordered (√5 × √5)R27° structure forms, where the Na atoms occupy surface substitutional sites, while above it, the Na atoms, still in the substitutional sites, form a disordered arrangement in the surface. Using the ab initio LGH + MC approach, the ordered phase and the order–disorder transition could be successfully reproduced [67]. Pair interactions up to the sixth nearest neighbor, two different trio interactions, and one quarto interaction were included in the LGH expansion. We note that determining these interaction parameters requires considerable care and cross-validation. To specifically identify the crucial role played by configurational entropy in the temperature induced order–disorder transition, a specific MC algorithm proposed by Wang and Landau [68] was employed. In contrast to the above outlined Metropolis algorithm, this scheme affords an explicit calculation of the density of configuration states, g(E), i.e., the number of system configurations with a certain energy E. This quantity in turn provides all major thermodynamic functions, e.g., the canonical distribution at a given temperature, g(E)e^{−E/k_BT}; the free energy, F(T) = −k_B T ln(Σ_E g(E)e^{−E/k_BT}) = −k_B T ln Z, where Z is the partition function; the internal energy, U(T) = [Σ_E E g(E)e^{−E/k_BT}]/Z; and the entropy, S = (U − F)/T. Figure 9 shows the calculated density of configuration states g(E), together with the internal and free energy derived from it. In the latter two quantities, the abrupt change corresponding to the first-order phase transition obtained at 301 K can be nicely discerned. This is also visible as a double peak in the logarithm of the canonical distribution (Fig. 9(a), inset) and as a singularity in the specific heat at the critical temperature (not shown) [67].
Figure 9. (a) Calculated density of configuration states, g(E), for Na on Al(100) at a coverage of 0.2 monolayers. Inset: Logarithm of the canonical distribution P(E, T) = g(E)e^{−E/k_BT} at the critical temperature. (b) Free energy F(T) and internal energy U(T) as a function of temperature, derived from g(E). The cusp in F(T) and the discontinuity in U(T) at 301 K reflect the occurrence of the order–disorder phase transition, experimentally observed in the range 220–300 K (from Ref. [67]).
It can be seen that the free energy decreases notably with increasing temperature. The reason for this is clearly the entropic contribution (the difference between the free and internal energies), the magnitude of which suddenly increases at the transition temperature and continues to increase steadily thereafter. Taking this configurational entropy into account is therefore (and obviously) the crucial aspect in the simulation and understanding of this order–disorder phase transition, and only the LGH + MC approach with its proper sampling of configuration space can provide it. What the approach does not yield, on the other hand, is how the phase transition actually takes place microscopically, i.e., how the substitutional Na atoms move their positions by necessarily displacing surface Al atoms, and on what time scale (with what kinetic hindrance) all this happens. For this, one necessarily needs to go beyond a thermodynamic description and explicitly follow the kinetics of the system over time, which will be the topic of the following section.
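The post-processing from g(E) to the thermodynamic functions of Fig. 9(b) is a few lines of numerics. A minimal sketch; the density of states would here come from a Wang–Landau run, and since the algorithm determines g(E) only up to a constant factor, F is obtained up to an additive constant:

import numpy as np

K_B = 8.617333e-5  # eV/K

def thermo_from_dos(E, log_g, T):
    # F(T), U(T), S(T) from the density of configuration states g(E),
    # passed as ln g(E) on the energy grid E (eV) to avoid overflow.
    w = log_g - E / (K_B * T)
    wmax = w.max()
    Z = np.sum(np.exp(w - wmax))        # partition function / e^{wmax}
    P = np.exp(w - wmax) / Z            # canonical distribution P(E, T)
    U = np.sum(E * P)                   # internal energy
    F = -K_B * T * (np.log(Z) + wmax)   # free energy, F = -k_B T ln Z
    S = (U - F) / T                     # entropy
    return F, U, S

A cusp in F(T) and a discontinuity in U(T) on a temperature scan then signal a first-order transition, as in Fig. 9(b).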
2.
First-Principles Kinetic Monte Carlo Simulations
Up to now we have discussed how equilibrium MC simulations can be used to explicitly evaluate the partition function, in order to arrive at surface phase diagrams as a function of temperature and the partial pressures of the surrounding gas-phase. For this, statistical averages over a sequence of appropriately sampled configurations were taken, and it is appealing to also connect some time evolution to this sequence of generated configurations (MC steps). In fact, certain nonequilibrium problems can already be tackled on the basis of this uncalibrated “MC time” [47]. The reason why this does not work in general is twofold. First, equilibrium MC is designed to achieve an optimum sampling of configurational space. As such, also MC moves that are unphysical, like a particle hop from an occupied site to an unoccupied one hundreds of lattice spacings away, may be allowed if they help to obtain an efficient sampling of the relevant configurations. The remedy for this obstacle is straightforward, though, as one only needs to restrict the possible MC moves to “physical” elementary processes. The second reason is more involved, as it has to do with the probabilities with which the individual events are executed. In equilibrium MC the forward and backward acceptance probabilities of time-reversed processes, like hops back and forth between two sites, only have to fulfill the detailed balance criterion, and this is not enough to establish a proper relationship between MC time and “real time” [69]. In kinetic Monte Carlo (kMC) simulations a proper relationship between MC time and real time is achieved by interpreting the MC process as providing a numerical solution to the Markovian master equation describing the dynamic system evolution [70–74].
The simulation itself still looks superficially similar to equilibrium MC in that a sequence of configurations is generated using random numbers. At each configuration, however, all possible elementary processes and the rates with which they occur are evaluated. Appropriately weighted by these different rates, one of the possible processes is then executed randomly to achieve the new system configuration, as sketched in Fig. 10. This way, the kMC algorithm effectively simulates stochastic processes, and a direct and unambiguous relationship between kMC time and real time can be established [74]. Not only does this open the door to a treatment of the kinetics of nonequilibrium problems, but it also does so very efficiently, since the time evolution is actually coarse-grained to the really decisive rare events, passing over the irrelevant short-time dynamics. Time scales of the order of seconds or longer for mesoscopically sized systems are therefore readily accessible by kMC simulations [12].
Figure 10. Flow diagram illustrating the basic steps in a kMC simulation. (i) Loop over all lattice sites of the system and determine the atomic processes that are possible for the current system configuration. Calculate or look up the corresponding rates. (ii) Generate two random numbers, (iii) advance the system according to the process selected by the first random number (this could, e.g., be moving an atom from one lattice site to a neighboring one, if the corresponding diffusion process was selected). (iv) Increment the clock according to the rates and the second random number, as prescribed by an ensemble of Poisson processes, and (v) start all over or stop, if a sufficiently long time span has been simulated.
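In code, steps (ii)–(iv) of Fig. 10 condense to a few lines. A minimal sketch of one step of this algorithm; the list of possible processes and their rates is whatever the model supplies for the current configuration:

import numpy as np

rng = np.random.default_rng()

def kmc_step(rates):
    # rates: 1D array with the rates (1/s) of all processes possible in
    # the current configuration. The first random number selects a process
    # with probability proportional to its rate; the second draws the time
    # increment from the exponential waiting-time distribution of the
    # total rate, as prescribed by an ensemble of Poisson processes.
    R = rates.sum()
    k = np.searchsorted(np.cumsum(rates), rng.random() * R)
    dt = -np.log(rng.random()) / R
    return k, dt   # execute process k, then advance the clock by dt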
2.1.
Insights from MD, MC, and kMC
To further clarify the different insights provided by molecular dynamics, equilibrium Monte Carlo, and kinetic Monte Carlo simulations, consider the simple, but typical, rare event type model system shown in Fig. 11. An isolated adsorbate vibrates at finite temperature T with a frequency on the picosecond time scale and diffuses about every microsecond between two neighboring sites of different stability. In terms of a PES, this situation is described by two stable minima of different depths separated by a sizable barrier. Starting with the particle in either of the two sites, a MD simulation would follow the thermal motion of the adsorbate in detail. In order to do this accurately, time steps in the femtosecond range are required. Before the first diffusion event can be observed at all, of the order of 10^9 time steps therefore have to be calculated first, in which the particle does nothing but vibrate around the stable minimum. Computationally this is unfeasible for any but the simplest model systems, and even if it were feasible it would obviously not be an efficient tool to study the long-term time evolution of this system. For Monte Carlo simulations, on the other hand, the system first has to be mapped onto a lattice.
Figure 11. Schematic potential energy surface (PES) representing the thermal diffusion of an isolated adsorbate between two stable lattice sites A and B of different stability. A MD simulation would explicitly follow the dynamics of the vibrations around a minimum, and is thus inefficient to address the rare diffusion events happening on a much longer time scale. Equilibrium Monte Carlo simulations provide information about the average thermal occupation of the two sites, ⟨N⟩, based on the depths of the two PES minima (E_A and E_B). Kinetic Monte Carlo simulations follow the “coarse-grained” time evolution of the system, N(t), employing the rates for the diffusion events between the minima (r_A→B, r_B→A). For this, PES information not only about the minima, but also about the barrier height at the transition state (TS) between initial and final state is required (ΔE_A, ΔE_B).
This is unproblematic for the present model and results in two possible system states, with the particle being in one or the other minimum. Equilibrium Monte Carlo then provides only time-averaged information about the equilibrated system. For this, a sequence of configurations with the system in either of the two system states is generated, and considering the higher stability of one of the minima, appropriately more configurations with the system in this state are sampled. When taking the average, one arrives at the obvious result that the particle is with a certain higher (Boltzmann-weighted) probability in the lower minimum than in the higher one. Real information on the long-term time evolution of the system, i.e., focusing on the rare diffusion events, is finally provided by kMC simulations. For this, first the two rates of the diffusion events from one system state to the other and vice versa have to be known. We will describe below that they can be obtained from knowledge of the barrier between the two states and the vibrational properties of the particle in the minima and at the barrier, i.e., from the local curvatures. A lot more information on the PES is therefore required for a kMC simulation than for equilibrium MC, which only needs input about the PES minima. Once the rates are known, a kMC simulation starting from any arbitrary system configuration will first evaluate all possible processes and their rates and then execute one of them with appropriate probability. In the present example, this list of events is trivial, since with the particle in either minimum only the diffusion to the other minimum is possible. When the event is executed, on average the time (rate)^{-1} has passed and the clock is advanced accordingly. Note that, as described initially, the rare diffusion events happen on a time scale of nano- to microseconds, i.e., with only one executed event the system time will be directly incremented by this amount. In other words, the time is coarse-grained to the rare event time, and all the short-time dynamics (corresponding in the present case to the picosecond vibrations around the minimum) are efficiently contained in the process rate itself. Since the barrier seen by the particle when in the shallower minimum is lower than when in the deeper one, cf. Fig. 11, the rate for jumping into the deeper minimum will correspondingly be higher than the one for the backwards jump. Generating the sequence of configurations, more time will therefore have passed after each diffusion event from deep to shallow compared to the reverse process. When taking a long-time average, describing then the equilibrated system, one therefore necessarily arrives at the result that the particle is on average longer in the lower minimum than in the higher one. This is identical to the result provided by equilibrium Monte Carlo, and if only this information is required, the latter technique would most often be the much more efficient way to obtain it. kMC, on the other hand, has the additional advantage of shedding light on the detailed time evolution itself, and can in particular also follow the explicit kinetics of systems that are not (or not yet) in thermal equilibrium.
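For the two-site model this boils down to a handful of lines. The sketch below uses arbitrary placeholder rates with r_A→B >> r_B→A (minimum A shallower than B) and accumulates the real time spent in each site; in the long-time limit it reproduces the Boltzmann-weighted occupation that equilibrium MC would give directly:

import numpy as np

rng = np.random.default_rng(1)

def two_state_kmc(r_ab, r_ba, nevents=100_000):
    # Particle hopping between sites A and B (cf. Fig. 11) with rates
    # r_ab (A -> B) and r_ba (B -> A). Returns the fraction of simulated
    # real time spent in A and in B.
    state, t, t_a = 0, 0.0, 0.0             # state 0 = A, 1 = B
    for _ in range(nevents):
        rate = r_ab if state == 0 else r_ba
        dt = -np.log(rng.random()) / rate   # waiting time before the only
        if state == 0:                      # possible event is executed
            t_a += dt
        t += dt
        state ^= 1                          # execute the hop
    return t_a / t, 1.0 - t_a / t

# With the barrier out of A much lower than out of B, nearly all real
# time is spent in the deeper minimum B:
print(two_state_kmc(r_ab=1e6, r_ba=1e3))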
From the discussion of this simple model system, it is clear that the key ingredients of a kMC simulation are the analysis and identification of all possibly relevant elementary processes and the determination of the associated rates. Once this is known, the coarse graining in time achieved in kMC immediately allows one to follow the time evolution and the statistical occurrence and interplay of the molecular processes of mesoscopically sized systems up to seconds or longer. As such, it is currently the most efficient approach to studying longer time and larger length scales while still providing atomistic information. In its original development, kMC was exclusively applied to simplified model systems, employing a few processes with guessed or fitted rates (see, e.g., Ref. [69]). The new aspect brought into play by so-called first-principles kMC simulations [75, 76] is that these rates and the processes are directly provided from electronic structure theory calculations, i.e., that the parameters fed into the kMC simulation have a clear microscopic meaning.
2.2.
Getting the Processes and Their Rates
For the rare event type molecular processes mostly encountered in the surface science context, an efficient and reliable way to obtain the individual process rates is transition-state theory (TST) [77–79]. The two basic quantities entering this theory are an effective attempt frequency, Γ°, and the minimum energy barrier ΔE that needs to be overcome for the event to take place, i.e., to bring the system from the initial to the final state. The atomic configuration corresponding to ΔE is accordingly called the transition state (TS). Within the harmonic approximation, the effective attempt frequency is proportional to the ratio of the normal vibrational mode frequencies at the initial state and at the transition state. Just like the barrier ΔE, Γ° is thus also related to properties of the PES, and as such is directly amenable to a calculation with electronic structure theory methods like DFT [80]. In the end, the crucial additional PES information required in kMC compared to equilibrium MC is therefore the location of the transition state, in the form of the PES saddle point along a reaction path of the process. Particularly for high-dimensional PESs this is not at all a trivial problem, and the development of efficient and reliable transition-state search algorithms is a very active area of current research [81, 82]. For many surface related elementary processes (e.g., diffusion, adsorption, desorption or reaction events) the dimensionality is fortunately not excessive, or can be mapped onto a couple of prominent reaction coordinates, as exemplified in Fig. 12. The identification of the TS and the ensuing calculation of the rate for individual identified elementary processes with TST are then computationally involved, but feasible. This still leaves, as a fundamental problem, how the relevant elementary processes for any given system configuration can be identified in the first place.
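Once the transition state is located, the harmonic TST rate r = Γ° exp(−ΔE/k_BT) is an elementary evaluation. A minimal sketch; the mode frequencies are hypothetical placeholders, and the 0.89 eV barrier is borrowed from the reaction of Fig. 12:

import numpy as np

K_B = 8.617333e-5  # eV/K

def htst_rate(freqs_is, freqs_ts, dE, T):
    # Harmonic TST rate (1/s): the attempt frequency is the ratio of the
    # products of the normal-mode frequencies at the initial state and
    # the (one fewer) real modes at the transition state.
    prefactor = np.prod(freqs_is) / np.prod(freqs_ts)
    return prefactor * np.exp(-dE / (K_B * T))

# e.g., an attempt frequency of order 10^13 1/s and a 0.89 eV barrier
# at T = 600 K (all mode frequencies invented for illustration):
print(htst_rate(np.array([1e13, 5e12, 5e12]),
                np.array([4e12, 4e12]), dE=0.89, T=600.0))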
Figure 12. Calculated DFT-PES of a CO oxidation reaction process at the RuO2 (110) model catalyst surface. The high-dimensional PES is projected onto two reaction coordinates, representing two lateral coordinates of the adsorbed Ocus and COcus (cf. Fig. 5). The energy zero corresponds to the initial state at (0.00 Å, 3.12 Å), and the transition state is at the saddle point of the PES, yielding a barrier of 0.89 eV. Details of the corresponding transition state geometry are shown in the inset. Ru = light, large spheres, O = dark, medium spheres, and C = small, white spheres (only the atoms lying in the reaction plane itself are drawn as three-dimensional spheres) (from Ref. [26]).
Most TS-search algorithms require not only the automatically provided information on the actual system state, but also knowledge of the final state after the process has taken place [81]. In other words, quite some insight into the physics of the elementary process is needed in order to determine its rate and include it in the list of possible processes in the kMC simulation. How difficult and nonobvious this can be, even for the simplest kind of processes, is nicely exemplified by the diffusion of an isolated metal atom over a close-packed surface [82]. Such a process is of fundamental importance for the epitaxial growth of metal films, which is a necessary prerequisite in many applications like catalysis, magneto-optic storage media or interconnects in microelectronics. Intuitively, one would expect the surface diffusion to proceed by simple hops from one lattice site to a neighboring lattice site, as illustrated in Fig. 13(a) for an fcc(100) surface. Having said that, it is meanwhile well established that on a number of substrates diffusion does not operate preferentially by such hopping processes, but by atomic exchange, as explained in Fig. 13(b). Here, the adatom replaces a surface atom, and the latter then assumes the adsorption site. Even much more complicated, correlated exchange diffusion processes involving a larger number of surface atoms are currently being discussed for some materials. And the complexity increases further, of course, when diffusion along island edges, across steps, and around defects needs to be treated in detail [82].
Figure 13. Schematic top view of a fcc(100) surface, explaining diffusion processes of an isolated metal adatom (white circle). (a) Diffusion by hopping to a neighboring lattice site, (b) diffusion by exchange with a surface atom.
While it is therefore straightforward to say that one wants to include, e.g., diffusion in a kMC simulation, it can in practice be very involved to identify the individual processes actually contributing to it. Some attempts to automatize the search for the elementary processes possible for a given system configuration are currently being undertaken, but in the first-principles kMC studies performed to date (and in the foreseeable future), the process lists are simply generated by physical insight. This obviously bears the risk of overlooking a potentially relevant molecular process, and it is in this light that this still evolving method has to be seen. Contrary to traditional kMC studies, where an unknown number of real molecular processes is often lumped together into a handful of effective processes with optimized rates, first-principles kMC has the advantage, however, that the omission of a relevant elementary process will definitely show up in the simulation results. As such, first experience [15] shows that a much larger number of molecular processes needs to be accounted for in a corresponding modeling “with microscopic understanding” compared to traditional empirical kMC. In other words, the statistical interplay determining the observable function of materials takes place among quite a number of different elementary processes, and is therefore often much too complex to be understood by just studying the one or other elementary process alone in detail.
2.3.
Applications to Semiconductor Growth and Catalysis
The new quality of, and the novel insights that can be gained by, mesoscopic first-principles kMC simulations were first demonstrated in the area of nucleation
and growth in metal and semiconductor epitaxy [75, 76, 83–87]. As one example from this field we return to the GaAs(001) surface already discussed in the context of the free-energy plots. As apparent from Fig. 4, the so-called β2(2 × 4) reconstruction represents the most stable phase under moderately As-rich conditions, which are typically employed in the MBE growth of this material. Aiming at an atomic-scale understanding of this technologically most relevant process, first-principles LGH + kMC simulations were performed, including the deposition of As2 and Ga from the gas phase, as well as diffusion on this complex β2(2 × 4) semiconductor surface. To reach a trustworthy modeling, the consideration of more than 30 different elementary processes was found to be necessary, underlining our general message that complex materials properties cannot be understood by analyzing isolated molecular processes alone. Snapshots of characteristic stages during a typical simulation at realistic deposition fluxes and temperature are given in Fig. 14. They show a small part (namely 1/60) of the total mesoscopic simulation area, focusing on one "trench" of the β2(2 × 4) reconstruction. At the chosen conditions, island nucleation is observed in these reconstructed surface trenches, which is followed by growth along the trench, thereby extending into a new layer. Monitoring the density of the nucleated islands in huge simulation cells (160 × 320 surface lattice constants), a saturation indicating the beginning of steady-state growth is only reached after simulation times of the order of seconds for quite a range of temperatures. Obviously, neither such system sizes nor such time scales would have been accessible by direct electronic-structure theory calculations combined, e.g., with MD simulations. In the ensuing steady-state growth, attachment of a deposited Ga atom to an existing island typically takes place before the adatom can take part in a new nucleation event. This leads to a very small nucleation rate that is counterbalanced by a simultaneous decrease in the number of islands due to coalescence. The resulting constant island density during steady-state growth is plotted in Fig. 15 for a range of technologically relevant temperatures. At the lower end, around 500–600 K, this density decreases with increasing temperature, consistent with the frequently employed standard nucleation theory. Under these conditions, the island morphology is predominantly determined by Ga surface diffusion alone, i.e., it may be understood on the basis of one class of molecular processes. Around 600 K, however, the island density becomes almost constant, and above around 800 K it even increases again. The determined magnitude is then orders of magnitude away from the prediction of classical nucleation theory, cf. Fig. 15, but in very good agreement with existing experimental data. The reason for this unusual behavior is that the adsorption of As2 molecules at reactive surface sites becomes reversible at these elevated temperatures. The initially formed Ga–As–As–Ga2 complexes required for nucleation, cf. Fig. 14(b), become unstable against As2 desorption, and a decreasing fraction of them can stabilize into larger aggregates.
[Figure 14, panels (a)–(d): simulation snapshots at t = 100 ms, 135 ms, 170 ms, and 400 ms; see caption below.]
Figure 14. Snapshots of characteristic stages during a first-principles kMC simulation of GaAs homoepitaxy. Ga and As substrate atoms appear in medium and dark grey, Ga adatoms in white. (a) Ga adatoms preferentially wander around in the trenches. (b) Under the growth conditions used here, an As2 molecule adsorbing on a Ga adatom in the trench initiates island formation. (c) Growth proceeds into a new atomic layer via Ga adatoms forming Ga dimers. (d) Eventually, a new layer of arsenic starts to grow, and the island extends itself towards the foreground, while more material attaches along the trench. The complete movie can be retrieved via the EPAPS homepage (http://www.aip.org/pubservs/epaps.html), document No. E-PRLTAO-87-031152 (from Ref. [86]).
Due to the contribution of the decaying complexes, an effectively higher density of mobile Ga adatoms results at the surface, which in turn yields a higher nucleation rate of new islands. The temperature window around 700–800 K, which is frequently used by MBE crystal growers, may therefore be understood as permitting a compromise between high Ga adatom mobility and stability of As complexes that leads to a low island density and correspondingly smooth films. Exactly under the technologically most relevant conditions, surface properties that decisively influence the growth behavior (and therewith the targeted functionality) therefore result from the concerted interdependence of distinct molecular processes, i.e., in this case diffusion, adsorption, and desorption. To further show that this interdependence is, in our opinion, more the rule than an exception in materials science applications, we return in the remainder of
[Figure 15 plot: saturation island density (µm⁻²) versus 1000/T (K⁻¹) between 500 and 880 K, comparing the kMC simulation with classical nucleation theory (i* = 1); see caption below.]
Figure 15. Saturation island density corresponding to steady-state MBE of GaAs as a function of the inverse growth temperature. The dashed line shows the prediction of classical nucleation theory for diffusion-limited attachment and a critical nucleus size equal to 1. The significant deviation at higher temperatures is caused by arsenic losses due to desorption, which is not considered in classical nucleation theory (from Ref. [87]).
this section to the field of heterogeneous catalysis. Here, the conversion of reactants into products by means of surface chemical reactions (A + B → C) adds another, qualitatively different class of processes to the statistical interplay. In the context of the thermodynamic free-energy plots we had already discussed that these ongoing catalytic reactions at the surface continuously consume the adsorbed reactants, driving the surface populations away from their equilibrium values. If this effect is significant, e.g., in regions of very high catalytic activity, the average surface coverage and structure never reach equilibrium with the surrounding reactant gas phase, even under steady-state operation, and must hence be modeled by explicitly accounting for the surface kinetics [88–90]. In terms of kMC, this means that in addition to the diffusion, adsorption, and desorption of the reactants and products, reaction events also have to be considered. For the case of CO oxidation, one of the central reactions taking place in automotive catalytic converters, this translates into the conversion of adsorbed O and CO into CO2. Even for the moderately complex model catalyst RuO2(110) discussed before, again close to 30 elementary processes result, comprising both adsorption to and desorption from the two prominent site types at the surface (br and cus, cf. Fig. 5), as well as diffusion between any nearest-neighbor site combination (br→br, br→cus, cus→br, cus→cus). Finally, reaction events account for the catalytic activity and are possible
whenever O and CO are simultaneously adsorbed in any nearest-neighbor site combination. For given temperature and reactant pressures, the corresponding kMC simulations are first run until steady-state conditions are reached, and the average surface populations are thereafter evaluated over sufficiently long times. We note that even at elevated temperatures, both time periods may again largely exceed the time span accessible by current MD techniques, as exemplified in Fig. 16. The obtained steady-state average surface populations at T = 600 K are shown in Fig. 17 as a function of the gas-phase partial pressures. Comparing with the surface phase diagram of Fig. 5 from ab initio atomistic thermodynamics, i.e., neglecting the effect of the ongoing catalytic reactions at the surface, similarities, but also the expected significant differences under some environmental conditions, can be discerned. The differences most prominently affect the presence of oxygen at the br sites, where it is much more strongly bound than CO. For the thermodynamic approach only the ratio of adsorption to desorption matters, and due to the ensuing very low desorption rate, Obr is correspondingly stabilized even when there is much more CO than O2 in the gas phase (left upper part of Fig. 5). The surface reactions, on the other hand, provide a very efficient means of
[Figure 16 plot: site occupation number (%) of the br and cus sites by O and CO as a function of time (s); see caption below.]
Figure 16. Time evolution of the site occupation by O and CO of the two prominent adsorption sites of the RuO2 (110) model catalyst surface shown in Fig. 5. The temperature and pressure conditions chosen (T = 600 K, pCO = 20 atm, pO2 = 1 atm) correspond to an optimum catalytic performance. Under these conditions kinetics builds up a steady-state surface population in which O and CO compete for either site type at the surface, as reflected by the strong fluctuations in the site occupations. Note the extended time scale, also for the "induction period" until the steady-state populations are reached when starting from a purely oxygen-covered surface. A movie displaying these changes in the surface population can be retrieved via the EPAPS homepage (http://www.aip.org/pubservs/epaps.html), document No. E-PRLTAO-93-006438 (from Ref. [90]).
Figure 17. Left panel: steady-state surface structures of RuO2 (110) in an O2/CO environment obtained by first-principles kMC calculations at T = 600 K. In all non-white areas, the average site occupation is dominated (> 90%) by one species, and the site nomenclature is the same as in Fig. 5, where the same surface structure was addressed within the ab initio atomistic thermodynamics approach. Right panel: map of the corresponding catalytic CO oxidation activity, measured as so-called turnover frequencies (TOFs), i.e., CO2 conversion per cm² and second: white areas have a TOF < 10¹¹ cm⁻² s⁻¹, and each increasing gray level represents one order of magnitude higher activity. The highest catalytic activity (black region, TOF > 10¹⁷ cm⁻² s⁻¹) is narrowly concentrated around the phase coexistence region that was already suggested by the thermodynamic treatment (from Ref. [90]).
removing this Obr species that is not accounted for in the thermodynamic treatment. As a net result, under most CO-rich conditions in the gas phase, oxygen is consumed faster by the reaction than it can be replenished from the gas phase. The kMC simulations covering this effect then yield a much lower surface concentration of Obr, and in turn show a much larger stability range of surface structures with CObr at the surface (blue and hatched blue regions). It is particularly interesting to notice that this yields a stability region of a surface structure consisting only of CO adsorbed at br sites, which does not exist in the thermodynamic phase diagram at all, cf. Fig. 5. The corresponding CObr/− "phase" (hatched blue region) is thus a stable structure with a defined average surface population that is entirely stabilized by the kinetics of this open catalytic system. These differences were conceptually anticipated in the thermodynamic phase diagram and qualitatively delineated by the hatched regions in Fig. 5. Due to the vicinity of a phase transition and the ensuing enhanced dynamics at the surface, these regions were also considered potential candidates for highly efficient catalytic activity. This is in fact confirmed by the first-principles kMC simulations, as shown in the right panel of Fig. 17. Since the detailed statistics of all elementary processes is explicitly accounted for in this type of simulation, it is straightforward to also evaluate the average occurrence of
the reaction events over long time periods as a measure of the catalytic activity. The obtained so-called turnover frequencies (TOF, in units of formed CO2 per cm² per second) are indeed narrowly peaked around the phase coexistence line, where the kinetics builds up a surface population in which O and CO compete for either site type at the surface. This competition is in fact nicely reflected by the large fluctuations in the surface populations apparent in Fig. 16. The partial pressures and temperatures corresponding to this high-activity "phase", and even the absolute TOF values under these conditions, agree extremely well with detailed experimental studies measuring the steady-state activity in the temperature range from 300 to 600 K, both at high pressures and in UHV. Interestingly, under the conditions of highest catalytic performance it is not the reaction with the highest rate (lowest barrier) that dominates the activity. Although this particular elementary process itself exhibits very suitable properties for catalysis, it occurs too rarely in the full concert of all possible events to decisively affect the observable macroscopic functionality. This emphasizes again the importance of the statistical interplay and the novel level of understanding that can only be provided by first-principles based mesoscopic studies.
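Because every executed event is explicitly available in such a simulation, the activity measure itself is trivial to evaluate. A minimal sketch, where the cell area and event count are chosen purely for illustration (a 20 nm × 20 nm cell with 4 × 10⁵ reaction events per second lands in the high-activity regime of Fig. 17):

```python
def turnover_frequency(n_reaction_events, area_cm2, elapsed_time_s):
    """TOF in formed CO2 per cm^2 and second, as mapped in Fig. 17."""
    return n_reaction_events / (area_cm2 * elapsed_time_s)

# (20 nm)^2 = 4e-12 cm^2; 4e5 logged reaction events in 1 s of simulated time
print(turnover_frequency(4.0e5, 4.0e-12, 1.0))  # 1e17 cm^-2 s^-1
```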
3.
Outlook
As highlighted by the few examples from surface physics above, many materials' properties and functions arise out of the interplay of a large number of distinct molecular processes. Theoretical approaches aiming at an atomic-scale understanding and predictive modeling of such phenomena therefore have to achieve both an accurate description of the individual elementary processes at the electronic level and a proper treatment of how they act together on the mesoscopic level. We have sketched the current status and future direction of some emerging methods which correspondingly try to combine electronic structure theory with concepts from statistical mechanics and thermodynamics. The results already achieved with these techniques give a clear indication of the new quality and novelty of insights that can be gained by such descriptions. On the other hand, it is also apparent that we are only at the beginning of a successful bridging of the micro- to mesoscopic transition in the multiscale materials modeling endeavor. Some of the major conceptual challenges that we see at present, and that need to be tackled when applying these schemes to more complex systems, have been touched upon in this chapter. They may be summarized under the keywords accuracy, mapping, and efficiency, and as an outlook we briefly comment further on each of them.

Accuracy: The reliability of the statistical treatment depends predominantly on the accuracy of the description of the individual molecular processes that are input to it. For the mesoscopic methods themselves it makes in fact no
difference whether the underlying PES comes from a semi-empirical potential or from first-principles calculations, but the predictive power of the obtained results (and the physical meaning of the parameters) will obviously be significantly different. In this respect, we mention only two somewhat diverging aspects. For the interplay of several (possibly competing) molecular processes, an "exact" description of the energetics of each individual process, e.g., in the form of a rate for kMC simulations, may be less important than the relative ordering among the processes, as provided, e.g., by the correct trend in their energetics. In this case, the frequently requested chemical accuracy in the description of single processes could be a misleading concept, and modest errors in the PES would tend to cancel (or compensate each other) in the statistical mechanics part. We stress the words modest errors here, however, which, e.g., largely precludes semi-empirical potentials. Particularly for systems where bond breaking and making is relevant, the latter do not have the required accuracy. For the particular case of DFT, as the current workhorse of electronic structure theories, it appears that the present uncertainties due to the approximate treatment of electronic exchange and correlation are less problematic than hitherto often assumed (though caution and systematic tests remain necessary). On the other hand, in other cases, where for example one process strongly dominates the concerted interplay, such an error cancellation in the statistical mechanics part will certainly not occur. Then a more accurate description of this process will be required than can be provided by the exchange-correlation functionals in DFT that are available today. Improved descriptions based on wave-function methods and on local corrections to DFT exist or are being developed, but so far come at a high computational cost. Assessing what kind of accuracy is required for which process under which system state, possibly achieved by evolutionary schemes based on gradually improving PES descriptions, will therefore play a central role in making atomistic statistical mechanics methods computationally feasible for increasingly complex systems.

Mapping: The configuration space of most materials science problems is exceedingly large. In order to arrive at meaningful statistics, even the most efficient sampling of such spaces still requires (at present and in the foreseeable future) a number of PES evaluations that is prohibitively large to be provided directly by first-principles calculations. This problem is mostly circumvented by mapping the actual system onto a coarse-grained lattice model, in which the real Hamiltonian is approximated by discretized expansions, e.g., in certain interactions (LGH) or elementary processes (kMC). The expansions are first parametrized by the first-principles calculations, and the statistical mechanics problem is thereafter solved by exploiting the fast evaluations of the model Hamiltonians. Since in practice these expansions can only comprise a finite number of terms, the mapping procedure intrinsically bears the problem of overlooking a relevant interaction or process. Such an omission can
obviously jeopardize the validity of the complete statistical simulation, and there are at present no foolproof or practical, let alone automated, schemes for deciding which terms to include in the expansion, nor for judging the convergence of the latter. Particularly when going to more complex systems, the present "hand-made" expansions, which are mostly based on educated guesses, will become increasingly cumbersome. Eventually, the complexity of the system may become so large that even the mapping onto a discretized lattice itself will be problematic. Overcoming these limitations may be achieved by adaptive, self-refining approaches, and will certainly be of paramount importance to ensure the general applicability of the atomistic statistical techniques.

Efficiency: Even if an accurate mapping onto a model Hamiltonian is achieved, the sampling of the huge configuration spaces will still put increasing demands on the statistical mechanics treatment. In the examples discussed above, the actual evaluation of the system partition function, e.g., by MC simulations, is a small add-on compared to the computational cost of the underlying DFT calculations. With increasing system complexity, different problems, and an increasing number of processes, this may eventually change, requiring the use of more efficient sampling schemes. A major challenge for increasing efficiency is, for example, the treatment of kinetics, in particular when processes operate at largely different time scales. The computational cost of a certain time span in kMC simulations is dictated by the fastest process in the system, while the slowest process governs what total time period actually needs to be covered. If both process scales differ largely, kMC becomes expensive. A remedy may, e.g., be provided by assuming the fast process to be always equilibrated at the time scale of the slow one; correspondingly, an appropriate mixing of equilibrium MC with kMC simulations may significantly increase the efficiency (as is typically done in present-day TPD simulations). Alternatively, the fast process may no longer be considered explicitly on the atomistic level, with only its effect incorporated into the remaining processes. Obviously, with such a grouping of processes one already approaches the meso- to macroscopic transition, gradually giving up the atomistic description in favor of a more coarse-grained or even continuum modeling. The crucial point to note here is that such a transition is done in a controlled and hierarchical manner, i.e., necessarily as the outcome of the understanding gained from the analysis of the statistical interplay at the mesoscopic level. This is therefore in marked contrast to, e.g., the frequently employed rate-equation approach in heterogeneous catalysis modeling, where macroscopic differential equations are directly fed with effective microscopic parameters. If the latter are simply fitted to reproduce some experimental data, at best a qualitative description can be achieved anyway. If really microscopically meaningful parameters are to be used, one does not know which of the many in-principle possible elementary processes to consider. Simple-minded "intuitive" approaches like,
e.g., parametrizing the reaction equation with the data from the reaction process with the highest rate, may be questionable in view of the results described above. This process may never occur in the full concert of the other processes, or it may only contribute under particular environmental conditions, or be significantly enhanced or suppressed due to an intricate interplay with another process. All this can only be filtered out by the statistical mechanics at the mesoscopic level, and can therefore not be grasped by the traditional rate-equation approach, which omits this intermediate time and length scale regime. The two key features of the atomistic statistical schemes reviewed here are, in summary, that they treat the statistical interplay of the possible molecular processes, and that these processes have a well-defined microscopic meaning, i.e., they are described by parameters that are provided by first-principles calculations. This distinguishes these techniques from approaches where molecular process parameters are either put directly into macroscopic equations, neglecting the interplay, or where only effective processes with fitted or empirical parameters are employed in the statistical simulations. In the latter case, the individual processes lose their well-defined microscopic meaning and typically represent an unspecified lump sum of not further resolved processes. Both the clear-cut microscopic meaning of the individual processes and their interplay are, however, decisive for the transferability and predictive nature of the obtained results. Furthermore, it is also precisely these two ingredients that ensure the possibility of reverse-mapping, i.e., the unambiguous tracing back of the microscopic origin of (appealing) materials' properties identified at the meso- or macroscopic modeling level. We are convinced that primarily the latter point will be crucial when trying to overcome the present trial-and-error-based system engineering in materials science in the near future. An advancement based on understanding requires theories that straddle various traditional disciplines. The approaches discussed here employ methods from various areas of electronic structure theory (physics as well as chemistry), statistical mechanics, mathematics, materials science, and computer science. This high interdisciplinarity makes the field challenging, but it is also part of the reason why the field is exciting, timely, and full of future perspectives.
References
[1] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864, 1964.
[2] W. Kohn and L. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133, 1965.
[3] R.G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
[4] R.M. Dreizler and E.K.U. Gross, Density Functional Theory, Springer, Berlin, 1990.
[5] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1997.
[6] D. Frenkel and B. Smit, Understanding Molecular Simulation, 2nd edn., Academic Press, San Diego, 2002.
[7] R. Car and M. Parrinello, "Unified approach for molecular dynamics and density-functional theory," Phys. Rev. Lett., 55, 2471, 1985.
[8] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, "Iterative minimization techniques for ab initio total energy calculations: molecular dynamics and conjugate gradients," Rev. Mod. Phys., 64, 1045, 1992.
[9] G. Galli and A. Pasquarello, "First-principles molecular dynamics," In: M.P. Allen and D.J. Tildesley (eds.), Computer Simulations in Chemical Physics, Kluwer, Dordrecht, 1993.
[10] A. Gross, "Reactions at surfaces studied by ab initio dynamics calculations," Surf. Sci. Rep., 32, 293, 1998.
[11] G.J. Kroes, "Six-dimensional quantum dynamics of dissociative chemisorption of H2 on metal surfaces," Prog. Surf. Sci., 60, 1, 1999.
[12] A.F. Voter, F. Montalenti, and T.C. Germann, "Extending the time scale in atomistic simulation of materials," Annu. Rev. Mater. Res., 32, 321, 2002.
[13] A. Zangwill, Physics at Surfaces, Cambridge University Press, Cambridge, 1988.
[14] R.I. Masel, Principles of Adsorption and Reaction on Solid Surfaces, Wiley, New York, 1996.
[15] C. Stampfl, M.V. Ganduglia-Pirovano, K. Reuter, and M. Scheffler, "Catalysis and corrosion: the theoretical surface-science context," Surf. Sci., 500, 368, 2002.
[16] M. Scheffler and C. Stampfl, "Theory of adsorption on metal substrates," In: K. Horn and M. Scheffler (eds.), Handbook of Surface Science, vol. 2: Electronic Structure, Elsevier, Amsterdam, 2000.
[17] G.R. Darling and S. Holloway, "The dissociation of diatomic molecules at surfaces," Rep. Prog. Phys., 58, 1595, 1995.
[18] E. Kaxiras, Y. Bar-Yam, J.D. Joannopoulos, and K.C. Pandey, "Ab initio theory of polar semiconductor surfaces. I. Methodology and the (2×2) reconstructions of GaAs(111)," Phys. Rev. B, 35, 9625, 1987.
[19] M. Scheffler, "Thermodynamic aspects of bulk and surface defects – first-principles calculations," In: J. Koukal (ed.), Physics of Solid Surfaces – 1987, Elsevier, Amsterdam, 1988.
[20] M. Scheffler and J. Dabrowski, "Parameter-free calculations of total energies, interatomic forces, and vibrational entropies of defects in semiconductors," Phil. Mag. A, 58, 107, 1988.
[21] G.-X. Qian, R.M. Martin, and D.J. Chadi, "First-principles study of the atomic reconstructions and energies of Ga- and As-stabilized GaAs(100) surfaces," Phys. Rev. B, 38, 7649, 1988.
[22] X.-G. Wang, W. Weiss, Sh.K. Shaikhutdinov, M. Ritter, M. Petersen, F. Wagner, R. Schlögl, and M. Scheffler, "The hematite (α-Fe2O3)(0001) surface: evidence for domains of distinct chemistry," Phys. Rev. Lett., 81, 1038, 1998.
[23] X.-G. Wang, A. Chaka, and M. Scheffler, "Effect of the environment on Al2O3 (0001) surface structures," Phys. Rev. Lett., 84, 3650, 2000.
[24] K. Reuter and M. Scheffler, "Composition, structure, and stability of RuO2 (110) as a function of oxygen pressure," Phys. Rev. B, 65, 035406, 2002.
[25] K. Reuter and M. Scheffler, "First-principles atomistic thermodynamics for oxidation catalysis: surface phase diagrams and catalytically interesting regions," Phys. Rev. Lett., 90, 046103, 2003.
[26] K. Reuter and M. Scheffler, "Composition and structure of the RuO2 (110) surface in an O2 and CO environment: implications for the catalytic formation of CO2," Phys. Rev. B, 68, 045407, 2003.
[27] Z. Lodziana and J.K. Nørskov, "Stability of the hydroxylated (0001) surface of Al2O3," J. Chem. Phys., 118, 11179, 2003.
[28] K. Reuter and M. Scheffler, "Oxide formation at the surface of late 4d transition metals: insights from first-principles atomistic thermodynamics," Appl. Phys. A, 78, 793, 2004.
[29] K. Reuter, "Nanometer and sub-nanometer thin oxide films at surfaces of late transition metals," In: U. Heiz, H. Hakkinen, and U. Landman (eds.), Nanocatalysis: Principles, Methods, Case Studies, 2005.
[30] G. Ertl, H. Knözinger, and J. Weitkamp (eds.), Handbook of Heterogeneous Catalysis, Wiley, New York, 1997.
[31] D.P. Woodruff and T.A. Delchar, Modern Techniques of Surface Science, 2nd edn., Cambridge University Press, Cambridge, 1994.
[32] W.-X. Li, C. Stampfl, and M. Scheffler, "Insights into the function of silver as an oxidation catalyst by ab initio atomistic thermodynamics," Phys. Rev. B, 68, 165412, 2003.
[33] W.-X. Li, C. Stampfl, and M. Scheffler, "Why is a noble metal catalytically active? The role of the O–Ag interaction in the function of silver as an oxidation catalyst," Phys. Rev. Lett., 90, 256102, 2003.
[34] D.A. McQuarrie, Statistical Mechanics, Harper and Row, New York, 1976.
[35] D.R. Stull and H. Prophet, JANAF Thermochemical Tables, 2nd edn., U.S. National Bureau of Standards, Washington, D.C., 1971.
[36] E. Lundgren, J. Gustafson, A. Mikkelsen, J.N. Andersen, A. Stierle, H. Dosch, M. Todorova, J. Rogal, K. Reuter, and M. Scheffler, "Kinetic hindrance during the initial oxidation of Pd(100) at ambient pressures," Phys. Rev. Lett., 92, 046101, 2004.
[37] M. Todorova, E. Lundgren, V. Blum, A. Mikkelsen, S. Gray, J. Gustafson, M. Borg, J. Rogal, K. Reuter, J.N. Andersen, and M. Scheffler, "The Pd(100)-(√5 × √5)R27°-O surface oxide revisited," Surf. Sci., 541, 101, 2003.
[38] E. Lundgren, G. Kresse, C. Klein, M. Borg, J.N. Andersen, M. De Santis, Y. Gauthier, C. Konvicka, M. Schmid, and P. Varga, "Two-dimensional oxide on Pd(111)," Phys. Rev. Lett., 88, 246103, 2002.
[39] A. Michaelides, M.L. Bocquet, P. Sautet, A. Alavi, and D.A. King, "Structures and thermodynamic phase transitions for oxygen and silver oxide phases on Ag{111}," Chem. Phys. Lett., 367, 344, 2003.
[40] C.M. Weinert and M. Scheffler, In: H.J. von Bardeleben (ed.), Defects in Semiconductors, Mat. Sci. Forum, 10–12, 25, 1986.
[41] S.-H. Lee, W. Moritz, and M. Scheffler, "GaAs(001) under conditions of low As pressure: evidence for a novel surface geometry," Phys. Rev. Lett., 85, 3890, 2000.
[42] C.B. Duke, "Semiconductor surface reconstruction: the structural chemistry of two-dimensional surface compounds," Chem. Rev., 96, 1237, 1996.
[43] T. Engel and G. Ertl, "Oxidation of carbon monoxide," In: D.A. King and D.P. Woodruff (eds.), The Chemical Physics of Solid Surfaces and Heterogeneous Catalysis, Elsevier, Amsterdam, 1982.
[44] B.L.M. Hendriksen, S.C. Bobaru, and J.W.M. Frenken, "Oscillatory CO oxidation on Pd(100) studied with in situ scanning tunnelling microscopy," Surf. Sci., 552, 229, 2003.
[45] H. Over and M. Muhler, "Catalytic CO oxidation over ruthenium – bridging the pressure gap," Prog. Surf. Sci., 72, 3, 2003.
[46] G. Ertl, "Heterogeneous catalysis on the atomic scale," J. Mol. Catal. A, 182, 5, 2002.
[47] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2002.
[48] D. de Fontaine, In: P.E.A. Turchi and A. Gonis (eds.), Statics and Dynamics of Alloy Phase Transformations, NATO ASI Series, Plenum Press, New York, 1994.
[49] J.M. Sanchez, F. Ducastelle, and D. Gratias, "Generalized cluster description of multicomponent systems," Physica A, 128, 334, 1984.
[50] A. Zunger, "First principles statistical mechanics of semiconductor alloys and intermetallic compounds," In: P.E.A. Turchi and A. Gonis (eds.), Statics and Dynamics of Alloy Phase Transformations, NATO ASI Series, Plenum Press, New York, 1994.
[51] P. Piercy, K. De'Bell, and H. Pfnür, "Phase diagram and critical behavior of the adsorption system O/Ru(001): comparison with lattice-gas models," Phys. Rev. B, 45, 1869, 1992.
[52] G.M. Xiong, C. Schwennicke, H. Pfnür, and H.-U. Everts, "Phase diagram and phase transitions of the adsorbate system S/Ru(0001): a Monte Carlo study of a lattice gas model," Z. Phys. B, 104, 529, 1997.
[53] V.P. Zhdanov and B. Kasemo, "Simulation of oxygen desorption from Pt(111)," Surf. Sci., 415, 403, 1998.
[54] S.-J. Koh and G. Ehrlich, "Pair- and many-atom interactions in the cohesion of surface clusters: Pdx and Irx on W(110)," Phys. Rev. B, 60, 5981, 1999.
[55] L. Österlund, M.Ø. Pedersen, I. Stensgaard, E. Lægsgaard, and F. Besenbacher, "Quantitative determination of adsorbate–adsorbate interactions," Phys. Rev. Lett., 83, 4812, 1999.
[56] S.H. Payne, H.J. Kreuzer, W. Frie, L. Hammer, and K. Heinz, "Adsorption and desorption of hydrogen on Rh(311) and comparison with other Rh surfaces," Surf. Sci., 421, 279, 1999.
[57] C. Stampfl, H.J. Kreuzer, S.H. Payne, H. Pfnür, and M. Scheffler, "First-principles theory of surface thermodynamics and kinetics," Phys. Rev. Lett., 83, 2993, 1999.
[58] C. Stampfl, H.J. Kreuzer, S.H. Payne, and M. Scheffler, "Challenges in predictive calculations of processes at surfaces: surface thermodynamics and catalytic reactions," Appl. Phys. A, 69, 471, 1999.
[59] J. Shao, "Linear model selection by cross-validation," J. Amer. Statist. Assoc., 88, 486, 1993.
[60] P. Zhang, "Model selection via multifold cross-validation," Ann. Statist., 21, 299, 1993.
[61] A. van de Walle and G. Ceder, "Automating first-principles phase diagram calculations," J. Phase Equilibria, 23, 348, 2002.
[62] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., 21, 1087, 1953.
[63] J.-S. McEwen, S.H. Payne, and C. Stampfl, "Phase diagram of O/Ru(0001) from first principles," Chem. Phys. Lett., 361, 317, 2002.
[64] H.J. Kreuzer and S.H. Payne, "Theoretical approaches to the kinetics of adsorption, desorption and reactions at surfaces," In: M. Borowko (ed.), Computational Methods in Surface and Colloid Science, Marcel Dekker, New York, 2000.
[65] C. Stampfl and M. Scheffler, "Theory of alkali metal adsorption on close-packed metal surfaces," Surf. Rev. Lett., 2, 317, 1995.
[66] D.L. Adams, "New phenomena in the adsorption of alkali metals on Al surfaces," Appl. Phys. A, 62, 123, 1996.
[67] M. Borg, C. Stampfl, A. Mikkelsen, J. Gustafson, E. Lundgren, M. Scheffler, and J.N. Andersen, "Density of configurational states from first-principles: the phase diagram of Al–Na surface alloys," Chem. Phys. Chem. (in press), 2005.
[68] F. Wang and D.P. Landau, "Efficient, multiple-range random walk algorithm to calculate the density of states," Phys. Rev. Lett., 86, 2050, 2001.
[69] H.C. Kang and W.H. Weinberg, "Modeling the kinetics of heterogeneous catalysis," Chem. Rev., 95, 667, 1995.
[70] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, "New algorithm for Monte Carlo simulation of Ising spin systems," J. Comp. Phys., 17, 10, 1975.
[71] D.T. Gillespie, "General method for numerically simulating stochastic time evolution of coupled chemical reactions," J. Comp. Phys., 22, 403, 1976.
[72] A.F. Voter, "Classically exact overlayer dynamics: diffusion of rhodium clusters on Rh(100)," Phys. Rev. B, 34, 6819, 1986.
[73] H.C. Kang and W.H. Weinberg, "Dynamic Monte Carlo with a proper energy barrier: surface diffusion and two-dimensional domain ordering," J. Chem. Phys., 90, 2824, 1989.
[74] K.A. Fichthorn and W.H. Weinberg, "Theoretical foundations of dynamical Monte Carlo simulations," J. Chem. Phys., 95, 1090, 1991.
[75] P. Ruggerone, C. Ratsch, and M. Scheffler, "Density-functional theory of epitaxial growth of metals," In: D.A. King and D.P. Woodruff (eds.), Growth and Properties of Ultrathin Epitaxial Layers. The Chemical Physics of Solid Surfaces, vol. 8, Elsevier, Amsterdam, 1997.
[76] C. Ratsch, P. Ruggerone, and M. Scheffler, "Study of strain and temperature dependence of metal epitaxy," In: Z. Zhang and M.G. Lagally (eds.), Morphological Organization in Epitaxial Growth and Removal, World Scientific, Singapore, 1998.
[77] S. Glasstone, K.J. Laidler, and H. Eyring, The Theory of Rate Processes, McGraw-Hill, New York, 1941.
[78] G.H. Vineyard, "Frequency factors and isotope effects in solid state rate processes," J. Phys. Chem. Solids, 3, 121, 1957.
[79] K.J. Laidler, Chemical Kinetics, Harper and Row, New York, 1987.
[80] C. Ratsch and M. Scheffler, "Density-functional theory calculations of hopping rates of surface diffusion," Phys. Rev. B, 58, 13163, 1998.
[81] G. Henkelman, G. Johannesson, and H. Jonsson, "Methods for finding saddle points and minimum energy paths," In: S.D. Schwartz (ed.), Progress on Theoretical Chemistry and Physics, Kluwer, New York, 2000.
[82] T. Ala-Nissila, R. Ferrando, and S.C. Ying, "Collective and single particle diffusion on surfaces," Adv. Phys., 51, 949, 2002.
[83] S. Ovesson, A. Bogicevic, and B.I. Lundqvist, "Origin of compact triangular islands in metal-on-metal growth," Phys. Rev. Lett., 83, 2608, 1999.
[84] K.A. Fichthorn and M. Scheffler, "Island nucleation in thin-film epitaxy: a first-principles investigation," Phys. Rev. Lett., 84, 5371, 2000.
[85] P. Kratzer and M. Scheffler, "Surface knowledge: toward a predictive theory of materials," Comp. in Science and Engineering, 3(6), 16, 2001.
[86] P. Kratzer and M. Scheffler, "Reaction-limited island nucleation in molecular beam epitaxy of compound semiconductors," Phys. Rev. Lett., 88, 036102, 2002.
[87] P. Kratzer, E. Penev, and M. Scheffler, "First-principles studies of kinetics in epitaxial growth of III–V semiconductors," Appl. Phys. A, 75, 79, 2002.
[88] E.W. Hansen and M. Neurock, "Modeling surface kinetics with first-principles-based molecular simulation," Chem. Eng. Sci., 54, 3411, 1999.
[89] E.W. Hansen and M. Neurock, "First-principles-based Monte Carlo simulation of ethylene hydrogenation kinetics on Pd," J. Catal., 196, 241, 2000.
[90] K. Reuter, D. Frenkel, and M. Scheffler, "The steady state of heterogeneous catalysis, studied with first-principles statistical mechanics," Phys. Rev. Lett., 93, 116105, 2004.
1.10 DENSITY-FUNCTIONAL PERTURBATION THEORY

Paolo Giannozzi¹ and Stefano Baroni²
¹ DEMOCRITOS-INFM, Scuola Normale Superiore, Pisa, Italy
² DEMOCRITOS-INFM, SISSA-ISAS, Trieste, Italy
The calculation of vibrational properties of materials from their electronic structure is an important goal for materials modeling. A wide variety of physical properties of materials depend on their lattice-dynamical behavior: specific heats, thermal expansion, and heat conduction; phenomena related to the electron–phonon interaction such as the resistivity of metals, superconductivity, and the temperature dependence of optical spectra are just a few of them. Moreover, vibrational spectroscopy is a very important tool for the characterization of materials. Vibrational frequencies are routinely and accurately measured mainly using infrared and Raman spectroscopy, as well as inelastic neutron scattering. The resulting vibrational spectra are a sensitive probe of the local bonding and chemical structure. Accurate calculations of frequencies and displacement patterns can thus yield a wealth of information on the atomic and electronic structure of materials. In the Born–Oppenheimer (adiabatic) approximation, the nuclear motion is determined by the nuclear Hamiltonian H:

H = -\sum_I \frac{\hbar^2}{2M_I}\,\frac{\partial^2}{\partial \mathbf{R}_I^2} + E(\{\mathbf{R}\}),    (1)
where R_I is the coordinate of the Ith nucleus, M_I its mass, {R} indicates the set of all the nuclear coordinates, and E({R}) is the ground-state energy of the Hamiltonian, H_{R}, of a system of N interacting electrons moving in the field of fixed nuclei with coordinates {R}:

H_{\{\mathbf{R}\}} = -\frac{\hbar^2}{2m}\sum_i \frac{\partial^2}{\partial \mathbf{r}_i^2} + \frac{e^2}{2}\sum_{i \neq j} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{i,I} v_I(\mathbf{r}_i - \mathbf{R}_I) + E_N(\{\mathbf{R}\}),    (2)
where r_i is the coordinate of the ith electron, m is the electron mass, −e is the electron charge, and E_N({R}) is the nuclear electrostatic energy:

E_N(\{\mathbf{R}\}) = \frac{e^2}{2}\sum_{I \neq J} \frac{Z_I Z_J}{|\mathbf{R}_I - \mathbf{R}_J|},    (3)
Z_I being the charge of the Ith nucleus, and v_I the electron–nucleus Coulomb interaction: v_I(r) = −Z_I e²/r. In a pseudopotential scheme, each nucleus is thought of as lumped together with its own core electrons into a frozen ion, which interacts with the valence electrons through a smooth pseudopotential, v_I(r). The equilibrium geometry of the system is determined by the condition that the forces acting on all nuclei vanish. The forces F_I can be calculated by applying the Hellmann–Feynman theorem to the Born–Oppenheimer Hamiltonian H_{R}:
F_I \equiv -\frac{\partial E(\{\mathbf{R}\})}{\partial \mathbf{R}_I} = -\left\langle \Psi_{\{\mathbf{R}\}} \left| \frac{\partial H_{\{\mathbf{R}\}}}{\partial \mathbf{R}_I} \right| \Psi_{\{\mathbf{R}\}} \right\rangle,    (4)
where Ψ_{R}(r_1, ..., r_N) is the ground-state wavefunction of the electronic Hamiltonian, H_{R}. Eq. (4) can be rewritten as:

F_I = -\int n(\mathbf{r})\, \frac{\partial v_I(\mathbf{r} - \mathbf{R}_I)}{\partial \mathbf{R}_I}\, d\mathbf{r} - \frac{\partial E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I},    (5)
where n(r) is the electron charge density for the nuclear configuration {R}:

n(\mathbf{r}) = N \int |\Psi_{\{\mathbf{R}\}}(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N)|^2\, d\mathbf{r}_2 \cdots d\mathbf{r}_N.    (6)
For a system near its equilibrium geometry, the harmonic approximation applies and the nuclear Hamiltonian of Eq. (1) reduces to the Hamiltonian of a system of independent harmonic oscillators, called normal modes. Normal-mode frequencies, ω, and displacement patterns, U_I^α for the αth Cartesian component of the Ith atom, are determined by the secular equation:

\sum_{J,\beta} \left( C_{IJ}^{\alpha\beta} - M_I \omega^2 \delta_{IJ} \delta_{\alpha\beta} \right) U_J^{\beta} = 0,    (7)

where C_{IJ}^{αβ} is the matrix of interatomic force constants (IFCs):

C_{IJ}^{\alpha\beta} \equiv \frac{\partial^2 E(\{\mathbf{R}\})}{\partial R_I^{\alpha}\, \partial R_J^{\beta}} = -\frac{\partial F_I^{\alpha}}{\partial R_J^{\beta}}.    (8)
Various dynamical models, based on empirical or semiempirical inter-atomic potentials, can be used to calculate the IFCs. In most cases, the parameters of the model are obtained from a fit to some known experimental data, such as a set of frequencies. Although simple and often effective, such approaches tend
to have a limited predictive power beyond the range of cases included in the fitting procedure. It is often desirable to resort to first-principles methods, such as density-functional theory, that have a far better predictive power even in the absence of any experimental input.
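However the IFCs are obtained – from a fitted model or from first principles – solving the secular equation (7) is a standard linear-algebra task: mass-weighting the IFC matrix turns Eq. (7) into an ordinary symmetric eigenvalue problem. The following minimal sketch assumes the IFC matrix C (in consistent units) and the atomic masses are already available from such a fit or from the first-principles methods discussed below:

```python
import numpy as np

def normal_modes(C, masses):
    """Solve Eq. (7): C is the (3N, 3N) IFC matrix, masses a length-N array."""
    m = np.repeat(masses, 3)                    # one mass per Cartesian component
    D = C / np.sqrt(np.outer(m, m))             # mass-weighted (dynamical) matrix
    w2, eigvecs = np.linalg.eigh(D)             # real symmetric -> use eigh
    omega = np.sign(w2) * np.sqrt(np.abs(w2))   # negative w2 signals an instability
    U = eigvecs / np.sqrt(m)[:, None]           # back-transform to displacements U_J^beta
    return omega, U
```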
1.
Density-Functional Theory
Within the framework of density-functional theory (DFT), the energy E({R}) can be seen as the minimum of a functional of the charge density n(r):

E(\{\mathbf{R}\}) = T_0[n(\mathbf{r})] + \frac{e^2}{2}\int \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' + E_{xc}[n(\mathbf{r})] + \int V_{\{\mathbf{R}\}}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r} + E_N(\{\mathbf{R}\}),    (9)
with the constraint that the integral of n(r) equals the number of electrons in the system, N. In Eq. (9), V_{R} indicates the external potential acting on the electrons, V_{R} = Σ_I v_I(r − R_I), and T_0[n(r)] is the kinetic energy of a system of noninteracting electrons having n(r) as ground-state density:

T_0[n(\mathbf{r})] = -2\, \frac{\hbar^2}{2m} \sum_{n=1}^{N/2} \int \psi_n^*(\mathbf{r})\, \frac{\partial^2 \psi_n(\mathbf{r})}{\partial \mathbf{r}^2}\, d\mathbf{r},    (10)

n(\mathbf{r}) = 2 \sum_{n=1}^{N/2} |\psi_n(\mathbf{r})|^2,    (11)
and E_xc is the so-called exchange-correlation energy. For notational simplicity, the system is supposed here to be a nonmagnetic insulator, so that each of the N/2 lowest-lying orbital states accommodates two electrons of opposite spin. The Kohn–Sham (KS) orbitals are the solutions of the KS equation:

H_{SCF}\, \psi_n(\mathbf{r}) \equiv \left( -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial \mathbf{r}^2} + V_{SCF}(\mathbf{r}) \right) \psi_n(\mathbf{r}) = \varepsilon_n\, \psi_n(\mathbf{r}),    (12)

where H_SCF is the Hamiltonian for an electron under an effective potential V_SCF:

V_{SCF}(\mathbf{r}) = V_{\{\mathbf{R}\}}(\mathbf{r}) + e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' + v_{xc}(\mathbf{r}),    (13)

and v_xc – the exchange-correlation potential – is the functional derivative of the exchange-correlation energy: v_xc(r) ≡ δE_xc/δn(r). The form of E_xc is unknown: the entire procedure is useful only if reliable approximate expressions for E_xc are available. It turns out that even the simplest of such expressions, the local-density approximation (LDA), is surprisingly good in many
cases, at least for the determination of electronic and structural ground-state properties. Well-established methods for the solution of the KS equations, Eq. (12), in both finite (molecules, clusters) and infinite (crystals) systems are described in the literature. The use of more sophisticated and better-performing functionals than the LDA (such as the generalized gradient approximation, or GGA) is now widespread. An important consequence of the variational character of DFT is that the Hellmann–Feynman form for forces, Eq. (5), is still valid in a DFT framework. In fact, the DFT expression for the forces contains a term coming from the explicit differentiation of the energy functional E({R}) with respect to atomic positions, plus a term coming from the implicit dependence via the derivative of the charge density:

F_I^{DFT} = -\int n(\mathbf{r})\, \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\, d\mathbf{r} - \frac{\partial E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I} - \int \frac{\delta E(\{\mathbf{R}\})}{\delta n(\mathbf{r})}\, \frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I}\, d\mathbf{r}.    (14)
The last term in Eq. (14) vanishes exactly for the ground-state charge density: the minimum condition implies in fact that the functional derivative of E({R}) equals a constant – the Lagrange multiplier that enforces the constraint on the total number of electrons – and the integral of the derivative of the electron density is zero because of charge conservation. As a consequence, F_I^{DFT} = F_I as in Eq. (5). Forces in DFT can thus be calculated from the knowledge of the electron charge density. IFCs can be calculated as finite differences of Hellmann–Feynman forces for small finite displacements of atoms around the equilibrium positions. For finite systems (molecules, clusters) this technique is straightforward, but it may also be used in solid-state physics (frozen-phonon technique). An alternative technique is the direct calculation of IFCs using density-functional perturbation theory (DFPT) [1–3].
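In the finite-difference spirit just described, the IFCs of Eq. (8) follow from central differences of the forces. A schematic sketch, in which the `forces(positions)` callback stands in for a full self-consistent DFT force calculation and the displacement `delta` is an illustrative choice:

```python
import numpy as np

def finite_difference_ifcs(positions, forces, delta=0.01):
    """Central-difference IFCs, Eq. (8); `forces` maps (N, 3) -> (N, 3)."""
    n_atoms = positions.shape[0]
    C = np.zeros((3 * n_atoms, 3 * n_atoms))
    for j in range(n_atoms):
        for beta in range(3):
            displaced = positions.copy()
            displaced[j, beta] += delta
            f_plus = forces(displaced)
            displaced[j, beta] -= 2.0 * delta
            f_minus = forces(displaced)
            # C_IJ^{alpha beta} = -dF_I^alpha / dR_J^beta: one column per coordinate
            C[:, 3 * j + beta] = -((f_plus - f_minus) / (2.0 * delta)).ravel()
    return 0.5 * (C + C.T)  # symmetrize against numerical noise
```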
2.
Density-Functional Perturbation Theory
An explicit expression for the IFCs can be obtained by differentiating the forces with respect to nuclear coordinates, as in Eq. (8):

\frac{\partial^2 E(\{\mathbf{R}\})}{\partial \mathbf{R}_I\, \partial \mathbf{R}_J} = \int \frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_J}\, \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\, d\mathbf{r} + \delta_{IJ} \int n(\mathbf{r})\, \frac{\partial^2 V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I\, \partial \mathbf{R}_J}\, d\mathbf{r} + \frac{\partial^2 E_N(\{\mathbf{R}\})}{\partial \mathbf{R}_I\, \partial \mathbf{R}_J}.    (15)
The calculation of the IFCs thus requires the knowledge of the ground-state charge density, n(r), as well as of its linear response to a distortion of the nuclear geometry, ∂n(r)/∂R I .
The charge-density linear response can be evaluated by linearizing Eqs. (11)–(13) with respect to derivatives of the KS orbitals, density, and potential, respectively. Linearization of Eq. (11) leads to:

\frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I} = 4\, \mathrm{Re} \sum_{n=1}^{N/2} \psi_n^*(\mathbf{r})\, \frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I}.    (16)
Whenever the unperturbed Hamiltonian is time-reversal invariant, eigenfunctions are either real, or they occur in conjugate pairs, so that the prescription to keep only the real part in the above formula can be dropped. The derivatives of the KS orbitals, ∂ψ_n(r)/∂R_I, are obtained from linearization of Eqs. (12) and (13):

\left( H_{SCF} - \varepsilon_n \right) \frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} = -\left( \frac{\partial V_{SCF}(\mathbf{r})}{\partial \mathbf{R}_I} - \frac{\partial \varepsilon_n}{\partial \mathbf{R}_I} \right) \psi_n(\mathbf{r}),    (17)
where

\frac{\partial V_{SCF}(\mathbf{r})}{\partial \mathbf{R}_I} = \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I} + e^2 \int \frac{1}{|\mathbf{r} - \mathbf{r}'|}\, \frac{\partial n(\mathbf{r}')}{\partial \mathbf{R}_I}\, d\mathbf{r}' + \int \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')}\, \frac{\partial n(\mathbf{r}')}{\partial \mathbf{R}_I}\, d\mathbf{r}'    (18)
is the first-order derivative of the self-consistent potential, and

\frac{\partial \varepsilon_n}{\partial \mathbf{R}_I} = \left\langle \psi_n \left| \frac{\partial V_{SCF}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle    (19)
is the first-order derivative of the KS eigenvalue, ε_n. The form of the right-hand side of Eq. (17) ensures that ∂ψ_n(r)/∂R_I can be chosen so as to have a vanishing component along ψ_n(r), and thus the singularity of the linear system in Eq. (17) can be ignored. Equations (16)–(18) form a set of self-consistent linear equations. The linear system, Eq. (17), can be solved for each of the N/2 derivatives ∂ψ_n(r)/∂R_I separately, the charge-density response is calculated from Eq. (16), and the potential response ∂V_SCF/∂R_I is updated from Eq. (18), until self-consistency is achieved. Only the knowledge of the occupied states of the system is needed to construct the right-hand side of the equation, and efficient iterative algorithms – such as conjugate-gradient or minimal-residual methods – can be used for the solution of the linear system. In the atomic physics literature, an equation analogous to Eq. (17) is known as the Sternheimer equation, and its self-consistent version was used to calculate atomic polarizabilities. Similar methods are known in the quantum chemistry literature under the name of the coupled Hartree–Fock method for the Hartree–Fock approximation [4, 5].
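The structure of this self-consistent cycle is easy to lay out in code. The following is a minimal sketch, assuming the occupied orbitals are held as arrays and that `solve_sternheimer` (an iterative linear solver for Eq. (17)), `hxc_kernel` (the Hartree plus exchange-correlation terms of Eq. (18)), and the bare perturbation `dv_bare` are supplied by the surrounding electronic-structure machinery; none of these names refer to an actual library:

```python
import numpy as np

def linear_response(psi_occ, dv_bare, solve_sternheimer, hxc_kernel,
                    mixing=0.3, tol=1e-8, max_iter=100):
    """Self-consistent loop over Eqs. (16)-(18); all callables are placeholders."""
    dn = np.zeros_like(dv_bare)                        # guess for dn(r)/dR_I
    for _ in range(max_iter):
        dv_scf = dv_bare + hxc_kernel(dn)              # Eq. (18)
        dpsi = [solve_sternheimer(p, dv_scf) for p in psi_occ]  # Eq. (17)
        dn_new = 4.0 * sum((p.conj() * dp).real        # Eq. (16)
                           for p, dp in zip(psi_occ, dpsi))
        if np.max(np.abs(dn_new - dn)) < tol:
            return dn_new
        dn = dn + mixing * (dn_new - dn)               # simple linear mixing
    raise RuntimeError("charge-density response did not converge")
```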
The connection with standard first-order perturbation (linear-response) theory can be established by expressing Eq. (17) as a sum over the spectrum of the unperturbed Hamiltonian:

\frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} = \sum_{m \neq n} \psi_m(\mathbf{r})\, \frac{1}{\varepsilon_n - \varepsilon_m} \left\langle \psi_m \left| \frac{\partial V_{SCF}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle,    (20)
with the sum running over all the states of the system, occupied and empty. Using Eq. (20), the electron charge-density linear response, Eq. (16), can be recast into the form:

\frac{\partial n(\mathbf{r})}{\partial \mathbf{R}_I} = 4 \sum_{n=1}^{N/2} \sum_{m \neq n} \psi_n^*(\mathbf{r})\, \psi_m(\mathbf{r})\, \frac{1}{\varepsilon_n - \varepsilon_m} \left\langle \psi_m \left| \frac{\partial V_{SCF}}{\partial \mathbf{R}_I} \right| \psi_n \right\rangle.    (21)
This equation shows that the contributions to the electron-density response coming from products of occupied states cancel each other. As a consequence, in Eq. (17) the derivatives ∂ψ_n(r)/∂R_I can be assumed to be orthogonal to all states of the occupied manifold. An alternative and equivalent point of view is obtained by inserting Eq. (16) into Eq. (18) and the resulting equation into Eq. (17). The set of N/2 self-consistent linear systems is thus recast into a single huge linear system for all the N/2 derivatives ∂ψ_n(r)/∂R_I:

\left( H_{SCF} - \varepsilon_n \right) \frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I} + \sum_{m=1}^{N/2} K_{nm}\, \frac{\partial \psi_m}{\partial \mathbf{R}_I}(\mathbf{r}) = -\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}\, \psi_n(\mathbf{r}),    (22)
under the orthogonality constraints:

\left\langle \psi_n \,\middle|\, \frac{\partial \psi_n}{\partial \mathbf{R}_I} \right\rangle = 0.    (23)
The nonlocal operator K_nm is defined as:

K_{nm}\, \frac{\partial \psi_m}{\partial \mathbf{R}_I}(\mathbf{r}) = 4\, \psi_n(\mathbf{r}) \int \left( \frac{e^2}{|\mathbf{r} - \mathbf{r}'|} + \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')} \right) \psi_m^*(\mathbf{r}')\, \frac{\partial \psi_m(\mathbf{r}')}{\partial \mathbf{R}_I}\, d\mathbf{r}'.    (24)
The same expression can be derived from a variational principle. The energy functional, Eq. (9), is written in terms of the perturbing potential and of the perturbed KS orbitals:

V^{(u_I)} \simeq V_{\{\mathbf{R}\}}(\mathbf{r}) + u_I\, \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{R}_I}, \qquad \psi_n^{(u_I)} \simeq \psi_n(\mathbf{r}) + u_I\, \frac{\partial \psi_n(\mathbf{r})}{\partial \mathbf{R}_I},    (25)
and expanded up to second order in the strength u_I of the perturbation. The first-order term gives the Hellmann–Feynman forces. The second-order one is a quadratic functional of the ∂ψ_n(r)/∂R_I whose minimization yields
Eq. (22). This approach forms the basis of variational DFPT [6, 7], in which all the IFCs are expressed as minima of suitable functionals. The big linear system of Eq. (22) can be directly solved with iterative methods, yielding a solution that is perfectly equivalent to the self-consistent solution of the smaller linear systems of Eq. (17). The choice between the two approaches is thus a matter of computational strategy.
3.
Phonon Modes in Crystals

In perfect crystalline solids, the position of the Ith atom can be written as:

\mathbf{R}_I = \mathbf{R}_l + \boldsymbol{\tau}_s = l_1 \mathbf{a}_1 + l_2 \mathbf{a}_2 + l_3 \mathbf{a}_3 + \boldsymbol{\tau}_s,    (26)
where R_l is the position of the lth unit cell in the Bravais lattice and τ_s is the equilibrium position of the sth atom in the unit cell. R_l can be expressed as a sum of the three primitive translation vectors a_1, a_2, a_3, with integer coefficients l_1, l_2, l_3. The electronic states are classified by a wave-vector k and a band index ν:

\psi_n(\mathbf{r}) \equiv \psi_{\nu,\mathbf{k}}(\mathbf{r}), \qquad \psi_{\nu,\mathbf{k}}(\mathbf{r} + \mathbf{R}_l) = e^{i\mathbf{k}\cdot\mathbf{R}_l}\, \psi_{\nu,\mathbf{k}}(\mathbf{r}) \quad \forall\, l,    (27)
where k is in the first Brillouin zone, i.e., the unit cell of the reciprocal lattice, defined as the set of all vectors {G} such that G_l · R_m = 2πn, with n an integer number. Normal modes in crystals (phonons) are also classified by a wave-vector q and a mode index ν. Phonon frequencies, ω(q), and displacement patterns, U_s^α(q), are determined by the secular equation:

\sum_{t,\beta} \left( \tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) - M_s\, \omega^2(\mathbf{q})\, \delta_{st}\, \delta_{\alpha\beta} \right) U_t^{\beta}(\mathbf{q}) = 0.    (28)
The dynamical matrix, C̃_st^{αβ}(q), is the Fourier transform of the real-space IFCs:

\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \sum_l e^{-i\mathbf{q}\cdot\mathbf{R}_l}\, C_{st}^{\alpha\beta}(\mathbf{R}_l).    (29)
The latter are defined as

C_{st}^{\alpha\beta}(l,m) \equiv \frac{\partial^2 E}{\partial u_s^{\alpha}(l)\, \partial u_t^{\beta}(m)} = C_{st}^{\alpha\beta}(\mathbf{R}_l - \mathbf{R}_m),    (30)
where u_s(l) is the deviation from the equilibrium position of atom s in the lth unit cell:

\mathbf{R}_I = \mathbf{R}_l + \boldsymbol{\tau}_s + \mathbf{u}_s(l).    (31)
Because of translational invariance, the real-space IFCs, Eq. (30), depend on l and m only through the difference Rl − Rm . The derivatives are evaluated
at u_s(l) = 0 for all the atoms. The direct calculation of such derivatives in an infinite periodic system is however not possible, since the displacement of a single atom would break the translational symmetry of the system. The elements of the dynamical matrix, Eq. (29), can be written as second derivatives of the energy with respect to a lattice distortion of wave-vector q:

\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \frac{1}{N_c}\, \frac{\partial^2 E}{\partial u_s^{*\alpha}(\mathbf{q})\, \partial u_t^{\beta}(\mathbf{q})},    (32)

where N_c is the number of unit cells in the crystal, and u_s(q) is the amplitude of the lattice distortion:

\mathbf{u}_s(l) = \mathbf{u}_s(\mathbf{q})\, e^{i\mathbf{q}\cdot\mathbf{R}_l}.    (33)
In the frozen-phonon approach, the calculation of the dynamical matrix at a generic point of the Brillouin zone presents the additional difficulty that a crystal with a small distortion, Eq. (33), "frozen in" loses the original periodicity, unless q = 0. As a consequence, an enlarged unit cell, called a supercell, is required for the calculation of IFCs at any q ≠ 0. The suitable supercell for a perturbation of wave-vector q must be big enough to accommodate q as one of the reciprocal-lattice vectors. Since the computational effort needed to determine the forces (i.e., the electronic states) grows approximately as the cube of the supercell size, the frozen-phonon method is in practice limited to lattice distortions that do not increase the unit cell size by more than a small factor, or to lattice-periodic (q = 0) phonons. The dynamical matrix, Eq. (32), can be decomposed into an electronic and an ionic contribution:

\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = {}^{el}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) + {}^{ion}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}),    (34)

where

{}^{el}\tilde{C}_{st}^{\alpha\beta}(\mathbf{q}) = \frac{1}{N_c} \left[ \int \left( \frac{\partial n(\mathbf{r})}{\partial u_s^{\alpha}(\mathbf{q})} \right)^{\!*} \frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial u_t^{\beta}(\mathbf{q})}\, d\mathbf{r} + \delta_{st} \int n(\mathbf{r})\, \frac{\partial^2 V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial u_s^{*\alpha}(\mathbf{q}=0)\, \partial u_t^{\beta}(\mathbf{q}=0)}\, d\mathbf{r} \right].    (35)
The ionic contribution – the last term in Eq. (15) – comes from the derivatives of the nuclear electrostatic energy, Eq. (3), and does not depend on the electronic structure. The second term in Eq. (35) depends only on the charge density of the unperturbed system and is easy to evaluate. The first term in Eq. (35) depends on the charge-density linear response to the lattice distortion of Eq. (33), corresponding to a perturbing potential characterized by a single wave-vector q:

\frac{\partial V_{\{\mathbf{R}\}}(\mathbf{r})}{\partial \mathbf{u}_s(\mathbf{q})} = -\sum_l \frac{\partial v_s(\mathbf{r} - \mathbf{R}_l - \boldsymbol{\tau}_s)}{\partial \mathbf{r}}\, e^{i\mathbf{q}\cdot\mathbf{R}_l}.    (36)
An advantage of DFPT with respect to the frozen-phonon technique is that the linear response to a monochromatic perturbation is also monochromatic with the same wave-vector q. This is a consequence of the linearity of the DFPT equations with respect to the perturbing potential, especially evident in Eq. (22). The calculation of the dynamical matrix can thus be performed for any q-vector without introducing supercells: the dependence on q factors out and all the calculations can be performed on lattice-periodic functions. Real-space IFCs can then be obtained via discrete (fast) Fourier transforms. To this end, dynamical matrices are first calculated on a uniform grid of q-vectors in the Brillouin zone:

\mathbf{q}_{l_1,l_2,l_3} = l_1 \frac{\mathbf{b}_1}{N_1} + l_2 \frac{\mathbf{b}_2}{N_2} + l_3 \frac{\mathbf{b}_3}{N_3},    (37)

where b_1, b_2, b_3 are the primitive translation vectors of the reciprocal lattice and l_1, l_2, l_3 are integers running from 0 to N_1 − 1, N_2 − 1, N_3 − 1, respectively. A discrete Fourier transform produces the IFCs in real space: C̃_st^{αβ}(q_{l1,l2,l3}) → C_st^{αβ}(R_{l1,l2,l3}), where the real-space grid contains all R-vectors inside a supercell whose primitive translation vectors are N_1 a_1, N_2 a_2, N_3 a_3:

\mathbf{R}_{l_1,l_2,l_3} = l_1 \mathbf{a}_1 + l_2 \mathbf{a}_2 + l_3 \mathbf{a}_3.    (38)
Once this has been done, the IFCs thus obtained can be used to calculate, inexpensively and via an (inverse) Fourier transform, dynamical matrices at any q-vector not included in the original reciprocal-space mesh. This procedure is known as Fourier interpolation. The number of dynamical-matrix calculations to be performed, N_1 N_2 N_3, is related to the range of the IFCs in real space: the real-space grid must be big enough to yield negligible values for the IFCs at the boundary vectors. In simple crystals, this goal is typically achieved for relatively small values of N_1, N_2, N_3 [8, 9]. For instance, the phonon dispersions of Si and Ge shown in Fig. 1 were obtained with N_1 = N_2 = N_3 = 4.
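As an illustration of the bookkeeping involved, the following sketch maps dynamical matrices on the grid of Eq. (37) to real-space IFCs and back to an arbitrary q, using the Fourier sign convention of Eq. (29). It deliberately glosses over details that a production implementation needs, such as recentering the R-vectors in a Wigner–Seitz cell and, for polar crystals, subtracting and re-adding the nonanalytic term discussed in Section 4:

```python
import numpy as np

def ifcs_from_qgrid(D_q):
    """D_q: (N1, N2, N3, 3*nat, 3*nat) dynamical matrices on the grid of Eq. (37).

    Since D(q) = sum_R exp(-iq.R) C(R), an inverse FFT over the grid axes
    recovers C(R) on the N1 x N2 x N3 supercell of Eq. (38).
    """
    return np.fft.ifftn(D_q, axes=(0, 1, 2))

def dynmat_at_q(C_R, q_frac):
    """Re-sum Eq. (29) at an arbitrary fractional wave-vector q (Fourier interpolation)."""
    N1, N2, N3 = C_R.shape[:3]
    D = np.zeros(C_R.shape[3:], dtype=complex)
    for l1 in range(N1):
        for l2 in range(N2):
            for l3 in range(N3):
                phase = np.exp(-2j * np.pi * np.dot(q_frac, (l1, l2, l3)))
                D += phase * C_R[l1, l2, l3]
    return D
```

The interpolated matrix can then be fed to the same mass-weighted diagonalization sketched earlier to obtain ω(q) along any path in the Brillouin zone.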
4.
Phonons and Macroscopic Electric Fields
Phonons in the long-wavelength limit (q → 0) may be associated with a macroscopic polarization, and thus a homogeneous electric field, due to the long-range character of the Coulomb forces. The splitting between longitudinal optic (LO) and transverse optic (TO) modes at q = 0 for simple polar semiconductors (e.g., GaAs), and the absence of LO–TO splitting in nonpolar semiconductors (e.g., Si), is a textbook example of the consequences of this phenomenon. Macroscopic electrostatics in extended systems is a tricky subject from the standpoint of microscopic ab initio theory. In fact, on the one hand, the macroscopic polarization of an extended system depends on surface effects; on the
204
P. Giannozzi and S. Baroni 600 Frequency [cm-1]
Si
400
200
0
Frequency [cm-1]
Ge
⌫
K
X
⌫
L
X
W
L
Dos
⌫
K
X
⌫
L
X
W
L
Dos
400 300 200 100 0
Figure 1. Calculated phonon dispersions and density of states for crystalline Si and Ge. Experimental data are denoted by diamonds. Reproduced from Ref. [8].
other hand, the potential which generates a homogeneous electric field is both nonperiodic and not bounded from below: an unpleasant situation when doing calculations using Born–von K´arm´an periodic boundary conditions. In the last decade, the whole field has been revolutionized by the advent of the so called modern theory of electric polarization [10, 11]. From the point of view of lattice dynamics, a more traditional approach based on perturbation theory is however appropriate because all the pathologies of macroscopic electrostatics disappear in the linear regime, and the polarization response to a homogeneous electric field and/or to a periodic lattice distortion – which is all one needs in order to calculate long-wavelength phonon modes – is perfectly well-defined. In the long-wavelength limit, the most general expression of the energy as a quadratic function of atomic displacements, us (q = 0) for atom s, and of a macroscopic electric field, E, is:
$$E(\{\mathbf{u}\},\mathbf{E}) = \frac{1}{2}\sum_{st} \mathbf{u}_{s}\cdot {}^{\mathrm{an}}\tilde{C}_{st}\cdot \mathbf{u}_{t} - \frac{\Omega}{8\pi}\,\mathbf{E}\cdot\boldsymbol{\epsilon}_{\infty}\cdot\mathbf{E} - e\sum_{s} \mathbf{u}_{s}\cdot \mathbf{Z}^{\star}_{s}\cdot\mathbf{E}, \qquad (39)$$
where Ω is the volume of the unit cell; ε∞ is the electronic (i.e., clamped-nuclei) dielectric tensor of the crystal; Z*_s is the tensor of Born effective charges [12] for atom s; and ᵃⁿC̃ is the q = 0 dynamical matrix of the system, calculated at vanishing macroscopic electric field. Because of Maxwell's equations, the polarization induced by a longitudinal phonon in the q → 0 limit generates a macroscopic electric field which exerts a force on the atoms, thus affecting the phonon frequency. This, in a nutshell, is the physical origin of the LO–TO splitting in polar materials. Minimizing Eq. (39) with respect to the electric-field amplitude at fixed lattice distortion yields an expression for the energy which depends on atomic displacements only, defining an effective dynamical matrix which contains an additional ("nonanalytic") contribution:

$$\tilde{C}^{\alpha\beta}_{st} = {}^{\mathrm{an}}\tilde{C}^{\alpha\beta}_{st} + {}^{\mathrm{na}}\tilde{C}^{\alpha\beta}_{st}, \qquad (40)$$
where

$${}^{\mathrm{na}}\tilde{C}^{\alpha\beta}_{st} = \frac{4\pi e^{2}}{\Omega}\, \frac{\left(\sum_{\gamma} q_{\gamma} Z^{\star\gamma\alpha}_{s}\right)\left(\sum_{\nu} q_{\nu} Z^{\star\nu\beta}_{t}\right)}{\sum_{\gamma\nu} q_{\gamma}\,\epsilon^{\gamma\nu}_{\infty}\, q_{\nu}} = \frac{4\pi e^{2}}{\Omega}\, \frac{(\mathbf{q}\cdot\mathbf{Z}^{\star}_{s})_{\alpha}\,(\mathbf{q}\cdot\mathbf{Z}^{\star}_{t})_{\beta}}{\mathbf{q}\cdot\boldsymbol{\epsilon}_{\infty}\cdot\mathbf{q}} \qquad (41)$$
displays a nonanalytic behavior in the limit q → 0. As a consequence, the resulting IFCs are long-range in real space, with a dependence on the interatomic distance that is typical of the dipole–dipole interaction. Because of this long-range behavior, the Fourier technique described above must be modified: a suitably chosen function of q, whose q → 0 limit is the same as in Eq. (41), is subtracted from the dynamical matrix in q-space. This makes the residual IFCs short-range and suitable for a Fourier transform on a relatively small grid of points. The nonanalytic term previously subtracted out in q-space is then re-added in real space. An example of the application of such a procedure is shown in Fig. 2 for the phonon dispersions of some III–V semiconductors.

The link between the phenomenological parameters Z* and ε∞ of Eq. (39) and their microscopic expression is provided by conventional electrostatics. From Eq. (39) we obtain the expression for the electric induction D:

$$\mathbf{D} \equiv -\frac{4\pi}{\Omega}\frac{\partial E}{\partial \mathbf{E}} = \frac{4\pi e}{\Omega}\sum_{s} \mathbf{Z}^{\star}_{s}\cdot\mathbf{u}_{s} + \boldsymbol{\epsilon}_{\infty}\cdot\mathbf{E}, \qquad (42)$$
from which the macroscopic polarization, P, is obtained via D = E + 4πP. One finds the known result relating Z* to the polarization induced by atomic displacements at zero electric field:

$$Z^{\star\alpha\beta}_{s} = \frac{\Omega}{e}\left.\frac{\partial P_{\alpha}}{\partial u^{\beta}_{s}(\mathbf{q}=0)}\right|_{\mathbf{E}=0}; \qquad (43)$$
Figure 2. Calculated phonon dispersions and density of states for several III-V zincblende semiconductors. Experimental data are denoted by diamonds. Reproduced from Ref. [8].
while the electronic dielectric-constant tensor ε∞ is the derivative of the polarization with respect to the macroscopic electric field at clamped nuclei:

$$\epsilon^{\alpha\beta}_{\infty} = \delta_{\alpha\beta} + 4\pi\left.\frac{\partial P_{\alpha}}{\partial E_{\beta}}\right|_{u_{s}(\mathbf{q}=0)=0}. \qquad (44)$$
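Given Z* and ε∞ (from experiment, or from the DFPT expressions discussed below), the nonanalytic term of Eq. (41) is easy to assemble numerically. A minimal NumPy sketch follows; the names, array layouts, and e = 1 units are illustrative assumptions, not taken from any particular code:

    import numpy as np

    def nonanalytic_term(qhat, zstar, eps_inf, volume):
        # Dipole-dipole term of Eq. (41) for a direction qhat (Cartesian).
        # zstar: Born charges, shape (nat, 3, 3), indexed [s, gamma, alpha];
        # eps_inf: (3, 3) electronic dielectric tensor; volume: cell volume.
        nat = zstar.shape[0]
        qz = np.einsum('g,sga->sa', qhat, zstar)      # (q . Z*_s)_alpha
        denom = qhat @ eps_inf @ qhat                 # q . eps_inf . q
        C_na = np.empty((nat, 3, nat, 3))
        for s in range(nat):
            for t in range(nat):
                C_na[s, :, t, :] = (4.0 * np.pi * np.outer(qz[s], qz[t])
                                    / (volume * denom))
        return C_na

Adding this direction-dependent correction to the analytic q = 0 dynamical matrix and diagonalizing the mass-weighted sum for different directions q̂ reproduces the LO–TO splitting described above.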
DFPT provides an easy way to calculate Z* and ε∞ from first principles [8, 9]. The polarization linearly induced by an atomic displacement is given by the sum of an electronic plus an ionic term:

$$\frac{\partial P_{\alpha}}{\partial u^{\beta}_{s}(\mathbf{q}=0)} = -\frac{e}{N_{c}\,\Omega}\int r_{\alpha}\,\frac{\partial n(\mathbf{r})}{\partial u^{\beta}_{s}(\mathbf{q}=0)}\, d\mathbf{r} + \frac{e}{\Omega}\, Z_{s}\,\delta_{\alpha\beta}. \qquad (45)$$
This expression is ill-defined for an infinite crystal with Born–von Kármán periodic boundary conditions, because r is not a lattice-periodic operator. We remark, however, that we actually only need the off-diagonal matrix elements ⟨ψm|r|ψn⟩ with m ≠ n (see the discussion of Eqs. 20 and 21). These can be rewritten as matrix elements of a lattice-periodic operator, using the following trick:

$$\langle\psi_{m}|\mathbf{r}|\psi_{n}\rangle = \frac{\langle\psi_{m}|\,[H_{SCF},\mathbf{r}]\,|\psi_{n}\rangle}{\epsilon_{m}-\epsilon_{n}}, \qquad \forall\, m \neq n. \qquad (46)$$
The quantity |ψ̄ᵅₙ⟩ = P_c r_α|ψₙ⟩ is the solution of a linear system, analogous to Eq. (17):

$$(H_{SCF} - \epsilon_{n})\,|\bar{\psi}^{\alpha}_{n}\rangle = P_{c}\,[H_{SCF}, r_{\alpha}]\,|\psi_{n}\rangle, \qquad (47)$$
where $P_{c} = 1 - \sum_{n=1}^{N/2}|\psi_{n}\rangle\langle\psi_{n}|$ projects out the component over the occupied-state manifold. If the self-consistent potential acting on the electrons is local, the above commutator is simply proportional to the momentum operator:

$$[H_{SCF}, \mathbf{r}] = -\frac{\hbar^{2}}{m}\frac{\partial}{\partial \mathbf{r}}. \qquad (48)$$
Otherwise, the commutator will contain an explicit contribution from the nonlocal part of the potential [13]. The final expression for the effective charges reads:

$$Z^{\star\alpha\beta}_{s} = Z_{s} + \frac{4}{N_{c}}\sum_{n=1}^{N/2}\left\langle \bar{\psi}^{\alpha}_{n}\,\Big|\,\frac{\partial \psi_{n}}{\partial u^{\beta}_{s}(\mathbf{q}=0)}\right\rangle. \qquad (49)$$
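The trick of Eq. (46) is a pure operator identity, so it can be checked on any finite-dimensional toy model. The fragment below is illustrative only (a real DFPT code applies the commutator to plane-wave orbitals and solves Eq. (47) iteratively); it verifies the identity for a random Hermitian "Hamiltonian" and a diagonal "position" operator:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = (A + A.conj().T) / 2                  # toy Hermitian H_SCF
    r = np.diag(np.linspace(-1.0, 1.0, n))    # toy position operator

    eps, psi = np.linalg.eigh(H)
    r_mn = psi.conj().T @ r @ psi             # <psi_m| r |psi_n>
    comm = psi.conj().T @ (H @ r - r @ H) @ psi

    # Eq. (46): <m|r|n> = <m|[H, r]|n> / (eps_m - eps_n), for m != n
    m, k = 2, 5
    assert np.isclose(r_mn[m, k], comm[m, k] / (eps[m] - eps[k]))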
The calculation of ε∞ requires the response of the crystal to an applied electric field E. The latter is described by a potential, V(r) = eE·r, that is neither lattice-periodic nor bounded from below. In the linear-response regime, however, we can use the same trick as in Eq. (46) and replace all occurrences of r|ψn⟩ with |ψ̄ᵅₙ⟩ calculated as in Eq. (47). The simplest way to calculate ε∞ is to keep the electric field E fixed and to iterate on the potential:

$$\frac{\partial V_{SCF}(\mathbf{r})}{\partial \mathbf{E}} = \frac{\partial V(\mathbf{r})}{\partial \mathbf{E}} + \int \left(\frac{e^{2}}{|\mathbf{r}-\mathbf{r}'|} + \frac{\delta v_{xc}(\mathbf{r})}{\delta n(\mathbf{r}')}\right) \frac{\partial n(\mathbf{r}')}{\partial \mathbf{E}}\, d\mathbf{r}'. \qquad (50)$$
One finally obtains:

$$\epsilon^{\alpha\beta}_{\infty} = \delta_{\alpha\beta} - \frac{16\pi e}{N_{c}\,\Omega}\sum_{n=1}^{N/2}\left\langle \bar{\psi}^{\alpha}_{n}\,\Big|\,\frac{\partial \psi_{n}}{\partial E_{\beta}}\right\rangle. \qquad (51)$$
Effective charges can also be calculated from the response to an electric field: they are also proportional to the force acting on an atom upon application of an electric field. Mathematically, this is simply a consequence of the fact that the effective charge can be seen as the second derivative of the energy with respect to an ionic displacement and an applied electric field, and its value is obviously independent of the order of differentiation. Alternative approaches to the calculation of effective charges and dielectric tensors – not using perturbation theory – have recently been developed. Effective charges can be calculated as finite differences of the macroscopic polarization induced by atomic displacements, which in turn can be expressed in terms of a topological quantity – depending on the phase of the ground-state orbitals – called the Berry phase [10, 11]. When used at the same level of accuracy, the linear-response and Berry-phase approaches yield the same results. The calculation of the dielectric tensor using the same technique is possible by performing finite-electric-field calculations (the electrical equivalent of the frozen-phonon approach). Recently, practical finite-field calculations have become possible [14, 15], using an expression of the position operator that is suitable for periodic systems.
5. Applications
The calculation of vibrational properties in the frozen-phonon approach can be performed using any method that provides accurate forces on atoms. Localized basis-set implementations suffer from the problem of Pulay forces: the last term of Eq. (14) does not vanish if the basis set is incomplete. In order to obtain accurate forces, the Pulay term must be taken into account. The plane-wave (PW) basis set is instead free from this problem: the last term in Eq. (14) vanishes exactly even if the PW basis set is incomplete.
Practical implementation of the DFPT equations is straightforward with PWs and norm-conserving pseudopotentials (PPs). In a PW–PP calculation, only valence electrons are explicitly accounted for, while the interactions between electrons and ionic cores are described by suitable atomic PPs. Norm-conserving PPs contain a nonlocal term of the form:

$$V^{NL}_{\{\mathbf{R}\}}(\mathbf{r},\mathbf{r}') = \sum_{sl}\sum_{n,m} D_{nm}\,\beta^{*}_{n}(\mathbf{r}-\mathbf{R}_{l}-\boldsymbol{\tau}_{s})\,\beta_{m}(\mathbf{r}'-\mathbf{R}_{l}-\boldsymbol{\tau}_{s}). \qquad (52)$$
The nonlocal character of the PP requires some generalizations of the formulas described in the previous section, which are straightforward. More extensive modifications are necessary for "ultrasoft" PPs [16], which are appropriate to deal effectively with systems containing transition-metal or other atoms that would otherwise require a very large PW basis set with norm-conserving PPs. Implementations for other kinds of basis sets, such as LMTO, FLAPW, and mixed basis sets (localized atomic-like functions plus PWs), exist as well. Presently, phonon spectra can be calculated for materials described by unit cells or supercells containing up to several tens of atoms. Calculations in simple semiconductors (Figs. 1 and 2) and metals (Fig. 3) are routinely performed with modest computer hardware. Systems that are well described by some flavor of DFT in terms of structural properties have a comparable accuracy in their phonon frequencies (with typical errors of the order of a few percent) and phonon-related quantities. The real interest of phonon calculations in simple systems, however, stems from the possibility of calculating real-space IFCs also in cases for which experimental data would not be sufficient to set up a reliable dynamical model (as, for instance, in AlAs, Fig. 2). The availability of IFCs in real space, and thus of the complete phonon spectra, allows for the accurate evaluation of thermal properties (such as thermal-expansion coefficients in the quasi-harmonic approximation) and of electron–phonon coupling coefficients in metals. Calculations in more complex materials are computationally more demanding, but still feasible for a number of nontrivial systems [2]: semiconductor superlattices and heterostructures, ferroelectrics, semiconductor surfaces [18], metal surfaces, and high-Tc superconductors are just a few examples of systems successfully treated in the recent literature. A detailed knowledge of phonon spectra is crucial for the explanation of phonon-related phenomena such as structural phase transitions (under pressure or with temperature) driven by "soft phonons," pressure-induced amorphization, and Kohn anomalies. Some examples of such phonon-related phenomenology are shown in Figs. 4–6. Figure 4 shows the onset of a phonon anomaly at an incommensurate q-vector under pressure in ice XI, believed to be connected to the observed pressure-induced amorphization. Figure 5 displays a Kohn anomaly and the related lattice instability in the phonon spectra of the ferromagnetic shape-memory alloy Ni2MnGa.
Figure 3. Calculated phonon dispersions, with spin-polarized GGA (solid lines) and LDA (dotted lines), for Ni in the face-centered cubic structure and Fe in the body-centered cubic structure. Experimental data are denoted by diamonds. Reproduced from Ref. [17].
Figure 6 shows a similar anomaly in the phonon spectra of the hydrogenated W(110) surface. DFT-based methods can also be employed to determine Raman and infrared cross sections – very helpful quantities when analyzing experimental data. Infrared cross sections are proportional to the square of the polarization induced by a phonon mode. For the νth zone-center (q = 0) mode,
Figure 4. Phonon dispersions in ice XI at 0, 15, and 35 kbar. Reproduced from Ref. [19].
Figure 5. Calculated phonon dispersion of Ni2MnGa in the fcc Heusler structure, along the Γ−K−Z line in the [110] direction. Experimental data taken at 250 and 370 K are shown for comparison. Reproduced from Ref. [20].
characterized by a normalized vibrational eigenvector Uᵝₛ, the oscillator strength f is given by:

$$f = \sum_{\alpha}\left|\sum_{s\beta} Z^{\star\alpha\beta}_{s}\, U^{\beta}_{s}\right|^{2}. \qquad (53)$$
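Equation (53) is trivial to evaluate once the Born effective charges and a mode eigenvector are available; a minimal sketch (with illustrative array layouts) reads:

    import numpy as np

    def oscillator_strength(zstar, U):
        # Eq. (53): f = sum_alpha | sum_{s,beta} Z*_s^{alpha beta} U_s^beta |^2
        # zstar: (nat, 3, 3) Born charges, indexed [s, alpha, beta];
        # U: (nat, 3) normalized vibrational eigenvector of mode nu.
        d = np.einsum('sab,sb->a', zstar, U)
        return float(np.sum(np.abs(d) ** 2))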
Figure 6. Phonon dispersions of the clean (left panel) and hydrogenated (right panel) W(110). Full dots indicate electron energy-loss data, open diamonds helium-atom scattering data. Reproduced from Ref. [21].
The calculation of Raman cross sections is difficult in resonance conditions, since knowledge of the excited-state Born–Oppenheimer surfaces is required. Off-resonance Raman cross sections are, however, simply related to the change of the dielectric constant induced by a phonon mode. If the frequency of the incident light, ωi, is much smaller than the energy band gap, the contribution of the νth vibrational mode to the intensity of the light diffused in Stokes Raman scattering is:

$$I^{(\nu)} \propto \frac{(\omega_{i} - \omega_{\nu})^{4}}{\omega_{\nu}}\, r^{\alpha\beta}(\nu), \qquad (54)$$
where α and β are the polarizations of the incoming and outgoing light beams, ων is the frequency of the νth mode, and the Raman tensor r^{αβ}(ν) is defined as:

$$r^{\alpha\beta}(\nu) = \left\langle \left|\frac{\partial \chi^{\alpha\beta}}{\partial e_{\nu}}\right|^{2} \right\rangle, \qquad (55)$$
where χ = (ε∞ − 1)/4π is the electric polarizability of the system, eν is the coordinate along the vibrational eigenvector Uᵝₛ for mode ν, and ⟨···⟩ indicates an average over all the modes degenerate with the νth one. The Raman tensor can be calculated as a finite difference of the dielectric tensor with a phonon frozen in, or directly from higher-order perturbation theory [22].
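The frozen-phonon (finite-difference) route to the Raman tensor just mentioned can be sketched as follows; here epsilon_inf is a placeholder for a routine, hypothetical in this sketch, that returns the electronic dielectric tensor for given atomic positions (e.g., from a DFPT or finite-field calculation):

    import numpy as np

    def raman_tensor_fd(positions, U, epsilon_inf, h=1.0e-3):
        # |d chi^{alpha beta} / d e_nu|^2 of Eq. (55), by central finite
        # differences along the mode eigenvector U (shape (nat, 3)).
        chi = lambda pos: (epsilon_inf(pos) - np.eye(3)) / (4.0 * np.pi)
        dchi = (chi(positions + h * U) - chi(positions - h * U)) / (2.0 * h)
        return np.abs(dchi) ** 2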
6. Outlook
The field of lattice-dynamical calculations based on DFT, in particular in conjunction with perturbation theory, is ripe enough to allow a systematic application to systems and materials of increasing complexity. Among the most promising fields of application, we mention: the characterization of materials through the prediction of the relation between their atomistic structure and experimentally detectable spectroscopic properties; the study of the structural (in)stability of materials at extreme pressure conditions; the prediction of the thermal dependence of different materials properties using the quasi-harmonic approximation; and the prediction of superconductive properties via the calculation of electron–phonon coupling coefficients. We conclude by mentioning that sophisticated open-source codes for lattice-dynamical calculations [23] are freely available for download from the web.
References
[1] S. Baroni, P. Giannozzi, and A. Testa, "Green's-function approach to linear response in solids," Phys. Rev. Lett., 58, 1861, 1987.
[2] S. Baroni, S. de Gironcoli, A. Dal Corso, and P. Giannozzi, "Phonons and related crystal properties from density-functional perturbation theory," Rev. Mod. Phys., 73, 515–562, 2001.
[3] X. Gonze, "Adiabatic density-functional perturbation theory," Phys. Rev. A, 52, 1096, 1995.
[4] J. Gerratt and I.M. Mills, J. Chem. Phys., 49, 1719, 1968.
[5] R.D. Amos, In: K.P. Lawley (ed.), Ab initio Methods in Quantum Chemistry – I, Wiley, New York, p. 99, 1987.
[6] X. Gonze, "Perturbation expansion of variational principles at arbitrary order," Phys. Rev. A, 52, 1086, 1995.
[7] X. Gonze, "First-principles responses of solids to atomic displacements and homogeneous electric fields: implementation of a conjugate-gradient algorithm," Phys. Rev. B, 55, 10337, 1997.
[8] P. Giannozzi, S. de Gironcoli, P. Pavone, and S. Baroni, "Ab initio calculation of phonon dispersions in semiconductors," Phys. Rev. B, 43, 7231, 1991.
[9] X. Gonze and C. Lee, "Dynamical matrices, Born effective charges, dielectric permittivity tensors, and interatomic force constants from density-functional perturbation theory," Phys. Rev. B, 55, 10355, 1997.
[10] D. Vanderbilt and R.D. King-Smith, "Electric polarization as a bulk quantity and its relation to surface charge," Phys. Rev. B, 48, 4442, 1993.
[11] R. Resta, "Macroscopic polarization in crystalline dielectrics: the geometrical phase approach," Rev. Mod. Phys., 66, 899, 1994.
[12] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Oxford University Press, Oxford, 1954.
[13] S. Baroni and R. Resta, "Ab initio calculation of the macroscopic dielectric constant in silicon," Phys. Rev. B, 33, 7017, 1986.
[14] P. Umari and A. Pasquarello, "Ab initio molecular dynamics in a finite homogeneous electric field," Phys. Rev. Lett., 89, 157602, 2002.
[15] I. Souza, J. Íñiguez, and D. Vanderbilt, "First-principles approach to insulators in finite electric fields," Phys. Rev. Lett., 89, 117602, 2002.
[16] D. Vanderbilt, "Soft self-consistent pseudopotentials in a generalized eigenvalue formalism," Phys. Rev. B, 41, 7892, 1990.
[17] A. Dal Corso and S. de Gironcoli, "Density-functional perturbation theory for lattice dynamics with ultrasoft pseudopotentials," Phys. Rev. B, 62, 273, 2000.
[18] J. Fritsch and U. Schröder, "Density-functional calculation of semiconductor surface phonons," Phys. Rep., 309, 209–331, 1999.
[19] K. Umemoto, R.M. Wentzcovitch, S. Baroni, and S. de Gironcoli, "Anomalous pressure-induced transition(s) in ice XI," Phys. Rev. Lett., 92, 105502, 2004.
[20] C. Bungaro, K.M. Rabe, and A. Dal Corso, "First-principle study of lattice instabilities in ferromagnetic Ni2MnGa," Phys. Rev. B, 68, 134104, 2003.
[21] C. Bungaro, S. de Gironcoli, and S. Baroni, "Theory of the anomalous Rayleigh dispersion at H/W(110) surfaces," Phys. Rev. Lett., 77, 2491, 1996.
[22] M. Lazzeri and F. Mauri, "High-order density-matrix perturbation theory," Phys. Rev. B, 68, 161101, 2003.
[23] PWscf package: www.pwscf.org. ABINIT: www.abinit.org.
1.11 QUASIPARTICLE AND OPTICAL PROPERTIES OF SOLIDS AND NANOSTRUCTURES: THE GW-BSE APPROACH

Steven G. Louie¹ and Angel Rubio²

¹ Department of Physics, University of California at Berkeley, and Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
² Departamento de Física de Materiales and Unidad de Física de Materiales, Universidad del País Vasco, Centro Mixto CSIC-UPV, and Donostia International Physics Center (DIPC)
We present a review of recent progress in the first-principles study of the spectroscopic properties of solids and nanostructures employing a many-body Green's function approach based on the GW approximation to the electron self-energy. The approach has been widely used to investigate the excited-state properties of condensed matter as probed by photoemission, tunneling, optical, and related techniques. In this article, we first give a brief overview of the theoretical foundations of the approach, then present a sample of applications to systems ranging from extended solids to surfaces to nanostructures, and discuss some possible ideas for further developments.
1. Background
A large part of research in condensed matter science is related to the characterization of the electronic properties of interacting many-electron systems. In particular, an accurate description of the electronic structure and its response to external probes is essential for understanding the behavior of systems ranging from atoms, molecules, and nanostructures to complex materials. Moreover, many characterization tools in physics, chemistry, and materials science, as well as electro/optical devices, are spectroscopic in nature, based on the interaction
of photons, electrons, or other quanta with matter, exciting the system to higher energy states. Yet many fundamental questions concerning the conceptual and quantitative descriptions of excited states of condensed matter and their interactions with external probes are still open. Hence there is a strong need for theoretical approaches which can provide an accurate description of the excited-state electronic structure of a system and its response to external probes. In what follows we discuss some recent progress along a very fruitful direction in the first-principles studies of the electronic excited-state properties of materials, employing a many-electron Green's function approach based on the so-called GW approximation [1–3]. Solving for the electronic structure of an interacting electron system (in terms of the many-particle Schrödinger equation) has an intrinsically high complexity: while the problem is completely well defined in terms of the total number of particles N and the external potential V(r), its solution depends on 3N coordinates. This makes the direct search for either exact or approximate solutions to the many-body problem a task of rapidly increasing complexity. Fortunately, in the study of either ground- or excited-state properties, we seldom need the full solution to the Schrödinger equation. When one is interested in structural properties, the ground-state total energy is sufficient. In other cases, we want to study how the system responds to some external probe. Then knowledge of a few excited-state properties must be added. For instance, in a direct photoemission experiment, a photon impinges on the system and an electron is removed. In an inverse photoemission process, an electron is absorbed and a photon is emitted. In both cases we just have to deal with the gain or loss of energy of the N-electron system when a single particle is added or removed, i.e., with the one-particle excitation spectrum. If the electron is not removed after the absorption of the photon, the system evolves from its ground state to a neutral excited state, and the process may be described by correlated electron–hole excitation amplitudes. At the simplest level of treating the many-electron problem, Hartree–Fock theory (HF) is obtained by considering the ground-state wavefunction to be a single Slater determinant of single-particle orbitals. In this way the N-body problem is reduced to N one-body problems with a self-consistency requirement due to the dependence of the HF effective potential on the wavefunction. By the variational theorem, the HF total energy is a variational upper bound to the ground-state energy for a particular symmetry. The HF eigenvalues may also be used as rough estimates of the one-electron excitation energies. The validity of this procedure hinges on the assumption that the single-particle orbitals in the N and (N−1) systems are the same (Koopmans' theorem), i.e., on neglecting the electronic relaxation of the system. A better procedure to estimate excitation energies is to perform self-consistent calculations for the N and (N−1) systems and subtract the total energies (this is called the "Δ-SCF method" for excitation energies, which has also been used in other theoretical frameworks such as
density-functional theory). For infinitely extended systems, this scheme gives the same result as Koopmans' theorem, and more refined methods are needed to address the problem of one-particle (quasiparticle) excitation energies in solids. HF theory in general is far from accurate because typically the ground-state wavefunction of a system cannot be written as a single determinant, and Koopmans' theorem is a poor approximation. On the other hand, within density-functional theory (DFT), the ground-state energy of an interacting system can be exactly written as a functional of the ground-state electronic density [4]. When compared to conventional quantum chemistry methods, this approach is particularly appealing, since solving for the ground-state energy does not rely on complete knowledge of the N-electron wavefunction but only on the electronic density, reducing the problem to a self-consistent field calculation. However, although the theory is exact, the energy functional contains an unknown quantity, called the exchange-correlation energy, E_xc[n], that has to be approximated in practical implementations. For ground-state properties, in particular those of solids and larger molecular systems, present-day DFT results are comparable to, or even surpass in quality, those from standard ab initio quantum chemistry techniques. Its use has continued to increase owing to a better scaling of computational effort with the number of atoms in the system. As in HF theory, the Kohn–Sham eigenvalues of DFT cannot be directly interpreted as quasiparticle excitation energies. Such an interpretation has led to the well-known bandgap problem for semiconductors and insulators: the Kohn–Sham gap is typically 30–50% less than the observed band gap. Indeed, the original formulation of DFT is applicable neither to excited states nor to problems involving time-dependent external fields, thus excluding the calculation of optical response, quasiparticle excitation spectra, photochemistry, etc. Theorems have, however, been proved subsequently for time-dependent density functional theory (TDDFT), which extends the applicability of the approach to excited-state phenomena [5, 6]. The main result of TDDFT is a set of time-dependent Kohn–Sham equations that include all the many-body effects through a time-dependent exchange-correlation potential. As for static DFT, this potential is unknown and has to be approximated in any practical application. TDDFT has been applied with success to the calculation of quantities such as the electron polarizabilities for the optical spectra of finite systems. However, TDDFT encounters problems in studying the spectroscopic properties of extended systems [7], and it severely underestimates the high-lying excitation energies in molecules when simple exchange and correlation functionals are employed. These failures are related to our ignorance of the exact exchange-correlation potential in DFT. The actual functional relation between the density, n(r), and the exchange-correlation potential, Vxc(r), is highly non-analytical and non-local. A very active field of current research is the search for robust, new exchange-correlation functionals for real-material applications.
Alternatively, a theoretically well-grounded and rigorous approach to the excited-state properties of condensed matter is the interacting Green's function approach. The n-particle Green's function describes the propagation of the n-particle amplitude in an interacting electron system. It provides a proper framework for accurately computing the n-particle excitation properties. For example, knowledge of the one-particle and two-particle Green's functions yields information, respectively, on the quasiparticle excitations and the optical response of a system. The use of this approach for the practical study of the spectroscopic properties of real materials is the focus of the present review. In the remainder of the article, we first present a brief overview of the theoretical framework of many-body perturbation theory and discuss the first-principles calculation of properties related to the one- and two-particle Green's functions within the GW approximation to the electron self-energy operator. Then, we present some selected examples of applications to solids and reduced dimensional systems. Finally, some conclusions and perspectives are given.
2. Many-body Perturbation Theory and Green's Functions
A very successful and fruitful development for computing electron excitations has been a first-principles self-energy approach [1–3, 8] in which the quasiparticle's (excited electron or hole) energy is determined directly by calculating the contribution of the dynamical polarization of the surrounding electrons. In many-body theory, this is obtained by evaluating the evolution of the amplitude of the added particle via the single-particle Green's function, $G(xt, x't') = -i\langle N|\,T\{\psi(xt)\,\psi^{\dagger}(x't')\}\,|N\rangle$,* from which one obtains the dispersion relation and lifetime of the quasiparticle excited state. There are no adjustable parameters in the theory and, from the equation of motion of the single-particle Green's function, the quasiparticle energies E_nk and wavefunctions ψ_nk are determined by solving a Schrödinger-like equation:

$$(T + V_{ext} + V_{H})\,\psi_{n\mathbf{k}}(\mathbf{r}) + \int d\mathbf{r}'\,\Sigma(\mathbf{r},\mathbf{r}'; E_{n\mathbf{k}})\,\psi_{n\mathbf{k}}(\mathbf{r}') = E_{n\mathbf{k}}\,\psi_{n\mathbf{k}}(\mathbf{r}), \qquad (1)$$
where T is the kinetic energy operator, Vext is the external potential due to the ions, VH is the Hartree potential of the electrons, and Σ is the self-energy operator in which all the many-body exchange and correlation effects are included. The self-energy operator describes an effective potential on the quasiparticle

* This corresponds to the Green's function at zero temperature, where |N⟩ is the many-electron ground state, ψ(xt) is the field operator in the Heisenberg picture, x stands for the spatial coordinates r plus the spin coordinate, and T is the time-ordering operator. In this context, ψ†(xt)|N⟩ represents an (N + 1)-electron state in which an electron has been added at time t at position r.
resulting from the interaction with all the other electrons in the system. In general, Σ is non-local, energy-dependent, and non-Hermitian, with the imaginary part giving the lifetime of the excited state. Similarly, from the two-particle Green's function, we can obtain the correlated electron–hole amplitude and excitation spectrum, and hence the optical properties. For details of the Green's function formalism and many-body techniques applied to condensed matter, we refer the reader to several comprehensive papers in the literature [2, 3, 7–10]. Here we shall just present some of the main equations used for the quasiparticle and optical spectra calculations. (To simplify the presentation, we use in the following atomic units, e = ħ = m = 1.) In standard textbooks, the unperturbed system is often taken to be the noninteracting system of electrons under the potential Vion(r) + VH(r). However, for rapid convergence in a perturbation series, it is better to start from a different non-interacting or mean-field scenario, like the Kohn–Sham DFT system, which already includes an attempt to describe exchange and correlation in the actual system. Also, in a many-electron system, the Coulomb interaction between two electrons is readily screened by a dynamic rearrangement of the other electrons, reducing its strength. It is therefore more natural to describe the electron–electron interaction in terms of a screened Coulomb potential W and to formulate the self-energy as a perturbation series in terms of W. In this approach [1–3], the electron self-energy can then be obtained from a self-consistent set of Dyson-like equations:

$$P(12) = -i \int d(34)\, G(13)\, G(41^{+})\, \Gamma(34, 2) \qquad (2)$$
$$W(12) = v(12) + \int d(34)\, W(13)\, P(34)\, v(42) \qquad (3)$$
$$\Sigma(12) = i \int d(34)\, G(14^{+})\, W(13)\, \Gamma(42, 3) \qquad (4)$$
$$G(12) = G_{0}(12) + \int d(34)\, G_{0}(13)\,\left[\Sigma(34) - \delta(34)\,V_{xc}(4)\right] G(42) \qquad (5)$$
$$\Gamma(12, 3) = \delta(12)\,\delta(13) + \int d(4567)\,\left[\delta\Sigma(12)/\delta G(45)\right] G(46)\, G(75)\, \Gamma(67, 3) \qquad (6)$$
where 1 ≡ (x₁, t₁) and 1⁺ ≡ (x₁, t₁ + η) (η > 0 infinitesimal). Here v stands for the bare Coulomb interaction, P is the irreducible polarization, W is the dynamically screened Coulomb interaction, and Γ is the so-called vertex function. G₀ is the single-particle DFT Green's function, $G_0(x, x'; \omega) = \sum_n \psi_n(x)\,\psi_n^{*}(x')/[\omega - \varepsilon_n - i\eta\,\mathrm{sgn}(\mu - \varepsilon_n)]$, with η a positive infinitesimal, µ the chemical potential, and ψn and εn the corresponding DFT wavefunctions and eigenenergies. This way of writing down the equations is in fact appealing, since it highlights the important physical ingredients: the polarization (which contains the response of the system to the additional particle or hole) is built up by the creation of particle–hole pairs
(described by the two-particle Green’s functions). The vertex function contains the information that the hole and the electron interact. This set of equations defines an iterative approach that allows us to gather information about quasiparticle excitations and dynamics. The iterative approach of course has to be approximated. We now describe some of the approximations used in the literature to address quasiparticle excitations and their subsequent extension to optical spectroscopy and exciton states.
3. Quasiparticle Excitations: the GW Approach
In practical first-principles implementations, the GW approximation [1] is employed, in which the self-energy operator is taken to be the first-order term in a series expansion in terms of the screened Coulomb interaction W and the dressed Green's function G of the electron:

$$P(12) = -i\, G(12)\, G(21) \qquad (7)$$
$$\Sigma(12) = i\, G(12^{+})\, W(12) \qquad (8)$$
(in frequency space: $\Sigma(\mathbf{r}, \mathbf{r}'; \omega) = (i/2\pi)\int d\omega'\, e^{-i\omega'\eta}\, G(\mathbf{r}, \mathbf{r}'; \omega - \omega')\, W(\mathbf{r}, \mathbf{r}'; \omega')$). Vertex corrections are not included in this approximation. This corresponds to the simplest approximation for Γ(123), assuming it to be diagonal in space and time coordinates, i.e., Γ(123) = δ(12)δ(13). This has to be complemented with Eq. (5) above. Thus, even at the GW level, we have a many-body self-consistent problem. Most ab initio GW applications perform this self-consistent loop by (1) taking the DFT results as the mean field and (2) varying the energy of the quasiparticle while keeping its wavefunction fixed (equal to the DFT wavefunction). This corresponds to the G₀W₀ scheme for the calculation of the quasiparticle energy as a first-order perturbation to the Kohn–Sham energy ε_nk:

$$E_{n\mathbf{k}} \approx \varepsilon_{n\mathbf{k}} + \langle n\mathbf{k}|\,\Sigma(E_{n\mathbf{k}}) - V_{xc}\,|n\mathbf{k}\rangle, \qquad (9)$$
where Vxc is the exchange-correlation potential within DFT and |nk⟩ is the corresponding wavefunction. This "G₀W₀" approximation reproduces to within 0.1 eV the experimental band gaps of many semiconductors and insulators and their surfaces, thus circumventing the well-known bandgap problem [2, 3]. It also gives much better HOMO–LUMO gaps and ionization energies in localized systems, as well as results for the lifetimes of hot electrons in metals and image states at surfaces [7]. For some systems, the quasiparticle wavefunction can differ significantly from the DFT wavefunction; one then needs to solve the quasiparticle equation, Eq. (1), directly.
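In practice, once a GW code has tabulated the diagonal self-energy matrix element as a function of frequency, Eq. (9) is often linearized around the Kohn–Sham eigenvalue with a renormalization factor Z = [1 − ∂Σ/∂ω]⁻¹. A schematic sketch of this post-processing step (all names are illustrative; sigma is assumed to return Re⟨nk|Σ(ω)|nk⟩):

    def qp_energy_linearized(eps_ks, vxc_nn, sigma, dw=1.0e-3):
        # Linearized solution of Eq. (9) for one state |nk>:
        #   E_qp = eps_ks + Z * ( <Sigma(eps_ks)> - <V_xc> )
        dsig = (sigma(eps_ks + dw) - sigma(eps_ks - dw)) / (2.0 * dw)
        Z = 1.0 / (1.0 - dsig)            # quasiparticle renormalization
        return eps_ks + Z * (sigma(eps_ks) - vxc_nn)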
4. Optical Response: the Bethe–Salpeter Equation
From Eqs. (2)–(6) for the GW self-energy, we have a non-vanishing functional derivative δΣ/δG. One obtains a second-order correction to the bare vertex Γ⁽¹⁾(123) = δ(12)δ(13):

$$\Gamma^{(2)}(123) = \delta(12)\,\delta(13) + \int d(4567)\,\left[\delta\Sigma^{(1)}(12)/\delta G_{0}(45)\right] G_{0}(46)\, G_{0}(75)\,\Gamma^{(1)}(673). \qquad (10)$$
This can be viewed as the linear response of the self-energy to a change in the total potential of the system. The vertex correction accounts for exchange-correlation effects between an electron and the other electrons in the screening density cloud. In particular, it includes the electron–hole interaction (excitonic effects) in the dielectric response.* Indeed, the functional derivative of G is responsible for the attractive direct term in the electron–hole interaction that goes into the effective two-particle equation, the Bethe–Salpeter equation, which determines the spectrum and wavefunctions of the correlated electron–hole neutral excitations created, for example, in optical experiments. Taking as the first-order self-energy Σ⁽¹⁾ = G₀W₀, it is easy to derive a Bethe–Salpeter equation which correctly yields features like bound excitons and changes in absorption strength in the optical absorption spectra. Within this scheme [7, 10], the effective two-particle Hamiltonian takes (when static screening is used in W) a particularly simple, energy-independent form:
$$\sum_{n_{3}n_{4}}\left[(\varepsilon_{n_{1}}-\varepsilon_{n_{2}})\,\delta_{n_{1}n_{3}}\,\delta_{n_{2}n_{4}} + u_{(n_{1}n_{2})(n_{3}n_{4})} - W_{(n_{1}n_{2})(n_{3}n_{4})}\right] A_{S}(n_{3}n_{4}) = \Omega_{S}\, A_{S}(n_{1}n_{2}), \qquad (11)$$
where A_S is the electron–hole amplitude and the matrix elements are taken with respect to the quasiparticle wavefunctions n₁, …, n₄ as follows: $u_{(n_1n_2)(n_3n_4)} = \langle n_1 n_2|u|n_3 n_4\rangle$ and $W_{(n_1n_2)(n_3n_4)} = \langle n_1 n_3|W|n_2 n_4\rangle$, with u equal to the Coulomb potential v except for the long-range component q = 0, which is set to zero (that is, u(q) = 4π/q² but with u(0) = 0). The solution of Eq. (11) allows one to construct the optical absorption spectrum from the imaginary part of the macroscopic dielectric function ε_M:

$$\mathrm{Im}[\varepsilon_{M}(\omega)] = \frac{16\pi e^{2}}{\omega^{2}} \sum_{S} \left|\hat{\mathbf{e}}\cdot\langle 0|\,(i/\hbar)[H,\mathbf{r}]\,|S\rangle\right|^{2} \delta(\omega-\Omega_{S}), \qquad (12)$$
* Vertex corrections and self-consistency tend to cancel to a large extent for the 3D homogeneous electron gas. This cancellation of vertex corrections with self-consistency seems to be a quite general feature. However, there is no formal justification for it and further work along the direction of including consistently dynamical effects and vertex corrections should be explored (Aryasetiawan and Gunnarsson, 1998; and references therein).
where ê is the normalized polarization vector of the light and (i/ħ)[H, r] is the single-particle velocity operator. The sum runs over all the excited states |S⟩ of the system (with excitation energy Ω_S), and |0⟩ is the ground state. One of the main effects of the electron–hole interaction is the coupling of different electron–hole configurations (denoted by |he⟩), which modifies the usual interband transition matrix elements that appear in Eq. (12) to:

$$\langle 0|\,(i/\hbar)[H,\mathbf{r}]\,|S\rangle = \sum_{h}^{\mathrm{holes}} \sum_{e}^{\mathrm{electrons}} A_{S}(h,e)\,\langle h|\,(i/\hbar)[H,\mathbf{r}]\,|e\rangle.$$

In this context, the Bethe–Salpeter approach to the calculation of two-particle excited states is a natural extension of the GW approach for the calculation of one-particle excited states, within the same theoretical framework and set of approximations (the GW-BSE scheme). As we shall see below, GW-BSE calculations have helped elucidate the optical spectra for a wide range of systems, from nanostructures to bulk semiconductors to surfaces and 1D polymers and nanotubes.
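For a static W, Eqs. (11)–(12) amount to building and diagonalizing a finite electron–hole Hamiltonian. The toy fragment below uses random placeholders for the quasiparticle pair energies, the u and W matrix elements, and the velocity matrix elements (in a real GW-BSE code these come from the preceding GW and screening calculations), and accumulates a Lorentzian-broadened Im ε_M(ω) up to constant prefactors:

    import numpy as np

    rng = np.random.default_rng(1)
    nt = 6                                  # number of e-h configurations

    e_pair = 2.0 + rng.random(nt)           # placeholder eps_e - eps_h (eV)
    u = rng.random((nt, nt)); u = 0.05 * (u + u.T)   # exchange (repulsive)
    W = rng.random((nt, nt)); W = 0.20 * (W + W.T)   # direct (attractive)
    dip = rng.random(nt)                    # placeholder <h|v|e>

    H_eh = np.diag(e_pair) + u - W          # two-particle H of Eq. (11)
    Omega_S, A_S = np.linalg.eigh(H_eh)     # excitation energies, amplitudes

    M = A_S.T @ dip                         # <0|v|S> = sum_he A_S(h,e) <h|v|e>
    w = np.linspace(1.0, 4.0, 400)
    eta = 0.02                              # Lorentzian width replacing delta
    im_eps = sum(abs(M[s]) ** 2 * eta / ((w - Omega_S[s]) ** 2 + eta ** 2)
                 / w ** 2 for s in range(nt))

The coupling of configurations through u − W is what redistributes oscillator strength among the excitations, the mechanism invoked below for the E2 peak of GaAs.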
5. Applications to Bulk Materials and Surfaces
Since the mid-1980s, the GW approach has been employed with success in the study of quasiparticle excitations in bulk semiconductors and insulators [2, 3, 9, 11, 12]. In Fig. 1, the calculated GW band gaps of a number of insulating materials are plotted against the measured quasiparticle gaps [11]. A perfect agreement between theory and experiment would place the data points on the diagonal line. As seen from the figure, the Kohn–Sham gaps in the local density approximation (LDA) significantly underestimate the experimental values, giving rise to the bandgap problem. Some of the Kohn–Sham gaps are even negative. However, the GW results (which provide an appropriate description of particle-like excitations in an interacting system) are in excellent agreement with experiments for a range of materials – from small-gap semiconductors such as InSb, to moderate-gap materials such as GaN and solid C60, to large-gap insulators such as LiF. In addition, the GW quasiparticle band structures for semiconductors and conventional metals in general compare very well with data from photoemission and inverse photoemission measurements. Figure 2 depicts the calculated quasiparticle band structures of germanium [11] and copper [13] as compared to photoemission data for the occupied states and inverse photoemission data for the unoccupied states. For Ge, the agreement is within the error bars of the experiments. In fact, the conduction-band energies of Ge were theoretically predicted before the inverse photoemission measurement. The results for Cu agree with photoemission data to within 30 meV for the highest d-band, correcting 90% of the LDA error. The energies of the other d-bands throughout the Brillouin zone are reproduced within 300 meV, and the maximum error (about 600 meV) is found for the bottom valence band at the Γ point, where only 50% of the LDA error is corrected. This level of agreement for the d-bands cannot be obtained without including self-energy contributions.*

* On the other hand, the total bandwidth is still larger than the measured one. This overestimate of the GW bandwidth for metals with respect to the experimental one seems to be a rather general feature, which is not yet properly understood.
Figure 1. Comparison of the GW bandgap with experiment for a wide range of semiconductors and insulators. The Kohn–Sham eigenvalue gaps calculated within the local density approximation (LDA) are also included for comparison. (after Ref. [11]).
Figure 2. Calculated GW quasiparticle band structure of Ge (left panel) and Cu (right panel) as compared with experiments (open and full symbols). In the case of Cu we also provide the DFT-LDA band structure as dashed lines. (after Refs. [11, 13]).
Figure 3. Computed GW quasiparticle bandstructure for the Si(111) 2 × 1 surface compared with experimental results (dots). On the left we show a model of the surface reconstruction (after Ref. [15]).
Similar results have been obtained for other materials, and even for some nonconventional insulating systems such as the transition metal oxides and metal hydrides. The GW approach has also been used to investigate the quasiparticle excitation spectrum of surfaces, interfaces, and clusters. Figure 3 gives the example of the Si(111) 2 × 1 surface [14, 15]. This surface has a very interesting geometric and electronic structure. At low temperature, to minimize the surface energy, the surface undergoes a 2 × 1 reconstruction, with the surface atoms forming buckled π-bonded chains. The ensuing structure has an occupied and an unoccupied quasi-1D surface-state band, which are dispersive only along the π-bonded chains and give rise to a quasiparticle surface-state bandgap of 0.7 eV that is very different from the bulk Si bandgap of 1.2 eV. The calculated quasiparticle surface-state bands are compared to photoemission and inverse photoemission data in Fig. 3. As seen in the figure, both the calculated surface-state band dispersion and bandgap are in good agreement with experiment, and these results are also in accord with results from scanning tunneling spectroscopy (STS), which physically also probes quasiparticle excitations. But a long-standing puzzle in the literature has been that the measured surface-state gap of this system from
optical experiments differs significantly (by nearly 0.3 eV) from the quasiparticle gap, indicative of perhaps very strong electron–hole interaction on this surface. We shall take up this issue later when we discuss optical response.

Owing to interactions with other excitations, quasiparticle excitations in a material are not exact eigenstates of the system and thus possess a finite lifetime. The relaxation lifetimes of excited electrons in solids can be attributed to a variety of inelastic and elastic scattering mechanisms, such as electron–electron (e–e), electron–phonon (e–p), and electron–imperfection interactions. The theoretical framework to investigate the inelastic lifetime of the quasiparticle (due to the electron–electron interaction, as manifested in the imaginary part of Σ) has been based for many years on the electron-gas model of Fermi liquids, characterized by the electron-density parameter rs. In this simple model, for either electrons or holes with energy E very near the Fermi level, the inelastic lifetime is found to be, in the high-density limit (rs ≪ 1), τ(E) = 263 rs^(−5/2) (E − E_F)^(−2) fs, where E and the Fermi energy E_F are expressed in eV [16]. A proper treatment of the electron dynamics (quasiparticle damping rates or lifetimes), however, needs to include bandstructure and dynamical screening effects in order to be in quantitative agreement with experiment. An illustrative example is given in Fig. 4, where the quasiparticle lifetimes of electrons and holes in bulk Cu and Au have been evaluated within the GW scheme, showing an increase in the lifetime close to the Fermi level as compared to the predictions of the free-electron-gas model. For Au, a major contribution from the occupied d states to the screening yields lifetimes of electrons that are larger than those of electrons in a free-electron-gas model by a factor of about 4.5 for electrons with energies 1–3 eV above the Fermi level. This prediction is in agreement with a recent experimental study of ultrafast electron dynamics in Au(111) films [17].

Figure 4. Calculated GW electron and hole lifetimes for Cu and Au. Solid and open circles represent the ab initio calculation of τ(E) for electrons and holes, respectively, as obtained after averaging over wavevectors and the bandstructure for each k vector. The solid and dotted lines represent the corresponding lifetime of electrons (solid line) and holes (dotted line) in a free electron gas with rs = 2.67 for Cu and rs = 3.01 for Au. In the inset for Au the theoretical results (solid circles) are compared with experimental data (open circles) from Ref. [17]. (after Refs. [18, 19]).
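As a quick numerical illustration of the free-electron-gas lifetime formula quoted above (the ab initio curves of Fig. 4 include bandstructure and dynamical-screening effects that this estimate ignores):

    def feg_lifetime_fs(E_minus_EF, rs):
        # tau(E) = 263 rs^(-5/2) (E - E_F)^(-2) fs, energies in eV
        return 263.0 * rs ** (-2.5) * E_minus_EF ** (-2)

    # For Cu (rs = 2.67) at 1 eV above the Fermi level: about 23 fs.
    print(feg_lifetime_fs(1.0, 2.67))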
Up until the late 1990s, the situation for the ab initio calculation of the optical properties of real materials was, however, not nearly as good as that for the quasiparticle properties. As discussed in Section 4, for the optical response of an interacting electron system, we must also include the electron–hole interaction, or excitonic effects. The important consequence of such effects is shown in Fig. 5, where the computed absorption spectrum of SiO2 neglecting the electron–hole interaction is compared with the experimental spectrum [20]. There is hardly any resemblance between the spectrum from the non-interacting theory and that of experiment, which has led to extensive debates over the past 40 years on the nature of the four very sharp peaks observed in the experiment. We shall return to this technologically important material later.

With the advance of the GW-BSE method [21–24], accurate ab initio calculation of the optical spectra of materials is now possible. As discussed above, solving the Bethe–Salpeter equation yields both the excitation energy and the coupling coefficients among the different electron–hole configurations that form the excited state. The resulting excited-state energies and electron–hole amplitudes can then be used to compute the optical (or energy-loss and related) spectrum including excitonic effects. The approach has been employed to obtain quite accurately the optical transitions to both the bound and continuum states of various materials [7, 10, 21, 22], including reduced dimensional systems and nanostructures.
Figure 5. Comparison of the calculated absorption spectrum of SiO2 including excitonic effects (continuous curve) and neglecting electron–hole interaction (dot-dashed curve) with the experimental spectrum (dashed curve) taken from Ref. [25] (after Ref. [20]).
For bulk GaAs, the GW-BSE results for the optical absorption are compared with experiments in Fig. 6. We see that, even for this simple and well-known semiconductor, only with the inclusion of the electron–hole interaction do we have good agreement between theory and experiment. The influence of the electron–hole interaction effects extends over an energy range far above the fundamental band gap. As seen from the figure, the optical strength of GaAs is enhanced by nearly a factor of two in the low-frequency regime. Also, the electron–hole interaction enhances and shifts the second prominent peak (the so-called E2 peak) structure at 5 eV much closer to experiment. This very large shift of about 1/2 eV in the E2 peak is not due to a negative shift of the transition energies, as one might naively expect from an attractive electron–hole interaction. The changes in the optical spectrum originate mainly from the coupling of different electron–hole configurations in the excited states, which leads to a constructive coherent superposition of the interband transition oscillator strengths for transitions at lower energies and to a destructive superposition at energies above 5 eV [21, 22]. In addition to the continuum part of the spectrum, one can also obtain the bound exciton states near the absorption edge from the Bethe–Salpeter equation from first principles, without making use of any effective-mass approximation. For the case of GaAs, we see in Table 1 that the theory basically reproduces all the observed bound exciton structures to a very high level of accuracy.
Figure 6. Theoretical (continuous line) and measured (dots) optical absorption spectra for bulk GaAs. The experimental data are taken from Refs. [26, 27]. The calculated spectrum without inclusion of electron–hole interaction (dashed curve) is also given for completeness (after Refs. [21, 22]).
Table 1. Calculated exciton binding energies near the absorption edge for GaAs. The GW-BSE calculations are from [21, 22] and the experimental data are from [26].

    Binding energy    Theory (meV)    Experiment (meV)
    E_1s              4.0             4.2
    E_2s              0.9             1.0
    E_2p              0.2–0.7         0–1
The scheme can thus be applied directly to situations in which simple empirical techniques do not hold. Similarly accurate results have been obtained for the other semiconductors. For larger-gap materials, excitonic effects are even more dramatic in the optical response, as seen for the case of SiO2 [20] in Fig. 5. The quasiparticle gap of α-quartz is 10 eV. From the ab initio calculation, we learn that all the prominent peaks seen in the experiment – and also in theory when the electron–hole interaction is included – are due to transitions to excitonic states. The much-debated peaks in the experimental spectrum are in fact due to the strong correlations between the excited electron and hole in resonant excitonic states, since these excited states have energies higher than the quasiparticle band gap.
6. Applications to Reduced Dimensional Systems and Nanostructures
The GW-BSE approach has been particularly valuable in explaining and predicting the quasiparticle excitations and optical response of reduced dimensional systems and nanostructures. This is because Coulomb interaction effects in general are more dominant in lower-dimensional systems, owing to geometrical and symmetry restrictions. As illustrated below, self-energy and electron–hole interaction effects can be orders of magnitude larger in nanostructures than in bulk systems made up of the same elements. A good example of a reduced dimensional system is the conjugated polymers. The optical properties of these technologically important systems are still far from well understood when compared to conventional semiconductors [28]. For example, there has been much argument in the literature regarding the binding energy of excitons in polymers such as poly-phenylene-vinylene (PPV); values ranging from 0.1 to 1.0 eV had been suggested. Ab initio calculations using the GW-BSE approach show that excitonic effects in PPV are indeed dominant and qualitatively change the optical spectrum of the material. This is shown in Fig. 7, where we see that each of the 1D van Hove singularities in the interband absorption spectrum is replaced by a series of sharp peaks due to excitonic states.
Figure 7. Optical absorption spectra of the polymer PPV. Theoretical results with (continuous line) and without (dashed line) including excitonic effects (after Ref. [28]).
The lowest optically active exciton is a bound exciton state, but the others are strong resonant exciton states, giving rise to peak structures that agree very well with experiment. In particular, when compared to the quasiparticle gap of 3.3 eV, the theoretical results in Fig. 7 yield a very large binding energy of nearly 1 eV for the lowest-energy bound exciton in PPV.

The reduced dimensionality at a surface can also greatly enhance excitonic effects. For example, in the case of the Si(111) 2 × 1 surface [28], it is found that the surface optical spectrum at low frequencies is dominated by a surface-state exciton which has a binding energy an order of magnitude bigger than that of bulk Si, and one cannot interpret the experimental spectrum without considering the excitonic effects. This is illustrated in Fig. 8, where the measured differential reflectivity is compared with theory. Here we find that the peak in the differential reflectivity spectrum is dictated by a surface-state exciton with a binding energy of 0.23 eV. This very large binding energy for the surface-state exciton is to be compared to the excitonic binding energy in bulk Si, which is only 15 meV. The large enhancement in the electron–hole interaction at this particular surface arises from the quasi-1D nature of the surface states, which are localized along the π-bonded atomic chains on the surface. Similar excitonic calculations for the Ge(111) 2 × 1 reconstructed surface demonstrate how optical differential reflectivity spectra can be used to distinguish between the two possible isomers of the reconstructed surface (see right panel in Fig. 8). This distinction has been enabled by the fact that a quantitative comparison between the calculated and experimental spectra is possible when electron–hole effects are treated correctly.
Figure 8. Comparison between experiments and the computed differential reflectivity spectra with and without electron–hole interaction for the Si(111)2 × 1 surface (left panel) [28] and for Ge(111)2 × 1 (right panel) [29].
and experimental spectrum is possible when electron–hole effects are treated correctly [29]. Another 1D system of great current interest is the carbon nanotubes [30]. These are tubular structures of graphene with diameter in the range of one nanometer and length that can be many hundreds of microns or longer. The carbon nanotubes can be metals or semiconductors depending sensitively on their geometric structure, which is indexed by a pair of integers (m, n) where m and n are the two integers specifying the circumferential vector in units of the two primitive translation vectors of graphene. Recent experimental advances have allowed the measurement of the optical response of well-characterized individual, single-walled carbon nanotubes (SWCNTs). For example, absorption measurement on well-aligned samples of SWCNTs of uniform diameter of 4 Å grown inside the channels of zeolites has been performed [31]. And, through the use of photoluminescence excitation techniques, the Rice group has succeeded in measuring both the first and second optical transition energies of well identified, individually isolated, semiconducting SWCNTs [32, 33]. The optical properties of these tubes are found be to quite unusual and cannot be explained by conventional theories. Because of the reduced dimensionality of the nanotubes, many-electron (both quasiparticle and excitonic) effects have been shown to be extraordinarily important in these systems [34, 35]. Figure 9 illustrates the effects of many-electron interactions on the quasiparticle excitation energies of the carbon nanotubes. Plotted in the figure are the quasiparticle corrections to the LDA Kohn–Sham energies for the metallic (3,3) carbon nanotube and the semiconducting (8,0) carbon nanotube.
Figure 9. Plot of the quasiparticle corrections to the DFT Kohn–Sham eigenvalues due to selfenergy effects as a function of the energy of the states for the metallic (3,3) carbon nanotube (left panel) and the semiconducting (8,0) carbon nanotube (right panel) (after Refs. [34, 35]).
Figure 10. Calculated quasiparticle density of states (left panel) and optical absorption spectrum (right panel) for the (3,3) carbon nanotube (after Refs. [34, 35]).
The general trends are that, for the metallic tubes, the corrections are relatively straightforward: basically, they stretch the bands by ∼15%, as in the case of graphite [11]. The self-energy corrections to the quasiparticle energies of the semiconducting tubes, however, are quite large. The corrections cause a large opening of the minimum band gap, as well as a stretching of the bands. As seen in Fig. 9, the self-energy corrections cause the minimum quasiparticle gap of the (8,0) carbon nanotube to open up by nearly 1 eV. Many-electron interaction effects play an even more important role in the optical response of the carbon nanotubes. The calculated optical spectrum of the metallic (3,3) nanotube (which is one of the 4 Å diameter SWCNTs) is presented in Fig. 10. The left panel shows the electronic density of states. Because of the symmetry of the states, only certain transitions between states (indicated by the arrow A) are optically allowed. The right panel compares the calculated
imaginary part of the dielectric response function for the cases with and without electron–hole interactions. The optical spectrum of the (3,3) nanotube is changed qualitatively by the existence of a bound exciton, even though the system is metallic. This rather surprising result comes from the fact that, although the tube is metallic, there is a symmetry gap in the electron–hole spectrum (i.e., no free electron–hole states of the same symmetry as the exciton are possible in the energy range of the excitonic state). The symmetry gap is possible here because the (3,3) tube is a 1D metal – i.e., all k-states can have well-defined symmetry. Figure 11 depicts the results for the (5,0) tube, which is another metallic SWCNT of 4 Å diameter. The surprise here is that, for the range of frequencies considered, the electron–hole interaction in this tube is a net repulsion between the excited electron and hole. Unlike the case of bulk semiconductors, owing to the symmetry of the states involved and to metallic screening, the repulsive exchange term dominates over the attractive direct term in the electron–hole interaction. As a consequence, there are no bound exciton states in Fig. 11, and there is a suppression of the optical strength at the van Hove singularities, especially for the second peak in the spectrum.
Figure 11. Calculated quasiparticle density of states (left panel) and optical absorption spectrum (right panel) for the (5,0) carbon nanotube (after Refs. [34, 35]).
One expects the above excitonic effects to be even more pronounced in the semiconducting nanotubes. Indeed, this is the case. Figure 12 compares the calculated absorption spectra of the (8,0) tube with and without electron–hole interactions. The two resulting spectra are qualitatively and dramatically different. When electron–hole interaction effects are included, the spectrum is dominated by bound and resonant excitonic states. With interactions, each van Hove singularity structure in the non-interacting spectrum gives rise to a series of exciton states. For the (8,0) tube, the lowest-energy bound exciton has a binding energy of more than 1 eV. Note that the exciton binding energy for bulk semiconductors of similar band gap is in general only of the order of tens of meV. This illustrates again the dominance of many-electron Coulomb effects in the carbon nanotubes owing to their reduced dimensionality. The bottom two panels in Fig. 12 give the spatial correlation between the excited electron and hole in two of the exciton states, one bound and one resonant. The extent of the exciton wavefunction is about 25–30 Å for both of these states.
Figure 12. Optical absorption spectra for the (8,0) carbon nanotube (top panel) and the spatial extent of the excitonic wavefunction along the tube axis for a bound and a resonant excitonic state (after Refs. [34, 35]).
In Table 2 we compare the calculated results for the 4 Å diameter tubes with experimental data. For the samples of 4 Å diameter single-walled carbon nanotubes grown in the channels of the zeolite AlPO4-5 crystal, the Hong Kong group observed three prominent peaks in the optical absorption spectrum [31]. There are only three possible types of carbon nanotubes with a diameter of 4 Å – (5,0), (4,2) and (3,3) – and all three types are expected to be present in these samples. The theoretical results quantitatively explain from first principles the three observed peaks and identify their physical origin.
Table 2. Comparison between experimental [31] and calculated main absorption peaks for all possible 4 Å diameter carbon nanotubes – (5,0), (4,2) and (3,3)

CNT      Theory (eV)    Experiment (eV)    Character
(5,0)    1.33           1.37               Interband
(4,2)    2.0            2.1                Exciton
(3,3)    3.17           3.1                Exciton
Table 3. Calculated lowest two optical transition energies for the (8,0) and (11,0) carbon nanotubes compared to experimental values [32, 33]. It is noted that the ratio between the two transition energies deviates strongly from the value of 2 predicted by a simple independent-particle model (after Refs. [34, 35])

                   (8,0)                     (11,0)
            Experiment   Theory       Experiment   Theory
E11         1.6 eV       1.6 eV       1.2 eV       1.1 eV
E22         1.9 eV       1.8 eV       1.7 eV       1.6 eV
E22/E11     1.19         1.13         1.42         1.45
The first peak is due to an interband-transition van Hove singularity of the (5,0) tubes, whereas the second and third peaks are due to the formation of excitons in the (4,2) and (3,3) tubes, respectively [34–36]. The theoretical results [34, 35] on the larger semiconducting tubes have also been used to elucidate the findings from photoluminescence excitation measurements, which yielded detailed information on optical transitions in individual single-walled nanotubes. Table 3 gives a comparison between experiment and theory for the particular cases of the (8,0) and (11,0) tubes. The measured transition energies are in excellent agreement with theory. In particular, we found that the large reduction of the ratio of the second to the first transition energy, E22/E11, from the value of 2 predicted by simple interband transition theory is due to a combination of factors – bandstructure effects, quasiparticle self-energy effects, and excitonic effects. One must include all these factors to understand the optical response of the semiconducting carbon nanotubes. Another example of low-dimensional systems is clusters. In Fig. 13 we show results for the optical spectrum of the Na4 cluster calculated using the GW-BSE approach, together with those from TDLDA and experiment. The measured spectrum consists of three peaks in the 1.5–3.5 eV range and a broader feature around 4.5 eV. The agreement between the results of the TDDFT-based calculations and the GW-BSE calculations is very good. The comparison with the experimental peak positions is also quite good, although the calculated peaks appear shifted to higher energies by approximately 0.2 eV. Good agreement has been obtained for other small semiconductor and metal clusters.
Figure 13. Calculation of the optical absorption (proportional to the strength function) of a Na4 cluster using the GW-BSE scheme (dashed line) (from Ref. [37]) and with TDDFT using different kernels [38]: TDLDA (solid line), exact-exchange (dotted line). Filled dots represent the experimental results from Ref. [39] (after Ref. [7]).
The above are just a few selected examples, given to illustrate the current status of ab initio calculations of quasiparticle and optical properties of materials. Similar results have been obtained for the spectroscopic properties of many other moderately correlated electron systems, in particular semiconducting systems, to a typical level of accuracy of about 0.1 eV.
7. Conclusions and Perspectives
We have discussed in this article the theory and applications of an ab initio approach to calculating electron excitation energies, optical spectra, and exciton states in real materials. The approach is based on evaluating the one-particle and the two-particle Green’s function, respectively, for the quasiparticle and optical excitations of the interacting electron system, including relevant electron self-energy and electron–hole interaction effects at the GW approximation level. It provides a unified approach to the investigation of both extended and confined systems from first principles. Various applications have shown that the method is capable of describing successfully the spectroscopic properties of a range of systems including semiconductors, insulators, surfaces, conjugated polymers, small clusters, nanotubes and other nanostructures. The agreement between theoretical spectra and data from experiments such as photoemission, tunneling,
optical and related measurements is in general remarkably good for moderately correlated electron systems. A popular alternative scheme for addressing the optical response is TDDFT. In particular, the optical response of simple metal clusters and biomolecules is well reproduced by the standard TDLDA approximation [7, 40]. However, if we increase the size of the system towards a periodic structure in one, two or three dimensions (i.e., polymers, slabs, surfaces or solids), we must be careful with the form of the exchange-correlation functional employed. In contrast to the GW-BSE scheme, difficulties arise when applying TDDFT, for example, to long conjugated molecular chains, where the strong non-locality of the exact functional is not well reproduced by the usual approximations. Similarly, for bulk semiconductors and insulators, the standard functionals fail to describe the optical absorption spectra. The reason has been traced to the fact that the exchange-correlation kernel f_xc (which describes the electron–hole interaction within TDDFT) should behave asymptotically, in momentum space, as 1/q^2 as q goes to 0 [7]. This condition, however, is satisfied by neither the LDA nor the GGA. Input from the GW-BSE method has in fact been employed to improve the approximate exchange-correlation functionals for use in the TDDFT scheme ([41] and references therein). Such a many-body-based f_xc has given results for the optical and loss spectra of bulk materials such as LiF and SiO2 that are in quite good agreement with the Bethe–Salpeter equation results and with experiments (see Fig. 14). Both the spatial nonlocality and the frequency dependence of the f_xc kernel turn out to be important for a proper description of excitonic effects. However, quasiparticle effects still need to be embodied properly within this new approximate TDDFT scheme. An interesting practical question is: which of the two approaches, the GW-BSE method or TDDFT, will be more efficient in computing the optical properties of the different systems of interest in the future?∗
The overall success of the first-principles many-body Green’s function approach is impressive and has been highly valued. Nevertheless, the G0W0 scheme can be refined in some applications. Studies have shown that: (i) inclusion of vertex corrections improves the description of the absolute position of quasiparticle energies, although the amount of such corrections depends sensitively on the model used for the vertex [7, 9]; (ii) vertex effects slightly change the occupied bandwidth of the homogeneous electron gas, but this correction is not enough to fit the experimental results for metals such as Na;
* The GW Bethe–Salpeter equation approach offers a clear physical and straightforward picture for the analysis of results and further improvements. It works over a wide range of systems for both quasiparticle and optical excitations. The TDDFT approach, on the other hand, is appealing since it computes optical response more efficiently, but it is appropriate only for neutral excitations and its range of validity is uncertain because of uncontrolled approximations to the functionals.
Figure 14. Calculated optical absorption spectra within the GW-BSE approach (continuous line) and those from a new TDDFT f_xc kernel derived from the BSE method (dashed line) are compared to experiment (open dots). The independent-quasiparticle response (dashed-dotted line) is also shown (after Ref. [41]).
(iii) for the bandwidth of simple metals, self-consistency studies performed for the homogeneous electron gas [42] showed that partially self-consistent GW0 calculations – in which W is calculated only once using the random-phase approximation (RPA), so that Eq. (7) is not included in the iterative process – only slightly increase the G0W0 occupied bandwidth. Results are even worse at full self-consistency without vertex corrections. The effects of self-consistency must therefore be balanced by the proper inclusion of vertex corrections. This, however, is not the case for the calculation of total energies, where the fully self-consistent GW solution appears to provide better results than the G0W0 procedure. But if one is interested in spectroscopic properties, a self-consistent GW procedure seems to perform worse than the simpler G0W0 scheme. Experience from numerous past applications to bulk solids and reduced-dimensional systems has demonstrated that in general the GW scheme is an
excellent approximation for the evaluation of the quasiparticle and optical properties of moderately correlated systems. Methods beyond the GW approximation are expected to be required for the study of the spectral features of highly correlated systems. The GW-BSE approach described in this article, however, is arguably the most reliable, practical, and versatile tool we have at present to tackle the optical and electronic response of real material systems from first principles. Further developments in the field should address the proper treatment of self-consistency and vertex corrections. This would further extend the range of applicability of this already successful many-body Green’s function approach.
References
[1] L. Hedin, “New method for calculating the one-particle Green’s function with application to the electron-gas problem,” Phys. Rev., 139, A796, 1965.
[2] M.S. Hybertsen and S.G. Louie, “First-principles theory of quasiparticles: calculation of band gaps in semiconductors and insulators,” Phys. Rev. Lett., 55, 1418, 1985.
[3] M.S. Hybertsen and S.G. Louie, “Electron correlation in semiconductors and insulators: band gaps and quasiparticle energies,” Phys. Rev. B, 34, 5390, 1986.
[4] W. Kohn, “Nobel lecture: electronic structure of matter – wave functions and density functionals,” Rev. Mod. Phys., 71, 1253, 1999.
[5] E. Runge and E.K.U. Gross, “Density-functional theory for time-dependent systems,” Phys. Rev. Lett., 52, 997, 1984.
[6] E.K.U. Gross, J. Dobson, and M. Petersilka, “Density functional theory of time-dependent phenomena,” In: R.F. Nalewajski (ed.), Density Functional Theory II, Topics in Current Chemistry, vol. 181, Springer, Berlin, p. 81, 1996.
[7] G. Onida, L. Reining, and A. Rubio, “Electronic excitations: density functional versus many-body Green’s-function approaches,” Rev. Mod. Phys., 74, 601, 2002.
[8] L. Hedin and S. Lundqvist, “Effects of electron–electron and electron–phonon interactions on the one-electron states of solids,” In: H. Ehrenreich, F. Seitz, and D. Turnbull (eds.), Solid State Physics, vol. 23, Academic Press, New York, p. 1, 1969.
[9] F. Aryasetiawan and O. Gunnarsson, “The GW method,” Rep. Prog. Phys., 61, 237, 1998.
[10] M. Rohlfing and S.G. Louie, “Electron–hole excitations and optical spectra from first principles,” Phys. Rev. B, 62, 4927, 2000.
[11] S.G. Louie, “First-principles theory of electron excitation energies in solids, surfaces, and defects,” In: C.Y. Fong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, p. 96, 1997.
[12] W.G. Aulbur, L. Jönsson, and J. Wilkins, “Quasiparticle calculations in solids,” In: Solid State Physics, vol. 54, p. 1, 2000.
[13] A. Marini, G. Onida, and R. Del Sole, “Quasiparticle electronic structure of copper in the GW approximation,” Phys. Rev. Lett., 88, 016403, 2002.
[14] J.E. Northrup, M.S. Hybertsen, and S.G. Louie, “Many-body calculation of the surface state energies for Si(111)2 × 1,” Phys. Rev. Lett., 66, 500, 1991.
[15] M. Rohlfing and S.G. Louie, “Optical excitations in conjugated polymers,” Phys. Rev. Lett., 82, 1959, 1999.
[16] P.M. Echenique, J.M. Pitarke, E. Chulkov, and A. Rubio, “Theory of inelastic lifetimes of low-energy electrons in metals,” Chem. Phys., 251, 1, 2000.
[17] J. Cao, Y. Gao, H.E. Elsayed-Ali, R.J.D. Miller, and D.A. Mantell, “Femtosecond photoemission study of ultrafast dynamics in single-crystal Au(111) films,” Phys. Rev. B, 58, 10948, 1998.
[18] I. Campillo, J.M. Pitarke, A. Rubio, E. Zarate, and P.M. Echenique, “Inelastic lifetimes of hot electrons in real metals,” Phys. Rev. Lett., 83, 2230, 1999.
[19] I. Campillo, A. Rubio, J.M. Pitarke, A. Goldman, and P.M. Echenique, “Hole dynamics in noble metals,” Phys. Rev. Lett., 85, 3241, 2000.
[20] E.K. Chang, M. Rohlfing, and S.G. Louie, “Excitons and optical properties of alpha-quartz,” Phys. Rev. Lett., 85, 2613, 2000.
[21] M. Rohlfing and S.G. Louie, “Excitonic effects and the optical absorption spectrum of hydrogenated Si clusters,” Phys. Rev. Lett., 80, 3320, 1998.
[22] M. Rohlfing and S.G. Louie, “Electron–hole excitations in semiconductors and insulators,” Phys. Rev. Lett., 81, 2312, 1998.
[23] L.X. Benedict, E.L. Shirley, and R.B. Bohn, “Optical absorption of insulators and the electron–hole interaction: an ab initio calculation,” Phys. Rev. Lett., 80, 4514, 1998.
[24] S. Albrecht, L. Reining, R. Del Sole, and G. Onida, “Ab initio calculation of excitonic effects in the optical spectra of semiconductors,” Phys. Rev. Lett., 80, 4510, 1998.
[25] H.R. Philipp, “Optical transitions in crystalline and fused quartz,” Solid State Commun., 4, 73, 1966.
[26] D.E. Aspnes and A.A. Studna, “Dielectric functions and optical parameters of Si, Ge, GaP, GaAs, GaSb, InP, InAs, and InSb from 1.5 to 6.0 eV,” Phys. Rev. B, 27, 985, 1983.
[27] P. Lautenschlager, M. Garriga, S. Logothetidis, and M. Cardona, “Interband critical points of GaAs and their temperature dependence,” Phys. Rev. B, 35, 9174, 1987.
[28] M. Rohlfing and S.G. Louie, “Excitations and optical spectrum of the Si(111)-(2 × 1) surface,” Phys. Rev. Lett., 83, 856, 1999.
[29] M. Rohlfing, M. Palummo, G. Onida, and R. Del Sole, “Structural and optical properties of the Ge(111)-(2 × 1) surface,” Phys. Rev. Lett., 85, 5440, 2000.
[30] S. Iijima, “Helical microtubules of graphitic carbon,” Nature, 354, 56, 1991.
[31] Z.M. Li, Z.K. Tang, H.J. Liu, N. Wang, C.T. Chan, R. Saito, S. Okada, G.D. Li, J.S. Chen, N. Nagasawa, and S. Tsuda, “Polarized absorption spectra of single-walled 4 Å carbon nanotubes aligned in channels of an AlPO4-5 single crystal,” Phys. Rev. Lett., 87, 127401, 2001.
[32] M.J. O’Connell, S.M. Bachilo, C.B. Huffman, V.C. Moore, M.S. Strano, E.H. Haroz, K.L. Rialon, P.J. Boul, W.H. Noon, C. Kittrell, J. Ma, R.H. Hauge, R.B. Weisman, and R.E. Smalley, “Band gap fluorescence from individual single-walled carbon nanotubes,” Science, 297, 593, 2002.
[33] S.M. Bachilo, M.S. Strano, C. Kittrell, R.H. Hauge, R.E. Smalley, and R.B. Weisman, “Structure-assigned optical spectra of single-walled carbon nanotubes,” Science, 298, 2361, 2002.
[34] C.D. Spataru, S. Ismail-Beigi, L.X. Benedict, and S.G. Louie, “Excitonic effects and optical spectra of single-walled carbon nanotubes,” Phys. Rev. Lett., 92, 077402, 2004.
[35] C.D. Spataru, S. Ismail-Beigi, L.X. Benedict, and S.G. Louie, “Quasiparticle energies, excitonic effects and optical absorption spectra of small-diameter single-walled carbon nanotubes,” Appl. Phys. A, 78, 1129, 2004.
[36] E. Chang, G. Bussi, A. Ruini, and E. Molinari, “Excitons in carbon nanotubes: an ab initio symmetry-based approach,” Phys. Rev. Lett., 92, 196401, 2004.
[37] G. Onida, L. Reining, R.W. Godby, and W. Andreoni, “Ab initio calculations of the quasiparticle and absorption spectra of clusters: the sodium tetramer,” Phys. Rev. Lett., 75, 818, 1995.
[38] M.A.L. Marques, A. Castro, and A. Rubio, “Assessment of exchange-correlation functionals for the calculation of dynamical properties of small clusters in TDDFT,” J. Chem. Phys., 115, 3006, 2001. http://www.tddft.org/programs/octopus.
[39] C.R.C. Wang, S. Pollack, D. Cameron, and M.M. Kappes, “Optical absorption spectroscopy of sodium clusters as measured by collinear molecular-beam photodepletion,” J. Chem. Phys., 93, 3787, 1990.
[40] M.A.L. Marques, X. López, D. Varsano, A. Castro, and A. Rubio, “Time-dependent density-functional approach for biological photoreceptors: the case of the Green fluorescent protein,” Phys. Rev. Lett., 90, 158101, 2003.
[41] A. Marini, R. Del Sole, and A. Rubio, “Bound excitons in time-dependent density-functional theory: optical and energy-loss spectra,” Phys. Rev. Lett., 91, 256402, 2003.
[42] B. Holm and U. von Barth, “Fully self-consistent GW self-energy of the electron gas,” Phys. Rev. B, 57, 2108, 1998.
1.12 HYBRID QUANTUM MECHANICS/MOLECULAR MECHANICS METHODS AND THEIR APPLICATION

Marek Sierka^1,∗ and Joachim Sauer^2

^1 Institut für Physikalische Chemie, Lehrstuhl für Theoretische Chemie, Universität Karlsruhe, Kaiserstraße 12, D-76128 Karlsruhe, Germany
^2 Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
Hybrid quantum mechanics (QM)/molecular mechanics (MM) methods allow simulations of much larger systems than are accessible to QM methods alone. The size of many systems of topical interest in chemistry and biochemistry prevents efficient and accurate treatment by quantum mechanical ab initio methods. For reactions in the condensed phase and at surfaces, periodic boundary conditions (PBC) can be applied, reducing the size of the problem to a unit cell [1–3]. However, many interesting structural features such as defects or active sites require larger unit cells because they break the space and translation symmetry. A computationally appealing alternative is interatomic potential functions, ranging from molecular mechanics force fields to ion-pair potentials. They yield accurate equilibrium structures for the type of systems for which they are parameterized [4], but are usually not suitable for describing the active sites of catalysts with sufficient accuracy. Moreover, unless special modifications are made, they cannot be used to model reactions in which the chemical bonding changes. The cluster model approach is an alternative that makes calculations on active sites and defects feasible with ab initio methods [5]. Only a fragment of the structure is considered that contains the interesting part, and the surroundings are neglected or included approximately. There exist, however, classes of problems that require a computational treatment of the whole system. A prominent example is shape selectivity in zeolite catalysis.
∗ Present address: Institut für Chemie, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
Although zeolite catalysts with different framework structures have the same active sites in common, they may show very different catalytic performance. Hybrid quantum mechanics–molecular mechanics (QM/MM) methods combine the advantages of an accurate QM description of the important part of the system, e.g., the active site, with the computational efficiency of interatomic potential functions applied to its surroundings. This way the most important environmental effects can be included, for example mechanical constraints, electrostatic interactions, and polarization. The idea of hybrid QM/MM methods goes back to the 1970s [6]. Today, there are a large number of different implementations of the QM/MM hybrid approach and an increasing number of applications. One example is our own combined quantum mechanics–interatomic potential functions approach (QM–Pot), which has been applied to various problems of homogeneous and heterogeneous catalysis [7]. This more general name is chosen since the term “molecular mechanics” (MM) stresses the force-field type of potential functions most often used for organic molecules and biomolecules, while inorganic solids are better described by ion-pair potential functions. In this contribution the QM–Pot method and its applications are reviewed, with special focus on problems in zeolite catalysis. We demonstrate that hybrid methods are not only an alternative to full QM calculations, in particular when active sites are considered, but in some cases can even compensate for the deficiencies of approximate QM methods. More complete reviews can be found, for example, in Gao and Thompson [8], Sherwood [9], and in several articles of the “Encyclopedia of Computational Chemistry” [10].
1. Definition of the QM/MM Potential Energy Surface
This section describes the theoretical background of hybrid QM/MM methods in general and of the QM–Pot method in particular. The entire system (S) is partitioned into an inner or active part (I) and an outer part (O), as shown in Fig. 1. The interactions within the inner part are treated at the higher, usually QM, level. All interactions within the outer part are described by a computationally less expensive, lower-level method, for example a parameterized potential function, an MM force field, or a more approximate QM method. Thus, the energy of the whole system is expressed as

E(S)^high/low = E(O)^low + E(I)^high + E(I–O).    (1)
The E(I–O) term describes the mutual interaction between the I and O parts. It can be described by a potential function or MM method alone (mechanical embedding), or it can include some QM terms, for example the electrostatic interaction between I and O or mutual polarization terms (electronic embedding) [11].
Figure 1. (a) Chemical system partitioned into the active (inner) part I and an outer part O; (b) Link atoms L are used to saturate the inner part when its definition requires the breaking of covalent bonds, A–B.
If the E(I–O) term is given entirely by the low-level method, Eq. (1) can alternatively be expressed as

E(S)^high/low = E(S)^low − E(I)^low + E(I)^high.    (2)
This so-called subtraction scheme [12–14] has the advantage that standard methods, QM (high) and MM (low), are applied to well-defined systems, S and I. This idea is followed in our QM–Pot approach [12, 15]. Note that, in the case of Eq. (2), there is no direct influence of the O part on the wavefunction of the QM part. However, the electronic structure obtained from the QM/MM calculation differs from that of a QM calculation for the I part only, since the equilibrium structures obtained in the two optimizations are different. The advantage of Eq. (2) is that the electrostatic interactions between the I and O parts are treated at the same lower level, i.e., by the potential function used. Balanced and proven point-charge or higher-multipole models can be used, and even mutual polarization effects can be treated, provided that the potential functions have this functional form (e.g., [11]).
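As a purely illustrative sketch of how the subtraction scheme of Eq. (2) can be evaluated in practice, the fragment below combines the two levels; the calculator objects and their energy() interface are hypothetical stand-ins for any QM code and any potential-function code, not the actual QMPOT interface:

    def qmpot_energy(full_system, inner_part, qm, pot):
        # E(S)^high/low = E(S)^low - E(I)^low + E(I)^high, Eq. (2)
        return (pot.energy(full_system)
                - pot.energy(inner_part)
                + qm.energy(inner_part))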
2. Link Atoms
The division of a system is sometimes trivial, for example in solvated systems with the solvent as the O part and the solute as the I part [16]. Difficulties arise, however, if the I and O parts are connected by covalent bonds. Simply breaking the bonds would result in a highly charged or high-spin system, leading to a poor description of the interactions at the QM level. In such a case the definition of the I part requires a proper description of the boundary region. Several methods have been developed to handle this situation, for example localized orbitals (e.g., [17]), but the easiest and most commonly applied approach
involves so-called link atoms. Most hybrid QM/MM and embedding methods differ in the way the boundary region is treated and in the contributions to E(I–O) that are evaluated at the QM level. In the link atom approach the bonds between atoms of the inner region and atoms of the outer region (A–B, A ∈ I, B ∈ O) are replaced by bonds between the inner-part atoms and link atoms (A–L), as shown in Fig. 1(b). The I-part atoms together with the link atoms form a finite molecular cluster model (C). The type and number of link atoms should be chosen such that bond orders are conserved and no open valencies are left on the link atoms. In most cases hydrogen atoms seem to be the most reasonable choice for link atoms [5]. Introduction of link atoms requires modification of the QM–Pot energy expression, Eq. (2), which now takes the form

E(S,L)^high/low = E(S)^low − E(C)^low + E(C)^high,    (3)
where the E(S,L) notation stresses that the energy defined by Eq. (3) now also depends on the coordinates of the link atoms. The energy defined by Eq. (3) differs formally from that of Eq. (2) by a term Δ that involves differences of energy contributions at the high (QM) and low (Pot) levels,

Δ = E(L)^low − E(L)^high + E(I–L)^low − E(I–L)^high,    (4)
where E(L) and E(I–L) denote the contributions from the link atoms and from the interactions between the link atoms and the I-part atoms. To maintain the high accuracy of the QM/MM hybrid calculations, the influence of this term on computed relative energies has to be minimized. This can be achieved in two ways: (a) use of potential functions that mimic the quantum mechanical energy contributions connected with the link atoms sufficiently well, and (b) use of large enough QM clusters, so that the distances between the reaction center and the link atoms are sufficiently large to ensure that Δ remains constant during the reaction course. In practice, (a) can be achieved using potential functions parameterized to reproduce results of QM calculations on small model systems. For (b) it is difficult to judge a priori whether a given QM cluster size is sufficient to yield acceptable errors. Therefore, careful convergence studies of the calculated properties with increasing QM cluster size, or comparison with full QM calculations, are very important in the calibration of QM–Pot results. The link atoms introduce additional, artificial degrees of freedom to the system, and their proper treatment is very important in structure optimizations and molecular dynamics simulations. There are generally two ways of treating link atoms – unconstrained (e.g., [18]) and constrained (e.g., [12, 19]). In the unconstrained approach the link atoms are free to move and their positions are independently optimized. In the constrained approach the link atoms are kept fixed at some chosen position, usually on the bonds they terminate.
In the QM–Pot approach the terminating link atoms are kept in the position in which they serve their purpose best: they are constrained to stay on the bonds between the inner and the outer part of the system that they terminate [12, 15]. This way the explicit dependence of the QM–Pot energy on the link atom positions is removed and replaced by a parametric dependence in the form of constraints. This creates additional contributions to the forces and force constants on the atoms in the bonds linking the I and O parts. The advantage is that the derivatives of E^high/low, Eq. (3), fulfill the requirements that follow from its translational and rotational invariance. Morokuma’s ONIOM method uses a similar approach [19].
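A minimal sketch of such a constrained link-atom treatment is given below; the fixed scale factor g and the function names are illustrative assumptions, not the actual QMPOT implementation. Because r_L = r_A + g (r_B − r_A), the chain rule redistributes any force acting on L onto the real atoms A and B:

    import numpy as np

    def place_link_atom(r_A, r_B, g=0.709):
        # Link atom L constrained to the A-B bond it terminates:
        # r_L = r_A + g * (r_B - r_A); g = 0.709 is an illustrative value.
        return np.asarray(r_A) + g * (np.asarray(r_B) - np.asarray(r_A))

    def distribute_link_force(f_L, g=0.709):
        # Chain rule for the parametric dependence r_L(r_A, r_B): the force
        # f_L computed for the link atom is redistributed onto A and B.
        f_L = np.asarray(f_L)
        return (1.0 - g) * f_L, g * f_L  # contributions added to f_A, f_B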
3. The Reaction Force Field
The QM–Pot approach relies on the subtraction scheme, so the potential function must also be known for the I part, at least the part that describes the long-range interaction with the O part. The usual force fields and potential functions describe the potential energy surface only in the vicinity of a stable minimum, but are not valid in the regions corresponding to transition structures. A solution to this problem was proposed by Warshel [20]. His empirical valence bond (EVB) method creates a smooth connection between the force fields describing different states (resonance forms) of the system. We have adopted the EVB idea for the efficient location of transition structures in extended systems with the QM–Pot method [15]. In the simplest case of just two states, described by single-minimum interatomic potential functions V1 and V2, a simple 2 × 2 eigenvalue problem is obtained with a nondiagonal element V12 that couples these two states (see Fig. 2). The lowest eigenvalue describes an adiabatic state that creates a smooth transition between the V1 and V2 states. The V12 term is defined in terms of a small set of internal coordinates involving only the atoms with the largest displacement along the reaction path, and the necessary parameters are obtained from QM calculations on small model systems. Since the I part is described by the QM method, this EVB blending of the two potentials affects only its interaction with the O part, the E(I–O) term in Eq. (1), and already a crude estimate of the V12 parameters yields sufficiently accurate results [21].
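For the two-state case the adiabatic lower surface is simply the lowest eigenvalue of the symmetric 2 × 2 matrix with diagonal V1, V2 and off-diagonal V12; a minimal sketch, with the three values supplied as numbers at a given geometry:

    import math

    def evb_ground_state(V1, V2, V12):
        # Lowest eigenvalue of [[V1, V12], [V12, V2]]: the smooth adiabatic
        # surface connecting the reactant and product force fields.
        return 0.5 * (V1 + V2) - math.sqrt(0.25 * (V1 - V2) ** 2 + V12 ** 2)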
3.1. Long-Range Interactions
Reaction energies, energy barriers, and other relative energies calculated with the QM–Pot method consist of two contributions [22],

ΔE^QM–Pot = ΔE^QM//QM–Pot + ΔE^Pot//QM–Pot.    (5)
Figure 2. Interatomic potential functions V1 and V2 for reactants (R) and products (P), respectively, coupled by the EVB method. The result is a smooth potential function E valid also in the region of the transition state (TS).
The notation “//QM–Pot” means that the energies are evaluated at the structures obtained by QM–Pot calculations. The first one, the direct QM contribution,

ΔE^QM = E(C2)^QM//QM–Pot − E(C1)^QM//QM–Pot,    (6)

is different from unconstrained cluster-model results because the structures of the embedded clusters differ due to the constraints imposed by the extended system (e.g., the solid lattice). The superscripts “1” and “2” correspond to the two states of the system, for example reactants and products, or reactants and transition state. The second term includes all contributions due to the interatomic potential functions,

ΔE^LR = E(S2)^Pot//QM–Pot − E(S1)^Pot//QM–Pot − E(C2)^Pot//QM–Pot + E(C1)^Pot//QM–Pot.    (7)
If the QM cluster is large enough to account for all structure distortions upon the reaction, the latter contribution can be considered as a correction accounting for all long-range interactions not included in the QM part.
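In code form, this decomposition might be sketched as follows (the same hypothetical energy() interface as in Section 1; the labels 1 and 2 denote the two states):

    def qmpot_relative_energy(S1, C1, S2, C2, qm, pot):
        dE_qm = qm.energy(C2) - qm.energy(C1)              # Eq. (6)
        dE_lr = (pot.energy(S2) - pot.energy(S1)
                 - pot.energy(C2) + pot.energy(C1))        # Eq. (7)
        return dE_qm + dE_lr                               # Eq. (5)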
3.2. Implementation
The QM–Pot method, together with the EVB coupling of parameterized potential energy surfaces and its combination with the QM–Pot approach, has been implemented in the QMPOT program [15]. It is designed as an optimizer for
minima and saddle points. External programs provide the energies, forces, and force constants for the QM and Pot parts, and the communication is achieved through interface functions. The main features of QMPOT are (a) QM–Pot and EVB structure optimizations to minima and saddle points, and (b) QM–Pot and EVB energy second derivatives and harmonic vibrational frequencies. Morokuma’s IMOMM, IMOMO, and ONIOM are other implementations of hybrid QM/MM and QM/QM methods and are available in the Gaussian program [19]. A third implementation with various options is the ChemShell software [9].
4. Applications to Structure and Reactivity of Zeolite Catalysts
Zeolites are nanoporous crystalline solids built of three-dimensional networks of corner-sharing TO4 tetrahedra, in which T is an electropositive element, typically Si, Al, or P. Depending on their composition, (SiO2)x(AlO2−)y(PO2+)z, the frameworks are negatively charged, and charge-compensating metal cations or protons are present at extra-framework positions. The different structure types, with examples of zeolites found as minerals or synthesized in the laboratory, are collected in the “Atlas of Zeolite Framework Types” [23]. Figure 3 shows four examples of zeolite lattices: chabazite (CHA), faujasite (FAU), mordenite (MOR), and ZSM-5 (MFI). Because of their unique nanoporous structure, zeolites can act as catalysts for chemical reactions that take place within the internal cavities. The most important sources of catalytic activity are:

(i) Protons as charge-compensating cations (solid Brønsted acids). This is exploited in many organic reactions, including crude oil cracking, isomerization, and fuel synthesis.
(ii) Transition metal cations occupying extra-framework positions. Examples are Cu-, Co-, or Ag-exchanged zeolites used in NOx decomposition.
(iii) Isomorphous substitution of Si4+ by other tetrahedrally coordinated cations; for example, substitution of Ti into high-silica zeolite frameworks creates highly selective oxidation catalysts.
5. Acidic Zeolite Catalysts
The acidic strength of zeolites containing Brønsted acidic sites can be characterized by different model reactions.

(1) Deprotonation,

ZO–H → ZO− + H+.    (8)
Figure 3. Different zeolite framework structures: (a) CHA, (b) FAU, (c) MFI, and (d) MOR.
This hypothetical reaction defines gas-phase acidities. The relative enthalpies of deprotonation of gas-phase molecules are accessible from proton-transfer equilibrium data. For acidic sites at surfaces only inferences can be made from spectroscopic data, and reliable values can only be provided by theoretical calculations.

(2) Chemisorption of basic molecules, e.g., NH3,

ZO–H + NH3 → ZO–NH4+.    (9)
This process can be studied by calorimetry, and the reverse process is observed in temperature-programmed desorption experiments.

(3) Proton motion between two oxygen atoms,

Z(O2)O1–H → Z(O1)O2–H.    (10)
The reaction energy for this proton motion is, by definition, given by the relative deprotonation energies of the two sites involved (written out explicitly after this list). The barrier and the corresponding jump rate characterize the proton mobility of the active site.

(4) Protonation of organic molecules involved in the catalytic reactions in zeolites, for example hydrocarbon conversions.
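Written out explicitly, with E_DP(Oi) denoting the energy of the deprotonation reaction (8) for the proton residing on oxygen Oi (a notation introduced here only for illustration), the reaction energy of jump (10) is

ΔE(O1 → O2) = E_DP(O1) − E_DP(O2),

since the deprotonated state ZO− is common to both sites.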
The QM–Pot method has been applied to investigate the influence of the structure and chemical composition of zeolites on the acidity of their Brønsted sites [22, 24–26]. For deprotonation and ammonia adsorption the QM–Pot reaction energies deviate from the full periodic QM results by only 4–9 kJ/mol, which demonstrates the power of the combined approach [27]. A more detailed review of these results is given by Sauer and Sierka [28]. Here, we give a short overview of QM–Pot calculations on dynamic properties of Brønsted sites, i.e., reactions (3) and (4) above.
6. Proton Mobility in Zeolites
The simplest dynamic process that characterizes Brønsted acidic sites is the proton jump between oxygen atoms of the zeolite framework. In dehydrated acidic zeolites two types of proton motion can be distinguished (Fig. 4): (a) local, on-site jumps between the four oxygen atoms of the AlO4 tetrahedron, and (b) translational, inter-site motions between two different aluminum sites.

Figure 4. Two types of proton jumps in dehydrated zeolites: on-site (a) and inter-site (b).

Clearly, reliable theoretical predictions of how the proton jump barriers and rates depend on zeolite structure and chemical composition require a modeling method that takes the whole periodic lattice into account. Our QM–Pot approach is such a method. The prediction of on-site jump barriers and rates requires localization of all four local minima and six transition structures for a given crystallographic location of the Al atom. For the small-unit-cell zeolite chabazite the maximum deviation of the QM–Pot results from the full periodic QM treatment is 4 kJ/mol for reaction energies (stabilities of different proton positions around the AlO4 tetrahedron) and 6 kJ/mol for proton jump barriers, as shown in Table 1 [15].

Table 1. Comparison of relative stabilities ΔE (kJ/mol) of proton positions and proton jump barriers ΔE‡ between two oxygen atoms, calculated with the QM–Pot method and with full periodic QM for zeolite chabazite. Data taken from Sierka and Sauer [15]

                      ΔE (a)                ΔE‡ (a)
Jump path (b)     QM–Pot   Full QM      QM–Pot   Full QM
O3–O4             1.7      3.3          65       69
O1–O2             11.2     11.5         66       72
O3–O2             3.5      3.8          87       92
O1–O3             3.6      7.2          90       90
O2–O4             0.2      3.2          99       97
O1–O4             12.4     13.4         102      105

(a) At 0 K, zero-point energy correction not included.
(b) The numbers denote different crystallographic positions of the oxygen atoms within the zeolite lattice.

For zeolites with large unit cells, such as FAU and MFI, the convergence of the QM–Pot results with the size of the QM cluster was investigated [7]. The long-range correction to the barrier decreases with the QM cluster size, but shows large variations with the specific sites considered. Even for the largest clusters, comprising up to 25 TO4 units (T = Si, Al), the long-range corrections vary over 25 kJ/mol, but due to the combined QM–Pot method the total barrier heights are stable within a few kJ/mol. Hence, the use of cluster models of the same size without embedding by the QM–Pot scheme would produce large errors in the relative barrier heights. For the investigated zeolites (CHA, FAU, and MFI) the final on-site proton jump barriers, including zero-point vibrational energies, vary between 52 and 106 kJ/mol, depending on zeolite type, crystallographic site, and path of the proton motion. The predicted jump rates also show large variations, from 10^−6 to 10^5 s^−1. Estimates show that tunneling is not an important factor above room temperature. For the inter-site proton motion in acidic MFI the activation barriers are found to depend on the spatial separation of the two neighboring Al sites. The calculated proton jump rates vary over a broad range of 10^−10–10^10 s^−1, depending on the proton jump path and the Al–Al distance [29].
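The quoted rates are consistent with a simple transition-state-theory estimate from the computed barriers. The sketch below assumes an Arrhenius-type expression with an attempt frequency of 10^13 s^−1, a typical but purely illustrative choice, not a value taken from Refs. [15, 29]:

    import math

    R = 8.314  # gas constant, J mol^-1 K^-1

    def jump_rate(Ea_kJ_per_mol, T=298.0, nu=1.0e13):
        # Arrhenius/TST-style proton jump rate in s^-1 for a barrier Ea.
        return nu * math.exp(-Ea_kJ_per_mol * 1000.0 / (R * T))

With this prefactor, barriers of 52 and 106 kJ/mol give roughly 10^4 and 10^−6 s^−1 at room temperature, spanning the range of on-site rates quoted above.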
6.1. Hydrocarbon Conversion in Zeolites
Zeolites are very important catalysts in petroleum refining and petrochemical conversion processes. Experimental techniques cannot easily provide information about elementary catalytic steps, because adsorption, desorption, and diffusion processes interplay with multiple simultaneous reactions. Although zeolites of different framework structures have the same Brønsted sites in common, they show orders-of-magnitude differences in catalytic performance. Thus, reliable modeling techniques must incorporate a realistic model of the active-site environment. Large hydrocarbon species in zeolite catalysts with unit cells containing several hundred atoms are still a challenge for quantum chemistry methods. Recently, several studies using the periodic density functional theory (DFT) method with plane-wave basis sets for small and medium unit cell zeolites have been reported (see, e.g., [30–33]). However, such calculations are far from routine and are still a challenge for
zeolites with unit cells of the size of ZSM-5, and they suffer from an additional problem. While current density functionals are reasonably well suited for describing bond-breaking/bond-making reaction steps, they fail to yield reliable energies for the van der Waals (vdW) interactions that dominate the adsorption–desorption steps [34]. The QM–Pot approach, in contrast, is capable of treating the full active-site environment at reasonable computational expense. It also has the advantage of producing more reliable adsorption energies than a full DFT treatment: when the QM part is chosen small enough, the most important vdW interactions between the hydrocarbon and the zeolite are described by the force field (which is superior to DFT for this purpose). A cluster model containing three T atoms, with the formula (OH)2Al(OSi(OH)3)2, proved the best choice. This model is also large enough to describe bond breaking and bond making properly. We have applied our method in studies of two important issues in hydrocarbon chemistry in zeolites – the role of carbocations in hydrocarbon conversions [21] and the phenomenon known as transition-state shape selectivity [35].
6.2. Adsorption of Unsaturated Hydrocarbons and the Role of Carbocations
Adsorption of unsaturated hydrocarbons on zeolitic Brønsted sites results in the formation of an adsorption complex. In contrast to full periodic DFT calculations, our QM–Pot approach yields reasonable adsorption energies. For m-xylene in ZSM-5 the QM–Pot adsorption energy is approximately 61 kJ/mol, with the DFT part contributing only 12 kJ/mol. Using plane-wave periodic DFT calculations we obtain a value of 28 kJ/mol. Experimental values are between 60 and 85 kJ/mol on NaY and KY zeolites [35]. We clearly see the failure of DFT to describe the dispersion interactions, which results in strongly underestimated adsorption energies. The detailed mechanisms by which solid acids catalyze hydrocarbon conversion reactions are still not completely known. The work of Olah and others demonstrated that liquid superacids protonate hydrocarbons and stabilize carbenium ions [36, 37], but it is still controversial whether zeolites do the same. We have used our hybrid QM/MM method to study the bimolecular mechanism of the disproportionation of m-xylene into toluene and trimethylbenzene (TMB) [21]. The most important finding from our calculations is that benzenium-type carbenium ions are local minima on the potential energy surface (see Fig. 5) and, hence, possible intermediates in the reaction mechanism. Nicholas and Haw [38] have produced NMR evidence for some carbenium ions and concluded that only species with proton affinities (PA) greater than about 874 kJ/mol live long enough in zeolites to be observed on the NMR time scale.
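In the notation of Eq. (5), this m-xylene result decomposes as

ΔE^QM–Pot = ΔE^QM//QM–Pot + ΔE^Pot//QM–Pot ≈ 12 + 49 = 61 kJ/mol,

i.e., roughly 80% of the adsorption energy is carried by the long-range, mostly dispersion, contribution described by the potential function (the 49 kJ/mol figure follows by difference from the two numbers quoted above).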
Figure 5. Benzenium carbenium ion based on 3-methylphenyl-2,4-dimethylphenyl-methane (upper part of the figure) and its positions within the FAU zeolitic cage: the electrostatically stabilized position A in the vicinity of the Al atom (a), and the van der Waals stabilized position B far from the active site (b).
The PA of the benzenium ion shown in Fig. 5 is only 821 kJ/mol, and it remains to be seen whether this is enough to observe it experimentally. Both the ionic attraction between the negatively charged active Al site on the zeolite framework and the positively charged benzenium ion, and the vdW (dispersion) interaction with the zeolite wall, contribute to the stabilization of the benzenium-type intermediate in the zeolite cavity. While the electrostatic attraction dominates for position A close to the negatively charged active site, in structure B the benzenium ion fits tightly to the zeolite wall far from the active site, where it maximizes the vdW interaction. The unique feature of zeolite catalysts is that they combine the activation of molecules with the confinement of the nano-sized pores and cavities. This has led to the concept of transition-state shape selectivity. It states that some isomers may not be observed in the product stream, not because they are too bulky to leave the pores, but because the transition state through which they form at the active site may be too bulky for the pore of a given zeolite. An experimental proof is difficult, and we have therefore used our hybrid method to study a sterically demanding reaction, the disproportionation of m-xylene into TMB and toluene [35]. The results of the QM–Pot calculations show that one product isomer (1,3,5-TMB) is disfavored, but the relative selectivity toward the other two isomers varies with pore geometry, mechanistic pathway, and inclusion of entropic effects. The calculated barriers, including zero-point energy corrections, are in general agreement with experimental
data. For both pathways they fall into the range of 112–135 kJ/mol, compared with experimentally derived apparent reaction barriers of 80–110 kJ/mol. Variation of the environment shape at the critical transition states is shown to affect the course of the reaction in the three zeolites investigated (FAU, MFI, and MOR); barrier-height shifts on the order of 10–20 kJ/mol are achievable. However, the observed selectivities do not agree with the calculated transition-state characteristics and, hence, are most likely due to product shape selectivity.
7. Transition Metal Containing Zeolites
Systems containing transition metal ions (TMIs) show catalytic activity in settings ranging from homogeneous and heterogeneous catalysts to biomolecules. In particular, TMI-exchanged high-silica zeolites such as MFI and ferrierite (FER) show high catalytic activity for the direct conversion of NO into N2 and O2 (“deNOx activity”) and for the selective catalytic reduction of NO by hydrocarbons in the presence of excess oxygen. Among different materials, Cu-exchanged MFI shows an unusually high catalytic activity [39]. Due to the rather high Si/Al ratio and low TMI loading, detailed information about the catalytic processes is not easily accessible experimentally. Therefore, reliable theoretical studies are of great importance for understanding the distribution, local structure, and catalytic activity of TMI sites. Empirical interatomic potentials can be applied to periodic structures (e.g., [40]) and are capable of distinguishing between different zeolite frameworks and different sites within a given framework, but the reliability of the potential functions is an open question. DFT provides reliable results for TMI interactions [41], but problems arise from the use of cluster models. Cluster models are not capable of describing differences between different zeolite frameworks; the results of cluster model calculations also depend on the shape and size of the cluster as well as on the geometric constraints imposed on it. An example is the correct prediction of the coordination number (CN) of the Cu+ ions with respect to the zeolite framework. It appears that linear cluster models are biased towards twofold coordination, whereas cyclic ones are biased toward structures with higher coordination numbers [42]. For linear chains of TO4 tetrahedra (T = Si, Al) with one to five T atoms, twofold coordination of Cu+ cations to two oxygen atoms of AlO4 tetrahedra was consistently found by several authors (e.g., [43, 44]), while cluster models with a ring of TO4 tetrahedra containing four to six T atoms yielded structures with three- or four-fold coordinated Cu+ ions (e.g., [45]). Reliable results can only be obtained when the whole periodic zeolite structure is included, different possible locations of the TMIs are considered, and the interactions with the zeolite framework are described accurately enough. This can be achieved with the QM–Pot method, which since its first application to the chemistry of Cu+ cations in MFI [46] has proved a powerful tool for studying TMIs in zeolites, particularly in MFI and FER [28, 42].
Determination of the preferred siting and coordination of Cu+ ions in zeolites such as MFI and FER requires the investigation of many different arrangements of aluminum atoms and copper ions, since such information is not available from experiment. A two-step computational strategy proved very useful [47, 48] (a sketch is given at the end of this section). First, lattice energy minimizations using an accurate potential function, parameterized on DFT data alone, were performed for a large number of initial Al and Cu+ distributions. This allows a fast determination of the most favored sites. Next, for selected structures, QM–Pot energy minimizations were performed using QM clusters large enough to capture the most important interactions between the Cu+ ion and the zeolite framework. In both zeolites, sites were found with two-, three-, or four-fold coordinated Cu+ ions and with average coordination numbers in close agreement with experimental data. The sites were classified according to the number of O atoms coordinating the Cu+ ion and to its position in the framework, as shown in Fig. 4 of [42]. Type II site copper ions are coordinated to two framework O atoms, either at the channel intersection (I2 site in MFI and FER) or on the walls of the main or perpendicular channels (M2 and P2 sites, respectively, in FER). Higher-coordinated sites, summarized as type I sites, have one or two additional coordinations to other oxygen atoms within a five- or six-membered (TO)n ring. The existence of the two types of sites also emerged from experimental photoluminescence spectra: while the observed 3d10 (1S0) – 3d9 4s1 (1D2) excitation spectra show two well-separated bands, the band splitting almost disappears in the emission spectra. The QM–Pot calculations not only confirmed this observation but also provided an explanation [49]. In the ground state, the different types of Cu+ coordination cause large variations in the excitation energies. In contrast, in the excited state the coordination differences between type I and type II sites disappear: the type I sites give up their additional coordination and retain only the twofold coordination to the AlO4 tetrahedron, whereas the type II sites remain unchanged. The reason is that on excitation the 4s orbital, which is much more extended than the 3d orbital, becomes occupied, and the Cu+ ion moves away from the zeolite wall. Thus, because the excited-state structures are alike for all Cu+ sites considered, the emission energies are also very similar. The excitation process is accompanied by relatively large relaxation effects of the surrounding zeolite lattice, showing that such phenomena cannot be adequately described by free-space or frozen-lattice cluster calculations. The most important question is whether the two different types of Cu+ sites in MFI and FER exhibit different catalytic properties. QM–Pot calculations have been performed to investigate the influence of the Cu+ ion location on the adsorption of small molecules such as CO [50], NO, NO2, N2, and H2O [51].
Table 2. Calculated QM–Pot interaction energies (kJ/mol) (a) of NO, CO, N2, and NO2 molecules with Cu+ ions in zeolites MFI and FER. Data taken from Nachtigall et al. [51]

               Type I site           Type II site
Molecule     MFI      FER          MFI      FER
NO           117      84           146      138
CO           151      117          176      167
N2           84       54           109      100
NO2          146      109          180      159

(a) At 0 K, zero-point energy correction not included.
Upon interaction with one molecule, the coordination of the TMI to the zeolite framework is unchanged for type II sites. For type I sites, the interaction of the TMI with any of the studied molecules leads to the loss of the coordination of the TMI to non-AlO4-tetrahedron framework oxygen atoms, and the TMI moves farther from the channel wall. For this reason, the interaction energies with type II sites are stronger than with type I sites, by 6–8 kcal/mol (about 25–34 kJ/mol) for MFI and by 11–13 kcal/mol (about 46–54 kJ/mol) for FER, as shown in Table 2. Significant differences between type I and type II sites were also found for the interaction with two or three molecules. For example, the Cu+ ion in a type II site can bind two or even three CO molecules (in agreement with experimental observation), while the Cu+ ion in a type I site can bind at most two CO molecules. The two-step QM–Pot approach has also been successful in determining the siting and local coordination of Cu(I) pairs [52] and of Ag+ ions in MFI [53], as well as the coordination of Cu+ and Cu2+ ions in MFI in the vicinity of two framework Al atoms [54].
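A minimal sketch of the two-step siting strategy used throughout this section, with purely hypothetical function names standing in for the potential-function and QM–Pot optimizers:

    # Step 1: fast screening of many Al/Cu+ distributions with the
    # parameterized potential function; Step 2: QM-Pot refinement of
    # the most favorable candidates. pot_minimize and qmpot_minimize
    # are assumed to return the minimized energy of a configuration.

    def screen_cu_sites(configurations, pot_minimize, qmpot_minimize,
                        n_keep=10):
        ranked = sorted(configurations, key=pot_minimize)
        return [(c, qmpot_minimize(c)) for c in ranked[:n_keep]]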
8. Ti-Silicalite Catalysts
Isomorphous substitution of Si4+ by Ti4+ in synthetic zeolites gives rise to an interesting family of very active and selective catalysts. Ti-containing silicalite-1 (TS-1) shows an unprecedented catalytic activity for the oxidation of organic substances [55]. Experimental methods have difficulties in localizing the active Ti sites and characterizing their catalytic properties because of their low concentration (Ti/Si < 0.025) and probable structural disorder. The QM–Pot method was used to examine the possible locations of Ti within the framework and the interaction of the Ti sites with one or two water molecules [56]. Full periodic QM calculations on the small-unit-cell zeolite Ti-chabazite showed that the QM–Pot results converge to the true periodic QM limit when large enough QM clusters are used. In the dehydrated state the stability differences between structures with Ti in different crystallographic positions are typically between 0 and 10 kJ/mol. A similar range of energies was obtained in calculations using cluster models [57] and in periodic Hartree–Fock calculations [58]. A recent QM/MM study using partially constrained cluster models (ONIOM: DFT in combination with the UFF force field) yielded unreliable stability
differences of up to 235 kJ/mol [59]. We suspect that the UFF force field used by the authors does not reliably describe zeolitic systems. The binding of H2O and NH3 to Ti sites was recently examined using the ONIOM method (DFT with a larger basis set for a small cluster, combined with Hartree–Fock with a small basis set for a large cluster) [60]. The QM–Pot method was also used to study the spectroscopic properties of Ti-substituted zeolites [61, 62]. The QM–Pot method not only correctly reproduced the observed IR and 29Si NMR spectra of titanium silicalite-1 but also provided insight into the role of the Ti substitutions. Such a substitution causes a shift of the 29Si NMR signal of neighboring Si nuclei of only ∼1 ppm to lower field, while the dependence of the chemical shift on the average T–O–T angle remains unaltered. Hydration of the framework Ti site has been found to strongly influence the 29Si NMR chemical shift via structural distortions of the lattice and by hydrolysis of Si–O–Ti bridges. QM–Pot calculations confirmed that the IR mode at 960 cm−1 is characteristic of the Ti substitution and due to asymmetric TiO4 vibrations [62].
Acknowledgments

M. Sierka acknowledges support from the "Center for Functional Nanostructures", which is funded by the "Deutsche Forschungsgemeinschaft", the State of Baden-Württemberg, and the Universität Karlsruhe. J. Sauer has been supported by the "Deutsche Forschungsgemeinschaft" (SPP 1155) and the "Fonds der Chemischen Industrie."
References

[1] C. Pisani (ed.), Quantum-Mechanical Ab-initio Calculation of the Properties of Crystalline Materials, Lecture Notes in Chemistry, vol. 67, Springer-Verlag, Berlin, 1996.
[2] M. Parrinello, Sol. Stat. Commun., 102, 107–120, 1997.
[3] D. Marx and J. Hutter, In: J. Grotendorst (ed.), Modern Methods and Algorithms of Quantum Chemistry, NIC Series, vol. 3, NIC Directors, FZ Jülich, Jülich, pp. 301–449, 2000.
[4] J.R. Hill, C.M. Freeman, and L. Subramanian, "Use of force fields in materials modeling," In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, vol. 16, VCH, New York, pp. 141–216, 2000.
[5] J. Sauer, Chem. Rev., 89, 199–255, 1989.
[6] A. Warshel and M. Levitt, J. Mol. Biol., 103, 227–249, 1976.
[7] M. Sierka and J. Sauer, J. Phys. Chem. B, 105, 1603–1613, 2001.
[8] J. Gao and M.A. Thompson (eds.), Combined Quantum Mechanical and Molecular Mechanical Methods, ACS Symposium Series, vol. 712, American Chemical Society, Washington, 1998.
[9] P. Sherwood, In: J. Grotendorst (ed.), Modern Methods and Algorithms of Quantum Chemistry, NIC Series, vol. 3, NIC Directors, FZ Jülich, Jülich, pp. 257–277, 2000.
[10] P. von Ragué Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H.F. Schaefer, III, and P.R. Schreiner (eds.), Encyclopedia of Computational Chemistry, Wiley, Chichester, 1998.
[11] D. Bakowies and W. Thiel, J. Phys. Chem., 100, 10580–10594, 1996.
[12] U. Eichler, C.M. Kölmel, and J. Sauer, J. Comput. Chem., 18, 463–477, 1997.
[13] S. Humbel, S. Sieber, and K. Morokuma, J. Chem. Phys., 105, 1959–1967, 1996.
[14] A.L. Shluger and J.D. Gale, Phys. Rev. B, 54, 962–969, 1996.
[15] M. Sierka and J. Sauer, J. Chem. Phys., 112, 6983–6996, 2000.
[16] J. Gao, "Methods and applications of combined quantum mechanical and molecular mechanical potentials," In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, vol. 7, VCH, New York, pp. 119–185, 1995.
[17] M.F. Ruiz-López and J.L. Rivail, "Combined quantum mechanics and molecular mechanics approaches to chemical and biochemical reactivity," In: P. von Ragué Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H.F. Schaefer, III, and P.R. Schreiner (eds.), Encyclopedia of Computational Chemistry, vol. 1, Wiley, Chichester, pp. 437–448, 1998.
[18] J.R. Shoemaker, L.W. Burggraf, and M.S. Gordon, J. Phys. Chem. A, 103, 3245–3251, 1999.
[19] S. Dapprich, I. Komáromi, K.S. Byun, K. Morokuma, and M.J. Frisch, J. Mol. Struct. (Theochem), 461–462, 1–21, 1999.
[20] A. Warshel, Computer Modeling of Chemical Reactions in Enzymes and in Solutions, Wiley, New York, 1991.
[21] L.A. Clark, M. Sierka, and J. Sauer, J. Am. Chem. Soc., 125, 2136–2141, 2003.
[22] U. Eichler, M. Brändle, and J. Sauer, J. Phys. Chem. B, 101, 10035–10050, 1997.
[23] Ch. Baerlocher, W.M. Meier, and D.H. Olson, Atlas of Zeolite Framework Types, Elsevier, Amsterdam, 2001.
[24] M. Sierka and J. Sauer, Faraday Discuss., 106, 41–62, 1997.
[25] M. Sierka, U. Eichler, J. Datka, and J. Sauer, J. Phys. Chem. B, 102, 6397–6404, 1998.
[26] M. Brändle and J. Sauer, J. Am. Chem. Soc., 120, 1556–1570, 1998.
[27] M. Brändle, J. Sauer, R. Dovesi, and N.M. Harrison, J. Chem. Phys., 109, 10379–10389, 1998.
[28] J. Sauer and M. Sierka, J. Comput. Chem., 21, 1470–1493, 2000.
[29] M.E. Franke, M. Sierka, U. Simon, and J. Sauer, Phys. Chem. Chem. Phys., 4, 5207–5216, 2002.
[30] T. Demuth, X. Rozanska, L. Benco, J. Hafner, R.A. van Santen, and H. Toulhoat, J. Catal., 214, 68–77, 2003.
[31] X. Rozanska, R.A. van Santen, T. Demuth, F. Hutschka, and J. Hafner, J. Phys. Chem. B, 107, 1309–1315, 2003.
[32] X. Rozanska, R.A. van Santen, F. Hutschka, and J. Hafner, J. Am. Chem. Soc., 123, 7655–7667, 2001.
[33] A.M. Vos, X. Rozanska, R.A. Schoonheydt, R.A. van Santen, F. Hutschka, and J. Hafner, J. Am. Chem. Soc., 123, 2799–2809, 2001.
[34] T.A. Wesolowski, O. Parisel, Y. Ellinger, and J. Weber, J. Phys. Chem. A, 101, 7818–7825, 1997.
[35] L.A. Clark, M. Sierka, and J. Sauer, J. Am. Chem. Soc., 126, 936–947, 2004.
[36] J.F. Haw, Phys. Chem. Chem. Phys., 4, 5431–5441, 2002.
[37] G.A. Olah and A. Molnar, Hydrocarbon Chemistry, Wiley, New York, 1995.
[38] J.B. Nicholas and J.F. Haw, J. Am. Chem. Soc., 120, 11804–11805, 1998.
[39] M. Iwamoto, H. Furukawa, Y. Mine, F. Uemura, S. Mikuriya, and S. Kagawa, J. Chem. Soc. Chem. Commun., 1272–1273, 1986.
[40] D.C. Sayle, C.R.A. Catlow, J.D. Gale, M.A. Perrin, and P. Nortier, J. Mater. Chem., 7, 1635–1639, 1997.
[41] K. Koszinowski, D. Schröder, H. Schwarz, M.C. Holthausen, J. Sauer, H. Koizumi, and P.B. Armentrout, Inorg. Chem., 41, 5882–5890, 2002.
[42] J. Sauer, D. Nachtigallová, and P. Nachtigall, In: G. Centi, B. Wichterlová, and A.T. Bell (eds.), Catalysis by Unique Metal Ion Structures in Solid Matrices. From Science to Application, NATO Science Series, Sub-Series II, vol. 13, Kluwer Academic Publishers, Dordrecht, pp. 221–234, 2001.
[43] B.L. Trout, A.K. Chakraborty, and A.T. Bell, J. Phys. Chem., 100, 4173–4179, 1996.
[44] K.C. Haas and W.F. Schneider, Phys. Chem. Chem. Phys., 1, 639–648, 1999.
[45] E. Broclawik, J. Datka, B. Gill, and P. Kozyra, Phys. Chem. Chem. Phys., 2, 401–405, 2000.
[46] L. Rodriguez-Santiago, M. Sierka, V. Branchadell, M. Sodupe, and J. Sauer, J. Am. Chem. Soc., 120, 1545–1551, 1998.
[47] D. Nachtigallová, P. Nachtigall, M. Sierka, and J. Sauer, Phys. Chem. Chem. Phys., 1, 2019–2026, 1999.
[48] P. Nachtigall, M. Davidová, and D. Nachtigallová, J. Phys. Chem. B, 105, 3510–3517, 2001.
[49] P. Nachtigall, D. Nachtigallová, and J. Sauer, J. Phys. Chem. B, 104, 1738–1745, 2000.
[50] M. Davidová, D. Nachtigallová, R. Bulánek, and P. Nachtigall, J. Phys. Chem. B, 107, 2327–2332, 2003.
[51] P. Nachtigall, M. Davidová, M. Silhan, and D. Nachtigallová, In: R. Aiello, G. Giordano, and F. Testa (eds.), Studies in Surface Science and Catalysis, vol. 142, Elsevier, Amsterdam, pp. 101–108, 2002.
[52] P. Spuhler, M.C. Holthausen, D. Nachtigallová, P. Nachtigall, and J. Sauer, Chem. Eur. J., 8, 2099–2115, 2002.
[53] M. Silhan, D. Nachtigallová, and P. Nachtigall, Phys. Chem. Chem. Phys., 3, 4791–4795, 2001.
[54] D. Nachtigallová, P. Nachtigall, and J. Sauer, Phys. Chem. Chem. Phys., 3, 1552–1559, 2001.
[55] B. Notari, Adv. Catal., 41, 253–334, 1996, and references cited therein.
[56] G. Ricchiardi, A. de Man, and J. Sauer, Phys. Chem. Chem. Phys., 2, 2195–2204, 2000.
[57] C.A. Hijar, R.M. Jacubinas, J. Eckert, N.J. Henson, P.J. Hay, and K.C. Ott, J. Phys. Chem. B, 104, 12157–12164, 2000.
[58] C.M. Zicovich-Wilson and R. Dovesi, J. Phys. Chem. B, 102, 1411–1417, 1998.
[59] T. Atoguchi and S. Yao, J. Mol. Catal. A: Chem., 191, 281–288, 2003.
[60] A. Damin, S. Bordiga, A. Zecchina, and C. Lamberti, J. Chem. Phys., 117, 226–237, 2002.
[61] G. Ricchiardi and J. Sauer, Z. Phys. Chem. (Munich), 209, 21–32, 1999.
[62] G. Ricchiardi, A. Damin, S. Bordiga, C. Lamberti, G. Spano, F. Rivetti, and A. Zecchina, J. Am. Chem. Soc., 123, 11409–11419, 2001.
1.13 AB INITIO MOLECULAR DYNAMICS SIMULATIONS OF BIOLOGICALLY RELEVANT SYSTEMS Alessandra Magistrato and Paolo Carloni International School for Advanced Studies (SISSA/ISAS) and INFM Democritos Center, Trieste, Italy
1. Introduction
Ab initio (Car–Parrinello) molecular dynamics (AIMD) simulations [1] are increasingly used to investigate the structural, dynamical, energetic, and electronic properties of biomolecules. In contrast to classical MD simulations, in this approach the underlying potential energy surface is calculated directly from first principles. This leads to a parameter-free molecular dynamics, in which interatomic forces are not empirically derived but are evaluated from electronic structure calculations as the simulation proceeds. In most of its implementations, Car–Parrinello AIMD relies on density functional theory (DFT) [2] as the electronic structure method. This is due to the relatively low computational cost of DFT compared to post-Hartree–Fock methods and to its wide range of applicability. The application of AIMD to the study of biochemically relevant systems is, however, severely limited by the invariably large size of the systems under investigation. A natural way to reconcile the requirements of system size and accuracy is the use of mixed quantum/classical (AIMD/MM) simulations [3–5]. In this scheme, originally developed by Warshel [6], the chemically relevant part of the system (usually the active site) is treated at the quantum mechanical level, while the effects of the surroundings are explicitly taken into account within a mechanical force field description. This enables a realistic description of chemical reactions that occur in a complex heterogeneous environment, such as enzymatic reaction cycles in an explicit protein environment [7]. In this review, we first present the fundamental principles of the AIMD and hybrid AIMD/MM methods in their most widespread implementations, namely
using density functional theory and plane-wave basis sets. Subsequently, we illustrate the power and limitations of the techniques for the modeling of biological systems through a survey of selected applications from our own work. Because of the explicit treatment of the electronic degrees of freedom, AIMD/MM calculations allow the direct simulation of bond breaking–bond forming processes, such as those occurring in enzymatic reactions. Here, we report a study of the reaction catalyzed by caspase-3 [8], a current target for the treatment of neurodegenerative diseases. We also show that fundamental insights can be obtained into an important class of biomimetics of iron-based enzymes [9]. We then move to the characterization of two types of transition-metal drugs, rhenium and technetium hexathioether complexes and cisplatin, along with their targeting abilities. This is a fundamental point in the design of new and highly selective drugs. Unfortunately, because of the presence of a transition metal ion, biomolecular force fields may have difficulty describing the structural and energetic properties of the drug, which depend in an intricate way on the electronic structure [10]. AIMD/MM may provide a valuable alternative, as it takes electronic properties into account. Here, we present an investigation of the structural and electronic properties of metal-based drugs targeting proteins and DNA [11, 12]. On the one hand, our approach provides insights into the reactivity of metal-based drugs; on the other, it serves as a tool for docking these drugs onto their targets. The article concludes with a brief perspective on some of the methodologies which may further extend the domain of applications of AIMD to biological systems.
2. Methods

2.1. The Basic "Idea" of Car–Parrinello Molecular Dynamics
Under the simplifying assumptions that the motion of the nuclei can be described by classical laws and that the Born–Oppenheimer approximation holds, the most intuitive way to combine electronic structure calculations with a classical molecular dynamics scheme is a straightforward coupling of the two approaches ("Born–Oppenheimer molecular dynamics"). In this approach, the total potential energy $E_{\mathrm{pot}}$ is calculated for a given nuclear configuration by solving the electronic structure problem:

$$E_{\mathrm{pot}}(\{R_I\}) = E_n(\{R_I\}) + E_e(\{R_I\}), \qquad (1)$$

where $E_n(\{R_I\})$ is the direct internuclear interaction energy and $E_e(\{R_I\})$ is the ground state energy evaluated at fixed nuclear positions $R_I$. The nuclear forces
are then calculated from $E_{\mathrm{pot}}$ and the nuclei are moved to new positions according to the laws of classical mechanics:

$$M_I \ddot{R}_I = -\nabla_I \min_{\Psi_0} \langle \Psi_0 | H_e | \Psi_0 \rangle, \qquad (2)$$

where the energy of the ground state reads

$$H_e \Psi_0 = E_0 \Psi_0, \qquad (3)$$

with $H_e$ the Hamiltonian of the electronic subsystem and $\Psi_0$ the ground state wavefunction.
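For orientation, the Born–Oppenheimer loop can be sketched in a few lines. This is a schematic illustration, not the authors' code; `ground_state` is a hypothetical placeholder for whatever electronic-structure backend minimizes $E_e$ at fixed nuclei:

```python
import numpy as np

def ground_state(positions):
    """Hypothetical electronic-structure call: minimize E_e at fixed nuclei
    and return (E_pot, forces). Any DFT or wavefunction backend could sit here."""
    raise NotImplementedError

def bomd_step(positions, velocities, masses, forces, dt):
    """One velocity-Verlet step of Born-Oppenheimer MD, integrating Eq. (2)."""
    velocities = velocities + 0.5 * dt * forces / masses[:, None]
    positions = positions + dt * velocities
    energy, forces = ground_state(positions)  # re-solve the electronic problem
    velocities = velocities + 0.5 * dt * forces / masses[:, None]
    return positions, velocities, forces, energy
```

The cost of such a loop is dominated by the electronic minimization performed at every step, which motivates the Car–Parrinello alternative described next.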
A different approach, proposed in 1985 by Car and Parrinello [1, 13], treats the electronic degrees of freedom (represented by the one-electron wave functions $\psi_i$) as fictitious classical variables. The system is therefore described in terms of the following extended Lagrangian:

$$L_{\mathrm{CP}} = T_n + T_e - E_{\mathrm{pot}}, \qquad (4)$$
where $L_{\mathrm{CP}}$ represents the Car–Parrinello extended Lagrangian, $T_n$ the kinetic energy of the nuclei, $T_e$ the fictitious kinetic energy of the electronic system, and $E_{\mathrm{pot}}$ the potential energy, which depends on both the nuclear positions $R_I$ and the electronic variables $\psi_i$. The explicit form of this extended Lagrangian reads

$$L_{\mathrm{CP}} = \frac{1}{2}\sum_I M_I \dot{R}_I^2 + \frac{1}{2}\mu\sum_i \langle \dot{\psi}_i | \dot{\psi}_i \rangle - E_{\mathrm{KS}}\left[\{\psi_i\};\{R_I\}\right] + \sum_{i,j}\Lambda_{ij}\left(\int \psi_i^*(\mathbf{r})\,\psi_j(\mathbf{r})\,d\mathbf{r} - \delta_{ij}\right), \qquad (5)$$
where $M_I$ represents the ionic masses and $\mu$ the fictitious mass associated with the electronic degrees of freedom; the last term represents the constraints added to ensure orthonormality of the one-electron wave functions. In most current Car–Parrinello AIMD implementations, the potential energy is given by the Kohn–Sham energy functional [2]

$$E_{\mathrm{KS}}\left[\{\psi_i\};\{R_I\}\right] = -\frac{1}{2}\sum_i \int d\mathbf{r}\,\psi_i^*(\mathbf{r})\,\nabla^2\psi_i(\mathbf{r}) + \int d\mathbf{r}\, V_N(\mathbf{r})\,\rho(\mathbf{r}) + \frac{1}{2}\int d\mathbf{r}\,d\mathbf{r}'\,\frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} + E_{\mathrm{xc}}[\rho(\mathbf{r})], \qquad (6)$$
where $V_N(\mathbf{r})$ is the external potential, $E_{\mathrm{xc}}[\rho(\mathbf{r})]$ the exchange-correlation functional, and $\rho(\mathbf{r})$ the electron density. The extended Lagrangian reported
in Eq. (5) determines the evolution of a fictitious classical system in which nuclear positions as well as electronic degrees of freedom are treated as dynamical variables. The Newtonian equations of motion of this system are given by the Euler–Lagrange equations

$$\frac{d}{dt}\frac{\delta L}{\delta \dot{R}_I} = \frac{\delta L}{\delta R_I} \qquad (7)$$

$$\frac{d}{dt}\frac{\delta L}{\delta \dot{\psi}_i^*} = \frac{\delta L}{\delta \psi_i^*} \qquad (8)$$

and the corresponding Car–Parrinello equations of motion are

$$M_I \ddot{R}_I(t) = -\frac{\delta}{\delta R_I}\langle \Psi_0|H_e|\Psi_0\rangle \qquad (9)$$

$$\mu\,\ddot{\psi}_i(t) = -\frac{\delta}{\delta \psi_i^*}\langle \Psi_0|H_e|\Psi_0\rangle + \sum_j \Lambda_{ij}\psi_j = -H_e\psi_i + \sum_j \Lambda_{ij}\psi_j, \qquad (10)$$

where $\mu$ determines the velocity at which the electronic degrees of freedom evolve in time. In particular, the ratio $\mu/M$ characterizes the relative speed at which the electronic variables are propagated with respect to the nuclear positions. For $\mu \ll M$, the electronic degrees of freedom adjust instantaneously to changes in the nuclear coordinates and the resulting dynamics is adiabatic. Under this condition, $T_e \ll T_n$ and the extended Lagrangian becomes identical to the physical Lagrangian of the system ($L_{\mathrm{CP}} \approx L$). According to Eqs. (9) and (10), while the nuclei move at a physical temperature proportional to their kinetic energy, the electronic degrees of freedom move at a certain "fictitious temperature". Thus, for a finite value of $\mu$, the electronic subsystem moves within a limited width, given by its fictitious kinetic energy, above the Born–Oppenheimer surface. Adiabaticity can be ensured only if the highest frequency of the nuclear motion, $\omega_I^{\max}$, is well separated from the lowest frequency associated with the fictitious motions of the electronic degrees of freedom, $\omega_e^{\min}$. Since $\omega_e^{\min}$ is proportional to the square root of the electronic energy difference ($E_{\mathrm{gap}}$) between the lowest unoccupied orbital and the highest occupied orbital (the HOMO–LUMO gap),

$$\omega_e^{\min} \propto \sqrt{\frac{E_{\mathrm{gap}}}{\mu}}, \qquad (11)$$
the parameter $\mu$ can be chosen in such a way as to ensure $\omega_e^{\min} \gg \omega_I^{\max}$. In this way, energy is not transferred from the electronic to the nuclear subsystem.
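As a back-of-the-envelope illustration of this constraint, one can invert Eq. (11) to bound the fictitious mass; the prefactor of order one is ignored and all names here are ours, not the original implementation's:

```python
def max_fictitious_mass(e_gap, omega_ion_max, safety=10.0):
    """Largest fictitious mass mu keeping omega_e_min = sqrt(e_gap / mu)
    at least `safety` times above the highest ionic frequency (all in
    consistent atomic units); based on Eq. (11) with unit prefactor."""
    return e_gap / (safety * omega_ion_max) ** 2
```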
In Car–Parrinello AIMD the explicitly treated electron dynamics limits the largest time step that can be used to only 0.1–0.2 fs. This limitation does of course not exist in BO dynamics, where there is no explicit electron dynamics and the time step is usually one order of magnitude larger than in Car–Parrinello AIMD. The advantage of diagonalizing the Hamiltonian only at the very first step of the dynamics must therefore be balanced against the need for a small time step. The Kohn–Sham one-electron orbitals $\psi_i$ are expanded in a basis set of plane waves (with wave vectors $\mathbf{G}_m$) up to a given kinetic energy cutoff $E_{\mathrm{cut}}$:

$$\psi_i(\mathbf{r}) = \frac{1}{\sqrt{V_{\mathrm{cell}}}}\sum_m c_i^m\, e^{i\mathbf{G}_m\cdot\mathbf{r}}. \qquad (12)$$
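A direct (unoptimized) evaluation of Eq. (12) on a set of real-space points might look as follows; production plane-wave codes use FFTs instead, and the names here are illustrative:

```python
import numpy as np

def orbital_on_grid(coeffs, g_vectors, points, v_cell):
    """Evaluate psi_i(r) = V^(-1/2) * sum_m c_i^m exp(i G_m . r), Eq. (12).
    coeffs: (n_G,) complex; g_vectors: (n_G, 3); points: (n_points, 3)."""
    phases = np.exp(1j * points @ g_vectors.T)  # (n_points, n_G) plane waves
    return phases @ coeffs / np.sqrt(v_cell)
```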
In such a scheme, an adequate treatment of the inner core electrons would require prohibitively large basis sets. Therefore, only valence electrons are treated explicitly, and the effect of the ionic core electrons is integrated out using the ab initio pseudopotential formalism. All calculations presented in Section 3 are performed with the original Car–Parrinello scheme [14] based on (gradient-corrected) [15] density functional theory in the framework of a pseudopotential approach [16] and a plane-wave basis set.
2.2. Hybrid Car–Parrinello/MM Calculations
Chemical and biochemical processes of relevance usually occur in heterogeneous condensed-phase environments consisting of thousands of atoms. A solution often used to model such systems is a hybrid QM/MM approach [3, 4], in which the whole system is partitioned into a localized, chemically active region, treated at the quantum mechanical level, and a remainder treated with empirical force fields. Several schemes exist in which the Car–Parrinello molecular dynamics scheme has been extended to a hybrid QM/MM framework [5, 17]. The general form of a mixed QM/MM Hamiltonian was introduced by Warshel [6]:

$$H = H_{\mathrm{QM}} + H_{\mathrm{MM}} + H_{\mathrm{QM/MM}}, \qquad (13)$$
where $H_{\mathrm{MM}}$ is described by a standard biomolecular force field and comprises bonded (harmonic bonds, angles, and dihedrals) and nonbonded interactions (electrostatic point charges and van der Waals interactions). The difficulty of each QM/MM implementation lies in the coupling between the QM and MM regions, described by the $H_{\mathrm{QM/MM}}$ part of the Hamiltonian. Recently, CPMD has been combined with an MM approach [5]. In this scheme, bonds between the QM and MM regions of the system are treated with specifically designed monovalent pseudopotentials [18], while the remaining bond interactions are described by the classical force field. The same holds for van der Waals interactions between the QM and MM parts of the system. On the other hand, electrostatic effects of the classical environment are treated
as an additional contribution to the external field of the quantum system, and particular care is taken to avoid overpolarization of the electron clouds near the boundary region (the so-called spill-out effect). In addition, to limit the computational overhead, the electrostatic interactions between the QM system and the more distant MM atoms are included via a Hamiltonian term that explicitly couples the multipole moments of the quantum charge distribution with the classical point charges. In the scheme we have used to perform mixed QM/MM Car–Parrinello simulations, either the GROMOS96 [19] or the AMBER95 [20] force field can be used for the molecular mechanics part, in combination with a particle–particle–particle-mesh (P3M) treatment of long-range electrostatic interactions [21]. This scheme provides an efficient computational tool that takes the entire system and solvent effects explicitly into account.
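Schematically, the energy bookkeeping of Eq. (13) amounts to adding a coupling term to the two subsystem energies. The toy sketch below uses fixed QM point charges for the electrostatic coupling, whereas the scheme described above couples the MM charges to the actual quantum charge distribution; all names are ours, and periodicity and link atoms are ignored:

```python
import numpy as np

def qmmm_energy(e_qm, e_mm, qm_charges, qm_pos, mm_charges, mm_pos,
                vdw=None):
    """Toy assembly of H = H_QM + H_MM + H_QM/MM (Eq. (13)). The coupling
    here is bare Coulomb between point charges plus an optional van der
    Waals callback; atomic units, non-overlapping atoms assumed."""
    r = np.linalg.norm(qm_pos[:, None, :] - mm_pos[None, :, :], axis=-1)
    e_coupling = np.sum(qm_charges[:, None] * mm_charges[None, :] / r)
    if vdw is not None:
        e_coupling += vdw(r)
    return e_qm + e_mm + e_coupling
```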
3. Applications

3.1. Enzymes and Biomimetic Compounds
3.1.1. Caspase-3

Caspase-3, a cysteine protease, plays a key role in cell apoptosis and is directly involved in the neurodegeneration of Alzheimer's disease [22]. Up to now, the identification of inhibitors has been based only on combinatorial chemistry, which, however, has not completely solved efficiency and selectivity problems. Knowledge of the energetics and structural features of the enzymatic reaction mechanism is relevant for developing transition-state analog inhibitors. We have used AIMD/MM simulations to investigate the second reaction step, involving the deacylation of the peptide substrate (Chart 1) [8]. The activation free energy of the reaction was investigated using a thermodynamic integration technique. The attack of the hydrolytic water molecule implies an activation free energy of ∼20 kcal/mol and leads to a previously unrecognized gem-diol intermediate that can readily evolve to the enzyme products (Fig. 1). Analogs resembling the gem-diol transition-state structures should therefore provide specific, powerful noncovalent inhibitors by capturing a fraction of the binding energy of the transition-state species. The subsequent C–S bond dissociation, which requires a much lower activation free energy (∼5 kcal/mol), is concerted with a proton transfer to the side chain of the substrate Asp (Fig. 1). Such a mechanism is an alternative to the proposal that the positively charged His residue transfers a proton to the anionic intermediate, proposed for the corresponding reaction in aqueous solution in the presence of a His residue [23].
Chart 1
In addition, it suggests that the decrease in catalytic efficiency on passing from papain to caspase-3 may be ascribed to both conformational and electrostatic properties.
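The activation free energies quoted above come from thermodynamic integration, i.e., $\Delta F = \int_0^1 \langle \partial H/\partial\lambda \rangle_\lambda\, d\lambda$ along the reaction coordinate; numerically this reduces to a quadrature over sampled averages. The snippet below is a generic sketch of that reduction, not the study's actual protocol:

```python
import numpy as np

def free_energy_ti(lambdas, mean_dh_dlambda):
    """Trapezoidal thermodynamic integration of constrained-MD averages
    <dH/dlambda> sampled at each value of the coupling parameter lambda."""
    return np.trapz(mean_dh_dlambda, lambdas)
```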
3.1.2. Iron-based biomimetic compounds

Nonheme diiron enzymes [24–30] catalyze a variety of important and diverse chemical reactions, which may be of interest for industry [31, 32]. For instance, the diiron enzyme methane monooxygenase (MMO) catalyzes the oxidation of alkane, alkene, and aromatic groups [33] under mild conditions in an efficient and highly selective manner [24, 25], while the corresponding industrial process occurs under extreme conditions with low yields [31, 32]. Thus, the idea of synthesizing enzyme mimics emerges quite naturally. These mimics should retain the structural features common to this class of diiron proteins: a four-helix antiparallel peptide bundle in which each iron atom binds a histidine and a glutamate ligand, with two bridging carboxylates binding both metal ions [33].
Figure 1. Water nucleophilic attack on the acyl-enzyme, as emerging from the QM/MM calculations: structure of the transition state (a) and of the intermediate I3 in Chart 1 (b). The H-bond pattern of His237 is shown with dotted lines. The QM part is indicated in thick lines. Labeling as in Chart 1.
Based on a retrostructural analysis of a series of diiron proteins [33–35], the synthetic biomimetic complex Due Ferro 1 (DF1, Fig. 2) has recently been synthesized and characterized by DeGrado's group [33–35]. The complex resembles the common motif of diiron proteins. Besides fully emulating the active site and the tertiary structure of the real enzymes, DF1 also binds metal ions [34, 35] other than iron (such as zinc and
Figure 2. (a) Schematic structure of DF1 [33–35]. (b) Close-up of the bridged bimetallic putative catalytic center.
manganese), thus mimicking the corresponding dizinc- [36, 37] and dimanganese-containing [38] enzymes. AIMD and hybrid AIMD/MM have elucidated the key factors governing the stability/reactivity of the active site of the two latter species. In the dizinc compound, our calculations have elucidated the crucial role of the environment (in particular, the second-shell ligands and the solvent waters) in stabilizing the hydrogen bond networks that surround the active site. Similar conclusions have been reached for other metal centers in proteins [39]. In addition, our calculations reveal the highly flexible nature of the carboxylate-bridged binuclear motif. The chelating carboxylate ligands are particularly mobile, and in the presence of the whole protein they perform a syn–anti isomerization in which the glutamate coordinates with the internal and the external oxygen lone pair, respectively. In the case of the manganese species, our simulations have shown that DF1 is not active as a mimic of manganese catalase [38], and calculations on chemically modified species have helped to shed light on the catalytic mechanism of the wild-type enzyme [40]. In conclusion, our calculations confirm that transition metal centers have a highly dynamical behaviour, in which the coordination sphere undergoes continuous changes in the geometric arrangement of the ligands. In DF1, the mobility of the Glu ligands is expected to play an important role in the catalytic properties of the carboxylate-bridged binuclear motif. Therefore, our computational approach can be critically important for the tailoring of efficient and highly selective biomimetic catalysts.
3.2. Pharmaceutical Compounds
3.2.1. Metal-based radiopharmaceutical compounds

Metal complexes with radioactive nuclei find multiple applications in medicine, as they enable the monitoring of biological functions and constitute a tool for imaging tumors, organs, and tissues [41, 42]. Over 90% of nuclear diagnostic medicine is carried out with technetium-99m. This is mainly due to the favorable properties of this radioisotope (99mTc has a half-life of 6 h and an emission energy of only 141 keV) and its ready generator availability. In contrast, radioisotopes such as the β-emitting 186Re and 188Re are now in widespread therapeutic use for the in situ treatment of cancerous tissues [43, 44]. A central issue in the development of radiopharmaceuticals with improved imaging and therapeutic properties is the search for compounds with enhanced selectivity. Unfortunately, the rational design of highly selective agents is hampered by the limited knowledge of the factors determining their reactivity and biodistribution. In the human body these molecules encounter a variety of different chemical environments (such as different pH or redox potential), and their pathways and final destinations are crucially determined by their chemical transformations under these varying external conditions. A characterization of the detailed physicochemical behavior of these compounds is therefore important for developing new radiopharmaceuticals with improved features. This is the case for the crown thioether complexes of rhenium and technetium, of general formula [M(9S3)2]2+ (M = Re, Tc), which are similar to the so-called "first generation" radiopharmaceutical agents [45]. These have been successful in the selective imaging of organs such as the heart, the brain, the liver, the kidneys, and the bones. In the presence of reducing agents (such as ascorbic acid, Zn, Cr, or SnCl2) under mild conditions, the Re and Tc compounds undergo instantaneous C–S bond cleavage to yield ethene and [M(9S3)L]+ (where L = SCH2CH2SCH2CH2S), whereas in the presence of other transition metals the reaction does not occur [46]. AIMD calculations have confirmed the hypothesis, based on experiments and semiempirical calculations [47], that the reductive bond scission in Tc and Re is caused by a strong π-back-donation from donor t2g metal orbitals into antibonding C–S σ*-orbitals of the thioether ligands (Fig. 3) [11]. In addition, the calculations show that the reaction proceeds in two steps. The first step consists of the reduction of the doubly positively charged metal complex to the unipositive analogue. The additional electron lowers the activation energy barrier for the dissociation of ethene by about 10 kcal/mol [11], which is sufficient to reduce the barrier to a level such that a
Figure 3. Contour plot of the HOMO−2 orbital at the transition state for [Re(9S3)2]2+, indicating the presence of π-back-donation from one of the d (t2g) orbitals into C–S σ* ligand orbitals. The contours are given at ±4.0 au.
short simulation of a few ps at room temperature allows direct observation of the loss of the ethene molecule (Fig. 4). In conclusion, our study provides a detailed understanding of the mechanism of the reductive C–S bond cleavage in rhenium and technetium radioactive agents and contributes to a comprehensive characterization of their chemical behavior in redox-active environments.
3.2.2. Cisplatin binding to DNA

Cisplatin (cis-diamminedichloroplatinum(II)) is widely used in the clinical treatment of a variety of cancers [48]. This compound targets DNA, distorting its structure (kink in the helical axis from ∼50° to ∼80°) and thereby
Figure 4. Dissociation pathway of [Re(9S3)2]+ at 350 K. Snapshots of the most representative frames 1–4 are shown: (1) conformation of the molecule after 0.1 ps simulation time; (2) simultaneous dissociation of the two C–S bonds and release of ethene (0.2 ps); (3) progressive removal of ethene; (4) formation of the final product.
inhibiting the replication and transcription machinery of the cell. Upon DNA binding, the drug loses its two chlorine ligands and binds to a guanine N7 atom and an adjacent guanine N7 atom (65% of total platinated DNA) or, to a lesser extent, to adenine N7 (25%). AIMD simulations were used to investigate the first (and rate-limiting) step of DNA binding [49], which is believed to involve the substitution of a chlorine ligand with a water molecule (Fig. 5). The calculations provided a structural model of the transition state of the reaction, and the calculated free energy barrier compared remarkably well with the experimental data. Subsequently, we carried out QM/MM calculations in aqueous solution of the final product of the reaction, namely the complex between cisplatin and a DNA oligomer (cisplatin-d(CCTCTG*G*TCTCC)·d(GGAGACCAGAGG)) [12], for which both X-ray and NMR structures are available [50, 51].
Figure 5. (a) Cisplatin in water; H-bonds denoted by dashed lines. (b) Initial structural model of the AIMD/MM simulations (cispt-d(CCTCTG*G*TCTCC)·d(GGAGACCAGAGG)). (c) Comparison between the initial and final AIMD/MM structures.
The platinated moiety was the QM region, while the biomolecular frame was treated with the AMBER force field [20]. During the dynamics, the structure of the platinated DNA dodecamer rearranged from the initial X-ray structure towards the structural determinants of the solution structure obtained by NMR spectroscopy (Fig. 5) [52]. The calculated 195Pt chemical shifts of the QM/MM structure relative to cisplatin in aqueous solution were in qualitative agreement with the experimental data. The [Pt(NH3)2]2+ moiety was subsequently docked onto DNA in its canonical B form. Within a relatively short time scale (∼7 ps), the DNA oligomer experienced a large kink and a rearrangement, as experimentally observed in the platinated adducts. The AIMD/MM approach described here can be used in the future to model the interaction of other platinum-based compounds with DNA oligomers and DNA nucleobases, for which a reliable force field parametrization has not yet been developed [52].
4. Concluding Remarks
Because of the large number of AIMD applications already present in the literature, it would clearly be impossible to review all the work that has appeared so far. Therefore, only a very few examples are included here (for more exhaustive reviews the reader is referred to Refs. [53–55]). Before closing, we would like to mention some of the developments in the code which are extending the domain of AIMD applications in biomolecular modeling: (i) The calculation of IR and Raman spectra [56], as well as of NMR chemical shifts [57], which allows direct contact with experiment.
(ii) The implementation of DFT-based methods for excited states, such as ROKS [58] and time-dependent DFT (TDDFT) [59], which allow the study of photophysical processes such as the cis–trans isomerization of the retinal chromophore in rhodopsin [60]. (iii) The implementation of path-integral MD (PIMD) simulations [61], which allows the description of hydrogen tunneling. These quantum effects are believed to play an important role in some enzymatic reactions [62, 63].
Acknowledgments

We would like to thank the people who have contributed to this review, namely, K. Spiegel, M. Sulpizi, and in particular U. Rothlisberger and M.L. Klein. In addition, we thank M. Parrinello for his continuous support.
References

[1] R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471, 1985.
[2] R.G. Parr and W. Yang, Density Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989.
[3] P. Sherwood, In: J. Grotendorst (ed.), Modern Methods and Algorithms of Quantum Chemistry, vol. 1, John von Neumann Institute for Computing, Jülich, NIC Series, 1257, 2000.
[4] M. Colombo, L. Guidoni, A. Laio, A. Magistrato, P. Maurer, S. Piana, U. Roehrig, K. Spiegel, M. Sulpizi, J. VandeVondele, M. Zumstain, and U. Rothlisberger, Chimia, 56, 13, 2002.
[5] A. Laio, J. VandeVondele, and U. Rothlisberger, J. Chem. Phys., 116, 6941, 2002.
[6] A. Warshel and M. Levitt, J. Mol. Biol., 103, 227–249, 1976.
[7] D. Sebastiani and U. Rothlisberger, "Advances in density-functional-theory-based modeling techniques – recent extensions of the Car–Parrinello approach," In: P. Carloni and F. Alber (eds.), Quantum Medicinal Chemistry, Chap. 1, p. 5, 2002.
[8] M. Sulpizi, A. Laio, J. VandeVondele, A. Cattaneo, U. Rothlisberger, and P. Carloni, Proteins, 52, 212–224, 2003.
[9] A. Magistrato, W.F. DeGrado, A. Laio, U. Rothlisberger, J. VandeVondele, and M.L. Klein, J. Phys. Chem. B, 107, 4182, 2003.
[10] L. Banci and P. Comba, Molecular Modeling and Dynamics of Bioinorganic Compounds, Kluwer Academic Publishers, Dordrecht, Boston, London, 1997.
[11] A. Magistrato, P. Maurer, T. Fässler, and U. Rothlisberger, J. Phys. Chem. A, 108, 2008–2013, 2004.
[12] K. Spiegel, U. Rothlisberger, and P. Carloni, J. Phys. Chem. B, 108, 2699–2707, 2004.
[13] D. Marx and J. Hutter, "Ab initio molecular dynamics: theory and implementations," In: J. Grotendorst (ed.), Modern Methods and Algorithms in Quantum Chemistry, John von Neumann Institute for Computing, Jülich, p. 301, 2000.
[14] All calculations were performed with the CPMD code: J. Hutter, A. Alavi, T. Deutsch, P. Ballone, M. Bernasconi, P. Focher, S. Goedecker, M. Tuckerman, and M. Parrinello, CPMD, Max-Planck-Institut für Festkörperforschung, Stuttgart, and IBM Research Laboratory Zürich, 1995–1999.
[15] The calculations presented in the next section have been performed using the gradient-corrected scheme developed by Becke (A.D. Becke, Phys. Rev. A, 38, 3098–3100, 1988) for the exchange and by Lee, Yang, and Parr (C. Lee, W. Yang, and R.G. Parr, Phys. Rev. B, 37, 785–789, 1988), or Perdew (J.P. Perdew, Phys. Rev. B, 33, 8822, 1986), for the correlation part.
[16] In all our calculations we have employed pseudopotentials of the Martins–Troullier type (N. Troullier and J.L. Martins, Phys. Rev. B, 43, 1993, 1991).
[17] M. Eichinger, P. Tavan, J. Hutter, and M. Parrinello, J. Chem. Phys., 110, 10452, 1999.
[18] U. Rothlisberger, To be published.
[19] W.R.P. Scott, P. Hünenberger, I.G. Tironi, A.E. Mark, S.R. Billiter, A.E. Torda, T. Huber, P. Krueger, and W.F. van Gunsteren, J. Phys. Chem. A, 103, 3596, 1999.
[20] D. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S. Debolt, D. Ferguson, G. Seibel, and P. Kollman, Comput. Phys. Commun., 91, 1, 1995.
[21] P. Hünenberger, J. Chem. Phys., 113, 10464, 2000.
[22] S. Shimohama, Apoptosis, 5, 9, 2000.
[23] M. Strajbl, J. Florian, and A. Warshel, J. Phys. Chem. B, 105, 4471, 2001.
[24] (a) B.J. Wallar and J.D. Lipscomb, Chem. Rev., 96, 2625–2657, 1996. (b) A.L. Feig and S.J. Lippard, Chem. Rev., 94, 759, 1994.
[25] S.J. Lange and L. Que Jr., Curr. Opin. Chem. Biol., 2, 159, 1998.
[26] R.E. Stenkamp, Chem. Rev., 94, 715, 1994.
[27] (a) A.K. Powell, Met. Ions Biol. Syst., 35, 515, 1998. (b) P.M. Harrison, P.D. Hempstead, P.J. Artymiuk, and S.C. Andrews, Met. Ions Biol. Syst., 35, 435, 1998.
[28] P. Nordlund and H. Eklund, Curr. Opin. Struct. Biol., 5, 758, 1995.
[29] (a) S.C. Gallagher, A. George, and H. Dalton, Eur. J. Biochem., 254, 480, 1998. (b) M. Lee, M. Lenman, A. Banas, M. Bator, S. Singh, N. Schweizer, R. Nilsson, C. Liljenberg, A. Dahlquist, P.D. Gummeson, et al., Science, 280, 915, 1998.
[30] P. Nordlund, B.M. Sjöberg, and H. Eklund, Nature, 345, 593, 1990.
[31] A.E. Shilov and G.B. Shul'pin, Chem. Rev., 97, 2879, 1997.
[32] A.E. Shilov, Activation of Saturated Hydrocarbons by Transition Metal Complexes, D. Reidel Publishing Co., Dordrecht, The Netherlands, 1984.
[33] C.M. Summa, A. Lombardi, M. Lewis, and W.F. DeGrado, Curr. Opin. Struct. Biol., 9, 500, 1999.
[34] A. Lombardi, C.M. Summa, S. Geremia, L. Randaccio, V. Pavone, and W.F. DeGrado, Proc. Natl. Acad. Sci. USA, 97, 6298, 2000.
[35] L. Di Costanzo, H. Wade, S. Geremia, L. Randaccio, V. Pavone, W.F. DeGrado, and A. Lombardi, J. Am. Chem. Soc., 123, 12749, 2001.
[36] M.C.J. Wilce, C.S. Bond, N.E. Dixon, H.C. Freeman, J.M. Guss, P.E. Lilley, and J.A. Wilce, Proc. Natl. Acad. Sci. USA, 95, 3472, 1998.
[37] S.P. Liu, J. Widom, C.W. Kemp, C.M. Crews, and J. Clardy, Science, 282, 1324, 1998.
[38] G.C. Dismukes, Chem. Rev., 96, 2909, 1996.
[39] M. Dal Peraro, A.J. Vila, and P. Carloni, J. Biol. Inorg. Chem., 7, 704, 2002.
[40] A. Magistrato, W.F. DeGrado, and M.L. Klein, To be published.
[41] J.R. Dilworth and S.J. Parrott, Chem. Soc. Rev., 27, 43, 1998.
[42] W.A. Volkert and S. Jurisson, Technetium and Rhenium, 176, 123, 1996.
[43] S. Prakash, M.J. Went, and P.J. Blower, Nucl. Med. Biol., 23, 543, 1996.
[44] G. Schoeneich, H. Palmedo, D. Heimbach, H.J. Biersack, and S.C. Muller, Onkologie, 20, 316, 1997.
[45] www.cardiolite.com.
[46] (a) G.E.D. Mullen, M.J. Went, S. Wocadlo, A.K. Powell, and P.J. Blower, Angew. Chem. Int. Ed. Engl., 36, 1205, 1997. (b) G.E.D. Mullen, P.J. Blower, D.J. Price, A.K. Powell, M.J. Howard, and M.J. Went, Inorg. Chem., 39, 4093, 2000.
[47] G.E.D. Mullen, F.T. Fässler, M.J. Went, K. Howland, B. Stein, and P.J. Blower, J. Chem. Soc., Dalton Trans., 21, 3759, 1999.
[48] J. Reedijk, Proc. Natl. Acad. Sci. USA, 100, 3611, 2003.
[49] P. Carloni, M. Sprik, and W. Andreoni, J. Phys. Chem. B, 104, 823, 2000.
[50] P.M. Takahara, A.C. Rosenzweig, C.A. Frederick, and S.J. Lippard, Nature, 377, 649–652, 1995.
[51] A. Gelasco and S.J. Lippard, Biochemistry, 37, 9230, 1998.
[52] M.A. Elizondo-Riojas and J. Kozelka, J. Mol. Biol., 314, 1227, 2001.
[53] P. Carloni and U. Rothlisberger, In: L. Eriksson (ed.), Theoretical Biochemistry – Processes and Properties of Biological Systems, Elsevier Science, New York, 2000.
[54] W. Andreoni, A. Curioni, and T. Mordasini, IBM J. Res. Dev., 45, 397, 2001.
[55] P. Carloni, U. Rothlisberger, and M. Parrinello, Acc. Chem. Res., 35, 455, 2002.
[56] D. Sebastiani and M. Parrinello, J. Phys. Chem. A, 105, 1951, 2001.
[57] P. Silvestrelli and M. Parrinello, Phys. Rev. Lett., 82, 3308, 1999.
[58] I. Frank, J. Hutter, D. Marx, and M. Parrinello, J. Chem. Phys., 108, 4060, 1998.
[59] (a) E.K.U. Gross, F.J. Dobson, and M. Petersilka, Density Functional Theory, Springer, Berlin, 1996. (b) M.E. Casida, In: D.P. Chong (ed.), Recent Advances in Density Functional Methods, World Scientific, Singapore, 1995.
[60] U. Rohrig, L. Guidoni, A. Laio, J. VandeVondele, and U. Rothlisberger, To be published.
[61] S. Raugei and M.L. Klein, J. Am. Chem. Soc., 125, 8992, 2003.
[62] D.B. Northrop, Acc. Chem. Res., 34, 790, 2001.
[63] K.M. Doll, B.R. Bender, and R.G. Finke, J. Am. Chem. Soc., 125, 10877, 2003.
1.14 TIGHT-BINDING TOTAL ENERGY METHODS FOR MAGNETIC MATERIALS AND MULTI-ELEMENT SYSTEMS Michael J. Mehl and D.A. Papaconstantopoulos Center for Computational Materials Science, Naval Research Laboratory, Washington, DC, USA
The classic paper of Slater and Koster [1] described a method for modifying a linear combination of atomic orbitals (LCAO) for use in an interpolation scheme to determine energy bands over the entire Brillouin zone while fitting only to the results of first-principles calculations at high-symmetry points in the zone. This tight-binding (TB) method was shown to be extremely useful for the study of the band structure of solids at little computational cost. Harrison [2, 3] developed a "universal" set of parameters which is used both to obtain a basic understanding of band structures and to make approximate calculations. Papaconstantopoulos [4] computed the Slater–Koster parameters for most elements by fitting to results obtained from the first-principles augmented plane wave (APW) method. Numerous other applications of this method have appeared in the literature [5, 6]. As computational methods developed, it was realized [7–11] that tight-binding methods, properly applied, could be used as a scheme for determining structural energies as well as electronic structure. Since these methods use a minimal basis set for each atom, they are much faster than first-principles methods for systems of similar size, and are therefore useful for quickly studying systems containing several hundred atoms, e.g., in molecular dynamics simulations [12]. One example of the method is the two-center, nonorthogonal NRL-TB method [9, 10], which uses environment-dependent on-site parameters and bond-length-dependent hopping parameters to go beyond interpolating between fitted structures to the determination of elastic constants, phonon spectra, and defect structures. A similar approach is used by the Ames group [11, 13, 14], who approximate the three-center integrals by modifying the
two-center hopping integrals according to the local environment. Cohen, Stixrude, and Wasserman [15] have modified the description of the on-site parameters (4)–(6) to include crystal-field-like corrections, extending the work of Mercer and Chou [16] to include d orbitals. We have previously summarized much of this work [5, 6]. In this article we focus on extensions of the TB method beyond the original elemental systems. Specifically, we show how the method can be extended to spin-polarized systems, including noncollinear spins, using the atomic moment approximation (AMA) [17]. We also describe the development of parameters for binary and ternary compounds. As we will see, although the determination of the TB parameters is tedious, the resulting method is computationally efficient, capable of performing static and dynamic calculations beyond the limits of first-principles methods. The method has been applied to all of the magnetic elements and to many nonmagnetic compounds. The accuracy of electronic, elastic, and phonon properties is comparable to that of the original, nonmagnetic single-element calculations. In the discussion of our work below, our TB calculations are fitted to first-principles results obtained from the linearized augmented plane wave (LAPW) method [18], including full-potential and total energy capabilities [19, 20]. Calculations used the Kohn–Sham independent-electron formulation of density functional theory [21, 22] with various local density approximations (LDA) [23] or the Perdew–Wang 1991 generalized gradient approximation (GGA) [24]. Other tight-binding methods use similar first-principles techniques, as described in the references. This work is divided into two major parts. Section 1 describes work on spin-polarized systems, including noncollinear spins, while Section 2 shows how TB methods can be adapted to compounds. Finally, in Section 3 we briefly discuss the future of TB total energy methods.
1. Magnetic Systems
Since spin-polarized density functional calculations produce eigenvalues for both the majority and minority spin channels, it is rather easy to set up a Slater–Koster tight-binding parametrization for each channel. These parameter sets are bound together by the requirement that they reproduce the first-principles eigenvalues for each spin as well as the total energy. Accordingly, we modify the original nonpolarized TB procedure [9, 10] as follows. The total energy of the TB system is given by the sum over occupied states of the shifted spin-polarized eigenvalues:

$$E = \sum_i f(\varepsilon'_{i\uparrow} - \mu')\,\varepsilon'_{i\uparrow} + \sum_i f(\varepsilon'_{i\downarrow} - \mu')\,\varepsilon'_{i\downarrow}, \qquad (1)$$
where $f(\varepsilon)$ is a smoothing function, usually the Fermi function [25], $\mu'$ is the shifted Fermi level, which gives the correct number of occupied bands, and the arrows indicate the collinear spin polarization of the electronic states. The eigenvalues $\varepsilon'$ are uniformly shifted from the eigenvalues $\varepsilon$ found from the density functional calculations:

$$\varepsilon'_{i\uparrow} = \varepsilon_{i\uparrow} + \varepsilon_s \quad \text{and} \quad \varepsilon'_{i\downarrow} = \varepsilon_{i\downarrow} + \varepsilon_s. \qquad (2)$$
The shift, $\varepsilon_s$, is defined so that the total energy $E$ in (1) is equal to the total energy from the DFT calculation:

$$\varepsilon_s = \left( E - \sum_i \left[ f(\varepsilon_{i\uparrow} - \mu)\,\varepsilon_{i\uparrow} + f(\varepsilon_{i\downarrow} - \mu)\,\varepsilon_{i\downarrow} \right] \right) \Big/ N_e, \qquad (3)$$

where $N_e$ is the number of electrons in the system and $\mu' = \mu + \varepsilon_s$, with $\mu$ the Fermi level of the original DFT calculation.
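Given the DFT eigenvalues, Fermi level, and total energy, Eq. (3) is a one-line reduction. The sketch below uses our own names and Fermi smearing for $f$, with an assumed smearing width:

```python
import numpy as np

def fermi(e, mu, kt):
    """Fermi-function smoothing f used in Eqs. (1)-(3)."""
    return 1.0 / (np.exp((e - mu) / kt) + 1.0)

def eigenvalue_shift(e_dft, eps_up, eps_dn, mu, n_electrons, kt=0.005):
    """Shift eps_s of Eq. (3): the amount by which all eigenvalues must be
    raised so that their occupied-state sum equals the DFT total energy."""
    band = (np.sum(fermi(eps_up, mu, kt) * eps_up)
            + np.sum(fermi(eps_dn, mu, kt) * eps_dn))
    return (e_dft - band) / n_electrons
```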
In our approach to spin-polarized TB [26] we assign all of the difference between the majority and minority bands to the on-site terms. Thus, with each atom $i$ we associate both a majority and a minority "density" of nearby atoms:

$$\rho_{i(\uparrow,\downarrow)} = \sum_j \exp\left(-\lambda^2_{(\uparrow,\downarrow)}\,R_{ij}\right) F(R_{ij}), \qquad (4)$$

where

$$F(R) = \theta(R_c - R)\,\big/\,\{1 + \exp[(R - R_c)/L + 5]\} \qquad (5)$$
is a screening function designed to take the densities (4) smoothly to zero at distances greater than $R_c$. Typically, we take $R_c$ between 10 and 16 a.u., and $L$ between 0.25 and 0.5 a.u. Once we have the density in the neighborhood of each atom, we assign the spin-dependent on-site parameters for states with angular momentum $\ell = s$, $p$, and $d$ by

$$h_{i\ell(\uparrow,\downarrow)} = \alpha_{\ell(\uparrow,\downarrow)} + \beta_{\ell(\uparrow,\downarrow)}\,\rho^{2/3}_{i(\uparrow,\downarrow)} + \gamma_{\ell(\uparrow,\downarrow)}\,\rho^{4/3}_{i(\uparrow,\downarrow)} + \delta_{\ell(\uparrow,\downarrow)}\,\rho^{2}_{i(\uparrow,\downarrow)}. \qquad (6)$$
We will frequently find it useful to determine the energy of a paramagnetic system using these TB parameters. In the paramagnetic system the on-site parameters are the average of the majority and minority spin parameters in (6). The hopping and overlap terms have the same form here as in our unpolarized TB calculations, and are taken to be independent of the spin associated with each TB orbital. Thus the Slater–Koster hopping parameters between atoms separated by a distance $R$ are given by

$$H_\mu = [A_\mu + B_\mu R + C_\mu R^2]\,\exp(-D^2_\mu R)\,F(R), \qquad (7)$$

where $\mu = (ss\sigma, sp\sigma, pp\sigma, pp\pi, sd\sigma, pd\sigma, pd\pi, dd\sigma, dd\pi, dd\delta)$ labels the Slater–Koster bond types.
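The radial ingredients of Eqs. (4)–(7) are simple closed forms; the sketch below follows the equations as printed, with illustrative (not fitted) parameter values supplied by the caller:

```python
import numpy as np

def screen(r, rc=12.0, l=0.5):
    """Screening function F(R) of Eq. (5)."""
    return np.where(r < rc, 1.0 / (1.0 + np.exp((r - rc) / l + 5.0)), 0.0)

def density(r_neighbors, lam):
    """Majority- or minority-spin 'density' of Eq. (4) on one atom,
    given the distances r_neighbors to its neighbors."""
    return np.sum(np.exp(-lam**2 * r_neighbors) * screen(r_neighbors))

def onsite(rho, a, b, c, d):
    """Spin- and angular-momentum-dependent on-site term of Eq. (6)."""
    return a + b * rho**(2.0 / 3.0) + c * rho**(4.0 / 3.0) + d * rho**2

def hopping(r, A, B, C, D):
    """Slater-Koster radial hopping parameter of Eq. (7)."""
    return (A + B * r + C * r**2) * np.exp(-D**2 * r) * screen(r)
```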
We usually assume the TB basis to be nonorthogonal, requiring us to define a set of overlap parameters $S_\mu$ to complement (7). In the spin-polarized calculations we have done so far we have given $S$ the same functional form as $H$, noting only that this is not required for a successful theory [9, 10]. For an $sp^3d^5$ basis, the procedure above gives 106 independent parameters. For iron [26] we fitted these parameters to reproduce a database of eigenvalues and total energies for paramagnetic bcc Fe, ferromagnetic bcc Fe, and ferromagnetic fcc Fe, using the GGA [24] to obtain the correct ferromagnetic body-centered cubic (bcc) ground state. The structural energies as a function of volume are shown in Fig. 1, where we compare our results to first-principles calculations. We note that the output paramagnetic fcc total energy closely tracks the paramagnetic fcc energy from LAPW calculations. It should be noted that the TB parametrization cannot reproduce the low-spin/high-spin discontinuity found in ferromagnetic fcc iron [27]. This is not usually a problem in Fe, especially when we consider that the paramagnetic fcc TB total energy is very close to the low-spin fcc LAPW total energy.
Figure 1. Comparison of first-principles and tight-binding calculations for Fe, using a spin-polarized tight-binding parametrization [26]. Squares represent bcc phases, diamonds fcc phases. Solid symbols denote ferromagnetic phases, open symbols unpolarized phases. Red lines are LAPW calculations, blue lines tight-binding. The low-spin/high-spin discontinuity in the LAPW ferromagnetic phase is not reproduced by the tight-binding parametrization.
The TB method also lets us examine the total polarization of a system, as the difference in occupation number between the majority and minority spin sites:

$$m = \sum_i \left[ f(\varepsilon_{i\uparrow} - \mu) - f(\varepsilon_{i\downarrow} - \mu) \right]. \qquad (8)$$
Figure 2 shows the magnetic moment of fcc and bcc Fe as a function of volume. Note that the first-principles high/low-spin transition in fcc iron occurs at approximately the same volume at which the paramagnetic TB total energy becomes lower than the ferromagnetic TB energy for the fcc lattice. We have extended our tight-binding calculations for magnetic systems to cobalt and nickel (as well as chromium, which will be discussed below). Both elements are substantially easier to fit than iron, since there is no high/low-spin transition in any state. Our fitting database included first-principles LAPW total energy calculations for the fcc, bcc, and simple cubic structures, using the Hedin–Lundqvist LDA [23]. The resulting TB parameters correctly predict the ferromagnetic hcp lattice as the ground state of Co, even though we did not include this state in the fit. Table 1 shows our calculated elastic constants [28, 29] for the three ferromagnetic elements as well as Cr. We list the TB results at both the equilibrium and experimental volumes [30]. At the
Figure 2. Comparison of first-principles and tight-binding calculations for the magnetic moment of Fe, using the spin-polarized tight-binding parametrization of Ref. [26]. The notation is the same as in Fig. 1.
Table 1. Elastic constants for the magnetic elements computed from the spin-polarized tight-binding parameters and compared to experiment [30]. Calculations for Fe, Co, and Ni were done with ferromagnetic spin orientations. The first "TB" column for each element is at the tight-binding equilibrium volume, the second at the experimental equilibrium volume. For Co we use the tight-binding minimum-energy value of c/a at the experimental volume. As explained in the text, we model the spin-density wave in chromium by a CsCl-type unit cell, where one of the Cr atoms has spin "up" and the other spin "down". All elastic constants are in GPa.

               Cr                      Fe                      Co                      Ni
            TB     TB     Exp.      TB     TB     Exp.      TB     TB     Exp.      TB     TB     Exp.
  a (a.u.)  5.280  5.451  5.451     5.373  5.416  5.416     4.797  4.786  4.743     6.483  6.652  6.652
  c (a.u.)                                                  7.591  7.557  7.693
  B         278    164    162       180    158    173       223    247    186       264    175    185
  C11       599    407    350       250    223    237       348    359    287       358    251    249
  C12       117    42     68        145    125    141       180    189    158       217    137    153
  C13                                                       160    168    116
  C33                                                       322    336    322
  C44       142    105    101       142    132    116       78     80     66        75     69     96

Table 2. Phonon frequencies at selected high-symmetry points for ferromagnetic bcc iron and fcc nickel, computed from NRL tight-binding parameters and compared to experiment. Symmetry labels follow the notation of Miller and Love [33]. The column labeled "P" indicates the polarization of the mode, either longitudinal (L) or transverse (T), if it is defined. The column "D" indicates the degeneracy of the mode. All frequencies are in inverse centimeters.

  Fe                                     Ni
  Sym.   P   D   TB    Exp. [31]         Sym.   P   D   TB    Exp. [32]
  H          3   289   286               X3     L   1   273   285
  P          3   262   240               X5     T   2   180   209
  N3     L   1   308   357               L2     L   1   265   296
  N2     T   1   221   215               L3     T   2   130   141
  N4     T   1   148   149               W2         1   170   207
                                         W5         2   198   250
experimental volume we find that the elastic constants are in good agreement with experiment, and are at the same level of accuracy as first-principles DFT calculations. Using our TB parameters we have determined phonon frequencies at high-symmetry locations in the Brillouin zone, using the frozen-phonon method. Table 2 shows phonon frequencies for iron and nickel, compared to experiment [31, 32]. The symmetry notation used here follows that of Miller and Love [33]. We see that the agreement is comparable to that of similar calculations for nonmagnetic transition metals [9]. Barreteau et al. [34] have developed a method for the study of magnetism in transition metals by starting with an approach similar to ours for the nonmagnetic part of the interaction [35], and modeling the magnetic interactions
by a multiband Hubbard model treated in the Hartree–Fock approximation. The method has been applied to Rh and Pd clusters and slabs [36]. Recently, Barreteau et al. [37] analyzed the main effects due to the renormalization of the hopping integrals by the intersite Coulomb interactions. They find that these effects depend strongly on the relative values of the intersite electron–electron interaction and on the shape of the electronic density of states. The predicted electronic structures for bcc iron, hcp cobalt, and fcc nickel are in excellent agreement with first-principles calculations. Xie and Blackman [38] begin with a similar, though orthogonal, form for the nonmagnetic part of the TB calculation, and add parametrized terms for charge self-consistency and spin polarization. They use their method to study the magnetism of iron clusters embedded in cobalt. Finally, we note that one could apply the semiempirical approach of Krasko [39], using a Stoner model to add a magnetization energy to, in our case, a nonmagnetic TB parametrization. This approach has the advantage that a single set of parameters serves for both the magnetic and nonmagnetic cases, but it has not been applied to materials other than iron. We have calculated vacancy formation energies by a supercell method [10, 25]. One atom in the supercell is removed and neighboring atoms are allowed to relax around this vacancy while preserving the symmetry of the lattice. The great advantage of the NRL-TB method over first-principles approaches is that we can do the calculation in a very large supercell, in a computationally efficient manner, including relaxation with the TBMD code [12]. We found that a supercell containing 216 atoms was sufficient to eliminate vacancy–vacancy interactions in ferromagnetic iron and nickel. For iron, we found an unrelaxed vacancy formation energy of 2.62 eV and a relaxed formation energy of 2.33 eV. For nickel we found 1.87 and 1.60 eV for the unrelaxed and relaxed formation energies, respectively. The relaxed vacancy formation energies are in very good agreement with the experimental values of 2.0 eV for iron and 1.6 eV for nickel.
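For reference, the supercell estimate used above is conventionally computed as (our convention; the text does not spell it out)

$$E_{\mathrm{vac}} = E\left[(N-1)\ \mathrm{atoms} + \mathrm{vacancy}\right] - \frac{N-1}{N}\,E\left[N\ \mathrm{atoms}\right],$$

so that for the 216-atom cell the 215-atom defected supercell is compared with 215/216 of the ideal supercell energy.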
1.1. Noncollinear Magnetism
The theory described above assumes, as in most versions of spin-dependent density functional theory, that the electronic spin points in a global "up" or "down" direction, excluding the possibility that electrons on different atoms might be aligned in different directions. This is a difficult problem in density functional theory. A simplified approach, valid within the AMA, was made by Pickett [17]. We have adapted it [40] to our TB procedure (1)–(7) as follows. For each atom, define the paramagnetic part of each on-site term as

$$t_{i\ell} = (h_{i\ell\uparrow} + h_{i\ell\downarrow})/2, \qquad (9)$$
where the $h_{i\ell(\uparrow,\downarrow)}$ are defined in (6). Define the exchange splitting introduced by the polarization by

$$\Delta_{i\ell} = (h_{i\ell\uparrow} - h_{i\ell\downarrow})/2. \qquad (10)$$
Note that both (9) and (10) define diagonal elements of the Slater–Koster Hamiltonian. To introduce noncollinear spin polarization, we give each atom a spin direction $\hat{d}_i$, where $|\hat{d}_i| = 1$. We then construct the nonorthogonal Slater–Koster Hamiltonian by coupling the majority and minority spin channels together. The hopping and overlap terms between majority and minority orbitals are assumed to be identical to the terms between orbitals of the same spin and have the form (7). The on-site terms, however, are mixed according to the rule

$$h_{i\ell s,\, j\ell' s'} = t_{i\ell}\,\delta_{i,j}\,\delta_{\ell,\ell'} - \tfrac{1}{2}\,\Delta_{i\ell}\,\delta_{i,j}\,\delta_{\ell,\ell'}\,\hat{d}_i\cdot\vec{\sigma}_{ss'}, \qquad (11)$$

where the $s$ and $s'$ components indicate the spin index (↑ or ↓), and $\vec{\sigma}_{ss'}$ is the vector form of the Pauli spin matrices for spins $s$ and $s'$.
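Concretely, Eq. (11) makes each on-site entry a 2×2 matrix in spin space; a minimal sketch (names ours):

```python
import numpy as np

# Pauli matrices: SIGMA[k][s, s'] is the (s, s') element of sigma_k.
SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]])

def onsite_spin_block(t, delta, d_hat):
    """2x2 on-site block of Eq. (11) for one orbital on one atom:
    t * I - (delta / 2) * (d_hat . sigma), with |d_hat| = 1."""
    return t * np.eye(2) - 0.5 * delta * np.einsum('k,kab->ab', d_hat, SIGMA)

# For d_hat = +z the block is diagonal with entries t - delta/2 and
# t + delta/2; opposite atoms in an antiferromagnet use d_hat = -z.
```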
The simplest application of noncollinear magnetization is an antiferromagnet, where the $\hat{d}_i$ lie along the Cartesian directions $\hat{z}$ and $-\hat{z}$. This is a common model for chromium [41], which has a nominally bcc structure modulated by an incommensurate spin-density wave with vector $q = (2\pi/a)(0, 0, 0.952)$. If we model this vector by $(2\pi/a)(0, 0, 1)$, which is the ground state of all first-principles calculations using current density functionals [42], then the wave is commensurate and we can model it as an antiferromagnetic CsCl-like unit cell, with atoms on the cesium sites having spins pointing in the $\hat{z}$ direction and atoms on the chlorine sites pointing in the opposite direction. We computed the total energy for this state by using our spin-polarized tight-binding parameters for Cr and Eqs. (9)–(11), alternating the "up" and "down" spins in a CsCl structure, to yield the results shown in Fig. 3. We see that the antiferromagnetic phase has lower energy than the ferromagnetic phase at all volumes, in agreement with experimental data. Manganese is another element with an antiferromagnetic ground state. We have previously shown [43] that paramagnetic TB parameters correctly predict the ground-state αMn structure, but we did not consider the effects of magnetic interactions. Using a spin-polarized set of TB parameters, fitted to the fcc, bcc, and simple cubic structures, we computed the total energy of αMn for all possible spin configurations which preserve the symmetry of the crystal. As shown in Fig. 4, we found that a configuration with 13 "up" atoms and 16 "down" atoms gives the lowest energy. Given the constraints of the 29-atom unit cell we cannot get a perfect antiferromagnet; this would require (at least) doubling the unit cell. An alternative method for determining magnetization within a parametrized TB framework was developed by Mukherjee and Cohen [44]. In this method, the net magnetic moment (8) is considered to be a parameter, and is solved for
Figure 3. Tight-binding total energy calculations for bcc chromium, using spin-polarized parameters. The ferromagnetic (FM) calculations were done in the bcc unit cell. The antiferromagnetic (AFM) calculations were performed using two atoms in a simple cubic unit cell, with one spin pointing “up”, and the other “down” [40].
self-consistently. This allows ferromagnetic and paramagnetic systems to be computed from the same set of parameters. The method has been successfully applied to high-pressure hcp iron [45], which has a rather unusual magnetic structure [46]. Zhuang and Halley [47] use a charge self-consistent TB method to describe the noncollinear magnetic spin structures of MnF2 and MnO2.
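For concreteness, the on-site mixing rule (11) can be evaluated directly. Below is a minimal numpy sketch with hypothetical values of t and Δ (in the real method these come from the fitted spin-polarized on-site parameters); it implements Eq. (11) as printed above:

    import numpy as np

    SX = np.array([[0, 1], [1, 0]], dtype=complex)
    SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
    SZ = np.array([[1, 0], [0, -1]], dtype=complex)

    def onsite_spin_block(t, delta, d):
        # 2x2 spin block of Eq. (11) for one orbital on one atom:
        # t * I - (1/2) * delta * (d . sigma).
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)            # enforce |d| = 1
        d_dot_sigma = d[0]*SX + d[1]*SY + d[2]*SZ
        return t * np.eye(2, dtype=complex) - 0.5 * delta * d_dot_sigma

    # Antiferromagnetic CsCl-like cell, as in the chromium example above:
    # "up" on the cesium site, "down" on the chlorine site
    # (t and delta in Ry, hypothetical values).
    h_up = onsite_spin_block(t=0.30, delta=0.05, d=(0, 0, 1))
    h_dn = onsite_spin_block(t=0.30, delta=0.05, d=(0, 0, -1))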
2. Compounds
Extension of the method to compounds requires several modifications [48]. As always, we begin by shifting the eigenvalues so that their sum is the total energy,

E[n(r)] = \sum_n f(\varepsilon_n - \mu)\,\varepsilon_n.   (12)
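Equation (12) is simply an occupation-weighted sum over eigenvalues. A minimal sketch, assuming Fermi–Dirac occupations with a small numerical smearing kT (the value here is a hypothetical numerical choice):

    import numpy as np

    def total_energy_from_bands(eigvals, mu, kT=0.005):
        # Eq. (12): sum of the (shifted) eigenvalues weighted by the
        # Fermi-Dirac occupation f(eps - mu); energies in Ry.
        eps = np.asarray(eigvals)
        f = 1.0 / (np.exp((eps - mu) / kT) + 1.0)
        return np.sum(f * eps)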
There are three types of parameters in the fit: the on-site terms, which depend on the local environment and represent the energy required to put an electron in a specific atomic shell; the hopping parameters, which represent the energy required for the electron to move between atoms; and overlap parameters, detailing the
Figure 4. Tight-binding total energy calculations for α-manganese, using spin-polarized and unpolarized parameters and the noncollinear tight-binding method [40]. The paramagnetic (PM) calculations used the average of the spin-up and spin-down parameters. The ferromagnetic (FM) calculations used the spin-polarized parameters with all the atomic spins aligned. For the nearly antiferromagnetic (AFM) calculations, atoms at the (2a) and one set of (24g) Wyckoff positions were aligned in the "up" direction, and atoms on the (8c) and second (24g) sites were aligned "down." This yields the lowest possible total spin for the primitive 29-atom α-Mn unit cell.
nonorthogonality of the TB orbitals. In all three cases we must now determine pairwise interactions between atoms of the same type as well as those between atoms of different species. The environmental dependence of the on-site parameters is controlled by a set of atomic-like densities,

\rho(i, \tilde{j}) = \sum_{j \in \tilde{j}} \exp[-\lambda^2_{\tilde{\imath}\tilde{j}}\,|R_i - R_j|]\,F(|R_i - R_j|),   (13)

where the ith atom is of type \tilde{\imath}, the jth atom is of type \tilde{j}, \rho(i, \tilde{j}) is the density on atom i due to atoms of type \tilde{j}, \lambda_{\tilde{\imath}\tilde{j}} is a fitting constant to be determined, and F is defined in (4). The on-site terms themselves are polynomial functions in \rho^{2/3}:

h_\ell(i) = a_\ell(\tilde{\imath}) + \sum_{\tilde{j}} \big[ b_\ell(\tilde{\imath},\tilde{j})\,\rho(i,\tilde{j})^{2/3} + c_\ell(\tilde{\imath},\tilde{j})\,\rho(i,\tilde{j})^{4/3} + d_\ell(\tilde{\imath},\tilde{j})\,\rho(i,\tilde{j})^{2} \big],   (14)
where the sum is over all atom types in the system. Each atom type interacts with the target atom differently. The method used here was adopted for the sake of expediency, and is not the ideal form. However, it is a very useful form, as we shall see. In general, we use angular momenta \ell = s, p, d. However, in systems with essentially cubic symmetry it is sometimes convenient to split the d on-site terms into t_{2g} and e_g components. We took this approach for the parametrization of FeAl [48], but not for Cu–Au [49]. The two-center Slater–Koster hopping integrals are determined using an exponentially damped polynomial, and depend only on the atomic species and the distance between the atoms:

H_{\ell\ell'\mu}(i, j; R) = [A_{\ell\ell'\mu}(\tilde{\imath},\tilde{j}) + B_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R + C_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R^2]\,\exp[-D^2_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R]\,F(R).   (15)

The A, B, C, and D parameters are to be fit. For like-atom (\tilde{j} = \tilde{\imath}) interactions, there are 10 independent Slater–Koster parameters: ssσ, spσ, ppσ, ppπ, sdσ, pdσ, pdπ, ddσ, ddπ, and ddδ. When the atoms are of different types, we must include an additional four parameters: psσ, dsσ, dpσ, and dpπ. Note that we do not distinguish between t_{2g} and e_g orbitals when computing the hopping integrals. Since we are using a nonorthogonal basis set, we must also parametrize the overlap integrals. These have a form similar to the hopping integrals:

S_{\ell\ell'\mu}(i, j; R) = [O_{\ell\ell'\mu}(\tilde{\imath},\tilde{j}) + P_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R + Q_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R^2]\,\exp[-T^2_{\ell\ell'\mu}(\tilde{\imath},\tilde{j})\,R]\,F(R),   (16)
where O, P, Q, and T also represent parameters to be fit. Again, we do not distinguish between t_{2g} and e_g orbitals. For a two-component system with s, p, d orbitals, including t_{2g} and e_g on-site terms, there are 330 parameters (λs, a, b, c, d, A, B, etc.) which are used in the fit, in contrast to 97 for a single-element parametrization [10]. These parameters are chosen so as to reproduce the eigenvalues ε and energies E in Eq. (12). While the number of parameters may seem rather large, one must realize that we are using these parameters as a mathematical transformation from the DFT to the TB formalism. With this in mind, the number of parameters seems quite reasonable.
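The functional forms (13)–(16) are straightforward to evaluate. A minimal sketch follows; the cutoff F(R) is defined in Eq. (4) of this chapter and is not reproduced here, so a Fermi-like stand-in with hypothetical r_c and l_c is used, and all function and parameter names are ours:

    import numpy as np

    def cutoff(R, r_c=12.5, l_c=0.5):
        # Stand-in for the smooth cutoff F(R) of Eq. (4) (not reproduced
        # in this section); a Fermi-like form with hypothetical r_c, l_c.
        return 1.0 / (1.0 + np.exp((R - r_c) / l_c))

    def onsite_density(distances, lam):
        # Eq. (13): atomic-like density on atom i due to neighbors of a
        # single type, given their distances and the fitted lambda.
        R = np.asarray(distances)
        return np.sum(np.exp(-lam**2 * R) * cutoff(R))

    def onsite_term(a, b, c, d, rho):
        # Eq. (14), one angular-momentum channel, one neighbor type.
        return a + b * rho**(2.0/3.0) + c * rho**(4.0/3.0) + d * rho**2

    def radial_hopping(R, A, B, C, D):
        # Eq. (15); the overlap integrals of Eq. (16) take the same form
        # with parameters O, P, Q, T.
        return (A + B*R + C*R**2) * np.exp(-D**2 * R) * cutoff(R)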
2.1. Copper–Gold
A good test case for the method is the Cu–Au system. Experimentally, it is known that ordered phases exist up to 200–400 °C for Cu3Au (L12), CuAu
(L10), and CuAu3 (L12) [50]. Theoretically, Ozoliņš et al. [51] have done extensive first-principles calculations on hypothetical ordered phases in this system, using the energetics data to fit a cluster expansion model for the alloy. In our calculations, we first obtained good TB parameters for Cu [52] and Au [12]. These were fixed throughout the remainder of the fit. We then fit the Cu–Au on-site, hopping, and overlap terms to reproduce the band structure and total energies of Cu3Au and CuAu3 in the L12 and D03 structures, and CuAu in the L10, L11, B1, and B2 structures. We then compute the total energies of a number of ordered structures, and compute the formation energy per atom, which, for a structure with formula unit Cu_m Au_n, is

E_{form}(m, n) = [E_0(Cu_m Au_n) - m\,E_{fcc}(Cu) - n\,E_{fcc}(Au)]/(m + n),   (17)
where E_0 is the minimum energy for the structure in question, and E_{fcc} is the equilibrium energy of the pure element in the face-centered cubic phase. The results for the low-lying phases in the Cu–Au system are shown in Fig. 5. We see that these parameters do, in fact, predict the existence of ordered
Figure 5. Formation energy diagram for ordered Cu1−xAux compounds, using our tight-binding parameters [6, 49]. Strukturbericht symbols are used to designate the phases, except for A1′ and A2′, which are ordered Cu7Au and CuAu7 supercells of the fcc and bcc lattices, respectively. The tie line connects the known ordered structures in the Cu–Au system [50]. The red dots represent structures used to fit the tight-binding parameters, while the blue dots are predictions.
Figure 6. Formation energy of several ordered phases in the CuxAu1−x system, calculated using our tight-binding parameters [6, 49] (blue bars), and compared to first-principles calculations performed by Ozoliņš et al. [51] (red bars). The structure notation is from Ref. [51]. On this scale, the cluster-expansion energies found in Ref. [51] are indistinguishable from the corresponding LAPW results. For comparison, we also plot our first-principles LAPW results (green bars), which were used in the Cu–Au tight-binding fitting process.
L12 Cu3Au and L10 CuAu. The L12 CuAu3 structure is, on the other hand, above the tie line between CuAu and pure gold. This is consistent with our LAPW calculations, suggesting that L12 is not the ground-state structure of CuAu3. Figure 6 compares some of our structural energies to the first-principles formation energies found by Ozoliņš et al. [51]. We see that we have very good agreement for the low-lying phases. Part of the discrepancy may be that we disagree slightly on the first-principles formation energies of some structures, as shown in the figure. To further assess the transferability of the Cu–Au parameters, we computed elastic constants and zone-center phonon frequencies for ordered Cu3Au and compared them to experiment [53, 54] as well as to first-principles LAPW calculations. The results are shown in Table 3. We find reasonable agreement between these values and results obtained from first principles. The advantage of the tight-binding method over first-principles calculations is that it allows us to quickly study systems with a large number of atoms. Accordingly, we used these parameters to seek understanding of the surface electronic
Table 3. Equilibrium bulk properties of Cu3Au in the L12 structure, as determined by our tight-binding parametrization [49], first-principles LAPW calculations, and from experiment

Property       Experiment    LAPW    TB
a (Å)          3.755 [53]    3.68    3.69
C11 (GPa)      187 [30]      180     198
C12 (GPa)      135 [30]      120     98
C44 (GPa)      68 [30]       –       92
Γ4 (cm−1)      125 [54]      110     153
Γ4 (cm−1)      210 [54]      200     270
Γ5 (cm−1)      161 [54]      159     195
Figure 7. Band structure of Cu3Au from [49]. (a) Bulk system along the Γ–R direction, and (b) (111) surface along the Γ̄–M̄ direction. E1 and E2 are the experimentally determined surface states [55].
structure of Cu3Au [49]. Experiment [55] shows that two electronic surface states exist at Γ̄ in the (111) surface Brillouin zone of Cu3Au. We model this system using our TB parameters and a slab consisting of 15 atomic layers and 60 atoms. In Fig. 7, we compare band structures for bulk Cu3Au and the slab. We see that the surface states found experimentally agree nicely with the states found in our TB calculation.
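For reference, the bookkeeping of Eq. (17) behind Figs. 5 and 6 is a one-line computation once the total energies are known. In the sketch below the energies are assumed to come from separate TB total-energy minimizations (the function and argument names are ours):

    def formation_energy(e0, m, n, e_fcc_cu, e_fcc_au):
        # Eq. (17): formation energy per atom of a Cu_m Au_n compound,
        # referenced to fcc Cu and fcc Au at their equilibrium energies.
        return (e0 - m * e_fcc_cu - n * e_fcc_au) / (m + n)

A structure whose formation energy lies on or below the tie line connecting neighboring ground states is predicted to be stable at zero temperature.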
2.2. Aluminides, Hydrides, and Carbides
To study aluminides we created a database of LAPW calculations for the B1 (NaCl), B2 (CsCl), D03 (Fe3Al), C11b (MoSi2), and B32 (NaTl) structures, generating TB Hamiltonians for FeAl [48], CoAl, and NiAl by fitting the energy bands for the B2 structure and the total energies for all of the above structures. The TB Hamiltonian included the s, p, and d orbitals for both the metal and Al sites, all of which were necessary for obtaining a good fit to the LAPW results. The RMS error for the total energy was less than 1 mRy for all structures fitted, and in the B2 structure the RMS error for the lowest 12 bands was less than 20 mRy. We were able to reproduce well the lattice constants, bulk moduli, and electronic properties, such as the densities of states and energy bands. In addition, quantities that were not fitted, such as elastic constants, are found to be in good agreement with independent LAPW calculations and experiment. Figure 8 shows that there is excellent agreement between the LAPW results and the TB results over a wide range of pressures for all the fitted phases. The agreement is especially good in the ground-state CsCl (B2) structure. We plot the formation energy, which is defined in analogy with (17).
Figure 8. The formation energies versus atomic volume for ordered FexAl1−x structures, calculated using our TB parameters [48] and compared to first-principles LAPW calculations. The solid lines represent the TB results while the points represent the LAPW results.
The TB and LAPW band structures of the B2 FeAl structure are shown in Fig. 9. The original TB calculations [48] reproduce the main features of the first-principles results, but in detail there are significant differences. Here we use a parameter set which has a better fit to the band structure, and find that the behavior of the bands near the Fermi level is close to the LAPW results. We obtained the TB and LAPW electronic densities of states (DOS) by the tetrahedron method [56], using 165 k-points in the irreducible part of the Brillouin zone. The LAPW and TB DOS shown in Fig. 10 are in good agreement. Experimentally, the DOS at the Fermi energy is known only from specific-heat measurements, where it was measured to be ρ(εF) = 31.1 states/Ry per FeAl molecule [57]. Our TB calculation yields ρ(εF) = 48.7 states/Ry, slightly higher
Figure 9. The band structure of FeAl in the CsCl structure, at the lattice constant a = 2.94 Å. The left figure shows the tight-binding band structure, while the LAPW results are on the right. In both cases the Fermi level has been set to zero. These calculations were done using a tight-binding parameter set which was selected to improve the fit to the FeAl band structure compared to our original parameters [48].
Figure 10. The electronic density of states of B2 (CsCl) FeAl, using the TB (left) and LAPW (right) methods, at the lattice constant a = 2.94 Å. In each case the Fermi level has been set to zero. The partial densities of states are given according to the legend in each part of the figure. While the LAPW results have a longer tail at low energy, the DOS are essentially similar near the Fermi level.
than the LAPW value ρ(εF) = 36.8. Other reports in the literature also find the theoretical value of ρ(εF) to be greater than that derived from experiment, a discrepancy that leaves no room for electron–phonon enhancement and thus calls the experimental result into question. This discrepancy is possibly caused by the nonstoichiometry of the Fe–Al samples. Our predicted equilibrium lattice parameters and bulk modulus are also in good agreement with the first-principles results shown in Table 4. This is a result of the fitting procedure, as we fit the TB parameters to total energies at several volumes. However, the shear elastic moduli that we computed [58, 29] for the CsCl phase were not included in the fit, and, except for C44, are in good agreement with the experimental results. In summary, we have presented a brief report of our TB study of the FeAl system. We showed that the parameters describe several bcc and bcc-like phases, as well as the NaCl phases, extremely well. We have also developed TB parametrizations for several other binary compounds. We can judge the transferability of the parameters by computing elastic constants for the equilibrium phase and comparing to experiment, as we do in Table 4. In many cases the compound measured is not stoichiometric, e.g., PdH0.66 [59] or Fe0.5989Al0.4011 [30], or has only been measured in thin films [60]. In extreme cases, where there is no available experimental data, we compare to the results of LAPW calculations [58]. The TB method described here is not limited to the study of bulk systems. It can, indeed, be used to study chemisorption processes. Our initial work was on the Pd–H2 system [63]. Building on our previous parameters for Pd [9], and using a database of 55 ab initio total energy calculations, we were able to model the dissociation of molecular hydrogen at the Pd(100) surface. We modified our usual procedure so that the fitting was done varying only the hydrogen on-site terms and the H–H and Pd–H Hamiltonian and overlap hopping parameters. The Pd on-site terms and Pd–Pd parameters were kept fixed at their pure Pd values. However, to obtain higher accuracy we expanded the polynomial that describes the H–H and Pd–H parameters up to fourth order. Figure 11 shows potential energy surface cross-sections for two orientations of the H2 molecule above the surface. A comparison of the TB and ab initio results reveals that the fit reproduces the minimum energy paths and also the general shape of the elbow plots very well. The overall RMS error, including additional ab initio values that were not fitted, was only 0.1 eV, a value that is usually considered to be within the accuracy of the ab initio total energies. Using similar techniques, we have also developed a set of TB parameters for studying the dissociation of the O2 molecule as it approaches a platinum surface [64]. In addition to the energy surfaces (as we computed for Pd–H2), we used the TB Molecular Dynamics (TBMD) [12] code to compute sticking probabilities. This was done by performing TBMD runs for a number of incident O2 kinetic energies in the range 0–1.5 eV, and averaging over 150 trajectories for a given energy. The results are shown in Fig. 12. We see that the
Table 4. Equilibrium lattice parameters and elastic constants (in GPa) for various cubic compounds in the NaCl or CsCl structure. Tight-binding results are compared to the available experimental data, noting that some compounds do not exist at the given stoichiometry. Values are calculated at the indicated equilibrium lattice constants (in atomic units)

                    a           B     C11   C12   C44
PdH   TB            7.723       207   282   170   27
PdH   Exp. [59]     7.584       183   227   161   69
NiH   LAPW          6.908       238   353   181   92
NiH   TB            6.908       234   311   196   64
FeAl  TB            5.323       204   313   149   71
FeAl  Exp. [30]     5.479       136   181   114   127
NiAl  TB            5.389       195   247   168   60
NiAl  Exp. [30]     5.461       166   211   143   112
CoAl  TB            5.295       213   306   166   82
CoAl  LAPW [58]     5.408       157   257   107   130
NbC   TB            8.405       313   639   151   126
NbC   Exp. [61]     8.447       340   620   200   150
VN    TB            7.873       333   570   214   170
VN    Exp. [60]     7.810 [62]  268   533   135   133
Figure 11. Contour plots of the TB-PES along two two-dimensional cuts through the six-dimensional coordinate space of H2/Pd(100) [63]. The coordinates in the figure are the H2 center-of-mass distance from the surface, Z, and the H–H interatomic distance, dH−H. The lateral H2 center-of-mass coordinates in the surface unit cell and the orientation of the molecular axis, i.e., the coordinates X, Y, θ, and φ, are kept fixed for each 2D cut and depicted in the insets. The molecular axis is kept parallel to the surface; (a) corresponds to dissociation at the bridge site, (b) to dissociation at the top site. The dots denote the points that have been used to obtain the fit. Energies are in eV per H2 molecule. The contour spacing is 0.1 eV.
Figure 12. Trapping probability of O2 /Pt(111) as a function of the kinetic energy for normal incidence [64]. Results of molecular beam experiments for surface temperatures of 90 and 200 K [65] and 77 K [66] are compared to TBMD simulations for the surface initially at rest (Ts = 0 K).
trapping probability has the same basic behavior as found experimentally [65, 66], showing that we can successfully model the chemisorption of O2 on Pt.
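The trajectory-averaging protocol just described is simple to express in code. In the sketch below, run_trajectory is a hypothetical stand-in for a full TBMD integration that returns whether the incoming molecule was trapped; only the statistics are illustrated:

    import numpy as np

    def trapping_curve(run_trajectory, energies_eV, ntraj=150, seed=0):
        # Trapping probability vs incident kinetic energy, averaged over
        # ntraj trajectories per energy (cf. Fig. 12). The rng randomizes
        # initial conditions inside run_trajectory.
        rng = np.random.default_rng(seed)
        probs = []
        for e in energies_eV:
            trapped = sum(bool(run_trajectory(e, rng)) for _ in range(ntraj))
            probs.append(trapped / ntraj)
        return np.array(probs)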
2.3. Silicon Carbide
We previously developed parameter sets for both carbon [67] and silicon [68], so it is natural to extend the technique to the development of a parameter set for SiC [69]. Silicon carbide has a wide variety of polytypes, distinguished by the stacking of the SiC layers. It is therefore a good test of the ability of the method to develop transferable parameter sets. The parameters were developed by fitting to the zincblende (stacking ABCABC), wurtzite (stacking ABAB), and 4H (stacking ABACABAC) structures, several zone-boundary phonons, elastic constant modes, and diamond Si and C. The method was able to successfully reproduce the first-principles electronic band structure, as shown in Fig. 13. In addition, we computed phonon frequencies along the (001)
Figure 13. Band structure of zincblende SiC along high symmetry directions of the Brillouin zone, calculated from sp3 d5 tight-binding parameters (solid lines) and LAPW–LDA (dashed lines) [69].
direction of the zincblende unit cell and compared them to experiment [70]. As seen in Fig. 14, the acoustic modes are in good agreement with experiment, though the optic modes are somewhat low. Thermal expansion was also computed using the TBMD program, and found to be in good agreement with experiment.
2.4. Tight-binding Description of MgB2
A nonorthogonal TB Hamiltonian for the superconductor MgB2 was derived [71] by fitting to both the total-energy and energy-band results of a first-principles full-potential LAPW calculation using the Hedin–Lundqvist parametrization of the local-density approximation (LDA). The LAPW calculations were performed in the ground-state (AlB2) structure for 17 different combinations of c and a, which determined the LDA equilibrium volume. The LAPW results for the total energy and the energy bands at 76 k points in the irreducible hexagonal Brillouin zone were used as a database to determine the parameters of the TB Hamiltonian. Our basis included the s and p orbitals of both Mg and B in a nonorthogonal two-center representation. In order to obtain an accurate fit it was essential to block-diagonalize the Hamiltonian at the high-symmetry points Γ, A, L, K, and H. We found that at a given set of lattice parameters (c, a) we can reproduce the energy bands of MgB2 quite
Figure 14. Phonon dispersion along the Γ–L direction in zincblende SiC. Circles are from an sp3 tight-binding parametrization [69], and diamonds are from experiment [70].
well. A comparison is shown in Fig. 15, where the solid and broken lines represent the LAPW and TB bands, respectively, at the LDA values of the equilibrium lattice parameters. The TB bands are in very good agreement with the LAPW bands, including the two-dimensional B σ band in the Γ–A direction just above εF, which has been identified as the hole band controlling superconductivity. The RMS fitting error is 2 mRy for the total energy, and close to 10 mRy for the first five bands. Beyond the fifth band our fit is not as accurate, as the Mg d bands, which are not included in our Hamiltonian, come into play. The values of our TB parameters are given in the references [71]. In Fig. 16 we show a comparison of the TB and LAPW densities of states (DOS). There is excellent agreement in both the total DOS and the B p-like DOS. The B and Mg s components of the DOS have their strongest presence at the bottom of the valence band, from −0.8 to −0.6 Ry on our scale. They are much smaller than the p-like DOS, so we chose not to include them in Fig. 16. Additionally, we have omitted the Mg p-like DOS, which is also small below εF but becomes significant above εF. Our TB value of the total DOS at εF is ρ(εF) = 0.69 states/eV, which is almost identical to that found from our direct LAPW calculation. This value of ρ(εF) corresponds to the LDA equilibrium volume and is slightly smaller than the value of 0.71 states/eV reported by other workers at the experimental volume. Using our value of ρ(εF) and the measured value of the specific-heat coefficient γ we find a value
Figure 15. The band structure of MgB2 in the AlB2 structure at the theoretical equilibrium volume, as determined by the full-potential LAPW method (solid lines) and our tight-binding parametrization (dashed lines) [71]. The Fermi level is at zero.
Figure 16. The electronic density of states (DOS) of MgB2 in the AlB2 structure at the theoretical equilibrium volume, comparing the total DOS as determined by the full-potential LAPW method (upper solid line) and our tight-binding parametrization [71] (upper dashed line), and the partial single-atom B p decomposition (lower lines).
Figure 17. Total energy of MgB2 at fixed volume versus c/a, calculated from our tight-binding parametrization [71]. The points indicate the actual calculated energies, while the lines are cubic polynomial fits to the data.
of the electron–phonon coupling constant λ = 0.65, which is consistent with the high superconducting transition temperature of MgB2. Our TB Hamiltonian also provides an accurate description of the energetics of MgB2, as shown in Fig. 17. We have tested our parameters by computing the TB equilibrium structure. We find an equilibrium of c = 6.66 a.u. and a = 5.79 a.u., in good agreement with the LAPW result. At c/a = 1.14, the experimental value, we deduce a bulk modulus of B = 165 GPa, which is in good agreement with the experimental value of 120 GPa and with the calculated value of 147 GPa reported by Bohnen et al. [72].
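The step from ρ(εF) and the measured specific-heat coefficient γ to λ is not written out above; it follows from the standard electron–phonon mass-enhancement relation,

\gamma = \frac{\pi^2}{3}\,k_B^2\,\rho(\varepsilon_F)\,(1 + \lambda)
\qquad\Longrightarrow\qquad
\lambda = \frac{3\gamma}{\pi^2 k_B^2\,\rho(\varepsilon_F)} - 1,

so that a measured γ larger than the bare band value implies λ > 0.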
2.5. Ternary Systems: Ruthenates
The NRL-TB scheme has been applied to ternary systems as well. For such applications the number of parameters increases substantially. However, in most cases it is easy to restrict the number of parameters by using an orthogonal Hamiltonian and by reducing the orbitals to only those that are most dominant in the particular system. We consider first [73] SrRuO3 and Sr2RuO4, where for the former we have constructed a 14×14 orthogonal Hamiltonian including Ru-d and O-p orbitals, and for the latter the Hamiltonian size is
27×27, with Sr-d, Ru-d, and O-p orbitals. In these calculations we did not fit the total energies, as we aim only for a very accurate reproduction of the LAPW band structures. These Hamiltonians allow the band structure to be computed on very fine meshes in the Brillouin zone at low computational cost, and additionally have yielded an analytic form for the band velocities, while retaining the accuracy of the full-potential electronic structure calculations. This greatly facilitates the calculation of transport and superconducting parameters related to the fermiology. These features were exploited to calculate the Hall coefficient and an anisotropy parameter relevant to the superconducting vortex lattice geometry for Sr2RuO4. A comparison of the TB and LAPW Fermi surfaces for Sr2RuO4 is shown in Fig. 18, where we see excellent agreement.
Figure 18. Fermi surface of Sr2RuO4 from LAPW and TB calculations [73].
2.6. NaxCoO2
The TB method has been applied to the study of the odd-gap superconductor NaxCoO2 [74, 75]. This system has strong nesting, involving nearly 70% of the electrons at the Fermi level. Since this effect primarily involves the Co and O atoms, the parametrization was restricted to those states. The crystal field of the octahedral structure splits the on-site Co d-bands into a1g, e′g, and eg bands. To accommodate this, the on-site parameters (14) were computed independently for the xy, yz, zx, x²−y², and 3z²−r² Co d orbitals and the x, y, and z O p orbitals. The dependence of the on-site and hopping parameters on bond distance was then used to analyze the change of the Fermi surface with interlayer distance. The band structure and Fermi surface were found to depend on the oxygen height in a nontrivial manner. In addition, the one-electron susceptibility was computed, as shown in Fig. 19. The nesting shown here leads to a charge density wave as well as to spin fluctuations, suggesting that the system is an odd-gap triplet s-wave superconductor.
Figure 19. Low-frequency limit of χ0(q, ω)/ω in NaxCoO2, using a tight-binding parametrization of Co–O [74]. The double-humped peaks on the zone boundary indicate nesting.
2.7. Other Methods
Porezag et al. [8] developed an alternative method for computing total energies and electronic eigenvalues from a parametrized tight-binding scheme. In their work, based on the Linear Combination of Atomic Orbitals (LCAO) method, the hopping parameters are computed directly from first-principles calculations. A repulsive pair potential between the ions is then fitted so that the sum of the pair-potential energies and the sum over the occupied states gives the correct total energy. Finally, the on-site terms are corrected with a simulated Coulomb interaction to preserve charge self-consistency on each ion. The method has been applied to many sp3 systems, including, e.g., predicting the structure of tetragonal CN compounds [76], the electronic structure of GaN edge dislocations, and the structure of amorphous CN [77]. Halley and coworkers [7, 78, 79] have developed a similar charge self-consistent TB approach, which has been applied mainly to oxides (rutile TiO2) and fluorides (MnF2, discussed in Section 1). In this method the isolated ions are required to have the proper energy levels, which allows for better descriptions of electrochemistry. As noted above, this method has also been used to study magnetic systems. Pan [80] adapted the work of the Ames group on carbon [11] to derive a TB parametrization for hydrocarbons. This has been used to study the geometries of small hydrocarbons and hydrogenated diamond surfaces, and finds geometries in qualitative agreement with previous results. We have discussed extensions of our original TB total energy method [9, 10] to spin-polarized systems, including noncollinear spins, and to compounds. Although the determination of the TB parameters is tedious, the resulting method is computationally efficient, capable of performing static and dynamic calculations beyond the limits of first-principles methods. We have applied the method to all of the magnetic elements, and to many nonmagnetic compounds. The accuracy of electronic, elastic, and phonon properties is comparable to that of the original nonmagnetic single-element calculations.
3. Outlook
Tight-binding total energy methods can be thought of as a mapping of a large set of first-principles data onto a compact TB Hamiltonian based on Slater–Koster parameters. As we have seen, these methods are nearly as accurate as first-principles calculations over a wide range of structures and densities. The calculations are very fast, as well. A typical first-principles calculation for a transition metal or intermetallic compound requires on the order of one hundred basis functions per atom to achieve convergence. The TB calculation will use only nine functions per atom, assuming an sp3d5
basis set. Given that the time to diagonalize the Hamiltonian scales as the cube of the number of basis functions, we see that TB methods are inherently about a thousand times faster than the corresponding first-principles calculations, since (100/9)³ ≈ 1400. Furthermore, any algorithmic improvements in eigenvalue determination can be applied to TB methods as well as to first-principles methods. Tight-binding calculations will therefore always be faster than first-principles calculations, and so can be applied to much larger systems. As we have seen, these methods are routinely applied to molecular dynamics simulations containing hundreds of atoms, and have been applied to systems containing several thousand atoms. The major bottleneck to the widespread use of TB methods is the development of accurate parameter sets, particularly for binary and ternary compounds. This involves large numbers of first-principles calculations, and thorough testing of the resulting parameter sets. However, once a parameter set is validated, it can be used for a wide variety of applications. We expect the use of TB methods to grow rapidly as more systems are parametrized.
Acknowledgments

This work was supported by the U.S. Office of Naval Research (ONR). The development of the tight-binding codes was supported in part by the U.S. Department of Defense Common HPC Software Support Initiative (CHSSI). Work on the magnetic elements was sponsored in part by the ONR Design of Naval Steels program. In addition to our collaborators, we would like to thank W. Pickett for helpful discussions concerning noncollinear magnetization.
References

[1] J.C. Slater and G.F. Koster, Phys. Rev., 94, 1498, 1954.
[2] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[3] W.A. Harrison, Elementary Electronic Structure, World Scientific, Singapore, 1999.
[4] D.A. Papaconstantopoulos, Handbook of the Band Structure of Elemental Solids, Plenum, New York, 1986.
[5] M.J. Mehl and D.A. Papaconstantopoulos, In: C.Y. Fong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, Chap. V, pp. 169–213, 1998.
[6] D.A. Papaconstantopoulos and M.J. Mehl, J. Phys. Condens. Matter, 15, R413, 2003.
[7] N. Yu and J.W. Halley, Phys. Rev. B, 51, 4768, 1995.
[8] D. Porezag, T. Frauenheim, T. Köhler, G. Seifert, and R. Kaschner, Phys. Rev. B, 51, 12947, 1995.
[9] R.E. Cohen, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. B, 50, 14694, 1994.
[10] M.J. Mehl and D.A. Papaconstantopoulos, Phys. Rev. B, 54, 4519, 1996.
[11] M.S. Tang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 53, 979, 1996.
[12] F. Kirchhoff, M.J. Mehl, N.I. Papanicolaou, D.A. Papaconstantopoulos, and F.S. Khan, Phys. Rev. B, 63, 195101, 2001.
[13] H. Haas, C.Z. Wang, M. Fähnle, C. Elsässer, and K.M. Ho, Phys. Rev. B, 57, 1461, 1998.
[14] C.Z. Wang, B.C. Pan, and K.M. Ho, J. Phys. Condens. Matter, 11, 2043, 1999.
[15] R.E. Cohen, L. Stixrude, and E. Wasserman, Phys. Rev. B, 56, 8575, 1997; erratum: Phys. Rev. B, 58, 5873, 1998.
[16] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 49, 8506, 1994.
[17] W.E. Pickett, J. Korean Phys. Soc. (Proc. Suppl.), 29, S70, 1996.
[18] O.K. Andersen, Phys. Rev. B, 12, 3060, 1975.
[19] S.-H. Wei and H. Krakauer, Phys. Rev. Lett., 55, 1200, 1985.
[20] D. Singh, Phys. Rev. B, 43, 6388, 1991.
[21] P. Hohenberg and W. Kohn, Phys. Rev., 136, B864, 1964.
[22] W. Kohn and L.J. Sham, Phys. Rev., 140, A1133, 1965.
[23] L. Hedin and B.I. Lundqvist, J. Phys. C: Solid State Phys., 4, 2064, 1971.
[24] J.P. Perdew, J.A. Chevary, S.H. Vosko, K.A. Jackson, M.R. Pederson, D.J. Singh, and C. Fiolhais, Phys. Rev. B, 46, 6671, 1992.
[25] M.J. Gillan, J. Phys. Condens. Matter, 1, 689, 1989.
[26] N.C. Bacalis, D.A. Papaconstantopoulos, M.J. Mehl, and M. Lach-hab, Physica B: Condens. Matter, 296, 125, 2001.
[27] P. Entel, R. Meyer, K. Kadau, H. Herper, and E. Hoffmann, Eur. Phys. J. B, 5, 379, 1998.
[28] M.J. Mehl, Phys. Rev. B, 47, 2493, 1993.
[29] M.J. Mehl, B.M. Klein, and D.A. Papaconstantopoulos, In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds – Principles and Practice, vol. 1, John Wiley and Sons, London, pp. 195–210, 1994.
[30] G. Simmons and H. Wang, Single Crystal Elastic Constants and Calculated Aggregate Properties: A Handbook, 2nd ed., MIT Press, Cambridge, MA and London, 1971.
[31] V.J. Minkiewicz, G. Shirane, and R. Nathans, Phys. Rev., 162, 528, 1967.
[32] R.J. Birgeneau, J. Cordes, G. Dolling, and A.D.B. Woods, Phys. Rev., 136, A1359, 1964.
[33] S.C. Miller and W.F. Love, Tables of Irreducible Representations of Space Groups and Co-representations of Magnetic Space Groups, Pruett, Boulder, 1967.
[34] C. Barreteau, R. Guirado-López, M.C. Desjonquères, D. Spanjaard, and A.M. Oleś, Comput. Mat. Sci., 17, 211, 2000.
[35] C. Barreteau, D. Spanjaard, and M.C. Desjonquères, Phys. Rev. B, 58, 9721, 1998.
[36] C. Barreteau, R. Guirado-López, D. Spanjaard, M.C. Desjonquères, and A.M. Oleś, Phys. Rev. B, 61, 7781, 2000.
[37] C. Barreteau, M.-C. Desjonquères, A.M. Oleś, and D. Spanjaard, Phys. Rev. B, 69, 064432, 2004.
[38] Y. Xie and J.A. Blackman, Phys. Rev. B, 66, 085410, 2002.
[39] G.L. Krasko, J. Appl. Phys., 79, 4682, 1996.
[40] M.J. Mehl, D.A. Papaconstantopoulos, I.I. Mazin, N.C. Bacalis, and W.E. Pickett, J. Appl. Phys., 89, 6880, 2001.
[41] K. Hirai, J. Phys. Soc. Japan, 67, 1776, 1998.
[42] R. Hafner, D. Spisak, R. Lorenz, and J. Hafner, Phys. Rev. B, 65, 184432, 2002.
[43] M.J. Mehl and D.A. Papaconstantopoulos, Europhys. Lett., 31, 537, 1995.
[44] S. Mukherjee and R.E. Cohen, J. Comput.-Aided Mater. Des., 8, 107, 2001.
[45] R.E. Cohen and S. Mukherjee, Phys. Earth Planet. Int., 143–144, 445, 2004.
[46] G. Steinle-Neumann, L. Stixrude, and R.E. Cohen, Proc. Natl. Acad. Sci. USA, 101, 33, 2004.
[47] M. Zhuang and J.W. Halley, Phys. Rev. B, 64, 024413, 2001.
[48] S.H. Yang, M.J. Mehl, D.A. Papaconstantopoulos, and M.B. Scott, J. Phys. Condens. Matter, 14, 1895, 2002.
[49] C.E. Lekka, N. Bernstein, M.J. Mehl, and D.A. Papaconstantopoulos, Appl. Surf. Sci., 219, 158, 2003.
[50] T.B. Massalski (ed.), Binary Alloy Phase Diagrams, American Society for Metals, Metals Park, OH, 1987.
[51] V. Ozoliņš, C. Wolverton, and A. Zunger, Phys. Rev. B, 57, 6427, 1998.
[52] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. Kress, Phys. Rev. B, 63, 224106, 2001.
[53] P.D. Bogdanoff, B. Fultz, and S. Rosenkranz, Phys. Rev. B, 60, 3976, 1999.
[54] S. Katano, M. Iizumi, and Y. Noda, J. Phys. F, 18, 2195, 1988.
[55] R. Courths, M. Lau, T. Scheunemann, H. Gollisch, and R. Feder, Phys. Rev. B, 63, 195110, 2001.
[56] O. Jepsen and O.K. Andersen, Solid State Commun., 9, 1763, 1971.
[57] H. Okamoto and P.A. Beck, Monatsh. Chem., 103, 907, 1972.
[58] M.J. Mehl, J.E. Osburn, D.A. Papaconstantopoulos, and B.M. Klein, Phys. Rev. B, 41, 10311, 1990; erratum: Phys. Rev. B, 42, 5362, 1990.
[59] D.K. Hsu and R.G. Leisure, Phys. Rev. B, 20, 1339, 1979.
[60] J.O. Kim, J.D. Achenbach, P.B. Mirkarimi, M. Shinn, and S.A. Barnett, J. Appl. Phys., 72, 1805, 1992.
[61] W. Weber, Phys. Rev. B, 8, 5082, 1973.
[62] H. Holleck, J. Vac. Sci. Technol. A, 4, 2661, 1986.
[63] A. Gross, M. Scheffler, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. Lett., 82, 1209, 1999.
[64] A. Groß, A. Eichler, J. Hafner, M.J. Mehl, and D.A. Papaconstantopoulos, Surf. Sci., 539, L542, 2003.
[65] A.C. Luntz, M.D. Williams, and D.S. Bethune, J. Chem. Phys., 89, 4381, 1988.
[66] P.D. Nolan, B.R. Lutz, P.L. Tanaka, J.E. Davis, and C.B. Mullins, J. Chem. Phys., 111, 3696, 1999.
[67] D.A. Papaconstantopoulos, M.J. Mehl, S.C. Erwin, and M.R. Pederson, In: P. Turchi, A. Gonis, and L. Colombo (eds.), Tight-Binding Approach to Computational Materials Science, vol. 491, Materials Research Society, Pittsburgh, p. 221, 1998.
[68] N. Bernstein, M.J. Mehl, D.A. Papaconstantopoulos, N.I. Papanicolaou, M.Z. Bazant, and E. Kaxiras, Phys. Rev. B, 62, 4477, 2000; erratum: Phys. Rev. B, 65, 249002(E), 2002.
[69] N. Bernstein, H.J. Gotsis, D.A. Papaconstantopoulos, and M.J. Mehl, submitted to Phys. Rev. B (unpublished).
[70] D.W. Feldman, James H. Parker, Jr., W.J. Choyke, and L. Patrick, Phys. Rev., 173, 787, 1968.
[71] D.A. Papaconstantopoulos and M.J. Mehl, Phys. Rev. B, 64, 172510, 2001.
[72] K.-P. Bohnen, R. Heid, and B. Renker, Phys. Rev. Lett., 86, 5771, 2001.
[73] I.I. Mazin, D.A. Papaconstantopoulos, and D.J. Singh, Phys. Rev. B, 61, 5223, 2000.
[74] M.D. Johannes, I.I. Mazin, D.J. Singh, and D.A. Papaconstantopoulos, cond-mat/0403135; Phys. Rev. Lett., 93, 101802, 2004.
[75] M.D. Johannes, D.A. Papaconstantopoulos, D.J. Singh, and M.J. Mehl, Europhys. Lett., 68, 433, 2004.
[76] E. Kim, C. Chen, T. Köhler, M. Elstner, and T. Frauenheim, Phys. Rev. Lett., 86, 652, 2001.
[77] F. Weich, J. Widany, and T. Frauenheim, Phys. Rev. Lett., 78, 3326, 1997.
[78] P.K. Schelling, N. Yu, and J.W. Halley, Phys. Rev. B, 58, 1279, 1998.
[79] J.W. Halley, Y. Lin, and M. Zhuang, Faraday Discuss., 121, 85, 2002.
[80] B.C. Pan, Phys. Rev. B, 64, 155408, 2001.
1.15 ENVIRONMENT-DEPENDENT TIGHT-BINDING POTENTIAL MODELS

C.Z. Wang and K.M. Ho
Ames Laboratory-U.S. DOE and Department of Physics and Astronomy, Iowa State University, Ames, IA 50011, USA
The use of the tight-binding formalism to parametrize the electronic structures of crystals and molecules has been a subject of continuous interest since the pioneering work of Slater and Koster [1] half a century ago. In the last 15 years, the tight-binding method has attracted even more attention due to the development of tight-binding total energy models that can provide interatomic forces for molecular dynamics simulations of materials [2–7]. The simplicity of the tight-binding description makes the method very promising for large-scale electronic calculations and atomistic simulations [8, 9]. However, studies of complex systems require that the tight-binding parameters should be "transferable" [4], i.e., should be able to describe accurately the electronic structure and total energy of a material in different bonding configurations. Although tight-binding molecular dynamics has been successfully applied to a number of interesting systems such as carbon fullerenes and carbon nanotubes [10–12], the transferability of tight-binding potentials is still a major issue that hinders the widespread application of the method to more materials of current interest. Most of the tight-binding models developed so far are based on the Slater–Koster formalism [1]. The accuracy and transferability of these tight-binding models are limited by the approximations inherent in the Slater–Koster theory. One of the key approximations is the assumption that the hopping integrals are independent of the bonding environment. Experience from first-principles calculations showed that a fixed minimal basis set optimized for a given atomic geometry will not usually give accurate results for the total energies of other atomic geometries. Minimal basis sets need to have the flexibility to adjust to the bonding environment of the atom on which they are based in order to give converged results for different geometries. Since tight-binding is a minimal-basis description of the electronic structure, it must follow that
the tight-binding hopping parameters should be environment dependent. Another major limitation in the Slater–Koster theory is the use of the two-center approximation [1]. Such an approximation can be justified only when the system is governed by strong covalent interactions (e.g., carbon). For systems where metallic bonding effects are significant, contributions beyond pairwise interactions should not be neglected. Furthermore, the Löwdin symmetric orthogonalization procedure [13] used to construct the orthogonal basis set in the Slater–Koster theory may also result in additional structure-dependent contributions to the two-center hopping integrals, because the overlap matrices are different for different structures. Several developments to go beyond the Slater–Koster theory have been undertaken in the last ten years. The environment-dependent tight-binding (EDTB) potential models developed by the authors and co-workers are one such attempt. These potential models incorporate environment-dependent scaling functions not only for the diagonal matrix elements, but also for the off-diagonal hopping integrals and the repulsive energy in the tight-binding total energy expression. These models provide a mechanism for including some of the important effects that have been ignored in the Slater–Koster theory, such as the variation of the local minimal basis set with environment, three-center interactions, and effects due to Löwdin orthogonality. The EDTB models have been demonstrated to describe well the properties of both the lower-coordinated covalent structures as well as the higher-coordinated metallic structures of carbon and silicon [14, 15]. In spite of this progress, the development of EDTB models so far still relies on empirical fitting to the band structures and total energies of some standard structures. The fitting procedure is quite laborious if we want to study a broad range of materials, especially in compound systems where different sets of interactions have to be determined simultaneously from a given set of electronic structures. Moreover, fundamental questions such as how and to what extent the approximations used in the Slater–Koster scheme influence the transferability of the tight-binding models are still not well understood from the empirical fitting approach. Information from first-principles calculations about these issues is highly desirable to guide the development of more accurate and transferable tight-binding models. In general, the overlap and one-electron Hamiltonian matrices from first-principles calculations cannot be used directly to infer the tight-binding parameters, because first-principles calculations are done using a large basis set in order to get convergent results, while tight-binding parameters are based on a minimal basis representation. Recently, the authors and co-workers have developed a method for extracting a minimal basis Hamiltonian starting from ab initio calculation results using large converged basis sets [16–19]. This new development provides a clear way to separate the electronic structure into different component interactions among the atomic orbitals in the system. This
provides the basis for developing a systematic scheme that simplifies the potential generation process, making it much faster and more tractable, especially when we are dealing with compound systems.
1. Fundamentals of Tight-Binding Potential Models
The expression for the binding energy of a system with M atoms and N valence electrons in tight-binding molecular dynamics (TBMD) is given by

E_{binding} = E_{bs} + E_{rep} - E_0.   (1)

The first term on the right-hand side of Eq. (1) is the band structure energy, which is equal to the sum of the one-electron eigenvalues ε_i of the occupied states given by a tight-binding Hamiltonian H_{TB},

E_{bs} = \sum_i f_i\,\varepsilon_i,   (2)
where f_i is the electron occupation (Fermi–Dirac) function and \sum_i f_i = N. The second term on the right-hand side of Eq. (1) is a repulsive energy, usually expressed as a sum of short-ranged pairwise interactions,

E_{rep} = \frac{1}{2} \sum_{i,j} \phi(r_{ij}),   (3)

or a functional of a sum of pairwise interactions,

E_{rep} = \sum_i F\Big[ \sum_j \phi(r_{ij}) \Big],   (4)
where F is a function which, for example, can be a fourth-order polynomial [20]. The term E_0 in Eq. (1) is a constant which represents the sum of the energies of the individual atoms. The founding work on tight-binding Hamiltonians H_{TB} for the electronic structure of solids was done by Slater and Koster in 1954 [1]. Starting with a set of atomic orbitals \phi_{i\alpha}(r - R_i) located on an atom at R_i, they used the Löwdin symmetric orthogonalization scheme [13] to construct a set of orthogonal orbitals \Psi_{i\alpha}(r - R_i):

\Psi_{i\alpha}(r - R_i) = \sum_{j\beta} S^{-1/2}_{i\alpha, j\beta}\,\phi_{j\beta}(r - R_j),   (5)
where S is the overlap matrix of the original atomic orbitals:

S_{i\alpha, j\beta}(R_i - R_j) = \int \phi^*_{i\alpha}(r - R_i)\,\phi_{j\beta}(r - R_j)\,d^3r.   (6)
This set of orthogonal orbitals \Psi_{i\alpha}(r - R_i) has the symmetry properties of the corresponding \phi_{i\alpha} [1]. If the system is a periodic crystal, then the Bloch sum can be used to construct the wave-vector (k-)dependent orbitals

\Psi_{i\alpha, k} = N^{-1/2} \sum_{R_i} \exp(i\,k \cdot R_i)\,\Psi_{i\alpha}(r - R_i),   (7)
where N is the number of unit cells. Let H be the Hamiltonian of the system, which also has the periodicity of the lattice; then the k-dependent Hamiltonian matrix elements can be written as

H_{i\alpha, j\beta}(k) = \sum_{R_j} \exp[i\,k \cdot (R_j - R_i)] \int \Psi^*_{i\alpha}(r - R_i)\,H\,\Psi_{j\beta}(r - R_j)\,d^3r.   (8)
One of the key approximations made by Slater and Koster, which led to the commonly used tight-binding formulation of the Hamiltonian matrix, is the so-called two-center approximation. They assumed that the potential part of the Hamiltonian is a sum of spherical potentials located on the two atoms on which the Löwdin orbitals are located, and the three-center integrals are disregarded. Under this assumption, the integral in Eq. (8) becomes similar to the type that one would expect in a diatomic molecule. The Hamiltonian matrix elements of Eq. (8) now depend only on the form of the Löwdin orbitals \Psi_{i\alpha} and the vector (R_j - R_i). Since the Löwdin orbital \Psi_{i\alpha} has the same symmetry as the corresponding atomic orbital \phi_{i\alpha}, it can be expanded as a sum of functions with well-defined angular momentum with respect to the axis between the two atoms. The labels σ, π, δ are used (in analogy with s, p, d) for the angular momentum quantum numbers m = 0, 1, 2, respectively. For example, if \Psi_{i\alpha} is a p orbital, it can be expanded as a linear combination of pσ and pπ± functions with respect to the axis. For a system containing only s and p orbitals, there are only four types of hopping integrals to be considered: h_{ssσ}, h_{spσ}, h_{ppσ}, and h_{ppπ}. These integrals depend only on the separation r of the two atoms and can be treated as parameters to be determined by fitting to ab initio band structure calculations. Once the hopping integrals are obtained, the TB Hamiltonian matrix can be constructed by linear combination of the hopping integrals using the direction cosines (c_x, c_y, c_z) of the vector (R_j - R_i). Note that when R_j = R_i the integrals in Eq. (8) yield the diagonal matrix elements. The formulations described in this subsection can also be applied to nonperiodic systems (i.e., clusters).
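This direction-cosine construction is easy to make concrete for an s, p basis. The sketch below builds the 4×4 (s, px, py, pz) block between two atoms from the four hopping integrals using the standard Slater–Koster relations; the function and variable names are ours, and the integrals themselves would come from whatever distance scaling is adopted:

    import numpy as np

    def sk_sp_block(hss, hsp, hpps, hppp, dvec):
        # 4x4 (s, px, py, pz) two-center block from the Slater-Koster
        # relations, given the hopping integrals evaluated at |dvec|.
        c = np.asarray(dvec, dtype=float)
        c = c / np.linalg.norm(c)          # direction cosines (cx, cy, cz)
        H = np.empty((4, 4))
        H[0, 0] = hss                      # E_{s,s} = h_sssigma
        H[0, 1:] = c * hsp                 # E_{s,p_a} = c_a h_spsigma
        H[1:, 0] = -c * hsp                # p-s block: odd parity of p
        for a in range(3):
            for b in range(3):
                # E_{p_a,p_b} = c_a c_b h_ppsigma + (delta_ab - c_a c_b) h_pppi
                H[1+a, 1+b] = c[a]*c[b]*hpps + ((a == b) - c[a]*c[b])*hppp
        return H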
2. Environment-Dependent Tight-Binding Potential Models
Since the classic paper of Slater and Koster [1], a great deal of work has been devoted to the tight-binding parameterization of different materials. The Slater–Koster theory has been extended by incorporating continuous distance-dependent scaling functions for the hopping parameters. One such scaling function is the famous 1/r² scaling introduced by Harrison [21] and by Chadi [22]. Subsequently, Goodwin et al. (GSP) [4] showed that the transferability of tight-binding models for silicon can be improved by multiplying an exponential attenuation function onto the simple power-law scaling of the tight-binding parameters and the pairwise repulsion. Although the two-center approximation greatly simplifies the tight-binding parameterization, it limits the accuracy and transferability of the tight-binding models. There is a good deal of evidence pointing to the necessity of going beyond the two-center approximation. Sawada [23], and subsequently Kohyama [24] and Mercer and Chou [6], showed that explicit inclusion of three-center interactions in the repulsive energy gives a better description of the energy–volume curves for silicon and germanium in comparison with the pure two-center model proposed by Goodwin et al. (GSP). Tight-binding models that allow the diagonal matrix elements to depend on the environment of the atoms, developed by Mercer and Chou [25] for silicon and by Cohen et al. [26] and Mehl et al. [7] for metallic elements, showed significant improvements in transferability. Li and Biswas [27] found that it is necessary to allow neighbor-dependent hopping integrals between silicon and hydrogen atoms for a correct description of the properties of an interstitial hydrogen atom in the silicon lattice. Tight-binding potential models that include environment-dependent scaling functions for off-diagonal as well as diagonal matrix elements were developed by Wang and Ho et al. [14, 15].
2.1. Formalism
In the environment-dependent tight-binding potential model of Wang et al., the minimal basis set is taken to be orthogonal. The effects of Löwdin orthogonality, three-center interactions, and the variation of the local basis set with environment are taken into account empirically by renormalizing the interaction strength between atom pairs according to the surrounding atomic configurations. The tight-binding hopping parameters and the repulsive interaction between atoms i and j depend on the environments of atoms i and j through two scaling functions [14]. The first one is a screening function that is designed to weaken the interactions between two atoms when there are
intervening atoms between them. The other is a bond-length scaling function, which scales the interatomic distance (hence the interaction strength) between the two atoms according to their effective coordination numbers. Longer effective bond lengths are assumed for higher-coordinated atoms. Specifically, the hopping parameters and the pairwise repulsive potential for silicon and carbon are expressed as

h(r_{ij}) = \alpha_1\,R_{ij}^{-\alpha_2}\,\exp[-\alpha_3 R_{ij}^{\alpha_4}]\,(1 - S_{ij}),   (9)
In this expression, h(r_{ij}) denotes the possible types of interatomic hopping parameters h_{ssσ}, h_{spσ}, h_{ppσ}, h_{ppπ} and the pairwise repulsive potential φ(r_{ij}) between atoms i and j. r_{ij} is the real distance and R_{ij} is a scaled distance between atoms i and j. S_{ij} is a screening function. The parameters α_1, α_2, α_3, α_4, and the parameters for the bond-length scaling function R_{ij} and the screening function S_{ij}, can be different for the different hopping parameters and for the pairwise repulsive potential. Note that expression (9) reduces to the traditional two-center form if we set R_{ij} = r_{ij} and S_{ij} = 0. The screening function S_{ij} is expressed as a hyperbolic tangent, S_{ij} = tanh(ξ_{ij}), with argument ξ_{ij} given by

\xi_{ij} = \beta_1 \sum_l \exp\Big[ -\beta_2 \Big( \frac{r_{il} + r_{jl}}{r_{ij}} \Big)^{\beta_3} \Big],   (10)
where β_1, β_2, and β_3 are adjustable parameters. Maximum screening occurs when the atom l is situated close to the line connecting the atoms i and j (i.e., when r_{il} + r_{lj} is minimum). This approach allows us to distinguish between first and further neighbor interactions without explicit specification. This is well-suited for molecular dynamics simulations, where it is difficult to define exactly which atoms are first neighbors and which are second neighbors. The bond-length scaling function scales the distance between two atoms according to their effective coordination numbers. Longer effective bond lengths are assumed for higher-coordinated atom pairs, leading to reduced interactions per atom pair for higher-coordinated structures. The scaling between the real and effective interatomic distance is given by R_{ij} = r_{ij}(1 + \delta_1 \Delta + \delta_2 \Delta^2 + \delta_3 \Delta^3), where

\Delta = \frac{1}{2} \Big( \frac{n_i - n_0}{n_0} + \frac{n_j - n_0}{n_0} \Big)   (11)

is the fractional coordination number relative to the coordination number of the diamond structure, n_0, averaged between atoms i and j. The coordination number can be modeled by a smooth function, n_i = \sum_j (1 - S_{ij}), with a proper choice of parameters for S_{ij}, which has the form of the screening function described above.
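A minimal numpy sketch of this machinery, Eqs. (9)–(11), follows. It is an illustration under stated assumptions, not the authors' code: the distance-matrix interface, the parameter names, and the choice to exclude atoms i and j from the screening sum are ours, and the α, β, δ values differ per matrix element (the carbon values are listed in Table 1 of the next subsection):

    import numpy as np

    def screening(dist, i, j, b1, b2, b3):
        # S_ij = tanh(xi_ij), with xi_ij from Eq. (10); dist is an (N, N)
        # interatomic distance matrix, and the sum runs over all atoms l
        # other than i and j.
        n = dist.shape[0]
        mask = np.ones(n, dtype=bool)
        mask[[i, j]] = False
        xi = b1 * np.sum(np.exp(-b2 *
                 ((dist[i, mask] + dist[j, mask]) / dist[i, j])**b3))
        return np.tanh(xi)

    def scaled_distance(rij, ni, nj, n0, d1, d2, d3):
        # R_ij = r_ij (1 + d1*D + d2*D^2 + d3*D^3), with D the fractional
        # coordination of Eq. (11) relative to diamond (n0 = 4).
        D = 0.5 * ((ni - n0) / n0 + (nj - n0) / n0)
        return rij * (1.0 + d1*D + d2*D**2 + d3*D**3)

    def hopping(Rij, Sij, a1, a2, a3, a4):
        # Eq. (9): h = a1 * R^(-a2) * exp(-a3 * R^a4) * (1 - S).
        return a1 * Rij**(-a2) * np.exp(-a3 * Rij**a4) * (1.0 - Sij)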
Besides the hopping parameters, the diagonal matrix elements are also dependent on the bonding environment. The expression for the diagonal matrix elements is

e_{\lambda,i} = e_{\lambda,0} + \sum_j \Delta e_\lambda(r_{ij}),   (12)
where \Delta e_\lambda(r_{ij}) takes the same expression as Eq. (9), and λ denotes the two types of orbitals (s or p). e_{s,0} and e_{p,0} are the on-site energies of a free atom. Finally, the repulsive energy term is expressed as a functional of the sum of pairwise interactions, as defined in Eq. (4) of the previous section. The parameters in the model are determined by fitting to self-consistent first-principles density functional results for the electronic band structures and the cohesive energy versus volume curves of several crystalline structures of different coordination numbers. Such crystalline structures include the diamond, graphite, β-tin, simple cubic, bcc, and fcc structures. In addition, some elastic constants and vibration frequencies of the lowest-energy structures are also included in the fitting, in order to ensure that the model gives a good description of elastic and vibrational properties in addition to electronic structures and binding energies.
2.2. EDTB Potential for Carbon
Carbon is a strongly covalent-bonded material best described by the tight-binding scheme. The two-center tight-binding model for carbon developed by Xu et al. (the XWCH model [20]) gives an accurate description of carbon in the low-coordination diamond, graphite, and linear-chain structures, as shown in Fig. 1. The potential also describes well the structures and energies of carbon fullerenes and carbon nanotubes. Therefore, the two-center XWCH carbon potential is adequate for studying most carbon systems. However, the two-center XWCH model describes the higher-coordinated carbon structures poorly (see Fig. 1). Therefore, it is not suitable for studying carbon structures at high pressure or high compressive stress. In order to correct this deficiency, Tang et al. developed an environment-dependent tight-binding potential for carbon following the formalism described in the previous subsection. The parameters of this potential are given in Tables 1 and 2. The parameters for calculating the coordination number of carbon are β1 = 2.0, β2 = 0.0478, and β3 = 7.16. The cutoff distance for the interaction is rij = 3.3 Å. As shown in Fig. 2, the environment-dependent tight-binding potential model for carbon describes very well the binding energies not only of the covalent (diamond, graphite, and linear chain) structures, but also of the higher-coordinated metallic (bcc, fcc, and simple cubic) structures. The EDTB potential is also more accurate for
Figure 1. The cohesive energies as a function of nearest neighbor distance for carbon in different crystalline structures calculated using the two-center XWCH TB model are compared with the results from the first-principles DFT-LDA calculations. The solid curves are the TB results and the dashed curves are the LDA results [20].
Table 1. The parameters of the EDTB model for carbon. The TB hopping integrals are in units of eV, the interatomic distances in units of Å, and φ is dimensionless.

            α1          α2         α3        α4       β1        β2        β3       δ
h_ssσ      −8.9491     0.8910     0.1580    2.7008   2.0200    0.2274    4.7940   0.0310
h_spσ       8.3183     0.6170     0.1654    2.4692   1.3000    0.2274    4.7940   0.0310
h_ppσ      11.7955     0.7620     0.1624    2.3509   1.0400    0.2274    4.7940   0.0310
h_ppπ      −5.4860     1.2785     0.1383    3.4490   0.2000    8.5000    4.3800   0.0310
φ          30.0000     3.4905     0.00423   6.1270   1.5035    0.205325  4.1625   0.002168
e_s, e_p    0.1995275  0.029681   0.19667   2.2423   0.055034  0.10143   3.09355  0.272375
Table 2. The coefficients (in units of eV) of the polynomial function F(x) for the EDTB potential for carbon.

c0 = 12.201499972
c1 = 0.583770664
c2 = 0.336418901 × 10⁻³
c3 = −0.5334093735 × 10⁻⁴
c4 = 0.7650717197 × 10⁻⁶
elastic constants and phonon frequencies of the diamond and graphite structures than the two-center tight-binding model (Tables 3 and 4). Another example demonstrating the better transferability of the EDTB model over the two-center model in complex simulations is the study of diamond-like amorphous carbon. Diamond-like (or tetrahedral) amorphous carbon consists mostly of sp³-bonded carbon atoms and is produced under high compressive stress, which promotes the formation of sp³ bonds, in contrast to the sp² graphite-like bonds formed under normal conditions [30–32].
Figure 2. The cohesive energies as a function of nearest neighbor distance for carbon in different crystalline structures calculated using the environment-dependent TB model are compared with the results from the first-principles DFT-GGA calculations. The solid curves are the TB results and the dashed curves are the GGA results [14].
Table 3. Elastic constants, phonon frequencies, and Grüneisen parameters of diamond calculated from the XWCH-TB model [20] and the environment-dependent TB (EDTB) model [14], compared with experimental results [28]. Elastic constants are in units of 10¹² dyn/cm² and phonon frequencies in terahertz.

              XWCH    EDTB    Experiment
a (Å)         3.555   3.585   3.567
B             4.56    4.19    4.42
c11 − c12     6.22    9.25    9.51
c44           4.75    5.55    5.76
ν_LTO(Γ)      37.80   41.61   39.90
ν_TA(X)       22.42   25.73   24.20
ν_TO(X)       33.75   32.60   32.0
ν_LA(X)       34.75   36.16   35.5
γ_LTO(Γ)      1.03    0.93    0.96
γ_TA(X)       −0.16   0.30    –
γ_TO(X)       1.10    1.50    –
γ_LA(X)       0.62    0.98    –
Although the two-center XWCH carbon potential can produce the essential topology of the diamond-like amorphous carbon network [33], the comparison with experiment is not fully satisfactory, as one can see from Fig. 3. There are also some discrepancies in ring statistics between the diamond-like amorphous carbon models generated by the two-center tight-binding potential and by ab initio molecular dynamics [34]. Specifically, a small fraction of three- and four-membered rings observed in the ab initio model is absent
Table 4. Elastic constants, phonon frequencies, and Grüneisen parameters of graphite calculated from the XWCH-TB model [20] and the environment-dependent TB (EDTB) model [14], compared with experimental results [29]. Elastic constants are in units of 10¹² dyn/cm² and phonon frequencies in terahertz.

              XWCH    EDTB    Experiment
c11 − c12     8.40    8.94    8.80
E2g2          49.92   48.99   47.46
A2u           29.19   26.07   26.04
γ(E2g2)       2.00    1.73    1.63
γ(A2u)        0.10    0.05    –
Figure 3. Radial distribution functions G(r) of the two tetrahedral amorphous carbon samples (F and G) generated by tight-binding molecular dynamics using the two-center XWCH TB potential (solid curves), compared with the neutron-scattering data of Ref. [32] (dotted curves). The theoretical results have been convoluted with the experimental resolution corresponding to termination of the Fourier transform at the experimental maximum scattering vector Q_max = 16 Å⁻¹ [33].
from the results of the two-center tight-binding model. These subtle deficiencies are corrected when the EDTB potential is used to generate diamond-like amorphous carbon [35]. The radial distribution function of the diamond-like amorphous carbon obtained from the EDTB potential is in much better agreement with experiment, as one can see from Fig. 4. More discussion of the applications of the EDTB carbon potentials will be given in the next section.
Figure 4. Radial distribution function G(r) of the tetrahedral amorphous carbon structure generated by tight-binding molecular dynamics using the environment-dependent TB potential (solid curve), compared with the neutron-scattering data of Ref. [32] (dotted curve). The theoretical result has been convoluted with the experimental resolution corresponding to termination of the Fourier transform at the experimental maximum scattering vector Q_max = 16 Å⁻¹ [35].
2.3. EDTB Potential for Silicon
Although the diamond structure of Si has covalent sp³ bonding configurations, the more highly coordinated metastable structures of Si are metallic, with energies close to that of the diamond structure. Si can therefore become metallic under high pressure or at high temperature; for example, the coordination of the liquid phase of Si (about 6.5) is close to that of the metallic structures. These properties pose a challenge for accurate tight-binding
modeling of Si: it is difficult to describe both the low-coordinated covalent structures and the high-coordinated metallic structures with good accuracy using one set of tight-binding parameters. With the environment-dependent tight-binding formalism, Wang et al. showed that this difficulty can be overcome [15]. The EDTB Si potential they developed gives an excellent fit to the energy versus interatomic distance of various silicon crystalline structures with different coordinations, as shown in Fig. 5. The EDTB Si potential also describes well the structures and energies of Si surfaces, in addition to bulk properties such as elastic constants and phonon frequencies; these results can be seen in Tables 5 and 6. The parameters of the EDTB Si potential are listed in Tables 7 and 8. The parameters for calculating the coordination number of Si are β1 = 2.0, β2 = 0.02895, and β3 = 7.96284, and the cutoff distance for the interaction is r_ij = 5.2 Å. A useful benchmark for Si interatomic potentials is a series of model structures for the Σ = 13 {510} symmetric tilt grain boundary in Si [37]. Eight different structures, as indicated on the horizontal axis of Fig. 6, were selected for the calculations; these structures were not included in the database used to fit the parameters. The structures were relaxed by the steepest-descent
Figure 5. The cohesive energies as a function of nearest neighbor distance for silicon in different crystalline structures calculated using the environment-dependent TB model are compared with the results from the first-principles DFT-LDA calculations. The solid curves are the TB results and the dashed curves are the LDA results [15].
Table 5. Elastic constants and phonon frequencies of silicon in the diamond structure calculated from the two-center TB model [36] and the environment-dependent TB (EDTB) model [15], compared with experimental results [28]. Elastic constants are in units of 10¹² dyn/cm² and phonon frequencies in terahertz.

              Two-center TB   EDTB    Experiment
a (Å)         –               5.450   5.430
B             0.876           0.90    0.978
c11 − c12     0.939           0.993   1.012
c44           0.890           0.716   0.796
ν_LTO(Γ)      21.50           16.20   15.53
ν_TA(X)       5.59            5.00    4.49
ν_TO(X)       20.04           12.80   13.90
ν_LA(X)       14.08           11.50   12.32
Table 6. Surface energies of the silicon (100) and (111) surfaces from the EDTB Si potential [15]. ΔE is the energy relative to that of the (1×1)-ideal surface. The energies are in units of eV/(1×1).

Structure                 Surface energy    ΔE
Si(100)
  (1×1)-ideal             2.292             0.0
  (2×1)                   1.153             −1.139
  p(2×2)                  1.143             −1.149
  c(4×2)                  1.148             −1.144
Si(111)
  (1×1)-ideal             1.458             0.0
  (1×1)-relaxed           1.435             −0.025
  (1×1)-faulted           1.495             0.037
  √3×√3−t4                1.213             −0.245
  √3×√3−h3                1.346             −0.112
  (2×1)-Haneman           1.188             −0.270
  (2×1)-π-bonded chain    1.138             −0.320
  (7×7)-DAS               1.099             −0.359
Table 7. The parameters obtained from the fitting for the EDTB model of Si [15]. α1 is in units of eV; the other parameters are dimensionless.

            α1        α2        α3        α4      β1      β2       β3      δ1        δ2         δ3
h_ssσ      −5.9974   0.4612    0.1040    2.3000  4.4864  0.1213   6.0817  0.0891    0.0494     −0.0252
h_spσ       3.4834   0.0082    0.1146    1.8042  2.4750  0.1213   6.0817  0.1735    0.0494     −0.0252
h_ppσ      11.1023   0.7984    0.1800    1.4500  1.1360  0.1213   6.0817  0.0609    0.0494     −0.0252
h_ppπ      −3.6014   1.3400    0.0500    2.2220  0.1000  0.1213   6.0817  0.4671    0.0494     −0.0252
φ         126.640    5.3600    0.7641    0.4536  37.00   0.56995  19.30   0.082661  −0.023572  0.006036
e_s, e_p    0.2830   0.1601    0.050686  2.1293  7.3076  0.07967  7.1364  0.7338    −0.03953   −0.062172
Table 8. The coefficients of the polynomial function f(x) for the EDTB potential of Si.

           c0 (eV)          c1        c2 (eV⁻¹)   c3 (eV⁻²)   c4 (eV⁻³)
x ≥ 0.7    −0.739 × 10⁻⁶    0.96411   0.68061     −0.20893    0.02183
x < 0.7    −1.8664          6.3841    −3.3888     0.0         0.0
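For concreteness, the two branches of f(x) in Table 8 transcribe directly into code; the sketch below is ours, with the coefficients copied verbatim from the table.

```python
def f_si(x):
    """Piecewise quartic f(x) for the EDTB Si model (coefficients from Table 8)."""
    if x >= 0.7:
        c = (-0.739e-6, 0.96411, 0.68061, -0.20893, 0.02183)
    else:
        c = (-1.8664, 6.3841, -3.3888, 0.0, 0.0)
    return c[0] + x * (c[1] + x * (c[2] + x * (c[3] + x * c[4])))  # Horner form
```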
Figure 6. Energies of the Σ = 13 {510} symmetric tilt grain boundary structures in Si. Eight different structures, as indicated on the horizontal axis, were selected for the calculations. The energies are relative to that of structure M, which has been identified by experiment. The energies obtained using the EDTB Si potential are compared with results from ab initio calculations, from two-center Si tight-binding potentials [36], and from classical potential calculations (classical I [38] and classical II [39]). The results of EDTB, ab initio, and classical I are taken from Ref. [37].
method until the forces on each atom were less than 0.01 eV/Å. The energies obtained using the EDTB Si potential are compared with results from ab initio calculations, two-center Si tight-binding potentials [36], and classical potential calculations, as shown in Fig. 6. The energy differences between the structures predicted by the EDTB calculations agree very well with those from the ab initio calculations. In contrast, the two-center tight-binding potentials and classical potentials do not give the correct results compared with the ab initio and environment-dependent tight-binding calculations, even though the atoms in all of the structures are four-fold coordinated.
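The relaxation protocol quoted above (steepest descent until all atomic forces drop below 0.01 eV/Å) is straightforward to sketch; `forces` stands in for the EDTB force routine, and the step size is an arbitrary choice of this illustration.

```python
import numpy as np

def relax_steepest_descent(positions, forces, step=0.01, fmax=0.01,
                           max_steps=100000):
    """Displace each atom along its force until max |F_i| < fmax (eV/A)."""
    for _ in range(max_steps):
        f = forces(positions)                    # shape (n_atoms, 3)
        if np.max(np.linalg.norm(f, axis=1)) < fmax:
            break                                # converged
        positions = positions + step * f         # move downhill in energy
    return positions
```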
Figure 7. Structure of Si13 cluster predicted by (a) classical potential, (b) two-center GSP TB potential, (c) ab initio Car–Parrinello method, and (d) EDTB. The formation energies listed under the structures are calculated using first-principles DFT-LDA. See the text for more details.
Another example of the predictive power of the EDTB silicon potential is the prediction of the ground-state structures of Si clusters. The structure of Si13 has been the subject of much debate [40–44, 46]. Si13 is special because ionized Si13+ clusters are observed to have very low chemical reactivity compared with other clusters in the range Si11 to Si20 [47–49]. Based on theoretical calculations using a classical force-field model, Chelikowsky et al. proposed an icosahedral structure with an atom at the center of an icosahedral cage (Fig. 7a) [40]. The argument in favor of this structure is that it is highly symmetric and appears to be chemically less reactive [43]. On the
other hand, tight-binding calculations using the two-center GSP tight-binding model favor a structure with an icosahedral cage plus an atom attached from the outside (Fig. 7b). However, ab initio calculations showed that both structures are energetically very unfavorable [42, 44, 45]. In 1992, Rothlisberger et al. found that a structure with C3v symmetry (Fig. 7c) has a much lower energy than the icosahedral structure [42]. This C3v isomer was regarded as the ground-state structure of the Si13 cluster for years, until a new lowest-energy structure was revealed by calculations using the environment-dependent Si tight-binding potential. Combining tight-binding molecular dynamics with a genetic algorithm for structural search, we obtained a new structure with Cs symmetry whose energy is even lower than that of the C3v isomer [50]. This new structure is shown in Fig. 7d. More TBMD/GA applications will be discussed in the next section.
2.4. EDTB Potential for Molybdenum
Since the EDTB model describes the metallic phases of carbon and silicon accurately, the same formalism can also be applied to metallic elements. Here we use molybdenum, a bcc transition metal, as an example. We use an orthogonal minimal basis set of one s, three p, and five d atomic orbitals to construct the TB Hamiltonian. This choice of basis gives 10 independent hopping parameters $h_{ll'm}$ and three intra-atomic matrix elements ($\epsilon_s$, $\epsilon_p$, and $\epsilon_d$). The environment dependence of the three on-site energies of atom i is taken into account through

$\epsilon_d = \Delta_d^0 + \sum_j \Delta_d(r_{ij})$, $\epsilon_s = \epsilon_d + \Delta_{s-d}^0 + \sum_j \Delta_{s-d}(r_{ij})$, and $\epsilon_p = \epsilon_d + \Delta_{p-d}^0 + \sum_j \Delta_{p-d}(r_{ij})$.

Here the quantities with superscript 0 denote the environment-independent parts, and the $\Delta_l$ are the environment-dependent contributions. We adopt the same functional form for the distance dependence of $h_{ll'm}$, $\Delta_l$, and $\phi(r_{ij})$, namely

$f(r_{ij}) = \alpha_1 \exp(-\alpha_2 R_{ij})\,(1 - S_{ij})$,   (13)

where $S_{ij}$ is the screening function discussed for the EDTB models of carbon and Si (see Eq. 10), and $R_{ij}$ is the bond-length scaling function defined in Eq. (11). The coordination numbers are calculated using the same parameters as for carbon, but the coordination of the bcc structure is chosen as $n_0$ when calculating the relative coordination $\Delta$. Only the linear term in $\Delta$ in Eq. (11) is considered for the bond-length scaling of Mo, and two parameters $\delta_1$, one for $h_{ll'm}$ and $\Delta_l$ and another for $\phi(r_{ij})$, are introduced to describe the bond-length scaling. Using this bond-length scaling considerably improves the TB total energies of the sc and A15 structures compared with our previous model without bond-length scaling [52].
To further reduce the number of parameters in the fitting, we impose the universal ratios $h_{pd\sigma} : h_{pd\pi} = -\sqrt{3} : 1$ and $h_{dd\sigma} : h_{dd\pi} : h_{dd\delta} = (-6) : 4 : (-1)$ on the pre-exponential factors, as in the previous model [52]. (The fitted $\alpha_1$ values in Table 9, $-4.28084 : 2.85390 : -0.71347$ for $h_{dd\sigma}$, $h_{dd\pi}$, and $h_{dd\delta}$, indeed satisfy the $(-6):4:(-1)$ ratio.) Furthermore, we found during the fitting that $h_{pp\pi}$ is small and can be neglected without noticeably changing the results. Altogether, the final model contains 55 parameters. The data used in the fitting are: ab initio band structures along high-symmetry lines of the Brillouin zone; the total energy of Mo in various crystal structures (sc, bcc, fcc) for a range of lattice parameters around the respective equilibrium values; the experimental phonon frequencies at the points N, H, and P of the Brillouin zone; the experimental elastic constant $C_{44}$; the unrelaxed vacancy formation energy obtained by the mixed-basis pseudopotential (MBPP) method; and the unrelaxed (100) surface energy obtained by the MBPP method. The optimized parameters are listed in Table 9; the cutoff distance for the interactions is 8.5 a.u. The total-energy curves for the sc, bcc, and fcc structures of Mo, as well as those of the hcp and A15 structures, which were not included in the fit, are shown in Fig. 8. Figure 9 shows the fit to the band structure of bcc Mo. Table 10 presents the TB results for bcc Mo for the equilibrium lattice constant $a_0$; the elastic constants $C_{11}$, $C_{12}$, and $C_{44}$; the vacancy formation energy $E_V^f$ for a relaxed supercell with 54 sites; and the formation energy $E_I^f$ of an octahedral interstitial atom in a relaxed supercell with 16 regular lattice sites, in comparison with experimental results and with results from the MBPP approach. The quantities $a_0$ and $C_{44}$ were included in the fit, so their comparison simply tests the quality of the fit. In contrast, the results for $C_{11}$, $C_{12}$, for $E_V^f$ in the relaxed supercell, and for $E_I^f$ are predictions of the TB model, which
Table 9. The parameters of the EDTB model for Mo; α1 is in units of Ry.

            α1          α2        β1       β2       β3       δ1
h_ssσ      −1.11973    0.72213   1.02360  0.85483  1.75553  0.08233
h_ppσ       0.23682    0.29692   0.94862  0.71027  2.06777  0.08233
h_ppπ       0.00000    0.00000   0.00000  0.00000  0.00000  0.00000
h_ddσ      −4.28084    0.86823   0.61901  1.25923  1.53080  0.08233
h_ddπ       2.85390    0.86823   0.61901  1.25923  1.53080  0.08233
h_ddδ      −0.71347    0.86823   0.61901  1.25923  1.53080  0.08233
h_spσ       0.21103    0.35923   0.63336  0.11474  3.02617  0.08233
h_sdσ      −0.35016    0.61218   6.07645  0.74477  3.52520  0.08233
h_pdσ      −4.97982    0.89071   3.56861  0.58919  3.12674  0.08233
h_pdπ       2.87510    0.89071   3.56861  0.58919  3.12674  0.08233
Δ_s−d       2.56385    0.89192   1.30193  0.91911  1.95662  0.08233
Δ_p−d       2.05237   −0.83267   3.39410  0.96099  1.85273  0.08233
Δ_d        −0.02691   −0.72297   0.97189  0.96909  2.35201  0.08233
φ         344.66341    1.31611   2.91007  0.94413  1.77243  0.06595

Δ⁰_s−d = −0.20387 Ry, Δ⁰_p−d = 0.25666 Ry, Δ⁰_d = 0.09414 Ry.
Figure 8. Total energy versus volume for the sc, bcc, fcc, hcp, and A15 structures of Mo. The dashed and full lines represent the EDTB and ab initio data obtained by the linear-muffin-tin-orbital method in the atomic-sphere approximation (LMTO-ASA), respectively.
Figure 9. The EDTB band structure (dots) of bcc Mo at a0 = 5.8 a.u. in comparison with the ab initio LMTO-ASA band structure (solid lines).
Table 10. Results of the TB model for the equilibrium lattice constant a0, the elastic constants C11, C12, and C44, the vacancy formation energy E_V^f, and the formation energy E_I^f of an octahedral interstitial atom in a relaxed supercell containing 16 sites, in comparison with results from ab initio MBPP calculations and experimental data.

              a0 (a.u.)    C11 (Mbar)    C12 (Mbar)    C44 (Mbar)    E_V^f (eV)   E_I^f (eV)
TB            5.935        4.75 ± 0.10   1.45 ± 0.10   0.99 ± 0.04   2.95         10.55
MBPP          5.926        –             –             –             2.90 ± 0.1   9.54
Experiment    5.945 [54]   4.50 [55]     1.73 [55]     1.25 [55]     2.9 [56]     –
Figure 10. Comparison of the phonon frequencies in bcc Mo from the EDTB method (frozen-phonon calculation, full lines) and from inelastic neutron scattering [57] at T = 296 K (dots).
agree rather well with the data from experiment ($C_{11}$, $C_{12}$, $E_V^f$) and/or the MBPP calculation ($E_V^f$, $E_I^f$). The phonon dispersion curves calculated using the EDTB potential, shown in Fig. 10, also compare well with experiment. The TB model also describes well the structure, energy, and electronic properties of the Mo(001) surface reconstruction [58]. These benchmark results suggest that this tight-binding potential is accurate and suitable for molecular dynamics simulations of Mo in a variety of environments.
Using this environment-dependent tight-binding model, we have studied the core structure of the $a_0/2\,[111]$ screw dislocation in bcc molybdenum at T = 0 [59]. We carried out energy minimizations with two initial core structures: one generated by continuum theory, which has no polarity, and another fully relaxed by the Finnis–Sinclair potential, which has finite polarity (see Fig. 11, upper panel). In both cases the atoms relax to the same configuration under the tight-binding potential, a zero-polarity core structure whose differential displacement (DD) map is plotted in the lower panel of Fig. 11. The results predicted by the EDTB model are consistent with ab initio calculations, while all classical potentials favor structures with finite polarity.
3. More Applications of EDTB Potentials

3.1. Coupling TBMD with Genetic Algorithm for Structural Optimization
Global structure optimization for atomic clusters is a challenging problem. A widely used method is simulated annealing by molecular dynamics or Monte Carlo simulation. However, simulated annealing works well only for small clusters (typically fewer than 20 atoms). Because the number of metastable structures grows rapidly with the number of atoms, the time needed for a cluster of more than 20 atoms to reach its ground-state structure by simulated annealing is usually beyond the capability of present computers. An alternative optimization strategy inspired by the Darwinian evolution process [60], the genetic algorithm (GA), has been developed for atomistic structure optimization by Deaven and Ho [61, 62]. Coupling accurate TBMD simulations with the GA leads to an efficient TBMD/GA scheme for searching for candidate geometries of atomic clusters. In the TBMD/GA scheme, we start with a population of randomly generated structures. TBMD is used to relax each structure to the nearest local energy minimum. Using the energies of the relaxed structures as the fitness criterion, a fraction of the population (usually 10–20 different structures) is kept in a candidate pool. The next generation of candidates is then generated by a "cut-and-paste" mating operation [50, 61, 62] on parent structures selected from the candidate pool. When the structures of the new generation have been relaxed, the candidate pool is updated according to the fitness criterion. This procedure is repeated until the candidate pool has "converged", i.e., no new low-energy structures can be found within a reasonable computational time (a schematic of this loop is sketched below). When the TBMD/GA optimization is done, the candidates remaining in the pool can be further evaluated by ab initio calculations to determine the ground-state structure.
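The scheme just described maps onto a short optimization loop. The sketch below is a schematic rendering, not the production code of Refs. [50, 61, 62]; `random_structure`, `relax`, `energy`, and `mate` stand in for the random generator, the TBMD local minimizer, the TB total energy, and the cut-and-paste mating operator.

```python
import random

def tbmd_ga(random_structure, relax, energy, mate,
            pool_size=16, n_generations=200):
    """Genetic-algorithm search: keep a pool of relaxed candidates ranked
    by energy, breed new candidates by mating, and retain the fittest."""
    pool = sorted((relax(random_structure()) for _ in range(pool_size)),
                  key=energy)
    for _ in range(n_generations):
        parent1, parent2 = random.sample(pool, 2)
        child = relax(mate(parent1, parent2))   # cut-and-paste + TBMD relax
        pool.append(child)
        pool.sort(key=energy)                   # fitness = relaxed TB energy
        pool = pool[:pool_size]                 # discard the least fit
    return pool                                 # candidates for ab initio checks
```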
Figure 11. (Upper panel) DD map of the Mo screw dislocation using the Finnis–Sinclair potential. (Lower panel) DD map of the Mo screw dislocation using the environment-dependent tight-binding potential.
Deaven and Ho first applied this algorithm to optimize the geometry of carbon clusters up to C60 [61] using the XWCH carbon tight-binding potential. In all cases studied, the algorithm succeeded in finding the ground-state structure starting from an unbiased population of random atomic structures. This performance is impressive because the strong directional bonds in carbon systems produce large energy barriers between different isomers. Although there have been many attempts to generate the C60 buckyball structure by atomistic simulated annealing, none has yielded the ground-state structure [63, 64]. The genetic algorithm dramatically outperforms simulated annealing and arrives at the lowest-energy structure of C60 (the icosahedral buckminsterfullerene cage) in a relatively short simulation time, as one can see from Fig. 12. Ho et al. have also applied the TBMD/GA scheme to determine the structures of medium-sized silicon clusters Sin (n = 11–20) [50, 51]. Owing to the complexity of their bonding, the structures of Sin clusters with n ≥ 11 had been an outstanding challenge; the structural optimizations require an accuracy of the interatomic potential that empirical classical potentials cannot provide. Using the accurate environment-dependent silicon
Figure 12. Generation of the C60 molecule, starting from random coordinates, using the genetic algorithm with four candidates (p = 4). The energy per atom is plotted for the lowest-energy (solid line) and highest-energy (dashed line) candidate structures as a function of MD time steps. Mating operations among the four candidates are performed every 30 MD time steps [61].
tight-binding potential developed by Wang et al. [15] and the efficient GA described above, Ho et al. were able to locate candidate structures for the medium-sized silicon clusters Sin (n = 11–20) [50, 51]. The structures obtained by the TBMD/GA search were further studied by ab initio calculations, and the ground-state structures of the clusters were determined. The results are shown in Fig. 13. The properties of the clusters are in excellent agreement with experimental measurements [50, 51, 65, 66].
Figure 13. Structures of silicon clusters Sin (n = 11–20) obtained from TBMD/GA global optimization and DFT-GGA calculations. The binding energies (per atom) below the structures are from DFT-GGA calculations. A larger binding energy indicates a more stable isomer [51].
The TBMD/GA approach has also been applied recently to investigate the low-energy structures of hydrogenated silicon clusters (Si7H2m (m = 1–7) and Si8H8) [67, 68] and the structure of the silicon "magic" cluster on the Si(111)-(7×7) surface [69]. These studies demonstrate that the combination of TBMD with GA is a powerful tool for global structure optimization of atomic clusters and surfaces.
3.2. Inclusion of Electronic Entropy Effects in TBMD
At high temperatures, electronic entropy plays a significant role in molecular dynamics simulations. Since the electronic structure is calculated explicitly in the tight-binding potential model, the effects of electronic entropy can be taken into account properly in TBMD simulations. At a finite electronic temperature $T_{el}$, the Fermi–Dirac (FD) distribution can be used to describe the occupation of the electronic states in the energy and force calculations:

$E_{TB} = 2 \sum_i \varepsilon_i f_i$,   (14)

$\mathbf{F}_l = -2 \sum_i \left\langle \psi_i \left| \frac{\partial H_{TB}}{\partial \mathbf{R}_l} \right| \psi_i \right\rangle f_i \;-\; 2 \sum_i \varepsilon_i \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}$   (15)

where

$f_i = \frac{1}{e^{(\varepsilon_i - \mu)/k_B T_{el}} + 1}$.   (16)

The chemical potential µ is adjusted at every time step to guarantee conservation of the total number of electrons $N_{el}$:

$2 \sum_i f_i = N_{el}$.   (17)
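In practice, Eqs. (14), (16), and (17) amount to a small numerical routine at each MD step: bisect on the chemical potential until the occupations sum to $N_{el}$, then accumulate the band energy and the entropy of Eq. (19) below. A hedged sketch (kT here denotes $k_B T_{el}$ in eV; all function names are ours):

```python
import numpy as np

def fermi_dirac(eps, mu, kT):
    """Eq. (16); the exponent is clipped to avoid floating-point overflow."""
    x = np.clip((eps - mu) / kT, -500.0, 500.0)
    return 1.0 / (np.exp(x) + 1.0)

def chemical_potential(eps, n_el, kT, tol=1e-12):
    """Bisection for mu so that 2*sum_i f_i = N_el (Eq. (17))."""
    lo, hi = eps.min() - 20.0 * kT, eps.max() + 20.0 * kT
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if 2.0 * fermi_dirac(eps, mu, kT).sum() > n_el:
            hi = mu      # too many electrons: lower the chemical potential
        else:
            lo = mu
    return 0.5 * (lo + hi)

def band_energy_and_entropy_term(eps, n_el, kT):
    """Returns E_TB of Eq. (14) and T_el*S_el from Eq. (19), both in eV."""
    mu = chemical_potential(eps, n_el, kT)
    f = fermi_dirac(eps, mu, kT)
    e_tb = 2.0 * np.sum(eps * f)
    f = np.clip(f, 1e-30, 1.0 - 1e-16)       # guard the logarithms
    t_s_el = -2.0 * kT * np.sum(f * np.log(f) + (1.0 - f) * np.log(1.0 - f))
    return e_tb, t_s_el
```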
It was pointed out by Pederson and Jackson [70] that the second term in Eq. (15) is very difficult to calculate in first-principles molecular dynamics simulations. However, Wentzcovitch et al. [71] introduced the Mermin free energy [72]:

$G = E_{total} + K_I - T_{el} S_{el}$,   (18)

$S_{el} = -2 k_B \sum_i \left[\, f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \,\right]$   (19)
and showed that MD simulations conserve the free energy G if one drops the second term in Eq. (15). It can also be shown analytically that only the first term in Eq. (15) is required if the Hellmann–Feynman forces are calculated using
the electronic free energy instead of the electronic energy $E_{TB}$ [73]. The second term in Eq. (15) is canceled by the derivative of the electronic entropy $S_{el}$:

$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = T_{el} \sum_i \frac{\partial S_{el}}{\partial f_i} \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}$   (20)

Equation (20) can be rewritten as

$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = -2 k_B T_{el} \sum_i \frac{\partial \left[\, f_i \ln f_i + (1 - f_i) \ln(1 - f_i) \,\right]}{\partial f_i} \times \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}$   (21)

and, after some simple algebra, Eq. (20) becomes

$\frac{\partial(T_{el} S_{el})}{\partial \mathbf{R}_l} = 2 \sum_i \varepsilon_i \cdot \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l}$   (22)

Here, conservation of the total number of electrons is assumed:

$\sum_i \frac{\partial f_i}{\partial(\varepsilon_i - \mu)} \cdot \frac{\partial(\varepsilon_i - \mu)}{\partial \mathbf{R}_l} = \frac{\partial}{\partial \mathbf{R}_l} \sum_i f_i = \frac{1}{2} \frac{\partial N_{el}}{\partial \mathbf{R}_l} = 0$.   (23)
Thus, the second term in Eq. (15) is canceled by the derivative of the electronic entropy, and the first term is $-\partial(E_{TB} - T_{el} S_{el})/\partial \mathbf{R}_l$. The inclusion of electronic temperature effects not only avoids the instability caused by changes in the occupancies of states near the Fermi level in metallic systems, but also incorporates the effects of electronic entropy into the calculation in a very convenient manner. Explicit inclusion of the electronic entropy in TBMD simulations allows us to investigate the behavior of the system when the electrons are highly excited, for example by ultra-fast laser pulses in laser-ablation experiments. We have taken advantage of this approach and performed TBMD simulations to study structural changes on diamond (111) surfaces under laser irradiation [74]. The simulation results, shown in Figs. 14 and 15, indicate that laser-treated diamond surfaces behave differently depending on the duration of the laser pulses. Under nanosecond or longer pulses, the diamond (111) surface graphitizes via the formation of graphite–diamond interfaces driven by thermal fluctuations, leading to a mixed graphite–diamond surface after the laser treatment (see Fig. 14). With femtosecond laser pulses, graphitization of the surface is a nonthermal process that occurs in a layer-by-layer fashion, resulting in a clean diamond surface after the process (see Fig. 15). These results provide a microscopic explanation of experimentally observed differences in the laser ablation of diamond surfaces [74]. With femtosecond pulses there is efficient removal of material and the surface retains a diamond Raman
Figure 14. Graphitization of the diamond (111) surface via the thermal process. The snapshots are taken from a tight-binding molecular dynamics simulation in which the electrons and the ions are thermally equilibrated at 2700 K. The plots show the side view of the simulation unit cell, a 12-layer slab with two (111) surfaces (the top and bottom layers). In-plane periodic boundary conditions are imposed. Graphitization is found to occur through the formation of graphite–diamond interfaces (see (d)). The whole process takes about 3 ps [74].
signal after ablation, whereas with nanosecond pulses the Raman signal characteristic of the diamond lattice disappears after ablation. In these simulations, the electronic entropy plays an important role in driving the graphitization under femtosecond laser pulses. As shown in Fig. 16, the free energies (including the electronic entropy) calculated along the diamond-to-graphite transition pathway under the two different laser pulses (electron temperatures of 2700 K and 15 000 K represent the system under nanosecond and femtosecond laser pulses, respectively; see Ref. [74] for details) are very different. The free-energy barrier at the higher electron temperature of 15 000 K is much
Figure 15. Graphitization of the diamond (111) surface due to the effects of a hot electron plasma (nonthermal process). The snapshots are taken from a tight-binding molecular dynamics simulation in which the electronic temperature is raised to 15 000 K while the ions evolve freely. The orientation of the simulation unit cell is the same as in Fig. 14. Note that graphitization takes place in a layer-by-layer fashion; the slab is completely graphitized within 500 fs of simulation time [74].
Figure 16. The free energy (potential energy plus the contribution of the electronic entropy) as a function of inter-layer distance along the diamond to rhombohedral-graphite transition path at three given intra-layer lattice constants. (a) and (b) show the results at electron temperatures of 2700 K and 15 000 K, respectively. The intra-layer lattice constants correspond to diamond-structure lattice constants of 3.50 Å (open circles, dotted lines), 3.58 Å (filled circles, solid lines), and 3.66 Å (open squares, dashed lines) [74].
smaller than that at low electronic temperatures. This will make the graphitization transition much easier at high electronic temperatures. It is also interesting to note that the free energy of the graphite phase is much lower than that of the diamond phase (by about 0.3 eV/atom) at high electronic temperatures. The free energy gain due to initial graphitization caused by femtosecond laser pulses will help the system complete the transition very quickly.
3.3. TBMD Simulation of Nanodiamond to Carbon Cage Transformation
Nanometer-sized diamonds have been found in interstellar dust [75], solid detonation products [76], and diamond-like films [77]. Recently, Raty et al. [78] performed ab initio calculations and TBMD simulations to study the
structure of a nanodiamond and found that the carbon nanoparticle consists of a diamond core and a reconstructed fullerene-like surface. Experiments have shown that diamond nanoparticles of diameter ∼5 nm can be transformed into spherical and polyhedral carbon onions at high temperatures [79–81]. Using the environment-dependent carbon tight-binding potential developed by Tang et al. [14], Lee et al. have recently performed tight-binding molecular dynamics simulations to study the structural transformation of a nanodiamond at high temperature. The simulations show that upon annealing up to 2500 K, a 1.4-nm-diameter nanodiamond transforms into a cage structure resembling a single-walled capped nanotube [82]. The simulation was performed on a bulk-terminated carbon cluster of 275 atoms within a sphere of diameter 1.4 nm cut from bulk diamond. This cluster was relaxed using the steepest-descent method with the environment-dependent tight-binding carbon potential. The relaxed cluster consists of a diamond core and a fullerene-like reconstructed surface, similar to the structure obtained by ab initio calculation [78]. Starting from this relaxed geometry, TBMD simulations were performed to investigate the structural transformation of the nanodiamond at high temperatures. Snapshots of the system during its transformation from a nanodiamond into a capped nanotube are shown in Fig. 17. The nanodiamond cluster was heated to about 2500 K by constant-temperature molecular dynamics. Near 2500 K, as shown in Fig. 17(b), the (111) surface layer of the nanodiamond begins to graphitize; after a simulation time of 3 ps, exfoliation of the graphitized (111) layer occurs by the breaking of bonds between graphene fragments and the "core" atoms underneath. This resembles the graphitization of the (111) surface of bulk diamond induced by nanosecond laser pulses, as discussed in the previous subsection [74]. As the simulation continues, the graphitized layer evaporates, breaking down into carbon dimers one by one from the end of the layer. Similarly, the (1̄1̄1̄) surface layer, consisting of three pentagons, undergoes the same exfoliation and evaporation process as the (111) surface. At about 18 ps, as shown in Fig. 17(c), the graphitization extends over the entire cluster surface. The "core" and the surface "shell" start to separate at the bottom of the cluster. As the bonds between "core" and "shell" atoms break up, the cluster begins to inflate like a bubble. At this stage, if the thermostat were maintained at a constant temperature of 2500 K, the whole cluster would evaporate completely within a simulation time of 45 ps. To prevent full vaporization, the system was cooled from 2500 K to 2000 K over 10 ps (between 25 and 35 ps of simulation time). The cluster was then further cooled to ∼1500 K over 20 ps, at which point a stable cage structure formed. Two holes ("H1" and "H2" in Fig. 17d), generated by the successive breaking of bonds among surface atoms, play an important role in pumping inner carbon atoms out onto the surface to
Figure 17. Atomic processes in the structural transformation of a nanodiamond to a capped nanotube by successive annealing. (a) 0 K (time t = 0 ps), (b) ∼2500 K (t ≈ 3 ps), (c) ∼2500 K (t ≈ 19 ps), (d) ∼2100 K (t ≈ 35 ps), (e) ∼1900 K (t ≈ 50 ps), (f) ∼20 K (t ≈ 120 ps). Simulated annealing with temperatures up to 3000 K is performed during the process (e)→(f). White indicates atoms and bonds with two-fold or lower coordination; red and yellow indicate three-fold and four-fold coordination, respectively; green indicates five-fold and higher coordination. Note the two holes H1 and H2 created in (d) [82].
form the graphitic layer. This process is referred to by Lee et al. as the "flow-out" mechanism. During the annealing process (stages (e)–(f) in Fig. 17), two other interesting atomic processes, the "direct absorption" and "push-out" mechanisms, were also identified from the simulations; they play a crucial role in converting the residual inner carbon atoms of the nanodiamond into surface atoms of the nanotube [82].
4. Recent Developments and Future Perspective
In order to develop more accurate and transferable tight-binding models for large-scale atomistic simulations, it is important to understand quantitatively the errors introduced by the various approximations used in tight-binding models. In this regard, detailed information from first-principles calculations is very useful. In general, the overlap and one-electron Hamiltonian matrices from first-principles calculations cannot be used directly to infer tight-binding parameters, because fully converged first-principles calculations are done using a large basis set while tight-binding parameters are based on a minimal-basis representation. Very recently, the authors and co-workers have developed a method for projecting a set of chemically deformed atomic minimal-basis-set orbitals from accurate first-principles wavefunctions [16–18]. These orbitals, referred to as "quasi-atomic minimal-basis-set orbitals" (QUAMBOs), are highly localized on the atoms and exhibit shapes close to the orbitals of the isolated atom. Moreover, the QUAMBOs span exactly the same occupied subspace as the original first-principles calculation with a large basis set. Therefore, accurate tight-binding Hamiltonian and overlap matrix elements can be obtained directly from ab initio calculations through the construction of QUAMBOs. This development enables us to examine the accuracy and transferability of tight-binding models from a first-principles perspective.

The key step in constructing the QUAMBOs is the selection of a small subset of unoccupied orbitals from the entire virtual space that overlap maximally with the atomic orbitals of interest. For simplicity, let us assume that we are dealing with a nonperiodic system (i.e., a cluster); generalization to periodic systems involves Bloch sums and is straightforward [18]. Suppose that a set of occupied valence orbitals $\phi_n$ ($n = 1, 2, \ldots, N_{val}$) and virtual orbitals $\phi_v$ ($v = N_{val}+1, N_{val}+2, \ldots, N_{val}+N_{vir}$) is obtained from ab initio molecular-orbital calculations. Our objective is to construct a set of quasi-atomic orbitals $A_{i\alpha}$ spanned by the occupied valence orbitals $\phi_n$ and a small subset of orthogonal virtual orbitals $\varphi_p = \sum_v T_{p,v}\,\phi_v$ ($p = 1, 2, \ldots, N_p < N_{vir}$). That is,

$A = \sum_n a_n \phi_n + \sum_p b_p \varphi_p = \sum_n a_n \phi_n + \sum_{p,v} b_p T_{p,v} \phi_v = \sum_n a_n \phi_n + \sum_v a_v \phi_v$   (24)

where $a_v = \sum_p b_p T_{p,v}$ and $b_p = \sum_v a_v T_{p,v}$ (because the $\varphi_p$ are orthogonal, i.e., $\sum_v T_{p,v} T_{q,v} = \delta_{p,q}$). The orbital index $i\alpha$, where $i$ is the index of the atom and $\alpha$ denotes the orbital type (e.g., s or p orbitals), is implied in Eq. (24) and will be omitted in the rest of this subsection unless specified. The
requirement is that $A_{i\alpha}$ should be as close as possible to the corresponding free-atom orbital $A^*_{i\alpha}$. Mathematically, this is a problem of minimizing $\langle A - A^* \mid A - A^* \rangle$ under the side condition $\langle A \mid A \rangle = 1$. The Lagrangian for this minimization problem is therefore

$L = \langle A - A^* \mid A - A^* \rangle - \lambda\,(\langle A \mid A \rangle - 1) = \sum_n (a_n - a_n^*)^2 + \sum_v (a_v - a_v^*)^2 - \lambda\left(\sum_n a_n^2 + \sum_v a_v^2 - 1\right)$   (25)
The independent variables are $a_n$ and $b_p$. Using the side condition, Lagrange minimization leads to

$A = D^{-1/2}\left[\sum_n \phi_n \langle \phi_n \mid A^* \rangle + \sum_p \varphi_p \langle \varphi_p \mid A^* \rangle\right]$   (26)

with

$D = \sum_n \langle \phi_n \mid A^* \rangle^2 + \sum_p \langle \varphi_p \mid A^* \rangle^2$   (27)
For this optimized A, the total mean-square deviation from $A^*$ is

$\langle A - A^* \mid A - A^* \rangle = 2\,(1 - \langle A \mid A^* \rangle)$   (28)

It is clear from Eqs. (27) and (28) that the key step in obtaining the quasi-atomic minimal-basis-set orbitals is to select a subset of virtual orbitals $\varphi_p$ with maximal overlap onto the set of atomic orbitals $A^*_{i\alpha}$, i.e., such that $S = \sum_{i\alpha,\,p} \langle \varphi_p \mid A^*_{i\alpha} \rangle \langle A^*_{i\alpha} \mid \varphi_p \rangle$ is maximized. This can be achieved by forming the rectangular matrix T, which defines the subset of virtual states $\varphi_p$ ($p = 1, 2, \ldots, N_p$), from the eigenvectors with the largest eigenvalues of the matrix

$B_{\mu\nu} = \sum_{i\alpha} \langle \phi_\mu \mid A^*_{i\alpha} \rangle \langle A^*_{i\alpha} \mid \phi_\nu \rangle$   (29)

where both indices µ and ν run over the virtual space. Once the $\varphi_p$ are determined, the localized QUAMBOs are constructed using Eqs. (26) and (27). As shown in Fig. 18, the QUAMBOs constructed by this scheme are indeed atomic-like and well localized on the atoms. They differ from the orbitals of the free atoms, however, because they are deformed according to the bonding environment. The Hamiltonian matrix in the QUAMBO representation preserves, by construction, the occupied valence subspace of the first-principles calculation, so it gives exactly the same eigenvalues and eigenvectors for the occupied states as the first-principles calculation. This property is illustrated in Fig. 19.
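Numerically, the selection step of Eq. (29) is an eigenvalue problem on the virtual space. A minimal NumPy sketch (ours; `ovl` is assumed to hold the overlaps ⟨φ_µ|A*_k⟩ between virtual orbitals µ and free-atom orbitals k, with real orbitals assumed):

```python
import numpy as np

def select_virtual_subset(ovl, n_keep):
    """Build B[mu, nu] = sum_k <phi_mu|A*_k><A*_k|phi_nu> (Eq. (29)) and
    return rows T[p, v] spanning the n_keep most-overlapping directions."""
    B = ovl @ ovl.T                      # (N_vir x N_vir), symmetric
    vals, vecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    return vecs[:, -n_keep:].T           # varphi_p = sum_v T[p, v] phi_v
```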
Figure 18. QUAMBOs for the four non-equivalent atoms in Si10 cluster. The QUAMBOs are similar to the 3s and 3p orbitals of a free silicon atom but are deformed according to the environment of the atoms [17].
In Fig. 19, the eigenvalues of a Si10 cluster from first-principles calculations and from the QUAMBO-based tight-binding Hamiltonian are compared. Once the QUAMBOs have been constructed, the overlap and one-electron Hamiltonian matrices from the first-principles calculations in terms of QUAMBOs are readily calculated. The Slater–Koster tight-binding parameters can then be extracted by decomposing the matrix elements using the Slater–Koster geometrical factors [1]. Using Si as an example, Lu et al. [19] performed calculations for three types of bulk-fragment clusters (diamond, sc, and fcc) with several different bond lengths for each type of cluster, so as to study the tight-binding parameters in different bonding environments. Figure 20 shows the overlap parameters Sssσ, Sspσ, Sppσ, and Sppπ
Figure 19. Electronic eigenvalues of Si10 in terms of QUAMBOs (QUAMBO Space) are compared to those from self-consistent DFT (Full AO Space) calculations. Note that the occupied states (below −5.0 eV) are exactly reproduced in the QUAMBO space [17].
from different structures and different pairs of atoms, plotted as a function of interatomic distance. Note that the two-center nature of the overlap integrals for fixed atomic minimal-basis orbitals need not hold for the QUAMBOs, because the QUAMBOs are deformed according to the bonding environments of the atoms. Nevertheless, the overlap parameters obtained from the calculations, as plotted in Fig. 20, fall nicely onto smooth scaling curves. These results suggest that the two-center approximation is adequate for the overlap integrals. By contrast, the hopping parameters plotted in Fig. 21 are far from transferable, especially h_ppσ. Even for a given pair of atoms, the hopping parameters h_ppσ and h_ppπ obtained from the decomposition of different matrix elements can take slightly different values, especially for the sc and fcc structures. The hopping parameters from different structures do not follow the same scaling curve. For a given crystal-like structure, although the bond-length dependence of the first- and second-neighbor hopping parameters can be fitted to separate smooth scaling curves, these two curves cannot be joined to define a unique transferable scaling function for the structure. Besides the hopping parameters, crystal-field effects on the on-site atomic energies can also be seen clearly from the density and structure dependence of the on-site matrix elements as
Figure 20. The overlap integrals as a function of interatomic distance for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures.
plotted in Fig. 22. These results suggest that it is necessary to go beyond the two-center approximation even in a nonorthogonal tight-binding scheme. The QUAMBO-based tight-binding analysis will thus provide very useful guidance for future tight-binding model development. Expressing the tight-binding Hamiltonian matrix in terms of QUAMBOs also allows us to address, from a first-principles perspective, the effect of orthogonality on the transferability of tight-binding models. By applying the Löwdin transformation to the QUAMBOs described above, we obtain a set of orthogonal QUAMBOs, from which the Hamiltonian matrix and hence the Slater–Koster hopping integrals in the orthogonal QUAMBO representation can be calculated (a minimal sketch of this transformation follows this discussion). As shown in Fig. 23, the hopping parameters decay much faster than their nonorthogonal counterparts. The interactions in the orthogonal tight-binding scheme are essentially dominated by first-neighbor interactions, which depend not only on the interatomic separations but also on the coordination of the structures. In contrast to the nonorthogonal model, the magnitudes of the hopping parameters decrease as the coordination number
Figure 21. The nonorthogonal tight-binding hopping integrals for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures obtained by decomposing the QUAMBO-based one-electron Hamiltonian according to the Slater–Koster tight-binding scheme.
Figure 22. The nonorthogonal QUAMBO-based Hamiltonian diagonal matrix elements (E_s and E_p) of Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures are plotted as a function of density.
Figure 23. The orthogonal tight-binding hopping integrals for Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures obtained by decomposing the QUAMBO-based one-electron Hamiltonian according to the Slater–Koster tight-binding scheme.
of the structure increases. This coordination dependence of the hopping parameters and the short-range nature of the interactions are well described by the environment-dependent tight-binding model of Wang et al. [14, 15]. The on-site energies are also found to depend on the structures and densities. In contrast to the nonorthogonal behavior described above, the on-site energies in the orthogonal tight-binding scheme decrease as the density decreases, as one can see from Fig. 24. This behavior of the diagonal tight-binding matrix elements is also described by the environment-dependent tight-binding model of Wang et al. [14, 15]. However, in the orthogonal QUAMBO description, the second- and higher-neighbor hopping parameters, though small, are not entirely negligible. In particular, some hopping parameters in the orthogonal scheme change sign for second and higher neighbors. Such behavior needs to be taken into account properly in future tight-binding model developments.
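The Löwdin transformation mentioned above is a one-liner once the overlap matrix S and the Hamiltonian H in the nonorthogonal QUAMBO basis are in hand: H′ = S^(−1/2) H S^(−1/2). A NumPy sketch (ours, assuming S is symmetric positive-definite):

```python
import numpy as np

def lowdin_orthogonalize(H, S):
    """Return H' = S^{-1/2} H S^{-1/2}, the Hamiltonian in the
    Lowdin-orthogonalized basis; S must be symmetric positive-definite."""
    vals, vecs = np.linalg.eigh(S)
    s_inv_sqrt = (vecs * vals**-0.5) @ vecs.T   # spectral form of S^{-1/2}
    return s_inv_sqrt @ H @ s_inv_sqrt
```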
Figure 24. The orthogonal QUAMBO-based Hamiltonian diagonal matrix elements (E_s and E_p) of Si clusters in the diamond (diamond-like), simple cubic (sc-like), and face-centered cubic (fcc-like) structures are plotted as a function of density.
Acknowledgments We would like to thank Dr Wencai Lu for help in preparing the manuscript and the figures. We also thank Drs Songyou Wang and Cristian Ciobanu for performing the two-center TB and classical II calculations for Fig. 6. Ames Laboratory is operated for the U.S. Department of Energy by Iowa State University under Contract No. W-7405-Eng-82. This work was supported by the Director for Energy Research, Office of Basic Energy Sciences including a grant of computer time at the National Energy Research Supercomputing Center (NERSC) in Berkeley.
References

[1] J.C. Slater and G.F. Koster, Phys. Rev., 94, 1498, 1954.
[2] C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 39, 8592, 1989.
[3] F.S. Khan and J.Q. Broughton, Phys. Rev. B, 39, 3688, 1989.
[4] L. Goodwin, A.J. Skinner, and D.G. Pettifor, Europhys. Lett., 9, 701, 1989.
[5] C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 42, 11276, 1990.
[6] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 47, 9366, 1993.
[7] M.J. Mehl and D.A. Papaconstantopoulos, in C.Y. Fong (ed.), Topics in Computational Materials Science, World Scientific, Singapore, pp. 169–213, 1997.
[8] C.Z. Wang and K.M. Ho, in I. Prigogine and S.A. Rice (eds.), Advances in Chemical Physics, Vol. XCIII, John Wiley & Sons, New York, pp. 651–702, 1996.
[9] L. Colombo, Annu. Rev. Comput. Phys., 147(IV), 1, 1996.
[10] C.Z. Wang, B.L. Zhang, K.M. Ho, and X.Q. Wang, Int. J. Mod. Phys. B, 7, 4305, 1993.
[11] C.Z. Wang, B.L. Zhang, and K.M. Ho, in D.A. Jelski and T.F. George (eds.), Computational Studies of New Materials, World Scientific, Singapore, pp. 74–111, 1999.
[12] C.Z. Wang and K.M. Ho, J. Comput. Theor. Nanosci., 1, 1, 2004.
[13] P.O. Löwdin, J. Chem. Phys., 18, 365, 1950.
[14] M.S. Tang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 53, 979, 1996.
[15] C.Z. Wang, B.C. Pan, and K.M. Ho, J. Phys.: Condens. Matter, 11, 2043, 1999.
[16] W.C. Lu, C.Z. Wang, M.W. Schmidt, L. Bytautas, K.M. Ho, and K. Ruedenberg, J. Chem. Phys., 120, 2629, 2004.
[17] W.C. Lu, C.Z. Wang, M.W. Schmidt, L. Bytautas, K.M. Ho, and K. Ruedenberg, J. Chem. Phys., 120, 2638, 2004.
[18] W.C. Lu, C.Z. Wang, Z.L. Chan, K. Ruedenberg, and K.M. Ho, Phys. Rev. B, 70, 041101, 2004.
[19] W.C. Lu, C.Z. Wang, K. Ruedenberg, and K.M. Ho, to be published.
[20] C.H. Xu, C.Z. Wang, C.T. Chan, and K.M. Ho, J. Phys.: Condens. Matter, 4, 6047, 1992.
[21] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[22] D.J. Chadi, Phys. Rev. Lett., 41, 1062, 1978; Phys. Rev. B, 29, 785, 1984.
[23] S. Sawada, Vacuum, 41, 612, 1990.
[24] M. Kohyama, J. Phys.: Condens. Matter, 3, 2193, 1991.
[25] J.L. Mercer, Jr. and M.Y. Chou, Phys. Rev. B, 49, 8506, 1994.
[26] R.E. Cohen, M.J. Mehl, and D.A. Papaconstantopoulos, Phys. Rev. B, 50, 14694, 1994.
[27] Q.M. Li and R. Biswas, Phys. Rev. B, 50, 18090, 1994.
[28] O. Madelung, M. Schulz, and H. Weiss (eds.), Semiconductors: Physics of Group IV Elements and III–V Compounds, Landolt-Börnstein New Series III/17a, Springer-Verlag, New York, 1982; O. Madelung and M. Schulz (eds.), Semiconductors: Intrinsic Properties of Group IV Elements and III–V, II–VI and I–VII Compounds, Landolt-Börnstein New Series III/22a, Springer-Verlag, New York, 1987.
[29] M.S. Dresselhaus and G. Dresselhaus, in M. Cardona and G. Guntherodt (eds.), Light Scattering in Solids III, Springer, Berlin, p. 8, 1982.
[30] For a review, see J. Robertson, Adv. Phys., 35, 317, 1986, and R. Clausing et al. (eds.), Diamond and Diamond-like Films and Coatings, NATO Advanced Study Institutes Ser. B, 266, 331, Plenum, New York, 1991.
[31] D.R. McKenzie, D. Muller, and B.A. Pailthorpe, Phys. Rev. Lett., 67, 773, 1991.
[32] P.H. Gaskell, A. Saeed, P. Chieux, and D.R. McKenzie, Phys. Rev. Lett., 67, 1286, 1991.
[33] C.Z. Wang and K.M. Ho, Phys. Rev. Lett., 71, 1184, 1993.
[34] N.A. Marks, D.R. McKenzie, B.A. Pailthorpe, M. Bernasconi, and M. Parrinello, Phys. Rev. Lett., 76, 768, 1996.
[35] C.Z. Wang and K.M. Ho, "Structural trends in amorphous carbon," in M.P. Siegal et al. (eds.), MRS Symposium Proceedings, 498, MRS, 1998.
[36] I. Kwon, R. Biswas, C.Z. Wang, K.M. Ho, and C.M. Soukoulis, Phys. Rev. B, 49, 7242, 1994.
[37] J.R. Morris, Z.Y. Lu, D.M. Ring, J.B. Xiang, K.M. Ho, C.Z. Wang, and C.L. Fu, Phys. Rev. B, 58, 11241, 1998.
[38] J. Tersoff, Phys. Rev. B, 38, 9902, 1988.
[39] T.J. Lenosky, B. Sadigh, E. Alonso, V.V. Bulatov, T. Diaz de la Rubia, J. Kim, A.F. Voter, and J.D. Kress, Modell. Simul. Mater. Sci. Eng., 8, 825, 2000.
[40] J.R. Chelikowsky, J.C. Phillips, M. Kamal, and M. Strauss, Phys. Rev. Lett., 62, 292, 1989.
[41] J.R. Chelikowsky, K.M. Glassford, and J.C. Phillips, Phys. Rev. B, 44, 1538, 1991.
[42] U. Rothlisberger, W. Andreoni, and P. Giannozzi, J. Chem. Phys., 96, 1248, 1992.
[43] J.C. Phillips, Phys. Rev. B, 47, 14132, 1993.
[44] J.C. Grossman and L. Mitas, Phys. Rev. Lett., 74, 1323, 1995.
[45] Y.F. Li, C.Z. Wang, and K.M. Ho, unpublished.
[46] J.C. Grossman and L. Mitas, Phys. Rev. B, 52, 16735, 1995.
[47] M.F. Jarrold, J.E. Bower, and K.M. Creegan, J. Chem. Phys., 90, 3615, 1989.
[48] U. Ray and M.F. Jarrold, J. Chem. Phys., 94, 2631, 1991.
[49] M.F. Jarrold and E.C. Honea, J. Am. Chem. Soc., 114, 459, 1992.
[50] K.M. Ho, A. Shvartsburg, B.C. Pan, Z.Y. Lu, C.Z. Wang, J. Wacker, J.L. Fye, and M.F. Jarrold, "Structures of medium-sized silicon clusters," Nature, 392, 582, 1998.
[51] B. Liu, Ph.D. thesis, Iowa State University, 2001.
[52] H. Haas, C.Z. Wang, M. Fahnle, C. Elsasser, and K.M. Ho, Phys. Rev. B, 57, 1461, 1998.
[53] H. Haas, C.Z. Wang, M. Fahnle, C. Elsasser, and K.M. Ho, "Environment-dependent tight-binding model for molybdenum," in P.E.A. Turchi et al. (eds.), MRS Symposium Proceedings, 491, MRS, p. 327, 1998.
[54] Y. Waseda, K. Hirata, and M. Ohtani, High Temp. High Press., 7, 221, 1975.
[55] F.H. Featherston and J.R. Neighbours, Phys. Rev., 130, 1324, 1963.
[56] R. Ziegler and H.-E. Schaefer, Mater. Sci. Forum, 15–18, 145, 1987.
[57] B.M. Powell, P. Martel, and A.D.B. Woods, Can. J. Phys., 55, 1601, 1977.
[58] H. Haas, C.Z. Wang, K.M. Ho, M. Fahnle, and C. Elsasser, Surf. Sci. Lett., 457, L397, 2000.
[59] Ju Li, C.Z. Wang, J.-P. Chang, W. Cai, V. Bulatov, K.M. Ho, and S. Yip, Phys. Rev. B, 70, 104113, 2004.
[60] J.H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[61] D. Deaven and K.M. Ho, Phys. Rev. Lett., 75, 288, 1995.
[62] D.M. Deaven, N. Tit, J.R. Morris, and K.M. Ho, Chem. Phys. Lett., 256, 195, 1996.
[63] C.Z. Wang, C.H. Xu, C.T. Chan, and K.M. Ho, J. Phys. Chem., 96, 7603, 1992.
[64] J.R. Chelikowsky, Phys. Rev. Lett., 67, 2970, 1991.
[65] B. Liu, Z.Y. Lu, B. Pan, C.Z. Wang, K.M. Ho, A.A. Shvartsburg, and M.F. Jarrold, J. Chem. Phys., 109, 9401, 1998.
[66] A.A. Shvartsburg, M.F. Jarrold, B. Liu, Z.Y. Lu, C.Z. Wang, and K.M. Ho, Phys. Rev. Lett., 81, 4616, 1998.
[67] Mingsheng Tang, Wencai Lu, C.Z. Wang, and K.M. Ho, Chem. Phys. Lett., 377, 413, 2003.
[68] Mingsheng Tang, C.Z. Wang, W.C. Lu, and K.M. Ho, Phys. Rev. B, submitted.
[69] F.C. Chuang, B. Liu, C.Z. Wang, T.L. Chan, and K.M. Ho, Phys. Rev. B, submitted.
[70] M. Pederson and K. Jackson, Phys. Rev. B, 43, 7312, 1991.
[71] R.M. Wentzcovitch, J.L. Martin, and P.B. Allen, Phys. Rev. B, 45, 11372, 1992.
[72] N.D. Mermin, Phys. Rev., 137, A1441, 1965.
[73] B.L. Zhang, C.Z. Wang, C.T. Chan, and K.M. Ho, Phys. Rev. B, 48, 11381, 1993.
[74] C.Z. Wang, K.M. Ho, M. Shirk, and P. Molian, Phys. Rev. Lett., 85, 4092, 2000.
[75] R.S. Lewis, E. Anders, and B.T. Draine, Nature (London), 339, 117, 1989.
[76] N.R. Greiner, D.S. Phillips, J.D. Johnson, and F. Volk, Nature (London), 333, 440, 1988.
[77] Y.K. Chang, H.H. Hsieh, W.F. Pong, M.H. Tsai, F.Z. Chien, P.K. Tseng, L.C. Chen, T.Y. Wang, K.H. Chen, and D.M. Bhusari, Phys. Rev. Lett., 82, 5377, 1999.
[78] J.Y. Raty, G. Galli, C. Bostedt, T.W. Van Buuren, and L.J. Terminello, Phys. Rev. Lett., 90, 037401, 2003.
[79] S. Tomita, M. Fujii, and S. Hayashi, Phys. Rev. B, 66, 245424, 2002.
[80] S. Tomita, M. Fujii, S. Hayashi, and K. Yamamoto, Chem. Phys. Lett., 305, 225, 1999.
[81] V.L. Kuznetsov, A.L. Chuvilin, Y.V. Butenko, I.Y. Mal'kov, and V.M. Titov, Chem. Phys. Lett., 222, 343, 1994.
[82] G.D. Lee, C.Z. Wang, J. Yu, E. Yoon, and K.M. Ho, Phys. Rev. Lett., 91, 265701, 2003.
1.16 FIRST-PRINCIPLES MODELING OF PHASE EQUILIBRIA

Axel van de Walle and Mark Asta
Northwestern University, Evanston, IL, USA
First-principles approaches to the modeling of phase equilibria rely on the integration of accurate quantum-mechanical total-energy calculations and statistical-mechanical modeling. This combination of methods makes possible parameter-free predictions of the finite-temperature thermodynamic properties governing a material's phase stability. First-principles computational-thermodynamic approaches have found increasing application in phase-diagram studies of a wide range of semiconductor, ceramic, and metallic systems. These methods are particularly advantageous in the consideration of previously unexplored materials, where they can be used to ascertain the thermodynamic stability of new materials before they are synthesized, and in situations where direct experimental thermodynamic measurements are difficult due to constraints imposed by kinetics or metastability.
1. First-Principles Calculations of Thermodynamic Properties: Overview
At finite temperature (T) and pressure (P), thermodynamic stability is governed by the magnitude of the Gibbs free energy (G):

G = E - TS + PV    (1)

where E, S and V denote energy, entropy and volume, respectively. In principle, the formal statistical-mechanical procedure for calculating G from first principles is well defined. Quantum-mechanical calculations can be performed to compute the energy E(s) of different microscopic states (s) of a system, which then must be summed up in the form of a partition function (Z):

Z = \sum_{s} \exp[-E(s)/k_B T]    (2)
from which the free energy is derived as F = E - TS = -k_B T \ln Z, where k_B is Boltzmann's constant.

Figure 1(a) illustrates, for the case of a disordered crystalline binary alloy, the nature of the disorder characterizing a representative finite-temperature atomic structure. This disorder can be characterized in terms of the configurational arrangement of the elemental species over the sites of the underlying parent lattice, coupled with the displacements characterizing positional disorder. In principle, the sum in Eq. (2) extends over all configurational and displacive states accessible to the system, a phase space that is astronomically large for a realistic system size. In practice, the methodologies of atomic-scale molecular dynamics (MD) and Monte Carlo (MC) simulations, coupled with thermodynamic integration techniques (Kofke and Frenkel, de Koning, Chapter 2), reduce the complexity of a free energy calculation to a more tractable problem of sampling on the order of several to tens of thousands of representative states.

Figure 1. (a) Disordered crystalline alloy. The state of the alloy is characterized both by the atomic displacements v_i and the occupation of each lattice site. (b) Mapping of the real alloy onto a lattice model characterized by occupation variables σ_i describing the identity of atoms on each of the lattice sites.

Electronic density-functional theory (DFT) provides an accurate quantum-mechanical framework for calculating the relative energetics of competing atomic structures in solids, liquids and molecules for a wide range of materials classes (Kaxiras, Chapter 1). Due to the rapid increase in computational cost with system size, however, DFT calculations are typically limited to structures containing fewer than ≈1000 atoms, while ab initio MD simulations (Scheffler, Chapter 1) are practically limited to time scales of less than ≈1 ns. For liquids or compositionally ordered solids, where the time scales for structural rearrangements (displacive in the latter case, configurational and displacive in the former) are sufficiently fast, and the size of periodic cells required to accurately model the atomic structure is relatively small, DFT-based MD methods have found direct applications in the calculation of finite-temperature thermodynamic properties [1, 2]. For crystalline solids containing both positional and concentrated compositional disorder, however, direct application of DFT to the calculation of free energies remains intractable; the time scales for configurational rearrangements are set by solid-state diffusion, ruling out direct application of MD, and the necessary system sizes required to accurately model configurational disorder are too large to permit direct application of DFT as the basis for MC simulations.

Effective strategies have nonetheless been developed for bridging the size and time-scale limitations imposed by DFT in the first-principles computation of thermodynamic properties for disordered solids. The approach involves exploitation of DFT methods as a framework for parameterizing classical potentials and coarse-grained statistical models. These models serve as efficient "effective Hamiltonians" in direct simulation-based calculations of thermodynamic properties; they can also function as useful reference systems for thermodynamic-integration calculations.
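To make Eqs. (1)–(2) concrete, the following short Python sketch (our illustration, not part of the original text; the state energies are hypothetical) evaluates the partition function and the free energy F = -k_B T ln Z for an explicit handful of microstate energies:

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K

def free_energy(energies_eV, T):
    """F = -k_B T ln Z for an explicit list of microstate energies (Eq. 2).

    Energies are shifted by the minimum before exponentiation so the
    partition sum stays numerically well conditioned."""
    e0 = min(energies_eV)
    z = sum(math.exp(-(e - e0) / (K_B * T)) for e in energies_eV)
    return e0 - K_B * T * math.log(z)

# Toy example: a ground state and three excited "configurations".
states = [0.0, 0.05, 0.05, 0.12]  # eV, illustrative values
for T in (300.0, 1000.0):
    print(f"T = {T:6.1f} K   F = {free_energy(states, T):+.4f} eV")
```

The shift by the minimum energy is the standard device for keeping the exponentials finite; for realistic systems the explicit sum is replaced by the sampling strategies discussed above.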
2. Thermodynamics of Compositionally Ordered Solids
In an ordered solid, thermal fluctuations take the form of electronic excitations and lattice vibrations and, accordingly, the free energy can be written as F = E_0 + F_elec + F_vib, where E_0 is the absolute zero total energy while F_elec and F_vib denote electronic and vibrational free energy contributions, respectively. This section is devoted to the calculation of the electronic and vibrational contributions most commonly considered in phase-diagram calculations, under the assumption that electron–phonon interactions are negligible (i.e., F_elec and F_vib are simply additive).

To account for electronic excitations, electronic DFT (Kaxiras, Chapter 1) can be extended to nonzero temperatures by allowing for partial occupations of the electronic states [3]. Within this framework, the electronic contribution to the free energy F_elec(T) at temperature T can be decomposed as*

F_{\mathrm{elec}}(T) = E_{\mathrm{elec}}(T) - E_{\mathrm{elec}}(0) - T S_{\mathrm{elec}}(T)    (3)
* Equations (3)–(5) also assume that both the electronic charge density and the electronic density of states can be considered temperature-independent.
where the electronic band energy E_elec(T) and the electronic entropy S_elec(T) are respectively given by

E_{\mathrm{elec}}(T) = \int f_{\mu,T}(\varepsilon)\,\varepsilon\, g(\varepsilon)\, d\varepsilon    (4)

S_{\mathrm{elec}}(T) = -k_B \int \left[ f_{\mu,T}(\varepsilon)\ln f_{\mu,T}(\varepsilon) + (1 - f_{\mu,T}(\varepsilon))\ln(1 - f_{\mu,T}(\varepsilon)) \right] g(\varepsilon)\, d\varepsilon    (5)

where g(ε) is the electronic density of states obtained from a density-functional calculation, while f_{μ,T}(ε) is the Fermi distribution when the electronic chemical potential is equal to μ:

f_{\mu,T}(\varepsilon) = \left[ 1 + \exp\left( \frac{\varepsilon - \mu}{k_B T} \right) \right]^{-1}.    (6)
The chemical potential μ is the solution to \int f_{\mu,T}(\varepsilon)\, g(\varepsilon)\, d\varepsilon = n_e, where n_e is the total number of electrons. Under the assumption that the electronic density of states near the Fermi level is slowly varying relative to f_{μ,T}(ε), the equations for the electronic free energy reduce to the well-known Sommerfeld model, an expansion in powers of T whose lowest order term is

F_{\mathrm{elec}}(T) = -\frac{\pi^2}{6} k_B^2 T^2\, g(\varepsilon_F)    (7)

where g(ε_F) is the zero-temperature value of the electronic density of states at the Fermi level (ε_F).
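As a numerical illustration of Eqs. (4)–(7) (a sketch of our own, not code from the article), the fragment below evaluates F_elec(T) for a hypothetical flat density of states — one state per eV, half filled — and compares it with the Sommerfeld expression; keeping the DOS fixed matches the assumption stated in the footnote to Eq. (3):

```python
import numpy as np

K_B = 8.617333e-5  # eV/K

def fermi(e, mu, T):
    return 1.0 / (1.0 + np.exp(np.clip((e - mu) / (K_B * T), -60, 60)))

def chemical_potential(e, g, n_e, T):
    """Bisect for mu such that the Fermi-weighted DOS integrates to n_e."""
    lo, hi = e[0], e[-1]
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        n = np.trapz(fermi(e, mid, T) * g, e)
        lo, hi = (mid, hi) if n < n_e else (lo, mid)
    return 0.5 * (lo + hi)

def f_elec(e, g, n_e, T):
    """Electronic free energy of Eqs. (3)-(5) for a fixed DOS g(e)."""
    mu = chemical_potential(e, g, n_e, T)
    f = fermi(e, mu, T)
    fc = np.clip(f, 1e-15, 1 - 1e-15)          # guard the entropy logarithms
    s = -K_B * np.trapz((fc * np.log(fc) + (1 - fc) * np.log(1 - fc)) * g, e)
    f0 = fermi(e, chemical_potential(e, g, n_e, 1.0), 1.0)  # ~T=0 occupations
    band = lambda occ: np.trapz(occ * e * g, e)
    return band(f) - band(f0) - T * s

# Hypothetical flat DOS of 1 state/eV over [-5, 5] eV, half filled (n_e = 5).
e = np.linspace(-5, 5, 2001)
g = np.ones_like(e)
T = 1000.0
print("integral form :", f_elec(e, g, 5.0, T))
print("Sommerfeld (7):", -(np.pi**2 / 6) * K_B**2 * T**2 * 1.0)
```

For this flat-DOS toy problem the two numbers agree essentially exactly, since the Sommerfeld expansion is exact when g(ε) is constant.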
The quantum treatment of lattice vibrations in the harmonic approximation provides a reliable description of thermal vibrations in many solids for low to moderately high temperatures [4]. To describe this theory, consider an infinite periodic system with n atoms per unit cell, let u(li) for i = 1, ..., n denote the displacement of atom i in cell l away from its equilibrium position, and let M_i be the mass of atom i. Within the harmonic approximation, the potential energy U of this system is entirely determined by (i) the potential energy (per unit cell) of the system at its equilibrium position, E_0, and (ii) the force-constant tensors Φ(li, l′j) whose components are given, for α, β = 1, 2, 3, by

\Phi_{\alpha\beta}(li, l'j) = \frac{\partial^2 U}{\partial u_{\alpha}(li)\, \partial u_{\beta}(l'j)}    (8)

evaluated at u(li) = 0 for all l, i. Such a harmonic approximation to the Hamiltonian of a solid is often referred to as a Born–von Kármán model. The thermodynamic properties of a harmonic system are entirely determined by the frequencies of its normal modes of oscillation, which can be
obtained by finding the eigenvalues of the so-called 3n × 3n dynamical matrix of the system:

D(\mathbf{k}) = \sum_{l} e^{i2\pi(\mathbf{k}\cdot l)}
\begin{pmatrix}
\frac{\Phi(01,\, l1)}{\sqrt{M_1 M_1}} & \cdots & \frac{\Phi(01,\, ln)}{\sqrt{M_1 M_n}} \\
\vdots & \ddots & \vdots \\
\frac{\Phi(0n,\, l1)}{\sqrt{M_n M_1}} & \cdots & \frac{\Phi(0n,\, ln)}{\sqrt{M_n M_n}}
\end{pmatrix}    (9)
for all vectors k in the first Brillouin zone. The resulting eigenvalues λ_b(k) for b = 1, ..., 3n provide the frequencies of the normal modes through ν_b(k) = \sqrt{\lambda_b(\mathbf{k})}/2\pi. This information for all k is conveniently summarized by g(ν), the phonon density of states (DOS), which specifies the number of modes of oscillation having a frequency lying in the infinitesimal interval [ν, ν + dν]. The vibrational free energy (per unit cell) F_vib is then given by

F_{\mathrm{vib}} = k_B T \int_0^{\infty} \ln\left[ 2\sinh\left( \frac{h\nu}{2 k_B T} \right) \right] g(\nu)\, d\nu    (10)

where h is Planck's constant and k_B is Boltzmann's constant. The associated vibrational entropy S_vib of the system can be obtained from the well-known thermodynamic relationship S_vib = -∂F_vib/∂T. The high-temperature limit (which is also the classical limit) of Eq. (10) is often a good approximation over the range of temperatures of interest in solid-state phase diagram calculations:

F_{\mathrm{vib}} = k_B T \int_0^{\infty} \ln\left( \frac{h\nu}{k_B T} \right) g(\nu)\, d\nu.
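A minimal sketch of Eq. (10) and its classical limit is given below; in a real calculation g(ν) would be accumulated from the eigenvalues of the dynamical matrix D(k) of Eq. (9) over a Brillouin-zone mesh, whereas here a hypothetical Debye-like DOS is assumed (cutoff, normalization and temperatures are illustrative):

```python
import numpy as np

K_B = 8.617333e-5   # eV/K
H   = 4.135668e-15  # Planck constant, eV*s

def f_vib(nu, g, T):
    """Harmonic vibrational free energy per unit cell, Eq. (10)."""
    x = H * nu / (2 * K_B * T)
    return K_B * T * np.trapz(np.log(2 * np.sinh(x)) * g, nu)

def f_vib_classical(nu, g, T):
    """High-temperature (classical) limit of Eq. (10)."""
    return K_B * T * np.trapz(np.log(H * nu / (K_B * T)) * g, nu)

# Hypothetical Debye-like DOS, g(nu) ~ nu^2, cut off at 10 THz and
# normalized to 3 modes per atom for a one-atom cell.
nu = np.linspace(1e10, 1e13, 4000)  # Hz; avoid nu = 0
g = nu**2
g *= 3.0 / np.trapz(g, nu)

for T in (100.0, 300.0, 1000.0):
    print(f"T={T:6.1f} K  quantum {f_vib(nu, g, T):+.4f} eV  "
          f"classical {f_vib_classical(nu, g, T):+.4f} eV")
```

As expected, the two expressions converge as T rises above the characteristic phonon temperature of the model.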
The high-temperature limit of the vibrational entropy difference between two phases is often used as a measure of the magnitude of the effect of lattice vibrations on phase stability. It has the advantage of being temperature-independent, thus allowing a unique number to be reported as a measure of vibrational effects. Figure 2 (from [5]) illustrates the use of the above formalism to assess the relative phase stability of the θ and θ′ phases responsible for precipitation hardening in the Al–Cu system. Interestingly, accounting for lattice vibrations is crucial in order for the calculations to agree with the experimentally observed fact that the θ phase is stable at typical processing temperatures (T > 475 K).

Figure 2. Temperature dependence of the free energy of the θ and θ′ phases of the Al2Cu compound. Insets show the crystal structures of each phase and the corresponding phonon density of states. Dashed lines indicate regions of metastability; the θ phase is seen to become stable above about 475 K. (Adapted from Ref. [5] with the permission of the authors.)

A simple improvement over the harmonic approximation, called the quasiharmonic approximation, is obtained by employing volume-dependent force-constant tensors. This approach maintains all the computational advantages of the harmonic approximation while permitting the modeling of thermal expansion. The volume dependence of the phonon frequencies induced by the volume dependence of the force constants is traditionally described by the Grüneisen parameter γ_{kb} = -∂ ln ν_b(k)/∂ ln V. However, for the purpose of modeling thermal expansion, it is more convenient to directly parametrize the volume dependence of the free energy itself. This dependence has two sources: the change in entropy due to the change in the phonon frequencies, and the elastic energy change due to the expansion of the lattice:

F(T, V) = E_0(V) + F_{\mathrm{vib}}(T, V)    (11)

where E_0(V) is the energy of a motionless lattice whose unit cell is constrained to remain at volume V, while F_vib(T, V) is the vibrational free energy of a harmonic system constrained to remain with a unit cell volume V at temperature T. The equilibrium volume V*(T) at temperature T is obtained by minimizing F(T, V) with respect to V. The resulting free energy F(T) at temperature T is then given by F(T, V*(T)). The quasiharmonic approximation has been shown to provide a reliable description of the thermal expansion of numerous elements up to their melting points, as illustrated in Fig. 3.

Figure 3. Thermal expansion of selected metals (Na and Al) calculated within the quasiharmonic approximation. (Reproduced from Ref. [6] with the permission of the authors.)

First-principles calculations can be used to provide the necessary input parameters for the above formalism. The so-called direct force method proceeds by calculating, from first principles, the forces experienced by the atoms in response to various imposed displacements, and by determining the values of the force-constant tensors that match these forces through a least-squares fit. Note that the simultaneous displacements of the periodic images of each displaced atom, due to the periodic boundary conditions used in most ab initio methods, typically require the use of a supercell geometry in order to be able to sample all the displacements needed to determine the force constants. While the number of force constants to be determined is in principle infinite, in practice it can be reduced to a manageable finite number by noting that the force-constant tensor associated with two atoms that lie farther apart than a few nearest-neighbor shells can be accurately neglected for many systems. Alternatively, linear response theory (Rabe, Chapter 1) can be used to calculate the dynamical matrix D(k) directly using second-order perturbation theory, thus circumventing the need for supercell calculations. Linear response theory is also particularly useful when a system is characterized by non-negligible long-range force constants, as in the presence of Fermi-surface instabilities or long-ranged electrostatic contributions.

The above discussion has centered around the application of harmonic (or quasiharmonic) approximations to the statistical modeling of vibrational contributions to free energies of solids. While harmonic theory is known to be highly accurate for a wide class of materials, important cases exist where this approximation breaks down due to large anharmonic effects. Examples include the modeling of ferroelectric and martensitic phase transformations, where the high-temperature phases are often dynamically unstable at zero temperature, i.e., their phonon spectra are characterized by unstable modes. In such cases, effective Hamiltonian methods have been developed to model structural phase transitions from first principles (Rabe, Chapter 1). Alternatively, direct application of ab initio molecular dynamics offers a general framework for modeling thermodynamic properties of anharmonic solids [1, 2].
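Numerically, the quasiharmonic recipe of Eq. (11) reduces to a one-dimensional minimization of E_0(V) + F_vib(T, V) over a volume grid. The sketch below illustrates this with a deliberately artificial model — a quadratic E_0(V) and an Einstein-model F_vib whose frequency softens with volume through a Grüneisen-type exponent; every number is an illustrative assumption, not data from the references:

```python
import numpy as np

K_B = 8.617333e-5  # eV/K

# Hypothetical toy model (illustrative numbers, not a real material).
V0, B = 16.0, 0.4          # A^3/atom and curvature scale (eV/A^6)
THETA0, GAMMA = 400.0, 2.0  # Einstein temperature (K) and Grueneisen exponent

def e0(v):
    """Static lattice energy E0(V)."""
    return 0.5 * B * (v - V0) ** 2

def f_vib(T, v):
    """Einstein-model vibrational free energy; frequency softens as V grows."""
    theta = THETA0 * (V0 / v) ** GAMMA
    return 3 * K_B * T * np.log(2 * np.sinh(theta / (2 * T)))

def v_star(T, vgrid):
    """Equilibrium volume V*(T): minimize F(T,V) = E0(V) + Fvib(T,V) (Eq. 11)."""
    f = e0(vgrid) + f_vib(T, vgrid)
    i = int(np.clip(np.argmin(f), 2, len(vgrid) - 3))
    a, b, _ = np.polyfit(vgrid[i - 2:i + 3], f[i - 2:i + 3], 2)  # local quadratic
    return -b / (2 * a)

vgrid = np.linspace(15.0, 18.0, 301)
v_ref = v_star(50.0, vgrid)
for T in (50.0, 300.0, 600.0, 900.0):
    v = v_star(T, vgrid)
    print(f"T={T:5.0f} K  V*={v:.4f} A^3  dV/V vs 50 K = {(v - v_ref)/v_ref:+.3%}")
```

Because the frequencies soften with increasing volume (positive Grüneisen parameter), the entropy term pushes V*(T) outward with temperature, which is precisely the thermal expansion plotted in Fig. 3.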
3. Thermodynamics of Compositionally Disordered Solids
We now relax the main assumption made in the previous section, by allowing atoms to exit the neighborhood of their local equilibrium position. This is accomplished by considering every possible way to arrange the atoms on a given lattice. As illustrated in Fig. 1(b), the state of order of an alloy can be described by occupation variables σ_i specifying the chemical identity of the atom associated with lattice site i. In the case of a binary alloy, the occupations are traditionally chosen to take the values +1 or -1, depending on the chemical identity of the atom. Returning to Eq. (2), all the thermodynamic information of a system is contained in its partition function Z, and in the case of a crystalline alloy system, the sum over all possible states of the system can be conveniently factored as follows:

Z = \sum_{\sigma} \sum_{v \in \sigma} \sum_{e \in v} \exp[-\beta E(\sigma, v, e)]    (12)
where β = (k_B T)^{-1} and where
• σ denotes a configuration (i.e., the vector of all occupation variables);
• v denotes the displacement of each atom away from its local equilibrium position;
• e is a particular electronic state when the nuclei are constrained to be in a state described by σ and v; and
• E(σ, v, e) is the energy of the alloy in a state characterized by σ, v and e.
Each summation defines an increasingly coarser level of hierarchy in the set of microscopic states. For instance, the sum over v includes all displacements such that the atoms remain close to the undistorted configuration σ. Equation (12) implies that the free energy of the system can be written as

F(T) = -k_B T \ln \sum_{\sigma} \exp[-\beta F(\sigma, T)]    (13)
where F(σ, T) is nothing but the free energy of an alloy with a fixed atomic configuration, as obtained in the previous section:

F(\sigma, T) = -k_B T \ln \sum_{v \in \sigma} \sum_{e \in v} \exp[-\beta E(\sigma, v, e)]    (14)
The so-called "coarse graining" of the partition function illustrated by Eq. (13) enables, in principle, an exact mapping of a real alloy onto a simple lattice model characterized by the occupation variables σ and a temperature-dependent Hamiltonian F(σ, T) [7, 8].
Although we have reduced the problem of modeling the thermodynamic properties of configurationally disordered solids to a more tractable calculation for a lattice model, the above formalism would still require the calculation of the free energy for every possible configuration σ, which is computationally intractable. Fortunately, the configurational dependence of the free energy can often be parametrized using a convenient expansion known as a cluster expansion [7, 9]. This expansion takes the form of a polynomial in the occupation variables:

F(\sigma, T) = J_{\emptyset} + \sum_{i} J_i \sigma_i + \sum_{i,j} J_{ij} \sigma_i \sigma_j + \sum_{i,j,k} J_{ijk} \sigma_i \sigma_j \sigma_k + \cdots
where the so-called effective cluster interactions (ECI) J_∅, J_i, J_{ij}, ..., need to be determined. The cluster expansion can be recast into a form which exploits the symmetry of the lattice by regrouping the terms as follows:

F(\sigma, T) = \sum_{\alpha} m_{\alpha} J_{\alpha} \left\langle \prod_{i \in \alpha'} \sigma_i \right\rangle

where α is a cluster (i.e., a set of lattice sites), the summation is taken over all clusters α that are symmetrically distinct, and the average ⟨···⟩ is taken over all clusters α′ that are symmetrically equivalent to α. The multiplicity m_α weights each term by the number of symmetrically equivalent clusters in a given reference volume (e.g., a unit cell). While the cluster expansion is presented here in the context of binary alloys, an extension to multicomponent alloys (where σ_i can take more than two different values) is straightforward [9].

It can be shown that when all clusters α are considered in the sum, the cluster expansion is able to represent any function of configuration σ by an appropriate selection of the values of J_α. However, the real advantage of the cluster expansion is that, for many systems, it is found to converge rapidly. An accuracy that is sufficient for phase diagram calculations can often be achieved by keeping only clusters α that are relatively compact (e.g., short-range pairs or small triplets, as illustrated in the left panel of Fig. 4). The unknown parameters of the cluster expansion (the ECI J_α) can then be determined by fitting them to F(σ, T) for a relatively small number of configurations σ obtained from first-principles computations. Once the ECI have been determined, the free energy of the alloy for any given configuration can be quickly calculated, making it possible to explore a large number of configurations without recalculating the free energy of each of them from first principles.

Figure 4. Typical choice of clusters (left) and structures (right) used for the construction of a cluster expansion on the hcp lattice. Big circles, small circles and crosses represent consecutive close-packed planes of the hcp lattice. Concentric circles represent two sites, one above the other in the [0001] direction. The unit cell of the structures (right) along the (0001) plane is indicated by lines, while the third lattice vector, along [0001], is identical to the one of the hcp primitive cell. (Adapted, with the permission of the authors, from Ref. [10], a first-principles study of the metastable hcp phase diagram of the Ag–Al system.)

In some applications the development of a converged cluster expansion can be complicated by the presence of long-ranged interatomic interactions mediated by electronic-structure (Fermi-surface), electrostatic and/or elastic effects. Long-ranged interactions lead to an increase in the number of ECIs that must be computed, and a concomitant increase in the number of configurations that must be sampled to derive them. For metals it has been demonstrated how long-ranged electronic interactions can be derived from perturbation theory, using coherent-potential approximations to the electronic structure of a configurationally disordered solid as a reference state [11]. Effective approaches to modeling long-ranged elastically mediated interactions have also been formulated [12]. Such elastic effects are known to be particularly important in describing the thermodynamics of mixtures of species with very large differences in atomic "size".

The cluster expansion tremendously simplifies the search for the lowest-energy configuration at each composition of the alloy system. Determining these ground states is important because they determine the general topology of the alloy phase diagram. Each ground state is typically associated with one of the stable phases of the alloy system. There are three main approaches to identifying the ground states of an alloy system.

With the enumeration method, all the configurations whose unit cell contains less than a given number of atoms are enumerated and their energy
is quickly calculated using the value of F(σ, 0) predicted from the cluster expansion. The energy of each structure can then be plotted as a function of its composition (see Fig. 5) and the points touching the lower portion of the convex hull of all points indicate the ground states. While this method is approximate, as it ignores ground states with unit cell larger than the given threshold, it is simple to implement and has been found to be quite reliable, thanks to the fact that most ground states indeed have a small unit cell. Simulated annealing offers another way to find the ground states. It proceeds by generating random configurations via MC simulations using the Metropolis algorithm (G. Gilmer, Chapter 2) that mimic the ensemble sampled in thermal equilibrium at a given temperature. As the temperature is lowered, the simulation should converge to the ground state. Thermal fluctuations are used as an effective means of preventing the system from getting trapped in local minima of energy. While the constraints on the unit cell size are considerably relaxed relative to the enumeration method, the main disadvantage of this method is that, whenever the simulation cell size is not an exact multiple of the ground state unit cell, artificial defects will be introduced in the simulation that need to be manually identified and removed. Also, the risk of obtaining local rather than global minima of energy is not negligible and must be controlled by adjusting the rate of decay of the simulation temperature.
Figure 5. Ground-state search using the enumeration method in the Sc_x□_{1-x}S system (□ denotes a vacancy). Diamonds represent the formation energies of about 3 × 10^6 structures, predicted from a cluster expansion fitted to LDA energies. The ground states, indicated by open circles, are the structures whose formation energy touches the convex hull (solid line) of all points. (Reproduced from Ref. [13], with the permission of the authors.)
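The enumeration method of Fig. 5 amounts to evaluating a cluster expansion for every small-cell configuration and taking the lower convex hull of formation energy versus composition. The sketch below does this for a hypothetical one-dimensional lattice model with two pair ECIs — a toy stand-in for real two- or three-dimensional problems, not the Sc–S system of the figure; the ECI values are invented:

```python
import itertools

# Hypothetical nearest- and next-nearest-pair ECIs (eV/site).
J1, J2 = 0.020, -0.008

def formation_energy(cell):
    """Cluster-expansion energy per site of a periodically repeated 1D cell of
    occupation variables +/-1, relative to the pure end members (which here
    both have energy J1 + J2 per site)."""
    n = len(cell)
    e = sum(J1 * cell[i] * cell[(i + 1) % n] + J2 * cell[i] * cell[(i + 2) % n]
            for i in range(n)) / n
    return e - (J1 + J2)

# Enumerate every configuration with a unit cell of up to 8 sites.
best = {}
for n in range(2, 9):
    for cell in itertools.product((1, -1), repeat=n):
        x = cell.count(1) / n
        e = formation_energy(cell)
        if x not in best or e < best[x]:
            best[x] = e

# Lower convex hull (monotone-chain): pop points above each new tie line.
pts = sorted(best.items())
hull = [pts[0]]
for p in pts[1:]:
    while len(hull) >= 2:
        (x1, e1), (x2, e2) = hull[-2], hull[-1]
        if (e2 - e1) * (p[0] - x1) >= (p[1] - e1) * (x2 - x1):
            hull.pop()   # hull[-1] lies on or above the x1 -> p tie line
        else:
            break
    hull.append(p)
print("ground-state line (x, Ef):", [(round(x, 3), round(e, 4)) for x, e in hull])
```

For these ECIs the hull picks out the alternating x = 1/2 ordering, the 1D analogue of the ordered ground states that anchor a phase diagram's topology.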
Finally, there exists an exact, although computationally demanding, algorithm to identify the ground states [14]. This approach relies on the fact that the cluster expansion is linear in the correlations \bar{\sigma}_{\alpha} \equiv \langle \prod_{i \in \alpha} \sigma_i \rangle. Moreover, it can be shown that the set of correlations \bar{\sigma}_{\alpha} that correspond to "real" structures can be defined by a set of linear inequalities. These inequalities are the result of lattice-specific geometric constraints, and there exist systematic methods to generate them [14]. As an example of such constraints, consider the fact that it is impossible to construct a binary configuration on a triangular lattice where the nearest-neighbor pair correlations take the value -1 (i.e., where all nearest neighbors are between different atomic species). Since both the objective function and the constraints are linear in the correlations, linear programming techniques can be used to determine the ground states. The main difficulty associated with this method is the fact that the resulting linear programming problem involves a number of dimensions and a number of inequalities that grow exponentially fast with the range of interactions included in the cluster expansion.

Once the ground states have been identified, thermodynamic properties at finite temperature must be obtained. Historically, the infinite summation defining the alloy partition function has been approximated through various mean-field methods [7, 14]. However, the difficulties associated with extending such methods to systems with medium- to long-ranged interactions, and the increase in available computational power enabling MC simulations to be directly applied, have led to reduced reliance upon these techniques more recently. MC simulations readily provide thermodynamic quantities such as energy or composition by making use of the fact that averages over an infinite ensemble of microscopic states can be accurately approximated by averages over a finite number of states generated by "importance" sampling. Moreover, quantities such as the free energy, which cannot be written as ensemble averages, can nevertheless be obtained via thermodynamic integration (Frenkel, Chapter 2; de Koning, Chapter 2), using standard thermodynamic relationships to rewrite the free energy in terms of integrals of quantities that can be obtained via ensemble averages. For instance, since energy E(T) and free energy F(T) are related through E(T) = ∂(F(T)/T)/∂(1/T), we have

\frac{F(T)}{T} - \frac{F(T_0)}{T_0} = -\int_{T_0}^{T} \frac{E(T')}{T'^2}\, dT'    (15)
and free energy differences can therefore be obtained from MC simulations providing E(T). Figures 6 and 7 show two phase diagrams obtained by combining first-principles calculations, the cluster expansion formalism and MC simulations, an approach which offers the advantage of handling, in a unified framework, both ordered phases (with potential thermal defects) and disordered phases (with potential short-range order).

Figure 6. Calculated composition–temperature phase diagram for a metastable hcp Ag–Al alloy. Note that the cluster expansion formalism enables a unified treatment of both solid solutions and ordered compounds. (Reproduced from Ref. [10], with the permission of the authors.)

Figure 7. Calculated composition–temperature solid-state phase diagram for a rocksalt-type CaO–MgO alloy. The inclusion of lattice vibrations via the coarse-graining formalism is seen to substantially improve agreement with experimental observations (filled circles). (Reproduced from Ref. [15], with the permission of the authors.)
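Equation (15) is straightforward to apply once Monte Carlo energies E(T) are in hand. A minimal sketch, assuming synthetic E(T) data and taking F(T_0) ≈ E(T_0) at the lowest temperature (i.e., neglecting the entropy there — an approximation that must be justified case by case):

```python
import numpy as np

K_B = 8.617333e-5  # eV/K

def free_energy_from_energies(T_grid, E_grid, F0):
    """Integrate Eq. (15): F(T)/T - F(T0)/T0 = -int_{T0}^{T} E(T')/T'^2 dT'.

    T_grid : increasing temperatures, T_grid[0] = T0 (K)
    E_grid : ensemble-average energies E(T) from MC at each temperature (eV)
    F0     : free energy at the anchor temperature T0
    Returns F on the whole grid (trapezoidal rule between grid points)."""
    integrand = E_grid / T_grid**2
    cumulative = np.concatenate(
        ([0.0],
         np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(T_grid))))
    return T_grid * (F0 / T_grid[0] - cumulative)

# Hypothetical E(T), standing in for lattice-model Monte Carlo output.
T = np.linspace(100.0, 1000.0, 10)
E = -0.050 + 1.5 * K_B * T          # eV/site; constant "heat capacity" toy model
F = free_energy_from_energies(T, E, F0=E[0])
print(np.round(F, 4))
```

In practice one would use a finer temperature grid and anchor F0 with a low-temperature expansion of the relevant ground state.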
4. Liquids and Melting Transitions
While first-principles thermodynamic methods have found the widest application in studies of solids, recent progress has been realized also in the development and application of methods for ab initio calculations of solid–liquid phase boundaries. This section provides a brief overview of such methods, based upon the application of thermodynamic integration methods within the framework of ab initio molecular dynamics simulations.

Consider the ab initio calculation of the melting point for an elemental system, as was first demonstrated by Sugino and Car [1] in an application to elemental Si. The approach is based on the use of thermodynamic-integration methods to compute temperature-dependent free energies for bulk solid and liquid phases. Let U_1(r_1, r_2, ..., r_N) denote the DFT potential energy for a collection of ions at positions (r_1, ..., r_N), while U_0(r_1, r_2, ..., r_N) corresponds to the energy of the same collection of ions described by a reference classical-potential model. We suppose that the free energy of the reference system, F_0, has been accurately calculated, either analytically (as in the case of an Einstein crystal) or using the atomistic simulation methods reviewed by Kofke and Frenkel in Chapter 2. We proceed to calculate the difference F_1 - F_0 between the DFT free energy (F_1) and F_0 employing the statistical-mechanical relation:

F_1 - F_0 = \int_0^1 \left\langle \frac{dU_{\lambda}}{d\lambda} \right\rangle_{\lambda} d\lambda = \int_0^1 \langle U_1 - U_0 \rangle_{\lambda}\, d\lambda    (16)
where the brackets ⟨···⟩_λ denote an average over the ensemble generated by the potential energy U_λ = λU_1 + (1 - λ)U_0. In practice, ⟨···⟩_λ can be calculated from a time average over an MD trajectory generated with forces derived from the hybrid energy U_λ. The integral in Eq. (16) is evaluated from results computed for a discrete set of λ values, or from a time average over a simulation where λ is slowly "switched" on from zero to one. Practical applications of this approach rely on the careful choice of the reference system to provide energies that are sufficiently "close" to DFT to allow the ensemble averages in Eq. (16) to be precisely calculated from relatively short MD simulations. It should be emphasized that the approach outlined in this paragraph, when applied to the solid phase, provides a framework for accurately calculating anharmonic contributions to the vibrational free energy.

Figure 8 shows results derived from the above procedure by Sugino and Car [1] in an application to elemental Si (using the Stillinger–Weber potential as a reference system). Temperature-dependent chemical potentials for solid and liquid phases (referenced to the zero-temperature free energy of the crystal) are plotted with symbols and are compared to experimental data represented by the dashed lines. It can be seen that the temperature dependence of the solid and liquid free energies (i.e., the slopes of the curves in Fig. 8) is accurately predicted. Relative to the solid, the liquid chemical potentials are approximately 0.1 eV/atom lower than experiment, leading to a calculated melting temperature that is approximately 300 K lower than the measured value. Comparable and even somewhat higher accuracies have been demonstrated in more recent applications of this approach to the calculation of melting temperatures in elemental metal systems (see, e.g., the references cited in [2]).

Figure 8. Calculated chemical potential of solid and liquid silicon. Full lines correspond to theory and dashed lines to experiments. (Reproduced from Ref. [1], with the permission of the authors.)

The above formalism has been extended as a basis for calculating solid and liquid chemical potentials in binary mixtures [2]. In this application, thermodynamic integration for the liquid phase is used to compute the change in free energy accompanying the continuous interconversion of atoms from solute to solvent species. Such calculations form the basis for extracting solute and solvent atom chemical potentials. For the solid phase, the vibrational free energy of formation of substitutional impurities is extracted either within the harmonic approximation (along the lines described above) and/or from thermodynamic integration to derive anharmonic contributions. In applications to Fe-based systems relevant to studies of the Earth's core, the approach has been used to compute the equilibrium partitioning of solute atoms between
solid and liquid phases in binary mixtures at pressures that are beyond the range of direct experimental measurements.
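Given the time-averaged values of ⟨U_1 - U_0⟩_λ from a handful of MD runs, the coupling-constant integration of Eq. (16) is a simple quadrature. A sketch with hypothetical averages (the λ grid and values are invented for illustration):

```python
import numpy as np

def delta_f_lambda(lambdas, du_means):
    """Evaluate Eq. (16) by quadrature:
    F1 - F0 = int_0^1 <U1 - U0>_lambda dlambda.

    lambdas  : grid of coupling values in [0, 1]
    du_means : time-averaged <U1 - U0> from an MD run at each lambda,
               with forces derived from U_lambda = lambda*U1 + (1-lambda)*U0"""
    return np.trapz(du_means, lambdas)

# Hypothetical averages (eV/atom) from five MD runs; a smooth, weakly
# curved integrand is the signature of a well-chosen reference potential.
lam = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
du = np.array([-0.210, -0.185, -0.168, -0.157, -0.150])
print(f"F1 - F0 = {delta_f_lambda(lam, du):+.4f} eV/atom")
```

Repeating the integration for solid and liquid phases at several temperatures yields the chemical potential curves of Fig. 8, whose crossing locates the melting point.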
5. Outlook
The techniques described in this article provide a framework for computing the thermodynamic properties of elements and alloys from first principles, i.e., requiring, in principle, only the atomic numbers of the elemental constituents as input. In the most favorable cases, these methods have been demonstrated to yield finite-temperature thermodynamic properties with an accuracy that is limited only by the approximations inherent in electronic DFT. For a growing number of metallic alloy systems, such accuracy can be comparable to that achievable in direct measurements of thermodynamic properties. In such cases, ab initio methods have found applications as a framework for augmenting the experimental databases that form the basis of "computational-thermodynamics" modeling in the design of alloy microstructure. First-principles methods offer the advantage of being able to provide estimates of thermodynamic properties in situations where direct experimental measurements are difficult due to constraints imposed by sluggish kinetics, metastability or extreme conditions (e.g., high pressures or temperatures). In the development of new materials, first-principles methods can be employed as a framework for rapidly assessing the thermodynamic stability of hypothetical structures before they are synthesized. With the continuing increase in computational power and improvements in the accuracy of first-principles electronic-structure methods, it is anticipated that ab initio techniques will find growing applications in predictive studies of phase stability for a wide range of materials systems.
References

[1] O. Sugino and R. Car, "Ab initio molecular dynamics study of first-order phase transitions: melting of silicon," Phys. Rev. Lett., 74, 1823, 1995.
[2] D. Alfè, M.J. Gillan, and G.D. Price, "Ab initio chemical potentials of solid and liquid solutions and the chemistry of the Earth's core," J. Chem. Phys., 116, 7127, 2002.
[3] N.D. Mermin, "Thermal properties of the inhomogeneous electron gas," Phys. Rev., 137, A1441, 1965.
[4] A.A. Maradudin, E.W. Montroll, and G.H. Weiss, Theory of Lattice Dynamics in the Harmonic Approximation, 2nd edn., Academic Press, New York, 1971.
[5] C. Wolverton and V. Ozoliņš, "Entropically favored ordering: the metallurgy of Al2Cu revisited," Phys. Rev. Lett., 86, 5518, 2001.
[6] A.A. Quong and A.Y. Liu, "First-principles calculations of the thermal expansion of metals," Phys. Rev. B, 56, 7767, 1997.
[7] D. de Fontaine, "Cluster approach to order-disorder transformation in alloys," Solid State Phys., 47, 33, 1994.
[8] A. van de Walle and G. Ceder, "The effect of lattice vibrations on substitutional alloy thermodynamics," Rev. Mod. Phys., 74, 11, 2002.
[9] J.M. Sanchez, F. Ducastelle, and D. Gratias, "Generalized cluster description of multicomponent systems," Physica, 128A, 334, 1984.
[10] N.A. Zarkevich and D.D. Johnson, "Predicted hcp Ag–Al metastable phase diagram, equilibrium ground states, and precipitate structure," Phys. Rev. B, 67, 064104, 2003.
[11] G.M. Stocks, D.M.C. Nicholson, W.A. Shelton, B.L. Gyorffy, F.J. Pinski, D.D. Johnson, J.B. Staunton, P.E.A. Turchi, and M. Sluiter, "First principles theory of disordered alloys and alloy phase stability," In: P.E. Turchi and A. Gonis (eds.), NATO ASI on Statics and Dynamics of Alloy Phase Transformation, vol. 319, Plenum Press, New York, p. 305, 1994.
[12] C. Wolverton and A. Zunger, "An Ising-like description of structurally-relaxed ordered and disordered alloys," Phys. Rev. Lett., 75, 3162, 1995.
[13] G.L. Hart and A. Zunger, "Origins of nonstoichiometry and vacancy ordering in Sc_{1-x}□_x S," Phys. Rev. Lett., 87, 275508, 2001.
[14] F. Ducastelle, Order and Phase Stability in Alloys, Elsevier Science, New York, 1991.
[15] P.D. Tepesch, A.F. Kohan, G.D. Garbulsky et al., "A model to compute phase diagrams in oxides with empirical or first-principles energy methods and application to the solubility limits in the CaO–MgO system," J. Am. Ceram. Soc., 79, 2033, 1996.
1.17 DIFFUSION AND CONFIGURATIONAL DISORDER IN MULTICOMPONENT SOLIDS
A. Van der Ven and G. Ceder
Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
1. Introduction
Atomic diffusion in solids is a kinetic property that affects the rates of important nonequilibrium phenomena in materials. The kinetics of atomic redistribution in response to concentration gradients determine not only the speed, but often also the mechanism by which phase transformations in multicomponent solids occur. In electrode materials for batteries and fuel cells, high mobilities of specific ions ranging from lithium or sodium to oxygen or hydrogen are essential. In many instances, diffusion occurs in nondilute regimes in which different migrating atoms interact with each other. For example, lithium intercalation compounds such as Li_xCoO_2 and Li_xC_6, which serve as electrodes in lithium-ion batteries, can undergo large variations in lithium concentration, ranging from very dilute concentrations to complete filling of all interstitial sites available for Li in the host. In nondilute regimes, diffusing atoms interact with each other, both electronically and elastically. A complete theory of nondilute diffusion in multicomponent solids needs to account for the dependence of the energy and migration barriers on the configuration of diffusing ions.

In this chapter, we present the formalism to describe and model diffusion in multicomponent solids. With tools from alloy theory to describe configurational thermodynamics [1–3], it is now possible to rigorously calculate diffusion coefficients in nondilute alloys from first principles. The approach relies on the use of the alloy cluster expansion, which has proven to be an invaluable statistical mechanical tool that links first-principles energies to the thermodynamic and kinetic properties of solids with configurational disorder. Although diffusion is a nonequilibrium phenomenon, diffusion coefficients can nevertheless be calculated by considering fluctuations at equilibrium using Green–Kubo relations [4]. We first elaborate on the atomistic mechanisms of diffusion in solids with interacting diffusing species. This is followed with a discussion of the relevant Green–Kubo expressions for diffusion coefficients. We then introduce the cluster expansion formalism to describe the configurational energy of a multicomponent solid. We conclude with several examples of first-principles calculations of diffusion coefficients in multicomponent solids.
2. Migration in Solids with Configurational Disorder
Multicomponent crystalline solids under most thermodynamic boundary conditions are characterized by a certain degree of configurational disorder. The most extreme example of configurational disorder occurs in a solid solution, in which on average the arrangements of the different components of the solid approximate randomness. But even ordered compounds exhibit some degree of disorder due to thermal excitations or slight off-stoichiometry of the bulk composition. Atoms diffusing over the crystal sites of a disordered solid sample a variety of different local environments along their trajectories. Diffusion in most crystals can be characterized as a Markov process whereby atoms completely thermalize after each hop before migrating to the next site along their trajectories. Hence each hop is independent of all previous hops. With reasonable accuracy, the rate at which individual atomic hops occur can be described with transition state theory according to

\Gamma = \nu^* \exp\left( \frac{-E_b}{k_B T} \right)    (1)

where ν* is a vibrational prefactor (having units of Hz) and E_b is an activation barrier. Within the harmonic approximation, the vibrational prefactor is a ratio between the vibrational eigenmodes of the solid at the initial state of the hop and the vibrational eigenmodes when the migrating atom is at the activated state [5]. In the presence of configurational disorder, the activation barrier and frequency prefactor depend on the local arrangement of atoms around the migrating atom. Modeling of diffusion in a multicomponent system therefore requires a knowledge of the dependence of E_b and ν* on configuration. The configuration dependence of E_b is especially important, as the hop frequency Γ depends on it exponentially. We restrict ourselves here to migration that occurs by individual atomic hops to adjacent vacant sites. Hence we do not consider diffusion that occurs through either a ring or interstitialcy mechanism. We also make a distinction between diffusion of interstitial species and substitutional species.
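A one-line numerical reading of Eq. (1), using barriers of the magnitude discussed for Li_xCoO_2 in the next section and an assumed (typical, but hypothetical here) prefactor ν* = 10^13 Hz:

```python
import math

K_B = 8.617333e-5  # eV/K

def hop_rate(nu_star, e_b, T):
    """Transition-state-theory hop frequency of Eq. (1)."""
    return nu_star * math.exp(-e_b / (K_B * T))

# ~0.8 eV single-vacancy vs ~0.3 eV divacancy-mediated barriers (see Sec. 2.1).
for e_b in (0.8, 0.3):
    print(f"E_b = {e_b} eV  ->  Gamma(300 K) = {hop_rate(1e13, e_b, 300.0):.2e} Hz")
```

The exponential sensitivity is striking: the half-eV difference between the two mechanisms changes the room-temperature hop rate by roughly eight orders of magnitude.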
2.1. Interstitial Diffusion
Interstitial diffusion occurs in many important materials. A common example is the diffusion of carbon atoms over the interstitial sites of bcc or fcc iron (i.e., steel). Many phase transformations in steel involve the redistribution of carbon atoms between growing precipitate phases and the consumed matrix phase. A defining characteristic of interstitial diffusion is the existence of an externally imposed lattice of sites over which atoms can diffuse. In steel, the crystallized iron atoms create the interstitial sites for carbon. A similar situation exists in Li_xCoO_2, in which a crystalline CoO_2 host creates an array of interstitial sites that can be occupied by lithium. While in Li_xCoO_2 the lithium concentration x can be varied from 0 to 1, in steel FeC_y the carbon concentration y is typically very low. Individual carbon atoms interfere minimally with each other as they wander over the interstitial sites of iron. In Li_xCoO_2, however, as the lithium concentration is typically large, migrating lithium atoms interact strongly with each other and influence each other's diffusive trajectories.

Another type of system that we place in the category of interstitial diffusion is adatom diffusion on the surface of a crystalline substrate. Often a crystalline surface creates an array of well-defined sites on which adsorbed atoms reside, such as the fcc sites on a (111) terminated surface of an fcc crystal. Diffusion then involves the migration of adsorbed atoms over these surface sites.

The presence of many diffusing atoms creates a state of configurational disorder over the interstitial sites that evolves over time as a result of the activated hops of individual atoms. Not only does the activation barrier of a migrating atom depend on the local arrangement of the surrounding interstitial atoms, but the migration mechanism can also depend on that arrangement. This is the case in Li_xCoO_2, a layered compound consisting of close-packed oxygen planes stacked with an ABCABC sequence. Between the oxygen layers are alternating layers of Li and Co, which occupy octahedral sites of the oxygen sublattice. Within each lithium plane, the lithium ions occupy a two-dimensional triangular lattice. As lithium is removed from LiCoO_2, vacancies are created in the lithium planes. First-principles density functional theory calculations (LDA) have shown that two migration mechanisms for lithium exchange with an adjacent vacancy exist, depending on the arrangement of surrounding lithium atoms [3]. This is illustrated in Fig. 1. If the two sites adjacent to the end points of the hop (sites (a) and (b) in Fig. 1a) are simultaneously occupied by lithium ions, then the migration mechanism follows a direct path, passing through a dumbbell of oxygen atoms. The calculated activation barrier for this mechanism is high, approaching 0.8 eV. This mechanism occurs when lithium migrates to an isolated vacancy. If, however, one or both of the sites adjacent to the end points of the hop are vacant (Fig. 1b), then the migrating lithium follows a curved path which passes through an adjacent tetrahedral site, out of the plane formed by the Li sites. For this mechanism, the activation barrier is low, taking values in the vicinity of 0.3–0.4 eV. This mechanism occurs when lithium migrates into a divacancy. Comparison of the activation barriers for the two mechanisms clearly shows that lithium diffusion mediated by divacancies is more rapid than by single vacancies. Nevertheless, we can already anticipate that the availability of divacancies will depend on the overall lithium concentration.

Figure 1. Two lithium migration mechanisms in Li_xCoO_2, depending on the arrangement of lithium ions around the migrating ion. (a) When both sites a and b are occupied by Li, the migrating lithium performs a direct hop whereby it has to squeeze through a dumbbell of oxygen ions. This mechanism occurs when the migrating lithium ion hops into an isolated vacancy (square). (b) When either site a or site b is vacant, the migrating lithium ion performs a curved hop whereby it passes through a tetrahedrally coordinated site. This mechanism occurs when the migrating atom hops into a divacancy.

The complexity of diffusion in a disordered solid is evident in Fig. 2, which schematically illustrates a typical disordered arrangement of lithium atoms within a lithium plane of Li_xCoO_2. Hop 1, for example, must occur with a large activation barrier, as the lithium is migrating to an isolated vacancy. In hop 2, lithium migrates to a vacant site that belongs to a divacancy and hence follows a curved path passing through an adjacent tetrahedral site, characterized by a low activation barrier. In hop 3, lithium migrates to a vacant site belonging to two divacancies simultaneously, and hence has two low-energy paths available. Similar complexities can be expected for adatom diffusion on crystalline substrates.

Figure 2. A typical disordered lithium-vacancy arrangement within the lithium planes of Li_xCoO_2. In a given lithium-vacancy arrangement, several different migration mechanisms can occur.
2.2. Substitutional Diffusion
Substitutional diffusion is qualitatively different from interstitial diffusion in that an externally imposed lattice of sites for the diffusing atoms is absent. Instead, the diffusing atoms themselves form the network of crystal sites. This describes the situation for most metallic and semiconductor alloys. Vacancies with which to exchange do exist in these crystalline alloys; however, their concentrations are often very dilute. Examples where substitutional diffusion is relevant are alloys such as Si–Ge, Al–Ti and Al–Li, in which the different species reside on the same crystal structure and migrate by exchanging with vacancies. As with interstitial compounds, widely varying degrees of local order or disorder exist, affecting migration barriers. Al_{1-x}Li_x, for example, is metastable on the fcc crystal structure for low x and forms an ordered L1_2 compound at x = 0.25. Diffusion within a solid solution is different than in the ordered compound, as the local arrangements of Li and Al are different.

Figure 3 illustrates a diffusive hop of an Al atom to a neighboring vacancy within the ordered L1_2 Al_3Li phase. The energy along the migration path as calculated with LDA is also illustrated in Fig. 3. Clearly, the vacancy prefers the Li sublattice, as the energy of the solid increases as the vacancy migrates from the Li sublattice to the Al sublattice by exchanging with an Al atom.

Figure 3. The energy along the migration path of an Al atom hopping into a vacancy (square) on the lithium sublattice of L1_2 Al_3Li. Lighter atoms are Al, darker atoms are Li.
3. Green–Kubo Expressions for Diffusion
While diffusion is complex at the atomic length scale, of central importance at the macroscopic length scale is the rate with which gradients in concentration dissipate. These rates can be described by diffusion coefficients that relate atomic fluxes to gradients in concentration. Green–Kubo methods make it possible to link kinetic coefficients to microscopic fluctuations of appropriate quantities at equilibrium. In this section we present the relevant Green–Kubo equations that allow us to calculate diffusion coefficients in multi-component solids from first-principles. We again make a distinction between interstitial and substitutional diffusers.
3.1. Interstitial Diffusion
3.1.1. Single component diffusion

For a single component occupying interstitial sites of a host, such as carbon in iron or Li in Li_xCoO_2, irreversible thermodynamics [4] stipulates that a net flux J of particles occurs when a gradient in the chemical potential μ of the interstitial species exists, according to

J = -L\nabla\mu    (2)

where L is a kinetic coefficient that depends on the mobility of the diffusing atoms. Often it is more practical to express the flux in terms of a concentration gradient instead of a chemical potential gradient, as the former is more accessible experimentally:

J = -D\nabla C.    (3)
D in Eq. (3) is the diffusion coefficient, and the concentration C refers to the number of interstitial particles per unit volume. While the true driving force for diffusion is a gradient in chemical potential, it is nevertheless possible to work with Eq. (3) provided the diffusion coefficient is expressed as

D = L\frac{d\mu}{dC}.    (4)
Hence the diffusion coefficient consists of a kinetic factor L and a thermodynamic factor dμ/dC. The validity of irreversible thermodynamics is restricted to systems that are not too far removed from equilibrium. To quantify this, it is useful to mentally divide the solid into small subregions that are microscopically large enough for thermodynamic variables to be meaningful, yet macroscopically small enough that the same thermodynamic quantities can be considered constant within each subregion. Hence, although the solid itself is removed from equilibrium, each subregion is locally at equilibrium. This is called the local equilibrium approximation, and it is within this approximation that the linear kinetic equation Eq. (2) is considered valid. Within the local equilibrium approximation, the kinetic parameters D and L can be derived by a consideration of relevant fluctuations at thermodynamic equilibrium. Crucial in this derivation is the assumption made by Onsager in his proof of the reciprocity relations of kinetic parameters, that the regression of a fluctuation of a particular extensive property around its equilibrium value occurs on average according to the same linear phenomenological laws as those governing the regression of artificially induced fluxes of the same extensive quantity [4]. This regression hypothesis is a consequence of the fluctuation–dissipation theorem of nonequilibrium statistical mechanics [6].
Several techniques, collectively referred to as Green–Kubo methods, exist to link microscopic fluctuations to macroscopic kinetic quantities [7–9]. Neglecting crystallographic anisotropy, the Green–Kubo expression for the kinetic factor for diffusion can be written as [10–12]

L = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}(t) \right)^2 \right\rangle}{(2d)\, t\, M v_s k_B T}    (5)
where \vec{R}_{\zeta}(t) is the vector connecting the end points of the trajectory of particle ζ after a time t, M refers to the total number of interstitial sites available, v_s is the volume per interstitial site, k_B is the Boltzmann constant, T is the temperature and d refers to the dimension of the interstitial network. The brackets indicate an ensemble average performed at equilibrium. Often, the diffusion coefficient is also written in an equivalent form as [10]

D = D_J F    (6)

where

D_J = \frac{\left\langle \left( \sum_{\zeta} \vec{R}_{\zeta}(t) \right)^2 \right\rangle}{(2d)\, t\, N}    (7)

and

F = \frac{d(\mu/k_B T)}{d\ln(x)}.    (8)
N refers to the number of diffusing atoms and x = N/M to the fraction of filled interstitial sites. F is often called a thermodynamic factor, and D_J is sometimes called the jump-diffusion or self-diffusion coefficient. A common approximation is to neglect cross correlations between different diffusing species and to replace D_J with the tracer diffusion coefficient, defined as

D^* = \frac{\left\langle \vec{R}_{\zeta}^{\,2}(t) \right\rangle}{(2d)\, t}.    (9)
The difference between D_J and D* is that the former depends on the square of the displacement of all the particles, while the latter depends on the average of the square of the displacement of individual diffusing atoms. D_J is a measure of collective fluctuations of the center of mass of all the diffusing particles. Figure 4 compares D_J and D* calculated with kinetic Monte Carlo simulations for the Li_xCoO_2 system. Notice in Fig. 4 that D_J is systematically larger than D* for all lithium concentrations x, only approaching D* for dilute lithium concentrations.
Figure 4. A comparison of the self-diffusion coefficient D_J (crosses) and the tracer diffusion coefficient D* (squares) for lithium diffusion in Li_xCoO_2, calculated at 400 K.
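Given end-to-end displacement vectors from kinetic Monte Carlo runs, Eqs. (7) and (9) are direct to evaluate. The sketch below is our illustration (hop length, hop rate and run counts are arbitrary assumptions); it uses uncorrelated random walkers, for which D_J and D* must coincide within statistical noise — interatomic correlations are precisely what separates the two curves in Fig. 4:

```python
import numpy as np

def d_tracer(disp, t, d=2):
    """Tracer diffusion coefficient, Eq. (9): mean squared displacement
    of individual atoms divided by (2d)t."""
    return (disp ** 2).sum(axis=1).mean() / (2 * d * t)

def d_jump(disp_runs, t, d=2):
    """Jump (self) diffusion coefficient, Eq. (7). The ensemble average of
    the squared collective displacement requires many independent runs."""
    n = disp_runs[0].shape[0]
    sq = [r.sum(axis=0) @ r.sum(axis=0) for r in disp_runs]
    return np.mean(sq) / (2 * d * t * n)

# Synthetic demo: 64 independent kinetic-MC-like runs of 200 uncorrelated
# walkers on a 2D lattice (hop length a in cm, assumed hop rate 1e9 /s).
rng = np.random.default_rng(1)
a, n_hops, rate = 2.8e-8, 4000, 1.0e9
t = n_hops / rate
runs = [rng.choice([-a, 0.0, a], size=(200, n_hops, 2)).sum(axis=1)
        for _ in range(64)]
print("D*  =", d_tracer(np.concatenate(runs), t), "cm^2/s")
print("D_J =", d_jump(runs, t), "cm^2/s")
```

Note that D_J converges far more slowly than D*, since each run contributes only one sample of the collective displacement but N samples of individual displacements.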
For interstitial components, the chemical potential of the diffusing atoms is defined as

\mu = \frac{dG}{dN} = \frac{dg}{dx}    (10)

where G is the free energy of the crystal containing the interstitial species and g is the free energy normalized per interstitial site. While the thermodynamic factor is related to the chemical potential according to Eq. (8), it is often convenient to determine F by considering fluctuations in the number of interstitial atoms within the grand canonical ensemble (constant μ, T and M):

F = \frac{\langle N \rangle}{\langle N^2 \rangle - \langle N \rangle^2}    (11)
Diffusion involves redistribution of particles from subregions of the solid with a high concentration of interstitial atoms to other subregions with a low concentration. The thermodynamic factor describes the thermodynamic response to concentration fluctuations within subregions.
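Equation (11) turns grand-canonical fluctuation data directly into the thermodynamic factor. As a sanity check, the sketch below mimics GCMC output with binomially distributed site occupations — the non-interacting lattice gas, for which F = 1/(1 - x) exactly:

```python
import numpy as np

def thermodynamic_factor(n_samples):
    """Eq. (11): F = <N> / (<N^2> - <N>^2) from grand-canonical samples of
    the number of interstitial atoms N at fixed mu, T and M sites."""
    n = np.asarray(n_samples, dtype=float)
    return n.mean() / n.var()

# Non-interacting limit: each of M sites independently occupied with
# probability x, so binomial samples play the role of GCMC output.
rng = np.random.default_rng(2)
M, x = 400, 0.3
N = rng.binomial(M, x, size=20000)
print("sampled:", thermodynamic_factor(N), " exact:", 1.0 / (1.0 - x))
```

For an interacting system the samples would instead come from grand-canonical Monte Carlo on the cluster-expansion Hamiltonian discussed in Section 4.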
3.1.2. Two component system

A similar formalism emerges when two different species reside and diffuse over the same interstitial sites of a host. This is the case, for example, for carbon and nitrogen diffusion in iron, or lithium and sodium diffusion over the
interstitial sites of a transition metal oxide host. Referring to the two diffusing species as A and B, the flux equations become

J_A = -L_{AA}\nabla\mu_A - L_{AB}\nabla\mu_B
J_B = -L_{BA}\nabla\mu_A - L_{BB}\nabla\mu_B    (12)

where L_{ij} (i, j = A or B) are kinetic coefficients similar to L of Eq. (2). As with Eq. (2), gradients in chemical potential are often not readily accessible experimentally, and Eq. (12) can be written as

J_A = -D_{AA}\nabla C_A - D_{AB}\nabla C_B
J_B = -D_{BA}\nabla C_A - D_{BB}\nabla C_B.    (13)
where the matrix of diffusion coefficients

\begin{pmatrix} D_{AA} & D_{AB} \\ D_{BA} & D_{BB} \end{pmatrix}
=
\begin{pmatrix} L_{AA} & L_{AB} \\ L_{BA} & L_{BB} \end{pmatrix}
\begin{pmatrix} \frac{\partial \mu_A}{\partial C_A} & \frac{\partial \mu_A}{\partial C_B} \\ \frac{\partial \mu_B}{\partial C_A} & \frac{\partial \mu_B}{\partial C_B} \end{pmatrix}    (14)
can again be factorized into a product of a kinetic term (the 2 × 2 L matrix) and a thermodynamic factor (the 2 × 2 matrix of partial derivatives of the chemical potentials). The Green–Kubo expressions relating the macroscopic diffusion coefficients to atomic fluctuations are [13, 14]

L_{ij} = \frac{\left\langle \left( \sum_{\zeta} \vec{R}^{\,i}_{\zeta}(t) \right) \cdot \left( \sum_{\xi} \vec{R}^{\,j}_{\xi}(t) \right) \right\rangle}{(2d)\, t\, v_s M k_B T}.    (15)
where \vec{R}^{\,i}_{\zeta} is the vector linking the end points of the trajectory of atom ζ of species i after time t. Another factorization of D is practical when studying diffusion with a lattice model description of the interactions between the different constituents residing on the crystal network:

D = \tilde{L}\tilde{\Theta}    (16)

where

\tilde{L}_{ij} = \frac{\left\langle \left( \sum_{\zeta} \vec{R}^{\,i}_{\zeta}(t) \right) \cdot \left( \sum_{\xi} \vec{R}^{\,j}_{\xi}(t) \right) \right\rangle}{(2d)\, t\, M}    (17)
and

\tilde{\Theta}_{ij} = \frac{\partial(\mu_i/k_B T)}{\partial x_j}    (18)
are respectively matrices of kinetic coefficients and thermodynamic factors. As with the single-component interstitial systems, the chemical potentials for a binary-component interstitial system are defined as

\mu_i = \frac{\partial G}{\partial N_i} = \frac{\partial g}{\partial x_i}    (19)
where i refers to either A or B. The components of \tilde{\Theta} can also be written in terms of variances of the number of particles residing on the M-site crystal network at constant chemical potentials, that is, in terms of measures of fluctuations:

\tilde{\Theta} = \frac{M}{Q}
\begin{pmatrix}
\langle N_B^2 \rangle - \langle N_B \rangle^2 & -(\langle N_B N_A \rangle - \langle N_B \rangle \langle N_A \rangle) \\
-(\langle N_A N_B \rangle - \langle N_A \rangle \langle N_B \rangle) & \langle N_A^2 \rangle - \langle N_A \rangle^2
\end{pmatrix}    (20)

where

Q = \left( \langle N_A^2 \rangle - \langle N_A \rangle^2 \right)\left( \langle N_B^2 \rangle - \langle N_B \rangle^2 \right) - \left( \langle N_A N_B \rangle - \langle N_A \rangle \langle N_B \rangle \right)^2.
These fluctuations in N_A and N_B are to be calculated in the grand canonical ensemble at the chemical potentials μ_A and μ_B corresponding to the concentrations at which the diffusion coefficient is desired.
3.2. Substitutional Diffusion
The starting point for treating substitutional diffusion in a binary alloy is the Green–Kubo relations of Eqs. (14)–(18). However, several modifications and qualifications are necessary. These arise from the fact that alloys are characterized by a dilute concentration of vacancies and that the crystallographic sites on which the diffusing atoms reside are not created externally by a host, but are rather formed by the diffusing atoms themselves. The consequence of this for diffusion is that the chemical potentials appearing in the thermodynamic factor are not the conventional chemical potentials for the individual species A and B of a substitutional alloy, but are rather differences in chemical potentials between that of each diffusing species and the vacancy chemical potential. Hence the chemical potentials of Eqs. (12), (14) and (18) need to be replaced by \tilde{\mu}_i, in which

\tilde{\mu}_i = \mu_i - \mu_V    (21)
where μ_V is the vacancy chemical potential in the solid. The reason for this modification arises from the fact that the chemical potential appearing in the Green–Kubo expression for the diffusion coefficient matrix Eq. (14), and defined in Eq. (19), corresponds to the change in free energy as component i is added by holding the number of crystalline sites constant, meaning that i is added at the expense of vacancies. This differs from the conventional chemical potentials of alloys, which are defined as the change in free energy of the solid as component i is added by extending the crystalline network of the solid. \tilde{\mu}_i refers to the chemical potential for a fixed crystalline network, while μ_i and μ_V correspond to chemical potentials for a solid in which the crystalline network is enlarged as more species are added. The use of \tilde{\mu}_i instead of μ_i in the thermodynamic factor of the Green–Kubo expressions for the diffusion coefficients of crystalline solids also follows from irreversible thermodynamics [15, 16], as well as from thermodynamic considerations of crystalline solids [17]. It can also be understood on physical grounds. By dividing the crystalline solid up into subregions, diffusion can be viewed as the flow of particles from one subregion to the next. Because of the constraint imposed by the crystalline network, the only way for excess atoms from one subregion to be accommodated by a neighboring subregion is through the exchange of vacancies. One subregion gains vacancies, the other loses them. The change in free energy in each subregion due to diffusion occurs by adding or removing atoms at the expense of vacancies.

Another important modification to the treatment of binary interstitial diffusion is the identification of interdiffusion. Interdiffusion in its most explicit form refers to the dissipation of concentration gradients by the intermixing of A and B atoms. It is this phenomenon of intermixing that enters into continuum descriptions of diffusion couples and phase transformations involving atomic redistribution. Kehr et al. [18] demonstrated that in the limit of dilute vacancy concentrations, the full 2 × 2 diffusion matrix can be diagonalized, producing an eigenvalue λ_+ corresponding to density relaxations due to inhomogeneities in vacancies and an eigenvalue λ_- corresponding to interdiffusion. The diagonalization of the D matrix is accompanied by a coordinate transformation of the fluxes and the concentration gradients. In matrix notation,

J = -D\nabla C    (22)
where $J$ and $\nabla C$ are column vectors containing as elements $J_A$, $J_B$ and $\nabla C_A$, $\nabla C_B$, respectively. Diagonalization of $D$ leads to

$$D = E \lambda E^{-1} \qquad (23)$$
where $\lambda$ is a diagonal matrix with components $\lambda_+$ (the larger eigenvalue) and $\lambda_-$ (the smaller eigenvalue) in the notation of Kehr et al. [18], i.e.,

$$\lambda = \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix}$$
The flux equation (22) can then be rewritten as

$$E^{-1} J = -\lambda E^{-1} \nabla C. \qquad (24)$$
The eigenvalue $\lambda_-$, which describes the rate at which gradients in the concentrations of A and B atoms dissipate by an intermixing mode, is the most rigorous formulation of what is commonly referred to as an interdiffusion coefficient.
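To make the diagonalization concrete, the sketch below extracts $\lambda_+$ and $\lambda_-$ from a 2 × 2 diffusion matrix with NumPy; the numerical entries of D are purely illustrative, not taken from any system discussed here.

```python
import numpy as np

# Illustrative symmetric 2x2 diffusion matrix D (cm^2/s); the entries
# are invented for demonstration, not fitted to any particular alloy.
D = np.array([[2.0e-12, 0.5e-12],
              [0.5e-12, 1.0e-12]])

# Diagonalize D = E @ diag(lam) @ E^{-1}, as in Eq. (23).
lam, E = np.linalg.eig(D)
order = np.argsort(lam)[::-1]      # order eigenvalues so lam_+ >= lam_-
lam_plus, lam_minus = lam[order]
E = E[:, order]

# In the transformed coordinates E^{-1} J = -lam E^{-1} grad(C) (Eq. 24),
# each mode relaxes independently; lam_minus is the interdiffusion
# coefficient.
print(f"lambda_+ = {lam_plus:.3e} cm^2/s (vacancy density relaxation)")
print(f"lambda_- = {lam_minus:.3e} cm^2/s (interdiffusion)")
```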
4. Cluster Expansion
The Green–Kubo expressions for diffusion coefficients are proportional to the ensemble averages of the square of the collective distance travelled by the diffusing particles of the solid. Trajectories of interacting diffusing particles can be obtained with kinetic Monte Carlo simulations in which particles migrate on a crystalline network with migration rates given by Eq. (1). The migration rates of a specific atom, however, depend on the local arrangement of the other diffusing atoms through the configuration dependence of the activation barrier and frequency prefactor. Ideally, the activation barrier for each local environment could be calculated from first-principles. In practice, however, this is computationally intractable, as the number of configurations is exceedingly large and first-principles activation barrier calculations have a high computational cost. It is here that the cluster expansion formalism [1–3] becomes invaluable as a tool to extrapolate energy values calculated for a few configurations to determine the energy for any arrangement of atoms in a crystalline solid. In this section, we describe the cluster expansion formalism and how it can be applied to characterize the configuration dependence of the activation barrier for diffusion. We first focus on describing the configurational energy of atoms residing at their equilibrium sites, i.e., the configurational energy of the end points of any hop.
4.1. General Formalism
We restrict ourselves to binary problems, though the cluster expansion formalism is valid for systems with any number of species [1, 2]. While it is clear that two component alloys without crystalline defects such as vacancies are
binary problems, atoms residing on the interstitial sites of a host can be treated as a binary system as well, with the interstitial atoms constituting one of the components and the vacancies the other. In crystals, atoms can be assigned to well defined sites, even when relaxations from ideal crystallographic positions occur. There is always a one-to-one correspondence between each atom and a crystallographic site. If there are $M$ crystallographic sites, then there are $2^M$ possible arrangements of two species over those sites. To characterize a particular configuration, it is useful to introduce occupation variables $\sigma_i$ that are +1 (−1) if an A (B, which could be an atom different from A or a vacancy) resides at site $i$. The vector $\vec{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_i, \ldots, \sigma_M)$ then uniquely specifies a configuration. The use of $\vec{\sigma}$, however, is cumbersome, and a more versatile way of uniquely characterizing configurations can be achieved with polynomials $\phi_\alpha$ of occupation variables defined as [1]

$$\phi_\alpha(\vec{\sigma}) = \prod_{i \in \alpha} \sigma_i \qquad (25)$$
where the $i$ are sites belonging to a cluster $\alpha$ of crystal sites. Typical examples of clusters are a nearest neighbor pair cluster, a next nearest neighbor pair cluster, a triplet cluster, etc. Examples of clusters on a two dimensional triangular lattice are illustrated in Fig. 5. There are $2^M$ different clusters of sites and therefore $2^M$ cluster functions $\phi_\alpha(\vec{\sigma})$. It can be shown [1] that the set of cluster functions $\phi_\alpha(\vec{\sigma})$ form a complete and orthonormal basis in configuration space with respect to the scalar product

$$\langle f, g \rangle = \frac{1}{2^M} \sum_{\vec{\sigma}} f(\vec{\sigma}) \, g(\vec{\sigma}) \qquad (26)$$

where $f$ and $g$ are any scalar functions of configuration. The sum in Eq. (26) extends over all possible configurations of A and B atoms over the $M$ sites of the crystal. Because of their completeness and orthonormality over the space of configurations, it is possible to expand any function of configuration $f(\vec{\sigma})$ as a linear combination of the cluster functions $\phi_\alpha(\vec{\sigma})$. In particular, the configurational energy (with atoms relaxed around the crystallographic positions of the crystal) can be written as

$$E(\vec{\sigma}) = E_o + \sum_{\alpha} V_\alpha \phi_\alpha(\vec{\sigma}) \qquad (27)$$

where the sum extends over all clusters $\alpha$ over the $M$ sites. The coefficients $V_\alpha$ are constants and formally follow from the scalar product of $E(\vec{\sigma})$ with the cluster function $\phi_\alpha(\vec{\sigma})$

$$V_\alpha = \langle E(\vec{\sigma}), \phi_\alpha(\vec{\sigma}) \rangle = \frac{1}{2^M} \sum_{\vec{\sigma}} E(\vec{\sigma}) \, \phi_\alpha(\vec{\sigma}). \qquad (28)$$

$E_o$ is the coefficient of the empty cluster $\phi_o = 1$ and is the average of $E(\vec{\sigma})$ over all configurations. Equation (27) is referred to as a cluster expansion of
Figure 5. Examples of clusters for a two dimensional triangular lattice.
the configurational energy, and the coefficients of the expansion $V_\alpha$ are called effective cluster interactions (ECI). Equation (27) can be viewed as a generalized Ising model Hamiltonian containing not only nearest neighbor pair interactions, but also all other pair and multibody interactions extending beyond the nearest neighbors. Through Eq. (28), a formal link is made between the interaction parameters of the generalized Ising model and the configuration dependent ground state energies of the solid in each configuration $\vec{\sigma}$. Clearly, the cluster expansion for the configurational energy, Eq. (27), is only useful if it converges rapidly, i.e., there exists a maximal cluster $\alpha_{max}$ such that all ECI corresponding to clusters larger than $\alpha_{max}$ can be neglected. In this case, the cluster expansion can be truncated to yield

$$E(\vec{\sigma}) = E_o + \sum_{\alpha}^{\alpha_{max}} V_\alpha \phi_\alpha(\vec{\sigma}) \qquad (29)$$
A priori mathematical criteria for the convergence of the configurational energy cluster expansion do not exist. Experience indicates that convergence depends on the particular system being considered. In general, though, it can be expected that the lower order clusters extending over a limited range within the crystal will have the largest contribution in the cluster expansion.
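As a concrete illustration of Eq. (29), the sketch below evaluates a truncated cluster expansion for a one-dimensional ring of sites using only the point cluster and the nearest neighbor pair cluster; the ECI values and the configuration are arbitrary placeholders chosen for the example, not fitted interactions.

```python
import numpy as np

def energy_1d_ring(sigma, E0, V_point, V_pair):
    """Truncated cluster expansion (Eq. 29) on a 1D periodic chain.

    sigma   : array of +/-1 occupation variables
    E0      : empty-cluster coefficient
    V_point : ECI of the point cluster
    V_pair  : ECI of the nearest neighbor pair cluster
    """
    sigma = np.asarray(sigma)
    point_sum = np.sum(sigma)                      # sum over point clusters
    pair_sum = np.sum(sigma * np.roll(sigma, 1))   # periodic NN pairs
    return E0 + V_point * point_sum + V_pair * pair_sum

# Example: an 8-site ring with alternating A/B occupation.
# E0 and the ECI are illustrative numbers (eV), not fitted values.
sigma = np.array([+1, -1, +1, -1, +1, -1, +1, -1])
print(energy_1d_ring(sigma, E0=-1.0, V_point=0.05, V_pair=0.10))
```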
4.2. Symmetry and the Cluster Expansion
Simplifications to the cluster expansion (27) or (29) can be made by taking the symmetry of the crystal into account [2]. Clusters are said to be equivalent by symmetry if they can be mapped onto each other with at least one space group symmetry operation. For example, clusters $\alpha$ and $\beta$ of Fig. 5 are equivalent since a clockwise rotation of $\alpha$ by 60° followed by a translation by the vector $2\vec{b}$ maps $\alpha$ onto $\beta$. The ECI corresponding to clusters that are equivalent by symmetry have the same numerical value; in the case of $\alpha$ and $\beta$ of Fig. 5, $V_\alpha = V_\beta$. All clusters that are equivalent by symmetry are said to belong to an orbit $\Omega_\alpha$, where $\alpha$ is a representative cluster of the orbit. For any arrangement $\vec{\sigma}$ we can define averages over cluster functions $\phi_\alpha(\vec{\sigma})$ as

$$\langle \phi_\alpha(\vec{\sigma}) \rangle = \frac{1}{|\Omega_\alpha|} \sum_{\beta \in \Omega_\alpha} \phi_\beta(\vec{\sigma}) \qquad (30)$$
where the sum extends over all clusters $\beta$ belonging to the orbit $\Omega_\alpha$ and $|\Omega_\alpha|$ represents the number of clusters that are symmetrically equivalent to $\alpha$. The $\langle \phi_\alpha(\vec{\sigma}) \rangle$ are commonly referred to as correlation functions. Using the definition of the correlation functions and the fact that symmetrically equivalent clusters have the same ECI, we can rewrite the configurational energy normalized by the number of primitive unit cells $N_p$ (i.e., the number of Bravais lattice points of the crystal, which is not necessarily equal to the number of crystal sites $M$) as

$$e(\vec{\sigma}) = \frac{E(\vec{\sigma})}{N_p} = V_o + \sum_{\alpha} m_\alpha V_\alpha \langle \phi_\alpha(\vec{\sigma}) \rangle \qquad (31)$$
where $m_\alpha$ is the multiplicity of the cluster $\alpha$, defined as the number of clusters per Bravais lattice point symmetrically equivalent with $\alpha$ (i.e., $m_\alpha = |\Omega_\alpha|/N_p$), and $V_o = E_o/N_p$. The sum in (31) is performed only over the symmetrically non-equivalent clusters.
4.3. Determination of the ECI
According to Eq. (28), the ECI for the energy cluster expansion are determined by the first-principles ground state energies for all the different
configurations $\vec{\sigma}$. Explicitly calculating the ECI according to the scalar product Eq. (28) is intractable. Techniques such as direct configurational averaging (DCA), though, have been devised to approximate the scalar product (28) [2, 19, 20]. In recent years, the preferred method of obtaining ECI has been an inversion method [21–29]. In this approach, energies $E(\vec{\sigma}_I)$ for a set of $P$ periodic configurations $\vec{\sigma}_I$ with $I = 1, \ldots, P$ are calculated from first-principles, and a truncated form of (31) is inverted such that it reproduces the $E(\vec{\sigma}_I)$ within a tolerable error when Eq. (31) is evaluated for configuration $\vec{\sigma}_I$. The simplest inversion scheme uses a least squares fit. More sophisticated algorithms involving linear programming techniques [30], cross-validation optimization [32], or the inclusion of k-space terms to account for long-range elastic strain have been developed [33, 34].
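The least squares inversion can be written in a few lines: given correlation functions for a set of computed configurations, the ECI follow from a linear fit. In the sketch below, the correlation matrix and energies are randomly generated stand-ins for quantities that would come from Eq. (30) and first-principles calculations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: P computed configurations, n_c symmetry-distinct clusters.
# In a real fit, each row of Phi holds the multiplicity-weighted
# correlation functions m_alpha * <phi_alpha(sigma_I)> of Eq. (31), and
# e holds the first-principles energies per primitive cell e(sigma_I).
P, n_c = 40, 8
Phi = rng.uniform(-1.0, 1.0, size=(P, n_c))
Phi[:, 0] = 1.0                     # empty cluster column -> V_o
e = rng.normal(0.0, 0.1, size=P)    # placeholder energies (eV/cell)

# Least squares inversion of Eq. (31) for the ECI.
V, residuals, rank, _ = np.linalg.lstsq(Phi, e, rcond=None)

e_pred = Phi @ V
rms = np.sqrt(np.mean((e_pred - e) ** 2))
print("fitted ECI:", V)
print(f"RMS fitting error: {rms:.4f} eV/cell")
```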
4.4. Local Cluster Expansion
The traditional cluster expansion formalism described so far is applicable to the configurational energy of the solid, which is an extensive quantity. We will refer to these expansions as extended cluster expansions. Activation barriers, however, are equal to the difference between the energy of the solid when the migrating atom is at the activated state and that when the migrating atom is at the initial equilibrium site. Hence, the configuration dependence of the activation barrier of an atom needs to be described by a cluster expansion with no translational symmetry, and as such it converges to a fixed value as the system size grows. Not only is the activation barrier a function of configuration, but it also depends on the direction of the hop. This is schematically illustrated in Fig. 6, in which the end points of the hop have a different configurational energy. Describing the configuration dependence of the activation barrier independent of the direction of the hop is straightforward if a kinetically resolved activation barrier is introduced [3], defined as

$$\Delta E_{KRA} = E_{act} - \frac{1}{n} \sum_{j=1}^{n} E_j \qquad (32)$$
Figure 6. The activation barrier for migration depends on the direction of the hop when the energies of the end points of the hop are different.
where $E_{act}$ is the energy of the solid with the migrating atom at the activated state and the $E_j$ are the energies of the solid with the migrating atom at the end points $j$ of the hop. In most solids there are $n = 2$ end points to a hop; however, it is possible that more end points exist. All terms in Eq. (32) depend on the arrangement of atoms surrounding the end points of the hop and the activated state. The dependence of $\Delta E_{KRA}$ on configuration can be described with a cluster expansion that has a point group symmetry compatible with the symmetry of the crystal as well as that of the activated state. For this reason, the cluster expansion of $\Delta E_{KRA}$ is called a local cluster expansion [3]. The kinetically resolved activation barrier is not the true activation barrier that enters in the transition state theory expression for the hop frequency, Eq. (1). It is merely a useful quantity that characterizes the configuration dependence of the activated state independent of the direction of the hop. The true activation barrier can be calculated from $\Delta E_{KRA}$ using
$$\Delta E_b = \Delta E_{KRA} + \frac{1}{n} \sum_{j=1}^{n} E_j - E_i \qquad (33)$$
where $E_i$ is the energy of the crystal with the migrating atom at the initial site of the hop. All quantities on the right hand side of Eq. (33) can be described with either a local cluster expansion (for $\Delta E_{KRA}$) or an extended cluster expansion (for the configurational energy of the solid).
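A direct transcription of Eqs. (32) and (33) is shown below; the energy values in the example are invented numbers standing in for cluster expansion evaluations of the end point and activated state energies.

```python
def kinetically_resolved_barrier(E_act, end_point_energies):
    """Eq. (32): Delta_E_KRA = E_act minus the mean end point energy."""
    return E_act - sum(end_point_energies) / len(end_point_energies)

def true_barrier(dE_KRA, end_point_energies, E_initial):
    """Eq. (33): Delta_E_b = Delta_E_KRA + mean(E_j) - E_i."""
    mean_end = sum(end_point_energies) / len(end_point_energies)
    return dE_KRA + mean_end - E_initial

# Invented energies (eV) for a two end point hop (n = 2).
E_end = [-3.20, -3.05]   # energies with the atom at either end point
E_act = -2.70            # energy with the atom at the activated state
dE_KRA = kinetically_resolved_barrier(E_act, E_end)

# The barrier differs for the forward and backward direction of the hop.
print("forward  barrier:", true_barrier(dE_KRA, E_end, E_initial=E_end[0]))
print("backward barrier:", true_barrier(dE_KRA, E_end, E_initial=E_end[1]))
```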
5. Practical Implementation
Calculating diffusion coefficients from first-principles in multicomponent solids involves three steps. First, a variety of ab initio energies for different atomic arrangements need to be calculated with an accurate first-principles method. This includes energies for a wide range of atomic arrangements over the sites of the crystal, as well as energies for migrating atoms placed at activated states surrounded by different arrangements. The latter calculations are typically performed with an atom at the activated state in large supercells. A useful technique to find the activated state between two equilibrium end points is the nudged elastic band method [31] which determines the lowest energy path between two equilibrium states. Calculating the vibrational prefactor requires a calculation of the phonon density of states for different atomic arrangements both with the migrating atom at its equilibrium site and at the activated state. While sophisticated techniques have been devised to characterize the configurational dependence of the vibrational free energy of a solid [35], for diffusion studies, a convenient simplification is the local harmonic approximation [36].
In the second step, the first-principles energy values for different atomic arrangements are used to determine the coefficients of both a local cluster expansion (for the kinetically resolved activation barriers) and a traditional extended cluster expansion (for the energy of the crystal with all atoms at non-activated crystallographic sites), with either a least squares fit or one of the more sophisticated methods alluded to above. The cluster expansions enable the calculation of the energy and activation barrier for any arrangement of atoms on the crystal. They serve as a convenient and robust tool to extrapolate accurate first-principles energies calculated for a few configurations to the energy of any configuration. Hence the migration rates of Eq. (1) can be calculated for any arrangement of atoms. The final step is the combination of the cluster expansions with kinetic Monte Carlo simulations to calculate the quantities entering the Green–Kubo expressions for the diffusion coefficients. Kinetic Monte Carlo simulations have been discussed extensively elsewhere [3, 37, 38]. Applied to diffusion in crystals, kinetic Monte Carlo algorithms are used to simulate the stochastic migration of many atoms, which hop to neighboring sites with frequencies given by Eq. (1). A kinetic Monte Carlo simulation starts from a representative arrangement of atoms (typically obtained with a standard Monte Carlo method for lattice models). As atoms migrate, their trajectories and the elapsed time are tracked, enabling the calculation of the quantities between the brackets in the Green–Kubo expressions. Since the Green–Kubo expressions involve ensemble averages, many kinetic Monte Carlo runs, each starting from a different representative initial condition, are necessary. Depending on the desired accuracy, averages need to be performed over the trajectories departing from between 100 and 10 000 different initial conditions.
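The core update of such a simulation can be summarized in a short rejection-free (residence time) kinetic Monte Carlo sketch. Everything here is schematic: the rate function stands in for Eq. (1) evaluated with cluster expanded barriers, and the lattice is a bare mapping of site occupations rather than a real crystal.

```python
import math
import random

def kmc_step(occupations, hops, rate_of, time):
    """One rejection-free (residence time) kinetic Monte Carlo step.

    occupations : mutable site -> species mapping
    hops        : candidate (site_from, site_to) moves
    rate_of     : hop frequency for a move; a stand-in for Eq. (1),
                  nu* exp(-dE_b / kT), with a cluster expanded barrier
    time        : current simulation time
    """
    rates = [rate_of(occupations, h) for h in hops]
    total = sum(rates)

    # Choose a hop with probability proportional to its rate.
    r = random.random() * total
    acc = 0.0
    for hop, rate in zip(hops, rates):
        acc += rate
        if acc >= r:
            break

    # Execute the chosen hop and advance the clock by the residence time.
    site_from, site_to = hop
    occupations[site_from], occupations[site_to] = (
        occupations[site_to], occupations[site_from])
    return time - math.log(random.random()) / total

# Toy usage: a 4-site ring with one vacancy and a constant-rate model.
occ = {0: "A", 1: "A", 2: "V", 3: "A"}
ring_hops = [(i, (i + 1) % 4) for i in range(4)]
rate = lambda o, h: 1.0 if o[h[0]] != "V" and o[h[1]] == "V" else 0.0
t = 0.0
for _ in range(5):
    t = kmc_step(occ, ring_hops, rate, t)
print("time after 5 vacancy exchanges:", t)
```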
6. Examples
Two examples of first-principles calculations of diffusion coefficients in multicomponent solids are reviewed in this section. The first is lithium diffusion in LixCoO2, an example of nondilute interstitial diffusion. The second example, diffusion in the fcc based Al–Li alloy, corresponds to a substitutional system.
6.1. Interstitial Diffusion
LixCoO2 consists of a host structure made up of a CoO2 framework. Layers of interstitial sites that can be occupied by lithium ions reside between O–Co–O slabs. The interstitial sites are octahedrally coordinated by oxygen and they form two dimensional triangular lattices. As described in Section 2.1,
two migration mechanisms exist for lithium: a single vacancy mechanism, whereby lithium squeezes through a dumbbell of oxygen atoms into an isolated vacancy, and a divacancy mechanism, in which lithium migrates through an adjacent tetrahedral site into a vacant site that is part of a divacancy [3]. The two migration mechanisms are illustrated in Fig. 1. Not only does the local arrangement of lithium ions around a hopping ion determine the migration mechanism, it also affects the value of the activation barrier for a particular migration mechanism. Figure 7 illustrates kinetically resolved activation barriers calculated from first-principles (LDA) for a variety of different lithium-vacancy arrangements around the migrating ion at different bulk lithium concentrations [3]. Note that for a given bulk composition, many possible lithium-vacancy arrangements around an atom in the activated state exist. The kinetically resolved activation barriers illustrated in Fig. 7 correspond to only a small subset of these many configurations. The local cluster expansion is used to extrapolate from this set to all the configurations needed in a kinetic Monte Carlo simulation. Figure 7 shows that the activation barrier for the divacancy migration mechanism can vary by more than 200 meV with lithium concentration. The increase in activation barrier upon lithium removal from the host can be traced to the contraction of the host along the c-axis as the lithium concentration is reduced [3].
Figure 7. A sample of first-principles (LDA) kinetically resolved activation barriers $\Delta E_{KRA}$ for the divacancy hop mechanism (circles) and the single vacancy mechanism (squares), plotted as activation barrier (meV) versus Li concentration.
This contraction disproportionately penalizes the activated state over the end point states of the divacancy hop mechanism. Another contribution to the variation in activation barrier with composition derives from the fact that the activated state is in close proximity to a Co ion, which becomes progressively more oxidized (i.e., its effective charge becomes more positive) as the overall lithium concentration is reduced [3, 29]. This leads to an increase in the electrostatic repulsion between the activated Li and the Co as x is reduced. Extended and local cluster expansions can be constructed to describe both the configurational energy of LixCoO2 and the configuration dependence of the kinetically resolved activation barriers. An extended cluster expansion for the first-principles configurational energy of LixCoO2 has been described in detail in Ref. [29]. This cluster expansion, when combined with Monte Carlo simulations, accurately predicts phase stability in LixCoO2. In particular, two ordered lithium-vacancy phases are predicted at x = 1/2 and x = 1/3. Both phases are observed experimentally [39, 40]. A local cluster expansion for the kinetically resolved activation barriers has been described in Ref. [3]. Figure 8 illustrates calculated diffusion coefficients at 300 K determined by applying kinetic Monte Carlo simulations to the cluster expansions of LixCoO2 [3]. While the configuration dependence of the activation barriers was rigorously accounted for with the cluster expansions, no attempt was made in these calculations to describe the migration rate prefactor ν* from first-principles. Instead, a value of 10^13 Hz was used for all compositions and environments. Figure 8(a) shows both DJ and the chemical diffusion coefficient D, while Fig. 8(b) illustrates the thermodynamic factor F, which was determined by calculating fluctuations in the number of lithium particles in grand canonical Monte Carlo simulations [3] (see Section 3.1). Notice that the calculated diffusion coefficient varies by several orders of magnitude with composition, showing that the assumption of a concentration independent diffusion coefficient in this system is unjustified. The thermodynamic factor F is a measure of the deviation from ideality. In the dilute limit (x → 0), interactions between lithium ions are negligible and the configurational thermodynamics approximates that of an ideal solution. In this limit the thermodynamic factor is 1. As x increases from 0, and the solid departs from ideal behavior, the thermodynamic factor increases substantially. The local minima in DJ and D at x = 1/2 and x = 1/3 are a result of lithium ordering at those compositions. Lithium-vacancy ordering effectively locks lithium ions into energetically favorable sublattice positions, which reduces ionic mobility. The thermodynamic factor, on the other hand, exhibits peaks at x = 1/2 and x = 1/3, as the configurational thermodynamics of an ordered phase deviates strongly from ideal behavior. The peak signifies the fact that in an ordered phase, a small gradient in composition leads to an enormous gradient in chemical potential, and hence a large thermodynamic driving force for diffusion. This partly compensates the reduction in DJ.
Figure 8. (a) Calculated self diffusion coefficient DJ and chemical diffusion coefficient D for LixCoO2 at 300 K, plotted as D (10^13/ν*) (cm2/s) on a logarithmic scale versus Li concentration. (b) The thermodynamic factor of LixCoO2 at 300 K, on a logarithmic scale versus Li concentration.
A similar computational approach can be followed to determine, for example, the diffusion coefficient for oxygen diffusion on a platinum (111) surface. If, in addition to oxygen, sulfur atoms are also adsorbed on the platinum surface, Green–Kubo relations for binary interstitial diffusion would be needed. Furthermore, ternary cluster expansions are then necessary to describe the configuration dependence of the energy and the kinetically resolved activation barrier, as there are then three species: oxygen, sulfur and vacancies.
6.2. Substitutional Diffusion
To illustrate diffusion in a binary substitutional solid, we consider the fcc Al–Li alloy. While Al1−xLix is predominantly stable in the bcc based crystal structure, it is metastable in fcc up to x = 0.25. In fact, it is the metastable fcc form of Al1−xLix that strengthens this important candidate alloy for aerospace applications. A first step in determining the diffusion coefficients in this system is an accurate first-principles characterization of the alloy thermodynamics. This can be done with a binary cluster expansion for the configurational energy [26]. The expansion coefficients of the cluster expansion were fit to the first-principles energies (LDA) of more than 70 different periodic lithium-aluminum arrangements on the fcc lattice [41]. Figure 9(a) illustrates the calculated metastable fcc based phase diagram of Al1−xLix obtained by applying Monte Carlo simulations to the cluster expansion [41]. The phase diagram shows that a solid solution phase is stable at low lithium concentration and at high temperature. At x = 0.25, the L12 ordered phase is stable. In this ordered phase the Li atoms occupy the corner points of the conventional cubic fcc unit cell. Diffusion in most metals is dominated by a vacancy mechanism. Hence it is not sufficient to simply characterize the thermodynamics of the strictly binary Al–Li alloy. Real alloys always have a dilute concentration of vacancies that wander through the crystal and in the process redistribute the atoms of the solid. The vacancies themselves have a thermodynamic preference for particular local environments over others, which in turn affects the mobility of the vacancies. Treating vacancies in addition to Al and Li makes the problem a ternary one and in principle would require a ternary cluster expansion. Nevertheless, since vacancies are present in dilute concentrations, a ternary cluster expansion can be avoided by using a local cluster expansion to describe the configuration dependence of the vacancy formation energy [41]. In effect, the local cluster expansion serves as a perturbation to the binary cluster expansion to describe the interaction of a dilute concentration of a third component, in this case the vacancy. A local cluster expansion for the vacancy formation energy in fcc Al–Li was constructed by fitting to first-principles (LDA) vacancy formation energies in 23 different Al–Li arrangements [41]. Combining the vacancy
Figure 9. (a) First-principles calculated phase diagram of the fcc based Al(1−x)Lix alloy (temperature in K versus x, showing the solid solution and L12 regions). (b) Calculated equilibrium vacancy concentration as a function of bulk alloy composition at 600 K. (c) Average lithium concentration in the first two nearest neighbor shells around a vacancy; the dashed line corresponds to the average bulk lithium concentration.
formation local cluster expansion with the binary cluster expansion for Al–Li in Monte Carlo simulations enables a calculation of the equilibrium vacancy concentration as a function of alloy composition and temperature. Figure 9(b) illustrates the result for Al–Li at 600 K [41]. While the vacancy concentration is more or less constant in the solid solution phase, it can vary by an order of magnitude over a small concentration range in the ordered L12 phase at 600 K. Another relevant thermodynamic property that is of importance for diffusion is the equilibrium short range order around a vacancy in fcc Al–Li. Monte Carlo simulations using the cluster expansions predict that the vacancy repels lithium ions, preferring a nearest neighbor shell rich in aluminum. Illustrated in Fig. 9(c) is the lithium concentration in shells at varying distance around a vacancy. The lithium concentration in the first nearest neighbor shell is less than the bulk alloy composition, while it is slightly higher than the bulk composition in the second nearest neighbor shell. This indicates that the vacancy repels Li and attracts Al. In the ordered phase, stable at 600 K between x = 0.23 and 0.3, the degree of order around the vacancy is even more pronounced, as illustrated in Fig. 9(c). Between x = 0.23 and 0.3, the vacancy is predominantly surrounded by Al in its first and third nearest neighbor shells and by Li in its second and fourth nearest neighbor shells. This corresponds to a situation in which the vacancy occupies the lithium sublattice of the L12 ordered phase. Clearly the thermodynamic preference of the vacancies for a specific local environment will have an impact on their mobility through the crystal. While thermodynamic equilibrium determines the degree of order within the alloy and which environments the vacancies are attracted to, atomic migration mediated by a vacancy mechanism involves passing through activated states, which requires passing over an energy barrier that also depends on the local degree of order. Contrary to what is predicted for LixCoO2, the kinetically resolved activation barriers in fcc Al1−xLix are not very sensitive to configuration and bulk composition [42]. For each type of atom (Al or Li), the variations in kinetically resolved activation barriers are within the numerical errors of the first-principles method (50 meV for a plane wave pseudopotential method using 107 atom supercells). This is likely the result of a negligible variation in the volume of fcc Al1−xLix with composition. But while the migration barriers do not depend significantly on configuration, they are very different depending on which atom performs the hop. The first-principles calculated migration barriers for Al hops are systematically 150 to 200 meV larger than those for Li hops [42]. The thermodynamic tendency of the vacancy to repel lithium atoms deprives Li of diffusion mediating defects. Kinetically, though, Li has a lower activation barrier relative to Al for migration into an adjacent vacancy. Hence a trade-off exists between thermodynamics and kinetics. While Li exchanges more readily with a neighboring vacancy, thermodynamically it has less access to those vacancies. Quantitatively determining the effect of this trade-off requires explicit
Figure 10. Calculated interdiffusion coefficient (the λ− eigenvalue of the 2 × 2 D matrix) for the fcc Al(1−x)Lix alloy at 600 K, plotted on a logarithmic scale (cm2/s) versus x; the sharp drop occurs across the two phase coexistence region.
evaluation of diffusion coefficients. This can be done by applying kinetic Monte Carlo simulations to cluster expansions that describe the configurational energy and kinetically resolved activation barriers for Al, Li and dilute vacancies on the fcc lattice. Figure 10 illustrates the calculated interdiffusion coefficient at 600 K obtained by diagonalizing the D matrix of Eq. (14) [42]. The coefficient for interdiffusion describes the rate at which the Al and Li atoms intermix in the presence of a concentration gradient in the two species. The calculated interdiffusion coefficient is more or less constant in the solid solution phase, but drops by more than an order of magnitude in the L12 ordered phase. The thermodynamic preference of the vacancies for the lithium sublattice sites of L12 dramatically constricts the trajectories of the vacancies, leading to a drop in the overall mobility of Li and Al.
7. Conclusion
In this chapter, we have presented the statistical mechanical formalism that relates phenomenological diffusion coefficients for multicomponent solids to microscopic fluctuations of the solid at equilibrium. We have focused on
diffusion that is mediated by a vacancy mechanism and have distinguished between interstitial systems and substitutional systems. An important property of multicomponent solids is the existence of configurational disorder among the constituent species. This adds a level of complexity in calculating diffusion coefficients from first-principles, since the activation barriers vary along an atom's trajectory as a result of variations in the local degree of atomic order. In this respect, the cluster expansion is an invaluable tool to describe the dependence of the energy, in particular of the activation barrier, on atomic configuration. While the formalism of calculating diffusion coefficients from first-principles in multicomponent solids has been established, many opportunities exist to apply it to a wide variety of multicomponent crystalline solids, including metals, ceramics and semiconductors. Faster computers and improvements to electronic structure methods that go beyond density functional theory will lead to more accurate first-principles approximations to activation barriers and vibrational prefactors. It is only a matter of time before first-principles diffusion coefficients for multicomponent solids are routinely used in continuum simulations of diffusional phase transformations and electrochemical devices such as batteries and fuel cells.
Acknowledgments

We acknowledge support from the AFOSR, grant F49620-99-1-0272, and the Department of Energy, Office of Basic Energy Sciences, under Contract No. DE-FG02-96ER45571. Additional support came from the NSF (ACI-9619020) through computing resources provided by NPACI at the San Diego Supercomputer Center.
References

[1] J.M. Sanchez, F. Ducastelle, and D. Gratias, Physica A, 128, 334, 1984.
[2] D. de Fontaine, In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, Academic Press, New York, pp. 33, 1994.
[3] A. Van der Ven, G. Ceder, M. Asta, and P.D. Tepesch, Phys. Rev. B, 64, 184307, 2001.
[4] S.R. de Groot and P. Mazur, Non-Equilibrium Thermodynamics, Dover Publications, Mineola, NY, 1984.
[5] G.H. Vineyard, J. Phys. Chem. Solids, 3, 121, 1957.
[6] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
[7] R. Zwanzig, Annu. Rev. Phys. Chem., 16, 67, 1965.
[8] R. Zwanzig, J. Chem. Phys., 40, 2527, 1964.
[9] Y. Zhou and G.H. Miller, J. Phys. Chem., 100, 5516, 1996.
[10] R. Gomer, Rep. Prog. Phys., 53, 917, 1990.
[11] M. Tringides and R. Gomer, Surf. Sci., 145, 121, 1984.
[12] C. Uebing and R. Gomer, J. Chem. Phys., 95, 7626, 1991.
[13] A.R. Allnatt, J. Chem. Phys., 43, 1855, 1965.
[14] A.R. Allnatt, J. Phys. C: Solid State Phys., 15, 5605, 1982.
[15] R.E. Howard and A.B. Lidiard, Rep. Prog. Phys., 27, 161, 1964.
[16] A.R. Allnatt and A.B. Lidiard, Rep. Prog. Phys., 50, 373, 1987.
[17] J.W. Cahn and F.C. Larche, Scripta Met., 17, 927, 1983.
[18] K.W. Kehr, K. Binder, and S.M. Reulein, Phys. Rev. B, 39, 4891, 1989.
[19] C. Wolverton, G. Ceder, D. de Fontaine, and H. Dreysse, Phys. Rev. B, 45, 13105, 1992.
[20] C. Wolverton and A. Zunger, Phys. Rev. B, 50, 10548, 1994.
[21] J.W.D. Connolly and A.R. Williams, Phys. Rev. B, 27, 5169, 1983.
[22] J.M. Sanchez, J.P. Stark, and V.L. Moruzzi, Phys. Rev. B, 44, 5411, 1991.
[23] Z.W. Lu, S.H. Wei, A. Zunger, S. Frotapessoa, and L.G. Ferreira, Phys. Rev. B, 44, 512, 1991.
[24] M. Asta, D. de Fontaine, M. Vanschilfgaarde, M. Sluiter, and M. Methfessel, Phys. Rev. B, 46, 5055, 1992.
[25] M. Asta, R. McCormack, and D. de Fontaine, Phys. Rev. B, 48, 748, 1993.
[26] M.H.F. Sluiter, Y. Watanabe, D. de Fontaine, and Y. Kazazoe, Phys. Rev. B, 53, 6136, 1996.
[27] P.D. Tepesch et al., J. Am. Cer. Soc., 79, 2033, 1996.
[28] V. Ozolins, C. Wolverton, and A. Zunger, Phys. Rev. B, 57, 6427, 1998.
[29] A. Van der Ven, M.K. Aydinol, G. Ceder, G. Kresse, and J. Hafner, Phys. Rev. B, 58, 2975, 1998.
[30] G.D. Garbulsky and G. Ceder, Phys. Rev. B, 51, 67, 1995.
[31] G. Mills, H. Jonsson, and G.K. Schenter, Surf. Sci., 324, 305, 1995.
[32] A. van de Walle and G. Ceder, J. Phase Equilib., 23, 348, 2002.
[33] D.B. Laks, L.G. Ferreira, S. Froyen, and A. Zunger, Phys. Rev. B, 46, 12587, 1992.
[34] C. Wolverton, Philos. Mag. Lett., 79, 683, 1999.
[35] A. van de Walle and G. Ceder, Rev. Mod. Phys., 74, 11, 2002.
[36] R. LeSar, R. Najafabadi, and D.J. Srolovitz, Phys. Rev. Lett., 63, 624, 1989.
[37] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, J. Comput. Phys., 17, 10, 1975.
[38] F.M. Bulnes, V.D. Pereyra, and J.L. Riccardo, Phys. Rev. E, 58, 86, 1998.
[39] J.N. Reimers and J.R. Dahn, J. Electrochem. Soc., 139, 2091, 1992.
[40] Y. Shao-Horn, S. Levasseur, F. Weill, and C. Delmas, J. Electrochem. Soc., 150, A366, 2003.
[41] A. Van der Ven and G. Ceder, Phys. Rev. B, 2005 (in press).
[42] A. Van der Ven and G. Ceder, Phys. Rev. Lett., 2005 (in press).
1.18 DATA MINING IN MATERIALS DEVELOPMENT

Dane Morgan and Gerbrand Ceder
Massachusetts Institute of Technology, Cambridge MA, USA
1. Introduction
Data Mining (DM) has become a powerful tool in a wide range of areas, from e-commerce, to finance, to bioinformatics, and increasingly, in materials science [1, 2]. Miners think about problems with a somewhat different focus than traditional scientists, and DM techniques offer the possibility of making quantitative predictions in many areas where traditional approaches have had limited success. Scientists generally try to make predictions through constitutive relations, derived mathematically from basic laws of physics, such as the diffusion equation or the ideal gas law. However, in many areas, including materials development, the problems are so complex that constitutive relations either cannot be derived, or are too approximate or intractable for practical quantitative use. The philosophy of a DM approach is to assume that useful constitutive relations exist, and to attempt to derive them primarily from data, rather than from basic laws of physics. As an example, consider what will likely stand forever as the greatest application of DM in the hard sciences: the periodic table. In 1869 Mendeleev organized the elements based on their properties, without any guiding theory, into the first modern periodic table [3]. With the advent of quantum theory it became possible to predict the structure of the periodic table, and DM was no longer strictly necessary, but the results had already been known and used for many years. Even today, the easy organization of data made possible by the classifications in the periodic table makes it an everyday tool for research scientists. Mendeleev established a simple ordering based on a relatively small amount of data, and so could do it on paper. However, today's data sets can be many orders of magnitude larger, and an impressive array of computational algorithms have been developed to automate the task of identifying relationships within data.
DM is becoming an increasingly valuable tool in the general area of materials development, and there are good reasons why this area is particularly fruitful for DM applications. There is an enormous range of possible new materials, and it is often difficult to physically model the relationships between constituents, processing, and final properties. For this reason, materials are primarily still developed by what one might call informed trial-and-error, where researchers are guided by experience and heuristic rules to a somewhat restricted space of constituents and processing conditions, but then try as many combinations as possible to find materials with desired properties. This is essentially human DM, where one's brain, rather than the computer, is being used to find correlations, make predictions, and design optimal strategies. Transferring DM tasks from human to computer offers the potential to enhance accuracy, handle more data, and allow wider dissemination of accrued knowledge. Other key drivers for growing DM use in materials development are ease of access to large databases of materials properties, new data being generated in large quantities by high-throughput experiments and quantitative computational models, and improved algorithms, computer speed, and software packages leading to more effective and easy to use DM methods. Note that DM is also used in other areas of materials science besides materials development, e.g., design and manufacturing [4, 5], but this work will not be discussed here. The interdisciplinary nature of DM creates a special challenge, since a typical materials scientist's education does not provide an introduction to DM techniques, and the computer scientists and statisticians usually involved in developing DM methods are equally unlikely to be versed in materials science. The goal of this paper is to help foster communication between the disciplines and show examples of how they can be joined productively. We introduce DM concepts in a fairly general framework, discuss a few of the more common methods, and describe how DM is being used to tackle some materials development problems, including predicting physiochemical properties of compounds, modeling electrical and mechanical properties, developing more effective catalysts, and predicting crystal structure. The breadth of methods and applications makes a comprehensive discussion impossible, but hopefully this brief introduction will be enough to allow the interested reader to follow up on specific areas of interest.
2. Key Methods of Data Mining
Data Mining (DM) is a vast and rapidly changing topic, with many different techniques appearing in many different fields. Broad reviews of the issues, methods, and applications are given in Refs. [1, 2] and somewhat less comprehensively but more in depth in Refs. [6, 7]. There is some disagreement about exactly what constitutes DM, as opposed to, e.g., knowledge discovery or
statistical analysis. We will not worry much about such distinctions, and give DM the rather all-encompassing definition of using your data to obtain information. This essentially defines every discovery task as some kind of DM, but there is really a continuum. The more data one has, and the less physical modeling one includes, the more time one will spend on data management, models, and investigation, and the more of a DM task it becomes. If one has eight data points of force and acceleration, and one performs a linear regression to fit mass, it is silly to consider it DM. There is very little time spent on the data, and one is essentially just fitting an unknown parameter in the known physical law F = ma. However, if one is trying to predict what song can be a commercial hit based on a database of song characteristics and sales data, then the primacy of data, and the absence of any guiding theory, make it clearly a DM problem [8]. DM in materials development generally focuses on prediction. Relationships are established between desired dependent properties (e.g., melting temperature or catalytic activity) and independent properties that are easily controlled and measured (e.g., precursor concentrations or annealing temperatures). Once such a relationship is established, dependent properties can be quickly predicted from independent ones, without having to perform costly and time consuming experiments. It is then possible to optimize over a large space of possible independent properties to obtain the desired dependent property. In general, we will define X as the independent properties or variables, Y as the dependent properties or variables, F as the derived relationship between X and Y, and YPred as the predicted values of Y based on F and X. The goal of a DM effort is usually to determine F such that YPred represents Y as effectively as possible. There are several key areas that need to be considered in a DM application such as the one described above: data management and preparation, prediction methods, assessment, optimization, and software.
2.1. Data Preparation and Management
Data preparation and management will not be discussed in detail since the issues are very dependent on the specific data being used. However, the tasks associated with cleaning and managing the data can often take up the bulk of a DM project, and should not be underestimated. Data must be stored so that it can be accessed efficiently, interfaced with equipment, updated, etc. Solutions can range from simple flat files to sophisticated database software. Issues often exist with the type and quality of the data, and it is frequently necessary to make significant transformations to bring the data into a universally comparable format, and to regroup data into appropriate new variables. There is sometimes erroneous or just missing data, which may need to be dealt with
in some manner before or during the DM process. Finally, data must be adequately comprehensive to be amenable to DM. It may be necessary to obtain further data in key areas, perhaps guided by the DM results in an iterative procedure. These issues are described in many data mining books, e.g., Ref. [7].
2.2. Prediction Methods
Prediction methods form the heart of DM tools relevant for materials development. Although there are many DM approaches that can be used for prediction, here we focus only on three of the most popular: linear regression, neural networks, and classification methods. Linear regression is often one of the first approaches to try in a DM project, unless one has reasons to expect nonlinear behavior. It is assumed that the relationship F is a linear function, and the unknown parameters are determined by multivariate linear regression to minimize the squared error between YPred and Y (these methods are discussed in many textbooks, e.g., Refs. [9, 10]). Linear regression is generally performed by matrix manipulations and is very robust and rapid. There are many variations on strict regression, e.g., adding weights or transforming variables with logarithms. Some of the most useful regression tools are those for reducing the number of independent variables (X), sometimes called dimensional reduction. It is frequently the case that there are many possible independent variables, but not all of them will be truly independent or important. Furthermore, the original data categories may not be optimal, and linear combinations of the variables, called latent variables, might be more effective. For example, alloy properties affected by strain will depend on the differences in atomic sizes, rather than the size of each constituent element separately. It is often difficult to have enough data to properly fit coefficients for a large number of variables (e.g., uniformly gridding a space of n variables with m points for each variable requires m^n data points, which rapidly becomes unmanageable; this is sometimes called the "curse of dimensionality" and is a much more significant problem in nonlinear fitting methods, such as the neural networks described below). Having too many variables that are not well constrained can lead to overfitting and poor predictive ability of the function F. Ideally, the DM method will help the user define and include the most effective latent variables for prediction. One common method for defining latent variables is Principal Component Analysis (PCA), which yields latent variables that are orthogonal and ordered by decreasing variance [11]. Assuming that variance correlates well with the importance of the latent variable to the dependent variables, the principal components are ordered in a sensible fashion and can be truncated at some point. Orthogonality assures that the latent variables are independent and will represent different variations.
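A minimal PCA-based dimensional reduction can be written with a singular value decomposition, as sketched below; the random matrix stands in for a table of independent materials descriptors X, and all numbers are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 6))        # 50 samples, 6 stand-in descriptors

# Center each descriptor, then factor with an SVD: the right singular
# vectors are the principal directions, ordered by decreasing variance.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

explained_variance = s**2 / (X.shape[0] - 1)
print("variance captured per component:", explained_variance)

# Keep the two leading latent variables; the truncation point would
# normally be chosen by cross-validation, as discussed in Section 2.3.
T = Xc @ Vt[:2].T                   # 50 x 2 matrix of latent variables
print("reduced data shape:", T.shape)
```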
A limitation of this approach is that no information about Y is used in picking the variables. Some improvement can often be obtained by using Partial Least Squares (PLS) regression [9, 12–14], which is similar in spirit to PCA, but constructs orthogonal latent variables that maximize the covariance between X and Y. PLS latent variables capture a lot of the variation of X, but are also well correlated with Y, and so are likely to provide effective predictions. However one defines the latent variables, it is important to test their effectiveness, and there are a number of methods to identify statistically significant variables in a regression (e.g., ANOVA) [7, 9]. Another popular method is to make use of cross validation, which is discussed below, to exclude variables that are not predictive.

Neural Network (NN) methods [15] are more general than linear approaches and have become a popular prediction tool in many areas. NNs loosely model the functioning of the brain, and consist of a network of neurons that can take inputs, sum them with weights, operate on the sum with a transfer function, and then emit an output. The NN is generally viewed as having layers; the first takes input from outside the NN, and the last outputs the final results to the user, while the layers in between are called hidden and communicate only with other layers. For the problems considered here, the NN plays the role of the relationship F between X and Y. The weights of the neurons are unknown and must be determined by training based on known input X and output Y, where the goal is generally to minimize |YPred − Y|. The training process is analogous to a linear regression, except that the unknown weights are much more difficult to determine, and many different training methods exist. Similar problems occur with excessive numbers of independent variables, and some dimensional reduction, e.g., by PCA, may be necessary. The strength of NNs is that they are very flexible, and with enough training can in principle represent any function, making them more powerful than linear methods. However, this increased power comes at a price of increased complexity. NNs have many choices that must be made correctly for optimal performance, including the number of layers, the number of neurons in each layer, the type of transfer function for each neuron, and the method of training the neural network. In general, training a NN is orders of magnitude slower than a linear regression, and convergence to the optimal parameters is by no means assured. NNs also have the drawback that it is less obvious how the X and Y variables are related than in a linear regression, making intuitive understanding more challenging. The problems of inadequate training and overfitting data are quite serious with NNs. Some NNs make use of "Bayesian regularization" [16–19], which includes uncertainty in the NN weights and provides some protection against overfitting. Another common solution is combining predictions from a number of differently trained NNs (prediction by "committee"; this approach is used
in, e.g., Refs. [20, 21]). Another interesting approach, which can only be used in cases where one is faced with many similar problems, is to retrain NNs on related problems, making use of the information already gained in their previous training (this is done in, e.g., Ref. [22]).

Classification maps data into predefined classes rather than continuous variables, where the classes are defined based on the dependent properties Y. For example, if Y is conductivity, one could classify materials into metals and insulators, and try to predict to which class a material should belong based on X, rather than performing a full regression of Y on X to predict the continuous conductivity values. Another example is predicting crystal structure, where each different structure type can be considered a class, and the goal is to be able to predict class (assign a structure type) based on the independent data X. In classification DM the relation F maps X onto categories YPred, rather than continuous values. There are a range of different classification methods, as described in most standard textbooks (we found Ref. [6] particularly lucid on these issues). The only classification scheme that will be discussed here is the K-nearest neighbor method, which is one of the simplest. This approach requires that one can define a distance d_ij between any two samples X_i and X_j. Classification for a new X_i is performed by calculating its K nearest neighbors in the existing data set, and then assigning X_i to the class that contains the most items from the K neighbors; a minimal sketch is given after this paragraph. The spirit of this approach underlies structure maps for crystal structure prediction, discussed in more detail below. Other classification approaches use Bayesian probabilistic methods, decision trees, NNs, etc., but will not be described here [1, 6, 7]. There are some issues with defining a metric of success for classifications. Since YPred and Y represent class occupancies, there is not necessarily any way to measure a distance between them. One way to view the results is what is rather wonderfully called a confusion matrix, where matrix element m_ij gives the number of times a sample belonging in class C_i was assigned to C_j. In order to define a metric for success it is important to realize that when assigning samples to a class there are two parameters that characterize the accuracy: the fraction of samples correctly placed into the class (true positives), and the fraction of samples incorrectly placed into the class (false positives). These can vary independently and their importance can be very dependent on the problem (for example, in classifying blood as safe, it is important to get as many true positives as possible, but absolutely essential not to allow any false positives, since that would allow unsafe blood into the blood supply). Therefore, the metric for success in classification must be chosen with some care. Note that clustering, which is similar to classification, is differentiated by the fact that clustering groups data without the data clusters being predefined. This is sometimes called "unsupervised" learning and will not be discussed further here, but can be found in most DM references.
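The K-nearest neighbor rule mentioned above can be implemented in a few lines; in this sketch the Euclidean distance plays the role of d_ij, and the training data and class labels are invented for illustration.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_new, K=3):
    """Assign x_new to the majority class among its K nearest neighbors,
    using Euclidean distance as d_ij."""
    d = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(d)[:K]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Invented two-descriptor data with classes "metal" and "insulator".
X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = ["metal", "metal", "insulator", "insulator"]

print(knn_classify(X_train, y_train, np.array([0.15, 0.15])))  # metal
print(knn_classify(X_train, y_train, np.array([0.85, 0.80])))  # insulator
```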
2.3. Assessment
Cross-validation (CV) [23, 24] is a technique to assess the predictive ability of a fit and reduce the danger of overfitting. In a CV test with N data points, N − n data points are fit and used to predict the n points excluded from the fit. The predicted error of the excluded points is the CV score. This process can be averaged over many possible subsets of the data, which is called "leave n out CV". The key concept behind CV is that the CV score is based on data not used in the fit. For this reason, the CV score will decrease as the model becomes more predictive, but will start to increase if the model under- or overfits the data. This is in contrast to predicted errors for data included in the fit, which will always decrease with more fitting degrees of freedom. For example, consider a linear regression on a set of latent variables. The root mean square (RMS) error in the fit data will be a monotonically decreasing function of the number of latent variables used in the regression. However, the CV score will generally decrease for the initial principal components, and then start to increase again as the number of principal components gets large. The initial decrease in the CV score occurs because statistically meaningful variables are being added and the regression model is becoming more accurate. The increasing CV score signals that too many variables are being used, the regression is fitting noise, and the model is overfit. By minimizing the CV score it is therefore possible to select an optimal set of latent variables for prediction. This idea is illustrated schematically in Fig. 1.
Figure 1. A schematic comparison of the error calculated with data included in the fit (normal RMS fitting error – solid line) and excluded from the fit (CV score – dashed line), plotted against the number of latent variables; the CV score passes through a minimum at the optimal number of latent variables.
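A leave-n-out CV loop for a linear regression on latent variables can be sketched as follows; the synthetic data and the choice of leaving one point out at a time are illustrative, not prescribed by any particular study.

```python
import numpy as np

def loo_cv_score(X, y):
    """Leave-one-out CV score (RMS prediction error on excluded points)
    for a least squares fit of y on the columns of X."""
    n = len(y)
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errors.append(y[i] - X[i] @ coef)
    return np.sqrt(np.mean(np.square(errors)))

# Synthetic example: y depends on 2 latent variables; 6 candidates exist.
rng = np.random.default_rng(2)
X_full = rng.normal(size=(30, 6))
y = 1.5 * X_full[:, 0] - 0.7 * X_full[:, 1] + rng.normal(0, 0.1, 30)

# Score models with increasing numbers of latent variables; the CV
# score should pass through a minimum near 2 (cf. Fig. 1).
for k in range(1, 7):
    print(k, loo_cv_score(X_full[:, :k], y))
```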
Test data is another important assessment tool, and simply refers to a set of data that is excluded from working data at the beginning of the project and then used to validate the model at the end of model building. To some extent, the CV method does this already, but in the common case where the model is altered to optimize the CV score, it will overestimate the true predictive accuracy of the model [23]. It is only by testing on an entirely new data set, which the model has not previously encountered, that a reliable estimate of the predictive capacity of the model can be established. Sometimes there is not enough data to create an effective test data set, but it is certainly advisable to do so if at all possible.
2.3.1. Optimization

Optimization methods [25, 26] are not usually considered DM, but they are an essential tool of many DM projects. For example, once a predictive model has been established, one frequently wants to optimize the inputs to give a desired output. This usually cannot be done with local optimization schemes (e.g., conjugate gradient methods) due to a rough optimization surface with many local minima. It is therefore frequently necessary to use an optimization method capable of finding at least close to the global minimum in a landscape with many local minima. A detailed discussion of these methods is beyond the scope of this article, but common approaches include simulated annealing Monte Carlo, genetic algorithms, and branch and bound strategies. Genetic algorithms seem to be the most popular in the DM applications discussed here, and work by "evolving" toward an optimal sample population through operations such as mixing, changing, and removing samples.
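To give a flavor of the genetic algorithm idea, the toy sketch below evolves a population of candidate input vectors toward maximizing a stand-in predicted property; the objective function and all parameters are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)

def predicted_property(x):
    # Stand-in for a fitted model F: a rough surface with local optima.
    return -np.sum((x - 0.3) ** 2) + 0.1 * np.sin(20 * x).sum()

pop = rng.uniform(0, 1, size=(20, 4))           # 20 candidates, 4 inputs
for generation in range(50):
    scores = np.array([predicted_property(x) for x in pop])
    parents = pop[np.argsort(scores)[-10:]]     # keep the best half
    # Mix pairs of parents and mutate slightly for the next generation.
    ia = rng.integers(0, 10, 20)
    ib = rng.integers(0, 10, 20)
    mix = rng.random((20, 4))
    pop = mix * parents[ia] + (1 - mix) * parents[ib]
    pop += rng.normal(0, 0.02, pop.shape)       # mutation
best = pop[np.argmax([predicted_property(x) for x in pop])]
print("best inputs found:", best)
```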
2.3.2. Software

Many DM algorithms are fairly simple, and can be programmed relatively quickly. Often the underlying numerical operations involve no more than standard matrix operations, and access to widely available basic linear algebra subroutines (BLAS) is adequate. However, DM is generally very explorative, and it is common to try many different approaches. Coding everything from scratch becomes prohibitive, and will lock the user into the few things they can readily implement. Fortunately, there are a large number of both free and commercial DM tools available for users. Some tools, like the Neural Net Toolbox in Matlab, are implemented in languages likely to be familiar to the materials scientist, and are readily accessible. An impressive list of possible tools is given in Appendix A of Refs. [6, 7]. It should also be remembered that for the academic user many companies will have special rates, so it is worth exploring commercial software.
3. Applications
There are far too many studies using DM methods to offer a comprehensive review. Therefore, we focus on a few key areas where DM techniques are highlighted and seem to be playing an increasingly important role.
3.1. Quantitative Structure–Property Relationships (QSPR)
Quantitative Structure–Property Relationships (QSPR), and the closely related techniques of Quantitative Structure–Activity Relationships (QSAR), are based on the fundamental tenet that many molecular properties, from boiling point to biological activity, can be derived from basic descriptors of molecular structure. For some examples, see the general review of using NNs to predict physicochemical properties in Ref. [27]. QSPR/QSAR are generally considered methods of chemistry, but are closely related to the activities of a DM materials scientist. QSPR/QSAR is a large field, and here we consider only one particularly illustrative example, the work of Chalk et al., predicting boiling points for molecules [20]. The boiling point of any given compound is not a particularly hard measurement, but the ability to quickly predict boiling points for many compounds, particularly ones that exist only as computer models, can be useful for screening in, e.g., drug design. Computing the boiling point of a compound directly from physical principles requires a very accurate model of the energetics and significant computation. Therefore, researchers have generally turned to DM applications in this area. Chalk et al. have a database of 6629 molecular structures and boiling points. The dependent variables Y are taken as the boiling points. A set of descriptors, X₀, is developed based on structural and electronic characteristics (derived from semiempirical atomistic models). A technique called formal inference-based recursive modeling (FIRM) is then used to assess the relevance of each variable (this technique will not be described here, but it allows the influence of a variable to be tested). A set of 18 descriptors is settled on as likely to be significant, and these are used for the independent variables X. A test data set of 629 molecules that span the whole range of boiling temperatures is removed. The remaining 6000 molecules are then used to find the optimal model function F to map X to Y. F is represented by a NN, and after some initial testing one is chosen with 18 first-layer nodes, 10 nodes in the hidden second layer, and a single node in the third layer. The transfer functions are all sigmoids (sig(x) = 1/(1 + exp(−x))) and trained with a back-propagation algorithm. In order to control for overfitting, the data is broken up into 10 disjoint subsets and a “leave
600 out” cross validation is performed. This trains 10 distinct NNs on 5400 molecules each. The NN training is stopped when the CV score reaches a minimum. The prediction function F is taken to be a committee, and uses the mean of the values predicted by all 10 NNs. The final test for F is done by comparing the predicted and true boiling points for the 629-molecule test set, giving errors with a standard deviation of only 19 K (the predicted vs. true boiling points for the test set are shown in Fig. 2). The predictive capacity is good enough that for many of the largest prediction errors it was possible to go back to the experimental data and show that the input data itself was in error. One could now imagine using a genetic algorithm and the prediction function F to search the space of molecular structures to find, e.g., a molecule with a very high boiling point, although no such work was performed by the authors. It is worth noting that computation plays an important role in providing the basic input data in the study. All of the structural and electrostatic descriptors were generated by semiempirical atomistic models. Using computational methods can be an efficient way to generate large amounts of descriptor information, greatly reducing the amount of experimental work required.
Figure 2. Predicted vs. true boiling points for 629 compounds. Prediction is done by neural networks fit to 6000 boiling points that did not include the 629 shown here. (After [20], reproduced with permission).
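As a hedged sketch of the committee idea just described (not the authors' code, and with synthetic data in place of the molecular database of Ref. [20]), one network can be trained per CV fold and their predictions averaged; here scikit-learn's built-in early stopping stands in for the stop-at-CV-minimum rule.

```python
# Sketch: a committee of networks, one trained per CV fold; the committee
# prediction is the mean over members. Data and settings are illustrative.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 18))          # 18 descriptors per molecule (synthetic)
y = X @ rng.normal(size=18) + 0.1 * rng.normal(size=600)

committee = []
for train, val in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    net = MLPRegressor(hidden_layer_sizes=(10,), activation="logistic",
                       early_stopping=True, max_iter=2000, random_state=0)
    committee.append(net.fit(X[train], y[train]))   # one member per fold

def committee_predict(X_new):
    """Committee prediction: mean of the values predicted by all members."""
    return np.mean([net.predict(X_new) for net in committee], axis=0)
```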
3.2. Processing–Structure–Property Relationships
Processing–Structure–Property (PSP) relationships refer to the challenging materials problem of connecting the processing parameters of a material to its structure and properties. Processing conditions might include such things as the initial composition of reactants and the annealing schedule; structural aspects might be crystal structure or grain size; and final properties are such characteristics as yield stress and corrosion resistance. PSP relationships are very important because they allow processing parameters to be adjusted to create optimal materials. PSP relationships tend to involve many different phenomena, with widely varying length and time scales, making direct modeling extremely challenging. However, analogous to QSPR's reliance on the fact that properties must be a function of the structure of the molecules involved, in PSP relationships we know that properties must follow from structure in some manner, and that structure is somehow determined by processing. The assurance that PSP relationships exist, combined with the challenge of directly modeling them, makes this a good area for DM applications. One of the most active groups in this area has been Bhadeshia and co-workers. Bhadeshia's 1999 review [21] covers much of the materials work that had been done up to that time in neural network (NN) modeling, and he and co-workers have continued to apply NN techniques in PSP applications to such areas as creep modeling [28, 29], mechanical weld properties [30, 31], and phase fractions in steel [32]. In general, these studies follow the DM framework used in QSPR above. Many of the data and codes used by Bhadeshia et al., as well as many others, can be found online as part of the Materials Algorithms Project [33]. Malinov and co-workers have also done extensive work with DM tools in PSP relationships, and have developed a code suite, complete with graphical user interface, to make use of their models [34]. Their work has focused primarily on Ti alloys [35–37] and nitrocarburized steels [38, 39]. The NN software they developed uses a cross validation (CV)-like strategy to assess the effectiveness of different NN architectures, training methods, and training runs, so that the best network can be obtained by optimization rather than intuitive choice (a sketch of this kind of automated selection is given below). It is a general trend in DM applications to try to automatically optimize as many choices as possible, since this gives the best results with the least user intervention. Many apparent DM choices, such as which latent variables or NN architectures to use, can in fact be determined by performing a large number of tests. Implementing this type of automation is generally limited by the user's willingness to code the required tests, the time it takes to perform the optimization, and the amount of data required for sufficient testing. Also, one should ideally have a test set that is entirely excluded from all the optimization processes for final testing.
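The sketch below shows one way such automated, CV-based architecture selection can be written; it is not Malinov et al.'s software [34], and the candidate architectures and synthetic data are illustrative assumptions.

```python
# Sketch: choose a network architecture by cross-validated search rather
# than intuition. Candidates and data are illustrative.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 9))            # e.g., 9 composition/processing variables
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)

search = GridSearchCV(
    MLPRegressor(max_iter=5000, random_state=0),
    param_grid={"hidden_layer_sizes": [(4,), (8,), (16,), (8, 8)],
                "activation": ["logistic", "tanh"]},
    cv=5, scoring="neg_root_mean_squared_error")
search.fit(X, y)
best_net = search.best_estimator_        # architecture selected by CV score
```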
A particularly interesting application by Malinov et al. is the prediction of time–temperature–transformation (TTT) diagrams for Ti alloys [34, 35, 37]. TTT diagrams give the time to reach a specified fraction of phase transformation at each temperature, and for a given phase fraction they are a curve in time–temperature space. They can be modeled to some extent directly with Johnson–Mehl–Avrami theory, but Malinov et al. chose to use a NN model so as to be able to predict for many systems and composition variations. The details discussed here are all from Ref. [35]. The data set was 189 TTT diagrams for Ti alloys, and the independent variables were taken to be the concentrations of the 8 most common alloying elements and oxygen. Some additional elements that were not prevalent enough in the data set for accurate treatment had to be removed or mapped onto a Mo equivalent. It should be noted that the authors are careful to identify the ranges of the concentrations of alloying elements present in the data set. This is very important, since given the limited data, it is not clear that this NN would give accurate predictions outside the concentration ranges used in training. The dependent variables presented more of a problem, since TTT diagrams are curves, not single values. Malinov et al. solved this problem by representing the TTT diagram as a 23-tuple (a schematic encoding is sketched below). Two entries gave the position of the nose of the TTT curve, i.e., its time and temperature. Ten entries gave the upper portion of the curve, where each entry was the fractional change in time for a fixed change in temperature, and ten more gave the lower portion. Finally, one entry was reserved for the martensite start temperature. These considerations, for both the independent and dependent variables, demonstrate some of the data processing that can be required for successful DM. The final predictions are quite accurate for test sets, and allowed exploration of the dependence of TTT curves on alloy composition. A number of TTT diagram predictions for (at that time) unmeasured materials were given, and some of these have since been measured, demonstrating reasonably good predictive ability for the NN model (see Fig. 3) [37]. A set of studies using DM techniques to model Al alloys has recently come out of Southampton University [40–44]. The work by Starink et al. [44] summarizes studies on strength, electrical conductivity, and toughness. These studies are particularly interesting since they directly compare different DM methods as well as more physically based modeling built on known constitutive relations. Starink et al. make use of linear regression and Bayesian NN models like those discussed above, but also apply neurofuzzy methods and support vector machines. We will not discuss these further, except to point out that the latter is a relatively new development that seems to have some improved ability to give accurate predictions over the more common NN methods, and will likely grow in importance [45–47]. For the cases of direct comparison, Starink et al. find that physically based modeling performs slightly better. However, these examples involve very small data sets (around 30 samples),
Figure 3. Comparison of predicted and measured TTT diagrams for different Ti alloys. These predictions were made and published before the experimental measurements were taken. (After Ref. [37], reproduced with permission.)
so one expects there to be significant undertraining in DM methods. Also of interest is the over three-fold decrease in predictive error for conductivity when going from linear to nonlinear DM methods, demonstrating why nonlinear NN methods have become the dominant tool for many applications. Starink et al. make some use of the concept of hybrid physical and DM approaches. This is a very natural idea, but worth mentioning explicitly. The spirit of DM is often one of using as little physical knowledge as possible, and allowing the data to guide the results. However, by introducing a certain amount of physical knowledge, a DM effort can be greatly improved. As summarized by Starink et al., this can be done through initially choosing independent variables based on known physics, using functional forms that are physically motivated in the DM, and using DM to fit remaining errors after a physical model has been used.
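As an illustration of the data processing described above for TTT curves, the sketch below packs one curve into a fixed-length 23-tuple; the fixed temperature grid and sign conventions are assumptions, not the exact choices of Ref. [35].

```python
# Sketch: encode a TTT curve as a 23-tuple (2 nose entries + 10 upper-branch
# + 10 lower-branch fractional time changes + martensite start temperature).
import numpy as np

def encode_ttt(nose_time, nose_temp, upper_times, lower_times, ms_temp):
    """upper_times/lower_times: transformation times at 11 fixed temperature
    steps above/below the nose, so each branch yields 10 fractional changes."""
    upper = np.diff(upper_times) / upper_times[:-1]   # fractional change in time
    lower = np.diff(lower_times) / lower_times[:-1]   # per temperature step
    assert upper.size == lower.size == 10
    return np.concatenate([[nose_time, nose_temp], upper, lower, [ms_temp]])

# Example: a synthetic C-shaped curve, with times growing away from the nose.
t_upper = 10.0 * np.exp(0.3 * np.arange(11))
t_lower = 10.0 * np.exp(0.2 * np.arange(11))
x = encode_ttt(10.0, 900.0, t_upper, t_lower, 650.0)
assert x.size == 23            # a fixed-length vector suitable for NN training
```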
3.3. Catalysis
A particularly exciting area for DM applications at present is catalysis. Much recent activity in this field has been driven by the advent of high-throughput experiments, where the ability to rapidly create large data sets has created a new need for data mining concepts to interpret and guide experiment. Some reviews in this area can be found in Refs. [48–50].
Some authors have taken approaches similar to those used in the QSPR/QSAR applications and the PSP modeling described above – finding a NN model to connect the properties of interest to tractable descriptors, and then exploring that model to understand dependencies or optimize properties [22, 50–56]. The input independent variables are generally the compositions of possible alloying materials in the catalyst, and the output is some measure of the catalytic activity. Note that it is quite possible to have multiple final nodes in the network to output multiple measures of interest, such as conversion of the reactants and percentages of different products [51, 52]. It is also possible to look at catalytic behavior for a fixed catalyst under different reactor conditions, where the reactor conditions become the independent variables [22]. Once a NN has been trained, the best catalyst can be found through optimization of the function defined by the NN. This is generally done with a genetic algorithm [51, 54, 56], but other methods have also been explored [55]. Baerns et al. have done influential work in using a genetic algorithm to design new catalysts, but have skipped the step of fitting a model altogether, directly running experiments on each new generation of catalysts suggested by the genetic algorithm [57–59]. For example, Baerns et al. studied the oxidative dehydrogenation of propane to propene using metal oxide catalysts with up to eight metal constituents, and found a general trend toward better catalytic activity with each generation, as shown in Fig. 4. Although optimizing on the direct experimental data limits the number of samples that can be examined (Baerns et al. generally look at only a few hundred), the results have been very encouraging, e.g., leading to an effective multicomponent catalyst for low-temperature oxidation of low-concentration propane [58]. Further success
Figure 4. The best (open bar) and mean (solid bar) yield of propene at each generation of catalysts created by genetic algorithm. (After [57], reproduced with permission.)
was obtained in studying the oxidative dehydrogenation of propane to propene by following up on materials suggested by the combinatorial genetic algorithm search with further noncombinatorial “fundamental” studies [57]. Baerns et al.'s work demonstrates that the best results are sometimes obtained by combining DM and more traditional approaches. Further improvements in high-throughput methods will make direct iterative optimization of the experiments increasingly effective, but a fitted model will likely always be able to explore more samples and provide more rigorous optimization. The choice to use a fitted model is then a balance between the advantage of being able to optimize more accurately and the disadvantage of having a less accurate function to optimize. Umegaki et al. suggest that, in direct comparisons, a combined NN and genetic algorithm approach is more effective than direct optimization of experimental results, but this is a complex issue and will be problem dependent [56]. Despite many encouraging successes, DM in catalysis still faces a number of challenges. As pointed out by Hutchings and Scurrell [49], extending the independent variables to include more preparation and processing variables might significantly broaden the search for optimal materials. In addition, issues related to lifetime, stability, and other aspects of long-term performance are often difficult to predict and need to be addressed. Finally, Klanner et al. point out that optimizing a library over a well-known space of possible compositions and designing a discovery program in areas where there is essentially no precedent pose different challenges [50]. In the case of truly new materials, the use of a QSPR/QSAR approach in catalysis design is complicated by the inherent difficulty of characterizing heterogeneous solids well enough to build diverse initial libraries. Structure is a good metric for measuring the diversity of molecular behavior, and therefore allows relatively easy assembly of diverse molecular libraries for exploration. However, the very nonlinear behavior of solid catalysts, where activity is often dependent on such subtle details as surface defects, means that at this point there is no metric for measuring, a priori, the diversity of solid catalysts. Klanner et al. therefore suggest that development work will have to proceed by building a large initial set of descriptors, based on synthesis data and properties of the constituent elements, and then using dimensional reduction to get a manageable number. Finally, no effort has been made here to compare DM to direct kinetic equation modeling in catalysis design. Some comments regarding these methods, and how they can be integrated with DM approaches, are given in Ref. [60]. It should be noted that the above issue of assembling diverse libraries, along with using genetic algorithms for intelligent searching, can be viewed as part of the general problem of optimized experimental design. This is not a new area, but it has become increasingly important due to the advent of high-throughput methods. It also encompasses such well-developed fields as
statistical Design of Experiments. This is a fruitful area for statistical and DM methods, and many of the relevant issues have already been mentioned, but we will not discuss it further here. The interested reader can consult the review by Harmon and references therein [48]. Another DM area that has been receiving increased attention due to high-throughput experiments is correlating the results of cheap and fast experimental measurements with properties of interest. This becomes particularly important when it is necessary to characterize large numbers of samples quickly, and careful measurement of the desired properties is not practical. For a discussion of this issue in high-throughput polymer research see Refs. [61, 62]; a number of rapid screening tools and detection schemes used in high-throughput catalysis development are described in Ref. [63].
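To tie the pieces of this section together, the loop below sketches the fitted-model strategy (in the spirit of the NN-plus-genetic-algorithm combinations discussed above): a surrogate is refit to all measurements so far, a mutation-based search proposes candidates on the surrogate, and the best candidates go back to “experiment”. The function measure_yield is a synthetic stand-in for the laboratory step, and all settings are illustrative.

```python
# Sketch: alternate between measuring proposed catalyst compositions,
# refitting a surrogate model, and searching the surrogate for new
# candidates. All functions and settings are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)

def measure_yield(comps):
    # Synthetic stand-in for an experimental screening measurement.
    return np.exp(-np.sum((comps - 0.3) ** 2, axis=1)) \
        + 0.01 * rng.normal(size=len(comps))

X = rng.uniform(0, 1, size=(40, 8))      # initial batch of 8-component catalysts
y = measure_yield(X)

for round_ in range(5):
    surrogate = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                             random_state=0).fit(X, y)
    parents = X[np.argsort(y)[-10:]]     # best compositions measured so far
    candidates = np.clip(parents[rng.integers(0, 10, 500)]
                         + rng.normal(0, 0.05, size=(500, 8)), 0, 1)
    best = candidates[np.argsort(surrogate.predict(candidates))[-10:]]
    X = np.vstack([X, best])             # send the surrogate's picks to "experiment"
    y = np.concatenate([y, measure_yield(best)])
```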
3.4. Crystal Structure
The prediction of crystal structure is a classic materials problem that has been an area of ongoing research for many years. Now that modeling efforts have made computational materials design a real possibility in many areas, the problem of predicting crystal structure has become more practically pressing, since it is usually a prerequisite for any extensive materials modeling. Crystal structure prediction is an area well suited to DM efforts, since there is no generally reliable and tractable method to predict structure, and there is a lot of structural data collected in crystallographic databases (e.g., ICSD [64], Pauling files [65], CRYSTMET [66], ICDD [67]). Some of the most successful methods for crystal structure prediction are what are known as structure maps, reviewed at length in Refs. [68, 69]. Structure maps exist primarily for binary and ternary compounds, and the best-known examples are probably the Pettifor maps [70]. To understand how Pettifor maps work, consider the map designed for AB binary alloys. Each possible element is assigned a number, called the Mendeleev number. Each alloy AB can then be plotted on Cartesian axes by assigning it the position (x, y), where x is the Mendeleev number of element A and y is the Mendeleev number of element B. At position (x, y) one places a symbol representing the structure type of alloy AB. When enough data are plotted, like symbols tend to cluster – in other words, alloys with the same structure type tend to be located near each other on the map. This can be clearly seen in the Pettifor map in Fig. 5. The probable structure type for a new alloy can simply be found by locating where the new alloy should reside in the map and examining the nearby structure types. Structure maps were not originally introduced as an example of DM, but can be understood within that framework. One can extend the idea of using the Mendeleev number to a general “vector map,” which maps each alloy to a
Figure 5. An AB binary alloy Pettifor map. Notice that like structure types show a clear tendency to cluster near one another. Provided by John Rodgers using the CRYSTMET database [66].
multicomponent vector. The vector components might be any set of descriptors for the alloy, such as Mendeleev numbers, melting temperatures, or differences in electronegativities. Once the alloys have been mapped to representative vectors they are amenable to different DM schemes. Since crystal structures are discrete categories, not continuous values, some sort of classification DM is going to be required. Structure maps work by defining a simple Euclidean metric on the alloy vectors and making the assumption that alloys with the same structure types will be close together. When a new alloy is encountered its crystal structure
is predicted by examining the neighborhood of the new alloy in the structure map. Structure types that appear frequently in a small neighborhood of the new alloy are good candidates for the alloy's structure type. This is a geometric classification scheme, along the lines of the K-nearest-neighbors method described above. There is no unique way to define the vectors that create the structure map, and many different physical quantities, such as electronegativities and effective radii, have been proposed for constructing structure maps. Ref. [64] lists at least 53 different atomic parameters that could be used as descriptors to define a structure map. The most accurate Pettifor maps are built by mapping alloys to vectors using a specially devised chemical scale [71]. The chemical scale was motivated by many physical concerns, but is fundamentally an empirical way to map alloys to vectors, chosen to optimize the clustering of alloys with the same crystal structures. A number of new ideas are suggested by viewing crystal structure prediction from a DM framework. First, it is clear that many standard assessment techniques have only recently begun to be incorporated. It was not until about 20 years after the first Pettifor maps that an effort was made to formalize their clustering algorithm and assess their accuracy using cross validation techniques (the accuracy was found to be very good, in some cases giving correct predictions for non-unique structures 95% of the time) [72]. Also, the question of how to assess errors can be fruitfully thought of in terms of false positives (predicting a crystal structure that is wrong) and false negatives (failing to predict the crystal structure that is right). For many situations, e.g., predicting structures to be checked by ab initio methods or used as input for Rietveld refinements, a false positive is not a large problem, since the error will likely be discarded at a later stage, but a false negative is critical, since it means the correct answer will not be found with further investigation. This leads to the idea of using maps to suggest a candidate structure list, rather than a single candidate structure [72]. Using a list creates many false positives, but greatly reduces the chance of false negatives. A DM perspective on structure prediction encourages one to think of moving beyond present structure map methods. For example, different metrics, other classification algorithms, or mining on more complex alloy descriptors might yield more accurate results. Some work along these lines has already occurred, including machine-learning-based structure maps [73] and NN and clustering predictions of compound formation [74]. A similarly spirited application used partial least squares to predict higher-level structural features of zeolites in terms of simpler structural descriptors [75], and is part of a more general center focused on DM in materials science [76]. Structure maps have at least two severe limitations. As described above, they predict the structure type given that the alloy has a structure at a given stoichiometry, but do not consider the question of whether or not an alloy will have an ordered phase at that stoichiometry. This is not a problem when a structure
is known to exist and one wants to identify it, but in many cases that information is not available. There are some successful methods for identifying alloys as compound-forming versus non-compound-forming, e.g., Miedema's rules [77] or Villars' maps for ternary compounds [68], but the problem of identifying when an alloy will show ordering at a given composition has not been thoroughly investigated in the context of structure maps. However, it is certainly possible that further DM work could be of value in solving this problem, and some potentially useful methods are discussed below. Another serious limitation of structure maps is that classification DM is only effective when an adequate number of samples of each class is available. There are already thousands of structure types, the number is still increasing, and only a small percentage of possible multicomponent alloy systems have been explored [68]. Therefore, it seems unlikely that sufficiently many examples of all the structure type classes will ever be available for totally general application of structure maps. Infrequent structure types are less robustly predicted with structure maps, and totally new structure types cannot be predicted at all. The problem of limited sampling can be alleviated by restricting the area of focus, e.g., considering only the most common structure types, which are likely to be well sampled, or only a subset of alloys, where all the relevant structure types can be discovered. However, the very significant challenge of sampling all the relevant structure types creates a need for other methods. One promising idea is to abandon the use of structure types as the most effective way to classify structures and replace it with a scheme that is easier to sample. An idea along these lines is to classify alloys by the local environments around each atom [68, 78]. Local environments may in fact be a more relevant method of classification than structure type for understanding physical properties, and there seem to be far fewer local environments than different structure types. This is analogous to classifying proteins by their different folds, which are essential to function and come in limited variety [79]. Computational methods, using different Hamiltonians, offer an increasingly practical route toward crystal structure prediction. Given an accurate Hamiltonian for an alloy, the stable crystal structures can be calculated by minimizing the total energy. These techniques can also predict entirely new structures never seen experimentally, since the prediction is done on the computer. Unfortunately, the structural energy landscape has many local minima, and it cannot be explored quickly or easily. Researchers in this area are therefore forced to make a tradeoff between the speed and accuracy of the energy methods and the range of possible structures that are explored. For example, Jansen has used simple pair potentials to explore the energy landscape, and then applied more accurate ab initio methods for likely structural candidates [80]. This is a common approach: optimize with simplified expressions, and then use slower and more accurate ab initio energy methods on only the more promising areas. A similar approach was taken to predict a range of
inorganic structures using a genetic algorithm [81]. If one restricts the possible structures, then direct optimization of ab initio energies can be performed. For example, low cohesive energy structures for 32 possible alloying elements were found on a four-atom, face-centered-cubic unit cell by optimizing ab initio energies using a genetic algorithm [82]. Although these approaches are quite promising, optimizing the energy over the space of all possible atomic arrangements is generally not practical. It is necessary to find some approach to guide the calculations to regions of structure space that are likely to have the lowest energy structures and can be explored effectively. A practical and common method to guide calculations is sometimes colloquially referred to as the “round up the usual suspects” approach, borrowing a quotation from Captain Louis Renault at the end of Casablanca. This approach simply involves calculating structures one thinks are likely to be ground states, and is another example of human DM, where the scientist draws on their own experience to guide the calculations toward the correct structure. As mentioned in the introduction, formalizing human DM on the computer offers many advantages in accuracy, verification, portability, and efficiency. An improvement can be made by limiting the human component to suggesting a few likely parent lattices, and then fitting simplified Hamiltonians on each parent lattice to predict stable structures. This approach, called the cluster expansion (sketched below), has been well validated in many systems [83, 84] and has been successful in predicting some structures that had not been previously identified experimentally [85, 86]. However, choosing the correct parent lattice and performing the fitting required for a cluster expansion is at present still difficult to automate, although efforts along these lines are being made [87]. Ideally, the process of guiding computational crystal structure prediction would be entirely automated by DM methods. A step in this direction has been taken by Curtarolo et al., who have demonstrated how one might combine experimental data, high-throughput computation, and DM methods to guide new calculations toward likely stable crystal structures [88]. Experimental information is used to get a list of commonly occurring structure types, and then these are calculated using automated scripts for a large number of systems. Mined correlations between structural energies are then used to guide calculations on new systems toward stable regions, reducing the number of calculations required to predict crystal structures. This approach can, in theory, be expanded to totally new structure types, since these can be generated on the computer, and work in this direction is under development.
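For reference, the cluster expansion mentioned above represents the configurational energy on a fixed parent lattice schematically as (see Refs. [83, 84] for precise definitions)

$$E(\sigma) = \sum_{\alpha} J_\alpha \, \Phi_\alpha(\sigma),$$

where σ specifies the occupation of each lattice site, the Φ_α are correlation functions of clusters of sites (points, pairs, triplets, ...), and the effective cluster interactions J_α are fit, e.g., to ab initio energies of a set of ordered structures. Candidate ground states are then found by minimizing E(σ) over configurations on the chosen parent lattice.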
4. Conclusions
We have seen here a number of different examples of DM applications in different areas, and it is valuable to step back and note some overall
features. In general, DM applications in materials development still need to prove themselves, and relatively few new discoveries have been made using them. Many of the results in this field consist primarily of exploring new models to demonstrate that such modeling is possible, that accurate predictions can be made, and that useful understanding of dependencies on key variables can be obtained. This will inevitably cause some skepticism about the final utility of the methods, but it is appropriate for a field which is still relatively young and finding its place. A similar evolution has occurred for, e.g., ab initio quantum mechanical techniques. It is only recently that these methods have moved out of the stage where the accuracy of the model was the key issue to the stage where the bulk of papers focus on the materials results, not the techniques. All the drivers for using DM methods identified in the introduction (more data, more databases, and better DM tools) will only become increasingly forceful with continuing advances in experiment, computation, algorithms, and information technology. For these reasons, we believe that DM approaches are going to be increasingly important tools for the modern materials developer.

A number of the above examples showed the necessity of combining DM methods with more traditional physical approaches. Whether it is microstructural modeling in the area of processing–structure–property prediction or kinetic equation modeling in catalysis design, physical modeling is by no means standing still, and its utility will continue to expand. In the few cases where authors make direct comparisons, it is not clear that DM applications have been more effective [44, 89]. It is already true that DM approaches, although more data focused, are deeply intertwined with traditional physical modeling. A researcher's knowledge of the physics of the problem strongly influences such things as choices of descriptors (e.g., exponentiating parameters where thermal activation is expected), choices in the predictive model (e.g., using linear models when linear relationships are expected), and many unwritten small decisions about how the DM is done. DM and physical modeling, despite an apparent conflict, are really best used collaboratively, and effective materials researchers will need to combine both tools to have maximal impact.

Another important feature to note is the difference between DM in materials science and the more established areas of drug design and QSPR/QSAR. Although the overall framework is very similar, establishing effective descriptors for independent variables seems to be harder in materials applications. Bulk materials, more common in traditional materials science applications, often have atomic-, nano-, and micro-structural features that are hard to characterize and quantify with effective descriptors. In their absence, further progress on many problems will require additional descriptors relating to processing choices.
Finally, we would like to stress the natural synergy between DM and other kinds of computational modeling. High-throughput computation can help provide the wealth of data needed for robust data mining, as was illustrated above in the use of computationally optimized structures for boiling point modeling [20] and crystal structure prediction [80–82, 88]. Impressive examples of high-throughput ab initio computation providing large amounts of accurate materials data can be found in Refs. [90–92]. High-throughput computation not only increases the effectiveness of DM methods, but extends the reach of computational modeling, since DM methods can help span the challenging range of length and time scales involved in materials phenomena. The growing power of DM and other computational methods will only increase their interdependence in the future.

On a more personal note, we have found that one of the most valuable contributions of DM to our research has been to expand how we think about problems. DM encourages one to ask how one can make optimal use of data and to look deeply for patterns that might provide valuable information. DM makes one think on a large scale, thereby encouraging the automation of experiment, computation, and data analysis for high-throughput production. DM also encourages a culture of careful testing for any kind of fitting, through cross validation and statistical methods. Finally, DM is inherently interdisciplinary, encouraging materials scientists to learn more about analogous problems and techniques from across the hard and soft sciences, thereby enriching us all as researchers.
References

[1] W. Klosgen and J.M. Zytkow, Handbook of Data Mining and Knowledge Discovery, Oxford University Press, Oxford, 2002.
[2] N. Ye, The Handbook of Data Mining, Lawrence Erlbaum Associates, London, 2003.
[3] D. von Mendelejeff, “Ueber die Beziehungen der Eigenschaften zu den Atomgewichten der Elemente,” Zeit. Chem., 12, 405–406, 1869.
[4] M.F. Ashby, Materials Selection in Mechanical Design, Butterworth-Heinemann, Boston, 1999.
[5] D. Braha, Data Mining for Design and Manufacturing, Kluwer Academic Publishers, Boston, 2001.
[6] M.H. Dunham, Data Mining: Introductory and Advanced Topics, Pearson Education, Inc., Upper Saddle River, New Jersey, 2003.
[7] M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms, Wiley-Interscience, IEEE Press, Hoboken, New Jersey, 2003.
[8] PolyphonicHMI, (http://www.polyphonichmi.com/technology.html).
[9] M.H. Kutner, C.J. Nachtschiem, W. Wasserman, and J. Neter, Applied Linear Statistical Models, McGraw-Hill, New York, 1996.
[10] A.C. Rencher, Methods of Multivariate Analysis, Wiley-Interscience, New York, 2002.
[11] J.E. Jackson, A User's Guide to Principal Components, John Wiley & Sons, New York, 1991.
[12] S. de Jong, “SIMPLS: an alternative approach to partial least squares regression,” Chemometrics and Intelligent Laboratory Systems, 18, 251–263, 1993.
[13] B.M. Wise and N.B. Gallagher, PLS Toolbox 2.1 for Matlab, Eigenvector Research, Inc., Manson, WA, 2000.
[14] S. Wold, A. Ruhe, H. Wold, and W.J. Dunn, “The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses,” SIAM J. Sci. Stat. Comput., 5, 735–743, 1984.
[15] M.T. Hagan, H.B. Demuth, and M.H. Beale, Neural Network Design, Martin Hagan, 2002.
[16] D.J.C. MacKay, “Bayesian interpolation,” Neural Comput., 4, 415–447, 1992.
[17] D.J.C. MacKay, “A practical Bayesian framework for backpropagation networks,” Neural Comput., 4, 448–472, 1992.
[18] D.J.C. MacKay, “Probable networks and plausible predictions – a review of practical Bayesian methods for supervised neural networks,” Network-Comput. Neural Syst., 6, 469–505, 1995.
[19] D.J.C. MacKay, “Bayesian modeling with neural networks,” In: H. Cerjack (ed.), Mathematical Modeling of Weld Phenomena, vol. 3, The Institute of Materials, London, pp. 359–389, 1997.
[20] A.J. Chalk, B. Beck, and T. Clark, “A quantum mechanical/neural net model for boiling points with error estimation,” J. Chem. Inf. Comput. Sci., 41, 457–462, 2001.
[21] H. Bhadeshia, “Neural networks in materials science,” ISIJ Int., 39, 966–979, 1999.
[22] J.M. Serra, A. Corma, A. Chica, E. Argente, and V. Botti, “Can artificial neural networks help the experimentation in catalysis?,” Catal. Today, 81, 393–403, 2003.
[23] K. Baumann, “Cross-validation as the objective function for variable-selection techniques,” Trac-Trend Anal. Chem., 22, 395–406, 2003.
[24] A.S. Goldberger, A Course in Econometrics, Harvard University Press, Cambridge, MA, 1991.
[25] E.K.P. Chong and S.H. Zak, An Introduction to Optimization, John Wiley & Sons, New York, 2001.
[26] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, Cambridge University Press, Cambridge, 1992.
[27] J. Taskinen and J. Yliruusi, “Prediction of physicochemical properties based on neural network modelling,” Adv. Drug Deliv. Rev., 55, 1163–1183, 2003.
[28] H. Bhadeshia, “Design of ferritic creep-resistant steels,” ISIJ Int., 41, 626–640, 2001.
[29] T. Sourmail, H. Bhadeshia, and D.J.C. MacKay, “Neural network model of creep strength of austenitic stainless steels,” Mater. Sci. Technol., 18, 655–663, 2002.
[30] S.H. Lalam, H. Bhadeshia, and D.J.C. MacKay, “Estimation of mechanical properties of ferritic steel welds part 1: Yield and tensile strength,” Sci. Technol. Weld. Joining, 5, 135–147, 2000.
[31] S.H. Lalam, H. Bhadeshia, and D.J.C. MacKay, “Estimation of mechanical properties of ferritic steel welds part 2: Elongation and Charpy toughness,” Sci. Technol. Weld. Joining, 5, 149–160, 2000.
[32] M.A. Yescas, H. Bhadeshia, and D.L. MacKay, “Estimation of the amount of retained austenite in austempered ductile irons using neural networks,” Mater. Sci. Eng. A, 311, 162–173, 2001.
[33] S. Cardie and H.K.D.H. Bhadeshia, “Materials Algorithms Project (MAP): public domain research software & data,” In: Mathematical Modelling of Weld Phenomena IV, Institute of Materials, London, 1998.
[34] S. Malinov and W. Sha, “Software products for modelling and simulation in materials science,” Comput. Mater. Sci., 28, 179–198, 2003.
[35] S. Malinov, W. Sha, and Z. Guo, “Application of artificial neural network for prediction of time-temperature-transformation diagrams in titanium alloys,” Mater. Sci. Eng. A, 283, 1–10, 2000.
[36] S. Malinov, W. Sha, and J.J. McKeown, “Modelling the correlation between processing parameters and properties in titanium alloys using artificial neural network,” Comput. Mater. Sci., 21, 375–394, 2001.
[37] S. Malinov and W. Sha, “Application of artificial neural networks for modelling correlations in titanium alloys,” Mater. Sci. Eng. A, 365, 202–211, 2004.
[38] T. Malinova, S. Malinov, and N. Pantev, “Simulation of microhardness profiles for nitrocarburized surface layers by artificial neural network,” Surf. Coat. Technol., 135, 258–267, 2001.
[39] T. Malinova, N. Pantev, and S. Malinov, “Prediction of surface hardness after ferritic nitrocarburising of steels using artificial neural networks,” Mater. Sci. Technol., 17, 168–174, 2001.
[40] S. Christensen, J.S. Kandola, O. Femminella, S.R. Gunn, P.A.S. Reed, and I. Sinclair, “Adaptive numerical modelling of commercial aluminium plate performance,” Aluminium Alloys: Their Physical and Mechanical Properties, Pts 1–3, 331–3, 533–538, 2000.
[41] O.P. Femminella, M.J. Starink, M. Brown, I. Sinclair, C.J. Harris, and P.A.S. Reed, “Data pre-processing/model initialisation in neurofuzzy modelling of structure–property relationships in Al–Zn–Mg–Cu alloys,” ISIJ Int., 39, 1027–1037, 1999.
[42] O.P. Femminella, M.J. Starink, S.R. Gunn, C.J. Harris, and P.A.S. Reed, “Neurofuzzy and supanova modelling of structure–property relationships in Al–Zn–Mg–Cu alloys,” Aluminium Alloys: Their Physical and Mechanical Properties, Pts 1–3, 331–3, 1255–1260, 2000.
[43] J.S. Kandola, S.R. Gunn, I. Sinclair, and P.A.S. Reed, “Data driven knowledge extraction of materials properties,” In: Proceedings of Intelligent Processing and Manufacturing of Materials, Hawaii, USA, 1999.
[44] M.J. Starink, I. Sinclair, P.A.S. Reed, and P.J. Gregson, “Predicting the structural performance of heat-treatable Al-alloys,” In: Aluminium Alloys: Their Physical and Mechanical Properties, Parts 1–3, vol. 331–337, pp. 97–110, Trans Tech Publications, Switzerland, 2000.
[45] H. Byun and S.W. Lee, “Applications of support vector machines for pattern recognition: A survey,” Pattern Recogn. Support Vector Machines, Proc., 2388, 213–236, 2002.
[46] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, UK, 2000.
[47] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[48] L. Harmon, “Experiment planning for combinatorial materials discovery,” J. Mater. Sci., 38, 4479–4485, 2003.
[49] G.J. Hutchings and M.S. Scurrell, “Designing oxidation catalysts – are we getting better?,” Cattech, 7, 90–103, 2003.
[50] C. Klanner, D. Farrusseng, L. Baumes, C. Mirodatos, and F. Schuth, “How to design diverse libraries of solid catalysts?,” QSAR & Combinatorial Science, 22, 729–736, 2003.
[51] T.R. Cundari, J. Deng, and Y. Zhao, “Design of a propane ammoxidation catalyst using artificial neural networks and genetic algorithms,” Indust. & Eng. Chem. Res., 40, 5475–5480, 2001.
[52] T. Hattori and S. Kito, “Neural-network as a tool for catalyst development,” Catal. Today, 23, 347–355, 1995.
[53] M. Holena and M. Baerns, “Feedforward neural networks in catalysis – a tool for the approximation of the dependency of yield on catalyst composition, and for knowledge extraction,” Catal. Today, 81, 485–494, 2003.
[54] K. Huang, X.L. Zhan, F.Q. Chen, and D.W. Lu, “Catalyst design for methane oxidative coupling by using artificial neural network and hybrid genetic algorithm,” Chem. Eng. Sci., 58, 81–87, 2003.
[55] A. Tompos, J.L. Margitfalvi, E. Tfirst, and L. Vegvari, “Information mining using artificial neural networks and ‘holographic research strategy’,” Appl. Catal. A, 254, 161–168, 2003.
[56] T. Umegaki, Y. Watanabe, N. Nukui, E. Omata, and M. Yamada, “Optimization of catalyst for methanol synthesis by a combinatorial approach using a parallel activity test and genetic algorithm assisted by a neural network,” Energy Fuels, 17, 850–856, 2003.
[57] O.V. Buyevskaya, A. Bruckner, E.V. Kondratenko, D. Wolf, and M. Baerns, “Fundamental and combinatorial approaches in the search for and optimisation of catalytic materials for the oxidative dehydrogenation of propane to propene,” Catal. Today, 67, 369–378, 2001.
[58] U. Rodemerck, D. Wolf, O.V. Buyevskaya, P. Claus, S. Senkan, and M. Baerns, “High-throughput synthesis and screening of catalytic materials – case study on the search for a low-temperature catalyst for the oxidation of low-concentration propane,” Chem. Eng. J., 82, 3–11, 2001.
[59] D. Wolf, O.V. Buyevskaya, and M. Baerns, “An evolutionary approach in the combinatorial selection and optimization of catalytic materials,” Appl. Catal. A, 200, 63–77, 2000.
[60] J.M. Caruthers, J.A. Lauterbach, K.T. Thomson, V. Venkatasubramanian, C.M. Snively, A. Bhan, S. Katare, and G. Oskarsdottir, “Catalyst design: knowledge extraction from high-throughput experimentation,” J. Catal., 216, 98–109, 2003.
[61] A. Tuchbreiter and R. Mulhaupt, “The polyolefin challenges: catalyst and process design, tailor-made materials, high-throughput development and data mining,” Macromol. Symp., 173, 1–20, 2001.
[62] A. Tuchbreiter, J. Marquardt, B. Kappler, J. Honerkamp, M.O. Kristen, and R. Mulhaupt, “High-output polymer screening: exploiting combinatorial chemistry and data mining tools in catalyst and polymer development,” Macromol. Rapid Comm., 24, 47–62, 2003.
[63] A. Hagemeyer, B. Jandeleit, Y.M. Liu, D.M. Poojary, H.W. Turner, A.F. Volpe, and W.H. Weinberg, “Applications of combinatorial methods in catalysis,” Appl. Catal. A, 221, 23–43, 2001.
[64] G. Bergerhoff, R. Hundt, R. Sievers, and I.D. Brown, “The inorganic crystal-structure data-base,” J. Chem. Inf. Comput. Sci., 23, 66–69, 1983.
[65] P. Villars, K. Cenzual, J.L.C. Daams, F. Hullinger, T.B. Massalski, H. Okamoto, K. Osaki, and A. Prince, Pauling File, ASM International, Materials Park, Ohio, USA, 2002.
[66] P.S. White, J. Rodgers, and Y. Le Page, “Crystmet: a database of structures and powder patterns of metals and intermetallics,” Acta Cryst. B, 58, 343–348, 2002.
[67] S. Kabekkodu, G. Grosse, and J. Faber, “Data mining in the ICDD's metals & alloys relational database,” EPDIC 7: European Powder Diffraction, Pts 1 and 2, 378–3, 100–105, 2001.
[68] P. Villars, “Factors governing crystal structures,” In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, Principles and Practice, vol. 1, John Wiley & Sons, New York, pp. 227–275, 1994.
[69] J.K. Burdett and J. Rodgers, “Structure & property maps for inorganic solids,” In: R.B. King (ed.), Encyclopedia of Inorganic Chemistry, vol. 7, John Wiley & Sons, New York, 1994.
[70] D.G. Pettifor, “The structures of binary compounds: I. Phenomenological structure maps,” J. Phys. C: Solid State Phys., 19, 285–313, 1986.
[71] D.G. Pettifor, “A chemical scale for crystal-structure maps,” Solid State Commun., 51, 31–34, 1984.
[72] D. Morgan, J. Rodgers, and G. Ceder, “Automatic construction, implementation and assessment of Pettifor maps,” J. Phys. Condens. Matter, 15, 4361–4369, 2003.
[73] G.A. Landrum, Prediction of Structure Types for Binary Compounds, Rational Discovery, Inc., Palo Alto, pp. 1–8, 2001.
[74] Y.H. Pao, B.F. Duan, Y.L. Zhao, and S.R. LeClair, “Analysis and visualization of category membership distribution in multivariate data,” Eng. Appl. Artif. Intell., 13, 521–525, 2000.
[75] A. Rajagopalan, C.W. Suh, X. Li, and K. Rajan, “‘Secondary’ descriptor development for zeolite framework design: an informatics approach,” Appl. Catal. A, 254, 147–160, 2003.
[76] K. Rajan, Combinatorial Materials Science and Material Informatics Laboratory (COSMIC), (http://www.rpi.edu/~rajank/materialsdiscovery/).
[77] F.R. de Boer, R. Boom, W.C.M. Mattens, A.R. Miedema, and A.K. Niessen, Cohesion in Metals: Transition Metal Alloys, North Holland, Amsterdam, 1988.
[78] J.L.C. Daams, “Atomic environments in some related intermetallic structure types,” In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, Principles and Practice, vol. 1, John Wiley & Sons, New York, pp. 227–275, 1994.
[79] S. Dietmann, J. Park, C. Notredame, A. Heger, M. Lappe, and L. Holm, “A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3,” Nucleic Acids Res., 29, 55–57, 2001.
[80] M. Jansen, “A concept for synthesis planning in solid-state chemistry,” Angew. Chem. Int. Ed., 41, 3747–3766, 2002.
[81] S.M. Woodley, P.D. Battle, J.D. Gale, and C.R.A. Catlow, “The prediction of inorganic crystal structures using a genetic algorithm and energy minimisation,” Phys. Chem. Chem. Phys., 1, 2535–2542, 1999.
[82] G.H. Johannesson, T. Bligaard, A.V. Ruban, H.L. Skriver, K.W. Jacobsen, and J.K. Norskov, “Combined electronic structure and evolutionary search approach to materials design,” Phys. Rev. Lett., 88, pp. 255506-1–255506-5, 2002.
[83] D. de Fontaine, “Cluster approach to order-disorder transformations in alloys,” In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, vol. 47, Academic Press, pp. 33–77, 1994.
[84] A. Zunger, “First-principles statistical mechanics of semiconductor alloys and intermetallic compounds,” In: Statics and Dynamics of Alloy Phase Transformations, New York, 1994.
[85] V. Blum and A. Zunger, “Structural complexity in binary bcc ground states: The case of bcc Mo–Ta,” Phys. Rev. B, 69, pp. 020103-1–020103-4, 2004.
[86] G. Ceder, “Predicting properties from scratch,” Science, 280, 1099–1100, 1998.
[87] A. van de Walle, M. Asta, and G. Ceder, “The alloy theoretic automated toolkit: A user guide,” Calphad – Computer Coupling of Phase Diagrams and Thermochemistry, 26, 539–553, 2002.
[88] S. Curtarolo, D. Morgan, K. Persson, J. Rodgers, and G. Ceder, “Predicting crystal structures with data mining of quantum calculations,” Phys. Rev. Lett., 91, 2003.
[89] B. Chan, M. Bibby, and N. Holtz, “Predicting 800 to 500 °C weld cooling times by using backpropagation neural networks,” Trans. Can. Soc. Mech. Eng., 20, 75, 1996.
[90] T. Bligaard, G.H. Johannesson, A.V. Ruban, H.L. Skriver, K.W. Jacobsen, and J.K. Norskov, “Pareto-optimal alloys,” Appl. Phys. Lett., 83, 4527–4529, 2003.
[91] S. Curtarolo, D. Morgan, and G. Ceder, “Accuracy of ab initio methods in predicting the crystal structures of metals: Review of 80 binary alloys,” submitted for publication, 2004.
[92] A. Franceschetti and A. Zunger, “The inverse band-structure problem of finding an atomic configuration with given electronic properties,” Nature, 402, 60–63, 1999.
1.19 FINITE ELEMENTS IN AB INITIO ELECTRONIC-STRUCTURE CALCULATIONS
J.E. Pask and P.A. Sterne
Lawrence Livermore National Laboratory, Livermore, CA, USA
Over the course of the past two decades, the density functional theory (DFT) (see e.g., [1]) of Hohenberg, Kohn, and Sham has proven to be an accurate and reliable basis for the understanding and prediction of a wide range of materials properties from first principles (ab initio), with no experimental input or empirical parameters. However, the solution of the Kohn–Sham equations of DFT is a formidable task, and this has limited the range of physical systems which can be investigated by such rigorous, quantum mechanical means. In order to extend the interpretive and predictive power of such quantum mechanical theories further into the domain of “real materials”, involving nonstoichiometric deviations, defects, grain boundaries, surfaces, interfaces, and the like, robust and efficient methods for the solution of the associated quantum mechanical equations are critical. The finite-element (FE) method (see e.g., [2]) is a general method for the solution of partial differential and integral equations which has found wide application in diverse fields ranging from particle physics to civil engineering. Here, we discuss its application to large-scale ab initio electronic-structure calculations. Like the traditional planewave (PW) method (see e.g., [3]), the FE method is a variational expansion approach, in which solutions are represented as a linear combination of basis functions. However, whereas the PW method employs a Fourier basis, with every basis function overlapping every other, the FE method employs a basis of strictly local piecewise polynomials, each overlapping only its immediate neighbors. Because the FE basis consists of polynomials, the method is completely general and systematically improvable, like the PW method. Because the basis is strictly local, however, the method offers some significant advantages. First, because the basis functions are localized, they can be concentrated where needed in real space to increase the efficiency
of the representation. Second, a variety of boundary conditions can be accommodated, including Dirichlet boundary conditions for molecules or clusters, Bloch boundary conditions for crystals, or a mixture of these for surfaces. Finally, and most significantly for large-scale calculations, the strict locality of the basis facilitates implementation on massively parallel computational architectures by minimizing the need for nonlocal communications. The advantages of such a local, real-space approach in large-scale calculations have been amply demonstrated in the context of finite-difference (FD) methods (see, e.g., [4]). However, FD methods are not variational expansion methods, and this leads to disadvantages such as limited accuracy in integrations and nonvariational convergence. By retaining the use of a basis while remaining strictly local in real space, FE methods combine significant advantages of both PW and FD approaches.
1. Finite Element Bases
The construction and key properties of FE bases are perhaps best conveyed in the simplest case: a one-dimensional (1D), piecewise-linear basis. Figure 1 shows the steps involved in the construction of such a basis on a domain Ω = (0, 1). The domain is partitioned into subdomains called elements (Fig. 1a). In this case, the domain is partitioned into three elements Ω₁–Ω₃; in practice, there are typically many more, so that each element encompasses only a small fraction of the domain. For simplicity, we have chosen a uniform partition, but this need not be the case in general. (Indeed, it is precisely the flexibility to partition the domain as desired which allows for the substantial efficiency of the basis in highly inhomogeneous problems.) A parent basis φ̂_i is then defined on the parent element Ω̂ = (−1, 1) (Fig. 1b). In this case, the parent basis functions are φ̂_1(ξ) = (1 − ξ)/2 and φ̂_2(ξ) = (1 + ξ)/2. Since the parent basis consists of two (independent) linear polynomials, it is complete to linear order, i.e., a linear combination can represent any linear polynomial exactly. Furthermore, it is defined such that each function takes on the value 1 at exactly one point, called its node, and vanishes at all (one, in this case) other nodes. Local basis functions φ_i^(e) are then generated by transformations ξ^(e)(x), from the parent element Ω̂ to each element Ω_e, of the parent basis functions φ̂_i (Fig. 1c). In the present case, for example, φ_1^(1)(x) ≡ φ̂_1(ξ^(1)(x)) = 1 − 3x and φ_2^(1)(x) ≡ φ̂_2(ξ^(1)(x)) = 3x, where ξ^(1)(x) = 6x − 1. Finally, the piecewise-polynomial basis functions φ_i of the method are generated by piecing together the local basis functions (Fig. 1d). In the present case, for example,

$$\phi_2(x) = \begin{cases} \phi_2^{(1)}(x), & x \in [0, 1/3], \\ \phi_1^{(2)}(x), & x \in [1/3, 2/3], \\ 0, & \text{otherwise.} \end{cases}$$
Figure 1. 1D piecewise-linear FE bases. (a) Domain and elements. (b) Parent element and parent basis functions. (c) Local basis functions generated by transformations of parent basis functions to each element. (d) General piecewise-linear basis, generated by piecing together local basis functions across interelement boundaries. (e) Dirichlet basis, generated by omitting boundary functions. (f) Periodic basis, generated by piecing together boundary functions.
The above 1D piecewise-linear FE basis possesses the key properties of all such bases, whether of higher dimension or higher polynomial order. First, the basis functions are strictly local, i.e., nonzero over only a small fraction of the domain. This leads to sparse matrices and scalability, as in FD approaches,
while retaining the use of a basis, as in PW approaches. Second, within each element, the basis functions are simple, low-order polynomials, which leads to computational efficiency, generality, and systematic improvability, as in FD and PW approaches. Third, the basis functions are C⁰ in nature, i.e., continuous but not necessarily smooth. As we shall discuss, this necessitates extra care in the solution of second-order problems, with periodic boundary conditions in particular. Finally, the basis functions have the key property φ_i(x_j) = δ_ij, i.e., each basis function takes on a value of 1 at its associated node and vanishes at all other nodes. By virtue of this property, an FE expansion f(x) = Σ_i c_i φ_i(x) has the property f(x_j) = c_j, so that the expansion coefficients have a direct, real-space meaning. This eliminates the need for computationally intensive transforms, such as Fourier transforms in PW approaches, and facilitates preconditioning in iterative solutions, such as multigrid in FD approaches (see, e.g., [4]).

Figure 1(d) shows a general FE basis, capable of representing any piecewise linear function (having the same polynomial subintervals) exactly. To solve a problem subject to vanishing Dirichlet boundary conditions, as occurs in molecular or cluster calculations, one can restrict the basis as in Fig. 1(e), i.e., omit boundary functions. To solve a problem subject to periodic boundary conditions, as occurs in solid-state electronic-structure calculations, one can restrict the basis as in Fig. 1(f), i.e., piece together local basis functions across the domain boundary in addition to piecing them together across interelement boundaries. Regarding this periodic basis, however, it should be noted that an arbitrary linear combination f(x) = Σ_i c_i φ_i(x) necessarily satisfies

$$f(0) = f(1), \qquad (1)$$

but does not necessarily satisfy

$$f'(0) = f'(1). \qquad (2)$$
Thus, unlike PW or other such smooth bases, while the value condition (1) is enforced by the use of such an FE basis, the derivative condition (2) is not. And so for problems requiring the enforcement of both, as in solid-state electronic-structure, the derivative condition must be enforced by other means [5]. We address this further in the next section. Higher-order FE bases are constructed by defining more independent parent basis functions, which requires that some basis functions be of higher order than linear. And, as in the linear case, what is typically done is to define all functions to be of the same order so that, for example, to define a 1D quadratic basis, one would define three quadratic parent basis functions; for a 1D cubic basis, four cubic parent basis functions, etc. With higher-order basis functions,
however, come new possibilities. For example, with cubic basis functions there are sufficient degrees of freedom to specify both value and slope at end points, thus allowing for the possibility of both value and slope continuity across interelement boundaries, and so allowing for the possibility of a $C^1$ (continuous value and slope) rather than $C^0$ basis. For sufficiently smooth problems, such higher-order continuity can yield greater accuracy per degree of freedom, and such bases have been used in the electronic-structure context [6, 7]. However, while straightforward in one dimension, in higher dimensions this requires matching both values and derivatives (including cross terms) across entire curves or surfaces, which becomes increasingly difficult to accomplish and leads to additional constraints on the transformations, and thus meshes, which can be employed [8]. Higher-dimensional FE bases are constructed along the same lines as the 1D case: partition the domain into elements, define local basis functions within each element via transformations of parent basis functions, and piece together the resulting local basis functions to form the piecewise-polynomial FE basis. In higher dimensions, however, there arises a significant additional choice: that of shape. The most common 2D element shapes are triangles and quadrilaterals. In 3D, tetrahedra, hexahedra (e.g., parallelepipeds), and wedges are among the most common. A variety of shapes have been employed in atomic and molecular calculations (see, e.g., [9]). In solid-state electronic-structure calculations, the domain can be reduced to a parallelepiped, and $C^0$ [5] as well as $C^1$ [7] parallelepiped elements have been employed.
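As a concrete instance of the quadratic case mentioned above, the standard quadratic Lagrange parent basis on the parent element $(-1, 1)$, with nodes at $\xi = -1, 0, 1$, can be written down and checked numerically. This is a minimal sketch of standard textbook material, not code from the article:

```python
import numpy as np

# Quadratic Lagrange parent basis on the parent element (-1, 1),
# with nodes at xi = -1, 0, +1 (three functions for a quadratic basis).
phi_hat = [
    lambda xi: 0.5 * xi * (xi - 1.0),   # equals 1 at xi = -1, 0 at other nodes
    lambda xi: 1.0 - xi**2,             # equals 1 at xi =  0
    lambda xi: 0.5 * xi * (xi + 1.0),   # equals 1 at xi = +1
]

nodes = np.array([-1.0, 0.0, 1.0])

# Nodal (Kronecker-delta) property, as in the linear case.
vals = np.array([[p(xn) for xn in nodes] for p in phi_hat])
assert np.allclose(vals, np.eye(3))

# Quadratic completeness: nodal coefficients reproduce any quadratic exactly.
xi = np.linspace(-1.0, 1.0, 201)
f = lambda x: 3 * x**2 - 2 * x + 1
f_fe = sum(f(xn) * p(xi) for xn, p in zip(nodes, phi_hat))
assert np.allclose(f_fe, f(xi))
```

A cubic basis follows the same pattern with four nodes per element, which is what provides the extra degrees of freedom for the $C^1$ matching noted above.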
2. Solution of the Schrödinger and Poisson Equations
The solution of the Kohn–Sham equations can be accomplished by a number of approaches, including direct minimization of the energy functional [10], solution of the associated Lagrangian equations [11], and self-consistent (SC) solution of associated Schrödinger and Poisson equations (see, e.g., [3]). A finite-element based energy minimization approach has been described by Tsuchida and Tsukada [7] in the context of molecular and Γ-point crystalline calculations. Here, we shall describe a finite-element based SC approach. In this section, we discuss the solution of the Schrödinger and Poisson equations; in the next, we discuss self-consistency. The solution of such equations subject to Dirichlet boundary conditions, as appropriate for molecular or cluster calculations, is discussed extensively in the standard texts and literature (see, e.g., [2, 9]). Here, we shall discuss their solution subject to boundary conditions appropriate for a periodic (crystalline) solid.
In a perfect crystal, the electronic potential is periodic, i.e.,

$$V(\mathbf{x} + \mathbf{R}) = V(\mathbf{x}) \qquad (3)$$

for all lattice vectors $\mathbf{R}$, and the solutions of the Schrödinger equation satisfy Bloch's theorem

$$\psi(\mathbf{x} + \mathbf{R}) = e^{i\mathbf{k}\cdot\mathbf{R}}\,\psi(\mathbf{x}) \qquad (4)$$

for all lattice vectors $\mathbf{R}$ and wavevectors $\mathbf{k}$ [12]. Thus the values of $V(\mathbf{x})$ and $\psi(\mathbf{x})$ throughout the crystal are completely determined by their values in a single unit cell, and so the solutions of the Poisson and Schrödinger equations in the crystal can be reduced to their solutions in a single unit cell, subject to boundary conditions consistent with Eqs. (3) and (4), respectively. We consider first the Schrödinger problem:

$$-\tfrac{1}{2}\nabla^2\psi + V\psi = \varepsilon\psi \qquad (5)$$

in a unit cell, subject to boundary conditions consistent with Bloch's theorem, where $V$ is an arbitrary periodic potential (atomic units are used throughout, unless otherwise specified). Since $V$ is periodic, $\psi$ can be written in the form

$$\psi(\mathbf{x}) = u(\mathbf{x})\,e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad (6)$$

where $u$ is a complex, cell-periodic function satisfying $u(\mathbf{x}) = u(\mathbf{x} + \mathbf{R})$ for all lattice vectors $\mathbf{R}$ [12]. Assuming the form (6), the Schrödinger equation (5) becomes

$$-\tfrac{1}{2}\nabla^2 u - i\mathbf{k}\cdot\nabla u + \tfrac{1}{2}k^2 u + V_L u + e^{-i\mathbf{k}\cdot\mathbf{x}} V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u = \varepsilon u, \qquad (7)$$

where, allowing for the possibility of nonlocality, $V_L$ and $V_{NL}$ are the local and nonlocal parts of $V$. From the periodicity condition (4), the required boundary conditions on the unit cell are then [12]

$$u(\mathbf{x}) = u(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l \qquad (8)$$

and

$$\hat{\mathbf{n}}\cdot\nabla u(\mathbf{x}) = \hat{\mathbf{n}}\cdot\nabla u(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l, \qquad (9)$$
where $\Gamma_l$ are the surfaces of the boundary $\Gamma$ and $\mathbf{R}_l$ the associated lattice vectors shown in Fig. 2, and $\hat{\mathbf{n}}$ is the outward unit normal at $\mathbf{x}$. The required Bloch-periodic problem can thus be reduced to the periodic problem (7)–(9). However, since the domain has been reduced to the unit cell, nonlocal operators require further consideration. In particular, if, as is typically the case for ab initio pseudopotentials, the domain of definition is all space (i.e., the full crystal), they must be transformed to the relevant finite subdomain (i.e., the unit cell) [13].

Figure 2. Parallelepiped unit cell (domain) $\Omega$, boundary $\Gamma$, surfaces $\Gamma_1$–$\Gamma_3$, and associated lattice vectors $\mathbf{R}_1$–$\mathbf{R}_3$.

For a separable potential of the usual form

$$V_{NL}(\mathbf{x}, \mathbf{x}') = \sum_{n,a,l,m} v^a_{lm}(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h^a_l\, v^a_{lm}(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_n), \qquad (10)$$

where $n$ runs over all lattice vectors and $a$ runs over atoms in the unit cell, the nonlocal term $e^{-i\mathbf{k}\cdot\mathbf{x}} V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u$ in Eq. (7) is

$$e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{n,a,l,m} v^a_{lm}(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h^a_l \int_{\mathbb{R}^3} d\mathbf{x}'\, v^a_{lm}(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_n)\, e^{i\mathbf{k}\cdot\mathbf{x}'} u(\mathbf{x}'),$$

where the integral is over all space. Upon transformation to the unit cell $\Omega$, this becomes

$$e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_{a,l,m} \sum_n e^{i\mathbf{k}\cdot\mathbf{R}_n} v^a_{lm}(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n)\, h^a_l \int_{\Omega} d\mathbf{x}' \sum_n e^{-i\mathbf{k}\cdot\mathbf{R}_n} v^a_{lm}(\mathbf{x}' - \boldsymbol{\tau}_a - \mathbf{R}_n)\, e^{i\mathbf{k}\cdot\mathbf{x}'} u(\mathbf{x}').$$
Having reduced the required problem to a periodic problem on a finite domain, solutions may be obtained using a periodic FE basis. However, if the
basis is $C^0$, as is typically the case, rather than $C^1$ or smoother, some additional consideration is required. First, the direct application of the Laplacian to such a basis is problematic. Second, being periodic in value but not in derivative (as discussed in the preceding section), the basis does not satisfy the required boundary conditions. Both issues can be resolved by reformulating the original differential formulation in weak (integral) form. Such a weak formulation can be constructed which contains no derivatives higher than first order, and which requires only value-periodicity (i.e., Eq. (8)) of the basis, thus resolving both issues. Such a weak formulation of the required problem (7)–(9) is [5]: Find scalars $\varepsilon$ and functions $u \in V$ such that

$$\tfrac{1}{2}\int_\Omega d\mathbf{x}\, \nabla v^* \cdot \nabla u + \int_\Omega d\mathbf{x}\, v^* \left( -i\mathbf{k}\cdot\nabla u + \tfrac{1}{2}k^2 u + V_L u + e^{-i\mathbf{k}\cdot\mathbf{x}} V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}} u \right) = \varepsilon \int_\Omega d\mathbf{x}\, v^* u \quad \forall v \in V,$$

where $V = \{v : v(\mathbf{x}) = v(\mathbf{x} + \mathbf{R}_l),\ \mathbf{x} \in \Gamma_l\}$, and the $\mathbf{x}$ dependence of $u$ and $v$ has been suppressed for compactness. Having reformulated the problem in weak form, solutions may be obtained using a $C^0$ FE basis. Letting $u = \sum_j c_j \phi_j$ and $v = \sum_j d_j \phi_j$, where $\phi_j$ are real periodic finite-element basis functions and $c_j$ and $d_j$ are complex coefficients, leads to a generalized Hermitian eigenproblem determining the approximate eigenvalues $\varepsilon$ and eigenfunctions $u$ of the weak formulation and thus of the required problem [5]:

$$\sum_j H_{ij}\, c_j = \varepsilon \sum_j S_{ij}\, c_j, \qquad (11)$$

where

$$H_{ij} = \int_\Omega d\mathbf{x} \left( \tfrac{1}{2}\nabla\phi_i \cdot \nabla\phi_j - i\mathbf{k}\cdot\phi_i\nabla\phi_j + \tfrac{1}{2}k^2\phi_i\phi_j + V_L\phi_i\phi_j + \phi_i\, e^{-i\mathbf{k}\cdot\mathbf{x}} V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}}\phi_j \right) \qquad (12)$$

and

$$S_{ij} = \int_\Omega d\mathbf{x}\, \phi_i\phi_j, \qquad (13)$$

and again the $\mathbf{x}$ dependence of $\phi_i$ and $\phi_j$ has been suppressed for compactness. For a separable potential of the form (10), the nonlocal term in (12) becomes [13]

$$\int_\Omega d\mathbf{x}\, \phi_i(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} V_{NL}\, e^{i\mathbf{k}\cdot\mathbf{x}}\phi_j(\mathbf{x}) = \sum_{a,l,m} f^{ai}_{lm}\, h^a_l\, \left(f^{aj}_{lm}\right)^{\!*},$$
where

$$f^{ai}_{lm} = \int_\Omega d\mathbf{x}\, \phi_i(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}} \sum_n e^{i\mathbf{k}\cdot\mathbf{R}_n} v^a_{lm}(\mathbf{x} - \boldsymbol{\tau}_a - \mathbf{R}_n).$$
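The structure of the discrete problem (11)–(13) can be illustrated in one dimension. The following Python sketch (our own illustration, not from the article) assembles the exact stiffness and overlap matrices for periodic linear elements on a unit cell with a local potential only, using a lumped quadrature for the potential term for brevity; the mesh size, potential, and k value are arbitrary choices:

```python
import numpy as np
from scipy.linalg import eigh

n = 60                            # periodic linear elements (= nodes) on the unit cell
h = 1.0 / n
x = np.arange(n) * h
k = 0.3 * 2 * np.pi               # Bloch wavevector
V = 2.0 * np.cos(2 * np.pi * x)   # arbitrary smooth periodic local potential

# Exact element matrices for hat functions, assembled with periodic wraparound:
#   K_ij = int phi_i' phi_j' (stiffness), S_ij = int phi_i phi_j (overlap),
#   C_ij = int phi_i phi_j'  (real antisymmetric first-derivative matrix).
K = np.zeros((n, n)); S = np.zeros((n, n)); C = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    K[i, i] += 2.0 / h; K[i, j] -= 1.0 / h; K[j, i] -= 1.0 / h
    S[i, i] += 2 * h / 3.0; S[i, j] += h / 6.0; S[j, i] += h / 6.0
    C[i, j] += 0.5; C[j, i] -= 0.5

# H is Hermitian: K and S are symmetric, C is antisymmetric, so -i*k*C is Hermitian.
# The local potential term uses lumped (nodal) quadrature for brevity.
H = 0.5 * K - 1j * k * C + 0.5 * k**2 * S + np.diag(V * h)

eps = eigh(H, S, eigvals_only=True)   # generalized Hermitian eigenproblem, Eq. (11)
print("lowest FE eigenvalues:", eps[:4])
```

For $V = 0$ the lowest eigenvalue approaches $\tfrac{1}{2}k^2$ from above, illustrating the variational convergence discussed below.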
As in the PW method, the above matrix elements can be evaluated to any desired accuracy, so that the basis need only be large enough to provide a sufficient representation of the required solution, though other functions such as the nonlocal potential may be more rapidly varying. As in the FD method, the above matrices are sparse and structured due to the strict locality of the basis.

Figure 3 shows a series of FE results for a Si pseudopotential [14]. Since the method allows for the direct treatment of any Bravais lattice, results are shown for a two-atom fcc primitive cell. The figure shows the sequence of band structures obtained for 3 × 3 × 3, 4 × 4 × 4, and 6 × 6 × 6 uniform meshes vs. exact values at selected k points (where "exact values" were obtained from a well-converged PW calculation). The variational nature of the method is clearly manifested: the error is strictly positive, and the entire band structure converges rapidly and uniformly from above as the number of basis functions is increased. Further analysis [5] shows that the convergence of the eigenvalues is in fact sextic, i.e., the error is of order $h^6$, where $h$ is the mesh spacing, consistent with asymptotic convergence theorems for the cubic-complete case [8].

Figure 3. Exact and finite-element (FE) band structures (energy in eV, along L–Γ–X) for a series of meshes (3 × 3 × 3, 4 × 4 × 4, 6 × 6 × 6), for a Si primitive cell. The convergence is rapid and variational: the entire band structure converges from above, with an error of $O(h^6)$, where $h$ is the mesh spacing.

The Poisson solution proceeds along the same lines as the Schrödinger solution. In this case, the required problem is

$$-\nabla^2 V_C(\mathbf{x}) = f(\mathbf{x}), \quad \mathbf{x} \in \Omega \qquad (14)$$

subject to boundary conditions

$$V_C(\mathbf{x}) = V_C(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l \qquad (15)$$

and

$$\hat{\mathbf{n}}\cdot\nabla V_C(\mathbf{x}) = \hat{\mathbf{n}}\cdot\nabla V_C(\mathbf{x} + \mathbf{R}_l), \quad \mathbf{x} \in \Gamma_l, \qquad (16)$$

where the source term $f(\mathbf{x}) = -4\pi\rho(\mathbf{x})$, $V_C(\mathbf{x})$ is the potential energy of an electron in the charge density $\rho(\mathbf{x})$, and the domain $\Omega$, bounding surfaces $\Gamma_l$, and lattice vectors $\mathbf{R}_l$ are again as in Fig. 2. Reformulation of (14)–(16) in weak form and subsequent discretization in a real periodic FE basis $\phi_j$ leads to a symmetric linear system determining the approximate solution $V_C(\mathbf{x}) = \sum_j c_j \phi_j(\mathbf{x})$ of the weak formulation and thus of the required problem [5]:

$$\sum_j L_{ij}\, c_j = f_i, \qquad (17)$$
where

$$L_{ij} = \int_\Omega d\mathbf{x}\, \nabla\phi_i(\mathbf{x}) \cdot \nabla\phi_j(\mathbf{x}) \qquad (18)$$

and

$$f_i = \int_\Omega d\mathbf{x}\, \phi_i(\mathbf{x})\, f(\mathbf{x}). \qquad (19)$$
As in the FD method, the above matrices are sparse and structured due to the strict locality of the basis, requiring only O(n) storage and O(n) operations
for solution by iterative methods, whereas O(n log n) operations are required in a PW basis, where n is the number of basis functions.
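A corresponding 1D sketch of the linear system (17)–(19) follows (again our own illustration, with arbitrary mesh and source). The periodic FE Laplacian is singular, with constants in its nullspace, so the arbitrary additive constant is fixed here by pinning one coefficient:

```python
import numpy as np

n = 64                      # periodic linear elements on the unit cell (0, 1)
h = 1.0 / n
x = np.arange(n) * h
f = (2 * np.pi) ** 2 * np.sin(2 * np.pi * x)   # zero-mean source; exact V_C = sin(2*pi*x)

# Assemble L_ij = int phi_i' phi_j' with periodic wraparound (Eq. 18),
# and f_i by lumped nodal quadrature (an approximation to Eq. 19).
L = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    L[i, i] += 2.0 / h; L[i, j] -= 1.0 / h; L[j, i] -= 1.0 / h
fvec = h * f

# Pin c_0 = 0 to remove the constant nullspace, then solve the reduced system.
c = np.zeros(n)
c[1:] = np.linalg.solve(L[1:, 1:], fvec[1:])
c -= c.mean()                                   # align the additive constant
print("max nodal error:", np.abs(c - np.sin(2 * np.pi * x)).max())
```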
3. Self-Consistency
The above Schrödinger and Poisson solutions can be employed in a fixed-point iteration to obtain the self-consistent solution of the Kohn–Sham equations. In the context of a periodic solid, the process is generally as follows (see, e.g., Ref. [3]): an initial electronic charge density $\rho_e^{in}$ is constructed (e.g., by overlapping atomic charge densities). An effective potential $V_{eff}$ is constructed based upon $\rho_e^{in}$ (see below). The eigenstates $\psi_i$ of $V_{eff}$ are computed by solving the associated Schrödinger equation subject to Bloch boundary conditions. From these eigenstates, or "orbitals", a new electronic charge density $\rho_e$ is then constructed according to

$$\rho_e = -\sum_i f_i\, |\psi_i|^2,$$

where the sum is over occupied orbitals with occupations $f_i$. If $\rho_e$ is sufficiently close to $\rho_e^{in}$, then self-consistency has been reached; otherwise, a new $\rho_e^{in}$ is constructed based on $\rho_e$ and the process is repeated until self-consistency is achieved. The resulting density minimizes the total energy and is the DFT approximation of the physical density, from which other observables may be derived.

The effective potential can be constructed as the sum of ionic (or nuclear, in an all-electron context), Hartree, and exchange-correlation parts:

$$V_{eff} = V_i^L + V_i^{NL} + V_H + V_{XC}, \qquad (20)$$

where, allowing for the possibility of nonlocality, $V_i^L$ and $V_i^{NL}$ are the local and nonlocal parts of the ionic term. For definiteness, we shall assume that the atomic cores are represented by nonlocal pseudopotentials. $V_i^{NL}$ is then determined by the choice of pseudopotential. $V_{XC}$ is a functional of the electronic density determined by the choice of exchange-correlation functional. $V_i^L$ is the Coulomb potential associated with the ions (sum of local ionic pseudopotentials). $V_H$ is the Coulomb potential associated with electrons (the Hartree potential). In the limit of an infinite crystal, $V_i^L$ and $V_H$ are divergent due to the long-range $1/r$ nature of the Coulomb interaction, and so their computation requires careful consideration. A common approach is to add and subtract analytic neutralizing densities and associated potentials, solve the resulting neutralized problems, and add analytic corrections (see, e.g., Ref. [3] in a reciprocal-space context, [15] in real space). Alternatively [13], it may be noted that the local parts of the ionic potentials $V^L_{i,a}$ associated with each atom
can be replaced by corresponding localized ionic charge densities $\rho_{i,a}$, since the potentials fall off as $-Z/r$ (or rapidly approach this behavior) for $r > r_c$, where $Z$ is the number of valence electrons, $r$ is the distance from the ion center, and $r_c$ is on the order of half the nearest-neighbor distance. The total Coulomb potential $V_C = V_i^L + V_H$ in the unit cell may then be computed at once by solving the Poisson equation $\nabla^2 V_C = 4\pi\rho$ subject to periodic boundary conditions, where $\rho = \rho_i + \rho_e$ is the sum of electronic and ionic charge densities in the unit cell, and the ionic charge densities $\rho_{i,a}$ associated with each atom $a$ are related to their respective local ionic potentials $V^L_{i,a}$ by Poisson's equation

$$\rho_{i,a} = \nabla^2 V^L_{i,a}/4\pi.$$

Since the ionic charge densities are localized, their summation in the unit cell is readily accomplished, whereas the summation of ionic potentials is not, due to their long-range $1/r$ tails. With $V_C$ determined, $V_{eff}$ can then be constructed as in Eq. (20), and the self-consistent iteration can proceed.
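The self-consistent iteration just described can be summarized in a short Python skeleton. This is schematic only: `build_veff` and `solve_schrodinger` are hypothetical placeholders for the Poisson and Schrödinger machinery of the preceding section, and simple linear mixing is shown where production codes typically use more sophisticated schemes:

```python
import numpy as np

def scf_loop(rho_in, build_veff, solve_schrodinger, occupations,
             alpha=0.3, tol=1e-6, max_iter=100):
    """Schematic self-consistent field iteration (placeholder callables).

    build_veff(rho)          -> effective potential on the FE mesh, Eq. (20)
    solve_schrodinger(veff)  -> occupied orbitals psi_i (one per row)
    occupations              -> occupations f_i of the occupied orbitals
    """
    for it in range(max_iter):
        veff = build_veff(rho_in)                    # requires a Poisson solve
        psi = solve_schrodinger(veff)                # generalized eigenproblem (11)
        rho_out = -sum(f * np.abs(p) ** 2 for f, p in zip(occupations, psi))
        if np.max(np.abs(rho_out - rho_in)) < tol:   # self-consistency reached
            return rho_out, it
        rho_in = (1 - alpha) * rho_in + alpha * rho_out   # linear density mixing
    raise RuntimeError("SCF did not converge")
```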
4. Total Energy
Like $V_{eff}$, the computation of the total energy in a crystal requires careful consideration due to the long-range nature of the Coulomb interaction and resulting divergent terms. In this case, the electron–electron and ion–ion terms are divergent and positive, while the electron–ion term is divergent and negative. As in the computation of $V_{eff}$, a common approach involves the addition and subtraction of analytic neutralizing densities (see, e.g., Refs. [3, 15]). Alternatively, it may be noted that the replacement of the local parts of the ionic potentials by corresponding localized charge densities, as discussed above, yields a net neutral charge density $\rho = \rho_i + \rho_e$, and all convergent terms in the total energy. For sufficiently localized $\rho_{i,a}$, a quadratically convergent expression for the total energy in terms of Kohn–Sham eigenvalues $\varepsilon_i$ is then [13]

$$E_{tot} = \sum_i f_i \varepsilon_i + \int_\Omega d\mathbf{x}\, \rho_e(\mathbf{x})\left[ V_L^{in}(\mathbf{x}) - \tfrac{1}{2}V_C(\mathbf{x}) - \varepsilon_{XC}[\rho_e(\mathbf{x})] \right] - \tfrac{1}{2}\int_\Omega d\mathbf{x}\, \rho_i(\mathbf{x})V_C(\mathbf{x}) + \tfrac{1}{2}\sum_a \int_{\mathbb{R}^3} d\mathbf{x}\, \rho_{i,a}(\mathbf{x})V^L_{i,a}(\mathbf{x}), \qquad (21)$$
where $V_L^{in}$ is the local part of $V_{eff}$ constructed from the input charge density $\rho_e^{in}$, $V_C$ is the Coulomb potential associated with $\rho_e$, i.e., $\nabla^2 V_C = 4\pi(\rho_i + \rho_e)$, $\varepsilon_{XC}$
is the exchange-correlation energy density, $i$ runs over occupied states with occupations $f_i$, and $a$ runs over atoms in the unit cell. Figure 4 shows the convergence of FE results to well-converged PW results as the number of elements in each direction of the wavefunction mesh is increased in a self-consistent GaAs calculation at an arbitrary k point, using the same pseudopotentials [16] and exchange-correlation functional. As in the PW method, higher resolution is employed in the calculation of the charge density and potential (twice that employed in the calculation of the wavefunctions, in the present case). The rapid, variational convergence of the FE approximations to the exact self-consistent solution is clearly manifested: the error is strictly positive and monotonically decreasing, with an asymptotic slope of ∼−6 on a log–log scale, indicating an error of $O(h^6)$, where $h$ is the mesh spacing, consistent with the cubic completeness of the basis. This is in contrast to FD approaches where, lacking a variational foundation, the error can be of either sign and may oscillate.
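The asymptotic slope quoted above can be estimated from a convergence study by a least-squares fit on a log–log scale; a minimal sketch follows, where the error values are synthetic placeholders rather than the actual GaAs data:

```python
import numpy as np

# Hypothetical convergence data: elements per direction and total-energy errors.
elements = np.array([8, 12, 16, 20, 24, 28, 32])
h = 1.0 / elements                                  # mesh spacing ~ 1/elements
err = 5.0 * h**6                                    # placeholder O(h^6) errors

# Slope of log(err) vs. log(h) estimates the convergence order.
order = np.polyfit(np.log(h), np.log(err), 1)[0]
print(f"estimated convergence order: {order:.2f}")  # ~6 for a cubic-complete basis
```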
Figure 4. Convergence of self-consistent FE total energy and eigenvalues ($E_{FE} - E_{EXACT}$, in Ha, vs. number of elements in each direction, for $E_{tot}$ and eigenvalues $E_1$–$E_3$), for a GaAs primitive cell. As for a fixed potential, the convergence is rapid and variational: the error is strictly positive and monotonically decreasing, with an error of $O(h^6)$, where $h$ is the mesh spacing.

5. Outlook

Because FE bases are simultaneously polynomial and strictly local in nature, FE methods retain significant advantages of FD methods without sacrificing the use of a basis, and in this sense combine advantages of both PW and FD based approaches for ab initio electronic-structure calculations. In particular, while variational and systematically improvable, the method produces sparse matrices and requires no computation- or communication-intensive transforms, and so is well suited to large, accurate calculations on massively parallel architectures. However, FE methods produce generalized rather than standard eigenproblems, require more memory than FD based approaches, and are more difficult to implement. Because of the relative merits of each approach, and because FE based approaches are yet at a relatively early stage of development, it is not clear which approach will prove superior in the large-scale ab initio electronic-structure context in the years to come [4]. Early non-self-consistent applications to ab initio positron distribution and lifetime calculations involving over 4000 atoms [5] are promising indications, however, and the development and optimization of FE based approaches for a range of large-scale applications remains a very active area of research.
Acknowledgment

This work was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
References

[1] R.O. Jones and O. Gunnarsson, "The density functional formalism, its applications and prospects," Rev. Mod. Phys., 61, 689–746, 1989.
[2] O.C. Zienkiewicz and R.L. Taylor, The Finite Element Method, McGraw-Hill, New York, 4th edn., 1988.
[3] W.E. Pickett, "Pseudopotential methods in condensed matter applications," Comput. Phys. Rep., 9, 115–198, 1989.
[4] T.L. Beck, "Real-space mesh techniques in density-functional theory," Rev. Mod. Phys., 72, 1041–1080, 2000.
[5] J.E. Pask, B.M. Klein, P.A. Sterne, and C.Y. Fong, "Finite-element methods in electronic-structure theory," Comput. Phys. Commun., 135, 1–34, 2001.
[6] S.R. White, J.W. Wilkins, and M.P. Teter, "Finite-element method for electronic structure," Phys. Rev. B, 39, 5819–5833, 1989.
[7] E. Tsuchida and M. Tsukada, "Large-scale electronic-structure calculations based on the adaptive finite-element method," J. Phys. Soc. Japan, 67, 3844–3858, 1998.
[8] G. Strang and G.J. Fix, An Analysis of the Finite Element Method, Prentice-Hall, Englewood Cliffs, NJ, 1973.
[9] L.R. Ram-Mohan, Finite Element and Boundary Element Applications in Quantum Mechanics, Oxford University Press, New York, 2002.
[10] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, "Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients," Rev. Mod. Phys., 64, 1045–1097, 1992.
[11] T.A. Arias, "Multiresolution analysis of electronic structure: semicardinal and wavelet bases," Rev. Mod. Phys., 71, 267–311, 1999.
[12] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976.
[13] J.E. Pask and P.A. Sterne, "Finite-element methods in ab initio electronic-structure calculations," Modell. Simul. Mater. Sci. Eng., to appear, 2004.
[14] M.L. Cohen and T.K. Bergstresser, "Band structures and pseudopotential form factors for fourteen semiconductors of the diamond and zinc-blende structures," Phys. Rev., 141, 789–796, 1966.
[15] J.L. Fattebert and M.B. Nardelli, "Finite difference methods in ab initio electronic structure and quantum transport calculations of nanostructures," In: P.G. Ciarlet (ed.), Handbook of Numerical Analysis, vol. X: Computational Chemistry, Elsevier, Amsterdam, 2003.
[16] C. Hartwigsen, S. Goedecker, and J. Hutter, "Relativistic separable dual-space gaussian pseudopotentials from H to Rn," Phys. Rev. B, 58, 3641–3662, 1998.
1.20 AB INITIO STUDY OF MECHANICAL DEFORMATION

Shigenobu Ogata
Osaka University, Osaka, Japan
The mechanical properties of materials under finite deformation are very interesting and important topics for material scientists, physicists, and mechanical and materials engineers. Many insightful experimental tests of the mechanical properties of such deformed materials have afforded an increased understanding of their behavior. Recently, since nanotechnologies have started to occupy the scientific spotlight, we must accept the challenge of studying these properties in small nano-scaled specimens and in perfect crystals under ideal conditions. While state-of-the-art experimental techniques have the capacity to make measurements in extreme situations, they are still expensive and require specialized knowledge. However, the considerable improvement in calculation methods and the striking development of computational capacity bring such problems within the range of atomic-scale numerical simulations. In particular, within the past decade, ab initio simulations, which can often give qualitatively reliable results without any experimental data as input, have become readily available. In this section, we discuss methods for studying the mechanical properties of materials using ab initio simulations. At present, we have many ab initio methods that have the potential to perform such mechanical tests. Here, however, we employ planewave methods based on density functional theory (DFT) and pseudopotential approximations because they are widely used in solid state physics. Details of the theory and of more sophisticated, state-of-the-art techniques can be found in other sections of this volume and in a review article [1]. Concrete examples of parameter settings appearing in this section presuppose that the reader is using the VASP (Vienna Ab initio Simulation Package) code [2, 3] and the ultrasoft pseudopotential. Other codes based on the same theory, such as ABINIT, CASTEP, and so on, should basically accept the same parameter settings as VASP.
1. Applying Deformation to Supercell
In the planewave methods, we usually use a parallelepiped-shaped supercell that has a periodic boundary condition in all directions and includes one or more atoms. The supercell can be defined by three linearly independent basis vectors, $\mathbf{h}_1 = (h_{11}, h_{12}, h_{13})$, $\mathbf{h}_2 = (h_{21}, h_{22}, h_{23})$, $\mathbf{h}_3 = (h_{31}, h_{32}, h_{33})$. In investigating phenomena connected with a local atomic displacement, for example, a slip of adjacent atomic planes in a crystal, an atomic position in the supercell can be directly moved within the system of fixed basis vectors. However, when we need a uniform deformation of the system under consideration, we can accomplish this by changing the basis vectors directly, as we would do, for example, in simulating a phase transition or crystal twinning, and in calculations of the elastic constants and ideal strength of a perfect crystal. Let a deformation gradient tensor $\mathbf{F}$ represent the uniform deformation of the system. $\mathbf{F}$ can be defined as

$$F_{ij} = \frac{\partial x_i}{\partial X_j},$$

where $\mathbf{x}$ and $\mathbf{X}$ are, respectively, the positions of a material particle in a deformed and in a reference state. By using $\mathbf{F}$, each basis vector is mapped to a new basis vector $\mathbf{h}'_k$ via $\mathbf{h}'_k = \mathbf{F}\mathbf{h}_k$. For example, for a simple shear deformation, $\mathbf{F}$ can be written as

$$\mathbf{F} = \begin{pmatrix} 1 & 0 & \gamma \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

where $\gamma$ represents the magnitude of the shear, corresponding to the engineering shear strain. In some cases, for ease of understanding, different coordinate systems for $\mathbf{F}$ and for the basis vectors are taken. In this case, $\mathbf{F}$ is transformed into the coordinate system for a basis vector by an orthogonal tensor $\mathbf{Q}$ ($\mathbf{Q}\mathbf{Q}^T = \mathbf{I}$):

$$\mathbf{F}' = \mathbf{Q}\mathbf{F}\mathbf{Q}^T, \qquad \mathbf{h}'_k = \mathbf{F}'\mathbf{h}_k.$$
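A minimal numerical illustration (ours, not from the article) of applying $\mathbf{F}$ to the supercell basis vectors; the lattice constant, shear increment, and rotation are arbitrary choices:

```python
import numpy as np

a0 = 4.04                       # illustrative fcc lattice constant (Angstrom)
H = 0.5 * a0 * np.array([[0., 1., 1.],    # rows are the supercell basis
                         [1., 0., 1.],    # vectors h1, h2, h3 of an fcc
                         [1., 1., 0.]])   # primitive cell

gamma = 0.02                    # engineering shear strain increment
F = np.array([[1., 0., gamma],  # simple shear deformation gradient
              [0., 1., 0.],
              [0., 0., 1.]])

H_def = H @ F.T                 # h'_k = F h_k, applied to each row of H
print(H_def)

# If F is expressed in another frame, rotate it first: F' = Q F Q^T.
theta = np.deg2rad(30.0)        # arbitrary rotation about z, for illustration
Q = np.array([[np.cos(theta), -np.sin(theta), 0.],
              [np.sin(theta),  np.cos(theta), 0.],
              [0., 0., 1.]])
F_rot = Q @ F @ Q.T
```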
2. Simulation Setting
In DFT calculations, the pseudopotential (if the code is not a full-potential code) and the exchange-correlation potential should be carefully selected.
Since these problems are not particular to deformation analysis, the reader who needs a more detailed discussion can find it elsewhere. Only a short commentary is given here. When we use a pseudopotential in separable form [4], we need to pay attention to a possible ghost band [5], because almost all DFT codes use the separable form to save computational time and memory resources. Usually the pseudopotentials in the package codes have been very carefully determined to avoid a ghost band in the equilibrium state. However, even when a pseudopotential does not generate a ghost band in the equilibrium state, such a band may still appear in a deformed state. Therefore, it is strongly recommended that a pseudopotential result be confirmed by comparison with a full-potential calculation where possible. For the exchange-correlation potential, we can normally use functionals derived from the local density approximation (LDA), the generalized gradient approximation (GGA), and LDA+U. In many cases, the former two methods are equally accurate. The LDA tends to underestimate lattice constants and overestimate elastic constants and strength, and the GGA to overestimate lattice constants and underestimate elastic constants and strength. The LDA+U sometimes offers significantly improved accuracy [6].

The above discussions of the pseudopotential and exchange-correlation potential pertain to error sources resulting from theoretical approximations. In addition to errors from this source, we should also take care of numerical errors. Numerical errors in a planewave DFT calculation usually derive from the finite size of the k-point set and the finite number of planewaves, which are uniquely determined by the supercell shape and the planewave cut-off energy. Moreover, a good estimation of the stress tensor to MPa accuracy requires a finer k-point sampling than does an energy estimation with meV accuracy. Figure 1 shows the convergence of the energy and stress as the number of k-points is increased. The model supercell is a primitive cell with an fcc structure which contains just one aluminum atom. An engineering shear strain of 0.2 in the {111}⟨112̄⟩ direction has already been applied to the primitive cell. Only the shear stress component corresponding to the shearing direction is shown. Clearly, the stress converges very slowly even though the energy converges relatively quickly.

Figure 1. Total energy and stress vs. number of k-points for an aluminum primitive cell under 20% shear in the {111}⟨112̄⟩ direction. (a) Total energy vs. number of k-points. (b) Shear stress vs. number of k-points.

Figure 2 shows the stress–strain curves of the Al primitive cell under a {111}⟨112̄⟩ shear deformation using two sets of k-points, the normal 15 × 15 × 15 and a fine 43 × 43 × 43 Monkhorst–Pack Brillouin zone sampling [7]. This sampling scheme is explained later. The curve for 15 × 15 × 15 is significantly wavy even though the total free energy of the primitive cell agrees to the order of meV with the energy of the 43 × 43 × 43 case. Apparently, a small set of k-points does not produce a smooth stress–strain curve. This is not a small problem for the study of mechanical properties of materials because, in the above case, the ideal strength, that is, the maximum stress of the stress–strain curve, is overestimated by 20%, at a level which usually corresponds to 2–20 GPa.

Figure 2. Shear stress vs. strain curves calculated with 15 × 15 × 15 and 43 × 43 × 43 k-point sets. A shear deformation in the {111}⟨112̄⟩ direction is applied.

Although there are many k-point sampling schemes, in recent practice the Monkhorst–Pack sampling scheme is typically used for testing mechanical properties. Since more efficient schemes [8], in which a smaller number
of k-points can be used without loss of accuracy, are constructed based on crystal symmetries, a deformation which would break the crystal symmetries would remove their advantage. Therefore, the Monkhorst–Pack scheme is often favored because of its simplicity. In it, the sampling points are defined in the following manner:

$$\mathbf{k}(n, m, l) = n\mathbf{b}_1 + m\mathbf{b}_2 + l\mathbf{b}_3, \qquad n, m, l = \frac{2r - q - 1}{2q}, \quad r = 1, 2, 3, \ldots, q,$$

where $\mathbf{b}_i$ are the reciprocal lattice vectors of the supercell and $q$ is the mesh size for the corresponding reciprocal lattice vector direction (in general different for each of $n$, $m$, and $l$). Therefore, the total number of sampled k-points is the product of the three mesh sizes. If we find that, under the symmetries of the supercell, some of the k-points are equivalent, we consider only the nonequivalent k-points to save computational time.

The planewave cut-off energy should also be carefully determined. We should use a large enough planewave cut-off energy to achieve convergence of energy and stress to the required degree of accuracy. Since the atomic configuration affects the cut-off energy, it is better to estimate that energy for the particular atomic configuration under consideration. However, in mechanical deformation analysis, it is difficult to fix the cut-off energy before starting the simulation because the deformation path cannot be predicted at the simulation's starting point. In such a case, we have to add a safety margin of 10–20% to the cut-off energy estimated from a known atomic configuration, for example, that of the equilibrium structure.

In principle, a complete basis set is necessary to express an arbitrary function as a linear combination of basis functions. As discussed above, the planewave basis set is used to express the wavefunctions of electrons in ordinary DFT calculations using the pseudopotential. Because an FFT algorithm can easily be used to calculate the Hamiltonian, we can save computational time. To achieve completeness, an infinite number of planewaves is necessary; however, to perform a practical numerical calculation, we must somehow reduce the infinite number to a finite one. Fortunately, we can ignore planewaves which have a higher energy than a cut-off value, termed the planewave cut-off energy, because the wavefunctions of electrons in real systems do not have components of extremely high frequency. To estimate the cut-off energy, we can perform a series of calculations with an increasing cut-off energy for a single system. By this means, we can find a cut-off energy which is large enough to ensure that the total energy and the stress of the supercell of interest converge within the required accuracy. Usually, the incompleteness of a finite planewave basis set produces an unphysical stress, the Pulay stress. However, by using a large enough number of planewaves, we can avoid this problem. Therefore, both the stress convergence check and the energy convergence check are important in deformation analysis.
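The Monkhorst–Pack construction given above is straightforward to reproduce; the following small sketch (our own, not from the article) generates the fractional k-point coordinates for a given mesh:

```python
import itertools
import numpy as np

def monkhorst_pack(q1, q2, q3):
    """Fractional Monkhorst-Pack coordinates u_r = (2r - q - 1) / (2q)."""
    def coords(q):
        return [(2 * r - q - 1) / (2 * q) for r in range(1, q + 1)]
    return np.array(list(itertools.product(coords(q1), coords(q2), coords(q3))))

kpts = monkhorst_pack(4, 4, 4)   # 64 points; Cartesian k = u1*b1 + u2*b2 + u3*b3
print(len(kpts), kpts[:2])
```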
Figure 3. Shear stress vs. strain curves calculated with different cut-off energies (E_cut = 90 eV and 129 eV). A shear deformation in the {111}⟨112̄⟩ direction is applied.
Figure 3 shows the stress–strain curves obtained with different planewave cut-off energies. The model and simulation procedure are the same as those we have utilized in the above k-point check. Clearly, even though the error due to a small cut-off energy is small in a near-equilibrium structure, it becomes larger in a highly strained structure.
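Such cut-off (and k-point) convergence checks are naturally automated. The sketch below is schematic: `run_dft` stands in for a wrapper around an actual DFT run (e.g., a VASP calculation at a given ENCUT) and is a hypothetical placeholder, as are the tolerances:

```python
def converge_cutoff(run_dft, encuts, etol=1e-3, stol=0.05):
    """Scan increasing cut-off energies until the energy (eV/atom) and
    stress (GPa) both change by less than the given tolerances."""
    prev = None
    for encut in encuts:
        energy, stress = run_dft(encut)            # hypothetical DFT call
        if prev is not None:
            de, ds = abs(energy - prev[0]), abs(stress - prev[1])
            if de < etol and ds < stol:
                return encut                        # converged cut-off
        prev = (energy, stress)
    raise RuntimeError("no converged cut-off in the scanned range")

# Usage sketch: encut = converge_cutoff(run_dft, [100, 120, 140, 160, 180, 200]),
# then add the 10-20% safety margin for deformed configurations, as noted above.
```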
3. Mechanical Deformation of Al and Cu
Many ab initio studies of mechanical deformation, such as tensile and shear deformation studies for metals and ceramics, have been done in the past two decades. An excellent summary of the history of ab initio mechanical testing can be found in a review paper written by Šob [9]. Here, we discuss as examples both a fully relaxed and an unrelaxed uniform shear deformation analysis [10], that is, an analysis of a pure shear and a simple shear, for aluminum and copper. The shear mode is the most important deformation mode in our consideration of the strength of a perfect crystalline solid. The shear deformation analysis usually involves more computational cost than the tensile analysis; because the shear deformation breaks many of the crystal symmetries, many nonequivalent k-points should be treated in the calculation.
The following analysis has been performed using the VASP code. The exchange-correlation density functional adopted is the Perdew–Wang generalized gradient approximation (GGA) [11]; ultrasoft pseudopotentials [12] are used. Brillouin zone k-point sampling is performed using the Monkhorst–Pack algorithm, and the integration follows the Methfessel–Paxton scheme [13] with the smearing width chosen so that the entropic free energy (a "−TS" term) is less than 0.5 meV/atom. A six-atom fcc supercell which has three {111} layers is used, and 18 × 25 × 11 k-points for Al and 12 × 17 × 7 k-points for Cu are adopted. The k-point convergence is checked as shown in Table 1. The carefully determined cut-off energies of the planewaves for the Al and Cu supercells are 162 and 292 eV, respectively. Incremental affine shear strains of 1%, as described above, are imposed on each crystal along the experimentally determined common slip systems to obtain the corresponding energies and stresses. In each step, the stress components, excluding the resolved shear stress along the slip system, are kept to a value less than 0.1 GPa during the simulation. In Table 2, the equilibrium lattice constants a_0 obtained from the energy minimization are listed and compared with the experimental data. The calculated relaxed and unrelaxed shear moduli G_r, G_u for the common slip systems are compared with computed analytical values based on the experimental elastic constants. A value of γ = 0.5% is used to interpolate the resolved shear stress (σ) versus the engineering shear strain (γ) curves and to calculate the resolved shear moduli. In the relaxed analysis, the stress components are relaxed to within a convergence tolerance of 0.05 GPa.

Table 1. Calculated ideal pure shear strengths σ_r and simple shear strengths σ_u using different k-point sets

                       Al                        Cu
No. of k-points    σ_u (GPa)   σ_r (GPa)    σ_u (GPa)   σ_r (GPa)
12 × 17 × 7        3.67        2.76         3.42        2.16
18 × 25 × 11       3.73        2.84         3.44        2.15
21 × 28 × 12       –           –            3.45        2.15
27 × 38 × 16       3.71        2.84         –           –
Table 2. Equilibrium lattice constant (a_0), relaxed (G_r) and unrelaxed (G_u) {111}⟨112̄⟩ shear moduli of Al and Cu

              a_0 (Å)    G_r (GPa)    G_u (GPa)
Al (calc.)    4.04       25.4         25.4
Al (expt.)    4.03       27.4         27.6
Cu (calc.)    3.64       31.0         40.9
Cu (expt.)    3.62       33.3         44.4
Figure 4. Shear stress (GPa) vs. displacement (x/b_p) curves for Al and Cu from the fully relaxed shear deformation in the {111}⟨112̄⟩ direction.
At equilibrium, the Cu is considerably stiffer, with simple and pure shear moduli greater by 65 and 25%, respectively, than those of the Al. However, the Al ends up with a 32% larger ideal pure shear strength σ_m^r than the Cu, because it has a longer range of strain before softening (see Fig. 4): γ_m = 0.200 in the Al, γ_m = 0.137 in the Cu. Figure 5 shows the changes of the iso-surfaces of the valence charge density during the shear deformation (h ≡ V_cell ρ_v, where V_cell and ρ_v are the supercell volume and valence charge density, respectively). At the octahedral interstice in Al, the pocket of charge density has cubic symmetry and is angular in shape, with a volume comparable to the pocket centered on every ion. In contrast, in Cu there is no such interstitial charge pocket, the charge density being nearly spherical about each ion. The Al has an inhomogeneous charge distribution in the interstitial region and bond directionality, while the Cu has relatively homogeneous charge distributions and little bond directionality. The charge density analysis gives a clear view of the electron activity under shear deformation, and sometimes informs us about the origin of the mechanical behavior of solids.

Figure 5. Charge density isosurface change in (a) Al and (b) Cu during the shear deformation in the {111}⟨112̄⟩ direction; snapshots are shown at shear displacements x = 0.000, x = x₁, and x = x₂ for each metal (x₁ = 0.196, x₂ = 0.436 and x₁ = 0.283, x₂ = 0.494), with the ⟨110⟩, ⟨112⟩, and ⟨111⟩ axes indicated.
4. Outlook
Currently, we can perform ab initio mechanical deformation analyses for many types of materials and for primitive and nano systems. However, in the
near future, the most interesting studies incorporating these analyses might address not only the mechanical behavior of materials under deformation and loading, but also the relation between such deformation and loading and physical and chemical reactions, such as stress corrosion. For this purpose, ab initio methods are the most powerful and reliable tools.
References

[1] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, "Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients," Rev. Mod. Phys., 64, 1045–1097, 1992.
[2] G. Kresse and J. Hafner, "Ab initio molecular dynamics for liquid metals," Phys. Rev. B, 47, RC558–RC561, 1993.
[3] G. Kresse and J. Furthmüller, "Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set," Phys. Rev. B, 54, 11169–11186, 1996.
[4] L. Kleinman and D.M. Bylander, "Efficacious form for model pseudopotentials," Phys. Rev. Lett., 48, 1425–1428, 1982.
[5] X. Gonze, P. Kackell, and M. Scheffler, "Ghost states for separable, norm-conserving, ab initio pseudopotentials," Phys. Rev. B, 41, 12264–12267, 1990.
[6] S.L. Dudarev, G.A. Botton, S.Y. Savrasov, C.J. Humphreys, and A.P. Sutton, "Electron-energy-loss spectra and the structural stability of nickel oxide: an LSDA+U study," Phys. Rev. B, 57, 1505–1509, 1998.
[7] H.J. Monkhorst and J.D. Pack, "Special points for Brillouin zone integrations," Phys. Rev. B, 13, 5188–5192, 1976.
[8] D.J. Chadi, "Special points in the Brillouin zone integrations," Phys. Rev. B, 16, 1746–1747, 1977.
[9] M. Šob, M. Friák, D. Legut, J. Fiala, and V. Vitek, "The role of ab initio electronic structure calculations," Mat. Sci. Eng. A, to be published, 2004.
[10] S. Ogata, J. Li, and S. Yip, "Ideal pure shear strength of aluminum and copper," Science, 298, 807–811, 2002.
[11] J.P. Perdew and Y. Wang, "Atoms, molecules, solids, and surfaces: application of the generalized gradient approximation for exchange and correlation," Phys. Rev. B, 46, 6671–6687, 1992.
[12] D. Vanderbilt, "Soft self-consistent pseudopotentials in a generalized eigenvalue formalism," Phys. Rev. B, 41, 7892–7895, 1990.
[13] M. Methfessel and A.T. Paxton, "High-precision sampling for Brillouin-zone integration in metals," Phys. Rev. B, 40, 3616–3621, 1989.
Chapter 2
ATOMISTIC SCALE
2.1 INTRODUCTION: ATOMISTIC NATURE OF MATERIALS

Efthimios Kaxiras¹ and Sidney Yip²
¹Department of Physics, Harvard University, Cambridge, MA 02138, USA
²Department of Nuclear Science and Engineering and Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Materials are made of atoms. The atomic hypothesis was put forward by the Greek philosopher Demokritos about 25 centuries ago, but was only proven by quantitative arguments in the 19th and 20th centuries, beginning with the work of John Dalton (1766–1844) and through the development of quantum mechanics, the theory that provided a complete and accurate description of the properties of atoms. The very large number of atoms encountered in a typical material (of order ∼10²⁴ or more) precludes any meaningful description of its properties based on a complete account of the behavior of each and every atom that comprises it. Special cases, such as perfect crystals, are exceptions where symmetry reduces the number of independent atoms to very few; in such cases, the properties of the solid are indeed describable in terms of the behavior of the few independent atoms and this can be accomplished using quantum mechanical methods. However, this is only an idealized model of actual solids in which perfect order is broken either by thermal disorder or by the presence of defects that play a crucial role in determining the physical properties of the system. An example of a crystal defect is dislocations, which determine the mechanical behavior of solids (their tendency for brittle or ductile response to external loading); these defects have a core which can only be described properly by its atomic scale structure, but they also have long range strain and stress fields which are adequately described by continuum elasticity theory (see Chapters 3 and 7). This situation typifies the dilemma of describing the behavior of real materials: the majority of atoms, far from the defect regions, behave in a manner consistent with a macroscopic, continuum description, where the atomic hypothesis is not important, while a small minority of atoms, in the immediate neighborhood of the defects, do not follow this rule and need to be
described individually. The example of dislocations is representative: any type of crystal defect (vacancies, interstitials, impurities, grain boundaries, surfaces, interfaces, etc.) requires, at some level, atomic scale representation in order to fully understand its effect on the properties of the material. Similarly, disorder induced by thermal motion and other external agents (pressure, irradiation) can lead to changes in the structure of a solid, possibly driving it to new phases, which also requires a detailed atomistic description (see Chapters 2.29 and 6.11). Finally, the case of fluids or solids like polymers, in which there is no order at the atomic scale, is another example of where atomistic scale description is necessary to provide invaluable information for a comprehensive picture of the system's behavior (see Chapters 8.1 and 9.1). These considerations provide the motivation for the description of materials properties based on atomistic simulations, by judiciously choosing the aspects that need to be explicitly modeled at the atomic scale. The term "atomistic simulations" has acquired a particular meaning: it refers to computational studies of materials properties based on explicit treatment of the atomic degrees of freedom within classical mechanics, either deterministically, that is, in accordance with the laws of classical dynamics (the so-called Molecular Dynamics or MD approach, see Chapter 2.8), or stochastically, that is, by appropriately sampling distributions from a chosen ensemble (the so-called Monte Carlo or MC approach, see Chapter 2.10). The energy functional underlying the calculation of forces for the dynamics of atoms or the ensemble distribution can be based either on a classical description or a quantum mechanical one. We will discuss briefly the issues that arise from the various approaches and then elaborate on what these approaches can provide in terms of a detailed understanding of the behavior of materials.
1. The Input to Atomistic Simulation
The energy of a system as a function of atomic positions should ideally be treated within quantum mechanics, with the valence electrons providing the interactions between atoms that hold the solid together. The development of Density Functional Theory [1, 2] and of pseudopotential theory (for a comprehensive review see, e.g., Ref. [3]) has produced a computational methodology which is accurate and efficient, and has the required chemical versatility to describe a very wide range of materials properties, fully within the quantum mechanical framework [4]. However, this is an approach which puts exceptionally large demands on computational resources for systems larger than a few tens of atoms, a situation that arises frequently in the descriptions of
realistic systems (the dislocation core is a case in point), and this limitation applies to a single atomic configuration. The description of systems comprising thousands to millions of atoms, and including a large number of atomistic configurations (as a molecular dynamics or a Monte Carlo simulation would require), is beyond current and anticipated computational capabilities. Consequently, alternative approaches have been pursued in order to be able to model such systems, which, though large on the atomistic scale, are still many orders of magnitude smaller than typical materials. The basic idea is to employ either a simplified quantum mechanical approach for the electrons, or a purely classical one in which the electronic degrees of freedom are completely eliminated and the interactions between atoms are modeled by an effective potential; in both cases, the computational resources required are greatly reduced, permitting the treatment of much larger systems and more extensive exploration of their configurational space (more time steps in an MD simulation or more samples in an MC simulation).
454
E. Kaxiras and S. Yip
degrees of success [5–7]. In contrast to this, for metallic systems the emphasis of the potential is to describe realistically the environment of an atom embedded in the background of valence electrons of the host solid; the approaches here often employ an effective (but not necessarily realistic) representation of the valence electron density and are referred to as the embedded atom method [for a review see, Ref. [8], see Chapter 2.2]. In both types of approaches, great care is given to ensuring that the potential reproduces accurately the energetics of at least a set configurations, by fitting it to a database produced by the more elaborate and accurate quantum mechanical methods. Finally, there are also cases where a more generic type of approach can be employed, modeling for instance the interaction between atoms as a simple potential derived by heuristic arguments without fitting to any particular system. Examples of such potentials are the well known van der Waals and Morse potentials, which have the general behavior of an attractive tail, a well-defined minimum and a repulsive core, as a function of the distance between two atoms (see Chapters 2.2–2.6, and 9.2). While not specific to any given material or system, these potentials can provide great insight as far as generic behavior of solids is concerned, including the role of defects in fairly complex contexts (see Chapters 6.1 and 7.1).
2.
Unique Properties of Molecular Dynamics and Monte Carlo
There are certain aspects of atomistic simulation, particularly molecular dynamics and Monte Carlo, which make this approach quite unique. The basic underlying concept here is particle tracking. Without going into the distinction between the two methods of simulation, we make the following general observations. (i) A few hundred particles are often sufficient to simulate bulk properties. Bulk or macroscopic properties like the system pressure and temperature can be determined with a simulation cell containing less than a thousand atoms, even though the number of atoms in a typical macroscopic system is of order of Avogadro’s number, 6 × 1023 . (ii) Simulation allows a unified study of all physical properties. A single simulation can generate the basic data, particle trajectories or configurations, with which one can calculate all the materials properties of interest, structural, thermodynamic, vibrational, mechanical, transport, etc. (iii) Simulation provides a direct connection between the fundamental description of a material system, such as internal energy and atomic structure, and all the physical properties of interest. In essence, it is a “numerical theory of matter”.
Introduction: atomistic nature of materials
455
(iv) In simulation one has complete control over the conditions under which the system study is carried out. This applies to the specification of interatomic interactions and the initial and boundary conditions. With this information and the simulation output one has achieved a precise characterization of the material being simulated. (v) Simulation can give properties that cannot be measured. This can be a very significant feature with regard to testing theory. In situations where the most clean-cut test involves systems or properties not accessible by laboratory experiments, simulation can play the role of experiment and provide this information. Conversely, in those cases where there are no theories to interpret an experiment, simulation can play the role of theory. (vi) Simulation makes possible the direct visualization of physical phenomena of interest. Visualization can play a very important role in modeling and simulation at all scales, for communication of results, gaining physical insights, and discovery. While its potential is recognized, its practical use remains underdeveloped. We recall here an oft quoted sentiment: “Certainly no subject is making more progress on so many fronts than biology, and if we were to name the most powerful assumption of all, which leads one on and on in an attempt to understand life, it is that all things are made of atoms, and that everything that living things do can be understood in terms of the jiggling and wiggling of atoms.” Richard Feynman, Lectures on Physics, vol. 1, p. 3–6 (1963)
3.
Limitations of Atomistic Simulation
To balance the usefulness of molecular dynamics and Monte Carlo, it is appropriate to acknowledge at the same time the inherent limitations of atomistic simulation. As mentioned earlier, the first-principles, quantum mechanical description of atomic bonding in solids is restricted to very few (by macroscopic standards) atoms and for exteremely short time scales: barely a few hundred atoms can be handled, for periods of few hundreds of femto seconds. Extending this fundamental description to larger systems and longer times of simulation requires the introduction of approximations in the quantum mechanical method (such as tight binding or orbital-free approaches), which significantly limit the accuracy of the quantum mechanical approach. With such restrictions on size and time-span of the simulation, the scope of applications to real materials properties is rather limited. The alternative is to use a purely classical description, based on empirical interatomic potentials to describe the interactions of atoms. This, however, introduces more severe approximations, which limit the
456
E. Kaxiras and S. Yip
ability of the approach to capture realistically how the bonds between atoms are formed and dissolved during a simulation. Such uncertainties put bounds on the scope of physical phenomena that can be successfuly addressed by simulations. The other limitation is a practical issue, that is, the finite capabilities of computers no matter how large they are. This translates into limits on the spatial size (usually identified with the number of atoms N in the model) and the temporal extent of simulations, which often fall short of desired values. It is quite safe to say that the upper bounds on system size and run time, whatever they are, will be pushed out further with time, because computer power is certain to increase in the foreseeable future. Probably more important in extending the effective size of simulations are novel algorithmic developments, which are likely to produce computational gains in the simulation size and duration much larger than any direct gains by raw increases in computer power. As an example of new approaches, we mention multiscale simulations of materials, which combine the different types of system description (quantum, classical and continuum) into a single method. Several approaches of this type have appeared in the last few years, and their development is at present a very active field which holds promise for bringing to fruition the full potential of atomistic simulations.
4.
A Brief Survey of the Chapter Contents
The diversity of atomistic simulations, regarding either methods or applications, makes any attempt at a complete coverage a practically impossible task. The contributions that have been brought together here should give the reader a substantial overview of the basic capabilities of the atomistic simulation approach, along with emphasis on certain unique features of modeling and simulation at this scale from the standpoint of multiscale modeling. Leading off the discussions are five articles describing the development of interatomic potentials for specific classes of materials – metals (Chapter 2.2), ionic (Chapter 2.3) and covalent (Chapter 2.4) solids, molecules (Chapter 2.5), and ferroelectrics (Chapter 2.6). From these the reader gains an appreciation of the physics and the database that go into the models, and how the resulting potentials are validated. Immediately following are articles on the simulation methods where the potentials are the necessary inputs, energy minimization (Chapter 2.7), molecular dynamics (Chapters 2.8, 2.9, 2.11), Monte Carlo (Chapter 2.10), and methods at the mesoscale which incorporate atomistic information (Chapters 2.12, 2.13). In the next set of articles emphasis is directed at applications, beginning with free-energy calculations (Chapters 2.14, 2.15) for which atomistic simulations are uniquely well suited, followed by studies of elastic constants (Chapter 2.16), transport coefficients (Chapters 2.17, 2.18),
mechanical behavior (Chapter 2.19), dislocations (Chapters 2.20, 2.21, 2.22), fracture in metals (Chapter 2.23) and semiconductors (Chapter 2.24). The next two articles deal with large-scale simulations, on metallic and ceramic nanostructures (Chapter 2.25) and biological membranes (Chapter 2.26), followed by three articles on studies in radiation damage, to which atomistic modeling and simulations have made significant contributions (Chapters 2.27, 2.28, 2.29). The next article, on thin-film deposition (Chapter 2.30), is an example of how simulation can address problems of technological relevance. The chapter concludes with an article on visualization at the atomistic level (Chapter 2.31), a topic which is destined to grow in recognized importance as well as in opportunities for software innovation. The contents of this chapter clearly have a great deal of overlap with the rest of the Handbook. The connection between atomistic simulations using classical potentials and electronic structure calculations (Chapter 1.1) permeates the present chapter, since the potentials used in MD/MC simulations rely on first-principles quantum mechanical calculations for the inspiration of the functional form of the potentials, for the database used to determine parameter values, and for benchmark results in model validation. The connection to the mesoscale (Chapter 3.1) is clearly also very intimate, since this is the next level of length/time scale. Since atomistic simulation methods and results are used liberally throughout the Handbook, one may be tempted to say that this chapter serves as perhaps the most central link to the different parts of the volume. If we may be allowed another quote from R.P. Feynman, the following is a different way of expressing the centrality of the chapter. "If, in some cataclysm, all of scientific knowledge were to be destroyed, and only one sentence passed on to the next generations of creatures, what statement would contain the most information in the fewest words? I believe it is the atomic hypothesis (or the atomic fact, or whatever you wish to call it) that all things are made of atoms – little particles that move around in perpetual motion, attracting each other when they are a little distance apart, but repelling upon being squeezed into one another. In that one sentence, you will see, there is an enormous amount of information about the world, if just a little imagination and thinking are applied." Richard P. Feynman, Six Easy Pieces, (Addison-Wesley, Reading, 1963), p. 4.
References
[1] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864–B871, 1964.
[2] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133–A1138, 1965.
[3] R.M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, Cambridge, 2004.
[4] R. Car and M. Parrinello, "Unified approach for molecular dynamics and density-functional theory," Phys. Rev. Lett., 55, 2471–2474, 1985.
[5] F.H. Stillinger and T.A. Weber, "Computer simulation of local order in condensed phases of silicon," Phys. Rev. B, 31, 5262–5271, 1985.
[6] J. Tersoff, "New empirical model for the structural properties of silicon," Phys. Rev. Lett., 56, 632–635, 1986.
[7] J. Justo, M.Z. Bazant, E. Kaxiras, V.V. Bulatov, and S. Yip, "Interatomic potential for silicon defects and disordered phases," Phys. Rev. B, 58, 2539–2550, 1998.
[8] A.F. Voter, Intermetallic Compounds, vol. 1, Wiley, New York, pp. 77–90, 1994.
2.2 INTERATOMIC POTENTIALS FOR METALS

Y. Mishin
George Mason University, Fairfax, VA, USA
Many processes in materials, such as plastic deformation, fracture, diffusion and phase transformations, involve large ensembles of atoms and/or require statistical averaging over many atomic events. Computer modeling of such processes is made possible by the use of semi-empirical interatomic potentials allowing fast calculations of the total energy and classical interatomic forces. Due to their computational efficiency, interatomic potentials give access to systems containing millions of atoms and enable molecular dynamics simulations for tens or even hundreds of nanoseconds. State-of-the-art potentials capture the most essential features of interatomic bonding, reaching a golden compromise between computational speed and accuracy of modeling. This article reviews interatomic potentials for metals and metallic alloys. The basic concepts used in this area are introduced, the methodology commonly applied to generate atomistic potentials is outlined, and the capabilities as well as limitations of atomistic potentials are discussed. Expressions for basic physical properties within the embedded-atom formalism are provided in a form convenient for computer coding. Recent trends in this field and possible future developments are also discussed.
1. Embedded-atom Potentials
Molecular dynamics, Monte Carlo, and other simulation methods require multiple evaluations of the Newtonian forces $F_i$ acting on individual atoms i or (in the case of Monte Carlo simulations) the total energy of the system, $E_{tot}$. Atomistic potentials, also referred to as force fields, parameterize the configuration space of a system and represent its total energy as a relatively simple function of the configuration point. The interatomic forces are then obtained as coordinate derivatives of $E_{tot}$, $F_i = -\partial E_{tot}/\partial r_i$, $r_i$ being the radius-vector of
atom i. This calculation of $E_{tot}$ and $F_i$ is a simple and fast numerical procedure that does not involve quantum-mechanical calculations, although the latter are often used when generating potentials, as will be discussed later. Potential functions contain fitting parameters, which are adjusted to give desired properties of the material known from experiment and/or first-principles calculations. Once the fitting procedure is complete, the parameters are not subject to any further changes and the potential thus defined is used in all subsequent simulations of the material. The underlying assumption is that a potential providing accurate energies/forces at the configuration points used in the fit will also give reasonable results for configurations between and beyond them. This property of potentials, often referred to as "transferability," is probably the most adequate measure of their quality.

Early atomistic simulations employed pair potentials, usually of the Morse or Lennard-Jones type [1, 2]. Although such potentials have been and still are a useful model for fundamental studies of generic properties of materials, the agreement between simulation results and experiment can only be qualitative at best. While such potentials can be physically justified for inert elements and perhaps some ionic solids, they do not capture the nature of atomic bonding even in simple metals, not to mention transition metals or covalent solids. Daw and Baskes [3] and Finnis and Sinclair [4] proposed a more advanced potential form that came to be known as the embedded-atom method (EAM). In contrast to pair potentials, EAM incorporates, in an approximate manner, many-body interactions between atoms, which are responsible for a significant part of bonding in metals. The introduction of the many-body term has enabled a semi-quantitative, and in good cases even quantitative, description of metallic systems. In the EAM model, $E_{tot}$ is given by the expression

$$E_{tot} = \frac{1}{2}\sum_{i,j\,(j\neq i)} \Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar\rho_i). \qquad (1)$$

The first term is the sum of all pair interactions between atoms, $\Phi_{s_i s_j}(r_{ij})$ being a pair-interaction potential between atoms i (of chemical sort $s_i$) and j (of chemical sort $s_j$) at positions $r_i$ and $r_j = r_i + r_{ij}$, respectively. The function $F_{s_i}$ is the so-called embedding energy of atom i, which depends upon the host electron density $\bar\rho_i$ at site i induced by all other atoms of the system. The host electron density is given by the sum

$$\bar\rho_i = \sum_{j\neq i} \rho_{s_j}(r_{ij}), \qquad (2)$$
where $\rho_{s_j}(r)$ is the electron-density function assigned to atom j. The second term in Eq. (1) represents the many-body effects. The functional form of Eq. (1) was originally derived as a generalization of the effective medium theory [5] and of the second-moment approximation to tight-binding theory [4, 6]. Later, however, it lost its close ties with the original physical meaning
and came to be treated as a working semi-empirical expression with adjustable parameters. A complete EAM description of an n-component system requires n(n + 1)/2 pair-interaction functions $\Phi_{ss'}(r)$, n electron-density functions $\rho_s(r)$, and n embedding functions $F_s(\bar\rho)$ (s = 1, ..., n). An elemental metal is described by three functions $\Phi(r)$, $\rho(r)$ and $F(\bar\rho)$,¹ while a binary system A–B requires seven functions $\Phi_{AA}(r)$, $\Phi_{AB}(r)$, $\Phi_{BB}(r)$, $\rho_A(r)$, $\rho_B(r)$, $F_A(\bar\rho)$, and $F_B(\bar\rho)$. Notice that if potential functions for pure metals A and B are available, only the cross-interaction function $\Phi_{AB}(r)$ is needed for describing the respective binary system. Over the past two decades, EAM potentials have been constructed for many metals and a number of binary systems. Potentials for ternary systems are scarce and their reliability is yet to be evaluated.

¹ For elemental metals, the chemical indices s are often omitted.

The pair-interaction and electron-density functions are normally forced to turn to zero, together with several higher derivatives, at a cutoff radius $R_c$. Typically, $R_c$ covers 3–5 coordination shells. EAM functions are usually defined by analytical expressions. Such expressions and their derivatives can be directly coded into a simulation program. However, a more common and computationally more efficient procedure is to tabulate each function at a large number of points (usually, a few thousand) and store it in the tabulated form for all subsequent simulations. In the beginning of each simulation run, the tables are read into the program, interpolated by a cubic spline, and the spline coefficients are used during the rest of the simulation for retrieving interpolated values of the functions and their derivatives for any desired value of the argument.

It is important to understand that the partition of $E_{tot}$ into pair interactions and the embedding energy is not unique [7]. Namely, $E_{tot}$ defined by Eq. (1) is invariant under the transformations

$$F_s(\bar\rho) \to F_s(\bar\rho) + g_s\bar\rho, \qquad (3)$$

$$\Phi_{ss'}(r) \to \Phi_{ss'}(r) - g_s\rho_{s'}(r) - g_{s'}\rho_s(r), \qquad (4)$$

where s, s' = 1, ..., n and $g_s$ are arbitrary constants. In addition, all functions $\rho_s(r)$ can be scaled by the same arbitrary factor p with a simultaneous scaling of the argument of the embedding functions:

$$\rho_s(r) \to p\,\rho_s(r), \qquad (5)$$

$$F_s(\bar\rho) \to F_s(\bar\rho/p). \qquad (6)$$
Thus, there is a large degree of ambiguity in defining EAM potential functions: the units of the electron density are arbitrary, the pair-interaction and electron-density functions can be mixed with each other, and the embedding energy can only be defined up to a linear function. It is important, however, that
the embedding function be non-linear; otherwise the second term in Eq. (1) can be absorbed by the first one, resulting in a simple pair potential. The non-linearity of $F_s(\bar\rho)$ reflects the bond-order character of atomic interactions by making the energy per nearest-neighbor bond decrease with increasing number of bonds. To capture this trend, the second derivative $F''_s(\bar\rho)$ must be positive and thus $F_s(\bar\rho)$ must be a convex curve, at least around the equilibrium volume of the crystal. Furthermore, in pure metals at equilibrium, $F''(\bar\rho)$ is proportional to the Cauchy pressure $(c_{12} - c_{44})/2$, which is normally positive ($c_{ij}$ are elastic constants). Notice that all pair potentials inevitably give $c_{12} = c_{44}$, a relation which is rarely followed by real materials.

Given this arbitrariness of EAM functions, one should be careful when comparing EAM potentials developed by different research groups for the same material: functions looking very different may actually give close physical properties. As a common platform for comparison, potentials are often converted to the so-called effective pair format. To bring potential functions to this format, one applies the transformations of Eqs. (3) and (4) with coefficients $g_s$ chosen as $g_s = -F'_s(\bar\rho)$, where the derivative is taken at the equilibrium lattice parameter of a reference crystal structure. For that structure, the transformed embedding functions will satisfy the condition $F'_s(\bar\rho) = 0$ at equilibrium. In other words, each embedding function $F_s(\bar\rho)$ will have a minimum at the host electron density arising at atoms of the respective sort s in the equilibrium reference structure. Together with the normalization condition $\bar\rho_1 = 1$ applied to sort s = 1 in that structure, the potential format is uniquely defined and different potentials can be conveniently compared with each other, provided that their reference structures are identical. In elemental metals, the natural choice of the reference structure is the ground state, whereas for binary systems this choice is not unique and should always be specified by the author.
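To make Eqs. (1) and (2) concrete, the following minimal Python sketch evaluates the EAM energy of a small cluster. The Morse-like pair function, exponential density and square-root embedding function are placeholder assumptions chosen only for illustration – they are not a fitted potential – and a production code would instead evaluate spline-interpolated tabulated functions as described above.

    import numpy as np

    # Illustrative (not fitted) EAM functions for a single element.
    def phi(r):                      # pair interaction, Morse-like
        return 0.5 * (np.exp(-2.0 * (r - 2.5)) - 2.0 * np.exp(-(r - 2.5)))

    def rho(r):                      # electron-density contribution of one neighbor
        return np.exp(-1.5 * (r - 2.5))

    def F(rho_bar):                  # embedding function; non-linear (convex)
        return -np.sqrt(rho_bar)

    def eam_energy(positions, cutoff=6.0):
        """E_tot by Eq. (1): half the pair sum plus the embedding sum."""
        n = len(positions)
        e_pair, rho_bar = 0.0, np.zeros(n)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                r = np.linalg.norm(positions[j] - positions[i])
                if r < cutoff:
                    e_pair += 0.5 * phi(r)   # factor 1/2 corrects double counting
                    rho_bar[i] += rho(r)     # host electron density, Eq. (2)
        return e_pair + F(rho_bar).sum()

    # small FCC-like test cluster (four-atom cubic cell, arbitrary units)
    a0 = 3.6
    basis = np.array([[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]])
    print(eam_energy(basis * a0))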
2. Calculation of Properties with EAM Potentials
Below we provide EAM expressions for some basic physical properties of materials in a form convenient for computer coding. We are using a laboratory reference system with rectangular Cartesian coordinates, so that the positions of indices of vectors and tensors are unimportant. We will reserve superscripts for Cartesian coordinates of atoms and subscripts for their labels (all atoms are assumed to be labeled) and chemical sorts (s-indices).

The force acting on a particular atom i in a Cartesian direction α (α = 1, 2, 3) is given by the expression

$$F_i^\alpha = \sum_{j\neq i} f_{ij}(r_{ij})\,\frac{r_{ij}^\alpha}{r_{ij}}, \qquad (7)$$
where

$$f_{ij}(r_{ij}) = \Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar\rho_i)\,\rho'_{s_j}(r_{ij}) + F'_{s_j}(\bar\rho_j)\,\rho'_{s_i}(r_{ij}). \qquad (8)$$
Notice that this force depends on the electron density on all neighboring atoms j, which in turn depends on the positions of all neighbors of atom j. It follows that force coupling between atoms extends effectively over a distance of $2R_c$, and not just $R_c$ as for pair potentials.

EAM allows a direct calculation of the mechanical stress tensor for any atomic configuration:

$$\sigma^{\alpha\beta} = \frac{1}{V}\sum_i \Omega_i\,\sigma_i^{\alpha\beta}, \qquad (9)$$

where

$$\Omega_i\,\sigma_i^{\alpha\beta} \equiv \frac{1}{2}\sum_{j\neq i}\left[\Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar\rho_i)\,\rho'_{s_j}(r_{ij})\right]\frac{r_{ij}^\alpha r_{ij}^\beta}{r_{ij}}. \qquad (10)$$
Here, $V = \sum_i \Omega_i$ is the total volume of the system and $\Omega_i$ are atomic volumes assigned to individual atoms. The partition of V between atoms is somewhat arbitrary, but by adopting a reasonable approximation (for example, equipartition) one can compute the local stress tensor $\sigma_i^{\alpha\beta}$ on individual atoms. Analysis of the stress distribution can be especially useful in atomistic simulations of dislocations, grain boundaries and other crystal defects. The condition of mechanical equilibrium of an isolated or periodic system can be expressed as $\sigma^{\alpha\beta} = 0$ for all α and β:

$$\frac{1}{2}\sum_{i,j\,(j\neq i)}\left[\Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar\rho_i)\,\rho'_{s_j}(r_{ij})\right]\frac{r_{ij}^\alpha r_{ij}^\beta}{r_{ij}} = 0. \qquad (11)$$

In particular, equilibrium with respect to volume variations requires that the hydrostatic stress vanish, $\sum_\alpha \sigma^{\alpha\alpha} = 0$, which reduces Eq. (11) to

$$\frac{1}{2}\sum_{i,j\,(j\neq i)}\left[\Phi'_{s_i s_j}(r_{ij}) + F'_{s_i}(\bar\rho_i)\,\rho'_{s_j}(r_{ij})\right]r_{ij} = 0. \qquad (12)$$
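A corresponding sketch of Eqs. (7)–(10), under the same illustrative functional forms as in the energy example above (dphi, drho and dF are their analytic first derivatives, and rho_bar is assumed to be pre-computed as in Eq. (2)). The double loop over pairs is written for clarity, not speed; a real implementation would use neighbor lists and Newton's third law.

    import numpy as np

    # analytic derivatives of the illustrative functions used earlier
    def dphi(r): return 0.5 * (-2.0 * np.exp(-2.0 * (r - 2.5)) + 2.0 * np.exp(-(r - 2.5)))
    def drho(r): return -1.5 * np.exp(-1.5 * (r - 2.5))
    def dF(rb):  return -0.5 / np.sqrt(rb)

    def forces_and_stress(positions, rho_bar, volume, cutoff=6.0):
        """Forces by Eqs. (7)-(8) and the virial stress by Eqs. (9)-(10)."""
        n = len(positions)
        forces = np.zeros((n, 3))
        sigma = np.zeros((3, 3))
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                rij = positions[j] - positions[i]
                r = np.linalg.norm(rij)
                if r >= cutoff:
                    continue
                # Eq. (8); for one element rho' is the same for both atoms
                f = dphi(r) + (dF(rho_bar[i]) + dF(rho_bar[j])) * drho(r)
                forces[i] += f * rij / r                    # Eq. (7)
                # per-atom virial contribution, Eq. (10)
                w = 0.5 * (dphi(r) + dF(rho_bar[i]) * drho(r))
                sigma += w * np.outer(rij, rij) / r
        return forces, sigma / volume                       # Eq. (9)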
Analysis of stresses also allows us to formulate equilibrium conditions of a crystal with respect to tetragonal or any other homogeneous distortion. We now turn to the elastic constants of an equilibrium perfect crystal. The elastic constant tensor $C^{\alpha\beta\gamma\delta}$ of a general crystal structure is given by

$$C^{\alpha\beta\gamma\delta} = \frac{1}{n_b\Omega_0}\sum_i\left[U_i^{\alpha\beta\gamma\delta} + F'_{s_i}(\bar\rho_i)\,W_i^{\alpha\beta\gamma\delta} + F''_{s_i}(\bar\rho_i)\,V_i^{\alpha\beta}V_i^{\gamma\delta}\right], \qquad (13)$$

where $\Omega_0$ is the equilibrium atomic volume and

$$U_i^{\alpha\beta\gamma\delta} = \frac{1}{2}\sum_{j\neq i}\left[\Phi''_{s_i s_j}(r_{ij}) - \frac{\Phi'_{s_i s_j}(r_{ij})}{r_{ij}}\right]\frac{r_{ij}^\alpha r_{ij}^\beta r_{ij}^\gamma r_{ij}^\delta}{(r_{ij})^2}, \qquad (14)$$

$$W_i^{\alpha\beta\gamma\delta} = \sum_{j\neq i}\left[\rho''_{s_j}(r_{ij}) - \frac{\rho'_{s_j}(r_{ij})}{r_{ij}}\right]\frac{r_{ij}^\alpha r_{ij}^\beta r_{ij}^\gamma r_{ij}^\delta}{(r_{ij})^2}, \qquad (15)$$

$$V_i^{\alpha\beta} = \sum_{j\neq i}\rho'_{s_j}(r_{ij})\,\frac{r_{ij}^\alpha r_{ij}^\beta}{r_{ij}}. \qquad (16)$$
In Eq. (13), the summation over i runs over the $n_b$ basis atoms defining the structure, while the summation over j extends over all neighbors of atom i within its cutoff sphere. Expressions for the contracted elastic constants $c_{ij}$ can be readily developed from the above equations. It is important to remember that Eqs. (13)–(16) have been derived by applying to the crystal an infinitesimal homogeneous strain. These equations are, thus, not valid for structures (e.g., HCP or diamond cubic) where the lack of inversion symmetry gives rise to internal atomic relaxations under applied strains.

EAM provides relatively simple expressions for force constants and the dynamical matrix [8]. For the off-diagonal (i ≠ j) elements of the force-constant matrix $G_{ij}^{\alpha\beta}$ we have

$$\begin{aligned}
G_{ij}^{\alpha\beta} \equiv \frac{\partial^2 E_{tot}}{\partial r_i^\alpha\,\partial r_j^\beta}
 = &-\delta_{\alpha\beta}\,\frac{f_{ij}(r_{ij})}{r_{ij}}
   -\left[\Phi''_{s_i s_j}(r_{ij}) - \frac{\Phi'_{s_i s_j}(r_{ij})}{r_{ij}}\right]\frac{r_{ij}^\alpha r_{ij}^\beta}{(r_{ij})^2}\\
 &- F'_{s_i}(\bar\rho_i)\left[\rho''_{s_j}(r_{ij}) - \frac{\rho'_{s_j}(r_{ij})}{r_{ij}}\right]\frac{r_{ij}^\alpha r_{ij}^\beta}{(r_{ij})^2}
  - F'_{s_j}(\bar\rho_j)\left[\rho''_{s_i}(r_{ij}) - \frac{\rho'_{s_i}(r_{ij})}{r_{ij}}\right]\frac{r_{ij}^\alpha r_{ij}^\beta}{(r_{ij})^2}\\
 &- F''_{s_i}(\bar\rho_i)\,\rho'_{s_j}(r_{ij})\,Q_i^\alpha\,\frac{r_{ij}^\beta}{r_{ij}}
  + F''_{s_j}(\bar\rho_j)\,\rho'_{s_i}(r_{ij})\,Q_j^\beta\,\frac{r_{ij}^\alpha}{r_{ij}}
  + \sum_{k\neq i,j} F''_{s_k}(\bar\rho_k)\,\rho'_{s_i}(r_{ik})\,\rho'_{s_j}(r_{jk})\,\frac{r_{ik}^\alpha\, r_{jk}^\beta}{r_{ik}\, r_{jk}},
\end{aligned} \qquad (17)$$
where

$$Q_i^\alpha = \sum_{m\neq i}\rho'_{s_m}(r_{im})\,\frac{r_{im}^\alpha}{r_{im}} \qquad (18)$$
and $f_{ij}(r_{ij})$ is given by Eq. (8). For the diagonal elements $G_{ii}^{\alpha\beta}$ we have

$$\begin{aligned}
G_{ii}^{\alpha\beta} \equiv \frac{\partial^2 E_{tot}}{\partial r_i^\alpha\,\partial r_i^\beta}
 = &\;\delta_{\alpha\beta}\sum_{k\neq i}\frac{f_{ik}(r_{ik})}{r_{ik}}
   + \sum_{k\neq i}\left[\Phi''_{s_i s_k}(r_{ik}) - \frac{\Phi'_{s_i s_k}(r_{ik})}{r_{ik}}\right]\frac{r_{ik}^\alpha r_{ik}^\beta}{(r_{ik})^2}\\
 &+ \sum_{k\neq i} F'_{s_i}(\bar\rho_i)\left[\rho''_{s_k}(r_{ik}) - \frac{\rho'_{s_k}(r_{ik})}{r_{ik}}\right]\frac{r_{ik}^\alpha r_{ik}^\beta}{(r_{ik})^2}
  + \sum_{k\neq i} F'_{s_k}(\bar\rho_k)\left[\rho''_{s_i}(r_{ik}) - \frac{\rho'_{s_i}(r_{ik})}{r_{ik}}\right]\frac{r_{ik}^\alpha r_{ik}^\beta}{(r_{ik})^2}\\
 &+ F''_{s_i}(\bar\rho_i)\,Q_i^\alpha Q_i^\beta
  + \sum_{k\neq i} F''_{s_k}(\bar\rho_k)\left[\rho'_{s_i}(r_{ik})\right]^2\frac{r_{ik}^\alpha r_{ik}^\beta}{(r_{ik})^2}.
\end{aligned} \qquad (19)$$
If the system is subject to periodic boundary conditions, or if there are no external fields, $G_{ii}^{\alpha\beta}$ can be simply found from the relation

$$\sum_{j\neq i} G_{ij}^{\alpha\beta} + G_{ii}^{\alpha\beta} = 0, \qquad (20)$$
expressing the invariance of $E_{tot}$ with respect to arbitrary rigid translations of the system. Eqs. (17) and (19) reveal again that dynamic coupling between atoms in EAM extends over distances up to $2R_c$. Notice that these equations are not limited to a perfect crystal and are valid for any equilibrium atomic configuration.

Knowing $G_{ij}^{\alpha\beta}$, we can construct the dynamical matrix

$$D_{ij}^{\alpha\beta} = \frac{G_{ij}^{\alpha\beta}}{\sqrt{M_i M_j}}, \qquad (21)$$
$M_i$ and $M_j$ being the atomic masses. A diagonalization of $D_{ij}^{\alpha\beta}$ gives us the squares, $\omega_n^2$, of the normal vibrational frequencies $\omega_n$ of our system. For a stable system all eigenvalues $\omega_n^2$ are non-negative, which allows us to determine the normal frequencies. These, in turn, can be immediately plugged into the relevant statistical-mechanical expressions for the free energy and other thermodynamic functions associated with atomic vibrations. This procedure, with possible slight modifications, lies at the foundation of all harmonic and quasi-harmonic thermodynamic calculations with atomistic potentials [9, 10]. In particular, a minimization of the total free energy (vibrational free energy plus $E_{tot}$) with respect to volume provides a quasi-harmonic scheme for thermal expansion calculations [11]. Alternatively, for a perfect crystal it is straightforward to compute the Fourier transform, $D_{ij}^{\alpha\beta}(\mathbf{k})$, of the dynamical matrix for various k-vectors within the Brillouin zone (here i and j refer to basis atoms). A diagonalization of $D_{ij}^{\alpha\beta}(\mathbf{k})$ permits a calculation of the $3n_b$ phonon dispersion relations $\omega(\mathbf{k})$.
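The mass-weighting and diagonalization steps of Eq. (21) amount to a few lines of code. In the sketch below, G is assumed to be the assembled 3N × 3N force-constant matrix (from Eqs. (17)–(19), or from finite differences of the forces); small negative eigenvalues caused by numerical noise in the translational zero modes are clipped.

    import numpy as np

    def vibrational_frequencies(G, masses):
        """Normal-mode frequencies from the force-constant matrix, Eq. (21)."""
        m = np.repeat(masses, 3)                  # one mass per Cartesian component
        D = G / np.sqrt(np.outer(m, m))           # D_ij = G_ij / sqrt(M_i M_j)
        w2 = np.linalg.eigvalsh(D)                # eigenvalues are omega_n^2
        return np.sqrt(np.clip(w2, 0.0, None))    # stable system: w2 >= 0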
If an EAM potential is used in the effective pair format and we need to compute $G_{ij}^{\alpha\beta}$ or $D_{ij}^{\alpha\beta}$ for the equilibrium reference structure, then all $F'_s(\bar\rho) = 0$ and Eqs. (17) and (19) are somewhat simplified. But even without this simplification, the computation of $G_{ij}^{\alpha\beta}$ directly from Eqs. (17) and (19) is a straightforward and relatively fast computational procedure. In fact, it is the diagonalization of the dynamical matrix, rather than its construction, that becomes the bottleneck of harmonic calculations for large systems.

Finally, we provide EAM expressions for the unrelaxed vacancy formation energy. The change in $E_{tot}$ accompanying the creation of a vacancy at a site i without relaxation equals

$$\Delta E_i = -\sum_{j\neq i}\Phi_{s_i s_j}(r_{ij}) - F_{s_i}(\bar\rho_i) + \sum_{j\neq i}\left[F_{s_j}\!\left(\bar\rho_j - \rho_{s_i}(r_{ij})\right) - F_{s_j}(\bar\rho_j)\right], \qquad (22)$$
where $\bar\rho_j$ is the host electron density at site j ≠ i before the vacancy creation. The first two terms in Eq. (22) account for the energy of broken bonds and the loss of the embedding energy of atom i, whereas the third term represents the changes in embedding energies of neighboring atoms j due to the reduction in their host electron density upon removal of atom i. For an elemental metal whose crystal structure consists of symmetrically equivalent sites,² the unrelaxed vacancy formation energy equals $E_v = \Delta E_i + E_0$, where

$$E_0 = \frac{1}{2}\sum_{j\neq i}\Phi(r_{ij}) + F(\bar\rho) \qquad (23)$$

is the cohesive energy of the crystal (the choice of site i is unimportant). Thus,

$$E_v = -\frac{1}{2}\sum_{j\neq i}\Phi(r_{ij}) + \sum_{j\neq i}\left[F\!\left(\bar\rho - \rho(r_{ij})\right) - F(\bar\rho)\right]. \qquad (24)$$

² Some structures, for example A15, contain nonequivalent sites.
The relaxation typically decreases $E_v$ by 10–20%. For a pair potential, Eq. (24) leads to $E_v = -E_0$, a relation which overestimates experimental values of $E_v$ by more than a factor of two. For example, in copper $E_v = 1.27$ eV while $E_0 = -3.54$ eV (both experimental numbers). The embedding-energy terms in Eq. (24) bring the agreement with experiment much closer.

For an alloy or compound, Eq. (22) only gives the so-called "raw" formation energy of a vacancy [12]. This energy alone is not sufficient for calculating the equilibrium vacancy concentration, but it serves as one of the ingredients required for such calculations. For an ordered intermetallic compound, "raw" energies of vacancies and antisite defects need to be computed for each sublattice. Expressions similar to Eq. (22) can be readily developed
for antisite defects. Another ingredient is the average cohesive energy of the compound,

$$E_0 = \frac{1}{n_b}\sum_i\left[\frac{1}{2}\sum_{j\neq i}\Phi_{s_i s_j}(r_{ij}) + F_{s_i}(\bar\rho_i)\right], \qquad (25)$$
where the summation over i runs over the $n_b$ basis atoms and the summation over j is over all neighbors of atom i. The set of all "raw" formation energies of point defects, together with $E_0$, provides input for statistical-mechanical models describing dynamic equilibrium among point defects and allowing a numerical calculation of their equilibrium concentrations [12, 13]. Although relaxations can reduce the "raw" energies significantly, fast unrelaxed calculations are very useful when generating potentials or making preliminary tests.

EAM potentials serve as a workhorse in the overwhelming majority of atomistic simulations of metallic materials. They are widely used in simulations of grain boundaries and interfaces [14], dislocations [15], fracture [16], diffusion and other processes [17]. EAM potentials have a good record of delivering reasonable results for a wide variety of properties. For elemental metals, elastic constants and vacancy formation energies are usually reproduced accurately. Surface energies tend to lie 10–20% below experiment, a problem that can hardly be solved within regular EAM. Surface relaxations and reconstructions usually agree with experiment at least qualitatively. Vacancy migration energies tend to underestimate experimental values unless specifically fit to them. Phonon dispersion curves, thermal expansion, melting temperatures, stacking fault energies, and structural energy differences may not come out accurate automatically, but can be adjusted during the potential generation procedure (see below). For binary systems, experimental heats of phase formation and properties of individual ordered compounds can be fitted with reasonable accuracy. For some binary systems, even basic features of phase diagrams can be reproduced without fitting to experimental thermodynamic data [18]. However, in systems with multiple intermediate phases, transferability across the entire phase diagram can be problematic [18].
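As an illustration of Eq. (24), the unrelaxed vacancy formation energy can be coded directly. The sketch below reuses the placeholder phi, rho and F functions introduced earlier and assumes that every neighbor density rho_bar[j] exceeds the contribution rho(r) being removed, as it does in a bulk-terminated crystal.

    import numpy as np

    def unrelaxed_vacancy_energy(positions, i, rho_bar, cutoff=6.0):
        """E_v by Eq. (24): broken bonds plus the embedding-energy changes on
        the neighbors j of the removed atom i (rho_bar holds the host
        densities of the perfect crystal, accumulated as in Eq. (2))."""
        e_v = 0.0
        for j in range(len(positions)):
            if j == i:
                continue
            r = np.linalg.norm(positions[j] - positions[i])
            if r < cutoff:
                e_v += -0.5 * phi(r) + F(rho_bar[j] - rho(r)) - F(rho_bar[j])
        return e_v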
3. Generation and Testing of Atomistic Potentials
We will first discuss potential generation procedures for elemental metals. The EAM functions $\Phi(r)$ and $\rho(r)$ are usually described by analytical expressions containing five to seven fitting parameters each. Different authors use polynomials, exponentials, Morse, Lennard-Jones or Gaussian functions, or their combinations. In the absence of strong physical leads, any reasonable function can be acceptable as long as it works. It is important, however, to keep the functions simple and smooth. Oscillations and wiggles can lead to
rapid changes or even discontinuities in higher derivatives and cause unphysical effects in phonon frequencies, thermal expansion and other properties. The risk increases when analytical forms are replaced by cubic splines (discontinuous third derivative), especially with a large number of nodes. Increasing the number of fitting parameters should be done with great caution. The observed improvement in the accuracy of the fit can be illusory, as the potential may perform poorly for properties not included in the fit. Many sophisticated potentials contain hidden flaws that only reveal themselves under certain simulation conditions. As a rough rule of thumb, potentials whose $\Phi(r)$ and $\rho(r)$ together contain over 15 fitting parameters may lack reliability in applications. At the same time, using too few (say, < 10) parameters may not take full advantage of the capabilities of EAM. Since the speed of atomistic simulations does not depend on the complexity of the potential functions or the number of fitting parameters,³ it makes sense to put effort into optimizing them for the best accuracy and reliability.

³ We assume that the potential functions are used by the simulation program in tabulated form.

There are two ways of constructing the embedding function $F(\bar\rho)$. One way is to describe it by an analytical function (or cubic spline [19]) with adjustable parameters. Another way is to postulate an equation of state of the ground-state structure. Most authors use the universal binding curve [20],

$$E(a) = E_0\,(1 + \alpha x)\,e^{-\alpha x}, \qquad (26)$$

where E(a) is the crystal energy per atom as a function of the lattice parameter a, $x = (a/a_0 - 1)$ ($a_0$ being the equilibrium value of a),

$$\alpha = \sqrt{-\frac{9\Omega_0 B}{E_0}},$$

and B is the bulk modulus. $F(\bar\rho)$ is then obtained by inverting Eq. (26). Namely, by varying the lattice parameter we compute $\bar\rho(a)$ and $F(a) = E(a) - E_p(a)$, where E(a) is given by Eq. (26) and $E_p(a)$ is the pair-interaction part of $E_{tot}$. The functions F(a) and $\bar\rho(a)$ thus obtained parametrically define $F(\bar\rho)$. Notice that this procedure automatically guarantees an exact fit to $E_0$, $a_0$ and B. A slightly improved procedure is to add a higher-order term $\sim\beta x^3$ to the pre-exponential factor of Eq. (26) and use the additional parameter β to fit an experimental pressure–volume relation under large compressions [21].

Even if we do not postulate Eq. (26) and treat $F(\bar\rho)$ as a function with parameters, $E_0$, $a_0$, and B can still be matched exactly using Eq. (23) for $E_0$, the lattice equilibrium condition

$$\frac{1}{2}\sum_{j\neq i}\Phi'(r_{ij})\,r_{ij} + F'(\bar\rho)\sum_{j\neq i}\rho'(r_{ij})\,r_{ij} = 0 \qquad (27)$$
(follows from Eq. (12)) and the expression for B,

$$9\Omega_0 B = \frac{1}{2}\sum_{j\neq i}\Phi''(r_{ij})\,(r_{ij})^2 + F'(\bar\rho)\sum_{j\neq i}\rho''(r_{ij})\,(r_{ij})^2 + F''(\bar\rho)\left[\sum_{j\neq i}\rho'(r_{ij})\,r_{ij}\right]^2 \qquad (28)$$
(can be derived from Eqs. (13) and (27)). These three equations can be readily satisfied by adjusting the values of $F(\bar\rho)$, $F'(\bar\rho)$ and $F''(\bar\rho)$ at $a = a_0$.

The fitting parameters of a potential are optimized by minimizing the weighted mean-squared deviation of properties from their target values. The weights are used as a means of controlling the importance of some properties over others. Some properties are included with a very small weight that only prevents unreasonable values without pursuing an actual fit. While early EAM potentials were fit to experimental properties only, the current trend is to include in the fitting database both experimental and first-principles data [19, 21, 22]. In fact, some of the recent potentials are predominantly fit to first-principles data and only use a few experimental numbers, which essentially makes them a parameterization of first-principles calculations. The incorporation of first-principles data into the fitting database improves the reliability of potentials by sampling larger areas of configuration space, including atomic configurations away from those represented by experimental data.

Experimental properties used for potential generation traditionally include $E_0$, $a_0$, the elastic constants $c_{ij}$, the vacancy formation energy, and often the stacking fault energy. Thermal expansion factors, phonon frequencies, surface energies, and the vacancy migration energy can also be included. Depending on the intended use of the potential, some of these properties are strongly enforced while others are only used for a sanity check (small weight). First-principles data usually come in the form of energy–volume relations for the ground-state structure and several hypothetical "excited" structures of the same metal. The role of these structures is to probe various local environments and atomic volumes of the metal. This sampling improves the transferability of potentials to atomic configurations occurring during subsequent atomistic simulations. Furthermore, first-principles energies along uniform deformation paths between different structures are often calculated, such as the tetragonal deformation path between the FCC and BCC structures (the Bain path) or the trigonal deformation path FCC – simple cubic – BCC. Such deformations, however, are normally used for testing potentials rather than for fitting. An alternative way of using first-principles data is to fit to interatomic forces drawn from snapshots of first-principles molecular dynamics simulations for solid as well as liquid phases of a metal (the force-matching method) [19]. The liquid-phase configurations can improve the accuracy of the potential in melting simulations.
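The inversion of the universal binding curve, Eq. (26), described above can be sketched as follows; the shell-wise description of the reference lattice and the reuse of the illustrative phi and rho functions are assumptions made for the example.

    import numpy as np

    def embedding_from_eos(a_grid, shells, E0, a0, B, Omega0):
        """Parametric construction of F(rho_bar) by inverting Eq. (26);
        `shells` lists (multiplicity, distance/a) pairs of the reference lattice."""
        alpha = np.sqrt(-9.0 * Omega0 * B / E0)
        rho_vals, F_vals = [], []
        for a in a_grid:
            x = a / a0 - 1.0
            E = E0 * (1.0 + alpha * x) * np.exp(-alpha * x)   # Eq. (26)
            E_pair = sum(0.5 * m * phi(d * a) for m, d in shells)
            rho_vals.append(sum(m * rho(d * a) for m, d in shells))
            F_vals.append(E - E_pair)                         # F(a) = E(a) - E_p(a)
        return np.array(rho_vals), np.array(F_vals)

    # FCC shells out to third neighbors: 12 at a/sqrt(2), 6 at a, 24 at a*sqrt(3/2)
    fcc_shells = [(12, 1.0 / np.sqrt(2.0)), (6, 1.0), (24, np.sqrt(1.5))]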
To illustrate the accuracy achievable by modern EAM potentials, Table 1 summarizes selected properties of copper calculated with an EAM potential [23] in comparison with experiment. This particular potential was parameterized by simple analytical functions. A universal equation of state was not enforced and $F(\bar\rho)$ was described by a polynomial. The cutoff radius of the potential, $R_c = 0.551$ nm, covers four coordination shells, but the contribution of the fourth shell is extremely small. Besides the experimental properties indicated in Table 1, the fitting database included two experimental phonon frequencies at the zone-boundary point X, a pressure–volume relation at high pressures and, with a small weight, the dimer bond energy $E_d$ and thermal expansion factors at several temperatures. The first-principles data included energy–volume relations for several structures. Only the FCC, HCP, and BCC structures were used in the fit, while other structures were deferred for testing. The potential demonstrates excellent agreement with experiment for both fitted and predicted properties, except for the surface energies, which are too low. Phonon dispersion relations and thermal expansion factors are also in accurate agreement with experiment (Fig. 1). The potential accurately reproduces first-principles energies of alternate structures not included in the fit, as well as energies along several deformation paths between them.
Table 1. Selected properties of Cu calculated with an embedded-atom potential [23] in comparison with experimental data (see [23] for experimental references). Notation: E_v^f and E_v^m – vacancy formation and migration energies; E_i^f and E_i^m – self-interstitial formation and migration energies; γ_SF – intrinsic stacking fault energy; γ_us – unstable stacking fault energy; γ_s – surface energy; γ_T – symmetrical twin boundary energy; T_m – melting temperature; R_d – dimer bond length; E_d – dimer bond energy. All other notations are explained in the text. All defect energies were obtained by static relaxation at 0 K.

Property              Experiment    EAM
a0 (nm)^a             0.3615        0.3615
E0 (eV)^a             −3.54         −3.54
c11 (GPa)^a           170.0         169.9
c12 (GPa)^a           122.5         122.6
c44 (GPa)^a           75.8          76.2
E_v^f (eV)^a          1.27          1.27
E_v^m (eV)^a          0.71          0.69
E_i^f (eV)            2.8–4.2       3.06
E_i^m (eV)            0.12          0.10
γ_SF (mJ/m²)^a        45            44.4
γ_us (mJ/m²)          –             158
γ_T (mJ/m²)           24            22.2
γ_s(111) (mJ/m²)      1790^b        1239
γ_s(110) (mJ/m²)      1790^b        1475
γ_s(100) (mJ/m²)      1790^b        1345
T_m (K)               1357          1327^c
R_d (nm)              0.22          0.218
E_d (eV)^d            −2.05         −1.93

^a Used in the fit. ^b Average orientation. ^c Calculated by molecular dynamics (interface velocity method). ^d Used in the fit with a small weight.
[Figure 1 appears here: (a) phonon dispersion curves ν(THz) of Cu along the [q00], [qq0] and [qqq] directions (Γ–X, X–K–Γ and Γ–L branches), comparing EAM calculations with experimental points; (b) linear thermal expansion (%) from 0 to 1400 K up to T_m, comparing EAM, Monte Carlo and experimental data.]

Figure 1. Comparison of embedded-atom calculations [23] with experimental data for Cu. (a) phonon dispersion curves, (b) linear thermal expansion relative to room temperature. The discrepancy in thermal expansion at low temperatures is due to quantum effects that are not captured by classical Monte Carlo simulations.
For a binary system A–B, the simplest potential generation scheme is to utilize existing potentials for the two metals A and B and only construct a cross-interaction function $\Phi_{AB}(r)$.⁴

⁴ An alternative approach is to optimize all seven potential functions simultaneously; see, for example, Ref. [24].
To gain additional fitting parameters we take advantage of the fact that the transformations

$$F_A(\bar\rho) \to F_A(\bar\rho) + g_A\bar\rho, \qquad (29)$$

$$\Phi_{AA}(r) \to \Phi_{AA}(r) - 2g_A\rho_A(r), \qquad (30)$$

$$F_B(\bar\rho) \to F_B(\bar\rho) + g_B\bar\rho, \qquad (31)$$

$$\Phi_{BB}(r) \to \Phi_{BB}(r) - 2g_B\rho_B(r), \qquad (32)$$

$$\rho_B(r) \to p_B\,\rho_B(r), \qquad (33)$$

$$F_B(\bar\rho) \to F_B(\bar\rho/p_B) \qquad (34)$$
leave the energies of the elements A and B invariant while altering the energies of binary alloys. Thus, $p_B$, $g_A$ and $g_B$ can be treated as adjustable parameters. After the fit, the new potential functions can be converted to the binary effective pair format by applying the invariant transformations of Eqs. (3)–(6) with $g_A = -F'_A(\bar\rho_A)$ and $g_B = -F'_B(\bar\rho_B)$, $\bar\rho_A$ and $\bar\rho_B$ being the host electron densities in a reference compound. It should be remembered that the binary effective pair format thus obtained will produce elemental potential functions different from the initial ones. Thus, if the initial elemental potentials were in the effective pair format, it will generally be destroyed by the fitting process. Indeed, the reference state of an elemental potential is its ground state, while the reference state of the binary system is a particular binary compound. Physically, however, both elemental potentials will remain exactly the same. All these mathematical transformations should be carefully observed when comparing different potentials or reconstructing them from published parameters.

Experimental properties used for optimizing a binary potential typically include $E_0$, $a_0$, and $c_{ij}$ of a chosen intermetallic compound. For structural intermetallics, energies of generalized planar faults involved in dislocation dissociations can also be used in the fit to improve the applicability of the potential to simulations of mechanical behavior [15]. Fracture simulations [16] may additionally require reasonable surface energies, which can be adjusted to some extent during the fitting procedure. On the other hand, for thermodynamic and diffusion simulations it is more important to reproduce the heat of formation of the compound and point-defect characteristics. As with pure metals, the current trend in constructing binary potentials is to incorporate first-principles data, usually in the form of energy–volume relations for experimentally observed and hypothetical compounds. The transferability of a potential can be significantly improved by including compounds with several different stoichiometries across the entire phase diagram [18, 21, 24]. Even if such compounds do not actually exist on the experimental diagram, they sample a broader area of configuration space and secure reasonable energies of the various environments and chemical compositions that may occur locally during atomistic simulations, for example, in core regions of lattice
defects. Some of the recent binary potentials only use a few experimental numbers but otherwise rely heavily on first-principles input [18]. Besides structural energies, such input may include energies along deformation paths between compounds, energies of stable and unstable planar faults, point-defect energies and other data. Some of this information can be withheld from the fitting database and used for testing the potential. The most critical test of transferability of a binary potential is its ability to reproduce the phase diagram at least qualitatively. Unfortunately, many existing potentials are nicely fit to specific properties of a particular compound but fail to describe other structures and compositions with any acceptable accuracy. Such potentials can easily produce incorrect structures of grain boundaries, interfaces or any other defects whose local chemical composition deviates significantly from the bulk composition.

A challenge for future research is to establish a procedure for generating reliable EAM potentials for ternary systems. A carefully chosen model system A–B–C should be used as a testing ground. The first step would be to simply construct three binary potentials, A–B, B–C, and C–A, based on the same set of high-quality elemental potentials and capable of reproducing the relevant binary phase diagrams at least on a qualitative level. Such potentials should be based on extensive first-principles input and a smart procedure for the simultaneous optimization of the transformation parameters $g_s$ and $p_s$ relating the different binaries. The critical test of this potential set would be an evaluation of the thermodynamic stability of ternary compounds existing on the experimental diagram. At the next step, calculated properties of such compounds could be improved by further adjustments of the binary potentials.
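For an elemental potential, the conversion to the effective pair format discussed above (Eqs. (3) and (4) with g = −F′(ρ̄) at the reference density) can be expressed compactly. This is a sketch only; the central-difference step h is an arbitrary illustrative choice.

    def to_effective_pair_format(phi, rho, F, rho_bar_ref, h=1.0e-6):
        """Apply Eqs. (3)-(4) with g = -F'(rho_bar_ref), so the transformed
        embedding function has zero slope at the reference host density."""
        g = -(F(rho_bar_ref + h) - F(rho_bar_ref - h)) / (2.0 * h)
        F_eff = lambda rb: F(rb) + g * rb               # Eq. (3)
        phi_eff = lambda r: phi(r) - 2.0 * g * rho(r)   # Eq. (4), elemental case
        return phi_eff, rho, F_eff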
4. Angular-dependent Potentials
EAM potentials work best for simple and noble metals but are less accurate for transition metals. The latter reflects an intrinsic limitation of EAM, which is essentially a central-force model that cannot capture the covalent component of bonding arising due to d-electrons in transition metals. Baskes et al. [25–28] developed a non-central-force extension of EAM, which they called the modified embedded-atom method (MEAM). In MEAM, the electron density is treated as a tensor quantity and the host electron density $\bar\rho_i$ is expressed as a function of the respective tensor invariants. In the simplest approximation, $\bar\rho_i$ is given by the expansion

$$(\bar\rho_i)^2 = \left(\bar\rho_i^{(0)}\right)^2 + \left(\bar\rho_i^{(1)}\right)^2 + \left(\bar\rho_i^{(2)}\right)^2 + \left(\bar\rho_i^{(3)}\right)^2, \qquad (35)$$
where

$$\left(\bar\rho_i^{(0)}\right)^2 = \left[\sum_{j\neq i}\rho_{s_j}^{(0)}(r_{ij})\right]^2, \qquad (36)$$

$$\left(\bar\rho_i^{(1)}\right)^2 = \sum_{\alpha}\left[\sum_{j\neq i}\rho_{s_j}^{(1)}(r_{ij})\,\frac{r_{ij}^\alpha}{r_{ij}}\right]^2, \qquad (37)$$

$$\left(\bar\rho_i^{(2)}\right)^2 = \sum_{\alpha,\beta}\left[\sum_{j\neq i}\rho_{s_j}^{(2)}(r_{ij})\,\frac{r_{ij}^\alpha r_{ij}^\beta}{r_{ij}^2}\right]^2 - \frac{1}{3}\left[\sum_{j\neq i}\rho_{s_j}^{(2)}(r_{ij})\right]^2, \qquad (38)$$

$$\left(\bar\rho_i^{(3)}\right)^2 = \sum_{\alpha,\beta,\gamma}\left[\sum_{j\neq i}\rho_{s_j}^{(3)}(r_{ij})\,\frac{r_{ij}^\alpha r_{ij}^\beta r_{ij}^\gamma}{r_{ij}^3}\right]^2. \qquad (39)$$
The terms $\bar\rho_i^{(k)}$ (k = 0, 1, 2, 3) can be thought of as representing contributions of s, p, d, and f electronic orbitals, respectively. It should be emphasized, however, that the exact relation of these terms to electronic orbitals is not physically clear, and Eqs. (35)–(39) can just as well be viewed as ad hoc expressions whose only role is to introduce non-spherical components of bonding. The regular EAM is recovered by including only the electron density of "s-orbitals," $\bar\rho_i^{(0)}$, and neglecting all other terms. In comparison with regular EAM, MEAM introduces three new functions, $\rho_s^{(1)}(r)$, $\rho_s^{(2)}(r)$, and $\rho_s^{(3)}(r)$, for each species s, which are fit to experimental and first-principles data in much the same manner as in EAM. While EAM potentials are smoothly truncated at a sphere embracing several coordination shells, MEAM includes only one or two coordination shells but introduces a many-body "screening" procedure described in detail by Baskes [27, 29]. Computationally, MEAM is roughly a factor of five to six slower than EAM, but can be more accurate for transition metals. It has even been successfully applied to covalent solids, including Si and Ge [27]. Advantages of MEAM over EAM are particularly strong for non-centrosymmetric structures and materials with a negative Cauchy pressure. The latter can be readily reproduced by angular-dependent terms while keeping $F''(\bar\rho) > 0$. MEAM potentials have been constructed for a number of metals [27, 29, 30] and intermetallic compounds [31, 32].

Pasianot et al. [33] proposed a slightly different way of incorporating angular interactions into EAM. In their so-called embedded-defect method (EDM), the total energy is written in the form

$$E_{tot} = \frac{1}{2}\sum_{i,j\,(j\neq i)}\Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar\rho_i) + G\sum_i Y_i, \qquad (40)$$
where

$$\bar\rho_i = \sum_{j\neq i}\rho_{s_j}(r_{ij}), \qquad (41)$$

$$Y_i = \sum_{\alpha,\beta}\left[\sum_{j\neq i}\rho_{s_j}(r_{ij})\,\frac{r_{ij}^\alpha r_{ij}^\beta}{r_{ij}^2}\right]^2 - \frac{1}{3}\left[\sum_{j\neq i}\rho_{s_j}(r_{ij})\right]^2. \qquad (42)$$
Expression (40) was originally derived from physical considerations different from those underlying MEAM. Mathematically, however, Eqs. (40)–(42) present a particular case of Eqs. (35)–(39) in which $\bar\rho_i^{(1)}$ and $\bar\rho_i^{(3)}$ are neglected, $F(\bar\rho_i)$ is approximated by a linear expansion in terms of the small perturbation $\left(\bar\rho_i^{(2)}\right)^2$, and the latter is expressed through the undisturbed electron-density function $\rho_s(r)$: $\rho_s^{(2)}(r) \equiv \rho_s(r)$. In comparison with EAM, EDM introduces only one additional parameter, G. Like EAM, EDM uses cutoff functions, thus avoiding the MEAM screening procedure. EDM potentials have been successfully constructed for several HCP [34] and BCC transition metals [33, 35–37]. While EDM is computationally faster than MEAM, it is less general and offers fewer fitting parameters for the angular part. However, the original EDM formulation can be readily generalized by including more angular-dependent terms:

$$E_{tot} = \frac{1}{2}\sum_{i,j\,(j\neq i)}\Phi_{s_i s_j}(r_{ij}) + \sum_i F_{s_i}(\bar\rho_i) + \sum_i\left[\left(\bar\rho_i^{(1)}\right)^2 + \left(\bar\rho_i^{(2)}\right)^2 + \left(\bar\rho_i^{(3)}\right)^2\right], \qquad (43)$$
where the $\bar\rho_i^{(k)}$ are expressed through parameterized functions $\rho_s^{(k)}(r)$ by Eqs. (37)–(39). Overall, MEAM, EDM, and Eq. (43) are all equally legitimate empirical expressions introducing angular-dependent forces. The role of the $\bar\rho_i^{(k)}$'s is simply to penalize $E_{tot}$ for deviations from local cubic symmetry. These terms do not affect the energy–volume relations for cubic crystals, but are important for structures with broken local cubic symmetry. Thus, the energies of many common crystal structures, such as L1_2, L1_0, and L1_1, depend on the "quadrupole" term $\bar\rho_i^{(2)}$. This dependence opens new degrees of freedom for reproducing structural energies of intermetallic compounds. Since non-hydrostatic strains break cubic symmetry, $\bar\rho_i^{(2)}$ also affects elastic constants, which enables their more accurate fit and the reproduction of negative Cauchy pressures. In some structures, such as diamond and some binary compounds, elastic constants are also affected by the "dipole" term $\bar\rho_i^{(1)}$. Areas of broken symmetry inevitably
exist around lattice defects. Due to the additional penalty arising from the angular terms, defect energies can be larger than in EAM. In particular, it becomes possible to reproduce higher surface energies and a more accurate vacancy migration energy. In sum, angular-dependent terms can improve the accuracy of fit of potentials in comparison with regular EAM. However, the effect of such terms on the transferability of potentials needs to be studied in more detail.
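For readers who wish to experiment with such angular terms, the partial densities of Eqs. (36)–(39) for one site can be evaluated directly from the neighbor bond vectors, as in the sketch below; the radial functions rho0..rho3 are passed in as callables, and the MEAM screening and cutoff machinery is deliberately omitted.

    import numpy as np

    def meam_density_terms(bond_vectors, rho0, rho1, rho2, rho3):
        """Squared partial densities (rho_bar^(k))^2 of Eqs. (36)-(39) for one
        site; combine them via Eq. (35) to obtain (rho_bar_i)^2."""
        r = np.linalg.norm(bond_vectors, axis=1)
        u = bond_vectors / r[:, None]               # unit bond vectors
        t0 = np.sum(rho0(r)) ** 2                                        # Eq. (36)
        t1 = sum(np.sum(rho1(r) * u[:, a]) ** 2 for a in range(3))       # Eq. (37)
        t2 = (sum(np.sum(rho2(r) * u[:, a] * u[:, b]) ** 2
                  for a in range(3) for b in range(3))
              - np.sum(rho2(r)) ** 2 / 3.0)                              # Eq. (38)
        t3 = sum(np.sum(rho3(r) * u[:, a] * u[:, b] * u[:, c]) ** 2
                 for a in range(3) for b in range(3) for c in range(3))  # Eq. (39)
        return t0, t1, t2, t3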
5. Outlook
Embedded-atom potentials provide a reasonable description of a broad spectrum of properties of metallic systems and enable fast atomistic simulations of a variety of processes, ranging from thermodynamic functions and diffusion to plastic deformation and fracture. There are intrinsic limitations of EAM, which is still a semi-empirical model based on central-force interactions. Such limitations set boundaries on the accuracy achievable within this method. However, the accuracy and robustness of EAM potentials gradually improve, within those boundaries, through the development of more efficient fitting and testing procedures, the use of larger data sets, and, most importantly, the increasing weight of first-principles data. The latter trend may eventually transform the method into a parameterization, or mapping, of first-principles data.

Much work needs to be done to improve the transferability of binary EAM potentials. This, again, can be achieved by further optimizing the potential generation procedures and using more first-principles data. The most severe test of a binary potential is its ability to predict the correct phase stability across the entire phase diagram. It is not quite clear at this point how far EAM can be pushed in that direction, but this certainly deserves to be explored. Reliable ternary potentials remain a grand challenge for future research.

Presently, the only way of generalizing EAM to include non-central interactions is to introduce energy penalties for local deviations from cubic symmetry. This can be achieved by calculating local dipole, quadrupole, and perhaps higher-order tensors and making the energy a function of their invariants. Depending on the initial physical motivation behind such tensors and on some technical details (such as cutoff functions versus screening), this idea has been implemented first in MEAM and later in EDM. It should be emphasized, however, that other equally legitimate forms of an angular-dependent potential can be readily constructed in the same spirit, Eq. (43) being just one example. Since there is no unique physical justification for those different forms, they can all simply be viewed as useful empirical expressions.

Both MEAM and EDM potentials have been developed for a number of transition metals and have demonstrated an improved accuracy in reproducing their properties. MEAM has also been applied, with significant success, to
intermetallic compounds and even covalent solids. Future work may further develop this group of methods towards binary and eventually ternary systems.
References
[1] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic, San Diego, 2002.
[2] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
[3] M.S. Daw and M.I. Baskes, "Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals," Phys. Rev. B, 29, 6443–6453, 1984.
[4] M.W. Finnis and J.E. Sinclair, "A simple empirical N-body potential for transition metals," Philos. Mag. A, 50, 45–55, 1984.
[5] J.K. Nørskov, "Covalent effects in the effective-medium theory of chemical binding: hydrogen heats of solution in the 3d metals," Phys. Rev. B, 26, 2875–2885, 1982.
[6] D.G. Pettifor, Bonding and Structure of Molecules and Solids, Clarendon Press, Oxford, 1995.
[7] M.S. Daw, "Embedded-atom method: many-body description of metallic cohesion," In: V. Vitek and D.J. Srolovitz (eds.), Atomistic Simulation of Materials: Beyond Pair Potentials, Plenum Press, New York, pp. 181–191, 1989.
[8] M.S. Daw and R.L. Hatcher, "Application of the embedded atom method to phonons in transition metals," Solid State Commun., 56, 697–699, 1985.
[9] A. Van de Walle and G. Ceder, "The effect of lattice vibrations on substitutional alloy thermodynamics," Rev. Mod. Phys., 74, 11–45, 2002.
[10] J.M. Rickman and R. LeSar, "Free-energy calculations in materials research," Annu. Rev. Mater. Res., 32, 195–217, 2002.
[11] S.M. Foiles, "Evaluation of harmonic methods for calculating the free energy of defects in solids," Phys. Rev. B, 49, 14930–14938, 1994.
[12] Y. Mishin and C. Herzig, "Diffusion in the Ti–Al system," Acta Mater., 48, 589–623, 2000.
[13] M. Hagen and M.W. Finnis, "Point defects and chemical potentials in ordered alloys," Philos. Mag. A, 77, 447–464, 1998.
[14] D. Wolf, Handbook of Materials Modeling, vol. 1, Chapter 8: Interfaces, 2004.
[15] W. Cai, "Modeling dislocations using a periodic cell," Article 2.21, this volume.
[16] D. Farkas and R. Selinger, "Atomistics of fracture," Article 2.33, this volume.
[17] A.F. Voter, "The embedded-atom method," In: J.H. Westbrook and R.L. Fleischer (eds.), Intermetallic Compounds, vol. 1, John Wiley & Sons, New York, pp. 77–90, 1994.
[18] Y. Mishin, "Atomistic modeling of the γ and γ′ phases of the Ni–Al system," Acta Mater., 52, 1451–1467, 2004.
[19] F. Ercolessi and J.B. Adams, "Interatomic potentials from first-principles calculations: the force-matching method," Europhys. Lett., 26, 583–588, 1994.
[20] J.H. Rose, J.R. Smith, F. Guinea, and J. Ferrante, "Universal features of the equation of state of metals," Phys. Rev. B, 29, 2963–2969, 1984.
[21] R.R. Zope and Y. Mishin, "Interatomic potentials for atomistic simulations of the Ti–Al system," Phys. Rev. B, 68, 024102, 2003.
[22] Y. Mishin, D. Farkas, M.J. Mehl, and D.A. Papaconstantopoulos, "Interatomic potentials for monoatomic metals from experimental data and ab initio calculations," Phys. Rev. B, 59, 3393–3407, 1999.
[23] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. Kress, "Structural stability and lattice defects in copper: ab initio, tight-binding and embedded-atom calculations," Phys. Rev. B, 63, 224106, 2001.
[24] Y. Mishin, M.J. Mehl, and D.A. Papaconstantopoulos, "Embedded-atom potential for B2-NiAl," Phys. Rev. B, 65, 224114, 2002.
[25] M.I. Baskes, "Application of the embedded-atom method to covalent materials: a semi-empirical potential for silicon," Phys. Rev. Lett., 59, 2666–2669, 1987.
[26] M.I. Baskes, J.S. Nelson, and A.F. Wright, "Semiempirical modified embedded-atom potentials for silicon and germanium," Phys. Rev. B, 40, 6085–6110, 1989.
[27] M.I. Baskes, "Modified embedded-atom potentials for cubic metals and impurities," Phys. Rev. B, 46, 2727–2742, 1992.
[28] M.I. Baskes, J.E. Angelo, and C.L. Bisson, "Atomistic calculations of composite interfaces," Modelling Simul. Mater. Sci. Eng., 2, 505–518, 1994.
[29] M.I. Baskes, "Determination of modified embedded atom method parameters for nickel," Mater. Chem. Phys., 50, 152–158, 1997.
[30] M.I. Baskes and R.A. Johnson, "Modified embedded-atom potentials for HCP metals," Modelling Simul. Mater. Sci. Eng., 2, 147–163, 1994.
[31] M.I. Baskes, "Atomic potentials for the molybdenum–silicon system," Mater. Sci. Eng. A, 261, 165–168, 1999.
[32] D. Chen, M. Yan, and Y.F. Liu, "Modified embedded-atom potential for L10-TiAl," Scripta Mater., 40, 913–920, 1999.
[33] R. Pasianot, D. Farkas, and E.J. Savino, "Empirical many-body interatomic potentials for bcc transition metals," Phys. Rev. B, 43, 6952–6961, 1991.
[34] J.R. Fernandez, A.M. Monti, and R.C. Pasianot, "Point defects diffusion in α-Ti," J. Nucl. Mater., 229, 1–9, 1995.
[35] G. Simonelli, R. Pasianot, and E.J. Savino, "Point-defect computer simulation including angular forces in bcc iron," Phys. Rev. B, 50, 727–738, 1994.
[36] G. Simonelli, R. Pasianot, and E.J. Savino, "Phonon-dispersion curves for transition metals within the embedded-atom and embedded-defect methods," Phys. Rev. B, 55, 5570–5573, 1997.
[37] G. Simonelli, R. Pasianot, and E.J. Savino, "Self-interstitial configuration in BCC metals. An analysis based on many-body potentials for Fe and Mo," Phys. Status Solidi (b), 217, 747–758, 2000.
2.3 INTERATOMIC POTENTIAL MODELS FOR IONIC MATERIALS

Julian D. Gale
Nanochemistry Research Institute, Department of Applied Chemistry, Curtin University of Technology, Perth, 6845, Western Australia
Ionic materials are present in many key technological applications of the modern era, from solid-state batteries, fuel cells and nuclear waste immobilization, through to industrial heterogeneous catalysis, such as that found in automotive exhaust systems. With the boundless possibilities for their utilization, it is natural that there has been a long history of computer simulation of their structure and properties in order to understand the materials science of these systems at the atomic level.

The classification of materials into different types is, of course, an arbitrary and subjective decision. However, when a binary compound is composed of two elements with very different electronegativities, as is the case for oxides and halides in particular, then it is convenient to regard it as being an ionic solid. The implication is that, as a result of charge transfer from one element to the other, the dominant binding force between particles is the Coulombic attraction between opposite charges. Such materials tend to be characterized by close-packed, dense structures that show no strong directionality in the bonding. Typically, most ionic materials possess a large band gap and are therefore insulating. As a consequence, the notion that the solid is composed of spherical ions whose interactions can be represented by simple distance-dependent functional forms is quite a reasonable one, since overtly quantum mechanical effects are lesser than in materials where covalent bonding occurs. Thus it is possible to develop force fields that are specific to ionic materials, and this approach can be surprisingly successful considering the simplicity of the interatomic potential model.

When considering how to construct a force field for ionic materials, the starting point, as is the case for all types of system, is to assume that the total
energy, $U_{tot}$, can be decomposed into interactions between different numbers of atoms:

$$U_{tot} = \frac{1}{2!}\sum_i\sum_j U_{ij} + \frac{1}{3!}\sum_i\sum_j\sum_k U_{ijk} + \frac{1}{4!}\sum_i\sum_j\sum_k\sum_l U_{ijkl} + \cdots$$

Here, $U_{ij}$ is the energy of interaction between a pair of atoms, i and j, the so-called two-body interaction energy; $U_{ijk}$ is the extra interaction that arises (beyond the sum of the three two-body energy components for the pairs i–j, j–k, and i–k) when a triad of atoms is considered, and so forth for higher-order terms. Note that the inverse factorial prefactor is required to avoid double counting of interactions between particles. In principle, the above decomposition is exact if carried out to terms of high enough order. However, in practice it is necessary to truncate the expansion at some point. For many ionic materials it is often sufficient to include only the two-body term, though the extensions beyond this will be discussed later.

Imagining an ionic solid as being composed of cations and anions whose electron densities are frozen, which represents the simplest possible case, the physical interactions present can be intuitively understood. There will obviously be a Coulombic attraction between ions of opposite charge, with a corresponding repulsive force between those of like nature. Because ions are arranged such that the closest neighbours are of opposite sign, this gives rise to a strong net attractive energy that will tend to contract the solid in order to lower the energy. In order that an equilibrium structure is obtained there must be a counterbalancing repulsive force. This arises from the overlap of the electron densities of two ions, regardless of the sign of their charge, and has its origin in the Pauli repulsion between electrons. Hence, we can write the breakdown of the two-body energy in general terms as:

$$U_{ij} = U_{ij}^{\mathrm{Coulomb}} + U_{ij}^{\mathrm{repulsive}}$$
While real spherical ions will have a radial electron density distribution, it is convenient to treat the ions as point charges – i.e., as though all the electron density is situated at the nucleus. Within this approximation, the electrostatic interaction of two charged particles is just given by Coulomb's law;

$$U_{ij}^{\mathrm{Coulomb}} = \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}$$

or, if written in atomic units, as will subsequently be done, we can drop the constant factor of $4\pi\varepsilon_0$:

$$U_{ij}^{\mathrm{Coulomb}} = \frac{q_i q_j}{r_{ij}}$$
The error in the electrostatic energy arising from the point charge approximation is usually subsumed into the repulsive energy contribution, since this latter term is usually derived by a fitting procedure, rather than from direct theoretical considerations.
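For a finite, non-periodic set of point charges, the Coulomb energy in atomic units can be summed directly, as in the sketch below. For a periodic crystal this direct sum is conditionally convergent, which is exactly the difficulty taken up in the next section.

    import numpy as np

    def coulomb_energy(charges, positions):
        """U = sum_{i<j} q_i q_j / r_ij in atomic units (finite cluster only)."""
        U = 0.0
        for i in range(len(charges)):
            for j in range(i + 1, len(charges)):
                U += charges[i] * charges[j] / np.linalg.norm(positions[j] - positions[i])
        return U

    # charge-neutral rock-salt-like cube of 8 ions with unit spacing
    pts = np.array([(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
    q = np.array([(-1.0) ** int(p.sum()) for p in pts])
    print(coulomb_energy(q, pts))     # net attractive (negative) energy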
1. Calculating the Electrostatic Energy
Not only is the electrostatic energy the dominant contribution to the total value, but it turns out that it is actually the most difficult to evaluate. While it is easy to write down that the electrostatic energy is the sum over all pairwise interactions, including all periodic images of the unit cell, the complication arises because the sum must be truncated for actual computation. Unfortunately, the summation is an example of a conditionally convergent series, i.e., the value of the sum depends on how the truncation is made. The reason for this can be understood by considering the interactions of a single ion with all other ions within a given radius, r. The convergence of the energy, $U_{tot}^r$, is given by the number of interactions, $N_r$, multiplied by the magnitude of the interaction, $U^r$:

$$U_{tot}^r = N_r\,U^r$$
As r increases, the number of interactions rises in proportion to the surface area of the cut-off sphere: N_r ∝ 4πr². However, the interaction itself only decreases as the inverse first power of r, as has been shown previously. Consequently, the magnitude of the summed interaction potentially increases as the cut-off radius is extended. The fact that the sum converges in practice relies on the cancellation between interactions with cations and anions. It turns out that the electrostatic energy of a system actually depends on the macroscopic state of a crystal, due to the long-ranged effect of Coulomb fields. In other words, it is not purely a property of the bulk crystal, but also depends, in general, on the nature of the surfaces and of the crystal morphology [3]. To make it feasible to define an electrostatic energy that is useful for the simulation of ionic materials, it is conventional to impose two conditions on the Coulomb summation:

1. The sum of the charges within the system must be equal to zero:

$$\sum_i q_i = 0$$
2. The total dipole moment of the system in all directions must also be equal to zero:

$$\mu_x = \mu_y = \mu_z = 0$$

If these conditions are satisfied, the electrostatic energy will always converge to the same value as the cut-off radius is incremented. It is also possible to define the electrostatic energy when the dipole moments along the three Cartesian axes differ from zero. This Coulomb energy is related to the value obtained when the dipole moment is zero, U⁰, according to the following expression:

$$U = U^0 + \frac{2\pi}{3V}\left(\mu_x^2 + \mu_y^2 + \mu_z^2\right)$$
where V is the volume of the unit cell. Considering the expression for the dipole moment in a given direction, α,

$$\mu_\alpha = \sum_i q_i r_{i\alpha}$$
where r_iα is the position of the ith ion projected on to this axis, then there is a complication. Because there are multiple images of the same ion, due to the presence of periodic boundary conditions, the dipole contribution of any given ion is an ambiguous quantity. The only way to determine the true dipole moment is to perform the sum over all ions within the entire crystal, which includes those ions at the surface. This is the origin of the electrostatic energy being a macroscopic property of the system.

While it has been stated that the electrostatic energy is convergent if the above conditions are obeyed, it is not obvious how to achieve this in practice for a general crystal structure. Various methods have been proposed, the best known of which is that of Evjen, who constructed charge-neutral shells of ions about each interacting particle. However, this is more difficult to automate for a computational implementation and is best suited to high-symmetry structures. Apart from the need to converge to a defined electrostatic energy, there is also the issue of how rapidly the sum converges, since it is required that the calculation be fast for numerical evaluation.

By far the dominant approach to evaluating the electrostatic energy is through the use of the summation method due to Ewald, which aims to accelerate the convergence by partially transforming the expression into reciprocal space. While the details of the derivation are beyond the scope of this text, and can be found elsewhere [2, 9], the concepts behind the approach and the final result will be given below. In Ewald's approach, a Gaussian charge distribution of equal magnitude, but opposite sign, is placed at the position of every ion in the crystal. Because the charges cancel, apart from the contribution from the differing
shape of the distribution, the resulting electrostatic interaction between ions is now rapidly convergent when summed in real space, and converges to the energy U^real. In order to recover the original electrostatic energy it is then necessary to compute two further terms. Firstly, the interaction of the Gaussian charge distributions with each other must be subtracted. Because of the smooth nature of the electrostatic potential arising from such a distribution, it is possible to efficiently evaluate this term, U^recip, by expanding the charge density in planewaves with the periodicity of the reciprocal lattice. Again, the energy contribution is rapidly convergent with respect to the cut-off radius within reciprocal space. Finally, there is the self-energy, U^self, that arises from the interaction of the Gaussian with itself. Mathematically, the Ewald sum is derived by a Laplace transform of the Coulomb energy, and the final expressions are given below:

$$U^{Coulomb} = U^{real} + U^{recip} + U^{self}$$

$$U^{real} = \frac{1}{2}\sum_R\sum_{ij}\frac{q_i q_j}{r_{ij}}\,\mathrm{erfc}\left(\eta^{\frac{1}{2}} r_{ij}\right)$$

$$U^{recip} = \frac{1}{2}\sum_G\frac{4\pi}{V G^2}\exp\left(-\frac{G^2}{4\eta}\right)\sum_{ij} q_i q_j \exp\left(i\mathbf{G}\cdot\mathbf{r}_{ij}\right)$$

$$U^{self} = -\sum_i q_i^2\left(\frac{\eta}{\pi}\right)^{\frac{1}{2}}$$
where R denotes a real-space lattice vector, G represents a reciprocal lattice vector, and η is a parameter that determines the width of the Gaussian charge distribution. Note that the summation over reciprocal lattice vectors excludes the case G = 0. The key to rapid convergence of the Ewald sum is to choose the optimal value of η. If the value is small, then the Gaussians are narrow and so the real-space expression converges quickly, while the reciprocal-space sum requires a more extensive summation due to the higher degree of curvature of the charge density. Choosing a large value of η obviously leads to the inverse situation. One approach to choosing the convergence parameter is to derive an expression for the total number of terms to be evaluated in real and reciprocal space for a given accuracy, and then to find the stationary point where this quantity is at a minimum. The choice of η_opt is then given by:

$$\eta_{opt} = \left(\frac{N\pi^3}{V^2}\right)^{\frac{1}{3}}$$
where N is the number of particles within the unit cell. If the target accuracy, A, is represented by the given fractional degree of convergence (e.g.,
A = 0.001 would imply that the energy is converged to within 0.1%), then the cut-off radii in real and reciprocal space are given as follows:
$$r_{opt}^{max} = \left(\frac{-\ln A}{\eta}\right)^{\frac{1}{2}}$$

$$G_{opt}^{max} = 2\left(-\eta\ln A\right)^{\frac{1}{2}}$$
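As a concrete illustration of this recipe, the short Python sketch below (a minimal illustration with invented function names, not taken from any particular simulation code) evaluates η_opt and the two cut-offs for a given cell:

    import math

    def ewald_parameters(n_atoms, volume, accuracy=1.0e-8):
        # eta_opt = (N * pi**3 / V**2)**(1/3); cut-offs from the expressions
        # in the text, with A the fractional accuracy (A < 1, so ln A < 0).
        eta = (n_atoms * math.pi**3 / volume**2) ** (1.0 / 3.0)
        log_a = math.log(accuracy)
        r_max = math.sqrt(-log_a / eta)        # real-space cut-off (Angstrom)
        g_max = 2.0 * math.sqrt(-eta * log_a)  # reciprocal-space cut-off (1/Angstrom)
        return eta, r_max, g_max

    # Example: a cell with 30 atoms and a volume of 254 cubic Angstroms.
    eta, r_max, g_max = ewald_parameters(30, 254.0)
    print(eta, r_max, g_max)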
Before leaving the evaluation of the electrostatic energy, it is important to comment on dimensionalities other than three-dimensional (3-D) periodic boundary conditions. There is an analogous approach involving a partial reciprocal-space transformation in two dimensions, due to Parry, which can be employed for slab or surface calculations [6]. For the 1-D case of a polymer, the Coulomb sum is absolutely convergent for a charge-neutral system. However, it is still beneficial to use methods that accelerate the convergence, though there is less consensus as to the most efficient technique.
2. Non-electrostatic Contributions to the Energy
While the electrostatic energy often accounts for the majority of the binding, the non-Coulombic contributions are equally critical, since they determine the position and shape of the energy minimum. As previously mentioned, there must always be a short-ranged repulsive force between ions to counter the Coulomb attraction and therefore prevent the collapse of the solid. Most work has followed the pioneering work in the field, as embodied in the Born–Mayer and Born–Landé equations for the lattice energy, by utilizing either an exponential or inverse power-law repulsive term. This gives rise to two widely employed functional forms, namely the Buckingham potential,

$$U_{ij}^{short-ranged} = A_{ij}\exp\left(-\frac{r_{ij}}{\rho_{ij}}\right) - \frac{C_{ij}}{r_{ij}^6}$$
and that due to Lennard–Jones:

$$U_{ij}^{short-ranged} = \frac{B_{ij}}{r_{ij}^m} - \frac{C_{ij}}{r_{ij}^n}$$

For the Lennard–Jones potential, the exponents m and n are typically 9–12 and 6, respectively. This latter potential can also be recast in many different forms by rewriting in terms of the well-depth, ε, and either the distance at which the potential intercepts the U_ij = 0 axis, r_0, or the position of the minimum, r_eq. Both the Buckingham and Lennard–Jones potentials have the same common features – a short-ranged repulsive term and a slightly longer-ranged attractive term. The latter contribution, often referred to as the C6 term,
arises as the leading term in the expansion of the dispersion energy between two non-overlapping charge densities.

When choosing between the use of Buckingham and Lennard–Jones potentials, there are arguments for and against both. Physically, the exponential form of the Buckingham potential should be more realistic, because the electron densities of ions decay with this shape, and so it would seem natural that the repulsion follows the magnitude of the interacting ion densities, at least for weak overlap. However, in the limit r_ij → 0 the repulsive Buckingham potential tends to A_ij, i.e., a constant value that is unphysically low for nuclear fusion! Worse still, if the coefficient C_ij is non-zero, then the potential, while initially repulsive, goes through a maximum and then tends to −∞ – a result that is physically absurd. In contrast, the Lennard–Jones potential behaves sensibly and tends to +∞ as long as m > n. While the false minimum of the Buckingham potential is not usually a problem for energy minimization studies, it can be an issue in molecular dynamics, where there is a finite probability of the system gaining sufficient kinetic energy to overcome the repulsive barrier.

There is a further solution to the problems of the Buckingham potential at small distances. The problems arise from the simple power-law expression for the dispersion energy. However, this expression is also incorrect at short range, since the electron densities begin to overlap, leading to a reduction of the dispersion contribution. This can be accounted for by explicitly damping the C6 term as the distance tends to zero, and the most widely used approach for doing this is to adopt the form proposed by Tang and Toennies:
$$U_{ij}^{C_6} = -\left[1 - \exp\left(-b r_{ij}\right)\sum_{k=0}^{6}\frac{(b r_{ij})^k}{k!}\right]\frac{C_{ij}}{r_{ij}^6}$$
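To make the comparison between these functional forms concrete, the Python sketch below (a minimal illustration with function names of our own choosing) evaluates the Buckingham, Lennard–Jones, and Tang–Toennies-damped dispersion terms; using the O shell–O shell parameters quoted in Table 2 later in this chapter, the undamped Buckingham form is already deep in its unphysical well at 0.5 Å:

    import math
    import numpy as np

    def buckingham(r, a, rho, c):
        # Buckingham potential: A*exp(-r/rho) - C/r**6 (energies in eV).
        return a * np.exp(-r / rho) - c / r**6

    def lennard_jones(r, b, c, m=12, n=6):
        # Lennard-Jones potential in the B/r**m - C/r**n form.
        return b / r**m - c / r**n

    def damped_dispersion(r, b, c):
        # C6 term damped by the Tang-Toennies function:
        # -(1 - exp(-b*r) * sum_{k=0..6} (b*r)**k / k!) * C / r**6
        br = b * r
        f6 = 1.0 - np.exp(-br) * sum(br**k / math.factorial(k) for k in range(7))
        return -f6 * c / r**6

    # O shell-O shell parameters from Table 2: A = 22764 eV, rho = 0.149 A,
    # C = 22.368 eV A^6. The first value printed is large and negative,
    # illustrating the spurious turnover at short range.
    r = np.array([0.5, 1.0, 2.0, 3.0])
    print(buckingham(r, 22764.0, 0.149, 22.368))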
Occasionally other short-ranged, two-body potentials are chosen, such as the Morse or a harmonic potential. However, these are normally selected when acting between two atoms that are bonded. In this situation, the potential is usually Coulomb-subtracted too, in order that the parameters can be directly equated with the bond length and curvature.

All the above short-ranged potentials are pairwise in form. However, there are instances where it is useful to include higher order contributions. For example, in the case of semi-ionic materials, such as silicates, where there is a need to reproduce a tetrahedral local coordination geometry, it is common to include three-body terms that act as a constraint on an angle:

$$U_{ijk} = \frac{1}{2}k_3\left(\theta_{ijk} - \theta_{ijk}^0\right)^2$$
There are also many variants on this, such as including higher powers of the deviation of the angle from the equilibrium value and the addition of an
exponential dependence on the bond lengths, so that the potential becomes smooth and continuous with respect to coordination number changes. For systems containing particularly polarizable ions, there is also the possibility of including the three-body contribution to the dispersion energy, as embodied in the Axilrod–Teller potential.

As with all materials, it is necessary to select the most appropriate force field functional form based on the physical interactions that are likely to dominate in an ionic material. While this will often consist of just the electrostatic term and a two-body short-ranged contribution for dense close-packed materials, it may be necessary to contemplate adding further terms as the degree of covalency and structural complexity increases.
3. Ion Polarization
Up to this point we have considered ions to have a frozen spherical electron density that may be represented by a point charge. While this is a reasonable representation of many cations, it is not as accurate a description for anions, which tend to be much more polarizable. This can be readily appreciated for the oxide ion, O²⁻, in particular. In this case, the first electron affinity of oxygen is favourable, while the second electron affinity is endothermic due to the Coulomb repulsion between electrons. Consequently, the second electron is only bound by the electrostatic potential due to the surrounding cations, and therefore the distribution of this electron will be strongly perturbed by the local environment. It is therefore natural to include the polarizability of anions, and even some larger cations, in ionic potential models when reliable results are required.

While polarization may occur to arbitrary order, here the focus will be on the dipole polarizability, α, which is typically the dominant contribution. In the presence of an electric field, E, the dipole moment, μ, generated is given by

$$\mu = \alpha E$$

and the polarization energy, U^dipolar, that results is:

$$U^{dipolar} = -\frac{1}{2}\alpha E^2$$

The electric field at an ion is given by the first derivative of the electrostatic potential with respect to the three Cartesian directions, and therefore can be calculated from the Ewald summation for a bulk material. In principle, it is then straightforward to apply the above point ion polarizability correction to the total energy of a simulation. However, it introduces extra complexity since
the induced dipole moments will also generate an electric field at all other ions in the system. Hence, it is necessary to consider the charge–dipole and dipole–dipole interactions as well. The whole procedure involves iteratively solving for the dipole moments on the ions until self-consistency is achieved, in a manner analogous to the self-consistent field procedure that occurs in quantum mechanical methods.

There is one disadvantage to the use of point ion polarizabilities, as described above, which is that the value of α is a constant. Physically, the more polarized an ion becomes, the harder it should be to polarize it further, and so the induced dipole is prevented from reaching extreme values. If the polarizability is a constant, a so-called polarization catastrophe can occur, in which the total electrostatic energy becomes exothermic faster than the repulsive energy increases, leading to the collapse of two ions onto each other. This is particularly problematic with the Buckingham potential, since the energy at zero distance tends to −∞.

An alternative description of dipolar ion polarization that addresses the above problem is the shell model introduced by Dick and Overhauser [4]. Their approach is to create a simple mechanical model for polarization by dividing each ion into two particles, known as the core and the shell. Here the core can be conceptually thought of as representing the nucleus and core electrons, while the shell represents the more polarizable valence electrons. Thus the core is often positively charged, while the shell is negatively charged, though when utilizing a shell model for a cation it is not uncommon for both core and shell to share the positive charge. Both particles are Coulombically screened from each other and only interact via a harmonic restoring force:

$$U^{core-shell} = \frac{1}{2}k_{cs}r_{cs}^2$$

where r_cs is the distance between the core and shell. There are two important consequences of the shell model approach. Firstly, because the shell enters the simulation as a point particle, the achievement of electrostatic self-consistency is transformed into a minimization of the shell coordinates. Consequently, this is achieved concurrently with the optimization of the real atomic positions (namely the core positions), though at the cost of doubling the number of variables. While this significantly increases the time required to invert the Hessian matrix, assuming Newton–Raphson optimization is being employed, the convergence rate is also enhanced through all the information on the coupling of coordinates with the polarization being utilized. Secondly, it is the usual convention for the short-ranged potentials to act on the shell of a particle, rather than on the core, which leads to the polarizability becoming environment dependent. If the force constant (second derivative) of the short-range potential acting on the shell is k_SR and the shell charge is
q_shell, the polarizability of the ion is equal to:

$$\alpha = \frac{q_{shell}^2}{k_{cs} + k_{SR}}$$
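The iterative self-consistency procedure described above for point-ion dipoles is simple to express in code. The Python sketch below is our own minimal illustration for a finite cluster of point ions in atomic units, without the Ewald sum that a periodic calculation would require; for the shell model, the polarizability would instead follow from the α expression just given.

    import numpy as np

    def induced_dipoles(pos, alpha, q, max_iter=200, tol=1.0e-10):
        # Iterate point-ion dipoles to self-consistency: mu_i = alpha_i * E_i,
        # where E_i is the field at ion i from all fixed charges and all other
        # induced dipoles (atomic units, finite cluster, no periodic images).
        n = len(pos)
        mu = np.zeros((n, 3))
        for _ in range(max_iter):
            mu_old = mu.copy()
            field = np.zeros((n, 3))
            for i in range(n):
                for j in range(n):
                    if i == j:
                        continue
                    rij = pos[i] - pos[j]
                    r = np.linalg.norm(rij)
                    rhat = rij / r
                    field[i] += q[j] * rij / r**3               # from charges
                    field[i] += (3.0 * np.dot(mu[j], rhat) * rhat
                                 - mu[j]) / r**3                # from dipoles
            mu = alpha[:, None] * field
            if np.max(np.abs(mu - mu_old)) < tol:
                break
        return mu

    # Hypothetical three-ion cluster; the anion polarizability of 2.0 a.u.
    # and the geometry are invented for illustration.
    pos = np.array([[0.0, 0.0, 0.0], [3.5, 0.0, 0.0], [0.0, 3.5, 0.0]])
    print(induced_dipoles(pos, alpha=np.array([2.0, 0.0, 0.0]),
                          q=np.array([-2.0, 1.0, 1.0])))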
Special handling of the shell model is required in some simulations. In particular, for molecular dynamics the presence of a particle with no mass potentially complicates the solution of Newton's equations of motion. However, there are two solutions to this that parallel the techniques found in electronic structure methods. One approach is to divide the atomic mass so that a small fraction is attributed to the shell instead of the core. If chosen to be small enough, the frequency spectrum of the shells lies higher than any mode of the real material, such that the shells are largely decoupled from the nuclear motions. The disadvantage of this is that a smaller timestep is required in order to achieve an accurate integration. Alternatively, the shells can be minimized at every timestep in order to follow the adiabatic surface. Although the same timestep can now be used as for core-only dynamics, the cost per move is greatly increased. Similarly, in lattice dynamics it is also necessary to consider the contribution from relaxation of the shell positions to the dynamical matrix, which will act to soften the energy surface.

Both point ion polarizabilities and the shell model have benefits for interatomic potential simulations of ionic materials. Firstly, they act to stabilize lower symmetry structures, and hence it would not be possible to reproduce the structural distortions of various materials without their inclusion. Secondly, they make it possible to determine many materials properties that intrinsically have a strong electronic component. For instance, both the low and high frequency dielectric constant tensors may be calculated, where the former is determined by both the electronic and nuclear contributions, while the latter is purely dependent on the contribution from the polarization model.
4. Derivation of Ionic Potentials
So far, the typical functional form of the interaction energy in ionic materials has been described, without discussing how the parameter values are arrived at within the model. Many aspects are similar to general force field derivation as practiced for organic and inorganic systems, be they ionic or not. However, there are a few differences that will be highlighted below.

Given the dominance of the electrostatic contribution for ionic materials, the starting point for any force field is to determine the nature of the point charges to be employed. There are two broad approaches – either to employ the formal valence charge or to choose smaller partial charges. The main advantages of formal charges are that they remove a degree of freedom from the fitting process and also ensure wide compatibility of force fields, in
that parameters from binary compounds can be combined to model ternary or more complex phases where the cations do not have the same formal valence charge. Furthermore, when studying defects in materials, the vacancy, interstitial or impurity will be guaranteed to carry the correct total charge. On the other hand, for materials with a formal valence of greater than +2 it is argued that formal charges are unrealistic, and so partial charges must be used. Indeed, Mulliken charges from ab initio calculations do suggest that such materials are not fully ionic. However, the Mulliken charge is only one of several charge partitioning schemes. Arguably more pertinent measures of ionicity are the Born effective charges that describe the response of the charge density to an electric field. For a solid, where it is not possible to determine the charges that best reproduce the external electrostatic potential, as would be the case for molecules, considering the dipolar response is the next best thing. It is often the case that formal charges, in combination with a shell model for polarization, yield very similar Born effective charges to periodic density functional calculations [6]. Consequently, for low symmetry structures at least, both formal and partial charges can be equally valid in a well derived model.

Having determined the charge states of the ions, it is then necessary to derive the short-range and other parameters for the force field by fitting. Parameter derivation falls into one of two classes, being based on the use of either theoretical or experimental data. While truly ab initio parameter derivation is desirable, most theoretical procedures are subject to systematic errors, and so empirical fitting to experimental information has tended to be prevalent. Fitting consists of specifying a training set of observable quantities, that may be derived theoretically or experimentally, and then varying the parameters in a least-squares procedure in order to minimize the discrepancy between the calculated and observed values [5]. Typically, the training set would consist of one or more structures that represent local energy minima (i.e., stable states with zero force) and data that provide information as to the curvature of the energy surface about these minima, such as bulk moduli, elastic constants, dielectric constants, phonon frequencies, etc. Ideally, multiple structures and as much data as possible should be included in the procedure in order to maximize transferability and to constrain the parameters to physically sensible values. Because it is possible to weight the observables according to their reliability or importance, there can never be a single unambiguous fit.

In the above brief statement of what fitting is, it is given that the structural data are to be used as an observable. However, there are several distinct ways in which this can be done. If the force field is a perfect fit, then the forces calculated at the observed experimental, or theoretically optimized, structure should be zero. Hence it is common to use the forces determined at this point as the observable for fitting, rather than the structure per se, since they are straightforward to calculate. In practice, the quality of the fit is usually imperfect and so there will be residual forces. Lowering the forces does not guarantee that the
discrepancy in the optimized structural parameters will be minimized, though, since this also depends on the curvature. Assuming that the system is within the harmonic region, the errors in the structure, Δx, will be related to the residual force vector, f^resid, according to

$$\Delta x = H^{-1} f^{resid}$$

where H is the Hessian matrix containing the second derivatives. Thus one approach to directly fitting the structure is to use the above expression for the errors in the structure. Alternatively, the structure can be fully optimized for each evaluation of the fit quality, which is considerably more expensive, but guaranteed to be reliable regardless of whether the energy surface is quadratic or not. This latter method, referred to as relaxed fitting, also possesses the advantage that any curvature-related properties can be evaluated for the structure of zero force, such that the harmonic expressions employed are truly valid.

The case of a shell model fit deserves special mention here, since the issues do not usually arise during fits to other types of model. Because of the mapping of dipoles to a coordinate-space representation, there is the question of how to handle the shell positions during a fit. Given that the cores are equated with the nuclear positions, and that it is difficult to ascribe atom-centered dipoles in a crystal, there is rarely any information on where the shells should be sited. In a relaxed fit the issue disappears, since the shells just optimize to the position of minimum force. For a conventional force-based fit, the shells must either still be relaxed explicitly at each evaluation of the sum of squares, or their coordinates can be included as variable parameters such that the relaxation occurs concurrently with the fitting process.

Theoretical derivation of parameters can either closely resemble empirical fitting, by inputting calculated observables, or alternatively an energy hypersurface can be utilized. In this latter case many different structures, usually sampled from around the energy minima, are specified along with their corresponding energies. As a result, the curvature of the energy surface is fitted directly, rather than by assuming harmonic behavior about the minimum. Again the issue of weighting is particularly important, since it tends to be more crucial to ensure a good quality of fit close to the minimum, at the expense of points that are further away. To date it has been more common to utilize quantum mechanical data for finite clusters in potential derivation, rather than directly fitting solid-state ab initio information. However, this introduces uncertainties, since it is not clear how transferable gas-phase cluster data will be to bulk materials, given that clusters are dominated by surface effects.

There are two further theoretical methods for parameter derivation that deserve a mention, namely electron gas methods and rule-based methods. The first is particularly significant, since it was a popular approach in the early days of the computer simulation of ionic materials at the atomistic level. In the electron gas method, the energy of overlapping frozen ion electron densities
is calculated according to density functional theory as a function of distance. These energies can then be used directly via splines or fitted to a functional form. Given that not all ions, such as O²⁻, are stable in vacuo, the ion densities were usually determined in an appropriate potential well to mimic the lattice environment. The results obtained directly from this procedure were not always accurate, given the limitations of density functional theory, so often the distance dependence was shifted to improve the position of the minimum. The second alternative theoretical approach is to use rules that encapsulate how to determine interactions from atomic properties, such as the polarizability and atomic radius, in order to generate force fields of universal applicability. Of course, this compromises the accuracy of the results for any given system, but it can be useful for systems where there is little known data to fit to.
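As a small illustration of the least-squares procedure described in this section, the Python sketch below (a toy example with invented target values and weights, not a production fitting code) fits the A and ρ parameters of a Buckingham-type repulsion acting between a formally charged cation–anion pair, so as to minimize the residual force at an observed separation while matching a target curvature:

    import numpy as np
    from scipy.optimize import least_squares

    # Toy training set (invented numbers): a cation-anion pair with formal
    # charge product q_i*q_j = -4, an observed equilibrium separation R0
    # (a zero-force point) and a target curvature K0 at that separation.
    KE = 14.4            # e^2/(4*pi*eps0) in eV Angstrom
    QQ = -4.0            # product of the formal charges
    R0, K0 = 2.0, 30.0   # Angstrom, eV/Angstrom^2

    def residuals(params, w_force=10.0, w_curv=1.0):
        # Weighted residuals: the force at the observed structure (which
        # should vanish) and the error in the curvature there.
        a, rho = params
        du = -KE * QQ / R0**2 - (a / rho) * np.exp(-R0 / rho)
        d2u = 2.0 * KE * QQ / R0**3 + (a / rho**2) * np.exp(-R0 / rho)
        return [w_force * du, w_curv * (d2u - K0)]

    fit = least_squares(residuals, x0=[1000.0, 0.3],
                        bounds=([1.0, 0.05], [1.0e6, 1.0]))
    print("A = %.1f eV, rho = %.4f Angstrom" % tuple(fit.x))

In a real fit, each residual evaluation would instead involve a full lattice-energy calculation, or a complete structure optimization in the relaxed-fitting case.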
5. Applications of Ionic Potentials
Having defined the appropriate force field for a material, it is then possible to calculate many different properties in a very straightforward fashion. Simulations can be broadly divided into two categories – static and dynamic. In a static calculation, the structure of a material is optimized to the nearest local minimum, which may represent one desired polymorph of a system, as opposed to the global minimum, and then the properties are derived by consideration of the curvature about that position. For example, many of the mechanical, vibrational and electrical response properties are all functions of the second derivatives of the energy with respect to atomic coordinates and lattice strains. For pair potentials, the determination of these properties is not dramatically more expensive than the evaluation of the forces, with the exception of matrix inversions that may be required once the second derivative matrix has been calculated. This is in contrast to quantum mechanical methods, where the determination of the wavefunction derivatives makes analytical property calculations almost as expensive as finite difference procedures.

In a dynamical simulation, the probability distribution, composed of many different nuclear configurations, is sampled to provide averaged properties that depend on temperature. This usually involves performing either molecular dynamics (in which case the time correlation between data is known) or Monte Carlo (where configurations are selected randomly according to the Boltzmann distribution). Fundamentally, static and dynamic methods differ because the former are founded within the harmonic approximation, while the latter allow for anharmonicity. For the purposes of this section, the focus will be placed on the static information that can be obtained from ionic potentials, but stochastic simulations would be equally applicable.

The first information to be yielded by an energy minimization is the equilibrium structure. Given that many potentials are
fitted to such data, it is not surprising that the quality of structural reproduction, at least for simple binary materials, is usually high. Many force fields are derived without explicit reference to temperature, so consequently the structure that is calculated may contain implicit temperature effects, even though the optimization was performed nominally at zero Kelvin.

As an example of the application of the formal charge, shell model potential, a set of parameters has been derived for alumina. The observables used consisted of the structure of corundum and its elastic and dielectric constants. As a starting model, the parameters originally derived by Catlow et al. [1] were used and subjected to the relaxed fitting approach. Alumina is a material that has been much studied already, so the aim here is just to illustrate typical results yielded by a fit to such a material and some of the related issues. Values of the calculated properties for corundum, α-Al₂O₃, are given in Table 1, along with the comparison against experiment, using the potentials derived, which are given in Table 2.

Table 1. Calculated versus experimental structure and properties for aluminium oxide in the corundum structure, based on a shell model potential fitted to the same experimental data

Observable     Experiment   Calculated
a (Å)          4.7602       4.9084
c (Å)          12.9933      12.9778
Al z (frac)    0.3522       0.3597
O x (frac)     0.3062       0.2987
C11 (GPa)      496.9        567.1
C12 (GPa)      163.6        224.6
C13 (GPa)      110.9        158.1
C14 (GPa)      -23.5        -54.3
C33 (GPa)      498.0        453.3
C44 (GPa)      147.4        127.6
C66 (GPa)      166.7        171.2
ε0_11          9.34         8.70
ε0_33          11.54        13.38
ε∞_11          3.1          2.88
ε∞_33          3.1          3.06

Table 2. Interatomic potential parameters derived for alumina based on relaxed fitting to the experimental observables given in Table 1. The starting parameters were taken from Catlow et al., and a two-body cut-off distance of 16.0 Å was employed, while that for the core–shell interaction was 0.8 Å. All non-Coulombic interactions not explicitly given are implicitly zero. The shell charges for Al and O were −0.0395 and −2.0816, respectively

Species 1    Species 2    A (eV)      ρ (Å)      C (eV Å⁶)   k_cs (eV Å⁻²)
Al shell     O shell      1012.17     0.32709    0.0         –
O shell      O shell      22764.00    0.14900    22.368      –
Al core      Al shell     –           –          –           331.958
O core       O shell      –           –          –           24.625

Before considering the results, let us consider the parameters that resulted from the fit, since they highlight a number of points. Firstly, by looking at the shell charges and spring constants it can be seen that the oxide ion is responsible for most of the polarizability of the system, as would be expected. This is a natural result of the fitting process, since the charge distribution between core and shell, as well as the spring constant, was allowed to vary. Secondly, in accord with this picture the attractive dispersion term for Al–O is set to zero, though even if allowed to vary it remains small. Finally, the oxygen–oxygen
repulsive term is particularly short-ranged and only makes a minute contribution at the equilibrium structure. Consequently, the A and ρ values are rarely varied from the original starting values. The rhombohedral corundum structure is sufficiently complex that even though the potential was empirically fitted to this particular system it is still not possible to achieve a perfect fit. While for many dense high symmetry ionic compounds it is possible to obtain accuracy of better than 1% for structural parameters, the moment there are appreciable anisotropic effects it becomes more difficult. This is illustrated by corundum where it is impossible with the basic shell model to accurately describe the behavior in the ab plane and along the c axis simultaneously, leading to an error of 3% in the a and b cell parameters. Not only is this true for the structure, but it is even more valid for the curvature related properties. If the values of C11 and C33 are compared, which are indicative of the elastic behavior in the two distinct directions, the calculated values have to achieve a compromise by one value being higher than experiment, while the other is lower. In reality, alumina is elastically fairly isotropic, but a dipolar model cannot capture this. The above results for alumina also illustrate the fact that while it is usually possible to reproduce structural parameters to within a few percent, the errors associated with other properties can be considerably greater. As pointed out earlier, although a formal charge model for alumina was employed, the ions in fact behave as though the system is less than fully ionic due to the polarizability. The calculated Born effective charges show that aluminium has a reduced ionicity with a charge of +2.32 in the ab plane and a slightly higher value of +2.55 parallel to the c axis. These magnitudes are in good agreement with assessments of the degree of ionicity of corundum obtained from ab initio calculations. There are many more bulk properties that can be readily determined from interatomic potentials than those given above. For instance, phonon
frequencies, dispersion curves and densities of states, acoustic velocities, thermal expansion coefficients, heat capacities, entropies and free energies can all be obtained from determining the dynamical matrix about an optimized structure [6]. Other important quantities can also be determined by creating defects in the system, such as vacancies, interstitials and grain boundaries, or by locating other stationary points, in particular transition states for ion diffusion. The possibilities are as boundless as the number of physical processes that can occur in a real material.
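By way of illustration, once the second derivative matrix is available, harmonic vibrational frequencies follow from a single diagonalization. The Python sketch below is a minimal gamma-point illustration of our own, with unit conventions we have chosen; it mass-weights a Cartesian Hessian and extracts the mode frequencies:

    import numpy as np

    def vibrational_frequencies(hessian, masses):
        # Gamma-point frequencies from a 3N x 3N Cartesian Hessian (eV/A^2)
        # and per-atom masses (amu), assuming the structure is optimized so
        # that first derivatives vanish; negative values flag imaginary modes.
        m = np.repeat(np.asarray(masses, float), 3)
        dyn = hessian / np.sqrt(np.outer(m, m))  # mass-weighted dynamical matrix
        evals = np.linalg.eigvalsh(dyn)
        to_thz = 15.633                          # sqrt(eV/A^2/amu) -> THz
        return np.sign(evals) * np.sqrt(np.abs(evals)) * to_thz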
6. Discussion
So far, the basic ionic potential approach to the modeling of solids has been portrayed. While this is very successful for many of the materials for which it was intended, and which composed the majority of the earlier studies, there are increasingly many situations where extensions and modifications are required in order to broaden the scope of the technique. These enhancements recognize the fact that many systems comprise atoms that are less than fully ionic and often non-spherical.

One of the most limiting aspects of the ionic model is the use of fixed charges. It is often the case that potential parameters are derived for the bulk material alone, where a compound is at its most ionic. However, the ideal force field should also be transferable to lower coordination environments, such as surfaces and even gas-phase clusters. Fundamentally, the problem with any fixed charge model, be it formally or partially charged, is that it cannot reproduce the proper dissociation limit of the interaction. Ultimately, if sufficiently far removed from each other, an ionic structure should transform into separate neutral atoms.

There is a more sophisticated way of determining partial charges within a force field that addresses the above issue, which is to calculate them as an explicit function of geometry. While this has only been sparsely utilized to date, due to the extra complexity, it has the potential to capture, through charge transfer, many of the higher order polarizabilities beyond the dipole level, as well as yielding the proper dissociation behavior. The predominant approach to determining the charges has been via electronegativity equalization [8]. Here the self-energy of an ion is expressed as a quadratic function of the charge in terms of the electronegativity, χ, and hardness, μ:

$$U_i^{self}(q) = U_i^{self}(0) + \chi_i q + \frac{1}{2}\mu_i q^2$$

When coupled to the electrostatic energy of interaction between the ions, and solved subject to the condition of charge neutrality for the unit cell, this determines the charges on the ions.
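For a finite cluster, the electronegativity-equalization charges follow from a single linear solve, since the energy above is quadratic in the charges. The Python sketch below is our own minimal illustration; the parameter values are invented, and a bare 1/r interaction is used even though, as noted below, damped interactions are often preferred:

    import numpy as np

    def qeq_charges(pos, chi, hardness, q_total=0.0, ke=14.4):
        # Minimize sum_i (chi_i*q_i + 0.5*mu_i*q_i**2) plus the pairwise
        # ke*q_i*q_j/r_ij Coulomb energy, subject to sum_i q_i = q_total,
        # via a Lagrange multiplier: one (n+1) x (n+1) linear solve.
        n = len(pos)
        m = np.zeros((n + 1, n + 1))
        for i in range(n):
            m[i, i] = hardness[i]
            for j in range(i + 1, n):
                m[i, j] = m[j, i] = ke / np.linalg.norm(pos[i] - pos[j])
        m[:n, n] = -1.0   # column for the equalized electronegativity
        m[n, :n] = 1.0    # charge-conservation row
        rhs = np.concatenate([-np.asarray(chi, float), [q_total]])
        return np.linalg.solve(m, rhs)[:n]

    # Hypothetical diatomic with invented chi and hardness values (eV, eV/e^2):
    pos = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
    print(qeq_charges(pos, chi=[4.0, 8.0], hardness=[7.0, 10.0]))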
The main variation between schemes is the form selected for the Coulomb interaction between ions. While some workers have used the limiting point-charge interaction of 1/r at all distances, it has been argued that damped interactions should be used that more realistically mimic the nature of two-centre integrals (i.e., tend to a constant value as r → 0). Variable charge schemes have shown some promise, and have clear advantages since they allow multiple oxidation states to be treated with a single set of parameters, at least in principle. This simplifies the study of materials where the same cation occurs in multiple oxidation states, since no prior assumption needs to be made as to the charge-ordering scheme. However, there are still many challenges in this area, since it appears that choosing the more formally correct screened Coulomb interaction leads to the electrostatics only contributing weakly to the interionic forces, to an extent that is unrealistic.

Looking beyond dipolar polarizability, which is a limitation of the most widely used form of ionic model, there are instances where higher order contributions are important. Here, we consider two examples that highlight the issues. Experimentally it is observed that many cubic rock-salt structured materials exhibit a so-called Cauchy violation, in that the elastic constants C12 and C44 are not equivalent. It has been demonstrated that two-body potential models are unable to reproduce this phenomenon, and the inclusion of dipolar polarizability fails to improve the situation. The Cauchy violation actually requires a many-body coupling of the interactions through a higher order polarization. This can be handled through the inclusion of a breathing shell model. Here the shell is given a finite radius that is allowed to vary, with a harmonic restoring force about an equilibrium value, with the repulsive short-ranged potential also acting on it. This non-central ion force generates a Cauchy violation, though always of one particular sign (C44 > C12), while the experimental values can be in either direction.

A second example of the role of polarization is in the stability of polymorphs of alumina. If the relative energies of alumina adopting different possible M₂O₃ structures are examined using most standard interatomic potential models, including that given in the previous section, then it is found that the corundum structure (which is the experimental ground state under ambient conditions) is not the most stable, with the bixbyite form being preferred. Investigations have demonstrated that the inclusion of quadrupolar polarizability is essential here [7]. This can be readily achieved within the point ion approach, but is more difficult in the shell model case. While an elliptical breathing shell model can capture the effect, it highlights the fact that the extension of this mechanical approach to higher order terms becomes increasingly cumbersome.

While most alkali and alkaline earth metals conform reasonably well to the ionic model, there are substantial problems with describing many of the remaining cations in the periodic table. In particular, transition metal ions
are often non-spherical due to the partial occupancy of the d-orbitals. The classic example of this is when the anti-bonding e_g* orbitals of an octahedral ion are half-filled for a particular spin, giving rise to a Jahn–Teller distortion, as is the case for Cu²⁺. To describe this effect with a simple potential model is impossible, except by constructing a highly specific model with different short-ranged potentials for each metal–ligand interaction, regardless of the fact that they may be acting between the same species. So far, the only solution to the problem of ligand-field effects has been to resort to approaches that mimic the underlying quantum mechanics, but in an empirical fashion. Hence, most work has utilized the angular overlap model to describe a set of energy levels that are subsequently populated according to a Fermi–Dirac distribution, where the states are determined by diagonalizing a 5 × 5 matrix constructed according to the local environment [11]. This approach has been successfully used to describe the manganate (Mn³⁺, d⁴) cation, as well as other systems within a molecular mechanics framework.

At the heart of the ionic potential method is the electrostatic energy, normally evaluated according to the Ewald sum when working within 3-D boundary conditions. However, this approach possesses the disadvantage that it scales at best as N^{3/2}, where N again represents the number of atoms within the simulation cell. In an era when very large scale simulations are being targeted, it is necessary to reassess the underlying algorithms to ensure that optimal efficiency is attained. Consequently, the fundamental task of calculating the Coulomb energy is still an area of active research. Approaches currently being employed include the particle-mesh and cell multipole methods. The desirable characteristics of an algorithm are now that it should both scale linearly with system size and also be amenable to parallel computation. Both of these can be achieved as long as the method is local in real space, in some cases with complementary linear scaling in reciprocal space, or if a hierarchical scheme is utilized within the cell multipole method to make the problem increasingly coarse-grained the greater the distance of interaction. Methods have been proposed that use a spherical cut-off in real space alone, which naturally satisfies both desirable criteria [10]. However, it becomes difficult to achieve the defined Ewald limiting value without a considerable prefactor.
7. Outlook
The state of the art in force fields for ionic materials looks set for a gradual evolution that sees it take on board many concepts from other types of system, while retaining the aim of an accurate evaluation of the electrostatic energy at the core. For the very short-ranged interactions it is likely that bond order models, widely used in the semiconductor and hydrocarbon fields, and
also closely related to the approach taken for metallic systems, will be blended with schemes that capture the variation of the charge and higher order multipole moments as a function of structure. The result will be force fields that are capable of simulating not only one category of material, but several distinct ones. Development of solid-state quantum mechanical methods to increased levels of accuracy will increasingly provide the wealth of information required for the parameterisation of more complex interatomic potentials, especially for systems where there is a paucity of experimental data. Ultimately, this will lead to a seamless transition to models capable of reliably describing interfaces between ionic and non-ionic systems – currently one of the most challenging problems in materials science.
References

[1] C.R.A. Catlow, R. James, W.C. Mackrodt, and R.F. Stewart, "Defect energetics in α-Al₂O₃ and rutile TiO₂," Phys. Rev. B, 25, 1006–1026, 1982.
[2] C.R.A. Catlow and W.C. Mackrodt, "Theory of simulation methods for lattice and defect energy calculations in crystals," Lecture Notes in Phys., 166, 3–20, 1982.
[3] S.W. de Leeuw, J.W. Perram, and E.R. Smith, "Simulation of electrostatic systems in periodic boundary conditions. I. Lattice sums and dielectric constants," Proc. R. Soc. London, Ser. A, 373, 27–56, 1980.
[4] B.G. Dick and A.W. Overhauser, "Theory of the dielectric constants of alkali halide crystals," Phys. Rev., 112, 90–103, 1958.
[5] J.D. Gale, "Empirical potential derivation for ionic materials," Phil. Mag. B, 73, 3–19, 1996.
[6] J.D. Gale and A.L. Rohl, "The general lattice utility program (GULP)," Mol. Simul., 29, 291–341, 2003.
[7] P.A. Madden and M. Wilson, "'Covalent' effects in 'ionic' systems," Chem. Soc. Rev., 25, 339–350, 1996.
[8] W.J. Mortier, K. van Genechten, and J. Gasteiger, "Electronegativity equalization: applications and parameterization," J. Am. Chem. Soc., 107, 829–835, 1985.
[9] M.P. Tosi, "Cohesion of ionic solids in the Born model," Solid State Phys., 16, 1–120, 1964.
[10] D. Wolf, P. Keblinski, S.R. Phillpot, and J. Eggebrecht, "Exact method for the simulation of Coulombic systems by spherically truncated, pairwise r⁻¹ summation," J. Chem. Phys., 110, 8254–8282, 1999.
[11] S.M. Woodley, P.D. Battle, C.R.A. Catlow, and J.D. Gale, "Development of a new interatomic potential for the modeling of ligand field effects," J. Phys. Chem. B, 105, 6824–6830, 2001.
2.4 MODELING COVALENT BOND WITH INTERATOMIC POTENTIALS

João F. Justo
Escola Politécnica, Universidade de São Paulo, São Paulo, Brazil
Atoms, the elementary carriers of chemical identity, interact strongly with each other to form solids. It is interesting that those interactions can be directly mapped to the electronic and structural properties of the resulting materials. This connection between the microscopic and macroscopic worlds is appealing, and suggests that a theoretical atomistic model could help to model and build materials with predetermined properties. Atomistic simulations represent one of the tools that can bridge those two worlds, accessing information on microscopic mechanisms which, in many cases, could not be sampled by experiments.

One of the most important elements in an atomistic simulation is the model describing the interatomic interactions. In principle, such a model should take into account all the particles (electrons and nuclei) of the system. Quantum mechanical (or ab initio) methods provide a precise description of those interactions, but they are computationally prohibitive. As a result, simulations would be restricted to systems involving only up to a thousand (or a few thousand) atoms, which is not enough to capture many important atomistic mechanisms. Some approximation, leading to less expensive models, should be implemented. A radical approach is to describe the interactions by classical potentials, in which the electronic effects are somehow integrated out, being taken into account only implicitly. The gain in computational efficiency comes with a price: a poorer description of the interactions.

Ab initio methods will become increasingly important in materials science over the next decade. Even using the fastest computers, those methods will continue to be computationally expensive. Therefore, there is a demand for less expensive models to explore a number of important phenomena, to provide a qualitative view, and to scan for trends or insights on atomistic events, which could later be refined using ab initio methods. Developing an interatomic potential involves a combination of intuitive thinking, which comes out of our
knowledge of the nature of interatomic bonding, and theoretical input. However, there is no theory that directly provides the functional form for an interatomic potential. As a result, depending on the bonding type, considerably distinct approaches have been devised to describe interatomic interactions [1, 2]. In any case, the functional form should have a physical motivation and enough flexibility, in terms of fitting parameters, to capture the essential aspects underlying the interatomic interactions. The next sections discuss the specific case of modeling covalent bonding with interatomic potentials, and the elements which should be present to properly describe such interactions.
1. Pair Potentials
The cohesive energy (E_c) is the relevant property which quantifies cohesion in a solid. It is given by E_c(R_n, r_m), where R_n and r_m represent the degrees of freedom of the n nuclei and m electrons, respectively. While E_c could be computed by solving the quantum mechanical Schrödinger equation for the electrons of the system, one should inquire what kind of approximation could be performed to describe E_c with less expensive methods. One strategy is to average the electronic effects out, but still keep the electronic degrees of freedom explicit. One of these approaches, called the tight-binding method, provides a realistic description of bonding in solids. However, those models are still computationally too expensive, although simulations with a few thousand atoms could be performed. An extreme approach is to remove all the electronic degrees of freedom, such that E_c(R_n, r_m) ≈ E_c(R_n). In this last case, the electronic effects would be implicitly present in the functional form.

Several interatomic potentials for covalent bonding have been developed over the years. For silicon alone, which is considered the prototypical covalent material, there are more than thirty models which have been extensively used and tested [3]. This and the following sections discuss the relevant elements of an interatomic potential to describe a typical covalent material. The discussion focuses on the two most important models which have been developed for silicon [4, 5].

The cohesive energy could be determined by the atomic arrangement, in terms of a many-body expansion [6]

$$E_c = \sum_i^n V_1(R_i) + \sum_{i,j}^n V_2(R_i, R_j) + \sum_{i,j,k}^n V_3(R_i, R_j, R_k) + \cdots, \qquad (1)$$
in which the sums are over all the n atoms of the system. In principle, E_c could be determined by an infinite many-body expansion, but the computational cost scales with n^l, where l is the order at which the expansion is truncated. The one-body terms (V_1) are generally neglected, but the two-body (V_2) and
three-body (V_3) terms carry most of the relevant effects underlying bonding. While V_2 and V_3 have a simple physical interpretation, intuition for higher order terms is not so straightforward, and most models have avoided such terms.

Could the expansion (1) be truncated at the two-body term and still capture the essential properties of covalent bonding? For a long period, pair potentials were used to investigate materials properties, and revealed a number of fundamental atomistic processes. Models including higher order expansions, developed later, provided results which were qualitatively consistent with those early investigations. This sheds light on the discussion of pair potentials: although they provide an unrealistic description of covalent bonding, they still capture some of the essential aspects of cohesion.

A typical V_2 function has a strong repulsive interaction at short interatomic separations, changing to an attractive interaction at intermediate separations which goes smoothly to zero at longer distances. The V_2 interaction between atoms i and j can be written as a combination of a repulsive (V_R) plus an attractive (V_A) interaction in terms of the interatomic distance, r_ij = |R_i − R_j|.
Figure 1. The two-body interatomic potential V_2 for two models: the Lennard–Jones (full line) and the Stillinger–Weber (dashed line) potentials. The functions are plotted normalized in terms of the energy minimum (ε) and the equilibrium separation (a).
The Lennard–Jones potential, shown in Fig. 1, is an example of a pair potential used to model cohesion in a solid:

$$V_2(r) = V_R(r) + V_A(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right], \qquad (2)$$
where ε and σ are free parameters which can be fitted to properties of the material. The equilibrium interatomic distance (a) is related to the crystalline lattice parameter, while the curvature of the potential near a is directly correlated to the macroscopic bulk modulus.

The functional form in Eq. (2) is long ranged, and the computational cost scales with n². On the other hand, this cost could scale linearly with n if a cut-off function f_c(r) were used. This f_c(r) function should not substantially change the interaction in the relevant region of bonding, near the minimum of V_2(r), and should vanish at a certain interatomic distance R_c, defined as the cut-off of the interaction. Therefore, the two-body interaction is described by an effective potential V_2^eff(r) = V_2(r) f_c(r). The functional form of the Lennard–Jones potential can provide a realistic description of noble gases in condensed phases.

Although pair potentials capture some essential aspects of bonding, there are still some important elements missing in order to properly describe covalent bonding. If interatomic interactions were described only by pair potentials, there would be a gain in cohesive energy whenever an atom increased its coordination (number of nearest neighbors). Since there is no energy penalty for increasing coordination, pair potentials will always lead to close-packed crystalline structures. However, atoms in covalent materials sit in much more open crystalline structures, such as the hexagonal or the diamond cubic ones. Pair potentials alone cannot describe covalent bonding, and many-body effects must be introduced in the description of cohesion.
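A minimal sketch of such an effective potential in Python (our own illustration; the cut-off form is borrowed from the Stillinger–Weber expression that appears as Eq. (14) below, and the parameter values are arbitrary) is:

    import numpy as np

    def v2_eff(r, epsilon=1.0, sigma=1.0, mu=0.5, r_cut=2.5):
        # Effective pair interaction V2(r)*fc(r): the Lennard-Jones form of
        # Eq. (2) times a cut-off that vanishes smoothly at r_cut, with
        # fc(r) = exp[mu/(r - r_cut)] for r < r_cut and zero otherwise.
        r = np.atleast_1d(np.asarray(r, float))
        v2 = 4.0 * epsilon * ((sigma / r)**12 - (sigma / r)**6)
        fc = np.zeros_like(r)
        inside = r < r_cut
        fc[inside] = np.exp(mu / (r[inside] - r_cut))
        return v2 * fc

    print(v2_eff([1.12, 2.0, 3.0]))   # zero beyond the cut-off by construction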
2. Beyond Pair Potentials
The many-body effects [6] could be introduced in E_c by several procedures: inside the two-body expansion (pair functionals), by an explicit many-body expansion (cluster potentials), or by a combination of both (cluster functionals). Models which have been successfully developed to describe covalent systems fit into one of these categories. The Stillinger–Weber [4] and the Tersoff [5] models can be classified as a cluster potential and as a cluster functional, respectively.

In a description using only pair potentials, as given by Eq. (2), the cohesive energy of an individual bond inside a crystal is constant for any atomic coordination. However, this departs from a realistic description. Figure 2(a) shows the cohesive energy per bond as a function of atomic coordination for several crystalline structures of silicon. There is a weakening of the bond strength
Figure 2. (a) Cohesive energy per bond (E_c/bond) as a function of atomic coordination (Z). Cohesive energies are taken from ab initio calculations (diamonds), and the full and dashed lines represent fits with Z^{-1/2} and exp(−βZ²), respectively. (b) Bond order term b(Z) as a function of atomic coordination, taken from ab initio calculations (diamonds) and fitted to Z^{-1/2} (full line) and exp(−βZ²) (dashed line).
with increasing coordination, a behavior that is observed in any material. However, bond strength weakens very fast with coordination in molecular crystals and very slowly in most metals. That is why molecular solids favor very low coordinations and metals favor high coordinations. Covalent solids fall between those two extremes. The cohesive energy can be written as a sum over all the individual bonds V_ij:

$$E_c = \frac{1}{2}\sum_{i,j} V_{ij} = \frac{1}{2}\sum_{i,j}\left[V_R(r_{ij}) + b_{ij} V_A(r_{ij})\right], \qquad (3)$$
where the parameter b_ij controls the strength of the attractive interaction in V_ij. The attractive interaction between two atoms, i.e., the interaction controlling cohesion, is a function of the local environment. This dependence could be translated into a physical quantity called the local coordination (Z). As the coordination increases, valence electrons must be shared with more neighbors, so the individual bond between an atom and its neighbors weakens. Using chemistry arguments, it can be shown that the bond order term b_ij can be given as a function of the local coordination Z_i of atom i by

$$b_{ij}(Z_i) = \eta Z_i^{-1/2}, \qquad (4)$$
where η is a fitting parameter. Figure 2(b) shows the bond order term as a function of coordination for several crystalline structures. The Z^{-1/2} function is a good approximation for high coordinations, but fails for low coordinations. It has recently been shown [7] that an exponential behavior for b_ij would be more adequate. The introduction of the bond order term in V_2 considerably improves the description of cohesion in a covalent material. With this new
term, the equilibrium distance and strength of a bond are also determined by the local coordination at each atomic center.

Even using a bond order term, covalent bonding still requires a functional form with some angular dependence to stabilize the open crystalline structures. Angular functions could be introduced inside the bond order term b(Z), as developed by Tersoff [5], which becomes b(Z, θ), where θ represents the angles between adjacent bonds around each atom of the system. Another procedure is to use an explicit three-body expansion [4]. In terms of energetics, there is a parallel between two-body and three-body potentials. In the former case, there is an energy penalty for interatomic distances differing from a certain equilibrium value. In the latter case, there is a penalty for angles differing from a certain equilibrium value θ_0. The three-body potentials are generally positive, being null at the equilibrium angle. The interaction for the (i, j, k) set of atoms is described by

$$V_3(r_{ij}, r_{ik}, r_{jk}) = h(r_{ij})h(r_{ik})g(\theta_{ijk}), \qquad (5)$$
where the radial function h(r) goes monotonically to zero with increasing interatomic distance. Figure 3 shows the behavior of typical angular functions g(θ).

Figure 3. Angular function g(θ) from the Stillinger–Weber (full line) and Tersoff (dashed line) models.

The Stillinger–Weber model used a three-body expansion, and the V_3 potential was developed as a penalty function with a minimum at the tetrahedral angle (109.47°). On the other hand, the Tersoff potential introduced an angular function inside the bond order term, and the minimum of the angular term was a fitting parameter.
3. Models
Developing an interatomic potential involves several elements. The first is the functional form, which should capture all the properties of covalent bonding. The functions should have enough flexibility, in terms of the number of free parameters, to allow a description of a wide set of materials properties. The second element is the fitting procedure used to find the set of free parameters that best describes a predetermined database. The database comprises a set of crystalline properties (such as cohesive energy, lattice parameter, and elastic constants) and other specific properties (such as the formation energy of point defects) obtained from experiments or ab initio calculations. Additionally, the interatomic potential should be transferable, i.e., it should provide a realistic description of relevant configurations away from the database. Two interatomic potentials [4, 5] have prevailed over the others in studies of covalent materials. The Tersoff model is described by a two-body expansion, including a bond order term:

$$E_c = \frac{1}{2}\sum_{i \neq j} V_{ij}, \tag{6}$$

$$V_{ij} = f_c(r_{ij})\left[f_R(r_{ij}) + b_{ij}\,f_A(r_{ij})\right], \tag{7}$$
where f_R(r_ij) and f_A(r_ij) are, respectively, the repulsive and attractive terms, given by

$$f_R(r) = A\,\exp(-\lambda_1 r) \quad \text{and} \quad f_A(r) = -B\,\exp(-\lambda_2 r). \tag{8}$$

Here f_c(r) is a cut-off function which is one in the relevant region of bonding, r < S, and goes smoothly to zero in the range S < r < R. The parameters R and S, which control the range of the interactions, are fitted. The bond order term b_ij is given by
$$b_{ij} = \left(1 + \beta^n \zeta_{ij}^{\,n}\right)^{-1/2n}, \tag{9}$$

$$\zeta_{ij} = \sum_{k \neq i,j} g(\theta_{ijk})\,\exp\!\left[\alpha^3 (r_{ij} - r_{ik})^3\right], \tag{10}$$

$$g(\theta) = 1 + \frac{c^2}{d^2} - \frac{c^2}{d^2 + (h - \cos\theta)^2}, \tag{11}$$

where θ_ijk is the angle between the ij and ik bonds.
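To make the structure of Eqs. (9)–(11) concrete, the short Python sketch below evaluates the bond order b_ij for one pair from the positions of all atoms. It is only an illustration: the parameter values are placeholders, not a fitted silicon parameterization, and, following Eq. (10) as written here, no cut-off is applied inside the ζ sum.

```python
import numpy as np

# Illustrative sketch of the Tersoff bond order, Eqs. (9)-(11).
# All parameter values below are placeholders, not a fitted model.

def g_angle(cos_t, c=1.0, d=1.0, h=-1.0 / 3.0):
    """Angular function of Eq. (11)."""
    return 1.0 + c**2 / d**2 - c**2 / (d**2 + (h - cos_t)**2)

def bond_order(i, j, pos, alpha=0.0, beta=1.0e-6, n=0.8):
    """b_ij of Eqs. (9)-(10): the attraction between i and j is
    weakened by every other neighbor k of atom i."""
    r_ij_vec = pos[j] - pos[i]
    r_ij = np.linalg.norm(r_ij_vec)
    zeta = 0.0
    for k in range(len(pos)):
        if k == i or k == j:
            continue
        r_ik_vec = pos[k] - pos[i]
        r_ik = np.linalg.norm(r_ik_vec)
        cos_t = np.dot(r_ij_vec, r_ik_vec) / (r_ij * r_ik)
        zeta += g_angle(cos_t) * np.exp(alpha**3 * (r_ij - r_ik)**3)
    # (beta*zeta)**n equals beta**n * zeta**n, as in Eq. (9)
    return (1.0 + (beta * zeta)**n) ** (-1.0 / (2.0 * n))

# Example: b_ij for the central atom of a small tetrahedral cluster.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, -1.0, -1.0],
                [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]) * 0.8
print(bond_order(0, 1, pos))
```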
The Tersoff potential was fitted to several silicon polytypes and was later extended to other covalent systems, including multi-component materials. The Brenner potential [8], a model which resembles the Tersoff potential, is widely used to study hydrocarbon systems. The Stillinger–Weber potential is the most widely used model for covalent materials. It was developed as a three-body expansion:

$$E = \sum_{i,j} V_2(r_{ij}) + \sum_{i,j,k} V_3(r_{ij}, r_{ik}, r_{jk}). \tag{12}$$
The two-body term V_2(r) is given by

$$V_2(r) = A\left(\frac{B}{r^{\rho}} - 1\right) f_c(r), \tag{13}$$

where the cut-off function f_c(r) is given by

$$f_c(r) = \exp\!\left[\mu/(r - R)\right] \tag{14}$$

if r < R, and is null otherwise. The three-body potential V_3 is given by

$$V_3(r_{ij}, r_{ik}, r_{jk}) = h(r_{ij})\,h(r_{ik})\,g(\theta_{ijk}), \tag{15}$$

$$h(r) = \exp\!\left[\gamma/(r - R)\right], \tag{16}$$

$$g(\theta) = \left(\cos\theta + \tfrac{1}{3}\right)^2. \tag{17}$$
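A minimal Python sketch of Eqs. (13)–(17) follows; the numeric parameters are illustrative placeholders in reduced units, not the published silicon set.

```python
import numpy as np

# Illustrative Stillinger-Weber terms, Eqs. (13)-(17).
# A, B, rho, mu, gamma, R_CUT are placeholders, not fitted values.

R_CUT = 1.8  # cut-off radius R

def f_cut(r, mu=1.0):
    """Eq. (14): smooth cut-off, exactly zero for r >= R_CUT."""
    return np.exp(mu / (r - R_CUT)) if r < R_CUT else 0.0

def v2(r, A=7.0, B=0.6, rho=4.0):
    """Two-body term, Eq. (13)."""
    return A * (B / r**rho - 1.0) * f_cut(r)

def h(r, gamma=1.2):
    """Radial part of the three-body term, Eq. (16)."""
    return np.exp(gamma / (r - R_CUT)) if r < R_CUT else 0.0

def v3(r_ij, r_ik, cos_theta):
    """Eqs. (15) and (17): a penalty that vanishes at the tetrahedral
    angle, where cos(theta) = -1/3."""
    return h(r_ij) * h(r_ik) * (cos_theta + 1.0 / 3.0) ** 2

# The penalty is zero at 109.47 degrees and grows away from it:
for deg in (90.0, 109.47, 120.0):
    print(deg, v3(1.0, 1.0, np.cos(np.radians(deg))))
```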
This model was fitted to properties of the diamond cubic structure and the local order of liquid silicon. Other models have been developed to describe covalent materials. Those models have used different approaches, such as functional forms with up to 50 parameters and extensive databases. Some of those models have been compared with each other, especially in the case of silicon [3]. Such comparisons revealed that no interatomic potential is suitable for all situations, so there is still room for further developments. Recently, a new model for covalent materials was developed [7] which included features of both the Tersoff and the Stillinger–Weber models. That model explicitly included bond order terms in the two-body and three-body interactions, which allowed a better description of covalent bonding compared to previous models.
4. Perspectives
Interatomic potentials will continue to play an important role in atomistic simulations. Although potentials have been successfully applied to investigate covalent materials, they still face several challenges. As new models are developed, theoretical input will increasingly prevail over empirical input. So far, the physical properties of bonding have been introduced by trial and error. Attempts to improve models have gone in the direction of trying new functional forms, going to higher-order expansions, or increasing the number of fitting parameters. This will give way to more sophisticated approaches, in which the functional forms can be directly extracted from theory. Interatomic potentials also face the challenge of describing materials with mixed bonding character (metallic, covalent, and ionic altogether). The Tersoff potential, for example, has been extended to systems with some ionic character, but still with prevailing covalent character. That model would not work for materials with stronger ionic character, which require at least the introduction of a long-range Coulomb interaction term. Finally, even if sophisticated interatomic potentials are developed, one should keep in mind that every model has limited applicability and should always be used with caution.
References

[1] A.F. Voter, “Interatomic potentials for atomistic simulations,” MRS Bulletin, 21(2), 17–19, 1996 (and additional references in the same issue).
[2] R. Phillips, Crystals, Defects and Microstructures: Modeling Across Scales, Cambridge University Press, Cambridge, UK, 2001.
[3] H. Balamane, T. Halicioglu, and W.A. Tiller, “Comparative study of silicon empirical interatomic potentials,” Phys. Rev. B, 46, 2250–2279, 1992.
[4] F.H. Stillinger and T.A. Weber, “Computer simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985.
[5] J. Tersoff, “New empirical approach for the structure and energy of covalent systems,” Phys. Rev. B, 37, 6991–7000, 1988.
[6] A.E. Carlsson, “Beyond pair potentials in elemental transition metals and semiconductors,” In: H. Ehrenreich and D. Turnbull (eds.), Solid State Physics, vol. 43, Academic Press, San Diego, pp. 1–91, 1990.
[7] J.F. Justo, M.Z. Bazant, E. Kaxiras, V.V. Bulatov, and S. Yip, “Interatomic potential for silicon defects and disordered phases,” Phys. Rev. B, 58, 2539–2550, 1998.
[8] D.W. Brenner, “Empirical potential for hydrocarbons for use in simulating the chemical vapor deposition of diamond films,” Phys. Rev. B, 42, 9458–9471, 1990.
2.5 INTERATOMIC POTENTIALS: MOLECULES

Alexander D. MacKerell, Jr.
Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, MD 21201, USA
Interatomic interactions between molecules dominate their behavior in condensed phases, including the aqueous phase in which biologically relevant processes occur [1]. Accordingly, it is essential to treat interatomic interactions accurately in theoretical approaches in order to apply such methods to the study of condensed phase phenomena. Typical condensed phase systems subjected to theoretical studies include thousands to hundreds of thousands of particles. Thus, to allow calculations on such systems to be performed, simple, computationally efficient functions, termed empirical or potential energy functions, are applied to calculate the energy as a function of structure. In this chapter an overview of potential energy functions used to study condensed phase systems will be presented, with emphasis on biologically relevant systems. This overview will include information on the optimization of these models and address future developments in the field.
1. Empirical Force Fields
Potential energy functions used for condensed phase simulation studies are comprised of simple functions that relate the structure, R, to the energy, V, of the system. An example of such a function is shown in Eqs. (1)–(3):

$$V(R)_{\text{total}} = V(R)_{\text{internal}} + V(R)_{\text{external}} \tag{1}$$

$$V(R)_{\text{internal}} = \sum_{\text{bonds}} K_b (b - b_0)^2 + \sum_{\text{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} K_\chi \left(1 + \cos(n\chi - \delta)\right) \tag{2}$$

$$V(R)_{\text{external}} = \sum_{\substack{\text{nonbonded}\\ \text{atom pairs}}} \left\{ \varepsilon_{ij}\left[\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12} - \left(\frac{R_{\min,ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{\varepsilon_D\, r_{ij}} \right\} \tag{3}$$
The total potential energy, V(R)_total, is separated into internal or intramolecular terms, V(R)_internal, and external terms, V(R)_external. The latter are also referred to as intermolecular or nonbonded terms. While interatomic interactions are dominated by the external terms, the internal terms also make a significant contribution to condensed phase properties, requiring their consideration in this chapter [2]. Furthermore, it is not just the potential energy function alone that is required for determination of the energy as a function of the structure; the parameters in Eqs. (2) and (3) are also needed. The combination of the potential energy function along with the parameters is termed an empirical force field. Application of an empirical force field to a chemical system of interest, in combination with numerical approaches allowing for sampling of relevant conformations via, e.g., a molecular dynamics (MD) simulation [3] (see below), can be used to predict a variety of structural and thermodynamic properties via statistical mechanics [4]. Importantly, such approaches allow for comparisons with experimental thermodynamic data, and the atomic details of the interatomic interactions between molecules that dictate the thermodynamic properties can be obtained. Such atomic details are often difficult to access via experimental approaches, motivating the application of computational approaches. Equations (2) and (3) represent a compromise between simplicity and chemical accuracy. The structure or geometry of a molecule is simply represented by four terms, as shown in Fig. 1. The intramolecular geometry is based on bond lengths, b, valence angles, θ, and dihedral or torsion angles, χ, that describe the orientation of 1,4 atoms (i.e., atoms connected by 3 covalent bonds). Additional internal terms may be included in a potential energy function, as described elsewhere [5, 6]. The bond stretching and angle bending terms are treated harmonically; bond and angle parameters include b_0 and θ_0, the equilibrium bond length and equilibrium angle, respectively, and K_b and K_θ, the force constants associated with the bond and angle terms, respectively. The use of harmonic terms for the bonds and valence angles is typically sufficient for molecular distortions near ambient temperatures and in the absence of bond breaking or making events, due to the bonds and angles staying close to their equilibrium values at room temperature. Dihedral or torsion angles represent the rotations that occur about a bond. These terms are oscillatory in nature (e.g., rotation about the central carbon–carbon bond in ethane changes the structure from a low energy staggered conformation, to a high energy eclipsed conformation, back to a low energy staggered conformation, and so on), requiring the use of a sinusoidal function to accurately model them.
Figure 1. Schematic diagram of the terms used to describe the structure of molecules in empirical force fields. Internal or intramolecular terms include bond lengths, b, valence angles, θ, and dihedral or torsion angles, χ. For the intermolecular interactions only the distance between atoms i and j, rij , is required.
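Before turning to the parameters themselves, a short Python sketch of the dihedral Fourier sum in Eq. (2) may help make the term concrete; the (K_χ, n, δ) values below are invented for illustration and are not taken from any published force field.

```python
import numpy as np

# Dihedral term of Eq. (2) as a sum of Fourier components.
# The (K_chi, n, delta) triples are illustrative placeholders.

def dihedral_energy(chi, terms):
    """chi in radians; terms is a list of (K_chi, n, delta) tuples."""
    return sum(K * (1.0 + np.cos(n * chi - delta)) for K, n, delta in terms)

# A threefold term (as for an sp3-sp3 bond) plus a small onefold term:
terms = [(1.4, 3, 0.0), (0.2, 1, 0.0)]
for deg in (0, 60, 120, 180):
    chi = np.radians(deg)
    print(f"chi = {deg:3d} deg  V = {dihedral_energy(chi, terms):.3f}")
```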
The dihedral angle parameters (Eq. (2)) include the force constant, K_χ, the periodicity or multiplicity, n, and the phase, δ. The magnitude of K_χ dictates the height of the barrier to rotation, such that the K_χ associated with a double bond would be significantly larger than that for a single bond. The multiplicity, n, indicates the number of cycles per 360° rotation about the dihedral. In the case of an sp3–sp3 bond, as in ethane, n would equal three, while an sp2–sp2 C=C double bond would have n equal to two. The phase, δ, dictates the location of the maxima in the dihedral energy surface, allowing the location of the minima for a dihedral with n = 2 to be shifted from 0° to 90°, and so on. Typically, δ is equal to 0° or 180°, although recent extensions allow any value from 0° to 360° to be assigned to δ [7]. Each dihedral angle in a molecule may be treated with a sum of dihedral terms that have different multiplicities, as well as force constants and phases. The use of a summation of dihedral terms for a single torsion angle, a Fourier series, greatly enhances the flexibility of the dihedral term, allowing for more accurate reproduction of experimental and quantum mechanical (QM) energetic target data. Equation (3) describes the intermolecular, external or nonbonded interaction terms, which depend on the distance, r_ij, between two atoms i and j (Fig. 1). As stated above, these terms dominate the interactions between molecules and, accordingly, condensed phase properties. Intermolecular interactions are also important for the structure of biological macromolecules
due to the large number of interactions that occur between different regions of biological polymers and dictate their 3D conformation (e.g., hydrogen bonds between Watson–Crick base pairs in DNA or between peptide bonds in α-helices or β-sheets in proteins). Parameters associated with the external terms are the well depth, ε_ij, between atoms i and j, the minimum interaction radius, R_min,ij, and the partial atomic charge, q_i. The dielectric constant, ε_D, is generally treated as equal to one, the permittivity of vacuum, although exceptions do exist when implicit solvent models are used to treat the condensed phase environment [8]. The first term in Eq. (3) is used to treat the van der Waals (vdW) interactions. The particular form in Eq. (3) is referred to as the Lennard–Jones (LJ) 6-12 term. The 1/r^12 term represents the exchange repulsion between atoms associated with overlap of the electron clouds of the individual atoms (i.e., the Pauli exclusion principle). The strong distance dependence of the repulsion is indicated by the 12th power of this term. Representing London's dispersion interactions, or instantaneous-dipole/induced-dipole interactions, is the 1/r^6 term, which is negative, indicating its favorable nature. In the LJ 6-12 equation there are two parameters: the well depth, ε_ij, dictating the magnitude of the favorable London's dispersion interactions between two atoms i and j, and R_min,ij, the distance between atoms i and j at which the minimum LJ interaction energy occurs; the latter is related to the vdW radius of an atom. Typically, ε_ij and R_min,ij are not determined for every possible interaction pair i, j. Instead, ε_i and R_min,i parameters are determined for the individual atom types (e.g., sp2 carbon vs. sp3 carbon) and then combining rules are used to create the ij cross terms. These combining rules are generally quite simple, being either the arithmetic mean (i.e., R_min,ij = (R_min,i + R_min,j)/2) or the geometric mean (i.e., ε_ij = √(ε_i · ε_j)), although other variations exist [9]. The use of combining rules greatly simplifies the determination of the ε_i and R_min,i parameters. In special cases the force field can be supplemented by specific i, j LJ parameters, referred to as off-diagonal terms, to treat interactions between specific atom types that are poorly modeled by the use of combining rules. There are several commonly used alternate forms for treatment of the vdW interactions. The three primary alternatives to the LJ 6-12 term in Eq. (3) are designed to “soften” the repulsive wall associated with Pauli exclusion, yielding better agreement with high-level QM data [9]. For example, the Buckingham potential [10] uses an exponential term to treat repulsion, while a buffered 14-7 term is used in the MMFF force field [11–13]. A simple alternative is to replace the r^12 repulsion with an r^9 term. The final term contributing to the intermolecular interactions is the electrostatic or Coulombic term. This term involves the interaction between the partial atomic charges, q_i and q_j, on atoms i and j, divided by the distance, r_ij, between those atoms, with the appropriate dielectric constant taken into account.
Use of a charge representation for the individual atoms, or monopoles, effectively includes all higher-order electronic interactions, such as dipoles and quadrupoles. As will be discussed below, the majority of force fields treat the partial atomic charges as static in nature, due to computational considerations. These are referred to as non-polarizable or additive force fields. Finally, the use of a dielectric constant, ε, of one is appropriate when the condensed phase environment is treated explicitly (i.e., use of explicit water molecules to treat an aqueous condensed phase). Combined, the Lennard–Jones and Coulombic interactions have been shown to produce an accurate representation of the interaction between molecules, including both the distance and angle dependencies of hydrogen bonds [14]. This success has allowed for the omission of explicit terms to treat hydrogen bonding from the majority of empirical force fields. It is important to emphasize that the LJ and electrostatic parameters are highly correlated, such that LJ parameters determined for a set of partial atomic charges will not be applicable to another set of charges. In addition, the values of the internal parameters depend on the external parameters. For example, the barrier to rotation about the C–C bond in ethane includes electrostatic and vdW interactions between the hydrogens as well as contributions from the bond, angle and dihedral terms. Accordingly, if the LJ parameters or charges are changed, the internal parameters will have to be adjusted to reproduce the correct energy barrier. Finally, condensed phase properties obtained from empirical force field calculations contain contributions from the conformations of the molecules being studied as well as the interatomic interactions between those molecules, emphasizing the importance of both the internal and external portions of the force field for accurate condensed phase simulations.
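The sketch below evaluates the nonbonded pair energy in the spirit of Eq. (3), with the combining rules just described. Note one assumption: the LJ term is written here in the convention with a factor of 2 on the attractive term, which places the minimum exactly at R_min,ij; the conversion constant and all parameter values are likewise illustrative.

```python
import numpy as np

# Nonbonded pair energy in the spirit of Eq. (3), with combining rules.
# The factor of 2 on the r^-6 term places the LJ minimum exactly at
# R_min,ij; parameter values below are illustrative placeholders.

COULOMB = 332.0636  # kcal*Angstrom/(mol*e^2)

def pair_energy(r, eps_i, eps_j, rmin_i, rmin_j, q_i, q_j, eps_d=1.0):
    eps_ij = np.sqrt(eps_i * eps_j)        # geometric mean
    rmin_ij = 0.5 * (rmin_i + rmin_j)      # arithmetic mean
    lj = eps_ij * ((rmin_ij / r) ** 12 - 2.0 * (rmin_ij / r) ** 6)
    coul = COULOMB * q_i * q_j / (eps_d * r)
    return lj + coul

# Example: an attractive polar pair near its LJ minimum distance.
print(pair_energy(3.5, 0.1, 0.15, 3.4, 3.6, -0.4, 0.4))
```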
2. Parameter Optimization
Due to the simplicity of the potential energy function used in empirical force fields, it is essential that the parameters in the function be optimized, allowing the force field to yield accurate results as judged by its quality in reproducing the experimental regimen. Parameter optimization is based on reproducing a set of target data. The target data may be obtained from QM calculations or experimental data. QM data are generally readily accessible for most molecules; however, limitations in the QM level of theory, especially with respect to the treatment of dispersion interactions [15, 16], require the use of experimental data when available [6]. In the rest of this article, we will focus on intermolecular parameter optimization due to the dominant role of these parameters in the interactions between molecules. Readers can obtain information on the optimization of internal parameters elsewhere [5, 11–13, 16, 17]. A large number of studies have focused on the determination of the electrostatic parameters, i.e., the partial atomic charges, q_i. The most common charge
determination methods are the supramolecular and QM electrostatic potential (ESP) approaches. Other variations include bond charge increments [19, 20] and electronegativity equalization methods [21]. An important consideration in the determination of partial atomic charges, related to the Coulombic treatment of electrostatics in Eq. (3), is the omission of explicit electronic polarizability or induction. Thus, it is necessary for the static charges to reproduce the polarization that occurs in the condensed phase. To do this, the partial atomic charges of a molecule are “enhanced,” leading to an overestimation of the dipole moment as compared to the gas phase value, yielding an implicitly polarized model. For example, many of the water models used in additive empirical force fields (e.g., TIP3P, TIP4P, SPC) have dipole moments in the vicinity of 2.2 debye [22], vs. the gas phase value of 1.85 debye for water. Such implicit polarizability allows additive empirical force fields based on Eq. (3) to reproduce a wide variety of condensed phase properties [23]. However, such models are limited when treating molecules in environments of significantly different polar character. Determination of partial atomic charges via the supramolecular approach is used in the OPLS [24, 25] and CHARMM [26–29] force fields. In this approach, the charges are optimized to reproduce QM-determined minimum interaction energies and geometries of a model compound with, typically, individual water molecules, or for model compound dimers. Historically, the HF/6-31G* level of theory was used for the QM calculations. This level typically overestimates dipole moments [30], thereby approximating the influence of the condensed phase on the obtained charge distribution and leading to the implicitly polarizable model. In addition, the supramolecular approach implicitly includes local polarization effects due to the charge induction caused by the two interacting molecules, facilitating the determination of charge distributions appropriate for the condensed phase. With CHARMM it was found that an additional scaling of the QM interaction energies prior to charge fitting was necessary to obtain the correct implicit polarization for accurate condensed phase studies of polar neutral molecules [31]. Even though recent studies have shown that QM methods can accurately reproduce gas phase experimental interaction energies for a range of model compound dimers [32, 33], it is important to maintain the QM level of theory that was historically used for a particular force field when extending that force field to novel molecules. This assures that the balance of the nonbond interactions between different molecules in the system being studied is maintained. Finally, an advantage of charges obtained from the supramolecular approach is that they are generally developed for functional groups, such that they may be transferred between molecules, allowing charge assignment to novel molecules to be readily performed.
ESP charge fitting methods are based on the adjustment of charges to reproduce a QM-determined ESP mapped onto a grid surrounding the model compound. Such methods are convenient, and a number of charge fitting methods based on this approach have been developed [34–38]. However, there are limitations in ESP fitting methods. First, the ability to unambiguously fit charges to an ESP is not trivial [37], and charges on “buried” atoms (e.g., a carbon to which three or four nonhydrogen atoms are covalently bound) tend to be underdetermined, requiring the use of restraints during fitting [36]; the latter method is referred to as Restrained ESP (RESP). Second, since the charges are based on a gas phase QM wave function, they may not necessarily be consistent with the condensed phase, although recent developments are addressing this limitation [39]. Finally, considerations of multiple conformations of a molecule, for which different charge distributions typically exist, must be taken into account [30]. It should be noted that the last two problems must also be considered when using the supramolecular approach. As with the supramolecular approach, the QM level of theory used was often HF/6-31G*, as in the AMBER force fields [41], due to that level typically overestimating the dipole moment. More recently, higher-level QM calculations have been applied in conjunction with the RESP approach [42], although their ability to reproduce condensed phase thermodynamic properties has not been tested. Clearly, both the supramolecular and ESP methods are useful for the determination of partial atomic charges. Which one is used, therefore, should be based on compatibility with that used for the remainder of the force field being applied. Accurate optimization of the LJ parameters is one of the most important aspects in the development of a force field for condensed phase simulations. Due to limitations in QM methods for the determination of dispersion interactions, optimization of LJ parameters is dominated by the reproduction of thermodynamic properties in condensed phase simulations, generally of neat liquids [43, 44]. Typically, the LJ parameters for a model compound are optimized to reproduce experimentally measured values such as heats of vaporization, densities, isothermal compressibilities and heat capacities. Alternatively, heats or free energies of aqueous solvation, partial molar volumes, or heats of sublimation and lattice geometries of crystals [45, 46] can be used as the target data. These methods have been applied extensively for the development of the force fields associated with the programs AMBER, CHARMM, and OPLS. However, it should be noted that LJ parameters are typically underdetermined due to only a few experimental observations being available for the optimization of a significantly larger number of LJ parameters. This enhances the parameter correlation problem, where LJ parameters for different atoms in a molecule (e.g., H and C in ethane) can compensate for each other, such that it is difficult to accurately determine the “correct” LJ parameters of a molecule based on the reproduction of condensed phase properties alone [5]. To overcome this problem, a method has been developed that determines the relative values of the LJ parameters based on high-level QM data [47] and the absolute values
based on the reproduction of experimental data [16, 49]. This approach is tedious, as it requires supramolecular interactions involving rare gases; however, once satisfactory LJ parameters are optimized for atoms in a class of functional groups, they can often be directly transferred to other molecules with those functional groups without further optimization.
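Returning to the ESP charge-fitting idea discussed above, the sketch below solves the unrestrained least-squares problem: charges are chosen to best reproduce a QM potential sampled on grid points, with a Lagrange multiplier enforcing the total molecular charge. It is a bare-bones illustration in atomic units, with synthetic input and none of the RESP-style restraints.

```python
import numpy as np

# Unrestrained ESP charge fit (atomic units): minimize
#   sum_k ( phi_qm[k] - sum_i q_i / |grid_k - atom_i| )^2
# subject to sum_i q_i = total_charge, via a Lagrange multiplier.

def fit_esp_charges(atom_xyz, grid_xyz, phi_qm, total_charge=0.0):
    n = len(atom_xyz)
    # Design matrix: A[k, i] = 1 / |grid_k - atom_i|
    A = 1.0 / np.linalg.norm(grid_xyz[:, None, :] - atom_xyz[None, :, :],
                             axis=2)
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A.T @ A
    M[:n, n] = 1.0          # constraint column
    M[n, :n] = 1.0          # constraint row
    rhs = np.concatenate([A.T @ phi_qm, [total_charge]])
    return np.linalg.solve(M, rhs)[:n]

# Synthetic example: a potential generated by +/-0.3 e on two "atoms".
atoms = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
rng = np.random.default_rng(0)
grid = rng.normal(scale=4.0, size=(200, 3)) + np.array([0.0, 0.0, 1.0])
true_q = np.array([0.3, -0.3])
phi = (1.0 / np.linalg.norm(grid[:, None, :] - atoms[None, :, :],
                            axis=2)) @ true_q
print(fit_esp_charges(atoms, grid, phi))  # recovers ~[0.3, -0.3]
```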
3. Considerations for Condensed Phase Simulations
Proper application of an empirical force field is obviously essential for the success of a condensed phase calculation. An important consideration is the inclusion of all nonbond interactions between all atom–atom pairs. For the electrostatic interactions this can be achieved via Ewald methods [49], including the particle mesh Ewald approach [50], for periodic systems, while reaction field methods can be used to simulate finite (e.g., spherical) systems [51–53]. For the LJ interactions, long-range corrections exist that treat the interactions beyond the atom–atom truncation distance (i.e., those beyond the distance where the atom–atom interactions are calculated explicitly) as homogeneous in nature [54, 55]. Another important consideration is the use of integrators that generate proper ensembles in MD simulations, allowing for direct comparison with experimental data [3, 57–60]. In addition, a number of methods are available to increase the sampling of conformational space [60–62]. The availability and proper use of these different methods greatly facilitates investigations of molecular interactions via condensed phase simulations.
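As one concrete example of the LJ long-range corrections just mentioned, the standard homogeneous "tail" correction to the energy of a truncated 6-12 potential (single atom type, σ/ε form) can be written as in the sketch below; this is the textbook expression (cf. [54]), shown here purely as an illustration.

```python
import numpy as np

# Homogeneous tail correction for a truncated LJ 6-12 potential:
# U_tail = (8/3)*pi*N*rho*eps*sigma^3 * [ (1/3)(sigma/rc)^9
#          - (sigma/rc)^3 ],  assuming uniform density beyond rc.

def lj_tail_energy(n_atoms, volume, sigma, epsilon, r_cut):
    rho = n_atoms / volume
    sr3 = (sigma / r_cut) ** 3
    return (8.0 / 3.0) * np.pi * n_atoms * rho * epsilon * sigma**3 \
        * (sr3**3 / 3.0 - sr3)

# Example in argon-like reduced units:
print(lj_tail_energy(n_atoms=1000, volume=1500.0, sigma=1.0,
                     epsilon=1.0, r_cut=2.5))
```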
4. Available Empirical Force Fields
A variety of empirical force fields have been developed. Force fields that focus on biological molecules include AMBER [18, 42], CHARMM [26–29], GROMOS [63, 64], and OPLS [24, 25]. All of these force fields have been parametrized to account for condensed phase properties, such that they all treat molecular interactions with a reasonably high level of accuracy [65, 66]. However, these force fields, to varying extents, do not treat the full range of pharmaceutically relevant compounds. Force fields designed for a broad range of compounds include MMFF [11–13, 67], CVFF [17, 68], the commercial CHARMm force field [69], CFF [70], COMPASS [71], the MM2/MM3/MM4 series [72–74], UFF [75], DREIDING [76], and the Tripos force field (Tripos, Inc.), among others. However, these force fields have been designed primarily to reproduce internal geometries, vibrations and conformational energies, often sacrificing the quality of the nonbond interactions [65]. Exceptions are MMFF and COMPASS, where the nonbond parameters have been investigated at a reasonable level of detail. With all force fields the user is advised to perform tests on molecules for which experimental data is available to validate the quality of the model.
5. Electronic Polarizability
Future improvements in the treatment of interatomic interactions between molecules will be based on the extension of the treatment of electrostatics to include an explicit treatment of electronic polarizability [77, 78]. There are several methods by which electronic polarizability may be included in a potential energy function. These include fluctuating charge models [79–85], induced dipole models [85–89], or a combination of those methods [90, 91]. The classical Drude oscillator is an alternative method [92, 93] in which a “Drude” particle is attached to the nucleus of each atom and, by applying the appropriate charges to the atoms and “Drude” particles, the polarization response can be modeled. This method is also referred to as the shell model and has only been used in a few studies thus far [94–96]. In all of these approaches, the polarizability is solved analytically, iteratively or, in the case of MD simulations, via extended Lagrangian methods [3, 77]. In extended Lagrangian methods the polarizability is treated as a dynamic variable in the MD simulation. Extended Lagrangian methods are important for the inclusion of polarizability in empirical force fields as they offer the computational efficiency necessary to perform simulations on large systems. To date, work on water has dominated the application of polarizable force fields to molecular interactions. Polarizable water models have been shown to accurately treat both gas and condensed phase properties [78, 86–89, 95, 97–99]. The ability to treat both the gas and condensed phases accurately marks a significant improvement over force fields where polarizability is not included explicitly. Other examples where the inclusion of electronic polarization has been shown to increase the accuracy of the treatment of molecular interactions include the solvation of ions [79, 85, 100, 101], ion-pair interactions in micellar systems [102], condensed phase properties of a variety of small molecules [78, 83, 103–107], cation–π interactions [103, 104], and interfacial systems [108]. With respect to biological macromolecules, only a few successful applications have been made thus far [109–111]. Thus, the explicit treatment of electronic polarizability in empirical force fields, although computationally more expensive than nonpolarizable models, is anticipated to make a significant contribution to the understanding of molecular interactions at an atomic level of detail. An interesting observation with electronic polarizability is the apparent inability to apply gas phase polarizabilities to condensed phase systems, as evidenced in studies on water [95]. This phenomenon appears to be associated with the Pauli exclusion principle, such that the deformability of the electron
cloud due to induction by the environment is hindered by the presence of adjacent molecules in the condensed phase [112]. This would lead to a decreased effective polarizability in the condensed phase. Such a phenomenon has more recently been observed in QM studies of water clusters [113]. Further studies are required to better understand this phenomenon and to treat it properly in empirical force fields.
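To illustrate the Drude-oscillator (shell model) construction described above, the sketch below shows the defining relation: a Drude charge q_D on a harmonic spring of force constant k_D reproduces an atomic polarizability α = q_D²/k_D, and its displacement in a field E yields the induced dipole μ = αE. The field is taken as a fixed external field for clarity; in a real simulation it would be computed self-consistently from all charges. All values are illustrative.

```python
import numpy as np

# Drude oscillator in a fixed external field (illustrative units).
# Energy of the Drude particle: U(d) = 0.5*k_D*|d|^2 - q_D * E . d
# Minimizing gives d = q_D*E/k_D, i.e., an induced dipole
# mu = q_D*d = (q_D**2/k_D)*E = alpha*E.

alpha = 1.0           # target atomic polarizability
q_d = -1.0            # Drude charge (placeholder)
k_d = q_d**2 / alpha  # spring constant chosen to reproduce alpha

E = np.array([0.0, 0.0, 0.05])  # external electric field
d = q_d * E / k_d               # equilibrium Drude displacement
mu = q_d * d                    # induced dipole
print(mu, alpha * E)            # identical by construction
```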
6. Summary
Interatomic interactions involving molecules dominate the properties of condensed phase systems. Due to the number of particles in such systems, it is typically necessary to apply computationally efficient empirical force fields to study them via theoretical methods. The success of empirical force fields is based, in large part, on their accuracy in reproducing a variety of experimental observations, the accuracy being dictated by the quality of the optimization of the parameters that comprise the empirical force field. Proper optimization requires careful selection of target data as well as use of the appropriate optimization process. In cases where empirical force field parameters are being developed as an extension of an available force field, the optimization strategy must be selected to ensure consistency with the previously parameterized molecules. These considerations will maximize the likelihood that the atomistic details obtained from condensed phase simulations will be representative of the experimental regimen. Finally, when analyzing results from condensed phase simulations, possible biases due to the parameters themselves must be considered when interpreting the data.
Acknowledgments

Financial support from the NIH (GM51501) and the University of Maryland, School of Pharmacy, Computer-Aided Drug Design Center is acknowledged.
References

[1] O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel-Dekker, Inc., New York, 2001.
[2] W.L. Jorgensen, “Theoretical studies of medium effects on conformational equilibria,” J. Phys. Chem., 87, 5304–5312, 1983.
[3] M.E. Tuckerman and G.J. Martyna, “Understanding modern molecular dynamics: techniques and applications,” J. Phys. Chem. B, 104, 159–178, 2000.
[4] D.A. McQuarrie, Statistical Mechanics, Harper & Row, New York, 1976. [5] A.D. MacKerell, Jr., “Atomistic models and force fields,” In: O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe, Computational Biochemistry and Biophysics, Marcel Dekker, Inc., New York, pp. 7–38, 2001. [6] A.D. MacKerell, Jr., “Empirical force fields for biological macromolecules: overview and issues,” J. Comp. Chem., 25, 1584–1604, 2004. [7] A. Blondel and M. Karplus, “New formulation of derivatives of Torsion angles and improper Torsion angles in molecular mechanics: elimination of singularities,” J. Comput. Chem., 17, 1132–1141, 1996. [8] M. Feig, A. Onufriev, M.S. Lee, W. Im, D.A. Case, and C.L. Brooks, III, “Performance comparison of generalized born and Poisson methods in the calculation of electrostatic solvation energies for protein structures,” J. Comput. Chem., 25, 265– 284, 2004. [9] T.A. Halgren, “Representation of van der Waals (vdW) Interactions in molecular mechanics force fields: potential form, combination rules, and vdW parameters,” J. Amer. Chem. Soc., 114, 7827–7843, 1992. [10] A.D. Buckingham and P.W. Fowler, “A model for the geometries of van der Waals complexes,” Can. J. Chem., 63, 2018, 1985. [11] T.A. Halgren, “Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94,” J. Comput. Chem., 17, 490–519, 1996a. [12] T.A. Halgren, “Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions,” J. Comput. Chem., 17, 520–552, 1996b. [13] T.A. Halgren, “Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94,” J. of Comput. Chem., 17, 553–586, 1996c. [14] W.E. Reiher, Theoretical Studies of Hydrogen Bonding, Harvard University, 1985. [15] G. Chalasinski and M.M. Szczesniak, “Origins of structure and energetics of van der Waals clusters from ab initio calculations,” Chem. Rev., 94, 1723–1765, 1994. [16] I.J. Chen, D. Yin, and A.D. MacKerell, Jr., “Combined ab initio/empirical optimization of Lennard–Jones parameters for polar neutral compounds,” J. Comp. Chem., 23, 199–213, 2002. [17] C.S. Ewig, R. Berry, U. Dinur, J.R. Hill, M.-J. Hwang, H. Li, C. Liang, J. Maple, Z. Peng, T.P. Stockfisch, T.S. Thacher, L. Yan, X. Ni, and A.T. Hagler, “Derivation of class II force fields. VIII. Derivation of a general quantum mechanical force field for organic compounds,” J. Comp. Chem., 22, 1782–1800, 2001. [18] J. Wang and P.A. Kollman, “Automatic parameterization of force field by systematic search and genetic algorithms,” J. Comp. Chem., 22, 1219–1228, 2001. [19] B.L. Bush, C.I. Bayly, and T.A. Halgren, “Consensus bond-charge increments fitted to electrostatic potential or field of many compounds: application of MMFF94 training set,” J. Comp. Chem., 20, 1495–1516, 1999. [20] A. Jakalian, B.L. Bush, D.B. Jack, and C.I. Bayly, “Fast, efficient generation of highquality atomic charges. AM1-BCC model: I. Method,” J. Comp. Chem., 21, 132–146, 2000. [21] M.K. Gilson, H.S. Gilson, and M.J. Potter, “Fast assignment of accurate partial atomic charges: an electronegativity equilization method that accounts for alternate resonance forms,” J. Chem. Inf. Comp. Sci., 43, 1982–1997, 2003. [22] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, and M.L. Klein, “Comparison of simple potential functions for simulating liquid water,” J. Chem. Phys., 79, 926–935, 1983.
[23] R.C. Rizzo and W.L. Jorgensen, “OPLS all-atom model for amines: resolution of the amine hydration problem,” J. Amer. Chem. Soc., 121, 4827–4836, 1999.
[24] W.L. Jorgensen and J. Tirado-Rives, “The OPLS potential functions for proteins. Energy minimizations for crystals of cyclic peptides and crambin,” J. Amer. Chem. Soc., 110, 1657–1666, 1988.
[25] W.L. Jorgensen, D.S. Maxwell, and J. Tirado-Rives, “Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids,” J. Amer. Chem. Soc., 118, 11225–11236, 1996.
[26] A.D. MacKerell, Jr., D. Bashford, M. Bellott, R.L. Dunbrack, Jr., J. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher, III, B. Roux, M. Schlenkrich, J. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, “All-hydrogen empirical potential for molecular modeling and dynamics studies of protein using the Charmm22 force field,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[27] A.D. MacKerell, Jr., D. Bashford, M. Bellott, R.L. Dunbrack, Jr., J. Evanseck, M.J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph, L. Kuchnir, K. Kuczera, F.T.K. Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, W.E. Reiher, III, B. Roux, M. Schlenkrich, J. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, “All-atom empirical potential for molecular modeling and dynamics studies of proteins,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[28] N. Foloppe and A.D. MacKerell, Jr., “All-atom empirical force field for nucleic acids: 1) parameter optimization based on small molecule and condensed phase macromolecular target data,” J. Comp. Chem., 21, 86–104, 2000.
[29] S.E. Feller, K. Gawrisch, and A.D. MacKerell, Jr., “Polyunsaturated fatty acids in lipid bilayers: intrinsic and environmental contributions to their unique physical properties,” J. Amer. Chem. Soc., 124, 318–326, 2002.
[30] P. Cieplak, W.D. Cornell, C.I. Bayly, and P.K. Kollman, “Application of the multimolecule and multiconformational RESP methodology to biopolymers: charge derivation for DNA, RNA, and proteins,” J. Comp. Chem., 16, 1357–1377, 1995.
[31] A.D. MacKerell, Jr. and M. Karplus, “Importance of attractive van der Waals contributions in empirical energy function models for the heat of vaporization of polar liquids,” J. Phys. Chem., 95, 10559–10560, 1991.
[32] K. Kim and R.A. Friesner, “Hydrogen bonding between amino acid backbone and side chain analogues: a high-level ab initio study,” J. Amer. Chem. Soc., 119, 12952–12961, 1997.
[33] N. Huang and A.D. MacKerell, Jr., “An ab initio quantum mechanical study of hydrogen-bonded complexes of biological interest,” J. Phys. Chem. B, 106, 7820–7827, 2002.
[34] U.C. Singh and P.A. Kollman, “An approach to computing electrostatic charges for molecules,” J. Comp. Chem., 5, 129–145, 1984.
[35] L.E. Chirlian and M.M. Francl, “Atomic charges derived from electrostatic potentials: a detailed study,” J. Comput. Chem., 8, 894–905, 1987.
[36] K.M. Merz, “Analysis of a large data base of electrostatic potential derived atomic charges,” J. Comput. Chem., 13, 749–767, 1992.
[37] C.I. Bayly, P. Cieplak, W.D. Cornell, and P.A. Kollman, “A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model,” J. Phys. Chem., 97, 10269–10280, 1993.
[38] R.H. Henchman and J.W. Essex, “Generation of OPLS-like charges from molecular electrostatic potential using restraints,” J. Comp. Chem., 20, 483–498, 1999.
[39] A. Laio, J. VandeVondele, and U. Rothlisberger, “D-RESP: dynamically generated electrostatic potential derived charges from quantum mechanics/molecular mechanics simulations,” J. Phys. Chem. B, 106, 7300–7307, 2002. [40] M.M. Francl, C. Carey, L.E. Chirlian, and D.M. Gange, “Charge fit to electrostatic potentials. II. Can atomic charges be unambiguously fit to electrostatic potentials?” J. Comp. Chem., 17, 367–383, 1996. [41] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, and P.A. Kollman, “A second generation force field for the simulation of proteins, nucleic acids, and organic molecules,” J. Amer. Chem. Soc., 117, 5179–5197, 1995. [42] Y. Duan, C. Wu, S. Chowdhury, M.C. Lee, G. Xiong, W. Zhang, R. Yang, P. Ceiplak, R. Luo, T. Lee, J. Caldwell, J. Wang, and P. Kollman, “A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations,” J. Comp. Chem., 24, 1999–2012, 2003. [43] W.L. Jorgensen, “Optimized intermolecular potential functions for lipuid hydrocarbons,” J. Amer. Chem. Soc., 106, 6638–6646, 1984. [44] W.L. Jorgensen, “Optimized intermolecular potential functions for liquid alcohols,” J. Phys. Chem., 90, 1276–1284, 1986. [45] A. Warshel and S. Lifson, “Consitent force field calculations. II. Crystal structures, sublimation energies, molecular and lattice vibrations, molecular conformations, and enthalpy of alkanes,” J. Chem. Phys., 53, 582–594, 1970. [46] A.D. MacKerell, Jr., J. Wi´orkiewicz-Kuczera, and M. Karplus, “An all-atom empirical energy function for the simulation of nucleic acids,” J. Am. Chem. Soc., 117, 11946–11975, 1995. [47] D. Yin and A.D. MacKerell, Jr., “Ab initio calculations on the use of helium and neon as probes of the van der Waals surfaces of molecules,” J. Phys. Chem., 100, 2588–2596, 1996. [48] D. Yin and A.D. MacKerell, Jr., “Combined ab initio/empirical approach for the optimization of Lennard–Jones parameters,” J. Comp. Chem., 19, 334–348, 1998. [49] P.P. Ewald, “Die berechnung optischer und elektrostatischer gitterpotentiale,” Annalen der Physik, 64, 253–287, 1921. [50] T. Darden, “Treatment of long-range forces and potentials,” In: O.M. Becker, A.D. MacKerell, Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, Inc., New York, pp. 91–114, 2001. [51] D. Beglov and B. Roux, “Finite representation of an infinite bulk system: solvent boundary potential for computer simulations,” J. Chem. Phys., 100, 9050–9063, 1994. [52] T.C. Bishop, R.D. Skeel, and K. Schulten, “Difficulties with multiple time stepping and fast multipole algorithm in molecular dynamics,” J. Comp. Chem., 18, 1785– 1791, 1997. [53] W. Im, S. Bern´eche, and B. Roux, “Generalized solvent boundary potential for computer simulations,” J. Chem. Phys., 114, 2924–2937, 2001. [54] M.P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1989. [55] P. Lague, R.W. Pastor, and B.R. Brooks, “A pressure-based long-range correction for Lennard–Jones interactions in molecular dynamics simulations: application to alkanes and interfaces,” J. Phys. Chem. B, 108, 363–368, 2004. [56] M. Tuckerman, B.J. Berne, and G.J. Martyna, “Reversible multiple time scale molecular dynamics,” J. Chem. Phys., 97, 1990–2001, 1992.
[57] G.J. Martyna, D.J. Tobias, and M.L. Klein, “Constant pressure molecular dynamics algorithms,” J. Chem. Phys., 101, 4177–4189, 1994.
[58] S.E. Feller, Y. Zhang, R.W. Pastor, and B.R. Brooks, “Constant pressure molecular dynamics simulation: the Langevin piston method,” J. Chem. Phys., 103, 4613–4621, 1995.
[59] E. Barth and T. Schlick, “Extrapolation versus impulse in multiple-timestepping schemes. II. Linear analysis and applications to Newtonian and Langevin dynamics,” J. Chem. Phys., 109, 1633–1642, 1998.
[60] R. Elber and M. Karplus, “Enhanced sampling in molecular dynamics: use of the time-dependent Hartree approximation for a simulation of carbon monoxide diffusion through myoglobin,” J. Amer. Chem. Soc., 112, 9161–9175, 1990.
[61] U.H.E. Hansmann, “Parallel tempering algorithm for conformational studies of biological molecules,” Chem. Phys. Lett., 281, 140–150, 1997.
[62] C. Simmerling, T. Fox, and P.A. Kollman, “Use of locally enhanced sampling in free energy calculations: testing and application to the α→β anomerization of glucose,” J. Am. Chem. Soc., 120, 5771–5782, 1998.
[63] W.F. van Gunsteren, “GROMOS. Groningen molecular simulation program package,” University of Groningen, Groningen, 1987.
[64] W.F. van Gunsteren, S.R. Billeter, A.A. Eising, P.H. Hünenberger, P. Krüger, A.E. Mark, W.R.P. Scott, and I.G. Tironi, Biomolecular Simulation: The GROMOS96 Manual and User Guide, BIOMOS b.v., Zürich, 1996.
[65] G. Kaminski and W.L. Jorgensen, “Performance of the AMBER94, MMFF94, and OPLS-AA force fields for modeling organic liquids,” J. Phys. Chem., 100, 18010–18013, 1996.
[66] M.R. Shirts, J.W. Pitera, W.C. Swope, and V.S. Pande, “Extremely precise free energy calculations of amino acid side chain analogs: comparison of common molecular mechanics force fields for proteins,” J. Chem. Phys., 119, 5740–5761, 2003.
[67] T.A. Halgren, “MMFF VII. Characterization of MMFF94, MMFF94s, and other widely available force fields for conformational energies and for intermolecular-interaction energies and geometries,” J. Comp. Chem., 20, 730–748, 1999.
[68] S. Lifson, A.T. Hagler, and P. Dauber, “Consistent force field studies of intermolecular forces in hydrogen-bonded crystals. 1. Carboxylic acids, amides, and the C=O...H hydrogen bonds,” J. Amer. Chem. Soc., 101, 5111–5121, 1979.
[69] F.A. Momany and R. Rone, “Validation of the general purpose QUANTA 3.2/CHARMm force field,” J. Comput. Chem., 13, 888–900, 1992.
[70] M.J. Hwang, T.P. Stockfisch, and A.T. Hagler, “Derivation of class II force fields. 2. Derivation and characterization of a class II force field, CFF93, for the alkyl functional group and alkane molecules,” J. Amer. Chem. Soc., 116, 2515–2525, 1994.
[71] H. Sun, “COMPASS: an ab initio force-field optimized for condensed-phase applications - overview with details on alkane and benzene compounds,” J. Phys. Chem. B, 102, 7338–7364, 1998.
[72] U. Burkert and N.L. Allinger, Molecular Mechanics, American Chemical Society, Washington, D.C., 1982.
[73] N.L. Allinger, Y.H. Yuh, and J.L. Lii, “Molecular mechanics, the MM3 force field for hydrocarbons. 1,” J. Amer. Chem. Soc., 111, 8551–8566, 1989.
[74] N.L. Allinger, K.H. Chen, J.H. Lii, and K.A. Durkin, “Alcohols, ethers, carbohydrates, and related compounds. I. The MM4 force field for simple compounds,” J. Comput. Chem., 24, 1447–1472, 2003.
[75] A.K. Rapp´e, C.J. Colwell, W.A. Goddard, III, and W.M. Skiff, “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations,” J. Amer. Chem. Soc., 114, 10024–10035, 1992. [76] S.L. Mayo, B.D. Olafson, and I. Goddard, W.A. “DREIDING: a generic force field for molecular simulations,” J. Phys. Chem., 94, 8897–8909, 1990. [77] T.A. Halgren and W. Damm, “Polarizable force fields,” Curr. Opin. Struct. Biol., 11, 236–242, 2001. [78] S.W. Rick and S.J. Stuart, “Potentials and algorithms for incorporating polarizability in computer simulations,” Rev. Comp. Chem., 18, 89–146, 2002. [79] S.W. Rick, S. J. Stuart, J. S. Bader, and B. J. Berne, “Fluctuating charge force fields for aqueous solutions,” J. Mol. Liq., 66/66, 31–40, 1995. [80] S.W. Rick and B.J. Berne, “Dynamical fluctuating charge force fields: the aqueous solvation of amides,” J. Amer. Chem. Soc., 118, 672–679, 1996. [81] R.A. Bryce, M.A. Vincent, N.O.J. Malcolm, I.H. Hillier, and N.A. Burton, “Cooperative effects in the structure of fluoride water clusters: ab initio hybrid quantum mechanical/molecular mechanical model incorporating polarizable fluctuating charge solvent,” J. Chem. Phys., 109, 3077–3085, 1998. [82] J.L. Asensio, F.J. Canada, X. Cheng, N. Khan, D.R. Mootoo, and J. Jimenez-Barbero, “Conformational differences between O- and C-glycosides: the alpha-O-man(1-->1)-beta-Gal/alpha-C-Man-(1-->1)-beta-Gal case--a decisive demonstration of the importance of the exo-anomeric effect on the conformation of glycosides,” Chemistry, 6, 1035–1041, 2000. [83] N. Yoshii, R. Miyauchi, S. Niura, and S. Okazaki, “A molecular-dynamics study of the equation of water using a fluctuating-charge model,” Chem. Phys. Lett., 317, 414–420, 2000. [84] E. Llanta, K. Ando, and R. Rey, “Fluctuating charge study of polarization effects in chlorinated organic liquids,” J. Phys. Chem. B, 105, 7783–7791, 2001. [85] S. Patel and C.L. Brooks, III, “CHARMM fluctuating charge force field for proteins: I parameterization and application to bulk organic liquid simulations,” J. Comput. Chem., 25, 1–15, 2004. [86] J. Caldwell, L.X. Dang, and P.A. Kollman, “Implementation of nonadditive intermolecular potentials by use of molecular dynamics: development of a water–water potential and water–ion cluster interactions,” J. Amer. Chem. Soc., 112, 9144–9147, 1990. [87] A. Wallqvist and B.J. Berne, “Effective potentials for liquid water using polarizable and nonpolarizable models,” J. Phys. Chem., 97, 13841–13851, 1993. [88] D.N. Bernardo, Y. Ding, K. Krogh-Jespersen, and R.M. Levy, “An anisotropic polarizable water model: incorporation of all-atom polarizabilities into molecular mechanics force fields,” J. Phys. Chem., 98, 4180–4187, 1994. [89] L.X. Dang, “Importance of polarization effects in modeling hydrogen bond in water using classical molecular dynamics techniques,” J. Phys. Chem. B, 102, 620–624, 1998. [90] H.A. Stern, G.A. Kaminski, J.L. Banks, R. Zhou, B.J. Berne, and R.A. Friesner, “Fluctuating charge, polarizable dipole, and combined models: parameterization from ab initio quantum chemistry,” J. Phys. Chem. B, 103, 4730–4737, 1999. [91] B. Mannfors, K. Palmo, and S. Krimm, “A new electrostatic model for molecular mechanics force fields,” J. Mol. Struct., 556, 1–21, 2000. [92] B.G. Dick, Jr. and A.W. Overhauser, “Theory of the dielectric constants of alkali halide crystals,” Phys. Rev., 112, 90–103, 1958.
[93] L.R. Pratt, “Effective field of a dipole in non-polar polarizable fluids,” Mol. Phys., 40, 347–360, 1980. [94] P.J. van Marren and D. van der Spoel, “Molecular dynamics simulations of water with novel shell-model potentials,” J. Phys. Chem. B, 105, 2618–2626, 2001. [95] G. Lamoureux, A.D. MacKerell, Jr., and B. Roux, “A simple polarizable model of water based on classical Drude oscillators,” J. Chem. Phys., 119, 5185–5197, 2003. [96] G. Lamoureux and B. Roux, “Modelling induced polarizability with drude oscillators: theory and molecular dynamics simulation algorithm,” J. Chem. Phys., 119, 5185–5197, 2003. [97] M. Sprik and M.L. Klein, “A polarizable model for water using distributed charge sites,” J. Chem. Phys., 89, 7556–7560, 1988. [98] B. Chen, J. Xing, and I.J. Siepmann, “Development of polarizable water force fields for phase equilibrium calculations,” J. Phys. Chem. B, 104, 2391–2401, 2000. [99] H.A. Stern, F. Rittner, B.J. Berne, and R.A. Friesner, “Combined fluctuating charge and polarizable dipole models: application to a five-site water potential function,” J. Chem. Phys., 115, 2237–2251, 2001. [100] S.J. Stuart and B.J. Berne, “Effects of polarizability on the hydration of the chloride ion,” J. Phys. Chem., 100, 11934–11943, 1996. [101] A. Grossfield, P. Ren, and J.W. Ponder, “Ion solvation thermodynamics from simulation with a polarizable force field,” J. Amer. Chem. Soc., 125, 15671–15682, 2003. [102] J.C. Shelley, M. Sprik, and M.L. Klein, “Molecular dynamics simulation of an aqueous sodium octanoate micelle using polarizable surfactant molecules,” Langmuir, 9, 916–926, 1993. [103] J.W. Caldwell and P.A. Kollman, “Cation–π interactions: nonadditive effects are critical in their accurate representation,” J. Amer. Chem. Soc., 117, 4177–4178, 1995a. [104] J.W. Caldwell and P.A. Kollman, “Structure and properties of neat liquids using nonadditive molecular dynamics: water, methanol, and N-methylacetamide,” J. Phys. Chem., 99, 6208–6219, 1995b. [105] J. Gao, D. Habibollazadeh, and L. Shao, “A polarizable potential function for simulation of liquid alcohols,” J. Phys. Chem., 99, 16460–16467, 1995. [106] M. Freindorf and J. Gao, “Optimization of the Lennard–Jones parameter for combined ab initio quantum mechanical and molecular mechanical potential using the 3-21G basis set,” J. Comp. Chem., 17, 386–395, 1996. [107] P. Cieplak, J.W. Caldwell, and P.A. Kollman, “Molecular mechanical models for organic and biological systems going beyond the atom centered two body additive approximations: aqueous solution free energies of methanol and N-methyl acetamide, nucleic acid base, and amide hydrogen bonding and chloroform/water partition coefficients of the nucleic acid bases,” J. Comp. Chem., 22, 1048–1057, 2001. [108] L.X. Dang, “Computer simulation studies of ion transport across a liquid/liquid interface,” J. Phys. Chem. B, 103, 8195–8200, 1999. [109] G.A. Kaminski, H.A. Stern, B.J. Berne, R.A. Friesner, Y.X. Cao, R.B. Murphy, R. Zhou, and T.A. Halgren, “Development of a polarizable force field for proteins via ab initio quantum chemistry: first generation model and gas phase tests,” J. Comp. Chem., 23, 1515–1531, 2002. [110] V.M. Anisimov, I.V. Vorobyov, G. Lamoureux, S. Noskov, B. Roux, and A.D. MacKerell, Jr. “CHARMM all-atom polarizable force field parameter development for nucleic acids,” Biophys. J., 86, 415a, 2004. [111] S. Patel, A.D. MacKerell, Jr., and C.L. 
Brooks, III, “CHARMM fluctuating charge force field for proteins: II protein/solvent properties from molecular dynamics simulations using a non-additive electrostatic model,” 25, 1504–1514, 2004.
[112] A. Morita and S. Kato, “An ab initio analysis of medium perturbation on molecular polarizabilities,” J. Chem. Phys., 110, 11987–11998, 1999. [113] A. Morita, “Water polarizability in condensed phase: ab initio evaluation by cluster approach,” J. Comp. Chem., 23, 1466–1471, 2002.
2.6 INTERATOMIC POTENTIALS: FERROELECTRICS

Marcelo Sepliarsky (1), Marcelo G. Stachiotti (1), and Simon R. Phillpot (2)
(1) Instituto de Física Rosario, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, 27 de Febrero 210 Bis, (2000) Rosario, Argentina
(2) Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
Ferroelectric perovskites are important in many areas of modern technology, including memories, sensors and electronic applications, and are of fundamental scientific interest. The fascinating feature of perovskites is that they exhibit a wide variety of structural phase transitions. Generically these compounds have the chemical formula ABO3, where A is a monovalent or divalent cation and B a transition metal cation; perovskites in which both A and B are trivalent, such as LaAlO3, also exist, though we will not discuss them here. Although their high-temperature structure is very simple (Fig. 1), it displays a wide variety of structural instabilities, which may involve rotations and distortions of the oxygen octahedra as well as displacements of the ions from their crystallographically defined sites. The types of crystal symmetries manifested in these materials and the types of phase transition behavior depend on the individual compound. Among the perovskites one finds ferroelectric crystals such as BaTiO3 and KNbO3 (displaying three solid-state phase transitions) and PbTiO3 (displaying only one transition), antiferroelectrics such as PbZrO3, and materials such as SrTiO3 that exhibit other nonpolar instabilities involving the rotation of the oxygen octahedra [1]. In recent years, new applications have opened up for these materials as the systems exploited have become both chemically more complex, e.g., solid solutions and superlattices, and microstructurally more complex, e.g., thin films and nanocapacitors. While the overall properties of such systems can be relatively easily investigated experimentally, it is difficult to obtain microscopic information. There is thus a significant need for a simulation method which can provide atomic-level information on ferroelectric behavior, and yet is computationally efficient enough to allow materials problems to be addressed.
Figure 1. Cubic perovskite-type structure, ABO3 (A at the corners, B at the center, O at the face centers).
Computer simulations based on interatomic potentials can provide such microscopic insights. However, the validity of any simulation study depends to a considerable extent on the quality of the interatomic potential used. Obtaining accurate interatomic potentials able to describe ferroelectricity in ABO3 perovskites constitutes a challenging problem, mainly because of the small energy differences (sometimes less than 10 meV/cell) involved in the lattice instabilities associated with the various phases. The theoretical investigation of ferroelectric materials can be addressed at different length scales and levels of complexity, ranging from phenomenological theories (based on the continuous-medium approximation) to first-principles methods. The traditional approach is based on Ginzburg–Landau–Devonshire (GLD) theory [2]. This mesoscale approach treats a ferroelectric as a continuum solid defined by components of polarization and by elastic strains or stresses. It has proved very successful in providing significant insights into the ferroelectric properties of perovskites; however, it cannot provide detailed microscopic information. Over the last decade, considerable progress has been made in first-principles calculations of ferroelectricity in perovskites [3, 4]. These calculations have contributed greatly to the understanding of the origins of structural phase transitions in perovskites and of the nature of the ferroelectric instability. These methods are based upon a full solution for the quantum mechanical ground state of the electron system in the framework of Density Functional Theory (DFT). While able to provide detailed information on the structural, electronic and lattice dynamical properties of single crystals, they also have limitations. In particular, due to the heavy computational load, only systems of up to approximately a hundred ions can be simulated. Moreover, at the moment such calculations cannot provide anything but static, zero-temperature properties. An effective Hamiltonian method has been used for the simulation of finite-temperature properties of
perovskites [3]. Here, a model Hamiltonian is written as a function of a reduced number of degrees of freedom (a local mode amplitude vector and a local strain tensor). The parameters of the Hamiltonian are determined so as to reproduce the spectrum of low-energy excitations of a given material as obtained from first-principles calculations. This approach has been applied with considerable success to several ferroelectric materials (pure compounds and solid solutions), producing results in very good qualitative agreement with experiments. However, some quantitative predictions are less satisfactory; in particular, the calculated transition temperatures can differ from the experimental values by hundreds of degrees. Moreover, the lack of an atomistic description of the material makes the effective Hamiltonian approach inappropriate for the investigation of many interesting properties of perovskites, such as surface and interface effects. Atomistic modeling using interatomic potentials has a long and illustrious history in the description of ionic materials. The fundamental idea is to describe a material at the atomic level, with the interatomic interactions defined by classical potentials, thereby providing spatially much more detailed information than the GLD approach, yet without the heavy computational load associated with first-principles methods. In the context of ionic materials, cohesion is generally provided by the Coulombic interactions between the point ions. However, a neutral solid interacting purely by Coulombic interactions is unstable to a catastrophic collapse in which all the ions become arbitrarily close. Thus, to mimic the physical short-ranged repulsion that prevents such a collapse, an empirical, largely repulsive interaction is added. One standard choice for this function is the Buckingham potential, which consists of a purely repulsive, exponentially decaying Born–Mayer term and a van der Waals attractive term to account for covalency effects:

$$V(r) = a\,e^{-r/\rho} - c/r^{6}.$$

This is the so-called rigid-ion model. In the shell model, an important improvement over the rigid-ion model, atomic polarizability is accounted for by defining a core and a shell for each ion (representing the ion core with its closed shells of electrons, and the valence electrons, respectively), which interact with each other through a harmonic spring (characterizing the ionic polarizability) and interact with the cores and shells of other ions via repulsive and Coulombic interactions; in shell-model parameterizations the short-range term acts between the shells. In some parameterizations, the ions (core plus shell) are assigned their formal charges. However, in ionic materials with a significant degree of covalency, such as perovskites, the incomplete transfer of electrons between the cations and anions can be accounted for by assigning partial charges (smaller than the formal charges) to the ions, together with the van der Waals term, which is non-zero only for the O–O interactions. For more details see the article “Interatomic potential models for ionic materials” by Julian Gale presented in this handbook.
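To make these energy expressions concrete, the following minimal sketch (in Python, with illustrative rather than published parameter values) evaluates the Buckingham term and a toy shell-model energy for a small cluster of polarizable ions. The Coulomb constant assumes charges in units of |e| and distances in Å; the Ewald summation needed for a periodic solid is deliberately omitted.

```python
import numpy as np

KE2 = 14.3996  # e^2/(4*pi*eps0) in eV*Angstrom, for charges in units of |e|

def buckingham(r, a, rho, c):
    # Short-range Buckingham pair potential: V(r) = a*exp(-r/rho) - c/r**6
    return a * np.exp(-r / rho) - c / r**6

def shell_model_energy(cores, shells, q_core, q_shell, k2, buck):
    """Toy shell-model energy of a small cluster of polarizable ions.
    cores, shells : (N, 3) arrays of core and shell positions (Angstrom)
    q_core, q_shell : core and shell charges in |e| (their sum is the ion charge)
    k2 : harmonic core-shell spring constants (eV/Angstrom^2)
    buck : dict mapping an ion-index pair (i, j) to Buckingham (a, rho, c)
    """
    n = len(cores)
    e = 0.0
    for i in range(n):
        w = shells[i] - cores[i]            # core-shell separation
        e += 0.5 * k2[i] * np.dot(w, w)     # isotropic polarizability spring
    for i in range(n):
        for j in range(i + 1, n):
            # Coulomb interactions between all core/shell sites of different ions
            for pi, qi in ((cores[i], q_core[i]), (shells[i], q_shell[i])):
                for pj, qj in ((cores[j], q_core[j]), (shells[j], q_shell[j])):
                    e += KE2 * qi * qj / np.linalg.norm(pi - pj)
            if (i, j) in buck:              # short-range repulsion acts between shells
                a, rho, c = buck[(i, j)]
                e += buckingham(np.linalg.norm(shells[i] - shells[j]), a, rho, c)
    return e
```

In a real parameterization the Buckingham terms would be keyed by species pair rather than ion index, and the core–shell coupling may include the fourth-order (anharmonic) terms of the models discussed below.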
The success of the atomistic approach is evident from the large number of investigations of complex oxide crystals. Regarding ferroelectric perovskites, we note the early work of Lewis and Catlow, who derived empirical shell-model potential parameters for the study of defect energies in cubic BaTiO3 [5, 18]. This model was subsequently used for more refined ab initio embedded-cluster calculations of impurities, as well as for the simulation of surface properties. For lattice dynamical properties, the most successful approach has been carried out in the framework of the nonlinear oxygen polarizability model [6]. In this shell model an anisotropic core–shell interaction is considered at the O2− ions, with a fourth-order core–shell interaction along the B–O bond. The potential parameters were obtained by fitting experimental phonon dispersion curves of the cubic phase. The main achievement of this model was the description of the soft-mode temperature dependence (the TO-phonon softening which is related to the ferroelectric transition). However, neither of these models was able to simulate the ferroelectric phase behavior of the perovskites. Besides the traditional empirical approach, in which potentials are obtained by suitable fitting procedures to macroscopic physical properties, there is increasing interest in deriving pair potentials from first-principles calculations. In 1994, Donnerberg and Exner developed a shell model for KNbO3, deriving the Nb–O short-range pair potential from Hartree–Fock calculations performed on a cluster of ions [7]. They showed that this ab initio pair potential was in good agreement with a corresponding empirical potential obtained from fitting procedures to macroscopic properties. Their model, however, was not able to simulate the structural phase transition sequence of KNbO3 either. They argued that the consideration of additional many-body potential contributions would enable them to model structural phase transitions. However, as we will see, it is in fact possible to simulate ferroelectric phase transitions just by using classical pairwise interatomic potentials fitted to first-principles calculations. Ab initio methods provide underlying potential surfaces and phonon dispersion curves at T = 0 K, thereby exposing the presence of structural instabilities over the full Brillouin zone, and this information is very useful for parameterizing classical potentials which can then be used in molecular dynamics simulations. In this way, finite-temperature simulations of ABO3 perovskites and the properties of chemically and microstructurally more complex systems can be addressed at the atomic level.
1. Modeling Ferroelectric Perovskites
Among the perovskites, BaTiO3, which can be considered a prototypical ferroelectric, is one of the most exhaustively studied [8]. At high temperatures, it has the classic perovskite structure. This is cubic centrosymmetric, with the
Ba at the corners, Ti at the center, and oxygen at the face centers (see Fig. 1). However, as the temperature is lowered, it goes through a succession of ferroelectric phases with spontaneous polarizations along the [001], [011], and [111] directions of the cubic cell. These polarizations arise from net displacements of the cations with respect to the oxygen octahedra along the above directions. Each ferroelectric phase also involves a small homogeneous deformation, which can be thought of as an elongation of the cubic unit cell along the corresponding polarization direction. Thus the system becomes tetragonal at 393 K, orthorhombic at 278 K, and rhombohedral at 183 K. An anisotropic shell model with pairwise repulsive Buckingham potentials was developed for the simulation of ferroelectricity in BaTiO3 [9]. This is a classical shell model in which an anisotropic core–shell interaction is considered at the O2− ions, with a fourth-order core–shell interaction along the O–Ti bond. The Ba and Ti ions are considered to be isotropically polarizable. The set of seventeen shell-model parameters was obtained by fitting phonon frequencies, the lattice constant of the cubic phase, and underlying potential surfaces for various configurations of atomic displacements. In order to better quantify the ferroelectric instabilities of the cubic phase, a first-principles frozen-phonon calculation of the infrared-active modes was performed. Once the eigenvectors at Γ had been determined, the total energy as a function of the displacement pattern of the unstable mode was evaluated for different directions in the cubic phase, including also the effects of strain. The first-principles total-energy calculations were performed within DFT, using the highly precise full-potential Linear Augmented Plane Wave (LAPW) method. The energy surfaces of the model for different ferroelectric distortions are shown in Fig. 2, where they are compared with the first-principles results. A satisfactory overall agreement is achieved. The model yields clear ferroelectric instabilities with energies and minima locations similar to those of the LAPW calculations. Energy lowerings of ≈1.2, 1.65, and 1.9 mRy/cell are obtained for the (001), (011), and (111) ferroelectric mode displacements, respectively, which is consistent with the experimentally observed phase transition sequence. Concerning the energetics of the (001) displacements, it can also be seen in the left panel that the effect of the tetragonal strain is to stabilize these displacements, with a deeper minimum and a higher energy barrier at the centrosymmetric positions. Phonon dispersion relations provide a global view of the harmonic energy surface around the cubic perovskite structure. In particular the unstable modes, which have imaginary frequencies, determine the nature of the phase transitions. A first-principles linear-response calculation of the phonon dispersion curves of cubic BaTiO3 revealed the presence of structural instabilities with pronounced two-dimensional character in the Brillouin zone, corresponding to chains of displaced Ti ions oriented along the [001] directions [10]. That the shell model reproduces these instabilities is illustrated by the calculated phonon
Figure 2. Total energy (E, mRy/cell) as a function of the unstable mode displacements along the [001] (left panel), [011] (center panel), and [111] (right panel) directions; the horizontal axis is the Ti displacement relative to Ba (Å). For the sake of simplicity, the mode displacement is represented through the Ti displacement relative to Ba; the oxygen ions are also displaced in a manner determined by the Ti ion displacement. Energies for [001] displacements in a tetragonally strained structure (c/a = 1.01) are also included in the left panel. First-principles calculations are denoted by squares (circles) for the unstrained (strained) structures. Full lines correspond to the shell-model result.
dispersion curves in Fig. 3. Excellent agreement with the ab initio linear-response calculation is achieved, particularly for the unstable phonon modes. Two transverse optic modes are unstable at the Γ point, and they remain unstable along the Γ–X direction with very little dispersion. One of them stabilizes along the Γ–M and X–M directions, and both become stable along the Γ–R and R–M lines. The Born effective charge tensor is conventionally defined as the set of proportionality coefficients between the components of the dipole moment per unit cell and the components of the κ sublattice displacement which gives rise to that dipole moment:

$$Z^{*}_{\kappa,\alpha\beta} = \frac{\partial P_{\beta}}{\partial \delta_{\kappa,\alpha}} \qquad (1)$$
For the cubic structure of ABO3 perovskites, this tensor is fully characterized by four independent numbers. Experimental data had suggested that the amplitude of the Born effective charges should deviate substantially from the nominal static charges, with two essential features: the oxygen charge tensor is highly anisotropic (with two inequivalent directions either parallel or perpendicular to the B–O bond), and the Ti and O|| effective charges are anomalously large. This was confirmed by more recent first-principles calculations [3] demonstrating the crucial role played by the B(d)–O(2p) hybridization as a dominant mechanism for such anomalous contributions.
Figure 3. Phonon dispersion curves (frequency in cm−1, along Γ–X–M–Γ–R–M) of cubic BaTiO3 calculated with the shell model. Imaginary phonon frequencies are represented as negative values.
Although the shell model does not explicitly include charge transfer between atoms, it takes the contribution of electronic polarizability into account through the relative core–shell displacements. It is thus possible to evaluate the Born effective charge tensor by calculating the total dipole moment per unit cell created by the displacement of a given sublattice of atoms as a sum of two contributions:

$$P_{\alpha} = Z_{\kappa}\,\delta_{\kappa,\alpha} + \sum_{\kappa} Y_{\kappa}\, w_{\kappa,\alpha} \qquad (2)$$
The first term is the sublattice displacement contribution, while the second term is the electronic polarizability contribution. The calculated Born effective charges for cubic BaTiO3 are listed in Table 1 together with results obtained from different theoretical approaches. The two essential features of the Born effective charge tensor of BaTiO3 are satisfactorily simulated.
Table 1. Born effective charges of BaTiO3 in the cubic structure

                            Z*_Ba    Z*_Ti    Z*_O⊥    Z*_O‖
Nominal                     +2       +4       −2       −2
Experiment                  +2.9     +6.7     −2.4     −4.8
First principles            +2.75    +7.16    −2.11    −5.69
Shell model (nominal)       +1.86    +3.18    −1.68    −1.68
Shell model (effective)     +1.93    +6.45    −2.3     −3.79
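As an illustration of how Eqs. (1) and (2) can be combined in practice, the sketch below estimates a Born effective charge by central finite differences of the cell dipole moment. The relax_shells routine, which must re-equilibrate the shells for the displaced cores, is a placeholder for the caller's shell-model machinery, not an implementation.

```python
import numpy as np

def cell_dipole(cores, shells, q_core, q_shell):
    # Dipole moment of the cell (|e|*Angstrom): point-charge sum over cores
    # and shells; the shell term carries the electronic part of Eq. (2).
    q_core, q_shell = np.asarray(q_core), np.asarray(q_shell)
    return (q_core[:, None] * cores).sum(axis=0) + (q_shell[:, None] * shells).sum(axis=0)

def born_charge(kappa, alpha, beta, cores, shells, q_core, q_shell,
                relax_shells, delta=1e-3):
    """Z*_{kappa,alpha beta} = dP_beta/d(delta_{kappa,alpha}) by central differences.
    kappa : indices of the ions forming the displaced sublattice
    relax_shells(cores, shells) -> shell positions re-equilibrated for the
    displaced cores (supplied by the user's shell-model code)."""
    p = []
    for sign in (+1.0, -1.0):
        c = np.array(cores, dtype=float)
        c[kappa, alpha] += sign * delta        # rigid sublattice displacement
        s = relax_shells(c, shells)            # shells respond (polarization)
        p.append(cell_dipole(c, s, q_core, q_shell)[beta])
    return (p[0] - p[1]) / (2.0 * delta)       # effective charge in |e|
```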
To this point, we have shown that this anisotropic shell model for BaTiO3 reproduces the lattice instabilities and several zero-temperature properties relevant for this material. To investigate whether the model can describe the temperature-driven structural transitions of BaTiO3, constant-pressure molecular dynamics (MD) simulations were performed. Although an excellent overall agreement was obtained for the structural parameters, showing that the model reproduces the delicate structural changes involved along the transitions, the theoretically determined transition temperatures were much lower than in experiment [9]. Interestingly, the effective Hamiltonian approach presents the same problem. Since ferroelectricity is very sensitive to volume, the neglect of thermal expansivity in the effective Hamiltonian approach was thought to be responsible for the shifts in the predicted transition temperatures. The MD simulations, however, properly simulate the thermal expansion and nevertheless result in a similar anomaly in the transition temperatures. This indicates the presence of inherent errors in the first-principles LDA approach, which tends to underestimate the ferroelectric instabilities. A recent study demonstrated that, in the effective Hamiltonian approach, there are at least two significant sources of error: the improper treatment of the thermal expansion and the LDA error. Both types of error may be of the same magnitude [11]. While the anisotropic shell model for BaTiO3 does have the desired effect of describing the ferroelectric phase transitions in perovskites, it can only be used in a crystallographically well-defined environment of O ions. Unfortunately, it is not always possible to unambiguously characterize the crystallographic environment of any given ion, for example in the simulation of a grain boundary or other interface. For such systems isotropic models are required. Isotropic shell models have recently been developed which describe the phase behavior of both KNbO3 [12] and BaTiO3 [13]. The isotropic shell model differs from the anisotropic one only in that the anisotropic fourth-order core–shell interaction on the O ions is replaced by an isotropic fourth-order core–shell interaction on both the transition metal and the O ions, which together stabilize the ferroelectric phases. Since the LDA-fitted shell model gives theoretically determined transition temperatures much lower than in experiment, the parameters of the potential were improved in an ad hoc manner to give better agreement. In this way, the model for KNbO3 displays the experimentally observed sequence of phases on heating: rhombohedral, orthorhombic, tetragonal and finally cubic, with transition temperatures of 225 K, 475 K and 675 K, which are very close to the experimental values of 210 K, 488 K and 701 K, respectively. As shown in Fig. 4, for BaTiO3, in comparison with the anisotropic model, the isotropic shell model gives transition temperatures (140 K, 190 K and 360 K) in better agreement with the experimental values (183 K, 278 K and 393 K).
Figure 4. Phase diagram of BaTiO3 as determined by MD simulations for the isotropic shell model. Top panel: cell parameters (Å) as a function of temperature (K). Bottom panel: the three components of the average polarization (µC/cm²), each one represented with a different symbol.
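The phase sequence in Fig. 4 can be read off from the averaged polarization components. A simple, entirely hypothetical classification of this kind is sketched below; run_md stands in for a constant-pressure shell-model MD driver that returns the time-averaged polarization vector at a given temperature, and the tolerance is an illustrative choice.

```python
import numpy as np

def classify_phase(p, tol=1.0):
    """Map a time-averaged polarization vector (muC/cm^2) to a phase by
    counting its nonzero Cartesian components: 0 -> cubic, 1 -> tetragonal,
    2 -> orthorhombic, 3 -> rhombohedral."""
    n = int(np.sum(np.abs(np.asarray(p)) > tol))
    return ["cubic", "tetragonal", "orthorhombic", "rhombohedral"][n]

def transition_scan(run_md, temperatures):
    # Heating scan: average P from an MD run at each temperature and classify.
    return {t: classify_phase(run_md(t)) for t in temperatures}
```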
2. Solid Solutions
The current keen interest in solid solutions of perovskites is driven by the idea of tuning the composition to create structures with properties unachievable in single-component materials. Prototypical solid solutions are BaxSr1−xTiO3 (BST), a solid solution of BaTiO3 and SrTiO3, and KTaxNb1−xO3, a solid solution of KTaO3 and KNbO3. Both solutions exist over the whole concentration range and are mixtures of a ferroelectric with an incipient ferroelectric. We present briefly the main features of the isotropic shell-model potentials developed to describe the structural behavior of BST.
In order to simulate BST solid solutions, it was also necessary to develop an isotropic model for SrTiO3. From a computational point of view, the SrTiO3 model must be compatible with the BaTiO3 model in that the only differences between the two can be in the Ba–O and Sr–O interactions and in the polarizability parameters for Ba and Sr. The challenge is thus, by changing only these interactions, to reproduce the following main features of SrTiO3: (i) a smaller equilibrium volume, (ii) incipient ferroelectricity, and (iii) a tetragonal antiferrodistortive ground state. It is indeed possible to reproduce these three critical features. The equilibrium lattice constant of the resulting model in the cubic phase is a = 3.90 Å, which reproduces the extrapolation to T = 0 K of the experimental lattice constant. Regarding the other two conditions, the low-frequency phonon dispersion curves of the cubic structure are shown in Fig. 5. The model reproduces the rather subtle antiferrodistortive instabilities, driven by the unstable modes at the R and M points. It also presents a subtle ferroelectric instability (an unstable mode at the zone center). These detailed features of the dispersion of the unstable modes along different directions in the Brillouin zone are in good agreement with ab initio linear-response calculations. Random solid solutions of BST of various compositions in the range x = 0 (pure SrTiO3) to x = 1 (pure BaTiO3) have been simulated. In the simulation supercell the A-sites of the ATiO3 perovskite are randomly occupied by Ba and Sr ions. The results of the molecular dynamics simulations of the phase behavior of BST are summarized in Fig. 6 (filled symbols connected by solid lines) as the concentration dependence of the transition temperatures.
Figure 5. Low-frequency phonon dispersion curves for cubic SrTiO3. The negative values correspond to imaginary frequencies, characteristic of the ferroelectric instability at the Γ point and the additional antiferrodistortive instabilities at the R and M points.
Figure 6. Concentration dependence of the transition temperatures of BaxSr1−xTiO3 (T in K versus x; solid symbols and dark lines), delimiting the cubic, tetragonal, orthorhombic and rhombohedral regions, shows good agreement with the experimental values (open symbols and dotted lines).
With increasing concentration of Sr (i.e., decreasing x), the Curie temperature decreases essentially linearly with x. The simulations showed that all four phases remain stable down to x ≈ 0.2 at which the three transition temperatures essentially coincide. Below x ≈ 0.2 only the cubic and rhombohedral phases appear in the phase diagram. These results are similar to the experimental data (open symbols and dotted lines), giving particularly good agreement for the concentration at which the tetragonal and orthorhombic phases disappear from the phase diagram. The above analyses demonstrate that the atomistic approach can reproduce the basic features of the phase behavior of perovskite solid solutions, on a semiquantitative basis. There are two fundamental structural effects associated with the solid solution: a concentration dependence of the average volume and large variations in the local strain arising from strong variations in the local composition [12, 13]. SrTiO3 is denser than BaTiO3 . Thus in the solid solution, the SrTiO3 cells tend to be under a tensile strain (which tends to encourage a ferroelectric distortion) while the BaTiO3 cells tend to be under a compressive strain (which tends to suppress the ferroelectric distortion). Indeed, the large tensile strain on the SrTiO3 cells has the effect of inducing a polarization. Remarkably, at a given concentration (fixed volume) the polarization of the SrTiO3
cells is actually larger than that of the BaTiO3 cells. There is also an additional effect associated with the local environment of each unit cell. In particular, the simulations show that the maximum and minimum values of polarization for the SrTiO3 cells correspond to the polarizations of SrTiO3 cells (of the same average volume as that of the solid solution) embedded completely in a matrix of SrTiO3 and BaTiO3 cells, respectively. Likewise, for the BaTiO3 cells the maximum and minimum polarizations correspond to SrTiO3 and BaTiO3 embeddings, respectively.
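Setting up such a random solid solution is straightforward. The sketch below, with an illustrative supercell size and random seed, randomly assigns Ba or Sr to the A-sites of the supercell as described above.

```python
import numpy as np

def bst_occupancy(n, x, seed=0):
    """Randomly occupy the A-sites of an n x n x n ATiO3 supercell with Ba
    (probability x) or Sr (probability 1 - x)."""
    rng = np.random.default_rng(seed)
    return np.where(rng.random((n, n, n)) < x, "Ba", "Sr")

a_sites = bst_occupancy(10, 0.5)
print((a_sites == "Ba").mean())  # realized Ba fraction, close to x
```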
3. Heterostructures
Superlattices containing ferroelectrics offer another approach to achieving dielectric and optical properties unachievable in the bulk. Among the heterostructures grown have been ferroelectric/paraelectric superlattices, including BaTiO3/SrTiO3 and KNbO3/KTaO3, and ferroelectric/ferroelectric superlattices such as PbTiO3/BaTiO3. In comparison with the well-documented tunability of the properties of solid solutions, the tunability of the properties of multilayer heterostructures has been less well demonstrated. While there is experimental evidence for a strong dependence of the properties of such superlattices on the modulation length Λ (the thickness of a KNbO3/KTaO3 bilayer), the underlying physics controlling their properties is only poorly understood. Atomic-level simulations are ideal for the study of multilayers because the simulations can be carried out on the same length scale as the experimental systems. Moreover, the crystallography of the multilayer can be defined and the position of every ion determined, thereby providing atomic-level information on the ferroelectric and dielectric properties. Furthermore, once the nature of the interactions between ions and the crystallographic structure of the interface are defined, the atomic-level simulations will determine the local atomic structure and polarization at the interfaces. To that purpose, the structure and properties of coherent KNbO3/KTaO3 (KN/KT) superlattices were simulated using isotropic shell-model potentials for KNbO3 and KTaO3. Since the simulations were intended to model a superlattice on a KT substrate, as had been investigated experimentally, the in-plane lattice parameter was fixed to that of KT at zero temperature; however, since the heterostructure is not under any constraint in the modulation direction, the length of the simulation cell in the z direction was allowed to expand or contract to reach zero stress. Figure 7 shows the variation in the polarization in the modulation direction, Pz (solid circles), and in the x–y plane, Px = Py (open circles), averaged over unit-cell-thick slices through the Λ = 36 superlattice. In analyzing these polarization profiles, we first address the strain effects produced by the KT substrate, which result in a compressive strain of 0.7% on the KN layers.
Figure 7. Components of polarization (µC/cm²), Px = Py (open circles) and Pz (solid circles), in unit-cell-thick slices through the Λ = 36 KN/KT superlattice on a KT substrate, as a function of position z along the modulation direction.
To compensate for this in-plane compression, the KN layers expand in the z direction, thereby breaking the strict rhombohedral symmetry of the polarization of KN; however, these strains are not sufficient to force the KN to become tetragonally polarized. Similarly, the absence of any in-plane polarization for the KT layer is consistent with the absence of any strain arising from the KT substrate. The finite value of Pz in the interior of the KT layer, however, differs from the expected value of Pz = 0 for this unstrained layer and arises from the very strong coupling of the electric field produced by the electric dipoles in the KNbO3 layers with the very large dielectric response of the KTaO3 [14, 15]. The switching behavior of ferroelectric heterostructures is of considerable interest. It was found that for Λ = 6, the polarization in the KTaO3 layers is almost as large as in the KNbO3 layers; moreover, the coercive fields for the KNbO3 and KTaO3 layers are identical. This single value for the coercive fields and the weak spatial variation in the polarization indicate that the entire superlattice is essentially acting as a single structure, with properties different from either of its components. For Λ = 36, the KNbO3 layer has a square hysteresis loop characteristic of a good ferroelectric; the polarization and coercive field are larger than for Λ = 6, consistent with the more bulk-like
behavior of a thicker KNbO3 layer. The KTaO3 layer also displays hysteretic behavior. However, by contrast with the Λ = 6 superlattice, the coercive field for the KTaO3 layers is much smaller than for the KNbO3 layers, indicating that the KNbO3 and KTaO3 layers are much more weakly coupled than in the Λ = 6 superlattice. The hysteresis loop for the KTaO3 layers resembles the response of a poor ferroelectric; however, it was shown that it is actually the response of a paraelectric material under the combination of the applied electric field and the internal field produced by the polarized KNbO3 layers. The hysteretic behavior is, therefore, not an intrinsic property of the KTaO3 layer but arises from the switching of the KNbO3 layers under the large external electric field which, in turn, switches the sign of the internal field on the KTaO3 layers.
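The slice-averaged profiles of Fig. 7 amount to binning the cell-by-cell polarization along the modulation direction. A minimal sketch of that averaging is given below; the array names and binning scheme are illustrative, not taken from the original simulations.

```python
import numpy as np

def layer_polarization(z, pz, lz, n_slices):
    """Average cell-by-cell out-of-plane polarization over unit-cell-thick
    slices along the modulation direction.
    z  : (ncells,) z-coordinates of the unit-cell centers, 0 <= z < lz
    pz : (ncells,) polarization of each cell
    lz : supercell length along z; n_slices : number of slices"""
    edges = np.linspace(0.0, lz, n_slices + 1)
    idx = np.clip(np.digitize(z, edges) - 1, 0, n_slices - 1)
    return np.array([pz[idx == k].mean() for k in range(n_slices)])
```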
4. Nanostructures
The causes of size effects in ferroelectrics are numerous, and it is difficult to separate true size effects from other factors that change with film thickness or capacitor size, such as microstructure, defect chemistry, and electrode interactions. For this reason, atomic-level investigations play a crucial role in determining their intrinsic behavior. The anisotropic shell model for BaTiO3 was used to determine the critical thickness for ferroelectricity in a free-standing, stress-free BaTiO3 film (it was also shown that the model developed for the bulk material can also describe static surface properties [16] such as structural relaxations and surface energies, which are in quite good agreement with first-principles calculations). For this investigation a [001] TiO2-terminated slab was chosen. The equilibrated zero-temperature structure of the films was determined by a zero-temperature quench. The size and shape of the simulation cell were allowed to vary to reach zero stress. Shown in the top panel of Fig. 8 is the cell-by-cell polarization profile pz(z) at T = 0 K of a randomly chosen chain perpendicular to the film surface for various film thicknesses. It is clear from this figure that the film of 2.8 nm width does not display ferroelectricity. As a consequence of surface atomic relaxations, the two unit cells nearest to the surface develop a small polarization at both sides of the slab, pointing inwards towards the bulk, so the net chain polarization vanishes. For the cases of 3.6 nm and 4.4 nm film thickness, however, the chains develop a net out-of-plane polarization. Although these individual chains display a nonvanishing perpendicular polarization, the net out-of-plane polarization of the film is zero due to the development of stripe-like domains, as is shown in the bottom panel of Fig. 8. It was demonstrated that the strain effect produced by the presence of a substrate can lead to the stabilization of a polydomain ferroelectric state in films as thin as 2.0 nm [16].
Figure 8. Top panel: cell-by-cell out-of-plane polarization profile pz(z) (µC/cm²) of a randomly chosen chain perpendicular to the film surface for different slab thicknesses (d = 2.8, 3.6, and 4.4 nm). Bottom panel: top view of the out-of-plane polarization pattern for the case d = 4.4 nm showing stripe-like domains. A similar picture is obtained for d = 3.6 nm.
To investigate to what extent a decrease in lateral size will affect the ferroelectric properties of the film, the equilibrium atomic positions and local polarizations at T = 0 K for a stress-free cubic cell of 3.6 nm size were computed. The nanocell is constructed in such a way that the top and bottom faces (perpendicular to the z axis) are [001] TiO2 planes and its lateral faces (parallel to the z axis) are [100] BaO planes.
Shown in the top panel of Fig. 9 are the cell-by-cell polarization profiles pz (z) for three different chains along the z direction: one chain at an edge of the cell, one at the center of a face, and the last one inside the nanocell. It is clear from this figure that the total chain polarization at the edges and at the lateral faces is zero. The large local polarizations pointing in opposite directions, at both sides of the cell, are just a consequence of strong atomic relaxations at the nanocell surface. On the other hand, the chain inside the nanocell displays
Figure 9. Top panel: cell-by-cell polarization profiles pz(z) (µC/cm²) of three chosen chains in the nanocell (at an edge, at a face, and inside); the profile for the 3.6 nm slab is shown for comparison. Bottom panel: top view of the polarization pattern for the nanocell.
a net, nonvanishing polarization of ≈5 µC/cm². For comparison we have also plotted in Fig. 9 the pz(z) profile of the stress-free film of 3.6 nm width. We can clearly see that the two profiles are very similar. This is an indication that the decrease in lateral size does not affect the original ferroelectric properties of the thin film. As in the film case, the net polarization of the nanocell is zero due to the development of domains with opposite polarizations, as is shown in the bottom panel of Fig. 9. It was further demonstrated that a nanocell with different lateral faces, TiO2 planes instead of BaO planes, presents a different domain structure and polarization due to a strong surface effect [17].
5. Outlook
First-principles calculations of ferroelectric materials can answer some important questions directly, but this approach by itself cannot address the most challenging materials-related and microstructure-related problems. Fortunately, first-principles methods can provide benchmarks for the validation of other, conceptually less sophisticated approaches that, because of their low computational loads, can address such issues. The atomistic approach presented here demonstrates that enough of the electronic effects associated with ferroelectricity can be mimicked at the atomic level to allow the fundamentals of ferroelectric behavior to be reproduced. Moreover, the interatomic potential approach, firmly grounded by having its parameters fitted to first-principles calculations, will be a very useful tool for the theoretical design of new materials for specific target applications. One important challenge in this field is the simulation of technologically important solid solutions which are more complex than the ones discussed here; for example, PbZrxTi1−xO3 (PZT) and PbMg1/3Nb2/3O3–PbTiO3 (PMN-PT), the latter a single-crystal piezoelectric with giant electromechanical coupling. The difficult point here is the development of interatomic potentials suitable for such investigations. The simultaneous fitting of transferable potentials for the different pure materials is one way to develop interatomic potentials for the solid solutions. This could be done by using an extensive first-principles database to adjust the potential parameters. Although the methodology presented here is computationally efficient enough to allow materials problems to be addressed, clearly there is much work to do in order to achieve a closer coupling with experiment. Real ferroelectric materials are frequently ceramics, and a critical role is often played by grain boundaries, impurities, surfaces, dislocations, domain walls, etc. Among the critical issues that atomic-level simulation should be able to address are the microscopic processes associated with ferroelectric switching by domain-wall motion and the coupling of ferroelectricity and microstructure in such ceramics. There are exciting challenges in the simulation of ferroelectric
device structures. However, since such structures can involve ferroelectrics, electrodes (metallic or conducting oxide) and semiconductors, atomic-level methods capable of simulating such chemically diverse materials will have to be developed; this is an exciting challenge for the future.
Acknowledgments We would like to thank S. Tinte, D. Wolf, and R.L. Migoni, who collaborated in the work described in this review.
References

[1] M.E. Lines and A.M. Glass, Principles and Applications of Ferroelectrics and Related Materials, Clarendon Press, Oxford, 1977.
[2] A.F. Devonshire, “Theory of ferroelectrics,” Phil. Mag., (Suppl.) 3, 85, 1954.
[3] D. Vanderbilt, “First-principles based modelling of ferroelectrics,” Current Opinion in Sol. Stat. Mater. Sci., 2, 701–705, 1997.
[4] R. Cohen, “Theory of ferroelectrics: a vision for the next decade and beyond,” J. Phys. Chem. Sol., 61, 139–146, 2000.
[5] G.V. Lewis and C.R.A. Catlow, “Potential model for ionic oxides,” J. Phys. C, 18, 1149–1161, 1985.
[6] R. Migoni, H. Bilz, and D. Bäuerle, “Origin of Raman scattering and ferroelectricity in oxide perovskites,” Phys. Rev. Lett., 37, 1155–1158, 1976.
[7] H. Donnerberg and M. Exner, “Derivation and application of ab initio Nb5+–O2− short-range effective pair potentials in shell-model simulations of KNbO3 and KTaO3,” Phys. Rev. B, 49, 3746–3754, 1994.
[8] F. Jona and G. Shirane, Ferroelectric Crystals, Dover Publications, New York, 1993.
[9] S. Tinte, M.G. Stachiotti, M. Sepliarsky, R.L. Migoni, and C.O. Rodriguez, “Atomistic modelling of BaTiO3 based on first-principles calculations,” J. Phys.: Condens. Matter, 11, 9679–9690, 1999.
[10] P.H. Ghosez, E. Cockayne, U.V. Waghmare, and K.M. Rabe, “Lattice dynamics of BaTiO3, PbTiO3 and PbZrO3: a comparative first-principles study,” Phys. Rev. B, 60, 836–843, 1999.
[11] S. Tinte, J. Iniguez, K. Rabe, and D. Vanderbilt, “Quantitative analysis of the first-principles effective Hamiltonian approach to ferroelectric perovskites,” Phys. Rev. B, 67, 064106, 2003.
[12] M. Sepliarsky, S.R. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Atomic-level simulation of ferroelectricity in perovskite solid solutions,” Appl. Phys. Lett., 76, 3986–3988, 2000.
[13] S. Tinte, M.G. Stachiotti, S.R. Phillpot, M. Sepliarsky, D. Wolf, and R.L. Migoni, “Ferroelectric properties of BaxSr1−xTiO3 solid solutions by molecular dynamics simulation,” J. Phys.: Condens. Matter, 16, 3495–3506, 2004.
[14] M. Sepliarsky, S. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Long-ranged ferroelectric interactions in perovskite superlattices,” Phys. Rev. B, 64, 060101(R), 2001.
[15] M. Sepliarsky, S. Phillpot, D. Wolf, M.G. Stachiotti, and R.L. Migoni, “Ferroelectric properties of KNbO3/KTaO3 superlattices by atomic-level simulation,” J. Appl. Phys., 90, 4509–4519, 2001.
[16] S. Tinte and M.G. Stachiotti, “Surface effects and ferroelectric phase transitions in BaTiO3 ultrathin films,” Phys. Rev. B, 64, 235403, 2001.
[17] M.G. Stachiotti, “Ferroelectricity in BaTiO3 nanoscopic structures,” Appl. Phys. Lett., 84, 251–253, 2004.
[18] G.V. Lewis and C.R.A. Catlow, “Defect studies of doped and undoped barium titanate using computer simulation techniques,” J. Phys. Chem. Sol., 47, 89–97, 1986.
2.7 ENERGY MINIMIZATION TECHNIQUES IN MATERIALS MODELING
C.R.A. Catlow1,2
1 Davy Faraday Laboratory, The Royal Institution, 21 Albemarle Street, London W1S 4BS, UK
2 Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
1. Introduction
Energy minimization is one of the simplest but most widely applied of modeling procedures; indeed, its applications have ranged from biomolecular systems to superconducting oxides. Moreover, minimization is often the first stage in any modeling procedure. In this section, we review the basic concepts and techniques before providing a number of topical examples. We aim to show both the wide scope of the method and its limitations.
2. Basics and Definitions
The conceptual basis of energy minimization (EM) is simple: an energy function E(r1, . . . , rN) is minimized with respect to the nuclear coordinates ri (or combinations of these) of a system of N atoms, which may be a molecule or cluster, or a system with 1, 2 or 3D periodicity; in the latter cases, the minimization may be applied to the lattice parameter(s) in addition to the coordinates of the atoms within the repeat unit. E may be calculated using a quantum mechanical method, although the term energy minimization is often associated with interatomic potential methods or some simpler procedures. The term “molecular mechanics” is essentially synonymous but refers to applications to molecular systems. The term “static lattice” methods is also widely used and normally implies a minimization procedure followed by the calculation of properties of the minimized configuration. EM methods may be extended to “free energy minimization” if the entropy contribution can be calculated
by configurational or by molecular or lattice dynamical procedures. But by definition, EM excludes any explicit treatment of thermal motions. EM methods normally involve the specification of a “starting point” or initial configuration and the subsequent application of a numerical algorithm to locate the nearest local minimum, from which there arises possibly the most fundamental limitation of the approach, i.e., the “local minimum” problem: minimization can never be guaranteed to find the global minimum of an energy (or any other) function, and straightforward implementations of the method are essentially refinements of approximately known structures. Indeed, for many complex systems, e.g., protein structures, unless the starting configuration is very close to the global minimum, a local minimum will invariably be generated by minimization. Procedures for attempting to identify global minima will be discussed later in the section. Although minimization by definition excludes dynamical effects, it is possible to apply the technique to rate processes (e.g., diffusion and reaction) using methods based on Absolute Rate Theory, in which rates (ν) are calculated according to the expression:

$$\nu = \nu_0 \exp\left(-G^{\mathrm{ACT}}/kT\right) \qquad (1)$$
where the pre-exponential factor ν0 may be loosely related to a vibrational frequency and G^ACT refers to the free energy of activation of the process, i.e., the difference between the free energy of the transition state for the process and that of the ground state of the system. If the transition states can be located via some search procedure (or can be postulated from symmetry or other considerations), then the activation energy and (much less commonly) the activation free energy may be calculated. Such procedures have been widely used in modeling atomic transport in solids. In Section 2.1, we first consider the types of energy function employed; the methods used to identify minima are then discussed, followed by a more detailed survey of methodologies. Recent applications are reviewed in the final sub-section. In all cases, the emphasis is on applications to materials, but many of the considerations apply generally to atomistic modeling.
2.1. Energy Functions
As noted earlier, minimization may be applied to any energy function that may be calculated as a function of nuclear coordinates. In atomistic simulation studies, three types of energy function may be identified: (i) Quantum mechanically evaluated energies, where essentially we use the energy calculated by solving the Schrödinger equation at some level of approximation. Extensive discussions of such methods are, of course, available elsewhere in this volume.
(ii) Interatomic potential based energy functions. Here we use interatomic potentials to calculate the total energy of the system with respect to component atoms (i.e., the cohesive energy) or ions (the lattice energy), i.e.,

$$E = \frac{1}{2}\sum_{i}^{N}\sum_{j \neq i}^{N} V^{2}_{ij}(r_{ij}) + \frac{1}{3}\sum_{i}^{N}\sum_{j \neq i}^{N}\sum_{k \neq j \neq i}^{N} V^{3}_{ijk}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k) + \cdots \qquad (2)$$

where the V²_ij are the pair potential components, V³_ijk the three-body terms, and the series continues in principle to higher-order terms. The sum is over all N atoms in the system, but would normally be terminated beyond a “cut-off” distance (although note the case of the electrostatic term discussed later). In a high proportion of calculations (especially on non-metallic systems) only the two-body term is included, which allows the energy, E, for periodic systems to be written as:

$$E = \frac{1}{2}\sum_{i=1}^{N_c}\sum_{j \neq i}^{N_{\mathrm{cut}}} V_{ij}(r_{ij}) \qquad (3)$$

where the first summation refers to all atoms in the unit cell, and interactions with all other atoms are summed up to the specified cut-off. It is common to separate off the electrostatic contributions to V_ij, i.e.,

$$V_{ij}(r_{ij}) = \frac{q_i q_j}{r_{ij}} + V^{\mathrm{SR}}_{ij}(r_{ij}) \qquad (4)$$

where q_i and q_j are atomic or ion charges and V^SR is the remaining, “short-range” component of the potential. This allows us to write:

$$E = E_c + \sum_{i=1}^{N_c}\sum_{j \neq i}^{N_{\mathrm{cut}}} V^{\mathrm{SR}}_{ij}(r_{ij}) \qquad (5)$$

where E_c is the Coulomb term, obtained by summing the r⁻¹ terms, which should not be truncated in any accurate calculation. The short-range terms can, however, usually be safely truncated at a distance of 10–20 Å. The summation of the electrostatic term must be carefully undertaken, as it may be conditionally convergent if handled in real space. The most widely used procedure rests on the work of Ewald (see, e.g., [1]), which obtains rapid convergence by a partial transformation into reciprocal space. The procedure has been very extensively used, and for applications to materials we refer to the articles in Ref. [2].
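A minimal sketch of the short-range part of Eq. (5) is given below: it sums a Buckingham potential over periodic images within the cut-off. The Coulomb term E_c is deliberately omitted, since, as noted above, it requires an Ewald summation; the image-cell bound nmax is a crude heuristic that assumes a reasonably regular cell, and the species-pair keys of the potential table are assumed to be stored symmetrically.

```python
import numpy as np
from itertools import product

def short_range_lattice_energy(cell, frac, species, buck, rcut=10.0):
    """Short-range lattice energy per unit cell (Eq. (5), without E_c).
    cell : (3, 3) lattice vectors as rows (Angstrom)
    frac : (N, 3) fractional coordinates; species : list of N labels
    buck : dict (label_i, label_j) -> (a, rho, c); both key orders present"""
    cart = frac @ cell
    # enough image cells to cover the cutoff (crude bound for regular cells)
    nmax = int(np.ceil(rcut / min(np.linalg.norm(cell, axis=1)))) + 1
    e = 0.0
    for i, ri in enumerate(cart):
        for na, nb, nc in product(range(-nmax, nmax + 1), repeat=3):
            shift = np.array([na, nb, nc]) @ cell
            for j, rj in enumerate(cart):
                if (na, nb, nc) == (0, 0, 0) and i == j:
                    continue  # no self-interaction in the home cell
                r = np.linalg.norm(rj + shift - ri)
                key = (species[i], species[j])
                if r < rcut and key in buck:
                    a, rho, c = buck[key]
                    # factor 1/2 corrects the double counting of each pair
                    e += 0.5 * (a * np.exp(-r / rho) - c / r**6)
    return e
```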
2.2. Other Functions
In some cases, a simple “cost function” based on geometrical criteria rather than energies may be used. For example, the distance least squares (DLS) approach [3] is based on minimization of a cost function obtained by summing the squares of the differences between calculated and “standard” bond lengths for a structure. More complex cost functions include deviations between calculated and specified coordination numbers. We have also noted earlier that, if entropy terms can be estimated, energy minimization can be extended to free energy minimization. Such extensions will be discussed in detail for the case of periodic lattices.
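A DLS-type cost function is only a few lines; the sketch below, with hypothetical bond-type labels and target values, sums the squared deviations of calculated bond lengths from their standard values.

```python
def dls_cost(bonds, standard):
    """Distance least squares cost: sum of squared deviations between the
    calculated bond lengths and their 'standard' target values.
    bonds : iterable of (bond_type, calculated_length)
    standard : dict bond_type -> target length"""
    return sum((r - standard[t]) ** 2 for t, r in bonds)

# e.g., for a silicate framework (illustrative values):
# dls_cost([("Si-O", 1.63), ("Si-O", 1.59)], {"Si-O": 1.61})
```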
2.3. Identification of Minima
We recall that standard minimization methods aim to identify the energy minimum starting from a specified initial configuration, using algorithms which will be discussed later. And as argued earlier, it is impossible ever to guarantee that a global minimum has been achieved. However, a number of procedures are available to mitigate the effects of the local minimum problem, the two main classes being: (i) Simulated Annealing (SA), where the approach is to use molecular dynamics (MD) or Monte Carlo (MC) simulations of systems initially at high temperature, thereby allowing the system to explore the potential energy surface and escape from local minima into the global minimum region. The normal procedure is to “cool” the system during the course of the simulation, which usually concludes with a standard minimization. SA has been used successfully and predictively in a number of cases in crystal structure modeling. If used carefully and appropriately, the method offers a good probability of identifying the global minimum; but there always remains a distinct possibility that the simulation will fail to locate regions of configurational space close to the global minimum, especially if there are substantial energy barriers between this and other regions. (A minimal SA sketch is given after this list.) (ii) Genetic Algorithm (GA) methods, which have been widely used in optimization studies, and where the approach is fundamentally different from SA. Instead of one starting point, there are many, which may simply be different random arrangements of atoms (with some overall constraint such as unit-cell dimensions). A cost function is specified and is evaluated for each configuration. The population of configurations then evolves through successive generations. The “breeding” process involves exchange of features between different members of the population and is driven so as to generate a population with a low cost function.
At the end of the procedure, selected members of the population are subjected to energy minimization, giving a range of minimum structures from which the lowest energy one may be selected. GA methods again offer no guarantee that the global minimum has been located. Their particular merit is that they use a variety of initial configurations, rather than one as in SA. However, both approaches unquestionably have their value. A good account of the application of the GA method to periodic solids is given in Ref. [4].
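The SA sketch promised above is given here: a Metropolis Monte Carlo loop with a geometrically decreasing temperature (expressed in energy units, so that kT is absorbed into t). The energy and perturb callables, and the schedule parameters, are user-supplied assumptions; a standard local minimization would normally be applied to the returned configuration.

```python
import math
import random

def simulated_annealing(energy, perturb, x0, t_start=1000.0, t_end=1.0,
                        cooling=0.99, steps_per_t=100, seed=0):
    """Simulated annealing: Metropolis MC with a decreasing temperature.
    energy(x) -> float; perturb(x, rng) -> trial configuration."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            xn = perturb(x, rng)
            en = energy(xn)
            # accept downhill moves always, uphill with Boltzmann probability
            if en < e or rng.random() < math.exp(-(en - e) / t):
                x, e = xn, en
        t *= cooling  # cool the system
    return x, e
```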
3. Methodologies
Minimization methods may be applied to periodic lattices, to defects within lattices, to surfaces and to clusters. The methodological aspects are similar in all these different areas. In this section, we pay the greatest attention to perfect lattice minimization. The field of defect calculations is reviewed in Chapter 6.4.
3.1. Perfect Lattice Calculations
The first objective here is to calculate the lattice energy, in which the summation in Eq. (2) is taken over all atoms/ions in the unit cell interacting with all other species. The calculation is tractable via the use of the Ewald summation for the Coulombic terms and the cut-off for the short-range interactions. We note that the great majority of lattice energy calculations only include the two-body contribution to the short-range energy. One important matter of definition is that the lattice energy gives the energy of the crystal with respect to component ions at infinity. If it is desired to express the energy with respect to atoms at infinity (for which the more appropriate term is then the cohesive energy), then the appropriate ionization energies and electron affinities must be added. Lattice energy calculations are now routine, and may be carried out for very large unit cells containing several hundred atoms. The codes METAPOCS, THBREL and GULP undertake lattice energy calculations including both two- and three-body terms, using both bond-bending and triple-dipole formalisms. Lattice energy calculations provide valuable insight into the structures and stabilities of ionic and semi-ionic solids. The technique is most powerful when combined with energy minimization procedures, which generate the structure of minimum energy; these are discussed later, after the calculation of entropies has been described. The results in Table 1 give a good illustration of the value of lattice energy studies. They are the energy-minimum lattice energies calculated for a number of purely siliceous microporous zeolitic structures, which
Table 1. Relative energies (per mol) of microporous siliceous structures with respect to quartz (after Ref. [5])

Structure     Energy (kJ/mol)
Silicalite    11.2
Mordenite     20.52
Faujasite     21.4
are compared with the lattice energy of α-SiO2. The latter has the lowest value, as would indeed be expected since the more porous structures are known to be metastable with respect to the dense α-SiO2 polymorph. Of greater interest is the observation that, of the porous structures, silicalite has the greatest stability. This accords with the fact that this polymorph can only be prepared as a highly siliceous compound, unlike the other zeolitic structures, which are normally synthesized with high aluminium contents. The calculations, which are discussed in greater detail by Ooms et al. [5], suggest that this behavior has its origin at least in part in the thermodynamic stability of the compounds. We note that more recently very similar results were obtained by Henson et al. [6], who also showed that the calculated values were in excellent agreement with experiment. In addition to calculating energies, it is also possible to calculate routinely a range of crystal properties, including the lattice stability, the elastic, dielectric and piezoelectric constants, and the phonon dispersion curves. The techniques used, which are quite standard, require knowledge of both first and second derivatives of the energy with respect to the atomic coordinates. Indeed it is useful to define two quantities: first the vector g, whose components g_α^i are defined as:

$$g_{\alpha}^{i} = \frac{\partial E}{\partial x_{\alpha}^{i}} \qquad (6)$$

i.e., the first derivative of the lattice energy with respect to a given Cartesian coordinate (α) of the ith atom. The second derivative matrix W has components W_{ij}^{αβ}, defined by:

$$W_{ij}^{\alpha\beta} = \frac{\partial^{2} E}{\partial x_{\alpha}^{i}\, \partial x_{\beta}^{j}} \qquad (7)$$
The expressions used in calculating the properties referred to above from these derivatives are discussed in greater detail in Refs. [2] and [7]. For more detailed discussions of the calculation of phonon dispersion curves from the second derivative or “dynamical” matrix W , the reader should consult [8] and
Parker and Price [9]. Finally, we note that by the term “lattice stability” we refer to the equilibrium conditions both for the atoms within the unit cell and for the unit cell as a whole. The former are available from the gradient vector g, while the latter are described in terms of the six components ε1 . . . ε6 which define the strain matrix ε, where

$$\varepsilon = \begin{pmatrix} \varepsilon_1 & \tfrac{1}{2}\varepsilon_4 & \tfrac{1}{2}\varepsilon_5 \\ \tfrac{1}{2}\varepsilon_4 & \varepsilon_2 & \tfrac{1}{2}\varepsilon_6 \\ \tfrac{1}{2}\varepsilon_5 & \tfrac{1}{2}\varepsilon_6 & \varepsilon_3 \end{pmatrix} \qquad (8)$$

So when the unit cell as a whole is strained, we describe the modification of an arbitrary vector r in the unstrained lattice to a vector r′ in the strained lattice using the equation:

$$\mathbf{r}' = (\mathbf{1} + \varepsilon)\,\mathbf{r} \qquad (9)$$

where 1 is the unit matrix. The six derivatives of the energy with respect to strain, ∂E/∂ε_i, therefore measure the forces acting on the unit cell. The equilibrium condition for the crystal therefore requires that g = 0 and ∂E/∂ε_i = 0 for all i.
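For small systems, g and W of Eqs. (6) and (7) can be checked numerically; the sketch below does this by central finite differences. Analytic derivatives are what production codes actually use, and the step size h is an illustrative choice.

```python
import numpy as np

def gradient_and_hessian(energy, coords, h=1e-4):
    """Central-difference estimates of the gradient vector g (Eq. (6)) and
    second-derivative matrix W (Eq. (7)) for energy(coords), coords (N, 3)."""
    x = coords.ravel().astype(float)
    n = x.size
    g = np.zeros(n)
    w = np.zeros((n, n))
    e = lambda v: energy(v.reshape(coords.shape))
    e0 = e(x)
    for i in range(n):
        xp, xm = x.copy(), x.copy()
        xp[i] += h; xm[i] -= h
        g[i] = (e(xp) - e(xm)) / (2 * h)                   # Eq. (6)
        w[i, i] = (e(xp) - 2 * e0 + e(xm)) / h**2          # diagonal of Eq. (7)
    for i in range(n):
        for j in range(i + 1, n):                          # off-diagonal terms
            xpp = x.copy(); xpp[[i, j]] += h
            xpm = x.copy(); xpm[i] += h; xpm[j] -= h
            xmp = x.copy(); xmp[i] -= h; xmp[j] += h
            xmm = x.copy(); xmm[[i, j]] -= h
            w[i, j] = w[j, i] = (e(xpp) - e(xpm) - e(xmp) + e(xmm)) / (4 * h**2)
    return g, w
```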
3.2. Entropy Calculations
The entropy of a solid arises first from configurational terms, which for a perfect solid are zero; for a solid showing orientational or translational disorder, configurational expressions based on the Boltzmann expression S = k ln(W) may be used. In this section we shall pay more attention to the second term, which is due to the population of the vibrational degrees of freedom of the solid. Thus the vibrational entropy of a solid may be written as:

$$S_{\mathrm{vib}} = k \int_{0}^{Q} \mathrm{d}Q \sum_{i} \left\{ \frac{h\nu_i}{kT}\left[\exp\!\left(\frac{h\nu_i}{kT}\right) - 1\right]^{-1} - \ln\!\left[1 - \exp\!\left(\frac{-h\nu_i}{kT}\right)\right] \right\} \qquad (10)$$
where the sum is over all phonon frequencies and the integral is over the Brillouin zone. In practice the integral is normally evaluated by sampling over the zone, for which a variety of techniques are available. Vibrational terms also give a contribution to the lattice energy of the crystal:

$$E_{\mathrm{vib}} = kT \int_{0}^{Q} \mathrm{d}Q \sum_{i} \left\{ \frac{h\nu_i}{2kT} + \frac{h\nu_i}{kT}\left[\exp\!\left(\frac{h\nu_i}{kT}\right) - 1\right]^{-1} \right\} \qquad (11)$$
which results in the following expression for the crystal free energy with respect to ions at rest at infinity:

$$F = E + kT \int_{0}^{Q} \mathrm{d}Q \sum_{i} \left\{ \frac{h\nu_i}{2kT} + \ln\!\left[1 - \exp\!\left(\frac{-h\nu_i}{kT}\right)\right] \right\} \qquad (12)$$
where E is the lattice energy (omitting vibrational terms).
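Equations (10)-(12) are straightforward to evaluate once phonon frequencies are available on a sampled grid of q-points; a minimal sketch follows, with the Brillouin-zone integral replaced by a weighted sum. Frequencies are assumed to be in Hz and energies come out in eV; the constants are rounded values.

```python
import numpy as np

KB = 8.617333e-5   # Boltzmann constant, eV/K
H = 4.135668e-15   # Planck constant, eV*s

def vibrational_terms(freqs_hz, weights, temp):
    """Evaluate Eqs. (10)-(12) by weighted q-point sampling.
    freqs_hz : (nq, nmodes) phonon frequencies at the sampled q-points
    weights  : (nq,) q-point weights summing to 1 (the zone integral)
    Returns (S_vib in eV/K, E_vib in eV, F_vib = E_vib - T*S_vib in eV)."""
    x = H * np.asarray(freqs_hz) / (KB * temp)          # h*nu / kT
    s_q = KB * np.sum(x / np.expm1(x) - np.log1p(-np.exp(-x)), axis=1)  # Eq. (10)
    e_q = KB * temp * np.sum(0.5 * x + x / np.expm1(x), axis=1)         # Eq. (11)
    s = np.dot(weights, s_q)
    e = np.dot(weights, e_q)
    return s, e, e - temp * s   # the last entry is the vibrational part of Eq. (12)
```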
3.3. Energy Minimization
Having evaluated energies and free energies of a crystal structure, we are now able to implement these in an energy (or free energy) minimization procedure. Let us consider first the simple case of minimization at constant volume (i.e., within fixed cell dimensions). We write the energy of the crystal as a Taylor expansion in the displacements of the atoms, δ, from the current configuration, giving:

$$E(\delta) = E_0 + g\,\delta + \tfrac{1}{2}\,\delta\, W \delta + \cdots \qquad (13)$$

If we terminate this expansion at the second-order term and minimize E with respect to δ, we obtain for the energy minimum:

$$0 = g + W\delta, \quad \text{i.e.,} \quad \delta = -g\,W^{-1} \qquad (14)$$
Displacement of the coordinates by δ as given in Eq. (14) will generate the energy-minimum configuration. Of course, in practice, it will not be valid to truncate the expansion at the quadratic term, except very close to the minimum. However, Eq. (14) provides the basis of an effective iterative procedure for attaining the minimum. Indeed this Newton–Raphson method is widely used in both perfect and defect lattice energy minimization, as it is generally rapidly convergent. Its main disadvantage is that it requires the calculation, inversion and storage of the second derivative matrix, W. Recalculation and inversion at each iteration may be avoided by use of updating procedures (see, e.g., [10]). The storage problem may become serious with very large structures owing to the high memory requirements. Recourse may be made to gradient methods, e.g., the well-known conjugate gradients technique, which make use only of first derivatives. Such methods are, however, more slowly convergent. The increasing availability of very large memories is, however, reducing the difficulties associated with the storage of the W matrix.
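A minimal Newton-Raphson loop implementing Eq. (14) is sketched below. Following the remarks above on inversion cost, it solves the linear system rather than forming W⁻¹ explicitly; grad and hess are user-supplied callables returning g and W, and the convergence threshold is an illustrative choice.

```python
import numpy as np

def newton_raphson_minimize(grad, hess, x0, tol=1e-8, max_iter=50):
    """Iterate Eq. (14): delta = -W^{-1} g, updating the coordinates until
    the gradient norm falls below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # solve W delta = -g instead of inverting W explicitly
        delta = np.linalg.solve(hess(x), -g)
        x = x + delta
    return x
```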
For evaluation of the energy minimum with respect to constant pressure (i.e., with variable cell dimensions), we first note that we can define the six components of the mechanical pressure acting on the solid, corresponding to the six strain components defined in Eq. (8), i.e.,

$$P_{\varepsilon_i} = \frac{1}{V}\frac{\mathrm{d}U}{\mathrm{d}\varepsilon_i} \qquad (15)$$
where V is the unit cell volume. The strains can then be evaluated, using Hooke’s law, ε = PC −1
(16)
where C is the (6 × 6) elastic constant tensor, which may be calculated from W. The strain components obtained from Eq. (16) then yield the new cell dimensions and atomic coordinates. Again, the procedure is iterative, as it is only strictly valid in the region of applicability of the harmonic approximation. With a sensible starting point, however, only a small number of iterations (typically 2–5) is required.

The treatment above assumes that the pressure and corresponding strains are entirely mechanical in origin. However, at finite temperatures there will be a "kinetic pressure" arising from the changes in the vibrational free energy with volume. These may be written as:

P_vib^{ε_i} = (1/V) dF_vib/dε_i   (17)
where F_vib is the vibrational free energy. These kinetic pressures are most simply evaluated by applying small arbitrary strains to the structure and calculating the corresponding changes in F_vib. If P_vib is added to the mechanical pressure P in Eq. (15), it enables us to carry out free energy minimization (see, e.g., [11]). A general computer code, PARAPOCS, is available for such calculations, and the same functionality is available in the GULP code [12]. A detailed discussion is given by Parker and Price [9] and Watson et al. [11], who also describe how the techniques may be used to calculate lattice expansivity, either directly, by calculating the cell dimensions as a function of temperature, or by calculation of the thermal Grüneisen parameter.
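The Newton–Raphson iteration of Eq. (14) is simple enough to sketch in a few lines. The fragment below (Python with NumPy) is a minimal illustration only; the energy_gradient and hessian callables stand for whatever lattice-energy model is in use, and no updating scheme for W is shown.

import numpy as np

def newton_raphson(x0, energy_gradient, hessian, gtol=1e-8, max_iter=50):
    # x0:              starting coordinates (flattened)
    # energy_gradient: function returning g = dE/dx
    # hessian:         function returning the second-derivative matrix W
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        g = energy_gradient(x)
        if np.linalg.norm(g) < gtol:    # converged: g = 0 at the minimum
            break
        W = hessian(x)
        delta = np.linalg.solve(W, -g)  # Eq. (14), without forming W^-1 explicitly
        x += delta                      # displace toward the minimum
    return x

In practice, as noted above, quasi-Newton updating schemes would replace the explicit recalculation of W at each iteration.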
3.4. Surface Simulations
The procedures here are closely related to those employed in perfect lattice calculations but adapted to 2D periodicity. The most widely used procedure is that pioneered by Tasker et al. [13], in which a slab is taken and divided into
two regions. Full minimization is undertaken on the upper region which represents the relaxed surface structure and which is embedded in a rigid representation of the underlying lattice. The Ewald summation must be adapted for 2D periodicity using the formalism developed by Parry [14]. Surface simulations have been widely and successfully applied especially to the surfaces of ionic materials, and a number of standard codes are available, e.g., METADISE and MARVIN. The methods may also be readily adapted to study interfaces and other 2D periodic systems such as grain boundaries as will be discussed later in this chapter.
3.5. Defect and Cluster Calculations
Defect simulations, as discussed in detail in Chapter 6.4, proceed by relaxation of an atomistically represented region of lattice which is embedded in a more approximate representation of the more distant regions of the lattice, whose dielectric and/or elastic response to the defect is calculated. An increasingly widely used extension of the procedure is to describe the immediate environment of the defect (the defect itself and a small number of surrounding coordination shells) quantum mechanically. The detailed discussion of such "embedded cluster" methods is beyond the scope of the present chapter; a recent review is available in Ref. [15].

Minimization of the energy of clusters is, of course, conceptually straightforward. Minimization algorithms are applied to the cluster energy (or free energy) obtained by direct summation. Considerable attention has been paid in this field to the use of global optimization techniques owing to the prevalence of multiple minima. A recent review of cluster simulations is available in Ref. [16].
4. Discussion and Applications
Minimization methods have been extensively applied to metals, ceramics, silicates, semiconductors and molecular materials. In this section we provide topical examples that illustrate the current capabilities of the techniques.
4.1. Predictions of the Structures of Microporous Materials
Microporous materials have been widely investigated over the last 50 years owing to their extensive range of applications in catalysis, gas separation and
ion exchange. Zeolites (originally observed as minerals, but now extensively available as synthetic materials) are all-silica or aluminosilicate materials, based on fully corner-shared networks of SiO₄ and AlO₄ tetrahedra, but with structures that contain channels, pores and voids of molecular dimensions; pore sizes are typically in the range 5–15 Å. The aluminosilicate materials contain exchangeable cations, while the microporous structures give rise to the applications in molecular sieving and sorption. Exchange of protons into the materials creates acid sites which promote catalytic reactions including cracking, isomerization and hydrocarbon synthesis, while metal ions in both framework and extra-framework locations can act as active sites for partial oxidation reactions.

Modeling techniques have been applied extensively and successfully to the study of microporous materials (see, e.g., the books edited by Catlow [17] and Catlow et al. [18]), and there have been a number of successful applications of minimization techniques to the accurate, and indeed predictive, modeling of microporous structures. Here we highlight a recent significant development, namely the prediction of new hypothetical structures.

There have been many attempts to predict new microporous structures, most of which have rested on the fact that the very definition of these materials is based on geometry, rather than on precise chemical composition, occurrence or function. In order to be considered as a zeolite, or zeolite-type material (zeotype), a mineral or synthetic material must possess a 3D four-connected inorganic framework, i.e., a framework consisting of tetrahedra which are all corner-sharing. There is an additional criterion that the framework should enclose pores or cavities which are able to accommodate sorbed molecules or exchangeable cations, which leads to the exclusion of denser phases. Topologically, the zeolite frameworks may thus be thought of as four-connected nets, where each vertex is connected to its four closest neighbours. So far 139 zeolite framework types are known, either from the structures of natural minerals or from synthetically produced inorganic materials.

In enumerating microporous structures, a number of fruitful approaches have been developed. Some have involved the decomposition of existing structures into their various structural subunits, and then recombining these in such ways as to generate novel frameworks. Methods which involve combinatorial, or systematic, searches of phase space have also been successfully deployed. Recently, an approach based on mathematical tiling theory has also been reported [19]. It was established that there are exactly 9, 117 and 926 topological types of four-connected uninodal (i.e., containing one topologically distinct type of vertex), binodal and trinodal networks, respectively, derived from simple tilings (tilings with vertex figures which are tetrahedra), and at least 145 additional uninodal networks derived from quasi-simple tilings (the vertex figures of which are derived from tetrahedra, but contain double edges). In principle, the tiling
approach offers a complete solution to the problem of framework enumeration, although the number of possible nets is infinite. Potentially, therefore, we may be able to generate an unlimited number of possible zeolitic frameworks. Of these, only a portion is likely to be of interest as having desirable properties, with an even smaller fraction being amenable to synthesis in any given composition. It is this last problem, the feasibility of hypothetical frameworks, which is the key question in any analysis of such structures. The answer is not a simple one, since the factors which govern the synthesis of such materials are not fully understood. As discussed earlier, zeolites are metastable materials. Aside from this thermodynamic constraint, the precise identity of the phase or phases formed during hydrothermal synthesis is said to be under "kinetic control," although there is increasing sophistication in targeting certain types of framework using various templating methods, fluoride media and other synthesis parameters. Additionally, certain structural motifs are more likely to form within certain compositions, e.g., double four-rings in germanates, three-rings in beryllium-containing compounds and so on. A full characterization of any hypothetical zeolite must therefore include an analysis of framework topology and of the types of building unit present, as well as some estimate of the thermodynamic stability of the framework. Using an appropriate potential model, lattice energy minimization can, as shown above, provide a very good measure of this stability as well as optimizing structures to a high degree of accuracy.

In the method adopted by Foster et al. [20], networks derived from tiling theory were first transformed into "virtual zeolites" of composition SiO₂ by placing silicon atoms at the vertices of the nets, and bridging oxygens at the midpoints of connecting edges. The structures were then refined using the geometry-based DLS procedure, referred to above, before final optimization by lattice energy minimization. Among the 150 or so uninodal structures examined, all 18 known uninodal zeolite frameworks were found. Moreover, most of the unknown frameworks had been described by previous authors; in fact there is a considerable degree of overlap between sets of uninodal structures generated by different methods. Most of the binodal and trinodal structures, however, are completely new. Using calculated lattice energy as an initial measure of feasibility, a number of the more interesting structures are shown in Fig. 1. The challenge is now to synthesize these structures.
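The decoration step of this procedure is easily expressed in code. The sketch below (Python; the vertices and edges arrays are hypothetical inputs from a tiling enumeration) merely places Si atoms on the vertices of a net and bridging O atoms at edge midpoints; the subsequent DLS refinement and lattice energy minimization are not shown.

import numpy as np

def decorate_net(vertices, edges):
    # vertices: (n, 3) fractional coordinates of a four-connected net
    # edges:    list of (i, j) index pairs defining connected vertices
    atoms = [("Si", tuple(v)) for v in vertices]
    for i, j in edges:
        vi, vj = np.asarray(vertices[i]), np.asarray(vertices[j])
        # midpoint of the connecting edge; edges crossing the cell
        # boundary would additionally need a minimum-image shift
        atoms.append(("O", tuple((0.5 * (vi + vj)) % 1.0)))
    return atoms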
4.2. Grain Boundary Structures in Mantle Minerals
Grain boundaries are known to be a major factor controlling the mechanical and rheological properties of materials. Detailed knowledge of their structures is, however, limited. Simulation methods have made a major contribution over
Figure 1. Illustrations of feasible uninodal zeolite structures generated by tiling theory and modeled using lattice energy minimization (panels labeled detl_14, detl_19, detl_11, delt_71 and delt_35).
the past 20 years in developing models for grain boundaries, as in the work of Keblinski et al. [21] on metal systems and Duffy, Harding and Stoneham [22] on ionic systems. Recent work has explored grain boundary properties in the Mantle mineral forsterite, Mg₂SiO₄, a member of the olivine group of minerals, which comprise a major proportion of the upper part of the Earth's Mantle. Knowledge of the grain boundary structure of this material is vital for developing an improved
understanding of the rheology of the Mantle. Modeling boundaries in this material, however, presents substantial challenges owing to the complexity of the crystal structure. The recent work of de Leeuw et al. [23] investigated this problem using static lattice simulation techniques. They modeled the forsterite grain boundaries using empirical potential models for SiO₂ and MgO. Atomistic simulation techniques are appropriate for these calculations because they are capable of modeling systems consisting of large numbers of ions, which is necessary when modeling grain boundaries, as shown in many studies. Energy minimization techniques were used to investigate the structure and stability of the grain boundaries and the interactions between the lattice ions at the boundaries and adsorbed species, such as protons and dissociated water molecules, to identify the strength of interaction with specific boundary features. They employed the energy minimization code METADISE, which is designed to model dislocations, interfaces and surfaces. A grain boundary is created by fitting two surface blocks together in different orientations. In the present case, two series of tilt grain boundaries (M1 and M2, defined by the type of cation site at the surface) were created from appropriate models of stepped forsterite (010) surfaces at increasing boundary angles. Both boundary and adhesion energies were calculated, which describe the stability of the boundary with respect to the bulk material and free surfaces, respectively. Results are reported in Table 2 and Fig. 2. The atomistic models generated are shown in Fig. 3.

Table 2. Calculated boundary energies of (010) tilt grain boundaries in forsterite

Boundary    Boundary angle (°)    Boundary energy (J m⁻²)
M2          65                    1.32
M2          47                    2.72
M2          36                    3.57
M2          28                    3.50
M2          23                    3.09
M1          60                    2.12
M1          41                    3.13
M1          30                    3.19
M1          23                    2.94
M1          19                    2.88

The larger grain boundaries do not form a continuously disordered interface but rather a series of open channels in the interfacial region, with practically bulk termination of the two mirror planes (Fig. 3). We would expect that physical processes such as melting and diffusion of ions and molecules, e.g., oxygen or water, will be enhanced especially at the larger-terraced boundaries due to the low density of these regions compared to the bulk crystal.
Figure 2. Adhesion energies (J/m²) as a function of grain boundary tilt angle (degrees) for the M1 and M2 series.

The minima in the adhesion energies, at φ ≈ 20° (M1) or ≈ 30° (M2) (Fig. 2),
indicate the boundaries which are most easily cleaved, and are due to the relative stabilities of the grain boundaries and the corresponding free surfaces. Overall, the results show the ability of simulation methods to generate realistic models for these complex interfaces.
4.3. Nanocluster Structures in ZnS
Our final example is an intriguing case study in cluster chemistry. As part of an extensive study aimed at identifying the structures of the critical growth nuclei in the growth of ZnS crystals, Spano et al. [24, 25] have identified a whole series of stable open cluster structures for (ZnS)ₙ clusters with n ranging from 1 to 80. They employed simulated annealing and minimization techniques using interatomic potentials, with critical structures also being modeled by density functional theory electronic structure methods (the results of which validate the interatomic-potential-based simulations). The cluster structures have quite different topologies from bulk ZnS. A particularly interesting example is shown in Fig. 4: an onion-like cluster with an inner core and outer shell. Work is in progress aimed at detecting these structures experimentally.
Figure 3. Relaxed structures of tilt grain boundaries with (010) mirror terraces; top: (100) step wall showing two round channels per terrace; bottom: (001) step wall with one triangular channel per terrace.
5. Conclusions
This chapter has surveyed the essential methodological aspects of minimization techniques and has illustrated the scope of the field by a number of recent examples. Despite their simplicity, minimization methods will remain powerful tools in materials simulation.
Figure 4. Predicted onion-like structure for (ZnS)60 .
Acknowledgments I am grateful to many colleagues for their contributions to the work discussed in this chapter, but special thanks go to Robert Bell, Martin Foster, Nora de Leeuw, Stephen Parker and Said Hamad, whose recent work was highlighted in the applications section.
References

[1] M.P. Tosi, Solid State Phys., 16, 1, 1964.
[2] C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, 1997.
[3] W.M. Meier and H. Villiger, Z. Kristallogr., 128, 352, 1969.
[4] S.M. Woodley, In: R.L. Johnston (ed.), Structure and Bonding, vol. 110, Springer, Heidelberg, 2004.
[5] G. Ooms, R.A. van Santen, C.J.J. den Ouden, R.A. Jackson, and C.R.A. Catlow, J. Phys. Chem., 92, 4462, 1988.
[6] N.J. Henson, A.K. Cheetham, and J.D. Gale, Chem. Mater., 6, 1647, 1994.
[7] C.R.A. Catlow and W.C. Mackrodt (eds.), "Computer simulation of solids," Lecture Notes in Physics, vol. 166, Springer, Berlin, 1982.
[8] W. Cochran, Crit. Rev. Solid State Sci., 2, 1, 1971.
[9] S.C. Parker and G.D. Price, In: C.R.A. Catlow (ed.), Advanced Solid State Chemistry, vol. 1, JAI Press, 1990.
[10] M.J. Norgett and R. Fletcher, J. Phys. C: Solid State Phys., 3, L190, 1970.
[11] Watson et al., In: C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, p. 55, 1997.
[12] J.D. Gale, J. Chem. Soc. Faraday Trans., 93, 629, 1997.
[13] P.W. Tasker, J. Phys. C: Solid State Phys., 12, 4977, 1979.
[14] D.E. Parry, Surf. Sci., 49, 433, 1975.
[15] P. Sherwood et al., J. Mol. Struct. (Theochem), 632, 1, 2003.
[16] R.L. Johnston, Dalton Trans., 22, 4193, 2003.
[17] C.R.A. Catlow (ed.), Modelling of Structure and Reactivity in Zeolites, Academic Press, London, 1992.
[18] C.R.A. Catlow, B. Smit, and R.A. van Santen (eds.), Modelling Microporous Materials, Elsevier, Amsterdam, 2004.
[19] O. Delgado Friedrichs, A.W.M. Dress, D.H. Huson, J. Klinowski, and A.L. Mackay, Nature, 400, 644, 1999.
[20] M.D. Foster, A. Simperler, R.G. Bell, O. Delgado Friedrichs, F.A. Almeida Paz, and J. Klinowski, Nature Mater., 3, 234, 2004.
[21] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Philos. Mag. A, 79, 2735, 1999.
[22] D.M. Duffy, J.H. Harding, and A.M. Stoneham, Philos. Mag. A, 67, 865, 1993.
[23] N.H. de Leeuw, S.C. Parker, C.R.A. Catlow, and G.D. Price, Am. Mineral., 85, 1143, 2000.
[24] E. Spano, S. Hamad, and C.R.A. Catlow, J. Phys. Chem. B, 107, 10337, 2003.
[25] E. Spano, S. Hamad, and C.R.A. Catlow, Chem. Commun., 864, 2004.
2.8 BASIC MOLECULAR DYNAMICS

Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
A working definition of molecular dynamics (MD) simulation is: a technique by which one generates the atomic trajectories of a system of N particles by numerical integration of Newton's equation of motion, for a specific interatomic potential, with certain initial condition (IC) and boundary condition (BC). Consider, for example, a system with N atoms in a volume Ω. We can define its internal energy E ≡ K + U, where K is the kinetic energy,

K ≡ Σ_{i=1}^{N} (1/2) m_i |ẋ_i(t)|²,   (1)
and U is the potential energy,

U = U(x^{3N}(t)).   (2)
x^{3N}(t) denotes the collective of 3D coordinates x_1(t), x_2(t), ..., x_N(t). Note that E should be a conserved quantity, i.e., a constant of time, if the system is truly isolated. One can often treat a MD simulation like an experiment (Fig. 1). Below is a common flowchart of an ordinary MD run:

[system setup: sample selection (pot., N, IC, BC)] → [equilibration: sample preparation (achieve T, P)] → [simulation run: property average (run L steps)] → [output: data analysis (property calc.)]

in which we fine-tune the system until it reaches the desired condition (here, temperature T and pressure P), and then perform property averages, for instance calculating the radial distribution function g(r) [1] or the thermal conductivity [2]. One may also perform a non-equilibrium MD calculation, during which the system is subjected to perturbational or large external driving forces,
Figure 1. Illustration of the MD simulation system (N particles with trajectories x_i(t) in a Cartesian box).
and we analyze its non-equilibrium response, such as in many mechanical deformation simulations.

There are five key ingredients to a MD simulation: boundary condition, initial condition, force calculation, integrator/ensemble, and property calculation. A brief overview of them is given below, followed by more specific discussions.

Boundary condition. There are two major types of boundary conditions: isolated boundary condition (IBC) and periodic boundary condition (PBC). IBC is ideally suited for studying clusters and molecules, while PBC is suited for studying bulk liquids and solids. There could also be mixed boundary conditions such as slab or wire configurations, for which the system is assumed to be periodic in some directions but not in the others.

In IBC, the N-particle system is surrounded by vacuum; these particles interact among themselves, but are presumed to be so far away from everything else in the universe that no interactions with the outside occur, except perhaps in responding to some well-defined "external forcing." In PBC, one explicitly keeps track of the motion of N particles in the so-called supercell, but the supercell is surrounded by infinitely replicated, periodic images of itself. Therefore a particle may interact not only with particles in the same supercell but also with particles in adjacent image supercells (Fig. 2). While several polyhedral shapes (such as the hexagonal prism and the rhombic dodecahedron from the Wigner–Seitz construction) can be used as the space-filling unit and thus can serve as the PBC supercell, the simplest and most often used supercell shape is a parallelepiped, specified by its three edge vectors h1, h2 and h3. It should be noted that IBC can most often be well mimicked by a large enough PBC supercell in which the images do not interact.
Figure 2. Illustration of periodic boundary condition (PBC). We explicitly keep track of trajectories of only the atoms in the center cell called the supercell (defined by edge vectors h1 , h2 and h3 ), which is infinitely replicated in all three directions (image supercells). An atom in the supercell may interact with other atoms in the supercell as well as atoms in the surrounding image supercells. rc is a cut-off distance of the interatomic potential beyond which interaction may be safely ignored.
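For a parallelepiped supercell, the nearest-image separation between two atoms is conveniently computed in reduced coordinates. The following is a minimal sketch (Python; it assumes reduced coordinates s_i in [0, 1) and the row-vector convention x = sH used later in this contribution), valid when r_c does not exceed half the smallest supercell thickness:

import numpy as np

def min_image_distance(si, sj, H):
    # si, sj: reduced coordinates of two atoms, each in [0, 1)
    # H:      3x3 matrix whose rows are the edge vectors h1, h2, h3
    ds = sj - si
    ds -= np.rint(ds)          # wrap each component into [-0.5, 0.5)
    dx = ds @ H                # convert to Cartesian (row-vector convention)
    return np.linalg.norm(dx)  # distance to the nearest periodic image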
Initial condition. Since Newton's equations of motion are second-order ordinary differential equations (ODEs), the IC basically means x^{3N}(t = 0) and ẋ^{3N}(t = 0), the initial particle positions and velocities. Generating the IC for crystalline solids is usually quite easy, but the IC for liquids needs some work, and even more so for amorphous solids. A common strategy for creating a proper liquid configuration is to melt a crystalline solid. And if one wants to obtain an amorphous configuration, a strategy is to quench the liquid during the MD run.

Let us focus on the IC for crystalline solids. For instance, x^{3N}(t = 0) can be a perfect fcc crystal (assuming PBC), or an interface between two crystalline phases. For most MD simulations, one needs to write a structure generator. Before feeding the initial configuration thus created into a MD run, it is a good idea to visualize it first, checking bond lengths and coordination numbers, etc. [3]. A frequent cause of MD simulation breakdown is a pathological initial condition in which atoms are too close to each other initially, leading to huge forces.

According to the equipartition theorem [4], each independent degree of freedom should possess kB T/2 kinetic energy. So, one should draw each
component of the 3N-dimensional ẋ^{3N}(t = 0) vector from the Gaussian–Maxwell normal distribution N(0, kB T/m_i). After that, it is a good idea to eliminate the center-of-mass velocity, and for clusters, the net angular momentum as well.
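A minimal sketch of this initialization (Python; the eV/K value of kB and the array-shape conventions are assumptions of the example) is:

import numpy as np

def init_velocities(masses, T, kB=8.617333e-5):
    # masses: (N,) array; T: temperature; kB in units consistent with the masses
    m = np.asarray(masses, dtype=float)[:, None]
    # draw each component from the normal distribution N(0, kB*T/m_i)
    v = np.random.normal(0.0, np.sqrt(kB * T / m), size=(m.shape[0], 3))
    v -= (m * v).sum(axis=0) / m.sum()   # remove the center-of-mass velocity
    return v

For clusters, one would additionally zero the net angular momentum, which is omitted here.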
Force calculation. Before moving into the details of force calculation, it should be mentioned that two approximations underlie the use of the classical equation of motion

m_i d²x_i(t)/dt² = f_i ≡ −∂U/∂x_i,   i = 1, ..., N,   (3)
to describe the atoms. The first is the Born–Oppenheimer approximation [5], which assumes the electronic state couples adiabatically to the nuclear motion. The second is that the nuclear motion is far removed from the Heisenberg uncertainty lower bound: ΔE Δt ≫ ħ/2. If we plug in ΔE = kB T/2, the kinetic energy, and Δt = 1/ω, where ω is a characteristic vibrational frequency, we obtain kB T/ħω ≫ 1. In solids, this means the temperature should be significantly greater than the Debye temperature, which is actually quite a stringent requirement. Indeed, large deviations from experimental heat capacities are seen in classical MD simulations of crystalline solids [2]. A variety of schemes exist to correct this error [1], for instance the Wigner–Kirkwood expansion [6] and path integral molecular dynamics [7].

The evaluation of the right-hand side of Eq. (3) is the key step that usually consumes most of the computational time in a MD simulation, so its efficiency is crucial. For long-range Coulomb interactions, special algorithms exist to break them up into two contributions: a short-ranged interaction, plus a smooth, field-like interaction, both of which can be computed efficiently in separate ways [8]. In this contribution we focus on issues concerning short-range interactions only. There is a section about the Lennard–Jones potential and its truncation schemes, followed by a section about how to construct and maintain an atom–atom neighbor list with O(N) computational effort per step. Finally, see Chaps. 2.4 and 2.5 for the development of interatomic potential U(x^{3N}) functions for metallic and covalent materials, respectively.

Integrator/ensemble. Equation (3) is a set of second-order ODEs, which can be strongly nonlinear. By converting them to first-order ODEs in the 6N-dimensional space of {x^N, ẋ^N}, general numerical algorithms for solving ODEs such as the Runge–Kutta method [9] can be applied. However, these general methods are rarely used in practice, because the existence of a Hamiltonian allows for more accurate integration algorithms, prominent among which are the family of predictor-corrector integrators [10] and the family of symplectic integrators [8, 11]. A section in this contribution gives a brief overview of integrators.

Ensembles such as the micro-canonical, canonical, and grand-canonical are concepts in statistical physics that refer to the distribution of initial conditions. A system, once drawn from a certain ensemble, is supposed to follow strictly
the Hamiltonian equation of motion Eq. (3), with E conserved. However, ensemble and integrator are often grouped together because there exists a class of methods that generates the desired ensemble distribution via time integration [12, 13]. Equation (3) is modified in these methods to create a special dynamics whose trajectory over time forms a cloud in phase space that has the desired distribution density. Thus, the time-average of a single-point operator on one such trajectory approaches the thermodynamic average. However, one should be careful in using it to calculate two-point correlation function averages. See Chap. 2.4 for a detailed description of these methods.

Property calculation. A great strength of MD simulation is that it is "omnipotent" at the level of classical atoms. All properties that are well-posed in classical mechanics and statistical mechanics can in principle be computed. The remaining issue is computational efficiency. The properties can be roughly grouped into four categories:

1. Structural characterizations. Examples include the radial distribution function, dynamic structure factor, etc.
2. Equations of state. Examples include free-energy functions, phase diagrams, static response functions like the thermal expansion coefficient, etc.
3. Transport. Examples include viscosity, thermal conductivity (electronic contribution excluded), correlation functions, diffusivity, etc.
4. Non-equilibrium response. Examples include plastic deformation, pattern formation, etc.
1. The Lennard–Jones Potential
The solid and liquid states of the rare gas elements Ne, Ar, Kr, Xe are better understood than other elements because their closed-shell electron configurations do not allow them to participate in covalent or metallic bonding with neighbors, which is strong and complex, but only to interact via weak van der Waals bonds, which are perturbational in nature in these elements and therefore mostly additive, leading to the pair-potential model:

U(x^{3N}) = Σ_{j>i}^{N} V(|x_{ji}|),   x_{ji} ≡ x_j − x_i,   (4)
where we assert that the total potential energy can be decomposed into the direct sum of individual "pair-interactions." If there is to be rotational invariance in U(x^{3N}), V can only depend on r_{ji} ≡ |x_{ji}|. In particular, the Lennard–Jones potential

V(r) = 4ε[(σ/r)^{12} − (σ/r)^{6}]   (5)
is a widely used form for V(r) that depends on just two parameters: a basic energy-scale parameter ε, and a basic length-scale parameter σ. The potential is plotted in Fig. 3. There are a few noteworthy facts about the Lennard–Jones potential:

• V(r = σ) = 0, at which point the potential is still repulsive, meaning V′(r = σ) < 0, and two atoms would repel each other if separated at this distance.

• The potential minimum occurs at r_min = 2^{1/6}σ, and V_min = −ε. When r > r_min the potential switches from being repulsive to being attractive.

• As r → ∞, V(r) is attractive and decays as r^{−6}, which is the correct scaling law for dispersion (London) forces between closed-shell atoms. To get a feel for how fast V(r) decays, note that V(r = 2.5σ) = −0.0163ε, V(r = 3σ) = −0.00548ε, and V(r = 3.5σ) = −0.00217ε.

• As r → 0, V(r) is repulsive as r^{−12}. In fact, r^{−12} blows up so quickly that an atom is seldom able to penetrate r < 0.9σ, so the Lennard–Jones potential can be considered as having a "hard core." There is no conceptual basis for the r^{−12} form, and it may be unsuitable as a model for certain materials, so it is sometimes replaced by a "soft core" of the form exp(−kr), which combined with the r^{−6} attractive part is called the Buckingham exponential-6 potential. If the attractive part is also of an exponential form exp(−kr/2), then it is called a Morse potential.
Figure 3. The Lennard–Jones potential (V_LJ(r)/ε versus r/σ).
For definiteness, σ = 3.405 Å and ε = 119.8 kB = 0.01032 eV for Ar. The mass can be taken to be the isotopic average, 39.948 a.m.u.
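In code, Eq. (5) with the Ar parameters just quoted reads as below (a Python sketch, with energies in eV and lengths in Å):

import numpy as np

SIGMA = 3.405    # angstrom, for Ar
EPS   = 0.01032  # eV, for Ar

def v_lj(r):
    # Lennard-Jones pair potential, Eq. (5)
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPS * (sr6 * sr6 - sr6)

# reproduce the noteworthy values listed above (in units of EPS):
print(v_lj(SIGMA) / EPS)                 # 0 at r = sigma
print(v_lj(2 ** (1 / 6) * SIGMA) / EPS)  # -1 at the minimum
print(v_lj(2.5 * SIGMA) / EPS)           # approximately -0.0163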
1.1. Reduced Units
Unit systems are invented to make physical laws look simple and numerical calculations easy. Take Newton's law: F = ma. In the SI unit system, this means that if an object of mass x (kg) is undergoing an acceleration of y (m/s²), the force on the object must be xy (N). However, there is nothing intrinsically special about the SI unit system. One (kg) is simply the mass of a platinum–iridium prototype in a vacuum chamber in Paris. If one wishes, one can define his or her own mass unit (k̃g), which say is 1/7 of the mass of the Paris prototype: 1 (kg) = 7 (k̃g).

If (k̃g) is one's choice of the mass unit, how about the unit system? One really has to make a decision here, which is either keeping all the other units unchanged and only making the (kg) → (k̃g) transition, or changing some other units along with the (kg) → (k̃g) transition.

Imagine making the first choice, that is, keeping all the other units of the SI system unchanged, including the force unit (N), and only changing the mass unit from (kg) to (k̃g). That is all right, except in the new unit system Newton's law must be re-expressed as F = ma/7, because if an object of mass 7x (k̃g) is undergoing an acceleration of y (m/s²), the force on the object is xy (N). There is nothing inherently wrong with the F = ma/7 expression, which is just a recipe for computation – a correct one for the newly chosen unit system. Fundamentally, F = ma/7 and F = ma describe the same physical law. But it is true that F = ma/7 is less elegant than F = ma. No one likes to memorize extra constants if they can be reduced to unity by a sensible choice of units. The SI unit system is sensible, because (N) is picked to work with other SI units to satisfy F = ma.

How may we have a sensible unit system but with (k̃g) as the mass unit? Simple, just define (Ñ) = (N)/7 as the new force unit. The (m)–(s)–(k̃g)–(Ñ) unit system is sensible because the simplest form of F = ma is preserved. Thus we see that when a certain unit in a sensible unit system is altered, other units must also be altered correspondingly in order to constitute a new sensible unit system, which keeps the algebraic forms of all fundamental physical laws unaltered. (A notable exception is the conversion between SI and Gaussian unit systems in electrodynamics, during which a non-trivial factor of 4π comes up.)

In science people have formed deep-rooted conventions about the simplest algebraic forms of physical laws, such as F = ma, K = mv²/2, E = K + U, P = ρRT, etc. Although nothing forbids one from modifying the constant coefficients in front of each expression, one is better off not to. Fortunately, as long as one uses a sensible unit system, these algebraic expressions stay invariant.
Now, imagine we derive a certain composite law from a set of simple laws. On one side, we start with and consistently use a sensible unit system A. On the other side, we start with and consistently use another sensible unit system B. Since the two sides use exactly the same algebraic forms, the resultant algebraic expression must also be the same, even though for a given physical instance, a variable takes on two different numerical values on the two sides as different unit systems are adopted. This means that the final algebraic expression describing the physical phenomena must satisfy a certain concerted scaling invariance with respect to its dependent variables, corresponding to any feasible transformation between sensible unit systems. This strongly limits the form of possible algebraic expressions describing physical phenomena, which is the basis of dimensional analysis.

As mentioned, once certain units are altered, other units must be altered correspondingly to make the algebraic expressions of physical laws look invariant. For example, for a single-element Lennard–Jones system, one can define a new energy unit (J̃) = ε (J), a new length unit (m̃) = σ (m), and a new mass unit (k̃g) = m_a (kg), which is the atomic mass, where ε, σ and m_a are pure numbers. In the (J̃)–(m̃)–(k̃g) unit system, the potential energy function is

V(r) = 4(r^{−12} − r^{−6}),   (6)

and the mass of an atom is m = 1. Besides that, all physical laws must remain invariant. For example, K = mv²/2 in the SI system, and it still should hold in the (J̃)–(m̃)–(k̃g) unit system. This can only be achieved if the derived time unit (also called the reduced time unit), (s̃) = τ (s), satisfies

ε = m_a σ²/τ²,   or   τ = √(m_a σ²/ε).   (7)

To see this, note that m = 1 (k̃g), v = 1 (m̃)/(s̃), and K = 1/2 (J̃) is a solution to K = mv²/2 in the (J̃)–(m̃)–(k̃g) unit system, but must also be a solution to K = mv²/2 in the SI system. For Ar, τ turns out to be 2.156 × 10^{−12} s, thus the reduced time unit [s̃] = 2.156 [ps]. This is roughly the timescale of one atomic oscillation period in Ar.
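Evaluating Eq. (7) numerically for Ar (a sketch; SI constants as quoted above):

import numpy as np

amu   = 1.66053907e-27           # kg
m_a   = 39.948 * amu             # Ar atomic mass, kg
sigma = 3.405e-10                # m
eps   = 0.01032 * 1.602177e-19   # J

tau = np.sqrt(m_a * sigma ** 2 / eps)   # reduced time unit, Eq. (7)
print(tau)                              # ~2.156e-12 s, i.e., 2.156 ps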
1.2. Force Calculation
For a pair potential of the form (4), there is

f_i = −Σ_{j≠i} ∂V(r_{ij})/∂x_i = Σ_{j≠i} (−∂V(r)/∂r |_{r=r_{ij}}) x̂_{ij} = Σ_{j≠i} (−(1/r) ∂V(r)/∂r |_{r=r_{ij}}) x_{ij},   (8)
where x̂_{ij} is the unit vector

x̂_{ij} ≡ x_{ij}/r_{ij},   x_{ij} ≡ x_i − x_j.   (9)

One can define the force on i due to atom j,

f_{ij} ≡ (−(1/r) ∂V(r)/∂r |_{r=r_{ij}}) x_{ij},   (10)

and so there is

f_i = Σ_{j≠i} f_{ij}.   (11)

It is easy to see that

f_{ij} = −f_{ji}.   (12)
MD programs tend to take advantage of symmetries like the above to save computations.
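A direct O(N²) force loop that exploits f_{ij} = −f_{ji} might look as follows (a Python sketch only; PBC and neighbor lists, discussed later, are omitted, and dvdr stands for any dV/dr routine):

import numpy as np

def pair_forces(x, dvdr, rc):
    # x:    (N, 3) Cartesian positions
    # dvdr: function returning dV/dr at a given r
    # rc:   cutoff radius
    N = len(x)
    f = np.zeros_like(x)
    for i in range(N - 1):
        for j in range(i + 1, N):
            xij = x[i] - x[j]
            r = np.linalg.norm(xij)
            if r < rc:
                fij = (-dvdr(r) / r) * xij   # force on i due to j, Eq. (10)
                f[i] += fij
                f[j] -= fij                  # Eq. (12): f_ji = -f_ij
    return f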
1.3. Truncation Schemes
Consider the single-element Lennard–Jones potential in (5). Practically we can only carry out the potential summation up to a certain cutoff radius. There are many ways to truncate, the simplest of which is to modify the interaction as

V₀(r) = V(r) − V(r_c) for r < r_c, and 0 for r ≥ r_c.   (13)
However, V₀(r) is discontinuous in the first derivative at r = r_c, which causes large errors in time integration (especially with high-order algorithms and large time steps) if an atom crosses r_c, and is detrimental to calculating correlation functions over long times. Another commonly used scheme,

V₁(r) = V(r) − V(r_c) − V′(r_c)(r − r_c) for r < r_c, and 0 for r ≥ r_c,   (14)
makes the force continuous at r = r_c, but also makes the potential well too shallow (see Fig. 4). It is also slightly more expensive because we have to compute the square root of |x_{ij}|² in order to get r. An alternative is to define

Ṽ(r) = V(r) exp(r_s/(r − r_c)) for r < r_c, and 0 for r ≥ r_c,
Figure 4. Lennard–Jones potential and its modified forms V(r), V₀(r), V₁(r) and W(r), with cutoff r_c = 2.37343σ (energy in ε versus r in σ). Black lines indicate positions of neighbors in a single-element fcc crystal at 0 K.
which has all derivatives continuous at r = r_c. However, this truncation scheme requires another tunable parameter r_s. The following truncation scheme,

W(r) = 4ε[ (σ/r)^{12} − (σ/r)^{6} + ( 2(σ/r_c)^{18} − (σ/r_c)^{12} )(r/σ)^{6} − 3(σ/r_c)^{12} + 2(σ/r_c)^{6} ] for r < r_c, and 0 for r ≥ r_c,   (15)
is recommended. W(r), V(r), V₀(r) and V₁(r) are plotted in Fig. 4 for comparison. r_c is chosen to be 2.37343σ, which falls exactly at the 2/3 interval between the fourth and fifth neighbor shells of an equilibrated fcc lattice at 0 K. There is clearly a tradeoff in picking r_c. If r_c is large, the effect of the artificial truncation is small. On the other hand, maintaining and summing over a large neighbor list (size ∝ r_c³) costs more. For a properly written O(N) MD code, the cost versus neighbor number relation is almost linear. Let us see what is the minimal r_c for an fcc solid. Figure 5 shows the neighboring atom shells and their multiplicity. Also drawn are the three glide planes.
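As reconstructed above, Eq. (15) is simple to implement; the sketch below (Python, reduced units) evaluates W(r) and can be checked to give both W(r_c) = 0 and W′(r_c) = 0:

def w_lj(r, sigma=1.0, eps=1.0, rc=2.37343):
    # smoothly truncated Lennard-Jones potential, Eq. (15)
    if r >= rc:
        return 0.0
    sr = sigma / r
    sc = sigma / rc
    return 4.0 * eps * (sr**12 - sr**6
                        + (2.0 * sc**18 - sc**12) * (r / sigma)**6
                        - 3.0 * sc**12 + 2.0 * sc**6)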
Figure 5. FCC neighboring shells. For example, the label "68; 86" means there are eight sixth-nearest neighbors of the type shown in the figure, which adds up to 86 neighbors in all if included. The ABC stacking planes are also shown in the figure.
With (15), once the number of interacting neighbor shells is determined, we can evaluate the equilibrium volume and bulk modulus of the crystal in closed form. The total potential energy of each atom is

e = (1/2) Σ_{r_{ji} < r_c} W(r_{ji}).   (16)

For the fcc crystal, we can extract scale-independent coefficients from the above summation and differentiate with respect to the lattice constant a – the minimum of which yields the equilibrium lattice constant a₀. If we demand r_c to fall into an exact position between the highest included shell and the lowest excluded shell, we can iterate the process until mutual consistency is achieved. We then plug a₀ into (16) to calculate the binding energy per atom, e₀; the atomic volume,

Ω₀ = a₀³/4,   (17)

and the bulk modulus,

B ≡ −dP/d log Ω = (a₀²/9Ω₀) d²e/da²|_{a₀} = (4/9a₀) d²e/da²|_{a₀}   (for fcc).   (18)
Table 1. FCC neighboring shells included in Eq. (15) vs. properties

n    N     r_c [σ]          a₀ [σ]           Ω₀ [σ³]          e₀ [ε]            B [εσ⁻³]
1    12    1.44262944953    1.59871357076    1.02153204121    −2.03039845846    39.39360127902
2    18    1.81318453769    1.57691543349    0.98031403353    −4.95151157088    52.02448553061
3    42    2.11067974132    1.56224291246    0.95320365252    −6.12016548816    58.94148705580
4    54    2.37343077641    1.55584092331    0.94153307381    −6.84316556834    64.19738627468
5    78    2.61027143673    1.55211914976    0.93479241591    −7.27254778301    66.65093979162
6    86    2.82850677530    1.55023249772    0.93138774467    −7.55413237921    68.53093399765
7    134   3.03017270367    1.54842162594    0.92812761235    −7.74344974981    69.33961787572
8    140   3.21969263257    1.54727436382    0.92606612556    −7.88758411490    70.63452119577
9    176   3.39877500485    1.54643096926    0.92455259927    −7.99488847415    71.18713376234
10   200   3.56892997792    1.54577565469    0.92337773387    −8.07848627384    71.76659559499
The self-consistent results for the r_c ratio 2/3 are shown in Table 1. That is, r_c is exactly at 2/3 of the distance between the nth interacting shell and the (n+1)th non-interacting shell. The reason for 2/3 (rather than 1/2) is that we expect thermal expansion at finite temperature. If one is after converged Lennard–Jones potential results, then r_c = 4σ is recommended. However, it is about five times more expensive per atom than the minimum-cutoff calculation with r_c = 2.37343σ.
2. Integrators
An integrator serves the purpose of advancing the trajectory over small time increments Δt: x^{3N}(t₀) → x^{3N}(t₀ + Δt) → x^{3N}(t₀ + 2Δt) → ··· → x^{3N}(t₀ + LΔt), where L is usually ∼10⁴–10⁷. Here we give a brief overview of some popular algorithms: central difference (Verlet, leap-frog, velocity Verlet), Beeman's algorithm [14], predictor-corrector [10], and symplectic integrators [8, 11].
2.1. Verlet Algorithm
Assuming the x^{3N}(t) trajectory is smooth, perform the Taylor expansion

x_i(t₀ + Δt) + x_i(t₀ − Δt) = 2x_i(t₀) + ẍ_i(t₀)(Δt)² + O((Δt)⁴).   (19)

Since ẍ_i(t₀) = f_i(t₀)/m_i can be evaluated given the atomic positions x^{3N}(t₀) at t = t₀, x^{3N}(t₀ + Δt) in turn may be approximated by

x_i(t₀ + Δt) = −x_i(t₀ − Δt) + 2x_i(t₀) + (f_i(t₀)/m_i)(Δt)² + O((Δt)⁴).   (20)
By throwing out the O((Δt)⁴) term, we obtain a recursion formula to compute x^{3N}(t₀ + Δt), x^{3N}(t₀ + 2Δt), ... successively, which is the Verlet [15] algorithm. The velocities do not participate in the recursion but are needed for property calculations. They can be approximated by

v_i(t₀) ≡ ẋ_i(t₀) = [x_i(t₀ + Δt) − x_i(t₀ − Δt)]/(2Δt) + O((Δt)²).   (21)
To what degree does the outcome of the above recursion mimic the real trajectory x^{3N}(t)? Notice that in (20), assuming x_i(t₀) and x_i(t₀ − Δt) are exact, and assuming we have a perfect computer with no machine error in storing the relevant numbers or carrying out floating-point operations, the computed x_i(t₀ + Δt) would still be off from the real x_i(t₀ + Δt) by O((Δt)⁴), which is defined as the local truncation error (LTE). LTE is an intrinsic error of the algorithm. Clearly, as Δt → 0, LTE → 0, but that does not guarantee the algorithm works, because what we want is x^{3N}(t₀ + t′) for a given t′, not x^{3N}(t₀ + Δt). To obtain x^{3N}(t₀ + t′), we must integrate L = t′/Δt steps, and the difference between the computed x^{3N}(t₀ + t′) and the real x^{3N}(t₀ + t′) is called the global error. An algorithm can be useful only if, when Δt → 0, the global error → 0. Usually (but with exceptions), if the LTE in position is ∼(Δt)^{k+1}, the global error in position should be ∼(Δt)^k, in which case we call the algorithm a k-th order method. The Verlet algorithm is third order in position and potential energy, but only second order in velocity and kinetic energy.

This is only half the story because the order of an algorithm only characterizes its performance when Δt → 0. To save computational cost, most often one must adopt a quite large Δt. Higher-order algorithms do not necessarily perform better than lower-order algorithms at practical Δt's. In fact, they could be much worse by diverging spuriously (causing overflow and NaN), while a more robust method would just give a finite but manageable error for the same Δt. This is the concept of the stability of a numerical algorithm. In linear ODEs, the global error e of a certain normal mode k can always be written as e(ω_k Δt, T/Δt) by dimensional analysis, where ω_k is the mode's frequency. One then can define the stability domain of an algorithm in the complex ωΔt plane as the border where e(ω_k Δt, T/Δt) starts to grow exponentially as a function of T/Δt. To rephrase, a higher-order algorithm may have a much smaller stability domain than a lower-order algorithm even though its e decays faster near the origin.

Since e is usually larger for larger |ω_k Δt|, the overall quality of an integration should be characterized by e(ω_max Δt, T/Δt), where ω_max is the maximum intrinsic frequency of the molecular system that we explicitly integrate. The main reason behind developing constraint MD [1, 8] for some molecules is so that we do not have to integrate their stiff intramolecular vibrational modes, allowing one to take a larger Δt, so one can follow longer the "softer modes" that we are more interested in. This is also
the rationale behind developing multiple-timestep integrators like r-RESPA [11].

In addition to the LTE, there is round-off error due to the computer's finite precision. The effect of round-off error can be better understood in the stability domain: (1) In most applications, the round-off error ≪ LTE, but it behaves like white noise which has a very wide frequency spectrum, and so for the algorithm to be stable at all, its stability domain must include the entire real ωΔt axis. However, as long as we ensure non-positive gain for all real ωΔt modes, the overall error should still be characterized by e(ω_k Δt, T/Δt), since the white noise has negligible amplitude. (2) Some applications, especially those involving high-order algorithms, do push the machine precision limit. In those cases, equating LTE ∼ ε, where ε is the machine's relative accuracy, provides a practical lower bound to Δt, since by reducing Δt one can no longer reduce (and indeed would increase) the global error. For single-precision arithmetic (4 bytes to store one real number), ε ∼ 10⁻⁸; for double-precision arithmetic (8 bytes to store one real number), ε ≈ 2.2 × 10⁻¹⁶; for quadruple-precision arithmetic (16 bytes to store one real number), ε ∼ 10⁻³².
2.2. Leap-frog Algorithm
Here we start out with v^{3N}(t₀ − Δt/2) and x^{3N}(t₀), then

v_i(t₀ + Δt/2) = v_i(t₀ − Δt/2) + (f_i(t₀)/m_i)Δt + O((Δt)³),   (22)

followed by

x_i(t₀ + Δt) = x_i(t₀) + v_i(t₀ + Δt/2)Δt + O((Δt)³),   (23)

and we have advanced by one step. This is a second-order method. The velocity at time t₀ can be approximated by

v_i(t₀) = (1/2)[v_i(t₀ − Δt/2) + v_i(t₀ + Δt/2)] + O((Δt)²).   (24)
2.3. Velocity Verlet Algorithm
We start out with x^{3N}(t₀) and v^{3N}(t₀), then

x_i(t₀ + Δt) = x_i(t₀) + v_i(t₀)Δt + (1/2)(f_i(t₀)/m_i)(Δt)² + O((Δt)³),   (25)
evaluate f^{3N}(t₀ + Δt), and then

v_i(t₀ + Δt) = v_i(t₀) + (1/2)[f_i(t₀)/m_i + f_i(t₀ + Δt)/m_i]Δt + O((Δt)³),   (26)
and we have advanced by one step. This is a second-order method. Since we can have x^{3N}(t₀) and v^{3N}(t₀) simultaneously, it is very popular.
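A minimal velocity Verlet loop, using any force routine such as the pair_forces sketch above, could read (an illustrative sketch only):

import numpy as np

def velocity_verlet(x, v, masses, force_fn, dt, nsteps):
    m = masses[:, None]
    f = force_fn(x)
    for _ in range(nsteps):
        x = x + v * dt + 0.5 * (f / m) * dt * dt   # Eq. (25)
        f_new = force_fn(x)                        # evaluate f at t0 + dt
        v = v + 0.5 * ((f + f_new) / m) * dt       # Eq. (26)
        f = f_new
    return x, v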
2.4. Beeman's Algorithm
It is similar to the velocity Verlet algorithm. We start out with x^{3N}(t₀), f^{3N}(t₀ − Δt), f^{3N}(t₀) and v^{3N}(t₀), then

x_i(t₀ + Δt) = x_i(t₀) + v_i(t₀)Δt + [4f_i(t₀) − f_i(t₀ − Δt)]/m_i · (Δt)²/6 + O((Δt)⁴),   (27)

evaluate f^{3N}(t₀ + Δt), and then

v_i(t₀ + Δt) = v_i(t₀) + [2f_i(t₀ + Δt) + 5f_i(t₀) − f_i(t₀ − Δt)]/m_i · Δt/6,   (28)

and we have advanced by one step. This is a third-order method.
2.5. Predictor-corrector Algorithm
Let us take the often used 6-value predictor-corrector algorithm [10] as an example. We start out with 6 × 3N storage: x^{3N(0)}(t₀), x^{3N(1)}(t₀), x^{3N(2)}(t₀), ..., x^{3N(5)}(t₀), where x^{3N(k)}(t) is defined by

x_i^{(k)}(t) ≡ (d^k x_i(t)/dt^k)(Δt)^k/k!.   (29)
The iteration consists of prediction, evaluation, and correction steps:
2.5.1. Prediction step

x_i^{(0)} := x_i^{(0)} + x_i^{(1)} + x_i^{(2)} + x_i^{(3)} + x_i^{(4)} + x_i^{(5)},
x_i^{(1)} := x_i^{(1)} + 2x_i^{(2)} + 3x_i^{(3)} + 4x_i^{(4)} + 5x_i^{(5)},
x_i^{(2)} := x_i^{(2)} + 3x_i^{(3)} + 6x_i^{(4)} + 10x_i^{(5)},
x_i^{(3)} := x_i^{(3)} + 4x_i^{(4)} + 10x_i^{(5)},
x_i^{(4)} := x_i^{(4)} + 5x_i^{(5)}.   (30)
The general formula for the above is

x_i^{(k)} := Σ_{k′=k}^{M−1} [k′!/((k′−k)! k!)] x_i^{(k′)},   k = 0, ..., M − 2,   (31)

with M = 6 here. The evaluation must proceed from k = 0 to M − 2 sequentially.
2.5.2. Evaluation step

Evaluate the force f^{3N} using the newly obtained x^{3N(0)}.
2.5.3. Correction step

Define the error e^{3N} as

e_i ≡ x_i^{(2)} − (f_i/m_i)(Δt)²/2!.   (32)
Then apply corrections,

x_i^{(k)} := x_i^{(k)} − C_{Mk} e_i,   k = 0, ..., M − 1,   (33)

where the C_{Mk} are constants listed in Table 2. It is clear that the LTE for x^{3N} is O((Δt)^M) after the prediction step. But one can show that the LTE is enhanced to O((Δt)^{M+1}) after the correction step if f^{3N} depends on x^{3N} only (i.e., is conservative). And so the global error would be O((Δt)^M).

Table 2. Gear predictor-corrector coefficients C_{Mk}

M      k=0           k=1            k=2   k=3       k=4      k=5     k=6     k=7
4      1/6           5/6            1     1/3
5      19/120        3/4            1     1/2       1/12
6      3/20          251/360        1     11/18     1/6      1/60
7      863/6048      665/1008       1     25/36     35/144   1/24    1/360
8      1925/14112    19087/30240    1     137/180   5/16     17/240  1/120   1/2520

2.6. Symplectic Integrators

In the absence of round-off error, certain numerical integrators rigorously maintain the phase-space volume conservation property (Liouville's theorem) of Hamiltonian dynamics, and are then called symplectic. This severely limits the possibilities of mapping from initial to final states, and for this reason symplectic integrators tend to have much better total energy conservation in the long run.
Figure 6. (a) Phase error after integrating 100 periods of Kepler orbitals (eccentricity 0.5). (b) Phase error after integrating 1000 periods of Kepler orbitals. Both panels plot the final (p, q) error (2-norm) against the number of force evaluations per period, for the symplectic integrators Ruth83, Schlier98_6a, Tsitouras99, Calvo93, Schlier00_6b and Schlier00_8c, and for 4th-order Runge–Kutta and 4th- to 8th-order Gear predictor-corrector integrators.
The velocity Verlet algorithm is in fact symplectic, and has been followed by higher-order extensions [16, 17]. As with the predictor-corrector method, which can be derived up to order 14 following the original construction scheme [10], suitable for double-precision arithmetic, symplectic integrators also tend to perform better at higher orders, even on a per-cost basis. We have benchmarked the two families of integrators (Fig. 6) by numerically solving the two-body Kepler problem (eccentricity 0.5), which is nonlinear and periodic, and comparing with the exact analytical solution. The two families have different global error versus time characteristics: non-symplectic integrators all have linear energy error (ΔE ∝ t) and quadratic phase error (|Δφ| ∝ t²), while symplectic integrators have constant (fluctuating) energy error (ΔE ∝ t⁰) and linear phase error (|Δφ| ∝ t), with respect to time. Therefore the asymptotic long-term performance of a symplectic integrator is always superior to that of a non-symplectic integrator. But it is found that for a reasonable integration duration, say 100 Kepler periods, high-order predictor-corrector integrators can have a better performance than the best of the symplectic integrators at large integration timesteps (small number of force evaluations per period). This is important, because it means that in a real system, if one does not care about the autocorrelation of a mode beyond 100 oscillation periods, then high-order predictor-corrector algorithms can achieve the desired accuracy at a lower computational cost.
3. Order-N MD Simulation with Short-ranged Potential
We outline here a linked-bin algorithm that allows one to perform MD simulation in a PBC supercell with O(N) computational effort per time step, where N is the number of atoms in the supercell (Fig. 7). Such an approach
Figure 7. There are N atoms in the supercell. (a) The circle around a particular atom with radius rc indicates the range of its interaction with other atoms. (b) The supercell is divided into a number of bins, which have dimensions such that an atom can only possibly interact with atoms in adjacent 27 bins in 3D (nine in 2D). (c) This shows that an atom–atom list is still necessary because on average there are only 16% of the atoms in 3D in adjacent bins that interact with the particular atom.
is found to outperform the brute-force Verlet neighbor-list update algorithm, which is O(N²), when N exceeds a few thousand atoms. The algorithm to be introduced here allows for arbitrary supercell deformation during a simulation, and is implemented in large-scale MD and conjugate gradient relaxation programs as well as a visualization program [3]. Denote the three edges of a supercell in the Cartesian frame by row vectors h1, h2, h3, which stack together to form a 3 × 3 matrix H. The inverse of the H matrix, B ≡ H⁻¹, satisfies

I = HB = BH.   (34)
If we define row vectors

b1 ≡ (B₁₁, B₂₁, B₃₁),   b2 ≡ (B₁₂, B₂₂, B₃₂),   b3 ≡ (B₁₃, B₂₃, B₃₃),   (35)
then (34) is equivalent to

h_i · b_j ≡ h_i b_j^T = δ_{ij}.   (36)
Since b1 is perpendicular to both h2 and h3, it must be collinear with the normal direction n of the plane spanned by h2 and h3: b1 ≡ |b1|n. And so by (36),

1 = h1 · b1 = h1 · (|b1|n) = |b1|(h1 · n).   (37)
But |h1 · n| is nothing other than the thickness of the supercell along the h1 edge. Therefore, the thicknesses (distances between two parallel surfaces) of the supercell are

d1 = 1/|b1|,   d2 = 1/|b2|,   d3 = 1/|b3|.   (38)
The position of atom i is specified by a row vector s_i = (s_{i1}, s_{i2}, s_{i3}), with s_{iµ} satisfying

0 ≤ s_{iµ} < 1,   µ = 1, ..., 3,   (39)
and the Cartesian coordinate of this atom, x_i, also a row vector, is

x_i = s_{i1}h1 + s_{i2}h2 + s_{i3}h3 = s_i H,   (40)
where s_{iµ} has the geometrical interpretation of the fraction of the µth edge in order to build x_i. We will simulate particle systems that interact via short-ranged potentials of cutoff radius r_c (see the previous section for potential truncation schemes). In the case of a multi-component system, r_c is generalized to a matrix r_c^{αβ}, where α ≡ c(i), β ≡ c(j) are the chemical types of atoms i and j, respectively. We then define

x_{ji} ≡ x_j − x_i,   r_{ji} ≡ |x_{ji}|,   x̂_{ji} ≡ x_{ji}/r_{ji}.   (41)

The design of the program should allow for arbitrary changes in H that include strain and rotational components (see Section 2.5). One should use the Lagrangian strain η, a true rank-2 tensor under coordinate frame transformation, to measure the deformation of a supercell. To define η, one needs a reference H0 of a previous time, with x0 = sH0 and dx0 = (ds)H0, and imagine that, with s fixed, dx0 is transformed to dx = (ds)H under H0 → H ≡ H0 J. The Lagrangian strain (see Chap. 2.4) is defined by the change in the differential line length,

dl² = dx dx^T ≡ dx0 (I + 2η) dx0^T,   (42)
where, by plugging in dx = (ds)H = (dx0)H0⁻¹H = (dx0)J, η is seen to be

η = (1/2)(H0⁻¹ H H^T H0⁻ᵀ − I) = (1/2)(JJ^T − I).   (43)
Because η is a symmetric matrix, it always has three mutually orthogonal eigen-directions, x1η = η₁x1, x2η = η₂x2, x3η = η₃x3. Along those directions, the line lengths are changed by factors √(1 + 2η₁), √(1 + 2η₂), √(1 + 2η₃), which achieve extrema among all line directions. Thus, as long as η₁, η₂ and η₃ oscillate between [−η_bound, η_bound] for some chosen η_bound, any line segment at H0 can be lengthened by no more than √(1 + 2η_bound) and shortened by no less than √(1 − 2η_bound). That is, if we define the length measure

L(s, H) ≡ √(sHH^T s^T),   (44)
then, so long as η₁, η₂, η₃ oscillate between [η_min, η_max], there is

√(1 + 2η_min) L(s, H0) ≤ L(s, H) ≤ √(1 + 2η_max) L(s, H0).   (45)
One can use the above result to define a strain session, which begins with H0 = H and during which no line segment is allowed to shrink to less than a threshold fraction f_c ≤ 1 of its length at H0. This is equivalent to requiring that

f ≡ √(1 + 2 min(η₁, η₂, η₃)) ≥ f_c.   (46)
Whenever the above condition is violated, the session terminates and a new session starts with the present H as the new H0, and triggers a repartitioning of the supercell into equal-size bins, which is called a strain-induced bin repartitioning.

The purpose of bin partitioning is the following: it can be a very demanding task to determine whether atoms i, j are within r_c of each other or not, for all possible ij combinations. Formally, this requires checking

r_{ji} ≡ L(s_{ji}, H) ≤ r_c.   (47)
Because s_i, s_j and H are all moving – they differ from step to step – it appears that we have to do this at each step. This O(N²) complexity would indeed be the case but for the observation that, in most MD, MC and static minimization procedures, the s_i's of most atoms and H often change only slightly from the previous step. Therefore, once we have ensured that (47) holds at some previous step, we can devise a sufficient condition to test whether (47) still must hold now, at a much smaller cost. Only when this sufficient condition breaks down do we resort to a more complicated search and check in the fashion of (47).

As a side note, it is often more efficient to count interaction pairs if the potential function allows for easy use of half-lists, such as pair or EAM potentials, which achieves a 1/2 saving in memory. In these scenarios we pick a unique "host" atom among i and j to store the information about the ij pair; that is, a particle's list only keeps possible pairs that are under its own care. For load-balancing it is best if the responsibilities are distributed evenly among particles. We use a pseudo-random choice: if i + j is odd and i > j, or if i + j is even and i < j, then i is the host; otherwise it is j. As i > j is "uncorrelated" with whether i + j is even or odd, significant load imbalance is unlikely to occur even if the indices correlate strongly with the atoms' positions (a sketch of this rule appears after Eq. (48)).

The step-to-step small change is exploited as follows: one associates each s_i with a semi-mobile reduced coordinate s_i^a called atom i's anchor (Fig. 8). At each step, one checks whether L(s_i − s_i^a, H), that is, the current distance between i and its anchor, is greater than a certain r_drift ≥ r_drift^0 or not. If it is not, then s_i^a does not change; if it is, then one redefines s_i^a ≡ s_i at this step, which is called
Figure 8. Illustration of the concept of an anchor, the relatively immobile part of an atom's trajectory; the drift distance d from the anchor is usually taken to be d = 0.05 r_c. Using an anchor–anchor list, we can derive a "flash" condition that locally updates an atom's neighbor list when the atom drifts sufficiently far from its anchor.
atom i’s flash incident. At atom i’s flash, it is required to update records of all atoms (part of the records may be stored in j ’s list, if 1/2-saving is used and j happens to be the host of the i j pair) whose anchors satisfy L(saj − sai , H0 ) ≤ rlist ≡
0 rc + 2rdrift . fc
(48)
Note that the distance is between anchors instead of atoms (s_i^a = s_i at the flash, though), and the length is measured by H_0, not the current H. (48) nominally takes O(N) work per flash, but we may reduce it to O(1) work per flash by partitioning the supercell into m_1 × m_2 × m_3 bins at the start of the session, whose thicknesses by H_0 (see (38)) are required to be greater than or equal to r_list:

\frac{d_1(H_0)}{m_1},\; \frac{d_2(H_0)}{m_2},\; \frac{d_3(H_0)}{m_3} \;\ge\; r_{\text{list}}.   (49)
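As an illustration, the bin counts can be chosen as the largest integers compatible with Eq. (49); a sketch, in which the thicknesses d_k(H_0) are computed as supercell volume over face area (function name and error handling are illustrative):

```python
import numpy as np

def bin_counts(H0, r_list):
    """Largest m1, m2, m3 such that d_k(H0) / m_k >= r_list (Eq. (49))."""
    volume = abs(np.linalg.det(H0))
    m = []
    for k in range(3):
        # d_k = plane-to-plane thickness = volume / area of the opposite face
        a, b = H0[(k + 1) % 3], H0[(k + 2) % 3]
        d_k = volume / np.linalg.norm(np.cross(a, b))
        m_k = int(d_k / r_list)        # floor keeps d_k / m_k >= r_list
        if m_k < 1:
            raise ValueError("supercell too thin for the requested r_list")
        m.append(m_k)
    return m
```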
The bins deform with H and remain commensurate with it, that is, their s-widths 1/m_1, 1/m_2, 1/m_3 remain fixed during a strain session. Each bin keeps an updated list of all anchors inside. When atom i flashes, it also updates the bin–anchor list if necessary. Then, if at the time of i's flash two anchors are separated by more than one bin, there would be

L(s_j^a - s_i^a, H_0) \;>\; \min_k \frac{d_k(H_0)}{m_k} \;\ge\; r_{\text{list}},   (50)

and they cannot possibly satisfy (48). Therefore we only need to test (48) for anchors within the 27 adjacent bins. To synchronize, all atoms flash at the start of a strain session. From then on, atoms flash individually whenever L(s_i − s_i^a, H) > r_drift. If two anchors flash at the same step in a loop, the first flash may get it wrong – that is, miss the second anchor – but the second flash will correct the mistake. The important thing here is not to lose an interaction. We see that maintaining anchor lists that capture all solutions to (48) among the latest anchors takes only O(N) work per step, and the pre-factor is also small because flash events happen quite infrequently for a tolerably large r⁰_drift.
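To make the bookkeeping concrete, here is a minimal sketch of the per-step drift test and of the pseudo-random host rule quoted above (names are illustrative; `length` implements the measure L of Eq. (44)):

```python
import numpy as np

def length(s, H):
    """Length measure L(s, H) of Eq. (44), row-vector convention x = s H."""
    x = s @ H
    return float(np.sqrt(x @ x))

def is_host(i, j):
    """i hosts the (i, j) pair if (i + j odd and i > j) or (i + j even and i < j)."""
    return (i > j) if (i + j) % 2 else (i < j)

def maybe_flash(i, s, s_anchor, H, r_drift):
    """Re-anchor atom i once it has drifted more than r_drift from its anchor.
    Returns True if a flash occurred; the caller must then rebuild i's
    anchor records according to Eq. (48), searching the 27 adjacent bins."""
    if length(s[i] - s_anchor[i], H) > r_drift:
        s_anchor[i] = s[i].copy()
        return True
    return False
```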
The central claim of the scheme is that if j is not in i's anchor records (suppose i's last flash is more recent than j's), which were created some time ago in the strain session, then r_{ji} > r_c. The reason is that the current separation between anchor i and anchor j, L(s_j^a − s_i^a, H), is greater than r_c + 2r⁰_drift, since by (45), (46) and (48),

L(s_j^a - s_i^a, H) \;\ge\; f \cdot L(s_j^a - s_i^a, H_0) \;>\; f \cdot r_{\text{list}} \;\ge\; f_c \cdot r_{\text{list}} \;=\; f_c \cdot \frac{r_c + 2 r^0_{\text{drift}}}{f_c} \;=\; r_c + 2 r^0_{\text{drift}}.   (51)
So we see that r_{ji} > r_c is maintained if neither i nor j currently drifts more than

r_{\text{drift}} \equiv \frac{f \cdot r_{\text{list}} - r_c}{2} \;\ge\; r^0_{\text{drift}}   (52)

from its respective anchor. Put another way, when we design r_list in (48), we take into consideration both the atom drifts and the H shrinkage, either of which may bring i, j closer than r_c; but since the current H shrinkage has not yet reached the designed critical value, we can convert the remainder into more leeway for the atom drifts. For multi-component systems, we define

r^{\alpha\beta}_{\text{list}} \equiv \frac{r^{\alpha\beta}_c + 2 r^0_{\text{drift}}}{f_c},   (53)
where both f_c and r⁰_drift are species-independent constants; r⁰_drift can be thought of as putting a lower bound on r_drift, so flash events cannot occur too frequently. At each bin repartitioning, we would require
\frac{d_1(H_0)}{m_1},\; \frac{d_2(H_0)}{m_2},\; \frac{d_3(H_0)}{m_3} \;\ge\; \max_{\alpha,\beta}\, r^{\alpha\beta}_{\text{list}}.   (54)
And during the strain session, f ≥ f_c, we have

r^{\alpha}_{\text{drift}} \equiv \min\left( \min_{\beta} \frac{f \cdot r^{\alpha\beta}_{\text{list}} - r^{\alpha\beta}_c}{2},\; \min_{\beta} \frac{f \cdot r^{\beta\alpha}_{\text{list}} - r^{\beta\alpha}_c}{2} \right),   (55)
a time- and species-dependent atom drift bound that controls whether an atom of species α needs to flash.
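A sketch of this species-dependent bookkeeping (here `rc` is the cutoff matrix r_c^{αβ} and the names are illustrative):

```python
import numpy as np

def drift_bounds(rc, f, fc, r0_drift):
    """Per-species drift bounds r_drift^alpha of Eq. (55), given the current
    worst-case shrink factor f >= fc of the strain session."""
    r_list = (rc + 2.0 * r0_drift) / fc     # Eq. (53)
    leeway = (f * r_list - rc) / 2.0        # pairwise leeway, cf. Eq. (52)
    # minimize over the partner species beta, in both index orders
    return np.minimum(leeway.min(axis=1), leeway.T.min(axis=1))
```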
4. Molecular Dynamics Codes
At present there are several high-quality molecular dynamics programs in the public domain, such as LAMMPS [18], DL_POLY [19, 20], Moldy [21], and some codes with a biomolecular focus, such as NAMD [22, 23] and Gromacs [24, 25]. CHARMM [26] and AMBER [27] are not free, but are standard and extremely powerful codes in biology.
References

[1] M. Allen and D. Tildesley, Computer Simulation of Liquids, Clarendon Press, New York, 1987.
[2] J. Li, L. Porter, and S. Yip, "Atomistic modeling of finite-temperature properties of crystalline beta-SiC – II. Thermal conductivity and effects of point defects," J. Nucl. Mater., 255, 139–152, 1998.
[3] J. Li, "AtomEye: an efficient atomistic configuration viewer," Model. Simul. Mater. Sci. Eng., 11, 173–177, 2003.
[4] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, New York, 1987.
[5] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, 2nd edn., Clarendon Press, Oxford, 1954.
[6] R. Parr and W. Yang, Density-functional Theory of Atoms and Molecules, Clarendon Press, Oxford, 1989.
[7] S.D. Ivanov, A.P. Lyubartsev, and A. Laaksonen, "Bead-Fourier path integral molecular dynamics," Phys. Rev. E, 67, 066710, 2003.
[8] T. Schlick, Molecular Modeling and Simulation, Springer, Berlin, 2002.
[9] W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, 2nd edn., Cambridge University Press, Cambridge, 1992.
[10] C. Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
[11] M.E. Tuckerman and G.J. Martyna, "Understanding modern molecular dynamics: techniques and applications," J. Phys. Chem. B, 104, 159–178, 2000.
[12] S. Nosé, "A unified formulation of the constant temperature molecular dynamics methods," J. Chem. Phys., 81, 511–519, 1984.
[13] W.G. Hoover, "Canonical dynamics – equilibrium phase-space distributions," Phys. Rev. A, 31, 1695–1697, 1985.
[14] D. Beeman, "Some multistep methods for use in molecular-dynamics calculations," J. Comput. Phys., 20, 130–139, 1976.
[15] L. Verlet, "Computer 'experiments' on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules," Phys. Rev., 159, 98–103, 1967.
[16] H. Yoshida, "Construction of higher-order symplectic integrators," Phys. Lett. A, 150, 262–268, 1990.
[17] J. Sanz-Serna and M. Calvo, Numerical Hamiltonian Problems, Chapman & Hall, London, 1994.
[18] S. Plimpton, "Fast parallel algorithms for short-range molecular-dynamics," J. Comput. Phys., 117, 1–19, 1995.
[19] W. Smith and T.R. Forester, "DL_POLY 2.0: a general-purpose parallel molecular dynamics simulation package," J. Mol. Graph., 14, 136–141, 1996.
[20] W. Smith, C.W. Yong, and P.M. Rodger, "DL_POLY: application to molecular simulation," Mol. Simul., 28, 385–471, 2002.
[21] K. Refson, "Moldy: a portable molecular dynamics simulation program for serial and parallel computers," Comput. Phys. Commun., 126, 310–329, 2000.
[22] M.T. Nelson, W. Humphrey, A. Gursoy, A. Dalke, L.V. Kale, R.D. Skeel, and K. Schulten, "NAMD: a parallel, object oriented molecular dynamics program," Int. J. Supercomput. Appl. High Perform. Comput., 10, 251–268, 1996.
[23] L. Kale, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten, "NAMD2: greater scalability for parallel molecular dynamics," J. Comput. Phys., 151, 283–312, 1999.
[24] H.J.C. Berendsen, D. van der Spoel, and R. van Drunen, "GROMACS – a message-passing parallel molecular-dynamics implementation," Comput. Phys. Commun., 91, 43–56, 1995.
[25] E. Lindahl, B. Hess, and D. van der Spoel, "GROMACS 3.0: a package for molecular simulation and trajectory analysis," J. Mol. Model., 7, 306–317, 2001.
[26] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus, "CHARMM – a program for macromolecular energy, minimization, and dynamics calculations," J. Comput. Chem., 4, 187–217, 1983.
[27] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham, S. DeBolt, D. Ferguson, G. Seibel, and P. Kollman, "AMBER, a package of computer programs for applying molecular mechanics, normal-mode analysis, molecular dynamics and free-energy calculations to simulate the structural and energetic properties of molecules," Comput. Phys. Commun., 91, 1–41, 1995.
2.9 GENERATING EQUILIBRIUM ENSEMBLES VIA MOLECULAR DYNAMICS
Mark E. Tuckerman
Department of Chemistry and Courant Institute of Mathematical Sciences, New York University, New York, NY 10003
Over the last several decades, molecular dynamics (MD) has become one of the most important and commonly used approaches for studying condensed phase systems. MD calculations generally serve two often complementary purposes. First, an MD simulation can be used to study the dynamics of a system starting from particular initial conditions. Second, MD can be employed as a means of generating a collection of classical microscopic configurations in a particular equilibrium ensemble. The latter of these uses shows that MD is intimately connected with statistical mechanics and can serve as a computational tool for solving statistical mechanical problems. Indeed, even when MD is used to study a system's dynamics, one never uses just a single trajectory (generated from a single initial condition). Dynamical properties in the linear response regime, computed according to the rules of statistical mechanics from time correlation functions, require an ensemble of trajectories starting from an equilibrium distribution of initial conditions. These points underscore the importance of having efficient and rigorous techniques capable of generating equilibrium distributions. Indeed, while the problem of producing classical trajectories from a distribution of initial conditions is relatively straightforward – one simply integrates Hamilton's equations of motion – the problem of generating the equilibrium distribution for a complex system is an immense challenge for which advanced sampling techniques are often required. Whether one employs MD on its own or combines it with one of a variety of advanced sampling methods, the underlying MD scheme must be tailored to generate the desired distribution. Once such a scheme is in place, it can be employed as is or adapted for advanced sampling techniques such as umbrella sampling [1], the blue-moon ensemble approach [2, 3], or variable transformations [4]. In this contribution, our focus will be on the underlying MD schemes themselves and the problem of generating numerical integrators
for these schemes. The latter is still an open area of research in which a number of important theoretical questions remain unanswered. Thus, we will discuss the current state of knowledge and allude to the outstanding issues as they arise. At this point, it is worth mentioning that equilibrium ensemble distributions are not the sole domain of MD. Monte Carlo (MC) methods and hybrid MD/MC approaches can also be employed. Moreover, advanced sampling techniques designed to work with MC, such as configurational bias MC [5], and with hybrid methods, such as hybrid MC [6], exist as well. To some extent, the choice between MC, MD and hybrid MD/MC approaches is a matter of taste. Each has particular advantages and disadvantages, and all allow for creative innovations within their respective frameworks. A particular advantage of the MD and hybrid MD/MC approaches lies in the fact that they lend themselves well to scalable parallelization, allowing large systems and long time scales to be accessed. Indeed, efficient parallel algorithms for MD have been proposed [7] and a wide variety of parallel MD codes are available to the community via the Web, such as the NAMD (www.ks.uiuc.edu/Research/namd) and PINY MD (homepages.nyu.edu/~mt33/PINY MD/PINY.html) codes, to name just a few.

In thermodynamics, one divides the thermodynamic universe into the system and its surroundings. How the system interacts with its surroundings determines the particular ensemble distribution the system will obey. The interaction between the system and its surroundings causes certain thermodynamic variables to fluctuate and others to remain fixed. For example, if the system can exchange thermal energy with its surroundings, its internal energy will fluctuate; however, its temperature will, when equilibrium is reached, be fixed at the temperature of the surroundings. Thermodynamic variables of the system that are fixed due to its interaction with the surroundings can be viewed as "control variables," since they can be adjusted via the surroundings (e.g., changing the temperature of the surroundings will change the temperature of the system if the two can exchange thermal energy). These control variables, therefore, characterize the ensemble. Let us begin our discussion with the simplest possible case, that of a system that has no interaction with its surroundings. Let the system contain N particles in a container of volume V. Let the positions of the N particles at time t be designated r_1(t), …, r_N(t) and velocities v_1(t), …, v_N(t), and let the particles have masses m_1, …, m_N. In general, the time evolution of any classical system is given by Newton's equations of motion

m_i \ddot{r}_i = F_i   (1)
where Fi is the total force on the ith particle, and the overdot notation signifies time differentiation, i.e., r˙ i = dri /dt = vi . Thus, r¨ i is the acceleration of the ith particle. Since Newton’s equations constitute a set of 3N coupled second order differential equations, if an initial condition on the positions and
velocities r_1(0), …, r_N(0), v_1(0), …, v_N(0) is specified, the solution to Newton's equations will be a unique function of time. For a system isolated from its surroundings, the force on each particle will only be due to its interaction with all of the other particles in the system. Thus, the forces F_1, …, F_N will be functions only of the particle positions, i.e., F_i = F_i(r_1, …, r_N), and, in addition, they will be conservative, meaning that they can be expressed as the gradient of a scalar potential energy function U(r_1, …, r_N):

F_i(r_1, \ldots, r_N) = -\frac{\partial}{\partial r_i} U(r_1, \ldots, r_N)   (2)

If a conservative force is taken to act over a closed path that brings a particle back to its point of origin, no net work is done. When only conservative forces act within a system, the total energy

E = \sum_{i=1}^{N} \frac{1}{2} m_i v_i^2 + U(r_1, \ldots, r_N)   (3)
is conserved by the motion. Given the law of conservation of energy, the equations of motion for an isolated system can be cast in a way that is particularly useful for establishing the connection to equilibrium ensembles, namely, in terms of the classical Hamiltonian. The Hamiltonian is nothing more than the total energy E expressed as a function of the positions and momenta, p_i = m_i v_i. Thus, the Hamiltonian H is a function of these variables, i.e., H = H(p_1, …, p_N, r_1, …, r_N). Introducing the shorthand notation r ≡ r_1, …, r_N, p ≡ p_1, …, p_N, and substituting v_i = p_i/m_i into Eq. (3), the Hamiltonian becomes

H(p, r) = \sum_{i=1}^{N} \frac{p_i^2}{2m_i} + U(r_1, \ldots, r_N)   (4)
The equations of motion for the positions and momenta are then given by Hamilton's equations

\dot{r}_i = \frac{\partial H}{\partial p_i} = \frac{p_i}{m_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial r_i} = -\frac{\partial U}{\partial r_i}   (5)

It is straightforward to show, by substituting the time derivative of the equation for ṙ_i into the equation for ṗ_i, that Hamilton's equations are mathematically equivalent to Newton's equations (1). It is also straightforward to show that H(p, r) is conserved by simply computing dH/dt via the chain rule:

\frac{dH}{dt} = \sum_{i=1}^{N} \left( \frac{\partial H}{\partial r_i} \cdot \dot{r}_i + \frac{\partial H}{\partial p_i} \cdot \dot{p}_i \right) = \sum_{i=1}^{N} \left( \frac{\partial H}{\partial r_i} \cdot \frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial p_i} \cdot \frac{\partial H}{\partial r_i} \right) = 0   (6)
(It is important to note that the form of Hamilton's equations is valid in any set of generalized coordinates q_1, …, q_{3N}, p_1, …, p_{3N}, i.e., q̇_k = ∂H/∂p_k, ṗ_k = −∂H/∂q_k.) Just as for Newton's equations, given an initial condition (p(0), r(0)), Hamilton's equations will generate a unique solution (r(t), p(t)) that conserves the total Hamiltonian, i.e., that satisfies H(p(t), r(t)) = constant. This condition tells us that the positions and momenta are not all independent variables. In order to understand what this means, let us introduce an abstract 6N-dimensional space, known as phase space, in which 3N of the mutually orthogonal axes are labeled by the 3N position variables and the other 3N axes are labeled by the 3N momentum variables. Since a classical system is completely specified by specifying all of the positions and momenta, a classical microscopic state, or classical microstate, is represented by a single point in the phase space. The condition H(p, r) = constant defines a (6N − 1)-dimensional hypersurface in the phase space known as the constant energy hypersurface. It, therefore, becomes clear that any solution to Hamilton's equations will, for all time, remain on a constant energy hypersurface determined by the initial conditions. If the dynamics is such that the trajectory is able to visit every point of the constant energy hypersurface given an infinite amount of time, then the trajectory is said to be ergodic. There is no general way to prove that a given trajectory is ergodic, and, indeed, in many cases, an arbitrary solution of Hamilton's equations will not be ergodic. However, if a trajectory is ergodic, then it will generate a sampling of classical microscopic states corresponding to constant total energy, E. Moreover, since the system is in isolation, the particle number N and volume V are trivially conserved. The collection of classical microscopic states corresponding to constant N, V, and E comprises the statistical mechanical ensemble known as the microcanonical ensemble. In the microcanonical ensemble, the classical microstates must be distributed according to f(p, r) ∝ δ(H(p, r) − E), which satisfies the equilibrium Liouville equation {f, H} = 0, where {…, …} is the classical Poisson bracket. Thus, an ergodic trajectory generates not only the dynamics of the system but also the complete microcanonical ensemble. This tells us that any physical observable expressible as an average ⟨A⟩ over the ensemble,

\langle A \rangle = \frac{M_N}{\Omega(N, V, E)} \int dp \int_{D(V)} dr\; A(p, r)\, \delta\left( H(p, r) - E \right),   (7)
of a classical phase space function A(p, r), where M_N = E_0/(N! h^{3N}), E_0 is a reference energy, h is Planck's constant, D(V) is the spatial domain defined by the containing volume, and Ω(N, V, E) is the microcanonical partition function

\Omega(N, V, E) = M_N \int dp \int_{D(V)} dr\; \delta\left( H(p, r) - E \right),   (8)
can be computed from a time average over an ergodic trajectory:

\langle A \rangle = \bar{A} \equiv \lim_{T \to \infty} \frac{1}{T} \int_0^{T} dt\; A(p(t), r(t))   (9)
In Eq. (8), the phase space volume element dp dr = dp_1 ⋯ dp_N dr_1 ⋯ dr_N is a 6N-dimensional volume element. The Dirac delta-function δ(H(p, r) − E) restricts the integration over the phase space to only those points that lie on the constant energy hypersurface. Clearly, then, the microcanonical partition function corresponds to the total number of microscopic states contained in the microcanonical ensemble. It is, therefore, related to the entropy of the system S(N, V, E) via Boltzmann's relation

S(N, V, E) = k \ln \Omega(N, V, E)   (10)
where k is Boltzmann’s constant. From this, it is clear that the partition function leads to other thermodynamic quantities via differentiation. The temperature, pressure and chemical potential, for example, are given by
\frac{1}{T} = \left( \frac{\partial S}{\partial E} \right)_{N,V} = k \left( \frac{\partial \ln \Omega}{\partial E} \right)_{N,V}, \qquad
\frac{P}{T} = \left( \frac{\partial S}{\partial V} \right)_{N,E} = k \left( \frac{\partial \ln \Omega}{\partial V} \right)_{N,E}, \qquad
\frac{\mu}{T} = -\left( \frac{\partial S}{\partial N} \right)_{V,E} = -k \left( \frac{\partial \ln \Omega}{\partial N} \right)_{V,E}   (11)
The complexity of the forces in Hamilton's equations is such that an analytical solution is not possible, and one must resort to numerical techniques. In constructing numerical integration schemes, it is important to preserve two properties that characterize Hamiltonian systems. The first is known as Liouville's theorem. For simplicity, let us denote the phase space trajectory (p(t), r(t)) simply by x_t, known as the phase space vector. Since the solution x_t to Hamilton's equations is a unique function of the initial condition x_0, we can express x_t as a function of x_0, i.e., x_t = x_t(x_0). This designation shows that Hamilton's equations generate a transformation of the complete set of phase space variables from x_0 → x_t. If we consider a small volume element dx_t in phase space, this volume element will transform according to

dx_t = J(x_t; x_0)\, dx_0   (12)
where J(x_t; x_0) is the Jacobian |∂x_t/∂x_0| of the transformation. Liouville's theorem states that J(x_t; x_0) = 1, or equivalently that

dx_t = dx_0   (13)
In other words, the phase space volume element is conserved. Liouville’s theorem is a consequence of the fact that Hamiltonian systems have a vanishing
phase space compressibility κ(x), defined in an analogous manner to the usual hydrodynamic compressibility:

\kappa(x) = \nabla \cdot \dot{x} = \sum_{i=1}^{N} \left( \frac{\partial}{\partial p_i} \cdot \dot{p}_i + \frac{\partial}{\partial r_i} \cdot \dot{r}_i \right) = \sum_{i=1}^{N} \left( -\frac{\partial}{\partial p_i} \cdot \frac{\partial H}{\partial r_i} + \frac{\partial}{\partial r_i} \cdot \frac{\partial H}{\partial p_i} \right) = 0   (14)
The second property is the time reversibility of Hamilton's equations. This property implies that if an initial condition x_0 is allowed to evolve up to time t, at which point all of the momenta are reversed, the system will, in another time interval of length t, return to the point x_0. Any numerical integration scheme applied to Hamilton's equations should respect these two properties, as they both ensure that all points of the constant energy hypersurface are given equal statistical weighting, as required by equilibrium statistical mechanics. A class of integrators that satisfies these conditions are the so-called symplectic integrators.

In devising a numerical integrator for Hamilton's equations, it is certainly possible to use a Taylor series approach and expand the solution x_t for a short time t = Δt about t = 0. While this method is adequate for Hamiltonian systems described by Eq. (4), it generally fails for more complicated Hamiltonian forms as well as for non-Hamiltonian systems of the type we will be considering shortly for generating other ensembles. For this reason, we will introduce a more powerful and elegant approach based on operator calculus. This approach begins by recognizing that Hamilton's equations can be cast in a compact form as

\dot{r}_i = iL\, r_i, \qquad \dot{p}_i = iL\, p_i   (15)

where a linear operator iL (i = √−1) has been introduced, given by

iL = \sum_{i=1}^{N} \left( \frac{\partial H}{\partial p_i} \cdot \frac{\partial}{\partial r_i} - \frac{\partial H}{\partial r_i} \cdot \frac{\partial}{\partial p_i} \right) = \sum_{i=1}^{N} \left( \frac{p_i}{m_i} \cdot \frac{\partial}{\partial r_i} + F_i \cdot \frac{\partial}{\partial p_i} \right)   (16)
This operator is known as the Liouville operator. Note that the operator L, itself, is Hermitian. Thus, the equations of motion can be cast in terms of the phase space vector as x˙ = iL x, which has the formal solution x t = eiLt x0
(17)
The unitary operator exp(iLt) is known as the classical propagator. Since the classical propagator cannot be evaluated analytically for any but the simplest of systems, it would seem that Eq. (17) is little better than a formal device. In fact, Eq. (17) is the starting point for the derivation of practically useful numerical integrators. In order to use Eq. (17) in this way, it is necessary to introduce an approximation to the classical propagator. To begin, note that iL can be written in the form

iL = iL_1 + iL_2   (18)
where

iL_1 = \sum_{i=1}^{N} \frac{p_i}{m_i} \cdot \frac{\partial}{\partial r_i}, \qquad iL_2 = \sum_{i=1}^{N} F_i \cdot \frac{\partial}{\partial p_i}   (19)
Although these two operators do not commute, the propagator exp(iLt) can be factorized according to the Trotter theorem:

e^{iLt} = \lim_{M \to \infty} \left[ e^{iL_2 t/2M}\; e^{iL_1 t/M}\; e^{iL_2 t/2M} \right]^M   (20)
where M is an integer. As will be seen shortly, each of the operators in brackets can be evaluated analytically. Thus, the exact propagator could be evaluated by dividing the time t into an infinite number of "steps" of length t/M and evaluating the operator in brackets for each of these steps. While this is obviously not possible in practice, if we take M to be a finite number, a practical scheme emerges. For finite M, Eq. (20) becomes
e^{iLt} \approx \left[ e^{iL_2 t/2M}\; e^{iL_1 t/M}\; e^{iL_2 t/2M} \right]^M + O(t^3/M^2)
e^{iLt/M} \approx e^{iL_2 t/2M}\; e^{iL_1 t/M}\; e^{iL_2 t/2M} + O(t^3/M^3)
e^{iL\Delta t} \approx e^{iL_2 \Delta t/2}\; e^{iL_1 \Delta t}\; e^{iL_2 \Delta t/2} + O(\Delta t^3)   (21)
where, in the second line, the 1/M power of both sides is taken, and, in the third line, the identification Δt = t/M is made. The error terms in each line illustrate the difference between the global error in the long-time limit and the error in a single short time step. While the latter is Δt³, the former is t³/M² = t Δt², indicating that the error in a long trajectory generated by repeated application of the approximate propagator in Eq. (21) is actually Δt², despite the fact that the error in the approximate short-time propagator is Δt³. In order to illustrate how to evaluate the action of the approximate propagator in Eq. (21), consider a single particle moving in one dimension. Let q and p be the coordinate and conjugate momentum of the particle. The equations of motion are simply q̇ = p/m and ṗ = F(q). Thus, the approximate propagator becomes

e^{iL\Delta t} = e^{(\Delta t/2) F(q)\, \partial/\partial p}\; e^{\Delta t (p/m)\, \partial/\partial q}\; e^{(\Delta t/2) F(q)\, \partial/\partial p}   (22)
In order to evaluate the action of each of the operators, we only need the operator identity

e^{c\, \partial/\partial x} f(x) = f(x + c)   (23)
where c is independent of x. This identity can be proved by expanding the exponential of the operator in a Taylor series. This type of operator is called a shift or translation operator because it has the effect of shifting x by an amount c. Applying the operator to the phase space vector (q, p) gives
\begin{pmatrix} q(\Delta t) \\ p(\Delta t) \end{pmatrix} = e^{(\Delta t/2) F(q)\, \partial/\partial p}\; e^{\Delta t (p/m)\, \partial/\partial q}\; e^{(\Delta t/2) F(q)\, \partial/\partial p} \begin{pmatrix} q \\ p \end{pmatrix}

Acting with the operators from right to left, the first shifts p by (Δt/2)F(q), the second shifts q by (Δt/m)[p + (Δt/2)F(q)], and the third shifts the resulting momentum by (Δt/2)F evaluated at the new coordinate, giving

\begin{pmatrix} q(\Delta t) \\ p(\Delta t) \end{pmatrix} = \begin{pmatrix} q + \dfrac{\Delta t}{m}\left[ p + \dfrac{\Delta t}{2} F(q) \right] \\[2mm] p + \dfrac{\Delta t}{2}\left[ F(q) + F\!\left( q + \dfrac{\Delta t}{m} p + \dfrac{\Delta t^2}{2m} F(q) \right) \right] \end{pmatrix}   (24)
Since the last line is just (q(Δt), p(Δt)) starting from the initial condition (q, p), the algorithm becomes, after substituting (q(0), p(0)) for the initial condition:

q(\Delta t) = q(0) + \Delta t\, v(0) + \frac{\Delta t^2}{2m} F(q(0))
v(\Delta t) = v(0) + \frac{\Delta t}{2m} \left[ F(q(0)) + F(q(\Delta t)) \right]   (25)
where the momentum has been replaced by the velocity v = p/m. Equation (25) is the well-known velocity Verlet algorithm. However, it has been derived in a very powerful way starting from the classical propagator. In fact, the real power of the operator approach is that it can eliminate the need to derive a set of explicit finite difference equations. To see this, note that the velocity Verlet
algorithm can be written in the following equivalent way:

v(\Delta t/2) = v(0) + \frac{\Delta t}{2m} F(q(0))
q(\Delta t) = q(0) + \Delta t\, v(\Delta t/2)
v(\Delta t) = v(\Delta t/2) + \frac{\Delta t}{2m} F(q(\Delta t))   (26)
Written in this way, it becomes clear that the three assignments in Eq. (26) correspond to the three operators in Eq. (22), i.e., a shift by an amount (Δt/2m)F(q(0)) applied to the velocity v(0), followed by a shift of the coordinate q(0) by Δt v(Δt/2), followed by a shift of v(Δt/2) by an amount (Δt/2m)F(q(Δt)). Note that the input to each operation is just the output of the previous operation. This fact suggests that one can simply look at an operator such as that of Eq. (22) and directly write the instructions in code corresponding to each operator, only keeping in mind that when the coordinate changes, the force needs to be recalculated. We call this technique of translating the operators in a given factorization scheme directly into instructions in code the direct translation method [8]. Applying this approach to Eq. (22), the following pseudocode could be written down immediately just by looking at the operator expression:

v ← v + (Δt/2) ∗ F/m       !! Shift the velocity
q ← q + Δt ∗ v             !! Shift the coordinate
Call GetNewForce(q, F)     !! Evaluate force at new coordinate
v ← v + (Δt/2) ∗ F/m       !! Shift the velocity
(27)
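In a scripting language the direct translation of Eq. (27) is nearly verbatim. A minimal sketch for a single particle in one dimension follows (the force function and parameter values are illustrative, not from the text):

```python
def velocity_verlet_step(q, v, F, force, dt, m):
    """One velocity Verlet step: a line-by-line translation of the three
    operators in Eq. (22), in the order of the pseudocode of Eq. (27)."""
    v += 0.5 * dt * F / m      # shift the velocity (half step)
    q += dt * v                # shift the coordinate
    F = force(q)               # evaluate the force at the new coordinate
    v += 0.5 * dt * F / m      # shift the velocity (half step)
    return q, v, F

# usage: a harmonic oscillator with m = omega = 1
force = lambda q: -q
q, v = 1.0, 0.0
F = force(q)
for _ in range(1000):
    q, v, F = velocity_verlet_step(q, v, F, force, dt=0.01, m=1.0)
```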
The velocity Verlet method is an example of a symplectic integrator, as can be shown by computing the Jacobian of the transformation (q(0), p(0)) → (q(Δt), p(Δt)). One could also factorize the propagator according to

e^{iL\Delta t} = e^{(\Delta t/2)(p/m)\, \partial/\partial q}\; e^{\Delta t\, F(q)\, \partial/\partial p}\; e^{(\Delta t/2)(p/m)\, \partial/\partial q}   (28)
and obtain yet another symplectic integrator known as the position Verlet method [9]. The use of the Liouville operator formalism also allows for easy development of integrators capable of exploiting the natural separation of time scales in many complex systems to yield more efficient algorithms [9]. Having seen how to devise numerical integration algorithms for the microcanonical ensemble, we now take up the issue of generating other ensembles. The next case we will consider is that of a system interacting with its surroundings via exchange of thermal energy. If the temperature of the surroundings is T , then, in equilibrium, the system will also have this temperature, and its internal energy will fluctuate. However, since only thermal energy is exchanged with the surroundings, the number of particles N and volume V of the system
are trivially conserved. Thus, in this case, we have an ensemble whose thermodynamic control variables are N, V and T, known as the canonical ensemble. In this ensemble, the average of any quantity A(p, r) is given by

\langle A \rangle = \frac{C_N}{Q(N, V, T)} \int dp \int_{D(V)} dr\; A(p, r)\, e^{-\beta H(p, r)}   (29)
where C_N = 1/(N! h^{3N}), β = 1/kT, and Q(N, V, T) is the canonical partition function

Q(N, V, T) = C_N \int dp \int_{D(V)} dr\; e^{-\beta H(p, r)}   (30)
Thermodynamic quantities in the canonical ensemble are given in terms of the partition function as follows. The Helmholtz free energy is

A(N, V, T) = -\frac{1}{\beta} \ln Q(N, V, T)   (31)
The pressure, internal energy, chemical potential, and heat capacity at constant volume are given by

P = kT \left( \frac{\partial \ln Q(N,V,T)}{\partial V} \right)_{N,T}, \qquad
E = -\left( \frac{\partial \ln Q(N,V,T)}{\partial \beta} \right)_{N,V},
\mu = -kT \left( \frac{\partial \ln Q(N,V,T)}{\partial N} \right)_{V,T}, \qquad
C_V = k\beta^2 \left( \frac{\partial^2 \ln Q(N,V,T)}{\partial \beta^2} \right)_{N,V}   (32)
In the canonical ensemble, the surroundings act as a heat bath coupled to the system. Thus, unless we treat explicitly the surroundings that might be present in an actual constant-temperature experiment, we cannot determine how this coupling will affect the dynamics of the system. Since this is clearly out of the question, the only alternative is to mimic the effect of the surroundings in a simple way so as to ensure that the system will be driven to generate a canonical distribution. There is no unique way to accomplish this, a fact that has led practitioners of MD to propose a variety of methods. One class of methods that has become increasingly popular since its introduction is that of the so-called extended phase space methods, originally pioneered by Andersen [10]. In this class of methods, the physical position and momentum variables of the particles in the system are supplemented by additional phase space variables that mimic the effect of the surroundings by controlling the fluctuations in certain quantities in such a way that their averages are
consistent with the desired ensemble. For example, in the canonical ensemble, additional variables are used to control the fluctuation in the instantaneous kinetic energy Σ_i p_i²/2m_i such that its average is 3NkT/2. Extended phase space methods based on both Hamiltonian and non-Hamiltonian dynamical systems have been proposed. The former include the original formulation by Nosé [11] and the more recent Nosé–Poincaré method [12]. The latter include the well-known Nosé–Hoover [17] and Nosé–Hoover chain [13] approaches as well as the more recent generalized Gaussian moment method [14]. It is not possible to discuss all of these methods here, so we will focus on the Nosé–Hoover and Nosé–Hoover chain approaches, which are among the most widely used. Since these methods are of the non-Hamiltonian variety, it is necessary to review some of the basic statistical mechanics of non-Hamiltonian systems [15, 16]. Consider a non-Hamiltonian system with a generic smooth evolution equation

\dot{x} = \xi(x)   (33)
where ξ(x) is a vector function. A clear signature of a non-Hamiltonian system will be a non-vanishing compressibility, κ(x), although non-Hamiltonian systems with vanishing compressibility exist as well. The consequence of nonzero compressibility is that the Jacobian of the transformation x0 −→ xt is no longer 1, and the Liouville theorem of Eq. (13) does not hold. However, for a large class of non-Hamiltonian systems described by Eq. (33), a generalization of Liouville’s theorem can be derived [15, 16]. This generalization states that a metric-weighted volume element is conserved, i.e.,
\sqrt{g(x_t, t)}\; dx_t = \sqrt{g(x_0, 0)}\; dx_0   (34)

where the metric factor \sqrt{g(x_t, t)} is given by

\sqrt{g(x_t, t)} = e^{-w(x_t, t)}   (35)
where the function w(x) is related to the compressibility by κ(x_t) = dw(x_t, t)/dt. Equation (34) shows that, for non-Hamiltonian systems, phase space integrals should use e^{−w(x,t)} dx as the integration measure rather than just dx. This will be an important point in the analysis of the dynamical systems we will be considering. Finally, although Eq. (34) allows for time-dependent metrics, the systems we will be considering all have time-independent metric factors. Suppose the non-Hamiltonian system in Eq. (33) has a time-independent metric factor and a set of N_c conservation laws Λ_k(x) = C_k, k = 1, …, N_c, where Λ_k is a function on the phase space and C_k is a constant. Then, if the system is ergodic, it will generate a microcanonical distribution f(x) = Π_{k=1}^{N_c} δ(Λ_k(x) − C_k), which
satisfies a non-Hamiltonian generalization of the Liouville equation [15, 16]. The corresponding partition function is

\Omega = \int dx\; e^{-w(x)} \prod_{k=1}^{N_c} \delta\left( \Lambda_k(x) - C_k \right)   (36)
The first non-Hamiltonian system we will consider for generating the canonical distribution is the Nosé–Hoover (NH) equations [17]. In the Nosé–Hoover system, an additional variable η, its corresponding momentum p_η, and a "mass" Q (so designated even though Q actually has units of energy × time²) are introduced into a Hamiltonian system as follows:

\dot{r}_i = \frac{p_i}{m_i}, \qquad \dot{p}_i = F_i - \frac{p_\eta}{Q}\, p_i, \qquad \dot{\eta} = \frac{p_\eta}{Q}, \qquad \dot{p}_\eta = \sum_{i=1}^{N} \frac{p_i^2}{m_i} - 3NkT   (37)

The physics embodied in Eqs. (37) is based on the fact that the term −(p_η/Q)p_i in the momentum equation acts as a kind of dynamic frictional force. Although the average ⟨p_η⟩ = 0, instantaneously p_η can be positive or negative and can, therefore, act to damp or boost the momenta. According to the equation for p_η, if the kinetic energy is larger than 3NkT/2, p_η will increase and have a greater damping effect on the momenta, while if the kinetic energy is less than 3NkT/2, p_η will decrease and have a greater boosting effect on the momenta. In this way, the NH system acts as a "thermostat" regulating the kinetic energy so that its average is the correct canonical value. Equations (37) have the conserved energy

H' = \sum_{i=1}^{N} \frac{p_i^2}{2m_i} + U(r_1, \ldots, r_N) + \frac{p_\eta^2}{2Q} + 3NkT\eta = H(p, r) + \frac{p_\eta^2}{2Q} + 3NkT\eta   (38)
where H(p, r) is the Hamiltonian of the physical system. Moreover, the compressibility of Eqs. (37) is

\kappa(x) = \sum_{i=1}^{N} \left( \frac{\partial}{\partial p_i} \cdot \dot{p}_i + \frac{\partial}{\partial r_i} \cdot \dot{r}_i \right) + \frac{\partial \dot{\eta}}{\partial \eta} + \frac{\partial \dot{p}_\eta}{\partial p_\eta} = -3N \frac{p_\eta}{Q} = -3N\dot{\eta}   (39)
This implies that w(x) = −3Nη, and the metric factor is √g(x) = exp(3Nη). If Eq. (38) is the only conservation law, then the partition function generated by Eqs. (37) can be written down as

\Omega = \int dp \int_{D(V)} dr \int d\eta \int dp_\eta\; e^{3N\eta}\, \delta\!\left( H(p, r) + \frac{p_\eta^2}{2Q} + 3NkT\eta - E \right)   (40)
Performing the integrals over the variables η and p_η yields the partition function of the physical subsystem:

\Omega = \frac{1}{3NkT} \int dp \int_{D(V)} dr \int dp_\eta\; \exp\!\left[ \frac{1}{kT}\left( E - H(p, r) - \frac{p_\eta^2}{2Q} \right) \right] = \frac{\sqrt{2\pi QkT}\; e^{E/kT}}{3NkT} \int dp \int_{D(V)} dr\; e^{-H(p, r)/kT}   (41)
which shows that the partition function for the physical system is canonical apart from the prefactors. Although this analysis would suggest that the NH equations should always produce a canonical distribution, it turns out that if even a single additional conservation law is obeyed by the system, Eqs. (37) will fail [16]. Figure 1 shows that, for a simple harmonic oscillator coupled to the NH thermostat, the physical phase space and the position and momentum distributions are not those of the canonical ensemble. Note that in N-particle systems a common additional conservation law is conservation of the total momentum, Σ_{i=1}^N p_i = K, where K is a constant vector. This conservation law is obeyed by systems on which no external forces act, so that Σ_{i=1}^N F_i = 0. Conservation of total momentum is an example of a common conservation law in N-particle systems that can cause the NH equations to fail rather spectacularly [16].

A solution to this problem was devised by Martyna et al. [13] in the form of the Nosé–Hoover chain equations. In this scheme, the heat bath variables themselves are connected to a heat bath, which, in turn, is connected to a heat bath, until a "chain" of M heat baths is generated. The equations of motion are

\dot{r}_i = \frac{p_i}{m_i}, \qquad \dot{p}_i = F_i - \frac{p_{\eta_1}}{Q_1}\, p_i
\dot{\eta}_k = \frac{p_{\eta_k}}{Q_k}, \quad k = 1, \ldots, M
\dot{p}_{\eta_k} = G_k - \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k}, \quad k = 1, \ldots, M-1
\dot{p}_{\eta_M} = G_M
Figure 1. Simple harmonic oscillator with momentum p, coordinate q, mass m = 1, frequency ω = 1 and temperature kT = 1. Top left: Poincaré section (pq plane) of the oscillator when coupled to the Nosé–Hoover thermostat with Q = 1 and q(0) = 0, p(0) = 1, η(0) = 0, p_η(0) = 1. Middle left: The position distribution function of the oscillator. The solid line is the distribution function generated by the NH dynamics while the dashed line is the analytical result for a canonical ensemble. Bottom left: Same for the momentum distribution. Top right: Poincaré section for the Nosé–Hoover chain scheme with M = 4, q(0) = 0, p(0) = 1, η_k(0) = 0, p_{η_k}(0) = (−1)^k. Middle right: The position distribution function. The solid line is the distribution function generated by the NHC dynamics while the dashed line is the analytical result. Bottom right: Same for the momentum distribution. In all simulations, the equations of motion were integrated for 5×10^6 steps using a time step of 0.01 and a fifth-order SY decomposition with n_c = 5.
where the heat-bath forces have been introduced and are given by

G_1 = \sum_{i=1}^{N} \frac{p_i^2}{m_i} - 3NkT, \qquad G_k = \frac{p_{\eta_{k-1}}^2}{Q_{k-1}} - kT   (42)
Equations (42) have the conserved energy

H' = H(p, r) + \sum_{k=1}^{M} \frac{p_{\eta_k}^2}{2Q_k} + dNkT\eta_1 + kT \sum_{k=2}^{M} \eta_k   (43)

where d is the spatial dimension (d = 3 here),
Generating equilibrium ensembles via molecular dynamics
603
and a compressibility

\kappa(x) = -3N\dot{\eta}_1 - \sum_{k=2}^{M} \dot{\eta}_k   (44)
By allowing the "length" of the chain to be arbitrarily long, the problem of unexpected conservation laws is avoided. In Fig. 1, the physical phase space and the momentum and position distributions for a harmonic oscillator coupled to a thermostat chain of length M = 4 are shown. It can be seen that the correct canonical distribution is obtained. The general proof that the canonical distribution is generated by Eqs. (42) follows the same pattern as for the NH equations. However, even if additional conservation laws, such as conservation of total momentum, are obeyed, the NHC equations will still generate the correct distribution [16]. The NHC scheme can be used in a flexible manner to enhance the equilibration of a system. For example, rather than using a single global NHC thermostat, it is also possible to couple many NHCs to a system, one to each of a small number of degrees of freedom. In fact, coupling one NHC to each degree of freedom has been shown to lead to a highly effective method for studying quantum systems via the Feynman path integral using molecular dynamics [18].

In order to develop a numerical integration algorithm for the NHC equations, it is important to keep in mind the modified Liouville theorem, Eq. (34). The complexity of the NHC equations is such that a Taylor series approach cannot be employed to derive a satisfactory integrator, i.e., one that does not lead to substantial drifts in the conserved energy [19]. Thus, the NHC system is an example of a problem on which the power of the Liouville operator method can be brought to bear. We begin by writing the total Liouville operator for Eqs. (42) as

iL = iL_1 + iL_2 + iL_T   (45)
where iL_1 and iL_2 are given by Eq. (19) and

iL_T = \sum_{k=1}^{M} \frac{p_{\eta_k}}{Q_k} \frac{\partial}{\partial \eta_k} + \sum_{k=1}^{M-1} \left( G_k - \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \right) \frac{\partial}{\partial p_{\eta_k}} + G_M \frac{\partial}{\partial p_{\eta_M}} - \frac{p_{\eta_1}}{Q_1} \sum_{i=1}^{N} p_i \cdot \frac{\partial}{\partial p_i}   (46)

The propagator is now factorized in a manner very similar to the velocity Verlet algorithm:
e^{iL\Delta t} = e^{iL_T \Delta t/2}\; e^{iL_2 \Delta t/2}\; e^{iL_1 \Delta t}\; e^{iL_2 \Delta t/2}\; e^{iL_T \Delta t/2} + O(\Delta t^3)   (47)
The only new feature in this scheme is the operator exp(iL_T Δt/2). Application of this operator to the phase space requires some care. Clearly, the operator needs to be further factorized into individual operators that can be applied
analytically. However, the NHC equations constitute a stiff set of differential equations, and, therefore, a simple O(Δt³) factorization scheme will not be accurate enough. Thus, for this operator, a higher-order factorization is needed. Note that the overall integrator will still be O(Δt³) despite the use of a higher-order method on the thermostat operator. The higher-order method we choose is the Suzuki–Yoshida (SY) scheme [20, 21], which involves the introduction of weighted time steps w_j Δt, j = 1, …, n_sy; the value of n_sy determines the order of the method. The weights w_j are required to satisfy Σ_{j=1}^{n_sy} w_j = 1 and are chosen so as to cancel out the lower-order error terms. Applying the SY scheme, the operator exp(iL_T Δt/2) becomes
e^{iL_T \Delta t/2} = \prod_{j=1}^{n_{sy}} e^{iL_T w_j \Delta t/2}   (48)
In order to avoid needing to choose n_sy too high, another device can be introduced, namely, simply cutting the time step by a factor of n_c and applying the operator in Eq. (48) n_c times, i.e.,

e^{iL_T \Delta t/2} = \prod_{i=1}^{n_c} \prod_{j=1}^{n_{sy}} e^{iL_T w_j \Delta t/2n_c}   (49)
In this way, both n_c and n_sy can be adjusted so as to minimize the number of operations needed for satisfactory performance of the overall integrator. Having introduced the above scheme, it only remains to specify a particular factorization of the operator exp(iL_T w_j Δt/2n_c). Defining δ_j = w_j Δt/n_c, we choose the following factorization:
e^{iL_T \delta_j/2} = e^{(\delta_j/4) G_M\, \partial/\partial p_{\eta_M}}
\times \prod_{k=M-1}^{1} \left[ e^{-(\delta_j/8)(p_{\eta_{k+1}}/Q_{k+1})\, p_{\eta_k} \partial/\partial p_{\eta_k}}\; e^{(\delta_j/4) G_k\, \partial/\partial p_{\eta_k}}\; e^{-(\delta_j/8)(p_{\eta_{k+1}}/Q_{k+1})\, p_{\eta_k} \partial/\partial p_{\eta_k}} \right]
\times e^{-(\delta_j/2)(p_{\eta_1}/Q_1) \sum_{i=1}^{N} p_i \cdot \partial/\partial p_i}
\times \prod_{k=1}^{M} e^{(\delta_j/2)(p_{\eta_k}/Q_k)\, \partial/\partial \eta_k}
\times \prod_{k=1}^{M-1} \left[ e^{-(\delta_j/8)(p_{\eta_{k+1}}/Q_{k+1})\, p_{\eta_k} \partial/\partial p_{\eta_k}}\; e^{(\delta_j/4) G_k\, \partial/\partial p_{\eta_k}}\; e^{-(\delta_j/8)(p_{\eta_{k+1}}/Q_{k+1})\, p_{\eta_k} \partial/\partial p_{\eta_k}} \right]
\times e^{(\delta_j/4) G_M\, \partial/\partial p_{\eta_M}}   (50)
Although the overall scheme may seem complicated, the use of the direct translation technique simplifies considerably the job of coding the algorithm.
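To illustrate, here is a minimal sketch of one substep exp(iL_T δ_j/2) of Eq. (50) obtained by direct translation (names are illustrative; `p` and `m` hold the particle momenta and masses, `Nf` the number of degrees of freedom):

```python
import numpy as np

def nhc_halfstep(p, m, eta, p_eta, Q, kT, Nf, delta):
    """Apply exp(iL_T * delta/2) of Eq. (50) by direct translation.
    p, m: particle momenta and masses; eta, p_eta, Q: chain of length M."""
    M = len(Q)
    twoK = float(np.sum(p * p / m))            # 2 x kinetic energy

    def G(k):                                  # heat-bath forces, Eq. (42)
        return (twoK - Nf * kT) if k == 0 else p_eta[k-1]**2 / Q[k-1] - kT

    p_eta[M-1] += 0.25 * delta * G(M-1)
    for k in range(M - 2, -1, -1):             # down the chain: scale-kick-scale
        s = np.exp(-0.125 * delta * p_eta[k+1] / Q[k+1])
        p_eta[k] = s * (s * p_eta[k] + 0.25 * delta * G(k))
    scale = np.exp(-0.5 * delta * p_eta[0] / Q[0])
    p *= scale                                 # scale the particle momenta
    twoK *= scale * scale                      # so G(0) sees the scaled energy
    eta += 0.5 * delta * p_eta / Q             # shift the chain positions
    for k in range(M - 1):                     # back up the chain
        s = np.exp(-0.125 * delta * p_eta[k+1] / Q[k+1])
        p_eta[k] = s * (s * p_eta[k] + 0.25 * delta * G(k))
    p_eta[M-1] += 0.25 * delta * G(M-1)
```

In a full NHC integrator such a routine would be called n_c × n_sy times, with δ = w_j Δt/n_c, before and after the velocity Verlet core, as prescribed by Eqs. (47)–(49).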
All of the operators appearing in Eq. (50) are either translation operators or operators of the form exp(cx∂/∂ x), the action of which is
e^{c x\, \partial/\partial x}\, x = x\, e^c   (51)
We call such operators scaling operators, because the effect is to multiply x by an x-independent factor e^c. The examples of Fig. 1 were generated using the above scheme.

The last ensemble we will discuss corresponds to a system that interacts with its surroundings through exchange of thermal energy and via a mechanical piston that adjusts the volume of the system until its internal pressure is equal to the external pressure of the surroundings. Such an ensemble is characterized by constant particle number N, internal pressure P, and temperature T and is known as the isothermal-isobaric ensemble. In this ensemble, it is necessary to consider all possible values of the volume. Thus, the average of any quantity A(p, r) is given by

\langle A \rangle = \frac{D_N}{\Delta(N, P, T)} \int_0^{\infty} dV\; e^{-\beta P V} \int dp \int_{D(V)} dr\; A(p, r)\, e^{-\beta H(p, r)}   (52)
where D_N = 1/(N! h^{3N} V_0), with V_0 being a reference volume, and where the partition function Δ(N, P, T) is given by

\Delta(N, P, T) = D_N \int_0^{\infty} dV\; e^{-\beta P V} \int dp \int_{D(V)} dr\; e^{-\beta H(p, r)}   (53)
The thermodynamic quantities defined in this ensemble are the Gibbs free energy, given by

G(N, P, T) = -\frac{1}{\beta} \ln \Delta(N, P, T)   (54)
and the average volume, average enthalpy, chemical potential, and constant-pressure heat capacity, given, respectively, by

\langle V \rangle = -kT \left( \frac{\partial \ln \Delta(N,P,T)}{\partial P} \right)_{N,T}, \qquad
\langle H \rangle = -\left( \frac{\partial \ln \Delta(N,P,T)}{\partial \beta} \right)_{N,P},
\mu = -kT \left( \frac{\partial \ln \Delta(N,P,T)}{\partial N} \right)_{P,T}, \qquad
C_P = k\beta^2 \left( \frac{\partial^2 \ln \Delta(N,P,T)}{\partial \beta^2} \right)_{N,P}   (55)
As with the canonical ensemble, there is no unique way to generate the correct volume fluctuations. Nevertheless, among the various algorithms that have been proposed for constant pressure MD, it can be shown [16] that they do not all generate the correct isothermal-isobaric distribution. We shall, therefore, focus on the Martyna–Tobias–Klein (MTK) algorithm [22], which has been shown to give both the correct phase space and volume distributions. The MTK approach uses both a set of thermostat variables to control the kinetic energy fluctuations as well as a barostat to control the fluctuations in the instantaneous pressure. The latter is given by the virial expression

P_{\text{int}} = \frac{1}{3V} \left[ \sum_{i=1}^{N} \frac{p_i^2}{m_i} + \sum_{i=1}^{N} r_i \cdot F_i - 3V \frac{\partial U}{\partial V} \right]   (56)
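A direct transcription of Eq. (56) reads as follows (a sketch; `dUdV` denotes the explicit volume derivative of the potential, zero for potentials with no explicit volume dependence):

```python
import numpy as np

def internal_pressure(p, m, r, F, V, dUdV=0.0):
    """Instantaneous internal pressure from the virial, Eq. (56).
    p, r, F are (N, 3) arrays; m is the (N,) array of masses."""
    twoK = np.sum(p * p / m[:, None])      # sum_i p_i^2 / m_i
    virial = np.sum(r * F)                 # sum_i r_i . F_i
    return (twoK + virial - 3.0 * V * dUdV) / (3.0 * V)
```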
Finally, the volume V is also treated as a dynamical variable. Thus, the equations of motion take the form

\dot{r}_i = \frac{p_i}{m_i} + \frac{p_\epsilon}{W}\, r_i, \qquad
\dot{p}_i = F_i - \left( 1 + \frac{1}{N} \right) \frac{p_\epsilon}{W}\, p_i - \frac{p_{\eta_1}}{Q_1}\, p_i,
\dot{V} = \frac{3V p_\epsilon}{W}, \qquad
\dot{p}_\epsilon = 3V\left( P_{\text{int}} - P \right) + \frac{1}{N} \sum_{i=1}^{N} \frac{p_i^2}{m_i} - \frac{p_{\xi_1}}{Q'_1}\, p_\epsilon,
\dot{\eta}_k = \frac{p_{\eta_k}}{Q_k}, \quad \dot{p}_{\eta_k} = G_k - \frac{p_{\eta_{k+1}}}{Q_{k+1}}\, p_{\eta_k} \;(k = 1, \ldots, M-1), \quad \dot{p}_{\eta_M} = G_M,   (57)
\dot{\xi}_k = \frac{p_{\xi_k}}{Q'_k}, \quad \dot{p}_{\xi_k} = G'_k - \frac{p_{\xi_{k+1}}}{Q'_{k+1}}\, p_{\xi_k} \;(k = 1, \ldots, M-1), \quad \dot{p}_{\xi_M} = G'_M

In Eqs. (57), the variable p_ε with mass parameter W (having units of energy × time²) corresponds to the barostat, coupling both to the positions and the momenta; the primed masses Q'_k and forces G'_k belong to the barostat's own thermostat chain ξ_1, …, ξ_M. If the system is subject to a set of holonomic constraints, leaving only N_f degrees of freedom, then the 1/N factors appearing in Eq. (57) must be replaced by 3/N_f in three spatial dimensions. Moreover, note that two Nosé–Hoover chains are coupled to the system, one to the particles and the other to the barostat. This device is particularly important, as the barostat tends to evolve on a much slower time scale than the particles. The heat-bath forces G'_k are defined by

G'_1 = \frac{p_\epsilon^2}{W} - kT, \qquad G'_k = \frac{p_{\xi_{k-1}}^2}{Q'_{k-1}} - kT   (58)
The MTK equations have the conserved energy

H' = H(p, r) + \frac{p_\epsilon^2}{2W} + PV + \sum_{k=1}^{M} \left( \frac{p_{\eta_k}^2}{2Q_k} + \frac{p_{\xi_k}^2}{2Q'_k} \right) + dNkT\eta_1 + kT \sum_{k=2}^{M} \eta_k + kT \sum_{k=1}^{M} \xi_k   (59)
and a phase space metric factor

\sqrt{g(x)} = \exp\left[ dN\eta_1 + \sum_{k=2}^{M} \eta_k + \sum_{k=1}^{M} \xi_k \right]   (60)
In order to prove that the MTK equations generate a correct isothermal-isobaric distribution, one needs to substitute Eqs. (60) and (59) into Eq. (36) and perform the integrals over all of the heat bath variables and p_ε, following the same procedure as was done for the canonical ensemble. Moreover, since Nosé–Hoover chain thermostats are employed in the MTK scheme, the correct distribution will also be generated even if additional conservation laws, such as total momentum, are obeyed by the system. Integrating the MTK equations is only slightly more difficult than integrating the NHC equations and builds on the technology already developed. We begin by introducing the variable ε = (1/3) ln(V/V_0) and writing the total Liouville operator as

iL = iL_1 + iL_2 + iL_{\epsilon,1} + iL_{\epsilon,2} + iL_{T\text{-baro}} + iL_{T\text{-part}}   (61)
where

iL_1 = \sum_{i=1}^{N} \left( \frac{p_i}{m_i} + \frac{p_\epsilon}{W}\, r_i \right) \cdot \frac{\partial}{\partial r_i}, \qquad
iL_2 = \sum_{i=1}^{N} \left( F_i - \alpha \frac{p_\epsilon}{W}\, p_i \right) \cdot \frac{\partial}{\partial p_i},
iL_{\epsilon,1} = \frac{p_\epsilon}{W} \frac{\partial}{\partial \epsilon}, \qquad
iL_{\epsilon,2} = G_\epsilon \frac{\partial}{\partial p_\epsilon}   (62)
and iL_{T-part} and iL_{T-baro} are defined in an analogous manner to Eq. (46). In Eq. (62), α = 1 + 1/N, and

G_\epsilon = \alpha \sum_{i=1}^{N} \frac{p_i^2}{m_i} + \sum_{i=1}^{N} r_i \cdot F_i - 3V \frac{\partial \phi}{\partial V} - 3PV   (63)
The propagator is factorized in a manner that bears a very close resemblance to that of the NHC equations, namely

e^{iL\Delta t} = e^{iL_{T\text{-baro}} \Delta t/2}\; e^{iL_{T\text{-part}} \Delta t/2}\; e^{iL_{\epsilon,2} \Delta t/2}\; e^{iL_2 \Delta t/2}\; e^{iL_{\epsilon,1} \Delta t}\; e^{iL_1 \Delta t}\; e^{iL_2 \Delta t/2}\; e^{iL_{\epsilon,2} \Delta t/2}\; e^{iL_{T\text{-part}} \Delta t/2}\; e^{iL_{T\text{-baro}} \Delta t/2} + O(\Delta t^3)   (64)
In evaluating the action of this propagator, the Suzuki–Yoshida decomposition already developed for the NHC equations is applied to the operators exp(iL_{T-baro} Δt/2) and exp(iL_{T-part} Δt/2). The operators exp(iL_{ε,1} Δt) and exp(iL_{ε,2} Δt/2) are simple translation operators. The operators exp(iL_1 Δt) and exp(iL_2 Δt/2) are somewhat more complicated than their microcanonical or canonical ensemble counterparts due to the barostat coupling. The action of the operator exp(iL_1 Δt) can be determined by solving the differential equation

\dot{r}_i = v_i + v_\epsilon r_i   (65)
for constant v_i = p_i/m_i and constant v_ε = p_ε/W for an arbitrary initial condition r_i(0) and evaluating the solution at t = Δt. This yields the evolution

r_i(\Delta t) = r_i(0)\, e^{v_\epsilon \Delta t} + \Delta t\, v_i(0)\, e^{v_\epsilon \Delta t/2}\, \frac{\sinh(v_\epsilon \Delta t/2)}{v_\epsilon \Delta t/2}   (66)
Similarly, the action of exp(iL_2 Δt/2) can be determined by solving the differential equation

\dot{v}_i = \frac{F_i}{m_i} - \alpha v_\epsilon v_i   (67)

for an arbitrary initial condition v_i(0) and evaluating the solution at t = Δt/2. This yields the evolution

v_i(\Delta t/2) = v_i(0)\, e^{-\alpha v_\epsilon \Delta t/2} + \frac{\Delta t}{2m_i} F_i(0)\, e^{-\alpha v_\epsilon \Delta t/4}\, \frac{\sinh(\alpha v_\epsilon \Delta t/4)}{\alpha v_\epsilon \Delta t/4}   (68)
In practice, the factor sinh(x)/x should be evaluated by a power series for small x to avoid numerical instabilities. These equations, together with the Suzuki–Yoshida factorization of the thermostat operators, completely define an integrator for the isothermal-isobaric ensemble that can be shown to satisfy Eq. (34). The integrator can be easily coded using the direct translation technique.
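For instance, the barostat-coupled position update of Eq. (66) might be coded as follows, with sinh(x)/x switched to its Taylor series 1 + x²/6 + x⁴/120 + x⁶/5040 near x = 0 (a sketch with illustrative names):

```python
import numpy as np

def sinhx_over_x(x):
    """sinh(x)/x, via its Taylor series for small x to avoid 0/0."""
    if abs(x) < 1e-4:
        x2 = x * x
        return 1.0 + (x2 / 6.0) * (1.0 + (x2 / 20.0) * (1.0 + x2 / 42.0))
    return np.sinh(x) / x

def position_update(r, v, v_eps, dt):
    """r_i(dt) of Eq. (66), with barostat velocity v_eps = p_eps / W."""
    x = 0.5 * v_eps * dt
    return r * np.exp(2.0 * x) + dt * v * np.exp(x) * sinhx_over_x(x)
```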
As an example, the MTK algorithm is applied to the problem of a particle moving in the one-dimensional potential

\phi(q, V) = \frac{m\omega^2 V^2}{4\pi^2} \left[ 1 - \cos\left( \frac{2\pi q}{V} \right) \right]   (69)

where V is the one-dimensional "volume" or box length. The system is coupled to the MTK thermostat/barostat and subject to periodic boundary conditions. Figure 2 shows the position and volume distributions generated, together with the analytical results. It can be seen that the method is capable of generating correct distributions of both the phase space and the volume. We conclude this contribution with a few closing remarks. First, the MTK equations can be generalized [22] to treat anisotropic pressure fluctuations, as in the Parrinello–Rahman scheme [23]. In this case, one considers the full 3 × 3
Figure 2. Top: The position distribution f(q) of the system described by the periodic potential of Eq. (69) in the isothermal-isobaric ensemble. The numerical and analytical distributions are shown as the solid and dashed lines, respectively. Bottom: Same for the volume distribution f(V). Nosé–Hoover chains of length 4 were coupled to the particle and to the barostat. The mass m and frequency ω were both taken to be 1, W = 18, kT = 1, P = 1, Q_k = 1, Q'_k = 9. The time step was taken to be 0.005, and the equations of motion were integrated for 5×10^7 steps using a seventh-order SY scheme with n_c = 6.
cell matrix h = (a, b, c), where a, b, and c, which form the columns of h, are the three cell vectors. The partition function for this ensemble is

\Delta(N, P, T) = \int dh\; \frac{e^{-\beta P \det(h)}}{[\det(h)]^2} \int dp \int_{D(h)} dr\; e^{-\beta H(p, r)}   (70)
Although we will not discuss the equations of motion here, we remark that it is important to generate the correct factors of det(h) (recall det(h) = V) in the distribution. The generalized MTK algorithm has been shown to achieve this. Next, the reader may have noticed the glaringly obvious absence of a pure MD-based approach to the grand canonical ensemble. Although a number of important proposals for generating this ensemble via MD have appeared in the literature, there is no standard, widely adopted approach to this problem, as is the case for the canonical and isothermal-isobaric ensembles, and the development of such a method for the grand canonical ensemble remains an open question. The main problem with the grand canonical ensemble comes from the need to treat the fluctuations in a discrete variable, N. Here, adiabatic dynamics techniques adapted to allow slow insertion and deletion of particles in the system at constant chemical potential might be useful. Finally, although we encourage the use of the Liouville operator approach in developing integrators for new sets of equations of motion, this method is not foolproof and must be used with some degree of caution, particularly for non-Hamiltonian systems. Not every factorization scheme applied to the propagator of a non-Hamiltonian system is guaranteed to preserve the phase space volume as Eq. (34) requires. Although significant attempts have been made to develop a general procedure for devising such factorization schemes, not enough is known at this point about the phase space structure of non-Hamiltonian systems for a truly general theory of numerical integration, so this, too, remains an open area. An advantage, however, of the Liouville operator approach is that it renders the problem of combining the NHC and MTK schemes with multiple time scale methods [9] and constraints [24] relatively transparent.
References

[1] G.M. Torrie and J.P. Valleau, "Nonphysical sampling distributions in Monte Carlo free energy estimation: umbrella sampling," J. Comput. Phys., 23, 187, 1977.
[2] E.A. Carter, G. Ciccotti, J.T. Hynes, and R. Kapral, "Constrained reaction coordinate dynamics for the simulation of rare events," Chem. Phys. Lett., 156, 472, 1989.
[3] M. Sprik and G. Ciccotti, "Free energy from constrained molecular dynamics," J. Chem. Phys., 109, 7737, 1998.
[4] Z. Zhu, M.E. Tuckerman, S.O. Samuelson, and G.J. Martyna, "Using novel variable transformations to enhance conformational sampling in molecular dynamics," Phys. Rev. Lett., 88, 100201, 2002.
[5] J.I. Siepmann and D. Frenkel, "Configurational bias Monte Carlo – a new sampling scheme for flexible chains," Mol. Phys., 75, 59, 1992.
[6] S. Duane, A.D. Kennedy, B.J. Pendleton, and D. Roweth, "Hybrid Monte Carlo," Phys. Lett. B, 195, 216, 1987.
[7] S. Plimpton, "Fast parallel algorithms for short-range molecular dynamics," J. Comput. Phys., 117, 1, 1995.
[8] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein, "Explicit reversible integrators for extended systems dynamics," Mol. Phys., 87, 1117, 1996.
[9] M.E. Tuckerman, G.J. Martyna, and B.J. Berne, "Reversible multiple time scale molecular dynamics," J. Chem. Phys., 97, 1990, 1992.
[10] H. Andersen, "Molecular dynamics at constant temperature and/or pressure," J. Chem. Phys., 72, 2384, 1980.
[11] S. Nosé, "A unified formulation of the constant temperature molecular dynamics methods," J. Chem. Phys., 81, 511, 1984.
[12] S.D. Bond, B.J. Leimkuhler, and B.B. Laird, "The Nosé–Poincaré method for constant temperature molecular dynamics," J. Comput. Phys., 151, 114, 1999.
[13] G.J. Martyna, M.E. Tuckerman, and M.L. Klein, "Nosé–Hoover chains: the canonical ensemble via continuous dynamics," J. Chem. Phys., 97, 2635, 1992.
[14] Y. Liu and M.E. Tuckerman, "Generalized Gaussian moment thermostatting: a new continuous dynamical approach to the canonical ensemble," J. Chem. Phys., 112, 1685, 2000.
[15] M.E. Tuckerman, C.J. Mundy, and G.J. Martyna, "On the classical statistical mechanics of non-Hamiltonian systems," Europhys. Lett., 45, 149, 1999.
[16] M.E. Tuckerman, Y. Liu, G. Ciccotti, and G.J. Martyna, "Non-Hamiltonian molecular dynamics: generalizing Hamiltonian phase space principles to non-Hamiltonian systems," J. Chem. Phys., 115, 1678, 2001.
[17] W.G. Hoover, "Canonical dynamics – equilibrium phase space distributions," Phys. Rev. A, 31, 1695, 1985.
[18] M.E. Tuckerman, B.J. Berne, G.J. Martyna, and M.L. Klein, "Efficient molecular dynamics and hybrid Monte Carlo algorithms for path integrals," J. Chem. Phys., 99, 2796, 1993.
[19] M.E. Tuckerman and G.J. Martyna, Comment on "Simple reversible molecular dynamics algorithms for Nosé–Hoover chain dynamics," J. Chem. Phys., 110, 3623, 1999.
[20] H. Yoshida, "Construction of higher-order symplectic integrators," Phys. Lett. A, 150, 262, 1990.
[21] M. Suzuki, "General theory of fractal path integrals with applications to many-body theories and statistical physics," J. Math. Phys., 32, 400, 1991.
[22] G.J. Martyna, D.J. Tobias, and M.L. Klein, "Constant-pressure molecular-dynamics algorithms," J. Chem. Phys., 101, 4177, 1994.
[23] M. Parrinello and A. Rahman, "Crystal structure and pair potentials – a molecular-dynamics study," Phys. Rev. Lett., 45, 1196, 1980.
[24] J.P. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, "Numerical integration of the cartesian equations of motion of a system with constraints – molecular dynamics of n-alkanes," J. Comput. Phys., 23, 327, 1977.
2.10 BASIC MONTE CARLO MODELS: EQUILIBRIUM AND KINETICS
George Gilmer¹ and Sidney Yip²
¹Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550, USA
²Department of Nuclear Science and Engineering and Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
1. Monte Carlo Simulations in Statistical Physics
Monte Carlo (MC) is a very general computational technique that can be used to carry out sampling of distributions. Random numbers are employed in the sampling, and often in other parts of the code. One definition of MC, based on common usage in the literature, is: any calculation that involves significant applications of random numbers. Historical accounts place the naming of this method in March 1947, when Metropolis suggested it for his method of evaluating the equilibrium properties of atomic systems, and this is the application that we will discuss in this section [1]. An important sampling technique is the one named after Metropolis, which we will describe below. There are several areas of computation besides the statistical mechanics of atomic systems where MC is used. An efficient method for the numerical evaluation of many-dimensional integrals is to apply random sampling techniques to the integrand [2]. A second application is the simulation of random walk diffusion processes in statistical mechanics and condensed matter physics [3]. Tracking particles and radiation (neutrons, photons, charged particles) during transport in non-equilibrium systems is another important area [4–7]. Models for crystal growth, ion implantation, radiation damage and other non-equilibrium systems often make use of random numbers. For example, in most of the MC models of ion implantation, the positions where the ions impinge on the surface of the target are selected using random numbers, whereas the trajectories of ions and target atoms are calculated deterministically using atomic collision theory. In models with diffusion, such as crystal growth and the annealing of radiation damage, the decision on which direction to move a particle or defect performing a random walk is determined by random numbers.
1.1. Ensemble Averages
In statistical physics one can find the average of a property A({r}) that is a function of the coordinates {r} of N particles, in a system that is in thermodynamic equilibrium,

$$\langle A \rangle = \frac{\int d^{3N}r \, A(\{r\}) \, \exp[-U(\{r\})/kT]}{\int d^{3N}r \, \exp[-U(\{r\})/kT]}. \qquad (1)$$
The calculation involves averaging the dynamical variable of interest, A, which depends on the positions of all the particles in the system, over an appropriate thermodynamic ensemble. Often the canonical ensemble is chosen: one with a fixed number of particles, volume, and temperature, N, V, and T. In this case the configurations are weighted by the Boltzmann factor exp[−U({r})/kT], where U is the potential energy of the system and k the Boltzmann constant. Integration is over the positions of all particles (3N coordinates). The denominator in Eq. (1) is needed for normalization, and is an important quantity in its own right, because the Helmholtz free energy can be obtained from it (for a system with the independent variables N, V, and T). We consider two ways to perform the indicated integral. It is clearly overkill to integrate over all of configuration space, because the integral is 3N-dimensional, where N may have values of thousands or millions. The selection of some representative points is a reasonable alternative. One approach is to sample Ω distinct configurations randomly and then obtain ⟨A⟩ by approximating Eq. (1) by a sum over this set of configurations,

$$\langle A \rangle = \frac{\sum_{i=1}^{\Omega} A(\{r\}_i) \, \exp[-U(\{r\}_i)/kT]}{\sum_{i=1}^{\Omega} \exp[-U(\{r\}_i)/kT]}. \qquad (2)$$
The configurations could be selected by use of a random number generator. One could fill the computational cell by assigning coordinates to the N atoms using a sequence of numbers ξ_i that are uniform in the range (0, 1), scaling 3N values of ξ by the edge lengths of the rectangular computational cell. However, this procedure would be grossly inefficient. In a solid or liquid system, many of the atoms in such a random configuration would be overlapping, giving a huge potential energy, and hence a negligible weight, exp[−U({r})/kT], in the sampling procedure. The net result is that only a small fraction of "low-energy" configurations would determine the value of ⟨A⟩, and even these configurations would likely have potential energies much larger than the typical equilibrium value of U. To get around this difficulty, a second approach may be used, in which the sampled configurations are picked in a way that is biased by the probability that they will appear in the equilibrium ensemble, i.e., using the factor exp[−U({r})/kT]. Then ⟨A⟩ is determined by weighting the contribution from
each configuration equally, since the bias in the selection of configurations accounts for the Boltzmann weighting factor,

$$\langle A \rangle = \frac{\sum_{i=1}^{\Omega,C_n} A(\{r\}_i)}{\sum_{i=1}^{\Omega,C_n} \delta_{ii}}, \qquad (3)$$
where the {r}_i are configurations sampled from the biased distribution, as indicated by the Cn above the summation sign. (The denominator is simply the number of states summed over, Ω.) How does one carry out this biased summation? One way is to adopt the procedure developed by Metropolis et al. in 1953 [8]. This procedure is an example of the concept of importance sampling in MC methods [9].
1.2. Metropolis Sampling
One option for obtaining a set of configurations biased by exp[−U({r})/kT] is to take small excursions from an initial configuration that has a low energy U({r}). The initial coordinates could be those of N atoms in a perfect crystalline lattice structure at 0 K. Then, an atom is picked at random and given a displacement that is small enough that the atom will not approach a neighbor too closely, yet large enough to produce a significant displacement or change in the system energy. Let the initial position of the particle be (x, y, z). Imagine now displacing the particle from its initial position to a trial position (x + αξ_x, y + αξ_y, z + αξ_z), where α is a constant and ξ_x, ξ_y, and ξ_z are random numbers uniform in the interval (−1, 1). The value of α for obtaining the optimum sampling of phase space depends on the conditions, including density and T, among others. It could be determined from a preliminary run, or optimized as the simulation proceeds. With this move the system goes from configuration {r}_j → {r}_{j+1}. The Metropolis procedure now consists of four steps.

1. Move the system in the way just described.
2. Calculate ΔU = U(final) − U(initial) = U_{j+1} − U_j, i.e., ΔU is the energy change resulting from the move.
3. If ΔU < 0, accept the move. This means leaving the particle in its new position.
4. If ΔU > 0, accept the move provided ξ < exp[−ΔU/kT], where ξ is a fourth random number in the interval (0, 1).

The Metropolis sampling technique generates a series of configurations, each of which is closely related to the previous one. This is true because of the small change to the total configuration effected by the displacement of only one atom. The series is, however, a Markov chain, since it satisfies the condition that the new configuration is derived from the previous one, without taking into account the history of states before it.
Figure 1. Illustration of the chain of states created by the Metropolis algorithm for a model of a group of adatoms on a crystal surface; the states i, i + 1, . . . , i + 4 follow one another along the Markov chain (the time axis). Each state differs from the one preceding it by the displacement of one atom to a neighboring lattice site.
This is very different from molecular dynamics (MD) simulations, where the momentum of the particles plays an important role in determining the configuration of the next iteration. Figure 1 shows a schematic of the states generated by the Metropolis algorithm for a lattice gas, modeling a group of adatoms on the (100) face of a crystal. The elementary move in this model is a diffusion hop of an atom to a neighboring lattice site, and clearly the four hops in this series left much of the system unchanged. We see that it is the uphill moves of step 4 that account for the effect of temperature on the distribution of the system over the energy states. High temperature increases the magnitude of the Boltzmann factor, and therefore the probability of acceptance of moves that increase the energy of the system. If not for step 4, step 3 would only allow the system to go downhill in energy U, which would mean that the system of atoms would lose potential energy systematically and end up in a local energy minimum.
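To make the four-step procedure concrete, the following is a minimal sketch in Python (not from the original article). The harmonic pair repulsion standing in for the interatomic potential, the box size L, and the maximum displacement α are placeholder assumptions; a real study would substitute a proper potential and tune α for a reasonable acceptance rate.

```python
import numpy as np

rng = np.random.default_rng(0)

def U(r):
    # Placeholder potential energy: soft harmonic repulsion between all
    # pairs closer than one length unit. A real simulation would use an
    # interatomic potential such as Lennard-Jones.
    d = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    iu = np.triu_indices(len(r), k=1)
    return np.sum(np.where(d[iu] < 1.0, (1.0 - d[iu]) ** 2, 0.0))

def metropolis_sweep(r, kT, alpha, L):
    # One sweep = one trial move per particle, on average.
    for _ in range(len(r)):
        i = rng.integers(len(r))                    # pick an atom at random
        trial = r.copy()
        trial[i] += alpha * rng.uniform(-1, 1, 3)   # step 1: small trial displacement
        trial[i] %= L                               # periodic boundary conditions
        dU = U(trial) - U(r)                        # step 2: energy change of the move
        # steps 3 and 4: always accept downhill moves; accept uphill moves
        # with probability exp(-dU/kT)
        if dU < 0 or rng.random() < np.exp(-dU / kT):
            r = trial
    return r

# Usage: equilibrate 16 particles in a cubic box of side L = 4
r = rng.uniform(0.0, 4.0, (16, 3))
for _ in range(100):
    r = metropolis_sweep(r, kT=0.5, alpha=0.1, L=4.0)
```

Recomputing the full energy at every trial move is simple but wasteful; production codes evaluate only the terms involving the displaced atom.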
2. Proof that Metropolis Sampling Results in a Canonical Ensemble
One can show that the Metropolis procedure samples the distribution of states biased by exp[−U/kT]. Consider two states (configurations) of the system, i and j, and let U_i > U_j. According to the Metropolis procedure, the probability of an (i → j) transition is P_i ν_ij, where P_i is the probability that the system is in state i, and ν_ij is the transition probability that a system in state i will go to state j. Similarly, the probability of a (j → i) transition is P_j ν_ij exp[−(U_i − U_j)/kT], where we have used the fact that ν_ji = ν_ij exp[−(U_i − U_j)/kT] according to the Metropolis procedure described above. At equilibrium the two transitions must occur with equal probability; otherwise the populations of some states in the ensemble would be increasing, others decreasing, and the system would not be in equilibrium. This is the principle of microscopic reversibility, or detailed balance. Figure 2 shows an example of this for a lattice model of an atomic system.
Figure 2. The microscopic reversibility condition on the transition rates (or probabilities) between two states i and j: ν_ij exp(−U_i/kT) = ν_ji exp(−U_j/kT). This condition is necessary to ensure that there is an equilibrium state for the system.
Thus, equating the probability of an (i → j) transition to that for the reverse transition, we find

$$P_i = P_j \exp[-(U_i - U_j)/kT], \quad \text{or} \quad P_i = C \exp[-U_i/kT] \;\; \text{and} \;\; P_j = C \exp[-U_j/kT], \qquad (4)$$
where C is a normalization constant. Whereas (4) relates the probability of finding the ensemble in state i to that for state j, based on the direct transitions between the two states, it also applies to states without direct transitions. Of course, a system can reach internal equilibrium only if there is a sequence of states, connected by direct transitions, between any two states in the system; that is, all of the states are interconnected. Any model that does not satisfy this condition will have isolated pockets of states in phase space that will not equilibrate with each other. But a system of states that are interconnected in this way will have all states satisfying Eq. (4), which is the canonical ensemble. This completes the proof of the Metropolis sampling method. Stated again, the Metropolis method is an efficient way to sample states of the system with a bias equal to the Boltzmann factor, which has the same form as the canonical distribution in thermodynamics. It is worth noting that this method can also be used in optimization problems, where one is interested in finding the global minimum of a function of many parameters. One example is to calculate the optimum arrangement of the components of a silicon device to minimize the path length of the electrical interconnect lines; the analog of energy is the total length of the conducting lines. The method is better than standard energy minimization methods such as the conjugate gradient procedure, because it allows the system energy (length of interconnect lines) to increase occasionally during the search for the global minimum. This feature allows it to surmount energy
barriers and visit more than one minimum. The approach to optimization problems is similar to that used to find the global minimum in the energy of an atomic system. A large initial "annealing temperature" is chosen, since this allows the system to pass between minima. The "temperature" is then reduced in steps until the system eventually reaches zero temperature and settles into a minimum, hopefully the global minimum and the desired optimum value. This is the basis of the "simulated annealing" algorithm used for optimization problems [10].
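A minimal simulated-annealing sketch follows (not from the original article); the cost function, neighbor move, and cooling schedule are placeholder choices.

```python
import math
import random

random.seed(1)

def anneal(cost, neighbor, x0, T0=10.0, cooling=0.95, steps_per_T=200, T_min=1e-3):
    # Metropolis acceptance with a "temperature" that is reduced in steps;
    # the cost function plays the role of the energy.
    x, c = x0, cost(x0)
    best_x, best_c = x, c
    T = T0
    while T > T_min:
        for _ in range(steps_per_T):
            y = neighbor(x)
            dc = cost(y) - c
            if dc < 0 or random.random() < math.exp(-dc / T):
                x, c = y, cost(y)
                if c < best_c:
                    best_x, best_c = x, c
        T *= cooling  # lower the annealing temperature
    return best_x, best_c

# Toy usage: a rugged 1D function with many local minima
f = lambda x: (x - 2.0) ** 2 + 2.0 * math.sin(5.0 * x)
step = lambda x: x + random.uniform(-0.5, 0.5)
print(anneal(f, step, x0=-5.0))
```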
3. Free Energy Calculations
As mentioned earlier, the Helmholtz free energy of an atomic system can be obtained from an integration of the Boltzmann factor over phase space,

$$F = -kT \, \ln\!\left( V^{-N} \int d^{3N}r \, \exp[-U(\{r\})/kT] \right),$$

or, in an unbiased sample equivalent to Eq. (2),

$$F = -kT \, \ln \frac{\sum_{i=1}^{\Omega} \exp[-U(\{r\}_i)/kT]}{\sum_{i=1}^{\Omega} \delta_{ii}}. \qquad (5)$$
This result is not very helpful for obtaining F, however, for the same reason that Eq. (2) is not a useful way to average properties in a canonical ensemble. Again, the major contributions to the sum in (5) occur in well-ordered configurations with atoms avoiding close encounters with their neighbors, whereas the random sampling approach will yield very few such low potential energy configurations indeed. The Metropolis algorithm does not help either. In the biased sample derived from the Metropolis technique, the equivalent of (5) includes a term that cancels the bias in the sum in the numerator and denominator. For purposes of understanding we can assume that we sum over the same states, but that the number of times a given state is included in the Metropolis series, or its degeneracy, is proportional to the Boltzmann factor. That is, each state in the canonical ensemble has, in effect, been multiplied by exp[−U ({r}i )/kT ] because of the preferential choice of states with low potential energy. Therefore to obtain the equivalent of Eq. (5) in the canonical ensemble sums, we simply multiply each term of the sums by exp[U ({r}i )/kT ], giving
$$F = -kT \, \ln \frac{\sum_{i=1}^{\Omega,C_n} \delta_{ii}}{\sum_{i=1}^{\Omega,C_n} \exp[U(\{r\}_i)/kT]} = -kT \, \ln \frac{1}{\langle \exp[U(\{r\}_i)/kT] \rangle_{C_n}}. \qquad (6)$$
Although the evaluation of ⟨exp[U({r}_i)/kT]⟩_Cn by the Metropolis method is a formally valid way to obtain the free energy, it is totally impractical. The sum in the denominator of the middle expression in Eq. (6) will not be evaluated accurately, since each term is large when the bias factor is small, and vice versa. All states are therefore equally important for evaluating the average ⟨exp[U({r}_i)/kT]⟩_Cn, with the bias factor canceling the exponential in each term of the sum. Importance sampling fails, because each term is equally important, even for states that have essentially zero probability of appearing in the ensemble, since these terms are multiplied by the huge exponential exp[U({r}_i)/kT]. One approach to calculating the free energy of a system of atoms is to relate it to a known reference system, e.g., a set of Einstein oscillators. If we define a potential energy U({r}_i) = λU_1({r}_i) + (1 − λ)U_0({r}_i), then as λ goes from 0 to 1, the potential energy goes from that corresponding to the interatomic potential U_0({r}_i) to that for U_1({r}_i). Differentiating Eq. (5) with respect to λ, using this definition of U({r}_i), we obtain
$$\frac{\partial F}{\partial \lambda} = \frac{\sum_{i=1}^{\Omega} \left( U_1(\{r\}_i) - U_0(\{r\}_i) \right) \exp[-U(\{r\}_i)/kT]}{\sum_{i=1}^{\Omega} \exp[-U(\{r\}_i)/kT]}, \qquad (7)$$

or

$$\frac{\partial F}{\partial \lambda} = \langle U_1(\{r\}_i) - U_0(\{r\}_i) \rangle_{C_n}, \qquad (8)$$
where the sampling in Eq. (8) is over an ensemble weighted with the Boltzmann factor exp[−{λU_1({r}_i) + (1 − λ)U_0({r}_i)}/kT]. Integration of the derivative of F with respect to λ then gives the change in F between the reference system and the system with the desired interatomic potential. Another method, known as "umbrella sampling," has been used in situations where it is desired to compare two systems with almost identical interatomic potentials, or with slightly different temperatures [11, 12]. If the interatomic potential is changed only a small amount, ΔU({r}_i) = U({r}_i) − U_0({r}_i), then it may be possible to make accurate calculations of the differences in the free energies or other properties A in a single Metropolis MC run. One chooses an "unphysical" bias potential U_UMB({r}_i), with weight exp[−U_UMB({r}_i)/kT], that will, ideally, reproduce the minimum values of both U_0({r}_i) and U({r}_i). Then ⟨A⟩_0 is given by
$$\langle A \rangle_0 = \frac{\sum_{i=1}^{\Omega,C_n^{UMB}} A(\{r\}_i) \, \exp[(-U_0 + U_{UMB})/kT]}{\sum_{i=1}^{\Omega,C_n^{UMB}} \delta_{ii} \, \exp[(-U_0 + U_{UMB})/kT]}, \qquad (9)$$
as discussed in Ref. [8]. Comparing Eq. (9) with Eq. (3) and the discussion following it, we see that the modified Metropolis method generates only one set of configurations, based on the bias potential, but the average value of A must be calculated from these configurations weighted by the appropriate exponential, as shown in (9). An analogous expression holds for ⟨A⟩ for the interatomic potential giving U({r}_i). Accurate results are only obtained for
small differences in the potential, and if the size of the atomic configuration is less than several hundred atoms. The choice of bias functions is also crucial for accurate results, but the selection of these functions usually requires some laborious trial-and-error runs. A more complete discussion of methods to obtain free energy differences is given in Chapter 2.15 by de Koning and Reinhardt. MC methods have a number of advantages over MD for obtaining free energies and other equilibrium properties. The ability to bias the sampling process and transition rates while retaining the conditions for an equilibrium ensemble provides some powerful methodologies. One of these applies to the evaluation of the properties of metastable and other defects such as dislocations, surfaces, and interfaces. Because of the small number of atoms involved compared to the total number in the system, statistical noise from the fluctuations in the bulk system will interfere with the measurement of the relatively small impact of the defect on the properties of the atomic system. MC methods allow the concentration of events on the region around the defect being investigated, while retaining the essential condition of microscopic reversibility. In this way, slowly relaxing regions can be allowed to approach a metastable equilibrium without spending most of the computer time on a less important part of the system. Slow structural rearrangements can be accommodated at the interface, without spending computer power simulating the uninteresting parts of the system as they perform their equilibrium fluctuations. MD simulations tend to be more efficient computationally than MC in the case where a system of atoms is being equilibrated at a new temperature or some other change in its conditions is implemented. The advantage for MD results from the fact that the displacements of the atoms during an MD time step are quite different from those discussed earlier for the MC methods. With classical MC, the displacement of a particle has nothing to do with the environment of the particle, but is chosen by random numbers along the three orthogonal coordinate axes. A particle that is close to a neighbor, and therefore in a strongly repulsive force field, may be given a displacement moving it even closer. Such a move will likely cause a large increase in energy and be rejected, but the cost of generating the random numbers for the unsuccessful move reduces the efficiency of the process. Furthermore, coordinated moves of a number of particles, such as those moving into a region of reduced pressure, are not possible with Metropolis MC, whereas their presence in MD allows fast relaxation of a pressure pulse or recovery from artificial initial conditions. Force-bias MC was developed to speed up MC relaxation of atomic systems [13]. In this technique, atomic displacements with a large component in the same direction as the force on an atom are selected preferentially to those that are mainly orthogonal to the force. To maintain microscopic reversibility, atoms moving against the force must also be given a larger selection probability, but
since they are likely to be moving uphill in energy and to have their move rejected, the result is that more atoms move in the desired direction. This technique is found to be effective and to increase the speed of relaxation in many MC systems. But the calculation of the forces requires extra computer time, so some applications are still faster if done by basic MC methods [13]. In cases where the flexibility of the MC technique provides strong advantages, it is likely to be worthwhile to implement the force-bias algorithm.
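Returning to the λ-coupling of Eqs. (7) and (8), the following sketch (not from the original article) illustrates the procedure in one dimension: ⟨U1 − U0⟩ is averaged in ensembles weighted by the mixed potential and then integrated over λ. The harmonic reference U0, the perturbed potential U1, and the sampling parameters are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
kT = 1.0

# Placeholder one-dimensional potentials: U0 is a harmonic ("Einstein")
# reference; U1 adds an anharmonic perturbation.
U0 = lambda x: 0.5 * x ** 2
U1 = lambda x: 0.5 * x ** 2 + 0.1 * x ** 4

def mean_dU(lam, n_steps=20000, alpha=0.5):
    # Metropolis average of U1 - U0 in the ensemble weighted by
    # exp[-(lam*U1 + (1 - lam)*U0)/kT], cf. Eq. (8).
    U = lambda x: lam * U1(x) + (1.0 - lam) * U0(x)
    x, samples = 0.0, []
    for _ in range(n_steps):
        y = x + alpha * rng.uniform(-1, 1)
        if rng.random() < np.exp(-(U(y) - U(x)) / kT):
            x = y
        samples.append(U1(x) - U0(x))
    return np.mean(samples[n_steps // 10:])  # discard the early, unequilibrated part

# Integrate dF/dlambda over lambda (trapezoidal rule) to get F1 - F0
lams = np.linspace(0.0, 1.0, 11)
vals = np.array([mean_dU(lam) for lam in lams])
dF = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(lams))
print("F1 - F0 ~", dF)
```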
4. Kinetic Interpretation of MC [6]
The Metropolis algorithm was developed primarily for obtaining equilibrium properties of a physical system. Strictly speaking, however, the method never reaches the complete equilibrium condition; that is, states whose appearance in an ensemble occurs with the probability P_i = C exp[−U_i/kT]. Consider the behavior of an infinite ensemble, i.e., an infinite number of identical computational cells, all starting in the same state, but run with different random number sequences. Calculate the ensemble-average property ⟨A⟩_i at each MC step i, starting with the initial state i = 0. In other words, we obtain the average of the system property A by averaging over the computational cells composing the ensemble after each MC event. This differs from the usual procedure, where ⟨A⟩ is averaged over the successive states of a single computational cell generated by the Metropolis method. The ensemble average ⟨A⟩_i will initially have properties similar to the initial state, ⟨A⟩_0, since most of the atoms will be in the same positions as in the starting state. Unless the initial state has very unusual properties, ⟨A⟩_i will change its value as i increases, and eventually approach an asymptotic value corresponding to equilibrium, with P_i = C exp[−U_i/kT]. The approach to the equilibrium ensemble is a property of the system "kinetics," and depends strongly on the probabilities for transitions between states, ν_ij. The ν_ij can be thought of as transition rates, in which case the approach to the equilibrium ensemble can be plotted as a function of time instead of MC event number i. A transition with ΔU < 0 has the highest probability ν_ij, and would correspond to the highest transition rate. However, transition rates proportional to the Metropolis transition probabilities are unphysical, and would not yield the kinetics of any real system. For this purpose, it is necessary to obtain rate constants for atomic diffusion, chemical reactions, and other unit mechanisms that are relevant for the physical system being studied. These may be obtained by the use of interatomic potentials in molecular dynamics simulations, as discussed in preceding chapters, or from molecular dynamics or saddle-point evaluations using density functional theory, as discussed in Chapter 1.
Kinetic Monte Carlo (KMC) is similar to equilibrium MC, but with transition rates appropriate for real systems. It can be applied both to equilibrium conditions and to conditions where the system is out of equilibrium. In order to distinguish KMC from equilibrium MC, we will use different terminology. Let P(x, t) be the probability that the system configuration is x at time t. Note that the configuration previously represented by {r}_i is now simply x. Then P(x, t) satisfies the equation

$$\frac{dP(x,t)}{dt} = -\sum_{x'} W(x \to x') \, P(x,t) + \sum_{x'} W(x' \to x) \, P(x',t), \qquad (10)$$
where W(x → x′) is the transition probability per unit time of going from x to x′ (W is analogous to ν_ij in the Metropolis method above). Equation (10) is called the master equation. For the system to be able to reach equilibrium, as discussed above, the transition probabilities must satisfy the condition of microscopic reversibility (cf. Eq. (4)),

$$P_{eq}(x) \, W(x \to x') = P_{eq}(x') \, W(x' \to x). \qquad (11)$$
At equilibrium, P(x, t) = P_eq(x) and dP(x, t)/dt = 0. Since the probability of occupying state x is

$$P_{eq}(x) = \frac{1}{Z} \exp[-U(x)/kT], \qquad (12)$$
where Z is the partition function, Z = Σ_i exp[−U({r}_i)/kT], Eq. (11) gives the basic condition that microscopic reversibility imposes on the transition probabilities:

$$\frac{W(x \to x')}{W(x' \to x)} = \exp[\{U(x) - U(x')\}/kT]. \qquad (13)$$
Equation (13) is satisfied by the Metropolis procedure, but other transition rates also satisfy this condition. As noted above, the Metropolis rates are unphysical as kinetics, but real systems also reach the same equilibrium state when their transition rates satisfy Eq. (13).
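With physical rates in hand, the master equation can be propagated with the rejection-free residence-time algorithm (Bortz–Kalos–Lebowitz [16]; see also Gillespie [15]): the next event is chosen with probability proportional to its rate, and the clock advances by an exponentially distributed increment. A minimal sketch follows, not from the original article, with placeholder rates.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmc_step(rates):
    # One rejection-free KMC step. rates[j] is the transition rate
    # W(x -> x_j) of event j out of the current state.
    ktot = rates.sum()
    j = np.searchsorted(np.cumsum(rates), rng.random() * ktot)  # event j with prob rates[j]/ktot
    dt = -np.log(1.0 - rng.random()) / ktot                     # exponential residence time
    return j, dt

# Usage: three competing events with highly disparate placeholder rates (1/s)
rates = np.array([1e10, 2e7, 4.0])
t = 0.0
for _ in range(5):
    j, dt = kmc_step(rates)
    t += dt
    print(f"event {j} occurs, clock advances to t = {t:.3e} s")
```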
5. Lattice MC: Crystal Growth
Kinetic Monte Carlo models of thin film and crystal growth are often based on the simplification of the lattice model, where atoms are confined to sites on a perfect crystal lattice. We introduced a simple case in Fig. 1, where we discussed a model of a group of atoms diffusing on a crystal surface; the model consisted of moving the atoms between lattice sites corresponding to a square array of binding sites on an fcc(100) substrate.
The potential energies of the KMC lattice gas (KMC LG) model can be obtained from empirical interatomic potentials developed for MD simulations, or from simple bond-counting methods if the properties of the model are not required to match experiments. Usually the interactions are limited to nearest neighbors, although embedded atom potentials have an effective range that is greater than the cut-off value because of indirect interactions through the embedding function. Thus, a potential that has an embedding function and pair interaction limited to first neighbors actually has interactions extending to second or third neighbors. Most KMC LG models do not account for stress fields, and as a result the potential energies U(x) take on discrete values. The Boltzmann factors for the allowed displacements can then easily be tabulated for computational efficiency. The efficiency of KMC LG models depends on the disparity of the different atomic displacement rates. The example of vapor deposition onto a crystal surface illustrates the possible effects of a large disparity. In the case of Al, the diffusion of an adatom to an adjacent site on a (111) surface requires crossing a potential energy barrier of less than 0.1 eV, according to first-principles calculations, implying a rate of approximately 10^10 hops/s at room temperature. On the other hand, the deposition of atoms by sputtering gives an accumulation rate of only about 4 nm/s for the deposited material, or a rate of 20 atoms/s impinging on each surface site. Since the models are usually designed to measure film growth processes and morphologies, it is apparent that the simulations require runs corresponding to real deposition times on the order of a second or more. But it is also necessary to include all of the diffusion hops, which requires spending a large fraction of the computer time moving adatoms around on the surface. However, the capability for performing such simulations has been increasing dramatically, both as a result of cheaper computational power and because of new algorithms that dramatically speed up the simulations. Techniques are being developed to model random walk diffusion processes without explicitly simulating each of the millions of diffusion hops, by making use of the known properties of random walks [14]. In addition, there are several methods that handle highly disparate events without the inefficiency of spending computer time calculating moves that subsequently get rejected, as in the case of the Metropolis algorithm [15–17]. Methods to treat systems with long-range correlations efficiently have also been developed [18].
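As an illustration of such a tabulation (a sketch, not from the original article), consider a bond-counting model in which an adatom with n lateral nearest neighbors hops at rate ν0 exp[−(E0 + nEb)/kT]; all parameter values below are placeholders.

```python
import numpy as np

kT = 0.025            # eV, roughly room temperature
nu0 = 1.0e13          # attempt frequency (1/s); placeholder value
E0, Eb = 0.10, 0.30   # base barrier and per-bond contribution (eV); placeholders

# With discrete bond-counting energies, the Boltzmann factors can be
# tabulated once: on a square array of binding sites an adatom has at
# most 4 lateral nearest neighbors.
hop_rate = np.array([nu0 * np.exp(-(E0 + n * Eb) / kT) for n in range(5)])
for n, rate in enumerate(hop_rate):
    print(f"{n} lateral neighbors: {rate:.3e} hops/s")
```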
6. Off-lattice KMC: Ion Implantation and Radiation Damage
The implantation of dopant ions into silicon wafers is the primary means to insert the electrically active atoms during the manufacturing of silicon
devices. Atomistic models of this process have been receiving much attention recently because of the decreasing size of silicon device components. Atomistic effects are becoming important since fluctuations in the number of dopant atoms may degrade uniformity in device properties, and control of the distribution of the dopant atoms is becoming more critical. Two distinct models are required for the simulations. The first is a model describing the entry of the energetic ions into the crystal, together with the damage resulting from silicon atoms displaced from their lattice sites. Although these models, for example MARLOWE [19], involve some use of random numbers as mentioned above, most of the computer time is spent calculating the collisions of the energetic particles with the silicon atoms. After the ions are implanted, the wafer is usually annealed to reduce the damage and improve the electrical properties of the device. This requires the simulation of several types of defects and dopant atoms diffusing through the crystal. Vacancies and interstitials are the two main defects, although the diffusion of complexes such as interstitial-dopant and vacancy-dopant pairs, interstitial dimers, divacancies, and larger clusters can have a significant influence on the redistribution and clustering of dopant atoms. A rather complex set of events can be simulated by the off-lattice KMC (KMC OL) method. In these simulations, the defects and clusters diffuse through a complex path of saddle points and potential energy minima; only the vacancy spends most of its time on lattice sites. Furthermore, the exact path of the diffusing species as a function of time is not particularly important for the KMC OL simulation, although it is essential for the more detailed first-principles calculations used to calculate overall diffusion rates. The crucial parameters for KMC OL are the binding energies between defects and dopant atoms and their mobilities, the defect–defect binding energies, the cross-sections for capture, and the recombination cross-section for vacancies and interstitials. Fortunately, there have been a number of first-principles calculations of these parameters, at least for the smaller clusters and defects. As in the case of surface diffusion, the disparity of diffusion rates is quite large, and it is essential to employ efficient algorithms for the simulations. An example of the complexity of the simulations is given in Fig. 3, where we show model calculations of the relatively simple case of the implantation of silicon ions into a silicon target using the DADOS simulator [20]. Silicon ions (5 keV of kinetic energy) are implanted into perfect crystalline silicon, dislodging some silicon atoms from their lattice sites and creating vacancies (dark spheres) and interstitials (grey spheres). Figure 3(a) shows the high concentration of defects after implantation at room temperature, with many vacancy-interstitial pairs created by the energetic ions. After a few seconds of annealing, Fig. 3(b), a large number of point defects have recombined, leaving an excess of interstitials corresponding to the implanted ions. The excess interstitials gradually aggregate and form {311} defects, Fig. 3(c) and (d).
Figure 3. Kinetic Monte Carlo results showing point defects in crystalline silicon after implantation of Si ions into perfect crystalline Si at room temperature, and during subsequent annealing at 800 °C [19]. Grey spheres represent interstitials, and dark ones vacancies; only the defects are shown. (a) corresponds to the defects after implantation at room temperature, (b) after a 1 s anneal, (c) after a 40 s anneal, and (d) after a 250 s anneal.
Note that the simulation does not predict the structure of the interstitial clusters, because of the off-lattice nature of the model. The {311} structure of the defects is inserted into the model, since it is important for the point-defect cluster interactions and cross-sections. As the defects diffuse and recombine in the initial stages and, later, as the {311} defects emit and absorb interstitials during the ripening phase, a very large number of diffusion hops take place, demanding long KMC simulations. Eventually the interstitial clusters dissolve as the interstitial excess equilibrates with the surface.
7. Simulation of Particle and Radiation Transport
MC is quite extensively used to track the individual particles as each moves through the medium of interest, streaming and colliding with the atomic constituents of the medium. To give a simple illustration, we consider the trajectory of a neutron as it enters a medium, as depicted in Fig. 4. Suppose the first interaction of this neutron is a scattering collision at point 1. After the scattering the neutron moves to point 2 where it is absorbed, causing a fission reaction which emits two neutrons and a photon. One of the neutrons streams to point 3 where it suffers a capture reaction with the emission of a photon, which in turn leaves the medium at point 6. The other neutron and the photon from the fission event both escape from the medium, to points 4 and 7, respectively, without undergoing any further collisions. By sampling a trajectory we mean that process in which one determines the position of point 1 where the scattering occurs, the outgoing neutron direction and its energy, the position of point 2 where fission occurs, the outgoing directions and energies of the two fission neutrons and the photon, etc. After tracking many such trajectories one can estimate the probability of a neutron penetrating the medium and the amount of energy deposited in the medium as a result of the reactions induced along the path of each trajectory. This is the kind of information
Figure 4. Schematic of a typical particle trajectory simulated by Monte Carlo. By repeating the simulation many times one obtains sufficient statistics to estimate the probability of radiation penetration in the case of shielding calculations, or the probability of energy deposition in the case of dosimetry problems.
that one needs in shielding calculations, where one wants to know how much material is needed to prevent the radiation (particles) from getting across the medium (a biological shield), or in dosimetry calculations where one wants to know how much energy is deposited in the medium (human tissue) by the radiation.
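The elementary sampling steps behind such particle tracking can be sketched as follows (not from the original article): the free-flight distance is drawn from the exponential distribution set by the total macroscopic cross section Σt, and the reaction type at the collision is selected with probability proportional to the partial cross sections. The cross-section values below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder macroscopic cross sections (1/cm) for each reaction type
sigma = {"scatter": 0.30, "capture": 0.04, "fission": 0.06}
sigma_t = sum(sigma.values())

def next_collision():
    # Free-flight distance: p(s) = sigma_t * exp(-sigma_t * s)
    s = -np.log(1.0 - rng.random()) / sigma_t
    # Reaction type: chosen with probability sigma_i / sigma_t
    u = rng.random() * sigma_t
    running = 0.0
    for reaction, sig in sigma.items():
        running += sig
        if u < running:
            return s, reaction

for _ in range(3):
    print(next_collision())
```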
8. Comparison of MC with MD
As discussed in several of the sections of Chapter 2, MD is a technique to generate the atomic trajectories of a system of N particles by direct numerical integration of Newton's equations of motion. In a similar spirit, we say that the purpose of MC is to generate an ensemble of atomic configurations by stochastic sampling. In both cases we have a system of N particles interacting through the same interatomic potential. In MD, the system evolves in time by following Newton's equations of motion, where particles move in response to forces created by their neighbors. The particles therefore follow the correct dynamics according to classical mechanics. In contrast, in MC the particles move by sampling a distribution such as the canonical distribution. The dynamics thus generated is stochastic, or probabilistic, rather than deterministic, as is the case for MD. The difference becomes important in problems where we wish to simulate the system over a long period of time. Because MD is constrained to the real dynamics, the time scale of the simulation is fixed by such factors as the interatomic potential and the mass of the particle. This time scale is of the order of picoseconds (10⁻¹² s). If one wants to observe a phenomenon on a longer scale, such as microseconds, it would require extensive computer resources to simulate it directly by MD. On the other hand, the time scale of MC is not fixed in the same way. KMC models
often are able to simulate many of the same phenomena as MD, but on a much longer time scale, by using a simplified description of the motion. If we consider the system of atoms on a crystal surface represented in Fig. 1, the MD simulation would consist of a substrate that provides a potential with a square array of binding sites. Mobile atoms on the substrate would vibrate around the potential energy minimum of a binding site, and occasionally surmount the barrier and hop to a neighboring site. The vibrations of the atoms around the binding site may not be of importance for many applications, but the diffusion hops to neighboring sites and the aggregation into larger clusters on the substrate could be important for studying thin film structures during annealing, as discussed earlier. A KMC model could be developed where the elementary move is a diffusion hop to a neighboring site, ignoring the vibrations. Information from the MD model on the hop rate to neighboring sites, together with the effect of neighboring atoms on the hop rate, is often used to develop the KMC model. Because of the greatly reduced frequency of the diffusion events compared to the vibrations, the simulation can cover much larger time and length scales, and yet provide the needed information on the atomic diffusion and clustering. Another way to characterize the difference between MC and MD is to consider each as a technique to sample the degrees of freedom of the system. Since we follow the particle positions and velocities in MD, we are sampling the evolution of the system of N particles in its phase space, the 6N-dimensional space of the positions and velocities of the N particles. In MC we generate a set of particle positions, so the sampling is carried out in the 3N-dimensional configuration space of the system. In both cases, the sampling generates a trajectory in the respective space, as shown in Fig. 5. Such trajectories then allow properties of the system to be calculated as averages over these trajectories. In MD one performs a time average, whereas in MC one
Figure 5. Schematic depicting the evolution of the same N-particle system in the 3N-dimensional configuration space (µ) as sampled by MC, and in the 6N-dimensional phase space (γ) as sampled by MD. In each case, the sampling results in a trajectory in the appropriate space, which is the information that allows average system properties to be calculated. For MC, the trajectory is that of a random walk (Markov chain) governed by stochastic dynamics, whereas for MD the trajectory is what we believe to be the correct dynamics, as given by Newton's equations of motion in classical mechanics. The same interatomic potential is used in the two simulations.
performs an average over discrete states. Under appropriate conditions, MC and MD give the same results for equilibrium properties, a consequence of the so-called ergodic hypothesis (ensemble average = time average); however, dynamical properties calculated using the two methods will in general not be the same.
References

[1] N. Metropolis, "The beginning of the Monte Carlo method," Los Alamos Sci., Special Issue, 125, 1987.
[2] E.J. Janse van Rensburg and G.M. Torrie, "Estimation of multidimensional integrals: is Monte Carlo the best method?" J. Phys. A: Math. Gen., 26, 943–953, 1993.
[3] A.R. Kansal and S. Torquato, "Prediction of trapping rates in mixtures of partially absorbing spheres," J. Chem. Phys., 116, 10589, 2002.
[4] H. Gould and J. Tobochnik, An Introduction to Computer Simulation Methods, Part 2, Chaps. 10–12, 14, 15, Addison-Wesley, Reading, 1988.
[5] D.W. Heermann, Computer Simulation Methods, 2nd edn., Chap. 4, Springer-Verlag, Berlin, 1990.
[6] K. Binder and D.W. Heermann, Monte Carlo Simulation in Statistical Physics, An Introduction, Springer-Verlag, Berlin, 1988.
[7] E.E. Lewis and W.F. Miller, Computational Methods of Neutron Transport, Chap. 7, American Nuclear Society, La Grange Park, IL, 1993.
[8] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., 21, 1087, 1953.
[9] M.H. Kalos and P.A. Whitlock, Monte Carlo Methods, Wiley, New York, 1986.
[10] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, "Optimization by simulated annealing," Science, 220, 671, 1983.
[11] G.M. Torrie and J.P. Valleau, "Non-physical sampling distributions in Monte Carlo free energy estimation – umbrella sampling," J. Comput. Phys., 23, 187, 1977.
[12] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
[13] M. Rao, C. Pangali, and B.J. Berne, "On the force bias Monte Carlo simulation of water: methodology, optimization and comparison with molecular dynamics," Mol. Phys., 37, 1773, 1979.
[14] J. Dalla Torre, C.-C. Fu, F. Willaime, and J.-L. Bocquet, Simulations multi-echelles des experiences de recuit de resistivite isochrone dans le Fer-ultra pur irradie aux electrons: premiers resultats, CEA Rapport Annuel, p. 94, 2003.
[15] D.T. Gillespie, "A general method for numerically simulating the stochastic time evolution of coupled chemical reactions," J. Comput. Phys., 22, 403–434, 1976.
[16] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, "A new algorithm for Monte Carlo simulation of Ising spin systems," J. Comput. Phys., 17, 10, 1975.
[17] G.H. Gilmer, "Growth on imperfect crystal faces," J. Cryst. Growth, 36, 15, 1976.
[18] R.H. Swendsen and J.S. Wang, "Replica Monte Carlo simulation of spin-glasses," Phys. Rev. Lett., 57, 2607, 1986.
[19] M.T. Robinson, "The binary collision approximation: background and introduction," Rad. Eff. Defects Sol., 130–131, 3, 1994.
[20] M.E. Law, G.H. Gilmer, and M. Jaraiz, "Simulation of defects and diffusion phenomena in silicon," MRS Bull., 25, 45, 2000.
2.11 ACCELERATED MOLECULAR DYNAMICS METHODS
Blas P. Uberuaga1, Francesco Montalenti2, Timothy C. Germann3, and Arthur F. Voter4
1 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
2 INFM, L-NESS, and Dipartimento di Scienza dei Materiali, Università degli Studi di Milano-Bicocca, Via Cozzi 53, I-20125 Milan, Italy
3 Applied Physics Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
4 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Molecular dynamics (MD) simulation, in which atom positions are evolved by integrating the classical equations of motion in time, is now a well-established and powerful method in materials research. An appealing feature of MD is that it follows the actual dynamical evolution of the system, making no assumptions beyond those in the interatomic potential, which can, in principle, be made as accurate as desired. However, the limitation in the accessible simulation time represents a substantial obstacle in making useful predictions with MD. Resolving individual atomic vibrations – a necessity for maintaining accuracy in the integration – requires time steps on the order of femtoseconds, so that reaching even one microsecond is very difficult on today's fastest processors. Because this integration is inherently sequential in nature, direct spatial parallelization does not help significantly; it just allows simulations of nanoseconds on much larger systems. Beginning in the late 1990s, methods based on a new concept have been developed for circumventing this time scale problem. For systems in which the long-time dynamical evolution is characterized by a sequence of activated events, these "accelerated molecular dynamics" methods [1] can extend the accessible time scale by orders of magnitude relative to direct MD, while retaining full atomistic detail. These methods – hyperdynamics, parallel-replica dynamics, and temperature accelerated dynamics (TAD) – have already been demonstrated on problems in surface and bulk diffusion and surface growth. With more development they will become useful for a broad range of key materials problems, such as pipe diffusion along a dislocation core, impurity clustering, grain
growth, dislocation climb and dislocation kink nucleation. Here we give an introduction to these methods, discuss their current strengths and limitations, and predict how their capabilities may develop in the next few years.
1. Background

1.1. Infrequent Event Systems
We begin by defining an "infrequent-event" system, as this is the type of system we will focus on in this article. The dynamical evolution of such a system is characterized by occasional activated events that take the system from basin to basin, events that are separated by possibly millions of thermal vibrations within one basin. A simple example of an infrequent-event system is an adatom on a metal surface at a temperature that is low relative to the diffusive jump barrier. We will exclusively consider thermal systems, characterized by a temperature T, a fixed number of atoms N, and a fixed volume V; i.e., the canonical ensemble. Typically, there is a large number of possible paths for escape from any given basin. As a trajectory in the 3N-dimensional coordinate space in which the system resides passes from one basin to another, it crosses a (3N−1)-dimensional "dividing surface" at the ridgetop separating the two basins. While on average these crossings are infrequent, successive crossings can sometimes occur within just a few vibrational periods; these are termed "correlated dynamical events" [2–4]. An example would be a double jump of the adatom on the surface. For this discussion it is sufficient, but important, to realize that such events can occur. In most of the methods presented below, we will assume that these correlated events do not occur – this is the primary assumption of transition state theory – which is actually a very good approximation for many solid-state diffusive processes. We define the "correlation time" (τcorr) of the system as the duration of the system memory. A trajectory that has resided in a particular basin for longer than τcorr has no memory of its history and, consequently, of how it got to that basin, in the sense that when it later escapes from the basin, the probability for escape is independent of how it entered the state. The relative probability for escape to a given adjacent state is proportional to the rate constant for that escape path, which we will define below. An infrequent-event system, then, is one in which the residence time in a state (τrxn) is much longer than the correlation time (τcorr). We will focus here on systems with energetic barriers to escape, but the infrequent-event concept applies equally well to entropic bottlenecks.¹
1 For systems with entropic bottlenecks, the parallel-replica dynamics method can be applied very effectively [1].
The key to the accelerated dynamics methods described here is recognizing that to obtain the right sequence of state-to-state transitions, we need not evolve the vibrational dynamics perfectly, as long as the relative probability of finding each of the possible escape paths is preserved.
1.2. Transition State Theory
Transition state theory (TST) [5–9] is the formalism underpinning all of the accelerated dynamics methods, directly or indirectly. In the TST approximation, the classical rate constant for escape from state A to some adjacent state B is taken to be the equilibrium flux through the dividing surface between A and B (Fig. 1). If there are no correlated dynamical events, the TST rate is the exact rate constant for the system to move from state A to state B. The power of TST comes from the fact that this flux is an equilibrium property of the system. Thus, we can compute the TST rate without ever propagating a trajectory. The appropriate ensemble average for the rate constant for escape from A is

$$k^{TST}_{A\rightarrow} = \langle \, |dx/dt| \; \delta(x - q) \, \rangle_A, \qquad (1)$$
where x ∈ r is the reaction coordinate and x = q the dividing surface bounding state A. The angular brackets indicate the ratio of Boltzmann-weighted integrals over the 6N-dimensional phase space (configuration space r and momentum space p). That is, for some property P(r, p),

$$\langle P \rangle = \frac{\int P(r,p) \, \exp[-H(r,p)/k_B T] \, dr \, dp}{\int \exp[-H(r,p)/k_B T] \, dr \, dp}, \qquad (2)$$
Figure 1. A two-state system illustrating the definition of the transition state theory rate constant as the outgoing flux through the dividing surface bounding state A.
where k_B is the Boltzmann constant and H(r, p) is the total energy of the system, kinetic plus potential. The subscript A in Eq. (1) indicates that the configuration space integrals are restricted to the space belonging to state A. If the effective mass (m) of the reaction coordinate is constant over the dividing surface, Eq. (1) reduces to a simpler ensemble average over configuration space only [10],

$$k^{TST}_{A\rightarrow} = \sqrt{2 k_B T / \pi m} \; \langle \delta(x - q) \rangle_A. \qquad (3)$$
The essence of this expression, and of TST, is that the Dirac delta function picks out the probability of the system being at the dividing surface, relative to everywhere else it can be in state A. Note that there is no dependence on the nature of the final state B. In a system with correlated events, not every dividing surface crossing corresponds to a reactive event, so that, in general, the TST rate is an upper bound on the exact rate. For diffusive events in materials at moderate temperatures, these correlated dynamical events typically do not cause a large change in the rate constants, so TST is often an excellent approximation. This is a key point; this behavior is markedly different from that in some chemical systems, such as molecular reactions in solution or the gas phase, where TST is just a starting point and dynamical corrections can lower the rate significantly [11]. While in the traditional use of TST, rate constants are computed after the dividing surface is specified, in the accelerated dynamics methods we exploit the TST formalism to design approaches that do not require knowing in advance where the dividing surfaces will be, or even what product states might exist.
1.3. Harmonic Transition State Theory
If we have identified a saddle point on the potential energy surface for the reaction pathway between A and B, we can use a further approximation to TST. We assume that the potential energy near the basin minimum is well described, out to displacements sampled thermally, with a second-order energy expansion – i.e., that the vibrational modes are harmonic – and that the same is true for the modes perpendicular to the reaction coordinate at the saddle point. Under these conditions, the TST rate constant becomes simply

$$k^{HTST}_{A \to B} = \nu_0 \, e^{-E_a/k_B T}, \qquad (4)$$

where

$$\nu_0 = \frac{\prod_i^{3N} \nu_i^{min}}{\prod_i^{3N-1} \nu_i^{sad}}. \qquad (5)$$
Here E_a is the static barrier height, or activation energy (the difference in energy between the saddle point and the minimum of state A (Fig. 1)), {ν_i^min} are the normal mode frequencies at the minimum of A, and {ν_i^sad} are the non-imaginary normal mode frequencies at the saddle separating A from B. This is often referred to as the Vineyard [12] equation. The analytic integration of Eq. (1) over the whole phase space thus leaves a very simple Arrhenius temperature dependence.² To the extent that there are no recrossings and the modes are truly harmonic, this is an exact expression for the rate. This harmonic TST expression is employed in the temperature accelerated dynamics method (without requiring calculation of the prefactor ν0).
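As a numerical illustration of Eqs. (4) and (5), here is a brief sketch (not from the article; the frequencies and the barrier are placeholder values):

```python
import numpy as np

kB = 8.617e-5  # Boltzmann constant in eV/K

def htst_rate(freqs_min, freqs_sad, Ea, T):
    # Vineyard prefactor, Eq. (5): product of the 3N frequencies at the
    # minimum over the 3N-1 non-imaginary frequencies at the saddle.
    # (For many modes, sum logarithms instead to avoid overflow.)
    nu0 = np.prod(freqs_min) / np.prod(freqs_sad)
    return nu0 * np.exp(-Ea / (kB * T))  # Eq. (4)

# Placeholder example: a single hopping atom, 3 modes at the minimum
# and 2 at the saddle, all around 5 THz, with a 0.5 eV barrier
f_min = np.array([5.0e12, 5.0e12, 5.0e12])
f_sad = np.array([4.0e12, 4.0e12])
print(htst_rate(f_min, f_sad, Ea=0.5, T=300.0), "escapes/s")
```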
1.4. Complex Infrequent Event Systems
The motivation for developing accelerated molecular dynamics methods becomes particularly clear when we try to understand the dynamical evolution of what we will term complex infrequent event systems. In these systems, we simply cannot guess where the state-to-state evolution might lead. The underlying mechanisms may be too numerous, too complicated, and/or have an interplay whose consequences cannot be predicted by considering them individually. In very simple systems we can raise the temperature to make diffusive transitions occur on an MD-accessible time scale. However, as systems become more complex, changing the temperature causes corresponding changes in the relative probability of competing mechanisms. Thus, this strategy will cause the system to select a different sequence of state-to-state dynamics, ultimately leading to a completely different evolution of the system, and making it impossible to address the questions that the simulation was attempting to answer. Many, if not most, materials problems are characterized by such complex infrequent events. We may want to know what happens on the time scale of milliseconds, seconds or longer, while with MD we can barely reach one microsecond. Running at higher T or trying to guess what the underlying atomic processes are can mislead us about how the system really behaves. Often for these systems, if we could get a glimpse of what happens at these longer times, even if we could only afford to run a single trajectory for that long, our understanding of the system would improve substantially. This, in essence, is the primary motivation for the development of the methods described here.
² Note that although the exponent in Eq. (4) depends only on the static barrier height E_a, in this HTST approximation there is no assumption that the trajectory passes exactly through the saddle point.
1.5. Dividing Surfaces and Transition Detection
We have implied that the ridgetops between basins are the appropriate dividing surfaces in these systems. For a system that obeys TST, these ridgetops are the optimal dividing surfaces; recrossings will occur for any other choice of dividing surface. A ridgetop can be defined in terms of steepest-descent paths – it is the (3N−1)-dimensional boundary surface that separates those points connected by steepest-descent paths to the minimum of one basin from those that are connected to the minimum of an adjacent basin. This definition also leads to a simple way to detect transitions as a simulation proceeds, a requirement of parallel-replica dynamics and temperature accelerated dynamics. Intermittently, the trajectory is interrupted and minimized through steepest descent. If this minimization leads to a basin minimum that is distinguishable from the minimum of the previous basin, a transition has occurred. An appealing feature of this approach is that it requires virtually no knowledge of the type of transition that might occur. Often only a few steepest-descent steps are required to determine that no transition has occurred. While this is a fairly robust detection algorithm, and the one used for the simulations presented below, more efficient approaches can be tailored to the system being studied.
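A minimal sketch of this quench-based transition detection follows (not from the article; the potential, step size, and tolerances are placeholder choices):

```python
import numpy as np

def steepest_descent(x, grad, step=1e-3, tol=1e-8, max_iter=100000):
    # Follow the downhill gradient until the force (nearly) vanishes
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g
    return x

def transition_detected(x_now, min_prev, grad, dist_tol=1e-3):
    # Quench the current configuration; a transition has occurred if the
    # quench lands in a minimum distinguishable from the previous one.
    min_now = steepest_descent(x_now, grad)
    return np.linalg.norm(min_now - min_prev) > dist_tol

# Toy usage: 1D double well V(x) = x^4 - 2x^2 with minima at x = -1, +1
grad = lambda x: np.array([4.0 * x[0] ** 3 - 4.0 * x[0]])
min_prev = steepest_descent(np.array([0.9]), grad)            # quenches to ~ +1
print(transition_detected(np.array([-0.8]), min_prev, grad))  # True: quenches to -1
```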
2. Parallel-Replica Dynamics
The parallel-replica method [13] is the simplest and most accurate of the accelerated dynamics techniques, with the only assumption being that the infrequent events obey first-order kinetics (exponential decay); i.e., for any time t > τcorr after entering a state, the probability distribution function for the time of the next escape is given by

$$p(t) = k_{tot} \, e^{-k_{tot} t}, \qquad (6)$$
where k_tot is the rate constant for escape from the state. For example, Eq. (6) arises naturally for ergodic, chaotic exploration of an energy basin. Parallel-replica allows for the parallelization of the state-to-state dynamics of such a system on M processors. We sketch the derivation here for equal-speed processors. For a state in which the rate to escape is k_tot, on M processors the effective escape rate will be M k_tot, as the state is being explored M times faster. Also, if the time accumulated on one processor is t_1, on the M processors a total time of t_sum = M t_1 will be accumulated. Thus, we find that

$$p(t_1) \, dt_1 = M k_{tot} \, e^{-M k_{tot} t_1} \, dt_1 \qquad (7a)$$
$$= k_{tot} \, e^{-k_{tot} t_{sum}} \, dt_{sum} \qquad (7b)$$
$$= p(t_{sum}) \, dt_{sum}, \qquad (7c)$$
and the probability to leave the state per unit time, expressed in tsum units, is the same whether it is run on one or M processors. A variation on this derivation shows that the M processors need not run at the same speed, allowing the method to be used on a heterogeneous or distributed computer; see Ref. [13]. The algorithm is schematically shown in Fig. 2. Starting with an N -atom system in a particular state (basin), the entire system is replicated on each of M available parallel or distributed processors. After a short dephasing stage during which each replica is evolved forward with independent noise for a time tdeph ≥ τcorr to eliminate correlations between replicas, each processor carries out an independent constant-temperature MD trajectory for the entire N -atom system, thus exploring phase space within the particular basin M times faster than a single trajectory would. Whenever a transition is detected on any processor, all processors are alerted to stop. The simulation clock is advanced by the accumulated trajectory time summed over all replicas, i.e., the total time τrxn spent exploring phase space within the basin until the transition occurred. The parallel-replica method also correctly accounts for correlated dynamical events (i.e., there is no requirement that the system obeys TST), unlike the other accelerated dynamics methods. This is accomplished by allowing the trajectory that made the transition to continue on its processor for a further amount of time tcorr ≥ τcorr , during which recrossings or follow-on events may occur. The simulation clock is then advanced by tcorr , the final state is replicated on all processors, and the whole process is repeated. Parallelreplica dynamics then gives exact state-to-state dynamical evolution, because the escape times obey the correct probability distribution, nothing about the procedure corrupts the relative probabilities of the possible escape paths, and the correlated dynamical events are properly accounted for.
Figure 2. Schematic illustration of the parallel-replica method (after Ref. [1]). The four steps, described in the text, are (A) replication of the system into M copies, (B) dephasing of the replicas, (C) evolution of independent trajectories until a transition is detected in any of the replicas, and (D) brief continuation of the transitioning trajectory to allow for correlated events such as recrossings or follow-on transitions to other states. The resulting configuration is then replicated, beginning the process again.
The efficiency of the method is limited by both the dephasing stage, which does not advance the system clock, and the correlated event stage, during which only one processor accumulates time. (This is illustrated schematically in Fig. 2, where dashed-line trajectories advance the simulation clock but dotted-line trajectories do not.) Thus, the overall efficiency will be high when

$$\tau_{rxn}/M \gg t_{deph} + t_{corr}. \qquad (8)$$
Some tricks can further reduce this requirement. For example, whenever the system revisits a state, on all but one processor the interrupted trajectory from the previous visit can be immediately restarted, eliminating the dephasing stage. Also, the correlation stage (which only involves one processor) can be overlapped with the subsequent dephasing stage for the new state on the other processors, in the hope that there are no correlated crossings that lead to a different state. Figure 3 shows an example of a parallel-replica simulation; an Ag(111) island-on-island structure decays over a period of 1 µs at T = 400 K. Many of the transitions involve concerted mechanisms. Parallel-replica dynamics has the advantage of being fairly simple to program, with very few “knobs” to adjust – tdeph and tcorr , which can be conservatively set at a few ps for most systems. As multiprocessing environments become more ubiquitous, with more processors within a node or even on a chip, and loosely linked Beowulf clusters of such nodes, parallel-replica dynamics will become an increasingly important simulation tool. Recently, parallel-replica dynamics has been extended to driven systems, such as systems with some externally applied strain rate. The requirement here is that the drive rate is slow enough that at any given time the rates for the processes in the system depend only on the instantaneous configuration of the system.
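A toy numerical check of Eqs. (7a)–(7c) follows (not from the article): drawing first-escape times for M replicas and advancing the clock by the summed time t_sum = M t_1 reproduces the single-trajectory exponential statistics. The rate constant and replica count are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
ktot, M, trials = 2.0, 8, 200000

# Single trajectory: first-escape time drawn from ktot * exp(-ktot * t)
t_single = rng.exponential(1.0 / ktot, trials)

# M replicas: the first transition happens at t1, the minimum of M
# independent exponentials; the clock is advanced by tsum = M * t1
t1 = rng.exponential(1.0 / ktot, (trials, M)).min(axis=1)
t_sum = M * t1

# Both means should approach 1/ktot = 0.5, consistent with Eq. (7c)
print(np.mean(t_single), np.mean(t_sum))
```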
3. Hyperdynamics
Hyperdynamics builds on the basic concept of importance sampling [14, 15], extending it into the time domain. In the hyperdynamics approach [16], the potential surface V (r) of the system is modified by adding to it a nonnegative bias potential Vb (r). The dynamics of the system is then evolved on this biased potential surface, V (r) + Vb (r). A schematic illustration is shown in Fig. 4. The derivation of the method requires that the system obeys TST – that there are no correlated events. There are also important requirements on the form of the bias potential. It must be zero at all the dividing surfaces, and the system must still obey TST for dynamics on the modified potential surface. If such a bias potential can be constructed, a challenging
[Figure 3 panels: snapshots at t = 0.00, 0.15, 0.25, 0.39, 0.41, 0.42, 0.44, 0.45 and 1.00 µs.]
Figure 3. Snapshots from a parallel-replica simulation of an island on top of an island on the Ag(111) surface at T = 400 K (after Ref. [1]). On a microsecond time scale, the upper island gives up all its atoms to the lower island, filling vacancies and kink sites as it does so. This simulation took 5 days to reach 1 µs on 32 1 GHz Pentium III processors.
task in itself, we can substitute the modified potential V(r) + V_b(r) into Eq. (1) to find

k^{TST}_{A→} = ⟨|v_A| δ(x − q)⟩_{A_b} / ⟨e^{βV_b(r)}⟩_{A_b},   (9)
where β = 1/kB T and the state Ab is the same as state A but with the bias potential Vb applied. This leads to a very appealing result: a trajectory on this modified surface, while relatively meaningless on vibrational time scales,
Figure 4. Schematic illustration of the hyperdynamics method. A bias potential V_b(r) is added to the original potential V(r) (solid line). Provided that V_b(r) meets certain conditions, primarily that it be zero at the dividing surfaces between states, a trajectory on the biased potential surface V(r) + V_b(r) (dashed line) escapes more rapidly from each state without corrupting the relative escape probabilities. The accelerated time is estimated as the simulation proceeds.
evolves correctly from state to state at an accelerated pace. That is, the relative rates of events leaving A are preserved:

k^{TST}_{A→B} / k^{TST}_{A→C} = k^{TST}_{A_b→B} / k^{TST}_{A_b→C}.   (10)
This is because these relative probabilities depend only on the numerator of Eq. (9), which is unchanged by the introduction of V_b since, by construction, V_b = 0 at the dividing surface. Moreover, the accelerated time is easily estimated as the simulation proceeds. For a regular MD trajectory, the time advances at each integration step by Δt_MD, the MD time step (often on the order of 1 fs). In hyperdynamics, the time advance at each step is Δt_MD multiplied by an instantaneous boost factor, the inverse Boltzmann factor for the bias potential at that point, so that the total time after n integration steps is

t_hyper = Σ_{j=1}^n Δt_MD e^{V_b(r(t_j))/k_B T}.   (11)
Time thus takes on a statistical nature, advancing monotonically but nonlinearly. In the long-time limit, it converges on the correct value for the
accelerated time with vanishing relative error. The overall computational speedup is then given by the average boost factor,

boost(hyperdynamics) = t_hyper/t_MD = ⟨e^{V_b(r)/k_B T}⟩_{A_b},   (12)
divided by the extra computational cost of calculating the bias potential and its forces. If all the visited states are equivalent (e.g., this is common in calculations to test or demonstrate a particular bias potential), Eq. (12) takes on the meaning of a true ensemble average. The rate at which the trajectory escapes from a state is enhanced because the positive bias potential within the well lowers the effective barrier. Note, however, that the shape of the bottom of the well after biasing is irrelevant; no assumption of harmonicity is made. Figure 5 illustrates an application of hyperdynamics for a two-dimensional, periodic model potential using a Hessian-based bias potential [16]. The hopping diffusion rate was compared against MD at high temperature, where the two calculations agreed very well. At lower temperatures where the MD calculations would be too costly, it is compared against the result computed
[Figure 5: ln(D) vs. 1/k_B T; the annotated boost factors are 47, 200, 3435 and 8682.]
Figure 5. Arrhenius plot of the diffusion coefficients for a model potential, showing a comparison of direct MD, hyperdynamics (•), and TST + dynamical corrections (+). The symbols are sized for clarity. The line is the full harmonic TST approximation, and is indistinguishable from a least-squares line through the TST points (not shown). Also shown are the boost factors, relative to direct MD, for each hyperdynamics result. The boost increases dramatically as the temperature is lowered (after Ref. [16]).
using TST plus dynamical corrections. As the temperature is lowered, the effective boost gained by using hyperdynamics increased to the point that, at k_B T = 0.09, the boost factor was over 8500. See Ref. [16] for details. The ideal bias potential should give a large boost factor, have low computational overhead (though more overhead is acceptable if the boost factor is very high), and, to a good approximation, meet the requirements stated above. This is very challenging, since we want, as much as possible, to avoid utilizing any prior knowledge of the dividing surfaces or the available escape paths. To date, proposed bias potentials have typically been computationally intensive, tailored to very specific systems, reliant on the assumption of localized transitions, or limited to low-dimensional systems. But the potential boost factor available from hyperdynamics is tantalizing, so developing bias potentials capable of treating realistic many-dimensional systems remains a subject of ongoing research by several groups. See Ref. [1] for a detailed discussion of bias potentials and results generated using various forms.
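The clock of Eq. (11) is simple to accumulate in code. The sketch below (Python; the bias samples and the numerical values are illustrative assumptions of ours, and the biased-MD propagation itself is omitted) adds Δt_MD e^{V_b/k_B T} per step and reports the average boost factor of Eq. (12):

import math

def hyperdynamics_clock(bias_values, dt_md, kT):
    # Eq. (11): each MD step contributes dt_md * exp(V_b/kT), where V_b is
    # the (non-negative) bias potential at the current configuration.
    t_hyper = 0.0
    for vb in bias_values:
        t_hyper += dt_md * math.exp(vb / kT)
    return t_hyper

# Toy trajectory: most steps deep in the biased well (V_b > 0), a few
# steps near dividing surfaces where the bias vanishes (V_b = 0).
kT, dt = 0.026, 1.0e-3                     # eV and ps, chosen for illustration
bias = [0.15] * 900 + [0.0] * 100          # hypothetical V_b samples in eV
t_h = hyperdynamics_clock(bias, dt, kT)
print(t_h, t_h / (len(bias) * dt))         # accelerated time; average boost factor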
4. Temperature Accelerated Dynamics
In the temperature accelerated dynamics (TAD) method [17], the idea is to speed up the transitions by increasing the temperature, while filtering out the transitions that should not have occurred at the original temperature. This filtering is critical, since without it the state-to-state dynamics will be inappropriately guided by entropically favored higher-barrier transitions. The TAD method is more approximate than the previous two methods, as it relies on harmonic TST, but for many applications this additional approximation is acceptable, and the TAD method often gives substantially more boost than hyperdynamics or parallel-replica dynamics. Consistent with the accelerated dynamics concept, the trajectory in TAD is allowed to wander on its own to find each escape path, so that no prior information is required about the nature of the reaction mechanisms. In each basin, the system is evolved at a high temperature Thigh (while the temperature of interest is some lower temperature Tlow ). Whenever a transition out of the basin is detected, the saddle point for the transition is found. The trajectory is then reflected back into the basin and continued. This “basin constrained molecular dynamics” (BCMD) procedure generates a list of escape paths and attempted escape times for the high-temperature system. Assuming that TST holds and that the system is chaotic and ergodic, the probability distribution for the first-escape time for each mechanism is an exponential (Eq. (6)). Because harmonic TST gives an Arrhenius dependence of the rate on temperature (Eq. (4)), depending only on the static barrier height, we can then extrapolate each escape time observed at Thigh to obtain a corresponding escape time at Tlow that is drawn correctly from the exponential distribution at Tlow . This extrapolation, which requires knowledge of the saddle point energy, but not the preexponential factor, can be illustrated graphically in an
Arrhenius-style plot (ln(1/t) vs. 1/T), as shown in Fig. 6. The time for each event seen at T_high extrapolated to T_low is then

t_low = t_high e^{E_a(β_low − β_high)},   (13)
Figure 6. Schematic illustration of the temperature accelerated dynamics method. Progress of the high-temperature trajectory can be thought of as moving down the vertical time line at 1/T_high. For each transition detected during the run, the trajectory is reflected back into the basin, the saddle point is found, and the time of the transition (solid dot on left time line) is transformed (arrow) into a time on the low-temperature time line. Plotted in this Arrhenius-like form, this transformation is a simple extrapolation along a line whose slope is the negative of the barrier height for the event. The dashed termination line connects the shortest-time transition recorded so far on the low-temperature time line with the confidence-modified minimum preexponential (ν*_min = ν_min/ln(1/δ)) on the y-axis. The intersection of this line with the high-T time line gives the time (t_stop, open circle) at which the trajectory can be terminated. With confidence 1 − δ, we can say that any transition observed after t_stop could only extrapolate to a shorter time on the low-T time line if it had a preexponential lower than ν_min.
where, again, β = 1/kB T . The event with the shortest time at low temperature is the correct transition for escape from this basin. Because the extrapolation can in general cause a reordering of the escape times, a new shorter-time event may be discovered as the BCMD is continued at Thigh. If we make the additional assumption that there is a minimum preexponential factor, νmin , which bounds from below all the preexponential factors in the system, we can define a time at which the BCMD trajectory can be stopped, knowing that the probability that any transition observed after that time would replace the first transition at Tlow is less than δ. This “stop” time is given by
t_high,stop ≡ (ln(1/δ)/ν_min) [ν_min t_low,short / ln(1/δ)]^{T_low/T_high},   (14)
where t_low,short is the shortest transition time at T_low. Once this stop time is reached, the system clock is advanced by t_low,short, the transition corresponding to t_low,short is accepted, and the TAD procedure is started again in the new basin. The average boost in TAD can be dramatic when barriers are high and T_high/T_low is large. However, any anharmonicity error at T_high transfers to T_low; a rate that is twice the Vineyard harmonic rate due to anharmonicity at T_high will cause the transition times at T_high for that pathway to be 50% shorter, which in turn extrapolate to transition times that are 50% shorter at T_low. If the Vineyard approximation is perfect at T_low, these events will occur at twice the rate they should. This anharmonicity error can be controlled by choosing a T_high that is not too high. As in the other methods, the boost is limited by the lowest barrier, although this effect can be mitigated somewhat by treating repeated transitions in a "synthetic" mode [17]. This is in essence a kinetic Monte Carlo treatment of the low-barrier transitions, in which the rate is estimated accurately from the observed transitions at T_high, and the subsequent low-barrier escapes observed during BCMD are excluded from the extrapolation analysis. Temperature accelerated dynamics is particularly useful for simulating vapor-deposited crystal growth, where the typical time scale can exceed minutes. Figure 7 shows an example of TAD applied to such a problem. Vapor-deposited growth of a Cu(100) surface was simulated at a deposition rate of one monolayer per 15 s and a temperature T = 77 K, exactly matching (except for the system size) the experimental conditions of Ref. [18]. Each deposition event was simulated using direct MD for 2 ps, long enough for the atom to collide with the surface and settle into a binding site. A TAD simulation with T_high = 550 K then propagated the system for the remaining time until the next deposition event was required, on average 0.3 s later. The overall boost factor was ∼10^7. A key feature of this simulation was that, even at this low temperature, many events accepted during the growth process
Figure 7. Snapshots from a TAD simulation of the deposition of five monolayers (ML) of Cu onto Cu(100) at 0.067 ML/s and T = 77 K (panels at 1 through 5 ML), matching the experimental conditions of Egelhoff and Jacob [18]. Deposition of each new atom was performed using direct molecular dynamics for 2 ps, while the intervening time (0.3 s on average for this 50 atom/layer simulation cell) was simulated using the TAD method. The boost factor for this simulation was ∼10^7 over direct MD (after Ref. [1]).
involved concerted mechanisms, such as the concerted sliding of an eight-atom cluster [1]. This MD/TAD procedure for simulating film growth has been applied also to Ag/Ag(100) at low temperatures [19] and Cu/Ag(100) [20]. Heteroepitaxial systems are especially hard to treat with techniques such as kinetic Monte Carlo because of the increased tendency for the system to go off lattice due
to mismatch strain, and because the rate table needs to be considerably larger when neighboring atoms can have multiple types. Recently, enhancements to TAD, beyond the "synthetic mode" mentioned above, have been developed that can increase the efficiency of the simulation. For systems that revisit states, the time required to accept an event can be reduced for each revisit by taking advantage of the time accumulated in previous visits [21]. This procedure is exact; no assumptions beyond the ones required by the original TAD method are needed. After many visits, the procedure converges. The minimum barrier for escape from that state (E_min) is then known to within uncertainty δ. In this converged mode (ETAD), the average time at T_high required to accept an event no longer depends on δ, and the average boost factor becomes simply
boost(ETAD) = ⟨t_low,short⟩ / ⟨t_high,stop⟩ = exp[E_min (1/(k_B T_low) − 1/(k_B T_high))]   (15)
for that state. The additional boost (when converged) compared to the original TAD can be an order of magnitude or more. For systems that seldom (or never) revisit the same state, it is still possible to exploit this extra boost by running in ETAD mode with E_min supplied externally. One way of doing this is to combine TAD with the dimer method [22]. In this combined dimer-TAD approach, first proposed by Montalenti and Voter [21], upon entering a new state, a number of dimer searches are used to find the minimum barrier for escape, after which ETAD is employed to quickly find a dynamically appropriate escape path. This exploits the power of the dimer method to quickly find low-barrier pathways, while eliminating the danger associated with the possibility that it might miss important escape paths. Although the dimer method might fail to find the lowest barrier correctly, this is a much weaker demand on the dimer method than trying to find all relevant barriers. In addition, the ETAD phase has some chance of correcting the simulation during the BCMD if the dimer searches did not find E_min.
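The extrapolation of Eq. (13) and the stopping criterion of Eq. (14) are compact enough to sketch directly. In the toy example below (Python; the event list, ν_min = 10^12 s^-1 and δ = 0.05 are illustrative assumptions of ours, and the BCMD, transition detection and saddle-finding machinery are replaced by a hard-coded list of observed events), note how the extrapolation reorders the events: the lower-barrier transition seen later at T_high is the one accepted at T_low.

import math

def tad_extrapolate(t_high, E_a, T_low, T_high, kB=8.617e-5):
    # Eq. (13): map a transition time observed at T_high to T_low.
    return t_high * math.exp(E_a * (1.0 / (kB * T_low) - 1.0 / (kB * T_high)))

def tad_stop_time(t_low_short, T_low, T_high, nu_min=1e12, delta=0.05):
    # Eq. (14): high-T time after which BCMD can stop with confidence 1 - delta.
    c = math.log(1.0 / delta)
    return (c / nu_min) * (nu_min * t_low_short / c) ** (T_low / T_high)

# Hypothetical events observed at T_high = 550 K: (time in s, barrier in eV).
events = [(2.0e-11, 0.45), (5.0e-11, 0.30)]
T_low, T_high = 77.0, 550.0
t_lows = [tad_extrapolate(t, Ea, T_low, T_high) for t, Ea in events]
t_short = min(t_lows)                         # accepted transition time at T_low
print(t_lows)                                 # the 0.30 eV event extrapolates shorter
print(tad_stop_time(t_short, T_low, T_high))  # ~1 ns of high-T MD suffices here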
5. Outlook
As these accelerated dynamics methods become more widely used and further developed (including the possible emergence of new methods), their application to important problems in materials science will continue to grow. We conclude this article by comparing and contrasting the three methods presented here, with some guidelines for deciding which method may be most appropriate for a given problem. We point out some important limitations of the methods, areas in which further development may significantly increase their usefulness. Finally, we discuss the prospects for these methods in the immediate future.
The key feature of all of the accelerated dynamics methods is that they collapse the waiting time between successive transitions from its natural time (τrxn ) to (at best) a small number of vibrational periods. Each method accomplishes this in a different way. TAD exploits the enhanced rate at higher temperature, hyperdynamics effectively lowers the barriers to escape by filling in the basin, and parallel-replica dynamics spreads the work across many processors. The choice of which accelerated dynamics method to apply to a problem will typically depend on three factors. The first is the desired level of accuracy in following the exact dynamics of the system. As described previously, parallel-replica is the most exact of the three methods; the only assumption is that the kinetics are first order. Not even TST is assumed, as correlated dynamical events are treated correctly in the method. This is not true with hyperdynamics, which does rely upon the assumptions of TST, in particular the absence of correlated events. Finally, temperature accelerated dynamics makes the further assumptions inherent in the harmonic approximation to TST, and is thus the most approximate of the three methods. If complete accuracy is the main goal of the simulation, parallel-replica is the superior choice. The second consideration is the potential gain in accessible time scales that the accelerated dynamics method can achieve for the system. Typically, TAD is the method of choice when considering this factor. While in all three methods the boost for escaping from each state will be limited by the smallest barrier, if the barriers are high relative to the temperature of interest, TAD will typically achieve the largest boost factor. In principle, hyperdynamics can also achieve very significant boosts, but, in practice, existing bias potentials either have a very simple form which generally provide limited boosts for complex many-atom systems, or more sophisticated (e.g., Hessian-based) forms whose overhead reduces the boosts actually attainable. It may be possible, using prior knowledge about particular systems, to construct a computationally inexpensive bias potential which simultaneously offers large boosts, in which case hyperdynamics could be competitive with TAD. Finally, parallel-replica dynamics usually offers the smallest boost given the typical access to parallel computing today (e.g., tens of processors or fewer per user for continuous use), since the maximum possible boost is exactly the number of processors. For some systems, the overhead of, for example, finding saddle points in TAD may be so great that parallel-replica can give more overall boost. However, in general, the price of the increased accuracy of parallel-replica dynamics will be shorter achievable time scales. It should be emphasized that the limitations of parallel-replica in terms of accessible time scales are not inherent in the method, but rather are a consequence of the currently limited computing power which is available. As massively parallel processing becomes commonplace for individual users, and any number can be used in the study of a given problem, parallel-replica should become just as efficient as the other methods. If enough processors are available
so that the simulation time each processor must generate per transition is on the order of picoseconds, parallel-replica will be just as efficient as TAD or hyperdynamics. This analysis may be complicated by issues of communication between processors, but the future of parallel-replica is very promising. The last main factor determining which method is best suited to a problem is the shape of the potential energy surface (PES). Both TAD and hyperdynamics require that the PES be relatively smooth. In the case of TAD, this is because saddle points must be found, and standard techniques for finding them often perform poorly for rough landscapes. The same is true for the hyperdynamics bias potentials that require information about the shape of the PES. Parallel-replica, however, only requires a method for detecting transitions. No further analysis of the potential energy surface is needed. Thus, if the PES describing the system of interest is relatively rough, parallel-replica dynamics may be the only method that can be applied effectively. The temperature dependence of the boost in hyperdynamics and TAD gives rise to an interesting prediction about their power and utility in the future. Sometimes, even accelerating the dynamics may not make the activated processes occur frequently enough to study a particular process. A common trick is to raise the temperature just enough that at least some events will occur in the available computer time, hoping, of course, that the behavior of interest is still representative of the lower-T system. When faster computers become available, the same system can be studied at a lower, more desirable, temperature. This in turn increases the boost factor (e.g., see Eqs. (12) and (14)), so that, effectively, there is a superlinear increase in the power of accelerated dynamics with increasing computer speed. Thus, the accelerated dynamics approaches will become increasingly more powerful in future years simply because computers keep getting faster. A particularly appealing prospect is that of accelerated electronic structure-based molecular dynamics simulations (e.g., by combining density functional theory (DFT) or quantum chemistry with the methods discussed here), since accessible electronic structure time scales are even shorter, currently on the order of ps. However, because of the additional expense involved in these techniques, the converse of the argument given in the previous paragraph indicates that, for example, accelerated DFT dynamics simulations will not give much useful boost on current computers (i.e., using DFT to calculate the forces is like having a very slow computer). DFT hyperdynamics may be a powerful tool in 5–10 years, when breakeven (boost = overhead) is reached, and this could happen sooner with the development of less expensive bias potentials. TAD is probably close to being viable for combination with DFT, while parallel-replica dynamics and dimer-TAD could probably be used on today's computers for electronic structure studies on some systems. Currently, these methods are very efficient when applied to systems in which the barriers are much higher than the temperature of interest. This is often true
for systems such as ordered solids, but there are many important systems that do not so cleanly fall into this class, a prime example being glasses. Such systems are characterized by either a continuum of barrier heights, or a set of low barriers that describe uninteresting events, like conformational changes in a molecule. Low barriers typically degrade the boost of all of the accelerated dynamics methods, as well as the efficiency of standard kinetic Monte Carlo. However, even these systems will be amenable to study through accelerated dynamics methods as progress is made on this low-barrier problem. A final note should be made about the computational scaling of these methods with system size. While the exact scaling depends on the type of system and many aspects of the implementation, a few general points can be made. In the case of TAD, if the work of finding saddles and detecting transitions can be localized, it can be shown that the scaling goes as N^{2−T_low/T_high} [21] for the simple case of a system that has been enlarged by replication. This is improved greatly with ETAD, which scales as O(N), the same as regular MD. Real systems are more complicated and, typically, lower barrier processes will arise as the system size is increased. Thus, even hyperdynamics with a bias potential requiring no overhead might scale worse than O(N). The accelerated dynamics methods, as a whole, are still in their infancy. Even so, they are currently powerful enough to study a wide range of materials problems that were previously intractable. As these methods continue to mature, their applicability, and the physical insights gained by their use, can be expected to grow.
Acknowledgments We gratefully acknowledge vital discussions with Graeme Henkelman. This work was supported by the United States Department of Energy (DOE), Office of Basic Energy Sciences, under DOE Contract No. W-7405-ENG-36.
References

[1] A.F. Voter, F. Montalenti, and T.C. Germann, "Extending the time scale in atomistic simulation of materials," Annu. Rev. Mater. Res., 32, 321–346, 2002.
[2] D. Chandler, "Statistical-mechanics of isomerization dynamics in liquids and transition-state approximation," J. Chem. Phys., 68, 2959–2970, 1978.
[3] A.F. Voter and J.D. Doll, "Dynamical corrections to transition state theory for multistate systems: surface self-diffusion in the rare-event regime," J. Chem. Phys., 82, 80–92, 1985.
[4] C.H. Bennett, "Molecular dynamics and transition state theory: simulation of infrequent events," ACS Symp. Ser., 63–97, 1977.
[5] R. Marcelin, "Contribution à l'étude de la cinétique physico-chimique," Ann. Physique, 3, 120–231, 1915.
[6] E.P. Wigner, "On the penetration of potential barriers in chemical reactions," Z. Phys. Chemie B, 19, 203, 1932.
[7] H. Eyring, "The activated complex in chemical reactions," J. Chem. Phys., 3, 107–115, 1935.
[8] P. Pechukas, "Transition state theory," Ann. Rev. Phys. Chem., 32, 159–177, 1981.
[9] D.G. Truhlar, B.C. Garrett, and S.J. Klippenstein, "Current status of transition state theory," J. Phys. Chem., 100, 12771–12800, 1996.
[10] A.F. Voter and J.D. Doll, "Transition state theory description of surface self-diffusion: comparison with classical trajectory results," J. Chem. Phys., 80, 5832–5838, 1984.
[11] B.J. Berne, M. Borkovec, and J.E. Straub, "Classical and modern methods in reaction-rate theory," J. Phys. Chem., 92, 3711–3725, 1988.
[12] G.H. Vineyard, "Frequency factors and isotope effects in solid state rate processes," J. Phys. Chem. Solids, 3, 121–127, 1957.
[13] A.F. Voter, "Parallel-replica method for dynamics of infrequent events," Phys. Rev. B, 57, 13985–13988, 1998.
[14] J.P. Valleau and S.G. Whittington, "A guide to Monte Carlo for statistical mechanics: 1. Highways," In: B.J. Berne (ed.), Statistical Mechanics, Part A, Modern Theoretical Chemistry, vol. 5, Plenum, New York, pp. 137–168, 1977.
[15] B.J. Berne, G. Ciccotti, and D.F. Coker (eds.), Classical and Quantum Dynamics in Condensed Phase Simulations, World Scientific, Singapore, 1998.
[16] A.F. Voter, "A method for accelerating the molecular dynamics simulation of infrequent events," J. Chem. Phys., 106, 4665–4677, 1997.
[17] M.R. Sørensen and A.F. Voter, "Temperature-accelerated dynamics for simulation of infrequent events," J. Chem. Phys., 112, 9599–9606, 2000.
[18] W.F. Egelhoff, Jr. and I. Jacob, "Reflection high-energy electron-diffraction (RHEED) oscillations at 77 K," Phys. Rev. Lett., 62, 921–924, 1989.
[19] F. Montalenti, M.R. Sørensen, and A.F. Voter, "Closing the gap between experiment and theory: crystal growth by temperature accelerated dynamics," Phys. Rev. Lett., 87, 126101, 2001.
[20] J.A. Sprague, F. Montalenti, B.P. Uberuaga, J.D. Kress, and A.F. Voter, "Simulation of growth of Cu on Ag(001) at experimental deposition rates," Phys. Rev. B, 66, 205415, 2002.
[21] F. Montalenti and A.F. Voter, "Exploiting past visits or minimum-barrier knowledge to gain further boost in the temperature-accelerated dynamics method," J. Chem. Phys., 116, 4819–4828, 2002.
[22] G. Henkelman and H. Jónsson, "A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives," J. Chem. Phys., 111, 7010–7022, 1999.
2.12 CONCURRENT MULTISCALE SIMULATION AT FINITE TEMPERATURE: COARSE-GRAINED MOLECULAR DYNAMICS

Robert E. Rudd
Lawrence Livermore National Laboratory, University of California, L-045, Livermore, CA 94551, USA
1. Embedded Nanomechanics and Computer Simulation
With the advent of nanotechnology, predictive simulations of nanoscale systems are in great demand. In some cases, nanoscale systems can be simulated directly at the level of atoms. The atomistic techniques used range from models based on a quantum mechanical treatment of the electronic bonds to those based on more empirical descriptions of the interatomic forces. In many cases, however, even nanoscale systems are too big for a purely atomistic approach, typically because the nanoscale device is coupled to its surroundings, and it is necessary to simulate the entire system comprising billions of atoms. A well-known example is the growth of nanoscale epitaxial quantum dots, in which the size, shape and location of the dot is affected by the elastic strain developed in a large volume of the substrate as well as the local atomic bonding. The natural solution is to model the surroundings with a more coarse-grained (CG) description, suitable for the intrinsically longer length scale. The challenge then is to develop the computational methodology suitable for this kind of concurrent multiscale modeling, one in which the simulated length scale can be changed smoothly and seamlessly from one region of the simulation to another while maintaining the fidelity of the relevant mechanics, dynamics and thermodynamics. The realization that Nature has different relevant length scales goes back at least as far as Democritus. Some 24 centuries ago he put forward the idea that solid matter is ultimately composed, at small scales, of a fundamental constituent that he termed the atom. Implicit in his philosophy was the idea that an
understanding of the atom would lead to a more robust understanding of the macroscopic world around us. In the intervening period, of course, not only has the science of this atomistic picture been put on a sound footing through the inventions of chemistry, the discovery of the nucleus and the development of quantum mechanics and modern condensed matter physics, but a host of additional length scales with their own relevant physics has been uncovered. A great deal of scientific innovation has gone into the development of physical models to describe the phenomena observed at these individual length scales. In the past decade a growing effort has been devoted to understanding how physics at different length scales works in concert to give rise to the observed behavior of solid materials. The use of models at multiple length scales, especially computer models optimized in this way, has been known as multiscale modeling. An example of multiscale modeling that we will consider in some detail is the modeling of the elastic deformation of solids at the atomistic and continuum levels. Clearly one kind of multiscale model would be to calculate the mass density and elastic constants within an atomistic model, and to use those data to parameterize a continuum model to describe large-scale elastic deformation. Such a parameter-passing, hierarchical approach has been used extensively to study a variety of systems [1]. Its success relies on the occurrence of well-separated length scales. We shall refer to such an approach as sequential multiscale modeling. In some systems, it is not clear how to separate the various length scales. An example would be turbulence, in which vortex structures are generated at many length scales and hierarchical models have to date only worked in very special cases [2]. Alternatively, the system of interest may be inhomogeneous and have regions in which small-scale physics dominates embedded in regions governed by large-scale physics. Examples would include fracture [3, 4], various nucleation phenomena [5], nanoscale moving mechanical components on computer chips (NEMS) [6], ion implantation and radiation damage events [7], epitaxial quantum dot growth [8] and so on. In either case a hierarchical approach is not ideal, and concurrent multiscale modeling is preferred [9]. Here we focus on the inhomogeneous systems, and in particular on systems like those mentioned above in which the most interesting behavior involves the mechanics of a nanoscale region, but the overall behavior also depends on how the nanoscale region is coupled to its large-scale surroundings. This embedded nanomechanics may be studied effectively with concurrent multiscale modeling, where regions dominated by different length scales are treated with different models, either explicitly through a hybrid approach or effectively through a derivative approach. Here we focus on the methodology of coarse-grained molecular dynamics (CGMD) [9–12], one example of a concurrent multiscale model. CGMD describes the dynamic behavior of solids concurrently at the atomistic level and at more coarse-grained levels. The CG description is similar to finite element
modeling (FEM) of continuum elasticity, with several important distinctions. CGMD is derived directly from the atomistic model without recourse to a continuum description. This approach is important because it allows a more seamless coupling of the atomistic and coarse-grained models. The other important distinction is that CGMD is designed for finite temperature, and the coarse-graining procedure makes use of the techniques of statistical mechanics to ensure that the model provides a robust description of the thermodynamics. Several other concurrent multiscale models for solids have been proposed and used [13–18]. The Quasicontinuum technique is of particular note in this context, because it is also derived entirely from the underlying atomistic model [14]. CGMD was the first concurrent multiscale model designed for finite temperature simulations [10]. Recently, another finite temperature concurrent multiscale model has been developed using renormalization group techniques, including time renormalization [17]. This model is very interesting, although to date its formulation is based on a bond decimation procedure that is limited to simple models with pair-wise nearest-neighbor interactions. The formulation of CGMD is more flexible, making it compatible with most classical interatomic potentials. It has been applied to realistic potentials in 3D whose range extends beyond nearest neighbors.
2. Formulation of CGMD
Coarse-grained molecular dynamics provides a model whose minimum length scale may vary from one location to another in the system. The CGMD formulation begins with a specification of a mesh that defines the length scales that will be represented in each region (see Fig. 1). As in finite element modeling [19], the mesh is unstructured, and it comes with a set of shape functions that define how fields are continuously interpolated on the mesh. For example, the displacement field is the most basic field in CGMD, and it is approximated as

u(x) ≈ Σ_j u_j N_j(x),   (1)
where N_j(x) is the value of the j-th shape function evaluated at the point x in the undeformed (reference) configuration. It is often useful to let N_j(x) have support at node j so that the coefficient u_j represents the displacement at node j, but it need not be so for the derivation of CGMD. We will refer to u_j as nodal displacements, bearing in mind that the coarse-grained fields could be more general. Ultimately the usual criteria to ensure well-behaved numerics will apply, such as that the cells should not have high aspect ratios and the mesh size should not change too abruptly; for the purposes of the formulation, the only requirement we impose is that if a region of the mesh is at the atomic
Figure 1. Schematic diagram of a concurrent multiscale simulation of a NEMS silicon microresonator [4–6] to illustrate how a system may be decomposed into atomistic (MD) and coarse-grained (CG) regions. The CG region comprises most of the volume, but the MD region contains most of the simulated degrees of freedom. Note that the CG mesh is refined to the atomic scale where it joins with the MD lattice.
scale, the positions of the nodes coincide with equilibrium lattice sites. This is not required for coarser regions of the mesh. To a first approximation, CGMD is governed by mass and stiffness matrices. They are derived from the underlying atomistic physics, described by a molecular dynamics (MD) model [20]. Define the discrete shape functions by evaluating the shape function N_j(x) at the equilibrium lattice site x_0µ of atom µ:

N_jµ = N_j(x_0µ).   (2)
The discrete shape functions allow us to approximate the atomic displacements, u_µ ≈ Σ_j u_j N_jµ. If we were to make this a strict equality, we would be on the path to the Quasicontinuum technique. Instead, we consider this a constraint on the system, and allow all of the unconstrained degrees of freedom in the system to fluctuate in thermal equilibrium. In particular, we demand that the interpolating fields be best fits to the underlying atomistic degrees of freedom of the system. In the case of the displacement field this requirement means that the nodal displacements minimize the chi-squared error of the fit:

χ² = Σ_µ |u_µ − Σ_j u_j N_jµ|².   (3)
The minimum of χ² is given by

u_j = (NN^T)^{-1}_jk N_kµ u_µ ≡ f_jµ u_µ,   (4)
where repeated indices are summed and the inverse is a matrix inverse. We have introduced the weighting function expressed in terms of the discrete shape functions as f_jµ = (NN^T)^{-1}_jk N_kµ. Equation (4) provides the needed correspondence between the coarse and fine degrees of freedom. Once the weighting function f_jµ is defined, the CGMD energy is defined as an average energy over the ensemble of systems in different points in phase space satisfying the correspondence relation (4). Mathematically, this is expressed as

E(u_k, u̇_k) = Z^{-1} ∫ dx_µ dp_µ H_MD Δ e^{−βH_MD},   (5)
where Z is the constrained partition function (the same integral without the H_MD pre-exponential factor). The integral runs over the full 6N_atom-dimensional MD phase space. The inverse temperature is given by β = 1/kT. The factor H_MD is the MD Hamiltonian, the sum of the atomistic kinetic and potential energies. The potential energy is determined by an interatomic potential, a generalization of the well-known Lennard–Jones potential that typically includes non-linear many-body interactions [20]. The factor Δ is a product of delta functions enforcing the constraint,

Δ = Π_j δ(u_j − Σ_µ u_µ f_jµ) δ(u̇_j − Σ_µ (p_µ/m_µ) f_jµ).   (6)
Once the energy (5) is determined, the equations of motion are derived as the corresponding Euler–Lagrange equations. The CGMD energy (5) consists of kinetic and potential terms. The CGMD kinetic energy can be computed exactly using analytic techniques for any system; the CGMD potential energy can also be calculated exactly, provided the MD interatomic potential is harmonic. Anharmonic corrections may be computed in perturbation theory. The details are given in Ref. [11]. Here we focus on the harmonic case, in which the potential energy is quadratic in the atomic displacements, and the coefficient of the quadratic term (times 2) is known as the dynamical matrix, D_µν. The result for harmonic CGMD is that

E(u_k, u̇_k) = U_int + (1/2)(M_jk u̇_j · u̇_k + u_j · K_jk u_k),   (7)
U_int = N_atom E_coh + 3(N_atom − N_node) kT,   (8)
M_ij = m N_iµ N_jµ,   (9)
K_ij = (f_iµ D^{-1}_µν f_jν)^{-1}   (10)
     = N_iµ D_µν N_jν − D^×_iµ D̃^{-1}_µν D^×_jν,   (11)
where M_ij is the mass matrix and K_ij is the stiffness matrix. Here again and throughout this Article a sum is implied whenever indices are repeated on one side of an equation unless otherwise noted. The internal energy U_int includes the total cohesive energy of the system, N_atom E_coh, as well as the internal energy of a collection of (N_atom − N_node) harmonic oscillators at finite temperature. The form of the mass matrix (9) assumes a monatomic lattice. A more general form is given in Ref. [11]. The two forms of the stiffness matrix are equivalent in principle, although in practice numerical considerations have favored one form or the other for particular applications. The first form (10) was used for the early CGMD applications. It is most suited for applications in which the nodal index may be Fourier transformed, such as the computation of phonon spectra. The second form (11) is better suited for real space applications. It depends on an off-diagonal block of the dynamical matrix
D^×_jµ = (δ_µρ − N_jµ f_jρ) D_ρν N_jν   (12)
and a regularized form of the lattice Green function, D̃^{-1}_µν, for the internal degrees of freedom, defined in Ref. [11]. Note that the mass matrix and the compliance matrix (the inverse of the stiffness matrix) are weighted averages of the corresponding MD quantities, the MD mass and MD lattice Green function, respectively. The CGMD equations of motion are derived from the CGMD Hamiltonian (5) using the Euler–Lagrange procedure:

M_jk ü_k = −K_jk u_k + F^ext_j,   (13)
where we have included the possibility of an external body force on node j given by F^ext_j. The anharmonic corrections to these equations of motion form an infinite Taylor series in powers of u_k [11]. In regions of the mesh refined to the atomic level, it has been shown that the infinite series sums up to the MD interatomic forces; i.e., the original MD equations of motion are recovered in regions of the mesh refined to the atomic scale [10]. In the case of a harmonic system, the recovery of the MD equations of motion in the atomic limit should be clear from the equations for the mass and stiffness matrices. In this limit N_iµ = δ_iµ and f_iµ = δ_iµ, so M_ij = m δ_ij and K_ij = D_ij from Eqs. (9) and (10), respectively. In practice, we define two complementary regions of the simulation. In the CG region, the harmonic CGMD equations of motion (13) are used, whereas in the region of the mesh refined to the atomic level, called the MD region, the anharmonic terms are restored through the use of the full MD equations of motion. In a CGMD simulation the mass and stiffness matrices are calculated once at the beginning of the simulation. The reciprocal space (Fourier transform) representation of the dynamical matrix is used in order to make the calculation of the stiffness matrix tractable. This representation implicitly assumes that the solid is in the form of a crystal lattice free from defects in the CG region.
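As a minimal numerical sketch of Eqs. (9), (10) and (13) (a toy setup of our own: a 1D harmonic chain with fixed ends and nearest-neighbor springs, coarse-grained with hat shape functions, in place of the reciprocal-space construction used in practice), one can assemble M and K directly and integrate the CG equations of motion with velocity Verlet:

import numpy as np

m, k_s = 1.0, 1.0                           # atomic mass and spring constant
n_atoms = 17                                # chain with fixed walls at both ends
D = k_s * (2 * np.eye(n_atoms) - np.eye(n_atoms, k=1) - np.eye(n_atoms, k=-1))

x = np.arange(n_atoms, dtype=float)
nodes = np.array([0.0, 4.0, 8.0, 12.0, 16.0])   # mesh spacing of 4 atoms
N = np.maximum(0.0, 1.0 - np.abs(x[None, :] - nodes[:, None]) / 4.0)

f = np.linalg.solve(N @ N.T, N)                 # weighting function, Eq. (4)
M = m * N @ N.T                                 # mass matrix, Eq. (9)
K = np.linalg.inv(f @ np.linalg.inv(D) @ f.T)   # stiffness matrix, Eq. (10)

# Velocity Verlet for Eq. (13), M u'' = -K u, with no external force.
dt, n_steps = 0.05, 400
u = 0.1 * np.sin(np.pi * nodes / 16.0)          # long-wavelength initial condition
v = np.zeros_like(u)
a = np.linalg.solve(M, -K @ u)
E0 = 0.5 * v @ M @ v + 0.5 * u @ K @ u
for _ in range(n_steps):
    u += dt * v + 0.5 * dt**2 * a
    a_new = np.linalg.solve(M, -K @ u)
    v += 0.5 * dt * (a + a_new)
    a = a_new
E1 = 0.5 * v @ M @ v + 0.5 * u @ K @ u
print(E0, E1)                                   # energy is well conserved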
The CGMD mass matrix involves couplings between nearest neighbor nodes in the CG region, just as the distributed mass matrix of finite element modeling does. The fact that the mass matrix is not diagonal is inconvenient, since a system of equations must be solved in order to determine the nodal accelerations. The system of equations is sparse, but this step introduces some computational overhead, and it is desirable to eliminate it. In FEM, the distributed mass matrix is often replaced by a diagonal approximation, the lumped mass matrix [19]. In CGMD, the lumped mass approximation,

M^lump_ij = m δ_ij Σ_µ N_iµ   (no sum on i),   (14)
has proven useful in the same way [9]. This definition assumes that the shape functions form a partition of unity, so that Σ_i N_iµ = 1 for all µ. In principle, the determination of the equations of motion together with the relevant initial and boundary conditions completely specifies the problem. In practice, we have typically used a thermalized initial state and a mixture of periodic and free boundary conditions suitable for the problem of interest. The equations of motion are integrated in time using a velocity Verlet time integrator [20] with the conventional MD time step used throughout the simulation. The natural time scale of the CG nodes is longer due to the greater mass and greater compliance of the larger cells, and it would be natural to use a longer time step in the CG region. We have found little motivation to explore this possibility, however, since the computational cost of our simulations is typically dominated by the MD region, so there is little to gain by speeding up the computation in the CG region. We now turn to the question of how CGMD simulations are analyzed. Much of the analysis of CGMD simulations is accomplished using standard MD techniques. The simulations are typically constructed such that the most interesting phenomena occur in the MD region, and here most of the usual MD tools may be brought to bear. Thermodynamic quantities are calculated in the usual way, and the identification and tracking of crystal lattice defects may be accomplished with conventional techniques. In some cases it may be of interest to analyze the simulation in the CG region, as well. For example, it may be of interest to plot the temperature throughout the simulation in order to verify that the behavior at the MD/CG interface is reasonable. In MD the temperature is directly related to the mean kinetic energy of the atoms, kT = (1/3) m ⟨|u̇|²⟩, where the brackets indicate the average [20]. In CGMD, a similar expression holds [11]:

kT = (1/3) ⟨|u̇_i|²⟩ / M^{-1}_ii   (no sum on i),   (15)
where M^{-1}_ii is the diagonal component corresponding to node i of the inverse of the mass matrix. This analysis of the temperature and thermal oscillations is
closely tied to the kinetic energy in the CG region. Similar tools are available to analyze the potential energy and related quantities such as deformation, pressure and stress [11].
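Both the lumped-mass approximation (14) and the temperature estimator (15) are one-liners in this notation. A sketch (NumPy; the hat shape functions are again our toy example, and the estimator below is instantaneous, so in practice it would be averaged over time as the brackets in Eq. (15) indicate):

import numpy as np

def lumped_mass(N, m):
    # Eq. (14): diagonal (lumped) mass matrix; assumes the shape functions
    # form a partition of unity (each column of N sums to 1).
    return m * np.diag(N.sum(axis=1))

def cg_temperature(v_nodes, M, kB=1.0):
    # Eq. (15): kT_i = |u_i'|^2 / (3 M^{-1}_ii) per node, written here for
    # 3D nodal velocities v_nodes[i] = (vx, vy, vz); an instantaneous
    # estimate that should be time-averaged in practice.
    Minv_diag = np.diag(np.linalg.inv(M))
    return np.sum(v_nodes**2, axis=1) / (3.0 * kB * Minv_diag)

# Toy mesh from the earlier sketch: 9 atoms, 3 hat-function nodes.
x = np.arange(9, dtype=float)
nodes = np.array([0.0, 4.0, 8.0])
N = np.maximum(0.0, 1.0 - np.abs(x[None, :] - nodes[:, None]) / 4.0)
M = 1.0 * N @ N.T                                  # distributed mass, Eq. (9)
v = np.random.default_rng(0).normal(size=(3, 3))   # hypothetical nodal velocities
print(np.diag(lumped_mass(N, 1.0)), cg_temperature(v, M))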
3. Validation
Validation of concurrent multiscale models is a challenge in its own right, and the development of quantitative tools and performance measures to analyze models like CGMD has taken place at the same time as the development of the first models. CGMD has been tested in several ways to see how it compares with a full MD simulation of a test system, as well as other concurrent multiscale simulations. The first test was the calculation of the spectrum of elastic waves, or phonons. The techniques to calculate these spectra in atomistic systems were developed long ago in the field of lattice dynamics [21]. In general the phonon spectrum comprises D acoustic mode branches (where D is the number of dimensions) together with D(N_unit − 1) optical branches (where N_unit is the number of atoms in the elementary unit cell of the crystal lattice) [22]. The acoustic modes are distinguished by the fact that their frequency goes to zero as their wavelength becomes large. The infinite wavelength corresponds to uniform translation of the system, a process that costs no energy and hence corresponds to zero frequency. Elastic wave spectra are an interesting test of CGMD and other concurrent multiscale techniques because they represent a test of dynamics and because elastic waves have a natural length scale associated with them: the wavelength. When a CG mesh is introduced, the shortest wavelengths are excluded. These modes are eliminated because they are irrelevant in the CG region, and their elimination increases the efficiency of the simulation. The test then is to see how well the model describes those longer wavelength modes that are represented in the CG region. The elastic wave spectra for solid argon were computed in CGMD on a uniform mesh for various mesh sizes, and compared to the MD spectra and spectra computed using a FEM model based on continuum elasticity [9, 11]. The bonds between argon atoms were modeled with a Lennard–Jones potential cut off at the fifth shell of neighboring atoms. Several interesting results were found. First, both CGMD and FEM agreed with the MD spectrum at long wavelengths. This is to be expected, since for wavelengths much longer than the mesh spacing, the waveform should be well represented on the mesh. Also, at long wavelengths the FEM assumption of a continuous medium is justified, and the slope of the spectrum gives the sound velocity, c = ω/k for k → 0. Here ω is the (angular) frequency and k is the wave number. The error in ω(k) was found to be of order O(k^2) for FEM, as expected. It goes to zero in the long wavelength limit, k → 0. One nice feature of CGMD was a reduced
error of order O(k^4) [10]. Moreover, CGMD provides a better approximation of the elastic wave spectra for all wavelengths supported on the mesh. Of course, CGMD also has the important feature that the elastic wave spectra are reproduced exactly when the mesh is refined to the atomic level, a property that FEM does not possess. Interatomic forces are not merely FEM elasticity on an atomic sized grid. Solid argon forms a face-centered cubic crystal lattice and hence has only three acoustic wave branches in its phonon spectrum. For crystals with optical phonon branches, there is more than one way to implement the coarse-graining, depending on the physics that is of interest, but the general CGMD framework continues to work well [23]. The other validation of CGMD has been the study of the reflection of elastic waves from the MD/CG interface. For applications such as crack propagation, it has proven important to control this unphysical reflection. The reflected waves can propagate back into the heart of the MD simulation and interfere with the processes of interest. In the case of crack propagation, a noticeable anomaly in the crack speed occurs at the point in time when the reflected waves reach the crack tip [24]. The reflection coefficient, a measure of the amount of elastic wave energy reflected at a given wavelength, has been calculated for CGMD and FEM based on continuum elasticity [10, 11]. Typical results are shown in Fig. 2. Long wavelength elastic waves are transmitted into the CG region, whereas short wavelength modes are reflected. The short wavelengths cannot be supported on the mesh, and since energy is conserved, they must go somewhere and they are reflected. The transmission threshold is expected to occur at a wave number k_0 = π/(N_max a). The CGMD threshold occurs precisely at
Figure 2. A comparison of the reflection of elastic waves from a CG region in three cases: CGMD and two varieties of FEM. Note that the reflection coefficient is plotted on a log scale. A similar graph plotted on a linear scale is shown in Ref. [10]. The dashed line marks the natural cutoff [k0 = π/(Nmax a)], where Nmax is the number of atoms in the largest cells. The bumps in the curves are scattering resonances. Note that at long wavelengths CGMD offers significantly suppressed scattering.
this wave number, while the threshold for transmission in distributed mass and lumped mass FEM models occurs somewhat above and below this value, respectively. The scattering in the long wavelength limit shows a generalized Rayleigh scattering behavior. In conventional Rayleigh scattering the scattering cross-section goes like σ ∼ k^4, which is the behavior exhibited by scattering here in FEM. For CGMD, the scattering drops off more quickly at long wavelengths, with the reflection coefficient approximately proportional to k^8 [11]. One aspect of concurrent multiscale modeling that remains poorly understood is the set of requirements for a suitable mesh. Certainly, many of the desired properties are clear either from the nature of the problem or from experience with FEM. For example, the mesh needs to be refined to the atomic level in the MD region, so here the mesh nodes should coincide with equilibrium crystal lattice sites. In the periphery large cells are desirable since the gain in efficiency is proportional to the cell size. From FEM it is well known that the aspect ratio of the cells should not be too large. Beyond these basic criteria, one is left with the task of generating a mesh that interpolates between the atomic-sized cells in the MD region to the large cells in the periphery without introducing high aspect ratio cells. One question we have investigated is whether the abruptness of this transition matters, and indeed it does matter. Figure 3 shows the reflection coefficient as a function of the wave number for two meshes that go between an MD region and a CG region with a maximum cell size of 20 lattice spacings. In one case, the transition is made gradually, whereas in the other case it is made abruptly. The mesh with the
Figure 3. A comparison of the reflection of elastic waves from a CG region whose mesh varies smoothly in cell size and one with an abrupt change in cell size, both computed in CGMD. In both cases the reflection coefficient is plotted as a function of the wave number in units of the natural cutoff [k0 = π/(Nmax a)], where a is the lattice constant and Nmax a = 20a is the maximum linear cell size in the mesh. The pronounced series of scattering resonances in the case of the abruptly changing mesh is undesirable. The second panel is a log-linear plot of the same data in order to show how the series of scattering resonances continues at decreasing amplitudes to long wavelengths.
abrupt transition exhibits markedly increased scattering, including a series of strong scattering resonances. Note that the envelope of the scattering curve is well defined in the case of the abrupt mesh, a property used to calculate the scaling of the reflection coefficient, R ∼ k^8.
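The flavor of these spectrum tests can be reproduced in a few lines for a 1D monatomic chain, where the exact lattice-dynamics branch is ω(q) = 2√(k/m)|sin(qa/2)|. The sketch below (our own comparison; it uses a lumped-mass, linear-element branch for the coarse mesh rather than the CGMD stiffness of Eq. (10)) shows the generic behavior: agreement at long wavelengths, with the coarse branch going soft near the mesh cutoff k_0 = π/(na):

import numpy as np

m, k_s, a = 1.0, 1.0, 1.0
n = 4                                     # atoms per coarse-grained cell

def omega_md(q):
    # Exact monatomic-chain dispersion from lattice dynamics.
    return 2.0 * np.sqrt(k_s / m) * np.abs(np.sin(q * a / 2.0))

def omega_cg(q):
    # Lumped-mass, linear-element branch on cells of size h = n a:
    # n springs in series (k_s/n per cell) and node mass n m.
    return (2.0 / n) * np.sqrt(k_s / m) * np.abs(np.sin(q * n * a / 2.0))

for q in [0.02, 0.05, 0.1, 0.2, np.pi / (n * a)]:
    print(q, omega_md(q), omega_cg(q))
# Both branches give the same sound speed c = a*sqrt(k_s/m) as q -> 0; the
# coarse branch falls below the exact one as q approaches the mesh cutoff.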
4. Outlook
CGMD provides a formalism for concurrent multiscale modeling at finite temperature. The initial tests have been very encouraging, but there are still many ways in which CGMD can be developed. One area of active research is numerical algorithms to make CGMD more efficient for large simulations. The calculation of the stiffness matrix involves the inverse of a large matrix whose size grows with the number of nodes in the CG region, N_CGnode. The calculation of the inverse scales like N^3_CGnode, and the matrix storage scales like N^2_CGnode for the exact matrix without any cutoff. Even though the calculation of the stiffness matrix need only be done once during the simulation, the calculation has proven sufficiently onerous to prevent the application of CGMD to the large-scale simulations for which it was originally intended. Only now are linear scaling CGMD algorithms starting to become available. There are several directions in which CGMD has begun to be extended for specific applications. The implementation of CGMD described in this Article conserves energy. It implicitly makes the assumption that the only thermal fluctuations that are relevant to the problem are those supported on the mesh. Fluctuations of the degrees of freedom that have been integrated out are neglected. Those fluctuations can be physically relevant in several ways [12]. First, they exert random and dissipative forces on the coarse-grained degrees of freedom in a process that is analogous to the forces in Brownian motion exerted on a large particle by the atoms in the surrounding liquid. Second, they also act as a heat bath that is able to exchange and transport thermal energy. Finally, they can transport energy in non-equilibrium processes, such as the waves generated by a propagating crack discussed above. A careful treatment of the CG system leads to a generalization of the CGMD equations of motion presented above [12]. In addition to the conservative forces, there are random and dissipative forces that form a generalized Langevin equation. The dissipative forces involve a memory function in time and space that acts to absorb waves that cannot be supported in the CG region. The memory kernel is similar to those that have been discussed in the context of absorbing boundary conditions for MD simulations [25, 26], except that in CGMD the range of the kernel is shorter because the long wavelength modes are able to propagate into the CG region and do not need to be absorbed. Interestingly, in the case of a CG region surrounded by MD regions, the memory kernel also contains propagators that recreate the absorbed waves on the far
side of the CG region after the appropriate propagation delay [12]. Of course, use of the generalized Langevin incurs additional computational expenses both in terms of run time and memory. There are many other ways in which CGMD could be extended. Additional CG fields could be introduced to model various material phenomena such as electrical polarization, defect concentrations and local temperature. Fluxes such as heat flow and defect diffusion can be included through the technique of coarse-graining the atomistic conservation equations. CGMD provides a powerful framework in which to formulate finite temperature multiscale models for a variety of applications.
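To illustrate the generalized Langevin idea in its simplest form (a conceptual sketch only, not the CGMD memory kernel of Ref. [12]): for an exponential kernel K(t) = (g/τ)e^{−t/τ}, the memory and random forces can be generated by a single auxiliary Ornstein–Uhlenbeck variable, with the noise amplitude fixed by the fluctuation–dissipation relation ⟨R(t)R(t′)⟩ = kT K(|t − t′|):

import numpy as np

# 1D particle in a harmonic well with exponential-memory GLE:
#   m dv/dt = -k_s x + s,   ds/dt = -(s + g v)/tau + sqrt(2 g kT)/tau * xi(t),
# so the friction kernel is K(t) = (g/tau) exp(-t/tau) and the random force
# satisfies fluctuation-dissipation. All parameter values are illustrative.
rng = np.random.default_rng(1)
m, k_s, kT = 1.0, 1.0, 0.1
g, tau = 0.5, 2.0                           # kernel strength and memory time
dt, n_steps = 0.005, 200000

x, v, s = 1.0, 0.0, 0.0
v2_sum, n_avg = 0.0, 0
for step in range(n_steps):
    noise = np.sqrt(2.0 * g * kT * dt) / tau * rng.normal()
    s += -(s + g * v) / tau * dt + noise    # Euler-Maruyama for the OU force
    v += (-k_s * x + s) / m * dt
    x += v * dt
    if step > n_steps // 2:                 # average after equilibration
        v2_sum += v * v
        n_avg += 1
print(m * v2_sum / n_avg)                   # approaches kT (equipartition)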
Acknowledgments

This article was prepared under the auspices of the US Department of Energy by the University of California, Lawrence Livermore National Laboratory, under Contract W-7405-Eng-48.
References

[1] J.A. Moriarty, J.F. Belak, R.E. Rudd, P. Soderlind, F.H. Streitz, and L.H. Yang, "Quantum-based atomistic simulation of materials properties in transition metals," J. Phys.: Condens. Matter, 14, 2825–2857, 2002.
[2] A.A. Townsend, The Structure of Turbulent Shear Flow, 2nd edition, Cambridge University Press, Cambridge, 1976.
[3] F.F. Abraham, J.Q. Broughton, E. Kaxiras, and N. Bernstein, "Spanning the length scales in dynamic simulation," Comput. in Phys., 12, 538–546, 1998.
[4] F.F. Abraham, R. Walkup, H. Gao, M. Duchaineau, T. Diaz de la Rubia, and M. Seager, "Simulating materials failure by using up to one billion atoms and the world's fastest computer: work-hardening," Proc. Natl. Acad. Sci. USA, 99, 5783–5787, 2002.
[5] D.R. Mason, R.E. Rudd, and A.P. Sutton, "Atomistic modelling of diffusional phase transformations with elastic strain," J. Phys.: Condens. Matter, 16, S2679–S2697, 2004.
[6] R.E. Rudd and J.Q. Broughton, "Atomistic simulation of MEMS resonators through the coupling of length scales," J. Model. Simul. Microsys., 1, 29–38, 1999.
[7] R.S. Averback and T. Diaz de la Rubia, "Fundamental studies of radiation effects in solids," Solid State Phys., 51, 281–402, 1998.
[8] R.E. Rudd, G.A.D. Briggs, A.P. Sutton, G. Medeiros-Ribeiro, and R.S. Williams, "Equilibrium model of bimodal distributions of epitaxial island growth," Phys. Rev. Lett., 90, 146101, 2003.
[9] R.E. Rudd and J.Q. Broughton, "Concurrent multiscale simulation of solid state systems," Phys. Stat. Sol. (b), 217, 251–291, 2000.
[10] R.E. Rudd and J.Q. Broughton, "Coarse-grained molecular dynamics and the atomic limit of finite elements," Phys. Rev. B, 58, R5893–R5896, 1998.
[11] R.E. Rudd and J.Q. Broughton, "Coarse-grained molecular dynamics: non-linear finite elements and finite temperature," Phys. Rev. B, 2004 (unpublished).
[12] R.E. Rudd, "Coarse-grained molecular dynamics: dissipation due to internal modes," Mater. Res. Soc. Symp. Proc., 695, T10.2, 2002.
[13] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, "Crack-propagation in bcc crystals studied with a combined finite-element and atomistic model," Philos. Mag. A, 64, 851–878, 1991.
[14] E.B. Tadmor, M. Ortiz, and R. Phillips, "Quasicontinuum analysis of defects in solids," Philos. Mag. A, 73, 1529–1563, 1996.
[15] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, "Concurrent coupling of length scales: methodology and application," Phys. Rev. B, 60, 2391–2403, 1999.
[16] L.E. Shilkrot, R.E. Miller, and W.A. Curtin, "Coupled atomistic and discrete dislocation plasticity," Phys. Rev. Lett., 89, 025501, 2002.
[17] S. Curtarolo and G. Ceder, "Dynamics of an inhomogeneously coarse grained multiscale system," Phys. Rev. Lett., 88, 255504, 2002.
[18] W.A. Curtin and R.E. Miller, "Atomistic/continuum coupling in computational materials science," Modell. Simul. Mater. Sci. Eng., 11, R33–R68, 2003.
[19] T.J.R. Hughes, The Finite Element Method: Linear Static and Dynamic Finite Element Analysis, Dover, Mineola, 2000.
[20] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[21] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
[22] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Saunders College Press, Philadelphia, 1976.
[23] B. Kraczek, private communication, 2003.
[24] B.L. Holian and R. Ravelo, "Fracture simulations using large-scale molecular dynamics," Phys. Rev. B, 51, 11275–11288, 1995.
[25] W. Cai, M. de Koning, V.V. Bulatov, and S. Yip, "Minimizing boundary reflections in coupled-domain simulations," Phys. Rev. Lett., 85, 3213–3216, 2000.
[26] W. E and Z. Huang, "Matching conditions in atomistic-continuum modeling of materials," Phys. Rev. Lett., 87, 135501, 2001.
2.13 THE THEORY AND IMPLEMENTATION OF THE QUASICONTINUUM METHOD

E.B. Tadmor¹ and R.E. Miller²
¹ Technion–Israel Institute of Technology, Haifa, Israel
² Carleton University, Ottawa, ON, Canada
While atomistic simulations have provided great insight into the basic mechanisms of processes like plasticity, diffusion and phase transformations in solids, there is an important limitation to these methods. Specifically, the large number of atoms in any realistic macroscopic structure is typically much too large for direct simulation. Consider that the current benchmark for large-scale fully atomistic simulations is on the order of $10^9$ atoms, using massively parallel computer facilities with hundreds or thousands of CPUs. This represents 1/10 000 of the number of atoms in a typical grain of aluminum, and 1/1 000 000 of the atoms in a typical micro-electro-mechanical systems (MEMS) device. Further, it is apparent that with such a large number of atoms, substantial regions of a problem of interest are essentially behaving like a continuum. Clearly, while fully atomistic calculations are essential to our understanding of the basic "unit" mechanisms of deformation, they will never replace continuum models altogether.

The goal for many researchers, then, has been to develop techniques that retain a largely continuum mechanics framework, but impart on that framework enough atomistic information to be relevant to modeling a problem of interest. In many examples, this means that a certain, relatively small, fraction of a problem requires full atomistic detail, while the rest can be modeled using the assumptions of continuum mechanics.

The quasicontinuum method (QC) has been developed as a framework for such mixed atomistic/continuum modeling. The QC philosophy is to consider the atomistic description as the "exact" model of material behaviour, but at the same time acknowledge that the sheer number of atoms makes most problems intractable in a fully atomistic framework. The QC then uses continuum assumptions to reduce the degrees of freedom and computational demand without losing atomistic detail in regions where it is required.
The purpose of this article is to provide an overview of the theoretical underpinnings of the QC method, and to shed light on practical issues involved in its implementation. The focus of the article will be on the specific implementation of the QC method as put forward in Refs. [1–4]. Variations on this implementation, enhancements, and details of specific applications will not be presented. For the interested reader, these additional topics can be found in several QC review articles [5–8] and of course in the original references. The most recent of the QC reviews [5] provides an extensive literature survey, detailing many different implementations, extensions and applications of the QC. Also included in that review are several other coupled methods that are either direct descendants of the QC or are similar alternatives developed independently. For a detailed comparison between several coupled atomistic/continuum methods including the QC, the reader may find the review by Curtin and Miller [9] of interest. A QC website designed to serve as a clearinghouse for information on the QC method has been established at www.qcmethod.com. The site includes information on QC research, links to researchers, downloadable QC code and documentation. The downloadable code is freely available and corresponds to the QC implementation discussed in this paper.
1. Atomistic Modeling of Crystalline Solids
In the QC, the point of view adopted is that there is an underlying atomistic model of the material which is the "correct" description of the material behaviour. This could, in principle, be a quantum-mechanically based description such as density functional theory (DFT), but in practice the focus has been primarily on atomistic models based on semi-empirical interatomic potentials. A review of such methods can be found, for example, in [10]. Here, we present only the features of such models which are essential for our discussion. We focus on lattice statics solutions, i.e., we are looking for equilibrium atomic configurations for a given model geometry and externally imposed forces or displacements, because most applications of the QC have used a static implementation. Recent work to extend QC to finite temperature and dynamic simulations shows promise, and can be found in Ref. [11].

We assume that there is some reference configuration of N atomic nuclei, confined to a lattice. Thus, the reference position of the ith atom in the model, $X_i$, is found from an integer combination of lattice vectors and a reference (origin) atom position, $X_0$:

$X_i = X_0 + l_i A_1 + m_i A_2 + n_i A_3$,  (1)
where $(l_i, m_i, n_i)$ are integers and $A_j$ is the jth Bravais lattice vector.¹ The deformed position of the ith atom, $x_i$, is then found from a unique displacement vector $u_i$ for each atom:

$x_i = X_i + u_i$.  (2)
The displacements $u_i$, while only having physical meaning on the atomic sites, can be treated as a continuous field $u(X)$ throughout the body with the property that $u(X_i) \equiv u_i$. This approach, while not the conventional one in atomistic models, is useful in effecting the connection to continuum mechanics. Note that for brevity we will often refer to the field u to represent the set of all atomic displacements $\{u_1, u_2, \ldots, u_N\}$, where N is the number of atoms in the body.

In standard lattice statics approaches using semi-empirical potentials, there is a well-defined total energy function $E^{tot}$ that is determined from the relative positions of all the atoms in the problem. In many semi-empirical models, this energy can be written as a sum over the energy of each individual atom. Specifically,

$E^{tot} = \sum_{i=1}^{N} E_i(u)$,  (3)
where $E_i$ is the site energy of atom i, which depends on the displacements u through the relative positions of all the atoms in the deformed configuration. For example, within the embedded atom method (EAM) [13, 14] atomistic model, this site energy is given by

$E_i = U_i(\bar{\rho}_i) + \frac{1}{2}\sum_{j \neq i} V_{ij}(r_{ij})$,  (4)

where $U_i$ can be interpreted as an electron-density dependent embedding energy, $V_{ij}$ is a pair potential between atom i and its neighbor j, and $r_{ij} = \sqrt{(x_i - x_j)\cdot(x_i - x_j)}$ is the interatomic distance. The electron density at the position of atom i, $\bar{\rho}_i$, is the superposition of spherically averaged density contributions from each of the neighbors, $\rho_j$:

$\bar{\rho}_i = \sum_{j \neq i} \rho_j(r_{ij})$.  (5)
A similar site energy can be identified for other empirical atomistic models, such as those of the Stillinger–Weber type [15].

¹ We omit a discussion of complex lattices with more than one atom at each Bravais lattice site. This topic is discussed in Refs. [5, 12].
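As a concrete illustration of Eqs. (3)–(5), the short Python sketch below evaluates the EAM total energy of a small cluster. The embedding function, pair potential, and density contribution are arbitrary placeholder forms chosen for readability, not a fitted potential.

import numpy as np

# Placeholder EAM functions (illustrative forms, not a fitted potential)
def U(rho):            # embedding energy
    return -np.sqrt(rho)

def V(r):              # pair potential
    return 1.0 / r**12 - 2.0 / r**6

def rho_contrib(r):    # spherically averaged density contribution
    return np.exp(-r)

def eam_total_energy(x):
    """Total energy E_tot = sum_i E_i for deformed positions x (N x 3),
    with E_i = U(rho_i) + 0.5 * sum_{j != i} V(r_ij)  [Eq. (4)]."""
    n = len(x)
    e_tot = 0.0
    for i in range(n):
        rij = np.linalg.norm(x - x[i], axis=1)   # distances to all atoms
        mask = np.arange(n) != i                 # exclude self (j != i)
        rho_i = np.sum(rho_contrib(rij[mask]))   # Eq. (5)
        e_tot += U(rho_i) + 0.5 * np.sum(V(rij[mask]))
    return e_tot

# Usage: energy of a small distorted cluster
x = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(eam_total_energy(x))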
In addition to the potential energy of the atoms, there may be energy due to external loads applied to the atoms. Thus, the total potential energy of the system (atoms plus external loads) can be written as

$\Pi(u) = E^{tot}(u) - \sum_{i=1}^{N} f_i \cdot u_i$,  (6)

where $-f_i \cdot u_i$ is the potential energy of the applied load $f_i$ on atom i. In lattice statics, we seek the displacements u such that this potential energy is minimized.
2. The QC Method
The goal of the static QC method is to find the atomic displacements that minimize Eq. (6) by approximating the total energy of Eq. (3) such that:

1. the number of degrees of freedom is substantially reduced from 3N, but the full atomistic description is retained in certain "critical" regions;
2. the computation of the energy in Eq. (3) is accurately approximated without the need to explicitly compute the site energy of all the atoms;
3. the fully atomistic, critical regions can evolve with the deformation during the simulation.

In this section, the details of how the QC achieves each of these goals are presented.
2.1. Removing Degrees of Freedom
A key measure of a displacement field is the deformation gradient F. A body deforms from reference state X to deformed state x = X + u(X), from which we define

$F(X) \equiv \frac{\partial x}{\partial X} = I + \frac{\partial u}{\partial X}$,  (7)
where I is the identity tensor. If the deformation gradient changes gradually on the atomic scale, then it is not necessary to explicitly track the displacement of every atom in the region. Instead, the displacements of a small fraction of the atoms (called representative atoms or “repatoms”) can be treated explicitly, with the displacements of the remaining atoms approximately found through interpolation. In this way, the degrees of freedom are reduced to only the coordinates of the repatoms.
The QC incorporates such a scheme by recourse to the interpolation functions of the finite element method (FEM) (see, for example, [16]). Figure 1 illustrates the approach in two dimensions in the vicinity of a dislocation core. The filled atoms are the selected repatoms, which are meshed by a space-filling set of linear triangular finite elements. Any atom not chosen as a repatom, like the one labeled "A", is subsequently constrained to move according to the interpolated displacements of the element in which it resides. The density of repatoms is chosen to vary in space according to the needs of the problem of interest. In regions where full atomistic detail is required, all atoms are chosen as repatoms, with correspondingly fewer in regions of more slowly varying deformation gradient. This is illustrated in Fig. 1, where all the atoms around the dislocation core are chosen as repatoms. Further away, where the crystal experiences only the linear elastic strains due to the dislocation, the density of repatoms is reduced.

This first approximation of the QC, then, is to replace the energy $E^{tot}$ by $E^{tot,h}$:

$E^{tot,h} = \sum_{i=1}^{N} E_i(u^h)$.  (8)
In this equation the atomic displacements are now found through the interpolation functions and take the form

$u^h = \sum_{\alpha=1}^{N_{rep}} S_\alpha u_\alpha$,  (9)
where $S_\alpha$ is the interpolation (shape) function associated with repatom α, and $N_{rep}$ is the number of repatoms, $N_{rep} \ll N$. Note that the formal summation over the shape functions in Eq. (9) is in practice much simpler due to the compact support of the finite element shape functions. Specifically, shape functions are identically zero in every element not immediately adjacent to a specific repatom. Referring back to Fig. 1, this means that the displacement of atom A is determined entirely from the sum over the three repatoms B, C and D defining the element containing A:

$u^h(X_A) = S_B(X_A) u_B + S_C(X_A) u_C + S_D(X_A) u_D$.  (10)
Introducing this kinematic constraint on most of the atoms in the body will achieve the goal of reducing the number of degrees of freedom in the problem, but notice that for the purpose of energy minimization we must still compute the energy and forces on the degrees of freedom by explicitly visiting every atom – not just the repatoms – and building its neighbor environment from the interpolated displacement fields. Next, we discuss how these calculations are approximated and made computationally tractable.
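In an implementation, Eq. (10) is ordinary linear interpolation over a triangle, conveniently evaluated with barycentric coordinates. The sketch below uses hypothetical coordinates and displacements (they are not taken from the QC code) to compute the constrained displacement of an atom from the three repatoms of its containing element.

import numpy as np

def shape_functions(X, XB, XC, XD):
    """Barycentric (linear triangle) shape functions S_B, S_C, S_D
    evaluated at reference point X; each equals 1 at its own node,
    0 at the other two, and they sum to 1."""
    T = np.array([[XB[0] - XD[0], XC[0] - XD[0]],
                  [XB[1] - XD[1], XC[1] - XD[1]]])
    sB, sC = np.linalg.solve(T, X - XD)
    return np.array([sB, sC, 1.0 - sB - sC])

# Hypothetical reference positions of repatoms B, C, D and of atom A
XB, XC, XD = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([0.0, 3.0])
XA = np.array([1.0, 1.0])
uB, uC, uD = np.array([0.1, 0.0]), np.array([0.2, 0.05]), np.array([0.0, 0.1])

S = shape_functions(XA, XB, XC, XD)
uA = S[0] * uB + S[1] * uC + S[2] * uD   # Eq. (10)
print(S, uA)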
Figure 1. The selection of repatoms from all the atoms near a dislocation core is shown in (a); these are then meshed by linear triangular elements in (b). The density of the repatoms varies according to the severity of the variation in the deformation gradient. After Ref. [5]. Reproduced with permission.
2.2. Efficient Energy Calculations: The Local QC
In addition to the degree of freedom reduction described in Section 2.1, the QC requires an efficient means of computing the energy and forces without the need to visit every atom in the problem, as implied by Eq. (8). The first way to accomplish this is by recourse to the so-called Cauchy–Born (CB) rule (see Ref. [17] and references therein), resulting in what is referred to as the local formulation of the QC.¹ The use of linear shape functions to interpolate the displacement field means that within each element, the deformation gradient will be uniform. The Cauchy–Born rule assumes that a uniform deformation gradient at the macro-scale can be mapped directly to the same uniform deformation on the micro-scale. For crystalline solids with a simple lattice structure,² this means that every atom in a region subject to a uniform deformation gradient will be energetically equivalent. Thus, the energy within an element can be estimated by computing the energy of one atom in the deformed state and multiplying by the number of atoms in the element.

In practice, the calculation of the CB energy is done separately from the model in a "black box," where for a given deformation gradient F, a unit cell with periodic boundary conditions is deformed appropriately and its energy is computed. The strain energy density in the element is then given by

$E(F) = \frac{E_0(F)}{\Omega_0}$,  (11)
where $\Omega_0$ is the unit cell volume (in the reference configuration) and $E_0$ is the energy of the unit cell when its lattice vectors are distorted according to F. Now the total energy of an element is simply this energy density times the element volume, and the total energy of the problem is simply the sum of the element energies:

$E^{tot,h} \approx \sum_{e=1}^{N_{element}} \Omega_e E(F_e)$,  (12)

where $\Omega_e$ is the volume of element e. The important computational saving made here is that a sum over all the atoms in the body has been replaced by a sum over all the elements, each one requiring an explicit energy calculation for only one atom. Since the number of elements is typically several orders of magnitude smaller than the total number of atoms, the computational savings is substantial. The number of elements scales linearly with the number of repatoms, and so the local QC scales as $O(N_{rep})$.

¹ The term "local" refers to the fact that use of the CB rule implies that the energy at each point in the continuum will only be a function of the deformation at that point and not of its surroundings.
² A simple lattice structure is one for which there is only one atom at each Bravais lattice site. In a complex lattice with two or more atoms per site, the Cauchy–Born rule must be generalized to permit shuffling of the off-site atoms. See Ref. [12].

Note, however, that even in the case where the deformation is uniform within each element, the local prescription for the energy in the element is only approximate. This is because in the constrained displacement field $u^h$, the deformation gradient varies from one element to the next. At element boundaries and free surfaces, atoms can have energies that differ significantly from that of an atom in a bulk, uniformly deformed lattice. Figure 2 illustrates this schematically for an initially square lattice deformed according to two different deformation gradients in two neighboring regions. The energy of the atom labeled as a "bulk atom" can be accurately computed from the CB rule; its neighbor environment is uniform even though some of its neighbors occupy other elements. However, the "interface atom" and "surface atom" are not accurately described by the CB rule, which assumes that these atoms see uniformly deformed bulk environments. In situations where the deformation is varying slowly from one element to the next and where surface energetics are not important, the local approximation is a good one.

Using the CB rule as in Eq. (11), the QC can be thought of as a purely continuum formulation, but with a constitutive law that is based on
Figure 2. On the left, the reference configuration of a square lattice meshed by triangular elements. On the right, the deformed mesh shows a bulk atom, for which the CB rule is exactly correct, and two other atoms for which the CB rule will give the wrong energy due to its inability to describe surfaces or changes in the deformation gradient. After Ref. [5]. Reproduced with permission.
atomistics rather than on an assumed phenomenological form. The CB constitutive law automatically ensures that the correct anisotropic crystal elasticity response will be recovered for small deformations. It is non-linear elastic (as dictated by the underlying atomistic potentials) for intermediate strains and includes lattice invariance for large deformations; for example, a shear deformation that corresponds to the twinning of the lattice will lead to a rotated crystal structure with zero strain energy density.

An advantage of the local QC formulation is that it allows the use of quantum-mechanical atomistic models that cannot be written as a sum over individual atom energies, such as tight binding (TB) and DFT. In these models only the total energy of a collection of atoms can be obtained. However, for a lattice undergoing a uniform deformation it is possible to compute the energy density E(F) from a single unit cell with periodic boundary conditions. Incorporation of quantum-mechanical information into the atomic model generally ensures that the description is more transferable, i.e., it provides a better description of the energy of atomic configurations away from the reference structure to which empirical potentials are fitted. This allows truly first-principles simulations of some macroscopic processes such as homogeneous phase transformations.
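The CB "black box" is easy to mimic for a simple lattice described by a pair potential: deform the cell by F, sum the energy over periodic images, and divide by the reference cell volume as in Eq. (11). In the sketch below the pair potential and image cutoff are illustrative placeholders; a real implementation would call an atomistic engine instead.

import numpy as np

def cb_energy_density(F, A, pair=lambda r: 1/r**12 - 2/r**6, nimg=3):
    """Cauchy-Born strain energy density E(F) = E0(F)/Omega0 [Eq. (11)]
    for a simple (one-atom) lattice whose Bravais vectors are the
    columns of A.  The unit-cell energy is half the pair-sum over
    periodic images; the pair potential is a placeholder form."""
    FA = F @ A                                  # deformed lattice vectors
    omega0 = abs(np.linalg.det(A))              # reference cell volume
    e0 = 0.0
    span = range(-nimg, nimg + 1)
    for l in span:
        for m in span:
            for n in span:
                if (l, m, n) == (0, 0, 0):
                    continue
                r = np.linalg.norm(FA @ np.array([l, m, n], float))
                e0 += 0.5 * pair(r)             # half: bond shared by two atoms
    return e0 / omega0

A = np.eye(3)                                   # cubic reference cell
F = np.array([[1.0, 0.02, 0.0],                 # small simple shear
              [0.0, 1.0,  0.0],
              [0.0, 0.0,  1.0]])
print(cb_energy_density(F, A))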
2.3. More Accurate Calculations: Mixed Local/Non-Local QC
The local QC formulation successfully enhances the continuum FEM framework with atomistic properties such as nonlinearity, crystal symmetry and lattice invariance. The latter property means that dislocations may exist in the local QC. However, the core structure and energy of these dislocations will only be coarsely represented due to the CB approximation of the energy. The same is true for other defects such as surfaces and interfaces, where the deformation of the crystal is non-uniform over distances shorter than the cutoff radius of the interatomic potentials. For example, to correctly account for the energy of the interface shown in Fig. 2, the non-uniform environment of the atoms along the interface must be taken into account. While the local QC can support deformations (such as twinning) which may lead to microstructures containing such interfaces, it will not account for the energy cost of the interface itself. In order to correctly capture these details, the QC must be made non-local in certain regions. The energy of Eq. (8), which in the local QC was approximated by Eq. (12), must instead be approximated in a way that is sensitive to non-uniform deformation and free surfaces, especially in the limit where full atomistic detail is required.
We now make the ansatz that the energy of Eq. (8) can be approximated by computing only the energy of the repatoms, but we will identify each repatom as being either local or non-local depending on its deformation environment. Thus, the repatoms are divided into $N_{loc}$ local repatoms and $N_{nl}$ non-local repatoms ($N_{loc} + N_{nl} = N_{rep}$). The energy expression is then approximated as

$E^{tot,h} \approx \sum_{\alpha=1}^{N_{nl}} n_\alpha E_\alpha(u^h) + \sum_{\alpha=1}^{N_{loc}} n_\alpha E_\alpha(u^h)$.  (13)
The important difference between Eq. (8) and Eq. (13) is that the sum on all the atoms in the problem has been replaced with a sum on only the repatoms. The function $n_\alpha$ is a weight assigned to repatom α, which will be high for repatoms in regions of low repatom density and vice versa. For consistency, the weight functions must be chosen so that

$\sum_{\alpha=1}^{N_{rep}} n_\alpha = N$,  (14)
which further implies (through the consideration of a special case where every atom in a problem is made a repatom) that in atomically-refined regions, all $n_\alpha = 1$. From Eq. (14), the weight functions can be physically interpreted as the number of atoms represented by each repatom α.

The weight $n_\alpha$ for each repatom (local or non-local) is determined from a tessellation that divides the body into cells around each repatom. One physically sensible tessellation is Voronoi cells [18], but an approximate Voronoi diagram can be used instead due to the high computational overhead of the Voronoi construction. In practice, the coupled QC formulation makes use of a simple tessellation based on the existing finite element mesh, partitioning each element equally between each of its nodes. The volume of the tessellation cell for a given repatom, divided by the volume of a single atom (the Wigner–Seitz volume), provides $n_\alpha$ for the repatom. In typical QC simulations, non-local regions are fully refined down to the atomic scale, and so the weight of the non-local repatoms is one.

To compute the energy of a local repatom α, we recognize that of the $n_\alpha$ atoms it represents, $n_\alpha^e$ reside in each element e adjacent to the repatom. The weighted energy contribution of the repatom is then found by applying the CB rule within each element adjacent to α, such that

$E_\alpha = \sum_{e=1}^{M} \frac{n_\alpha^e}{n_\alpha} \Omega_0 E(F_e), \qquad n_\alpha = \sum_{e=1}^{M} n_\alpha^e$,  (15)

where $E(F_e)$ is the energy density in element e by the CB rule, $\Omega_0$ is the Wigner–Seitz volume of a single atom, and e runs over all M elements adjacent to α.
Note that this description of the local repatoms is exactly equivalent to the element-by-element summation of the local QC in Eq. (12); it is only the way that the energy partitioning is written that is different. In a mesh containing only local repatoms, the two formulations are the same, but the summations have been rearranged from one over elements in Eq. (12) to one over the repatoms here.

The energy of each non-local repatom is computed from the deformed neighbor environment dictated by the current interpolated displacements in the elements. In essence, every atom in the vicinity of a non-local repatom is displaced to the deformed configuration, the energy of each non-local repatom in this configuration is computed from Eq. (4), and the total energy is the sum of these repatom energies weighted by $n_\alpha$. For example, the energy of the repatom identified as an "interface atom" in Fig. 2 requires that the neighbor environment be generated by displacing each neighbor according to the element in which it resides. Thus, the energy of each non-local repatom is exactly as it should be under the displacement field $u^h$, while the local approximation is used in regions where the deformation is uniform on the atomic scale. From this starting point, the forces on all the repatoms can be obtained as the appropriate derivatives of Eq. (13), and energy minimization can proceed.

When making use of the mixed formulation described in Eq. (13), it now becomes necessary to decide whether a given repatom should be local or non-local. This is achieved automatically in the QC using a non-locality criterion. Note that simply having a large deformation in a region does not in itself require a non-local repatom, as the CB rule of the local formulation will exactly describe the energy of any uniform deformation, regardless of the severity. The key feature that should trigger a non-local treatment of a repatom is a significant variation in the deformation gradient on the atomic scale in the repatom's proximity. Thus, the non-locality criterion is implemented as follows. A cut-off, $r_{nl}$, is empirically chosen to be between two and three times the cut-off radius of the interatomic potentials. The deformation gradients in every element within this cut-off of a given representative atom are compared, by looking at the differences between their eigenvalues. The criterion is then:

$\max_{a,b;k} |\lambda_k^a - \lambda_k^b| < \epsilon$,  (16)

where $\lambda_k^a$ is the kth eigenvalue of the right stretch tensor $U^a = \sqrt{(F^a)^T F^a}$ in element a, $k = 1, \ldots, 3$, and the indices a and b run over all elements within $r_{nl}$ of a given repatom. The repatom will be made local if this inequality is satisfied, and non-local otherwise. In practice, the tolerance $\epsilon$ is determined empirically. A value of 0.1 has been used in a number of tests and found to give good results. The effect of this criterion is to produce clusters of non-local repatoms in regions of rapidly varying deformation.
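The criterion of Eq. (16) translates directly into code: gather the deformation gradients of all elements within $r_{nl}$ of a repatom, compute the principal stretches of each, and compare them component-wise. A minimal sketch follows, with the element search assumed to be done elsewhere; the inputs are hypothetical.

import numpy as np

def stretch_eigenvalues(F):
    """Eigenvalues of the right stretch tensor U = sqrt(F^T F),
    i.e., the square roots of the eigenvalues of C = F^T F."""
    C = F.T @ F
    return np.sqrt(np.linalg.eigvalsh(C))

def is_local(F_nearby, eps=0.1):
    """Apply Eq. (16) to the deformation gradients F_nearby of all
    elements within r_nl of a repatom; eps is the empirical tolerance
    (0.1 in the text).  Returns True if the repatom may stay local."""
    lam = np.array([stretch_eigenvalues(F) for F in F_nearby])
    # max over element pairs (a, b) and eigenvalue index k
    spread = (lam.max(axis=0) - lam.min(axis=0)).max()
    return spread < eps

# Two nearly identical elements -> local; add a sheared one -> non-local
F1 = np.eye(3)
F2 = np.eye(3) + 1e-3 * np.random.default_rng(1).standard_normal((3, 3))
F3 = np.eye(3); F3[0, 1] = 0.4
print(is_local([F1, F2]), is_local([F1, F2, F3]))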
The fact that the non-local repatoms tend to cluster into atomistically refined regions surrounded by local regions leads to non-local/local interfaces in the QC. As in all attempts found in the literature to couple a non-local atomistic region to a local continuum region, this will lead to spurious forces near the interface. These forces, dubbed "ghost forces" in the QC literature, arise because there is an inherent mismatch between the local (continuum) and non-local (atomistic) regions in the problem. In short, the finite range of interaction in the non-local region means that the motion of repatoms in the local region will affect the energy of non-local repatoms, while the converse may not be true. Upon differentiating Eq. (13), forces on repatoms in the vicinity of the interface may include a non-physical contribution due to this asymmetry. Note that these ghost forces are a consequence of differentiating an approximate energy functional, and therefore they are still "real" forces in the sense that they come from a well-defined potential. The problem is that the mixed local/non-local energy functional of Eq. (13) is approximate, and the error in this approximation is most apparent at the interface. A consequence of this is that a perfect, undistorted crystal containing an artificial local/non-local interface will be able to lower its energy below the ground-state energy by rearranging the atoms in the vicinity of the interface. This is clearly a non-physical result.

In Ref. [3], a solution to the ghost forces was proposed whereby corrective forces were added as dead loads to the interface region. In this way, there is a well-defined contribution of the corrective forces to the total energy functional (since the dead loads are constant), and the minimization of the modified energy can proceed using standard conjugate gradient or Newton–Raphson techniques. The procedure can be iterated to self-consistency.
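The dead-load correction can be organized as a short self-consistent loop. The sketch below is schematic: ghost_forces() and minimize_energy() are hypothetical stand-ins for the corresponding QC routines, not the API of the downloadable code.

import numpy as np

def relax_with_ghost_correction(u, ghost_forces, minimize_energy,
                                max_iter=10, tol=1e-8):
    """Self-consistent dead-load correction for ghost forces.

    ghost_forces(u) returns the spurious interface forces at the
    current configuration; they are held constant ("dead loads")
    during the next minimization, and the loop is iterated until the
    correction stops changing.  Both callables are placeholders."""
    g = np.asarray(ghost_forces(u))
    for _ in range(max_iter):
        # Minimize with the constant corrective loads -g superimposed
        u = minimize_energy(u, extra_loads=-g)
        g_new = np.asarray(ghost_forces(u))
        if np.max(np.abs(g_new - g)) < tol:   # self-consistent
            break
        g = g_new
    return u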
2.4. Evolving Microstructure: Automatic Mesh Adaption
The QC approach outlined in the previous sections can only be successfully applied to general problems in crystalline deformation if it is possible to ensure that the fine structure in the deformation field will be captured. Without a priori knowledge of where the deformation field will require fine-scale resolution, it is necessary that the method have an automatic way to adapt the finite element mesh through the addition or removal of repatoms. To this end, the QC makes use of the finite element literature, where considerable attention has been given to adaptive meshing techniques for many years. Typically in finite element techniques, a scalar measure is defined to quantify the error introduced into the solution by the current density of nodes (or repatoms in the QC). Elements in which this error estimator is higher than some prescribed tolerance are targeted for adaption, while at the same time
the error estimator can be used to remove unnecessary nodes from the model. The error estimator of Zienkiewicz and Zhu [19], originally posed in terms of errors in the stresses, is re-cast for the QC in terms of the deformation gradient. Specifically, we define the error estimator to be

$\varepsilon_e = \left[ \frac{1}{\Omega_e} \int_{\Omega_e} (\bar{F} - F_e) : (\bar{F} - F_e)\, d\Omega \right]^{1/2}$,  (17)

where $\Omega_e$ is the volume of element e, $F_e$ is the QC solution for the deformation gradient in element e, and $\bar{F}$ is the $L_2$-projection of the QC solution for F, given by

$\bar{F} = S F^{avg}$.  (18)

Here, S is the shape function array, and $F^{avg}$ is the array of nodal values of the projected deformation gradient $\bar{F}$. Because the deformation gradients are constant within the linear elements used in the QC, the nodal values $F^{avg}$ are simply computed by averaging the deformation gradients found in each element touching a given repatom. This average is then interpolated throughout the elements using the shape functions, providing an estimate of the discretized field solution that would be obtained if higher order elements were used. The error, then, is defined as the difference between the actual solution and this estimate of the higher order solution. If this error is small, it implies that the higher order solution is well represented by the lower order elements in the region, and thus no refinement is required. The integral in Eq. (17) can be computed quickly and accurately using Gaussian quadrature.

Elements for which the error $\varepsilon_e$ is greater than some prescribed error tolerance are targeted for refinement. Refinement then proceeds by adding three new repatoms at the atomic sites closest to the mid-sides of the targeted elements. Notice that since repatoms must fall on actual atomic sites in the reference lattice, there is a natural lower limit to element size; if the nearest atomic sites to the mid-sides of the elements are the atoms at the element corners, the region is fully refined and no new repatoms can be added.

The same error estimator is used in the QC to remove unnecessary repatoms from the mesh. In this process, a repatom is temporarily removed from the mesh and the surrounding region is locally remeshed. If all of the elements produced by this remeshing process have a value of the error estimator below the threshold, the repatom can be eliminated.
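For the constant-strain triangles used here, Eqs. (17) and (18) reduce to nodal averaging followed by quadrature of a quadratic integrand, which a three-point interior Gauss rule integrates exactly. The sketch below uses an illustrative mesh layout; the array conventions are assumptions, not the QC code's data structures.

import numpy as np

def zz_error(elems, F_elem, node_elems):
    """Zienkiewicz-Zhu-style estimator of Eq. (17) for a 2D mesh of
    linear triangles.  elems[e] lists the three node ids of element e,
    F_elem[e] is its (constant) 2x2 deformation gradient, and
    node_elems[n] lists the elements touching node n."""
    # Nodal values F_avg: average of F over the elements touching a node
    F_avg = {n: sum(F_elem[e] for e in es) / len(es)
             for n, es in node_elems.items()}
    # Barycentric coordinates of the 3 interior Gauss points (weight 1/3
    # each); exact for the quadratic integrand since F_bar is linear
    gauss = np.array([[2/3, 1/6, 1/6], [1/6, 2/3, 1/6], [1/6, 1/6, 2/3]])
    eps = np.zeros(len(elems))
    for e, nodes in enumerate(elems):
        Fn = [F_avg[n] for n in nodes]
        s = 0.0
        for bary in gauss:
            Fbar = sum(b * Fv for b, Fv in zip(bary, Fn))  # Eq. (18)
            d = Fbar - F_elem[e]
            s += np.sum(d * d) / 3.0     # volume-averaged integrand
        eps[e] = np.sqrt(s)              # Eq. (17)
    return eps

# Tiny two-element example: a sheared element produces a large error
elems = [(0, 1, 2), (1, 3, 2)]
F_elem = [np.eye(2), np.eye(2) + np.array([[0, 0.3], [0, 0]])]
node_elems = {0: [0], 1: [0, 1], 2: [0, 1], 3: [1]}
print(zz_error(elems, F_elem, node_elems))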
3. Practical Issues in QC Simulations
In this section, we will use a specific, simple example to highlight the practical issues surrounding solutions using the QC method. The example to be
discussed is also provided with the QC download at qcmethod.com, and it is described in even greater detail in the documentation that accompanies that code.
3.1. Problem Definition
Consider the problem of a twin boundary in face-centered cubic (FCC) aluminum. The boundary is perfect but for a small step. A question of interest may be "how does this stepped boundary respond to mechanical load?" In this example, we probe this question by using the QC method to solve the problem shown in Fig. 3(a), where two crystals, joined by a stepped twin boundary, are sheared until the boundary begins to migrate due to the load. The result will elucidate the mechanism of this migration.

The implementation of the QC method used to solve this problem has been described as "two and a half" dimensional to emphasize that, while it is not a fully 3D model, it is also not simply 2D. Specifically, the reference crystal structure is 3D, and all the underlying atomistic calculations (both local and non-local) consider the full, 3D environment of each atom. However, the deformation of the crystal is constrained such that the three components of displacement, $u_x$, $u_y$ and $u_z$, are functions of only two coordinates, x and y. This allows, for example, both edge and screw dislocations, but forces the line direction of the dislocations to be along z. For the reader who is familiar with purely atomistic simulations, this is equivalent to imposing periodic boundary conditions along the z direction, and then using a periodic cell with the
[Figure 3 appears here: two panels, (a) and (b), showing the finite element mesh in the x–y plane over roughly −200 to 200 Å; the two grains are labeled "fcc Al" and the interface is labeled "Stepped twin boundary".]
Figure 3. (a) Initial coarse mesh used to define the simulation volume and (b) the final mesh after the automatic adaption.
minimum possible thickness along z to produce the correct crystal structure. We sometimes refer to this as a "2D" implementation for brevity, but ask that the reader bear in mind the true nature of the model. The use of a 2D implementation of the QC to study this problem is appropriate given its geometry. However, fully 3D implementations of the QC exist, and these must be used for many problems of interest (see examples in Ref. [5]).

The starting point for a QC simulation is a crystal lattice, defined by an origin atom and a set of Bravais vectors as in Eq. (1). To allow the QC method to model polycrystals, it is necessary to define a unique crystal structure within each grain. The shape of each grain is defined by a simple polygon in 2D. Physically, it makes sense that the polygons defining each grain do not overlap, although it may be possible to have holes between the grains. In our example, it is easy to see how the shape of the two grains could be defined to include the grain boundary step. Mathematically, the line defining the boundary should be shared identically by the two grains, but this can lead to numerical complications, for example in checking whether two grains overlap. Fortunately, realistic atomistic models are unlikely to encounter atoms that are less than an Ångström or so apart, and so there exists a natural "tolerance" in the definition of these polygons. For example, a gap between grains of 0.1 Å will usually provide sufficient numerical resolution between the grains without any atoms falling "in the gap" and therefore being omitted from the model.

In the QC implementation, the definition of the grains is separate from the definition of the actual volume of material to be simulated. This simulation volume is defined by a finite element mesh between an initial set of repatoms. Each element in this mesh must lie within one or more of the grain polygons described above, but the finite element mesh need not fill the entire volume of the defined grains. It is useful to think of the actual model (the mesh) being "cut out" from the previously defined grain structure. For our problem, a sensible choice for the initial mesh is shown in Fig. 3(a), where the grain boundary lies approximately (to within the height of the step) along the line y = 0. Elements whose centroids lie above or below the grain boundary are assumed to contain material oriented according to the lattice of the upper or lower grain, respectively.

Since our interest here is in atomic scale processes along the grain boundary, it is clear that the model shown in Fig. 3(a), with elements approximately 50 Å in width, will not provide the necessary accuracy. Thus, we can make use of the QC's automatic adaption to increase the resolution near the grain boundary. The main adaption criterion, as outlined earlier, is based on the error in the finite element interpolation of the deformation gradient. However, there will initially be no deformation near the grain boundary and thus no reason for automatic adaption to be triggered. It is therefore necessary to force the model to adapt in regions that are inhomogeneous at the atomic scale for reasons other than deformation. To this end, we can identify certain segments of the
grain boundary as "active" segments. Any repatom within a prescribed distance of an active segment will be made non-local. This further implies that the elements touching this repatom will be targeted for refinement, since we require that $n_\alpha = 1$ for all non-local repatoms. The effect of such a technique is shown in Fig. 3(b), where the segment of the boundary between x = −100 and 100 Å was defined to be active. The result is that the grain boundary structure is correctly captured in the vicinity of the step, as well as for some distance on either side of the step.
3.2. Solution Procedure
In the static QC implementation, the solution procedure amounts to minimization of the total energy (elastic energy plus the potential energy of the applied loads, see Eq. (6)) for a given set of boundary conditions (applied displacements or forces on certain repatoms). However, problems solved using the QC method are typically highly nonlinear, and as such their energy functional typically includes many local minima. In order to find a physically realistic solution, it is necessary to use a quasi-static loading approach, whereby boundary conditions are gradually incremented, the energy is minimized, and the minimum energy configuration is used in generating an initial guess to the solution after the subsequent load increment.

Again, we can refer to the specific example of the stepped twin boundary to make this more clear. Our desire, in this example, is to study the effect of applying a shear strain to the stepped twin boundary. Specifically, we may be interested in knowing the critical shear strain at which the boundary begins to migrate and to understand the mechanism of this migration. We begin by choosing a sensible strain increment to apply, such that the incremental deformation will not be too severe between minimization steps. For this example, the initial guess, $u^{n+1}_0$, used to solve for the relaxed displacement, $u^{n+1}$, of load step n + 1 is given by

$u^{n+1}_0 = u^n + \Delta F\, X$,  (19)

where $u^n$ is the relaxed, minimum energy displacement field from load step n, $u^0 = 0$, and the matrix $\Delta F$ corresponding to pure shear along the y direction is

$\Delta F = \begin{pmatrix} 1 & \Delta\gamma & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.  (20)

Thus, a shear strain increment of $\Delta\gamma$ is applied, the outer repatoms are held fixed to the resulting displacements, and all inner repatoms are relaxed until the
energy reaches a minimum. Then, another strain increment is superimposed on these relaxed displacements and the process repeated. After n load steps, a total macroscopic shear strain of $\gamma = n\,\Delta\gamma$ has been applied to the outer boundary of the bi-crystal.

The energy minimization can be performed using several standard approaches, such as the conjugate gradient (CG) or the Newton–Raphson (NR) methods (both of which are described, for example, in Ref. [20]). The CG method has the advantage over the NR technique that it requires only the energy functional and its first derivatives with respect to the repatom positions (i.e., the forces). The NR method requires a second derivative, or "stiffness matrix", that is not straightforward to derive or to code in an efficient manner. Once correctly implemented, however, the NR method has the advantage of quadratic convergence (compared to linear convergence for the CG method) once the system is close to the energy minimizing configuration.

By monitoring the applied force (measured as the sum of forces in the y-direction applied to the top surface of the bi-crystal) versus the accumulated shear strain, γ, it can be observed that there is an essentially linear response for the first six load steps, and then a sudden load drop from step six to seven. This jump corresponds to the first inelastic behaviour of the boundary, the mechanism of which is shown in Fig. 4. In Fig. 4(a), a close-up of the relaxed step at an applied strain of γ = 0.03 is shown, while Fig. 4(b) shows the relaxed configuration after the next strain increment at γ = 0.035. The mechanism of this boundary motion is the motion of two Shockley partial dislocations from the corners of the step along the boundary. This can be seen clearly by observing the finite element mesh between the repatoms in Fig. 4(c). Because the mesh is triangulated in the reference configuration, the effect of plastic slip is the shearing of a row of elements in the wake of the moving dislocations.

One challenge in modeling dislocation motion in crystals at the atomic scale is evident in this simulation. In crystals with a low Peierls resistance, like the FCC crystal modelled here, dislocations will move long distances under small applied stresses. In this simulation, the Shockley partials which nucleated at the step move to the ends of the region of atomic-scale refinement. In order to rigorously compute the equilibrium position of the dislocations, it would be necessary to further adapt the model. The presence of the dislocation in close proximity to the larger elements to the left of the fully refined region will trigger the adaption criterion, as well as increase the number of repatoms that are non-local according to the non-locality criterion defined earlier. This will allow the dislocations to move somewhat further upon subsequent relaxation. In principle, this process of iteratively adapting and relaxing can be repeated until the dislocations come to their true equilibrium, which in this example would be at the left and right free surfaces of the bi-crystal.
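The quasi-static driver of Eqs. (19) and (20) amounts to a short loop: superimpose the incremental shear displacement field, relax, repeat. In the sketch below, minimize() is a placeholder for the QC relaxation (CG or NR) with the outer repatoms held fixed; it is not the downloadable code's interface.

import numpy as np

def quasistatic_shear(X, u, minimize, n_steps=10, dgamma=0.005):
    """Quasi-static shear loading in the spirit of Eqs. (19)-(20).
    X is the (N, 3) array of reference positions, u the (N, 3)
    displacements; minimize() stands in for the QC energy minimizer."""
    dF = np.array([[1.0, dgamma, 0.0],    # Eq. (20): shear increment
                   [0.0, 1.0,    0.0],
                   [0.0, 0.0,    1.0]])
    history = []
    for step in range(1, n_steps + 1):
        # add the incremental shear displacement field (dF - I) X
        u = u + X @ (dF - np.eye(3)).T
        u = minimize(u)                   # relax the inner repatoms
        history.append((step * dgamma, u.copy()))
    return history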
Figure 4. Mechanism of migration of the twin boundary under shear. (a) Initial boundary location, before migration. (b) Boundary migration, after migration. (c) Deformed mesh showing the slip of the Shockley partial dislocations.
In practice, however, we may not be interested in the full details of where this dislocation comes to rest, if we are willing to accept some degree of error in the simulation. Specifically, the fact that the dislocation is held artificially close to the step may affect the critical load level at which subsequent migration events occur. This compromise is made for the sake of computational speed, which would be significantly degraded if we were to iteratively adapt and relax many times for each load step.
4. Summary
This review has summarized the theory and practical implementation of the QC method. Rather than provide an exhaustive review of the QC literature (which can already be found, for example, in Ref. [5]), the intent has been to provide a simple overview for someone interested in understanding one implementation of the QC method. More specific details, including free, open-source code and documentation, can be found at www.qcmethod.com.
References

[1] E.B. Tadmor, M. Ortiz, and R. Phillips, "Quasicontinuum analysis of defects in solids," Phil. Mag. A, 73, 1529–1563, 1996.
[2] E.B. Tadmor, R. Phillips, and M. Ortiz, "Mixed atomistic and continuum models of deformation in solids," Langmuir, 12, 4529–4534, 1996.
[3] V.B. Shenoy, R. Miller, E. Tadmor, D. Rodney, R. Phillips, and M. Ortiz, "An adaptive methodology for atomic scale mechanics: the quasicontinuum method," J. Mech. Phys. Sol., 47, 611–642, 1998.
[4] V.B. Shenoy, R. Miller, E.B. Tadmor, R. Phillips, and M. Ortiz, "Quasicontinuum models of interfacial structure and deformation," Phys. Rev. Lett., 80, 742–745, 1998.
[5] R.E. Miller and E.B. Tadmor, "The quasicontinuum method: overview, applications and current directions," J. of Computer-Aided Mater. Design, 9(3), 203–231, 2002.
[6] M. Ortiz, A.M. Cuitino, J. Knap, and M. Koslowski, "Mixed atomistic continuum models of material behavior: the art of transcending atomistics and informing continua," MRS Bull., 26, 216–221, 2001.
[7] D. Rodney, "Mixed atomistic/continuum methods: static and dynamic quasicontinuum methods," In: A. Finel, D. Maziere, and M. Veron (eds.), NATO Science Series II, Vol. 108, "Thermodynamics, Microstructures and Plasticity," Kluwer Academic Publishers, Dordrecht, 265–274, 2003.
[8] M. Ortiz and R. Phillips, "Nanomechanics of defects in solids," Adv. Appl. Mech., 36, 1–79, 1999.
[9] W.A. Curtin and R.E. Miller, "Atomistic/continuum coupling methods in multi-scale materials modeling," Model. Simul. Mater. Sci. Eng., 11(3), R33–R68, 2003.
[10] A. Carlsson, "Beyond pair potentials in elemental transition metals and semiconductors," Sol. Stat. Phys., 43, 1–91, 1990.
[11] V. Shenoy, V. Shenoy, and R. Phillips, "Finite temperature quasicontinuum methods," Mater. Res. Soc. Symp. Proc., 538, 465–471, 1999.
[12] E. Tadmor, G. Smith, N. Bernstein, and E. Kaxiras, "Mixed finite element and atomistic formulation for complex crystals," Phys. Rev. B, 59, 235–245, 1999.
[13] M. Daw and M. Baskes, "Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals," Phys. Rev. B, 29, 6443–6453, 1984.
[14] J. Nørskov and N. Lang, "Effective-medium theory of chemical binding: application to chemisorption," Phys. Rev. B, 21, 2131–2136, 1980.
[15] F. Stillinger and T. Weber, "Computer-simulation of local order in condensed phases of silicon," Phys. Rev. B, 31, 5262–5271, 1985.
[16] O.C. Zienkiewicz, The Finite Element Method, vols. 1–2, 4th edn., McGraw-Hill, London, 1991.
[17] J. Ericksen, In: M. Gurtin (ed.), Phase Transformations and Material Instabilities in Solids, Academic Press, New York.
[18] A. Okabe, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Wiley, Chichester, England, 1992.
[19] O.C. Zienkiewicz and J.Z. Zhu, "A simple error estimator and adaptive procedure for practical engineering analysis," Int. J. Numer. Meth. Eng., 24, 337–357, 1987.
[20] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edn., Cambridge University Press, Cambridge, 1992.
2.14 PERSPECTIVE: FREE ENERGIES AND PHASE EQUILIBRIA

David A. Kofke¹ and Daan Frenkel²
¹ University at Buffalo, The State University of New York, Buffalo, New York, USA
² FOM Institute for Atomic and Molecular Physics, Amsterdam, The Netherlands
Analysis of the free energy is required to understand and predict the equilibrium behavior of thermodynamic systems, which is to say, systems in which temperature has some influence on the equilibrium condition. In practice, all processes in the world around us proceed at a finite temperature, so any application of molecular simulation that aims to evaluate the equilibrium behavior must consider the free energy. There are many such phenomena to which simulation has been applied for this purpose. Examples include chemical-reaction equilibrium, protein-ligand affinity, solubility, melting and boiling. Some of these are examples of phase equilibria, which are an especially important and practical class of thermodynamic phenomena. Phase transformations are characterized by some macroscopically observable change signifying a wholesale rearrangement or restructuring occurring at the molecular level. Typically this change occurs at a specific value of some thermodynamic variable such as the temperature or pressure. At the exact point where the transition occurs, both phases are equally stable – have equal free energy – and we find a condition of phase equilibrium or coexistence [1].
1. Free-Energy Measurement
Free-energy calculations are among the most difficult but most important encountered in molecular simulation. A key "feature" of these calculations is their tendency to be inaccurate, yielding highly reproducible results that are nevertheless wrong, despite the calculation being performed in a way that is technically correct. Often seemingly innocuous changes in the way the calculation is performed can introduce (or eliminate) significant inaccuracies. So it
is important when performing these calculations to have a strong sense of how they can go awry, and to proceed in a way that avoids their pitfalls.

The aim of any free-energy calculation is to evaluate the difference in free energy between two systems. "System" is used here in a very general sense. The systems may differ in thermodynamic state (temperature, pressure, chemical composition), in the presence or absence of a constraint, or most generally in their Hamiltonian. Often the free energy of one system is known, either because it is sufficiently simple to permit evaluation analytically (e.g., an ideal gas or a harmonic crystal), or because its free energy was established by a separate calculation. In many cases the free-energy difference is itself the principal quantity of interest. The important point here is that free-energy calculations always involve two (or more) systems. We will label these systems A and B in our subsequent discussion, and their free energy difference will be defined as $\Delta F = F_B - F_A$.

Once the systems of interest have been identified, a large variety of methods are available to evaluate $\Delta F$. At first glance the methods seem to be very diverse and unrelated, but they nevertheless can be grouped into two broad categories: (a) methods based on measurement of the density of states and (b) methods based on work calculations. Implicit in both approaches is the idea of a path joining the two systems, and one way that specific methods differ is in how this path is defined. As free energy is a state function, the free-energy difference of course does not depend on the path, but the performance of a method can depend greatly on this choice (and other details). It is always possible to define a parameter λ that locates a position on the path, such that one value $\lambda_A$ corresponds to system A and another value $\lambda_B$ indicates system B. The parameter λ may be continuous or discrete (in fact, it is not uncommon that it have only two values, $\lambda_A$ and $\lambda_B$), and may represent a single variable or a set of variables, depending on the choice of the path. Moreover, for a given path, the parameter λ can be viewed as a state variable, such that a free energy F(λ) can be associated with each value of λ. Thus $\Delta F = F(\lambda_B) - F(\lambda_A)$. The term "Landau free energy" is sometimes used in connection with this dependence.
1.1. Density-of-States Methods
If a system is given complete freedom to move back and forth across the path joining A and B, it will explore all possible values of the path variable λ, but it will (in general) not spend equal time at each value. The probability p(λ) that the system is observed to be at a particular point λ on the path is related to the value of the free energy there:

$p(\lambda) \propto \exp(-F(\lambda)/kT)$,  (1)
where T is the absolute temperature and k is Boltzmann's constant. This relation is the basic idea behind the density-of-states methods.

The specific way in which λ samples values depends on how the simulation is implemented. Typically density-of-states calculations are performed as part of Monte Carlo (MC) simulations. In this case sampling includes trial moves in which λ is perturbed to a new value, and a decision to accept the trial is taken in the usual MC fashion. It is possible also to have λ vary as part of a molecular dynamics (MD) simulation. In such a situation λ must couple to the equations of motion of the system, usually via an extended-Lagrangian formalism [2]. Then λ follows a deterministic dynamical trajectory akin to the way that the particles' coordinates do.

In almost all cases of practical interest, conventional Boltzmann sampling will probe only a small fraction of all possible λ-values. The variation of the free energy F(λ) can be many times kT when considered over all λ values of interest, and consequently the probability p(λ) can vary over many orders of magnitude. Extra measures must therefore be taken to ensure that sufficient information is gathered over all λ to evaluate the desired free-energy difference, and one of the features distinguishing different density-of-states methods is the way that they take these measures. Almost always an artificial bias φ(λ) must be imposed to force the system to examine values of λ where the free energy is unfavorable. Usually the aim is to formulate the bias to lead to a uniform sampling over λ, which is achieved if φ(λ) = −F(λ). Of course, inasmuch as the aim is to evaluate F(λ), it is necessary to set up a scheme in which the free energy can be estimated either through preliminary simulations or as part of a systematic process of iteration.

The greatest difficulty is found if the free energy change is extensive, meaning that λ affects the entire system and not just a small part of it (e.g., a path that results in a change in the thermodynamic phase, versus a path in which a single molecule is added to the system). In such cases F(λ) scales with the system size and is likely to vary by very large amounts with λ. The practical consequence is that the bias must be tuned very precisely to ensure that good sampling over all λ is accomplished. A robust solution to the problem is the use of windowing, in which the problem of evaluating the full free energy profile F(λ) is broken into smaller problems, each involving only a small range of all λ of interest. Separate simulations are performed over each λ range, and the composite data are assembled to yield the full profile. Even here there are different ways that one can proceed, and a popular approach to this end uses the histogram-reweighting method, which optimally combines the data in a way that accounts for their relative precision. Histogram reweighting is discussed in another chapter of this volume.

Within the framework outlined above, the most obvious way to measure the probability distribution p(λ) is to use a visited-states approach: MC or MD sampling of λ values is performed, perhaps in the presence of the bias φ, and
a histogram is recorded of the frequency with which each value (or bin of values) of λ is occupied. The Wang–Landau method [3, 4] (and its extensions) is the most prominent such technique today. Another approach of this type applies a history-dependent bias using a Gaussian basis [5].

An alternative to visited-states has recently emerged in the form of transition-matrix methods [6–10]. In such an approach one does not tabulate the occupancy of each λ value; rather one tallies statistics about the attempts to transition from one λ to another in a MC simulation. The movement among different λs forms a Markov process, and knowledge of the transition probabilities is sufficient to derive the limiting distribution p(λ). Interestingly, even rejected MC trials contribute information to the transition matrix, so this approach gathers information that is discarded in visited-states methods.

The transition-matrix approach has several other appealing features. The method can accommodate the use of a bias to flatten the sampling, but the bias does not enter into the transition matrix, so if the bias is updated as part of a scheme to achieve a flat distribution the previously recorded transition probabilities do not have to be discarded, as they must be in visited-states methods (at least in their simpler formulations). Moreover, if windowing is applied to obtain uniform samples across λ, it is easy to join data from different windows. It is not even required that adjacent windows overlap, just that they attempt trials (without necessarily accepting) into each other's domain. Details of the transition-matrix methods are still being refined, and the versatility of the approach is currently being explored through its application to different problems. Additionally, there are efforts now to combine visited-states and transition-matrix approaches, exploiting the relatively fast (but rough) convergence of the former while relying on the more complete data collection abilities of the latter to obtain the best precision [11].
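To make the visited-states idea concrete, the following minimal Python sketch applies the Wang–Landau flat-histogram scheme to a discretized path variable λ. It is a toy, not a production implementation: the array F_true stands in for the (normally unknown) energetics of a real system so that the example is self-contained, whereas in an actual simulation the acceptance test would involve the change in configurational energy produced by perturbing λ. The converged bias lnw approximates F(λ)/kT up to an additive constant.

```python
import numpy as np

# Toy free-energy profile standing in for the (unknown) F(lambda) of a real
# system; the barrier is many kT, so unbiased sampling would never cross it.
kT = 1.0
nbins = 50
lam = np.linspace(0.0, 1.0, nbins)
F_true = 8.0 * kT * np.sin(np.pi * lam) ** 2

lnw = np.zeros(nbins)    # running estimate of F(lambda)/kT (the bias)
hist = np.zeros(nbins)   # visit histogram used for the flatness test
f = 1.0                  # Wang-Landau modification factor
state = 0
rng = np.random.default_rng(1)

while f > 1e-6:
    trial = state + rng.choice([-1, 1])
    if 0 <= trial < nbins:
        # accept with prob min[1, exp(-dF/kT + lnw_old - lnw_new)]
        arg = -(F_true[trial] - F_true[state]) / kT + lnw[state] - lnw[trial]
        if np.log(rng.random()) < arg:
            state = trial
    lnw[state] += f      # penalize the visited state (even after a rejection)
    hist[state] += 1
    if hist.min() > 0.8 * hist.mean():   # crude flatness criterion
        hist[:] = 0
        f *= 0.5                          # refine the modification factor

dF = kT * (lnw - lnw[0])  # F(lambda) estimate, up to an additive constant
print("estimated barrier:", dF.max(), " true:", F_true.max() - F_true[0])
```

Halving the modification factor each time the visit histogram is roughly flat is the standard Wang–Landau refinement schedule; the transition-matrix alternative discussed above gathers its statistics differently but targets the same limiting distribution.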
1.2. Work-Based Methods
Classical thermodynamics relates the difference in free energy between two systems to the work associated with a reversible process that takes one into the other. A straightforward application of this idea leads to the thermodynamic integration (TI) free-energy method, which has a long history and has seen widespread application. The TI method is but one of several approaches in a class based on the connection between ΔF and the work involved in transforming a system from A to B. A very important development in this area occurred recently, when Jarzynski showed that ΔF could be related to the work associated with any such process, not just a reversible one [12–15]. Jarzynski's non-equilibrium work (NEW) approach requires evaluation of an ensemble of
work values, and thus involves repeated transformation from A to B, evaluating the work each time. The connection to the free energy is then

exp(−ΔF/kT) = ⟨exp(−W/kT)⟩,  (2)
where W is the total work, and the angle brackets on the right-hand side indicate an average taken over many realizations of the path from A to B, always starting from an equilibrium A condition. For an equilibrium (reversible) path, the repeated work measurements will each yield exactly the same value (within the precision of the calculations), while for an arbitrary non-equilibrium transformation a distribution of work values will be observed. It is remarkable that these non-equilibrium transformations can be analyzed to yield a quantity related to the equilibrium states.

The instantaneous work w involved in the transformation λ → λ + Δλ will in general depend upon the detailed molecular configuration of the system at the instant of the change. Assuming that there is no process of heat transfer accompanying the transformation, this work is given simply by the change in the total energy of the system

w = E(r^N; λ + Δλ) − E(r^N; λ).  (3)

For sufficiently small Δλ, this difference can be given in terms of the derivative

w = (dE(λ)/dλ)_{r^N} Δλ,  (4)

which can be interpreted in terms of a force acting on the parameter λ. The derivative relation is the natural formulation for use in MD simulations, in which the work is evaluated by integrating the product of this force times the displacement in λ over the complete path. The former expression (Eq. (3)) is more appropriate for MC simulation, in which larger steps in λ are typically taken across the path from A to B.

Thermodynamic integration is perhaps the first method by which free energies were calculated by molecular simulation. Thermodynamic integration methods are usually derived from classical thermodynamics [1], with molecular simulation appearing simply to measure the integrand. As indicated above, TI also derives as a special (reversible) case of Jarzynski's NEW formalism, whereby ΔF = W_rev for the reversible path. The total work W_rev is in turn given by integration of Eq. (4), leading to

ΔF = ∫_{λ_A}^{λ_B} w(λ) dλ.  (5)

Equilibrium values of w are measured in separate simulations at a few discrete λ points along the path. It is then assumed that w is a smooth function
of λ, and simple quadrature formulas (e.g., trapezoid rule) can be applied. The primary mechanism for the failure of TI is the occurrence of a phase transition, and therefore a discontinuity in w, along the path. Otherwise TI has been successfully applied to a very wide variety of systems, dating to the earliest simulations. Its primary disadvantage is that it does not provide direct measurement of the free energy, and if one is not interested in behavior for points along the integration path then another approach might be preferred.

TI approximates a reversible path by smoothing equilibrium, ensemble-averaged, "forces" measured discretely along the path. Alternatively, one can access a reversible path by mimicking a truly reversible process, i.e., by attempting to traverse the path via a slow, continuous transition. In this manner the simulation constantly evolves from system A to system B, such that every MC or MD move is accompanied by a tiny step in λ (or some variation of this protocol). The differential work associated with these changes is accumulated to yield the total work W, which then approximates the free-energy difference. The process may proceed isothermally or adiabatically, the latter being the so-called adiabatic-switch method (which instead yields the entropy difference between A and B) [16]. The weakness of these methods is the uncertainty in whether the evolution of the system is sufficiently slow to be considered reversible. Such concerns can be allayed by implementing the calculation using the Jarzynski free-energy formula, Eq. (2); however, this remedy then requires averaging of repeated realizations of the transition. One is then led to ask whether it is better to average, say, ten NEW passes, or to perform a single switch ten times more slowly.

Free-energy perturbation (FEP) is obtained as the special case of the NEW method in which the transformation from A to B is taken in a single step. Free-energy perturbation is a well established and widely used method. Its principal advantage is that it permits ΔF to be given as an ensemble average over configurations of the A system, removing the complication and expense of defining and traversing a path. The working formula emphasizes this feature:

exp(−βΔF) = ⟨exp[−β(E_B − E_A)]⟩_A.  (6)
A given NEW calculation can in principle be performed in either direction, starting from A and transforming to B, or vice versa. In practice the calculation will give different results when applied in one or the other direction; moreover these results will bracket the correct value of ΔF. The results differ because they are inaccurate, and the fact that they bracket the correct value makes it tempting to take their average as the "best" result. But this practice is not a good idea, because the magnitude of the inaccuracies is in general not the same for the two directions [17, 18]. In fact, it is not uncommon for one direction to provide the right result while the other yields an inaccurate one. But it is also not uncommon in other cases for the average to give a better estimate than either direction individually. The point is that one often does not know what
is the best way to interpret the results. The more careful practitioners will apply sufficient calculation (and perhaps use sufficient stages) until a point is reached at which the results from the two directions match each other. However, this practice can be wasteful. To understand the problem and its remedy it is helpful to consider the systems A and B from the perspective of configuration space.
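The asymmetry can be seen in a few lines of Python. In the sketch below the Gaussian distributions of the perturbation energy ΔE = E_B − E_A are hypothetical stand-ins for real simulation output (their means and widths are chosen only so that both directions share the same underlying ΔF of 0.875 kT); the estimators themselves follow Eq. (6) and its reverse-direction analog.

```python
import numpy as np

kT = 1.0
rng = np.random.default_rng(2)

# Stand-in samples of dE = E_B - E_A; in practice these would come from
# MC/MD sampling of the A and B ensembles, respectively.
dE_A = rng.normal(2.00, 1.5, 100_000)   # dE sampled in the A ensemble
dE_B = rng.normal(-0.25, 1.5, 100_000)  # dE sampled in the B ensemble

# Forward FEP, Eq. (6): exp(-dF/kT) = < exp(-dE/kT) >_A
dF_fwd = -kT * np.log(np.mean(np.exp(-dE_A / kT)))
# Reverse FEP (B as reference): exp(+dF/kT) = < exp(+dE/kT) >_B
dF_rev = +kT * np.log(np.mean(np.exp(+dE_B / kT)))

# With finite sampling the two estimates differ and tend to bracket the
# true value; the gap between them is a rough diagnostic of inaccuracy.
print(f"forward: {dF_fwd:.3f}   reverse: {dF_rev:.3f}   true: 0.875")
```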
1.3. Configuration Space
Configuration space is a high-dimensional space of all molecular configurations, such that any particular arrangement of the N atoms in real space is represented by a single point in 3N-dimensional configuration space (more generally we may consider 6N-dimensional phase space, which includes also the momenta) [19]. An arbitrary point in configuration space will typically describe a configuration that is unrealistic and unimportant, in the sense that one would not expect ever to observe the configuration arise spontaneously in the course of the system's natural dynamics. For example, it might be a configuration in which two atoms occupy overlapping positions. Configuration space will of course contain points that do represent realistic, or important, configurations, ones that are in fact observed in the system. It is helpful to consider the set Γ* of all such configurations, as we do schematically in Fig. 1. The enclosing square represents the high-dimensional configuration space Γ, and the ovals drawn within it represent (in a highly simplified manner) the set of all important configurations for the systems.

The concept of "important configurations" is relevant to free-energy calculations because the ease with which a reliable (accurate) free-energy difference can be measured depends largely on the relation between the Γ* regions of the two systems defining the free-energy difference. There are five general possibilities [20], summarized in Fig. 1. In a FEP calculation perturbing from A to B, the simulation samples the region labeled Γ*A and at intervals it examines its present configuration and gauges its importance to the B system. Three general outcomes are possible for the difference E_B − E_A seen in Eq. (6): (a) it is a large positive number and the contribution to the FEP average is small; this occurs if the point is in Γ*A but not in Γ*B; (b) it is a number of order unity, and a significant contribution is made to the FEP average; this occurs if the point is in Γ*A and in Γ*B; or (c) it is a large negative number, and an enormous contribution is made to the FEP average; this occurs if the point is not in Γ*A but is in Γ*B. The third case will arise rarely if ever, because the sampling is by definition largely confined to the region Γ*A. This contradiction (a large contribution made by a configuration that is never sampled) is the source of the inaccuracy in FEP calculation, and it arises if any part of Γ*B lies outside of Γ*A.
Figure 1. Schematic depiction of types of structures that can occur for the region of important configurations involving two systems. The square region represents all of phase space Γ, and the filled regions are the important configurations Γ*A and Γ*B for the systems "A" and "B", as indicated. (a) simple case in which Γ*A and Γ*B are roughly coincident, and there is no significant region of one that lies outside the other; (b) case in which the important configurations of A and B have no overlap, and energetic barriers prevent each from sampling the other; (c) case in which one system's important configurations are a wholly contained, not-very-small subset of the other's; (d) case in which Γ*B is a very small subset of Γ*A; (e) case in which Γ*A and Γ*B overlap, but neither wholly contains the other.
This observation leads us to the most important rule for the reliable application of FEP: the reference and target systems must obey a configuration-space subset relation. That is, the important configuration space of the target system (B) must be wholly contained within the important configuration space of the system governing the sampling (A). Failure to adhere to this requirement will lead to an inaccurate result. Note that the asymmetry of the relation "is a subset of" is directly related to the asymmetry of the FEP calculation. Exchange of the roles of A and B as target or reference can make or break the accuracy of the calculation.

For example, consider the free-energy change associated with the addition of a molecule to the system. In this case, ΔF equals the excess chemical potential. The A system is one in which the "test" molecule has no interaction with the others, and the B system is one in which it interacts as all the other molecules do. Any configuration in which the test molecule overlaps another molecule is not important to B but is (potentially) important to A – the B system may be a subset of A, while A is most certainly not a subset of B. Whether all of Γ*B is within Γ*A cannot be stated for the general case. In more complex
systems (e.g., water) it is likely that there are configurations sampled by B that would not be important to A, while in simpler systems (a Lennard–Jones fluid at moderate density) the subset relation is satisfied.

This black-and-white picture, in which the Γ* regions are well defined with crisp boundaries, presents only a conceptual illustration of the nature of the calculations. In reality the "importance" of a given configuration (point in Γ) is not so clear-cut, and the Γ* regions for the A and B systems may overlap in shades of gray (i.e., degrees of importance). The discussion here is given in the context of a FEP calculation, but the same ideas are relevant to the more general NEW calculation. Each increment of work performed in a NEW calculation must adhere to the subset relation too. The difference with NEW is that if the change is made sufficiently slowly (approaching reversibility), then the important phase spaces at each step will differ by only small amounts (cf. Fig. 1(a)), and the subset relation will be satisfied. To the extent that a NEW calculation is performed irreversibly, the issue of inaccuracy and asymmetry becomes increasingly important.
1.4. Staging Strategies
In practice one is confronted with a pair of systems for which ΔF is desired, and there is no control over whether their Γ* regions satisfy a subset relation. Yet FEP and NEW cannot be safely applied unless this condition is met. Two remedies are possible. Phase space can be redefined, such that a given point in it can represent different configurations for the A and B systems [21–23]. This approach has been applied to evaluate free-energy differences between crystal structures (e.g., fcc vs. bcc) of a given model system. The phase-space points are defined to represent deviations from a perfect-crystal configuration, and the reference crystal is defined differently for the two systems. The switch from A to B entails swapping the definition of the reference crystal while keeping the deviations (i.e., the redefined phase-space point) fixed. With this transformation, two systems having disjoint Γ* regions are redefined such that their Γ* regions at least have significant overlap, and perhaps obey the subset requirement.

Multiple staging is a more general approach to deal with systems that do not satisfy the subset relation [24–26]. Here the desired free-energy difference is expressed in terms of the free energy of one or more intermediate systems, typically defined only to facilitate the free-energy calculation. Thus,

ΔF = (F_B − F_M) + (F_M − F_A),  (7)
where M indicates the intermediate. Free-energy methods are then brought to bear to evaluate separately the two differences, between the B and M systems and between the M and A systems, respectively. The M system should be defined such that a subset relation can be formed between it and both the A and B systems. There are
several options to this end, depending on the Γ* relation in place for the A and B systems. Figure 2 summarizes the possibilities, and the cases are named as follows:

• Umbrella sampling. Here M is formulated to contain both A and B, and sampling is performed from it into each [27].
• Funnel sampling. This is possible only if B is already a subset of A. Then M is defined as a subset of A and superset of B, and each perturbation stage is performed accordingly [20, 25, 28].
• Overlap sampling. Here M is formulated to be a subset of both A and B, and sampling is performed on each with perturbation into M [29].

General ways to define M to satisfy these requirements are summarized in Table 1, which also lists the general working equations for each multistage scheme. Umbrella sampling is a well-established method but it has only recently been viewed from the perspective given here. Bennett's acceptance-ratio method is a particular type of overlap sampling in which an optimal
Figure 2. Schematic depiction of types of structures that can occur for the region of important configurations involving two systems and a weight system formulated for multistage sampling. The square region represents all of phase space, and the filled regions are the important configurations Γ*A, Γ*B, and Γ*M for the systems A, B, and M, as indicated. (a) a well formulated umbrella potential defines important configurations that have both Γ*A and Γ*B as subsets; (b) a safely formulated funnel potential is needed to focus sampling on a tiny set of configurations Γ*B while still representing all configurations important to A; (c) a well formulated overlap potential, with important configurations formed as a subset of both the A and B systems.

Table 1. Summary of staging methods for free-energy perturbation calculations

Method             Formula for e^{−β(F_B−F_A)}                    Preferred staging potential, e^{−βE_M}
Umbrella sampling  ⟨e^{−β(E_B−E_M)}⟩_M / ⟨e^{−β(E_A−E_M)}⟩_M      e^{−β(E_A−F_A)} + e^{−β(E_B−F_B)}
Funnel sampling    ⟨e^{−β(E_M−E_A)}⟩_A ⟨e^{−β(E_B−E_M)}⟩_M        No general formulation
Overlap sampling   ⟨e^{−β(E_M−E_A)}⟩_A / ⟨e^{−β(E_M−E_B)}⟩_B      [e^{+β(E_A−F_A)} + e^{+β(E_B−F_B)}]^{−1}
M is selected to minimize the variance of the ΔF estimate; it is a highly effective and underappreciated method. The funnel-sampling multistage scheme is new, and a general, effective formulation for an M system appropriate to it has not yet been identified. Overlap sampling and umbrella sampling are not particularly helpful if A and B already satisfy the subset relation – they do not give much better precision than a simple single-stage FEP calculation taken in the appropriate direction. However, if implemented correctly they do provide some measure of safety against problems of inaccuracy, which is useful because one usually does not know clearly the nature of the phase-space relation for the A and B systems, and whether (and which way) a single-stage calculation is safe to perform between them.
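As an illustration of the overlap-sampling row of Table 1, the Python sketch below uses the deliberately simple (and non-optimal) intermediate E_M = (E_A + E_B)/2, for which E_M − E_A = ΔE/2 and E_M − E_B = −ΔE/2. The Gaussian ΔE samples are the same hypothetical stand-ins used in the earlier FEP sketch; Bennett's acceptance-ratio method would instead choose M to minimize the variance.

```python
import numpy as np

kT = 1.0
rng = np.random.default_rng(3)

# Stand-in samples of dE = E_B - E_A from each ensemble (see earlier sketch)
dE_A = rng.normal(2.00, 1.5, 100_000)
dE_B = rng.normal(-0.25, 1.5, 100_000)

# Overlap sampling with E_M = (E_A + E_B)/2 (Table 1, last row):
# exp(-dF/kT) = < exp(-dE/2kT) >_A / < exp(+dE/2kT) >_B
num = np.mean(np.exp(-dE_A / (2 * kT)))
den = np.mean(np.exp(+dE_B / (2 * kT)))
dF_os = -kT * np.log(num / den)
print(f"overlap-sampling estimate: {dF_os:.3f}   true: 0.875")
```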
2. Methods for Evaluation of Phase Coexistence
Our perspective now shifts to the calculation of phase coexistence by molecular simulation, for which free-energy methods play a major role. Applications in this area have exploded over the past decade or so, owing to fundamental advances in algorithms, hardware, and molecular models. Some of the methods and concepts surveyed here have been discussed in more detail in recent reviews [30, 31].
2.1. What is a Phase?
An order parameter is a statistic for a configuration. It is a number (or perhaps a vector, tensor, or some other set of numbers) that can be calculated or measured for a system in a particular configuration, and that in some sense quantifies the configuration. Examples include the density, the mole fraction in a mixture, the magnetic moment of a ferromagnet, and so on. Some molecular order parameters are formulated as expansion coefficients of an appropriate distribution function rendered in a suitable basis set. For example, a natural choice for crystalline translational order parameters is the value of the structure factor for an appropriate wave vector k. Orientational order parameters are widely used in the field of liquid crystals, and a common choice is based on expansion of the orientation distribution in Legendre polynomials. Usually an order parameter is defined such that it has a physical manifestation that can be observed experimentally.

A thermodynamic phase is the set of all configurations that have (or are near) a given value of an order parameter. Phases are important because a system will spontaneously change its phase in response to some external perturbation.
In doing so, the configurations exhibited by the system change from those associated with one value of the order parameter to those of another. Usually such a large shift in the predominant configurations will cause the system's physical properties (mechanical, electrical, optical, etc.) to change in ways that might be very useful. A well known example is the boiling of a liquid to form a vapor. In response to a small change in temperature, the observed configurations of the system go from those corresponding to a large density to those for a much smaller density. In both cases the system (being at fixed pressure) is free to adopt any desired density. In changing phase it overwhelmingly selects configurations for one density over another. This phenomenon, and its many variants, has a multitude of practical applications.

Clearly, there is a close connection between this molecular picture of a phase transformation and the ideas presented above about the important phase space for a system. When a system changes phase, it is actually changing its important phase space, and the Γ* regions for the system before and after the change can relate in any of the ways described in Fig. 1. Analysis of the free energy is required to identify the location of the phase change quantitatively. Often the order parameter describing the phase change serves as the path parameter λ when performing this analysis.
2.2. Conditions for Phase Equilibria
In a typical phase-equilibrium problem one is interested in the two (or more) phases involved in the transformation. At the exact condition at which one becomes favored over the other, both are equally stable. Molecular simulation is applied to locate this point of phase equilibrium and to characterize the coexisting phases. Formally, the thermodynamic conditions of coexistence can be identified as those minimizing an appropriate free energy, or equivalently by finding the states in which the intensive "field" variables of temperature, pressure, and chemical potential (and perhaps others) are equal among the candidate phases. Most methods for evaluation of phase equilibria by molecular simulation are based on identifying the conditions that satisfy the thermodynamic phase-coexistence criteria, and consequently they require evaluation of free energies or a free-energy difference. Still there is a lot of variability in the approaches, because there are really two problems involved in the calculation. The first is the measurement of the thermodynamic properties, particularly the free energy, while the second is the numerical "root-finding" problem of locating the coexistence conditions. Methods differ largely in the way they combine these two numerical problems, and the most effective and popular methods synthesize these calculations in elegant ways.
2.3. Direct Contact of Phases, Spontaneous Transformations
Before turning to the free-energy based approaches for evaluating phase coexistence, it is worthwhile to consider the more intuitive approaches that mimic the way phase transitions are studied experimentally. By this we mean methods in which a system is simulated and the phase it spontaneously adopts is identified as the stable thermodynamic phase. Two general approaches can be taken, depending on the types of variables that are fixed in the simulation (i.e., the governing ensemble).

In the first case, only one size variable is imposed (typically the number of molecules), and the remaining variables are fields (temperature, pressure, chemical potential difference). Then a scan is made of one or more of the fields (e.g., the temperature is increased), and one looks for the condition at which the phase changes spontaneously (e.g., the system undergoes a sudden expansion). For example, the temperature at which this happens, and the conditions of the phases before and after the transition, characterize the coexistence point. In practice this method is effective only for producing a coarse description of the phase behavior. It is very easy for a system to remain in a metastable condition as the field variable moves through the transition point, and the spontaneous transformation may occur at a point well beyond the true value. The reverse process is susceptible to the same problem, so the transformation process exhibits hysteresis when the field is cycled back and forth through the transition value.

In the second case, two or more extensive variables are imposed (i.e., the number of molecules and the volume), and the system is simulated at a condition inside the two-phase region. A macroscopic system in this situation would separate into the two phases, and both would coexist in the given volume. In principle, this too happens in a molecular simulation, but usually the system size is not sufficiently large to wash out effects due to the presence of the interface. In effect, neither bulk phase is simulated. Nevertheless, the direct-contact method does work in some situations. Solid–fluid phase behavior has been studied this way. The interface is slow to equilibrate in this system, so one must be careful to ensure that the simulation begins with a well equilibrated solid. Vapor–liquid equilibria have also been examined using direct contact of the phases. Of course, this approach cannot be applied too close to the critical point. Often such systems are examined because the interfacial properties are themselves of direct interest.

Spontaneous formation of phases has been used recently to examine the behaviors of models that exhibit complex morphologies. Glotzer et al. have examined the mesophases formed by a wide variety of model nanoparticles, including hard particles with tethers, and particles with sticky patches [32].
The systems have been observed to spontaneously form many complex structures, including columns, lamellae, micelles, sheets, double layers, gyroid phases, and so on. The question of the absolute stability of the observed structures remains, but their spontaneous formation is a strong indicator that they are certainly relevant, and could likely be the most stable of all possible phases at the simulated conditions.

The phase behaviors of other types of mesoscale models are also studied through the direct-observation methods. Systems modeled using dissipative particle dynamics [2, 33] are good candidates for this treatment, because they have a very soft repulsion and particles can in effect pass through each other; as a consequence they equilibrate very quickly.
2.4. Methods Based on Solution of Thermodynamic Equalities
A well-worn approach to the free-energy based evaluation of phase equilibria focuses on satisfying the coexistence conditions given in terms of equality of the field parameters. In this approach each phase is studied separately, and state conditions are varied systematically until the coexistence conditions are met. An effective way to attack this problem is to combine the search for the coexistence point with the evaluation of the free energy through thermodynamic integration. For example, to evaluate a vapor–liquid coexistence point, one can start with a subcooled liquid of known chemical potential (evaluated using any of the methods reviewed above), and proceed with a series of isothermal–isobaric simulations following a line of decreasing pressure. At each point the chemical potential can be evaluated through thermodynamic integration using the measured density

μ(P) = μ(P₀) + ∫_{P₀}^{P} dp/ρ(p).  (8)
A similar series of simulations can be performed in the vapor separately, at the same temperature as the liquid simulations, but increasing the pressure toward the point of saturation (alternatively, an equation of state might be applied to characterize the vapor). Once the liquid and vapor simulations reach an overlapping range of pressures, the chemical potentials computed according to Eq. (8) can be examined at each pressure, until the point is found at which the chemical potential is equal across the two phases for a given pressure. This general approach can be somewhat tedious to implement, but it is perhaps the most robust of all methods. It is likely to provide a good result for almost all types of coexistence. It has been applied to many types of phase equilibria, including those involving solids [34], liquid crystals [35], plastic
crystals, as well as fluids. The search for the coexistence condition can be applied using almost any order parameter (density was used in this example), although one must perhaps put some effort toward developing the appropriate formalism defining a field to couple to the parameter, and implementing a simulation in which this field is applied. Complications arise if many field parameters are relevant. For example, if one is studying a mixture, then a separate field parameter (chemical potential) is needed to couple to each mole-fraction variable. The problem can be simplified by fixing all but one of the field variables in the two phases, but often this leads to a statement of the coexistence problem that is at odds with the problem of real interest (e.g., one might want to know the composition of the incipient phase arising from another phase of given composition, which in the context of vapor–liquid equilibria is known as a bubble-point or a dew-point calculation). For mixtures, this formulation is expressed by the semigrand ensemble [36].

This method, like many others, will suffer when applied to characterize a weak phase transition, that is, one that is accompanied by only a small change in the relevant order parameter. The order parameter is related to the slope of the line that is being mapped in this calculation, and consequently for a weak transition the slopes of these lines for the two phases will not be very different from each other. It can be difficult to locate precisely the intersection of two nearly parallel lines – any errors in the position of the lines will have a greatly magnified effect on the error in the point of intersection. Therefore the application of this method to a weak transition can fail if the relevant ensemble averages and the free energies for the initial points of the integration are not measured with high precision and accuracy.
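A minimal Python sketch of this procedure is given below. The density data and reference chemical potentials are hypothetical stand-ins for simulation output (in a real calculation each density would come from an isothermal–isobaric run, and μ(P₀) from a prior free-energy calculation); the sketch applies Eq. (8) by trapezoid rule to each branch and then solves the root-finding problem for the pressure at which the two chemical potentials cross.

```python
import numpy as np

kT = 1.0

# Hypothetical stand-in data: densities rho(p) "measured" for each phase on a
# common isotherm pressure grid, plus reference chemical potentials at p[0].
p = np.linspace(0.02, 0.12, 11)
rho_liq = 0.80 + 0.5 * p                 # dense, weakly compressible branch
rho_vap = p / (kT * (1.0 + 2.0 * p))     # dilute, near-ideal branch
mu_liq0, mu_vap0 = -3.05, -3.60

def mu_curve(mu0, rho):
    """Eq. (8): mu(P) = mu(P0) + int_{P0}^{P} dp/rho(p), trapezoid rule."""
    f = 1.0 / rho
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(p)
    return mu0 + np.concatenate(([0.0], np.cumsum(steps)))

d = mu_curve(mu_liq0, rho_liq) - mu_curve(mu_vap0, rho_vap)

# Root-finding step: locate the sign change of mu_liq - mu_vap and interpolate
k = np.flatnonzero(np.sign(d[:-1]) != np.sign(d[1:]))[0]
p_sat = p[k] - d[k] * (p[k + 1] - p[k]) / (d[k + 1] - d[k])
print(f"estimated saturation pressure: {p_sat:.4f}")
```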
2.5. Gibbs Ensemble
A breakthrough in technique for the evaluation of phase coexistence by molecular simulation arrived in 1987 with the advent of the Gibbs ensemble [37]. This method presents a very clever synthesis of the problem of locating the conditions of coexistence and measuring the free energy in the candidate phases. It accomplishes this through the simulation of both phases simultaneously, each occupying its own simulation volume. Although the phases are not in “physical” contact, they are in contact thermodynamically. This means that they are capable of exchanging volume and mass in response to the thermodynamic driving forces of pressure and chemical potential difference, respectively. The systems evolve in this way, increasing or decreasing in density with the mass and volume exchanges, until the point of coexistence is found. Upon reaching this condition the systems will fluctuate in density about the values appropriate for the equilibrium state, which can then be measured as a simple
ensemble average. Details of the method are available in several reviews and texts [2, 37, 38]. The Gibbs ensemble is the method of choice for straightforward evaluation of vapor–liquid and liquid–liquid equilibria. It does not suffer any particular complications when applied to mixtures, and it has been applied with great success to many phase coexistence calculations. However, there are several ways in which it can fail.

First, an essential element of the technique is the exchange of molecules at random between the coexisting phases. If trials of this type are not accepted with sufficient frequency, the systems will not equilibrate and a poor result is obtained. This problem arises in applications to large, complex molecules, and/or at low temperatures and high densities. It can be overcome to a useful degree through the application of special sampling techniques, such as configurational bias.

Second, in its basic form the Gibbs ensemble is not applicable to equilibria involving solids, or to lattice models. The problem is only partially due to the difficulty of inserting a molecule into a solid. The "mass balance" is the more insidious obstacle. The number of molecules present in each phase at equilibrium is set by the initial number of molecules and the volume of the composite system of both phases (as well as the values of the coexistence densities). A defect-free crystal can be set up in a periodic system using only a particular number of molecules. For example, an fcc lattice in cubic periodic boundaries can be set up using 32, 108, 256, 500, and so on molecules (i.e., 4n³, where n is an integer). When beginning a Gibbs ensemble calculation there is no simple way to ensure this condition will be met in the equilibrium system. Tilwani and Wu [39] have treated these problems with an alternative approach in which an atom is added to the unit box of the solid and this new unit box is used to fill up (tile) space. In this way, particles can be added or removed from the system, while the crystal structure is maintained.

The Gibbs ensemble fails also upon approach to the critical point. As this condition is reached, contributions to the averages increase for densities in the region between the two phases. It then becomes possible, even likely, that the simulated phases will swap their roles as the liquid and vapor phases. This is not a fatal flaw, but it presents a complication to the method, and it is an indicator that the general approach is beginning to fail. Thus the consensus today is that in this region of the phase envelope density-of-states methods are more suitable for characterizing the coexistence behavior. More generally, the Gibbs ensemble can encounter difficulty when applied to any weak phase transition, if only because it is necessary to configure the composite system so that it lies in the two-phase region – this can be difficult to do if this region is very narrow.

Interestingly enough, the Gibbs ensemble can fail also if it is applied using very large system sizes. In this situation an interface is increasingly likely to form in one or both phases, and the result is that a clean separation of phases between the volumes is no longer in place – instead both
simulation volumes each end up representing both phases. Typically the Gibbs ensemble is applied for its simplicity and ability to provide quick results, so the large systems needed to raise this problem are not usually encountered.
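The molecule-exchange step mentioned above is usually the bottleneck. A minimal sketch of its acceptance rule, in the standard form given in Refs. [2, 37], is shown below; the function and its arguments are illustrative names, and dU is the total potential-energy change produced by the trial deletion and insertion.

```python
import numpy as np

def p_accept_transfer(n_donor, v_donor, n_recip, v_recip, dU, beta):
    # Gibbs-ensemble acceptance for moving one molecule from the donor box
    # to the recipient box at fixed total N and V:
    # min[1, (N_don * V_rec / ((N_rec + 1) * V_don)) * exp(-beta * dU)]
    weight = n_donor * v_recip / ((n_recip + 1) * v_donor)
    return min(1.0, weight * np.exp(-beta * dU))

# Example: attempt to move a molecule from a dense box into a dilute one
print(p_accept_transfer(n_donor=400, v_donor=500.0,
                        n_recip=40, v_recip=2000.0, dU=3.5, beta=1.0))
```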
2.6. Gibbs–Duhem Integration
The Gibbs–Duhem integration (GDI) method [40] applies thermodynamic integration to both parts of the combined problem of evaluating the free energy and locating the point of transition. In particular, the path of integration is constructed to follow the line of coexistence. All of this is neatly packaged by the Clapeyron differential equation for the coexistence line, which in the pressure–temperature plane is [1]

(dP/dT)_σ = ΔH/(T ΔV),  (9)
where ΔH and ΔV are the differences in molar enthalpy and molar volume, respectively, between the two phases; the σ subscript indicates a path along the coexistence line. The GDI procedure treats Eq. (9) as a numerical problem of integrating an ordinary differential equation. The complication, of course, is that the right-hand side must be evaluated through molecular simulation at the temperature and pressure specified by the integration procedure, and moreover separate simulations are required to characterize both phases involved in the difference. A simple iterative process is applied to refine the pressure according to Eq. (9) after a step in temperature is taken, using preliminary results for the ensemble averages from the simulations. Predictor–corrector methods are effective in performing the integration, and inasmuch as the primary error in the calculation arises from the imprecision of the ensemble averages, a low-order integration scheme suffices for the purpose.

The GDI method applies much more broadly than indicated in this description. Any type of field variable can be used in the role held by pressure and temperature in Eq. (9), with appropriate modification to the right-hand side. For example, integrations have been performed along paths of varying composition, polydispersity, orientational order, and interparticle-potential softness, rigidity, or shape [36]. The method applies equally well to equilibria involving fluids or solids, or other types of phases. It has been used to follow three-phase coexistence lines too. In this application one must integrate two differential equations similar to Eq. (9), involving three field variables. In all cases there are a number of practical implementation issues to consider, such as how the integration is started, and the proper selection of the functional form of the field variables (e.g., integration in ln(P) vs. 1/T has advantages for tracing
vapor–liquid coexistence lines). These issues have been discussed in some detail in recent reviews [36, 41].

The GDI method has some limitations. It does require an initial point of coexistence in order to begin the integration procedure. Concerns are often expressed that errors in this initial point will propagate throughout the integration, but this problem is not as bad as one might think. A stability analysis shows that any such errors will be attenuated if the integration is performed in a direction from a weaker to a stronger transition (e.g., away from the liquid–vapor critical point toward lower temperatures). On the other hand, if the integration is performed in the opposite direction, initial and accumulated errors will be amplified. Regardless, it seems that in practice any such problems do not arise. A related concern is the general difficulty in treating weak phase transitions. If the differences on the right-hand side of Eq. (9) are small, and thus may be formed using averages that have stochastic errors comparable to the differences themselves, then it is clear that the method will not work well. In such cases one might be better off employing a method that directly bridges the difference between the phases, such as by mapping the full density of states in this region.

The basic idea of tracing coexistence lines has been further generalized for mapping of other classes of phase equilibria, such as tracing of azeotropes [42], and dew/bubble-point lines [41]. Escobedo has developed and applied a general framework for these approaches [30, 43–47].
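The numerical core of GDI fits in a few lines of Python, sketched below. The function measure_dH_dV is a hypothetical stand-in that returns the enthalpy and volume differences between the phases at a given (T, P); in a real calculation each call would correspond to one simulation of each phase. The loop applies a simple predictor–corrector step to Eq. (9), consistent with the observation that a low-order scheme suffices.

```python
# Hypothetical stand-in for simulation measurements of the phase-property
# differences at state point (T, P); toy functional forms, illustration only.
def measure_dH_dV(T, P):
    dH = 8.0 - 2.0 * (T - 1.0)
    dV = 9.5 / P - 1.2
    return dH, dV

T, P = 1.00, 0.10   # known initial coexistence point
dT = -0.02          # integrate away from the critical point (toward lower T)
for _ in range(10):
    dH, dV = measure_dH_dV(T, P)
    slope = dH / (T * dV)            # Clapeyron right-hand side, Eq. (9)
    P_pred = P + slope * dT          # predictor (Euler) step
    T += dT
    dH2, dV2 = measure_dH_dV(T, P_pred)
    P += 0.5 * (slope + dH2 / (T * dV2)) * dT   # trapezoid corrector
    print(f"T = {T:.3f}   P_sat = {P:.4f}")
```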
2.7. Mapping the Density of States
Density-of-states methods evaluate coexisting phases by calculating the full free-energy profile across the range of values of the order parameter between and including the two phases. It is only in the past few years that this method has come to be viewed as generally viable, and even a good choice for evaluating phase coexistence. The effort involved in collecting information for the intermediate points seems wasteful, although with this approach these data are needed to obtain the relative free energies of the real states of interest (i.e., the coexisting phases). The methods reviewed above are popular because they avoid this complication and are more efficient because of it.

However, there is some advantage in having the system cycle through the uninteresting states. First, it helps to move the sampling through phase space. Thus, a simulated system might go from a liquid configuration, then to a vapor, and back to the liquid but in a very different configuration from which it started. This is particularly important for complex fluids such as polymers (in the context of other phase equilibria), in which it is otherwise difficult to escape from ergodic traps. Second, the intermediate states may be of interest in themselves; they can be used, for example, to evaluate the surface tension associated with contacting the two
phases [10]. Third, it may be that the distance between the coexisting phases is not so large (i.e., the transition is weak), so covering the ground between them does not introduce much expense; moreover, in such a situation other methods do not work very well. Regardless, continuing improvements in computing hardware and algorithms (some reviewed above), particularly in parallel methods and architectures, have made the density-of-states strategy look much more appealing.

We describe the basic approach in the context of vapor–liquid equilibria. Simulation can be performed in the grand-canonical ensemble with a chemical potential selected to be in the vicinity of the coexistence value. The density of states is mapped as a function of the number of molecules at fixed volume; the transition-matrix method with a biasing potential in N has been found to be convenient and effective in this application. The resulting density of states will most likely exhibit two unequal peaks, representing the two nearly coexisting phases. Histogram reweighting is then applied to the density of states to determine the value of the chemical potential that makes the peaks equal in size. This is taken to be the coexistence value of the chemical potential, and the positions of the peaks give the molecule numbers (densities) of the coexisting phases. The coexistence pressure can be determined from the grand potential, which is available from the density of states. Additional details are presented by Errington [9].
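The reweighting step is simple enough to sketch in full. In the Python fragment below, lnPi0 is a hypothetical stand-in for the measured particle-number distribution at chemical potential mu0 (in practice it would come from a transition-matrix or Wang–Landau run); the grand-canonical reweighting identity ln Π(N; μ) = ln Π(N; μ₀) + β(μ − μ₀)N is then combined with bisection to find the μ that gives the vapor and liquid peaks equal weight.

```python
import numpy as np

kT = 1.0
beta = 1.0 / kT
N = np.arange(0, 601)
mu0 = -3.0

# Stand-in for a measured ln Pi(N) at mu0: two unequal Gaussian peaks
# representing the vapor (small N) and liquid (large N) phases.
lnPi0 = np.logaddexp(-0.5 * ((N - 60) / 25.0) ** 2,
                     1.5 - 0.5 * ((N - 480) / 30.0) ** 2)

def lse(x):                      # log-sum-exp, numerically safe peak weight
    m = x.max()
    return m + np.log(np.exp(x - m).sum())

def imbalance(mu, split=250):
    # Reweighting: ln Pi(N; mu) = ln Pi(N; mu0) + beta (mu - mu0) N
    lnPi = lnPi0 + beta * (mu - mu0) * N
    return lse(lnPi[N < split]) - lse(lnPi[N >= split])  # ln(W_vap/W_liq)

lo, hi = mu0 - 1.0, mu0 + 1.0    # bracket; imbalance decreases with mu
for _ in range(60):              # bisection to equal peak weights
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if imbalance(mid) > 0 else (lo, mid)
print(f"estimated coexistence mu: {0.5 * (lo + hi):.4f}")
```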
3. Outlook
The nature of the questions that we address with the help of computer simulations is changing. Increasingly, we wish to be able to predict the changes that will occur in a system when external conditions (e.g., temperature, pressure or the chemical potential of one or more species) are changed. In order to predict the stable phase of a many-body system, or the "native" conformation of a macromolecule, we need to know the accessible volume in phase space that corresponds to this state or, in other words, its free energy.

Both the MC and the MD methods were created in effectively the form in which we use them today. However, the techniques used to compute free-energy differences have expanded tremendously and have become much more powerful and much more general than they were only a decade ago. Yet, the roots of some of these techniques go back a long way. For instance, the density-of-states method was already considered in the late 1950s [48] and was first implemented in the 1960s [49]. The aim of the present chapter is to provide a (very concise) review of some of the major developments. As the developments are in a state of flux, this review provides nothing more than a snapshot.
It is always risky to identify challenges for the future, but some seem clear. First of all, it would seem that there must be a quantum-mechanical counterpart to Jarzynski's NEW method. However, it is not at all obvious that this would lead to a tractable computational scheme.

A second challenge has to do with the very nature of free energy. In its most general (Landau) form, the free energy of a system is a measure of the available phase space compatible with one or more constraints. In the case of the Helmholtz free energy, the quantities that we constrain are simply the volume V and the number of particles N. However, when we consider the pathway by which a system transforms from one state to another, the constraint may correspond to a non-thermodynamic order parameter. In simple cases, we know this order parameter, but often we do not. We know the initial and final states of the system, and hopefully the transformation between the two can be characterized by one, or a few, order parameters. If such a low-dimensional picture is correct, it is meaningful to speak of the "free-energy landscape" of the system. However, although methods exist to find pathways that connect initial and final states in a barrier-crossing process [50], we still lack systematic ways to construct optimal low-dimensional order parameters to characterize the transformation of the system.

To date, most successful schemes to map free-energy landscapes assume that the true reaction coordinates are spanned by a relatively small set of supposedly relevant coordinates. However, it is not obvious that it will always be possible to find such coordinates. Yet, without a physical picture of the constraint or reaction coordinate, free-energy surfaces are hardly more informative than the high-dimensional potential-energy surface from which they are ultimately derived. Without this knowledge we can still compute the relative stability of the initial and final states (provided we have a criterion to distinguish the two), but we will be unable to gain physical insight into the factors that affect the rate of transformation from the metastable to the stable state.
Acknowledgments

DAK's activity in this area is supported by the U.S. Department of Energy, Office of Basic Energy Sciences. The work of the FOM Institute is part of the research program of FOM and is made possible by financial support from the Netherlands Organization for Scientific Research (NWO).
References

[1] K. Denbigh, Principles of Chemical Equilibrium, Cambridge University Press, Cambridge, 1971.
[2] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, 2002.
[3] F. Wang and D.P. Landau, "Determining the density of states for classical statistical models: a random walk algorithm to produce a flat histogram," Phys. Rev. E, 64, 056101-1–056101-16, 2001a.
[4] F. Wang and D.P. Landau, "Efficient, multiple-range random walk algorithm to calculate the density of states," Phys. Rev. Lett., 86, 2050–2053, 2001b.
[5] A. Laio and M. Parrinello, "Escaping free-energy minima," Proc. Nat. Acad. Sci., 99, 12562–12566, 2002.
[6] M. Fitzgerald, R.R. Picard, and R.N. Silver, "Canonical transition probabilities for adaptive Metropolis simulation," Europhys. Lett., 46, 282–287, 1999.
[7] J.-S. Wang, T.K. Tay, and R.H. Swendsen, "Transition matrix Monte Carlo reweighting and dynamics," Phys. Rev. Lett., 82, 476–479, 1999.
[8] M. Fitzgerald, R.R. Picard, and R.N. Silver, "Monte Carlo transition dynamics and variance reduction," J. Stat. Phys., 98, 321, 2000.
[9] J.R. Errington, "Direct calculation of liquid–vapor phase equilibria from transition matrix Monte Carlo simulation," J. Chem. Phys., 118, 9915–9925, 2003a.
[10] J.R. Errington, "Evaluating surface tension using grand-canonical transition-matrix Monte Carlo simulation and finite-size scaling," Phys. Rev. E, 67, 012102-1–012102-4, 2003b.
[11] M.S. Shell, P.G. Debenedetti, and A.Z. Panagiotopoulos, "An improved Monte Carlo method for direct calculation of the density of states," J. Chem. Phys., 119, 9406–9411, 2003.
[12] C. Jarzynski, "Equilibrium free-energy differences from nonequilibrium measurements: a master-equation approach," Phys. Rev. E, 56, 5018–5035, 1997a.
[13] C. Jarzynski, "Nonequilibrium equality for free energy difference," Phys. Rev. Lett., 78, 2690–2693, 1997b.
[14] G.E. Crooks, "Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems," J. Stat. Phys., 90, 1481–1487, 1998.
[15] G.E. Crooks, "Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences," Phys. Rev. E, 60, 2721–2726, 1999.
[16] M. Watanabe and W.P. Reinhardt, "Direct dynamical calculation of entropy and free energy by adiabatic switching," Phys. Rev. Lett., 65, 3301–3304, 1990.
[17] N.D. Lu and D.A. Kofke, "Accuracy of free-energy perturbation calculations in molecular simulation I. Modeling," J. Chem. Phys., 114, 7303–7311, 2001a.
[18] N.D. Lu and D.A. Kofke, "Accuracy of free-energy perturbation calculations in molecular simulation II. Heuristics," J. Chem. Phys., 115, 6866–6875, 2001b.
[19] J.P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, London, 1986.
[20] D.A. Kofke, "Getting the most from molecular simulation," Mol. Phys., 102, 405–420, 2004.
[21] A.D. Bruce, N.B. Wilding, and G.J. Ackland, "Free energy of crystalline solids: a lattice-switch Monte Carlo method," Phys. Rev. Lett., 79, 3002–3005, 1997.
[22] A.D. Bruce, A.N. Jackson, G.J. Ackland, and N.B. Wilding, "Lattice-switch Monte Carlo method," Phys. Rev. E, 61, 906–919, 2000.
[23] C. Jarzynski, "Targeted free energy perturbation," Phys. Rev. E, 65, 046122, 1–5, 2002.
[24] J.P. Valleau and D.N. Card, "Monte Carlo estimation of the free energy by multistage sampling," J. Chem. Phys., 57, 5457–5462, 1972.
[25] D.A. Kofke and P.T. Cummings, "Quantitative comparison and optimization of methods for evaluating the chemical potential by molecular simulation," Mol. Phys., 92, 973–996, 1997.
[26] R.J. Radmer and P.A. Kollman, "Free energy calculation methods: a theoretical and empirical comparison of numerical errors and a new method for qualitative estimates of free energy changes," J. Comp. Chem., 18, 902–919, 1997.
[27] G.M. Torrie and J.P. Valleau, "Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling," J. Comp. Phys., 23, 187–199, 1977.
[28] D.A. Kofke and P.T. Cummings, "Precision and accuracy of staged free-energy perturbation methods for computing the chemical potential by molecular simulation," Fluid Phase Equil., 150, 41–49, 1998.
[29] N.D. Lu, J.K. Singh, and D.A. Kofke, "Appropriate methods to combine forward and reverse free energy perturbation averages," J. Chem. Phys., 118, 2977–2984, 2003.
[30] J.J. de Pablo, Q.L. Yan, and F.A. Escobedo, "Simulation of phase transitions in fluids," Ann. Rev. Phys. Chem., 50, 377–411, 1999.
[31] A.D. Bruce and N.B. Wilding, "Computational strategies for mapping equilibrium phase diagrams," Adv. Chem. Phys., 127, 1–64, 2003.
[32] Z.L. Zhang, M.A. Horsch, M.H. Lamm, and S.C. Glotzer, "Tethered nano building blocks: towards a conceptual framework for nanoparticle self-assembly," Nano Lett., 3, 1341–1346, 2003.
[33] R.D. Groot and P.B. Warren, "Dissipative particle dynamics: bridging the gap between atomistic and mesoscopic simulation," J. Chem. Phys., 107, 4423–4435, 1997.
[34] P.A. Monson and D.A. Kofke, "Solid–fluid equilibrium: insights from simple molecular models," Adv. Chem. Phys., 115, 113–179, 2000.
[35] M.P. Allen, G.T. Evans, D. Frenkel, and B.M. Mulder, "Hard convex body fluids," Adv. Chem. Phys., 86, 1–166, 1993.
[36] D.A. Kofke, "Semigrand canonical Monte Carlo simulation; integration along coexistence lines," Adv. Chem. Phys., 105, 405–441, 1999.
[37] A.Z. Panagiotopoulos, "Direct determination of phase coexistence properties of fluids by Monte Carlo simulation in a new ensemble," Mol. Phys., 61, 813–826, 1987.
[38] A.Z. Panagiotopoulos, "Direct determination of fluid phase equilibria by simulation in the Gibbs ensemble: a review," Mol. Sim., 9, 1–23, 1992.
[39] P. Tilwani, "Direct simulation of phase coexistence in solids using the Gibbs ensemble: configuration annealing Monte Carlo," M.S. Thesis, Colorado School of Mines, Golden, Colorado, 1999.
[40] D.A. Kofke, "Direct evaluation of phase coexistence by molecular simulation through integration along the saturation line," J. Chem. Phys., 98, 4149–4162, 1993.
[41] J. Henning and D.A. Kofke, "Thermodynamic integration along coexistence lines," In: P.B. Balbuena and J. Seminario (eds.), Molecular Dynamics, Elsevier, Amsterdam, 1999.
[42] S.P. Pandit and D.A. Kofke, "Evaluation of a locus of azeotropes by molecular simulation," AIChE J., 45, 2237–2244, 1999.
[43] F.A. Escobedo, "Novel pseudoensembles for simulation of multicomponent phase equilibria," J. Chem. Phys., 108, 8761–8772, 1998.
[44] F.A. Escobedo, "Tracing coexistence lines in multicomponent fluid mixtures by molecular simulation," J. Chem. Phys., 110, 11999–12010, 1999.
[45] F.A. Escobedo, "Molecular and macroscopic modeling of phase separation," AIChE J., 46, 2086–2096, 2000a.
[46] F.A. Escobedo, "Simulation and extrapolation of coexistence properties with single-phase and two-phase ensembles," J. Chem. Phys., 113, 8444–8456, 2000b.
[47] F.A. Escobedo and Z. Chen, "Simulation of isoenthalps and Joule–Thomson inversion curves of pure fluids and mixtures," Mol. Sim., 26, 395–416, 2001.
[48] Z.W. Salsburg, J.D. Jacobson, W. Fickett, and W.W. Wood, "Application of the Monte Carlo method to the lattice-gas model. I. Two-dimensional triangular lattice," J. Chem. Phys., 30, 65–72, 1959.
[49] I.R. McDonald and K. Singer, "Calculation of thermodynamic properties of liquid argon from Lennard-Jones parameters by a Monte Carlo method," Discuss. Faraday Soc., 43, 40–49, 1967.
[50] P.G. Bolhuis, D. Chandler, C. Dellago, and P.L. Geissler, "Transition path sampling: throwing ropes over rough mountain passes, in the dark," Ann. Rev. Phys. Chem., 53, 291–318, 2002.
2.15 FREE-ENERGY CALCULATION USING NONEQUILIBRIUM SIMULATIONS

Maurice de Koning¹ and William P. Reinhardt²
¹University of São Paulo, São Paulo, Brazil
²University of Washington, Seattle, Washington, USA
1. Introduction
Stimulated by the progress of computer technology over the past decades, the field of computer simulation has evolved into a mature branch of modern scientific investigation. It has had a profound impact in many areas of research including condensed-matter physics, chemistry, materials and polymer science, as well as biophysics and biochemistry. Many problems of interest in all of these areas involve complex many-body systems, and analytical solutions are generally not available. In this light, atomistic simulations play a particularly important role, giving detailed insight into the fundamental microscopic processes that control the behavior of complex systems at the macroscopic level. They provide key and effective tools for providing ab initio predictions, interpreting complex experimental data, as well as conducting computational "experiments" that are difficult or impossible to realize in a laboratory.

In this article, we will discuss one of the most fundamental and difficult applications of atomistic simulation techniques such as Monte Carlo (MC) [1] and molecular dynamics (MD) [2, 3]: the determination of those thermodynamic properties that require determination of the entropy. The entropy, the chemical potential, and the various free energies are examples of thermal thermodynamic properties. In contrast to their mechanical counterparts, such as the enthalpy, thermal quantities cannot be computed as simple time, or ensemble, averages of functions of the dynamical variables of the system and, therefore, are not directly accessible in MC or MD simulations. Yet, the free energies are often the most fundamental of all thermodynamic functions. Under appropriate constraints they control chemical and phase equilibria, and transition-state estimates of the rates of chemical reactions. Examples of applications
range from determination of the influence of crystal defects on the mechanical properties of materials, to the mechanisms of protein folding. The development of efficient and accurate techniques for their calculation has therefore attracted considerable attention during the past fifteen years, and is still a very active field of research [4].

As detailed in the previous chapter [4], the evaluation of free energies (or, more specifically, free-energy differences) requires simulations that collect data along a sequence of states on a thermodynamic path linking two equilibrium states. If the system is at equilibrium at every point along such a path, the simulated process is quasistatic and reversible, and standard thermodynamic results may be used to interpret collected data and to estimate the free-energy difference between the initial and final equilibrium states. The present chapter generalizes this approach to the case where data are collected during nonequilibrium, and thus irreversible, processes. Several important themes will emerge, making clear why this generalization is of interest, and how nonequilibrium calculations may be set up to provide both upper and lower bounds (and thus systematic in addition to statistical error estimates) to the desired thermal quantities. Additionally, the irreversible process may be optimized in a variational sense so as to improve such bounds. The statistical-mechanical theory of nonequilibrium systems within the regime of linear response will prove particularly helpful in this endeavor. Finally, newly developed re-averaging techniques have appeared that, in some cases, allow quite precise estimates of equilibrium thermal quantities directly from nonequilibrium data. The combination of such techniques with near-optimal paths can give well converged results from relatively short computations.

In the illustrations that follow, for the sake of conciseness, we will limit ourselves to the application of nonequilibrium methods within the realm of the classical canonical ensemble. For this representative case the relevant thermodynamic variables are the number of particles N, the volume V, and the temperature T; and the appropriate free energy is the Helmholtz free energy, A(N, V, T) = E(N, V, T) − T S(N, V, T), E and S being the internal energy and entropy, respectively. However, appropriate generalizations of nonequilibrium methods to other classical ensembles, as well as to quantum systems, are readily available.
2. Equilibrium Free-Energy Simulations
The calculation of thermodynamic quantities by means of atomistic simulation is rooted in the framework of equilibrium statistical mechanics [5], which provides the link between the microscopic details of a system and its macroscopic thermodynamic properties. Let us consider a system consisting
of N classical particles with masses mi. A microscopic configuration of the system is fully specified by the set of N particle momenta {pi} and positions {ri}, and its energy is described in terms of a potential-energy function U({ri}). Statistical mechanics in the canonical ensemble then tells us that the distribution of the particle positions and momenta is given by

ρ(Γ) = exp[−βH(Γ)] / Z(N, V, T),   (1)
where Γ ≡ ({p}, {r}) denotes a microstate of the system, β = 1/kBT (with kB Boltzmann’s constant) and H(Γ) is the classical Hamiltonian. The denominator in Eq. (1) is referred to as the canonical partition function, defined as

Z(N, V, T) = ∫ dΓ exp[−βH(Γ)],   (2)
and guarantees proper normalization of the distribution function. The mechanical thermodynamic properties, such as the internal energy, enthalpy and pressure, can be expressed as ensemble averages over the distribution function ρ(Γ). Here, the attribute “mechanical” means that the quantity of interest, X, is associated with a specific function X = X(Γ) of the microstate Γ of the system and can be written as

⟨X⟩ = ∫ dΓ ρ(Γ) X(Γ).   (3)
Standard atomistic simulation techniques such as Metropolis MC [1] and MD [2, 3] provide powerful algorithms for generating sequences of microstates (Γ1, Γ2, . . . , ΓM) that are distributed according to the particular statistical–mechanical (e.g., canonical) distribution function of interest. In this manner, the average implied by Eq. (3) is easily estimated by averaging the function X(Γ) over a sequence, Γj, of microstates generated using MC or MD simulation,

⟨X⟩ = lim_{M→∞} (1/M) Σ_{j=1}^{M} X(Γj).   (4)
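As a minimal illustration of how Eq. (4) is used in practice, the following sketch (ours, not from the original text) estimates a canonical average with a Metropolis walk; the potential, step size and sample counts are arbitrary illustrative choices, and equipartition supplies an exact check.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_average(potential, x0, beta, n_steps, dx_max, n_equil=1000):
    """Estimate <X> via Eq. (4): a plain average of X(Gamma) over
    microstates sampled from the canonical distribution of Eq. (1)."""
    x, u = x0, potential(x0)
    total, count = 0.0, 0
    for step in range(n_steps):
        trial = x + rng.uniform(-dx_max, dx_max)
        u_trial = potential(trial)
        # Metropolis acceptance with probability min(1, exp(-beta*dU))
        if rng.random() < np.exp(-beta * (u_trial - u)):
            x, u = trial, u_trial
        if step >= n_equil:            # discard an equilibration segment
            total, count = total + u, count + 1
    return total / count

# Single harmonic degree of freedom, U = x^2/2, at beta = 1:
# equipartition gives the exact check <U> = kT/2 = 0.5.
print(metropolis_average(lambda x: 0.5 * x * x, 0.0, 1.0, 200_000, 1.5))
```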
Although the partition function Z itself is not known, this does not present a problem when one is interested in any of the mechanical properties of the system; since Z is implicit in the generation of the sequence of microstates Γi, it is not needed to perform the ensemble average of Eq. (3). The calculation of thermal quantities is not so straightforward, however. For example, the Helmholtz free energy

A(N, V, T) = −(1/β) ln Z(N, V, T) = −(1/β) ln { ∫ dΓ exp[−βH(Γ)] },   (5)
is seen to be an explicit function of the partition function Z rather than an average of the type shown in Eq. (3). Therefore, as Z is not directly accessible in an MC or MD simulation, indirect strategies must be used. The most widely adopted strategy is to construct a real or artificial thermodynamic path that consists of a continuous sequence of equilibrium states linking two states of interest and then to calculate the free-energy difference between them. Should the free energy of one of these states be exactly known, the free energy of the other may then be put on an absolute basis. This approach provides the basis for the common thermodynamic integration (TI) method. Usually TI relies on the definition of a thermodynamic path in the space of system Hamiltonians. Typically, this involves the construction of an “artificial” Hamiltonian H(Γ, λ), which, aside from the usual dependence on the microstate Γ, is also a function of some generalized coordinate or switching parameter λ. This generalized Hamiltonian is constructed in such a way that it leads to a continuous transformation from the Hamiltonian of the system of interest to that of a reference system whose free energy is known beforehand. Within the canonical ensemble, the Helmholtz free-energy difference between the initial and final states of the path, characterized by the switching coordinate values λ1 and λ2, respectively, is then given by

ΔA ≡ A(λ2; N, V, T) − A(λ1; N, V, T)
   = ∫_{λ1}^{λ2} dλ′ (∂A(λ; N, V, T)/∂λ)_{λ=λ′}
   = ∫_{λ1}^{λ2} dλ′ ⟨∂H(Γ, λ)/∂λ⟩_{λ′}
   ≡ Wrev,   (6)
where A(λ; N, V, T) is the Helmholtz free energy of the system as a function of the switching coordinate λ for fixed N, V, and T, and the brackets in the second integral denote an average evaluated in the canonical ensemble associated with the generalized coordinate value λ = λ′. From a thermodynamic standpoint, Eq. (6) may be interpreted in the following way. The free-energy difference between the initial and final states is equal to the reversible work Wrev done by the generalized thermodynamic driving force ∂H(Γ, λ)/∂λ along a quasistatic, or reversible, process connecting both states. By quasistatic we mean that the process is carried out so slowly that the system remains in equilibrium at all times and the instantaneous driving force is equal to the associated equilibrium ensemble average. In this way, the TI method represents a numerical discretization of the quasistatic process; Wrev is estimated by computing the equilibrium ensemble averages of the driving force on a grid of λ-values on the interval [λ1, λ2], after which the integration is carried out using standard numerical techniques. For further details of the TI method and its applications we refer to the chapter by Kofke and Frenkel [4].
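The TI recipe of Eq. (6) thus reduces to quadrature over a grid of independent equilibrium averages. The sketch below is schematic: measure_dHdl is a placeholder for a full equilibrium MC/MD average at fixed λ; for the self-test we substitute the analytic average for a linear interpolation between two harmonic wells (the path revisited in Section 4.1), for which the exact answer is −kBT ln(ω1/ω2).

```python
import numpy as np

def thermodynamic_integration(measure_dHdl, lambdas):
    """Discretization of Eq. (6): W_rev = integral of <dH/dlam>_lam.
    measure_dHdl(lam) must return the *equilibrium* ensemble average
    of the driving force at fixed lam (one simulation per grid point)."""
    avg = np.array([measure_dHdl(lam) for lam in lambdas])
    # Trapezoidal quadrature over the lambda grid.
    return float(np.sum(0.5 * (avg[1:] + avg[:-1]) * np.diff(lambdas)))

# Analytic stand-in for one oscillator on the harmonic path
# H(lam) = 0.5*[(1-lam)*om1^2 + lam*om2^2]*x^2, for which
# <dH/dlam> = 0.5*(om2^2 - om1^2)*<x^2> with <x^2> = kT/omega^2(lam).
om1, om2, kT = 4.0, 0.5, 2.0
avg_force = lambda lam: (om2**2 - om1**2) * kT / (2*((1-lam)*om1**2 + lam*om2**2))
print(thermodynamic_integration(avg_force, np.linspace(0.0, 1.0, 1001)))
# ~ -4.1589 = -kT*ln(om1/om2) per oscillator
```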
3. Nonequilibrium Free-Energy Estimation

3.1. Establishing Free-Energy Bounds: Systematic and Statistical Errors
Nonequilibrium free-energy estimation is an alternative approach to measuring the reversible work Wrev. Instead of discretizing the quasistatic process in terms of a sequence of independent equilibrium states, the reversible work is estimated by means of a single, dynamical sequence of nonequilibrium states, explored along an out-of-equilibrium simulation. This is achieved by introducing an explicit “time-dependent” element into the originally static sequence of states, making λ = λ(t) an explicit function of the simulation “time” t. Here we have used the quotes to emphasize that t should not always be interpreted as a real physical time. For instance, in contrast to MD simulations, typical displacement MC simulations do not involve a natural time scale, in which case t is simply an index variable that orders the sequence of sampling operations, measured in simulation steps. Suppose we choose λ(t) such that λ(0) = λ1 and λ(tsim) = λ2, so that λ varies between λ1 and λ2 in a time tsim. Accordingly, the Hamiltonian H(Γ, λ) = H(Γ, λ(t)) also becomes a function of t, and is driven from the initial system H1 to the final system H2 in the same time. The irreversible work Wirr done by the driving force along this switching process, defined as

Wirr = ∫_0^{tsim} dt (dλ/dt)_t (∂H/∂λ)_{λ(t)},   (7)
provides an estimator for the reversible work Wrev done along the corresponding quasistatic process. The point of this nonequilibrium procedure is that values of Wirr can be found, in principle, from a single simulation, because the integration in Eq. (7) involves instantaneous values of the function ∂H/∂λ rather than ensemble averages. If efficient, this would be much less costly than the TI procedure in Eq. (6), which requires a series of independent equilibrium simulations. But there is, of course, a trade-off. While the TI method is inherently “exact”, in that the errors are associated only with statistical sampling and the discreteness of the mesh used for the numerical integration, the irreversible-work procedure provides a biased estimator for Wrev. That is, aside from statistical errors arising from different choices of initial configurations for the calculation of Eq. (7), the irreversible estimator Wirr is subject to a systematic error ΔEsyst. Both types of error are due to the inherently irreversible nature of the nonequilibrium process. The statistical errors originate from the fact that, for a fixed and finite simulation time tsim, the value of the integral in Eq. (7) depends on the initial
conditions of the nonequilibrium process. In other words, for different initial conditions, Γj(t = 0), and a finite simulation time tsim, the value of Wirr in Eq. (7) is not unique. Instead, it is a stochastic quantity characterized by a distribution function with a finite variance, giving rise to statistical errors of the sort arising in any MC or MD simulation. The systematic error manifests itself in terms of a shift of the mean of the irreversible-work distribution with respect to the value of the ideal quasistatic work Wrev. This shift is caused by the dissipative entropy production characteristic of irreversible processes [6]. Because the entropy always increases, the systematic error ΔEdiss is always positive, regardless of the sign of the reversible work Wrev. In this way, the average value ⟨Wirr⟩ of many measurements of the irreversible work will yield an upper bound to the reversible work Wrev, provided the average is taken over an ensemble of equilibrated initial conditions Γj(t = 0) at the starting point, t = 0. The importance of satisfying the latter condition was demonstrated by Hunter et al. [7]. From a purely thermodynamic point of view, the bounding error is simply a consequence of the Helmholtz inequality. Starting from an equilibrium initial state, for instance at λ = λ1, the irreversible work upon driving the system to λ = λ2 is always an upper bound to the actual free-energy change between the equilibrium states of the initial and final systems, i.e.,

⟨Wirr⟩ ≥ ΔA = A(λ2; N, V, T) − A(λ1; N, V, T).   (8)
Only in the limit of an ideally quasistatic, or reversible, process, represented by the tsim → ∞ limit, does the inequality in Eq. (8) become the equality Wrev = ΔA, as also manifested in Eq. (6). The preceding ideas are illustrated conceptually in Fig. 1(a) and (b), which show typical distribution functions of irreversible-work measurements starting from an ensemble of equilibrated initial conditions. Figure 1(a) compares the results that might be obtained for irreversible-work measurements for two different finite simulation times, tsim = t1 and tsim = t2 with t2 > t1, to the ideally reversible tsim → ∞ limit. Both finite-time results show distribution functions with a finite variance and whose mean values have been shifted with respect to the reversible work value by a positive systematic error. Both the variance and the systematic error for tsim = t1 are larger than the corresponding values for tsim = t2, given that the latter process proceeds in a slower manner, leading to smaller irreversibility. Figure 1(b) shows the irreversible-work estimators obtained for the reversible work associated with a quasistatic process in which system 1 is transformed into system 2, as obtained in the forward (1 → 2) and backward (2 → 1) directions using the same simulation time tsim. Given that the systematic error is always positive, the forward and backward processes provide upper and lower bounds to the reversible work value, respectively. However, in general, the systematic and statistical errors need not be equal for the two directions.
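Operationally, the estimator of Eq. (7) is accumulated step by step along a single switching trajectory. A schematic sketch, in which propagate, dHdl, lam_of and the initial state are placeholders for the user's sampler (one MC sweep or MD step), driving-force routine and switching function:

```python
def irreversible_work(state, dHdl, propagate, lam_of, n_steps):
    """Accumulate W_irr of Eq. (7) along one nonequilibrium trajectory:
    W_irr ~ sum_k [lam(t_{k+1}) - lam(t_k)] * dH/dlam, evaluated at the
    *instantaneous* configuration -- no ensemble average is taken."""
    w_irr = 0.0
    for k in range(n_steps):
        lam, lam_next = lam_of(k / n_steps), lam_of((k + 1) / n_steps)
        w_irr += (lam_next - lam) * dHdl(state, lam)
        state = propagate(state, lam_next)   # one MC sweep or MD step
    return w_irr
```

Averaging many such runs, each started from an independent equilibrated initial condition, gives the upper bound of Eq. (8); running λ in the reverse direction gives the corresponding lower bound.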
Figure 1. Conceptual illustration of typical irreversible-work distributions obtained from nonequilibrium simulations. (a) compares the results that might be obtained for irreversible-work measurements for two different finite simulation times, tsim = t1 and tsim = t2 with t2 > t1, to the ideally reversible tsim → ∞ limit. (b) shows the irreversible-work estimators obtained for the reversible work associated with a quasistatic process in which system 1 is transformed into system 2, as obtained in the forward (1 → 2) and backward (2 → 1) directions using the same simulation time tsim.
3.2. Optimizing Free-Energy Bounds: Insight from Nonequilibrium Statistical Mechanics
A natural question that arises from the discussion in the previous section is how one might tune the nonequilibrium process so as to minimize the systematic and statistical errors associated with the irreversibility, for given initial and final equilibrium states and a given simulation time tsim. To answer this question, it is useful to investigate the microscopic origin of entropy production in nonequilibrium processes. For this purpose, it is particularly helpful to consider the class of close-to-equilibrium processes, for which the instantaneous distribution functions of nonequilibrium states do not deviate too much from the ideally quasistatic equilibrium distribution functions and for which the theory of linear response [5] is appropriate. As we will see later on, it is not too difficult to reach this condition in practical situations. As described by Onsager's regression hypothesis [5], when a nonequilibrium state is not too far from equilibrium, the relaxation of any mechanical property can be described in terms of the proper equilibrium autocorrelation function. In other words, the hypothesis states that the relaxation of a nonequilibrium disturbance is governed by the same laws as the regression of spontaneous microscopic fluctuations in an equilibrium system.
Under the assumption of proximity to equilibrium, one can then derive the following expression for the mean dissipated energy, i.e., the systematic error ΔEdiss(tsim), for a series of irreversible-work measurements obtained from nonequilibrium simulations of duration tsim [8–10]:

ΔEdiss(tsim) = (1/kBT) ∫_0^{tsim} dt (dλ/dt)²_t τ[λ(t)] var(∂H/∂λ)_{λ(t)}.   (9)
Aside from the switching rate, the integrand in Eq. (9) contains both the correlation time and the equilibrium variance of the driving force ∂H/∂λ. These two factors describe, respectively, how quickly the fluctuations in the driving force decay and how large these fluctuations are in the equilibrium state. It is clear that the integral is positive-definite, as it must be. Moreover, it indicates that, for near-equilibrium processes, the systematic error should be the same for forward and backward processes. This means that, in the linear–response regime, one can obtain an unbiased estimator for the reversible work Wrev by combining the results obtained from forward and backward processes. More specifically, in this regime we have
(10)
Wirr (2 → 1) = −Wrev (1 → 2) + Ediss ,
(11)
and
leading to the unbaised estimator (i.e., subject to statistical fluctuations only) Wrev (1 → 2) = 12 (Wirr (1 → 2) − Wirr (2 → 1) .
(12)
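Given arrays of forward and backward work measurements, each started from independent equilibrated initial conditions, the estimator of Eq. (12) is a one-liner; the sketch below (ours) also attaches a standard error under the assumption of mutually independent measurements.

```python
import numpy as np

def combined_estimator(w_forward, w_backward):
    """Eq. (12): W_rev(1->2) = 0.5*(<W_irr(1->2)> - <W_irr(2->1)>).
    Valid in the linear-response regime, where the dissipation
    Delta_E_diss is equal for the two directions and cancels."""
    w_f, w_b = np.asarray(w_forward), np.asarray(w_backward)
    w_rev = 0.5 * (w_f.mean() - w_b.mean())
    # Standard error, assuming independent forward/backward samples.
    err = 0.5 * np.sqrt(w_f.var(ddof=1)/len(w_f) + w_b.var(ddof=1)/len(w_b))
    return w_rev, err
```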
Concerning the minimization of dissipation, Eq. (9) tells us that one should attempt to reduce both the magnitude of the fluctuations in the driving force and the associated correlation times. This involves both a static component, i.e., the magnitude of the equilibrium fluctuations, and a dynamic one, namely the typical decay time of equilibrium correlations. This shows that not only the choice of the path, H(λ), but also the simulation algorithm by which the system is propagated in “time” (i.e., MC or MD simulation) will affect the dissipation in the irreversible-work measurements. Whereas the magnitude of the equilibrium fluctuations should be algorithm-independent (as long as the algorithms sample the same equilibrium distribution function), the correlation time is certainly algorithm-dependent. In the case of displacement MC simulation, as we will see below, the choice of the maximum-displacement parameter affects the correlation time τ and, consequently, the magnitude of the dissipation.
Finally, let us now assume that we have a prescribed path H(λ) and a simulation algorithm to sample the nonequilibrium process between the systems H(λ1) and H(λ2). How do we then choose the functional form of the time-dependent switching function λ(t) to minimize the dissipation? Equation (9) provides us with an explicit answer. To see this, we first perform a change of integration variable, setting x = t/tsim, obtaining

ΔEdiss(tsim) = (1/tsim) ΔEdiss[λ(x)],   (13)

with

ΔEdiss[λ(x)] = (1/kBT) ∫_0^1 dx (dλ/dx)²_x τ(λ(x)) var(∂H/∂λ)_{λ(x)}.   (14)
Equation (14) is a functional of the common form [11]

S[λ(x)] = ∫_0^1 dx F(λ′(x), λ(x), x).   (15)
The minimization of the dissipation is thus equivalent to finding the function λ(x) that minimizes a functional of the type (15) subject to the boundary conditions λ(0) = λ1 and λ(1) = λ2. Standard variational calculus then shows that the solution is obtained by solving the Euler–Lagrange equation [11] associated with the functional,

(d/dx)(∂F/∂λ′) = ∂F/∂λ,   (16)
subject to the mentioned boundary conditions.
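For the specific integrand of Eq. (14), F = (dλ/dx)² g(λ) with g(λ) ≡ τ(λ) var(∂H/∂λ)/kBT, Eq. (16) admits the first integral (dλ/dx)√g(λ) = const: the optimal schedule moves slowly where g is large. The sketch below (ours) exploits this to build λ(x) by inverting the normalized cumulative integral of √g; g_of_lambda stands for data measured beforehand in short equilibrium runs at fixed λ, and the sharply rising example profile is purely illustrative.

```python
import numpy as np

def optimal_schedule(g_of_lambda, lam1=0.0, lam2=1.0, n=1001):
    """Solve the variational problem of Eqs. (14)-(16) for
    F = (dlam/dx)^2 * g(lam): the first integral gives
    dlam/dx proportional to 1/sqrt(g(lam)), so the optimal x(lam)
    is the normalized cumulative integral of sqrt(g)."""
    lam = np.linspace(lam1, lam2, n)
    sqrt_g = np.sqrt(g_of_lambda(lam))
    # Cumulative trapezoidal integral of sqrt(g) along lambda:
    x = np.concatenate(([0.0],
        np.cumsum(0.5 * (sqrt_g[1:] + sqrt_g[:-1]) * np.diff(lam))))
    x /= x[-1]                 # enforce x(lam1) = 0, x(lam2) = 1
    return x, lam              # tabulated inverse of lam(x)

# Usage: interpolate lam at any scaled time x in [0, 1].
x_tab, lam_tab = optimal_schedule(lambda l: 1.0 + 50.0 * l**8)
lam_at = lambda x: np.interp(x, x_tab, lam_tab)
```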
4. Applications of Nonequilibrium Free-Energy Estimation
To illustrate the preceding discussion, we now consider a number of applications of nonequilibrium free-energy estimation, demonstrating the bounding properties of irreversible-work measurements as well as aspects of dissipation optimization.
4.1. Harmonic Oscillators
In the first application we consider the problem of computing the free-energy difference between two systems consisting of 100 identical, independent,
one-dimensional harmonic oscillators of unit mass with different characteristic frequencies [9]. In particular we will consider the path defined by

H(λ) = ½ [(1 − λ)ω1² + λω2²] Σ_{i=1}^{100} xi²,   (17)
with ω1 = 4 and ω2 = 0.5 at a temperature kBT = 2. Note that we are considering only the potential energy of the oscillators here and have neglected any kinetic-energy contributions. We can do this because the free-energy difference between two harmonic oscillators at a fixed temperature is determined only by the configurational part of the partition function. The value of the desired reversible work Wrev per oscillator associated with a quasistatic modification of the frequency from ω1 to ω2 is known analytically:

Wrev(ω1 → ω2) = −kBT ln(ω1/ω2) = −4.15888.   (18)

The simulation algorithm we utilize is standard Metropolis displacement MC with a fixed maximum trial displacement Δxmax = 0.3. First we consider the statistics of the irreversible-work measurements as a function of the simulation “time” tsim, which here stands for the number of MC sweeps (one sweep corresponds to one trial displacement per oscillator) per process, for a linear switching function. The results are shown as the dashed-line curves in Fig. 2(a) and (b), in which each data point represents the mean value of Wirr over 50 independent initial conditions. Figure 2(a) shows that the upper and lower
Figure 2. Results of irreversible-work measurements per oscillator as a function of the switching time tsim for the linear (dashed lines) and optimized (solid lines) switching functions. The analytical reversible-work value is also shown (dot-dashed line). (a) shows the results of the forward (upper bounds) and backward (lower bounds) directions. (b) shows the values of the combined estimator of Eq. (12).
bounds do converge toward the reversible value Wrev, although they do so quite slowly. The slow convergence becomes more apparent when we consider the behavior of the combined estimator of Eq. (12) in Fig. 2(b). If the process were sufficiently slow for linear–response theory to be accurate, the combined estimator should be unbiased and show no systematic deviation. It is clear that this is only the case for the slowest process, at tsim = 2.56 × 10⁴ MC sweeps. All shorter simulations show a systematic deviation, indicating that the associated processes remain quite far from equilibrium, hampering convergence. Next, we attempt to minimize dissipation in the simulation by using the switching function λ(x) that satisfies the Euler–Lagrange Eq. (16). For this purpose we first measured the equilibrium variance of the driving force and the characteristic correlation decay time as functions of λ from a series of equilibrium simulations (i.e., at fixed λ), after which we numerically solved Eq. (16), subject to the boundary conditions λ(0) = 0 and λ(1) = 1. The equilibrium variances, correlation times and the resulting optimal switching function are shown in Fig. 3(a)–(c), respectively. The results in Fig. 3(a) and (b) indicate that the main contribution to the dissipation originates from the region λ ≈ 1, where both the magnitude and the characteristic decay time of the fluctuations in the driving force increase sharply. The optimal switching function in Fig. 3(c) captures this effect, prescribing a slow switching rate where one should and going faster where one can. The results obtained with this function for the irreversible-work measurements are shown as the solid lines in Fig. 2(a) and (b). The improvement compared to the linear switching function is quite significant. Figure 2(b), for instance, shows that for tsim as short as 3.2 × 10³ MC sweeps, the nonequilibrium process has already reached the linear–response regime. The above optimization procedure is useful in cases where the thermodynamic path H(λ) is prescribed beforehand. This is the case, for instance, for
Figure 3. (a) The equilibrium variance var(∂H/∂λ) and (b) the correlation decay time (in MC sweeps) as functions of λ. (c) shows the optimal switching function, as determined by numerically solving the Euler–Lagrange equation (16).
the reversible-scaling method [12], in which each state along the fixed path H(λ) = λV (V is the interatomic interaction potential) represents the physical system of interest in a different temperature state. In this manner, a single irreversible-work simulation along the scaling path provides a continuous series of estimators of the system's free energy on a finite temperature interval. If one has some information about the behavior of the magnitude and correlation-decay times of the fluctuations of the driving force, one may use the variational method described above to optimize the switching function and minimize dissipation effects.
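The whole workflow of this section can be reproduced in a few dozen lines. The sketch below is our reconstruction, using the parameters quoted in the text (ω1 = 4, ω2 = 0.5, kBT = 2, Δxmax = 0.3, 100 oscillators, linear switching); it measures forward and backward bounds and the combined estimator of Eq. (12), to be compared with the exact −4.15888 of Eq. (18).

```python
import numpy as np

rng = np.random.default_rng(1)
w1sq, w2sq, kT, dx_max, n_osc = 16.0, 0.25, 2.0, 0.3, 100  # omega1=4, omega2=0.5

def omega_sq(lam):            # interpolated spring constant of Eq. (17)
    return (1.0 - lam) * w1sq + lam * w2sq

def sweep(x, lam):
    """One Metropolis sweep at fixed lambda: one trial move per oscillator."""
    trial = x + rng.uniform(-dx_max, dx_max, size=x.size)
    dU = 0.5 * omega_sq(lam) * (trial**2 - x**2)
    accept = rng.random(x.size) < np.exp(-dU / kT)
    x[accept] = trial[accept]
    return x

def w_irr(n_sweeps, forward=True):
    """One irreversible-work measurement, Eq. (7), linear switching."""
    lams = np.linspace(0.0, 1.0, n_sweeps + 1)
    if not forward:
        lams = lams[::-1]
    # Equilibrated initial condition: exact canonical draw for a harmonic well.
    x = rng.normal(0.0, np.sqrt(kT / omega_sq(lams[0])), n_osc)
    work = 0.0
    for lam, lam_next in zip(lams[:-1], lams[1:]):
        x = sweep(x, lam)
        # dH/dlam = 0.5*(w2sq - w1sq)*sum(x^2), evaluated instantaneously
        work += (lam_next - lam) * 0.5 * (w2sq - w1sq) * np.sum(x * x)
    return work / n_osc

fwd = np.mean([w_irr(2000, True) for _ in range(20)])
bwd = np.mean([w_irr(2000, False) for _ in range(20)])
print(fwd, -bwd)            # upper and lower bounds to W_rev = -4.15888
print(0.5 * (fwd - bwd))    # combined estimator, Eq. (12)
```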
4.2. Compression of Confined Lennard–Jones Particles
In the following application we consider a system consisting of 30 Lennard–Jones particles, constrained to move on the x-axis only. In addition, the particles are subject to an external field whose strength is controlled by an external parameter L. More specifically, we consider the path

H(L) = Σ_{i<j} 4ε [(σ/xij)¹² − (σ/xij)⁶] + ε Σ_i (2xi/L)²⁶,   (19)
where xi describes the position of particle i on the x-axis and xij ≡ |xi − xj| is the distance between particles i and j. The second term in Eq. (19) is the external field, a very steeply rising potential that confines the particles through very strong interactions with the first and last particles, effectively causing the 30 particles to lie approximately evenly spaced between x = ±L/2. Now consider the compression process wherein L changes from L0 = 30σ to L1 = 26σ, forcing the line of particles to undergo a one-dimensional compression. As in the previous example, we will attempt to compute the reversible work associated with this process by measuring the irreversible work Wirr for both process directions. Once again we utilize the Metropolis MC algorithm, but instead of fixing the algorithm parameter Δxmax, describing the maximum trial displacement, we now consider the effects of changing the sampling algorithm on the convergence of the upper and lower bounds. Although the variance of the driving force var(∂H/∂λ) will not be affected, the correlation time will certainly depend on the choice of Δxmax. This is illustrated in Fig. 4, which shows the convergence of the upper and lower bounds to the reversible work as obtained for three different values of Δxmax at a temperature kBT = 0.35: Δxmax = 0.6σ, 0.1σ, and 0.04σ, respectively. Effectively, the variation of this algorithm parameter may be thought of as changing the strength of the coupling between the MC “thermostat” and the system of particles. We utilized the linear switching function, which varies L linearly between L0 and L1 in tsim MC sweeps (each sweep
Figure 4. Results of forward (upper bound) and backward (lower bound) irreversible-work measurements (in units of ε) as a function of the switching time tsim for the linear switching function, for three different values of the MC algorithm parameter Δxmax.
consisting of 30 MC single-particle trial moves). Each data point and corresponding error bar (±1 standard deviation) were obtained from a set of 21 irreversible-work measurements initiated from independent, equilibrated initial conditions. It is also useful to note that it is not necessary to explicitly compute the work Wirr by using Eq. (7). All that is needed, through the first law of thermodynamics, which applies equally to reversible and irreversible processes, is to calculate the work as Wirr = ΔE − Q, where ΔE is the difference in the internal energies of the system between the first and last switching steps, and Q is the heat accumulated during the switching process. This heat, Q, is simply the sum of energies added to, or subtracted from, the system as MC configurations evolve during a simulation. Given that these energy changes, Δεi, are already calculated in determining whether moves for particle i are to be accepted or rejected according to the canonical acceptance factor exp(−Δεi/kBT), no extra programming is needed to calculate Wirr. It is immediately seen that the strength of the system–thermostat coupling through the algorithm parameter Δxmax is indeed a variational parameter
for the free-energy computations. Accordingly, rather than selecting a preset acceptance ratio of trial moves, as is usually done in equilibrium MC simulations, Δxmax should be determined so as to minimize the difference between the upper and lower bounds to ΔA. The results show that for all three values of Δxmax the upper and lower bounds converge. Yet, the convergence properties are clearly different for the three parameter values, giving the best results for Δxmax = 0.1σ and the worst for Δxmax = 0.04σ, indicating that the correlation decay time for the fluctuations in the driving force is shortest for the former and longest for the latter. Nevertheless, the convergence of the bounds is still quite slow, in that hundreds of thousands of MC sweeps are required to obtain convergence to within a few percent. This is a consequence of the strong interactions between the particles, as their hard cores interact during the compression from the “ends” of the line of particles, and such hard-core density gradients are typically slow to work themselves out through single-particle MC moves. Contrary to the simple harmonic-oscillator problem discussed in the previous section, this problem will be ubiquitous in most atomic and molecular systems in the condensed phase, seemingly rendering free-energy computations on realistic systems of interest problematic. The questions that now arise are whether we can estimate the systematic error ΔEdiss from data already in hand and use it to improve the estimates of Fig. 4; and/or whether we can optimize the thermodynamic path to reduce dissipation and achieve better behavior at short switching times; or perhaps both?
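The ΔE − Q bookkeeping described above costs only two extra accumulators in the Metropolis loop. A schematic sketch (ours); the energy routine (for this system, Eq. (19) with L interpolated between L0 and L1), the switching function lam_of and the move size are placeholders:

```python
import numpy as np

rng = np.random.default_rng(2)

def w_irr_from_heat(x0, energy, lam_of, n_sweeps, kT, dx_max):
    """First-law route to the work: W_irr = Delta_E - Q, where Delta_E is
    the total energy change over the switching run and Q is the heat,
    i.e. the summed energy changes of *accepted* moves at fixed lambda."""
    x = np.array(x0, dtype=float)
    e = energy(x, lam_of(0.0))
    e_start, q = e, 0.0
    for k in range(1, n_sweeps + 1):
        e = energy(x, lam_of(k / n_sweeps))   # switching step: work, not heat
        for i in range(x.size):               # one sweep of single-particle moves
            trial = x.copy()
            trial[i] += rng.uniform(-dx_max, dx_max)
            d_e = energy(trial, lam_of(k / n_sweeps)) - e
            if rng.random() < np.exp(-d_e / kT):   # canonical acceptance
                x, e = trial, e + d_e
                q += d_e                      # accepted move exchanges heat
    return (e - e_start) - q
```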
4.3. Estimating Equilibrium Work from Nonequilibrium Data
Recently, Jarzynski [13] has generalized the Gibbs–Feynman identity,

ΔA = A1 − A0 = −kBT ln⟨exp[−(H1 − H0)/kBT]⟩0,   (20)
where ⟨· · ·⟩0 denotes canonical averaging with respect to configurations generated by H0, and which is the basis of thermodynamic perturbation theory [4], to finite-time processes. Equation (20) is an identity, but in practice it is useful only when the configurations generated by canonical sampling with respect to H0 strongly overlap those generated by H1. For hard-core fluids this would be unusual unless H1 and H0 are quite “close”, resulting in the perturbative use of Eq. (20). Jarzynski now allows H0 to dynamically approach H1 along a path, in analogy with the above discussions. The result, in the context discussed here, suggests that for a given set of N irreversible-work measurements Wi ≡ Wirr(Γi, t = 0), with i = 1, . . . , N, instead of estimating Wirr as the sim-
ple arithmetic mean of the Wi, one should calculate the Boltzmann-weighted “Jarzynski” (or “Jz”) average

WJz = (1/N) Σ_{i=1}^{N} exp(−Wi/kBT),   (21)

and then estimate the free-energy change as

ΔAJz ≡ −kBT ln WJz.   (22)
In this way bounding is sacrificed, but a more accurate result is not precluded, given that, in principle, the Jz-average is unbiased. This approach has been shown to be effective both in the analysis of simulation data and in finite-time polymer-extension experiments, which are of course irreversible. An immediate concern, however, is that, although the Jarzynski result, like the Gibbs–Feynman identity, is exact in the limit of complete sampling, incomplete MC sampling may result in unsatisfactory results.
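Numerically, Eqs. (21) and (22) are best evaluated with a log-sum-exp shift, since the exponentials easily overflow; a small sketch (ours):

```python
import numpy as np

def jarzynski_estimate(work_samples, kT):
    """Eqs. (21)-(22): Delta_A_Jz = -kT * ln( mean(exp(-W_i/kT)) ).
    The log-sum-exp shift avoids overflow; it cancels exactly and
    does not change the estimate."""
    w = -np.asarray(work_samples) / kT
    w_max = w.max()
    return -kT * (w_max + np.log(np.mean(np.exp(w - w_max))))
```

Because the exponential average is dominated by the rare smallest work values, a finite sample with poor overlap yields a noisy, biased estimate — precisely the sampling concern raised above.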
Figure 5. Results of forward and backward irreversible-work averages (in units of ε) for the 30-particle confined Lennard–Jones system as a function of the switching time tsim. The results show both the simple arithmetic averages and the Boltzmann-weighted Jarzynski averages.
This is illustrated in Fig. 5, where the data used to generate the bounds to ΔA in Fig. 4 are plotted over a much larger range of switching times tsim and compared to the ΔAJz estimates. Both the simple arithmetic and the Jarzynski averages for both directions were computed over the 21 independent initial conditions. It is evident that, although not giving bounds, the ΔAJz estimates indeed improve on the upper and lower bounds calculated as simple averages. However, the Jarzynski averages only become useful once the convergence of the simple arithmetic averages has reached the order of less than 1 kBT per particle. Thus, although a promising computational asset, the Jarzynski procedure still requires systematic procedures for finding more reversible paths.
4.4. Path Optimization through Scaling and Shifting of Coordinates
As we have seen in the harmonic-oscillator and Lennard–Jones problems, the choice of the thermodynamic path and of the switching function is quite crucial to the success of nonequilibrium free-energy estimation. In the case of the harmonic-oscillator problem it was relatively straightforward to find a good switching function by explicitly solving the variational problem in Eqs. (15) and (16), which led to an optimized simulation that “spends the right amount of time along each segment” of the already defined path. Here it is important to note that this variational optimization should be carried out over an ensemble-averaged Wirr, being identical for every member of the ensemble, independent of any specific Γi(t = 0). This is the reason why early attempts by Pearlman and Kollman [14] to determine paths “on the fly”, by looking ahead and avoiding strongly dissipative collisions in specific configurations, may result in the unintentional introduction of a Maxwell demon [15], violating the second law of thermodynamics, which is of course the fundamental origin of the Helmholtz inequality. Compared to the simple harmonic-oscillator problem, the optimization of the nonequilibrium simulation of the confined Lennard–Jones system is significantly more challenging because of the strong interactions between the particles during the compression of the system. Given that this type of interaction is expected to occur in most interesting problems, it is of interest to design thermodynamic paths that differ from those in which one simply follows H(λ) as λ runs from an initial to a final value, as we did in the case of the harmonic-oscillator problem. We now present two approaches that follow this idea and lead to thermodynamic paths that are significantly more reversible. Both the coordinate-scaling [16] and coordinate-shifting methods discussed below derive from
the same fundamental thought: is there a (λ-dependent) coordinate system in which all particles are apparently at rest relative to one another during the switching process? In such a coordinate system perhaps all particles will have little difficulty in remaining close to equilibrium during the whole switching process, with only the magnitude of their local fluctuations changing.
4.4.1. Coordinate scaling

Figure 6 illustrates the possibilities of such an approach, when applied to the simple compression problem discussed above. Here, in an admittedly simple example, all particles should be compressed “uniformly”, rather than by the nonuniform compression generated through the interactions of the confining potential with the particles at both ends of the line. This is accomplished by writing the coordinates as s(λ)xi, where s(λ) is a (common) scaling parameter, which may then be variationally optimized. The greatly improved bounds of Fig. 6 indicate that a better path has indeed been found. How does this fit the “at rest” criterion mentioned earlier? If one watches the MC dynamics in the unscaled “xi” coordinates using an optimized s(λ), rather than in the actual physical coordinates s(λ)xi, it appears that the equilibrium positions xi do not change during the switching; thus, indeed, the only irreversibility arises from the changes in the RMS fluctuations about the equilibrium positions. It should be noted, however, that, as these scalings may be regarded as a change in the metric that affects the definitions of lengths and volumes, one should include an entropic (calculable) correction to obtain the desired free-energy difference. Recently, there has been a variety of applications of the scaling approach [16–18], including the determination of the absolute free energy of Lennard–Jones clusters and a smooth metric scaling through a first-order solid–solid phase transition, fcc to bcc, with no apparent hysteresis or resulting irreversibility.
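A schematic fragment of the scaling idea (the uniform linear s(λ) below is one plausible choice for the compression example, not necessarily the variationally optimized scaling used for Fig. 6):

```python
def scaled_positions(x_unscaled, lam, L0=30.0, L1=26.0):
    """Metric scaling: MC moves act on the unscaled coordinates x_i,
    while energies (and hence acceptance tests) are evaluated at the
    physical positions s(lam)*x_i. With s(lam) tracking the box
    compression, the unscaled equilibrium positions stay nearly
    lambda-independent, which is what suppresses dissipation."""
    s = (L0 + lam * (L1 - L0)) / L0        # uniform linear scaling
    return s * x_unscaled
# The scaling changes the metric: its Jacobian contributes a calculable
# entropic term (proportional to N*ln s) to the free-energy bookkeeping.
```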
4.4.2. Coordinate shifting

In the applications of metric scaling, thermodynamic paths are often easily determined when a clear symmetry is present. Another approach, namely coordinate shifting, is more useful when such symmetries are absent. As an alternative to writing a moving coordinate using the scaling relation s(λ)xi, one can take xi = xi^fluct + xi^ref(λ). Here each particle moves in a concerted fashion along a λ-dependent reference path, chosen by symmetry, or by methods such as simulated annealing, to avoid strong hard-core interactions or other
likely causes of irreversibility. As λ evolves, only the fluctuation coordinates xi^fluct are subject to MC variations: should the physical environment of each particle remain at least roughly constant, one may hope that the fluctuations about xi^ref(λ) do not depend strongly on λ. To the extent that this is the case, the fluctuation coordinates are always at equilibrium, and thus the path is reversible! Figure 7 illustrates the efficacy of this method for the linear compression problem. As opposed to coordinate scaling, coordinate shifting does not change the metric, dispensing with the need for entropic corrections and paving the way for applications involving inhomogeneous systems, where the possible absence of symmetries obscures the choice of an appropriate metric and complicates the computation of scaling entropy corrections. As is also clear from the results shown in Fig. 7, the finite-time upper and lower bounds converge sufficiently quickly for the Jarzynski averaging to markedly improve even the shortest-time results. More general “non-linear” combinations of scaling and shifting may also be used to advantage, as in [19].
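A corresponding fragment for shifting; the equally spaced reference path is an assumption chosen by symmetry for the uniform compression, in the spirit of the text:

```python
import numpy as np

def reference_path(lam, n=30, L0=30.0, L1=26.0):
    """A lambda-dependent reference configuration: n equally spaced
    sites in a box shrinking linearly from L0 to L1 (chosen by symmetry)."""
    L = L0 + lam * (L1 - L0)
    return (np.arange(n) - (n - 1) / 2) * (L / n)

def physical_positions(x_fluct, lam):
    """Coordinate shifting: x_i = x_i^fluct + x_i^ref(lam). Only the
    fluctuation coordinates are sampled by MC; the concerted motion is
    carried by the reference path. The metric is unchanged, so no
    entropic correction is needed."""
    return x_fluct + reference_path(lam)
```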
Figure 6. Convergence of upper and lower bounds to the free-energy change associated with the compression of the confined Lennard–Jones system at kBT = 0.35 as a function of the switching time tsim. The outer pair of lines is from standard finite-time switching, whereas the inner pair represents the results from finite-time switching using linear metric scaling. The vertical bars represent the standard error in the mean of 100 replicas.
Figure 7. Convergence of upper and lower bounds to the free-energy change associated with the compression of the confined Lennard–Jones system at kBT = 0.35 as a function of the switching time tsim, as obtained by optimized coordinate shifting. The vertical bars represent the standard error in the mean of 21 replicas. The results obtained with Jarzynski averages are also shown.
5. Outlook
One of the most fundamental and challenging applications of atomistic simulation techniques concerns the determination of those thermodynamic properties that require knowledge of the entropy: the entropy itself, the chemical potential and the various free energies are all examples of thermal thermodynamic properties. In contrast to their mechanical counterparts (e.g., enthalpy, pressure), they cannot be computed as ensemble (or time) averages, and indirect strategies must be adopted. Here, we have discussed the basic aspects of a particular strategy, that of using nonequilibrium simulations to obtain estimators of the reversible work between equilibrium states. The point of this approach is that, in contrast to equilibrium methods such as thermodynamic integration, the desired value can, in principle, be estimated from a single simulation. But there is a trade-off, in that the nonequilibrium estimators are subject to both systematic and statistical errors, caused by the inherently irreversible nature of nonequilibrium processes.
Yet, the approach allows one to systematically obtain upper and lower bounds to the desired reversible result by exploring the nonequilibrium processes in both the forward and backward directions. The bounds for a given process become tighter with decreasing process rates. But more importantly, it is possible to optimize the nonequilibrium process so as to minimize irreversibility and, for a given process time, tighten the bounds. We have discussed a number of methods by which to conduct this optimization task, including explicit functional optimization using standard variational calculus and techniques based on special coordinate transformations aimed at the reduction of irreversibility. These techniques have been quite successful so far, allowing accurate free-energy measurements using relatively short nonequilibrium simulations. In this light, the idea of using nonequilibrium simulations has now grown into a robust and efficient computational approach to the problem of computing thermal thermodynamic properties using atomistic simulation methods. Nevertheless, further development remains necessary, in particular toward improving and generalizing the existing optimization schemes.
References

[1] G. Gilmer and S. Yip, Handbook of Materials Modeling, vol. I, chap. 2.14, Kluwer, 2004.
[2] J. Li, Handbook of Materials Modeling, vol. I, chap. 2.8, Kluwer, 2004.
[3] M.E. Tuckerman, Handbook of Materials Modeling, vol. I, chap. 2.9, Kluwer, 2004.
[4] D.A. Kofke and D. Frenkel, Handbook of Materials Modeling, vol. I, chap. 2.14, Kluwer, 2004.
[5] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
[6] L.D. Landau and E.M. Lifshitz, Statistical Physics, Part 1, 3rd edn., Pergamon Press, Oxford, 1980.
[7] J.E. Hunter III, W.P. Reinhardt, and T.F. Davis, “A finite-time variational method for determining optimal paths and obtaining bounds on free energy changes from computer simulations,” J. Chem. Phys., 99, 6856, 1993.
[8] L.W. Tsao, S.Y. Sheu, and C.Y. Mou, “Absolute entropy of simple point charge water by adiabatic switching processes,” J. Chem. Phys., 101, 2302, 1994.
[9] M. de Koning and A. Antonelli, “Einstein crystal as a reference system in free energy estimation using adiabatic switching,” Phys. Rev. E, 53, 465, 1996.
[10] M. de Koning and A. Antonelli, “Adiabatic switching applied to realistic crystalline solids: vacancy-formation free energy in copper,” Phys. Rev. B, 55, 735, 1997.
[11] R. Courant and D. Hilbert, Methods of Mathematical Physics, vol. 1, Wiley, New York, 1953.
[12] M. de Koning, A. Antonelli, and S. Yip, “Optimized free energy evaluation using a single reversible-scaling simulation,” Phys. Rev. Lett., 83, 3973, 1999.
[13] C. Jarzynski, “Nonequilibrium equality for free energy differences,” Phys. Rev. Lett., 78, 2690, 1997.
[14] D.A. Pearlman and P.A. Kollman, “The lag between the Hamiltonian and the system configuration in free energy perturbation calculations,” J. Chem. Phys., 91, 7831, 1989.
[15] H.S. Leff and A.F. Rex, Maxwell's Demon 2: Entropy, Classical and Quantum Information, Computing, Institute of Physics Publishing, Bristol, UK, 2002.
[16] M.A. Miller and W.P. Reinhardt, “Efficient free energy calculations by variationally optimized metric scaling: concepts and applications to the volume dependence of cluster free energies and to solid–solid phase transitions,” J. Chem. Phys., 113, 7035, 2000.
[17] L.M. Amon and W.P. Reinhardt, “Development of reference states for use in absolute free energy calculations of atomic clusters with application to 55-atom Lennard–Jones clusters in the solid and liquid states,” J. Chem. Phys., 113, 3573, 2000.
[18] W.P. Reinhardt, M.A. Miller, and L.M. Amon, “Why is it so difficult to simulate entropies, free energies and their differences?” Accts. Chem. Res., 34, 607, 2001.
[19] C. Jarzynski, “Targeted free energy perturbation,” Phys. Rev. E, 65, 046122, 2002.
2.16 ENSEMBLES AND COMPUTER SIMULATION CALCULATION OF RESPONSE FUNCTIONS

John R. Ray
1190 Old Seneca Road, Central, South Carolina 29630, USA
1. Statistical Ensembles and Computer Simulation
Calculation of thermodynamic quantities in molecular dynamics (MD) and Monte Carlo (MC) computer simulations is a useful, often employed tool [1–3]. In this procedure one chooses a particular statistical ensemble for the computer simulation. Historically, this was the microcanonical, or (EhN), ensemble for MD and the canonical, or (ThN), ensemble for MC, but there are several choices available for MD or MC. The notations (EhN) and (ThN) denote ensembles by the thermodynamic state variables that are constant in an equilibrium simulation: energy E, shape-size matrix h, particle number N and temperature T. (There could be other thermodynamic state variables, gi, i = 1, 2, . . . , such as electric or magnetic fields applied to the system, and these additional variables would appear in the defining brackets.) The shape-size matrix is made up of the three vectors defining the computational MD or MC cell. If the vectors defining the parallelepiped containing the particles in the computational cell are denoted (a, b, c), then the 3 × 3 shape-size matrix is defined by having its columns constructed from the three cell vectors, h = (a, b, c). The volume V of the computational cell is related to the h matrix by V = det(h). For simplicity, we assume that the atoms in the simulation are described by classical physics using an effective potential-energy function to describe the inter-particle interactions. Unless explicitly stated otherwise we suppose that periodic boundary conditions are applied to the particles in the computational cell. The periodic boundary conditions have the effect of removing surface effects and, conveniently, making the calculated system properties approximately equal to those of bulk matter. We assume the system obeys
the Born–Oppenheimer approximation and can be described by a potential energy U using classical mechanics and classical statistical mechanics.
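As a small concrete sketch (with arbitrary illustrative cell vectors, ours), the shape-size matrix, the volume relation V = det(h), and the use of h to map scaled coordinates to Cartesian positions under periodic boundary conditions look as follows:

```python
import numpy as np

# Cell vectors a, b, c as the columns of the shape-size matrix h.
a = np.array([10.0, 0.0, 0.0])
b = np.array([ 1.0, 9.0, 0.0])
c = np.array([ 0.5, 0.3, 8.0])
h = np.column_stack((a, b, c))

V = np.linalg.det(h)       # cell volume V = det(h); here 10*9*8 = 720
print(V)

# With periodic boundary conditions a particle is conveniently carried
# in scaled coordinates s in [0,1)^3; its Cartesian position is r = h s,
# and wrapping s modulo 1 implements the periodic images.
s = np.array([0.25, 0.50, 0.75])
r = h @ s
```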
2. Ensembles
For a single component system there are eight basic ensembles that are convenient to introduce. These ensembles and their connection to their reservoirs are shown in Fig. 1 [4]. Each ensemble represents a system in contact with different types of reservoirs. These eight systems are physically realizable and each can be employed in MD or MC simulations. The combined reservoir is a thermal reservoir, a tension (or stress) and pressure reservoir (the pressure reservoir in Fig. 1 represents a tension and pressure reservoir) and a chemical potential reservoir. The reservoirs are used to impose, respectively,
Figure 1. Shown are the eight ensembles for a single-component system. The systems interact through a combined temperature, pressure and chemical potential reservoir. The ensembles on the left are adiabatically insulated from the reservoir, while those on the right are in thermal contact with the reservoir. Pistons and porous walls allow for volume and particle exchange. Adiabatic walls are shown cross-hatched, while diathermal walls are shown as solid lines. Ensembles on the same line, like a and e, are related by Laplace and inverse Laplace transformations. The pressure reservoir stands for both the pressure and the tension.
constant temperature, tension and pressure, and chemical potential. The eight ensembles naturally divide into pairs. The left-hand column in Fig. 1, a–d, contains the constant-energy ensembles, while the ensembles in the right-hand column, e–h, have constant temperature. These pairs of ensembles are connected to each other by direct and inverse Laplace transformations, a ↔ e, etc. The energies that are associated with each ensemble are related to the internal energy E by Legendre transformations [4]. The eight ensembles may be defined using the state variables that are held constant in the ensemble ([5] pp. 293–304). The eight ensembles include the (EhN) and (ThN) ensembles introduced earlier. Another pair of ensembles is the (HtPN) and (TtPN) ensembles, where H = E + V0 Tr(tε) + PV is the enthalpy, tij is the thermodynamic tension tensor, εij the strain tensor, P the pressure and Tr represents the trace operation. The thermodynamic tension is a modified stress tensor applied to the system that is introduced in the thermodynamics of anisotropic media. Due to definitions in the thermodynamics of non-linear elasticity we denote the tension and pressure separately. A third pair of ensembles is the (Lhµ) and (Thµ) ensembles, where L is the Hill energy L = E − µN and µ the chemical potential of the one-component system. The isothermal member of this latter pair is Gibbs' grand canonical ensemble, the (Thµ) ensemble. The final pair of ensembles is the (RtPµ) and (TtPµ) ensembles, where R = E + V0 Tr(tε) + PV − µN is the R-energy. The latter member of this pair was introduced by Guggenheim [6] and is interesting since it involves all intensive variables, T, P, µ, which are all held fixed, although we know only two of them can be independent. Nevertheless, this ensemble can be used in simulations, although the system size will increase or decrease during the simulation. The (RtPµ) ensemble allows variable particle number along with variable shape/size. These last four ensembles all have constant chemical potential and variable particle number. For multi-component systems there is a series of hybrid ensembles that are useful. As an example, for two-component systems we can use the (TtPµ1N2) ensemble, which is useful for studying the absorption of species 1 in species 2, as for example the absorption of hydrogen gas in a solid [7, 8]. Each of the eight ensembles for a single-component system may be simulated using either MD or MC. The probability distributions are exponentials for the isothermal ensembles and power laws for the adiabatic ensembles. For example, for the (TVN) ensemble the probability density has the Boltzmann form P(q; TVN) = C e^{−U(q)/(kBT)}, with U(q) the potential energy and C a constant. For the (HtPN) ensemble, P(q; H, t, P, N) = C V^N (H − V0 Tr(tε) − PV − U(q))^{3N/2−1}. The trial MC moves involve particle moves and shape/size-matrix moves [9]. For the (RtPµ) ensemble, MC moves involve particle moves, shape/size-matrix moves and attempted creation and destruction events [10]. For MC simulation of these ensembles one uses the probability density directly in the simulation, whereas for MD simulations
ordinary differential equations of motion arising from Hamilton's equations are solved. An important advancement in using MD to simulate different ensembles was the extended-variable approach introduced by Andersen [11]. In this approach, some variation of which is used in all but the (EhN) ensemble, extra variables are introduced into the system to generate the fluctuations of the corresponding ensemble variable. Although these variations are fictitious, it can be proven that the correct ensemble is generated by these extended-variable schemes. In the original approach, for the (HPN) ensemble, Andersen introduced an equation of motion for the volume that responds to a force given by the difference between the internal microscopic pressure and an external constant pressure imposed by the reservoir. This leads to volume fluctuations that are appropriate to the (HPN) ensemble, see Fig. 1. Nosé thereafter generalized MD to the isothermal ensembles by introducing a mass-scaling variable that allows for energy fluctuations in the (ThN) and the other isothermal ensembles [12]. These energy fluctuations mimic the interaction of the system with the heat reservoir and allow MD to generate the probability densities of the isothermal ensembles. Which ensemble or ensembles to use, and whether to use MD or MC, depends on user preference and the particular problem under consideration. For the variable-particle-number ensembles (those involving the chemical potential in their designation) one usually employs MC methods, since simulations using these ensembles involve attempted creation and destruction of particles and this fits naturally with the stochastic nature of the MC method. However, MD simulations of these ensembles have been investigated and performed [13].
3. Response Function Calculation
Response functions are thermodynamic properties of the system that are often measured, such as specific heats, heat capacities, expansion coefficients, and elastic constants, to name a few. Response functions are associated with derivatives of the basic thermodynamic state variables, like energy, pressure and entropy, and include the basic thermodynamic state variables themselves. We do not include (non-equilibrium) transport properties, such as thermal conductivity, electrical conductivity, and viscosity, in our discussion, since they fall under a different calculation scheme that uses time correlation functions [14]. Formulas that may be used to calculate response functions in simulations may be derived by differentiation of quantities connecting thermodynamic state variables with integrals over functions of microscopic particle variables. These formulas are specific to each ensemble, and are standard statistical-mechanics relations. Such a quantity, in the canonical ensemble, is the partition
function Z(T, h, N), which for an N-particle system in three dimensions has the form

Z(T, h, N) = (1/(N!(2πℏ)^{3N})) ∫ e^{−H(q,p,h)/kBT} d^{3N}q d^{3N}p,   (1)
where q and p denote the 6N-dimensional phase-space canonical coordinates of the system, H the system Hamiltonian, kB Boltzmann's constant, ℏ Planck's constant, and dτ = d^{3N}q d^{3N}p the phase-space volume element. The integral in Eq. (1) is carried out over the entire phase space. Although we have indicated that the Hamiltonian depends on the cell vectors, h, it may also depend on additional thermodynamic state variables gi. For liquids and gases the dependence on h is replaced by a simple dependence on the volume V; for discussions of the elastic properties of solids it is important to include the dependence on the shape and size of the system through the shape-size matrix h or some function of h. The Helmholtz free energy A(T, h, N) is obtained from the canonical-ensemble partition function:

A(T, h, N) = −kBT ln Z(T, h, N).   (2)
Average values of phase-space functions may be calculated using the phase-space probability, which for the canonical ensemble is the integrand in the partition function in Eq. (1). For example, the canonical-ensemble average of the phase-space function f(q, p, h) is

⟨f⟩ = ∫ f e^{−H/kBT} dτ / ∫ e^{−H/kBT} dτ.   (3)
In an MD or MC simulation the thermodynamic quantity ⟨f⟩ is calculated by using a simple average over the simulation configurations; for MD this is an average over time, whereas for MC it is an average over the Markov chain of configurations generated. If the value of f at each configuration (each value of q, p, h) is fn, n = 1, 2, 3, . . . , M, for M time-steps in MD or trials in MC, then the average of f for the simulation is

⟨f⟩ = (1/M) Σ_{n=1}^{M} fn.   (4)

In the simulation, Eq. (4) is the approximation to the phase-space average in Eq. (3). If, for example, f = H, then this average gives the thermodynamic energy E = ⟨H⟩ and the caloric equation of state E = E(T, h, N). The assumption that Eq. (4) approximates the average in Eq. (3) is often referred to in the literature by saying that MD or MC “generates the ensemble”. The approximate equality of these two results in MD is the quasi-ergodic hypothesis of statistical mechanics, which states that ensemble averages, Eq. (3), and time averages, Eq. (4), are equal. This hypothesis has never been proven
for realistic Hamiltonians, but it is the pillar on which statistical mechanics rests. In what follows we shall assume that averages over simulation-generated configurations are equal to statistical mechanics ensemble averages. Thus, we use formulas from statistical mechanics but calculate the average values in simulations using Eq. (4), employing MD or MC. An important point to note is that for the calculation of meaningful averages in a simulation we must "equilibrate" the system before collecting the values f_n in Eq. (4). This is done by carrying out the simulation for a "long enough time," then discarding these configurations and starting the averaging from that point. This prevents transient behavior, associated with the particular initial conditions used to start the simulation, from overly influencing the average in Eq. (4). How long one must "equilibrate" the system depends on relaxation rates in the system, which are initially unknown. Tasks like the equilibration of the system, the estimation of the accuracy of calculated values, and so forth are part and parcel of the art of carrying out valid and, therefore, useful simulations, and must be learned by actually carrying out simulations. In this aspect computer simulations have a similarity to experimental science, like gaining experience with the measuring apparatus, but, of course, they are theoretical calculations made possible by computers. From our discussion so far it might seem, to those who know thermodynamics, that the problem of calculating all response functions is finished, since if the Helmholtz free energy is known from Eq. (2) then all response functions may be calculated by differentiation of the Helmholtz free energy with respect to various variables. For example, the energy ⟨H⟩ may be found from

\langle H \rangle = -k_B T^2\, \frac{\partial (A/k_B T)}{\partial T}. \qquad (5)
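The equilibrate-then-average workflow of Eq. (4) is easy to express in code. The following is a minimal sketch, not tied to any particular MD or MC engine; the synthetic data, relaxation time, and sample counts are assumptions chosen only to show why the equilibration period must be discarded.

```python
import numpy as np

def simulation_average(f_values, n_equil):
    """Average a sampled quantity as in Eq. (4), discarding the first
    n_equil configurations as equilibration (transient) data."""
    production = np.asarray(f_values)[n_equil:]
    return production.mean()

# Toy example: a synthetic "observable" that relaxes from a transient
# initial value toward its stationary mean before fluctuating about it.
rng = np.random.default_rng(0)
M = 20000
transient = 5.0 * np.exp(-np.arange(M) / 500.0)       # decaying initial transient
f_n = 1.0 + transient + 0.1 * rng.standard_normal(M)  # stationary mean is 1.0

print(simulation_average(f_n, n_equil=0))     # biased by the transient
print(simulation_average(f_n, n_equil=5000))  # close to the stationary value 1.0
```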
Unfortunately, in MC or MD only average values like Eq. (3), which are ratios of phase space integrals, can be easily evaluated in simulations, and not the 6N-dimensional phase space integral itself, like Eq. (1). The reason for this is that in high dimensions (dimensions greater than, say, 10) the numerical methods used to calculate integrals accurately (e.g., Simpson's rule) require computer resources beyond those presently available. For example, in 10 dimensions, for a grid of 100 intervals in each dimension, 10^20 grid points are required. Even with the most advanced computer, this number of variables is not easy to handle. In a typical simulation the dimension is hundreds or thousands, not ten. One might think that the high-dimensional integrals could be calculated directly by MD or MC methods, but this also does not work, since the integrand in the high-dimensional phase space is rapidly varying and one cannot sample for long enough to smooth out this rapid variation. The integral is determined by the value of the integrand in a few pockets ("equilibrium pockets") in phase space that will only be sampled infrequently. For the ratio of high-dimensional integrals, MD or MC methods have the
effect of focusing the sampling on just those important regions. The difficulty, in high dimensions, of calculating quantities that require the evaluation of an integral, as compared to a ratio of integrals, leads to a classification of quantities to be calculated by computer simulation as thermal or mechanical properties. Thermal properties require the value of the partition function, or some other high-dimensional integral, for their evaluation, whereas mechanical properties do not require the value of the partition function for their evaluation, but are a ratio of two high-dimensional integrals. As examples, for the canonical ensemble the Helmholtz free energy is a thermal variable and the energy is a mechanical variable. Other thermal variables are the entropy, chemical potential, and Gibbs free energy. Other mechanical variables are temperature, pressure, enthalpy, thermal expansion coefficient, elastic constants, heat capacity, and so forth. Special methods must be developed for calculating thermal properties, and the calculation of thermal properties is, in general, more difficult. We have developed novel methods to calculate thermal variables using different ensembles [15, 16] but shall not discuss them in detail in this contribution. As an example of the calculation of a mechanical response function, consider the fluctuation formula for the heat capacity in the canonical ensemble. Differentiation of the average energy ⟨H⟩ in Eq. (3) with respect to T, while holding the cell vectors rigid, leads to the heat capacity at constant shape-size, C_V:

C_V = \frac{\partial \langle H \rangle}{\partial T} = \frac{1}{k_B T^2}\,\bigl(\langle H^2 \rangle - \langle H \rangle^2\bigr). \qquad (6)
Recall that in the simulation the average values in Eq. (6) are approximated by simple averages of the quantity. Thus, in a single canonical ensemble simulation, MC or MD, we may calculate the heat capacity of the system at the given thermodynamic state point by calculating the average value of the square of the energy, subtracting the square of the average energy, and dividing by k_B T^2. The quantity

\delta H^2 = \langle H^2 \rangle - \langle H \rangle^2, \qquad (7)

the variance in probability theory, is called the fluctuation in the energy H. The fluctuations of quantities enter into the formulas for the response functions of mechanical variables. It should be noted that a direct way of calculating the heat capacity C_V is to calculate the thermal equation of state at a number of temperatures and then numerically differentiate ⟨H⟩ with respect to T. This requires a series of simulations, and it is not as convenient, nor as easy to estimate the accuracy, but it is simple and provides a useful check on the value obtained from the fluctuation formula, Eq. (6). We refer to this method of calculating response functions as the direct method. Any mechanical response function can, in principle, be calculated by the direct method.
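As an illustration of the two routes to C_V just described, the sketch below computes the heat capacity of a toy system both from the fluctuation formula, Eq. (6), and by the direct method of numerically differentiating the caloric equation of state. The test system (independent classical harmonic oscillators, for which C_V = N k_B exactly) and the reduced units are assumptions made only for this example.

```python
import numpy as np

kB = 1.0  # reduced units

def cv_fluctuation(H_samples, T):
    """Heat capacity from the fluctuation formula, Eq. (6):
    C_V = (<H^2> - <H>^2) / (k_B T^2)."""
    H = np.asarray(H_samples)
    return (np.mean(H**2) - np.mean(H)**2) / (kB * T**2)

def cv_direct(E_of_T, T, dT):
    """Direct method: centered numerical derivative of the thermal
    equation of state, C_V ~ (E(T+dT) - E(T-dT)) / (2 dT)."""
    return (E_of_T(T + dT) - E_of_T(T - dT)) / (2.0 * dT)

# Toy system: N classical 1D harmonic oscillators (2N quadratic degrees
# of freedom), for which E = N k_B T and Var(H) = N (k_B T)^2 exactly.
rng = np.random.default_rng(1)
N, T = 100, 2.0
dof = rng.standard_normal((50000, 2 * N)) * np.sqrt(kB * T)
H_samples = np.sum(0.5 * dof**2, axis=1)

print(cv_fluctuation(H_samples, T))             # ~ N k_B = 100
print(cv_direct(lambda T: N * kB * T, T, 0.1))  # exactly 100
```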
4. Thermodynamics of Anisotropic Media
For the present we choose the reference state to be the equilibrium state of the system with zero tension applied to the system. The h matrix for this reference state is h_0, while for an arbitrary state of tension we have h. The following formulation of the thermodynamics of anisotropic media is consistent with nonlinear or finite elasticity theory. In the following, repeated indices are summed over. The elastic energy U_el is defined by

U_{el} = V_0\, \mathrm{Tr}(t\varepsilon), \qquad (8)
where V_0 is the reference volume, t is the thermodynamic tension tensor, ε is the strain tensor, and Tr implies the trace. The h matrix maps the particle coordinates into fractional coordinates s_{ai} in the unit cube through the relation x_{ai} = h_{ij} s_{aj}. The strain of the system relative to the unstressed state is

\varepsilon_{ij} = \tfrac{1}{2}\,\bigl(h_0^{-T}\, G\, h_0^{-1} - I\bigr)_{ij}, \qquad (9)
where G = h^T h is the metric tensor. Here h_0 is the reference value for measuring strain, that is, the value of h when the system is unstrained. This value can be obtained by carrying out an (HtPN) simulation, MD or MC, with the tension set to zero. Equation (9) can be derived by noting that the deformation gradient can be written in terms of the h matrices as ∂x_i/∂x_{0j} = h_{ik} h^{-1}_{0kj}, and using this in the defining relation for the Lagrangian strain of the system. The thermodynamic tension tensor is defined so that the work done in an infinitesimal distortion of the system is given by dW = V_0 Tr(t dε). The stress tensor, σ, is related to the thermodynamic tension by

\sigma = V_0\, h\, h_0^{-1}\, t\, h_0^{-T}\, h^T / V. \qquad (10)
The thermodynamic law is

T\, dS = dE + V_0\, \mathrm{Tr}(t\, d\varepsilon), \qquad (11)
where T is the temperature, S the entropy, and E the energy of the particles. Using the definition of the strain, Eq. (9), the thermodynamic law can be recast as

T\, dS = dE + \tfrac{1}{2} V_0\, \mathrm{Tr}(h_0^{-1}\, t\, h_0^{-T}\, dG). \qquad (12)
From this latter we obtain

(\partial E/\partial G_{kn})_S = -\tfrac{1}{2}\,(V_0\, h_0^{-1}\, t\, h_0^{-T})_{kn}. \qquad (13)
In the (EhN) ensemble we have the general relation

(\partial E/\partial G_{kn})_S = \langle \partial H/\partial G_{kn} \rangle, \qquad (14)
where H is the particle Hamiltonian and the average is the (EhN) ensemble average. Combining the last two equations leads to

\langle \partial H/\partial G_{kn} \rangle = -\tfrac{1}{2}\,(V_0\, h_0^{-1}\, t\, h_0^{-T})_{kn}. \qquad (15)
The particle Hamiltonian is transformed by the canonical transformation x_{ai} = h_{ij} s_{aj}, p_{ai} = (h^{-T})_{ij} \pi_{aj}, into

H(s_a, \pi_a, h) = \tfrac{1}{2} \sum_{a=1}^{N} \pi_{ai}\, G^{-1}_{ij}\, \pi_{aj}/m_a + U(r_{12}, r_{13}, \ldots), \qquad (16)
where the distance between particles a and b is to be replaced by the relationship r_{ab}^2 = s_{abi} G_{ij} s_{abj}, and s_{abi} is the fractional coordinate difference between a and b. The microscopic stress tensor \Pi_{ij} may be obtained by differentiation of the particle Hamiltonian with respect to the h matrix while holding (s_a, \pi_a) constant: \partial H/\partial h_{ij} = -\Pi_{ik} A_{kj}, where A is the area tensor, A = V h^{-T}. For the Hamiltonian, Eq. (16), the microscopic stress tensor is

\Pi_{ij} = \frac{1}{V}\Bigl[\sum_a p_{ai}\, p_{aj}/m_a - \sum_{a<b} \frac{\partial U}{\partial r_{ab}}\, x_{abi}\, x_{abj}/r_{ab}\Bigr]. \qquad (17)
Differentiating the Hamiltonian with respect to the parameters G_{kn} we obtain

M_{kn} \equiv \partial H/\partial G_{kn} = -\tfrac{1}{2}\,(V\, h^{-1}\, \Pi\, h^{-T})_{kn}, \qquad (18)
where \Pi is the microscopic stress tensor, Eq. (17). If the average value of Eq. (18) is combined with Eq. (15) we obtain

t = V\, h_0\, h^{-1}\, \langle \Pi \rangle\, h^{-T}\, h_0^{T} / V_0. \qquad (19)
Comparing Eq. (19) and Eq. (10) we find

\sigma = \langle \Pi \rangle, \qquad (20)

that is, the stress tensor is the average of the microscopic stress tensor. Equation (20) holds in all ensembles, but the proofs would be different. For the (ThN) ensemble we would use the Helmholtz free energy A = E − TS instead of the energy E. The counterpart to Eq. (14) would be (\partial A/\partial G_{kn})_T = \langle \partial H/\partial G_{kn} \rangle.
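Since Eq. (20) says that the macroscopic stress is simply the simulation average of \Pi, a practical calculation only needs Eq. (17) evaluated at each sampled configuration. The following is a minimal sketch for a pair potential; the free-cluster geometry (no periodic boundary conditions), the Lennard-Jones example, and the function names are illustrative assumptions, not part of the original text.

```python
import numpy as np

def microscopic_stress(positions, momenta, masses, dU_dr, volume):
    """Microscopic stress tensor of Eq. (17) for a pair potential:
    Pi_ij = (1/V) [ sum_a p_ai p_aj / m_a
                    - sum_{a<b} (dU/dr_ab) x_abi x_abj / r_ab ].
    A free cluster (no periodic boundaries) is assumed for simplicity."""
    pi = np.einsum('ai,aj->ij', momenta, momenta / masses[:, None]) / volume
    N = len(positions)
    for a in range(N - 1):
        for b in range(a + 1, N):
            x_ab = positions[a] - positions[b]
            r_ab = np.linalg.norm(x_ab)
            pi -= dU_dr(r_ab) * np.outer(x_ab, x_ab) / (r_ab * volume)
    return pi

def lj_dU_dr(r):
    """dU/dr for the Lennard-Jones 6-12 potential in reduced units."""
    return -48.0 * r**-13 + 24.0 * r**-7

rng = np.random.default_rng(2)
pos = rng.uniform(0.0, 5.0, size=(10, 3))
mom = rng.standard_normal((10, 3))
print(microscopic_stress(pos, mom, np.ones(10), lj_dU_dr, volume=125.0))
```

In a simulation, Eq. (20) would then be approximated by averaging this matrix over the sampled configurations, Eq. (4).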
5. Calculation of Elastic Constants in the (EhN) Ensemble

In order to discuss the calculation of the elastic constants we describe the system by the microcanonical (EhN) ensemble. The adiabatic elastic constants are defined as the derivative of the tension with respect to the strain,

C^{(S)}_{ijkl} = -(\partial t_{ij}/\partial \varepsilon_{kl})_S. \qquad (21)
Note the minus sign in Eq. (21) implies that the tension and stress are positive for compressive loading. Often the opposite convention is employed, and no minus sign occurs in Eq. (21) in that convention. In the literature of finite elasticity the elastic constants defined in Eq. (21) are often called stiffness coefficients or elastic moduli. Assume the system Hamiltonian describing the system has the form

H(x_a, p_a) = \tfrac{1}{2} \sum_{a=1}^{N} p_a^2/m_a + U(r_{12}, r_{13}, \ldots), \qquad (22)
where p_a is the momentum of particle a, r_{ab} is the distance between particles a and b, and the system contains N particles. Let the reference value h_0 denote the shape-size matrix for the unstressed system, and let h represent an arbitrary state of stress. The (EhN) fluctuation formula involving the adiabatic elastic constants, for a potential that depends only on interparticle distances, has the form

V_0\, h^{-1}_{0ip}\, h^{-1}_{0jq}\, h^{-1}_{0kr}\, h^{-1}_{0ns}\, C^{(S)}_{pqrs} = -4\,\delta(M_{ij} M_{kn})/k_B T + 2 N k_B T\,(G^{-1}_{in} G^{-1}_{jk} + G^{-1}_{ik} G^{-1}_{jn}) + \Bigl\langle \sum_{a<b} \sum_{c<d} k(a,b,c,d)\, s_{abi}\, s_{abj}\, s_{cdk}\, s_{cdn} \Bigr\rangle, \qquad (23)

where

k(a,b,c,d) = \bigl(\partial^2 U/\partial r_{ab}\,\partial r_{cd} - (\partial U/\partial r_{ab})\,\delta_{ac}\,\delta_{bd}/r_{ab}\bigr)/(r_{ab}\, r_{cd}). \qquad (24)
The averages in Eq. (23) are calculated using (EhN) simulations, MD or MC. In (EhN) MD we would solve Newton's laws for the motion of the particles, m_a \ddot{x}_{ai} = -\partial U/\partial x_{ai}, to generate configurations to be used to calculate averages, Eq. (4). In MC we would use the probability density [17] W(q) = C(E - U(q))^{3N/2-1} to generate configurations by attempting a trial move of an atom, q → q(trial), and accepting the move if W(q(trial))/W(q) > ξ, where ξ is a random number uniformly distributed between 0 and 1. Equations (23) and (24) also hold for the isothermal elastic constants if one replaces C^{(S)}_{pqrs} by the isothermal elastic constants C^{(T)}_{pqrs} and calculates the average values in Eq. (23) using (ThN) simulations. The three distinct terms in Eq. (23) are called the fluctuation term (the term involving the fluctuation of M), the kinetic term (the term with multiplier 2Nk_BT), and the Born term (the term containing k(a,b,c,d)). Equations (23) and (24) are valid for any potential that depends only on the distances between particles; they are valid for many-body forces as long as these can be written in terms of only the distances between particles. In particular, this would include potentials that depend on tetrahedral and dihedral angles and, therefore, have many-body forces.
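The microcanonical MC acceptance rule quoted above, W(q(trial))/W(q) > ξ with W(q) = C(E − U(q))^{3N/2−1}, can be written compactly as follows. This is a sketch of a single trial move under stated assumptions (a toy harmonic potential in the usage lines, and outright rejection whenever the trial configuration would make E − U negative); it is not a complete (EhN) MC code.

```python
import numpy as np

def microcanonical_mc_step(q, U, E, N, max_disp, rng):
    """One trial move of microcanonical-ensemble MC [17]: displace a random
    atom and accept if W(q_trial)/W(q) > xi, where W(q) = C (E - U(q))^(3N/2-1)
    and xi is uniform on (0, 1)."""
    q_trial = q.copy()
    atom = rng.integers(len(q))
    q_trial[atom] += rng.uniform(-max_disp, max_disp, size=q.shape[1])
    U_old, U_new = U(q), U(q_trial)
    if U_new >= E:                 # zero kinetic-energy weight: always reject
        return q, False
    ratio = ((E - U_new) / (E - U_old)) ** (1.5 * N - 1.0)
    if ratio > rng.random():
        return q_trial, True
    return q, False

# Toy usage: 3 atoms in a harmonic well, total energy E in reduced units.
rng = np.random.default_rng(3)
q = rng.standard_normal((3, 3))
U_harm = lambda q: 0.5 * np.sum(q**2)
for _ in range(1000):
    q, accepted = microcanonical_mc_step(q, U_harm, E=20.0, N=3, max_disp=0.3, rng=rng)
```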
For pairwise additive potentials the last term in Eq. (23) reduces to the simpler form of a single sum over pairs, \langle \sum_{a<b} k(a,b,a,b)\, s_{abi}\, s_{abj}\, s_{abk}\, s_{abn} \rangle. A sketch of the derivation of Eq. (23) considers the phase space integral

X_{ij} = \int M_{ij}\, \theta(E - H(q,p))\, d\tau, \qquad (25)

where M is defined in Eq. (18) and θ is the unit step function. For Eq. (25) applied to a large system one can keep, to good approximation, only the largest term, or

\langle M_{ij} \rangle = \Omega^{-1} \int M_{ij}\, \theta(E - H(q,p))\, d\tau, \qquad (26)
where \Omega is the phase volume inside the energy shell H(q,p) = E. The entropy is related to the phase volume by the Boltzmann relation S = k_B \ln \Omega. Differentiation of Eq. (26) with respect to G_{kn} leads to

V_0\, h^{-1}_{0ip}\, h^{-1}_{0jq}\, h^{-1}_{0kr}\, h^{-1}_{0ns}\, C^{(S)}_{pqrs} = -4\,\delta(M_{ij} M_{kn})/k_B T + 4\,\langle \partial^2 H/\partial G_{ij}\, \partial G_{kn} \rangle. \qquad (27)
Calculating the last term in Eq. (27) leads to Eqs. (23) and (24). More rigorous derivations of Eq. (23) are discussed by Ray [18]. Equations (23) and (24) have been used to calculate elastic constants in a nearest-neighbor Lennard–Jones (6–12) system in both the microcanonical and canonical ensembles using MD, and compared to calculations of these same quantities in earlier canonical ensemble MC calculations [19, 20]. These calculations have been reproduced by several workers and can now be used to check programs that are written to calculate elastic constants. Since there are thermodynamic relations connecting the adiabatic and isothermal elastic constants (like the thermodynamic formulas connecting C_V and C_P), this makes it possible to calculate the adiabatic elastic constants in either the (EhN) ensemble or the (ThN) ensemble, and the same for the isothermal elastic constants. A comparison of the values in the two ensembles can be looked upon as a stringent test of the validity of the Nosé [12] theory for isothermal MD simulations [20]. Equations (23) and (24) have also been used to calculate the elastic constants of crystalline and amorphous silicon modeled by the Stillinger–Weber potential [21, 22]. Equations (23) and (24) allow one to break down the Born
term into a two-body Born term and a three-body Born term for the Stillinger–Weber potential. These values have also been checked by a number of workers and can now be used as program checks. Equations (23) and (24) were generalized to apply to potentials with an explicit volume-dependent term, such as in metallic potentials, or when using the Ewald method to evaluate the Coulomb potential; the resulting theory was then applied to a model of sodium [23]. Another generalization was to study the calculation of the third-order elastic constants using a generalization of Eqs. (23) and (24) [24]. For systems where the reference state for measuring strain is a stressed state of the system, generalizations of Eqs. (23) and (24) are required; this extension, with calculations for a model of solid helium, has been developed [25]. A detailed application of Eqs. (23) and (24) to embedded atom method potentials for palladium was given by Wolf et al. [26]. Extension of (HtPN) calculations to higher order elastic constants was given by Ray [27].
6. Calculation of Elastic Constants in the (HtPN) and (TtPN) Ensembles
In these ensembles the shape-size, or strain, of the system fluctuates. The Parrinello–Rahman fluctuation formula for the elastic constants involves just this fluctuation [28]:

\delta(\varepsilon_{ij}\, \varepsilon_{kl}) = k_B T\, (C^{S}_{ijkl})^{-1} / V_0, \qquad (28)
where the adiabatic compliance tensor (C^S)^{-1} is the inverse of the elastic constant tensor,

(C^{S}_{ijkl})^{-1} = -(\partial \varepsilon_{ij}/\partial t_{kl})_S, \qquad (29)
and S is the entropy. The averages in Eq. (28) are calculated using (HtPN) MD or MC. The same formula, Eq. (28), holds in the (TtPN) ensemble if we change to the isothermal elastic constants and calculate averages using isothermal MD or MC. For MD the extended Hamiltonian for variable shape-size ensembles has the form [29]

H_1(s, \pi, h, \Pi_h, f, \rho) = \sum_a \pi_a^T G^{-1} \pi_a/(2 m_a f^2) + U + \mathrm{Tr}(\Pi_h^T \Pi_h)/(2W) + V_0\, \mathrm{Tr}(t\varepsilon) + PV + \rho^2/(2M) + (3N+1) k_B T_0 \ln f, \qquad (30)
where (s, π) are the scaled coordinates and conjugate momenta, U is the potential energy, (h, \Pi_h) are the coordinates and momenta of the computational cell, and
(f, ρ) are the Nosé mass scaling variable and its conjugate momentum. The constants W and M are introduced so that h and f satisfy dynamical equations; note that in classical statistical mechanics the equilibrium properties of the system are independent of the masses and, therefore, are independent of W, M, and the particle masses m_a. T_0 is the reservoir temperature in the constant temperature ensembles. The physical particle variables (x_a, p_a) are related to the scaled particle variables by x_a = h s_a, p_a = h^{-T} \pi_a / f. The relationship between the physical variables and the scaled variables may be described by a canonical transformation defined by h, along with a mass scaling transformation with f. The equations of motion following from this Hamiltonian may be written in the form

m_a f^2\, \ddot{s}_{ai} = -\sum_{b \neq a} (\partial U/\partial r_{ab})\, s_{abi}/r_{ab} - m_a (f^2 G^{-1}\dot{G} + 2 f \dot{f})\, \dot{s}_{ai}, \qquad (31)
W \ddot{h} = (\Pi - P I) A - h \Gamma, \qquad (32)
M \ddot{f} = 2K/f - (3N+1) k_B T_0/f, \qquad (33)
where \Gamma = V_0\, h_0^{-1}\, t\, h_0^{-T} is related to the tension applied to the system and K is the particle kinetic energy. Equation (31) is just Newton's law applied to the particles, with the additional modification of the variable cell and the mass scaling variables. Equation (32) is the Parrinello–Rahman equation [28] as generalized [29, 30] to be valid for finite deformations, which involves introducing the tension instead of the stress; this leads to the form of the enthalpy for finite elasticity in agreement with Thurston [31]. Equation (33) [12] is the equation of motion for the mass scaling variable, which is introduced to drive the average temperature of the system to the reservoir temperature T_0 in an equilibrium simulation. If the Nosé mass scaling variable satisfies f = 1, df/dt = 0, then Eqs. (31) and (32) are the MD equations of motion for the (HtPN) ensemble and the trajectories yield averages in this ensemble. If the cell matrix satisfies h = const., dh/dt = 0, then Eqs. (31) and (33) are the equations of motion for the (ThN) ensemble and the trajectories yield averages in this ensemble. If the previous conditions on h and f are both satisfied then the (EhN) ensemble is generated. If Eqs. (31)–(33) are solved in the general case with f and h varying, then the (TtPN) ensemble is generated. The variable cell equations of motion have great utility in studying solid–solid phase transformations by computer simulation. These same transformations can be studied using MC methods. In the (HtPN) ensemble the calculation of elastic constants in MD is not as good as in MC. That is, Eq. (28) converges faster using MC than MD. This is illustrated in detail by Fay and Ray, 1992 [9] and Karimi et al., 1998 [32]. However, (HtPN) MC elastic constant calculations do not converge as fast as (EhN) MD or MC. The convergence is governed by the fluctuation terms in either Eq. (23) or Eq. (28). The fluctuation of the microscopic stress tensor in Eq. (23) converges
faster than the fluctuation of the shape-size matrix in Eq. (28) in the cases we have investigated. This is unfortunate, since the (EhN) formulas require values of the second derivatives of the potential, whereas the (HtPN) fluctuation formulas require only first derivatives in MD, or no derivatives in MC. The derivatives of the potential may not be easy to calculate for a many-body potential, although one could employ computer algebra programs to calculate the derivatives. One can calculate elastic constants in the variable particle number ensembles, but we have not discovered a case where that offers any advantage over the four fixed particle number ensembles discussed. If the second derivatives of the potential can be evaluated or accurately approximated, then the (EhN) formulas, Eqs. (23) and (24), using either MD or MC, are the best choice for calculating the elastic constants. If the second derivatives of the potential are not available, then MC using the probability density P(q; H, t, P, N) = C V^N (H - V_0 \mathrm{Tr}(t\varepsilon) - PV - U(q))^{3N/2-1} together with Eq. (28) is the best choice. MC calculations in the (HtPN) ensemble also offer the advantage of not having to worry about the choice of the different fictitious kinetic energy and mass terms introduced in extended MD; these are not unique. Either Eq. (23) or Eq. (28) offers a convenient way of calculating elastic properties of condensed matter systems as a function of temperature or other parameters in a way that includes all anharmonic effects in an exact manner.
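As a concrete reading of Eq. (28), the sketch below estimates the 9×9 compliance matrix from the covariance of the strains, Eq. (9), computed from cell matrices h sampled along an (HtPN) or (TtPN) trajectory. The synthetic cell fluctuations in the usage lines are an assumption purely for demonstration; in a real calculation the h samples would come from the variable shape-size simulation itself.

```python
import numpy as np

def strain(h, h0_inv):
    """Lagrangian strain of Eq. (9): eps = (h0^-T G h0^-1 - I)/2, G = h^T h."""
    G = h.T @ h
    return 0.5 * (h0_inv.T @ G @ h0_inv - np.eye(3))

def compliance_from_fluctuations(h_samples, h0, T, V0, kB=1.0):
    """Strain-fluctuation estimate of Eq. (28): the compliance matrix
    (C^S)^-1_(ij)(kl) = V0 <delta eps_ij delta eps_kl> / (k_B T),
    indexed by flattened (ij) and (kl) pairs."""
    h0_inv = np.linalg.inv(h0)
    eps = np.array([strain(h, h0_inv).ravel() for h in h_samples])
    d_eps = eps - eps.mean(axis=0)
    return V0 * (d_eps.T @ d_eps) / (len(eps) * kB * T)

# Synthetic usage: small random cell fluctuations about a cubic reference.
rng = np.random.default_rng(4)
h0 = 5.0 * np.eye(3)
hs = [h0 + 0.01 * rng.standard_normal((3, 3)) for _ in range(5000)]
S = compliance_from_fluctuations(hs, h0, T=1.0, V0=np.linalg.det(h0))
print(S.shape)  # (9, 9)
```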
References

[1] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
[2] G. Ciccotti, D. Frenkel, and I.R. McDonald, Simulation of Liquids and Solids, North-Holland, Amsterdam, 1987.
[3] D. Frenkel and B. Smit, Understanding Molecular Simulations, Academic Press, New York, 1996.
[4] H.W. Graben and J.R. Ray, "Eight physical systems of thermodynamics, statistical mechanics, and computer simulation," Mol. Phys., 80, 1183–1193, 1993.
[5] M.W. Zemansky and R.H. Dittman, Heat and Thermodynamics, 7th edn., McGraw-Hill, New York, 1997.
[6] E.A. Guggenheim, J. Chem. Phys., 7, 103, 1939.
[7] R.J. Wolf, M.W. Lee, R.C. Davis, P.J. Fay, and J.R. Ray, "Pressure-composition isotherms for palladium hydride," Phys. Rev. B, 48, 12415–12418, 1993.
[8] R.J. Wolf, M.W. Lee, and J.R. Ray, "Pressure-composition isotherms for nanocrystalline palladium hydride," Phys. Rev. Lett., 73, 557–560, 1994.
[9] P.J. Fay and J.R. Ray, "Monte Carlo simulations in the isoenthalpic-isotension-isobaric ensemble," Phys. Rev. A, 46, 4645–4649, 1992.
[10] J.R. Ray and R.J. Wolf, "Monte Carlo simulations at constant chemical potential and pressure," J. Chem. Phys., 98, 2263–2267, 1993.
[11] H.C. Andersen, "Molecular dynamics simulations at constant pressure and/or temperature," J. Chem. Phys., 72, 2384–2393, 1980.
[12] S. Nosé, "A unified formulation of the constant temperature molecular dynamics method," J. Chem. Phys., 81, 511–519, 1984.
[13] T. Cagin and B.M. Pettitt, "Molecular dynamics with a variable number of particles," Mol. Phys., 72, 169, 1991.
[14] E. Helfand, "Transport coefficients from dissipation in a canonical ensemble," Phys. Rev., 119, 1, 1960.
[15] P.J. Fay, J.R. Ray, and R.J. Wolf, "Detailed balance method for chemical potential determination in Monte Carlo and molecular dynamics simulations," J. Chem. Phys., 100, 2154–2160, 1994.
[16] P.J. Fay, J.R. Ray, and R.J. Wolf, "Detailed balance method for chemical potential determination," J. Chem. Phys., 103, 7556–7561, 1995.
[17] J.R. Ray, "Microcanonical ensemble Monte Carlo method," Phys. Rev. A, 44, 4061–4064, 1991.
[18] J.R. Ray, "Elastic constants and statistical ensembles in molecular dynamics," Comput. Phys. Rep., 8, 109–152, 1988.
[19] J.R. Ray, M.C. Moody, and A. Rahman, "Molecular dynamics calculation of the elastic constants for a crystalline system in equilibrium," Phys. Rev. B, 32, 733–735, 1985.
[20] J.R. Ray, M.C. Moody, and A. Rahman, "Calculation of elastic constants using isothermal molecular dynamics," Phys. Rev. B, 33, 895–899, 1986.
[21] M.D. Kluge, J.R. Ray, and A. Rahman, "Molecular dynamic calculation of the elastic constants of silicon," J. Chem. Phys., 85, 4028–4031, 1986.
[22] M.D. Kluge and J.R. Ray, "Elastic constants and density of states of a molecular-dynamics model of amorphous silicon," Phys. Rev. B, 37, 4132–4136, 1988.
[23] T. Cagin and J.R. Ray, "Elastic constants of sodium from molecular dynamics," Phys. Rev. B, 37, 699–705, 1988.
[24] T. Cagin and J.R. Ray, "Third-order elastic constants from molecular dynamics: theory and an example calculation," Phys. Rev. B, 38, 7940–7946, 1988.
[25] J.R. Ray, "Effective elastic constants of solids under stress: theory and calculations for helium from 11.0 to 23.6 GPa," Phys. Rev. B, 40, 423–430, 1989.
[26] R.J. Wolf, K.A. Mansour, M.W. Lee, and J.R. Ray, "Temperature dependence of elastic constants of embedded-atom models of palladium," Phys. Rev. B, 46, 8027–8035, 1992.
[27] J.R. Ray, "Fluctuations and thermodynamic properties of anisotropic solids," J. Appl. Phys., 53, 6441–6443, 1982.
[28] M. Parrinello and A. Rahman, "Polymorphic transitions in single crystals: a new molecular dynamics method," J. Appl. Phys., 52, 7182–7190, 1981.
[29] J.R. Ray and A. Rahman, "Statistical ensembles and molecular dynamics studies of anisotropic solids II," J. Chem. Phys., 82, 4243–4247, 1985.
[30] J.R. Ray and A. Rahman, "Statistical ensembles and molecular dynamics studies of anisotropic solids," J. Chem. Phys., 80, 4423–4428, 1984.
[31] R.N. Thurston, Physical Acoustics: Principles and Methods, W.P. Mason (ed.), Academic Press, New York, 1964.
[32] M. Karimi, H. Yates, J.R. Ray, T. Kaplan, and M. Mostoller, "Elastic constants of silicon using Monte Carlo simulations," Phys. Rev. B, 58, 6019–6025, 1998.
2.17 NON-EQUILIBRIUM MOLECULAR DYNAMICS

Giovanni Ciccotti¹, Raymond Kapral², and Alessandro Sergi²
¹INFM and Dipartimento di Fisica, Università "La Sapienza," Piazzale Aldo Moro 2, 00185 Roma, Italy
²Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, Ont. M5S 3H6, Canada
Statistical mechanics provides a well-established link between microscopic equilibrium states and thermodynamics. If one considers systems out of equilibrium, the link between microscopic dynamical properties and nonequilibrium macroscopic states is more difficult to establish [1, 2]. For systems lying near equilibrium, linear response theory provides a route to derive linear macroscopic laws and the microscopic expressions for the transport properties that enter the constitutive relations. If the system is displaced far from equilibrium, no fully general theory exists to treat such systems. By restricting consideration to a class of non-equilibrium states which arise from perturbations (linear or non-linear) of an equilibrium state, methods can be developed to treat non-equilibrium states. Furthermore, non-equilibrium molecular dynamics (NEMD) simulation methods can be devised to provide estimates for the transport properties of these systems. Molecular dynamics is typically based on equations of motion derived from a Hamiltonian. However, often in the simulation of large complex systems, constraints are introduced to remove certain "fast" degrees of freedom from the system that are deemed to be unimportant for the phenomena under investigation. An important and prevalent example is the introduction of bond constraints to remove rapid vibrational degrees of freedom from the molecules of the system. Such constraints can be handled by the introduction of generalized coordinates, and in these coordinates a Hamiltonian description of the equations of motion may be written. However, it is often more convenient to work with Cartesian coordinates with the holonomic constraints explicitly introduced in the equations of motion through Lagrange multipliers. One can treat the set of Lagrange multipliers as parameters that can be determined by
SHAKE [3], and one still has a kind of Hamiltonian description involving these parameters [4]. Alternatively, one can explicitly determine the Lagrange multipliers as functions of the phase space coordinates; in this case the equations of motion are non-Hamiltonian and are characterized by the existence of a non-zero phase space compressibility [5, 6]. Consequently, such constrained systems are a special case of general non-Hamiltonian systems, whose statistical mechanical formulation has recently been a topic of considerable interest. The statistical mechanical methods that have been developed for non-Hamiltonian flows [7] can be used to formulate a non-equilibrium statistical mechanics of constrained molecular systems. With such a formulation in hand, a response theory can be developed to investigate linear and non-linear perturbations of equilibrium states, thus providing the link between microscopic dynamics and macroscopic non-equilibrium properties. In this chapter we show how this program can be carried out. In simulations, when external forces are applied to the system, the equations of motion must be supplemented with a thermostatting mechanism to compensate for the input of energy from the external forces. The resulting thermostatted equations are non-Hamiltonian in character. While, for simplicity, we do not explicitly consider the thermostat in the formulation presented below, the techniques we describe can also be extended to treat this more general situation.
1. Non-Hamiltonian Equations of Motion with Constraints
Consider an N-particle system with coordinates r_i and momenta p_i and Hamiltonian H_0,

H_0 = \sum_{i=1}^{N} \frac{p_i^2}{2 m_i} + V(r), \qquad (1)
where V(r) is the potential energy. We let phase space coordinate labels without particle indices stand for the entire set of coordinates, (r, p) = (r_1, r_2, …, r_N, p_1, p_2, …, p_N). When we wish to refer to these variables collectively, we use the notation x = (r, p). We suppose that the system is subjected to ℓ holonomic constraints

\sigma_\alpha(r) = 0, \qquad \alpha = 1, \ldots, \ell. \qquad (2)
The σ_α could be the bond constraints mentioned above, or any other holonomic constraint, such as a reaction coordinate constraint imposed in the simulation of rare reactive events [8]. The constrained equations of motion are

\dot{r}_i = \frac{p_i}{m_i}, \qquad \dot{p}_i = F_i - \lambda_\alpha \nabla_i \sigma_\alpha(r), \qquad (3)
where F_i = -\nabla_i V is the force on particle i due to the potential energy, and the second term represents the constraint forces with Lagrange multipliers λ_α. We use the Einstein summation convention on the Greek indices. Equivalently, we may write this pair of equations as a single second-order equation,

m_i \ddot{r}_i = F_i - \lambda_\alpha \nabla_i \sigma_\alpha(r). \qquad (4)
Since σ_α is constrained at all times, \dot{\sigma}_\alpha = \sum_i (p_i/m_i)\cdot\nabla_i\sigma_\alpha = 0. Typically, such equations are solved in molecular dynamics simulations using the SHAKE algorithm. However, in order to carry out the statistical mechanical treatment of such constrained systems it is more convenient to formulate the problem in a form where its non-Hamiltonian character is evident. To this end we first determine an explicit expression for the Lagrange multipliers. The Lagrange multipliers can be found by differentiating the constraint condition \dot{\sigma}_\alpha = 0 with respect to time and using Eq. (4) to yield

\ddot{\sigma}_\alpha = \frac{d}{dt}\sum_i \frac{p_i}{m_i}\cdot\nabla_i\sigma_\alpha = \sum_i \ddot{r}_i\cdot\nabla_i\sigma_\alpha + \sum_{i,j}\dot{r}_i\,\dot{r}_j : \nabla_i\nabla_j\sigma_\alpha = \sum_i \frac{F_i}{m_i}\cdot\nabla_i\sigma_\alpha - \lambda_\beta \sum_i \frac{1}{m_i}\,\nabla_i\sigma_\beta\cdot\nabla_i\sigma_\alpha + \sum_{i,j}\frac{p_i\, p_j}{m_i\, m_j} : \nabla_i\nabla_j\sigma_\alpha = 0. \qquad (5)
Solving this equation for λ_α we find

\lambda_\alpha = \lambda_\alpha(x) = \Bigl[\sum_i \frac{F_i}{m_i}\cdot\nabla_i\sigma_\beta + \sum_{i,j}\frac{p_i\, p_j}{m_i\, m_j} : \nabla_i\nabla_j\sigma_\beta\Bigr]\,(\mathbf{Z}^{-1})_{\beta\alpha}, \qquad (6)
where

Z_{\alpha\beta} = \sum_i \frac{1}{m_i}\,(\nabla_i\sigma_\alpha)\cdot(\nabla_i\sigma_\beta). \qquad (7)
Using this explicit form of the Lagrange multipliers, the resulting equations of motion,

\dot{r}_i = \frac{p_i}{m_i}, \qquad \dot{p}_i = F_i - \lambda_\alpha(x)\,\nabla_i\sigma_\alpha(r), \qquad (8)

are no longer in Hamiltonian form and represent a motion with the constraints as conserved quantities [5, 6]. The phase space flow is compressible and the compressibility is given by

\kappa = \nabla_x\cdot\dot{x} = -\sum_{i=1}^{N}\frac{\partial\lambda_\alpha(x)}{\partial p_i}\cdot\nabla_i\sigma_\alpha(r) = -2\sum_i \frac{1}{m_i}\,(\nabla_i\dot{\sigma}_\alpha)\cdot(\nabla_i\sigma_\beta)\,(\mathbf{Z}^{-1})_{\beta\alpha} = -\frac{d\ln Z}{dt}, \qquad (9)

where Z = det Z denotes the determinant of the matrix Z_{αβ}.
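To make Eqs. (6) and (7) concrete, the sketch below evaluates the Lagrange multipliers explicitly, assuming the constraints are written in the squared bond-length form σ_α = |r_a − r_b|² − d_ab² (the text above leaves the σ_α generic, so this form is an assumption); the atoms, forces, and bond list in the usage lines are random illustrative data.

```python
import numpy as np

def lagrange_multipliers(r, p, F, m, bonds):
    """Explicit Lagrange multipliers of Eq. (6) for squared bond-length
    constraints sigma_alpha = |r_a - r_b|^2 - d_ab^2, so that
    grad_a sigma = 2 (r_a - r_b); Z is the matrix of Eq. (7).
    `bonds` is a list of (a, b) atom-index pairs."""
    n_c, N = len(bonds), len(r)
    grads = np.zeros((n_c, N, 3))                # grad_i sigma_alpha
    for alpha, (a, b) in enumerate(bonds):
        d = 2.0 * (r[a] - r[b])
        grads[alpha, a], grads[alpha, b] = d, -d
    # Eq. (7): Z_ab = sum_i (1/m_i) grad_i sigma_a . grad_i sigma_b
    Z = np.einsum('ain,i,bin->ab', grads, 1.0 / m, grads)
    # Eq. (6) bracket: force term plus momentum (Hessian) term; for this
    # constraint the Hessian term reduces to 2 |v_a - v_b|^2.
    v = p / m[:, None]
    rhs = np.einsum('bin,in->b', grads, F / m[:, None])
    rhs += np.array([2.0 * np.sum((v[a] - v[b])**2) for a, b in bonds])
    return np.linalg.solve(Z, rhs)               # lambda_alpha

rng = np.random.default_rng(5)
r, p = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
F, m = rng.standard_normal((4, 3)), np.ones(4)
print(lagrange_multipliers(r, p, F, m, bonds=[(0, 1), (2, 3)]))
```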
The constrained phase space flow in Eq. (8) may be generated by the action of the Liouville operator,

iL_0 = \dot{x}\cdot\nabla_x = \sum_{i=1}^{N}\Bigl[\frac{p_i}{m_i}\cdot\frac{\partial}{\partial r_i} + F_i\cdot\frac{\partial}{\partial p_i} - \lambda_\alpha(x)\,(\nabla_i\sigma_\alpha(r))\cdot\frac{\partial}{\partial p_i}\Bigr], \qquad (10)
on the phase space variables. More generally, the evolution of any dynamical variable B(x) is given by

\frac{dB(x(t))}{dt} = iL_0\, B(x(t)), \qquad (11)
whose formal solution is

B(x(t)) = e^{iL_0 t}\, B(x). \qquad (12)
We now wish to discuss the statistical mechanics of such a system. The existence of a phase space compressibility has implications for the nature of the phase space flow and the computation of statistical properties [6, 7]. The phase space volume element at time t_0, dx_{t_0}, transforms into dx_t = J(x_t; x_{t_0})\, dx_{t_0} at time t, where J(x_t; x_{t_0}) = det J and the matrix J has elements J_{ij} = \partial x_t^i/\partial x_{t_0}^j. Using the fact that J = det J = exp(Tr ln J), one may derive an equation of motion by differentiating this expression for J to find
\frac{dJ(x_t; x_{t_0})}{dt} = \mathrm{Tr}\Bigl[\frac{d\mathbf{J}}{dt}\,\mathbf{J}^{-1}\Bigr]\, J = \sum_{i,j}\frac{\partial \dot{x}_t^i}{\partial x_{t_0}^j}\,\frac{\partial x_{t_0}^j}{\partial x_t^i}\, J = \sum_i \frac{\partial \dot{x}_t^i}{\partial x_t^i}\, J = \kappa(x_t)\, J(x_t; x_{t_0}). \qquad (13)
Integrating this equation and using the explicit expression for κ given above, one may show that the Jacobian takes the form J(x_t; x_{t_0}) = Z(r_{t_0})/Z(r_t). Consequently, we see that Z(r_t)\, dx_t = Z(r_{t_0})\, dx_{t_0}, and dµ(r, p) = Z(r)\, dr\, dp is the invariant measure for the phase space flow. Next we consider the time evolution of the phase space distribution function f(x), where \int_V d\mu(x)\, f(x) = \int_V dx\, Z(r)\, f(x) \equiv \int_V dx\, \rho(x) is the fraction of systems in the phase space volume V. The phase space flow is conserved, so that ρ(x) = Z(r) f(x) satisfies the continuity equation

\frac{\partial\rho(x,t)}{\partial t} + \nabla_x\cdot(\dot{x}\,\rho(x,t)) = 0, \qquad (14)
and, therefore,

\frac{\partial\rho(x,t)}{\partial t} = -(\dot{x}\cdot\nabla_x + \nabla_x\cdot\dot{x})\,\rho(x,t) = -(iL_0 + \kappa)\,\rho(x,t). \qquad (15)
We now want to derive the evolution equation for f(x, t). To this end we first note that we may again use the identity Z = det Z = exp(Tr ln Z) and the fact that Z depends only on the coordinates to compute

iL_0 Z = \sum_i \frac{p_i}{m_i}\cdot\nabla_i Z = \sum_i \frac{p_i}{m_i}\cdot(\nabla_i Z_{\alpha\beta})\,(\mathbf{Z}^{-1})_{\beta\alpha}\, Z = \sum_j \frac{1}{m_j}\bigl[(\nabla_j\dot{\sigma}_\alpha)\cdot(\nabla_j\sigma_\beta) + (\nabla_j\sigma_\alpha)\cdot(\nabla_j\dot{\sigma}_\beta)\bigr]\,(\mathbf{Z}^{-1})_{\beta\alpha}\, Z = -\kappa Z, \qquad (16)
where we have used Eq. (9) and the fact that Z is a symmetric matrix to relate the expression in the penultimate equality to the compressibility. Then, inserting the definition ρ(x) = Z(r) f(x) into Eq. (15) and using the result iL_0 Z = -\kappa Z, we find the Liouville equation for f(x),

\frac{df(x,t)}{dt} = \frac{\partial f(x,t)}{\partial t} + iL_0\, f(x,t) = 0. \qquad (17)
The equations of motion (8) have H_0, σ_α and \dot{\sigma}_\alpha as constants of the motion. Consequently, the equilibrium density is given by

f_{eq}(x) = \Omega(E)^{-1}\, \delta(H_0 - E) \prod_\alpha \delta(\sigma_\alpha)\,\delta(\dot{\sigma}_\alpha), \qquad (18)
where Ω(E) is a normalizing factor and E is the energy of the microcanonical system. In non-equilibrium statistical mechanics the average value of a dynamical variable at time t is given by the integral over the phase space measure of the phase space probability density times the dynamical variable,

\bar{B}(t) = \int d\mu(x)\, B(x)\, f(x,t) = \int d\mu(x)\, B(x)\, e^{-iL_0 t} f(x). \qquad (19)
We may transfer the action of the evolution operator to the dynamical variable. To do this we first use the following identity for any phase space functions A(x) and B(x), which is obtained by integrating by parts:

\int dx\, B(x)\, iL_0 A(x) = \int dx\,\Bigl[\Bigl(-iL_0 + \sum_i \frac{\partial\lambda_\alpha(x)}{\partial p_i}\cdot(\nabla_i\sigma_\alpha)\Bigr) B(x)\Bigr] A(x) = -\int dx\,\bigl((iL_0 + \kappa)\, B(x)\bigr)\, A(x). \qquad (20)
The last equality was obtained using

\sum_i \frac{\partial\lambda_\alpha(x)}{\partial p_i}\cdot(\nabla_i\sigma_\alpha) = -\kappa. \qquad (21)
Making use of this result we may also show that

\int d\mu(x)\, B(x)\, iL_0 A(x) = \int dx\, Z(r)\, B(x)\, iL_0 A(x) = -\int dx\,\bigl((iL_0 + \kappa)(Z(r) B(x))\bigr)\, A(x) = -\int dx\,(iL_0 B(x))\, Z(r)\, A(x), \qquad (22)
where the last equality follows from the fact that iL_0(ZA) = -\kappa Z A + Z\, iL_0 A, again using iL_0 Z = -\kappa Z. Thus, expanding the propagator exp(-iL_0 t) in Eq. (19) as a power series, integrating by parts term by term, using the above identities, and finally resumming, we obtain for \bar{B}(t) in Eq. (19),

\bar{B}(t) = \int d\mu(x)\,(e^{iL_0 t} B(x))\, f(x) = \int d\mu(x)\, B(x(t))\, f(x). \qquad (23)
Thus, we have the result

\int d\mu(x)\, B(x)\, e^{-iL_0 t} f(x) = \int d\mu(x)\,(e^{iL_0 t} B(x))\, f(x), \qquad (24)
which shows that when the scalar product is defined with respect to the measure dµ(x), the Liouville operator L_0 is self-adjoint. Alternatively, we may write the right hand side of Eq. (24) and then integrate by parts, using the properties discussed above, to obtain the equivalent formulas

\bar{B}(t) = \int dx\, B(x)\,\rho(x,t) = \int dx\, B(x)\, e^{-(iL_0+\kappa)t}\,\rho(x) = \int dx\,(e^{iL_0 t} B(x))\,\rho(x) = \int d\mu(x)\,(e^{iL_0 t} B(x))\, f(x). \qquad (25)
This result shows that if the scalar product is defined with respect to integration over the phase space coordinates, and not the invariant measure, the adjoint of iL_0 is -(iL_0 + \kappa) and the operator is not self-adjoint. The development we have presented contains the standard Hamiltonian description of statistical mechanics if the constraints are not present. In this case the terms involving the explicit forms of the Lagrange multipliers no longer appear and the equations of motion adopt a Hamiltonian form. The metric factor Z(r) = 1, consequently dx_t = dx_{t_0}, and the Liouville operator is self-adjoint with respect to this simple metric. We may now use this Liouville formulation of the dynamics of constrained systems to carry out an analysis of how the system responds when external forces are applied to the system.
2. Response Theory
We next examine how the constrained non-Hamiltonian system responds to external time-dependent forces. In the presence of such external forces the equations of motion take the general form

\dot{r}_i = \frac{p_i}{m_i} + C_i^r(x)\, F(t), \qquad \dot{p}_i = F_i - \lambda_\alpha(x,t)\,\nabla_i\sigma_\alpha(r) + C_i^p(x)\, F(t). \qquad (26)

We write this set of equations compactly as

\dot{x} = G(x,t), \qquad (27)
where we have indicated the explicit time dependence in λ_α(x, t) and G(x, t) arising from the external force. Since the Lagrange multipliers must be determined in the presence of the external forces, they acquire explicit time dependence. In the general case we are considering, the external forces are not assumed to be derived from a generalized potential; i.e., there is no function V(r, p) such that C_i^r = \partial V/\partial p_i and C_i^p = -\partial V/\partial r_i. We do assume that C^T(x) = (C^r, C^p), where T stands for the transpose, satisfies the incompressibility condition \nabla_x\cdot C = 0. This latter condition guarantees that, even in the presence of the external forces, the compressibility arises only from the Lagrange multipliers, which impose the constraints, and is still given by

\kappa(t) = \nabla_x\cdot\dot{x} = -\sum_{i=1}^{N}\frac{\partial\lambda_\alpha(x,t)}{\partial p_i}\cdot\nabla_i\sigma_\alpha(r). \qquad (28)
The Liouville operator that generates these equations of motion is iL(t) = \dot{x}\cdot\nabla_x = G(x,t)\cdot\nabla_x, or, more explicitly,

iL(t) = \sum_{i=1}^{N}\Bigl[\frac{p_i}{m_i}\cdot\frac{\partial}{\partial r_i} + (F_i - \lambda_\alpha(x,t)\,(\nabla_i\sigma_\alpha(r)))\cdot\frac{\partial}{\partial p_i}\Bigr] + \sum_{i=1}^{N}\Bigl[C_i^r\cdot\frac{\partial}{\partial r_i} + C_i^p\cdot\frac{\partial}{\partial p_i}\Bigr]\, F(t). \qquad (29)
We compute the response of the system to the external force by measuring the average value of a dynamical variable B(x) as (cf. Eq. (25))

\bar{B}(t) = \int dx\, B(x)\,\rho(x,t), \qquad (30)
where ρ(x, t) again satisfies the continuity equation (14), which now takes the form

\frac{\partial\rho(x,t)}{\partial t} = -(\dot{x}\cdot\nabla_x + \nabla_x\cdot\dot{x})\,\rho(x,t) = -(iL(t) + \kappa(t))\,\rho(x,t). \qquad (31)

The compressibility κ(t) now also depends explicitly on time, since the Lagrange multipliers appear in its expression. If we integrate Eq. (31) from
some initial time t_0 to time t and solve the resulting integral equation by iteration, we obtain

\rho(x,t) = \rho(x,t_0) - \int_{t_0}^{t} dt_1\,(iL(t_1)+\kappa(t_1))\,\rho(x,t_0) + \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2\,(iL(t_1)+\kappa(t_1))(iL(t_2)+\kappa(t_2))\,\rho(x,t_0) + \cdots. \qquad (32)
The formal solution of Eq. (32) can be written as ρ(x, t) = U † (t, t0 )ρ0 (x),
(33)
where ρ0 (x) = ρ(x, t0 ) and the propagator U † (t, t0 ) is defined by
U^{\dagger}(t,t_0) = T \exp\Bigl[-\int_{t_0}^{t} d\tau\,(iL(\tau)+\kappa(\tau))\Bigr], \qquad (34)
where T is the time-ordering operator. For any two phase space functions A(x) and B(x) we have the analog of Eq. (20):

-\int dx\, B(x)\,(iL(t)+\kappa(t))\, A(x) = \int dx\,(iL(t) B(x))\, A(x). \qquad (35)
Consequently, we may substitute the series solution, Eq. (32), for ρ(x, t) into Eq. (30) and integrate by parts term by term, using Eq. (35), to obtain

\bar{B}(t) = \int dx\, B(x)\,\rho(x,t) = \int dx\,\Bigl[B(x) + \int_{t_0}^{t} dt_1\, iL(t_1) B(x) + \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2\, iL(t_1)\, iL(t_2) B(x) + \cdots\Bigr]\,\rho_0(x). \qquad (36)
This series defines the evolution operator U(t, t_0),

U(t,t_0) = T \exp\Bigl[\int_{t_0}^{t} d\tau\, iL(\tau)\Bigr], \qquad (37)
which is the formal solution of the equation of motion

\frac{dU(t,t_0)}{dt} = iL(t)\, U(t,t_0). \qquad (38)
The propagator U(t, t_0) is the adjoint of U^{\dagger}(t, t_0). As a result of these considerations we may write

\bar{B}(t) = \int dx\, B(x)\, U^{\dagger}(t,t_0)\,\rho_0(x) = \int dx\,(U(t,t_0) B(x))\,\rho_0(x). \qquad (39)
This formula provides a means to determine the non-equilibrium macroscopic value of the dynamical variable B(x) by considering its evolution under the fully perturbed dynamics and taking the phase space average over the arbitrary initial preparation of the system described by ρ_0(x). If the initial distribution is taken to be the equilibrium distribution in the absence of the perturbing field, ρ_{eq}(x) = Z(r) f_{eq}(x), then the problem has a well-defined structure. Inserting the initial equilibrium distribution into Eq. (39) we find

\bar{B}(t) = \int dx\,(U(t,t_0) B(x))\,\rho_{eq}(x) = \int d\mu(x)\,(U(t,t_0) B(x))\, f_{eq}(x) \equiv \langle U(t,t_0) B(x)\rangle_{eq}, \qquad (40)
where the measure dµ(x) is that for the unperturbed system discussed in the previous section. From this equation we see that, for systems initially at equilibrium, non-equilibrium properties may be obtained from the equilibrium ensemble average of the observable evolved under the full perturbed dynamics due to the external force. This equation expresses some fundamental features of nonequilibrium statistical mechanics. It is an expression of Onsager’s regression hypothesis that relates the decay of macroscopic observables to the regression of fluctuations about the equilibrium state [9] and has been exploited in NEMD simulations to take dynamical averages out of equilibrium [10]. In the limiting case where the equations of motion are Hamiltonian in form and the external perturbation arises from a potential VI (t)= A(x)F(t), Eq. (40) reduces to the microcanonical version of Kubo’s linear response result [11] in the limit of weak perturbations. To see this we note that the perturbed Liouville operator in Eq. (29) reduces to the following usual form for this simple Hamiltonian case:
iL(t) = \sum_{i=1}^{N}\Bigl[\frac{p_i}{m_i}\cdot\frac{\partial}{\partial r_i} + F_i\cdot\frac{\partial}{\partial p_i}\Bigr] + F(t)\sum_{i=1}^{N}\Bigl[\frac{\partial A(x)}{\partial p_i}\cdot\frac{\partial}{\partial r_i} - \frac{\partial A(x)}{\partial r_i}\cdot\frac{\partial}{\partial p_i}\Bigr] \equiv iL_0 + iL_I(t). \qquad (41)
Using the decomposition of iL(t) into unperturbed and perturbed parts and the expression U_0(t,t_0) = \exp(iL_0(t - t_0)) for the propagator of the
unperturbed system, we may write Eq. (38) in the form of a Dyson equation for U(t, t_0):

U(t,t_0) = U_0(t,t_0) + \int_{t_0}^{t} d\tau\, U_0(\tau,t_0)\, iL_I(\tau)\, U(t,\tau). \qquad (42)
If we then insert this expression for U(t, t_0), truncated to first order in the external force, into Eq. (40), we obtain

\bar{B}(t) = \langle B(x)\rangle_{eq} + \int_{t_0}^{t} d\tau \int dx\,(U_0(t,\tau) B(x))\,\dot{A}(x)\,\frac{df_{eq}(x)}{dH_0}\, F(\tau), \qquad (43)
where we have used the fact that f_{eq}(x) = \Omega(E)^{-1}\,\delta(H_0 - E) is a function of the phase space coordinates through the Hamiltonian. Finally, using the identity g(z)\,\delta'(z-a) = -g'(a)\,\delta(z-a), and the fact that the microcanonical partition function is related to the entropy by \Omega(E) = \exp(S(E)/k_B), we have df_{eq}(x)/dH_0 = -\beta f_{eq}(x). Thus, we obtain the standard result

\Delta\bar{B}(t) = -\beta \int_{t_0}^{t} d\tau\, \langle (U_0(t,\tau) B(x))\,\dot{A}(x)\rangle_{eq}\, F(\tau). \qquad (44)
Here \Delta\bar{B}(t) is the deviation from the equilibrium average value, \Delta\bar{B}(t) = \bar{B}(t) - \langle B(x)\rangle_{eq}. However, the linear response of constrained systems to perturbations, either with Hamiltonian or non-Hamiltonian structure, is not simple. The external forces enter in the Lagrange multipliers that appear in the non-Hamiltonian equations of motion as well as in the explicit forces that couple the system to the external fields. Consequently, the form of the perturbation in the Liouville operator is complicated, and unfamiliar terms appear in the linear response formulas. In addition, the equilibrium distribution function in Eq. (18) does not depend solely on the Hamiltonian but contains delta function contributions from the conserved constraint variables. As a result, some of the standard manipulations that are often carried out in linear response theory to obtain the response as a physically interesting correlation function in Eq. (43), such as those that give df_{eq}(x)/dH_0 = -\beta f_{eq}(x), may no longer be carried out. For instance, even if the perturbation is of the form of V_I(t) given above, the linear response of the constrained system is not simple, because the form of the equilibrium density precludes a reduction to that in Eq. (44). These technical difficulties with the linear response derivation do not detract from the computational utility of Eq. (40), which forms the starting point for investigating the linear or non-linear response of either Hamiltonian or non-Hamiltonian systems by NEMD. Below, we discuss how such simulations may be carried out.
3. Simulation of Non-Equilibrium Systems
The dynamical response of a system subjected to the general time-dependent external force in Eq. (26), and initially in an equilibrium state of the unperturbed system, can be computed from Eq. (40). To do this one simply samples phase space configurations along an equilibrium trajectory of the unperturbed system. For independent initial configurations extracted from this trajectory, one evolves the dynamical variable B(x) using Eq. (26) under the full perturbed dynamics for a time t. The ensemble average over these trajectory segments yields \bar{B}(t). The method based on Eq. (40) is called the dynamical approach to non-equilibrium molecular dynamics. In carrying out such NEMD simulations, one necessarily needs perturbation strengths that are huge on the macroscopic scale in order to produce a detectable response. Such large perturbation strengths are needed to yield a response that is larger than the statistical noise. From such a simulation it is difficult to obtain information in the region of linear behavior. Consequently, an extrapolation to small perturbation strengths is required in order to compare the numerical results with those of linear response theory. For example, consider the mobility of an ion with mass m immersed in a fluid. In the absence of an external field the average ion velocity is zero and its variance is k_B T/m. The typical velocity of the ion is (k_B T/m)^{1/2}, and sampling of 100 independent configurations will yield an estimate of the (zero) average value with a statistical error of order (k_B T/m)^{1/2}/10 (the standard error, the standard deviation divided by the square root of the number of samples), which is still a large number. If we wish to apply an external force to push the ion and to compute its drift velocity, the drift velocity should be significantly larger than the noise, \langle v_{ion}\rangle_{neq} \gtrsim 10^{-1}(k_B T/m)^{1/2}. In order to fulfill this condition a huge field strength is required and one can no longer guarantee that the linear regime is being investigated.

A solution to this problem was obtained by using the subtraction technique [12], a method that permits one to decrease the noise of the response. If we consider evolution under U_0(t, t_0), the propagator of the unperturbed system, then, since the equilibrium distribution is stationary under this evolution, we have U_0^{\dagger}(t,t_0) f_{eq}(x) = f_{eq}(x) and \langle U_0(t,t_0) B(x)\rangle_{eq} = \langle B(x)\rangle_{eq}. Therefore we may write Eq. (40) in the form

\Delta\bar{B}(t) = \langle U(t,t_0) B(x) - U_0(t,t_0) B(x)\rangle_{eq}. \qquad (45)
The dynamical variable inside the brackets has the same average value as that in Eq. (40), but the variance is significantly different. This is easily seen for a time-impulsive perturbation at time t → t_0^+:

\mathrm{Var}[U(t,t_0)B(x) - U_0(t,t_0)B(x)] = \mathrm{Var}[U(t,t_0)B(x)] + \mathrm{Var}[U_0(t,t_0)B(x)] - 2\,\mathrm{Cov}[U(t,t_0)B(x),\, U_0(t,t_0)B(x)]. \qquad (46)
Using Cov[X, Y] = (Var[X] Var[Y])^{1/2} γ, with the correlation coefficient |γ| ≤ 1, and noticing that for t close to t_0 the correlation coefficient of the two microscopic fluxes is equal to 1 + O(F), one finds from Eq. (46) that the leading term of the variance of the difference between the two fluxes is

\mathrm{Var}\bigl[U(t \to t_0^+, t_0)B(x) - U_0(t \to t_0^+, t_0)B(x)\bigr] = \bigl(\mathrm{SD}[U(t \to t_0^+, t_0)B(x)] - \mathrm{SD}[U_0(t \to t_0^+, t_0)B(x)]\bigr)^2 + O(F) = O(F), \qquad (47)
where SD stands for the standard deviation. This result applies only for t ≈ t_0, and the variance will generally grow exponentially as t increases. In many situations, though, the desired results can be obtained using a time range compatible with this divergence of the variance. A simple illustration of the subtraction technique is provided by the mobility of a charged particle in an atomic fluid of Lennard–Jones particles. In this case the system has no constraints and we may use the simpler limiting forms of the equations presented above. The interaction between the charged particle and the neutral bath atoms is given by the Lennard–Jones potential plus a charge-induced dipole term, V_D(r) = V_{LJ}(r) - \tfrac{1}{2}\alpha_P e^2 r^{-4}, where α_P is the atomic polarizability and e the electric charge. (See Ref. [12] for details.) To calculate the mobility of the ion we take B(x) = v_c, the velocity of the ion, and use Eq. (45). The molecular dynamics runs are broken into segments and two trajectories of the particles are computed in each segment, starting from the same initial configuration: one trajectory evolves in the absence of the external force, and a constant force F of order 1 eV cm^{-1} is applied to the charged particle in the other trajectory. The drift velocity of the charged particle u_D induced by the applied field is computed as a function of time simply by calculating the difference of the particle's velocity in the perturbed and unperturbed trajectories, averaged over all segments of the run. One obtains

u_D \equiv \bar{v}_c(t) = \langle U(t,0)\, v_c - U_0(t,0)\, v_c \rangle_{eq}. \qquad (48)
The mobility constant µ is given by u_D(∞) = µF at vanishingly small F. The force applied in Ref. [12] was about 10^{-7} of the mean Lennard–Jones force. The calculation of the drift velocity induced by such a small external field in a simulation run of realistic length is made possible only by the subtraction technique. The results for the mobility agree well with experimental data for argon and with calculations using the Green–Kubo formula.
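The trajectory-pair logic of the subtraction technique is simple to express in code. Below is a schematic sketch: `sample_config`, `propagate`, and `observable` are placeholders standing in for a real MD engine (they are assumptions, not part of the original method description), and the toy usage at the end employs a deterministic damped velocity equation whose drift can be checked analytically.

```python
import numpy as np

def subtraction_response(sample_config, propagate, observable,
                         n_segments, n_steps, field):
    """Subtraction technique of Eq. (45): from each equilibrium initial
    configuration, run a perturbed and an unperturbed trajectory and
    average the difference of the observable along the two.
    propagate(x, field) advances a configuration by one time step;
    sample_config(k) returns the k-th equilibrium starting point."""
    response = np.zeros(n_steps)
    for k in range(n_segments):
        x_pert = sample_config(k)
        x_ref = x_pert.copy()
        for step in range(n_steps):
            x_pert = propagate(x_pert, field)
            x_ref = propagate(x_ref, 0.0)
            response[step] += observable(x_pert) - observable(x_ref)
    return response / n_segments

# Toy usage: deterministic damped velocity dynamics, dv/dt = -g v + F.
# The drift should relax to F/g = 0.01 for F = 0.01 and g = 1.
rng = np.random.default_rng(6)
dt, g = 0.01, 1.0
sample = lambda k: np.array([rng.standard_normal()])  # equilibrium v ~ N(0,1)
prop = lambda v, F: v + dt * (-g * v + F)
vel = lambda v: v[0]
u_drift = subtraction_response(sample, prop, vel, 200, 1000, field=0.01)
print(u_drift[-1])  # ~ 0.01
```

Because the perturbed and reference trajectories share the same initial condition, the noise of the difference starts at O(F) rather than O(1), which is exactly the content of Eq. (47).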
Non-equilibrium molecular dynamics
757
a planar Couette flow (the so-called SLLOD perturbation [15, 16]). The equations of motion are r˙ i = pi /m i + ri · ∇u, p˙ i = Fi − pi · ∇u,
(49) (50)
where Fi is the total force acting on atom i with mass m i and the velocity gradient ∇u has been chosen to yield a planar Couette flow in the x direction, with shear along y:
where F_i is the total force acting on atom i with mass m_i, and the velocity gradient ∇u has been chosen to yield a planar Couette flow in the x direction, with shear along y:

\nabla u = \begin{pmatrix} 0 & 0 & 0 \\ h_0\,\delta(t) & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (51)
where h_0 is a constant related to the shear rate \dot{\gamma}. The velocity gradient, induced in the MD cell by the application of the chosen external field, has to be accommodated at the boundaries by applying the appropriate generalization of the periodic boundary conditions, known in the literature as Lees–Edwards boundary conditions [17, 18]. The transient behavior of the off-diagonal element S_{xy} of the stress tensor of the perturbed system was simulated over a time interval t in order to determine the viscosity coefficient η,

\eta = \lim_{t\to\infty} \eta(t) = \lim_{t\to\infty}\,\lim_{h_0\to 0}\, h_0^{-1} \int_0^t d\tau\, \langle U(\tau,0)\, S_{xy} - U_0(\tau,0)\, S_{xy}\rangle_{eq}. \qquad (52)
In Fig. 1 we show the time-dependent viscosity coefficient η(t) calculated by the subtraction technique for a low shear rate (\dot{\gamma} < 0.02). One sees that good agreement between the non-equilibrium molecular dynamics simulation and linear response theory is found for low shear rates. In Ref. [13] it is also shown that, for shear rates below 10^{12} s^{-1}, the viscosity, considered as a function of the shear rate, does not differ significantly from its limiting value. It is only at rates higher than 10^{12} s^{-1} (a rather high perturbation!) that it starts to depend on the shear rate. Moreover, the dependence is well represented by an analytical power series truncated to the fourth order. Non-equilibrium molecular dynamics has also been used to investigate transport properties of polyatomic systems using Cartesian coordinates with imposed holonomic constraints. Liquid butane has been the subject of a number of studies [19, 20], and the subtraction technique was used in [20] to compute the shear viscosity of this molecular system. In this calculation the symmetric part of the molecular stress tensor for a polyatomic fluid in a volume V is determined from the center of mass positions and velocities of the butane molecules:

S^{s}_{xy} = -\frac{1}{V}\sum_I \Bigl[\frac{1}{M}\, P_{Ix}\, P_{Iy} + \frac{1}{2}\,(F_{Ix}\, R_{Iy} + F_{Iy}\, R_{Ix})\Bigr]. \qquad (53)
[Figure 1. Curve A: time-dependent viscosity coefficient η(t) for a Lennard–Jones atomic fluid, calculated by the subtraction technique at low shear rate (\dot{\gamma} < 0.02); curves above and below indicate the uncertainties. Curve B: η(t) from the Green–Kubo expression, η(t) = (V/k_B T) \int_0^t ds\, \langle S_{xy}(s)\, S_{xy}(0)\rangle_{eq}; errors are shown as vertical bars.]
Here M, R_I, P_I are, respectively, the mass, the center of mass coordinate, and the total linear momentum of molecule I, while F_I is the total force acting on the center of mass of the molecule; the sum runs over all molecules in the system. The shear viscosity coefficient η can be directly obtained from the constitutive law [21],

S^{s} - pI = 2\eta\,\nabla u, \qquad (54)
where p is the hydrostatic pressure and u is the local velocity field corresponding to a pure deformational flow (∇·u = 0). The equations of motion for such a polyatomic system can be written [15] (denoting the atomic coordinates and momenta of the n atoms of molecule I by (r_I, p_I) = (r_{1I}, r_{2I}, …, r_{nI}, p_{1I}, p_{2I}, …, p_{nI})) as

\dot{r}_{kI} = \frac{p_{kI}}{m_k} + R_I \cdot \nabla u, \qquad (55)
\dot{p}_{kI} = F_{kI} - \frac{m_k}{M}\,\nabla u \cdot P_I - \lambda_{\alpha I}\,\nabla_{kI}\,\sigma_{\alpha I}(r_I), \qquad (56)
where Fk I is the force acting on atom k of molecule I , the tensor ∇u is the homogeneous velocity gradient (possibly time dependent), σα I (r I ) are the constraints acting on molecule I with their associated Lagrange multipliers
λ_{αI}. The perturbed equations of motion (55)–(56) are known as the DOLLS [22] tensor equations and can be derived from a Hamiltonian perturbation,

H_I = \sum_I R_I\, P_I : (\nabla u)^T, \qquad (57)
where (∇u)T is the transpose of ∇u. The calculation of the shear viscosity by the subtraction technique proceeds as in the previous application. An impulsive external force derived from a shear gradient of the form
\nabla u = \begin{pmatrix} 0 & (h_0/2)\,\delta(t) & 0 \\ (h_0/2)\,\delta(t) & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (58)
is applied to the equilibrium system at t = 0. Notice that, due to the choice of a symmetric velocity gradient tensorial perturbation, Eq. (58), no distinction exists, in this case, between the DOLLS tensor equations, used in this application, and the SLLOD equations employed in Ref. [13]. The viscosity is then computed from the analog of Eq. (52) using the symmetric part of the stress tensor, S^{s}_{xy} = (S_{xy} + S_{yx})/2. Since the molecular constraints do not act on the centers of mass of the molecules, no technical difficulties are encountered as a result of their presence. The viscosity obtained using this method has been compared with the corresponding Green–Kubo formula and with the available experimental data. Although the equivalence between the Green–Kubo and NEMD methods is not evident in the present molecular case, the authors find a remarkable agreement between the results of the two methods, while the known experimental value is about half of the computed values. There are many possible reasons for this discrepancy (and the authors correctly point them out); however, at this stage, we cannot exclude a theoretical inconsistency.
4. Outlook
Non-equilibrium molecular dynamics is a field with a long history. For atomic systems the problem has been formulated completely and a variety of applications have been studied (see Ref. [18] for a review, as well as other chapters in this book). In this chapter we have shown that there are new issues that need to be considered when molecular systems with constraints are studied. In order to make molecular dynamics simulations practical, most complex molecular systems are treated by imposing constraints to remove certain degrees of freedom from the problem. It is therefore important to formulate a response theory for such constrained systems in order to be able to compute non-equilibrium properties. We have shown that it is possible to carry out this program for systems with constraints in the context of a non-Hamiltonian
formulation of the equations of motion. The general expression for the response (Eq. (40)) is simple and is in a form that permits direct application of the subtraction method to determine small responses. The passage to the linear regime and determination of the analogs of standard Green–Kubo expressions for transport properties involve subtle issues that deserve further study. We have not included a thermostat in the formulation presented here. In practice it is necessary to thermostat the system when external forces are applied to it to study the response. Such thermostats may also be implemented naturally in the context of a non-Hamiltonian framework.
References

[1] D.J. Evans and G.P. Morriss, Statistical Mechanics of Nonequilibrium Liquids, Academic Press, New York, 1990.
[2] G. Ciccotti, D. Frenkel, and I.R. McDonald, Simulations of Liquids and Solids, 3rd edn., North-Holland, Amsterdam, 1987.
[3] J.P. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, "Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes," J. Comp. Phys., 23, 327–341, 1977.
[4] T.O. White, G. Ciccotti, and J.P. Hansen, "Brownian dynamics with constraints," Mol. Phys., 99, 2023–2036, 2001.
[5] S. Melchionna, "Constrained systems and statistical distributions," Phys. Rev. E, 61, 6165–6170, 2000.
[6] M.E. Tuckerman, Y. Liu, G. Ciccotti, and G.L. Martyna, "Non-Hamiltonian molecular dynamics: generalizing Hamiltonian phase space principles to non-Hamiltonian systems," J. Chem. Phys., 115, 1678–1702, 2001.
[7] M.E. Tuckerman, C.J. Mundy, and G.L. Martyna, "On the classical statistical mechanics of non-Hamiltonian systems," Europhys. Lett., 45, 149–155, 1999.
[8] G. Ciccotti, R. Kapral, and A. Sergi, "Simulating reactions that occur once in a blue moon," In: S. Yip (ed.), Handbook of Materials Modeling, Volume 1: Methods and Models, Springer, Berlin, 2005.
[9] L. Onsager, "Reciprocal relations in irreversible processes. I," Phys. Rev., 37, 405–426, 1931; "Reciprocal relations in irreversible processes. II," Phys. Rev., 38, 2265–2279, 1931.
[10] G. Ciccotti, G. Jacucci, and I.R. McDonald, "Thought experiments by molecular dynamics," J. Stat. Phys., 21, 1–22, 1979.
[11] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II: Nonequilibrium Statistical Mechanics, 2nd edn., Springer, Berlin, 1995.
[12] G. Ciccotti and G. Jacucci, "Direct computation of dynamical response by molecular dynamics: mobility of a charged Lennard–Jones particle," Phys. Rev. Lett., 35, 789–792, 1975.
[13] J.P. Ryckaert, A. Bellemans, G. Ciccotti, and G.V. Paolini, "Shear-rate dependence of the viscosity of simple fluids by nonequilibrium molecular dynamics," Phys. Rev. Lett., 60, 128–131, 1988.
[14] J.P. Ryckaert, A. Bellemans, G. Ciccotti, and G.V. Paolini, "Evaluation of transport coefficients of simple fluids by MD: comparison of Green–Kubo and nonequilibrium approaches for shear viscosity," Phys. Rev. A, 39, 259–267, 1989.
Non-equilibrium molecular dynamics
761
[15] A.J.C. Ladd, “Equations of motion for non-equilibrium molecular-dynamics simulations of viscous-flow in molecular fluids,” Mol. Phys. Rep., 53, 459–463, 1984. [16] D.J. Evans and G.P. Morriss, “Non-Newtonian molecular-dynamics,” Comp. Phys. Rep., 1, 297–343, 1984. [17] A.W. Lees and S.F. Edwards, “The computer study of transport processes under extreme conditions,” J. Phys. C, 5, 1921–1972, 1972. [18] G. Ciccotti, C. Pierleoni, and J.P. Ryckaert, “Theoretical foundation and rheological application of non-equilibrium molecular dynamics,” In: M. Mareschal and B.L. Holian (eds.), Simulations of Complex Hydrodynamic Phenomena, NATO ASI Series B 292. Plenum Press, New York, 1992. [19] R. Edberg, D.J. Evans, and G.P. Morriss, “Conformational dynamics in liquid butane by nonequilibrium molecular dynamics,” J. Chem. Phys., 87, 5700–5708, 1987. [20] G. Marechal, J-P. Ryckaert, and A. Bellemans, “The shear viscosity of n-butane by equilibrium and non-equilibirum molecular dynamics,” Mol. Phys., 61, 33–49, 1987. [21] S.R. de Groot and P. Mazur, Non-Equilibrium Thermodynamics, North-Holland, Amsterdam, 1962. [22] W.G. Hoover, D.J. Evans, R.B. Hickman, A.J. Ladd, W.T. Ashurst, and B. Moran, “Lennard–jones triple-point bulk and shear viscosities. Green–Kubo theory, hamiltonian mechanics, and nonequilibrium molecular dynamics,” Phys. Rev. A, 22, 1690– 1697, 1980.
2.18 THERMAL TRANSPORT PROCESS BY THE MOLECULAR DYNAMICS METHOD

Hideo Kaburaki
Japan Atomic Energy Research Institute, Tokai, Ibaraki, Japan
We perform molecular dynamics simulations for a system in equilibrium, for example at some finite temperature, and by averaging the spontaneous fluctuations we can evaluate the thermal transport coefficients that control nonequilibrium behavior, such as the thermal conductivity, fluid viscosity, and diffusion constant. We can do this by exploiting a key result of nonequilibrium statistical mechanics, the Green–Kubo formula, which connects a macroscopic thermal transport coefficient to a microscopic time autocorrelation function.
1. Diffusion Process, Transport Properties, Macroscopic Equations
When we put a drop of ink in a fluid, the ink particles spread from the localized region of higher concentration to regions of lower concentration, that is, the system relaxes to a uniform state. This equalization is a diffusion phenomenon, a fundamental process of the nonequilibrium state. A typical diffusion process is heat (thermal) conduction, which is connected to the irreversibility stated in the second law of thermodynamics. When there is some nonuniformity of energy, momentum, or particle concentration in a material, a flow occurs, accompanied by dissipation. When the gradients of these quantities are small, so that thermodynamic quantities can be defined locally, transport coefficients that regulate the flows of energy, momentum, and particles can be defined for the processes of thermal conduction, viscous flow, and diffusion. All the macroscopic phenomenological equations of motion that govern thermal conduction, viscous flow, and diffusion are written in conservation form, i.e., as equations of continuity. In the case of thermal conduction, the conserved quantity is the energy per unit volume ρε, where ρ is the density
and ε is the internal energy per unit mass. A conservation law is written by equating the time derivative of the conserved quantity to minus the divergence of the flux density vector. The magnitude of the flux density vector j is the amount of energy per unit time passing through a unit area perpendicular to the direction of flow. The conservation equation of energy is

∂(ρε)/∂t = −div j.

The time derivative of ε is expressed as ∂ε/∂t = c ∂T/∂t, where c is the specific heat, assuming that thermal expansion can be neglected. The flux density j is related to the local temperature T(r, t) by Fourier's law of thermal conduction, j = −κ∇T, through the thermal conductivity κ. Combining all these relations, we finally get the thermal conduction equation, or the diffusion equation of temperature,

∂T/∂t = (κ/ρc)∇²T.

Here κ/(ρc) is the thermal diffusivity, or temperature conductivity. A diffusion equation of the same form is derived for the mass density ρ, which defines the diffusion constant D. In the same way, fluid motion can be described using the momentum density ρv and the momentum flux tensor Π, and the Navier–Stokes equation results, with the viscosity as transport coefficient [1].
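To make the diffusion equation concrete, here is a minimal sketch (our illustration, not part of the original text) that integrates ∂T/∂t = (κ/ρc)∂²T/∂x² in one dimension by explicit finite differences; all material and grid parameters are placeholders.

import numpy as np

# Explicit (FTCS) integration of the 1D heat equation
#   dT/dt = alpha * d2T/dx2,  alpha = kappa / (rho * c)
# Parameter values are illustrative placeholders.
kappa, rho, c = 0.1, 1000.0, 1000.0      # W/(m K), kg/m^3, J/(kg K)
alpha = kappa / (rho * c)                # thermal diffusivity, m^2/s

nx, L = 101, 1.0e-2                      # grid points, domain length (m)
dx = L / (nx - 1)
dt = 0.4 * dx**2 / alpha                 # below the stability limit dx^2/(2 alpha)

T = np.zeros(nx)
T[nx // 2] = 1.0                         # localized initial hot spot (the "drop of ink")

for _ in range(2000):
    # second difference approximates the Laplacian; ends held at T = 0
    T[1:-1] += alpha * dt / dx**2 * (T[2:] - 2.0 * T[1:-1] + T[:-2])

print("peak after spreading:", T.max())

The localized initial disturbance spreads and decays toward the uniform state, exactly the equalization process described above.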
2. Classical Laws of Mechanics, Nonequilibrium Statistical Mechanics
The equations derived above are all phenomenological and based on the continuum assumption, that is, the microscopic properties of the material are smeared out and the material is assumed to be smooth and continuous. The set of equations describing the space–time variations of the conserved quantities is closed once the transport coefficients are determined by experiment. The role of nonequilibrium statistical mechanics is to derive these properties starting from the laws of mechanics. Let us consider a material, or a system, as an assembly of N atoms following the classical laws of mechanics. The phase space is the space of the coordinates and momenta of the atoms, (q_1, q_2, ..., q_N; p_1, p_2, ..., p_N), with 6N degrees of freedom. The mechanical state of the system is described as a point in this phase space, and the time development of the system is described by a set of equations for the 6N variables. If the Hamiltonian of the system is

H = Σ_{i=1}^N p_i²/(2m) + U(q_1, q_2, ..., q_N),

Hamilton's equations of motion are

dp_i/dt = −∂H/∂q_i = −∂U/∂q_i = F_i,   dq_i/dt = ∂H/∂p_i = p_i/m,   for i = 1, ..., N.
A molecular dynamics simulation is the time integration of these equations, and is thus equivalent to a trajectory in phase space. For a small volume element dΓ = dq_1 ··· dq_N dp_1 ··· dp_N of the phase space, we can define the phase-space density f(q_1, ..., q_N; p_1, ..., p_N, t) = f(q, p, t), so that the number of phase points in dΓ is f(q, p, t) dΓ. Considering the flow of points in phase space, we obtain the equation for the density, the Liouville equation

∂f/∂t = −Σ_{i=1}^N (∂f/∂q_i · ∂H/∂p_i − ∂f/∂p_i · ∂H/∂q_i),

which is equivalent to Hamilton's equations of motion. These are the fundamental microscopic equations for a system of N atoms. Now consider some dynamical quantity A(q, p) and ask which microscopic expression corresponds to the value observed in the macroscopic state. The density f(q, p, t) is interpreted as a distribution function, the probability of finding a phase point at a given location in phase space. We regard the ensemble average

⟨A⟩ = ∫ dq dp A(q, p) f(q, p, t)

with respect to this distribution function as corresponding to the observed value. The distribution function f(q, p, t) obeys the Liouville equation, so that by partial integration the above expression can be rewritten as an average over the distribution of initial conditions. Since we cannot control the initial conditions microscopically, an observed value corresponds to an average over samples drawn from the distribution of initial conditions (q_0, p_0) [2]. In a molecular dynamics simulation, we obtain the time evolution of a system of N interacting particles starting from some initial condition (q_0, p_0); that is, one long trajectory in phase space is generated from a single initial point. To evaluate the ensemble average ⟨A⟩ with molecular dynamics we would have to repeat simulations from many arbitrarily chosen initial conditions and average over these similar systems. However, time averages Ā are better suited to a molecular dynamics simulation, since they exploit a single long trajectory. To use time averages we invoke the ergodic hypothesis, which states that for a stationary random process the time average over many instants of a single long simulation is equal to the ensemble average.
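As an illustration of the time integration just described, the following is a minimal velocity Verlet integrator (our sketch; the harmonic force is a placeholder standing in for −∇U):

import numpy as np

def velocity_verlet(q, p, m, force, dt, nsteps):
    # Integrate dq/dt = p/m, dp/dt = F(q); `force` is any callable
    # returning -dU/dq. One long trajectory like this is the raw
    # material for the time averages discussed above.
    f = force(q)
    traj = []
    for _ in range(nsteps):
        p = p + 0.5 * dt * f          # half kick
        q = q + dt * p / m            # drift
        f = force(q)
        p = p + 0.5 * dt * f          # half kick
        traj.append(q.copy())
    return np.array(traj)

# placeholder force: independent harmonic oscillators, F = -k q
traj = velocity_verlet(q=np.ones(3), p=np.zeros(3), m=1.0,
                       force=lambda q: -1.0 * q, dt=0.01, nsteps=1000)
print(traj.shape)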
3. Linear Response Theory, Green–Kubo Formula
We consider a nonequilibrium system that is very close to equilibrium. To describe thermal conduction and diffusion we have to treat the thermal internal forces as a perturbation to the system. These thermal internal forces are intrinsically statistical and cannot be treated as a perturbation to the Hamiltonian, in the way external mechanical forces can. To evaluate thermal transport properties we need heat reservoirs at different temperatures, or a nonuniform temperature distribution, to establish the nonequilibrium state. There are various ways of describing the microscopic states of such a nonequilibrium system, and many theories have been presented; the final formula for the thermal transport coefficients, however, takes the same form in all of them, as long as the system is in the linear regime near equilibrium [2–5]. The Green–Kubo formula for the thermal conductivity is

κ = 1/(3V k_B T²) ∫_0^∞ ⟨J(t)·J(0)⟩ dt,

where the heat flux J is

J = Σ_i E_i v_i + (1/2) Σ_{i>j} (r_i − r_j)[F_ij·(v_i + v_j)],

with

E_i = (1/2) m v_i² + (1/2) Σ_{j(≠i)} φ(|r_i − r_j|),

F_ij = −∂φ(|r_i − r_j|)/∂r_i.
Here ⟨···⟩ denotes the ensemble average, V is the volume, and φ is the pair potential energy between atoms. E_i and v_i are the energy and velocity of the ith particle, and F_ij is the force on particle i from particle j. The formula expresses the conductivity in terms of the autocorrelation function of the heat current. Time correlation function expressions for all the transport coefficients in the macroscopic equations are derived in the same way [4].
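As a hedged sketch of how the heat flux above might be evaluated for a pair potential (open boundaries and an O(N²) double loop, purely for illustration; the function names are ours):

import numpy as np

def heat_flux(r, v, m, phi, dphi):
    # r, v: (N, 3) positions and velocities; m: particle mass;
    # phi(s), dphi(s): pair energy and its derivative at separation s.
    N = len(r)
    E = 0.5 * m * np.sum(v**2, axis=1)      # kinetic part of E_i
    J = np.zeros(3)
    for i in range(N):
        for j in range(i):
            rij = r[i] - r[j]
            s = np.linalg.norm(rij)
            E[i] += 0.5 * phi(s)            # potential part of E_i
            E[j] += 0.5 * phi(s)
            fij = -dphi(s) * rij / s        # force on i from j
            J += 0.5 * rij * np.dot(fij, v[i] + v[j])
    J += np.sum(E[:, None] * v, axis=0)     # convective term sum_i E_i v_i
    return J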
4. Time Correlation Function
The Green–Kubo formula shows that, for a system in the linear regime close to equilibrium, a macroscopic transport coefficient is connected to microscopic quantities. For example, the macroscopic thermal conductivity is the time integral of the ensemble-averaged heat flux autocorrelation function. The dynamical properties of the system can thus be derived from this formulation, in contrast to the kinetic approach, where stochasticity is introduced [6].
For the thermal conductivity to be finite, the autocorrelation function must decay rapidly, for example exponentially. There are cases where the transport properties show anomalous behavior. For a two-dimensional fluid, the autocorrelation function decays algebraically, which is called the long time tail; in this case the self-diffusion coefficient diverges. For one- and two-dimensional lattices, the thermal conductivity diverges as the number of particles increases [7]. In general, time correlation functions are very difficult to calculate exactly, and in cases such as classical fluids molecular dynamics plays a very important role in studying the structure of liquids [8].
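The same Green–Kubo logic gives the self-diffusion constant, D = (1/3) ∫_0^∞ ⟨v(t)·v(0)⟩ dt. A minimal single-time-origin estimator (our sketch, with illustrative array shapes) is:

import numpy as np

def self_diffusion(v, dt):
    # v: velocity trajectory of shape (M, N, 3) for N particles;
    # a single time origin at t = 0, for illustration only.
    vacf = np.einsum('tnd,nd->t', v, v[0]) / v.shape[1]
    # trapezoidal time integral of the velocity autocorrelation
    return dt * (vacf.sum() - 0.5 * (vacf[0] + vacf[-1])) / 3.0

In production one would of course average over many time origins, exactly as described for the heat flux in Section 5 below.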
5. Numerical Calculations of Transport Coefficients by the Molecular Dynamics Method
Molecular dynamics simulation provides a way to calculate the heat flux autocorrelation function in the Green–Kubo formula for the thermal conductivity. While the Green–Kubo formula is frequently used to determine the thermal conductivity of liquids [9], studies of solids by this method are rather recent. For a material whose thermal conductivity is not too high, such as argon, direct evaluation of the heat flux autocorrelation function from the molecular dynamics trajectory, i.e., the time sequence of positions and velocities, is effective. As stated above, the ensemble average is replaced by a time average over N instants sampled from a single long simulation, as in Fig. 1. The sampling time t_sample should be long enough for the autocorrelation function to decay; during this window, the autocorrelation functions at the various time differences are evaluated, t_sample being the longest correlation time. The shift time t_shift between successive samples should be long enough for the samples to be independent. As an example, we show a molecular dynamics calculation of the thermal conductivity of liquid and solid argon, in which a system of 864 particles is numerically integrated using the Lennard–Jones potential φ(r) = 4ε[(σ/r)¹² − (σ/r)⁶], where ε = 119.8 K and σ = 3.405 Å, with a time step of 2 fs and total runs of up to 10⁷ steps. Here the sampling time t_sample is 2.0 × 10⁻¹² to 1.0 × 10⁻¹¹ s, and the time between samples t_shift is taken as 0.1 ps. Figure 2 shows how the ensemble-averaged autocorrelation function converges as the number of samples increases; the result with 500 samples has essentially converged to the final result obtained with a much larger number. Figure 3 shows the time autocorrelation functions for liquid and solid argon at 90 K and 60 K under the freestanding condition, plotted in Cartesian coordinates. The autocorrelation function for the liquid clearly decays exponentially, while that of the solid also decreases exponentially
Figure 1. Sampling of ensembles from a single long time molecular dynamics simulation.
Figure 2. Convergence of the ensemble averaged autocorrelation function.
Figure 3. Heat flux autocorrelation function for argon liquid and solid in the Cartesian coordinates.
but has a longer tail. This is clearly seen by plotting the results on a log–log scale in Fig. 4. The relaxation of solid argon consists of two stages, a shorter atomistic part and a longer phonon part. The final thermal conductivity is derived by numerically integrating the time autocorrelation function; as seen in Fig. 5, the integral converges to a constant value of the thermal conductivity. The final converged value for the liquid state is 0.1255 W m⁻¹ K⁻¹. For a material with high thermal conductivity, such as a covalently bonded crystal, very long runs are required, since a smaller time step is needed and the correlation extends beyond 100 ps. One method is to derive the power spectrum of the heat flux autocorrelation function through fast Fourier transforms and to take the zero-frequency limit ω → 0 to obtain the thermal conductivity. Since the length of the data is finite, care should be taken, as there is some ambiguity in this process. The procedure can be understood by considering the more general expression of the thermal conductivity for unsteady deviations from equilibrium. The formula generalizes to [5]

κ(ω) = 1/(3V k_B T²) ∫_0^∞ ⟨J(t)·J(0)⟩ e^{iωt} dt.   (1)
Figure 4. Heat flux autocorrelation function for argon liquid and solid in the log–log plot.
Figure 5. Integral of heat flux autocorrelation function for liquid and solid argon.
Taking the zero-frequency limit ω → 0 means that the macroscopic time 2π/|ω| is much larger than the microscopic relaxation time of the autocorrelation function. The above expression of thermal conductivity reduces to the static expression under this limit.
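The two numerical routes just described can be sketched as follows: a windowed estimate of ⟨J(t)·J(0)⟩ using t_sample and t_shift, its time integral for the static κ, and the Fourier route of Eq. (1). The function names, and the assumption of a heat-flux series J in SI units, are ours.

import numpy as np

def heat_flux_acf(J, n_corr, n_shift):
    # <J(t).J(0)> from a series J of shape (M, 3): windows of
    # n_corr steps (t_sample), shifted by n_shift steps (t_shift)
    starts = range(0, len(J) - n_corr, n_shift)
    acf = np.zeros(n_corr)
    for s in starts:
        acf += J[s:s + n_corr] @ J[s]
    return acf / len(starts)

def kappa_static(acf, dt, V, T, kB=1.380649e-23):
    # time integral (trapezoid) of the ACF: the Green-Kubo formula
    integral = dt * (acf.sum() - 0.5 * (acf[0] + acf[-1]))
    return integral / (3.0 * V * kB * T**2)

def kappa_of_omega(acf, dt, V, T, kB=1.380649e-23):
    # discrete one-sided transform of Eq. (1); for a real, decaying
    # ACF only the real (cosine) part matters, so the sign convention
    # of rfft is immaterial here
    omega = 2.0 * np.pi * np.fft.rfftfreq(len(acf), d=dt)
    kappa = np.fft.rfft(acf).real * dt / (3.0 * V * kB * T**2)
    return omega, kappa

When the ACF has decayed well inside the data window, kappa_of_omega(...)[1][0] and kappa_static(...) agree, which is just the ω → 0 limit discussed above.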
6. Outlook
The Green–Kubo formalism, combined with molecular dynamics simulation, was applied to the classical liquid problem immediately after the theory was formulated, and the method is well established in that area. Its application to the thermal conductivity of solids, by contrast, came considerably later. Thermal conductivity in solids has mostly been studied with the phonon Boltzmann–Peierls method in the relaxation-time approximation, in which the relaxation times are ultimately fitted to experimental data or evaluated by perturbation theory. The Green–Kubo approach, on the other hand, is formally exact and opens a way to calculating the thermal transport coefficients for any state: gas, liquid, or solid. It does, however, require the autocorrelation function of the fluctuating fluxes, which amounts to solving the equations of motion of N interacting particles. With the development of computers, the molecular dynamics method combined with this formula continues to be an effective tool, in particular for delving into the dynamical aspects of thermal transport in various materials.
References

[1] L. Landau and E.M. Lifshitz, Fluid Mechanics, 2nd edn., Pergamon Press, New York, 1987.
[2] D.N. Zubarev, Nonequilibrium Statistical Thermodynamics, Consultants Bureau, New York, 1974.
[3] R. Zwanzig, "Time-correlation functions and transport coefficients in statistical mechanics," Annu. Rev. Phys. Chem., 16, 67–102, 1965.
[4] D.A. McQuarrie, Statistical Mechanics, Harper & Row, New York, 1976.
[5] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II, 2nd edn., Springer, Berlin, 1991.
[6] R.E. Peierls, Quantum Theory of Solids, Oxford University Press, New York, 1955.
[7] S. Lepri, R. Livi, and A. Politi, "Thermal conduction in classical low-dimensional lattices," Phys. Rep., 377, 1–80, 2003.
[8] J.P. Boon and S. Yip, Molecular Hydrodynamics, Dover, New York, 1980.
[9] C. Hoheisel and R. Vogelsang, "Thermal transport coefficients for one- and two-component liquids from time correlation functions computed by molecular dynamics," Comput. Phys. Rep., 8, 1–70, 1988.
2.19 ATOMISTIC CALCULATION OF MECHANICAL BEHAVIOR

Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, OH, USA
Mechanical behavior is stress-related behavior. This can mean that the material response is driven, at least in part, by externally applied stress, or that the underlying processes are mediated by an internal stress field; very often both are true. Owing to defects and their collective behavior [1], the spatiotemporal spectrum of the stress field in a real material tends to have a very large spectral width, with non-trivial coupling between different scales, which is another way of saying that the mechanical behavior of real materials tends to be multiscale. The concept of a stress field is usually valid when coarse-grained above a few nm; in favorable circumstances, such as when crystalline order is preserved locally, it may be applicable down to the sub-nm lengthscale [2]. Overall, however, the atomic scale is where the stress concept breaks down, and atomistic simulations [3–5] provide very important termination or matching conditions for stress-based theories. Large-scale atomistic simulations (see Chap. 2.27) are approaching the µm lengthscale and are starting to reveal the collective behavior of defects [6]. But studying defect unit processes is still a main task of atomistic simulation. It is infeasible to list the current developments in this area to any degree of completeness, so only a few highlights are given. A somewhat more detailed review can be found in Ref. [5].

• The study of deformation [7–11], grain growth [12] and fracture [13, 14] in nanocrystalline materials.
• Atomistic simulation of adhesion and friction [15, 16], and nanoindentation [17–20].
• The study of dislocation core structure and Peierls stress in BCC metals [21], semiconductors [22] and intermetallics [23]; a proper definition of the dislocation core energy and numerically precise ways [24, 25] to extract the core energy from periodic boundary condition (PBC) atomistic calculations.
• Thin film deposition, texture evolution and mechanical properties [26, 27].
• The study of dynamical brittle fracture [28, 29] and lattice trapping barriers [30, 31], and of ductile fracture [6, 32].
• The study of phase and grain boundaries [33, 34].
• Deformation and fracture of amorphous materials [35].
• The application of Hessian-free minimum energy path (MEP) search algorithms [36] to study dislocation cross-slip in FCC metals [37], double-kink nucleation and migration in semiconductors [38] and BCC metals [39], and heterogeneous dislocation nucleation at crack tips [2].
• Defect generation/evolution induced by irradiation, and its effect on mechanical properties [40–42].
• Connection of atomistics to the mesoscale [43–46].
In this contribution we review the basic concepts of strain, stress and elastic constants [47]. We then move to a discussion of the dislocation core energy [25]. Finally, we discuss a minimum energy path calculation of heterogeneous dislocation nucleation at an atomically sharp crack tip [2].
1. Strain, Stress and Elastic Constants
Stress and strain have many definitions which, although they do not change the physics, differ in how efficiently they represent a particular problem. Here we introduce a system that is usually the most convenient for atomistic calculations. Strain should be relative: to define strain, one must first declare the reference state. This is reasonable because strain describes deformation. Strain should be frame-covariant like any true second-rank tensor [48], since how much an object is deformed does not depend on the angle from which one looks at it. Here we denote the geometrical configuration of an object by X, Y or Z, which describes its shape, i.e., its surface constraints. For periodic boundary condition (PBC) simulations, this would be the supercell H-matrix (Chapter 2.8). An affine transformation of an object from one shape to another is specified by the tensor J, expressed as Y = JX; it is homogeneous in the sense that the surface constraints of the object change uniformly according to J, but it does not have to be a microscopically homogeneous transformation, as different kinds of atoms may have different atomic-scale relaxations. The Lagrangian strain is defined to be

η_X^Y ≡ (1/2)(J^T J − 1).   (1)
The subscript X in η_X^Y denotes the reference state and the superscript Y the final state. If the final state is apparent we may omit the superscript and simply write η_X. The polar decomposition theorem [49] states that every matrix can be uniquely expressed as the left or right product of a symmetric matrix and a rotation matrix,

J = RM = LR,   M^T = M,   L^T = L,   R^T R = 1.   (2)

Therefore,

η_X = (1/2)(J^T J − 1) = (1/2)(M² − 1).   (3)

There is a one-to-one correspondence between η_X and M:

M = √(1 + 2η_X) = 1 + η_X − (1/2)η_X² + ...   (4)

Let Y = JX and Z = KY = KJX. Then

η_Y^Z = (1/2)(K^T K − 1),
η_X^Z = (1/2)(J^T K^T K J − 1) = (1/2)(J^T (1 + 2η_Y^Z) J − 1) = J^T η_Y^Z J + η_X^Y,   (5)

which is the law of η conversion between reference systems.
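The strain algebra of Eqs. (1)–(5) is easy to exercise numerically. The following sketch (ours; it assumes the columns of the cell matrices span the supercell) computes the Lagrangian strain and the polar decomposition of J:

import numpy as np

def lagrangian_strain(X, Y):
    # eta_X^Y = (J^T J - 1)/2 for the affine map Y = J X, Eq. (1)
    J = Y @ np.linalg.inv(X)
    return 0.5 * (J.T @ J - np.eye(3))

def polar_decompose(J):
    # J = R M with R a rotation and M symmetric, Eq. (2), via SVD
    U, s, Vt = np.linalg.svd(J)
    return U @ Vt, Vt.T @ np.diag(s) @ Vt   # R, M

X = np.eye(3)
Y = np.array([[1.01, 0.002, 0.0],           # small illustrative deformation
              [0.0,  1.00,  0.0],
              [0.0,  0.0,   0.99]])
eta = lagrangian_strain(X, Y)
R, M = polar_decompose(Y @ np.linalg.inv(X))
print(np.allclose(M @ M, np.eye(3) + 2 * eta))   # checks Eqs. (3)-(4)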
Contrary to strain, stress should be absolute, meaning it should not depend on any reference state besides the current state of the object. We use two definitions of stress here. The first is the external stress τ_ij, the usual "force per area" definition used by engineers:

dT_i = τ_ij n_j dS,   (6)
where dT_i is the external traction force, n_j is the outward surface normal, and dS is the surface area; the Einstein summation convention is used. τ_ij is what the outside environment exerts on the object; to prevent rotation, it must satisfy τ_ij = τ_ji. The second kind of stress is the thermodynamic stress t_ij, also called the intrinsic stress of the volume, whose definition is based on the Helmholtz free energy F(N, T, X) of the object:

F(N, T, X) = E − TS ≡ −k_B T ln Z(N, T, X),   (7)

where Z(N, T, X) is the partition function [50, 51],

Z(N, T, X) ≡ ∫_X exp(−βH(q^N, p^N)) dq^N dp^N/(N! h^{3N}).   (8)
Here F is a function of the particle number N, the temperature T, and the geometrical constraint X. Since the Hamiltonian H(q^N, p^N) is usually rotationally invariant, F is also rotationally invariant. Thus,

F(N, T, Y) = F(N, T, JX) = F(N, T, RMX) = F(N, T, MX)
           = F(N, T, √(1 + 2η_X) X) = F(N, T, η_X, X),   (9)

i.e., F is a function of η_X, once X is chosen. A function can always be expanded in a Taylor series:

F(η_X, X) = F(0, X) + (∂F/∂η_ij)|_{η_X=0} η_ij + (1/2)(∂²F/∂η_ij ∂η_kl)|_{η_X=0} η_ij η_kl + ...   (10)
Because η_ij is symmetric, the expansion should only involve six independent variables: η_11, η_22, η_33, η_23, η_13, η_12. But that is often inconvenient for index contraction, so one symmetrizes the expansion coefficients over η_ij and η_ji whenever possible, while pretending η_ij and η_ji to be different summation variables. Let us define second- and fourth-rank symmetrization operators:

Ŝ_2(G_ij) = (1/2)(G_ij + G_ji),   (11)

Ŝ_4(W_ijkl) = (1/4)(W_ijkl + W_ijlk + W_jikl + W_jilk).   (12)
The thermodynamic stress at configuration X is defined to be

t_ij(X) = (1/Ω(X)) Ŝ_2[∂F(η_X, X)/∂η_ij]|_{η_X=0},   (13)

and the elastic constant

C_ijkl(X) = (1/Ω(X)) Ŝ_4[∂²F(η_X, X)/∂η_ij ∂η_kl]|_{η_X=0},   (14)
where Ω(X) is the volume of the object at X, so that t_ij and C_ijkl are intensive quantities. By definition,

F(η_X, X) = F_0 + Ω(X)[t_ij(X)η_ij + (1/2)C_ijkl(X)η_ij η_kl + ...],
t_ij = t_ji,   C_ijkl = C_ijlk = C_jikl = C_jilk.   (15)
Notice that since t_ij and C_ijkl are expansion coefficients of η_X in F(η_X, X) at η_X = 0, they themselves are not functions of η_X, but only of X. That means the definitions of thermodynamic stress and elastic constant do not require a reference state, since to evaluate them we use the object itself, at that moment, as the reference state. The use of this co-moving reference frame has some "strange" consequences, which are treated generally in differential geometry [48]. For instance,

t_ij(Y) ≠ t_ij(X) + C_ijkl(X)(η_X^Y)_kl + ...,   (16)
which is not what one might expect from a Taylor expansion of the "first-order derivative" in terms of the "second-order derivative"; that expansion works only when we use a fixed reference frame. In fact, in light of (5),

F(Z) = F(η_Y^Z, Y)
     = F(Y) + Ω(Y) Tr[t(Y) η_Y^Z] + O((η_Y^Z)²)
     = F(η_X^Z, X)
     = F(X) + Ω(X) Tr[t(X) η_X^Z] + (Ω(X)/2) Tr[η_X^Z C(X) η_X^Z] + O((η_X^Z)³)   (17)
     = F(X) + Ω(X) Tr[t(X) η_X^Z] + (Ω(X)/2) Tr[η_X^Z C(X) η_X^Z]
       + O((η_X^Y)² η_Y^Z) + O((η_Y^Z)²).   (18)
The linear coefficients of η_Y^Z in (17) and (18) must be equal. Plugging (5) into (18), we have

F(Z) = const + Ω(X) Tr[J t(X) J^T η_Y^Z] + Ω(X) Tr[J (C(X) η_X^Y) J^T η_Y^Z]
       + O((η_X^Y)² η_Y^Z) + O((η_Y^Z)²).   (19)
Therefore, matching the linear coefficient of η_Y^Z to that of (17), we have

t(Y) = J t(X) J^T/det|J| + J (C(X) η_X^Y) J^T/det|J| + O((η_X^Y)²).   (20)
It can be shown that if J is constrained to be symmetric, then

t_ij(Y) = t_ij(X) + B_ijkl(X)(η_X^Y)_kl + O((η_X^Y)²),   (21)

where B_ijkl(X) is the elastic stiffness coefficient [47]:

B_ijkl(X) = C_ijkl(X) + (1/2)(δ_ik t_jl(X) + δ_jk t_il(X) + δ_il t_jk(X) + δ_jl t_ik(X) − 2δ_kl t_ij(X)).   (22)
B_ijkl(X) is equal to C_ijkl(X) only when t_ij(X) = 0; therefore the use of the elastic constant as the linear expansion coefficient of stress versus strain (both defined above) is a valid practice only at zero load. It can be proven by minimizing the Gibbs free energy [47] that equilibrium is reached at X when t_ij(X) = τ_ij. Thus the two quantities have identical values at equilibrium; however, they have different physical connotations. Atomistic expressions for the thermodynamic stress and elastic constants can be derived for the canonical ensemble [50, 52]. The partition function for a deformed system is

Z(X, M) = ∫_{MX} exp(−βH(q̃^N, p̃^N)) dq̃^N dp̃^N/(N! h^{3N}),   (23)

where we assume

H(q̃^N, p̃^N) = Σ_{n=1}^N p̃_n^T·p̃_n/(2m_n) + V(q̃_1, q̃_2, ..., q̃_N).   (24)
Under the change of variables q̃_n → q_n, p̃_n → p_n,

q̃_n ≡ M q_n,   p̃_n ≡ M⁻¹ p_n,   n = 1, ..., N,   (25)

the Hamiltonian can be written as

H(q^N, p^N) = Σ_{n=1}^N p_n^T M⁻² p_n/(2m_n) + V(Mq_1, Mq_2, ..., Mq_N).   (26)

Using (4) and also

M⁻² = 1/(1 + 2η_X) = 1 − 2η_X + 4η_X² + ...,   (27)
the partition function can be written as

Z(X, η_X) = ∫_X exp{−β[Σ_{n=1}^N p_n^T(1 − 2η_X + 4η_X²)p_n/(2m_n) + V((1 + η_X − (1/2)η_X²)q^N)]} dq^N dp^N,   (28)

where we have dropped the constant N! h^{3N}. Using the index notation η_ij for the matrix η_X,

∂F/∂η_ij = −(1/β)(1/Z)(∂Z/∂η_ij) = (1/Z) ∫_X T_ij exp(−βH) dq^N dp^N,   (29)
where

H(q^N, p^N) ≈ Σ_{n=1}^N p_n^T(1 − 2η_X + 4η_X²)p_n/(2m_n) + V((1 + η_X − (1/2)η_X²)q^N),   (30)

and

T_ij = ∂H/∂η_ij = Σ_{n=1}^N [p_i^n(−δ_jk + 4η_jk)p_k^n/m_n + (δ_ik − η_ik)q_k^n ∇_j^n V((1 + η_X)q^N)].   (31)

Setting η_X to zero, we get the atomistic formula for the thermodynamic stress:
N − pin pnj 1 ˆ S2 + qin ∇ nj V (q N ) (X ) m n n=1
.
(32)
Here ⟨···⟩ means the canonical ensemble average in the original configuration X. One may wonder why the sum in (32) does not always give 0 at T = 0, since ∇_j^n V(q^N) ≡ 0 for bulk atoms at equilibrium. The answer is that if we were to compute the stress using (32) as it stands, we would have to count the atoms on the surface, whose equilibrium conditions F_j^n = ∇_j^n V(q^N) in general require the presence of external forces F_j^n, the forces the wall exerts on the atoms to keep them within X. Since in (32) these F^n's are weighted by q^n's, this surface contribution does not vanish in the thermodynamic limit (N, Ω → ∞) on a per-volume basis, as the surface energy does. On the other hand, it appeals to one's intuition that stress originates from the bulk, not from the surface, and is an intensive quantity. This can be seen in the following way. V(q^N) is in general a sum of local interactions, for instance V(q^N) = Σ_{lmn} W(q_l, q_m, q_n), where the W's are three-body local interactions. Due to translational symmetry, W(q_l + δ, q_m + δ, q_n + δ) = W(q_l, q_m, q_n), one must have ∇^l W + ∇^m W + ∇^n W ≡ 0, so the contribution of this specific interaction to the total sum (32) can be rewritten as (q_i^l − q_i^n)∇_j^l W + (q_i^m − q_i^n)∇_j^m W, conceptualized as Σ F·Δq, i.e., force contributions weighted by the relative distance between action and reaction. Through this localization transformation, all the q^n weighting factors in the sum are converted to Δq's that are no larger than the interatomic distance. In this transformed summation, which should be derived from (32) as soon as the interatomic potential model is known, the surface contribution does vanish in the thermodynamic limit on a per-volume basis, like the surface energy. So for local interactions we can prove that the stress is intensive and may indeed be thought of as originating from the bulk.
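For a pair potential, the localized form of Eq. (32) can be sketched as follows (tension-positive convention, open boundaries, O(N²) loop; an illustration, not a production implementation):

import numpy as np

def virial_stress(r, v, m, dphi, volume):
    # r, v: (N, 3) positions and velocities;
    # dphi(s): derivative of the pair energy phi at separation s
    N = len(r)
    t = np.zeros((3, 3))
    for n in range(N):                      # kinetic part: -p p / m
        t -= m * np.outer(v[n], v[n])
    for i in range(N):                      # localized pair part
        for j in range(i):
            rij = r[i] - r[j]
            s = np.linalg.norm(rij)
            # (r_i - r_j)-weighted force contribution, already symmetric
            t += np.outer(rij, dphi(s) * rij / s)
    return t / volume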
To get the atomistic formula for the elastic constants, we differentiate (29) once more:

∂²F/∂η_ij ∂η_kl = (1/Z) ∫_X (∂T_ij/∂η_kl − βT_ij T_kl) exp(−βH) dq^N dp^N
                + (β/Z²) ∫_X T_ij exp(−βH) dq^N dp^N ∫_X T_kl exp(−βH) dq^N dp^N
                = β(⟨T_ij⟩⟨T_kl⟩ − ⟨T_ij T_kl⟩) + ⟨∂T_ij/∂η_kl⟩.   (33)

From (31) we can get

∂T_ij/∂η_kl|_{η_X=0} = Σ_{n=1}^N (4p_i^n p_k^n/m_n) δ_jl − δ_il Σ_{n=1}^N q_k^n ∇_j^n V(q^N) + Σ_{m,n=1}^N q_k^m q_i^n ∇_l^m ∇_j^n V(q^N).   (34)

So we get the unsymmetrized form of the elastic constants:

D_ijkl = βΩ(X)(⟨t_ij⟩⟨t_kl⟩ − ⟨t_ij t_kl⟩) + (1/Ω(X)) Σ_{n=1}^N ⟨4p_i^n p_k^n/m_n⟩ δ_jl
       − (1/Ω(X)) Σ_{n=1}^N ⟨q_k^n ∇_j^n V(q^N)⟩ δ_il + (1/Ω(X)) Σ_{m,n=1}^N ⟨q_k^m q_i^n ∇_l^m ∇_j^n V(q^N)⟩.   (35)
The first term is defined to be the fluctuation term. The last term is defined to be the Born term, usually written C^B_ijkl. The elastic constant is therefore

C_ijkl = Ŝ_4(D_ijkl),   (36)
which is valid at finite temperature and for arbitrary stress. The summation in (35) also needs to undergo the same localization procedure as (32) to be computable in atomistic calculations. Equations (32) and, especially, (35) are applicable only to the canonical ensemble; for the microcanonical ensemble, a different set of formulas can be derived [53].
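At zero temperature, the definitions (13)–(15) also suggest a simple numerical route: deform the cell by a prescribed η, re-evaluate the energy, and take second differences of the energy density. The sketch below assumes a placeholder energy(H) callable returning the potential energy for cell matrix H; it is our illustration, not the text's.

import numpy as np

def elastic_constant_fd(energy, X, i, j, k, l, eps=1e-4):
    # C_ijkl at T = 0 by second differences of F(eta)/Omega; note it
    # equals the stress-strain slope only at zero load, cf. Eq. (22)
    omega = abs(np.linalg.det(X))

    def F(eta):
        # J = M = sqrt(1 + 2 eta), Eq. (4), then deform the cell
        w, Q = np.linalg.eigh(np.eye(3) + 2.0 * eta)
        M = Q @ np.diag(np.sqrt(w)) @ Q.T
        return energy(M @ X)

    e = np.zeros((3, 3)); e[i, j] = e[j, i] = 1.0
    f = np.zeros((3, 3)); f[k, l] = f[l, k] = 1.0
    # mixed second difference; for i != j (or k != l) this probes the
    # symmetrized pair eta_ij + eta_ji together
    d2 = (F(eps * (e + f)) - F(eps * e) - F(eps * f) + F(0.0 * e)) / eps**2
    return d2 / omega

# usage, e.g. C_1111: elastic_constant_fd(energy, X, 0, 0, 0, 0)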
2. Dislocation Core Energy
The dislocation core is a remarkable bond-cutting machine (the “sharpest knife”) that nature comes up with to relieve the stored elastic energy. While the internal mechanisms of this machine can be highly complicated, the overall effect is that atomic bonds come into the machine, get cut in shear, and new
bonds with dislocated neighbors are left in its wake, much like a combine in a crop field. Through its operation, diffuse elastic strain in the environment is collected and condensed into local inelastic (transformation) strain in a one-atomic-layer-thin platelet, the glide plane [5]. There are actually two definitions of the dislocation core size [24, 25]: a physical core width and a mathematical/elasticity core width. The physical core was described in the first paragraph: it is defined by the atoms whose local atomic order, such as the coordination number or inversion symmetry (Chap. 2.32), is drastically different from that of the crystalline bulk, from which we may define a physical core size r_0^phys. In other words, the physical core is the set of atoms participating actively in the bond-cutting business. Obviously r_0^phys is significant and useful, but it need not be a precise real number (like 1.8234a_0) due to lattice discreteness. In contrast, the mathematical core radius r_0 and core energy E_core can be defined precisely as real numbers from an asymptotic expansion of the total energy of a dislocation dipole in an infinite, and otherwise perfect, atomic lattice,

E(d) = 2E_core + 2A(θ) + (K_s|b|²/2π) log(|d|/r_0) + O(|d|⁻¹),   (37)
(37)
at large |d|. Here, E(d) is defined to be the total energy increase in a thought experiment of an infinite lattice whose atoms displace according to the leadingphys order [55] solution uG (x) at |x−d/2|, |x+d/2| r0 , but which are allowed to relax atomistically near the physical cores. As the Stroh solution is selfequilibrating (stress equilibrium is satisfied), the above thought experiment is well-posed and E(d) is the final increase in the atomistic total energy. At large |d|, the leading d-dependent term in E(d) must be K s |b|2 log |d|/2π , with K s proven invariant with respect to the displacement cut direction dˆ ≡ d/|d| [56]. Let us define θ to be the angle between dˆ and an arbitrarily chosen reference ˆ = |ˆa| = 1, and ξ is the line direcdirection aˆ , with dˆ ⊥ ξ and aˆ ⊥ ξ , |d| tion of the straight dislocation. An asymptotic expansion of E(d) at large |d| would yield O(log |d|), O(1), O(|d|−1 ), . . . terms. The O(1) term may contain a θ-dependent component 2A(θ), and a θ-independent component. For the sake of definiteness, we require A(θ = 0) = 0, and aˆ will be called the zero-angle reference axis. A(θ) is given entirely by anisotropic elasticity, 2A(θ) =
3 b T Kα b α=1
4π
log
(dˆx + pαr dˆy )2 + ( pαi dˆy )2 , (aˆ x + pαr aˆ y )2 + ( pαi aˆ y )2
(38)
where pα ≡ pαr + i pαi , α = 1..3, are the three Stroh eigenvalues with nonnega)Im(Lα )T + Im(Lα )Re(Lα )T ) is the tive imaginary parts, and Kα ≡ −2(Re(L 3 α T mode-specific modulus [56], with α=1 b Kα b = K s |b|2 . Physically, 2A(θ) is the rotational energy landscape of a dislocation dipole with fixed |d| in an infinite anisotropic medium [24], when |d| is asymptotically large. It is seen
from (38) that A(θ) = A(θ + π). To illustrate, A(θ) for the Si a_0/2[1̄10] and Mo a_0/2[111] screw dislocations is evaluated and shown in Fig. 1. With the O(log|d|) and θ-dependent O(1) parts known, the |d|- and θ-independent O(1) part of E(d) can be used to determine the mathematical (r_0, E_core) pair. Imagine that for a fixed θ we plot the E(d) data against |d| on a chart (d can take only discrete lattice values), and we would like to fit the data to a smooth function Ẽ(d). We need to shift the function K_s|b|² log|d|/2π up or down to get a good fit at large |d|. That shift operation is well defined asymptotically and is mathematically unique. If we ignore the |d|⁻¹ etc. terms in the fitting template Ẽ(d) ≡ 2E_core + 2A(θ) + (K_s|b|²/2π) log(|d|/r_0), then 2E_core + 2A(θ) would be the ordinate of Ẽ(d) at |d| = r_0. It does not mean, however, that E(r_0) = 2E_core + 2A(θ), as Ẽ(d) only fits E(d) well at large |d| (satisfying at minimum |d| ≫ 2r_0^phys). It is thus clear that r_0 and E_core are mathematical instruments for fitting E(d) to an asymptotic form, and neither quantity alone carries physical meaning. If one likes, one may choose r_0 = 1000|b| and select E_core accordingly, so that Ẽ(d) remains the same function and nothing is changed. There are several popular choices, however, such as (a) take r_0 = |b|, (b) choose r_0 so that E_core = 0, (c) take r_0 = r_0^phys to minimize confusion, (d) take r_0 = 1 Å to simplify numerical calculation, etc. It is seen that, except for (c), none of these r_0's has anything to do with a physical core size. It is also clear that although Ẽ(d) by definition must fit E(d) well at large |d|, there should be a large error as |d| → 2r_0^phys and the physical cores begin to overlap. Finally, r_0 and E_core (and â too) combined
Figure 1. (a) The angular function A(θ) of the a_0/2[1̄10] shuffle-set screw dislocation in Stillinger–Weber potential Si [24], with [112̄] as the zero-angle reference axis â. The corresponding core energy is computed to be 0.502 eV/Å for r_0 = |b|. In a separate calculation [54], with [111] as the zero-angle reference axis, the core energy was computed to be 0.526 eV/Å. The 0.024 eV/Å difference is verified to be exactly A(θ = π/2), as indicated by the circle. (b) The angular function A(θ) for the Mo a_0/2[111] screw dislocation using the Finnis–Sinclair potential (dashed line) and the tight-binding potential (solid line), both with â chosen to be [112̄]. There is A(θ) = A(θ + π/3) due to crystal symmetry.
do carry physical meaning, as much as any other defect formation energy, for example in evaluating the absolute total energy of formation of a dislocation loop. The atomistically computed E_core is critical for constructing the total energy landscape of coarse-grained models like nodal dislocation dynamics. From the above, it is apparent that the choice of the zero-angle reference axis â influences the numerical value of E_core, in addition to the choice of r_0. This point is not widely appreciated. Indeed, even the existence of the dipole rotational energy 2A(θ) has usually been ignored in analyses of atomistic simulation results in the literature. Note from Eq. (38) that A(θ) originates entirely from elasticity. A(θ ≠ nπ) is generally non-zero for any dislocation dipole except a screw dislocation dipole in an isotropic medium; for example, A(θ) is nonzero for an edge dislocation dipole in an isotropic medium. E_core thoroughly characterizes the net energy consequence of the core atomic relaxations, but one must be informed about which elasticity-function parameters r_0 and â are chosen as matching partners. For instance, it was reported [54] that E_core of the a_0/2[1̄10] shuffle-set screw dislocation in diamond cubic Si was 0.502 eV/Å, with r_0 = |b| and using the Stillinger–Weber potential. Later, a separate, independent calculation gave E_core = 0.526 eV/Å for the same setup. It was then traced back and determined that while the latter calculation used â = [112̄], the former calculation in effect used â = [111]. The offset is exactly given by A(θ = π/2) = 0.024 eV/Å, as shown in Fig. 1(a). So both calculations are correct, the only difference being the choice of the zero-angle reference axis â, with a trivial conversion between the E_core's. To reiterate, the numerical value of E_core carries no physical meaning unless â and r_0 are specified. The conversion of E_core to another (â, r_0) "basis" can be performed easily using the fact that E(d) of Eq. (37), being a physical measurable in a well-posed thought experiment, is invariant, while â, r_0 and E_core are merely parameters in the mathematical representation of its asymptotic form.

In the example next, we show how the core energy of the BCC Mo screw dislocation can be calculated in small supercells using the Finnis–Sinclair potential [57]. All our E_core values below are based on r_0 = |b| and â = [112̄]. The setup is as follows. Define e_1 = a_0[112̄], e_2 = a_0[1̄10], e_3 = a_0/2[111]. An orthogonal supercell 7e_1 × 11e_2 × e_3 is almost square and contains 462 atoms, in which we can put four equally spaced screw dislocations to form a quadrupole. Because of symmetry redundancy, this quadrupole cell can be mapped to an entirely equivalent dipole cell of half its size with the three edges h_1 = 7e_1, h_2 = 3.5e_1 + 5.5e_2 + 0.5e_3, h_3 = e_3. The 0.5e_3 in h_2 is critical to this mapping, in view of the fact that ε_total = ε_elastic + ε_plastic, where ε_total is the total strain corresponding to the tilt of the supercell, ε_plastic is the plastic strain generated by the displacement cut in the dipole cell (in the quadrupole cell ε_plastic is zero, as there are two opposing cuts), and ε_elastic is the volume-averaged elastic strain in the supercell, which relates directly to the cell-averaged virial stress τ_virial. So, by "preemptively" making ε_total = ε_plastic, we make sure that
ε_elastic = 0 and τ_virial ≈ 0. It can be shown (a) that τ_virial = 0 minimizes the supercell total energy E_atomistic with respect to the cell shape (h_1, h_2, h_3) [24, 54], and (b) that at dipole separation d = h_1/2 the local stresses at the first and second dislocations vanish simultaneously: τ_1 = τ_2 = 0. This stabilizes the two dislocations so they do not annihilate, which happens frequently in small-supercell calculations. And even when they do not annihilate, a finite driving force would push a dislocation core against the lattice barrier and distort its shape from equilibrium, which introduces error into the computed core energy E_core. We can now briefly discuss the image-sum procedure for extracting the core energy from periodic supercell calculations; a detailed account is given in Chap. 2.22. An instructive approach to this problem is to think about how to explicitly construct a displacement field u(x) in the supercell that (a) satisfies the displacement cut required by the dipole, (b) is self-equilibrating, and (c) is compatible with the PBC: u(x + h_i^0) = u(x), together with all orders of derivatives including the first, with {h_i^0} being the supercell edges before the dipole cut. The following Green's function sum
ũ_λ(x) ≡ λ[u_G(x) + Σ_{R≠0} u_G(x − R)]   (39)
could conceivably lead to u(x), where u_G(x) is the displacement field of an isolated dislocation dipole in an infinite medium (the one used in the thought experiment). The dislocation lines are all parallel to h_3^0, and R = n_1 h_1^0 + n_2 h_2^0, with n_1 = −N..N, n_2 = −αN..αN. λ runs from 0 to 1 and labels the magnitude of the cut displacement from 0 to b. The presence of the u_G(x) term in ũ_λ(x) satisfies condition (a). Condition (b) is trivially satisfied, as all Green's function displacements are self-equilibrating away from the cores. Condition (c) is a bit more subtle, but it can be rigorously shown that
ũ_λ(x + h_i^0) − ũ_λ(x) = λD(α)h_i^0 + O(1/N)   (40)

as N → ∞, where D(α) is a 3 × 3 affine transformation matrix that depends only on the image-summation aspect ratio α. D(α) is the cause of the apparent conditional convergence. To get rid of it, we write

u_λ(x) = ũ_λ(x) − λD(α)x.   (41)
It is seen now that u_λ(x) satisfies (a), (b), (c) simultaneously, so one can use u_{λ=1}(x) to transform atoms in the PBC cell without creating gaps or stress non-equilibrium. In practice, D(α) is evaluated numerically by analyzing the behavior of ũ_λ(x) from image summations at constant α and progressively larger N's. Suppose we start out with a PBC supercell {h_i^0} containing a stress-free crystal. We adiabatically change λ by effecting a cut increment dλb along the
dipole cut in the cell. At each instant, the displacement field in the cell is u_λ(x), so the stress field σ_λ(x) is available by plugging in ∇u_λ(x). The incremental work is simply
dW = dλ ∫ b·σ_λ(x)·n dS,   (42)
which is converted into potential energy. Equations (39), (41), and (42) combined give a total energy expression that consists of:

• the dipole self-energy, in the form of (37);
• the image dipole/displacement-cut coupling energy;
• the D(α) stress/displacement-cut coupling energy.

Summation over the individual Stroh modes, as in Eq. (38), is needed to account for the dipole–dipole interaction energy E_dipole−dipole. The expression
E_dipole−dipole = (K_s|b|²/2π) log(|R + d||R − d|/|R|²)   (43)
is simply incorrect in an anisotropic medium, as it ignores the 2A(θ) angular-coupling terms. Note also that one needs to put in an extra factor of 1/2,

W_image dipole = (1/2) E_dipole−dipole,   (44)
for the R ≠ 0 dipole–dipole interaction energy, since one dipole "owns" only one half of the total coupling energy. All this follows automatically from Eq. (42). The Eq. (41) setup is easier to explain, but gives a large supercell virial stress, since ε_total = 0 and, with

ε_plastic ≡ (D_plastic + D_plastic^T)/2,   D_plastic ≡ b(d × h_3^0)^T/V,   (45)

ε_elastic = −ε_plastic.
(46)
solution more often, with a new supercell hi =h0i +λDplastich0i that is introduced at the beginning of this section. The energy of this setup can be related to the previous one by accounting for the boundary work, which leads to a very simple result [24, 53]. To validate the above, we relax the Mo screw dislocation dipole in four supercell geometries using the Finnis–Sinclair potential: i. h1 = 7e1 , h2 = 3.5e1 + 5.5e2 + 0.5e3 , h3 = e3 cell, containing 231 atoms, ii. h1 = 8e1 , h2 = 16e2 + 0.5e3 , h3 = e3 cell, containing 768 atoms,
iii. h_1 = 16e_1, h_2 = 64e_2 + 0.5e_3, h_3 = e_3, containing 6144 atoms;
iv. h_1 = 32e_1, h_2 = 32e_2 + 0.5e_3, h_3 = e_3, containing 6144 atoms.

The differential displacement maps [58] of (i) and (ii) are shown in Fig. 2, in which the spontaneous polarities are manifest. If we use Å as the length unit, then we can write
E_atom = E_elastic + 2(E_core − (K_s|b|²/4π) log r_0)|h_3^0|,   (47)

where E_atom is the increase in total energy in the PBC supercell, and E_elastic is the result of the elastic energy summation without the r_0, E_core constants; by choosing â = [112̄], the 2A(θ) term in Eq. (37) also gives no contribution (though its effects are present in the image dipole coupling energies). K_s|b|²/4π, the single-dislocation energy prefactor, is 0.499 eV/Å for the Finnis–Sinclair potential. Numerical results for (i)–(iv) are shown in Table 1. We see that as the supercell size and shape are varied, the elastic energy contribution E_elastic dominates the total energy landscape; however, the differences between E_atom and E_elastic remain remarkably constant. If we take r_0 = |b| and â = [112̄], then E_core = 0.300 ± 0.001 eV/Å, a definitive result. Further, we note that cell
Figure 2. Differential displacement maps [58] of the Mo screw dislocation using the Finnis–Sinclair potential. (i) h_1 = 7e_1, h_2 = 3.5e_1 + 5.5e_2 + 0.5e_3, h_3 = e_3 cell. (ii) h_1 = 8e_1, h_2 = 16e_2 + 0.5e_3, h_3 = e_3 cell.

Table 1. Mo screw dislocation core energy with r_0 = |b| and â = [112̄] using the Finnis–Sinclair potential

Cell    E_supercell [eV]    E_elastic [eV]    E_core [eV/Å]
(i)          6.0410             7.1361           0.2995
(ii)         7.0069             8.0955           0.3006
(iii)        8.8935             9.9838           0.3003
(iv)        11.0432            12.1318           0.3007
(i), which contains only 231 atoms, is capable of representing the core energy very accurately.
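As an arithmetic cross-check of Eq. (47) against Table 1, the sketch below recovers E_core from the tabulated energies. The Finnis–Sinclair Mo lattice constant (and hence |b| = |h_3^0| = r_0) is our assumption, not stated in the text.

import numpy as np

# Rearranging Eq. (47):
#   E_core = (E_atom - E_elastic) / (2 |h3_0|) + (K_s|b|^2/4pi) log r0
a0 = 3.147                     # Angstrom; assumed FS-Mo lattice constant
b = np.sqrt(3.0) * a0 / 2.0    # |b| = |h3_0| = r0, in Angstrom
prefactor = 0.499              # K_s|b|^2/4pi in eV/A (given in the text)

E_atom    = np.array([6.0410, 7.0069, 8.8935, 11.0432])   # Table 1, eV
E_elastic = np.array([7.1361, 8.0955, 9.9838, 12.1318])   # Table 1, eV

E_core = (E_atom - E_elastic) / (2.0 * b) + prefactor * np.log(b)
print(E_core)   # ~[0.299 0.301 0.300 0.301] eV/A; the small offsets
                # from Table 1 reflect the assumed a0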
3. Crack-tip Dislocation Emission
A stressed crack tip has two basic options for relieving its stored strain energy: surface creation by breaking bonds, or plastic deformation (localized shearing). Whichever route has the lower activation energy in the long run should be the dominant mechanism. Activation energy calculations are therefore essential for understanding brittle-to-ductile transitions (BDT). Dislocation nucleation [59, 60] and migration [61] are both possible rate-limiting steps in the BDT. The former has become one of the standard problems in nanomechanics [62, 63], because proper treatment of the crack and dislocation cores is necessary. Previous atomistic calculations focused on K_emit, the athermal dislocation emission threshold, and on the so-called 2D activation pathway, in which the dislocation is constrained to remain straight. Zhu et al. [2] have applied the nudged elastic band (NEB) method [36] to calculate the 3D bow-out nucleation pathway atomistically. Figures 3(a)–3(c) show the calculation setup for the Cu (111) crack using the empirical potential of Mishin [64]. The 3D minimum energy path (MEP) obtained at K_I = 0.44 MPa√m is compared with the 2D MEP in Fig. 3(d); it is seen to be the lower pathway for the same initial and final states. The external load is applied via a fixed-displacement boundary condition for all the NEB nodes (i–ix) during path relaxation. We find that for this model of Cu, with unstable stacking energy γ_us = 158 mJ/m², the Rice–Peierls model [62] underestimates both K_I,emit and the activation energy Q(K_I) of partial dislocation emission. K_I,emit turns out to be 0.508 MPa√m, which is 45% greater than the 0.35 MPa√m from the analytic formula of Rice and Beltz [62]. Furthermore, at (K_I/K_I,emit)² = G_I/G_I,emit = 0.75, we find Q(K_I) to be 1.1 eV, which is significantly larger than the first continuum estimate of 0.18 eV based on a perturbative approach [62], and than a second, improved estimate of 0.41 eV using a more flexible representation of the embryonic dislocation loop [63]. Preliminary analyses indicate two factors that may be causing the discrepancy and that, if corrected, may lead to much better semi-continuum models. The first is the neglect of surface deformation energetics near the crack tip [59, 60]. The second is that we believe the continuum models may induce a systematic error in the dislocation core energy E_core (see the previous section), which drives down the energy cost of nucleating a half loop. We suggest that whenever one uses semi-continuum models to calculate activation energies, the core energies of straight dislocations should first be calibrated against atomistic results. The semi-continuum model may then be systematically improved to give better core energies, or, if not, the error can very often be conveniently absorbed in
Figure 3. (a) Geometry of the mode-I crack [2], containing 24 unit cells (61 Å) in x_2 (periodic boundary condition) and 103,920 Cu atoms in an R = 80 Å cylinder. Atoms within 5 Å of the cylinder border are fixed according to the anisotropic linear elastic solution [65]. (b) Continuum Stroh solution and (c) actual atomistic local stress distribution [20] of σ_yy at G_I/G_I,emit = 0.75. (d) 3D activation minimum energy path (solid line) of partial dislocation emission by bow-out, and its competing 2D pathway (dashed line). i–ix mark the nine sequential NEB nodes, or images, on the minimum energy path, with iv being the saddle point; atoms whose coordination number [66] differs from 12 are not shown. Note that a stacking fault is actually dragged behind the dislocation.
heuristic gradient functionals like κ|∇u_∥(x)|². Otherwise, semi-continuum models for activation energies will carry a systematic "core energy error" compared with atomistic results. This recommendation is quite general, since heterogeneous nucleation of dislocation half loops by 3D bow-out is ubiquitous: in cross-slip, in slip transmission across grain and phase boundaries, in initiation at surface asperities, etc. That it has not been carried out before has more to do with the fact that the proper definition of the dislocation core energy, and numerically precise ways to calculate it atomistically, were only worked out recently [24, 25]. Figure 4 shows the saddle-point configuration obtained at G_I/G_I,emit = 0.75. It shows the birth of a shear-dominant singularity (an embryonic dislocation loop) near a tensile-dominant singularity, the crack. To make connection with continuum models, we calculate the relative displacement between atoms on the two sides of the slip plane. This completely discrete data set is then interpolated to form a continuum field estimate u(x), which is further decomposed into a shear shock component u_∥(x) parallel to the slip plane (localized inelastic, or transformation, strain), and a tensile opening component u_⊥(x) normal to
Figure 4. Analysis of the shock displacement field u(x) on the inclined slip plane at the saddle point iv, obtained by 2D spline interpolation of the discrete atomic displacements. (a) Atomic view. (b) Shear component u_∥(x) normalized by b_p = a_0[112]/6, and (c) |∇u_∥(x)|². (d) Tensile opening component u_⊥(x) normalized by the interplanar spacing h_0 = a_0/√3.
the slip plane (large, but still elastic). The dislocation core is best visualized by looking at |∇u_∥(x)|² (Fig. 4(c)), which shows that the core is simply the domain wall between the inelastically sheared and unsheared regions [5]. Yet in the heart of this shear-dominant secondary singularity there is also a small tensile component: Fig. 4(d) shows that u_⊥(x) is maximized near where |∇u_∥(x)|² is maximized. Such are the intricacies of shear–tension coupling, with one kind of singularity giving birth to the opposite kind. We know, for instance, that when many dislocations pile up against a hard interface, a microcrack may be nucleated heterogeneously.
References

[1] R. Phillips, Crystals, Defects and Microstructures: Modeling Across Scales, Cambridge University Press, Cambridge, 2001.
[2] T. Zhu, J. Li, K.J. Van Vliet, S. Ogata, S. Yip, and S. Suresh, "Predictive modeling of nanoindentation-induced homogeneous dislocation nucleation in copper," J. Mech. Phys. Solids, 52, 691–724, 2004.
[3] M. Allen and D. Tildesley, Computer Simulation of Liquids, Clarendon Press, New York, 1987.
[4] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic, San Diego, 2002.
[5] J. Li, A.H.W. Ngan, and P. Gumbsch, "Atomistic modeling of mechanical behavior," Acta Mater., 51, 5711–5742, 2003.
[6] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, "Simulating materials failure by using up to one billion atoms and the world's fastest computer: work-hardening," Proc. Natl Acad. Sci. USA, 99, 5783–5787, 2002.
[7] J. Schiotz, F.D. Di Tolla, and K.W. Jacobsen, "Softening of nanocrystalline metals at very small grain sizes," Nature, 391, 561–563, 1998.
[8] V. Yamakov, D. Wolf, S.R. Phillpot, and H. Gleiter, "Dislocation–dislocation and dislocation–twin reactions in nanocrystalline Al by molecular dynamics simulation," Acta Mater., 51, 4135–4147, 2003.
[9] J. Schiotz and K.W. Jacobsen, "A maximum in the strength of nanocrystalline copper," Science, 301, 1357–1359, 2003.
[10] V. Yamakov, D. Wolf, S.R. Phillpot, A.K. Mukherjee, and H. Gleiter, "Deformation-mechanism map for nanocrystalline metals by molecular-dynamics simulation," Nat. Mater., 3, 43–47, 2004.
[11] H. Van Swygenhoven, P.M. Derlet, and A.G. Froseth, "Stacking fault energies and slip in nanocrystalline metals," Nat. Mater., 3, 399–403, 2004.
[12] A.J. Haslam, V. Yamakov, D. Moldovan, D. Wolf, S.R. Phillpot, and H. Gleiter, "Effects of grain growth on grain-boundary diffusion creep by molecular-dynamics simulation," Acta Mater., 52, 1971–1987, 2004.
[13] A. Hasnaoui, H. Van Swygenhoven, and P.M. Derlet, "Dimples on nanocrystalline fracture surfaces as evidence for shear plane formation," Science, 300, 1550–1552, 2003.
[14] A. Latapie and D. Farkas, "Molecular dynamics investigation of the fracture behavior of nanocrystalline alpha-Fe," Phys. Rev. B, 69, 134110, 2004.
[15] M.H. Muser, "Towards an atomistic understanding of solid friction by computer simulations," Comput. Phys. Commun., 146, 54–62, 2002.
[16] M. Urbakh, J. Klafter, D. Gourdon, and J. Israelachvili, "The nonlinear nature of friction," Nature, 430, 525–528, 2004.
[17] C.L. Kelchner, S.J. Plimpton, and J.C. Hamilton, "Dislocation nucleation and defect structure during surface indentation," Phys. Rev. B, 58, 11085–11088, 1998.
[18] J.A. Zimmerman, C.L. Kelchner, P.A. Klein, J.C. Hamilton, and S.M. Foiles, "Surface step effects on nanoindentation," Phys. Rev. Lett., 87, 165507, 2001.
[19] G.S. Smith, E.B. Tadmor, N. Bernstein, and E. Kaxiras, "Multiscale simulations of silicon nanoindentation," Acta Mater., 49, 4089–4101, 2001.
[20] K.J. Van Vliet, J. Li, T. Zhu, S. Yip, and S. Suresh, "Quantifying the early stages of plasticity through nanoscale experiments and simulations," Phys. Rev. B, 67, 2003.
[21] V. Vitek, "Core structure of screw dislocations in body-centred cubic metals: relation to symmetry and interatomic bonding," Philos. Mag., 84, 415–428, 2004.
[22] H. Koizumi, Y. Kamimura, and T. Suzuki, "Core structure of a screw dislocation in a diamond-like structure," Philos. Mag. A, 80, 609–620, 2000.
[23] C. Woodward and S.I. Rao, "Ab initio simulation of (a/2)⟨110] screw dislocations in gamma-TiAl," Philos. Mag., 84, 401–413, 2004.
[24] W. Cai, V.V. Bulatov, J.P. Chang, J. Li, and S. Yip, "Periodic image effects in dislocation modelling," Philos. Mag., 83, 539–567, 2003.
[25] J. Li, C.-Z. Wang, J.-P. Chang, W. Cai, V.V. Bulatov, K.-M. Ho, and S. Yip, "Core energy and Peierls stress of screw dislocation in bcc molybdenum: a periodic cell tight-binding study," Phys. Rev. B, (in print). See http://164.107.79.177/Archive/Papers/04/Li04c.pdf, 2004.
[26] H.C. Huang, G.H. Gilmer, and T.D. de la Rubia, "An atomistic simulator for thin film deposition in three dimensions," J. Appl. Phys., 84, 3636–3649, 1998.
[27] L. Dong, J. Schnitker, R.W. Smith, and D.J. Srolovitz, "Stress relaxation and misfit dislocation nucleation in the growth of misfitting films: molecular dynamics simulation study," J. Appl. Phys., 83, 217–227, 1998.
[28] D. Holland and M. Marder, "Ideal brittle fracture of silicon studied with molecular dynamics," Phys. Rev. Lett., 80, 746–749, 1998.
[29] M.J. Buehler, F.F. Abraham, and H.J. Gao, "Hyperelasticity governs dynamic fracture at a critical length scale," Nature, 426, 141–146, 2003.
[30] R. Perez and P. Gumbsch, "Directional anisotropy in the cleavage fracture of silicon," Phys. Rev. Lett., 84, 5347–5350, 2000.
[31] N. Bernstein and D.W. Hess, "Lattice trapping barriers to brittle fracture," Phys. Rev. Lett., 91, 025501, 2003.
[32] S.J. Zhou, D.M. Beazley, P.S. Lomdahl, and B.L. Holian, "Large-scale molecular dynamics simulations of three-dimensional ductile failure," Phys. Rev. Lett., 78, 479–482, 1997.
[33] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, "Structure of grain boundaries in nanocrystalline palladium by molecular dynamics simulation," Scr. Mater., 41, 631–636, 1999.
[34] M. Mrovec, T. Ochs, C. Elsasser, V. Vitek, D. Nguyen-Manh, and D.G. Pettifor, "Never ending saga of a simple boundary," Z. Metallk., 94, 244–249, 2003.
[35] M.L. Falk and J.S. Langer, "Dynamics of viscoplastic deformation in amorphous solids," Phys. Rev. E, 57, 7192–7205, 1998.
[36] G. Henkelman and H. Jonsson, "Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points," J. Chem. Phys., 113, 9978–9985, 2000.
[37] T. Vegge and K.W. Jacobsen, "Atomistic simulations of dislocation processes in copper," J. Phys.: Condens. Matter, 14, 2929–2956, 2002.
[38] V.V. Bulatov, S. Yip, and A.S. Argon, "Atomic modes of dislocation mobility in silicon," Philos. Mag. A, 72, 453–496, 1995.
[39] M. Wen and A.H.W. Ngan, "Atomistic simulation of kink-pairs of screw dislocations in body-centred cubic iron," Acta Mater., 48, 4255–4265, 2000.
[40] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, "Energetics of formation and migration of self-interstitials and self-interstitial clusters in alpha-iron," J. Nucl. Mater., 244, 185–194, 1997.
[41] T.D. de la Rubia, H.M. Zbib, T.A. Khraishi, B.D. Wirth, M. Victoria, and M.J. Caturla, "Multiscale modelling of plastic flow localization in irradiated materials," Nature, 406, 871–874, 2000.
[42] R. Devanathan, W.J. Weber, and F. Gao, "Atomic scale simulation of defect production in irradiated 3C-SiC," J. Appl. Phys., 90, 2303–2309, 2001.
[43] E.B. Tadmor, M. Ortiz, and R. Phillips, "Quasicontinuum analysis of defects in solids," Philos. Mag. A, 73, 1529–1563, 1996.
[44] V. Bulatov, F.F. Abraham, L. Kubin, B. Devincre, and S. Yip, "Connecting atomistic and mesoscale simulations of crystal plasticity," Nature, 391, 669–672, 1998.
[45] V.B. Shenoy, R. Miller, E.B. Tadmor, D. Rodney, R. Phillips, and M. Ortiz, "An adaptive finite element approach to atomic-scale mechanics: the quasicontinuum method," J. Mech. Phys. Solids, 47, 611–642, 1999.
[46] R. Madec, B. Devincre, L. Kubin, T. Hoc, and D. Rodney, "The role of collinear interaction in dislocation-induced hardening," Science, 301, 1879–1882, 2003.
[47] J.H. Wang, J. Li, S. Yip, S. Phillpot, and D. Wolf, "Mechanical instabilities of homogeneous crystals," Phys. Rev. B, 52, 12627–12635, 1995.
[48] I.S. Sokolnikoff, Tensor Analysis, Theory and Applications to Geometry and Mechanics of Continua, 2nd edn., Wiley, New York, 1964.
[49] S.C. Hunter, Mechanics of Continuous Media, 2nd edn., E. Horwood, Chichester, 1983.
[50] J.F. Lutsko, "Stress and elastic constants in anisotropic solids: molecular dynamics techniques," J. Appl. Phys., 64, 1152–1154, 1988.
[51] J.F. Lutsko, "Generalized expressions for the calculation of elastic constants by computer simulation," J. Appl. Phys., 65, 2991–2997, 1989.
[52] J.R. Ray, "Elastic constants and statistical ensembles in molecular dynamics," Comput. Phys. Rep., 8, 111–151, 1988.
[53] T. Cagin and J.R. Ray, "Elastic constants of sodium from molecular dynamics," Phys. Rev. B, 37, 699–705, 1988.
[54] W. Cai, V.V. Bulatov, J.P. Chang, J. Li, and S. Yip, "Anisotropic elastic interactions of a periodic dislocation array," Phys. Rev. Lett., 86, 5727–5730, 2001.
[55] A. Stroh, "Steady state problems in anisotropic elasticity," J. Math. Phys., 41, 77–103, 1962.
[56] J. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1982.
[57] M.W. Finnis and J.E. Sinclair, "A simple empirical N-body potential for transition metals," Philos. Mag. A, 50, 45–55, 1984.
[58] V. Vitek, "Theory of core structures of dislocations in body-centered cubic metals," Cryst. Lattice Defects, 5, 1–34, 1974.
[59] J. Knap and K. Sieradzki, "Crack tip dislocation nucleation in FCC solids," Phys. Rev. Lett., 82, 1700–1703, 1999.
[60] J. Schiotz and A.E. Carlsson, "The influence of surface stress on dislocation emission from sharp and blunt cracks in fcc metals," Philos. Mag. A, 80, 69–82, 2000.
[61] P. Gumbsch, J. Riedle, A. Hartmaier, and H.F. Fischmeister, "Controlling factors for the brittle-to-ductile transition in tungsten single crystals," Science, 282, 1293–1295, 1998.
[62] J.R. Rice and G.E. Beltz, "The activation energy for dislocation nucleation at a crack," J. Mech. Phys. Solids, 42, 333–360, 1994.
[63] G. Xu, A.S. Argon, and M. Ortiz, "Critical configurations for dislocation nucleation from crack tips," Philos. Mag. A, 75, 341–367, 1997.
[64] Y. Mishin, M.J. Mehl, D.A. Papaconstantopoulos, A.F. Voter, and J.D. Kress, "Structural stability and lattice defects in copper: ab initio, tight-binding, and embedded-atom calculations," Phys. Rev. B, 63, 224106, 2001.
[65] A. Stroh, "Dislocations and cracks in anisotropic elasticity," Phil. Mag., 3, 625, 1958.
[66] J. Li, "AtomEye: an efficient atomistic configuration viewer," Model. Simul. Mater. Sci. Eng., 11, 173–177, 2003.
2.20 THE PEIERLS–NABARRO MODEL OF DISLOCATIONS: A VENERABLE THEORY AND ITS CURRENT DEVELOPMENT

Gang Lu
Division of Engineering and Applied Science, Harvard University, Cambridge, Massachusetts, USA
Dislocations are central to the understanding of mechanical properties of crystalline solids. While continuum elasticity theory describes well the long-range elastic strain of a dislocation for length scales beyond a few lattice spacings, it breaks down near the singularity in the region surrounding the dislocation center, known as the dislocation core. There has been a great deal of interest in describing accurately the dislocation core structure on an atomic scale because of its important role in many phenomena of crystal plasticity [1–3]. The core properties control, for instance, the mobility of dislocations, which accounts for the intrinsic ductility or brittleness of solids. The core is also responsible for the interaction of dislocations at close distances, which is relevant to plastic deformation. Two types of theoretical approaches have been employed to study dislocation core properties. The first is based on direct atomistic simulations using either empirical interatomic potentials or ab initio calculations. Empirical potentials involve the fitting of parameters to a predetermined database and hence may not be reliable in predicting the core properties, where severe distortions like bond breaking, bond formation and switching necessitate a quantum mechanical description of the electronic degrees of freedom. On the other hand, ab initio total energy calculations, though considerably more accurate, are computationally expensive for studies of dislocation properties. The second approach is based on the framework of the Peierls–Nabarro (P–N) model, which holds the promise of becoming a plausible alternative to direct atomistic simulations. For this reason, there has been a resurgence of interest in the simple and tractable P–N model for studying the dislocation core structure and mobility. In particular, the P–N model permits easy estimation of the key dislocation
characteristics of nucleation and mobility, directly from quantities (GSF energy, see later) accessible through standard quantum mechanical or empirical atomistic computations.
1. Original P–N Model
Peierls [4] first proposed the remarkable hybrid model in which some of the details of the discrete dislocation core were incorporated into an essentially continuum framework. Nabarro [5] and Eshelby [6] further developed Peierls’ model and gave the first meaningful estimate of the lattice friction to dislocation motion. Later attempts to generalize the original treatment of Peierls and Nabarro assumed a more general core configuration from which they derived the interactions across the glide plane which satisfy the Peierls integral equation. The basic idea of the P–N model can be illustrated in Fig. 1. The dislocated solid is separated into two elastic half-spaces joined by atomic-level forces across their common interface, known as the glide plane. The dislocation is characterized by a slip distribution δ(x) = u(x, 0+ ) − u(x, 0− ), where u(x) is the displacement vector at position x in the glide plane. The goal of the P–N model is to determine the slip distribution δ(x) (or the displacement field u(x)) across the glide plane that minimizes the total energy of the solid. The total energy includes two distinct contributions that compete with each
Figure 1. A schematic illustration showing an edge dislocation in a lattice. The partition of the dislocated lattice into linear elastic region and nonlinear atomistic region allows a multiscale treatment of the problem.
other in determining the equilibrium slip distribution. One of the contributions accounts for the atomic interaction across the glide plane, which reflects the fact that there is an energy penalty for the misfit across the glide plane. Such misfit energy can be written as

    U_{misfit} = \int_{-\infty}^{+\infty} \gamma(\delta(x)) \, dx,   (1)
where γ(δ(x)) is the generalized stacking fault (GSF) energy defined as follows [7]: consider a perfect crystal cut across a single plane into two parts which are then subjected to a relative displacement through an arbitrary vector δ and rejoined. The reconnected lattice has a surplus energy per unit area γ(δ). As the vector δ is varied to span a unit cell of the interface, γ(δ) generates the generalized stacking fault energy surface. The procedure can be repeated for various crystal planes. The significance of the GSF energy surface (or γ-surface) is that for a fault vector δ there is an interfacial restoring stress

    F_b(\delta) = -\nabla \gamma(\delta),   (2)
which has the same formal interpretation as the restoring stress in the P–N model. Note that the GSF energy surface retains the translational and rotational symmetry of the underlying lattice. For example, there is no attendant energy cost if the atoms across the glide plane experience a relative displacement equal to the Burgers vector. The second energy contribution to the total energy is the elastic energy stored in the two elastic half-spaces. This energy corresponds to the elastic energy of the dislocation, and it depends on the slip distribution δ(x) as well. Without loss of generality, we can assume a one-dimensional slip δ(x) first, and deal with a three-dimensional slip later. As pointed out by Eshelby [6], a straight dislocation can be represented as a continuous distribution of infinitesimal dislocations whose Burgers vectors are defined as the local gradient of the slip distribution. For example, the infinitesimal dislocation lying between x' and x' + dx' has a Burgers vector
    db(x') = \left. \frac{d\delta(x)}{dx} \right|_{x=x'} dx' \equiv \rho(x') \, dx',   (3)
where the local slip gradient ρ(x) is also called the dislocation density. Integrating the dislocation density over all x we find

    \int_{-\infty}^{+\infty} \rho(x) \, dx = \int_{-\infty}^{+\infty} \frac{d\delta(x)}{dx} \, dx = \delta(+\infty) - \delta(-\infty) = b,   (4)
which is what we would expect from the definition of the dislocation density (see Fig. 1). These infinitesimal dislocations interact elastically, and the total
elastic energy can be obtained through the superposition principle by adding up the contribution from each infinitesimal dislocation separately. More specifically, an infinitesimal edge dislocation located at x' produces a shear stress at some other point x which is given by

    \sigma_{xy}(x, 0) = K_e \frac{db(x')}{x - x'},   (5)
K_e is the prelogarithmic elastic factor for an edge dislocation. The displacement u(x) necessary to create the infinitesimal dislocation at x takes place in the presence of the shear stress from the dislocation at x', giving the following contribution to the elastic energy from the latter dislocation:

    dU_{elastic} = K_e \frac{db(x')}{x - x'} \, u(x).   (6)
Integrating this expression over all values of x from −L to L, and over db(x') to add the contribution from all infinitesimal dislocations, we obtain the total elastic energy of the original dislocation

    U_{elastic} = \frac{1}{2} K_e \int_{-L}^{L} \int_{0}^{b} \frac{db(x')}{x - x'} \, u(x) \, dx = \frac{1}{2} K_e \int_{-L}^{L} \int_{-L}^{L} u(x) \, \frac{\rho(x')}{x - x'} \, dx' \, dx,   (7)
where L is an inconsequential constant introduced as a large cutoff distance. Performing an integration by parts over x, we arrive at the following expression for the elastic energy:

    U_{elastic} = \frac{1}{2} K_e b^2 \ln L - \frac{K_e}{2} \int_{-L}^{L} \int_{-L}^{L} \rho(x) \rho(x') \ln|x - x'| \, dx \, dx'.   (8)
Similar results can also be found for a screw dislocation, with K_e replaced by the corresponding elastic constant for the screw dislocation. For a general mixed dislocation with an angle θ between the dislocation line and its Burgers vector, the elastic energy is given by

    U_{elastic} = \frac{1}{2} K b^2 \ln L - \frac{K}{2} \int_{-L}^{L} \int_{-L}^{L} \rho(x) \rho(x') \ln|x - x'| \, dx \, dx',   (9)

where

    K = \frac{\mu}{2\pi} \left( \frac{\sin^2\theta}{1 - \nu} + \cos^2\theta \right)   (10)
for an isotropic solid. µ and ν are the shear modulus and Poisson’s ratio, respectively. This result clearly separates the contribution of the long-range
elastic field of the dislocation, embodied in the first term of Eq. (9), from the contribution of the large distortions at the dislocation core, embodied in the second term of the equation. We will drop the first term in our later discussion and concentrate on the second term, which represents the energy contribution from the dislocation core. We now arrive at the expression for the total energy of the dislocation as a functional of the dislocation density ρ(x):

    U_{tot}[\rho(x)] = \int_{-\infty}^{+\infty} \gamma(\delta(x)) \, dx - \frac{K}{2} \int_{-L}^{L} \int_{-L}^{L} \rho(x) \rho(x') \ln|x - x'| \, dx \, dx'.   (11)
By minimizing the above energy functional, we can find the equilibrium structure of the dislocation. A variational derivative of Eq. (11) with respect to the dislocation density ρ(x) leads to the P–N integro-differential equation:

    K \int_{-\infty}^{+\infty} \frac{1}{x - x'} \frac{d\delta(x')}{dx'} \, dx' = F_b(\delta(x)).   (12)
If a simple sinusoidal form is assumed for F_b(δ(x)), as in the original P–N treatment, the misfit is then given by the well-known analytical solution,

    \delta(x) = \frac{b}{\pi} \tan^{-1} \frac{x}{\zeta} + \frac{b}{2},   (13)

where

    \zeta = \frac{K b}{4\pi F_{max}} = \frac{d}{2(1 - \nu)}   (14)
is the half-width of the dislocation core and F_max = μb/(2πd) is the maximum restoring stress, with d the interlayer distance between the glide planes. One of the key features that emerges from this solution is that the P–N model removes the artificial divergence at the core that is associated with the idealized continuum dislocation of Volterra. By introducing the nonlinear and nonconvex interplanar potential into the model, the solution of the P–N model in terms of stress and strain is seen to be well behaved. One of the achievements of the P–N model is that it provides a reasonable estimate of the dislocation size, characterized by ζ, as a result of the competition between the two energy contributions. The more important achievement of the P–N model is that it offers insight into the value of the critical stress needed to move a dislocation in an otherwise perfect lattice. This stress has thus been termed the Peierls stress. In order to derive the Peierls stress, however, the P–N energy functional needs to be modified. The expression in Eq. (11) that we have discussed so far is invariant with respect to an arbitrary translation of the dislocation density ρ(x) → ρ(x + t). In other words, the dislocation
described by the P–N solution does not experience any resistance as it moves through the lattice. This is clearly unrealistic, and is a consequence of neglecting the discrete nature of the lattice: the P–N model views the solid as a continuous medium. The only effect of the lattice periodicity, so far, comes from the periodicity of the misfit potential with a period of the Burgers vector. In order to rectify this problem and to recover the lattice resistance to dislocation motion, the P–N energy functional was modified so that the misfit potential is not sampled continuously as in Eq. (11), but only at the positions of the actual atomic planes. This amounts to the following modification of the first term in the total energy of the dislocation in Eq. (11):

    \int_{-\infty}^{+\infty} \gamma(\delta(x)) \, dx \;\rightarrow\; \sum_{n=-\infty}^{+\infty} \gamma(\delta(x_n)) \, \Delta x,   (15)
where x_n is the position of the nth atomic plane and Δx is the spacing between these atomic planes. Assuming a sinusoidal restoring stress F[δ(x)] = μb/(2πd) sin[2πδ(x)/b], the misfit potential (Frenkel potential) γ[δ(x)] is

    \gamma[\delta(x)] = \frac{F_{max} b}{2\pi} \left[ 1 - \cos \frac{2\pi \delta(x)}{b} \right].   (16)
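As a quick numerical illustration of Eqs. (13), (14) and (16), the sketch below evaluates the misfit profile and the Frenkel potential along the glide plane. The material parameters (μ, ν, b, d, roughly Al-like) are illustrative assumptions, not values taken from the text.

```python
import numpy as np

# Assumed illustrative parameters (roughly Al-like, not from the text).
mu = 26.0e9       # shear modulus (Pa)
nu = 0.35         # Poisson's ratio
b = 2.86e-10      # Burgers vector magnitude (m)
d = 2.33e-10      # interplanar spacing of the glide planes (m)

zeta = d / (2.0 * (1.0 - nu))            # Eq. (14), edge dislocation
Fmax = mu * b / (2.0 * np.pi * d)        # maximum restoring stress

x = np.linspace(-10.0, 10.0, 9) * b
delta = (b / np.pi) * np.arctan(x / zeta) + b / 2.0                # Eq. (13)
gamma = (Fmax * b / (2.0 * np.pi)) * (1.0 - np.cos(2.0 * np.pi * delta / b))  # Eq. (16)

print(f"core half-width zeta = {zeta / b:.2f} b")
for xi, di, gi in zip(x, delta, gamma):
    # delta runs from ~0 to ~b; gamma peaks at the core, where delta = b/2.
    print(f"x = {xi / b:+5.1f} b   delta = {di / b:.3f} b   gamma = {gi:.3f} J/m^2")
```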
If the center of the dislocation is displaced by αb with α < 1, the total misfit energy becomes

    U_{misfit} = \frac{\mu b^3}{8\pi^2 d} \sum_{n=-\infty}^{+\infty} \left\{ 1 + \cos\left[ 2 \tan^{-1} 2(1 - \nu) \left( \alpha + \frac{n}{2} \right) \frac{b}{d} \right] \right\},   (17)
where we have used Eq. (13) for δ(x) and Eq. (16) for γ(δ) to evaluate the misfit energy in Eq. (15). After appropriate manipulations, Eq. (17) may be rewritten as

    U_{misfit}(\alpha) = \frac{\mu b^3}{4\pi^2 d} \, \frac{4\zeta^2}{b^2} \sum_{n=-\infty}^{+\infty} \frac{1}{(2\zeta/b)^2 + (2\alpha + n)^2},   (18)
which can then be handled by the Poisson summation formula

    \sum_{n=-\infty}^{+\infty} f(n) = \sum_{k=-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x) \, e^{2\pi i k x} \, dx.   (19)
After performing the relevant integrations, we arrive at the final expression for the misfit energy,

    U_{misfit}(\alpha) = \frac{\mu b^2}{4\pi(1 - \nu)} + \frac{\mu b^2}{2\pi(1 - \nu)} \exp\left( -\frac{4\pi\zeta}{b} \right) \cos 4\pi\alpha.   (20)
From the above expression, we see that the straight dislocation experiences periodic potential wells, and in passing from one potential well
to the next it must cross an energy barrier known as the Peierls barrier W_p. The stress required to surmount this energy barrier is the Peierls stress, σ_p, given by [8]

    \sigma_p = \max\left[ \frac{1}{b^2} \frac{\partial U_{misfit}(\alpha)}{\partial \alpha} \right] \equiv \frac{2\pi W_p}{b^2} = \frac{2\mu}{1 - \nu} \exp\left( -\frac{4\pi\zeta}{b} \right) = \frac{2\mu}{1 - \nu} \exp\left( -\frac{2\pi d}{b(1 - \nu)} \right).   (21)
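The derivation above is easy to check numerically: a direct evaluation of the lattice sum in Eq. (18) should match the Poisson-summed closed form of Eq. (20) up to exponentially small higher harmonics, and Eq. (21) then gives the Peierls stress. A minimal sketch, with assumed edge-dislocation parameters (not from the text):

```python
import numpy as np

# Assumed illustrative parameters (not from the text).
mu, nu = 26.0e9, 0.35
b, d = 2.86e-10, 2.33e-10
zeta = d / (2.0 * (1.0 - nu))                     # Eq. (14)

def U_sum(alpha, n_max=100000):
    """Direct evaluation of the lattice sum in Eq. (18)."""
    n = np.arange(-n_max, n_max + 1)
    s = np.sum(1.0 / ((2.0 * zeta / b) ** 2 + (2.0 * alpha + n) ** 2))
    return (mu * b ** 3 / (4.0 * np.pi ** 2 * d)) * (4.0 * zeta ** 2 / b ** 2) * s

def U_closed(alpha):
    """Closed form of Eq. (20), obtained via Poisson summation."""
    return (mu * b ** 2 / (4.0 * np.pi * (1.0 - nu)) +
            mu * b ** 2 / (2.0 * np.pi * (1.0 - nu)) *
            np.exp(-4.0 * np.pi * zeta / b) * np.cos(4.0 * np.pi * alpha))

for alpha in (0.0, 0.1, 0.25):
    print(f"alpha = {alpha}:  sum = {U_sum(alpha):.6e}  closed = {U_closed(alpha):.6e}")

sigma_p = 2.0 * mu / (1.0 - nu) * np.exp(-4.0 * np.pi * zeta / b)   # Eq. (21)
print(f"Peierls stress = {sigma_p / 1e6:.1f} MPa")
```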
A few observations are now in order. First, we note that the Peierls stress is extremely sensitive to the ratio ζ/b (or d/b) for fixed values of the elastic constants μ and ν. Therefore an edge dislocation is more mobile than a screw dislocation with the same Burgers vector in the same material, since the edge dislocation is wider (larger ζ) than the corresponding screw dislocation. In general, the larger the edge component of a mixed dislocation, the wider the dislocation core, and hence the greater the mobility. In a given crystal, the slip system of dislocations corresponds to the largest value of d/b; namely, the slip plane tends to have the largest interplanar spacing, and the slip direction or Burgers vector is along the nearest-neighbor direction (smallest b). In close-packed metallic systems, the values of ζ/b and d/b are large, and these materials are usually ductile. In contrast, crystals with more complex unit cells (such as ceramics) have a relatively small d/b ratio, giving a larger Peierls stress. In these materials, the shear stress for dislocation motion cannot be overcome before fracturing the solid, so they are usually brittle. A second observation concerns the magnitude of the Peierls stress. What we find from the P–N model is that the stress to move a dislocation is reduced by an exponential factor in comparison with the shear modulus. This explains the fact that in many materials plastic deformation operates at a shear stress that is orders of magnitude below the shear modulus: it is all due to dislocation motion! A final observation is that dislocation motion often takes place by kink mechanisms, in which a bulge on the dislocation line brings a segment of the dislocation into an adjacent Peierls valley, and the resulting kinks propagate, with the end result being a net forward motion of the dislocation. In this case, what the kinks have to overcome is the secondary Peierls barrier along the dislocation line direction. Despite the great heuristic value and useful insight that the original P–N model offers, it lacks quantitative power for prediction. In particular, the model becomes increasingly inaccurate for dislocations with narrow cores, as is typically the case in covalently bonded solids. Since the P–N model represents a combination of the continuum model and the GSF interplanar potential, its accuracy can be affected by either component. One of the main deficiencies of the original P–N model is the assumption of a sinusoidal force law. In real materials, however, the interplanar potential (GSF energy) is not
at all sinusoidal. At present, the GSF energies can be calculated very accurately within an ab initio quantum mechanical framework, which shifts attention to possible inaccuracies in the continuum component. In the following, we will address the limitations of the original P–N model and present some solutions that have been put forward to improve the quantitative description of dislocation core properties and mobility.

(1) The original P–N model is based on isotropic or pseudo-isotropic elasticity theory. Recently, full anisotropic treatments have been implemented in the P–N model, which do not require much more computational effort [9].

(2) The original P–N model is one-dimensional, assuming the slip is along one direction. This "constrained path approximation" fails for dislocations in many Bravais lattices. For example, in an fcc lattice, dislocations can dissociate into partials whose Burgers vectors are not parallel, and a treatment in two dimensions is mandatory. Currently, many P–N theories have been proposed to solve this problem. For example, we will introduce a powerful P–N model, namely the Semidiscrete Variational P–N (SVPN) model, which is capable of dealing with a three-dimensional displacement field and is particularly useful for studying narrow dislocations.

(3) The original P–N model yields a variation of the misfit energy and the Peierls stress which has a periodicity of b/2, in contrast with the feature of the dislocation barrier, which must, in general, exhibit the periodicity of the Burgers vector b. There have been controversies and confusions with regard to this problem (e.g., see [10]), and it was attributed to an erroneous representation of the atomic positions across the glide plane in the original P–N model after the dislocation is translated by a distance [11]. By correcting this error, one can recover the correct periodicity of b for the misfit energy and the Peierls stress. Another relevant idea has also been exploited in a numerical formulation of the P–N model: the misfit energy is not summed over the positions of the atomic nuclei, but rather is averaged over the Thomas–Fermi radius around the nuclei. This modification is particularly useful for metallic systems, where electrons are more delocalized. It has been shown that the Peierls barrier and Peierls stress can be lowered considerably by this modification [12].

(4) In the original P–N model, the Peierls stress is derived by considering the misfit energy exclusively (see Eq. (20)). The elastic energy is assumed to be constant; in other words, the dislocation shape is assumed to be rigid during the dislocation translation process. This assumption turns out to be unrealistic. In fact, it is critical to include the variation of the elastic energy in Eq. (20) when evaluating the Peierls stress. In doing so, however, one faces an inconsistency in the formulation: while the elastic energy is computed by a continuous integration, the misfit energy has to be sampled discretely in order to incorporate the discrete nature of the lattice. Thus, the two energy contributions are not treated on an equal footing and the total energy is not variational. The
SVPN model was developed precisely to resolve this inconsistency by discretizing the elastic energy [13]. As we shall see later, the variational formulation of the P–N theory permits us to compute the Peierls stress more accurately by allowing the dislocation shape to change during the translation. The relaxation of the dislocation core is particularly important for narrow dislocations. In fact, it has been shown that the Peierls stress can be reduced by three orders of magnitude for the screw dislocation in Si by allowing the relaxation of the dislocation shape. The reduced Peierls stress is in much better agreement with the direct atomistic simulation result. We should emphasize that the P–N model calculation takes only a small fraction of the computational time (a few minutes) that a direct atomistic simulation may take (hours or even days).
2. Semidiscrete Variational P–N Model
In the remainder of this article, we will introduce the SVPN model to exemplify the current development of the P–N theory. Two versions (planar and nonplanar) of the SVPN model have been developed. The planar model aims to treat a dislocation that is entirely confined to a single glide plane, while the nonplanar model deals with a dislocation that is spread onto more than one glide plane. The nonplanar model was developed in order to study stress-assisted dislocation cross-slip and constriction [14]. After the discussion of the models, we will apply the planar model to dislocations in Al with an ab initio determined GSF energy surface. To facilitate the presentation, we adopt the following conventions. As defined in Fig. 2, the xoz plane is the (111) glide plane for Al, the z axis is in the direction of the dislocation line, and the x axis is the glide direction, with the y axis normal to the glide plane. For planar dislocations, the displacements along the y direction are usually small. The Burgers vector b lies on the glide plane, making an angle θ with the z axis. The Burgers vector is along the x axis (θ = 90°) for an edge dislocation and along the z axis (θ = 0°) for a screw dislocation. The Burgers vector of a mixed dislocation has both an edge component, b sin θ, and a screw component, b cos θ. In general, the atomic displacements have components in all three directions rather than only along the direction of the Burgers vector, because the path along the Burgers vector may have to surmount a higher interplanar energy barrier in the GSF surface (see Fig. 4). In the planar SVPN formalism, the dislocation slip is assumed to be confined to a single glide plane, separating two semi-infinite linear elastic continua. The equilibrium structure of a dislocation is obtained by minimizing the dislocation energy functional [15]

    U_{disl} = U_{elastic} + U_{misfit} + U_{stress} + K b^2 \ln(L),   (22)
Figure 2. Cartesian set of coordinates showing the directions relevant for dislocations in Al.
where

    U_{elastic}[\{\rho\}] = \frac{1}{4\pi} \sum_{i,j} \chi_{ij} \left[ K_e \left( \rho_i^{(1)} \rho_j^{(1)} + \rho_i^{(2)} \rho_j^{(2)} \right) + K_s \rho_i^{(3)} \rho_j^{(3)} \right],   (23)

    U_{misfit}[\{\delta\}] = \sum_i \Delta x \, \gamma_3(\delta_i),   (24)

    U_{stress}[\{\rho\}, \tau] = -\sum_{i,l} \frac{x_i^2 - x_{i-1}^2}{2} \, \rho_i^{(l)} \tau_i^{(l)},   (25)
with respect to the dislocation density or the slip vector. Here, ρ_i^{(1)}, ρ_i^{(2)}, and ρ_i^{(3)} are the edge, vertical and screw components of the general dislocation density at the ith nodal point, and γ_3(δ_i) is the three-dimensional GSF energy surface. Δx is the area assigned to each atomic row (the length of all dislocation lines is 1 Å). The corresponding components of the applied stress interacting with ρ_i^{(1)}, ρ_i^{(2)}, and ρ_i^{(3)} are τ^{(1)} = σ_21, τ^{(2)} = σ_22 and
τ^{(3)} = σ_23, respectively. K, K_e, and K_s are the prelogarithmic energy factors defined earlier. The dislocation density at the ith nodal point is defined as ρ_i = (δ_i − δ_{i−1})/(x_i − x_{i−1}). The remaining quantities entering this expression are χ_{ij} = (3/2) φ_{i,i−1} φ_{j,j−1} + ψ_{i−1,j−1} + ψ_{i,j} − ψ_{i,j−1} − ψ_{j,i−1}, with φ_{i,j} = x_i − x_j, and ψ_{i,j} = (1/2) φ_{i,j}^2 ln|φ_{i,j}|.

The first term in the energy functional, U_elastic, represents the configuration-dependent (density or slip) part of the elastic energy, which has been discretized. Since any details of the displacements across the glide plane other than those on the atomic rows are disregarded, it is consistent to assume that the dislocation density is constant between the nodal points. This explicit discretization of the elastic energy term removes the inconsistency in the original P–N model and produces a total energy functional which is variational. Another modification in this approach is that the nonlinear misfit potential in the energy functional, U_misfit, is a function of all three components of the nodal displacements, δ(x_i). Namely, in addition to the displacements along the Burgers vector, lateral and even vertical displacements across the glide plane are also included. This, in turn, allows the treatment of straight dislocations of arbitrary orientation in arbitrary glide planes. Furthermore, because the slip vector δ(x_i) is allowed to change during the process of dislocation translation, the Peierls energy barrier can be significantly lowered compared to its corresponding value for a rigid translation. In order to examine the trend of energetics for different dislocations, we identify the dislocation configuration-dependent part of the total energy as the core energy, U_core = U_elastic + U_misfit, which includes the density-dependent part of the elastic energy and the entire misfit energy, in the absence of external stress. The last term in Eq. (22), K b² ln(L), is independent of the dislocation density and hence is irrelevant in the variational procedure and makes no contribution to the evaluation of the Peierls stress (a typical value for the outer cutoff radius L is 10³ Å; we use this value for all dislocations in the calculations discussed below).

The response of a dislocation to an applied stress is determined by the minimization of the energy functional with respect to ρ_i at a given value of the applied stress, τ_i^{(l)}. An instability is reached when an optimal solution for ρ_i no longer exists, which is manifested numerically by the failure of the minimization procedure to converge. The Peierls stress is defined as the critical value of the applied stress which gives rise to this instability.

Having developed the planar SVPN model, it is not difficult to extend it to more than one glide plane. We have recently developed the nonplanar SVPN model in order to study dislocation cross-slip and constriction in fcc metals [14]. As shown in Fig. 3, a screw dislocation placed at the intersection of the primary plane (plane I) and the cross-slip plane (plane II) is allowed to spread into the two planes simultaneously. The X (X′) axis represents the glide direction of the dislocation on plane I (II).
Figure 3. Cartesian set of coordinates showing the directions relevant to the screw dislocation located at the intersection of the two glide planes. Plane I (II) denotes the primary (cross-slip) plane.
For an fcc lattice, the two slip planes are (111) and (1̄11), forming an angle θ ≈ 71°. The total energy of the dislocation is

    U_{tot} = U_{I} + U_{II} + \tilde{U}.   (26)
Here, U_I and U_II are the energies associated with the dislocation spread on planes I and II, respectively, and Ũ represents the elastic interaction energy between the dislocation densities on planes I and II. The expressions for U_I and U_II are identical to that given earlier for the single glide plane case, while the new term Ũ can be derived from Nabarro's equation for general parallel dislocations [5]:

    U^{I(II)} = \sum_{i,j} \frac{1}{2} \chi_{ij} \left\{ K_e \left[ \rho_1^{I(II)}(i) \rho_1^{I(II)}(j) + \rho_2^{I(II)}(i) \rho_2^{I(II)}(j) \right] + K_s \rho_3^{I(II)}(i) \rho_3^{I(II)}(j) \right\}
        + \sum_i \Delta x \, \gamma_3\left( \delta_1^{I(II)}(i), \delta_2^{I(II)}(i), \delta_3^{I(II)}(i) \right)
        - \sum_{i,l} \frac{x(i)^2 - x(i-1)^2}{2} \, \rho_l^{I(II)}(i) \, \tau_l^{I(II)} + K b^2 \ln L,

    \tilde{U} = -\sum_{i,j} K_s \rho_3^{I}(i) \rho_3^{p}(j) \, A_{ij}
        - \sum_{i,j} K_e \left[ \rho_1^{I}(i) \rho_1^{p}(j) + \rho_2^{I}(i) \rho_2^{p}(j) \right] A_{ij}
        - \sum_{i,j} K_e \left[ \rho_2^{I}(i) \rho_2^{p}(j) B_{ij} + \rho_1^{I}(i) \rho_1^{p}(j) C_{ij} - \rho_2^{I}(i) \rho_1^{p}(j) D_{ij} - \rho_1^{I}(i) \rho_2^{p}(j) D_{ij} \right].
Here, δ_1^{I(II)}(i), δ_2^{I(II)}(i), and δ_3^{I(II)}(i) represent the edge, vertical, and screw components of the general dislocation slip vector at the ith nodal point in plane I (II), respectively, while the corresponding components of the dislocation density in plane I (II) are defined as before in the planar case. The projected dislocation density ρ^p(i) is the projection of the density ρ^{II}(i) from plane II onto plane I, in order to deal with the nonparallel components of the slip vector. χ_ij, A_ij, B_ij, C_ij, and D_ij are double-integral kernels defined by

    \chi_{ij} = \int_{x_{j-1}}^{x_j} \int_{x_{i-1}}^{x_i} \ln|x - x'| \, dx \, dx',

    A_{ij} = \int_{x_{j-1}}^{x_j} \int_{x_{i-1}}^{x_i} \frac{1}{2} \ln(x_0^2 + y_0^2) \, dx \, dx',

    B_{ij} = \int_{x_{j-1}}^{x_j} \int_{x_{i-1}}^{x_i} \frac{x_0^2}{x_0^2 + y_0^2} \, dx \, dx',

    C_{ij} = \int_{x_{j-1}}^{x_j} \int_{x_{i-1}}^{x_i} \frac{y_0^2}{x_0^2 + y_0^2} \, dx \, dx',

    D_{ij} = \int_{x_{j-1}}^{x_j} \int_{x_{i-1}}^{x_i} \frac{x_0 y_0}{x_0^2 + y_0^2} \, dx \, dx',
where x_0 = L − x + x′ cos θ and y_0 = −x′ sin θ. The equilibrium structure of the dislocation is again determined by minimizing the total dislocation energy functional with respect to the dislocation density.
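To make the structure of such a calculation concrete, here is a minimal, self-contained sketch of a semidiscrete energy functional of the planar type: slip is restricted to the Burgers vector direction, the sinusoidal Frenkel potential of Eq. (16) stands in for an ab initio γ-surface, and the χ/ψ discretization quoted above is used with uniform nodal spacing. Reduced units (lengths in b, energies in μb² per unit length) and all parameter values are assumptions; this is an illustration of the variational idea, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Reduced units: lengths in b, energies in mu*b^2 per unit length (mu = 1).
nu = 0.35                                # Poisson's ratio (assumed)
d = 0.82                                 # glide-plane spacing in units of b (assumed)
K = 1.0 / (2.0 * np.pi * (1.0 - nu))     # Eq. (10) for an edge dislocation
Fmax = 1.0 / (2.0 * np.pi * d)           # mu*b/(2*pi*d) in reduced units

N = 101                                  # nodal points, uniform spacing dx = b
dx = 1.0
xn = (np.arange(N) - N // 2) * dx

def psi(phi):
    out = np.zeros_like(phi)
    m = phi != 0.0
    out[m] = 0.5 * phi[m] ** 2 * np.log(np.abs(phi[m]))
    return out

# chi_ij = (3/2) phi_{i,i-1} phi_{j,j-1} + psi_{i-1,j-1} + psi_{i,j}
#          - psi_{i,j-1} - psi_{j,i-1}; the density lives on the N-1 intervals.
I = np.arange(1, N)
Xi, Xj = xn[I][:, None], xn[I][None, :]
Xim, Xjm = xn[I - 1][:, None], xn[I - 1][None, :]
chi = (1.5 * dx * dx + psi(Xim - Xjm) + psi(Xi - Xj)
       - psi(Xi - Xjm) - psi(Xj - Xim))

def energy(delta_int):
    delta = np.concatenate(([0.0], delta_int, [1.0]))  # slip pinned to 0 and b
    rho = np.diff(delta) / dx                          # dislocation density
    U_el = 0.5 * K * rho @ chi @ rho                   # discretized Eq. (11)
    gamma = (Fmax / (2 * np.pi)) * (1 - np.cos(2 * np.pi * delta))  # Eq. (16)
    return U_el + dx * gamma.sum()                     # plus Eq. (15)

zeta = d / (2.0 * (1.0 - nu))                          # Eq. (14) as a first guess
guess = np.arctan(xn[1:-1] / zeta) / np.pi + 0.5
res = minimize(energy, guess, method="L-BFGS-B")
delta = np.concatenate(([0.0], res.x, [1.0]))
print("misfit at the core:", round(delta[N // 2], 3), "b (should be near 0.5)")
```

Adding an applied-stress term of the form of Eq. (25) and ramping the stress until the minimization fails to converge would then locate the Peierls stress in the manner described above.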
3. Dislocation Core Properties in Aluminum
The GSF energy surface, γ3 (δi ) entering the P–N model can usually be determined from ab initio calculations based on the density functional theory
[15]. In Fig. 4, we show the GSF energy surface for Al, which was computed using a pseudopotential plane-wave method. The computational details can be found in [16]. As shown in Fig. 4, the calculated GSF energy surface maintains the underlying translational and rotational symmetry of the fcc lattice. The three high peaks of the GSF surface correspond to the run-on stacking fault configuration ABC|CABC, in which two C layers are nearest neighbors. The local minimum and maximum along the [112] direction correspond to the intrinsic and unstable stacking faults, respectively. We first examine the core properties of four typical dislocations, i.e., the screw, 30°, 60° and edge dislocations. These dislocations have the same Burgers vector, b = a/2 [101], but different orientations (characters). The results for the energetics and the Peierls stress of the four dislocations are presented in Table 1, along with the values of ζ. First, one can see the trend that the half-width ζ increases monotonically with the dislocation angle θ. Secondly, the misfit energy, U_misfit, also increases monotonically from the screw to the edge dislocation, while the configuration-dependent elastic energy, U_elastic (negative in sign), decreases as the angle increases.
Figure 4. The GSF energy surface for displacements along a (111) plane in Al (J/m2 ) (the corners of the plane and its center correspond to identical equilibrium configurations, i.e., the ideal Al lattice) from DFT calculations.
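The way a tabulated γ-surface enters the model can be mimicked with a toy example: build a periodic surface on a grid and take the restoring stress of Eq. (2) as its negative gradient. The separable sinusoidal form and the amplitudes below are assumptions for illustration only; they do not carry the full (111) symmetry of the DFT surface in Fig. 4.

```python
import numpy as np

# Toy periodic misfit-energy surface; amplitudes (J/m^2) are assumptions.
b, h = 1.0, np.sqrt(3.0)          # periods along the two slip directions
u = np.linspace(0.0, b, 201)
v = np.linspace(0.0, h, 201)
U, V = np.meshgrid(u, v, indexing="ij")
gamma = 0.2 * (1.0 - np.cos(2.0 * np.pi * U / b)) \
      + 0.1 * (1.0 - np.cos(2.0 * np.pi * V / h))

# Eq. (2): the restoring stress is minus the gradient of the gamma-surface.
dg_du, dg_dv = np.gradient(gamma, u, v)
F_u, F_v = -dg_du, -dg_dv

# Translational symmetry: the stress vanishes at lattice-equivalent points
# (displacement 0 or a full period), as discussed in the text.
print("F_u at equivalent corners:", F_u[0, 0], F_u[-1, -1])   # ~0
print("max |F_u|:", np.abs(F_u).max())
```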
Table 1. Core half-widths ζ (in Å); core energies U_core and separate contributions from the configuration-dependent elastic energy U_elastic and the misfit energy U_misfit; K b² ln L (in eV/Å); and Peierls stress (in MPa) for the four dislocations.

                    Screw      30°        60°        Edge
  Core width ζ      2.1        2.5        3.0        3.5
  U_core           −0.0834    −0.1096    −0.1678    −0.1979
  U_elastic        −0.1828    −0.2317    −0.3199    −0.3666
  U_misfit          0.0938     0.1221     0.1521     0.1688
  K b² ln L         1.6050     1.8123     2.233      2.446
  Peierls stress    256        53         98         35
The configuration-independent elastic energy K b² ln L is also included. Several points need to be emphasized: (1) the configuration-dependent elastic energy U_elastic, ignored in some previous studies, is the dominant contribution to the core energy U_core (about a factor of two larger than U_misfit); more importantly, it depends strongly on the dislocation character; (2) while U_elastic is negative here, in principle it can be of either sign; for example, U_elastic was found to be positive in Si; (3) inclusion of the configuration-independent elastic term, K b² ln L, yields positive values for both the total energy and the total elastic energy. As alluded to earlier, the Peierls stress in this work is calculated as the critical value of the applied stress τ at which the dislocation energy functional fails to be minimized with respect to ρ_i through standard conjugate gradient techniques. This approach is more accurate and physically transparent, because it captures the nature of the Peierls stress as the stress at which the displacement field of the dislocation undergoes a discontinuous transition. A typical value for the Peierls stress of Al from the analysis of the Bordoni internal friction peaks is about 230 MPa, which is very close to our value for the screw dislocation (256 MPa) [17]. In order to correlate dislocation properties with the dislocation character, we have studied the properties of 19 different dislocations that have the same Burgers vector but different orientations. The angle between the dislocation line and the Burgers vector varies from 0° to 90°. The core energy, along with its separate contributions from the configuration-dependent elastic energy U_elastic and the misfit energy U_misfit, is presented in Fig. 5 as a function of the dislocation angle θ. We find that U_core and U_elastic decrease monotonically as the angle increases, whereas U_misfit increases with θ. The configuration-dependent elastic energy U_elastic decreases with θ because the prelogarithmic factor K increases with θ. On the other hand, the monotonic increase of U_misfit with θ is due to the fact that the core width increases with
Figure 5. The core energy, elastic energy and misfit energy (in eV/Å) as a function of dislocation orientation.
the dislocation angle. Note that the configuration-dependent elastic energy not only is the dominant contribution to the total energy stored in the core region, but is also more sensitive to the dislocation character than the misfit energy. To correlate the Peierls stress with the dislocation character, we plot ln(σ_p ā / K b) as a function of ζ/ā in Fig. 6. Here, ζ is the half-width of a dislocation and ā is the average nodal spacing along the x direction. It should be pointed out that most of the dislocations in the fcc lattice have non-even nodal spacing, except for the 30° and edge dislocations. Most of the calculated values can be fitted (solid line) with

    \sigma_p = \frac{2\pi K b}{\bar{a}} \, e^{-1.7 \zeta / \bar{a}}.   (27)
The large deviation of σ_p for the 30° and edge dislocations from the common trend indicates that the nodal spacing (even vs. non-even) between atomic planes plays an important role in the Peierls stress [18]. On the other hand, the deviation of the 10.9° and 14.9° dislocations from the common trend is
Figure 6. The scaled Peierls stress as a function of the ratio of the core width to the average atomic spacing perpendicular to the dislocation line.
unclear to us at present. Note that the Peierls stress is more sensitive to the average atomic spacing ā than to the half-width. For example, while both the 0° and 14.9° dislocations have predominant screw components and similar half-widths of 2.1 Å and 2.3 Å, respectively, they have quite different atomic spacings, 1.2 Å and 0.3 Å, respectively. This results in a Peierls stress of 6 MPa for the 14.9° dislocation, almost two orders of magnitude smaller than the 256 MPa for the screw dislocation.
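The strength of the ā dependence can be made explicit by plugging the half-widths and spacings quoted above into the fitted trend of Eq. (27). Keep in mind that, as stated above, the 14.9° dislocation deviates from the fitted line, so this is a sensitivity illustration rather than a prediction; K b is set to 1, so only relative magnitudes matter.

```python
import numpy as np

# zeta and abar values (in Angstroms) are those quoted in the text above;
# K*b = 1 (arbitrary units), since only the ratio between cases matters.
cases = {"screw (0 deg)": (2.1, 1.2), "14.9 deg": (2.3, 0.3)}
for name, (zeta, abar) in cases.items():
    s = (2.0 * np.pi / abar) * np.exp(-1.7 * zeta / abar)   # Eq. (27)
    print(f"{name}: Eq. (27) trend gives sigma_p ~ {s:.2e} (arb. units)")
```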
4. Conclusion
To conclude, the P–N model serves as a link between atomistic and continuum approaches, by providing a means to incorporate information obtained from atomistic calculations (ab initio or empirical) directly into continuum models. The resultant approach can then be applied to problems that neither atomistic nor conventional continuum models could handle separately. The simplicity of the P–N model makes it an attractive alternative to direct
atomistic simulations of dislocation properties. It provides a rapid and inexpensive route to determine dislocation core structure and mobility. Combined with an ab initio determined GSF energy surface, the P–N model can give rather reliable quantitative predictions for various dislocation properties. Furthermore, since ab initio based P–N model calculations are much more expedient than direct ab initio atomistic calculations for dislocations, the P–N model could serve as a powerful and efficient tool for alloy design, where the goal is to select the "right" elements with the "right" alloy composition to tailor desired mechanical, and in particular dislocation, properties. Finally, we should comment that the P–N model is just one example of the more general cohesive surface models that are built upon the idea of limiting all constitutive nonlinearity to certain privileged interfaces, while the remainder of the material is treated through more conventional continuum theories. The same strategy has also been applied to the study of fracture and dislocation nucleation from a crack tip [19].
References
[1] M.S. Duesbery, "Dislocation core and plasticity," Dislocations in Solids, F.R.N. Nabarro, ed., vol. 8, 67, North-Holland, Amsterdam, 1989.
[2] M.S. Duesbery and G.Y. Richardson, "The dislocation core in crystalline materials," CRC Crit. Rev. Sol. State Mater. Sci., 17, 1, 1991.
[3] V. Vitek, "Structure of dislocation cores in metallic materials and its impact on their plastic behavior," Prog. Mater. Sci., 36, 1, 1992.
[4] R. Peierls, "The size of a dislocation," Proc. Phys. Soc. London, 52, 34, 1940.
[5] F.R.N. Nabarro, "Dislocations in a simple cubic lattice," Proc. Phys. Soc. London, 59, 256, 1947.
[6] J.D. Eshelby, "Edge dislocations in anisotropic materials," Phil. Mag., 40, 903, 1949.
[7] V. Vitek, "Intrinsic stacking faults in body-centered cubic crystals," Phil. Mag., 18, 773, 1968.
[8] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1992.
[9] G. Schoeck, "The core energy of dislocations," Acta Metall. Mater., 127, 3679, 1995.
[10] J.W. Christian and V. Vitek, "Dislocations and stacking faults," Rep. Prog. Phys., 33, 307, 1970.
[11] J. Wang, "A new modification of the formulation of Peierls stress," Acta Mater., 44, 1541, 1996.
[12] G. Schoeck, "Peierls energy of dislocations: a critical assessment," Phys. Rev. Lett., 82, 2310, 1999.
[13] V. Bulatov and E. Kaxiras, "Semidiscrete variational Peierls framework for dislocation core properties," Phys. Rev. Lett., 78, 4221, 1997.
[14] G. Lu, V. Bulatov, and N. Kioussis, "A non-planar Peierls–Nabarro model and its application to dislocation cross-slip," Phil. Mag., 83, 3539, 2003.
[15] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864, 1964.
[16] G. Lu, N. Kioussis, V. Bulatov, and E. Kaxiras, "Generalized-stacking-fault energy surface and dislocation properties of aluminum," Phys. Rev. B, 62, 3099, 2000a.
[17] W. Benoit, N. Bujard, and G. Gremaud, "Kink dynamics in f.c.c. metals," Phys. Stat. Sol. (a), 104, 427, 1987.
[18] G. Lu, N. Kioussis, V. Bulatov, and E. Kaxiras, "The Peierls–Nabarro model revisited," Phil. Mag. Lett., 80, 675, 2000b.
[19] J.R. Rice, "Dislocation nucleation from a crack tip: an analysis based on the Peierls concept," J. Mech. Phys. Solids, 40, 239, 1992.
2.21 MODELING DISLOCATIONS USING A PERIODIC CELL

Wei Cai
Department of Mechanical Engineering, Stanford University, Stanford, CA 94305-4040
Dislocations are lattice defects responsible for many mechanical behaviors of crystalline materials, ranging from their growth to deformation and failure [1]. Dislocation motion leads to plastic deformation. In some cases, dislocation interactions give rise to material strengthening, while in other cases they participate in ductile fracture and fatigue. Dislocation nucleation and subsequent multiplication are important processes for many new technologies based on thin films and micro-mechanical structures. Examples include relaxation of strained heteroepitaxial semiconductor layers [2], and materials characterization by micro- and nano-indentation [3]. While long-range interactions between dislocations are well described by continuum elasticity theory, dislocation nucleation, motion and short-range reactions depend sensitively on atomistic mechanisms in the nonlinear region of the dislocation core. Understanding these local unit mechanisms requires atomistic simulations such as Molecular Dynamics (MD). However, the long-range elastic fields of dislocations present some problems for atomistic simulations, which are usually limited in length scale. For example, a dislocation can generate appreciable stress over a range of a micrometer, whereas a cubic simulation cell containing 10⁶ atoms is only about 30 nm in size. Consequently, atomistic simulation results can easily be contaminated by artifacts if boundary conditions are not treated properly. A natural approach to this problem is to establish a good coupling between discrete atomistic models and continuum elasticity theory. The purpose of this article is to discuss some ubiquitous problems of boundary conditions in atomistic modeling of dislocations. Specifically, we focus on the usage of periodic boundary conditions (PBC) and the problem of conditional convergence during error correction. The solution of this problem gives the proper procedure for setting up the initial dislocation structure, extracting
dislocation core energy, as well as for computing the Peierls stress. It is also applicable to computing stress fields in microscale dislocation dynamics (DD) simulations using PBC.
1. Setting up a Dislocation Structure
From a continuum mechanics point of view, a dislocation is the boundary line of a cut plane, the two sides of which have slipped with respect to each other by a constant vector b – the Burgers vector. Figure 1(a) shows a screw dislocation (b parallel to the dislocation line) in the center of a cylinder. In isotropic elasticity, the displacement field for this dislocation is parallel to the z direction and is given by

    u_z(x, y) = b \frac{\theta}{2\pi},   (1)

where θ ∈ (−π, π] is the vector angle of the field point, as shown in Fig. 1(a). In Fig. 1 we have assumed that the outer radius of the cylinder goes to infinity. The angle θ and the displacement u_z become ill defined at the geometrical center of the dislocation (x = y = 0). Here the continuum theory breaks down. To circumvent this problem, a small tube with radius r_c around the dislocation center is usually carved out in continuum models, as shown in Fig. 1(a). r_c is called the core radius. In reality this problem does not exist, because crystals are not continua but consist of discrete atoms. With the core cut-off, the total elastic energy stored within a radius R of the cylinder can also be derived,

    E_{el}(R, r_c) = \frac{\mu b^2}{4\pi} \ln \frac{R}{r_c},   (2)

where μ is the shear modulus of the medium. The choice of r_c is merely a convention and hence somewhat arbitrary (r_c = 1b is often used). We need to
Figure 1. (a) A straight screw dislocation (thick dashed line) in a cylindrical continuum medium, with Burgers vector magnitude b. (b) Atomic structure cell of a screw dislocation in Ta with boundary atoms (grey) fixed according to linear elastic solutions. The interior atoms are initially displaced by elasticity solutions and are subsequently relaxed to minimize total energy. Atoms with high local energies are plotted in dark color, showing the dislocation core region. (c) Side view of the same structure as in (b).
carefully specify our choice of r_c when comparing results with others, since r_c influences elastic energy values. To create an atomic structure of a dislocation, a standard approach is to start with a perfect lattice structure and then displace all atoms according to the prediction of elasticity theory, for example, Eq. (1). When choosing the positions of the dislocation center and cut plane, it is best to avoid intersection with any atoms, so that there will be no ambiguity in computing atomic displacements.¹ Figure 1(b) shows a choice of cut plane (dashed line) for creating a screw dislocation in the BCC metal tantalum. The displacement field obtained this way is usually accurate far away from the dislocation center. Hence we can fix the outermost layer of atoms at these positions to serve as boundary conditions, such as the grey atoms in Fig. 1(b). We can then relax the inner atoms to their equilibrium positions by minimizing the total atomistic energy. The atomic structure thus obtained is usually different from the elasticity theory prediction near the dislocation center, and is referred to as the core structure. The atoms near the core usually have high local energies. By plotting high energy atoms with a different color, as in Fig. 1(b), we can identify the position and spread of the dislocation core. We emphasize that the "physical spread" of the dislocation core in this visualization has nothing to do with the core radius r_c introduced earlier in the elasticity theory. The latter is only a theoretical construct to get rid of the singularity. There is no singularity in the atomistic model. The local energy of every atom is finite, and so is the total energy E_atm of the relaxed structure within a cylinder of radius R. We define E_atm with respect to the energy of a perfect lattice with the same number of atoms. Thus E_atm and E_el in Eq. (2) refer to the same quantity, though defined in different models. If elasticity theory is valid, the two energies should agree with each other up to a constant, that is,

    E_{atm}(R) = E_{el}(R, r_c) + E_{core}(r_c).   (3)

¹ If the Burgers vector has a non-zero component out of the cut plane, it would also be necessary to insert atoms into or delete them from the original lattice [1].
In other words, E_atm and E_el should have the same dependence on R, provided that R is large enough. E_core is called the core energy. We emphasize that the core energy depends on r_c, and hence it is not a physical quantity by itself. Its dependence on r_c should exactly cancel the dependence of E_el on r_c, so that the sum of the two gives a total energy that is invariant with r_c. Therefore, the core energy supplements elasticity theory to form a complete physical description of a dislocation. Equation (3) has been numerically verified by atomistic simulations [4], from which one can deduce the core energy. However, a simulation using
cylindrical boundary conditions is not the most accurate way to compute core energy because of the ambiguity of defining the radius R of a cylinder – one can always find a range of R that encloses the same number of atoms. Using periodic boundary conditions is a better approach, which we will describe below.
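The recipe just described, displace a perfect lattice by the elastic field of Eq. (1) and pin the outermost atoms, is easy to sketch. The simple-cubic "crystal" and all numbers below are illustrative assumptions; a real calculation would use the proper crystal structure and an interatomic potential for the relaxation step.

```python
import numpy as np

a0 = 1.0                            # lattice constant (arbitrary units, assumed)
b = a0                              # Burgers vector magnitude, along z
n = 10
g = (np.arange(-n, n) + 0.5) * a0   # offset grid: no atom sits on the core line
X, Y = np.meshgrid(g, g, indexing="ij")
atoms = np.stack([X.ravel(), Y.ravel(), np.zeros(X.size)], axis=1)

# Eq. (1): u_z = b*theta/(2*pi), theta in (-pi, pi]. With this branch, the
# displacement jump (the cut plane) lies along the negative-x axis, between
# atom rows, so every atomic displacement is unambiguous.
theta = np.arctan2(atoms[:, 1], atoms[:, 0])
atoms[:, 2] += b * theta / (2.0 * np.pi)

# Freeze an outer shell as the boundary condition; the rest would be relaxed
# with an atomistic model (energy minimization) to obtain the core structure.
R = 0.8 * n * a0
fixed = np.hypot(atoms[:, 0], atoms[:, 1]) > R
print(f"{int(fixed.sum())} boundary atoms fixed, {int((~fixed).sum())} free")
```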
2. Dislocations in a Supercell
Periodic boundary conditions are widely used in atomistic simulations of condensed matter. The principal advantage of PBC is that they eliminate artificial surfaces and preserve translational invariance of space. In PBC, the atomic structure is periodically repeated in space with three repeat vectors: c1,2,3 . This means that, whenever there is an atom at position r, there are also atoms at positions r + n 1 c1 + n 2 c2 + n 3 c3 , where n 1 , n 2 and n 3 are arbitrary integers. As shown in Fig. 2(a), we can imagine a parallelepiped simulation cell defined by these three vectors [5, 6]. This simulation cell, usually called a supercell, is then surrounded by an infinite number of copies of itself. The border of the supercell is immaterial; only its shape (as specified by c1,2,3 ) is relevant. To see this, consider the infinite structure formed by periodically repeating the supercell. We can then arbitrarily shift the supercell to a different location, carve out the atoms within it and form a new supercell. The new supercell will correspond to exactly the same system as the original one. Therefore, no point in space is made more special than others and the translational invariance of space is preserved. In comparison, this is not the case for other types of boundary conditions. For example in Fig. 1(b), the atoms in
Figure 2. (a) Atomistic supercell containing a dislocation dipole in silicon. The repeat vectors of the supercell are c1 = 4[112], c2 = 3[111], c3 = [110]. The distance between the two dislocations is a = c1/2. (b) Schematic representation of the supercell and its image cells, each containing a dislocation dipole. Image dipoles are illustrated in grey, and "ghost" dislocations introduced to facilitate the image energy calculation are plotted in white.
the outer layer of the cylinder are fixed while the inner ones are allowed to move. This creates an interface between the two domains, which may be quite undesirable for certain applications. In addition, PBC is the de facto boundary condition for electronic structure calculations that employ plane-wave bases, simply because plane waves demand translational invariance. However, the advantages of PBC come at a price. First, supercells can only accommodate dislocation arrangements whose net Burgers vector is zero. Thus, the minimal number of dislocations that can be introduced in a supercell is two, forming a dipole. These two dislocations interact with each other, as well as with the periodic images of themselves. Associated with these interactions are additional strains, energies and forces whose effects can "pollute" the simulation results. Fortunately, it is possible to quantify such artifacts by exercising continuum elasticity theory; after that they can be either corrected or minimized. This extra work is well worth it, given the unique advantages offered by PBC – their simplicity, flexibility and versatility.

Let us begin our discussion with setting up the atomic structure of a dislocation dipole in a supercell. This turns out to be nontrivial. Consider two screw dislocations with Burgers vectors b and −b, separated from each other by a. According to Eq. (1), the displacement field of these two dislocations in an infinite medium (without any images) is simply

    u_z^{dipole}(x, y) = b \, \frac{\theta_1 - \theta_2}{2\pi},   (4)

where θ_1 (θ_2) is the angle between the vector from the dislocation b (−b) to the field point and the cut plane, similarly defined as in Fig. 1(a). Figure 3(a) plots the displacement field in the rectangular region x ∈ [−1, 1], y ∈ [−0.5, 0.5] for a dislocation dipole at x = ±0.5, y = 0. The cut planes of the two dislocations overlap with each other, so that the discontinuity in the displacement field only occurs between the dislocations. This displacement field is obviously non-periodic. Attempts to fit this configuration into a periodic supercell will inevitably create some mismatch at the box boundaries. Nevertheless, taking this as an initial condition, it is often possible to relax the mismatch away – but this is not guaranteed. Some mismatch can persist as spurious interfaces that contaminate the simulation. It is thus desirable to create initial atomic structures that already satisfy PBC. A natural approach to generate the desired displacement field is to superimpose the displacement fields of many dislocation dipoles that form a periodic array, that is,

    u_z^{sum}(r) = \sum_{R} u_z^{dipole}(r - R) = u_z^{dipole}(r) + u_z^{img}(r), \qquad u_z^{img}(r) \equiv \sum_{R \neq 0} u_z^{dipole}(r - R),   (5)
Figure 3. Constructing the displacement field u_z(x, y) of a screw dislocation dipole in PBC by superimposing the displacement fields of the primary dipole, in (a), and the image dipoles. The fully corrected result is shown in (d); see text.
where the summation runs over the two-dimensional lattice R = n_1 c_1 + n_2 c_2 (n_1, n_2 are integers) that specifies the offset of the image dipoles with respect to the primary dipole (which lies inside the supercell), and the dislocation lines are parallel to c_3. The summation excludes the term for the primary dipole, that is, R = 0. In practice, this sum is evaluated only for a finite number of image dipoles closest to the primary dipole. From Fig. 3(b) it is clear that, with the image contributions added, the displacement field now looks "more periodic" than before, but not exactly. Let us denote the desired but yet unknown periodic solution as u_z^{PBC}. It can be proved that the remaining non-periodic part u_z^{err} = u_z^{sum} − u_z^{PBC} is a field with a constant slope [5, 6], as shown in Fig. 3(c). It can also be shown that the error field u_z^{err} can converge to arbitrary values, depending on how the terms
are ordered during the summation or, equivalently, how the sum is truncated at the end of the summation. For example, the field will converge to different values when summing over the dipoles contained in circles of increasing radius and over those in squares of increasing size. This undesirable behavior is called conditional convergence and is a consequence of the long-range character of the elastic fields of dislocations.

Fortunately, we are just one step away from obtaining the desired and unique solution u_z^PBC. Since u_z^err must be a linear function, it can easily be measured by taking the differences u_z^sum(r + c₁,₂) − u_z^sum(r), because by definition u_z^PBC(r + c₁,₂) = u_z^PBC(r). Thus, we recover the periodic solution u_z^PBC(r) by subtracting off the linear term (see Fig. 3(d)). After the correction, the result u_z^sum(x, y) − u_z^err(x, y) becomes absolutely convergent, in that it no longer depends on the details of the summation procedure.

For completeness, it is often desirable to use a displacement field that is not strictly periodic with respect to x and y but includes a constant tilt. In terms of the supercell repeat vectors, this corresponds to the case where c₁,₂,₃ are not orthogonal to each other. The purpose is to minimize the average elastic stress within the supercell. The required tilt can be computed through the following considerations. The creation of the dislocation dipole introduces a plastic strain in the supercell,

$$\epsilon^{\rm pl} = (\vec A \otimes \vec b + \vec b \otimes \vec A)/(2V),\qquad(6)$$

where A is the area of the cut plane (times the plane normal vector) on which the displacement field jumps by b, and V is the volume of the supercell. ε^pl will cause non-zero average elastic strain and stress unless the supercell is tilted in such a way as to exactly accommodate it. In our example, A = (c₃ × c₁)/2. Zero average internal stress can be achieved by using a new repeat vector c₂′,

$$\vec c_2' = \vec c_2 + \vec b/2.\qquad(7)$$
This corresponds to introducing a linear term, u^tilt(x, y) = by/2, into the displacement field. The result, u^PBC(x, y) + u^tilt(x, y), is no longer periodic in y. We emphasize that although u^tilt(x, y) and u^err(x, y) are both linear fields, they have completely different meanings. u^err(x, y) is an arbitrary error that needs to be subtracted off from the summation in order to obtain a unique answer. On the other hand, u^tilt(x, y) has a specific value and is introduced intentionally to minimize internal stresses in the supercell.
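To make the whole construction concrete, the following minimal sketch (ours, not the code of Refs. [5, 6]; the values of b, c1, c2, the dipole half-separation xd and the truncation N are arbitrary placeholders) builds u_z^sum by direct summation of Eq. (5) and then subtracts the measured linear error field:

```python
import numpy as np

b = 1.0                # Burgers vector magnitude (arbitrary units)
c1, c2 = 2.0, 1.0      # supercell repeat lengths along x and y
xd = 0.5               # dislocations +b and -b sit at (-xd, 0) and (+xd, 0)

def u_dipole(x, y):
    """Eq. (4): displacement of the screw dipole in an infinite medium."""
    th1 = np.arctan2(y, x + xd)          # angle seen from the +b dislocation
    th2 = np.arctan2(y, x - xd)          # angle seen from the -b dislocation
    return b / (2.0 * np.pi) * (th1 - th2)

def u_sum(x, y, N=20):
    """Eq. (5): superpose image dipoles on the lattice R = n1*c1 + n2*c2."""
    s = 0.0
    for n1 in range(-N, N + 1):
        for n2 in range(-N, N + 1):
            s += u_dipole(x - n1 * c1, y - n2 * c2)
    return s

# u_err is linear, so its slopes follow from the mismatch across one period
# (u_PBC is periodic by definition); subtracting them recovers u_PBC up to a
# constant. With finite N the result is, of course, only approximate.
x, y = 0.1, 0.07
gx = (u_sum(x + c1, y) - u_sum(x, y)) / c1
gy = (u_sum(x, y + c2) - u_sum(x, y)) / c2
u_pbc = u_sum(x, y) - gx * x - gy * y
print(u_pbc)
```

The measured slopes gx and gy change with the truncation size and shape, which is the conditional convergence in action; the corrected u_pbc, however, approaches a unique answer.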
3. Core Energy
The atomic structure created by the above procedure can be used as initial conditions for an energy minimization algorithm to find the equilibrium
structure of the dislocation dipole in a supercell. Let the atomistic energy be E^atm(a), where the dependence on the repeat vectors c₁,₂,₃ is implicitly assumed and not written out explicitly. This energy can also be derived from continuum elasticity theory, and we denote the result by E^el(a, r_c). Again, for elasticity theory to be valid, the two energies must agree with each other up to a constant,

$$E^{\rm atm}(\vec a) = E^{\rm el}(\vec a, r_c) + 2E^{\rm core}(r_c),\qquad(8)$$
where we have assumed that the core energies of the two dislocations are identical. This equation provides another way to extract dislocation core energies, provided we can compute E^el(a, r_c). For self-consistency, the core energy E^core(r_c) thus obtained should depend only on the choice of r_c, and be independent of a and c₁,₂,₃. To compute E^el(a, r_c), we start by writing down the elastic energy of an isolated dislocation dipole (assuming isotropic elasticity and screw dislocations),

$$E_{\rm el}^{\rm dipole}(\vec a, r_c) = \frac{\mu b^2}{2\pi} \ln\frac{a}{r_c}.\qquad(9)$$

At the same time, the total elastic energy also includes interactions between the dislocation dipole in the primary supercell and those in the (infinitely many) image cells,

$$E^{\rm el}(\vec a, r_c) = E_{\rm el}^{\rm dipole}(\vec a, r_c) + E_{\rm el}^{\rm img}(\vec a),\qquad(10)$$

$$E_{\rm el}^{\rm img}(\vec a) = \frac{1}{2}\sum_{\vec R} E_{dd}(\vec R),\qquad(11)$$

$$E_{dd}(\vec R) = \frac{\mu b^2}{2\pi}\,\ln\frac{|\vec R + \vec a|\cdot|\vec R - \vec a|}{|\vec R|^2}.\qquad(12)$$
E_dd(R) is the interaction energy between the primary dipole and an image dipole offset by R. The summation is over the lattice R = n₁c₁ + n₂c₂, excluding the origin. The factor 1/2 appears in Eq. (11) because only half of the interaction energy should be attributed to the primary supercell (the other half belongs to the image cell). Given the above three equations, the task of computing E^el(a, r_c) and hence E^core(r_c) does not look complicated. Unfortunately, we have the conditional convergence problem again, this time for the summation in Eq. (11). Depending on how we truncate the summation (e.g., by circles or by squares), we will converge to different numerical values for E_el^img(a). This is obviously unacceptable. It turns out that the solution to this problem is similar to the one we encountered earlier when setting up the initial atomic structure. One can show
that the conditionally convergent component of the image energy is proportional to the spurious average stress σ_ij^err generated by the image dipoles in the supercell, which is itself a conditionally convergent quantity [5, 6]. The absolutely convergent form of the image energy is

$$E_{\rm el}^{\rm img}(\vec a) = \frac{1}{2}\sum_{\vec R} E_{dd}(\vec R) + \frac{1}{2}\, A_j b_i \sigma_{ij}^{\rm err}.\qquad(13)$$

Similar to the previous approach of measuring u_z^err, the second term in the above equation can be measured through the interaction energy between the image dipoles and "ghost" dislocation dipoles introduced at the supercell boundary – when the spurious stress is not present, the interaction with the "ghost" dislocations should vanish. Therefore, the image energy can also be written as

$$E_{\rm el}^{\rm img}(\vec a) = \frac{1}{2}\sum_{\vec R} E_{dd}(\vec R) - \frac{1}{2}\sum_{\vec R} E_{dg}(\vec R),\qquad(14)$$
where E_dg(R) is the interaction energy between an image dipole (offset by R) and the four ghost dislocations, as shown in Fig. 2(b). The Burgers vector magnitudes of the ghost dislocations satisfy the condition a = αc₁ + βc₂. Figure 4 plots the numerical data [5, 6] for a dislocation dipole in silicon with the supercell geometry shown in Fig. 2(a). Both E^atm and E^el are computed for a few supercell geometries with a kept at c₁/2. The fact that the difference between these two energies remains a constant demonstrates the validity of our approach, from which we can also extract the dislocation core energy.
Figure 4. Atomistic energy E^atm and linear elastic energy E^el (♦) of the dislocation dipole as functions of the supercell shape, plotted against the supercell width c₁ (in units of [112]), with E in eV/Å. The solid line represents 2E^core = E^atm − E^el.
We note that in order to reach this accuracy, one needs to use anisotropic elasticity theory, in which the formulas for dislocation interactions are more complicated than Eq. (12). The elastic constants used in the elastic energy calculation should be the ones corresponding to the interatomic potential used in the atomistic simulation.
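The conditional convergence of Eq. (11) is easy to demonstrate numerically. The following small experiment (a sketch of ours, not taken from Refs. [5, 6]; µ, b, the cell vectors and the cutoffs are arbitrary placeholders) sums E_dd(R) of Eq. (12) inside squares and inside circles of growing size; the two truncations settle on different values, and that discrepancy is precisely what the ghost-dislocation term in Eqs. (13)–(14) removes:

```python
import numpy as np

mu, b = 1.0, 1.0                  # shear modulus and Burgers vector (placeholders)
c1 = np.array([2.0, 0.0])
c2 = np.array([0.0, 1.0])
a = c1 / 2.0                      # dipole separation, kept at c1/2 as in Fig. 4

def E_dd(R):
    """Eq. (12): interaction of the primary dipole with an image dipole at R."""
    return mu * b**2 / (2.0 * np.pi) * np.log(
        np.linalg.norm(R + a) * np.linalg.norm(R - a) / np.linalg.norm(R)**2)

def E_img(rcut, shape):
    """Eq. (11): half-sum of E_dd over image sites, truncated by `shape`."""
    s = 0.0
    n_max = int(rcut / min(np.linalg.norm(c1), np.linalg.norm(c2))) + 1
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            if n1 == 0 and n2 == 0:
                continue                      # exclude the primary dipole
            R = n1 * c1 + n2 * c2
            if shape == 'circle' and np.linalg.norm(R) > rcut:
                continue
            if shape == 'square' and max(abs(R[0]), abs(R[1])) > rcut:
                continue
            s += E_dd(R)
    return 0.5 * s

# The two truncations generally approach different limits as rcut grows.
for rcut in (10.0, 20.0, 40.0):
    print(rcut, E_img(rcut, 'square'), E_img(rcut, 'circle'))
```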
4. Peierls Stress
Continuum elasticity theory can be used to determine the thermodynamic driving force on a dislocation line, which is the ratio of the energy dissipation to a virtual dislocation displacement. Yet how fast the dislocation moves in response to its driving force is usually beyond the realm of continuum elasticity and requires an atomistic treatment. A fundamental property of dislocation mobility is the Peierls stress, which is the minimum stress required for a straight dislocation to move at zero temperature. The Peierls stress is related, although not necessarily directly, to the macroscopic yield stress above which the crystal deforms plastically. It is an idealized concept, because in a real crystal dislocations are usually not straight, and zero temperature cannot be achieved in practice. Nonetheless, the Peierls stress is a well defined quantity (at least in theory) and is a useful measure of the intrinsic lattice resistance to dislocation motion, since dislocations with a higher Peierls stress generally have lower mobility under similar conditions.

The Peierls stress can be computed from a series of atomistic simulations in which the applied stress is gradually increased. Each stress increment is followed by an energy minimization of the atomic structure. The critical stress τ_c at which the dislocation starts to move is an estimate of the Peierls stress τ_PN. However, various boundary-condition artifacts introduce errors into the measured critical stress τ_c, making it deviate from the true Peierls stress τ_PN. In the following, we describe how to compute the Peierls stress using a supercell and how to minimize the error coming from the boundary conditions.

Figure 5(a) shows a supercell suitable for Peierls stress calculations. The two screw dislocations (b along z) are separated vertically (a along y) while their glide planes are horizontal. An applied stress (σ_yz) will exert equal forces on the two dislocations but in opposite directions. When the critical stress is reached, both dislocations start to move indefinitely across the supercell (without the danger of annihilating each other) and the energy minimization algorithm will fail to converge. This makes the critical condition easy to detect. Due to the interaction between the two dislocations in the supercell, plus the interaction with their images, the actual force experienced by each dislocation does not come solely from the applied stress. To analyze the effect of the boundary conditions, consider the energy variation as the relative position of the two dislocations changes along the x direction. As shown in Fig. 5(b), the energy variation E(x) is a periodic function of x. The data from anisotropic elasticity theory agree well with atomistic simulations [5, 6].
Figure 5. (a) A supercell suitable for Peierls stress calculations, with a = c₂/2 initially. The normal vectors of the dislocation glide planes are parallel to c₂, so that the dislocations will not annihilate each other by gliding. (b) The atomistic energy variation of a dislocation dipole in silicon as a function of the relative position x of the two dislocations along the x direction (◦). When x = 0, the two dislocations are separated by a = c₂/2. The solid line is the anisotropic elasticity prediction and the dashed line is the isotropic elasticity result. (c) Critical stress τ_c of screw dislocations in silicon as a function of the aspect ratio of the supercell (c₂/c₁) at fixed c₁ = 5[112]. Data points are shown for x = 0 and for x = c₁/2; • are obtained by averaging the two.
This shows that continuum elasticity can be used to accurately determine the boundary effects by computing the elastic energy [Eq. (10)]. This is advantageous because direct computation of the energy variation from atomistic simulations is time consuming. The slope of the E(x) curve gives rise to an image force on the dislocations, in addition to the Peach–Koehler force [1] due to the applied stress σ_yz. This extra force introduces an error into the Peierls stress calculation. Considering the shape of the E(x) curve, it is obvious that this error is minimized for dislocation positions where dE/dx = 0, that is, either at x = 0 or x = c₁/2. A second-order error still exists even in these two special configurations, due to the curvature (d²E/dx²) and lattice discreteness. Because the E(x) curve is close to sinusoidal, the errors in the critical stress are opposite in sign at x = 0 and x = c₁/2. Therefore, the error can be further reduced by computing the Peierls stress in these two settings and averaging the results. Figure 5(c) plots the critical stress τ_c of screw dislocations in silicon using supercells with different heights (c₂) but fixed width (c₁ = 5[112]). Both sets of data, for x = 0 and for x = c₁/2, converge to 1.98 GPa with increasing c₂, while their averaged values (•) reach this asymptotic value even for relatively short cells. This indicates that with this procedure it is possible to accurately determine the Peierls stress using relatively small supercells, owing to the error cancellation. This is helpful for first-principles simulations, which are limited to small supercells.
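Schematically, the procedure just described can be organized as follows. This is a pseudocode-style sketch: build_dipole_supercell and relax_structure are hypothetical stand-ins for one's own atomistic routines, and the stress increment and cap are placeholders.

```python
def critical_stress(x_offset, d_tau=0.01, tau_max=5.0):
    """Ramp the applied stress sigma_yz until relaxation fails to converge,
    i.e., until the dipole starts moving indefinitely across the supercell."""
    config = build_dipole_supercell(x_offset)   # relative offset: 0 or c1/2
    tau = 0.0
    while tau < tau_max:
        tau += d_tau
        if not relax_structure(config, applied_stress=tau):
            return tau                          # estimate of tau_c at this offset
    raise RuntimeError("no dislocation motion detected up to tau_max")

# The image-force errors have opposite signs at the two special offsets,
# so averaging the two estimates cancels the leading error term:
tau_PN = 0.5 * (critical_stress(0.0) + critical_stress(0.5))  # offsets in units of c1
```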
5. Stress Field Calculations
The conditional convergence problems we encountered in atomistic simulations using supercells, for computing displacement fields or elastic energies, are caused by the intrinsic long-range character of dislocation interactions. Because periodic boundary conditions are widely used in various kinds of simulations, it is natural to expect this problem to be quite ubiquitous. For example, PBC is often used in microscale DD simulations. The calculation of stress fields in DD also involves a conditionally convergent summation. In the following, we discuss how the same idea as developed above can be applied to solve this problem.

Dislocation dynamics simulations do not deal with atoms. The relevant degrees of freedom are mathematical lines, usually discretized into straight segments (or curved splines), representing the location of the dislocations. The driving force on each segment is related to the local stress field, which is the superposition of the applied stress and the internal stress from all other segments. Therefore, most of the time in DD simulations is spent on computing the stress at a material point P due to a segment centered at position S. When periodic boundary conditions are used, the stress fields due to the infinite number of images of segment S need to be included. As an example, consider a differential segment at the origin and a field point at r = (x, y, z). A segment is called differential if its length dl is considerably smaller than r. The stress field of a differential segment takes a simpler form – proportional to dl – than that of a finite-length segment. To be specific, consider an edge dislocation segment with b along the y axis and dl along the x axis, and consider the x–z component of its stress field. In isotropic elasticity, it is given by [1]

$$\frac{\sigma_{13}^{\rm seg}(\vec r)}{\mu\, b\, dl} = \frac{\nu x}{(1-\nu)\, r^3} - \frac{3xz^2}{(1-\nu)\, r^5},\qquad(15)$$
where µ is the shear modulus and ν is the Poisson ratio. To construct the stress field when a supercell is used, we need to superimpose the above stress field over a periodic array of dislocation segments,

$$\sigma^{\rm sum}(\vec r) = \sum_{\vec R} \sigma^{\rm seg}(\vec r - \vec R),\qquad(16)$$
where the sum runs over all lattice points R = n₁c₁ + n₂c₂ + n₃c₃. Because σ^seg(r) ∼ r⁻², the sum in Eq. (16) (now in three dimensions) is only conditionally convergent. Define the desired but as yet unknown stress field to be σ^PBC(r). One can show that the difference between σ^sum(r) and the correct solution is a field with a constant slope,

$$\sigma^{\rm sum}(\vec r) = \sigma^{\rm PBC}(\vec r) + g\cdot\vec r + \sigma^0,\qquad(17)$$
where g is a third-order tensor accounting for a stress gradient and σ⁰ is an average stress. Both g and σ⁰ are conditionally convergent – their values vary depending on how the summation is truncated. Because the stress field of a differential dislocation segment is anti-symmetric with respect to inversion, that is, σ^seg(−r) = −σ^seg(r), it is a simple matter to ensure that σ⁰ = 0 by always including the image segment at −R whenever an image segment at R is encountered. The stress gradient g, on the other hand, is generally nonzero. However, it can easily be computed after the summation is completed, by measuring the stress differences at the supercell borders, for example,

$$g\cdot\vec c_i = \sigma^{\rm sum}(\vec c_i/2) - \sigma^{\rm sum}(-\vec c_i/2),\qquad i = 1, 2, 3.\qquad(18)$$
r ) from an It is then straightforward to obtain a regularized solution σ PBC ( r ), by solving for g and subtractarbitrarily chosen summation sequence σ sum ( r ). In practice, the summations over stress ing off the term g · r from σ sum ( fields are performed before the DD simulation for efficiency. The results are tabulated so that no image summation is required during the DD simulation.
6. Summary
Dislocations are dual objects: they possess both a localized, highly nonlinear core region and a long-range elastic field. Because of this, setting up proper boundary conditions for dislocation modeling is not trivial, and usually requires coupling atomistic models with continuum elasticity theory. This article focuses on periodic boundary conditions and the ensuing conditional convergence problem, which appears both in setting up the initial dislocation structure and in extracting the dislocation core energy. The problem is solved by the fact that the conditionally convergent term can always be related to a field with a constant slope, which can be measured and subtracted off, so that the correct solution is recovered. The same idea can be applied to minimize the boundary artifacts in Peierls stress calculations, as well as to compute stress fields in microscale Dislocation Dynamics simulations using a supercell.
Acknowledgment

This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory, under contract No. W-7405-Eng-48.
References

[1] J.P. Hirth and J. Lothe, Theory of Dislocations, Wiley, New York, 1982.
[2] K.W. Schwarz, Phys. Rev. Lett., 91, 145503, 2003.
[3] J. Li, K.J. Van Vliet, S. Yip, and S. Suresh, Nature, 418, 307, 2002.
[4] W. Xu and J. Moriarty, Phys. Rev. B, 54, 6941, 1996.
[5] W. Cai, V.V. Bulatov, J. Chang, J. Li, and S. Yip, Philos. Mag., 83, 539, 2003.
[6] W. Cai, V.V. Bulatov, J. Chang, J. Li, and S. Yip, Phys. Rev. Lett., 86, 5727, 2001.
2.22 A LATTICE BASED SCREW-EDGE DISLOCATION DYNAMICS SIMULATION OF BODY CENTER CUBIC SINGLE CRYSTALS

Meijie Tang
Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94550
1. Introduction and Historical Background
This article is an introduction to the three-dimensional (3d) dislocation dynamics (dd) simulation method. It is complementary to the article by Zbib in Chapter 3 of this handbook. It is the intention of the author to introduce a specific model of the method with examples that can be understood rather easily.

A complete understanding of plasticity involves the understanding of individual dislocation properties as well as their collective behavior at the mesoscopic scale. Although dislocation theory and transmission electron microscopy (TEM) have revealed significant basic properties and mechanisms of dislocations [1, 2], the multiplicity and the complexity of the mechanisms of dislocation motion and interaction make it hopeless to reach a quantitative description of dislocation-mechanism-based plasticity. The 3d dd method was developed as a numerical means to track the complex collective dislocation motion and to provide a key link from individual dislocation properties and isolated dislocation mechanisms to the plastic deformation properties of materials at the macroscopic length scale.

In the mid-1980s, the first computational models for dd appeared. In these approaches, the dislocations were handled as simplified point objects, representing infinitely long parallel dislocation lines, in idealized 2d crystals with a set of rules defining their behavior [3, 4]. However, these simulations missed important effects such as slip geometry, the line tension effect, dislocation multiplication and intersections. In the early 1990s, the development of new dd simulations in 3d started to emerge (for a review, see Bulatov et al. [5]). In these
new simulation models, the dislocation lines are rigorously represented by a proper discretization scheme, either segment based or node based. The dislocation motion and mutual interactions are handled explicitly according to the physics input and the underlying governing mechanisms, arising from either the dislocation core properties or the elastic properties. A wealth of plastic deformation properties can be obtained, including the stress-strain curves, the dislocation density evolution, and detailed information related to the dislocation microstructure evolution. In this article, we use the screw-edge based dislocation dynamics method as an example to introduce some fundamental aspects of the 3d dd method, to show how the method is formulated, and to demonstrate how the input and output are constructed. Specific dd simulation examples are given for body-centered cubic (bcc) single crystals at low temperatures, where the screw-edge based method is most convenient.
2. Lattice Based Screw-Edge Dislocation Dynamics Simulation Method
Unlike molecular dynamics simulations, dislocation dynamics simulations track the motion of line objects (dislocations) instead of point objects (atoms). These line objects can be straight or highly curved, and they interact with each other in complex manners. All dislocations interact with each other through the elastic field, which is long ranged, similar to the electric field of charges. When dislocations approach and contact each other, various topological changes (the so-called short-range interactions) can occur. Dislocations can annihilate, reconnect, and form new types of dislocations. Similar to what has been observed under TEM, the "spaghetti"-like dislocation microstructures can be quite complex. What is more complex in the simulation is that the dislocation microstructure evolves as a function of time.
2.1. Dislocation Line Discretization and Topology Changes
The basic function of the dd simulation is to track the topological changes of the dislocation lines. A proper discretization scheme is needed to do so. In the screw-edge based approach, the dislocations are represented by piecewise connected screw and/or edge segments residing on a discretized sublattice [6]. The basic unit segments are the smallest segments defined on the underlying discretized lattice structure. For example, for a bcc single crystal, the lattice structure for the dd simulation is a simple cubic one with a lattice parameter a*, where a* is typically a few nanometers and is thus much
larger than the atomistic crystalline lattice parameter (a few angstroms). For a given Burgers vector and slip plane, for example, $[111](0\bar{1}1)$ in the case of bcc, one can define the unit (or smallest) screw and edge segments in the discretized lattice, as shown in Fig. 1(a). An arbitrary dislocation loop, as shown in Fig. 1(b), can then be represented by screw and edge segments that are multiples of the unit ones for the slip system considered. The dislocation segments can have both positive and negative directions. A bcc single crystal has 12 slip systems of ⟨111⟩{110} type. Unit segments of screw and edge character are defined for each slip system, and thus whole dislocation configurations can be represented.

The screw-edge discretization scheme provides a convenient way to follow the topological changes, including dislocation motion as well as short-range interactions. A few examples of the topological changes are given below. These are segment movement, segment annihilation, and segment rediscretization. In order to maintain the connectivity of dislocation lines, when a dislocation segment moves, new segments are inserted between the moving segment and its previous neighbors, as shown in Fig. 2(a). If the moving segment is an edge (or screw), the inserted ones are screw (or edge). Also, the smallest distance of movement that one segment can make is determined by the unit segment of its opposite character in the discretized lattice. For example, the smallest moving distance of a screw is the length of a unit edge; the smallest moving distance of an edge is the length of a unit screw. Because all dislocation segments are made of multiples of the unit segments and all segment ends reside at lattice sites, the moving distances are multiples of the unit segment lengths. Figure 2(b) is an example of partial annihilation and reconnection between two dislocations with the same Burgers vector and opposite directions approaching each other in the same slip plane.
Figure 1. (a) A dd sublattice with unit screw and edge segments defined. The unit screw segment is along ⟨111⟩ with length √3 a*, and the unit edge segment is along ⟨112⟩ with length √6 a*. (b) An arbitrarily shaped dislocation loop is discretized into piecewise connected screw and edge segments.
Figure 2. A few examples of topological changes. (a) Segment movement. (b) Segment annihilation and reconnection. (c) Segment rediscretization and movement.
Another type of topological change deals with rediscretization, as shown in Fig. 2(c). When the discretization is too coarse to describe detailed dislocation configurations, the long segment is cut into shorter ones according to prescribed criteria (such as the stress variation along the dislocation line or the curvature of the line), and the subsequent movements will result in more refined dislocation line shapes.
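The bookkeeping behind such moves can be illustrated with a toy data layout (entirely our own invention, not the data structures of Ref. [6]); it records only segment character and length in unit-segment multiples, and shows how moving a segment inserts connector segments of the opposite character:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    character: str   # 'screw' or 'edge'
    length: int      # in multiples of the corresponding unit segment

def move_segment(line, i, steps):
    """Advance segment i by `steps` unit distances of its opposite character,
    inserting connector segments to keep the dislocation line connected.
    Signs, slip systems, and lattice coordinates are omitted in this toy."""
    other = 'edge' if line[i].character == 'screw' else 'screw'
    line.insert(i + 1, Segment(other, steps))   # connector to the next neighbor
    line.insert(i, Segment(other, steps))       # connector to the previous one
    return line

line = [Segment('screw', 4)]
move_segment(line, 0, 2)     # the screw advances by two unit-edge lengths
print(line)                  # edge(2), screw(4), edge(2)
```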
2.2. Dislocation Motion and Plastic Strain
Dislocations move and make topological changes in response to the local resolved shear stress τ*, which is the total driving force for dislocation motion. It can be calculated using the Peach–Koehler formula from the various stresses the dislocations experience. Typically, it has the following contributions:

$$\tau^* = \tau_{\rm elas} + \tau_{\rm app} + \tau_{\rm bc},\qquad(1)$$
where τelas comes from the elastic interaction stresses between dislocation segments, τapp comes from the externally applied stresses, and τbc comes from the specified boundary conditions. For example, in the case of thin films, τbc comes from the image stresses due to the free surfaces in thin films. In the
case of bulk systems with periodic boundary conditions, τ_bc is due to the periodic images of the dislocations in the simulation box. The local resolved shear stress is calculated at the center of each segment.

How dislocations move under the local resolved shear stresses is defined by the mobility rules. Because the dislocation mobility is determined by the dislocation core structure, it is atomistic in nature and thus needs to be provided to the dd simulations as input. The dislocation mobility varies for different crystalline structures and for different characters of dislocations. In the case of bcc single crystals, the screw dislocations have a three-dimensional core structure and thus a high lattice friction for motion, that is, a high Peierls stress. As a result, they move by the thermally assisted kink-pair mechanism at low temperatures [7]. On the other hand, the edge dislocations have a low Peierls stress and move by the phonon-drag mechanism. The mobilities of the edge and the screw in a bcc single crystal are given by [8]

$$\nu_{\rm screw} = \nu_0 \exp[-H(\tau^*)/k_B T],\qquad \nu_{\rm edge} = \tau^* b/B,\qquad(2)$$
where H(τ*) is the kink-pair activation enthalpy, ν₀ is a pre-factor, T is the temperature, k_B is the Boltzmann constant, and B is the drag coefficient. Under typical conditions at low temperatures, the mobility of the screws can be several orders of magnitude lower than that of the edges. If the time step is dictated by the edges, the computation is very inefficient because the screws will stay idle most of the time. In our method, the time step of the simulation is chosen based on the screw mobility instead, and the edges are assumed to move infinitely fast. The edges only stop either at their equilibrium positions, determined by the balance of the applied stress, the line tension and other elastic stresses, or at positions where close contact interactions occur. This algorithm is approximate, but it is efficient and captures the dominating effect of the low-temperature plastic behavior of bcc single crystals. In the screw mobility, the most critical input is the stress-dependent activation enthalpy, which is extracted from experiments. It can also be calculated directly from atomistic simulations. Once the mobility rules are provided, the dd simulation proceeds by moving the segments with finite time steps. As the dislocations move, they accumulate plastic strain. The incremental plastic strain δε^p produced during a time step by all segments is given by

$$\delta\varepsilon^p = \sum_{i=1}^{n} \frac{L_i\,\nu_i\delta t\, b_i}{V},\qquad(3)$$
where V is the volume of the simulation box, n is the total number of segments, L_i is the ith segment length, ν_iδt is the distance moved by the ith segment within the time step, and b_i is the Burgers vector of the segment.
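A hedged sketch of the mobility rules of Eq. (2) and the strain increment of Eq. (3) follows. All parameter values are placeholders, and the linear form assumed for H(τ) merely stands in for the experimentally fitted kink-pair activation enthalpy; the sketch also assumes a single Burgers vector magnitude for all segments.

```python
import numpy as np

kB = 8.617e-5                 # Boltzmann constant (eV/K)
nu0 = 1.0e3                   # screw velocity prefactor (m/s), placeholder
H0, tau0 = 0.8, 1.0e9         # enthalpy scale (eV), reference stress (Pa)
B = 1.0e-5                    # edge drag coefficient (Pa s), placeholder
b = 2.86e-10                  # Burgers vector magnitude (m)

def v_screw(tau, T):
    """Thermally activated kink-pair motion of the screws, Eq. (2)."""
    H = H0 * max(1.0 - tau / tau0, 0.0)   # assumed stress dependence of H
    return nu0 * np.exp(-H / (kB * T))

def v_edge(tau):
    """Phonon-drag-limited motion of the edges, Eq. (2)."""
    return tau * b / B

def d_eps_plastic(L, v, dt, V):
    """Eq. (3): plastic strain increment from all segments in a time step."""
    return np.sum(L * v * dt) * b / V

# One screw and one edge segment, each 1 um long, in a (5 um)^3 box: the
# screw is many orders of magnitude slower, as discussed in the text.
L = np.array([1.0e-6, 1.0e-6])
v = np.array([v_screw(2.0e7, 300.0), v_edge(2.0e7)])
print(v, d_eps_plastic(L, v, dt=1.0e-9, V=(5.0e-6)**3))
```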
2.3. Dislocation Junctions and Strain Hardening
The dislocation motion during plastic flow is rarely steady. Dislocations experience both long-range elastic interactions and close contact interactions. While the former are relatively smooth during the dislocation motion, the latter can be quite jerky and complex, depending on the reacting dislocations' initial line shapes, characters, and trajectories. These close interactions tend to have important consequences for the plastic flow. One of the most important types of close interactions is the formation of dislocation junctions. Junctions are dislocations with lower energy than the reacting ones, and they are often sessile, thus pinning the initially mobile dislocations temporarily. When the local stress is large enough to provide the work needed to break the junctions, the dislocations can be de-pinned from the junctions and set free to move again. To maintain the same plastic flow, the applied stress needs to be increased in order to overcome the junctions. This is one of the main mechanisms of strain hardening in single crystalline materials. Figure 3 is an example of dislocation junctions in bcc single crystals, shown both schematically in Fig. 3(a) and as simulated in Fig. 3(b). In Fig. 3(a), the long dislocation line is a screw dislocation pinned at its two ends by junctions. Since the screw dislocation moves by the kink-pair mechanism, the kinks nucleate at the center and pile up at its ends so that the middle portion of the screw keeps moving forward. When the distance traveled by the screw reaches a critical value X_c, the line tension due to the curvature of the connecting arms between the junction and the screw provides a large enough driving force to break the junctions. The critical distance is determined by X_c = αµb/τ_a, where µ is the shear modulus, b the Burgers vector, τ_a the flow stress resolved on the slip plane, and α the average junction strength parameter. When the junction is broken, the dislocation segment is set free and rejoins the plastic flow.
Figure 3. Dislocation junctions in bcc single crystals. (a) Schematic drawing shows the critical configuration of a junction. (b) A simulated array of dislocation junctions along a screw dislocation.
3. Examples of dd Simulations of bcc Single Crystals at Low Temperatures
The examples given below are generic to most bcc transition-metal single crystals. The first example is a Frank–Read source at low temperatures. The Frank–Read source is a mechanism that multiplies dislocations from a single dislocation, as shown in Fig. 4. At low temperatures, the edges move much more easily than the screws. They travel long distances and leave behind elongated screws. The simulated Frank–Read source resembles what has been observed in in situ TEM experiments.

The next example is a 3d simulation of the yield stresses of bcc at low temperatures. The simulations are performed at a constant strain rate of 10⁻⁴/s under uniaxial loading. The simulation box size is 15 µm in length. The simulation starts with a dislocation configuration of an initial density of 10¹¹/m², with equal densities of screw and edge segments distributed randomly (in both length and location) over all slip systems. The constant strain rate is maintained by monitoring and adjusting the applied stress through

$$\delta\sigma = C(\dot\varepsilon\,\delta t - \delta\varepsilon^p),\qquad(4)$$
where δσ is the increment of the applied stress, C is the Young's modulus along the uniaxial loading orientation, ε̇ is the applied constant strain rate, and δε^p is the total plastic strain increment along the loading orientation during the time step. By doing so, one is able to bring the system to, and maintain it at, a constant strain rate. Thus, the so-called stress-strain curves can be generated. Some examples of stress-strain curves are shown in Fig. 5. These curves all show a distinctive turnover of the flow stress from rising sharply to being nearly flat. When the stresses are low, the edge segments move, but most screws do not. The screws hinder the overall plastic deformation because the edges are stopped by the line tension of their screw neighbors. When the plastic strain rate is below the applied value, the applied stress increases rapidly according to Eq. (4). As the stress increases to a level at which most screw dislocations start to move, the Frank–Read sources start to operate and the mobile dislocation density increases significantly. Then, much larger plastic strains are accumulated in each time step. The simulated strain rate approaches the applied value rapidly. Therefore, the increase of the flow stress slows significantly as it approaches the yield stress value. The yield stress is a macroscopic material property that can be measured in experiments. Its experimental definition is the flow stress at a strain of 1% or 2%. Essentially it is defined as the stress at the onset of plastic deformation, below which elastic deformation dominates. In the dd simulations, it is defined through Eq. (4), that is, as the stress at which the plastic strain rate reaches the applied value. Figure 5(b) shows the yield stresses obtained from the end of the stress-strain curves shown in Fig. 5(a). The comparison with experiments is quite reasonable [8].

Figure 4. Frank–Read sources for bcc at low temperatures. (a) A screw segment initially pinned at one end (the other end extending out of the simulation box). (b) The screw segment moves under an applied stress and creates an edge neighbor at the pinning point, which moves a large distance under the same applied stress. (c) The edge segment moves out of the box and leaves behind two screws with opposite directions. (d) Both screws move, and the pinned screw creates new edge neighbors again. The edge moves in the direction opposite to that in (b). (e) The same process repeats from (a) to (d), with one pair of screw segments generated. (f) As the process continues, the screws move away from the center and the edges move out of the simulation box. Several pairs of elongated screw segments have been multiplied.
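As a sketch (not the actual simulation code), the feedback of Eq. (4) amounts to a few lines in the main time-stepping loop. Here dd_step is a hypothetical stand-in for one dd time step that returns the plastic strain increment along the loading axis, and all parameter values are placeholders:

```python
C = 185e9          # Young's modulus along the loading orientation (Pa)
rate = 1e-4        # applied strain rate (1/s)
dt = 1e-6          # time step (s)

sigma = 0.0
for step in range(100_000):
    d_eps_p = dd_step(sigma, dt)            # hypothetical: advance dislocations
    sigma += C * (rate * dt - d_eps_p)      # Eq. (4): raise stress if flow lags
```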
Figure 5. Stress-strain curves and yield stresses as a function of temperature for single-crystal tantalum. (a) Simulated stress-strain curves at different temperatures (from top to bottom: 160 K, 197 K, 250 K, and 300 K). Both the flow stress and the plastic strain are resolved in the primary slip plane. (b) The simulated resolved yield stresses (filled circles) are extracted from the ends of the stress-strain curves in (a). They are compared with two sets of experimental measurements (filled triangles).
Figure 6. Dislocation configurations of bcc single crystals at low temperatures. (a) A snapshot taken at the start of the simulation, showing the initial relaxed configuration. The dislocation density is 2 × 10¹¹/m². (b) A snapshot taken from the simulation after plastic yielding. The dislocation density is 1.9 × 10¹²/m². The simulation is performed at a constant strain rate of 1/s and at 300 K under a single-slip orientation; the simulation box size is 15 µm. (c) A TEM micrograph of niobium on the $(0\bar{1}1)$ slip plane after plastic deformation at 50 K.
The elongated screw dislocations seen in the Frank–Read source in Fig. 4 are quite characteristic of bcc plastic deformation at low temperatures. Even during the pre-yield stage, the edges start to move at quite low stresses while the screws do not. Thus, the screws are elongated without moving. At yielding, the screws start to move and they continue to multiply long screws, as seen in Fig. 4. Some typical dislocation configurations are shown in Fig. 6. The two snapshots are taken from the constant strain rate simulations of tantalum at 300 K and at a strain rate of 1/s. Also shown in the figure is a TEM micrograph of long screw dislocations in the $(0\bar{1}1)$ slip plane of a niobium crystal after deformation at 50 K.
4. Progress and Outlook
This article introduces the basic aspects that form a dd simulation for bcc single crystals using the lattice based screw-edge type of model. A few examples are given to introduce some basic applications of the method. This model is most convenient for the plastic deformation of bcc single crystals at low temperatures, where the screw dislocations dominate the plastic deformation. By now, various improved or more sophisticated versions of dd codes have been developed or are being developed. As far as the discretization scheme is concerned, there are several approaches, including the dislocation node based [9, 10], the parametric segment based [11], and the off-lattice mixed segments [12] schemes. As for the lattice based approach, a few selected mixed segments have been added to the screw and edge basis for face-centered cubic (fcc) simulations [13]. Most methods are general for either bcc or fcc. Some
methods are extended to other single crystals such as diamond cubic crystals. As for the boundary conditions, the periodic boundary condition has been developed for the simulation of bulk materials [5]. For systems with free surfaces, it has become a rather standard approach to couple a dd code with a finite element method (FEM). An advanced algorithm is being developed to utilize an analytical solution to account for the singular stresses at the intersections of the dislocations with the free surfaces. Another important forefront development of the 3d dd method is to reach high scalability on massively parallel computers. The newly developed ParaDiS code at Lawrence Livermore National Laboratory is among the latest developments in performing large scale dd simulations using thousands of CPUs or more [9]. Progress has also been made in bridging length scales from the atomistic level to the continuum level. For example, atomistic simulations using rather accurate first-principles-based interatomic potential functions have been used to calculate the kink-pair activation enthalpy for bcc single crystals [14]. At the other end, dd simulations are used to provide insight, fit parameters, and validate models that are based on dislocation density at the continuum level [15]. The reader is encouraged to read the related parts of this handbook that discuss the other length scales, such as the atomistic and the continuum.
References

[1] J. Friedel, Dislocations, Pergamon, Oxford, 1964.
[2] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edition, Wiley, New York, 1982.
[3] J. Lepinoux and L.P. Kubin, "The dynamic organization of dislocation structures: a simulation," Scripta Metall., 21, 833–838, 1987.
[4] N.M. Ghoniem and R.J. Amodeo, "Computer simulation of dislocation pattern formation," Solid State Phenom., 3&4, 377–388, 1988.
[5] V.V. Bulatov, M. Tang, and H.M. Zbib, "Crystal plasticity from dislocation dynamics," MRS Bull., 26, 191–195, 2001.
[6] B. Devincre, "Meso-scale simulation of the dislocation dynamics," In: H.O. Kirchner, L.P. Kubin, and V. Pontikis (eds.), Computer Simulation in Materials Science, NATO ASI E 318, Kluwer, Dordrecht, 309–323, 1995.
[7] L.P. Kubin, "The low temperature plastic deformation of bcc metals," Rev. Deformation Behavior Mater., 1, 244–288, 1977.
[8] M. Tang, L.P. Kubin, and G.R. Canova, "Dislocation mobility and the mechanical response of bcc single crystals: a mesoscopic approach," Acta Mater., 46, 3221–3235, 1998.
[9] V. Bulatov, W. Cai, M. Hiratani, M. Tang, J. Fier, G. Hommes, T. Pierce, M. Rhee, K. Yates, and T. Arsenlis, "Scalable line dynamics in ParaDiS," Supercomputing 2004, http://www.sc-conference.org/sc2004/schedule/pdfs/pap206.pdf.
[10] K.W. Schwarz, "Interaction of dislocations on crossed glide planes in a strained epitaxial layer," Phys. Rev. Lett., 78, 4785–4788, 1997.
[11] N.M. Ghoniem, S.-H. Tong, and L.Z. Sun, "Parametric dislocation dynamics: a thermodynamics-based approach to investigations of mesoscopic plastic deformation," Phys. Rev. B, 61, 913–927, 2000.
[12] J.P. Hirth, M. Rhee, and H.M. Zbib, "Modeling of deformation by a 3D multi-pole curved dislocations," J. Comput.-Aided Mater. Des., 3, 164–166, 1996.
[13] R. Madec, B. Devincre, and L.P. Kubin, "From dislocation junctions to forest hardening," Phys. Rev. Lett., 89, 255508, 2002.
[14] J.A. Moriarty, J.F. Belak, R.E. Rudd, P. Soderlind, F.H. Streitz, and L.H. Yang, "Quantum-based atomistic simulation of materials properties in transition metals," J. Phys.: Condens. Matter, 14, 2825, 2002.
[15] A. Arsenlis and M. Tang, "Crystal plasticity continuum model development from dislocation dynamics," Modelling Simul. Mater. Sci. Eng., 11, 1251, 2003.
2.23 ATOMISTICS OF FRACTURE

Diana Farkas¹ and Robin L.B. Selinger²
¹Department of Materials Science and Engineering, Virginia Tech, Blacksburg, VA 24061, USA
²Physics Department, Catholic University, Washington, DC 20064, USA
Atomistic simulation studies of fracture are aimed both at addressing practical problems in materials engineering and at providing basic understanding of fundamental issues in the science of solid mechanics. A practical goal is the development of computational tools to predict the fracture toughness of materials as a function of composition, microstructure, temperature, environment, and loading conditions. Such tools would be extremely useful in the engineering development of novel high-strength structural materials by identifying likely candidate formulations and reducing the number of laboratory trials needed for their testing and validation. As basic research, computer simulation of fracture in single crystals has provided new insight into the stability of crack propagation, the phenomenon of lattice trapping, and the origins of brittle and ductile behavior. Simulation studies of polycrystalline and particularly nanocrystalline solids are increasingly important research tools for investigating fracture and deformation mechanisms in these materials. Large scale simulations that are made possible by the increasing computational power available [1, 2] can shed new light on phenomena that can now be compared with experimental observations. For recent reviews, see Refs. [3, 4].

The accuracy of any atomistic model is primarily determined by the quality of the potential function used to calculate interatomic interactions. Most classical potentials have been fit to reproduce a material's equilibrium bulk properties, which depend mostly on the shape and curvature of the potential near its minimum. By contrast, the behavior of a crack under loading depends sensitively on the mechanical response of the material surrounding the crack tip, where chemical bonds are stretched and bonding geometries are distorted, so that interactions are governed by the shape of the potential far away from its minimum. Surface energy, a property that plays a crucial role in crack stability, may depend on the phenomenon of surface reconstruction, which
is not always well described by classical potentials. Where fracture is ductile, it is important also to consider whether the dislocation core structure and mobility are accurately reproduced by the chosen interatomic potential. In spite of these concerns, the behavior of many metallic materials, particularly those with the FCC structure, can be described with reasonable accuracy by computationally efficient many-body semi-empirical potentials such as the embedded atom method (EAM) [5] and effective medium theory (EMT) [6]. Such potentials can be developed by fitting to first-principles calculations, as described in this volume in the chapter by Mishin, and have been successfully used to model fracture in FCC materials [2]. However, many classical potentials have been shown to be far less accurate in modeling the fracture behavior of other materials, notably semiconductors; see, e.g., Ref. [7]. Multi-scale methods discussed elsewhere in this volume allow the use of more accurate models to describe chemical bonding in a small region near the crack tip, while the rest of the system is modeled using classical potentials and continuum-level models of elastic-plastic behavior; see also Ref. [8].

After selecting an interatomic potential, the next tasks in constructing a fracture simulation include choosing the initial configuration; defining an appropriate loading geometry and boundary conditions; and deciding by what dynamical algorithm the system will evolve. A wide variety of loading and boundary condition schemes have been developed to study either quasistatic or dynamic crack propagation. Consider a three-dimensional simulation block of copper atoms arranged in an FCC structure. If the sample is a single crystal, its crystallographic orientation must be specified, keeping in mind that material properties such as surface energy and dislocation mobility are highly anisotropic and that the orientation of the relevant slip planes will be an important factor. A crack can be introduced into the initial configuration by calculating the displacement field associated with an ideal straight crack in a linear elastic continuum solid under an overall strain and then displacing the atoms accordingly; details for different crack loading geometries can be found in Ref. [9]. Since an atomistic solid behaves non-linearly at large strains, we can anticipate that the atoms in the crack tip region are out of elastic equilibrium and will relax once we set the simulation in motion. An alternative way to introduce a defect is to remove atoms to create a crack-shaped void, or to remove a wedge of atoms to create a notch. Dislocations can also be introduced into the initial configuration using continuum elastic displacements [10]. Again we expect that atoms in the core region will be out of elastic equilibrium and will relax once the simulation is set in motion. If desired, the initial configuration can be constructed as a polycrystal instead of a single crystal. A model grain structure can be constructed using a variety of algorithms, including the Voronoi construction [11]. Alternatively, amorphous configurations can be constructed by melting and then quenching a sample [12].
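For the simplest case, seeding a crack this way can be sketched as follows. The sketch assumes isotropic elasticity, plane strain, and mode-I loading; K_I, µ and ν are placeholder values, and anisotropic solutions are often used in practice:

```python
import numpy as np

mu, nu = 48e9, 0.34           # shear modulus (Pa) and Poisson ratio, placeholders
kappa = 3.0 - 4.0 * nu        # Kolosov constant for plane strain
K_I = 0.5e6                   # stress intensity factor (Pa sqrt(m)), placeholder

def crack_displacements(xy, tip=(0.0, 0.0)):
    """Near-tip displacements of an ideal straight crack; the branch cut
    (the crack faces) lies along theta = +/- pi, i.e., the crack extends in
    the -x direction from the tip."""
    dx = xy[:, 0] - tip[0]
    dy = xy[:, 1] - tip[1]
    r = np.hypot(dx, dy)
    th = np.arctan2(dy, dx)
    amp = K_I / (2.0 * mu) * np.sqrt(r / (2.0 * np.pi)) * (kappa - np.cos(th))
    return np.column_stack((amp * np.cos(th / 2.0), amp * np.sin(th / 2.0)))

# usage: xy holds the in-plane coordinates (m) of all atoms, one row each
xy = np.array([[1e-9, 5e-10], [-1e-9, 1e-10], [-1e-9, -1e-10]])
xy += crack_displacements(xy)
```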
The technique can also be applied to study fracture behavior in amorphous materials such as metallic glass [13]. The most stable amorphous systems are typically mixtures with atoms of two or more different sizes.

To cause the sample to fracture or deform, we apply a strain or stress of specified character, and then maintain or perhaps increase it as the system evolves through some dynamical scheme. We accomplish this goal through the use of constraints, which are typically applied via the system's boundary conditions. Thus the choice of boundary conditions is a crucial part of designing any fracture simulation. Though our simulated sample is by necessity relatively small, our goal in selecting boundary conditions and loading geometry is often to mimic a single isolated crack within an infinite solid. One choice is the use of periodic boundary conditions along one or more directions, but the crack and any other defects in the system will then be affected by interactions with their periodic images, which may considerably complicate the analysis of results. Periodic boundary conditions also introduce topological constraints on extended defects such as grain boundaries and dislocations. Another possibility is using free boundaries with the strain maintained by traction forces applied to atoms along the edges of the system. However, free boundaries introduce image interactions [10] which again complicate the analysis of results; and surface atoms subject to traction forces may simply tear away from the sample if the stress is too large.

Several better geometries have been developed. For studying the quasistatic propagation of a single crack, one may use a block geometry where atoms along the boundaries of the sample are constrained to positions calculated from the continuum elastic displacements associated with an ideal crack under the appropriate loading. Atoms in the interior of the sample, including the crack tip region, are unconstrained. As the applied strain is increased during the course of the simulation, the positions of the constrained atoms are recalculated using the continuum solution. Of all the options available, this geometry gives the closest approximation of an isolated crack moving at low speed in an infinite solid and is ideal for use with molecular statics methods. We discuss this technique in further detail below.

A particularly interesting geometry for dynamic fracture simulation is a crack "treadmill". As a crack propagates through a crystalline solid, broken crystal planes are removed from the trailing end of the sample and defect-free, strain-matched crystal planes are added on the leading end, so that the crack remains always in the middle of the simulation cell [14, 15]. The applied strain may be maintained or adjusted by constraining the top and bottom layers of the crystal to a given separation. This choice of loading conditions allows a dynamic crack to propagate long enough to reach a steady-state speed, even in a relatively small sample. Marder has demonstrated that even extremely small samples give results for crack properties that converge rapidly toward bulk values.

In a dynamic fracture simulation, the moving crack emits phonons which eventually bounce off the system boundaries – or propagate through periodic
boundary conditions – and impinge upon the crack tip. A "stadium" geometry has been developed to isolate a moving crack from these reflections through the use of surrounding damping regions that absorb them [16]. In a "treadmill" geometry, damping is also needed to protect the crack tip from any shock waves or other disturbances that may be generated when crystal planes are added or removed.

Basic atomistic simulation techniques used in the simulation of fracture are of two types: molecular statics and molecular dynamics. Molecular statics (MS) is a technique designed to determine the lowest energy configuration of a system under its applied constraints. Every atom within the simulated system interacts with its neighbors according to an interaction potential, and the presence of a defect typically induces forces. The unconstrained atoms are moved using an iterative relaxation process to bring the system to a minimum energy configuration. Using a conjugate gradient method, the minimization moves the atoms along the direction of the steepest gradient of the energy function, i.e., in the direction of greatest energy decrease. In each iteration step, an atom is displaced in the direction of the resultant force applied by its neighbors, together with a correction conjugate to its previous displacement. The energy is computed after each iteration, and the system is assumed to be at equilibrium when the energy gradient drops to zero, or when the forces on each atom are below a specified value. The number of iterations required to reach equilibrium may vary from several tens to several hundreds. Once the system reaches elastic equilibrium, the applied strain is incremented once more and the system is again relaxed to elastic equilibrium. This procedure is repeated until the process being studied, e.g., the motion of a crack across the sample, is complete.
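A schematic version of this quasistatic loading loop follows. All four helpers are hypothetical stand-ins for one's own code: build_cracked_crystal seeds the crack, crack_has_crossed tests for termination, apply_K_field constrains the border atoms to the continuum solution at stress intensity K, and relax_conjugate_gradient minimizes the energy of the unconstrained interior atoms to a force tolerance.

```python
K, dK = 0.20e6, 0.01e6                  # stress intensity and increment (placeholders)
config = build_cracked_crystal()        # hypothetical initial configuration

while not crack_has_crossed(config):    # hypothetical termination test
    K += dK
    apply_K_field(config, K)            # update the constrained boundary atoms
    relax_conjugate_gradient(config, f_tol=1e-3)   # e.g., max force in eV/A
```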
The molecular statics method has the advantage that it represents quasistatic evolution of the system under a slowly varying strain, but the disadvantage that it does not take account of the effects of finite temperature. No atomic vibrations due to thermal activation are taken into account, and the results obtained characterize the material only at 0 K.

To introduce time, strain rate, and temperature as meaningful variables, we turn to molecular dynamics (MD) simulation techniques, which model the motion of atoms according to Newton's laws with forces arising from both interatomic interactions and applied constraints. For a general introduction to MD methods, see Ref. [17]. In an MD simulation, the initial configuration includes the position, mass, and velocity of each atom. The initial velocities are selected from a random distribution (e.g., the Maxwell–Boltzmann distribution) associated with a specific initial temperature; if a different random velocity distribution is used, the velocities will naturally evolve toward the Maxwell–Boltzmann distribution in the first steps of the simulation. Forces acting on each atom from its neighbors are summed by vector components, and the equations of motion are integrated forward in time. The value of the integration time step Δt must be set small compared to the period of the highest-frequency motion in the system in order to conserve energy and momentum to the desired precision. During each time step both atomic positions and velocities evolve. A variety of numerical methods may be used to carry out the numerical integration [17]; higher-order methods allow the use of a larger time step.

When the system evolves under Newtonian dynamics, the sum of the potential and kinetic energies remains constant. However, an applied constraint that changes over time may do work on the system, e.g., if the applied strain is gradually increased, the system's total energy may increase. Motion of dislocations or propagation of a crack both relieve the applied strain and thus convert potential energy to kinetic energy, causing the system to heat up. This is a realistic effect, but because the strain rates are so high and the system size so small, the temperature increase may be much more extreme than that observed in a relevant experiment. To control the temperature in the simulation, we can place all or part of the system in contact with a model heat bath via the use of a "thermostat" algorithm derived from statistical thermodynamics. In simulating dynamic fracture, it is often useful to avoid applying the thermostat to the crack-tip region so that the crack's stability is not affected. Any heat generated will diffuse toward the thermostat region, flowing down a temperature gradient. If periodic boundary conditions are used, stress or hydrostatic pressure can be similarly controlled through the use of a "barostat," which allows the simulation cell size, and thus the strain, to fluctuate. Both thermostats and barostats may introduce a fictitious degree of freedom and an associated time scale [17]. Care must be taken that those time scales are well separated from the dynamics of any other type of mechanical response under study in the simulation.

Considering that the MD algorithm is directly derived from a classical mechanics treatment of the system, the simulated system is expected to evolve as it would during an experiment on a short (e.g., nanoseconds) timescale. Fracture mechanisms and diffusion mechanisms can be determined by direct observation, without any a priori assumptions. The relative importance of various mechanisms can also be studied as a result of the simulation. Thus, MD simulation is a very powerful technique that produces very detailed information about the simulated system. It is an appropriate tool when the goal is precisely to study the exact nature of the fracture mechanisms. The main drawback of the molecular dynamics technique is the short time scale accessible. For studies of deformation this means that the deformation process is carried out at very high strain rates. Since the strain rate may affect the deformation mechanism, it follows that molecular dynamics, although very useful as a technique, can give only part of the picture. On the other hand, the molecular statics alternative calculates quasi-equilibrium configurations at various stress intensities and is therefore a better model of stable crack growth.
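The combination of velocity-Verlet integration with a regionally applied thermostat can be sketched as follows. This is a schematic of ours, not a prescription from the text: forces(x) is a hypothetical stand-in for the interatomic force routine, the thermostat is a crude velocity rescaling, and the region splitting assumes at least one atom lies outside the protected crack-tip zone of radius r_free.

```python
import numpy as np

kB = 1.380649e-23   # Boltzmann constant (J/K)

def md_step(x, v, m, dt, T_target, tip, r_free):
    """One velocity-Verlet step followed by regional velocity rescaling."""
    f = forces(x)                            # hypothetical force routine
    v += 0.5 * dt * f / m[:, None]
    x += dt * v
    v += 0.5 * dt * forces(x) / m[:, None]
    # thermostat: rescale only atoms farther than r_free from the crack tip
    far = np.linalg.norm(x - tip, axis=1) > r_free
    T_now = np.sum(m[far, None] * v[far]**2) / (3.0 * far.sum() * kB)
    v[far] *= np.sqrt(T_target / T_now)
    return x, v
```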
Table 1 compares and summarizes the basic algorithms associated with MS and MD, as applied to the simulation of fracture by introducing a semi-infinite crack loaded to a given stress intensity K. In both cases the loading is introduced by using displacement fields obtained from elasticity theory for the given value of the stress intensity.

Table 1. Simulation procedure using molecular statics (MS crack simulation) and molecular dynamics (MD crack simulation).

Because of their size and time-scale limitations, atomistic simulations cannot independently tell us everything we need to know about the fracture behavior of a bulk solid. Accurate atomistic simulations need to be bridged with simulations at other length scales. Multi-scale models that address hand-shaking issues between simulation techniques at different length scales are undergoing rapid development and are discussed elsewhere in this volume. However, even without
those techniques, the bridging of length scales can be accomplished in simpler ways by the use of (a) interatomic potentials that are developed based on calculations performed at the quantum theory level and (b) boundary conditions that are based on the continuum theory of fracture mechanics, e.g., using the block geometry discussed above.

We now turn to the topic of interfacing ab initio calculations with molecular statics and dynamics. Classical interatomic potentials describe the energy associated with chemical interactions among atoms of the same or differing species, as a function of atomic positions, in a simplified form that is computationally efficient. One way to bridge length scales is to derive interatomic potentials from first-principles calculations of impurity effects, mostly using ab initio density functional theory in the linear augmented plane wave (LAPW) approach. These calculations can be performed for cluster sizes of 10–50 atoms, and they must be bridged in some way to techniques at a larger scale. Such studies include atomic configurations that deviate significantly from equilibrium. As discussed above, accurate modeling of a crack tip, where strains are large, requires good fitting of the potential not only near its minimum but also for atoms in high-energy configurations. It is therefore important to use a description of the interatomic interaction that, though empirical, is reliable in these off-equilibrium configurations. Experimental information is usually linked to situations that deviate only slightly from equilibrium, mostly in the elastic regime. With the possible exception of the activation energy for diffusion, there are no experimental properties of the bulk material that can provide information on energetic interactions far from equilibrium, so first-principles calculations play a major role in developing accurate interatomic potentials. Potentials derived in this manner yield a good description of pure metals and metallic alloys. In simulation studies of fracture, particular attention must be devoted to accurate values of the surface energies in the various crystallographic orientations, since these properties of the potential directly influence the cleavage planes observed. For the study of ductile/brittle behavior, the other important quantity that needs to be accurately reproduced by the potential is the unstable stacking fault energy.

We must also address the issue of interfacing an atomistic simulation with the continuum through boundary conditions. From a macroscopic point of view, plastic deformation in a crystalline material proceeds by simultaneous sliding on available slip planes. In continuum theory simulations, slip systems on which the resolved shear stress exceeds a certain threshold value are assumed to be active, and the sliding on these slip systems determines the overall plastic deformation of the body. This kind of macroscopic simulation setup requires input parameters such as threshold stresses for the activation of particular slip systems. One goal of atomistic simulations of fracture is to provide
precisely the parameters that the continuum type of simulation requires. Such macroscopic parameters and criteria should be consistent with the plasticity observed in the atomistic simulations, and they can then be used in the continuum theory simulations. In turn, continuum theory in many cases provides the boundary conditions necessary for the atomistic simulations, and the appropriate criteria and parameter values can be found via an iterative process. The basic idea is to find a self-consistent solution in which the criteria for the onset of plasticity used in the macroscopic calculation of the boundary conditions are consistent with the atomistic results. We note that this is a very efficient way to interface atomistic and continuum calculations, one that uses mostly existing code, without the need for handshaking procedures.

Figure 1 shows a schematic of how the continuum solutions are used as boundary conditions in a typical fracture simulation at the atomistic level. The figure indicates the possibility of introducing a buffer region of atoms at intermediate distances from the crack tip. The buffer atoms do not move independently but are adjusted according to the forces they experience from the free atoms. A buffer region is not strictly necessary, but it allows the use of smaller simulation cells without significant effects from the fixed region; if no buffer region is used, larger simulation cells are needed to avoid such effects. The introduction of the crack is performed using the continuum solutions for all the regions indicated in Fig. 1. The role of the continuum solution is two-fold. First, it serves as an initial guess for the relaxed atomic configuration in all the regions of the simulation. Second, it provides the boundary conditions that are kept fixed in the fixed region far from the crack tip.
Figure 1. Illustration of block geometry: free, buffer, and fixed regions under Mode I loading, with the crack front line along the z-axis.
Figure 2. Modeling process to obtain three cracked samples at different temperatures: geometric construction of three nanocrystalline α-iron samples (S1, S2, S3); relaxation of the samples by MS with an EAM potential for α-iron, to obtain a minimum-energy atomistic configuration; introduction of a Mode I crack by MS at 0 K, giving cracked samples at 0 K; Mode I loading; and temperature equilibration of S1, S2, and S3 at 100, 300, and 600 K, respectively, using MD, giving cracked samples at 100, 300, and 600 K.
Since the solution based on elastic fracture mechanics should be valid far from the crack tip, the atomic positions in the fixed region are held fixed during the energy minimization procedure in molecular statics, or during a certain number of time steps in the MD technique. As the simulation proceeds, the loading can be increased; this is accomplished by updating the fixed boundary conditions to those representative of a crack at a higher loading level. In the simplest case, the boundary conditions are given by the solution for the displacement field of a semi-infinite crack in an isotropic medium. These are [18]:
$$u_x = \frac{K}{2\mu}\sqrt{\frac{r}{2\pi}}\,\cos\frac{\theta}{2}\left(2 - 4\nu + 2\sin^2\frac{\theta}{2}\right),$$

$$u_y = \frac{K}{2\mu}\sqrt{\frac{r}{2\pi}}\,\sin\frac{\theta}{2}\left(4 - 4\nu - 2\cos^2\frac{\theta}{2}\right).$$
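A sketch of how these displacement fields might be applied in practice to set up and load the atomistic block; the function name and array conventions are illustrative, with (r, θ) measured from the crack tip and the crack faces along θ = ±π.

```python
import numpy as np

def mode_I_displacements(xy, tip, K, mu, nu):
    """Plane-strain mode I near-tip displacement field (equations above).
    xy: (n, 2) in-plane atom coordinates; tip: (2,) crack-tip position;
    K: stress intensity; mu: shear modulus; nu: Poisson ratio."""
    d = xy - tip
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])    # crack faces at theta = +/- pi
    pref = (K / (2.0 * mu)) * np.sqrt(r / (2.0 * np.pi))
    ux = pref * np.cos(theta / 2) * (2 - 4 * nu + 2 * np.sin(theta / 2) ** 2)
    uy = pref * np.sin(theta / 2) * (4 - 4 * nu - 2 * np.cos(theta / 2) ** 2)
    return np.column_stack([ux, uy])

# Incrementing the load means updating the fixed-region atoms to the field
# of a slightly higher stress intensity, e.g.:
# xy_fixed = xy0_fixed + mode_I_displacements(xy0_fixed, tip, K + dK, mu, nu)
```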
Using this isotropic approximation, simulations of more complicated polycrystalline and multi-phase systems can also be performed with this technique, since the continuum solutions, to first approximation, are independent of the detailed crystal configuration in the atomistic simulation block.

As a case study, we now consider the fracture of nanocrystalline Fe of varying grain size at different temperatures. Fracture of single-crystal Fe has been studied since the early days of computer simulation. Cheung and Yip [19] used MD simulations of α-iron to show that a brittle-to-ductile transition occurs between 200 and 300 K for various crack tip geometries – ⟨100⟩{110}, ⟨100⟩{100} and ⟨110⟩{100}, where the crack lies on the indicated plane with its crack front along the given direction. At low temperature, i.e., at temperatures of 200 K and below, the three crack orientations show brittle behavior, and cleavage cracking is observed on {100} or {110} planes. With increasing temperature above the ductile–brittle transition (DBT) temperature, profuse dislocation emission accompanied by crack tip blunting is observed in the three orientations; ⟨111⟩{110} is identified as one of the activated slip systems. Furthermore, for the ⟨110⟩{100} orientation, additional features of local structural transformation and twinning associated with ⟨111⟩{112} are observed. DeCelis et al. [20] also found brittle cleavage to be the preferred mode of instability for cracks at 0 K. Kohlhoff et al. [21] used a combined finite-element and atomistic model to study {100} and {110} cracks, considering both crack planes with their crack fronts oriented along either ⟨100⟩ or ⟨110⟩ directions. Cracks of both orientations are observed to cleave without dislocation emission; however, cleavage on the {100} plane is found to be easier than on {110}. Shastry and Farkas [22] investigated crack propagation under Mode I loading using molecular statics simulations. Their study involved cracks on {110} planes but with different crack geometries than those considered previously, i.e., the {110} crack plane associated with ⟨100⟩, ⟨110⟩ and ⟨111⟩ crack front directions. These results show that crack propagation in single-crystal samples depends strongly on the particular crystallographic orientation of the crack front and plane. More recently, fracture of nanocrystalline Fe has been studied by Latapie and Farkas [23].

Figure 2 shows the procedure used for the simulation of fracture behavior in a nanocrystalline material. In this procedure, three different samples with varying grain sizes were created using a bcc crystalline structure and a Voronoi construction for the randomly oriented grain structure. Each sample contained 15 grains. These samples were equilibrated using both molecular statics and dynamics to obtain a stable grain boundary structure. The initial crack was then introduced using the equations above, with a stress intensity initially at about the Griffith value, $K_{IC} = \sqrt{2\mu G_{IC}/(1-\nu)}$, with $G_{IC}$ being twice the surface energy.
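As a quick numeric sanity check of this criterion, using the α-iron values quoted in the following paragraph (a sketch; the unit conversion is standard but worth verifying):

```python
import numpy as np

gamma_s = 0.089              # surface energy of alpha-Fe, eV/A^2 (from text)
G_IC = 2 * gamma_s           # critical energy release rate, eV/A^2
nu = 0.293                   # Poisson ratio
mu = 0.699                   # shear modulus C44, eV/A^3

K_IC = np.sqrt(2 * mu * G_IC / (1 - nu))
# 1 eV/A^(5/2) = 1.602e-19 J / (1e-10 m)^(5/2) = 1.602 MPa m^(1/2)
print(K_IC, K_IC * 1.602)    # ~0.59 eV/A^(5/2), ~0.95 MPa m^(1/2),
                             # matching the rounded values quoted in the text
```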
For the case of Fe, with $G_{IC} = 2\gamma_s = 2 \times 0.089 = 0.178$ eV/Å², ν = 0.293, and μ = C₄₄ = 0.699 eV/Å³, we obtain $K_{IC} = 0.6$ eV/Å^{5/2} = 0.96 MPa·m^{1/2}. The cracked samples are then equilibrated at three different temperatures (100, 300 and 600 K) using MD for 2000 time steps, with a step size of 8 × 10⁻¹⁵ s. With the same technique, the fracture process in each sample was simulated at the three temperatures by incrementally loading the semi-infinite mode I crack, starting from a stress intensity value slightly below the Griffith criterion for the α-iron single crystal. We let the system evolve for 1000 MD steps between loading increments, giving an overall simulation time of 200 ps. Since the MD technique follows the actual forces on the atoms as they migrate, the fracture mechanisms can be determined by direct observation, without a priori assumptions. As the simulation progresses and the stress intensity is increased, the crack begins to advance, and we follow it for stress intensities up to three times the Griffith value. The simulated strain rate is very high compared to real experiments, so to check for strain-rate effects, MS simulations of the same samples are conducted and compared with the MD results at low temperature. This procedure verifies that the same fracture and deformation mechanisms occur using the conjugate gradient technique, helping to rule out effects of the unrealistically high strain rates typical of molecular dynamics.

Visualization of the results is an important consideration; e.g., the nucleation and motion of lattice defects and the propagation of cracks need to be clearly identified. The standard techniques for visualizing the atomic configurations use coloring schemes related to the atomic environment or energetics. In the example of the simulation of fracture in nanocrystalline Fe, the color scheme denotes the coordination number of each atom: darker shades of gray represent atoms with coordination numbers different from eight, the bulk bcc value. The fully three-dimensional samples can then be visualized in slices perpendicular to the crack front. Each slice can be rendered using any molecular visualization package that takes as input the atomic coordinates, atom types, and at least one extra parameter to control the color and/or size of the atomic symbols. The results of visualizing the fracture of nanocrystalline Fe using this technique are shown in Fig. 3, for a sample of 12 nm grain size loaded up to three times the Griffith stress intensity at different temperatures. The results clearly show the increasing ductility associated with increasing the temperature of the simulation.
Figure 3. Temperature dependence of intergranular fracture, at 100, 300 and 600 K.
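A minimal sketch of the coordination-number coloring described above; the 2.7 Å cutoff, chosen between the first- and second-neighbor shells of bcc Fe (a ≈ 2.87 Å), is an assumption, and SciPy's neighbor search is used for brevity.

```python
import numpy as np
from scipy.spatial import cKDTree

def coordination_numbers(pos, cutoff=2.7):
    """Count neighbors within the cutoff; bulk bcc atoms should get 8.
    Atoms with other values sit at grain boundaries, crack surfaces, or
    defect cores, and are drawn in darker shades in the visualization."""
    tree = cKDTree(pos)
    cn = np.zeros(len(pos), dtype=int)
    for i, j in tree.query_pairs(cutoff):
        cn[i] += 1
        cn[j] += 1
    return cn
```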
Quantitative evaluation of the simulation data allows comparison between simulation and experiment. One important quantity is fracture resistance. By plotting the applied stress intensity as a function of crack tip position, one obtains crack resistance curves such as those used in continuum fracture mechanics. These curves show how the crack advances as increasingly higher loading is applied, including the effects of the plastic deformation processes that occur simultaneously with crack advance. They are particularly useful for studying the effects of various parameters of the crack configuration and loading conditions, such as crystallographic orientation, temperature, or grain size in polycrystalline simulations. In the example of nanocrystalline Fe, this technique can be used to study the effect of grain size on fracture resistance at 100 K. The results are shown in Fig. 4: fracture resistance increases with decreasing grain size.

In coming years, we anticipate significant new results in the atomistic simulation of fracture in metals and metal alloys using semi-empirical potentials. Current computing power allows the simulation of polycrystalline materials with grain sizes up to and above the 30–40 nm range. This is an important accomplishment because metals with grain sizes larger than these values begin to behave much like their macroscopic counterparts. Larger sample sizes will allow researchers to examine not only dislocation emission from the crack tip but also the subsequent evolution of the plastic zone surrounding it, and will improve the ability to predict fracture mechanisms as a function of composition, microstructure, and loading geometry. Faster computers will also allow MD simulation of fracture at slower, more realistic strain rates. The predictive value of such large simulations will be limited primarily by the accuracy of the empirical interatomic potentials employed.
Figure 4. Fracture resistance curves (applied stress intensity factor K_I versus crack tip position) from simulation studies of nanocrystalline Fe with two different average grain sizes (9 and 12 nm) at a temperature of 100 K.
Further improvement in available computing resources will make it possible to go beyond semi-empirical potentials; these improvements will include the simulation of fracture in small samples using purely first-principles techniques. Many of the same computational methods and geometries now used with semi-empirical potentials will be useful in that context as well. Multi-scale methods also show enormous promise for overcoming the accuracy limitations of semi-empirical potentials without the computational requirements of an exclusively first-principles calculation. Techniques for direct coupling between classical atomistic simulation and first-principles techniques are already being developed for semiconductors and could also be applied to metals. Other multiscale techniques, such as the quasicontinuum method described in this volume in the chapter by Miller, will likely serve as important tools to couple atomistic simulations with larger-scale modeling techniques.
References
[1] F.F. Abraham, "Very large scale simulations of materials failure," Phil. Trans. R. Soc. Lond. Ser. A – Math. Phys. Eng. Sci., 360, 367–382, 2002.
[2] S.J. Zhou, P.S. Lomdahl, A.F. Voter, and B.L. Holian, "Three-dimensional fracture via large-scale molecular dynamics," Eng. Fract. Mech., 61, 173–187, 1998.
[3] M. Marder, "Molecular dynamics of cracks," Comput. Sci. Eng., 1, 48–55, 1999.
[4] R.L.B. Selinger and D. Farkas (eds.), "Atomistic theory and simulation of fracture," MRS Bull., 25(5), 2000.
[5] M.S. Daw and M.I. Baskes, "Semiempirical, quantum mechanical calculation of hydrogen embrittlement in metals," Phys. Rev. Lett., 50, 1285–1288, 1983.
[6] K.W. Jacobsen, J.K. Norskov, and M.J. Puska, "Interatomic interactions in the effective-medium theory," Phys. Rev. B, 35, 7423–7442, 1987.
[7] J.A. Hauch, D. Holland, M.P. Marder, and H.L. Swinney, "Dynamic fracture in single crystal silicon," Phys. Rev. Lett., 82, 3823–3826, 1999.
[8] F.F. Abraham, N. Bernstein, J.Q. Broughton, and D. Hess, "Dynamic fracture of silicon: concurrent simulation of quantum electrons, classical atoms, and the continuum solid," MRS Bull., 25(5), 27–32, 2000.
[9] B. Lawn, Fracture of Brittle Solids, Cambridge University Press, Cambridge, U.K., 1993.
[10] J.P. Hirth and J. Lothe, Theory of Dislocations, John Wiley & Sons, New York, 1992.
[11] D. Farkas, H. Van Swygenhoven, and P.M. Derlet, "Intergranular fracture in nanocrystalline metals," Phys. Rev. B, 66, 060101(R), 2002.
[12] P. Keblinski, D. Wolf, and S.R. Phillpot, "Molecular dynamics simulation of grain-boundary diffusion creep," Interface Sci., 6, 205–212, 1998.
[13] M. Falk, "Molecular-dynamics study of ductile and brittle fracture in model noncrystalline solids," Phys. Rev. B, 60, 7062–7070, 1999.
[14] D. Holland and M. Marder, "Ideal brittle fracture of silicon studied with molecular dynamics," Phys. Rev. Lett., 80, 746–749, 1998.
[15] R.L.B. Selinger and J.M. Corbett, "Dynamic fracture in disordered media," MRS Bull., 25(5), 46–50, 2000.
[16] S.J. Zhou, P.S. Lomdahl, R. Thomson, and B.L. Holian, "Dynamic crack processes via molecular dynamics," Phys. Rev. Lett., 76, 2318–2321, 1996.
[17] D. Rapaport, The Art of Molecular Dynamics Simulation, 2nd edn., Cambridge University Press, Cambridge, U.K., 2004.
[18] G.C. Sih and H. Liebowitz, in: H. Liebowitz (ed.), Fracture: An Advanced Treatise, vol. II, Academic Press, New York, pp. 69–189, 1968.
[19] K.S. Cheung and S. Yip, "Brittle–ductile transition in intrinsic fracture behavior of crystals," Phys. Rev. Lett., 65, 2804–2807, 1990.
[20] B. DeCelis, A.S. Argon, and S. Yip, "Molecular dynamics simulation of crack tip processes in alpha-iron and copper," J. Appl. Phys., 54, 4864–4878, 1983.
[21] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, "Crack propagation in b.c.c. crystals studied with a combined finite-element and atomistic model," Philos. Mag. A, 64, 851–878, 1991.
[22] C. Shastry and D. Farkas, "Molecular statics simulation of fracture in α-iron," Model. Simul. Mater. Sci. Eng., 4, 473–492, 1996.
[23] A. Latapie and D. Farkas, "Molecular dynamics investigation of the fracture behavior of nanocrystalline α-Fe," Phys. Rev. B, 69, 134110, 2004.
2.24 ATOMISTIC SIMULATIONS OF FRACTURE IN SEMICONDUCTORS

Noam Bernstein
Naval Research Laboratory, Washington, DC, USA
1. Introduction
Semiconductors are the materials that underlie nearly all modern electronics. They include elemental solids, such as silicon and germanium, as well as compounds such as gallium arsenide and silicon carbide. Since their main use is in electronic applications, semiconductors are not usually thought of as structural materials. Nevertheless, there are important reasons, both technological and scientific, for studying the mechanical properties of semiconductors. The developing field of micro-machines, from micro-electromechanical systems (MEMS) to nanotechnology, relies on fabrication techniques developed for electronic devices to make microscopic mechanical systems. To a large extent it is the link between these fabrication techniques, including deposition, masking, and etching, and the materials that has driven the use of semiconductors as structural components. On a more fundamental level, the ability to fabricate extremely pure and nearly defect-free samples makes semiconductors excellent model systems for studying the physics of fracture. In this section I will attempt to give an overview of the ways in which atomistic simulations have been applied to fracture in semiconductors, using a number of illustrative examples.

Fracture is one possible failure mode of materials under mechanical load [1]. It occurs when a crack grows, eventually extending entirely through a sample, causing it to fail. In brittle fracture the crack tip is sharp, and the geometry causes a concentration of stress at the tip. In a continuum description of the solid, and in the limit of an infinitely sharp crack, the stress concentration becomes singular, and the stress field diverges at the crack tip. This stress concentration makes the material ahead of an existing crack most susceptible to failure, and causes the behavior of the material to be dominated by preexisting cracks. Since the
amount of stress concentration, i.e., the coefficient of the singular term [2], is correlated with the length of the crack, brittle materials tend to break catastrophically: once a crack has begun to propagate, it becomes longer, increasing the stress concentration and making it even more likely that it will continue to propagate.

The presence of a singularity in the continuum elasticity solution of the stress field would naively indicate that the stress at the crack tip is infinite. While in a real material this singularity would be cut off by the discrete nature of the atomic lattice, even the continuum problem has an elegant solution. Griffith set up an energy balance equation, comparing the amount of elastic energy released by crack extension with the amount of surface energy needed to generate the newly exposed crack surface [1, 3]. From this equation emerged the Griffith criterion for brittle fracture,

$$G \geq 2\gamma_s, \qquad (1)$$

where G is the elastic energy release rate and γ_s is the surface energy. The elastic energy release rate is generally quadratic in the applied load (stress or strain). When the applied load is large enough, the elastic energy released by the lengthening crack will overcome the energy cost of making new surface, and the crack will be unstable with respect to growth.
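As a concrete illustration of Eq. (1), for the textbook case of a center through-crack of half-length a in an infinite linear-elastic plate under remote tension σ, the energy release rate is G = πσ²a/E′, so the criterion gives a critical stress σ_c = √(2γ_s E′/(πa)). A minimal sketch, with illustrative Si-like numbers that are assumptions, not values from this chapter:

```python
import numpy as np

def griffith_critical_stress(gamma_s, E, a, nu=0.0):
    """Critical remote tension for a center through-crack of half-length a:
    G = pi * sigma^2 * a / E' >= 2 * gamma_s.  E' = E / (1 - nu^2) in
    plane strain (pass nu > 0); E' = E in plane stress."""
    E_eff = E / (1.0 - nu ** 2) if nu else E
    return np.sqrt(2.0 * gamma_s * E_eff / (np.pi * a))

# Assumed values: gamma_s ~ 1.2 J/m^2, E ~ 160 GPa, crack half-length 1 um.
print(griffith_critical_stress(1.2, 160e9, 1e-6, nu=0.22) / 1e6, "MPa")  # ~360
```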
While this criterion is appealing, it is not necessarily valid in practice. Because it is based on a conservation-of-energy argument, it is most likely a good lower bound on the critical load. However, Griffith's derivation completely ignores any atomistic details of the bond-breaking process. The lack of accurate, independent ways of measuring the surface energy experimentally makes it difficult to test the Griffith criterion, but simulation remains a possibility. Since semiconductors seem to be such ideal brittle materials, atomistic simulations of fracture can be used to test the Griffith criterion.

Another possible failure mode for a material under mechanical load is plastic deformation. The material can deform irreversibly by moving dislocations that allow the sample to relieve some of the applied stress [4]. The stress concentration at the tip of a crack tends to enhance the probability that dislocations will nucleate and move near the crack tip. However, unlike brittle fracture, plasticity dissipates a lot of energy, reduces the stress concentration by making the crack blunt, and the dislocations can shield the crack tip from the applied stress. This type of ductile behavior, typical of metals, leads to robust structural materials: the initiation of failure does not necessarily extend catastrophically through the entire sample, and a lot of energy is dissipated in the process of deforming the material [5].

The issue of brittle fracture, ductility, and the brittle-to-ductile transition (BDT) is in fact a very important aspect of semiconductor fracture. Many materials, including some of great technological interest as advanced structural materials, undergo a transition from brittle fracture at low temperature to
ductile behavior at higher temperatures. Examples include steels, intermetallic alloys such as TiAl₃, and semiconductors such as Si, Ge, and SiC. Since brittle materials tend to have low fracture toughness and to fail catastrophically, one possible route to improving their technological usefulness is to induce a transition to ductile behavior, if it can be done without compromising the strength of the material. Silicon has become a model system for the BDT because in silicon the transition is extremely sharp [6]. The competition that controls brittleness and ductility is whether the material near the crack is more likely to cleave or to emit and propagate dislocations. Because both brittle and ductile failure are controlled by atomic-scale processes such as bond breaking and dislocation nucleation, atomistic simulation is nearly the only tool that can provide an atomic-resolution view of what is happening at the crack tip during fracture. One question that we can address is whether the Griffith criterion for brittle fracture is valid given the discrete, atomistic nature of matter. Another is the nature of the microscopic processes that occur at the crack tip as the material fractures or deforms plastically. We can also address matters of technological importance, such as the development of new, stronger materials, or the tailoring of the failure properties of existing materials. To carry out a simulation we need both a procedure for computing some property that we can relate to an experimentally observed macroscopic property, and a procedure for computing the interaction between the atoms in the material.

The structural properties of semiconductors are controlled by their atomic composition and structure. Essentially all semiconductors, elemental or compound, consist of a network of atoms joined by covalent (or mixed covalent-ionic) bonds. These covalent bonds typically involve sp³ hybrid orbitals that favor tetrahedral coordination, leading to open lattices such as the diamond structure (for elemental semiconductors) or its two-component analog, the zinc-blende structure. The covalent bonds are stiff with respect to deformation of the angles between the bonds, leading to a strong resistance to shear in the lattice. The directionality of the bonds leads to a large energy cost for forming the defects that allow for plasticity, such as dislocations. This suppression of dislocations makes most semiconductors brittle, at least at low temperatures.

The nature of the bonding in semiconductors also affects the methods that can be used to simulate them. The basic ingredient that underlies all atomistic simulation is a method for computing the interactions between the atoms. Since covalent bonds are inherently a manifestation of the quantum-mechanical nature of the electrons, approaches that treat the quantum mechanics explicitly have been quite successful. These approaches range from first-principles methods such as density functional theory (DFT) [7] to faster approximations such as the tight-binding (TB) approach [8–10]. Many interatomic potentials that approximate, but do not explicitly describe, the quantum mechanics are also
available for semiconductors. The potentials typically include bond-stretching and bond-bending terms [11]. Given a method for computing the interaction, we can compute the energy of a particular configuration of atoms and the forces on the atoms. However, using this capability to compute fracture properties is still quite challenging. In a real material, the fracture process is complex and spans a wide range of length scales. An elastic field that can extend over an entire macroscopic sample is focused, through the stress singularity, at the crack tip. The progress of the crack can be affected by many factors, including the crystal lattice, geometry, impurities, and defects. Clearly a single simulation describing this range of phenomena is too computationally expensive to be feasible, at least with the more accurate quantum-mechanical simulation methods. Thus, a number of approaches are used to simplify the problem. These can be roughly separated into two classes: idealized models and direct simulations, either quasistatic or dynamic.
2. Idealized Models
The simplest approach to simulating fracture properties is to ignore all of the details and develop a highly idealized model that relates true fracture properties to some quantity that is easier to compute atomistically. The combination of model and atomistic calculation has several benefits for treating the range of length and time scales involved in fracture processes. In and of itself, an atomistic simulation is limited to systems with sizes ranging from a few hundred atoms for a first-principles method to a few million for an empirical interatomic potential. This translates to a size of about 10–500 Å. It is also governed by the time scale over which a single atom moves, comparable to the fastest vibrational mode, about 10⁻¹³ s in silicon. Plugging the results of the atomistic simulations into the idealized model connects these tiny length and time scales to a description of processes that occur in macroscopic systems on experimental time scales.

A number of approximate calculations of fracture properties of semiconductors have used empirical relations between elastic moduli and some phenomenological measure of hardness. Usually this has been the Knoop or Vickers hardness, which is defined as the apparent pressure required to indent a material with a diamond indenter of a particular shape [12]. This type of relation was implied in the classic paper by Liu and Cohen predicting that cubic carbon nitride [13] can exist, and might be harder than the hardest known substance, diamond. While it was not the central point of the paper, a correlation between low compressibility (i.e., high bulk modulus) and high measured hardness was the main reason for the interest in this material. However, the shear
modulus is actually better correlated with hardness, and the shear modulus of cubic carbon nitride is not as high as that of diamond. While these phenomenological approaches have the advantage that they are among the most computationally inexpensive ways of computing anything related to fracture, they are at best approximate. The elastic moduli are near-equilibrium properties that represent the curvature of the potential energy surface for small deformations. Fracture, on the other hand, is a process that is far from equilibrium, and depends on the unstable part of the potential energy surface where bonds are broken or irreversible deformation occurs.

An alternative to the phenomenological approach is to use a microscopically motivated model to determine some quantities that are feasible to compute. The Griffith criterion (Eq. (1)) is probably the first example of this approach. Using an analytical solution of continuum theory, the macroscopic fracture toughness (i.e., the energy dissipated during the growth of the crack) is related to an essentially microscopic quantity, the surface energy. First-principles calculations of the surface energy of semiconductors are now routine, so they can be used to predict fracture properties by assuming the validity of the Griffith criterion. However, since there is no reliable independent way of measuring the surface energy, checking the accuracy of the prediction is impossible.

Another simple approach for the calculation of fracture properties neglects the complexities of fracture mechanics and computes instead an "ideal strength". In a simplified picture, this is the peak stress that a uniform system experiences as a function of applied strain, typically uniaxial tension or simple shear. The ideal strength is relatively easy to compute, even with an accurate first-principles approach. It requires that the energy and stress of a bulk system (i.e., a small unit cell with periodic boundary conditions) be computed as a function of strain for a range of applied strains. A minor complication is caused by the fact that semiconductors have complex lattices, i.e., lattices with more than one atom per unit cell, so the positions of the different atoms in the unit cell have to be relaxed at each applied strain.
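A minimal sketch of such an ideal-strength scan; energy_fn and relax_internal stand for user-supplied wrappers around whatever total-energy method is used (DFT, TB, or a potential), and the finite-difference stress estimate is an illustrative simplification.

```python
import numpy as np

def ideal_strength_scan(energy_fn, relax_internal, cell0, frac0, strains):
    """Apply a series of uniaxial strains along the third lattice vector,
    relax the internal (fractional) coordinates at each step, and estimate
    the stress as (1/V0) dE/d(strain) by finite differences; the peak of
    the stress curve approximates the ideal strength."""
    volume0 = abs(np.linalg.det(cell0))
    energies = []
    frac = frac0.copy()
    for eps in strains:
        cell = cell0.copy()
        cell[2] = cell0[2] * (1.0 + eps)     # stretch the c lattice vector
        frac = relax_internal(cell, frac)    # relax atoms within the cell
        energies.append(energy_fn(cell, frac))
    stress = np.gradient(energies, strains) / volume0
    return np.array(energies), stress
```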
Figure 1. Plot of the calculated stress vs. applied tensile strain for three semiconducting materials (After Fig. 2 in D. Roundy and M.L. Cohen, Phys. Rev. B, 64, p. 212103, 2001. Reproduced with permission).
Figure 1 shows an example of this technique applied to three elemental semiconductors: Si, Ge, and C. The stress in the simulated system over a range of applied strains shows a linear rise in the elastic regime, followed by inelastic behavior and finally a maximum in the stress the material can support. The results show the trend of decreasing strength from C to Si to Ge, consistent with experiment. The ideal shear strengths of Si and Ge are quite low relative to their ideal tensile strengths. Since shear strength is (qualitatively) characteristic of resistance to dislocation formation and motion, and tensile strength is (qualitatively) characteristic of resistance to cleavage, this relation indicates that Si and Ge might be expected to be ductile. Since both Si and Ge are brittle, at least at low temperatures, the prediction of ductility shows the limitations of the simplified model that underlies the ideal-strength approach. The quantitative values of the critical stress and strain in the simulation are also much higher than ever observed experimentally. Many complications present in real cracks might explain these discrepancies. The stress field at the tip of the crack is closer to biaxial tension than to uniaxial tension. The stress field is also highly inhomogeneous and anisotropic. The process of crack propagation by cleavage (opening a gap between two particular atomic planes) is not quite the uniform tension that ideal tensile strength measures, and dislocation nucleation depends on the slip between two atomic planes, which is not the same as uniform shear. All of these inhomogeneities are neglected by the ideal-strength calculations, but gauging their significance is not easy. Reliable experimental numbers that are accurate and free from material imperfections are non-existent, so more sophisticated simulations are currently the only practical approach.

A more sophisticated idealized model that has played an important role in atomistic simulations of fracture in semiconductors is a criterion for dislocation nucleation analogous to the Griffith criterion for brittle fracture. The Rice criterion, as it is known, is based on an expression for the critical load for dislocation nucleation [14]. The critical load is computed by combining the continuum elasticity solution for a loaded crack with an atomistic expression for the energy of a solid as a function of slip between two atomic planes. When the critical load for dislocation nucleation is lower than the critical load for cleavage, the material is ductile: it will nucleate dislocations before it cleaves, and these dislocations will shield the crack tip. The essential ingredients for this calculation are the elastic constants and surface energy (needed for the Griffith criterion) and the unstable stacking energies, which are the saddle-point energies of the so-called γ-surface. The γ-surface is the energy of the material as a function of slip. It can be computed by taking the energy of an infinite crystal (represented by a unit cell with periodic boundary conditions) separated into two halves by a plane, and translating one half with respect to the other, as a function of the relative translation vector. One interesting aspect of the γ-surface is that it is a theoretical construct. There is no way to deform
a system experimentally and measure its γ-surface or its unstable stacking energies. Since all three ingredients in the Rice criterion can be computed using DFT [7], it is possible to use this reliable and accurate method to study the complex interplay between brittleness and ductility. The geometry of the diamond-structure lattice (common to elemental semiconductors) makes the details of the calculation complex. The dominant orientations for both cleavage and slip are high-symmetry (111) planes of the lattice. There are two inequivalent places to cut the lattice (Fig. 2). For cleavage, one of these cuts is much higher in energy, and therefore irrelevant. For slip, on the other hand, the saddle-point energies are comparable (although the energy maxima are not). The main conclusion is that in Si the unstable stacking energies are large enough that Si should be brittle according to the Rice criterion. This conclusion is consistent with the experimental observations of silicon as a brittle material at low temperatures. However, it contradicts Rice's original rough estimate, used before the unstable stacking energy was calculated.

It is unclear how to relate Rice's criterion and calculations of unstable stacking energies to the most interesting aspect of brittleness and ductility in semiconductors: the BDT. Rice's criterion in its original form is a zero-temperature theory that neglects kinetics and finite-temperature effects, while the BDT is inherently a finite-temperature phenomenon. A number of theoretical explanations for the BDT have been proposed, invoking thermally activated dislocation motion, a thermally activated shift from immobile to mobile dislocations, and collective effects of dislocation–dislocation interactions on nucleation or mobility. A detailed discussion of this topic is beyond the scope of this chapter, since the BDT theories are complicated and controversial. However, it is clear that advances in simulation methods are making it possible to reliably and accurately compute the quantities that enter into these BDT theories. Perhaps future work based on atomistic calculations will help settle the mechanism for the BDT in Si and other semiconductors.
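A minimal sketch of the γ-surface scan described above; energy_fn is a user-supplied total-energy call (e.g., a DFT wrapper), and the slip grid, plane position, and periodic-cell handling are illustrative assumptions.

```python
import numpy as np

def gamma_surface(energy_fn, cell, frac, plane_z, shifts, area):
    """Energy per unit area as a function of rigid slip between the two
    half-crystals above and below fractional height plane_z. The saddle
    points of the resulting surface are the unstable stacking energies
    that enter the Rice criterion."""
    e0 = energy_fn(cell, frac)
    upper = frac[:, 2] >= plane_z            # atoms in the translated half
    surface = {}
    for shift in shifts:                     # fractional in-plane (dx, dy)
        f = frac.copy()
        f[upper, :2] = (f[upper, :2] + shift) % 1.0
        surface[tuple(shift)] = (energy_fn(cell, f) - e0) / area
    return surface
```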
3. Quasistatic Direct Numerical Simulation
A complementary approach to the use of idealized models is direct numerical simulation. As discussed in more detail in Chapter 1 [15, 16], it is possible to use energy and force calculations to follow the trajectory of a system of atoms. If the length- and time-scale issues can be adequately addressed, this type of simulation can give us the most direct view of the fracture process. The greatest difficulty in carrying out direct numerical simulations is in developing a method for computing the energy of the system: the method must be accurate enough to capture the important physics while remaining fast enough to be practical. Three possible approaches have been tried. One is very accurate first-principles or other electronic structure methods, applied to very
Figure 2. Schematic of the shuffle cut plane and γ-surface (panel (a)) and glide cut plane and γ-surface (panel (b)). Note the different energy scales for the two γ-surfaces. (After Fig. 1 in E. Kaxiras and M.S. Duesbery, Phys. Rev. Lett., 70, p. 3742, 1993. Reproduced with permission.)
tiny systems, only a few tens of atoms. Another is simulations using interatomic potentials, which can easily be applied to 10⁴ atoms or more but, as discussed in more detail below, have serious problems with accuracy that can lead to qualitatively wrong results. The third is a coupling of the two methods, using an accurate method near the tip of the crack and an interatomic potential far from the crack.

To use a first-principles method such as density functional theory to directly simulate fracture, both the length and time scales must be minimized. In practice this means that the simulated system is made as small as possible, typically about 100 atoms. The time scale can be removed altogether by making the simulation quasistatic. This means that instead of simulating the dynamics of the atoms by integrating Newton's equations of motion (i.e., doing molecular dynamics [16]), the system is allowed to relax toward the minimum-energy atomic positions at each applied load [15]. The energy minimization can find stable configurations, but the path the system takes from the initial state to the final state does not necessarily have physical meaning. Unless directly manipulated, energy minimization methods are usually designed never to go over energy barriers. This constraint can dramatically change the way kinetics, for example the competition between two mechanisms with comparable energy barriers, affect the simulated crack propagation process.

The energy minimization approach can be used to study the limits of the Griffith criterion in predicting the crack propagation process. One possible effect of the discrete nature of the atomic lattice is inherent in the localization of the bonding to pairs of atoms. The Griffith criterion treats the bonding energy as a uniform surface energy density. The connectivity of the covalent-bond network in the semiconductor makes this energy density, in so far as it is even well defined, inhomogeneous. When two atoms that are directly bonded are being separated by the propagation of the crack, the energy cost, related to bond stretching, is large. When the hypothetical continuum crack tip is propagating through a region that does not cross any covalent bonds, atoms that are moving apart are connected through a chain of bonds, and the opening of the crack faces can be accommodated primarily by bond bending, which is less energetically costly than bond stretching. Although the average energy cost for extending the crack surface is the same as the continuum value, there may be "lattice trapping", where energy barriers associated with breaking each interatomic bond impede the propagation of the crack [17]. Another aspect of fracture in a real material that is neglected by the continuum description is the orientation of the crack front. In the continuum theory that underlies Griffith's work the only relevant parameter is the surface energy. In a crystal the surface energy is orientation dependent, and tends to be minimized for high-symmetry crystal faces such as the (111), (110), or (100), favoring cracks that create such high-symmetry surfaces. The geometry of the network of bonds that is being broken depends on the orientation of the crack front. Although
this crack-front orientation dependence is not captured by the Griffith criterion, it may affect the true critical load. To carry out the calculation, a small system of bulk Si surrounding the tip of a planar crack is deformed according to the continuum elasticity solution for the displacement field around a crack tip. A visualization of this configuration is shown in Fig. 3. Because the experimental system is many orders of
Figure 3. An image of one of the crack-tip systems simulated by energy minimization from Fig. 1 in Ref. [18]. Grey circles indicate Si atoms and white circles indicate H atoms passivating broken bonds. Atoms outside the dotted-line region are constrained to the continuum elasticity displacement-field positions. (After Fig. 1 in R. Pérez and P. Gumbsch, Acta Mater., 48, p. 4517, 2000. Reproduced with permission.)
magnitude larger, the edges of the simulated system are unphysical. To prevent the electronic surface states that can form on these fictitious free surfaces, each broken bond must be passivated with an H atom. To apply the correct loading, two layers of Si atoms at the boundary are kept fixed at the continuum displacement-field positions. Observing the system as it relaxes reveals whether the crack propagates at each applied load. The behavior as a function of load indicates the minimum critical load for crack propagation (at or above the Griffith criterion) and the maximum critical load for crack healing (at or below the Griffith criterion). According to the Griffith criterion, where there are no energy barriers and no hysteresis, these two loads would be the same. The deviations of these two loads from the Griffith critical load are a measure of the lattice trapping. The variations in the critical loads as a function of crack-front orientation (but always exposing the same surface with the same surface energy) quantify the cleavage anisotropy.

It is known experimentally that cracks that open (111) and (110) surfaces can propagate in a stable manner in Si, although the behavior of the (110) cracks depends on the propagation direction. Simulations of all of these crack geometries show significant deviation from the Griffith criterion predictions, revealing the importance of the discreteness of the atomic lattice. The critical loads for crack propagation are between 20 and 35% higher in applied stress (i.e., 40–70% higher in G, which is quadratic in applied stress) than the Griffith criterion prediction. This deviation in and of itself shows that significant energy barriers will affect crack propagation. The differences between the (111) cracks, which are isotropic with respect to propagation direction, and the (110) cracks, which are anisotropic, are also explained by the simulation results. Cracks that expose (111) surfaces show the least lattice trapping. Cracks that expose (110) surfaces, on the other hand, show more lattice trapping, and the amount depends on the crack-front direction. The crack-front direction that corresponds to experimentally observed propagating cracks (a [001] front) shows a moderate amount of lattice trapping, while the direction where no stable crack propagation is observed experimentally (a [1̄10] front) shows the most. This orientation anisotropy is attributed to the geometry of the bonds that are just ahead of the crack. For the directions with low and moderate lattice trapping, the load is concentrated on just one bond ahead of the crack, and the bond-breaking process is continuous: as the load is increased, the length of each bond increases smoothly from strained bulk-like to broken. The high level of lattice trapping for the [1̄10] crack front is caused by the distribution of the load between two bonds, and the bond-breaking process is discontinuous: the lengths of bonds ahead of the crack increase slowly until a critical load where the bonds snap open (Fig. 4).

This example of quasistatic simulations of fracture shows the power of atomistic simulations to reveal details of the fracture process. The simulation
Figure 4. Plot of bond distance for each bond along the crack propagation direction, for a crack on the (110) plane with a [1̄10] crack front at different applied loads. The loads are scaled to the Griffith criterion critical stress intensity factor. (After Fig. 5 in R. Pérez and P. Gumbsch, Acta Mater., 48, p. 4517, 2000. Reproduced with permission.)
results can be used in different ways. The simplest is the calculation of quantities such as the critical load for brittle fracture. This load can be compared to experiment, or stand as a prediction for materials that have not been studied experimentally. A more sophisticated approach is to use the critical load as a parameter in a more coarse-grained simulation. Cohesive zone models, for example, numerically solve the continuum elasticity problem [19] with finite elements while including the possibility of cracks opening up in the material. One of the essential parameters for the cohesive zones, which model the opening crack, is the critical energy release rate, which can be obtained from reliable quasistatic first-principles simulations. Another way to use the simulation results is in more detailed analysis, not to make quantitative predictions, but to explain experimental observations. The relation between the propagation-direction dependence of (110) cracks and the way in which the peak crack-tip stress is distributed over the network of bonds is one example. This level of insight into the reason for a previously unexplained experimental observation is one of the great contributions that atomistic simulations can make.
4. Dynamic Direct Numerical Simulation
Dynamic simulations of fracture are in many ways the closest we can get to a "computer experiment". Molecular dynamics simulation [16] provides this capability, but both the time and length scales required for dynamic simulations are inherently substantial. To avoid transient startup effects and to gather reasonable statistics, it is helpful to be able to simulate the crack moving a significant distance (at least significant on the atomic scale). This requirement translates to systems that are large enough to enclose the distance the crack will travel, as well as enough surrounding material to insulate the crack tip from edge effects. It also requires simulations that are long enough in time to follow the crack as it moves this distance. Until recently, only interatomic potentials have been sufficiently computationally efficient to make dynamic simulations of fracture practical. As I discuss below, it has turned out that most commonly used interatomic potentials for silicon fail qualitatively to simulate brittle fracture. This failure has motivated the use of hybrid methods, which combine an interatomic-potential simulation of a fracturing sample with a more accurate electronic structure method near the crack tip.

Some aspects of the basic physics of fracture are inherently dynamic, and cannot be captured by quasistatic energy minimization calculations. One example is a dynamic form of lattice trapping in brittle fracture called the velocity gap [20]. The discrete nature of the fracturing material makes it impossible for a dynamic crack to propagate below a critical speed. In dynamic propagation this gap can manifest itself as a range of forbidden crack velocities, or as a difference between the loading required to make a crack begin to propagate and the loading required to stop a steady-state propagating crack. Since semiconductors are so brittle at low temperatures, they make good model systems for studying this basic instability in dynamic fracture. Because the velocity gap is an inherently dynamic and steady-state phenomenon, the simulations need to be dynamic and free of transient effects. To achieve these requirements, the molecular dynamics runs must have a long duration and be carefully monitored for their progress toward steady state. These simulations can be made computationally feasible by using a modified version of the Stillinger–Weber (SW) interatomic potential [11] (discussed in more detail below), together with a number of techniques to minimize the simulated system size in a controlled manner. The crack is simulated in a quasi-two-dimensional geometry: a thin sample with periodic boundary conditions in the direction along the crack front (Fig. 5). This simulates an infinitely thick system with a straight crack front. To minimize the size of the system perpendicular to the crack surface, the scaling of the crack phenomenon with respect to that system dimension can be studied analytically; with this analytical solution, results from a small system can be extrapolated to infinite system size. To minimize the size of the system along the crack propagation
Figure 5. Cartoon of a two-dimensional dynamic fracture simulation. The system is loaded in tension along y by fixing the positions of the top and bottom layers of atoms. The crack front is a straight line parallel to the z-axis. The crack is propagating from left to right (indicated by the arrow), parallel to the x-axis. Periodic boundary conditions are used along z. The dashed box on the left indicates the region where atoms are removed from the simulation, and the solid box on the right indicates the region where atoms are added ahead of the crack. In an actual simulation all of the in-plane dimensions are significantly larger.
direction, a sort of virtual treadmill is used. Only a block of material near the crack front is explicitly simulated. Material behind the crack front, where the crack has already opened fully, is dropped from the simulated system. To compensate, more material is added ahead of the crack front, far enough away that the new material has reached local equilibrium before the crack front reaches it.

The simulation results show the effects of both quasistatic lattice trapping and the dynamic velocity gap. At very low temperatures, near 0 K, there is a deviation of about 10% in applied strain between the loading required to initiate crack propagation and the loading required to stop a propagating crack. However, both of these dynamic critical loads are more than 20% in applied strain above the Griffith criterion critical load, indicating the presence of quasistatic lattice trapping as well. The velocity gap becomes smaller at higher temperatures, and essentially disappears at room temperature. The velocity gap has not been observed experimentally [21], although the experiments are quite challenging and it is not yet clear if it has been ruled out.

The velocity-gap simulations also expose one important problem with empirical-potential simulations of fracture in silicon: most commonly used empirical potentials for Si (the one known exception is discussed below) show ductile fracture.
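A minimal sketch of the treadmill bookkeeping described above; the array conventions and the make_slab helper are hypothetical placeholders for the construction of new, locally equilibrated material.

```python
import numpy as np

def treadmill_step(pos, vel, tip_x, trail, lead, make_slab):
    """Keep a moving window of material around the advancing crack tip:
    drop fully opened material more than `trail` behind the tip and
    append a pre-equilibrated slab of width `lead` ahead of the current
    right edge. make_slab(x_start, width) -> (positions, velocities)."""
    keep = pos[:, 0] > tip_x - trail
    pos, vel = pos[keep], vel[keep]
    new_pos, new_vel = make_slab(x_start=pos[:, 0].max(), width=lead)
    return np.vstack([pos, new_pos]), np.vstack([vel, new_vel])
```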
At the initial stages of loading, dislocations form at the crack tip, and at higher loads additional dislocations nucleate until the material ahead of the crack simply disintegrates. Since the velocity gap is a feature of brittle fracture, those simulations were carried out using a modified form of the SW potential that increases the energy cost for bond bending, thereby suppressing dislocation formation. Although perhaps not the best description of silicon, the modified SW is a more realistic model than the models previously used to study similar instabilities.

Rather than using simplified geometries and analytical extrapolations, sophisticated computational tools can be used to make simulations of large systems with many atoms computationally tractable. Parallel computers, often implemented by networking together a large number of low-priced workstations, have brought this approach within reach for many research groups. The development of parallel algorithms for materials simulations, in particular the issue of distributing the work evenly among the parallel processors, and the analysis of the vast quantities of data that result, are beyond the scope of this discussion. However, even these computationally sophisticated approaches must beware of the problems with empirical-potential simulations of fracture. Silicon nitride is a dielectric material used in Si and GaAs electronic devices. During production and operation, thermal and mechanical stresses can cause cracking of the Si₃N₄ in the device, but the cracks are arrested when they reach the Si layer. To understand the reason for the crack arrest, the system was simulated using the SW potential for Si, and a sophisticated potential including both covalent and electrostatic effects for the Si–N interactions. The technical achievements of the simulation were considerable: over 10⁶ atoms were used to minimize edge effects in the fully three-dimensional simulations. The results show that brittle cracks in the Si₃N₄ are arrested at the Si interface, and emit dislocations into the bulk Si region. However, the behavior of the original and modified SW potentials in the velocity-gap simulations suggests that this agreement with experiment may be fortuitous. Since SW simulations never show brittle fracture, it is unsurprising that the simulated cracks arrested at the Si₃N₄/Si interface. The unphysical, extreme ductility of the SW potential suggests that the simulated crack arrest is an artifact, most likely unrelated to the reason for the crack arrest in the experiment.

The ostensible disagreement between simulations of Si, which show crack-tip ductility, and experiments on Si, which show apparently brittle fracture at low temperatures, is an instructive example of the difficulties in definitively comparing simulation and experiment. Is the ductility seen in simulations in fact unphysical? Large-scale simulations using empirical potentials show that the dislocations remain at or near the crack tip. The size of the disordered region is so small that even if it is real, it is not clear whether it would have been noticed in experiments. Visualization of the simulation results shows a crack tip that is blunt on the atomic scale, but quite sharp (a few tens of Å) on the macroscopic scale. The speed of the crack, which one might naively
expect to be quite different for brittle fracture vs. localized crack-tip ductility, turns out to be quite insensitive to the mode of fracture. For both the ductile empirical potentials and the brittle modified SW potential, the speed goes up with applied load but never exceeds about 2/3 of the theoretical limiting speed, the Rayleigh wave speed [22]. A more sensitive quantitative measure is required to settle the question. The critical energy release rate G, which measures the amount of energy dissipated during crack propagation, provides the necessary information. If the critical G is close to the Griffith criterion value, the fracture process must be essentially brittle. If the critical G is much higher, microscopic ductility remains possible. Careful experimental measurements finally showed that the critical energy release rate for fracture in Si is quite close to twice the best estimate of the surface energy (from density functional theory calculations). While the uncertainties in the experiments and the surface-energy calculations prevent this measurement from being an accurate test of the Griffith criterion prediction, it does seem to rule out significant ductility. The behavior of most empirical potentials for Si is simply not consistent with the experimentally measured energy release rate. The specific problems with interatomic potential simulations of fracture in Si, as well as the general view that an explicitly quantum-mechanical method would be more reliable, drove the development of a multi-scale method for simulating fracture and other material processes. The general approach, pioneered by Kohlhoff et al. [23], takes advantage of the fact that in most of the loaded sample continuum mechanics or interatomic potentials are a very accurate way of describing the material. Only in the crack-tip region is the deviation from the continuum elasticity result significant, and most likely only at the crack tip, where bonds are being broken, are the shortcomings of the interatomic potential significant. Coupling together different computational methods, each applied in the part of the system where it is valid, can combine the accuracy and efficiency of the different methods. This coupled approach was applied to fracture in silicon by embedding a tight-binding simulation of the crack-tip region in an interatomic-potential simulation that describes the rest of the system. The coupled simulation shows brittle fracture initiating at loads only slightly above the Griffith criterion prediction. Analyzing the change in energy as the crack moves by one atomic spacing shows that the main difference between the interatomic potentials and the (crack-tip) tight-binding results is in the size of the lattice-trapping energy barrier. The interatomic potentials show large energy barriers, large enough to suppress fracture up to the critical loads for dislocation nucleation. This explains the unphysically large amount of ductility seen in empirical potential simulations. The coupled simulation has a much smaller energy barrier, which disappears at the critical load for brittle fracture. Decomposing the energy changes during crack propagation into bond-breaking and elastic-relaxation parts indicates that two length scales control the height of the barrier: the
distance over which the bond is broken, which is related to the type of bonding and the method used to simulate it, and the elastic relaxation distance, which is controlled by the shape of the crack tip. The modified SW potential used for the velocity gap simulations is more brittle than SW not because the barrier is smaller, but because the dislocation nucleation point is pushed to higher loads by the artificially stiff bond angles. There is one interatomic potential for Si that does produce brittle fracture, based on the modified embedded atom method (MEAM) [24]. Simulations using a MEAM potential for Si show brittle fracture with a critical load that is about 20% higher in applied stress than the Griffith criterion prediction. This amount of lattice trapping is comparable to the small-system quasistatic first-principles simulations, but significantly larger than the dynamic coupled empirical-potential/tight-binding simulations. At loads significantly above the critical load for crack propagation it has been observed experimentally that the crack speed increases with increasing load, but not as quickly as the continuum elasticity prediction. The reason for this deviation, and for the saturation of the experimental crack speed at about 2/3 of the continuum limit, had not been known. The MEAM simulation, which reproduces the experimental crack-speed measurements at high loads, reveals the reason for the reduced crack speed. At higher loads some of the elastic energy provided by the load is dissipated in the creation of damage near the crack tip. This damage takes the form of dislocations at moderate loads, and of surface steps of one to five atomic layers at larger loads (shown in Fig. 6). Even in an initially perfect material, the crack propagation process becomes unstable and produces an irregular crack surface.
Figure 6. Image of a crack propagating through Si at high loading, showing surface steps and dislocations. (After Fig. 4 in J.G. Swadener, M.I. Baskes, and M. Nastasi, Phys. Rev. Lett., 89, 085503, 2002. Reproduced with permission.)
5. Future Directions
The applications of atomistic simulation methods to fracture in silicon have come a long way since their beginnings in the 1960s and 1970s. The power of computers has grown by many orders of magnitude, and the methods for evaluating the energies and interatomic forces have become correspondingly more sophisticated. Nevertheless, the range of time and length scales inherently involved in the fracture process, from the atomic vibrations that lead to bond breaking to steady-state crack growth and the interplay with plasticity, will continue to make this a challenging computational problem. Many open questions are just beginning to be addressed: the nature of lattice trapping in its dynamic and static forms, the nature of instabilities in dynamic crack propagation, and the brittle-to-ductile transition. The unexpected difficulties in applying interatomic potentials will ensure a role for explicitly quantum-mechanical methods until more reliable interatomic potentials are developed. The need for such computationally expensive methods will ensure an important role for simulation approaches based on idealized models that make the calculations tractable. While simple empirical models may become less important, more sophisticated ways of linking atomistically calculated quantities, through continuum mechanics, to macroscopic mechanical properties will continue to be useful. For dynamical processes, direct numerical simulations will require advances in interatomic potentials and in ways of using quantum-mechanical methods just in the regions where they are most needed. With these ongoing advances, atomistic simulations are poised to finally give us an atomic-resolution view of the complex fracture properties of semiconductors.
References

[1] K.B. Broberg, Cracks and Fracture, Academic Press, San Diego, 1999.
[2] G.R. Irwin, “Analysis of stresses and strains near the end of a crack traversing a plate,” J. Appl. Mech., 24, 361–364, 1957.
[3] A.A. Griffith, “The phenomena of rupture and flow in solids,” Philos. Trans. R. Soc. London A, 221, 163, 1921.
[4] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1992.
[5] D. Farkas and R.L.B. Selinger, “Atomistics of fracture,” Article 2.23, this volume.
[6] J. Samuels and S.G. Roberts, “The brittle-ductile transition in silicon. I. Experiments,” Proc. Roy. Soc. London A, 421, 1–23, 1989.
[7] M.L. Cohen, “Concepts for modeling electrons in solids,” Article 1.2, this volume.
[8] W.A. Harrison, Electronic Structure and the Properties of Solids, Freeman, San Francisco, 1980.
[9] M.J. Mehl and D.A. Papaconstantopoulos, “Tight-binding total energy methods for magnetic materials and multi-element systems,” Article 1.14, this volume.
[10] C.Z. Wang and K.M. Ho, “Environment-dependent tight-binding potential models,” Article 1.15, this volume.
[11] J. Justo, “Interatomic potentials: covalent bonds,” Article 2.4, this volume.
[12] P. Haasen, Physical Metallurgy, Cambridge University Press, Cambridge, 1986.
[13] A.Y. Liu and M.L. Cohen, “Prediction of new low compressibility solids,” Science, 245, 841–842, 1989.
[14] J.R. Rice, “Dislocation nucleation from a crack tip: an analysis based on the Peierls concept,” J. Mech. Phys. Solids, 40, 239–271, 1992.
[15] C.R.A. Catlow, “Perspective: energy minimisation techniques in materials modelling,” Article 2.7, this volume.
[16] J. Li, “Basic molecular dynamics,” Article 2.8, this volume.
[17] B. Lawn, Fracture of Brittle Solids, Cambridge University Press, Cambridge, p. 148, 1993.
[18] R. Perez and P. Gumbsch, “An ab initio study of the cleavage anisotropy in silicon,” Acta Mater., 48, 4517–4530, 2000.
[19] D.J. Bammann, “Perspective: continuum modeling of mesoscale/macroscale phenomena,” Article 3.2, this volume.
[20] M. Marder, “Molecular dynamics of cracks,” Comp. Sci. Eng., 1, 48–55, 1999.
[21] I. Beery, U. Lev, and D. Sherman, “On the lower limiting velocity of a dynamic crack in brittle solids,” J. Appl. Phys., 93, 2429–2434, 2003.
[22] L.B. Freund, Dynamic Fracture Mechanics, Cambridge University Press, Cambridge, 1998.
[23] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, “Crack propagation in BCC crystals studied with a combined finite-element and atomistic model,” Phil. Mag. A, 64, 851–878, 1991.
[24] Y. Mishin, “Interatomic potentials: metals,” Article 2.2, this volume.
2.25 MULTIMILLION ATOM MOLECULAR-DYNAMICS SIMULATIONS OF NANOSTRUCTURED MATERIALS AND PROCESSES ON PARALLEL COMPUTERS

Priya Vashishta¹, Rajiv K. Kalia², and Aiichiro Nakano³

¹ Collaboratory for Advanced Computing and Simulations, Department of Chemical Engineering and Materials Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
² Collaboratory for Advanced Computing and Simulations, Department of Physics & Astronomy, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
³ Collaboratory for Advanced Computing and Simulations, Department of Computer Science, University of Southern California, 3651 Watt Way, VHE 608, Los Angeles, CA 90089-0242, USA
1. Introduction
Materials-by-design efforts have thus far focused on controlling structures at diverse length scales – atoms, defects, fibers, interfaces, grains, pores, etc. Because of the inherent complexity of such multiscale materials phenomena, atomistic simulations are expected to play an important role in the design of materials such as metals, semiconductors, ceramics, and glasses [1]. In recent years, we have witnessed rapid progress in large-scale atomistic simulations, highly efficient algorithms for massively parallel machines, and immersive and interactive virtual environments for analyzing and controlling simulations in real time. As a result of these advances, simulation efforts are being directed toward reliably predicting properties of materials in advance of fabrication. Thus, materials simulations are capable of complementing and guiding the experimental search for new and novel materials. Computer simulation is the third mode of scientific research, bridging the gap between analytical theory and laboratory experiment. Experiments
search for patterns in complex natural phenomena. Theories encode the discovered patterns into mathematical equations that provide predictive laws for the behavior of nature. Computer simulations solve these equations numerically in their full complexity, where analytical solutions are prohibitive due to a large number of degrees of freedom, nonlinearity, or lack of symmetry. In computer simulations, environments can be controlled with any desired accuracy, and extreme conditions are accessible far beyond the scope of laboratory experiments. Advanced materials and devices with nanometer grain/feature sizes are being developed to achieve higher strength and toughness in ceramic materials and greater speeds in semiconducting electronic and photonic devices. Below the length scale of 100 nm, however, continuum descriptions of materials and devices must be supplemented by atomistic descriptions [2]. Current state-of-the-art atomistic simulations involve 1 million to 1 billion atoms [3–5]. Finally, the impact of large-scale nanosystems simulations cannot be fully realized without major breakthroughs in scientific visualization [6]. The current practice of sequentially processing visualization data is highly ineffective for large-scale applications that produce terabytes of data. The only viable solution is to integrate visualization into simulation, so that both are performed concurrently on multiple parallel machines and the results are examined in real time in three-dimensional immersive and interactive virtual environments. This article describes our efforts to combine scalable and portable simulation, visualization, and data-management algorithms to enable very large-scale molecular-dynamics (MD) simulations. The scalable multiresolution simulation algorithms, together with the visualization and data-management algorithms that enable these large-scale simulations, are described in the first part of this article. In the second part, we discuss molecular-dynamics simulations of various nanostructured materials and processes of great scientific and technological importance. The simulations described in this article were carried out in collaboration with our past and current graduate research assistants, postdoctoral research associates, and our long-term overseas collaborators.
2. Part I: Scalable Simulation and Visualization Algorithms
In this part, we describe our scalable simulation and visualization algorithms. Following a general introduction to the MD simulation method and to the interatomic potentials used to describe various materials, we describe our space–time multiresolution MD algorithms and their implementation on massively parallel computers. We also describe a multiscale simulation approach, which seamlessly combines quantum-mechanical and MD simulations with continuum simulation based on the finite-element method, on a Grid of globally distributed parallel computers. We conclude Part I with discussions of the management, mining, and immersive and interactive visualization of massive simulation data.
2.1. Molecular Dynamics Simulation
In the MD approach, the phase–space trajectories of the system (positions and velocities of all the atoms at all times) are obtained from the numerical solution of Newton's equations (see Fig. 1). This allows one to study how atomistic processes determine macroscopic materials properties. Recent advances in scalable, space–time multiresolution algorithms, coupled with access to massively parallel computing resources, have enabled us to perform some of the largest atomistic simulations of complex materials. The mathematical model underlying an MD simulation is Newton's equation of motion, which states that the acceleration of an atom is proportional to the total force exerted on the atom by all the other atoms [5]. In MD simulations, a physical system consisting of N atoms is represented by a set of coordinates, {rₖ = (xₖ, yₖ, zₖ) | k = 1, …, N}, and we trace the atomic trajectories – positions, rₖ(t), and velocities, vₖ(t) – by integrating Newton's equations numerically with respect to time, t (see Fig. 1). Here Fₖ = −∂E_MD(r^N)/∂rₖ is the force acting on the kth atom, E_MD(r^N) is the interatomic potential energy, and r^N = (r₁, r₂, …, r_N) is a 3N-dimensional vector representing the positions of the atoms.
Figure 1. A molecular-dynamics simulation consists of a collection of atoms, which exert forces on each other depending on their mutual interactions and relative positions.
The choice of numerical algorithms is crucial for efficient simulations. For example, the velocity Verlet algorithm is time reversible, i.e., a simulation can be played back to recover the starting state exactly. In addition, the solution satisfies a symmetry property called symplecticness, which is related to the conservation of phase–space volume along the trajectory. These properties are essential for the long-time stability of a simulation. Mathematically, a force law is encoded in the interatomic potential energy, E_MD(r^N). In past years, we have developed reliable interatomic potentials for a number of materials, including ceramics such as silica (SiO₂) [7, 8], silicon nitride (Si₃N₄) [9–12], silicon carbide (SiC) [13–16] and alumina (Al₂O₃), as well as semiconductors such as gallium arsenide (GaAs), aluminum arsenide (AlAs), and indium arsenide (InAs) [17–20]. The interatomic potential energy for these materials consists of two- and three-body terms. The two-body potential energy is a sum over contributions from the N(N − 1)/2 atomic pairs, (i, j). The contribution from each pair depends only on the relative distance, |r_ij| (see Fig. 1). Physically, the two-body terms are steric repulsion between atoms, electrostatic interaction due to charge transfer, charge–dipole interaction that takes into account the large electronic polarizability of negative ions, and van der Waals interaction. The three-body potential energy consists of contributions from atomic triples (i, j, k), and takes into account covalent effects through bending and stretching of atomic bonds, r_ij and r_ik (see Fig. 1). These many-body potentials have been validated through comparison of simulation data with various experimental quantities. Theoretical results are in good agreement with experimental lattice constants, cohesive energies, elastic constants, melting temperatures, phonon densities of states, and fracture energies. As shown in Fig. 2, MD simulations of GaAs reproduce other experimental data as well – the phonon dispersion, the X-ray static structure factor, S_x(q), of the amorphous state, and the high-pressure structural transformation [21]. In Fig. 3, we compare MD and experimental data on the neutron-scattering static structure factor in amorphous SiO₂ [22]. Figure 3 also shows MD and neutron-scattering experimental data on the phonon density of states in crystalline α-Si₃N₄ [9].
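To make the time-stepping concrete, the following is a minimal sketch of one velocity Verlet step; it is not taken from the authors' codes, and the force callback and array shapes are assumptions for illustration.

    import numpy as np

    def velocity_verlet_step(r, v, m, force, dt):
        """One time-reversible velocity Verlet step.

        r, v  : (N, 3) arrays of atomic positions and velocities
        m     : (N,) array of atomic masses
        force : callable returning the (N, 3) force array F = -dE_MD/dr
        dt    : time step
        """
        f = force(r)
        v_half = v + 0.5 * dt * f / m[:, None]                 # first half kick
        r_new = r + dt * v_half                                # drift
        v_new = v_half + 0.5 * dt * force(r_new) / m[:, None]  # second half kick
        return r_new, v_new

Running the same step with dt negated retraces the trajectory, which is the time reversibility referred to above; in a production code the force evaluated at the end of one step would be reused at the start of the next.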
2.2. Multiresolution Algorithms
Efficient algorithms are key to extending the scope of simulations to larger spatial and temporal scales that would otherwise be impossible to simulate. These algorithms often utilize multiresolution in both space and time. The most computationally intensive problem in an MD simulation is the computation of the electrostatic energy for N charged atoms. Direct evaluation of all the atomic-pair contributions requires O(N²) operations. In 1987, Greengard and Rokhlin discovered an O(N) algorithm called the fast
Figure 2. Comparison of MD and experimental results for GaAs. (a) Theoretical and experimental phonon dispersion of zinc-blende GaAs. (b) X-ray static structure factor of amorphous GaAs. The MD results (solid curves) are in excellent agreement with X-ray diffraction data (open circles). (c) MD and EXAFS results for the GaAs nearest-neighbor distance during forward (squares) and reverse (circles) structural transformations.
Figure 3. (Left) Neutron-scattering static structure factor, S_N(q), of amorphous SiO₂: solid curve, the MD result at 300 K; open circles, neutron diffraction experiment at 10 K. (Right) Neutron-weighted phonon density of states of α-Si₃N₄: solid curve, MD result; open circles, neutron scattering result.
multipole method (FMM) [23]. The FMM groups distant atoms together and treats them collectively [23–25]. Hierarchical grouping is facilitated by recursively dividing the physical system into smaller cells, thereby generating a tree structure (see Fig. 4). The root of the tree is at level 0, and it corresponds to the entire simulation box. A parent cell at level l is decomposed into 2 × 2 × 2 children cells of equal volume at level l + 1. The FMM uses the truncated multipole expansion and the local Taylor expansion of the electrostatic
Figure 4. Schematic of the far-field computation in a two-dimensional system in the fast multipole method. The multipoles of a parent cell at level l are obtained by shifting the multipoles of its children cells at level l + 1 and summing them. Solid circles represent charged particles, and vertical lines represent parent-child relationships.
potential field. By computing both expansions recursively for the hierarchy of cells, the electrostatic energy is computed with O(N) operations. The FMM also has well-defined error bounds. For systems with periodic boundary conditions, other schemes based on Ewald summations, such as the O(N log N) particle-mesh Ewald method [26], are ideally suited. The discrete time step, Δt, in MD simulations must be chosen sufficiently small that the fastest characteristic oscillations of the simulated system are accurately represented. However, many important physical processes are slow and are characterized by time scales that are many orders of magnitude larger than Δt. Molecular-dynamics simulations of such “stiff” systems require many iteration steps, and this severely restricts the applicability of the simulation. We have used an approach called the multiple time-scale (MTS) method [27], which uses a different Δt for different force components to reduce
Figure 5. Spatial decomposition (2 × 3 × 1) of a porous silica system into six subsystems, which are mapped onto six processors (P0–P5). Two types of spheres represent silicon and oxygen. Logical partition boundaries between subsystems are represented by yellow planes. As denoted by arrows, message passing for interprocessor caching is completed in four steps. (This number of message-passing steps is smaller than six, since the system is not partitioned in the third dimension in this example.)
the number of force evaluations. To further speed up simulations, we have also used a hierarchy of dynamics, including rigid-body motion of atomic clusters [28]. Our multiresolution molecular-dynamics (MRMD) algorithm [5, 24], combining the FMM and MTS, has been implemented on a number of parallel computers using a spatial decomposition (see Fig. 5). The MRMD algorithm is highly scalable: for a 664 million-atom SiO₂ system, one MD step takes only 7 s on 1024 IBM SP3 nodes. The parallel efficiency of this algorithm-machine combination, defined as the speedup divided by the number of processors, is 93%.
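The structure of an MTS integrator can be sketched as follows; this is a generic RESPA-style step written under the assumption that the expensive long-range forces change slowly, not a transcription of the MRMD code.

    import numpy as np

    def mts_step(r, v, m, fast_force, slow_force, dt_slow, n_sub):
        """One reversible multiple time-scale step.

        fast_force : cheap short-range forces, evaluated every inner step
        slow_force : expensive long-range (e.g., FMM electrostatic) forces,
                     evaluated only twice per outer step of size dt_slow
        n_sub      : number of inner steps, dt_fast = dt_slow / n_sub
        """
        dt_fast = dt_slow / n_sub
        v = v + 0.5 * dt_slow * slow_force(r) / m[:, None]   # slow half kick
        for _ in range(n_sub):                               # inner velocity Verlet
            v = v + 0.5 * dt_fast * fast_force(r) / m[:, None]
            r = r + dt_fast * v
            v = v + 0.5 * dt_fast * fast_force(r) / m[:, None]
        v = v + 0.5 * dt_slow * slow_force(r) / m[:, None]   # slow half kick
        return r, v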
2.3. Parallel Molecular Dynamics
Parallel computing technology has extended the scope of computer simulations in terms of simulated system size. In order to perform parallel computer simulations efficiently, however, algorithms developed for serial computers must often be modified.
Parallel computing requires decomposing the computation into subtasks and mapping them to multiple processors [5]. For MD simulations, a divide-and-conquer strategy based on spatial decomposition is commonly used. The total volume of the system is divided into P subsystems of equal volume, and each subsystem is assigned to a node in an array of P processors (see Fig. 5). The data associated with atoms of a subsystem are assigned to the corresponding processor. To calculate the force on an atom in a subsystem, the coordinates of the atoms in the boundaries of neighbor subsystems must be “cached” from the corresponding processors. In the actual code, the message passing to the 26 neighbors is completed in six steps, by sending the boundary-atom information to the east, west, north, south, up, and down neighbors sequentially. The corner and edge boundary atoms are copied to the proper neighbor processors by forwarding some of the received boundary atoms to other neighbors. After the atomic positions are updated by the time-stepping procedure, some atoms may have moved out of their subsystems. These atoms are “migrated” to the proper neighbor nodes. With the spatial decomposition, the computation scales as N/P while communication scales as (N/P)^(2/3), where N is the number of atoms. Thus the communication overhead becomes less significant when N (typically 10⁶–10⁹) is much larger than P (10²–10³), i.e., for coarse-grained applications. To implement the FMM discussed above on parallel computers, processors are logically organized in a three-dimensional array of P_x × P_y × P_z. For deeper tree levels, l ≥ log₂(max(P_x, P_y, P_z)), the calculation of the multipoles is local to each processor, so that the computation scales with N/P [24, 25]. For lower levels, however, the number of FMM cells, 8^l, becomes smaller than the number of processors. Consequently many processors become idle, or alternatively they duplicate the same computation, and this computation overhead scales as log P. For a coarse-grained decomposition (N ≫ P), this log P overhead also becomes insignificant. Many MD simulations are characterized by irregular atomic distributions. Simulation of dynamic fracture is a typical example. One practical problem in simulating such irregular systems on parallel computers is load imbalance. Suppose that we partition the simulation system into subsystems of equal volume according to the three-dimensional array of processors as in Fig. 5. Because of the irregular distribution of atoms, this uniform spatial decomposition results in an unequal partition of workloads among processors. As a result, the parallel efficiency is degraded significantly. This load-imbalance problem can be solved by partitioning the system not in the physical Euclidean space but in a computational space, which is related to the physical space by a curvilinear coordinate transformation (see Fig. 6). (The computational space shrinks where the workload density is high and expands where the density is low, so that the workload is uniformly distributed.) The optimal coordinate system is determined by minimizing the load-imbalance and communication costs [29, 30].
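The six-step caching pattern can be sketched as below; the neighbor bookkeeping and the select_boundary routine are illustrative assumptions, not the actual code.

    from mpi4py import MPI

    def cache_boundary_atoms(comm, my_atoms, neighbor_rank, select_boundary):
        """Six-step boundary-atom exchange on a 3D process grid.

        neighbor_rank   : dict mapping 'east', 'west', ... to the ranks of
                          the six face neighbors of this processor
        select_boundary : callable(atoms, direction) -> atoms within the
                          cutoff of that face; by including previously
                          received atoms, corner and edge neighbors are
                          reached implicitly, which is why six steps cover
                          all 26 neighbors
        """
        received = []
        for send_dir, recv_dir in (("east", "west"), ("west", "east"),
                                   ("north", "south"), ("south", "north"),
                                   ("up", "down"), ("down", "up")):
            outgoing = select_boundary(my_atoms + received, send_dir)
            incoming = comm.sendrecv(outgoing,
                                     dest=neighbor_rank[send_dir],
                                     source=neighbor_rank[recv_dir])
            received.extend(incoming)
        return received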
Figure 6. Schematic of the computational-space decomposition for load balancing. A 2D slice of a 3D MD configuration shows atoms as circles and partition boundaries between subsystems as curves.
Figure 7. Snapshots of a variable-charge MD simulation of an oxygen molecule on an aluminum surface. Atomic charges vary according to the local environment. Aluminum and oxygen atoms are represented by small and large spheres, respectively.
Having established scalable MD algorithms on parallel computers, the current focus of research is on enhancing the physical realism of these simulations. For example, conventional interatomic potential functions used in MD simulations are often fitted to bulk solid properties, and they are not easily transferable to systems containing defects, cracks, surfaces, and interfaces. In these systems, partial charges and other chemical properties of atoms vary dynamically according to changes in the local environment. For example, the environment-dependent charge distribution is crucial for the physical properties of these systems, including the fracture toughness. The transferability of interatomic potentials is greatly enhanced by incorporating variable atomic charges, which dynamically adapt to the local environment. Recently, a simple semiempirical approach has been developed in which atomic charges are determined so as to equalize the electronegativity (see Fig. 7) [31–33].
However, the increased physical realism of this model comes at the cost of computational overhead to determine the atomic charges by minimizing the electrostatic energy at every MD step. This minimization is equivalent to solving a dense linear system of equations, and the computational cost scales as the cubic power of N. We have developed an acceleration scheme [34] which computes the matrix-vector multiplication in O(N) time using the FMM. Also, a dynamical simulated-annealing scheme uses the charges determined at the previous MD step to initialize the iterative solution, reducing the number of iterations to O(1). To speed up the solution further, our multilevel preconditioned conjugate gradient (MPCG) method splits the Coulomb-interaction matrix into short- and long-range components. The method uses the sparse short-range matrix as a preconditioner to improve the linear system's spectral properties, thereby accelerating the solution. The MPCG algorithm has enabled the first successful MD simulation of the oxidation of an aluminum nanocluster [35].
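A minimal sketch of this solver strategy follows, assuming (as is standard for electronegativity equalization) a symmetric positive-definite Coulomb matrix and omitting the charge-neutrality constraint; the FMM matrix-vector product and the sparse short-range matrix are passed in as placeholders.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg, splu

    def equilibrate_charges(coulomb_matvec, A_short, chi, q_prev):
        """Solve A q = -chi for the atomic charges q.

        coulomb_matvec : callable q -> A q for the full Coulomb matrix
                         (an O(N) FMM evaluation in the real code)
        A_short        : sparse short-range part of A, the preconditioner
        chi            : (N,) electronegativities
        q_prev         : charges from the previous MD step (warm start)
        """
        N = len(chi)
        A = LinearOperator((N, N), matvec=coulomb_matvec)
        lu = splu(A_short.tocsc())                   # factor preconditioner once
        M = LinearOperator((N, N), matvec=lu.solve)  # M approximates A^-1
        q, info = cg(A, -chi, x0=q_prev, M=M)
        return q

The warm start plays the role of the dynamical simulated annealing described above, and the short-range preconditioner is the essence of the MPCG splitting.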
2.4. Multiscale Simulation Approach
Processes such as crack propagation and fracture in real materials involve structures on many different length scales. They occur on a macroscopic scale, but require atomic-level resolution in highly nonlinear regions. To study such multiscale materials processes, we need a multiscale simulation approach that can describe physical and mechanical processes over several decades of length scales. Recently, Abraham, Broughton, Bernstein, and Kaxiras have developed a hybrid simulation approach that combines quantum-mechanical (QM) calculations based on the tight-binding approximation with large-scale MD simulations embedded in a continuum, which is handled with the finite element (FE) approach based on linear elasticity [36]. Such a multiscale FE/MD/QM simulation approach is illustrated in Fig. 8 for a material with a crack [2, 37]. Let us denote the total system to be simulated as S₀. A subregion S₁ (⊂ S₀) near the crack exhibits significant nonlinearity, and hence it is simulated atomistically, whereas the rest of the system, S₀ − S₁, is accurately described by the FE approach. In the region S₂ (⊂ S₁) near the crack surfaces, bond breakage during fracture and chemical reactions due to environmental effects are important. To handle such chemical processes, QM calculations must be performed in S₂, while the subsystem S₁ − S₂ can be simulated with the classical MD method. Figure 8 also shows typical length scales covered by each of the FE, MD, and QM methods.
Figure 8. Illustration of a hybrid FE/MD/QM simulation. The FE, MD, and QM approaches are used to compute forces on particles (either FE nodes or atoms) in the subsystems S₀ − S₁ (represented by meshes), S₁ − S₂, and S₂. These forces are then used in a time-stepping algorithm to update the positions and velocities of the particles. Typical length scales covered by each of the FE, MD, and QM methods are also shown.
2.5. Hybrid FE/MD Scheme
In continuum elasticity theory, a displacement vector, u(r), is associated with each point, r, in a deformed medium. In the FE method, space is tessellated with a mesh. The displacement field, u, is discretized on the mesh points (nodes), while its values within the mesh cells (elements) are interpolated from its nodal values. The time evolution of u(r) is governed by equations of motion that form a set of coupled ordinary differential equations, subject to forces from surrounding nodes. The nodal forces are derived from the potential energy, E_FE[u(r)], which encodes how the system responds mechanically in the framework of elasticity theory. In hybrid FE/MD approaches, the physical system is spatially divided into FE, MD, and handshake (HS) regions [36–38]. Within the FE region, the equations for continuum elastic dynamics are solved on an FE mesh. To make the transition from the FE to the MD region seamless, the FE mesh in the HS region is refined down to the atomic scale near the FE/MD interface in such a way that each FE node coincides with an MD atom. The FE and MD regions
are made to overlap over the HS region, establishing a one-to-one correspondence between the atoms and the nodes. Figure 9(a) illustrates an FE/MD approach. On the top is the atomistic region (crystalline silicon in this example), and on the bottom is the FE region. The red box marks the HS region, in which particles are hybrid nodes/atoms, and the blue dotted line within the HS region marks the FE/MD interface. These hybrid nodes/atoms follow hybrid dynamics to ensure a smooth transition between the FE and MD regions. In the scheme by Abraham, Broughton, Bernstein, and Kaxiras, an explicit energy function, or Hamiltonian, for the transition zone is defined to ensure energy-conserving dynamics [36]. All finite elements that cross the interface contribute half their weight to the potential energy; similarly, any MD interaction between atomic pairs and triples that cross the FE/MD interface contributes half its value to the potential energy. We use a lumped-mass scheme in the FE region, i.e., the
Figure 9. (a) Illustration of a hybrid FE/MD scheme for a three-dimensional silicon crystal for the crystallographic orientations (011). On the top is the MD region, where spheres and lines represent atoms and atomic bonds, respectively. At the bottom is the FE region, where spheres represent FE nodes and FE cells are bounded by lines. Region enclosed between the lines is the handshake (HS) region, in which particles are hybrid nodes/atoms, and the dotted line within the HS region indicates the FE/MD interface. (b) Time evolution of FE nodes and MD atoms in a hybrid FE/MD simulation of a projectile impact on a silicon crystal. (The figure shows a thin slice of the crystal for clarity.) Absolute displacement of each particle from its equilibrium position is shown. No reflection is seen at the boundary.
mass is assigned to nodes instead of being distributed continuously within an element. This reduces to the correct description in the atomic limit, where nodes coincide with atoms. To rapidly develop an FE/MD code by reusing an existing MD code, we took advantage of formal similarities between the FE and MD dynamics. In our FE/MD program, particles are either FE nodes or MD atoms, and their positions and velocities are stored in a single array. The FE method requires additional bookkeeping, since each element must be associated with its corresponding nodes. This is done efficiently using the linked cell list in the MD code. The FE/MD simulation approach has been parallelized based on the same spatial decomposition scheme as in our parallel MD program. To validate our FE/MD scheme, we simulated a projectile impact on a three-dimensional block of crystalline silicon, see Fig. 9(b) [2]. The block has dimensions of 10.5 nm and 6.1 nm along the [2̄11] and [01̄1] orientations, respectively, and periodic boundary conditions are imposed in these directions. Along the [111] direction, the system consists of an 11.5 nm thick MD region, a 0.63 nm thick HS region, and a 19.6 nm thick FE region. The top surface in the MD region is free, and the nodes at the bottom surface in the FE region are fixed. The fully three-dimensional FE scheme uses 20-node brick elements for the region far from the HS region, which provide a quadratic approximation for the displacement field and are adequate for the continuum. In the scaled-down region close to the FE/MD interface, we switch to eight-node brick elements, which provide a linear approximation for the displacement field. Within the HS region, the elements are distorted so as to exhibit the same lattice structure as crystalline silicon. In addition to these elements, prism-like elements are used for coarsening the FE mesh from the atomic to larger scales. The projectile is approximated by an infinite-mass hard sphere of radius 1.7 nm, from which the silicon atoms scatter elastically. A harmonic motion of the projectile along the [111] direction creates small-amplitude waves in the silicon crystal. Figure 9(b) shows snapshots at three different times, in which only a thin slice is plotted for clarity. The color denotes the absolute displacement from the equilibrium positions, measured in Å. The induced waves in the MD region propagate into the FE region without reflection, demonstrating seamless handshaking between MD and FE.
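The unified storage described above can be sketched as follows; the array layout and the md_force/fe_force routines are hypothetical stand-ins, and the half-weighting of interface-crossing elements and interactions is assumed to live inside those routines.

    import numpy as np

    N = 100_000                      # total particle count (illustrative)
    r = np.zeros((N, 3))             # positions of all particles, node or atom
    v = np.zeros((N, 3))             # velocities of all particles
    m = np.ones(N)                   # masses; lumped on nodes in the FE region
    kind = np.zeros(N, np.uint8)     # 0 = MD atom, 1 = FE node, 2 = handshake

    def total_force(r, kind):
        # Each (hypothetical) routine returns an (N, 3) array that is
        # nonzero only for the particles it governs; handshake particles
        # receive contributions from both, so a single time-stepping loop
        # advances the whole system.
        return md_force(r, kind) + fe_force(r, kind)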
2.6. Hybrid MD/QM Scheme
Empirical interatomic potentials used in MD simulations fail to describe chemical processes. Instead, interatomic interactions in reactive regions need to be calculated by a QM method that can describe the breaking and formation of bonds. There has been growing interest in developing hybrid MD/QM
simulation schemes, in which a reactive region treated by a QM method is embedded in a classical system of atoms interacting via an empirical interatomic potential. An atom consists of a nucleus and surrounding electrons, and QM schemes treat the electronic degrees of freedom explicitly, thereby describing the wave-mechanical nature of electrons. One of the simplest QM schemes is based on the tight-binding (TB) method [36]. The TB method does not involve electronic wave functions explicitly, but solves an eigenvalue problem for the matrix that represents interference between electronic orbitals. In the TB scheme, the electronic contribution to interatomic forces is derived through the Hellmann–Feynman theorem, which states that only partial derivatives of the matrix elements with respect to r^N contribute to the forces. A more accurate but compute-intensive QM scheme deals explicitly with the electronic wave functions, ψ^{N_wf}(r) = {ψ₁(r), ψ₂(r), …, ψ_{N_wf}(r)} (N_wf is the number of independent wave functions, or electronic bands, in the QM calculation), and their mutual interaction in the framework of density functional theory (DFT) [39–41], with the electron–ion interaction treated using pseudopotentials [42]. The DFT, for the development of which Walter Kohn received the 1998 Nobel Prize in Chemistry, reduces the exponentially complex quantum many-body problem to a self-consistent eigenvalue problem that can be solved with O(N_wf³) operations. In the DFT scheme, not only are accurate interatomic forces obtained from the Hellmann–Feynman theorem, but electronic information such as the charge distribution can also be calculated. Hybrid MD/QM schemes have been developed extensively by the quantum-chemistry community. In the seminal work by Abraham, Bernstein, Broughton, and Kaxiras [37], which combines the FE/MD/QM approaches in a single simulation, a semiempirical TB method is used as the QM method and a HS Hamiltonian is introduced to link the MD/TB boundary. We have developed a hybrid scheme for dynamic simulations of materials on parallel computers, in which a QM region is embedded in an atomistic region, see Fig. 9 [37, 43]. The motion of atoms in the QM region is described by a real-space [44] multigrid-based [45–47] DFT, and in the surrounding region by the MD approach. To partition the total system into the cluster and its environmental regions, we use a modular approach that is based on a linear combination of QM and MD potential energies and consequently requires minimal modification of existing QM and MD codes [48]:

E^system = E_CL^system + (E_QM^cluster − E_CL^cluster),

where E_CL^system is the classical semiempirical potential energy for the whole system and the last two terms encompass the QM correction to that energy. E_QM^cluster is the QM energy for an atomic cluster cut out of the total system (its dangling bonds are terminated by hydrogen atoms – handshake hydrogens, HS Hs), and E_CL^cluster is the semiempirical potential energy of a classical cluster in which the HS Hs are replaced by appropriate atoms. In this approach, both the QM and MD potential energies for the cluster need to be calculated. Termination atoms are introduced in both calculations for the cluster. Handshake atoms linking the cluster and the environment regions are treated by a novel scaled-position method, in which the positions of the handshake atoms are determined as functions of the original atomic positions in the system, with different scaling parameters in the QM and classical clusters to relate the HS atoms to the termination atoms. The hybrid simulation scheme is implemented on massively parallel computers by first dividing the processors between the QM and the MD calculations (task decomposition), and then using spatial decomposition within each task. The parallel program is based on the message-passing paradigm and is written with the message passing interface (MPI) standard. Processors are first grouped into MD and QM groups by defining two MPI communicators. (A communicator is an MPI data structure that represents a dynamic group of processes with a unique ID called a context.) The code is written in the single program multiple data (SPMD) programming style, so that each processor executes an identical program. Selection statements are used so that the QM and the MD processors execute only the QM and the MD code segments, respectively. The hybrid MD/QM simulation code was applied to the oxidation of a silicon surface, to demonstrate seamless coupling of the cluster and the environment atoms [45]. Figure 10 shows snapshots of the atomic configuration at 50 and 250 fs, in which atomic kinetic energies are color-coded. We see that the dissociation energy released in the reaction of an oxygen molecule with silicon atoms in the cluster region is transferred seamlessly to silicon atoms in the environment region.
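The additive combination can be sketched as below; classical_energy, qm_energy, and cap_dangling_bonds are placeholders for the existing MD code, the DFT code, and the termination procedure, not the actual interfaces.

    def hybrid_energy(system_coords, cluster_indices):
        """Additive hybridization:
        E = E_CL(system) + E_QM(cluster) - E_CL(cluster).
        """
        e_cl_system = classical_energy(system_coords)
        # Cut the reactive cluster out of the full system and terminate its
        # dangling bonds (HS hydrogens for the QM run, substitute atoms for
        # the classical run, as described in the text).
        cluster = cap_dangling_bonds(system_coords, cluster_indices)
        e_qm_cluster = qm_energy(cluster)         # expensive, small region only
        e_cl_cluster = classical_energy(cluster)  # same cluster, classical model
        return e_cl_system + (e_qm_cluster - e_cl_cluster)

Forces follow from the same linear combination, so each code only ever differentiates its own energy expression.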
2.7. Hybrid FE/MD/QM Scheme
Due to the formal similarity between the parallel MD and FE/MD codes and the modularity of the MD/QM scheme mentioned above, it is also straightforward to embed a QM subroutine in a parallel FE/MD code to develop a parallel FE/MD/QM program [37]. The parallel hybrid simulation code is applied to the oxidation of the Si(111) surface to demonstrate seamless coupling of the FE, MD, and QM regions, see Fig. 11. For the hybrid simulation, a slab with dimensions (212.8 Å, 245.7 Å, 83.1 Å) in the ([2̄11], [01̄1], [111]) directions is cut out from bulk Si. The MD and the FE/MD–HS regions consist of 12 and 4 atomic layers along the [111] direction, respectively, whereas the FE region corresponds to the bottom two-thirds of the Si slab. Periodic boundary conditions are applied in the [2̄11] and [01̄1] directions. The total number of atoms and FE
Figure 10. (Top) Initial configuration in the present hybrid MD/QM simulation of oxidation of the Si(100) surface. Magenta spheres represent the cluster silicon atoms; gray, the environment silicon atoms; yellow, termination hydrogen atoms for QM calculations; blue, termination silicon atoms for MD calculations; green, cluster oxygen atoms. (Bottom) Snapshots at 50 and 250 fs in the present hybrid MD/QM simulation of oxidation of the Si(100) surface. Colors represent kinetic energies of atoms in Kelvin.
Figure 11. Snapshots at 150 fs, 300 fs, and 900 fs in the hybrid simulation of oxidation of Si (111) surface. Colors represent [111]-displacements of the atoms and the FE nodes.
nodes for the Si slab is N = 15,212. The initial configuration of the hybrid simulation is obtained by placing an O₂ molecule (oriented along the [2̄11] direction with zero velocity) 2.0 Å above the (111) surface of the slab. The O₂ molecule and the surrounding Si atoms are treated in the DFT calculation. Figure 11 shows
snapshots of the atomic configuration at 150 fs, 300 fs, and 900 fs, in which the [111] displacements of the atoms and the FE nodes are color-coded. The O₂ molecule dissociates, and each O atom is captured by a Si–Si bond at the surface to form a Si–O–Si structure, which is associated with an increase in the Si–Si distance. The resulting strains in the QM region are transferred to the surrounding Si atoms in the MD region, as shown in Fig. 11. These strain waves reach the QM/MD–HS regions at ∼300 fs, and propagate into the FE region at ∼900 fs, with no reflection or refraction observed at the QM/MD and MD/FE boundaries.
2.8. Grid Computing
Metacomputing on a Grid [49] of geographically distributed Teraflop-to-Petaflop computers and immersive virtual-reality environments connected via high-speed networks will revolutionize science and engineering, by enabling hybrid simulations that integrate multiple areas of expertise distributed globally. We have performed a multidisciplinary, collaborative MD/DFT simulation on a Grid of geographically distributed Linux clusters in the US and Japan, based on the modular, additive hybridization scheme (see Fig. 12) [50]. The multiscale MD/QM simulation code has been Grid-enabled based on a divide-and-conquer scheme, in which the QM region is a union of multiple QM clusters.
Figure 12. Multiscale MD/DFT simulation of the reaction of water at a crack tip in silicon (top), on a Grid of distributed Linux clusters in the US and Japan (bottom). In this figure, five QM calculations (circles) around five water molecules are embedded in an MD simulation.
Since the energy is a sum of the QM energy corrections for the clusters in the additive divide-and-conquer hybridization scheme, each QM-cluster calculation does not access the atomic coordinates in the other clusters, and accordingly its parallel implementation involves no inter-QM-cluster communication. Furthermore, the multiple-QM-cluster scheme is computationally more efficient than the single-QM-cluster scheme because of the O(N³) scaling. (The large prefactor of O(N) DFT algorithms makes conventional O(N³) algorithms faster below a few hundred atoms.) We have implemented the multiscale MD/DFT simulation algorithm as a single MPI program. The Globus middleware and the Grid-enabled MPI implementation, MPICH-G2, have been used to implement the MPI-based multiscale MD/DFT simulation code in a Grid environment. In the initial implementation, processors on multiple PC clusters are statically allocated using a host file. The user specifies the number of processors for each QM-cluster calculation in a configuration file. In more recent MD/DFT simulations, a simple local error indicator based on atomic bond lengths has been used to automatically change the size of the QM calculations at run time. The Gridified MD/QM simulation code has been used to study the environmental effects of water molecules on fracture in silicon. A preliminary run of the code achieved a parallel efficiency of 94% on 25 PCs distributed over 3 PC clusters in the US and Japan.
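The grouping of processors into one MD group and several QM-cluster groups can be sketched with a communicator split; the configuration-file reader and the driver routines are hypothetical names, not the actual code.

    from mpi4py import MPI

    world = MPI.COMM_WORLD
    rank = world.Get_rank()

    # Hypothetical static allocation read from a configuration file:
    # group 0 runs the MD simulation; groups 1..n each run one QM cluster.
    group = read_processor_groups("hybrid.conf")[rank]

    # Splitting gives each group its own communicator (context), so
    # collectives inside one QM cluster never touch the others or the MD code.
    local = world.Split(color=group, key=rank)

    if group == 0:
        run_md(local)           # placeholder for the parallel MD driver
    else:
        run_qm_cluster(local)   # placeholder for one DFT cluster calculation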
2.9. Data Management and Mining
A serious technological gap exists between the growth in processor power and that of input/output (I/O) speed. I/O (including data transfer to remote archival storage devices) has thus become the bottleneck in our large-scale MD simulations. We address the I/O problem using a scalable data-compression scheme we have developed recently [51]. It uses octree indexing and sorts the atoms accordingly on the resulting spacefilling curve (see Fig. 13). By storing differences between successive atomic coordinates, the I/O requirement at the same error-tolerance level is reduced from O(N log N) to O(N). This, together with a variable-length encoding to handle exceptional values, reduces the I/O size by an order of magnitude with a user-controlled error bound. Large-scale MD simulations are expected to reveal atomistic correlations between local stresses and microstructural activities during dynamic fracture in complex materials. A challenge is to extract topological defects, such as dislocations, and their activities from massive data with large thermal noise, especially at high temperatures. This will require nontrivial knowledge-discovery or data-mining processes on very large noisy data sets.
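The indexing-and-delta idea can be sketched as follows, with the quantization depth and array layout assumed for illustration; the variable-length encoding of the residuals is omitted.

    import numpy as np

    def morton_key(ix, iy, iz, bits=10):
        """Interleave the bits of integer cell coordinates to obtain the
        octree (Z-order) index; spatially close atoms get close keys."""
        key = np.zeros(ix.shape, dtype=np.uint64)
        for b in range(bits):
            key |= ((ix >> b) & 1).astype(np.uint64) << np.uint64(3 * b)
            key |= ((iy >> b) & 1).astype(np.uint64) << np.uint64(3 * b + 1)
            key |= ((iz >> b) & 1).astype(np.uint64) << np.uint64(3 * b + 2)
        return key

    def compress_frame(r, box, bits=10):
        """Quantize positions to a 2^bits grid, sort along the spacefilling
        curve, and store first differences; consecutive atoms are spatially
        close, so the deltas are small and encode in few bits."""
        cells = np.floor(r / box * (1 << bits)).astype(np.int64)
        order = np.argsort(morton_key(cells[:, 0], cells[:, 1], cells[:, 2]))
        sorted_cells = cells[order]
        deltas = np.diff(sorted_cells, axis=0, prepend=sorted_cells[:1])
        return order, deltas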
Figure 13. (a) A spacefilling curve based on octree indexing maps the 3D space into a sequential list, while preserving spatial proximity of consecutive list elements. (The panel shows a 2D example.) (b) Atoms are sorted along the spacefilling curve and only relative positions are stored. (c) A 3D spacefilling curve color-coded from red (the head of the list) to blue (the tail).
Visualization of collective motion of many atoms is a difficult task because of the high dimensionality (3N dimensions for N atoms) of the space in which the collective motion occurs. We find that concerted motion of many atoms can be visualized effectively by using graph data structures. In this approach, atoms and interatomic bonds are regarded as nodes and edges of a graph, respectively [52]. Node degree, the number of neighbor atoms, is a measure of local chemical order. Atomic processes are characterized by reconnection of edges. We have found that intermediate-range order in amorphous solids, which often extends up to five edge-lengths, is closely related to the distribution of the shortest-path rings of the graph [53]. Furthermore, graph data structures even encode global properties such as the rigidity of the entire solid [54]. Recently, we have applied a graph-theoretical topological analysis to pressure-induced structural transformation in gallium arsenide nanocrystals
Figure 14. Graph-theoretical topological spectroscopy showing four- and six-membered rings during the structural transformation of a gallium arsenide nanocrystal under pressure. The high-pressure phase, represented by four-membered rings, nucleates at the surface and grows inward.
[55, 56]. The low- and high-pressure phases of this system are characterized by six- and four-membered rings, respectively (see Fig. 14). We have found that these ring structures are insensitive to the existence of surfaces. Consequently, bulk topological defects and incipient phases hidden inside the system during structural phase transformations can be easily detected. We have also applied a graph-theoretical approach based on the shortest-path ring analysis to identify and track topological defects such as dislocations during indentation and impact on materials [15].
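The flavor of this analysis can be sketched with a generic graph library; the positions array is assumed, the O(N²) neighbor search is for illustration only, and the minimum cycle basis is used as a stand-in for, not a reproduction of, the shortest-path ring criterion of the cited work.

    import numpy as np
    import networkx as nx

    def bond_graph(r, cutoff):
        """Atoms are nodes; pairs closer than the bond cutoff are edges.
        (Brute-force neighbor search; the real analysis would reuse the
        MD code's linked-cell list for multimillion-atom data.)"""
        G = nx.Graph()
        G.add_nodes_from(range(len(r)))
        d = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
        i, j = np.where((d > 0) & (d < cutoff))
        G.add_edges_from((a, b) for a, b in zip(i, j) if a < b)
        return G

    G = bond_graph(positions, cutoff=1.8)   # positions: assumed (N, 3) array
    degree = dict(G.degree())               # node degree ~ local chemical order
    rings = nx.minimum_cycle_basis(G)       # small rings of the bond network
    ring_size_counts = np.bincount([len(c) for c in rings])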
2.10. Multiscale Visualization in Virtual Environment
The MRMD algorithm, with its associated data structures (octree FMM cells, linked lists for the member atoms of FMM cells, and neighbor-atom lists), has been reused to efficiently visualize multimillion-atom simulations. We have developed a software system named Atomsviewer, which visualizes billion-atom data sets at interactive speeds in an immersive environment [6]. In visualization, polygon rendering on a graphics pipeline is the primary bottleneck; thus, we minimize the pipeline workload by processing only the data the viewer will see. To do this, we use data-management techniques based on the octree data structure, see Fig. 15. Novel algorithms and techniques, such as our
Figure 15. The octree data structure overlain on the atomistic data. The figure shows only the atoms that are selected for subsequent rendering.
probabilistic approach to removing hidden atoms, and a parallel and distributed design, can further reduce the rendering pipeline's workload. Furthermore, we offload all processing that precedes rendering to a PC cluster and dedicate the graphics server to rendering. The resulting architecture provides multiple viewpoints, thus enhancing the user experience. A plan is under way to increase the system size of atomistic simulations significantly, using a “Grid” of distributed, heterogeneous parallel machines and immersive and interactive virtual environments. This will present unprecedented challenges of scalability, load balancing, distributed data access, latency hiding, and control of levels of detail for fast rendering. Accordingly, multidisciplinary research involving simulation, visualization, and data management/mining algorithms will become increasingly important. In particular, emerging hybrid simulation algorithms combining continuum and atomistic approaches will provide solid foundations for hybrid rendering algorithms combining atomistic, volumetric, and surface models.
3. Part II: Multimillion-atom Molecular Dynamics Simulations of Nanostructured Materials and Processes
The scalable simulation algorithms described in the previous section have been used to perform large-scale atomistic simulations of various
nanostructured materials and interfaces. In the following sections, we summarize some of the simulation results. The simulations described in this section deal with semiconductor, ceramic, and metallic nanostructures and with nanostructured materials and processes. These include sintering of nanoclusters and nanostructured ceramics, fracture in nanostructured materials and scaling properties of fractured surfaces, interfacial fracture of the silicon/silicon nitride interface, nanometer-scale stress patterns in silicon/silicon nitride nanopixels, self-limiting growth and critical lateral sizes in gallium arsenide/indium arsenide nanomesas, structural transformation in GaAs nanocrystals, nanoindentation of crystalline and amorphous ceramic films, dynamics of oxidation of aluminum nanoparticles, ceramic fiber composites, and environmental effects on fracture – stress corrosion. For the next generation of aerospace engines and for high-efficiency, environmentally clean turbines, it will be necessary to have materials that are mechanically stable at or above 1700 °C. This is a very challenging problem, and to accomplish the objective of making such materials, synthesis, processing, and simulations will have to be carried out concurrently. In the first set of simulations, we discuss sintering of silicon carbide and silicon nitride nanoclusters, and crack propagation and fracture in nanostructured silicon nitride, including the scaling behavior of fractured surfaces.
3.1. Sintering of Silicon Nitride Nanoclusters
Sintering is key to a number of advanced technologies. For example, multilayer ceramic integrated circuits (MCIC) are attracting much attention as an effective way to integrate discrete components for high-frequency wireless communication equipment. The major challenge in MCIC is to control the constrained sintering of laminated ceramic multilayers to obtain mechanically stable products with the desired properties. Computer simulations of MCIC are of particular interest to companies such as Motorola, Texas Instruments, and Intel, and to other industries in the area of wireless communication technologies. The first MD simulations of sintering of ceramic nanoclusters were carried out in our group in 1996. MD simulations have been performed to study the sintering of Si₃N₄ nanoclusters (each cluster consisting of 20,335 atoms) [57]. The simulations provide a microscopic view of anisotropic neck formation during the early stages of sintering (Fig. 16). In the case of Si₃N₄ nanocrystals at 2000 K, considerable relative motion of the clusters is observed in the initial stages. Subsequently, a few Si and N atoms join the two nanocrystals and, thus bound, they continue to rotate relative to each other for 100 ps. In the next 100 ps, the relative motion subsides and a steady growth of an asymmetric neck between the two nanocrystals is observed. In the neck region, there are more four-fold than three-fold coordinated Si atoms.
Figure 16. (Left panel) Snapshots of Si₃N₄ nanocrystals at 2000 K: (a) at time t = 0; (b) after 40 ps; (c) after 100 ps; and (d) close-up of the neck region after 200 ps. Two types of spheres denote Si and N atoms. (Right panel) Snapshot of sintered amorphous nanoclusters after 700 ps.
The sintering of amorphous Si₃N₄ nanoclusters has also been simulated at 2000 K. The neck between amorphous nanoclusters is much more symmetric than the neck between thermally rough nanocrystals (Fig. 16). The neck region between amorphous nanoclusters has nearly the same number of three- and four-fold coordinated Si atoms. For both nanocrystals and amorphous nanoclusters, sintering is driven by the diffusion of surface atoms. The diffusion in the neck region of amorphous clusters is four times faster than in the neck between nanocrystals. MD simulations have also been performed to study sintering between three nanoclusters. For nanocrystals, a significant rearrangement of the nanocrystals occurs within 100 ps, followed by the onset of neck formation. Amorphous nanoclusters aligned along a straight line have also been simulated. Within 100 ps, we observed relative motion of the clusters. In the next 100 ps, a symmetric neck forms between each pair of clusters and, thereafter, the relative motion subsides. The simulation shows a chain-like structure, which has been observed experimentally as well.
3.2. Structure and Mechanical Properties of Nanostructured Ceramics
Advanced structural ceramics are highly desirable materials for applications under extreme operating conditions. Light weight, elevated melting temperatures,
high strengths, and wear and corrosion resistance make them very attractive for high-temperature and high-stress applications. The only serious drawback of ceramics is that they are brittle at low to moderately high temperatures. In recent years, a great deal of progress has been made in the synthesis of ceramics that are much more ductile than conventional coarse-grained materials [58, 59]. These so-called nanostructured materials are fabricated by in situ consolidation of nanometer-size clusters. Despite a great deal of research, many perplexing questions concerning nanostructured ceramics remain unanswered. Experiments have yet to provide information regarding the morphology of pores or the structure and dynamics of atoms in nanostructured ceramics. As far as modeling is concerned, only a few atomistic simulations of nanostructured materials have been reported thus far. This is due to the fact that these simulations are highly compute-intensive: a realistic MD simulation of a nanostructured solid requires 10⁵–10⁶ time steps and ∼10⁶ atoms (each nanocluster itself consists of 10³–10⁴ atoms). Large-scale MD simulations have been performed to investigate sintering, structure, and mechanical behavior of nanostructured Si3 N4 [60–62], SiC [13], and SiO2 [8]. Figure 17 shows the results of the first joint experimental and MD study of sintering of nanostructured SiC (n-SiC) [13]. In both experiment (solid diamonds) and simulation (open circles), the onset of sintering is around 1500 K. The MD simulations provide a microscopic picture of how the morphology of micropores in n-SiC changes with densification. The fractal dimension and the surface roughness exponent of micropores are found to be 2.4 and 0.45, respectively, over the entire pressure range between 0 and
Figure 17. (Left) Snapshot of nanophase SiC. (Right) The onset of sintering is indicated by an increase in the average particle size in the neutron data (solid diamonds) and an increase in the rate of bond formation between nanoparticles in the MD results (open circles). The dotted line is a guide to the eye for the MD results.
15 GPa. Small-angle neutron scattering at low wave vectors yields a fractal dimension of two for pores in n-SiC. MD calculations of pair-distribution functions and bond-angle distributions reveal that interfacial regions between nanoparticles are highly disordered, with nearly the same number of three-fold and four-fold coordinated Si atoms. The effect of consolidation on mechanical properties is also investigated with the MD approach. The results show a power-law dependence of the elastic moduli on the density with an exponent of 3.4 ± 0.1. The simulation of nanostructured SiO2 involves amorphous nanoclusters, which are obtained from bulk amorphous SiO2 [8]. In n-SiO2, the morphology of micropores, the mechanical behavior, and the effect of nanoscale structures on the short-range and intermediate-range order (SRO and IRO) are investigated (Fig. 18). Pores in nanostructured a-SiO2 are found to have a self-similar structure with a fractal dimension close to two; the pore surface width scales with the volume as W ∼ √V. The MD simulations also reveal that the SRO in nanostructured silica glass is very similar to that in the bulk glass: both consist of corner-sharing Si(O1/2)4 tetrahedra. However, the IRO in nanostructured silica glass is quite different from that in the bulk glass. We have also investigated the mechanical behavior of nanostructured a-SiO2. The elastic moduli are found to have a power-law dependence on the density with
Figure 18. (Top) Snapshots of nanophase amorphous silica at densities 1.37, 1.59, 1.84, and 2.13 g/cc, corresponding to pressures 2, 4, 8, and 16 GPa, respectively. (Bottom) The same systems as above, but with pores colored red.
an exponent of 3.5. These results are in excellent agreement with experimental measurements on high-density silica aerogels.
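The exponents quoted above (3.4 for n-SiC and 3.5 for n-SiO2) correspond to straight-line fits on a log–log scale. A minimal sketch, with the densities taken from Fig. 18 and purely illustrative modulus values (not the simulation data):

```python
import numpy as np

rho = np.array([1.37, 1.59, 1.84, 2.13])   # g/cc, densities from Fig. 18
E = np.array([4.0, 6.7, 11.2, 18.8])       # GPa, illustrative values only

# Fit E = A * rho**m; the slope m of the log-log line is the power-law exponent.
m, log_a = np.polyfit(np.log(rho), np.log(E), 1)
print(f"power-law exponent m = {m:.2f}")    # ~3.5 for these illustrative values
```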
3.3. Crack Propagation in Amorphous SiO2 and Nanostructured Si3 N4
Amorphous silica (a-SiO2) was obtained by heating β-cristobalite to 3200 K and then quenching the molten system to room temperature. The short-range spatial correlations and medium-range order in the computer-generated system are in good agreement with neutron scattering measurements. The calculated Si–O–Si bond-angle distribution also compares very well with nuclear magnetic resonance measurements [63]. The amorphous system was notched, and a uniaxial strain was applied to atoms within 7.5 Å (the cutoff in the potential) of the outermost layers. The system was relaxed for several thousand time steps before incrementing the strain. Figure 19 shows that crack propagation is accompanied by nucleation and growth of nanometer-scale cavities ahead of the crack tip. Cavities coalesce and
Figure 19. (Top) Snapshot of atoms (t = 55 ps) in a MD simulation of fracture in a-SiO2 at room temperature shows nanometer-scale cavities (black) in front of the crack, cavity coalescence, and merging of cavities with the advancing crack. (Bottom) AFM image of a stress corrosion crack (i.e., sub-critical crack growth, in which corrosion by the water contained in the atmosphere assists the crack propagation) in an aluminosilicate glass at room temperature reveals nanometric cavities ahead of the crack. With the fracture-surface topography analysis (FRASTA) method, it is shown that the voids contribute to the final fracture and are actually damage cavities. Recently, the group has observed the same fracture mechanism in silica glass.
merge with the advancing crack to ultimately cause failure. Recent experimental work of Bouchaud et al. [64], involving an atomic force microscope study of fracture in an aluminosilicate glass, reveals nanocavitation and coalescence of cavities with the crack to be the mechanism of fracture. The calculated critical stress intensity factor, K1C, in a-SiO2 is 1 MPa√m, and the experimental values range between 0.8 MPa√m and 1.2 MPa√m [65]. Turning to nanostructured silicon nitride (n-Si3 N4), we first remove a spherical nanoparticle of diameter 6 nm from crystalline α-Si3 N4 [60]. The nanoparticle is thermalized at room temperature, and then 108 different configurations of the nanoparticle are placed randomly in a cubic MD box. (The system contains approximately 10⁶ atoms.) The initial configuration is heated to 2000 K and subsequently sintered under hydrostatic pressures of 5, 10, and 15 GPa. The final sintered system (at 15 GPa) is cooled down and thermalized at room temperature. Subsequently, the pressure is reduced to 10, 5, and 0 GPa. In each instance, the system is relaxed for thousands of time steps. The final n-Si3 N4 configuration is consolidated to 92% of the density of crystalline α-Si3 N4. MD calculations of Si–Si, Si–N, and N–N pair-distribution functions and Si–N–Si and N–Si–N bond-angle distributions reveal that interior regions of nanoparticles remain crystalline, whereas interfacial regions between nanoparticles are akin to amorphous Si3 N4 [60]. This was confirmed by MD simulations of amorphous Si3 N4 of the same mass density as the average mass density of interparticle regions in n-Si3 N4. Partial pair-distribution functions and bond-angle distributions for the two systems are similar. As we shall see momentarily, the amorphous structure of interparticle regions plays a key role in crack propagation in n-Si3 N4. In MD simulations of dynamic fracture in n-Si3 N4, the sintered system at room temperature is notched and subjected to an external strain [61]. Figure 20(a) is a snapshot of the system at 10 ps after the strain reaches
Figure 20. MD simulations of dynamic fracture in n-Si3 N4 at room temperature. Snapshots show the crack front and the cavities in n-Si3 N4 at 10 ps after applied strains of 5% (a), 11% (b), and 14% (c) were reached.
5%. (To highlight cavities and cracks in the system, atoms are not shown in the figure.) In addition to the notch (magenta), we observe nanoscale cavities in amorphous interparticle regions. As the strain is increased, the notch advances, and cavities grow and coalesce among themselves and also with the advancing crack; see Fig. 20(b). The crack front meanders through amorphous interparticle regions; see Fig. 20(c). Nanoscale cavitation, crack meandering, and crack branching render n-Si3 N4 much tougher than the α-Si3 N4 crystal, which undergoes cleavage fracture. The fracture toughness of n-Si3 N4 is estimated to be six times larger than that of the crystal. We have also investigated crack propagation in amorphous nanostructured silica (n-SiO2). The system was generated by removing a spherical nanoparticle of diameter 8 nm from the bulk a-SiO2 system mentioned before. After thermalizing it at room temperature, 100 different configurations of the nanoparticle were placed randomly in a cubic box. Periodic boundary conditions were applied, and the system was sintered at 1000 K under a hydrostatic pressure of 16 GPa. Subsequently, the system was cooled down to room temperature and thermalized both before and after removing the pressure. MD simulations of fracture in n-SiO2 reveal that the crack propagates through interparticle regions. At small values of the applied strain, these regions have a few isolated nanocavities. As the applied strain is increased, we observe that: (a) the precrack advances mostly through interfacial regions; (b) nanocavities grow and coalesce; and (c) new nanocavities form ahead of the crack in interparticle regions. The crack meanders through nanoparticle boundaries, coalescing with nanocavities in its path, until the system completely fractures.
3.4. Scaling Properties of Fracture Surfaces
We have examined the morphology of fracture surfaces in n-Si3 N4 and have found scaling behavior akin to that observed experimentally in a variety of other materials. Fracture surfaces are self-affine objects, with the height–height correlation function varying as

h(r) = ⟨[x(z + r) − x(z)]²⟩_z^(1/2) ∝ r^ζ,  (1)
where x is the height of the fracture profile normal to the plane of crack propagation and ⟨· · ·⟩_z denotes an average over z. Figure 21 shows the MD results for fracture surfaces in n-Si3 N4. The log–log plot of h vs. r reveals two distinct power-law regimes, with exponents ζ = 0.58 and 0.84 below and above a cross-over length, ξc, respectively. The smaller exponent (0.58) is found to be due to intra-cavity correlations, while the larger one (0.84) results from inter-cavity correlations and crack–cavity coalescence. The cross-over length, ξc, is close to the size of the nanoparticle.
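A sketch of how the roughness exponent ζ can be extracted from a one-dimensional fracture profile via Eq. (1), under the assumption of uniformly sampled heights; a synthetic random-walk profile (ζ ≈ 0.5) stands in for real data.

```python
import numpy as np

def height_correlation(x, dz, r_max):
    """h(r) = <[x(z + r) - x(z)]^2>_z^(1/2) for a profile x sampled at spacing dz."""
    rs, hs = [], []
    for k in range(1, int(r_max / dz)):
        diff = x[k:] - x[:-k]
        rs.append(k * dz)
        hs.append(np.sqrt(np.mean(diff ** 2)))
    return np.array(rs), np.array(hs)

x = np.cumsum(np.random.randn(4096))           # synthetic self-affine profile
r, h = height_correlation(x, dz=1.0, r_max=100.0)
zeta, _ = np.polyfit(np.log(r), np.log(h), 1)  # slope of the log-log plot
print(f"roughness exponent zeta = {zeta:.2f}") # ~0.5 for a random walk
```

In the MD analysis, the two regimes of Fig. 21 would simply correspond to fitting this slope separately below and above the cross-over length.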
Figure 21. Height–height correlation function for fracture surfaces in n-Si3 N4 . The MD results show that the roughness exponent is 0.58 and 0.84 below and above a certain crossover length, respectively. The cross-over length is close to the nanoparticle size.
Fracture experiments on various metals, alloys, ceramics, and glasses reveal similar scaling behavior [66]. The experimental value of the lower exponent is around 0.5, while the larger exponent is always close to 0.8, independent of the material or its microstructure. The cross-over length ξc is, however, a material characteristic, which decreases with an increase in the crack velocity.
3.5. Interfacial Fracture at the Silicon/Silicon Nitride Interface
Interfaces between dissimilar materials are ubiquitous in silicon integrated-circuit and other heterojunction-based technologies. Owing to the differences in their mechanical and thermal properties, high stresses are known to develop at such interfaces and at the edge regions generated in delineating discrete device elements or pixels [67]. This can cause defect formation, including crack initiation and propagation. Fracture at interfaces has been the subject of numerous experimental and theoretical studies. Cracking patterns range from surface cracks and channeling in the film to substrate damage, spalling, and debonding of the interface. Silicon dioxide and silicon nitride are two dielectrics commonly employed in semiconductor technology for a variety of purposes, such as gate insulation, trench isolation, and encapsulation. In recent years, finite-element analyses have been undertaken to examine aspects of stress distributions in such situations, to supplement the more limited results available from analytical theories [68]. Theoretical studies and simulations of crack initiation,
propagation (i.e., dynamics), and fracture have, however, been lacking for semiconductor/dielectric interfaces. A way to simulate crack initiation and propagation is to apply a uniaxial strain parallel to the interface and examine, via molecular dynamics, the time evolution of the system to analyze its failure resistance. We discuss the Si/Si3 N4 interface and nanopixels in the following two sections. In an effort to increase processing speed and memory density, the feature sizes of semiconductor devices are expected to shrink to 50 nm or smaller in the next several years. Stresses induced in Si/Si3 N4 nanopixels are major sources of defects and inhomogeneities in the system, and they become significant for pixel sizes in the nanometer range. One of the main issues is the development of reliable physical models for the Si/Si3 N4 interface, which can supplement empirical data in nanopixel design. In our simulations, silicon nitride is represented by an interatomic potential involving two- and three-body interactions. The two-body terms include steric repulsion, the effect of charge transfer via the Coulomb interaction, and the large electronic polarizability of anions through the charge–dipole interaction. Three-body terms account for bond-bending and bond-stretching effects. The bulk and Young's moduli, along with the phonon density of states of the crystal and structural correlations in the amorphous state [21], are described well by the interaction potential. It has been used successfully to study fracture in crystalline, amorphous, and nanophase Si3 N4 [9, 10, 60, 61]. The silicon system is described by the Stillinger–Weber potential [69]. To account for all the structural correlations for silicon, silicon nitride, and the Si (111)/Si3 N4 (0001) interface, the system is modeled using eight components [70, 71]. These consist of: Si4+ and N3− in the bulk Si3 N4; Si3+, N2−, and N3− at the Si3 N4 side of the interface; three-fold coordinated Si at the Si (111) interface and its four-fold coordinated neighboring silicon in the plane; and bulk Si. The multimillion-atom simulations were performed on a variety of parallel supercomputers using highly efficient space–time MRMD algorithms. Molecular dynamics, Langevin dynamics, and steepest-descent quench methods were used. The interfacial bond lengths obtained from our interaction potentials are consistent with chemical arguments and self-consistent linear combination of atomic orbitals (LCAO) calculations [72] and give a satisfactory description of the structure of the silicon nitride/silicon interface (see Fig. 22). A schematic of the geometry of the interface system is shown in Fig. 23 [71]. After thermalizing the system at 300 K, the system is stretched parallel to the interface, i.e., in the [21̄1̄0] direction for silicon nitride and in the [2̄11] direction for silicon, until it failed. For each percent of strain, the system was subjected to a 2 ps stretching phase and a 2 ps relaxation phase, as seen in the time evolution of σxx, the stress tensor component in the stretching direction (see Fig. 23). The system did not show any failure up to 8% strain. At 9% strain, within the first 2 ps, σxx decreased dramatically. This is due to the
Figure 22. (Left) The atomic structure of the Si (111)/Si3 N4 (0001) interface. The small red spheres are the Si (111) atoms; the small cyan spheres are the Si atoms of the Si3 N4 side; and the large spheres are the N atoms of the Si3 N4 side. (Right) The valence charge–density map of the same system calculated with an LCAO calculation.
Figure 23. (Left) Schematic of the simulated Si [111]/Si3 N4 [0001] system. (Center) Uniaxial stress in the x direction as a function of time. (Right) A slice of the system, in which a dislocation is highlighted by a circle. Solid dots are atoms.
fact that a crack started to form at the top surface of the silicon nitride layer; it propagated through the whole silicon nitride layer within 17 ps. The system was monitored for an additional 80 ps. It was found that the crack does not propagate into Si but instead emits dislocations, which correlates well with an additional drop in σxx after 48 ps. We have examined the structure of
Figure 24. Close-up of a dislocation loop. Only Si atoms with energies larger than the average silicon atom energy by +0.35 eV are plotted.
silicon at the interface to determine the nature of the defects created by the crack arriving from the silicon nitride. In Fig. 23, the extra line of atoms (in yellow) in a Si (111) plane parallel to the interface – an edge dislocation – can be clearly seen. The dislocation core lies within the white dashed circle. The projection of the displacement vector onto the (111) plane is in the [1̄10] direction, as indicated by the arrow from a red to a yellow Si atom. The time evolution is given in Figs. 24 and 25(a)–(c). Only those Si atoms whose energies are higher than the average silicon energy by +0.35 eV are shown. Interfacial (blue) and surface atoms (red) also satisfy this criterion, i.e., their energy is 0.35 eV larger than the average energy, and can be seen at the top (interface – blue atoms) and bottom (silicon surface – red atoms). In Fig. 25(a), we see the formation of a dislocation loop at the interfacial plane (blue) and the right-hand silicon surface (surface atoms belonging to the vertical planes have been removed from the plot to make the dislocation loop visible). The dislocation loop lies on a (1̄11̄) plane, denoted with dashed lines in Fig. 25(a). This loop has five segments: the first segment, the line in the interfacial plane (blue atoms at the top), is in the direction [1̄10]; moving clockwise, the second segment, vertical, is in the direction [011], the third in the direction [01̄1], the fourth in the direction [110], and the last segment is in the direction [011]. As time proceeds, the dislocation loop grows (see Figs. 25(b) and (c)) till it reaches the silicon surface (red) at the bottom after 13 ps. From our simulation data we estimate the speed of the dislocation motion to be 500 (±100) m/s.
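The visualization criterion used in Figs. 24 and 25 amounts to a simple per-atom energy filter; a minimal sketch (function and array names are hypothetical):

```python
import numpy as np

def high_energy_atoms(pos, energy, threshold=0.35):
    """Select atoms whose potential energy exceeds the mean by `threshold` (eV).

    Dislocation-core, surface, and interfacial Si atoms all pass this filter,
    which is why they are the only atoms visible in the figures.
    """
    return pos[energy > energy.mean() + threshold]
```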
3.6. Nanometer-Scale Stress Patterns in Si/Si3 N4 Nanopixels
The first MD simulations of nanopixels in the range of 25 to 70 nm were performed in our group using parallel MD [70]. Large-scale
Figure 25. Time evolution of dislocation motion. Only Si atoms with energies larger than the average silicon atom energy by +0.35 eV are plotted, i.e., interfacial atoms (blue), surface atoms (red), and atoms in the dislocation core (red). (a) At 9.12 ps, formation of a dislocation loop at the interfacial plane (blue) and the right-hand silicon surface (surface atoms belonging to the vertical planes have been removed from the plot to make the dislocation loop visible). The dislocation loop lies on a (1̄11̄) plane denoted by dashed lines. The dislocation loop consists of five segments. The first segment is in the interfacial plane (blue atoms at the top) in the direction [1̄10]; moving clockwise, the second segment, vertical, is in the direction [011], the third is in the direction [01̄1], the fourth in the direction [110], and the last segment is in the direction [011]. (b) and (c) show the dislocation loop after 10.56 ps and 12 ps, respectively.
computing resources located at the Caltech National Science Foundation facility and at DoD sites were used for these simulations. There are many interesting and challenging issues at the semiconductor/ceramic interface. These include stresses due to the bonding of two very dissimilar materials and the effect of lattice mismatch on interfacial stresses. Beyond these, there is also the question of stresses due to edges and corners in such small nanostructures. Many interesting phenomena are associated with length scales far beyond those accessible to electronic-structure calculations. At present, the only viable solution to this problem is large-scale MD simulation, provided the interatomic potentials are able to describe Si, Si3 N4, and the interface in a seamless fashion.
In the interatomic interaction scheme, a clear distinction between Si atoms in the silicon substrate and those in silicon nitride is essential. In addition, the atoms near the interface have different charge transfer from those in bulk Si3 N4. The LCAO electronic-structure calculations for the Si (111)/Si3 N4 (0001) interface [72] indicate that the interatomic interaction in Si/Si3 N4 can be modeled very well as an eight-component system, where each of the eight atom types is associated with a different set of parameters in the interatomic potential. Bulk Si is modeled by the Stillinger–Weber potential. The potential for bulk silicon nitride is a sum of two-body and three-body terms. The former includes the effects of charge transfer, electronic polarizability, steric repulsion, and van der Waals interactions; the latter takes into account covalent effects through bond-bending and bond-stretching terms. The interatomic potential has been validated by comparison with experiments on crystalline and amorphous Si3 N4. For atoms at the interface, the charge transfer, bond lengths, and bond angles are consistent with the results of the electronic-structure calculations. To study the atomic-level stress distribution in a Si/Si3 N4 nanopixel, we have performed MD simulations involving up to 27 million atoms [70]. The system consists of a Si mesa placed on top of a Si (111) substrate. The top surface of the mesa is covered with a crystalline Si3 N4 (0001) or amorphous Si3 N4 film. The Si (111)/c-Si3 N4 (0001) interface has a 1.1% lattice mismatch, which induces stresses in the system. The lattice mismatch causes compressive stresses in Si3 N4, while a tensile stress is observed in Si (Fig. 26(a)). Note the effects of surfaces and edges on the stress. In the case of an a-Si3 N4 film, we find the stress to be laterally nonuniform, as seen in Fig. 26(b), which is quite different from the behavior of crystalline films. Lateral stress domains on the scale of 100–150 Å are observed in the case of the amorphous Si3 N4 film. As the PECVD (plasma-enhanced chemical vapor deposition) films employed in practice are polycrystalline, such lateral stress inhomogeneity is expected in those films, and our results reveal a hitherto unappreciated, serious consideration for the processing of nanoscale pixels. Figure 27 shows stress patterns and the effect of the mesa shape (square or rectangular) on stresses. For a 25 nm square-mesa system with 3.7 million atoms, the three-fold symmetry of Si (111) gives rise to three tensile stress domains. For a 10 million-atom system with a rectangular mesa of dimensions 54 nm × 33 nm, a similar stress pattern is observed in silicon just below the interface. However, the aspect ratio of 1.6 for the 54 nm × 33 nm mesa does not accommodate two three-fold patterns like the one in the 25 nm square mesa. The two stress patterns are squeezed together into a Y shape, with the longer leg along the longer side of the mesa. Pixel sizes on the order of or less than 50 nm are currently being considered by industry and government agencies for fabrication in the 2005–2010
Figure 26. Pressure distribution in a Si/Si3 N4 nanopixel. (a) To show the pressure inside a nanopixel covered with crystalline Si3 N4, one quarter of the system is removed. (b) Pressure distribution in the Si substrate parallel to the interface with amorphous Si3 N4.
Figure 27. Horizontal cross-sections of stress distributions in nanopixels covered with amorphous Si3 N4 for two different system sizes. The slices are taken through Si3 N4 above the interface and Si below the interface for the 25 nm square and the 54 nm × 33 nm rectangular mesas.
period. Stress domains in these pixels may have a significant effect on the performance of such devices: they may cause the dopant distribution to be highly inhomogeneous, since the size of the stress domains can be comparable to the dimensions of the nanopixel.
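Stress maps such as Fig. 26(b) can be produced by averaging precomputed per-atom pressures (e.g., from the atomic virial) on a lateral grid; a minimal sketch with hypothetical inputs:

```python
import numpy as np

def lateral_pressure_map(pos, p_atom, box_xy, nbins=64):
    """Average per-atom pressure on an nbins x nbins grid parallel to the interface.

    pos: (N,3) positions of atoms in the slice of interest; p_atom: per-atom
    pressures; box_xy: (Lx, Ly) lateral box dimensions.
    """
    ix = (pos[:, 0] / box_xy[0] * nbins).astype(int) % nbins
    iy = (pos[:, 1] / box_xy[1] * nbins).astype(int) % nbins
    p_sum = np.zeros((nbins, nbins))
    count = np.zeros((nbins, nbins))
    np.add.at(p_sum, (ix, iy), p_atom)
    np.add.at(count, (ix, iy), 1)
    return p_sum / np.maximum(count, 1)   # avoid dividing by zero in empty bins
```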
3.7. Self-limiting Growth and Critical Lateral Sizes in GaAs/InAs Nanomesas
In recent years, coherently strained three-dimensional islands formed in semiconductor overlayers having a high lattice mismatch with the underlying substrate have attracted much attention, due to their importance in the study of electronic behavior in zero dimensions and their applications in electronic and optoelectronic devices. The role and manipulation of stress in the formation of such nanostructures have been systematically examined through a study of the growth of InAs on planar and patterned GaAs (001) substrates (these systems have a large lattice mismatch of 6.6%). On infinite planar substrates, strain relief leads to the formation of coherent three-dimensional island structures above a critical amount, ∼1.6 monolayers (ML), of InAs deposition. On the contrary, when InAs is deposited on (001)-oriented GaAs square mesas of size ≤75 nm, the island morphology is suppressed and, instead, a continuous film with flat morphology is observed. This InAs film growth is, however, self-limiting and stops at ∼11 ML. In order to understand the self-limiting nature of the InAs film growth, we have recently performed MD simulations of InAs/GaAs nanomesas with {101}-type sidewalls, see Fig. 28(a) [18]. The in-plane lattice constant of InAs layers parallel to the InAs/GaAs (001) interface starts to exceed the InAs bulk
Figure 28. (a) Atomic-level hydrostatic stress in an InAs/GaAs square nanomesa with a 12 ML InAs overlayer. (b) Vertical displacement of As atoms in the first As layer above the InAs/GaAs interface in the 8.5 million-atom and the 2.2 million-atom nanomesas.
value at the 12th ML, and the hydrostatic stresses in the InAs layers become tensile above the ∼12th ML. As a result, it is not favorable to have InAs overlayers thicker than 12 ML. This may explain the experimental finding of the growth of flat InAs overlayers with a self-limiting thickness of ∼11 ML on GaAs nanomesas. Length scales are of critical significance for stress relaxation and manipulation, leading to control of the island number on chosen nanoscale area arrays. For example, on stripe mesas of sub-100-nm widths on GaAs (001) substrates, deposition of InAs is shown to allow self-assembly of three, two, and single chains of InAs three-dimensional island quantum dots selectively on the stripe mesa tops for widths decreasing from 100 nm down to 30 nm. We have recently investigated lateral size effects on the stress distribution and morphology of InAs/GaAs nanomesas using parallel MD simulations, see Fig. 28(b) [17]. Two mesas with the same vertical size but different lateral sizes were simulated. For the smaller mesa, a single stress domain is observed in the InAs overlayer, whereas two stress domains are found in the larger mesa (a highly compressive domain is located at the center of the InAs overlayer, whereas the peripheral region of the InAs overlayer is less compressive). This indicates the existence of a critical lateral size for domain formation, in accordance with recent experimental findings. We have also studied the morphology of the InAs overlayer near the InAs/GaAs interface. For the 2.2 million-atom nanomesa, the As layer is "dome" shaped. In contrast, the As layer in the 8.5 million-atom nanomesa shows a "dimple" at the center of the mesa. This provides clear evidence that there exists a critical lateral size for such stress domain formation and that the critical value lies somewhere between 124 and 407 Å. Detailed analysis of structural correlations has revealed that the InAs overlayer in the larger mesa is laterally constrained to the GaAs bulk lattice constant but vertically relaxed to the InAs bulk lattice constant, which is consistent with the Poisson effect.
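The layer-by-layer in-plane lattice constant used in this analysis can be measured by averaging in-plane neighbor distances within each (001) monolayer. A sketch follows; the √2 factor assumes the nearest cation–cation spacing of a zinc-blende lattice, and all inputs are hypothetical.

```python
import numpy as np

def inplane_lattice_constant(pos, layer_ids, pairs):
    """Effective cubic lattice constant per (001) monolayer.

    pos: (N,3) positions; layer_ids: monolayer index per atom; pairs: (M,2)
    indices of in-plane nearest-neighbor atoms on the cation sublattice.
    In zinc blende the in-plane cation-cation spacing is a/sqrt(2), so the
    lattice constant is recovered by multiplying the mean spacing by sqrt(2).
    """
    d = np.linalg.norm(pos[pairs[:, 0]] - pos[pairs[:, 1]], axis=1)
    layer = layer_ids[pairs[:, 0]]
    return {l: d[layer == l].mean() * np.sqrt(2.0) for l in np.unique(layer)}
```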
3.8. Structural Transformation in GaAs Nanocrystals
Aggregates of nanometer-size semiconductor crystals have promising applications in photovoltaics, light-emitting diodes, and single-nanocrystal, single-electron transistors. Self-organized assemblies of colloidal nanocrystals act as intelligent photonic-crystal materials, which can be used as sensors and optical switches. Recently, self-formation of a laser was demonstrated in semiconductor nanopowders due to disorder-induced photon localization mechanisms. Rod-shaped nanocrystals emit polarized light and will be useful for biological tagging applications. The most recent additions to this family of nanocrystals include tetrapods. Such systematically controlled anisotropic shapes can be used as building blocks for the self-assembly of three-dimensionally integrated nanostructures through surface-stress encoded epitaxy, self-alignment, and
biological templates. Finally, these nanocrystals offer new synthetic paths to novel materials that do not exist in bulk form. Size-dependent phase stability plays an essential role in the synthesis of nanocrystals. For example, many III–V and II–VI semiconductors transform from a four-coordinated phase to a six-coordinated phase as the pressure is increased, and the transition pressure often exhibits a strong size dependence. Upon release of pressure, the metastable high-pressure phase can be kinetically trapped in nanoclusters. This is because nanoclusters contain such a large fraction of surface atoms that surface energetics essentially controls the phase stability. We may thus be able to prepare interior bonding geometries that do not occur in the known extended solid by adjusting the surface energy. In other words, it is possible to manipulate nanocrystal surfaces to trap structures that would ordinarily be unstable in the bulk. Nanophase engineering uses controlled pressurization and annealing to achieve new material forms that are nonexistent in the bulk. Molecular-dynamics simulations are expected to reveal the microscopic mechanisms of the nanocrystalline phase kinetics. We have performed MD simulations to investigate pressure-induced structural transformations in GaAs nanocrystals of different sizes [55]. To mimic the experimental situation, the nanocrystals are immersed in a Lennard-Jones liquid so that they can be subjected to hydrostatic pressure, see Fig. 29. It is found that the transformation from four-fold (zinc-blende) to six-fold (rocksalt) coordination starts at the surfaces of the nanocrystals and proceeds inwards with increasing pressure, see Fig. 30(a). Inequivalent nucleation of the high-pressure phase at different sites leads to an inhomogeneous deformation of the nanocrystal. For sufficiently
Figure 29. Initial thermalized system. The GaAs nanocrystal is embedded in the Lennard–Jones liquid that serves as a hydrostatic pressure medium.
Figure 30. Structural transformation in a GaAs nanocrystal from outer to inner shells. (a) An 8 Å slice of an initially spherical nanocrystal of diameter 60 Å that is partially transformed at a pressure of 17.5 GPa. The outermost shell shows the rocksalt structure (atoms making four-membered rings), while the innermost shell continues to show the zinc-blende structure (atoms making six-membered rings). (b) The same slice with the nanocrystal completely transformed at 22.5 GPa. The rocksalt structure can now be seen in the innermost shell. The red lines are a guide to the eye to distinguish the differently oriented grains.
large spherical nanocrystals, this gives rise to rocksalt structures of different orientations separated by grain boundaries, see Fig. 30(b). The absence of such grain boundaries in a faceted nanocrystal of moderate size indicates sensitivity of the transformation to the initial nanocrystal shape. The pressure corresponding to the complete transformation increases with the nanocrystal radius and it approaches the bulk value for a spherical nanocrystal of ∼5000 atoms.
3.9. Nanoindentation of Silicon Nitride
Nanoindentation testing is a unique probe of the mechanical properties of materials. Typically, an atomic force microscope tip is modified to indent the surface of a very thin film, see Fig. 31. The resulting damage is used to rank the ability of the material to withstand plastic damage against that of other materials. In addition, a load-displacement curve is constructed from the force measured at each displacement, and the elastic modulus in the direction of the indent can be extracted from the initial part of the unloading curve. Commercial nanoindentation apparatus typically have a force resolution of ±75 nN and a
Figure 31. (a) Schematic of an AFM modified for nanoindentation experiments. (b) and (c) Schematic views of the indenter/substrate system. In our MD simulations, the substrate has dimensions 60.6 × 60.6 × 30 nm³ and contains 10,614,240 atoms. The x- and y-axes are normal to the (101̄0) and (12̄10) surfaces, respectively. The indent was made into the (0001) surface.
depth resolution of ±0.1 nm. Recent developments in parallel computing and multiscale algorithms have enabled MD simulations to reach the scale of such commercial nanoindenters, resulting in a better atomic-level understanding of the indentation process.
Figure 32. Local pressure distribution directly under the indenter. Frames from the loading and unloading cycles are shown in clockwise order. The displacement of the indenter is given in the top left corner of each frame.
We have performed MD simulations to investigate nanoindentation in Si3 N4 [11, 12]. The nanoindentation simulation is performed on the (0001) surface of a 60 nm × 60 nm × 30 nm crystalline α-Si3 N4 slab consisting of 10 million atoms (see Fig. 31). The sample is indented using a pyramidal indenter with a load of ∼10 µN and an indentation depth of ∼10 nm (see Figs. 32 and 33). From the load-displacement curve, the hardness is estimated to be 50.3 GPa (see Fig. 34). (We have also calculated the hardness of amorphous Si3 N4 to be 31.5 GPa using a similar geometry.) Our simulations reveal significant plastic deformation and pressure-induced amorphization under the indenter. The simulations also exhibit anisotropic fracture toughness: indentation cracks are observed along the [12̄10] direction, which coincides with one of the diagonal directions of the indenter, but not along the other diagonal direction, [1̄100]. Simulations were also performed to determine temperature effects, load-rate effects, and simulation-size effects in crystalline and amorphous silicon nitride. The simulations were run on several different parallel platforms, including the 256- and 128-node IBM SPs at the US Army Engineer Research and Development Center (ERDC), the 1088-node Cray T3E at NAVO, and the 512-node Origin 2000 at the Aeronautical Systems Center (ASC). Recently we have completed nanoindentation simulations of SiC [15] and GaAs. MD simulations of nanoindentation of alumina are in progress.
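For reference, hardness is defined from the load-displacement data as H = P_max/A_c, the maximum load divided by the projected contact area. In the MD study the contact area is obtained from the atomic configuration; the ideal-geometry area function and the contact depth in the sketch below are illustrative assumptions only.

```python
def hardness(p_max, contact_area):
    """H = P_max / A_c, in Pa for SI inputs."""
    return p_max / contact_area

h_c = 2.9e-9                    # m, hypothetical contact depth
a_c = 24.5 * h_c ** 2           # m^2, ideal Berkovich-like area function
print(f"H ~ {hardness(10e-6, a_c) / 1e9:.0f} GPa")   # ~10 uN load, as in the text
```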
3.10. Oxidation of Aluminum Nanoparticles
Oxidation plays a critical role in the performance and durability of various nanosystems. Oxidation of metallic nanoparticles offers an interesting possibility of synthesizing nanocomposites with both metallic and ceramic
Figure 33. Time sequence of an indented surface of α-Si3 N4 .
Figure 34. Load-displacement curve for (left) 10 million-atom α-Si3 N4 nanoindentation simulation and (right) 10 million-atom amorphous Si3 N4 simulation.
properties. We have performed the first successful MD simulation of the oxidation of an Al nanoparticle (diameter 200 Å) [35]. The MD simulations are based on an interaction scheme developed by Streitz and Mintmire, which successfully describes a wide range of physical properties of both metallic and ceramic systems [32]. This scheme is capable of treating bond formation and bond breakage and changes in charge transfer as the atoms move and their local environments are altered. The MD simulations provide a detailed picture of the rapid evolution and culmination of the surface oxide thickness, local stresses, and atomic diffusivities, see Fig. 35. In the first 5 ps, oxygen molecules dissociate, and the oxygen atoms diffuse first into octahedral and subsequently into tetrahedral sites in the Al nanoparticle. In the next 20 ps, as the oxygen atoms diffuse radially into, and the Al atoms diffuse radially out of, the nanoparticle, the fraction of six-fold coordinated oxygen atoms drops dramatically. Concurrently, there is a significant increase in the number of O atoms forming clusters of corner-sharing and edge-sharing OAl4 tetrahedra. Between 30 and 35 ps, clusters of OAl4 coalesce to form a neutral, percolating tetrahedral network that impedes further intrusion of oxygen atoms into, and of Al atoms out of, the nanoparticle. The electrostatic and non-electrostatic contributions to the local pressure in the nanocluster after 100 ps of simulation time are shown in Figs. 36(a) and (b), respectively. The stable oxide scale formed at the end of our simulation is shown in Fig. 37. Structural analysis reveals a 40 Å thick amorphous oxide scale on the Al nanoparticle. The thickness and structure of the oxide scale are in accordance with experimental results.
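The oxide-scale thickness can be read off a radial composition profile around the particle center; a minimal sketch, with the shell width, particle size, and array names as assumptions:

```python
import numpy as np

def oxygen_fraction_profile(pos_o, pos_al, center, r_max=120.0, dr=2.0):
    """Oxygen fraction in spherical shells around the nanoparticle center.

    The oxide scale shows up as the outer region of high O fraction; its
    width gives the scale thickness (~40 A in the simulation above).
    """
    edges = np.arange(0.0, r_max + dr, dr)
    n_o, _ = np.histogram(np.linalg.norm(pos_o - center, axis=1), bins=edges)
    n_al, _ = np.histogram(np.linalg.norm(pos_al - center, axis=1), bins=edges)
    total = np.maximum(n_o + n_al, 1)      # avoid dividing by zero in empty shells
    return edges[:-1] + dr / 2, n_o / total
```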
Figure 35. Initial stages of oxidation of an Al nanoparticle. Size distributions of OAl4 clusters between 20 and 31 ps are shown. The clusters coalesce and percolate rapidly.
3.11. Ceramic Fiber Composites
Physical properties of composite materials often exhibit synergistic enhancement. For example, the fracture toughness of a fiber composite is much larger than a linear combination of the toughness values of the constituent
Figure 36. (a) Electrostatic and (b) nonelectrostatic contributions to the local pressure in the nanocluster after 100 ps of simulation time.
materials. This enhanced toughness has been attributed to the frictional work associated with the pulling out of fibers, which suggests that tough composites can be designed by combining strong fibers with weak fiber–matrix interfaces. Recently, we have performed MD simulations (Fig. 38) to investigate the atomistic toughening mechanisms in a Si3 N4 ceramic matrix (bulk modulus 285 GPa) reinforced with SiC fibers (bulk modulus 220 GPa, 16 vol.% fibers) coated with amorphous silica (bulk modulus 36 GPa) [73]. The simulations, involving 1.5 billion atoms, were performed on DoD parallel supercomputers. Fiber reinforcement is found to increase the fracture toughness by a factor of two. The atomic-stress distribution shows an enhancement of shear stresses at the interfaces. The enhanced toughness results from frictional work during the pullout of the fibers. Immersive visualization of these simulations reveals a rich diversity of atomistic processes, including fiber rupture and the emission of molecular fragments, which must be taken into account in the design of tough ceramic composites.
3.12. Environmental Effects on Fracture
The hybrid MD/QM simulation scheme was applied to study the effects of environmental molecules on fracture initiation in silicon, see Fig. 39 [43]. A (110) crack under tension (mode-I opening) is simulated with multiple H2 O
Figure 37. Snapshot of the Al nanocluster after 0.5 ns of simulation time. (A quarter of the system is cut out to show the aluminum/aluminum-oxide interface.) The larger spheres correspond to oxygen and smaller spheres to aluminum; color represents the charge on an atom.
Figure 38. (Left panel) Fractured silicon nitride ceramic reinforced with silica-coated silicon carbide fibers. (Right panel) Close-up of the fractured composite system. Small spheres represent silicon atoms, and large spheres represent nitrogen, carbon, and oxygen atoms.
Figure 39. Schematic of three types of reaction processes – (a) chemisorption, (b) oxidation, and (c) bond breakage – found in the MD/QM simulations with K = 0.4 and 0.5 MPa√m.
molecules around the crack front. The electronic structure near the crack front is calculated with density functional theory. To accurately model the long-range stress field, the quantum-mechanical description is embedded in a large classical molecular-dynamics simulation. The hybrid simulation results show that the reaction of H2 O molecules at a silicon crack tip is sensitive to the stress intensity factor K. For K = 0.4 MPa√m, an H2 O molecule either decomposes and adheres to dangling-bond sites on the crack surface or oxidizes Si, resulting in the formation of a Si–O–Si structure. For a higher K value, 0.5 MPa√m, an H2 O molecule either oxidizes or breaks a Si–Si bond.
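One standard way to impose a prescribed stress intensity factor K on an atomistic crack is to displace the atoms by the isotropic, plane-strain mode-I (Irwin) near-tip field before relaxation. Whether the authors used exactly this form is not stated, so the sketch below is illustrative only.

```python
import numpy as np

def mode_one_displacement(pos, tip, K, mu, nu):
    """Plane-strain mode-I near-tip displacement field for stress intensity K.

    pos: (N,2) in-plane atom coordinates; tip: crack-tip position; mu: shear
    modulus; nu: Poisson ratio. Returns the (N,2) displacements to apply.
    """
    kappa = 3.0 - 4.0 * nu                        # plane-strain Kolosov constant
    dx, dy = pos[:, 0] - tip[0], pos[:, 1] - tip[1]
    r, th = np.hypot(dx, dy), np.arctan2(dy, dx)
    pref = K / (2.0 * mu) * np.sqrt(r / (2.0 * np.pi))
    ux = pref * np.cos(th / 2) * (kappa - 1 + 2 * np.sin(th / 2) ** 2)
    uy = pref * np.sin(th / 2) * (kappa + 1 - 2 * np.cos(th / 2) ** 2)
    return np.column_stack([ux, uy])
```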
3.13. Conclusion and Future Research
Current multi-teraflop parallel supercomputers (performing trillions of floating-point operations per second) enable large-scale MD simulations involving up to a billion atoms [5]. Petaflop computers (performing 10¹⁵ floating-point
operations per second), anticipated to be built in the next 5–10 years, are expected to enable trillion-atom MD simulations. In the same time frame, metacomputing on a Grid of geographically distributed supercomputers, mass storage, and virtual environments connected via high-speed networks will revolutionize computational research by enabling (i) very large-scale computations that are beyond the power of a single supercomputer, and (ii) collaborative, hybrid computations that integrate multiple areas of distributed expertise [49]. A multidisciplinary application that will soon require Grid-level computing is emerging at the forefront of computational science and engineering: we have recently developed a multiscale simulation approach that seamlessly combines continuum mechanics based on the FE method, MD simulations to describe atomistic processes, and QM calculations based on DFT to handle the breakage and formation of atomic bonds [37]. These emerging new computer architectures, together with further developments in scalable simulation algorithms and parallel computing frameworks, will be critical for the advancement of modeling and simulation research. Some of the most exciting and challenging opportunities in simulation research lie at the nano-bio interface. The following subsections illustrate several nanoscale systems that will be amenable to atomistic simulations in the near future.
3.14. Chemically Synthesized Quantum Dot Structures
3.14.1. Quantum rods and tetrapods

Self-organized assemblies of colloidal nanocrystals act as intelligent photonic-crystal materials, which can be used as sensors and optical switches. Rod-shaped nanocrystals synthesized by Paul Alivisatos's group at Berkeley [74] (Fig. 40, left) emit polarized light and will be useful for biological tagging
Figure 40. Transmission electron micrographs of CdSe nanocrystal quantum rods (left) and a tetrapod (right) (from Paul Alivisatos's group, University of California, Berkeley).
applications. The most recent additions to this family of nanocrystals include tetrapods (Fig. 40, right). Such systematically controlled anisotropic shapes can be used as building blocks for the self-assembly of three-dimensionally integrated nanostructures through surface-stress encoded epitaxy, self-alignment, and biological templates [75].
3.14.2. Core-shell nanoparticles

The nanocrystals mentioned above are often coated with heterogeneous materials to form so-called "core-shell" structures [76]. In semiconductor nanocrystals, the core-shell structures achieve better quantum confinement and enhanced luminescence quantum yield compared with their monolithic counterparts. Semiconductor nanocrystals can be passivated both by epitaxially grown heterogeneous semiconductor layers and by disordered oxides to improve carrier confinement and enhance optical diffraction. Furthermore, these semiconductor quantum dots can be coated with conducting layers, enabling intercluster charge transfer to achieve tunable electrophotonic properties. Although core-shell quantum dots such as CdSe/CdS and CdSe/ZnS achieve higher luminescence quantum yields than their monolithic counterparts, the large lattice mismatch (the CdSe lattice constant is 4.0% and 12.7% larger than that of CdS and ZnS, respectively) causes residual stresses and mechanical instabilities such as cracking. From a geometrical consideration, materials with larger lattice constants are preferable as shells (such as an InAs shell, whose lattice constant is 6.6% larger than that of a GaAs core). Molecular-dynamics simulations will be useful to study residual stresses and cracking in lattice-mismatched core-shell quantum dots.
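The quoted mismatches follow directly from the lattice constants; the wurtzite a-axis values in the sketch below are approximate textbook numbers, assumed here only for illustration.

```python
# Mismatch f = (a_core - a_shell) / a_shell.
a = {"CdSe": 4.30, "CdS": 4.14, "ZnS": 3.82}   # angstroms, approximate wurtzite a-axis
for shell in ("CdS", "ZnS"):
    f = (a["CdSe"] - a[shell]) / a[shell]
    print(f"CdSe/{shell}: mismatch = {100 * f:.1f}%")   # ~3.9% and ~12.6%
```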
3.14.3. Protein-based nanostructures

Self-assembled protein structures from extremophiles (microbes, such as bacteria and archaea, living in extreme environments), combined with advanced genetic-engineering techniques, offer tremendous opportunities for nanotechnology. Recently, double-ring structures composed of heat-shock proteins isolated from hyperthermophilic archaea have been used as building blocks for synthesizing a wide variety of self-assembled nanostructures, such as nanotubes and two-dimensional superlattices (Fig. 41). Protein nanostructures have potential applications as templates for self-assembled optoelectronic devices and as biocompatible coatings [75]. Since each protein consists of 10⁴–10⁵ atoms, atomistic simulations of these protein-based nanostructures will be a challenge, requiring petaflop and Grid architectures.
Figure 41. (Left) Sliced top view revealing the ring structure of the thermosome from Thermoplasma acidophilum – a heat shock protein 60 in organisms (thermophiles) living at high temperatures. (Center) Aggregates of chaperonin filaments synthesized by Jonathan Trent's group at NASA Ames. (Right) A two-dimensional superlattice of chaperonins synthesized by Jonathan Trent's group at NASA Ames.
On petaflop machines, due to be available in the 2010 time frame, it should be possible to simulate in their entirety three-dimensional nanostructures built from nanoscale rods, tetrapods, and core-shell nanoparticles on an ordered array of proteins.
Acknowledgments

This work is partially supported by AFOSR, ARO, DARPA, DOE, NSF, and the USC-Berkeley-Princeton-LSU DURINT. A few million-atom simulations were performed using the in-house parallel computers at the Collaboratory for Advanced Computing and Simulations at the University of Southern California. Ten-million- to billion-atom simulations were performed using parallel computers at the High Performance Computing Center at the University of Southern California and at the Department of Defense's Major Shared Resource Centers under a DoD Challenge project.
References

[1] A. Pechenik, R.K. Kalia, and P. Vashishta, Computer-Aided Design of High-Temperature Materials, Oxford University Press, Oxford, UK, 1999.
[2] A. Nakano, M.E. Bachlechner, R.K. Kalia, E. Lidorikis, P. Vashishta, G.Z. Voyiadjis, T.J. Campbell, S. Ogata, and F. Shimojo, "Multiscale simulation of nanosystems," Comput. Sci. Engrg., 3(4), 56–66, 2001.
[3] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, "Simulating materials failure by using up to one billion atoms and the world's fastest computer: Brittle fracture," Proc. Nat. Acad. Sci. USA, 99, 5777–5782, 2002.
[4] T.C. Germann and P.S. Lomdahl, "Recent advances in large-scale atomistic materials simulations," IEEE Comput. Sci. Eng., 1(2), 10, 1999.
[5] A. Nakano, R.K. Kalia, P. Vashishta, T.J. Campbell, S. Ogata, F. Shimojo, and S. Saini, "Scalable atomistic simulation algorithms for materials research," Sci. Progr., 10, 263, 2002.
[6] A. Sharma, A. Nakano, R.K. Kalia, P. Vashishta, S. Kodiyalam, P. Miller, W. Zhao, X.L. Liu, T.J. Campbell, and A. Haas, "Immersive and interactive exploration of billion-atom systems," Presence-Teleoper. Vir. Environ., 12, 85–95, 2003.
[7] P. Vashishta, R.K. Kalia, J.P. Rino, and I. Ebbsjo, "Interaction potential for SiO2 – a molecular-dynamics study of structural correlations," Phys. Rev. B, 41, 12197–12209, 1990.
[8] T. Campbell, R.K. Kalia, A. Nakano, F. Shimojo, K. Tsuruta, P. Vashishta, and S. Ogata, "Structural correlations and mechanical behavior in nanophase silica glasses," Phys. Rev. Lett., 82, 4018–4021, 1999.
[9] P. Vashishta, R.K. Kalia, and I. Ebbsjo, "Low-energy floppy modes in high-temperature ceramics," Phys. Rev. Lett., 75, 858–861, 1995.
[10] A. Nakano, R.K. Kalia, and P. Vashishta, "Dynamics and morphology of brittle cracks – a molecular-dynamics study of silicon nitride," Phys. Rev. Lett., 75, 3138–3141, 1995.
[11] P. Walsh, R.K. Kalia, A. Nakano, P. Vashishta, and S. Saini, "Amorphization and anisotropic fracture dynamics during nanoindentation of silicon nitride: a multimillion atom molecular dynamics study," Appl. Phys. Lett., 77, 4332–4334, 2000.
[12] P. Walsh, W. Li, R.K. Kalia, A. Nakano, P. Vashishta, and S. Saini, "Structural transformation, amorphization, and fracture in nanowires: a multimillion-atom molecular dynamics study," Appl. Phys. Lett., 78, 3328–3330, 2001.
[13] A. Chatterjee, R.K. Kalia, A. Nakano, A. Omeltchenko, K. Tsuruta, P. Vashishta, C.K. Loong, M. Winterer, and S. Klein, "Sintering, structure, and mechanical properties of nanophase SiC: a molecular-dynamics and neutron scattering study," Appl. Phys. Lett., 77, 1132–1134, 2000.
[14] F. Shimojo, I. Ebbsjo, R.K. Kalia, A. Nakano, J.P. Rino, and P. Vashishta, "Molecular dynamics simulation of structural transformation in silicon carbide under pressure," Phys. Rev. Lett., 84, 3338–3341, 2000.
[15] I. Szlufarska, R.K. Kalia, A. Nakano, and P. Vashishta, "Nanoindentation-induced amorphization in silicon carbide," Appl. Phys. Lett., 85, 378–380, 2004.
[16] J.P. Rino, I. Ebbsjo, P.S. Branicio, R.K. Kalia, A. Nakano, and P. Vashishta, "Short- and intermediate-range structural correlations in amorphous silicon carbide (a-SiC): a molecular dynamics study," Phys. Rev. B, 70, 045207, 2004.
[17] X.T. Su, R.K. Kalia, A. Nakano, P. Vashishta, and A. Madhukar, "Critical lateral size for stress domain formation in InAs/GaAs square nanomesas: a multimillion-atom molecular dynamics study," Appl. Phys. Lett., 79, 4577–4579, 2001.
[18] X.T. Su, R.K. Kalia, A. Nakano, P. Vashishta, and A. Madhukar, "Million-atom molecular dynamics simulation of flat InAs overlayers with self-limiting thickness on GaAs square nanomesas," Appl. Phys. Lett., 78, 3717–3719, 2001.
[19] P.S. Branicio, R.K. Kalia, A. Nakano, J.P. Rino, F. Shimojo, and P. Vashishta, "Structural, mechanical, and vibrational properties of Ga1−xInxAs alloys: a molecular dynamics study," Appl. Phys. Lett., 82, 1057–1059, 2003.
[20] P.S. Branicio, J.P. Rino, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, "Molecular dynamics study of structural, mechanical, and vibrational properties of crystalline and amorphous Ga1−xInxAs alloys," J. Appl. Phys., 94, 3840–3848, 2003.
[21] A. Nakano, M.E. Bachlechner, P. Branicio, T.J. Campbell, I. Ebbsjo, R.K. Kalia, A. Madhukar, S. Ogata, A. Omeltchenko, J.P. Rino, F. Shimojo, P. Walsh, and P. Vashishta, "Large-scale atomistic modeling of nanoelectronic structures," IEEE T. Electron Dev., 47, 1804–1810, 2000.
[22] A. Nakano, R.K. Kalia, and P. Vashishta, "First sharp diffraction peak and intermediate-range order in amorphous silica – finite-size effects in molecular-dynamics simulations," J. Non-Crystall. Sol., 171, 157–163, 1994.
[23] L. Greengard and V. Rokhlin, "A fast algorithm for particle simulations," J. Comput. Phys., 73, 325, 1987.
[24] A. Nakano, R.K. Kalia, and P. Vashishta, "Multiresolution molecular-dynamics algorithm for realistic materials modeling on parallel computers," Comput. Phys. Commun., 83, 197–214, 1994.
[25] S. Ogata, T.J. Campbell, R.K. Kalia, A. Nakano, P. Vashishta, and S. Vemparala, "Scalable and portable implementation of the fast multipole method on parallel computers," Comput. Phys. Commun., 153, 445–461, 2003.
[26] T. Darden, D. York, and L. Pedersen, "Particle mesh Ewald: an Nlog(N) method for Ewald sums in large systems," J. Chem. Phys., 98, 10089, 1993.
[27] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein, "Explicit reversible integrators for extended systems dynamics," J. Chem. Phys., 101, 4177, 1994.
[28] A. Nakano, "Fuzzy clustering approach to hierarchical molecular-dynamics simulation of multiscale materials phenomena," Comput. Phys. Commun., 105, 139, 1997.
[29] A. Nakano and T.J. Campbell, "An adaptive curvilinear-coordinate approach to dynamic load balancing of parallel multiresolution molecular dynamics," Parallel Comput., 23, 1461, 1997.
[30] A. Nakano, "Multiresolution load balancing in curved space: the wavelet representation," Concurrency: Pract. Exper., 11, 343, 1999.
[31] A.K. Rappe and W.A. Goddard, "Charge equilibration for molecular-dynamics simulations," J. Phys. Chem., 95, 3358–3363, 1991.
[32] F.H. Streitz and J.W. Mintmire, "Electrostatic potentials for metal-oxide surfaces and interfaces," Phys. Rev. B, 50, 11996, 1994.
[33] A.C.T. van Duin, S. Dasgupta, F. Lorant, and W.A. Goddard, "ReaxFF: a reactive force field for hydrocarbons," J. Phys. Chem. A, 105, 9396–9409, 2001.
[34] A. Nakano, "Parallel multilevel preconditioned conjugate-gradient approach to variable-charge molecular dynamics," Comput. Phys. Commun., 104, 59, 1997.
[35] T. Campbell, R.K. Kalia, A. Nakano, P. Vashishta, S. Ogata, and S. Rodgers, "Dynamics of oxidation of aluminum nanoclusters using variable charge molecular-dynamics simulations on parallel computers," Phys. Rev. Lett., 82, 4866–4869, 1999.
[36] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, "Concurrent coupling of length scales: methodology and application," Phys. Rev. B, 60, 2391–2403, 1999.
[37] S. Ogata, E. Lidorikis, F. Shimojo, A. Nakano, P. Vashishta, and R.K. Kalia, "Hybrid finite-element/molecular-dynamics/electronic-density-functional approach to materials simulations on parallel computers," Comput. Phys. Commun., 138, 143–154, 2001.
[38] E. Lidorikis, M.E. Bachlechner, R.K. Kalia, A. Nakano, P. Vashishta, and G.Z. Voyiadjis, "Coupling length scales for multiscale atomistics-continuum simulations: atomistically induced stress distributions in Si/Si3 N4 nanopixels," Phys. Rev. Lett., 87, 086104, 2001.
[39] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, 864, 1964.
[40] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, 1133, 1965.
[41] W. Kohn and P. Vashishta, "General density functional theory," In: N.H. March and S. Lundquist (eds.), Inhomogeneous Electron Gas, Plenum, 79, 1983.
[42] N. Troullier and J.L. Martins, "Efficient pseudopotentials for plane-wave calculations. 2. Operators for fast iterative diagonalization," Phys. Rev. B, 43, 8861–8869, 1991.
Multimillion atom molecular-dynamics simulations
927
[43] S. Ogata, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Environmental effects of H2 O on fracture initiation in silicon: a hybrid electronic-densityfunctional/molecular-dynamics study,” J. Appl. Phys., 95, 5316–5323, 2004. ¨ ut, I. Vasiliev, and A. Stathopoulos, “Electronic [44] J.R. Chelikowsky, Y. Saad, S. Og¨ structure methods for predicting the properties of materials: grids in space,” Phys. Stat. Sol. (b), 217, 173, 2000. [45] S. Ogata, F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Hybrid quantum mechanical/molecular dynamics simulation on parallel computers: density functional theory on real-space multigrids,” Comput. Phys. Commun., 149, 30–38, 2002. [46] F. Shimojo, R.K. Kalia, A. Nakano, and P. Vashishta, “Linear-scaling densityfunctional-theory calculations of electronic structure based on real-space grids: design, analysis, and scalability test of parallel algorithms,” Comput. Phys. Commun., 140, 303–314,2001. [47] J.-L. Fattebert and J. Bernholc, “Towards grid-based O(N) density-functional theory methods: optimized nonorthogonal orbitals and multigrid acceleration,” Phys. Rev. B, 62, 1713, 2000. [48] S. Dapprich, I. Kom´aromi, K.S. Byun, K. Morokuma, and M.J. Frisch, “A new ONIOM implementation in Gaussian 98. I. The calculation of energies, gradients, vibrational frequencies, and electric field derivatives,” J. Mol. Struct. (Theochem.), 461–462, 1, 1999. [49] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure., Morgan Kaufmann, San Francisco, 2003. [50] H. Kikuchi, R.K. Kalia, A. Nakano, P. Vashishta, H. Iyetomi, S. Ogata, T. Kouno, F. Shimojo, K. Tsuruta, and S. Saini, “Collaborative simulation Grid: multiscale quantum-mechanical/classical atomistic simulations on distributed PC clusters in the US and Japan,” Proc. Supercomputing ’02, IEEE, 2002. [51] A. Omeltchenko, T.J. Campbell, R.K. Kalia, X.L. Liu, A. Nakano, and P. Vashishta, “Scalable I/O of large-scale molecular dynamics simulations: a data-compression algorithm,” Comput. Phys. Commun., 131, 78–85, 2000. [52] A. Sharma, R.K. Kalia, A. Nakano, and P. Vashishta, “Large multidimensional data visualization for materials science,” Comput. Sci. Engrg., 5(2), 26–33, 2003. [53] J.P. Rino, I. Ebbsjo, R.K. Kalia, A. Nakano, and P. Vashishta, “Structure of Rings in Vitreous SiO2 ,” Phys. Rev. B, 47, 3053–3062, 1993. [54] D.J. Jacobs and M.F. Thorpe, “Generic rigidity percolation – the pebble game,” Phys. Rev. Lett., 75, 4051–4054, 1995. [55] S. Kodiyalam, R.K. Kalia, H. Kikuchi, A. Nakano, F. Shimojo, and P. Vashishta, “Grain boundaries in gallium arsenide nanocrystals under pressure: a parallel molecular-dynamics study,” Phys. Rev. Lett., 86, 55–58, 2001. [56] A. Nakano, R.K. Kalia, and P. Vashishta, “Scalable molecular-dynamics, visualization, and data-management algorithms for materials simulations,” Comput. Sci. Engrg., 1, 39–47, 1999. [57] K. Tsuruta, A. Omeltchenko, R.K. Kalia, and P. Vashishta, “Early stages of sintering of silicon nitride nanoclusters: a molecular-dynamics study on parallel machines,” Europhys. Lett., 33, 441–446, 1996. [58] H. Gleiter, “Materials with ultrafine microstructures: retrospectives and perspectives,” Nanostruct. Mater., l, 1, 1992. [59] R.W. Siegel, “Creating nanophase materials,” Sci. Amer., December, 74, 1996. [60] R.K. Kalia, A. Nakano, K. Tsuruta, and P. Vashishta, “Morphology of pores and interfaces and mechanical behavior of nanocluster-assembled silicon nitride ceramic,” Phys. Rev. Lett., 78, 689–692, 1997.
928
P. Vashishta et al.
[61] R.K. Kalia, A. Nakano, A. Omeltchenko, K. Tsuruta, and P. Vashishta, “Role of ultrafine microstructures in dynamic fracture in nanophase silicon nitride,” Phys. Rev. Lett., 78, 2144–2147, 1997. [62] K. Tsuruta, A. Nakano, R.K. Kalia, and P. Vashishta, “Dynamics of consolidation and crack growth in nanocluster-assembled amorphous silicon nitride,” J. Amer. Ceram. Soc., 81, 433–436, 1998. [63] R.F. Pettifer, R. Dupree, I. Farnan, and U. Sternberg, “NMR determinations of Si– O–Si bond angle distributions in silica,” J. Non-Crystall. Sol., 106, 408–412, 1988. [64] E. Celarie, S. Prades, D. Bonamy, L. Ferrero, E. Bouchaud, C. Guillot, and C. Marliere, “Glass breaks like metal, but at the nanometer scale,” Phys. Rev. Lett., 90, 075504, 2003. [65] L.V. Brutzel, C.L. Rountree, R.K. Kalia, A. Nakano, and P. Vashishta, MRS Proc., 703, 3.9.1–3.9.6, 2001. [66] P. Daguier, B. Nghiem, E. Bouchaud, and F. Creuzet, “Pinning and depinning of crack fronts in heterogeneous materials,” Phys. Rev. Lett., 78, 1062–1065, 1997. [67] S.M. Hu, “Stress-related problems in silicon technology,” J. Appl. Phys., 70, R53– R80, 1991. [68] S.C. Jain, H.E. Maes, K. Pinardi, and I. DeWolf, Appl. Phys. Rev., 79, 8145, 1996. [69] F.H. Stillinger and T.A. Weber, “Computer-simulation of local order in condensed phases of silicon,” Phys. Rev. B, 31, 5262–5271, 1985. [70] A. Omeltchenko, M.E. Bachlechner, A. Nakano, R.K. Kalia, P. Vashishta, I. Ebbsj¨o, A. Madhukar, and P. Messina, “Stress domains in Si (lll)/Si3 N4 (0001) nanopixel – 10 million-atom molecular dynamics simulations on parallel computers,” Phys. Rev. Lett., 84, 318, 2000. [71] M.E. Bachlechner, A. Omeltchenko, A. Nakano, R.K. Kalia, P. Vashishta, I. Ebbsj¨o, and A. Madhukar, “Dislocation emission at the silicon/silicon nitride interface: a million-atom molecular dynamics simulation on parallel computers,” Phys. Rev. Lett., 84, 322–325, 2000. [72] G.L. Zhao and M.E. Bachlechner, “Electronic structure and charge transfer in alphaand beta-Si3 N4 and at the Si (lll)/Si3 N4 (001) interface,” Phys. Rev. B, 58, 1887–1895, 1998. [73] P. Vashishta, R.K. Kalia, and A. Nakano, “Large-scale atomistic simulations of dynamic fracture,” Comput. Sci. Engrg., 1(5), 56–65, 1999. [74] X.G. Peng, L. Manna, W.D. Yang, J. Wickham, E. Scher, A. Kadavanich, and A.P. Alivisatos, “Shape control of CdSe nanocrystals,” Nature, 404, 59–61, 2000. [75] R.A. McMillan, C.D. Paavola, J. Howard, S.L. Chan, N.J. Zaluzec, and J.D. Trent, “Ordered nanoparticle arrays formed on engineered chaperonin protein templates,” Nat. Mater., 1, 247–252, 2002. [76] M.C. Schlamp, X.G. Peng, and A.P. Alivisatos, “Improved efficiencies in light emitting diodes made with CdSe(CdS) core/shell type nanocrystals and a semiconducting polymer,” J. Appl. Phys., 82, 5837–5842, 1997.
2.26 MODELING LIPID MEMBRANES

Christophe Chipot,1 Michael L. Klein,2 and Mounir Tarek1
1 Equipe de dynamique des assemblages membranaires, Unité mixte de recherche CNRS/UHP 7565, Institut nancéien de chimie moléculaire, Université Henri Poincaré, BP 239, 54506 Vandœuvre-lès-Nancy cedex, France
2 Center for Molecular Modeling, Chemistry Department, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104-6323, USA
1. Introduction
Membranes consist of an assembly of a wide variety of lipids [1], proteins and carbohydrates that self-organize to assume a host of biological functions in the cell machinery, like the passive and active transport of matter, the capture and storage of energy, the control of the ionic balance, or intercellular recognition and signalling. In essence, membranes act as walls that delimit the interior of the cell from the outside environment, preventing the free translocation of small molecules from one side to the other. At an atomic level, knowledge of both the structure and the dynamics of membranes remains to a large extent fragmentary, on account of the remarkable fluidity of these systems under physiological conditions. As a result, the amount of experimental information that can be interpreted directly in terms of positions and motions is still rather limited. A method that could provide the atomic detail of lipid bilayers that is often inaccessible to conventional experimental techniques would, therefore, be extremely valuable for improving our understanding of how membranes function. It would further constitute a bridge between observations at the macroscopic and the microscopic levels, and possibly reconcile the two views. Atomic simulations [2], in general, and molecular dynamics (MD) simulations, in particular, have proven to be an effective approach for investigating lipid aggregates, providing new insights into both the structure and the dynamics of these systems.
Basic structural characteristics of the membrane are determined by the nature of the lipids, and by how the latter self-organize into complex three-dimensional arrangements, exposing their polar head groups to the aqueous environment while protecting the aliphatic domain to form the hydrophobic core. Atomic simulations have developed over the past two decades to such an extent that it is now possible to model these structural features with the desired accuracy. Statistical simulations rely on models that have undeniably improved over the years, getting inexorably closer to the chemical, physical and biological reality of the systems investigated. Yet, they remain models, subject to a host of underlying approximations. It is, therefore, necessary to confront the results of numerical simulations as systematically as possible with the available experimental data. Only when the models have proven to have reached the appropriate robustness and reliability can they serve as an explanatory, possibly predictive tool, capable of (i) rationalizing experimental findings, (ii) providing additional insights into experimentally observed phenomena, and (iii) suggesting new experiments. In the particular case of water–lipid assemblies, there is a considerable wealth of experimental information that can potentially be used to support or contradict in silico studies, albeit an immediate confrontation often turns out to be rather cumbersome. Modeling biological membranes raises a number of difficulties that have not yet found a satisfactory solution. A lipid bilayer is, in essence, a disordered liquid crystal of virtually infinite extent. Truncation of this system into a finite-size patch, to comply with the current limitations of molecular simulations, de facto rubs out significant ranges of the wavelength spectrum that correspond, for instance, to bending and splay motions. Current limitations in the available computational resources not only impose restrictions on the size of the system, but also on the time-scales explored. In silico experiments, like MD simulations, nonetheless represent a powerful tool able to offer new insights into the structural and dynamical properties of lipid bilayers. This chapter is aimed at introducing this field to non-specialists, yet providing the necessary guidance for setting up and understanding statistical simulations of lipid–water assemblies, together with key references for further reading. Up-to-date comprehensive reviews on modeling membranes can be found elsewhere [3–10]. After outlining the properties that govern self-organization, and the type of structural information accessible from experiment, the methodologies utilized to model these systems are described. Next, examples of atomic simulations of lipid bilayers are presented, emphasizing how the results can be compared to experiment. Last, selected simulations of more complex membrane assemblies are described and discussed critically, with a glimpse into the future of this very promising research area.
2. Lipid–Water Assemblies

2.1. What Are the Factors That Determine the Morphology?
By and large, lipids and surfactants are amphipathic chemical species formed, roughly speaking, by a hydrophilic head group and a hydrophobic, alkyl tail. Depending on the chemical species, this non-polar tail may consist of one or two aliphatic chains, either saturated or unsaturated. In the case of phospholipids, the head group usually consists of a phosphate group bonded to a variety of functional moieties, like a choline, an ethanolamine, a serine, or a glycerol fragment. Depending upon the type of fragment, the lipid is either charged – e.g., dimyristoylphosphatidylglycerol (DMPG) – or neutral – e.g., dimyristoylphosphatidylcholine (DMPC). At the so-called sn–3 position, the phosphate group is attached to a glycerol hydroxyl moiety, the two remaining hydroxyl moieties being connected to aliphatic chains by means of ester linkages at positions sn–1 and sn–2. At low concentrations, lipids or surfactants in an aqueous medium usually remain in a monomeric state. Beyond the critical micelle concentration (CMC), they self-assemble into a wide variety of unique three-dimensional structures that encompass micelles, inverse micelles, bilayers, hexagonal tubular phases and more complicated bicontinuous labyrinths (see Fig. 1). The nature of the lipid determines the morphology of the three-dimensional arrangement [11, 12]. In water, lipids aggregate in such a fashion that the polar head group is hydrated adequately, while the alkyl chains are protected from exposure to the aqueous environment. As a consequence, the cross-sections of both the head group and the chains dictate the morphology of the resulting lipid–water assembly. For instance, lipids featuring a large head group and a single alkyl chain usually form direct micelles, whereas lipids characterized by a smaller head group and possibly two alkyl chains tend to self-organize into inverse micelles [1]. For lipids forming planar bilayer assemblies, the net charge borne by the head group plays a noteworthy role in the self-organization process. Small, charged head groups show an interesting tendency to associate by means of intermolecular hydrogen bonds, resulting in compact structures with a small surface area per lipid – e.g., dilaureylphosphatidylethanolamine (DLPE) [13]. In larger, zwitterionic lipids, like phosphatidylcholine, the head groups are organized in inter- and intramolecular charge pairs between the oppositely charged choline and phosphate groups [14]. Aside from the nature of the lipid, external conditions, like the concentration, the temperature, the pressure or the ionic strength of the solvent, strongly influence self-organization into a particular structure. Extensive variables, for instance, can be used to control the transition between phases. At low
Figure 1. Polymorphism of lipid–water assemblies. (a) The cross-sectional area of the head group is larger than that of the alkyl tail; in an aqueous environment, this species forms direct micelles, which further organize into hexagonal HI phases. (b) The cross-sectional area of the head group is smaller than that of the alkyl tail; the lipids form inverted micelles in water, which may aggregate into hexagonal HII phases. (c) The cross-sectional areas of the head group and the tail are comparable; the lipids assemble into planar bilayers, in the gel, Lβ, phase (left) or in the liquid crystal, Lα, phase (right).
temperatures, lipid bilayers remain in the gel, Lβ, phase, wherein the alkyl chains, mostly in an all-trans conformation, are well ordered and exhibit a reduced mobility. At higher temperatures, the gel phase transforms into a liquid crystal, Lα, phase, characterized by an increase of the surface area per lipid and a decrease of the thickness of the bilayer, as a direct consequence of the “melting” of the participating alkyl chains. The transition temperature depends on the chemical nature of the lipid. For instance, it increases with the length of the alkyl chains, but decreases with the number of unsaturations. Most cell membranes in vivo exist in the fluid, liquid crystal phase, barring a few cases – e.g., the specialized stratum corneum membrane [15]. It is, therefore, not too surprising that, with the exception of a handful of simulations of lipid bilayers in the gel phase, most investigations have focused on the so-called, biologically relevant Lα phase.
2.2. Available Experimental Information
To date, neutron and x-ray diffraction experiments probably remain the most powerful tools for determining the structure of lipid bilayers at atomic resolution [16–19]. Particularly pertinent information supplied by diffraction experiments are the density distributions [20], which can be deconvoluted in terms of atomic positions in the direction normal to the water–membrane interface, for different types of atoms. High-resolution x-ray diffraction experiments may offer additional, valuable information that can directly serve as a reference for computational studies. Such is the case of the surface area per lipid, which may be derived from gravimetric x-ray methods or from electron density profiles. It should be underlined, however, that the highly disordered nature of liquid crystal, Lα, phases, and their fluctuations, makes the observation of such systems particularly difficult, and explains the large uncertainty in the values supplied by the literature [20].

Whereas x-ray and neutron diffraction on multi-layered samples have historically been a source of high-resolution structural information on model membranes, neutron reflectivity has provided unique data on single lipid bilayers in contact with bulk water. Scattering length density (SLD) profiles along the normal to a layered system are deduced from the information collected as a function of the scattering wave-vector transfer (Q). Recently, it has been shown that it is also possible to invert the reflectivity spectra directly to obtain SLD profiles [21]. It is important to note, however, that only the total SLD profile is determined. For more complex systems, atomistic modeling can provide valuable insight into such structures, thereby complementing the experimental studies [22–24].

Nuclear magnetic resonance (NMR) techniques are also used extensively to probe the molecular organization of lipid membranes. Earlier on, 2H NMR experiments on oriented lipid matrices supplied lipid order parameters, against which the average orientational order along the acyl chains calculated from simulations could be confronted. Today, thanks to the introduction of magic angle spinning (MAS) techniques, a very large number of parameters from lipid bilayers are available, providing a wealth of information on the conformation of all lipid segments [25].

X-ray and neutron scattering measurements, as well as NMR experiments, may also be used as a source of comparison for dynamical properties derived from MD simulations. As will be seen in what follows, the significant computational effort involved in atomic simulations of large lipid–water assemblies limits, from a biological perspective, their length to relatively short times. Short time-scale dynamics is nevertheless amenable to MD, and the data determined by this approach can be confronted directly with scattering experiments [4, 26], and, for instance, with nuclear Overhauser enhancement spectroscopy (NOESY) cross-relaxation rates [27], which probe motions occurring over comparable time-scales.
3. Modeling Lipid Bilayers
In order to eliminate edge effects and mimic a macroscopic system, simulations of lipid bilayers consist of considering a small patch of lipid and water molecules confined in a central simulation cell, and replicating the latter using periodic boundary conditions (PBCs) in the three directions of Cartesian space, as is done in simulations of molecular liquids and crystals. In doing so, the simulated system corresponds to a small fragment of either a multi-lamellar liposome or a multi-lamellar oriented lipid stack, similar to those deposited on a substrate (see Fig. 2). The size of the simulated sample results in artefactual, symmetry-induced effects and in the impossibility of witnessing collective phenomena, like bending or splay motions, that occur over length-scales above the size of the cell [20, 28, 29]. If needed, one may render a more biologically or physically meaningful picture, consistent with experimentally observed phenomena, by incorporating a large number of lipid and water molecules [30]. Even then, the length of the simulation constitutes another critical aspect in the modeling of lipid–water assemblies, essentially because a number of motions in lipid bilayers occur over time-scales exceeding 10 ns (see Fig. 2).
3.1. Choice of the Thermodynamic Ensemble
Figure 2. Left: small patch of lipid bilayer replicated by PBCs. Right: characteristic time-scales in lipid bilayers. Overall, motions occur on time-scales that range from a few ps, for the separation of the sn–1 and sn–2 alkyl chains, to a few hours for the so-called flip–flop, wherein a lipid unit migrates from one leaflet to the other.

From a technical perspective, the simplest thermodynamic ensemble for simulating lipid–water assemblies is undeniably the microcanonical, (N, V, E),
ensemble, or possibly the canonical, (N, V, T), ensemble, wherein the temperature is controlled rigorously by means of a thermostat. In this event, the modeler may choose to fix the cross-sectional area per lipid unit to its experimental value and leave an appropriate head space of air in contact with the water lamellae, above and below the membrane. Whereas this protocol is adequate in the case of a simple, homogeneous lipid bilayer, one may legitimately wonder how it will perform when additives – e.g., small solutes to large proteins – are introduced into the membrane or in its vicinity. A better adapted thermodynamic ensemble should then be employed to allow the participating lipid chains to relax in response to the modification of the surface tension imposed by the additive. A very tempting solution consists in turning to the isobaric–isothermal, (N, P, T), ensemble, which makes use of rigorous barostats and thermostats to maintain, respectively, the pressure and the temperature at the desired values. This raises, however, difficulties of its own. In a mixture of oil and water with a positive surface tension, the free energy increases monotonically with the surface area, as the system minimizes the contact area between the two liquids. In the case of lipids interacting with water – viz. typically a hydrated lipid bilayer – the picture is somewhat more intricate. Just like for a mixture of oil and water, by virtue of the hydrophobic effect, the free energy increases with the surface area. This is evidently not the sole contribution governing the behavior of lipid bilayers; otherwise, their surface area would be minimized regardless of the temperature, thereby forcing the system into the gel, Lβ, phase. Small surface areas, indeed, restrain the alkyl chains in an ordered state, consequently decreasing the entropy of the lipid–water assembly. As a result, the free energy no longer increases with the surface area, but, on the contrary, exhibits a minimum that corresponds to an optimum surface area for a given temperature. This also implies that the surface tension, γ, should be strictly zero, and, therefore, that the lateral pressure, P∥(z), be strictly equal to the pressure normal to the water–lipid interface, P⊥:

$$\gamma = \int \left[ P_\perp - P_\parallel(z) \right] \mathrm{d}z = 0. \qquad (1)$$
This important result, which is expected for a self-organized system, prompted a host of authors to simulate lipid bilayers in the isotropic isobaric–isothermal ensemble, (N, P, T) [31]. Whereas, strictly speaking, Eq. (1) is true for a lipid–water assembly of virtually infinite extent, it should be kept in mind that, in atomic simulations, one models patches of finite size. Feller and Pastor put forward that a finite surface tension should be introduced to compensate for such finite-size effects, which eliminate the possibility of observing collective phenomena like undulations over significant length-scales [32, 33], as in ripple, Pβ, phases, for instance. Tieleman and Berendsen argued that, in the systems they investigated, the dependence of the surface tension on the surface area was marginal [34]. Lindahl and Edholm later showed that an applied
surface tension on the order of 10 mN/m would correct for the large fluctuations in the surface area per lipid unit that are witnessed in simulations of lipid–water assemblies of limited size [35]. One thing is certain: in atomic simulations of lipid bilayers, P∥ and P⊥ are anticipated to vary differently, on account of the anisotropy of the environment. It is, therefore, strongly recommended to adopt an algorithm that generates the (N, P, T) distribution, so that the dimensions of the simulation cell are rescaled independently in the x, y (in-plane) and in the z-directions [2, 36, 37].
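In practice, Eq. (1) also provides a convenient diagnostic for monitoring a bilayer simulation. The following minimal sketch — our illustration, not part of the original text — estimates the surface tension from the time-averaged diagonal components of the pressure tensor, assuming an orthorhombic cell with the membrane normal along z, pressures supplied in bar, and the box height in Å; the factor 1/2 accounts for the two water–membrane interfaces present in the cell.

    import numpy as np

    def surface_tension(Pxx, Pyy, Pzz, Lz):
        """gamma = (Lz/2) <Pzz - (Pxx + Pyy)/2>, returned in mN/m."""
        p_lat = 0.5 * (np.asarray(Pxx) + np.asarray(Pyy))    # lateral pressure, bar
        gamma = 0.5 * Lz * np.mean(np.asarray(Pzz) - p_lat)  # in bar.A
        return gamma * 1.0e-2                                # 1 bar.A = 1e-2 mN/m

A tension compatible with zero, within statistical error, signals that the chosen area per lipid is close to its optimum for the force field at hand.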
3.2. The Potential Energy Function
In atomic statistical simulations of membranes, all atoms pertaining to the system are treated classically as point masses, which, in the harmonic approximation, are connected to each other by means of springs. In some instances, to reduce the computational effort, certain groups of atoms, like methylene, –CH2–, or methyl, –CH3, moieties, are represented as a single, “united” atom of appropriate van der Waals radius and well depth [38]. Seminal simulations of lipid–water assemblies made use of the available multi-purpose force fields, often aimed at the modeling of solvated proteins and nucleic acids. It is, therefore, not too surprising that, in early investigations, the agreement with experiment was either far from optimal, or clearly too good not to suspect a fortuitous cancellation of errors due to the conjunction of inadequate parameters and excessively short runs. In the following years, it was realized that a specific potential energy function should be employed to mimic accurately the properties of lipids, like the subtle trans-gauche equilibrium in the alkyl chains. A wealth of effort was, and still is, invested to improve the representation of lipids and surfactants by means of an appropriate parameterization of the force-field contributions likely to affect the structural and dynamical features of these systems [39–43]. In some of these force fields, to obtain a better description of the ordering in the fatty aliphatic chains, which can be ascribed to trans-gauche defects, the standard low-order Fourier series often used in conventional macromolecular force fields was replaced by the more sophisticated Ryckaert–Bellemans torsional potential [44] (see the sketch at the end of this section). In addition, correct packing of the alkyl chains depends to a large extent on the quality of the van der Waals parameters utilized. One of the underlying assumptions made in the design of force fields is the transferability of these parameters between molecules – e.g., the van der Waals radius and well depth of an aliphatic sp3 carbon should be the same regardless of the chemical environment. The interaction parameters of the united methylene and methyl groups were originally derived from statistical simulations of short hydrocarbons like n-butane, as is the case of the OPLS force field [45]. The
transferability hypothesis has proven to be inadequate when handling long alkyl chains, prompting a number of authors to reoptimize the van der Waals interactions based on simulations of larger hydrocarbons like pentadecane [31]. Determination of net atomic charges for lipids and surfactants from sophisticated quantum mechanical calculations may turn out to be a difficult task, on account of the size of the molecules. Unquestionably, partial charges derived from the electrostatic potential constitute the most satisfactory solution among the arsenal of approaches available to the modeler [46]. Yet, as has been demonstrated, point charges are inherently conformation-dependent [47], thus making the derivation of a unique set of charges representative of all possible conformations questionable. To circumvent the difficulties connected to the size of the molecules, it has been proposed to derive the net atomic charges for independent fragments that are ultimately pieced together. This scheme, although tempting, should be considered with great care if local charges are delocalized over large spatial extents.
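To make the Ryckaert–Bellemans torsional potential mentioned above concrete, the short sketch below evaluates V(ψ) = Σ_{n=0}^{5} C_n cos^n ψ, where ψ is the dihedral angle measured from the trans conformation (ψ = 0). The coefficients quoted here are the classic n-alkane values of Ryckaert and Bellemans, for illustration only; any given lipid force field supplies its own set.

    import numpy as np

    # Classic n-alkane coefficients C0..C5 of Ryckaert and Bellemans, in kJ/mol;
    # a specific lipid force field would substitute its own parameterization.
    RB_COEFFS = (9.28, 12.16, -13.12, -3.06, 26.24, -31.5)

    def ryckaert_bellemans(psi_deg, coeffs=RB_COEFFS):
        """Torsional energy (kJ/mol) for a dihedral angle psi in degrees (trans = 0)."""
        c = np.cos(np.radians(psi_deg))
        return sum(cn * c**n for n, cn in enumerate(coeffs))

With these coefficients, the trans minimum (ψ = 0) lies at 0 kJ/mol and the gauche minima near ±120° at about 2.9 kJ/mol, reproducing the subtle trans-gauche equilibrium of alkyl chains referred to in the text.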
3.3. Intermolecular Interactions
As has been mentioned previously, physically and biologically realistic simulations should involve a sufficiently large number of lipid and water molecules to minimize finite-size effects. Much of the computational effort involved in atomic simulations of lipid–water assemblies lies in the evaluation of pairwise interactions, the number of which increases dramatically with the number of particles in the system. Based upon the assumption that intermolecular interactions decay with the distance, earlier studies have employed a cut-off sphere, beyond which the interactions are truncated. This approximation is expected to be adequate for the short-range, van der Waals contribution. The use of a crude, finite spherical cut-off for truncating the short-range van der Waals interactions may, however, modulate the forces responsible for the cohesion of lipid–water assemblies. Accurate use of a cut-off requires taking into account the appropriate long-range corrections for both the energy and the pressure [48], based on the classical formulae utilized for Lennard–Jones fluids [49] – a minimal sketch of these corrections is given at the end of this section. For Coulomb interactions, the range of which varies as 1/r^n, where n ≤ 3 [2, 49], truncation becomes particularly arguable. In this event, the long-range character of the participating charge–charge (n = 1) and charge–dipole (n = 2) interactions makes the use of a spherical truncation unsuitable. Probably the most accurate approach for handling the long-range nature of electrostatic interactions in spatially replicated simulation cells is solving the Poisson equation. The Ewald approach [50], which decomposes the conditionally convergent Coulombic sum over periodic boxes into two rapidly decaying contributions evaluated in the direct and reciprocal spaces, respectively, is the most widely
used method. Formally, the computational effort involved in this method scales as O(N²), thus making statistical simulations of large ensembles of atoms particularly costly. This effort can be reduced, scaling down the calculation to O(N ln N), by solving the Poisson equation numerically on a grid of points, onto which the positions of the particles are interpolated. Such a scheme constitutes the central idea of algorithms like particle–mesh Ewald (PME) or particle–particle–particle–mesh (P³M) [51]. For completeness, while to our knowledge it has not yet been applied in membrane simulations, it is worth mentioning the fast multipole approach, an alternative to Ewald summation that treats long-range interactions in a rigorous fashion, and scales linearly with N for very large systems – viz. on the order of 100 000 atoms [52]. The substantial computational investment required to attain a physically consistent description of the simulated molecular assembly may be further reduced by taking advantage of recent advances in the MD methodology. Considering that the different degrees of freedom involved in the system relax over distinct time-scales, it is not necessary that the corresponding force contributions be evaluated concurrently. This is, in essence, the basic principle of the so-called multiple time-step methods [53, 54], in which intramolecular, van der Waals and Coulomb forces can be updated with different frequencies [55]. In conjunction with constraint algorithms like SHAKE or RATTLE [56], which virtually eliminate the vibrations due to hard degrees of freedom, it is possible to explore large regions of the phase space at a lesser computational effort, thus making long simulations of large lipid–water assemblies somewhat more affordable – the reader is referred to the chapter of Tuckerman and Martyna dedicated to integrators and ensembles in statistical simulations. Contemporary, academic MD packages, like AMBER [57], CHARMM [58], GROMACS [59] or NAMD [60], have benefited from several recent methodological developments on the algorithmic front, and incorporate more or less all the key features discussed so far that are necessary to investigate lipid–water assemblies rigorously. As has been commented on, obtaining a realistic picture of complex chemical systems like membranes requires handling sufficiently large sets of atoms, thereby rapidly increasing the computational effort in a dramatic fashion. From a modeling perspective, numerical simulations, in order to prove their usefulness, should supply the desired answer within a reasonable computation time – i.e., hopefully faster than experimental data acquisition and analysis could be carried out. Fulfilling this requirement implies taking advantage of modern, parallel architectures, over which the computational load can be distributed. Yet, a number of the most popular MD codes were written several years ago, at the dawn of parallelism, when scalar computers were utilized predominantly. Although methodological and technical improvement of MD programs still constitutes an ongoing process, the best performances in MD simulations can admittedly only be obtained using those codes that were designed specifically for parallel architectures,
often based on a domain decomposition scheme in conjunction with an appropriate load balancing, which spreads the computational effort evenly across the array of available processors. Among these programs, NAMD, for instance, was developed in the spirit of conserving an optimal scalability as the number of processors increases – assuming large enough ensembles of atoms. By and large, MD codes targeted at massively parallel environments have undeniably contributed to making atomic simulations of membranes more affordable, allowing the modeler to deal with up to a few hundred thousand atoms [61]. Aside from the purely computational aspect of membrane modeling, visualization has also proven to play a significant role in these advances, by helping to interpret the raw results of MD simulations. Flexible, user-friendly visualization programs developed in academic environments tend to become an indispensable element in the arsenal of tools at the disposal of the modeler. Today, non-commercial packages, like VMD [62], offer an increasing number of functionalities that can be tailored to the modeler's own aspirations, through object-oriented languages and the possibility of introducing new features by means of plugins.
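As announced above, the classical long-range corrections for a Lennard–Jones fluid truncated at a cut-off radius r_c can be written down in a few lines. The sketch below — an illustration under stated assumptions, not code from the original text — implements the standard Allen–Tildesley-type formulas for the tail contributions to the energy per particle and to the pressure of a homogeneous fluid; for a strongly anisotropic system like a bilayer, they are only an approximation.

    import math

    def lj_tail_corrections(rho, eps, sigma, rc):
        """Tail corrections for a homogeneous Lennard-Jones fluid truncated at rc.
        rho: number density; eps, sigma: LJ parameters; units follow the inputs.
        Returns (u_tail per particle, p_tail)."""
        sr3 = (sigma / rc) ** 3
        sr9 = sr3 ** 3
        u_tail = (8.0 / 3.0) * math.pi * rho * eps * sigma ** 3 * (sr9 / 3.0 - sr3)
        p_tail = (16.0 / 3.0) * math.pi * rho ** 2 * eps * sigma ** 3 * (2.0 * sr9 / 3.0 - sr3)
        return u_tail, p_tail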
4. Atomic Simulations of Lipid Membranes
Traditionally, phospholipids have served as models for investigating in silico the structural and dynamical properties of membranes. From both a theoretical and an experimental perspective, zwitterionic phosphatidylcholine (PC) lipids constitute the best characterized systems. Hydrated DMPC [13, 63] and dipalmitoylphosphatidylcholine (DPPC) [31, 34, 64–67] bilayers have so far probably been the most extensively surveyed lipid membranes. Yet, on account of their intrinsic limitations – viz. the short alkyl chains in DMPC and the temperature of the Lβ to Lα phase transition in DPPC, above physiological conditions – several authors have turned to biologically more relevant lipids like palmitoyloleylphosphatidylcholine (POPC) [68, 69], in particular for examining membrane proteins in a realistic environment, and lipids based on mixtures of saturated/polyunsaturated alkyl chains (SDPC, 18:0/22:6 PC) [43, 70]. A variety of alternative lipids, featuring different, possibly charged, head groups have also been explored – e.g., DLPE [13, 71], dipalmitoylphosphatidylserine (DPPS) [72, 73] or glycerolmonoolein (GMO) [30, 74]. In several cases, however, the modeler is faced with an absence of experimental data to which the results of atomic simulations can be confronted. Bilayers built from PC lipids nonetheless represent remarkable test systems, not only to probe the methodology, but also to gain additional insight into the physical properties of membranes. In this section, the derivation of these properties from MD trajectories, and how a bridge with experiment can be established, will be detailed.
4.1. Bilayer Structure
4.1.1. Density distributions

As can be seen in Fig. 3, the spatial extent encompassed by the head-group region of the DMPC units in a bilayer arrangement is remarkably broad. This is clearly seen in the number density profiles computed from the MD trajectory – an analysis, along the direction normal to the water–membrane interface, of the in-plane densities of lipid and water atoms. A striking feature emerging from these distributions is the penetration of water far into the head-group region. The farthest extent of the water molecules roughly coincides with the ester moieties of the lipids. The width of the interfacial region, on the order of 8–10 Å for a fully hydrated DMPC bilayer, highlights the significant static and dynamic roughness of the membrane surface [74, 75], thereby refining the traditional textbook picture, like that of Figs. 1 and 2. Interestingly enough, phosphate and choline groups lie approximately at the same depth in the bilayer, indicating that the head groups are oriented essentially in the plane of the bilayer. The average orientation of the head-group P–N bond dipoles with respect to the normal of the water–membrane interface lies around 70°, pointing towards the aqueous medium, albeit the orientational distribution is remarkably wide, and depends upon the temperature and, as expected, the potential energy function utilized [14]. In addition, the level of lipid hydration has been shown to play a noteworthy role in the orientation of the head groups [9]. In all circumstances, it is crucial that the slow reorientation of the lipid head groups be considered when interpreting results from short MD trajectories. Estimates from single-molecule anisotropy
Figure 3. Left: snapshot taken from an MD simulation of a fully hydrated DMPC bilayer. Methylene and methyl carbon atoms of the alkyl chains are shown in light grey, the lipid head-group atoms in dark grey, and the water molecules in black and white. Note the protrusion of head groups, which results locally in a rough water–membrane interface. Right: density distributions for selected groups of atoms in a fully hydrated DMPC bilayer examined at 303 K (from Ref. [76]).
imaging of fluorophore-tagged POPC molecules [77] indicate a rotational diffusion coefficient of ca. 0.7 rad²/ns, slightly below estimates from MD simulations [78], suggesting that sampling the whole rotational space of each molecule would necessitate a few tens of nanoseconds. The information provided by the density distributions can be confronted directly with x-ray and neutron diffraction measurements [17] (considering, respectively, the atomic scattering length densities or the electron densities). The MD trajectories can further be used in conjunction with the scattering experiments in order to refine the data by, for instance, including fractional volumes extracted from the simulation [79].
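The number density profiles discussed above amount to little more than a normalized histogram of atomic z-coordinates. The following minimal sketch — our illustration, with hypothetical variable names — assumes a trajectory supplied as an iterable of (N, 3) coordinate arrays, the bilayer normal along z, the bilayer center at z = 0, and a constant in-plane box area.

    import numpy as np

    def density_profile(frames, selection, area, z_max=40.0, n_bins=80):
        """Bin centers (A) and number density (A^-3) for the selected atom indices."""
        edges = np.linspace(-z_max, z_max, n_bins + 1)
        hist = np.zeros(n_bins)
        n_frames = 0
        for coords in frames:
            z = coords[selection, 2]                 # z-coordinates of the group
            hist += np.histogram(z, bins=edges)[0]
            n_frames += 1
        slab_volume = area * (edges[1] - edges[0])   # volume of one slab
        centers = 0.5 * (edges[:-1] + edges[1:])
        return centers, hist / (n_frames * slab_volume)

Repeating the analysis for water, phosphate, choline and chain carbons yields profiles like those of Fig. 3; weighting each atom by its electron count or scattering length instead gives the profiles that can be confronted with diffraction data.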
4.1.2. Lipid tail conformation

The deuterium quadrupole splitting measured by 2H NMR on non-oriented samples of membrane preparations is mainly determined by the average conformation of the phospholipid molecules and, as such, supplies valuable structural information about the system. Order parameters can be derived from the MD trajectory and expressed as a tensor, the elements of which read [80]:

$$S_{\alpha\beta} = \frac{1}{2} \left\langle 3 \cos\varphi_\alpha \cos\varphi_\beta - \delta_{\alpha\beta} \right\rangle \qquad (2)$$

Here, φα is the angle formed by the α-th molecular axis and the normal to the water–bilayer interface, and ⟨· · ·⟩ is an ensemble average over all lipid chains. In most circumstances, based on symmetry relationships, it is assumed that the order parameter, SCD, for an alkyl chain bearing deuterium labels can be expressed as:

$$S_{\mathrm{CD}} = \frac{1}{2} \left\langle 3 \cos^2\theta - 1 \right\rangle \qquad (3)$$

where θ is simply the angle between the C–D chemical bond and the normal to the bilayer. When C–D is uniformly distributed, SCD = 0, and when the chain is all-trans, |SCD| = 0.5. In general, for saturated lipids, |SCD| exhibits a plateau value at ca. 0.2 for the upper chain segments (Fig. 4). Force fields of the new generation reproduce these order parameters quite well, barring small discrepancies for the second carbon atom of the alkyl chains. Further analysis of MD trajectories may be aimed at extracting additional information from the NMR experiments. For instance, one may refine those methods targeted at obtaining such quantities as the average chain length or the surface area per molecule [81]. Another study exemplifies the successful combination of MD simulation with experiment to probe alkyl chain packing in lipid membranes. Such is the case of infrared (IR) data, which have been reinterpreted to estimate the concentrations of gauche–gauche, trans–gauche and trans–trans conformational sequences in a DPPC bilayer [82].
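Equation (3) translates into a one-line average once the C–H (or C–D) bond vectors have been collected. The sketch below is our illustration under stated assumptions: `bond_vectors` is a hypothetical (n_samples, 3) array of bond vectors for one chain carbon, accumulated over lipids and frames, with the bilayer normal along z.

    import numpy as np

    def order_parameter_scd(bond_vectors):
        """S_CD = 0.5 <3 cos^2(theta) - 1>, theta between the C-H bond and the z axis."""
        v = np.asarray(bond_vectors, dtype=float)
        v /= np.linalg.norm(v, axis=1, keepdims=True)   # normalize to unit vectors
        cos_theta = v[:, 2]                             # dot product with (0, 0, 1)
        return 0.5 * np.mean(3.0 * cos_theta ** 2 - 1.0)

Evaluating this quantity carbon by carbon along the sn–1 and sn–2 chains yields the order-parameter profile that is compared with the 2H NMR plateau near |SCD| ≈ 0.2.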
Figure 4. Snapshot taken from an MD simulation of a synthetic channel formed by cyclic peptides of alternating D- and L-chiralities, embedded in a fully hydrated DMPC bilayer. Color coding of the atoms is identical to that in Fig. 3. Note the antiparallel β-sheet-like conformation of the nanotube spanning the membrane. Within a few hundred ps, a single-file chain of water molecules is established in the hollow tubular structure.
4.1.3. Hydration of the head-group region

In atomic simulations, solvation properties are often measured by means of radial distribution or pair correlation functions (RDFs):

$$g_{ij}(r) = \frac{N_j(r;\, r+\delta r)}{4\pi \rho_j\, r^2\, \delta r} \qquad (4)$$

where Nj is the number of particles j at a distance from i between r and r + δr, and ρj is the density of particles j. In essence, this definition is targeted at isotropic fluids and, in principle, should not be applied, as is, to anisotropic lipid–water assemblies [83]. To estimate the coordination number of site i – e.g., of the PC head groups – it seems far more appropriate to merely evaluate Nj(r; r + δr) as a function of the separation r, and determine its value at the first minimum of a qualitative RDF computed using Eq. (4).
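The neighbor-counting strategy advocated above is straightforward to sketch. In the following illustration — ours, with hypothetical array names — `pos_i` and `pos_j` hold the coordinates of the two atom groups for one frame, and `box` the orthorhombic box lengths used for minimum-image wrapping.

    import numpy as np

    def coordination_number(pos_i, pos_j, box, r_cut):
        """Average number of j particles found within r_cut of each i site."""
        counts = 0
        for ri in pos_i:
            d = pos_j - ri
            d -= box * np.round(d / box)        # minimum-image convention
            r = np.linalg.norm(d, axis=1)
            counts += np.count_nonzero(r < r_cut)
        return counts / len(pos_i)

Scanning r_cut and reading the count off at the first minimum of a qualitative gij(r) yields the coordination number without assuming an isotropic normalization.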
4.1.4. Transmembrane electrostatic potentials

Orientation of the water molecules near the head-group region of the lipid bilayer is clearly anisotropic, compared to the bulk aqueous medium. This can be shown by measuring the average cosine of the angle formed by the dipole moment of the water molecules and the normal to the bilayer, as a function
of the distance from its geometrical center. A marked peak emerges at a distance characteristic of the phosphate groups, emphasizing the orienting power exerted by this moiety on the surrounding aqueous environment. The preferential orientation of the dipole moment borne by the water molecules is at the origin of the term “dipole potential,” which has been employed extensively to denote the electrostatic potential across the water–membrane interface [84, 85]. This conspicuous ordering of water molecules was recently evidenced directly using coherent anti-Stokes Raman scattering microscopy [86]. In a number of in silico investigations, the electrostatic potential has been estimated from the knowledge of the charge density. In the spirit of atomic density distributions, charges are accumulated as a function of their position along the direction normal to the water–bilayer interface. The first integral of the charge density yields the electric field, and, in turn, the negative of the integral of the field provides the electrostatic potential. Not too surprisingly, the resulting “dipole potential” inherently depends upon the choice of the potential energy function and should, thus, be interpreted cautiously [4, 31, 63].
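In practice, this double integration amounts to two cumulative sums. The sketch below is our illustration, assuming SI units — z in meters and a charge density profile rho in C/m³ on a uniform grid — with the potential referenced to zero in the bulk water at the first grid point.

    import numpy as np

    EPS0 = 8.854e-12  # vacuum permittivity, F/m

    def transmembrane_potential(z, rho):
        """Electrostatic potential (V) across the bilayer from a charge density profile."""
        dz = z[1] - z[0]
        field = np.cumsum(rho) * dz / EPS0   # E(z) = (1/eps0) int rho dz'
        psi = -np.cumsum(field) * dz         # psi(z) = -int E dz'
        return psi - psi[0]                  # zero reference in bulk water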
4.2. Dynamics
The increasing level of interaction between experimental studies and numerical simulations of lipid bilayers evidenced in the previous section also holds for their dynamics. Feller et al. [27] have used MD simulations to analyze NOESY cross-relaxation rates in lipid bilayers. Magnetic dipole–dipole correlation in such systems occurs over a variety of time-scales and depends upon the probability of close approach for proton–proton interactions. The relaxation rates have been calculated directly from a 10 ns MD simulation of DPPC. Fitting the autocorrelation functions yields characteristic correlation times and weight factors that determine the relative contributions of the individual types of motions. Combining simulations and experiments, relaxation rates may, therefore, be assigned to various motions – viz. less than 1 ps for chemical bond vibrations, 50–100 ps for trans–gauche isomerization, 1–2 ns for molecular rotation and wobble, and beyond 100 ns for lateral diffusion. A model for the dynamics of individual lipid molecules has also been proposed, based on a thorough comparison of simulation data and experimental measurements of the 13C NMR T1 relaxation in DPPC alkyl chains [87]. Employing Brownian dynamics and MD simulations associated with fits of experimental data, it was found that lipid molecules confine themselves into a cylinder on the 100 ps time-scale, and wobble in a cone-like potential on the nanosecond time-scale. A similar model for lipid dynamics has emerged from an MD study aimed at interpreting inelastic neutron scattering (INS) data. One particular aspect
of such experiments, probing the motion of individual hydrogen nuclei – i.e., the self-correlation of a single particle – is that they are space- and time-resolved. In the case of DPPC bilayers, a good agreement between simulations and experiments probing the 100 ps time-scale is attained [88]. The analysis corroborates the fact that the motion of the center of mass and the internal motions of lipid molecules are decoupled. Moreover, the former is well described as a diffusion in a confined space, i.e., a cylinder. A refined picture of the internal dynamics arising from the simulation shows that protons of the alkyl chains move according to a chain-defect model, wherein kinks or chain defects form and disappear randomly – i.e., a stochastic model – along the lipid tail, rather than diffuse along the chain. Collective dynamics of lipid bilayers have also been examined carefully, as simulations over increasingly significant time- and length-scales become feasible. Large systems involving 1,024 lipid molecules studied over 10 ns led to the direct observation of bilayer undulations and thickness fluctuations of mesoscopic nature [35]. Continuum properties, such as the bending modulus, the surface compressibility and the mode relaxation times, were calculated and agreed nicely with experiment. Several processes occurring at different length-scales were identified. The undulatory motions could be separated into two regimes – one involving more than 50 lipids, which can be ascribed to mesoscopic undulations, and the other, involving less than 25 lipids, which is attributed to collective lipid protrusion. Peristaltic modes – i.e., anti-correlated modes between the two leaflets – could also be sorted into two types: bending modes involving 50–400 lipids, and protrusion modes over shorter length-scales. Shorter-wavelength collective dynamics may be probed using coherent inelastic, viz. neutron or x-ray, scattering. Density fluctuations on length-scales comparable to the inter-lipid distance are believed to play a pivotal role in the transport of small molecules across the bilayer. Recently, MD simulations have been used to complement inelastic x-ray data of lipid bilayers, both in the gel, Lβ, and the liquid crystal, Lα, phases [89]. The results support the applicability of generalized hydrodynamics to describe the motion of carbon atoms in the hydrophobic core, thus allowing the modeler to extract key parameters, such as the sound-mode propagation velocity, the thermal diffusivity and the kinematic longitudinal viscosity.
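Among the motions listed above, lateral diffusion is the simplest to quantify from a trajectory: in two dimensions, the in-plane mean-square displacement (MSD) of the lipid centers of mass grows as 4·D_lat·t at long times. The sketch below is our illustration with hypothetical inputs — `com`, an (n_frames, n_lipids, 2) array of unwrapped in-plane centers of mass, and `dt`, the time between frames.

    import numpy as np

    def lateral_diffusion(com, dt):
        """Estimate D_lat from a linear fit of the in-plane MSD (MSD = 4 D t)."""
        n_frames = com.shape[0]
        lags = np.arange(1, n_frames // 2)
        msd = np.array([np.mean(np.sum((com[lag:] - com[:-lag]) ** 2, axis=2))
                        for lag in lags])
        slope = np.polyfit(lags * dt, msd, 1)[0]   # fit MSD(t) = 4 D t + b
        return slope / 4.0

Given the >100 ns time-scale quoted above, trajectories much shorter than that can only bound D_lat rather than converge it.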
4.3. Modeling Transport Phenomena
Models of lipid bilayers have been employed widely to investigate diffusion properties across membranes, through assisted and non-assisted mechanisms. Simple ions, e.g., Na+, K+, Ca2+ or Cl−, have been shown to play a significant role in the cell machinery, in particular at the level of intercellular communication. In order to enter the cell, the ion must first
permeate the lipid bilayer that acts as a rampart towards the cytoplasm. Wilson and Pohorille have investigated the passive transport of Na+ and Cl− ions across a lipid bilayer formed by glycerolmonoolein units, which undergoes severe deformations as the ions translocate across the water–membrane interface. This process is accompanied by thinning defects and the formation of water fingers that ensure an appropriate hydration of the ion as it penetrates the non-polar environment [90]. Ideally, atomic simulations could also serve as a predictive tool for estimating water–membrane partition coefficients of small drugs, in strong connection with the so-called blood–brain barrier – the ultimate step in the de novo design of pharmacologically active molecules. Diffusion of small, organic solutes in lipid bilayers was examined for a variety of molecular species, ranging from benzene [91, 92] to more complex anesthetics [93–95]. Yet, access to partition coefficients by means of statistical simulations implies the determination of the underlying free energy behavior along the direction normal to the interface [96] – a minimal sketch of this route is given at the end of this section. In the specific instance of inhaled anesthetics, an analysis of the variations of the free energy for translocating the solute from the aqueous medium into the interior of the bilayer suggests that potent anesthetics reside preferentially near the water–membrane interface. Contrary to the dogmatic Meyer–Overton hypothesis [97], potency is shown to correlate with the interfacial concentration of the anesthetic, rather than with its lipophilicity alone [98]. The considerable free energy associated with the transfer of ions from the aqueous medium to the interior of the membrane rationalizes the use in cells of specific transmembrane channels, pumps or carriers that facilitate, while selectively controlling, the passage of ionic species across the lipid bilayer [99]. Recent complete reviews of the theoretical developments and simulation capabilities in ion-channel modeling can be found in references [100] and [101]. Here, we briefly describe some of the complex systems examined hitherto. Gramicidin A, a prototypical channel for assisted ion transport, has been the object of thorough analyses from both experimental and theoretical perspectives. Dimerization of individual protein units results in membrane-spanning channels suitable for ion conduction. MD simulations of gramicidin A embedded in hydrated lipid bilayers, e.g., DMPC, were able to reproduce the structural features observed experimentally [102]. Such studies have clearly shown that important questions related to ion selectivity, ion binding, gating and proton transfer mechanisms may be addressed with some confidence. The internal arrangement of water molecules in a single-file chain, characteristic of complex transporters [103], was also witnessed in a somewhat more rudimentary, synthetic channel formed by stacked cyclic peptides of alternating D- and L-chiralities (see Fig. 4) [76]. Such nanotubes have been recognized to modify in a selective fashion the permeability of cell membranes, and are envisioned to act as potent therapeutic agents in response to bacterial resistance [104]. Aquaporins, membrane channels ubiquitous to most
Figure 5. MD simulation of DNA rods interacting with a membrane formed by cationic and neutral lipids, from Ref. [118]. Snapshots along the DNA axis (a) and perpendicular to it (b). Color coding: DNA phosphate moieties are shown in light grey, PC phosphate groups in black, and choline groups in dark grey.
living species, which control the water content of cells, have also focused much attention lately. They are formed of tetramers that organize to facilitate the transport of water, and possibly other small solutes, across the lipid bilayer. The resulting water pores remain, however, impervious to the passage of small ions, to ensure a proper conservation of the electrochemical potential [105]. As a final note, it is worth mentioning that, as expected, the determination of the high-resolution structure of KcsA, a bacterial K+ channel, has motivated a large number of realistic simulations, taking into account the lipidic environment, aimed at deciphering the underlying complex conduction mechanism.
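As announced above, once the free-energy profile G(z) along the membrane normal is in hand, a water–membrane partition coefficient follows from Boltzmann weighting. The sketch below is our illustration of one common convention — G(z) in kcal/mol on a grid z in Å, referenced to zero in bulk water, with a nominal membrane half-thickness `z_membrane` — and not a recipe from the original text; other definitions integrate over the whole profile instead.

    import numpy as np

    KB = 0.0019872  # Boltzmann constant, kcal/(mol K)

    def partition_coefficient(z, g, T=303.0, z_membrane=20.0):
        """Ratio of the average solute concentration in the membrane slab to bulk."""
        z = np.asarray(z)
        boltz = np.exp(-np.asarray(g) / (KB * T))  # local concentration relative to bulk
        inside = np.abs(z) < z_membrane            # membrane region |z| < half-thickness
        return float(np.mean(boltz[inside]))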
4.4. Interaction of Small Molecules, Nucleic Acids, Peptides and Proteins with Membranes
In most circumstances, the biological membrane is described at the theoretical level as a simple, homogeneous bilayer formed by a single type of lipid – usually the well-studied, zwitterionic PC lipids. Membranes, however, are infinitely more complex and consist of a heterogeneous assembly of
different lipids, either charged or not, carbohydrates and proteins. Approaching the fine detail of the biological picture by incorporating chemical species of different natures into atomic simulations is evidently the direction towards which modeling is evolving. From a modeling perspective, the influence of cholesterol [106–110], and more generally of sterols [111], on the structure and dynamics of lipid bilayers has attracted a lot of attention in recent years. Although the limited sampling in some simulations calls into question the conclusions reached by the authors, cholesterol is shown to increase the order parameters of the alkyl chains, while decreasing their tilt angle with respect to the normal to the water–membrane interface, in qualitative agreement with experiment [112]. Aside from transporters and channels that assist the transport of chemical species across lipid bilayers, a vast array of key cellular functions are accomplished by proteins that interact with the membrane, either spanning the latter or bound to its surface [113]. Yet, interfacial and transmembrane proteins generally play distinct roles in the cell machinery, albeit the frontier between these two classes of proteins remains somewhat fuzzy. A number of proteins, for instance, are only partially buried in the membrane – e.g., melittin or alamethicin, the insertion of which is conditioned by the transmembrane electric field [114–117]. Recent MD simulations have focused on the association of DNA with lipid membranes, which results in stable complexes of potential use as viral-based carriers. Here, rods of DNA are intercalated between bilayer leaflets formed by mixed cationic and neutral, PC, lipid units, producing undulations of the membrane interface (see Fig. 5). In such a topology, where the host interacts with the head groups, it is shown that both the PC and the cationic lipids contribute to the overall screening of the phosphate groups of the nucleic acids [118]. The strength of in silico experiments is to provide glimpses into the atomic detail of biological membranes that conventional experimental techniques cannot capture. Of particular interest is the molecular interplay that governs membrane–protein association, accessible through large-scale atomic simulations. MD simulations have illuminated, for instance, how the presence of a protein perturbs the structure of the lipid membrane. For example, the helices of the Influenza A M2 channel tilt in a DMPC bilayer to maximize membrane–protein hydrophobic contacts [119, 120]. In the case of gramicidin A, key residues located in the head-group region have been shown to stabilize the channel in the membrane [102]. The influence of the protein on the lipid bilayer can be viewed as a subtle balance between hydrophobic and hydrophilic contributions that, in principle, can be captured by MD simulations. Differences in the order parameters of lipid units adjacent to the protein and far from it have led to the concept of “boundary lipids”. In a vast number of instances, among which the Mycobacterium tuberculosis MscL channel [121], the Influenza A virus M2
protein [122], and the Escherichia coli OmpF trimer [123], it was observed that the membrane protein induces an increased disorder of the lipid alkyl chains in its neighborhood. In sharp contrast, alkyl chains close to the transporter gramicidin A tend to be more ordered, compared to those pertaining to the bulk lipid environment [102, 124]. In the light of these computational investigations, it would, therefore, appear that trans–gauche equilibria in lipid chains are dictated by the very nature of the membrane protein. Yet, as was shown recently [76], drawing definitive conclusions based on limited simulation lengths may turn out to give a distorted vision of the actual behavior of the lipid bilayer. In principle, exceedingly short simulations do not permit the complete relaxation of the lipid chains in the vicinity of the protein, and should, thus, be interpreted cautiously. The close match between the thickness of the lipid bilayer and the length of the hydrophobic segment of the protein spanning the latter constitutes yet another important facet of the protein–membrane interplay. By providing the microscopic detail of the interactions of integral proteins with the lipid environment, atomic statistical simulations may contribute to advance our understanding of the underlying physical principles that govern the function and structure of membranes [125]. In the light of a series of experimental investigations on model peptides embedded in PC membranes with alkyl chains of increasing length, it was found that if the hydrophobic thickness of the peptide is greater than that of the bilayer, the latter becomes thicker, and vice versa [126]. A similar phenomenon was observed recently in the MD simulation of a single peptide nanotube inserted in a hydrated DMPC bilayer [76]. The hydrophobic thickness of the membrane adjusts itself as the synthetic channel tilts concurrently to adapt to its host lipid environment. Whereas the so-called hydrophobic mismatch [127] does not appear to induce perturbations in peptide nanotubes, it can, however, strongly modulate the function of more complex proteins. As was observed recently for gramicidin A, minute changes in the length of the lipid alkyl chains – viz. from the 18-carbon oleyl- to the 20-carbon eicosenoylphosphatidylcholine – switch the protein from a stretch-activated to a stretch-inactivated channel. Symmetrically, the hydrophobic mismatch may alter the phase behavior of the membrane, as demonstrated in the case of WALP peptides, which promote the formation of non-lamellar phases [128]. These remarkable results should, therefore, incline the modeler to be cautious when solvating membrane proteins in lipid surroundings. The choice of the lipid unit for a given protein may turn out to be a genuine leap of faith if attention has not been paid to the possible imbalance in the hydrophobic thicknesses of the membrane and the protein, likely to render a physically unrealistic picture of the assembly. When devised appropriately, atomic simulations can, nonetheless, shed new light on the nature of the protein–membrane interplay, by allowing the modeler not only to visualize, but also possibly to quantify the strength of the participating
interactions. Of particular interest, the non-covalent bonds formed between l-Trp residues and acceptor moieties of the head-group region have been recognized to act as anchoring points of the protein in the lipid bilayer [129, 130]. As has been shown in the case of gramicidin A, the presence of several l-Trp amino acids at the level of the lipid head groups is expected to mediate the overall stabilization of the channel in the membrane [102].
5. Discussion, Outlook and Future Prospects
Retrospectively, with about 15 years of hindsight, it has become clear that atomic simulations, and in particular MD simulations of lipid–water assemblies, have contributed in large measure to improving our knowledge of these very complex systems, from both a structural and a dynamical point of view. It is also obvious that the successes of the pioneering, tantalizing investigations, which not only ignited the field of lipid simulations but were also rapidly followed by many studies on larger assemblies, often reflected as much good fortune as they did science. Yet, major advances on both the hardware and the software/algorithmic fronts progressively allowed the modeler to tackle systems of increasing complexity over time-scales compatible with the physical, chemical and biological reality. Among these advances, the development of specific methods for performing simulations in apt thermodynamic ensembles, the improvement of potential energy functions targeted at the specific modeling of lipid–water assemblies, and the continuous decrease of the price/performance ratio of modern computers have helped push back the intrinsic limitations of MD simulations. More recent studies have demonstrated that simulations at least an order of magnitude longer than those reported when the field was only in its infancy are required to obtain reliable and reproducible results [131]. Simulation of lipid–water systems still constitutes a research area seething with excitement. The development of all the ingredients needed to investigate lipid bilayers in silico with full confidence opens new perspectives, in particular on the biological front, and should rapidly allow the modeler to use lipids in a routine fashion, just like any other solvent. In this spirit, theoretical studies of membrane proteins in a realistic environment should continue to flourish in the near future. As the level of sophistication of atomic simulations increases, together with the available computational power, so does the ambition of the modeler, who attempts to deal with molecular systems ever more complex, in terms of both size- and time-scales. This explains the current teeming activity in the development of approximate schemes that could serve as alternatives to a full-atomic description for the modeling of large lipid–water assemblies over long times. Among these alternatives, a wealth of effort has been invested in recent years in the field of implicit solvation [8]. Since the seminal work of Onsager on
continuum electrostatics [132], the temptation to represent the explicit surroundings of a myriad of chemical systems by a simple dielectric medium has been the object of tremendous interest. Modeling the complexity of lipid bilayers by means of a continuum description has been used, for example, to investigate the insertion of α-helical peptides in a membrane [133], or the interaction of a small toxin with the latter [116]. Results of continuum electrostatics simulations, which are based on solving the Poisson–Boltzmann equation numerically, are in general in qualitative agreement with atomic simulations. Yet, not too unexpectedly, this approximate description cannot capture the subtle, specific interactions that govern the stability of the solute – e.g., a short peptide – at the water–membrane interface. As was underlined recently by Lin et al., the reproduction of membrane dipole potentials based on a continuum electrostatics representation alone is usually erroneous, but can be significantly improved by the inclusion of explicit layers of water molecules near the head-group region [134]. Aside from implicit solvation approaches, the use of coarse-grained representations, wherein each lipid unit is described by a limited number of interacting sites, is probably the most promising. The underlying assumption that the formation of a lipid vesicle is a sufficiently robust process to be simulated by simplified models of lipids was confirmed recently by Marrink and Mark through a study of the aggregation of DPPC units into small unilamellar vesicles [135]. By and large, the strength of coarse-grained models resides in their ability to make simulations of self-assembly processes substantially more affordable than conventional all- or even united-atom models [136]. The level of representation offered by this alternative is, in sharp contrast, incompatible with a fine description of the specific interactions of the participating lipid units with small solutes. The major advantage of coarse-grain (CG) models remains, however, their usefulness in simulating processes that are otherwise either difficult or impossible to tackle using conventional atomistic approaches. Many phenomena involving membranes lie within the mesoscopic spatio-temporal scale that may be explored with coarse-grain methods [137]. Among those, recent studies have shown the power of such techniques in investigating lipid–protein interactions, as well as membrane–membrane interactions such as antimicrobial attack on membranes and membrane fusion [138].
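As an illustration of the continuum electrostatics strategy mentioned above, the following minimal Python sketch solves the linearized Poisson–Boltzmann equation in one dimension, by finite differences, across a low-dielectric slab flanked by an electrolyte. The geometry, dielectric constants, screening length and interfacial charges are illustrative assumptions in reduced units, not parameters taken from the studies cited here.

import numpy as np

# Grid and geometry (illustrative values, lengths in nm)
L, d = 6.0, 2.0                # half-box and membrane half-thickness
n = 601
z = np.linspace(-L, L, n)
h = z[1] - z[0]

eps_m, eps_w = 2.0, 78.0       # slab and water dielectric constants
kappa = 1.0                    # inverse Debye length in water, 1/nm
in_membrane = np.abs(z) < d
eps = np.where(in_membrane, eps_m, eps_w)
kap2 = np.where(in_membrane, 0.0, kappa**2)   # no screening inside the slab

# Fixed "head-group" sheet charges at the two interfaces (reduced units)
rho = np.zeros(n)
for zs in (-d, d):
    rho[np.argmin(np.abs(z - zs))] = 1.0 / h  # unit charge/area smeared over one cell

# Assemble d/dz(eps dphi/dz) - eps*kap2*phi = -rho with phi(+-L) = 0
A = np.zeros((n, n))
b = -rho.copy()
for i in range(1, n - 1):
    em = 0.5 * (eps[i] + eps[i - 1])          # interface-averaged permittivities
    ep = 0.5 * (eps[i] + eps[i + 1])
    A[i, i - 1] = em / h**2
    A[i, i + 1] = ep / h**2
    A[i, i] = -(em + ep) / h**2 - eps[i] * kap2[i]
A[0, 0] = A[-1, -1] = 1.0
b[0] = b[-1] = 0.0

phi = np.linalg.solve(A, b)
print("potential at membrane centre (reduced units): %.4f" % phi[n // 2])

The dielectric discontinuity and the absence of screening inside the slab are the two ingredients that a continuum treatment of a membrane minimally retains.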
References
[1] R.B. Gennis, Biomembranes: Molecular Structure and Function, Springer Verlag, Heidelberg, 1989.
[2] D. Frenkel and B. Smit, Understanding Molecular Simulations: From Algorithms to Applications, Academic Press, San Diego, 1996.
[3] D.P. Tieleman, S.J. Marrink, and H.J.C. Berendsen, "A computer perspective of membranes: molecular dynamics studies of lipid bilayer systems," Biochim. Biophys. Acta, 1331, 235–270, 1997.
[4] D.J. Tobias, "Water and membranes: molecular details from MD simulations," In: M.C. Bellissent-Funel (ed.), Hydration Processes in Biology, vol. 305, NATO ASI Series A: Life Sciences, IOS Press, New York, pp. 293–310, 1999.
[5] L.R. Forrest and M.S.P. Sansom, "Membrane simulations: bigger and better," Curr. Opin. Struct. Biol., 10, 174–181, 2000.
[6] S.E. Feller, "Molecular dynamics simulations of lipid bilayers," Curr. Opin. Colloid Interface Sci., 5, 217–223, 2000.
[7] H.L. Scott, "Modeling the lipid component of membranes," Curr. Opin. Struct. Biol., 12, 495–502, 2002.
[8] D.J. Tobias, "Electrostatic calculations: recent methodological advances and applications to membranes," Curr. Opin. Struct. Biol., 11, 253–261, 2001.
[9] R.J. Mashl, H.L. Scott, S. Subramaniam, and E. Jakobsson, "Molecular simulation of dioleylphosphatidylcholine bilayers at differing levels of hydration," Biophys. J., 81, 3005–3015, 2001.
[10] L. Saiz and M.L. Klein, "Computer simulation studies of model biological membranes," Acc. Chem. Res., 35, 482–489, 2002.
[11] J. Israelachvili, S. Marcelja, and R.G. Horn, "Physical principles of membrane organization," Quart. Rev. Biophys., 13, 121–200, 1980.
[12] J. Israelachvili, Intermolecular and Surface Forces, Academic Press, London, 1992.
[13] K.V. Damodaran and K.M. Merz Jr., "A comparison of DMPC- and DLPE-based lipid bilayers," Biophys. J., 66, 1076–1087, 1994.
[14] L. Saiz and M.L. Klein, "Electrostatic interactions in a neutral model phospholipid bilayer by molecular dynamics simulations," J. Chem. Phys., 116, 3052–3057, 2002.
[15] J.A. Bouwstra, M.A. Salomons-de Vries, J.A. Van der Spek, and W. Bras, "Structure of human stratum corneum as a function of temperature and hydration: a wide-angle X-ray diffraction study," Int. J. Pharmacol., 84, 205–216, 1992.
[16] G. Zaccai, G. Büldt, A. Seelig, and J. Seelig, "Neutron diffraction studies on phosphatidylcholine model membranes. II. Chain conformation and segmental order," J. Mol. Biol., 134, 693–706, 1979.
[17] M.C. Wiener and S.H. White, "Structure of fluid dioleylphosphatidylcholine bilayer determined by joint refinement of X-ray and neutron diffraction data. III. Complete structure," Biophys. J., 61, 434–447, 1992.
[18] J.F. Nagle, R. Zhang, S. Tristram-Nagle, W.J. Sun, H.I. Petrache, and R.M. Suter, "X-ray structure determination of fully hydrated Lα phase dipalmitoylphosphatidylcholine bilayers," Biophys. J., 70, 1419–1431, 1996.
[19] K. Hristova and S.H. White, "Determination of the hydrocarbon core structure of fluid DOPC bilayers by X-ray diffraction using specific bromination of the double bonds: effect of hydration," Biophys. J., 74, 2419–2433, 1998.
[20] J.F. Nagle and S. Tristram-Nagle, "Lipid bilayer structure," Curr. Opin. Struct. Biol., 10, 474–480, 2000.
[21] C.F. Majkrzak and N.F. Berk, "Exact determination of the phase in neutron reflectometry by variation of the surrounding media," Phys. Rev. B, 58, 15416–15418, 1998.
[22] M. Tarek, K. Tu, M.L. Klein, and D.J. Tobias, "Molecular dynamics simulations of supported phospholipid/alkanethiol bilayers on a gold(111) surface," Biophys. J., 77, 464–472, 1999.
[23] C.F. Majkrzak, N.F. Berk, S. Krueger, J.A. Dura, M. Tarek, D.J. Tobias, V. Silin, C.W. Meuse, J. Woodward, and A.L. Plant, "First principle determination of hybrid bilayer membrane structure by phase-sensitive neutron reflectometry," Biophys. J., 79, 3330–3340, 2000.
[24] S. Krueger, C.W. Meuse, C.F. Majkrzak, J.A. Dura, N.F. Berk, M. Tarek, and A.L. Plant, "Investigation of hybrid bilayer membranes with neutron reflectometry: probing the interaction of melittin," Langmuir, 17, 511–521, 2001.
[25] K. Gawrisch, N.V. Eldho, and I.V. Polozov, "Novel NMR tools to study structure and dynamics of biomembranes," Chem. Phys. Lipids, 116, 135–151, 2002.
[26] S.J. Marrink, M. Berkowitz, and H.J.C. Berendsen, "Molecular dynamics simulation of a membrane–water interface: the ordering of water and its relation to the hydration force," Langmuir, 9, 3122–3131, 1993.
[27] S.E. Feller, D. Huster, and K. Gawrisch, "Interpretation of NOESY cross-relaxation rates from molecular dynamics simulations of a lipid bilayer," J. Am. Chem. Soc., 121, 8963–8964, 1999.
[28] K. Sengupta and J. Raghunathan, "Structure of ripple phase in chiral and racemic dimyristoylphosphatidylcholine multibilayers," Phys. Rev. E, 59, 2455–2457, 1999.
[29] J. Katsaras, S. Tristram-Nagle, Y. Liu, R.L. Headrick, E. Fontes, P.C. Mason, and J.F. Nagle, "Clarification of the ripple phase of lecithin bilayers using fully hydrated aligned samples," Phys. Rev. E, 61, 5668–5677, 2000.
[30] S.J. Marrink and A.E. Mark, "Effect of undulations on surface tension in simulated bilayers," J. Phys. Chem. B, 105, 6122–6127, 2001.
[31] O. Berger, O. Edholm, and F. Jähnig, "Molecular dynamics simulations of a fluid bilayer of dipalmitoylphosphatidylcholine at full hydration, constant pressure, and constant temperature," Biophys. J., 72, 2002–2013, 1997.
[32] S.E. Feller and R.W. Pastor, "On simulating lipid bilayers with an applied surface tension: periodic boundary conditions and undulations," Biophys. J., 71, 1350–1355, 1996.
[33] S.E. Feller and R.W. Pastor, "Constant surface tension simulations of lipid bilayers: the sensitivity of surface areas and compressibilities," J. Chem. Phys., 111, 1281–1287, 1999.
[34] D.P. Tieleman and H.J.C. Berendsen, "Molecular dynamics simulations of a fully hydrated dipalmitoylphosphatidylcholine bilayer with different macroscopic boundary conditions and parameters," J. Chem. Phys., 105, 4871–4880, 1996.
[35] E. Lindahl and O. Edholm, "Mesoscopic undulations and thickness fluctuations in lipid bilayers from molecular dynamics simulations," Biophys. J., 79, 426–433, 2000.
[36] G.J. Martyna, D.J. Tobias, and M.L. Klein, "Constant pressure molecular dynamics algorithms," J. Chem. Phys., 101, 4177–4189, 1994.
[37] S.E. Feller, Y.H. Zhang, R.W. Pastor, and B.R. Brooks, "Constant pressure molecular dynamics simulations – the Langevin piston method," J. Chem. Phys., 103, 4613–4621, 1995.
[38] A.M. Smondyrev and M.L. Berkowitz, "United atom force field for phospholipid membranes: constant pressure molecular dynamics simulation of dipalmitoylphosphatidylcholine/water system," J. Comput. Chem., 20, 531–545, 1999.
[39] E. Egberts, S.J. Marrink, and H.J.C. Berendsen, "Molecular dynamics simulation of a phospholipid membrane," Eur. Biophys. J., 22, 423–436, 1994.
[40] D.J. Tobias, K. Tu, and M.L. Klein, "Assessment of all-atom potentials for modeling membranes: molecular dynamics simulations of solid and liquid alkanes and crystals of phospholipid fragments," J. Chim. Phys., 94, 1482–1502, 1997.
[41] S.-W. Chiu, M. Clark, E. Jakobsson, S. Subramaniam, and H.L. Scott, "Optimization of hydrocarbon chain interaction parameters: application to the simulation of fluid phase lipid bilayers," J. Phys. Chem. B, 103, 6323–6327, 1999.
[42] S.E. Feller and A.D. MacKerell Jr., "An improved empirical potential energy function for molecular simulations of phospholipids," J. Phys. Chem. B, 104, 7510–7515, 2000.
[43] S.E. Feller, K. Gawrisch, and A.D. MacKerell Jr., "Polyunsaturated fatty acids in lipid bilayers: intrinsic and environmental contributions to their unique physical properties," J. Am. Chem. Soc., 124, 318–326, 2002.
[44] J. Ryckaert and A. Bellemans, "Molecular dynamics of liquid alkanes," Chem. Soc. Faraday Discuss., 66, 95–106, 1978.
[45] W.L. Jorgensen and J. Tirado-Rives, "The OPLS potential functions for proteins: energy minimizations for crystals of cyclic peptides and crambin," J. Am. Chem. Soc., 110, 1657–1666, 1988.
[46] W.D. Cornell and C. Chipot, "Alternative approaches to charge distribution calculations," In: P.v.R. Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman, H.F. Schaefer III, and P.R. Schreiner (eds.), Encyclopedia of Computational Chemistry, vol. 1, Wiley and Sons, Chichester, pp. 258–263, 1998.
[47] F. Colonna and E. Evleth, "Conformationally invariant modeling of atomic charges," Chem. Phys. Lett., 212, 665–670, 1993.
[48] K. Tu, D.J. Tobias, and M.L. Klein, "Constant pressure and temperature molecular dynamics simulation of a fully hydrated liquid crystal phase DPPC bilayer," Biophys. J., 69, 2558–2562, 1995.
[49] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[50] A.Y. Toukmaji and J.A. Board Jr., "Ewald summation techniques in perspective: a survey," Comput. Phys. Comm., 95, 73–92, 1996.
[51] R.W. Hockney and J.W. Eastwood, Computer Simulation Using Particles, IOP Publishing Ltd., Bristol, England, 1988.
[52] K.E. Schmidt and M.A. Lee, "Implementing the fast multipole method in three dimensions," J. Stat. Phys., 63, 1223–1235, 1991.
[53] G.J. Martyna, M.E. Tuckerman, D.J. Tobias, and M.L. Klein, "Explicit reversible integrators for extended systems dynamics," Mol. Phys., 87, 1117–1128, 1996.
[54] M.E. Tuckerman and G.J. Martyna, "Understanding modern molecular dynamics: techniques and applications," J. Phys. Chem. B, 104, 159–178, 2000.
[55] J.A. Izaguirre, S. Reich, and R.D. Skeel, "Longer time steps for molecular dynamics," J. Chem. Phys., 110, 9853–9864, 1999.
[56] J. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, "Numerical integration of the Cartesian equations of motion for a system with constraints: molecular dynamics of n-alkanes," J. Comput. Phys., 23, 327–341, 1977.
[57] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.R. Ross, T.E. Cheatham III, S. DeBolt, D. Ferguson, G. Seibel, and P. Kollman, "AMBER, a computer program for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to elucidate the structures and energies of molecules," Comput. Phys. Commun., 91, 1–41, 1995.
[58] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus, "CHARMM: a program for macromolecular energy, minimization, and dynamics calculations," J. Comput. Chem., 4, 187–217, 1983.
[59] E. Lindahl, B. Hess, and D. van der Spoel, "GROMACS 3.0: a package for molecular simulation and trajectory analysis," J. Mol. Mod., 7, 306–317, 2001.
[60] L. Kalé, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten, "NAMD2: greater scalability for parallel molecular dynamics," J. Comput. Phys., 151, 283–312, 1999.
[61] E. Tajkhorshid, A. Aksimentiev, I. Balabin, M. Gao, B. Isralewitz, J.C. Phillips, F. Zhu, and K. Schulten, "Large scale simulation of protein mechanics and function," In: F.M. Richards, D.S. Eisenberg, and J. Kuriyan (eds.), Advances in Protein Chemistry, vol. 66, Elsevier Academic Press, New York, pp. 195–247, 2003.
[62] W. Humphrey, A. Dalke, and K. Schulten, "VMD – visual molecular dynamics," J. Mol. Graph., 14, 33–38, 1996.
[63] S.-W. Chiu, M. Clark, V. Balaji, S. Subramaniam, H.L. Scott, and E. Jakobsson, "Incorporation of surface tension into molecular dynamics simulation of an interface: a fluid phase lipid bilayer membrane," Biophys. J., 69, 1230–1245, 1995.
[64] R.M. Venable, Y. Zhang, B.J. Hardy, and R.W. Pastor, "Molecular dynamics simulations of a lipid bilayer and of hexadecane: an investigation of membrane fluidity," Science, 262, 223–226, 1993.
[65] S.E. Feller, R.M. Venable, and R.W. Pastor, "Computer simulation of a DPPC phospholipid bilayer: structural changes as a function of molecular surface area," Langmuir, 13, 6555–6561, 1997.
[66] U. Essman and M. Berkowitz, "Dynamical properties of phospholipid bilayers from computer simulations," Biophys. J., 76, 2081–2089, 1999.
[67] S.-W. Chiu, M. Clark, E. Jakobsson, S. Subramaniam, and H.L. Scott, "Application of combined Monte Carlo and molecular dynamics method to simulation of dipalmitoyl phosphatidylcholine lipid bilayer," J. Comp. Chem., 11, 1153–1164, 1999.
[68] S.-W. Chiu, E. Jakobsson, S. Subramaniam, and H.L. Scott, "Combined Monte Carlo and molecular dynamics simulation of fully hydrated dioleyl and palmitoyl–oleyl phosphatidylcholine lipid bilayers," Biophys. J., 77, 2462–2469, 1999.
[69] T. Róg, K. Murzyn, and M. Pasenkiewicz-Gierula, "The dynamics of water at the phospholipid bilayer: a molecular dynamics study," Chem. Phys. Lett., 352, 323–327, 2002.
[70] L. Saiz and M.L. Klein, "Structural properties of a highly polyunsaturated lipid bilayer from molecular dynamics simulations," Biophys. J., 81, 204–216, 2001.
[71] M.L. Berkowitz and M.J. Raghavan, "Computer simulation of a water/membrane interface," Langmuir, 7, 1042–1044, 1991.
[72] J.J. López Cascales, H.J.C. Berendsen, and J. García de la Torre, "Molecular dynamics simulation of water between two charged layers of dipalmitoylphosphatidylserine," J. Phys. Chem., 100, 8621–8627, 1996.
[73] S.A. Pandit and M.L. Berkowitz, "Molecular dynamics simulation of dipalmitoylphosphatidylserine bilayer with Na counterions," Biophys. J., 82, 1818–1827, 2002.
[74] M. Wilson and A. Pohorille, "Molecular dynamics of a water–lipid bilayer interface," J. Am. Chem. Soc., 116, 1490–1501, 1994.
[75] A. Pohorille and M.A. Wilson, "Molecular dynamics studies of simple membrane–water interfaces: structure and functions in the beginnings of cellular life," Orig. Life Evol. Biosph., 25, 21–46, 1995.
[76] M. Tarek, B. Maigret, and C. Chipot, "Molecular dynamics investigation of an oriented cyclic peptide nanotube in DMPC bilayers," Biophys. J., 85, 2287–2298, 2003.
[77] G.S. Harms, M. Sonnleitner, G.J. Schütz, and T. Schmidt, "Single-molecule anisotropy imaging," Biophys. J., 77, 2864–2870, 1999.
[78] P.B. Moore, C.F. Lopez, and M.L. Klein, "Dynamical properties of a hydrated lipid bilayer from a multinanosecond molecular dynamics simulation," Biophys. J., 81, 2484–2494, 2001.
[79] R.S. Armen, O.D. Uitto, and S.E. Feller, "Phospholipid component volumes: determination and application to bilayer structure calculations," Biophys. J., 75, 734–744, 1998.
[80] J.P. Douliez, A. Léonard, and E.J. Dufourc, "Restatement of order parameters in biomembranes: calculation of C–C bond order parameters from C–D quadrupolar splitting," Biophys. J., 68, 1727–1739, 1995.
[81] H.I. Petrache, K. Tu, and J.F. Nagle, "Analysis of simulated NMR order parameters for lipid bilayer structure determination," Biophys. J., 76, 2479–2487, 1999.
[82] R.G. Snyder, K. Tu, M.L. Klein, R. Mendelssohn, H.L. Strauss, and W. Sun, "Acyl chain conformation and packing in dipalmitoylphosphatidylcholine bilayers from MD simulations and IR spectroscopy," J. Phys. Chem. B, 106, 6273–6288, 2002.
[83] M. Tarek, D.J. Tobias, and M.L. Klein, "Molecular dynamics simulation of tetradecyltrimethylammonium bromide monolayers at the air/water interface," J. Phys. Chem., 99, 1393–1402, 1995.
[84] K. Gawrisch, D. Ruston, J. Zimmerberg, V. Parsegian, R. Rand, and N. Fuller, "Membrane dipole potentials, hydration forces, and the ordering of water at membrane surfaces," Biophys. J., 61, 1213–1223, 1992.
[85] W. Shinoda, M. Shimizu, and S. Okazaki, "Molecular dynamics study on electrostatic properties of a lipid bilayer: polarization, electrostatic potential, and the effects on structure and dynamics of water near the interface," J. Phys. Chem. B, 102, 6647–6654, 1998.
[86] J.X. Cheng, S. Pautot, D.A. Weitz, and X.S. Xie, "Ordering of water molecules between phospholipid bilayers visualized by coherent anti-Stokes Raman scattering microscopy," Proc. Natl Acad. Sci. USA, 100, 9826–9830, 2003.
[87] R.W. Pastor, R.M. Venable, and S.E. Feller, "Lipid bilayers, NMR relaxation, and computer simulations," Acc. Chem. Res., 35, 438–446, 2002.
[88] D.J. Tobias, "Membrane simulations," In: O.H. Becker, A.D. Mackerell Jr., B. Roux, and M. Watanabe (eds.), Computational Biochemistry and Biophysics, Marcel Dekker, New York, 2001.
[89] M. Tarek, D.J. Tobias, S.H. Chen, and M.L. Klein, "Short wavelength collective dynamics in phospholipid bilayers: a molecular dynamics study," Phys. Rev. Lett., 87, 238101, 2001.
[90] M.A. Wilson and A. Pohorille, "Mechanism of unassisted ion transport across membrane bilayers," J. Am. Chem. Soc., 118, 6580–6587, 1996.
[91] H.E. Alper and T.R. Stouch, "Orientation and diffusion of a drug analogue in biomembranes: molecular dynamics simulations," J. Phys. Chem., 99, 5724–5731, 1995.
[92] D. Bassolino-Klimas, H.E. Alper, and T.R. Stouch, "Drug–membrane interactions studied by molecular dynamics simulation: size dependence of diffusion," Drug Des. Discov., 13, 135–141, 1996.
[93] K. Tu, M. Tarek, M.L. Klein, and D. Scharf, "Effects of anesthetics on the structure of a phospholipid bilayer: molecular dynamics investigation of halothane in the hydrated liquid crystal phase of dipalmitoylphosphatidylcholine," Biophys. J., 75, 2123–2134, 1998.
[94] L. Koubi, M. Tarek, M.L. Klein, and D. Scharf, "Distribution of halothane in a dipalmitoylphosphatidylcholine bilayer from molecular dynamics calculations," Biophys. J., 78, 800–811, 2000.
[95] L. Koubi, M. Tarek, M.L. Bandyopadhyay, and D. Scharf, "Effects of the nonimmobilizer hexafluoroethane on the model membrane DMPC," Anesthesiology, 97, 848–855, 2002.
[96] A. Pohorille and M.A. Wilson, "Excess chemical potential of small solutes across water–membrane and water–hexane interfaces," J. Chem. Phys., 104, 3760–3773, 1996.
[97] E. Overton, Studien über die Narkose, zugleich ein Beitrag zur allgemeinen Pharmakologie, Verlag von Gustav Fischer, Jena, 1901.
[98] A. Pohorille, M.A. Wilson, M.H. New, and C. Chipot, "Concentrations of anesthetics across the water–membrane interface; the Meyer–Overton hypothesis revisited," Toxicology Lett., 100, 421–430, 1998.
[99] A. Pohorille, M.A. Wilson, K. Schweighofer, M.H. New, and C. Chipot, "Interactions of membranes with small molecules and peptides," In: J. Leszczynski (ed.), Theoretical and Computational Chemistry – Computational Molecular Biology, vol. 8, Elsevier, The Netherlands, pp. 485–535, 1999.
[100] D.P. Tieleman, P.C. Biggin, G.R. Smith, and M.S.P. Sansom, "Simulation approaches to ion channel structure–function relationships," Quart. Rev. Biophys., 34, 473–561, 2001.
[101] B. Roux, "Theoretical and computational models of ion channels," Curr. Opin. Struct. Biol., 12, 182–189, 2002.
[102] B. Roux, "Computational studies of the gramicidin channel," Acc. Chem. Res., 35, 366–375, 2002.
[103] R. Pomès and B. Roux, "Molecular mechanism of H+ conduction in the single-file water chain of the gramicidin channel," Biophys. J., 82, 2304–2316, 2002.
[104] S. Fernandez-Lopez, H.S. Kim, E.C. Choi, M. Delgado, J.R. Granja, A. Khasanov, K. Kraehenbuehl, G. Long, D.A. Weinberger, K.M. Wilcoxen, and M.R. Ghadiri, "Antibacterial agents based on the cyclic D,L-α-peptide architecture," Nature, 412, 452–455, 2001.
[105] E. Tajkhorshid, P. Nollert, M.O. Jensen, L.J.W. Miercke, J. O'Connell, R.M. Stroud, and K. Schulten, "Control of the selectivity of the aquaporin water channel family by global orientational tuning," Science, 296, 525–530, 2002.
[106] O. Edholm and A.M. Nyberg, "Cholesterol in model membranes: a molecular dynamics study," Biophys. J., 63, 1081–1089, 1992.
[107] R.R. Gabdoulline, G. Vanderkooi, and C. Zheng, "Comparison of the structures of dimyristoylphosphatidylcholine in the presence and absence of cholesterol by molecular dynamics simulations," J. Phys. Chem., 100, 15942–15946, 1996.
[108] K. Tu, M.L. Klein, and D.J. Tobias, "Constant-pressure molecular dynamics investigation of cholesterol in a dipalmitoylphosphatidylcholine bilayer," Biophys. J., 75, 2147–2156, 1998.
[109] A.M. Smondyrev and M.L. Berkowitz, "Structure of dipalmitoylphosphatidylcholine/cholesterol bilayer at low and high cholesterol concentrations: molecular dynamics simulation," Biophys. J., 77, 2075–2089, 1999.
[110] S.-W. Chiu, E. Jakobsson, and H.L. Scott, "Combined Monte Carlo and molecular dynamics simulation of hydrated dipalmitoyl–phosphatidylcholine–cholesterol lipid bilayers," Biophys. J., 114, 5435–5443, 2001.
[111] A.M. Smondyrev and M.L. Berkowitz, "Molecular dynamics simulation of the structure of dimyristoylphosphatidylcholine bilayers with cholesterol, ergosterol, and lanosterol," Biophys. J., 80, 1649–1658, 2001.
[112] T.W. McMullen and R.N. McElhaney, "Physical studies of cholesterol–phospholipid interactions," Curr. Opin. Coll. Int. Sci., 1, 83–90, 1996.
[113] A. Watts, "Solid-state NMR approaches for studying the interaction of peptides and proteins with membranes," Biochim. Biophys. Acta, 1376, 297–318, 1998.
[114] D.S. Cafiso, "Alamethicin: a peptide model for voltage gating and protein–membrane interactions," Annu. Rev. Biophys. Biomol. Struct., 23, 141–165, 1994.
[115] C.E. Dempsey, "The actions of melittin on membranes," Biochim. Biophys. Acta, 1031, 143–161, 1990.
[116] S. Bernèche, M. Nina, and B. Roux, "Molecular dynamics simulation of melittin in a dimyristoylphosphatidylcholine bilayer membrane," Biophys. J., 75, 1603–1618, 1998.
[117] D.P. Tieleman, H.J.C. Berendsen, and M.S.P. Sansom, "Voltage-dependent insertion of alamethicin at phospholipid/water and octane/water interfaces," Biophys. J., 80, 331–346, 2001.
[118] S. Bandyopadhyay, M. Tarek, and M.L. Klein, "Molecular dynamics study of lipid–DNA complexes," J. Phys. Chem. B, 103, 10075–10080, 1999.
[119] Q. Zhong, T. Husslein, P.B. Moore, D.M. Newns, P. Pattnaik, and M.L. Klein, "The M2 channel of influenza A virus: a molecular dynamics study," FEBS Lett., 434, 265–271, 1998.
[120] K. Schweighofer and A. Pohorille, "Computer simulation of ion channel gating: the M2 channel of influenza A virus in a lipid bilayer," Biophys. J., 78, 150–163, 2000.
[121] D.E. Elmore and D.A. Dougherty, "Molecular dynamics simulations of wild-type and mutant forms of the Mycobacterium tuberculosis MscL channel," Biophys. J., 81, 1345–1359, 2001.
[122] T. Husslein, P.B. Moore, Q.F. Zhong, D.M. Newns, P.C. Pattnaik, and M.L. Klein, "Molecular dynamics simulation of a hydrated diphytanol phosphatidylcholine lipid bilayer containing an alpha-helical bundle of four transmembrane domains of the influenza A virus M2 protein," Faraday Disc., 111, 201–208, 1998.
[123] D.P. Tieleman, L.R. Forrest, M.S.P. Sansom, and H.J.C. Berendsen, "Lipid properties and the orientation of aromatic residues in OmpF, influenza M2 and alamethicin systems: molecular dynamics simulations," Biochemistry, 37, 17544–17561, 1998.
[124] S.W. Chiu, S. Subramanian, and E. Jakobsson, "Simulation study of a gramicidin/lipid bilayer system in excess water and lipid. II. Rates and mechanisms of water transport," Biophys. J., 76, 1939–1950, 1999.
[125] O.G. Mouritsen and M. Bloom, "Mattress model of lipid–protein interactions in membranes," Biophys. J., 46, 141–153, 1984.
[126] M.R.R. de Planque, D.V. Greathouse, H. Schäfer, D. Marsh, and J.A. Killian, "Influence of lipid/peptide hydrophobic mismatch on the thickness of diacylphosphatidylcholine bilayers. A ²H NMR and ESR study using designed transmembrane α-helical peptides and gramicidin A," Biochemistry, 37, 9333–9345, 1998.
[127] D. Duque, X.J. Li, K. Katsov, and M. Schick, "Molecular theory of hydrophobic mismatch between lipids and peptides," J. Chem. Phys., 116, 10478–10484, 2002.
[128] S. Morein, R.E. Koeppe II, G. Lindblom, B. de Kruijff, and J.A. Killian, "The effect of peptide/lipid hydrophobic mismatch on the phase behavior of model membranes mimicking the lipid composition of Escherichia coli membranes," Biophys. J., 78, 2475–2485, 2000.
[129] M.R.R. de Planque, J.A.W. Kruijtzer, R.M.J. Liskamp, D. Marsh, D.V. Greathouse, R.E. Koeppe II, B. de Kruijff, and J.A. Killian, "Different membrane anchoring positions of tryptophan and lysine in synthetic transmembrane α-helical peptides," J. Biol. Chem., 274, 20839–20846, 1999.
[130] W.M. Yau, W.C. Wimley, K. Gawrisch, and S.H. White, "The preference of tryptophan for membrane interfaces," Biochemistry, 37, 14713–14718, 1998.
[131] C. Anézo, A.H. de Vries, H.D. Höltje, P. Tieleman, and S.J. Marrink, "Methodological issues in lipid bilayer simulations," J. Phys. Chem. B, 107, 9424–9433, 2003.
[132] L. Onsager, "Electric moments of molecules in liquids," J. Am. Chem. Soc., 58, 1486–1493, 1936.
[133] N. Ben-Tal, A. Ben-Shaul, A. Nicholls, and B. Honig, "Free-energy determinants of α-helix insertion into lipid bilayers," Biophys. J., 70, 1803–1812, 1996.
[134] J.H. Lin, N.A. Baker, and J.A. McCammon, "Bridging implicit and explicit solvent approaches for membrane electrostatics," Biophys. J., 83, 1374–1379, 2002.
[135] S.J. Marrink and A.E. Mark, "Molecular dynamics simulation of the formation, structure, and dynamics of small phospholipid vesicles," J. Am. Chem. Soc., 125, 15233–15242, 2003.
[136] J.C. Shelley, M.Y. Shelley, R.C. Reeder, S. Bandyopadhyay, P.B. Moore, and M.L. Klein, "Simulations of phospholipids using a coarse grain model," J. Phys. Chem. B, 105, 9785–9792, 2001.
[137] S.O. Nielsen, C.F. Lopez, G. Srinivas, and M.L. Klein, "Coarse grain models and the computer simulation of soft materials," J. Phys. Condens. Matter, 16, R481–R512, 2004.
[138] C.F. Lopez, S.O. Nielsen, P.B. Moore, and M.L. Klein, "Understanding nature's design for a nanosyringe," Proc. Natl Acad. Sci. USA, 101, 4431–4434, 2004.
2.27 MODELING IRRADIATION DAMAGE ACCUMULATION IN CRYSTALS
Chung H. Woo
The Hong Kong Polytechnic University, Hong Kong SAR, China
Bombardment of crystalline solids by energetic particles produces lattice defects, the accumulation of which is the origin of the macroscopic effects of irradiation damage. In an all-inclusive theory, the defects produced fall into two categories: (1) atomic displacements creating freely migrating vacancies and interstitials and their clusters, both mobile and immobile; and (2) transmutations creating impurity elements, such as helium. The first type of damage is called displacement damage, and is recoverable via the recombination of the vacancies and the interstitials before they disappear into grain boundaries, voids and dislocations; the second type is not. In the present article, our attention is on the former. Depending on the energy transfer between the projectile particle and the atom of first encounter in the irradiated material, i.e., the primary knock-on atom or PKA, the initial displacement damage may take the form of (i) vacancy–interstitial (Frenkel) pairs, when the energy transferred is just sufficient to overcome the displacement threshold, e.g., irradiation by MeV electrons; or (ii) cascades and sub-cascades, when the energy transferred is substantially higher, e.g., irradiation by fast neutrons and heavy ions. The lattice defects generated directly from the displacement damage in case (i) are isolated vacancies and interstitials, and their generation rates can be approximated by the displacement rate [1], after discounting the recombination of correlated pairs [2]. In case (ii), due to the high concentration of displacement damage produced in the small cascade volume, and the high mobility of atoms resulting from the energy deposited by the PKA, substantial recombination of the lattice defects takes place already during the cooling-down phase. The final numbers of interstitials and vacancies produced by the cascade are only small fractions of those estimated from the available energy due to the PKA. While the recombination is taking place, a significant fraction of the interstitials and vacancies cluster at the same time. The fraction
of vacancies immobilized in the primary vacancy clusters (PVCs) need not be exactly the same as that of interstitials in the primary interstitial clusters (PICs), and the concepts of atomic displacements and residual defect production must be carefully distinguished. It is obvious that the details of the initial displacement damage determine the characteristics of the lattice defects produced, and hence the kinetics of their reactions among themselves and with the existing microstructure, and ultimately, the macroscopic effects resulting from the ensuing microstructure evolution. For example, electron irradiation produces displacement damage in the form of isolated Frenkel pairs. Accordingly, microstructure evolution can be modeled in terms of the reaction kinetics of point-defects produced homogeneously in space and uniformly in time. On the other hand, irradiation damage with heavier particles begins in the form of cascades and sub-cascades, in which case defect clustering occurs heterogeneously and athermally. The modeling of the resulting microstructure evolution must take these characteristics into account. In terms of spatial and temporal scales, investigations involving the details of the initial damage, the properties of the defects produced, and the characteristics of the subsequent reactions naturally lie within the realm of atomistic modeling using techniques such as molecular dynamics (MD) or lattice kinetic Monte Carlo (KMC) (see other papers in this volume). On the other hand, the macroscopic manifestation of the damage effects occurs through the accumulation of irradiation-induced defects, which causes changes in the underlying microstructure of the irradiated material, often through multiple development stages. Modeling irradiation damage in this domain must consider spatial dimensions in the range of microns; time scales of days, weeks, years, or even decades; and the accumulation of astronomical numbers of events occurring with infinitesimal probabilities. Such scales make direct atomistic modeling prohibitively expensive, even taking into account the rapid development of computer hardware and computational techniques. Irradiation damage modeling must recognize the multi-scale nature of the problem. Theories of irradiation effects, such as void swelling, irradiation creep and growth, based on the reaction kinetics of the underlying microstructure evolution, have been the mainstream approach of irradiation-damage modeling for half a century, since the early 1960s (see Ref. [3] for a comprehensive review). This is still by far the most effective approach for bridging the vast gap between mechanisms at the atomistic scale and the associated effects at the component scale. Many models have been developed from these theories, for use as technological tools of analysis and interpretation for reactor design and operation. However, due to the lack of information on the irradiation-induced defects, model parameters fitted to experimental data have to be used. In many cases, based on a single set of parameters, a model cannot be made consistent with all existing experimental observations, a weakness which usually reflects
the inadequate theoretical understanding or model oversimplification. In this situation, the usefulness of the model for predictive purposes, or as an interpretative tool, would be seriously compromised. This underpins the need for a well-articulated multi-scale approach to irradiation damage modeling, one which spans the vast territory between the nano- and macro-scales. In this article we present an overview of the reaction kinetic theory of irradiation damage. To facilitate the articulation with the domain of atomistic modeling, our approach will be based on the discrete crystal-lattice description. To prepare the fundamentals, we start off with the general theory of bimolecular reaction kinetics in a diffusive medium, within the framework of which we discuss the standard rate-theory model, the concept of sink bias, the effect of anisotropic diffusion and the DAD bias, and the introduction of the effective lifetime for the treatment of sink competition. We then consider, within the atomistic picture, the effects on the kinetics due to interactions with applied fields, the implications for the reaction kinetics of one-dimensional diffusers, and the stress-induced preferred absorption (SIPA) effect due to elasto-diffusion. Damage caused by cascade-producing irradiation, and its modeling using the production bias, is covered next. Various difficulties facing the production-bias model necessitate consideration of modeling issues beyond the mean-field approximation. To overcome the inadequacy of the mean-field approximation, stochastic effects due to random cascade initiation have to be taken into account. The nucleation of voids and dislocation loops during cascade damage, the stability of spatial homogeneity, and the development of heterogeneous microstructure are discussed in this context. The overview is concluded with a summary and outlook section.
1. Reaction Kinetics in a Diffusive Medium
The kinetics of the reactions involving point-defects and microstructure components has been considered via several approaches, including the rate-theory approach (see Ref. [4]), the master-equation approach (see Ref. [5]), and the Fokker–Planck equation approach (see Ref. [6]). We begin systematically with the well-developed theory of diffusion-influenced chemical reaction kinetics for molecules diffusing in a solvent, and relate it to the other approaches as we proceed.
1.1. Bimolecular Reaction Kinetics
Chemical reaction kinetics theory has a long history, starting with the elegant work of von Smoluchowski [7]. Based on the concept of the pair probability function, Goesele and Seeger [8] developed a theory for the
bimolecular reaction rates, which can be generalized to include several important factors in the study of irradiation damage, such as interactions between the reaction partners, proximity of other reaction partners, anisotropic diffusion, and finite lifetime of the reactants. Goesele [9] gave a detailed exposition on this topic. Starting with randomly distributed reactants A and B, having diffusion tensors D_A and D_B, and an interaction potential E(r) between them, the rate of change in the spatially averaged concentrations C_A and C_B (in atomic fractions) is governed by the reaction coefficient α(t):

    \dot{C}_A = \dot{C}_B = -\bar{D}\,\alpha(t)\,C_A C_B,        (1)

where the dots over C_A and C_B denote the time derivative and

    \bar{D} = (D_x D_y D_z)^{1/3}.        (2)

D_x, D_y, D_z are the principal values of the relative diffusion tensor D, which is assumed to be constant in space and time:

    D = D_A + D_B.        (3)
In the case of three-dimensional anisotropic diffusion, D_x, D_y, D_z are all nonzero. Then α(t) is a function of time and is related to the average diffusivity \bar{D} and the drift field E originating from the interaction between A and B. Explicit expressions for the most interesting cases have been derived [9, 10]. If B is a microstructure component that is indestructible, \dot{C}_B = 0 and Eq. (1) only applies to C_A. The concept of pair probability densities discussed above is only valid for the equilibrium case, and not for the dynamic case, in which the reactants are continuously generated and subsequently annihilated. Goesele [11] introduced the concept of the effective lifetime τ_eff of a point-defect as the mean time until annihilation of the point-defect, whether through spontaneous or induced conversion, recombination with the anti-defect, or rejoining of the crystal lattice at a microstructure component. In terms of τ_eff, the effective time-independent reaction constant α can be derived from known expressions for the equilibrium time-dependent annealing reaction coefficient α(t).
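As a minimal numerical illustration of Eq. (1), the Python sketch below integrates the bimolecular rate equation for a constant reaction coefficient α; all numerical values (initial concentrations, \bar{D}, α) are arbitrary assumptions chosen only to display the expected decay, with the difference C_B − C_A conserved.

import numpy as np

def bimolecular_decay(C_A0, C_B0, D_bar, alpha, t_end, n_steps=10000):
    """Integrate dC_A/dt = dC_B/dt = -D_bar*alpha*C_A*C_B (Eq. 1) by explicit Euler."""
    dt = t_end / n_steps
    C_A, C_B = C_A0, C_B0
    for _ in range(n_steps):
        rate = D_bar * alpha * C_A * C_B
        C_A -= rate * dt
        C_B -= rate * dt
    return C_A, C_B

# Arbitrary illustrative values (atomic fractions; reduced units for D_bar and alpha)
C_A, C_B = bimolecular_decay(C_A0=1e-4, C_B0=2e-4, D_bar=1.0, alpha=1e4, t_end=100.0)
print(f"C_A = {C_A:.3e}, C_B = {C_B:.3e}")  # C_B - C_A stays constant at 1e-4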
1.2. Rate Theory Model and the Concept of Sink Bias
The steady-state kinetic equation is similar to Eq. (1), but with α(t) replaced by α. Goesele [11] showed that α is related to the sink strength in the effective-medium approximation of the rate-theory model. Indeed, if B is an inexorable sink, then αC_B is the usual sink strength k² of a microstructure component (sink) in the effective-medium approximation. The lifetime
of an individual process and the corresponding reaction coefficients are also related, i.e.,

    k_B^2 = \alpha C_B = (\bar{D}\,\tau_B)^{-1}        (4)

and

    k^2 = \sum_B k_B^2 = (\bar{D}\,\tau_{\rm eff})^{-1}        (5)

with

    \tau_{\rm eff}^{-1} = \sum_B \tau_B^{-1}.        (6)
An imbalance in the fluxes of vacancies and interstitials to a microstructure component causes microstructure evolution, which in most cases results in macroscopic property changes. To be specific, let us consider a case in which vacancies and self-interstitials are produced at a rate G. Replacing α(t) by α in Eq. (1), the net interstitial annihilation rate J^s at a microstructure component S is then given by

    J^s = \alpha_i^s \bar{D}_i C_i - \alpha_v^s \bar{D}_v C_v,        (7)
where α_i^s is the reaction constant of interstitials with a microstructure component S, and α_v^s is the corresponding quantity for vacancies. The symbols C_i and C_v are the interstitial and vacancy concentrations (i.e., the spatial and temporal means), respectively, and \bar{D}_i, \bar{D}_v are their respective averaged diffusion coefficients. The concentrations C_i and C_v can be calculated using the usual particle conservation equations, giving

    J^s = \frac{G\,\alpha_i^s}{\sum_n N^n \alpha_v^n}\,(\beta_s - \bar{\beta}),        (8)

where

    \beta_s = \frac{\alpha_i^s - \alpha_v^s}{\alpha_i^s}        (9)

and

    \bar{\beta} = \frac{\sum_n N^n \alpha_i^n \beta_n}{\sum_n N^n \alpha_i^n}.        (10)
Here N^n is the density of the nth type of sinks. Note that the average is weighted by the reaction constant for the interstitials. Equation (8) carries an important message, namely, whether a microstructure component absorbs a net flow of vacancies or interstitials is determined by the difference between the respective reaction constants α_j^n for the two
types of defects. α_j^n depends on a myriad of factors, including the size and shape of the reaction volumes and the associated boundary conditions, the anisotropy of the diffusion, the interaction between reaction partners, the proximity of other reaction partners, the continuous generation of reaction partners, the lifetime and geometric arrangement, and the spatial correlation of the reacting partners. Thus, any behavioral difference between vacancies and interstitials relating to any of the foregoing factors will contribute to the preference of the sink for a particular type of defect. Historically, the effect of the dislocation strain field on the kinetics of such reactions may be the earliest factor studied in irradiation damage theory, leading to the concept of the dislocation bias to explain void swelling (see Ref. [11]). Thus, neglecting diffusion anisotropy, the difference between α_i^s and α_v^s is caused only by the difference in the drift potentials E (i.e., the elastic interaction between the reaction partners) between the vacancies and the interstitials. For example, for edge dislocations, E is larger for the self-interstitials than for the vacancies because of the larger elastic interaction. For voids, E is negligible for both kinds of point-defects. Thus, Eq. (9) gives β_v = 0 for voids and β_D > 0 for edge dislocations, and Eqs. (9) and (10) immediately give β_v − \bar{β} < 0 and β_D − \bar{β} > 0, showing that, in an irradiated crystal with voids and edge dislocations, the voids will grow because they absorb a net flux of vacancies. At the same time, interstitial loops will grow and edge dislocations will climb, producing a volume strain, because they receive a net flux of interstitials. This is the conventional understanding of the mechanism responsible for void swelling. β_s is sometimes referred to as the bias of the microstructure component s.
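The content of Eqs. (8)–(10) can be made concrete with a few lines of Python. The sketch below assumes a hypothetical two-sink system, voids plus edge dislocations, with the dislocations a few percent more efficient at absorbing interstitials; the reaction constants and densities are illustrative numbers, not computed values.

import numpy as np

# Hypothetical two-sink system: index 0 = voids, index 1 = edge dislocations.
alpha_i = np.array([1.00, 1.04])   # reaction constants for interstitials
alpha_v = np.array([1.00, 1.00])   # reaction constants for vacancies
N = np.array([1.0, 1.0])           # sink densities (arbitrary units)
G = 1.0                            # point-defect production rate

beta = (alpha_i - alpha_v) / alpha_i                           # Eq. (9)
beta_bar = np.sum(N * alpha_i * beta) / np.sum(N * alpha_i)    # Eq. (10)

# Net interstitial annihilation rate at each sink, Eq. (8)
J = G * alpha_i / np.sum(N * alpha_v) * (beta - beta_bar)

for name, b, j in zip(("voids", "dislocations"), beta, J):
    print(f"{name:13s} beta = {b:+.4f}  net interstitial flux = {j:+.2e}")
# Voids receive a negative net interstitial flux (i.e., a net vacancy flux,
# hence growth and swelling); dislocations receive a positive one (climb).
# The density-weighted fluxes sum to zero, as they must at steady state.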
1.3. Anisotropic Diffusion and the DAD Effect
Taking into account the crystal structure, the mobilities of the defects produced during irradiation are not always isotropic. This is of prime importance to the understanding of irradiation damage behavior in crystals. Diffusional anisotropy can be a consequence of the non-cubic structure of the host lattice (e.g., the hexagonal close-packed, hcp, structure of the zirconium lattice), of that of the defect itself (e.g., a crowdion in a cubic lattice), or of both. In such cases, the diffusional anisotropy difference (DAD) between the vacancies and interstitials may also introduce a large bias according to Eq. (9). Woo [10] comprehensively reviewed the effects of anisotropic diffusion in the theory of irradiation damage in non-cubic crystals. The bias caused by DAD is independent of the strain field associated with the sink, but may completely dominate the conventional dislocation bias caused by the elastic interaction between the point-defects and the sink. Thus, unlike the usual dislocation bias caused by the elastic interaction, the bias for edge dislocations in non-cubic
metals depends on their line directions in the crystal, and they do not have to be biased towards interstitials. Instead of being weakly biased sinks, as when the effect of anisotropic diffusion is neglected, grain boundaries and surfaces can also be strongly biased towards either vacancies or interstitials, according to their orientations. This large variability of sink biases is a source of the complex behavior of irradiated non-cubic metals. Indeed, Woo and Goesele [12] were the first to suggest the link between anisotropic diffusion and the irradiation behavior of cold-worked zirconium alloys with the hexagonal close-packed (hcp) crystal structure. Subsequently, reviewing the irradiation damage accumulation behavior of hcp zirconium alloys, Woo [13, 14] traced many of their "anomalous" properties to the DAD effect. Anisotropic diffusion also offers a natural explanation for the ordering of the microstructure, such as void-lattice formation and the ordering of dislocation loops [15]. Subsequent atomistic studies, using molecular dynamics and statics, of point-defect diffusion in α-zirconium [16] and α-titanium [17], both of hcp crystal structure, have indeed found evidence of DAD in both cases. The bias calculated from the atomistic anisotropy ratios was found to be consistent with the experimental HVEM loop-growth measurements over a wide range of temperatures [13]. To formalize these concepts in the understanding of irradiation damage, the terms elastic interaction difference (EID) and diffusional anisotropy difference (DAD) were introduced [10]. EID is a major source of driving force for microstructure evolution traditionally considered to cause the dislocation bias, and DAD is a natural candidate responsible for irradiation-induced effects geometrically related to the crystallography of the host lattice, such as irradiation growth in non-cubic metals, void-lattice formation, dislocation-loop ordering, etc.
1.4. Sink Competition and Effective Lifetime
The effects of sink competition and similar topics have been investigated in the modeling of reaction kinetics involving continuously produced migrating defects. Two basic issues are of concern: (a) the limited lifetime between the creation of a migrating point-defect and its annihilation at sinks, and (b) the overlapping diffusion fields, realized through the applied boundary conditions. The first can be easily visualized through the classical probability P of annihilation, by a sink, of a three-dimensional diffuser created in its neighborhood. Polya [18] first found that P is a function of the boundary condition and of the initial distribution. Simpson and Sosin [19] showed that, for the Smoluchowski boundary condition and a δ-function initial distribution, P is a function of the duration t, given by
    P(r, t) = 1 - \frac{R}{r}\,\mathrm{erfc}\left(\frac{r - R}{\sqrt{4Dt}}\right).        (11)
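For concreteness, Eq. (11) is easily evaluated with the complementary error function available in scipy; the sink radius, initial separation and diffusivity below are illustrative assumptions.

import numpy as np
from scipy.special import erfc

def P_eq11(r, t, R, D):
    """Eq. (11) for a 3-D diffuser created at distance r from a sink of
    radius R, under the Smoluchowski boundary condition."""
    return 1.0 - (R / r) * erfc((r - R) / np.sqrt(4.0 * D * t))

# Illustrative values: sink radius 1 nm, defect created 2 nm away, D = 1 nm^2/ns
R, r, D = 1.0, 2.0, 1.0
for t in (0.01, 0.1, 1.0, 10.0, 1e3):
    print(f"t = {t:8.2f}  P = {P_eq11(r, t, R, D):.4f}")
# As t -> infinity, P -> 1 - R/r = 0.5, the classical long-time limit.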
For high sink concentrations, or high recombination, the point-defect lifetime is shortened, and the annihilation probability becomes very large [11]. Indeed, the effect of sink competition can also be seen in Eq. (5), in which the reaction constant α is expressed as a function of the effective lifetime τ_eff, which depends on the total reaction constant, to which α itself contributes. As the total sink strength increases, owing to the increased strength of competing sinks, the effective lifetime decreases, resulting in an increase of the strength of the particular sink under consideration. The effect of overlapping diffusion fields can be seen from the difference between sink strengths derived from various boundary conditions, such as the effective medium, the cellular model, etc. Goesele [9] has discussed this topic in some detail, and the interested reader is referred there.
2. Effects of Interactions and Applied Fields: The Atomistic Picture
The kinetics of reactions among molecules depends on the forces acting on them, which may come from their mutual interaction or from an external field. In earlier studies (see Ref. [20] for a review), the averaged rate of arrival of point-defects at a sink (i.e., a microstructure component) within the continuum theory of drift diffusion is considered, by solving boundary-value problems involving equations of the type

    \frac{\partial C}{\partial t} = \nabla \cdot \left[ D_0 \left( \nabla C + \beta C \nabla E \right) \right],        (12)
where C is the concentration of the point-defects per unit volume, D_0 the diffusivity tensor of the ideal crystal, E the potential energy of the defects in the applied force field (external plus internal), and β the reciprocal of the product of the Boltzmann constant and the absolute temperature. This approach has provided a useful model for determining the reaction constant of a mobile defect with a sink that has a strain field associated with it, e.g., dislocations. The continuum theory behind Eq. (12) neglects the atomistic nature of the migration of the defects and omits important crystalline effects [21], because: (i) the effect of the force field on the symmetry of the diffusivity tensor is absent; (ii) the configuration of the point-defect with which the interaction energy is evaluated is not clear; and (iii) the symmetry of the elementary-jump mechanism has no effect. Using a kinetic-theoretic treatment of lattice jumps, in terms of the atomic jump vectors and jump barriers, Eq. (12) can be rewritten with the diffusivity tensor D given by

    D_{ij} = \frac{1}{2} \sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h})\, \exp\left\{ -\beta \left[ E_s(\hat{h}) - E_e^{\rm eff} \right] \right\}.        (13)
Here the summation is carried over all nearest-neighbor jump directions \hat{h} (the hat in \hat{h} denotes the unit vector in the direction of h), and h_i is the ith component of the position vector of the nearest neighbor to which a jump occurs. E_s(\hat{h}) is the interaction energy of the point-defect in its saddle-point configuration with the external field, which in general depends on the jump direction \hat{h}. λ^0_eff(\hat{h}) is an effective jump frequency in the \hat{h} direction of the defect in the unstressed crystal, obtained by averaging over the different non-equivalent configurations of the point-defect before and after the jump in the \hat{h} direction. At the same time, the drift potential E in Eq. (12) is replaced by an effective E_e^eff, obtained by averaging over the ground-state configurations of the point-defect [see Eq. (36) of Ref. [21]]. Comparison of Eq. (13) with the ideal diffusivity tensor used in Eq. (12),

    (D_0)_{ij} = \frac{1}{2} \sum_{\hat{h}} h_i h_j\, \lambda^0_{\rm eff}(\hat{h}),        (14)
shows that the atomistic theory reduces to the continuum theory if there is no difference between the drift potential evaluated with the point-defect at the saddle-point configuration and that at the ground-state configurations. In other words, the atomistic effects are produced mostly by the difference between the ground-state and saddle-point configurations. The latter in general varies with the jump direction, while the former does not. However, in the case of slightly distorted cubic crystals, Dederichs and Schroeder [21] show that the effect of E_e^eff only enters through the boundary conditions. In this case, writing the drift potentials E_e and E_s (for which we have dropped the superscript eff) as a sum of two contributions from, respectively, the internal (i.e., short-range, subscript i) and external (e.g., uniform, subscript x) strain fields, it can be shown that Eq. (12) can be rewritten with D_0 replaced by D_x, and E by E_{si}. D_x is the same as Eq. (13), expressed only in terms of the external component E_x of the total drift potential. Expanding the exponential in Eq. (13), and retaining terms linear in the external stress, we obtain

    (D_x)_{ij} \simeq (D_0)_{ij} + d_{ijkl}\,\sigma_{kl},        (15)
where d_{ijkl} is the elasto-diffusion tensor [21], and σ_{kl} the applied stress field. Equation (15) shows that the crystal lattice introduces two distinct effects. Firstly, a short-range interaction produces a drift field defined by the point-defect configuration at the saddle-point rather than at equilibrium. Secondly, the lattice distortion introduced by an external stress changes the symmetry of the diffusion field of the point-defect throughout the crystal, and hence its reaction kinetics accordingly.
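The structure of Eqs. (13)–(15) can be reproduced numerically. The sketch below assembles the diffusivity tensor from the nearest-neighbor jump vectors of a simple cubic lattice, first with all saddle-point energies equal (recovering the isotropic form of Eq. (14)) and then with a direction-dependent saddle-point interaction mimicking an applied stress; the jump frequency, lattice parameter, temperature and interaction energies are all illustrative assumptions.

import numpy as np

kB = 8.617e-5            # Boltzmann constant, eV/K
T = 600.0                # temperature, K (assumed)
beta = 1.0 / (kB * T)
a = 3.0e-10              # lattice parameter, m (assumed)
lam0 = 1.0e12            # effective jump frequency lambda^0_eff, 1/s (assumed)

# Nearest-neighbor jump vectors of a simple cubic lattice
jumps = a * np.array([[ 1, 0, 0], [-1, 0, 0],
                      [ 0, 1, 0], [ 0,-1, 0],
                      [ 0, 0, 1], [ 0, 0,-1]], dtype=float)

def diffusivity(E_saddle):
    """Eq. (13): D_ij = 1/2 sum_h h_i h_j lam0 exp(-beta*(E_s(h) - E_e)),
    with the ground-state reference energy E_e set to zero."""
    D = np.zeros((3, 3))
    for h, Es in zip(jumps, E_saddle):
        D += 0.5 * np.outer(h, h) * lam0 * np.exp(-beta * Es)
    return D

# Unstressed crystal: equivalent saddle points give an isotropic tensor (Eq. 14)
D0 = diffusivity(np.zeros(6))
# A hypothetical field lowering the saddle energy of jumps along z by 0.05 eV
# renders the tensor anisotropic:
Es = np.array([0.0, 0.0, 0.0, 0.0, -0.05, -0.05])
D = diffusivity(Es)
print("D0 diagonal:", np.diag(D0))
print("D  diagonal:", np.diag(D))   # Dzz enhanced by exp(0.05*beta), about 2.6x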
The elastic interaction potential V(r) of a point-defect at a point r in an external field ζ_{ij}(r) can be expressed in terms of the atomistic properties of the point-defect [22]:

    V(r) = -P_{ij}\,\zeta_{ij}(r) - \tfrac{1}{2}\,\alpha_{ijkl}\,\zeta_{ij}(r)\,\zeta_{kl}(r),        (16)

where

    \zeta_{ij}(r) = e_{ij}(r) + \varepsilon_{ij}(r).        (17)
Repeated indices imply summation. Here, ε_{ij}(r) is the strain field associated with the microstructure component (short-ranged), and e_{ij}(r) is the strain field caused by an externally applied stress (long-ranged) at the defect position r. P_{ij} is the elastic dipole tensor of the point-defect, which describes its elastic strain field away from the defect centre. α_{ijkl} is the elastic polarizability of the point-defect, and describes the modification of the point-defect strain field caused by the total applied strain field ζ_{ij}(r). These atomistic quantities can be obtained via computer modeling [23]. In the continuum theory, in which the point-defect is modeled as a spherical inhomogeneous elastic inclusion, the interaction arising from P_{ij} corresponds to the first-order size effect and that arising from α_{ijkl} to the second-order inhomogeneity interaction [24], and there is no distinction between the equilibrium and saddle-point configurations. Within the atomistic theory, however, both P_{ij} and α_{ijkl} refer to the saddle-point configuration of the point-defect. To simplify the notation, we drop the superscript used to distinguish between the two different configurations. The elastic interaction of point-defects with a stress field modifies the potential barriers to possible atomic jumps and, subsequently, their reaction constants with a specific microstructure component. Vacancies and interstitials have different dipole tensors P_{ij} and polarizabilities α_{ijkl}, which interact differently with the external field, and produce different reaction constants with the same microstructure component. This is the origin of the so-called dislocation bias [25], as we have discussed. In earlier models, the microstructure component is usually represented as a sink with a surface geometry and boundary conditions that describe the proceedings of the reaction. The interaction between the defect and the microstructure component is considered to be the size-effect interaction −pV, where V is the isotropic volume strain of the point-defect, and p is the hydrostatic stress field of the microstructure component. In the presence of an externally applied stress, the second-order inhomogeneity interaction couples the strain fields of the sink and the external stress through the cross-term contained in the quadratic strain term. The stress dependence of the reaction constant then produces a stress-induced preferred absorption (SIPA) effect, which is often used to explain deformation due to creep during particle bombardment [26].
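A short sketch of Eq. (16) shows how the interaction energy is assembled from the dipole tensor, the polarizability and the strain components; the dipole tensor, the isotropic polarizability and the strain values below are hypothetical numbers chosen only for illustration.

import numpy as np

def interaction_energy(P, alpha, zeta):
    """Eq. (16): V = -P_ij zeta_ij - (1/2) alpha_ijkl zeta_ij zeta_kl."""
    first_order = -np.einsum("ij,ij->", P, zeta)
    second_order = -0.5 * np.einsum("ijkl,ij,kl->", alpha, zeta, zeta)
    return first_order + second_order

# Hypothetical dipole tensor of an anisotropic defect (eV per unit strain)
P = np.diag([12.0, 12.0, 18.0])
# Hypothetical isotropic polarizability alpha_ijkl = A * delta_ij * delta_kl (eV)
A = 5.0
delta = np.eye(3)
alpha = A * np.einsum("ij,kl->ijkl", delta, delta)

# Total strain zeta: 0.1% uniaxial applied strain along z plus a small shear
# contributed by a nearby sink
zeta = np.array([[0.0,  2e-4, 0.0],
                 [2e-4, 0.0,  0.0],
                 [0.0,  0.0,  1e-3]])
print(f"V = {interaction_energy(P, alpha, zeta) * 1e3:.3f} meV")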
In the atomistic picture, the reaction rate depends on the renormalized diffusivity [27] that can be rewritten using Eq. (16) as
$$\tilde{D}_{ij} = \frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, \exp\!\left[\beta\left(P^{s}_{kl}(\hat{h})\,\zeta_{kl} + \tfrac{1}{2}\,\alpha^{s}_{klmn}(\hat{h})\,\zeta_{kl}\,\zeta_{mn}\right)\right], \qquad (18)$$
where the index $s$ denotes a quantity evaluated at the saddle-point configuration in a stress-free crystal. If one uses the approximation that the point-defect is a centre of isotropic dilation or contraction, then only the hydrostatic component of the applied strain field $\zeta_{kl}(\mathbf{r})$ contributes to the interaction $\zeta_{kl} P^{s}_{kl}$. As a result, this interaction, and hence the diffusivity tensor, would be independent of the orientation of the external stress relative to the dislocation. However, when the anisotropy of the point-defect configuration (shape) is taken into account, the total interaction energy must include a contribution due to the shear component of the applied field $\zeta_{kl}(\mathbf{r})$. Using Eq. (17), the renormalized diffusivity tensor in Eq. (18) can be expanded as

$$\begin{aligned}
\tilde{D}_{ij} = \tilde{D}^{0}_{ij}
&+ \beta e_{kl}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, P^{s}_{kl}(\hat{h})
 + \beta \varepsilon_{kl}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, P^{s}_{kl}(\hat{h}) \\
&+ \beta^{2} e_{kl}\varepsilon_{mn}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, P^{s}_{kl}(\hat{h})\, P^{s}_{mn}(\hat{h})
 + \beta e_{kl}\varepsilon_{mn}\,\frac{1}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, \alpha^{s}_{klmn}(\hat{h})
 + O(e_{kl}e_{mn};\ \varepsilon_{kl}\varepsilon_{mn}). \qquad (19)
\end{aligned}$$
The first term is the renormalized diffusivity tensor of the defect in the ideal crystal, i.e., in the absence of an applied stress. The second term has the form $\tilde{d}_{ijkl} e_{kl}$, where $\tilde{d}_{ijkl}$ is the renormalized elasto-diffusion tensor defined by

$$\tilde{d}_{ijkl} = \frac{\beta}{2}\sum_{\hat{h}} h_i h_j\, \lambda^{0}_{\mathrm{eff}}(\hat{h})\, P^{s}_{kl}(\hat{h}). \qquad (20)$$
This term is responsible for the global diffusional anisotropy introduced by the application of an external stress, i.e., one external to the crystal, as discussed earlier in this section. The third term, which does not contain $e_{kl}$, is responsible for the dislocation bias due to the size and shape effects [28]. The fourth and fifth terms couple the strain fields of the external stress and of the sink; they cause the bias of the sink to depend on the orientation of the external stress and hence produce a SIPA-type effect. The fourth term arises from the non-linear dependence of the diffusivity on the interaction energy and is a result of the discrete lattice theory. The fifth term comes from the point-defect polarizability and is responsible for the conventional SIPA mechanism [26]. The effects of both the fourth and fifth terms are second-order, being proportional to the product of two strains.
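To make the structure of Eqs. (18) and (20) concrete, the sketch below performs the sums over a discrete set of jump directions. The <111> jump set, the jump frequency, and a direction-independent saddle-point dipole tensor are all assumptions chosen for illustration only, not values taken from the references.

import numpy as np

# Sketch of Eqs. (18) and (20): sums over discrete jump directions h_hat.
# Jump set, frequency and dipole tensor are hypothetical; P^s is taken to
# be direction-independent here for simplicity.
kT = 0.05                                    # eV (about 600 K)
beta = 1.0 / kT
h_dirs = np.array([[1, 1, 1], [1, 1, -1],
                   [1, -1, 1], [-1, 1, 1]]) / np.sqrt(3.0)  # <111> jumps
lam0 = 1.0e10                                # effective jump frequency (1/s)
Ps = np.eye(3) * 5.0                         # saddle-point dipole tensor (eV)

zeta = np.zeros((3, 3)); zeta[0, 0] = 1e-3   # total strain at the defect

# Eq. (18), keeping only the P^s term in the exponent
D_tilde = sum(0.5 * np.outer(h, h) * lam0 *
              np.exp(beta * np.einsum('kl,kl->', Ps, zeta)) for h in h_dirs)

# Eq. (20): renormalized elasto-diffusion tensor d~_ijkl (sum over jumps a)
d_tilde = 0.5 * beta * lam0 * np.einsum('ai,aj,kl->ijkl', h_dirs, h_dirs, Ps)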
2.1. Implications for the Reaction Kinetics of One-Dimensional Diffusers
It is obvious from Eq. (18) that when small jump barriers are involved, as is typical of the crowdion motion of single interstitials or clusters, $\lambda^{0}_{\mathrm{eff}}$ is close to the ideal lattice frequency, and the elastic interaction may dictate the migratory properties of the defect. Thus, recent MD computer simulation results of Wirth et al. [29] show that clusters of 19 and 37 interstitials in “Fe” have intrinsic migration energies of 0.023 and 0.052 eV, respectively. In comparison, using the infinitesimal loop approximation, the interaction energy of a circular planar interstitial cluster of area $\delta A$ in a stress field $\sigma_{ij}$ is given by $\delta A\,\hat{n}_i b_j \sigma_{ij}$, where $\hat{n}_i$ is the unit plane-normal vector [30]. For a uniaxial stress of 100 MPa acting along $\mathbf{b}$, the interaction energies of the clusters, in the form of prismatic loops, are over 0.1 and 0.2 eV, respectively. Larger clusters have even larger interaction energies, in proportion to their defect content. In such cases, the reaction kinetics between the defect clusters and the microstructure component may be determined, not so much by their intrinsic properties, but rather by their interaction with the stress fields associated with other crystal defects. Trapped at the local minima of the interaction energy, they may continue to evolve in response to the net influx of vacancies or interstitials, similar to immobile clusters. Alternatively, if the interaction experienced by the migrating defect is repulsive and causes the direction of motion to change, its migration may proceed via a three-dimensional percolation mode [31]. Indeed, recent MD simulations show that interstitial clusters in alloys tend to diffuse three-dimensionally, instead of one-dimensionally as in pure single-component crystalline materials [32, 33]. Dudarev, Semenov and Woo [31] estimated that, in practical terms, the impurities generated via radioactive transmutation are already sufficient to reduce the one-dimensional diffusion range down to the sub-micron range, and suggested that the importance of one-dimensional diffusion kinetics in explaining features of microstructure on scales beyond the sub-micron range should not be overemphasized.
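The interaction energies quoted above are easy to check: with the loop normal parallel to $\mathbf{b}$ and a uniaxial stress along $\mathbf{b}$, $\delta A\,\hat{n}_i b_j \sigma_{ij}$ reduces to $\delta A\, b\, \sigma = N \Omega \sigma$, where $N$ is the number of interstitials in the loop and $\Omega$ the atomic volume. A back-of-envelope sketch, using an approximate atomic volume for Fe, follows.

# Back-of-envelope check of the loop-stress interaction energy quoted above,
# E = delta_A * n_i b_j sigma_ij = N * Omega * sigma for n parallel to b and
# a uniaxial stress along b.  Omega is an approximate value for Fe.
EV = 1.602e-19        # J per eV
OMEGA = 11.8e-30      # atomic volume of Fe (m^3), approximate
SIGMA = 100e6         # applied uniaxial stress (Pa)

for n_interstitials in (19, 37):
    energy = n_interstitials * OMEGA * SIGMA / EV
    print(f"N = {n_interstitials:2d}: interaction energy ~ {energy:.2f} eV")
# prints ~0.14 eV and ~0.27 eV, consistent with "over 0.1 and 0.2 eV"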
2.2. Stress-induced Preferred Absorption (SIPA) due to Elastodiffusion
It may seem that the fourth and fifth terms of Eq. (19) are the only ones that can produce a dependence of the reaction constant on the orientation of the microstructure component with respect to the external stress, thus producing a SIPA effect. However, according to Eq. (4), the reaction constant of a geometrically anisotropic reaction volume in an anisotropic diffusion field would also depend on the relative orientations of the two entities. In the present case,
through the operation of the elasto-diffusion term (i.e., the second term on the RHS of Eq. (15) or (19)), an external stress produces an anisotropy in an otherwise isotropic diffusion field (or changes the anisotropy of an intrinsically anisotropic one). This anisotropy can also cause the reaction constant to depend on the geometric orientation of the microstructure component with respect to the external stress, thereby producing a SIPA effect. It is important to note that this is a first-order effect, being proportional to the external stress, in contrast to the second-order effects represented by the fourth and fifth terms in Eq. (19). An important feature of this term is its line-direction dependence, causing dislocations with different line directions to have different biases under the action of an external stress. As a result, under nonequilibrium conditions, the application of a stress will cause edge dislocations to climb with different velocities, according to their line directions. If these dislocations also have different Burgers vectors, atoms will be deposited on, or removed from, various crystallographic planes at different rates, thus producing a time-dependent deformation, i.e., creep and stress relaxation [27]. The drift-diffusion problem has been solved analytically for a straight edge dislocation [28] and an infinitesimal edge dislocation loop [34]. In the presence of an external shear stress, the reaction constant has a much stronger dependence (by an order of magnitude) on the line direction than on the Burgers vector direction, which can be traced completely to the external-stress-induced anisotropy of diffusion. The effects of the second-order terms, i.e., the fourth and fifth terms in Eq. (19), are indeed negligible compared with those of the first-order one (i.e., the second term).
3. Damage by Cascade-Producing Irradiation and Production Bias
The energy transferred during a high-energy recoil event, such as one caused by fast fission or fusion neutrons, causes a large number of atomic displacements in a crystalline solid in a very short time (∼10⁻¹² s). The high concentration of displacement damage produced, and the large energy deposited by the PKA in the small cascade volume, give rise to two effects. Firstly, extensive annealing occurs during the cooling-down phase, allowing only a small fraction of the initial displacements to survive as individual vacancies and interstitials. Secondly, a significant fraction of the remaining interstitials and vacancies form clusters. These “primary clusters” are also segregated in space, such that the primary vacancy clusters (PVCs) are formed near the cascade core and the primary interstitial clusters (PICs) near the cascade periphery. Early investigations of the structure of cascades and sub-cascades concentrated on the intra-cascade clustering of vacancies. Using a diffusion-reaction
formulation to account for clustering, Woo et al. [35] calculated the recombination and clustering of interstitials in a cascade and discovered that a significant fraction of the interstitials produced in a cascade may be immobilized in the form of clusters. Subsequently, numerous molecular dynamics studies of cascades confirmed the intra-cascade formation of interstitial clusters (see related articles, this volume). More importantly, the fraction of vacancies in the PVCs is not exactly the same as that of interstitials in the PICs. Trinkaus et al. [36] reviewed the experimental evidence and concluded that all observations thus far are consistent with the premise of interstitial cluster formation. Thus, the available evidence suggests that under cascade damage conditions, a substantial fraction of surviving interstitials and vacancies are produced in the form of PICs and PVCs, in addition to the Frenkel pairs. Up to the peak swelling temperature, the PICs are thermally stable because of their large binding energy. Unlike the PVCs, which are generally immobile, the larger PICs (containing more than 10 interstitials) usually collapse into platelets, forming dislocation loops that may be glissile or sessile. The glissile ones are one-dimensional diffusers with a very small jump barrier, and can reach and react with the sinks via long-range migration of the cluster as a whole. However, as discussed earlier in this article, the one-dimensional diffusing PICs can easily get trapped in the local fields of other crystal defects, or change their direction of motion when repelled or released from a trapped state. In realistic materials, the direction change may occur sufficiently frequently between their creation and annihilation that their mean free paths are much shorter than the sink separation, and the mobile PICs then effectively behave as three-dimensional diffusers in their reaction kinetics [31]. As a simplifying assumption, the mobile PICs may reasonably be considered simply as a constituent of the interstitial flux in their reaction with sinks.
3.1. Modeling Irradiation Damage Under Cascade Conditions and Production Bias
In view of the specific features discussed in the foregoing, it is important that the characteristics of damage production and annihilation be represented accurately in the modeling of irradiation damage under cascade conditions. Specifically, the extensive intra-cascade recombination and the continuous generation and accumulation of the PICs and PVCs must be adequately accounted for. This requires that the evolution of the PICs and PVCs, the kinetics of their annihilation by the extended defects (e.g., dislocations), and their functions as both sources and sinks of the free defects, be incorporated as an integral and self-consistent part of any irradiation damage theory involving cascades.
As the irradiation damage production process becomes increasingly better understood, irradiation damage modeling has progressed from the standard rate theory (SRT) model [25], to the BEK model [37], to the production bias model (PBM) [38, 39]. The strengths and weaknesses of these models, in terms of their ability to describe comprehensively and consistently the effects of temperature, dose rate and particle type on available experimental observations of swelling, creep, growth, microstructure evolution, radiation-enhanced diffusion (RED) and radiation-induced segregation (RIS), have been analyzed and reviewed by Woo et al. [3]. Interested readers are referred to this article for a critical overview and comparison of these models. In the following, we give a brief introduction to the production bias model, and then concentrate on its further development in the last several years. That a significant fraction of point-defects is retained in the form of immobile clusters immediately suggests that this portion may not participate in the conventional segregation of interstitials and vacancies via preferential attraction of single interstitials to dislocations. Instead, it is now realized that at irradiation temperatures above annealing Stage V, vacancies evaporate from the PVCs due to thermal dissociation, and a large fraction of them enter the medium and contribute to the global vacancy supersaturation. The PICs, on the other hand, are expected to remain thermally stable, at least up to peak-swelling temperatures [22]. Woo and Singh [38, 39] noticed this large asymmetry between the effective production efficiencies of mobile vacancies and interstitials, and found that the resulting “production bias” can provide a large driving force for microstructure evolution. This has led to the introduction of the production bias concept. In general terms, intra-cascade clusters can be both sources and sinks of point-defects. At low temperatures the PVCs and the immobile portion of the PICs (IPICs) are predominantly sinks, while at high temperatures, emission due to thermal dissociation makes them effectively sources. The difference between the PVCs and IPICs in their capacity as point-defect sources varies with temperature, from which a net point-defect flux to sinks, driving microstructure evolution, can be derived, just as from the dislocation bias. However, it is important to realize that this bias does not originate from the reaction kinetics of the point-defect with the sink, but from the difference between the effective production rates of the two types of freely migrating defects. That is, it should not be confused with a sink bias such as the dislocation bias. It is also important to note here that the interstitial clusters considered in the production bias model (PBM) are the IPICs; the mobile interstitial clusters are effectively counted as part of the population of three-dimensionally migrating interstitials annihilated at the sinks [38–40]. The PBM was initially developed to consider steady-state void swelling. The high swelling rate and the sharp temperature dependence in the peak swelling regime were caused, not by the dislocation bias, but by the
additional supply of free vacancies from the thermal dissociation of PVCs. Their model revealed the natural occurrence of the following characteristics of void swelling under cascade damage conditions, all consistent with experimental observations. There are two sharply separated temperature regimes: a low swelling rate at lower temperatures and a high swelling rate at the peak swelling temperature. The swelling mechanism in the high swelling rate regime is dominated by the production bias, whereas in the low swelling rate regime it is determined by the dislocation bias. The transition from the low-temperature (dislocation bias) regime to the high-temperature (production bias) regime is abrupt. The steepness of the temperature dependence in this transition regime is consistent with a relatively high activation energy (∼3 eV), nearly equal to the activation energy for self-diffusion. The swelling rate at all temperatures increases with the amount of interstitial clustering. Similarly, the temperature dependence of irradiation growth in zirconium [41] also shows the existence of two sharply separated temperature regimes: a low growth rate regime at lower temperatures and a high growth rate peak at higher temperatures. The possible connection between this growth behavior and the large excess vacancy supersaturation created by the production bias was investigated, and production bias was found to be a plausible explanation [3]. It is worth noting that the steep temperature dependence of the steady-state swelling rate observed under cascade damage conditions is a cascade effect, and cannot be explained within the standard rate theory (SRT). In cases where point-defects are generated homogeneously in the form of Frenkel pairs, the SRT is applicable, and the temperature dependence at low temperatures is determined by the kinetics of vacancy–interstitial recombination, which becomes important because of the high point-defect concentrations that usually prevail under these conditions. Since the activation energy for the recombination-controlled process is half of the vacancy migration energy (∼0.7 eV in steels), the swelling rate would decrease only slowly with decreasing irradiation temperature [3].
3.2. Difficult Issues Facing the Production-bias Model
Whilst the PBM gives a very good description of the behavior of void swelling, it does not offer a consistent explanation of the evolution of interstitial clusters and the dislocation structure. Indeed, in the temperature range just above annealing stage V, i.e., the peak-swelling regime, the flux of freely migrating vacancies to all sinks is, on average, much higher than that of the interstitials, due to the dissociation of PVCs, which are thermally less stable than the PICs. It is not obvious how interstitial loops can nucleate and
grow, how the swelling strain is realized, and how the PICs are removed so that they do not accumulate and suppress all driving forces for microstructure evolution. Another issue is connected with the experimental observation that in well-annealed metals with a dislocation density of ∼10¹¹ m⁻², neutron irradiation at the peak swelling temperatures yields a high swelling rate of ∼1%/dpa [42, 43] at doses less than 10⁻² dpa. At the same time, a heterogeneous and segregated microstructure forms by self-organization. Dislocation bias has not been able to explain the observed swelling. Although the swelling rate predicted by the PBM using the mean-field approximation is much higher, it is still far too small to explain the measured values. The development of heterogeneous void swelling observed near grain boundaries [44, 45] in both pure metals and concentrated alloys is another challenge to the PBM. Attempts to explain the formation of the ordered microstructure within the framework of SRT using the concept of dislocation bias were met with failure. Trinkaus et al. [46] suggested that this may also be caused by the long-range transport by the one-dimensional random walk of small interstitial clusters along the close-packed atomic directions, often observed in MD simulations (see related articles in this volume). This suggestion was subsequently investigated in further detail by Dudarev [47]. The predicted microstructure varied significantly according to the assumed dimensionality of the diffusion of the PICs, and the agreement of the calculated swelling profile with experiments was found to improve if the diffusion was assumed to be one-dimensional. Indeed, Singh [48] speculated that this offered direct evidence of one-dimensional diffusion kinetics of the mobile PICs generated under cascade damage conditions. However, the recent discovery of this phenomenon in concentrated alloys [49] weakened this speculation considerably. Indeed, as explained earlier in this article, pure one-dimensional migration, without interruption, over distances large compared with the sink separations is hard to justify in the presence of a high concentration of trapping or deflection centers. When the Burgers vector changes produced by the interruptions are sufficiently frequent, the reaction kinetics becomes effectively three-dimensional. The one-dimensional kinetics argument is further weakened by the fact that the capture probability of a one-dimensional random walker by sinks is much smaller than that of a three-dimensional one, so that long-range one-dimensional transport of clusters is basically inconsistent with the formation of a heterogeneous microstructure in a volume with low dislocation and cluster densities. Above all, if a large fraction of the interstitials generated by irradiation is removed from the bulk to the grain boundaries, it would be difficult for the interstitial loops to nucleate and grow. This is particularly true at elevated temperatures, when there is a net vacancy flux because of the vacancy emission from the thermally unstable vacancy clusters. Without the nucleation
and growth of interstitial loops, the microstructure evolution at high doses is difficult to understand. Experimental observations of the behavior of void evolution during irradiation are also inconsistent with the operation of the reaction kinetics of one-dimensional diffusers (see the review by Woo [6]). Thus, the capture probability of a one-dimensionally diffusing cluster by the voids is proportional to the square of the void radius, whereas for free vacancies this probability is only linearly proportional to the void radius. If a significant portion of the PICs were able to diffuse one-dimensionally over distances of several microns, as is assumed, then void swelling would have to saturate when the void sizes become large enough. Indeed, the calculated swelling rate in copper based on the assumption of one-dimensionally diffusing clusters is reduced severalfold as the voids grow over the dose range from less than 0.01 dpa to 1 dpa. Experimentally, however, from a dose of 5 dpa up to doses of more than 100 dpa, voids in copper exhibit very robust swelling rates of about 0.5% per dpa in the temperature range 370–430 °C, and there is no sign of swelling saturation. In another respect, the interaction of voids with interstitial clusters migrating one-dimensionally along the close-packed directions should promote the formation of a void lattice at a sufficiently large irradiation dose, because voids aligned along these directions occupy the most favorable spatial positions for growth [15]. No void lattice formation has been observed in copper either. The apparent difficulty in understanding large-scale heterogeneous void swelling without one-dimensional diffusion was considered recently by Dudarev et al. [31], who showed that this phenomenon can be explained according to the PBM, within the framework of three-dimensional diffusion-reaction kinetics of defects, if one assumes a heterogeneous dislocation structure that recognizes the denudation of dislocations next to the grain boundary.
4. Beyond the Mean-field Theory: Stochastic Effects
The issues encountered in the foregoing are of fundamental importance to irradiation damage modeling, which, thus far in this article, has adopted the spatially and temporally averaged picture, i.e., the mean-field approximation. A more realistic description of the microstructure evolution under neutron and heavy-ion irradiation must also recognize the strongly stochastic nature of this problem, derived from the discrete nature of the crystal lattice. This is particularly true when considering the evolution of the small interstitial clusters and the nucleation of interstitial loops and voids, and in situations where a random microstructure self-organizes into a spatially ordered structure. Indeed, under cascade-damage conditions, point-defects and their clusters are produced randomly in time and space, and in discrete packets. The statistical nature of diffusion jumps and cascade initiation introduces fluctuations in
the point-defect arrival rates at the sinks. In processes that involve only a small number of point-defects, such as the evolution of small point-defect clusters during nucleation events, it is intuitively clear that the fluctuations are important. Thus, an interstitial cluster that has been annihilated by a wave of vacancies cannot be revived by interstitials that arrive afterwards, even though the interstitial wave may be much bigger. Within the mean-field picture, on the other hand, the same interstitial cluster would have survived and grown, resulting in a largely over-estimated number density of interstitial clusters, which may completely distort the behavior of the irradiated system. To deal with the problem of temporal fluctuations and spatial variations in defect production and microstructure development, many authors formulate their problems using more advanced kinetic theories such as the master equation or the Fokker–Planck equation. Monte Carlo simulation techniques are also sometimes used for problems in which a limited scope in time, space and the number of sink types is not important. Most calculations performed before the PBM take into account only the stochastic fluctuations due to the randomness of the point-defect jumps, and not those due to the randomness of the location, time and size of cascade initiation. The cascade diffusion theory of Mansur et al. [50], which took into account the space and time variation of cascade initiation, but not the randomness due to the migratory jumps of the point-defects, nor the variation among defect contents of different cascades, is an exception. Nevertheless, the results of this work suffer from a flaw in their statistical treatment [51]. To properly resolve the difficult issues faced by the PBM, there is no doubt that the effects of stochastic fluctuations have to be rigorously explored. An attempt to take on this challenging task was made by Semenov and Woo, who considered these issues in a series of papers published between 1993 and 2003. This work is complex, but has been partially reviewed by Woo [6]. To avoid repetition, the reader is referred to this review and the references therein for the statistical analysis of the point-defect production, transport and annihilation under cascade damage conditions, on which the formulation of the evolution equation is based. The following concentrates on the application of the stochastic theory to microstructure nucleation, and on the development of the heterogeneous microstructure due to the loss of stability of the homogeneous one.
4.1. The Evolution Equations with Cascade Effects
The evolution equations of the small clusters are of central importance to the proper formulation of the PBM. Starting from the statistical description of the arrival at sinks of vacancies and interstitials from randomly initiated cascades, Semenov and Woo [52] derived the full kinetic equation for the
978
C.H. Woo
distribution function of the net number of vacancies accumulated in a sink. Various levels of approximation applied to this equation result in different forms that can be identified with various equations used in the literature. Thus, when all direct effects of stochastic fluctuations, spatial or temporal, are ignored, the kinetic equation reduces to the conventional rate equations. If only the probabilistic nature of cascade initiation is neglected, the kinetic equation reduces to the conventional master equation, frequently used to describe microstructure evolution under continuous irradiation. When the stochastic process can be approximated by a Markov process, i.e., when the fluctuations can be assumed to be delta-correlated, statistical cumulants of order higher than two (i.e., k > 2) can be neglected, and the familiar Fokker–Planck equation is obtained, which takes into account both the migratory jump-induced and the cascade-induced fluctuations. In this case the probability distribution functions of the stochastic variables concerned are approximated by the appropriate Gaussian distributions. Based on the general kinetic equation, Semenov and Woo [52] analyzed the relative importance of the cascade-induced fluctuations, in comparison with the migratory jump-induced fluctuations, and concluded that cascade-induced fluctuations play a much more important role than previously realized. For example, when they are absent, the conventional master equation gives a description qualitatively similar to the mean-field approximation: the total cluster density is typically much too high, and the interstitial content of the matrix much too low, resulting in a swelling rate that is significantly underpredicted. Other theoretical calculations [53] and analyses of the experimental data on radiation-enhanced diffusion [54] came to the same conclusion. In contrast, by properly taking into account the cascade effects, the Fokker–Planck equation approach gives a more reasonable picture. Indeed, the neglect of cascade-induced fluctuations produces a large increase in the cluster density, causing a seven-fold drop of the swelling rate in the case of steels in the peak-swelling regime, from about 1%/NRT dpa to 0.15%/NRT dpa [55]. Taking into account the cascade-induced fluctuations at the initial stages of irradiation leads to an order-of-magnitude reduction in the total sink strength. As expected, inclusion of the fluctuations due to random cascade initiation is also important for a proper description of the nucleation of interstitial loops and voids, as we shall see in the following.
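As a minimal illustration of the Fokker–Planck description referred to above, the sketch below integrates a one-dimensional equation ∂f/∂t = −∂(vf)/∂x + ½ ∂²(Df)/∂x² for the distribution of the net vacancy content of a cluster; the constant drift and diffusion coefficients are hypothetical stand-ins for the jump- and cascade-induced terms of the full theory.

import numpy as np

# Explicit finite-difference sketch of a 1-D Fokker-Planck equation
#   df/dt = -d(v f)/dx + (1/2) d^2(D f)/dx^2
# for the distribution f of the net vacancy content of a cluster.
# v (mean drift) and D (fluctuation strength) are hypothetical constants.
nx, dx, dt = 200, 1.0, 0.01
x = np.arange(nx) * dx
f = np.exp(-0.5 * ((x - 50.0) / 5.0) ** 2)   # initial size distribution
f /= f.sum() * dx

v, D = -0.5, 4.0                             # net shrinkage bias, fluctuations

for _ in range(1000):
    vf, Df = v * f, D * f
    drift = -(np.roll(vf, -1) - np.roll(vf, 1)) / (2.0 * dx)
    diff = 0.5 * (np.roll(Df, -1) - 2.0 * Df + np.roll(Df, 1)) / dx ** 2
    f += dt * (drift + diff)
    f[0] = f[-1] = 0.0                       # absorbing ends: dissolution/escape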
4.2. Nucleation of Voids and Dislocation Loops During Cascade Damage
The conventional approach to modeling void nucleation under irradiation is based on the classical description of the formation of small precipitates in a supersaturated solution, in which small thermally unstable new-phase
embryos continuously form and redissolve in the supersaturated solution, but some can grow beyond the critical size via stochastic fluctuations. Beyond the critical size, the nuclei of the new phase become thermally stable and, on average, can grow directly from the supersaturated solution, without the help of the stochastic fluctuations. At this stage, the nucleation process is considered complete. In this model, nucleation cannot occur within the mean-field theory. In earlier models, only statistical fluctuations produced by random point-defect jumps are considered, and the dislocation bias is the only driving force for the evolution of the damage microstructure. Using the Fokker–Planck equation to account for stochastic effects, Semenov and Woo [56, 57] applied the classical nucleation model to both voids and interstitial loops, also including contributions from the random initiation of cascades and the emission of vacancies from voids. In the classical nucleation model, void nucleation essentially constitutes the growth of small thermally unstable void embryos to the critical void size, which can only occur via the stochastic fluctuations of the point-defect fluxes received by the void embryo. Three sources of fluctuations of the point-defect fluxes have been included: the diffusive jumps, the cascade initiation, and the vacancy emission from the void. At elevated temperatures and when the sink density is low, the fluctuation of vacancy emission from voids is the dominant factor. The effect of the cascade fluctuations is important only when the total sink strength for point-defects is high, e.g., > 10¹⁵ m⁻². Application of the model to void nucleation in neutron-irradiated annealed pure copper and molybdenum at elevated temperatures shows reasonable agreement with the experimental observations. The nucleation of interstitial loops can be considered along similar lines. As mentioned earlier in this article, in the temperature range just above annealing stage V (i.e., the peak swelling regime), the dissociation of primary vacancy clusters (PVCs) produces a net flux of freely migrating vacancies to all sinks. Despite the net vacancy flux they receive, steady growth of faulted loops may be achievable through the absorption of smaller interstitial clusters and loops by coalescence. Indeed, the numerical calculation of Semenov and Woo [55] showed that the absorption of small interstitial clusters and loops could provide both the positive growth rate of the larger loops and the sufficiently high climb rate of network dislocations to produce a swelling rate in agreement with experimentally observed values. However, the probability of a loop finding a neighboring cluster with which it can combine diminishes as the loop size decreases, and vanishes for the smallest immobile interstitial clusters. Thus, this mechanism can only account for the growth of sufficiently large interstitial loops. The smaller loops (or clusters) can only grow through stochastic fluctuations, similar to the case of sub-critical voids. From the foregoing description, the resemblance between the nucleation of voids and of Frank loops at elevated temperatures is clear. Both vacancy and
interstitial clusters are directly produced in collision cascades, and critical sizes exist for both. Both the average sub-critical void and loop embryos shrink during the nucleation processes, and nucleation can only be accomplished via stochastic fluctuations. Thus, the nucleation of voids and of interstitial loops from primary clusters can be treated within the same framework of the classical theory of nucleation. Indeed, Semenov and Woo [58] derived an analytic expression for the nucleation probability, applicable to both voids and Frank loops at elevated temperatures. Based on the classical nucleation model, Semenov and Woo [57] showed that, despite receiving a net vacancy flux from dissociating vacancy clusters at elevated temperatures, a small fraction of the primary interstitial clusters may still grow to the critical size via stochastic fluctuations. The probability that this is achieved increases exponentially with a reduction in the mean loop shrinkage rate and/or an increase in the strength of the stochastic fluctuations. The contribution from the cascade-induced fluctuations increases the nucleation probability by several orders of magnitude. The calculated rate of interstitial loop nucleation based on the derived nucleation probability is sufficiently high to account for the experimentally observed number densities of interstitial loops at a dose of one NRT dpa. The continuous regeneration of network dislocations in the present theory produces a swelling rate that agrees very well with the experimental value, which is of the order of 1%/NRT dpa.
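The role of fluctuations in carrying a shrinking embryo past the critical size can be caricatured by a biased random walk (a gambler's-ruin process). The toy Monte Carlo below, with entirely arbitrary parameters, reproduces the qualitative result stated above: the nucleation probability rises exponentially as the mean shrinkage bias is reduced.

import random

# Toy Monte Carlo: an embryo of size n performs a biased random walk
# (p_grow < 0.5 encodes the mean shrinkage) and "nucleates" if it reaches
# n_crit before dissolving at n = 0.  All parameters are arbitrary.
def nucleation_probability(p_grow, n_start=4, n_crit=30, trials=100_000):
    successes = 0
    for _ in range(trials):
        n = n_start
        while 0 < n < n_crit:
            n += 1 if random.random() < p_grow else -1
        successes += (n == n_crit)
    return successes / trials

for p in (0.42, 0.45, 0.48):
    print(f"p_grow = {p:.2f}: P(nucleation) ~ {nucleation_probability(p):.2e}")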
4.3. System Instability and Heterogeneous Microstructure Development
Many of the contentious issues facing the PBM arise in cases in which the microstructure is heterogeneous. In this regard, one must realize that spatial homogeneity is an integral part of the description of a system within the mean-field approximation. A system with a heterogeneous microstructure is basically inconsistent with the mean-field approximation of PBM. Indeed, the self-organization of a homogeneous structure of higher entropy into an ordered structure with lower entropy is a strong indication of the instability of the former. Thus, a solution based on the spatial homogeneity assumption may exist in the mean-field approximation, but may not be stable when the statistical nature of the system is explicitly taken into account. The average capture probability of a point-defect generated in a volume V in the neighborhood of a particular sink can be calculated. It can be verified that only about 20% of the point-defects created inside the characteristic sink volume are annihilated at that sink. Since the average steady-state flux of point-defects to a sink is equal to the rate of generation of such point-defects in the characteristic sink volume, there must then be a continuous exchange of
point-defects between neighboring volumes. As a result, concentration fluctuations in V do not cancel each other completely. This exchange of point-defects between neighboring regions gives rise to the classical V⁻¹-dependence of the variance of the point-defect concentrations. The random cascade initiation produces additional variations. The assumption of spatial homogeneity requires detailed balancing to be observed, which, according to the foregoing description, cannot be assumed a priori. In most cases, nevertheless, the magnitude of the variance over a meaningful volume is small and bounded, i.e., stable with respect to small perturbations, and the error of the assumption is negligible. Thermodynamically, the tendency to keep the entropy production at a minimum acts to maintain the stability of the spatial homogeneity of the irradiated system [59]. However, in a far-from-equilibrium situation, there are cases in which the spatially homogeneous system is only conditionally stable, and small initial deviations from the condition of detailed balancing will grow beyond all bounds if the conditions are not met [60]. The assumption of spatial homogeneity is used implicitly in most calculations in the study of irradiation damage. An interesting case, in which the breakdown of this assumption may occur, can be found in the cascade-irradiation-induced microstructure evolution of a fully annealed metal at low dose. In the absence of dislocations, both the absorption of primary clusters by dislocations and the effects of dislocation bias are insignificant. Assuming spatial homogeneity, the temporal evolution of the microstructure has been considered using both the mean-field approximation and the Fokker–Planck equation approach [6]. At low doses, and at temperatures at which the vacancy clusters are thermally unstable, vacancy accumulation takes place at the voids, while interstitials accumulate in the PICs that are continuously produced in the cascades. Assuming detailed balancing of the point-defect fluxes, the solution of the kinetic equations at low doses must satisfy matter conservation, which requires the local equality of the void swelling rate and the rate of interstitial accumulation in the PICs. In this scenario, void growth will continue until the PICs become the dominant sink and act predominantly as recombination centres. Statistically, we have seen that the number density of PICs must vary, and higher void swelling can be expected in regions of lower PIC density. However, the a priori assumption of spatial homogeneity does not allow this to happen. When the spatially homogeneous solution is unstable, the relation between the local swelling rate and the rate of interstitial accumulation in PICs ceases to hold, and the local concentration of point-defects over a sizable region may deviate drastically from the global average. The description of the evolution of the local microstructure under the assumption of spatial homogeneity then breaks down.
segregated, Semenov and Woo [61] analyzed the stability conditions of the spatially homogeneous solution of a system of primary interstitial clusters and small voids. They treated the IPICs reduced below a minimum size as mobile three-dimensional random walkers. With the constraint of spatial homogeneity on the point-defect concentrations removed, and replaced by the diffusion-based matter-conservation equation, the spatially homogeneous solution was found to be only conditionally stable. When the homogeneous void growth rate becomes sufficiently low, due to the increase under irradiation of either the void concentration or the average void size, the system becomes unstable. At the onset of the instability, the microstructure starts to evolve heterogeneously. Since there is a wide and continuous spectrum of growing spatial modes, the developing structure is not spatially periodic. The instability develops with increasing spatial scales, in agreement with the experimental observation. It also follows from this investigation that the characteristic scale of the spatial heterogeneity increases significantly with temperature, from a few microns at 525 K to tens of microns at 625 K. Physically, void growth and nucleation in the spatially homogeneous stage lead to the accumulation of small clusters and the reduction of the homogeneous void growth rate. At the same time, small inhomogeneous deviations of the void size change the rate of vacancy emission from the voids, and this feeds back to produce further enhancement of the variation of the void sizes. When the void-growth rate becomes sufficiently low during the homogeneous stage of the evolution, any increase in the amplitude of the inhomogeneous variations of the void sizes can no longer be damped out by the net vacancy flux into the voids. This produces an unstable increase in the variation of void sizes in different regions, so that voids may become sub-critical and shrink away in some regions, while in other regions their size may grow beyond the saturation value allowed by the corresponding homogeneous solution. Both the void swelling rate and the interstitial accumulation rate in clusters become location-dependent due to the development of heterogeneity. This means that the shrinkage rate of the interstitial clusters becomes location-dependent as well. Consequently, the flux of small mobile interstitial clusters between adjacent spatial regions does not observe detailed balancing as in the homogeneous case, due to the difference in the cluster shrinkage rates in different regions. This is consistent with an earlier result of Semenov and Woo [60], that the outflow of small mobile clusters from a volume V with a size of 10–20 average cavity spacings is sufficient to totally balance the net vacancy flux in an adjacent region with a characteristic width of the order of 0.1 µm. In this earlier work, the possibility of instability of the spatially homogeneous solution was explored, and the physical nature of the instability studied. It was found that the escape of the mobilized clusters from a finite volume, if not counterbalanced by an equal amount of mobile PICs from the neighboring volumes, leads to the accumulation of vacancies, and to the enhancement of void swelling in
this volume. At the same time, in the adjacent volumes the influx of interstitials in small mobile PICs neutralizes the net vacancy flux towards primary interstitial clusters. This prevents the clusters from shrinking, thus reducing the escape probability of the PICs and enhancing the accumulation of interstitials (in clusters) in these regions. The entire process gives rise to a positive-feedback system that leads to instability.
5. Summary and Outlook
This article presents an overview of recent advances in the modeling of irradiation-damage accumulation, recognizing the limitations of the continuum theory of point-defect migration and the weakness of a mean-field approach in treating the kinetics of reactions involving small clusters. Focusing on the improved insight provided by the atomistic picture of the crystal lattice and by statistical considerations of the reaction kinetics, we trace the progress from the standard rate theory model to the production bias model, and from the mean-field theory to the stochastic theory, taking into account progressively more realistic features of the irradiation-damage process with an increasing degree of sophistication. Consideration of reaction kinetics in a diffusive medium from an atomistic point of view leads to the discovery of a powerful sink bias arising from the diffusional anisotropy difference (DAD) between the interstitial- and vacancy-type defects, which adds a new dimension to the understanding of the behavior of irradiated crystals. Within the atomistic picture, it is clear that the DAD effects depend on fundamental properties of the defects at both the ground state and the saddle points, such as the direction-dependent jump distance, the jump barriers, and the configurations in terms of the corresponding dipole tensors. With advances in computational hardware and software, the dynamic and static properties of such defects can be obtained readily via atomistic simulation. Further work in this direction will yield information that may go a long way towards the understanding of the complex behavior of metals and alloys under irradiation, particularly for cases in which either the defect or the host crystal has non-cubic symmetry. Thus, the behavior of one-dimensional diffusers in the strain field of impurities and small dislocation loops, or near a grain boundary, should be investigated in the context of trapping and detrapping, recombination and coalescence. Such information is of fundamental importance to irradiation-damage modeling. The intrinsic and external-field-induced anisotropy of diffusion of vacancies and interstitials, and of their clusters, in hexagonal metals of different c/a ratios should also be obtained for an understanding of the systematic irradiation damage behavior of these metals. In the context of DAD, we also consider the effects of an externally applied stress on the point-defect kinetics in irradiated metals. The most
important effect that emerges in this regard, within the atomistic picture, arises from the change of the symmetry of the point-defect diffusional field in response to an externally applied stress. Such a change introduces diffusional anisotropy (elasto-diffusion) in an isotropically diffusing species, or changes the anisotropy of an anisotropically diffusing one. In both cases the reaction constants between point-defects and sinks become a function of the sink orientation with respect to the principal directions of the applied stress. Thus, the bias differential is changed among dislocations with different line directions, or among grain boundaries with different surface normals. This has a profound effect on the development and evolution of the microstructure and the associated macroscopic dimensional changes. Being a first-order effect, this stress-induced diffusional anisotropy is likely to play a major role in void swelling and irradiation creep mechanisms, and in the coupling between them. In this context, the effect of stress on the nucleation of an anisotropic dislocation structure, and of voids, should be considered in irradiation deformation studies using a proper nucleation theory that takes full account of the stochastic effects. It is important to recognize the strong effects arising from the stochastic nature of the reactions between point-defects and small clusters, and to take into account the intrinsic statistical variation of concentrations and size distributions. One must consider the statistical nature of point-defect production, transport, and annihilation at sinks. Of special importance, the subtle effects of fluctuations must be taken into account when considering the behavior of small clusters, which are among the most important, yet most obscure, components of the microstructure. A cluster that has been annihilated cannot be revived afterwards, regardless of whether the time-averaged point-defect flux dictates that it should grow or shrink. Indeed, the great majority of small clusters will shrink away in this manner, leaving only a very small proportion of survivors, of which some will grow much faster than others. Coarsening, whether of the distribution of voids or of interstitial loops, is one of the most important stochastic effects that result. Consideration of this effect is crucial in nucleation models, and in models in which small clusters form an essential component of the sink. An additional important point that must be appreciated is that, relative to the diffusive jump-induced fluctuations usually considered in most calculations, the cascade-induced fluctuations have a much larger effect. As a result, more recent calculations found that stochastic effects play a much more important role than previously thought, that is, before the real characteristics of cascade damage were appreciated quantitatively via the establishment of the production bias model. Calculations involving the solution of the Fokker–Planck equations, however, are complex. To facilitate easy application, a satisfactory way of including these effects, within a reasonable approximation, in a simple model such as the rate theory is desirable, but has yet to be accomplished.
Another important issue that is often overlooked is that spatial homogeneity is an integral part of a system describable within the mean-field approximation. A heterogeneous microstructure is basically inconsistent with the mean-field approximation. The self-organization of a homogeneous structure of higher entropy, preferred by near-equilibrium thermodynamics, into an ordered structure with lower entropy is a strong indication of the instability of the former. Thus, when the statistical nature of the system is explicitly taken into account, the stability of a spatially homogeneous solution cannot be taken for granted, but has to be established. Otherwise, flawed conclusions could result.
Acknowledgment

This project was supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region (PolyU 5177/02E, 5167/01E, 5173/01E).
References

[1] M.J. Norgett, M.T. Robinson, and I.M. Torrens, ASTM Standards E 521–83, 1983.
[2] W. Schilling and H. Ullmaier, Mater. Sci. Technol., 10B, 179, 1994.
[3] C.H. Woo, B.N. Singh, and A.A. Semenov, J. Nucl. Mater., 239, 7, 1996.
[4] R. Bullough, Proceedings Conference on Dislocations and Properties of Real Materials, Royal Society, London, The Institute of Metals: London, p. 382, 1985.
[5] N.M. Ghoniem, Phys. Rev. B, 39, 11810, 1989.
[6] C.H. Woo, J. Computer-Aided Mater. Des., 6, 247, 1999.
[7] M. von Smoluchowski, Z. Phys. Chem., 92, 129, 1917.
[8] U. Goesele and A. Seeger, Philos. Mag., 34, 177, 1976.
[9] U.M. Gösele, Prog. React. Kin., 13, 63, 1984.
[10] C.H. Woo, J. Nucl. Mater., 159, 237, 1988.
[11] U. Goesele, J. Nucl. Mater., 78, 83, 1978.
[12] C.H. Woo and U. Goesele, J. Nucl. Mater., 119, 119, 1983.
[13] C.H. Woo, Radiat. Eff. Defects Solids, 144, 145, 1998.
[14] C.H. Woo, J. Nucl. Mater., 276, 90, 2000.
[15] C.H. Woo and W. Frank, J. Nucl. Mater., 137, 7, 1985.
[16] C.H. Woo, Hanchen Huang, and W.J. Zhu, Appl. Phys. A, 76, 101, 2003.
[17] M. Wen, C.H. Woo, and Hanchen Huang, J. Computer-Aided Mater. Des., 7, 97, 2000.
[18] G. Pólya, Math. Annalen, 84, 149, 1921.
[19] H.M. Simpson and A. Sosin, Radiat. Eff., 3, 1, 1970.
[20] R. Bullough, D.V. Wells, J.R. Willis, and M.H. Wood, Dislocation Modeling of Physical Systems, Pergamon Press, New York, p. 116, 1980.
[21] P.H. Dederichs and K. Schroeder, Phys. Rev. B, 17, 2524, 1978.
[22] H. Ullmaier and W. Schilling, Physics of Modern Materials, International Atomic Energy Agency, Vienna, 301, 1980.
[23] M.P. Puls and C.H. Woo, J. Nucl. Mater., 139, 48, 1986.
[24] A.H. Cottrell, Report on Conference on the Strength of Solids, The Physical Society, London, 1948.
[25] A.D. Brailsford and R. Bullough, J. Nucl. Mater., 44, 121, 1972.
[26] R. Bullough and J.R. Willis, Philos. Mag., 31, 855, 1975.
[27] C.H. Woo, J. Nucl. Mater., 120, 55, 1984.
[28] B.C. Skinner and C.H. Woo, Phys. Rev. B, 30, 3084, 1984.
[29] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, J. Nucl. Mater., 276, 33, 2000.
[30] F. Kroupa, Philos. Mag., 7, 783, 1962.
[31] S.L. Dudarev, A.A. Semenov, and C.H. Woo, Phys. Rev. B, 67, 094103, 2003 and Phys. Rev. B, 70, 094115, 2004.
[32] J. Marian, B.D. Wirth, J.M. Perlado, G.R. Odette, and T. Diaz de la Rubia, Phys. Rev. B, 64, 094303, 2001.
[33] J. Marian, B.D. Wirth, A. Caro, B. Sadigh, G.R. Odette, J.M. Perlado, and T. Diaz de la Rubia, Phys. Rev. B, 65, 144102, 2002.
[34] C.H. Woo and E.J. Savino, J. Nucl. Mater., 116, 17, 1983.
[35] C.H. Woo, B.N. Singh, and H.L. Heinisch, J. Nucl. Mater., 174, 190, 1990.
[36] H. Trinkaus, V. Naundorf, B.N. Singh, and C.H. Woo, J. Nucl. Mater., 210, 244, 1994.
[37] R. Bullough, B.L. Eyre, and K. Krishan, Proc. R. Soc. A, 346, 81, 1975.
[38] C.H. Woo and B.N. Singh, Phys. Stat. Sol. (b), 159, 609, 1990.
[39] C.H. Woo and B.N. Singh, Phil. Mag. A, 65, 889, 1992.
[40] B.N. Singh and A.J.E. Foreman, Phil. Mag. A, 66, 975, 1992.
[41] R.P. Tucker, V. Fidleris, and R.B. Adamson, ASTM STP 804, 427, 1984.
[42] B.N. Singh, T. Leffers, and A. Horsewell, Phil. Mag. A, 53, 233, 1986.
[43] T. Leffers, B.N. Singh, A.V. Volobuyev, and V.V. Gann, Phil. Mag. A, 53, 243, 1986.
[44] C.W. Chen and R.W. Buttry, Radiat. Eff., 56, 219, 1981.
[45] B.N. Singh, T. Leffers, W.V. Green, and S.L. Green, J. Nucl. Mater., 105, 1, 1982.
[46] H. Trinkaus, B.N. Singh, and A.J.E. Foreman, J. Nucl. Mater., 206, 200, 1993.
[47] S.L. Dudarev, Phys. Rev. B, 62, 9325, 2000.
[48] B.N. Singh, Radiat. Eff. Defects Solids, 148, 383, 1999.
[49] S. Zinkle and B.N. Singh, J. Nucl. Mater., 283–287, 306, 2000.
[50] L.K. Mansur, A.D. Brailsford, and W.A. Coghlan, Acta Metall., 33, 1407, 1985.
[51] A.A. Semenov and C.H. Woo, J. Nucl. Mater., 233–237, 1045, 1996.
[52] A.A. Semenov and C.H. Woo, Appl. Phys. A, 69, 445, 1999.
[53] H. Wiedersich, J. Nucl. Mater., 205, 40, 1993.
[54] H. Trinkaus, B.N. Singh, and C.H. Woo, J. Nucl. Mater., 212–215, 18, 1994.
[55] A.A. Semenov and C.H. Woo, Appl. Phys. A, 67, 193, 1998.
[56] A.A. Semenov and C.H. Woo, Phys. Rev. B, 66, 024118, 2002.
[57] A.A. Semenov and C.H. Woo, Philos. Mag., 83, 3765, 2003.
[58] A.A. Semenov and C.H. Woo, J. Nucl. Mater., 323, 192, 2003.
[59] G. Nicolis and I. Prigogine, Self-organization in Nonequilibrium Systems, John Wiley & Sons, Inc., New York, 1977.
[60] A.A. Semenov and C.H. Woo, Appl. Phys. A, 73, 371, 2001.
[61] A.A. Semenov and C.H. Woo, Appl. Phys. A, 74, 639, 2002.
2.28 CASCADE MODELING

Jean-Paul Crocombette
CEA Saclay, DEN-SRMP, 91191 Gif/Yvette cedex, France
Cascade modeling deals with the effect of a high-velocity particle impact on a solid. Such simulations are of primary interest to the nuclear engineering community, where they are major tools for analyzing the behavior of materials subjected to internal or external irradiation; the simulation tools were originally designed by, and are still used in, this community. However, these simulations are also of interest for implantation studies in microelectronics, as well as for sputtering and, more generally, surface-modification studies.
1. Introduction
When an energetic particle penetrates a solid, it loses its energy through a series of elastic nuclear collisions and through excitations of the electronic system. The latter dominate in the high-energy (MeV) range, whereas the former are most important at smaller energies (below a few tens or hundreds of keV). Elastic collisions set into motion target particles, which can in turn displace neighboring atoms, thereby creating a displacement cascade. Similar cascades appear when a radioactive atom inserted in the solid decays. Because not all displaced atoms are able to return to their original or equivalent sites, a cascade results in the creation of vacancies and self-interstitials and in the mixing of the atomic structure. The accumulation of such defects under irradiation eventually leads to the modification of the microstructure and properties of the material. A general review of damage in irradiated materials can be found in Ref. [1]. Atomistic simulations are essential tools for analyzing cascades, as they provide a description of the cascade processes and a detailed view of the primary state of damage. In this part, we will focus on the atomic description of the cascade; the microstructure and property changes under irradiation are addressed in another section of this book. Cascade descriptions start with the
definition of the primary knock-on atom (PKA), which is defined as the first atom set into motion. Depending on the situation, it can be the target atom initially struck by the irradiating particle, the incident ion itself, or the recoil nucleus created by radioactive decay. After collision with the PKA, a target atom is displaced from its original position if its kinetic energy is larger than a threshold displacement energy (TDE), which depends on the element, the material and the crystallographic direction of the momentum of the target atom. TDEs are of the order of 20–70 eV. Cascades occur when the kinetic energies of the target atoms are large enough to ensure further displacements of other atoms. Simple mechanics shows that this is not the case for electron irradiation: electrons, owing to their small mass, can transfer energies of only up to a few tens of eV to target atoms. Electron irradiation thus creates only isolated Frenkel pairs. By contrast, neutron or ion irradiation causes PKAs to recoil with energies of tens of keV, leading to displacement cascades. A cascade can be decomposed into three successive phases. Once the PKA has been set into motion, a series of atomic displacements takes place through collisions (ballistic phase). After less than ∼0.2 ps the energies of the recoil atoms fall below the threshold displacement energy and the ballistic phase ends. But, due to the atomic motion, a local area of high temperature exists in the material, creating the conditions of a thermal spike (thermal phase), the material being locally in a liquid-like state. After a few picoseconds the spike dissipates and the so-called primary state of damage is reached. Simulations have shown that the primary state of damage depends strongly on the way the material structure reacts during the thermal spike. In some materials the crystalline structure rebuilds rapidly, leaving only point defects after the cascade. This is the case in pure metals such as Ni and Fe, where cascades create vacancies and interstitials. Some ionic compounds (e.g., UO₂) react in the same way. In metallic alloys, anti-sites are also produced, leading to an ion-mixing effect. In contrast, in silicon and other covalent or iono-covalent materials (SiO₂, zircon), cascades do not create isolated defects but amorphous pockets of material which do not re-crystallize after the thermal phase. After the thermal phase starts the subsequent diffusive phase, during which the long-term evolution of the material through thermally activated phenomena takes place. Depending on the material, the competition between the accumulation of damage and the diffusive restoration phenomena may lead to dramatic changes such as the complete amorphization of the material as the crystalline structure eventually collapses, leading to the so-called metamict state. Displacement cascade simulations aim to reproduce the ballistic and thermal phases and to describe the primary state of damage. Two main simulation methodologies exist. The binary collision approximation (BCA) describes the ballistic phase and gives fast and reliable results for a global picture of energy loss, damage geometry and ion implantation range. It is used in situations
where good statistics are needed. Molecular dynamics (MD) simulations are much more computationally demanding but lead to a more precise description of the material as they describe both the ballistic and the thermal phases.
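The kinematics invoked above can be made quantitative in a few lines. The sketch below (an illustration added here, using standard textbook masses and elastic-collision limits, not data from this article) computes the maximum energy transferable to an Fe atom by a 1 MeV neutron and by a 1 MeV electron:

```python
# Maximum elastic energy transfer to an Fe atom: why electrons make
# isolated Frenkel pairs while neutrons launch displacement cascades.

AMU_MEV = 931.494            # atomic mass unit, MeV/c^2
M_E_MEV = 0.511              # electron rest mass, MeV/c^2
M_FE = 55.845 * AMU_MEV      # Fe target mass, MeV/c^2
M_N = 1.0087 * AMU_MEV       # neutron mass, MeV/c^2

def t_max_classical(e_keV, m_proj, m_target):
    """Head-on classical limit T_max = 4 m M E / (m + M)^2, in keV."""
    return 4.0 * m_proj * m_target * e_keV / (m_proj + m_target) ** 2

def t_max_electron(e_MeV, m_target):
    """Relativistic limit for electrons: T_max = 2E(E + 2 m_e c^2)/(M c^2), MeV."""
    return 2.0 * e_MeV * (e_MeV + 2.0 * M_E_MEV) / m_target

print(f"1 MeV neutron on Fe : {t_max_classical(1000.0, M_N, M_FE):.0f} keV")  # ~70 keV
print(f"1 MeV electron on Fe: {t_max_electron(1.0, M_FE) * 1e6:.0f} eV")      # ~78 eV
```

With TDEs of 20–70 eV, the electron limit barely suffices to produce isolated Frenkel pairs, whereas the neutron limit is thousands of times the TDE and launches a full cascade.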
2. Binary Collision Approximation
In the BCA, particles are assumed to move along straight trajectories between successive two-body (binary) collisions, and moving atoms collide only with particles at rest. The BCA is at the heart of the Kinchin–Pease (KP) expression, which gives the number of defects produced by a PKA as a function of its kinetic energy: ν(T) = T/(2E_d), where T is the kinetic energy of the PKA and E_d is the TDE of the target atoms. Norgett, Robinson and Torrens (NRT) [2] obtained a modified form which is widely used to quantify irradiation damage. By integration, these formulas give the total number of defects in the cascade as a function of the initial PKA energy. Many simulation codes are based on the BCA. In these codes, the only energy that is tracked is the kinetic energy of moving particles. At each encounter, a fraction of the kinetic energy of the incoming particle is transmitted to the target particle; some energy is subtracted from this to account for the binding energy of the particle. After such a collision the target particle is set into motion only if its kinetic energy exceeds its TDE; otherwise it remains motionless. Likewise, the incoming particle stops if its kinetic energy falls below its TDE. The simulation is initiated by giving an initial momentum to the PKA and stops when no moving particles are left. The binary collision approximation therefore describes only the ballistic phase. The determination of the energy transmitted between colliding particles relies on cross-section calculations using some conservative central potential. Different kinds of pair potentials may be used; a common choice is the Ziegler–Biersack–Littmark (ZBL) potential [3], which assumes a universal form for inter-atomic interactions. Electronic excitations are modelled through a supplementary energy loss applied at each collision. BCA simulations show that, for high-energy PKAs, the cascade divides into quite disconnected subcascades. The first mechanism for subcascade formation is that, for projectiles with energies greater than a few tens of keV, the mean free path between successive energetic collisions is long. Since most secondary recoil atoms have energies much lower than the PKA, their subsequent collisions take place close to their original sites. The main cascade is therefore a string of localized defective zones separated by areas containing few defects. Second, the rare high-energy recoils create subcascades of their own that
are disconnected from the main one. Finally, at lower energies, the crystal structure plays an important role in subcascade formation, as a moving atom can be steered by atomic rows through a crystal channel. Of course, this last mechanism cannot operate in materials where no such channel exists, for example low-symmetry crystals or glasses. The two major BCA codes are SRIM [3, 4] and MARLOWE [5]. They differ in their level of description of the atomic structure of the material. MARLOWE is well suited to crystals, as it includes a description of the atomic structure; SRIM randomly determines successive collision targets, and the only structural pieces of information are the composition and density of the material under study. BCA codes are very fast: a cascade simulation of any energy takes about one second with SRIM on any computer. It is therefore possible to run many simulations and obtain a statistically relevant picture of the damage caused by a given irradiation. These codes are mainly used by experimentalists to quickly assess the expected results of irradiation, such as the depth of penetration of implanted particles; a rough estimate of the structure of the cascade track and of the number of defects created is also obtained.
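For orientation, the KP/NRT bookkeeping introduced above fits in a few lines. The sketch below uses the standard NRT displacement efficiency of 0.8 [2] and takes the damage energy (recoil energy net of electronic losses) as a given input rather than computing it from a stopping-power partition:

```python
def nrt_displacements(t_dam_eV, e_d_eV):
    """Stable displacements predicted by the NRT model [2].

    t_dam_eV : damage energy (recoil energy minus electronic losses), eV
    e_d_eV   : threshold displacement energy (TDE), eV
    """
    if t_dam_eV < e_d_eV:
        return 0.0                          # sub-threshold: nothing displaced
    if t_dam_eV < 2.0 * e_d_eV / 0.8:
        return 1.0                          # a single Frenkel pair
    return 0.8 * t_dam_eV / (2.0 * e_d_eV)  # KP form with the NRT efficiency

# Example: 20 keV of damage energy with E_d = 40 eV gives 200 NRT
# displacements; MD finds only about one third survive (see Section 5).
print(nrt_displacements(20e3, 40.0))        # -> 200.0
```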
3. Threshold Displacement Energy Calculations with Molecular Dynamics
Within MD, the TDE can be calculated for each ion type by giving an atom a series of impulses of increasing kinetic energy in various directions and following the subsequent atomic motions. TDE calculations are therefore nothing other than simulations of low-energy cascades. In each direction, for energies lower than the TDE, the knocked-on atom returns to its original position after some atomic displacements, leaving the crystal unperturbed; the atom has not left the region of instability surrounding its vacant site, called the spontaneous recombination volume (SRV). Beyond the threshold energy, the knocked-on atom leaves the SRV and does not return to its original site, and at least one Frenkel pair remains at the end of the simulation. Technically, the simulated time should be long enough to allow for spontaneous recombination and as short as possible to save computational time and prevent diffusive recombination; a simulated time of around 0.5 ps seems reasonable. TDE calculations are also of interest because they often exhibit behaviors present in higher-energy cascades. For instance, in crystals where atoms are aligned in straight rows, replacement collision sequences (RCS) may take place, in which a series of atoms is displaced along a crystalline row, leading to a disconnected interstitial–vacancy pair.
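The direction-by-direction search can be automated around any MD engine. The sketch below assumes a hypothetical driver `run_low_energy_cascade(direction, energy_eV)` — a stand-in, not a real library call — that gives the knock-on atom the stated kinetic energy, integrates for ~0.5 ps, and reports whether a stable Frenkel pair remains:

```python
import numpy as np

def tde_in_direction(run_low_energy_cascade, direction,
                     e_lo=5.0, e_hi=200.0, tol=1.0):
    """Bracket the threshold displacement energy (eV) along one direction."""
    if run_low_energy_cascade(direction, e_lo):
        return e_lo              # already displaces at the lower bracket
    while e_hi - e_lo > tol:
        e_mid = 0.5 * (e_lo + e_hi)
        if run_low_energy_cascade(direction, e_mid):
            e_hi = e_mid         # defect created: threshold lies below e_mid
        else:
            e_lo = e_mid         # atom fell back inside its SRV: go higher
    return e_hi

# Sample the TDE over a few high-symmetry directions of a cubic crystal.
directions = [np.asarray(d, float) / np.linalg.norm(d)
              for d in ((1, 0, 0), (1, 1, 0), (1, 1, 1))]
```

Because the displacement outcome is not always monotonic in energy, a linear scan in fixed increments is often safer than the bisection shown; the bracketing bookkeeping is otherwise identical.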
A difficulty common to cascade and TDE MD simulations arises from the inter-atomic potential. Common potentials are designed to fit (low-energy) equilibrium or near-equilibrium properties, whereas cascades involve high-energy configurations and very short inter-atomic distances. For such small distances one has to turn to specific potentials devoted to the high energies involved, such as the ones used in BCA codes. Two kinds of potentials therefore have to be connected. One common way to do this is to connect smoothly (through a high-order polynomial form) the high-energy, short-distance potential to the pair repulsion that exists in all forms of low-energy potentials, and to extinguish at small distances the higher-order terms (three-body or embedding parts) that may appear in the equilibrium potentials. Unfortunately, this connection takes place in a sensitive range of inter-atomic repulsion, namely between 10 and 100 eV, which is precisely the range of the TDE. The calculated values of the TDE are therefore highly dependent on this connection. In the uncommon cases where experimental figures are available for the TDE, they should be used to properly design the connection. The threshold displacement energies, like all quantitative figures of cascade modelling, depend strongly on the details of the potentials used.
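A concrete version of this connection is sketched below: the universal ZBL screening function [3] supplies the short-range repulsion, and a quintic polynomial, matched in value, slope and curvature at two join radii, bridges to the equilibrium pair term. The join radii and the stand-in equilibrium repulsion are illustrative choices only — precisely the quantities that should be tuned against experimental TDEs when these are available:

```python
import numpy as np

KE2 = 14.3996     # e^2 / (4 pi eps0), eV * Angstrom
A0 = 0.529177     # Bohr radius, Angstrom

def zbl(r, z1, z2):
    """Universal ZBL screened-Coulomb pair potential [3], eV (r in Angstrom)."""
    a_u = 0.8854 * A0 / (z1 ** 0.23 + z2 ** 0.23)
    x = r / a_u
    phi = (0.18175 * np.exp(-3.19980 * x) + 0.50986 * np.exp(-0.94229 * x)
           + 0.28022 * np.exp(-0.40290 * x) + 0.02817 * np.exp(-0.20162 * x))
    return z1 * z2 * KE2 / r * phi

def quintic_join(f_lo, f_hi, r1, r2, h=1e-4):
    """Quintic matching value, slope and curvature of f_lo at r1 and f_hi at r2."""
    d1 = lambda f, r: (f(r + h) - f(r - h)) / (2 * h)
    d2 = lambda f, r: (f(r + h) - 2 * f(r) + f(r - h)) / h ** 2
    rows, rhs = [], []
    for r, f in ((r1, f_lo), (r2, f_hi)):
        rows += [[r ** k for k in range(6)],
                 [k * r ** (k - 1) if k >= 1 else 0.0 for k in range(6)],
                 [k * (k - 1) * r ** (k - 2) if k >= 2 else 0.0 for k in range(6)]]
        rhs += [f(r), d1(f, r), d2(f, r)]
    c = np.linalg.solve(np.array(rows), np.array(rhs))
    return lambda r: sum(ck * r ** k for k, ck in enumerate(c))

# Illustrative Fe-Fe join between 1.0 and 2.0 Angstrom: zbl() covers the
# sub-r1 region (~100 eV and above), the stand-in equilibrium pair
# repulsion covers r > r2, and the quintic bridges the TDE-sensitive window.
eq_pair = lambda r: 4.0e3 * np.exp(-r / 0.3)
bridge = quintic_join(lambda r: zbl(r, 26, 26), eq_pair, 1.0, 2.0)
```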
4. Methodology of Cascade Simulations with Molecular Dynamics
Unlike BCA models, which treat the cascade as a succession of independent two-body encounters, MD simulations fully integrate the classical equations of motion for all atoms simultaneously. The price to pay is, of course, a much heavier computational requirement than for BCA codes; in return, MD gives a much more precise picture of the material, as it describes both the ballistic and the thermal phases. MD simulations deal with atoms or ions and thus focus on the elastic losses. Inelastic losses due to electronic excitations are not explicitly considered. In metals, these play little role in the energy range considered in cascade simulations (tens of keV) and are conveniently accounted for by a friction-like force acting on moving atoms. For insulators, such approaches are clearly insufficient, as electronic excitations are long-lived and can lead to specific defect formation; no satisfactory formalism exists at present to deal with them. After the initial impulse has been given to the PKA, one simply follows the movements of all the atoms inside the simulation box until a state is reached that is meta-stable on the time scale of MD simulations, that is, picoseconds. The long-term evolution of the material is out of reach for MD and should be studied with other tools (see other sections of this book). Due to the computational cost of MD simulations, the size of the simulation box and the number of time steps are limited, and so are the PKA energies. The size of the
box should be large enough to easily accommodate the cascade. There is no standard rule on this point, but common practice is that the projected range of the projectile should be less than one fourth of the box length and that the number of atoms in the box should be greater than 25 times the energy of the projectile in eV (500 000 atoms for 20 keV). Such large boxes are especially needed when channeling processes are expected, as these may lead to damage creation far from the PKA track. A proper cascade simulation should last for at least a few picoseconds, which amounts to a few tens of thousands of time steps (see below). Fortunately, thanks to the increase in computer power, the sizes and times that can be simulated have become larger and larger. It is now possible, with supercomputers, to simulate irradiation events with primary energies close to those expected in nuclear reactors or α disintegrations, that is, tens of keV. Even in cases where it is not possible to reproduce the real energy of the PKA (in implantation studies, for instance), the division into subcascades evidenced by BCA models for high-energy PKAs justifies the simulation of lower-energy cascades. Due to the large kinetic energies involved, the velocities of the ions can reach very high values at the beginning of the cascade. To ensure proper conservation of energy, it is necessary to use a time step as small as 10^−5 ps to discretize the trajectories of the ions. After some 10^−2 ps, the maximum atomic velocity starts to decrease and the time step of the simulation can be progressively increased. This may be accomplished routinely by adjusting the time step so that the maximum atomic displacement between two consecutive time steps is smaller than some distance (e.g., 0.05 Å), as in the sketch below. It is also possible to use multiple-time-step algorithms that apply different time steps in different areas of the box depending on the local atomic velocities.
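The displacement-capped time-step rule just described amounts to a one-line controller; the cap and ceiling below are illustrative values:

```python
def next_time_step(v_max, dt_max=2e-3, dx_max=0.05):
    """Time step (ps) keeping the fastest atom below dx_max Angstrom per step.

    v_max : current maximum atomic speed, Angstrom/ps.
    """
    return dt_max if v_max <= 0.0 else min(dt_max, dx_max / v_max)

# A fresh 50 keV Fe recoil moves at ~4200 Angstrom/ps, forcing
# dt ~ 1e-5 ps; as the spike cools, dt relaxes back toward dt_max.
print(next_time_step(4200.0))   # -> ~1.2e-5 ps
```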
Cascade simulations should be performed in the pseudo (see below) NVE ensemble, that is, at constant volume and energy. However tempting, constant-pressure or constant-temperature algorithms should not be used: they are built to describe thermodynamic equilibrium properties, and there is no such thing as equilibrium during a cascade, so using them leads to clearly spurious and unphysical behaviors. For instance, a large and sudden increase of temperature takes place in the core of the cascade, creating the thermal spike; applying a global constant-temperature algorithm (such as Nosé–Hoover) then freezes the atoms at the periphery of the cascade to almost zero temperature in an attempt to achieve a constant average temperature in the box. Nevertheless, one should take care of the temperature and pressure waves that propagate from the cascade core to the rest of the simulation box. First, one should use large enough boxes: due to the finite size of the box, whatever the boundary conditions, the heat and pressure waves created by the cascade will eventually return to the cascade area, and the box should be large enough for the ballistic and thermal phases to be completed before the waves return. Second, an approximate way to deal with these waves has been designed. It consists in initiating the cascade in the centre of the simulation box and damping the thermal wave by controlling the temperature of the external layer of the box, to model the thermal bath constituted in reality by the surrounding crystal. This damping can be performed either by simple rescaling of the atomic velocities (sketched below) or through some Lagrangian formalism; the heat wave of the cascade is then partially absorbed at the border of the simulation box. Proper handling of the pressure wave is less common, but a generalized Langevin-type approach exists [6].
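A minimal version of the simple velocity-rescaling variant of this border damping follows; a Langevin-type treatment [6] would replace the scale factor with friction and noise terms:

```python
import numpy as np

def rescale_border_layer(vel, mass, is_border, t_bath, kB=8.617e-5):
    """Drive the outer layer of the box toward the bath temperature.

    vel       : (N, 3) velocities, Angstrom/ps
    mass      : (N,) masses in eV * ps^2 / Angstrom^2 (consistent units)
    is_border : (N,) boolean mask selecting the damping layer
    t_bath    : bath temperature, K (kB in eV/K)
    """
    ke = 0.5 * np.sum(mass[is_border, None] * vel[is_border] ** 2)
    n = np.count_nonzero(is_border)
    t_now = 2.0 * ke / (3.0 * n * kB)            # instantaneous layer temperature
    vel[is_border] *= np.sqrt(t_bath / t_now)    # simple rescaling toward the bath
    return vel
```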
5. Results of Cascade Simulations with Molecular Dynamics
Results of MD simulations vary from one material to another; see Nordlund et al. [7] for a comparison between different materials. A common feature is that the cascade region undergoes local melting and that this melting has a large influence on the primary state of damage. Many studies have characterized this melted zone in terms of volume, temperature, density, structural characteristics, cooling rate and duration; a difficulty lies in the somewhat approximate definition one has to use for the cascade zone. Quite obviously, the most important result of an MD cascade simulation is the set of atomic positions at the end of it. One should really take time to look at the atomic configurations, as much information can be learned by eye inspection, and special care should be paid to the visualization of atomic positions. Looking at the final atomic structure shows what kind of defects the cascade created; one can easily see whether a cascade creates point defects or an amorphous area. To analyze the primary state of damage beyond eye inspection of all atomic positions, one has to design, on a case-by-case basis, convenient analysis tools, the idea being to extract from the complete set of atomic positions those atoms that are in a specific state of disorder; a common example is sketched below. For materials where the crystalline structure is restored during the thermal phase, a cascade creates ion mixing and some point defects. For these materials, MD simulations have shown that the numbers of defects predicted by the KP and NRT models are overestimates: the ratio of MD results to NRT predictions decreases to a limit of about one third for energies of the order of 10 keV. This reduced defect production efficiency is attributed to fast defect recombination during the thermal phase. At the end of the cascade, one can identify and plot vacancies, interstitials, antisites and simple replacements as in Fig. 1, which shows the primary state of damage in Ni3Al after a 30 keV cascade [8]. This image exemplifies behaviors that appear in many materials.
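A widely used analysis of this kind is Wigner–Seitz occupancy: every final atom is assigned to its nearest perfect-lattice site; empty sites are counted as vacancies and multiply occupied sites as hosting interstitials. A minimal sketch (periodic boundaries and antisite/chemical bookkeeping omitted for brevity):

```python
import numpy as np
from scipy.spatial import cKDTree

def wigner_seitz_defects(ref_sites, final_pos):
    """Vacancy and interstitial sites from Wigner-Seitz cell occupancy.

    ref_sites : (M, 3) perfect-lattice site positions
    final_pos : (N, 3) atomic positions at the end of the cascade
    """
    owner = cKDTree(ref_sites).query(final_pos)[1]   # nearest site per atom
    occupancy = np.bincount(owner, minlength=len(ref_sites))
    vacancies = np.flatnonzero(occupancy == 0)       # empty Wigner-Seitz cells
    interstitial_sites = np.flatnonzero(occupancy > 1)
    return vacancies, interstitial_sites
```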
Figure 1. Visualization of the result of a 30 keV cascade in Ni3Al [8]. Replaced atoms and antisites are represented by small white spheres; light and dark gray spheres represent vacancies and interstitials, respectively. The lines join two channeled atoms to their original positions.
One can see that the cascade is divided into subcascades by channeling (see the lines in Fig. 1). Replacement collision sequences are also visible as little tails that point out of the main damaged zone. Quite naturally, vacancies and interstitials are, on the whole, situated in the centre and at the periphery of the cascade area, respectively. Other MD simulations in metals have shown that the primary state of damage includes clusters of self-interstitials which exhibit fast (possibly athermal) diffusion. In contrast to metals and alloys, for which recrystallization is almost complete, in materials subject to amorphization (silicon, zircon, zirconolite, etc.) pockets of amorphous material may remain at the end of the cascade. The possible difference between the structure of such amorphous domains and the glassy structure obtained by fast quenching is still under debate. Zircon is an example of such a direct amorphization process around the PKA track. The best way to illustrate this behavior is to show all atomic positions,
as in Fig. 2 [9]; the amorphous area in the centre appears clearly. Once defined, the cascade area can be analyzed in terms of radial or angular distribution functions, which can be compared with experiments such as neutron diffraction or extended x-ray absorption fine structure (EXAFS) performed on irradiated materials.
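A brute-force radial distribution function of the kind used in such comparisons is sketched below; it assumes a cubic periodic cell, and its O(N^2) cost is acceptable for cascade-sized boxes:

```python
import numpy as np

def radial_distribution(pos, box, r_max=8.0, n_bins=160):
    """g(r) for positions pos (N, 3) in a cubic periodic box of edge box (Angstrom)."""
    n = len(pos)
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)                        # minimum-image convention
    r = np.linalg.norm(d, axis=-1)[np.triu_indices(n, k=1)]
    hist, edges = np.histogram(r[r < r_max], bins=n_bins, range=(0.0, r_max))
    rho = n / box ** 3                                  # number density
    shell = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    g = hist / (0.5 * n * rho * shell)                  # ideal-gas normalization
    return 0.5 * (edges[:-1] + edges[1:]), g
```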
Figure 2. Morphology of a zircon (ZrSiO4) crystal after a 5 keV cascade [9]; Si (light gray), O (gray), Zr (dark gray).
Global quantitative figures can also be extracted from the final configurations, such as the number of defects created or the energy stored in the structure at the end of the cascade. These figures allow comparisons with the predictions of other models or with experiments, and they can in turn be fed into long-time models of the global evolution of the material under irradiation.
6. Summary
Binary collision approximation and MD simulations have proved highly informative on cascade unfolding and damage. For a low computational cost, BCA gives an overall picture of the damage. Molecular dynamics simulations lead to reliable qualitative information; quantitative predictions are also possible, even if they depend more heavily on the choice of the empirical potential. Still, these simulations have limitations. On the one hand, BCA approaches suffer from their rough approximations and, on the other, even with supercomputers most experimental irradiation conditions are out of reach for MD simulations. Some attempts have been made to chain the BCA and MD approaches to push up the limits in PKA energy. Other developments involve linking cascade MD simulations with kinetic Monte Carlo approaches to access the long-term evolution of the material under irradiation. These developments are promising but still far from routine, as they depend strongly on the system under study. Last but not least, in the case of insulators, electronic losses, which are known experimentally to be important, are completely neglected by present-day simulations.
References

[1] R. Averback and T. Diaz de la Rubia, "Displacement damage in irradiated metals and semiconductors," Solid State Phys., 51, 281, 1998.
[2] M.J. Norgett, M.T. Robinson, and I.M. Torrens, "A proposed method of calculating displacement dose rates," Nucl. Eng. Design, 33, 50–54, 1975.
[3] J.F. Ziegler, J.P. Biersack, and U. Littmark, The Stopping and Range of Ions in Solids, Pergamon, New York, 1985.
[4] J.F. Ziegler, SRIM, www.srim.org, 2003.
[5] M.T. Robinson, MARLOWE, www.ssd.ornl.gov/Programs/Marlowe/guide/index.htm, 2003.
[6] M. Moseler, J. Nordiek, and H. Haberland, "Reduction of the reflected pressure wave in the molecular-dynamics simulation of energetic particle–solid collisions," Phys. Rev. B, 56(23), 15439–15445, 1997.
[7] K. Nordlund et al., "Defect production in collision cascades in elemental semiconductors and fcc metals," Phys. Rev. B, 57, 7556–7570, 1998.
[8] N.V. Doan and R. Vascon, "Displacement cascades in metals and ordered alloys. Molecular dynamics simulations," Nucl. Instrum. Meth. B, 135(1–4), 207–213, 1998.
[9] J.P. Crocombette and D. Ghaleb, "Molecular dynamics modeling of irradiation damage in pure and uranium doped zircon," J. Nucl. Mater., 295, 167–178, 2001.
2.29 RADIATION EFFECTS IN FISSION AND FUSION REACTORS

G. Robert Odette¹ and Brian D. Wirth²
¹ Department of Mechanical Engineering and Department of Materials, University of California, Santa Barbara, CA, USA
² Department of Nuclear Engineering, University of California, Berkeley, CA, USA
Since the prediction of "Wigner disease" [1] and the subsequent observation of anisotropic growth of the graphite used in the Chicago Pile, the effects of radiation on materials have been an important technological concern. The broad field of radiation effects impacts many critical advanced technologies, ranging from semiconductor processing to severe materials degradation in nuclear reactor environments. Radiation effects also occur in many natural environments, ranging from deep space to the Earth's crust. As selected examples that involve many basic phenomena that cross-cut and illustrate the broader impacts of radiation exposure on materials, this article focuses on modeling microstructural changes in iron-based ferritic alloys under high-energy neutron irradiation relevant to light water fission reactor pressure vessels. We also touch briefly on radiation effects in structural alloys for fusion reactor first wall and blanket structures; in this case the focus is on modeling the evolution of self-interstitial atom clusters and dislocation loops. Note that, since even the narrower topic of structural materials for nuclear energy applications encompasses a vast literature dating from 1942, the references included in this article are primarily limited to these two narrower subjects. Thus, the references cited here are presented as examples, rather than comprehensive bibliographies. However, the interested reader is referred to proceedings of continuing symposia series that have been sponsored by several organizations,∗ several monographs [2–4] and key journals (e.g., Journal of Nuclear Materials, Radiation Effects and Defects in Solids).

∗ Meetings and symposia series of interest include the American Society for Testing and Materials (ASTM) Special Topical Meetings on Radiation Effects on Materials, the International Conference on Fusion Reactor Materials, the symposia series on Microstructural Processes in Irradiated Materials sponsored by the Materials Research Society and the Minerals, Metals and Materials Society (TMS), and the International Symposia series on Environmental Degradation of Materials in Light Water Reactors.
The underlying physics controlling neutron radiation damage, and its attendant consequences for material properties, is inherently hierarchical and multiscale. Pertinent length and time scales controlling radiation effects range from neutron collision-reactions on the scale of the nucleus to the size and service lifetimes of structural components, spanning factors in excess of 10^14 (length) and 10^22 (time) [5–10]. Radiation effects are also inherently "multi-physics." Numerous basic nuclear, atomic and solid-state physics processes are linked to complex nano- and microstructural evolutions in multi-constituent, multi-phase engineering materials through non-equilibrium thermodynamics and accelerated kinetics, leading to structure–property and property–property relations described by micro- and macro-mechanics models [5, 7, 11]. The governing processes involve enormous numbers of degrees of freedom, and critical outcomes often depend on small differences between large competing effects. For example, void swelling results from a small bias in vacancy versus self-interstitial atom fluxes to different sinks [5, 12]. The fundamental objective of multi-scale, multi-physics (MSMP) radiation effects modeling is to quantitatively predict the generation, transport, fate and consequences of all defect species created and solutes transported by irradiation. The practical aim of modeling is to provide improved predictions of materials (component) performance and lifetime by relating time-dependent property changes to the combination of governing material and irradiation variables. Physical models provide a framework for synthesizing experimental information, ranging from laboratory-based mechanism studies to real-world surveillance data. Thus models can be used to extrapolate more reliably beyond an often limited and imperfect database [7, 13–16].
1. Irradiation Effects in Ferritic–Bainitic and Ferritic–Martensitic Alloys
An example of the successful application of the multi-scale-modeling concept is improved prediction of irradiation embrittlement of reactor pressure vessel (RPV) steels [13, 14]. Western RPVs are fabricated from quenched and tempered C–Mn–Si–Mo–Ni low-alloy steels and operate around 300 °C. These ferritic–bainitic alloys contain coarse-scale Fe(Mn)3C and smaller Mo2C carbides, with dislocation densities of about 2 × 10^14 m^−2. As summarized in Table 1, RPVs accumulate fast neutron fluences of about 1 to 10 × 10^23 n/m^2 over a 40–60 year service life, corresponding to a maximum damage dose of less than 0.15 displacements per atom (dpa) [8, 13, 17]. Even this relatively low dose is sufficient to produce embrittlement, characterized by upward shifts (ΔT) in the transition temperature between the more brittle cleavage and more ductile microvoid coalescence fracture regimes.
Table 1. Summary of RPV steel and LAMS composition and irradiation conditions

Reactor pressure vessel (RPV) steels
- Composition (weight %): Fe–C(0.05–0.2%)–Mn(0.7–1.6%)–Mo(0.4–0.6%)–Ni(0.2–1.4%)–Si(0.2–0.6%)–Cr(0.05–0.5%)–Cu(0.05–0.4%)–P(0.005–0.025%)
- Microstructure: ferritic–bainitic quenched and tempered forgings and stress-relieved submerged arc welds; coarse-scale Fe(Mn)3C and Mo2C carbides
- Dislocation density (m^−2): ≈ 2 × 10^14
- Irradiation temperature (°C): ≈ 290
- Dose rate (dpa/s): ≈ 0.5 × 10^−10–10^−11
- Target dose (dpa): < ≈ 0.05
- Gas generation (appm He, H): minimal
- Neutron flux (E > 1 MeV): 1.1 × 10^15 n/(m^2·s) (≈ 20% of total)
- Neutron flux (E > 10 MeV): —
- Mean PRA energy: ≈ 15 keV

Low activation martensitic steels (LAMS)
- Composition (weight %): Fe–Cr(8–12%)–W(1–2%)–Ta(0.05–0.5%)–V(0.05–0.3%)–C(0.1%)
- Microstructure: normalized and tempered ferritic–martensitic steels; Ta- and V-alloyed carbides, Cr-rich α′ and Fe2W Laves phases
- Dislocation density (m^−2): 0.5–10 × 10^14 (depending on tempering conditions)
- Irradiation temperature (°C): 300–550
- Dose rate (dpa/s): ≈ 0.5 × 10^−6–10^−6
- Target dose (dpa): 150
- Gas generation: 1500 appm He, 6000 appm H
- Neutron flux (E > 1 MeV): 1.5 × 10^18 n/(m^2·s) (≈ 40% of total)
- Neutron flux (E > 10 MeV): 8.8 × 10^17 n/(m^2·s) (≈ 20% of total)
- Mean PRA energy: ≈ 50 keV
Embrittlement is primarily the result of irradiation hardening, reflected in increases in yield stress (Δσy), and ΔT can reach values of 300 °C or more. Embrittlement is controlled by a complex combination of variables [7, 8, 13, 18], including the neutron flux, fluence and spectrum and the irradiation temperature (irradiation variables), and the alloy's starting microstructure and composition (material variables). Important compositional variables (all compositions given in weight %) include Cu (0.02–0.4%), Ni (0.2–2%) and Mn (0.3–1.9%), while P (0.005–0.040%) and Si (0.2–0.7%) play a secondary role. The primary hardening and embrittling features are a high concentration of nm-scale coherent Cu-rich precipitates (CRPs) [19]. The CRPs are alloyed with Mn, Ni, Si and P [5, 7, 13, 18, 20, 21]. In alloys containing high quantities of Mn and Ni, the CRPs give way to Mn–Ni rich precipitates (MNPs). The MNPs are also alloyed with Cu and Si, but MNPs can form even in Cu-free steels [22]. In low-Cu steels, the main hardening features are vacancy cluster–solute (Mn, Ni, Si) complexes, MNPs and alloy phosphide phases [5, 7, 17, 20]. The mechanisms of RPV steel embrittlement are summarized below. Normalized and tempered low activation martensitic steels (LAMS) are the leading candidates for use in fusion first wall and breeding blanket reactor structures [23–26]. These alloys will experience much larger doses in service, in the range of 100–200 dpa, accompanied by high concentrations of solid and gaseous transmutation products, including insoluble He (1000–2000 atomic parts per million, appm) and reactive H (4000–8000 appm). Fusion reactor components fabricated from LAMS will operate at service temperatures ranging from about 300 to 550 °C. As summarized in Table 1, the hard fusion neutron spectrum, with a large (≈ 20%) component of 14 MeV neutrons from the D–T reactions, produces much higher levels of He and H from (n, α) and (n, p) threshold reactions. Predictive irradiation effects models must treat a large number of performance-sustaining properties that may be degraded, including the yield strength and strain hardening constitutive laws, various types of "ductility," fatigue crack growth rates, fracture toughness, irradiation and thermal creep rates, void swelling, creep-rupture time and strain, thermo-mechanical fatigue stress and strain limits, creep-crack growth rates, creep–fatigue interactions, environmentally assisted cracking and bulk corrosion–oxidation compatibility [23–26]. LAMS typically contain ≈ 8% Cr plus 1–2% W and ≈ 0.1% C, along with smaller quantities of carbide-forming microalloying elements like Ta and V and small to modest concentrations of Mn and Si [11, 26]. Depending on the irradiation temperature and dose, major phases include a variety of alloyed carbides, Cr-rich α′ and Fe2W Laves phases. LAMS microstructures contain moderately high dislocation densities (≈ 2 × 10^14 m^−2) and dislocation sub-structures inside martensitic laths, forming small groups of lath packets within the prior austenitic grains. These coarse-scale (>0.05 µm) structures and phases formed during processing are generally stable at
low-to-intermediate irradiation temperatures and doses. In the regime below about 400 °C, the dominant features induced by irradiation are dislocation loops, gas bubbles and voids and, in some cases, fine-scale precipitates like α′ [27, 28]. These fine-scale features cause hardening, loss of uniform tensile strain capacity, flow localization and embrittlement [23, 29]. The effects of hardening may be amplified by high levels of H and He that, at sufficient concentrations, may lead to grain boundary decohesion and a brittle intergranular fracture mode up to very high temperatures, potentially reaching 500 °C or more [11, 30]. Irradiation creep occurs over the entire range of service temperatures and is the dominant source of dimensional instability at low-to-intermediate temperatures [28]. The number densities of the irradiation-induced features decrease with increasing irradiation temperature [27]. In the range of 400–500 °C, high He levels may result in enhanced swelling associated with the biased-vacancy-flux-driven transformation of stably growing bubbles into unstably growing voids [31]. Above about 450 °C (or perhaps a lower limit with increasing dose) the coarser microstructures and phases become increasingly unstable and tend to recover and coarsen, while precipitation of grain boundary Laves phases occurs [32]. These evolutions can lead to both softening and non-hardening embrittlement [11, 33]. Irradiation and applied stresses may accelerate, and lower the temperature range of, the time–temperature C-curves describing these transformations, through radiation-enhanced diffusion or other mechanisms. However, LAMS do not generally show large irradiation-driven effects on microstructural stability, or phenomena such as severe solute segregation. The main concern at higher temperatures (above 500 °C) is the accumulation of He on grain boundaries, which may be accompanied by severe reductions in creep rupture times and strains due to stress-driven, creep-controlled growth of creep cavities that nucleate on bubbles [34]. The high sink density in LAMS is believed to offer some degree of grain boundary protection [35], but this has not been verified under fusion-relevant conditions. A major challenge to predicting the performance of materials in fusion environments is the absence of a high-energy, high-dose neutron source [25, 36]. However, fission reactors can be used to study the effects of high levels of helium by combinations of spectral tailoring and doping with isotopes of elements with high (n, α) cross sections (B and Ni) and, more recently, by in situ α-implantation of LAMS from thin adjoining layers rich in these elements [37]. Information from these experiments will be used to develop, calibrate and validate multi-scale models of the effects of damage accumulation, including the transport, fate and consequences of high levels of helium. Since the wide range of irradiation effects is controlled by a large combination of material and irradiation variables, purely empirical characterization of material properties under irradiation is impossible, and predictions will always involve significant interpolation and extrapolation. Therefore,
development of experimentally calibrated and validated MSMP models that are based on a physical understanding of the underlying processes and how they interact is a practical necessity.
2. MSMP Models of Radiation Effects
Figure 1 illustrates the hierarchy of processes that must be integrated into MSMP models of property changes in fission and fusion environments. Since the focus here is on modeling microstructure evolution under irradiation, we begin by simply noting that hierarchical modeling of irradiation effects on materials performance [7, 11, 13] involves linking sub-models relating: (a) changes in the microstructure to local structure-sensitive deformation and fracture properties, like changes in yield stress (Δσy); and (b) combinations of the fundamental local constitutive properties to more complex continuum engineering properties, like shifts in the temperature indexing a specified fracture
Figure 1. Illustration of the length and time scales (and inherent feedback) involved in the multiscale processes responsible for microstructural changes in irradiated materials. The processes are described in more detail in the text.
toughness (ΔT). The relationship between microstructure and local constitutive behavior has traditionally been addressed with phenomenological models based on dislocation theory [13, 38, 39]. Corresponding models pertinent to local fracture properties have been based on micro-mechanics theories [11, 40, 41]. More recently, however, direct simulations based on MD and dislocation dynamics (DD) have been used to characterize many details of the structure–property relation, such as dislocation–obstacle interaction mechanics [42–45], the motion of single and multiple dislocations through arrays of obstacles of varying strength [13, 46, 47] and the evolution of dislocation structures [48–50]. However, a detailed review of these topics is beyond the scope of this article.
2.1. Primary Recoil Atoms: Neutron Scattering and Reactions
Radiation damage begins with the creation of energetic primary recoil atoms (PRA) through high-energy neutron–nuclear interactions and the concurrent production of He, H and solid transmutants. For a given material, the PRAs have a characteristic cross-section K(E, T), determined by the kinematics of the nuclear interactions, that describes the probability that a neutron of energy E produces a recoil of energy T. The nuclear cross-sections and kinematics models needed to compute PRA spectra and gas production reactions are incorporated in codes such as SPECTER [51]. As illustrated in Fig. 2, the PRAs in a fusion first wall spectrum have a high-energy component, peaking at ≈ 500 keV with a mean energy of ≈ 50 keV. This compares to a mean T ≈ 15 keV for the quarter-thickness location in the RPV of a pressurized water reactor [52].
2.2. Primary Defect Production in Displacement Cascades
The PRAs quickly lose their kinetic energy through a branching chain of atomic displacement collisions, as well as non-displacing interactions with electrons, generating a high-temperature displacement cascade containing large concentrations of vacancy and self-interstitial atom (SIA) defects [53, 54]. The formation, initial cooling and relaxation of displacement cascades, including cluster formation and spontaneous recombination of vacancies and SIA, occur over very short times of ≈ 100 ps within regions less than approximately 50 nm in diameter [55–59]. The standard radiation damage dose unit is the number of displacements per atom (dpa). The computed dpa dose does not account for cascade recombination or defect reconfigurations in cascades; the dpa is essentially a measure of the total
Figure 2. Normalized PRA energy spectra for four prototypic irradiation environments (FFTF mid-core, ITER Be first wall, HFIR PTP), including a fusion reactor first wall and the reactor pressure vessel of a fission reactor (PWR 1/4-T RPV) [52].
kinetic energy deposited in atomic recoils that is not lost to electrons. While the dpa has been empirically successful as a dose unit [60], more physical measures of damage production are needed in MSMP models. This requires modeling the structure and dynamics of cascades. Conceptually, replacement sequences transport the SIA to the periphery of the cascade, leaving a vacancy-rich core. The SIA are very mobile and a large fraction of them rapidly form SIA clusters. Over short time scales the vacancies are relatively immobile, but they also form some small clusters, as well as precursors to larger nanovoids that continue to evolve at longer times. The structure of cascades and the number of primary defects νd(T) produced by a PRA of energy T have been extensively studied by molecular dynamics (MD) simulations [55–59] and binary collision approximations [61], and are discussed in a companion article by J.-P. Crocombette. Libraries of MD cascades in iron and other elements show that the number of primary defects is statistically distributed around a mean νd(T). The primary defects that are typically considered include the total displacements, the net residual defects escaping initial recombination and the numbers of small vacancy and
SIA clusters grouped in different size bins. The residual defect fraction as a function of T derived from MD simulations is shown in Fig. 3 [58]. The MD predictions are consistent with low-temperature experiments and show that for T > 1 keV, only roughly one third of the total computed displacements survive initial recombination [57, 58]. Thus the spectrally averaged defect production cross-section σ_d and production rate R_d are given by

    σ_d = (1/φ_t) ∫_E ∫_T φ(E) K(E, T) ν_d(T) dT dE    (1)

    R_d = φ_t σ_d    (2a)

    φ_t = ∫_E φ(E) dE    (2b)
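Given tabulated φ(E), K(E, T) and ν_d(T), Eqs. (1)–(2) reduce to a double quadrature. A minimal trapezoidal sketch, in which all input arrays are user-supplied model data:

```python
import numpy as np

def spectral_averaged_damage(e_grid, phi, t_grid, K, nu_d):
    """sigma_d and R_d of Eqs. (1)-(2) by trapezoidal integration.

    e_grid (nE,), t_grid (nT,) : neutron and recoil energy grids
    phi (nE,)                  : neutron flux spectrum phi(E)
    K (nE, nT)                 : recoil kernel K(E, T)
    nu_d (nT,)                 : surviving defects per recoil, nu_d(T)
    """
    inner = np.trapz(K * nu_d[None, :], t_grid, axis=1)  # integral over T
    phi_t = np.trapz(phi, e_grid)                        # Eq. (2b)
    sigma_d = np.trapz(phi * inner, e_grid) / phi_t      # Eq. (1)
    return sigma_d, phi_t * sigma_d                      # Eq. (2a)
```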
While some residual questions remain, such as the effects of alloying, pre-existing microstructures and electron–phonon coupling on cascade evolution, the primary defect production cross-sections needed in MSMP models are relatively well established. However, the important physics of cascade aging over much longer periods of time has only recently been addressed. This stage of
Figure 3. Fraction of point defects (relative to the NRT value) surviving in-cascade recombination as a function of PRA energy in MD simulations; average values and standard errors at 100, 600 and 900 K [58].
modeling requires understanding of defect transport and interaction dynamics. Since the pertinent time scales are very different, the SIA and vacancy evolutions can often be treated sequentially.
2.3. SIA Cluster Properties
Extensive MD simulations have been used to study the properties of SIA and SIA clusters using Finnis–Sinclair and EAM-type interatomic potentials† [62–66]. The SIA ground state in Fe is predicted to be a ⟨110⟩ split-dumbbell, in agreement with experiment [62, 67]. As illustrated in Fig. 4, the dumbbells rotate into the ⟨111⟩ split-dumbbell/crowdion configuration with a low activation energy of about 0.18 eV. The crowdions undergo essentially athermal diffusion with activation energies < 0.05 eV, until rotating back into the ⟨110⟩ split-dumbbell orientation over a small activation barrier. Thus the overall SIA diffusion obtained from MD simulations is 3-D at higher temperatures, with an effective activation energy of about 0.13 eV [65]. However, the simulations are sensitive to the approach used to treat interatomic interaction energies and forces. As summarized in Table 2, recent ab initio calculations confirm that the ⟨110⟩ dumbbell is the lowest-energy SIA configuration, but the corresponding energy of the ⟨111⟩ SIA is higher by 0.7 eV [68, 69]. This is a rather dramatic difference compared to the EAM-type simulations. The ab initio results indicate an activation energy of 0.34 eV for a rotation-and-translation jump of the ⟨110⟩ split-dumbbell, which involves a similar but not identical jump process to that described by the EAM potentials. This may be because standard EAM-type potentials do not treat directional, multi-electron band and magnetic effects. However, the sensitivity of the ab initio results to the selection of pseudopotential, basis set and k-point sampling remains to be fully understood, as does the effect of image interactions from small periodic supercells. Thus, the migration mechanism and associated activation energy of the self-interstitial atom in Fe remain subjects of ongoing research and scientific debate. Further, the ab initio simulations predict that di-SIA with ⟨110⟩ orientations are strongly bound and have a lower energy than ⟨111⟩ di-SIA [69]. This is at variance with the MD EAM-type potential results, which predict ground-state ⟨111⟩ configurations for all SIA clusters with sizes n ≥ 2 [62–64]. The EAM simulations reveal that larger SIA clusters form prismatic a/2⟨111⟩ dislocation loops that undergo rapid 1-D diffusion on their glide prism, at least in pure iron [64, 72, 73].
† The Finnis–Sinclair [70] and embedded atom method (EAM) [71] type potentials have very similar functional forms. In this article, "EAM-type potentials" is used as a general reference to results obtained in bcc Fe alloys with semi-empirical Finnis–Sinclair and EAM potentials.
Figure 4. Illustration of the migration process of a single self-interstitial atom predicted by MD simulations; the formation energies along the migration path range from about 4.87 to 5.05 eV. The lowest-energy (0 K) configuration is a ⟨110⟩ split-dumbbell. Migration occurs by rotation into ⟨111⟩ split-dumbbell configurations with 1-D translation in the ⟨111⟩ direction through the ⟨111⟩ crowdion saddle point.
Table 2. Summary of SIA formation energies (in eV) obtained from semi-empirical Finnis–Sinclair and EAM potentials, and recent ab initio results [62, 63, 65, 69]

                       Finnis–Sinclair(a)   EAM(b)   Finnis–Sinclair(c)   Ab initio(d)
E_f, ⟨110⟩             4.76                 4.33     4.87                 3.64
E_f, ⟨111⟩             4.87                 —        4.99                 4.34
E_f, ⟨111⟩ crowdion    4.91                 —        5.01                 4.34

(a) [60]; (b) [61]; (c) [63]; (d) [67]
While the 1-D diffusing SIA loops show strong correlations between individual sequences of jumps, the 1-D diffusion process can be described by a diffusion coefficient with an activation energy < 0.1 eV and a weakly size (n) dependent pre-exponential factor (≈ 1/n^(2/3)) on the order of 0.5–1 × 10^−6 m^2/s [66, 73, 74]. Generally equivalent behavior is predicted in other crystal structures as well, and perfect prismatic vacancy loops are even found to be mobile in MD-EAM simulations [75]. The high 1-D mobility of SIA cluster-loops has a profound effect on the kinetics and nature of the long-term evolution of the overall microstructure under irradiation. For example, this mechanism helps explain the apparent absence of observable dislocation loops in RPV steels irradiated to intermediate doses of 0.05 dpa. At higher doses, the sink bias between vacancies undergoing 3-D diffusion
and 1-D migrating SIA cluster-loops may enhance phenomena like void swelling [76]. It is clear that at some size the a/2⟨111⟩ SIA cluster configuration will have the lowest energy; however, what this size is, and the size-dependent mobility of the SIA clusters, are important unresolved issues. Many other details remain to be resolved as well. For example, recent MD studies have indicated SIA cluster trapping by interstitial C (> 0.6 eV) and He (> 1.0 eV) [77, 78]. In contrast, EAM MD simulations indicate that oversized substitutional Cu has little effect on SIA and SIA cluster-loop mobility [65, 66]. Other important issues are how SIA interact with other defects, including each other (discussed below), precipitate interfaces, and dislocation and grain boundary sinks. However, at the high temperatures of interest (> 500 K) at least some of these details can probably be safely ignored in microstructural evolution models. For example, as long as their mobility is much larger than that of vacancies, the precise value of the diffusion coefficient of SIA and SIA clusters is not important to overall defect balances. Hence, the focus of modeling should be on determining critical mechanisms like the dimensionality of SIA and SIA cluster diffusion and its consequences, trapping, the ability of mixed-dumbbell SIA to transport solutes (hence, to drive chemical segregation to sinks), the effective sink efficiencies and strengths for SIA clusters, and the reactions and effective reaction rates between all mobile defect species.
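The MD-derived mobility trends quoted above can be packaged as a simple Arrhenius estimate; the specific activation energy and prefactor below are illustrative choices within the cited ranges [66, 73, 74]:

```python
import numpy as np

def sia_cluster_diffusivity(n, temp_K, e_m=0.05, d0_mono=1.0e-6, kB=8.617e-5):
    """1-D diffusivity (m^2/s) of an n-SIA cluster-loop.

    Activation energy e_m < 0.1 eV and a prefactor scaling as n^(-2/3)
    from a monomer value of order 1e-6 m^2/s, per the MD results above.
    """
    return d0_mono * n ** (-2.0 / 3.0) * np.exp(-e_m / (kB * temp_K))

# At 563 K a 10-SIA loop diffuses at ~8e-8 m^2/s in this estimate --
# orders of magnitude faster than vacancies, as assumed in the text.
print(sia_cluster_diffusivity(10, 563.0))
```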
2.4. Cascade Aging and Delayed Defect Production
Returning to the issue of cascade aging: based on any reasonably assumed diffusion parameters, SIA and SIA clusters quickly (< µs) leave the cascade region unless they are strongly trapped or recombine with cascade vacancies. Recombination during the post-cooling stage has been studied by object-based kinetic Monte Carlo (KMC) methods that transport the SIA and SIA clusters within and away from the cascade while the vacancies remain immobile [63, 79–81]. Unlike in the initial cascade cooling stage, additional recombination during this phase increases with increasing PRA energy T in the range > 10 keV [63, 80], reaching values up to about 45% of the initial cascade defects for the highest cascade energies (50 and 100 keV). Table 3 shows the results of such simulations [82]. This energy dependence is primarily due to SIA and SIA clusters that escape one sub-cascade and then recombine with vacancies in the spatially correlated (nearby) sub-cascades. This mechanism has not yet been accounted for in energy-dependent defect production cross-sections, but may have important implications in developing physically based damage production models for fusion versus fission neutron spectra. The cascade cores continue to evolve (age) over much longer times by spatially and time-correlated short-range vacancy and coupled solute
Table 3. Summary of additional recombination during the initial stages of cascade aging (t < 10^−6 s) at 290 °C. The table gives the average number of Frenkel pairs formed in the cascade [58] and the average number (and percentage) of surviving vacancies. The vacancy–self-interstitial recombination radius was the lattice parameter, a0.

PRA energy   Avg. Frenkel pairs produced in cascade (MD)   Avg. surviving vacancies (KMC)
500 eV       4.2                                           3.0 (71%)
1 keV        6.4                                           4.7 (73%)
2 keV        9.4                                           6.1 (65%)
5 keV        22.0                                          13.2 (60%)
10 keV       33.9                                          20.2 (60%)
20 keV       59.3                                          38.2 (64%)
40 keV       131                                           77.5 (59%)
50 keV       168.3                                         90.9 (54%)
100 keV      332.3                                         180.1 (54%)
diffusion. This period (which can be described as delayed defect production) has been extensively studied using kinetic lattice Monte Carlo (KLMC) and object KMC techniques [80, 81, 83–85]. Both techniques track the real time associated with the cascade aging processes. The KLMC simulations show that cascade aging produces vacancy clustering, cluster migration, cluster coalescence and ultimately vacancy cluster dissolution. Interactions between vacancies and solutes (such as Cu) enhance the formation of vacancy–cluster complexes that are also mobile and often grow by cluster coalescence [83, 84]. Ultimately, most of the primary vacancies leave the cascade region, but some may form the nuclei of larger clusters that continue to grow by long-range vacancy and solute diffusion. The time scale for cascade aging depends on the irradiation temperature; it overlaps with long-range diffusion processes below around 300 °C. As described in more detail in the companion article on Monte Carlo by G. Gilmer, the KLMC simulations require the interaction and migration activation energies for vacancies and solutes. The Boltzmann-weighted MC exchange probabilities depend on the local vacancy environment. The simplest possible approach uses pair-bond models to compute lattice site energies; most simulations have assumed that the vacancy–solute activation barrier is increased over that for Fe by half the difference in the lattice site energies. EAM potentials and MD have been used to derive ground-state and activation energies in the Fe–Cu system [85–87]. Ultimately, relaxed ab initio simulations could be used to derive the many-body lattice site and activation energies for vacancy–solute (e.g., Cu, Mn, Ni, Si, Cr, He, . . . )–solvent (Fe) configurations, somewhat akin to the cluster variational approach used to model alloy phases [88]. Several workers have used nearest-neighbor and EAM potentials to examine the effect of the local many-solute-atom environment on the activation energy for vacancy exchanges and used this information in KLMC to simulate precipitation of coherent clusters in systems with a
single vacancy, showing that adding such detail results in large changes in the kinetic decomposition paths [89–92]. An example of a KLMC simulation of cascade aging is shown in Fig. 5 for a 50 keV cascade in an Fe–0.3%Cu alloy at 300 °C. This figure shows the vacancy–Cu cluster evolution starting at 1 ns and ending at more than 10^6 s. In order to simulate the enormous range of times in the KLMC simulations, special rescaling–annealing algorithms were developed [83]. The red (dark) circles show the positions of vacancies; the green (light) circles show the positions of those Cu atoms that are nearest neighbors (clusters) to one or more Cu atoms or vacancies. By 20 ms, 14 vacancy–Cu complexes have formed containing 80% of the initial vacancies, while the remaining 20% of the vacancies have left the cascade region. The cluster complexes are thermodynamically unstable and dissolve by vacancy emission, depending on the irradiation temperature as well as their size and composition. However, small cluster complexes are also very mobile, and between 20 ms and > 10^5 s diffusion–coalescence processes eventually lead to the formation of just one or two larger cluster complexes. Small migrating complexes also getter additional Cu, which increases the cluster binding energy and thereby decreases the vacancy emission rates from clusters.
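The pair-bond barrier recipe used in such KLMC simulations is compact. In the sketch below, the pure-Fe vacancy migration barrier and the attempt frequency are illustrative values, not fitted parameters:

```python
import numpy as np

def vacancy_exchange_rate(dE_site, temp_K, e_m_fe=0.65, nu0=1.0e13, kB=8.617e-5):
    """Rate (1/s) of a vacancy-atom exchange in a pair-bond KLMC model.

    Activation energy = pure-Fe migration barrier plus half the
    (final - initial) lattice site energy difference dE_site (eV),
    as described in the text.
    """
    e_a = max(e_m_fe + 0.5 * dE_site, 0.0)   # barrier cannot go negative
    return nu0 * np.exp(-e_a / (kB * temp_K))

# A residence-time KMC step then selects a jump with probability
# proportional to its rate and advances the clock by -ln(u)/sum(rates),
# which is how these simulations track real aging time.
```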
Figure 5. Kinetic Monte Carlo simulation results of the vacancy–Cu evolution at (a) 1 ns, (b) 9 ms, (c) 20 ms, (d) 5 s, (e) 135 s, and (f) 1.35 × 10^6 s following the production of a 50 keV displacement cascade in an Fe–0.3% Cu alloy. Red (dark) circles show the positions of vacancies, green (light) circles the clustered Cu atoms.
The largest nanovoids contain up to several tens of vacancies, with some Cu segregated to their surfaces. Cluster diffusion–coalescence processes compete with dissolution by vacancy emission, but the rates of both processes decrease rapidly with increasing cluster size and Cu content. Eventually, the single or few large clusters fully dissolve; in this case the last vacancy leaves the cascade region at 1.35 × 10^6 s. Notably, some high-energy cascade simulations have shown cascade lifetimes approaching 10^9 s. Small Cu clusters are left in the wake of the vacancies emitted from cluster complexes during the various stages of cascade aging. The preceding discussion dealt with the birth-to-death cycle of isolated cascades. However, as in the example given above, when the time scale of cascade aging becomes comparable to long-range diffusion processes and new overlapping cascade production, these processes must be accounted for. For example, the more stable vacancy–solute cluster complexes and residual solute clusters will continue to grow by long-range diffusion of vacancies and solutes, respectively. Assuming a fusion reactor flux of ≈ 10^19 n/(m^2·s) and a cascade production cross-section of 2 × 10^−28 m^2/atom, cascades will overlap within a 5 nm cascade core dimension approximately once every ≈ 10^4 s. KLMC simulations of the overlap of the vacancy-rich cores of cascades result in more numerous and smaller vacancy–Cu clusters and less escape of isolated vacancies. Figure 6 shows a comparison of vacancy–Cu clusters formed in an Fe–0.3% Cu alloy at 290 °C and a dose of approximately 0.4 mdpa, for
Figure 6. Comparison of vacancy–Cu clusters formed at about 0.4 mdpa, with new defect (cascade) introduction at (a) 10^−12 dpa/s and (b) 10^−9 dpa/s; vacancies and "clustered" Cu atoms are shown.
new damage (cascade) introduction at a rate of about 10^−12 versus 10^−9 dpa/s. At the lower dose rate, the time between the arrivals of new overlapping cascades is long enough that the remnants of previous cascades are nearly completely dissolved. In contrast, at the higher dose rate, the vacancy–Cu complex remnants act as sinks for newly created cascade vacancies, thereby reducing the fraction of vacancies that escape the cascade region. These nonlinear interactions result in smaller but more numerous vacancy–Cu complexes. These simulations overestimate vacancy survival and clustering under cascade overlap conditions, since they do not account for recombination with the SIA and SIA clusters in the new overlapping cascades; object KMC simulations of the cascade overlap recombination phase will be used to refine these simulations in the near future. Cu is a surrogate for other solutes, such as Mn, Ni and Si in RPV alloys, that also bind to vacancies and form cluster complexes. Simulations show that a higher total active solute concentration (1 to > 2%) leads to smaller but more numerous vacancy cluster–solute complexes with somewhat shorter lifetimes compared to the larger cluster complexes formed in more dilute alloys. These results are consistent with positron annihilation lifetime studies of complex steels and Fe–Mn–Cu model alloys versus simple Fe–Cu binaries [93]. Over longer times the residual solute clusters continue to evolve and are the likely source of loosely aggregated Mn, Ni and Si matrix features. These so-called matrix features are primarily responsible for hardening in low- and no-Cu RPV steels. Further, these features are likely nucleation sites for well-formed Mn–Ni–Si rich phases that grow by long-range diffusion of these elements when they are present in sufficiently high concentrations. These so-called late blooming phases are discussed in Section 4. The time-scale overlap of cascade aging, where spatially and time-correlated processes are important, with both long-range diffusion and multiple local cascade events presents a significant modeling challenge. Further, a good method for coupling these atomistic results to mean-field cluster dynamics diffusion–reaction simulations is not yet in hand. One possible approach is to use delayed defect production cross-sections based on the analysis of extensive libraries of aged cascades, including the effects of cascade overlap. However, even with a good database, this approach may be cumbersome to implement and lead to ambiguities that are difficult to resolve. MC methods provide a more direct approach to simulating larger volumes and longer times. Certainly, the rapid increases in computing power and parallel software, coupled with methods such as domain decomposition, will make MC the method of choice for such simulations in the future. These include both object- and event-based MC, described in the companion article by George Gilmer and in the literature [94]. The effects of misfit strain and long-range strain field interactions present a particular challenge. We have recently begun applying a fast-multipole-based MC technique to treat such interactions efficiently.
3. Long-Term Microstructural Evolution
As described in the previous sections, primary cascade production processes are very rapid, but, depending on the irradiation temperature, subsequent cascade aging processes may occur over long periods of time. In general, however, long-term microstructural evolution takes place primarily by coupled long-range diffusion of defects and solutes. While a detailed discussion is beyond the scope and space available in this article, the most important processes can be briefly summarized as follows:

• Annihilation of mobile defect species at sinks, including dislocations and grain boundaries. The sink strengths are generally different for SIA/SIA clusters and vacancies. Such sink bias can arise from strain field diffusion-drift interactions, differences in local defect annihilation processes, and one- versus three-dimensional SIA cluster diffusion.

• Clustering of insoluble He (produced by n, α reactions) to form gas bubbles that can act as nucleation sites for both voids and grain boundary creep cavities. Bubbles are stable in the sense that they grow only with the addition of gas atoms. However, a sink-bias-driven excess flux of vacancies relative to SIAs transforms bubbles that have grown beyond a critical size, r∗, into unstably growing voids or creep cavities.

• Void swelling and network dislocation climb, annihilation and production from loop unfaulting, leading to evolved dislocation substructures, again due to bias-driven imbalances between SIA and vacancy fluxes.

• Driven non-equilibrium chemical radiation-induced segregation (RIS) or desegregation due to coupling of solutes to persistent defect fluxes to fixed sinks.

• Long-range diffusional aggregation of solutes forming a wide range of equilibrium and non-equilibrium precipitate phases due to radiation-enhanced diffusion (RED) in lower-temperature regimes that are normally kinetically inaccessible under thermal aging conditions.

The long-term microstructural evolution and its consequences are governed by the mechanisms and microstructures controlling the transport and fate of defects coupled to He, solutes and impurities. These processes, in turn, depend on a large number of atomic-scale processes, such as: He diffusion, trapping and emission from features in the matrix, on dislocations and on grain boundaries; He interactions with other mobile defects; and the properties of small He–vacancy clusters. These atomic-scale mechanisms and pertinent parameters can be evaluated by various MC techniques, EAM-MD simulations and, in principle, ab initio methods coupled to diffusion–reaction cluster dynamics models. The final sections describe two examples of modeling microstructural evolution relevant to (i) RPV embrittlement, and (ii) dislocation loop evolution at intermediate dose and temperature.
4. Nanoscale Precipitation in Irradiated RPV Steels
As noted previously, irradiation embrittlement of RPV steel has been most commonly characterized by shifts in a Charpy transition temperature (ΔT) marking a specified energy index (41 J). The ΔT is due to the corresponding irradiation hardening, usually represented by increases in the yield stress (Δσ_y) produced by ultrafine nm-scale precipitates and defect cluster complexes that evolve under irradiation. Micromechanical models are consistent with the empirical observation that ΔT ≈ C_c Δσ_y, where C_c ≈ 0.65 ± 0.15 °C/MPa. The Δσ_y can be related to the size distribution, number density and dislocation obstacle strengths of the mix of hardening features [13]. Models of microstructural evolution have been described in a series of publications [5, 7, 8, 13, 14, 18–21, 95] and will not be repeated in the following paragraphs except as necessary.

It has been common to divide the modeling of nm-scale features into those associated with Cu, which remains highly supersaturated following typical RPV heat treatments, and those that evolve in both Cu bearing (Cu > ≈0.075%) and Cu free (Cu < ≈0.075%) steels. In Cu-bearing steels, radiation enhanced diffusion greatly accelerates Cu clustering and the formation of coherent (bcc) Cu-rich transition phase precipitates (CRPs) alloyed with Mn, Ni and smaller quantities of other elements. Based on the assumption that radiation enhanced diffusion controlled Cu clustering is the rate-controlling step, the CRP kinetics can be treated with mean field cluster dynamics (CD) models of the time evolution of the number densities N_j of clusters containing j = 2, ..., n_max Cu atoms, as

$$\frac{dN_j}{dt} = \alpha_{j+1} N_{j+1} + \beta_{j-1} N_{j-1} - (\alpha_j + \beta_j) N_j, \qquad j = 3, \ldots, n_{max} - 1 \qquad (3)$$

Here α_j and β_j are the Cu emission and impingement rates, respectively. Slightly different forms of Eq. (3) are needed for N_1, N_2 and N_{n_max} to complete the set of n_max coupled ODEs. These equations can be numerically integrated to compute all the N_j(t). Since they are not particularly stiff, and since large sets of equations can readily be integrated (note n_max = 10 000 corresponds to a CRP r_max ≈ 3 nm), there is no computational barrier to full CD simulations of Cu clustering. The physics is subsumed into the coefficients α and β. Assuming simple diffusion controlled kinetics, pure Cu precipitates and the capillary approximation,

$$\alpha_j \approx \frac{4\pi r_j D^*_{Cu} X_{ce}}{V_a} \exp\left(\frac{2\gamma_{pm} V_a}{r_j k T}\right) \qquad (4.a)$$

$$\beta_j \approx \frac{4\pi r_j D^*_{Cu} X_{cm}}{V_a} \qquad (4.b)$$
Here X_ce is the equilibrium fraction of Cu in the ferrite matrix in equilibrium with a bcc phase, X_cm is the remaining dissolved Cu fraction, γ_pm is the effective CRP–matrix interface energy, and V_a is the atomic volume of Cu. These parameters can be determined from ab initio or EAM potential based simulations, and can be measured or estimated from experiment. Typical values used in the models are given in Table 4. D*_Cu is the radiation enhanced diffusion coefficient, which must be modeled separately. Within the assumptions of the model, D*_Cu simply sets the time, or φt-dpa dose, scale for precipitation. Integration of Eq. (3) predicts the time/fluence-dependent evolution of the pure Cu CRPs, N(φt, r_p).
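Equations (3) and (4) lend themselves to a compact numerical implementation. The following Python sketch integrates the coupled ODEs using the nominal parameters of Table 4; the boundary closures for N_1, N_2 and N_{n_max}, the assumed solubility X_ce, and the small n_max are illustrative simplifications, not the published model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Nominal inputs (Table 4 and text). X_ce and n_max are illustrative
# assumptions; the published simulations use n_max up to 10,000.
kB = 1.381e-23                  # Boltzmann constant [J/K]
T = 290.0 + 273.15              # irradiation temperature [K]
V_a = 1.17e-29                  # atomic volume of Cu [m^3]
g_pm = 0.4                      # CRP-matrix interface energy [J/m^2]
D_cu = 1.0e-22                  # nominal D*_Cu [m^2/s]
X_ce = 1.0e-5                   # assumed equilibrium Cu fraction
X_0 = 0.004                     # Fe-0.4%Cu alloy
n_max = 300

j = np.arange(1, n_max + 1)                            # cluster sizes
r = (3.0 * j * V_a / (4.0 * np.pi)) ** (1.0 / 3.0)     # radii r_j [m]
gibbs = np.exp(2.0 * g_pm * V_a / (r * kB * T))        # capillary factor

def dNdt(t, N):
    """Master equation, Eqs. (3)-(4); N_j are cluster atomic fractions."""
    beta = 4.0 * np.pi * r * D_cu * N[0] / V_a           # impingement, Eq. (4.b)
    alpha = 4.0 * np.pi * r * D_cu * X_ce * gibbs / V_a  # emission, Eq. (4.a)
    dN = np.zeros_like(N)
    dN[2:-1] = (beta[1:-2] * N[1:-2]
                - (alpha[2:-1] + beta[2:-1]) * N[2:-1]
                + alpha[3:] * N[3:])                     # Eq. (3)
    # Simplified closures for N_2 and N_nmax (assumptions):
    dN[1] = 0.5 * beta[0] * N[0] - (alpha[1] + beta[1]) * N[1] + alpha[2] * N[2]
    dN[-1] = beta[-2] * N[-2] - alpha[-1] * N[-1]
    dN[0] = -np.dot(j[1:], dN[1:])                       # Cu mass conservation
    return dN

N0 = np.zeros(n_max)
N0[0] = X_0                                              # all Cu dissolved at t = 0
sol = solve_ivp(dNdt, (0.0, 3.0e8), N0, method="BDF", rtol=1e-6, atol=1e-20)

Nj = sol.y[:, -1]
big = j > 3                                              # count j > 3 as CRPs
Np = Nj[big].sum() / V_a                                 # number density [m^-3]
rp = (Nj[big] * r[big]).sum() / Nj[big].sum()            # mean radius [m]
print(f"Np = {Np:.2e} m^-3, <rp> = {rp * 1e9:.2f} nm")
```

Because the system is only mildly stiff, an implicit integrator such as BDF handles the full set comfortably, consistent with the remark above that there is no computational barrier to full CD simulations.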
The enhanced diffusion of Cu under irradiation is primarily due to the corresponding excess concentration of vacancies. D*_Cu can be estimated using a standard steady-state rate theory (SRT) model [12] as

$$D^*_{Cu} \approx K(\phi, T, S_t, \ldots)\,\phi + D_{Cu} \qquad (5)$$
Here K is the RED factor that depends on the total defect sink strength S_t, as well as solutes and other features that trap vacancies, promoting vacancy–SIA recombination, and D_Cu is the thermal Cu diffusion coefficient. Both K and D_Cu depend on the interactions between vacancies and Cu and the corresponding vacancy jump frequencies in the vicinity of Cu. The RED factor can be modeled by SRT as follows. Assuming steady state vacancy (X_v) and SIA (X_i) concentrations (atomic fractions) and ignoring bias effects and vacancy trapping, the defect balance can be expressed as

$$G_v - D_v X_v S_{tv} - X_v X_i (D_v + D_i) R = 0 \qquad (6.a)$$

$$G_i - D_i X_i S_{ti} - X_v X_i (D_v + D_i) R = 0 \qquad (6.b)$$
Here G_v = G_i = G = σ_v φ are the vacancy and SIA generation rates and σ_v is the vacancy production cross-section, S_tv = S_ti is the total sink strength, D_v
and D_i are the vacancy and SIA diffusion coefficients, and R = 4π r_r/V_a is the vacancy–SIA recombination parameter, where V_a is the atomic volume and r_r is the recombination radius.

Table 4. Nominal values used in the mean field cluster dynamics calculations of radiation enhanced copper diffusion and copper precipitation kinetics

Parameter                                      Value
Burgers vector (b)                             0.248 nm
Vacancy production cross-section (σ_v)         0.6 × 10^−25 m^2
Recombination & trapping radius (r_v)          0.57 nm
CRP–matrix interface energy (γ_pm)             0.4 J/m^2
Atomic volume (V_a)                            1.17 × 10^−29 m^3
Vacancy diffusion pre-factor (D_v,0)           5 × 10^−5 m^2/s
Vacancy migration energy (E_m,0)               125 kJ/mol
Trap–vacancy binding energy (H_b)              30 kJ/mol
Total sink strength (S_t)                      2 × 10^14 m^−2
Trap concentration (X_t)                       0.03

If recombination is ignored,

$$D_v X_v = D_i X_i = \frac{G}{S_t} \approx \frac{D^*_{sd}}{f_c} \qquad (7)$$
where D*_sd is the radiation enhanced solvent (Fe) self-diffusion coefficient and f_c is the self-diffusion correlation factor (≈1). Considering recombination, and assuming D_v ≪ D_i,

$$D_v X_v = \frac{f_t(T, \phi, S_t)\, G}{S_t} \qquad (8.a)$$

$$f_t = \frac{2}{\eta}\left[(1 + \eta)^{1/2} - 1\right] \qquad (8.b)$$

$$\eta = \frac{16\pi r_r G}{V_a D_v S_t^2} \qquad (8.c)$$
Here f_t is the fraction of vacancies that survive recombination with SIAs and reach sinks. However, recombination is greatly enhanced if vacancies are strongly bound to a high concentration (X_t) of solute trapping sites. Assuming that a solute trap is limited to one bound vacancy and that a small fraction of traps are occupied (X_t ≫ X_tv),
$$G + \frac{X_{tv}}{\tau_t} - D_v X_v \left[S_t + \frac{4\pi (r_t X_t + r_r X_v)}{V_a}\right] = 0 \qquad (9.a)$$

$$\frac{4\pi r_t X_t D_v X_v}{V_a} - \frac{X_{tv}}{\tau_t} - \frac{4\pi r_t X_{tv} D_v X_v}{V_a} = 0 \qquad (9.b)$$

$$\tau_t \approx \frac{b^2}{D_v \exp(-H_b/RT)} \qquad (9.c)$$
Here, r_t is the trap capture radius, τ_t is the average trapping time, H_b is the trap–vacancy binding energy and b (= 0.248 nm) is the atomic spacing. Solute vacancy binding energies are typically in the range of about 5–30 kJ/mol [96]. However, the effective H_b may be even higher. Equation (9) can be solved for the φ corresponding to a specified f_t(φ, T_i, S_t, X_t, H_b) as
$$\phi(f_t) = \frac{S_t \left(\dfrac{1}{f_t} - 1\right)}{\sigma_v \left(\dfrac{4\pi r_t \tau_t}{V_a}\right)\left[\dfrac{4\pi r_t X_t f_t}{V_a S_t} + f_t - 1\right]} \qquad (9.d)$$
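A short numerical check of the recombination model is straightforward. The Python sketch below evaluates f_t from Eqs. (8.a)–(8.c) with the Table 4 parameters, reproducing the qualitative flux and temperature dependence plotted in Fig. 7a; note it omits the trapping enhancement of Eq. (9), so it is a sketch of the no-trapping limit only.

```python
import numpy as np

# Nominal parameters from Table 4 (SI units)
sigma_v = 0.6e-25    # vacancy production cross-section [m^2]
r_r = 0.57e-9        # recombination (and trapping) radius [m]
V_a = 1.17e-29       # atomic volume [m^3]
D_v0 = 5.0e-5        # vacancy diffusion pre-factor [m^2/s]
E_m = 125.0e3        # vacancy migration energy [J/mol]
S_t = 2.0e14         # total sink strength [m^-2]
R_gas = 8.314        # gas constant [J/(mol K)]

def f_t(phi, T_K):
    """Fraction of vacancies surviving recombination, Eqs. (8.a)-(8.c)."""
    D_v = D_v0 * np.exp(-E_m / (R_gas * T_K))
    G = sigma_v * phi                                     # generation rate [dpa/s]
    eta = 16.0 * np.pi * r_r * G / (V_a * D_v * S_t**2)   # Eq. (8.c)
    return 2.0 * (np.sqrt(1.0 + eta) - 1.0) / eta         # Eq. (8.b)

phi = np.logspace(14, 20, 7)        # neutron flux [n/(m^2 s)]
for T_C in (270, 290, 320):
    print(T_C, np.round(f_t(phi, T_C + 273.15), 3))
```

At a given temperature, f_t ≈ 1 at low flux (sink dominated) and falls off roughly as 1/√φ at high flux (recombination dominated), as discussed below.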
For dilute alloys, the simplest way to model D*_Cu is in terms of the radiation enhanced self-diffusion coefficient, D*_sd, as

$$D^*_{Cu} \approx D^*_{sd} \left[\frac{D_{Cu}}{D_{sd}}\right] \approx \left(\frac{G f_t}{S_t} + D_{sd}\right)\left[\frac{D_{Cu}}{D_{sd}}\right] \qquad (10)$$
Here D_sd is the thermal self-diffusion coefficient. The total Cu precipitation is proportional to D*_Cu t = f(φt). Note that there are two sources of dose rate (φ) effects in this formulation. If, as is typically the case, G_v f_t/S_t ≫ D_sd, then D*_Cu t = σ_v φt f_t/S_t, and any dose rate dependence of precipitation at a given φt is contained in the f_t recombination term. At low dose rates, in the sink dominated regime, f_t ≈ 1 and D*_Cu t is independent of dose rate. At higher dose rates, in the recombination dominated regime, f_t scales as ≈1/√φ. This means that the φt needed to produce a given amount of precipitation increases with increasing dose rate. At still higher dose rates, transient cascade vacancy clusters become the dominant defect sink and
f_t scales as ≈1/φ; in this case the precipitation depends on time, t, but not φt. This is also the case at very low dose rates, in the thermal diffusion dominated regime, where D_sd ≫ G f_t/S_t. More generally, f_t varies continuously with dose rate, scaling as φ^−p, where p varies between 0 and 1.

Figure 7a. The fraction of vacancies that escape recombination and reach sinks, as a function of irradiation temperature and neutron flux.

Figure 7a shows f_t in the sink and recombination dominated regimes as a function of φ and T_i for the base model parameters given in Table 4. Figure 7b shows the φ corresponding to f_t = 0.5 versus H_b for various X_t and S_t.

Figure 7b. The flux at which the recombination fraction equals 50%, as a function of trap concentration, binding enthalpy and total sink density.

The main advantage of framing RED in the form of Eq. (10) is that all the key atomic scale diffusion processes (which depend on the various vacancy–solute interaction energies and the corresponding jump frequencies) are lumped into the [D_Cu/D_sd] term. Experimental estimates of [D_Cu/D_sd] are available at high temperatures, but they must be extrapolated to the ≈300 °C range pertinent to RED Cu precipitation. Notably, however, the extrapolated [D_Cu/D_sd] ratio is much less sensitive to various uncertainties than either D_Cu or D_sd.
Further, the jump frequencies that govern [D_Cu/D_sd] can be estimated from atomistic calculations based on MD and EAM type potentials, or even, in principle, using ab initio methods. The jump frequencies can be used in analytical models of [D_Cu/D_sd], including the effects of alloy composition. Alternatively, the jump frequencies can be used in KLMC simulations to extract the diffusion coefficients. Such a formulation provides a good example of an effective way to bridge the gap between atomistic simulations and other types of models. Estimates of [D_Cu/D_sd] based on fits to precipitation and hardening data in typical RPV steels suggest values of ≈50 at around 300 °C for the nominal values of σ_v and S_t in Table 4; these are within an order of magnitude of the atomistic estimates [18].

Figure 8 shows the results of a cluster dynamics Cu precipitation simulation for an Fe–0.4%Cu alloy irradiated at 290 °C. The results of the CD model are expressed in terms of the CRP number density (j > 3), N_p, average radius, r_p, and volume fraction f_p as a function of φt, for a nominal D*_Cu = 10^−22 m^2/s.

Figure 8. CD model prediction of the nucleation, growth and coarsening evolution of Cu precipitate number density (N_p), mean radius (r_p) and volume fraction (f_p) in an Fe–0.4% Cu alloy irradiated at 290 °C.

Note, in principle, the CD models need not involve any adjustable parameters. The model shows overlapping stages of nucleation, growth and coarsening. Overall the predictions are in good semi-quantitative agreement with experimental observations. However, the simple CD models, based on assuming diffusion controlled kinetics and a constant D*_Cu, differ modestly in some details from experimental observations for both thermal and RED precipitation. Possible reasons for the disparity include (i) uncertainties with extrapolating
thermodynamic and capillary-type concepts to the atomic scale, (ii) complex, non-uniform precipitate structures at the atomic scale, (iii) precipitates alloyed with Mn, Ni, Si, P, and even consisting of Mn–Ni rich phases at high alloy Mn and Ni concentrations (promoted by lower T and alloy Cu), (iv) excess free energy contributions from misfit coherency strains (or strain gradients), (v) a continuum range of inter-related features, from vacancy cluster–solute complexes to CRPs and MNPs, (vi) complex correlated diffusion processes associated with strong vacancy–solute interactions in semi-dilute alloys, and (vii) evolution of the RED coefficient with defect sink and vacancy trap evolution. Space does not permit a full discussion of these issues, but a brief discussion of the thermodynamic and LMC treatment of precipitate composition and chemical structure will be presented.

Mean field thermodynamics can model the average composition of the precipitates as a function of the alloy composition. This requires evaluating the chemical potential (µ_i) and corresponding activity (a_i = exp[(µ_i − G_i^o)/RT]) of each species (Cu, Mn, Ni, Fe, ...) in both the matrix (m) and precipitate (p) phases, where G_i^o is the free energy of pure element i. All constituents are allowed to flow to (a_im > a_ip) or from (a_im < a_ip) the precipitate until quasi-equilibrium is established at the appropriate level of solute partitioning. For example, a significant amount of Cu in solution has a very high activity (a_cm ≫ 1) compared with that in a pure Cu precipitate (a_cp = 1). Thus the matrix Cu must decrease to very low values to reach the condition a_cm = a_cp ≈ 1. Evaluations of µ_i are based on the standard definition:

$$\mu_i = \left(\frac{\partial G_t}{\partial n_i}\right)_{T,\, n_{j \neq i}} \qquad (11)$$
Here, G_t is the total free energy of the precipitate or matrix mixture. Evaluating the µ_i requires modeling the corresponding molar free energy (G) of the mixture as a function of temperature and composition. The G(X_A, X_B, ..., T) can be determined from regular or sub-regular solution models with empirical excess free energy (G_ex), enthalpy (H_ex) and entropy (S_ex) of mixing (G_ex = H_ex − T S_ex) and lattice change energy terms (G_st) taken from compilations such as CALPHAD [95] (www.calphad.org), in addition to the ideal solution terms. For the precipitate phase,

$$G = \sum_i X_i \left[G_{o,i} + G_{st,i} + RT \ln X_i\right] + H_{ex} - T S_{ex} + 4\pi r_p^2\, \gamma_{pm} \qquad (12)$$
G_{o,i}, n_i and X_i are the free energy, number of moles and mole fraction of the i'th element. The H_ex derives from differences between the bonding energies of like (e.g., Fe–Fe, Cu–Cu) and unlike (e.g., Fe–Cu) atoms. The binary
interaction between A and B atoms is typically given by a sub-regular solution model (e.g., www.calphad.org) as

$$H_{ex} = X_A X_B \left[X_A L_A(T) + X_B L_B(T)\right] \qquad (13)$$
Here the L_A and L_B are tabulated polynomial functions of T (over specified ranges) for various crystal structures; they are most often derived from fits to experimental binary phase diagrams. Analogous empirical analytic expressions exist for S_ex. For a regular solution, L_A = L_B is a constant independent of temperature and S_ex = 0 [97]. The G evaluations can be extended to a larger number of constituents by summing the contributions from the binary (e.g., Cu–Mn) and higher order (e.g., Cu–Mn–Ni–...) interaction terms; however, generally, only the binary interaction terms are available. A further limitation is that there may not be information for the appropriate crystal structure, as for the bcc binary Cu–Ni phase. The free energy contributions of the composition dependent precipitate–matrix interface energy (γ_pm) to µ_i must also be considered for nm-scale precipitates. The chemical energy contribution to γ_pm for a coherent interface can be approximated in terms of a regular solution pair bonding model [5, 95]
$$\gamma_{pm} = \frac{2}{3} \sum_i \frac{H_{s,i}}{A_i} \frac{z_b}{z} \left(X_{ip} - X_{im}\right)^2 \qquad (14)$$
Here, H_{s,i} is the heat of solution for solute i, A_i is the area per atom in the interface, z_b is the number of bonds across the interface (≈2) and z is the atomic coordination (= 8). The factor 2/3 is an adjustment to account for the observation that the simple pair bond model for γ_pm is typically about 50% higher than better experimental and theoretical estimates. The model predicts γ_pm ≈ 0.4 J/m^2 for a pure Cu precipitate. Theoretical estimates of γ_pm can be obtained from MD simulations using Fe–Cu EAM potentials, or ab initio calculations. However, the main advantage of the simple pair bonding model is that it can be readily extended to interfaces between phases with multiple components (e.g., Fe, Cu, Ni, Mn, ...). Since H_s is much lower for Mn and Ni than for Cu, these elements are more enriched at small precipitate sizes than in the corresponding bulk phases. The lower γ_pm and higher Mn and Ni solute concentrations are predicted to promote the nucleation and growth of a higher number density of precipitates, consistent with observation; P, and perhaps Si, also appear to play a similar role. The bulk phase boundaries can be determined by setting γ_pm = 0. Note the corresponding composition dependent coherency strain energy should also be considered, but this generally smaller effective contribution has not been included in the models to date.
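The pair-bond estimate of Eq. (14) is simple enough to evaluate directly. The sketch below implements it for a single solute; the heat of solution is an assumed placeholder back-solved so that a pure Cu precipitate returns ≈0.4 J/m^2, and the interface area per atom is taken for an Fe {100} facet.

```python
import math

def gamma_pm(H_s, A, dX, z_b=2, z=8):
    """Chemical interface energy per Eq. (14) for one solute, including
    the empirical 2/3 adjustment factor.
    H_s: heat of solution [J/atom]; A: interface area per atom [m^2];
    dX: composition difference (X_ip - X_im)."""
    return (2.0 / 3.0) * (H_s / A) * (z_b / z) * dX**2

eV = 1.602e-19
H_s_cu = 1.2 * eV                 # assumed value, chosen to give ~0.4 J/m^2
A_100 = (0.287e-9) ** 2           # area per atom on a bcc Fe {100} facet
print(gamma_pm(H_s_cu, A_100, dX=1.0))   # -> ~0.4 J/m^2
```

With smaller H_s values for Mn and Ni, the same function immediately shows why interfaces enriched in those elements have lower γ_pm, as stated above.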
The thermodynamic models predict the existence of Mn–Ni phases even in Cu free steels, as well as MNPs in Cu-bearing alloys. Note these Mn–Ni rich phases are favored below ≈350 °C, where normal thermal aging kinetics is so slow that MNPs would not be observed experimentally. However, once nucleated, RED would result in large volume fractions of the corresponding MNPs, leading to severe embrittlement. Nucleation calculations indicate that Cu is very effective in promoting (catalyzing) MNP formation due to its high super-saturation, even at relatively small concentrations. In steels with Cu > 0.05–0.1%, Cu readily clusters along with Mn and Ni. In Cu-bearing alloys without Ni, the thermodynamic models predict that Mn will be enriched in the precipitates to X_Mn = 0.1–0.2. However, Ni strongly interacts with Mn; hence, when present in the steel, Ni is enriched in the precipitates as well. The models predict X_Ni/X_Mn ratios between approximately 0.5 and 1, increasing with alloy Ni (and Mn) concentrations. In medium to high Cu alloys, the X_Cu/X_Mn ratios are approximately 3 to 1, depending on the Ni content. The larger volume fractions and higher number densities of the CRPs and MNPs result in much larger hardening that increases rapidly with increasing Ni and Mn. For example, the peak Δσ_y in alloys with 0.4Cu and 1.6Mn is about 60 MPa with 0.0Ni versus about 270 MPa with 1.6Ni [18]. Thus the thermodynamic models rationalize the strong synergistic effect between Cu, Ni and Mn in irradiation hardening and embrittlement. The model-based predictions of MNPs in high Ni and Mn Cu-bearing steels have been experimentally confirmed in numerous subsequent experiments. Experimental confirmation of the thermodynamic model for Cu bearing alloys includes the effects of thermal annealing at temperatures up to 450 °C and above, which is predicted to significantly reduce the Mn and Ni contents of precipitates in Cu bearing alloys [13, 98].

Among other limitations, the mean field thermodynamic model cannot accurately treat the detailed chemical and crystallographic structure of the precipitates. For example, it is expected that Mn and Ni would segregate to the outside of polyhedral precipitates with (100) and (110) facets, thus lowering γ_pm and the total precipitate interface energy. Further, the strong bonding interactions would be expected to produce some degree of ordering in the Mn and Ni rich regions. The actual precipitates would have a range of lowest energy configurations, as modified by entropic effects. LMC methods can be applied to predict these structures. Ideally this would involve the use of rigorous many-body interaction models, or at least semi-empirical EAM type potentials. However, since such information is generally not available, regular solution pair bond energy (ε_ij) models have been derived based on thermodynamic data ([21, 95] and references 20–25 therein). The ε_ij can be estimated [97] as
$$\varepsilon_{ij} \approx \frac{G_{ex}(X_i, T)}{N_a z} + \frac{\varepsilon_{ii} + \varepsilon_{jj}}{2} \qquad (15)$$
Here G_ex(X_i, T) is the excess molar free energy of a specified mixture of i and j, z = 8 is the atomic coordination, N_a is Avogadro's number, and ε_ii and ε_jj are the like bond energies, determined from the pure element cohesive
energies. The G_ex(X_i, T) are evaluated for prototypic precipitate and matrix compositions from thermodynamic data in the literature (e.g., references 20–25 in Ref. [95]). They also contain terms, as needed, for transformation to the bcc structure. For example, G_ex(X_i, T) data is not available for the bcc phase of Cu–Ni, so 2G_ex(X_i, T)/3, obtained for the fcc phase, is used to approximate the effects of lower coordination. Other modest adjustments to obtain estimates of ε_ij in Fe–Cu–Ni–Mn–Si alloys are discussed elsewhere [95]. The total energy (E) of a particular configuration of atoms is simply the sum of all the like and unlike bond energies. Starting with a random solid solution, the Kawasaki LMC algorithm exchanges atomic positions with a Boltzmann weighted probability (P) as

$$P = \exp\left(-\frac{\Delta E}{kT}\right), \quad \text{if } \Delta E > 0 \qquad (16.a)$$

$$P = 1, \quad \text{if } \Delta E \le 0 \qquad (16.b)$$

Here ΔE is the energy difference before and after the exchange. The algorithm randomly picks atoms for possible exchanges for a large number of sweeps, until E fluctuates around a constant free energy minimum, reflecting the ensemble of precipitate configurations at a given temperature. This MC approach is essentially a regular thermodynamic solution model cast in atomistic form and thus should produce results (e.g., phase boundaries) that are generally similar to the mean field predictions. However, within the approximations of the simple pair bond model, it can provide additional atomic level detail on the chemical and crystallographic structure of nm-scale precipitates.

Some results are illustrated in Fig. 9. Figure 9a shows a typical snapshot of a partially ordered precipitate in an Fe–0.24%Cu, 0.59% Ni, 1.5% Mn, 1.0% Si alloy at 290 °C, with a Cu-rich core surrounded by a Ni–Si–Mn rich shell. The simulation is remarkably consistent with both atom probe and SANS measurements on an irradiated RPV weld with this composition [21, 95]. Figures 9b–e show the range of predicted typical precipitate structures for other alloy compositions and T, including the structure of MNPs. Since they may be slow to nucleate, the MNPs in Cu-free (and very low Cu) steels were dubbed potential "late blooming phases" that could produce severe and unexpectedly rapid embrittlement above a high incubation dose (φt or dpa). The predicted formation of large volume fractions of MNPs in very low and Cu-free steels [5, 13, 20, 21], and the corresponding high levels of hardening, has only recently been confirmed by a variety of characterization methods [22]. This excellent example of modeling leading experiment may have profound implications for the extended life of RPVs. Integrated experiments and refined models will be critical to further map the T, φ, φt, Cu, Ni, Mn regimes where MNPs may be important, and to assess the possible role of other solutes, like Si, and phases as well.
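Returning to Eq. (16): the Kawasaki exchange step is easy to implement. The sketch below is a minimal Python version on a small periodic lattice; for simplicity it uses a simple cubic lattice (z = 6 rather than the bcc z = 8 above) and placeholder pair energies chosen only to produce Fe–Cu demixing, not the fitted values of Eq. (15).

```python
import numpy as np

rng = np.random.default_rng(0)
L = 16                          # lattice edge; simple cubic stand-in for bcc
kT = 8.617e-5 * (290 + 273)     # eV, at 290 C
NBRS = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]

# Illustrative symmetric pair energies [eV] (placeholders): the positive
# mixing energy, eps_FeCu - (eps_FeFe + eps_CuCu)/2 > 0, drives demixing.
eps = np.array([[-0.40, -0.30],
                [-0.30, -0.35]])        # indices: 0 = Fe, 1 = Cu

lattice = (rng.random((L, L, L)) < 0.02).astype(int)   # ~2% Cu, random start

def bonds(i, j, k):
    """Sum of pair energies between site (i,j,k) and its 6 neighbors."""
    s = lattice[i, j, k]
    return sum(eps[s, lattice[(i+di) % L, (j+dj) % L, (k+dk) % L]]
               for di, dj, dk in NBRS)

for step in range(200_000):
    i, j, k = rng.integers(0, L, 3)                    # pick a random site
    di, dj, dk = NBRS[rng.integers(0, 6)]              # and a random neighbor
    i2, j2, k2 = (i+di) % L, (j+dj) % L, (k+dk) % L
    if lattice[i, j, k] == lattice[i2, j2, k2]:
        continue                                       # identical atoms: no-op
    e0 = bonds(i, j, k) + bonds(i2, j2, k2)
    lattice[i, j, k], lattice[i2, j2, k2] = lattice[i2, j2, k2], lattice[i, j, k]
    dE = bonds(i, j, k) + bonds(i2, j2, k2) - e0
    if dE > 0 and rng.random() >= np.exp(-dE / kT):    # Eq. (16)
        lattice[i, j, k], lattice[i2, j2, k2] = \
            lattice[i2, j2, k2], lattice[i, j, k]      # reject: swap back

# Simple clustering diagnostic: Cu atoms with at least one Cu neighbor
nn_cu = sum(any(lattice[(i+di) % L, (j+dj) % L, (k+dk) % L] == 1
                for di, dj, dk in NBRS)
            for i, j, k in np.argwhere(lattice == 1))
print(f"clustered Cu fraction: {nn_cu / max(1, lattice.sum()):.2f}")
```

Because atoms are exchanged rather than flipped, the overall composition is conserved, which is what makes the Kawasaki variant appropriate for precipitation problems.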
Figure 9. MC predictions of the atomic structure of CRP/MNPs. Bulk alloy compositions and temperatures for the simulations were (a) Fe–0.24% Cu–0.59% Ni–1.5% Mn–1.0% Si at 283 ◦ C, (b) 0.26%Cu, 0% Ni, 1.2% Mn at 260 ◦ C, (c) 0.26% Cu, 0.75% Ni, 1.2% Mn at 260 ◦ C, (d) 0.26% Cu, 1.2% Ni, 1.2% Mn at 260 ◦ C, and (e) 0.13% Cu, 0.75% Ni, 1.2% Mn at 290 ◦ C.
5. Dislocation Loop Evolution in Ferritic Alloys
TEM examination of LAMS intended for fusion first wall and blanket applications does not reveal any visible damage following low dose, intermediate temperature irradiation (<0.05 dpa at 300 °C). However, as the irradiation dose increases above ∼0.05 dpa, a significant population of dislocation loops, primarily of self-interstitial type, is experimentally observed, with b = a⟨100⟩ and b = a/2⟨111⟩. The distribution of observed loop Burgers vectors ranges from almost equal proportions to predominantly a⟨100⟩, rather than the expected and lowest energy b = a/2⟨111⟩. While this result has been known for nearly 40 years [99–101], a self-consistent mechanism to explain the presence of ⟨100⟩ loops in ferritic alloys was not established until recently [102].

MD-EAM simulations show that self-interstitials and small clusters up to tetra-interstitials diffuse three dimensionally, with intrinsic activation energies of only a few tenths of an eV [62–65]. As previously discussed, recent ab initio results raise questions about whether these SIA clusters are a/2⟨110⟩ or a/2⟨111⟩ type [68, 69]. Larger (n ≳ 5–10) self-interstitial cluster a/2⟨111⟩ dislocation loops migrate by quasi one-dimensional (1D) diffusion along their glide prism, with activation energies less than 0.1 eV [64, 72]. The 1D migration of a/2⟨111⟩ clusters is reasonably consistent with the ab initio results, which indicate very small energy differences between ⟨111⟩ dumbbell and ⟨111⟩ crowdion configurations [68, 69]. But the size at which SIA clusters transform from ⟨110⟩ to ⟨111⟩ orientations is an issue, as is solute and impurity trapping.

At damage levels relevant to fusion conditions, a⟨100⟩ dislocation loops are an important, but relatively unexplained, part of the irradiation-induced microstructure. Two mechanisms have been proposed to explain the formation and growth of ⟨100⟩ loops in α-Fe [102, 103]. The Eyre and Bullough mechanism [103] assumes that SIA clusters of a/2⟨110⟩ orientation (Burgers vector) form during irradiation and, upon reaching a critical size, shear into a more energetically preferred configuration with a Burgers vector of a⟨100⟩ or a/2⟨111⟩. However, the Eyre–Bullough model [103] does not explain why the a⟨100⟩ loops form in preference to the lower energy a/2⟨111⟩, since they involve nearly equivalent shear transformations. Further, a/2⟨110⟩ SIA clusters contain a stacking fault in the body centered cubic Fe structure. Such stacking faults have not been observed experimentally, nor are they anticipated, due to very high stacking fault energies. Recently, it has been proposed that intersections between loops could lead to a⟨100⟩ loop formation [102]. Experiments performed in the early 1960s [104] clearly established that hexagonal dislocation networks composed of a/2⟨111⟩ and a⟨100⟩ dislocation segments form in Fe. It was recognized that a⟨100⟩ loops could form as a result of the reaction [99]:

$$\frac{a}{2}[111] + \frac{a}{2}[1\bar{1}\bar{1}] \rightarrow a[100] \qquad (17)$$
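A quick way to see why reaction (17) is elastically plausible is Frank's b^2 criterion, with the squared Burgers vector as a proxy for dislocation line energy. The short check below (a sketch; the b^2 rule ignores the loop self-energy differences discussed in the text) verifies both the vector sum and the energy bookkeeping:

```python
import numpy as np

a = 0.287  # bcc Fe lattice parameter [nm]
b1 = (a / 2) * np.array([1, 1, 1])
b2 = (a / 2) * np.array([1, -1, -1])
b3 = b1 + b2                               # reaction of Eq. (17)

print(b3 / a)                              # -> [1, 0, 0], i.e., a<100>
# Frank's criterion: reaction favorable if |b1|^2 + |b2|^2 > |b3|^2
lhs = np.dot(b1, b1) + np.dot(b2, b2)      # 2 * (3/4) a^2 = 1.5 a^2
rhs = np.dot(b3, b3)                       # a^2
print(lhs > rhs)                           # True: line energy is reduced
```

The same bookkeeping (a^2 versus 3a^2/4 per loop) is why an isolated a⟨100⟩ loop nonetheless has a higher self-energy than an a/2⟨111⟩ loop of the same size.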
However, Masters discounted this possibility since a/2⟨111⟩ loops were not observed [99]. As discussed previously, MD-EAM simulations show such loops form directly in displacement cascades [58] and are highly mobile due to 1D glide on their ⟨111⟩ glide cylinder [64, 72]. As expected from continuum elasticity theory, MD-EAM simulations show that loops with an a⟨100⟩ Burgers vector have a higher self-energy than a/2⟨111⟩ loops. However, recent MS calculations using a Finnis–Sinclair potential for Fe reveal a much smaller difference in energy than expected [102], raising the possibility of metastable a⟨100⟩ loops. As shown in Fig. 10, MD simulations of interactions (collisions) between SIA dislocation loops reveal that junctions of a⟨100⟩ type do form in α-Fe, consistent with Eq. (17).
Figure 10. Sequence of MD snapshots at (a) 0, (b) 120 and (c) 430 ps, of the interaction of two a/2⟨111⟩ loops with Burgers vectors appropriate to Eq. (17) at 1000 K. The loop on the left side of the image is a perfect, hexagonal 37-SIA cluster, while the one on the right is a 34-SIA jogged hexagonal loop. After forming a ⟨100⟩ junction following the loop collision, the junction expands throughout the resulting loop.
The necessary conditions for ⟨100⟩ junction formation by Eq. (17) are that both interacting loops are larger than ≈20 SIAs and are approximately the same size [102]. When these conditions are not met, the smaller cluster always rotates into the ⟨111⟩ orientation of the larger cluster [78, 102]. These junctions are thermally (meta-)stable and can propagate across the loop through a complicated two-step mechanism described by Marian and co-workers [66].

MD simulations also reveal a mechanism for a⟨100⟩ clusters to grow to TEM observable sizes. Although potentially glissile, a⟨100⟩ loops have a very large activation energy for glide, computed to be >2.5 eV, and are effectively sessile. Notably, MD simulations of the interaction between a/2⟨111⟩{110} and a⟨100⟩{100} loops reveal rotation of the smaller a/2⟨111⟩ cluster to join the larger a⟨100⟩ loop. Thus, immobile a⟨100⟩ loops are a biased sink for absorption of both mobile SIAs and a/2⟨111⟩ loops. Figure 11 shows an MD simulation in which a 19-SIA a/2⟨111⟩ cluster is absorbed by a 50-SIA a⟨100⟩ square loop. Even though the lowest energy configuration is a 69-SIA a/2⟨111⟩ loop, the system follows the path favored by the lattice dynamics:

$$\langle 100 \rangle + \frac{1}{2}\langle 111 \rangle \rightarrow \langle 211 \rangle \rightarrow \langle 100 \rangle \qquad (18)$$
Figure 11. Sequence of MD snapshots at (a) 0.0, (b) 1.5, (c) 2.2 and (d) 3.5 ps, of the absorption of a hexagonal, 19-SIA a/2[111](110) cluster by a square, 50-SIA a[100](100) loop according to equation (18) at 100 K. Interstitials displayed in white are those belonging to the a/2[111] cluster that have rotated to an a[100] configuration.
This reaction involves rotation of individual ⟨111⟩-oriented interstitials (in the presence of ⟨100⟩ SIAs) into an intermediate metastable ⟨211⟩ configuration that rapidly rotates into the ⟨100⟩ orientation [102]. This description provides a plausible mechanism for the formation and growth of a⟨100⟩ dislocation loops in LAMS, although additional research is required to quantify the loop density evolution with irradiation conditions and to validate the formation mechanism.
6. Outlook
The effect of irradiation on materials is a classic example of an inherently multiscale problem involving multiple physical phenomena, and impacts a wide range of technologies. While much is known about the hierarchical processes that govern irradiation effects, the investigation of controlling mechanisms, refinement of key sub-models and extension of the modeling approaches to treat multi-constituent alloys is an active research area. While they are neither perfect, nor fully based on first principles, physical sub-models for the majority of key MSMP processes mediating irradiation effects in fission (RPV embrittlement) and to a lesser extent fusion reactors are now available [7, 13, 14]. Indeed, models have often led experimental observations of key embrittlement phenomena. Examples include the dominant role of RED-copper precipitation in embrittlement of RPV steels [19], the composition and structure of CRPs [20, 21, 95] and the existence of late-blooming MNPs [13, 18, 20]. More generally, existing RPV embrittlement models rationalize almost all observed embrittlement trends, including those that are counterintuitive and complex, such as seemingly contradictory effects of neutron flux [13, 105]. The integration of available sub-models into a comprehensive MSMP model for RPV embrittlement, in what is called a virtual test reactor (VTR), is being carried out in the REVE project. Reve, which stands for REactor for Virtual Experiments and means “dream” in French, is an international collaboration between a large number of institutions in Europe, the United States and Japan [106]. REVE has been led by Professor Jean-Claude Van Duysen, and Stephanie Jumel has led the code integration effort. The first integrated code RPV-1 simulator, which inputs key embrittlement variables and outputs the net corresponding yield stress increase, was recently released and is currently being calibrated and validated with large experimental databases [107]. RPV-1 links five codes and two databases contained in three modules that can be run separately. The linked codes consist of models of PRA production (SPECMIN) and sub-cascade formation (INCAS), a rate theory defect-Cu solute conservation code (MF-VISC) to simulate clustering and nanofeature evolution, a non-equilibrium thermodynamic code (DIFFG)
provided by Odette and co-workers [5], and a Foreman and Makin type model (DUPAIR) to simulate the shear stress required for dislocation penetration through a slip plane of obstacles. The component codes of RPV-1 are informed by databases of cascade structure, ms cascade aging and the strengths of individual obstacles. RPV-1 includes a user-friendly Python interface and visualization package [107]. Progress on RPV-1 led to a new program to develop RPV-2, aimed at improving the sub-model codes and physics in RPV-1 and extending the hardening model to treat changes in fracture toughness, as well as INTERN-1, a VTR devised to simulate irradiation effects in stainless steels. These new developments are being carried out in a large effort (the PERFECT project) supported by the European Commission in the 6th Framework Program. The REVE project has also been expanded in Europe to model stress corrosion cracking in Zr–Nb alloys for fuel cladding in the on-going SIRENA project. The complexity and challenge of the broader field of radiation effects involve more phenomena and properties (e.g., radiation-induced segregation, non-equilibrium phase evolution and microstructure instabilities, and their impact on properties ranging from creep rupture to fatigue crack growth). However, over the longer term, all of these issues can be dealt with in an MSMP framework. Implementation of a fully integrated MSMP model has substantial advantages. These include a direct and rigorous accounting of defect balances and solute redistribution, better treatment of highly coupled processes, such as vacancy trapping and solute RED or RIS, and the inclusion of effects related to evolving sink and trapping microstructures. Further, an integrated model provides a convenient framework for testing and evaluating the impact of alternative and improved sub-models, and a convenient tool for interpreting and analyzing data ranging from nanoscale characterization studies to quantitative statistical fits to engineering data. Steady progress will entail building a knowledge base that is far more accessible and useful (e.g., for design of new materials) than traditional approaches. As an example, a new initiative to simulate the transport, fate and consequences of He in LAMS and advanced high temperature steels has been launched as a collaboration between the University of California, Santa Barbara, the University of California, Berkeley and the Pacific Northwest National Laboratory. The simulations will encompass irradiation conditions pertinent to current experiments in fission reactors and fusion first wall and blanket structures. Similar activities are underway in Europe for simulating fusion materials performance [108]. Finally, we note that the role of advanced computational materials in the development of advanced fission and fusion energy systems was the topic of an international workshop in the spring of 2004 sponsored by the DOE Office of Science and the DOE Office of Nuclear Energy and Sciences. In their report [109], a distinguished international panel of experts endorsed a balanced
computational modeling and experimental validation approach to meeting the enormous, and indeed unprecedented, challenges of developing and predicting the performance of materials in the critical new sources of energy that will serve mankind for millennia.
Acknowledgments

The authors express their appreciation to a large number of people who have contributed to this work. In particular, we thank Drs Gene Lucas, Takuya Yamamoto, Rick Kurtz, Roger Stoller, Steve Zinkle and Randy Nanstad for many helpful discussions. Finally, we gratefully acknowledge the financial support of the US Nuclear Regulatory Commission under contracts #04-94049 and 04-01-064, the Office of Fusion Energy Sciences, US Department of Energy under Grant DE-FG02-04ER54275 at UCSB, and the Office of Fusion Energy Sciences, US Department of Energy under Grant DE-FG02-04ER54750 at UCB.
References

[1] E.P. Wigner, Report for Month Ending December 15, 1942, Physics Division. US Atomic Energy Commission Report CP-387, University of Chicago, 1942. [2] D.R. Olander, Fundamental Aspects of Nuclear Reactor Fuel Elements. U.S. DOE, 1976. [3] J. Gittus, Irradiation Effects in Crystalline Solids. Applied Science Pub. Ltd, London, United Kingdom, 1978. [4] J.T.A. Roberts, Structural Materials in Nuclear Power Systems. Plenum Press, New York, 1981. [5] G.R. Odette, Neutron Irradiation Effects in Reactor Pressure Vessel Steels and Weldments. International Atomic Energy Agency, Vienna, IAEA IWG-LMNPP-98/3, 438, 1998. [6] B.N. Singh, “Impacts of damage production and accumulation on materials performance in irradiation environments,” J. Nucl. Mater., 258–263, 18, 1998. [7] G.R. Odette, B.D. Wirth, D.J. Bacon, and N.M. Ghoniem, “Multiscale-multiphysics modeling of radiation-damaged materials: embrittlement of pressure vessel steels,” MRS Bull., 26, 176, 2001. [8] G.R. Odette, Nuclear Reactors: Pressure Vessel Steels. Encyclopedia of Materials: Science and Technology, Elsevier Science Ltd., Amsterdam, 2001. [9] B.N. Singh, N.M. Ghoniem, and H. Trinkaus, “Experiment-based modeling of hardening and localized plasticity in metals irradiated under cascade damage conditions,” J. Nucl. Mater., 307–311, 159, 2002. [10] D.J. Bacon and Y.N. Osetsky, “Multiscale modeling of radiation damage in metals: from defect generation to material properties,” Mater. Sci. Eng. A, 365, 46, 2004.
[11] G.R. Odette, T. Yamamoto, H.J. Rathbun, M.Y. He, M.L. Hribernik, and J.W. Rensman, “Cleavage fracture and irradiation embrittlement of fusion reactor alloys: mechanisms, multiscale models, toughness measurements and implications to structural integrity assessment,” J. Nucl. Mater., 323, 313, 2003. [12] A.D. Brailsford and R. Bullough, “The rate theory of swelling due to void growth in irradiated metals,” J. Nucl. Mater., 44, 121, 1972. [13] G.R. Odette and G.E. Lucas, “Recent progress in understanding reactor pressure vessel steel embrittlement,” Rad. Effects Defects Solids, 144, 189, 1998. [14] S. Jumel, C. Domain, J. Ruste, J.-C. Van Duysen, C. Becquart, A. Legris, P. Pareige, A. Barbu, E. Van Walle, R. Chaouadi, M. Hou, G.R. Odette, R.E. Stoller, and B.D. Wirth, J. Test. Eval., 30, 37, 2002. [15] E.D. Eason, J.E. Wright, and G.R. Odette, Improved Embrittlement Correlations for Reactor Pressure Vessel Steels. NUREG/CR-6551, 1998. [16] T.J. Williams and D. Ellis, Effects of Radiation on Materials: 20th International Symposium, ASTM STP 1405, S.T. Rosinski et al. (eds.), American Society for Testing and Materials, West Conshohocken, PA, p. 8, 2001. [17] G.R. Odette and G.E. Lucas, “Embrittlement of nuclear reactor pressure vessels,” J. Metals, 53, 18, 2001. [18] G.R. Odette, T. Yamamoto, and D. Klingensmith, “On the effect of dose rate on irradiation hardening of RPV steels,” Phil. Mag., in press, 2005. [19] G.R. Odette, “On the dominant mechanism of irradiation embrittlement of reactor pressure vessel steels,” Scripta Met., 17, 1183, 1983. [20] G.R. Odette, “Radiation induced microstructural evolution in reactor pressure vessel steels,” Mater. Res. Soc. Symp. Proc., 373, 137, 1995. [21] G.R. Odette and B.D. Wirth, “A computational microscopy study of nanostructural evolution in irradiated pressure vessel steels,” J. Nucl. Mater., 251, 157, 1997. [22] G.R. Odette, M.K. Miller, K.F. Russell, and B.D. Wirth, “Precipitation in neutron irradiated copper free RPV steels,” J. Nucl. Mater., submitted, 2004. [23] S.J. Zinkle and N.M. Ghoniem, “Operating temperature windows for fusion reactor structural materials,” Fusion Eng. Des., 51–52, 55, 2000. [24] K. Ehrlich, “Materials research towards a fusion reactor,” Fusion Eng. Des., 56–57, 71, 2001. [25] E.E. Bloom, S.J. Zinkle, and F.W. Wiffen, “Materials to deliver the promise of fusion power – progress and challenges,” J. Nucl. Mater., 329–333, 12, 2004. [26] S. Jitsukawa, A. Kimura, A. Kohyama, R.L. Klueh, A.A. Tavassoli, B. van der Schaaf, G.R. Odette, J.W. Rensman, M. Victoria, and C. Petersen, “Recent results of the reduced activation ferritic/martensitic steel development,” J. Nucl. Mater., 329– 333, 39, 2004. [27] A. Kimura, M. Narui, and H. Kayano, “Effects of alloying elements on the postirradiation microstructure of 9-percent Cr 2-percent W low activation martensitic steel,” J. Nucl. Mater., 191, 879, 1992. [28] F.A. Garner, M.B. Toloczko and B.H. Sencer, “Comparison of swelling and irradiation creep behavior of FCC-austenitic and BCC-ferritic/martensitic alloys at high neutron exposure,” J. Nucl. Mater., 276, 123, 2000. [29] N. Hashimoto, S.J. Zinkle, R.L. Klueh, A.F. Rowcliffe, and K. Shiba, “Deformation mechanisms in ferritic/martensitic steels irradiated in HFIR,” Mater. Res. Soc. Proc., 650, R1.10.1, 2001. [30] G.R. Odette, T. Yamamoto, and H. 
Kishimoto, “An analysis of the effects of helium on fast fracture and embrittlement of 8Cr tempered martensitic steels,” Fusion Materials Semi-Annual Progress Report, DOE/ER-0313/35, 80, 2003.
[31] G.R. Odette, “On mechanisms controlling swelling in ferritic and martensitic alloys,” J. Nucl. Mater., 155–157, 921, 1988. [32] B. van der Schaaf, D.S. Gelles, S. Jitsukawa, A. Kimura, R.L. Klueh, A. Moslang, and G.R. Odette, “Progress and critical issues of reduced activation ferritic/martensitic steel development,” J. Nucl. Mater., 283–287, 52, 2000. [33] T. Yamamoto, G.R. Odette, H. Kishimoto, and J.W. Rensman, “Compilation and preliminary analysis of an irradiation hardening and embrittlement database for 8Cr martensitic steels,” Fusion Materials Semi-Annual Progress Report, DOE/ER0313/35, 100, 2003. [34] H. Trinkaus and H. Ullmaier, “High temperature embrittlement of metals due to helium: is the lifetime dominated by cavity growth or crack growth?” J. Nucl. Mater., 212–215, 303, 1994. [35] A. Kimura, R. Kasada, K. Morishita, R. Sugano, A. Hasegawa, K. Abe, T. Yamamoto, H. Matsui, N. Yoshida, B.D. Wirth, and T. Diaz de la Rubia, “High resistance to helium embrittlement in reduced activation martensitic steels,” J. Nucl. Mater., 307–311, 521, 2002. [36] E.E. Bloom, “The challenge of developing structural materials for fusion power systems,” J. Nucl. Mater., 258–263, 7, 1998. [37] G.R. Odette and T. Yamamoto, “A Helium injector concept for irradiating fusion reactor materials at representative He/dpa ratios,” Fusion Materials Semi-Annual Progress Report, DOE/ER-0313/37, 2005. [38] G.E. Lucas, “The evolution of mechanical property change in irradiated austenitic steels,” J. Nucl. Mater., 206, 287, 1993. [39] B.N. Singh, A.J.E. Foreman, and H. Trinkaus, “Radiation hardening revisited: role of intracascade clustering,” J. Nucl. Mater., 249, 103, 1997. [40] R.O. Ritchie, J.F. Knott, and J.R. Rice, “On the relationship between critical tensile stress and fracture toughness in mild steel,” J. Mech. Phys. Solids, 21m, 395, 1973. [41] G.R. Odette and M.Y. He, “A cleavage toughness master curve model,” J. Nucl. Mater., 283–287, 120, 2000. [42] D. Rodney and G. Martin, “Dislocation pinning by glissile interstitial loops in a nickel crystal: a molecular-dynamics study,” Phys. Rev. B, 61, 8714, 2000. [43] Y.N. Osetsky and D.J. Bacon, “An atomic-level model for studying the dynamics of edge dislocations in metals,” Model. Simul. Mater. Sci. Eng., 11, 427, 2003. [44] D. Rodney, “Molecular dynamics simulation of screw dislocations interacting with interstitial frank loops in a model FCC crystal,” Acta Mater., 52, 607, 2004. [45] B.D. Wirth, V.V. Bulatov and T. Diaz de la Rubia, J. Eng. Mater. Tech., 124, 329, 2002. [46] A.J.E. Foreman and M.J. Makin, “Dislocation movement through random arrays of obstacles,” Can. J. Phys., 45, 511, 1967. [47] Y. Xiang, D.J. Srolovitz, L.-T. Cheng, and E. Weinan, “Level set simulations of dislocation-particle bypass mechanisms,” Acta Mater., 52, 1745, 2004. [48] V.V. Bulatov, “Current developments and trends in dislocation dynamics,” J. Computer-Aid. Mater. Des., 9, 133, 2002. [49] T.A. Khraishi, H.M. Zbib, T.D. De La Rubia, and M. Victoria, “Localized deformation and hardening in irradiated metals: three-dimensional discrete dislocation dynamics simulations,” Metal. Mater. Trans. B, 33B, 285, 2002. [50] X. Han, N.M. Ghoniem, and Z. Wang, “Parametric dislocation dynamics of anisotropic crystals,” Phil. Mag., 83, 3705, 2003. [51] L.R. Greenwood and R.K. Smither, SPECTER: Neutron Damage Calculations for Materials Irradiations, ANL/FPP-TM-197, 1985.
[52] R.E. Stoller and L.R. Greenwood, “Subcascade formation in displacement cascade simulations: implications for fusion reactor materials,” J. Nucl. Mater., 271–272, 57, 1999. [53] J.A. Brinkman, J. Appl. Phys., 25, 961, 1954. [54] A. Seeger, Proceedings of the Second UN International Conference on Peaceful Uses of Atomic Energy, Geneva, vol. 6, United Nations, New York, 20, 1958. [55] A.F. Calder and D.J. Bacon, “A molecular dynamics study of displacement cascades in alpha-iron,” J. Nucl. Mater., 207, 25, 1993. [56] R.E. Stoller, G.R. Odette, and B.D. Wirth, “Primary damage formation in BCC iron,” J. Nucl. Mater., 251, 49, 1997. [57] R.S. Averback and T. Diaz de la Rubia, “Displacement damage in irradiated metals and semi-conductors,” Solid State Phys., 51, 281, 1998. [58] R.E. Stoller, “The role of cascade energy and temperature in primary defect formation in iron,” J. Nucl. Mater., 276, 22, 2000. [59] C.S. Becquart, A. Souidi, and M. Hou, “Relation between the interaction potential, replacement collision sequences, and collision cascade expansion in iron,” Phys. Rev. B, 66, 134104, 2002. [60] R.E. Stoller and G.R. Odette, “Recommendations on damage exposure units for ferritic steel embrittlement correlations,” J. Nucl. Mater., 186, 203, 1992. [61] S. Jumel and J.C. Van-Duysen, “INCAS: an analytical model to describe displacement cascades,” J. Nucl. Mater., 328, 151, 2004a. [62] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, “Energetics of formation and migration of self-interstitials and self-interstitial clusters in α-iron,” J. Nucl. Mater., 244, 185, 1997. [63] N. Soneda and T. Diaz de la Rubia, “Defect production, annealing kinetics and damage evolution in a-Fe: an atomic-scale computer simulation,” Phil. Mag., A, 78, 995, 1998. [64] Y.N. Osetsky, D.J. Bacon, A. Serra, B.N. Singh, and S.I.Y. Golubov, “Stability and mobility of defect clusters and dislocation loops in metals,” J. Nucl. Mater., 276, 65, 2000. [65] J. Marian, B.D. Wirth, J.M. Perlado, G.R. Odette, and T. Diaz de la Rubia, “Dynamics of self-interstitial migration in Fe–Cu alloys,” Phys. Rev. B, 64, 094303, 2001. [66] J. Marian, B.D. Wirth, A. Caro, B. Sadigh, G.R. Odette, J.M. Perlado, and T. Diaz de la Rubia, “Dynamics of self-interstitial cluster migration in pure α-Fe and Fe–Cu alloys,” Phys. Rev. B, 65, 144102, 2002. [67] P. Ehrhart, K.H. Robrock, and H.R. Schober, In: R.A. Johnson and A.N. Orlov (eds.), Physics of Radiation Effects in Crystals, Elsevier, Amsterdam, Netherlands, 63, 1986. [68] C. Domain and C.S. Becquart, “Ab initio calculations of defects in Fe and dilute Fe–Cu alloys,” Phys. Rev. B, 65, 024103, 2002. [69] C.-C. Fu, F. Willaime, and P. Ordejon, “Stability and mobility of mono- and di-interstitials in a-Fe,” Phys. Rev. Lett., 92, 175503, 2004. [70] M.W. Finnis and J.E. Sinclair, “A simple empirical N-body potential for transition metals,” Phil. Mag. A, 50, 45, 1984. [71] M. Daw and M. Baskes, “Embedded-atom method: derivation and application to impurities, surfaces and other defects in metals,” Phys. Rev. B, 29, 6443, 1984. [72] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, “Dislocation loop structure, energy and mobility of self-interstitial clusters, in BCC iron,” J. Nucl. Mater., 276, 33, 2000.
[73] N. Soneda and T. Diaz de la Rubia, “Migration kinetics of the self-interstitial atom and its clusters in bcc Fe,” Phil. Mag. A, 81, 331, 2001. [74] Y.N. Osetsky, D.J. Bacon, A. Serra, B.N. Singh, and S.I. Golubov, “One-dimensional atomic transport by clusters of self-interstitial atoms in iron and copper,” Phil. Mag., 83, 61, 2003. [75] Y.N. Osetsky, D.J. Bacon, and A. Serra, “Atomistic simulation of mobile defect clusters in metals,” Mater. Res. Soc. Symp., 540, 649, 1999. [76] H. Trinkaus, B.N. Singh, and S.I. Golubov, “Progress in modelling the microstructural evolution in metals under cascade damage conditions,” J. Nucl. Mater., 283– 287, 89, 2000. [77] Y.N. Osetsky, personal communication, 2004. [78] B.D. Wirth, G.R. Odette, J. Marian, L. Ventelon, J.A. Young-Vandersall, and L.A. Zepeda-Ruiz, “Multiscale modeling of radiation damage in Fe-based alloys in the fusion environment,” J. Nucl. Mater., 329–333, 103, 2004. [79] H.L. Heinisch and B.N. Singh, “Stochastic annealing simulation of intracascade defect interactions,” J. Nucl. Mater., 251, 77, 1997. [80] B.D. Wirth, G.R. Odette, and R.E. Stoller, “Recent progress toward an integrated multiscale–multiphysics model of reactor pressure vessel embrittlement,” MRS Soc. Symp. Proc., 677, AA5.2, 2001. [81] C. Domain, C.S. Becquart, and L. Malerba, “Simulation of radiation damage in Fe alloys: an object kinetic Monte Carlo approach,” J. Nucl. Mater., 335, 121, 2004. [82] B.K.P. Chang and B.D. Wirth, “Monte Carlo simulation of point defect recombination during the initial stages of cascade aging in Fe,” J. Nucl. Mater., in preparation, 2005. [83] B.D. Wirth and G.R. Odette, “Kinetic lattice Monte Carlo simulations of cascade aging in iron and dilute iron–copper alloys,” MRS Soc. Symp. Proc., 540, 637, 1999. [84] C. Domain, C.S. Becquart, and J.C. Van-Duysen, “Kinetic Monte Carlo simulations of FeCu alloys,” MRS Soc. Symp. Proc., 540, 643, 1999. [85] N. Soneda, S. Ishino, A. Takahashi, and K. Dohi, “Modeling the microstructural evolution in bcc-Fe during irradiation using kinetic Monte Carlo computer simulation,” J. Nucl. Mater., 323, 169, 2003. [86] B.D. Wirth and G.R. Odette, MRS Soc. Symp. Proc., 540, 637, 1999. [87] C. Domain, C.S. Becquart, J.C. Van Duysen, MRS Soc. Symp. Proc., 540, 643, 1999. [88] C. Buzano and M. Pretti, “Cluster variation approach to the Ising square lattice with two- and four-spin interactions,” Phys. Rev. B, 56, 636, 1997. [89] F. Soisson, A. Barbu, and G. Martin, “Monte Carlo simulations of copper precipitation in dilute iron–copper alloys during thermal ageing and under electron irradiation,” Acta Mater., 44, 3789, 1996. [90] M. Athenes, P. Bellon, and G. Martin, “Identification of novel diffusion cycles in B2 ordered phases by Monte Carlo simulation,” Phil. Mag. A, 76, 565, 1997. [91] S. Delage, B. Legrand, F. Soisson, and A. Saul, “Dissolution modes of Fe/Cu and Cu/Fe deposits,” Phys. Rev. B, 58, 15810, 1998. [92] T.T. Rautiainen and A.P. Sutton, “Influence of the atomic diffusion mechanism on morphologies, kinetics, and the mechanisms of coarsening during phase separation,” Phys. Rev. B, 59, 13681, 1999. [93] B.D. Wirth, G.R. Odette, P. Asoka-Kumar, R.H. Howell, and P.A. Sterne, “Characterization of nanostructural features in irradiated reactor pressure vessel model alloys,” In: G.S. Was (ed.), Proceedings of the 10th International Symposium on Environmental Degradation of Materials in Light Water Reactors, National Association of Corrosion Engineers, 2002.
[94] J. Dalla Torre, J.L. Bocquet, N.V. Doan, and E. Adam, “Jerk, an event-based Kinetic Monte Carlo model to predict microstructure evolution of materials under irradiation,” Phil. Mag., in press, 2004. [95] C.-L. Liu, G.R. Odette, B.D. Wirth, and G.E. Lucas, “A LMC simulation of nanophase compositions and structures in irradiated pressure vessel Fe–Cu–Ni–Mn– Si steels,” Mater. Sci. Eng. A, 238, 202, 1997. [96] A. Moslang, E. Albert, E. Recknagel, A. Weidinger, and P. Moser, “Interaction of vacancies with impurities in iron,” Hyperfine Interact., 15, 409, 1983. [97] D.A. Porter and K.E. Easterling, Phase Transformations in Metals and Alloys, Van Nostrand Reinhold, Thetford, Great Britain, 1986. [98] E.D. Eason, J.E. Wright, G.R. Odette, and E. Mader, Models for Embrittlement Recovery Due to Annealing of Reactor Pressure Vessel Steels, NUREG/CR-6327, 1995. [99] B.C. Masters “Dislocation loops in irradiated iron,” Phil. Mag., 11, 881, 1965. [100] B.L. Eyre and A.F. Bartlett, “An electron microscope study of neutron irradiation damage in alpha-iron,” Phil. Mag., 11, 261, 1965. [101] A.C. Nicol, M.L. Jenkins, and M.A. Kirk, “Matrix damage in iron,” Mater. Res. Soc. Symp., 650, R1.3, 2001. [102] J. Marian, B.D. Wirth, and J.M. Perlado, “On the mechanism of formation and growth of 100 interstitial loops in ferritic materials,” Phys. Rev. Lett., 88, 255507, 2002. [103] B.L. Eyre and R. Bullough “On the formation of interstitial loops in b.c.c. metals,” Phil. Mag., 12, 31, 1965. [104] W. Carrington, K.F. Hale, and D. McLean, “Arrangement of dislocations in iron,” Proc. R. Soc. Lond. A, 259, 203, 1960. [105] G.R. Odette, E.V. Mader, G.E. Lucas, W.J. Phythian, and C.A. English, “The Effect of Flux on the Irradiation Hardening of Pressure Vessel Steels,” In: A.S. Kumar, D.S. Gelles, R.K. Nanstad, and E.A. Little (eds.), Effects of Radiation on Materials: 16th International Symposium, ASTM-STP-1175, American Society for Testing and Materials, Philadelphia, PA, 373, 1993. [106] S. Jumel, C. Domain, J. Ruste, J.C. Van-Duysen, C. Becquart, A. Legris, P. Pareige, A. Barbu, E. Van Walle, R. Chaouadi, M. Hou, G.R. Odette, R.E. Stoller, and B.D. Wirth, “Simulation of Irradiation Effects in Reactor Pressure Vessel Steels: the reactor for Virtual Experiments (REVE) Project,” J. Test. Eval., 30, 37, 2002. [107] S. Jumel and J.C. Van-Duysen, “RPV-1: a first virtual reactor to simulate irradiation effects in light water reactor pressure vessel steels submitted for publication,” J. Nucl. Mater., 2005. [108] M. Victoria and G. Martin, personal communication, 2004. [109] R.E. Stoller, et al., DOE Workshop on Advanced Computational Materials Science: Application to Fusion and Generation IV Fission Reactors, Washington, D.C.31 March-2 April 2004, ORNL/TM-2004/132, 2004.
2.30 TEXTURE EVOLUTION DURING THIN FILM DEPOSITION

Hanchen Huang
Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180-3590, USA
The modeling of materials processing intrinsically spans multiple scales, in both space and time. The modeling of thin film deposition, together with the accompanying texture evolution, spans 15 orders of magnitude in time, from the fundamental atomic vibration period of 10^−13 s to the deposition duration of 10^2 s. This section describes the challenging issues, critically presents existing approaches, and offers an outlook on future developments in the modeling of thin film texture evolution.
1. Thin Film Deposition and Texture Evolution
Human beings have processed thin films for thousands of years, aiming to improve the quality of life. Cars with advanced electronics improve the quality of life, and so do artistic paintings. As a specific example, thin films are applied as metal conductors in integrated circuits (ICs). The performance of these thin films depends on their texture, which refers to the alignment of grain orientations. If most Cu grains in a thin film have their ⟨111⟩ direction along the surface normal, the film has an out-of-plane texture of ⟨111⟩. When this texture dominates, the metal conductors in ICs last longer, and therefore the lifetime of computers and other IC-based equipment increases. In addition to its technological importance, texture evolution is also scientifically challenging to materials scientists and solid-state physicists.

To model the texture evolution during thin film deposition, let us first examine the relevant processes. Although deposition techniques vary, the atomic or molecular addition to an evolving solid surface is common to all of them. Different deposition techniques lead primarily to different sources of atoms
or molecules. To facilitate the presentation, let us take physical vapor deposition as the prototype deposition technique; see Powell and Rossnagel [1] for a comprehensive review of this technique. The texture evolution is the result of complex atomic activities, across 15 orders of magnitude in time. Three distinct time scales are identifiable in the texture evolution process.

The first time scale characterizes the initial incorporation of atoms on the surface of the film or substrate, as shown in Fig. 1(a). The atoms may come from various sources, such as sputtered targets or evaporated filaments. These atoms carry kinetic energies, and their binding with the surface leads to additional energy release. These energies, together with momenta, cause local atoms to rearrange. The time scale of initial atomic incorporation is dictated by the intrinsic atomic vibration period, about 10^−13 s.

The next time scale characterizes the atomic diffusion or mass transport for clustering, as shown in Fig. 1(b). Atoms with less than perfect coordination, such as an atom having fewer than 12 nearest neighbors in close-packed Cu, tend to diffuse. The time scale of atomic diffusion depends on the activation energy and the local temperature, and varies over a wide range; it is not unreasonable to associate a diffusion event with nanoseconds. As clusters grow and merge, a polycrystalline thin film forms (Fig. 1(c)). The third time scale characterizes the motion of grain boundaries in the film, and is on the order of seconds. As an estimate, a grain boundary may migrate 10–100 nm over the entire deposition period, say 100 s; that is, the migration of one atomic layer takes about one second. The entire texture evolution process, from initial atomic incorporation to completion of deposition, thus spans 15 orders of magnitude in time, 10^−13–10^2 s.

It is worth mentioning the spatial scale, in addition to the time scale, for completeness. In contrast to the 15 orders of magnitude in time, the spatial scale spans only a few orders of magnitude. The smallest spatial scale, the atomic size, is a fraction of a nm. Typical grain sizes or thicknesses of thin films are on the order of 100–1000 nm. The narrow span of 10^−1–10^3 nm justifies the emphasis on time scale in this section.
Figure 1. Texture evolution during physical vapor deposition of thin films, starting from a substrate (a), to islands or grain nuclei (b), and to a polycrystalline thin film (c). The spheres represent atoms, and the lines delineate grains.
2. Models of Texture Evolution
The large span of time scales, as discussed in the previous part, poses the biggest challenge to any model of texture evolution. The texture evolution lasts over the entire period of the deposition process, about 10² s. On the other hand, the atomic processes that dictate the evolution occur over the intrinsic time scale of 10⁻¹³ s. It is impossible for a brute-force model to span the entire 15 orders of magnitude in time. Various models have emerged, with increasing degrees of rigor, as knowledge and computational power have built up. In terms of the time scale, three modeling approaches exist. The first focuses on the macroscopic time scale and ignores the details of atomic vibration and atomic clustering. The second focuses on the details of atomic vibration and atomic clustering, by artificially speeding up the deposition process. The third incorporates the details of atomic vibration, atomic diffusion, and atomic clustering in an effective manner over the macroscopic time scale. The following presentation elaborates on each of the three modeling approaches.

The first modeling approach represents a polycrystalline thin film as a continuum [2, 3] and neglects the details of atomic motion, as shown in Fig. 2. The interior of each grain is a continuum, and the grain boundaries define the size and shape of each grain. Within this continuum approach, a series of mesh points, with interpolation between them, fully represents the grain boundaries. The positions and velocities of the mesh points characterize the motion of grain boundaries. Each mesh point advances with a velocity set by the local driving force. One of the most common driving forces is the grain boundary curvature: a larger curvature corresponds to a more sharply curved grain boundary, and hence a larger grain boundary area.
Figure 2. Schematic of continuum model of texture evolution, from initial (a) to final (b) texture.
For a given energy per unit area of grain boundary, the total energy goes up with the grain boundary curvature. Energy minimization drives the reduction of the curvature, and thereby the motion of grain boundaries. Relevant to texture evolution, the driving forces also include the film surface energy, the strain energy, and the film–substrate interface energy. In a strain-free Cu polycrystalline thin film on an amorphous substrate, the minimization of surface and grain boundary energies drives the texture evolution. The minimization of the surface energy favors grains having {111} surfaces, leading to the ⟨111⟩ texture. The minimization of grain boundary energy favors grain boundaries of smaller curvature (ignoring differences among grain boundaries in energy per unit area), which leads to grain coarsening. Mesh points at grain boundaries, under these two driving forces, migrate toward non-⟨111⟩ grains or toward smaller grains. The speed of the migration depends on two physical quantities: the driving force and the migration barrier. The driving force determines the direction of migration, or the sign of the velocity; its effect on the magnitude of the velocity (the speed) is approximately linear and therefore limited. The speed depends exponentially on the migration barrier, in Arrhenius form. The grain coarsening is demonstrated in Figs. 2(a) and (b). One extension of this continuum approach is the inclusion of deposition, in addition to the annealing process [4]. The other extension is the incorporation of atomistic mechanisms, such as grain rotation, in modeling the texture evolution [5]. The continuum approach is capable of tracking texture evolution on the laboratory time scale of seconds. This advantage comes at the expense of neglecting the details of atomic vibration and atomic clustering. Another limitation is the need to assume an initial grain distribution, such as the one shown in Fig. 2(a). Before proceeding to the second modeling approach, it is worth noting that the Potts model [2] may be considered part of the continuum approach: although each grain appears in the form of discrete blocks of material, the principles of grain boundary motion in the Potts model are similar to those of the continuum approach.
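To illustrate how the two quantities enter, a mesh point on a boundary with local curvature κ can be advanced with speed v = Mγκ, where the mobility M carries the Arrhenius dependence on the migration barrier. The sketch below is a minimal illustration; the prefactor M0, barrier Q_m, and boundary energy γ are assumed placeholder values, not parameters from Refs. [2–5].

```python
import math

# Speed of one grain-boundary mesh point under a curvature driving force:
#   v = M * gamma * kappa,   with   M = M0 * exp(-Q_m / (kB * T)).
# The sign of kappa sets the direction; the barrier dominates the magnitude.
kB = 8.617e-5     # eV/K
M0 = 1.0e-6       # mobility prefactor, m^4 J^-1 s^-1 (assumed)
Q_m = 1.0         # migration barrier, eV (assumed)
gamma = 0.6       # grain boundary energy, J/m^2 (assumed)
T = 600.0         # annealing temperature, K

def mesh_point_speed(kappa):
    """Speed (m/s) of a mesh point with local curvature kappa (1/m)."""
    mobility = M0 * math.exp(-Q_m / (kB * T))   # exponential in the barrier
    return mobility * gamma * kappa             # roughly linear in driving force

# A boundary curved on the ~10 nm scale:
print(f"{mesh_point_speed(1.0e8):.1e} m/s")
```

With these particular numbers, the speed comes out near one atomic layer per second, consistent with the earlier estimate of grain boundary motion during deposition.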
In contrast to the continuum approach, the second modeling approach is atomistic and based on the molecular dynamics method; Li discusses this method in detail in this handbook. Thin films and the corresponding substrates consist of atoms. The atomic positions and their relative arrangements naturally outline the texture of thin films. When atoms are packed in crystalline order, they represent the interior of a grain of a specific orientation, while atoms in noncrystalline order belong to grain boundaries. The motion of grain boundaries, and thereby the texture evolution, is a natural result of atomic activities. Each atom interacts with its neighbors according to a prescribed interatomic potential. The force on the atom determines its acceleration, and its dynamics, according to Newton's second law. In principle, one may start from an amorphous substrate and track the grain nucleation and texture evolution during deposition. However, grain nucleation on amorphous substrates is difficult to model; this issue will be elaborated in the Outlook part. Instead, one generally has to start from bi-crystalline or polycrystalline substrates. The molecular dynamics based approach allows atomistic studies of texture competition as a function of various deposition parameters, such as the kinetic energy of incoming atoms [6].

The details of atomic vibration and atomic clustering are a natural output of the molecular dynamics based approach, but they come at a price. To track atomic vibrations, the numerical time step is usually 10⁻¹⁵ s, so for millions of numerical steps the total simulated time is only on the order of nanoseconds. Consequently, the deposition rate has to be on the order of several atomic layers per nanosecond, or ∼1 m/s, which is nine orders of magnitude higher than realistic deposition rates. Usually, one has to compensate for this artificially high deposition rate with a high temperature, in order to ensure enough atomic diffusion. The hyper molecular dynamics method [7] extends the time scale by several orders of magnitude, and therefore helps reduce the deposition rate. This method is most effective when the kinetic processes are not too complex, or when the potential energy surfaces of atomic migration are simple. During thin film deposition, it is common for surface atoms to form complex configurations, rendering hyper molecular dynamics less effective. So far, molecular dynamics simulations of texture evolution remain in two dimensions; extension to three dimensions is becoming feasible with the ever-increasing computational capacity. In addition to the computational constraints, the physical approximations deserve full appreciation as well. The molecular dynamics method does not explicitly treat electrons and relies on an interatomic potential to represent the electronic effects effectively. This effective treatment assumes that electrons redistribute in a particular fashion according to a given function of the atomic configuration, and that the redistribution is instantaneous – the Born–Oppenheimer approximation. In general, the effective treatment leads to correct crystal structures, but it may fail in quantitative predictions of atomic energetics and their effects.
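The rate compression quoted above can be checked with back-of-the-envelope arithmetic; the layer spacing and the "realistic" rate below are assumptions consistent with the film thicknesses and deposition period quoted earlier.

```python
# Back-of-the-envelope check of the rate compression in MD deposition.
dt = 1.0e-15                  # MD time step, s
n_steps = 5.0e6               # "millions of numerical steps"
t_sim = dt * n_steps          # total simulated time: 5e-9 s, i.e., nanoseconds

layer = 0.2e-9                # one atomic layer, m (assumed spacing)
v_md = 5 * layer / 1.0e-9     # several layers per nanosecond: ~1 m/s
v_lab = 1.0e-9                # realistic rate, ~1 nm/s (100 nm over 100 s)
print(f"simulated time {t_sim:.0e} s; rate ratio {v_md / v_lab:.0e}")  # ~1e9
```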
The third modeling approach is atomistic and based on the Monte Carlo method; Gilmer discusses this method in detail in this handbook. Instead of following Newton's equations, atomic motions in the Monte Carlo method are governed by atomic energetics and the corresponding Boltzmann factor. One variation of this approach leads to results similar to those of the molecular dynamics method. In this variation, atoms may occupy any point in space, and mapping the continuous space takes much computational effort. In two dimensions, this variation is realizable using effective particles [8]. Each effective particle represents a cluster of atoms. The particles move around in the continuous space during deposition, and their motion on the surface corresponds to the effective diffusion of adatoms. Domains of various orientations form during two-dimensional simulations (Fig. 3). These domains may be interpreted as grains; strictly speaking, however, they are not. Extension of this method to three dimensions results in unbearable computational cost [9].
Figure 3. Schematic of domain formation in two dimensions.
Figure 4. Schematic of texture evolution in the Monte Carlo based approach, starting from an amorphous substrate (a), to grain nuclei (b), and to a polycrystalline thin film (c).
The other variation of the Monte Carlo based approach employs lattices. As in other Monte Carlo methods, atomic energetics govern the atomic motion, through the Boltzmann factor. These energetics come from classical molecular dynamics simulations, ab initio calculations, and experimental measurements. The core of this variation is the lattice kinetic Monte Carlo method. In this method, atoms occupy only lattice sites, and each lattice represents one grain of a specific orientation. As shown in Fig. 4(a), an amorphous substrate consists of atoms in different lattices (indicated by different gray scales). Starting from this substrate, an incoming atom may choose to align with any of the substrate atoms with which it forms nearest-neighbor bonds. At the same time, atoms may also diffuse around and form grain nuclei (Fig. 4(b)). As more atoms attach to the nuclei, the nuclei grow and impinge, forming a polycrystalline thin film (Fig. 4(c)). This lattice kinetic Monte Carlo based approach enables simulations of thin film deposition over long time scales, up to 10² s or more. At the same time, by taking inputs from detailed molecular dynamics and ab initio studies of atomic vibrations, the method also accounts effectively for atomic motions on finer time scales. Further, the atomic energetics can be represented more accurately than in molecular dynamics simulations, because they may come from ab initio calculations and experiments.
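Underlying such a lattice model is the standard residence-time (BKL, or n-fold way) kinetic Monte Carlo step: every candidate move is assigned an Arrhenius rate, one move is selected with probability proportional to its rate, and the clock advances by an exponentially distributed waiting time. The sketch below uses illustrative barrier values only.

```python
import math
import random

kB = 8.617e-5    # eV/K
nu0 = 1.0e13     # attempt frequency, Hz
T = 500.0        # K

def kmc_step(barriers, t):
    """Pick one event from a list of barriers (eV); return (index, advanced time)."""
    rates = [nu0 * math.exp(-E / (kB * T)) for E in barriers]
    total = sum(rates)
    pick, acc = random.random() * total, 0.0
    for i, r in enumerate(rates):
        acc += r
        if pick < acc:
            break
    t += -math.log(random.random()) / total   # exponential waiting time
    return i, t

# Three candidate hops with illustrative barriers (eV):
event, t = kmc_step([0.45, 0.50, 0.70], t=0.0)
```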
The studies of atomic energetics are a continuing endeavor of knowledge accumulation. They include both the determination of conventional atomic energetics, such as surface diffusion barriers, and the identification of novel atomic mechanisms [10, 11]. These advantages of better energetics representation and longer simulated times are realizable only if multiple lattices can be used in the Monte Carlo method. In contrast to the studies of atomic energetics, the use of multiple lattices is much more challenging. The challenges are twofold. First, an atom of one grain and atoms of other grains must not occupy the same spatial site. Avoiding such multiple occupancy is possible but numerically intensive, because of the necessity of examining all grains for each atom. Second, the direct use of multiple lattices costs too much computer memory. Should one directly use 1000 lattices to represent 1000 grains, the number of atoms that a computer is capable of simulating would be reduced 1000-fold. For a single lattice, a present-day computer is capable of simulating a billion atoms, that is, films with a linear dimension of ∼250 nm; the direct use of multiple lattices would reduce this dimension to only 25 nm. Parallel computation is not effective in increasing this dimension, because of the predominantly integer operations [12].

Given these two difficulties, it is advantageous to represent multiple lattices by a single lattice. The single lattice serves as a reference in space. A lattice of arbitrary orientation can be transformed to the reference lattice through three independent rotations. A one-to-one relationship between sites of this lattice and the reference lattice exists, since both lattices have the same site density. In other words, a lattice of arbitrary orientation can be mapped onto the reference lattice. In the single reference lattice, one readily knows whether a site is occupied; the mapping therefore solves the first problem, the possible multiple occupancy. However, a direct mapping in three dimensions results in the same computer memory requirement as the direct use of multiple lattices. The alternative to the direct three-dimensional mapping is the use of three consecutive two-dimensional mappings. In the three-dimensional mapping of N lattices, each having linear dimension L, the number of integers stored is on the order of NL³; the three consecutive two-dimensional mappings require memory storage of only order 3NL². For a linear dimension of L = 250 nm (or 1000 atomic diameters), the use of three consecutive two-dimensional mappings reduces the memory requirement by roughly 300 times. Two variations of this mapping concept have been implemented. The first implementation incorporates multiple lattices, and is in two dimensions [13]; it enables studies of multiple-texture competition in two-dimensional space. The second implementation incorporates only two lattices, and is in three dimensions [14]. It is based on the mapping of face-centered-cubic {111} plane sites onto {100} plane sites and, as a result, enables the simulation of two out-of-plane textures, ⟨111⟩ and ⟨100⟩, using a single ⟨100⟩ lattice.
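The occupancy bookkeeping can be pictured with a two-dimensional cartoon of the single-reference-lattice idea: a site of a rotated grain lattice is mapped to the nearest reference-lattice site, and one occupancy array then detects collisions between grains. This is only a schematic; the actual implementations [13, 14] use crystallographically exact one-to-one site mappings, not the nearest-site rounding below.

```python
import numpy as np

a = 1.0                                    # lattice parameter (arbitrary units)
theta = np.deg2rad(30.0)                   # orientation of one grain (assumed)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

occupied = np.zeros((64, 64), dtype=bool)  # single reference-lattice occupancy

def try_occupy(site):
    """Map a rotated-grain site onto the reference lattice; veto double occupancy."""
    x = R @ (a * np.asarray(site, dtype=float))           # Cartesian position
    i, j = (np.rint(x / a).astype(int)) % occupied.shape  # nearest ref. site, PBC
    if occupied[i, j]:
        return False            # an atom of another grain already sits here
    occupied[i, j] = True
    return True

print(try_occupy((3, 5)), try_occupy((3, 5)))   # the second attempt is vetoed
```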
The full implementation of multiple lattices in three dimensions is in preparation for publication, and will be elaborated in the Outlook part. Before closing this part, it is worthwhile to appreciate the multiscale – in addition to the polycrystalline – nature of the lattice kinetic Monte Carlo based approach. The span of multiple time scales is realized through the representation of atomic energetics. Classical molecular dynamics simulations and ab initio calculations, together with experimental measurements, provide reliable atomic energetics and mechanisms of motion. In the lattice kinetic Monte Carlo based approach, the energetics and the mechanisms are parameterized as functions of atomic coordination. Although the potential energies of individual atoms are not well defined, a parametric representation is meaningful in terms of the total energy of a simulated thin film. The nonlinear parameterization of potential energy as a function of atomic coordination ensures the reproduction of surface defect formation energies. This reproduction – with respect to the molecular dynamics predictions, in this case (Fig. 5) – is essential to capturing physical faceting during thin film deposition [15]. As to the atomic mechanisms of motion, multiple Monte Carlo jumps are used to represent diffusion jumps over steps and facets.
Figure 5. Nonlinear parameterization of atomic potential energy vs. coordination (open circles and solid line, labeled MC-EAM) in the Monte Carlo model. The molecular dynamics predictions (solid diamonds, MD-EAM) are included for comparison.
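To see what a nonlinear coordination dependence buys, compare a bond-order-like square-root form against simple bond counting. The square-root form below is an illustrative assumption, not the published MC-EAM parameterization plotted in Fig. 5; only the Cu cohesive energy is a standard value.

```python
import numpy as np

E_c = 3.49                          # cohesive energy of Cu, eV/atom
n = np.arange(1, 13)                # coordination number, up to FCC bulk (12)

E_sqrt = -E_c * np.sqrt(n / 12.0)   # nonlinear, bond-order-like (assumed form)
E_pair = -E_c * (n / 12.0)          # linear bond counting, for contrast

# The difference between the two is what shifts surface-defect formation
# energies relative to simple bond counting, and hence faceting behavior.
for ni, Es, Ep in zip(n, E_sqrt, E_pair):
    print(f"n = {ni:2d}   sqrt: {Es:6.2f} eV   pair: {Ep:6.2f} eV")
```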
Among the three approaches – continuum, molecular dynamics based, and Monte Carlo based – the lattice kinetic Monte Carlo based approach looks the most promising. In particular, it enables simulations of texture evolution at the atomic level, under realistic deposition rates, without assuming an initial grain distribution. Certainly, this approach is far from complete and suffers from several drawbacks. First, the implementation has been realized only for multiple lattices in two dimensions or for two lattices in three dimensions. Second, the mechanism of grain nucleation on amorphous substrates remains largely unclear, and there is no available method to study such mechanisms. Third, this approach is incapable of simulating thin films of 1000 nm or more in linear dimension. Finally, strain effects are intrinsically missing from this lattice kinetic Monte Carlo based approach.
3. Outlook
Since the lattice kinetic Monte Carlo based approach looks the most promising, it will be the focus of this outlook on future developments. The first development is the full implementation of mapping multiple lattices onto one reference lattice in three dimensions. This will be realized through three consecutive two-dimensional mappings; the previous implementation of one such mapping in two dimensions indicates the feasibility. Once completed, the full implementation will enable simulations of multiple-texture competition in three dimensions, at the atomic level and under realistic deposition rates.

The second development is the atomistic study of grain nucleation on amorphous substrates. The necessary condition for this study is a generic amorphous substrate. Although intermetallic glasses may serve as amorphous substrates, their surface roughness and local crystallinity are not controllable. In laboratory experiments over large substrate areas, such uncontrollability is not an issue; in atomistic simulations, however, substrate areas are small, and the variation of surface roughness and local crystallinity overshadows the underlying nucleation principles. Therefore, one aspect of this development is the design of a generic amorphous substrate with controllable roughness and crystallinity. In parallel, the other aspect is the formulation of an analysis technique to characterize grain nucleation on amorphous substrates; this technique allows one to determine whether a cluster of atoms is crystalline. These two aspects of the development will result in a clearer understanding of the mechanisms of grain nucleation on various amorphous substrates.

The third development is the bridging of approaches across length scales. At present, the available computer memory of a single processor is capable of treating thin films of 500 nm × 500 nm in horizontal dimensions and 25 nm in thickness. To model thin films of larger dimensions, this approach needs to be bridged with grain continuum models, such as PLENTE [16].
Efforts have been made in this direction, but a seamless bridging is yet to be accomplished. Finally, the fourth development is the incorporation of strain effects. The use of lattices is necessary to simulate deposition processes on the time scale of seconds; at the same time, the use of lattices intrinsically excludes strain. Fortunately, strain effects may be represented energetically, in the form of strain energy. The incorporation of strain effects requires a combined use of the lattice kinetic Monte Carlo based approach and continuum analyses. At any moment of the texture evolution, the strain distribution is determined from a continuum analysis, based on the grain continuum model. The strain and the corresponding strain energy then serve as input to the subsequent Monte Carlo simulations of texture evolution. Once the texture evolves, the strain distribution can be analyzed again. This iteration will enable the effective incorporation of strain effects in simulations of texture evolution. The first development – the full implementation of mapping multiple lattices – has been completed [17].
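The proposed iteration amounts to alternating a continuum strain solve with an interval of biased Monte Carlo. The toy loop below shows only the structure of that coupling; the scalar "strain" stand-in, the bias, and all numerical values are placeholders, not an existing implementation.

```python
import math
import random

kB, nu0, T = 8.617e-5, 1.0e13, 500.0   # eV/K, Hz, K
E0 = 1.0                               # unbiased barrier, eV (assumed)

def waiting_time(barrier):
    rate = nu0 * math.exp(-barrier / (kB * T))
    return -math.log(random.random()) / rate

t, coverage = 0.0, 0.0
while t < 1.0:                         # one second of simulated deposition
    strain = 0.01 * coverage           # stand-in for the continuum solve [16]
    dE = 0.05 * strain                 # strain energy biasing the kMC barrier
    t += waiting_time(E0 + dE)         # one kMC event under the current bias
    coverage += 1e-3                   # toy bookkeeping of deposited material
print(f"t = {t:.2f} s after {coverage / 1e-3:.0f} events")
```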
References

[1] R.A. Powell and S. Rossnagel, Thin Films: PVD for Microelectronics, Academic Press, New York, 1999.
[2] G. Grest, M. Anderson, D. Srolovitz, and A. Rollett, "Abnormal grain growth in three dimensions," Scripta Metall. Mater., 24, 661–665, 1990.
[3] D. Walton, H. Frost, and C. Thompson, "Development of near-bamboo and bamboo microstructures in thin film strips," Appl. Phys. Lett., 61, 40–42, 1992.
[4] Paritosh, D.J. Srolovitz, C.C. Battaile, X. Li, and J.E. Butler, "Simulation of faceted film growth in two dimensions: microstructure, morphology and texture," Acta Mater., 47, 2269–2281, 1999.
[5] D. Moldovan, D. Wolf, and S.R. Phillpot, "Linking atomistic and mesoscale simulations of nanocrystalline materials: quantitative validation for the case of grain growth," Philos. Mag., 83, 3643–3659, 2003.
[6] L. Dong and D. Srolovitz, "Texture development mechanisms in ion beam assisted deposition," J. Appl. Phys., 84, 5261–5269, 1998.
[7] A. Voter, "Hyperdynamics: accelerated molecular dynamics of infrequent events," Phys. Rev. Lett., 78, 3908–3911, 1997.
[8] M.J. Brett, S.K. Dew, and T. Smy, Thin Films: Modeling of Film Deposition for Microelectronic Applications, S. Rossnagel, ed., Academic Press, New York, 1996.
[9] F. Baumann and G.H. Gilmer, "3D modeling of sputter and reflow processes for interconnect metals," IEDM Technical Digest, 89, 1995.
[10] S.J. Liu, H. Huang, and C.H. Woo, "Schwoebel–Ehrlich barrier: from two to three dimensions," Appl. Phys. Lett., 80, 3295–3297, 2002.
[11] M.G. Lagally and Z.Y. Zhang, "Materials science – thin-film cliffhanger," Nature, 417, 907–910, 2002.
[12] J.W. Shu, Q. Lu, W.O. Wong, and H. Huang, "Parallelization strategies for Monte Carlo simulations of thin film deposition," Comput. Phys. Commun., 144, 34–45, 2002.
[13] H. Huang and G.H. Gilmer, "Multi-lattice Monte Carlo model of thin films," J. Comput.-Aided Mater. Des., 6, 117–127, 1999.
[14] G.H. Gilmer, H. Huang, T. Diaz de la Rubia, J.D. Torre, and F. Baumann, "Lattice Monte Carlo models of thin film deposition," Thin Solid Films, 365, 189–200, 2000.
[15] H. Huang, G.H. Gilmer, and T. Diaz de la Rubia, "An atomistic simulator for thin film deposition in three dimensions," J. Appl. Phys., 84, 3636–3649, 1998.
[16] M.O. Bloomfield, D.F. Richards, and T.S. Cale, "A computational framework for modelling grain-structure evolution in three dimensions," Philos. Mag., 83, 3549–3568, 2003.
[17] H. Huang and L.G. Zhou, "Atomistic simulator of polycrystalline thin film deposition in three dimensions," J. Comput.-Aided Mater. Des., in press, 2005.
2.31 ATOMISTIC VISUALIZATION

Ju Li
Department of Materials Science and Engineering, Ohio State University, Columbus, Ohio, USA
Visualization plays a critical role in materials modeling. This is particularly true for atomistic modeling, in which there is a large number of discrete degrees of freedom (DOF): the positions of the atoms. Atomic resolution is therefore the defining feature of atomistic visualization. This, however, does not exclude the possibility of going up in scale – visualizing coarse-grained continuum fields – or going down – visualizing the electronic structure around a particular atom or cluster of atoms in a configuration – if the need arises.

The discrete DOF in an atomistic simulation do not necessarily satisfy any smoothness condition, unlike continuum fields. For example, the reconstructed atomic structure of a dislocation core in Si is not likely to be describable by a formula or a series expansion. This does not mean, however, that there is no order in these DOF. Atomic-level order is ubiquitous in materials, even in amorphous or disordered materials, even in liquids. Finding this order, quantifying it, and then representing it in the best light are the tasks of atomistic visualization. Atomistic visualization is not merely a software engineering problem; it is also inherently a physics and mechanics problem.

To appreciate the importance of atomistic visualization, one must recognize that in a setup like a large-scale molecular dynamics (MD) simulation, it is not infrequent that the DOF self-organize in ways the investigator would not have expected before the simulation was carried out. Thus, a main function of atomistic simulation is discovering new structures, new kinetic pathways and micro-mechanisms, with atomic resolution. Even though these discoveries often need to be taken with a grain of salt, given the present accuracy of empirical interatomic potentials, large-scale simulation is nonetheless a unique and tremendously powerful tool for identifying key structures and processes. Once a structure or a process is clearly described and understood, it can often be isolated and modeled with a much smaller number of atoms at the first-principles level, allowing one eventually to select the most probable structure or process out of a catalog of possible low-energy structures or processes.
This surveying mission of large-scale simulation would be impossible without efficient visualization, for the amount of data from a large-scale simulation is truly enormous. This contribution is organized as follows. First, a brief survey of the present state of the art in atomistic visualization is given, which includes both tool development and work done using the tools; special emphasis is put on public-domain visualization tools that the author is familiar with. Then, the design philosophy behind the free atomistic configuration viewer AtomEye is analyzed. Finally, a recently developed characterization of local atomic structure, called the central symmetry parameter, is explained.
1. A Brief Survey of Molecular Visualization
At the time this article is written, the state of the art in atomistic visualization can be experienced in a movie that Farid Abraham et al. (IBM) made for a one-billion-atom MD simulation of work-hardening, with two notched dislocation sources [1]. The MD simulation was performed for 200 000 time steps on the 12-teraflop, 4096-node ASCI White supercomputer at LLNL, over four days of wall-clock time, and generated 25 terabytes of raw data. The data were compressed with 30× efficiency to less than 1 terabyte, which would still take about 10 hard drives (weighing ∼1.2 lb each) to store. The movie was made in the post-processing stage by Mark Duchaineau, a computer scientist (LLNL). It has a resolution of 640×480, a file size of 66 MB, and lasts 46 s. In terms of file size, the movie is less than a thousandth of a percent of the raw data. Watching the movie takes only a hundredth of a percent of the time it took the fastest computer in the world to run the simulation. Yet one gets a very good overview of what went on in the simulation, namely dislocation nucleation, interaction and dynamics, just by watching the movie. Thus, a main purpose of visualization is condensation of information. A crucial trick that enables such a high condensation rate is the selective representation of atoms: one only renders "interesting" atoms near defects in the atomistic configuration, in this case dislocations and cracks. The "uninteresting" atoms, which have bulk order, are not rendered and do not cover up the field of view. Here, the "interesting" atoms are determined by a local energy criterion. Later in this article, we illustrate alternative methods of distinguishing "interesting" atoms using geometrical criteria, without knowing the particular interatomic potential. As a side note, the author has observed that the above movie never fails to captivate audiences in seminars and lectures, whether they are experts or not. Thus, aside from sifting and compressing information, atomistic visualization also lowers the barrier of entry for accessing the information.
Century-old methods of scientific visualization such as graphing and charting are as important as ever (they achieve even higher information compression rates). But the new kinds of visualization that come with the information age, in the form of snapshots, movies/animations, and interactive navigation, greatly complement and enhance the traditional methods. Top-quality atomistic visualization such as the above [1, 2] still requires the expertise of dedicated computer science professionals, and may also require specialized hardware such as an Immersadesk or CAVE system [3, 4]. However, for day-to-day research, there is an array of visualization software available on personal computers. Commercial modeling packages such as Materials Studio, CAChe, ChemOffice, HyperChem, Spartan, etc. come with powerful visualization front ends, which usually include graphical user interface (GUI)-driven atomic configuration builders as well. There are also more specialized crystallographic programs such as CrystalMaker. But here we are going to focus on free software, or freeware, accessible to everyone. Molscript by Kraulis [5] and Rasmol [6] are two pioneering freeware programs that have had tremendous impact on visualization, beyond the field of molecular biology from which they originated. According to the Institute for Scientific Information (ISI), from 1991 to 2004 the Molscript paper [5] was cited more than 10 000 times, making it one of the most cited papers in science. Molscript takes an input file, which specifies the 3-D coordinates of biomolecules and the desired graphics state (such as the viewpoint), and renders publication-quality schematics in vector image formats like PostScript, which can be inserted directly into typesetting programs such as LaTeX. Later, the photorealistic rasterization program Raster3D [7, 8] and the charge-density isosurface plotting program CONSCRIPT [9] were developed to work in unison with Molscript. Similar to many present-day ray-tracing programs, Molscript, Raster3D and CONSCRIPT run on the command line and are noninteractive. So, while the quality of the configuration snapshots is excellent, they are less suitable as configuration navigation and surveying tools. Rasmol, on the other hand, is designed with navigation in mind: one is able to rotate the configuration and change the rendering state interactively. The Rasmol source code, freely available since the early 1990s, implements advanced features such as a shared-memory extension for local display, a scripting input interface, and various fast rendering techniques, which advanced the knowledge base for developing molecular visualization freeware. Other macromolecule visualization tools with similar functions include Swiss-PdbViewer (Deep View) [10, 11] and MOLMOL [12]. It should be pointed out that there are many detailed differences between molecular visualization of soft matter, specifically proteins, and atomistic visualization of hard matter. For example, in modeling the deformation of solids, one can often use the perfect crystal as the reference state. This means that, in a visualization scheme, collective modes or defects can often be identified by comparing with crystalline order atom by atom.
Configuration changes in hard matter, such as defect nucleation and mobility, are often accompanied by the breaking and reformation of stiff, nearest-neighbor covalent or metallic bonds. In proteins, there is no crystalline reference state, and conformation changes are usually accomplished by the breaking and reformation of softer, non-nearest-neighbor bonds like hydrogen bonds. And while the concepts of local strain and stress are still useful in proteins [13], their quantification and visualization pose perhaps a greater challenge. On the other hand, there are well-recognized local orders in proteins, such as α-helices, β-sheets, turns and loops, which do not have direct analogies in hard matter and require special representations such as ribbons/thick tubes, arrows, and lines/thin tubes. Historically, the Protein Data Bank (PDB) configuration file format [14] and the Research Collaboratory for Structural Bioinformatics (RCSB) molecular structure database have been a major driving force behind promoting molecular visualization and standardization. No such standards yet exist in materials modeling. However, there are several good reasons not to use the PDB format to save one's configurations and for information exchange in atomistic modeling of hard matter:

• Precision. The PDB format has a fixed precision of 0.001 Å for storing the atomic coordinates. While this is probably sufficient for proteins, which one usually models at around T = 300 K in solution, where there is plenty of indeterminate thermal noise anyway, it is often not precise enough for hard matter.
• Extensibility. Since PDB adopts a fixed-line format, there is no standard, supported way to add new properties. For instance, there is no standard option to store atomic velocities.
• Support for periodic boundary conditions (PBC). It is very difficult to coax the PDB format into robustly and consistently storing atomic configurations satisfying PBC, because the atomic coordinates are saved as direct Cartesian x, y, z coordinates rather than as dimensionless reduced coordinates [15]. In order to effect an affine transformation on the supercell, for instance, one needs to modify all atomic coordinates explicitly in PDB, rather than just modifying the 3 × 3 H-matrix [15].

An extensible, arbitrary-precision configuration file format (CFG) and its supporting viewer AtomEye [16], introduced in the next section, provide full support for PBC and are ideally suited for large-scale MD simulations. We now turn to another area, quantum chemistry, which has also had a profound influence on atomistic visualization. Here one deals with a smaller number of atoms in one configuration, usually no more than a few hundred at present, but scalar fields such as orbital wavefunctions need to be represented besides the molecular conformation. The pioneering freeware in this field is Molden [17], which renders the orbital wavefunctions, charge density and electrostatic potential of molecules, as well as their relaxation dynamics, vibrational normal modes and reaction pathways.
It works well interactively, but also gives good-quality vector-graphics output for 2-D contours and 3-D isosurfaces. Another freeware program with similar functionality is gOpenMol. An excellent freeware program for visualizing electronic structure in crystals is XCrySDen [18, 19]. One can store the crystal structure plus an arbitrary number of scalar fields defined on a regular grid under PBC in the so-called XSF format, which can be visualized, rotated and numerically manipulated interactively. Isosurfaces and cut-plane contours of the scalar fields can be rendered with a variety of colormap, transparency, and specularity options. Both the on-screen display and the snapshots are of outstanding quality, and the controls are highly responsive. XCrySDen also has tools for analyzing reciprocal-space properties, such as interactive selection of k-paths in the Brillouin zone for band-structure plots, and visualization of the Fermi surface. Presently, the most powerful and versatile freeware for visualizing molecular dynamics simulation trajectories is perhaps VMD [20]. It is based on OpenGL, with graphical user interfaces, but also has a command line with full scripting capabilities. There is even a special syntax for choosing subsets of atoms for display (including boolean operators, regular expressions, etc.). Trajectories can be played back, analyzed and easily converted to movies. Stereoscopic display is fully supported. VMD can also display volumetric data sets, including electron density maps, electron orbitals, potential maps, and various types of user-generated volumetric data. These can be rendered using "VolumeSlice" or "Isosurface" representations, each of which provides several geometric rendering styles for viewing the data, varying isolevels, slice-plane position, etc. 1-D, 2-D, and 3-D textures can be applied onto molecular and volumetric data representations to convey various types of information. VMD also provides the ability to render molecular scenes using external programs such as ray-tracing programs, a feature that can be used to attain higher image quality than is possible with the built-in OpenGL rendering. There are also many special features for analyzing large biomolecular systems. Compared to VMD, freeware such as AViz [21] and AtomEye [16], which are dedicated to atomistic visualization of nonbiological systems, are more lightweight. A good idea for beginners is to install and try all three programs. The design philosophy behind AtomEye [16] is introduced in the next section. Aside from the specialized tools introduced above, there are general visualization packages such as OpenDX and VTK, which are programmable and extremely powerful. The Python interface of VTK, for instance, has been incorporated into the Atomic Simulation Environment (ASE), an open-source distribution of Python scripts [22] that can wrap around several ab initio and molecular mechanics engines (Dacapo, SIESTA, MMTK, etc.). The commercial software package MATLAB is also a very good environment for data visualization. Freeware in this category includes Gnuplot, Grace, Octave, and Scilab.
2. Design of an Efficient Atomistic Configuration Viewer
AtomEye [16] is a lightweight and memory-efficient atomistic configuration viewer, which nonetheless achieves high quality in the limited number of things that it does. It is based on the observation that, when visualizing MD simulation results, most often only spheres and cylinders, representing the atoms and bonds, need to be drawn in massive quantities. Therefore, special subroutines were developed to render spheres and cylinders as graphics primitives, rather than as composites of polygons. This, combined with area-weighted anti-aliasing [23], greatly enhances AtomEye's graphics quality. One can also produce snapshots (in PNG, JPEG or EPS file formats) of a configuration in the desired graphics state at arbitrary resolutions (such as 2560×2560) greater than the monitor display resolution, to obtain publication-quality figures (Figs. 1–6). Making movies is straightforward with a set of sequentially named configuration files. AtomEye is an easy-to-use configuration navigator with full support for PBC. The user can move the view frustum anywhere inside the atomic configuration (see Figs. 2, 4). This is done by defining an anchor point, which can be the position of an atom, the center of a bond, or the center of mass of the entire configuration.
Figure 1. A strand of DNA, visualized in AtomEye.
Figure 2. Inside a chiral single-walled carbon nanotube.
Figure 3. Dislocation emission in a two-dimensional bubble raft under a spherical indentor [24]. The color encoding of atoms is by the auxiliary property of local atomistic von Mises stress invariant.
Dragging the mouse up or down with the right mouse button pressed pulls the viewpoint away from or closer to the anchor. Rotation is always done such that the anchor position is invariant in the field of view. At the beginning, the anchor is taken to be the center of mass. This allows for a global view of the configuration by rotating with the mouse or with the arrow keys (see below). When one right-clicks on an atom or a bond, the anchor is transferred to that particular atom or bond.
Figure 4. A vacancy defect in silicon. Three-fold coordinated atoms are colored green, while four-fold coordinated atoms are colored silver.
Figure 5. Cu nanocrystal configuration consisting of 424 601 atoms. Atom coloring is by coordination number.
Figure 6. Central symmetry color encoding showing intrinsic stacking faults bounded by partial dislocations in indentation of Cu.
So if one is interested in a closer view of a particular atomic local environment, one right-clicks on an atom or bond and then drags the mouse down without releasing the right mouse button. To pull away, simply right-click on a vacuum region and drag the mouse up without releasing the right mouse button. One can always recover the center-of-mass anchor by pressing the key "w". Rotation by mouse movement is accomplished with the following concepts: there is a glass sphere, about half the viewport size, hinged at the center of the viewport. The configuration is "frozen" in the glass sphere and co-rotates with it. After the rotation, there is a compensating translation, if necessary, to fix the anchor in the viewport. To rotate, one imagines putting a finger on the glass sphere surface and moving the fingertip, which is done by left-clicking in the window and dragging the mouse without releasing the left button. The remainder of the viewport comprises a flat glass surface parallel to the viewport; left-clicking and dragging on it rotates the configuration clockwise or counterclockwise. By pressing the arrow keys ←, →, ↑, ↓, and shift+↑, ↓, the configuration can also be rotated along three orthogonal axes. The rate of rotation is governed by the so-called gearbox value, which actually controls all rates of change and can be varied by pressing the numeric keys 0–9. One can always recover the initial view frustum orientation, with x, y, z perfectly aligned, by pressing the key "u".
At this point we need to explain the design of the CFG configuration file format that AtomEye supports. (Though there is elementary support for the PDB file format [14], PDB is not recommended; see the previous section.) In the CFG file, one always assumes that the configuration is under PBC, with a parallelepiped supercell defined by its three edge vectors (not necessarily orthogonal to each other). The reason for enforcing the PBC requirement is that, while it is quite easy to express a cluster configuration as a PBC configuration by putting a large enough PBC box around it, thereby separating the periodic images by vacuum, it is not so easy the other way around. To define a PBC configuration, a minimum of 3N + 9 real numbers needs to be supplied, where N is the number of atoms. First, one must specify a 3 × 3 matrix

$$\mathbf{H} = \begin{pmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{pmatrix} \qquad (1)$$
in units of angstroms (Å), which specifies the supercell size and shape. AtomEye uses a row-based vector notation: the first row of the H matrix corresponds to the first edge (or basis) vector h1 of the supercell, and similarly for h2 and h3:

$$\mathbf{h}_1 \equiv (H_{11}\ H_{12}\ H_{13}), \qquad \mathbf{h}_2 \equiv (H_{21}\ H_{22}\ H_{23}), \qquad \mathbf{h}_3 \equiv (H_{31}\ H_{32}\ H_{33}). \qquad (2)$$
So, for instance, H23 is the z-component of the second edge vector of the supercell (in Å). It is recommended that h1, h2, h3 constitute a right-handed system, that is,

$$(\mathbf{h}_1 \times \mathbf{h}_2) \cdot \mathbf{h}_3 = \det(\mathbf{H}) > 0, \qquad (3)$$
but it is not required. The atom positions are specified in the CFG file by the so-called reduced coordinates {si} instead of the Cartesian coordinates {xi}. Here i runs from 1 to N (in the program it actually runs from 0 to N − 1), and both si and xi are 1 × 3 row vectors:

$$\mathbf{s}_i \equiv (s_{i1}\ s_{i2}\ s_{i3}), \qquad \mathbf{x}_i \equiv (x_i\ y_i\ z_i). \qquad (4)$$
si1, si2, si3 are called reduced coordinates since:

1. They are dimensionless, unlike xi, yi, zi, which are in Å.
2. They are all between 0 and 1:

$$0 \le s_{i1} < 1, \qquad 0 \le s_{i2} < 1, \qquad 0 \le s_{i3} < 1. \qquad (5)$$
xi and si are related by the matrix–vector product

$$\mathbf{x}_i = \mathbf{s}_i \mathbf{H} = s_{i1}\mathbf{h}_1 + s_{i2}\mathbf{h}_2 + s_{i3}\mathbf{h}_3. \qquad (6)$$
Since h1, h2, and h3 are the three edges of the parallelepiped supercell, it is seen that any point inside the supercell corresponds to si1, si2, si3 ∈ [0, 1), and vice versa. Any image atom outside the supercell can be expressed as (si1 + l, si2 + m, si3 + n), in which l, m, n are all integers; it is separated from the original atom xi by the Cartesian vector lh1 + mh2 + nh3. Knowing xi and H, one can also invert Eq. (6) to get si:

$$\mathbf{s}_i = \mathbf{x}_i \mathbf{H}^{-1}. \qquad (7)$$
If any of si1, si2, si3 ∉ [0, 1), the atom is outside the supercell (i.e., it is an image atom) and needs to be mapped back into the original supercell, by

$$s_{i\alpha} \rightarrow s_{i\alpha} - \lfloor s_{i\alpha} \rfloor, \qquad \alpha = 1, 2, 3, \qquad (8)$$
where ⌊·⌋ is the floor function, returning the largest integer not greater than its argument. The reciprocal vectors of the supercell, g1, g2, and g3, are the first, second, and third row vectors of the matrix

$$\mathbf{G} \equiv 2\pi \left(\mathbf{H}^{-1}\right)^{T}, \qquad (9)$$
and satisfy the fundamental relations

$$\mathbf{g}_\alpha \mathbf{h}_\beta^{T} = 2\pi\,\delta_{\alpha\beta}, \qquad \alpha, \beta \in 1 \cdots 3. \qquad (10)$$
Since g1 is normal to the plane spanned by h2 and h3, g2 is normal to the plane spanned by h1 and h3, and g3 is normal to the plane spanned by h1 and h2, it is easy to see that the thicknesses of the supercell perpendicular to the three sets of planes are

$$d_1 = \frac{2\pi}{|\mathbf{g}_1|}, \qquad d_2 = \frac{2\pi}{|\mathbf{g}_2|}, \qquad d_3 = \frac{2\pi}{|\mathbf{g}_3|}, \qquad (11)$$
respectively. It can be shown that a sphere of radius R can fit into one supercell (without touching any of the six faces) if and only if

$$2R < \min(d_1, d_2, d_3). \qquad (12)$$
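Equations (6)–(12) translate directly into a few lines of linear algebra. The sketch below uses the FCC Cu primitive cell of Eq. (17) below and an assumed interaction range rc.

```python
import numpy as np

H = np.array([[1.8075, 1.8075, 0.0   ],   # FCC Cu primitive cell, Eq. (17);
              [1.8075, 0.0,    1.8075],   # rows are h1, h2, h3, in Angstroms
              [0.0,    1.8075, 1.8075]])

s = np.array([0.25, 0.50, 0.75])          # reduced coordinates of one atom
x = s @ H                                 # Eq. (6): Cartesian position
s_back = x @ np.linalg.inv(H)             # Eq. (7): invert back to reduced
s_wrap = s_back - np.floor(s_back)        # Eq. (8): wrap into [0, 1)

G = 2.0 * np.pi * np.linalg.inv(H).T      # Eq. (9): rows are g1, g2, g3
d = 2.0 * np.pi / np.linalg.norm(G, axis=1)   # Eq. (11): plane-to-plane thicknesses

rc = 0.8                                  # interaction range, Angstrom (assumed)
print(d, 2.0 * rc < d.min())              # Eq. (13): is single counting safe?
```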
The above is an important relation, since it tells us whether a great simplification in treating image interactions can be made. To appreciate this, suppose two atoms interact, or consider each other neighbors, whenever their distance is less than rc. Given the contents of the supercell, the physical system it represents is an infinite lattice composed of infinitely tiled replicas of the original supercell. In principle, to determine how many neighbors an atom xi in the original supercell has, one needs to go over all atoms in nearby supercells. It is then possible that both xj + lh1 + mh2 + nh3 and xj + l′h1 + m′h2 + n′h3 are neighbors of xi, which is called multiple counting. There is nothing wrong with multiple counting, but this possibility makes the program more complicated and less efficient. So a natural question is: under what conditions is multiple counting guaranteed not to occur, leaving only single counting? In other words, when would any two atoms i and j in the original supercell have at most one interaction, even when all images of j are taken into account? To figure this out, suppose xi is at the center of the original parallelepiped, si = (1/2, 1/2, 1/2). It is then seen that if and only if

$$2r_c < \min(d_1, d_2, d_3) \qquad (13)$$
can single counting be guaranteed for atom i, with all possible neighbors lying within the original supercell. One then realizes that this criterion does not actually depend on where atom i is: one can always define a shifted supercell (possibly containing some image atoms) with atom i at its center, which has a one-to-one mapping with atoms 1···N of the original supercell. So long as Eq. (13) is satisfied, one only needs to loop over atoms 1···N once to find all the neighbors of i, according to the formulas

$$\mathbf{s}_{ij} \equiv (s_{ij1}\ s_{ij2}\ s_{ij3}), \qquad s_{ij\alpha} = s_{i\alpha} - s_{j\alpha} - \left\lfloor s_{i\alpha} - s_{j\alpha} + \frac{1}{2} \right\rfloor, \qquad \alpha = 1, 2, 3, \qquad (14)$$

$$\mathbf{x}_{ij} \equiv \mathbf{s}_{ij} \mathbf{H}, \qquad r_{ij} \equiv |\mathbf{x}_{ij}|. \qquad (15)$$
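Under condition (13), Eqs. (14)–(15) constitute a complete minimum-image neighbor search. The sketch below applies them to count coordination numbers, in the spirit of AtomEye's first local-environment function discussed later; the O(N²) loop is written for clarity, not speed.

```python
import numpy as np

def coordination(S, H, rc):
    """Coordination numbers from reduced coordinates S (N x 3), cell H, cutoff rc.

    Assumes the single-counting condition (13), 2*rc < min(d1, d2, d3).
    """
    N = len(S)
    k = np.zeros(N, dtype=int)
    for i in range(N):
        ds = S[i] - S                          # s_i - s_j for all j at once
        ds -= np.floor(ds + 0.5)               # Eq. (14): minimum image
        r = np.linalg.norm(ds @ H, axis=1)     # Eq. (15): Cartesian distances
        k[i] = np.count_nonzero(r < rc) - 1    # exclude self (r_ii = 0)
    return k

# Example: 3x3x3 FCC supercell (108 atoms); every atom should report 12.
a = 3.615
base = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]], dtype=float)
cells = np.array([[i, j, k] for i in range(3) for j in range(3) for k in range(3)])
S = (cells[:, None, :] + base[None, :, :]).reshape(-1, 3) / 3.0
H = 3 * a * np.eye(3)
print(set(coordination(S, H, rc=2.7)))   # -> {12}
```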
In the engines of AtomEye, condition (13) is assumed to hold, which is mostly the case for configurations involved in empirical-potential simulations. However, configurations from ab initio calculations often do not satisfy (13). So, when AtomEye loads a configuration and finds that (13) is not satisfied, the configuration is automatically replicated in the necessary direction(s) so that condition (13) becomes satisfied. In the CFG file, the H matrix can be specified flexibly, according to the formula

$$\mathbf{H} = A\,\mathbf{H}_0\,\sqrt{\mathbf{I} + 2\boldsymbol{\eta}}\;\mathbf{T}, \qquad (16)$$

where A, η and T are optional parameters, and I is the 3 × 3 identity matrix. A is a scalar with the meaning of the basic length scale of the configuration in Å; its default value is one. η is a desired Lagrangian strain, a 3 × 3 symmetric matrix, and √(I + 2η) is the affine transformation matrix that achieves η without rotation (see Chap. 2.3); by default, η = 0, the zero matrix. Finally, T is an affine transformation matrix that can contain a rotational component; by default, T = I. When A, η and T all take their default values, H = H0. So if one does not care about scalings and affine transformations, one can just specify H by directly specifying H0 in Å, like
$$\mathbf{H}_0 = \begin{pmatrix} 1.8075 & 1.8075 & 0 \\ 1.8075 & 0 & 1.8075 \\ 0 & 1.8075 & 1.8075 \end{pmatrix}, \qquad (17)$$
for the FCC Cu crystal primitive cell with equilibrium lattice constant 3.615 Å. However, it is perhaps better to set
$$\mathbf{H}_0 = \begin{pmatrix} 0.5 & 0.5 & 0 \\ 0.5 & 0 & 0.5 \\ 0 & 0.5 & 0.5 \end{pmatrix} \qquad (18)$$
but set A = 3.615. This way, if we want to create a series of configurations with varying lattice parameters, we only need to change one number in the CFG file. The optional η and T matrices exist for the same reason: if we want to deform the entire configuration, we only need to change one or a few parameters in the CFG file. For instance,
$$\mathbf{T} = \begin{pmatrix} 1 & 0 & 0 \\ 0.5 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (19)$$
means effecting a simple shear such that ey → ey + 0.5 ex, with ex and ez unchanged. The main data block comes after the various required and optional declarations. There are N lines in the data block, one line for each atom; the first three entries on each line are the si1, si2, si3 of that atom. Depending on the declarations, there may or may not be three numbers following them that contain the velocity information. The CFG file is extensible, in the sense that there is a supported way for the user to store extra atomic properties in it. For example, one may wish to store the instantaneous force on each atom, computed from an ab initio calculation, along with the positions. To do this, one can declare the existence of three auxiliary properties,

auxiliary[0] = fx [eV/Å]
auxiliary[1] = fy [eV/Å]
auxiliary[2] = fz [eV/Å]

which provide the indexing (starting from zero), property name, and unit information. One can then append the auxiliary property data at the end of the line for each atom. AtomEye can be used to query these auxiliary properties atom by atom and to visualize them graphically, with various threshold and colormap options (see Fig. 3). The CFG file (with recommended suffix ".cfg") is meant to be readable and editable by people, so it is in plain ASCII format. One can add comments after "#", which is also a way to store nonstandard information. All data values can be specified to an arbitrary number of significant digits, as the user deems necessary. To compensate for the large file size, the user may compress the CFG file using gzip (recommended suffix ".cfg.gz") or bzip2 (recommended suffix ".cfg.bz2"); AtomEye can load the compressed files directly, using an automatic recognition and decompression scheme.
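Putting the pieces together, a minimal CFG file for a single Cu atom might look like the sketch below. The exact keyword spellings (Number of particles, H0(i,j), .NO_VELOCITY., entry_count) are reconstructed from the extended CFG specification distributed with AtomEye and should be checked against its documentation; the force values are invented for illustration.

```
Number of particles = 1
# comments are allowed after "#"
A = 3.615 Angstrom (basic length-scale)
H0(1,1) = 0.5 A
H0(1,2) = 0.5 A
H0(1,3) = 0.0 A
H0(2,1) = 0.5 A
H0(2,2) = 0.0 A
H0(2,3) = 0.5 A
H0(3,1) = 0.0 A
H0(3,2) = 0.5 A
H0(3,3) = 0.5 A
.NO_VELOCITY.
entry_count = 6
auxiliary[0] = fx [eV/A]
auxiliary[1] = fy [eV/A]
auxiliary[2] = fz [eV/A]
63.546
Cu
0.0 0.0 0.0  0.001 -0.002 0.000
```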
To simplify operations, one CFG file should store one atomistic configuration only. A sequence of configurations should be named like "mdrun00001.cfg.gz", "mdrun00002.cfg.gz", "mdrun00003.cfg.gz", . . . , etc., with the starting identifier "mdrun" arbitrary. AtomEye recognizes such file name patterns automatically to determine a file group, with browsing forward, backward and loop-back capabilities. This greatly facilitates inspecting MD trajectories.

AtomEye presently has three built-in functions to characterize the local atomic environment:

1. Coordination number {ki}. This counts the total number of first-nearest neighbors each atom i has in a configuration (self excluded). It is of course a fuzzy concept, especially in liquids, where the sharp shell structure of the crystal reference is largely smeared out. Procedure-wise, AtomEye defines a default atomic radius value Ru for each element species u. An atom i of species u considers an atom j of species v its first-nearest neighbor if their distance rij (see Eq. (15)) is less than rc,uv ≡ Ru + Rv. By this definition, and by common sense, the relationship is reciprocal: if atom i considers atom j its first-nearest neighbor, then atom j also considers atom i its first-nearest neighbor. The choice of the default Ru value is based on two considerations. The first is Slater's empirical atomic radius tabulation, based on the equilibrium bond lengths in over 1200 ionic, metallic, and covalent crystals and molecules [25]. The second is that, in order for the procedure to be maximally resistant to thermal noise at low T for the ground-state perfect crystal, Ru should be set approximately halfway between the first and second atomic shells of the T = 0 perfect crystal. (In liquids, a similar choice would be to set Ru at the minimum between the first and second maxima of the radial distribution function g(r).) This default rc,uv value does not always work well in practice, however, and the user can change it. In AtomEye, {ki} is used as a versatile characterization of atomic defects. Point defects (Fig. 4), dislocations, grain boundaries (Fig. 5), etc. often change the coordination number of atoms in their cores, thereby allowing their conformation to be visualized. Often, to see the defects, one also needs to render the "uninteresting" atoms invisible. Here, the uninteresting atoms are identified as those whose ki remains unchanged from the reference crystal value, such as 12 in an FCC crystal; Ctrl+shift+right-click on them makes them invisible.

2. Central symmetry parameter {ci}. There are some important defects in crystals, such as stacking faults and twin boundaries, which do not change the coordination number of atoms. They can nonetheless be identified by evaluating the degree of inversion symmetry breaking around each atom.
This is explained in the next section. An example is shown in Fig. 6, where intrinsic stacking faults bounded by Shockley partial dislocations in FCC crystals are visualized.

3. Local von Mises shear strain invariant {ηi}. A reference-state-free measure of the local atomic shear strain invariant has been derived for high-symmetry crystals [26].

Furthermore, the user is free to devise his or her own local-environment characterization scheme, save the result as an auxiliary property, and visualize it later (see Fig. 3). One may also define a "color patch" file that accompanies a CFG file, to control explicitly how the atoms should be rendered. AtomEye provides a suite of tools to survey and interrogate the configuration. One can find out about atomic properties (auxiliaries included), bond lengths, bond angles, surface normals, and dihedral angles by right-clicking on the atoms. One may define a large number of simultaneous cutting planes, and shift the configuration under PBC to expose the most interesting features. Finally, one can put down color markings on the atoms in one configuration and trace their diffusive or displacive motion in the ensuing configurations, for example during deformation.
3. Central Symmetry Parameter
The central symmetry parameter {ci}, i = 1···N, is used to characterize the degree of inversion symmetry breaking in each atom's local environment. It is especially useful for visualizing planar faults in FCC and BCC crystals [27]. We illustrate here how it is computed. Define an integer constant M, the maximum number of neighbors used in the computation of {ci}. For the FCC lattice, we may want to use M = 12; for the BCC lattice, M = 8. The computer of course does not know whether the configuration is FCC- or BCC-based, so by default it uses

$$M_{\mathrm{default}} \equiv \left\lfloor \frac{N_{\mathrm{most}}}{2} \right\rfloor \times 2, \qquad (20)$$

where Nmost is the most popular coordination number in the set {Ni}, i = 1···N, of the configuration. The user is able to override the default, but in any case M must be even, as we will be counting pairs of atoms. Now for each atom i ∈ 1···N, define

$$\tilde{m}_i \equiv \min(M, N_i). \qquad (21)$$
If m̃i = 0, ci ≡ 0, since an isolated atom should have perfect inversion symmetry. If m̃i = 1, ci ≡ 1, since a coordination-1 atom has no inversion image to compare with, so in a sense its inversion symmetry is maximally broken. For m̃i ≥ 2, define

$$m_i \equiv \left\lfloor \frac{\tilde{m}_i}{2} \right\rfloor \times 2, \qquad (22)$$
and use the following procedure to determine ci:

1. Sort the j = 1···Ni neighbors of atom i according to their distances |dj| to atom i, in ascending order, and pick the smallest mi-set.
2. Take the closest neighbor d1. Search, among the other mi − 1 neighbors, for the one that minimizes

$$\tilde{D}_j \equiv |\mathbf{d}_1 + \mathbf{d}_j|^2, \qquad (23)$$

and define

$$j^{*} \equiv \underset{j = 2 \cdots m_i}{\arg\min}\ \tilde{D}_j, \qquad D_1 \equiv \tilde{D}_{j^{*}}. \qquad (24)$$

3. Throw atoms 1 and j* out of the set, and look for the closest neighbor in the remaining set. Repeat step 2 until the set is empty.

We then have obtained D1, D2, ..., D_{mi/2}. Define

$$c_i \equiv \frac{\sum_{k=1}^{m_i/2} D_k}{2 \sum_{j=1}^{m_i} |\mathbf{d}_j|^2}. \qquad (25)$$
Equation (25) is dimensionless. In the case of mi = 2, if the two neighbors are independently and randomly oriented, it is easy to show that the mathematical expectation is

$$E[c_i] = \frac{1}{2}. \qquad (26)$$

On the other hand, one can prove that

$$\max_{\{\mathbf{d}_j\}} c_i = 1, \qquad (27)$$
The good thing about expression (25) is that according to the Lindemann/ Gilvarry rule [28], a crystal melts when the atomic vibrational amplitudes reach about ∼12% of the nearest neighbor distance, so ci for perfect crystal should be < 0.01 even at finite temperature. Therefore, it is not very difficult to threshold out thermal noise vs a true stacking fault.
4. Outlook
Visualization of modeling results plays the same role as microscopy does in experiments: one relies on it to extract useful information from an often staggering amount of data. Good pictures and animations grab the audience's attention in classrooms and seminars, and user-friendly visualization tools allow people to truly interact with the numerical models. Atomistic visualization will become more widespread as suitable techniques are developed and software tools are refined. In the future, we expect that distributed visualization of large data sets, like distributed number-crunching on Beowulf clusters and grid computers, will become more prevalent. In this paradigm, the display node takes care of assembling the scenes and handling user input, while multiple nodes on a fast network perform data readout and render the scenes in the background during real-time navigation.
References
[1] F.F. Abraham, R. Walkup, H.J. Gao, M. Duchaineau, T.D. De la Rubia, and M. Seager, "Simulating materials failure by using up to one billion atoms and the world's fastest computer: work-hardening," Proc. Natl. Acad. Sci. USA, 99, 5783–5787, 2002.
[2] P. Vashishta, R.K. Kalia, and A. Nakano, "Multimillion atom molecular dynamics simulations of nanostructures on parallel computers," J. Nanopart. Res., 5, 119–135, 2003.
[3] S. Xu, J. Li, C. Li, and F. Chan, "Immersive visualisation of nano-indentation simulation of Cu," In: H. Lee and K. Kumar (eds.), Recent Advances in Computational Science and Engineering, World Scientific, Singapore. Proceedings of the International Conference on Scientific and Engineering Computation (IC-SEC), ISBN: 1-8609, 2002.
[4] A. Sharma, A. Nakano, R.K. Kalia, P. Vashishta, S. Kodiyalam, P. Miller, W. Zhao, X.L. Liu, T.J. Campbell, and A. Haas, "Immersive and interactive exploration of billion-atom systems," Presence-Teleoper. Virtual Env., 12, 85–95, 2003.
[5] P.J. Kraulis, "Molscript – a program to produce both detailed and schematic plots of protein structures," J. Appl. Crystallogr., 24, 946–950, 1991.
[6] R.A. Sayle and E.J. Milner-White, "Rasmol – biomolecular graphics for all," Trends Biochem. Sci., 20, 374–376, 1995.
[7] E.A. Merritt and M.E.P. Murphy, "Raster3D photorealistic molecular graphics," Acta Crystallogr. Sect. D-Biol. Crystallogr., 50, 869–873, 1994.
[8] E.A. Merritt and D.J. Bacon, "Raster3D: photorealistic molecular graphics," Methods Enzymol., 277, 505–524, 1997.
[9] M.C. Lawrence and P. Bourke, "CONSCRIPT: a program for generating electron density isosurfaces for presentation in protein crystallography," J. Appl. Crystallogr., 33, 990–991, 2000.
[10] N. Guex and M.C. Peitsch, "SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling," Electrophoresis, 18, 2714–2723, 1997.
[11] N. Guex, A. Diemand, and M.C. Peitsch, "Protein modelling for all," Trends Biochem. Sci., 24, 364–367, 1999.
[12] R. Koradi, M. Billeter, and K. Wuthrich, "MOLMOL: a program for display and analysis of macromolecular structures," J. Mol. Graph., 14, 51–55, 1996.
[13] O. Miyashita, J.N. Onuchic, and P.G. Wolynes, "Nonlinear elasticity, proteinquakes, and the energy landscapes of functional transitions in proteins," Proc. Natl. Acad. Sci. USA, 100, 12570–12575, 2003.
[14] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, and P.E. Bourne, "The Protein Data Bank," Nucleic Acids Res., 28, 235–242, 2000.
[15] M. Parrinello and A. Rahman, "Polymorphic transitions in single-crystals – a new molecular dynamics method," J. Appl. Phys., 52, 7182–7190, 1981.
[16] J. Li, "AtomEye: an efficient atomistic configuration viewer," Model. Simul. Mater. Sci. Engrg., 11, 173–177, 2003.
[17] G. Schaftenaar and J.H. Noordik, "Molden: a pre- and post-processing program for molecular and electronic structures," J. Comput.-Aided Mol. Des., 14, 123–134, 2000.
[18] A. Kokalj, "XCrySDen – a new program for displaying crystalline structures and electron densities," J. Mol. Graph., 17, 176, 1999.
[19] A. Kokalj, "Computer graphics and graphical user interfaces as tools in simulations of matter at the atomic scale," Comput. Mater. Sci., 28, 155–168, 2003.
[20] W. Humphrey, A. Dalke, and K. Schulten, "VMD: visual molecular dynamics," J. Mol. Graph., 14, 33–38, 1996.
[21] J. Adler, A. Hashibon, N. Schreiber, A. Sorkin, S. Sorkin, and G. Wagner, "Visualization of MD and MC simulations for atomistic modeling," Comput. Phys. Commun., 147, 665–669, 2002.
[22] S.R. Bahn and K.W. Jacobsen, "An object-oriented scripting interface to a legacy electronic structure code," Comput. Sci. Engrg., 4, 56–66, 2002.
[23] J.D. Foley, A. van Dam, S.K. Feiner, and J.F. Hughes, Computer Graphics: Principles and Practice in C, 2nd edn., Addison-Wesley, Reading, 1995.
[24] J. Li, K.J. Van Vliet, T. Zhu, S. Yip, and S. Suresh, "Atomistic mechanisms governing elastic limit and incipient plasticity in crystals," Nature, 418, 307–310, 2002.
[25] J.C. Slater, J. Chem. Phys., 41, 3199, 1964.
[26] J. Li, To be published, 2004.
[27] C.L. Kelchner, S.J. Plimpton, and J.C. Hamilton, "Dislocation nucleation and defect structure during surface indentation," Phys. Rev. B, 58, 11085–11088, 1998.
[28] J. Gilvarry, "The Lindemann and Grüneisen laws," Phys. Rev., 102, 308, 1956.
Chapter 3 MESOSCALE/CONTINUUM METHODS
3.1 MESOSCALE/MACROSCALE COMPUTATIONAL METHODS
M.F. Horstemeyer
Mississippi State University, Mississippi State, MS, USA
The central idea of this chapter is the multiscale aspect of the constitutive relations for plasticity, damage/fracture, and fatigue. In continuum mechanics, the constitutive relations are required to complete the set of governing equations in concert with the conservation equations of mass, momentum, and energy. The constitutive equations essentially distinguish the material in the modeling framework. The particular constitutive relations focused on in this chapter relate to thermodynamically dissipating materials from the perspective of mesoscale and macroscale analyses. Mesoscale analyses typically start at the scale of the grain or crystal, whereas macroscale analyses start at the polycrystalline level. Mesoscale analyses at times focus on just the activity within a single crystal. In a sense mesoscale analyses are ascribed as discrete methods, but they can address polycrystalline materials when averaging schemes over the single crystals are performed. Hence, mesoscale analyses employ continuum mechanics as well. Several types of mesoscale analyses can be performed. In this chapter, dislocation dynamics occurring at the scale of the grain is presented, which is a discrete method. Also, crystal plasticity, which is a mix of discrete and continuum concepts, starts at the grain scale and can be used for polycrystalline analysis with the use of averaging schemes. The macroscale analysis in this chapter is discussed in the context of internal state variables, which are rooted in continuum level thermodynamics. Because the mesoscale dislocation dynamics and crystal plasticity formulations require substantial computing power, they are not generally used to solve large scale boundary value problems of structural components or systems. It is the macroscale internal state variable continuum theory that is often employed in solving practical engineering problems. The use of the mesoscale dislocation dynamics and crystal plasticity formulations arises in materials analysis studies, which in turn can play a role in the macroscale
efforts. We should also note that particular discrete details of dislocation nucleation, motion, and interaction can be quite clearly captured in the dislocation dynamics formulation. However, the degrees of freedom required to solve an engineering problem with it are prohibitive. The crystal plasticity formulation employs discrete crystals but treats the dislocation effects in a phenomenological manner. Hence, it loses some of the details but can capture a larger scale boundary value problem. The macroscale internal state variable theory captures even less detail than the crystal plasticity formulation but can address even larger scale boundary value problems. When considering the multiscale modeling and simulation methods described in this book, one can consider that the ab initio and atomistic method simulation results can be imported into the dislocation dynamics formulation. The results from the dislocation dynamics formulation can be imported into the crystal plasticity formulation. And the crystal plasticity results can be imported into the macroscale internal state variable formulation. This bridging of the length scales is a relatively recent area of research in the areas of plasticity, damage/fracture, and fatigue. This chapter breaks down the different plasticity formulations related to the different length scales. The damage/fracture and fatigue portions can be included in the plasticity formulations at each of the scales as well. However, the continuum damage mechanics, fracture mechanics, and fatigue formulations have had more applicability at the macroscale and structural scale and much less at the mesoscale. The mesoscale couplings of damage, fracture, and fatigue with dislocation dynamics and crystal plasticity are certainly promising areas of research. Before proceeding further into the depths of each of the formulations, it is pertinent to place the models within the context of continuum mechanics. The formulations considered in this chapter each focus, for the most part, on the kinematics, kinetics, and thermodynamics related to their scale and type of formulation (the exception is dislocation dynamics). The constitutive model embedded within the governing conservation law equations is sometimes referred to as a law, but such models generally are not laws in the strict scientific sense; they essentially represent the constitution of the material. Just as the constitution of the United States dictates the internal response upon an external stimulus, so the constitution of a material will yield a stress upon an externally applied strain. The reader should know that several postulates of continuum theory are part of a constitutive theory. Classically, these assumptions (essentially postulates) are the following: frame indifference, physical admissibility, material memory, equipresence, and local action. Perhaps the most important of these assumptions is physical admissibility. In fact, the assumption of local action has been questioned recently when dealing with various length scales, and some constitutive equations have assumed nonlocal notions.
In order to address the physical admissibility assumption, let us summarize various length scales that arise in plasticity and damage/fracture analysis. At a low level, the lattice parameter is key since atomic rearrangement is intimately related to various types of dislocations. Nabarro developed a relationship for dislocations that related stress to the inverse of a length scale parameter, the Burgers vector [1]. Nabarro [2] also showed that the diffusion rate is inversely proportional to the grain size, another length scale parameter. Hall [3] and Petch [4] related the work hardening rate to the grain size. Ashby [5] found that the dislocation density increased with decreasing second phase particle size. Frank [6, 7] and Read [8] showed a relation for dislocation bowing as a function of spacing distance and size. Hughes and Hansen [9] discovered that geometrically necessary boundary spacing decreases with increasing strain. Recently, Horstemeyer et al. [10, 11] found that the yield stress is a function of the volume per surface area, a different type of length scale parameter. Recent experimental studies have revealed that material properties change as a function of size. For example, Fleck et al. [12] have shown in torsion of thin polycrystalline copper wires that the normalized yield shear strength increases by a factor of three as the wire diameter is decreased from 100 µm to 12.5 µm. Stolken and Evans [13] observed a substantial increase in hardening during the bending of ultra-thin beams. In micro-indentation and nano-indentation tests [14–19], the measured indentation hardness increased by a factor of two as the depth of indentation decreased from 10 µm to 1 µm. Lloyd [20] investigated an aluminum-silicon matrix reinforced by silicon carbide particles. He observed a significant increase in strength when the particle diameter was reduced from 16 µm to 7.5 µm while holding the particle volume fraction fixed at 15%. Hughes et al. [21] investigated deformation induced by frictional loading and found that the stresses near the surface were much greater than those predicted by the local macroscale continuum theory, i.e., a length scale dependence was observed. Elssner et al. [22] measured both the macroscopic fracture toughness and the atomic work associated with the separation of an interface between two dissimilar single crystals. The interface (crack tip) between the two materials remained sharp, even though the materials were ductile and contained a large number of dislocations. The stress level necessary to produce atomic decohesion of a sharp interface is on the order of ten times the yield stress, while local theories predict that the maximum achievable stress at a crack tip is no larger than four to five times the yield stress. In continuum mechanics, length scales have been a part of some of the original premises. Euler in the 1700s related buckling to a column length. In the 1800s, Cauchy related the stress state to the radius of a cylinder. In the 1900s, Bridgman showed for many metal alloys that notch radii change the stress state of the material. In terms of damage/fracture, Griffith [23] found a
relation between the crack length and the stress intensity factor. Fairly recently, McClintock [24] determined the void growth rates as a function of the void size. Void/crack nucleation was determined by various aspects of the second phase particle size distribution by Gangalee and Gurland [25]. Horstemeyer et al. [26] determined the nearest neighbor distance as a length scale parameter for void coalescence modeling. It is clear that whether damage mechanics or fracture mechanics is employed, the length scale of interest is important. Although the computational methods described in this chapter address different length scales of analysis and hence require differing levels of discreteness or continuumness, all of the methods have some points of commonality. They are focused on crystalline metals, without being restricted to fcc, bcc, or hcp lattice structures. They share a common theme of energy requirements and dissipative mechanisms within their relevant scale. Each scale of analysis has focused upon various strain rate and temperature effects as fundamental conditions. Continuum mechanics can be thought of as a branch of applied mathematics. The formulations comprise rigorous mathematical restrictions, so standard tensor notation is used throughout the text. The tensor notation needs some explanation as well. An underscore indicates a first rank tensor (vector) for a lower case letter and a second rank tensor for a capital letter, i.e., v and F, respectively. A global Cartesian coordinate system is assumed, so no distinction is made between the contravariant and covariant components. A first and a second rank tensor in the Einsteinian indicial notation are given by $v_i$ and $F_{ij}$, respectively. For the implementation of the constitutive model into the finite element codes, the tensors are denoted in bold face type. The summation convention over repeated indices is implied, for example, $\sigma_{ii} = \sigma_{11} + \sigma_{22} + \sigma_{33}$. In general, for any tensor variable $x$, $\mathring{x}$ represents the corotational derivative. The tensorial dyadic product is denoted by $\otimes$; for example, $\mathbf{a} \otimes \mathbf{a}$ is a second rank tensor (a short numerical illustration of this notation follows at the end of this section). The rest of this chapter is outlined by the following sections, which give detailed descriptions of mesoscale/macroscale continuum formulations: dislocation dynamics, crystal plasticity, internal state variable theory, ductile fracture, continuum damage mechanics, microstructure sensitive computational fatigue, and a final perspective on modeling at these scales.
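As promised above, here is a tiny numerical illustration of the tensor notation just described. The snippet is our own (Python/NumPy, not part of the original text) and simply demonstrates the summation convention and the dyadic product.

```python
import numpy as np

sigma = np.diag([10.0, 20.0, 30.0])   # a sample second rank tensor (stress)
a = np.array([1.0, 2.0, 3.0])         # a sample first rank tensor (vector)

# Summation over the repeated index: sigma_ii = sigma_11 + sigma_22 + sigma_33
trace = np.einsum('ii', sigma)        # -> 60.0, same as np.trace(sigma)

# Dyadic product: (a ⊗ a)_ij = a_i a_j, a second rank tensor
dyad = np.einsum('i,j->ij', a, a)     # same as np.outer(a, a)
```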
References
[1] J.M. Burgers, Proc. Kon. Ned. Akad. Wetenschap., 42, 293, 1939.
[2] F. Nabarro, Adv. in Phys., 1, 269, 1952.
[3] E.O. Hall, Proc. Phys. Soc. B, 64, 747, 1951.
[4] N.J. Petch, J. Iron Steel I., 174, 25, 1953.
[5] M. Ashby, Strengthening Methods in Crystals, A. Kelly and R.B. Nicholson (eds.), 137, 1971.
[6] F.C. Frank, Disc. Faraday Soc., 5, 48, 1949.
[7] F.C. Frank, Phil. Mag., 42, 809, 1951.
[8] W.T. Read, Dislocations in Crystals, McGraw-Hill, New York, 1953.
[9] D.A. Hughes and N. Hansen, "High angle boundaries and orientation distributions at large strains," Scripta Metallurgica et Materialia, 33(2), 315–321, 1995.
[10] M.F. Horstemeyer and M.I. Baskes, "Atomistic finite deformation simulations: a discussion on length scale effects in relation to mechanical stresses," J. Eng. Matls. Techn. Trans. ASME, 121, 114–119, 1998.
[11] M.F. Horstemeyer, S.J. Plimpton, and M.I. Baskes, "Size scale and strain rate effects on yield and plasticity of metals," Acta Mater., 49, 4363–4374, 2001.
[12] N.A. Fleck, G.M. Muller, M.F. Ashby, and J.W. Hutchinson, "Strain gradient plasticity – theory and experiment," Acta Met., 42(2), 475–487, 1994.
[13] J.S. Stolken and A.G. Evans, "A microbend test method for measuring the plasticity length scale," Acta Mater., 46(14), 5109, 1998.
[14] W.D. Nix, "Mechanical properties of thin films," Metall. Trans., 20A, 2217–2245, 1989.
[15] M.S. De Guzman, G. Newbauer, P. Flinn, and W.D. Nix, "The role of indentation depth on the measured hardness of materials," Materials Research Symposium Proceedings, 308, 613–618, 1993.
[16] N.A. Stelmashenko, N.A. Walls, L.M. Brown, and Y.V. Milman, "Microindentation on W and Mo oriented single crystals: an STM study," Acta Metall. Mater., 41, 2855–2865, 1993.
[17] Q. Ma and D.R. Clarke, "Size dependent hardness in silver single crystals," J. Mater. Research, 10, 853–863, 1995.
[18] W.J. Poole, M.F. Ashby, and N.A. Fleck, "Microhardness of annealed and work hardened copper polycrystals," Scripta Metall. Mater., 34, 559–564, 1996.
[19] K.W. McElhaney, J.J. Vlassak, and W.D. Nix, "Determination of indenter tip geometry and indentation contact area for depth-sensing indentation experiments," J. Mater. Res., 13, 1300–1306, 1998.
[20] D.J. Lloyd, "Particle reinforced aluminum and magnesium matrix composites," Int. Mater. Rev., 39, 1–23, 1994.
[21] D.A. Hughes, D.B. Dawson, J.S. Korellis, and L.I. Weingarten, "Near surface microstructures developing under large sliding loads," J. Matls. Enginring. Performance, 3, 459–475, 1994.
[22] G. Elssner, D. Korn, and M. Ruehle, "The influence of interface impurities on fracture energy of UHV diffusion bonded metal-ceramic bicrystals," Scripta Metall. Mater., 31, 1037–1042, 1994.
[23] A.A. Griffith, "The phenomena of rupture and flow in solids," Phil. Trans. Roy. Soc. London, Series A, 221, 163–198, 1920.
[24] F.A. McClintock, "A criterion for ductile fracture by growth of holes," J. Appl. Mechanics, 35, 363, 1968.
[25] A. Gangalee and J. Gurland, "On the fracture of silicon particles in aluminum–silicon alloys," Trans. Metall. Soc. of AIME, 239, 269–272, 1967.
[26] M.F. Horstemeyer, M.M. Matalanis, A.M. Sieber, and M.L. Botos, "Micromechanical finite element calculations of temperature and void configuration effects on void growth and coalescence," Int. J. Plasticity, 16, 2000.
3.2 PERSPECTIVE ON CONTINUUM MODELING OF MESOSCALE/MACROSCALE PHENOMENA
D.J. Bammann
Sandia National Laboratories, Livermore, CA, USA
The attempt to model or predict the inelastic response or permanent deformation and failure observed in metals dates back over 180 years. Various descriptions of the post-elastic response of metals have been proposed from the fields of physics, materials science (metallurgy), engineering, mechanics, and applied mathematics. The communication between these fields has improved, and many of the modeling efforts today involve concepts from most or all of these fields. Early engineering descriptions of post-yield response treated the material as perfectly plastic – the material continues to deform with zero additional increase in load. These models became the basis of the mathematical theory of plasticity and were extended to account for hardening, unloading, and directional hardening. In contradistinction, rheological models treated the finite deformation of a solid similarly to the deformation of a viscous fluid. In many cases of large deformation, rheological models have provided both adequate and accurate information about the deformed shape of a metal during many manufacturing processes. The treatment of geometric defects in solid bodies originated within the mathematical theory of elasticity, with the dislocation introduced as an incompatible "cut" in a continuum body. This resulted in a very large body of literature devoted to the linear elastic study of dislocations, dislocation structures, and their interactions, and has provided essential information in the understanding of the "state" of a deformed material. Later it was recognized that this mathematical description was consistent with the defect in a crystal responsible for inelastic deformation. Following this, many dislocation models were developed that explained macroscopically observed phenomena such as work hardening, rate sensitivity, temperature dependence, and load path dependent response in crystalline solids. In the 1950s, the understanding of defects in deformed bodies was explored through
incompatibility theory, introducing a precise mathematical structure to describe the deformation of solid bodies. While these theories dealt with the internal elastic strains associated with defects in a deformed body, the finite deformation kinematics resulting from the development of these theories formed the basis for many finite deformation models of plasticity. Meanwhile, direct links were established between incompatibility theory and the modern differential geometry approach employed by physicists to describe gravitation theory and twisted and curved spaces. With the introduction of faster and larger computers, crystal plasticity models became very important tools in the design of many large strain manufacturing processes, such as rolling. These approaches illustrated the importance of describing the underlying crystalline structure of metals in predicting evolving anisotropy and texture. These theories greatly enhanced the understanding of the effects of the deformation on the integrity of the final manufactured parts and resulted in significant improvements in metal forming processes. Phenomenological models were developed introducing plastic spin as a necessary link between crystal plasticity models and conventional engineering plasticity theories. Simultaneous with these developments, efforts were advanced to predict the self-organization of crystals deformed to extremely large strains into cell or wall-like structures. In these cases, the original crystal nature of the solid becomes less dominant than the properties of the smaller cell structure in determining the mechanical response to further loading. Dislocation theories, reaction-diffusion theories and other models involving higher order spatial gradients of strain or state have been proposed to model these self-organization processes. At very large deformations, the schematic representation for a series of misoriented cells with alternating polarity to satisfy mesoscale angular momentum balance is identical to the Bénard instability resulting from convection between two plates of unequal temperature. Descriptions of large strain localization often contain descriptors such as rotational instability, previously associated solely with fields such as turbulence. This brief introduction is intended to illustrate how descriptions of the inelastic response of crystalline materials are extremely diverse and in some cases seemingly unrelated. As a result, this perspective cannot begin to cover all aspects of either crystal deformation or associated models. Instead, an attempt is made to relate some of the approaches of the various articles in this chapter. The concept of a dislocation has been utilized in modeling the internal state and mechanical response of crystals spanning orders of magnitude of length scales from atomistic to macroscopic. The different approaches to modeling inelastic response at different length scales are equally varied and include molecular dynamics (MD) simulations, discrete dislocation simulations, and phenomenological continuum theories, including crystal plasticity and theories considering the average motion of groups or densities of dislocations.
In addition to the increasing importance of surface effects at smaller length scales, due to the increased surface-to-volume ratio, theories at these scales are also characterized by an increased number of degrees of freedom. An approach intended to encompass many of these features will be considered, beginning with a very brief overview of some of the important early developments in this field.
1. Historical Overview
The oldest description of plasticity is that proposed by Coulomb [1] and Tresca (1868), amended by St. Venant [2, 3], and is based upon the assumption that plastic flow initiates when the maximum shear stress reaches a critical value. This describes a hexagonal prism in principal stress space with the axis of the prism having equal inclinations to all of the coordinate axes. If the stress state lies inside the surface, the response of the material is in a state of elastic loading or unloading. The experimental works of Bauschinger (1886) [4] were critical in enhancing the understanding of the inelastic response of metals in terms of unloading. An alternate criterion for the initiation of plastic flow was proposed independently by Huber [5] and von Mises [6], in which plastic flow is initiated when the distortional energy reaches a critical value. This theory was later redefined by Hencky [7]. The Tresca surface in stress space is inscribed within the von Mises ellipse, as shown in Fig. 1, as both theories predict a volume preserving deformation. The mathematical treatment of defects in a body began in the early twentieth century with Volterra (1907), when he studied the elastic stress field around displacement discontinuities in a continuous medium. Volterra introduced six
Figure 1. A plane in principal stress space depicting the Tresca yield surface inscribed within the von Mises yield surface.
fundamental defects through a thick-walled cylinder, as shown in Fig. 2, with a cut extending the axial length of the cylinder. Three of the discontinuities were introduced by a translational displacement of one side of the cut radially, axially, and circumferentially, and represent two types of edge dislocations and a screw dislocation, respectively. The other three defects, introduced by rotating or twisting the opposite faces of the cut as seen in Fig. 2, are called rotational dislocations or disclinations. Parallel with advances in macroscopic descriptions of inelastic deformation and Volterra's elasticity solutions for individual defects, x-ray diffraction techniques were utilized to develop a better understanding of the crystallographic nature of metals. Frenkel (1926) calculated the theoretical shear strength of a crystal and determined that it greatly exceeded experimentally observed results. To account for this striking difference, Taylor [8], Orowan [9], and Polanyi [10] independently postulated the existence of dislocations as a mechanism for crystal deformation at stress levels far below the theoretical shear strength. Orowan [11] considered the mean rather than the individual aspects of dislocation motion in an attempt to describe macroscopic flow. He postulated that the rate of plastic deformation $\dot{\varepsilon}_p$ was determined by the number of mobile dislocations per unit length and their rate of propagation:
$$\dot{\varepsilon}_p = \rho_m b v, \tag{1}$$
Figure 2. (a) the original cut in the cylindrical tube considered by Volterra, (b) edge dislocation created by translating one face of the cut surface radially inward, (c) screw dislocation created by translating the faces of the cut apart, (d) screw dislocation created by slipping one face of the cut axially with respect to the other, (e), (f) and (g) rotational dislocations or disclinations created by rotating or twisting the faces of the cut surface.
where $\rho_m$, $v$, and $b$ are the average mobile density, speed, and Burgers vector, respectively. Johnston and Gilman [12] experimentally measured the average dislocation velocity as a function of stress in lithium fluoride crystals and proposed an empirical relationship to describe their results. They also assumed that the rate of increase of mobile dislocations is proportional to the flux of mobile dislocations and the rate of immobilization proportional to the square root of the mobile density. When coupled with an empirical expression for dislocation velocity and using Orowan's relationship with the assumption that all dislocations are mobile, the yield phenomena in LiF crystals were accurately predicted. This success led to the widespread adoption of what Argon [13] calls the "dilute solution" approach to dislocation motion. In this approach dislocations are assumed to move in a quasi-viscous manner under the action of an applied stress. The velocity of the motion is determined by the lattice resistance or friction. Interactions between dislocations are assumed negligible or accounted for by an effective stress (applied stress minus back stress). Hardening, instead of being related to a rate mechanism, is defined in terms of the effective stress. Along these lines, Webster (1966) assumed that the time rate of change of dislocation density was due to multiplication and immobilization processes in a manner analogous to Gilman's. Substituting an empirical relationship for the dislocation velocity, in which he assumed an exponential dependence upon the applied stress, results in
$$\dot{\rho} = n + \alpha\rho - \beta\rho^2, \tag{2}$$
where $n$, $\alpha$, and $\beta$ are independent of dislocation velocity and are functions of stress and temperature. For conditions of constant stress and temperature, Webster accurately predicted stages I and II creep in brass and aluminum oxide crystals. Following Taylor [8], strain hardening is described by assuming the flow stress consists of an athermal component that is proportional to the shear modulus $\mu$ and a thermal component dependent upon strain rate and temperature. Taylor assumes a random distribution of dislocations in a network with average spacing between dislocations given by $\lambda$, which by geometry is inversely proportional to the square root of the dislocation density. The stress any dislocation experiences due to the forces exerted on it by its neighbors is given as
$$\tau = \frac{\mu b}{2\pi\lambda} = \frac{\mu b}{2\pi}\sqrt{\rho}. \tag{3}$$
The strain rate and temperature dependence of the plastic flow arises as dislocations are able to surmount these local obstacles through thermal fluctuations. Kocks (1975) has labeled the stress field associated with local obstacles the mechanical threshold stress and utilized this formulation in the development of a macroscopic model of plastic flow. Mecking and Kocks [14] have also
proposed a more physically based internal state variable evolution equation for the dislocation density, based upon dislocation storage and recovery in the Taylor lattice. They proposed that in an increment of plastic strain, dislocations are stored in inverse proportion to the mean free path $l$ and recovered in proportion to the density of dislocations. In a Taylor lattice the mean free path is inversely proportional to the square root of the number of dislocations, therefore
$$\frac{d\rho}{d\varepsilon_p} = \frac{c_1}{l} - c_2\rho = c_1\sqrt{\rho} - c_2\rho. \tag{4}$$
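As a quick numerical illustration of Eq. (4) – a sketch of ours, with hypothetical parameter values chosen only to produce a plausible hardening curve – the storage–recovery law can be integrated in plastic strain and converted to a flow stress through the Taylor relation, Eq. (3):

```python
import numpy as np

mu, b = 26e9, 2.86e-10        # shear modulus (Pa) and Burgers vector (m)
c1, c2 = 3e8, 10.0            # storage and recovery coefficients (assumed)
rho, deps = 1e10, 1e-4        # initial dislocation density (1/m^2), strain step

eps, tau = [], []
for step in range(3000):
    rho += (c1 * np.sqrt(rho) - c2 * rho) * deps       # Eq. (4)
    eps.append(step * deps)
    tau.append(mu * b / (2 * np.pi) * np.sqrt(rho))    # Eq. (3)

# Hardening saturates where storage balances recovery: rho_sat = (c1/c2)^2.
```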
This internal state variable evolution equation for dislocation density (or similar forms) has been introduced to describe hardening in many phenomenological plasticity models. It was not until dislocations in crystalline slip were mathematically described that the significance of Volterra's solutions became apparent. Taylor investigated the properties of straight dislocation lines in an elastic continuum; Burgers [15] studied the properties of curved dislocation lines using an analogy with vortex lines in hydrodynamics; and Peach and Koehler [16] derived the configurational force on a dislocation in an arbitrary stress field. These, coupled with the studies of Peierls [17], Nabarro [18] and others, began the study of the elastic interaction of defects with the crystal lattice, which ultimately resulted in the computational models of discrete dislocations of Zbib (1992), Ghoniem [19], Van der Giessen and Needleman [20] and many others. These models yield solutions to complex boundary value problems, enhancing our understanding of defect interaction, and, of equal importance, provide insight into mesoscale modeling efforts, such as boundary conditions in strain gradient theories. One of the simplest representations of a dislocation line is through the concept of a Burgers circuit, an atom-by-atom closed circuit drawn in a crystal. Consider a Burgers circuit around the extra plane of atoms (edge dislocation) as depicted in Figs. 3 and 4. The closure failure resulting from the extra plane of atoms is termed the Burgers vector and is a measure of the presence of the dislocation. The Burgers circuit also provides a means of distinction between statistically stored dislocations (SSDs) and geometrically necessary dislocations (GNDs). Ashby postulated that GNDs naturally occur during plastic flow in crystals to ensure overall compatibility of the total deformation. For example, GNDs are created to prevent gaps or overlaps as crystals rotate with respect to each other during polycrystalline deformation. Other examples of GNDs occur during indentation, at precipitate particles, or at other dislocation pileups that occur during deformation. SSDs occur under homogeneous deformation in positive and negative pairs, as in the network considered by Taylor (1940) or Mecking and Kocks [14] in their models of hardening. These types of dislocations are responsible for most of the hardening observed during deformation of crystals, but result in a compatible deformation as evidenced
Figure 3. Burgers circuit for discrete geometrically necessary dislocation.
Figure 4. Continuum Burgers circuit around a dislocation line with tangent t and normal n.
by the lack of closure failure and zero Burgers vector (Fig. 5). Notice that this distinction depends upon the size of the representative volume element or Burgers circuit considered, and if the size is chosen small enough, all dislocations are geometrically necessary. By considering a Burgers circuit in a plane normal to each axis, Nye [21] was able to construct a tensor that
Figure 5. Burgers circuit for statistically stored dislocations resulting in zero closure failure or incompatibility.
represented the dislocation lines piercing a volume and therefore contained information about the total Burgers vector at a continuum point. Nye related this dislocation density tensor to the curvature, which describes the rotation of the lattice, and examined the distribution of dislocations that resulted during bending. The similarity to Volterra's concept of a dislocation is easily seen by constructing a Burgers circuit in a continuum. A dislocation line in a continuum is simply a cut, a displaced surface, or a line of singularities. Therefore, integrating the displacement around a closed circuit that encompasses the dislocation line yields a closure failure, the Burgers vector:
$$\oint_C d\mathbf{u} = -\mathbf{b}. \tag{5}$$
The finite deformation equivalent of
this was developed in an attempt to solve the elasticity problem of the internal stress field in an unloaded (but previously loaded) body. Bilby et al. (1957) and Kröner [22] independently proposed that the deformation gradient be multiplicatively decomposed into elastic and plastic parts (Fig. 6):
$$\mathbf{F} = \mathbf{F}_e \mathbf{F}_p. \tag{6}$$
$\mathbf{F}_p$ represents the plastic deformation from the prior loading, while $\mathbf{F}_e$ is the elastic strain in the unloaded body resulting from the presence of dislocations. The natural configuration is defined by unloading through $\mathbf{F}_e^{-1}$, but in general does not represent a compatible deformation state. Denoting reference configuration variables by upper case, current configuration variables by lower case, and quantities in the natural configuration by an overbar, a line segment is mapped from the reference to the current configuration by
$$d\mathbf{x} = \mathbf{F}\,d\mathbf{X}. \tag{7}$$
Integrability conditions for a compatible deformation require that
$$\oint_C d\mathbf{x} = \oint_{C_0} \mathbf{F}\,d\mathbf{X} = 0 \quad\text{or}\quad \oint_{C_0} d\mathbf{X} = \oint_C \mathbf{F}^{-1}\,d\mathbf{x} = 0, \tag{8}$$
Figure 6. Multiplicative decomposition of the deformation gradient into elastic and plastic parts.
where $C_0$ is any closed path surrounding an area $A_0$ with surface normal $\mathbf{N}$. Using Stokes' theorem, Eq. (8) can be rewritten as integrals over the areas as
$$\oint_{C_0} \mathbf{F}\,d\mathbf{X} = -\int_{A_0} (\operatorname{Curl}\mathbf{F})\,\mathbf{N}\,dA = 0 \quad\text{and}\quad \oint_{C} \mathbf{F}^{-1}\,d\mathbf{x} = -\int_{A} (\operatorname{Curl}\mathbf{F}^{-1})\,\mathbf{n}\,da = 0. \tag{9}$$
This leads to the local form of compatibility,
$$\operatorname{Curl}\mathbf{F} = 0 \quad\text{and}\quad \operatorname{Curl}\mathbf{F}^{-1} = 0, \tag{10}$$
where
$$\operatorname{Curl}(\bullet) = \nabla_X(\bullet)\cdot\boldsymbol{\varepsilon} \tag{11}$$
is taken with respect to the reference configuration and $\boldsymbol{\varepsilon}$ is the reference configuration alternator tensor. In Cartesian coordinates, Eq. (11) takes the form
$$\operatorname{Curl}\mathbf{F} = F_{ik,l}\,\varepsilon_{klj}\,\mathbf{e}_i\otimes\mathbf{e}_j. \tag{12}$$
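As a numerical sanity check of Eqs. (10)–(12) – our own sketch; the smooth mapping $\phi$ below is arbitrary and hypothetical – the Curl of a deformation gradient derived from a single-valued mapping vanishes to within discretization error:

```python
import numpy as np

eps = np.zeros((3, 3, 3))     # alternator (permutation) tensor of Eq. (11)
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

def phi(X):
    # a smooth, compatible (single-valued) mapping, chosen arbitrarily
    return np.array([X[0] + 0.1 * X[1]**2,
                     1.2 * X[1],
                     X[2] + 0.05 * X[0] * X[1]])

def F(X, h=1e-5):
    # deformation gradient F_ij = d phi_i / d X_j by central differences
    cols = []
    for j in range(3):
        dX = np.zeros(3); dX[j] = h
        cols.append((phi(X + dX) - phi(X - dX)) / (2 * h))
    return np.stack(cols, axis=1)

def curl_F(X, h=1e-4):
    # (Curl F)_ij = F_ik,l eps_klj, Eq. (12)
    dF = np.zeros((3, 3, 3))
    for l in range(3):
        dX = np.zeros(3); dX[l] = h
        dF[:, :, l] = (F(X + dX) - F(X - dX)) / (2 * h)
    return np.einsum('ikl,klj->ij', dF, eps)

print(curl_F(np.array([0.3, 0.7, -0.2])))   # ~0: compatibility, Eq. (10)
```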
Substituting Eq. (6) into Eq. (10) and mapping to the natural configuration, it follows that compatibility is satisfied if the elastic dislocation density tensor is equal to the negative of the plastic dislocation density tensor, both in the intermediate configuration, or
$$\det(\mathbf{F}_p)\,\bar{\boldsymbol{\alpha}}_p^{\,T} = -\bar{\mathbf{A}}_e^{T}, \tag{13}$$
where
$$\bar{\boldsymbol{\alpha}}_p = \mathbf{F}_p\operatorname{Curl}\mathbf{F}_p, \qquad \bar{\mathbf{A}}_e = \operatorname{Curl}(\mathbf{F}_e)\,\mathbf{F}_e^{-T}. \tag{14}$$
These quantities sum to give the total dislocation density tensor in the intermediate configuration. Therefore, the total dislocation density is merely the sum of plastic and elastic dislocation density tensors – just like strains. And just like strains, these tensors can be mapped to any configuration where a similar relation holds. Lardner showed that the vanishing of this total dislocation density tensor is equivalent to the vanishing of the net dislocation density tensor or vanishing of excesses of dislocations of the same sign (GND). Therefore, the presence of geometrically necessary dislocations results in an incompatible
natural configuration, and the elastic deformation gradient produces a rotation to restore compatibility to the total deformation. These equations were obtained previously by Teodosiu [23], Werne (1976) and recently by Steinmann [24], while the small strain formulation of these equations was first obtained by Kröner [22]. Bilby et al. [25] and Kondo (1952), using concepts from tensor calculus, also obtained the dislocation density tensor. In this approach, the dislocation density tensor is a result of the fact that the Cartan torsion tensor does not vanish and therefore the intermediate configuration is not a Euclidean space. There has been a resurgence of interest in regularizing, or adding a mathematical length scale to, these continuum models of deformation. The motivation for this comes from several sources. From the point of view of numerical solutions, it is well established that in the post-bifurcation regime of a solution (initiation of either strain localization or damage associated with material softening) the system of differential equations changes type, generating an ill-posed problem. In codes modeling hyperbolic systems, the differential equations become elliptic in the post-bifurcation regime, but the boundary conditions are still prescribed for hyperbolic systems. Similarly, in static codes, elliptic systems transform into hyperbolic systems but the associated boundary conditions are still prescribed for the original elliptic system. This incongruence results in pathological mesh dependence; in other words, the system does not converge to a solution no matter how fine the mesh. This problem can be resolved by regularizing, or adding a mathematical length scale to, the continuum, either in the form of spatial gradients in the constitutive model or by some numerical construction. For example, let us consider the solution of plane strain extension of a block of material using the Gurson damage model as described in a previous section of this chapter [26]. As the mesh size is continually reduced, the damage localizes into a smaller concentration associated with the smallest element (Figs. 7 and 8). The associated load-displacement curve is also reduced with reducing mesh size. However, the addition of a Laplacian of effective plastic strain to the yield function results in convergence to a finite damage band that is proportional to the constant introduced in the strain gradient term. Another motivation for models containing a mathematical length scale results from the attempt to solve boundary value problems at extremely small length scales. Recent experimental studies have revealed that material properties change as a function of size. Fleck et al. [27] presented experimental data on the torsion of thin wires, which showed the breakdown of local theory at very small wire diameters; at these small diameters, the flow stress increased with decreasing radius of the torsion specimen. Similar effects have been observed in problems of microindentation, where the flow stress increases with decreasing indentor size (Ma and Clarke 1995 and Stelmashenko et al. 1993). For example, in mechanical tests on small specimens, when the
Figure 7. The contours of localized damage continue to decrease to a finer width as the mesh is refined. No convergence.
specimen dimensions reached a critically small size, the yield strength began to increase sharply with further decrease in specimen dimension. This dimension is generally smaller than the reasonable applicability of local continuum theory, but still much larger than that required for tractable solutions of the problem with atomistic methods. This problem can also be resolved by the aforementioned introduction of a length scale using spatial gradients. And finally, as the demands of design require the incorporation of more of the underlying physics, it is becoming important to develop a means to bridge the length scales from the macroscopic continuum to the atomistic levels. One approach to achieve this is generalization, or the addition of more kinematic degrees of freedom to the continuum. This approach results in the development of a crystal plasticity model with a physical length scale. This can be accomplished by incorporating GNDs as an internal state variable or by choosing other higher order gradients of strain as state variables. Hence, the resulting model of the crystal includes a natural, physical length scale.
Figure 8. The contours of damage converge to a finite band associated with the length scale introduced by the spatial gradient.
2. Internal State Variable Model of Gradient Crystal Plasticity
Many macroscopic models of plasticity have been developed that incorporate SSDs and an appropriate evolution equation to describe experimentally observed hardening. An incomplete list includes Teodosiu [23], Rice [28], Bammann (1984), Miller [29], Chaboche (1967), Hart [30], Kratochvil and Dillon [31], Perzyna (1964), Krieg et al. [32] and Bodner and Partom [33]. Teodosiu [23] was the first to embed these concepts within the framework of the thermodynamics of internal state variables [34]. These types of models have been implemented in finite element codes and used to solve a wide range of boundary value problems, extending over regimes from creep to shock and from cryogenic temperatures to melt. A complete overview of this internal state variable approach is presented earlier in this chapter. The use of internal state variable theory permits the formal introduction of the physics
associated with dislocation, void, and crack mechanics, including both the stored energy and dissipation associated with these defects. As stated previously, formal internal state variable theory provides the format to include statistically stored dislocations and geometrically necessary dislocations. Since GNDs introduce an incompatibility into the plastic deformation, their inclusion introduces a length scale that eliminates the problems of post-bifurcation nonuniqueness as well as specimen size scale dependent results at very small length scales. Theories of this type have been proposed by Teodosiu [23], Acharya and Basani (1996), Acharya and Beaudoin (1999), Bammann (2000), Cermelli and Gurtin (2000), Gurtin [35], and Svendsen (2001), among others. As a simplified example of this approach, neglect temperature, in which case the mechanical version of the second law of thermodynamics simply states that the rate of change of free energy must be less than the work done on the body:
$$\dot{\psi} \leq \boldsymbol{\sigma}\cdot\mathbf{d}_p, \tag{15}$$
where $\boldsymbol{\sigma}$ is the Cauchy stress and $\mathbf{d}_p$ the plastic stretching or strain rate. Now assume that the Helmholtz free energy depends upon the intermediate configuration elastic strain $\bar{\mathbf{E}}_e$, the elastic lattice strain associated with a network of statistically stored dislocations $\varepsilon_{ss}$, and a strain-like measure of GNDs $\boldsymbol{\kappa}_e$:
$$\boldsymbol{\kappa}_e = l_1 J_e \left( \mathbf{F}_e^{-1}\operatorname{Curl}\mathbf{F}_e^{-1} \right)^{T}, \qquad \varepsilon_{ss} = b\sqrt{\rho_{ss}}. \tag{16}$$
This is a natural incorporation of GNDs from compatibility theory and of the theory of Taylor from dislocation mechanics. Therefore, if we assume that the free energy takes the form $\psi(\bar{\mathbf{E}}_e, \varepsilon_{ss}, \boldsymbol{\kappa}_e)$, substitute into the mechanical version of the second law, equate like terms, and make the standard assumptions of independence commonly utilized in internal state variable theory, we get the following:
$$\bar{\boldsymbol{\sigma}} = \frac{\partial\psi(\bar{\mathbf{E}}_e,\varepsilon_{ss},\boldsymbol{\kappa}_e)}{\partial\bar{\mathbf{E}}_e}, \qquad \hat{\tau} = \frac{\partial\psi(\bar{\mathbf{E}}_e,\varepsilon_{ss},\boldsymbol{\kappa}_e)}{\partial\varepsilon_{ss}}, \qquad \boldsymbol{\chi} = \frac{\partial\psi(\bar{\mathbf{E}}_e,\varepsilon_{ss},\boldsymbol{\kappa}_e)}{\partial\boldsymbol{\kappa}_e}, \tag{17}$$
and the dissipation reduces to,
$$\bar{\boldsymbol{\sigma}}\cdot\mathbf{d}_p - \hat{\tau}\,\dot{\varepsilon}_{ss} - \boldsymbol{\chi}\cdot\dot{\boldsymbol{\kappa}}_e \geq 0. \tag{18}$$
Now assuming that the free energy is quadratic in all the elastic strain-like variables,
$$\rho\psi(\bar{\mathbf{E}}_e,\varepsilon_{ss},\boldsymbol{\kappa}_e) = \frac{1}{2}\mu\left|\bar{\mathbf{E}}_e\right|^2 + \frac{1}{2}K\operatorname{tr}^2\bar{\mathbf{E}}_e + c_\tau\,\mu\,\varepsilon_{ss}^2 + c_2\,\mu\,\left|\boldsymbol{\kappa}_e\right|^2, \tag{19}$$
then
$$\hat{\tau} = c_\tau\,\mu\,\varepsilon_{ss} = c_\tau\,\mu b\sqrt{\rho_{ss}}, \qquad \bar{\boldsymbol{\sigma}} = \lambda\operatorname{tr}(\bar{\mathbf{E}}_e)\,\mathbf{1} + 2\mu\bar{\mathbf{E}}_e, \qquad \boldsymbol{\chi} = c_2\,\mu\,\boldsymbol{\kappa}_e. \tag{20}$$
Notice that the Taylor model of internal strength is naturally recovered. Up to this point the model is similar to a mechanical version of the one proposed by Teodosiu. To complete the theory, evolution equations are required for the SSD and GND densities. In addition to these evolution equations, Teodosiu required $\operatorname{div}\boldsymbol{\kappa}_e$, since the divergence of the curl of a field must vanish; in effect, this introduced a mathematical length scale into the model. With the advent of crystal plasticity, this is unnecessary. The expression for the plastic stretching is given as
$$\mathbf{L}_p = \sum_i \dot{\gamma}^{(i)}\,\mathbf{s}^{(i)}\otimes\mathbf{n}^{(i)}, \tag{21}$$
where $\mathbf{s}^{(i)}$ and $\mathbf{n}^{(i)}$ are the slip direction and normal, respectively. The magnitude of slip along a particular system is usually chosen as a power law function of the applied shear stress on that system and the slip resistance $\hat{\tau}$:
$$\dot{\gamma}^{(i)} = \dot{\gamma}_0^{(i)}\,\frac{\bar{\boldsymbol{\sigma}}\,\mathbf{n}^{(i)}\cdot\mathbf{s}^{(i)}}{\hat{\tau}}. \tag{22}$$
The Kocks–Mecking model for the evolution of the statistically stored dislocations is modified to account for a mean free path associated with the GNDs (Acharya and Beaudoin (1999), Bammann (2000)) such that,
$$\dot{\rho}_{ss}^{(i)} = \left( \frac{c_1}{L_s^{(i)}} + \frac{c_2}{L_g^{(i)}} \right)\dot{\gamma}^{(i)} - c_3\,\rho_{ss}^{(i)}\,\dot{\gamma}^{(i)}, \tag{23}$$
where, as before, the mean free path $L_s^{(i)}$ is inversely proportional to the square root of the density of SSDs, and the mean free path for the GNDs is given as
$$L_g^{(i)} = \left| \boldsymbol{\kappa}_e\,\mathbf{n}^{(i)}\cdot\mathbf{s}^{(i)} \right|^{-1}. \tag{24}$$
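To show how Eqs. (20)–(24) fit together computationally, the following schematic single-step update is offered as a sketch of ours: the slip-system geometry, stress state, coefficient values, and the linear form reconstructed in Eq. (22) are all illustrative assumptions, not a published implementation.

```python
import numpy as np

# Two FCC-like slip systems on the (111) plane (illustrative choice).
n = np.array([1., 1., 1.]) / np.sqrt(3)
slips = [np.array([1., 0., -1.]) / np.sqrt(2),
         np.array([0., 1., -1.]) / np.sqrt(2)]

mu, b = 26e9, 2.86e-10                     # shear modulus, Burgers vector
c_tau, c1, c2, c3 = 0.5, 1e5, 1e5, 0.01    # coefficients (assumed)
gdot0, dt = 1e-3, 1e-3                     # reference rate, time step
rho_ss = [1e12, 1e12]                      # SSD density per system (1/m^2)
kappa_e = 1e4 * np.outer([1., 0., 0.], [0., 1., 0.])  # assumed GND measure
sigma = 5e7 * (np.outer([1., 0., 0.], [0., 1., 0.]) +
               np.outer([0., 1., 0.], [1., 0., 0.]))  # applied shear stress

Lp = np.zeros((3, 3))
for i, s in enumerate(slips):
    tau_hat = c_tau * mu * b * np.sqrt(rho_ss[i])   # slip resistance, Eq. (20)
    gdot = gdot0 * (sigma @ n @ s) / tau_hat        # slip rate, Eq. (22)
    Lp += gdot * np.outer(s, n)                     # plastic stretching, Eq. (21)
    Ls = 1.0 / np.sqrt(rho_ss[i])                   # SSD mean free path
    Lg = 1.0 / max(abs(kappa_e @ n @ s), 1e-30)     # GND mean free path, Eq. (24)
    rho_ss[i] += ((c1 / Ls + c2 / Lg) * gdot
                  - c3 * rho_ss[i] * gdot) * dt     # SSD evolution, Eq. (23)
```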
The evolution for the curl of the inverse of the elastic deformation gradient can be solved for explicitly since the plastic spin is known. This is a very important step in the ability to accurately predict the GNDs which develop during deformation – no phenomenological evolution equation is required! The geometrically necessary dislocations that are generated during a deformation are precisely calculated to within the accuracy of the crystal plasticity model. This model comprises information assembled from many different fields. Beginning with local continuum mechanics, a multiplicative decomposition of the deformation gradient was adopted, which implicitly introduced more degrees of freedom into the model and required an additional equation over classical small strain plasticity – the plastic spin. More degrees of
freedom were introduced through internal state variables that were chosen to represent real physical entities – statistically stored dislocations and geometrically necessary dislocations. Only elastic strains or strain-like quantities appear in the free energy, since the elastic distortions of the atoms associated with either external loading or internal defects are the quantities that actually cause changes in the free energy. The Taylor model of internal strength is recovered in the model, and the Mecking and Kocks evolution equation for SSDs is required because of the extra degree of freedom associated with the internal state variable. The internal state variable for GNDs draws upon compatibility theory and really goes back to the original Volterra concept of a dislocation. Since the GNDs are inherently tied to the rotations that occur during deformation, the expression for the plastic spin is sufficient to specify $\boldsymbol{\kappa}_e$. The resulting model predicts observed sample size effects, as illustrated by changing the dimensions of a block in simple shear (Fig. 9). In addition, as long as the mesh size is smaller than the introduced length scale, convergence to a solution results in the post-bifurcation regime. Issues yet to be addressed include appropriate microforce balance laws for the conjugate thermodynamic
Figure 9. Model prediction of specimen size effect in simple shear.
forces associated with the internal state variables and the extra boundary conditions associated with the extra spatial gradient in the model. This will require micromechanical analysis such as discrete dislocation simulations and micromechanics of configurational forces. The importance of the plastic spin in local finite deformation plasticity was independently proposed by Loret (1983) and Dafalias (1983), who characterized the plastic spin using representation theorems for isotropic tensor functions. Other models were later developed based upon single slip by Bammann and Aifantis (1987) and double slip by Prantil et al. (1993), Zbib [36] and Van der Giessen [37]. An extension of the gradient crystal plasticity theory such as the one discussed above has been proposed by Regueiro et al. [38]. Alternatively, Clayton and McDowell (2002) directly calculated the incompatibilities associated with crystal slip and microcracks at the meso level, resulting in a homogenized estimate of an incompatible deformation gradient at the macroscale level. Other approaches include higher order gradients of strain as state variables in an attempt to accomplish the same effects, but the list is too long to give here.
3. Summary
Local theory treats a body as a "continuum" of particles or points, the only geometrical property being that of position. A closer look at materials reveals a complex microstructure of grains, subgrains, shear bands and other topological features of the distribution of mass that are not taken into account by classical local theories. If the observer is far enough removed from a grain, he will see only a point. But a theory that strips away all of the geometrical properties of a grain except for the position of its center of mass will certainly fail to explain the more complex aspects of its mechanical response. In addition, the finite element implementation of such a theory is incapable of dealing with boundary value problems where instabilities such as strain localization or fracture initiation result in a bifurcation in the solution. At the onset of these instabilities, the system of differential equations loses ellipticity and the problem becomes mathematically ill posed, resulting in pathological mesh dependency of the solutions. To overcome these difficulties requires a multi-field approach. The continuum must be embedded with more degrees of freedom to attempt to capture the large degrees of freedom associated with the physics occurring at very small length scales. This can be accomplished by increasing the degrees of freedom in the kinematics, the introduction of internal state variables, or the introduction of higher order spatial gradients of variables. But in each case, an additional degree of freedom requires additional information to complete the system. For example, the multiplicative decomposition of the deformation gradient into elastic and plastic parts results in the
need for an expression for the plastic spin. This is where the physics from smaller length scales must be embedded. Similarly, the extra degrees of freedom associated with internal state variables require equations that describe the temporal and perhaps spatial evolution of the variables. A physical, predictive theory requires that this information come from appropriate micromechanical models, including any additional boundary conditions. This approach allows a true bridging of length scales, and with increasing computational capabilities, more detailed and physically descriptive models will be developed.
References
[1] C.A. Coulomb, Theorie des machines simple, Paris, 1821.
[2] B.d. St. Venant, Compt. Rend., 70, 1870.
[3] B.d. St. Venant, Compt. Rend., 74, 1009, 1872.
[4] J. Bauschinger, Mitt. Mech. Tech., Lab München, Vol. 13, No. 1, 1886.
[5] M.T. Huber, Czasopismo Techniczne, Lwow, 1904.
[6] R.v. Mises, Nachrichten Ges. d. Wiss. Göttingen, 582, 1913.
[7] H. Hencky, Z. angew. Math. Mech., 5, 116, 1925.
[8] G.I. Taylor, Proceedings of the Royal Society A, 145, 362, 1934.
[9] E. Orowan, Z. Phys., 84, 634, 1934.
[10] M. Polanyi, Z. Phys., 660, 1934.
[11] E. Orowan, Proceedings of the Physical Society of London, 52, 8, 1940.
[12] W.G. Johnston and J.J. Gilman, J. Appl. Phys., 30, 129, 1959.
[13] A.S. Argon, Mat. Sci. Eng., 3, 24, 1968/1969.
M.F. Ashby, "The deformation of plastically non-homogeneous materials," Phil. Mag., 21, 399–424, 1970.
[14] H. Mecking and U.F. Kocks, "Kinetics of flow and strain-hardening," Acta Metall., 29, 1865–1875, 1981.
[15] J.M. Burgers, "Some considerations on the fields of stress connected with dislocations in a regular crystal lattice I," Proc. Kon. Ned. Akad. Wetenschap., 42, 293–324, 1939.
[16] M. Peach and J.S. Koehler, "The forces exerted on dislocations and the stress fields produced by them," Physical Review, 80, 436–439, 1950.
[17] R.E. Peierls, P. Phys. Soc. Lond., 52, 34, 1940.
[18] F.R.N. Nabarro, "Dislocations in a simple cubic lattice," Proceedings of the Physical Society of London, 59, 256–272, 1947.
[19] N.M. Ghoniem and R.J. Amodeo, "Computer simulation of dislocation pattern formation," Sol. Stat. Phenom., 3&4, 379–406, 1988.
[20] E. Van der Giessen and A. Needleman, "Discrete dislocation plasticity: a simple planar model," Mater. Sci. Eng., 18, 41, 1995.
[21] J.F. Nye, "Some geometrical relations in dislocated crystals," Acta Metall., 1, 153–162, 1953.
[22] E. Kröner, "Allgemeine Kontinuumstheorie der Versetzungen und Eigenspannungen," Arch. Rat. Mech. Anal., 4, 273–334, 1960.
[23] C. Teodosiu, "A dynamic theory of dislocations and its application to the theory of elastic-plastic continuum," In: Fundamental Aspects of Dislocation Theory, NBS Special Publication 317, U.S. Government Printing Office, Gaithersburg, MD, 1969.
Continuum modeling of mesoscale/macroscale phenomena
1095
[24] P. Steinmann, “Views on multiplicative elastoplasticity and the continuum theory of dislocations,” Int. J. Eng. Sci., 34, 1717–1735, 1996. [25] B.A. Bilby and E. Smith, “Continuous distributions of dislocations III,” Proceedings of the Royal Society of London, A 232, 481–505, 1956. [26] S. Ramaswamy and N. Aravas, “Finite element implementation of gradient plasticity models Part I: Gradient-dependent yield functions,” Comout. Meth. Appl. Mech. Engrg., 163, 11–32, 1998. [27] N.A. Fleck, G.M. Muller, M.F. Ashby, and J.W. Hutchinson, “Strain gradient plasticity: theory and experiments,” Acta Metall. Mater., 42, 475–487, 1994. [28] J.R. Rice, “Inelastic constitutive relations for solids: an internal-variable theory and its application to metal plasticity,” J. Mech. Phys. Solids, 19, 433–455, 1971. [29] A.K. Miller, “An inelastic constitutive equation for monotonic, cyclic and creep deformation; part I, equations development and analytic procedures, part 2, application to type 304 stainless steel,” J. Eng. Mater. Tech., 98H, 97–113, 1976. [30] E.W. Hart, “Constitutive relations for the nonelastic deformaton of metals,” ASME J. Eng. Mater. Tech., 98, 193–202, 1976. [31] J. Kratochvil and O.W. Dillon, Jr., “Thermodynamics of elastic-plastic materials as a theory with internal state variables,” J. Appl. Phys., 40, 3207–3218, 1969. [32] R.D. Krieg, J.C. Swearengen, et al., “A physically based internal variable model for rate dependent plasticity,” Inelastic Behavior of Pressure Vessel and Piping Components ASME/CSME, PVP-PB-028, 15–36, 1978. [33] S. Bodner and Y. Partom, “A large deformation elastic visco-plastic analysis of thick walled spherical shells,” J. Eng. Mater. Tech., 115, 358–364, 1972. [34] B.D. Coleman and M.E. Gurtin, J. Chem. Phys., 47, 597–613, 1967. [35] M.E. Gurtin, “A gradient theory of single-crystal viscoplasticity that accounts for geometrically necessary dislocations,” J. Mech. Phys. Solids, 50, 5–32, 2002. [36] H.M. Zbib, “On the mechanics of large inelastic deformations: kinematics and constitutive modeling,” Acta Mech., 96, 119 138, 1993. [37] E. Van der Giessen, “Micromechanical and thermodynamic aspects of the plastic spin,” Int. J. Plast., 7, 365–386, 1991. [38] R.A. Regueiro, D.J. Bammann, E.B. Marin, and K. Garikipati, “A nonlocal phenomenological anisotropic finite deformation plasticity model accounting for dislocation defects,” J. Eng. Mat. Tech., 124, 380–387, 2002. [39] A. Acharya and J.L. Bassani, “Incompatible lattice deformations and crystal plasticity,” In: N. Ghoniem (ed.), Plastic and Fracture Instabilities in Materials, AMD vol. 200/MD vol. 57, ASME, N.Y., pp. 75–80, 1995. [40] A. Acharya and J.L. Bassani, “Lattice incompatibility and a gradient theory of crystal plasticity,” J. Mech. Phys. Solids, 48, 1565–1595, 2000. [41] D.J. Bammann, “A model of crystal plasticity containing a natural length scale,” Mat. Sci. Eng., A309-310, 406–410, 2001. [42] B.A. Bilby, R. Bullough, and E. Smith, “Continuous distributions of dislocations: a new application of the methods of non-Riemannian geometry,” Proceedings of the Royal Society of London, A 231, 263–273, 1955. [43] G.C. Butler and D.L. McDowell, “Polycrystal constraint and grain subdivision,” Int. J. Plast., 14, 703–717, 1998. [44] P. Cermelli and M.E. Gurtin, “On the characterization of geometrically necessary dislocations in finite plasticity,” J. Mech. Phys. Solids, 49, 1539–1568, 2001. [45] J.L. 
Chaboche, “Viscoplastic relations for the nonelastic deformation of metals,” Bulletin de l’Academie des Sciences Techniques, 25, 33–42,1977.
1096
D.J. Bammann
[46] J.D. Clayton and D.L. McDowell, “A multiscale multiplicative decomposition for elastoplasticity of polycrystals,” Int. J. Plast., in Press, 2002. [47] E. Cosserat and F. Cosserat, Sur la m´ecanique g´en´erale. Comptes Rendus de ´l Academe des Sciences Paris, 145, 1139–1142, 1907. [48] E. Cosserat and F. Cosserat, Th´eorie des Corps D´eformables, Hermann, Paris, 1909. [49] Y.F. Dafalias, “The plastic spin,” J. Appl. Mech., 107, 865–871, 1985. [50] I. Demir, J.P. Hirth, and H.M. Zbib, “The somigliana ring dislocation,” J. Elast., 28, 223–246, 1992. [51] N.A. Fleck and J.W. Hutchinson, “A phenomenological theory for strain gradient effects in plasticity,” J. Mech. Phys. Solids, 41, 1825–1857. 1993. [52] J. Frenkl, Zeit. Phys., 37, 572, 1926. [53] J.J. Gilman, Proceedings of 5th U.S. National Congress Appl. Mech. ASME., 1966. [54] J.J. Gilman, Micromechanics of Flow in Solids, New York, McGraw-Hill, 1969. [55] M.F. Horstemeyer and D.L. McDowell, “Modeling effects of dislocation substructure in polycrystal elastoviscoplasticity,” Mech. Mater., 27, 145–163, 1998. [56] D.A. Hughes, Q. Liu, D.C. Chrzan, and N. Hansen, “Scaling of microstructural parameters: misorientations of deformation induced boundaries,” Acta Mat., 45, 105–112, 1997. [57] E.H. Lee and D.T. Liu, “Elastic-plastic theory with application to plane-wave analysis,” J. Appl. Phys., 38, 19–27, 1967. [58] R.v. Mises, Z. agnew Math. Mech., 8, 161, 1928. [59] Missing [60] P. Perzyna, “The constitutive equations for work-hardening and rate sensitive plastic materials,” Proc. Vibr. Probl., 4, 281–290, 1963. [61] B. Svendsen, “Continuum thermodynamic models for crystal plasticity including the effects of geometrically-necessary dislocations,” J. Mech. Phys. Solids, 50, 1297–1329, 2002.
3.3 DISLOCATION DYNAMICS

H.M. Zbib¹ and T.A. Khraishi²
¹ Washington State University, Pullman, WA, USA
² University of New Mexico, Albuquerque, NM, USA
Crystalline materials are usually far from being perfect and may contain various forms of defects, such as vacancies, interstitials and impurity atoms (point defects), dislocations (line defects), grain boundaries, heterogeneous interfaces and microcracks (planar defects), and chemically heterogeneous precipitates, twins and other strain-inducing phase transformations (volume defects). Indeed, these defects determine to a large extent the strength and mechanical behavior of the crystal. Most often, dislocations define plastic yield and flow behavior, either as the dominant plasticity carriers or through their interactions with the other strain-producing defects. A dislocation can be easily understood by considering that a crystal can deform irreversibly by slip, i.e., by shifting or sliding along one of its atomic planes. If the slip displacement is equal to a lattice vector, the material across the slip plane will preserve its lattice structure and the change of shape will become permanent. However, rather than simultaneous sliding of two half-crystals, the slip displacement proceeds sequentially, starting from one crystal surface and propagating along the slip plane until it reaches the other surface. The boundary between the slipped and still unslipped crystal is a dislocation, and its motion is equivalent to slip propagation. In this picture, crystal plasticity by slip is the net result of the motion of a large number of dislocation lines in response to the applied stress. It is interesting to note that this picture of deformation by slip in crystalline materials was first observed in the nineteenth century [1, 2]. Those authors observed that deformation of metals proceeded by the formation of slip bands on the surface of the specimen. Their interpretation of these results was obscure, since metals were not viewed as crystalline at that time. Over the past seven decades, experimental and theoretical developments have firmly established the principal role of dislocation mechanisms in defining material strength. It is now understood that macroscopic properties of
crystalline materials are derivable, at least in principle, from the behavior of their constituent defects. However, this fundamental understanding has not been translated into a continuum theory of crystal plasticity based on dislocation mechanisms. The major difficulty in developing such a theory is the multiplicity and complexity of the mechanisms of dislocation motion and interactions, which make it impossible to develop a quantitative analytical approach. The problem is further complicated by the need to trace the spatiotemporal evolution of a very large number of interacting dislocations over very long periods of time, as required for the calculation of plastic response in a representative volume element. The practical intractability of dislocation-based approaches, on the one hand, and the developing needs of materials engineering at the nano and micro length scales, on the other, have created the current situation in which the equations of crystal plasticity used for continuum modeling are phenomenological and somewhat disconnected from the degrees of freedom related to the underlying dislocation behavior. Bridging the gap between dislocation physics and continuum crystal plasticity has become possible with advances in computational technology and larger, faster computers. To this end, over the past two decades various discrete dislocation dynamics models have been developed. The early discrete dislocation models were two-dimensional (2D) and comprised periodic cells containing multiple dislocations whose behavior was governed by a set of simplified rules [3–8]. These simulations, although they served as a useful conceptual framework, were limited to 2D and, consequently, could not directly account for such important features of dislocation dynamics as slip geometry, line tension effects, multiplication, certain dislocation intersections and cross-slip, all of which are crucial for the formation of dislocation patterns. In the 1990s, the development of new computational approaches of dislocation dynamics (DD) in three-dimensional (3D) space generated hope for a principal breakthrough in our understanding of dislocation mechanisms and their connection to crystal plasticity [9–12]. In these new models, dislocation motion and interactions with other defects, particles and surfaces are explicitly considered. However, complications with respect to dislocation multiplication, self-interactions and interactions with other defects, and keeping track of complex mechanisms and reactions, have provided a new set of challenges for developing efficient computational algorithms. The DD analysis and its computer simulation modeling devised by many researchers [4, 12–16] have advanced significantly over the past decade. This progress has been further magnified by the idea of coupling DD with continuum mechanics in computational algorithms such as finite element codes. This coupling may pave the way to better understanding of the local response of materials at the nano and micro scales, and globally at the macroscale [17], increasing the potential for future applications of this method in material, mechanical, structural and process engineering analyses. In the following, the
principles of DD analysis will be presented, followed by the procedure for the measurement of local quantities such as plastic distortion and internal stresses. The incorporation of the DD technique into 3D plastic continuum-mechanics-based finite element modeling will then be described. Finally, examples are provided to illustrate the applicability of this powerful technique in materials engineering analysis.
1. Theoretical Fundamentals
In order to better describe the mathematical and numerical aspects of the DD methodology, we first identify the basic geometric conditions and kinetics that control the dynamics of dislocations. This is followed by a discussion of the dislocation equation of motion, the elastic interaction equations, and the discretization of these equations for numerical implementation.
1.1. Kinematics and Geometric Aspects
A dislocation is a line defect in an otherwise perfect crystal, described by its line sense vector ξ and Burgers vector b. The Burgers vector has two distinct components: edge, perpendicular to the line sense vector, and screw, parallel to the line sense vector. Under loading, dislocations glide and propagate on slip planes, causing deformation and change of shape. When the local line direction becomes parallel to the Burgers vector, i.e., of screw character, the dislocation may propagate onto other slip planes. This switching of the slip plane, which makes the motion of dislocations 3D, is better known as cross slip and is an important recovery mechanism to be dealt with in DD. In addition to glide and cross slip, dislocations can also climb in a non-conservative 3D motion by absorbing and/or emitting intrinsic point defects (vacancies and interstitials). Some of these mechanisms become important at high load levels or temperatures, when point defects become more mobile. In summary, 3D dislocation dynamics accounts for the following geometric aspects:
• Dislocation topology: 3D geometry, Burgers vector and line sense.
• Identification of all possible slip planes for each dislocation.
• Changes in the dislocation topology when part of it cross-slips and/or climbs to another plane.
• Multiplication and annihilation of dislocation segments.
• Formation of complex connections and intersections such as junctions, jogs, and branching of the dislocation in multiple directions.
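This geometric bookkeeping maps naturally onto a simple data structure. The following Python sketch (not from the chapter; the class and attribute names are hypothetical) represents a dislocation loop as an ordered set of nodes joined by straight segments, each carrying the loop's Burgers vector and a local line sense:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class DislocationLoop:
    """Piecewise-linear representation of a dislocation loop (cf. Fig. 1)."""
    nodes: np.ndarray      # (N, 3) nodal positions
    burgers: np.ndarray    # (3,) Burgers vector b, shared by the whole loop
    closed: bool = True    # loops are closed; open lines end at surfaces

    def segments(self):
        """Yield (start, end, unit line sense xi, length) per segment."""
        n = len(self.nodes)
        last = n if self.closed else n - 1
        for i in range(last):
            a, c = self.nodes[i], self.nodes[(i + 1) % n]
            t = c - a
            length = np.linalg.norm(t)
            yield a, c, t / length, length

# Example: a small square loop with b along x
loop = DislocationLoop(
    nodes=np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], float),
    burgers=np.array([1.0, 0.0, 0.0]))
for a, c, xi, L in loop.segments():
    # mixed character: |b . xi| runs from 0 (edge) to |b| (screw)
    print(xi, L, abs(loop.burgers @ xi))
```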
1.2. Kinetics and Interaction Forces
The dynamics of the dislocation is governed by a "Newtonian" equation of motion, consisting of an inertia term, a damping term, and a driving force arising from short-range and long-range interactions. Since the strain field of the dislocation varies as the inverse of the distance from the dislocation core, dislocations interact among themselves over long distances. As the dislocation moves, it has to overcome internal drag and local barriers such as the Peierls stress (i.e., lattice friction). The dislocation may encounter local obstacles, such as stacking fault tetrahedra, defect clusters and vacancies, that interact with the dislocation at short range and affect its local dynamics. Furthermore, the internal strain field of randomly distributed local obstacles gives rise to stochastic perturbations of the encountered dislocations, as compared with deterministic forces such as the applied load. This stochastic stress field also contributes to the spatial dislocation patterning in the later deformation stages. The strain field of local obstacles therefore adds spatially irregular, uncorrelated noise to the equation of motion. In addition to the random strain fields of dislocations or local obstacles, thermal fluctuations also provide a stochastic source in dislocation dynamics. Dislocations also interact with free surfaces, cracks, and interfaces, giving rise to what are termed image stresses or forces. In summary, the dislocation may encounter the following set of forces:
• Drag force, Bv, where B is the drag coefficient and v is the dislocation velocity.
• Peierls force F_Peierls.
• Force due to externally applied loads, F_external.
• Dislocation–dislocation interaction force F_D.
• Dislocation self-force F_self.
• Dislocation–obstacle interaction force F_obstacle.
• Image force F_image.
• Osmotic force F_osmotic, resulting from the non-conservative motion (climb) of the dislocation, which involves the absorption or emission of intrinsic point defects.
• Thermal force F_thermal, arising from thermal fluctuations.
The DD approach attempts to incorporate all of the aforementioned kinematics and kinetics aspects into a computationally tractable framework. In the numerical implementation, three-dimensional curved dislocations are treated as a set of connected segments, as illustrated in Fig. 1. It is possible to represent smooth dislocations with any desired degree of realism, provided that the discretization resolution is taken high enough for accuracy (limited by the size of the dislocation core radius r₀, typically the size of one Burgers vector b). In such a representation, the dynamics of dislocation lines is reduced to
the dynamics of the discrete degrees of freedom of the dislocation nodes connecting the dislocation segments.

Figure 1. Discretization of dislocation loops and curves into nodes, segments and collocation points.
1.3. Dislocation Equation of Motion
The velocity v of a dislocation segment s is governed by a first-order differential equation consisting of an inertia term, a drag term and a driving force vector [18–20], such that

$$m_s \dot{v} + \frac{1}{M_s(T, p)}\, v = F_s, \qquad m_s = \frac{1}{v}\frac{\mathrm{d}W}{\mathrm{d}v} \qquad (1)_1$$

$$F_s = F_{\mathrm{Peierls}} + F_D + F_{\mathrm{self}} + F_{\mathrm{external}} + F_{\mathrm{obstacle}} + F_{\mathrm{image}} + F_{\mathrm{osmotic}} + F_{\mathrm{thermal}} \qquad (1)_2$$
In the above equation, the subscript s stands for the segment, m_s is defined as the effective dislocation segment mass density, M_s is the dislocation mobility, which can depend on both the temperature T and the pressure p, and W is the total energy per unit length of a moving dislocation (elastic energy plus
kinetic energy). As implied by (1)_2, the glide force vector F_s per unit length arises from a variety of sources described in the previous section. The following relations for the mass per unit dislocation length have been suggested [19] for screw, (m_s)_screw, and edge, (m_s)_edge, dislocations moving at high speed:

$$(m_s)_{\mathrm{screw}} = \frac{W_0}{v^2}\left(-\gamma^{-1} + \gamma^{-3}\right) \qquad (2)$$

$$(m_s)_{\mathrm{edge}} = \frac{W_0 C^2}{v^4}\left(-16\gamma_l - 40\gamma_l^{-1} + 8\gamma_l^{-3} + 14\gamma + 50\gamma^{-1} - 22\gamma^{-3} + 6\gamma^{-5}\right)$$
where γ_l = (1 − v²/C_l²)^{1/2}, γ = (1 − v²/C²)^{1/2}, C_l is the longitudinal sound velocity, C is the transverse sound velocity, ν is Poisson's ratio, and W_0 = (Gb²/4π) ln(R/r_0) is the rest energy per unit length of the screw, with G the shear modulus. The value of R is typically equal to the size of the dislocation cell (about 1000b) or, in the case of a single dislocation, the shortest distance from the dislocation to the free surface [21]. In the non-relativistic regime, when the dislocation velocity is small compared to the speed of sound, the above expressions reduce to the familiar form m = βρb² ln(R/r_0), where β is a constant dependent on the type of dislocation and ρ is the mass density.
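The expressions in Eq. (2) are straightforward to transcribe into code. The following sketch is a direct (hypothetical, illustrative) implementation; as a sanity check, for v ≪ C the screw expression tends to W_0/C², which is consistent with m = βρb² ln(R/r_0) with β = 1/(4π) when C² = G/ρ.

```python
import numpy as np

def effective_mass(v, W0, C, Cl, kind="screw"):
    """Effective dislocation mass per unit length, Eq. (2) (after [19]).

    v  : dislocation speed (must satisfy v < C and v < Cl)
    W0 : rest energy per unit length of the screw, (G b^2 / 4 pi) ln(R/r0)
    C  : transverse (shear) sound speed;  Cl : longitudinal sound speed
    Note: for very small v the screw formula suffers numerical
    cancellation; its analytic limit there is W0 / C**2.
    """
    gamma = np.sqrt(1.0 - (v / C) ** 2)
    if kind == "screw":
        return (W0 / v**2) * (-1.0 / gamma + gamma**-3)
    gl = np.sqrt(1.0 - (v / Cl) ** 2)
    return (W0 * C**2 / v**4) * (-16*gl - 40/gl + 8/gl**3
                                 + 14*gamma + 50/gamma
                                 - 22/gamma**3 + 6/gamma**5)
```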
1.3.1. Dislocation Mobility Function

The reliability of the numerical simulation depends critically on the accuracy of the dislocation drag coefficient B (= 1/M), which is material dependent. There are a number of phenomenological relations for the dislocation glide velocity v_g [22, 23], including power-law forms and forms with an activation term in an exponential or as the argument of a sinh. Often, however [23, 24], the simple power-law form is adopted for expedience, e.g., v_g = v_s(τ_e/τ_s)^m, resulting in a nonlinear dependence of M on the stress. In a number of cases of pure phonon/electron damping control, or of glide over the Peierls barrier, a constant mobility (m = 1) predicts the results very well. This linear form has been theoretically predicted for a number of cases, as discussed by Hirth and Lothe [21]. Mechanisms of dislocation drag have been studied for a long time, and drag coefficients have been estimated in numerous experimental and theoretical works by atomistic simulations or quantum mechanical calculations (see, e.g., the review by Al'shitz [25]). The determination of each of the two components (phonon and electron drag) that constitute the drag coefficient for a specific material is not trivial, and various simplifications have been made; e.g., the Debye model neglects Van Hove singularities in the phonon spectrum [26], deformation potentials are treated in the isotropic approximation, and so on.
The values are also sensitive to various parameters, such as the mean free path or the core radius. Nevertheless, in typical metals the phonon drag B_ph ranges from 30 to 80 µPa s at room temperature and is less than 0.1 µPa s at very low temperatures around 10 K, while the electron drag B_e is a few µPa s and is expected to be temperature independent. Under strong magnetic fields at low temperature, macroscopic dislocation behavior can be highly sensitive to the orientation relative to the field, to within an accuracy of 1% [27]. Except for special cases such as deformation at high strain rate, the weak dependence of drag on dislocation velocity is usually neglected. Examples of the temperature dependence of each component of the drag coefficient can be found for the case of an edge dislocation in copper [28] or in molybdenum [29]. Generally, however, the dislocation mobility can be, among other things, a function of the angle between the Burgers vector and the dislocation line sense, i.e., the dislocation character, especially at low temperatures. For example, Wasserbäch [30] observed that at low deformation temperatures (77–195 K) the dislocation structure in Ta single crystals consisted of primary and secondary screw dislocations and of tangles of dislocations of mixed character, while at high temperatures (295–470 K) the behavior was similar to that of fcc crystals. Mason and MacDonald [31] measured the mobility of dislocations of an unidentified type in Nb as 4.2 × 10⁴ (Pa s)⁻¹ near room temperature. A smaller value of 3.3 × 10³ (Pa s)⁻¹ was obtained by Urabe and Weertman [32] for the mobility of edge dislocations in Fe. The mobility of screw dislocations in Fe was found to be about a factor of two smaller than that of edge dislocations near room temperature. A theoretical model that explains this large difference in behavior is given in Hirth and Lothe [21]; it is based on the observation that in bcc materials the screw dislocation has a rather complex three-dimensional core structure, resulting in a high Peierls stress and thus a relatively low mobility for screw dislocations, while the mobility of mixed dislocations is higher.
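As an illustration (not from the chapter; all numbers below are hypothetical, room-temperature-like values), the power-law and linear drag-controlled mobility relations discussed above can be sketched as:

```python
def glide_velocity(tau_eff, tau_s, v_s, m=1):
    """Phenomenological power-law glide velocity v_g = v_s (tau_e/tau_s)^m.

    m = 1 recovers the linear, constant-mobility case typical of pure
    phonon/electron damping control.
    """
    return v_s * (tau_eff / tau_s) ** m

# Drag-controlled linear case: v = tau * b / B, with B = B_ph + B_e
B = 50e-6      # total drag coefficient, Pa s (assumed, phonon + electron)
b = 2.5e-10    # Burgers vector magnitude, m (assumed)
tau = 10e6     # resolved shear stress, Pa
print(f"v = {tau * b / B:.1f} m/s")   # -> v = 50.0 m/s
```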
1.3.2. Dislocation Collisions

When two dislocations collide, their response is dominated by their mutual interaction and becomes much less sensitive to the long-range elastic stress associated with external loads, boundary conditions, and all other dislocations present in the system. Depending on the shapes of the colliding dislocations, their approach trajectories and their Burgers vectors, two dislocations may form a dipole, react to annihilate, combine to form a junction, or intersect and form a jog. In the DD analysis, the dynamics of two colliding dislocations is determined by the mutual interaction force acting between them. In the case that the two dislocation segments are parallel (on the same plane or on intersecting planes) and have the same Burgers vector with opposite
sign, they would annihilate if the distance between them is equal to the core size. Otherwise, the colliding dislocations would align themselves to form a dipole, a jog or a junction, depending on their relative positions. A comprehensive review of short-range interaction rules can be found in Rhee, Zbib et al. [33].
1.3.3. Discretization of the Dislocation Equation of Motion

Equation (1) applies to every infinitesimal length along the dislocation line. In order to solve this equation for any arbitrary shape, the dislocation curve may be discretized into a set of dislocation segments, as illustrated in Fig. 1. The velocity vector field over each segment may then be assumed to be linear and, therefore, the problem is reduced to finding the velocities of the nodes connecting these segments. There are many numerical techniques to solve such a problem. Consider, for example, a straight dislocation segment s bounded by two nodes j and j + 1, as depicted in Fig. 1. Within the finite element formulation [34], the velocity vector field is assumed to be linear over the dislocation segment length. This linear vector field v can be expressed in terms of the nodal velocities as v = [N_D]^T V^D, where V^D is the nodal velocity vector and [N_D] is the linear shape function vector [34]. Upon using the Galerkin method, Eq. (1) for each segment can be reduced to a set of six equations for the two discrete nodes (each node has three degrees of freedom). The result can be written in the following matrix–vector form:

$$[M^D]\dot{V}^D + [C^D]V^D = F^D \qquad (3)$$
where $[M^D] = \int m_s N^D N^{D\,T}\,\mathrm{d}l$ is the dislocation segment 6 × 6 mass matrix, $[C^D] = \int (1/M_s)\, N^D N^{D\,T}\,\mathrm{d}l$ is the dislocation segment 6 × 6 damping matrix, and $F^D = \int N^D F_s\,\mathrm{d}l$ is the 6 × 1 nodal force vector. Then, following the standard element assemblage procedure, one obtains a discrete system of equations which can be cast in terms of a global dislocation mass matrix, a global dislocation damping matrix, and a global dislocation force vector. In the case of one dislocation loop with ordered numbering of the nodes around the loop, it can easily be shown that the global matrices are banded with half-bandwidth equal to one. However, when the system contains many loops that interact among themselves and new nodes are generated and/or annihilated continuously, the numbering of the nodes becomes random and the matrices become unbanded. To simplify the computational effort, one can employ the lumped mass and damping matrix method. In this method, the mass matrix [M^D] and damping matrix [C^D] become diagonal (half-bandwidth equal to zero), and the only coupling between the equations is therefore through the nodal force vector F^D. The computation of each component of the force vector is described below.
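Before turning to the force computation, a minimal sketch (assumed, not the chapter's own code) of how the lumped form of Eq. (3) decouples the nodal equations: with diagonal mass and damping, a backward-Euler step reduces to an independent update per degree of freedom, coupled only through the force vector.

```python
import numpy as np

def lumped_step(v, f, m_lump, c_lump, dt):
    """One backward-Euler step of Eq. (3) with lumped (diagonal) matrices.

    v, f    : (3N,) nodal velocities and nodal Peach-Koehler forces
    m_lump  : (3N,) diagonal entries of [M^D]
    c_lump  : (3N,) diagonal entries of [C^D]
    Solves (m/dt + c) v_new = (m/dt) v + f, which is unconditionally
    stable; coupling enters only through f.
    """
    return (m_lump / dt * v + f) / (m_lump / dt + c_lump)
```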
1.4. The Dislocation Stress and Force Fields
The stress induced by any arbitrary dislocation loop at an arbitrary field point P can be computed from the Peach–Koehler integral equation given in Hirth and Lothe [21]. This integral equation, in turn, can be evaluated numerically over many loops of dislocations by discretizing each loop into a series of line segments. Denote:

N_l = total number of dislocation loops;
N_s(l) = number of segments of loop l;
N_n(l) = number of nodes associated with the segments of loop l, i.e., N_n(l) = N_s(l) + 1;
N_s = total number of segments = N_s(l) × N_l (summation over l implied);
N_n = total number of nodes = N_n(l) × N_l (summation over l implied);
l_s = length of segment s;
r = distance from point P to the segment s.

Then the discretized form of the Peach–Koehler integral equation for the stress at any arbitrary field point P becomes

$$\sigma^d_{ij}(P) = \sum_{l=1}^{N_l}\sum_{s=1}^{N_s(l)}\Biggl[-\frac{G}{8\pi}\int_{l_s} b_p\,\epsilon_{mpi}\,\frac{\partial}{\partial x_m}\nabla^2 R\,\mathrm{d}x_j - \frac{G}{8\pi}\int_{l_s} b_p\,\epsilon_{mpj}\,\frac{\partial}{\partial x_m}\nabla^2 R\,\mathrm{d}x_i - \frac{G}{4\pi(1-\nu)}\int_{l_s} b_p\,\epsilon_{mpk}\left(\frac{\partial^3 R}{\partial x_m \partial x_i \partial x_j} - \delta_{ij}\frac{\partial}{\partial x_m}\nabla^2 R\right)\mathrm{d}x_k\Biggr] \qquad (4)$$
where ε_{ijk} is the permutation symbol and R is the magnitude of R = r − r′ (with r the position vector of point P and r′ the position vector of a differential line segment on the dislocation loop or curve). The integral over each segment can be carried out explicitly using the linear element approximation. Exact solutions of Eq. (4) for a straight dislocation segment can be found in DeWit [35] and Hirth and Lothe [21]. However, evaluation of the above integral requires careful consideration, as the integrand becomes singular when point P coincides with one of the nodes of the segment over which the integration is taken, i.e., self-segment integration. Thus:
• If P is not part of segment s, there is no singularity, since R ≠ 0, and the ordinary integration procedure may be performed.
• If P coincides with a node of segment s, special treatment is required due to the singular nature of the stress field as R → 0. Here, the regularization scheme developed by Zbib and co-workers has been employed.
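The bookkeeping implied by these two cases can be sketched as follows (a hypothetical illustration: the segment attributes and the two stress kernels, the closed-form segment stress and the regularized self-segment stress, are assumed to be supplied; only the case split is shown):

```python
import numpy as np

def stress_at_point(P, segments, seg_stress, self_stress, tol=1e-12):
    """Total dislocation stress at field point P, assembling Eq. (4)/(5).

    seg_stress(P, seg) : 3x3 closed-form stress of one straight segment
                         (ordinary case, R != 0)
    self_stress(P, seg): 3x3 regularized kernel used when P coincides
                         with a node of the segment (R -> 0 limit)
    """
    sigma = np.zeros((3, 3))
    for seg in segments:
        if min(np.linalg.norm(P - seg.start),
               np.linalg.norm(P - seg.end)) < tol:
            sigma += self_stress(P, seg)   # special treatment near a node
        else:
            sigma += seg_stress(P, seg)    # ordinary integration
    return sigma
```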
In general, the dislocation stresses can be decomposed into the following form:

$$\sigma^d(P) = \sum_{s=1}^{N_s-2} \sigma^{d(s)} + \sigma^{d(P+)} + \sigma^{d(P-)} \qquad (5)$$

where σ^{d(s)} is the contribution to the stress at point P from a segment s, and σ^{d(P+)}, σ^{d(P−)} are the contributions from the two segments that share the node coinciding with P, which will be discussed further below. Once the dislocation stress field is computed, the forces on each dislocation segment can be calculated by summing the stresses along the length of the segment. The stresses comprise those coming from the dislocations, as formulated above, together with any externally applied stresses, the internal friction (if any), and the stresses induced by any other defects or micro-constituents. A model for the osmotic force F_osmotic is given in Raabe [36], and its inclusion in the total force is straightforward since it is a deterministic force. However, the treatment of the thermal force F_thermal is not trivial, since this force is stochastic in nature, requiring special consideration and an algorithm leading to what is called stochastic dislocation dynamics (SDD), as developed by Hiratani and Zbib [37]. The force acting on each segment can therefore be written as
$$F_s = \left(\sum_{m=1}^{N_s} \sigma^{d(m)} + \sigma^{a(s)} + \tau_s\right)\cdot b_s \times \xi_s = F^d_s + F^a_s + F_{\mathrm{thermal}} \qquad (6)$$

where σ^{d(m)} is the contribution to the stresses along segment s from another segment m (dislocation–dislocation interaction), σ^{a(s)} is the sum of all externally applied stresses, the internal friction (if any), and the stresses induced by any other defects, and τ_s is the thermal stress; F^d_s, F^a_s and F_thermal are the corresponding total Peach–Koehler (PK) forces.
Using Eq. (5), the force F^d_s can also be decomposed into two parts, one arising from all other dislocation segments and one from the self-segment, better known as the self-force:

$$F^d_s = \sum_{m=1}^{N_s-2} F^{d(m)}_s + F^{d(\mathrm{self})}_s \qquad (7)$$

where F^{d(m)}_s and F^{d(self)}_s are, respectively, the contribution to the force on segment s from segment m and the self-force. In order to evaluate the self-force, a special numerical treatment as given by Zbib, Rhee et al. [12] and Zbib and
Diaz de la Rubia [17] should be used, in which exact expressions for the self-force are given. This approximation works well, in terms of accuracy and numerical convergence, for segment lengths as small as 20b. For finer segments, one can use a more accurate approximation as suggested by Scattergood and Bacon [38]. Another treatment has been given by Gavazza and Barnett [39] and used in the recent work of Ghoniem and Sun [40]. The direct computation of the dislocation forces discussed above requires the use of a very fine mesh, especially when dealing with problems involving dislocation–defect interaction. As a rule, to capture the effect of very small defects, the dislocation segment size must be comparable to the size of the defect. Alternatively, one can use dislocation segments that are large compared to the smallest defect size, provided that the force interaction is computed over many points (Gauss points) along the segment length. In this case, the self-force of segment s is evaluated first. The force contribution from other dislocations and defects is then calculated by computing the stresses at several Gauss points along the length of the segment. The summation as in Eq. (6) then follows according to

$$F_s = F^{\mathrm{self}}_s + \left(\sum_{m=1}^{N_s-2} \frac{1}{n_g}\sum_{g=1}^{n_g} \sigma^{d(m)}(p_g) + \cdots\right)\cdot b_s \times \xi_s \qquad (8)$$

where p_g is Gauss point g and n_g is the number of Gauss points along segment s. The number of Gauss points depends on the length of the segment; as a rule, the shortest distance between two Gauss points should be greater than or equal to 2r_0, i.e., twice the core size.
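A minimal sketch of the sampling in Eq. (8) follows (hypothetical, not the chapter's code; uniformly spaced sample points stand in for the Gauss points, and the segment attributes and the stress callback are assumed to be supplied):

```python
import numpy as np

def segment_force(seg, other_stress, f_self, n_g, r0):
    """Peach-Koehler force on a segment by point sampling, cf. Eq. (8).

    other_stress(x): 3x3 stress at x from all other segments and defects
    f_self         : (3,) precomputed self-force of the segment
    """
    L = np.linalg.norm(seg.end - seg.start)
    n_g = min(n_g, max(1, int(L / (2 * r0))))   # keep spacing >= 2 r0
    s = (np.arange(n_g) + 0.5) / n_g
    pts = seg.start + s[:, None] * (seg.end - seg.start)
    sigma_avg = sum(other_stress(p) for p in pts) / n_g
    # PK force per unit length: (sigma . b) x xi
    return f_self + np.cross(sigma_avg @ seg.burgers, seg.xi)
```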
1.5. The Stochastic Force and Cross-slip
Thermal fluctuations can arise from dissipation mechanisms due to collisions of dislocations with surrounding particles, such as phonons or electrons. Rapid collisions and momentum transfers result in random forces on the dislocations. These stochastic collisions can, in turn, be regarded as an uncorrelated noise of thermal forces acting on the dislocations. Suppose the exertion of thermal forces follows a Gaussian distribution. Then thermal fluctuations most likely result in very small net forces due to mutual cancellations. Sometimes, however, they become large and may cause diffusive dislocation motion or thermally activated events such as the overcoming of obstacle barriers. Therefore, the DD simulation model should account not only for deterministic effects but also for stochastic forces, leading to a model called "stochastic discrete dislocation dynamics" (SDD) [41]. The procedure is to include the stochastic force F_thermal in the DD model by computing the magnitude of the stress pulse (τ_s) using a Monte Carlo type analysis.
Table 1. The stress pulse peak height for various combinations of parameters, Δt = 50 fs

T (K)    1/M (µPa s)    τ_h (MPa) (l = 5b)    τ_h (MPa) (l = 10b)
 10           2               11.5                   8.11
 50           5               40.6                  28.7
100          10               81.1                  57.4
300          30              256                   181
Based on the assumption of a Gaussian process, the thermal stress pulse has zero mean and no correlation between any two different times [36, 42]. This leads to the average peak height [43, 44]

$$\sigma_s = \sqrt{\frac{2kT}{M_s b^2 l\,\Delta t}} \qquad (9)$$
where k denotes the Boltzmann constant, T the absolute temperature of the system, b the magnitude of the Burgers vector, Δt the time step, and l the dislocation segment length. Some values of the peak height are shown in Table 1 for typical combinations of parameters [41]. Here, Δt is chosen to be 50 fs, roughly the inverse of the Debye frequency. The numerical implementation includes an algorithm in which stochastic components are evaluated at each time step, their strengths being correlated and sampled from a bivariate Gaussian distribution [45].* With the inclusion of stochastic forces in the DD analysis, one can treat cross-slip (a thermally activated process) in a direct manner, since the waiting time and the thermal agitation are naturally included in the stochastic process. For example, for cross-slip in fcc metals one can develop a model based on the Escaig–Friedel (EF) mechanism, in which cross-slip of a screw dislocation segment is initiated by an immediate dissociation and expansion of Shockley partials. The EF mechanism has been observed to have a lower activation energy than the Schoeck–Seeger mechanism, in which double super-kinks are formed on the cross-slip plane (the latter model is used for cross-slip in bcc [33]). In the EF mechanism, the activation enthalpy ΔG depends on the separation of the Shockley partials (d) and the resolved shear stress on the initial glide plane (σ) (see, e.g., the MD simulations of Rasmussen and Jacobsen [46] and Rao, Parthasarathy et al. [47]). The constriction interval L is also dependent on σ. For example, for the case of copper, the activation energy for cross-slip can be computed using an empirical formula fitted to the MD results of Rao, Parthasarathy et al. [47]. Figure 2 depicts ΔG(σ) for the case of copper, where the value of the activation free energy is 1.2 eV and the stacking fault energy is 0.045 J/m².

* Hiratani and Zbib here generate stress pulses as τ_s = σ_s √(−2 ln r₁) cos(2πr₂), where r₁ and r₂ are uniform random numbers between zero and unity [45].
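Equation (9) and the Box–Muller sampling of the footnote combine into a few lines of code. The sketch below is illustrative (the Burgers vector magnitude is an assumed copper-like value, not taken from the chapter); with T = 300 K, 1/M = 30 µPa s and l = 5b it reproduces the Table 1 entry to within the assumed b.

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant, J/K

def pulse_height(T, B, b, ell, dt):
    """Standard deviation of the thermal stress pulse, Eq. (9), B = 1/M."""
    return np.sqrt(2.0 * KB * T * B / (b**2 * ell * dt))

def thermal_pulse(sigma_s, rng):
    """One Box-Muller sample: tau_s = sigma_s sqrt(-2 ln r1) cos(2 pi r2)."""
    r1 = 1.0 - rng.random()   # in (0, 1]; avoids log(0)
    r2 = rng.random()
    return sigma_s * np.sqrt(-2.0 * np.log(r1)) * np.cos(2.0 * np.pi * r2)

b = 2.55e-10                  # assumed copper-like Burgers vector, m
sigma_s = pulse_height(300.0, 30e-6, b, 5 * b, 50e-15)
print(f"{sigma_s/1e6:.0f} MPa")          # ~250 MPa, cf. 256 MPa in Table 1
tau = thermal_pulse(sigma_s, np.random.default_rng(0))
```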
Figure 2. The normalized activation enthalpy ΔG/ΔG_c for copper as a function of the normalized resolved shear stress σ/µ on the glide plane. ΔG_c and µ denote the activation free energy and the shear modulus, respectively (from Hiratani and Zbib, 2003).
This activation energy for stress-assisted cross-slip is entered as input data into the DD code. Usually, within the DD code, dislocations are represented as perfect dislocations, while a pair of parallel Shockley partials is introduced, in the case of screw dislocation segments, only for the stress calculation. A Monte Carlo type procedure is then used to select either the initial plane or the cross-slip plane according to the activation enthalpy [33]. For simplicity, one can set the regime of the barrier with an area of L × d and a strength of ΔG/Ld. The virtual Shockley partials move according to the Langevin forces, in addition to the systematic forces, until the partials overcome the barrier and their separation decreases to the core distance. The implementation of this model captures the anisotropic response of the cross-slip activation process to the loading direction, as well as the time duration (waiting time) of the cross-slip event, both of which were missing in earlier DD simulations.
1.6. Modifications for Long-Range Interactions: The Super-Dislocation Principle
Inclusion of the interaction among all the dislocation loops present in a large body is computationally expensive since the number of computations
per step would be proportional to N_s², where N_s is the number of dislocation segments. A numerical compromise technique termed the super-dislocation method, based on the multipolar expansion method [7, 12, 33], reduces the order of computation to N_s log N_s with high accuracy. In this approach, dislocations far away from the point of interest are grouped together into a set of equivalent monopoles and dipoles. In the numerical implementation of the DD model, the 3D computational domain is divided into sub-domains, the dislocations in each sub-domain (if any) are grouped together in terms of monopoles, dipoles, etc. (depending on the desired accuracy), and the far stress field is then computed.
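The grouping step can be sketched as follows (a hypothetical, lowest-order illustration only: each cell accumulates a length-weighted Burgers content at its centroid, and dipole and higher terms are omitted; segment attributes are assumed):

```python
import numpy as np

def group_monopoles(segments, domain_size, n_cells):
    """Group far-field segments into per-cell monopoles (first step of
    the super-dislocation method; dipole corrections not shown)."""
    h = domain_size / n_cells
    cells = {}
    for seg in segments:
        mid = 0.5 * (seg.start + seg.end)
        key = tuple((mid // h).astype(int))        # sub-domain index
        c = cells.setdefault(key, {"moment": np.zeros(3),
                                   "centroid": np.zeros(3), "n": 0})
        L = np.linalg.norm(seg.end - seg.start)
        c["moment"] += L * seg.burgers             # length-weighted b
        c["centroid"] += mid
        c["n"] += 1
    for c in cells.values():
        c["centroid"] /= c["n"]
    return cells
```

Far-field stresses are then evaluated once per cell at the field point, rather than once per segment, which is the source of the N_s log N_s scaling.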
1.7. Evaluation of Plastic Strains
The motion of each dislocation segment gives rise to plastic distortion, which is related to the macroscopic plastic strain rate tensor ε̇^p and the plastic spin tensor W^p through the relations

$$\dot{\varepsilon}^p = \sum_{s=1}^{N_s} \frac{l_s v_{gs}}{2V}\,(n_s \otimes b_s + b_s \otimes n_s) \qquad (10)_1$$

$$W^p = \sum_{s=1}^{N_s} \frac{l_s v_{gs}}{2V}\,(n_s \otimes b_s - b_s \otimes n_s) \qquad (10)_2$$
where n_s is the unit normal to the slip plane, v_{gs} is the magnitude of the glide velocity of segment s, and V is the volume of the representative volume element (RVE). These relations provide the most rigorous connection between dislocation motion (the fundamental mechanism of plastic deformation in crystalline materials) and the macroscopic plastic strain, with the dependence on strength and applied stress explicitly embedded in the calculation of the velocity of each dislocation. Length scale effects are explicitly included in the calculation through the long-range interactions. Another microstructure quantity, the dislocation density tensor α, can also be calculated, according to
$$\alpha = \sum_{s=1}^{N_s} \frac{l_s}{V}\, b_s \otimes \xi_s \qquad (11)$$
This quantity provides a direct measure of the net Burgers vector that gives rise to strain gradient relief (bending of the crystal) [48].
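Equations (10) and (11) are simple sums over the segment population and translate directly into code. The sketch below is illustrative (the per-segment attribute names are assumptions):

```python
import numpy as np

def plastic_rates(segments, V):
    """Plastic strain rate, plastic spin and dislocation density tensor
    from the current segment population, Eqs. (10) and (11).

    Each segment s is assumed to carry: length l, glide speed vg,
    unit slip-plane normal n, Burgers vector b and unit line sense xi.
    """
    eps_dot = np.zeros((3, 3))
    W_dot = np.zeros((3, 3))
    alpha = np.zeros((3, 3))
    for s in segments:
        nb = np.outer(s.n, s.b)
        w = s.l * s.vg / (2.0 * V)
        eps_dot += w * (nb + nb.T)                  # Eq. (10)_1
        W_dot += w * (nb - nb.T)                    # Eq. (10)_2
        alpha += (s.l / V) * np.outer(s.b, s.xi)    # Eq. (11)
    return eps_dot, W_dot, alpha
```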
1.8. The DD Numerical Solution: An Implicit–Explicit Integration Scheme
An implicit algorithm with a backward integration scheme may be used to solve the equation of motion (3), yielding the recurrence equation

$$v^{t+\delta t}\left(1 + \frac{\delta t}{m_s M_s}\right) = v^t + \frac{\delta t}{m_s}\,F_s^{t+\delta t} \qquad (12)$$
This integration scheme is unconditionally stable for any time step size. However, the DD time step is determined by two factors: (i) the shortest flight distance for short-range interactions, and (ii) the time step used in the dynamic finite element modeling to be described later. This choice is made because the time step in the DD analysis (for high strain rates) is of the same order of magnitude as the time required for a stable explicit finite element (FE) dynamic analysis. Thus, in order to ensure convergence and a stable solution, the critical time t_c and the time step for both the DD and the FE analyses ought to be t_c = l_c/C_l and δt = t_c/20, respectively, where l_c is the characteristic length scale, i.e., the shortest dimension in the finite element mesh (a sketch of this update follows the parameter list below).

In summary, the system of equations given above contains the basic ingredients that a DD simulation model should include. There are a number of variations in the manner in which the dislocation curves may be discretized, for example zero-order elements (pure screw and pure edge), first-order elements (piecewise linear segments of mixed character), or higher-order nonlinear elements, but this is purely a numerical issue. Nonetheless, the DD model should have the minimum number of parameters and, ideally, all of them should be basic physical and material parameters, rather than phenomenological ones, for the DD result to be predictive. The DD model described above has the following set of physical and material parameters:

• Burgers vectors,
• elastic properties,
• core size (equal to one Burgers vector),
• thermal conductivity and specific heat,
• mass density,
• stacking fault energy, and
• dislocation mobility.
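As promised above, a minimal sketch of the implicit velocity update of Eq. (12) and the accompanying time step selection (the numbers in the example are hypothetical):

```python
def implicit_velocity_update(v, F_new, m_s, M_s, dt):
    """Backward-Euler recurrence of Eq. (12), unconditionally stable:
       v(t+dt) [1 + dt/(m_s M_s)] = v(t) + (dt/m_s) F(t+dt)."""
    return (v + dt / m_s * F_new) / (1.0 + dt / (m_s * M_s))

# Time step selection (Section 1.8): dt = t_c / 20 with t_c = l_c / C_l,
# l_c being the shortest finite element dimension (assumed values below)
l_c, C_l = 1e-6, 5000.0
dt = (l_c / C_l) / 20.0
```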
There are also two numerical parameters: the segment length (the minimum segment length cannot be less than three times the core size) and the time step (as discussed in conjunction with Eq. (12)), but both are fixed to ensure convergence of the result. In the above list, it is emphasized that the dislocation mobility is, in general, an intrinsic material property that reflects the local drag mechanisms discussed above. One can use an "effective" mobility that accounts
for additional drag from dislocation–point defect interactions and thermal activation processes if the defects/obstacles are not explicitly included in the DD simulations. However, there is no reason not to include these effects explicitly in the DD simulations (as done in the model described above), i.e., dislocation–defect interaction, stochastic processes and inertia effects, which actually permits the prediction of the "effective" mobility from the DD analysis [37, 44].
References
[1] O. Mügge, Neues Jahrb. Min., 13, 1883.
[2] J.A. Ewing and W. Rosenhain, "The crystalline structure of metals," Phil. Trans. Roy. Soc. A, 193, 353–375, 1899.
[3] J. Lepinoux and L.P. Kubin, "The dynamic organization of dislocation structures: a simulation," Scripta Metall., 21, 833–838, 1987.
[4] N.M. Ghoniem and R.J. Amodeo, "Computer simulation of dislocation pattern formation," Sol. Stat. Phenom., 3 & 4, 379–406, 1988.
[5] I. Groma and G.S. Pawley, "Role of the secondary slip system in a computer simulation model of the plastic behavior of single crystals," Mater. Sci. Engrg. A, 164, 306–311, 1993.
[6] E. Van der Giessen and A. Needleman, "Discrete dislocation plasticity: a simple planar model," Modelling Simul. Mater. Sci. Eng., 3, 689–735, 1995.
[7] H.Y. Wang and R. LeSar, "O(N) algorithm for dislocation dynamics," Phil. Mag. A, 71, 149–164, 1995.
[8] K.C. Le and H. Stumpf, "A model of elastoplastic bodies with continuously distributed dislocations," Int. J. Plasticity, 12, 611–628, 1996.
[9] L.P. Kubin and G. Canova, "The modelling of dislocation patterns," Scripta Metall., 27, 957–962, 1992.
[10] G. Canova, Y. Brechet, L.P. Kubin, B. Devincre, V. Pontikis, and M. Condat, "3D simulation of dislocation motion on a lattice: application to the yield surface of single crystals," Microstructures and Physical Properties, J. Rabier (ed.), CH-Transtech, 1993.
[11] J.P. Hirth, M. Rhee, and H.M. Zbib, "Modeling of deformation by a 3D simulation of multipole, curved dislocations," J. Computer-Aided Materials Design, 3, 164–166, 1996.
[12] H.M. Zbib, M. Rhee, and J.P. Hirth, "3D simulation of curved dislocations: discretization and long range interactions," Advances in Engineering Plasticity and its Applications, T. Abe and T. Tsuta (eds.), Pergamon, NY, 15–20, 1996.
[13] G.R. Canova, Y. Brechet, and L.P. Kubin, "3D dislocation simulation of plastic instabilities by work softening in alloys," In: S.I. Anderson et al. (eds.), Modelling of Plastic Deformation and Its Engineering Applications, Risø National Laboratory, Roskilde, Denmark, 1992.
[14] L.P. Kubin, "Dislocation patterning during multiple slip of FCC crystals," Phys. Stat. Sol. (a), 135, 433–443, 1993.
[15] K.W. Schwarz and J. Tersoff, "Interaction of threading and misfit dislocations in a strained epitaxial layer," Appl. Phys. Lett., 69(9), 1220, 1996.
[16] H.M. Zbib, M. Rhee, and J.P. Hirth, "On plastic deformation and the dynamics of 3D dislocations," Int. J. Mech. Sci., 40, 113–127, 1998.
[17] H.M. Zbib and T. Diaz de la Rubia, "A multiscale model of plasticity," Int. J. Plasticity, 18(9), 1133–1163, 2002.
[18] J.P. Hirth, "Injection of dislocations into strained multilayer structures," Semiconductors and Semimetals, Academic Press, 37, 267–292, 1992.
[19] J.P. Hirth, H.M. Zbib, and J. Lothe, "Forces on high velocity dislocations," Modeling & Simulations in Maters. Sci. & Enger., 6, 165–169, 1998.
[20] H. Huang, N. Ghoniem, T. Diaz de la Rubia, M. Rhee, H.M. Zbib, and J.P. Hirth, "Development of physical rules for short range interactions in BCC crystals," ASME-JEMT, 121, 143–150, 1999.
[21] J.P. Hirth and J. Lothe, Theory of Dislocations, Wiley, New York, 1982.
[22] U.F. Kocks, A.S. Argon, and M.F. Ashby, Thermodynamics and Kinetics of Slip, Pergamon Press, Oxford, 1975.
[23] R. Sandstrom, "Subgrain growth occurring by boundary migration," Acta Metall., 25, 905–911, 1977.
[24] W.G. Johnston and J.J. Gilman, "Dislocation velocities, dislocation densities, and plastic flow in lithium fluoride crystals," J. Appl. Phys., 30, 129–144, 1959.
[25] V.I. Al'shitz, "The phonon-dislocation interaction and its role in dislocation dragging and thermal resistivity," In: V.L. Indenbom and J. Lothe (eds.), Elastic Strain and Dislocation Mobility, Elsevier Science Publishers B.V., Chapter 11, 1992.
[26] N.W. Ashcroft and N.D. Mermin, Solid State Physics, Saunders College, 1976.
[27] T.J. McKrell and J.M. Galligan, "Instantaneous dislocation velocity in iron at low temperature," Scripta Materialia, 42, 79–82, 2000.
[28] M. Hiratani and E.M. Nadgorny, "Combined model of dislocation motion with thermally activated and drag-dependent stages," Acta Mat., 40, 4337–4346, 2001.
[29] C. Jinpeng, V.V. Bulatov, and S. Yip, "Molecular dynamics study of edge dislocation motion in a bcc metal," J. Comput. Mat., 6, 165–173, 1999.
[30] W. Wasserbäch, "Plastic deformation and dislocation arrangement of Nb–34 at.% Ta alloy crystals," Phil. Mag. A, 53, 335–356, 1986.
[31] W. Mason and D. MacDonald, "Damping of dislocations in niobium by phonon viscosity," J. Appl. Phys., 42, 1836, 1971.
[32] N. Urabe and J. Weertman, "Dislocation mobility in potassium and iron single crystals," Mater. Sci. Engng., 18, 41, 1975.
[33] M. Rhee, H.M. Zbib, J.P. Hirth, H. Huang, and T.D. de la Rubia, "Models for long/short range interactions in 3D dislocation simulation," Modeling & Simulations in Maters. Sci. & Enger., 6, 467–492, 1998.
[34] K.J. Bathe, Finite Element Procedures in Engineering Analysis, Prentice-Hall, New Jersey, 1982.
[35] R. DeWit, "The continuum theory of stationary dislocations," Solid State Phys., 10, 249–292, 1960.
[36] D. Raabe, "Introduction of a hybrid model for the discrete 3D simulation of dislocation dynamics," Comput. Mater. Sci., 11, 1–15, 1998.
[37] M. Hiratani and H.M. Zbib, "Stochastic dislocation dynamics for dislocation-defects interaction," J. Enger. Mater. Tech., 124, 335–341, 2002.
[38] R.O. Scattergood and D.J. Bacon, "The Orowan mechanism in anisotropic crystals," The Philosophical Magazine, 31, 179–198, 1975.
[39] S.D. Gavazza and D.M. Barnett, "The self-force on a planar dislocation loop in an anisotropic linear-elastic medium," J. Mech. Phys. Solids, 24, 171–185, 1976.
[40] N.M. Ghoniem and L. Sun, "A fast sum method for the elastic field of 3D dislocation ensembles," Phys. Rev. B, 60, 128–140, 1999.
[41] M. Hiratani and H.M. Zbib, "On dislocation-defect interaction and patterning: stochastic discrete dislocation dynamics," J. Nuc. Enger., in press, 2003.
[42] D. Rönnpagel, T. Streit, and T. Pretorius, "Including thermal activation in simulation calculations of dislocation glide," Phys. Stat. Sol., 135, 445–454, 1993.
[43] T.J. Koppenaal and D. Kuhlmann-Wilsdorf, "The effect of prestressing on the strength of neutron-irradiated copper single crystals," Appl. Phys. Lett., 4, 59, 1964.
[44] M. Hiratani, H.M. Zbib, and M.A. Khaleel, "Modeling of thermally activated dislocation glide and plastic flow through local obstacles," Int. J. Plasticity, 19, 1271–1296, 2003.
[45] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford Science Publications, 1987.
[46] T. Rasmussen and K.W. Jacobsen, "Simulations of atomic structure, energetics, and cross slip of screw dislocations in copper," Phys. Rev. B, 56(6), 2977, 1997.
[47] S. Rao, T.A. Parthasarathy, and C. Woodward, "Atomistic simulation of cross-slip processes in model fcc structures," Phil. Mag. A, 79, 1167, 1999.
[48] K. Shizawa and H.M. Zbib, "Thermodynamical theory of strain gradient elastoplasticity with dislocation density: Part I – Fundamentals," Int. J. Plasticity, 15, 899–938, 1999.
3.4 DISCRETE DISLOCATION PLASTICITY

E. Van der Giessen¹ and A. Needleman²
¹ University of Groningen, Groningen, The Netherlands
² Brown University, Providence, RI, USA
Plastic deformation of crystalline solids is of both scientific and technological interest. Over a wide temperature range, the principal mechanism of plastic deformation in crystalline solids involves the glide of large numbers of dislocations. As a consequence, since the 1930s, when dislocations were identified as carriers of plastic deformation in crystalline solids, there has been considerable interest in elucidating the physics of individual dislocations and of dislocation structures. Major effort has also been devoted to developing tools to solve boundary value problems based on phenomenological continuum descriptions, in order to predict the plastic deformations that result in structures and components from some imposed loading. Since the 1980s these two approaches have grown toward each other, driven by, for instance, miniaturization and the need for more accurate models in engineering design. The approaches meet at a scale where the collective behavior of individual dislocations controls phenomena. This encounter, together with continuously increasing computing power, has fostered the development of an approach in which boundary value problems are solved with plastic flow modeled in terms of the collective motion of discrete dislocations represented as line defects in a linear elastic continuum [1, 2]. This is the field of discrete dislocation plasticity. A dislocation is a line defect in a crystalline solid which bounds the region on a plane where the material above and below are shifted relative to each other. This shift is termed the slip, and the key geometric ingredient of discrete dislocation plasticity is the Burgers vector, which characterizes the magnitude and direction of the slip. As a consequence of slip, the displacement field is not continuous. The associated stress, strain and rotation fields are continuous except on the dislocation line, where they are singular. The state near the dislocation line, the dislocation core region, is not accurately represented by linear elasticity theory. However, atomistic simulations have shown that the linear
elastic fields give an excellent description of the displacement fields beyond 8–10 Burgers vectors from the core, so that the stress and deformation are also well described by the linear fields. A discrete dislocation model of plastic flow entails the simulation of the evolution of the dislocation structure in response to a prescribed loading. The history dependence of plastic deformation is thus contained in the history of the dislocation structure. The physical mechanisms that underlie phenomena such as dislocation glide, annihilation, cross slip, etc. are governed by core-level atomic-scale events, and their governing properties are supplied in the form of constitutive rules. In this section, we outline discrete dislocation plasticity, giving a perspective on key assumptions, capabilities and limitations.
1. Discrete Dislocation Dynamics
The aim is to determine the quasi-static evolution of the deformation and stress states for a dislocated solid subject to some prescribed loading history. This is done in an incremental manner in time. At a given instant, the stress state and dislocation structure are presumed known. An increment of loading is prescribed, and (i) the updated deformation and stress state, and (ii) the change in the dislocation structure need to be computed. The dislocations are represented as line singularities in a linear elastic solid. The long range interaction between dislocations is determined directly from elasticity theory, but constitutive rules are required for dislocation motion, dislocation nucleation, dislocation annihilation and, possibly, other short range interactions. Each time step involves three main computational stages: (i) determining the driving force for dislocation motion; (ii) determining the rate of change of the dislocation structure, which involves the motion of dislocations, the generation of new dislocations, their mutual annihilation, and their possible pinning at obstacles; and (iii) determining the stress and strain state for the updated dislocation arrangement. The key idea for determining the stress and deformation state of the solid given the current dislocation structure is superposition. The equilibrium stress and strain fields associated with the individual dislocations are singular, but they are known analytically [1, 2]. For a body with specified boundary conditions, the actual stress and deformation fields can be written as the sum of the singular fields associated with the individual dislocations and a nonsingular image field that enforces the boundary conditions. The advantage of this superposition is that while standard numerical methods for elasticity problems such as finite element, finite difference or boundary element methods cannot accurately represent the strongly singular individual dislocation fields, they can accurately resolve the image fields.
The governing equations to be satisfied at time t are:

• Equilibrium,

$$\frac{\partial \sigma_{ij}}{\partial x_j} = 0, \qquad (1)$$

together with σ_ij = σ_ji.

• The constitutive relation,

$$\sigma_{ij} = L_{ijkl}\,\epsilon_{kl}, \qquad (2)$$

where L_ijkl are the components of the tensor of elastic moduli.

• The strain–displacement relation,

$$\epsilon_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right). \qquad (3)$$
For a dislocated solid, the strain field does not satisfy compatibility, i.e.,

$$\oint_C u_{i,j}\,\mathrm{d}s \neq 0 \qquad (4)$$

since the displacement field is not a continuous single-valued function.

• Boundary conditions, i.e., either prescribed displacements U_i⁰ or prescribed tractions σ_ij n_j = T_i⁰ on the boundary with outward unit normal n_i.

The total displacement u_i, strain ε_ij, and stress σ_ij fields are written as

$$u_i = \tilde{u}_i + \hat{u}_i, \qquad \epsilon_{ij} = \tilde{\epsilon}_{ij} + \hat{\epsilon}_{ij}, \qquad \sigma_{ij} = \tilde{\sigma}_{ij} + \hat{\sigma}_{ij} \quad \text{in } V, \qquad (5)$$
respectively. The (˜) fields are the superposition of the fields of the individual dislocations in their current configuration, i.e.,

$$\tilde{u}_i = \sum_I u_i^I, \qquad \tilde{\epsilon}_{ij} = \sum_I \epsilon_{ij}^I, \qquad \tilde{\sigma}_{ij} = \sum_I \sigma_{ij}^I \quad (I = 1, \ldots, N) \qquad (6)$$
where ( )^I denotes the singular field associated with an individual dislocation, N being the number of dislocations in the current configuration. The (˜) fields give rise to tractions T̃_i and displacements Ũ_i on the boundary of the body. The (ˆ) fields represent the image fields that correct for the actual boundary conditions on S. The governing equations for the (ˆ) fields are

$$\frac{\partial \hat{\sigma}_{ij}}{\partial x_j} = 0, \qquad (7)$$

$$\hat{\sigma}_{ij} = L_{ijkl}\,\hat{\epsilon}_{kl}, \qquad \hat{\epsilon}_{ij} = \frac{1}{2}\left(\frac{\partial \hat{u}_i}{\partial x_j} + \frac{\partial \hat{u}_j}{\partial x_i}\right) \qquad (8)$$
$$\hat{\sigma}_{ij} n_j = \hat{T}_i = T_i^0 - \tilde{T}_i \quad \text{on } S_T, \qquad \hat{u}_i = \hat{U}_i = U_i^0 - \tilde{U}_i \quad \text{on } S_u \qquad (9)$$
Here, S_T is the portion of the boundary on which tractions are prescribed and S_u is the portion on which displacements are prescribed, as illustrated in Fig. 1. A key point is that the (ˆ) fields are smooth, so that Eqs. (7)–(9) constitute a conventional linear elastic boundary value problem that can be conveniently solved by a conventional numerical method for linear elasticity problems. To date, only the finite element method has been used for this purpose, but other methods are also suitable and, for example, boundary element methods may have advantages for three-dimensional problems. The driving force for dislocation evolution is the Peach–Koehler force, which is the configurational force associated with a change in dislocation position. With Π denoting the potential energy, the Peach–Koehler force f^I on dislocation I is given by

$$\delta\Pi = -\sum_I \int_{L^I} f^I \cdot \delta s^I\,\mathrm{d}l \qquad (10)$$
where L^I denotes dislocation line I and δs^I is the change in its position. With t^I a unit vector tangent to dislocation line I and m^I a unit vector normal to its glide plane, the local glide direction is t^I × m^I, and the component of the Peach–Koehler force in the glide direction, f^I, is

$$f^I = m_i^I\left(\hat{\sigma}_{ij} + \sum_{J \neq I} \sigma_{ij}^J\right) b_j^I \qquad (11)$$
Figure 1. Decomposition into the problem of interacting dislocations in an infinite solid, the (˜) fields, and the complementary problem for the finite body without dislocations, the (ˆ) or image fields.
Discrete dislocation plasticity
1119
Here, b Ij are the components of the Burgers vector of dislocation I . Note that the value of f I does not depend on any specification of core properties. This is because the Peach–Koehler force is calculated for a translation of the dislocation. An actual dislocation motion will, in general, involve a change in dislocation shape and thus a change in dislocation line length. The change in line length is accounted for through a constitutive rule and is referred to as the line tension. Dislocations can change glide planes (cross slip) and climb (motion off a glide plane), particularly at temperatures that are a significant fraction of the melting temperature, but attention here is confined to glide. Any effect of geometry changes is neglected in the formulation described above. Large deformations occur inside dislocation cores and these are not modeled by the linear elastic description of dislocations. However, outside dislocation cores, finite-deformation effects can come into play once significant slip has occurred. In particular, there are effects of lattice reorientation on dislocation glide and of geometry changes on the momentum balance. It is known from continuum slip crystal plasticity that lattice reorientation effects can have a significant on the overall response. Effects of geometry changes arise in two contexts: (i) overall shape change, as in the reduction of cross-sectional area in a plastically deformed tensile bar and (ii) the formation of surface slip steps and the resulting stress concentration that occurs there. Effects of overall shape changes also occur in continuum plasticity, but the possible formation of slip steps is an additional feature of discrete dislocation plasticity. A finite deformation discrete dislocation plasticity framework has been presented by Deshpande et al. [3]. Here, we will confine attention to the formulation with geometry changes neglected.
2.
Three-Dimensional Dislocation Dynamics The geometry of a dislocation is governed by a number of variables: • the slip plane, denoted with its unit normal vector m; • the dislocation line as a parameterized line on this plane and with a local tangent vector t; • the Burgers vector b. There are a few special parts of a generic loop, namely edge: b · t = 0 ; screw: b · t = ±b ,
(12) (13)
b being the length of b: b = |b|. Edge and screw dislocations are the central notions in two-dimensional studies, as discussed in a subsequent section.
1120
E. Van der Giessen and A. Needleman
The first step in discrete dislocation plasticity in three dimensions is the description of the individual dislocations. Most methods currently in use, involve discretization of each dislocation. These schemes vary from a screw– edge representation (e.g., [4]), a representation with straight segments, e.g., [5–7], to one with a spline representation [8]. The representation of dislocation loops by straight segments, as illustrated in Fig. 2, implies that each segment has in general a mixed nature, 0 ≤ |b · t| ≤ b. The advantage of this discretization within the superposition framework is that the fields of straight segments are known exactly for a linear elastic isotropic medium. The expressions for the stress fields of individual segments in infinite space are given by Hirth and Lothe [1] while the corresponding displacement fields can be found in [9]. The topology of the discretized loop illustrated in Fig. 2 is at any instant characterized by the set x A of positions of the nodes A = 1, . . . , N . Assuming glide motion only, the velocity of any node, v A , can be written as v A = v A t × m = v A s. The velocity at any point x(l) is obtained by linear interpolation between the nodal velocities v A . Assuming over-damped motion along the entire dislocation loop, the velocity v(l) can be related to the local Peach–Koehler force F(l) projected onto s via the drag relationship F(l) = Dv(l)
glide plane
m l⫽0 A1 x (0)
A⫹1 dl
t A s
x (l)
O
Figure 2. Description of a dislocation loop in its glide plane; m is the normal to the glide plane; the orientation of the loop is determined by the local tangent vector t and the Burgers b; s is defined as t × m. A loop is confined to its glide plane.
Discrete dislocation plasticity
1121
with F = F · s. Treating the discretized dislocation loop through a onedimensional finite element discretization, the dynamics of the loop can be formulated through the set of equations [6] FA =
N
K AB v B
( A = 1, . . . , N )
B=1
with K AB a “stiffness” matrix that is determined by the loop geometry and the chosen shape functions, and is linear in the drag coefficient D. When the nodal Peach–Koehler forces FA are calculated, the nodal velocities are obtained by solving this set of equations. The formulation can be extended to handle sliding nodes to treat dislocation junctions and dislocation segments leaving the crystal via a free surface. The computation of the nodal Peach–Koehler force FA requires care when it comes to the self-interaction, i.e., the contribution of the segments belonging to the same dislocation. In order to eliminate the singular contributions from the ends of the two adjacent segments, Brown’s scheme can be used, see [6, 7]. Nevertheless, high-order Gaussian integration is generally needed to obtain convergence with a loop discretization that is not excessively fine. There are various issues that require due attention in integrating the motion of a dislocation loop in time, which have to do with the continuous change of local curvature. Weygand et al. [6] have suggested (i) a two-level time stepping approach that minimizes the N 2 problem of interaction calculations and (ii) an adaptive re-discretization scheme of the dislocation. But there probably is much room to improve these numerical procedures in order to reduce the number of calculations while retaining accuracy. In particular, multipole methods [10, 11] can considerably reduce the computational time for evaluating dislocation interactions. Experience with the superposition approach to boundary-value problems in three dimensions, so far, has revealed that the numerics are more demanding than one may expect from two-dimensional applications. First of all, higherorder finite elements seem necessary; 20-node brick elements with eight-point Gaussian integration are likely to be the minimum requirement. Even then, Weygand et al. [6] found that at least one to two elements are needed between the dislocation and a free surface in order for the calculated image forces to converge. Moreover, sufficiently many integration points per surface element are needed to compute the nodal forces from the long-range traction fields T˜i . The evolution of the dislocation structure may lead to events where nodal points and part of the corresponding dislocation segments leave the material. These events need to be detected when dislocation nodes are moved and proper constraints must be applied to the resulting surface nodes. To facilitate this detection, the surface of the sample is approximated by a triangular mesh in [6]. When part of a dislocation glides out of the crystal, the dislocation cannot
1122
E. Van der Giessen and A. Needleman (b)
(a)
C
C
outside
surface
B A
A
C
sample
D E
B
A
B
d
E D
D E
Figure 3. The pseudo-mirror construction to mimic the attractive interaction: (a) node leaves the sample; (b) surface nodes are introduced and a mirror construction is created. The view shows the projection onto the glide plane.
be treated as being open but needs has to be closed through virtual segments outside the crystal. This ensures that the analytic expressions for the stress and displacement fields remain valid and that the step produced on the surface is captured through the analytic displacement field. The error by closing the loop outside the crystal is corrected by the (ˆ)-solution. The shape of the virtual dislocation part is in principle irrelevant, but care needs to be taken that the strong attractive image force on the remaining dislocation from the free surface is resolved to sufficient accuracy. A judicious choice of this shape can aid the accuracy of the calculation of the dislocation – surface interaction within the finite element context. Weygand et al. [6] have proposed a procedure where the first two outer segments (after a surface node) are put into positions which correspond to a “mirror image” of the inner last two segments before the surface node, as shown in Fig. 3. This idea is inspired by the notion of image dislocations [1] for plane surfaces and dislocation lines parallel to that surface; for the general situation of curved dislocations on glide planes that are not orthogonal to the free surface the approach is only approximate.
3.
Two-Dimensional Dislocation Dynamics
The computational complexity of discrete dislocation dynamics is substantially reduced by restricting attention to two-dimensional (2D) plane strain situations. The advantage of a 2D formulation is that complex boundary value problems can be solved with realistic dislocation densities with relatively modest computing resources. A disadvantage is that the range of phenomena that can be modeled is limited by the restricted physics of two-dimensional dislocation interactions.
Discrete dislocation plasticity
1123
Within the constraint of plane strain, the dislocations are restricted to being edge dislocations (screw dislocations are consistent with anti-plane shear deformations). For an elastically isotropic solid with shear modulus µ and Poisson’s ratio ν, the stress and displacement fields at (x1 , x2 ) for a dislocation with Burgers vector b I e1 at (X 1 , X 2 ) are:
I (x1 , x2 ) σ11
(x2 ) 3(x1 )2 + (x2 )2 µb I =− 2 2π(1 − ν) (x1 )2 + (x2 )2
I (x1 , x2 ) σ22
(x2 ) (x1 )2 − (x2 )2 µb I = 2 2π(1 − ν) (x1 )2 + (x2 )2
I (x1 , x2 ) σ12
(x1 ) (x1 )2 − (x2 )2 µb I =− 2 2π(1 − ν) (x1 )2 + (x2 )2
(14)
(15)
(16)
u 1I (x1 , x2 )
bI 1 (x1 )(x2 ) x1 = − (1 − ν) tan−1 2 2 2π(1 − ν) 2 (x1 ) + (x2 ) x2
u 2I (x1 , x2 ) =
(17)
(x2 ) b 1 2π(1 − ν) 2 (x1 )2 + (x2 )2 I
2
(x1 )2 + (x2 )2 1 − (1 − 2ν) ln 4 (b I )2
(18)
where xi = xi − X i . It can be computationally useful to take advantage of the fact that the superposition in Eq. (5) is not unique. As long as the (˜) fields incorporate the appropriate singularities, Eqs. (14)–(18) can be extended to include any convenient non-singular fields. In particular, in circumstances where there is a traction-free surface, such as a crack surface, the gradients that the numerically computed (ˆ) fields need to resolve can be reduced by using the dislocation fields for a half-space. These fields are most simply expressed in terms of a complex stress function ϕ (the dislocation index ( ) I is omitted from ϕ for clarity). With the traction-free surface being the x1 -axis, with θ the angle between the Burgers vector and the x1 -axis and the dislocation position being (x1 , h), the stress and displacement fields are given by I I − i σ˜ 12 = ϕ (z) − ϕ (¯z ) + (z − z¯ )ϕ (z), σ˜ 22 I I + i σ˜ 12 = ϕ (z) + ϕ (¯z ) + 2ϕ (z) − (z − z¯ )ϕ (z) σ˜ 11
(19) (20)
where z = x1 + i x2 and an overbar denotes the complex conjugate. The displacement components are given through 2µ(u˜ 1I + i u˜ 2I ) = (3 − 4ν)ϕ(z) + ϕ(¯z ) − (z − z¯ )ϕ (z)
(21)
1124
E. Van der Giessen and A. Needleman
with
2b I h µ i b¯I {ln [−m(ih − z)] − ln [m(ih ¯ + z)]} + ϕ(z) = 4π(1 − ν) z − ih (22) with m defined by b I = |b I |m = |b I |(cos θ + i sin θ). In addition to accounting for traction-free surfaces in the (˜) fields, it can be convenient to use analytical fields for infinite arrays of dislocations in case of periodic boundary conditions. Expressions for walls (dislocations stacked normal to the Burgers vector) and carpets (rows of dislocations parallel to the Burgers vector) in infinite space can be found in the literature as well as for carpets of dislocations in a half space. Such solutions are characterized by being periodic in one direction and decaying exponentially in the perpendicular direction. The latter eliminates the development of artificial patterning of dislocations when using individual dislocations with their 1/r decay and a finite cut-off radius for dislocation-dislocation interactions. A variety of two-dimensional analyses have been carried out so far where the magnitude of the Burgers vector is b for all dislocations and using the following set of simple constitutive rules: • Dislocation nucleation: Dislocation dipoles are nucleated by simulating Frank–Read sources. In 2D this is implemented through point sources that nucleate a dislocation dipole when the Peach–Koehler force at source site I ∗ ,
I
f =
m iI
σˆ i j +
σiJj
b j = m iI σi j b j
(23)
J
equals or exceeds bτnuc during a period of time tnuc , where b is the Burgers for each source. vector magnitude and τnuc and tnuc are parameters specified In Eq. (23) the superscript I pertains to the source while J σiJj gives the stress at the source site from the individual dislocation fields. The distance L nuc between the generated dislocations is taken to be given by L nuc =
b µ . 2π(1 − ν) τnuc
(24)
• Dislocation glide: The magnitude of the glide velocity v I of dislocation I is given by Bv I = f I − bτP with B the drag coefficient and τP the Peierls stress. * Note that the magnitude of f I in Eq. (23) is equal to b times the local resolved shear stress.
(25)
Discrete dislocation plasticity
1125
• Dislocation annihilation: Annihilation of two dislocations with opposite signed Burgers vector occurs when they come within a critical annihilation distance L e of each other. • Dislocation obstacles: Obstacles to dislocation motion are modeled as fixed points on a slip plane. Pinned dislocations can only pass the obstacles when their Peach–Koehler force exceeds a specified value bτobs . Within the framework of these constitutive rules, the sources and obstacles are specified initially and do not evolve with deformation. Two-dimensional simulations have been carried out that allow for strains of several percent and realistic dislocation densities, even in complex boundary value problems. However, the range of phenomena that can be modeled using the 2D framework is limited by the restricted physics of 2D dislocation interactions. For example, while the natural formation of dipoles at the intersection of slip planes emerges in 2D analyses, the formation of three-dimensional 3D junctions, which can be much stronger, is not accounted for. As a consequence, for example, 2D analyses of plane strain tension using the constitutive rules described above exhibit non-hardening behavior, i.e., after some initial transient plastic flow occurs at a more or less constant stress. Hardening can occur, but only when geometrically necessary dislocations are present. Recently, Benzerga et al. [12] have proposed dislocation constitutive rules for 2D analyses that model 3D dislocation mechanisms including dynamic junction formation, with some of the junctions serving as dislocation sources and some purely as obstacles. In this manner, the dislocation source density evolves with deformation, which is key for a realistic description of hardening. The physical background for these rules is given in [12]; here we just summarize the constitutive rules: • Junction formation: The formation of a junction is taken to occur when two dislocations gliding on two intersecting slip planes approach within a specified distance d ∗ from the intersection point of the slip plane traces regardless of the sign of the dislocations. The intersection point is identified with the junction location and the two dislocations forming the junction are immobile until the junction is broken. When a junction forms, there is a probability p that it acts as a potential anchoring point for a Frank–Read source and a probability (1 − p) that it acts as an obstacle. • Dynamic obstacles: Dislocations that approach the junction are kept at a distance greater than or equal to d ∗ from the junction location. A junction I is destroyed if the Peach–Koehler force acting on either dislocation I b with comprising the junction attains or exceeds the breaking force τbrk I = βbrk τbrk
µb SI
(26)
Here, S I is the distance to the nearest junction in any of the two intersecting planes, b is the magnitude of the Burgers vector of the dislocation
1126
E. Van der Giessen and A. Needleman
making up the junction and βbrk is a parameter giving the strength of the junction. • Source operation: A dislocation dipole is nucleated at source I when the I b for a time value of the Peach-Koehler force at the junction exceeds τnuc I tnuc , where I = βnuc τnuc
µb SI
(27)
with βnuc giving the source strength and S I the distance to the nearest junction on the slip plane. In evaluating S I all junctions are considered regardless of whether they are anchoring points or obstacles. The time I is given by tnuc I =γ tnuc
SI |τ I |b
(28)
where τ I is the resolved shear stress at the junction location and γ deI . pends on the drag coefficient B and on τ I /τnuc For nucleation of an isolated loop, I L nuc = κS I
(29)
where κ > 1. However, the emitted dipole is not allowed to pass through a dislocation near the source. As a consequence, the size of the emitted I < κS I . loop is S I ≤ L nuc • Line tension: The energy cost associated with loop expansion is modeled through a configurational force of magnitude L I b pointing from one dislocation in a dipole toward the other. The magnitude of L I is L I = −α
µ|b| SdI
(30)
where α is a proportionality factor and SdI is the algebraic distance between the two dislocations comprising the dipole, so that the sign of L I depends on the sign of SdI . The line tension is then included in Eq. (25) by adding L I b as a driving force to the right-hand side. • Interaction of moving dislocations with junctions: An anchoring point can be destroyed by annihilation of one of the dislocations forming the junction. On the other hand, an obstacle can be destroyed either by annihilation or by the local stress exceeding the obstacle strength. In order to analyze the consequences of these two mechanisms, two options have been considered: (i) only junction destruction can occur when a critical stress is reached so that, as a consequence, only obstacles can be destroyed and; (ii) annihilation is possible in which case both obstacles and
Discrete dislocation plasticity
1127
anchoring points can be destroyed. In option (i), when a dislocation of opposite sign comes close to an obstacle it is pinned at a distance d ∗ from the obstacle, while when a dislocation of opposite sign comes close to an anchoring point the gliding dislocation is free to oscillate around the anchoring point. Calculations using these constitutive rules also use the constitutive rules for dislocation motion, Eq. (25), and dislocation annihilation. In addition, initial static sources and obstacles can be specified. Although initial results are encouraging [12], it remains to be seen how much of 3D dislocation physics can actually be incorporated in a 2D formulation. Computing the change in the dislocation structure in each time increment involves: (i) computing the motion of existing dislocations; (ii) checking for interactions with the static obstacles and with existing dynamic junctions; (iii) checking for dislocation annihilation; (iv) determining if any dislocations have exited at a free surface; (v) determining if any dislocations pinned at static obstacles have broken away; (vi) checking for the destruction of the dynamic junctions; (vii) checking for the creation of new dynamic junctions; (viii) checking for nucleation at the static and dynamic sources. Since only edge dislocations are present in the 2D analyses and since nucleation involves the production of dipoles, the total Burgers vector does not change during the deformation history. The net Burgers vector in the body can only change when dislocations exit the body, leaving a step on the surface. Since edge dislocations correspond to addition or subtraction of a half-plane of atoms, conservation of total Burgers vector reflects conservation of mass. It is worth mentioning that the constitutive relations used for dislocation nucleation pertain to nucleation from Frank–Read sources where the main issue is mainly one of propagating a loop to its stable size. Criteria for other nucleation processes, for example from surface steps or grain boundaries (which can also act as dislocation sinks), remain to be developed. Dislocation dynamics is chaotic [13]. It seems that the chaotic behavior has relatively little effect on the predicted stress-strain response under monotonic loading, where the variations in dislocation position tend to average out, but possibly more effect on fracture predictions, where local values of stress and deformation can matter. However, the implications of this chaotic behavior remain to be fully explored.
4.
Example
Experiments have shown that stress evolution in films with a thickness on the order of micrometers is size dependent. This effect cannot be resolved by classical continuum theories since they lack a material length scale. The method presented above is illustrated by considering a 2D plane strain model
1128
E. Van der Giessen and A. Needleman
of a thin film bonded to an elastic substrate, as analyzed by [14]. The film of thickness h is considered to be a single crystal and perfectly bonded to a halfinfinite substrate, see Fig. 4. The single crystal contains three slip systems with slip plane orientation: φ (1) = 0◦ ; φ (2) = 60◦ ; φ (3) = 120◦ , which resembles an fcc crystal with the (110) plane coinciding with the x1 -x2 plane of deformation. The elastic properties of the film are assumed to be isotropic and the same as those of the substrate. Stress is caused by the mismatch in the coefficients of thermal expansion and arises from cooling from the stress-free state. This is taken into account by subtracting the thermal stress 3EαT /(1 − 2ν) due to a temperature difference T from the left-hand side of (8), where E =2(1+ ν)µ is Young’s modulus and α is the difference of the coefficient of linear thermal expansion in film, α f , and of that in the substrate, αs . Note that the thermal part of the problem is taken care of through the (ˆ) fields. The film is infinitely long in the x 1 direction but is treated as being periodic with cell width w. The (˜) fields are constructed from the periodic fields of a dislocation and all its replicas at mutual distance w. The traction-free condition of the film surface x2 = h is accounted for by the (ˆ) fields. The interface between film and substrate is treated here as being impenetrable by dislocations (by putting very strong obstacles at the ends of the slip planes). Simulations start from a stress-free and dislocation-free configuration. The film contains a random distribution of 60 sources/µm2 . The nucleation strength τnuc of each source is randomly taken out of a Gaussian distribution with average τnuc = 25 MPa and standard deviation τnuc = 5 MPa. A dislocation dipole is generated from the source when the resolved shear stress at the source exceeds the nucleation strength for a given time tnuc = 10 ns. There are no obstacles, and neither junction formation nor line tension is accounted for. x2
w
αf
h
φ
x1 αs π
π ∞
Figure 4. Geometry of the film-substrate problem. A unit cell of width w is analyzed and the height of the substrate is taken large enough to represent a half space.
Discrete dislocation plasticity
1129
Figure 5 shows how the dislocation distribution evolves from the initially dislocation- and stress-free state during cooling in a film with h = 0.5 µm from T = 600 K. After roughly 25 K, the first dislocation dipoles are generated inside the hitherto uniform elastic stress field. One dislocation moves toward the impenetrable interface where it gets stopped, while the other exits the film at the free surface. As cooling proceeds, more and more dislocations are generated and pile up against the interface. This causes the formation of a boundary layer of relatively high stress just above the interface. The thickness of the boundary layer turns out to be more or less independent of film thickness. This gives rise to a size effect: thinner films are harder, as shown in Fig. 6. The stress-temperature curves are serrated as a consequence of the discrete nucleation events. The straight-line fits demonstrate that hardening is approximately linear with the constitutive rules adopted in this simulation. The kink in the stress-temperature curve after ∼70 K for the h = 0.25 µm film is caused
(a)
(b)
(c) 0.4 0.2 0 ⫺0.2 ⫺0.4 0
0.5
1
1.5
2
Figure 5. Evolution of the dislocation distribution inside the film during cooling by: (a) 100 K; (b) 150 K; (c) 200 K. In (c) the distribution of the stress σ11 parallel to the film is superimposed, also showing the top 0.5 µm of the substrate.
1130
E. Van der Giessen and A. Needleman 150
h=0.25µm
<σ11>f
100
h=0.5µm 50
h=1µm
0 600
550
500
450
400
T[K] Figure 6.
Average stress in the film, σ11 f , versus temperature for three film thicknesses.
by the limited availability of sources in such thin films [14]. Quite generally, at small size scales limited source availability can significantly affect the evolution of plastic deformation.
References [1] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, 1982. [2] F.R.N. Nabarro, Theory of Crystal Dislocations, Oxford Univ., Press, Oxford, 1967. [3] V.S. Deshpande, A. Needleman, and E. Van der Giessen, “Finite strain discrete dislocation plasticity,” J. Mech. Phys. Solids, 51, 2057–2083, 2003. [4] L.P. Kubin, G. Canova, M. Condat, B. Devincre, V. Pontikis, and Y. Br´echet, “Dislocation microstructures and plastic flow: a 3D simulation,” Solid State Phenomena, 23-24, 455–472, 1992. [5] H.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40, 113–127, 1998. [6] D. Weygand, L.H. Friedman, E. Van der Giessen, and A. Needleman, “Aspects of boundary-value problem solutions with three-dimensional dislocation dynamics,” Model. Simul. Mat. Sci. Engrg., 10, 437–468, 2002.
Discrete dislocation plasticity
1131
[7] K.W. Schwarz, “Simulation of dislocations on the mesoscopic scale. I. Methods and examples,” J. Appl. Phys., 85, 108–119, 1999. [8] N.M. Ghoniem and L.Z. Sun, “Fast-sum method for the elastic field of threedimensional dislocation ensembles,” Phys. Rev. B, 60, 128–140, 1999. [9] D.M. Barnett, “The displacement field of a triangular dislocation loop,” Phil. Mag. A, 51, 383–387, 1985. [10] G.J. Rodin, “Towards rapid evaluation of the elastic interactions among threedimensional dislocations,” Phil. Mag. Lett., 77, 187–190, 1998. [11] R. LeSar and J.M. Rickman, “Multipole expansion of dislocation interactions: application to discrete dislocations,” Phys. Rev. B, 65, 144110, 2002. [12] A.A. Benzerga, Y. Br´echet, A. Needleman, and E. Van der Giessen, “Incorporating three-dimensional mechanisms intodislocation dynamics,” Modelling Simul. Mater. Sci. Eng., 12, 159–196, 2004. [13] V.S. Deshpande, A. Needleman, and E. Van der Giessen, “Dislocation dynamics is chaotic,” Scripta Mat., 45, 1047–1053, 2001. [14] L. Nicola, E. Van der Giessen, and A. Needleman, “Discrete dislocation analysis of size effects in thin films,” J. Appl. Phys., 93, 5920–5928, 2003.
3.5 CRYSTAL PLASTICITY M.F. Horstemeyer1, G.P. Potirniche1 , and E.B. Marin2 1 Mississippi State University, Mississippi State, MS, USA 2
Sandia National Laboratories, Livermore, CA, USA
Besides Dislocation Dynamics, crystal plasticity can be considered a mesoscale formulation, since the details of the equations start at the scale of the crystal or grain. In this section, the topics of classical crystal plasticity formulations, kinematics, kinetics, and the polycrystalline average methods will be discussed. Continuum slip polycrystal plasticity models have become quite popular in recent years as a tool to study deformation and texture behavior of metals during processing [1] and shear localization [2, 3]. The basic elements of the theory comprises (i) kinetics related to slip system hardening laws to reflect intragranular work hardening, including self and latent hardening components [4], (ii) kinematics in which the concept of the plastic spin plays an important role, and (iii) intergranular constraint laws to govern interactions among crystals or grains. The theory is commonly acknowledged for providing realistic prediction/correlation of texture development and stress-strain behavior at large strains as it joins continuum theory with discretized crystal activity. Different authors have developed or recommended various forms of the basic elements of polycrystal plasticity theory that address specific applications. Some have developed formulations at what is called the intermediate stress configuration [5, 6]. Others have focused on current configuration formulations [3, 7]. The texture and stress-strain responses are essentially the same. Most have ignored elasticity effects [7], while others include it in their formulations [3]. Again, the results are the same. Where differences arise lie within the assumptions related to the kinetics of slip. Inelastic deformation has historically been attributed to dislocation glide on slip planes, otherwise known as crystallographic slip. Taylor and Elam [8] were the first to determine the relationship between the orientation of one crystallographic slip axis and the tensile test axis, which were not necessarily coincident. They conjectured that perhaps two slip systems were involved. 1133 S. Yip (ed.), Handbook of Materials Modeling, 1133–1149. c 2005 Springer. Printed in the Netherlands.
1134
M.F. Horstemeyer et al.
Schmid [9] determined that the magnitude of crystallographic slip on the glide planes was related to the resolved shear stress. The next major historical work related to that of Taylor [10], who founded the “principle of minimum shears.” This principle disregarded elastic strains and assumed that only five independent slip systems were necessary to describe three dimensional polycrystalline behavior. Using Taylor’s assumption, Bishop and Hill [11] determined the three dimensional stress state resulting from all the slip possibilities in a facecentered cubic (FCC) lattice, which has twelve slip systems (three possible [110] slip directions on four {111} planes). Books by Havner [12] and Kocks et al. [4] provide a nice review of the history and the pertinent issues related to the kinetics, kinematics, and intergranular constraints of crystal plasticity. Now we turn towards examining the kinematics of crystal plasticity. The deformation gradient is often assumed to be a multiplicative decomposition of elastic and plastic parts after Lee and Liu [13], F = FeF p
(1)
so correspondingly the velocity gradient is given by
e p e p L = F˙ F −1 = F˙ F p + F e F˙ F p−1 F e−1 = F˙ F e−1 + F e F˙ F p−1 F e−1
(2) where L e = F˙ F e−1 and L p = F˙ F p−1 . Now the plastic velocity gradient corresponding to crystallographic slip is given by e
Lp =
2
γ˙i s 0i ⊗ m 0i
p
(3)
i=1
where γ˙i is the plastic slip rate on ith slip system, and s 0i and m 0i are the slip direction vector and unit normal vector to the slip plane, respectively. Because s 0i and m 0i are fixed in space according to the classical assumption of Taylor (material flows through the lattice), the so-called intermediate configuration is specified; hence, material plastically flows from the reference to the intermediate configuration. After plastic deformation, the lattice deforms and rotates with F e , which is defined by the polar decomposition F e = R eU e
(4)
where R e is the proper orthogonal rotation tensor, and U e is the right elastic stretch tensor. In general, R e comprises the rotation from both elastic deformation and rigid body rotation. As a result, the velocity gradient in the current configuration is given by Lˆ p =
2 i=1
γ˙i s i ⊗ m i
(5)
Crystal plasticity
1135
giving the velocity gradient as T T L = L e + R e U e R e Lˆ p R e U e−1 R e
(6)
eT
since R = R e−1 for proper orthogonal R e . Infinitesimal elastic strains are typically assumed; hence, the right elastic stretch is given by ∼ I +Y Ue =
(7)
in which higher order terms are generally neglected as well, where I is the identity tensor, and Y is the infinitesimal perturbation of the elastic stretch. The inverse of the right elastic stretch is given by
∼ I − Y. U e−1 = I − Y + O Y 2 =
(8)
Substituting (7) and (8) into (6), we get T T L = L e + Lˆ p + R e YR e Lˆ p + Lˆ p R e YR e .
(9)
The Green-elastic strain with respect to the intermediate configuration is given by E=
1 2
U e2 − I ,
(10)
so the second rank Cauchy stress tensor, σ , in the current configuration can be related to the intermediate configuration stress, σˆ , according to T
R e σˆ (E) R e = σ F e .
(11)
As a consequence, the elastic rotation due to elastic deformation may be neglected, and R e essentially represents a rigid rotation. The general constitutive form can be determined at the intermediate (stress free) configuration through a hyperelastic law as
ˆ σˆ Eˆ = C E,
(12)
where the elastic stiffness tensor, C is invariant for a given crystal in the intermediate configuration. The intermediate configuration is aligned with the crystalline axes. σˆ is the second Piola–Kirchhoff stress in the intermediate configuration, and Eˆ is the conjugate Green elastic strain. For cubic orthotropy, the single crystal elastic moduli are formed on axes of cubic symmetry (100, 010, and 001 axes). By defining C1 = C1111 = C2222 = C3333 C2 = C1122 = C2233 = C1133 C3 = C1212 = C1313 = C2323
(13)
1136
M.F. Horstemeyer et al.
the components are formed on the Cartesian axes coincident with (100, 010, and 001 axes) with all other Ci j kl equal to zero. The stress in the current configuration is related to the second Piola–Kirchhoff stress by T
σ = 1J F e σˆ F e .
(14)
Now the Zener anisotropy factor as related to the crystal axis (not the specimen axis) is given by Z=
2C3 . C1 − C2
(15)
When Z = 1, the elastic properties are isotropic; however, for copper Z > 3 for example. In finite inelastic deformation, grains rotate and tend to align themselves toward a texture pole. Now we will incorporate the kinematic equations into the constitutive equations. By virtue of Eq. (7), the Green-elastic strain can be written
E = Y + O Y2 ,
(16)
and the inverted elastic stiffness matrix can be defined as B ∗ = C ∗−1 .
(17)
By combining (11), (16), and (17), we may write B ∗ • σ = R e YR et
(18)
to be used later. The velocity gradient can be decomposed into its symmetric and antisymmetric parts as L = D + W.
(19)
By using (7), (10), (18), and (19) the symmetric and anti-symmetric parts of the velocity gradient in the current configuration can be identified as
D = D e + Dˆ p + B ∗ • σ Wˆ p − Wˆ p B ∗ • σ ,
W = W e + Wˆ p + B ∗ • σ Dˆ p − Dˆ p B ∗ • σ
(20)
(21)
when neglecting the higher order terms. Here W e = R˙ R e . We can gain insight into the interpretation of the current configuration quantities Dˆ p and Wˆ p by rearranging Eqs. (20) and (21) as e
Dˆ p = D − D e − B ∗ • σ Wˆ p + Wˆ p B ∗ • σ and
T
(22)
Wˆ p = W − W e − B ∗ • σ Dˆ p + Dˆ p B ∗ • σ .
(23)
Crystal plasticity
1137
In many macroscale plasticity formulations, the plastic rate of deformation and plastic spin are prescribed. What distinguishes macroscale internal state variable theory from this crystal plasticity formulation is that these quantities fall out naturally within the formulation. It is instructive to observe the rate forms of the crystal plasticity equations. The material time derivative of the Cauchy stress in the current configuration is given by differentiating Eq. (14) as σ˙ = R˙ σˆ (E) R e + R e σ˙ˆ (E) R e + R e σˆ (E) R˙ . e
T
eT
T
(24)
From the co-rotational stress rate in the current configuration, σ˙ is given by σ˙ = C • E˙
(25)
∼ Y˙ neglecting higher order terms, and the elastic part of the where E˙ = U e U˙ = velocity gradient is given by e
eT
˙ , D e = R e YR
T e e = R˙ R e .
(26)
By combining (24)–(26), the Cauchy stress rate becomes σ˙ = e σ − σe + C ∗ • D e .
(27)
The stress rate that co-rotates with the crystal lattice, which spins with W e , is a Jaumann-type form given by o
σ = C ∗ • D e = σ˙ − e σ + σe ,
(28)
where e = W e .
(29)
Combining (24)–(29), the stress rate becomes
˙ σ − σ W e + C ∗ • D − Dˆ p σ˙ = W e
+ C ∗ • Wˆ p B ∗ • σ − B ∗ • σ Wˆ p ,
(30)
since D e = D − D p + spin terms. The next important aspect of crystal plasticity is to include kinetics relations to the aforementioned kinematics and constitutive relations. A common viscoplastic employed by Hutchinson [14] for isotropic hardening and modified by Horstemeyer et al. [15] with kinematic hardening is given by the following, τi − αi M , g
γ˙i = γ˙o sgn (τi − αi )
i
(31)
1138
M.F. Horstemeyer et al.
where the plastic slip rate on the ith slip system, γ˙i , is a function of a fixed reference strain rate, γ˙0 , the reference shear strength, gi , the resolved shear stress on the slip system, τi , the rate sensitivity exponent for the material, M, and an internal state variable representing kinematic hardening effects resulting from backstress at the slip system level, αi . The isotropic hardening evolution law for the internal hardening state variable, gi , on ith slip system is given by g˙ i =
12
h i j γ˙ j
(32)
i, j =1
where h ij are the hardening (or plastic) moduli. The self-hardening components arise when i = j and the latent hardening components arise when i =/ j . The increase or decrease of flow stress on a secondary slip system due to crystallographic slip on an active slip system is referred to as latent hardening. Taylor and Elam [8], based on experimental evidence on aluminum crystals, observed that when latent hardening equals self hardening, an isotropic response exists. Kocks et al. [4] reviewed the behavior of several materials under different loading conditions and surmised that an intersecting slip system induces higher stresses in the well-developed flow stress regime. The latent hardening ratio, which is the ratio of hardening on the secondary system compared to the primary system, ranges from 1.0 to 1.4 for the form used by Hutchinson [14] and Peirce et al. [2], sometimes called the PAN rule, where 1.0 corresponds to Taylor hardening. However, texture and conventional latent hardening effects cannot account for all sources of anisotropy, in general. In essence, latent hardening models have focused on dislocation-dislocation interactions, but in reality latent hardening arises from dislocation-substructure interactions as well. In the latter case, an evolving latent hardening ratio would be necessary. Although potentially important, an evolving latent hardening ratio has yet to be established. A simple form of the hardening moduli [3] employing the PAN rule is given by
h i j = F (γ ) δi j + lhr 1 − δi j ,
(33)
where ) is a function of the cumulative shear on all slip systems, γ = F(γ γ j dt, and lhr is the latent hardening ratio. j Other latent hardening forms have been proposed and might be fruitful to consider in such parameter studies; Equation (21) cannot distinguish between acute and obtuse cross-slips in reversed quasi-static loading conditions. Havner [12] employed a two-parameter rule to examine latent hardening effects, showing that the contribution of incremental slip from self hardening equals that of the latent system. Other issues regarding latent hardening include differences that have been observed from one latent system to the next. In fcc Cu and Al single crystals, slip systems in which dislocations can form
Crystal plasticity
1139
sessile junctions appear to exhibit primary latent hardening. Secondary latent hardening is associated with systems for which dislocations form glissile junctions or Hirth locks with those of the active slip systems. Also not considered is the influence of the stacking fault energy; the lower the stacking fault energy, the higher the latent hardening. Models to date only empirically fit constants to the latent hardening equation and physical motivation is often lacking. Finally, although the latent hardening ratio seems to be independent of temperature, alloy type, and strain rate [4], it does change during deformation, saturating at a strain on the order of unity. The slip system hardening coefficient, F(γ ), has been emphasized by different researchers attempting to model various aspects of dislocation interaction. One example is the Rashid and Nemat-Nasser [3] hardening rule given by F(γ ) =
h 0,
0 ≤ γ ≤ γ0
h0 , γ0 ≤ γ 1 + (γ − γ0 )
,
(34)
where h 0 , , and γ0 are material constants. Another example is a modified hardening-recovery equation [15] that was also used in this study is given by F (γ ) = h 0 − Rg(γ ),
(35)
where R is a material constant. Other forms can be appropriated here but the motivation should be based upon hardening and recovery reflecting dislocation initiation, motion, and interaction. Kinematic hardening at the grain level is used to model dislocation substructure contribution to the directional dislocation resistance. Kinematic hardening at the level of the slip system has been rather widely employed to describe strengthening due to heterogeneous dislocation substructure and attendant Bauschinger effects. This substructural internal variable evolution equation evolves at the level of the grain as given by α˙ i = Crate (Csat γ˙i − αi γ˙i ),
(36)
where Crate controls the rate of evolution, and Csat is the saturation level of the backstress and were chosen to fit the experimental data. The substructural hardening internal state variable reflects dislocation interactions within the grain and follows the internal state variable constraint that the rate must be governed by a differential equation in which the plastic rate of deformation appears. It is well-known that a certain degree of kinematic hardening (Bauschinger effect) is introduced by virtue of the orientation dependence of grains and compatibility requirements among them in crystal plasticity theory. However, this is a highly transient effect that occurs over small cumulative plastic strain following a strain reversal. More persistent Bauschinger
1140
M.F. Horstemeyer et al.
effects arise from prescription of kinematic hardening at the scale of individual grains (slip systems), affecting slip system flow rules. Reversed loading experiments on single crystals of both precipitate-strengthened and pure metals exhibit kinematic hardening due to heterogeneous inelastic flow. Precipitates offer a clear source of the behavior in the former. Dislocation substructures induce these effects in the latter. In the latter case, the backstress is induced by the collective effects of interactions with dislocation structures at higher scales. The final topic of discussion pertinent to crystal plasticity is the averaging of the polycrystal from the single crystal starting point. One can think of the bridging of the mesoscale and macroscales is governed by the intergranular constraint formulation, which injects anisotropy through another bridge of length scales besides the micro-meso link illustrated in Eq. (36). A crystalto-aggregate averaging theorem that kinematically constrains all of the crystals in the same manner is based on the work of Taylor [10]. Another limit is to assume the same remote stress applied to each crystal [16]. A third form of polycrystalline constraint used in a crystal plasticity context is what is called relaxed constraints method. Various forms of this exist. Essentially, they start with the remote strain applied to all the crystals according to the Taylor constraint and then relax towards the Sach’s constraint. Terms such as self-consistent, relaxed constraints, and modified constraints have been used to describe this type of constraint. The idea is that the single crystal which is assumed to be an inclusion embedded in a matrix that possesses the aggregate properties of effective stress-strain behavior. One example using the elastic modulus to represent the aggregate in which each crystal’s strain tensor is perturbed from the polycrystal average according to σij − σijave =
2µ(1 − (2(4 − ν)/15(1 − ν))) ave ε − ε , ij eff ij (1 + 3µ(ε p /σ eff ))
(37)
where the volume averaged stress and strains over all the grains are given by σijave =
N 1 (σij )k , N k=1
N 1 (εij )k . N k=1
εijave =
(38)
and (σij )k and (εij )k are the stress and strain on the kth grain. Here, N is the number of grains, µ is the polycrystalline shear modulus, ν is the polycrystalline elastic Poisson’s ratio, and σ eff =
1 2
ave ave σ11 − σ22
2
2
ave ave + σ33 − σ11 2
2
ave ave ave + 6 σ12 + σ23 + σ13
2
ave ave + σ22 − σ33
1/2
2
(39)
Crystal plasticity
1141
The effective plastic strain is given by ε
p eff
=
p ave ε11
−
2
2 9
p ave
+ 43 ε12
p ave 2 ε22 2
p ave
+ ε23
+
p ave ε33 2
p ave
+ ε13
−
p ave 2 ε11
+
p ave ε22
−
p ave 2 ε33
1/2
(40)
Models with relaxed constraints have been used to provide understanding for length scale issues such as grain shape changes in predicting a more accurate texture response than that obtained by using the Taylor “full” constraint. As grains become flat or elongated as deformation proceeds, the average number of operative slip systems decreases. For example, in rolling as the grain shape changes from equiaxed to elongated, the anisotropy for the cube {100}001, Goss {100}011, and brass {110}112 textures is not induced, but the copper {112}111 and S {123}634 textures will be affected. These five main texture components are characterized by recrystallization (cube and Goss components) and by rolling (brass, copper, Goss, and S). Something generally not considered in modeling that would affect all five texture components is the contribution of the substructural geometric necessary boundary (GNB) evolution to the textural evolution. To introduce the substructural GNB effect on the grain shape change and slip system activity, the microheterogeneity internal state variable from Eq. (36) arising from the noncrystallographic microheterogeneity evolution is admitted to modify Eq. (31). The proposed deformation-induced anisotropy internal state variable intergranular constraint relation is given by µ ave ε − ε , (41) σij − σijave = ij C1 αˆ eff ij where αˆ
eff
=
1 2
ave ave αˆ 11 − αˆ 22
2
2 2
ave ave + αˆ 33 − αˆ 11 2
ave ave ave + 6 αˆ 12 + αˆ 23 + αˆ 13
2
ave ave + αˆ 22 − αˆ 33
2
1/2
(42)
and αˆ ijave = N1 3i, j αˆ ij . In Eq. (42), C1 is a constant that governs the intergranular constraint effect. The value of C1 will vary depending on the crystal lattice type (FCC or BCC) and number of material phases. The mathematical form for (µ/C1 αˆ eff ) decreases exponentially as deformation proceeds, analogous to the decay of the mean free path between dislocation substructures in the mesh length theory. In essence, it is a length scale parameter introduced from the lower scale within the grain that affects the intergranuler constraint of the polycrystal. Equation (41) seeks to express intergranular constraint in terms of the evolving magnitude of the grain level microheterogeneity internal state
1142
M.F. Horstemeyer et al.
variable, which responds in a transient manner to any abrupt change of loading path, reflecting in some manner the formation of dislocation substructures and grain subdivision processes. It is noted that αˆ is a long range transient, in general. The justification for the use of αˆ in the intergranular constraint relation is evident when one considers the role of geometric necessary boundaries and grain subdivision in accommodating deformation. Since geometrically necessary dislocations are generated predominately for the purpose of strain accommodation between adjacent grains, the formation of geometric necessary boundaries serves as an intragranular source of relieving intergranular constraint stresses. The fact that αˆ also enters into the flow rule in Eq. (41) reflects the influences of these intragranular structures on deformation-induced anisotropic strengthening. Hence, both intergranular hardening and intragranular constraint aspects of geometric necessary boundary formation are addressed. Now that we have discussed the theoretical aspects of crystal plasticity in terms of the, kinematics, kinetics, and the polycrystalline average methods, we now turn towards the implementation of the model into a numerical setting.
1.
Crystal Plasticity Implementation
The numerical implementation of the above-described theory can differ depending on the rate dependency of plasticity theory considered. While rate dependent crystal plasticity considers plastic deformation occurring simultaneously on all slip systems according to flow rules such as illustrated in Eq. (31), rate independent numerical schemes are confronted with a few problems concerning the activity of plastic slip on crystallographic slip systems in order to accommodate the required remote deformation. In this implementation, the plastic slip (deformation) is assumed to occur on the slip systems of the crystalline lattice. Each slip system can be fully defined by the set of vectors s 0i and m 0i . The mutually perpendicular vectors s 0i and m 0i have values according to the type of crystalline lattice (cubic, hexagonal, etc.) and its orientation with respect to the axes of the system of coordinates. For example, a face-centered cubic (FCC) lattice has twelve slip systems. When the crystallographic directions (1 0 0), (0 1 0) and (0 0 1) are aligned with the global axes of coordinates x, y and z, respectively, the twelve slip systems are defined by set of vectors s 0i and m 0i , defined in Table 1. If the lattice if rotated with respect to the global system of coordinates, then the components of the vectors s 0i and m 0i but be rotated accordingly with the rotation matrix R : si = R · si0
m i = R · m 0i
(43)
Crystal plasticity
1143 Table 1. Summary of slip and normal direction vectors for FCC metal s 0i
ith slip system 1 2 3 4 5 6 7 8 9 10 11 12
[1 [–1 [0 [1 [–1 [0 [–1 [0 [1 [–1 [1 [0
–1 0 1 0 –1 1 0 –1 1 1 0 –1
m 0i 0] 1] –1] 1] 0] –1] 1] –1] 0] 0] 1] –1]
(1 (1 (1 (–1 (–1 (–1 (1 (1 (1 (–1 (–1 (–1
1 1 1 1 1 1 –1 –1 –1 –1 –1 –1
1) 1) 1) 1) 1) 1) 1) 1) 1) 1) 1) 1)
Fourth order elasticity tensor C 0 defining the elastic response of the crystalline lattice under stress should also be transformed using the same rotation matrix: C = R · R · C 0 · RT · RT
(44)
There are many representations of the crystal orientation. One of the most common is Roe convention that uses three Euler angles ψ, φ, and θ. The rotation matrix in Roe convention is written as [4]:
cos ψ cos θ cos φ − sin ψ sin φ R = sin ψ cos θ cos φ + cos ψ sin φ − sin θ cos φ
1.1.
− cos ψ cos θ sin φ − sin ψ cos φ − sin ψ cos θ sin φ + cos ψ cos φ sin θ sin φ
cos ψ sin φ
sin ψ sin φ
(45)
cos θ
Rate Independent Numerical Integration Algorithm
Rate independent integration algorithms of crystal plasticity constitutive equations must deal with a few issues that are avoided in rate dependent plasticity. In rate independent crystal plasticity, several decisions must be made during incremental loading: which slip systems are active, what are the plastic slip increments in order to produce the accommodate remote deformation, and how are the selection of the set of active slip systems determined [5]. Numerous integration schemes of rate independent crystal plasticity have been put forth [5, 6, 10, 17, 18].
1144
M.F. Horstemeyer et al.
In the following paragraphs we will briefly introduce a classical rate independent approach of Anand and Kothary [5]. Consider a load step from time t to t + t. Assume all the variables known at time t, that is Cauchy stress σ t , pl plastic deformation gradient F t , total deformation gradient F t , slip systems orientations s ti and m ti . Also, the total deformation gradient at time t + t, F t + t is known. 1. Calculate the elastic deformation gradient from the total and the plastic deformation gradient, assuming the step is fully elastic:
F et+ t = F t+ t · F tp
−1
(46)
2. Using the elastic deformation gradient, update the normal and tangential vectors defining the slip systems: t+ t
si
= Fte+ t · s ti
t+ t
mi
= m ti · Fte+ t
−1
(47)
3. Calculate elastic Green strain tensor from the elastic deformation gradient: E et+ t =
1 2
F et+ t
T
· F et+ t − I
(48) T
where, I represents the identity matrix and A is the transpose of any matrix A. 4. Compute the trial Cauchy stress by tensorial multiplication of the elastic stiffness tensor and the trial elastic Green tensor: σ t + t = C : E et+ t
(49)
5. Compute the resolved shear stress on each slip system from the trial stress τit + t = σ t + t : P ti + t
(50)
where P ti + t is the Schmid tensor defined as: P ti + t =
1 2
s ti + t · m ti + t + m i
t+ t
· s ti + t
(51)
7. Check yielding criterion. If τit + t < git for all i = 1, n, then the load step [t, t + t] is fully elastic, and exit. Otherwise, continue with the next step. 8. Define the set of potentially active slip systems, as those systems for which the yield functions are greater or equal to zero t + t − git ≥ 0 τi
Apot = i
(52)
9. For the systems considered potentially active, calculate the plastic slip increments from the consistency condition at time t + t
j ∈ Apot
γ j P ti + t : C e : P tj+ t + h i j = τit + t − git − γit
(53)
Crystal plasticity
1145
In the above system of equations, the unknown quantities are the slip increments for the potentially active slip systems γi and γit represent the accumulated plastic slip increments at time t. By solving the above system of equations, one obtains the slip increments for the potentially active slip systems. For all other slip systems, the increments in plastic slip are zero. Depending on the functions h i j , the above system of equations can be solved directly or using a classical Newton-Raphson procedure. 10. The second decision to be made is about the choice of the set of active slip systems. If some of the plastic slip increments γi found previously are negative ( γi < 0 for some α ∈ Apot), then these slip systems are inactive. Consequently, these slip systems are dropped from the set Apot and return to Step 9. 11. After finding the slip increments for the slip systems active at Step 5, the reference shear strengths are recalculated for each slip system, and the inactive slip systems are monitored by calculating their respective yield functions and performing Step 7 again. If nonzero yield functions are found for some of the inactive slip systems, those systems are included in the set of potentially active slip systems Apot and the slip increments are recalculated again going back to Step 9. 12. Update state variables. If this numerical algorithm is implemented into an implicit finite element code, then an elasto-plastic stiffness matrix must be computed and passed to the code. For examples of how to compute a consistent stiffness matrix, see Kalidindi [19], Miehe and Schroder [20].
1.2.
Rate Dependent Numerical Integration Algorithm
Rate dependent integration algorithms for crystal plasticity constitutive equations avoid the complications related to a selection of active set of slip systems, by calculating plastic slip increments on all slip systems according, most commonly, to a power law flow rule. In the following paragraphs we will present a fully implicit integration algorithms proposed by Cuitino and Ortiz [6] for rate dependent crystal plasticity. As in the case of rate independent plasticity, we consider a substep from time t to t + t. At time t, all the variables are known, such as the plastic slip accumulated, as well as the isotropic and kinematic hardening on the slip systems are known. The incremental procedure assumes the deformation gradients at the beginning and the end of the time steps are also known, F(t) and F(t + t), respectively. An implicit
1146
M.F. Horstemeyer et al.
An implicit integration scheme for the differential equations representing the constitutive response of the lattice is briefly described in the following steps, characteristic of finite element implementations [21]. An implicit integration algorithm is required to avoid numerical instabilities due to the power law flow rule.
1. In performing an implicit integration algorithm, the plastic slip increments are updated based on the values of the parameters from the flow rule at the end of the time step $t+\Delta t$:

$$\dot{\gamma}_i = \dot{\gamma}_0 \, \mathrm{sgn}\!\left( \tau_i^{t+\Delta t} - \alpha_i^{t+\Delta t} \right) \left| \frac{\tau_i^{t+\Delta t} - \alpha_i^{t+\Delta t}}{g_i^{t+\Delta t}} \right|^{M} \qquad (54)$$
The problem that arises here is that none of the quantities in the above equation are known at time $t+\Delta t$, so they must be calculated using an implicit numerical algorithm. Based on these unknown plastic shear rates, all the quantities in the above equation can be calculated. The variation of the reference shear resistance $g_i$ and the back stress $\alpha_i$ with respect to the plastic shear rates $\dot{\gamma}_i$ was explained in the previous section. The resolved shear stress is also a function of the $\dot{\gamma}_i$, as can be observed from the following steps.
2. The plastic deformation gradient increment can be calculated by solving the differential system of equations

$$\dot{F}^p = \bar{L} \cdot F^p \qquad (55)$$

which has the solution

$$F^p_{t+\Delta t} = \exp\left( \bar{L} \, \Delta t \right) \cdot F^p_t \qquad (56)$$
where $\bar{L}$ is the velocity gradient in the intermediate configuration, a function of the $\dot{\gamma}_i$.
3. Based on the computed plastic deformation gradient, the elastic deformation gradient is

$$F^e_{t+\Delta t} = F_{t+\Delta t} \cdot \left( F^p_{t+\Delta t} \right)^{-1} \qquad (57)$$
4. Using the elastic deformation gradient computed previously, update the normal and tangential vectors defining the slip systems:

$$s^{t+\Delta t} = F^e \cdot s^t, \qquad m^{t+\Delta t} = m^t \cdot \left( F^e \right)^{-1} \qquad (58)$$
5. Calculate the elastic Green strain tensor from the elastic deformation gradient:

$$E^e_{t+\Delta t} = \frac{1}{2}\left[ \left( F^e_{t+\Delta t} \right)^T \cdot F^e_{t+\Delta t} - I \right] \qquad (59)$$

where $I$ represents the identity matrix and $A^T$ denotes the transpose of any matrix $A$.
6. Compute the Cauchy stress from the second Piola–Kirchhoff stress:

$$\sigma^{t+\Delta t} = \frac{1}{\det F^e_{t+\Delta t}} \; F^e_{t+\Delta t} \cdot \left[ C^e : E^e_{t+\Delta t} \right] \cdot \left( F^e_{t+\Delta t} \right)^T \qquad (60)$$
7. Using the Cauchy stress and the direction cosines of the slip systems, compute the resolved shear stress on each slip system:

$$\tau_i^{t+\Delta t} = \sigma^{t+\Delta t} : \frac{1}{2}\left( s_i^{t+\Delta t} \otimes m_i^{t+\Delta t} + m_i^{t+\Delta t} \otimes s_i^{t+\Delta t} \right) \qquad (61)$$
8. To calculate the plastic shear rates, the above equation is inverted and a Newton–Raphson procedure is applied to calculate the $\dot{\gamma}_i$:

$$f_i = \tau_i^{t+\Delta t} - \alpha_i^{t+\Delta t} - g_i^{t+\Delta t} \left( \frac{\left| \dot{\gamma}_i^{t+\Delta t} \right|}{\dot{\gamma}_0} \right)^{1/M} \mathrm{sgn}\left( \dot{\gamma}_i^{t+\Delta t} \right) = 0 \qquad (62)$$
Applying the Newton–Raphson method to solve the system of equations,

$$-\sum_{j=1}^{N} \frac{\partial f_i}{\partial \dot{\gamma}_j} \, \Delta\dot{\gamma}_j = f_i \qquad (63)$$
The above linear system of equations is solved for the unknowns $\Delta\dot{\gamma}_i$. The matrix $\left( \partial f_i / \partial\dot{\gamma}_j \right)$ is the Jacobian of the linear system, and it can be calculated by differentiating Eq. (62), which defines the functions $f_i$:
$$\frac{\partial f_i}{\partial \dot{\gamma}_j} = \frac{\partial \tau_i^{t+\Delta t}}{\partial \dot{\gamma}_j} - \frac{\partial \alpha_i^{t+\Delta t}}{\partial \dot{\gamma}_j} - \frac{\partial g_i^{t+\Delta t}}{\partial \dot{\gamma}_j}\left( \frac{\left| \dot{\gamma}_i \right|}{\dot{\gamma}_0} \right)^{1/M} \mathrm{sgn}\left( \dot{\gamma}_i \right) - \frac{g_i^{t+\Delta t}}{M\,\dot{\gamma}_0}\left( \frac{\left| \dot{\gamma}_i \right|}{\dot{\gamma}_0} \right)^{1/M - 1} \delta_{ij} \qquad (64)$$
9. Updating of the plastic shear rates is performed after the increments $\Delta\dot{\gamma}_i$ are calculated:

$$\dot{\gamma}_i = \dot{\gamma}_i + \Delta\dot{\gamma}_i \qquad (65)$$
10. Check if convergence is achieved by computing a convergence criterion, in the form of an error; for additional details see [21]. An example of a convergence criterion is

$$\sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( f_i \, \frac{\dot{\gamma}_i}{\dot{\gamma}_{\max}} \right)^2 } \;\leq\; \mathrm{Tolerance} \qquad (66)$$

$$\dot{\gamma}_{\max} = \max\left\{ \dot{\gamma}_i \mid i = 1, 2, \ldots, N \right\} \qquad (67)$$

where
$N$ is the total number of slip systems (twelve for an FCC lattice), and Tolerance is a small number (very often chosen as $10^{-8}$). The functions $f_i$ are defined at Step 8.
11. Re-compute all variables needed for calculating the convergence criterion at Step 10, and check for convergence. If the convergence criterion is fulfilled, exit the Newton–Raphson procedure; otherwise perform a new Newton–Raphson step. Due to the high nonlinearity of the power laws, especially for very large values of $M$, the Newton–Raphson procedure may diverge and fail to achieve a solution. In this situation, the procedure must be combined with correction procedures, such as a line search [22].
12. Return to the finite element program or to the next step in the algorithm. A schematic sketch of the Newton–Raphson loop of Steps 8–11 follows.
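The Newton–Raphson loop can be summarized in a few lines. The Python sketch below is only schematic: the callables `residual` and `jacobian` are hypothetical stand-ins for Eqs. (62) and (64), encapsulating the stress and hardening updates of Steps 2–7, and the absolute value in the error norm is a small robustness tweak not present in Eq. (67).

```python
import numpy as np

def newton_slip_rates(gdot0, residual, jacobian, n_slip=12,
                      tol=1.0e-8, max_iter=50):
    """Implicit Newton-Raphson solve for the slip rates gdot_i.

    residual(gdot) -> (n,) values f_i of Eq. (62)
    jacobian(gdot) -> (n, n) matrix of d f_i / d gdot_j, Eq. (64)
    """
    gdot = np.full(n_slip, gdot0)               # initial guess
    for _ in range(max_iter):
        f = residual(gdot)
        J = jacobian(gdot)
        dgdot = np.linalg.solve(J, -f)          # Eq. (63)
        gdot = gdot + dgdot                     # Eq. (65)
        gmax = np.max(np.abs(gdot))             # Eq. (67)
        err = np.sqrt(np.mean((f * gdot / gmax) ** 2))   # Eq. (66)
        if err <= tol:
            return gdot
    raise RuntimeError("No convergence; combine with a line search [22]")
```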
References

[1] P.R. Dawson, “On modeling of mechanical property changes during flat rolling of aluminum,” Int. J. Solids Structures, 23(7), 947–968, 1987.
[2] D. Peirce, R.J. Asaro, and A. Needleman, “An analysis of nonuniform and localized deformation in ductile single crystals,” Acta Metall., 30, 1087–1119, 1982.
[3] M.M. Rashid and S. Nemat-Nasser, “A constitutive algorithm for rate dependent crystal plasticity,” Computer Methods in Applied Mechanics and Engineering, 94, 201–228, 1990.
[4] U.F. Kocks, C.N. Tomé, and H.R. Wenk, Texture and Anisotropy: Preferred Orientations in Polycrystals and their Effect on Materials Properties, Cambridge University Press, 1998.
[5] L. Anand and M. Kothari, “A computational procedure for rate-independent crystal plasticity,” Journal of the Mechanics and Physics of Solids, 44(4), 525–558, 1996.
[6] A.M. Cuitino and M. Ortiz, “Computational modelling of single crystals,” Modelling Simul. Mater. Sci. Eng., 1, 225–263, 1992.
[7] P.R. Dawson and E.B. Marin, “Computational mechanics for metal deformation processes using polycrystal plasticity,” Advances in Applied Mechanics, 34, 78–171, 1998.
[8] G.I. Taylor and C.F. Elam, “The distortion of an aluminum crystal during a tensile test,” Proc. Royal Soc. London, A102, 643–667, 1923.
[9] E. Schmid, “Über die Schubverfestigung von Einkristallen bei plastischer Deformation,” Z. Physik, 40, 54–74, 1926.
[10] G.I. Taylor, “Plastic strain in metals,” J. Inst. Metals, 62, 307, 1938.
[11] J.F.W. Bishop and R. Hill, “A theoretical derivation of the plastic properties of a polycrystalline face-centered metal,” Phil. Mag., Ser. 7, 42, 1298–1307, 1951.
[12] K.S. Havner, Finite Plastic Deformation of Crystalline Solids, Cambridge University Press, 1992.
[13] E.H. Lee and D.T. Liu, “Finite strain elastic-plastic theory with application to plane-wave analysis,” J. Appl. Phys., 38, 391–408, 1967.
[14] J.W. Hutchinson, “Bounds and self-consistent estimates for creep of polycrystalline materials,” Proc. R. Soc. Lond. A, 348, 101–127, 1976.
[15] M.F. Horstemeyer and D.L. McDowell, “Modeling effects of dislocation substructure in polycrystal elastoviscoplasticity,” Mech. Matls., 27, 145–163, 1998.
[16] G. Sachs, Z. Verein Deut. Ing., 72, 734, 1928.
[17] R.I. Borja and J.R. Wren, “Discrete micromechanics of elastoplastic crystals,” International Journal for Numerical Methods in Engineering, 36, 3815–3840, 1993.
[18] J. Schroder and C. Miehe, “Aspects of computational rate independent crystal plasticity,” Computational Materials Science, 9, 168–176, 1997.
[19] S.R. Kalidindi, “Polycrystal plasticity: constitutive modeling and deformation processing,” PhD Thesis, MIT, Cambridge, MA, 1992.
[20] C. Miehe and J. Schroder, “A comparative study of stress update algorithms for rate-independent and rate-dependent crystal plasticity,” Int. J. Num. Meth. Eng., 50, 273–298, 2001.
[21] R.D. McGinty, “Multiscale representation of polycrystalline inelasticity,” PhD Thesis, Georgia Institute of Technology, Atlanta, GA, 2001.
[22] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, 1986.
3.6 INTERNAL STATE VARIABLE THEORY

D.L. McDowell
Georgia Institute of Technology, Atlanta, GA, USA
Many practical problems of interest deal with irreversible, path dependent aspects of material behavior, such as hysteresis due to plastic deformation or phase transition, fatigue and fracture, or diffusive rearrangement. Some of these processes occur so slowly and so near equilibrium that attendant models forego description of nonequilibrium aspects of dissipation (e.g., grain growth). On the other hand, some irreversible behaviors such as thermally activated dislocation glide can occur farther from equilibrium with a spectrum of relaxation times. The fact that quasi-stable, nonequilibrium configurations of defects can exist in lattices at multiple length scales, combined with the long range nature of interaction forces, presents an enormous challenge to the utility of high fidelity, high degree of freedom (DoF) dynamical models that employ atomistic or molecular modeling methods. For example, analyses of simple crystal structures using molecular dynamics have now reached scales on the order of microns, but are limited to rather idealized systems such as pure metals and to small time durations of the order of nanoseconds. High fidelity analyses of generation, motion and interaction of line defects in lattices based on discrete dislocation dynamics, making use of interactions based on linear elastic solutions, cover somewhat higher length scales and longer time scales, but are also limited in considering realistic multiphase, hierarchical microstructures. Crystal plasticity as well cannot be used for large scale finite element simulations, for example, crash simulations of a vehicle into a barrier. As experimental capabilities have become highly automated with increasingly accurate resolution, computer control and digital data acquisition, and in situ characterization capabilities have improved, many important phenomenological aspects of the nonlinear, irreversible behavior of materials have become much better understood. Concurrently, computing capabilities are now sufficient to make feasible the integration of more complex constitutive relations (e.g., stress–strain behavior as a function of temperature, strain, strain
rate, and initial material condition) in obtaining approximate solutions to initial-boundary value problems via finite element or finite difference schemes. To add to this confluence of technologies, mesoscale methods have resulted in advances in understanding and description of the effects of various microstructure features such as grains, reinforcement phases, inclusions, crystal structures, and so forth on deformation and failure processes in various classes of materials. This brief overview considers Internal State Variable (ISV) constitutive theory, which offers a rather robust framework for incorporating irreversible, path dependent behavior, informed by experiments, computational materials science, and mesoscale micromechanics methods.
1. Internal Variables and the Notion of Constrained Equilibrium States
For simplicity, we employ the small strain measure $\varepsilon$ with conjugate stress $\sigma$. Absolute temperature is designated by $T$. ISV constitutive theory was initially developed based on the notion that a general nonequilibrium, irreversible process can be treated as a sequence of constrained equilibrium states (cf. [1–4]). See Muschik [5] for an excellent overview. Suppose that for a sequence of nonequilibrium states we locally define entropy and temperature as per their usual equilibrium state functions of $(\varepsilon, T, \xi_k)$, such that the Helmholtz free energy function is written as $\psi = \psi\left( \varepsilon, T, \xi_k \right) = u - T\eta$. The vector of internal state variables, $\xi_k$, represents effects of evolving material microstructure on the change of free energy. In this way, we extend the equilibrium state space to nonequilibrium processes via augmentation of the dependence of thermodynamic state functions on nonequilibrium variables $\xi_k$. The primary assumption underlying the notion of accompanying constrained equilibrium states to a given nonequilibrium path is that rates of “forcing” variables $(\varepsilon, T)$ are either sufficiently slow relative to the characteristic relaxation rates of viscous, thermally activated barrier bypass or diffusion rates associated with irreversibility such that relaxation occurs to near equilibrium before the next increment is taken, or they are sufficiently fast relative to these characteristic relaxation rates such that viscous rearrangement processes have little time to occur [6]. In certain cases, such as high strain rate deformation of materials, the rates of dynamic rearrangement due to application of forces and relaxation are of similar order and the concept of a local constrained equilibrium state breaks down, as can be the case for other high frequency, short wavelength phenomena. In practice, it is common to assume a relatively small set of internal state variables that represent the pertinent physical processes associated with inelastic deformation. Clearly, if the microstructure does not undergo irreversible change, then the body is at most nonlinearly thermoelastic. Virtually any aspect
of microstructure that can undergo irreversible rearrangement during changes of $(\varepsilon, T)$ is a candidate for description by an internal state variable, e.g., dislocation density and arrangement, voids, cracks, slippage of fibers relative to matrices, particle cracking or debonding, phase changes, lattice orientation, and so on. It is presumed that a representative volume element (RVE) of material is considered, such that properties determined from boundary information over the scale of the RVE will not change with further increases of volume size or translation of its boundaries for uniform macrofields (cf. [15, 19]). This definition of statistically homogeneous behavior requires a suitably large RVE relative to the size and spacing (correlation length) of heterogeneities in terms of both microstructure and microstructure change, depending on the property of interest. On occasion, particularly when conducting analyses with resolution on the same order as the microstructure, this requirement conflicts with the notion of behavior at an infinitesimal point; the theory must be extended to address nonlocal behavior in a finite neighborhood of a point, where length scale effects are an obvious manifestation of this extension. However, nonlocal theories are problematic in that the spatial dependence of their constitutive equations interacts with the spatial dependence of the other governing field equations in ways that are beyond the capabilities of most traditional numerical codes. Hence, the local assumption is often invoked as a reasonable approximation. Care must be taken to determine parameters from experiments for which specimens are suitably large to validate the underlying RVE assumptions, including the absence of microstructure rearrangement localization effects (e.g., front of a phase transformation or shear banding) across the entire specimen. Under these assumptions, we may view $\psi = \psi\left( \varepsilon, T, \xi_i \right)$ as a local equation of state, with the path history dependence of inelastic behavior (nonuniqueness of the succession of constrained equilibrium states) embedded in the evolution of the $\xi_i$. This approach lends itself well to incorporation of known relations for evolution of microstructure. We assign the evolution equations $\dot{\xi}_i = g_i\left( \sigma, T, \xi_j \right) = \hat{g}_i\left( \sigma, T, f_j \right)$, $i, j = 1, 2, \ldots, n$, as a crucial part of the constitutive formulation. Here, the $\dot{\xi}_i$ are considered as thermodynamic fluxes and are “strain-rate-like” quantities, and the $f_i$ are conjugate thermodynamic (driving) forces. The nonnegative intrinsic entropy production rate (dissipation) per unit volume is given by

$$\rho\gamma_{loc} = \rho\dot{\eta} - \frac{\rho\Upsilon}{T} + \frac{1}{T}\nabla\cdot q = \rho\dot{\eta} - \frac{1}{T}\left( \rho\Upsilon - \nabla\cdot q \right) \geq 0 \qquad (1)$$
where ρ is mass density, q is the heat flux vector, ϒ is the internal rate of heat supply per unit mass extrinsic to dissipation associated with microstructure rearrangement processes (e.g., radiation), and η is the entropy per unit mass. The inequality ργloc ≥ 0 is a strong form of the local form of the 2nd Law of Thermodynamics (i.e., the Clausius–Duhem (C–D) inequality) which appends an inequality associated with heat conduction.
The local form of the 1st law of thermodynamics, the energy equation, is written as

$$\rho\Upsilon - \nabla\cdot q = \rho\dot{u} - \sigma : \dot{\varepsilon} \qquad (2)$$

where $u$ is the internal energy per unit mass. From the primal definition of the Helmholtz free energy, $\psi = u - T\eta \;\rightarrow\; \dot{\psi} = \dot{u} - T\dot{\eta} - \eta\dot{T}$. Expanding the state function $\psi$ as a total differential, $\dot{\psi} = \left( \partial\psi/\partial\varepsilon \right) : \dot{\varepsilon} + \left( \partial\psi/\partial T \right)\dot{T} + \sum_{i=1}^{n} \left( \partial\psi/\partial\xi_i \right) * \dot{\xi}_i$, and substituting into Eq. (2) gives

$$\left( \rho\frac{\partial\psi}{\partial\varepsilon} - \sigma \right) : \dot{\varepsilon} + \rho\left( \frac{\partial\psi}{\partial T} + \eta \right)\dot{T} + \rho T\dot{\eta} - \rho\Upsilon + \nabla\cdot q + \rho\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = 0. \qquad (3)$$
Each term in parentheses must vanish independently since $\varepsilon$ and $T$ are independent variables, leading to the relations

$$\sigma = \rho\frac{\partial\psi}{\partial\varepsilon}, \qquad \eta = -\frac{\partial\psi}{\partial T}, \qquad -\frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = \rho\dot{\eta} - \frac{1}{T}\left( \rho\Upsilon - \nabla\cdot q \right). \qquad (4)$$

Making use of the C–D inequality in Eq. (1), the last of Eqs. (4) reduces to

$$\rho\gamma_{loc} = -\frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi}{\partial\xi_i} * \dot{\xi}_i = \frac{1}{T}\sum_{i=1}^{n} f_i * \dot{\xi}_i \geq 0, \qquad (5)$$
where $f_i = -\rho\left( \partial\psi/\partial\xi_i \right)$ is the thermodynamic force conjugate to displacement $\xi_i$. The operator “$*$” denotes an appropriate scalar product for the Euclidean space of components of tensor $\xi_i$, since the various internal variables can be of arbitrary (and different) rank. We may regard the succession of free energy states through which the material evolves as driven by the global minimization of free energy of the RVE. We draw several important points from Eq. (5): (a) the C–D inequality is equivalent to reduction of free energy with respect to irreversible change of microstructure, (b) the free energy release rate is a generalization of the energy release rate concept in fracture mechanics (cf. [7]), and (c) we may admit certain rearrangement processes that increase free energy so long as there are other irreversible processes that dissipate more energy than is stored. This last point helps to explain why ISVs are necessary to model, in a manner consistent with the 2nd Law, certain processes such as negative creep rate in unloaded metals subjected to tensile prestrain at elevated temperatures (cf. [8]) or contraction of muscle tissue under application of a tensile force. It is noted as well that in the absence of irreversible internal structure rearrangement, $\psi = \psi(\varepsilon, T)$ suffices for a mechanical description of reversible elastic behavior; it is unnecessary to introduce ISVs in this case
unless they provide a description of configurational entropy changes as a function of a sequence of reversible elastic states of microstructure, for example, network models of long chain molecules in elastomers.
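The structure of Eqs. (4)–(5) is easy to exercise on a toy problem. The one-variable Python sketch below is purely illustrative (all functions and parameter values are assumptions, not taken from the text): a quadratic stored-energy term $\psi_{in} = \frac{1}{2}H\xi^2$ gives the conjugate force $f = -\rho\,\partial\psi/\partial\xi = -\rho H\xi$, and a linear force–flux relation $\dot{\xi} = f/(\rho\eta_v)$ makes the dissipation integrand $f\dot{\xi}/T$ of Eq. (5) automatically nonnegative.

```python
# Toy one-ISV model (hypothetical parameters): psi_in = 0.5 * H * xi**2.
rho, H, eta_v, T = 7800.0, 1.0e9, 1.0e12, 300.0

def force(xi):
    """Conjugate thermodynamic force f = -rho * d(psi_in)/d(xi)."""
    return -rho * H * xi

def flux(xi):
    """Linear force-flux relation: xi_dot = f / (rho * eta_v)."""
    return force(xi) / (rho * eta_v)

xi, dt = 1.0e-3, 1.0e-2
for _ in range(5):
    xidot = flux(xi)
    dissipation = force(xi) * xidot / T   # integrand of Eq. (5), >= 0
    assert dissipation >= 0.0
    xi += xidot * dt                      # ISV relaxes toward equilibrium
```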
2. Measurement and Interpretation of ISVs
At least two aspects of this formulation relegate the internal state variables to the status of “hidden” variables: (i) the local form of the equations is written as a set of differential equations with no spatial dependence and therefore does not couple with the governing field balance equations of mass, momentum and energy on the spatial manifold; (ii) the complete set of ISVs in general cannot be directly measured by virtue of the concession of many-body problems to a low order description (i.e., thermodynamical, rather than dynamical, description). As mentioned previously, (i) can be relaxed if warranted by the need to consider nonlocal action (cf. [9]) by introducing spatial dependence of the ISV evolution equations either via nonlocal integral forms (cf. [10]) or in terms of gradient approximations (cf. [11, 12]). In fact, this step is essential if it is desired to allow the microstructure to evolve and spatially “self-organize” in response to changes in both initial and applied boundary conditions [13]. In many cases, however, such effects of heterogeneity are encompassed within the local ISV description in situations where the behavior at each point can be suitably characterized as statistically representative of a volume element. It is possible to account for spatial gradients of microstructure within an RVE without introducing spatial derivatives in the evolution equations for the $\dot{\xi}_i$ by considering distributions of gradients of microstructural features such as dislocations or cracks within a statistically homogeneous infinitesimal neighborhood of a point (cf. [11, 12, 14]). However, requirements of statistical homogeneity often become more difficult to meet in such cases with regard to properties that depend on distributions of microstructure, such as fracture resistance and ductility, in comparison to properties that depend in weak fashion on distribution, such as conductivity and elastic stiffness. To understand the implications of point (ii) above, we must consider that adoption of an RVE is intimately related to the replacement of an explicit field of microstructure heterogeneities by an “equivalent” homogeneous continuum. This idea of an equivalent homogeneous material element that replaces the heterogeneous material at each point is central to continuum ISV theory, as illustrated in Fig. 1. In Fig. 1, the homogenization mapping $H\left( \psi^*\left( \varepsilon^*, T^*, x^* \right) \right) \rightarrow \psi\left( \varepsilon, T, \xi_i \right)\big|_{RVE}$ is assumed to hold at the scale of the RVE, such that
Figure 1. Homogenization of heterogeneous microstructure into equivalent homogeneous continuum at the scale of an RVE.
local field variables (ε∗ , T ∗ ) that vary within this volume are mapped to RVE averages, which are related to RVE boundary information, with necessary augmentation by the ISVs to achieve the free energy equivalence. The resulting RVE level free energy function serves as the potential for statistically representative stress–strain, entropy-temperature and generalized ISV forcedisplacement relations for internal defects or heterogeneities, as outlined in the preceding. In mesoscale micromechanics methods of elastic heterogeneous media with defects (cf., [15]), homogenization is also achieved at the RVE level, but with the goal of volume averaging of stress, strain and the free energy (typically expressed in isothermal case as strain energy density), explicitly taking into account heterogeneities within the RVE. Accordingly, the aims of such mesoscale methods are essentially captured within the foregoing ISV framework, but there are some important distinctions. Often, the degree of idealization necessary to perform rigorous volume averaging in initial-boundary value problems to obtain RVE level potentials sacrifices too many of the essential features of the heterogeneity fields. For example, Fig. 2 shows three different idealizations of a polycrystal with distributed defects (e.g., dislocations, voids, etc.) within grains. In the first case, appearing at the far left, we explicitly consider both scales of grains and intragranular defects or subgrain structures. In the second case, shown in the middle, ISVs are introduced only to represent heterogeneous phenomena within grains, and only grain-scale heterogeneity is explicitly addressed. At the far right, the fully homogenized description replaces effects of all heterogeneities with ISVs. The number of DoF, N, associated with each of the descriptions shown in Fig. 2, which includes computational DoF, decreases from left to right as we smear heterogeneities and incorporate them in ISVs. It may be feasible to achieve a homogenized description (far right) by solving initial-boundary value problems either analytically or numerically with the second idealization, but it is often intractable
[Figure 2 panels labeled $\psi(\varepsilon, T; \xi_I)$, $\psi(\varepsilon, T; \xi_{II})$, and $\psi(\varepsilon, T; \xi_{III})$, with $N(\xi_I) > N(\xi_{II}) > N(\xi_{III})$.]
Figure 2. Criteria of choice for ISV models: (left) explicit model of grains and subgrain defects/heterogeneities, (middle) implicit ISV model for subgrain regions combined with explicit grain level modeling, and (right) statistically homogeneous RVE that implicitly encompasses all scales shown at left.
to model realistic distributions of heterogeneities at multiple length scales. The number of DoF becomes simply overwhelming for numerical solutions at length scales well above the RVE if individual defects are considered. Moreover, specification of initial conditions becomes problematic, as this must meet certain requirements on quasi-minimum free energy to be realistic, and most assuredly these initial conditions are not unique. We further comment on (ii) above. In systems with large populations of defects or other sites associated with microstructure rearrangement processes per unit volume, statistical mechanics appeals to ensemble averaging. In so doing, we relinquish any attempt to explicitly model the “dynamical state” of defect interactions. Examples of such fully dynamical models include molecular dynamics, explicit models for interacting deformable systems of particles, and discrete dislocation dynamics theories. The number of DoF for these models climbs dramatically with volume of material considered. In contrast, ISV models of local type address only the reduced thermodynamical state and the solution addresses balance of linear and angular momenta through the governing field equations at the equivalent material (i.e., continuum) level rather than between individual defects or heterogeneities within the RVE. In the process of discarding the explicit representation, however, we lose the capability to model the state of the material in terms of a direct mapping between microstructure and properties. For example, in molecular dynamics we track the position and velocity of particles as a function of time, thereby permitting application of Newton’s laws to achieve a complete dynamical description of the body at arbitrary length scales above the atomic level; lacking the fully dynamic description, the positions of atoms, particles or other heterogeneities as measured periodically under the microscope, are insufficient to fully characterize
1158
D.L. McDowell
the change of free energy with microstructure evolution (cf., [16]). For thermally activated microstructure rearrangement, energy barriers are often quite nonuniformly distributed through the microstructure. Hence, we cannot rigorously link the evolution of ISVs of a thermodynamic description to measurable low order geometric parameters of an evolving microstructure, but must rely on guidance from measured kinetics or computational methods to build force–flux relations that approximate the behavior of the actual dynamic, many-body problem. Historically, force–flux relations have been inferred from laboratory measurements in important applications such as distributed cavity growth in high temperature creep and dislocation creep or plasticity of polycrystalline metals.
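Such measured kinetics are frequently encoded in thermally activated force–flux forms. The sketch below is an illustrative Arrhenius/hyperbolic-sine relation with hypothetical constants, not a law stated in the text; it shows the generic ingredients — an attempt frequency, an activation energy for barrier bypass, and an activation volume scaling the driving force. Because sinh is odd, the product $f\,\dot{\xi}$ is nonnegative, as the C–D inequality requires.

```python
import numpy as np

# Illustrative thermally activated force-flux relation (hypothetical constants).
k_B = 1.380649e-23          # Boltzmann constant (J/K)
xi_dot0 = 1.0e7             # attempt-frequency prefactor (1/s)
dG = 0.8 * 1.602e-19        # activation energy: 0.8 eV in joules
V_act = 2.0e-29             # activation volume (m^3)

def flux(f, T):
    """xi_dot = xi_dot0 * exp(-dG/kT) * sinh(f * V_act / kT)."""
    kT = k_B * T
    return xi_dot0 * np.exp(-dG / kT) * np.sinh(f * V_act / kT)

for f in (1.0e6, 1.0e7, 1.0e8):                 # driving forces (Pa)
    print(f, flux(f, 300.0), flux(f, 600.0))    # flux rises steeply with T
```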
3. Comments on Local ISV Framework
The foregoing framework provides little detail regarding the prescription of the kinetics of evolution processes other than the need to accord with the C–D inequality – this must be specified via the constitutive relations for the fluxes $\dot{\xi}_k$. We are at liberty to base these relations on physical observations, mesoscale micromechanics methods, other computational materials science simulations, or a heuristic of maximal rate of dissipation [17]. Maximal dissipation, combined with the assumption of a convex dissipation potential $\Omega = \hat{\Omega}\left( \sigma, T, f_j \right)$ in the space of conjugate forces and temperature, gives rise to a generalized normality structure for the fluxes with respect to this potential, i.e., $\dot{\xi}_j = \partial\Omega/\partial f_j$ [3, 7]. Rice [21] has shown that this requires that each thermodynamic force depends only on its conjugate generalized flux. While this result is appealing and has been used for many processes such as dislocation plasticity (resolved shear stress vs. slip system shearing rate relations), it is emphasized that the heuristic of maximal dissipation is a constitutive assumption and therefore is not on the same grounds as the 2nd Law inequality. Ziegler [17] makes the point that for multiple mechanisms of dissipation, the viability of maximal rate of dissipation as a governing principle for communal response at the RVE scale depends on whether the mechanisms are coupled or decoupled, and whether force–flux relations are linear or not. For many processes, however, it is a powerful and useful heuristic. If not for the assumption of accompanying constrained equilibrium states, the adoption of the equilibrium form of the thermodynamic relations along a nonequilibrium process would be inessential. In fact, even if we do not invoke equilibrium state functions to model nonequilibrium behaviors under locally constrained equilibrium, we may still specify evolution equations for the $\dot{\xi}_j$ as statistical ensemble averages; we then consider the $\xi_j$ simply as internal variables, removing the modifier “state” from the label. For highly nonequilibrium processes, Extended Irreversible Thermodynamics (EIT) approaches have been developed (cf. [18]) that include
dependence of nonequilibrium state functions on fluxes of energy, mass, momentum and/or higher moments of velocity; as such, the EIT approach is fully nonlocal and removes undesirable characteristics of infinite speed of propagation of signals associated with otherwise parabolic governing equations absent such flux terms. Another important restriction on the RVE is that, in addition to the use of the free energy as a potential in Eq. (4), the force–flux relations $\dot{\xi}_i = \hat{g}_i\left( \sigma, T, f_j \right)$ must hold at the RVE level. In fact, these relations are most fundamental to the ISV approach as they do not necessarily rely on the assumption of constrained equilibrium for thermodynamic state functions. As these relations are often quite sensitive to the tails of the distributions (e.g., distribution of largest heterogeneities), the RVE that corresponds to statistically representative force–flux relations may be significantly larger than that corresponding to the use of free energy to serve as a statistically representative potential. In some cases, it may be on the order of the size of the structural element being modeled, and the premise of local ISV theory is inapplicable. This definition of statistically homogeneous behavior based on force–flux relations requires a suitably large RVE relative to the size and spacing (correlation length) of heterogeneities in terms of both microstructure and microstructure change. The assumption of self-similar development of damage is sometimes made for the sake of convenience in extending mesoscale methods solutions for elastic stiffness as a function of damage intensity and distribution, by assuming that distributions of defects will evolve uniformly and neglecting differential changes of higher moments of the distributions due to coalescence phenomena. In general, the kinetics of defect field growth must be based on either physical measurements or more sophisticated, material-dependent simulations of damage growth.
4. Examples of ISV Models
We may identify an inexhaustive list of well known types of constitutive models belonging to the ISV classification (cf. [7, 19, 20]):
• Creep damage of metals in the form of voids, leading to the broad field of continuum damage mechanics
• Void nucleation, growth and coalescence in metals under tensile loading
• Matrix microcracking and fiber fracture in metal- and ceramic-matrix composites
• Dislocation plasticity in single crystals or polycrystals.
In all of these applications the common feature is that of distributed defects or other entities associated with microstructure rearrangement. In most cases, ISV models are based on a combination of experiments and guidance from available micromechanical or materials science-based models.
When ISV constitutive relations are based on macroscopic experiments, it is common for the modeler to introduce a relatively small set (typically 1–3) of different types of ISVs that reflect major features of the irreversible behavior. Evolution equations are framed to describe limited experimental information – since the description is nonunique, a plethora of models can be introduced to model the same range of behaviors. Usually, as the number of different types of experiments and test conditions expands, candidate models are eliminated due to lack of generality or physical inconsistency. As described later, critical experiments can be designed to test various forms of the evolution equations $\dot{\xi}_j$. Effectively, the material itself is used as a direct indicator of homogenized behavior, under the assumptions (which ideally should be checked) that boundary conditions and specimen geometry/size have no influence. The model is interpolative in nature. The ISVs are likely to have little relation to measurable features of microstructure or perhaps a single mechanism in view of the foregoing discussion concerning thermodynamic representation of low DoF models. Since this class of ISV model is historically prominent, ISV models are often more generally viewed as primarily phenomenological in nature. A second type of ISV model, however, is a hybrid combination of computational and/or analytical micromechanics methods and motivation from experimental results. A good example is polycrystal plasticity, in which discrete grains and slip systems are addressed computationally, with the slip system level constitutive equations for dislocation glide kinetics and work hardening based on phenomenological laws. This type of model uses relatively large numbers of ISVs for hardening parameters that implicitly address dislocation interactions, as well as explicit ISVs for slip system orientation and grain size/shape. Integrated numerically over a periodic RVE, they provide the necessary information regarding RVE-level thermodynamics and kinetics. Such an approach is much more predictive and robust than macroscopic plasticity, since it explicitly addresses evolution of crystallographic texture and models both anisotropic elasticity and plasticity. It is clear from this example that mesoscale micromechanics methods adopt a thermodynamic representation as assumed in ISV theory. Experiments are needed to specify initial conditions on orientation and misorientation distributions of grains, in addition to hardening laws. Yet a third type of ISV model has been introduced in which analytical or computational mesoscale methods are used to derive the form of the free energy function and evolution equations of ISVs. Then, the role of experiments is to inform the identification of parameters. This differs from ISV models based on experiments in that model idealization has been introduced to facilitate modeling at the RVE level. A good example is the derivation of models for elastic energy and energy release rate due to matrix microcracking, matrix/fiber slippage and fiber damage in viscoelastic matrix composites (cf. [20]). Experiments are needed to determine strengths of phases and interface properties.
The last two examples illustrate the close relation between mesoscale micromechanics methods and ISV theory. Increasing emphasis on modeling either prior to, or concurrent with, experiments as we progress through these examples should not be confused with increased accuracy of the approach. Clearly, within a given range of conditions, approaches that make more use of experiments will likely provide more realistic results. On the other hand, if it is desired to probe structure–property relations for large ranges of microstructure, then experiments are likely too time-consuming and costly, and hybrid models emphasizing computational micromechanics methods with ISV models to represent fine scale phenomena are good candidates. We may regard these as modeling options of a multi-resolution type, and as elements of a hierarchical, multiscale modeling strategy. When modeling moves from the realm of explicit dynamic analyses of particle or defect interactions to mesoscopic or macroscopic continuum formulations at higher length scales, such multiscale strategies are perhaps best facilitated by using thermodynamics and kinetics to make the linkage, whereby statements of thermodynamics and dissipation made by models with different DoF on overlapping spatial domains and time scales should be held to some measure of equivalence. Moreover, experiments suggest that the mix of contributions of various deformation mechanisms can be treated with a heuristic such as maximal rate of dissipation, which facilitates modeling of multiple deformation mechanisms. In any case, physical consistency is essential. This includes consistency with experimental observations as well as accordance with all the other governing equations and boundary conditions. For example, although much attention has been devoted in the literature to the constraints dictated by the C–D inequality in Eq. (1) on the form and parameters of constitutive equations for irreversible processes, many models can be eliminated because they simply predict the incorrect magnitude of dissipation in spite of the proper sign.
5. Application: Metal Viscoplasticity
Viscoplastic behavior of metals is among the more common applications of ISV theory. There are at least five length scales at which dislocation plasticity may be addressed, as shown in Fig. 3. From left to right, these scales are atomistic (molecular dynamics), collections of dislocations (discrete dislocation theory), sub-grain dislocation substructures (continuously distributed dislocation theory), grain (crystal plasticity theory) and macroscale (classical kinematic-isotropic hardening theory with a macroscopic flow potential). Solid solution or precipitate strengthened metallic alloys, multi-phase alloys and whisker, particulate or fiber-reinforced metal matrix composites each exhibit additional length scales intermediate to those depicted in Fig. 3, which affect the hardening and flow behaviors as well as the interaction and growth
[Figure 3 schematic — minimum length scale L: atomistic O(10⁻¹⁰ m); discrete dislocations O(10⁻⁸ m); dislocation patterns O(10⁻⁷ m); polycrystal plasticity O(10⁻⁵ m); macroscale plasticity O(10⁻³ m).]
Figure 3. Window of resolution for dislocation plasticity/viscoplasticity, including the typical minimum explicit length scale of resolution at each window size of observation.
of defects such as microvoids. Of course, we do not explicitly consider time scales here, but dynamical atomistic approaches and dynamic discrete dislocation mechanics are limited to short time scales, the latter to a lesser degree, while the others are amenable to longer time scales and fall under the framework of the ISV theory outlined earlier. Moving from left to right in Fig. 3, we may roughly describe the variables necessary to specify the “state” for a representative window as decreasing in number in accordance with a shift from characterizing the fully dynamical state (positions and momenta of atoms or defects) to the thermodynamical state of the system (microstructure attributes such as dislocation density, dislocation patterns, grain size, orientation distribution, etc.). Rice [21] addressed metal plasticity due to dislocation slip in the context of ISV theory, employing the notion of a sequence of constrained equilibrium states. By this time, the phenomenological laws of crystalline metal plasticity were rather firmly established, as reviewed later by Asaro [22]. A physically-motivated additive decomposition of the strain rate into thermoelastic and inelastic parts has been introduced, assuming small strain for simplicity in presentation, as

$$\dot{\varepsilon} = \dot{\varepsilon}^e + \dot{\varepsilon}^{in}, \qquad \sigma = \rho\frac{\partial\psi}{\partial\varepsilon} = \rho\frac{\partial\psi}{\partial\varepsilon^e}, \qquad \psi = \hat{\psi}\left( \varepsilon^e, T, \xi_i \right) = \psi_e\left( \varepsilon^e, T \right) + \psi_{in}\left( \xi_i, T \right) \qquad (6)$$
Note that the complete set of variables necessary to define the state includes the elastic strain tensor (reversible lattice deformation) in addition to $T$ and $\xi_i$. The free energy is split into thermoelastic parts associated with applied elastic strain, $\psi_e$, and internal state variables, $\psi_{in}$, the latter representing, for example, stored elastic energy in the lattice. The inelastic strain rate tensor is given by the variation of strain with respect to microstructure evolution, i.e., $\dot{\varepsilon}^{in} \equiv \sum_{i=1}^{n}\left( \partial\varepsilon/\partial\xi_i \right) * \dot{\xi}_i$. Hence, $\dot{\varepsilon}^{in} = 0$ if and only if $\dot{\xi}_i = 0$, $i = 1, 2, \ldots, n$, and the inelastic strain tensor should not strictly be considered as an ISV in its own right since it merely provides a geometric compilation of the effects
of irreversible rearrangements $\dot{\xi}_i$ on $\dot{\varepsilon}$. For a single crystal [22], the inelastic strain rate is given by $\dot{\varepsilon}^{in} = \sum_{\alpha=1}^{N}\dot{\gamma}^{(\alpha)}\left( s^{(\alpha)} \otimes m^{(\alpha)} \right)_{sym}$, where $N$ is the number of slip systems, each with respective slip direction and slip plane unit normal vectors $s^{(\alpha)}$ and $m^{(\alpha)}$. For the elastic–plastic decomposition in Eq. (6) the C–D inequality is written as

$$\frac{1}{T}\sum_{i=1}^{n} f_i * \dot{\xi}_i = \frac{1}{T}\,\sigma : \sum_{i=1}^{n}\frac{\partial\varepsilon}{\partial\xi_i} * \dot{\xi}_i - \frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi_{in}}{\partial\xi_i} * \dot{\xi}_i = \frac{1}{T}\,\sigma : \dot{\varepsilon}^{in} - \frac{\rho}{T}\sum_{i=1}^{n}\frac{\partial\psi_{in}}{\partial\xi_i} * \dot{\xi}_i \geq 0 \qquad (7)$$

The term $\sigma : \dot{\varepsilon}^{in}$ in Eq. (7) represents the unrecoverable rate of external work, absent body forces, at the scale of the RVE ($\sigma$ and $\dot{\varepsilon}^{in}$ are RVE averages), while the term $(\rho/T)\sum_{i=1}^{n}\left( \partial\psi_{in}/\partial\xi_i \right) * \dot{\xi}_i = (\rho/T)\sum_{i=1}^{n}\left( \partial\psi/\partial\xi_i \right) * \dot{\xi}_i$ reflects the increase of free energy associated with storage of dislocations within the microstructure (stored energy of cold work). There is no stored energy in a purely dissipative process, and hence no change of $\psi_{in}$. For a single crystal undergoing dislocation glide, the first term is just the sum of resolved shear stresses multiplied by associated shearing rates on each slip system, i.e., $\sum_{\alpha=1}^{N}\tau^{(\alpha)}\dot{\gamma}^{(\alpha)}$. For polycrystalline metals at large strains, the energy storage in the second term of Eq. (7) is typically less than 10% of $\sigma : \dot{\varepsilon}^{in}$, and plays a significant role at small strains during loading reversals and for abrupt changes in loading direction (identified with effects of slip system back stress and/or evolving threshold stress, cf. [7]). Works of numerous investigators have amplified and offered alternative sets of internal state variables that describe slip system hardening relations as well as slip system (crystallographic) orientation for purposes of tracking texture evolution in polycrystals. In crystal plasticity, the $\xi_i$ can include the set of slip system shears $\gamma^{(\alpha)}$ (cf. [21]), and $\partial\varepsilon/\partial\xi_i$ can be assessed directly since $\varepsilon = \varepsilon^e + \sum_{\alpha=1}^{N}\gamma^{(\alpha)}\left( s^{(\alpha)} \otimes m^{(\alpha)} \right)_{sym}$ in the small strain case. The inelastic strain rate tensor is prescribed in terms of the kinetics of the shearing rates, which in turn depend on thermodynamically conjugate resolved shear stresses. If a convex dissipation potential is constructed [3, 7] as an additive sum of such potentials for independent dissipative mechanisms (see also [17]), then a generalized normality structure emerges for both the inelastic strain rate and the fluxes $\dot{\xi}_i$ with respect to this summed potential. For a given external work rate, the assertion of maximal dissipation for the collective RVE-level behavior would be interpreted as minimum elastic energy storage rate associated with heterogeneous deformation within the RVE, similar to the principle of minimum elastic energy for elastic-perfectly plastic boundary value problems discussed in Nadai [23]. It is worth mentioning that one can apply the variational principle of virtual velocities (PVV), based on the working rate of microstructural rearrangements (dislocation glide, diffusive rearrangements, relative grain boundary motions, etc.) that can be identified as
kinematically admissible with an applied strain field over the RVE, to solve the micromechanical boundary value problem with coupled effects of multiple inelastic deformation and diffusion mechanisms via mesoscale methods (cf. applications to diffusion-driven void growth with viscous dislocation creep in metals by Needleman and Rice [24] and grain growth by Moldovan et al. [25]). In this case we invoke maximal dissipation for each mechanism, but collective behavior is described by an upper bound derived from PVV using a kinematically admissible field; accordingly, we seek the solution that exhibits minimum dissipation from among all conceivable upper bound solutions for the conjugate driving forces. It may be shown that the RVE inelastic strain rate is normal to a potential constructed by summing the potentials of the various micromechanisms, each of which is governed by a convex dissipation potential that encloses the origin; this structure is consistent with the maximal rate of dissipation. Hence, from among all solutions that maximize dissipation within the constitutive framework we choose the one with minimum dissipation to obtain the approximate solution nearest the exact solution to the boundary value problem. This minimum work approach traces back to Nadai [23]. Such methods are useful for modeling ensembles of mechanisms that contribute to overall inelastic strain rate. Upper (kinematically admissible) and lower (statically admissible) bounds can be developed as per usual applications of the PVV (cf. [24]). Cleri et al. [26] advocate a multiscale modeling framework that connects atomistics to mesoscale continuum models using this kind of generalized PVV; it is noted that viscous forces are assumed to dominate inertial forces in the micromechanical and ISV descriptions, although that is not always the case in atomistics. Accordingly, there is a link between concepts of solving micromechanical boundary value problems at the scale of an RVE and the adoption of an ISV framework at this scale. Mesoscale micromechanics solutions to boundary value problems using PVV as just outlined rely on being able to link the thermodynamic fluxes to the applied strain field in a kinematically admissible sense – compatibility and consistency with imposed boundary velocities; as we have seen, in the ISV approach this may not be possible with hidden variables that reflect contributions from distributed, heterogeneous dissipative mechanisms over a wide range of length scales within the RVE. Multi-resolution numerical simulations based on PVV (e.g., displacement-based finite element method) that pass boundary information from sub-regions to higher length scales offer one way to approach this problem. It is a highly fertile area for future research. It is appropriate to close this paper with a pertinent example of experimental inference of ISV evolution from metal plasticity. At the macroscale, we may introduce as a minimum set of internal variables the back stress and an isotropic hardening variable. They are “hidden” in the sense that their effect can be ascertained or inferred from the evolutionary behavior. The back stress
is used to reflect directional effects of pre-strain on the kinetics of plastic flow (so-called Bauschinger effects). These effects are manifested as dependence of reverse yielding on pre-strain in the forward direction, the occurrence of negative creep rate after unloading to zero stress following a tensile pre-strain, and behavior under abrupt changes in direction of loading paths. In terms of Eq. (7), the back stress introduces effects of storage of elastic energy due to heterogeneity of the microstructure. Physically we can suggest several origins of back stress in polycrystalline metals [8]:
• differential yielding among grains with hard and soft orientation, which changes with texture evolution and sub-grain formation
• pile-ups of dislocations against hard boundaries such as grain or phase boundaries
• differential resistance to slip in forward and reverse directions in certain systems by virtue of irreversible pinning mechanisms in the presence of planar dislocation structures in second phases or in Taylor lattices
• distribution of short range barriers to thermally-activated dislocation motion and anelastic bowing of dislocations.
been identified as a principal feature of the cyclic deformation response of ductile metals. Otherwise known as kinematic hardening, this response has been attributed to the nonuniformity of inelastic flow (dislocation motion) at the microscale. For more details, the reader is referred to the review of cyclic plasticity and viscoplasticity by Chaboche [28]. We consider here the shift and expansion/contraction of a simple, initially isotropic yield surface for type 304 stainless steel, introducing the set $\xi_i$ comprised of a single tensorial kinematic hardening variable, $\alpha$, and a scalar isotropic hardening variable, $R$. The variable $R$ reflects storage of elastic energy associated with overall dislocation density accumulation and reduction of mean free path. A simple, classical model of yielding for an initially isotropic, plastically incompressible rate-independent polycrystal employs a simple uniaxial normalized $J_2$ yield surface of the form

$$f = \frac{3}{2}\left( \sigma'_{ij} - \alpha_{ij} \right)\left( \sigma'_{ij} - \alpha_{ij} \right) - R^2 \qquad (8)$$
where $\sigma'_{ij} = \sigma_{ij} - \left( \sigma_{mm}/3 \right)\delta_{ij}$ is the deviatoric stress, $\alpha_{ij}$ is the deviatoric back stress tensor and $R$ is the radius of the yield surface. The flow rule, for $f = \dot{f} = 0$, is written as

$$\dot{\varepsilon}^p_{ij} = \frac{1}{h}\left( \dot{\sigma}_{kl} N_{kl} \right) N_{ij} \qquad (9)$$
where $N_{ij} = \left( \sigma'_{ij} - \alpha_{ij} \right) / \left\| \sigma'_{kl} - \alpha_{kl} \right\|$ is the unit normal to $f$ in deviatoric stress space at the current stress point on the yield surface, and the inelastic strain rate is the plastic strain rate, i.e., $\dot{\varepsilon}^{in}_{ij} = \dot{\varepsilon}^p_{ij}$. Within this framework, the form of evolution of the back stress can be probed by experiments that involve continuous axial-torsional strain-controlled, nonproportional deformation of tubular specimens along cyclically stable paths. Under these conditions, it is assumed that $R$ has saturated or stabilized and only the back stress evolves. Assuming the form in Eq. (8) holds for the yield surface, Fig. 4 compares the performance of linear and nonlinear kinematic hardening laws for 304 stainless steel at room temperature [29], i.e.,

$$\dot{\alpha}_{ij} = H\,\dot{p}\,N_{ij} \qquad (10)$$

$$\dot{\alpha}_{ij} = C\left( b\,N_{ij} - \alpha_{ij} \right)\dot{p} \qquad (11)$$

where $\dot{p} = \sqrt{ \dot{\varepsilon}^{in}_{kl}\,\dot{\varepsilon}^{in}_{kl} }$, and $H$, $C$ and $b$ are scalar constants. In Fig. 4, the back stress path is plotted in the subspace of axial back stress, $\alpha_1 = \alpha_{11}$, vs. shear back stress, $\alpha_3 = \sqrt{3}\,\alpha_{12}$, for cyclically stable response, as estimated by backward extrapolation from the stress point along the inelastic strain rate direction determined by numerical differentiation of data in the vicinity of each point along the stress path; the backward-extrapolated path is selected that provides the smoothest evolution through the cycle, i.e., minimizing jump
Figure 4. Predicted directions of back stress rate (arrows) based on the Prager kinematic hardening law (top) and Armstrong–Frederick nonlinear kinematic hardening law (bottom), superimposed on trajectory of back stress (solid lines) as inferred from measurements under a cyclically stable nonproportional straining path for type 304 stainless steel [29].
discontinuities near points of abrupt change of direction of loading path. Arrows in the direction predicted by each of the back stress evolution laws in Eqs. (10) and (11) are drawn with tails along this curve. Clearly, the nonlinear form given in Eq. (11) provides a much more accurate description than that in Eq. (10) since the direction of the rate predicted by Eq. (11) is nearly tangent to the measured (inferred) back stress path through the cycle. In contrast, the Prager rule in Eq. (10) is inaccurate since the arrows are far from tangency to the path. Indeed, the Armstrong–Frederick form [30] of kinematic hardening in Eq. (11) has been frequently used in the last twenty years to provide accurate simulations of cyclic behavior under nonproportional loading. Further enhancements of this simple form have addressed more accurate unloading–reloading behavior as well.
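The qualitative contrast seen in Fig. 4 is easy to reproduce in one dimension. In the sketch below (illustrative constants chosen for demonstration; under uniaxial flow the unit normal reduces to $N = \pm 1$), the Prager rule of Eq. (10) simply retraces a straight line on each reversal, while the Armstrong–Frederick rule of Eq. (11) saturates toward $\pm b$ and yields the bounded, rounded response.

```python
import numpy as np

H, C, b = 5.0e8, 50.0, 100.0e6   # illustrative hardening constants

def integrate(alpha_rate, n_legs=8, dp=1.0e-4, steps_per_leg=500):
    """Integrate a uniaxial back stress law over alternating flow legs."""
    alpha, history = 0.0, []
    for leg in range(n_legs):
        N = 1.0 if leg % 2 == 0 else -1.0   # flow direction this half-cycle
        for _ in range(steps_per_leg):
            alpha += alpha_rate(alpha, N) * dp
            history.append(alpha)
    return np.array(history)

prager = integrate(lambda a, N: H * N)               # Eq. (10)
arm_fred = integrate(lambda a, N: C * (b * N - a))   # Eq. (11)
print(prager.max() / 1e6, arm_fred.max() / 1e6)      # A-F saturates near b
```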
References

[1] B.D. Coleman and M.E. Gurtin, “Thermodynamics with internal state variables,” J. Chem. Phys., 47, 597–613, 1967.
[2] J. Kestin and J.R. Rice, “Paradoxes in the application of thermodynamics to strained solids,” In: E.B. Stuart, B. Gal-Or, and A.J. Brainard (eds.), A Critical Review of Thermodynamics, Mono-Book Corp., Baltimore, pp. 275–298, 1970.
[3] P. Germain, Q.S. Nguyen, and P. Suquet, “Continuum thermodynamics,” J. Appl. Mech. Trans. ASME, 50, 1010, 1983.
[4] J. Kestin, “Local equilibrium formalism applied to mechanics of solids,” Int. J. Sol. Struct., 29(14–15), 1827–1836, 1992.
[5] W. Muschik, “Fundamentals of nonequilibrium thermodynamics,” In: W. Muschik (ed.), Non-Equilibrium Thermodynamics with Applications to Solids, CISM Courses and Lectures No. 336, International Centre for Mechanical Sciences, Springer-Verlag, New York, pp. 1–63, 1993.
[6] J. Bataille and J. Kestin, “L'interprétation physique de la thermodynamique rationnelle,” J. de Mécanique, 14, 365–384, 1975.
[7] J. Lemaitre and J.L. Chaboche, Mechanics of Solid Materials, Cambridge University Press, Cambridge, 1990.
[8] D.L. McDowell, “Multiaxial effects in metallic materials,” Symposium on Durability and Damage Tolerance, ASME AD-Vol. 43, ASME Winter Annual Meeting, Chicago, IL, Nov. 6–11, pp. 213–267, 1994.
[9] Z.P. Bazant and E.-P. Chen, “Scaling of structural failure,” Appl. Mech. Rev., 50(10), 593–627, 1997.
[10] A.C. Eringen, “Non-local polar elastic continua,” Int. J. Engrg. Sci., 10, 1–16, 1972.
[11] H.M. Zbib and E.C. Aifantis, “On the gradient-dependent theory of plasticity and shear banding,” Acta Mech., 92, 209–225, 1992.
[12] H.M. Zbib and E.C. Aifantis, “Size effects and length scales in gradient plasticity and dislocation dynamics,” Scripta Mater., 48, 155–160, 2003.
[13] E.C. Aifantis, “Pattern formation in plasticity,” Int. J. Engrg. Sci., 33(15), 2161–2178, 1995.
[14] T.E. Lacy, D.L. McDowell, and R. Talreja, “Gradient concepts for evolution of damage,” Mech. Mater., 31, 831–860, 1999.
[15] S. Nemat-Nasser and M. Hori, Micromechanics: Overall Properties of Heterogeneous Materials, North-Holland, Amsterdam, 1993.
[16] M. Zhou and D.L. McDowell, “Equivalent continuum for dynamically deforming atomistic particle systems,” Phil. Mag. A, 82(13), 2547–2574, 2002.
[17] H. Ziegler, An Introduction to Thermomechanics, In: E. Becker, B. Budiansky, W.T. Koiter, and H.A. Lauwerier (eds.), North Holland Series in Applied Mathematics and Mechanics, 2nd edn., vol. 21, North Holland, Amsterdam, New York, 1983.
[18] G. Lebon, “Fundamentals of nonequilibrium thermodynamics,” In: W. Muschik (ed.), Non-Equilibrium Thermodynamics with Applications to Solids, CISM Courses and Lectures No. 336, International Centre for Mechanical Sciences, Springer-Verlag, New York, pp. 139–204, 1993.
[19] D. Krajcinovic, Damage Mechanics, Elsevier, Amsterdam, 1996.
[20] R.S. Kumar and R. Talreja, “A continuum damage model for linear viscoelastic composite materials,” Mech. Mater., 35(3–6), 463–480, 2003.
[21] J.R. Rice, “Inelastic constitutive relations for solids: an internal variable theory and its application to metal plasticity,” J. Mech. Phys. Sol., 19, 433–455, 1971.
[22] R.J. Asaro, “Crystal plasticity,” J. Appl. Mech., Trans. ASME, 50, 921–934, 1983.
[23] A. Nadai, Theory of Flow and Fracture of Solids, McGraw-Hill, New York, 1963.
[24] A. Needleman and J.R. Rice, “Plastic creep flow effects in the diffusive cavitation of grain boundaries,” Acta Met., 28(10), 1315–1332, 1980.
[25] D. Moldovan, D. Wolf, S.R. Phillpot, and A.J. Haslam, “Role of grain rotation during grain growth in a columnar microstructure by mesoscale simulation,” Acta Mater., 50, 3397–3414, 2002.
[26] F. Cleri, G. D'Agostino, A. Satta, and L. Colombo, “Microstructure evolution from the atomic scale up,” Comput. Mater. Sci., 24, 21–27, 2002.
[27] J.C. Moosbrugger and D.L. McDowell, “A rate dependent bounding surface model with a generalized image point for cyclic nonproportional viscoplasticity,” J. Mech. Phys. Sol., 38(5), 627–656, 1990.
[28] J.-L. Chaboche, “Constitutive equations for cyclic plasticity and cyclic viscoplasticity,” Int. J. Plasticity, 5(3), 247–302, 1989.
[29] D.L. McDowell, “An experimental study of the structure of constitutive equations for nonproportional cyclic plasticity,” J. Engrg. Mater. Techn., Trans. ASME, 107, 307–315, 1985.
[30] P.J. Armstrong and C.O. Frederick, “A mathematical representation of the multiaxial Bauschinger effect,” CEGB Report RD/B/N731, Berkeley Nuclear Laboratories, 1966.
3.7 DUCTILE FRACTURE

M. Zikry
North Carolina State University, Raleigh, NC, USA
Up to this point, we have focused upon mesoscale and macroscale formulations for plasticity. The ISV method of the previous section applies to ductile fracture as well, where it is generally called continuum damage mechanics (CDM). Before proceeding to CDM, however, it is worthwhile to discuss the physical mechanisms and models of ductile fracture. Therefore, in this section we examine another dissipative mechanism, namely ductile fracture. Ductile failure, or rupture, is generally preceded by extensive plastic deformation, and it may occur either through geometrical instabilities associated with the specimen dimensions or through the nucleation, growth, and coalescence of microscopic voids that initiate and propagate at inclusions, second phase particles, or grain boundaries. Since ductile materials are generally selected for their toughness, it is essential to understand how failure initiates and evolves, so that their inherent, relatively high toughness can be exploited in design.
1. Essential Concepts
In ductile crystalline materials, such as aluminum, copper, and steels, fracture initiation can be triggered by plasticity. In a simple tensile test, a ductile homogeneous specimen without pre-existing flaws, with dimensions on the order of centimeters, fails after undergoing extensive necking and plastic deformation in the necked region. As the specimen necks, the stress unloads as a function of strain, as illustrated in Fig. 1. Inelastic or plastic deformation occurs on specific crystalline or glide planes, which are oriented at 45° from the loading axis. Specimen failure can occur as a chisel-point or as a ductile cup-cone fracture (Fig. 1). Specimen necking is a geometrical instability that is usually related to the specimen's aspect ratio, and it is directly related to the accumulation of plastic deformation and failure. Hence, ductile failure is generally distinguished from brittle failure by
Figure 1. Stress unloading as a function of strain, with chisel-point fracture (geometrical instability) and cup-cone fracture.
the extensive inelastic deformation that precedes failure and by the generally higher deformation energies. When ductile fracture surfaces are observed and analyzed on axial metallographic sections, at a scale generally on the order of micrometers, the fracture surface appears dimpled (Fig. 2). Several factors have been shown to influence ductile failure: stress triaxiality, void volume fraction, effective plastic strain, interparticle spacing, and void initiation, growth, and coalescence. Stress triaxiality is the ratio of the mean stress to the equivalent stress. The mean or hydrostatic stress, σm, is the average of the normal stresses,

    σm = (σ1 + σ2 + σ3)/3,    (1)

and the equivalent stress σeq is a scalar measure of the yield stress for the multiaxial von Mises yield criterion, given by

    σeq = (1/√2) [(σ1 − σ2)² + (σ1 − σ3)² + (σ3 − σ2)²]^{1/2},    (2)
where σ1, σ2, and σ3 are the principal normal stresses. Since ductility decreases with increasing stress triaxiality for most metals [1], triaxiality is one measure that has been used to characterize ductile failure. The resulting fracture surface is generally a complex path of coalesced voids running through the material [2]. The failure path for coalescence can develop quickly, on a local scale, in regions that otherwise may appear globally uniform.
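Since the triaxiality measure built from Eqs. (1) and (2) recurs throughout this section, a brief numerical illustration may be useful. The following is a minimal sketch (the function name and numerical values are ours, purely for illustration):

```python
import math

def stress_measures(s1, s2, s3):
    """Mean stress (Eq. (1)), von Mises equivalent stress (Eq. (2)),
    and stress triaxiality sigma_m/sigma_eq from principal stresses."""
    s_m = (s1 + s2 + s3) / 3.0
    s_eq = math.sqrt(((s1 - s2)**2 + (s1 - s3)**2 + (s3 - s2)**2) / 2.0)
    return s_m, s_eq, s_m / s_eq

# Uniaxial tension at 300 MPa: triaxiality is 1/3.
print(stress_measures(300.0, 0.0, 0.0))   # (100.0, 300.0, 0.333...)
```

For uniaxial tension the triaxiality is 1/3; notched or cracked geometries raise it well above this level, which, per Ref. [1], lowers ductility.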
Figure 2. Dimpled failure surface associated with ductile fracture in low-carbon steel.
The dimpled surfaces associated with ductile fracture develop sequentially through void nucleation, growth, and coalescence:
• a stage of void nucleation, which can occur by internal cracking of second phase particles or inclusions or by decohesion at particle–matrix interfaces;
• a stage of void growth up to a critical void size and intervoid spacing ratio;
• a stage of void coalescence, which links the voids into a final path of rupture and subsequent fracture.
These three essential stages are briefly described below, together with some of the approaches that have been used to address each stage.
1.1. Void Nucleation
Microvoid or cavity nucleation at inclusions or second phase particles can proceed in several distinct ways:
• by the fracture of hard non-metallic inclusions; the resulting microcracks can lead to ductile fracture if the stress conditions are not energetically favorable for the propagation of cleavage cracks;
• by the separation of hard or soft particles from the material matrix at a particle–matrix interface by interfacial decohesion. This decohesion occurs when the dissipated energy exceeds the energy required for the formation of new surfaces and a local stress exceeds a critical quantity related to the interfacial strength between the particle and the matrix. Argon et al. [3] proposed one of the first expressions for this critical stress for equiaxed particles,

    k σeq + σm + C (l0/R0) = σc,    (3)
where σm is the mean stress, σeq is the equivalent von Mises stress, k is a geometrical factor, l0 is the distance between particles, R0 is the particle radius, C is a numerical coefficient, and σc is the critical stress. Argon and his collaborators were able to show that particle interaction does not occur for particle volume fractions of less than 1%, and that for particle volume fractions of 10% or greater the onset of decohesion occurs at the beginning of yielding. (A numerical illustration of Eq. (3) is given after this list.)
• A final means of void nucleation is at grain boundaries (triple points), where high stresses can arise from misorientations. However, because the elastic mismatch of particles or inclusions embedded within a ductile matrix is greater than that arising from crystal misorientations, void nucleation from second phase particles or inclusions is the dominant mode.
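The small check below evaluates the left-hand side of Eq. (3) against σc; the constants k, C, and σc used here are placeholders for illustration only, not values from Ref. [3].

```python
def argon_nucleation(s_eq, s_m, k=1.0, C=0.0, l0=1.0, R0=1.0, s_c=1000.0):
    """Evaluate Argon's decohesion criterion, Eq. (3): nucleation is
    predicted when k*s_eq + s_m + C*(l0/R0) reaches the critical stress s_c."""
    lhs = k * s_eq + s_m + C * (l0 / R0)
    return lhs >= s_c, lhs

# Placeholder stresses in MPa; True means decohesion is predicted.
print(argon_nucleation(s_eq=600.0, s_m=450.0))   # (True, 1050.0)
```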
1.2. Void Growth
In the earliest approaches to void growth, void interactions were generally ignored. Typical of these approaches for void geometrical changes is the one developed by McClintock [4], who analyzed the behavior of a periodic array of circular cylindrical voids, with initial radii R0, in a material subjected to plane strain tensile deformation (in the axial z direction), with a von Mises hardening criterion, an associated flow rule, and a power law hardening formulation. He obtained the following expression for the rate of change of the current radius R,

    dR/R = [√3/(2(1 − n))] sinh[(√3(1 − n)/2)(σx + σy)/σeq] dεeq + (dεx + dεy)/2,    (4)

where εx and εy are the transverse strains, εeq is the equivalent strain, n is the strain-hardening coefficient, and σx and σy are the transverse stresses. McClintock [5] obtained a similar expression for a cylinder subjected to triaxial loading conditions. Rice and Tracey [6] also obtained an expression for
the rate of change for a spherical void subjected to a tensile deformation with a remote hydrostatic stress and linear strain hardening as
    dR = (1 + G)[(dε1 + dε2 + dε3)/2] R + D √[(2/3)(dε1² + dε2² + dε3²)] R0,    (5)
where R0 is the initial radius of the sphere, and D and G are constants that depend on the stress state and the strain hardening of the material. Both Eqs. (4) and (5) can be integrated by invoking incompressibility, and expressions can then be obtained for the radial displacements.
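Since Eqs. (4) and (5) are incremental, in practice they are integrated along the strain path. The sketch below integrates the Rice–Tracey form, Eq. (5), for a proportional, incompressible strain history; the values of D and G are hypothetical placeholders, since Ref. [6] gives them as functions of stress state and strain hardening.

```python
import math

def rice_tracey_radius(strain_increments, R0=1.0, D=0.5, G=0.3):
    """Integrate Eq. (5): dR = (1+G)*(de1+de2+de3)/2 * R
    + D*sqrt((2/3)*(de1**2+de2**2+de3**2)) * R0, over a list of
    principal strain increments (de1, de2, de3)."""
    R = R0
    for de1, de2, de3 in strain_increments:
        dilatation = 0.5 * (1.0 + G) * (de1 + de2 + de3) * R
        distortion = D * math.sqrt((2.0 / 3.0) * (de1**2 + de2**2 + de3**2)) * R0
        R += dilatation + distortion
    return R

# 200 uniaxial, incompressible increments: the volumetric term vanishes,
# so void growth here is driven entirely by the D (distortional) term.
steps = [(1e-3, -0.5e-3, -0.5e-3)] * 200
print(rice_tracey_radius(steps))   # 1.1 for these placeholder values
```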
1.3. Void Coalescence
The models developed by McClintock and by Rice and Tracey were for single cylindrical and spherical voids; they did not take void interactions into account, nor did they consider the prediction of ultimate failure. Hence, a separate failure criterion had to be applied to characterize microvoid coalescence. Some of the earliest models to address this were developed by Berg [7] and Gurson [8]. Numerous researchers have extensively applied variations of the Gurson model. In this model, it is assumed that a porous solid material can be approximated as a homogeneous spherical body, or matrix, with a spherical cavity at its center. The effects of voids appear indirectly through their influence on the global flow behavior. The main difference between the Gurson model and classical plasticity is that the Gurson model depends on the hydrostatic stress, while classical plasticity formulations do not. Gurson [8] developed the following potential function,

    Φ = σeq²/σ0² + 2f cosh(3σm/(2σ0)) − (1 + f²) = 0,    (6)
where σ0 is the yield strength of the matrix, σeq is the von Mises equivalent stress, σm is the mean hydrostatic stress, and f is the void volume fraction. When f is zero, this potential function reduces to the classical von Mises yield surface with isotropic hardening. When f is one, the yield surface shrinks to a point and the stress carrying capacity of the material vanishes. However, Eq. (6) greatly overestimates ductility and failure strains, because it does not account for void interactions. Tvergaard [9] attempted to correct this by introducing two parameters, q1 and q2, which he developed through axisymmetric finite-element analyses of a circular cylindrical unit cell. On this basis, he obtained the modified yield condition

    Φ = σeq²/σ0² + 2q1 f cosh(3q2σm/(2σ0)) − (1 + q1²f²) = 0,    (7)
which is identical to the Gurson model if the void fraction is multiplied by a factor and the hydrostatic tension is increased. The values proposed by Tvergaard are q1 = 1.5 and q2 = 1.00. Hence, this modification has the effect of amplifying the hydrostatic stress at all strain levels. Although this model properly describes the overall behavior, it does not account for the rapid drop in stress carrying capacity occurring before final rupture. Thus, Tvergaard and Needleman [10] suggested replacing f in the previous equations with the function f*, where

    f* = f,    for f ≤ fc,    (8a)

    f* = fc + [(fu* − fc)/(ff − fc)] (f − fc),    for f ≥ fc,    (8b)
where the first expression is used when f is less than or equal to the critical value fc and the second when f exceeds fc. Here ff is the void volume fraction at which the stress carrying capacity vanishes, and fu* = 1/q1. As f approaches ff, f* approaches fu* and the material loses all stress carrying capacity. Values for fc and ff were determined numerically from different unit cell computations performed by Tvergaard [9]. He determined that fc varies linearly between 0.04 and 0.12 for an initial void volume fraction varying between 0 and 0.06. It has been postulated by several investigators (see, for example, Ref. [11]) that the microvoid volume fraction increases partly because of the growth of existing microvoids and partly because of the nucleation of new microvoids. This is assumed to occur as

    df = dfgrowth + dfnucleation,    (9)

where the growth term follows from the incompressibility of the matrix, with εii the matrix volumetric strain, as

    dfgrowth = (1 − f) dεii.    (10)

For strain-controlled nucleation, the nucleation term [12] can be given by

    dfnucleation = [fn/(sn√(2π))] exp[−(1/2)((εeq − εN)/sn)²] dεeq,    (11)

where fn is the volume fraction of particles on which voids are formed, εN is the mean critical plastic strain, σN is the stress corresponding to that critical strain, and sn is a standard deviation. A similar expression for stress-controlled nucleation can be obtained by replacing the strain terms with the corresponding stress terms, such as the mean stress and the equivalent stress.
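To make Eqs. (9)–(11) concrete, the sketch below advances the porosity by one strain increment; the nucleation parameters fn, εN, and sn are illustrative values of the kind used in strain-controlled nucleation models [12], not calibrated data.

```python
import math

def porosity_increment(f, deps_kk, eps_eq, deps_eq,
                       f_n=0.04, eps_N=0.3, s_n=0.1):
    """One increment of void volume fraction, Eq. (9):
    growth from matrix incompressibility, Eq. (10), plus
    strain-controlled nucleation, Eq. (11)."""
    df_growth = (1.0 - f) * deps_kk                       # Eq. (10)
    A = (f_n / (s_n * math.sqrt(2.0 * math.pi))) \
        * math.exp(-0.5 * ((eps_eq - eps_N) / s_n) ** 2)  # Eq. (11)
    df_nucleation = A * deps_eq
    return df_growth + df_nucleation                      # Eq. (9)

# Example: dilatant increment at 30% equivalent strain (peak nucleation rate).
print(porosity_increment(f=0.01, deps_kk=1e-4, eps_eq=0.30, deps_eq=1e-3))
```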
These modifications of Gurson's formulation have generally resulted in accurate predictions of the softening response of rate-independent materials with periodic void distributions (c.f., [11]). However, accurate failure strains based on Gurson's constitutive formulation have generally not been obtained for rate-dependent materials with random void distributions. As noted by Magnusen et al. [13, 14], deformation and failure modes that are inherently related to a statistical variation of properties cannot be accurately modeled by a single porosity variable. Furthermore, Gurson's constitutive formulation makes no distinction regarding the nature of the void distribution in the material. Experiments by Magnusen et al. [13] and Becker and Smelser [15] have shown that a material with a random array of voids has less ductility and lower strength than specimens with a periodic array of voids. However, if a random and a periodic arrangement had the same number of voids per unit volume, the porosity parameter in Gurson's formulation would still be the same for both distributions. Nor does Gurson's formulation distinguish void sizes or volumes: a population of small voids and a single void of equal total volume would have the same porosity. Furthermore, most of these investigations have used unit cells (similar to the mesoscale RVEs discussed in the previous section on ISVs) to represent the overall aggregate mechanical response. In unit cell computations, symmetry is usually exploited, so only a symmetric portion of the void surface is analyzed. Interaction effects from nearby voids are usually modeled by imposing a kinematic or static boundary condition on a surface of the unit cell. There are, however, several shortcomings associated with this approach. One difficulty is that all unit cell boundaries have to remain straight, since it is assumed that all voids deform uniformly. This precludes explicitly accounting for specimen instabilities, such as necking at a free boundary. A second shortcoming is that if only one symmetric portion of a void is used, the non-symmetric and irregular geometries associated with void deformation and coalescence cannot be accurately modeled. Experimental studies (c.f., [14]) have also shown that the spacing between voids is a critical factor in void coalescence. In unit cell computations, this spacing, as a function of void interaction, has generally not been accounted for accurately enough to correlate unit cell computations with experimental results (c.f., [11]). Furthermore, as noted by Tvergaard [11], the range of the parameters q1 and q2 used to determine final fracture is strongly dependent on the initial porosity. These parameters are usually obtained for specific loading conditions and geometries, and they may have to be re-determined for different computations.
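Despite these limitations, the modified Gurson model remains widely used; for reference, the sketch below evaluates Eqs. (6)–(8) with Tvergaard's q1 = 1.5 and q2 = 1.0. The values of fc and ff are illustrative placeholders of the magnitude quoted above, which in practice must be fit to unit cell computations.

```python
import math

def f_star(f, q1=1.5, f_c=0.05, f_f=0.25):
    """Effective porosity of Tvergaard and Needleman, Eqs. (8a)-(8b)."""
    f_u = 1.0 / q1                       # porosity at total loss of capacity
    if f <= f_c:
        return f                         # Eq. (8a)
    return f_c + (f_u - f_c) / (f_f - f_c) * (f - f_c)  # Eq. (8b)

def gtn_potential(sig_eq, sig_m, f, sig_0, q1=1.5, q2=1.0):
    """Modified Gurson yield potential, Eq. (7); Phi = 0 on the yield
    surface, Phi < 0 inside. With q1 = q2 = 1 and f* = f it reduces
    to the original Gurson form, Eq. (6)."""
    fs = f_star(f, q1=q1)
    return (sig_eq / sig_0) ** 2 \
        + 2.0 * q1 * fs * math.cosh(1.5 * q2 * sig_m / sig_0) \
        - (1.0 + (q1 * fs) ** 2)

# f = 0 recovers the von Mises surface (Phi = 0 at sig_eq = sig_0):
print(gtn_potential(sig_eq=300.0, sig_m=100.0, f=0.0, sig_0=300.0))   # 0.0
print(gtn_potential(sig_eq=300.0, sig_m=100.0, f=0.02, sig_0=300.0))  # > 0
```

Driving f* toward 1/q1 collapses the surface to a point, reproducing the loss of stress carrying capacity described above.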
1.4. Current Investigations
When ductile solids are deformed sufficiently far into the inelastic regime, a smoothly varying deformation can give way to one involving highly localized
deformation in the form of shear bands. This form of shear strain localization can occur in rocks, polymers, granular materials, and structural metals. Shear strain localization is of considerable practical importance in metal forming, high speed machining, and ballistic impact. The physical mechanisms that can trigger localization are a function of the loading rate. At quasi-static loading rates, geometrical softening (crystal rotation) may be the triggering mechanism, while under dynamic loading conditions thermal softening coupled with geometrical softening may be the triggering mechanism. The use of smooth yield surfaces with increasing strain hardening rates, such as a von Mises surface, to model shear-band formation would not yield localized deformations or stress behavior beyond the initial bifurcation point. Numerous phenomenological approaches have therefore been developed to analyze shear strain localization and post-bifurcation behavior. These models are essentially based on J2 (second invariant of deviatoric stress) theories with a corner or vertex on the yield surface. They were developed [16] to incorporate the vertex formation on yield surfaces exhibited by physical models of polycrystalline aggregates. The yield surface corners are formed by the intersection of several smooth yield surfaces, where each potential or smooth yield function can be cast in terms of the strain rate and work hardening. An angle can be obtained [16] in terms of moduli from deformation theory and linear elasticity in a way that ensures that the surface is convex. These models have been used with some success to model shear band localization (Tvergaard [17]). As analyses of shear-band localization indicate, finite-strain plasticity addresses processes involving large plastic deformations that lead to failure. Macroscale continuum theories, such as the yield vertex models, may not directly address the actual physical micromechanisms that result in permanent inelastic strains, since there is a certain degree of arbitrariness in the formulations. Furthermore, deformation-based theories may not be suitable for non-proportional loading conditions. These issues have been addressed within the context of crystal plasticity, where the deformation kinematics can be related to the atomic structure of the crystalline lattice.
2. Research Trends Pertaining to Ductile Fracture
It should be obvious from the previous sections that ductile fracture is inherently related to material heterogeneities. These heterogeneities can occur due to voids, inclusions, grain boundaries, subgrains, or localized bands. Hence, the current and future research endeavors necessary to advance this area include research to understand and control heterogeneous behavior at different scales in ductile materials and structures. Obviously, there are myriad research avenues to address these
issues. Due to the length constraints of this chapter, only three broad topical areas will be briefly outlined. Several references are relevant to future research trends on this topic (c.f., a review by McDowell [18]).
2.1. Coupling of Large Strain Plasticity with New Failure Models
The notion here is to understand material behavior at scales ranging from the nano to the macro in such a manner as to be able to design new failure-resistant materials and systems. Current efforts in this area include the coupling of immobile and mobile dislocation densities to three dimensional crystal plasticity formulations [19], the development of computational models accounting for grain-boundary effects and dislocation-density evolution effects on fracture, void growth, and coalescence [20, 21], and the effects of discrete dislocations [22] on material response. Others include distinct multi-scale aspects focusing on void nucleation [23] and temperature effects on void coalescence [24].
2.2. Hierarchical Modeling
This area involves the coupling of different computational methods, such as finite-element methods at the continuum level with atomistics and molecular dynamics, to understand and control material behavior and physical mechanisms [25], so that the appropriate computational tool can be used at the relevant spatial and temporal scale. This will require efficient parallel computations based on new algorithms, and new constitutive models that incorporate the dominant material features and failure mechanisms relevant to each physical scale.
2.3. Progressive Failure Surface Creation and Propagation
This area concerns the development of new models that describe the creation of failure surfaces and their progression. Most failure models are based on continuum models with pre-existing defects, such as voids and cracks. The ability to develop predictive tools based on the initiation and growth of defects at different scales, which are not tied to a specific computational approach, is a gap that, if addressed, can result in new and revolutionary engineering applications and devices.
3. Summary
This chapter has mainly been devoted to an introduction to ductile failure at the macroscopic scale. It has been seen that, as more complex problems such as shear strain localization are investigated, new models based on crystal plasticity have been invaluable in providing new physical insights. Current and future research investigations are focused on accounting for failure initiation and growth at scales ranging from the nano to the macro. If these investigations are successful, new materials and structures can be designed and tailored as needed from the lowest scales up. However, this can only be achieved if modeling efforts succeed in developing physically based, validated tools that can be utilized for material and system design. It should also be underscored that even though models at the micro to nano levels are invaluable, these approaches have to be linked to continuum level models that are more attuned to system and structural design.
References
[1] J.W. Hancock and A.C. Mackenzie, “On the mechanisms of ductile failure in high strength steels subjected to multi-axial stress states,” J. Mech. Phys. Sol., 24, 147–169, 1976.
[2] S.H. Goods and L.M. Brown, “The nucleation of cavities by plastic deformation,” Acta Metall., 27, 1–15, 1979.
[3] A.S. Argon, J. Im, and R. Safoglu, “Cavity formation from inclusions in ductile fracture,” Met. Trans., 6A, 825–837, 1975.
[4] F.A. McClintock, “A criterion for ductile fracture by the growth of holes,” J. Appl. Mech., 35, 363–371, 1968.
[5] F.A. McClintock, “Plasticity aspects of fracture,” In: H. Liebowitz (ed.), Fracture: An Advanced Treatise, vol. 3, Academic Press, New York, pp. 47–225, 1971.
[6] J.R. Rice and D.M. Tracey, “On the ductile enlargement of voids in triaxial stress fields,” J. Mech. Phys. Solids, 17, 201–217, 1969.
[7] C.A. Berg, “Plastic dilation and void interaction,” In: M.F. Kanninen (ed.), Inelastic Behaviour of Solids, vol. 3, McGraw Hill, New York, pp. 171–210, 1970.
[8] A.L. Gurson, “Continuum theory of ductile rupture by void nucleation and growth: part I. Yield criteria and flow rules for porous ductile media,” J. Eng. Mater. Technol., 99, 2–15, 1977.
[9] V. Tvergaard, “On localization in ductile materials containing spherical voids,” Int. J. Fract., 18, 237–252, 1982.
[10] V. Tvergaard and A. Needleman, “Analysis of the cup-cone fracture in a round tensile bar,” Acta Metall., 32, 157–169, 1984.
[11] V. Tvergaard, “Material failure by void growth to coalescence,” Adv. Appl. Mech., 27, 83–151, 1990.
[12] C.C. Chu and A. Needleman, “Void nucleation effects in biaxially stretched sheets,” J. Eng. Mater. Technol., 102, 249–256, 1980.
[13] P.E. Magnusen, E.M. Dubensky, and D.A. Koss, “The effect of void arrays on void linking during ductile fracture,” Acta Metall., 36, 1503–1509, 1988.
[14] P.E. Magnusen, D.J. Srolovitz, and D.A. Koss, “A simulation of void linking during ductile microvoid fracture,” Acta Metall., 38, 1013–1022, 1990.
[15] R. Becker and R.E. Smelser, “Simulation of strain localization and fracture between holes in an aluminum sheet,” J. Mech. Phys. Solids, 42, 773–796, 1994.
[16] J. Christoffersen and J.W. Hutchinson, “A class of phenomenological corner theories of plasticity,” J. Mech. Phys. Solids, 27, 465–487, 1979.
[17] V. Tvergaard, “Influence of voids on shear band instabilities under plane strain conditions,” Int. J. Fract., 17(4), 389–407, 1981.
[18] D.L. McDowell, “Modeling and experiments in plasticity,” Int. J. Solids Struct., 37, 293–309, 2000.
[19] T. Kameda and M.A. Zikry, “Three dimensional high strain rate failure evolution and triple junction grain-boundary effects in intermetallics,” Mech. Mater., 28, 93–102, 1998.
[20] W.M. Ashmawi and M.A. Zikry, “Prediction of grain-boundary interfacial effects and mechanisms in crystalline systems,” J. Eng. Mater. Technol., 124, 88–95, 2002.
[21] W.M. Ashmawi and M.A. Zikry, “Void morphology and grain-boundary effects in crystalline materials,” Mater. Sci. Eng. A, 343, 126–142, 2003.
[22] H.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40(2–3), 113–127, 1997.
[23] M.F. Horstemeyer, M. Negrete, and S. Ramaswamy, “Using a micromechanical finite element parametric study to motivate a phenomenological macroscale model for void/crack nucleation in aluminum with a hard second phase,” Mech. Mater., 35, 675–687, 2003.
[24] M.F. Horstemeyer and S. Ramaswamy, “On factors affecting localization and void growth in ductile metals: a parametric study,” Int. J. Damage Mech., 9, 6–28, 2000.
[25] E.B. Tadmor, M. Ortiz, and R. Phillips, “Quasicontinuum analysis of defects in solids,” Phil. Mag. A, 73(6), 1529–1551, 1996.
3.8 CONTINUUM DAMAGE MECHANICS

G.Z. Voyiadjis
Louisiana State University, Baton Rouge, LA, USA
Continuum Damage Mechanics (CDM) can be thought of as a subset of the ISV theory described earlier. Introduced by Kachanov [1] and modified somewhat by Rabotnov [2], it has now reached a stage at which practical engineering problems can be solved. In contrast to fracture mechanics, which considers the process of initiation and growth of microcracks as a discontinuous phenomenon, continuum damage mechanics uses a continuous variable, φ, related to the density of these defects, to describe the deterioration of the material before the initiation of macrocracks. Based on the damage variable φ, constitutive equations of evolution are developed to predict the initiation of macrocracks for different types of phenomena. Lemaitre [3] and Chaboche [4] used it to solve different types of fatigue problems. Leckie and Hayhurst [5], Hult [6], and Lemaitre and Chaboche [7] used it to solve creep and creep-fatigue interaction problems. It was also used by Lemaitre for ductile plastic fracture [8–10] and for a number of other applications [11]. The damage variable, based on the effective stress concept, represents average material degradation, reflecting the various types of damage at the mesoscale level such as the nucleation and growth of voids, cavities, microcracks, and other microscopic defects. For the case of isotropic damage, the damage variable is a scalar and the evolution equations are easy to handle. Lemaitre [11] argued that the assumption of isotropic damage is sufficient to give accurate predictions of the load carrying capacity and the number of cycles or the time to local failure in structural components. An extension of this theory is the incorporation of anisotropic damage and plasticity, which has been experimentally confirmed [12–15] even when the virgin material is isotropic. This has prompted several researchers to investigate the general case of anisotropic damage. The theory of anisotropic damage mechanics was developed by Sidoroff [16], Cordebois and Sidoroff [17], and Cordebois [18], and later used by Lee et al. [15], Chow and Wang [13, 14, 19], and Voyiadjis and Park [20]
to solve simple ductile fracture problems. Prior to these developments, Krajcinovic and Fonseka [21], Murakami and Ohno [22], Murakami [23], and Krajcinovic [24] investigated brittle and creep fracture using appropriate anisotropic damage models. Although these models are based on a sound physical background, they lack rigorous mathematical justification and mechanical consistency. Consequently, more work is needed to develop a more involved theory capable of producing results that can be used for practical applications [21–25]. In the general case of anisotropic damage, the damage variable has been shown to be tensorial in nature [22–26]. This damage tensor was shown to be an irreducible even-rank tensor [27]. Several other basic properties of the damage tensor have been outlined by Betten [28, 29] in a rigorous mathematical treatment using the theory of tensor functions. Lemaitre [30] summarized the work done over the last fifteen years to describe crack behavior using the theory of continuum damage mechanics. Lemaitre and Dufailly [31] also described eight different experimental methods (direct and indirect) for measuring damage according to the effective stress concept [32]. Chaboche [33–35] described different definitions of the damage variable based on indirect measurement procedures. Examples are damage variables based on the remaining life, on the microstructure, and on several physical parameters such as density change, resistivity change, acoustic emissions, the change in the fatigue limit, and the change in mechanical behavior through the concept of effective stress.
1. Damage in Metals Due to Uniaxial Loads
We now turn to the various assumptions and the equivalence principle of CDM. This is followed by the derivation of the damage evolution equations. Finally, a new section is added on the separation of damage due to cracks and voids in metals. All the theory and derivations are based on the uniaxial tension test. Therefore, isotropic damage is assumed and all the equations employ scalar variables.
2. Principles of Continuum Damage Mechanics
The limitations of classical fracture mechanics have been outlined recently by Lemaitre [30]. Parameters like the J-integral and the crack opening displacement (COD) are difficult to use in cases of large strain plasticity, time-dependent behavior, crack evolution under non-proportional loading, and delamination of composites. Murakami [36] indicated that a proper understanding and corresponding mechanical description of the damage progression arising from internal
defects is vital. A systematic approach to these problems of distributed defects can be provided by CDM. The fundamental notion of this theory is to represent the damage state of materials, characterized by distributed cavities, in terms of appropriate internal state variables and their rate equations. Lemaitre [11] indicated that damage in metals is mainly the process of the initiation and growth of microcracks and cavities. At the mesoscale, the phenomenon is discontinuous. At the macroscale, however, the damage variable smears out the response in a continuous, smooth fashion and is written in terms of stress or strain. This function can still predict the initiation and growth of damage, but in a macroscopic sense. These constitutive equations have been formulated in the framework of thermodynamics and identified for many phenomena: dissipation and low-cycle fatigue in metals [3], coupling between damage and creep [5, 6], high-cycle fatigue [4], creep-fatigue interaction [7], and ductile plastic damage [8, 37, 38]. In CDM, a crack is considered to be a zone (process zone) of high gradients of rigidity and strength that has reached critical damage conditions. Thus, a major advantage of CDM is that it utilizes a local approach and introduces a continuous damage variable in the process zone, while classical fracture mechanics uses more global concepts like the J-integral and COD. Kachanov [1] introduced the idea of damage in the framework of continuum mechanics. In a damaged body, consider a volume element at the macroscale that is of a size large enough to contain many defects and small enough to be considered as a material point of a continuum. For the case of isotropic damage, and using the concept of effective stress (because of its suitability for continuum mechanics), the damage variable φ is defined as a scalar in the following manner,

    φ = (A − Ā)/A,    (1)
where Ā is the effective (net) resisting area corresponding to the damaged area A. The effective area Ā is obtained from A by removing the surface intersections of the microcracks and cavities and correcting for the micro-stress concentrations in the vicinity of discontinuities and for the interactions between closed defects. The expression given in Eq. (1) implies that φ = 0 corresponds to the undamaged state, while φ = φcr is a critical value corresponding to the rupture of the element into two parts. According to Lemaitre [11], the critical value of the damage variable lies in the range 0.2 ≤ φcr ≤ 0.8 for metals. In general, the theoretical value of φ should satisfy 0 ≤ φ ≤ 1. Equation (1) can be rewritten in a more suitable form as follows,

    Ā = (1 − φ) A.    (2)
The cross-sectional areas A and Ā are shown in Fig. 1 on a cylindrical material element in the damaged and effective states, respectively.
Figure 1. Isotropic damage in uniaxial tension (concept of effective stress): the damaged state (area A, 0 ≤ φ ≤ 1) and the equivalent fictitious undamaged state (area Ā, φ = 0), both carrying the force T = σA = σ̄Ā.
3. Assumptions and the Equivalence Hypothesis
Stress, energy, or strain equivalence can be used in CDM. When the hypothesis of strain equivalence [9, 11] is not assumed, the effective resisting area Ā can be calculated through mathematical homogenization techniques [39], but the shape and size of the defects must then be known, which is difficult even with an electron microscope. To avoid this difficulty, the hypothesis of strain equivalence is made [40]. This hypothesis states that “every strain behavior of a damaged material is represented by constitutive equations of the undamaged material in the potential of which the stress is simply replaced by the effective stress.” The effective stress σ̄ is defined as the stress in the effective (undamaged) state. Considering Fig. 1, the effective stress σ̄ can be obtained from Eq. (2) by equating the force T = σA acting on the damaged area A with the force T = σ̄Ā acting on the hypothetical undamaged area Ā, i.e.,

    σA = σ̄Ā,    (3)
where σ is the Cauchy stress acting on the damaged area A. From Eqs. (2) and (3), we obtain the following expression for the effective Cauchy stress σ̄,

    σ̄ = σ/(1 − φ).    (4)
The effective stress σ̄ can be considered as a fictitious stress acting on the undamaged equivalent (fictitious) area Ā (the net resisting area). For the uniaxial tension case shown in Fig. 1, the constitutive relation of Hooke's law of linear elasticity is given by

    σ = Eε,    (5)
where ε is the strain and E is the modulus of elasticity (Young's modulus). The same linear elastic constitutive relation applies to the effective (undamaged) state, i.e.,

    σ̄ = Ē ε̄,    (6)

where ε̄ and Ē are the effective counterparts of ε and E, respectively. Next, we derive the necessary transformation equations between the damaged and the hypothetical undamaged states of the material. Two assumptions are incorporated in the derivation: (1) the elastic deformations are small (infinitesimal) compared with the plastic deformations (finite), and (2) there exists an elastic strain energy scalar function U. This function is assumed based on the linear relation between the Cauchy stress σ and the engineering elastic strain ε given by Eq. (5). The elastic strain energy function U is defined by

    U = (1/2) σε.    (7)
It is clear from Eqs. (5) and (7) that σ = dU/dε and ε = dU/dσ. Sidoroff [16] proposed the hypothesis of elastic energy equivalence. This hypothesis assumes that “the elastic energy for a damaged material is equivalent in form to that of the undamaged (effective) material, except that the stress is replaced by the effective stress in the energy formulation.” Thus, according to this hypothesis, the elastic strain energy U = (1/2)σε is equated to the effective elastic strain energy Ū = (1/2)σ̄ε̄ as follows,

    (1/2) σε = (1/2) σ̄ε̄.    (8)
Substituting Eq. (4) into Eq. (8) and simplifying, we obtain the following relation between the strain ε and the effective strain ε̄,

    ε̄ = (1 − φ)ε.    (9)
Continuing further, we substitute Eqs. (4) and (9) into Eq. (6), simplify the result, and compare it with Eq. (5) to obtain

    E = Ē (1 − φ)².    (10)

Equation (10) represents the transformation law for the modulus of elasticity. It is now clear that Young's modulus of the damaged material depends on the value of the damage variable φ. Solving Eq. (10) for φ, one obtains

    φ = 1 − √(E/Ē).    (11)
Once the values of Ē are measured experimentally, one can use Eq. (11) to obtain values of the damage variable φ. It should be noted that the value of Ē is constant for the effective (undamaged) material.
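Equation (11) underlies the common stiffness-degradation method of measuring damage. A minimal sketch (the function and variable names are ours) converting measured moduli into φ and then applying Eq. (4):

```python
import math

def damage_from_moduli(E_damaged, E_virgin):
    """Damage variable from Eq. (11): phi = 1 - sqrt(E / E_bar)."""
    return 1.0 - math.sqrt(E_damaged / E_virgin)

def effective_stress(sigma, phi):
    """Effective stress, Eq. (4): the stress carried by the net area."""
    return sigma / (1.0 - phi)

# Unloading slope dropped from 200 GPa (virgin) to 162 GPa (damaged):
phi = damage_from_moduli(162.0, 200.0)
print(phi)                           # 0.1, i.e., 10% damage
print(effective_stress(300.0, phi))  # ~333 MPa on the net resisting area
```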
4. Damage Evolution
There are several approaches in the literature on damage evolution and the proper form of the kinetic equation of the damage variable. Kachanov [32] proposed an evolution equation of damage based on a power law with two independent material constants. However, the resulting kinetic equation for the damage variable is complicated and difficult to solve. Therefore, a more rational approach based on energy considerations is presented here. The approach depends on the introduction of a damage strengthening criterion in terms of a scalar function g, and a generalized thermodynamic force that corresponds to the damage variable φ [9, 15]. Substituting Eqs. (6) and (9) into the right-hand side of Eq. (8), we obtain the elastic strain energy U in the damaged state of the material as follows,

    U = (1/2) Ē (1 − φ)² ε²,    (12)
in which Ē is constant; therefore, the incremental elastic strain energy dU is obtained by differentiating Eq. (12),

    dU = Ē(1 − φ)² ε dε − Ē(1 − φ) ε² dφ.    (13)
The generalized thermodynamic force y associated with the damage variable φ is thus defined by

    y ≡ ∂U/∂φ = −Ē(1 − φ) ε².    (14)
Let g(y, L) be the damage function (criterion) as proposed by Lee et al. [15], where L ≡ L(l) is a damage strengthening parameter which is a function of the “overall” damage parameter l. For this problem, the scalar function g takes the following form,

    g(y, L) = (1/2) y² − L(l) ≡ 0.    (15)
The damage strengthening criterion defined by Eq. (15) is similar to the von Mises yield criterion in the theory of plasticity. In order to derive a normality rule for the evolution of damage, we first start with the power of dissipation Π, which is given by

    Π = −y dφ − L dl,    (16)
where the “d” in front of a variable indicates an incremental quantity. The problem is to extremize Π subject to the condition g = 0. Using the mathematical theory of functions of several variables, we introduce the Lagrange multiplier dλ and form the objective function Ψ(y, L) such that

    Ψ = Π − dλ · g.    (17)
The problem now reduces to extremizing the function Ψ. For this purpose, the two necessary conditions are ∂Ψ/∂y = 0 and ∂Ψ/∂L = 0. Using these conditions, along with Eqs. (16) and (17), one obtains

    dφ = −dλ ∂g/∂y,    (18a)

    dl = −dλ ∂g/∂L.    (18b)

Substituting for g from Eq. (15) into Eq. (18b), one concludes directly that dλ = dl. Substituting this into Eq. (18a), along with Eq. (15), we obtain

    dφ = −dλ · y.    (19)
In order to solve the differential Eq. (19), we must first find an expression for the Lagrange multiplier dλ. This can be done by invoking the consistency condition dg = 0. Applying this condition to Eq. (15), we obtain

    (∂g/∂y) dy + (∂g/∂L) dL = 0.    (20)
Substituting for ∂g/∂y and ∂g/∂L from Eq. (15) and for dL = (∂L/∂l) dl from the chain rule of differentiation, and solving for dl, we obtain

    dl = dλ = y dy / (∂L/∂l).    (21)
Substituting the above expression for dλ into Eq. (19), we obtain the kinetic (evolution) equation of damage,

    (∂L/∂l) dφ = −y² dy,    (22)

with the initial condition that φ = 0 when y = 0. The solution of Eq. (22) depends on the form of the function L(l). For simplicity, we may consider a linear function of the form L(l) = cl + d, where c and d are constants. The equivalent damage strengthening parameter can be analogously expressed as √(dl · dl), or simply dl, giving a linear function of l as discussed above. Substituting this into Eq. (22) and integrating, we obtain the following relation
between the damage variable φ and its associated generalized thermodynamic force y,

    φ = −y³/(3c).    (23)
The above relation is shown graphically in Fig. 2, where it is clear that φ is a monotonically increasing function of y. Next, we investigate the strain-damage relationship. Differentiating the expression for y in Eq. (14), we obtain

    dy = Ēε [ε dφ − 2 dε (1 − φ)].    (24)
Substituting the expressions for y and dy of Eqs. (14) and (24), respectively, into Eq. (22), we obtain the strain-damage differential equation [38]

    (∂L/∂l) dφ = Ē³ ε⁵ (1 − φ)² [2 dε (1 − φ) − ε dφ].    (25)
The above differential equation can be solved easily by the simple change of variables x = ε²(1 − φ), noting that the right-hand side of Eq. (25) is nothing but Ē³ x² dx. Performing the integration with the initial condition that φ = 0 when ε = 0, along with the linear expression for L(l), we obtain

    φ/(1 − φ)³ = (Ē³/(3c)) ε⁶.    (26)
Figure 2. Relation between the overall damage variable φ and its associated generalized thermodynamic force y (the cubic relation of Eq. (23)).
One should note that an initial condition involving an initial damage variable φ°, i.e., φ = φ° when ε = 0, could also have been used. In addition, the strain-damage relation of Eq. (26) could easily have been obtained by substituting the expression for y of Eq. (14) directly into Eq. (23). However, it is preferable to derive it directly from the strain-damage differential Eq. (25) without the use of the generalized thermodynamic force y.
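Because Eq. (26) is implicit in φ, a numerical solution is convenient in practice. The sketch below solves φ/(1 − φ)³ = Ē³ε⁶/(3c) by bisection; the values of Ē and c are placeholders of convenient magnitude, not calibrated material constants.

```python
def damage_from_strain(eps, E_bar=200e3, c=4e8):
    """Solve Eq. (26), phi/(1-phi)**3 = E_bar**3 * eps**6 / (3*c), by
    bisection; the left-hand side increases monotonically from 0 (phi = 0)
    toward infinity as phi -> 1, so a unique root always exists."""
    rhs = (E_bar ** 3) * (eps ** 6) / (3.0 * c)
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - mid) ** 3 < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Damage grows monotonically (and very nonlinearly, as eps**6) with strain:
for eps in (0.01, 0.02, 0.05):
    print(eps, damage_from_strain(eps))
```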
References
[1] L.M. Kachanov, “On the creep fracture time,” Izv. Akad. Nauk USSR, Otd. Tekh., 8, 26–31 (in Russian), 1958.
[2] Y.N. Rabotnov, Creep Problems of Structural Members, North-Holland, Amsterdam, 1969.
[3] J. Lemaitre, “Evaluation of dissipation and damage in metals submitted to dynamic loading,” ICM 1, Kyoto, Japan, 1971.
[4] J.L. Chaboche, “Une loi différentielle d’endommagement de fatigue avec cumulation non-linéaire,” Revue Française de Mécanique, No. 50–51 (in French), 1974.
[5] F.A. Leckie and D. Hayhurst, “Creep rupture of structures,” Proceedings of the Royal Society, London, A340, 323–347, 1974.
[6] J. Hult, “Creep in continua and structures,” In: Zeman and Ziegler (eds.), Topics in Applied Continuum Mechanics, Springer, New York, p. 137, 1974.
[7] J. Lemaitre and J.L. Chaboche, “A nonlinear model of creep fatigue cumulation and interaction,” Proceedings of the IUTAM Symposium on Mechanics of Viscoelastic Media and Bodies, pp. 291–301, 1975.
[8] J. Lemaitre and J. Dufailly, “Modélisation et identification de l’endommagement plastique des métaux,” 2ème Congrès Français de Mécanique, Grenoble, 1977.
[9] J. Lemaitre, “A continuous damage mechanics model for ductile fracture,” J. Eng. Mater. Technol., 107, 83–89, 1985.
[10] G.Z. Voyiadjis, “Model of inelastic behavior coupled to damage,” In: J. Lemaitre (ed.), Handbook of Materials Behavior Models, Chapter 9, Section 9.4, Academic Press, New York, pp. 814–820, 2001.
[11] J. Lemaitre, “How to use damage mechanics,” Nucl. Eng. Des., 80, 233–245, 1984.
[12] D.R. Hayhurst, “Creep rupture under multiaxial states of stress,” J. Mech. Phys. Solids, 20, 381–390, 1972.
[13] C.L. Chow and J. Wang, “An anisotropic theory of continuum damage mechanics for ductile fracture,” Eng. Fract. Mech., 27, 547–558, 1987.
[14] C.L. Chow and J. Wang, “An anisotropic theory of elasticity for continuum damage mechanics,” Int. J. Fract., 33, 3–16, 1987.
[15] H. Lee, K. Peng, and J. Wang, “An anisotropic damage criterion for deformation instability and its application to forming limit analysis of metal plates,” Eng. Fract. Mech., 21, 1031–1054, 1985.
[16] F. Sidoroff, “Description of anisotropic damage application to elasticity,” In: IUTAM Colloquium on Physical Nonlinearities in Structural Analysis, Springer-Verlag, Berlin, pp. 237–244, 1981.
[17] J.P. Cordebois and F. Sidoroff, “Damage induced elastic anisotropy,” Colloque Euromech 115, Villard de Lans, 1979.
[18] J.P. Cordebois, “Critères d’instabilité plastique et endommagement ductile en grandes déformations,” Thèse de Doctorat, Université Pierre et Marie Curie, 1983.
[19] C.L. Chow and J. Wang, “Ductile fracture characterization with an anisotropic continuum damage theory,” Eng. Fract. Mech., 30, 547–563, 1988.
[20] G.Z. Voyiadjis and T. Park, “Anisotropic damage for the characterization of the onset of macrocrack initiation in metals,” Int. J. Damage Mech., 5(1), 68–92, 1996.
[21] D. Krajcinovic and G.U. Fonseka, “The continuum damage theory for brittle materials,” J. Appl. Mech., 48, 809–824, 1981.
[22] S. Murakami and N. Ohno, “A continuum theory of creep and creep damage,” In: Proceedings of the Third IUTAM Symposium on Creep in Structures, Springer, Berlin, pp. 422–444, 1981.
[23] S. Murakami, “Notion of continuum damage mechanics and its application to anisotropic creep damage theory,” J. Eng. Mater. Technol., 105, 99–105, 1983.
[24] D. Krajcinovic, “Constitutive equations for damaging materials,” J. Appl. Mech., 50, 355–360, 1983.
[25] E. Krempl, “On the identification problem in materials deformation modeling,” Euromech 147 on Damage Mechanics, Cachan, France, 1981.
[26] F.A. Leckie and E.T. Onat, “Tensorial nature of damage measuring internal variables,” In: IUTAM Colloquium on Physical Nonlinearities in Structural Analysis, Springer-Verlag, Berlin, pp. 140–155, 1981.
[27] E.T. Onat and F.A. Leckie, “Representation of mechanical behavior in the presence of changing internal structure,” J. Appl. Mech., 55, 1–10, 1988.
[28] J. Betten, “Damage tensors in continuum mechanics,” J. Mécanique Théorique et Appliquée, 2, 13–32; presented at Euromech Colloquium 147 on Damage Mechanics, Paris-VI, Cachan, 22 September 1981.
[29] J. Betten, “Applications of tensor functions to the formulation of constitutive equations involving damage and initial anisotropy,” Eng. Fract. Mech., 25, 573–584, 1986.
[30] J. Lemaitre, “Local approach of fracture,” Eng. Fract. Mech., 25(5/6), 523–537, 1986.
[31] J. Lemaitre and J. Dufailly, “Damage measurements,” Eng. Fract. Mech., 28(5/6), 643–661, 1987.
[32] L.M. Kachanov, Introduction to Continuum Damage Mechanics, Martinus Nijhoff Publishers, The Netherlands, 1986.
[33] J.L. Chaboche, “Continuum damage mechanics: present state and future trends,” International Seminar on Local Approach of Fracture, Moret-sur-Loing, France, 1986.
[34] J.L. Chaboche, “Continuum damage mechanics: part I – general concepts,” J. Appl. Mech., 55, 59–64, 1988a.
[35] J.L. Chaboche, “Continuum damage mechanics: part II – damage growth, crack initiation and crack growth,” J. Appl. Mech., 55, 65–72, 1988b.
[36] S. Murakami, “Mechanical modeling of material damage,” J. Appl. Mech., 55, 280–286, 1988.
[37] G.Z. Voyiadjis and P.I. Kattan, “A plasticity-damage theory for large deformation of solids, part I: theoretical formulation,” Int. J. Eng. Sci., 30(9), 1089–1108, 1992.
[38] G.Z. Voyiadjis and P.I. Kattan, Advances in Damage Mechanics: Metals and Metal Matrix Composites, Elsevier, Oxford, 542 pp., 1999.
[39] P. Suquet, “Plasticité et homogénéisation,” Thèse d’Etat, Université Paris 6, 1982.
[40] J. Lemaitre and J.L. Chaboche, “Aspect phénoménologique de la rupture par endommagement,” J. Méc. Appl., 2, 317–365, 1978.
3.9 MICROSTRUCTURE-SENSITIVE COMPUTATIONAL FATIGUE ANALYSIS

D.L. McDowell
Georgia Institute of Technology, Atlanta, GA, USA
Previous sections focused on plasticity and on damage formation and evolution under monotonic loading. Under cyclic loading, fatigue failure is a significant consideration. Historically, simple macroscopic fatigue correlations have proven quite useful in estimating the fatigue crack initiation life of metallic components, based on measured or calculated stresses and strains at notches in components. The application of stress-based criteria for high cycle fatigue (HCF) or plastic strain-based criteria for low cycle fatigue (LCF) is typically based on the transfer of results from tests on relatively small scale notched and unnotched laboratory specimens to larger components (cf. [1]). At the macroscale, fatigue of ductile materials has many common characteristics among alloy systems, leading to the utility of strain-life criteria. At the level of the microstructure, fatigue is a complex, cycle-dependent process that differs in detail from one alloy system to the next. Often it is desired to understand the mean fatigue resistance and the scatter in fatigue as a function of microstructure, in order to tailor microstructure to improve component level fatigue resistance. To this end, extension of fatigue analysis methods to microstructures is necessary as a means to augment and reduce the number of required experiments. The process of fatigue failure under constant amplitude loading typically includes several stages:
a. cyclic plastic deformation, with formation of stable cyclic dislocation substructures that control the intensity of notch root cyclic plasticity;
b. formation of crack embryos at the interface of regions of intensely localized shear (so-called persistent slip bands) and the surrounding matrix, typically referred to as “nucleation”;
c. sharpening of the crack front and onset of propagation, assisted by slip irreversibility and damage mechanisms ahead of the tip;
d. propagation beyond regions of stress concentration influenced by the notch root(s); and
e. propagation to specimen/component failure.
Cyclic plastic deformation behavior depends on a number of factors, including, among others:
• Slip planarity (affected by stacking fault energy and solid solution strengthening);
• Sizes and arrangement of multiple phases;
• Dendrite cell size (DCS), otherwise known as secondary dendrite arm spacing, and interdendritic morphology in cast alloys;
• Grain size, orientation and misorientation distribution, and morphology in wrought alloys; and
• Larger incoherent precipitates or inclusions.
The reader is referred to an excellent review of microstructural fatigue mechanisms in the monograph by Suresh [2]. Historically, much work has focused on observations of crack nucleation (incipient crack formation) under LCF loading conditions; this topic has received the attention of recent modeling efforts [3, 4]. While there is much fundamental work to be done to understand fatigue crack formation in broad classes of engineering metals, much progress has been made in developing engineering methods and guidelines for fatigue life estimation and design. The resulting idealizations based on pure metals, both experimental and theoretical, tend to support the empirical Coffin–Manson power law relation (cf. [1, 2]) for the number of cycles to crack nucleation, i.e.,

    Δεp/2 = A (2Nnuc)^c,    (1)
where Δεp is the range of applied plastic strain over some representative volume of material, and A and c are constants. Such studies are indeed indicative of a still-developing science base for understanding fatigue processes at increasingly fine scales and levels of detail. In practical alloy systems, however, the nucleation regime is often either bypassed or coupled with debonding or fracture of interfaces between inclusions and matrix or at grain boundaries, or with crack formation at existing surface scratches, machining marks, or near-surface pores or inclusions. The problem then focuses on the formation of small cracks at micronotches that subsequently propagate as microstructurally small cracks. Eventually, these cracks grow until they are sufficiently long compared to microstructure scales to justify the assumption of propagation in a homogeneous material; for many alloy systems under low amplitude HCF conditions, such cracks must be 500–1000 µm long.
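Inverting the Coffin–Manson relation, Eq. (1), for life is straightforward; the constants below are placeholders of representative magnitude (with c negative for metals), not fitted values from Refs. [1, 2].

```python
def cycles_to_nucleation(eps_p_amp, A=0.5, c=-0.6):
    """Invert Eq. (1): eps_p_amp = A*(2*N)**c, where eps_p_amp is the
    plastic strain amplitude (half the plastic strain range)."""
    return 0.5 * (eps_p_amp / A) ** (1.0 / c)

# Halving the plastic strain amplitude multiplies life by 2**(1/0.6) ~ 3.2:
print(cycles_to_nucleation(1e-3))   # ~1.6e4 cycles
print(cycles_to_nucleation(5e-4))   # ~5.0e4 cycles
```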
In fatigue life estimation for structural components, notches play a central role in concentrating local stress and cyclic plastic strain. A rather conventional, state-of-the-art engineering approach to fatigue life estimation for components is summarized as follows [1]:
I. Conduct laboratory experiments or canvass fatigue data on the alloy system of interest to determine the constants in the strain-life relation.
II. Develop the load history profile for mission/duty cycles.
III. Perform notch analysis based either on Neuber analysis or on finite element analysis, and account for any notch size effects.
IV. Apply strain-life relations to estimate the fatigue crack initiation life, including effects of multiaxial stress–strain states as appropriate (cf. [5, 6]), based on some cumulative damage analysis, to estimate a micronotch root crack of transition crack length (cf. [7, 8]) or some characteristic dimension on the order of the notch root radius [9].
V. Apply LEFM-based crack propagation analysis to estimate the crack propagation life, accounting for crack closure and load history effects as appropriate.
In this algorithm, the definition of fatigue crack initiation typically corresponds to a crack of length 500 µm to 1 mm in the unnotched specimens used to determine the constants in strain-life relations. An alternative approach used for fatigue-critical components is to ignore the initiation life and consider only propagation (the so-called defect-tolerant approach), assuming that cracks of certain lengths pre-exist due to processing and/or handling. The goal is then to determine the remaining propagation life based on measured initial crack size distributions or backward extrapolated values inferred from propagation relations. This approach is more realistic for the LCF regime (shorter lives), in which crack propagation life dominates the number of cycles required for the formation of small cracks at the notch root; however, it may be too conservative for most applications in the HCF regime, since the formation of small cracks and propagation within the microstructure may consume a significant fraction of the total fatigue lifetime.
1. Crack Formation in Microstructure-based Fatigue at Microstructural Notches
It is well known that the formation of fatigue cracks in metals is dominantly related to the cyclic plastic shear strain (slip) range within the microstructure, as well as to the local stress or strain state (cf. [10]). The magnitude of cyclic plastic strain within a heterogeneous microstructure varies spatially. Moreover, certain aspects of the microstructure dominate fatigue failure processes by virtue of lower resistance to crack formation, often associated with
enhanced localization of slip. In practical wrought alloy systems, it is often the case that second phase inclusions or impurities control the fatigue resistance of a given microstructure, with significant effects of grain size, hardness, and other basic characteristics. In cast alloys, large particles and pores either within or near interdendritic boundaries often control fatigue resistance, with additional influence of secondary dendrite arm spacing, grain size, and distributed microporosity. To account for the variation in fatigue behavior with the variation of microstructure in actual materials, it is useful to conduct deterministic analyses of a range of crack-starting defects or inclusions. Then, using the measured distributions of such inclusions, it is possible to quantify the statistical distribution of fatigue responses, within the premises of certain simplified modeling assumptions. Such an approach is much more satisfying for purposes of microstructure selection or design aimed at improving mean fatigue resistance as well as quantifying variability. In many industrial applications, minimum life design methods are employed for fatigue critical components, so prediction of scatter in fatigue is relevant. For purposes of illustrating a computational, microstructure-sensitive approach, we focus on the effects of second phase inclusions. A range of microstructural parameters, applied loading conditions, and material properties affect the local cyclic plastic strain localization in the neighborhood of an inclusion. Relevant parameters include:
Elastic stiffness and strength of inclusions, Elastic and inelastic properties of the matrix, Geometric attributes of inclusions – sizes, shapes, Spatial distribution of the inclusions – nearest neighbor distance of large inclusions, including correlation of position with respect to grain boundaries and free surfaces, • Crystallographic orientation of the grain in which the inclusion lies as well as misorientation with neighboring grain(s), • Integrity of inclusion and matrix–inclusion interface (e.g., perfectly bonded, partially debonded, fully debonded or cracked inclusion), and • Presence of denuded zones around inclusions or grain/dendrite boundaries. In addition to these microstructural parameters, loading parameters such as the amplitude of the applied strain, the load ratio, and multi-axiality can each have a significant effect on the number of cycles necessary to form a crack at an inclusion. Sensitivity to microstructure features and loading parameters can be explored using computational methods that consider relevant length scales and mechanisms of crack formation and small crack propagation within the microstructure. Steps involved in an extension of the notch root approach to
estimation of fatigue resistance based on micronotch (e.g., inclusion) analysis can be suggested as follows:
I. Identify controlling microstructure features for crack formation and early growth.
II. Conduct numerical analyses (e.g., finite element) of cyclic loading of various microstructure/notch geometries for representative loading cases.
III. Build transfer functions for each inclusion/micronotch type (including interactions between closely spaced inclusions or with free surfaces) between macroscopic loading and average micronotch root cyclic plastic strain, as a function of applied strain amplitude, mean stress or strain, and variable amplitude loading, as appropriate.
IV. Apply microstructure-scale crack formation/incubation relations of LCF type, based on simple Coffin–Manson forms, to model crack formation corresponding to a transition crack length that is suitable for application of crack propagation analysis.
V. Apply microstructurally and possibly physically small crack propagation relations for crack growth to a length considered representative of “crack initiation”, typically 500 µm to 1 mm.
VI. Calibrate constants of the Coffin–Manson and small crack propagation relations to results for experimentally characterized microstructure(s), and then use these constants to predict results for other microstructures.
2. Application to Cast Al Alloys
An example of the foregoing microstructure-sensitive fatigue analysis scheme is presented next for inclusions in cast Al alloys (cf. [11]). The method can also be applied to fatigue crack formation at inclusions (or other comparable heterogeneities) in wrought alloys. Cast A356-T6 is dominantly an Al–Si alloy with a dendrite cell size ranging from 30 to 90 µm and eutectic, interdendritic regions decorated with Si particles in an Al-rich matrix. There is a hierarchy of scales of inclusions that affect total fatigue life, ranging from distributed gas microporosity and Si particles with diameters on the order of 3–15 µm, to high levels of microporosity with maximum pore diameters from about 60 µm to several hundred µm, to large shrinkage pores with diameter greater than several hundred µm, to large oxides introduced during casting, typically of size ranging from several hundred µm to millimeters. The larger inclusions in this hierarchy are increasingly detrimental to fatigue life for a given loading condition. To model their effect on fatigue resistance, it is necessary to treat the scales of inclusions and their features in a distinct manner, and to
consider their interactions through propagation/coalescence relations. In this way, the joint probability of finding inclusions from given populations within fatigue-critical, highly stressed regions of components can be considered in a microstructure-sensitive fatigue design methodology.
Figure 1 schematically illustrates three distinct regions of the constant amplitude, completely reversed uniaxial strain- and stress-life plots for a low porosity cast A356-T6 alloy. In this plot, the length scale D pertains to the diameter of a typical Si particle or small gas pore within or near an interdendritic region. The length scale ℓ pertains to the size of the plastic zone at the notch root, defined as the scale over which the local plastic shear strain meets or exceeds some specified level, e.g., 0.01%. Effectively, ℓ gives a length scale over which the local maximum cyclic plastic strain concentration is “substantial,” defined in an arbitrary but consistently applied manner. Here, we restrict the values of ℓ to lie in the range 0 ≤ ℓ ≤ D, such that we regard the case ℓ → D as a limiting case of unconstrained plasticity associated with macroscopic yielding and macroscopic LCF. With increasing stress amplitude, ℓ > D and ultimately extensive plasticity sets in; this regime is of little practical interest for most applications. With the Si particle spacing being on the order of the particle diameter in the eutectic regions and ℓ/D < 0.3, the local plasticity at cracked/debonded particles or gas pores is confined to the vicinity of the inclusion and does not interact strongly with neighboring inclusions. We term this regime constrained microplasticity. Interestingly, for pores or debonded Si particles, the value ℓ/D = 0.3 approximately corresponds to the macroscopic cyclic yield strength of A356-T6, so we can identify the transition from microplasticity to macroscopic plasticity (appearance of hysteresis in the remote or global cyclic stress–strain response) with that from constrained to unconstrained plasticity within the microstructure. We may regard ℓ/D = 0.3 as a percolation limit for microplasticity through the microstructure. The regime of limit plasticity sets in as ℓ/D approaches unity, i.e., the plasticity becomes extensive at the macroscopic scale.
Constrained microplasticity exists below the macroscopic yield point and leads to the formation and growth of small fatigue cracks. The applied uniaxial strain amplitude at the yield point of A356-T6 is approximately εa ≈ 0.0023, which corresponds to the percolation limit for microplasticity within the eutectic regions (ℓ/D ≈ 0.3) and is the pertinent definition of the demarcation between LCF and HCF. At applied strain amplitudes well above the percolation limit, a condition of limit plasticity is reached for which the macroscopic average plastic strain amplitude approaches the order of the local plastic strain amplitude within the microstructure. Both eventually exceed the remote applied elastic strain amplitude, at a remote applied strain amplitude of about 0.008. This distinction from the conventional definition of the HCF–LCF transition in wrought alloys, which assumes equality of elastic and plastic strain ranges, Δε^e = Δε^p,
Figure 1. Regimes characterizing cyclic microplasticity at Si particles and casting pores [11]. (A) Elastic–plastic fracture mechanics, propagation-dominated extensive remote plasticity; (B) LCF transition regime; (C) incubation-dominated HCF regime. Here, D is the inclusion diameter and ℓ is the characteristic size of the micronotch root cyclic plastic zone.
is significant. For A356-T6, the HCF region according to our definition lies beyond about 5 × 10⁴ cycles, whereas according to the conventional definition it lies beyond only about 100 cycles for this alloy. It is likely that high strength wrought alloys with fine scales of heterogeneous microstructure are subject to similar categorization, as their conventional transition fatigue lives are often on the order of tens or hundreds of cycles.
3. Crack Formation/Incubation Relations
The cyclic plastic shear strain range is central to the evaluation of the LCF potency of inclusions. In local finite element analyses, the minimum mesh size serves as a lower bound for the domain over which the plasticity is averaged. The maximum micronotch plastic shear strain range is mesh-sensitive (it increases with a decrease in element size), and it is therefore necessary to introduce a non-local volume averaging procedure over integration points in the mesh to effectively remove mesh dependence. Moreover, such a procedure is entirely physically justified, since we are interested in assessing the scale of the intense cyclic plastic zone relative to the micronotch dimension. The non-local average plastic shear strain associated with the θ-plane is calculated by averaging the plastic shear strain on the θ-plane over the area A of the micronotch root region, i.e.,

γ_θ^{p*} = (1/A) ∫_A γ_θ^p dA   (2)
The non-local cyclic plastic shear strain range for each plane is calculated using this expression, based on the range over the third cycle of the simulation. The maximum of the range of γ_θ^{p*} among all planes is taken to be the non-local maximum cyclic plastic shear strain range, i.e.,

β ≡ Δγ_max^{p*}/2 = (1/2) max_θ [Δγ_θ^{p*}]   (3)
In 3D simulations, a volume V would be used instead of the in-plane area A in the averaging process of Eqs. (2) and (3). Similar studies were carried out to examine the effects of nearest neighbor distance and proximity to the free surface [12]. In these calculations, the matrix elastoplasticity is correlated to the experimental cyclic stress–strain behavior of Al–1%Si specimens (eutectic composition) tested at room temperature and a frequency on the order of 1–10 Hz. Nonlinear kinematic hardening J2 plasticity theory was used to describe the cyclic plasticity of the Al-rich matrix (cf. [13]).
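The non-local averaging of Eqs. (2)–(3) is straightforward to implement as post-processing of finite element output. The sketch below is one possible 2D realization, assuming the plastic shear strain ranges at the integration points inside the averaging area A have already been extracted; the array layout and the plane-angle sweep are our own illustrative choices, not a prescription from Ref. [11].

import numpy as np

def nonlocal_beta(eps_p_ranges, weights, thetas=np.linspace(0, np.pi, 181)):
    """Non-local maximum cyclic plastic shear strain amplitude, Eqs. (2)-(3).
    eps_p_ranges : (npts, 2, 2) in-plane plastic strain *ranges* (third cycle)
                   at integration points inside the averaging area A.
    weights      : (npts,) integration-point areas (they sum to A).
    Returns beta = (1/2) max over theta of the area-averaged shear range."""
    A = weights.sum()
    best = 0.0
    for th in thetas:
        n = np.array([np.cos(th), np.sin(th)])     # theta-plane normal
        t = np.array([-np.sin(th), np.cos(th)])    # in-plane tangent
        # engineering shear strain range resolved on the theta-plane
        gam = 2.0 * np.einsum('i,pij,j->p', n, eps_p_ranges, t)
        gam_avg = np.dot(weights, np.abs(gam)) / A   # Eq. (2)
        best = max(best, gam_avg)                    # running max over theta
    return 0.5 * best                                # Eq. (3)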
The lower bound on the length scale of the area A for averaging to find β is related to the minimum slip length over which fatigue cracks might nucleate due to classical PSB formation (Venkataraman et al. [3]). This lower bound may range from about 300 to 1000 nm and effectively establishes the minimum finite element mesh size for the calculation of cyclic plastic shear strain within the matrix to be used in assessment of fatigue crack formation. As will be explained later, the size of an incubated crack at the root of an inclusion will be assumed to be proportional to the inclusion diameter, following the concept of transition crack length from fracture mechanics, and therefore the averaging in Eqs. (2) and (3) is performed similarly over a comparable area, proportional to the square of the inclusion diameter. The non-local maximum cyclic plastic shear strain amplitude β = Δγ_max^{p*}/2 is used to estimate the number of cycles to form and propagate (i.e., incubate) a crack with length on the order of the domain of influence of the micronotch root, a fraction of the inclusion size. Following [11], a notch root Coffin–Manson law is applied for this purpose, i.e.,

β = C_inc N_inc^α   (4)
where α and C_inc are material-dependent parameters. The formulation of a non-local Coffin–Manson law at the microstructure scale is consistent with energetic arguments based on slip irreversibility [3], as discussed in connection with Eq. (2). Both α and C_inc are obtained from experimental data, while β is obtained computationally. Load ratio dependence is explicitly embedded in C_inc, which differs, in general, from the fatigue ductility coefficient ε_f′ in the traditional application of the Coffin–Manson relations to correlate fatigue crack initiation lifetime (cf. [14]); ε_f′ includes effects of substantial small crack propagation through the heterogeneous microstructure (often up to 1 mm), whereas C_inc pertains only to formation at an individual micronotch.
In a real microstructure, inclusions of various sizes and aspect ratios are present. Furthermore, some of the inclusions may be clustered, and their interaction may affect the non-local maximum plastic shear strain amplitude. Inclusions are also found near the free surface, which promotes shear localization and can lead to premature fatigue crack formation. In addition, the applied mean stress or R-ratio can affect the intensity of cyclic plastic shear strain at the notch root, especially when the inclusions are partially debonded, owing to the contact interaction between the inclusion and the surrounding matrix. We do not know a priori which of these scenarios would be most critical. Hence, parametric studies are conducted to determine the non-local maximum plastic shear strain amplitude β as a function of these parameters. Then, information regarding inclusion populations in actual materials can be processed through these relations to assess probabilities of failure based on distribution functions for inclusion types, number densities, sizes, shapes, proximities, etc.
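A schematic of how such parametric results can be exercised against measured inclusion statistics is given below. The lognormal size distribution, its parameters, and the power-law stand-in for the deterministic life model are all hypothetical; in practice the life_model callable would wrap the incubation and small crack growth relations developed in the following sections.

import numpy as np

rng = np.random.default_rng(0)

def sample_inclusion_diameters(n, median_um=8.0, sigma_ln=0.5):
    """Placeholder lognormal fit to a measured inclusion size distribution."""
    return median_um * np.exp(sigma_ln * rng.standard_normal(n))

def life_given_inclusion(D_um, eps_a, life_model):
    """life_model wraps the deterministic incubation + propagation analysis
    described in this article for one inclusion of size D_um [micron]."""
    return life_model(D_um, eps_a)

# Push 10,000 sampled inclusions through a (purely illustrative) stub model
# to obtain the predicted scatter in fatigue life at one strain amplitude.
toy_model = lambda D, eps: 1e7 * (eps / 0.002) ** -5 * (10.0 / D) ** 1.5
lives = np.array([life_given_inclusion(D, 0.002, toy_model)
                  for D in sample_inclusion_diameters(10_000)])
print("B0.1 life ~", np.percentile(lives, 0.1), "cycles")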
4. Micronotch Analyses in Cast A356-T6
A hierarchical treatment of five inclusion types was addressed by McDowell et al. [11] for cast Al–Si alloy A356-T6, spanning the range of length scales relative to the secondary dendrite arm spacing or dendrite cell size (DCS) listed below, in order of ascending severity:
Type A: Distributed microporosity and Si particles; no significant pores or oxides.
Type B: High levels of microporosity; no large pores or oxides (length scale < 3 DCS, which is about 60–300 µm).
Type C: Large pores (length scale > 3 DCS).
Type D: Large pores within one pore diameter of the free surface; no large oxides (length scale > 3 DCS).
Type E: Large folded oxides (length scale > 3 DCS).
The hierarchical approach to fatigue modeling of cast alloys permits bypassing certain crack growth regimes associated with lower length scales if the cracks incubate at larger defects. The total fatigue life is modeled as the sum of the numbers of cycles spent in several consecutive stages, as follows:

N_T = N_inc + N_MSC + N_PSC + N_LC = N_inc + N_MSC/PSC + N_LC   (5)
where N_inc is the number of cycles to incubate (nucleation plus small crack growth through the region of notch root influence) a crack at the micronotch root with initial length a_i on the order of one-half the maximum Si particle diameter, D̂_part, or pore size, D̂_pore. Here, N_MSC is the number of cycles required for propagation of a microstructurally small crack (MSC) with length a_i < a < k DCS, where k is a non-dimensional factor representing a saturation limit at which the 3D crack front encounters a network of Si particles; typically k is in the range of 3–5. Further, N_PSC is the number of cycles required for propagation of a physically small crack (PSC) during the transition from microstructurally small crack status to that of a dominant, long crack. The long crack propagates according to LEFM, with an associated number of cycles N_LC. For this alloy, the DCS is typically on the order of 20–100 µm, and the PSC regime may conservatively extend up to 300–800 µm. For practical purposes, in view of experimental data on this class of alloy, we aggregate N_MSC + N_PSC into the single term N_MSC/PSC. Finite element simulations of cyclic plastic deformation at debonded Si inclusions, which is established as the worst-case scenario for localization of the non-local cyclic plastic shear strain [14, 15], are used to fit relations between β in Eq. (4) and the applied von Mises uniaxial equivalent strain amplitude, ε̄_a, and the R-ratio based on maximum principal stress (R = σ₁|_min / σ₁|_max).
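Before developing the individual stage relations, it is useful to see the life summation of Eq. (5) as a driver skeleton. The sketch below is purely organizational and assumes the stage models are supplied as callables; all inclusion scales in the Type A–E hierarchy would be run through such a driver and the minimum total life taken among them.

def total_life(inclusion, loading,
               incubation, msc_psc_growth, long_crack_growth):
    """Hierarchical life summation of Eq. (5): N_T = N_inc + N_MSC/PSC + N_LC.
    Each stage callable returns (cycles, crack_length_reached); they stand
    in for Eqs. (4), (10)-(12), and (13)-(14) developed below."""
    N_inc, a_i = incubation(inclusion, loading)
    N_msc_psc, a_tr = msc_psc_growth(a_i, inclusion, loading)
    N_lc, a_f = long_crack_growth(a_tr, inclusion, loading)
    return N_inc + N_msc_psc + N_lc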
The average maximum plastic shear strain at the micronotch root (Eqs. (2)–(3)) is calculated over 5% of the inclusion area. Computational micromechanics studies were conducted over a substantial range of inclusion geometries (pores and Si particles) and size distributions to determine the notch root average value of the maximum local cyclic plastic shear strain amplitude, β = Δγ_max^{p*}/2, as a function of the applied strain amplitude [15, 16]. Figure 2 shows examples for uniaxial loading. Particle–matrix contact during the loading reversal plays a key role in cyclic plastic strain localization in such problems. The micronotch Coffin–Manson law for incubation life in Eq. (4) is augmented by the relations:

ℓ/D = (ε̄_a − 0.0006)/0.00567   for ℓ/D ≤ 0.3;
ℓ/D = 1 − 0.7 (0.0023/ε̄_a)^{1/r}   for 0.3 < ℓ/D ≤ 1   (6)

β = Δγ_max^{p*}/2 = (0.1666 + 0.0266 R) [100 {ε̄_a − 0.00025 (1 − R)}]^{2.45} × (1 + z ζ)   (7)

C_inc = C_n + (1/0.7) ⟨ℓ/D − 0.3⟩ (C_m − C_n) = C_n + z (C_m − C_n)   (8)

C_n = 0.24 (1 − R)   (9)
In these equations, D is the maximum Si particle diameter or pore size at a given scale within the preceding hierarchy, D = D̂_part or D̂_pore. All hierarchical scales are pursued in the analysis, and the minimum total life is taken among them. The exponent r in Eq. (6) controls the shape factor for the transition to limit plasticity; r = 0.1 is selected to provide a rapid transition into the limit plasticity regime, as observed in finite element calculations. The incubation life N_inc rapidly becomes an insignificant fraction of the total fatigue life above the percolation limit for microplasticity, where extensive shear localization dominates the eutectic regions. For ℓ/D ≥ 0.3, finite element simulations show that β rapidly saturates to a level well above its value at the percolation limit (on the order of 2% plastic strain in the interdendritic regions). In Eqs. (8)–(9), C_n is the coefficient for nucleation and small crack growth at inclusions in the HCF regime (constrained microplasticity), and C_m is the Coffin–Manson coefficient for incubation in the limit plasticity regime (macroscopic LCF), obtained from the dendrite cell Al–1%Si material. The Macaulay bracket function is defined by ⟨f⟩ = f if f ≥ 0 and ⟨f⟩ = 0 for negative f. The matrix fatigue ductility coefficient is estimated as C_m = 0.03, based on LCF experiments on Al–1%Si specimens at lives below 5 × 10³ cycles. The dependence of C_n on R reflects an effective decrease in matrix fatigue ductility at higher positive R-ratios due to plastic strain localization; the localized plastic strain level increases with R-ratio [15].
Figure 2. Correlation of Eq. (6) with the finite element computational results of Ref. [15] for non-local Δγ_max^{p*}/2 (in %) versus far-field total strain amplitude, ε_a = Δε/2 (in %), for debonded Si particles (upper curves) and for cracked Si particles (lowest curve). An area of A = 0.0625 D² was used in averaging the cyclic plastic strain in 2D calculations, where D is the particle diameter.
Furthermore, ratcheting or progressive plastic deformation of the notch root plastic shear strain is also evident in the calculations of Gall et al. [15] and is known to degrade fatigue ductility.
The exponent α in Eq. (4) pertains to the eutectic Al-rich matrix and is estimated from LCF tests on Al–1%Si as α = −0.5. The localization multiplier z = ⟨ℓ/D − 0.3⟩/0.7 is non-zero only above the microplasticity percolation limit, and rapidly transitions to unity as interdendritic plastic shear strain localization sets in just above the percolation limit. Beyond this point, the incubation process is negligible (N_inc only a few cycles) due to the severe levels of strain localization between particles or pores in and around interdendritic regions. The multiplier ζ represents eutectic strain intensification in the LCF regime. For debonded particles, the value ζ = 9 is estimated based on finite element results for the R = −1 case as ℓ/D → 1.
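Equations (4) and (6)–(9) combine into a compact incubation estimate, sketched below. The equation forms encode our reconstruction of the garbled source typography, and the braces in Eq. (7) are treated as Macaulay brackets, which is an assumption; inverting Eq. (4) gives N_inc = (β/C_inc)^{1/α}.

import numpy as np

def macaulay(x):
    return x if x > 0.0 else 0.0

def incubation_life(eps_a, R, zeta=9.0, r=0.1, alpha=-0.5, Cm=0.03):
    """Micronotch incubation life per Eqs. (4) and (6)-(9)."""
    # Eq. (6): micronotch plastic zone size relative to inclusion size
    lD = (eps_a - 0.0006) / 0.00567
    if lD > 0.3:
        lD = 1.0 - 0.7 * (0.0023 / eps_a) ** (1.0 / r)
    lD = min(max(lD, 0.0), 1.0)
    z = macaulay(lD - 0.3) / 0.7                 # localization multiplier
    # Eq. (7): non-local max cyclic plastic shear strain amplitude
    beta = ((0.1666 + 0.0266 * R)
            * (100.0 * macaulay(eps_a - 0.00025 * (1.0 - R))) ** 2.45
            * (1.0 + z * zeta))
    # Eqs. (8)-(9): incubation coefficient interpolated from HCF to LCF
    Cn = 0.24 * (1.0 - R)
    Cinc = Cn + z * (Cm - Cn)
    # Eq. (4) inverted for incubation life
    Ninc = (beta / Cinc) ** (1.0 / alpha) if beta > 0.0 else np.inf
    return Ninc, beta, lD

# e.g., completely reversed loading just above the percolation limit gives
# an incubation life of only a few cycles, as stated in the text:
print(incubation_life(eps_a=0.003, R=-1.0))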
Once nucleated, small cracks (typically on the order of microns) must then propagate through an enclave with a significant gradient of cyclic stress and plastic strain away from the inclusion, typically losing driving force as they grow away from the micronotch root. If the driving force remains above threshold, a crack effectively leaves behind the influence of the notch and behaves as a crack with a physical length that includes the inclusion diameter (cf. [8, 17]). The transition crack length is typically 10–15% of the inclusion diameter [7]. The MSC/PSC small crack propagation relation is given by

(da/dN)_MSC/PSC = G (ΔCTD − ΔCTD_th)   (10)
where ΔCTD is the cyclic crack tip displacement range and G is a constant for a given microstructure, typically less than unity (cf. [18]). We assign the threshold value ΔCTD_th = 2.86 × 10⁻¹⁰ m = b, where b is the Burgers vector for pure fcc Al. This value is just slightly above the minimum cyclic crack growth advance per cycle measured for squeeze-cast Al–Si alloys [19, 20]. We adopt the specific form

ΔCTD = f(ϕ̄) C_II (DCS/DCS₀) [U Δσ̂/S_u]^n a + C_I (DCS/DCS₀)² [Δγ_max^p/2]_macro   (11)
The first term in Eq. (11) is based on the correlations of Ref. [20] for cracks in low porosity squeeze-cast Al–Si alloys in the MSC/PSC regime under HCF loading conditions, with an additional influence of the average void volume fraction (porosity) ϕ̄ via the function f(ϕ̄), to be discussed later. The coefficient C_II is intended to apply to the MSC and PSC regimes for crack lengths ranging from a few microns to the millimeter range (cf. [20, 21]). The second term is added to describe elastic–plastic crack propagation in the limit plasticity regime, with C_I as the leading coefficient; da/dN is essentially independent of the crack length in this regime, with the maximum macroscopic plastic
shear strain, [Δγ_max^p/2]_macro, as the driving force. This second term is negligible in the HCF regime as defined by the percolation limit for microplasticity.
In Eq. (11), Δσ̂ = 2θ σ̄_a + (1 − θ) Δσ₁ is the range of the uniaxial equivalent stress, which is a linear combination of the von Mises uniaxial effective stress amplitude, σ̄_a = [(3/2)(Δσ′_ij/2)(Δσ′_ij/2)]^{1/2}, and the range of the maximum principal stress, Δσ₁; θ is a constant factor (0 ≤ θ ≤ 1) introduced by Hayhurst et al. [22] to model combined stress state effects (θ ≈ 0.4 based on torsional fatigue experiments). The factor U addresses mean stress effects on propagation, which are influenced strongly by interdendritic particle interactions ahead of and in the wake of the crack; U = 1/(1 − R) for R < 0, and U = 1 for R ≥ 0, where the stress ratio R is based on the maximum principal stress, and U = 0 if the peak principal stress in the cycle is compressive. This form for U is consistent with finite element calculations [23] and the results of Ref. [24] for particle-reinforced systems. The driving force U Δσ̂ is normalized by the ultimate strength S_u in Eq. (11). We assign a dependence of the eutectic matrix fatigue ductility in the HCF regime on the average porosity, ϕ̄, as a scaling parameter to correlate with microporosity, i.e.,
f(ϕ̄) = 1 + ω [1 − exp(−ϕ̄/(2ϕ_th))],   ϕ_th ≈ 10⁻⁴   (12)
This accounts for the effect of microporosity in decreasing matrix ductility. The factor of two to three reduction in fatigue life observed for higher microporosity levels relative to low microporosity cast specimens suggests a value of ω ≈ 2 (cf. [20]). For two different low porosity squeeze-cast alloys in the HCF regime, Shiozawa et al. [20] measured the combined coefficient G C_II = 3.11 × 10⁻⁴ m/cycle for a reference dendrite cell size of DCS₀ = 30 µm; in this case, the microporosity is very low, i.e., f(ϕ̄) ≈ 1. For cast A356-T6, we take G = 0.32, and the other non-dimensional constants that result from data correlation give C_I = 0.31, C_II = 1.88 × 10⁻³, n = 4.8 (as in Ref. [20]), and ω = 2. The reference DCS value in Eq. (11) is taken as DCS₀ = 30 µm, corresponding to a horizontally cast plate, for which S_u = 310 MPa. The exponent n = 4.8 is also reasonably close to the exponent on stress range of the da/dN versus ΔK relation for the A356-T6 alloy in the long crack regime, and is supported by limited finite element calculations of the ΔCTD versus applied stress for cycling in the HCF regime [23].
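The following sketch assembles Eqs. (10)–(12) with the constants quoted above, assuming units of meters and MPa throughout. The placement of the squared dendrite-cell-size ratio in the second term follows our reconstruction of the garbled Eq. (11), and the Δσ̂ and U helpers mirror the definitions given after that equation.

import numpy as np

# Constants for cast A356-T6 as quoted in the text (assumed units: m, MPa)
G, C_I, C_II, n_exp, omega = 0.32, 0.31, 1.88e-3, 4.8, 2.0
DCS, DCS0, Su = 30e-6, 30e-6, 310.0
dCTD_th = 2.86e-10        # threshold = Burgers vector of fcc Al [m]
phi_th = 1.0e-4

def f_porosity(phi_bar):
    """Eq. (12): microporosity knock-down of matrix fatigue ductility."""
    return 1.0 + omega * (1.0 - np.exp(-phi_bar / (2.0 * phi_th)))

def dsig_hat(sbar_a, dsig1, theta=0.4):
    """Uniaxial equivalent stress range defined after Eq. (11)."""
    return 2.0 * theta * sbar_a + (1.0 - theta) * dsig1

def U_factor(R, peak_tensile=True):
    """Mean stress factor U described after Eq. (11)."""
    if not peak_tensile:
        return 0.0
    return 1.0 / (1.0 - R) if R < 0.0 else 1.0

def dadN_msc_psc(a, sbar_a, dsig1, R, phi_bar, dgam_p_macro=0.0):
    """Eqs. (10)-(11): MSC/PSC growth rate [m/cycle] at crack length a [m]."""
    dctd = (f_porosity(phi_bar) * C_II * (DCS / DCS0)
            * (U_factor(R) * dsig_hat(sbar_a, dsig1) / Su) ** n_exp * a
            + C_I * (DCS / DCS0) ** 2 * dgam_p_macro)
    return G * max(dctd - dCTD_th, 0.0)    # no growth below threshold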
The mechanically long crack growth relation is given by

(da/dN)_LC = A_p [(ΔK_eff)^M − (ΔK_eff,th)^M]   (13)

For A356-T6, M ≈ 4.2 and A_p ≈ 1.5 × 10⁻¹¹ m (MPa√m)^{−4.2} per cycle. The intrinsic threshold is given by ΔK_eff,th ≈ 1.3 MPa√m for A356-T6, as
determined from experiments at very high stress ratios. The effective stress intensity factor range is defined by ΔK_eff = K_max − K_op if K_min < K_op, and ΔK_eff = K_max − K_min if K_min ≥ K_op, where the opening stress intensity factor level is given by Couper et al. [24] as K_op = 3.4 + 3.8 R² for R > 0, K_op = 3.4 (1 + R) for 0 ≥ R ≥ −1, and K_op = 0 for R < −1. Following incubation, we select between the MSC/PSC and LC growth laws as the crack extends by taking the maximum of the two respective rates, i.e.,

da/dN = max [ (da/dN)_MSC/PSC , (da/dN)_LC ]   (14)
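The long-crack law of Eq. (13), the opening-level fit, and the rate selection of Eq. (14) may be sketched as follows. The geometry-free estimate K = σ√(πa) is our own simplifying assumption for illustration; a real analysis would use a geometry-specific stress intensity solution.

import numpy as np

M, A_p = 4.2, 1.5e-11           # Eq. (13) constants for A356-T6
dK_eff_th = 1.3                 # intrinsic threshold [MPa sqrt(m)]

def K_op(R):
    """Opening stress intensity level [MPa sqrt(m)], per Couper et al. [24]."""
    if R > 0.0:
        return 3.4 + 3.8 * R ** 2
    return 3.4 * (1.0 + R) if R >= -1.0 else 0.0

def dK_eff(smax, R, a):
    """Effective SIF range for crack length a [m] under smax [MPa], R."""
    Kmax = smax * np.sqrt(np.pi * a)      # assumed geometry-free estimate
    Kmin = R * Kmax
    Kop = K_op(R)
    return Kmax - Kop if Kmin < Kop else Kmax - Kmin

def dadN_lc(smax, R, a):
    """Eq. (13): long-crack growth rate [m/cycle]."""
    return A_p * max(dK_eff(smax, R, a) ** M - dK_eff_th ** M, 0.0)

def dadN_total(a, smax, R, msc_psc_rate):
    """Eq. (14): the crack grows at the faster of the two rates."""
    return max(msc_psc_rate, dadN_lc(smax, R, a))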
Use of the ΔK-based growth relation is subject to the constraint that the requirements for validity of the homogeneous LEFM approach to model fatigue crack growth in the heterogeneous cast alloy are satisfied, i.e.,

a > 30 DCS (S_y/Δσ_eff)²   (15)
where Δσ_eff = σ_max − σ_op, and the opening stress accords with K_op. This criterion corresponds to a cyclic plastic zone enclave at the crack tip on the order of the DCS.
For Si particles, the initial crack size is given by

a_i = D̂_part/2 + ℓ̃/4 = 0.5625 D̂_part,   where ℓ̃ ≈ D̂_part/4   (16)

For pores with diameter less than 3 DCS, the initial crack size is

a_i = D̂_pore/2 + ℓ̃/4 = 0.5625 D̂_pore,   where ℓ̃ ≈ D̂_pore/4   (17)

For pores with diameter greater than 3 DCS, the initial crack size is

a_i = D̂_pore/2 + (1/2)(DCS/2) = D̂_pore/2 + DCS/4   (18)

and the factor β is amplified by D̂_pore/(3 DCS) to account for the loss of constraint on slip with increase of pore size relative to the dendrite cell size. The case of localization of cyclic plastic strain for large pores near free surfaces was characterized as well and considered as a separate case. For large oxides of length D̂_oxide, it is assumed that the incubation relations are bypassed completely, with an initial crack size for propagation given by a_i = D̂_oxide/2.
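The initial crack size assignments of Eqs. (16)–(18), plus the oxide bypass rule, reduce to a small lookup; a minimal sketch:

def initial_crack_size(kind, D, DCS):
    """Initial crack size a_i per Eqs. (16)-(18); D and DCS in meters.
    kind is 'particle', 'pore', or 'oxide'."""
    if kind == 'particle':
        return 0.5625 * D                    # Eq. (16), with l~ = D/4
    if kind == 'pore':
        if D < 3.0 * DCS:
            return 0.5625 * D                # Eq. (17)
        return 0.5 * D + 0.25 * DCS          # Eq. (18)
    if kind == 'oxide':
        return 0.5 * D                       # incubation bypassed entirely
    raise ValueError(kind)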
The final crack length is specified either as a_f = a_f|_dominant = 1 mm for a dominant crack in the HCF regime C of Fig. 1 (only the maximum inclusion size dominates), or as an effective crack length ã_f (ã_f < a_f) in the LCF regime that accounts for multi-site incubation of cracks at the largest inclusions of different populations, followed by dilute (non-interactive) growth and impingement coalescence. Coalescence phenomena for distributed crack nucleation are considered only for the LCF regime, i.e., ℓ/D → 1 (regimes A or B in Fig. 1). We set the final crack length in the propagation analysis as a_f = a_f|_dominant + z (ã_f − a_f|_dominant) to account for the transition from dominant crack failure in HCF to multi-site crack formation and coalescence in LCF. For example, the approximate recursion relation

ã_f ≈ D̂_pore/2 + [(0.685 − 0.04 ξ₁)/2] [(ξ₁ + 1) D̂_part/2 + δ_part] × [ Σ_{i=1}^{n} ξ₁^{2i−1} + (δ_pore/(2 δ_part) − n) ξ₁^{2(n+1)−1} ]   (19)
applies to the reduced effective crack length at failure due to a system of the largest Si particles residing within a field of monosize gas or shrinkage pores. Here, n = INT[δ_pore/(2 δ_part)] and ξ₁ = D̂_pore/(D̂_pore + D̂_part). The average spacing between the largest (fractured or debonded) Si particles is given by δ_part, and δ_pore is the average nearest neighbor spacing between pores of a given mean diameter D̄_pore (a transcription of Eq. (19) is sketched below). Finally, we sum the various components of lifetime to arrive at the total number of cycles to produce a crack of 1 mm length or to reach ã_f: N_T = N_inc + N_MSC/PSC + N_LC. Figures 3 and 4, respectively, show the predicted uniaxial remote stress amplitude and strain amplitude versus N_T curves for completely
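A direct transcription of Eq. (19), as reconstructed above from the garbled source typography (so the grouping of terms should be checked against Ref. [11]), is:

def a_f_effective(D_pore, D_part, delta_pore, delta_part):
    """Reduced effective final crack length of Eq. (19) for multi-site
    incubation at Si particles in a field of monosize pores (lengths in m)."""
    xi1 = D_pore / (D_pore + D_part)
    n = int(delta_pore / (2.0 * delta_part))
    series = sum(xi1 ** (2 * i - 1) for i in range(1, n + 1))
    tail = (delta_pore / (2.0 * delta_part) - n) * xi1 ** (2 * (n + 1) - 1)
    return (0.5 * D_pore
            + 0.5 * (0.685 - 0.04 * xi1)
              * (0.5 * (xi1 + 1.0) * D_part + delta_part)
              * (series + tail))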
Figure 3. Model predictions: remote stress amplitude versus total fatigue life for completely reversed uniaxial loading for a range of inclusion types and sizes.
Figure 4. Variation of completely reversed, uniaxial applied strain–total life behavior as a function of inclusion type and size for cast A356-T6 Al, including coalescence effects in the LCF regime. In the coalescence propagation analysis, parameters were assigned based on experimental work: ϕ̄ = 1.5 × 10⁻³, D̄_pore = 50 µm, D̂_part = 12 µm, and DCS = 30 µm.
reversed loading (R = −1) as a function of the inclusion type considered. For squeeze-cast alloys (12 µm evenly spaced fractured Si particles in Figs. 3 and 4), porosity is minimized by the application of hydrostatic pressure during solidification, and the maximum fatigue resistance is obtained in the HCF regime. However, coalescence phenomena render this microstructure less resistant in LCF, as shown in Fig. 4.
The key point is that once such a multi-mechanism, multi-scale model of fatigue is established and its mean behavior is correlated to in situ matrix fatigue behavior along with one or two well-characterized inclusion populations, such as in the squeeze-cast condition, it is robust in predicting the variability of fatigue resistance associated with a complete range of microstructural features, including dendrite cell size, eutectic Si particles, both gas and shrinkage pores, and oxides. Combining such a tool with component-level stress analysis, it is foreseen that variations of microstructure can be achieved through design of process conditions (e.g., solidification rate and pressure) to tailor material fatigue resistance in critical locations of components, rather than through costly, indiscriminate control of processing over the entire part. Moreover, minimum weight component design can be undertaken considering additional constraints on the level of fatigue resistance of the microstructure. Many exciting possibilities exist in this regard.
5. Cyclic Shakedown and Ratcheting in Fatigue
The foregoing model has focused on reversed cyclic plasticity as a driving force for the formation of fatigue cracks. Cyclic plastic strain behavior is generally decomposed into three regimes: elastic shakedown, reversed cyclic plasticity, and plastic ratcheting, as shown in Fig. 5 (cf. [25]). Elastic shakedown is defined as the stress or strain level below which there is a cessation of cyclic plasticity; in other words, elastic shakedown is obtained when plastic deformation occurs during the early cycles but the steady state behavior is fully elastic due to the build-up of residual stresses. Reversed cyclic plasticity is the condition in which the material experiences reversed plastic straining during cycling with no net accumulation of plastic deformation; reversed cyclic plasticity is sometimes referred to as plastic shakedown. Plastic ratcheting describes the condition in which the material accumulates a net directional plastic strain during each cycle.
The ratcheting plastic strain increment per cycle is defined as

Δε^p_ij|_ratch = ε^p_ij|_(end of cycle) − ε^p_ij|_(beginning of cycle)   (20)
The reversed cyclic plastic strain range is given by

Δε^p_ij|_cyc = ε^p_ij|_(max over the cycle) − ε^p_ij|_(min over the cycle)   (21)
Figure 5. Steady state responses of plastic strain behavior during the cycle: (a) elastic shakedown, (b) reversed cyclic plasticity, and (c) plastic ratcheting.
The effective reversed cyclic plastic strain range and ratcheting plastic strain increment are defined as follows:

Δε^p_cyc,eff = [(2/3) Δε^p_ij|_cyc Δε^p_ij|_cyc]^{1/2}   (22)

Δε^p_ratch,eff = [(2/3) Δε^p_ij|_ratch Δε^p_ij|_ratch]^{1/2}   (23)
It is assumed that elastic shakedown occurs when the following conditions are satisfied:

ε^p_ij ≠ 0   and   Δε^p_cyc,eff , Δε^p_ratch,eff ≤ λ ε_y   (24)
where λ (≪ 1) scales shakedown relative to the cyclic yield strain ε_y. In other words, when elastic shakedown occurs, both the reversed cyclic plastic strain and plastic ratchet strain amplitudes are considered to be zero. Shakedown and ratcheting maps, expressed in terms of strain distributions or as a function of loading parameters, can be quite valuable for interpreting the role of microstructure in fatigue resistance. The interested reader is referred to recent works [26–28], which describe such maps constructed for fretting fatigue of Ti–6Al–4V based on computational multiphase crystal plasticity, where the surface boundary layer thickness is on the order of, and therefore intimately related to, microstructure scales. Ratcheting is a very important mechanism for this class of surface contact problems.
It should be recognized that classical to-and-fro slip is not responsible for all crack formation and propagation mechanisms at the microstructure scale. Progressive pileup of dislocations in slip bands (Zener mechanism) that impinge on grain or phase boundaries, or at oxidized inclusion interfaces, can lead to formation and propagation of small cracks in the microstructure. In fretting fatigue, for example, progressive plastic deformation of surface layers has been shown to contribute significantly to the formation and early growth [27] of cracks on the order of the grain size under ostensibly HCF conditions. An appropriate measure of plastic strain to reflect this sort of driving force is the ratchet strain. The averaging procedure in Eq. (3) for the non-local β = Δγ_max^{p*}/2 can also be applied to a non-local measure of the increment of the cyclic rate of ratchet strain accumulation, Δγ^{p*}_max,ratch, or its cumulative value. A microfracture criterion can be introduced for crack incubation in such cases, e.g., N_inc = f(Δγ^{p*}_max,ratch), or a Mohr–Coulomb form [6], N_inc = g(Δγ^{p*}_max,ratch) + h(σ*_n,max), where σ*_n,max is the maximum tensile normal stress on the plane of crack formation within the same region over which the averaging is performed to define Δγ^{p*}_max,ratch.
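Equations (20)–(24) amount to a simple classification of the steady-state cycle; a sketch over a sampled plastic strain history is given below. The symbol λ for the shakedown tolerance and the branching rule used to separate ratcheting from reversed cyclic plasticity when shakedown fails are our own illustrative choices.

import numpy as np

def effective_range(de):
    """Eqs. (22)-(23): effective measure of a strain tensor range."""
    return np.sqrt(2.0 / 3.0 * np.tensordot(de, de))

def classify_cycle(eps_p_history, lam, eps_y):
    """Classify the steady-state cycle (Fig. 5) from a sampled history of
    the plastic strain tensor, eps_p_history with shape (nsteps, 3, 3)."""
    de_ratch = eps_p_history[-1] - eps_p_history[0]                 # Eq. (20)
    de_cyc = eps_p_history.max(axis=0) - eps_p_history.min(axis=0)  # Eq. (21)
    cyc, ratch = effective_range(de_cyc), effective_range(de_ratch)
    if max(cyc, ratch) <= lam * eps_y:                              # Eq. (24)
        return "elastic shakedown"
    return "plastic ratcheting" if ratch > cyc else "reversed cyclic plasticity"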
6. Summary
A hierarchical approach for microstructure-sensitive fatigue analysis based on computational micromechanics has been outlined. Each aspect of the relation of microstructure to fatigue damage is deterministic, framed to explicitly incorporate microstructure features. Such an approach can predict the variability of fatigue life with respect to variation of any particular microstructure feature. Microstructure features for the cast Al alloy example described here include dendrite cell size, maximum Si particle size, maximum pore size, maximum oxide size, proximity to the free surface (for large pores), and average porosity level. If the probability distributions of these features are specified, based on quantitative metallography for example, then the probability distributions for the fatigue life can be computed directly from the model.
The foregoing methodology is an extension of the existing, straightforward practice of estimating the fatigue life of notched components to microstructural notches (Socie et al. [9]). It requires analyses of notch root behavior for various microstructural features (relations between remote loading conditions and behavior in notch root regions), as well as the introduction of appropriate small fatigue crack growth relations for a given microstructure. The extension of this methodology to other characteristic wrought and cast microstructures is straightforward, although it requires an investment of effort to sort out mechanisms for crack formation and propagation, as well as appropriate properties. In principle, relations for broad classes of cast alloys should be similar, as should those for wrought alloys. We also recognize several additional directions of future research in computational fatigue models that can contribute to this type of approach:
• Cohesive zone interface separation elements for inclusion–matrix and grain boundary interfaces.
• Discrete dislocation simulations for understanding crack tip behavior and the behavior of dislocations at very small micronotches, for which the scale of the cyclic plastic zone is on the order of the dislocation spacing.
• Nonlocal relations for dislocation substructure formation and its relation to cyclic plastic deformation.
• Adaptive remeshing and mesh-free approaches for propagation of cracks within microstructures, to assist in establishing appropriate small crack growth relations for each characteristic type of microstructure.
References
[1] J.A. Bannantine, J.J. Comer, and J.L. Handrock, Fundamentals of Metal Fatigue Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1990.
[2] S. Suresh, Fatigue of Materials, Cambridge University Press, 2nd edition, Cambridge, UK, 1998.
[3] G. Venkataraman, Y.W. Chung, and T. Mura, “Application of minimum energy formalism in a multiple slip band model for fatigue-II. Crack nucleation and derivation of a generalised Coffin-Manson law,” Acta Met. Mater., 39(11), 2631–2638, 1991.
[4] E.A. Repetto and M. Ortiz, “A micromechanical model of cyclic deformation and fatigue-crack nucleation in f.c.c. single crystals,” Acta Mater., 45(6), 2577–2595, 1997.
[5] D.F. Socie, “Critical plane approaches for multiaxial fatigue damage assessment,” In: Advances in Multiaxial Fatigue, ASTM STP 1191, D.L. McDowell and R. Ellis (eds.), ASTM, Philadelphia, 7–36, 1993.
[6] D.L. McDowell, “Multiaxial fatigue strength,” ASM Handbook, vol. 19 on Fatigue and Fracture, ASM International, 263–273, 1996a.
[7] R.A. Smith and K.J. Miller, “Fatigue cracks at notches,” Int. J. Mech. Sci., 19, 11–22, 1977.
[8] N.E. Dowling, “Fatigue at notches and the local strain and fracture mechanics approaches,” In: Fracture Mechanics, ASTM STP 677, C.W. Smith (ed.), ASTM, Philadelphia, 247–273, 1979.
[9] D.F. Socie, N.E. Dowling, and P. Kurath, “Fatigue life estimation of notched members,” In: Fracture Mechanics: 15th Symp., ASTM STP 833, R.J. Sanford (ed.), ASTM, Philadelphia, 284–299, 1984.
[10] D.L. McDowell, “Basic issues in the mechanics of high cycle metal fatigue,” Int. J. Fracture, 80, 103–145, 1996b.
[11] D.L. McDowell, K. Gall, M.F. Horstemeyer, and J. Fan, “Microstructure-based fatigue modeling of cast A356-T6 alloy,” Eng. Frac. Mech., 70, 49–80, 2003.
[12] J. Fan, D.L. McDowell, M.F. Horstemeyer, and K. Gall, “Cyclic plasticity at pores and inclusions in cast Al–Si alloys,” Eng. Frac. Mech., 70(10), 1281–1302, 2003.
[13] D.L. McDowell, “Multiaxial effects in metallic materials,” Symp. on Durability and Damage Tolerance, ASME AD-Vol. 43, ASME Winter Annual Meeting, Chicago, IL, Nov. 6–11, 213–267, 1994.
[14] K. Gall, N. Yang, M. Horstemeyer, D.L. McDowell, and J. Fan, “The influence of modified intermetallics and Si particles on fatigue crack paths in a cast A356 Al alloy,” Fatigue Fract. Engng. Mater. Struct., 23(2), 159–172, 2000a.
[15] K. Gall, M.F. Horstemeyer, B.W. Degner, D.L. McDowell, and J. Fan, “On the driving force for fatigue crack formation from inclusions and voids in a cast A356 aluminum alloy,” Int. J. Fract., 108, 207–233, 2001.
[16] K. Gall, M. Horstemeyer, D.L. McDowell, and J. Fan, “Finite element analysis of the stress distributions near damaged Si particle clusters in cast Al–Si alloys,” Mech. Mater., 32(5), 277–301, 2000b.
[17] J.C. Ting and F.V. Lawrence, Jr., “Modeling the long-life fatigue behavior of a cast aluminum alloy,” Fatigue Fract. Engng. Mater. Struct., 16(6), 631–647, 1993.
[18] C.-H. Goh, D.L. McDowell, and R.W. Neu, “Characteristics of plastic deformation field in polycrystalline fretting contacts,” Int. J. Fatigue, 25(9–11), 1047–1058, 2003b.
[19] A. Plumtree and S. Schafer, “Initiation and short crack behaviour in aluminum alloy castings,” In: The Behaviour of Short Fatigue Cracks, EGF Pub. 1, K.J. Miller and E.R. de los Rios (eds.), Mech. Engineering Publications, London, 215–227, 1986.
[20] K. Shiozawa, Y. Tohda, and S.-M. Sun, “Crack initiation and small fatigue crack growth behaviour of squeeze-cast Al-Si aluminum alloys,” Fatigue Fract. Engng. Mater. Struct., 20(2), 237–247, 1997. [21] S. Gungor and L. Edwards, “Effect of surface texture on fatigue life in a squeeze-cast 6082 aluminum alloy,” Fatigue Fract. Engng. Mater. Struct., 16(4), 391–403, 1993. [22] D.R. Hayhurst, F.A. Leckie, and D.L. McDowell, “Damage growth under nonproportional loading,” ASTM STP 853, ASTM, Philadelphia, 688–699, 1985. [23] J. Fan, D.L. McDowell, M.F. Horstemeyer, and K. Gall, “Computational micromechanics analysis of cyclic crack-tip behavior for microstructurally small cracks in dual-phase Al–Si alloys,” Eng. Frac. Mech., 68, 1687–1706, 2001. [24] M.J. Couper, A.E. Neeson, and J.R. Griffiths, “Casting defects and the fatigue behavior of an aluminum casting alloy,” Fatigue Fract. Eng. Mater. Struct., 13(3), 213–227, 1990. [25] J.M. Ambrico and M.R. Begley, “Plasticity in fretting contact,” J. Mech. Phys. Solids, 48(11), 2391–2417, 2000. [26] C.-H. Goh, J.M. Wallace, R.W. Neu, and D.L. McDowell, “Polycrystal plasticity simulations of fretting fatigue,” Int. J. Fatigue, 23, S423–S435, 2001. [27] C.-H. Goh, R.W. Neu, and D.L. McDowell, “Crystallographic plasticity in fretting of Ti–6Al–4V,” Int. J. Plasticity, 19(10), 1627–1650, 2003a. [28] C.-H. Goh, D.L. McDowell, and R.W. Neu, “Characteristics of plastic deformation field in polycrystalline fretting contacts,” Int. J. Fatigue, in press, 2003b.
Chapter 4
MATHEMATICAL METHODS
4.1 OVERVIEW OF CHAPTER 4: MATHEMATICAL METHODS

Martin Z. Bazant¹ and Dimitrios Maroudas²
¹Massachusetts Institute of Technology, Cambridge, MA, USA
²University of Massachusetts, Amherst, MA, USA
Mathematics is the language of science. In some sense, therefore, this entire Handbook is devoted to “mathematical methods” of materials modeling. What distinguishes the articles in this chapter is the degree of mathematical intensity or sophistication, as well as contributions from the applied mathematics community. Building on its traditional strengths in fluid mechanics, nonlinear dynamics, and numerical methods, applied mathematics has been steadily moving into materials science. The result is a fresh perspective on a wide range of materials problems, including many from other chapters of the Handbook. This chapter serves to highlight some major themes from current research, such as disordered materials, interfacial dynamics, and multiscale modeling, to give a taste of the subject.
The chapter has been structured into three thematic sections. The first one (Articles 4.2–4.5) is devoted to theoretical descriptions of bulk phases of materials that are under extreme conditions of deformation and/or are characterized by heterogeneities in their microstructure or the loss of structural order. The second section (Articles 4.6–4.10) addresses problems of interfacial dynamics and morphological or microstructural evolution of multiphase systems mediated by the dynamics of the boundaries between different phases, which have been challenging the fields of materials science and fluid dynamics for many decades. The third and final section (Articles 4.11–4.15) is devoted to mathematical developments in the multiscale modeling of complex systems, which is a promising and powerful computational means toward analysis and predictive modeling of realistic, technologically important materials and their processing and function.
1. Bulk Phases of Highly Deformed, Heterogeneous, or Disordered Materials
The articles of this section (Articles 4.2–4.5) present fundamental principles of theoretical analysis and address challenging problems of structural response to mechanical loading (or other external fields) of bulk material phases. The materials are either perfectly crystalline but subjected to elastic deformations that bring the crystal to its ideal-strength limit, or have complex microstructure, as in composite systems or random heterogeneous materials, or are fully disordered, as in amorphous solids. In these cases, the intensity of the loading conditions and/or the structural complexity of the bulk material phase introduces serious challenges for the theoretical analysis and its computational implementation. For example, conventional perturbation analyses about a linearly elastic continuum are not sufficient to rigorously address the fundamental theoretical problems, and additional or more sophisticated mathematical tools are required.
Article 4.2 by Milstein investigates theoretically the structural response of a perfect crystal under load for elastic deformations in the vicinity of the crystalline material's “theoretical strength”. The principles of elastic stability analysis are reviewed and crystal stability criteria are derived under various loading modes. Both lattice-statics calculations and isostress molecular-dynamics simulations are used to explore the behavior of the crystal at and beyond the onset of the elastic instability and to elucidate the roles of crystal symmetry and mode of loading, as well as the atomic-scale dynamical mechanisms that may lead to structural transformation (phase change) or failure of the crystal beyond the instability limit.
Amorphous solids provide a promising starting point for theoretical analysis toward fundamentally understanding general classes of deformation behavior. In Article 4.3 by Falk, Langer, and Pechenik, two questions that are fundamental in the development of amorphous plasticity theory are addressed, namely how hardening-to-flow transitions occur under applied stress and how microstructural dynamics can be incorporated into macroscopic constitutive theories. A theoretical framework is developed that includes the two-state dynamics associated with shear transformation zones (or flow defects). Some predictions of the resulting model are given for the mechanical response of metallic glasses and amorphous polymers, and the need to better understand the thermodynamics of nonequilibrium systems is emphasized.
Article 4.4 by Sornette reviews the statistical physics of the rupture of heterogeneous materials, such as composite systems, the failure of which is of utmost importance to a broad range of technological applications. The theoretical challenge in the field arises from the complex interplay between heterogeneities and modes of damage, as well as a hierarchy of static and dynamic
characteristic scales; a common property of the heterogeneous systems of interest is the presence of large-scale inhomogeneities that limit the use of homogenization theories. The many-body nature of the rupture problem is highlighted and the need for a truly interdisciplinary approach to attack the problem is emphasized. Article 4.5 by Torquato addresses the theoretical prediction of random heterogeneous materials properties through the use of statistical correlation functions to describe the dependence on microstructure of effective material properties and the development of methods to estimate the corresponding functionals of microstructural information. A unified theoretical approach is outlined based on the canonical n-point correlation function and the analysis focuses on static (or approximately static) two-phase heterogeneous materials. The effective properties are used in averaged constitutive equations to close the appropriate homogenized governing partial differential equations (PDEs) that describe, for small-length-scale heterogeneities, physical processes occurring in heterogeneous materials.
2. Interfacial Dynamics and Morphological & Microstructural Evolution
Materials science is increasingly focusing on the detailed dynamics of microstructures out of equilibrium, to better understand and optimize the macroscopic behavior of complex materials. Such problems are often beyond the reach of atomistic modeling and typically require continuum approaches, which have long been the domain of applied mathematics. The major difficulty is to describe the moving free boundary between different phases, which can be (or become) quite complicated. Even when the governing equations in each phase are linear, the mathematical problem for interfacial dynamics is generally nonlinear and nonlocal. This presents challenges for both numerical and analytical modeling, which are addressed by the articles of this section (Articles 4.6–4.10). A wide variety of numerical methods have been developed, which mostly fall into two classes, Eulerian and Lagrangian. The former represent moving boundaries as level sets of higher-dimensional functions defined on the same fixed mesh as the bulk fields, which naturally allows for topological changes and complicated microstructures. The Phase Field Method for solidification is a well-known example from materials science and is discussed in Chapter 7 of this Handbook. From applied mathematics, the Level-Set Method, presented in Article 4.6 by Sethian, has been used in diverse problems from shock dynamics to image recognition and is now a standard tool to simulate etching and deposition processes in semiconductor micro-fabrication. The method also
is used widely in other areas of materials modeling, such as in thin-film growth (see, e.g., Article 7.15 by Caflisch and Ratsch in Chapter 7 of this Handbook).
In contrast, Lagrangian methods explicitly track each moving boundary with a separate data structure. Examples include front-tracking methods for shock waves and immersed-boundary methods for cardiac fibrillations. The Lagrangian approach is particularly useful when the boundary has its own physical properties, separate from the bulk, as in many soft condensed matter systems (see, e.g., articles in Chapter 9 of this Handbook). For example, in simulations of complex fluids containing elastic solid filaments, as discussed in Article 4.7 by Shelley and Tornberg, boundary integral methods can eliminate the need to explicitly describe the bulk fluid phase (e.g., using the methods of Chapter 8 of this Handbook).
Due to the complexity of interfacial dynamics, analytical methods are usually restricted to special situations, but, when available, they offer valuable insights. For continuum models based on PDEs, a crucial role is played by exact similarity solutions, in which the independent variables appear only in special power-law combinations, usually due to a separation of length and/or time scales. For example, continuum descriptions with similarity solutions are being developed for modeling crystal surface morphological evolution governed by surface diffusion, as discussed in Article 4.8 by Stone and Margetis. Similarity solutions also are essential to describe singularities in free-surface fluid flows, such as the break-up and coalescence of fluid drops, as discussed in Article 4.9 by Eggers, which are difficult to capture with conventional numerical methods.
In two dimensions, conformal-mapping methods from complex analysis allow elegant formulations of interfacial dynamics problems, convenient for analytical and numerical solutions, without any special similarity assumptions, as discussed in Article 4.10 by Bazant and Crowdy. Continuous conformal-map dynamics is a mature subject for viscous fingering and other fluid instabilities, but it is being extended to new problems in materials microstructure, such as viscous sintering, electromigration-driven void dynamics in metals, pore evolution in elastic solids, and solidification in fluid flows. The recent development of stochastic conformal-map dynamics has also been a major breakthrough in the study of diffusion-limited aggregation and other fractal-growth phenomena.
3. Multiscale Modeling of Complex Systems
Multiscale modeling methods are becoming significant tools in materials modeling, as well as a broad range of areas in scientific and engineering research. Over the past decade, multiscale modeling has emerged as a
powerful, integrated computational approach for understanding, analyzing, and quantitatively predicting the behavior of realistic complex systems. The aim of multiscale modeling is to link fine-scale phenomena with the macroscopic response exhibited over coarse scales by establishing rigorous links between widely different theoretical formalisms and computational methods; the terms fine and coarse are not uniquely defined, but vary over different complex systems. Although the core capabilities of multiscale modeling include mature methods of quantum mechanics, statistical mechanics, and continuum mechanics, the rigorous coupling of these methods to produce satisfactory, ultimately predictive models for the accurate description of complex systems remains a very serious challenge. The articles of this section (Articles 4.11–4.15) address this very challenge.
In current modeling, the best available descriptions of a complex system exist at a fine (atomistic or microscopic) scale, while the modeling tasks need to address a much coarser, macroscopic scale. Article 4.11 by Kevrekidis, Gear, and Hummer gives an overview of their novel development of a mathematically based, computational enabling technology that allows for performing macroscopic tasks by acting directly on the microscopic models. This “equation-free” approach circumvents the need for deriving accurate macroscopic equations starting from the corresponding microscopic descriptions. An ensemble of short, appropriately initialized, fine-scale computer simulations is used to estimate time derivatives, functions, and functional derivatives, which are then used for system-level modeling through matrix-free numerical analysis and systems-theory tools. The approach has the potential to bridge elegantly microscopic simulation with macroscopic modeling of complex systems.
Modeling mesoscopic inhomogeneities arising due to thermal fluctuations and complex interactions between microscopic mechanisms requires efficient description of length and time scales much larger than those captured by conventional molecular/microscopic models and simulations. Article 4.12 by Katsoulakis and Vlachos provides an overview of the key ingredients in the derivation of a mathematical framework for coarse graining of stochastic processes; these involve a coarse grid selection, as well as the derivation, through a stochastic closure, of a coarse stochastic model for a reduced (compared to the underlying microscopic description) number of observables, leading to the development of coarse-grained Monte Carlo algorithms. The approach is demonstrated focusing on simple Ising-type models, and the coarse-graining errors are estimated using information theory methods.
Multiscale modeling aims at developing numerical tools of accuracy comparable to that of microscopic models and efficiency comparable to that of macroscopic models, by properly coupling the microscopic with the macroscopic models. Article 4.13 by E and Li reviews some of these strategies that have been developed for multiscale modeling of crystalline solids, focusing on
the coupling between molecular dynamics and continuum mechanics and, in particular, on concurrent coupling methods for linking different scales “on the fly”. These modeling methods are classified into energy-based and dynamics-based formulations. Specific methods discussed include the quasi-continuum method, macro atomistic ab initio dynamics, coarse-grained molecular dynamics, and the heterogeneous multiscale method. The need for rigorous multiscale modeling of solids beyond single crystals with isolated defects is emphasized.
In multiscale modeling, a natural question is the development of a computational method that captures small-scale effects on the large scales using a coarse grid, without the requirement to resolve all the small-scale features. Article 4.14 by Hou illustrates some of the key issues in designing multiscale computational methods for fluid flows, using as examples incompressible flow and two-phase flow in heterogeneous porous media. Emphasis is placed on a multiscale finite-element method that constructs local basis functions to capture small-scale information within each element and bring it to the large scales through the coupling of the global stiffness matrix. The need to localize the subgrid small-scale problems by properly implementing microscopic boundary conditions for the local basis functions is highlighted, and methodology to accomplish this is discussed for both diffusion-dominated and convection-dominated transport problems. Future directions in addressing the need to carry out multiscale analysis that accounts for long-range interactions of small scales also are discussed.
Engineering analysis requires the prediction of selected “outputs” relevant to component/system performance as a function of “inputs”, i.e., system parameters that serve to identify a particular realization of the component/system. Article 4.15 by Cuong, Veroy, and Patera addresses modeling of components or systems in service or in operation, where typical computational tasks include robust parameter estimation and adaptive design, i.e., inverse problems and optimization problems, respectively. Their certified real-time approach to solving parameterized PDEs considers both approximation and computation opportunities and is based on rapidly, uniformly convergent reduced-basis approximations and associated rigorous and sharp error bounds. Examples demonstrating the approach include Helmholtz elasticity and natural convection. These methods are appropriate for many classes of materials behavior and processing problems.
4.2 ELASTIC STABILITY CRITERIA AND STRUCTURAL BIFURCATIONS IN CRYSTALS UNDER LOAD

Frederick Milstein
Mechanical Engineering and Materials Depts., University of California, Santa Barbara, CA, USA
What happens when a crystalline material is deformed elastically to the point where it loses structural stability? Under what circumstances will it lose stability? Why are these questions important? The stress required to cause elastic instability is often considered to be the ultimate “theoretical strength” of a crystalline material, which is an inherently intriguing concept, in and of itself. The “theoretical strength” plays important roles in understanding and/or describing practical phenomena, e.g., it forms a basis for calculating the efficiency of grinding processes and it affects the stress distribution near the tip of a crack and thus influences whether a material will exhibit brittle or ductile behavior. From another viewpoint, structural phase change rather than loss of strength is the presumed outcome of elastic instability. New crystalline or amorphous structures that form under mechanical stress may remain elastically stable after the stress is released, and so may continue to exist indefinitely, even if not in the thermodynamic equilibrium state at zero stress. (An example of an elastically stable structure that is also not in the thermodynamic equilibrium state is the extremely hard, tetragonal crystalline form of iron–carbon alloy referred to as martensitic steel; such structures are sometimes called metastable.) Additionally, as noted by Hill [1], “Single crystals free from lattice imperfections are used increasingly as microstructural components. Perfect crystals are capable of elastic strains well beyond what can properly be treated as infinitesimal. Their response to general loading is virtually unknown and is doubtless complex, so experimentation will have to be conducted within some plausible theoretical framework”. In this context, Milstein and Chantasiriwan [2] observed “Atomistic model computations can shed light on these
complexities, particularly when comprehensive comparisons are made among different metals, crystal structures, and loading directions. Such comparisons can also serve to distinguish between finite strain responses that are sensitive to specific details of atomic binding and those dependent mainly on just crystal symmetries and the general nature of interatomic forces, i.e., attractive between atoms at relatively large interatomic spacing and repulsive between close, neighboring atoms”. Since Hill's observation, almost 30 years ago, lattice model computations have yielded numerous insights into the large strain, non-linear, elastic response of crystals, although their general response to loading is still largely unknown, especially as it concerns the nature of atomic mechanisms at and beyond the onset of instability.

Questions posed at the start of this article may be further elaborated as follows. How is elastic stability under load to be assessed theoretically? What are the roles of crystal symmetry and mode of loading? (For example, a face centered cubic (fcc) crystal with a uniaxial compressive load applied in a [100] direction (i.e., parallel to an edge of a unit cubic crystallographic cell) will respond differently than a body centered cubic (bcc) crystal with a uniaxial tensile load applied in a [111] direction (i.e., parallel to a body diagonal of the unit cell).) If the purported stability limit coincides with a bifurcation point, what are the allowed eigendeformations at the immediate onset of instability? At and after the initiation of a bifurcation, are the atomic mechanisms homogeneous (i.e., with the crystal deforming uniformly, in the manner of a predicted homogeneous eigenmode) or inhomogeneous (e.g., with the formation of domains or the shuffling of crystallographic planes)? Does post-bifurcation behavior lead to failure (loss of load carrying capacity) or to phase change without loss of strength? How are instability processes influenced by thermal activation?

A goal of this article is to suggest some definitive, as well as some tentative, answers to the above questions, from within a framework that is both analytical and computational. The article is structured as follows. First, the principles of stability analysis for ideal crystals under load are reviewed; the methodology presented is complete to second order in both the internal energy of the crystal (expressed in terms of the crystal's second order elastic moduli) and the external work. Then examples are given of various lattice statics (LS) calculations that are intended to provide illustrations of the manner in which particular crystal symmetries and modes of loading yield a rich and diverse range of mechanical and geometrical responses, prior to, at, and after the onset of instability. Next, the potential role of higher order elastic moduli at a point of bifurcation or branching of the crystal is discussed. The final topic is the behavior of crystals under stress in isostress molecular dynamics (IMD) simulations carried out in the methodology proposed by Parrinello and Rahman [3]. At each stage of this presentation, comments are made regarding the “outlook” and needs for future work.
1. Principles of Stability Analysis of Ideal Crystals
In pioneering work, Born [4] introduced the concept of theoretical strength as an elastic instability phenomenon; the first attempt to carry out calculations of the range of elastic stability of an ideal crystal subjected to uniaxial load was made by Born and Furth [5]. According to Born, a crystal under homogeneous deformation may be treated as a conservative dynamical system with six degrees of freedom; stability, in the ordinary Lagrangian sense, is then to be assessed along conventional lines. In Born's formulation, however, external work contributions are not explicitly and fully included in the total potential energy. As a result, Born's criterion for elastic stability amounts to equating the range of elastic stability with the domain of convexity of the internal strain energy, which, as first noted by Hill [1], is not coordinate invariant. Consequences of adopting this approach, together with developments of rigorous, coordinate invariant, elastic stability criteria for crystals under load, were presented by Hill and Milstein [6] and Milstein and Hill [7, 8], and are reviewed briefly here.

If homogeneous strains of a crystal lattice are described by some set of “generalized coordinates” qr (r = 1, . . . , 6) that together specify the geometry of the deformed crystallographic cell, then work-conjugate “generalized forces” pr in a configuration qr may be defined via the differential form

dE = pr dqr (1)

(summation convention, r = 1, . . . , 6), and “generalized moduli” crs via

dpr = crs dqs (2)

with

crs = ∂²E/∂qr∂qs, (3)

where E is the elastic strain energy per unit reference volume (e.g., per unit crystallographic cell). The incremental change in strain energy δE resulting from incremental changes in the cell's geometry δqr is then

δE = pr δqr + ½ crs δqr δqs, (4)

correct to second order in the qr. Various coordinate sets qr have been employed in practice; e.g., the Green variables were always adopted by the Born school, Macmillan and Kelly [9] employed elements of the stretch tensor, and in his earlier work, Milstein [10] used the edges of the crystallographic cell and their included angles; these have been termed G-, S-, and M-variables, respectively.

Now, consider the crystal to be in a current, homogeneously deformed, state qr under generalized forces pr and let the crystal undergo any small,
arbitrary, additional deformation of the chosen set qr specified by the set δqr. Elastic stability of the crystal then signifies that the combined incremental potential energy of the crystal and its external loading (i.e., the sum of the incremental elastic strain energy δE and external work δW) is positive for all possible, arbitrary, incremental variations δqr. The increment δW of external work must therefore also be specified objectively to second order in the qr; i.e.,

δW = pr δqr + ½ krs δqr δqs, (5)

where the coefficients krs depend on the test configuration and the choice of variables qr. The algebraic expression of the stability criterion, δE − δW > 0, then becomes

(crs − krs) δqr δqs > 0 (6)

for arbitrary δqr when not all δqr = 0. Inequality (6) may be contrasted with the Born criterion, i.e.,

crs δqr δqs > 0, (7)

which neglects the explicit inclusion of the second order work terms. Inequality (7), which thus equates elastic stability with the positive definiteness of the matrix of elastic moduli crs, is equivalent to the assertion that δE > pr δqr to second order. The lack of general coordinate invariance of inequality (7) is demonstrated briefly below (see Ref. [6] for further details).

If qr and qr* represent two distinct choices of geometric coordinates, all variables appearing in relations (1)–(6) may be rewritten with asterisks when reckoned to the set qr*, and by invariance of the energy per unit mass of the crystal,

pu* dqu*/ρ* = pr dqr/ρ, (8)

where ρ* and ρ are the masses (or equivalently, the numbers of atoms) in the reference cells. The conjugate variables then transform according to
(ρ/ρ*) pu* = (∂qr/∂qu*) pr, (9)

from which

(ρ/ρ*) dpu* = (∂qr/∂qu*) dpr + (∂²qr/∂qu*∂qv*) pr dqv*. (10)
Next, substitute (2) and its asterisked analog into (10) and compare coefficients of the independent dqv*, which yields the transformation formulae for the moduli,

(ρ/ρ*) cuv* = (∂qr/∂qu*)(∂qs/∂qv*) crs + (∂²qr/∂qu*∂qv*) pr, (11)
from which it follows that

(ρ/ρ*) cuv* δqu* δqv* − crs δqr δqs = pr (∂²qr/∂qu*∂qv*) δqu* δqv*. (12)
The right hand side of (12) does not in general vanish, which thus demonstrates the lack of coordinate invariance of the Born criterion. Invariance of δW/ρ also requires that the symmetrized krs transform according to

(ρ/ρ*) kuv* = (∂qr/∂qu*)(∂qs/∂qv*) krs + (∂²qr/∂qu*∂qv*) pr, (13)
in analogy with (11). Combining (11) and (13) then yields

(1/ρ*)(cuv* − kuv*) δqu* δqv* = (1/ρ)(crs − krs) δqr δqs, (14)
which thereby demonstrates the coordinate invariance of the stability criterion (6). Relation (6), of course, reduces to relation (7) in the absence of applied load.

Consider next the topic of bifurcations of an initially stable crystal on a primary path under a prescribed mode of loading at the “critical stage” where the quadratic form of relation (6) first passes from positive definite to semidefinite, i.e., at the instant at which the stability criterion (6) is first violated. At this stage the homogeneous equations

δpr − krs δqs = 0 (15)

necessarily have at least one eigensolution that causes the quadratic form to vanish; these equations are also necessarily coordinate invariant [6]. However, since branching of a primary path under a prescribed mode of loading is associated with loss of stability, it follows that the location of the presumed branch point on the primary path is likewise not coordinate invariant in general when the criterion for its inception is stationarity of the conjugate forces during some virtual increment of deformation; i.e., by analogy with (15), the corresponding eigenequations associated with the Born criterion,

δpr = crs δqs = 0, (16)

are not coordinate invariant in general. (For example, under [100] uniaxial loading of a cubic crystal (p1 ≠ 0 in general, all other pr = 0, q1 ≠ q2 = q3, and cell edges remain perpendicular on the primary path), p1 achieves a maximum or minimum value on the primary path coincident with the extremum in the axial load l1 if the qr are the S- or M-variables, whereas p1 and l1/λ1 reach
extrema simultaneously if qr represents the G-variables, where the axial stretch λ1 is the length of any fiber coaxial with the [100] direction divided by its length in the reference state.)

The above considerations naturally evoke a number of practical questions. First, how does one deal with the coefficients krs in practice? These coefficients may be readily obtained for certain special cases, such as the well-defined, technically uncomplicated, loading environment provided by a uniform hydrostatic pressure that remains constant during any departure of the crystal's geometry from equilibrium [6–8]. However, more generally, the precise determination of appropriate krs values presents a particularly challenging problem. As noted by Hill and Milstein [6], “the loading in laboratory experiments is usually frame dependent and the work is affected also by rotation of the specimen. On the intrinsic view, the loads ‘follow’ the material during any disturbance; they may, in addition, be deformation sensitive and so become different in kind from those in a state of equilibrium whose stability is under test.” What, then, might be the additional consequences and implications of dropping the krs terms and reverting to the original Born criterion (i.e., in addition to the issue of coordinate invariance, as already discussed)? More specifically, (i) are the limits of “stability”, as judged from relation (7), strongly or weakly dependent on the choice of geometric variables qr, (ii) what are the mechanical implications of the Born concept of instability, and (iii) are there some exceptional bifurcations on some paths that are essentially coordinate invariant according to this criterion?

With regard to part (iii) of the above question, only one such coordinate invariant eigenstate has yet been identified; it occurs on a path of [100] uniaxial loading of an initially cubic crystal; this has been called the “c22 = c23” invariant eigenstate and, as is discussed later in this article, this state plays a particularly important role in both the [100] and [110] uniaxial loading behavior of cubic crystals.

With regard to (ii), Hill and Milstein [6] did provide a notional mechanical interpretation of the Born criterion. This is summarized, as follows, in a paper on the [111] loading of cubic crystals [11, p. 4289]. Although not explicitly stated by Born, it is implicit to “Born's view . . . [that the loading] environment is notional, since the implied work input during any δqr is pr δqr correct to second order. This means that the loading must be imagined to ‘follow’ the deforming crystal servo-mechanically so as to hold fixed the values of pr, regardless of changes in shape or orientation (for instance, with the Green variables, the [111] load must be maintained along the Bravais cell diagonal and proportional to its length). To that extent, Born's criterion could perhaps be said to characterize an ‘intrinsic’ strength that reflects a property of the material alone. The fact is, however, that [inequality (7)] is not coordinate invariant, but depends on the particular choice of variables and on the reference configuration”.
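The coordinate dependence of the Born criterion can be made concrete with a minimal one-dimensional analog of relations (8)–(14), in which a single coordinate q (the stretch measure, q = λ − 1) is compared with its Green counterpart q* = (λ² − 1)/2. The following Python sketch is an illustration constructed for this article, not a computation from the literature; the Morse-like energy function is an arbitrary choice. It evaluates the one-coordinate analog of (11), c* = (dq/dq*)²c + (d²q/dq*²)p, and exhibits a loaded state at which the Born test c > 0 and its transformed counterpart c* disagree in sign, while the combination c − k of criterion (6) transforms by the positive factor (dq/dq*)² and so gives a measure-independent verdict.

```python
import sympy as sp

lam = sp.symbols("lam", positive=True)

# Arbitrary smooth energy per unit reference volume (a Morse-like bar in
# tension); the specific form is only for illustration.
D, a = 1, 2
E = D * (1 - sp.exp(-a * (lam - 1))) ** 2

# Stretch coordinate q = lam - 1, Green coordinate q* = (lam**2 - 1)/2.
p = sp.diff(E, lam)        # force conjugate to q (dq = dlam)
c = sp.diff(E, lam, 2)     # modulus c = d2E/dq2

# One-dimensional analog of transformation (11):
# c* = (dq/dq*)**2 * c + (d2q/dq*2) * p, with dq/dq* = 1/lam and
# d2q/dq*2 = -1/lam**3.
c_star = c / lam**2 - p / lam**3

x = 1.3  # a loaded configuration in tension
print("c  (stretch measure):", float(c.subs(lam, x)))       # > 0
print("c* (Green measure)  :", float(c_star.subs(lam, x)))  # < 0

# The corrected form (6) is invariant: taking k = 0 in the stretch
# measure, k transforms by the analog of (13), and (c - k) keeps its
# sign in every measure.
k = sp.Integer(0)
k_star = -p / lam**3
print("c - k            :", float((c - k).subs(lam, x)))
print("(c* - k*)*lam**2 :", float(((c_star - k_star) * lam**2).subs(lam, x)))
```

At λ = 1.3 the stretch-measure modulus is positive while the Green-measure modulus is negative, which is precisely the defect of criterion (7) noted above; the two bottom lines agree, as required by (14).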
With regard to part (i) of the question posed above, it has been clearly demonstrated from a series of lattice statics based calculations of the domains of stability of cubic crystals under constant hydrostatic pressure that the ranges of “stability”, as judged by relation (7), are highly sensitive to the choice of qr (within the group of G-, S-, and M-variables), and they diverge significantly from the domains based on the rigorous criterion (6) [7, 8]. Furthermore, a meaningful physical interpretation of the notional Born criterion in a constant hydrostatic environment is lacking. On the other hand, when the same three sets of variables were used in LS computations to determine the ranges of “Born stability” of initially cubic crystals under uniaxial loadings coincident with principal symmetry directions, the typical result was a fairly small dependence on the choice of variables ([11]; Fang and Milstein, to be published; Chantasiriwan and Milstein, to be published); the exception was bcc metals under compression, as is discussed in the next section of this article. In addition, uniaxial IMD loading simulations have yielded instabilities in close proximity to the LS Born instabilities (Zhao, Maroudas, and Milstein, to be published). These results (which are discussed in following sections of this article) suggest that, while the Born criterion is inadequate, both philosophically and quantitatively, for assessment of stability under a constant hydrostatic environment, it can be efficacious for specific uniaxial loadings. Whether the Born criterion reasonably predicts the onset of instability under other modes of loading (e.g., shear or biaxial), and whether it has a strong or weak dependence on reasonable choices of the geometric variables, remains to be investigated by means of IMD simulations, combined with LS computations based on diverse measures of lattice strain as generalized coordinates.

With regard to the latter consideration, Hill [1] noted that, in principle, one could use components of various other measures of strain as generalized coordinates, and he considered any tensor coaxial with the principal fibers and having principal values e(λ1), e(λ2), e(λ3), where λ1, λ2, λ3 are the principal stretches; e(λ) can be any smooth monotone function that yields agreement with the classical infinitesimal strain when deformation is first order (i.e., e(1) = 0 and e′(1) = 1). Examples of e(λ) are λ − 1, ln λ, and (1/2)(λ² − 1), the last of which generates the components of Green's measure of strain.

It can be instructive to illustrate the concepts presented above by way of concrete examples. For this purpose, let us consider cubic crystals subjected to three different modes of applied load, viz. hydrostatic pressure, [100] uniaxial loading, and [111] uniaxial loading. Although both lattice statics and isostress molecular dynamics simulations have been carried out for each of these three cases, as is discussed in subsequent sections of this article, the discussion that follows in this section is independent of any specific model of atomic binding; the only assumption about atomic binding is that the path dependent internal energy E and its derivatives with respect to the qr are calculable.
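Hill's two admissibility conditions on a strain measure, e(1) = 0 and e′(1) = 1, are easy to verify for the three examples just named; the short sketch below (illustrative only) checks them symbolically and also shows how the measures diverge from one another at finite stretch, which is the root of the coordinate dependence discussed above.

```python
import sympy as sp

lam = sp.symbols("lam", positive=True)

# Three admissible measures: stretch, logarithmic, and Green.
measures = {
    "stretch": lam - 1,
    "log":     sp.log(lam),
    "Green":   (lam**2 - 1) / 2,
}

for name, e in measures.items():
    e0 = e.subs(lam, 1)                    # must vanish at lam = 1
    e1 = sp.diff(e, lam).subs(lam, 1)      # slope must equal 1 at lam = 1
    val = float(e.subs(lam, sp.Rational(13, 10)))  # value at 30% stretch
    print(f"{name:7s}: e(1) = {e0}, e'(1) = {e1}, e(1.3) = {val:.4f}")
```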
A cubic crystal under hydrostatic pressure remains cubic on a primary path, and thus has three independent elastic moduli c11, c12, and c44. More fundamental, however, are the moduli κ, µ, and µ′ defined by

dσ11 + dσ22 + dσ33 = 3κ(ε11 + ε22 + ε33), (17)

dσ11 − dσ22 = 2µ(ε11 − ε22), (18)

dσ12 = 2µ′ε12, (19)

where the Cauchy stress is σij, the Eulerian strain rate is εij, and the d preceding the σij denotes derivatives of components on cubic axes (or indeed on any rotating frame, when the current stress σij = −Pδij). Thus, κ is the bulk modulus and µ and µ′ are the shear moduli in the relation between the cubic-axes components of the Cauchy stress increment and the rotationless strain increment (evaluated relative to the current configuration under pressure P).

Milstein and Hill [7] employed the principles of bifurcation analyses of general materials in the determination of stability criteria for cubic crystals subjected to hydrostatic loading. The analyses are carried out in a manner equivalent to Hill (Ref. [12], Chapter III, Section C2) but without recourse to the general mathematical apparatus for handling follower-loadings. Milstein and Hill's treatment of crystal stability is rigorous and complete; i.e., (a) the loading environment is fully specified, to sufficient order and in both its active and passive modes, and (b) the potential energy of the system as a whole is examined in all the nearby, possibly inhomogeneous, configurations allowed by the kinematic constraints, if any. Under a hydrostatic pressure that does not vary during any departure from a considered configuration of equilibrium, elastic stability is guaranteed if

κ(ε11 + ε22 + ε33)² + (2/3)µ[(ε11 − ε22)² + (ε22 − ε33)² + (ε33 − ε11)²] + 4µ′(ε12² + ε23² + ε31²) > 0. (20)

Since the three terms are independently variable, the necessary and sufficient conditions for stability are the simultaneous satisfaction of the inequalities

κ(P) > 0, µ(P) > 0, and µ′(P) > 0. (21)
Milstein and Hill [7, 8] identified the primary eigenstates and corresponding eigensolutions ηij associated with loss of stability on a fundamental path at a pressure P = Q as follows.

(i) κ(Q) = 0, µ(Q) > 0, µ′(Q) > 0, with eigensolutions η11 = η22 = η33 ≠ 0; η12 = η23 = η31 = 0 (the eigenmode is necessarily homogeneous and purely volumetric, coincident with dP/dV = 0, where V is the volume).

(ii) µ(Q) = 0, κ(Q) > 0, µ′(Q) > 0, with solutions such that η11 + η22 + η33 = 0; η12 = η23 = η31 = 0 (the uniform eigenmodes make the lattice orthorhombic, or possibly tetragonal, without varying the cell volume).

(iii) µ′(Q) = 0, κ(Q) > 0, µ(Q) > 0, with solutions such that η11 = η22 = η33 = 0; any ratios η12 : η23 : η31 (the uniform eigenmodes distort the lattice without varying the lengths of the cell edges).

Explicit connections between relations (6) and (21) are obtained as follows. For a cubic crystal under hydrostatic pressure,

crs δqr δqs = (1/3)(c11 + 2c12)(δq1 + δq2 + δq3)² + (1/3)(c11 − c12)[(δq1 − δq2)² + (δq2 − δq3)² + (δq3 − δq1)²] + c44[(δq4)² + (δq5)² + (δq6)²]. (22)
The form krs δqr δqs can be expanded similarly, so the stability criterion (6) becomes

c11 + 2c12 > k11 + 2k12, c11 − c12 > k11 − k12, and c44 > k44. (23)
The relations between the moduli crs and κ, µ, and µ′ for a cubic crystal under pressure [7] are

e′²c11/λ = κ + (4/3)µ + (λe″/e′)P,
e′²c12/λ = κ − (2/3)µ − P, and
e′²c44/λ = µ′ + (1/2)(1 + λe″/e′)P. (24)

If e(λ) = (1/2)(λ² − 1) (i.e., the Green measure of strain), relations (21) and (24) yield the stability criteria

3κ/λ = c11 + 2c12 + P/λ > 0, 2µ/λ = c11 − c12 − 2P/λ > 0, and µ′/λ = c44 − P/λ > 0, (25)

from which k11 = P/λ, k12 = −(P/λ), and k44 = P/λ in the Green measure of strain. If e(λ) = λ − 1 (which generates the stretch measure),

3κλ = c11 + 2c12 + 2Pλ > 0, 2µλ = c11 − c12 − Pλ > 0, and µ′λ = c44 − (1/2)Pλ > 0, (26)

so, if crs represents the S-moduli, k11 = 0, k12 = −Pλ, and k44 = Pλ/2. Finally, if the qr are the edges of the cubic cell and their included angles,

3κλ = c11 + 2c12 + 2Pλ > 0, 2µλ = c11 − c12 − Pλ > 0, and µ′λ³ = c44 − Pλ³ > 0, (27)

so in the M-variables, k11 = 0, k12 = −Pλ, and k44 = Pλ³.
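A small numerical sketch (constructed for this article, with arbitrary illustrative moduli rather than data for any particular metal) makes the content of relations (23)–(27) concrete: given the moduli crs in a chosen measure, together with the pressure P and stretch λ, it evaluates both the notional Born conditions (7) and the corrected conditions (23), which by (25)–(27) are equivalent to κ > 0, µ > 0, and µ′ > 0.

```python
def stability_tests(c11, c12, c44, P, lam, measure="G"):
    """Born test (7) vs. the exact test (6)/(23) for a cubic crystal under
    pressure P at stretch lam, with moduli given in the G-, S-, or
    M-variables; the krs follow relations (25)-(27) of the text."""
    k11, k12, k44 = {
        "G": (P / lam, -P / lam, P / lam),
        "S": (0.0, -P * lam, P * lam / 2),
        "M": (0.0, -P * lam, P * lam**3),
    }[measure]
    born = (c11 + 2 * c12 > 0) and (c11 - c12 > 0) and (c44 > 0)
    exact = (c11 + 2 * c12 > k11 + 2 * k12) and \
            (c11 - c12 > k11 - k12) and (c44 > k44)
    return born, exact

# Hypothetical Green-measure moduli under hydrostatic compression (P > 0):
# the crystal is Born-"stable" yet classically unstable here.
print(stability_tests(c11=1.0, c12=0.6, c44=0.05, P=0.2, lam=0.95))  # (True, False)
```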
Consider next the [100] loading of an initially cubic crystal. Under uniaxial load (p1 ≠ 0, all other pr = 0), the crystal becomes tetragonal on a primary path (q1 ≠ q2 = q3; cell edges remain perpendicular) with six independent moduli crs, viz. c11, c12 = c13, c22 = c33, c23, c44, and c55 = c66 (all other crs = 0, and crs = csr, of course). The differential relations (2) that govern an arbitrary differential disturbance are then

dp1 = c11 dq1 + c12(dq2 + dq3), dp2 = c12 dq1 + c22 dq2 + c23 dq3, dp3 = c12 dq1 + c23 dq2 + c22 dq3, (28)

with

dp4 = c44 dq4, dp5 = c55 dq5, dp6 = c55 dq6. (29)
If the load were to remain uniaxial, i.e., dp1 ≠ 0, all other dpr = 0, the general solution to (28) and (29) gives the coordinate increments (dq1, . . . , dq6) on the primary path, i.e.,

(dq1, . . . , dq6) = (c22 + c23, −c12, −c12, 0, 0, 0) dp1[c11(c22 + c23) − 2c12²]⁻¹. (30)
c11 δq1 + c12 (δq2 + δq3 ) c11
2
1 2c2 + c22 + c23 − 12 2 c11
(δq2 + δq3 )2
+ 12 (c22 − c23 )(δq2 − δq3 )2 + c44 δq42 + c55 δq52 + δq62 ,
(31)
and the determinant of the moduli matrix factors as [6] 2 2 ]c44 c55 . det(crs ) = (c22 − c23 )[c11 (c22 + c23 ) − 2c12
(32)
Thus, the necessary and sufficient conditions for Born stability according to relation (7) are seen to be c11 > 0,
c22 + c23 −
2 2c12 > 0, c11
c22 − c23 > 0,
(33)
together with c44 > 0,
c55 > 0.
(34)
The determinant (32) can vanish when, and only when, at least one factor does, and each vanishing factor is associated with a particular type of eigensolution:

(2c12, −c11, −c11, 0, 0, 0), when c22 + c23 = 2c12²/c11;
(0, 1, −1, 0, 0, 0), when c22 − c23 = 0;
(0, 0, 0, 1, 0, 0), when c44 = 0;
(0, 0, 0, 0, 1, 0) and (0, 0, 0, 0, 0, 1), when c55 = 0. (35)
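The factorization (32) and the eigensolutions (35) are straightforward to verify numerically; the sketch below (hypothetical moduli, tuned so that the first factor of (32) vanishes) builds the 6 × 6 moduli matrix and confirms that (2c12, −c11, −c11, 0, 0, 0) is then a null vector.

```python
import numpy as np

def moduli_matrix(c11, c12, c22, c23, c44, c55):
    """6x6 moduli matrix of a tetragonal crystal under [100] load."""
    C = np.zeros((6, 6))
    C[0, 0] = c11
    C[0, 1] = C[1, 0] = C[0, 2] = C[2, 0] = c12
    C[1, 1] = C[2, 2] = c22
    C[1, 2] = C[2, 1] = c23
    C[3, 3] = c44
    C[4, 4] = C[5, 5] = c55
    return C

c11, c12, c44, c55, c23 = 2.0, 1.1, 0.7, 0.6, 0.5
c22 = 2.0 * c12**2 / c11 - c23   # first factor of (32) vanishes here

C = moduli_matrix(c11, c12, c22, c23, c44, c55)

# Determinant factorization (32); both sides vanish at this eigenstate.
det_closed = (c22 - c23) * (c11 * (c22 + c23) - 2 * c12**2) * c44 * c55**2
assert np.isclose(np.linalg.det(C), det_closed)

# The first eigensolution of (35) is a null vector of the moduli matrix.
v = np.array([2 * c12, -c11, -c11, 0, 0, 0])
print(C @ v)  # ~ zero vector
```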
Born and Furth [5] employed an alternative method of judging convexity of the internal energy, based on the requirement that all principal minors of the matrix of moduli crs must be positive, according to a standard theorem in algebra. This approach, however, did not reveal the eigensolutions (35), and curiously, Born and Furth arrived at six “necessary and sufficient conditions for the stable equilibrium of the lattice”, five of which are equivalent to (33) and (34), in addition to the condition c22 > 0 (note, however, that relations (33) also imply c22 > 0).

Next, we may ask, how useful is the notional Born concept, as expressed by relations (33)–(35), in judging the stability and bifurcation response of initially cubic crystals under [100] uniaxial loading? From expression (30), we see that the first kind of eigenstate in (35) occurs where the variable p1 becomes stationary, so the associated Young's modulus vanishes. If this eigenstate were to terminate a notional stability range expressed in the M- or S-variables, it would occur at a maximum or minimum of the applied load l1 and therefore would make reasonable physical sense as a stability limit if the loading apparatus were to attempt to apply a constant uniaxial tensile load in excess of the maximum value of l1 (or a compressive load exceeding the minimum l1) on the primary path, and transverse strains were unimpeded. (As indicated earlier, in the G-variables, this kind of eigenstate occurs where l1/λ1 becomes stationary, and thus is found on the primary path before the maximum value of l1 is reached under tension and after a minimum of l1 in compression.)

The location of the second eigenstate on the primary path is independent of the choice of qr, and thus, if stability is indeed terminated at the invariant c22 = c23 eigenstate, the notional criterion (7) and the stability criterion (6) coincide. (The invariance simply requires that q1 is coaxial with the uniaxial load l1, which is of course coaxial with the unique axis of the tetragonal crystal, and q2 and q3 are coaxial with the transverse tetragonal axes.) A rigorous proof of this result is given by Hill and Milstein [6, p. 3093]. For physical insight, we note that the eigensolution at this state has all dpr = 0, with dq2 = −dq3, all other dqr = 0; if qr (r = 1, 2, 3) represents the edges of the tetragonal cell, then pr (r = 1, 2, 3) are the axial loads lr (i.e., the M-variables). Thus the eigendeformation at the branch point takes the crystal structure from the primary tetragonal path to a secondary, orthorhombic
branch (q1 ≠ q2 ≠ q3; cell edges remain orthogonal), with the uniaxial load remaining dead during the differential eigendeformation. On the secondary path, at the branch point, the generalized Poisson ratios dq2/dq1 and dq3/dq1 are infinite and of opposite algebraic sign, and the first order expression for dp1/dq1 (i.e., expressed in terms of the second order moduli crs alone) is indeterminate. In fact, owing to the highly singular nature of the secondary path at the branch point, the correct expression for the variation of axial load with axial stretch on the secondary path at the point of bifurcation (expressed in terms of elastic moduli on the primary path at this point) must include third and fourth order moduli, crst and crstu. This is discussed further in the section of this article concerned with the role of higher order moduli.

The second and third types of eigenstates in (35) are somewhat analogous; i.e., upon rotation of the 2- and 3-axes by 45° about the 1-axis, a new set of axes is obtained on which the crystal maintains tetragonal symmetry. (For example, if on the original, unrotated, set of axes the crystallographic cell is described as body centered tetragonal, on the rotated axes the cell appears as face centered tetragonal.) As a result, the c22 = c23 eigenstate, reckoned to the unrotated set of axes, occurs at the exact same point on the primary path as the “c44 = 0” eigenstate, reckoned to the rotated axes, and vice versa. It thus follows that the c22 = c23 and c44 = 0 eigenstates have the same invariance, and the eigendeformation δq4 ≠ 0, all other δqr = 0, on one set of tetragonal axes is identical to δq2 = −δq3, all other δqr = 0, on the other (rotated) set of tetragonal axes.

In view of the above discussion, we have clear, physically meaningful interpretations of each of the first three eigenstates in (35). Physical clarity in each of these three cases is enhanced by the condition that, during the eigendeformation, the load l1 remains uniaxial, parallel to the 1-axis, which in turn remains perpendicular to the 23-face of the crystallographic cell. Physical interpretation of the “c55 = 0” eigenstate is more problematic, owing to the characteristic shearing mode of the associated eigendeformation. That is, under this eigendeformation, if the load were to remain parallel to the 1-axis, it would cease to be perpendicular to the 23-face, while if it were to remain normal to the 23-face, it would no longer be aligned with the crystallographic 1-axis. This eigenstate is also not coordinate invariant; e.g., if c55 and c55* represent the Green and stretch moduli, respectively,

4c55* = (λ1 + λ2)²c55 + l1; (36)

thus, the occurrence of a c55* = 0 eigenstate is preceded by a c55 = 0 eigenstate in a tensile region (l1 > 0); the order of appearances is reversed under compression.

Next consider the notional stability criteria and associated bifurcation response for cubic crystals under [111] uniaxial loading, following the exposition of Ref. [11]. Under [111] loading, the primary path is axisymmetric;
select a set of rectangular axes with the 3-axis in the loading direction and the 1- and 2-axes arbitrarily transverse. Consider any fourth-rank tensor of moduli, however defined, with components crs expressed in the usual 2-index notation. Crystal symmetry reduces the number of independent moduli to six, which, together with their interrelationships, are c11 = c22, c33, c44 = c55, c12, c13 = c23, and c14 = −c24 = c56, with c66 = (1/2)(c11 − c12) and c15 = −c25 = −c46 = c14 sin 3φ/cos 3φ, where φ is the angle between the 1-axis and a line of nearest neighbors in the (111) plane; adopt φ = 0° (or any integer multiple of π/3), which thus reduces c15, c25, and c46 to zero. The method of stability analysis for [111] uniaxial loading is similar to that employed for [100] loading, and again is more direct and insightful than a simple evaluation of the principal minors of the crs matrix. With the symmetries described in the prior paragraph, the quadratic form (6) can be arranged as

[c33 δq3 + c13(δq1 + δq2)]²/c33 + [c44 δq4 + c14(δq1 − δq2)]²/c44 + (1/2)[c11 + c12 − 2c13²/c33](δq1 + δq2)² + (1/2)[c11 − c12 − 2c14²/c44][(δq1 − δq2)² + δq6²] + (c44 δq5 + c14 δq6)²/c44 (37)
by successively “completing the square” in the variables taken in a sequence appropriate to the symmetries. Thus, in the manner of expression (31) for [100] loading, expression (37) makes “self-evident” the necessary and sufficient conditions for positive definiteness of the quadratic form crs δqr δqs under [111] loading, i.e.,

(c11 + c12)c33 > 2c13² and (c11 − c12)c44 > 2c14², (38)

with

c33 > 0 and c44 > 0. (39)
At the termination of a range of notional stability, quadratic form (37) becomes positive semi-definite at a primary eigenstate, and thus can be made zero by some critical disturbance δqr, called a primary eigenmode (or primary eigendeformation), that satisfies equations (16). This semi-definiteness occurs at the first violation of either of the inequalities (38), which, in themselves, preclude an earlier violation of either of the inequalities (39). Likewise, factorization of the determinant of the matrix crs yields

det(crs) = (1/2)[(c11 + c12)c33 − 2c13²][(c11 − c12)c44 − 2c14²]², (40)
wherein vanishing of a factor is associated with semi-definiteness of the corresponding notional stability criterion in (38). The first factor vanishes at an extremum of p3 on the primary path, and the eigenmode is that of the axisymmetric path itself; i.e., a first order increment δqr that does not vary from the primary path is governed by

δp3 = [c33 − 2c13²/(c11 + c12)] δq3, (41)

with

δq1 = δq2 = −c13 δq3/(c11 + c12), (42)

all other δqr = 0. When δp3 vanishes, the eigenmode becomes

δq1 = δq2 = −c33 δq3/(2c13). (43)
Vanishing of the second factor, since double, is associated with a pair of independent eigenmodes:

δq1/c44 = −δq2/c44 = −δq4/(2c14), (44)

δq5/c14 = −δq6/c44, (45)
all other δqr = 0 in both cases. Unlike the case of [100] loading, none of the [111] loading eigenstates is invariant with respect to the choice of strain variables, which thereby attaches a special significance to atomic based simulations of the [111] loading response. Lattice statics studies can determine which eigenstate is primary and whether its location on the primary path is sensitive to the choice of variables. Isostress molecular dynamics simulations can determine whether stability is indeed lost in proximity of a primary eigenstate, the nature of the atomic mechanisms at and after the initiation of instability, whether these mechanisms are in accord with the primary eigenmodes identified above, and whether instability leads to failure or phase change.

Other writers on elastic stability have approached the subject from their own unique perspectives. Wang et al. [13] developed criteria that they tested with molecular dynamics simulations and found good agreement between theory and simulation results for a cubic crystal loaded in tension, both hydrostatically and along a cube edge. Their criteria for failure under pressure are identical to those of Hill and Milstein, and failure under volumetric expansion was associated with the vanishing of the bulk modulus κ. Moreover, under [100] tensile loading, their crystal failed in association with the c22 = c23 eigenstate described above, which, of course, is invariant with respect to choice of
coordinates and thus is invariant within the formulation of any suitable theory. Morris et al. [14] sought stability criteria suitable for “tedious” ab initio computations and provided analyses for systems that maintain fixed boundaries; they propose that this condition yields an upper limit to the theoretical strength of a crystal, although, as they indicate, instabilities that result from deformations orthogonal to the chosen deformation may be missed.
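The completed-square arrangement (37) and the doubled factor in (40) can likewise be checked numerically; the following sketch (hypothetical trigonal moduli, chosen arbitrarily) compares the decomposition (37) with the raw quadratic form for a random disturbance δqr and verifies the determinant factorization.

```python
import numpy as np

# Hypothetical moduli for [111] loading (phi = 0, so c15 = c25 = c46 = 0).
c11, c33, c44, c12, c13, c14 = 2.0, 1.7, 0.8, 1.0, 0.6, 0.3
c66 = 0.5 * (c11 - c12)

C = np.zeros((6, 6))
C[0, 0] = C[1, 1] = c11
C[2, 2] = c33
C[3, 3] = C[4, 4] = c44
C[5, 5] = c66
C[0, 1] = C[1, 0] = c12
C[0, 2] = C[2, 0] = C[1, 2] = C[2, 1] = c13
C[0, 3] = C[3, 0] = c14
C[1, 3] = C[3, 1] = -c14
C[4, 5] = C[5, 4] = c14          # c56 = c14

rng = np.random.default_rng(0)
q = rng.standard_normal(6)       # an arbitrary disturbance (dq1, ..., dq6)

# Completed-square arrangement (37).
form37 = ((c33 * q[2] + c13 * (q[0] + q[1])) ** 2 / c33
          + (c44 * q[3] + c14 * (q[0] - q[1])) ** 2 / c44
          + 0.5 * (c11 + c12 - 2 * c13**2 / c33) * (q[0] + q[1]) ** 2
          + 0.5 * (c11 - c12 - 2 * c14**2 / c44)
                * ((q[0] - q[1]) ** 2 + q[5] ** 2)
          + (c44 * q[4] + c14 * q[5]) ** 2 / c44)
assert np.isclose(q @ C @ q, form37)

# Determinant factorization (40): the second factor enters squared.
f1 = (c11 + c12) * c33 - 2 * c13**2
f2 = (c11 - c12) * c44 - 2 * c14**2
assert np.isclose(np.linalg.det(C), 0.5 * f1 * f2**2)
print("quadratic form (37) and factorization (40) verified")
```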
2. Large Strain Mechanical Response
In this section, the mechanical behavior of crystals under large elastic strains is explored through various avenues, which include analyses of specific atomic based LS computations, considerations of crystal symmetry and of the essential nature of interatomic forces (i.e., repulsive and attractive, respectively, at small and large interatomic spacing), and examinations of available experimental evidence. Lattice statics model computations of primary paths, bifurcation points, and secondary paths that branch from the primary paths under homogeneous eigendeformations are discussed within the framework of the stability analyses presented in the prior section. Lattice statics computational results, together with stability theory, provide the bases for understanding inhomogeneous branching observed in IMD simulations, as is discussed in the final section of this article. Lattice statics simulations based upon empirical and semi-empirical atomic models (i.e., pair potentials, embedded atom methods, and quantum mechanically based pseudopotentials) are suitable for our present purpose, which is to elucidate a broad range of qualitative and semi-quantitative phenomena, rather than to delve into more complex ab initio models that currently are unsuitable for use in large scale IMD simulations. Here, we consider first the topic of cubic crystals under hydrostatic pressure, after which, uniaxial and shear loading responses are examined.

Apparently the first model computations of the bulk and shear moduli of cubic crystals under hydrostatic pressure, defined by Eqs. (17)–(19), are those of Milstein and Hill [7, 8, 15, 16]. They computed these moduli for the entire family of Morse function fcc, bcc, and simple cubic (sc) crystals under hydrostatic compression and expansion; they also determined the domains of stability according to relations (21) and identified the eigenmodes at the domain limits. In a Morse model crystal, the interaction energy φ(r) between any two atoms separated by a distance r in the crystal is

φ(r) = D{exp[−2α(r − rO)] − 2 exp[−α(r − rO)]}; (46)

the internal energy E is then obtained by summing over a sufficient number of pairwise interactions to obtain convergence, and the pressure and elastic moduli are computed from lattice summations containing derivatives of φ(r).
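A minimal sketch of such a lattice sum is given below (written for this article; the parameters D, α, and rO are placeholders, not a fit to any metal). It evaluates the Morse energy per atom of an fcc crystal by direct summation over all neighbors within a cutoff, which is the basic ingredient from which the pressure and moduli follow by differentiation with respect to the lattice parameter.

```python
import numpy as np

def morse(r, D=1.0, alpha=1.5, r0=1.0):
    """Pairwise Morse energy, Eq. (46)."""
    x = np.exp(-alpha * (r - r0))
    return D * (x**2 - 2.0 * x)

def fcc_energy_per_atom(a, rcut=6.0, **params):
    """Energy per atom of an fcc crystal with cubic lattice parameter a,
    summed over all neighbors within the cutoff rcut."""
    basis = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
    n = int(np.ceil(rcut / a)) + 1
    cells = np.array([[i, j, k]
                      for i in range(-n, n + 1)
                      for j in range(-n, n + 1)
                      for k in range(-n, n + 1)], dtype=float)
    # positions of all neighbors relative to an atom at the origin
    pos = (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3) * a
    r = np.linalg.norm(pos, axis=1)
    r = r[(r > 1e-12) & (r < rcut)]
    return 0.5 * np.sum(morse(r, **params))  # each bond shared by two atoms

# Energy vs. lattice parameter; the minimum locates the unstressed crystal.
for a in (1.30, 1.35, 1.40, 1.45):
    print(a, fcc_energy_per_atom(a))
```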
The empirical parameters D, α, and rO are usually determined by fitting the model crystal to experimental data; the parameter β ≡ exp(αrO) is an effective potential range indicator; larger values of β yield shorter range, steeper functions φ(r). Values of log β (natural logarithm) calculated from experimental values of elastic moduli and atomic volumes range from about 3 to 8 [17].

Milstein and Hill [7, 8] also computed the Born-notional ranges of stability of the complete family of Morse function cubic crystals, for various coordinate choices, according to relation (7), which, by (22) or (23), is seen to be equivalent to c11 + 2c12 > 0, c11 − c12 > 0, and c44 > 0. Their results clearly demonstrated quantitatively the failure of the Born criterion to describe adequately the ranges of stability of crystals in a constant hydrostatic environment. For example, Fig. 1 compares the domains of classical stability (indicated as “exact”) with the notional domains of Born stability for the particular case in which the geometric coordinates are the components of Green's strain. In the Morse model, each of these domains depends uniquely upon the parameter β; in fact, all dimensionless elastic properties of a Morse model crystal depend uniquely upon β. Among the divergences between the exact and notional criteria are (i) the notional G-stable range of sc crystals exists only in compression, whereas the exact criteria show that the sc crystal is stable only in hydrostatic tension (IMD simulations of Ref. [18] also verified that the sc Morse model crystals are stable in a range of hydrostatic expansion, and lose stability where predicted by the LS computations), (ii) the range of classical stability of fcc crystals increases monotonically as log β decreases, whereas the corresponding G-stable range “peaks” near log β = 6, and (iii) the classical range of stability is much smaller than the G-stable range for bcc crystals with large values of log β, and vice versa for small values of log β.

The “universal map” of classical stability domains of the Morse function cubic crystals has provided an effective basis for studying bcc to hexagonal close packed (hcp) transformation mechanisms in IMD simulations, as will be discussed in the final section of this article. The map was determined by computing the loci of states, λ = Λ, at which κ, µ, and µ′ vanish. Subscripts F, B, and S identify the moduli for the fcc, bcc, and sc lattices, respectively. Since the dimensionless elastic properties of each crystal structure depend uniquely upon β, the values of λ at which the moduli vanish likewise depend solely upon β and crystal structure. Figure 2 shows the dependencies of the Λ-values upon log β; the shear modulus µB of bcc crystals exhibited two zeros, ΛµB(L) and ΛµB(R), where “L” and “R” designate the left- and right-hand zeros, respectively. In this model, the fcc crystals are seen to remain stable under arbitrary hydrostatic compression and to lose stability in tension when κ = 0. Stability of the bcc crystals is terminated when κ = 0 or when µ = 0, depending on whether log β is less than or greater than about 3.91, respectively, and stability of these crystals is lost under tension or compression, depending on whether log β is less than or greater than 4.517, respectively.
Elastic stability criteria and structural bifurcations
1239
Figure 1. Domains of classical stability in a hydrostatic environment, and of convexity of the strain energy relative to the Green variables, for the Morse family of (a) fcc crystals and (b) bcc and sc crystals. The lattices are classically stable according to (21) in the regions indicated as “EXACT,” and notionally stable according to (7), with the Green variables, in the regions indicated as “G-STABLE”. The stretch λ is the length of any fiber in the crystal in its current state divided by its length in the absence of pressure P; λ > 1 indicates hydrostatic tension (P < 0) while λ < 1 signifies compression; the potential range indicator is log β = αrO. From Ref. [8].
Figure 2. Values of stretch ΛαA at which the bulk modulus (α = κ) and shear moduli (α = µ, µ′) of Morse model fcc, bcc, and sc (A = F, B, S) crystals vanish as functions of the potential range indicator log β; µB and µS are positive above ΛµB(R) and ΛµS, respectively; all other moduli are positive at stretches below their corresponding Λ-curve: (a) over a wide range of values and (b) enlarged view, over a more limited range. From Ref. [7].
Following the work of Milstein and Hill [7, 8, 15, 16], Rasky and Milstein [19] derived analytic formulae for computing the elastic moduli of cubic metals described by quantum mechanically based pseudopotential models under axial loadings, and formulated specific pseudopotential models appropriate for the alkali metals, based on the Heine–Abarenkov local model potential and the Taylor approximation for electron correlation and exchange. With but two adjustable parameters, the model was shown to provide very good agreement with nine experimentally determined properties (i.e., the binding energies, atomic volumes, elastic moduli κ, µ, and µ′, first derivatives of the three moduli with respect to pressure, and second derivatives of κ; the second derivatives of the shear moduli were also computed, but experimental data were lacking); excellent agreement between theoretical and experimental pressure–volume relations was also obtained. Subsequently, Milstein and Rasky [20] employed the pseudopotential model to compute the bulk and shear moduli of the bcc and fcc configurations of each alkali metal over extensive ranges of hydrostatic compression and expansion.

The alkali metals are known experimentally to exhibit seemingly diverse behaviors [21]. For example, at low temperatures, the heavier metals Cs, Rb, and K are bcc while Na and Li are in close-packed structures that are similar to fcc with periodic stacking faults; such close-packed structures evidently differ little in energy from the fcc phase. Indeed, cold working of Li below 75 K produces fcc. Under pressure, Cs, Rb, and K undergo bcc to fcc transitions, with the transition pressure greatest for K and least for Cs; also, experimentally, Na transforms from a close-packed structure to bcc at a relatively low pressure, and the bcc and close-packed structures coexist over a large range of pressure.

From a theoretical, computational, viewpoint, Milstein and Rasky [20] showed that the bcc to fcc transformations in the heavier alkali metals are associated with the vanishing of the shear modulus µB and the simultaneous growth of the shear modulus µF, from negative (or “weakly positive”) to “strongly positive”. For Na, however, both the bcc and fcc structures exhibit elastic stability over wide ranges of compression in the region of transition between the bcc and close-packed structures, in accord with the experimentally observed “sluggishness” in this transition. Apparently these were the first computations of shear moduli over wide ranges of compression based on a theoretical model more sophisticated than pair potential models, and the first wherein initially unstressed, stable, cubic crystals become unstable under compression as a result of a shear modulus passing from positive to negative. The usual explanation for the bcc to fcc transitions in the heavier alkali metals is that, at high pressure, the valence electrons shift from primarily an s-character to a d-character, and the s–d transfer causes the structural transitions. The work of Milstein and Rasky [20], however, shows that these structural transitions may occur as a natural consequence of lattice instabilities, without the necessity of invoking the s–d electron transfer mechanism as the
driving force for structural change. Indeed, the s–d electron transfer may be an effect, rather than a cause, of the instabilities. Figure 3 shows results of computations of the shear moduli µB and µF, the bulk moduli κ, and the difference in Gibbs energy between the phases for Na and Rb as examples. For all of the alkali metals, the moduli µ′B and µ′F (not shown in Fig. 3) are positive throughout the compression range (µ′B for Li at very high pressures is an exception). In LS computations, the Gibbs energy and the enthalpy E + PV are identical, of course, since the atomic positions are “frozen”. Additionally, the variations of pressure with stretch for the fcc and bcc phases are found to be almost identical; thus the difference in Gibbs energy between the two phases at a given pressure may be represented on a plot where the stretch λ is the independent variable, as shown in Fig. 3, and the Gibbs energy differences are essentially the differences in binding energy per atom, ΔE, since the pressure–volume products of the two phases are almost identical at a given pressure or volume.

An examination of the interplay between the Gibbs energy difference and the shear moduli variations in Fig. 3 illustrates the critical role of elastic stability in phase transformation theory. At λ = a and at λ = b, where both phases have the same Gibbs energy and are equally favored thermodynamically, the values of µB and µF are very close. Where µB is substantially greater than µF, ΔE is strongly negative, thereby favoring the bcc structure; and vice versa where µF is considerably larger than µB.

For further insight, assume that a stable bcc crystal is compressed to where its Gibbs energy just exceeds that of the fcc crystal (e.g., to a state “just beyond” point b in Fig. 3). At this state, the bcc crystal is no longer thermodynamically favored, but it is still elastically stable, so an enthalpy barrier exists that resists transformation from the bcc state on any and all transformation paths. The barrier may be overcome and the transformation may proceed along some particular path (or paths) under the influence of some finite disturbance. In the absence of sufficient disturbance, the bcc state may continue to exist indefinitely. With further increase of pressure, as the stable bcc crystal approaches the state where µB = 0 (λ = ΛµB(M)), the disturbance required to cause phase transformation diminishes; at λ = ΛµB(M), the barrier for transformation vanishes on some unique transformation path (or paths), and then an infinitesimal disturbance would trigger the transformation. Such lattice “disturbances” may include thermally activated atomic vibrations, free surfaces, and internal defects that act as stress raisers. Isostress molecular dynamics studies of thermal activation of phase transitions and fractures associated with elastic instabilities have been carried out and are underway, as is discussed later in this article. Future work should also include IMD simulations of the behavior of more realistic crystal models (i.e., those containing internal defects and/or free surfaces) as an incipient instability is approached.
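The interplay described above can be mimicked with a toy calculation. The sketch below is entirely illustrative: the two binding-energy curves are arbitrary quadratics, not alkali-metal data. It computes P = −dE/dV for each phase and locates the volume at which the enthalpy difference ΔG = Δ(E + PV) changes sign, i.e., the analog of the crossing points a and b in Fig. 3.

```python
import numpy as np

# Toy binding-energy curves E(V) per atom for two phases (arbitrary units;
# hypothetical offsets, not data for any alkali metal).
def E_bcc(V):
    return 0.5 * (V - 1.02) ** 2 - 0.002   # lower energy at zero pressure

def E_fcc(V):
    return 0.5 * (V - 1.00) ** 2

V = np.linspace(0.75, 1.15, 2001)
h = V[1] - V[0]

# Pressure P = -dE/dV by central differences; in lattice statics the
# enthalpy E + PV and the Gibbs energy coincide.
P_bcc = -np.gradient(E_bcc(V), h)
P_fcc = -np.gradient(E_fcc(V), h)
G_bcc = E_bcc(V) + P_bcc * V
G_fcc = E_fcc(V) + P_fcc * V

# Compare the phases at equal pressure: interpolate the fcc enthalpy onto
# the bcc pressure values (np.interp needs sorted abscissae).
order = np.argsort(P_fcc)
dG = G_bcc - np.interp(P_bcc, P_fcc[order], G_fcc[order])

crossing = np.where(np.diff(np.sign(dG)) != 0)[0]
print("Delta G changes sign near V =", V[crossing], ", P =", P_bcc[crossing])
```

Here the toy bcc phase is favored at zero pressure but loses the enthalpy comparison under sufficient compression; the crystal nevertheless remains in the bcc state until a sufficient disturbance (or a µB = 0 instability) intervenes, as described in the text.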
Figure 3. Gibbs energy difference and the elastic moduli that control stability of the bcc and fcc alkali metals in hydrostatic loading; in these figures, the Λ's terminate stability ranges; the subscripts M and L indicate mid-range and “left-hand” zeros in the shear moduli functions (1 GPa = 10¹⁰ dyn/cm²). (a) Na and (b) Rb. From Ref. [20].
In passing through the series of alkali metals, from Cs to Li, the states where λ = a, λ = b, and λ = ΛµB(M) were found to occur at progressively lower values of stretch and higher pressures; i.e., through this series, the curves µ(λ) and ΔE(λ) are shifted toward the region of higher compression. It is of particular interest to note that the states “λ = a”, where the Gibbs energy difference vanishes, occur in regions of hydrostatic tension for K, Rb, and Cs and in compression for Na and Li. Thus, although both the bcc and fcc phases of all five alkali metals are elastically stable at zero pressure, the thermodynamically preferred zero-pressure structures (i.e., the structures with the lower Gibbs energy) are indicated to be bcc for K, Rb, and Cs and fcc for Li and Na, in good agreement with experiment (i.e., as mentioned earlier, the low-temperature phases of K, Rb, and Cs are indeed bcc while Li and Na are close packed, similar to “faulted fcc”). From a theoretical viewpoint, the apparently divergent behaviors among the lighter and heavier alkali metals, as well as the increasing bcc to fcc transformation pressures that occur through the series Cs–K, are thus seen simply to be a result of subtle shifts of the oscillatory moduli and Gibbs energy functions. It is noteworthy (and perhaps contrary to one's intuition) that increasing compression stabilizes the alkalis' bcc structure, which is considered a more “open” structure than fcc; this increasing stabilization of bcc occurs over large ranges of tension and compression (e.g., where the slope of ΔE(λ) is positive in Fig. 3). Increasing pressure also stabilizes the bcc structure in the Morse model, as is seen from Figs. 1 and 2.

Further comparisons among the pseudopotential and Morse models of the alkali metals reveal similar pressure–volume relations (see Fig. 4), the condition µ′ > µ for both the bcc and fcc structures over wide ranges of compression and expansion, more than one zero in the µB-functions, and similar low pressure µB(λ)/µ′B(λ) ratios that increase initially with increasing compression (see Fig. 5). Thus, although the pseudopotential model of the alkalis is both rigorous and highly complex when compared with the Morse model (as, e.g., is evident from a comparison of the lattice summations required for moduli calculations in Ref. [19, Eqs. A5–A37] with those in Ref. [17, Eqs. 4.1–4.4]), the two models do have a number of important features in common.

Figure 4. Comparison of pseudopotential and Morse model computations of the pressures and bulk moduli of the bcc alkali metals in compression; κ(1) is the bulk modulus at zero pressure. From Ref. [19].

Figure 5. Comparison of pseudopotential and Morse model computations of the shear modulus ratio µB/µ′B of the alkali metals in compression. From Ref. [19].

Although the Morse model represents a considerable simplification of the state of interatomic interactions in most real crystals, it has been found useful for exploring qualitative and semi-quantitative phenomena, since (i) it incorporates the essential nature of interatomic interactions, (ii) it often yields good agreement with more sophisticated model computations and with experimental evidence, particularly for uniaxial loadings (as is discussed later in this section), and (iii) it is sufficiently mathematically tractable to enable lattice instabilities and post-bifurcation behavior to be studied in IMD simulations of large (realistic size) supercells. One inadequacy of the Morse model, however, is its inability to replicate relatively large values of the shear moduli ratio µB/µ′B, as are found among the bcc transition metals.

Chantasiriwan and Milstein [22] recognized a need for atomic models that accurately reproduce the elastic behavior of bcc transition metals and yet are suitable for use in large-supercell molecular dynamics simulations. This consideration led them to formulate embedded atom method (EAM) models for a number of cubic metals, including the bcc transition metals Fe, Mo, and Nb, that identically reproduce experimental values of the three second-order and six third-order elastic moduli. Thus, the initial linear (harmonic) and non-linear (anharmonic) mechanical responses of the models under arbitrary loading are dictated by experiment. In addition, (i) the pressure–volume relations of the metals are accurately modeled, (ii) the models yield good quality phonon spectra, (iii) the relative energetics between the bcc and fcc structures yields the correct low temperature, zero-stress, phase, and, for Fe, K, Na, and Rb, experimentally observed phase transitions are indicated (Cs was not modeled), (iv) the energy of the crystal and its derivatives are represented by convenient analytical forms, and (v) the lattice summations converge rapidly (generally after third- or fourth-nearest neighbor interactions), so applications are not computationally intensive (the necessity of including at least third-nearest-neighbor interactions in the EAM formulation was noted by Chantasiriwan and Milstein [23]).

Chantasiriwan and Milstein (to be published) also used their EAM models to compute the pressure dependencies of the moduli κ, µ, and µ′ of cubic crystals in LS simulations. Their LS work is intended to serve as a prelude to EAM IMD studies of lattice stability, to be carried out in due course. As an example of their EAM LS computational results, Fig. 6 shows the Gibbs energies G and the differences in Gibbs energies ΔG of the bcc, fcc, and body centered tetragonal (bct) phases of Fe under hydrostatic pressure (the bct structure will be discussed later in this section), and Fig. 7 shows the pressure dependences of the moduli κ, µ, and µ′ of the bcc and fcc structures.

Figure 6. Influence of hydrostatic pressure on the Gibbs energy per atom, G, of the bcc, fcc, and bct structures, and the Gibbs energy differences, ΔG, for the EAM model of Fe. From Chantasiriwan and Milstein, to be published.

Figure 7. Influence of hydrostatic pressure on the bulk modulus κ and shear moduli µ and µ′ of the bcc and fcc structures of the EAM model of Fe. From Chantasiriwan and Milstein, to be published.

Although the EAM model still represents a considerable simplification of atomic bonding in Fe, it does yield reasonable agreement with experiment, as well as valuable insights. Experimentally, Fe is known to be bcc at atmospheric pressure and temperatures T below 1173 K; depending on the temperature, the application of pressure induces a bcc to fcc transformation (757 K < T < 1173 K) or a bcc to hcp transformation at lower temperatures. In the EAM model, increasing pressure raises the Gibbs energy of the bcc structure relative to that of the fcc structure, with the Gibbs energy difference vanishing at about 23 GPa, after which the fcc structure has the lower Gibbs energy. The fcc phase is found to be unstable (µF < 0) up to pressures of about 8 GPa, where µF turns positive with increasing pressure. The modulus µB initially increases with increasing pressure, thus initially further stabilizing the bcc structure; however, with further increases of pressure, µB is diminished, and it eventually turns negative at about 50 GPa, causing loss of stability of the bcc structure. (A second zero in µB is found at about 120 GPa.) The initial increase in µB is “mandated” by the experimental values of the second- and third-order elastic moduli of Fe that are “built into” the model; the subsequent decrease in µB is a result that is “extracted” from the model. As µB decreases, the enthalpy barrier that acts to prevent loss of the bcc structure likewise decreases, while the driving force for phase transformation (i.e., the Gibbs energy difference) increases. The experimentally observed sluggishness of the bcc to hcp transition can be understood from the continued elastic stability of the bcc phase (µB > 0) at pressures beyond that at which the Gibbs energy difference between the phases vanishes. As will be discussed later in this article, a phase transformation path associated with a vanishing or diminishing shear modulus µB may take the bcc
crystal into either an fcc or an hcp structure, depending on whether branching from the primary (cubic) path is homogeneous or inhomogeneous.

The pseudopotential, EAM, and Morse model computational results all show that the bcc structure is “susceptible” to µ = 0 instabilities, i.e., to eigenstates of type (ii) (see after Eq. (21)); the associated homogeneous eigenmode takes the crystal structure from cubic to orthorhombic. Homogeneous branching at a µ = 0 eigenstate was first studied by Milstein et al. [24], and subsequently by Chantasiriwan and Milstein (to be published). Milstein et al. [24] began by conducting a thorough computational search of the orthorhombic states in the neighborhood of the µ = 0 eigenstates in the alkali metal pseudopotential models in order to locate all states in which the externally applied stresses and internal stresses remain in equilibrium (not necessarily a stable equilibrium) and hydrostatic after branching. (This was accomplished by calculating the internal states of stress while varying, independently, the three axial stretches λ1, λ2, and λ3.) It was then found that the only non-cubic crystallographic states in the neighborhood of the µ = 0 eigenstates that satisfied the hydrostatic condition were on a unique secondary path on which the crystal remained tetragonal; furthermore, the same hydrostatic tetragonal path branched from all of the µB = 0 and µF = 0 eigenstates, thereby linking the primary bcc and fcc paths. (One way to envision the process is to assume that we may servo-mechanically control the lattice parameters of the cubic crystals; any incremental departure from the cubic path causes the path to become non-hydrostatic, except for unique departures carried out in the neighborhood of the µ = 0 eigenstates; at a µ = 0 eigenstate, we may deform the crystal along the unique secondary tetragonal path while the internal stresses remain hydrostatic; if we “start from” the bcc structure at a µB = 0 eigenstate, we will “end up” at the fcc structure at a µF = 0 eigenstate; the secondary path thus provides a hydrostatic, homogeneous, phase transformation path between the two cubic structures.)

Figure 8 shows an example of homogeneous branching of cubic crystals under hydrostatic pressure for the case of the pseudopotential model of Rb. In this figure, all stretches are referenced to the unstressed bcc structure; i.e., λi = aib/aob (i = 1, 2, 3), where the lattice parameters of the body centered crystal at any stage are a1b, a2b, and a3b, and the lattice parameter of the unstressed bcc crystal is aob. On a primary bcc path, the stretches are always λ1 = λ2 = λ3. The fcc structure can also be described as bct, with body centered (bc) lattice parameters a2b = a3b = a1b/√2; thus the stretches at any stage on the primary fcc path vary as λ2 = λ3 = λ1/√2. Successive computational stages on the primary paths are readily specified by the incrementation δλ1 = δλ2 = δλ3 for bcc and δλ2 = δλ3 = δλ1/√2 for fcc; this ensures continued hydrostatic loading. Once cubic symmetry is broken at the branch point, however, hydrostatic states are no longer “automatically” located in such a simple manner; for a given increment δλ1 on the secondary path, the values of δλ2 and δλ3 were iteratively
Figure 8. Branching behavior under hydrostatic pressure in the pseudopotential model of Rb; the shear moduli µ of the cubic structures vanish at the marked states, at which points the secondary path branches from the primary cubic paths; on the secondary path, the crystal structure is body centered tetragonal (or equivalently, face centered tetragonal) and the state of internal stress is hydrostatic (the primary paths are of course also hydrostatic): (a) crystal geometry, (b) binding energy per atom, and (c) pressure. From Ref. [24].
varied independently to reach hydrostatic states (i.e., to where the stresses in the three principal directions became equal). It is to be emphasized that the only states where the crystal is able to depart from cubic symmetry homogeneously and still remain hydrostatic are at a shear modulus instability, and for the µ = 0 type of instability, the only homogeneous branching observed computationally is cubic to tetragonal. Furthermore, when higher order branching theory was employed, as is discussed in the next section of this article, it was proven ([24] and Chantasiriwan and Milstein (to be published)) that, for an initially cubic crystal, the only allowed symmetry-breaking, homogeneous, branching geometries are cubic to tetragonal, when µ vanishes, and cubic to trigonal, when µ′ vanishes, in agreement with computational results. Next we consider the response of cubic crystals under uniaxial loadings coincident with the principal symmetry axes. Experimentally it is known that the general elastic response of a metal, in both the linear and non-linear ranges, depends upon the "subgroup" to which the metal belongs. For example, three distinct subgroups are (i) fcc metals in general, (ii) the bcc β-brasses and alkali metals, and (iii) the bcc transition metals. For fcc metals, the Young moduli and Poisson ratios, respectively, are ordered according to E111 > E110 > E100 (this ordering also implies µ/µ′ < 1) and ν110^001 > ν100 > ν111 > 0 > ν110^(1̄10), where Ehkl is the initial ratio of stress to strain for uniaxial loading in the [hkl] crystallographic direction and νhkl^(h′k′l′) is the negative of the initial ratio of transverse strain in the [h′k′l′] direction to axial strain in the [hkl] direction under [hkl] uniaxial loading (under [100] and [111] uniaxial loadings, the transverse strain is isotropic, so the superscripts are unnecessary); fcc metals also exhibit upwardly concave stress-strain relations in [100] loading, but downward concavity in [110] and [111] loading, with the magnitude of the nonlinearity greatest for the case of [110] loading and least for [111] loading [25, 26]. (As exceptions to the above "rules," Al, Pt, and Ir lack the negative Poisson ratio in [110] loading; the µ/µ′ ratios of these metals, while smaller than unity, are considerably larger than the µ/µ′ ratios of other fcc metals (e.g., Ni, Pd, Cu, Ag, Au, Pb, Ce, and Th); and the [100] stress–strain curve of Al is concave downward.) The elastic response of the bcc alkali metals and the β-brasses is also characterized by the orderings µ′ > µ (typically, µ/µ′ is of the order 0.1), E111 > E110 > E100, and ν110^001 > ν100 > ν111 > 0 > ν110^(1̄10); but the uniaxial loading curves are concave upward in [110] loading and concave downward in [100] and [111] loading, with relatively large curvatures for [100] and [110] loadings (Milstein and Marschall, 1988, 1992). Among the six bcc transition metals V, Nb, Ta, Cr, Mo, and W, with the exception of Ta, the linear elastic trends are "reversed" [26]; i.e., E111 ≤ E110 ≤ E100 (which implies µ′ ≤ µ) and 0 ≤ ν110^001 ≤ ν100 ≤ ν111 ≤ ν110^(1̄10) (the equalities apply to W only). The nearest neighbor atoms in a bcc crystal lie along the [111] direction, so
one might expect E111 to be greatest, as it is in the group (ii) bcc metals, particularly when compressive loading is considered; the apparent "anomaly" in the bcc transition metals is evidently caused by localized directional bonding effects which are known to be significant owing to d-orbital electron interactions. Body centered cubic Fe is found to be intermediate to the two groups of bcc metals; its moduli are ordered as in the group (ii) metals, but its shear modulus ratio µ/µ′ is about 3 or 4 times that of the alkali metals and β-brasses. The experimentally observed moduli orderings, "upward concavities", and negative Poisson ratios of the group (i) and group (ii) metals are, in fact, a direct consequence of the existence of multiple stress zeros on the stress–strain curves and of bifurcation phenomena that have been revealed in LS computations of large strain uniaxial loading curves; the multiple stress zeros and bifurcation phenomena, in turn, are a consequence of crystal symmetries and general characteristics of atomic bonding. As a result, LS computations based on even relatively simple atomic models, such as the Morse model, can be edifying; such models are also useful for exploring qualitative and semi-quantitative behavior of the group (i) and (ii) metals in IMD simulations. Figure 9 illustrates crystal symmetry under [100] uniaxial loading. Figure 9a shows a lattice that may be considered as initially unstressed bcc; under [100] uniaxial load, on the primary path, the body centered lattice parameters become a1 ≠ a2^b = a3^b; the left hand portion of this figure also shows a bold-lined crystallographic cell that is initially face centered tetragonal (fct)
Figure 9. Two ways of viewing initially cubic crystals under [100] and [110] uniaxial loads: (a) face centered cells (bold lines) in a body centered lattice structure and (b) body centered cells (bold lines) in a face centered lattice structure. From Ref. [32].
and remains fct under uniaxial load, with lattice parameters a1 ≠ a2^f = a3^f, in general, on the primary path. Analogously, Fig. 9b shows four crystallographic cells of a lattice that may be considered unstressed fcc initially, becoming fct, with a1 ≠ a2^f = a3^f, under load on the primary path. The left-hand bold-lined cell in Fig. 9b is bct, a1 ≠ a2^b = a3^b, in general. Figure 9 thus illustrates the fact that the primary paths of [100] loading of bcc and fcc crystals are identical, with the crystal residing in the fcc state when a2^b = a3^b = a1/2^(1/2) and in the bcc state at a2^f = a3^f = 2^(1/2) a1. General considerations of crystal symmetries and atomic bonding then require the primary path of uniaxial [100] loading to contain three states of zero stress, including a "special" unstressed tetragonal state, as demonstrated by Milstein [27]. That is, since the stress in a cubic state is hydrostatic, a uniaxial load must vanish in cubic states. The existence of two zeros on the primary path implies a third, since the load l1 must be compressive (i.e., negative) when the lattice parameter a1 is arbitrarily small, and l1 must be tensile (positive) when a1 is sufficiently large. When subjected to hydrostatic pressure, the initially unstressed tetragonal crystal may eventually reach a state where it becomes body centered or face centered cubic; at such states, the shear modulus µ of the cubic crystal necessarily vanishes; this occurrence is seen in Fig. 8 at the states where µ = 0; additionally, the pressure variation of the difference in the Gibbs energies of the cubic and tetragonal structures becomes stationary at these states, as is seen in Fig. 6. Numerous LS model computations of the [100] loading response of cubic crystals have been carried out for various atomic models, including pair potentials, pseudopotentials, EAM models, and ab initio methods. General features common to such computations are illustrated in Fig. 10, which shows the [100] uniaxial compressive loading response of initially unstressed Morse model fcc crystals. Paths connecting bcc and fcc states via a tetragonal lattice distortion are called Bain transformation paths. The paths in Fig. 10 comprise a special class of Bain transformations, in that they connect two unstressed cubic states on a path of minimum energy barrier (that reaches its maximum value at the intermediate unstressed state). On other Bain paths (e.g., those under constant volume, constant transverse stretch, or other constraints), the energy barrier for transformation between the fcc and bcc structures is greater; see Ref. [28] for further discussion of this topic. In Fig. 10, the crystals are unstressed fcc at λ1 = λ2 = λ3 = 1; they pass through the unstressed tetragonal states T; they are unstressed bcc at the remaining zeros (that occur at λ2/λ1 = 2^(1/2)). The bcc structures also all occur in the neighborhood of λ1 = 0.79; this occurrence is readily understood from the consideration that the bcc and fcc volumes per atom are generally nearly the same. That is, if the bcc and fcc atomic volumes were identical, λ1λ2² would be unity in the bcc state, which then would occur at λ1 = (1/2)^(1/3) = 0.794. The volume per atom in the unstressed tetragonal states T may vary widely when compared with its value in the unstressed cubic states, and if the unstressed fcc structure is stable
Figure 10. Mechanical responses of Morse model fcc and bcc crystals on the [100] uniaxial loading paths connecting the unstressed fcc and bcc structures; the stretch λ1 in this figure is the current value of the lattice parameter a1 divided by its value in the fcc state; the stress σ1 is the true stress (axial load divided by current transverse area) which is normalized by dividing by the corresponding value of the initial Young’s modulus specific to [100] loading. From Ref. [17].
(as it is for the complete family of Morse model crystals), the states T may occur either at the central or at the "left-hand" zero on the primary path; the structure at the central zero is necessarily unstable, owing to its falling load characteristic and its associated occurrence at a local energy maximum on the primary path. As mentioned earlier, unstressed Morse model bcc crystals are stable only for log β < 4.517; if log β = 4.517, the unstressed bcc and tetragonal states coincide, and the primary loading path is tangent to the abscissa at that state; the corresponding structure lacks stability owing to its vanishing Young's modulus. With reference to Fig. 10, the experimentally verified concavities of the
[100] stress–strain curves of stable fcc crystals (upward) and stable bcc crystals (downward) are readily understood to be a consequence of crystal symmetries on the primary [100] loading path. Figure 10 also shows the locations of the invariant c22 = c23 eigenstates reckoned to the axes of the bct crystallographic cell. For initially stable Morse model crystals, these eigenstates are always primary eigenstates found in a region of compression. If the unstressed bcc crystal is unstable (i.e., log β > 4.517), these c22 = c23 states are embedded in the unstable region between the bcc structure and the local stress minimum on the primary path. When reckoned to the fct cell axes, the Morse model c22 = c23 eigenstates are likewise always primary, but are found in a region of tension (not shown in Fig. 10) between the fcc structure and the maximum value of the load l1. Stationarity of the generalized force p1 also coincides with primary eigenstates for the bcc crystal in [100] tension (where p1 achieves a local maximum) and for the fcc structure under [100] compression (where p1 is at a local minimum). A wide range of models that are more sophisticated than the Morse model also yield c22 = c23 eigenstates in a tensile range between a stable fcc state and the maximum tensile load and in a compressive range for stable bcc crystals. For example, Fig. 11 shows pseudopotential model computations of the variations of c22, c23, and c44 on a primary [100] loading path. In this figure, the unstressed bcc, bct, and fcc states are denoted by B, T, and F, respectively. For the moduli reckoned to the fct crystallographic axes (Fig. 11a), c22 is seen to decrease and c23 to increase with increasing axial stretch, whereas the opposite occurs when these moduli are reckoned to the bct axes (Fig. 11b). A comparison of the a and b parts of Fig. 11 also points up the fact that the c22 = c23 eigenstate reckoned on the fct axes and the c44 = 0 eigenstate on the bct axes coincide, and vice versa, regardless of atomic model. (If the calculations in this figure were to have been made with a pair potential model, the c44 = 0 eigenstates would also have coincided with c23 = 0.) From the experimental viewpoint, variations of the moduli c22 and c23 on the primary [100] path may be calculated from measured second and third order elastic moduli; such calculations invariably show that, for fcc crystals, c22 decreases and c23 increases with increasing tension, while for bcc crystals, c22 decreases and c23 increases with increasing compression, in agreement with atomic based lattice model computations. Thus there is an apparent invariance associated with the locations of these eigenstates on the primary [100] paths. The unstressed tetragonal state T is necessarily "to the left" of a stable fcc state on a primary [100] loading path. Conversely, if state T is "to the right" of the fcc state on this path, the centrally located fcc structure is unstable. In the EAM model of Chantasiriwan and Milstein, fcc Fe is unstable at zero pressure owing to the negative modulus µ (see Fig. 7); accordingly, the fcc state occurs at the central zero on the [100] loading path. However, at a hydrostatic pressure of about 8 GPa, the shear modulus µ of the fcc structure in the EAM model
Figure 11. Variation of elastic moduli on the primary [100] loading path of Rb in the pseudopotential model; the upper abscissa scale is the current value of the lattice parameter a1 divided by its value in the unstressed fcc state F; the lower abscissa scale is the current value of a1 divided by its value in the unstressed bcc state B; the moduli are reckoned to (a) the face centered tetragonal axes and (b) the body centered tetragonal axes. From Ref. [32].
passes from negative to positive, at which state the fcc and hydrostatic bct structures coincide. This behavior is observed in Fig. 12, which shows EAM model computations of Bain transformation paths occurring under constant transverse compressive stress, σ2 = σ3 = −P. (In this figure the uppermost curve, at P = 0, is a "true" [100] uniaxial loading path.) Three hydrostatic states are found on each path in Fig. 12, i.e., the bcc states (shown as "diamonds"), the fcc states (squares), and the hydrostatic bct states (circles). The central, unstable, hydrostatic state is occupied by the fcc structure at pressures below about 8 GPa, by the bct structure at pressures between about 8 and 50 GPa, and by the bcc structure at pressures over approximately 50 GPa. With reference to Fig. 7, it is seen in Fig. 12 that, on the Bain paths of constant transverse stress, the cubic states "exchange positions" with the hydrostatic bct state at pressures causing the respective shear moduli µ of the cubic crystal to vanish. Cubic crystals under [111] uniaxial loading exhibit load versus axial stretch responses that are qualitatively similar to the [100] responses discussed above, although, for the [111] loading case, the ordering of the three unstressed states on the primary path is invariant. That is, symmetries on the primary [111] loading path require the unstressed bcc, sc, and fcc structures to occur successively
Figure 12. Simulation of [100] axial loading of the EAM model of Fe in a constant hydrostatic environment. On each path, initially the crystal may be presumed to be in one of the cubic states, which is necessarily hydrostatic, under a pressure P. An additional axial stress σ1 is then applied, causing the path dependent state of stress in general to be σ1 ≠ σ2 = σ3 = −P = constant. The path then traverses three states that are under the same hydrostatic pressure P, i.e., bcc (indicated by a "diamond"), fcc (a circle), and the hydrostatic bct structure (a square). Pressures P are in the range showing "cross over" of (a) hydrostatic bct and fcc structures at the states µF = 0 and (b) hydrostatic bct and bcc structures at the states µB = 0. (1 Mbar = 100 GPa). From Chantasiriwan and Milstein, to be published.
at the states where the ratios of axial to transverse stretch are themselves in the ratio 1:2:4, as demonstrated by Milstein et al. [11]. The inherent instability of the unstressed sc structure can be understood from its location at the central zero on the [111] primary path. Similar behavior (i.e., three stress zeros and two energy minima) was found in the [0001] loading response of hcp crystals, thus suggesting the possible existence of a new crystal structure analogous to bcc, but with two atoms in the primitive basis [29]. The uniaxial [111] and the [0001] cases have fundamental similarities and differences; both have a uniaxial load applied normal to similar hexagonal planes of atoms, but with differing stacking orders of the planes (i.e., ABCABC... for [111] and ABAB... for [0001], in the usual notation); although crystal symmetry requires the existence of the three zeros in the [111] loading case, no such requirement is evident for the case of [0001] loading of hcp crystals.
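These 1:2:4 stretch ratios are purely geometric, and they are easy to check numerically. The following minimal sketch (written for this discussion, not code from Ref. [11]; all names are local to the sketch) applies an axial stretch s along [111], at unit transverse stretch, to the bcc primitive vectors and confirms that the deformed lattice is simple cubic at s = 2 and fcc at s = 4:

```python
import numpy as np
from itertools import combinations

# [111] loading axis and its axial projector
n = np.ones(3) / np.sqrt(3.0)
P = np.outer(n, n)

# bcc primitive vectors (cubic lattice parameter = 1)
bcc = 0.5 * np.array([[-1.0,  1.0,  1.0],
                      [ 1.0, -1.0,  1.0],
                      [ 1.0,  1.0, -1.0]])

for s, name in [(2.0, "sc"), (4.0, "fcc")]:
    F = (np.eye(3) - P) + s * P      # transverse stretch 1, axial stretch s
    v = bcc @ F.T                    # rows are the deformed primitive vectors
    L = np.linalg.norm(v, axis=1)
    cos = [float(v[i] @ v[j] / (L[i] * L[j]))
           for i, j in combinations(range(3), 2)]
    print(f"s = {s}: lengths {np.round(L, 6)}, cosines {np.round(cos, 6)} -> {name}")

# s = 2 yields equal lengths with mutual cosines 0 (sc primitive vectors);
# s = 4 yields equal lengths with mutual cosines 1/2 (fcc primitive vectors).
```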
The first LS model computations of the full [111] uniaxial loading path are those of Ref. [11], based on a Morse model of fcc Ni. Subsequently, Milstein and Chantasiriwan [2] and Chantasiriwan and Milstein (to be published) computed the [111] loading behavior of their EAM metal models. As an example of computational results, Fig. 13 shows the [111] loading response of the EAM Fe model discussed earlier. Note that, although the fcc form of Fe appears at the central zero on the [100] path owing to its negative µ-value, it appears at the right-hand zero on the [111] path, as required by crystal symmetry. These behaviors also point up the important fact that a crystal structure may lie at an energy minimum on a particular path, while it is at an energy maximum and hence unstable with respect to deformation along another path. Unstressed sc
Figure 13. Mechanical response of Fe to [111] uniaxial loading in the EAM formulation. The stretch λ3 (coaxial with the [111] direction) and transverse stretch λ2 are referenced to the unstressed bcc state B; the crystal necessarily becomes unstressed sc (S) and fcc (F) at values of λ3/λ2 = 2 and 4, respectively [11]. Variation, with axial stretch, of (a) the (isotropic) transverse stretch and percent change of volume (from the volume VB of the bcc structure), (b) axial load L3, axial true stress σ3, and internal energy E, and (c) elastic moduli C33 and C44 and expressions R1 = (C11 + C12)C33 − 2C13² and R2 = (C11 − C12)C44 − 2C14² (see relations (38) and (39)); the moduli here are those defined by (49), with the modification C14 = (a1/V) ∂²E/∂a1∂a4 (a4 is the angle between the 2- and 3-axes), which yield the same domains of notional stability as the M-moduli. From Chantasiriwan and Milstein, to be published.
crystals have a more "open" structure than the unstressed bcc, fcc, or bct crystals; for example, for the Fe model, the atomic volume of the unstressed bct structure is about 1% greater than that of the bcc and fcc structures, whereas, as seen in Fig. 13a, the volume per atom of the sc crystal is more than 10% greater than that of the bcc and fcc crystals. Accordingly, the energy and stress barriers for transitions between the bcc and fcc states on the [111] loading path are naturally greater than the corresponding barriers on the [100] path, which "explains" the experimental results, for both the bcc and fcc structures, that E111 > E100 (this condition, in turn, requires the orderings E111 > E110 > E100, based on general rules of anisotropic elasticity), that the nonlinearities in [111] loading are smaller than in [100] loading, and that the stress–strain responses under [111] loading are concave downward. (The downward concavity of the stress–strain curves of a stable bcc crystal in both [111] and [100] loading is a natural consequence of the structure's position at the left hand zero on the paths, where there is no required stress zero residing to "its left", in compression; by comparison, the symmetry-imposed stress zero that lies "to the left" of a stable fcc crystal on both the [100] and [111] loading paths imposes an inflection point on the paths in the neighborhood of the fcc state; the relatively small stress barrier for the fcc to bcc transition on the [100] path "pushes" the inflection point into the tensile region so the [100] loading response is concave up; the much larger compressive stress barrier for the fcc to bcc transition in [111] loading enables the inflection point to reside in the region of compression, thereby yielding an initial stress-strain curve that is concave down.) The notional ranges of stability under [111] loading of all of the initially stable fcc metals that were studied with the EAM model (Li, Na, K, Rb, Cu, Ag, Au, Al, and Ni), as well as of the Morse Ni model, in both tension and compression, were found to be terminated at violations of the second inequality in (38), prior to the respective maximum and minimum loads and, as mentioned earlier, within the group of G-, S-, and M-variables, the notional stability limits were fairly insensitive to the choice of variables. Similarly, for all of the bcc EAM model metals (Li, Na, K, Rb, Fe, Mo, and Nb) that were studied in [111] tensile loading, the notional stability limits were relatively insensitive to choice of strain variables and, with the exception of Mo, notional stability was terminated at violations of the second inequality in (38). In [111] compression, however, some of the bcc metals tended to remain notionally stable up to very large stress magnitudes (when compared with the maximum tensile stresses on the loading paths), where divergences among the diverse notional stability criteria may be expected. Among the bcc EAM models subjected to high [111] compressive stresses, primary eigenstates associated with violations of both the second and the first of the inequalities in (38) were found; since the latter type of notional instability is associated with a stationary value of p1, this result is likely model-dependent. The stresses at
the limits of the notional stability domains in [111] bcc compression were, in some cases, found to be strongly dependent on the choice of strain variables; this result implies that the Born criterion is unlikely to be of general value for assessing stability of bcc crystals in [111] compressive loading. Crystal symmetry on the primary path of [110] loading of an initially unstressed fcc crystal is body centered orthorhombic (bco), while symmetry on the primary path of [110] loading of an initially unstressed bcc crystal is face centered orthorhombic (fco). Behavior on the primary [110] paths of bcc and fcc crystals can best be understood in terms of the secondary orthorhombic branch paths that emanate from the primary [100] path at the c22 = c23 eigenstates, since, as shown by Milstein and co-workers, the primary [110] uniaxial loading paths are identical to these secondary orthorhombic paths, and the path branchings profoundly influence the mechanical responses on the [110] paths. Crystal symmetry under [110] loading can be understood with reference to Fig. 9 also. For this purpose, assume that the bold-lined face centered (fc) cell in the right-hand portion of Fig. 9a is initially unstressed fcc; [110] loading of the fcc crystal is then accomplished by applying a uniaxial load as illustrated in the figure; the initially bc cells in Fig. 9a then become bco. Analogously, if the bold-lined bc cell in the right hand portion of Fig. 9b is initially unstressed bcc, [110] uniaxial loading of an initially unstressed bcc crystal is indicated, and the fc cells in Fig. 9b become fco. Now, let us "return" the bc cells in Fig. 9a to their unstressed bcc state (or the fc cells in Fig. 9b to their unstressed fcc state) and apply a compressive (or tensile) [100] uniaxial load in the direction shown. On the primary [100] path, under compression, a1 decreases and a2^b = a3^b increase (or under tension, a1 increases and a2^f = a3^f decrease); at the point of bifurcation, to first order on the secondary path, δa1 = 0, δa2 = −δa3, so a2 and a3 vary much faster than a1, at least initially, on the secondary paths. Without loss of generality, we may assume that a2^b decreases and a3^b increases when a1 decreases (or that a2^f increases and a3^f decreases when a1 increases); then a2^b (or a2^f) will again become equal to a1; when this occurs, the load in the two-direction, which by definition is zero, must equal the load in the one-direction, so the crystal passes through an unstressed state on the orthorhombic secondary path; one unstressed state, however, implies a second, since the load must be compressive (or tensile) as a1 decreases further (or as a1 continues to increase). One of these stress zeros on each secondary path is naturally cubic, oriented as shown by the right-hand bold lined cells in Fig. 9. Branching of a crystal structure under uniaxial load was first observed computationally by Milstein and Huang [30] in their study of [110] loading of the fcc Morse model Ni crystal; they computed the path dependent axial and transverse strains, axial load and stress, energy, and elastic moduli and found the [110] bco path to branch from the primary tetragonal [100] loading path, under dead load, at the c22 = c23 eigenstate (with the moduli defined relative
to the bc crystallographic axes). For the Morse Ni model, log β = 6.288; thus the branch point was embedded in the unstable, compressive, region of the [100] path (the location of the branch point may be inferred from Fig. 10). The [110] path also contained the unstressed tetragonal state that is found on the primary [100] path, differently oriented. Milstein and Farber [31] employed similar LS model computations to study the analogous, but distinct, branching from the [100] tetragonal path to the secondary bco path that occurs at the point where c22 = c23 when the moduli are reckoned to the fct structure; there the branch point terminated a stable tensile region of the primary [100] path of an initially stable fcc crystal subjected to [100] tensile loading. Again, the secondary path passed through the same unstressed tetragonal state found on the primary [100] path, differently oriented. Milstein and Farber employed general theoretical arguments, supported by their computations, to show that the unique bco branch path (that remains under strict uniaxial load) takes the crystal into the unstressed bcc structure, which is oriented with the uniaxial load coincident with the [110] direction of the bcc crystal. In a review article, Milstein [17] presented a unified description of [100] and [110] uniaxial loading of bcc and fcc crystals that incorporated the branchings from the [100] path (considered as primary) to both [110] paths (considered as the secondary branch paths). Subsequently, the generalized behavior proposed by Milstein, based mainly on symmetry arguments, was verified in both pseudopotential [32] and EAM (Chantasiriwan and Milstein, to be published) model LS computations. Examples of these computational results are shown in Fig. 14 (for the Rb pseudopotential model) and Fig. 15 (for the Fe EAM model). In Figs. 14 and 15, the primary [100] path is represented by solid lines and the secondary [110] paths by dashed lines; the locations of the unstressed fcc, tetragonal, and bcc states are indicated by F, T, and B, respectively; the axial stretch λ1 is referenced to the unstressed bcc structure on the primary path; and the superscripts b and f indicate variables reckoned, respectively, to the bc and fc crystallographic axes (e.g., the transverse stretches λ2^b and λ3^b on the left hand branch paths are computed on the bco axes). In Fig. 15, the moduli are referenced to the bc crystal axes, and thus the right hand branch point occurs where c44 = 0. Figures 14a and 15a make evident the basis for the experimentally observed negative Poisson ratio ν110^(1̄10), which occurs naturally as a result of the bifurcation characteristic; i.e., that a2^b decreases when a1 decreases and that a2^f increases when a1 increases; the large, positive, Poisson ratios ν110^001 are readily explained by the relatively rapid variations of a3^b and a3^f on the secondary branches. The experimentally observed stress–strain concavities on the [110] paths are also readily understood from the stress–strain responses on the branch paths in Figs. 14 and 15; i.e., the right hand branch of [110] loading of the bcc crystal must "turn up" under compression, and the left hand branch of [110] loading of the fcc crystal must "turn down" under tension, in order that these paths "meet up" with the primary paths at the
Figure 14. Branching behavior under uniaxial loading of the pseudopotential model of Rb; the unstressed bcc, fcc, and tetragonal states are indicated by B, F, and T, respectively. Crystal structure is tetragonal on the primary path (solid line), which corresponds to [100] uniaxial loading of the bcc and fcc structures, and orthorhombic on the secondary branch paths (broken lines). The left-hand (body centered orthorhombic) branch contains the unstressed fcc structure with its [110] axis parallel to the uniaxial load and the right-hand (face centered orthorhombic) branch contains the unstressed bcc structure with its [110] axis parallel to the uniaxial load. The secondary paths are thus identical to the [110] loading paths of the cubic crystals: (a) transverse stretch, (b) variation of atomic volume (referenced to the atomic volume in state B), (c) internal energy, and (d) true stress. From Ref. [32].
Figure 15. Branching between the [100] (solid lines) and the [110] (broken lines) uniaxial loading paths for the EAM model of Fe (crystal structures on the respective paths are as described in the caption to Fig. 14). The only stable structure at zero stress in the Fe model is bcc (B); note the falling load characteristic and corresponding local maximum of energy that renders the unstressed tetragonal state (T) unstable with respect to deformation on the right-hand branch path: (a) transverse stretch, (b) internal energy, (c) true stress. From Chantasiriwan and Milstein, to be published.
invariant c22 = c23 eigenstates. Thus, also, these [110] paths exhibit relatively large nonlinearities. Figures 16a–c compare the uniaxial loading responses of the EAM fcc metals Na, Cu, and Al and the bcc metals Na and Mo in tension on the three principal symmetry directions, and thereby explicitly demonstrate the profound influence of the crystal symmetries discussed above on uniaxial loading. The normalized maximum axial tensile stresses are relatively small and occur at relatively small values of axial stretch for bcc metals in [100] loading; this result helps to explain {100} cleavage in bcc metals, as noted by Morris et al. [14]. The fcc metals also exhibit relatively small maximum tensile stresses in [110] loading, although the greater number of slip systems available in the fcc structure evidently precludes cleavage in this mode of loading; the theoretical result does, however, suggest the possibility of cleavage of fcc crystals at low temperatures in [110] loading, where the slip systems may be less active. The large differences between the [100] and [110] loading responses of bcc and fcc crystals that are seen in Figs. 16a and b are absent in the [111] loading responses in Fig. 16c. Since the theoretical tensile [111] loading paths of bcc metals pass through unstressed sc structures that reside at significantly greater energies than the unstressed fcc, bct, or bcc structures, the stress barriers for a bcc to sc transition are generally quite large, and as noted by Milstein and Chantasiriwan [2], these stress barriers are close to the theoretical maximum stresses reached on the [110] tensile loading paths of bcc crystals, which, in turn, are governed by the inherent strength of the atomic bonds, rather than by crystallographic transformations among unstressed states or bifurcation phenomena. The major influence of crystal symmetries (and the general nature of atomic bonding) on uniaxial loadings of the fcc metals and the group (ii) bcc metals enables relatively simple models of atomic bonding to provide reasonable representations of the uniaxial loading responses; for example, see Fig. 17, which compares the Morse and EAM model mechanical responses of fcc Cu in tension and compression. While the Morse model incorporates only two empirical second-order elastic moduli, the EAM formulation incorporates all three second-order and all six third-order empirical elastic moduli; nevertheless, the two models exhibit similar qualitative and semi-quantitative mechanical responses. Thus, even simple models of atomic bonding can be useful in explorations of qualitative and semi-quantitative behavior in IMD simulations, as is discussed later in this article. As a final example of the rich and diverse phenomena exhibited by crystals under large elastic deformation, Fig. 18 shows the shear stress versus shearing angle θF of fcc crystals loaded in the mode of shearing depicted in Fig. 19. Here the shearing load is constrained to remain parallel to the (100) and (010) planes in the [100] and [010] directions, which takes the initially fcc structure into a bco configuration on the primary path; no other load acts on the crystal. Figure 18 displays the unexpected result that the decrease in the shearing
Figure 16. Normalized axial stress (axial stress divided by the initial Young's modulus appropriate to the particular metal and loading direction) versus axial stretch for the fcc metals Na, Cu, and Al and the bcc metals Na and Mo under unconstrained uniaxial loadings in the three principal symmetry directions: (a) [100] loading, (b) [110] loading, (c) [111] loading. From Ref. [2].
Figure 17. Comparison of the mechanical responses of the Morse model of Cu (solid lines containing data points) and the EAM model of Cu (broken lines) in the three principal loading directions. The ordinate is uniaxial load in the [hkl] direction per unit reference area and the abscissa is axial stretch in the [hkl] direction. From Ref. [2].
angle θF (between the [100] and [010] directions), as well as all other lattice parameters, could be varied continuously only until θF had decreased by about 7.6° (i.e., to about 82.4°), after which continued decrease in θF required a first-order transition (i.e., a discontinuity in all lattice parameters), since a continuous path taking θF below about 82.4° does not exist. (There does exist a continuous path taking θF back to 90°, transforming the original fcc structure into bcc, and passing through the unstressed bct structure along the way.) These results, as well as the "bc shearing analog" (i.e., where the shear forces are maintained parallel to the faces of the bc crystallographic cell and the crystal becomes fco on the primary path) are discussed in greater detail by Milstein [17].
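Lattice-statics computations of the sort that underlie results such as Figs. 10 and 18 reduce, for a pair potential, to direct lattice sums over the deformed crystal. The sketch below is illustrative only (it is not the code behind the figures): the Morse parameters are merely of the order of published Cu values, the cutoff nmax is arbitrary, and the scan follows the volume-relaxed Bain path of a bct Morse crystal, whose stationary points are the unstressed bcc (c/a = 1), fcc (c/a = 2^(1/2)), and intermediate tetragonal states discussed above.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative Morse parameters (roughly Cu-like), NOT fitted values
# from the article: well depth (eV), inverse width (1/A), minimum (A).
D, ALPHA, R0 = 0.343, 1.36, 2.87

def morse(r):
    x = np.exp(-ALPHA * (r - R0))
    return D * (x * x - 2.0 * x)

def bct_energy(a, c, nmax=5):
    """Binding energy per atom of a body centered tetragonal crystal
    (conventional cell a, a, c) by a direct lattice sum."""
    idx = np.arange(-nmax, nmax + 1)
    I, J, K = np.meshgrid(idx, idx, idx, indexing="ij")
    e = 0.0
    for dx, dy, dz in [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]:   # two sublattices
        r = np.sqrt(((I + dx) * a) ** 2 + ((J + dy) * a) ** 2
                    + ((K + dz) * c) ** 2)
        e += 0.5 * morse(r[r > 1e-9]).sum()   # factor 1/2: shared bonds
    return e

# Volume-relaxed Bain path: relax a at each axial ratio c/a. Stationary
# points of the resulting E(c/a) are unstressed states of the crystal.
for ca in np.linspace(0.95, 1.50, 12):
    res = minimize_scalar(lambda a: bct_energy(a, ca * a),
                          bounds=(2.0, 4.0), method="bounded")
    print(f"c/a = {ca:.3f}   a = {res.x:.4f} A   E = {res.fun:.5f} eV/atom")
```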
3. Role of Higher Order Moduli at Points of Bifurcation
Criteria for the elastic stability of a crystal subjected to a prescribed mode of loading may, in principle, be formulated in terms of inequalities among strain dependent second order elastic moduli and generalized forces, as is discussed earlier in this article. Loss of stability on the primary path is associated with possible bifurcation, leading to a secondary path on which the crystal may undergo phase transformation or failure. However, by analogy with branching
Figure 18. Theoretical shear stress versus shearing angle for generalized Morse model fcc Ni crystals loaded in the shear mode illustrated in Fig. 19; m = 2 corresponds to the Morse model. From Ref. [37].
theory for discrete mechanical systems, the starting direction of a secondary path is decided by a higher order specification of the post critical loading program [6]. The onset of post-critical phenomena may determine the full transformation path (in the case of phase change) or the mode of failure (if load carrying capacity is lost) or, indeed, whether the material response to instability is phase change or failure. While second order moduli are fundamental to the evaluation of stability domains, higher order moduli at the branch point are central to understanding post-bifurcation behavior. Although the development of higher order crystal stability – bifurcation theory
Figure 19. Schematic illustration of the shear mode of loading employed in the computations of Fig. 18.
evidently is essential to a full understanding of the topic of material response to load, with but a few notable exceptions ([24, 33, 34] and Chantasiriwan and Milstein, to be published), relatively little work apparently has been done in this area. Here we examine two examples of post-bifurcation behavior, analyzed in terms of higher order elastic moduli on the primary paths at the branch points, i.e., (i) homogeneous branchings of initially cubic crystals at states where a shear modulus vanishes and (ii) the tetragonal to orthorhombic branching at the c22 = c23 eigenstates. As mentioned in the previous section, the only path found computationally that branched from a cubic structure at a µ = 0 eigenstate under continuing hydrostatic conditions was a unique tetragonal path. This behavior may be understood from second order expansions (third-order in the moduli) of the axial stress increments δσi , i = 1, 2, 3, that are constrained to remain equal on the secondary path. That is, following the development of Chantasiriwan and Milstein (to be published), the axial stress increments on an arbitrary orthorhombic cell may be expressed as
δσi = (∂σi/∂aj) δaj + (1/2)(∂²σi/∂aj∂ak) δaj δak,        (47)
where the summations are over repeated indices (i, j, k = 1, 2, 3). Equations (47) may be expanded as

δσi = Cii δqi + (Cij + P)δqj + (Cik + P)δqk + (1/2)Ciii δqi² + (−Cii + Ciij)δqi δqj + (−Cii + Ciik)δqi δqk + (1/2)(−2P − 2Cij + Cijj)δqj² + (−P − Cij − Cik + Cijk)δqj δqk + (1/2)(−2P − 2Cik + Cikk)δqk²,        (48)
where no summation convention is implied, i ≠ j ≠ k, and δqi = δai/ai. The moduli in (48) are based on Rasky and Milstein's 1986 formulation [19], i.e.,
Cij = (ai aj/V) ∂²E/∂ai∂aj   and   Cijk = (ai aj ak/V) ∂³E/∂ai∂aj∂ak.        (49)
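For concreteness, the moduli (49) can be generated numerically from any energy function E(a1, a2, a3), such as an orthorhombic generalization of the Morse lattice sum sketched earlier. The following is a minimal finite-difference sketch (function names and step sizes are illustrative choices, not from the cited work):

```python
import numpy as np
from itertools import product

def C2(E, a, i, j, h=1e-4):
    """Second-order modulus C_ij = (a_i a_j / V) d2E/da_i da_j of Eq. (49)
    by a central difference; E is a callable E(a) of the three cell
    parameters (orthorhombic cell, V = a1*a2*a3); i, j in {0, 1, 2}.
    For i == j this reduces to a step-2h second difference."""
    ei, ej = np.eye(3)[i], np.eye(3)[j]
    d2 = (E(a + h*ei + h*ej) - E(a + h*ei - h*ej)
          - E(a - h*ei + h*ej) + E(a - h*ei - h*ej)) / (4.0 * h * h)
    return a[i] * a[j] / np.prod(a) * d2

def C3(E, a, i, j, k, h=2e-3):
    """Third-order modulus C_ijk = (a_i a_j a_k / V) d3E/da_i da_j da_k
    of Eq. (49) by an 8-point central difference (valid for repeated
    indices as well)."""
    d3 = sum(s1 * s2 * s3 * E(a + h*(s1*np.eye(3)[i] + s2*np.eye(3)[j]
                                     + s3*np.eye(3)[k]))
             for s1, s2, s3 in product((1, -1), repeat=3)) / (8.0 * h**3)
    return a[i] * a[j] * a[k] / np.prod(a) * d3
```

With E supplied by a lattice sum evaluated under pressure, such calls furnish the coefficients that enter Eqs. (51) and (52) below.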
Next, in Eqs. (48), incorporate the moduli symmetries of a cubic crystal (C111 = C222 = C333 and C112 = C122 = C113 = C133 = C223 = C233, along with the usual symmetries among the Cij and with respect to interchange of indices), and set

δσ1 − δσ2 = 0   and   δσ1 − δσ3 = 0,        (50)
conditions that ensure that the internal state of stress remains hydrostatic as branching occurs. The first of Eqs. (50) becomes

δσ1 − δσ2 = (C11 − C12 − P)(δq1 − δq2) + (−C11 + C112 + P + 2C12 − C123)(δq1δq3 − δq2δq3) + (1/2)(C111 + 2P + 2C12 − C112)(δq1² − δq2²) = 0,        (51)

with the analogous expression for δσ1 − δσ3 = 0. One obvious solution to the pair of equations is δq1 = δq2 = δq3, i.e., the primary path; however, if C11 − C12 − P = 0 (which is equivalent to µ = 0), a second solution is possible, viz. δq1 = δq2 ≠ δq3, which results in the crystal branching from cubic to tetragonal, with the lattice parameters on the secondary path initially varying according to

δq1/δq3 = (C112 − C111 − 2C11)/(C111 + C112 − 2C123 + 2C11 + 2C12).        (52)
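The algebra leading from (48) to (52) is easily verified symbolically. The sketch below (written for this discussion; all symbol names are local) imposes δq1 = δq2 together with the µ = 0 condition P = C11 − C12, and recovers both the cubic solution δq1 = δq3 and the tetragonal branch ratio (52):

```python
import sympy as sp

C11, C12, C111, C112, C123, P = sp.symbols('C11 C12 C111 C112 C123 P')
u, w = sp.symbols('u w')          # u = dq1 = dq2, w = dq3

def dsigma(qi, qj, qk):
    """Eq. (48) specialized to cubic symmetry: Cii = C11, Cij = C12,
    Ciii = C111, Ciij = Cijj = C112, Cijk = C123."""
    return (C11*qi + (C12 + P)*qj + (C12 + P)*qk
            + sp.Rational(1, 2)*C111*qi**2
            + (C112 - C11)*(qi*qj + qi*qk)
            + sp.Rational(1, 2)*(C112 - 2*P - 2*C12)*(qj**2 + qk**2)
            + (C123 - P - 2*C12)*qj*qk)

# dsigma1 - dsigma3 with dq1 = dq2 = u and dq3 = w, at mu = 0
expr = (dsigma(u, u, w) - dsigma(w, u, u)).subs(P, C11 - C12)
roots = sp.solve(sp.Eq(expr, 0), u)      # quadratic in u
print([sp.simplify(r / w) for r in roots])
# Output: the cubic solution u/w = 1 and the branch ratio of Eq. (52),
# (C112 - C111 - 2*C11)/(C111 + C112 - 2*C123 + 2*C11 + 2*C12).
```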
Chantasiriwan and Milstein similarly analyzed homogeneous branching at the µ′ = 0 eigenstate and found the branch path to have trigonal symmetry. The cubic to tetragonal branch path that emanates from the µ = 0 eigenstate is therefore the only secondary path that can occur homogeneously if the internal state of stress is maintained hydrostatic and, as illustrated in Fig. 8, the path has been found to connect primary bcc and fcc paths at their respective µ = 0 eigenstates. This mode of branching is equivalent to the uniform shearing of (110) planes of the cubic crystal in the [1̄10] direction; the secondary path thereby reveals a mechanism for transformations between bcc and fcc structures by uniform shearing under hydrostatic pressure. By contrast, when shearing of (110) planes in a bcc crystal occurs inhomogeneously, wherein alternate (110) planes shear in [1̄10] and [11̄0] directions relative to each other (as was observed in the IMD simulations of Zhao et al. [18]), the transformation is bcc to hcp. Experimentally, pressure induced phase transformations between bcc and fcc structures have been observed in the metals K, Rb, Cs, Ca, Sr, Tl, and Fe, while transformations between bcc and hcp structures under pressure have been observed in Be, Mg, Ba, Tl, Ti, Zr, and Fe. Questions
such as "What factors may cause shearing to occur either homogeneously or inhomogeneously in a bulk sample of a cubic crystal under pressure at a shear modulus instability?" and "Are these factors linked to the elastic moduli (particularly the higher order terms) of the crystal at the branch point?" appear ripe for further investigation. Hill [33] developed general theory for higher order constitutive branching in elastic materials, and discussed branching from the primary tetragonal to the secondary orthorhombic path at the c22 = c23 eigenstate as a special case. As a note of possible "historical interest", some years ago, Hill and Milstein jointly investigated the mathematical and numerical character of the secondary path at the branch point with an aim of developing an expression for the variation of load, dl1/dλ1, on the secondary path, as a function of the elastic moduli on the tetragonal path. Initially, a third-order (in the moduli) theoretical formula failed to satisfy the computational results, leading us to question the computations. (Two independent calculations of dl1/dλ1, carried out as follows, gave divergent results: (i) higher-order moduli calculated from Morse model lattice summations were put into the theoretical formula for dl1/dλ1 and (ii) the ratio δl1/δλ1 was computed from finite differences on the branch path.) Subsequently, the following "fourth-order" formula derived by Hill gave "exact" agreement.
dl1/dλ1 = c11 − c12²/c22 − [2c22(c123 − c122) + c12(c222 − c223)]² / {c22²[(2/3)c22(c2222 − 4c2223 + 3c2233) − (c222 − c223)²]},        (53)

where the moduli in (53) are

cij = ∂²E/∂λi∂λj = ∂li/∂λj,   cijk = ∂²li/∂λj∂λk,   cijkl = ∂³li/∂λj∂λk∂λl.        (54)
Equation (53) was derived by expanding the incremental change δli of the load li to third-order in the δλi (fourth-order in the moduli),

δli = cij δλj + (1/2) cijk δλj δλk + (1/6) cijkl δλj δλk δλl   (i = 1, 2, 3),        (55)

incorporating the symmetries of the tetragonal structure, substituting the following series expansions for the δλi in terms of a parameter t that approaches zero as the bifurcation point is approached,

δλ1 = β1t²,   δλ2 = γt + β2t²,   δλ3 = −γt + β3t²,        (56)

setting

δl1 = α1t²,   δl2 = δl3 = 0,        (57)
and solving for α1/β1 by grouping terms of like order in t. The higher order moduli (i, j, k, l = 1, 2, 3) on the tetragonal path are c111, c112, c122, c123, c222, c223, c1111, c1112, c1122, c1123, c1222, c1223, c2222, c2223, c2233, as well as those moduli formed by interchange of the indices 2 and 3 among distinct moduli and by interchange of the order of indices within a given modulus (e.g., c1223 = c1332 = c2123 = c3132 = ...). The inclusion of fourth order moduli in the expansion (55) is necessary owing to the highly singular character of the bifurcation at the c22 = c23 eigenstate. The quantity (c11 − c12²/c22) is the slope dl1/dλ1 of the tetragonal path at the branch point, which must be positive if the c22 = c23 eigenstate terminates stability; consequently, the slope of the branch path at the point of bifurcation will be negative (and hence the secondary path will necessarily be unstable at the point of bifurcation) if the expression in (53) that contains the higher order moduli is positive and greater than (c11 − c12²/c22). For all Morse model crystals and for the pseudopotential model crystals that have been investigated, the orthorhombic paths branch with negative slope; among the EAM models, however, both negative and positive sloped path branchings were observed. For example, LS simulations of [100] loading of the EAM model of Ni exhibited positive-sloped branching on the right-hand (fco) branch path, as is seen in Fig. 20. This behavior raises the interesting question of whether or not the tetragonal crystal actually becomes unstable before, at, or after the location of the c22 = c23 eigenstate on the primary path. Note that, before the branch point is reached on the primary path in Fig. 20, there exists a range over which the secondary fco path is at a lower internal energy and lower axial stress; crystal instability modes for this form of bifurcation diagram have yet to be investigated in IMD simulations. Since the load l1 passes through zeros at the cubic states on the secondary orthorhombic paths, the magnitude and algebraic sign of dl1/dλ1 is also governed largely by crystal symmetry and the location of the branch point on the secondary path. That is, because the atomic volumes of a given substance in the unstressed bcc and fcc states are approximately equal, the cubic states occur on their respective secondary paths at approximately the same values of λ1, regardless of atomic bonding characteristics. (For example, the unstressed bcc state on the right-hand (fco) branch occurs at about a 12% strain relative to the fcc state on the primary path because crystal symmetry dictates that λ1 = λ2 = 2^(1/2) λ3 at the bcc state on the fco branch, and if the atomic volumes of the bcc and fcc structures were identical, λ1λ2λ3 would be unity in both states, so the bcc state on the fco branch would then occur at λ1 = 2^(1/6) = 1.122.) If the point of bifurcation on the fco branch occurs where λ1 is greater than about 1.12, as it does for the EAM model of Ni, this branch emanates with dl1/dλ1 > 0. Analogous symmetry conditions apply to the left-hand (bco) branch.
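Given any energy function E(λ1, λ2, λ3), every modulus in (54) is a mixed derivative of E, so (53) can be evaluated directly by nested finite differences. A minimal sketch follows (the recursion, index bookkeeping, and step size are illustrative choices, not the procedure used by Hill and Milstein; fourth-order differences are noisy, so h must be chosen with care):

```python
import numpy as np

def dmix(E, lam, idx, h=5e-3):
    """Mixed central difference of E(lam) with respect to the stretches
    whose (0-based) indices are listed in idx; recursive, so repeated
    indices are allowed. Cost grows as 2**len(idx)."""
    if not idx:
        return E(lam)
    e = np.zeros(3); e[idx[0]] = h
    return (dmix(E, lam + e, idx[1:], h)
            - dmix(E, lam - e, idx[1:], h)) / (2.0 * h)

def slope_eq53(E, lam, h=5e-3):
    """Evaluate the branch-slope formula (53) with the moduli (54)
    generated from an energy function E of the three stretches; the
    helper c(...) uses the 1-based axis labels of the text."""
    c = lambda *idx: dmix(E, lam, tuple(i - 1 for i in idx), h)
    num = (2*c(2,2)*(c(1,2,3) - c(1,2,2))
           + c(1,2)*(c(2,2,2) - c(2,2,3)))**2
    den = c(2,2)**2 * ((2.0/3.0)*c(2,2)*(c(2,2,2,2) - 4*c(2,2,2,3)
                                          + 3*c(2,2,3,3))
                       - (c(2,2,2) - c(2,2,3))**2)
    return c(1,1) - c(1,2)**2 / c(2,2) - num / den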
Figure 20. Mechanical response of the EAM model of Ni, exhibiting a positive-sloped (dl1/dλ1 > 0), secondary, face centered orthorhombic, branch path at the c22 = c23 eigenstate (where the face centered orthorhombic [110] uniaxial loading path of the bcc structure branches from the tetragonal [100] loading path of the fcc structure): (a) internal energy, (b) true stress, (c) change of volume relative to the volume of the unstressed fcc structure, VF, (d) transverse stretch. From Ref. [34].
4. Instability and Bifurcations in Isostress Molecular Dynamics Simulations
Molecular dynamics simulations, particularly when carried out under isostress conditions, add "new dimensions" to the studies of elastic stability, bifurcation, and post-bifurcation behavior, in that inhomogeneous branching and temperature effects may be readily investigated in a natural, unconstrained manner. Available numerical "tools" include the isostress ansatz Lagrangian of Parrinello and Rahman [3], canonical fluctuation formulas for computing stress- and temperature-dependent elastic moduli [35], and visualization techniques for determining instability mechanisms (i.e., for viewing the evolutions
of atomic configurations during the course of an instability). Here we illustrate the results of some applications of these computational methodologies in studies of stress-induced instabilities in Morse model crystals that have been thoroughly studied previously in lattice statics simulations. We begin with the IMD study of Zhao et al. [34] of thermally activated, inhomogeneous, shear modulus instabilities in bcc crystals under hydrostatic, isothermal, conditions. In LS simulations, bcc Morse model crystals lose stability under pressure (P > 0) on the curve µB(L) shown in Fig. 2, where the shear modulus µB passes from positive to negative with decreasing pressure (increasing λ), and sc Morse model crystals are stable only in hydrostatic tension above the µS- and below the κ-curves. In the IMD simulations at temperatures approaching 0 K, the bcc and sc structures were also indefinitely stable in the ranges predicted by the LS computations, and the bcc and sc structures lost stability via inhomogeneous bifurcations that occurred in association with the vanishing of their respective shear modulus, µB and µS. These simulations demonstrated explicitly the applicability of Hill and Milstein's stability criteria, relations (21), to cases where stability also is lost under inhomogeneous eigenmodes. The bcc crystals were observed to lose stability via the shearing of (110) planes, in alternate [1̄10] and [11̄0] directions (which is an inhomogeneous eigenmode consistent with the µB = 0 eigenstate) and, post-bifurcation, the crystals were observed directly to transform to hcp crystals, wherein the sheared (110) bcc planes became the (0001) planes of the new hcp crystal. This shearing mechanism for the bcc – hcp transformation was proposed much earlier by Burgers [36], based on analogous hexagonal geometry between the (110) planes in bcc crystals and (0001) planes in hcp crystals; the study by Zhao et al. [34] demonstrated that the mechanism is "triggered" by loss of elastic stability of the bcc structure in conjunction with a vanishing or diminishing shear modulus µ. The influence of temperature on the transition is particularly interesting. Figure 21 shows the variation of µ with pressure and stretch at the temperatures T = 10^−5, 1, and 10 K (data points) and in the LS simulations (solid lines) for a particular Morse model bcc crystal (log β = 4.54). It is seen that, at the indicated temperatures, there is very little difference in the µ-values at a given pressure or stretch. Figure 21 also shows the critical pressures, above which the bcc structure remained stable indefinitely (and below which, instability occurred), indicated by arrows at (a), (b), and (c) for the temperatures T = 10^−5, 1, and 10 K, respectively. (In order to avoid extremely long times to transformation, the critical states in Fig. 21 were determined in adiabatic simulations, although the crystals remained essentially isothermal until initiation of the instability.) With increasing temperature, the transformation is able to occur at increasing pressures and µ-values, owing to the effect of thermal activation; i.e., increased thermal agitation enables the larger enthalpy barriers that occur at greater µ-values, and hence at greater pressures, to be surmounted. Figure 22 shows the variation of enthalpy change
Figure 21. Shear modulus µ of a Morse model bcc crystal as a function of (1) pressure P and (2) stretch λ. Critical values of pressure and stretch, as determined in constant pressure IMD simulations, are indicated by arrows at (a) 10^−5 K, (b) 1 K, and (c) 10 K. (Values of µ were also computed in "supercritical states"; i.e., by employing the fluctuation formulas prior to an incipient instability.) From Ref. [34].
during the instability of the bcc Morse model crystal with log β = 4.54, at T = 1 K, as an example of the influence of pressure on the enthalpy barrier during instabilities that occurred in isothermal IMD simulations. Since the enthalpy barrier vanishes at the µ = 0 eigenstate, and apparently it decreases rapidly as this state is approached from within an initially stable region, we see in Fig. 21 that, at very low temperatures (i.e., where the atomic positions are usually considered as essentially "frozen"), there exists a remarkably strong influence of temperature upon the critical transformation pressure. Next consider IMD simulations of two Morse model fcc crystals, each subjected to two modes of uniaxial loading, i.e., [100] and [111]; the models employ values of log β = 3.864 and 6.288, and reproduce experimental values of the elastic moduli c11 and c12 and atomic volumes of unstressed Cu and Ni, respectively. Figures 23 and 24 show the variations of the moduli combinations that determine the stability ranges according to relations (33) and (34) (in [100] loading) and (38) (in [111] loading), as computed by fluctuation formulas at various temperatures, and in the lattice statics computations
Figure 22. Enthalpy change ΔH vs. degree of transformation ξ under isothermal (temperature T = 1 K), isobaric, conditions in molecular dynamics simulations of bcc to hcp transitions; ξ is a geometric, path dependent parameter that varies from 0 (in the bcc state) to 1 (in the hcp state); in the main figure and in the upper left inset, pressure P = 2.43 GPa; for curves (1), (2), (3), and (4) in the upper right inset, P = 0, 0.63, 1.26, and 2.43 GPa, respectively; the solid curves are polynomial least-squares fits to the IMD results. From Ref. [34].
(Ref. [39] and Zhao, Maroudas, and Milstein, to be published). The LS computations of the moduli combinations (which are based on the Green variables) yield good agreement with the corresponding moduli combinations computed by fluctuation methods at 1 K in the molecular dynamics simulations, and loss of stability in the IMD simulations at 1 K, under both the [100] and the [111] modes, corresponds well with the violations of the stability criteria. In particular, in [100] compressive loading, loss of stability is associated with stationarity of the generalized force, p1, which occurs near a load or stress minimum, and in [100] tension, stability is lost at the invariant c22 = c23 eigenstate. In [111] loading of both Morse model crystals, in both tension and compression, stability is terminated in conjunction with violation of the second of relations (38), in agreement with the earlier LS work, although the IMD [111] Cu simulations lost compressive stability earlier than might be expected. Increasing temperature is seen to induce instabilities at smaller stress magnitudes owing to the effects of thermal activation, as is discussed above for the case of hydrostatic simulations. In the range of temperatures depicted in Figs. 23 and 24, increasing temperature also generally causes the instabilities to occur at
Figure 23. Mechanical response of Morse models of fcc Ni and Cu under uniaxial [100] tension and compression. Critical values of axial stretch λ1 and stress σ1 , as determined in IMD simulations, are indicated by arrows at (a) for temperature T = 1 K and at (b) for T = 300 K. The crystals remained stable indefinitely at stress magnitudes below the indicated critical stress magnitudes and lost stability at greater stress magnitudes. From Ref. [39] and Zhao, Maroudas, and Milstein, to be published.
Figure 24. Mechanical response of Morse models of fcc Ni and Cu under uniaxial [111] tension and compression. Expressions B = (c11 + c12)c33 − 2c13² and A = (c11 − c12)c44 − 2c14² (see relations (38)). Critical values of axial stretch λ1 and stress σ1, as determined in IMD simulations, are indicated by arrows at (a) for temperature T = 1 K, at (b) for T = 300 K, and, for Ni, at (c) for T = 500 K. The crystals remained stable indefinitely at stress magnitudes below the indicated critical stress magnitudes and lost stability at greater stress magnitudes. From Zhao, Maroudas and Milstein, to be published.
smaller stretch magnitudes, although Cu in [100] compression is an exception, evidently owing to the "softening" of the crystal at higher temperatures. The modes of instability in [100] tension are particularly interesting. In both cases the bifurcation starts with the second of the eigendeformations listed in (35), although branching occurs in domains, rather than homogeneously; in alternating domains, δλ2 > 0, δλ3 < 0, and δλ2 < 0, δλ3 > 0; in the Ni model, wherein the instability occurs at a relatively large axial stretch, the instability leads to fracture; in the Cu model, wherein the instability occurs at a smaller axial stretch, the domains are able to rotate in opposite directions, leading to a phase change that exhibits a remarkable atomic pattern formation (Ref. [39]). Whether the modes of instability in [111] loading correspond to the eigendeformations (44) and/or (45) remains to be determined.
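As a final note on methodology, the canonical fluctuation formulas mentioned at the start of this section (see Ref. [35]) estimate elastic compliances, and hence moduli, from the strain fluctuations sampled in an isostress ensemble. A minimal sketch of the leading-order relation follows (the trajectory array and Voigt bookkeeping are hypothetical, and corrections required at finite applied stress are omitted):

```python
import numpy as np

KB = 8.617333e-5   # Boltzmann constant, eV/K

def moduli_from_fluctuations(eps, V, T):
    """Isothermal stiffness (Voigt 6x6) from the strain-fluctuation
    formula S_ab = (V / kB T) <d eps_a d eps_b>, C = S^(-1); eps is an
    (nsteps, 6) array of Voigt strains sampled in an isostress (e.g.,
    Parrinello-Rahman) run at temperature T (K) and mean volume V (A^3).
    Leading-order only; finite-stress corrections are neglected here."""
    de = eps - eps.mean(axis=0)                    # strain fluctuations
    S = (V / (KB * T)) * (de.T @ de) / len(eps)    # compliance estimate
    return np.linalg.inv(S)                        # stiffness C_ab

# For a cubic crystal under pressure, combinations such as (C11 - C12)/2
# can then be monitored; their vanishing signals the shear modulus
# instabilities discussed in this section.
```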
References
[1] R. Hill, Math. Proc. Camb. Phil. Soc., 77, 225, 1975.
[2] F. Milstein and S. Chantasiriwan, Phys. Rev. B, 58, 6006, 1998.
[3] M. Parrinello and A. Rahman, J. Appl. Phys., 52, 7182, 1981.
[4] M. Born, Proc. Camb. Phil. Soc., 36, 160, 1940.
[5] M. Born and R. Furth, Proc. Camb. Phil. Soc., 36, 454, 1940.
[6] R. Hill and F. Milstein, Phys. Rev. B, 15, 3087, 1977.
[7] F. Milstein and R. Hill, J. Mech. Phys. Solids, 27, 255, 1979.
[8] F. Milstein and R. Hill, Phys. Rev. Lett., 43, 1141, 1979.
[9] N.H. Macmillan and A. Kelly, Proc. R. Soc. London A, 330, 291 (see also p. 309), 1972.
[10] F. Milstein, Phys. Rev. B, 3, 1130, 1971.
[11] F. Milstein, R. Hill, and K. Huang, Phys. Rev. B, 21, 4282, 1980.
[12] R. Hill, Adv. Appl. Mech., 18, 1, 1978.
[13] J. Wang, S. Yip, S.R. Phillpot, and D. Wolf, Phys. Rev. Lett., 71, 4182, 1993.
[14] J.W. Morris, Jr., C.R. Krenn, D. Roundy, and M.L. Cohen, "Elastic stability and the limit of strength," In: P.E. Turchi and A. Gonis (eds.), Phase Transformations and Evolution in Materials, TMS, Warrendale, PA, pp. 187–207, 2000.
[15] F. Milstein and R. Hill, J. Mech. Phys. Solids, 25, 457, 1977.
[16] F. Milstein and R. Hill, J. Mech. Phys. Solids, 26, 213, 1978.
[17] F. Milstein, "Crystal elasticity," In: H.G. Hopkins and M.J. Sewell (eds.), Mechanics of Solids, Pergamon Press, Oxford and New York, pp. 417–451, 1982.
[18] J. Zhao, D. Maroudas, and F. Milstein, Phys. Rev. B, 62, 13799, 2000.
[19] D.J. Rasky and F. Milstein, Phys. Rev. B, 33, 2765, 1986.
[20] F. Milstein and D.J. Rasky, Phys. Rev. B, 54, 7016, 1996.
[21] D.A. Young, Phase Diagrams of the Elements, University of California Press, Berkeley, 1991.
[22] S. Chantasiriwan and F. Milstein, Phys. Rev. B, 58, 5996, 1998.
[23] S. Chantasiriwan and F. Milstein, Phys. Rev. B, 48, 14080, 1996.
[24] F. Milstein, H.E. Fang, X.Y. Gong, and D.J. Rasky, Solid State Commun., 99, 807, 1996.
[25] F. Milstein and D.J. Rasky, Phil. Mag. A, 45, 49, 1982.
[26] F. Milstein and J. Marschall, Acta Metall. Mater., 40, 1229, 1992.
[27] F. Milstein, Solid State Commun., 34, 653, 1980.
[28] F. Milstein, H.E. Fang, and J. Marschall, Phil. Mag. A, 70, 621, 1994.
[29] F. Milstein, Y.C. Tang, K. Huang, and R. Hsu, Phil. Mag. A, 48, 871, 1983.
[30] F. Milstein and K. Huang, Phys. Rev. B, 18, 2529, 1978.
[31] F. Milstein and B. Farber, Phys. Rev. Lett., 44, 277, 1980.
[32] F. Milstein, J. Marschall, and H.E. Fang, Phys. Rev. Lett., 74, 2977, 1995.
[33] R. Hill, Math. Proc. Camb. Phil. Soc., 92, 167, 1982.
[34] J. Zhao, S. Chantasiriwan, D. Maroudas, and F. Milstein, "Atomistic simulations of the mechanical response and modes of failure in metals at finite strain," In: Proceedings of the Tenth International Conference on Fracture (Honolulu, Hawaii), Elsevier, Amsterdam, contribution ICF10 0575OR, 2001.
[35] J.R. Ray, Comput. Phys. Rep., 8, 109, 1988.
[36] W.G. Burgers, Physica (Amsterdam), 1, 561, 1935.
[37] K. Huang, F. Milstein, and J.A. Baldwin, Jr., Phys. Rev. B, 10, 3635, 1974.
[38] F. Milstein and J. Marschall, Phil. Mag. A, 58, 365, 1988.
[39] F. Milstein, J. Zhao, and D. Maroudas, Phys. Rev. B, 70, 184102, 2004.
4.3 TOWARD A SHEAR-TRANSFORMATION-ZONE THEORY OF AMORPHOUS PLASTICITY
Michael L. Falk¹, James S. Langer², and Leonid Pechenik²
¹ University of Michigan, Ann Arbor, MI, USA
² University of California, Santa Barbara, CA, USA
Our understanding of how solid objects bend and break seems poised for major progress because of recent developments in computational capabilities, in experimental techniques, especially high resolution microscopy, and in basic theoretical understanding of non-equilibrium phenomena. In this chapter, we describe some ideas in the theory of amorphous plasticity that have been inspired and enabled by these developments. We focus on two fundamental questions: (i) "How do materials undergo transitions from hardening to flow under applied stresses?" and (ii) "How best can the relevant aspects of microstructural dynamics be incorporated into constitutive theories?" Perhaps the most important goal for modern solid mechanics is a general equation, or a set of equations, that can play the same role for plastically deforming materials that the Navier–Stokes equation plays for fluids and Hooke's law plays for elastic solids. While most real fluids and real elastic solids exhibit important deviations from Newtonian or Hookean behavior, the solutions to these equations provide a firm basis for our understanding of phenomena ranging from flow in a pipe to the stress field around a crack. These two paradigms have been so powerful that, when describing a deforming solid, scientists and engineers tend to use equations that mimic one or the other. Constitutive laws are developed either for low stresses and low temperatures by prescribing functional relations between stress and strain, or else they are developed for high stresses and high temperatures in the form of equations that relate stress to strain rate. What is lacking is a more widely applicable but comparably tractable constitutive relation that can capture both kinds of behavior. Specifically, we need an equation that can describe the exchange of stability that accompanies the transition from hardening to flow. The existence of such an equation would allow us to deal quantitatively with situations in which hardening and flow coexist and, together, determine the performance of
real materials. Ultimately, such a constitutive description should tell us what actually happens when a material is apparently making a transition between brittle and ductile failure. In this chapter we discuss the development of such a constitutive law for deformation of amorphous solids. Where possible, we will describe the predictions of this model in the context of the experimental and theoretical literature on metallic glasses and amorphous polymers. A generally applicable constitutive theory of plasticity of the kind that we think is needed must necessarily be couched in terms of macroscopic quantities – stresses, strains, and coarse-grained internal state variables – and yet it must be based on atomistic concepts and must be consistent with fundamental symmetries and conservation laws. These principles have led us to move in some directions that differ from work that has been done in this field in recent decades. We shall point out briefly several places where earlier theories seem to omit essential physical mechanisms or even violate basic physical principles. We shall also show how we have used the laws of thermodynamics to constrain the form of our equations of motion. The way in which we relate phenomenology to atomistic mechanisms seems especially important if we are to predict successfully both the mechanical response and the structural evolution of real materials. Much current work along these lines focuses on dislocation structures. However, while dislocations are crucially important for understanding the behavior of crystals, we believe they are not necessarily the most promising place to begin investigating the general relationship between structure and deformation. This is because the dislocation structures that govern deformation in crystals are composed of large numbers of extended one-dimensional defects. We propose that amorphous solids, being the most liquid-like of the solid materials, provide a more nearly ideal place to begin an investigation that will eventually provide useful guidance for understanding general classes of deformation behavior.
1. Phenomenology from Experiment and Simulation
At the time of the birth of modern theories of plasticity in the mid 20th century, there were severe limitations on the data available to guide the development of such a theory [1]. The data that were available did not detail the rate or history dependence of plastic deformation or carefully consider temperature dependence. Additionally, very limited information was available on the microstructural changes that accompanied deformation. Today a large number of experimental techniques have been devised to investigate these aspects of materials response in more detail. Not only have the quality and quantity of the available data improved, but the qualitative nature of the measurements has broadened. Foremost among these advances is the ability to analyze microstructural changes on a wide range of spatial scales. It is important to note that these
experimental advances in analyzing microstructure have, for the most part, been limited to crystalline materials. The study of amorphous solid microstructure has been much slower to advance; but recently new methods have begun to emerge, including fluctuation microscopy [2] and quantitative techniques in high resolution electron microscopy [3, 4], that have revealed important structure even in materials with no immediately discernible crystal structure. In the absence of a standard set of methods for microstructural and atomic-scale characterization, advances in understanding glasses have relied heavily on an entirely new means of investigating materials properties: atomic-scale simulation. The important difference between these techniques and continuum computational methods is that properties can emerge from atomistic methods that were not a priori included in the model. For example, dislocations and fracture can be observed in atomistic simulations of solids for which only the initial crystal structure and interatomic bonding potential are supplied as input. Thus these simulation techniques can be used to observe the consequences of assumptions regarding bonding and structure for materials behavior on larger scales. In the context of amorphous materials, investigations of this nature actually pre-date the advent of computer simulations. Some of the earliest "atomic-scale" simulations of amorphous deformation were bubble raft experiments that were used as an analog for atomic-scale structure in a metallic glass [5]. These and subsequent investigations were crucial for providing a qualitative picture of the nature and behavior of the microscopic structures that mediate shear deformation in disordered matter. These regions are typically referred to as "shear transformation zones" (STZs) or "flow defects". Figure 1 shows an image of the result of a two-dimensional molecular dynamics simulation of 20 000 atoms having undergone a small amount of shear. The darker regions show where the atoms have undergone the most rearrangement. Figure 2 shows a close-up image of one such region. By examining many regions of this type, a number of assertions can be made regarding their behavior, essentially providing a phenomenology of the defect dynamics of the amorphous solid:
• STZs are orientational in nature: they transform preferentially under a particular applied stress.
• STZs are two-state systems: after transforming once they can reverse their transformation, but they cannot continuously undergo repeated transformations in the same direction.
• STZ transformations may be thermally activated or mechanically driven.
These three assertions will form the basis for the theoretical analysis to follow. In addition to dwelling on the micromechanics of deformation, it is important to consider some aspects of the macroscopically observed material behavior that our theory must explain. In this chapter we will consider what we
Figure 1. A 2D simulation of a binary Lennard–Jones glass composed of 20 000 atoms undergoing less than 1% shear. Dark areas are local regions that have rearranged plastically.
believe to be the most important features of homogeneous deformation in these materials. In particular, we wish to model both the shear softening behavior and the shear thinning (sometimes called super-plasticity) that are commonly observed in both metallic glasses and amorphous polymers. Both of these will be discussed in more detail below. We will not consider the formation of shear bands, crazes, or other aspects of localization. While strain localization is an extremely important failure mode in amorphous polymers and metallic glasses, understanding these phenomena requires first a thorough understanding of the physics of homogeneous deformation. Thus, we expect that a physically complete theory of homogeneous deformation will also predict the subsequent instabilities that develop into inhomogeneities. Figure 3 shows typical stress–strain behavior measured near the glass transition temperature in a Zr-based metallic glass, as measured by Lu et al. [6].
Figure 2. A close-up of a shear transformation zone in the sheared Lennard–Jones glass shown in Fig. 1, before and after undergoing a local rearrangement.
Figure 3. Stress–strain behavior of a Zr41.2Ti13.8Cu12.5Ni10Be22.5 bulk metallic glass measured for a range of strain rates and temperatures [6].
Figure 4 shows similar curves from an amorphous polymer system studied by Hasan and Boyce [7, 8]. A prominent peak in the transient loading curve is observed in the limit of high strain rates, low temperatures, and long annealing times. As we shall see in the thermal STZ theory presented below, such behavior can be considered to arise from a dynamic lag between the application of a driving stress and the time needed for the glassy microstructure to adjust to its steady-state STZ density. Figure 5a shows viscosity–strain-rate behavior for the same metallic glass system shown in Fig. 3, taken from the same work. At low strain rates there exists a well-defined Newtonian regime, but at high strain rates the viscosity decreases with increasing strain rate. The scaling behavior shown in Fig. 5b was first observed by Kato et al. [9], and is quite commonly observed in complex fluid and polymer rheology [10].
Figure 4. Stress–strain behavior of annealed and quenched polystyrene at temperature 296 K and strain rate −0.001 s⁻¹ [7].
Simulation has also served a useful purpose in studying the mechanical and rheological properties of amorphous metals and polymers via ersatz experiments analogous to those performed in the laboratory. It is important to note that simulation techniques often impose a much different set of restrictions than are imposed by experimental techniques. For example, while experimental loading may be limited to near quasi-static conditions and moderate strain rates, molecular dynamics simulations are typically restricted to very high strain rates. Although this constraint limits our ability to compare simulation directly to experiment, it provides a distinct window on another, possibly important, regime of the material systems’ physical response. Thus simulations can be used to access ultra-high quench rates and ultra-high strain rates, to perform repeated simulations on absolutely identical samples, and to prevent failure modes by imposing strict boundary conditions, all of which are difficult or impossible to do experimentally.
Figure 5. Rheological behavior of a Zr41.2Ti13.8Cu12.5Ni10Be22.5 bulk metallic glass measured for a range of temperatures [6].
Figure 6. Transient stress–strain behavior of a 3D binary Lennard–Jones glass prepared by quenching from five different initial liquid temperatures [11].
An example of a set of simulations of this type is the set of stress–strain curves shown in Fig. 6, taken from work by Albano et al. [11]. These represent glass samples that were produced by instantaneously quenching from a number of different equilibrium liquid temperatures. The strain rate of the tests is in the range of 22% per nanosecond, and the temperatures are approximately 40%
Figure 7. Rheological behavior of a molecular dynamics simulation of a confined polymeric system [12].
of the glass transition temperature. Again we see the very generic shear softening behavior that arises in the glasses that were quenched from the lower-temperature liquids. Figure 7, from Ref. [12], shows shear thinning behavior similar to that observed in Fig. 5a, but in this case for a simulation of a confined thin polymer film. Clearly both the shear softening and shear thinning phenomena are quite generic and are observed across materials systems, in both simulation and experiment.
2. Theoretical Background: Viscoplastic Constitutive Theories
A considerable number of constitutive theories have been proposed to describe deformation in amorphous solids. It will not be possible to address all of these theories here. We will, however, consider the most prominent theories describing the mechanical behavior of amorphous metals and polymers. We consider these theories in the light of our statements at the beginning of this chapter. That is to say that, while our goal is to learn as much as we can from these previous developments, ultimately we will only be satisfied with a theory that naturally describes the transition from hardening to flow and accomplishes this in such a way that the terms are clearly physical in their
assumptions. In these respects the theories we discuss here, while insightful, appear to us to be incomplete and, in some cases, physically unrealistic. The particular theories we will use as points of comparison are the "flow defect" theory as originally developed by Spaepen [13] and more recently extended by Sietsma and coworkers (De Hey et al. [14]) in the context of metallic glasses, and the constitutive models for amorphous polymers developed by Hasan et al. [15] and Hasan and Boyce [8]. All of these theories have a number of features in common. They begin from a stress-assisted thermal activation formalism originated by Eyring [16] (see also Ref. [17]), which was applied to polycrystalline plasticity by Kocks et al. [18] and to amorphous plasticity by Argon [19]. In this formalism, the plastic strain rate is proportional to the number density of deformable sites in the system, n, times the net rate of transitions that promote shear:

\dot{\epsilon}^{pl} = \frac{n\nu\,\mathbf{s}}{s}\left[\exp\left(-\frac{\Delta G(s)}{k_B T}\right) - \exp\left(-\frac{\Delta G(-s)}{k_B T}\right)\right].   (1)
In this equation ν is a molecular vibration frequency; kB is Boltzmann's constant; T is the temperature; s is the deviatoric part of the applied stress, and s is its magnitude. Here ΔG(s) is the activation barrier for transitions that promote shear in the direction of the applied deviatoric stress s, while ΔG(−s) is the activation barrier for transformations that promote shear in the reverse direction. Note that each activation barrier is a function only of the magnitude of the applied deviatoric stress. It is also important to note that this formalism proposed by Eyring includes only one scalar population n because it was first conceived in the context of a "theory of holes" that attributed viscous rearrangement to the motion of vacancy-like objects in a liquid. Eyring further made the assumption that, to a good approximation, the stress dependence of these activation barriers is linear, i.e., ΔG(s) = ΔG0 − (Ωs/2), where Ω is an activation volume, leading to the simplified form:

\dot{\epsilon}^{pl} = \frac{2n\nu\,\mathbf{s}}{s}\,\exp\left(-\frac{\Delta G_0}{k_B T}\right)\sinh\left(\frac{\Omega s}{2 k_B T}\right).   (2)
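As a concrete illustration of how Eq. (2) is used, the short Python sketch below evaluates the Eyring strain rate for a scalar shear stress. All parameter values are hypothetical placeholders chosen only to make the example self-contained; they are not values quoted in this chapter.

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann's constant (J/K)

def eyring_strain_rate(s, n, nu, dG0, omega, T):
    """Plastic strain rate from the Eyring form, Eq. (2), for scalar stress s.

    n     -- density of deformable sites (treated as dimensionless here)
    nu    -- molecular vibration frequency (1/s)
    dG0   -- stress-free activation barrier Delta-G_0 (J)
    omega -- activation volume Omega (m^3)
    T     -- temperature (K)
    """
    # The tensor factor s/|s| reduces to sign(s) for a scalar shear stress.
    return (2.0 * n * nu * np.sign(s)
            * np.exp(-dG0 / (KB * T))
            * np.sinh(omega * abs(s) / (2.0 * KB * T)))

# Hypothetical parameters, for illustration only:
rate = eyring_strain_rate(s=1.0e8, n=1e-2, nu=1e13,
                          dG0=1.0e-19, omega=2.0e-29, T=600.0)
print(f"{rate:.3e} 1/s")
```

Note how the sinh factor makes the response linear (Newtonian) when Ωs is much smaller than kBT, and exponentially stress-sensitive in the opposite limit.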
All the theories that follow attempt to describe the deformation dynamics by postulating equations of motion for n and, in some cases, the distribution of n as a function of activation energy. Both De Hey et al. [14] and Hasan et al. [15] chose equations of motion of the general form

\dot{n} = \dot{n}_T(n, T) + \dot{n}_\epsilon(n, \dot{\epsilon}^{pl}),   (3)

where ṅ_T models the dynamics of thermal relaxation and ṅ_ε models the production of defects induced by plastic strain. Hasan et al. [15] exclusively model ṅ_ε and neglect the effects of aging arising from ṅ_T. Duine et al. [20] consider several possible forms for ṅ_T and claim that the empirical data
obtained by differential scanning calorimetry (DSC) of metallic glasses is best described by

\dot{n}_T = -k_r\, n\,(n - n_{eq}),   (4)
where kr is a thermally activated rate factor and n eq is the thermal equilibrium density of flow defects in the absence of shear. It is important to note that this thermodynamic analysis of DSC data requires the additional assumptions of the free volume theory originated by Turnbull and Cohen [21] and further developed by Spaepen [13] that begins by assuming that the defect density is related to the free volume by an equation of the form
n \propto \exp\left(-\frac{V^*}{v_f}\right),   (5)
where v_f is the free volume and V* is a molecular volume. van den Beukel and Sietsma [22] further assume that the rate of change of enthalpy is directly proportional to the rate of change of v_f. This is a critical assumption of the free volume model and, as far as we are aware, has not been directly compared to any independent measure of v_f. We will address these assumptions and some thermodynamic arguments for the necessity of a v_f-like parameter toward the end of the chapter. There is considerable disagreement among the authors of these models regarding the form of ṅ_ε in Eq. (3). In the context of polymers, Hasan et al. [15] initially chose a form where

\dot{n}_\epsilon = -\frac{\dot{\epsilon}^{pl}}{\tau_p}\,(n - n_\infty),   (6)
where τp is a proportionality constant and n∞ is the steady-state defect density under shear at low temperature. In a later investigation, however, Hasan and Boyce [8] developed a more complex model in which the whole energy distribution, rather than just the total number of defects, shifts during deformation. In spirit, this model is similar to the free volume models mentioned in the previous paragraph, although in detail it is quite different. As previously mentioned, we will address the reasons one would include such extra degrees of freedom in the latter parts of this chapter, but here we simply describe some of the most notable features of the model. In particular, in this model n ≈ exp(−a/kT) and

\dot{a} = -k_p\,(a - a_{eq})\,\exp\left[-\xi\,\exp(-\xi\,\epsilon^{pl})\right],   (7)
where kp is a transition rate, and ξ describes the sharpness with which the dynamics of a “turn on” during plastic flow. An additional parameter S is introduced that models the development of stresses that favor the reverse transformation during loading. Similarly, in the context of metallic glass deformation, Spaepen [13] and De Hey et al. [14] propose dynamics of v f which indirectly alter the defect
density n via Eq. (5). The form proposed in Ref. [14], that v̇_f = c ε̇^pl, is the simpler of the two. In that model this free volume creation term is applied in conjunction with Eq. (4) above to produce an equation of motion for v_f that models both shear-induced disordering and thermal relaxation. Spaepen [13] proposes a form
\dot{v}_f = \frac{V^* k_B T}{v_f M}\left[\coth\left(\frac{\Omega s}{2 k_B T}\right) - \left(1 + \frac{M v_f}{n_D k_B T}\right)\mathrm{csch}\left(\frac{\Omega s}{2 k_B T}\right)\right]\dot{\epsilon}^{pl},   (8)
where M is a modulus. Note that in Eq. (8) we have factored out ε̇^pl to make this form comparable to the other equations discussed above. At large values of Ωs/kBT and non-zero T, this expression reduces to a form close to that proposed by De Hey et al. [14], with d(v_f²)/dt ∝ ε̇^pl. However, it behaves unphysically in the limit of vanishingly small T and non-zero s, where the steady-state value of v_f is predicted to become infinite. Despite this difficulty, this equation of motion has been used by Huang et al. [23] and Steif et al. [24] in attempts to model shear band formation. There are several aspects of these models that we find unsatisfactory. One prime example is the appearance of the plastic strain ε^pl in Eq. (7), which illustrates a problem that pervades much of plasticity theory. Fundamental principles imply that the plastic strain cannot serve as an internal state variable because it has no structural meaning outside of the ideally elastic regime. Materials undergoing large-scale irreversible deformations necessarily lose their memories of earlier configurations; thus displacements measured from much earlier reference states cannot meaningfully characterize current states of these systems. For similar reasons, couching plasticity theories in a Lagrangian framework, in terms of strains measured relative to an arbitrarily determined initial reference state, seems to be an unprofitable mathematical strategy. The importance of expressing such relations in terms of rate equations that are most easily described in Eulerian coordinates has long been recognized; for example, see Ref. [25]. Nevertheless, Lagrangian formulations in which the current plastic strain is given undue weight as a state-determining measure continue to be advocated in review articles [26] and used in advanced mathematical treatments of plasticity [27]. Another serious problem in the above equations is that they use functional forms that cannot be made consistent with symmetry laws without violating analyticity requirements. For example, in Eq. (6) and the evolution equations of Refs. [13, 14], the rate of change of a scalar denoting the internal state of the system, n or v_f, is proportional to the plastic strain rate, which is a tensor. Thus, although the production rate of defects due to shear should be positive regardless of the direction of the applied stress, when the sign of the shear is reversed in these equations, the defect production rate also changes sign. The authors most likely intended to use the magnitude of the plastic strain rate; but this also would be unsatisfactory because the absolute value is a nonanalytic
function that is unlikely to arise from any first-principles analysis of molecular mechanisms. Beyond these details, there are issues that arise directly from the forms of Eqs. (1) and (3), which underlie nearly all of these theories. In particular, the fact that n is the prefactor of both the forward and backward transition rates reflects the fact that no attempt is being made to differentiate between the populations of the zones that undergo these two distinct transformations. Consequently, despite the fact that Eq. (1) ostensibly describes the forward and backward transitions of a two-state system, there is no way for these equations to describe the fact that once a region transforms it is not available for further transformation. This decoupling of the strain-rate dynamics from the population dynamics implies that there is some fast relaxation mechanism – faster than any other rate introduced explicitly in the theory – which causes zones instantaneously to lose their memory of prior transformations. As a result, none of the above constitutive laws has the possibility of describing a transition from jammed to flowing behavior at a well-defined yield stress. In what follows, we describe a theoretical framework that we have developed in an attempt to correct the problems mentioned above and to include the two-state dynamics associated with STZs. We find that we can apply simple thermodynamically motivated physical arguments to derive the form of our equivalent of the term ṅ_ε in Eq. (3). We will accomplish this initially in the context of a low-temperature theory that strictly applies only when thermally activated transitions are negligible, and we will then describe more briefly the ways in which thermal effects can be introduced into such a theory. The introduction of thermal effects is necessary to explain experimental data related to shear softening and shear thinning as described above. We will finish by discussing how this theory points toward the need for a better understanding of the thermodynamics of out-of-equilibrium systems and, in particular, how it may imply a relationship between the concepts of free volume and "effective temperature." We believe that refining such concepts is a necessary next step toward constructing constitutive laws for plasticity with less emphasis on pure phenomenology and more on the statistical mechanical consequences of well defined atomic-scale mechanisms of deformation.
3. The Low-temperature STZ Theory
We start by presenting the fundamental form of a constitutive relationship that exhibits a transition from jammed (viscoelastic) to flowing (viscoplastic) behavior. The theory will be introduced in the low-temperature limit, the limit in which the state of the system only changes when the strain rate is non-zero. In the derivation that follows we will take the STZ picture more literally than perhaps is necessary. We will assume that the material is riddled with local
regions of approximately identical size that can undergo local shear rearrangements, that these regions are initially randomly oriented, and that they behave essentially as two-state systems. Importantly, we must also assume that these STZs can be created and destroyed due to deformation of the surrounding medium. The STZ picture implies a continuum of orientations of zones in the material. That is to say that, at any point, the distribution of zones could be written as a function ζ(ê_Ω) that denotes the number of zones oriented in the direction ê_Ω, the unit vector in the direction with angular coordinate(s) Ω. Because STZs contain orientational (director) information rather than directional (vector) information, ζ is defined on the unit sphere (unit circle in 2D) so that ζ(ê_Ω) = ζ(−ê_Ω). Because such a general field theory would be difficult to construct, we expand this orientational density field in terms of its first and second moments at each point. We write

n_{tot}(\vec{r}) = \frac{1}{2}\int d\Omega\;\zeta(\vec{r}, \Omega),   (9)

n_{ij}(\vec{r}) = \frac{1}{2}\int d\Omega\;\zeta(\vec{r}, \Omega)\,d_{ij}(\Omega).   (10)
Here we replace the spatially varying orientational function ζ with a spatially varying scalar field n_tot, corresponding to the number density of STZs, and a spatially varying traceless tensor field n_ij. By definition,

d_{ij} \equiv 2\hat{e}_i \hat{e}_j - \delta_{ij},   (11)
and the integrals in Eqs. (9) and (10) are over the unit circle in 2D or the unit sphere in 3D. Rather than delve into the full tensor version of the STZ theory here (see Ref. [28] for details), it is more instructive to derive the constitutive equations by first specializing to the case of a 2D body (surface or film) loaded in pure shear with the principal axes of the loading aligned along the xx and yy directions. Furthermore, we will assume that all the STZ's are aligned along the same principal axes. Without loss of generality, therefore, we let the deviatoric stress be diagonal along the x, y axes; specifically, let s_xx = −s_yy = s and s_xy = 0. Then choose the "+" zones to be oriented (elongated) along the x-axis, and the "−" zones along the y-axis; and denote the population density of zones oriented in the "+/−" directions by the symbol n±. The resulting equations are easily motivated in this context, and a generalization to the tensorial 3D form of the equations is straightforward. We begin by expressing the plastic shear strain rate in terms of the rates of STZ transformations using a kinetic formalism that deviates from Eq. (1). This
equation explicitly includes the effect of the relative populations of "+" and "−" zones on the plastic strain rate:

\dot{\epsilon}^{pl} \equiv \dot{\epsilon}^{pl}_{xx} = -\dot{\epsilon}^{pl}_{yy} = \frac{\lambda}{\tau_0}\left[R(-s)\,n_- - R(+s)\,n_+\right].   (12)
Here λ is a material-specific parameter with the dimensions of volume (or area in a strictly 2D model), which must be roughly equal to the size of an STZ, that is, a few atoms in size. The quantity in parentheses in Eq. (12) is the net rate per unit area at which STZ's are transforming from "−" to "+" orientations. Here, R(s)/τ0 and R(−s)/τ0 are the rates for "+" to "−" and "−" to "+" transitions, respectively. The instantaneous local densities of "+" and "−" zones are denoted by n+ and n−, respectively. For simplicity, we write these rates as explicit functions of only the deviatoric stress s, although they depend implicitly on the temperature and pressure and perhaps other quantities as well. Note that the inclusion of separate densities for the "+" and "−" regions immediately distinguishes this approach from that in Eq. (1). The equation of motion for the populations n± generally must be a master equation of the form:
\tau_0\,\dot{n}_\pm = R(\mp s)\,n_\mp - R(\pm s)\,n_\pm + \Gamma(s, n_+, n_-)\left(\frac{n_\infty}{2} - n_\pm\right).   (13)
The first two terms on the right-hand side are the stress-driven transition rates introduced in the preceding paragraph. They describe volume-conserving, pure-shear deformations which preserve the total population of the STZ's. These terms have no equivalent in Eq. (3) above. The last two terms in parentheses, proportional to Γ, describe creation and annihilation of STZ's. In the low-temperature theory, Γ is nonzero only when the plastic strain rate is nonzero; the molecular rearrangements required for creating or annihilating STZ's cannot occur spontaneously, that is, in the absence of external driving forces. The assumption in Eq. (13) that the annihilation and creation rates are both proportional to the same function Γ has profound implications in this theory. Among those implications is the requirement that n∞ be a strain-rate independent constant. Note that n∞ is the total density of zones generated in a system that is undergoing steady plastic deformation. It is not the same as the equilibrium density at nonzero temperature and zero strain rate, which ordinarily is said to go rapidly to zero as the temperature decreases below the glass transition. On the other hand, n∞ is a property of low-temperature materials at non-zero strain rates. The form in which we have cast Eq. (13) is consistent with our assumption that, under steady-state conditions with unidirectional shear stress, the number of events in which the molecules rearrange themselves is not proportional to the time but to the strain. That picture seems intuitively reasonable. If
the system requires a certain number of STZ-like rearrangements in order to achieve some deformation, then it should not matter (within limits) how fast that deformation takes place. The picture breaks down, of course, when there are competing rearrangement mechanisms. For example, the density of STZ's becomes strain-rate dependent when we introduce thermal fluctuations. We also expect that the picture may fail in polymeric glasses or polycrystalline solids, where more complex components may introduce extra length and time scales. We shall use the energetic arguments introduced in Ref. [29] to determine the factor Γ in Eq. (13), but first we must discuss the state variables and specific forms for the transition rates. We define dimensionless state variables by writing

\Lambda \equiv \frac{n_+ + n_-}{n_\infty}, \qquad \Delta \equiv \frac{n_+ - n_-}{n_\infty}.   (14)
In a more general treatment [28–31], Λ remains a scalar density, but Δ becomes a traceless symmetric tensor with the same transformation properties as the deviatoric stress. In this way they are closely related to n_tot and n_ij introduced in Eqs. (9)–(10). We also define:

S(s) \equiv \frac{1}{2}\left[R(-s) - R(+s)\right], \qquad C(s) \equiv \frac{1}{2}\left[R(-s) + R(+s)\right], \qquad T(s) \equiv \frac{S(s)}{C(s)}.   (15)
Then the STZ equations of motion become:

\tau_0\,\dot{\epsilon}^{pl} = \epsilon_0\,C(s)\left[T(s)\,\Lambda - \Delta\right];   (16)

\tau_0\,\dot{\Delta} = 2\,C(s)\left[T(s)\,\Lambda - \Delta\right] - \Gamma(s, \Lambda, \Delta)\,\Delta;   (17)

and

\tau_0\,\dot{\Lambda} = \Gamma(s, \Lambda, \Delta)\,(1 - \Lambda).   (18)
Here, ε0 ≡ λn∞ is roughly the fraction of the total volume of the low-temperature system in steady-state flow that is covered by the STZ's. This is a material-specific quantity. If ε0 is small, then the disorder induced in the system by deformation is small. Conversely, if ε0 is large, then the STZ-like defects cover the system and the material in some sense "melts" under persistent straining. Throughout this paper, we shall use only what we call the "quasi-linear" version of these equations [30]. That is, we note that T(s) and C(s) are, respectively, antisymmetric and symmetric dimensionless functions of s, and write:

T(s) \cong \frac{s}{s_y} \equiv \tilde{s}; \qquad C(s) \cong 1,   (19)
where s_y will turn out to be the yield stress. The choice C(s) ≅ 1 is, in effect, our definition of the time constant τ0. As pointed out in earlier papers [29, 30], this quasilinear approximation has important shortcomings. Neglecting the stress dependence of C(s) means that we overestimate the amount of plastic deformation that occurs at small stresses, and therefore also overestimate the rate at which orientational memory disappears in unloaded systems. Moreover, the quasilinear approximation is too simplistic to be related directly to atomic mechanisms, a point that we shall comment upon further. Nevertheless, the quasilinear theory has the great advantage that it is mathematically tractable and easy to interpret. It will serve to illustrate the main points that we wish to make in this paper, but aspects of the nonlinearities associated with C and T will need to be reintroduced before we shall be able to understand fully the non-equilibrium behavior of amorphous solids. Equations (16)–(18) now become:

\tau_0\,\dot{\epsilon}^{pl} = \epsilon_0\,(\Lambda\tilde{s} - \Delta);   (20)

\tau_0\,\dot{\Delta} = 2(\Lambda\tilde{s} - \Delta) - \Gamma(\tilde{s}, \Lambda, \Delta)\,\Delta;   (21)

and

\tau_0\,\dot{\Lambda} = \Gamma(\tilde{s}, \Lambda, \Delta)\,(1 - \Lambda).   (22)

The quantity Γn∞/τ0 is the STZ creation rate. We can derive an expression for that rate by using the energy-balance argument introduced in Ref. [29]. As before, we start by writing the first law of thermodynamics in the form:

2\,\dot{\epsilon}^{pl}\,s = \frac{2\epsilon_0 s_y}{\tau_0}(\Lambda\tilde{s} - \Delta)\,\tilde{s} = \epsilon_0 s_y \frac{d}{dt}\psi(\Lambda, \Delta) + Q(\tilde{s}, \Lambda, \Delta).   (23)

The left-hand side of Eq. (23) is the rate at which plastic work is done by the applied stress s = s_y s̃. On the right-hand side, ε0 s_y ψ is the state-dependent recoverable internal energy, and Q is the dissipation rate. So long as the STZ's remain uncoupled from the heat bath, Q must be positive in order for the system to satisfy the second law of thermodynamics, that is, for the work done in going around a closed cycle in the space of variables s, Λ, and Δ to be non-negative. As argued in Ref. [29], the simplest and most natural choice for Q – and, so far as we can tell, the only one that produces a sensible theory – is that (ε0 s_y/τ0)Γ be the energy dissipation rate per STZ. That is,

Q(\tilde{s}, \Lambda, \Delta) = \frac{\epsilon_0 s_y}{\tau_0}\,\Lambda\,\Gamma(\tilde{s}, \Lambda, \Delta).   (24)
With this hypothesis, we can use Eqs. (21) and (22) to write Eq. (23) in the form

2(\Lambda\tilde{s} - \Delta)\,\tilde{s} = \frac{\partial\psi}{\partial\Lambda}\,\Gamma\,(1 - \Lambda) + \frac{\partial\psi}{\partial\Delta}\left[2(\Lambda\tilde{s} - \Delta) - \Gamma\Delta\right] + \Lambda\,\Gamma.   (25)
Then, solving for Γ, we find:

\Gamma = \frac{2(\Lambda\tilde{s} - \Delta)\left(\tilde{s} - \partial\psi/\partial\Delta\right)}{\Lambda + (1 - \Lambda)(\partial\psi/\partial\Lambda) - \Delta(\partial\psi/\partial\Delta)}.   (26)
To assure that Γ remains non-negative for all s̃, we must let

\frac{\partial\psi}{\partial\Delta} = \frac{\Delta}{\Lambda},   (27)

so that the numerator becomes 2Λ(s̃ − Δ/Λ)². Then (see Ref. [29]), we choose

\psi(\Lambda, \Delta) = \frac{\Lambda}{2}\left(1 + \frac{\Delta^2}{\Lambda^2}\right),   (28)

so that

\Gamma(\tilde{s}, \Lambda, \Delta) = \frac{4\Lambda\,(\Lambda\tilde{s} - \Delta)^2}{(1 + \Lambda)(\Lambda^2 - \Delta^2)}.   (29)
This result has the physically appealing feature that it diverges when Δ² approaches its upper limit Λ², thus enforcing a natural boundary for dynamical trajectories in the space of the state variables Λ and Δ. It is convenient at this point to replace the variable Δ by m = Δ/Λ, so that the equations of motion become:

\tau_0\,\dot{\epsilon}^{pl} = \epsilon_0\,\Lambda\,(\tilde{s} - m);   (30)

\tau_0\,\dot{m} = 2(\tilde{s} - m)\left[1 - \frac{2m(\tilde{s} - m)}{(1 + \Lambda)(1 - m^2)}\right];   (31)

and

\tau_0\,\dot{\Lambda} = \frac{4\Lambda(\tilde{s} - m)^2}{1 - m^2}\;\frac{1 - \Lambda}{1 + \Lambda}.   (32)
At the stable fixed point of Eq. (32), Λ = 1, Eq. (31) becomes

\tau_0\,\dot{m} = \frac{2(\tilde{s} - m)(1 - \tilde{s}m)}{1 - m^2},   (33)

which exhibits what we believe to be the single most important consequence of the two-state STZ dynamics, that is, the occurrence of a yield stress that was missing in earlier theories. Note that Eq. (33) has two steady-state solutions: a jammed state with ε̇^pl = 0 for m = s̃, and a flowing state with nonzero ε̇^pl for m = 1/s̃. A simple analysis indicates that the first of these states is dynamically stable for s̃ < 1 and the second for s̃ > 1. Thus, the system exhibits an exchange of dynamical stability at the yield stress s̃ = 1. The steady-state flow obeys a Bingham-like law:

\dot{\epsilon}^{pl} = 0, \quad \tilde{s} < 1; \qquad \dot{\epsilon}^{pl} = \frac{\epsilon_0}{\tau_0}\left(\tilde{s} - \frac{1}{\tilde{s}}\right), \quad \tilde{s} > 1.   (34)
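The exchange of stability at s̃ = 1 is easy to verify numerically. The sketch below integrates the quasi-linear equations (30)–(32) to steady state and compares the resulting strain rate with the Bingham-like law (34); the stress values, initial condition, and integration time are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

def stz_low_T(t, y, s):
    """Quasi-linear low-temperature STZ equations (30)-(32); y = (m, Lam),
    with time measured in units of tau0."""
    m, lam = y
    denom = (1.0 + lam) * (1.0 - m**2)
    gamma = 4.0 * lam * (s - m)**2 / denom
    dm = 2.0 * (s - m) * (1.0 - 2.0 * m * (s - m) / denom)
    dlam = gamma * (1.0 - lam)
    return [dm, dlam]

for s in (0.5, 0.9, 1.1, 1.5):   # scaled stresses straddling the yield point
    sol = solve_ivp(stz_low_T, (0.0, 5000.0), [0.0, 0.1], args=(s,), rtol=1e-10)
    m, lam = sol.y[:, -1]
    rate = lam * (s - m)               # tau0 * eps_dot / eps0, from Eq. (30)
    bingham = max(0.0, s - 1.0 / s)    # steady-state prediction, Eq. (34)
    print(f"s~={s:.1f}: rate={rate:.4f}, Eq. (34) gives {bingham:.4f}")
```

Below the yield stress, m relaxes to s̃ and the strain rate vanishes (the jammed state); above it, m relaxes to 1/s̃ and the Bingham-like flow law is recovered.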
4. High-temperature STZ Theory
We return now to Eq. (13), the low-temperature master equation for the STZ population densities n±, and ask what changes need to be made in order to incorporate thermal effects at temperatures near the glass transition. One obvious possibility is to modify the rate factors R(±s) to include thermal activation across energy barriers. This is not difficult; but the extra analysis required is not relevant to the main points that we wish to make here and therefore will be omitted. The more important thermal effects are those that are completely missing in Eq. (13), specifically, the thermally assisted relaxation – i.e., aging – of the STZ variables that can occur spontaneously in the absence of external driving or plastic strain rate. There are two ways in which relaxation must occur. First, thermal fluctuations ought to act much like deformation-induced disorder in causing the n± to relax toward their steady-state values n∞/2. Second, there should be some annealing mechanism that causes the total STZ population to decrease. Both of these mechanisms involve dilations and contractions of the kind associated with creation and annihilation of STZ's; thus, again for simplicity, we assume that there is just a single relaxation rate, denoted ρ(T)/τ0, that characterizes them. That rate may have the Vogel–Fulcher or Cohen and Grest [32] super-Arrhenius form, rapidly becoming extremely small as the temperature falls below the glass temperature. Specifically, ρ(T) might have the form

\rho(T) = \rho_0 \exp\left(-\frac{V_{dil}}{v_f(T)}\right),   (35)

where ρ0 is a dimensionless prefactor, V_dil is the activation volume required to nucleate a dilational rearrangement, and v_f(T) is often identified as the free volume. We shall not make the latter assumption because we expect the free volume to depend on the history of plastic deformation, not just on the current temperature. In our work described here, we have evaluated ρ(T) directly from measurements of the linear Newtonian viscosity, and have not used explicit formulas such as Eq. (35). By proposing that critical slowing down near the glass transition arises due to a divergence in the time scale associated with a rate process, we are departing from the formalism that has been developed in the context of conventional free volume theory. In the latter theory, it is typically assumed that all rates are essentially Arrhenius, while the defect population that mediates deformation is strongly suppressed below the glass temperature. We have chosen to ascribe the change in dynamics to a diverging time scale rather than to the defect population, because it seems more likely that interesting temperature dependencies will arise by considering the paths connecting two low-energy states rather than by considering the energies of the configurations themselves.
Having made these assumptions, there are at least two ways forward that we can imagine. The first continues to describe the dynamics of the material in terms of a progression of defect–defect interactions, one that must be extended to include more complex diffusion-mediated effects at higher temperatures. This is the path that we have taken in recent investigations and have described in Ref. [33]. A second, more speculative path is based on the assumption that, during plastic deformation, the slow configurational degrees of freedom fall out of thermal equilibrium with the heat bath and can be described by an effective temperature, T_eff. The assumption of such a model would be that the configurational energy and density fluctuations are described by a Boltzmann distribution with this effective temperature, so that STZs having a characteristic formation energy E_Z would then be found with a probability proportional to exp(−E_Z/kB T_eff). Concepts of this kind have been presented recently in work by Ono et al. [34], Cugliandolo et al. [35], Sollich et al. [36], and Berthier and Barrat [37], and also are related to the ideas of Ref. [38]. Note that T_eff may play a role similar to the free volume. It is an intensive variable, roughly analogous to v_f(T) in Eq. (35), but it cannot be simply a function of the bath temperature T. It should, however, approach T at sufficiently high T, or for fixed nonzero T at sufficiently small rates of deformation. In what follows, we shall describe briefly both of these paths of investigation. Further details can be found in Ref. [33].
5. Defect Dynamics
To begin, we consider a defect-based theory and discuss its consequences. In this model our proposed form for the modified master equation is:

\tau_0\,\dot{n}_\pm = R(\mp s)\,n_\mp - R(\pm s)\,n_\pm + \left[\Gamma(s, \Lambda, \Delta) + \rho(T)\right]\left(\frac{n_\infty}{2} - n_\pm\right) - \kappa\,\rho(T)\,\frac{n_+ + n_-}{n_\infty}\,n_\pm.   (36)
The first and second appearances of ρ(T) on the right-hand side of Eq. (36) correspond, respectively, to its two roles described above – relaxation of the populations n± toward n∞/2, and annealing of the STZ population as a whole. The second of these terms, the quadratic form with a dimensionless multiplicative constant κ, is a bimolecular mechanism that has been discussed extensively in Refs. [39–41]. After repeating the energy-balance analysis of Eqs. (23)–(29), we find that the equations of motion analogous to Eqs. (31) and (32) are

\tau_0\,\dot{m} = 2(\tilde{s} - m) - \frac{m}{\Lambda}\,\tilde{\Gamma}(\tilde{s}, \Lambda, m, T),   (37)

and

\tau_0\,\dot{\Lambda} = \tilde{\Gamma}(\tilde{s}, \Lambda, m, T)\,(1 - \Lambda) - \kappa\,\rho(T)\,\Lambda^2,   (38)
where

\tilde{\Gamma}(\tilde{s}, \Lambda, m, T) = \frac{\Lambda\left[4(\tilde{s} - m)^2 + 2\rho(T) + \kappa\,\rho(T)\,\Lambda\,(1 + m^2)\right]}{(1 + \Lambda)(1 - m^2)}.   (39)
The expression for the plastic strain rate, given in Eq. (30), remains unchanged. We emphasize that this theory contains only three adjustable parameters: ε0, τ0, and κ. Using the above equations in the low-stress limit, we find the Newtonian viscosity to be

\eta_N \equiv \lim_{\dot{\epsilon}^{pl}\to 0} \frac{s}{\dot{\epsilon}^{pl}} = \frac{2 s_y \tau_0}{\epsilon_0\,\rho(T)},   (40)
which confirms our expectation that, in this theory, it is the rate function ρ(T) that governs viscous relaxation. Because we know the yield stress s_y, we can use Eq. (40) along with the experimentally measured values of η_N to determine the ratio ε0/τ0, a characteristic strain rate, up to a multiplicative scale factor for ρ(T). The scale factor ρ(T) is obtained at each temperature from the Newtonian viscosities in the low strain-rate limit, as shown in Table 1. With these constraints on the parameters, we obtain values for ε0/τ0, the values of ρ(T), and κ by fitting the steady-state data for stress as a function of strain rate in the non-Newtonian regime. To fit non-steady-state data, for example, stress–strain curves measured at constant strain rate, we have only one remaining adjustable parameter, ε0. The resulting analysis reproduces experimental results of Refs. [6, 9] with remarkable accuracy, exhibiting both the interesting scaling behavior that these experiments have revealed and providing quantitative evidence in favor of the general features of our theoretical framework.

Table 1. Experimental data for viscosity taken from Ref. [6], and values of ρ used in the present calculations

Temperature (K)    Viscosity (Pa s)    ρ
573                4.00 × 10¹⁴         1.07 × 10⁻⁸
593                4.03 × 10¹³         1.06 × 10⁻⁷
603                8.99 × 10¹²         4.77 × 10⁻⁷
613                4.03 × 10¹²         1.06 × 10⁻⁶
623                7.29 × 10¹¹         5.88 × 10⁻⁶
643                4.27 × 10¹⁰         1.01 × 10⁻⁴
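Equation (40) can be inverted to extract ρ(T) from the measured Newtonian viscosities, which is how the last column of Table 1 is obtained. In the sketch below, the combination 2 s_y τ0/ε0 is treated as a single constant; its numerical value is inferred here from the near-constancy of the product η_N ρ across the table and is an assumption of this illustration, not a number quoted in the text.

```python
# Newtonian viscosities from Table 1 (Pa s), keyed by temperature (K).
eta_N = {573: 4.00e14, 593: 4.03e13, 603: 8.99e12,
         613: 4.03e12, 623: 7.29e11, 643: 4.27e10}

C = 4.28e6  # ~ 2*s_y*tau0/eps0 in Pa s (assumed scale factor, see above)

# Invert Eq. (40): rho(T) = 2*s_y*tau0 / (eps0 * eta_N)
for T in sorted(eta_N):
    rho = C / eta_N[T]
    print(f"T = {T} K: rho = {rho:.2e}")  # reproduces the rho column of Table 1
```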
To illustrate the principal results of our analysis, we first follow the lead of Refs. [6, 9] by looking for scaling in the steady-state behavior of our system. To be specific about what we mean here, we show in Figs. 8 and 9 theoretically computed sets of stress–strain curves for various temperatures and fixed strain rates. As we shall explain shortly, these figures are to be compared with Fig. 3, which is taken from Ref. [6]. (See also Fig. 1 in Ref. [14] and Fig. 1
Figure 8. Theoretical curves of tensile stress versus strain for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 at several different temperatures as shown. The strain rate is ε̇_total = 1 × 10⁻¹ s⁻¹. For clarity, the curves have been displaced by constant increments along the strain axis.
Figure 9. Theoretical curves of tensile stress versus strain for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 at several different strain rates as shown. The temperature is T = 643 K. For clarity, all but the first of these curves have been displaced by the same amount along the strain axis.
in Ref. [9].) A general feature of these curves is that, when the strain rate is held constant, the stress rises through a maximum, decreases as the material softens, and then reaches a steady-state value. We shall discuss the initial transients later in this section, but look first at the late-stage, steady-state behavior. We compute the steady-state flow stress as a function of the strain rate by solving Eqs. (37) and (38) with ṁ = Λ̇ = 0. Then, as in Refs. [6, 9], we plot s̃ = s/s_y as a function of η_N ε̇^pl for eight different values of the relaxation rate ρ(T), corresponding to the eight different temperatures for which data are reported in Ref. [6]. The results are shown in Fig. 10. As discovered by Kato et al. [9], all of these curves lie on top of one another for stresses s̃ < 1 but, in our case, they diverge from each other in the flowing regime, s̃ > 1, where the Bingham-like behavior shown in Eq. (34) sets in. Figure 11a contains the same theoretical curves as those shown in Fig. 10, but plotted there as tensile stress versus scaled strain rate, and compared with experimental data taken from Fig. 9a of Ref. [6], which is the same data shown here in Fig. 5. The same theoretical functions and data points are replotted in Fig. 11b to show the normalized viscosity η/η_N as a function of the scaled strain rate. The latter figure is directly comparable to Ref. [6], Fig. 9b. Note that the range of strain rates shown in Fig. 11 corresponds to the range of the experimental data and is substantially smaller than that shown in Fig. 10. The theoretical curve that lies above the rest at high strain rates is for T = 683 K,
Figure 10. Scaling behavior in the STZ theory: shear stress s̃ as a function of strain rate scaled by η_N. This graph is plotted for the same set of temperatures as shown in Fig. 11a, but for a larger range of strain rates.
Figure 11. Tensile stress and viscosity as functions of scaled strain rate η_N ε̇_total. The data points for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 are taken from Ref. [6], Fig. 9a and b. The solid curves are theoretical results computed for the same set of temperatures as shown.
the highest of the temperatures reported in Ref. [6]. The data points at that temperature all lie at scaled strain rates that are too small to test this predicted breakdown of the scaling law. Our Fig. 12a shows individual theoretical and experimental curves of tensile stress as a function of (unscaled) strain rate for different temperatures. Here, the experimental data are those from Fig. 5. These curves are replotted in Fig. 12b to show (unscaled) viscosity as a function of strain rate, analogous to Fig. 5 reprinted from Ref. [6]. Our main conclusion from this steady-state analysis is that we are observing a transition from thermally assisted creep to viscoplastic flow in the neighborhood of the dynamic yield stress. At low stresses and strain rates, the linear response relation contains only the factor η_N ∝ 1/ρ(T), thus we obtain
Figure 12. Tensile stress (a) and viscosity (b) as functions of strain rate for different temperatures as shown. The data points are for the bulk metallic glass Zr41.2Ti13.8Cu12.5Ni10Be22.5 as reported by Lu et al. [6], Figs. 7 and 8. The solid lines are theoretical curves.
the simple scaling. Near the yield stress, however, our theoretical strain rate increases by several orders of magnitude for small increments of stress, and the experimental behavior tracks this trend accurately. This behavior resembles super-plasticity. Interestingly, the theoretical scaling persists through the "super-plastic" region and does not break down until true viscoplastic flow begins. So far, we have examined only steady-state behavior. We turn next to stress–strain curves obtained in constant strain-rate experiments such as those shown in our Figs. 8 and 9 and in Fig. 3. To plot these curves, we solve

\frac{\dot{\tilde{s}}}{2\tilde{\mu}} = \dot{\epsilon}_{total} - \frac{\epsilon_0}{\tau_0}\,\Lambda\,(\tilde{s} - m),   (41)
along with Eqs. (37) and (38) to compute s̃ as a function of the total strain. Here, μ̃ is the ratio of the shear modulus to the yield stress, which we know from experiment. Our Figs. 8 and 9 are drawn so as to be directly comparable to Fig. 3; that is, we use the same strain rates and temperatures. As mentioned above, we have chosen the parameter ε0 to optimize our fits to these curves. The one other parameter that we need for solving Eqs. (37), (38), and (41) is the initial value of Λ. For this purpose, we have chosen the steady-state solution of Eq. (38) at zero driving force, that is, the smallest value of Λ that can be achieved by annealing. In all cases, the agreement between theory and experiment seems satisfactory. The peak heights and positions for fixed strain rate ε̇_total = 0.1 s⁻¹ and varying temperatures in Fig. 8, and for fixed temperature T = 643 K and varying strain rates in Fig. 9, are within about ten percent of their experimental values. The experimental curves for low temperatures and large strain rates end where the samples break; the dashed lines in our figures indicate our theoretical extensions of those parts of the curves for which no experimental data are available. The one systematic discrepancy is that our initial theoretical slopes are smaller than the experimental ones. This is primarily an artifact of our quasi-linear theory, in which plastic deformation sets in at unphysically small stresses. Until we incorporate the fully nonlinear transition rates into our calculations, we do not believe that it will be meaningful to try to improve our fits to these transient stress–strain curves by further adjusting parameters such as κ or ε0. Finally, in Fig. 13, we use the material parameters deduced above for the system studied by Lu et al. [6] to plot stress–strain curves for different initial values of Λ, say Λ0, all at temperature T = 643 K and ε̇_total = 3.2 × 10⁻² s⁻¹. The different Λ0's correspond to different initial states of disorder produced by varying the annealing times and temperatures. Presumably, annealing for longer times at lower temperatures produces smaller values of Λ0; but it seems difficult to make quantitative estimates of this effect. These curves may be compared qualitatively with those shown in Fig. 4 and in Ref. [14], Fig. 9, where larger initial densities of STZ's produce larger plastic responses and correspondingly smaller overshoots during the early stages of deformation.
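For readers who want to reproduce the qualitative shape of these transient curves, the sketch below integrates Eqs. (37), (38), and (41), with Γ̃ taken from Eq. (39) as reconstructed above. All parameter values and initial conditions are illustrative assumptions, not the fitted values used for Figs. 8, 9, and 13.

```python
import numpy as np
from scipy.integrate import solve_ivp

def gamma_tilde(s, m, lam, rho, kappa):
    """Creation/annihilation rate of Eq. (39), as reconstructed here."""
    return (lam * (4.0 * (s - m)**2 + 2.0 * rho
                   + kappa * rho * lam * (1.0 + m**2))
            / ((1.0 + lam) * (1.0 - m**2)))

def constant_rate_test(t, y, rate, rho, kappa, eps0, mu_t):
    """Eqs. (37), (38), (41); y = (s_tilde, m, Lam), time in units of tau0;
    'rate' is the imposed total strain rate times tau0."""
    s, m, lam = y
    g = gamma_tilde(s, m, lam, rho, kappa)
    ds = 2.0 * mu_t * (rate - eps0 * lam * (s - m))   # Eq. (41)
    dm = 2.0 * (s - m) - (m / lam) * g                # Eq. (37)
    dlam = g * (1.0 - lam) - kappa * rho * lam**2     # Eq. (38)
    return [ds, dm, dlam]

# Illustrative parameters only; lam0 mimics a well-annealed initial state.
rate, rho, kappa, eps0, mu_t, lam0 = 1e-3, 1e-4, 1.0, 0.1, 50.0, 0.01
sol = solve_ivp(constant_rate_test, (0.0, 4000.0), [0.0, 0.0, lam0],
                args=(rate, rho, kappa, eps0, mu_t), max_step=1.0)
strain, stress = rate * sol.t, sol.y[0]
# 'stress' rises elastically, overshoots while the STZ density Lam catches
# up with the driving, and then relaxes toward steady flow, mimicking the
# transient peaks in Figs. 8, 9, and 13.
```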
6. Outline of an Effective Temperature Theory
Although the defect-dynamics theory described above seems remarkably successful, it does have some significant shortcomings. In the first place, it has no way of predicting the results of calorimetric measurements. A related difficulty is that a more complete theory should predict some relationship between the two steady-state defect densities denoted here by n eq (T ) (the
Figure 13. Tensile stress as a function of strain for several different values of Λ0. Curves are plotted for ε̇_total = 3.2 × 10⁻² s⁻¹ at T = 643 K.
thermal equilibrium density in the absence of shear) and n∞ (the low-temperature density in the presence of a persistent deformation rate). Presumably, these two quantities should approach each other in the limit of high temperature or small deformation rate, but they do not behave in that way here. These and other considerations lead us to believe that there may be a more fundamentally correct way of describing these intrinsically nonequilibrium phenomena. We now outline some features of a theoretical approach that we believe shows promise of solving the above problems. We assume that the STZ density is determined by the distribution of very slow spatial fluctuations of the configurational energies in our system, and we characterize these fluctuations by an intensive temperature-like variable that we call χ. If we assume a specific heat of order kB per molecule, then χ is the ratio of the effective thermal energy kB T_eff to a characteristic STZ energy E_Z. χ is thus a dimensionless intensive quantity that plays something like the role that the free volume did in earlier theories. In particular, our STZ density is proportional to exp(−1/χ) = exp(−E_Z/kB T_eff). The difference between the theory presented here and previous free volume theories is that, although T_eff may become equal to the bath temperature T under some circumstances, it is not just a function of T but, rather, depends on the deformation-induced internal state of disorder of the system. In fact, we may sometimes refer to T_eff as the "disorder temperature."
Another way in which our effective temperature approach differs from traditional free-volume models is again the attribution of the super-Arrhenius behavior to the rates rather than to the STZ density. That is to say that, unlike Refs. [14, 20], our χ does not converge in the limit of zero shear rate to a temperature-dependent equilibrium value determined by the Vogel–Fulcher or Cohen–Grest forms for $v_f(T)$, but rather converges simply to $k_BT/E_Z$. The rate of this convergence, however, slows dramatically below the glass temperature, so that χ falls out of thermal equilibrium in quenched systems. In order not to confuse the defect-based and disorder-temperature-based approaches, we eschew the addition of new defect interaction mechanisms, and instead assume that all STZ interactions are captured by the dissipation rate $\Gamma$ and the thermally induced transition rate ρ. The convergence of the STZ density to the value prescribed by our effective temperature is then accomplished by invoking detailed balance, so that the ratio of the creation and annihilation rates is consistent with the effective temperature of the material. Thus our new master equation for the STZ densities, analogous to Eq. (36), takes the form:
$$\tau_0\,\dot n_\pm = R(\mp s)\,n_\mp - R(\pm s)\,n_\pm + (\Gamma + \rho)\left(\frac{n_\infty}{2}\,e^{-1/\chi} - n_\pm\right). \qquad (42)$$
From here, we can derive χ-dependent equations of motion for $\Lambda$ and $m = \Delta/\Lambda$, analogous to Eqs. (30), (37) and (38). These equations simplify if we eliminate $\Lambda$ by assuming that the dimensionless STZ density is always in equilibrium with the effective temperature, so that $\Lambda = \exp(-1/\chi)$. Then we find
$$\tau_0\,\dot\epsilon^{\rm pl} = \epsilon_0\,e^{-1/\chi}\,(\tilde s - m); \qquad (43)$$
and
$$\tau_0\,\dot m = \frac{2(\tilde s - m)(1 - m\tilde s) - m\rho}{1 - m^2}. \qquad (44)$$
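To anticipate the Newtonian limit exhibited in Eq. (49) below, it is worth recording a short consistency check, using only the quasi-linear forms just written: setting $\dot m = 0$ in Eq. (44) and keeping only terms linear in $\tilde s$ and m gives
$$2(\tilde s - m) \simeq m\rho \quad\Longrightarrow\quad m \simeq \frac{2\tilde s}{2+\rho}, \qquad \tau_0\,\dot\epsilon^{\rm pl} \simeq \frac{\epsilon_0\,\rho}{2+\rho}\,e^{-1/\chi}\,\tilde s.$$
The ratio of stress to plastic strain rate therefore diverges as $\rho(T) \to 0$, like $2\tau_0\,e^{1/\chi}/(\epsilon_0\,\rho)$ for small ρ, which is the temperature dependence displayed in Eq. (49).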
We now must construct an equation of motion for χ that describes the way in which the slow configurational degrees of freedom fall out of equilibrium with the heat bath. In the absence of loading, but for extremely long times, we expect that the system will gradually converge to the thermalized values of χ and the STZ density. However, we must also consider what value χ approaches when we take $\dot\epsilon^{\rm pl} \to 0$ after taking $T \to 0$. At present we have no way to determine this value or its rate dependence, but we appeal to our earlier discussion to argue that χ must approach some steady-state value, say $\chi_\infty \equiv k_BT_\infty/E_Z$, in this limit. Then we can construct an equation of motion for χ by assuming that this effective temperature determines the internal energy of all of the configurational degrees of freedom – not just those associated with
STZ’s – via a temperature independent specific heat. Therefore the dynamics of this effective temperature must obey an equation of the form:
$$C_D\,\dot T_{\rm eff} = Q\left(1 - \frac{T_{\rm eff}}{T_\infty}\right) + \epsilon_0\,K(\chi)\,\frac{\rho(T)}{\tau_0}\,k_B\,(T - T_{\rm eff}). \qquad (45)$$
On the left-hand side, $C_D$ is a specific heat. The first term on the right-hand side describes how the energy dissipated during plastic deformation, at a rate Q, is absorbed by the configurational degrees of freedom and drives $T_{\rm eff}$ toward $T_\infty$. The second term in (45) describes how $T_{\rm eff} \to T$ in the absence of external driving, and does so at a rate proportional to the dilational fluctuation rate $\rho/\tau_0$, which becomes very small at low temperatures. $K(\chi)$ is a cooling coefficient with dimensions of inverse volume, defined with a factor $\epsilon_0$ purely for convenience. The χ dependence of K arises because K must be proportional to the density of sites at which these cooling events can occur, which must depend on the state of disorder. We can convert Eq. (45) into an equation for χ via our association of the energy dissipated during plastic deformation with $\Gamma$:
$$Q(s, \Lambda, m) = \frac{\tilde\mu\,\epsilon_0}{\tau_0}\,e^{-1/\chi}\,\Gamma(s, \Lambda, m). \qquad (46)$$
The function $\Gamma$ (not $\Gamma + \rho$) is
$$\Gamma(s, m, \chi) = \frac{2(\tilde s - m)^2 + \rho\,m^2}{1 - m^2}. \qquad (47)$$
Note that the term proportional to ρ in $\Gamma$ disappears in an undriven system, because m → 0 in that case. Our final equation for $\dot\chi$ is
$$\frac{\tau_0\,\tilde c}{\epsilon_0}\,\dot\chi = e^{-1/\chi}\,\Gamma(s, m, \chi)\,(\chi_\infty - \chi) + \tilde K(\chi)\,\rho(T)\left(\frac{T}{T_Z} - \chi\right). \qquad (48)$$
Here $\tilde c$ and $\tilde K$ are the dimensionless specific heat and cooling coefficients, respectively, and $T_Z = E_Z/k_B$. It is useful at this point to exhibit the expression for the Newtonian viscosity that emerges from these equations of motion:
$$\eta_N = \lim_{\dot\epsilon^{\rm pl}\to 0}\frac{\bar\mu\,s}{\dot\epsilon^{\rm pl}} \approx \frac{2\,s_y\,\tau_0}{\epsilon_0\,\rho(T)}\,\exp(T_Z/T). \qquad (49)$$
Thus we recover approximately the same form as we did for our previous defect-dynamics-based model, Eq. (40), except that here the temperature dependence appears both in the rate constant ρ and in the thermal equilibrium STZ density, $\exp(-1/\chi) \approx \exp(-T_Z/T)$.
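For readers who wish to experiment with this system, the following minimal numerical sketch integrates Eqs. (43), (44) and (48), as reconstructed above, at constant dimensionless stress. It is not the authors' code: all parameter values and initial conditions are illustrative placeholders, not the fitted values discussed earlier.

```python
# Minimal sketch: integrate the effective-temperature STZ equations
# (43), (44) and (48) at constant dimensionless stress s_tilde.
# All parameter values and initial conditions are illustrative only.
import numpy as np
from scipy.integrate import solve_ivp

tau0 = 1.0                      # basic time scale; time in units of tau0
eps0, rho = 1.0, 1e-2           # epsilon_0 and thermal rate rho(T), assumed
chi_inf, c_tilde, K_tilde = 0.15, 1.0, 1.0
T_over_TZ = 0.08                # bath temperature in units of T_Z = E_Z/k_B
s_tilde = 1.2                   # constant driving stress, above yield

def rhs(t, y):
    m, chi = y
    gamma = (2*(s_tilde - m)**2 + rho*m**2) / (1 - m**2)                # (47)
    dm = (2*(s_tilde - m)*(1 - m*s_tilde) - m*rho) / ((1 - m**2)*tau0)  # (44)
    dchi = eps0*(np.exp(-1/chi)*gamma*(chi_inf - chi)
                 + K_tilde*rho*(T_over_TZ - chi)) / (c_tilde*tau0)      # (48)
    return [dm, dchi]

sol = solve_ivp(rhs, (0.0, 2000.0), [0.0, 0.10], max_step=1.0)
m, chi = sol.y
eps_dot_pl = eps0*np.exp(-1/chi)*(s_tilde - m)/tau0                     # (43)
print("final chi:", chi[-1], " final plastic strain rate:", eps_dot_pl[-1])
```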
7. Outlook
The investigations we have discussed here lead us to believe that the low-temperature STZ theory captures one of the primary salient features of the experimental data: a transition from a jammed state to a flowing state at a well-defined yield stress. This underlying dynamical transition is observed at high temperatures as a transition from thermally activated creep flow to mechanically driven superplastic flow; hence the material shear thins. Questions remain as to the precise way in which thermally activated transitions should be introduced in such a model. In our initial investigations we have proposed two such methods, which consider very different physical pictures of the thermodynamics of deformation. In our preliminary opinion, the effective temperature model may do a better job of capturing the underlying physics than the defect-dynamics picture. However, only careful theoretical development and experimental tests will resolve this issue. Both of these models are frameworks that will have to be improved in specific respects if we are to develop one into a yet more quantitative, predictive description of plastic deformation in amorphous solids. We conclude this chapter by identifying three directions for the next phases of these investigations. Fully non-linear, temperature-dependent transition rates. When we examine the quasi-linear STZ theory in the context of a theory that includes thermal fluctuations, we see that it is a special case in which the shear rearrangements are not modeled as realistically as the dilations or contractions. To see this in more detail, go back to our original, fully non-linear version of the low-temperature rate factors R(s) [42]:
$$R(s) = \frac{1}{\tau_0}\exp\left[-\frac{V^{\rm shear}(s)}{v_f}\right]; \qquad V^{\rm shear}(s) = V_0^{\rm shear}\,e^{-s/\bar\mu}, \qquad (50)$$
where $V^{\rm shear}(s)$ is the activation volume required to nucleate a shear transformation. Our idea here was that, at temperatures well below the glass temperature, the transitions between STZ states are not thermally activated but, rather, are controlled entropically. That is, the rate factors are determined by the number of paths that the molecules within a zone can follow in moving around each other while going from one state to the other. The exponential factor in Eq. (50) is an approximation for a weighted measure of that number of paths. Its s-dependence means that greater weight must be given to paths moving in the direction of the stress than opposite to it. The exponential form of $V^{\rm shear}(s)$ is the simplest non-negative function that becomes arbitrarily small at large s, and it introduces just one new parameter, the effective STZ stiffness $\bar\mu$. The quasi-linear version of the theory corresponds (roughly) to the limit of small s and small values of $V_0^{\rm shear}/v_f$.
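To make the strength of this stress dependence concrete, the following small sketch evaluates Eq. (50) numerically. The Cohen–Grest-like free-volume function and every number in it are placeholders of our choosing, not values from this chapter:

```python
# Evaluate the fully non-linear STZ rate factor of Eq. (50),
#   R(s) = (1/tau0) exp[-V_shear(s)/v_f],  V_shear(s) = V0 * exp(-s/mu_bar).
# The free-volume law v_f(T) below is only a stand-in with the qualitative
# Cohen-Grest property that v_f becomes small at low T; its form and all
# parameter values are assumptions, not the chapter's.
import numpy as np

tau0, V0, mu_bar = 1.0, 5.0, 0.5    # V0 in units of a molecular volume

def v_f(T, T0=0.4, a=1.0):
    # placeholder free volume, vanishing linearly as T -> T0 from above
    return a * np.maximum(T - T0, 1e-6)

def R(s, T):
    V_shear = V0 * np.exp(-s / mu_bar)     # activation volume of Eq. (50)
    return np.exp(-V_shear / v_f(T)) / tau0

for T in (0.5, 0.6, 0.8):
    print("T =", T, " R(s = 0, 0.5, 1.0) =",
          ["%.3e" % R(s, T) for s in (0.0, 0.5, 1.0)])
```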
Comparison of Eq. (50) with Eq. (35) indicates that the natural way to include thermal effects in R(s) is simply to let $v_f$ have the T-dependent Cohen–Grest form. This means that, at low T, the ratio $V_0^{\rm shear}/v_f(T)$ becomes very large, which, in turn, implies that the functions C(s) and T(s) introduced in Eq. (15) become strongly stress dependent, and the quasi-linear approximations made in Eq. (19) are no longer valid. Importantly, C(s) becomes very small for small s, so that plastic deformation is strongly suppressed at stresses appreciably below the yield stress. The strong stress dependence of C(s) and T(s) should be especially apparent in transient behavior of the kind shown in Figs. 3, 4, 8, 9 and 13. Here, the initial response to loading at small stress will be almost entirely elastic, and plastic deformation will begin only later in the process. We shall have to use the fully non-linear theory when undertaking more detailed comparisons with these kinds of experimental results. Shear localization. All of the analysis in this paper pertains to spatially homogeneous systems. In order to make closer contact with experiments, we shall have to understand why and when these systems become unstable against shear banding and inhomogeneous failure modes, especially fracture. One mechanism for shear localization that we have not mentioned in this presentation is the elastic interaction between STZ's studied in Ref. [43]. As shown in that paper, an STZ-like event generates a quadrupolar stress field that induces other nearby events along preferred spatial directions and suppresses events elsewhere. The result is a tendency toward shear localization that should be interesting to examine in the context of this more general version of the STZ theory. A second mechanism that seems likely to play a role in shear localization is already built into our equations of motion when we write them in terms of spatially varying fields. From Eqs. (39) and (38), we see that the STZ density grows most rapidly, within limits, in regions where $\Lambda$ is already large. This feedback effect, perhaps coupled to the elastic interactions mentioned above, is our best guess at present about how shear banding will emerge in the STZ theory. Effective temperature and the interpretation of χ, especially in granular materials and related systems. Apart from our investigations, there is increasing evidence that an effective temperature can be associated with systems like sheared foams or granular materials [34–37]. In those non-molecular systems, the usual kinetic temperature is zero because the constituents have very large masses, but an effective temperature determined by response–fluctuation relations goes to a non-zero limit when the deformation rate becomes arbitrarily small. In our present molecular system, there is a true kinetic temperature, but well below the glass transition that temperature is so small that thermally assisted molecular rearrangements are effectively frozen out. During irreversible processes such as plastic deformation, therefore, the way in which our
slow, configurational degrees of freedom characterized by χ fall out of equilibrium with the fast, thermal (vibrational) degrees of freedom should resemble the behavior of the non-molecular systems. It should be an important and interesting project to see how or whether the STZ concepts can be extended to the latter kinds of systems.
References [1] R. Hill, The Mathematical Theory of Plasticity, Clarendon Press, Oxford, UK, 1950. [2] M. Treacy and J. Gibson, Ultramicroscopy, 52, 31, 1993. [3] J. Li, Z. Wang, and T. Hufnagel, Phys. Rev. B, 65, 144201, 2002. [4] W. Jiang and M. Atzmon, Acta Mater., 51, 4095, 2003. [5] A. Argon and L. Shi, "Simulation of plastic flow and distributed shear relaxations in metallic glasses by means of the amorphous Bragg bubble raft," In: V. Vitek (ed.), Amorphous Materials: Modeling of Structure and Properties, The Metallurgical Society of AIME, pp. 279–303, 1982. [6] J. Lu, G. Ravichandran, and W. Johnson, Acta Mater., 51, 3429, 2003. [7] O. Hasan and M. Boyce, Polymer, 34, 5085, 1993. [8] O. Hasan and M. Boyce, Polym. Eng. Sci., 35, 331, 1995. [9] H. Kato, Y. Kawamura, A. Inoue, and H. Chen, Appl. Phys. Lett., 73, 3665, 1998. [10] R.G. Larson, The Structure and Rheology of Complex Fluids, Oxford University Press, Oxford, 1999. [11] F. Albano, N. Lacevic, M. Falk, and S. Glotzer, Mater. Sci. Eng. A, 2004. [12] P. Thompson, G. Grest, and M. Robbins, Phys. Rev. Lett., 68, 3448–3451, 1992. [13] F. Spaepen, Acta Metall., 25, 407, 1977. [14] P. De Hey, J. Sietsma, and A. Van Den Beukel, Acta Mater., 46, 5873, 1998. [15] O. Hasan, M. Boyce, X. Li, and S. Berko, J. Polym. Sci.: Part B, 31, 185, 1993. [16] H. Eyring, J. Chem. Phys., 4, 283, 1936. [17] A. Krausz and H. Eyring, Deformation Kinetics, John Wiley and Sons, New York, 1975. [18] U. Kocks, A. Argon, and M. Ashby, Progress in Materials Science, vol. 19, Pergamon Press, Oxford, UK, 1975. [19] A. Argon, Phil. Mag., 28, 839, 1973. [20] P. Duine, J. Sietsma, and A. van den Beukel, Acta Metall. Mater., 40, 743, 1992. [21] D. Turnbull and M. Cohen, J. Chem. Phys., 52, 3038, 1970. [22] A. van den Beukel and J. Sietsma, Acta Metall. Mater., 38, 383, 1990. [23] R. Huang, Z. Suo, J. Prevost, and W. Nix, J. Mech. Phys. Solids, 50, 1011, 2002. [24] P.S. Steif, F. Spaepen, and J.W. Hutchinson, Acta Metall., 30, 447, 1982. [25] R.M. McMeeking and J.R. Rice, Int. J. Solids Struct., 11, 601, 1975. [26] P.M. Naghdi, Z. Angew. Math. Phys., 41, 315, 1990. [27] L. Anand and M.E. Gurtin, Int. J. Solids Struct., 40, 1465–1487, 2003. [28] L. Pechenik, cond-mat/0305516, 2003. [29] J. Langer and L. Pechenik, Phys. Rev. E, 68, 061507, 2003. [30] M. Falk and J. Langer, M.R.S. Bull., 25, 40, 2000. [31] L. Eastgate, J. Langer, and L. Pechenik, Phys. Rev. Lett., 90, 045506, 2003. [32] M. Cohen and G. Grest, Phys. Rev. B, 20, 1077, 1979.
[33] M. Falk, J. Langer, and L. Pechenik, Submitted, 2003. [34] I. Ono, C. O’Hern, D. Durian, S. Langer, and A. Liu, Phys. Rev. Lett., 89, 095703, 2002. [35] L. Cugliandolo, J. Kurchan, and L. Peliti, Phys. Rev. E, 55, 3898, 1997. [36] P. Sollich, F. Lequeux, P. Hebraud, and M. Cates, Phys. Rev. Lett., 78, 2020, 1997. [37] L. Berthier and J.-L. Barrat, Phys. Rev. Lett., 89, 095702, 2002. [38] A. Mehta and S. Edwards, Phys. A, 157, 1091, 1990. [39] A. Taub and F. Spaepen, Acta Metall., 28, 1781, 1980. [40] A. Greer and F. Spaepen, Ann. N.Y. Acad. Sci., 371, 218, 1981. [41] S. Tsao and F. Spaepen, Acta Metall., 33, 881, 1985. [42] M. Falk and J. Langer, Phys. Rev. E, 57, 7192, 1998. [43] J. Langer, Phys. Rev. E, 64, 011504, 2001.
4.4 STATISTICAL PHYSICS OF RUPTURE IN HETEROGENEOUS MEDIA
Didier Sornette
Institute of Geophysics and Planetary Physics and Department of Earth and Space Science, University of California, Los Angeles, California, USA, and CNRS and Université des Sciences, Nice, France
The damage and fracture of materials are technologically of enormous interest due to their economic and human cost. They cover a wide range of phenomena, such as the cracking of glass, the aging of concrete, the failure of fiber networks in the formation of paper, and the breaking of a metal bar subject to an external load. Failure of composite systems is of utmost importance in the naval, aeronautics and space industries [1]. By the term composite, we refer to materials with heterogeneous microscopic structures and also to assemblages of macroscopic elements forming a super-structure. Chemical and nuclear plants suffer from cracking due to corrosion either of chemical or radioactive origin, aided by thermal and/or mechanical stress. Despite the large amount of experimental data and the considerable effort that has been undertaken by materials scientists [2], many questions about fracture have not yet been answered. There is no comprehensive understanding of rupture phenomena, but only a partial classification in restricted and relatively simple situations. This lack of fundamental understanding is reflected in the absence of reliable prediction methods for rupture based on a suitable monitoring of the stressed system. Not only is there a lack of non-empirical understanding of the reliability of a system, but the empirical laws themselves often have limited value. The difficulties stem from the complex interplay between heterogeneities and modes of damage, and from the possible existence of a hierarchy of characteristic scales (static and dynamic). Many material ruptures occur by a "one-crack" mechanism, and a lot of effort is being devoted to the understanding, detection and prevention of the nucleation of the crack [3, 4]. Exceptions to the "one-crack" rupture mechanism are heterogeneous materials such as fiber composites, rocks, concrete under compression, ice, tough ceramics and materials with large distributed
residual stresses. The common property shared by these systems is the existence of large inhomogeneities, which often limit the use of homogenization or effective-medium theories for the elastic and, more generally, the mechanical properties. In these systems, failure may occur as the culmination of progressive damage involving complex interactions between multiple defects and growing microcracks. In addition, other relaxation, creep, ductile, or plastic behaviors, possibly coupled with corrosion effects, may come into play. Many important practical applications involve the coupling between mechanical and chemical effects, with the competition between several characteristic time scales. Application of stress may act as a catalyst of chemical reactions [5] or, reciprocally, chemical reactions may lead to bond weakening [6] and thus promote failure. A dramatic example is the aging of present-day aircraft due to repeated loading in a corrosive environment [7]. The interaction between multiple defects and the existence of several characteristic scales present a considerable challenge to the modeling and prediction of rupture. Those are the systems and problems on which the interdisciplinary marriage with statistical physics has brought fruitful new ideas, which we now briefly present.
1. Creep Rupture
There are many different conditions under which a material can rupture: constant strain rate, or stress, or stress rate, or more complex strain/stress histories (involving also other control parameters such as temperature, water content, chemical activity, etc.). The situation in which a stress is imposed is very frequent in mechanics (constant weight) and leads to the phenomenon of creep (also known as "static fatigue"). A stress step leads in general to a strain response and to other observable changes such as acoustic emissions (see Ref. [8] for a review). Understanding the damage and rupture of a material subjected to a constant stress is thus a good starting point. For industrial applications, creep experiments are not always practical, because they require adjusting the stress to subcritical levels such that one does not have to wait too long before the interesting processes (eventually including rupture) can be monitored. Accelerated tests, which yield information more quickly, include step-stress and ramp-stress loading [9]. As we said, time-dependent deformation of a material subjected to a constant stress level is known as creep. In creep, the stress is below the mechanical strength of the material, so that rupture does not occur upon application of the load. It is by waiting a sufficiently long time that the cumulative strain may finally end in catastrophic rupture. Creep is all the more important the larger the applied stress and the higher the temperature. The time to creep rupture is found, in a large variety of materials, to be controlled by the sign and magnitude of the stress, the temperature and the microstructure.
Creep is often divided into three regimes: (i) the primary creep regime corresponds to a decay of the strain rate following the application of the constant stress, which can often be described by the so-called Andrade law (a power-law decay with time); (ii) the secondary regime describes an (often very long) crossover, characterized by an approximately constant strain rate, towards (iii) the tertiary creep regime, in which the strain rate accelerates up to rupture. Andrade's law for the strain rate is similar to the power-law relaxation of the aftershock seismic activity triggered by the stress change induced by a previous earthquake, known as Omori's law [10]. In creep experiments, Omori's law describes the decay of the rate of acoustic emissions in the primary regime. Creep experiments are thus interesting both because they constitute standard mechanical tests of the long-time properties of structures and because of the power laws reminiscent of the critical behavior of complex self-organizing systems that have become popular paradigms, as discussed below. Studies of creep rupture phenomena have been performed through direct experiments [11–14] and different models [15–25]. While much past work was devoted to homogeneous materials like metals and ceramics, many recent studies are concerned with heterogeneous materials like composites and rocks [11–14]. The knowledge of the failure properties of composite materials is of great importance because of the increasing number of applications of composites in engineering structures. The long-term behavior of these materials, especially polymer-matrix composites, is a critical issue for many modern engineering applications such as aerospace, biomedical, and civil engineering infrastructure. The primary concern in the long-term performance of composite materials is to obtain critical engineering properties that extend over the projected lifetime of the structure. Viscoelastic creep and creep-rupture behaviors are among the critical properties needed to assess the long-term performance of polymer-based composite materials. The knowledge of these critical properties is also required to design material microstructures which can be used to construct highly reliable components. For heterogeneous materials, the underlying microscopic failure mechanism of creep rupture is very complex, depending on several characteristics of the specific types of materials. Beyond the development of analytical and numerical models, which predict the damage history in terms of the specific parameters of the constituents, another approach is to study the similarity of creep rupture with phase transition phenomena, as summarized here. This approach tackles the large range of scales involved in the damage evolution by using coarse-grained models describing the mechanisms of creep, damage and precursory rupture, averaging over the microscopic degrees of freedom to retain only the few major ingredients that are thought to be the most relevant. By comparing the predictions of a hierarchy of models, from simple to elaborate, it is then possible to assess which ingredients are relevant.
A recent experimental work on heterogeneous structural materials, conducted at the GEMPPM at INSA Lyon, illustrates this approach [26]. Figure 1 shows a rapid and continuous decrease of the strain rate de/dt in the primary creep regime, which can be described by Andrade's law (Omori's law for the acoustic emissions)
$$\frac{de}{dt} \sim \frac{1}{t^{p}}, \qquad (1)$$
with an exponent p smaller than or equal to one. A quasi-constant strain rate (steady-state or secondary creep) is observed over an important part of the total creep time, followed by an increasing creep rate (tertiary creep regime) culminating in fracture. Creep strains at fracture are large, with values from a few percent up to 40% for such composite samples. The acceleration of the strain rate before failure is well fitted by a power-law singularity
$$\frac{de}{dt} \sim \frac{1}{(t_c - t)^{p'}}, \qquad (2)$$
with an exponent p′ smaller than or equal to one. The critical time tc determined from the fit of the data with expression (2) is generally close to the observed failure time (within a few seconds). The same temporal evolution is generally obtained for the acoustic emission activity as for the strain rate, whether one plots the acoustic emission event rate or the rate of acoustic emission energy. There are much larger fluctuations for the energy rate than for the event rate, due to the broad distribution of acoustic emission energies.
[Figure 1: three log-scale panels of strain rate versus time; the power-law fits displayed in the panels give p = 0.99 and p′ = 0.80.]
Figure 1. Creep strain rate for a Sheet Molding Compound (SMC) composite consisting of a combination of polyester resin, calcium carbonate filler, thermoplastic additive and randomly oriented short glass reinforcing fibres. The creep experiment was performed at a stress of 48 MPa and a temperature T = 100 °C, below the glass transition, at the GEMPPM, INSA Lyon, Villeurbanne, France. The stress was increased progressively and reached a constant value after about 17 s. Left panel: full history on a linear time scale; middle panel: time is shown on a logarithmic scale to test for the existence of a power-law relaxation regime; right panel: time is shown as the logarithm of the time to rupture tc − t, such that a time-to-failure power law (2) appears as a straight line. Reproduced from Ref. [26].
However, the crossover time between primary and tertiary creep, and the values of p and p′, are similar for the acoustic emission event rate and the acoustic emission energy rate. This suggests that the amplitude distribution does not depend on time, a conclusion which is verified experimentally. How can one rationalize all these observations?
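Before turning to models, it may help to see how exponents such as p and p′ are extracted in practice. The following is a minimal fitting sketch on synthetic data; none of the numbers, and none of the helper names, come from Ref. [26]:

```python
# Extract Andrade's exponent p from the primary regime, Eq. (1), and the
# time-to-failure exponent p' and critical time tc from the tertiary regime,
# Eq. (2).  The synthetic record below only mimics the shape of Fig. 1.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
tc_true = 3.0e4                                   # synthetic failure time (s)
t = np.linspace(20.0, 2.97e4, 500)
rate = t**-0.99 + (tc_true - t)**-0.8             # primary + tertiary pieces
rate *= rng.lognormal(sigma=0.1, size=t.size)     # multiplicative noise

# Primary creep: Andrade's law is a straight line of slope -p in log-log.
early = t < 500.0
p = -np.polyfit(np.log(t[early]), np.log(rate[early]), 1)[0]

# Tertiary creep: fit the amplitude, tc and p' on the late-time data.
def tertiary(time, A, tc, p_prime):
    return A * (tc - time)**(-p_prime)

late = t > 2.5e4
(A, tc_fit, p_prime), _ = curve_fit(
    tertiary, t[late], rate[late], p0=(1.0, 3.05e4, 1.0),
    bounds=([0.0, 2.98e4, 0.1], [10.0, 4.0e4, 3.0]))
print(f"p = {p:.2f}, p' = {p_prime:.2f}, tc = {tc_fit:.3g} s")
```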
2. The Role of Heterogeneities and Disorder
First, we need to define more precisely what is meant by "heterogeneity" or "disorder". Disorder may describe the existence of a distribution (say Weibull-like) of material strengths and/or of elastic properties, as well as the presence of internal surfaces such as fiber–matrix interfaces, voids and microcracks (or internal microdefects). In this sense, a kevlar-matrix or carbon-matrix composite would behave more like a heterogeneous system than like a homogeneous matrix. There is no unique way of defining the amplitude of disorder, since the classification depends on how the mechanics and physics respond to the heterogeneity. It can, in fact, be shown from a theorem of von Neumann and Morgenstern [27] that the existence of possible correlations in the disorder prevents the existence of a unique absolute measure of disorder amplitude. In other words, the measure of disorder is relative to the problem. In practice, it can usually be quantified by some measure of the contrast between the material and strength properties of the components of the system, weighted by their relative concentrations and their scales. When the disorder is uncorrelated in space, a reasonable measure of its amplitude is the width or standard deviation (when it exists) of its distribution. The correlation length of the disorder and the characteristic sizes and their distribution are also important variables, as they control the length scales that are relevant for the stress heterogeneity. A consequence is the size/volume effect, which is a very important practical subject. As already mentioned, the key parameter controlling the nature of damage and rupture is the degree and nature of disorder. This was considered early by Mogi [28], who showed experimentally on a variety of materials that the larger the disorder, the stronger and more useful are the precursors to rupture. For a long time, the Japanese research effort on earthquake prediction and risk assessment was based on this very idea [29]. A quantification of this idea of the role of heterogeneities in the nature of rupture has been obtained with a two-dimensional spring-block model with stress transfer over a limited range, and with the democratic fiber bundle model [30]. These models do not claim realism but attempt rather to capture the interplay between heterogeneity and the stress transfer mechanism. It was found that heterogeneity plays the role of a relevant field (in the language of the statistical physics of critical phase transitions): systems with limited stress amplification exhibit a tri-critical transition [31], from a Griffith-type abrupt rupture (first-order)
regime to a progressive damage (critical) regime as the disorder increases. In the two-dimensional spring-block model of surface fracture [30], the stress can be released by spring breaks and block slips. This spring-block model may represent schematically the experimental situation in which a balloon covered with paint or dry resin is progressively inflated. An industrial application may be, for instance, a metallic tank wrapped with carbon or kevlar fibers impregnated in a resin matrix, which is slowly pressurized [32]. As a consequence, the tank elastically deforms, transferring tensile stress to the overlayer. Slipping (called fiber–matrix delamination) and cracking can thus occur in the overlayer. In Ref. [30], this process is modeled by an array of blocks which represents the overlayer on a coarse-grained scale, in contact with a surface through solid friction. The solid friction limits stress amplification. The fact that the disorder is so relevant as to create the analog of a tri-critical behavior can be traced back to the existence of solid friction on the blocks, which ensures that the elastic forces in the springs are carried over a bounded distance (equal to the size of a slipping "avalanche") during the stress transfer induced by block motions. There are similarities between this model and models of quasi-periodic matrix cracking in fibrous composites and of fragmentation of fibers in the so-called "single-filament-composite" test. This last model has been extensively developed and extended in global and local load-sharing frameworks [33–35]. In the presence of long-range elasticity, disorder is found to be always relevant, leading to a critical rupture. However, the disorder controls the width of the critical region [36]: the smaller the disorder, the smaller the critical region, which may become too small to play any role in practice. This has been confirmed by simulations of the "thermal fuse model" described below [37]: the damage rate on the approach to failure for different disorder amplitudes can be rescaled onto a universal master curve.
3. Qualitative Physical Scenario: From Diffuse Damage to Global Failure
The following qualitative physical picture for the progressive damage of a heterogeneous system leading to global failure has emerged from a large variety of theoretical, numerical, and experimental works (see for instance Refs. [38–41]). First, single isolated defects and microcracks nucleate; then, with increasing load or time of loading, they both grow and multiply, leading to an increase of the density of defects per unit volume. As a consequence, defects begin to merge until a "critical density" is reached. Uncorrelated percolation [42] provides a starting point for modeling, valid in the limit of very large disorder [43]. For realistic systems, long-range correlations transported by the
stress field around defects and cracks make the problem much more subtle. Time dependence is expected to be a crucial aspect of the build-up of correlations in these processes. As the damage increases, a new "phase" appears, in which microcracks begin to merge, leading to screening and other cooperative effects. Finally, the main fracture is formed, causing global failure. The nature of this global failure may be abrupt ("first-order") or "critical", depending on the strength of the heterogeneity as well as on the load transfer and stress relaxation mechanisms. In the "critical" case, the failure of composite systems may often be viewed, in simple intuitive terms, as the result of a "correlated percolation process." However, the challenge is to describe the transition from damage and corrosion processes at the microscopic level to macroscopic failure.
4. Scaling and Critical Point
Motivated by the multi-scale nature of ruptures in heterogeneous systems and by analogies with the percolation model [42], statistical physicists suggested in the mid-1980s that the rupture of sufficiently heterogeneous media would exhibit some universal properties, in a way maybe similar to critical phase transitions [43–45]. The idea was to build on the knowledge accumulated in statistical physics on the so-called N-body problem and on cooperative effects in order to describe multiple interactions between defects. However, most of the models were drastically simplified and essentially all of them quasi-static, with rather unrealistic loading rules [46, 47]. Suggestive scaling laws, including multifractality, were found to describe size effects and damage properties [46, 48], but the relevance to real materials was not convincingly demonstrated, with a few exceptions (e.g., percolation theory to explain the experimentally based Coffin–Manson law of low-cycle fatigue [49], or the Portevin–Le Chatelier effect in diluted alloys [50]). With numerical simulations and perturbation expansions, Hansen, Hinrichsen and Roux [48] (see also Ref. [46]) have used this class of quasi-static rupture models (with short-range as well as long-range interactions) to classify three possible rupture regimes, as a function of the concentrations of weak versus strong elements in the system. Specifically, the distribution p(x) of rupture thresholds x of the elements of the discretized systems was parameterized as follows: $p(x) \sim x^{\phi_0 - 1}$ for $x \to 0$ and $p(x) \sim x^{-(1+\phi_\infty)}$ for $x \to +\infty$. Then, the three regimes depend on the values of the exponents $\phi_0$ and $\phi_\infty$ compared with two critical values $\phi_0^c$ and $\phi_\infty^c$. The "weak disorder" regime occurs for $\phi_0 > \phi_0^c$ (few weak elements) and $\phi_\infty > \phi_\infty^c$ (few strong elements) and boils down essentially to the nucleation of a "one-crack" run-away. For $\phi_0 \le \phi_0^c$ (many weak elements) and $\phi_\infty > \phi_\infty^c$ (few strong elements), the rupture is controlled by the weak elements, with important size effects. The damage is diffuse but presents a structuration at large scales.
For $\phi_0 > \phi_0^c$ (few weak elements) and $\phi_\infty \le \phi_\infty^c$ (many strong elements), the rupture is controlled by the strong elements: the final damage is diffuse and the density of broken elements goes to a non-vanishing constant. This third case is very similar to the percolation models of rupture: Roux et al. [49] have indeed shown that percolation is retrieved in the limit of very large disorder. Beyond quasi-static models, the "thermal fuse model" of Sornette and Vanneste [37] was the first one with a realistic dynamical evolution law for the damage field. It was initially formulated in the framework of electric breakdown: when subjected to a given current, all fuses in a network heat up due to a generalized Joule effect (with exponent b); in the presence of heterogeneity in the conductances of the fuses, one of them eventually breaks down first, when its temperature reaches the melting threshold. Its current is then immediately redistributed to the remaining fuses according to Kirchhoff's law. The model was later reformulated by showing that it is exactly equivalent to a (scalar) antiplane mechanical model of rupture with elastic interactions, in which the temperature becomes a local damage variable [50]. This model accounts for space-dependent elastic and rupture properties, has a realistic loading (for instance, a constant stress applied at the beginning of the simulation) and produces growing, interacting microcracks with an organization which is a function of the damage–stress law. This model is thus a statistical generalization, with quenched disorder, of homogenization theories of damage [51, 52]. In a creep experiment (constant applied stress), the total rate of damage in the late stage of evolution, as measured for instance by the elastic energy released per unit time dE/dt, is found on average to increase as a power law similar to expression (2),
$$\frac{dE}{dt} \sim \frac{1}{(t_c - t)^{\alpha}}, \qquad (3)$$
of the time to failure tc − t in the later stage. This behavior reproduces the tertiary creep regime culminating in the global rupture at tc. In this model, rupture is found to occur as the culmination of the progressive nucleation, growth and fusion of microcracks, leading to a fractal network. Interestingly, the critical exponents (such as α > 0) are non-universal and vary as a function of the damage law (exponent b). This model has since been found to describe correctly the experiments on the electric breakdown of insulator–conductor composites [53]. Another application of the thermal fuse model is the damage by electromigration of polycrystalline metal films [54]. See also Ref. [50] for relations with dendrite and front propagation. The concept that rupture in heterogeneous materials is a genuine critical point, in the sense of phase transitions in statistical physics, was first articulated by Anifrani et al. [32], based on experiments on industrial composite structures. In this framework, the power law (3) is interpreted as analogous to a diverging susceptibility in critical phase transitions. It was found that
the critical behavior may correspond to an acceleration of the rate of energy release or to a deceleration, depending on the nature and range of the stress transfer mechanism and on the loading procedure. Symmetry arguments, as well as the concept of a hierarchical cascade of damage events, led in addition to the theoretical suggestion, verified experimentally, that the power-law behavior (3) of the time-to-failure analysis should be corrected for the presence of log-periodic modulations [32]. This "log-periodicity" can be shown to be the signature of a hierarchy of characteristic scales in the rupture process. Such a hierarchy can be generated dynamically by a cascade of sub-harmonic bifurcations [55]. These log-periodic corrections to scaling amount mathematically to taking the critical exponent complex, $\alpha = \alpha' + i\alpha''$, where $i^2 = -1$ [56]. This has led to the development of a powerful predictive scheme ([57] and see below). The critical rupture concept can be seen as a non-trivial generalization of the dimensional analysis based on Buckingham's theorem and of the asymptotic matching method proposed by Bazant [58, 59] to model size effects in complex materials, in the same way that Barenblatt's [60] second-order similitude generalizes the naive first-order similitude (or simple analytical behavior) of standard dimensional analysis, or in the same way that the non-analytical behavior characterizing critical phase transitions generalizes the mean-field behavior of Landau–Ginzburg theory. Acharyya and Chakrabarti [61, 62] have shown how to define a "breakdown susceptibility" during the progressive damage of model systems subjected to local short-duration impulses, and how the breakdown point can then be located in advance by extrapolating this breakdown susceptibility. Numerical simulations of two-dimensional heterogeneous systems of elastic-brittle elements have confirmed that, near the global failure point, the cumulative elastic energy released during the fracturing of heterogeneous solids with long-range elastic interactions follows a power law with log-periodic corrections to the leading term [63]. The presence of log-periodic corrections to scaling in the released elastic energy has also been demonstrated numerically for the thermal fuse model [64], using a novel averaging procedure called "canonical ensemble averaging". This averaging technique accounts for the dependence of the critical rupture time tc on the specific disorder realization of each sample. A recent experimental study of the rupture of fiber-glass composites has also confirmed the critical scenario [65]. A systematic analysis of industrial pressure tanks brought to rupture has likewise confirmed the critical rupture concept and the presence of significant log-periodic structures that are useful for prediction [66]. Through a series of computer and laboratory simulations and table-top experiments, Chakrabarti and Benguigui [67] have presented a useful synthesis of basic modeling principles borrowed from statistical physics, putting in perspective three case studies: electrical failures like fuse and dielectric breakdown, mechanical fractures, and earthquakes. Their work also emphasizes the critical rupture concept [61, 62, 68, 69].
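For concreteness, the generic first-order log-periodic parameterization used in this literature (see, e.g., Refs. [32, 56]) can be written down and fitted as follows; the functional form is the standard one, while every numerical value below is illustrative only:

```python
# First-order log-periodic correction to the power law (3): taking the
# exponent complex, alpha' + i alpha'', and keeping the real part yields a
# modulation that is periodic in log(tc - t).
import numpy as np

def log_periodic_rate(t, A, B, tc, alpha, C, omega, phi):
    """dE/dt = A + B (tc-t)^(-alpha) [1 + C cos(omega log(tc-t) + phi)]."""
    dt = tc - t
    return A + B * dt**(-alpha) * (1.0 + C * np.cos(omega * np.log(dt) + phi))

# Evaluate on a synthetic grid approaching tc = 100 (arbitrary units); in
# applications the seven parameters are fitted to acoustic emission records,
# tc being the predicted failure time.
t = np.linspace(0.0, 99.0, 400)
rate = log_periodic_rate(t, 0.0, 1.0, 100.0, 0.8, 0.3, 6.0, 0.0)
print("rate on the approach to tc:", rate[-3:])
```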
Let us also mention the work of Ramanathan and Fisher [70]: using analytical calculations and numerical simulations, they compared the nature of the onset of single-crack motion in a heterogeneous material when neglecting or taking into account the dynamical wave stress transfer mechanism. In the quasi-static limit with instantaneous stress transfer, the crack front is found to undergo a dynamic critical phenomenon, with a second-order-like transition from a pinned to a moving phase as the applied load is increased through a critical value. Real elastic waves lead to overshoots of the stresses above their eventual static value when one part of the crack front moves forward. Simplified models of these stress overshoots show an apparent jump of the velocity of the crack front directly to a nonzero value. In finite systems, the velocity also shows hysteretic behavior as a function of the loading. These results suggest a first-order-like transition [70].
5. Creep Rupture: Models
Let us come back to the experiments shown in Fig. 1. There are many models, at the interface between standard mechanical approaches and statistical physics, which attempt to capture these observations. Vujosevic and Krajcinovic [71], Turcotte, Newman and Shcherbakov [23], Shcherbakov and Turcotte [24] and Pradhan and Chakrabarti [21, 22] used systems of elements or fibers within a probabilistic framework (corresponding to so-called annealed or thermal disorder), with a hazard rate function controlling the probability of rupture of a given fiber as a function of the stress applied to that fiber. Turcotte, Newman and Shcherbakov [23] obtained a finite-time singularity of the strain rate before failure in fiber bundle models by postulating a power-law dependence of this hazard rate on the applied stress. Shcherbakov and Turcotte [24] studied the same model and recovered a power-law singularity of the strain rate for systems subjected to constant or increasing stresses, with an exponent p′ = 4/3, larger than the experimental values. Using energy conservation and the requirement of non-negative entropy change, Lyakhovsky, Ben-Zion and Agnon [72] derived an evolution equation for the density of microcracks similar to that of Turcotte, Newman and Shcherbakov [23] for a fiber bundle model. Ben-Zion and Lyakhovsky [73] derived analytically the existence of power laws describing the time-dependent increase of the singular strain and the accelerated energy release in the tertiary regime, using the continuum-based damage approach of Lyakhovsky, Ben-Zion and Agnon [72]. Sammis and Sornette [74] give an exhaustive review of the mechanisms giving rise to the power-law tertiary regime, with application to earthquakes. Vujosevic and Krajcinovic [71] also found a power-law acceleration in
two-dimensional simulations of elements and in a mean-field democratic load-sharing model, using a stochastic hazard rate, but they did not obtain Andrade's law in the primary creep regime. Shcherbakov and Turcotte [24] were able to obtain Andrade's law only for a system subjected to a constant applied strain (stable regime); but then there is no global rupture, and they did not obtain the critical power law preceding rupture. Thus, the models described above do not simultaneously reproduce Andrade's law for the primary regime and a power-law singularity before failure. Miguel et al. [15] reproduced Andrade's law with p ≈ 2/3 in a numerical model of interacting dislocations, but their model does not reproduce the tertiary creep regime (no global failure). Several creep models consider the democratic fiber bundle model (DFBM) with thermally activated failures of fibers. Pradhan and Chakrabarti [21, 22] considered the DFBM and added a probability of failure per unit time for each fiber which depends on the amplitude of a thermal noise and on the applied stress. They computed the failure time as a function of the applied stress and noise level, but they did not discuss the temporal evolution of the strain rate. Ciliberto, Guarino and Scorretti [16] and Politi, Ciliberto and Scorretti [20] considered the DFBM with a random fluctuating force added on each fiber to mimic the effect of thermal fluctuations. Ciliberto, Guarino and Scorretti [16] showed that this simple model predicts a characteristic rupture time given by an Arrhenius law with an effective temperature renormalized (amplified) by the quenched disorder in the distribution of rupture thresholds. Saichev and Sornette [25] showed that this model predicts Andrade's law as well as a power-law time-to-failure for the rate of fiber rupture, with p = p′ = 1, with logarithmic corrections (which may give apparent exponents p and p′ smaller than one). A few other models reproduce both a power-law relaxation in the primary creep and a finite-time singularity in the tertiary regime. Main [19] reproduced a power-law relaxation (Andrade's law) followed by a power-law singularity of the strain rate before failure by superposing two processes of subcritical crack growth with different parameters. A first mechanism with negative feedback dominates in the primary creep, and the other mechanism, with positive feedback, gives the power-law singularity close to failure. Lockner [14] gave an empirical expression for the strain rate as a function of the applied stress in rocks, which reproduces, among other properties, Andrade's law with p = 1 in the primary regime and a finite-time singularity leading to rupture. Kun et al. [17] and Hidalgo, Kun and Herrmann [18] studied numerically and analytically a model of visco-elastic fibers, with deterministic dynamics and quenched disorder. They considered different ranges of interaction between fibers (local or democratic load sharing). Kun et al. [17] derived the condition for global failure in the system and the evolution of the failure time as a function of the applied stress in the unstable regime, and analysed the
statistics of inter-event times in numerical simulations of the model. Hidalgo, Kun and Herrmann [18] derived analytically the expression for the strain rate as a function of time. This model reproduces a power-law singularity of the strain rate before failure, with p′ = 1/2 in the case of a uniform distribution of strengths, but is not able to explain Andrade's law for the primary creep. It gives a power-law decay of the strain rate in the primary creep regime only if the stress is at the critical point, and then with an exponent p = 1/2, smaller than the experimental values. Nechad et al. [26] developed a variant of this model in which a composite system is viewed as made of a large set of representative elements (RE), each representative element comprising many fibers with their interstitial matrix. Each RE is endowed with a visco-elasto-plastic rheology, with parameters which may differ from one element to another. The parameters characterizing each RE are frozen and do not evolve with time (so-called quenched disorder). Specifically, each RE is modeled as an Eyring dashpot in parallel with a linear spring. The Eyring rheology is standard for fiber composites [12]. At the microscopic level, it amounts to adapting to the matrix rheology the theory of reaction rates describing processes activated by the crossing of potential barriers. With these sole ingredients, the model recovers the three regimes – primary, secondary and tertiary – with exponents p = 1 (defined in expression (1)) and p′ = 1 (defined in expression (2)). These solutions for the primary and tertiary regimes are basically of the same form, with p = p′ = 1, as those of the Langevin-type model solved by Saichev and Sornette [25]; this may not be surprising, since the Eyring rheology describes, at the microscopic level, processes activated by the crossing of potential barriers, which are explicitly accounted for in the thermal-fluctuation-force model [25]. The key ingredients leading to these results are the broad (power-law) distribution of rupture thresholds and the nonlinear Eyring rheology in a Kelvin element. Nechad et al.'s [26] model is a macroscopic, deterministic, effective description of the experiments. In contrast, the modeling strategy of Ciliberto, Guarino and Scorretti [16], of Politi, Ciliberto and Scorretti [20] and of Saichev and Sornette [25] emphasizes the interplay between microscopic thermal fluctuations and frozen heterogeneity. Qualitatively, Nechad et al.'s [26] model is similar to a deterministic macroscopic Fokker–Planck description, while the thermal models of Ciliberto, Guarino and Scorretti [16], of Politi, Ciliberto and Scorretti [20] and of Saichev and Sornette [25] are reminiscent of stochastic Langevin models. It is well known in statistical physics that Fokker–Planck equations and Langevin equations are exactly equivalent for systems at equilibrium and just constitute two different descriptions of the same processes; their correspondence is associated with the general fluctuation–dissipation theorem. Similarly, the fact that both the Andrade relaxation law in the primary creep regime and the time-to-failure power-law singularity in the tertiary regime are captured by Nechad et al.'s [26] model and by the thermal model solved in [25] suggests a deep connection between these two levels of description for creep and damage processes.
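The flavor of the thermally activated fiber-bundle calculations discussed above can be conveyed by a minimal simulation. The Arrhenius-like failure rate below is a schematic stand-in, loosely patterned on the noise models of Refs. [16, 20–22], and every number is arbitrary:

```python
# Minimal democratic fiber bundle with thermally activated failures: fiber i
# carries the equal-share stress sigma = F / (number alive) and fails at the
# schematic rate exp(-(x_i - sigma)/theta), with quenched threshold x_i;
# when sigma reaches a surviving threshold, a mechanical cascade runs.
import numpy as np

rng = np.random.default_rng(1)
N, F, theta = 2000, 600.0, 0.05          # fibers, total load, noise level
x = rng.uniform(0.5, 1.5, size=N)        # quenched rupture thresholds
alive = np.ones(N, dtype=bool)
t = 0.0
while alive.any():
    sigma = F / alive.sum()                      # democratic load sharing
    if (x[alive] <= sigma).any():                # zero-temperature cascade
        alive &= x > sigma
        continue
    rates = np.exp(-(x[alive] - sigma) / theta)  # thermal activation
    total = rates.sum()
    t += rng.exponential(1.0 / total)            # Gillespie-style waiting time
    idx = np.flatnonzero(alive)[rng.choice(rates.size, p=rates / total)]
    alive[idx] = False                           # one thermally induced break
print("global rupture at t ~", t)
```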
6. Toward Rupture Prediction
There is a huge variability of the failure time from one sample to another, for the same applied stress, as shown in Fig. 2. This implies that one cannot predict the time to failure of a sample using an empirical relation between the applied stress and the time of failure. There is, however, another approach suggested by Fig. 2, as proposed by Nechad et al. [26]. The figure shows the correlation between the transition time tm (the time of the minimum of the strain rate) and the rupture time tc ≈ tend: tm is about 2/3 of the rupture time tc. This suggests a way to predict the failure time from the observation of the strain rate during the primary and secondary creep regimes, before the acceleration of the damage during the tertiary creep regime leading to the rupture of the sample. As soon as a clear minimum is observed, the value of tm can be measured and that of tc deduced from the relationship shown in Fig. 2. However, there are some cases where the minimum is not well defined, in which the first (smoothed) minimum is followed by a second, similar one.
[Figure 2: log–log plot of the rupture time tc (s) versus the transition time tm (s) for the [±62], [90/35] and SMC samples, with the linear fit tc = 1.58 tm + 16.]
Figure 2. Relation between the time tm of the minimum of the strain rate and the rupture time tc, for all samples investigated in [26].
In such cases, the application of the relationship shown in Fig. 2 would lead to a pessimistic prediction for the lifetime of the composite. The observation that the failure time is correlated with the p-value and with the duration of the primary creep suggests either that a single mechanism is responsible both for the decrease of the strain rate during primary creep and for the acceleration of the damage during the tertiary creep, or that, if the mechanisms are different, the damage occurring in the primary regime nevertheless impacts its subsequent evolution in the secondary and tertiary regimes, and therefore tc. In contrast, using a fit of the acoustic emission activity by a power law to estimate tc according to formula (3) works only in the tertiary regime, and thus does not exploit the information contained in the deformation and in the acoustic emissions of the primary and secondary regimes, which cover 2/3 to 3/4 of the whole history. In practice, one needs at least one order of magnitude in the time tc − t to estimate tc and p′ accurately, which means that, if the power-law acceleration regime starts immediately when the stress is applied (no primary creep), one cannot predict the rupture time using a fit of the damage rate by Eq. (3) before 90% of the failure time. If, as observed in the experiments of Nechad et al. [26], the tertiary creep regime starts only at about 63% of tc, then one cannot predict the rupture time using a fit of the damage rate before 96% of the failure time. This limitation was the motivation for the development of formulas that interpolate between the primary and tertiary regimes, beyond the pure power law (3), using log-periodic corrections to scaling [32, 66, 75–78]. In particular, Anifrani et al. [32] have introduced a method based on log-periodic corrections to the critical power law, which has been used extensively by the European aerospace company Aérospatiale (now EADS) on pressure tanks made of kevlar-matrix and carbon-matrix composites embarked on the European Ariane 4 and 5 rockets. In a nutshell, the method consists, in this application, in recording acoustic emissions under a constant stress rate; the acoustic emission energy as a function of stress is then fitted by the above log-periodic critical theory. One of the parameters is the time of failure, and the fit thus provides a "prediction" when the sample is not brought to failure in the first test [77]. The results indicate that a precision of a few percent in the determination of the stress at rupture is typically obtained using acoustic emission recorded up to about 20% below the stress at rupture. This has warranted the selection of this non-destructive evaluation technique as the routine qualifying procedure in the industrial fabrication process. This methodology and these experimental results have been guided by the theoretical research over the years using the critical rupture concept discussed above. In particular, there is now a better understanding of the conditions, the mathematical properties and the physical mechanisms underlying log-periodic structures [55, 56, 79–81]. Another noteworthy approach, already mentioned above, for the prediction of rupture, which is inspired by statistical physics,
is the "breakdown susceptibility" introduced by Acharyya and Chakrabarti [61, 62]. It requires monitoring the response of the system when subjected to local, short-duration impulses whose nature depends upon the problem (stress, strain, temperature, electromagnetic, etc.). In summary, starting with the initial flurry of interest from the statistical physics community in problems of material rupture, a new awareness of the many-body nature of the rupture problem has blossomed. There is now a growing understanding in both communities of the need for an interdisciplinary approach, improving on the reductionist approaches of both fields, in order to tackle at the same time the difficult modeling of the specific properties of the microscopic structures and of their interactions leading to collective effects. Independently of the types of materials for given applications, this approach will be crucial in making progress on the optimization of the lifetime of materials ("durability") and on the determination of the remaining lifetime of materials in use ("remaining potential").
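As a concrete illustration of the forecasting recipe built on Fig. 2: once a clear strain-rate minimum at tm is detected, the empirical regression of Fig. 2 converts it into an estimate of tc. In the sketch below, the smoothing window and the detection rule are our own choices; only the regression tc ≈ 1.58 tm + 16 comes from the figure.

```python
# Sketch of the failure-time forecast based on Fig. 2: locate the minimum tm
# of the smoothed strain rate, then apply the empirical regression
# tc = 1.58 tm + 16 (in seconds).
import numpy as np

def predict_tc(t, strain_rate, window=11):
    """Return (tm, tc_estimate), or None while no clear minimum is visible."""
    kernel = np.ones(window) / window
    smooth = np.convolve(strain_rate, kernel, mode="valid")
    t_s = t[window // 2 : window // 2 + smooth.size]
    i_min = int(np.argmin(smooth))
    if i_min >= smooth.size - 1:          # rate still decreasing: wait longer
        return None
    tm = t_s[i_min]
    return tm, 1.58 * tm + 16.0

# usage on a synthetic primary + tertiary creep record:
t = np.linspace(10.0, 9.0e3, 500)
rate = t**-1.0 + (1.0e4 - t)**-1.0
print(predict_tc(t, rate))
```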
References [1] T. Reichhardt, “Rocket failure leads to grounding of small US satellites,” Nature (London), 384, 99–99, 1996. [2] H. Liebowitz (ed.), Fracture, New York, Academic Press, vols. I–VII, 1984. [3] J. Fineberg and M. Marder, “Instability in dynamic fracture,” Phys. Rep., 313, 2–108, 1999. [4] E. Bouchaud, “The morphology of fracture surfaces: a tool for understanding crack propagation in complex materials,” Surf. Rev. Lett., 10, 797–814, 2003. [5] J.J. Gilman, “Mechanochemistry,” Science, 274, 65–65, 1996. [6] A.R.C. Westwood, J.S. Ahearn, and J.J. Mills, “Developments in the theory and application of chemomechanical effects,” Colloid Surfaces, 2, 1, 1981. [7] National Research Council, Aging of U.S. Air Force Aircraft, Final Report from the Committee on Aging of U.S. Air Force Aircraft, National Materials Advisory Board Commission on Engineering and Technical Systems, Publication NMAB-4882, National Academy Press, Washington, D.C., 1997. [8] R. El Guerjouma, J.C. Baboux, D. Ducret, N. Godin, P. Guy, S. Huguet, Y. Jayet, and T. Monnier, “Non-destructive evaluation of damage and failure of fiber reinforced polymer composites using ultrasonic waves and acoustic emission,” Adv. Engrg. Mater., 8, 601–608, 2001. [9] W. Nelson, “Accelerated testing: statistical models, test plans and data analyses,” John Wiley & Sons, Inc., New York, 1990. [10] F. Omori, “On the aftershocks of earthquakes,” J. Coll. Sci. Imp. Univ. Tokyo, 7, 111, 1894. [11] A. Agbossou, I. Cohen, and D. Muller, “Effects of interphase and impact strain rates on tensile off-axis behaviour of unidirectional glass fibre composite: experimental results,” Engrg. Fract. Mech., 52 (5), 923–935, 1995. [12] J.Y. Liu, R.J. Ross, “Energy criterion for fatigue strength of wood structural members,” J. Engrg. Mater. Technol., 118(3), 375–378, 1996.
[13] A. Guarino, S. Ciliberto, A. Garcimartin, M. Zei, and R. Scorretti, “Failure time and critical behaviour of fracture precursors in heterogeneous materials,” Eur. Phys. J. B, 26(2), 141–151, 2002.
[14] D.A. Lockner, “A generalized law for brittle deformation of Westerly granite,” J. Geophys. Res., 103(B3), 5107–5123, 1998.
[15] M.C. Miguel, A. Vespignani, M. Zaiser, and S. Zapperi, “Dislocation jamming and Andrade creep,” Phys. Rev. Lett., 89(16), 165501, 2002.
[16] S. Ciliberto, A. Guarino, and R. Scorretti, “The effect of disorder on the fracture nucleation process,” Physica D, 158, 83–104, 2001.
[17] F. Kun, Y. Moreno, R.C. Hidalgo, and H.J. Herrmann, “Creep rupture has two universality classes,” Europhys. Lett., 63(3), 347–353, 2003.
[18] R.C. Hidalgo, F. Kun, and H.J. Herrmann, “Creep rupture of viscoelastic fiber bundles,” Phys. Rev. E, 65(3), 032502/1-4, 2002.
[19] I.G. Main, “A damage mechanics model for power-law creep and earthquake aftershock and foreshock sequences,” Geophys. J. Int., 142(1), 151–161, 2000.
[20] A. Politi, S. Ciliberto, and R. Scorretti, “Failure time in the fiber-bundle model with thermal noise and disorder,” Phys. Rev. E, 66(2), 026107/1-6, 2002.
[21] S. Pradhan and B.K. Chakrabarti, “Failure due to fatigue in fiber bundles and solids,” Phys. Rev. E, 67, 046124, 2003a.
[22] S. Pradhan and B.K. Chakrabarti, “Failure properties of fiber bundle models,” Int. J. Mod. Phys. B, 17(29), 5565–5581, 2003b.
[23] D.L. Turcotte, W.I. Newman, and R. Shcherbakov, “Micro and macroscopic models of rock fracture,” Geophys. J. Int., 152(3), 718–728, 2003.
[24] R. Shcherbakov and D.L. Turcotte, “Damage and self-similarity in fracture,” Theoretical Appl. Fract. Mech., 39(3), 245–258, 2003.
[25] A. Saichev and D. Sornette, “Andrade, Omori and time-to-failure laws from thermal noise in material rupture,” Phys. Rev. E, 71(1), 2005 (preprint http://arXiv.org/abs/cond-mat/0311493).
[26] H. Nechad, A. Helmstetter, R. El Guerjouma, and D. Sornette, “Andrade and critical time-to-failure laws in fibre-matrix composites: experiments and model,” J. Mech. Phys. Solids, in press, http://arXiv.org/abs/cond-mat/0404035, 2005.
[27] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, 1947.
[28] K. Mogi, “Some features of recent seismic activity in and near Japan: activity before and after great earthquakes,” Bull. Eq. Res. Inst. Tokyo Univ., 47, 395–417, 1969.
[29] K. Mogi, “Earthquake prediction research in Japan,” J. Phys. Earth, 43, 533–561, 1995.
[30] J.V. Andersen, D. Sornette, and K.-T. Leung, “Tri-critical behavior in rupture induced by disorder,” Phys. Rev. Lett., 78, 2140–2143, 1997.
[31] A. Aharony, “Tricritical phenomena,” Lect. Notes Phys., 186, 209, 1983.
[32] J.-C. Anifrani, C. Le Floc’h, D. Sornette, and B. Souillard, “Universal log-periodic correction to renormalization group scaling for rupture stress prediction from acoustic emissions,” J. Phys. I France, 5, 631–638, 1995.
[33] W.A. Curtin, “Exact theory of fibre fragmentation in a single-filament composite,” J. Mater. Sci., 26, 5239–5253, 1991.
[34] W.A. Curtin, “Size scaling of strength in heterogeneous materials,” Phys. Rev. Lett., 80, 1445–1448, 1998.
[35] M. Ibnabdeljalil and W.A. Curtin, “Strength and reliability of fiber-reinforced composites: localized load-sharing and associated size effects,” Int. J. Sol. Struct., 34, 2649–2668, 1997.
[36] D. Sornette and J.V. Andersen, “Scaling with respect to disorder in time-to-failure,” Eur. Phys. J. B, 1, 353–357, 1998.
[37] D. Sornette and C. Vanneste, “Dynamics and memory effects in rupture of thermal fuse networks,” Phys. Rev. Lett., 68, 612–615, 1992.
[38] X.-L. Lei, K. Kusunose, M.V.M.S. Rao, O. Nishizawa, and T. Satoh, “Quasi-static fault growth and cracking in homogeneous brittle rock under triaxial compression using acoustic emission monitoring,” J. Geophys. Res., 105, 6127–6139, 1999.
[39] X.-L. Lei, K. Kusunose, O. Nishizawa, A. Cho, and T. Satoh, “On the spatiotemporal distribution of acoustic emissions in two granitic rocks under triaxial compression: the role of pre-existing cracks,” Geophys. Res. Lett., 27, 1997–2000, 2000.
[40] C.A. Tang, H. Liu, P.K.K. Lee, Y. Tsui, and L.G. Tham, “Numerical studies of the influence of microstructure on rock failure in uniaxial compression – Part I: effect of heterogeneity,” Int. J. Rock Mech. Mining Sci., 37, 555–569, 2000a.
[41] C.A. Tang, H. Liu, P.K.K. Lee, Y. Tsui, and L.G. Tham, “Numerical studies of the influence of microstructure on rock failure in uniaxial compression – Part II: constraint, slenderness and size effect,” Int. J. Rock Mech. Mining Sci., 37, 571–583, 2000b.
[42] D. Stauffer and A. Aharony, Percolation Theory, Taylor and Francis, London, 1992.
[43] L. de Arcangelis, S. Redner, and H.J. Herrmann, “A random fuse model for breaking processes,” J. Physique Lett., 46, L585–590, 1985.
[44] P.M. Duxbury, P.D. Beale, and P.L. Leath, “Size effects of electrical breakdown in quenched random media,” Phys. Rev. Lett., 57, 1052–1055, 1986.
[45] A. Gilabert, C. Vanneste, D. Sornette, and E. Guyon, “The random fuse network as a model of rupture in a disordered medium,” J. Phys. France, 48, 763–770, 1987.
[46] H.J. Herrmann and S. Roux (eds.), Statistical Models for the Fracture of Disordered Media, Elsevier, Amsterdam, 1990.
[47] P. Meakin, “Models for material failure and deformation,” Science, 252(5003), 226–234, 1991.
[48] A. Hansen, E. Hinrichsen, and S. Roux, “Scale-invariant disorder in fracture and related breakdown phenomena,” Phys. Rev. B, 43, 665–678, 1991.
[49] Y. Bréchet, T. Magnin, and D. Sornette, “The Coffin–Manson law as a consequence of the statistical nature of the LCF surface damage,” Acta Metall., 40, 2281–2287, 1992.
[50] M.S. Bharathi and G. Ananthakrishna, “Chaotic and power law states in the Portevin–Le Chatelier effect,” Europhys. Lett., 60, 234–240, 2002; Correction, ibid., 61, 430, 2003.
[51] J.L. Chaboche, “A continuum damage theory with anisotropic and unilateral damage,” Rech. Aerospatiale, 2, 139, 1995.
[52] J.F. Maire and J.L. Chaboche, “A new formulation of continuum damage mechanics (CDM) for composite materials,” Aerospace Sci. Technol., 1, 247–257, 1997.
[53] L. Lamaignère, F. Carmona, and D. Sornette, “Experimental realization of critical thermal fuse rupture,” Phys. Rev. Lett., 77, 2738–2741, 1996.
[54] R.M. Bradley and K. Wu, “Dynamic fuse model for electromigration failure of polycrystalline metal films,” Phys. Rev. E, 50, R631–R634, 1994.
[55] Y. Huang, G. Ouillon, H. Saleur, and D. Sornette, “Spontaneous generation of discrete scale invariance in growth models,” Phys. Rev. E, 55, 6433–6447, 1997.
[56] D. Sornette, “Discrete scale invariance and complex dimensions,” Phys. Rep., 297, 239–270, 1998.
[57] C. Le Floc’h and D. Sornette, “Predictive acoustic emission: application on helium high pressure tanks,” Prédiction des évènements catastrophiques: une nouvelle approche pour le contrôle de santé structurale, Instrumentation Mesure Métrologie, Hermes Science, RS Series, I2M, vol. 3(1–2), 89–97 (in French), 2003.
[58] Z.P. Bazant, “Scaling of quasibrittle fracture: asymptotic analysis,” Int. J. Fract., 83, 19–40, 1997a.
[59] Z.P. Bazant, “Scaling of quasibrittle fracture: hypotheses of invasive and lacunar fractality, their critique and Weibull connection,” Int. J. Fract., 83, 41–65, 1997b.
[60] G.I. Barenblatt, Dimensional Analysis, Gordon and Breach, New York, 1987.
[61] M. Acharyya and B.K. Chakrabarti, “Response of random dielectric composites and earthquake models to pulses – prediction possibilities,” Physica A, 224, 254–266, 1996a.
[62] M. Acharyya and B.K. Chakrabarti, “Growth of breakdown susceptibility in random composites and the stick-slip model of earthquakes – prediction of dielectric breakdown and other catastrophes,” Phys. Rev. A, 53, 140–147; Correction, Phys. Rev. A, 54, 2174–2175, 1996b.
[63] M. Sahimi and S. Arbabi, “Scaling laws for fracture of heterogeneous materials and rock,” Phys. Rev. Lett., 77, 3689–3692, 1996.
[64] A. Johansen and D. Sornette, “Evidence of discrete scale invariance by canonical averaging,” Int. J. Mod. Phys. C, 9, 433–447, 1998.
[65] A. Garcimartin, A. Guarino, L. Bellon, and S. Ciliberto, “Statistical properties of fracture precursors,” Phys. Rev. Lett., 79, 3202–3205, 1997.
[66] A. Johansen and D. Sornette, “Critical ruptures,” Eur. Phys. J. B, 18, 163–181, 2000.
[67] B.K. Chakrabarti and L.G. Benguigui, Statistical Physics of Fracture and Breakdown in Disordered Systems, Clarendon Press, Oxford, 1997.
[68] R. Banerjee and B.K. Chakrabarti, “Critical fatigue behaviour in brittle glasses,” Bull. Mater. Sci., 24(2), 161–164, 2001.
[69] S. Pradhan and B.K. Chakrabarti, “Precursors of catastrophe in the Bak–Tang–Wiesenfeld, Manna, and random-fiber-bundle models of failure,” Phys. Rev. E, 65, 016113, 2002.
[70] S. Ramanathan and D.S. Fisher, “Onset of propagation of planar cracks in heterogeneous media,” Phys. Rev. B, 58, 6026–6046, 1998.
[71] M. Vujosevic and D. Krajcinovic, “Creep rupture of polymers – a statistical model,” Int. J. Solids Struct., 34(9), 1105–1122, 1997.
[72] V. Lyakhovsky, Y. Ben-Zion, and A. Agnon, “Distributed damage, faulting and friction,” J. Geophys. Res. (Solid Earth), 102(B12), 27635–27649, 1997.
[73] Y. Ben-Zion and V. Lyakhovsky, “Accelerated seismic release and related aspects of seismicity patterns on earthquake faults,” Pure Appl. Geophys., 159(10), 2385–2412, 2002.
[74] S.G. Sammis and D. Sornette, “Positive feedback, memory and the predictability of earthquakes,” Proc. Nat. Acad. Sci. USA, 99(Supp. 1), 2501–2508, 2002.
[75] S. Gluzman, J.V. Andersen, and D. Sornette, “Functional renormalization prediction of rupture,” Comput. Seismology, 32, 122–137, 2001.
[76] A. Moura and V.I. Yukalov, “Self-similar extrapolation for the law of acoustic emission before failure of heterogeneous materials,” Int. J. Fract., 118(3), 63–68, 2002.
[77] J. Gauthier, C. Le Floc’h, and D. Sornette, “Predictability of catastrophic events: a new approach for structural health monitoring – predictive acoustic emission application on helium high pressure tanks,” In: D. Balageas (ed.), Proceedings of the First European Workshop on Structural Health Monitoring, ONERA, pp. 926–930, http://arXiv.org/abs/cond-mat/0210418, 2002.
[78] V.I. Yukalov, A. Moura, and H. Nechad, “Self-similar law of energy release before materials fracture,” J. Mech. Phys. Solids, 52, 453–465, 2004.
[79] D. Sornette, “Predictability of catastrophic events: material rupture, earthquakes, turbulence, financial crashes and human birth,” Proc. Natl. Acad. Sci. USA, 99(Supp. 1), 2522–2529, 2002.
[80] K. Ide and D. Sornette, “Oscillatory finite-time singularities in finance, population and rupture,” Physica A, 307(1–2), 63–106, 2002.
[81] W.-X. Zhou and D. Sornette, “Generalized q-analysis of log-periodicity: applications to critical ruptures,” Phys. Rev. E, 66, 046111, 2002.
[82] D. Sornette and C. Vanneste, “Dendrites and fronts in a model of dynamical rupture with damage,” Phys. Rev. E, 50, 4327–4345, 1994.
[83] S. Roux, A. Hansen, H. Herrmann, and E. Guyon, “Rupture of heterogeneous media in the limit of infinite disorder,” J. Stat. Phys., 52, 237–244, 1988.
4.5 THEORY OF RANDOM HETEROGENEOUS MATERIALS
S. Torquato
Department of Chemistry, PRISM, and Program in Applied & Computational Mathematics, Princeton University, Princeton, NJ 08544, USA
1. Introduction
The theoretical prediction of the transport, electromagnetic, and mechanical properties of heterogeneous materials has a long and venerable history, attracting the attention of some of the luminaries of science, including Maxwell [1], Rayleigh [2], and Einstein [3]. Since the early work on the physical properties of heterogeneous materials, there has been an explosion in the literature on this subject [4–9] because of the rich and challenging fundamental problems it offers and its manifest technological importance. A heterogeneous material is composed of domains of different materials (phases), such as a composite, or of the same material in different states, such as a polycrystal [8]. It is assumed that the “microscopic” length scale is much larger than the molecular dimensions but much smaller than the characteristic length of the macroscopic sample. In such circumstances, the heterogeneous material can be viewed as a continuum on the microscopic scale, and macroscopic or effective properties can be ascribed to it (see Fig. 1). Heterogeneous materials abound in synthetic products and nature. Synthetic examples include aligned and chopped fiber composites, particulate composites, powders, interpenetrating multiphase composites, cellular solids, colloids, gels, foams, phase-separated metallic alloys, microemulsions, block copolymers, and fluidized beds. Some examples of natural heterogeneous materials are granular media, soils, polycrystals, sandstone, wood, bone, lungs, blood, animal and plant tissue, cell aggregates, and tumors. The physical phenomena of interest occur on “microscopic” length scales that span from tens of nanometers in the case of gels to meters in the case of geological media. Structure on this “microscopic” scale is generically referred to as microstructure.
Figure 1. Left panel: A schematic of a random two-phase material shown as white and gray regions with general phase properties K1 and K2 and phase volume fractions φ1 and φ2. Here L and ℓ represent the macroscopic and microscopic length scales, respectively. Right panel: When L is much bigger than ℓ, the heterogeneous material can be treated as a homogeneous material with effective property Ke.
Figure 2. Examples of random heterogeneous materials [8]. Left panel: A colloidal system of hard spheres of two different sizes. Right panel: A Fontainebleau sandstone.
In many instances, the microstructures can be characterized only statistically, and therefore such materials are referred to as random heterogeneous materials. There is a vast family of random microstructures that are possible, ranging from dispersions with varying degrees of clustering to complex interpenetrating connected multiphase media, including porous media. Figure 2 shows examples of synthetic and natural random heterogeneous materials. The first example shows a scanning electron micrograph of a colloidal system of hard spheres of two different sizes, primarily composed of boron carbide (black
regions) and aluminum (white regions). The second example shows a planar section through a Fontainebleau sandstone obtained via X-ray microtomography. This imaging technique enables one to obtain full three-dimensional renderings of the microstructure, revealing that the void or pore phase (white region) is actually connected across the sample. Four different classes of problems are summarized in Table 1, and we will focus on the following four steady-state (time-independent) effective properties associated with these classes:
1. Effective conductivity tensor, σe
2. Effective stiffness (elastic) tensor, Ce
3. Mean survival time, τ
4. Fluid permeability tensor, k
Knowledge of these effective properties is required for a host of applications in engineering, physics, geology, materials science, and biology [8]. Depending on the physical context, each phase can be either solid, fluid, or void. The quantity σe represents either the electrical or thermal conductivity tensor, which are mathematically equivalent properties. It is the proportionality constant between the average of the local electric current (heat flux) and the average of the local electric field (temperature gradient) in the composite. This averaged relation is Ohm’s law or Fourier’s law (for the composite) in the electrical or thermal problems, respectively. For reasons of mathematical analogy, the determination of the effective conductivity translates immediately into
Table 1. The four different classes of steady-state effective media problems considered here. ⟨F⟩ = Ke · ⟨G⟩, where Ke is the general effective property, ⟨G⟩ is the average (or applied) generalized gradient or intensity field, and ⟨F⟩ is the average generalized flux field. Class A and B problems share many common features and hence may be attacked using similar techniques. Class C and D problems are similarly related to one another [8]

Class  General effective property (Ke)   Average (or applied) generalized intensity (G)   Average generalized flux (F)
A      Thermal conductivity              Temperature gradient                             Heat flux
       Electrical conductivity           Electric field                                   Electric current
       Dielectric constant               Electric field                                   Electric displacement
       Magnetic permeability             Magnetic field                                   Magnetic induction
       Diffusion coefficient             Concentration gradient                           Mass flux
B      Elastic moduli                    Strain field                                     Stress field
       Viscosity                         Strain rate field                                Stress field
C      Survival time                     Species production rate                          Concentration field
       NMR survival time                 NMR production rate                              Magnetization density
D      Fluid permeability                Applied pressure gradient                        Velocity field
       Sedimentation rate                Force                                            Mobility
equivalent results for the effective dielectric constant, magnetic permeability, or diffusion coefficient. Therefore, we refer to all of these problems as class A problems, as described in Table 1. The effective stiffness (elastic) tensor Ce is one of the most basic mechanical properties of a heterogeneous material. The quantity Ce is the proportionality constant between the average stress and average strain. This relation is the averaged Hooke’s law for the composite. Considerable attention has been devoted to instances in which the heterogeneous medium consists of a pore region in which diffusion (and bulk reaction) occurs and a “trap” region whose interface can absorb the diffusing species via a surface reaction. A key parameter in such processes is the mean survival time τ, which gives the average lifetime of the diffusing species before it gets trapped. Often it is useful to introduce its inverse, called the trapping constant γ ∝ τ⁻¹, which is proportional to the trapping rate. A key macroscopic property for describing slow viscous flow through porous media is the fluid permeability tensor k. The quantity k is the proportionality constant between the average fluid velocity and the applied pressure gradient in the porous medium. This relation is Darcy’s law for the porous medium. Given the phase properties K1, K2, . . . , KM and phase volume fractions φ1, φ2, . . . , φM of a heterogeneous material with M phases, how are its effective properties mathematically defined? It will be shown below that the effective properties of the heterogeneous material are determined by averages of local fields derived from the appropriate governing continuum-field theories (partial differential equations) for the problem of concern. Specifically, any of the aforementioned effective properties, which we denote generally by Ke, is defined by a linear relationship between an average of a generalized local flux F and an average of a generalized local (or applied) intensity G, i.e.,

⟨F⟩ = Ke · ⟨G⟩.   (1)
For the conduction, elasticity, trapping, and flow problems, the average generalized flux F represents the average local electric current (heat flux), stress, concentration, and velocity fields, respectively, and the average generalized intensity G represents the average local electric field (or temperature gradient), strain, production rate, and applied pressure gradient, respectively. Table 1 summarizes the average local (or applied) field quantities that determine the steady-state effective properties for all four problem classes. The similarities and differences between these classes are described fully by Torquato [8]. The effective properties of a heterogeneous material depend on the phase properties and microstructural information, including the phase volume fractions, which represent the simplest level of information. It is important to emphasize that the effective properties are generally not simple relations
(mixture rules) involving the phase volume fractions. This suggests that the complex interactions between the phases result in a dependence of the effective properties on nontrivial details of the microstructure. To illustrate this point, we consider a 50–50 two-phase system shown in the left panel of Fig. 3. It consists of a disconnected inclusion phase and a connected matrix phase. Let the gray “phase” be highly conducting (or stiff) compared to the white “phase”. The right panel shows a composite with exactly the same microstructure but with the phases interchanged. Which of the two composites has the higher effective conductivity (or stiffness)? Clearly, the one depicted in the right panel has the higher effective property, since the connected phase there is the more conducting (or stiffer) phase. Thus, even though both composites have the same volume fraction, their effective properties will be dramatically different, implying that the effective properties depend on microstructural information beyond that contained in the volume fractions. To summarize, for a random heterogeneous material consisting of M phases, the general effective property Ke is the following function:

Ke = f (K1, K2, . . . , KM; φ1, φ2, . . . , φM; Ω),   (2)

where Ω indicates functionals of higher-order microstructural information. The mathematical form that this microstructural information takes is statistical correlation functions. A central aim of the theory of random heterogeneous materials is the development of methods to estimate the functional in (2) and thereby identify the relevant statistical correlation functions.
Figure 3. Left panel: 50–50 mixture consisting of a disconnected inclusion phase and a connected matrix phase. The gray phase is highly conducting (or stiff) relative to the white phase. Right panel: The same microstructure except the phases are interchanged [8].
2. Microstructural Correlation Functions
The diverse effective properties that we are concerned with here rigorously lead to a wide variety of microstructural descriptors, generically referred to as microstructural correlation functions, which are defined below. The reader is referred to the book by Torquato [8] for a detailed discussion of these microstructural correlation functions. We will assume that the microstructures are static or can be approximated as static, and therefore any realization ω of the random material will be taken to be independent of time. In particular, we will focus on two-phase heterogeneous materials. Each realization ω of the two-phase random medium comes from some probability space and occupies some subset V of d-dimensional Euclidean space, i.e., V ⊂ ℝ^d. The region of space V, of volume V, is partitioned into two disjoint random sets or phases: phase 1, a region V1(ω) of volume fraction φ1, and phase 2, a region V2(ω) of volume fraction φ2. Let ∂V(ω) denote the surface or interface between V1(ω) and V2(ω). For a given realization ω, the indicator function I(i)(x; ω) for phase i and x ∈ V is a random variable defined by

I(i)(x; ω) = 1 if x ∈ Vi(ω), and 0 otherwise,   (3)
for i = 1, 2. The indicator function M(x; ω) for the interface is defined as

M(x; ω) = |∇I(1)(x; ω)| = |∇I(2)(x; ω)|,   (4)
and therefore is a generalized function that is nonzero when x is on the interface. Depending on the physical context, phase i can be a solid, fluid, or void characterized by some general tensor property. Henceforth, we will drop ω from the notation.
2.1. n-Point Probability Functions
The so-called n-point probability function for phase i, Sn(i), is the expectation of the product I(i)(x1)I(i)(x2) · · · I(i)(xn), i.e.,

Sn(i)(x1, x2, . . . , xn) ≡ ⟨I(i)(x1)I(i)(x2) · · · I(i)(xn)⟩.   (5)
This quantity can be interpreted as the probability that n points at positions x1, x2, . . . , xn are found in phase i. For statistically homogeneous media, the Sn(i) are translationally invariant and therefore depend only on the relative positions of the n points. In particular, S1(i)(x1) is just the constant volume fraction φi of phase i. If the random medium is also statistically isotropic, the Sn(i) depend only on the distances between the n points. Henceforth, we will define Sn ≡ Sn(i). The two-point or autocorrelation function S2(r) ≡ S2(1)(r) for statistically homogeneous media can be obtained by randomly tossing line segments of length r ≡ |r| with a specified orientation and counting the fraction of times the end points fall in phase 1 (see Fig. 4). For an isotropic porous solid, this two-point function can also be obtained experimentally via scattering of radiation [8].

Figure 4. A schematic depicting events that contribute to lower-order functions for random media of arbitrary microstructure [8]. Shown is the two-point probability function S2 ≡ S2(1) for phase 1 (white region), the surface–void and surface–surface functions Fsv and Fss, the lineal-path function L ≡ L(1), and the pore-size density function P.
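The line-segment tossing procedure just described is straightforward to implement for a digitized sample. The following sketch is illustrative only: the binary array img and the sampling parameters are hypothetical, and periodic wrapping is assumed for simplicity.

```python
import numpy as np

# Monte Carlo estimate of the two-point probability function S2(r) for
# phase 1 of a digitized two-phase medium: toss random line segments of
# length r and count how often both end points land in phase 1.
rng = np.random.default_rng(1)
img = rng.random((512, 512)) < 0.5          # placeholder microstructure (1 = phase 1)

def S2(img, r, trials=200_000, rng=rng):
    n, m = img.shape
    theta = rng.uniform(0.0, 2.0 * np.pi, trials)    # random orientations
    x0 = rng.uniform(0, n, trials)
    y0 = rng.uniform(0, m, trials)
    x1 = x0 + r * np.cos(theta)
    y1 = y0 + r * np.sin(theta)
    i0, j0 = x0.astype(int) % n, y0.astype(int) % m  # periodic wrap
    i1, j1 = x1.astype(int) % n, y1.astype(int) % m
    return np.mean(img[i0, j0] & img[i1, j1])

print(S2(img, 0.0), S2(img, 5.0))    # S2(0) should reproduce phi_1
```

Note that S2(0) recovers the volume fraction φ1, consistent with the general property S1(i) = φi noted above.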
2.2. Surface Correlation Functions
Surface correlation functions contain information about the random interface ∂V and are of basic importance in the trapping and flow problems. In this context, we will let phase 1 denote the fluid or “void” phase, and phase 2 the “solid” phase. The simplest surface correlation function is the specific surface s(x) (interface area per unit volume) at point x, which is a one-point correlation function for statistically inhomogeneous media, i.e.,

s(x) = ⟨M(x)⟩,   (6)
where M(x) is the interface indicator function given by (4). Two-point surface correlation functions for statistically inhomogeneous media are defined by

Fsv(x1, x2) = ⟨M(x1)I(x2)⟩,   (7)
Fss(x1, x2) = ⟨M(x1)M(x2)⟩,   (8)
where I(x) ≡ I(1)(x) is the indicator function for the void phase. These functions are called the surface–void and surface–surface correlation functions, respectively. Higher-order surface correlation functions can also be defined [8].
2.3. Lineal Measures
For statistically isotropic media, the lineal-path function L(i)(z) gives the probability that a line segment of length z lies wholly in phase i when randomly thrown into the sample. Figure 4 shows an event that contributes to the lineal-path function. The function L(i)(z) is related to the chord-length probability density function p(i)(z) via the formula

p(i)(z) = (ℓC(i)/φi) d²L(i)(z)/dz²,   (9)

where ℓC(i) is the mean chord length for phase i, i.e., the first moment of p(i)(z), defined by ℓC(i) = ∫₀^∞ z p(i)(z) dz. Chords are all of the line segments between intersections of an infinitely long line with the two-phase interface. For statistically isotropic media, the quantity p(i)(z)dz is the probability of finding a chord of length between z and z + dz in phase i.
2.4. Pore-Size Functions
The pore-size probability density function P(δ) (also referred to as poresize “distribution” function) first arose to characterize the void or “pore” space in porous media. For simplicity, we will define P(δ) for phase 1, keeping in mind that it is equally well defined for phase 2. The quantity P(δ)dδ is defined as the probability that a randomly chosen point in V1 (ω) lies at a distance between δ and δ + dδ from the nearest point on the pore–solid interface.
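For a digitized medium, P(δ) can be estimated directly from a Euclidean distance transform, since the distance from a pore pixel to the nearest solid pixel approximates its distance to the pore–solid interface. A minimal sketch, with a hypothetical pore map:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Estimate of the pore-size density P(delta): the Euclidean distance
# transform gives, for every pore pixel, the distance to the nearest
# solid pixel; a normalized histogram of these distances over the pore
# phase approximates P(delta).
rng = np.random.default_rng(2)
pore = rng.random((256, 256)) > 0.4          # hypothetical pore indicator

dist = distance_transform_edt(pore)          # zero on the solid phase
delta = dist[pore]                           # distances sampled over the pore space
hist, edges = np.histogram(delta, bins=30, density=True)
```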
2.5. Two-Point Cluster Function
Perhaps the most promising two-point descriptor identified to date is the two-point cluster function C2(i)(x1, x2) [8]. The quantity C2(i)(x1, x2) gives the probability of finding two points at x1 and x2 in the same cluster of phase i. The formation of very large “clusters” of a phase in a heterogeneous material (on the order of the system size) can have a dramatic influence on its macroscopic properties. A cluster of phase i is defined as the part of phase i that can be reached from a point in phase i without passing through phase j ≠ i. A critical point, known as the percolation threshold, is reached when a sample-spanning cluster first appears. Thus, C2(i) is the analogue of the two-point probability
function S2(i) , but unlike its predecessor, it contains nontrivial topological “connectedness” information. Indeed, it is a useful signature of clustering in the system since it becomes longer ranged as the percolation threshold is approached from below. The measurement of C2(i) for a three-dimensional material sample cannot be made from a two-dimensional cross-section of the material, since it is an intrinsically three-dimensional microstructural function. The remaining challenge is to be able to incorporate C2(i) into a theory to predict macroscopic properties for a wide range of conditions, even near the threshold.
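Although C2(i) of a three-dimensional sample cannot be measured from a 2D cross-section, it can be sampled directly from a full digitization. A minimal two-dimensional sketch follows; the phase map is hypothetical, and the cluster labels are computed without periodic wrapping, so the estimate is only approximate near the edges.

```python
import numpy as np
from scipy.ndimage import label

# Cluster-based estimator for the two-point cluster function C2(r):
# label the connected clusters of the phase of interest, then count the
# fraction of random point pairs separated by r that share a label.
rng = np.random.default_rng(3)
phase = rng.random((256, 256)) < 0.55        # hypothetical phase map
labels, nclusters = label(phase)

def C2(labels, r, trials=100_000, rng=rng):
    n, m = labels.shape
    theta = rng.uniform(0.0, 2.0 * np.pi, trials)
    i0 = rng.integers(0, n, trials)
    j0 = rng.integers(0, m, trials)
    i1 = np.round(i0 + r * np.cos(theta)).astype(int) % n
    j1 = np.round(j0 + r * np.sin(theta)).astype(int) % m
    a, b = labels[i0, j0], labels[i1, j1]
    return np.mean((a == b) & (a > 0))       # same (nonzero) cluster label

print(C2(labels, 2.0), C2(labels, 20.0))
```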
2.6. Nearest-Neighbor Functions
All of the aforementioned statistical descriptors are defined for disordered materials of arbitrary microstructure. In the special case of random media composed of particles (phase 2) distributed randomly throughout another material (phase 1) or simple atomic systems, there is a variety of natural morphological descriptors. For simplicity, consider systems of identical spherical particles of diameter D (or radius R = D/2) at number density ρ. The “particle” nearest-neighbor probability density function HP (r) characterizes the probability of finding the nearest neighbor at some given distance from a reference particle. A different nearest-neighbor function, HV , referred to as the “void” nearest-neighbor probability density function, characterizes the probability of finding a nearest-neighbor particle center at a given distance from an arbitrary point in the system. Other descriptors of particle systems are described by Torquato [8].
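As a simple illustration of how HP(r) is estimated in practice, the following sketch histograms nearest-neighbor distances for a hypothetical set of sphere centers (a Poisson point pattern is used here for brevity; a hard-sphere configuration would be used in the applications discussed in this article).

```python
import numpy as np

# Estimate of the particle nearest-neighbor density HP(r) from a set of
# sphere centers: collect each particle's nearest-neighbor distance and
# histogram the results (unit box, d = 3, no periodic images).
rng = np.random.default_rng(4)
pts = rng.random((500, 3))                      # hypothetical centers

d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)                     # exclude self-distances
nearest = d.min(axis=1)                         # nearest-neighbor distance per particle
HP, edges = np.histogram(nearest, bins=30, density=True)
```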
3. Unified Theoretical Approach
The previous section described some of the different types of statistical correlation functions that have arisen in rigorous structure–property relations [8]. Until recently, application of such structure–property relations was virtually nonexistent because of the difficulty involved in ascertaining the correlation functions. Are these different functions related to one another? Can one write down a single expression that contains complete statistical information and thereby compute any specific correlation function? The key quantity that enables one to answer these two queries in the affirmative is the canonical n-point correlation function Hn [8, 10].
3.1. Canonical n-Point Correlation Function
For simplicity, we will begin by considering a classical, closed system of N interacting identical spherical particles of radius R in volume V . Any ensemble
of many-particle systems is completely spatially characterized classically by the n-particle probability density function ρn(r^n). The quantity ρn(r^n)dr^n is proportional to the probability of finding any subset of n particles with configuration r^n in volume element dr^n. In general, ρn(r^n)dr^n depends on the N-particle potential ΦN(r^N) and the particular dynamical process involved to create the system. In many instances, the total potential energy (in the absence of external fields) is well approximated by pairwise additivity. The key idea employed by Torquato [10] to define and derive series representations of the canonical n-point correlation function Hn is the available space and available surface to the ith “test” particle of radius bi that is inserted into the system of spheres of radius R. Letting ai = bi + R and yij = |xi − rj|, one representation of the Hn is given by

Hn(x^m; x^(p−m); r^q) = Σ_{s=0}^∞ (−1)^s (−1)^m (∂/∂a1) · · · (∂/∂am) Gn^(s)(x^p; r^q),   n = p + q,   (10)

where

Gn^(s)(x^p; r^q) = (1/s!) ∫ ρ_{q+s}(r^{q+s}) Π_{k=1}^{p} Π_{l=1}^{q} e(ykl; ak) Π_{j=q+1}^{q+s} {1 − Π_{i=1}^{p} [1 − m(yij; ai)]} drj,   (11)

and e(r; a) = 1 − m(r; a) equals zero if r < a and unity otherwise. Importantly, all of the aforementioned microstructural correlation functions are special limits of the Hn, and thus one can, in principle, compute each of them for this class of model microstructures given the ρn (for example, with m = 0, q = 0, and point test particles, bi = 0, Hn reduces to the n-point probability function Sn of the phase exterior to the spheres). Representations of the Hn for dispersions of spheres with a size distribution have also been obtained. The formal results described above can be extended to statistically anisotropic models consisting of inclusions whose configuration is fully specified by center-of-mass coordinates (e.g., oriented ellipsoids, cubes, cylinders, etc.) as well as to statistically anisotropic laminates. The formalism also extends to nonparticulate systems, such as cell or lattice models. The reader is referred to Ref. [8] for a discussion of these extensions.
3.2. Model Pair Potentials
For a system of noninteracting particles, the N-particle potential ΦN = 0. Insofar as statistical thermodynamics is concerned, this is the trivial case of an ideal gas. However, this is a nontrivial model of a heterogeneous material, since the lack of spatial correlation implies that the particles may overlap to form complex clusters,
as shown in Fig. 5. At low sphere densities, the particle phase is a dispersed, disconnected phase, but above a critical value, called the percolation threshold, the particle phase becomes connected. For d = 2 and 3, this threshold occurs at a sphere volume fraction of about 0.68 and 0.29, respectively [8]. We will refer to this model as overlapping spheres. Interpenetrable-sphere systems in general are useful models of consolidated media, such as sandstones and other rocks, and sintered materials.

Figure 5. Two-dimensional overlapping-particle systems at a low density with at most three particle overlaps (left panel) and at a high density above the percolation threshold (right panel) [8].

In the hard-sphere pair potential, the particles do not interact for interparticle separation distances greater than the sphere diameter D but experience an infinite repulsive force for distances less than or equal to D. Hard-sphere systems have received considerable attention, since they serve as a useful model for a number of physical systems, such as simple liquids, glasses, colloidal dispersions, fiber-reinforced composites, particulate composites, and granular media [8, 11–13]. The hard-sphere potential approximates well the structure of dense-particle systems with more complicated potentials because short-range repulsion between the particles is the primary factor in determining their spatial arrangement. Unlike the case of overlapping spheres, the determination of ρn for general ensembles of hard spheres is nontrivial. For hard-sphere systems the impenetrability constraint does not uniquely specify the statistical ensemble [8]. The hard-sphere system can be in thermal equilibrium or in one of the infinitely many nonequilibrium states, such as the random sequential addition (or adsorption) (RSA) process (see Fig. 6). The latter is produced by randomly, irreversibly, and sequentially placing nonoverlapping objects into a volume; a minimal numerical implementation is sketched at the end of this subsection. For identical d-dimensional RSA spheres, the filling process terminates at the saturation limit, which is substantially lower than the maximum density for random hard spheres in equilibrium. Denoting the maximum sphere volume fraction by φ2max, it turns out that for identical hard spheres in an RSA process in the thermodynamic limit, φ2max ≈ 0.75, 0.55, and 0.38 for d = 1, 2, and 3, respectively. In contrast, for identical disordered hard spheres in equilibrium, φ2max is exactly unity for d = 1, and for d = 2 and 3, φ2max ≈ 0.83 and 0.64, respectively.

Figure 6. “Snapshot” of an equilibrium system of hard particles (left) and a realization of hard particles assembled according to an RSA process (right) [8]. In the former, the particles are free to sample the configuration space subject to impenetrability of the other particles, but in the latter, the particles are frozen at their initial positions.

Interestingly, these maximum densities for equilibrium hard spheres apparently correspond to a special singular disordered state commonly referred to as the random close packing (RCP) state [14]. However, it has recently been shown [15] that the venerable concept of the RCP state is mathematically ill-defined. The RCP density is thought to be the highest density that a “random” packing of particles can attain. The term close packed implies maximal coordination throughout the system, i.e., an ordered lattice, which clearly is in conflict with the notion of randomness. The exact proportion of these two competing effects is not well-defined, and therein lies the problem. Finally, the notion of randomness was never quantified, nor even clearly defined. Torquato et al. [15] demonstrated the ill-defined nature of the RCP state by introducing precise definitions for jamming [16] and scalar order metrics, and by analyzing computer-generated sphere packings. They showed that since one can achieve packings with arbitrarily small increases in packing fraction at the expense of small increases in order, the notion of RCP as the highest possible density that a random sphere packing can attain is not well-defined. To replace this idea, they introduced a new concept: the maximally random jammed (MRJ) state, which can be defined precisely once a scalar order metric is chosen. This lays the mathematical groundwork for studying randomness in packings of particles and initiates the search for the MRJ state in a quantitative way not possible before. In a recent study [17], a comprehensive set of candidate jammed states of identical hard spheres and a variety of different order metrics have been used to estimate the MRJ packing fraction to be 0.637 ± 0.0015. Figure 7 shows a configuration near the MRJ state. The determination of the MRJ states for systems of spheres with a polydispersity in size [18] and for systems of ellipsoids [19] are intriguing and challenging problems. Recently, it was shown that ellipsoids can randomly pack more densely than spheres in the MRJ state and nearly as dense as the densest (crystal) packing of spheres [19].

Figure 7. A realization of a random packing of 500 identical spheres near the maximally random jammed state.

Interpenetrable-sphere models enable one to define systems that are intermediate between overlapping spheres and impenetrable spheres, thereby enabling one to vary the degree of connectivity of the particle phase. A popular interpenetrable-sphere model is the penetrable-concentric-shell model or, more colloquially, the cherry-pit model. When 0 ≤ λ ≤ 1, each sphere of diameter D may be thought of as being composed of an impenetrable core of diameter λD encompassed by a perfectly penetrable concentric shell of thickness (1 − λ)D/2. By varying the impenetrability parameter λ between 0 and 1, one can continuously pass between fully penetrable (overlapping) spheres and totally impenetrable spheres, respectively. Well-known models that incorporate attractive interactions include the square-well potential and the Lennard–Jones potential [11]. A special limit of the square-well potential that reduces attractive interactions to a delta function at contact is referred to as the “sticky” hard-sphere potential. This potential provides a simple means of modeling aggregation processes in particle systems.
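Returning to the RSA process described above, it lends itself to a very compact implementation. The sketch below places identical hard disks in a periodic unit square until a long run of rejections is encountered, a crude but serviceable proxy for saturation; the diameter and rejection threshold are arbitrary illustrative choices.

```python
import numpy as np

# Minimal random sequential addition (RSA) of identical hard disks in a
# periodic unit square: candidate centers are placed randomly and
# irreversibly, and rejected whenever they would overlap a previously
# accepted disk.
rng = np.random.default_rng(5)
D = 0.03                                    # disk diameter (assumed)
centers = []
failures = 0
while failures < 20_000:                    # long rejection run ~ "saturation"
    trial = rng.random(2)
    if centers:
        delta = np.abs(np.asarray(centers) - trial)
        delta = np.minimum(delta, 1.0 - delta)       # periodic minimum image
        if np.any(np.hypot(delta[:, 0], delta[:, 1]) < D):
            failures += 1
            continue
    centers.append(trial)
    failures = 0

phi2 = len(centers) * np.pi * (D / 2.0) ** 2
print(f"{len(centers)} disks accepted, phi2 = {phi2:.3f}")
```

For small D, the resulting area fraction approaches the d = 2 saturation value φ2max ≈ 0.55 quoted above.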
3.3. Illustrative Calculations
Given the series representation of the canonical n-point correlation function Hn , one can compute (using statistical–mechanical techniques) specific statistical descriptors as special limiting cases as outlined above. Here we report two illustrative calculations of correlation functions. The reader is referred to Ref. [8] for computational details.
Figure 8 shows the matrix two-point probability function S2 for three-dimensional overlapping spheres (phase 2) at a sphere volume fraction φ2 = 0.5. Included in the figure is the corresponding plot of S2 for equilibrium hard (totally impenetrable) spheres, which, unlike overlapping spheres, exhibits short-range order. Figure 9 shows the two-point cluster function C2 for sticky hard spheres. The key point is that C2 becomes longer ranged as the percolation threshold φ2c = 0.297 is approached from below. We note that there is a variety of techniques available to extract microstructural functions from digitized 2D and 3D representations of heterogeneous materials [8].

Figure 8. The matrix two-point probability function S2(r) versus the dimensionless distance r/D for two models of isotropic distributions of spheres of diameter D = 2R at a sphere volume fraction φ2 = 0.5 [8].

Figure 9. The two-point cluster function C2(r) versus the dimensionless distance r/D for sticky hard spheres of diameter D at several values of the sphere volume fraction φ2. Here τ = 0.35 is the dimensionless “stickiness” parameter and φ2c = 0.297 [8].
4. Homogenization Theory
Homogenization theory is concerned with finding the appropriate homogenized (or averaged, or macroscopic) governing partial differential equations describing physical processes occurring in heterogeneous materials when the length scale of the heterogeneities tends to zero. In such instances, it is desired that the effects of the microstructure reside wholly in the macroscopic or effective properties via certain weighted averages of the microstructure. In its simplest form, the method is based on the consideration of two length scales: the macroscopic scale L, characterizing the extent of the system, and the microscopic scale ℓ, associated with the heterogeneities. Moreover, it is supposed that some external field is applied that varies on a characteristic length scale Λ. The limit of interest for purposes of homogenization is L ≥ Λ ≫ ℓ. Therefore, there is a small parameter ε = ℓ/L associated with rapid fluctuations in the microstructure or local property. Accordingly, the field quantities (e.g., temperature field, electric field, stress field, concentration field, velocity field) depend on two variables: a global or slow variable x and a local or fast variable y = x/ε. The slowly varying parts of the fields are imposed by the source, the boundary conditions, or the initial conditions, while the rapidly varying parts are imposed by the local property or microstructure. These variations are schematically shown in Fig. 10. Under these conditions, a complete analysis of the problem involves three steps:
1. One first sets out to find the form of the homogenized differential equations, valid on length scales O(Λ), by performing an asymptotic expansion of the field quantities in terms of the global and local variables.
Figure 10. A schematic depiction of the slow, O(Λ), and rapid, O(ℓ), parts of the field.
2. Next, one must determine the effective properties that arise in the averaged equations as a function of the microstructure.
3. Finally, one must solve the homogenized equations under appropriate boundary or initial conditions.
Two-scale homogenization theory enables one to show that the effective conductivity tensor σe, effective stiffness tensor Ce, mean survival time τ, and fluid permeability tensor k of the random heterogeneous material in the limit ε → 0 are determined by ensemble averages of local fields that satisfy the appropriate conservation equations, i.e., governing partial differential equations. The reader interested in these derivations is referred to the books by Torquato [8] and Milton [7].
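The two-scale ansatz behind step 1 is not written out in the text; for the conduction problem of the next subsection it takes the following familiar form (a sketch in the periodic setting, quoted from the general homogenization literature rather than from this article):

```latex
% Two-scale expansion of the potential T^epsilon (periodic setting, sketch):
\[
  T^{\epsilon}(\mathbf{x})
    = T_{0}(\mathbf{x},\mathbf{y})
    + \epsilon\,T_{1}(\mathbf{x},\mathbf{y})
    + \epsilon^{2}\,T_{2}(\mathbf{x},\mathbf{y}) + \cdots,
  \qquad
  \mathbf{y} = \mathbf{x}/\epsilon,
  \qquad
  \nabla \to \nabla_{\mathbf{x}} + \frac{1}{\epsilon}\,\nabla_{\mathbf{y}}.
\]
% Inserting this into \nabla\cdot[\sigma(\mathbf{y})\,\nabla T^{\epsilon}] = 0 and
% collecting powers of \epsilon shows that T_0 depends on x alone, yields a
% periodic cell problem for T_1, and produces the homogenized equation
% \nabla_x\cdot(\sigma_e\,\nabla_x T_0) = 0, with \sigma_e given by a cell average.
```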
4.1. Conduction Problem
Consider the steady-state transport or displacement of a conservable quantity associated with any of the class A problems that are summarized in Table 1. To fix ideas, we will speak in the language of electrical or thermal conduction, keeping in mind that the results of this section apply as well to the determination of the effective dielectric constant, magnetic permeability, and diffusion coefficient. Each realization ω of the random heterogeneous material that occupies the space V is composed of two phases (phases 1 and 2) having constant conductivity tensors σ1 and σ2 .
4.1.1. Local differential equation
Let J(x) denote the local electric (thermal) current or flux at position x, and let E(x) denote the local field intensity. Under steady-state conditions with no source terms, J is solenoidal and E is irrotational:

∇ · J(x) = 0 in V,   (12)
∇ × E(x) = 0 in V,   (13)

for each realization of the ensemble. The latter condition implies the existence of a potential field T, i.e.,

E = −∇T.   (14)
Thus, E and T represent the electric field (negative of the temperature gradient) and electric potential (temperature) in the electrical (thermal) problem, respectively. We also specify the potential T on the boundary of V.
4.1.2. Local constitutive relation
The fields J and E are linked by assuming a linear constitutive relation, i.e.,

J(x) = σ(x) · E(x) in V,   (15)

where the local conductivity tensor can be expressed as

σ(x) = σ1 I(1)(x) + σ2 I(2)(x),   (16)
and I(i)(x) is the indicator function for phase i given by (3).
4.1.3. Averaged constitutive relation
The following ensemble-averaged constitutive relation defines the symmetric second-order effective conductivity tensor:

⟨J(x)⟩ = σe · ⟨E(x)⟩.   (17)
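Definition (17) can be turned into a simple numerical experiment. The sketch below is illustrative and not taken from Ref. [8]: it discretizes a two-phase sample as a resistor network, imposes a unit potential drop, and extracts a scalar effective conductivity from the resulting current; the grid size and phase contrast are arbitrary choices.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

# Finite-difference (resistor-network) estimate of a scalar effective
# conductivity: potential fixed to 1 on the left column and 0 on the
# right, insulating top and bottom; bond conductances are harmonic means
# of the two adjacent cell conductivities.
rng = np.random.default_rng(6)
n = 40                                         # n x n grid of nodes
sigma = np.where(rng.random((n, n)) < 0.5, 1.0, 10.0)

def bond(a, b):                                # harmonic-mean conductance
    return 2.0 * a * b / (a + b)

idx = lambda i, j: i * n + j
A = lil_matrix((n * n, n * n))
rhs = np.zeros(n * n)
for i in range(n):
    for j in range(n):
        k = idx(i, j)
        if j == 0 or j == n - 1:               # Dirichlet boundary rows
            A[k, k] = 1.0
            rhs[k] = 1.0 if j == 0 else 0.0
            continue
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < n:                    # skip insulating edges
                g = bond(sigma[i, j], sigma[ii, jj])
                A[k, k] += g
                A[k, idx(ii, jj)] -= g

T = spsolve(A.tocsr(), rhs).reshape(n, n)
current = sum(bond(sigma[i, 0], sigma[i, 1]) * (T[i, 0] - T[i, 1])
              for i in range(n))               # total current into the sample
sigma_e = current * (n - 1) / n                # conductance * length / width
print(f"effective conductivity ~ {sigma_e:.2f}")
```

Large grids and ensemble averaging would be needed for a converged estimate; the point here is only that σe follows from averages of the solved local fields, as Eq. (17) states.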
4.2. Elasticity Problem
4.2.1. Local differential equations
Let τ(x) and ε(x) denote, respectively, the symmetric local stress and strain tensors at position x. Under steady state without sources, τ(x) and ε(x) obey the elastostatic equations for each realization of the ensemble:

∇ · τ = 0 in V,   (18)
∇ × [∇ × ε]T = 0 in V.   (19)

The latter condition implies the existence of a displacement field u, i.e.,

ε(x) = (1/2)[∇u(x) + ∇u(x)T].   (20)
The superscript T denotes the transpose operation. We also specify the displacement u on the boundary of V.
4.2.2. Local constitutive relation
We assume that the fields τ and ε are linked via a linear constitutive relation, i.e.,

τ(x) = C(x) : ε(x) in V,   (21)

where

C(x) = C1 I(1)(x) + C2 I(2)(x)   (22)
is the local stiffness tensor. Relation (21) is the generalization of Hooke’s law. Here the symbol : denotes the contraction with respect to two indices.
4.2.3. Averaged constitutive relation
The following ensemble-averaged constitutive relation defines the symmetric effective stiffness tensor:

⟨τ(x)⟩ = Ce : ⟨ε(x)⟩.   (23)
For elastically isotropic media, Ce is expressible in terms of two independent effective scalar parameters; for example, the effective bulk modulus Ke and the effective shear modulus Ge.
4.3. Trapping Problem
Consider the problem of diffusion and reaction among partially absorbing “traps” in each realization ω of the random medium. Let V1 be the region in which diffusion occurs (i.e., the trap-free, or pore, region) and let V2 be the trap region. It is important to emphasize that, unlike the previous two problems of conduction and elasticity, there is no local constitutive relation for the mean survival time τ or, equivalently, the trapping constant γ = (τφ1D)⁻¹, where D is the diffusion coefficient. This is also true for the fluid permeability, as will be described below. Thus, the trapping and flow problems (class C and D problems) are fundamentally different from the conduction and elasticity problems (class A and B problems); see Table 1.
4.3.1. Local differential equation
The generation rate per unit trap-free volume of a diffusing species is G(x). The scaled concentration field of the reactants u(x) exterior to the partially absorbing traps under steady-state conditions is governed by the Poisson equation:

Δu = −1 in V1,   (24)

with the boundary condition at the pore–trap interface given by

D ∂u/∂n + κu = 0 on ∂V,   (25)

where Δ is the Laplacian operator, κ is the surface rate constant, and n is the unit outward normal from the pore space. For infinite surface reaction (κ = ∞), the traps are perfect absorbers, and u = 0 (diffusion-controlled limit). For vanishing surface reaction (κ = 0), the traps are perfect reflectors, and ∂u/∂n = 0 (reaction-controlled limit).
4.3.2. Averaged constitutive relation
The trapping constant γ is defined by the following averaged constitutive relation:

G(x) = γ D C(x),   (26)
where C(x) is the average concentration field and

γ⁻¹ = ⟨u⟩ = τ φ1 D.   (27)
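The mean survival time has a direct Monte Carlo interpretation that makes Eq. (27) concrete. The following sketch, with purely illustrative parameters (it is not the efficient first-passage machinery of Ref. [8]), simulates Brownian walkers among perfectly absorbing overlapping spherical traps and averages their lifetimes:

```python
import numpy as np

# Monte Carlo sketch of the mean survival time tau in the
# diffusion-controlled limit (kappa = infinity): Brownian walkers take
# Gaussian steps in a periodic unit box and are killed on first contact
# with any of the overlapping spherical traps.
rng = np.random.default_rng(7)
R, D, dt = 0.05, 1.0, 1e-5
traps = rng.random((100, 3))                    # Poisson-like trap centers

def absorbed(x):
    d = np.abs(traps - x)
    d = np.minimum(d, 1.0 - d)                  # periodic minimum image
    return np.any(np.sum(d * d, axis=1) < R * R)

times = []
for _ in range(200):                            # walkers
    x = rng.random(3)
    while absorbed(x):                          # start inside the pore space
        x = rng.random(3)
    t = 0.0
    while not absorbed(x):
        x = (x + rng.normal(0.0, np.sqrt(2.0 * D * dt), 3)) % 1.0
        t += dt
    times.append(t)

print(f"tau ~ {np.mean(times):.2e}")
```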
4.4. Flow Problem
For each realization ω of the random porous medium, let V1 be the region through which the viscous incompressible fluid with viscosity µ flows (i.e., pore, or void, region) and let V2 be the solid region.
4.4.1. Local differential equations
The fluid motion satisfies the tensor Stokes equations

Δw = ∇π − I in V1,   (28)
∇ · w = 0 in V1,   (29)
w = 0 on ∂V,   (30)
where I is the second-order unit tensor. In these equations, the scaled tensor velocity field wij is the jth component of the velocity due to a unit pressure gradient in the ith direction, and πj is the jth component of the associated scaled pressure.
4.4.2. Averaged constitutive relation
The fluid permeability tensor k is defined by Darcy’s law:

U(x) = −(k/µ) ∇p0(x),   (31)

where U(x) is the average velocity field and

k = ⟨w⟩.   (32)
5. Variational Principles and Bounds
Due to the complexity of the microstructure, there are relatively few situations in which one can evaluate the effective properties of heterogeneous materials exactly. Such rare results are nonetheless quite valuable as benchmarks to test theories and computer simulations [7, 8]. The difficulty in obtaining exact predictions of the effective properties is due to the fact that they generally depend on functionals involving an infinite set of statistical correlation functions that characterize the microstructure [7, 8]. Such complete information is usually not available. Hence, approximation and rigorous bounding techniques have been devised to estimate the effective properties [4, 6–9]. In the absence of exact results for the effective properties, any rigorous statement about them must be in the form of rigorous bounds. Bounds are useful because: (1) they rigorously incorporate nontrivial information about the microstructure via statistical correlation functions and consequently serve as a guide in identifying appropriate statistical descriptors; (2) as successively more microstructural information is included, the bounds become progressively narrower; (3) one of the bounds can provide a relatively sharp estimate of the property for a wide range of conditions, even when the reciprocal bound diverges from it; (4) they are usually exact under certain conditions and can be used to find extremal microstructures; (5) they can be utilized to test the merits of a theory or computer experiment; and (6) they provide a unified framework to study a variety of different effective properties.
5.1. Variational Principles
Different methods exist to derive bounds on effective properties [7], but only variational principles (minimum energy principles) are available to derive bounds on all four different effective properties considered in this article [8]. To illustrate the basic idea, we state the minimum energy principles for the conduction problem without proof.
5.1.1. Minimum potential energy
Let AU be the class of trial intensity fields Ê defined by the set

AU = {ergodic Ê; ∇ × Ê = 0, ⟨Ê⟩ = ⟨E⟩}.   (33)

The actual average energy is bounded from above by the trial average energy, i.e.,

(1/2) ⟨E⟩ · σe · ⟨E⟩ ≤ (1/2) ⟨Ê · σ · Ê⟩   ∀ Ê ∈ AU,   (34)
where E is curl-free. This principle leads to an upper bound on the effective conductivity.
5.1.2. Minimum complementary energy
Let AL be the class of trial flux fields Ĵ defined by the set

AL = {ergodic Ĵ; ∇ · Ĵ = 0, ⟨Ĵ⟩ = ⟨J⟩}.   (35)

Again, the actual average energy is bounded from above by the trial average energy, i.e.,

(1/2) ⟨J⟩ · σe⁻¹ · ⟨J⟩ ≤ (1/2) ⟨Ĵ · σ⁻¹ · Ĵ⟩   ∀ Ĵ ∈ AL,   (36)
where J is solenoidal. This principle leads to a lower bound on the effective conductivity.
5.2. Variational Bounds
To obtain bounds from the variational principles, one must construct specific trial fields. The simplest trial fields that satisfy the admissibility conditions are constant vectors. In particular, the constant fields ⟨E⟩ and ⟨J⟩ lead to the simple bounds

⟨σ⁻¹⟩⁻¹ ≤ σe ≤ ⟨σ⟩,   (37)

where for any local property a, ⟨a⟩ ≡ φ1 a1 + φ2 a2. Thus, the effective conductivity tensor is bounded from above by the arithmetic mean of the phase conductivities and from below by the harmonic mean of the phase conductivities. We refer to these results as one-point bounds, since they involve information up to the level of the volume fraction, which is a one-point correlation function. The one-point bounds (37) are generally far apart from one another. In order to improve upon them (i.e., to obtain higher-order bounds), one must construct nonuniform trial fields that better reflect the field interactions between the phases. One can systematically generate n-point bounds (bounds involving information up to the level of the relevant n-point correlation function) in this manner [7, 8]. Perhaps the most well-known two-point bounds are the optimal Hashin–Shtrikman bounds on the effective conductivity [7, 8, 20]. The reader is referred to the book of Torquato [8] for a comprehensive discussion of the derivation of n-point bounds on all four effective properties considered in this article.
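The one-point bounds (37) and their most famous two-point improvement are easy to evaluate for an isotropic two-phase composite. In the sketch below, the Hashin–Shtrikman expressions are the standard d = 3 formulas from the general literature; they are cited, but not written out, in the text above.

```python
import numpy as np

# One-point (arithmetic/harmonic) bounds of Eq. (37) and the standard
# d = 3 Hashin-Shtrikman two-point bounds for an isotropic two-phase
# effective conductivity, assuming s2 > s1 > 0.
def conductivity_bounds(s1, s2, phi2):
    assert s2 > s1 > 0.0
    phi1 = 1.0 - phi2
    lower_1pt = 1.0 / (phi1 / s1 + phi2 / s2)      # harmonic mean
    upper_1pt = phi1 * s1 + phi2 * s2              # arithmetic mean
    hs_lower = s1 + phi2 / (1.0 / (s2 - s1) + phi1 / (3.0 * s1))
    hs_upper = s2 + phi1 / (1.0 / (s1 - s2) + phi2 / (3.0 * s2))
    return lower_1pt, hs_lower, hs_upper, upper_1pt

print(conductivity_bounds(1.0, 10.0, 0.2))
```

As the prose above anticipates, the two-point Hashin–Shtrikman interval always lies inside the one-point interval.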
6. Evaluation of Property Bounds
Various two-, three-, and four-point bounds on effective properties have been evaluated for a number of different model microstructures. These evaluations require knowledge of the relevant n-point correlation functions, which were discussed in Section 3. To illustrate the predictive capability of bounds, we graphically depict them in Figs. 11 and 12 for each effective property for simple models of distributions of identical spheres in a matrix or fluid phase. All of these results are taken from Ref. [8]. Figure 11 compares two- and three-point bounds to corresponding simulation data and a three-point approximation for the effective conductivity of superconducting hard spheres (σ2 /σ1 = ∞) in equilibrium. Although the corresponding upper bounds diverge to infinity, the three-point lower bound provides a good estimate of σe because the particles do not cluster. It is noteworthy that the three-point approximation is quite accurate. The same figure includes a comparison of bounds to experimental data of the effective shear modulus G e for random equilibrium arrays of identical glass spheres in an epoxy matrix. The three-point bounds provide significant improvement over the two-point bounds, and the data lie closer to the three-point lower bound. Figure 12 compares a two-point interfacial surface upper bound and pore-size lower bound to simulation data for the scaled mean survival time τ D/R 2 versus trap volume fraction φ2 for three-dimensional fully overlapping spherical traps of radius R in the case of perfectly absorbing traps (κ ∗ = κ R/D = ∞). This figure also provides a comparison of bounds on the scaled inverse fluid permeability (resistance) ks /k for equilibrium arrays of identical
28 Random Hard Spheres σ 2 /σ 1 = ∞
Random Hard Spheres Scaled shear modulus, Ge/G1
Scaled conductivity, σe/σ1
10.0
2–point lower bound 3–point lower bound 3–point approximation Simulation data 5.0
0.0 0.0
0.2
0.4
0.6
Particle volume fraction, φ
2
24
G2/G1⫽28.5, G1/K1⫽0.228, G2/K1⫽0.66
20 16
Data 2–point bounds 3–point bounds
12 8 4 0 0.0
0.2
0.4
0.6
Particle volume fraction, φ
0.8
1.0
2
Figure 11. Left panel: Two- and three-point estimates of σe /σ1 at σ2 /σ1 = ∞ for random arrays of identical hard spheres in equilibrium [8]. Right panel: Comparison of bounds on G e /G 1 to experimental data for random equilibrium arrays of identical glass spheres in an epoxy matrix [8].
Theory of random heterogeneous materials
1355 100 Kozeny Equation
Overlapping Traps κ*⫽∞
τ D/R
2
1.5
Cross-Property Bound
1.0
ks k
10 2–Point Bound
0.5
0.0 0.0
3–Point Bound 1 0.2
0.4
φ2
0.6
0.8
1.0
0
0.2
φ
0.4
0.6
2
Figure 12. Left panel: Comparison of upper and lower bounds (solid curves) to simulation data for the scaled mean survival time τ D/R 2 for fully overlapping spherical traps of radius R in the case of perfect absorption (κ ∗ = κ R/D = ∞) [8]. Right panel: Comparison of bounds and estimates on the scaled inverse fluid permeability (resistance) ks /k for equilibrium arrays of identical nonoverlapping spheres [8].
three-dimensional nonoverlapping spheres of radius R versus sphere volume fraction φ2, where ks = 2R²/(9φ2). Included in this figure are a cross-property bound and the empirical Kozeny relation; the latter provides a rough estimate of experimental data.
7. Summary
An effective property of a random heterogeneous material is a functional of the relevant local fields weighted with certain correlation functions that statistically characterize the structure. Generally, the type of correlation function involved depends on the specific physical problem that one studies. However, for certain classes of materials, it has been shown that all of the apparently different types of correlation functions can be obtained from a canonical function Hn and, consequently, can be shown to be related to one another. Thus, seemingly different effective properties are indeed related to one another via so-called cross-property relations [8]. Such a unified approach to studying the effective properties of random heterogeneous materials is both natural and very powerful. In general, an infinite amount of microstructural information is required to determine an effective property exactly. In practice, therefore, lower-order n-point estimates (approximations or bounds) of the effective properties have been derived that often provide accurate predictions. Nonetheless, many challenges remain. For example, a systematic means of incorporating important topological information, such as connectedness of the phases, into structure–property relations in a nontrivial manner has not been accomplished to date.
There were many important theoretical topics that we were not able to cover in this article. Some of these topics include: (1) percolation theory [8, 9, 21, 22]; (2) image analysis of microstructures [8]; (3) exact solutions for effective properties [7, 8]; (4) analytical approximations for effective properties [7–9]; (5) property optimization [6, 7]; and (6) cross-property relations [7, 8], which link different effective properties.
References
[1] J.C. Maxwell, Treatise on Electricity and Magnetism, Clarendon Press, Oxford, 1873.
[2] L. Rayleigh, "On the influence of obstacles arranged in rectangular order upon the properties of a medium," Phil. Mag., 34, 481–502, 1892.
[3] A. Einstein, "Eine neue Bestimmung der Moleküldimensionen," Ann. Phys., 19, 289–306, 1906.
[4] R.M. Christensen, Mechanics of Composite Materials, Wiley, New York, 1979.
[5] V.V. Jikov, S.M. Kozlov, and O.A. Oleinik, Homogenization of Differential Operators and Integral Functionals, Springer-Verlag, Berlin, 1994.
[6] A.V. Cherkaev, Variational Methods for Structural Optimization, Springer-Verlag, New York, 2000.
[7] G.W. Milton, The Theory of Composites, Cambridge University Press, Cambridge, England, 2002.
[8] S. Torquato, Random Heterogeneous Materials: Microstructure and Macroscopic Properties, Springer-Verlag, New York, 2002.
[9] M. Sahimi, Heterogeneous Materials I: Linear Transport and Optical Properties, Springer-Verlag, New York, 2003.
[10] S. Torquato, "Microstructure characterization and bulk properties of disordered two-phase media," J. Stat. Phys., 45, 843–873, 1986.
[11] J.P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, New York, 1986.
[12] R. Zallen, The Physics of Amorphous Solids, Wiley, New York, 1983.
[13] W.B. Russel, D.A. Saville, and W.R. Schowalter, Colloidal Dispersions, Cambridge University Press, Cambridge, England, 1989.
[14] J.D. Bernal, "The geometry of the structure of liquids," In: T.J. Hughel (ed.), Liquids: Structure, Properties, Solid Interactions, Elsevier, New York, pp. 25–50, 1965.
[15] S. Torquato, T.M. Truskett, and P.G. Debenedetti, "Is random close packing of spheres well defined?," Phys. Rev. Lett., 84, 2064–2067, 2000.
[16] S. Torquato and F.H. Stillinger, "Multiplicity of generation, selection, and classification procedures for jammed hard-particle packings," J. Phys. Chem. B, 105, 11849–11853, 2001.
[17] A.R. Kansal, S. Torquato, and F.H. Stillinger, "Diversity of order and densities in jammed hard-particle packings," Phys. Rev. E, 66, 041109, 2002.
[18] A.R. Kansal, S. Torquato, and F.H. Stillinger, "Computer generation of dense polydisperse sphere packings," J. Chem. Phys., 117, 8212–8218, 2002.
[19] A. Donev, I. Cisse, D. Sachs, E.A. Variano, F.H. Stillinger, R. Connelly, S. Torquato, and P.M. Chaikin, "Improving the density of jammed disordered packings using ellipsoids," Science, 303, 990–993, 2004.
[20] Z. Hashin and S. Shtrikman, "A variational approach to the theory of the effective magnetic permeability of multiphase materials," J. Appl. Phys., 33, 3125–3131, 1962.
[21] G. Grimmett, Percolation, Springer-Verlag, New York, 1989.
[22] D. Stauffer and A. Aharony, Introduction to Percolation Theory, Taylor and Francis, London, 1992.
4.6 MODERN INTERFACE METHODS FOR SEMICONDUCTOR PROCESS SIMULATION J.A. Sethian Department of Mathematics, University of California, Berkeley, CA, USA
The manufacture of semiconductor devices may include dozens of process steps, all delicately choreographed to produce a functioning, reliable, and efficient device. These steps, such as photolithography, etching, and deposition, act to shape and mold the device, replete with various metals, insulators, and interconnects. As one might guess, a trial-and-error approach to determining a repeatable and reliable recipe is expensive, and numerical simulations which capture the essential details of these processes have a valuable role to play. Understanding and predicting how devices can be effectively built is complicated by rapidly changing advances in the manufacturing process. Ever-smaller devices have moved far away from continuum equations for transport, etching, and deposition, and now rely on such discrete effects as individual-atom surface physics. Adequate equations and models for these phenomena are themselves subjects of great debate. Until recently, these modeling difficulties were matched by numerical difficulties associated with accurately tracking the evolution of material boundaries and interfaces under complex physics. Some of the difficulties included trying to capture the formation of sharp corners, topological changes which lead to void formation and shadow zones, delicate dependence of profile evolution on interface geometry, and the sheer programming difficulty of accurately representing and capturing three-dimensional motion. These computational obstacles masked the unresolved modeling issues inherent in the move towards smaller and smaller process scales. In recent years, there has been significant advancement in numerical methods to track propagating interfaces evolving in complex situations. These techniques rely on an Eulerian, implicit embedding view of an interface discussed below; briefly, the interface is embedded as a particular level set of a
higher-dimensional function, and it is that latter function that does all the work. Fortunately, the exchange is well worth it: topological changes, numerical robustness, high accuracy, and straightforward programming come at little cost. The resulting numerical techniques, known as level set methods and narrow band level set methods, have been incorporated into a wide collection of TCAD and semiconductor process simulation codes.
1. Physical Effects and Background
The goal of numerical simulations in microfabrication is to model the process by which silicon devices are manufactured. Here, we briefly summarize some of the physical processes. This material is discussed in more detail in "Evolution and Implementation of Level Set Methods" [1], and much of this description is taken from that source. First, a single crystal ingot of silicon is extracted from molten pure silicon. This silicon ingot is then sliced into several hundred thin wafers, each of which is then polished to a smooth finish. A thin layer of crystalline silicon is then oxidized, a light-sensitive "photoresist" is applied, and the wafer is then covered with a pattern mask that shields part of the photoresist. This pattern mask contains the layout of the circuit itself. Under exposure to light or an electron beam, the exposed photoresist polymerizes and hardens, leaving an unexposed material that is then etched away in a dry etch process, revealing a bare silicon dioxide layer. Ionized impurity atoms such as boron, phosphorus, and argon are then implanted into the pattern of the exposed silicon wafer, and silicon dioxide is deposited at reduced pressure in a plasma discharge from gas mixtures at a low temperature. Finally, thin films such as aluminum are deposited by processes such as plasma sputtering, and contacts to the electrical components and component interconnections are established. The result is a device that carries the desired electrical properties. These processes produce considerable changes in the surface profile as it undergoes various effects of etching and deposition. This problem is known as the "surface topography problem" in microfabrication and is controlled by a large collection of physical events. Our central concerns are etching and deposition processes, in which the rate of change of the surface profile depends on such factors as the visibility of the etching/deposition source from each point of the evolving profile, reemission of particles, surface diffusion along the front, complex flux laws that produce faceting, shocks and rarefactions, material-dependent discontinuous etch rates, and masking profiles. The underlying physics and chemistry that contribute to the motion of the interface profile are very much areas of active research. Nonetheless, once empirical models are formulated, the problem ultimately becomes one of tracking an interface moving under a given speed function. This problem
occurs in a wide collection of physical phenomena, including ocean waves, fluid mixing, combustion, crystal growth and dendritic solidification, and secondary oil recovery. Interestingly, many non-physical problems can also be cast in this setting, including problems in shape segmentation, optimal structural design, and inverse reconstruction.
2. Algorithmic Requirements for Tracking Interfaces
Abstractly, the goal is to devise numerical algorithms to track moving interfaces. For the moment, imagine an interface moving in a two-dimensional domain; for simplicity, we can think of a curve moving in the plane, such that at each point of the curve and at any time, we can query a function that gives us the speed of that point on the curve. For further simplicity, we shall call this speed function F; its arguments may include the position (x, y), the time t, and a collection of other factors, including the shape of the interface and related physics on and off the interface. We seek a representation of this "interface" which allows us to update its position by repeatedly querying this function. Perhaps the most straightforward such discretization is to simply consider the curve as a collection of linked marker nodes, whose positions at any time reveal the interface. At any time, one can query the speed function F to find the velocity of each node, and then update the node positions to obtain the new location of the front. In Fig. 1, we show the idea of a discrete marker representation. On the left, a circle expanding with unit speed in the normal direction is discretized into a set of linked markers. However, the middle figure shows the first difficulty: for shapes with sharp corners, the normal direction is not defined; furthermore, for non-convex shapes, the moving markers may overlap. The figure on the right shows a greater difficulty: two evolving fronts may change topology as they intersect. The correct solution is the expanding envelope of the two shapes; however, it is difficult to determine which markers rightly belong on the new
Figure 1. Marker particle interface discretization: marker discretization (left), collision of normals (middle), and topological change (right).
front. While algorithmic solutions to these problems exist, they become more complex and challenging in representing three-dimensional evolutions. In semiconductor process modeling, these effects are of great importance: fluxes and visibilities delicately depend on accurate calculation of normals, topological changes, and construction of an intact front.
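A toy sketch of the marker approach (our illustration, not code from this chapter) makes the bookkeeping concrete; even this simple polygon update exhibits the corner and collision pathologies described above once the curve becomes non-convex.

```python
import numpy as np

# Markers on a closed curve, ordered counter-clockwise.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
xy = np.column_stack([np.cos(theta), np.sin(theta)])

def advance_markers(xy, F=1.0, dt=0.01):
    """Move each marker a distance F*dt along the outward normal."""
    # Tangents by centered differences around the closed polygon.
    tang = np.roll(xy, -1, axis=0) - np.roll(xy, 1, axis=0)
    tang /= np.linalg.norm(tang, axis=1, keepdims=True)
    # Rotate the tangent by -90 degrees to get the outward normal (CCW curve).
    normal = np.column_stack([tang[:, 1], -tang[:, 0]])
    return xy + dt * F * normal
```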
3. Level Set Methods
Level set methods, introduced by Osher and Sethian [2] ("Fronts Propagating with Curvature-Dependent Speeds"), were devised to accurately track interfaces evolving under a variety of complex speed laws in two and three dimensions. They rely in part on the theory of curve and surface evolution given by Sethian ("Curvature and the Evolution of Fronts") [3] and on the link between front propagation and hyperbolic conservation laws given by Sethian [4] ("Numerical Methods for Propagating Fronts"). They recast interface motion as a time-dependent Eulerian initial value partial differential equation, and rely on viscosity solutions to the appropriate differential equations to update the position of the front, using an interface velocity that is derived from the relevant physics both on and off the interface. These viscosity solutions are obtained by exploiting schemes from the numerical solution of hyperbolic conservation laws. Level set methods are specifically designed for problems involving topological change, dependence on curvature, formation of singularities, and a host of other issues that often appear in interface propagation techniques. The fundamental idea is easily explained. For simplicity, we focus on a one-dimensional front propagating in two space dimensions. Let Γ(s, t = 0) be a simple closed curve lying in the plane; here, s parameterizes the curve and t = 0 corresponds to the initial time. We embed this curve as the zero level set φ(x, y, t = 0) = 0 of a function φ from R² × [0, ∞) to R. Thus, in R³, φ(x, y, t = 0) is a surface whose zero level set in the xy-plane corresponds to the initial position of the front Γ(s, t = 0). Suppose this curve is propagating in its normal direction with a given speed F. We then seek an initial value partial differential equation for the evolution of the level set function φ(x, y, t) such that at any given time, the zero level set φ(x, y, t) = 0 corresponds to the new position of the front. In Fig. 2, we show an initial level set function for a circle centered at the origin. We now have two goals: first, to determine a good way of choosing the initial level set function φ(x, y, t = 0), and second, to determine an initial value PDE which transports this level set function such that the zero level set always corresponds to the evolving front. For an initial choice, one straightforward option is the signed distance function, namely

φ(x, y, t = 0) = ±d(x, y),    (1)
Figure 2. Initial front and initial level set function.
where d(x, y) is the distance from the point (x, y) to the given initial curve Γ, and the sign is chosen as positive (negative) if (x, y) is outside (inside) Γ. Finally, to derive the initial value PDE, a straightforward application of the multidimensional chain rule [1, 2] produces the initial value PDE

φt + F|∇φ| = 0.    (2)
This is the level set equation. The advantages of this approach are immediately apparent:
• Topological changes occur automatically, since the level set function φ itself is always a graph, regardless of the connectedness of the zero level set corresponding to the interface.
• The determination of the normal and curvature, which may be required to accurately evaluate the speed function F, is easily accomplished through derivative operators applied to the level set function, that is, n = ∇φ/|∇φ| and κ = ∇ · (∇φ/|∇φ|).
• The formulation is unchanged in higher dimensions, leading to a straightforward approach to complex interface motion.
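As a concrete, purely illustrative sketch of these formulas, the following code builds the signed distance function for a circle on a grid and recovers the normal field and curvature by central differencing; for a circle of radius 0.5, the computed κ should approach 1/r ≈ 2 on the front.

```python
import numpy as np

n = 101
x = np.linspace(-1.0, 1.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")

# Signed distance to a circle of radius 0.5: negative inside, positive outside.
phi = np.sqrt(X**2 + Y**2) - 0.5

# n = grad(phi)/|grad(phi)| by central differences; eps avoids division by zero.
phi_x, phi_y = np.gradient(phi, h)
mag = np.sqrt(phi_x**2 + phi_y**2) + 1e-12
nx, ny = phi_x / mag, phi_y / mag

# kappa = div( grad(phi)/|grad(phi)| )
nx_x, _ = np.gradient(nx, h)
_, ny_y = np.gradient(ny, h)
kappa = nx_x + ny_y
```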
4. Numerical Approximations to the Level Set Equation
Equation (2) formulates interface propagation as an initial value PDE in which the speed F in the normal direction is regarded as given. Of course,
in realistic physical situations, the speed function F depends on a host of influences, including front geometry, parabolic and elliptic equations on either side of the interface with subtle boundary conditions, and additional factors. Nonetheless, from an operator splitting perspective, once both the position of the front (that is, the level set function φ) and its normal speed F are known at the beginning of a time step, the next job is to update the level set function itself. The subtlety in providing an accurate and robust discretization of Eq. (2) stems from the fact that the interface (and hence the associated level set function) need not be differentiable. Hence, care must be taken in evaluating the gradient term |∇φ|. Osher and Sethian [2] introduced a high order upwind finite difference scheme for Hamilton–Jacobi equations of this form. The idea (see Ref. [4] for motivation) is to carry the considerable numerical technology built for hyperbolic conservation laws over to Hamilton–Jacobi equations, which in a very rough sense can be thought of as their integrated form. By doing so, viscosity solutions are automatically selected which accurately capture the evolution of corners by considering the limiting process of vanishing viscosity. One of the simplest forms for such a scheme (see Ref. [2]) is given by

$$\phi^{n+1}_{ijk} = \phi^{n}_{ijk} - \Delta t\left[\max(F_{ijk},0)\,\nabla^{+} + \min(F_{ijk},0)\,\nabla^{-}\right], \qquad (3)$$

where

$$\nabla^{+} = \left[\max(D^{-x}_{ijk},0)^2 + \min(D^{+x}_{ijk},0)^2 + \max(D^{-y}_{ijk},0)^2 + \min(D^{+y}_{ijk},0)^2 + \max(D^{-z}_{ijk},0)^2 + \min(D^{+z}_{ijk},0)^2\right]^{1/2},$$

$$\nabla^{-} = \left[\max(D^{+x}_{ijk},0)^2 + \min(D^{-x}_{ijk},0)^2 + \max(D^{+y}_{ijk},0)^2 + \min(D^{-y}_{ijk},0)^2 + \max(D^{+z}_{ijk},0)^2 + \min(D^{-z}_{ijk},0)^2\right]^{1/2}.$$
Here, we have used standard finite difference notation, with D^{±x} the forward and backward difference operators in x (and similarly in y and z). Higher order schemes are available [2]. In this formulation, the entire level set function is updated on a computational mesh throughout the physical domain. This means that all the level sets are updated, not just the zero level set corresponding to the interface itself. The central advance that made these methods computationally competitive with other techniques came through the use of an adaptive "Narrow Band Level Set Method", introduced by Adalsteinsson and Sethian [5] ("A Fast Level Set Method for Propagating Interfaces"); the idea of doing so was first discussed in Ref. [6]. In this approach, one works only in a neighborhood (or "narrow band") of the zero level set, repeatedly rebuilding the signed distance function in an updated narrow band once the front has moved towards the edge of the band.
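A minimal two-dimensional implementation of the upwind update (3) might look as follows (our sketch, with periodic boundaries for brevity; a narrow band implementation would restrict this update to grid points near the zero level set):

```python
import numpy as np

def level_set_step(phi, F, h, dt):
    """One explicit upwind step of phi_t + F |grad phi| = 0 on a 2D grid."""
    # One-sided differences D^{-x}, D^{+x}, D^{-y}, D^{+y} (periodic wrap).
    dmx = (phi - np.roll(phi, 1, axis=0)) / h
    dpx = (np.roll(phi, -1, axis=0) - phi) / h
    dmy = (phi - np.roll(phi, 1, axis=1)) / h
    dpy = (np.roll(phi, -1, axis=1) - phi) / h

    grad_plus = np.sqrt(np.maximum(dmx, 0)**2 + np.minimum(dpx, 0)**2
                        + np.maximum(dmy, 0)**2 + np.minimum(dpy, 0)**2)
    grad_minus = np.sqrt(np.maximum(dpx, 0)**2 + np.minimum(dmx, 0)**2
                         + np.maximum(dpy, 0)**2 + np.minimum(dmy, 0)**2)

    return phi - dt * (np.maximum(F, 0) * grad_plus
                       + np.minimum(F, 0) * grad_minus)
```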
Figure 3. Grid points in the dark area are members of the narrow band.
There is one final issue. In most physical problems, the interface speed is defined on the front itself, yet the above level set formulation requires that a speed function F be defined throughout the narrow band/computational grid. Thus, a mechanism is required to extend the velocity from the front itself throughout the computational region. There are a variety of ways to do so; one is to solve an associated PDE for this extended velocity based on boundary values on the front itself. A particularly fast way, introduced by Adalsteinsson and Sethian ("The Fast Construction of Extension Velocities in Level Set Methods" [7]), comes from using a variant of the fast marching method [8]. For details on narrow band level set methods, upwind schemes for interface evolution, reinitialization techniques and the construction of extension velocities, and general aspects of level set methods for interface propagation, see Refs. [1, 2, 5, 7, 8] (Fig. 3).
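As a rough sketch of the PDE route (ours; this is not the fast marching construction of Ref. [7]), one can relax F toward a steady state of F_τ + sgn(φ) n · ∇F = 0, which propagates front values of F outward along normals so that ∇F · ∇φ ≈ 0; the smoothed sign function and iteration count below are illustrative choices.

```python
import numpy as np

def extend_velocity(F, phi, h, iters=50):
    """Relax F_tau + sgn(phi) n . grad F = 0 toward steady state (2D)."""
    dtau = 0.5 * h
    sgn = phi / np.sqrt(phi**2 + h**2)        # smoothed sign of phi
    px, py = np.gradient(phi, h)
    mag = np.sqrt(px**2 + py**2) + 1e-12
    vx, vy = sgn * px / mag, sgn * py / mag   # advection field sgn(phi) * n
    for _ in range(iters):
        # Upwind differences of F chosen by the sign of the advection speed.
        fmx = (F - np.roll(F, 1, axis=0)) / h
        fpx = (np.roll(F, -1, axis=0) - F) / h
        fmy = (F - np.roll(F, 1, axis=1)) / h
        fpy = (np.roll(F, -1, axis=1) - F) / h
        Fx = np.where(vx > 0, fmx, fpx)
        Fy = np.where(vy > 0, fmy, fpy)
        F = F - dtau * (vx * Fx + vy * Fy)
    return F
```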
5. Application to TCAD: Etching and Deposition in Semiconductor Processing
Narrow band level set methods for semiconductor etching and deposition simulations were introduced by Adalsteinsson and Sethian in a series of papers [9–12]. Details about computing the effects of visibility, reemission and redeposition, complex flux laws, and material-dependent etch rates may be found there. Here, we review a few applications, comparing simulations with experiments that analyze various aspects of surface thin film physics. All the
Figure 4. Ion-milling: experiment (top) vs. simulation (bottom).
simulations are performed using TERRAIN∗, a commercial version of these techniques built by Technology Modeling Associates and specifically designed for process simulation.
5.1. Ion Milling
We begin with a comparison with experiment of an ion-milling process. Figure 4 shows an experiment on the top and a simulation at the bottom. We note that both the simulation and the experiment show the crossing non-convex curves on top of the structures, the sharp points, and the sloping sides.
5.2. Plasma-enhanced Chemical Vapor Deposition
Next, we compare two plasma-enhanced chemical vapor deposition (PECVD) simulations with experiment. First, two smaller-structure calculations are used to verify the ability to match experiment; Figs. 5 and 6 show these results. Figures 7 and 8 show simulations for more complex structures.
5.3. SRAM Simulations
Finally, we show SRAM comparisons between experiment and simulations for large structures (Fig. 9). We show the original layout together with the actual pattern printed through photolithography, followed by the sequential processing steps.

* We thank Juan Rey, Brian Li, and Jiangwei Li for providing these results.
Figure 5. PECVD, small-scale structure: experiment (left) vs. simulation (right).

Figure 6. PECVD, small-scale structure: experiment (left) vs. simulation (right).

Figure 7. PECVD: experiment (top) vs. simulation (bottom).

Figure 8. PECVD: experiment (top) vs. simulation (bottom).
Figure 9. SRAM simulation: the original layout and the pattern actually printed through photolithography, followed by simulation steps one through four.
6. Discussion and Outlook
The use of Eulerian partial-differential-equations-based techniques for tracking moving interfaces carries some intrinsic advantages: topological changes are handled cleanly, geometric properties such as normals and curvature are accurately evaluated, and formulations carry forward to three dimensions with little change. In semiconductor simulations, some of the more challenging issues with this approach include material assignment, complexities at triple points, maintenance of sharp corners and fast evaluations of fluxes. Ultimately, the goal is to couple these simulations with diffusion solvers, stress analysis, and additional physics to make full process simulators. This work is currently underway.
References
[1] J. Sethian, Level Set Methods and Fast Marching Methods, 2nd edn., Cambridge University Press, Cambridge, 1999.
[2] S. Osher and J. Sethian, "Fronts propagating with curvature-dependent speeds: algorithms based on Hamilton–Jacobi formulations," J. Comput. Phys., 79, 12–49, 1988.
[3] J. Sethian, "Curvature and the evolution of fronts," Commun. Math. Phys., 101, 489–499, 1985.
[4] J. Sethian, "Numerical methods for propagating fronts," In: P. Concus and R. Finn (eds.), Variational Methods for Free Surface Interfaces, Springer-Verlag, New York, pp. 66–80, 1987.
[5] D. Adalsteinsson and J. Sethian, "A fast level set method for propagating interfaces," J. Comput. Phys., 118, 269–277, 1995.
[6] D. Chopp, "Computing minimal surfaces via level set curvature flow," J. Comput. Phys., 106, 77–91, 1993.
[7] D. Adalsteinsson and J. Sethian, "The fast construction of extension velocities in level set methods," J. Comput. Phys., 148, 2–22, 1999.
[8] J. Sethian, "Fast marching methods," SIAM Rev., 41, 199–235, 1999.
[9] D. Adalsteinsson and J. Sethian, "A unified level set approach to etching, deposition and lithography. I. Algorithms and two-dimensional simulations," J. Comput. Phys., 120, 128–144, 1995.
[10] D. Adalsteinsson and J. Sethian, "A unified level set approach to etching, deposition and lithography. II. Three-dimensional simulations," J. Comput. Phys., 122, 348–366, 1995.
[11] D. Adalsteinsson and J. Sethian, "A unified level set approach to etching, deposition and lithography. III. Complex simulations and multiple effects," J. Comput. Phys., 138, 193–223, 1997.
[12] J. Sethian and D. Adalsteinsson, "An overview of level set methods for etching, deposition, and lithography development," IEEE Trans. Semicond. Manuf., 10, 167–184, 1997.
4.7 COMPUTING MICROSTRUCTURAL DYNAMICS FOR COMPLEX FLUIDS Michael J. Shelley and Anna-Karin Tornberg Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
1. Introduction
Complex liquids such as polymeric or fiber suspensions are rich in interesting phenomena, and as an area of scientific inquiry the subject sits astride materials science and fluid dynamics. Materials science enters because such fluids are a form of composite material with micro-structure, while fluid dynamics enters because complex liquids are, well, fluidic. Computation plays a large role in their study, not only at the continuum level, which is not a finished business anyway, but also at the microscopic level, where the microstructure can have non-trivial dynamics in response to simple forcing flows.
2. Background
Flows in nature and engineering often acquire their interesting aspects from the presence of immersed elastic objects and their interaction with the fluid. Fish, tree leaves, flagellae, and rigid polymers all come to mind. A very important special case is when the elastic bodies are microscopic and filamentary. For example, flexible fibers make up the micro-structure of suspensions that show strongly non-Newtonian bulk behavior, such as elasticity, shear-thinning, and normal stresses in shear flow [13, 23]; micro-organisms utilize for locomotion the anisotropic drag properties of their long flexible flagella [7]; rigid biopolymers such as actin and tubulin mediate many intracellular division and transport processes, and dynamic rheological measurements are used to probe their biophysical properties [11]. The dynamics of flexible filaments are also relevant to understanding soft materials: liquid crystal phase transitions lead to the study of "soft" growing filaments in a smectic-A phase [26, 33, 34],
while solutions of wormlike micelles have very complicated and interesting macroscopic behavior [6, 19, 23], perhaps due to their susceptibility to shear-induced breakage. In all these problems, the filaments have large aspect ratios (length over radius), ranging from order 10 to a thousand for natural to synthetic fibers, and up to many thousands in biological settings. Clearly we are considering examples where the inertia of both fluid and filament can be neglected, i.e., very low Reynolds number flows, for which the fluid dynamics is described by the Stokes equations. The Stokes equations are linear, and time enters only as a parameter, leading to their celebrated reversibility [2]. However, this reversibility is broken by surface forces such as those induced by bending rigidity, and simple forcing flows can lead to very non-trivial dynamics. Here, we will also neglect the complicating effects of thermal fluctuations of a surrounding solvent, and focus on the cleaner case where both the surrounding fluid and the rod can be described by classical continuum mechanics. As a simple case, consider a plane shear flow. A straight rod placed in this flow will rotate and translate with the fluid. However, as Fig. 1 shows, the dynamics can be very different when the rod is flexible. As the strength of the shear flow increases relative to the bending rigidity of the filament, there is a sharp bifurcation beyond which the filament is unstable to buckling [3, 37], and small shape perturbations can grow into substantial bending of the filament. This stores elastic energy in the fiber, which can later be released back to the system as the fiber is extended, and is related to the anomalous stresses that elastic fluids can develop, such as normal stress differences that push
[Figure 1: filament shapes at t = 0, 48.64, 49.664, and 50.688.]
Figure 1. Upper figure: Substantial buckling occurs for a flexible filament in a plane background shear flow, U0 = (γ̇y, 0, 0). Lower figure: The first normal stress difference plotted as a function of time. For comparison, the dashed line indicates the first normal stress difference for a straight filament. From Tornberg and Shelley [37], with permission.
apart bounding walls in linear shear experiments [23]. The dynamics of the first normal stress difference (N1, the one responsible for the above effect) is also plotted in Fig. 1, as is the normal stress difference for a straight filament (dashed line). The first normal stress difference is zero in the absence of the filament, and is zero in temporal mean for a rigid filament. For a filament that bends, the symmetry of the first normal stress difference that holds for a straight filament is broken, and the integrated normal stress difference now yields a positive net contribution [3, 37]. Thus, the dynamics shows a surprising richness even for a single filament, and for suspensions there is much that is still not well understood. It is worth noting that while experiments capture such sharp changes in fluidic response, continuum theories generally do not. Indeed, here the change lies in the degrees of freedom available to the microscopic fiber. Ultimately, one would like to have a macroscopic model for such suspensions, and eliminate the need for computer simulations that resolve the micro-structure. A greater understanding of these flows is needed in order to develop such models, and this requires both experiments and numerical simulations. While in experiments it is natural to study suspensions of fibers, it is difficult to isolate one or a few fibers and understand their individual dynamics. Numerical simulations offer the ability to obtain data for each fiber in great detail. The main challenge for a numerical method lies in its ability to include many filaments in the simulation, at a reasonable cost, while maintaining accuracy. Indeed, since the details of the micro-structural dynamics are important to the collective behavior of the system, a relatively large number of filaments is likely needed before computed average quantities can be considered representative.
2.1. Numerical Approaches
Given the scales of the problem – many filaments, slenderness, complicated individual dynamics – several approximate methods have been developed. One such approach is the class of so-called bead-models. This class contains a wide variety of approximations, with the common feature being that a flexible fiber is modeled as a chain of linked rigid bodies. For example, Yamamoto and Matsuoka [40] modeled their flexible filaments as chains of N spherical beads. Interactions between the spheres within a fiber are taken into account through the use of a mobility matrix, but to reduce the cost of the computations, interactions with spheres belonging to other fibers are not included. Forces from inter-fiber interactions are instead computed using a lubrication approximation, ignoring far field interactions. Ross and Klingenberg [31] modeled each fiber as a number of linked prolate spheroids. In their study of sheared suspensions of fibers, they neglect hydrodynamic interactions, i.e., the fact that the
fibers have an effect on the fluid flow, and added only a short range repulsive force to keep fibers from overlapping. Very similar approximations have been made by Ning and Melrose [25] and Switzer and Klingenberg [36], who instead use cylinders as building blocks for each fiber. In Ref. [36], in order to reduce the computational cost, only five cylinders were used to model each fiber, thereby restricting the possible modes of bending of the fiber. Joung et al. [21] developed a method for slightly flexible fibers, modeled as chains of spherical beads. In their formulation they include the forces due to interactions among the beads of each fiber, as well as between different fibers. However, they do not solve this full system to obtain the bead velocities that update the positions of the beads. Instead, they first determine the updated end-to-end vector of each fiber, followed by a force and moment calculation to adjust the positions of the individual beads. For the computation of external moments, the fiber is assumed rigid, thereby limiting the validity to cases of only slightly bent fibers. The immersed boundary method [28] has also been applied to this class of problems. In this method, an elastic boundary is discretized with connected Lagrangian markers, and its relative displacements by fluid motion are used to calculate the boundary's elastic responses. These elastic forces are then distributed onto a background grid covering the computational domain, and used as forces acting upon the fluid, thereby modifying the surrounding fluid flow. For example, Stockie [35] used an immersed boundary method (at moderate Reynolds number) to simulate a single "filament" (modeled as an infinitesimally thin elastic boundary) buckling in a two-dimensional linear shear flow. To add a physical width to the fiber, a fiber structure must be constructed from a bundle of intertwined immersed elastic boundaries. Lim and Peskin [24] used such a construction to study the so-called whirling instability [38] of one fiber at low Reynolds number. While this method has the advantage that flows at finite Reynolds numbers can be simulated, being fundamentally grid based, it would be very difficult to use it to simulate a large number of high aspect ratio filaments. As a different starting point, we have recently developed a numerical approach based on a formulation of the problem where we make explicit use both of the Stokes equations and of the slenderness of the fibers [37]. The key points are that for Stokes flow, boundary integral methods can be employed to reduce the three-dimensional dynamics to the dynamics of the two-dimensional filament surfaces [30], and by using slender-body asymptotics, this can be further reduced to the dynamics of the one-dimensional filament centerlines. The resulting integral equations capture the non-local interaction of the fiber with itself, as well as with any other structures within the fluid, such as other fibers. Indeed, the dynamics shown in Fig. 1 are simulated using this formulation, though only for a single filament interacting with a background shear flow.
We have applied this method to simulate a moderate number of high aspect ratio, very flexible filaments in a three-dimensional fluid [37]. Indeed, these are the first simulations to reach such a regime of both high flexibility and aspect ratio. Still, much improvement remains to be made, especially in the treatment of near-interactions of the filaments, where lubrication forces become important, as well as in the efficiency of computing the non-local filament–filament interactions. There are important and fascinating applications for such a set of numerical tools. These include answering very basic questions concerning the development of non-Newtonian stresses in elastic fluids; shedding light on observations of micro-fluidic mixing through a form of low Reynolds number "turbulence" induced by the elastic response of the fluid; and studying the transitions to and nature of collective dynamics in swimming bacteria. Some of these applications lead to special cases of the dynamics that allow the simulations to be performed at very low cost.
3. Mathematical Formulation
Let Ω denote the fluid domain in R³, external to the filament. Consider a Newtonian fluid of viscosity µ, with velocity field u(x) and pressure p(x), where x = (x, y, z) ∈ R³. Assuming that fluid inertia is negligible, u and p satisfy the Stokes equations: ∇p − µΔu = 0 and ∇ · u = 0 in Ω. Let Γ denote the surface of the filament and u_Γ its surface velocity. We impose the no-slip condition on Γ and require that u(x) → U0(x) as x → ∞, where the background velocity U0(x) is also a solution to the Stokes equations. Hence

u = u_Γ on Γ,  and  u → U0 for ‖x‖ → ∞.

In the case of several filaments this can be generalized by considering the union of all filament surfaces, and imposing no-slip conditions thereon. A full boundary integral formulation for this problem would yield integral equations on the surfaces of the filaments relating surface stress and surface velocity [30]. For long, slender filaments, such a formulation would be very expensive to solve numerically. Instead we use the filament slenderness to reduce the integral equations to the filament centerlines.
3.1. Non-local Slender Body Approximation
Consider a slender filament; that is, ε = a/L ≪ 1, where a is the filament radius and L is its length. A non-local slender body approximation can be derived by placing fundamental solutions to the Stokes equations (Stokeslets and doublets) on the filament centerline, then applying the technique of matched asymptotics to derive the approximate equation. Such an approximation was derived by Keller and Rubinow in 1976 [22]. Their derivation yields an integral equation with a modified Stokeslet kernel on the filament centerline and relates the filament forces to the velocity of the centerline. Johnson [20] added a more detailed analysis and a modified formulation that included accurate treatment of the filament's free ends, yielding an equation that is asymptotically accurate to O(ε² log ε) if the filament ends are tapered. Götz [14] gives an integral expression for the fluid velocity U(x) at any point x outside the filament. If there are multiple filaments, their contributions simply add due to the superposition principle of Stokes flow. Denote the filaments by Γ_l, l = 1, . . . , M. Let the centerline of each filament be parameterized by arclength s ∈ [0, L], where L is the non-dimensional length of the filament, and let x_l(s, t) be the coordinates of the filament centerline. In the cases we consider, the arclength s is the material parameter for the filament, so that it is independent of t (i.e., the filament is assumed inextensible). We assume that each filament exerts a force per unit length, f_l(s, t), upon the fluid. For filament Γ_l, we have
$$\bar\mu\left(\frac{\partial \mathbf{x}_l(s,t)}{\partial t} - \mathbf{U}_0(\mathbf{x}_l,t)\right) = -\Lambda_l[\mathbf{f}_l](s) - K_{l,\delta}[\mathbf{f}_l](s) - \sum_{k=1,\,k\neq l}^{M} \mathbf{V}_k(\mathbf{x}_l(s)), \qquad (1)$$

where the sum is over the contributions from all other filaments to the velocity of filament l, and U0(x, t) is the background velocity, if any. Assuming the background flow to be a shear flow of strength γ̇, the problem has been made non-dimensional by using a typical filament length L̃, the flow time-scale γ̇⁻¹, and the force F = E/L̃², where E is the rigidity of the fiber. The non-dimensional parameters are the effective viscosity µ̄ = 8πµγ̇L̃²/(E/L̃²), representing a ratio between characteristic fluid drag and the filament elastic force, and the asymptotic parameter c = log(ε²e), where the radius of the filament is r(s) = 2ε√(s(L − s)) [20]. The local operator Λ_l is given by

$$\Lambda_l[\mathbf{f}](s) = \left[-c\left(\mathbf{I} + \hat{\mathbf{s}}_l(s)\hat{\mathbf{s}}_l(s)\right) + 2\left(\mathbf{I} - \hat{\mathbf{s}}_l(s)\hat{\mathbf{s}}_l(s)\right)\right]\mathbf{f}(s), \qquad (2)$$
and the integral operator K_{l,δ}[f](s) by

$$K_{l,\delta}[\mathbf{f}](s) = \int_{\Gamma_l}\left[\frac{\mathbf{I} + \hat{\mathbf{R}}(s,s')\hat{\mathbf{R}}(s,s')}{\sqrt{|\mathbf{R}(s,s')|^2 + \delta(s)^2}}\,\mathbf{f}(s') - \frac{\mathbf{I} + \hat{\mathbf{s}}_l(s)\hat{\mathbf{s}}_l(s)}{\sqrt{|s-s'|^2 + \delta(s)^2}}\,\mathbf{f}(s)\right]ds'. \qquad (3)$$
Here, R(s, s′) = x(s) − x(s′); R̂R̂ and ŝŝ are dyadic products, with R̂ the normalized R vector and ŝ the unit tangent vector. Note that these two operators depend on the shape of the filament (given by x_l(s, t)). In the original slender-body formulations [14, 20, 22], the regularization parameter δ in Eq. (3) is zero. An analysis of the straight filament case shows that these original slender-body formulations are not suitable for numerical computations, due to high wave number instabilities at length-scales not accurately described by slender-body theory [34, 37]. As we discuss in Ref. [37], the regularization introduced here can remove this instability while retaining the same asymptotic accuracy as the original formulation of Johnson. The Stokeslet and doublet contributions from the other filaments are given by
$$\mathbf{V}_k(\bar{\mathbf{x}}) = \int_{\Gamma_k}\frac{\mathbf{I} + \hat{\mathbf{R}}_k(s')\hat{\mathbf{R}}_k(s')}{|\mathbf{R}_k(s')|}\,\mathbf{f}_k(s')\,ds' + \frac{\varepsilon^2}{2}\int_{\Gamma_k}\frac{\mathbf{I} - 3\hat{\mathbf{R}}_k(s')\hat{\mathbf{R}}_k(s')}{|\mathbf{R}_k(s')|^3}\,\mathbf{f}_k(s')\,ds', \qquad (4)$$

where R_k(s′) = x̄ − x_k(s′).
3.2. Force Definition
The integral equation (1) relates the velocity of filament l to the forces acting upon the filament, as well as to the forces acting upon the other filaments. Here we assume that filament forces are described by Euler–Bernoulli elasticity [32], and for a filament given by x(s) the non-dimensional force (per unit length) is given by

$$\mathbf{f}(s) = -(T(s)\,\mathbf{x}_s)_s + \mathbf{x}_{ssss}, \qquad (5)$$

where derivatives with respect to arclength are denoted by a subscript s. The first term in Eq. (5) is the filament tensile force, with T the tension, that resists compression and extension. The second term represents bending forces. Twist elasticity is neglected [12]. The ends of the filament are considered "free," that is, no forces or moments are exerted upon them, so that x_ss|_{s=0,1} = x_sss|_{s=0,1} = 0 and T|_{s=0,1} = 0.
3.3. Completing the Formulation
Now, consider the assumption of inextensibility. This determines the line tension T. Since the filament is inextensible, s remains a material parameter, and thus s and t derivatives can always be interchanged. Writing ∂_t x(s, t) = U(s, t), we have

$$\partial_t(\mathbf{x}_s \cdot \mathbf{x}_s) = 0 \;\Rightarrow\; \mathbf{x}_s \cdot \mathbf{x}_{ts} = \mathbf{x}_s \cdot \mathbf{U}_s = 0. \qquad (6)$$
This condition can be combined with Eq. (1) to derive an integro-differential equation for the line tension. With this, the integro-differential equation for the line tension T_l(s) for filament l, l = 1, . . . , M, is of the form

$$L_s[T_l, \mathbf{x}_l] = J[\mathbf{x}_l, \mathbf{U}_0] - \sum_{k=1,\,k\neq l}^{M} (\mathbf{x}_l)_s \cdot \frac{\partial}{\partial s}\mathbf{V}_k(\mathbf{x}_l(s)), \qquad (7)$$
where

$$L_s[T, \mathbf{x}] = 2c\,T_{ss} + (2 - c)(\mathbf{x}_{ss}\cdot\mathbf{x}_{ss})\,T - \mathbf{x}_s \cdot \frac{\partial}{\partial s}K_\delta[(T\mathbf{x}_s)_s],$$

$$J[\mathbf{x}, \mathbf{U}_0] = \bar\mu\,\mathbf{x}_s \cdot \frac{\partial \mathbf{U}_0}{\partial s} + (2 - 7c)(\mathbf{x}_{ss}\cdot\mathbf{x}_{sss}) - 6c\,(\mathbf{x}_{sss}\cdot\mathbf{x}_{sss}) - \mathbf{x}_s \cdot \frac{\partial}{\partial s}K_\delta[\mathbf{x}_{ssss}] - \bar\mu\beta(1 - \mathbf{x}_s\cdot\mathbf{x}_s). \qquad (8)$$
Equation (7) is solved together with the boundary conditions T = 0 at s = 0, 1. Here we have simplified the expressions for L_s and J using a ladder of differential identities, derived by successive differentiations of x_s · x_s = 1 [37]. The line tension T(s) acts as a Lagrange multiplier, constraining the motion of the filament to obey the inextensibility condition. However, the equation for T(s) was derived assuming that the filament is of exactly the correct length, so that x_s · x_s = 1 for all s. If a small length error is present, it will not be corrected; on the contrary, the computed line tension can, depending on the configuration, even act so as to increase this error. We stabilize this constraint by replacing the inextensibility condition in (6) with (1/2)∂_t(x_s · x_s) = x_s · x_ts = µ̄β(1 − x_s · x_s), which is equivalent to the original condition when x_s · x_s = 1, and which acts to dynamically remove length errors if they are present (β is the penalization parameter, typically set to be of order O(10)). In summary, once the tension is found for each filament Γ_l, the filament velocities are completely specified and the system can be stepped forward. Above, we seem to have suggested that each filament tension can be found independently of the others. This is not so; through its dependence on filament force, the operator J depends upon the tensions of all the other filaments Γ_k, k ≠ l.
4. Numerical Methods
Thus, our system of flexible fibers is evolved by a large coupled set of integro-differential equations. There are several interesting aspects to the construction of accurate and efficient methods for its numerical solution, including methods of implicit time-stepping and the evaluation of nearly singular integrals (which is usually more difficult than evaluating singular ones).
4.1. Temporal Discretization
An explicit treatment of all terms in the time-dependent equation (1) would yield a very strict fourth-order stability constraint upon the time-step Δt. This arises basically from the large number of derivatives in the bending term of the force. To avoid this, we treat all occurrences of x_ssss implicitly, and combine this with a second-order backward differentiation formula [1]. Schematically, we write

$$\mathbf{x}_t = F(\mathbf{x}, \mathbf{x}_{ssss}) + G(\mathbf{x}), \qquad (9)$$
where x(s, t) are the coordinates of filament number l, and where the dependence on U0 and x_k, k ≠ l, is not explicitly indicated. Here, x_ssss is treated implicitly, and all other terms are treated explicitly. We approximate this decomposition by

$$\frac{1}{2\Delta t}\left(3\mathbf{x}^{n+1} - 4\mathbf{x}^{n} + \mathbf{x}^{n-1}\right) = F(2\mathbf{x}^{n} - \mathbf{x}^{n-1}, \mathbf{x}^{n+1}_{ssss}) + 2G(\mathbf{x}^{n}) - G(\mathbf{x}^{n-1}), \qquad (10)$$

where t^n = nΔt. We find that this scheme yields only a first-order constraint on Δt (i.e., proportional to the spatial grid size). The dynamics of multiple filaments are coupled to each other through the summation in Eq. (1). We treat this coupling term explicitly, that is, as part of G(x) in Eq. (10). In the resulting linear system for x_l^{n+1}(s), l = 1, . . . , M, the contribution from the other filaments therefore appears on the right-hand side, and so the big system decouples into separate linear systems for each x_l^{n+1}(s). In doing this, we are essentially using the fact that the interaction terms are smoothing operators, and hence any high-order terms that they contain do not contribute high-order stability constraints on the time-step [18]. The equation for the line tensions T_l(s), l = 1, . . . , M, is given in (7). This is a system of coupled integro-differential equations for the corresponding line tensions that must be solved at every time-step. To avoid solving one very large linear system for the line tensions on all the filaments, we introduce a fixed point iteration, in which we use the newest updates of the T_k
available (k ≠ l) when computing T_l(s). We find that this fixed point iteration typically converges rapidly, which again relies on the fact that the interaction terms are smoothing operators on the tensions.
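The following stripped-down sketch (ours; essentially all physics except the stiff bending term has been dropped, and periodic wrap replaces the free-end stencils) shows the shape of the BDF2/implicit-bending update: each time-step reduces to one banded linear solve per filament.

```python
import numpy as np

N, h, dt, mu = 100, 1.0 / 100, 1.0e-3, 1.0

# Second-order 4th-derivative matrix (dense here for brevity, periodic wrap;
# the real scheme uses one-sided stencils and free-end conditions instead).
D4 = np.zeros((N, N))
for j in range(N):
    for off, w in zip((-2, -1, 0, 1, 2), (1.0, -4.0, 6.0, -4.0, 1.0)):
        D4[j, (j + off) % N] += w / h**4

# BDF2 with implicit bending for the model equation mu x_t = -x_ssss:
# (3x^{n+1} - 4x^n + x^{n-1})/(2 dt) = -(1/mu) D4 x^{n+1}
A = 1.5 / dt * np.eye(N) + D4 / mu

def bdf2_step(x_now, x_old):
    rhs = (4.0 * x_now - x_old) / (2.0 * dt)
    return np.linalg.solve(A, rhs)

# Startup: a single backward-Euler-like step, e.g. x1 = bdf2_step(x0, x0).
```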
4.2. Spatial Discretization
The filament centerlines are discretized uniformly in arclength s, with N intervals of step size h = 1/N. The discrete points are denoted s_j = jh, j = 0, . . . , N, with values f_j = f(s_j). Second-order divided differences are used to approximate spatial derivatives: D^(p) denotes a divided difference operator such that D^(p) f_j approximates f^(p)(s_j) to an O(h²) error. Standard centered operators are used whenever possible, but at the boundaries one-sided (skew) operators are applied. For the integral operator K in Eq. (3), both terms in the integrand are singular at s′ = s for δ = 0, and the integral is only well defined for the difference of these two terms. For the regularized operator, the terms are still nearly singular, and the numerical scheme must be designed with care to accurately treat their difference. To do this, we subtract a term from the first part of the integral, add the same term to the second part, and write the integral operator (3) as

$$K_\delta[\mathbf{g}](s) = \int_0^1 \frac{G(s,s')\,\mathbf{g}(s')}{\sqrt{(s-s')^2 + \delta(s)^2}}\,ds' + (\mathbf{I} + \hat{\mathbf{s}}\hat{\mathbf{s}})\int_0^1 \frac{\mathbf{g}(s') - \mathbf{g}(s)}{\sqrt{(s-s')^2 + \delta(s)^2}}\,ds', \qquad (11)$$
where G(s, s′) is given by

$$G(s,s') = \frac{\sqrt{(s-s')^2 + \delta(s)^2}}{\sqrt{|\mathbf{R}|^2 + \delta(s)^2}}\,(\mathbf{I} + \hat{\mathbf{R}}\hat{\mathbf{R}}) - (\mathbf{I} + \hat{\mathbf{s}}\hat{\mathbf{s}}). \qquad (12)$$
We then treat each part separately, approximating the argument of the operator, as well as G(s, s′), by piecewise polynomials. These are all smooth, well-behaved functions. In the end, we need to evaluate integrals of the form

$$\int_{s_j}^{s_{j+1}} \frac{(s'-s_j)^p}{\sqrt{(s-s')^2 + \delta(s)^2}}\,ds' = \int_0^h \frac{\alpha^p}{\sqrt{\alpha^2 + b\alpha + c + \delta(s)^2}}\,d\alpha,$$

where b = 2(s_j − s), c = (s_j − s)², and p = 0, . . . , 4. These integrals have analytical formulas, becoming somewhat lengthy as p increases. By evaluating
these integrals analytically, the rapidly changing part where s′ is close to s can be treated exactly. In the line tension equation (7), terms like x_s · ∂/∂s(K_δ[g]) appear. These differentiated integral terms are approximated to second order by

$$\left.\frac{\partial}{\partial s}K_\delta[\mathbf{g}](s)\right|_{s=s_j} \approx \frac{1}{h}\left[K_\delta[\mathbf{g}](s_{j+1/2}) - K_\delta[\mathbf{g}](s_{j-1/2})\right]. \qquad (13)$$
This compact centered approximation of the derivative is important to achieve a stable numerical approximation of the line tension equation.
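As a small sanity check (ours, with illustrative parameter values), the p = 0 and p = 1 antiderivatives can be verified against brute-force quadrature; for p = 0 the antiderivative is log(2√Q + 2α + b) with Q = α² + bα + c + δ², and the p = 1 case follows by writing α = (2α + b)/2 − b/2.

```python
import numpy as np

def I0(h, b, C):
    # Antiderivative of 1/sqrt(Q), Q = a^2 + b*a + C: log(2 sqrt(Q) + 2a + b)
    Q = lambda a: a * a + b * a + C
    anti = lambda a: np.log(2.0 * np.sqrt(Q(a)) + 2.0 * a + b)
    return anti(h) - anti(0.0)

def I1(h, b, C):
    # int a/sqrt(Q) da = sqrt(Q) - (b/2) * int 1/sqrt(Q) da
    Q = lambda a: a * a + b * a + C
    return np.sqrt(Q(h)) - np.sqrt(Q(0.0)) - 0.5 * b * I0(h, b, C)

h, b, C = 0.01, -0.004, 1.0e-4        # C = c + delta^2 > 0 keeps Q positive
a = np.linspace(0.0, h, 200001)
for p, exact in ((0, I0(h, b, C)), (1, I1(h, b, C))):
    approx = np.trapz(a**p / np.sqrt(a * a + b * a + C), a)
    print(p, exact, approx)            # the two columns should agree closely
```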
5. The Microstructural Dynamics of Suspensions
This suite of numerical methods is presently at the leading edge for simulating flows of high aspect ratio, very flexible filaments, and forms the basis for approaching a set of very interesting scientific problems. Figures 2 and 3 show a simulation of 25 filaments of equal and unit length evolving and interacting in a background oscillatory shear flow. Such dynamic background flows are used to extract quantities such as the storage and loss moduli of elastic fluids [23]. Here, periodic boundary conditions are imposed in the streamwise (x) direction, with period twice the filament length, and with viscosity µ̄ = 1.5 × 10⁵ and aspect ratio ε = 10⁻². We have used N = 100 points to discretize each filament, and evolve with time-step Δt = 0.0128. The background shear flow is given by U0 = (sin(2πωt)y, 0, 0), where ω = (2000Δt)⁻¹, so that one period is 2000 time-steps. The simulation is run for five periods.
Figure 2. Filament configurations at t = 0.0 and 32. The velocity profile of the background shear flow is indicated at the bottom. From Tornberg and Shelley [37], with permission.
Figure 3. As in Fig. 2, but at t = 38 and 49. From Tornberg and Shelley [37], with permission.
As seen in the left plot of Fig. 2, all the filaments are initially straight. In simulations of a single filament, a small perturbation must be introduced to excite buckling; in this case, the non-local filament interactions are sufficient for this purpose. The right plot of Fig. 2 shows the filament configuration at t = 32, while Fig. 3 shows the configurations at t = 38 and 49. These three last plots all lie within the second period of the background shearing. In each plot, the tension T in each filament is used to color its surface, and the effects of compression and extension are clear; in particular, bending is typically associated with compression. Between t = 32 and 38 the background shear has slowed down and changed direction, and is slowly picking up again. This induces the strong bending of many filaments seen at t = 49. Figure 4 shows the evolution of the total elastic energy of the filament system, E_el = ∫ ds (x_ss · x_ss), as well as N1, the first normal stress difference. As with the single filament simulations in a uniform shear [3, 37], we find a net positive first normal stress difference, though here with a contribution on each period of the forcing. However, examination of the elastic energy makes it particularly clear that the system is not in an "equilibrium" with averaged dynamics the same on each half-period. This suggests either that the simulations have not yet run long enough to remove dependencies on the initial configuration, or that the number of filaments is still too small to give a good distribution of positions and orientations, or both. Indeed, one of the goals of our project is to make our methods much more efficient so that a larger number of filaments can be simulated, as discussed below. If we were able to perform much larger simulations, and to compute quantities of interest such as filament concentration and orientations, elastic energy, viscosity of the suspension, and normal stress differences, the impact on the development of macroscopic models would be substantial. Also, within this type of simulation one can study the details of filament interactions directly,
Figure 4. The upper plot shows the evolution of the elastic energy E_el for the 25 filament simulation. The lower plot shows the evolution of the first normal stress difference N1. From Tornberg and Shelley [37], with permission.
and relate macroscopic stress development directly to the geometrical configuration of the filaments, and to their own internal stresses. We also intend to investigate dependencies on parameters such as bending rigidity, fiber slenderness, and fiber concentration, and responses to prototypical background flows, from shearing to extensional. Numerical simulations also constitute a great complement to laboratory experiments, since parameters and initial settings are easily varied and exactly controlled.
5.1. Outlook
5.1.1. Numerical methods

Simulating denser suspensions, while maintaining accuracy, is a real numerical challenge. There are two primary issues: the treatment of near-range filament–filament interactions, and the ability to include a large number of filaments in the simulation. On the first, as two filaments come within very close proximity of each other – on the order of a filament radius – lubrication forces become important, and these are not well captured by slender-body theory. Shelley and Ueda [34] avoided the issue by reformulating the slender-body dynamics so that close approach induces a nearly singular response that keeps the filaments separated. In their simulations of dense suspensions of straight
rods, Butler and Shaqfeh [5] used a version of slender-body theory, and modeled lubrication forces by approximations of flow between two rigid rods (though their slender-body approximation was incomplete in describing the self-induced dynamics of the fiber). A more general strategy would interpolate between slender-body theory, used to describe regions away from close approach, and a full, but local, boundary integral formulation describing the local flow field in the region of close approach between two filaments. New fast summation strategies are also being developed for computing the filament–filament interactions. Very recently, Zorin and his collaborators at the Courant Institute have developed highly efficient "kernel-free" fast multipole methods whose application here would yield a cost of O(MN) [4, 10, 29]. They already have a production code for fast Stokeslet summation, and we are collaborating with them on applying this code to our problem.
5.1.2. Other dynamical problems in fiber-like flows

To some extent, the slender-body formulation that we have developed follows the earlier work of Shelley and Ueda [34] (though with considerable improvement and generalization). There, they were specifically interested in modeling the filamentary dynamics of a phase transition observed in an undercooled smectic-A liquid crystal sample, which was posited as the substrate upon which new industrial fibers and materials might be synthesized [26]. The left figure of Fig. 5 shows a snapshot from the phase transition experiments of [26], in which a microscopic filament was demonstrated to grow exponentially in time. Here the pattern grows rapidly outwards, becoming "space-filling", because the liquid crystal sample is confined to a narrow gap between two microscope coverslips, thus constraining the dynamics. The right figure shows the result of a numerical simulation of Shelley and Ueda [34] of this process. The system is an elastic filament in a Stokesian fluid, whose axial tension is determined by the constraint of specified length growth. This growth leads to a buckling instability with a critical length-scale, and the hydrodynamical interactions of the filament with its disparate parts – mediated through a non-local slender-body theory – lead to the resulting dynamical patterns. E and Palffy-Muhoray [9] have also elucidated many of the thermodynamical aspects of the phase transition. A recently discovered and very interesting phenomenon involving fluids with elastic response is that of "elastic turbulence", which has important applications to fluid mixing in small devices at low Reynolds numbers. In experiments, Groisman and Steinberg [16] have shown that by adding a small amount of high-molecular-weight polymer to a viscous fluid, the flow in a cylindrical cup with a rotating top plate can become very irregular at low Reynolds numbers, with the fluid motion excited over a wide
[Figure 5, right panels: simulation snapshots at t = 1.05, 1.65, 2.25, and 2.85.]
Figure 5. Left: The "space-filling" pattern made by a thin filament of smectic-A material (whose molecular layers are presumably in a hedgehog arrangement about the centerline) as isotropic material permeates into the filament, causing it to grow in length and buckle. From Ref. [26], shown with permission. Right: A simulation of this growth process, based upon a non-local slender-body theory. Here the non-local hydrodynamic interactions cause the pattern to push outwards as the filament length grows exponentially in time, as in experiment. From Ref. [34], shown with permission.
range of spatial and temporal scales. The flow shows many of the features of developed turbulence. In Ref. [17], the same authors performed an experiment in which they studied the mixing of very viscous liquids in a curved channel as polymers were added to the liquids. They found that at sufficiently high flow rates, the combination of elastic response with curved channel walls led to elastic instabilities and an irregular flow with strongly enhanced mixing. Roughly speaking, the elastic response of the fluid supplies the requisite nonlinearity and extra time-scales to create very complicated flow patterns, even at very low Reynolds number. Very recently, Groisman et al. have demonstrated how the non-linearities of elastic fluids can be used to create micro-fluidic logical devices [15]. Also relevant to these systems is the consistent inclusion of thermal fluctuations in such continuum-based models. There has been some recent work in this direction: Montesi, Pasquali, and Wiggins (private communication) have incorporated a model of such fluctuations in a local drag model of an elastic filament in a shear flow.

Many problems in biological settings involve elastic filaments immersed in fluids. The flagella utilized by micro-organisms for locomotion [7] are very slender and flexible. A flagellum is attached to the body at one end and actively driven to perform a swimming stroke; its flexibility is very important in reducing the drag in the backstroke. The internal structure of a cross-section of cilia and flagella is not symmetric, making it easier for the cilia/flagella to
bend in one direction than the other, hence providing a preferred direction for the cilia/flagella to bend. It would be interesting to investigate models such as ours in which the filaments have preferred bending directions. Related to this are the dynamics of active suspensions, such as bacterial ones, where the microstructural elements are self-propelled, i.e., swimming and eating bugs [27]. These active elements react to concentration gradients (say, of oxygen), but are also carried along in the macroscopic fluid motions induced by all of the other locomoting bacteria. The resulting bacterial flow structures can be very complicated, and quite "turbulent" in appearance. It has been shown that the self-propelled flows of dense bacterial suspensions can be dominated by vortices and jets, each of which is made up of many individual bacteria ([39]; also, unpublished observations of Dombrowski et al. [8]). This is motivation for developing simulation methods for many-particle suspensions of self-locomoting straight fibers. This could be done with flexible fibers of high rigidity, but by exploiting the fact that the fibers are straight, the computational cost can be made far lower. Here, the approach is to determine a filament force that contains a motive part, as well as a part that constantly readjusts so as to be consistent with the filament being rigid (see the sketch below). This yields a mathematical description similar to that of Butler and Shaqfeh [5] for simulating the suspension dynamics of rigid rods.
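A minimal sketch of that idea for a single straight fiber, under simplifying assumptions that are ours, not the authors': the hydrodynamics is reduced to local anisotropic (resistive) drag instead of the full slender-body interactions, and the rigidity constraint is imposed by solving the force and torque balance for a rigid-body motion, after which the constraint force is whatever closes the local balance.

```python
import numpy as np

# Straight rod of length L in 2D, discretized along arclength s.
L, n = 1.0, 50
s = np.linspace(-L / 2, L / 2, n)
ds = s[1] - s[0]
t_hat, n_hat = np.array([1.0, 0.0]), np.array([0.0, 1.0])
zeta_t, zeta_n = 1.0, 2.0                    # tangential/normal drag coefficients

# Prescribed motive force density (force/length): uniform tangential thrust.
f_motive = np.outer(np.ones(n), 0.3 * t_hat)

# Rigid motion u(s) = U + omega * s * n_hat; force/torque balance with drag.
F = f_motive.sum(axis=0) * ds                # net motive force
T = (s * (f_motive @ n_hat)).sum() * ds      # net motive torque about the center
U_t, U_n = F @ t_hat / (zeta_t * L), F @ n_hat / (zeta_n * L)
omega = T / (zeta_n * (s**2).sum() * ds)     # rotational drag = zeta_n * int s^2 ds

# Constraint force density: readjusts so the filament moves rigidly.
u_rigid = U_t * t_hat + np.outer(U_n + omega * s, n_hat)
drag = (zeta_t * np.outer(u_rigid @ t_hat, t_hat)
        + zeta_n * np.outer(u_rigid @ n_hat, n_hat))
f_constraint = drag - f_motive
print("translation:", (U_t, U_n), "rotation rate:", omega)
```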
References
[1] K.E. Atkinson, An Introduction to Numerical Analysis, Wiley, New York, 1989.
[2] G.K. Batchelor, An Introduction to Fluid Dynamics, Cambridge University Press, Cambridge, 1967.
[3] L. Becker and M. Shelley, "The instability of elastic filaments in shear flow yields first normal stress differences," Phys. Rev. Lett., 87, 198301, 2001.
[4] G. Biros, L. Ying, and D. Zorin, "A kernel-independent adaptive fast multipole method in two and three dimensions," J. Comput. Phys., 196, 591–626, 2004.
[5] J.E. Butler and E.S.G. Shaqfeh, "Dynamic simulation of the inhomogeneous sedimentation of rigid fibers," J. Fluid Mech., 468, 205–237, 2002.
[6] M.E. Cates and S.J. Candau, "Statics and dynamics of worm-like surfactant micelles," J. Phys.: Condens. Matter, 2, 6869–6892, 1990.
[7] S. Childress, Mechanics of Swimming and Flying, Cambridge University Press, Cambridge, 1981.
[8] C. Dombrowski, L. Cisneros, S. Chatkaew, R. Goldstein, and J. Kessler, "Self-concentration and large-scale coherence in bacterial dynamics," Preprint, 2003.
[9] W. E and P. Palffy-Muhoray, "Dynamics of filaments during the isotropic–smectic A phase transition," J. Nonlinear Sci., 9, 417–437, 1999.
[10] Z. Gimbutas and V. Rokhlin, "A generalized fast multipole method for nonoscillatory kernels," SIAM J. Sci. Comput., 24, 796–817, 2002.
[11] T. Gisler and D.A. Weitz, "Scaling of the microrheology of semidilute F-actin solutions," Phys. Rev. Lett., 82, 1606–1609, 1999.
[12] R. Goldstein, T. Powers, and C. Wiggins, "Viscous nonlinear dynamics of twist and writhe," Phys. Rev. Lett., 80, 5232, 1998.
[13] S. Goto, H. Nagazono, and H. Kato, "Polymer solutions, 1. Mechanical properties," Rheol. Acta, 25, 119–129, 1986.
[14] T. Götz, Interactions of Fibers and Flow: Asymptotics, Theory and Numerics, PhD thesis, University of Kaiserslautern, Germany, 2000.
[15] A. Groisman, M. Enzelberger, and S. Quake, "Microfluidic memory and control devices," Science, 300, 955–958, 2003.
[16] A. Groisman and V. Steinberg, "Elastic turbulence in a polymer solution flow," Nature, 405, 53, 2000.
[17] A. Groisman and V. Steinberg, "Efficient mixing at low Reynolds numbers using polymer additives," Nature, 410, 905, 2001.
[18] T. Hou, J. Lowengrub, and M. Shelley, "Long-time evolution of vortex sheets with surface tension," Phys. Fluids, 9, 1933, 1997.
[19] A. Jayaraman and A. Belmonte, "Oscillations of a solid sphere falling through a wormlike micellar fluid," Phys. Rev. E, 065301, 2003.
[20] R.E. Johnson, "An improved slender-body theory for Stokes flow," J. Fluid Mech., 99, 411–431, 1980.
[21] C.G. Joung, N. Phan-Thien, and X. Fan, "Direct simulation of flexible fibers," J. Non-Newtonian Fluid Mech., 99, 1–36, 2001.
[22] J. Keller and S. Rubinow, "Slender-body theory for slow viscous flow," J. Fluid Mech., 75, 705–714, 1976.
[23] R.G. Larson, The Structure and Rheology of Complex Fluids, Oxford University Press, Oxford, 1998.
[24] S. Lim and C.S. Peskin, "Simulations of the whirling instability by the immersed boundary method," SIAM J. Sci. Comput., 25, 2066–2083, 2004.
[25] Z. Ning and J.R. Melrose, "A numerical model for simulating mechanical behavior of flexible filaments," J. Chem. Phys., 111, 10717–10726, 1999.
[26] P. Palffy-Muhoray, B. Bergersen, H. Lin, R. Meyer, and Z. Racz, "Filaments in liquid crystals: structure and dynamics," In: S. Kai (ed.), Pattern Formation in Complex Dissipative Systems, World Scientific, Singapore, 1991.
[27] T. Pedley and J. Kessler, "Hydrodynamic phenomena in suspensions of swimming microorganisms," Annu. Rev. Fluid Mech., 24, 313, 1992.
[28] C.S. Peskin, "The immersed boundary method," Acta Numer., 11, 479–517, 2002.
[29] J. Phillips and J. White, "A precorrected-FFT method for electrostatic analysis of complicated 3D structures," IEEE Trans. Comput.-Aid. Des. Integrat. Circuits Syst., 16, 1059–1072, 1997.
[30] C. Pozrikidis, Boundary Integral and Singularity Methods for Linearized Viscous Flow, Cambridge University Press, Cambridge, 1992.
[31] R.F. Ross and D.J. Klingenberg, "Dynamic simulation of flexible fibers," J. Chem. Phys., 106, 2949–2960, 1997.
[32] L. Segel, Mathematics Applied to Continuum Mechanics, MacMillan, New York, 1977.
[33] M. Shelley and T. Ueda, "The nonlocal dynamics of stretching, buckling filaments," In: D. Papageorgiou and Y. Renardy (eds.), Multi-Fluid Flows and Instabilities, AMS-SIAM, Philadelphia, 1996.
[34] M.J. Shelley and T. Ueda, "The Stokesian hydrodynamics of flexing, stretching filaments," Physica D, 146, 221–245, 2000.
[35] J.M. Stockie, "Simulating the motion of flexible pulp fibres using the immersed boundary method," J. Comput. Phys., 147, 147–165, 1998.
[36] L.H. Switzer and D.J. Klingenberg, "Rheology of sheared flexible fiber suspensions via fiber-level simulations," J. Rheol., 47, 759–778, 2003.
[37] A.K. Tornberg and M.J. Shelley, "Simulating the dynamics and interactions of flexible fibers in Stokes flow," J. Comput. Phys., 196, 8–40, 2004.
[38] C. Wolgemuth, T. Powers, and R. Goldstein, "Twirling and whirling: viscous dynamics of rotating elastic filaments," Phys. Rev. Lett., 84, 1623, 2000.
[39] X.-L. Wu and A. Libchaber, "Quasi-two-dimensional bacterial bath," Phys. Rev. Lett., 84, 3017, 2000.
[40] S. Yamamoto and T. Matsuoka, "Dynamic simulations of fiber suspensions in shear flow," J. Chem. Phys., 102, 2254–2260, 1995.
4.8 CONTINUUM DESCRIPTIONS OF CRYSTAL SURFACE EVOLUTION
Howard A. Stone¹ and Dionisios Margetis²
¹Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
²Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
1. Morphological Evolution of Crystalline Materials
It is well known that liquid surfaces evolve in shape due to the effect of surface tension, which drives configurations towards lower energy. The breakup of an initially cylindrical fluid thread into spherical droplets, first quantified experimentally by Plateau and analyzed by Rayleigh, is a popular and illustrative example. Solid surfaces, in particular surfaces of crystals, also evolve according to the analogous principle of minimizing their surface energy. The evolution in this case, however, is more complicated to describe physically and mathematically than the analogous phenomena for fluid interfaces, because there is a richer variety of competing mechanisms available for the solid to change its shape. In addition, a solid supports strain, which leads to the surface energy depending on the slope of the crystal surface. In this article we summarize the basic physical ideas that underlie crystal surface evolution, introduce continuum descriptions in terms of continuum thermodynamics and partial differential equations (PDEs), and provide solutions to some analytically tractable prototypical problems. A strong motivation for studying how crystal surfaces evolve is the need to better understand and harness properties of solid structures and electronic devices at the nanoscale. In most experimental, and technologically relevant, situations, such structures are not in thermodynamic equilibrium, and decay with a lifetime that varies appreciably with the temperature T and, most importantly, scales as an integer power of the feature size; thus, smaller structures decay faster. Hence, there is a need for the quantitative understanding of the factors that affect surface evolution, such as formation and growth of islands
and atom clusters. Other problems are related to crystal dissolution, the effects of catalysts, and surface functionalization (e.g., using physical chemistry techniques).
1.1. Mechanisms of Surface Evolution of Crystalline Materials
There are at least four primary mechanisms by which solid surfaces evolve: (i) Evaporation–condensation processes, whereby atoms leave the surface or deposit on the surface from above; these processes are driven by differences between the chemical potential of the surface and the adjacent bulk phases (solid or vapor). (ii) Surface diffusion, whereby movable atoms or point defects ("adatoms") perform random walks (Brownian motion) along the surface. (iii) Strain-driven rearrangements of atoms in the bulk of the material. (iv) Atomic motion driven by external electric fields, a phenomenon referred to as electromigration. Here we focus on mechanisms (i) and (ii). These mechanisms, and especially their effects on the macroscopic surface features as well as their quantitative description, depend on the temperature and the surface orientation. There are two distinct temperature regimes that mark different macroscopic behaviors of surfaces both at and away from equilibrium; these regimes are separated by the orientation-dependent roughening transition temperature TR. For any fixed T, continuously curved portions of the surface are characterized by a roughening transition temperature TR < T, whereas macroscopically flat regions of the surface known as "facets" have TR > T [1]. Below TR the surface consists of distinct steps bounding terraces whose size can vary from a few nanometers up to a few microns, as shown in Fig. 1a, which also illustrates kinks, clusters of atoms, and voids. The increase
Figure 1. (a) A scanning tunneling microscopy (STM) image of a stepped Si(001) surface, which illustrates kinks, voids, atom clusters, a step, and a terrace (from B.S. Swartzentruber's website, Sandia National Laboratories). (b) The contrast between the shape of crystal surfaces above and below the roughening transition: for T < TR the equilibrium crystal shape has stable facets, while for T > TR the surface is continuously rounded with no facets present (Fig. 7 from Ref. [3]). (c) The notation used for keeping track of the position of different steps with position xn(t).
of the temperature above TR causes the terraces to shrink as steps are spontaneously created everywhere, and the surface appears "rough," as shown in Fig. 1b. Accordingly, below TR the evolution is caused by the lateral motion of the atomic steps and is in principle more difficult to describe physically and mathematically. Moreover, the detailed description of these processes is impacted by kinetics at the step edges, especially those of extremal steps of opposite sign [2]. The physical picture described above implies that the energetics of the solid surface are different above and below TR, which can be a few hundred kelvin below the melting temperature for solids. This description of surface evolution clearly has important differences from the case of liquid–liquid interfaces.
1.2. Theoretical Descriptions of Surface Evolution
The aim of most theoretical studies is to describe the surface morphology at mesoscopic and macroscopic length scales by taking into account the motion of atoms or steps at smaller length scales. Historically, there have been two different theoretical approaches: (i) Approaches based on continuum thermodynamics and principles of continuum mechanics such as mass conservation, which lead to diffusion-like PDEs or variational principles (e.g., for a recent variational approach see Ref. [4]). (ii) Simulations of individual atom or step motion by solving a large number of coupled equations; for example, the wandering of an individual step is studied by taking into account the local or nonlocal interactions with adjacent steps. This second approach often succeeds in providing detailed information about the surface morphology by accounting for motions over a wide range of length scales. Nevertheless, the merits of the first approach include its relative simplicity, because it often enables analytical solutions and, therefore, allows for quantitative predictions for experiments. It is worth mentioning that the differential equations that arise in this continuum approach vary in their form and in the properties of their solutions, and are not generally so familiar to researchers. We take the first approach in the main body of this article. There are many different types of problems that have emerged in the theoretical and experimental studies of morphological surface evolution, as determined mostly by the geometry and dimensionality of the surface configurations both above and below TR. We mention here only four types of such problems. In particular, there have been studies of: (a) the relaxation or flattening of a surface with long-wavelength features, an example being a periodic corrugation with an initial sine profile in one or two rectilinear coordinates, (b) the relaxation of a surface morphology with an initial localized "bump," or structure of finite extent, (c) the evolution of the interface between two grain
boundaries, which is commonly referred to as grooving, and (d) the evolution of surfaces of revolution in three dimensions such as cones and cylinders (e.g., wires). Analytical solutions for representative versions of some of these problems are summarized in this paper.
1.3. Step-flow Models
The basic description of atomistic processes in the framework of step kinetics that underlies surface evolution was given by Burton et al. [5] and is referred to as BCF theory; for an overview see Ref. [3]. Figure 1c shows a cross-section of a 1D step configuration along x with the position of the nth step denoted by $x_n(t)$, where the nth terrace is the region $x_n < x < x_{n+1}$. The starting point for step-flow models is the conservation of mass, which relates $\dot{x}_n$ to the adatom surface current (atoms/time), $J_n(x)$, on the nth terrace by

$$\dot{x}_n(t) = \frac{\Omega}{a}\,\bigl[J_{n-1}(x_n,t) - J_n(x_n,t)\bigr], \qquad \dot{x}_n(t) \equiv \frac{dx_n}{dt}, \tag{1}$$

where $\Omega$ is the atomic area and $a$ is the step height. The surface current is $J_n = -D_s(\partial c_n/\partial x)$, where $c_n = c_n(x,t)$ is the adatom concentration and $D_s$ is the diffusivity. The concentration $c_n(x,t)$ satisfies the diffusion equation, $D_s(\partial^2 c_n/\partial x^2) = \partial c_n/\partial t \approx 0$, where the time derivative is negligible in the quasistatic approximation. Thus, $c_n$ on each terrace is $c_n(x,t) = A_n(t)\,x + B_n(t)$, where the time t enters implicitly through the boundary conditions. The requisite boundary conditions describe the attachment and detachment of atoms at the step edges,

$$-J_n(x_n,t) = k\,\bigl[c_n(x_n,t) - c_n^{\mathrm{eq}}\bigr], \qquad J_n(x_{n+1},t) = k\,\bigl[c_n(x_{n+1},t) - c_{n+1}^{\mathrm{eq}}\bigr], \tag{2}$$
where $k$ is the attachment–detachment rate coefficient and the superscript "eq" denotes the equilibrium atom density at the step edge. Hence, $A_n$ and $B_n$ can be determined in terms of $c_n^{\mathrm{eq}}$, which is related to the step chemical potential, $\mu_n$, by
$$c_n^{\mathrm{eq}} = c^{\mathrm{eq}}\exp\!\left(\frac{\mu_n}{k_B T}\right) \approx c^{\mathrm{eq}}\left(1 + \frac{\mu_n}{k_B T}\right), \tag{3}$$
where we have also indicated the limit $|\mu_n| \ll k_B T$ [6]. Finally, $\mu_n$ is related to other step positions via the step interaction potential. In particular, for next-neighbor interactions described by the potential $V(x_n, x_{n+1})$, the step chemical potential is $\mu_n = \partial\bigl[V(x_n, x_{n+1}) + V(x_{n-1}, x_n)\bigr]/\partial x_n$. Hence, (1)–(3) define a system of coupled ODEs for the step positions, which can be solved numerically with given initial conditions $x_n(0)$ to determine the evolution of a stepped surface.
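A minimal numerical sketch of the system (1)–(3), with assumptions that are ours: next-neighbor repulsion $V(x_n, x_{n+1}) = g/(x_{n+1} - x_n)^2$, the linear terrace profile with the attachment–detachment conditions (2) eliminated analytically (giving a constant flux on each terrace), fixed outermost steps, forward-Euler time stepping, and illustrative dimensionless parameters.

```python
import numpy as np

Ds, k, g, Omega, a, ceq, kBT = 1.0, 0.5, 0.1, 1.0, 1.0, 1.0, 1.0
rng = np.random.default_rng(1)
x = np.cumsum(np.concatenate(([0.0], 1.0 + 0.3 * rng.random(20))))  # 21 steps

def step_velocities(x):
    w = np.diff(x)                              # terrace widths
    # mu_n = d/dx_n [ g/w_n^2 + g/w_{n-1}^2 ] = 2g (1/w_n^3 - 1/w_{n-1}^3)
    mu = np.zeros_like(x)
    mu[1:-1] = 2 * g * (1.0 / w[1:]**3 - 1.0 / w[:-1]**3)
    c_eq = ceq * (1.0 + mu / kBT)               # linearized Gibbs-Thomson, Eq. (3)
    # Constant flux on each terrace, from the linear c_n and Eq. (2):
    J = -Ds * k * (c_eq[1:] - c_eq[:-1]) / (2 * Ds + k * w)
    v = np.zeros_like(x)
    v[1:-1] = (Omega / a) * (J[:-1] - J[1:])    # Eq. (1); end steps held fixed
    return v

dt = 1e-3
for _ in range(5000):
    x = x + dt * step_velocities(x)
print("terrace widths after relaxation:", np.round(np.diff(x), 3))
```

As expected for a repulsive step interaction, the terrace widths relax toward a uniform step train.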
2. Governing Equations for Continuum Descriptions
A basic ingredient of the continuum equations for surface evolution both above and below roughening is the chemical potential $\mu$ [7]. We restrict the majority of our discussion to the analysis of configurations in one independent space dimension, where the height profile is denoted $h = h(x,t)$ and $h_x \equiv \partial h/\partial x$. The surface thermodynamics can be described in terms of either of two energies: the surface free energy per projected area of the high-symmetry plane, $G(h_x)$, or the perhaps more familiar surface free energy per area of the surface, $\gamma(\phi)$, where $\phi$ is the surface orientation, $\tan\phi = h_x$. The two energies are related by $G = \gamma\sqrt{1+h_x^2}$. As above, we denote by $\Omega$ the atomic area. The chemical potential $\mu$ is related to $G$ by $\mu - \mu_0 = -\Omega\,(\partial/\partial x)(\partial G/\partial h_x)$, as shown using a variational principle [8] in the Appendix. It then follows by elementary calculus that

$$\mu - \mu_0 = -\Omega\,\frac{\partial}{\partial x}\!\left(\frac{\partial G}{\partial h_x}\right) = -\Omega\, h_{xx}\,\frac{\partial^2 G}{\partial h_x^2} = -\Omega\, h_{xx}\cos^2\phi\,\frac{\partial}{\partial\phi}\!\left[\cos^2\phi\,\frac{\partial}{\partial\phi}\!\left(\frac{\gamma(\phi)}{\cos\phi}\right)\right] = -\Omega\left(\gamma + \frac{d^2\gamma}{d\phi^2}\right)\underbrace{\frac{h_{xx}}{(1+h_x^2)^{3/2}}}_{\kappa}\,, \tag{4}$$
where in the last step we used $\cos\phi = (1+h_x^2)^{-1/2}$; $\kappa$ denotes the curvature of the surface. The term $d^2\gamma/d\phi^2 \equiv \gamma_{\phi\phi}$ is not present when considering surface evolution of liquids. In the more general, 2D setting [8], we have $G = G(h_x, h_y)$ and use $\mu - \mu_0 = -\Omega_v\bigl[(\partial/\partial x)(\partial G/\partial h_x) + (\partial/\partial y)(\partial G/\partial h_y)\bigr]$, where $\Omega_v$ is the atomic volume. As a result, the chemical potential is $\mu - \mu_0 = -\Omega_v(\gamma + \gamma_{\phi_1\phi_1})\kappa_1 - \Omega_v(\gamma + \gamma_{\phi_2\phi_2})\kappa_2$, where $\kappa_1$ and $\kappa_2$ are the principal curvatures, and $\phi_1$ and $\phi_2$ are the corresponding angles (surface orientations) along the normals to these principal curvatures.
2.1. Surface Evolution by Evaporation–Condensation Above and Below TR
Perhaps the simplest case of surface dynamics is when the evolution occurs by displacement of atoms by evaporation from, or condensation on, the surface. The driving force for movement of the atoms is then the difference of chemical potentials between the surface and the vapor. Thus, with $v_n$ denoting the speed at which the surface is displaced in the normal direction,

$$v_n = -\zeta\,(\mu - \mu_0), \tag{5}$$
where $\zeta > 0$ is the product of a surface mobility [9] and the inverse of the step height. It is necessary to distinguish two cases, $T > T_R$ and $T < T_R$, since below roughening the existence of steps and facets produces differences in the form of the relation between the surface energy $\gamma$ and the surface orientation $h_x$. In the classical case of evolution above roughening, $T > T_R$, $\zeta$ and $\gamma$ are analytic in $h_x$. For the special case of constant properties, $\zeta_0$ and $\gamma_0$, Eq. (5) simplifies. Since $v_n = h_t\,(1+h_x^2)^{-1/2}$ and the curvature is $\kappa = h_{xx}/(1+h_x^2)^{3/2}$, (4) and (5) lead to

$$h_t = \zeta_0\gamma_0\Omega\,\frac{h_{xx}}{1+h_x^2} \qquad (T > T_R). \tag{6}$$
In the small-slope limit, $|h_x| \ll 1$, we simply have the familiar linear diffusion equation. Some examples, for both the linear and nonlinear equations, are given below. On the other hand, below the roughening transition, $T < T_R$, the mobility may depend on the surface orientation; e.g., $\zeta = k_0|h_x|^\alpha$ with $\alpha = 0$ or $1$ is common. Further, it is usual to consider small slopes and define the height function $h(x,t)$ relative to a crystallographic plane. In this case, $\gamma + \gamma_{\phi\phi} = \tilde\gamma\,|h_x|^\beta$, where $\beta = 1$ when the dominant physical effect at the nanoscale is that of step–step elastic interactions that decay inversely proportional to the square of the step distance [10], or $G = g_0 + g_1|h_x| + \frac{1}{3}g_3|h_x|^3$, where $2g_3 = \tilde\gamma$. For $\alpha = 1$ and $\beta = 1$ in particular, the surface evolves according to the non-linear equation

$$h_t = k_0\tilde\gamma\Omega\, h_x^2\, h_{xx}. \tag{7}$$
A general discussion of the evaporation–condensation dynamics below roughening is given by Spohn [9]. Again, some examples are provided below.
2.2. Surface Evolution by Surface Diffusion Above and Below TR
As above, the surface evolves in the normal direction at a speed $v_n$ owing to variations in the flux of atoms along the surface. It is straightforward to give the development in two independent space dimensions here [11]. If we let $\mathbf{j}$ denote the flux of atoms per unit length normal to a contour lying in the surface, and $\Omega_v$ the atomic volume as above, then mass conservation requires

$$v_n + \Omega_v\,\nabla_s\cdot\mathbf{j} = 0. \tag{8}$$
For systems out of, but close to, equilibrium the surface flux $\mathbf{j}$ is proportional to the gradient of the surface chemical potential (or energy) for an atom. The corresponding thermodynamic force on the atom is $-\nabla_s\mu$, and the flux of atoms then follows from a form of a Stokes–Einstein argument: $\mathbf{j} = -(D_s c_s/k_B T)\nabla_s\mu$, where $D_s$ is the surface diffusivity, $c_s$ is the adatom concentration (number/area; adatoms are those atoms free to diffuse at any time along the surface), $k_B$ is Boltzmann's constant, and $T$ is the absolute temperature. Assuming all material parameters are constants, the surface evolves according to

$$v_n = \frac{D_s c_s \Omega_v}{k_B T}\,\nabla_s^2\mu. \tag{9}$$
Above the roughening transition, the chemical potential change, $\mu - \mu_0$, is proportional to the surface curvature. Hence, (9) yields a fourth-order nonlinear PDE for the height $h$. In one dimension the PDE is

$$h_t = -\frac{D_s c_s \Omega_v^2\,\gamma_0}{k_B T}\,\sqrt{1+h_x^2}\;\frac{\partial^2}{\partial x^2}\!\left[\frac{h_{xx}}{(1+h_x^2)^{3/2}}\right]. \tag{10}$$
For small slopes, this equation is linearized to $h_t = -(D_s c_s \Omega_v^2\gamma_0/k_B T)\,h_{xxxx}$. On the other hand, below the roughening transition, the surface energy depends on the surface orientation. Taking $\gamma + \gamma_{\phi\phi} = \tilde\gamma\,|h_x|$ in one dimension for small surface slopes, we obtain the nonlinear PDE

$$h_t = -\frac{D_s c_s \Omega_v^2\,\tilde\gamma}{k_B T}\,\bigl(|h_x|\,h_{xx}\bigr)_{xx}. \tag{11}$$
Some solutions of these equations for surface-diffusion-driven evolution above and below the roughening temperature are given below.
3. Solutions to Prototypical Problems: Surface Evolution by Evaporation–Condensation Processes, T > TR
In these last sections we tersely summarize a number of problems that have been treated analytically, including both the familiar linear second- and fourth-order diffusion equations and the more intricate nonlinear equations. We treat in sequence evaporation–condensation and surface-diffusion processes, first for conditions above, and then for conditions below, the roughening transition. We begin with evaporation–condensation dynamics. Recall that (6) reduces to the diffusion equation for $|h_x| \ll 1$, so that $h_t = \zeta_0\gamma_0\Omega\,h_{xx}$.

Relaxation of periodically corrugated surfaces [12]. For an initial periodic profile with wavelength $\lambda$, $h(x,0) = A\sin(2\pi x/\lambda)$, the diffusion equation is solved by applying a Fourier series of the form $h(x,t) = \sum_{n=1}^{\infty} a_n(t)\sin(2n\pi x/\lambda)$,
where the coefficients $a_n(t)$ satisfy the ODE $\dot a_n + \zeta_0\gamma_0\Omega\,(2\pi n/\lambda)^2 a_n = 0$ and the initial condition $a_n(0) = A$. Hence, the complete solution is

$$h(x,t) = A\sum_{n=1}^{\infty} e^{-\zeta_0\gamma_0\Omega(2\pi n/\lambda)^2 t}\,\sin\!\left(\frac{2n\pi x}{\lambda}\right). \tag{12}$$
For sufficiently long times, corresponding to $t \gg \lambda^2/(4\pi^2\zeta_0\gamma_0\Omega)$, Eq. (12) simplifies to $h(x,t) \sim A\,e^{-\zeta_0\gamma_0\Omega(4\pi^2/\lambda^2)t}\sin(2\pi x/\lambda)$; thus, the lifetime of the periodic profile is proportional to $\lambda^2$.

Decay of a localized mound of atoms. Again, we restrict ourselves to the small-slope approximation. For an initial bump, $h(x,0) = f(x)$, where $f(x)$ is of finite extent, and the condition $h \to 0$ sufficiently fast as $|x| \to \infty$, $h(x,t)$ is determined analytically by applying the Laplace transform in $t$, $\tilde h(x,s) = \int_0^\infty h(x,t)\,e^{-st}\,dt$. In particular, for $f(x) = \delta(x)$, we find $\tilde h^\delta(x,s) = e^{-\sqrt{s}\,|x|}/(2\sqrt{s})$ (in units where $\zeta_0\gamma_0\Omega = 1$), whose inversion gives the fundamental solution

$$h^\delta(x,t) = \frac{1}{\sqrt{4\pi\,\zeta_0\gamma_0\Omega\,t}}\;e^{-x^2/(4\zeta_0\gamma_0\Omega t)}. \tag{13}$$
Notice that this solution has the similarity form $t^{-1/2}H(\eta)$, where $\eta$ is the similarity variable $x/\sqrt{4\zeta_0\gamma_0\Omega t}$. The solution for an arbitrary initial bump is obtained by superposition:

$$h(x,t) = \int_{-\infty}^{\infty} h^\delta(x - x', t)\,f(x')\,dx'. \tag{14}$$
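As a quick consistency check of (13) and (14) — a sketch with parameter choices that are ours — the superposition integral can be evaluated by quadrature for a Gaussian initial bump and compared with a direct finite-difference solution of $h_t = \zeta_0\gamma_0\Omega\,h_{xx}$:

```python
import numpy as np

D = 1.0                                    # stands in for zeta_0 * gamma_0 * Omega
x = np.linspace(-20.0, 20.0, 801)
dx = x[1] - x[0]
f = np.exp(-x**2)                          # initial bump h(x, 0)

def kernel(xx, t):                         # fundamental solution, Eq. (13)
    return np.exp(-xx**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

t_final = 2.0
h_super = np.array([np.sum(kernel(xi - x, t_final) * f) * dx for xi in x])  # Eq. (14)

dt = 0.4 * dx**2 / D                       # explicit scheme, stable time step
h = f.copy()
for _ in range(int(round(t_final / dt))):
    h[1:-1] += D * dt / dx**2 * (h[2:] - 2 * h[1:-1] + h[:-2])
print("max difference:", np.abs(h - h_super).max())   # should be small
```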
Grooving at a grain boundary [13, 14]. Here we consider the evolution of a groove that forms at a grain boundary of an otherwise flat surface. It is thus necessary to solve the nonlinear equation (6) subject to the condition $h_x(0,t) = -\cos\theta/\sin\theta$, where $\theta$ is half the dihedral angle formed at the groove. This problem admits a similarity solution of the form $h(x,t) = (2\zeta_0\gamma_0\Omega t)^{1/2}\,H\!\left(x/(2\zeta_0\gamma_0\Omega t)^{1/2}\right)$, where $H(\eta)$ satisfies the ODE

$$\bigl(H - \eta H_\eta\bigr)\bigl(1 + H_\eta^2\bigr) = H_{\eta\eta}. \tag{15}$$

Thus, the groove deepens at a rate proportional to $t^{1/2}$. A numerical solution of (15) is in principle necessary and is straightforward to obtain by the usual shooting procedure of guessing $H(0)$, with a given $H_\eta(0)$, until $H(\eta \to \infty) \to 0$. For the special case of small surface slopes, $|H_\eta| \ll 1$, (15) can be linearized, and the resulting solution is $H(\eta) = -(\cos\theta/\sin\theta)\left(\eta\,\mathrm{erfc}(\eta/\sqrt{2}) - \sqrt{2/\pi}\,e^{-\eta^2/2}\right)$.
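That shooting procedure is simple enough to sketch; the slope at the groove root, the integration interval, and the bracketing around the linearized estimate are our illustrative choices:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

m = 0.2                                       # -H_eta(0) = cos(theta)/sin(theta)

def rhs(eta, y):                              # Eq. (15) as a first-order system
    H, Hp = y
    return [Hp, (H - eta * Hp) * (1.0 + Hp**2)]

def endpoint(H0, eta_max=6.0):
    sol = solve_ivp(rhs, [0.0, eta_max], [H0, -m], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]                       # want H(eta_max) -> 0

H0_lin = m * np.sqrt(2.0 / np.pi)             # linearized (small-slope) estimate
H0 = brentq(endpoint, 0.5 * H0_lin, 1.5 * H0_lin, xtol=1e-12)
print("groove-root value H(0) =", H0, "versus linearized", H0_lin)
```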
4. Surface Evolution by Surface-Diffusion Processes, T > TR
In the small-slope approximation, $|h_x| \ll 1$, Eq. (10) reads $h_t = -B\,h_{xxxx}$, where $B = D_s c_s \Omega_v^2\gamma_0/k_B T > 0$ is a material parameter with dimensions (length)$^4$/time.

Decay of periodic surface modulations. For an initial periodic profile with wavelength $\lambda$, $h(x,0) = A\sin(2\pi x/\lambda)$, (10) is solved again by applying a Fourier series; the coefficients $a_n(t)$ satisfy the ODE $\dot a_n + B(2\pi n/\lambda)^4 a_n = 0$ and the initial condition $a_n(0) = A$. Hence, the complete solution is
$$h(x,t) = A\sum_{n=1}^{\infty} e^{-B(2\pi n/\lambda)^4 t}\,\sin\!\left(\frac{2n\pi x}{\lambda}\right). \tag{16}$$
For sufficiently long times, $t \gg (2\pi)^{-4}\lambda^4/B$, this solution is approximated by $h(x,t) \sim A\,e^{-B(2\pi/\lambda)^4 t}\sin(2\pi x/\lambda)$; thus, the lifetime of the periodic profile is proportional to $\lambda^4$. This scaling with size should be contrasted with the case of evaporation–condensation, for which the lifetime is proportional to $\lambda^2$.

Decay of a localized mound of atoms. In some circumstances there are initial conditions that correspond to a mound of material on an otherwise flat surface. The system proceeds to lower its energy by flattening, and so it is of interest to quantify this decay process. For an initial bump, $h(x,0) = f(x)$, and the condition $h \to 0$ sufficiently fast as $|x| \to \infty$, $h(x,t)$ is again determined analytically by applying the Laplace transform of $h(x,t)$ in $t$. In particular, for $f(x) = \delta(x)$, we find $\tilde h^\delta(x,s) = 2^{-1}B^{-1/4}s^{-3/4}\,e^{-s^{1/4}B^{-1/4}|x|/\sqrt{2}}\,\sin\!\bigl(s^{1/4}2^{-1/2}B^{-1/4}|x| + \pi/4\bigr)$, whose inversion gives the real solution
1 1 2π i 2(Bt)1/4
i∞ −i∞
dσ σ −3/4 sin ησ 1/4 +
π , 4
(17)
where η = |x|/(4Bt)1/4 . The solution has the similarity form t −1/4 H (η), which, for long times, could have been recognized immediately. The solution for an arbitrary bump is given by (14); for sufficiently long times this solution also obtains a similarity structure. It is inferred that for long times the bump has a lifetime proportional to the fourth power of its linear size.
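Since both linearized equations are diagonal in Fourier space, the $\lambda^2$ versus $\lambda^4$ lifetimes can be read off directly; a few illustrative numbers (with the material coefficients set to one, our own normalization):

```python
import numpy as np

lams = np.array([1.0, 2.0, 4.0])              # corrugation wavelengths
k = 2 * np.pi / lams
tau_ec = 1.0 / k**2                           # evaporation-condensation, h_t = h_xx
tau_sd = 1.0 / k**4                           # surface diffusion, h_t = -h_xxxx
for lam, t2, t4 in zip(lams, tau_ec, tau_sd):
    print(f"lambda = {lam:3.1f}:  tau_EC = {t2:8.4f},  tau_SD = {t4:9.5f}")
# Doubling lambda multiplies tau_EC by 4 but tau_SD by 16.
```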
5. Surface Evolution by Evaporation–Condensation Processes, T < TR

Decay of a localized mound of atoms in one space dimension. Here we consider Eq. (7). The material parameter $k_0\tilde\gamma\Omega$ has the units of diffusivity, (length)$^2$/time. If we consider an arbitrary initial distribution of atoms confined
to a region $|x| \le X(t)$, then global mass conservation requires $2\int_0^{X(t)} h(x,t)\,dx = M = \text{constant}$. For this problem there is a similarity solution that describes the long-time behavior of a bump, and a wide class of initial distributions are expected to evolve to the profile predicted by the similarity solution for times $t \gg \ell^2/(k_0\tilde\gamma\Omega)$, where $\ell$ is a length scale characteristic of the initial distribution. The similarity solution has the form
$$h(x,t) = \left(\frac{M^4}{96\,k_0\tilde\gamma\Omega}\right)^{1/6} t^{-1/6}\,H(\eta), \qquad \text{where } \eta = \frac{x}{\left(3k_0\tilde\gamma\Omega M^2/2\right)^{1/6}\,t^{1/6}}, \tag{18}$$
and the function $H(\eta)$ thus satisfies the nonlinear ODE $-(\eta H)_\eta = H_\eta^2 H_{\eta\eta}$. Conservation of the total mass becomes $\int_0^{\eta_e} H(\eta)\,d\eta = 1$, where $X_e(t) = \eta_e\left(3k_0\tilde\gamma\Omega M^2/2\right)^{1/6} t^{1/6}$ is the finite extent of the evolving surface; the constant $\eta_e$ remains to be determined. The ODE for $H(\eta)$ can be integrated twice; using the symmetry condition $H_\eta(0) = 0$ along with the definition of the leading edge $\eta_e$ as $H(\eta_e) = 0$, we obtain
$$H(\eta) = \left(\frac{3}{8}\right)^{1/2} \eta_e^2 \left[1 - \left(\frac{\eta}{\eta_e}\right)^{4/3}\right]^{3/2}, \tag{19}$$
which is the form given by Spohn [9]. The parameter $\eta_e$ is determined from total mass conservation; we find

$$\eta_e^3\left(\frac{3}{8}\right)^{1/2}\int_0^1 \left(1-\eta^{4/3}\right)^{3/2} d\eta = 1, \qquad \text{or} \qquad \eta_e = \left(\frac{25}{6\pi}\right)^{1/6}\left[\frac{\Gamma(1/4)}{\Gamma(3/4)}\right]^{1/3}, \tag{20}$$

where $\Gamma(s)$ is the Gamma function.
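A quick numerical cross-check of (20), as reconstructed here (the closed form should be read with that caveat), evaluates the mass-conservation integral directly and compares it with the Gamma-function expression:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

I, _ = quad(lambda u: (1.0 - u**(4.0 / 3.0))**1.5, 0.0, 1.0)
eta_e_num = (1.0 / (np.sqrt(3.0 / 8.0) * I))**(1.0 / 3.0)
eta_e_form = ((25.0 / (6.0 * np.pi))**(1.0 / 6.0)
              * (gamma(0.25) / gamma(0.75))**(1.0 / 3.0))
print(eta_e_num, eta_e_form)    # both approximately 1.505
```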
Decay of an axisymmetric bump in two dimensions [9]. For axisymmetric shapes $h = h(r,t)$, where $r = \sqrt{x^2+y^2}$. We take [10]

$$G = g_0 + g_1|\nabla h| + \tfrac{1}{3}g_3|\nabla h|^3, \tag{21}$$

and $\mu - \mu_0 = -\Omega_v\bigl[(\partial/\partial x)(\partial G/\partial h_x) + (\partial/\partial y)(\partial G/\partial h_y)\bigr]$. The resulting PDE for small slopes, $|\nabla h| \ll 1$, and $h_r < 0$ follows from (5) and (21) with $\zeta = k_0|\nabla h|$ to be

$$h_t = A\,\frac{h_r}{r}\,\frac{\partial}{\partial r}\!\left[r\left(1 + \frac{g_3}{g_1}\,h_r^2\right)\right], \qquad \text{where } A = \Omega_v k_0 g_1, \tag{22}$$

with the initial condition $h(r,0) = H(r)$. Neglecting the $g_3/g_1$ term in the PDE and applying the method of characteristics, we obtain $h(r,t) \approx H\!\left(\sqrt{2At + r^2}\right)$. This solution describes how the initial bump shrinks to zero at long times; corrections due to the $g_3/g_1$ term are relatively small and can be obtained via simple iterations.
6. Surface Evolution by Surface-Diffusion Processes, T < TR
Evolution of a periodic profile in one dimension. Kinetic simulations [15, 16] based on a step-flow model with elastic step–step interactions have indicated that the height of periodic profiles in one dimension may evolve as $h(x,t) = \Phi(x)\Psi(t)$, i.e., $h$ has a separable form. From the continuum viewpoint the surface evolution can be described by (11), $h_t = -B\,(|h_x|h_{xx})_{xx}$, where $B = D_s c_s \Omega_v^2\tilde\gamma/k_B T$. Assuming $h_x > 0$, $\Phi$ and $\Psi$ thus satisfy $-\dot\Psi/\Psi^2 = C = \text{const.} > 0$ and $B\,(\Phi_x\Phi_{xx})_{xx} = C\,\Phi(x)$. Hence, $\Psi(t) = (Ct + K)^{-1} \approx C^{-1}t^{-1}$ for long times, while the ODE for $\Phi$ can only be solved numerically. The set of boundary conditions that would yield a unique solution to this PDE is a topic of discussion in the literature (e.g., Ref. [2]).

Evolution of an axisymmetric shape in two dimensions [17]. Here we consider the surface-diffusion-driven change in shape of an initially conical surface (see Fig. 2a). Using (21) and the equation for $\mu - \mu_0$ in terms of $G$ along with (9), we obtain a PDE for $h(x,y,t)$ in two dimensions. For axisymmetric shapes, $h = h(r,t)$, with a growing facet of radius $w(t)$, as shown in Fig. 2a, the PDE for the slope profile $F = -h_r$ is

$$\frac{\partial F}{\partial t} = \frac{3B}{r^4} - B\,\frac{g_3}{g_1}\,\frac{\partial}{\partial r}\,\nabla^2\!\left[\frac{1}{r}\,\frac{\partial}{\partial r}\bigl(r F^2\bigr)\right], \tag{23}$$

where $B = D_s c_s \Omega_v^2\gamma_0/k_B T$. This equation can be studied using a combination of free-boundary ideas (the facet width $w(t)$ changes in time) and boundary-layer ideas (there is a region of rapid variation associated with the highest-derivative term in Eq. (23)). For $\epsilon \equiv g_3/g_1 < O(1)$, singular perturbation theory suggests that the solution $F$ varies rapidly inside a boundary layer of width $\delta_b$ near the facet.
[Figure 2 panels: (a) schematic of the facet of radius w(t) and the surface z = h(r,t); (b) scaled slope f0(η) versus scaled radial coordinate η, curves a–e.]
Figure 2. (a) Schematic of an axisymmetric shape with an indication of the step structure on the atomic scale. (b) Surface slope profiles as a function of a similarity variable. The different profiles correspond to different values of g3 /g1 as described in Ref. [17].
Taking $F \approx a(t)\,f_0(\eta)$ for long times, where $\eta = [r - w(t)]/\delta_b$, we obtain $\delta_b = O(\epsilon^{1/3})$ and a universal ODE for $f_0$, $(f_0^2)''' = f_0 - 1$. This equation can only be solved numerically, assuming slope continuity, $f_0(0) = 0$. Solutions are obtained by the routine shooting procedure of starting with $f_0(\eta^*) \approx c_1(\eta^*)^{1/2} + c_3(\eta^*)^{3/2}$ for $\eta^* \ll 1$ and finding the coefficients $c_1$ and $c_3$ so that $f_0(\eta \to \infty) = 1$, as dictated by asymptotic matching at $\eta = \infty$ with the "outer solution" for $g_3/g_1 = 0$. Different numerical solutions of the ODE are shown in Fig. 2b. There is excellent agreement (not shown here) between the theoretical predictions and the results from kinetic simulations [6].
7. Outlook

The development of continuum descriptions for the time evolution of the shape of crystalline materials leads to a number of different partial differential equations. The distinction between the driving forces for surface evolution above and below the roughening temperature is significant, and it is only in fairly recent years that attention has focused on the below-roughening case. The use of step-flow models, and the understanding gained from these systems, is also important for probing kinetic, and other, features of the basic continuum models. Further advances and comparison of these ideas with experiment will lead to progress in future years.
Appendix A

Here we derive the first line of Eq. (4), which relates the chemical potential $\mu$ to the surface energy parameter $G(h_x)$. The total surface free energy in 1D is $G_t = \int dx\,G(h_x)$. Taking the first variation with respect to $h$ of $G_t - \Omega^{-1}\int dx\,\tilde\lambda\,h$ to be zero for $h$ fixed at the endpoints, where $\tilde\lambda$ is the change of the chemical potential, we find

$$0 = \int dx\left[\frac{\partial G}{\partial h_x}\,\delta h_x - \Omega^{-1}\tilde\lambda\,\delta h\right] = -\int dx\left[\frac{\partial}{\partial x}\!\left(\frac{\partial G}{\partial h_x}\right) + \Omega^{-1}\tilde\lambda\right]\delta h. \tag{A1}$$

By definition of the chemical potential, $\mu - \mu_0 = \tilde\lambda$, and the first line of Eq. (4) is obtained.
Acknowledgments

We thank M.Z. Bazant, D. Kandel, R.V. Kohn, R.R. Rosales, V. Shenoy, and Z. Suo for helpful conversations, E.D. Williams for her kind permission to reproduce a figure from Ref. [3], and M.J. Aziz for his constant support, encouragement, and valuable explanations.
References
[1] P. Nozières, "Shape and growth of crystals," In: C. Godreche (ed.), Solids Far from Equilibrium, Cambridge University Press, Cambridge, pp. 1–154, 1992.
[2] A. Chame, S. Rousset, H.P. Bonzel, and J. Villain, "Slow dynamics of stepped surfaces," Bulgarian Chem. Commun., 29, 398–434, 1996/97.
[3] H.-C. Jeong and E.D. Williams, "Steps on surfaces: experiment and theory," Surf. Sci. Rep., 34, 171–294, 1999.
[4] V.B. Shenoy and L.B. Freund, "A continuum description of the energetics and evolution of stepped surfaces in strained nanostructures," J. Mech. Phys. Solids, 50, 1817–1841, 2002.
[5] W.K. Burton, N. Cabrera, and F.C. Frank, "The growth of crystals and the equilibrium structure of their surfaces," Phil. Trans. R. Soc. London Ser. A, 243, 299–358, 1951.
[6] N. Israeli and D. Kandel, "Profile of a decaying crystalline cone," Phys. Rev. B, 60, 5946–5962, 1999.
[7] C. Herring, "Surface tension as a motivation for sintering," In: W.E. Kingston (ed.), The Physics of Powder Metallurgy, McGraw-Hill, New York, pp. 143–179, 1951.
[8] W.W. Mullins, "Capillarity-induced surface morphologies," Interface Sci., 9, 9–20, 2001.
[9] H. Spohn, "Surface dynamics below the roughening transition," J. Phys. I France, 3, 69–81, 1993.
[10] H.P. Bonzel, "Equilibrium crystal shapes: towards absolute energies," Prog. Surf. Sci., 67, 45–57, 2001.
[11] F.A. Nichols and W.W. Mullins, "Morphological changes of a surface of revolution due to capillarity-induced surface diffusion," J. Appl. Phys., 36, 1826–1835, 1965.
[12] W.W. Mullins, "Flattening of a nearly plane solid surface due to capillarity," J. Appl. Phys., 30, 77–83, 1959.
[13] W.W. Mullins, "Theory of thermal grooving," J. Appl. Phys., 28, 333–339, 1957.
[14] Z. Suo, "Motions of microscopic surfaces in materials," Adv. Appl. Mech., 33, 193–294, 1997.
[15] M. Ozdemir and A. Zangwill, "Morphological equilibration of a corrugated crystalline surface," Phys. Rev. B, 42, 5013–5024, 1990.
[16] N. Israeli and D. Kandel, "Decay of one-dimensional surface modulations," Phys. Rev. B, 62, 13707–13717, 2000.
[17] D. Margetis, M.J. Aziz, and H.A. Stone, "Continuum description of profile scaling in nanostructure decay," Phys. Rev. B, 69, 041404(R), 2004.
4.9 BREAKUP AND COALESCENCE OF FREE SURFACE FLOWS
Jens Eggers
School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK
The picture of a dolphin, jumping out of the water in the New England aquarium in Boston (Fig. 1), gives a very good idea of the challenges involved in the description of free-surface flows. In a complex series of events, which is still not well understood, water swept up by the dolphin breaks up into thousands of small drops. A more detailed idea of what happens close to the point of breakup is given in Fig. 2, which shows a drop of water falling from a faucet. Once an elongated neck has formed, surface energy is minimized by locally reducing its radius, and a drop separates at a point. Once the neck is broken, it rapidly snaps back, forming capillary waves on its surface. In the last picture on the right, the neck has been severed on the other end as well. Thus in a single dripping event, two drops have actually formed, and the smaller "satellite" drop will subsequently break up to form even smaller drops. This gives a good idea of the complexity of just a single breakup event, driven by surface tension. The complementary event of drop coalescence is illustrated by Fig. 3, which shows two drops which have been made to touch at a point. Surface tension drives a motion that makes the drops coalesce, since the combined drop has a lower surface energy. The initial motion is so rapid that it is hardly resolved by the camera, and it results in quite a complicated sequence of capillary waves, drop oscillations, etc. Clearly changes in topology brought about by breakup or coalescence are the most dramatic events in the evolution of a free surface, characterized by a very rapid and complex motion of the surface (cf. Figs. 2 and 3). In fact, it is not a priori clear whether continuum equations are able to describe topology changes, since somewhere in between flow features develop which are
Figure 1. Dolphin in the New England aquarium in Boston; photograph by Harold Edgerton.
of molecular size. So apart from predicting the actual motion near the singular point, the aim of the theory is to explain how one topology can be transformed into the other in a unique way. The spatial and temporal resolution of any numerical simulation is limited, so a thorough understanding of the singularity is needed. Once the rapid motion near the singularity can be described theoretically, the numerical evolution can be matched onto it. In addition, the theoretical description of singularities will explain some universal flow features, attributable to breakup or coalescence of drops.
Figure 2. A sequence of photographs showing a drop of water falling from a pipette D = 5.2 mm in diameter (photograph by H. Peregrine, see Ref. [2]). The superimposed black lines are the result of a simulation of the one-dimensional equations (1) and (2).
Figure 3. A sequence of images (Δt = 1 ms) of two mercury drops brought into contact at the point indicated by an arrow [4].
1. Non-linear Dynamics of Drop Formation
To obtain insight into the non-linear dynamics close to breakup one has to solve a notoriously difficult problem: the Navier–Stokes equation within a domain that is changing in time. The motion of the interface is dictated by the fluid motion itself, as the interface is convected passively by the fluid motion at the interface. The motion of the interface has to be computed with great accuracy, because the fluid motion is driven by surface tension, resulting in a Laplace pressure proportional to the mean curvature of the interface. Since the driving is proportional to pressure gradients, acceleration of a fluid element is effectively determined by third derivatives of the surface shape. The numerical difficulties inherent in this coupling of fluid motion and its driving force are discussed thoroughly by Scardovelli and Zaleski [1], giving an overview of available numerical methods.
To obtain greater insight into drop breakup, it is necessary to reduce the non-linear dynamics associated with it to its essentials. The idea is that near the point where the neck radius goes to zero, the fluid motion is directed primarily parallel to the axis. This allows one to reduce the problem to an equation for the average velocity in the axial direction alone. Alternatively, and more or less equivalently, the velocity field can be expanded in the radial coordinate. If a typical radial length scale is smaller than the corresponding axial one, usually signaled by the interface slope being less than one, the leading-order coefficient for the velocity suffices, as discussed in detail by Eggers [2]. The result is a system of equations for the local radius $h(z,t)$ of the fluid neck and the average axial velocity $v(z,t)$. All other terms are of higher order in $h$ or the radial variable $r$. For a liquid with kinematic viscosity $\nu$, density $\rho$, and surface tension $\gamma$ (neglecting the effect of the outer gas), the result of the calculation is:

$$\partial_t h^2 + \partial_z(v h^2) = 0, \tag{1}$$

$$\underbrace{\partial_t v + v\,\partial_z v}_{\text{inertia}} \;=\; \underbrace{-\frac{\gamma}{\rho}\,\partial_z\!\left(\frac{1}{R_1}+\frac{1}{R_2}\right)}_{\text{surface tension}} \;+\; \underbrace{3\nu\,\frac{\partial_z(\partial_z v\,h^2)}{h^2}}_{\text{viscosity}}. \tag{2}$$
The simplification achieved by (1) and (2) is enormous. Firstly, the dimension of the problem has been reduced by one (the radial variable has been eliminated). Secondly, the moving boundary is described explicitly by $h(z,t)$. Equation (1) expresses the conservation of mass: it is written as a conservation equation for the volume $\pi h^2\,dz$ of a slice of fluid. Equation (2) is a balance of forces acting on a fluid element, and thus very similar in structure to the original Navier–Stokes equation [3]. The l.h.s. of (2) corresponds to inertial forces, driven by the surface tension and viscous forces on the right. As to be expected from Laplace's formula, surface tension forces are proportional to the mean curvature, which for a body of rotation is

$$\frac{1}{R_1} + \frac{1}{R_2} = \frac{1}{h\left[1+(\partial_z h)^2\right]^{1/2}} - \frac{\partial_{zz} h}{\left[1+(\partial_z h)^2\right]^{3/2}}. \tag{3}$$
Strictly speaking, the radial expansion implied by (1) and (2) would have required us to replace the mean curvature by the leading-order expression $1/h(z,t)$ alone. This is indeed sufficient to describe the neighborhood of the pinch point, but the applicability of the equations is greatly enhanced by including the full curvature, because the equations then include a spherical drop among their equilibrium solutions. The remarkable power of the system (1) and (2) in describing a real breakup event is illustrated by Fig. 2. The sequence of experimental pictures shows a drop of water falling from a pipette 5.2 mm in diameter. The drop is shown at the moment of the first bifurcation (first picture), after which the
fluid neck recoils from the drop (second picture). Shortly afterward the neck pinches on its other end (third picture), thereby forming a satellite drop. Such satellite drops are seen to be a direct consequence of the long neck that is forming at pinch-off, which in turn reflects the profile being extremely asymmetric around the pinch-point: on one side, the profile asymptotes to the drop; on the other side it is very flat and forms a slender neck. It is evident from Fig. 2 that the one-dimensional approximation works extremely well in describing breakup, and the formation of satellite drops. This includes regions near the drop, where the profile is actually quite steep, so the expansion underlying (1) and (2) is formally not valid. A careful assessment of the quality of one-dimensional approximations, achieved through comparison with accurate numerical simulations of the full Navier–Stokes equation, is given in Ref. [5]. As discussed in the introduction, it is not clear how to pass from the first panel in Fig. 2 to the second on the basis of (1) and (2), but rather some "surgery" was necessary. Namely, when the minimum neck radius was just $10^{-4}$ times the original radius, it was cut and spherical caps were placed on either side. Below we will justify this procedure on the basis of a more detailed understanding of the dynamics at the pinch-point.
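To give a feeling for how compact the one-dimensional model is, here is a toy explicit integrator of Eqs. (1)–(3) on a periodic domain, in dimensionless units with $\gamma/\rho = 1$; the grid, viscosity, and time step are our illustrative choices, and serious pinch-off computations use implicit time stepping and adaptive resolution instead.

```python
import numpy as np

# Periodic liquid column of radius 1 with a long-wavelength perturbation
# (wavenumber k = 1/4 < 1, i.e., Rayleigh-Plateau unstable).
N, L, nu = 128, 8 * np.pi, 1.0
z = np.arange(N) * (L / N)
dz = L / N
h = 1.0 + 0.2 * np.cos(2 * np.pi * z / L)
v = np.zeros(N)

def ddz(f):
    """Centered periodic first derivative."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dz)

def curvature(h):
    """Full mean curvature of the axisymmetric surface, Eq. (3)."""
    hz = ddz(h)
    hzz = (np.roll(h, -1) - 2 * h + np.roll(h, 1)) / dz**2
    return 1.0 / (h * np.sqrt(1 + hz**2)) - hzz / (1 + hz**2) ** 1.5

dt, t_end = 5e-4, 5.0
for _ in range(int(t_end / dt)):
    dv = -v * ddz(v) - ddz(curvature(h)) + 3 * nu * ddz(ddz(v) * h**2) / h**2
    dh2 = -ddz(v * h**2)              # mass conservation, Eq. (1)
    v += dt * dv
    h = np.sqrt(np.maximum(h**2 + dt * dh2, 1e-12))
    if h.min() < 0.05:                # stop well before pinch-off
        break
print("minimum neck radius (started at 0.8):", round(h.min(), 3))
```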
2. Similarity Solutions
We now turn to the immediate neighborhood of the pinch-point, where separation occurs. Since the evolution takes place on length and time scales much smaller than any externally applied scales, such as the diameter $D$ of the capillary in Fig. 2, the motion should be properly measured in some intrinsic units of the fluid. The only such units of length and time that can be formed from the fluid parameters are

$$\ell_\nu = \frac{\nu^2\rho}{\gamma} \qquad \text{and} \qquad t_\nu = \frac{\nu^3\rho^2}{\gamma^2}. \tag{4}$$
As expected intuitively, length and time scales increase with viscosity, which can vary greatly between different fluids. For water, the viscous length scale $\ell_\nu$ is just 10 nm, far below anything visible on the scale of the photographs in Figs. 2 and 3. For glycerol, on the other hand, $\ell_\nu$ is on the order of centimeters, and the asymptotics described below is easily observable. Since our description is local, it is clear that we have to represent the motion in a local coordinate system. The only reasonable choice for its origin is the point $z_0$ and time $t_0$ where the singularity occurs. Making space and time dimensionless using $\ell_\nu$ and $t_\nu$, we introduce
$$z' = \frac{z - z_0}{\ell_\nu} \qquad \text{and} \qquad t' = \frac{t_0 - t}{t_\nu}. \tag{5}$$
Now representing the spatial profile as well as the velocity field in these coordinates,

$$h(z,t) = \ell_\nu\,H(z',t'), \qquad v(z,t) = \frac{\ell_\nu}{t_\nu}\,V(z',t'), \tag{6}$$
we expect the new functions $H$ and $V$ to represent properties of the singularity alone. In particular, we hope that they are universal, independent of both the initial conditions and the material parameters of the fluid. Since no external scales are thus expected to come into play in the description of $H(z',t')$ and $V(z',t')$, these profiles should be invariant under a change of scale. This means both the height and the velocity profile should be self-similar:

$$H(z',t') = t'\,\phi\!\left(\frac{z'}{t'^{1/2}}\right), \qquad V(z',t') = t'^{-1/2}\,\psi\!\left(\frac{z'}{t'^{1/2}}\right). \tag{7}$$
The meaning of (7) is that the shape of the profiles does not change as a function of time; only the radial and the axial scales are adjusted as $t'$ goes to zero. The exponents implicit in (7) were computed from the requirement that all terms in the equations balance, i.e., that inertial, surface tension, and viscous forces are of the same order close to the singularity. In particular, two things are noteworthy about the exponents. First, the neck radius shrinks linearly with $t'$ as the singularity is approached, while the corresponding axial scale only shrinks like $t'^{1/2}$. This implies that the profile is asymptotically slender, and the assumptions underlying the derivation of (1) and (2) were justified. Second, the exponent of the velocity is negative, so the motion is increasingly fast close to the singularity. This is not unexpected, since ever stronger surface tension forces are driving increasingly small fluid necks. Once the neck reaches microscopic size, of course, the description in terms of a velocity field becomes meaningless, so there is no danger of truly infinite velocities looming here. Finally, once the self-similar form (7) is re-introduced into the equations of motion (1) and (2), one obtains a set of ordinary differential equations for the similarity profiles $\phi(\xi)$ and $\psi(\xi)$ alone.
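For the record, the exponent count can be made explicit (a sketch in the intrinsic units (4), in which $\gamma/\rho$ and $\nu$ both equal one): with $H \sim t'$ and $z' \sim t'^{1/2}$, each term of (2) scales as

$$\partial_t v \;\sim\; v\,\partial_z v \;\sim\; t'^{-3/2}, \qquad \partial_z\!\left(\frac{1}{h}\right) \;\sim\; \frac{t'^{-1}}{t'^{1/2}} = t'^{-3/2}, \qquad \frac{\partial_z(\partial_z v\,h^2)}{h^2} \;\sim\; \frac{t'^{-1/2}}{t'} = t'^{-3/2},$$

so inertia, surface tension, and viscosity indeed remain of the same order as $t' \to 0$; no other choice of exponents in (7) achieves this three-way balance.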
A more thorough analysis of the structure of the equations reveals [2] that there is only one universal solution of them, once proper boundary conditions are imposed at $\xi = \pm\infty$. These are derived from the condition that matching must be possible to the macroscopic profiles farther away, which evolve on much longer time scales than the self-similar solution itself. A remarkable consequence of this universality is that the minimum neck radius, at a given time away from the point of breakup, is a quantity that is independent of the initial radius [2]:

$$h_{\min} = 0.03\,\frac{\gamma}{\nu\rho}\,(t_0 - t). \tag{8}$$
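As a worked number (our own arithmetic with standard handbook values for water, $\gamma \approx 0.072$ N/m, $\nu \approx 10^{-6}$ m²/s, $\rho \approx 10^3$ kg/m³): $\gamma/(\nu\rho) \approx 72$ m/s, so ten microseconds before breakup Eq. (8) gives $h_{\min} \approx 0.03 \times 72\ \text{m/s} \times 10^{-5}\ \text{s} \approx 22\ \mu$m, while $\ell_\nu = \nu^2\rho/\gamma \approx 14$ nm, consistent with the estimate of 10 nm quoted above.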
To look at a comparison between theory and experiment in more detail, Fig. 4 shows three successive images of a jet of glycerol pinching off to form a drop (a small section of which is seen on the right). Once the temporal distance from the singularity is known (from experiment), the profile can be predicted without adjustable parameters (dark continuous lines). The only difference between the three sets of lines is that the axes have been rescaled by the factor implied by (7). The universality of the solution described by (7) of course implies that it holds equally well for the pinching of the drop of water shown in Fig. 2 as it does for the glycerol jet of Fig. 4. The reason this common feature is not immediately apparent is that $\ell_\nu$ is extremely small for water, so one would have to observe the neighborhood of the point of breakup in Fig. 2 under extreme magnification. This means that only on a very small scale will all three forces contributing to (2) come into play. For the parts of the evolution where the minimum radius $h_{\min}$ is much larger than $\ell_\nu$, one can neglect viscosity, so that Fig. 2 is effectively described by inviscid dynamics. Thus to understand the appearance of drop pinch-off on a given scale $D$ (such as the nozzle diameter), one has to take into account the phenomenon of
Figure 4. A sequence of interface profiles of a jet of glycerol close to the point of breakup (the center of the drop being formed is seen as a bright spot in the top picture). The experimental images correspond to t0 − t = 350, 298, and 46 µs (from top to bottom). Corresponding analytical solutions based on (7) are superimposed. (Experiment by T. Kowalewski, see Ref. [2]).
cross-over: if initially $D \gg \ell_\nu$, the dynamics is characterized by a balance of inertial and surface tension forces. As $h_{\min}$ reaches $\ell_\nu$, the dynamics changes toward an inertial–surface tension–viscous balance. If on the other hand $D \ll \ell_\nu$ initially, inertia cannot play a significant role: the dynamics is dominated by viscosity and surface tension. In the course of this evolution, however, inertia becomes increasingly important and finally catches up with the other two. As a result, the same universal solution as before is finally observed. To each of the new balances described above corresponds a new similarity solution, distinct from (7) [2]. For example, the inertia–surface tension balance leads to a minimum drop radius that behaves like

$$h_{\min} = 0.7\left(\frac{\gamma}{\rho}\right)^{1/3}(t_0 - t)^{2/3}, \tag{9}$$

while the viscous–surface tension balance corresponds to

$$h_{\min} = 0.06\,\frac{\gamma}{\nu\rho}\,(t_0 - t). \tag{10}$$
The spatial structure of the corresponding similarity solutions largely explains the macroscopic appearance of high- and low-viscosity fluids, respectively. The axial and radial scales of the inviscid solution (9) both behave like $(t_0 - t)^{2/3}$, thus leading to a neck that is cone-shaped, consistent with Fig. 2. For its computation, the lubrication Eqs. (1) and (2) are inadequate; rather, the full equations for inviscid, irrotational flow have to be solved. The reason is that the interface profile turns over, so that the tip of the cone-shaped neck is actually inside the drop. These predictions of similarity theory have been confirmed by both experiment and full numerical simulations of the Navier–Stokes equations by Chen et al. [6]. Very viscous fluids, on the other hand, tend to form very elongated threads. This is reflected by the fact that the typical axial scale of the viscous solution (10) behaves like $(t_0 - t)^{0.175} \gg (t_0 - t)$. Interestingly, the exponent $\beta = 0.175\ldots$ is an irrational number coming from the solution of a transcendental equation [2]. This is an example of self-similarity of the second kind, in the classification of Ref. [7]. The striking difference in the behavior of high- and low-viscosity fluids is represented schematically in Fig. 5. It would be beyond the scope of this brief overview to mention all the recent developments in the study of drop pinch-off, some of which are discussed in Ref. [8]. To name some examples: the presence of an outer fluid significantly alters pinching, leading to new types of similarity solutions, with important applications for the physics of mixing. For extremely small jets, of the size of nanometers, thermal fluctuations have to be taken into account, which significantly alter the dynamics. This has been found using molecular dynamics simulations of a jet 6 nm in diameter. However, even on much larger
[Figure 5 schematic: −log(h_min) versus Re = √(D/ℓ_ν), with the scaling regimes h_min = 0.06 ℓ_ν t′, h_min = 0.7 ℓ_ν t′^{2/3}, and h_min = 0.03 ℓ_ν t′ indicated.]

Figure 5. A graphical representation of some of the scaling regimes that can be observed in droplet pinching, depending on the viscosity of the liquid. For high viscosity (Re small), threads form as a drop falls from a pipette of diameter D. In the opposite case of low viscosity (Re large), the pinching neck is conical. As the neck radius goes to zero, however, one always ends up with the same universal scaling regime. (Photographs by X.D. Shi and S. Nagel, see Ref. [2]).
scales, small perturbations to the observed similarity solutions can be important. In fact, the threads shown in Fig. 4 are quite sensitive to perturbations, and a careful examination of the last panel shows (unfortunately obscured by the drawn lines) the growth of disturbances on the thread [2]. In other words, the question of what resolution (experimental or numerical) is necessary near the point of breakup depends very much on what one is interested in: a lot of detail may be buried within a pinching event, which may or may not be important for a given question. If one is trying to describe topological transitions numerically, one will always have to renounce the description of the dynamics below some cut-off length. It is therefore important to understand the mechanisms which guarantee the uniqueness of the continuation across the singularity. The key is again
1412
J. Eggers
the universality of similarity solutions we already found in the approach to the singularity. A new set of similarity solutions can be found after breakup – one exists on either side of the pinch-point. This is illustrated in Fig. 6, which shows some typical predictions of the similarity theory. On one side one sees the rapid retraction of a very thin needle, on the other a small protrusion is left initially on the drop, which quickly heals off to form a smooth drop. The difference to the similarity solution before breakup lies of course in the boundary condition at the retracting tip. A closer analysis reveals [2] that the similarity solutions after breakup depends on the boundary conditions for both the height and the velocity as one moves away to infinity. The crucial condition that guarantees unique continuation is the fact that both profiles to the left and right of the point of breakup have to coincide with the solutions before breakup as one moves to infinity. Information between the solutions before and after breakup is thus passed on solely on the basis of the far-field
(a)
72µm t ⫺ t0 ⫽ ⫺114µs, ⫺63µs, ⫺11µs
Before breakup
(b)
72µm
t ⫺ t0 ⫽ 11µs, 63µs, 114µs
After breakup
Figure 6. The breakup of a mixture of glycerol in four parts of ethanol, as calculated from similarity solutions before and after breakup. Part (a) shows three profiles before breakup, in time distances of 46 µs, corresponding to |t | = 1, 0.55, and 0.1. In part (b) the same is shown after breakup.
Breakup and coalescence of free surface flows
1413
behavior. Whatever microscopic physics determines the actual breakup event is irrelevant to the continuation.
3.
Coalescence
As we have seen above, the understanding of drop breakup is aided greatly by the universality of the observed solutions. One is able to almost completely disregard the free-surface motion away from the point of breakup. The main difficulty in finding a unifying picture for drop coalescence lies in the fact that one cannot disregard the drop motion that leads to the meeting of the two drops, resulting in a number of problems. Firstly, the motion produced by the purely geometrical overlap between two approaching drops is comparable or faster than the motion generated by surface tension. Hence the velocity of approach must come into play when describing the dynamics of coalescence. Secondly, the fluid caught between two approaching drops cannot be ignored, even if its viscosity is very small. The reason is that a very thin lubrication film between two drops will still produce an appreciable pressure, which deforms the drops prior to their meeting at a point. Thirdly, the mechanism that leads to the first small-scale union between the drops is not well understood. In particular, the presence of surfactants on the surface produces barriers that have to be overcome, which can significantly delay reconnection, as shown by Amarouchene et al. [9]. We will therefore focus on the simplest case of a vanishing speed of approach, in which case the ensuing dynamics is determined by the fluid parameters and the radius R of the drops alone. If the two drops do not have equal radius, the one with the smaller radius will play the dominant role and effectively replace R. Since the motion starts from rest, it will initially be slow, so the driving by surface tension is counteracted by viscosity alone. This behavior will persist until the radius rm of the liquid bridge connecting the two drops has reached ν , after which it crosses over to one where only inertia matters and viscosity drops out of the problem. Finally, for rm ≈ R, the initially local motion in the bridge between the drops evolves to a global motion involving all of the fluid. The central idea in investigating the dynamics of coalescence is of course that for rm R the motion is self-similar, dominated by the local behavior close to the meniscus where the two drops meet. At the meniscus the curvature is extremely high, and thus leads to a driving that is confined to a ring-shaped region, whose radius rm is expanding. Turning first to the initial stage of viscous motion, the problem is thus one of a line force moving through an infinite medium. The force per unit length of the line is 2γ , and one has to compute the speed that results from it. It is one of the characteristic features of Stokes flow that to obtain a finite answer, logarithmic corrections come into play, for
1414
J. Eggers
which an upper and a lower cut-off is needed. The upper cut-off evidently is the radius of the drop itself, the width of the meniscus serves as the lower cut-off. The result of the calculation [10] is (η = νρ being the viscosity): rm (t) ∼ −
γ (α − 1) γ t ln t, η 2π Rη
(11)
where the width of the meniscus is assumed to scale like ∝ rmα . Interestingly, the value of α which determines the prefactor in (11) depends on the presence of an outer fluid between the drops. If no outer fluid is present, the correct value is α = 3, which can in fact be deduced from an exact solution (due to Hopper) to the two-dimensional analogue of the problem under study [10]. A closeup of this extremely sharp meniscus is shown in panel (a) of Fig. 7. Even a small amount of interstitial fluid, however, changes the situation considerably. Owing to the fact that the gap between the drops is exceedingly narrow, it is quite hard to push any fluid away from the advancing meniscus. Instead, the interstitial fluid is collected in a pouch at the meniscus (see Fig. 7), and is now much larger, so that one finds α = 3/2 [10]. If the drop fluid is very viscous (ν > R), this is all that can be said from the point of view of aymptotics. Among others, we have established that the motion is described by a well-defined asymptotic solution. Hence after a very short time, details of the microscopic mechanisms leading to coalescence have been “forgotten”. In a numerical simulation, “surgery” done on a sufficiently small scale will meet a similar fate, and one soon ends up following the unique physical solution. Finally, if ν R, there is a region where the motion is almost inviscid. From a balance of surface tension forces with inertial forces at the meniscus one deduces [10] that
rm ∝
γR ρ
1/4
t 1/2 .
(12)
This behavior has been confirmed by recent numerical simulations of Ref. [11]. However, there is an unexpected complication: as the meniscus retracts, capillary waves grow ahead of it, whose amplitude finally equals the width of the channel. Thus the two sides of the drops touch, and a toroidal void is enclosed. This process repeats itself, leaving behind a self-similar succession of voids. In summary, one can often obtain analytical solutions to the equations of motion near a singularity, explaining some universal features of breakup and coalescence events. This is important for estimating errors introduced by a given numerical procedure used to describe topological transitions. Matching numerics to known analytical solutions can lead to considerable savings in numerical effort.
Breakup and coalescence of free surface flows (a)
(b)
0.040
y
0.040
0.030
0.030
0.020
y 0.020
0.010
0.010
0.000 ⫺0.0010
⫺0.0005
0.0000
0.0005
1415
0.000 ⫺0.0010 ⫺0.0005
0.0010
x
0.0000
0.0005
0.0010
x
(c)
0.20
0.15
y
0.10
0.05
0.00 ⫺0.010
⫺0.005
0.000
0.005
0.010
x
Figure 7. A closeup of the point of contact during coalescence of two identical drops for the two cases of no outer fluid, (a), and two fluids of equal viscosity, ((b) and (c)). Part (a) is Hopper’s solution (no outer fluid) for rm /R = 10−3 , 10−2.5 , 10−2 , and 10−1.5 . Part (b) is a numerical simulation of the case where the inner and outer viscosities are the same, showing fluid that collects in a bubble at the meniscus. Note that the two axes are scaled differently, so the bubble is almost circular. For large values of rm , as shown in (c), the fluid finally escapes from the bubble.
References [1] R. Scardovelli and S. Zaleski, “Direct numerical simulation of free-surface and interfacial flow,” Annu. Rev. Fluid Mech., 31, 567–603, 1999. [2] J. Eggers, “Non-linear dynamic and breakup of free-surface flows,” Rev. Mod. Phys., 69, 865–929, 1997. [3] L.D. Landau and E.M. Lifshitz, Fluid Mechanics, Pergamon, Oxford, 1984. [4] A. Menchaca-Rocha et al., “Coalescence of liquid drops by surface tension,” Phys. Rev. E, 63, 046309, 1–5, 2001.
1416
J. Eggers
[5] B. Ambravaneswaran, E.D. Wilkes, and O.A. Basaran, “Drop formation from a capillary tube: comparison of one-dimensional and two-dimensional analyses and occurence of satellite drops,” Phys. Fluids, 14, 2606–2621, 2002. [6] A.U. Chen, P.K. Notz, and O.A. Basaran, “Computational and experimental analysis of pinch-off and scaling,” Phys. Rev. Lett., 88, 174501, 1–4, 2002. [7] G.I. Barenblatt, Scaling, Self-Similarity, and Intermedeate Asymptotics, Cambridge, 1996. [8] S.P. Lin, Breakup of Liquid Sheets and Jets, Cambridge, 2003. [9] Y. Amarouchene, G. Cristobal, and H. Kellay, “Noncoalescing drops,” Phys. Rev. Lett., 87, 206104, 1–4, 2002. [10] J. Eggers, J.R. Lister, and H.A. Stone, “Coalescence of liquid drops,” J. Fluid Mech., 401, 293–310, 1999. [11] L. Duchemin, J. Eggers, and C. Josserand, “Inviscid coalescence of drops,” J. Fluid Mech., 487, 167–178, 2003.
4.10 CONFORMAL MAPPING METHODS FOR INTERFACIAL DYNAMICS Martin Z. Bazant1 and Darren Crowdy2 1
Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA 2 Department of Mathematics, Imperial College, London, UK
Microstructural evolution is typically beyond the reach of mathematical analysis, but in two dimensions certain problems become tractable by complex analysis. Via the analogy between the geometry of the plane and the algebra of complex numbers, moving free boundary problems may be elegantly formulated in terms of conformal maps. For over half a century, conformal mapping has been applied to continuous interfacial dynamics, primarily in models of viscous fingering and solidification. Current developments in materials science include models of void electro-migration in metals, brittle fracture, and viscous sintering. Recently, conformal-map dynamics has also been formulated for stochastic problems, such as diffusion-limited aggregation and dielectric breakdown, which has re-invigorated the subject of fractal pattern formation. Although restricted to relatively simple models, conformal-map dynamics offers unique advantages over other numerical methods discussed in this chapter (such as the Level–Set Method) and in Chapter 9 (such as the phase field method). By absorbing all geometrical complexity into a time-dependent conformal map, it is possible to transform a moving free boundary problem to a simple, static domain, such as a circle or square, which obviates the need for front tracking. Conformal mapping also allows the exact representation of very complicated domains, which are not easily discretized, even by the most sophisticated adaptive meshes. Above all, however, conformal mapping offers analytical insights for otherwise intractable problems. After reviewing some elementary concepts from complex analysis in Section 1, we consider the classical application of conformal mapping methods to continuous-time interfacial free boundary problems in Section 2. This includes cases where the governing field equation is harmonic, biharmonic, or in a more general conformally invariant class. In Section 3, we discuss the 1417 S. Yip (ed.), Handbook of Materials Modeling, 1417–1451. c 2005 Springer. Printed in the Netherlands.
1418
M.Z. Bazant and D. Crowdy
recent use of random, iterated conformal maps to describe analogous discretetime phenomena of fractal growth. Although most of our examples involve planar domains, we note in Section 4 that interfacial dynamics can also be formulated on curved surfaces in terms of more general conformal maps, such as stereographic projections. We conclude in Section 5 with some open questions and an outlook for future research.
1.
Analytic Functions and Conformal Maps
We begin by reviewing some basic concepts from complex analysis found in textbooks such as Churchill and Brown [1]. For a fresh geometrical perspective, see Needham [2]. A general function of a complex variable depends on the real and imaginary parts, x and y, or, equivalently, on the linear combinations, z = x + i y and z¯ = x − i y. In contrast, an analytic function, which is differentiable in some domain, can be written simply as w = u + iv = f (z). The condition, ∂ f /∂ z¯ = 0, is equivalent to the Cauchy–Riemann equations, ∂u ∂v = ∂x ∂y
and
∂u ∂v =− , ∂y ∂x
(1)
which follow from the existence of a unique derivative, f =
∂v ∂f ∂v ∂u ∂ f ∂u = +i = = −i , ∂x ∂x ∂ x ∂(i y) ∂ y ∂y
(2)
whether taken in the real or imaginary direction. Geometrically, analytic functions correspond to special mappings of the complex plane. In the vicinity of any point where the derivative is nonzero, f (z) =/ 0, the mapping is locally linear, dw = f (z) dz. Therefore, an infinitesimal vector, dz, centered at z is transformed into another infinitesimal vector, dw, centered at w = f (z) by a simple complex multiplication. Recalling Euler’s formula, (r1 eiθ1 )(r2 eiθ2 ) = (r1r2 )ei(θ1 + θ2 ) , this means that the mapping causes a local stretch by | f (z)| and local rotation by arg f (z), regardless of the orientation of dz. As a result, an analytic function with a nonzero derivative describes a conformal mapping of the plane, which preserves the angle between any pair of intersecting curves. Intuitively, a conformal mapping smoothly warps one domain into another with no local distortion. Conformal mapping provides a very convenient representation of free boundary problems. The Riemann Mapping Theorem guarantees the existence of a unique conformal mapping between any two simply connected domains, but the challenge is to derive its dynamics for a given problem. The only constraint is that the conformal mapping be univalent, or one-to-one, so that physical fields remain single-valued in the evolving domain.
Conformal mapping methods for interfacial dynamics
2. 2.1.
1419
Continuous Interfacial Dynamics Harmonic Fields
Most applications of conformal mapping involve harmonic functions, which are solutions to Laplace’s equation, ∇ 2 φ = 0.
(3)
From Eq. (1), it is easy to show that the real and imaginary parts of an analytic function are harmonic, but the converse is also true: Every harmonic function is the real part of an analytic function, φ = Re , the complex potential. This connection easily produces new solutions to Laplace’s equation in different geometries. Suppose that we know the solution, φ(w) = Re (w), in a simply connected domain in the w-plane, w , which can be reached by conformal mapping, w = f (z, t), from another, possibly time-dependent domain in the z-plane, z (t). A solution in z (t) is then given by φ(z, t) = Re (w) = Re ( f (z, t))
(4)
because ( f (z)) is also analytic, with a harmonic real part. The only caveat is that the boundary conditions be invariant under the mapping, which holds for Dirichlet (φ = constant) or Neumann (nˆ · ∇ φ = 0) conditions. Most other boundary conditions invalidate Eq. (4) and thus complicate the analysis. The complex potential is also convenient for calculating the gradient of a harmonic function. Using Eqs. (1) and (2), we have ∇z φ =
∂φ ∂φ +i = , ∂x ∂y
(5)
where ∇z is the complex gradient operator, representing the vector gradient, ∇ , in the z-plane.
2.1.1. Viscous fingering and solidification The classical application of conformal-map dynamics is to Laplacian growth, where a free boundary, Bz (t), moves with a (normal) velocity, dz ∝ ∇ φ, (6) dt proportional to the gradient of a harmonic function, φ, which vanishes on the boundary [3]. Conformal mapping for Laplacian growth was introduced independently by Polubarinova–Kochina and Galin in 1945 in the context of ground-water flow, where φ is the pressure field and u = (k/η)∇ ∇ φ is the velocity of the fluid of viscosity, η, in a porous medium of permeability, k, according v=
1420
M.Z. Bazant and D. Crowdy
to Darcy’s law. Laplace’s equation follows from incompressibility, ∇ · u = 0. The free boundary represents an interface with a less viscous, immiscible fluid at constant pressure, which is being forced into the more viscous fluid. In physics, Laplacian growth is viewed as a fundamental model for pattern formation. It also describes viscous fingering in Hele–Shaw cells, where a bubble of fluid, such as air, displaces a more viscous fluid, such as oil, in the narrow gap between parallel flat plates. In that case, the depth averaged velocity satisfies Darcy’s law in two dimensions. Laplacian growth also describes dendritic solidification in the limit of low undercooling, where φ is the temperature in the melt [4]. To illustrate the derivation of conformal-map dynamics, let us consider viscous fingering in a channel with impenetrable walls, as shown in Fig. 1(a). The viscous fluid domain, z (t), lies in a periodic horizontal strip, to the right of the free boundary, Bz (t), where uniform flow of velocity, U , is assumed far ahead of the interface. It is convenient to solve for the conformal map, z = g(w, t), to this domain from a half strip, Re w > 0, where the pressure is simply linear, φ = Re Uw/µ. We also switch to dimensionless variables, where length is scaled to a characteristic size of the initial condition, L, pressure to UL/µ, and time to L/U . Since ∇w φ = 1 in the half strip, the pressure gradient at a point, z = g(w, t), on the physical interface is easily obtained from Eq. (30): ∂f = ∇z φ = ∂z
∂g ∂w
−1
(7)
(a)
(b)
3 4
2 1
2
0
0
1
2
2 4 3 4
3
2
1
0
1
2
3
2
0
2
4
6
Figure 1. Exact solutions for Laplacian growth, a simple model of viscous fingering: (a) a Saffman–Taylor finger translating down an infinite channel, showing iso-pressure curves (dashed) and streamlines (solid) in the viscous fluid, and (b) the evolution of a perturbed circular bubble leading to cusp singularities in finite time. (Courtesy of Jaehyuk Choi.)
Conformal mapping methods for interfacial dynamics
1421
where w = f (z, t) is the inverse mapping (which exists as long as the mapping remains univalent). Now consider a Lagrangian marker, z(t), on the interface, whose pre-image, w(t), lies on the imaginary axis in the w-plane. Using the chain rule and Eq. (7), the kinematic condition, Eq. (6), becomes, ∂g dw dz ∂g = + = dt ∂t ∂w dt
∂g ∂w
−1
.
(8)
Multiplying by ∂g/∂w =/ 0, this becomes
∂g ∂g ∂g 2 dw + = 1. ∂w ∂t ∂w dt
(9)
Since the pre-image moves along the imaginary axis, Re(dw/dt) = 0, we arrive at the Polubarinova–Galin equation for the conformal map:
Re
∂g ∂g ∂w ∂t
= 1,
for Re w = 0.
(10)
From the solution to Eq. (10), the pressure is given by φ = Re f (z, t). Note that the interfacial dynamics is nonlinear, even though the quasi-steady equations for φ are linear. The best-known solutions are the Saffman–Taylor fingers, t (11) g(w, t) = + w + 2(1 − λ) log(1 + e−w ) λ which translate at a constant velocity, λ−1 , without changing their shape [5]. Note that (11) is a solution to the fingering problem for all choices of the parameter λ. This parameter specifies the finger width and can be chosen arbitrarily in the solution (11). In experiments however, it is found that the viscous fingers that form are well fit by a Saffman–Taylor finger filling precisely half of the channel, that is with λ = 1/2, as shown in Fig. 1(a). Why this happens is a basic problem in pattern selection, which has been the focus of much debate in the literature over the last 25 years. To understand this problem, note that the viscous finger solutions (11) do not include any of the effects of surface tension on the interface between the two fluids. The intriguing pattern selection of the λ = 1/2 finger has been attributed to a singular perturbation effect of small surface tension. Surface tension, γ , is a significant complication because it is described by a non-conformally-invariant boundary condition, φ = γ κ,
for z ∈ Bz (t)
(12)
where κ is the local interfacial curvature, entering via the Young–Laplace pressure. Small surface tension can be treated analytically as a singular perturbation to gain insights into pattern selection [6, 7]. Since surface tension
1422
M.Z. Bazant and D. Crowdy
effects are only significant at points of high curvature κ in the interface, and given that the finger in Fig. 1(a) is very smooth with no such points of high curvature, it is surprising that surface tension acts to select the finger width. Indeed, the viscous fingering problem has been shown to be full of surprises [8]. In a radial geometry, the univalent mapping is from the exterior of the unit circle, |w| = 1, to the exterior of a finite bubble penetrating an infinite viscous liquid. Bensimon and Shraiman [9] introduced a pole dynamics formulation, where the map is expressed in terms of its zeros and poles, which must lie inside the unit circle to preserve univalency. They showed that Laplacian growth in this geometry is ill-posed, in the sense that cusp-like singularities occur in finite time (as a zero hits the unit circle) for a broad class of initial conditions, as illustrated in Fig. 1(b). (See Howison [3] for a simple, general proof due to Hohlov.) This initiated a large body of work on how Laplacian growth is “regularized” by surface tension or other effects in real systems. Despite the analytical complications introduced by surface tension, several exact steady solutions with non-zero surface tension are known [10, 11]. Surface tension can also be incorporated into numerical simulations based on the same conformal-mapping formalism [12], which show how cusps are avoided by the formation of new fingers [13]. For example, consider a threefold perturbation of a circular bubble, whose exact dynamics without surface tension is shown in Fig. 1(b). With surface tension included, the evolution is very similar until the cusps begin to form, at which point the tips bulge outward and split into new fingers, as shown in Fig. 2. This process repeats itself to produce a complicated fractal pattern [14], which curiously resembles the diffusion-limited particle aggregates discussed below in Section 3.
2.1.2. Density-driven instabilities in fluids An important class of problems in fluid mechanics involves the nonlinear dynamics of an interface between two immiscible fluids of different densities. In the presence of gravity, there are some familiar cases. Deep-water waves involve finite disturbances (such as steady “Stokes waves”) in the interface between lighter fluid (air) over a heavier fluid (water). With an inverted density gradient, the Rayleigh–Taylor instability develops when a heavier fluid lies above a lighter fluid, leading to large plumes of the former sinking into the latter. Tanveer [15] has used conformal mapping to analyze the Rayleigh– Taylor instability and has provided evidence to associate the formation of plumes with the approach of various conformal mapping singularities to the unit circle. A related problem is the Richtmyer–Meshkov instability, which occurs when a shock wave passes through an interface between fluids of different
Conformal mapping methods for interfacial dynamics
1423 4 3 2 1 0 ⫺1 ⫺2 ⫺3
4
3
2
1
0
⫺1
⫺2
⫺3
⫺4
⫺4
Figure 2. Numerical simulation of viscous fingering, starting from a three-fold perturbation of a circular bubble. The only difference with the Laplacian-growth dynamics in Fig. 1(b) is the inclusion of surface tension, which prevents the formation of cusp singularities. (Courtesy of Michael Siegel.)
densities. Behind the shock, narrow fingers of the heavier fluid penetrate into the lighter fluid. The shock wave usually passes so quickly that compressibility only affects the onset of the instability, while the nonlinear evolution occurs much faster than the development of viscous effects. Therefore, it is reasonable to assume a potential flow in each fluid region, with randomly perturbed initial velocities. Although real plumes roll up in three dimensions and bubbles can form, conformal mapping in two dimensions still provides some insights, with direct relevance for shock tubes of high aspect ratio. A simple conformal-mapping analysis is possible for the case of a large density contrast, where the lighter fluid is assumed to be at uniform pressure. The Richtmyer–Meshkov instability (zero-gravity limit) is then similar to the Saffman–Taylor instability, except that the total volume of each fluid is fixed. A periodic interface in the y direction, analogous to the channel geometry in Fig. 1, can be described by the univalent mapping, z = g(w, t), from the
1424
M.Z. Bazant and D. Crowdy
interior of the unit circle in the mathematical w plane to the interior of the heavy-fluid finger in the physical z-plane. Zakharov [16] introduced a Hamiltonian formulation of the interfacial dynamics in terms of this conformal map, taking into account kinetic and potential energy, but not surface tension. One way to derive equations of motion is to expand the map in a Taylor series, g(w, t) = log w +
∞
an (t)w n ,
|w| < 1.
(13)
n=0
(The log w term first maps the disk to a periodic half strip.) On the unit circle, w = eiθ , the pre-image of the interface, this is simply a complex Fourier series. The Taylor coefficients, an (t), act as generalized coordinates describing n-fold shape perturbations within each period, and their time derivatives, a˙ n (t), act as velocities or momenta. Unfortunately, truncating the Taylor series results in a poor description of strongly nonlinear dynamics because the conformal map begins to vary wildly near the unit circle. An alternate approach used by Yoshikawa and Balk [17] is to expand in terms resembling Saffman–Taylor fingers, g(w, t) = log w + b(t) −
N
bn (t) log(1 − λn (t)w),
(14)
n=1
which can be viewed as a re-summation of the Taylor series in Eq. (13). As shown in Fig. 3, exact solutions exist with only a finite number of terms in the finger expansion, as long as the new generalized coordinates, λn (t), stay inside the unit disk, |λn | < 1. This example illustrates the importance of the choice of shape functions in the expansion of the conformal map, e.g., w n vs. log(1 − λn w).
2.1.3. Void electro-migration in metals Micron-scale interconnects in modern integrated circuits, typically made of aluminum, sustain enormous currents and high temperatures. The intense electron wind drives solid-state mass diffusion, especially along dislocations and grain boundaries, where voids also nucleate and grow. In the narrowest and most fragile interconnects, grain boundaries are often well separated enough that isolated voids migrate in a fairly homogeneous environment due to surface diffusion, driven by the electron wind. Voids tend to deform into slits, which propagate across the interconnect, causing it to sever. A theory of void electro-migration is thus important for predicting reliability. In the simplest two-dimensional model [18], a cylindrical void is modeled as a deformable, insulating inclusion in a conducting matrix. Outside the void,
Conformal mapping methods for interfacial dynamics
1425
2 1 0
⫺1 ⫺2 ⫺3 ⫺4 ⫺5 ⫺6 ⫺7 ⫺6
⫺4
⫺2
0
2
4
6
Figure 3. Conformal-map dynamics for the strongly nonlinear regime of the RichtmyerMeshkov instability [17]. (Courtesy of Toshio Yoshikawa and Alexander Balk.)
the electrostatic potential, φ, satisfies Laplace’s equation, which invites the use of conformal mapping. The electric field, E = −∇ ∇ φ, is taken to be uniform far away and wraps around the void surface, due to a Neumann boundary condition, nˆ · E = 0. The difference with Laplacian growth lies in the kinematic condition, which is considerably more complicated. In place of Eq. (6), the normal velocity of the void surface is given by the surface divergence of the surface current, j , which takes the dimensionless form, nˆ · v =
∂ 2φ ∂ 2κ ∂j =χ 2 + 2, ∂s ∂s ∂s
(15)
where s is the local arc-length coordinate and χ is a dimensionless parameter comparing surface currents due to the electron wind force (first term) and due to gradients in surface tension (second term). This moving free boundary problem somewhat resembles the viscous fingering problem with surface tension, and it admits analogous finger solutions, albeit of width 2/3, not 1/2 [19]. To describe the evolution of a singly connected void, we consider the conformal map, z = g(w, t), from the exterior of the unit circle to the exterior of
1426
M.Z. Bazant and D. Crowdy
the void. As long as the map remains univalent, it has a Laurent series of the form, g(w, t) = A1 (t)w + A0 (t) +
∞
A−n (t)w −n ,
for |w| > 1,
(16)
n=1
where the Laurent coefficients, An (t), are now the generalized coordinates. As in the case of viscous fingering [3], a hierarchy of nonlinear ordinary differential equations (ODEs) for these coordinates can be derived. For void electromigration, Wang et al. [18] start from a variational principle accounting for surface tension and surface diffusion, using a Galerkin procedure. They truncate the expansion after 17 coefficients, so their numerical method breaks down if the void deforms significantly, e.g., into a curved slit. Nevertheless, as shown in Fig. 4(a), the numerical method is able to capture essential features of the early stages of strongly nonlinear dynamics. In the same regime, it is also possible to incorporate anisotropic surface tension or surface mobility. The latter involves multiplying the surface current by a factor (1 + gd cos mα), where α is the surface orientation in the physical z-plane, given at z = g(eiθ , t), by α = θ + arg
∂g iθ (e , t). ∂w
(17)
Some results are shown in Fig. 4(b), where the void develops dynamical facets.
(a)
(b)
Figure 4. Numerical conformal-mapping simulations of the electromigration of voids in aluminum interconnects [18]. (a) A small shape perturbation of a cylindrical void decaying (above) or deforming into a curved slit (below), depending on a dimensionless group, χ, comparing the electron wind to surface-tension gradients. (b) A void evolving with anisotropic surface diffusivity (χ = 100, gd = 100, m = 3). (Courtesy of Zhigang Suo.)
Conformal mapping methods for interfacial dynamics
1427
2.1.4. Quadrature domains We end this section by commenting on some of the mathematics underlying the existence of exact solutions to continuous-time Laplacian-growth problems. Significantly, much of this mathematics carries over to problems in which the governing field equation is not necessarily harmonic, as will be seen in the following section. The steadily-translating finger solution (11) of Saffman and Taylor turns out to be but one of an increasingly large number of known exact solutions to the standard Hele–Shaw problem. Saffman [20] himself identified a class of unsteady finger-like solutions. This solution was later generalized by Howison [21] to solutions involving multiple fingers exhibiting such phenomena as tip-splitting where a single finger splits into two (or more) fingers. It is even possible to find exact solutions to the more realistic case where there is a second interface further down the channel [22] which must always be the case in any experiment. Besides finger-like solutions which are characterized by time-evolving conformal mappings having logarithmic branch-point singularities, other exact solutions, where the conformal mappings are rational functions with timeevolving poles and zeros, were first identified by Polubarinova–Kochina and Galin in 1945. Richardson [23] later rediscovered the latter solutions while simultaneously presenting important new theoretical connections between the Hele–Shaw problem and a class of planar domains known as quadrature domains. The simplest example of a quadrature domain is a circular disc D of radius r centered at the origin which satisfies the identity
h(z) dx dy = πr 2 h(0),
(18)
D
where h(z) is any function analytic in the disc (and integrable over it). Equation (18), which is known as a quadrature identity since it holds for any analytic function h(z), is simply a statement of the well-known mean-value theorem of complex analysis [24]. A more general domain D, satisfying a generalized quadrature identity of the form
h(z) dx dy = D
N n k −1
c j k h ( j )(z k )
(19)
k=1 j =0
is known as a quadrature domain. Here, {z k ∈ C} is a set of points inside D and h ( j )(z) denotes the j th derivative of h(z). If one makes the choice h(z) = z n in (19) the resulting integral quantities have become known as the Richardson moments of the domain. Richardson showed that the dynamics of the Hele–Shaw problem is such as to preserve quadrature domains. That is, if the initial fluid domain in a Hele–Shaw cell is a quadrature domain at time
1428
M.Z. Bazant and D. Crowdy
t = 0, it remains a quadrature domain at later times (so long as the solution does not break down). This result is highly significant and provides a link with many other areas of mathematics including potential theory, the notion of balayage, algebraic curves, Schwarz functions and Cauchy transforms. Richardson [25] discusses many of these connections while Varchenko and Etingof [26] provide a more general overview of the various mathematical implications of Richardson’s result. Shapiro [27] gives more general background on quadrature domain theory. It is a well-known result in the theory of quadrature domains [27] that simply-connected quadrature domains can be parameterized by rational function conformal mappings from a unit circle. Given Richardson’s result on the preservation of quadrature domains, this explains why Polubarinova–Kochina and Galin were able to find time-evolving rational function conformal mapping solutions to the Hele–Shaw problem. It also underlies the pole dynamics results of Bensimon and Shraiman [9]. But Richardson’s result is not restricted to simply-connected domains; multiply-connected quadrature domains are also preserved by the dynamics. Physically this corresponds to time-evolving fluid domains containing multiple bubbles of air. Indeed, motivated by such matters, recent research has focused on the analytical construction of multiplyconnected quadrature domains using conformal mapping ideas [28, 29]. In the higher-connected case, the conformal mappings are no longer simply rational functions but are given by conformal maps that are automorphic functions (or, meromorphic functions on compact Riemann surfaces). The important point here is that understanding the physical problem from the more general perspective of quadrature domain theory has led the way to the unveiling of more sophisticated classes of exact conformal mapping solutions.
2.2.
Bi-Harmonic Fields
Although not as well known as conformal mapping involving harmonic functions, there is also a substantial literature on complex-variable methods to solve the bi-harmonic equation, ∇ 2 ∇ 2 ψ = 0,
(20)
which arises in two-dimensional elasticity [30] and fluid mechanics [31]. Unlike harmonic functions, which can be expressed in terms of a single analytic function (the complex potential), bi-harmonic functions can be expressed in terms of two analytic functions, f (z) and g(z), in Goursat form [24]: ψ(z, z¯ ) = Im [¯z f (z) + g(z)].
(21)
Note that ψ is no longer just the imaginary part of an analytic function g(z) but also contains the imaginary part of the non-analytic component z¯ f (z).
Conformal mapping methods for interfacial dynamics
1429
A difficulty with bi-harmonic problems is that the typical boundary conditions (see below) are not conformally invariant, so conformal mapping does not usually generate new solutions by simply a change of variables, as in Eq. (4). Nevertheless, the Goursat form of the solution, Eq. (21), is a major simplification, which enables analytical progress.
2.1.5. Viscous sintering Sintering describes a process by which a granular compact of particles (e.g., metal or glass) is raised to a sufficiently large temperature that the individual particles become mobile and release surface energy in such a way as to produce inter-particulate bonds. At the start of a sinter process, any two particles which are initially touching develop a thin “neck” which, as time evolves, grows in size to form a more developed bond. In compacts in which the packing is such that particles have more than one touching neighbor, as the necks grow in size, the sinter body densifies and any enclosed pores between particles tend to close up. The macroscopic material properties of the compact at the end of the sinter process depend heavily on the degree of densification. In industrial application, it is crucial to be able to obtain accurate and reliable estimates of the time taken for pores to close (or reduce to a sufficiently small size) within any given initial sinter body in order that industrial sinter times are optimized without compromising the macroscopic properties of the final densified sinter body. The fluid is modeled as a region D(t) of very viscous, incompressible fluid, in which the velocity field, u = (u, v) = (ψ y , −ψx )
(22)
is given by the curl of an out-of-plane vector, whose magnitude is a stream function, ψ(x, y, t), which satisfies the bi-harmonic equation [31]. On the boundary ∂ D(t), the tangential stress must vanish and the normal stress must be balanced by the uniform surface tension effect, i.e., − pn i + 2µei j = T κn i ,
(23)
where p is the fluid pressure, µ is the viscosity, T is the surface tension parameter, κ is the boundary curvature, n i denotes components of the outward normal n to ∂ D(t) and ei j is the usual fluid rate-of-strain tensor. The boundary is time-evolved in a quasi-steady fashion with a normal velocity, Vn , determined by the same kinematic condition, Vn = u · n, as in viscous fingering. In terms of the Goursat functions in (21) – which are now generally time-evolving – the stress condition (23) takes the form i f (z, t) + z f (¯z , t) + g (¯z , t) = − z s , 2
(24)
1430
M.Z. Bazant and D. Crowdy
where again s denotes arc length. Once f (z, t) has been determined from (24), the kinematic condition Im[z t z¯ s ] = Im[−2 f (z, t)¯z s ] −
1 2
(25)
is used to time-advance the interface. A significant contribution was made by Hopper [32] who showed, using complex variable methods based on the decomposition (21), that the problem for the surface-tension driven coalescence of two equal circular blobs of viscous fluid can be reduced to the evolution of a rational function conformal map, from a unit w-circle, of the form g(w, t) =
R(t)w . w 2 − a 2 (t)
(26)
The two time-evolving parameters R(t) and a(t) satisfy two coupled nonlinear ODEs. Figure 5 shows a sequence of shapes of the two coalescing blobs computed using Hopper’s solution. At large times, the configuration equilibrates to a single circular blob. While Hopper’s coalescence problem provides insight into the growth of the inter-particle neck region, there are no pores in this configuration and it is natural to ask whether more general exact solutions exist. Crowdy [33] reappraised the viscous sintering problem and showed, in direct analogy with Richardson’s result on Hele–Shaw flows, that the dynamics of the sintering problem is also such as to preserve quadrature domains. As in the Hele– Shaw problem, this perspective paved the way for the identification of new exact solutions, generalizing (26), for the evolution of doubly-connected fluid regions. Figure 6 shows the shrinkage of a pore enclosed by a typical “unit” in a doubly-connected square packing of touching near-circular blobs of viscous fluid. This calculation employs a conformal mapping to the doubly-connected fluid region (which is no longer a rational function but a more general automorphic function) derived by Crowdy [34] and, in the same spirit as Hopper’s solution (26), requires only the integration of three coupled nonlinear ODEs. The fluid regions shown in Fig. 6 are all doubly-connected quadrature domains. Richardson [35] has also considered similar Stokes flow problems using a different conformal mapping approach.
Figure 5. Evolution of the solution of Hopper [32] for the coalescence of two equal blobs of fluid under the effects of surface tension.
Conformal mapping methods for interfacial dynamics
1431
Figure 6. The coalescence of fluid blobs and collapse of cylindrical pores in a model of viscous sintering. This sequence of images shows an analytical solution by Crowdy [34] using complex-variable methods.
2.1.6. Pores in elastic solids Solid elasticity in two dimensions is also governed by a bi-harmonic function, the Airy stress function [30]. Therefore, the stress tensor, σi j , and the displacement field, u i , may be expressed in terms of two analytic functions, f (z) and g(z): σ22 + σ11 = f (z) + f (z), 2 σ22 − σ11 + iσ12 = z f (z) + g (z), 2 Y (u 1 + iu 2 ) = κ f (z) − z f (z) − g(z) 1+ν
(27) (28) (29)
where Y is Young’s modulus, ν is Poisson’s ratio, and κ = (3 − ν)/(1 + ν) for plane stress and κ = 3 − 4ν for plane strain. As with bubbles in viscous flow, the use of Goursat functions allows conformal mapping to be applied to bi-harmonic free boundary problems in elastic solids, without solving explicitly for bulk stresses and strains. For example, Wang and Suo [36] have simulated the dynamics of a singlyconnected pore by surface diffusion in an infinite stressed elastic solid. As in the case of void electromigration described above, they solve nonlinear ODEs for the Laurent coefficients of the conformal map from the exterior of the unit disk, Eq. (16). Under uniaxial tension, there is a competition between surface tension, which prefers a circular shape, and the applied stress, which drives elongation and eventually fracture in the transverse direction. The numerical
1432
M.Z. Bazant and D. Crowdy
method, based on the truncated Laurent expansion, is able to capture the transition from stable elliptical shapes at small applied stress to the unstable growth of transverse protrusions at large applied stress, although naturally it breaks down when cusps resembling crack tips begin to form.
2.3.
Non-Harmonic Conformally Invariant Fields
The vast majority of applications of conformal mapping fall into one of the two classes above, involving harmonic or bi-harmonic functions, where the connections with analytic functions, Eqs. (4) and (21), are cleverly exploited. It turns out, however, that conformal mapping can be applied just as easily to a broad class of problems involving non-harmonic fields, recently discovered by Bazant [37]. Of course, in planar geometry, the conformal map itself is described by an analytic function, but the fields need not be, as long as they transform in a simple way under conformal mapping. The most convenient fields satisfy conformally invariant partial differential equations (PDEs), whose forms are unaltered by a conformal change of variables. It is straightforward to transform PDEs under a conformal mapping of the plane, w = f (z), by expressing them in terms of complex gradient operator introduced above, ∇z =
∂ ∂ ∂ +i =2 , ∂x ∂y ∂z
(30)
which we have related to the z partial derivative using the Cauchy–Riemann equations, Eq. (1). In this form, it is clear that ∇z f = 0 if and only if f (z) is analytic, in which case ∇ z f = 2 f . Using the chain rule, also obtain the transformation rule for the gradient, ∇ z = f ∇w .
(31)
To apply this formalism, we write Laplace’s equation in the form, ∇ z2 φ = Re ∇z ∇ z φ = ∇z ∇ z φ = 0,
(32)
which assumes that mixed partial derivatives can be taken in either order. (Note that a · b = Re ab.) The conformal invariance of Laplace’s equation, ∇w ∇ w φ = 0, then follows from a simple calculation, ∇z ∇ z = (∇z f )∇ w + | f |2 ∇w ∇ w = | f |2 ∇w ∇ w ,
(33)
where ∇z f = 0 because f is also analytic. As a result of conformal invariance, any harmonic function in the w-plane, φ(w), remains harmonic in the
Conformal mapping methods for interfacial dynamics
1433
z-plane, φ( f (z)), after the simple substitution, w = f (z). We came to the same conclusion above in Eq. (4), using the connection between harmonic and analytic functions, but the argument here is more general and also applies to other PDEs. The bi-harmonic equation is not conformally invariant, but some other equations – and systems of equations – are. The key observation is that any “product of two gradients” transforms in the same way under conformal mapping, not only the Laplacian, ∇ · ∇ φ, but also the term, ∇ φ1 · ∇ φ2 = Re(∇φ1 )∇φ2 , which involves two real functions, φ1 and φ2 : Re(∇z φ1 ) ∇ z φ2 = | f |2 Re(∇w φ1 ) ∇ w φ2 .
(34)
(Todd Squires has since noted that the operator, ∇ φ1 × ∇ φ2 = Im(∇φ1 )∇φ2 , also transforms in the same way.) These observations imply the conformal invariance of a broad class of systems of nonlinear PDEs: N
ai ∇ 2 φi +
N
i =1
j =i
ai j ∇ φi · ∇ φ j +
N
bi j ∇ φi × ∇ φ j = 0,
(35)
j = i+1
where the coefficients ai (φ), ai j (φ), and bi j (φ) may be nonlinear functions of the unknowns, φ = (φ1 , φ2 , . . . , φ N ), but not of the independent variables or any derivatives of the unknowns. The general solutions to these equations are not harmonic and thus depend on both z and z. Nevertheless, conformal mapping works in precisely the same way: A solution, φ(w, w), can be mapped to another solution, φ( f (z), f (z)), by a simple substitution, w = f (z). This allows the conformal mapping techniques above (and below) to be extended to new kinds of moving free boundary problems.
2.1.7. Transport-limited growth phenomena For physical applications, the conformally invariant class, Eq. (35), includes the important set of steady conservation laws for gradient-driven flux densities, ∂ci = ∇ · Fi = 0, ∂t
Fi = ci ui − Di (ci ) ∇ ci ,
ui ∝ ∇ φ,
(36)
where {ci } are scalar fields, such as chemical concentrations or temperature, {Di (ci )} are nonlinear diffusivities, {ui } are irrotational vector fields causing advection, and φ is a potential [37]. Physical examples include advectiondiffusion, where φ is the harmonic velocity potential, and electrochemical transport, where φ is the non-harmonic electrostatic potential, determined implicitly by electro-neutrality.
1434
M.Z. Bazant and D. Crowdy
By modifying the classical methods described above for Laplacian growth, conformal-map dynamics can thus be formulated for more general, transportlimited growth phenomena [38]. The natural generalization of the kinematic condition, Eq. (6), is that the free boundary moves in proportion to one of the gradient-driven fluxes with velocity, v ∝ F1 . For the growth of a finite filament, driven by prescribed fluxes and/or concentrations at infinity, one obtains a generalization of the Polubarinova–Galin equation for the conformal map, z = g(w, t), from the exterior of the unit disk to the exterior of growing object, Re(w g gt ) = σ (w, t) on |w| = 1,
(37)
where σ (w, t) is the non-constant, time-dependent normal flux, nˆ · F1 , on the unit circle in the mathematical plane.
2.1.8. Solidification in a fluid flow A special case of the conformally invariant Eq. (35) has been known for almost a century: steady advection-diffusion of a scalar field, c, in a potential flow, u. The dimensionless PDEs are Pe u · ∇ c = ∇ 2 c,
u = ∇ φ,
∇ 2 φ = 0,
(38)
where we have introduced the P´eclet number, Pe = UL/D, in terms of a characteristic length, L, velocity, U , and diffusivity, D. In 1905, Boussinesq showed that Eq. (38) takes a simpler form in streamline coordinates, (φ, ψ), where = φ + iψ is the complex velocity potential: ∂c = Pe ∂φ
∂ 2c ∂ 2c + ∂φ 2 ∂ψ 2
(39)
because advection (the left hand side) is directed only along streamlines, while diffusion (the right hand side) also occurs in the transverse direction, along isopotential lines. From the general perspective above, we recognize this as the conformal mapping of an invariant system of PDEs of the form (36) to the complex plane, where the flow is uniform and any obstacles in the flow are mapped to horizontal slits. Streamline coordinates form the basis for Maksimov’s method for interfacial growth by steady advection-diffusion in a background potential flow, which has been applied to freezing in ground-water flow and vapor deposition on textile fibers [4, 39]. The growing interface is a streamline held at a fixed concentration (or temperature) relative to the flowing bulk fluid at infinity. This is arguably the simplest growth model with two competing transport processes, and yet open questions remain about the nonlinear dynamics, even without surface tension.
Conformal mapping methods for interfacial dynamics
1435
Figure 7. The exact self-similar solution, Eq. (40), for continuous advection-diffusion-limited growth in a uniform background potential flow (yellow streamlines) at the dynamical fixed point (Pe = ∞). The concentration field (color contour plot) is shown for Pe = 100. (Courtesy of Jaehyuk Choi.)
The normal flux distribution to a finite absorber in a uniform background flow, σ (w, t) in Eq. (37) is well known, but rather complicated [40], so it is replaced by asymptotic approximations for analytical work, such as √ σ ∼ 2 Pe/π sin(θ/2) as Pe → ∞, which is the fixed point of the dynamics. In this important limit, Choi et al. [41] have found an exact similarity solution,
(40) g(w, t) = A1 (t) w(w − 1), A1 (t) = t 2/3 √ iθ to Eq. (37) with σ (e , t) = A1 (t) sin(θ/2) (since Pe(t) ∝ A1 (t) for a fixed background flow). As shown in Fig. 7, this corresponds to a constant shape, 2/3 ◦ whose linear size grows like √ t , with a 90 cusp at the rear stagnation point, where a branch point of w(w − 1) lies on the unit circle. For any finite, Pe(t), however, the cusp is smoothed, and the map remains univalent, although other singularities may form. Curiously, when mapped to the channel geometry with log z, the solution (40) becomes a Saffman–Taylor finger of width, λ = 3/4.
3.
Stochastic Interfacial Dynamics
The continuous dynamics of conformal maps is a mature subject, but much attention is now focusing on analogous problems with discrete, stochastic dynamics. The essential change is in the kinematic condition: The expression for the interfacial velocity, e.g., Eq. (6), is re-interpreted as the probability
1436
M.Z. Bazant and D. Crowdy
density (per unit arc length) for advancing the interface with a discrete “bump”, e.g., to model a depositing particle. Continuous conformal-map dynamics is then replaced by rules for constructing and composing the bumps. This method of iterated conformal maps was introduced by Hastings and Levitov [42] in the context of Laplacian growth. Stochastic Laplacian growth has been discussed since the early 1980s, but Hastings and Levitov [42] first showed how to implement it with conformal mapping. They proposed the following family of bump functions,
f λ,θ (w) = eiθ f λ e−iθ w , |w| ≥ 1
f λ (w) = w 1−a
(41)
a
1−λ (1 + λ)(w + 1) w+1+ w 2 +1−2w −1 2w 1+λ
(42) as elementary univalent mappings of the exterior of the unit disk used to advance the interface (0 < a ≤ 1). The function, f λ,θ (w), places a bump of (approximate) area, λ, on the unit circle, centered at angle, θ. Compared to analytic functions of the unit disk, the Hastings–Levitov function (42) generates a much more localized perturbation, focused on the region between two branch points, leaving the rest of the unit circle unaltered √ [43]. For a = 1, the map produces a strike, which is a line segment of length λ emanating normal to the circle. For a = 1/2, the map is an invertible composition of simple linear, M¨obius and Joukowski transformations, which inserts a semi-circular bump on the unit circle. As shown in Fig. 8, this yields a good description of (a)
(b) 4
400
2
200
0
0
⫺2
⫺200
⫺4
⫺400 ⫺4
⫺2
0
2
4
⫺400
⫺200
0
200
400
Figure 8. Simulation of the aggregation of (a) 4 and (b) 10 000 particles using the Hastings– Levitov algorithm (a = 1/2). Color contours show the quasi-steady concentration (or probability) field for mobile particles arriving from infinity, and purple curves indicate lines of diffusive flux (or probability current). (Courtesy of Jaehyuk Choi and Benny Davidovitch.)
Conformal mapping methods for interfacial dynamics
1437
aggregating particles, although other choices, like a = 2/3, have also been considered [43]. Quantifying the effect of the bump shape remains a basic open question. Once the bump function is chosen, the conformal map, z = gn (w), from the exterior of the unit disk to the evolving domain with n bumps is constructed by iteration,
gn (w) = gn−1 f λn ,θn (w)
(43)
starting from the initial interface, given by g0 (w). All of the physics is contained in the sequence of bump parameters, {(λn , θn )}, which can be generated in different ways (in the w plane) to model a variety of physical processes (in the z-plane). As shown in Fig. 8(b), the interface often develops a very complicated, fractal structure, which is given, quite remarkably, by an exact mapping of the unit circle. The great advantage of stochastic conformal mapping over atomistic or Monte Carlo simulation of interfacial growth lies in its mathematical insight. For example, given the sequence {(λn , θn )} from a simulation of some physical growth process, the Laurent coefficients, Ak (n), of the conformal map, gn (w), as defined in Eq. (16), can be calculated analytically. For the bump function (42), Davidovitch et al. [43] provide a hierarchy of recursion relations, yielding formulae such as A1 (n) =
n
(1 + λm )a ,
(44)
m=1
and explain how to interpret the Laurent coefficients. For example, A1 is the conformal radius of the cluster, a convenient measure of its linear extent. It is also the radius of a grounded disk with the same capacitance (with respect to infinity) as the cluster. The Koebe “1/4 theorem” on univalent functions [44] ensures that the cluster (image of the unit disk) is always contained in a disk of radius 4A1 . The next Laurent coefficient, A0 , is the center of a uniformly charged disk, which would have the same asymptotic electric field as the cluster (if also charged). Similarly, higher Laurent coefficients encode higher multipole moments of the cluster. Mapping the unit circle with a truncated Laurent expansion defines the web, which wraps around the growing tips and exhibits a sort of turbulent dynamics, endlessly forming and smoothing cusp-like protrusions [42, 45]. The stochastic dynamics, however, does not suffer from finite-time singularities because the iterated map, by construction, remains univalent. In some sense, discreteness plays the role of surface tension, as another regularization of ill-posed continuum models like Laplacian growth.
1438
3.1.
M.Z. Bazant and D. Crowdy
Diffusion-Limited Aggregation (DLA)
The stochastic analog of Laplacian growth is the DLA model of Witten and Sander [46], illustrated in Fig. 8, in which particles perform random walks one-by-one from infinity until they stick irreversibly to a cluster, which grows from a seed at the origin. DLA and its variants (see below) provide simple models for many fractal patterns in nature, such as colloidal aggregates, dendritic electro-deposits, snowflakes, lightning strikes, mineral deposits, and surface patterns in ion-beam microscopy [14]. In spite of decades of research, however, DLA still presents theoretical mysteries, which are just beginning to unravel [47]. The Hastings–Levitov algorithm for DLA prescribes the bump parameters, {(λn , θn )}, as follows. As in Laplacian growth, the harmonic function for the concentration (or probability density) of the next random walker approaching an n-particle cluster is simply, φn (z) = A Re log gn−1 (z),
(45)
according to Eq. (4), since φ(w) = A Re log w = A log |w| is the (trivial) solution to Laplace's equation in the mathematical w-plane with φ = 0 on the unit disk and a circularly symmetric flux density, A, prescribed at infinity. Using the transformation rule, Eq. (31), we then find that the evolving harmonic measure, p_n(z)|dz|, for the nth growth event corresponds to a uniform probability measure, P_n(θ) dθ, for angles, θ_n, on the unit circle, w = e^{iθ}:

p_n(z)|dz| = |∇_z φ||dz| = (|∇_w φ|/|g'_{n−1}|)(|g'_{n−1}||dw|) = |∇_w φ||dw| = dθ/2π = P_n(θ) dθ,    (46)

where we set A = 1/2π for normalization, which implicitly sets the time scale. The conformal invariance of the harmonic measure is well known in mathematics, but the surprising result of Hastings and Levitov [42] is that all the complexity of DLA is slaved to a sequence of independent, uniform random variables. The complexity resides instead in the bump area, λ_n, which depends nontrivially on the current cluster geometry and thus on the entire history of random angles, {θ_m | m ≤ n}. For DLA, the bump area in the mathematical w-plane should be chosen such that it has a fixed value, λ_0, in the physical z-plane, equal to the aggregating particle area. As long as the new bump is sufficiently small, it is natural to try to correct only for the Jacobian factor

J_n(w) = |g'_n(w)|²    (47)
of the previous conformal map at the center of the new bump,

λ_n = λ_0 / J_{n−1}(e^{iθ_n}),    (48)
although it is not clear a priori that such a local approximation is valid. Note at least that |g'_n| → ∞, and thus λ_n → 0, as the cluster grows, so the approximation has a chance of working. Numerical simulations with the Hastings–Levitov algorithm do indeed produce nearly constant bump areas, as in Fig. 8. Nevertheless, much larger “particles”, which fill deep fjords in the cluster, occasionally occur where the map varies too wildly, as shown in Fig. 9(a). It is possible (but somewhat unsatisfying) to reject particles outside an “area acceptance window” to produce rather realistic DLA clusters, as shown in Fig. 9(b). It seems that the rejected large bumps are so rare that they do not much influence the statistical scaling properties of the clusters [48], although this issue is by no means rigorously resolved.
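A minimal Python sketch of the iteration is given below. The driver loop follows Eqs. (46) and (48) (uniform angles, Jacobian-corrected bump areas, numerically differenced |g'_{n−1}|), but the elementary map bump_map is only a crude analytic stand-in that pushes the boundary outward near the attachment point; it is neither exactly area-λ nor guaranteed univalent, and the true bump function (42) should be substituted in any real simulation:

import numpy as np

def bump_map(w, lam):
    # Crude analytic stand-in for the bump function (42): a pole just inside
    # the unit disk (at w = 0.9) pushes exterior points near w = 1 outward.
    # NOT exactly area-lam, and not guaranteed univalent.
    return w * (1.0 + lam / (w - 0.9))

def phi(w, lam, theta):
    # Elementary growth map: rotate, attach the bump at angle theta, rotate back.
    return np.exp(1j * theta) * bump_map(np.exp(-1j * theta) * w, lam)

def g(params, w):
    # g_n = phi_1 o phi_2 o ... o phi_n, so the newest bump acts first.
    for lam, theta in reversed(params):
        w = phi(w, lam, theta)
    return w

rng = np.random.default_rng(0)
lam0, params = 1e-3, []                      # fixed particle area (assumed)
for n in range(500):
    theta = rng.uniform(0.0, 2.0 * np.pi)    # Eq. (46): uniform random angle
    w, h = np.exp(1j * theta), 1e-6
    dgdw = (g(params, w * (1 + h)) - g(params, w * (1 - h))) / (2.0 * h * w)
    params.append((lam0 / abs(dgdw)**2, theta))   # Eq. (48): J = |g'|^2
# The cluster boundary is the image of the unit circle under g_n; note the
# O(n^2) cost of the history-dependent evaluation, as discussed in Sec. 3.2.
boundary = g(params, np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 2000)))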
3.2. Fractal Geometry
Fractal patterns abound in nature, and DLA provides the most common way to understand them [14]. The fractal scaling of DLA has been debated for decades, but conformal dynamics is shedding new light on the problem. Simulations show that the conformal radius (44) exhibits fractal scaling, A_1(n) ∝ n^{1/D_f}, where the fractal dimension, D_f = 1.71, agrees with the accepted value from Monte Carlo (random walk) simulations of DLA, although the prefactor seems to depend on the bump function [43]. A perturbative renormalization-group analysis of the conformal dynamics by Hastings [45] gives a similar result, D_f = 2 − 1/2 + 1/5 = 1.7. The multifractal spectrum of the harmonic measure has also been studied [49, 50]. Perhaps the most basic question is whether DLA clusters are truly fractal – statistically self-similar and free of any length scale. This long-standing question requires accurate statistics and very large simulations, to erase the surprisingly long memory of the initial conditions. Conformal dynamics provides exact formulae for cluster moments, but simulations are limited to at most 10^5 particles by poor O(n²) scaling, caused by the history-dependent Jacobian in Eq. (48). In contrast, efficient random-walk simulations can aggregate many millions of particles. Therefore, Somfai et al. [51] developed a hybrid method relying only upon the existence of the conformal map, but not the Hastings–Levitov algorithm to construct it. Large clusters are grown by Monte Carlo simulation, and approximate Laurent coefficients are then computed, purely for their morphological information, as follows. For a given cluster of size N, M random walkers are launched
Figure 9. Simulations of fractal aggregates by Stepanov and Levitov [48]: (a) Superimposed time series of the boundary, showing the aggregation of particles, represented by iterated conformal maps; (b) a larger simulation with a particle-area acceptance window; (c) the result of anisotropic growth probability with square symmetry; (d) square-anisotropic growth with noise control via flat particles; (e) triangular-anisotropic growth with noise control; (f) isotropic growth with noise control, which resembles radial viscous fingering. (Courtesy of Leonid Levitov.)
from far away, and the positions, z_m, where they would first touch the cluster, are recorded. If the conformal map, z = g_n(e^{iθ}), were known, the points z_m would correspond to M angles θ_m on the unit circle. Since these must sample a uniform distribution, one assumes θ_m = 2πm/M for large M. From Eq. (16),
the Laurent coefficients are simply the Fourier coefficients of the discretely sampled function, z_m = Σ_k A_k e^{ikθ_m}. Using this method, all Laurent coefficients appear to scale with the same fractal dimension,

|A_k(n)|² ∝ n^{2/D_f},    (49)

although the first few coefficients cross over extremely slowly to the asymptotic scaling.
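In code, the hybrid estimate amounts to one discrete Fourier transform. The sketch below (hypothetical helper name; it assumes the hit points have already been ordered so that θ_m = 2πm/M) recovers the Laurent coefficients, with A[1] the conformal radius A_1 and A[0] the center A_0:

import numpy as np

def laurent_coefficients(z_hits):
    # z_m = sum_k A_k e^{i k theta_m}  =>  A_k = (1/M) sum_m z_m e^{-i k theta_m},
    # which is exactly a discrete Fourier transform of the samples.
    z = np.asarray(z_hits, dtype=complex)
    A = np.fft.fft(z) / len(z)
    return A    # A[1] ~ A_1, A[0] ~ A_0; A[-1], A[-2], ... hold k = -1, -2, ...

# check on a perfect circle of radius 2 centered at 0.5:
theta = 2.0 * np.pi * np.arange(256) / 256
print(abs(laurent_coefficients(0.5 + 2.0 * np.exp(1j * theta))[1]))  # ~2.0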
3.3. Snowflakes and Viscous Fingers
In conventional Monte Carlo simulations, many variants of DLA have been proposed to model real patterns found in nature [14]. For example, clusters closely resembling snowflakes can be grown by a combination of noise control (requiring multiple hits before attachment) and anisotropy (on a lattice). Conformal dynamics offers the same flexibility, as shown in Fig. 9, while allowing anisotropy and noise to be controlled independently [48]. Anisotropy can be introduced in the growth probability with a weight factor, 1 + c cos(mα_n), where α_n is the surface orientation angle in the physical plane given by Eq. (17), or by simply rejecting angles outside some tolerance from the desired symmetry directions (see the sketch after this section's discussion). Noise can be controlled by flattening the aspect ratio of the bumps. Without anisotropy, this produces smooth fluid-like patterns (Fig. 9(f)), reminiscent of viscous fingers (Fig. 2). The possible relation between DLA and viscous fingering is a tantalizing open question in pattern formation. Many authors have argued that the regularization of finite-time singularities in Laplacian growth by discreteness is somehow analogous to surface tension. Indeed, the average DLA cluster in a channel, grown by conformal mapping, is similar (but not identical) to a Saffman–Taylor finger of width 1/2 [52], and the instantaneous expected growth rate of a cluster can be related to the Polubarinova–Galin (or “Shraiman–Bensimon”) equation [42]. Conformal dynamics with many bumps grown simultaneously suggests that Laplacian growth and DLA are in different universality classes, due to the basic difference of layer-by-layer vs. one-by-one growth, respectively [53]. Another multiple-bump algorithm with complete surface coverage, however, seems to yield the opposite conclusion [54].
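A simple way to realize the anisotropic weight in a simulation is rejection sampling on the uniform DLA angle, as in the sketch below (alpha_of_theta is a user-supplied stand-in for the orientation angle of Eq. (17); we assume |c| ≤ 1 so that the weight is non-negative):

import numpy as np

def sample_anisotropic_angle(alpha_of_theta, m, c, rng):
    # Rejection sampling for the growth weight 1 + c*cos(m*alpha_n) described
    # above.  Assumes |c| <= 1, so the weight lies in [0, 1 + |c|].
    while True:
        theta = rng.uniform(0.0, 2.0 * np.pi)          # uniform DLA proposal
        weight = 1.0 + c * np.cos(m * alpha_of_theta(theta))
        if rng.uniform() < weight / (1.0 + abs(c)):    # envelope bound 1 + |c|
            return theta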
3.4. Dielectric Breakdown
In their original paper, Hastings and Levitov [42] allowed for the size of the bump in the physical plane to vary with an exponent, α, by replacing J_{n−1}
with (J_{n−1})^{α/2} in Eq. (48). In DLA (α = 2), the bump size is roughly constant, but for 0 < α < 2 the bump size grows with the local gradient of the Laplacian field. This is a simple model for dielectric breakdown, where the stochastic growth of an electric discharge penetrating a material is nonlinearly enhanced by the local electric field. One could use strikes (a = 0) rather than bumps (a = 1/2) to better reproduce the string-like branched patterns seen in laboratory experiments [14] and more familiar lightning strikes. The model displays a “stable-to-turbulent” phase transition: the relative surface roughness decreases with time for 0 ≤ α < 1 and grows for α > 1.

The original Dielectric Breakdown Model (DBM) of Niemeyer et al. [55] has a more complicated conformal-dynamics representation. As usual, the growth is driven by the gradient of a harmonic function, φ (the electrostatic potential), on an iso-potential surface (the discharge region). Unlike the α-model above, however, DBM growth events are assumed to have constant size, so the bump size in the mathematical plane is still chosen according to Eq. (48). The difference lies in the growth measure, which does not obey Eq. (46). Instead, the generalized harmonic measure in the physical z-plane is given by

p(z) ∝ |∇_z φ|^η,    (50)

where η is an exponent interpolating between the Eden model (η = 0), DLA (η = 1), and nonlinear dielectric breakdown (η > 1). For η ≠ 1, the fortuitous cancellation in Eq. (46) does not occur. Instead, a similar calculation using Eq. (45) yields a non-uniform probability measure for the nth angle on the unit circle in the mathematical plane,

P_n(θ_n) = |g'_{n−1}(e^{iθ_n})|^{1−η},    (51)
which is complicated and depends on the entire history of the simulation. Nevertheless, conformal mapping can be applied fruitfully to DBM, because the benefit of not having to solve Laplace's equation around the cluster outweighs the difficulty of sampling the angle measure. Surmounting the latter with a Monte Carlo algorithm, Hastings [56] has performed DBM simulations of 10^4 growth events, an order of magnitude beyond standard methods solving Laplace's equation on a lattice. The results, illustrated in Fig. 10, support the theoretical conjecture that DBM clusters become one-dimensional, and thus non-fractal, for η ≥ 4. Using the conformal-mapping formalism, efforts are also underway to develop a unified scaling theory of the η-model for the growth probability from DBM combined with the α-model above for the bump size [50].
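Hastings [56] sampled this measure with a Monte Carlo algorithm; the following Metropolis sketch (not necessarily his scheme; g_prime_mag is a user-supplied evaluation of |g'_{n−1}| on the unit circle) shows one simple way to draw angles from Eq. (51):

import numpy as np

def sample_dbm_angle(g_prime_mag, eta, rng, n_steps=200, step=0.5):
    # Metropolis sampling for the DBM angle measure of Eq. (51),
    # P(theta) ~ |g'_{n-1}(e^{i theta})|^(1 - eta) (unnormalized).
    # g_prime_mag(theta) must return |g'_{n-1}(e^{i theta})| > 0 (assumed).
    theta = rng.uniform(0.0, 2.0 * np.pi)
    p = g_prime_mag(theta) ** (1.0 - eta)
    for _ in range(n_steps):
        prop = (theta + step * rng.normal()) % (2.0 * np.pi)
        p_prop = g_prime_mag(prop) ** (1.0 - eta)
        if rng.uniform() < p_prop / p:       # Metropolis acceptance rule
            theta, p = prop, p_prop
    return theta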
Figure 10. Conformal-mapping simulations by Hastings [56] of the Dielectric Breakdown Model with (a) η = 2 and (b) η = 3.5. (Courtesy of Matt Hastings.)
3.5. Brittle Fracture
Modeling the stochastic dynamics of fracture is a daunting problem, especially in heterogeneous materials [14, 57]. The basic equations and boundary conditions are still the subject of debate, and even the simplest models are difficult to solve. In two dimensions, stochastic conformal mapping provides an elegant, new alternative to discrete-lattice and finite-element models. In brittle fracture, the bulk material is assumed to obey Lamé's equation of linear elasticity,

ρ ∂²u/∂t² = (λ + µ)∇(∇ · u) + µ∇²u,    (52)

where u is the displacement field, ρ is the density, and µ and λ are Lamé's constants. For conformal mapping, it is crucial to assume (i) two-dimensional symmetry of the fracture pattern and (ii) quasi-steady elasticity, which sets the left-hand side to zero to obtain equations of the type described above. For Mode III fracture, where a constant out-of-plane shear stress is applied at infinity, we have ∇ · u = 0, so the steady Lamé equation reduces to Laplace's equation for the out-of-plane displacement, ∇²u_z = 0, which allows the use of complex potentials. For Modes I and II, where a uniaxial, in-plane tensile stress is applied at infinity, the steady Lamé equation must be solved. As discussed above, this is equivalent to the bi-harmonic equation for the Airy stress function, which allows the use of Goursat functions. For all three modes, the method of iterated conformal maps can be adapted to produce fracture patterns for a variety of physical assumptions about crack dynamics [58]. For Modes I and II fracture, these models provide the first
examples of stochastic bi-harmonic growth, which have interesting differences with stochastic Laplacian growth for Mode III fracture. The Hastings–Levitov formalism is used with constant-size bumps, as in DLA, to represent the fracture process zone, where elasticity does not apply. The growth measure is a function of the excess tangential stress beyond a critical yield stress, σ_c, characterizing the local strength of the material. Quenched disorder is easily included by making σ_c a random variable. In spite of its many assumptions, the method provides analytical insights, while obviating the need to solve Eq. (52) during fracture dynamics, so it merits further study.
3.6. Advection-Diffusion-Limited Aggregation
Non-local fractal growth models typically involve a single bulk field driving the dynamics, such as the particle concentration in DLA, the electric field in DBM, or the strain field in brittle fracture, and as a result these models tend to yield statistically similar structures, apart from the effect of boundary conditions. Pattern formation in nature, however, is often fueled by multiple transport processes, such as diffusion, electromigration, and/or advection in a fluid flow. The effect of such dynamical competition on growth morphology is an open question, which would be difficult to address with lattice-based or finite-element methods, since many large fractal clusters must be generated to fully explore the space and time dependence. Once again, conformal mapping provides a convenient means to formulate stochastic analogs of the non-Laplacian transport-limited growth models from Section 2.3 (in two dimensions). It is straightforward to adapt the Hastings–Levitov algorithm to construct stochastic dynamics driven by bulk fields satisfying the conformally invariant system of Eq. (35). A class of such models has recently been formulated by Bazant et al. [38]. Perhaps the simplest case involving two transport processes, illustrated in Fig. 11, is Advection-Diffusion-Limited Aggregation (ADLA), or “DLA in a flow”. Imagine a fluid carrying a dilute concentration of sticky particles flowing past a sticky object, which begins to collect a fractal aggregate. As the cluster grows, it causes the fluid to flow around it and changes the concentration field, which in turn alters the growth probability measure. Assuming a quasi-steady potential flow with a uniform speed far from the cluster, the dimensionless transport problem is

Pe_0 ∇φ · ∇c = ∇²c,   ∇²φ = 0,            z ∈ Ω_z(t),    (53)
c = 0,   n̂ · ∇φ = 0,   σ = n̂ · ∇c,        z ∈ ∂Ω_z(t),   (54)
c → 1,   ∇φ → x̂,                          |z| → ∞,       (55)
Figure 11. A simulation of Advection-Diffusion-Limited Aggregation from Bazant et al. [38]. In each row, the growth probabilities in the physical z-plane (on the right) are obtained by solving advection-diffusion in a potential flow past an absorbing cylinder in the mathematical w-plane (on the left), with the same time-dependent Péclet number.
where Pe_0 is the initial Péclet number and σ is the diffusive flux to the surface, which drives the growth. The transport problem is solved in the mathematical w-plane, where it corresponds to a uniform potential flow of concentrated fluid past an absorbing circular cylinder. The normal diffusive flux on the cylinder, σ(θ, Pe), can be obtained from a tabulated numerical solution or an accurate analytical approximation [40]. Because the boundary condition on φ at infinity is not conformally invariant, the flow in the w-plane has a time-dependent Péclet number, Pe(t) = A_1(t)Pe_0, which grows with the conformal radius of the cluster. As a result, the
probability of the nth growth event is given by a time-dependent, non-uniform measure for the angle on the unit circle,

P_n(θ_n) = (β τ_n / λ_0) σ(e^{iθ_n}, A_1(t_{n−1})),    (56)

where β is a constant setting the mean growth rate. The waiting time between growth events is an exponential random variable with mean, τ_n, given by the current integrated flux to the object,

λ_0 = β τ_n ∫_0^{2π} σ(e^{iθ}, A_1(t_{n−1})) dθ.    (57)
Unlike DLA, the aggregation speeds up as the cluster grows, due to a larger cross section to catch new particles in the flow. As shown in Fig. 11, the model displays a universal dynamical crossover from DLA (the unstable fixed point) to an advection-dominated stable fixed point, since Pe(t) → ∞. Remarkably, the fractal dimension remains constant during the transition, equal to the value for DLA, in spite of dramatic changes in the growth rate and morphology (as indicated by higher Laurent coefficients). Moreover, the shape of the “average” ADLA cluster in the high-Pe regime of Fig. 11 is quite similar (but not identical) to the exact solution, Eq. (40), for the analogous continuous problem in Fig. 7. Much remains to be done to understand these kinds of models and apply them to materials problems.
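The sketch below assembles one ADLA growth event from Eqs. (56) and (57): a growth angle drawn from the flux-weighted measure, and an exponential waiting time with mean τ_n. The flux function sigma(θ, Pe) is assumed supplied (and vectorized over θ), e.g., from the tabulation or approximation of [40]; all helper names are illustrative:

import numpy as np

def adla_step(sigma, A1, pe0, lam0, beta, rng, n_grid=720):
    # One ADLA growth event, following Eqs. (56)-(57).
    Pe = A1 * pe0                                   # Pe(t) = A1(t) * Pe0
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    flux = sigma(theta, Pe)                         # sigma(e^{i theta}, Pe) >= 0
    integral = flux.sum() * (2.0 * np.pi / n_grid)  # quadrature of Eq. (57)
    tau = lam0 / (beta * integral)                  # mean waiting time tau_n
    wait = rng.exponential(tau)                     # exponential waiting time
    angle = rng.choice(theta, p=flux / flux.sum())  # Eq. (56) growth measure
    return angle, wait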
4. Curved Surfaces
Entov and Etingof [59] considered the generalized problem of Hele–Shaw flows in a non-planar cell having non-zero curvature. In such problems, the velocity of the viscous flow is still the (surface) gradient of a potential, φ, but this function is now a solution of the so-called Laplace–Beltrami equation on the curved surface. The Riemann mapping theorem extends to curved surfaces and says that any simply-connected smooth surface is conformally equivalent to the unit disk, the complex plane, or the Riemann sphere. A common example is the well-known stereographic projection of the surface of a sphere to the (compactified) complex plane. Under a conformal mapping, solutions of the Laplace–Beltrami equation map to solutions of Laplace's equation, and this combination of facts led Entov and Etingof [59] to identify classes of explicit solutions to the continuous Hele–Shaw problem in a variety of non-planar cells. With very similar intent, Parisio et al. [60] have recently considered the evolution of Saffman–Taylor fingers on the surface of a sphere. By now, the reader may realize that most of the methods already considered in this article are, in principle, amenable to generalization to curved surfaces,
which can be reached by conformal mapping of the plane. For example, Fig. 12 shows a simulation of a DLA cluster growing on the surface of a sphere, using a generalized Hastings–Levitov algorithm, which takes surface curvature into account. The key modification is to multiply the Jacobian in Eq. (47) by the Jacobian of the stereographic projection, 1 + |z/R|², where R is the radius of the sphere. It should also be clear that any continuous or discrete growth model driven by a conformally-invariant bulk field, such as ADLA, can be simulated on general curved surfaces by means of appropriate conformal projection to a complex plane. The reason is that the system of Eq. (35) is invariant under any conformal mapping, to a flat or curved surface, because each term transforms like the Laplacian, ∇²φ → J∇²φ, where J is the Jacobian. The purpose of studying these models is not only to understand growth on a particular ideal shape, such as a sphere, but more generally to explore the effect of local surface curvature on pattern formation. For example, this could help interpret mineral deposit patterns in rough geological fracture surfaces, which form by the diffusion and advection of oxygen in slowly flowing water.
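The modification is a one-line correction to the bump-size rule; a sketch (hypothetical helper name) is:

def sphere_bump_area(lam0, J_planar, z, R):
    # Bump-size rule for growth on a sphere: multiply the planar Jacobian of
    # Eq. (47) by the stereographic factor 1 + |z/R|^2 before applying
    # Eq. (48).  z is the complex position of the bump, R the sphere radius.
    return lam0 / (J_planar * (1.0 + abs(z / R)**2))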
Figure 12. Conformal-mapping simulation of DLA on a sphere. Particles diffuse one by one from the North Pole and aggregate on a seed at the South Pole. (Courtesy of Jaehyuk Choi, Martin Bazant, and Darren Crowdy.)
5. Outlook
Although conformal mapping has been with us for centuries, new developments and applications continue to the present day. This appears to be the first pedagogical review of stochastic conformal-mapping methods for interfacial dynamics, which also covers the latest progress in continuum methods. Hopefully, this will encourage the further exchange of ideas (and people) between the two fields. Our focus has also been on materials problems, which provide many opportunities to apply and extend conformal mapping. Building on specific open questions scattered throughout the text, we close with a general outlook on directions for future research.

A basic question for both stochastic and continuum methods is the effect of geometrical constraints, such as walls or curved surfaces, on interfacial dynamics. Most work to date has been for either radial or channel geometries, but it would be interesting to describe finite viscous fingers or DLA clusters growing near walls of various shapes, as is often the case in materials applications. The extension of conformal-map dynamics to multiply connected domains is another mathematically challenging area, which has received some attention recently but seems ripe for further development. Understanding the exact solution structure of Laplacian-growth problems using the mathematical abstraction of quadrature domain theory holds great potential, especially given that mathematicians have already begun to explore the extent to which the various mathematical concepts extend to higher dimensions [27]. Describing multiply connected domains could pave the way for new mathematical theories of evolving material microstructures. Topology is the main difference between an isolated bubble and a dense sintering compact. Microstructural evolution in elastic solids may be an even more interesting, and challenging, direction for conformal-mapping methods.

From a mathematical point of view, much remains to be done to place stochastic conformal-mapping methods for interfacial dynamics on more rigorous ground. This has recently been achieved in the simpler case of stochastic Loewner evolution (SLE), which has a similar history to the interfacial problems discussed here [61]. Oded Schramm introduced SLE in 2000 as a stochastic version of the continuous Loewner evolution from univalent function theory, which grows a one-dimensional random filament from a disk or half-plane. This important development in pure mathematics came a few years after the pioneering DLA papers of Hastings and Levitov in physics. A notable difference is that SLE has a rigorous mathematical theory based on stochastic calculus, which has enabled new proofs of the properties of percolation clusters and self-avoiding random walks (in two dimensions, of course). One hopes that someday DLA, DBM, ADLA, and other fractal-growth models will also be placed on such a rigorous footing.
Returning to materials applications, it seems there are many new problems to be considered using conformal mapping. Relatively little work has been done so far on void electromigration, viscous sintering, solid pore evolution, brittle fracture, electrodeposition, and solidification in fluid flows. The reader is encouraged to explore these and other problems using a powerful mathematical tool, which deserves more attention in materials science.
References
[1] R.V. Churchill and J.W. Brown, Complex Variables and Applications, 5th edn., McGraw-Hill, New York, 1990.
[2] T. Needham, Visual Complex Analysis, Clarendon Press, Oxford, 1997.
[3] S.D. Howison, “Complex variable methods in Hele–Shaw moving boundary problems,” Eur. J. Appl. Math., 3, 209–224, 1992.
[4] L.M. Cummings, Y.E. Hohlov, S.D. Howison, and K. Kornev, “Two-dimensional solidification and melting in potential flows,” J. Fluid Mech., 378, 1–18, 1999.
[5] P.G. Saffman and G.I. Taylor, “The penetration of a fluid into a porous medium or Hele–Shaw cell containing a more viscous liquid,” Proceedings of the Royal Society, London A, 245, 312–329, 1958.
[6] M. Kruskal and H. Segur, “Asymptotics beyond all orders in a model of crystal growth,” Stud. Appl. Math., 85, 129, 1991.
[7] S. Tanveer, “Evolution of Hele–Shaw interface for small surface tension,” Philosophical Transactions of the Royal Society of London A, 343, 155–204, 1993a.
[8] S. Tanveer, “Surprises in viscous fingering,” J. Fluid Mech., 409, 273–308, 2000.
[9] D. Bensimon and B. Shraiman, “Singularities in non-local interface dynamics,” Phys. Rev. A, 30, 2840–2842, 1984.
[10] L.P. Kadanoff, “Exact solutions for the Saffman–Taylor problem with surface tension,” Phys. Rev. Lett., 65, 2986–2988, 1990.
[11] D. Crowdy, “Hele–Shaw flows and water waves,” J. Fluid Mech., 409, 223–242, 2000.
[12] J.W. McLean and P.G. Saffman, “The effect of surface tension on the shape of fingers in the Hele–Shaw cell,” J. Fluid Mech., 102, 455, 1981.
[13] W.-S. Dai, L.P. Kadanoff, and S.-M. Zhou, “Interface dynamics and the motion of complex singularities,” Phys. Rev. A, 43, 6672–6682, 1991.
[14] A. Bunde and S. Havlin (eds.), Fractals and Disordered Systems, 2nd edn., Springer, New York, 1996.
[15] S. Tanveer, “Singularities in the classical Rayleigh–Taylor flow: formation and subsequent motion,” Proceedings of the Royal Society, A, 441, 501–525, 1993b.
[16] V.E. Zakharov, “Stability of periodic waves of finite amplitude on the surface of deep fluid,” J. Appl. Mech. Tech. Phys., 2, 190, 1968.
[17] T. Yoshikawa and A.M. Balk, “The growth of fingers and bubbles in the strongly nonlinear regime of the Richtmyer–Meshkov instability,” Phys. Lett. A, 251, 184–190, 1999.
[18] W. Wang, Z. Suo, and T.-H. Hao, “A simulation of electromigration-induced transgranular slits,” J. Appl. Phys., 79, 2394–2403, 1996.
[19] M. Ben Amar, “Void electromigration as a moving free-boundary value problem,” Physica D, 134, 275–286, 1999.
[20] P. Saffman, “Exact solutions for the growth of fingers from a flat interface between two fluids in a porous medium,” Q. J. Mech. Appl. Math., 12, 146–150, 1959.
[21] S. Howison, “Fingering in Hele–Shaw cells,” J. Fluid Mech., 12, 439–453, 1986.
[22] D. Crowdy and S. Tanveer, “The effect of finiteness in the Saffman–Taylor viscous fingering problem,” J. Stat. Phys., 114, 1501–1536, 2004.
[23] S. Richardson, “Hele–Shaw flows with a free boundary produced by the injection of fluid into a narrow channel,” J. Fluid Mech., 56, 609–618, 1972.
[24] G. Carrier, M. Krook, and C. Pearson, Functions of a Complex Variable, McGraw-Hill, New York, 1966.
[25] S. Richardson, “Hele–Shaw flows with time-dependent free boundaries involving injection through slits,” Stud. Appl. Math., 87, 175–194, 1992.
[26] A. Varchenko and P. Etingof, Why the Boundary of a Round Drop Becomes a Curve of Order Four, University Lecture Series, AMS, Providence, 1992.
[27] H. Shapiro, The Schwarz Function and its Generalization to Higher Dimensions, Wiley, New York, 1992.
[28] S. Richardson, “Hele–Shaw flows with time-dependent free boundaries involving a multiply-connected fluid region,” Eur. J. Appl. Math., 12, 571–599, 2001.
[29] D. Crowdy and J. Marshall, “Constructing multiply-connected quadrature domains,” SIAM J. Appl. Math., 64, 1334–1359, 2004.
[30] N. Muskhelishvili, Some Basic Problems of the Mathematical Theory of Elasticity, Noordhoff, Groningen, Holland, 1953.
[31] G.K. Batchelor, An Introduction to Fluid Dynamics, Cambridge University Press, 1967.
[32] R. Hopper, “Plane Stokes flow driven by capillarity on a free surface,” J. Fluid Mech., 213, 349–375, 1990.
[33] D. Crowdy, “A note on viscous sintering and quadrature identities,” Eur. J. Appl. Math., 10, 623–634, 1999.
[34] D.G. Crowdy, “Viscous sintering of unimodal and bimodal cylindrical packings with shrinking pores,” Eur. J. Appl. Math., 14, 421–445, 2003.
[35] S. Richardson, “Plane Stokes flow with time-dependent free boundaries in which the fluid occupies a doubly-connected region,” Eur. J. Appl. Math., 11, 249–269, 2000.
[36] W. Wang and Z. Suo, “Shape change of a pore in a stressed solid via surface diffusion motivated by surface and elastic energy variations,” J. Mech. Phys. Solids, 45, 709–729, 1997.
[37] M.Z. Bazant, “Conformal mapping of some non-harmonic functions in transport theory,” Proceedings of the Royal Society, A, 460, 1433, 2004.
[38] M.Z. Bazant, J. Choi, and B. Davidovitch, “Dynamics of conformal maps for a class of non-Laplacian growth phenomena,” Phys. Rev. Lett., 91, 045503, 2003.
[39] K. Kornev and G. Mukhamadullina, “Mathematical theory of freezing for flow in porous media,” Proceedings of the Royal Society, London A, 447, 281–297, 1994.
[40] J. Choi, D. Margetis, T.M. Squires, and M.Z. Bazant, “Steady advection-diffusion to finite absorbers in two-dimensional potential flows,” J. Fluid Mech., 2004b.
[41] J. Choi, B. Davidovitch, and M.Z. Bazant, “Crossover and scaling of advection-diffusion-limited aggregation,” in preparation, 2004a.
[42] M.B. Hastings and L.S. Levitov, “Laplacian growth as one-dimensional turbulence,” Physica D, 116, 244–252, 1998.
[43] B. Davidovitch, H.G.E. Hentschel, Z. Olami, I. Procaccia, L.M. Sander, and E. Somfai, “Diffusion-limited aggregation and iterated conformal maps,” Phys. Rev. E, 59, 1368–1378, 1999.
[44] P.L. Duren, Univalent Functions, Springer-Verlag, New York, 1983.
[45] M.B. Hastings, “Renormalization theory of stochastic growth,” Phys. Rev. E, 55, 135, 1997.
[46] T.A. Witten and L.M. Sander, “Diffusion-limited aggregation: a kinetic critical phenomenon,” Phys. Rev. Lett., 47, 1400–1403, 1981.
[47] T.C. Halsey, “Diffusion-limited aggregation: a model for pattern formation,” Phys. Today, 53, 36, 2000.
[48] M.G. Stepanov and L.S. Levitov, “Laplacian growth with separately controlled noise and anisotropy,” Phys. Rev. E, 63, 061102, 2001.
[49] M.H. Jensen, A. Levermann, J. Mathiesen, and I. Procaccia, “Multifractal structure of the harmonic measure of diffusion-limited aggregates,” Phys. Rev. E, 65, 046109, 2002.
[50] R.C. Ball and E. Somfai, “Theory of diffusion controlled growth,” Phys. Rev. Lett., 89, 133503, 2002.
[51] E. Somfai, L.M. Sander, and R.C. Ball, “Scaling and crossovers in diffusion limited aggregation,” Phys. Rev. Lett., 83, 5523, 1999.
[52] E. Somfai, R.C. Ball, J.P. DeVita, and L.M. Sander, “Diffusion-limited aggregation in channel geometry,” Phys. Rev. E, 68, 020401, 2003.
[53] F. Barra, B. Davidovitch, and I. Procaccia, “Iterated conformal dynamics and Laplacian growth,” Phys. Rev. E, 65, 046144, 2002a.
[54] A. Levermann and I. Procaccia, “Algorithm for parallel Laplacian growth by iterated conformal maps,” Phys. Rev. E, 69, 031401, 2004.
[55] L. Niemeyer, L. Pietronero, and H.J. Wiesmann, “Fractal dimension of dielectric breakdown,” Phys. Rev. Lett., 52, 1033–1036, 1984.
[56] M.B. Hastings, “Fractal to nonfractal phase transition in the dielectric breakdown model,” Phys. Rev. Lett., 87, 175502, 2001.
[57] H.J. Herrmann and S. Roux (eds.), Statistical Models for the Fracture of Disordered Media, North-Holland, Amsterdam, 1990.
[58] F. Barra, A. Levermann, and I. Procaccia, “Quasistatic brittle fracture in inhomogeneous media and iterated conformal maps,” Phys. Rev. E, 66, 066122, 2002b.
[59] V.M. Entov and P.I. Etingof, “Bubble contraction in Hele–Shaw cells,” Quart. J. Mech. Appl. Math., 44, 507–535, 1991.
[60] F. Parisio, F. Moraes, J.A. Miranda, and M. Widom, “Saffman–Taylor problem on a sphere,” Phys. Rev. E, 63, 036307, 2001.
[61] W. Kager and B. Nienhuis, “A guide to stochastic Loewner evolution and its applications,” J. Stat. Phys., 115, 1149–1229, 2004.
4.11 EQUATION-FREE MODELING FOR COMPLEX SYSTEMS

Ioannis G. Kevrekidis1, C. William Gear1, and Gerhard Hummer2
1 Princeton University, Princeton, NJ, USA
2 National Institutes of Health, Bethesda, MD, USA
A persistent feature of many complex systems is the emergence of macroscopic, coherent behavior from the interactions of microscopic “agents” – molecules, cells, individuals in a population – among themselves and with their environment. The implication is that macroscopic rules (a description of the system at a coarse-grained, high level) can somehow be deduced from microscopic ones (a description at a much finer level). For laminar Newtonian fluid mechanics, a successful coarse-grained description (the Navier–Stokes equations) was known on a phenomenological basis long before its approximate derivation from kinetic theory [1]. Today we must frequently study systems for which the physics can be modeled at a microscopic, fine scale; yet it is practically impossible to explicitly derive a good macroscopic description from the microscopic rules. Hence, we look to the computer to explore the macroscopic behavior based on the microscopic description. It is difficult to define complexity in a precise, useful way. At the same time it pervades current modeling in engineering science, in the life and physical sciences, and beyond them (e.g., in economics) (see, e.g., Refs. [2, 3]). We may not typically think of a laminar Newtonian flow as complex, even though it involves interactions of enormous numbers of fluid molecules with themselves and with the boundaries of the flow. Such problems are considered simple because we have a good model, describing the behavior of the system at the level we need for practical purposes. If we are interested in pressure drops and flow rates over humanly relevant space/time scales, we do not need to know where each and every molecule is, or its individual velocity, at a given instant in time. Similarly, if a stirred chemical reactor can be modeled adequately, for design purposes, by a few ordinary differential equations (ODEs), the immense complexity of molecular interactions involved in flow, reaction and mixing in it goes unnoticed. The system is classified as simple, because
a simple model of the behavior is adequate for practical purposes. This suggests that the scale of the observer, and the practical goals of the modeling, are crucial in classifying a system, its models, or its behavior as complex – or as simple. Macroscopic models of reaction and transport processes in our textbooks come in the form of conservation laws (species, mass, momentum, energy) closed through constitutive equations (reaction rates as a function of concentration, viscous stresses as functionals of velocity gradients). These models are written directly at the scale (alternatively, at the level of complexity) at which we are interested in practically modeling the system behavior. Because we observe the system at the level of concentrations or velocity fields, we sometimes forget that what is really evolving during an experiment is distributions of colliding and reacting molecules. We know, from experience with particular classes of problems, that it is possible to write predictive deterministic laws for the behavior observed at the level of concentrations or velocity fields – laws that are predictive over space and time scales relevant to engineering practice. Knowing the right level of observation at which we can be practically predictive, we attempt to write closed evolution equations for the system at this level. The closures may be based on experiment (e.g., through engineering correlations) or on mathematical modeling and approximation of what happens at more microscopic scales (e.g., the Chapman–Enskog expansion). In many problems of current modeling practice, ranging from materials science to ecology, and from engineering to computational chemistry, the physics are known at the microscopic/individual level, and the closures required to translate them to high-level, coarse-grained, macroscopic descriptions are not available. Sometimes we do not even know at what level of observation one can be practically predictive. Severe computational limitations arise in trying to bridge, through direct computer simulation, the enormous gap between the scale of the available description and the macroscopic, “system” scale at which the questions of interest are asked and the practical answers are required (see, e.g., Refs. [4, 5]). These computational limitations are a major stumbling block in current complex system modeling. Our objective is to describe a computational approach for dealing with any complex, multi-scale system whose collective, coarse-grained behavior is simple when we know in principle how to model such systems at a very fine scale (e.g., through molecular dynamics). We assume that we do not know how to write good simple model equations at the right coarse-grained, macroscopic scale for their collective, coarse-grained behavior. We will argue that, in many cases, the derivation of macroscopic equations can be circumvented; that by using short bursts of appropriately initialized microscopic simulation one can effectively solve the macroscopic equations without ever writing them down. A direct bridge can be built between microscopic simulation (e.g., kinetic Monte Carlo, agent-based modeling) and traditional continuum numerical
analysis. It is possible to enable microscopic simulators to directly perform macroscopic, systems-level tasks. The main idea is to consider the microscopic, fine-scale simulator as a (computational) experiment that one can set up, initialize, and run at will. The results of such appropriately designed, initialized and executed brief computational experiments allow us to estimate the same information that a macroscopic model would allow us to evaluate from explicit formulas. The heart of the approach can be conveyed through a simple example (see Fig. 1). Consider a single, autonomous ODE,

dc/dt = f(c).    (1)
Think of it as a model for the dynamics of a reactant concentration in a stirred reactor. Equations like this embody “practical determinism” as discussed above: given a finite amount of information (the state at the present time, c(t = 0)) we can predict the state at a future time. Consider how this is done on the computer using – for illustration – the simplest numerical integration scheme, forward Euler:

c_{n+1} ≡ c([n + 1]τ) = c_n + τ f(c_n).    (2)
Starting with the initial condition, c_0, we go to the equation and evaluate f(c_0), the time derivative, or slope of the trajectory c(t); we use this value to make a prediction of the state of the system at the next time step, c_1. We then repeat the process: go to the equation with c_1 to evaluate f(c_1) and use the Euler scheme to predict c_2; and so on. Forgetting for the moment accuracy and adaptive step size selection, consider how the equation is used: given the state we evaluate the time-derivative; and then, using mathematics (in particular, Taylor series and smoothness to create a local linear model of the process in time) we make a prediction of the state at the next time step. A numerical integration code will “ping” a sub-routine with the current state as input, and will obtain as output the time-derivative at this state. The code will then process this value, and use local Taylor series in order to make a prediction of the next state (the next value of c at which to call the sub-routine evaluating the function f). Three simple things are important to notice. First, the task at hand (numerical integration) does not need a closed formula for f(c) – it only needs f(c) evaluated at a particular sequence of values c_n. Whether the sub-routine evaluates f(c) from a single-line formula, uses a table lookup, or solves a large subsidiary problem, from the point of view of the integration code it is the same thing. Second, the sequence of values c_n at which we need the time-derivative evaluated is not known a priori. It is generated as the task progresses, from processing results of previous function evaluations through the Euler formula.
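For concreteness, a few lines of Python implementing Eq. (2) for an assumed, known right-hand side (here simple first-order decay):

# A minimal forward-Euler loop for Eq. (2); f is the right-hand side of
# Eq. (1), here an assumed first-order decay model dc/dt = -c.
def euler(f, c0, tau, n_steps):
    c = c0
    for _ in range(n_steps):
        c = c + tau * f(c)   # c_{n+1} = c_n + tau * f(c_n)
    return c

print(euler(lambda c: -c, 1.0, 0.01, 1000))  # ~exp(-10) since tau*n_steps = 10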
Figure 1. (a) Forward Euler numerical integration, used (b) as a template for projective integration using the results of short experiments. (c) Fixed-point iteration for a timestepper.
We know that protocols exist for designing experiments to accomplish tasks such as parameter estimation [6]. In the same spirit, we can think of the Euler method, and of explicit numerical integrators in general, as protocols for specifying where to perform function evaluations based on the task we want to accomplish (computation of a temporal trajectory). Lastly, the form of the protocol (the Euler method here) is based on mathematics, particularly on smoothness and Taylor series. The trajectory is locally approximated as a linear function of time; the coefficients of this function are obtained from the model using function evaluations.

Suppose now that we do not have the equation, but we have the experiment itself: we can fill up the stirred reactor with reactant at concentration c_0, run for some time, and record the time series of c(t). Using the results of a short run (over, say, 1 min) we can now estimate the slope, dc/dt at t = 0, and predict (using the Euler method) where the concentration will be in, say, 10 min. Now, instead of waiting for 9 min for the reactor to get there, we stop the experiment and immediately start a new one: reinitialize the reactor at the predicted concentration; run for one more minute, and use forward Euler to predict what the concentration will be 20 min down the line. We are substituting short, appropriately initialized experiments, and estimation based on the experimental results, for the function evaluations that the sub-routine with the closed form f(c) would return. We are in effect doing forward Euler again; but the coefficients of the local linear model are obtained using experimentation “on demand” [7] rather than function evaluations of an a priori available model.

Many elements of this example are contrived; for example, the assumption that an Euler prediction with a 10 min step is reasonably accurate. It may also appear laughable that, instead of waiting nine more minutes for the reactor to get to the predicted concentration, we will initialize a fresh experiment at that concentration. It will probably take much more than 9 min to start a new experiment; there will be startup transients, and noise in the measurements. The point, however, remains: it is possible to do forward Euler integration using short bursts of appropriately initialized experiments if it is easy to initialize such experiments at will. An “outer” process (design of the next experiment, setting it up, measuring its results, processing them to design a new experiment) is wrapped around an “inner” process (the experiment). The outer wrapper is motivated by the task that we wish to perform (here, long-time integration) and is based on traditional, continuum numerical analysis. The inner layer is the process itself. It is clear that systems theory components (data acquisition and filtering, model identification [8]) are vital in forming the connection between the outer layer and the inner layer (the task we want to accomplish and the system itself). Now we complete the argument: suppose that the inner layer is not a laboratory experiment, but a computational one, with a model at a different, much finer level of description (for the sake of the discussion, a lattice kinetic
Monte Carlo, kMC, model of the reaction). Instead of running the kMC model for long times, and observing the evolution of the concentration, we can exploit the procedure described above, perform only short bursts of appropriately initialized microscopic simulation, and use their results to evolve the macroscopic behavior over hopefully much longer time scales. It is much easier to initialize a code at will – a computational experiment – as opposed to initializing a new laboratory experiment. Many new issues arise, notably noise, in the form of fluctuations, from the microscopic solver. The conceptual point, however, remains: even if we do not have the right macroscopic equation for the concentration, we can still perform its numerical integration without obtaining it in closed form. The skeleton of the wrapper (the integration algorithm) is the same one we would use if we had the macroscopic equation; but now function evaluations are substituted by short computational experiments with the microscopic simulator, whose results are appropriately processed for local macroscopic identification and estimation. If a large separation of time-scales exists between microscopic dynamics (here, the time we need to run kinetic Monte Carlo to estimate dc/dt) and the macroscopic evolution of the concentration, this procedure may be significantly more economical than direct simulation.

Passing information between the microscopic and macroscopic scales at the beginning and the end of each computational experiment is a vitally important issue. It is accomplished through a lifting operator (macro- to micro-) and a restriction operator (micro- to macro-) as discussed below (see [9, 10] and references therein). Detailed, fine-level dynamics are typically given in terms of microscopically/stochastically evolving distributions of interacting “agents” (molecules, cells); the evolution rules could be molecular dynamics (classical, or Car–Parrinello [11]), MC or kMC, Brownian dynamics, etc. The macroscopic dynamics are described by closed evolution equations, typically ordinary (for macroscopically lumped) or partial differential/integrodifferential equations. The dependent variables in these equations are frequently a few, lower order moments of the evolving distributions (such as concentration, the zeroth moment). The proposed computational methodology consists of the following basic elements:

(a) Choose the statistics of interest for describing the long-term behavior of the system and an appropriate representation for them. For example, in a gas simulation at the particle level, the statistics would probably be density and momentum (zeroth and first moment of the particle distribution over velocities) and we might choose to discretize them in a computational domain via finite elements. We call this the macroscopic description, u. These choices suggest possible restriction operators, M, from the microscopic-level description U, to the macroscopic description: u = MU;
(b) Choose an appropriate lifting operator, µ, from the macroscopic description, u, to one or more consistent microscopic descriptions, U. For example, in a gas simulation using pressure, etc., as the macroscopic-level variables, µ could make random particle assignments consistent with the macroscopic statistics. We require µM = I, i.e., lifting from the macroscopic to the microscopic and then restricting (projecting) down again should have no effect, except roundoff.
(c) Start with a macroscopic condition (e.g., a concentration profile) u(t_0);
(d) Transform it through lifting to one – or more – fine, consistent microscopic realizations U(t_0) = µu(t_0);
(e) Evolve each realization using the microscopic simulator for the desired short macroscopic time T, generating the values U(t_1), where t_1 = t_0 + T;
(f) Obtain the restriction(s) u(t_1) = MU(t_1) (and average over them).
This constitutes the coarse time-stepper, or coarse time-T map. If this map is accurate enough, we showed above how to use it in a two-tier procedure to perform Coarse Projective Integration [12–14], by
• repeating steps (e)–(f) over several time steps, obtaining several U(t_i) as well as their restrictions u(t_i) = MU(t_i), i = 1, 2, . . . , k + 1;
• using the chord through these successive time-stepper output points to estimate the derivative – the “right-hand side” of the equations we do not have; we can then
• use this derivative in another, outer integrator scheme (such as forward Euler) to produce an estimate of the macroscopic state much later in time, u(t_{k+1+M});
• go back to step (d).
A sketch of this two-tier loop appears below. The lifting step (creating microscopic distributions conditioned on a few of their lower moments, going back to Ehrenfest [15]) is clearly not unique, and sometimes quite non-trivial: consider, for example, creating a distribution of particles on a lattice that has a prescribed average as well as a prescribed pair probability. A preparatory step (e.g., through simulated annealing) may be required to arrange the particles on the lattice consistently with the prescribed constraints. Through such appropriate preparation, one can even lift prescribed pair-correlation functions to consistent particle assemblies. Constrained dynamics algorithms, like SHAKE [16], can also be thought of as lifting procedures; see also Ref. [17]. An important point made in Fig. 2a is that an initial simulation interval must elapse before estimating the time-derivative of the macroscopic variables from the microscopic simulation. In the microscopic dynamics, every particle evolves while interacting with other particles, and all the moments of the distribution evolve in a coupled manner.
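The following Python sketch assembles steps (c)–(f) and the projective jump into one loop. The operators lift, restrict and micro_evolve (µ, M and the fine-scale simulator) are user-supplied; the toy demonstration at the end, with a noisy fine-step integrator standing in for the microscopic code, is only an assumed illustration:

import numpy as np

def coarse_projective_integration(lift, restrict, micro_evolve, u0,
                                  T, k, M_jump, n_outer, n_copies=8):
    # Coarse projective forward Euler, following steps (c)-(f) and the
    # bullets above; all names here are illustrative.
    u, t = np.asarray(u0, dtype=float), 0.0
    for _ in range(n_outer):
        # (d) lift to an ensemble of consistent microscopic realizations
        ensemble = [lift(u) for _ in range(n_copies)]
        history = [u]
        # (e)-(f) k+1 short microscopic bursts of length T, restricted/averaged
        for _ in range(k + 1):
            ensemble = [micro_evolve(U, T) for U in ensemble]
            history.append(np.mean([restrict(U) for U in ensemble], axis=0))
        # chord estimate of the unavailable right-hand side, then the jump
        dudt = (history[-1] - history[-2]) / T
        u = history[-1] + (M_jump * T) * dudt     # projective Euler step
        t += (k + 1 + M_jump) * T
    return t, u

# toy check: the "microscopic" code is a noisy fine-step integrator of
# dc/dt = -c (an assumed stand-in, not a real kMC simulator)
rng = np.random.default_rng(1)
micro = lambda U, T: U + T * (-U) + 1e-4 * rng.normal(size=np.shape(U))
print(coarse_projective_integration(lambda u: u.copy(), lambda U: U, micro,
                                    [1.0], T=0.01, k=2, M_jump=10, n_outer=50))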
Figure 2. Schematic illustrations of (a) coarse projective integration; (b) patch dynamics; and (c) coarse-timestepper-based bifurcation computations (see text).
It is therefore remarkable that practically predictive models are usually written in terms of only a few moments of these evolving distributions. This is only possible because the remaining, higher-order moments quickly become functionals of the few, lower order, slow, “master” moments – our observation variables. This occurs over time-scales that are short compared to the macroscopic observation time-scales. In this separation of time-scales (and concomitant space scales) lies the essential reduction step underpinning effective simplicity and practical determinism. The idea is that the long-term observable dynamics of the system evolve on a low-dimensional, strongly attracting, slow manifold in moments space; this is, effectively, a quasi-steady state approximation [18]. This manifold is parameterized by our observation variables (typically the lower distribution moments, like concentration) in terms of which we write macroscopic equations. The expected values of the remaining moments can be written as an (unspecified) function of the coarse variables; that is the graph of the manifold. A good example is the law of Newtonian viscosity: when one starts a molecular simulation, the stresses are not instantaneously proportional to velocity gradients – but for Newtonian fluids they become so within a few collision times, i.e., over times much shorter than the macroscopic observation times over which the Navier–Stokes equations become valid approximations. The coarse variables are therefore observation variables. If the fine-scale simulation, conditioned on values of the observation variables, is initialized “off manifold”, it only takes a fast (possibly constrained) initial transient to approach a neighborhood of this manifold. Through the restriction operator, we observe the dynamics on the hyperplane spanned by our chosen observation variables. After the system quickly relaxes to the manifold, we estimate the time-derivative of the observation variables, and use it in the projective integration scheme. The dynamics of the full system will then, after lifting and a short integration, spontaneously establish (by bringing us to the manifold) the missing closure: the effect of the full description on the observed dynamics.

A direct conceptual analogy arises here with center manifolds in dynamical systems (parameterized using eigenvectors of the linearization at a steady state, see, e.g., Ref. [19]) or inertial manifolds for dissipative PDEs (parameterized using eigenfunctions of a linear dissipative operator [20, 21]). Normal forms and (approximate) inertial forms are thus analogous to our macroscopic equations for the coarse observation variables. Low order moments have traditionally been the observation variables of choice in our textbooks. In principle, however, any set of variables that parameterizes this low-dimensional slow manifold can be used as observation variables with the appropriate lifting and restriction operators. Using more observation variables than necessary reduces computational efficiency; it is analogous to using a finer mesh than necessary for the accuracy required in solving a problem. Intelligently chosen order parameters usually provide a much more parsimonious basis set on which to observe the dynamics and apply our computational framework. There is a clear analogy here with
empirical eigenfunctions [22] used for model reduction in the discretization of dissipative PDEs. The detection of good observables, capable of efficiently parameterizing this manifold, through statistical analysis of simulation results, is a crucial enabling technology for our computational framework. Using data mining techniques (see, e.g., Refs. [23–25]) to find such observables can be thought of as the “variable-free” component of the equation-free modeling approach.

In coarse projective integration we exploit the smoothness in time of the unavailable macroscopic equation in order to project (jump) to the future. In the case of macroscopically (spatially or otherwise) distributed systems, one can exploit smoothness of the unavailable macroscopic equation in space in order to perform the microscopic simulations only over small, but appropriately coupled, computational boxes (“teeth”). This is illustrated in Fig. 2b:

(a) Coarse variable selection (same as above, but now the variable u(x) depends on “coarse space” x; we have chosen for simplicity to consider only one space dimension).
(b) Choice of lifting operator (same as above, but now we lift entire profiles of u(x, t) to profiles of U(y, t), where y is the microscopic space corresponding to the macroscopic space x). This lifting therefore involves not only the variables, but the space descriptions too. The basic idea is that a coarse point in x corresponds to an interval (a “box” or “tooth”) in y.
(c) Prescribe a macroscopic initial profile u(x, t_0) – the “coarse field”. In particular, consider the values u_i(t_0) at a number of macro-mesh points; the macroscopic profile arises from interpolation of these values of the coarse field.
(d) Lift the “mesh points” x_i and the values u_i(t_0) to profiles U_i(y_i, t_0), in microscopic domains (“teeth”) y_i corresponding to the coarse-mesh points x_i. These profiles should be conditioned on the values u_i, and it is a good idea that they are also conditioned on certain boundary conditions motivated by the coarse field (e.g., be consistent with coarse slopes at the boundaries of the teeth that are computed from the coarse field).
(e) Evolve the microscopic dynamics in each of these boxes for a short time T based on the microscopic description, and through ensembles that enforce the coarsely inspired boundary conditions (see, e.g., Ref. [26]) – and thus generate U_i(y_i, t_1), where t_1 = t_0 + T.
(f) Obtain the restriction from each patch to coarse variables, u_i(t_1) = MU_i(y_i, t_1).
(g) Interpolate between these to obtain the new coarse field u(x, t_1).
Up to this point, we have the gaptooth scheme: a scheme that computes in small domains (the “teeth”) which communicate over the gaps between them
through “coarse-field motivated” boundary conditions. We can now proceed by combining the gaptooth scheme with projective integration ideas to
(h) Repeat the process (lift within the teeth, compute new boundary conditions, evolve microscopically, restrict to macroscopic variables and interpolate) for a few steps, and then
(i) Project coarse fields “long” into the future. For a projective forward Euler this would involve the chord between two successive coarse fields to estimate the right-hand side of the unavailable coarse equation, and then an Euler “projection” of the coarse field long into the future.
(j) Repeat the entire procedure starting with the lifting (d) above.
This leads to patch dynamics: a computational framework in which simulations using the microscopic description over short times and small computational domains (“patches” in space-time) can be used to advance the macroscopic dynamics over long times and large computational domains [10, 27–29]. Initializing microscopic computations conditioned on macroscopic variables is an important component of coarse projective integration; similarly, imposing macroscopically motivated boundary conditions on microscopic computations is an important element of gaptooth and patch dynamics.

The methods we discussed can, under appropriate conditions, drastically accelerate the direct simulation of the coarse-grained, macroscopic behavior of certain complex multi-scale systems. Direct simulation, however, is but the simplest computational task one can perform with a system model. It corresponds, in some sense, to physical experimentation: we set parameter values and initial conditions, let the system evolve on the computer and observe its behavior, just like performing a laboratory experiment. Depending on what we want to learn about the system, there exist much more interesting and efficient ways of using the model and the computer. Consider for example the location of steady states; fixed-point algorithms, like Newton–Raphson, are a much more efficient way of finding steady states than direct integration (given a good initial guess). Such fixed-point algorithms can locate both stable and unstable steady states (the latter would be extremely difficult or impossible to find with direct simulation). “The Jacobian of the solution is a treasure trove, not only for continuation, but also for analyzing stability of solutions, for detecting bifurcations of solution families, and for computing asymptotic estimates of the effects, on any solution, of small changes in parameters, boundary conditions and boundary shape” [30]. Beyond stability and sensitivity analysis, having the steady states and using Taylor series in their neighborhood (Jacobians, Hessians), one can design stabilizing controllers and observers, solve optimization problems, etc. There is a vast arsenal of algorithms (and codes implementing them) for the computer-aided analysis of system models, going much beyond direct simulation. Yet these algorithms
The methods we discussed can, under appropriate conditions, drastically accelerate the direct simulation of the coarse-grained, macroscopic behavior of certain complex multi-scale systems. Direct simulation, however, is but the simplest computational task one can perform with a system model. It corresponds, in some sense, to physical experimentation: we set parameter values and initial conditions, let the system evolve on the computer and observe its behavior, just like performing a laboratory experiment. Depending on what we want to learn about the system, there exist much more interesting and efficient ways of using the model and the computer. Consider for example the location of steady states; fixed point algorithms, like the Newton–Raphson, are a much more efficient way of finding steady states than direct integration (given a good initial guess). Such fixed point algorithms can locate both stable and unstable steady states (the latter would be extremely difficult or impossible to find with direct simulation). "The Jacobian of the solution is a treasure trove, not only for continuation, but also for analyzing stability of solutions, for detecting bifurcations of solution families, and for computing asymptotic estimates of the effects, on any solution, of small changes in parameters, boundary conditions and boundary shape" [30]. Beyond stability and sensitivity analysis, having the steady states and using Taylor series in their neighborhood (Jacobians, Hessians) one can design stabilizing controllers and observers, solve optimization problems, etc. There is a vast arsenal of algorithms (and codes implementing them) for the computer-aided analysis of system models, going much beyond direct simulation. Yet these algorithms
are applicable to macroscopic equations: ODEs, Differential Algebraic Equations (DAEs), PDEs/PDAEs and their discretizations. Smoothness and Taylor series expansions (derivatives with respect to time, Fréchet derivatives, partial derivatives with respect to parameters) are vital in formulating and implementing most of these algorithms. When the model comes in the form of microscopic/stochastic simulators at a much finer scale – without a closed formula for the equation, i.e., without a "right-hand side" for the time derivative – this arsenal of continuum numerical tools appears useless. Fortunately, the same coarse timestepping idea we used to accelerate direct simulation of an effectively simple multi-scale system can be used to enable its coarse-grained computer-assisted analysis even without explicit macroscopic equations. To illustrate this, we return to our simple scalar example in Fig. 1. We are given a black box timestepper for this equation: a code which, initialized with cn at t = nτ, integrates the equation for time τ and returns the result cn+1 = c(t = [n + 1]τ). We use the notation cn+1 = Φτ(cn). If the task at hand is to find a steady state for the equation, this can be accomplished by calling the timestepper repeatedly (integrating forward in time) until the result no longer changes. Indeed, a steady state of the equation is a fixed point for the timestepper, x* = Φτ(x*). Yet this iteration will only find stable steady states, and the rate of convergence to them depends on the physical dynamics of the problem, becoming increasingly slow close to transition boundaries. The method of choice for finding a steady state (given a good initial guess) would be a Newton–Raphson iteration, which converges quadratically to non-singular steady states:
(df/dc)|_{c(n)} (c(n+1) − c(n)) = −f(c(n)).
Can we trick an integration code (the timestepper) into becoming a fixed point solver? In other words, if we do not have the equation for f(c), but can computationally evaluate the timestepper, can we still do Newton for the steady state? The answer is illustrated in Fig. 1c: we use the computationally evaluated timestepper to solve the fixed point problem G(c) ≡ c − Φτ(c) = 0. Calling the timestepper for an initial condition c(n) gives us Φτ(c(n)) and the residual G(c(n)). Lacking a formula to compute the linearization, we call the timestepper with a nearby initial condition, c(n) + ε. This gives us Φτ(c(n) + ε), and the difference (using Taylor series) is approximately (dΦτ/dc) · ε. This estimate of the action of the Jacobian can then be used in a secant method to compute the next iterate c(n+1) of the steady-state search. Notice again the crucial issue of being able to initialize a simulator at will; after c(n+1) is estimated from the nearby integrations and the secant procedure, we can immediately call the timestepper with initial condition c(n+1) and iterate the process.
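The following is a minimal sketch of such a "wrapper" for a scalar coarse variable, assuming only a black-box function phi_tau implementing the timestepper; the perturbation size, tolerance and iteration cap are illustrative placeholders.

def timestepper_steady_state(phi_tau, c, eps=1e-6, tol=1e-10, max_iter=50):
    # Newton/secant iteration on G(c) = c - phi_tau(c); dG/dc is
    # estimated by differencing two nearby timestepper calls.
    for _ in range(max_iter):
        G = c - phi_tau(c)                      # residual from one call
        if abs(G) < tol:
            break
        G_pert = (c + eps) - phi_tau(c + eps)   # nearby initial condition
        dG = (G_pert - G) / eps                 # estimated derivative of G
        c = c - G / dG                          # update toward the fixed point
    return c

Because this iteration solves G(c) = 0 rather than running the dynamics forward, it converges to unstable steady states as readily as to stable ones.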
We have not done much more than estimate derivatives through differencing. Yet forward integration can now be used (through a computational superstructure, a "wrapper" that implements what we just described in words) to converge to unstable steady states, and eventually to compute bifurcation diagrams. We have enabled a simulation code to perform a task (fixed point computation) for which it had not been designed [31]. This procedure may initially appear hopeless in higher dimensions (e.g., for the large sets of ODEs arising in PDE discretizations). Fortunately, recent developments in large-scale computational linear algebra (the so-called matrix-free solvers and eigensolvers) address precisely this point. Integrating with two nearby initial conditions (m-vectors, differing by the m-vector ε) and taking the difference of the timestepper results provides an estimate of DΦ · ε, the product of the m × m Jacobian matrix of the timestepper (which is not available in closed form) with the known m-vector ε. Matrix-free iterative algorithms (for example, Newton–Krylov/GMRES methods based on the timestepper) can then be used to solve for the steady state (e.g., Refs. [32, 33]). Matrix-free eigensolvers (e.g., subspace iteration methods based on the timestepper) can be used to estimate the part of the spectrum of the linearization close to the imaginary axis, which is relevant for stability and bifurcation computations of the unavailable equation [34].
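A sketch of such a matrix-free Newton–Krylov wrapper follows, using SciPy's GMRES; phi_tau is again an assumed black-box timestepper acting on m-vectors, and the numerical parameters are placeholders, not prescriptions from the original text.

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def newton_gmres_steady_state(phi_tau, c, eps=1e-6, tol=1e-8, max_iter=20):
    # Matrix-free Newton-Krylov on G(c) = c - phi_tau(c) for an m-vector c.
    m = len(c)
    for _ in range(max_iter):
        G = c - phi_tau(c)
        if np.linalg.norm(G) < tol:
            break
        def jac_vec(v):
            # Action of the Jacobian of G on v, estimated from two
            # timestepper calls: DG.v ~ [G(c + eps v) - G(c)] / eps.
            return ((c + eps * v) - phi_tau(c + eps * v) - G) / eps
        J = LinearOperator((m, m), matvec=jac_vec)
        dc, info = gmres(J, -G)                 # Krylov solve of J dc = -G
        c = c + dc
    return c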
We see once more that the quantities necessary for computer-aided analysis (residuals, actions of Jacobians) can be estimated by appropriately designed short calls to the timestepper and subsequent post-processing of the results, even if the equation is not available in closed form. Remarkably, and completely independently of complex/multi-scale computations, these software wrappers have the potential to enable legacy integration codes (large-scale, industrial dynamic simulators) to perform tasks such as stability/bifurcation and operability analysis, controller design and optimization. Our inspiration comes from precisely such a wrapper: the Recursive Projection Method of Ref. [35], which turns a class of large-scale direct simulators (even slightly unstable ones) into convergent fixed point solvers. Clearly, the same type of computational superstructure can turn coarse timesteppers (lifting from macroscopic to consistent microscopic initial conditions, evolving with the fine-scale code, and restricting back to macroscopic variables) into coarse fixed point algorithms and, with appropriate augmentation, coarse bifurcation algorithms (Fig. 2c). Coarse residuals and the actions of coarse slow Jacobians and Hessians can be estimated in a matrix-free context by systematic, judicious calls to the coarse timestepper. Coarse equation solvers and coarse eigensolvers can thus be implemented – many aspects of the computer-assisted analysis of the unavailable macroscopic equation can be
performed without the equation. Motivated by the connection to matrix-free numerical analysis methods, we call this timestepper and coarse-timestepper based computer-assisted analysis equation-free computation [10]. The scope of the approach is very general. Coarse projective integration and coarse bifurcation computations have been used to accelerate lattice kinetic Monte Carlo simulations of catalytic surface reactions [36–39]; biased random walk kMC models of E. coli chemotaxis [40]; kinetic theory-based, interacting particle simulations of hydrodynamic equations [28]; Brownian dynamics simulations of nematic liquid crystals [41]; lattice Boltzmann-BGK simulations of multi-phase, bubbly flows [31]; molecular dynamics simulations of the folding of a peptide fragment [42]; individual-based kMC models of evolving diseases such as influenza [43]; kMC models of dislocation movement in a lattice containing diffusing impurities [44]; molecular dynamics simulations of granular flows; and more. For some spatially distributed problems, this involved gaptooth and patch dynamics versions of the coarse timestepper. As more experience is accumulated and the methods develop further, more problems may become accessible to equation-free computer-aided analysis. Beyond simulation and stability/continuation computations, equation-free computation has been used to perform tasks such as linear stabilizing controller design for kMC, LB-BGK as well as Brownian dynamics simulators [41, 45, 46]; case studies of coarse optimization [47] as well as coarse feedback linearization for kMC simulators [48, 49] have been performed; additional tasks like coarse reverse integration backward in time [50] and coarse dynamic renormalization [10, 51], for the equation-free computation of self-similar solutions, are also possible. Wrappers for legacy codes have been designed (RPM has been wrapped around gPROMS to accelerate rapid pressure swing adsorption computations, and coarse integration of an unavailable envelope equation has also been used for this purpose [52]). Other problems can also be approached through the same basic scheme, including problems which we believe could be modeled by effective medium equations (such as flow in porous media, or reaction-diffusion over microcomposite catalysts). Here again, short bursts of detailed medium simulation can be used to estimate the timestepper of the effective medium equation without deriving this equation explicitly [53]. Similarly, the solution of effective continuum equations for spatially discrete problems (such as lattices of coupled neurons) can be attempted in an equation-free framework [54]. Most of the discussion so far was formulated in a deterministic context; yet many complex systems of interest are well described by stochastic models. Every outcome of computations with such models is in principle different; noise destroys determinism at the level of a single experiment. Determinism is often restored, however, at a different level of observation: when one considers the distribution of the outcomes of several realizations. One can be deterministic (i.e., write predictive equations) about the expectation of a sufficiently
large ensemble of experiments, possibly about the expectation and standard deviation of such an ensemble. Once again, higher order moments of a probability distribution (whose evolution is governed by a Fokker–Planck-type equation) quickly become slaved to lower order moments, and one can be practically predictive if one looks at an appropriately coarse-grained level. While, for example, we cannot know the fate of an individual after a year, we can be practically predictive about the evolution of a few basic statistics of the population of a country. For the right observables, the coarse timestepper is then constructed by simulating a large enough ensemble of realizations of the stochastic problem. An important category of problems can be approximated by dynamics on low-dimensional free-energy surfaces, parameterized by a few well-chosen coarse variables (reaction coordinates). In the statistical mechanics of molecular systems the ability to be "practically predictive" with just a few meaningful reaction coordinates is intimately connected with separation of time scales. Formally, such coordinates could be defined with the help of the leading eigenfunctions of a Frobenius–Perron operator for the detailed problem [55]; yet this is practically unachievable. Instead, physical intuition, experience and data analysis are often used to suggest collective coordinates which hopefully provide dynamically relevant measures of the progress of a reaction. Projecting the full dynamics on such well-chosen reaction coordinates will then retain the macroscopically relevant features of the dynamics with only simplified representations of noise and memory [56, 57]. Short bursts of appropriately initialized molecular dynamics can again be used to estimate on demand the drift and the noise terms of effective Langevin or Fokker–Planck equations in these variables [58]; to find minima and saddles; to solve optimal path problems; and to construct approximate propagators for the density on this surface, without deriving or writing this effective equation in closed form.
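As an illustration of this on-demand estimation, a minimal sketch of how the drift and noise coefficients of an effective Langevin model dq = a(q) dt + b(q) dW might be estimated from an ensemble of short, identically initialized bursts; the function name and data layout are assumptions made here for illustration.

import numpy as np

def estimate_drift_and_noise(q0, q_dt, dt):
    # Kramers-Moyal-type estimates from an ensemble of short bursts that
    # all start at reaction-coordinate value q0 and end at q_dt after dt:
    #   a(q0) ~ E[q_dt - q0] / dt,   b(q0)^2 ~ Var[q_dt - q0] / dt
    increments = np.asarray(q_dt) - q0
    drift = increments.mean() / dt
    noise = np.sqrt(increments.var(ddof=1) / dt)
    return drift, noise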
In our discussion we have endeavored to outline the new possibilities opened by such an equation-free framework. These possibilities are accompanied by many theoretical and practical difficulties. Some of these issues arise in algorithms of continuum numerical analysis themselves (step-size selection in numerical integration, mesh-size selection in spatial discretizations, error monitoring and control in matrix-free iterative methods); some are particular to complex/multi-scale timesteppers (consistent initialization through lifting; estimation and filtering involved in restriction operators; imposition of macroscopically inspired boundary conditions); some arise from the coupling (choice of good observation variables). We will mention one special feature here. Adaptive step-size selection is often performed by doing the computation with different step sizes and estimating the error a posteriori; similarly, adaptive mesh selection is based on computations performed at different mesh sizes to estimate the error. To adaptively determine the level of coarse-graining at which we can be practically predictive, the coarse timestepper can be computed by
conditioning the microscopic simulation at different observation levels, i.e., with different numbers of coarse variables (e.g., surface coverages only, vs. surface coverages and pair probabilities for lattice simulations of surface reactions). Matrix-free, timestepper-based eigensolvers can then be used to estimate the slow eigenvalues and corresponding eigenvectors for the timestepper, which should be tangent to the slow manifold (embodying the missing closure). Gaps in this spectrum and the components of the corresponding eigenvectors can be used to probe the number and nature of coarse variables that should be used to observe the system dynamics (i.e., to locally parametrize the manifold). Handshaking between microscopic solvers and macroscopic continuum numerical analysis consists mainly of subjects traditionally studied in systems theory. System identification based on the results of computational experimentation with the fine-scale model is the most important component. Separation of time scales underpins the low-dimensionality of the macroscopic dynamics. The dynamics of the hierarchy of distribution moments constitute a singularly perturbed system, and brief simulation is used to "cure off-manifold initial conditions" by bringing them back onto the manifold, healing the errors we commit when lifting. The dynamics themselves establish the missing closure; we can think of this as a "closure on demand" approach. Adaptive tabulation [59] can be used to economize on the design of experiments, and the importance of data assimilation/statistical analysis tools to identify non-linear correlations has already been stressed. The use of observer theory (e.g., Refs. [60, 61]) and realization balancing (e.g., Refs. [62, 63]) arises naturally: the microscopic system dynamics are observed on the macroscopic variables, but are realized through the microscopic simulator. Techniques for filtering [64] and variance reduction [65] will play an important role in determining how useful equation-free computations will ultimately be [66]. Timestepper-based methods are, in effect, alternative ensembles for performing microscopic (molecular dynamics, kMC, Brownian dynamics) simulations. These ensembles, however, are motivated by macroscopic numerical analysis, rather than statistical mechanical considerations. We are currently exploring the applicability of these "numerical analysis motivated" ensembles in accelerating equilibrium computations (grand canonical MC computations of micelle formation [67, 68]). It is particularly interesting to consider ensembles motivated by the augmented systems arising in multi-parameter continuation. In such ensembles, like the pathostat [48, 49] based on pseudoarclength continuation, both the variables and the operating parameters themselves evolve, so that the system traces both stable and unstable parts of bifurcation diagrams. An increasing number of experimental systems appears in the literature for which finely spatially distributed actuation authority – coupled with sensing – is available; photosensitive chemical reactions addressed through a
digital projector [69], laser-addressable catalytic reactions [70] and interfacial flows [71], and colloidal particles manipulated through optical tweezers [72] or electric fields [73] are some such examples. When experiments can be initialized at will, the timestepper methods we discussed here can be applied to laboratory – rather than computational – experiments. Continuum numerical methods will then become experimental design protocols, tuned to the task we wish to perform. In this way, mathematics might be performed directly on the physical system, and not on the (approximate) equations modeling it. Many of the mathematical and computational tools combined in this exposition (e.g., system identification, or inertial manifold theory) are well established; we borrowed them, in our synthesis with tools developed in our group, as necessary. Innovative multi-scale/multi-level techniques proposed over the last decade include the quasi-continuum methods of Phillips and coworkers [74, 75]; the optimal prediction methods of Chorin and coworkers [76, 77]; the coupling of continuum fields with stochastic evolution in the work of Oettinger and coworkers [78, 79]; the kinetic-theory-based solvers proposed by Xu and Prendergast [80, 81]; the modification of equation-free computation in the context of conservation laws by E and Engquist [82]; and the lattice coarse-graining by Katsoulakis et al. [83] (see the review by Givon et al. [84] and the discussion in Ref. [10]). In the context of molecular dynamics simulations, the idea of using multiple, and possibly coupled, replica runs to search conformation space (for systems with unmodified or artificially modified energy surfaces) forms the basis of approaches such as conformational flooding [85], parallel replica MD [86], SWARM-MD [87], coarse extended Lagrangian dynamics [88, 89], and simple averaging over multiple trajectories [90, 91]. It is fitting to close this perspective by citing from a 1980 article entitled "Computer-aided analysis of nonlinear problems in transport phenomena" by Brown, Scriven and Silliman [30]: "The nonlinear partial differential equations of mass, momentum, energy, species and charge transport, especially in two and three dimensions, can be solved in terms of functions of limited differentiability – no more than the physics warrants – rather than the analytical functions of classical analysis. . . . Organizing the polynomials in the so-called finite element basis functions facilitates generating and analyzing solutions by large, fast computers employing modern matrix techniques." These sentences celebrate the transition from analytical solutions (of explicitly available equations) to computer-assisted solutions. The solutions are not analytically available for our class of complex/multi-scale problems either; but now the equations themselves are not available, and they are solved in a computer-assisted fashion using appropriate computational experiments at a different level of system description. The similarity of the list of important elements is remarkable: the right basis functions, dictated by the physics (discretizations of the right coarse observation variables); large, fast computers (now
massively parallel clusters, each CPU computing one realization of trajectories for the same “coarse” initial condition); and modern matrix techniques (now matrix-free iterative linear algebra). The approach bridges traditional numerical analysis, computational experimentation with the microscopic simulator, and systems theory; its most vital element is the simple fact that a code can be initialized at will. If one has good macroscopic equations, one should use them. But when these equations are not available in closed form (and such cases arise with increasing frequency in contemporary modeling) the equation-free computational enabling technology we outlined here may hold the key to the engineering of effectively simple systems.
Acknowledgments
This work was partially supported over the years by AFOSR, through an NSF/ITR grant, DARPA and Princeton University. A somewhat shortened version of this article has appeared as a Perspective in the July 2004 issue of the AIChE Journal.
References
[1] S. Chapman and T.G. Cowling, The Mathematical Theory of Non-Uniform Gases, 2nd edn., Cambridge University Press, Cambridge, 1952 (first edition 1939).
[2] J.M. Ottino, "Complex systems," AIChE J., 49(2), 292, 2003.
[3] M.E. Csete and J. Doyle, "Reverse engineering of biological complexity," Science, 295, 1664, 2002.
[4] D. Maroudas, "Multiscale modeling of hard materials: challenges and opportunities for chemical engineering," AIChE J., 46, 878, 2002.
[5] G. Lu and E. Kaxiras, "An overview of multiscale simulations of materials," preprint cond-mat/0401073 at arXiv.org, 2004.
[6] G.E.P. Box, W. Hunter, and J.S. Hunter, Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building, Wiley, New York, 1978.
[7] G. Cybenko, "Just in time learning and estimation," In: Identification, Adaptation and Learning: The Science of Learning Models from Data, NATO ASI Series F153, Springer, Berlin, 423, 1996.
[8] L. Ljung, System Identification: Theory for the User, 2nd edn., Prentice Hall, New York, 1999.
[9] K. Theodoropoulos, Y.-H. Qian, and I.G. Kevrekidis, "Coarse stability and bifurcation analysis using timesteppers: a reaction diffusion example," Proc. Natl. Acad. Sci., 97(18), 9840, 2000.
[10] I.G. Kevrekidis, C.W. Gear, J.M. Hyman, P.G. Kevrekidis, O. Runborg, and K. Theodoropoulos, "Equation-free coarse-grained multiscale computation: enabling microscopic simulators to perform system-level tasks," Commun. Math. Sci., 1(4), 715–762, 2003; original version available as physics/0209043 at arXiv.org.
[11] R. Car and M. Parrinello, "Unified approach for molecular dynamics and density functional theory," Phys. Rev. Lett., 55, 2471, 1985.
[12] C.W. Gear and I.G. Kevrekidis, "Projective methods for stiff differential equations: problems with gaps in their eigenvalue spectrum," SIAM J. Sci. Comp., 24(4), 1091, 2003; original NEC Technical Report NECI-TR 2001-029, Apr. 2001.
[13] C.W. Gear, "Projective integration methods for distributions," NEC Technical Report NECI TR 2001-130, Nov. 2001.
[14] C.W. Gear, I.G. Kevrekidis, and K. Theodoropoulos, "Coarse integration/bifurcation analysis via microscopic simulators: micro-Galerkin methods," Comp. Chem. Eng., 26, 941, 2002; original NEC Technical Report NECI TR 2001-106, Oct. 2001.
[15] P. Ehrenfest and T. Ehrenfest, In: Enzyklopädie der Mathematischen Wissenschaften (1911); reprinted in P. Ehrenfest, Collected Scientific Papers, North Holland, Amsterdam, 1959.
[16] J.P. Ryckaert, G. Ciccotti, and H. Berendsen, "Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes," J. Comp. Phys., 23, 327, 1977.
[17] C.W. Gear, T.J. Kaper, I.G. Kevrekidis, and A. Zagaris, "Projecting on a slow manifold: singularly perturbed systems and legacy codes," submitted to SIADS; available as physics/0405074 at arXiv.org, 2004.
[18] M. Bodenstein, "Eine Theorie der photochemischen Reaktionsgeschwindigkeiten," Z. Phys. Chem., 85, 329, 1913.
[19] J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer Verlag (Appl. Math. Sci. vol. 42), New York, 1983.
[20] P. Constantin, C. Foias, B. Nicolaenko, and R. Temam, Integral Manifolds and Inertial Manifolds for Dissipative Partial Differential Equations, Springer Verlag, New York, 1988.
[21] R. Temam, Infinite Dimensional Dynamical Systems in Mechanics and Physics, Springer Verlag, New York, 1998.
[22] P. Holmes, J.L. Lumley, and G. Berkooz, Turbulence, Coherent Structures, Dynamical Systems and Symmetry, Cambridge University Press, Cambridge, 1998.
[23] I.T. Jolliffe, Principal Component Analysis, Springer Verlag, New York, 1986.
[24] A.J. Smola, O.L. Mangasarian, and B. Schoelkopf, "Sparse kernel feature analysis," Data Mining Institute Technical Report 99-04, University of Wisconsin, Madison, 1999.
[25] R.R. Coifman, S. Lafon, A. Lee, M. Maggioni, F. Warner, and S. Zucker, "Geometric diffusions as a tool for harmonic analysis and structure definition of data," Proc. Natl. Acad. Sci. USA, submitted, 2004.
[26] J. Li, D. Liao, and S. Yip, "Imposing field boundary conditions in MD simulations of fluids: optimal particle controller and buffer zone feedback," Mat. Res. Soc. Symp. Proc., 538, 473, 1998.
[27] I.G. Kevrekidis, "Coarse bifurcation studies of alternative microscopic/hybrid simulators," Plenary Lecture, CAST Division, AIChE Annual Meeting, Los Angeles, 2000; available at http://arnold.princeton.edu/~yannis.
[28] C.W. Gear, J. Li, and I.G. Kevrekidis, "The gaptooth method in particle simulations," Phys. Lett. A, 316, 190–195, 2003.
[29] G. Samaey, I.G. Kevrekidis, and D. Roose, "The gap-tooth scheme for homogenization problems," SIAM MMS, in press, 2005.
[30] R.A. Brown, L.E. Scriven, and W.J. Silliman, "Computer-aided analysis of nonlinear problems in transport phenomena," In: P.J. Holmes (ed.), New Approaches to Nonlinear Problems in Dynamics, SIAM Publications, Philadelphia, p. 298, 1980.
[31] K. Theodoropoulos, K. Sankaranarayanan, S. Sundaresan, and I.G. Kevrekidis, "Coarse bifurcation studies of bubble flow lattice Boltzmann simulations," Chem. Eng. Sci., 59, 2357, 2004; available as nlin.PS/0111040 at arXiv.org.
[32] C.T. Kelley, Iterative Methods for Solving Linear and Nonlinear Equations, SIAM Publications, Philadelphia, 1995.
[33] Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd edn., SIAM Publications, Philadelphia, 2003.
[34] R.B. Lehoucq, D.C. Sorensen, and C. Yang, ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods, SIAM Publications, Philadelphia, 1998.
[35] G.M. Shroff and H.B. Keller, "Stabilization of unstable procedures: a recursive projection method," SIAM J. Numer. Anal., 30, 1099, 1993.
[36] A. Makeev, D. Maroudas, and I.G. Kevrekidis, "Coarse stability and bifurcation analysis using stochastic simulators: kinetic Monte Carlo examples," J. Chem. Phys., 116, 10083, 2002.
[37] A.G. Makeev, D. Maroudas, A.Z. Panagiotopoulos, and I.G. Kevrekidis, "Coarse bifurcation analysis of kinetic Monte Carlo simulations: a lattice gas model with lateral interactions," J. Chem. Phys., 117(18), 8229, 2002.
[38] A.G. Makeev and I.G. Kevrekidis, "Equation-free multiscale computations for a lattice-gas model: coarse-grained bifurcation analysis of the NO+CO reaction on Pt(100)," Chem. Eng. Sci., 59, 1733, 2004.
[39] R. Rico-Martinez, C.W. Gear, and I.G. Kevrekidis, "Coarse projective kMC integration: forward/reverse initial and boundary value problems," J. Comp. Phys., 196, 474, 2004.
[40] S. Setayeshgar, C.W. Gear, H.G. Othmer, and I.G. Kevrekidis, "Application of coarse integration to bacterial chemotaxis," SIAM MMS, accepted, 2004; available as physics/0308040 at arXiv.org.
[41] C. Siettos, M.D. Graham, and I.G. Kevrekidis, "Coarse Brownian dynamics for nematic liquid crystals: bifurcation, projective integration and control via stochastic simulation," J. Chem. Phys., 118(22), 10149, 2003; available as cond-mat/0211455 at arXiv.org.
[42] G. Hummer and I.G. Kevrekidis, "Coarse molecular dynamics of a peptide fragment: free energy, kinetics and long time dynamics computations," J. Chem. Phys., 118(23), 10762, 2003.
[43] J. Cisternas, C.W. Gear, S. Levin, and I.G. Kevrekidis, "Equation-free modeling of evolving diseases: coarse-grained computations with individual-based models," Proc. R. Soc. London A, 460, 2761, 2004; available as nlin.AO/0310011 at arXiv.org.
[44] M. Haataja, D. Srolovitz, and I.G. Kevrekidis, "Apparent hysteresis in a driven system with self-organized drag," Phys. Rev. Lett., 92(16), 160603, 2004; also cond-mat/0310460 at arXiv.org.
[45] C.I. Siettos, A. Armaou, A.G. Makeev, and I.G. Kevrekidis, "Microscopic/stochastic timesteppers and coarse control: a kinetic Monte Carlo example," AIChE J., 49(7), 1922, 2003; nlin.CG/0207017 at arXiv.org.
[46] A. Armaou, C.I. Siettos, and I.G. Kevrekidis, "Time-steppers and coarse control of microscopic distributed processes," Int. J. Robust Nonlinear Control, 14, 89, 2004.
[47] A. Armaou and I.G. Kevrekidis, "Optimal switching policies using coarse timesteppers," Proceedings of the 2003 CDC Conference, Hawaii, 2003; available as nlin.CG/0309024 at arXiv.org.
[48] C.I. Siettos, N. Kazantzis, and I.G. Kevrekidis, "Coarse feedback linearization using timesteppers," submitted to Int. J. Bifurcations and Chaos, 2004.
[49] C.I. Siettos, D. Maroudas, and I.G. Kevrekidis, "Coarse bifurcation diagrams via microscopic simulators: a state-feedback control-based approach," Int. J. Bif. Chaos, 14(1), 207, 2004.
[50] C.W. Gear and I.G. Kevrekidis, "Computing in the past with forward integration," Phys. Lett. A, 321, 335, 2004.
[51] L. Chen, P.G. Debenedetti, C.W. Gear, and I.G. Kevrekidis, "From molecular dynamics to coarse self-similar solutions: a simple example using equation-free computation," J. Non-Newtonian Fluid Mech., 120, 215, 2004.
[52] C.I. Siettos, C.C. Pantelides, and I.G. Kevrekidis, "Enabling dynamic process simulators to perform alternative tasks: a time-stepper based toolkit for computer-aided analysis," Ind. Eng. Chem. Res., 42(26), 6795, 2003.
[53] O. Runborg, I.G. Kevrekidis, and K. Theodoropoulos, "Effective stability and bifurcation analysis: a time stepper based approach," Nonlinearity, 15, 491, 2002.
[54] J. Moeller, O. Runborg, P.G. Kevrekidis, K. Lust, and I.G. Kevrekidis, "Effective equations for discrete systems: a time stepper based approach," Int. J. Bifurcations and Chaos, in press, 2005.
[55] C. Schuette, A. Fischer, W. Huisinga, and P. Deuflhard, "A direct approach to conformational dynamics based on hybrid Monte Carlo," J. Comp. Phys., 151, 146, 1999.
[56] R. Zwanzig, Nonequilibrium Statistical Mechanics, Oxford University Press, New York, 2001.
[57] P. Haenggi, P. Talkner, and M. Borkovec, "Reaction-rate theory: 50 years after Kramers," Rev. Mod. Phys., 62(2), 251, 1990.
[58] R. Kupferman and A. Stuart, "Fitting SDE models to nonlinear Kac–Zwanzig heat bath models," Phys. D, in press, 2005.
[59] S. Pope, "Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation," Comb. Theory Model., 1, 41, 1997; see also Beam Technologies Inc., ISAT-CK Users' Guide (Release 1.0), Ithaca, NY, 1998.
[60] D.G. Luenberger, "Observing the state of a linear system," IEEE Trans. Military Electronics, 8, 74, 1964.
[61] A.J. Krener, "Nonlinear observers in control systems, robotics and automation," In: H. Unbehauen (ed.), Encyclopedia of Life Support Systems (EOLSS), Eolss Publishers, Oxford, 2003.
[62] B.C. Moore, "Principal component analysis in linear systems: controllability, observability and model reduction," IEEE Trans. Automatic Control, 26(1), 17, 1981.
[63] S. Lall, J.E. Marsden, and S. Glavaski, "A subspace approach to balanced truncation for model reduction of nonlinear control systems," Int. J. Robust Nonlinear Control, 12, 519, 2002.
[64] R.E. Kalman and R.S. Bucy, "New results in linear filtering and prediction theory," Trans. ASME, Part D, J. Basic Eng., 83, 95, 1961.
[65] M. Melchior and H.C. Oettinger, "Variance reduced simulations of stochastic differential equations," J. Chem. Phys., 103(21), 9506, 1995.
[66] J. Li, P.G. Kevrekidis, C.W. Gear, and I.G. Kevrekidis, "Deciding the nature of the coarse equation through microscopic simulation," SIAM MMS, 1(3), 391, 2003.
[67] D. Kopelevich, A.Z. Panagiotopoulos, and I.G. Kevrekidis, "Coarse grained computations for a micellar system," in press, 2005.
[68] D. Kopelevich, A.Z. Panagiotopoulos, and I.G. Kevrekidis, "Coarse kinetic approach to rare events: application to micelle formation," J. Chem. Phys., in press, 2005.
[69] T. Sakurai, E. Mihaliuk, F. Chirila, and K. Showalter, "Design and control of wave propagation patterns in excitable media," Science, 296, 2009, 2002.
[70] J. Wolff, A.G. Papathanasiou, I.G. Kevrekidis, H.H. Rotermund, and G. Ertl, "Spatiotemporal addressing of surface activity," Science, 294, 134, 2001.
[71] D. Semwogerere and M.F. Schatz, "Evolution of hexagonal patterns from controlled initial conditions in a Benard–Marangoni convection experiment," Phys. Rev. Lett., 88, 054501, 2002.
[72] D.G. Grier, "A revolution in optical manipulation," Nature, 424, 810, 2003.
[73] W.D. Ristenpart, I.A. Aksay, and D.A. Saville, "Electrically guided assembly of planar superlattices in binary colloidal suspensions," Phys. Rev. Lett., 90, 12, 2003.
[74] R. Phillips, Crystals, Defects and Microstructures, Cambridge University Press, Cambridge, 2001.
[75] M. Ortiz and R. Phillips, "Nanomechanics of defects in solids," Adv. Appl. Mech., 36, 1, 1999.
[76] A. Chorin, A. Kast, and R. Kupferman, "Optimal prediction for underresolved dynamics," Proc. Natl. Acad. Sci. USA, 95, 4094, 1998.
[77] A. Chorin, O. Hald, and R. Kupferman, "Optimal prediction and the Mori–Zwanzig representation of irreversible processes," Proc. Natl. Acad. Sci. USA, 97, 2968, 2000.
[78] H.C. Oettinger, Stochastic Processes in Polymeric Fluids, Springer Verlag, New York, 1996.
[79] M. Laso and H.-C. Oettinger, "Calculation of viscoelastic flow using molecular models: the CONNFFESSIT approach," JNNFM, 47, 1, 1993.
[80] K. Xu and K. Prendergast, "Numerical Navier–Stokes from gas kinetic theory," J. Comp. Phys., 114, 9, 1994.
[81] K. Xu, "A gas-kinetic BGK scheme for the Navier–Stokes equations and its connection with artificial dissipation and the Godunov method," J. Comp. Phys., 171, 289, 2001.
[82] W. E and B. Engquist, "The heterogeneous multiscale methods," Commun. Math. Sci., 1, 87, 2003.
[83] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, "Coarse grained stochastic processes for microscopic lattice systems," Proc. Natl. Acad. Sci. USA, 100(3), 782, 2003.
[84] D. Givon, R. Kupferman, and A. Stuart, "Extracting macroscopic dynamics: model problems and algorithms," submitted to Nonlinearity, 2003; available as Warwick Preprint 11/2003, http://www.maths.warwick.ac.uk/~stuart/extract.pdf.
[85] H. Grubmueller, "Predicting slow structural transitions in macromolecular systems: conformational flooding," Phys. Rev. E, 52(3), 2893, 1995.
[86] A.F. Voter, "Parallel replica method for dynamics of infrequent events," Phys. Rev. B, 57(22), R13985, 1998.
[87] T. Huber and W.F. van Gunsteren, "SWARM-MD: searching conformational space by cooperative molecular dynamics," J. Phys. Chem. A, 102(29), 5937, 1998.
[88] M. Iannuzzi, A. Laio, and M. Parrinello, "Efficient exploration of reactive potential energy surfaces using Car–Parrinello molecular dynamics," Phys. Rev. Lett., 90(23), 238302, 2003.
[89] A. Laio and M. Parrinello, "Escaping free energy minima," Proc. Natl. Acad. Sci. USA, 99(20), 12562, 2002.
[90] I.C. Yeh and G. Hummer, "Peptide loop-closure kinetics from microsecond molecular dynamics simulations in explicit solvent," JACS, 124(23), 6563, 2002.
[91] C.D. Snow, N. Nguyen, V.S. Pande, and M. Gruebele, "Absolute comparison of simulated and experimental protein folding," Nature, 420(6911), 102, 2002.
4.12 MATHEMATICAL STRATEGIES FOR THE COARSE-GRAINING OF MICROSCOPIC MODELS
Markos A. Katsoulakis¹ and Dionisios G. Vlachos²
¹Department of Mathematics and Statistics, University of Massachusetts - Amherst, Amherst, MA 01002, USA
²Department of Chemical Engineering, Center for Catalytic Science and Technology, University of Delaware, Newark, DE 19716, USA
1. Introduction
Spatial inhomogeneity at some small length scale is the rule rather than the exception in most physicochemical processes, ranging from advanced materials' synthesis, to catalysis, to self-assembly, to atmospheric science, to molecular biology. These inhomogeneities arise from thermal fluctuations and complex interactions between microscopic mechanisms underlying conservation laws. While nanometer inhomogeneity and its corresponding ensemble average behavior can be studied via molecular simulation, such as molecular dynamics (MD) and Monte Carlo (MC) techniques, mesoscale inhomogeneity is beyond the realm of available molecular models and simulations. Mesoscopic inhomogeneities are encountered in self-assembly, pattern formation on surfaces and in solution, standing and traveling waves, as well as in systems exposed to an external field that varies spatially over micrometer to centimeter length scales. It is this class of problems that requires "large scale" mesoscopic or coarse-grained molecular models, and where the developments described herein are applicable. It is desirable that such mesoscopic or coarse-grained models meet the following needs:
• They are derived from microscopic ones to retain microscopic mechanisms and interactions and enable a truly first principles multi-scale approach;
• They reach large length and time scales, which are currently unattainable by microscopic molecular models;
• They give the correct statistical mechanics limits;
• They describe equilibrium as well as dynamic properties accurately;
• They retain the correct noise of molecular models to ensure that phenomena such as nucleation, phase transitions, pattern formation, etc. at larger scales are properly modeled;
• They are amenable to mathematical analysis in order to assess the errors introduced during coarse-graining and enable optimized coarse-graining strategies to be developed.
Toward these goals, recent work in Refs. [1–3] focused on developing a novel stochastic modeling and computational framework, capable of describing efficiently much larger length and time scales than conventional microscopic models and simulations. Here, we did not directly attempt to speed up microscopic simulation algorithms such as MD or MC. Instead, our perspective was to derive a hierarchy of new coarse-grained stochastic models – referred to as Coarse-Grained MC (CGMC) – ordered by the magnitude of space/time scales. This new set of models involves a reduced set of observables compared to the original microscopic models, incorporating microscopic details and noise, as well as the interaction of the unresolved degrees of freedom. The outline of this approach can be summarized in the following heuristic steps:
1. Coarse-grid selection. We select a computational grid (lattice) Lc (see Fig. 1a), which will be referred to as the "coarse grid". The microscopic processes describe much smaller scales by explicitly simulating atoms or molecules ("particles") and are defined at the subgrid level: for example, in Ref. [1] they are defined on a "microscopic" grid L (see Fig. 1b and Section 3 below).
2. Coarse-grained Monte Carlo methods. Using the microscopic stochastic model as a starting point, we derive, by carrying out a "stochastic closure", a coarser stochastic model for a reduced number of observables, set on Lc (see Fig. 1a).
[Figure 1. Coarse and fine grids (lattices) with adsorption/desorption and surface diffusion: (a) coarse lattice Lc with cells 1, . . . , m; (b) fine lattice L with sites 1, . . . , q, with adsorption, desorption and diffusion events indicated.]
These new stochastic processes define in essence
coarse-grained MC algorithms which, rather than describing the dynamics of a single microscopic particle as conventional MC does, model the evolution of a coarse observable on Lc. The CGMC models span a hierarchy of length scales starting from the microscopic to the mesoscopic scales, and involve Markovian birth–death and generalized exclusion processes. A key feature of our coarse-graining procedure is that the full hierarchy of our derived stochastic dynamics satisfies detailed balance relations and, as a result, yields random fluctuation mechanisms that are not only self-consistent but also consistent with the underlying microscopic fluctuations. To demonstrate the basic ideas, we consider as our microscopic model an Ising-type system. This class of stochastic processes is employed in the modeling of adsorption, desorption, reaction and diffusion of interacting chemical species on surfaces or through nanopores of materials in numerous areas such as catalysis and microporous materials, growth of materials, biological molecules, magnetism, etc. The fundamental principle on which this type of modeling is based is the following: when the binding of species on a surface or within a pore is relatively strong, these physical processes can be described as jump (hopping) processes from one site to another or to the gas phase (Fig. 1b), with a transition probability that can be calculated, to varying degrees of rigor, from even smaller scales using quantum mechanical calculations and/or transition state theory, or from detailed experiments; see for instance Ref. [4].
2. Microscopic Lattice Models
Ising-type systems are set on a periodic lattice L which is a discretization of the interval I = [0, 1]. We divide I into N (micro)cells and consider the microscopic grid L = (1/N)Z ∩ I in Fig. 1b. Throughout this discussion we concentrate on one-dimensional models; however, our results extend easily (and perform better!) in higher dimensions. At each lattice site x ∈ L the order parameter σ(x) is allowed to take the values 0 and 1, describing vacant and occupied sites, respectively. The energy H of the system, evaluated at the configuration σ = {σ(x) : x ∈ L}, is given by the Hamiltonian

H(σ) = −(1/2) Σ_{x∈L} Σ_{y≠x} J(x − y)σ(x)σ(y) + Σ_{x∈L} h σ(x),    (1)

where h = h(x), x ∈ L, is the external field and J is the inter-particle potential. Equilibrium states of the Ising model are described by the Gibbs states at a prescribed temperature T,

µ_{L,β}(dσ) = Z_L^{−1} exp(−βH(σ)) P_N(dσ),
where β = 1/kT, k is the Boltzmann constant and Z_L is the partition function. Furthermore, the product Bernoulli distribution P_N(σ) with mean 1/2 is the prior distribution on L. The inter-particle potentials J account for interactions between occupied sites. We consider symmetric potentials with finite range interactions, where by the integer L we denote the total number of interacting neighboring sites of a given point on L. The interaction potential can be written as

J(x − y) = (1/L) V(N(x − y)/L),   x, y ∈ L,    (2)
where V(r) = V(−r), and V(r) = 0 for |r| ≥ 1, accounting for possible finite range interactions. Note that for V summable, the choice of the scaling factor 1/L in (2) implies the summability of the potential J, even when N, L → ∞. An additional condition, required in order to obtain error estimates for the coarse-graining procedure, is that V is smooth away from 0 and ∫_R |∂_r V(r)| dr < ∞. The derivation of the interaction potentials can be carried out either from quantum mechanics calculations (e.g., RKKY interactions in micromagnetics [5]) or experimentally. Sometimes potentials involve only nearest neighbors, since further interactions can be neglected, in which case we obtain the classical Ising model. However, in many applications interactions are significant over a large but finite number of neighbors (see for instance the experimental results in Ref. [6]), or even involve true long range interactions such as electrostatics or the RKKY-type exchange energies mentioned earlier. The dynamics of Ising-type models considered in the literature consist of order parameter flips and/or exchanges that correspond to different physical processes. More specifically, a flip at the site x ∈ L is a spontaneous change in the order parameter (1 is converted to 0 and vice versa), while a spin exchange between the neighboring sites x, y ∈ L is a spontaneous exchange of the order parameters at the two locations. For instance, a spin flip can model the desorption of a particle from a surface described by the lattice to the gas phase above, and conversely the adsorption of a particle from the gas phase to the surface; see Fig. 1b. Such a model has also been proposed recently in the atmospheric sciences literature for describing certain unresolved features of tropical convection [7, 8]. On the other hand, spin exchanges describe the diffusion of particles on a lattice; in this case the presence of interactions typically gives rise to non-Fickian macroscopic behavior [9–11]. These mechanisms are set up as follows: if σ is the configuration prior to a flip at x, then we denote the configuration after the flip by σ^x. When the configuration is σ, a flip occurs at x with a rate c(x, σ), i.e., the order parameter at x changes during the time interval [t, t + Δt] with probability c(x, σ)Δt. The resulting stochastic process {σt}_{t≥0} is defined as a continuous-time jump Markov process with generator defined in terms of the
rate c(x, σ) [12]. The imposed condition of detailed balance implies that the dynamics leave the Gibbs measure invariant, and is equivalent to c(x, σ) exp(−βH(σ)) = c(x, σ^x) exp(−βH(σ^x)). The simplest type of dynamics satisfying the detailed balance condition is the Metropolis-type dynamics [13], where the energy barrier for desorption or diffusion depends only on the energy difference between the initial and final states. This type of dynamics is usually employed in MC relaxational algorithms for sampling from the equilibrium canonical Gibbs measure. However, in the context of physicochemical applications involving non-equilibrium evolution of interacting chemical species on surfaces or through nanopores of materials, it is more appropriate to consider dynamics where the activation energy of desorption or diffusion is the energy barrier a species has to overcome in jumping from one lattice site to another or to the gas phase. This type of dynamics is called Arrhenius dynamics and can be derived from MD or transition state theory calculations (see for instance Ref. [4]), to varying degrees of rigor and approximation. The fundamental idea here is that when the binding of species on a surface or within a pore is relatively strong, desorption and diffusion can be modeled as a hopping process from one site to another or to the gas phase, with a transition probability that depends on the potential energy surface. The Arrhenius rate for the adsorption/desorption mechanism is

c(x, σ) = d0 (1 − σ(x)) + d0 σ(x) exp[−βU(x, σ)],    (3)

where

U(x, σ) = Σ_{z≠x, z∈L} J(x − z)σ(z) − h(x)

is the total energy contribution from the interactions of the particle located at the site x ∈ L with the other particles, as well as from the external field h. Typically, an additional term corresponding to the energy associated with the surface binding of the particle at x can also be included in the external field h in U; finally, d0 is a rate constant that mathematically can be chosen arbitrarily but physically is related to the pre-exponential of the microscopic processes. Similarly we can define an Arrhenius mechanism for diffusion; in both cases the dynamics satisfy detailed balance.
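As a concrete illustration, here is a minimal sketch of evaluating the Arrhenius rate (3) on a one-dimensional periodic lattice; the array layout of the potential and the default parameter values are assumptions made for illustration only.

import numpy as np

def arrhenius_rate(x, sigma, J, h, beta, d0=1.0):
    # Arrhenius adsorption/desorption rate of Eq. (3).
    #   sigma : 0/1 occupancy array on a periodic 1-D lattice of N sites
    #   J     : assumed array of couplings J[r] for lattice distance r
    #   h     : external field at site x
    N = len(sigma)
    U = -h
    for z in range(N):
        if z != x:
            r = min(abs(x - z), N - abs(x - z))   # periodic lattice distance
            U += J[r] * sigma[z]
    if sigma[x] == 0:
        return d0                                  # adsorption at an empty site
    return d0 * np.exp(-beta * U)                  # desorption over barrier U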
3. Coarse-grained Stochastic Processes and CGMC Algorithms
First we construct the coarse grid Lc by dividing I = [0, 1] into m equal-size coarse cells (see Fig. 1a); in turn, each coarse cell is subdivided into q
(micro)cells. Hence I is divided into N = mq cells, and L = (1/mq)Z ∩ I is the microscopic lattice in Fig. 1b. Each coarse cell is denoted by Dk, k = 1, . . . , m, and the coarse lattice corresponding to the coarse cell partition (Fig. 1a) is defined as Lc = (1/m)Z ∩ I. We consider the integers k = 1, . . . , m as the unscaled lattice points of Lc; the coarse-grained stochastic processes defined below are set on Lc, while the Ising model is set on the microscopic lattice L. Next we define a coarse-grained observable on the coarse lattice Lc. One such intuitive choice, motivated by renormalization theory [14], is the average over each coarse cell Dk:
F(σt)(k) := Σ_{y∈Dk} σt(y),   k = 1, . . . , m.    (4)
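In code, this restriction is a simple block sum; the following one-liner is an illustrative sketch, assuming the microscopic configuration is stored as a length-mq array.

import numpy as np

def restrict(sigma, q):
    # Block-sum restriction of Eq. (4): particle count in each coarse
    # cell of q microscopic sites; sigma has length m*q.
    return sigma.reshape(-1, q).sum(axis=1)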
Although F(σt) is not a Markov process, our goal here is to derive a Markov process ηt, defined on the coarse lattice Lc, approximating the true microscopic average F(σ). Computationally this new process η is advantageous over the underlying microscopic σ, since it has a substantially smaller state space than σ and can be simulated much more efficiently. We next derive, by a direct calculation from the microscopic stochastic process, the exact coarse-grained rates of adsorption and desorption for the microscopic average F(σt) in coarse cell Dk; these rates are, respectively,

c̄_a(k) := Σ_{x∈Dk} c(x, σ)(1 − σ(x)),   c̄_d(k) := Σ_{x∈Dk} c(x, σ)σ(x).    (5)
In the case of Arrhenius diffusion the exact jump rate from cell Dk to Dl of the microscopic average (4) is given by

c̄_diff(k → l) := Σ_{x∈Dk, y∈Dl} c(x, y, σ)σ(x)(1 − σ(y)).    (6)
The main goal here is to express these exact coarse-grained rates, up to a controlled error, as functions of the "mesoscopic" random variable F(σ), rather than the microscopic σ. This step yields a Markov process that will approximate, in a probability metric, the microscopic average (4). We refer to this procedure as a closure, in analogy to closure arguments in kinetic theory and the derivation of coarse-grained deterministic PDE from interacting particle systems as hydrodynamic limits [12]. However, here we carry out a stochastic closure that retains fluctuations of the microscopic system. We demonstrate these arguments only in the case of Arrhenius dynamics; full details, including other dynamics, can be found in Refs. [1–3]. For the adsorption/desorption case we define the coarse-grained birth–death Markov process η = {η(k) : k ∈ Lc} approximating (4), where the random variable η(k) ∈ {0, 1, . . . , q} counts the number of particles in each coarse cell Dk. Using the rate calculations above we obtain the update rate with which the
value η(k) ≈ F(σ) is increased by 1 (adsorption of a single particle in the coarse cell Dk) and decreased by 1 (desorption in Dk), respectively:

c_a(k, η) = d0 [q − η(k)],   c_d(k, η) = d0 η(k) exp[−β Ū(k)],    (7)

where

Ū(l) = Σ_{k∈Lc, k≠l} J̄(l, k)η(k) + J̄(0, 0)(η(l) − 1) − h̄(l).

As we show in Ref. [1], this new rate can be obtained from (5) with an error of the order O(q/L) when replacing F(σ) ≈ η. Finally, the coarse-grained potential J̄ is defined by including the average of all contributions of pairwise microscopic interactions between coarse cells and within the same coarse cell,
J̄(k, l) = m² ∫∫_{Dl×Dk} J(r − s) dr ds,    (8)
where the area of Dl × Dk is equal to 1/m². The coarse-grained external field h̄ is defined accordingly. Wavelets with vanishing moments can also be used in the construction of the coarse-grained potential [11, 15]. Similarly, in the Arrhenius diffusion case we obtain [3] the new rate

c_diff(k → l, η) = (1/q) η(k)(q − η(l)) exp[−β(U0 + Ū(k, η))],    (9)

describing the migration of a particle from the coarse cell Dk to the cell Dl if k, l are nearest neighbors, and c_diff(k → l, η) = 0 otherwise; the generator for the Markov process ηt is defined analogously. A crucial step, special to the diffusion case, in obtaining (9) from (6) is the approximation of the local function σ(x)(1 − σ(y)) in (6) as a function of the coarse-grained variable η. This step is trivial in the spin flip dynamics, since the corresponding local functions in (5) are linear. Here we make the closure assumption that the particles are at local equilibrium inside each coarse cell Dk; we can thus replace σ(x) by q⁻¹η(k) (resp. σ(y) by q⁻¹η(l)). This substitution somewhat parallels the "Replacement Lemma" in the interacting particle systems literature, necessary to obtain deterministic PDE as hydrodynamic limits: relative entropy estimates describing local equilibration of interacting particles allow one to approximately rewrite local functions as functions of the coarse-grained variables; see Ref. [16]. This analogy becomes precise in the discussion, in Section 6, of the relative entropy error estimate (18) below, between the microscopic process σ and the coarse-grained η. The invariant measure for the coarse-grained process {ηt}_{t≥0} is a canonical Gibbs measure related to the original microscopic dynamics {σt}_{t≥0}:

µ_{m,q,β}(dη) = (1/Z_{m,q,β}) exp(−β H̄(η)) P_{m,q}(dη),    (10)
where the product binomial distribution P_{m,q}(η) is the prior distribution, arising from the microscopic prior by including q independent sites. Furthermore, H̄ is the coarse-grained Hamiltonian derived from the microscopic H,

H̄(η) = −(1/2) Σ_{l∈Lc} Σ_{k∈Lc, k≠l} J̄(k, l)η(k)η(l) − (J̄(0, 0)/2) Σ_{l∈Lc} η(l)(η(l) − 1) + Σ_{k∈Lc} h̄ η(k).    (11)
The same-cell interaction term η(l)(η(l) − 1) yields the global mean field theory when the coarse-graining is performed beyond the interaction parameter L, while at the other extreme, q = 1, it is consistent with the Ising case. As a result we obtain a complete hierarchy of MC models, termed coarse-grained MC, spanning from Ising (q = 1) to mean field statistical mechanics limits, where the latter does not include detailed interactions but does include noise, unlike the usual ODE mean field theories. Finally, it can easily be shown, both in the adsorption/desorption and in the diffusion case, that the condition of detailed balance for η with respect to the measure µ_{m,q,β} holds. Thus, combined mechanisms of diffusion, adsorption and desorption, which typically coexist in physical systems [17], can be modeled and simulated consistently for every coarse-graining level q. Detailed balance guarantees the proper inclusion of fluctuations in the coarse-grained model as they arise from the microscopic dynamics. This is justified in part by the form of the prior in (10); it is tested numerically in Refs. [1, 3] and it is proved rigorously by the loss-of-information estimate (18) below.
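To make the birth–death process concrete, the following is a minimal sketch of evaluating the coarse-grained rates (7) and executing one continuous-time kinetic MC event; the matrix storage of J̄ (with the same-cell coupling J̄(0, 0) placed on the diagonal) and the helper names are assumptions for illustration.

import numpy as np

def cgmc_rates(eta, Jbar, hbar, q, beta, d0=1.0):
    # Coarse-grained adsorption/desorption rates of Eq. (7).
    #   eta  : particle counts per coarse cell (0..q), length m
    #   Jbar : assumed m x m matrix of coarse-grained couplings, Eq. (8)
    diag = np.diag(Jbar)
    Ubar = Jbar @ eta - diag * eta + diag * (eta - 1) - hbar
    c_ads = d0 * (q - eta)                       # eta(k) -> eta(k) + 1
    c_des = d0 * eta * np.exp(-beta * Ubar)      # eta(k) -> eta(k) - 1
    return c_ads, c_des

def cgmc_step(eta, c_ads, c_des, rng):
    # One kinetic MC (SSA) event: pick a move with probability
    # proportional to its rate and advance the clock exponentially.
    rates = np.concatenate([c_ads, c_des])
    total = rates.sum()
    i = rng.choice(rates.size, p=rates / total)
    m = eta.size
    eta[i % m] += 1 if i < m else -1
    return rng.exponential(1.0 / total)          # elapsed time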
4. Coarse-grained Monte Carlo Algorithms
The implementation of coarse-grained MC (CGMC), based on (7) and (9), is essentially identical to the microscopic MC [18], with a few differences. First, the inter-particle potential J is coarse-grained at the beginning of a simulation to represent interactions between particles within each cell (a feature absent in microscopic MC) as well as interactions with neighboring cells. Second, the order parameter is still an integer but varies between zero and q, instead of between zero and one, which is typical for microscopic MC. Otherwise, microscopic and coarse-grained algorithms are basically the same. Finally, we should comment on the significant computational savings resulting from coarse-graining. For CGMC the CPU time in kinetic MC simulation with global update, i.e., searching the entire lattice to identify the chosen site, scales approximately as O(m³), vs. O(N³) for a conventional MC algorithm. In addition, coarse-grained potentials J̄ are compressed through the wavelet expansion (4), and thus additional savings are made in the calculation of energetics.
Overall, in the case of adsorption/desorption processes the CPU time for the same real time can decrease with increasing q approximately as O(1/q²). For example, even a very modest 10-fold reduction in the number of sites (q = 10) reduces the CPU time by a factor of 10², yielding a significant enhancement in performance. Thus, while for macroscopic size systems at the millimeter length scale or larger microscopic MC simulations are impractical on a single processor, the computational savings of CGMC make it a suitable tool capable of capturing large scale features, while retaining microscopic information on intermolecular forces and particle fluctuations. CGMC can capture mesoscale morphological features by incorporating the noise correctly, as well as simulating large length scales. For instance, we refer to the standing wave example for adsorption/desorption computed by CGMC in Ref. [2]; in this case we employed an exact analytic solution for the average coverage as a rigorous benchmark for the CGMC computations. A striking difference between diffusion and adsorption/desorption simulations is that in the case of diffusion we also have coarse-graining in time by a factor q². This is intuitively clear if one considers the additional space covered by a single coarse-grained jump, which would take q microscopic jumps. We refer to Ref. [3] for theory and simulations justifying and demonstrating precisely this coarse-graining in time effect. In turn, this approach helps mitigate the hydrodynamic slowdown effect in conservative MC and results in additional CPU savings. Overall, for long potentials, CPU savings of up to q⁴ occur for continuous-time kMC simulation.
5. Connections to Stochastic Mesoscopic Models and Their Simulation
In this section we discuss connections of CGMC with coarse-grained models involving stochastic PDEs (SPDEs), derived mainly in the physics and, more recently, in the mathematics communities. These approaches involve a heuristic, and in some cases rigorous, passage to the infinite lattice limit in averaged quantities such as (4). Then, under suitable conditions, random fluctuations in the microscopic average (4) are suppressed in analogy to the law of large numbers, but are accounted for as corrections, similarly to the central limit theorem. In the end the limit of (4) is expected to solve an SPDE. A classical example of such an SPDE is the stochastic Cahn–Hilliard–Cook model [19], which takes the abstract form:
$$c_t - \nabla \cdot \left( \mu[c]\, \nabla \frac{\delta E[c]}{\delta c} \right) - \frac{1}{\sqrt{N}}\, \nabla \cdot \left\{ \sqrt{2\mu[c]}\; \dot{W} \right\} = 0, \qquad (12)$$
where Ẇ = (Ẇ_1(x, t), ..., Ẇ_d(x, t)) is a space/time white noise and δE[c]/δc is the variational derivative of the free energy

$$E[c] = \int_D \frac{|\nabla c|^2}{2}\, dy + \beta h \int_D c(y)\, dy + \int_D F(c(y))\, dy. \qquad (13)$$
Here F(c) is a double-well potential and µ[c] is the mobility of the system. In the case of Cahn–Hilliard–Cook models the mobility is typically µ[c] = 1 or µ[c] = c(1 − c). In Ref. [10] we derived a stochastic PDE of the type (12) as a mesoscopic theory for the diffusion of molecules interacting with a long range potential for the microscopic dynamics, by studying the asymptotics of (4) as the number of interacting neighbors L → ∞. The free energy in this case is

$$E[c] = -\frac{\beta}{2} \iint V(y - y')\, c(y)\, c(y')\, dy\, dy' + \beta h \int c(y)\, dy + \int r(c(y))\, dy, \qquad (14)$$
where r(c) = c log c + (1 − c) log(1 − c), and the mobility depends explicitly on the choice of microscopic dynamics:

$$\mu[c] = \begin{cases} \beta c(1-c), & \text{Metropolis-type}, \\ \beta c(1-c)\, e^{-\beta V * c}, & \text{Arrhenius}, \end{cases} \qquad (15)$$
where * denotes the convolution of two functions. Here the derivation of the noise is not based on a central limit theorem type of scaling, which would linearize (12) and would not account for the expected hysteresis and metastability. Instead, the noise term is "designed" so that (a) as expected, (12) satisfies a fluctuation–dissipation relation, and (b) it yields the same large deviation functional and rare events as the microscopic spin exchange process. We refer to Ref. [20] for an overview of mesoscopic PDE-based theories for both diffusion and adsorption/desorption processes. The connection of CGMC with SPDEs such as (12) can be readily seen even with an equilibrium calculation: formally, the Gibbs states associated with this Langevin-type stochastic equation are given by the free energy E[c]. On the other hand, in Ref. [1] we derived an asymptotic formula for the coarse-grained Gibbs measure (10) as q → ∞:

$$\mu_{m,q,\beta}(\eta_0) = \frac{1}{Z_{m,q,\beta}} \exp\left( -qm \left( E_{m,q}(\eta_0) + o_q(1) \right) \right), \qquad (16)$$

where

$$E_{m,q}[C] = -\frac{\beta}{2m\bar{L}} \sum_{k \in \mathcal{L}_c} \sum_{l} \bar{V}(k,l)\, C_k C_l + \frac{\beta h}{m} \sum_{k \in \mathcal{L}_c} C_k + \frac{1}{m} \sum_{k \in \mathcal{L}_c} r(C_k), \qquad (17)$$
and J̄ = (1/L)V̄ and L̄ = L/q is the coarse-grained potential length of J̄; we also define the average coverage at k ∈ L_c, C_k = λ_k/q, where η_0 = (λ_1, λ_2, ..., λ_m), 0 ≤ λ_i ≤ q, and r(c) = c log c + (1 − c) log(1 − c). It is now clear that when the coarse-grained potential V̄ is long ranged, (17) is merely a discrete version of the free energy (14). On the other hand, if V̄ is a nearest neighbor potential, then (17) yields a discrete version of the Ginzburg–Landau energy (13). In passing we remark that (16) also implies that, for large q and m fixed, the most probable equilibrium configurations of the coarse-grained process η_t are given by the minimizers of the discrete free energy (17). A notable advantage of the CGMC methods over numerically solving Cahn–Hilliard–Cook type equations is the explicit connection to the microscopic system. While the connection with the underlying microscopic system is clear for the stochastic mesoscopic equations (12), (15), their derivation from the microscopics is valid only for L ≫ 1, which is not a requirement for our coarse-grained systems, as the estimate (18) demonstrates. From a mathematical perspective, due to the singular nature of the noise term, such SPDEs are expected to have at best distributional solutions in dimensions higher than one. As a result, although direct simulation of (12) (see (15)) may have the advantage that PDE-based spectral methods can be used to overcome the hydrodynamic slowdown of MC algorithms, see Ref. [21], it requires careful handling of the highly singular noise term so that the scheme satisfies the detailed balance condition. For detailed adsorption/desorption mechanisms, it is not even clear what the stochastic mesoscopic analogue of (12) that still satisfies detailed balance would be. On the other hand, CGMC includes fluctuations consistently with the detailed balance principle, allowing for the mesoscopic modeling of multiple simultaneous mechanisms, such as particle diffusion, adsorption, desorption and reaction, while always properly including stochastic fluctuations.
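As a concrete reading of (17), the short Python sketch below (our own construction, with a placeholder interaction matrix W standing in for the scaled coarse kernel βV̄(k,l)/(2mL̄)) evaluates the discrete free energy for a coverage vector C. By the remark above, minimizing this functional over C identifies the most probable coarse equilibrium configurations for large q.

```python
import numpy as np

def coarse_free_energy(C, W, beta, h):
    # Discrete free energy of the form (17) for coverages 0 < C[k] < 1.
    # W[k, l] plays the role of (beta / (2*m*Lbar)) * Vbar(k, l);
    # r(C) = C log C + (1 - C) log(1 - C) is the entropy term.
    m = len(C)
    r = C * np.log(C) + (1.0 - C) * np.log(1.0 - C)
    return float(-C @ W @ C + beta * h * C.sum() / m + r.sum() / m)

m = 16
k = np.arange(m)
W = 0.01 * np.exp(-np.abs(k[:, None] - k[None, :]))   # placeholder kernel
C = np.full(m, 0.3)
print(coarse_free_energy(C, W, beta=2.0, h=0.0))
```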
6. The Numerical Analysis of CGMC: An Information Theory Approach
In this section we discuss the error analysis between microscopic models and CGMC in a more traditional numerical analysis sense. The error here represents the loss of information in the transition from the microscopic probability measure to the coarse-grained one. Such relative entropy estimates give a first mathematical rationale for the parameter regimes (e.g., the degree of coarse-graining) for which CGMC is expected to give errors within a certain tolerance. In Refs. [1, 3] we rigorously and computationally demonstrated that coarse-grained and microscopic processes share the same asymptotic mean
behavior, i.e., that averages of the microscopic and coarse-grained processes solve the same mesoscopic deterministic PDE in the long-range interactions limit L → ∞. In addition to comparing the asymptotic mean behavior of coarse-grained and microscopic systems, we would like to understand how well, and in what regimes, CGMC captures the fluctuations of the microscopic system. As a first step in this direction, in numerical simulations in Ref. [2] we observed almost pathwise agreement between CGMC and microscopic MC simulations in the adsorption/desorption case when the level of coarse-graining q was substantially smaller than L, e.g., q/L ≈ 0.25 and L = 40 (we note that in two dimensions a potential whose interactions extend just three lattice units has L of about 30). These simulations suggested that, in order to address questions beyond the agreement in average behavior, we would like to compare the entire probability measures of the microscopic and CG processes. Our principal idea in this direction is to obtain a quantitative measure of the loss of information during coarse-graining from finer to coarser scales: we consider the exact coarse-graining of the microscopic Gibbs measure, µ_{L,β} ∘ F(η) := µ_{L,β}({σ : F(σ) = η}), where F is the projection operator from fine to coarse variables (4), and compare it to the Gibbs measure in CGMC (10). The relative entropy between the two measures provides a first quantitative estimate of the loss of information during the coarse-graining process from finer to coarser scales [3]:

$$R(\mu_{m,q,\beta}\,|\,\mu_{L,\beta} \circ F) := N^{-1} \sum_{\eta} \mu_{m,q,\beta}(\eta)\, \log \frac{\mu_{m,q,\beta}(\eta)}{\mu_{L,\beta}(\{\sigma : F(\sigma) = \eta\})} = O\!\left(\frac{q}{L}\right). \qquad (18)$$
Notice that the estimate (18) is on the specific entropy, which is the relative entropy normalized by the size N of the microscopic system; the loss of information, however small in each coarse cell, grows linearly with the system size as we take into account a growing number of cells. Relation (18) gives some initial mathematical intuition, at least at equilibrium, on how to rationally design a "good" CGMC algorithm, i.e., how to select the extent of coarse-graining q, given a potential J with a total number of interacting neighbors L and a desired accuracy. In fact, (18) is essentially a numerical analysis estimate between the exact solution of the microscopic system σ and the approximating CGMC η. Such estimates between the solution of a PDE and a corresponding finite element approximation are usually given in an L^p or Sobolev norm. Here the relative entropy provides the analogue of a norm, without strictly being one. Furthermore, due to the Pinsker inequality [22], the estimate (18) implies an estimate in the total variation norm of the probability measures.
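The way (18) controls a genuine distance can be checked numerically. The Python sketch below (our own construction) computes the relative entropy of two discrete probability vectors and verifies the Pinsker bound, total variation ≤ sqrt(R/2), quoted from [22].

```python
import numpy as np

def relative_entropy(mu, nu):
    # R(mu | nu) = sum_x mu(x) * log(mu(x) / nu(x)) for discrete measures.
    mu, nu = np.asarray(mu, float), np.asarray(nu, float)
    mask = mu > 0
    return float(np.sum(mu[mask] * np.log(mu[mask] / nu[mask])))

def total_variation(mu, nu):
    # Total variation distance = half the l1 distance for discrete measures.
    return 0.5 * float(np.sum(np.abs(np.asarray(mu) - np.asarray(nu))))

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.4, 0.4, 0.2])
R, tv = relative_entropy(mu, nu), total_variation(mu, nu)
assert tv <= np.sqrt(R / 2.0)   # Pinsker inequality
print(R, tv)
```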
7. Conclusions
Here we provided an overview of the first steps taken in deriving a mathematically founded framework for the coarse-graining of stochastic processes and the associated kinetic Monte Carlo simulations. We have shown that coarse-grained models and simulations can reach larger scales while retaining information about the microscopic mechanisms and interaction potentials, and the correct noise. Information theory methods have been introduced to assess the errors (loss of information) during coarse-graining. We believe that these tools will be essential to providing strategies for optimized coarse-graining designs. Concluding, we remark that while our focus has been on simple Ising-type models, the concepts introduced here can be extended to more complex systems. One such application to the atmospheric sciences arises in Ref. [8], where CGMC models, coupled with the macroscopic fluid and thermodynamic equations, are used to parametrize underresolved (subgrid) features of tropical convection. Furthermore, in recent years there has been great interest in the polymer science and biology literature in coarse-graining atomistic models of polymer chains; we refer to the review article on coarse-graining by Muller-Plathe, Ref. [23], for further discussion. In this context, coarse-graining is typically achieved by collecting a number of atoms (on the order of 10–20) in a polymer chain into a "super-atom" and semi-empirically/analytically fitting parameters of a known potential type Ū, e.g., Lennard–Jones, to derive the coarse-grained potential for the super-atoms. Other coarse-graining techniques in the polymer science literature, including the bond fluctuation model and its variants, share the perspective of CGMC: an atomistic chain model is mapped onto a lattice, where a super-atom occupies a lattice cell (similarly to the coarse cells D_k in Section 2). All these coarse-grained models have, to varying degrees, the drawback that they rely on parameterized coarse potentials; hence, at different conditions (e.g., temperature, density, composition), they need to be re-parameterized [23]. Furthermore, since they are not directly derived from the atomistic dynamics, it is not clear whether they reproduce transport and dynamic properties such as melt viscosities. We hope that our methods can eventually provide a new mathematical framework for these approaches and a more systematic, if not completely mathematical, way to construct coarse-grained dynamics and potentials for such complex systems.
References
[1] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, J. Comp. Phys., 186, 250, 2003.
[2] M.A. Katsoulakis, A.J. Majda, and D.G. Vlachos, Proc. Natl. Acad. Sci. USA, 100, 782, 2003.
[3] M.A. Katsoulakis and D.G. Vlachos, J. Chem. Phys., 112, 18, 2003.
[4] S.M. Auerbach, Int. Rev. Phys. Chem., 19, 155, 2000.
[5] R.C. O'Handley, Modern Magnetic Materials: Principles and Applications, Wiley, New York, 2000.
[6] S. Renisch, R. Schuster, J. Wintterlin, and G. Ertl, Phys. Rev. Lett., 82, 3839, 1999.
[7] A.J. Majda and B. Khouider, Proc. Natl. Acad. Sci. USA, 99, 1123, 2002.
[8] B. Khouider, A.J. Majda, and M.A. Katsoulakis, Proc. Natl. Acad. Sci. USA, 100, 11941, 2003.
[9] G. Giacomin and J.L. Lebowitz, J. Stat. Phys., 87, 37, 1997.
[10] D.G. Vlachos and M.A. Katsoulakis, Phys. Rev. Lett., 85, 3898, 2000.
[11] R. Lam, T. Basak, D.G. Vlachos, and M.A. Katsoulakis, J. Chem. Phys., 115, 11278, 2001.
[12] C. Kipnis and C. Landim, Scaling Limits of Interacting Particle Systems, Springer, New York, 1999.
[13] B. Gidas, Topics in Contemporary Probability and its Applications, J. Laurie Snell (ed.), CRC Press, Boca Raton, 1995.
[14] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, vol. 85, Addison-Wesley, New York, 1992.
[15] A.E. Ismail, G.C. Rutledge, and G. Stephanopoulos, J. Chem. Phys., 118, 4414, 2003.
[16] H.T. Yau, Lett. Math. Phys., 22, 63, 1991.
[17] M. Hildebrand and A.S. Mikhailov, J. Phys. Chem., 100, 19089, 1996.
[18] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, London, 2000.
[19] H.E. Cook, Acta Metall., 18, 297, 1970.
[20] M.A. Katsoulakis and D.G. Vlachos, IMA Vol. Math. Appl., 136, 179, 2003.
[21] D.J. Horntrop, M.A. Katsoulakis, and D.G. Vlachos, J. Comp. Phys., 173, 361, 2001.
[22] T.M. Cover and J.A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[23] F. Muller-Plathe, Chem. Phys. Chem., 3, 754, 2002.
[24] G. Beylkin, R. Coifman, and V. Rokhlin, Commun. Pure Appl. Math., 44, 141, 1991.
[25] M. Hildebrand, A.S. Mikhailov, and G. Ertl, Phys. Rev. E, 58, 5483, 1998.
[26] M. Seul and D. Andelman, Science, 267, 476, 1995.
[27] A.F. Voter and J.D. Doll, J. Chem. Phys., 82, 80, 1985.
4.13 MULTISCALE MODELING OF CRYSTALLINE SOLIDS
Weinan E and Xiantao Li
Program in Applied and Computational Mathematics, Princeton University
1. Introduction
Multiscale modeling and computation has recently become one of the most active research areas in applied science. With rapidly growing computing power, we are increasingly capable of modeling the details of physical processes. Nevertheless, we still face the challenge that the phenomena of interest are oftentimes the result of strong interaction between multiple spatial and temporal scales, and the physical processes are described by radically different models at different scales. The mechanical behavior of solids is a typical example that exhibits such a multiscale character. At the fundamental level, everything about the solid can be attributed to the electronic structure, which obeys the Schrödinger equation. Atomic interactions and crystal structures can be described at the atomistic scale using molecular dynamics. Mechanical properties at the scale of the material are often modeled using continuum mechanics, for which one speaks of stresses and strains. In between there are various levels of mesoscales where one deals with defects such as grain boundaries, dislocation dynamics, and dislocation bundles. What makes the problem challenging is that these different scales are often strongly coupled with each other. Continuum models usually offer an efficient way of studying material properties. But they suffer from inadequate accuracy and the lack of microstructural information that would tell us the microscopic mechanisms for why the material responds the way it does. Atomistic models, on the other hand, allow us to probe the detailed crystalline and defect structure. However, the length and time scales of interest are often far beyond what a full atomistic computation can reach. This is where multiscale modeling comes into play. The idea is that by coupling microscopic models such as molecular dynamics (MD)
with macroscopic models such as continuum mechanics, one might be able to develop numerical tools that have accuracy comparable to the microscopic model and efficiency comparable to the macroscopic model. In this article, we will review some of the strategies that have been proposed for this purpose. We will focus on the coupling between molecular dynamics and continuum mechanics, although some of the strategies can be formulated in a more general setting. In addition, for simplicity we will concentrate on concurrent coupling methods that link different scales "on the fly". Broadly speaking, concurrent coupling methods can be divided into two main categories, those based on energetic formulations and those based on dynamic formulations. We will discuss them separately.
2. Energy-based Methods
At the atomistic scale, the deformation of the solid is described by the (displaced) positions of the atoms that make up the solid. At zero temperature, the positions of the atoms are obtained by minimizing the total energy of the system, which consists of the potential energy due to the interaction of the atoms and the energy due to applied forces:

$$E_{tot} = E(x_1, \ldots, x_N) - \sum_j f(x_j) \qquad (1)$$
Here x_j denotes the displaced position of the jth atom. We will use x_j^0 to denote its reference position, which is taken to be the equilibrium position; u_j = x_j − x_j^0 is the displacement of the jth atom. At the continuum level, the deformation of the solid is described by the displacement field u, which also minimizes the total energy of the system, consisting of the elastic energy caused by the deformation and the energy due to external forces:
$$\int_\Omega \left( \varepsilon(\nabla u) - f_{ex} \cdot u \right) dx \qquad (2)$$
Here ε is the strain energy density. Numerically this problem is solved by finite element methods on an appropriate triangulation {Ω_α} of the domain that defines the solid. In both cases, dynamics can be generated using Hamilton's equations for the corresponding energy functional. Clearly the continuum approach is more efficient once we know the strain energy density. The conventional approach in continuum mechanics is to model this empirically using a combination of experimental data and analytical reasoning. Recently developed multiscale approaches, on the other hand, aim
at computing the strain energy directly from the atomistic model. Next we will discuss several methods that have been developed for this purpose. To begin with, let Q be an appropriately defined operator that maps the microscopic configuration {u_j} of the atoms to the macroscopic displacement field u. Then consistency between (1) and (2) implies that the strain energy should be given in terms of the atomistic model by

$$e[u] = \min_{Q\{u_j\} = u} E_{tot}. \qquad (3)$$
However, this formula is quite impractical for numerical purposes, since the number of atoms involved is often too large, and one has to come up with appropriate approximation procedures.
2.1. QC – Quasicontinuum Method
One remarkably successful approach is the quasicontinuum (QC) method [1, 2]. QC is a way of simulating the macroscale nonlinear deformation of crystalline solids using molecular mechanics. It consists of three main components.
• A finite element method on an adaptively generated mesh, which is automatically refined to the atomistic level near defects. Away from the defects, the mesh is coarsened to reflect the slow variation of the displacement field.
• A kinematic constraint by which a subset of atoms, called representative atoms, are selected. The deformation of the other atoms is expressed in terms of the deformation of the representative atoms. This reduces the number of degrees of freedom in the problem.
• A summation rule that computes an approximation to the total energy of the system by visiting only a small subset of the atoms. A simple example of the summation rule is the Cauchy–Born rule, which computes the local energy by assuming the deformation is locally uniform.
We now discuss these components in some detail. Ideally, in order to calculate the total energy, one needs to visit all the atoms in the domain:

$$E_{tot} = \sum_{i=1}^{N} E_i(x_1, x_2, \ldots, x_N). \qquad (4)$$
Here E_i is the energy contribution from site x_i. The analytical form of E_i depends on the empirical potential model in use. In practice, the computation of E_i only involves neighboring atoms. In the region where the displacement field is smooth, keeping track of each individual atom is unnecessary. After
selecting some representative atoms (repatoms), the displacement of the rest of the atoms can be approximated via linear interpolation,

$$u_j = \sum_{\alpha=1}^{N_{rep}} S_\alpha(x_j^0)\, u_\alpha,$$
where the subscript α identifies the representative atoms, S_α is an appropriate weight function, and N_rep is the number of repatoms involved. This step reduces the number of degrees of freedom, but to compute the total energy, in principle we still need to visit every atom. To reduce the computational complexity involved in computing the total energy, several summation rules have been introduced. The simplest of these is to assume that the deformation gradient A = ∂x/∂x^0 is uniform within each element, namely that the Cauchy–Born rule holds. The strain energy in element k can then be approximately written as ε(A_k)|Ω_k| in terms of the strain energy density ε(A). With these approximations, the evaluation of the total energy is reduced to a summation over the finite elements,

$$E_{tot} \approx \sum_{k=1}^{N_e} \varepsilon(A_k)\, |\Omega_k| \qquad (5)$$
where N_e is the number of elements. This formulation is called the local version of QC. The advantage of local QC is the great reduction in the number of degrees of freedom, since N_rep ≪ N. In the presence of defects, the deformation tends to be non-smooth, and the approximation made in local QC becomes inaccurate. A nonlocal version of QC has therefore been developed, which computes the energy with the following ansatz:

$$E_{tot} \sim \sum_{\alpha=1}^{N_{rep}} n_\alpha E_\alpha(u_\alpha) \qquad (6)$$
Here the weight n_α is related to the atom density. The energy from each repatom, E_α, is computed by visiting its neighboring atoms, which are generated using the local deformation. Near defects such as cracks or dislocations, the finite element mesh is also refined to the atomic scale to reflect the local deformation more accurately. Practical implementations usually combine both the local and nonlocal versions of the method, and a criterion has been suggested to identify the local/nonlocal regions so that the whole procedure can be applied adaptively. Another version of QC, which is based on the force calculation, has been put forward in Ref. [3]. The method generates clusters around the repatoms and performs the force calculation using only the atoms within the clusters, see Fig. 1.
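To illustrate the bookkeeping behind the local QC summation (5), here is a self-contained 1D Python sketch, our own construction rather than the QC code of Refs. [1–3]: it evaluates the Cauchy–Born strain energy density from a hypothetical Lennard–Jones pair potential and sums it over the coarse elements.

```python
import numpy as np

def phi(r):
    # Hypothetical Lennard-Jones pair potential, for illustration only.
    return 4.0 * (r**-12 - r**-6)

def strain_energy_density(A, a0=1.12246, n_shells=3):
    # Cauchy-Born rule in 1D: under a uniform deformation gradient A, a bond
    # of reference length n*a0 stretches to n*a0*A; the energy per unit
    # reference length sums over a few neighbor shells.
    return sum(phi(n * a0 * A) for n in range(1, n_shells + 1)) / a0

def local_qc_energy(x0_nodes, x_nodes):
    # Local QC total energy, Eq. (5): sum over elements of eps(A_k) * |element|,
    # with A_k the piecewise-constant deformation gradient of element k.
    x0 = np.asarray(x0_nodes, float)
    x = np.asarray(x_nodes, float)
    A = np.diff(x) / np.diff(x0)
    return float(sum(strain_energy_density(Ak) * dx0
                     for Ak, dx0 in zip(A, np.diff(x0))))

x0 = np.linspace(0.0, 100.0, 11)         # coarse element nodes (reference)
print(local_qc_energy(x0, 1.01 * x0))    # energy under a 1% uniform stretch
```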
Figure 1. Schematic illustration of QC (courtesy of M. Ortiz). Only atoms in the small cluster need to be visited during the computation.
QC has been successfully applied to a number of problems∗, including dislocation structure, nanoindentation, crack propagation, deformation twinning, etc. The use of local QC to control the far-field region, and thus create a continuum environment for material defects, has become more and more popular. In its simplest form, QC ignores atomic vibrations and thus entropic effects. This restricts QC to static problems at zero temperature. Dynamics of defects can be studied in a quasistatic setting. Finite temperature can be incorporated perturbatively [2, 4].
2.2. MAAD – Macro Atomistic Ab initio Dynamics
MAAD (Macro Atomistic Ab initio Dynamics) was proposed in Refs. [5, 6] to simulate crack propagation in silicon. The computational domain is decomposed into three parts: a continuum region away from the crack tip where the
* For recent developments and source code, see http://www.qcmethod.com.
linear elasticity model is solved using a finite element method; an atomistic region near the crack tip, in which molecular dynamics,

$$m_j \ddot{x}_j = -\nabla_{x_j} V, \qquad j = 1, 2, \ldots, N_{atom}, \qquad (7)$$
with the Stillinger–Weber potential is used; and a quantum mechanical region at the crack tip, where the tight binding (TB) model is used to describe bond breaking. This is done by writing the Hamiltonian in the form
(8)
which represents the energy contributions from the different regions and the interfaces between them. For brevity we will explain the calculation of the first three terms. In the finite element (FE) region, the variables are the displacement field u, and the expression for the Hamiltonian is standard:

$$H_{FE} = \sum_{k=1}^{N_e} \frac{1}{2} \left( u_k^T K u_k + \dot{u}_k^T M \dot{u}_k \right) \qquad (9)$$
Here K and M are the stiffness and mass matrices. The stiffness matrix can be obtained from the harmonic approximation of the interatomic potential. In the case of finite (but constant) temperature, these parameters are adjusted accordingly to be consistent with the atomistic system in the MD region. The Hamiltonian in the MD region is simply the total energy:

$$H_{MD} = \frac{1}{2} \sum_{i=1}^{N_{atom}} m_i \dot{u}_i^2 + V \qquad (10)$$
where u_i is the displacement of the ith atom and V is the total potential energy in the MD region. A key ingredient in this procedure is a handshaking scheme at the continuum/MD (and MD/TB) interface. Specifically, near the continuum/MD interface the finite elements are refined all the way to the atomistic level so that their vertices coincide with the reference atomistic positions at the interface. The handshaking Hamiltonian H_{FE/MD} accounts for the interaction across the interface. The energy is computed from the continuum side and the MD side using the formulas (9) and (10), respectively, with half weight for each. The continuum region and the atomistic region are then evolved simultaneously in time. Energy transport across the interface has been ignored. By refining the finite element mesh to the atomistic scale at the interface, MAAD also avoids the issue of phonon reflection that we will discuss at the end of this article.
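The half-and-half weighting of H_{FE/MD} can be written down in a few lines. The Python sketch below is our own schematic, not the MAAD implementation of Refs. [5, 6]; the per-bond FE and MD energy routines are placeholders.

```python
def handshake_energy(interface_bonds, fe_bond_energy, md_bond_energy):
    # H_FE/MD: every bond crossing the FE/MD interface contributes half of
    # its continuum (FE) energy estimate and half of its atomistic (MD) one.
    return sum(0.5 * fe_bond_energy(b) + 0.5 * md_bond_energy(b)
               for b in interface_bonds)

# toy usage: bonds carry their stretch; harmonic FE estimate vs. a
# hypothetical anharmonic MD estimate
k_spring = 1.0
fe = lambda s: 0.5 * k_spring * s**2
md = lambda s: 0.5 * k_spring * s**2 - 0.1 * s**3
print(handshake_energy([0.10, 0.20, 0.15], fe, md))
```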
2.3. CGMD – Coarse-Grained Molecular Dynamics
Coarse-grained molecular dynamics is a systematic procedure for deriving the effective Hamiltonian for a set of coarse-grained variables from the microscopic Hamiltonian [7]. Starting from a microscopic Hamiltonian H_{MD} defined on the phase space and defining the coarse-grained variables by

$$u_\mu = \sum_j f_{j\mu}\, u_j, \qquad p_\mu = \sum_j f_{j\mu}\, p_j, \qquad (11)$$
where f j µ are appropriate weights, the effective Hamiltonian for the coarsegrained variables are obtained from 1 E(uµ , pµ ) = Z
HMD e− HMD , kB T dx j dp j
(12)
where

$$\Delta = \prod_\mu \delta\!\left( u_\mu - \sum_k f_{k\mu} u_k \right) \delta\!\left( p_\mu - \sum_k f_{k\mu} p_k \right),$$
and Z is a normalization constant and T is the temperature. Consistency with the coarse-grained variables is ensured through the presence of the delta functions, similarly to the imposition of the kinematic constraint in QC. Equation (12) plays the role of (3) at finite temperature, with Q defined via (11). The basic assumption in this formalism is that the small scale component is at equilibrium given the coarse-grained variables. Strictly speaking, this is only true if the relaxation times associated with the small scales are much shorter than those of the coarse-grained variables. In general the coarse-grained energy in (12) is still difficult to compute. It has been computed for the case of a harmonic potential in Ref. [7] and more generally in Ref. [8].
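As a concrete instance of the kinematics in (11), the Python sketch below (our own construction) builds hat-function weights f_{jµ} on a 1D chain, one common though not unique choice, and projects atomistic displacements and momenta onto the coarse nodes. Evaluating the constrained phase-space average (12) itself is the hard, generally separate step.

```python
import numpy as np

def hat_weights(n_atoms, node_index):
    # Hat-function (linear interpolation) weights f_{j,mu}: each column is
    # one coarse node's tent function over the atom indices 0..n_atoms-1.
    x = np.arange(n_atoms, dtype=float)
    f = np.zeros((n_atoms, len(node_index)))
    for mu, k in enumerate(node_index):
        xp, fp = [float(k)], [1.0]
        if mu > 0:
            xp.insert(0, float(node_index[mu - 1]))
            fp.insert(0, 0.0)
        if mu + 1 < len(node_index):
            xp.append(float(node_index[mu + 1]))
            fp.append(0.0)
        f[:, mu] = np.interp(x, xp, fp)
    return f

n, nodes = 101, [0, 25, 50, 75, 100]
f = hat_weights(n, nodes)
u = np.sin(2 * np.pi * np.arange(n) / n)        # atomistic displacements
p = np.random.default_rng(1).normal(size=n)     # atomistic momenta
u_cg = f.T @ u     # u_mu = sum_j f_{j,mu} u_j, as in (11)
p_cg = f.T @ p     # p_mu = sum_j f_{j,mu} p_j
```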
3. Dynamics-based Method
So far we have discussed energy-based methods. In these methods, the key is to obtain a multiscale representation of the total energy of the system. In QC, this is done via the representative atoms and the summation rule. In MAAD, this is done by handshaking the atomistic and continuum regions through a gradual matching of the grids. In CGMD, this is done by thermodynamically integrating out the contribution of the small scales. Hamilton's equation is applied to the reduced Hamiltonian in order to model dynamics.
An alternative approach is to model the dynamics directly. Equilibrium states are obtained as steady states of the dynamics. This is essential if energy transport is coupled with the dynamics. At the present time, this approach is much less developed than the energy-based approaches discussed earlier. So far the only general strategy seems to be that of Li and E [9], which is based on the framework of the heterogeneous multiscale method (HMM) developed by E and Engquist [10]. This will be discussed next. We will also discuss a related topic, namely how to impose matching conditions at the atomistic–continuum interface.
3.1. Heterogeneous Multiscale Method
In order to develop a general multiscale methodology that can handle both dynamics and finite temperature effects, Li and E [9] relied on the framework of the heterogeneous multiscale method (HMM), which has been used for designing multiscale methods for several different applications, including fluids.† There are two major components in HMM: the selection of a macroscale solver, and the estimation of the needed macroscale data using the microscale model. In general the macroscale solver should be chosen to maximize the efficiency in resolving the macroscale behavior of the system and to minimize the complexity of coupling with the microscale model. In the context of solids, our starting point for both the macroscale and microscale models is the universal conservation laws of mass, momentum and energy in Lagrangian coordinates:

$$\partial_t A - \nabla_{x_0} v = 0, \qquad \rho_0\, \partial_t v + \nabla_{x_0} \cdot \sigma = 0, \qquad \rho_0\, \partial_t e + \nabla_{x_0} \cdot j = 0. \qquad (13)$$
Here A, v, and e are the deformation gradient, velocity, and total energy per particle, respectively, and ρ_0 is the density. At the macroscale level, e.g., continuum mechanics, σ is the first Piola–Kirchhoff stress tensor and j is the energy flux. The first equation in (13) is merely a compatibility statement. The second and third equations express conservation of momentum and energy, respectively. After combining them with proper constitutive relations, these equations can be used to model nonlinear elasticity, thermoelasticity, and even plasticity. At the microscopic level, i.e., molecular dynamics, these conservation laws
† For other applications of HMM, visit http://www.math.Princeton.edu/multiscale.
continue to hold, with the stress and energy flux given in terms of the atomistic variables by

$$\tilde{\sigma}(x_0, t) = \frac{1}{2} \sum_{i \neq j} f\!\left( x_i(t) - x_j(t) \right) \otimes \left( x_i^0 - x_j^0 \right) \int_0^1 \delta\!\left( x_0 - \left( x_j^0 + \lambda (x_i^0 - x_j^0) \right) \right) d\lambda,$$

$$\tilde{j}(x_0, t) = \frac{1}{4} \sum_{i \neq j} \left( v_i(t) + v_j(t) \right) \cdot f\!\left( x_j - x_i \right) \left( x_i^0 - x_j^0 \right) \int_0^1 \delta\!\left( x_0 - \left( x_j^0 + \lambda (x_i^0 - x_j^0) \right) \right) d\lambda. \qquad (14)$$
Here, for simplicity, we have only provided these expressions for the case when the atomistic potential is a pair potential: $V = \frac{1}{2}\sum_{i \neq j} \phi(x_i(t) - x_j(t))$ and $f = -\nabla\phi$. It is well known that pair potentials are quite inadequate for modeling solids, but one can find the formulas for more general potentials in Ref. [9].
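A one-dimensional caricature of the stress formula in (14) may help: for a pair potential, the line integral of delta functions reduces to counting the bonds whose reference segments cross the sampling point. The Python sketch below is our own illustration with a hypothetical harmonic pair force; the sign conventions are not meant to match Ref. [9] exactly.

```python
import numpy as np

def pair_force(r):
    # Hypothetical harmonic pair force f = -phi'(r) with phi = 0.5*(r - 1)^2.
    return -(r - 1.0)

def stress_1d(X0, x0, x, cutoff=1.5):
    # 1D analogue of the first formula in (14): pairs (i, j) whose reference
    # segment [x0_j, x0_i] contains the sampling point X0 contribute
    # f(|x_i - x_j|); in 1D the delta-function integral is this crossing test.
    sigma = 0.0
    n = len(x0)
    for i in range(n):
        for j in range(i):
            if abs(x0[i] - x0[j]) > cutoff:
                continue
            lo, hi = sorted((x0[i], x0[j]))
            if lo < X0 < hi:
                sigma += pair_force(abs(x[i] - x[j]))
    return sigma

x0 = np.arange(10.0)
print(stress_1d(4.5, x0, 1.05 * x0))   # uniform 5% stretch of a chain
```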
3.1.1. Macroscale solver Since the macroscale model is a conservation law, the macroscale solver is a method for conservation laws. Although there are plenty of methods available for conservation laws, e.g., Ref. [11], many of them involve the computation of the Jacobian for the flux functions, and this dramatically increases the computational complexity in a coupled multiscale method when the continuum equation is not explicitly known. An exception is the central scheme of Lax–Friedrichs type, such as Ref. [12], which is formulated over a staggered-grid. As it turns out, this method can be easily coupled with molecular dynamic simulations.
We first write the conservation laws in the generic form,

$$u_t + f_x = 0, \qquad (15)$$
We will confine our discussion to one-dimensional continuum models, since the extension to higher dimensions is straightforward. A (macro) staggered grid is laid out as in Fig. 2. The first order central scheme represents the solutions by piecewise constants, which are the average values over each cell:

$$u_k^n = \frac{1}{\Delta x} \int_{x_{k-1/2}}^{x_{k+1/2}} u(x, t^n)\, dx.$$
Time integration over $[x_k, x_{k+1}] \times [t^n, t^{n+1}]$ leads to the following scheme,

$$u_{k+1/2}^{n+1} = \frac{u_k^n + u_{k+1}^n}{2} - \frac{\Delta t}{\Delta x} \left( f_{k+1}^n - f_k^n \right), \qquad (16)$$
Figure 2. A schematic illustration of the numerical procedure for one macro time step: starting from piecewise constant solutions {u_k^n}, one integrates (15) in time over the cell [x_k, x_{k+1}]. The time step Δt is chosen in such a way that the waves coming from x_{k+1/2} will not reach x_k, and thus for t ∈ [t^n, t^{n+1}), u(x_k, t) = u_k^n. To obtain the local flux, we perform an MD simulation using u_k^n as constraints. The needed flux is then extracted from the MD fluxes via time averaging.
where

$$f_k^n = \frac{1}{\Delta t} \int_{t^n}^{t^{n+1}} f(x_k, t)\, dt.$$
This is then approximated by numerical quadrature, such as the mid-point formula; a simple choice is f_k^n ≈ f(x_k, t^n). The stability of such a scheme, which usually manifests itself as a constraint on the size of Δt, can be appreciated by considering the adiabatic case f = f(u): if we choose the time step Δt small enough, the waves generated from the cell interfaces {x_{k+1/2}} will not arrive at the grid points {x_k}, and, therefore, the solution as well as the fluxes at the grid points will not change until the next time step. With this specific choice of the macro-solver, we can illustrate the HMM procedure schematically as in Fig. 2. At each macro time step, the scheme (16) requires as input the fluxes at the grid points x_k to complete the time integration. These flux values are obtained by performing local MD simulations that are consistent with the local macroscale state (A, v, e). Equation (13) is then integrated to the next time step using (16).
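For readers who want the macro-solver in executable form, here is a self-contained Python sketch (our own) of the staggered first order central scheme (16), using the Burgers flux f(u) = u²/2 as a stand-in for the flux that HMM would instead estimate from local MD simulations.

```python
import numpy as np

def staggered_central_step(u, dt, dx, flux):
    # One step of the staggered Lax-Friedrichs-type scheme (16) on a periodic
    # grid; the returned values live on the staggered points x_{k+1/2}.
    f = flux(u)
    return 0.5 * (u + np.roll(u, -1)) - dt / dx * (np.roll(f, -1) - f)

n = 200
dx = 1.0 / n
x = np.arange(n) * dx
u = np.sin(2.0 * np.pi * x)
flux = lambda w: 0.5 * w**2            # Burgers flux (stand-in for MD data)
dt = 0.4 * dx / np.abs(u).max()        # CFL-type time step restriction
for _ in range(100):
    u = staggered_central_step(u, dt, dx, flux)  # grid shifts dx/2 per step
print(u.min(), u.max())
```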
3.1.2. Reconstruction

Next we discuss how to set up the atomistic simulation to estimate the local fluxes. The first step is to reconstruct initial MD configurations that are consistent with the local macro state variables (A, v, e). The shape of the MD cell, and hence the new basis, is set up from the local deformation tensor. For example, if the undeformed cell has basis E, then the new basis is Ẽ = AE. Assuming the deformation is uniform within the cell, the new basis then determines the displacement of each atom. From the atomic positions we can compute the potential energy. After subtracting the potential energy and the kinetic energy associated with the mean velocity from the total energy e, we obtain the temperature by assuming that the remaining energy is due to thermal fluctuations. Using the mean velocity and temperature we initialize the velocities of the atoms from the Maxwell distribution.
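A minimal 1D sketch of this reconstruction step (our own construction, in simplified units): positions come from a uniform deformation of the reference lattice, and velocities are drawn from the Maxwell distribution at the temperature left over once the potential and mean kinetic energies are subtracted.

```python
import numpy as np

def reconstruct(x0, A, v_mean, e, potential_energy, mass=1.0, kB=1.0, seed=0):
    # Build an MD cell consistent with the macro state (A, v, e) in 1D;
    # e is the total energy per particle, as in (13).
    rng = np.random.default_rng(seed)
    x = A * np.asarray(x0, float)           # uniform deformation of the cell
    n = len(x)
    e_thermal = e - potential_energy(x) / n - 0.5 * mass * v_mean**2
    T = max(2.0 * e_thermal / kB, 0.0)      # 1D equipartition: <e_kin> = kB*T/2
    v = v_mean + rng.normal(0.0, np.sqrt(kB * T / mass), size=n)
    return x, v

# toy harmonic chain as the potential-energy routine
epot = lambda x: 0.5 * np.sum((np.diff(x) - 1.0) ** 2)
x, v = reconstruct(np.arange(64.0), A=1.02, v_mean=0.1, e=0.3,
                   potential_energy=epot)
```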
3.1.3. Boundary conditions

Of central importance is the boundary condition imposed on the microscopic system in order to guarantee consistency with the local macroscale variables. In the case when the system is homogeneous, the most convenient boundary condition is the periodic boundary condition. The MD cell is first
deformed according to the deformation gradient A. Then the cell is periodically extended to the whole space.
3.1.4. Estimating the data

The needed macroscale fluxes are estimated from the MD results by time averaging. To reduce the transient effects, we use a kernel that puts less weight on the transient period, e.g.,

$$\langle A \rangle_K = \lim_{t \to +\infty} \frac{1}{t} \int_0^t K\!\left(1 - \frac{s}{t}\right) A(s)\, ds, \qquad K(\theta) = 1 - \cos(2\pi\theta). \qquad (17)$$
Experience suggests that using this kernel substantially improves the quality of the data compared with straightforward averaging.
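In code, the kernel average (17) over a finite sampled flux history is just a weighted mean. The Python sketch below (our own) shows that the weight K(1 − s/t) vanishes both at the start of the window, suppressing the transient, and at the current time, and compares the result with the plain average.

```python
import numpy as np

def kernel_average(samples):
    # Discretization of (17) on a finite window: weight K(1 - s/t) with
    # K(theta) = 1 - cos(2*pi*theta), which vanishes at s = 0 and s = t.
    a = np.asarray(samples, float)
    s = (np.arange(a.size) + 0.5) / a.size          # s/t in (0, 1)
    w = 1.0 - np.cos(2.0 * np.pi * (1.0 - s))
    return float(np.sum(w * a) / np.sum(w))

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 2000)
signal = 1.0 + np.exp(-10.0 * t) + 0.05 * rng.normal(size=t.size)
print(kernel_average(signal), signal.mean())  # kernel estimate is closer to 1
```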
3.1.5. Dealing with defects

In the presence of defects, QC and MAAD refine the grid to the atomic level to account for the defect energy. This procedure is seamless but can become rather complicated when simulating dynamics. HMM instead suggests keeping the macro-grid (which might be locally refined) in the entire computational domain but performing a model refinement locally near the defects. Away from the defects, the fluxes are computed using the procedure described before, or, if an empirical model is accurate enough, one can simply compute the fluxes using the empirical model. Near the defects there are two cases to consider, depending on whether there is scale separation between the local relaxation time around the defects and the time scale for the dynamics of the defects. In the absence of such a time scale separation, the molecular dynamics simulation around the defects has to be kept for all times. This imposes a limitation on the time scales that can be accessed using such a procedure. But if the atomistic relaxation times can be very long, there is really little one can do other than follow the history of the atomistic features near the defects. Macro-scale fluxes can still be computed from the micro-scale fluxes via time averaging. In this case, since the atomistic region near the defect is necessarily macroscopically inhomogeneous, the atomistic boundary conditions need to be modified. Li and E [9] propose using a biased Andersen thermostat in a border region that takes into account both the local mean velocity and the local temperature. Finally, the overall deformation is controlled by fixing the outermost atoms. In the case when there is time scale separation, this procedure can be much simplified. In this case one can build the defect dynamics directly into the macro-solver, and the atomistic simulations can be localized in space and time
to predict the velocity of the defects and stress near the defects. Such a defect tracking procedure is implemented for twin boundary dynamics in Ref. [9].
3.1.6. Atomistic–continuum interface condition

One issue that has received a great deal of attention is the matching condition at the atomistic–continuum interface. In a coupled MD-continuum calculation, the MD region is meant to be very small, but it is inevitably at finite temperature. The phonons generated in the MD region need to be propagated out in order to keep the fluctuations in the MD region under control. This is achieved by imposing appropriate boundary conditions at the atomistic–continuum interface that limit phonon reflection. The first attempt at deriving such boundary conditions is found in Ref. [12]. Cai et al. suggested obtaining the exact linear response functions at the interface by precomputing. This strategy is in principle exact under the harmonic approximation, but it is often too expensive, since the linear response functions (which are simply Green's functions) are quite nonlocal. When the MD region changes as a result of defect dynamics, these functions have to be computed again. Further work along this line was done later by Wagner et al., Ref. [13]. To achieve an optimal balance between efficiency and accuracy, a local method was formulated by E and Huang [14, 15] with the idea of minimizing phonon reflection for a pre-determined stencil of the boundary condition. To explain the optimal local matching conditions, we consider the one-dimensional case where the continuum model is the simple wave equation,

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2},$$

and its discrete form,

$$\frac{u_j^{n+1} - 2u_j^n + u_j^{n-1}}{\Delta t^2} = u_{j+1}^n - 2u_j^n + u_{j-1}^n, \qquad j \geq 1. \qquad (18)$$

These equations can be obtained by linearizing (7). For simplicity we consider the case when the atomistic region is the semi-infinite domain defined by x > 0 and j = 0 is the boundary. To prescribe the boundary condition, we express u_0^n as

$$u_0^n = \sum_{k, j \geq 0} a_{k,j}\, u_j^{n-k}, \qquad a_{0,0} = 0.$$
We start with a pre-determined set S of indices {k, j} outside of which we set a_{k,j} = 0. The set S is the stencil that we choose. Choosing the right S is a crucial step in this procedure. A large S will lead to an increase in the complexity of
the algorithm. But a small S may not be enough for the purpose of suppressing phonon reflection. Once S is selected, the {a_{k,j}} are chosen to minimize the total reflection in an appropriate norm. The reflection coefficient, or more generally the reflection matrix, can be obtained by looking for solutions of the form $u_j^n = e^{i(n\omega \Delta t + j\xi)} + R(\xi)\, e^{i(n\omega \Delta t - j\xi)}$. Using (18), we obtain

$$R(\xi) = -\frac{\sum_{k,j} a_{k,j}\, e^{i(j\xi - k\omega \Delta t)} - 1}{\sum_{k,j} a_{k,j}\, e^{-i(j\xi - k\omega \Delta t)} - 1}, \qquad (19)$$
where ω = ω(ξ) is the dispersion relation satisfying

$$\frac{1}{\Delta t} \sin\frac{\omega \Delta t}{2} = \sin\frac{\xi}{2}.$$

A similar calculation can be done for general crystal structures, in which case the phonon spectrum may consist of several branches. Having R(ξ), the a_{k,j} can be obtained by minimizing the total phonon reflection,

$$\min \int_0^\pi W(\xi)\, |R(\xi)|^2\, d\xi,$$

with an appropriately chosen weight function W. In addition, constraints are needed at ξ = 0, in the form of R(0) = 0, R′(0) = 0, ..., to ensure accuracy at large scales. As an example, if one uses only the terms a_{1,0} and a_{1,1}, with W = 1 and R(0) = 0 at the boundary, one has

$$u_0^n = (1 - \Delta t)\, u_0^{n-1} + \Delta t\, u_1^{n-1}. \qquad (20)$$
If instead one keeps the terms {a_{j,k}, j ≤ 3, k ≤ 2}, the minimization leads to the following coefficients:

$$(a_{j,k}) = \begin{pmatrix} 1.95264 & -0.074207 & -0.014903 \\ -0.95406 & 0.074904 & 0.015621 \end{pmatrix}.$$
In order to get better performance at high wave numbers, more coefficients (a larger S) have to be included. The method has been applied to dislocation dynamics in the Frenkel–Kontorova model and to friction between rough crystal surfaces. It has shown promise in suppressing phonon reflection.
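The simplest condition (20) can be tested directly. The Python sketch below is our own experiment, not code from Refs. [14, 15]: it sends a left-moving Gaussian pulse at the boundary j = 0 of the discrete wave equation (18) and applies (20) there. The residual amplitude after the pulse exits is much smaller than the unit incident pulse, though not zero, since (20) is only the lowest order member of the family.

```python
import numpy as np

def wave_with_matching_bc(nsteps=400, n=200, dt=0.5):
    # Leapfrog integration of (18) with the matching condition (20) at j = 0
    # and a fixed far end; the initial data is a left-moving Gaussian pulse.
    j = np.arange(n, dtype=float)
    u = np.exp(-0.01 * (j - 60.0) ** 2)
    u_prev = np.exp(-0.01 * (j - dt - 60.0) ** 2)   # same pulse, dt earlier
    for _ in range(nsteps):
        u_next = np.empty_like(u)
        u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                        + dt**2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
        u_next[-1] = 0.0                             # rigid far boundary
        u_next[0] = (1.0 - dt) * u[0] + dt * u[1]    # condition (20)
        u_prev, u = u, u_next
    return u

print(np.abs(wave_with_matching_bc()).max())  # small reflected amplitude
```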
4. Summary
We have based our presentation on dividing multiscale methods into energy-based and dynamics-based methods. From the viewpoint of coarse-graining, there are also two different sets of ideas. The first set of ideas, used in QC, CGMD and HMM, is to pre-define a set of coarse-grained variables. By expressing the microscopic model in terms of the coarse-grained variables, one finds a relationship that expresses the macroscale data in terms of the microscopic quantities. In QC, this relationship is (3). In CGMD, this relationship is (12). In HMM, this relationship is (14). This relationship is the starting point of the micro-macro coupling. The second set of ideas, used in MAAD and in E and Huang [14], is to divide the computational domain into macro and micro regions. Separate models are used in the different regions, and an explicit matching is used to bridge the two regions. Most existing work on multiscale modeling of solids deals with single crystals with isolated defects. Going beyond single crystals requires substantial work. Dealing with polycrystals with grain boundaries, and with plasticity involving many interacting dislocations, seems to require new ideas in coupling.
References
[1] E.B. Tadmor, M. Ortiz, and R. Phillips, "Quasicontinuum analysis of defects in crystals," Phil. Mag. A, 73, 1529, 1996.
[2] R.E. Miller and E.B. Tadmor, "The quasicontinuum method: overview, applications and current directions," J. Comput.-Aided Mater. Des., in press, 2003.
[3] J. Knap and M. Ortiz, "An analysis of the quasicontinuum method," J. Mech. Phys. Solids, 49, 1899, 2001.
[4] V. Shenoy and R. Phillips, "Finite temperature quasicontinuum methods," Mat. Res. Soc. Symp. Proc., 538, 465, 1999.
[5] F.F. Abraham, J.Q. Broughton, N. Bernstein, and E. Kaxiras, "Spanning the continuum to quantum length scales in a dynamic simulation of brittle fracture," Europhys. Lett., 44(6), 783, 1998.
[6] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, "Concurrent coupling of length scales: methodology and application," Phys. Rev. B, 60(4), 2391, 1999.
[7] R.E. Rudd and J.Q. Broughton, "Coarse-grained molecular dynamics and the atomic limit of finite elements," Phys. Rev. B, 58(10), R5893, 1998.
[8] R.E. Rudd and J.Q. Broughton, unpublished, 2000.
[9] X.T. Li and W. E, "Multiscale modeling of solids," preprint, 2003.
[10] W. E and B. Engquist, "The heterogeneous multi-scale methods," Comm. Math. Sci., 1(1), 87, 2002.
[11] E. Godlewski and P.A. Raviart, Numerical Approximation of Hyperbolic Systems of Conservation Laws, Springer-Verlag, New York, 1996.
[12] H. Nessyahu and E. Tadmor, "Nonoscillatory central differencing for hyperbolic conservation laws," J. Comp. Phys., 87(2), 408, 1990.
[13] G.J. Wagner, E.G. Karpov, and W.K. Liu, Molecular Dynamics Boundary Conditions for Regular Crystal Lattices, preprint, 2003.
[14] W. E and Z. Huang, "Matching conditions in atomistic-continuum modeling of materials," Phys. Rev. Lett., 87(13), 135501, 2001.
[15] W. E and Z. Huang, "A dynamic atomistic-continuum method for the simulation of crystalline materials," J. Comp. Phys., 182, 234, 2002.
4.14 MULTISCALE COMPUTATION OF FLUID FLOW IN HETEROGENEOUS MEDIA
Thomas Y. Hou
California Institute of Technology, Pasadena, CA, USA
There are many interesting physical problems that have multiscale solutions. These problems range from composite materials to wave propagation in random media, flow and transport through heterogeneous porous media, and turbulent flow. Computing these multiple scale solutions accurately presents a major challenge due to the wide range of scales in the solution. It is very expensive to resolve all the small scale features on a fine grid by direct numerical simulation. A natural question is whether it is possible to develop a multiscale computational method that captures the effect of the small scales on the large scales using a coarse grid, but does not require resolving all the small scale features. Such a multiscale method can offer significant computational savings. We use immiscible two-phase flow in heterogeneous porous media and incompressible flow as examples to illustrate some key issues in designing multiscale computational methods for fluid flows. Two-phase flows have many applications in oil reservoir simulation and environmental science problems. Through the use of sophisticated geological and geostatistical modeling tools, engineers and geologists can now generate highly detailed, three-dimensional representations of reservoir properties. Such models can be particularly important for reservoir management, as fine scale details in formation properties, such as thin, high permeability layers or thin shale barriers, can dominate reservoir behavior. The direct use of these highly resolved models for reservoir simulation is not generally feasible, because their fine level of detail (tens of millions of grid blocks) places prohibitive demands on computational resources. Therefore, the ability to coarsen these highly resolved geologic models to levels of detail appropriate for reservoir simulation (tens of thousands of grid blocks), while maintaining the integrity of the model for the purpose of flow simulation (i.e., avoiding the loss of important details), is clearly needed.
In recent years, we have introduced a multiscale finite element method (MsFEM) for solving partial differential equations with multiscale solutions [1–4]. This method has been demonstrated to be effective in upscaling two-phase flows in heterogeneous porous media. The main idea of this approach is to construct local multiscale finite element base functions that capture the small scale information within each element. The small scale information is then brought to the large scales through the coupling of the global stiffness matrix. Thus, the effect of the small scales on the large scales is captured correctly. In our method, the base functions are constructed by solving the governing equation locally within each coarse grid element. The local construction of the multiscale base functions offers several computational advantages, such as parallel computing and local adaptivity in computing the base functions. These advantages can be exploited in upscaling a fine grid model. One of the central issues in many multiscale methods is how to localize the subgrid small scale problems. In the context of the multiscale finite element method, it is the question of how to design proper microscopic boundary conditions for the local base functions. A naive choice of microscopic boundary conditions can lead to large errors. The nature of the numerical errors due to improperly chosen local boundary conditions depends on the type of the governing equation for the underlying physical problem. For elliptic or diffusion dominated problems, the effect of the numerical boundary layers is strongly localized. For convection dominated transport, the errors caused by improper microscopic boundary conditions can propagate over long distances and pollute the large scale physical solution. Below we will discuss multiscale methods for these two types of problems in some detail.
1. Formulation and Background
The flow and transport problems in porous media are considered in a hierarchical level of approximation. At the microscale, the solute transport is governed by the convection–diffusion equation in a homogeneous fluid. However, for porous media, it is very difficult to obtain full information about the pore structure. A certain averaging procedure has to be carried out, and the porous medium becomes a continuum with certain macroscopic properties, such as porosity and permeability. With modern geostatistical techniques, one can routinely generate a fine grid model as large as tens of millions of grid blocks. As a first step, one has to upscale the fine grid model to a coarse grid model consisting of tens of thousands of coarse grid blocks that still preserves the integrity of the original fine grid model. Once the coarse grid model is obtained, it can be used many times with different boundary conditions or source distributions for the purpose of model validation and oil field management. This could reduce the computational cost significantly.
We consider a heterogeneous system which represents two-phase immiscible flow. Our interest is in the effect of permeability heterogeneity on two-phase flow. Therefore, we neglect the effects of compressibility and capillary pressure, and consider the porosity to be constant. This system can be described by writing Darcy's law for each phase (all quantities are dimensionless):

$$v_j = -\frac{k_{rj}(S)}{\mu_j}\, K \nabla p, \qquad (1)$$
where vj are Darcy’s velocity for the phase j (j = o, w; oil, water), p is pressure, S is water saturation, K is the permeability tensor, krj is the relative permeabilities of each phase and µj is the viscosity of the phase j. Darcy’s law for each phase coupled with mass conservation, can be manipulated to give the pressure and saturation equations ∇ · (λ(S)K ∇ p) = 0, ∂S + u · ∇ f (S) = 0, ∂t
(2) (3)
which can be solved subject to some appropriate initial and boundary conditions. The parameters in the above equations are given by krw (S) kro (S) + , µw µo krw (S)/µw , f (S) = krw (S)/µw + kro /µo u = vw + vo = −λ(S)K ∇ p. λ=
(4) (5) (6)
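A small Python sketch (our own, assuming hypothetical quadratic relative permeabilities k_rw = S² and k_ro = (1 − S)², a common model choice not specified in the text) evaluates the total mobility (4) and the fractional flow (5):

```python
import numpy as np

def mobility_and_fractional_flow(S, mu_w=1.0, mu_o=5.0):
    # Total mobility lambda(S), Eq. (4), and fractional flow f(S), Eq. (5),
    # with quadratic relative permeabilities (an illustrative assumption).
    krw = S**2
    kro = (1.0 - S) ** 2
    lam = krw / mu_w + kro / mu_o
    return lam, (krw / mu_w) / lam

S = np.linspace(0.0, 1.0, 6)
lam, f = mobility_and_fractional_flow(S)
print(f)   # f rises monotonically from 0 (pure oil) to 1 (pure water)
```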
Typically, the permeability tensor K in an oil reservoir model contains many scales, or a continuous spectrum of scales, that are not separable. The variation in the permeability tensor is also very large, with the ratio between the maximum and minimum permeability being as large as 10⁶. This means that the flow velocity can be very large near certain fast flow channels. To avoid the time-stepping restriction associated with an explicit method, a fully implicit time discretization is usually employed for the saturation equation. Moreover, the geometry of the computational domain is quite complicated. All these complications make it difficult to apply standard fast iterative methods, such as the multigrid method, to solve the large scale elliptic equation for the pressure. In fact, solving the elliptic problem seems to consume most of the computational time in practice. Thus developing an efficient multiscale adaptive method for solving the elliptic problem becomes essential in oil reservoir simulations.
2. Multiscale Finite Element Method
We first focus on developing an effective multiscale finite element method for solving the elliptic (pressure) equation with highly oscillating coefficients. We consider the following elliptic problem:

$$L^\varepsilon u := -\nabla \cdot (a^\varepsilon(x) \nabla u) = f \ \text{in}\ \Omega, \qquad u = 0 \ \text{on}\ \partial\Omega, \qquad (7)$$
where a^ε(x) = (a_{ij}^ε(x)) is a positive definite matrix, Ω is the physical domain, and ∂Ω denotes the boundary of the domain Ω. This model equation represents a common difficulty shared by several physical problems. For flow in porous media, it is the pressure equation through Darcy's law, and the coefficient a^ε represents the permeability tensor. For composite materials, it is the steady heat conduction equation, and the coefficient a^ε represents the thermal conductivity. The variational problem of (7) is to seek u ∈ H_0^1(Ω) such that

$$a(u, v) = f(v), \quad \forall v \in H_0^1(\Omega), \qquad (8)$$

where

$$a(u, v) = \int_\Omega a_{ij}^\varepsilon\, \frac{\partial u}{\partial x_j} \frac{\partial v}{\partial x_i}\, dx \quad \text{and} \quad f(v) = \int_\Omega f v\, dx.$$
We have used the Einstein summation convention in the above formula. The Sobolev space H_0^1(Ω) consists of all functions whose mth derivatives (m = 0, 1) are L² integrable over Ω and which vanish at the boundary of Ω. A finite element method is obtained by restricting the weak formulation (8) to a finite dimensional subspace of H_0^1(Ω). For 0 < h ≤ 1, let K_h be a partition of Ω by a collection of triangular elements K with diameter ≤ h. In each element K ∈ K_h, we define a set of nodal basis functions {φ_K^i, i = 1, ..., d}, with d being the number of nodes of the element. The subscript K will be dropped when the bases of a single element are considered. In our multiscale finite element method, the base function φ^i is constructed by solving the homogeneous equation over each coarse grid element:

$$L^\varepsilon \phi^i = 0 \quad \text{in}\ K \in \mathcal{K}_h. \qquad (9)$$
Let x_j (j = 1, ..., d) be the nodal points of K. As usual, we require φ^i(x_j) = δ_ij, where δ_ij = 1 if i = j and δ_ij = 0 for i ≠ j. One needs to specify the boundary condition of φ^i to make (9) a well-posed problem. The simplest choice of boundary condition for φ^i is the linear boundary condition. For now, we assume that the base functions are continuous across the boundaries of the elements, so that the finite element solution space V^h, which is spanned by the multiscale bases φ_K^i, is a subspace of H_0^1(Ω), i.e.,
$$V^h = \mathrm{span}\left\{ \phi_K^i : i = 1, \ldots, d;\ K \in \mathcal{K}_h \right\} \subset H_0^1(\Omega).$$
Except for special cases, when the coefficient a_{ij}^ε has periodic structure or is separable in the space variables, we in general need to compute the multiscale bases numerically using a subgrid mesh. The multiscale finite element method is to find the approximate solution of (8) in V^h, i.e., to find u^h ∈ V^h such that

$$a(u^h, v) = f(v), \quad \forall v \in V^h. \qquad (10)$$
In the case when a^ε(x) = a(x, x/ε), with a(x, y) being periodic in y, we have proved that the multiscale finite element method gives a convergence result uniform in ε as ε tends to zero [2]. Moreover, the rate of convergence in the energy norm is of the form O(h + ε + (ε/h)^{1/2}). We remark that the idea of using base functions governed by the differential equation has been used before in the finite element community; see, e.g., [5]. The multiscale finite element method presented here is also similar in spirit to the residual-free bubble finite element method [6] and the variational multiscale method [7].
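To make the construction concrete in the simplest setting, here is a self-contained 1D Python sketch, our own simplification rather than the code behind Refs. [1–4]: in one dimension the multiscale basis solving (a φ′)′ = 0 with nodal values 1 and 0 has φ′ ∝ 1/a, so each coarse element contributes the harmonic mean of a to the coarse stiffness matrix, computed here on a fine subgrid.

```python
import numpy as np

def msfem_stiffness_1d(a, coarse_nodes, n_fine=128):
    # Coarse stiffness matrix for -(a(x) u')' = f using multiscale bases.
    # Per element, a(phi_i', phi_j') reduces to +/- (int_e 1/a dx)^(-1).
    N = len(coarse_nodes) - 1
    K = np.zeros((N + 1, N + 1))
    for k in range(N):
        xs = np.linspace(coarse_nodes[k], coarse_nodes[k + 1], n_fine + 1)
        xm = 0.5 * (xs[:-1] + xs[1:])              # fine-cell midpoints
        s = 1.0 / np.sum(np.diff(xs) / a(xm))      # harmonic-mean stiffness
        K[k:k + 2, k:k + 2] += s * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return K

eps = 0.01
a = lambda x: 2.0 + np.sin(2.0 * np.pi * x / eps)  # oscillatory coefficient
K = msfem_stiffness_1d(a, np.linspace(0.0, 1.0, 9))
print(K[1, 1])
```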
3. The Over-sampling Technique
The choice of boundary conditions in defining the multiscale bases plays a crucial role in approximating the multiscale solution. Intuitively, the boundary condition for the multiscale base function should reflect the multiscale oscillation of the solution u across the boundary of the coarse grid element. To gain insight, we first consider the special case of periodic microstructures, i.e., a^ε(x) = a(x, x/ε), with a(x, y) being periodic in y. Using standard homogenization theory [8], we can perform a multiscale expansion of the base function φ^ε as follows (y = x/ε):

$$\phi^\varepsilon = \phi_0(x) + \varepsilon \phi_1(x, y) + \varepsilon \theta^\varepsilon(x) + O(\varepsilon^2),$$

where φ_0 is the effective solution and φ_1 is the first order corrector. The boundary corrector θ^ε is chosen so that the boundary condition of φ^ε on ∂K is exactly satisfied by the first three terms in the expansion. By solving a periodic cell problem for χ^j,

$$\nabla_y \cdot \left( a(x, y) \nabla_y \chi^j \right) = \frac{\partial}{\partial y_i}\, a_{ij}(x, y), \qquad (11)$$

with zero mean, we can express the first order corrector φ_1 as follows: φ_1(x, y) = −χ^j ∂φ_0/∂x_j. The boundary corrector θ^ε then satisfies

$$\nabla_x \cdot \left( a(x, x/\varepsilon) \nabla_x \theta^\varepsilon \right) = 0 \quad \text{in}\ K,$$

with boundary condition

$$\theta^\varepsilon|_{\partial K} = \phi_1(x, x/\varepsilon)|_{\partial K}.$$
The oscillatory boundary condition of θ^ε induces a numerical boundary layer, which leads to the so-called resonance error [1]. To avoid this resonance error, we need to incorporate the multidimensional oscillatory information, through the cell problem, into our boundary condition for φ^ε. If we set φ^ε|_∂K = (φ_0 + εφ_1(x, x/ε))|_∂K, then the boundary condition for θ^ε|_∂K becomes identically zero, and therefore θ^ε ≡ 0. In this case, we have an analytic expression for the multiscale base functions:

$$\phi^\varepsilon = \phi_0(x) + \varepsilon \phi_1(x, x/\varepsilon), \qquad (12)$$

where φ_1(x, y) = −χ^j(x, y) ∂φ_0/∂x_j, χ^j is the solution of the cell problem (11), and φ_0 can be chosen as the standard linear finite element base. This set of multiscale bases avoids the boundary layer effect completely. The analytic form of the multiscale base functions also gives a more efficient way to construct them. Numerical experiments by Andrew Westhead demonstrate a clear first order convergence of this method without suffering from the resonance error. For more details, see www.ama.caltech.edu/∼westhead/MSFEM. However, for problems that do not have scale separation and periodic microstructure, we cannot use this approach to compute the multiscale base functions in general. Motivated by our convergence analysis, we propose an over-sampling method to overcome the difficulty due to scale resonance [1]. The idea is quite simple and easy to implement. Since the boundary layer in the first order corrector is thin, O(ε), we can first construct intermediate sample bases in a domain with size larger than h + ε. Here, h is the coarse grid mesh size and ε is the small scale in the solution. From these intermediate sample bases, we can construct the multiscale bases over the computational element, using only the interior information of the sample bases restricted to the computational element. Specifically, let ψ^j be the base functions satisfying the homogeneous elliptic equation in the larger sample domain S ⊃ K. We then form the actual base φ^i by a linear combination of the ψ^j,

$$\phi^i = \sum_{j=1}^{d} c_{ij}\, \psi^j.$$
The coefficients ci j are determined by condition φ i (x j ) = δi j . The corresponding θ ε for φ i are now free of boundary layers. By doing this, we can reduce the influence of the boundary layer in the larger sample domain on the base functions significantly. As a consequence, we obtain an improved rate of convergence [1, 3].
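The over-sampling step itself is a small linear algebra exercise once the intermediate bases are available. A minimal sketch, assuming the sample bases $\psi^j$ have already been computed on the larger domain $S$ (e.g., with a routine like the one above applied to $S$) and sampled at the nodes and subgrid points of $K$; the array layouts are illustrative assumptions:

    # Form the over-sampled bases phi_i = sum_j c_ij psi_j on the target
    # element K; c_ij is fixed by the nodal conditions phi_i(x_j) = delta_ij.
    import numpy as np

    def oversampled_bases(psi_at_nodes, psi_on_K):
        # psi_at_nodes: (d, d), entry [j, k] = psi_k evaluated at node x_j of K
        # psi_on_K:     (d, m), row k = psi_k restricted to the m subgrid points of K
        d = psi_at_nodes.shape[0]
        # Solve sum_k c[i, k] psi_k(x_j) = delta_ij, i.e., C P^T = I.
        c = np.linalg.solve(psi_at_nodes.T, np.eye(d))
        return c @ psi_on_K        # row i = phi_i on the subgrid of K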
4. Convergence and Accuracy
To assess the accuracy of our multiscale method, we compare MsFEM with a traditional linear finite element method (LFEM for short) using a subgrid mesh, $h_s = h/M$.
The multiscale bases are computed using the same subgrid mesh. Note that MsFEM only captures the solution at the coarse grid scale $h$, while LFEM tries to resolve the solution at the fine grid scale $h_s$. Our extensive numerical experiments demonstrate that the accuracy of MsFEM on the coarse grid $h$ is comparable to that of the corresponding well-resolved LFEM calculation at the same coarse grid. In some cases, MsFEM gives even more accurate results than LFEM. First, we demonstrate the convergence in the case when the coefficient has scale separation and periodic structure. In Table 1, we present the result for

$a(x/\epsilon) = \frac{2 + P \sin(2\pi x_1/\epsilon)}{2 + P \cos(2\pi x_2/\epsilon)} + \frac{2 + \sin(2\pi x_2/\epsilon)}{2 + P \sin(2\pi x_1/\epsilon)} \quad (P = 1.8),$    (13)

$f(x) = -1 \quad \text{and} \quad u|_{\partial \Omega} = 0,$    (14)
where $\Omega = [0, 1] \times [0, 1]$. We denote by $N$ the number of coarse grid points along each dimension, i.e., $N = 1/h$. The convergence of three different methods is compared for fixed $\epsilon/h = 0.64$, where "L" indicates that the linear boundary condition is imposed on the multiscale base functions, "os" indicates the use of over-sampling, and LFEM stands for linear FEM. We see clearly the scale resonance in the results of MsFEM-L and the (almost) first-order convergence (i.e., no resonance) in MsFEM-os-L. Moreover, the errors of MsFEM-os-L are smaller than those of LFEM obtained on the fine grid. Next, we illustrate the convergence of the multiscale finite element method when the coefficient is random and has neither scale separation nor periodic structure. In Fig. 1, we show the results for a log-normally distributed $a_\epsilon$. In this case, the effect of scale resonance shows clearly for MsFEM-L, i.e., the error increases as $h$ approaches $\epsilon$. Here $\epsilon \sim 0.004$ roughly equals the correlation length. Even the use of an oscillatory boundary condition (MsFEM-O), which is obtained by solving a reduced 1D problem along the edges of the element, does not help much in this case. On the other hand, MsFEM with over-sampling agrees very well with the well-resolved calculation. We have also applied the multiscale finite element method to study wave propagation in random media and singularly perturbed convection-dominated diffusion problems. For more details, see Refs. [9, 10].

Table 1. Convergence for the periodic case

                   MsFEM-L             MsFEM-os-L          LFEM
  N      ε         ||E||_l2   Rate     ||E||_l2   Rate     MN      ||E||_l2
  16     0.04      3.54e-4    --       7.78e-5    --       256     1.34e-4
  32     0.02      3.90e-4    -0.14    3.83e-5    1.02     512     1.34e-4
  64     0.01      4.04e-4    -0.05    1.97e-5    0.96     1024    1.34e-4
  128    0.005     4.10e-4    -0.02    1.03e-5    0.94     2048    1.34e-4
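For reference, the coefficient (13) and the observed convergence orders in Table 1 are easy to reproduce; the following is a hypothetical helper, with the MsFEM-os-L errors copied from the table rather than recomputed:

    # Periodic test coefficient (13) and observed convergence rates.
    import numpy as np

    def a13(x1, x2, eps, P=1.8):
        return ((2 + P*np.sin(2*np.pi*x1/eps)) / (2 + P*np.cos(2*np.pi*x2/eps))
                + (2 + np.sin(2*np.pi*x2/eps)) / (2 + P*np.sin(2*np.pi*x1/eps)))

    def rates(N, err):
        # Observed order between successive rows: log(e_{k-1}/e_k)/log(N_k/N_{k-1});
        # note that eps is refined together with h here, so eps/h stays fixed.
        N, err = np.asarray(N, float), np.asarray(err, float)
        return np.log(err[:-1]/err[1:]) / np.log(N[1:]/N[:-1])

    print(rates([16, 32, 64, 128], [7.78e-5, 3.83e-5, 1.97e-5, 1.03e-5]))
    # -> approximately [1.02, 0.96, 0.94], the MsFEM-os-L rates in Table 1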
Figure 1. The $l^2$-norm error versus $N$ of the solutions using various schemes (LFEM, MsFEM-O, MsFEM-L, MsFEM-os-L) for a log-normally distributed permeability field.
5. Recovery of Small Scale Solution from Coarse Grid Solution
To solve the transport equation in the two-phase flow problem, we need to compute the velocity field from the elliptic equation for the pressure, i.e., $u = -\lambda(S) K \nabla p$. In some applications involving isotropic media, the cell-averaged velocity is sufficient, as shown by some computations using local upscaling methods [11]. However, for anisotropic media, especially layered ones (Fig. 2), the velocity in some thin channels can be much higher than the cell average, and these channels often have dominant effects on the transport solutions. In this case, the information about the fine scale velocity becomes vitally important. Therefore, an important question for all upscaling methods is how to take these fast-flow channels into account. For MsFEM, the fine scale velocity can be easily recovered from the multiscale base functions, which provide interpolations from the coarse $h$-grid to the fine $h_s$-grid. To illustrate that we can recover the fine grid velocity field from the coarse grid pressure calculation, we use the layered medium plotted in Fig. 2. We compare the horizontal velocity fields obtained by two methods. In Fig. 3a, we plot the horizontal velocity field obtained by a fine grid ($N = 1024$) calculation. In Fig. 3b, we plot the same horizontal velocity field obtained from the coarse grid ($N = 64$) pressure calculation, using the multiscale finite element bases to interpolate the fine grid velocity field. We can see that the recovered velocity field captures very well the layered structure in the fine grid velocity field.
Figure 2. A random porosity field with layered structure.

Figure 3. (a) Fine grid horizontal velocity field, N = 1024. (b) Recovered horizontal velocity field from the coarse grid calculation (N = 64) using multiscale bases.
Further, we use the recovered fine grid velocity field to compute the saturation in time. In Fig. 4a, we plot the saturation at $t = 0.06$ obtained by the fine grid calculation. Figure 4b shows the corresponding saturation obtained using the recovered velocity field from the coarse grid calculation. Most of the detailed fine scale fingering structures in the well-resolved saturation are captured very well by the corresponding calculation using the recovered velocity field from the coarse grid pressure calculation. The agreement is quite striking. We also check the fractional flow curves obtained by the two calculations. The fractional flow of the red fluid at the right boundary, defined as $F = \int S u_1 \, dy \big/ \int u_1 \, dy$ ($S$ being the saturation and $u_1$ the horizontal velocity component), is shown in Fig. 5.
Figure 4. (a) Fine grid saturation at t = 0.06, N = 1024. (b) Saturation computed using the recovered velocity field from the coarse grid calculation (N = 64) using multiscale bases.
Figure 5. Variation of fractional flow with time. DNS: well-resolved direct numerical solution using LFEM (N = 512). MsFEM: over-sampling is used (N = 64, M = 8).
The top pair of curves are the solutions of the transport problem using the cell-averaged velocity obtained from a well-resolved solution and from MsFEM; the bottom pair are the solutions using the well-resolved fine scale velocity and the recovered fine scale velocity from the MsFEM calculation. Two conclusions can be drawn from the comparison. First, the
cell-averaged velocity may lead to a large error in the solution of the transport equation. Second, both the recovered fine scale velocity and the cell-averaged velocity obtained from MsFEM give faithful reproductions of the respective direct numerical solutions. We remark that a finite volume version of the multiscale finite element method has been developed by Jenny et al. [12]. They also found that, by updating the multiscale bases adaptively in space and time, they can approximate the well-resolved solution accurately. The percentage of the multiscale bases that need to be updated is small, only a few percent of the total number of bases [13]. In some sense, the multiscale finite element method thus offers an efficient approach to capture the fine scale details using only a small fraction of the computational time required for a direct numerical simulation on a fine grid.
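A minimal sketch of this recovery step, assuming the subgrid gradients of the multiscale bases were stored when the bases were built (all array layouts are illustrative assumptions):

    # Recover the fine grid Darcy velocity inside one coarse element from the
    # coarse nodal pressures:  u = -K grad(p_h) = -K sum_i p_i grad(phi_i).
    import numpy as np

    def recover_velocity(p_nodes, grad_phi, K):
        # p_nodes:  (d,)        coarse nodal pressures on the element
        # grad_phi: (d, m, 2)   subgrid gradients of the d multiscale bases
        # K:        (m, 2, 2)   fine grid permeability tensor at the m subgrid points
        grad_p = np.einsum("i,imk->mk", p_nodes, grad_phi)
        return -np.einsum("mkl,ml->mk", K, grad_p)   # (m, 2) fine grid velocity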
6. Scale-up of Two-phase Flows
The multiscale finite element method has been used in conjunction with moment closure models to obtain an upscaled method for two-phase flows. In many oil reservoir applications, the capillary pressure effect is so small that it is neglected in practice. Upscaling a convection dominated transport equation is difficult due to the nonlocal memory effect [14]. Here we use the upscaling method proposed in [15] to design an overall coarse grid model for the transport equation. In its simplest form, neglecting the effects of gravity, compressibility, and capillary pressure, and considering constant porosity and unit mobility, the governing equations for flow transport in highly heterogeneous porous media are the following partial differential equations:

$\nabla \cdot (K(x)\, \nabla p) = 0,$    (15)

$\frac{\partial S}{\partial t} + u \cdot \nabla S = 0,$    (16)

where $p$ is the pressure, $S$ is the water saturation, $K(x) = (K_{ij}(x))$ is the relative permeability tensor, and $u = -K(x)\, \nabla p$ is the Darcy velocity. The work of Efendiev et al. [15] for upscaling the saturation equation involves a moment closure argument. The velocity and the saturation are separated into a local mean quantity and a small scale perturbation with zero mean. For example, the Darcy velocity in (16) is expressed as $u = u_0 + u'$, where $u_0$ is the average of the velocity $u$ over each coarse element and $u'$ is the deviation of the fine scale velocity from its coarse scale average. If one ignores the third order terms containing the fluctuations of velocity and saturation, one can obtain an
average equation for the saturation $\overline{S}$ as follows:

$\frac{\partial \overline{S}}{\partial t} + u_0 \cdot \nabla \overline{S} = \frac{\partial}{\partial x_i} \left( D_{ij}(x, t)\, \frac{\partial \overline{S}}{\partial x_j} \right),$    (17)

where the diffusion coefficients $D_{ij}(x, t)$ are defined by

$D_{ii}(x, t) = \overline{|u_i'(x)|}\, L_i^0(x, t), \qquad D_{ij}(x, t) = 0 \ \text{ for } i \ne j,$

where $\overline{|u_i'(x)|}$ stands for the average of $|u_i'(x)|$ over each coarse element. The function $L_i^0(x, t)$ is the length of the coarse grid streamline in the $x_i$ direction which starts at time $t$ at point $x$, i.e.,

$L_i^0(x, t) = \int_0^t y_i(s)\, ds,$

where $y(s)$ is the solution of the following system of ODEs:

$\frac{dy(s)}{ds} = u_0(y(s)), \qquad y(t) = x.$

Note that the hyperbolic equation (16) is now replaced by a convection–diffusion equation, and the induced diffusion term is history dependent. In some sense, it captures the nonlocal, history dependent memory effect described by Tartar for the simple shear flow problem [14]. The multiscale finite element method can be readily combined with the above upscaling model for the saturation equation: the local fine grid velocity $u$ can be reconstructed from the multiscale finite element bases. We perform a coarse grid computation of the above algorithm on a coarse 64 × 64 mesh using a mixed multiscale finite element method [4]. The fractional flow curve using the above algorithm is depicted in Fig. 6. It gives excellent agreement with the "exact" fractional flow curve, which is obtained using a fine 1024 × 1024 mesh. Upscaling the full two-phase flow problem is more difficult due to the dynamic coupling between the pressure and the saturation. One important observation is that the fluctuation in saturation is relatively small away from the oil/water interface. In this region, the multiscale bases are essentially the same as those generated by the corresponding one-phase flow (i.e., $\lambda = 1$). These base functions are time independent. In practice, we can design an adaptive strategy to update the multiscale bases in space and time. The percentage of multiscale bases that need to be updated is relatively small (a few percent of the total number of bases) [13]. The base functions that need to be updated are mostly near the interface separating the oil from the water; for those coarse grid cells far from the interface, there is little dynamic change in mobility. The upscaling of
Figure 6. The accuracy of the coarse grid algorithm: fractional flow F(t) versus t. The solid line is the well-resolved fractional flow curve; the dash-dotted line is the fractional flow curve using the above coarse grid algorithm.
the saturation equation based on the moment closure argument can be generalized to the two-phase flow problem, with the enhanced diffusivity depending on the local small scale velocity field [15]. As we mentioned before, the fluctuation $u'$ of the velocity field can be accurately recovered from the coarse grid computation by using the local multiscale bases.
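The history dependent diffusivity in (17) is straightforward to evaluate numerically: for each coarse point one traces the coarse grid streamline backward from time t and accumulates the integral defining $L_i^0$. A minimal sketch under these assumptions (the coarse velocity field u0 is a user-supplied callable, and the fluctuation average is assumed precomputed; the integrand follows the formula as printed above):

    # Evaluate L_i^0(x, t) by tracing dy/ds = u0(y), y(t) = x, over [0, t],
    # then form the diagonal diffusivity D_ii = <|u'_i|> L_i^0 of (17).
    import numpy as np
    from scipy.integrate import solve_ivp

    def streamline_L0(u0, x, t, n=200):
        sol = solve_ivp(lambda s, y: u0(y), (t, 0.0), np.asarray(x, float),
                        t_eval=np.linspace(t, 0.0, n), rtol=1e-8)
        s, y = sol.t[::-1], sol.y[:, ::-1]      # reorder to increasing s
        return np.trapz(y, s, axis=1)           # one entry per direction i

    def D_diag(u_prime_abs_avg, L0):
        return u_prime_abs_avg * L0             # off-diagonal entries are zero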
7. Multiscale Analysis for Incompressible Flow
The upscaling of the nonlinear transport equation in two-phase flows shares some common difficulties with deriving effective equations for incompressible flow at high Reynolds number. Understanding scale interactions for 3D incompressible flow has been a major challenge. For high Reynolds number flow, the number of degrees of freedom is so large that it is almost impossible to resolve all small scales by direct numerical simulation. Deriving an effective equation for the large scale solution is very useful in engineering applications; see, e.g., [16, 17]. In deriving a large eddy simulation model, one usually needs to make certain closure assumptions. The accuracy of such closure models is hard to measure a priori, and it varies from application to application. For many engineering applications, it is desirable to design a subgrid-based large scale model in a systematic way, so that we can measure and control the modeling error. However, the strong nonlinear interaction of small scales and the lack of scale separation make it difficult to derive an effective equation.
We consider the incompressible Navier–Stokes equations

$u_t^\epsilon + (u^\epsilon \cdot \nabla) u^\epsilon = -\nabla p^\epsilon + \nu\, \Delta u^\epsilon,$    (18)

$\nabla \cdot u^\epsilon = 0,$    (19)

with multiscale initial data $u^\epsilon(x, 0) = u_0^\epsilon(x)$. Here $u^\epsilon(t, x)$ and $p^\epsilon(t, x)$ are the velocity and pressure, respectively, and $\nu$ is the viscosity. We use boldface letters to denote vector variables. For the time being, we do not consider the effect of boundaries and assume that the solution is periodic with period $2\pi$ in each dimension. For incompressible flow at high Reynolds number, small scales are generated dynamically through nonlinear interactions; in general, there is no scale separation in the solution. However, by decomposing the physical solution into a low frequency component and a high frequency component, we can formally express the solution as the sum of a large scale solution and a small scale component. This decomposition can be carried out easily in Fourier space. Further, by rearranging the order of summation in the Fourier transformation, we can express the initial condition in the following form:

$u^\epsilon(x, 0) = U(x) + W\!\left(x, \frac{x}{\epsilon}\right),$

where $W(x, y)$ is periodic in $y$ and has mean zero. Here $\epsilon$ represents the cut-off wavelength in the solution, above which the solution is resolvable and below which it is unresolvable. We call this a reparameterization technique. The question of interest is how to derive a homogenized equation for the averaged velocity field for small but finite $\epsilon$. If the viscosity coefficient $\nu$ is of order one, then it can be shown that the high frequency oscillations are damped out quickly, in $O(\epsilon)$ time. Even with $\nu = O(\epsilon)$, the cell viscosity will be of order one and the oscillatory component of the velocity field will be of order $O(\epsilon)$. In order for the oscillatory component of the velocity field to persist in time, we need $\nu = O(\epsilon^2)$; in this case, the cell viscosity is zero to leading order. Since we are interested in convection dominated transport, we set $\nu = 0$ and consider only the incompressible Euler equation. The homogenization of the Euler equation with oscillating data was first studied by McLaughlin, Papanicolaou, and Pironneau (MPP for short) [18]. In Ref. [18], MPP made the important assumption that the small scale oscillation is convected by the mean flow. Based on this assumption, they made the following multiscale expansion for the velocity and pressure:

$u^\epsilon(t, x) = u(t, x) + w\!\left(t, x, \frac{t}{\epsilon}, \frac{\theta(t, x)}{\epsilon}\right) + \epsilon\, u_1\!\left(t, x, \frac{t}{\epsilon}, \frac{\theta(t, x)}{\epsilon}\right) + \cdots,$

$p^\epsilon(t, x) = p(t, x) + q\!\left(t, x, \frac{t}{\epsilon}, \frac{\theta(t, x)}{\epsilon}\right) + \epsilon\, p_1\!\left(t, x, \frac{t}{\epsilon}, \frac{\theta(t, x)}{\epsilon}\right) + \cdots,$
where $w(t, x, \tau, y)$, $u_1(t, x, \tau, y)$, $q$, and $p_1$ are assumed to be periodic in both $y$ and $\tau$, and the phase $\theta$ is convected by the mean velocity field $u$:

$\frac{\partial \theta}{\partial t} + u \cdot \nabla_x \theta = 0, \qquad \theta(0, x) = x.$    (20)

By substituting the above multiscale expansions into the Euler equation and equating coefficients of the same order, MPP obtained a homogenized equation for $(u, p)$ and a periodic cell problem for $(w(t, x, \tau, y), q(t, x, \tau, y))$. On the other hand, it is not clear whether the resulting cell problem for $w$ and $q$ has a unique solution that is periodic in both $y$ and $\tau$; additional assumptions were imposed on the solution of the cell problem in order to derive a variant of the $k$–$\epsilon$ model. Understanding how the small scale solution is propagated dynamically is clearly very important in deriving the homogenized equation. Motivated by the work of MPP, we have recently developed a multiscale analysis for the incompressible Euler equation with multiscale solutions [19, 20]. Our study shows that the small scale oscillations are convected by the full oscillatory velocity field, not just the mean velocity:

$\frac{\partial \theta^\epsilon}{\partial t} + u^\epsilon \cdot \nabla_x \theta^\epsilon = 0, \qquad \theta^\epsilon(0, x) = x.$    (21)

This is clear for the 2D Euler equation, since the vorticity $\omega^\epsilon$ is conserved along the characteristics:

$\omega^\epsilon(t, x) = \omega_0\!\left(\theta^\epsilon(t, x), \frac{\theta^\epsilon(t, x)}{\epsilon}\right),$

where $\omega_0(x, x/\epsilon)$ is the initial vorticity, which is of order $O(1/\epsilon)$. A similar conclusion can be drawn for the 3D Euler equation. Now the multiscale structure of $\theta^\epsilon(t, x)$ is coupled to the multiscale structure of $u^\epsilon$; in some sense, we embed multiscale structure within multiscale expansions. It is quite a challenge to unfold the multiscale solution structure, and a naive multiscale expansion for $\theta^\epsilon$ may lead to the generation of an infinite number of scales. Motivated by the above analysis, we look for multiscale expansions of the velocity field and the pressure of the following form:

$u^\epsilon(t, x) = u(t, x) + w(t, \theta(t, x), \tau, y) + \epsilon\, u^{(1)}(t, \theta(t, x), \tau, y) + \cdots,$    (22)

$p^\epsilon(t, x) = p(t, x) + q(t, \theta(t, x), \tau, y) + \epsilon\, p^{(1)}(t, \theta(t, x), \tau, y) + \cdots,$    (23)

where $\tau = t/\epsilon$ and $y = \theta(t, x)/\epsilon$. We assume that $w$ and $q$ have zero mean with respect to $y$. The phase function $\theta^\epsilon$ is defined in (21), and it has the following multiscale expansion:

$\theta^\epsilon = \theta(t, x) + \epsilon\, \theta^{(1)}(t, \theta(t, x), \tau, y) + \cdots.$    (24)
This particular form of multiscale expansion was suggested by a corresponding Lagrangian multiscale analysis [19]. If one tried to expand $\theta^\epsilon$ naively as a function of $x/\epsilon$ and $t/\epsilon$, one would find that an infinite number of scales is generated at $t > 0$, and one would not be able to obtain a well-posed cell problem. Expanding the Jacobian matrix, we get $\nabla_x \theta^\epsilon = B^{(0)} + \epsilon B^{(1)} + \cdots$. Substituting the expansions into the Euler equation and matching terms of the same order, we obtain the following homogenized equation:

$\partial_t u + u \cdot \nabla_x u + \nabla_x \cdot \overline{ww} = -\nabla_x p, \qquad u|_{t=0} = U(x),$    (25)

$\nabla_x \cdot u = 0,$    (26)

where the overbar stands for the space–time average in $(y, \tau)$ and $ww$ stands for the matrix whose entry in the $i$th row and $j$th column is $w_i w_j$. The equation for $w$ is given by

$\partial_\tau w + B^{(0)} \nabla_y q = 0, \quad \tau > 0;$    (27)

$(B^{(0)} \nabla_y) \cdot w = 0; \qquad w|_{\tau=0} = W(x, y), \quad t = 0.$    (28)

Moreover, we can derive the evolution equations for $\theta$ and $\theta^{(1)}$ as follows:

$\partial_t \theta + (u \cdot \nabla_x)\theta = 0, \qquad \theta|_{t=0} = x,$    (29)

$\partial_\tau \theta^{(1)} + (w \cdot \nabla_x)\theta = 0, \qquad \theta^{(1)}|_{\tau=0} = 0.$    (30)

From $\theta$ and $\theta^{(1)}$, we can compute the Jacobian matrix $B^{(0)}$ as follows:

$B^{(0)} = \left(I - D_y \theta^{(1)}\right)^{-1} \nabla_x \theta.$    (31)
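The reparameterization technique used for the initial data above (and, as described below, repeatedly in time) amounts to a Fourier cut-off. A minimal sketch for a scalar periodic field, with the cut-off wavenumber an illustrative assumption:

    # Split a 2D periodic field into a large scale part U (modes up to k_cut)
    # and a mean-zero small scale part W, so that u = U + W.
    import numpy as np

    def reparameterize(u, k_cut):
        n = u.shape[0]
        k = np.fft.fftfreq(n, d=1.0/n)                    # integer wavenumbers
        KX, KY = np.meshgrid(k, k, indexing="ij")
        low = (np.abs(KX) <= k_cut) & (np.abs(KY) <= k_cut)
        U = np.fft.ifft2(np.where(low, np.fft.fft2(u), 0)).real
        return U, u - U                                   # W = u - U has zero mean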
To check the convergence of our multiscale analysis, we compare the result obtained by solving the homogenized equation with that obtained by a well-resolved direct numerical simulation (DNS). Further, we use the first two terms in the multiscale expansion for the velocity field to reconstruct the fine grid velocity field. The initial velocity field is generated in Fourier space by imposing a power-law decay on the velocity spectrum with a random phase perturbation in each Fourier mode. For this initial condition, we choose $\epsilon = 0.05$. In Fig. 7a, we plot the initial horizontal velocity field on the fine mesh; the corresponding coarse grid velocity field is plotted in Fig. 7b. As the spectrum plot in Fig. 9 shows, there is no scale separation in the solution. We compare the computation obtained by the homogenized equation with that obtained by DNS at $t = 0.5$ in Fig. 8. We use spectral interpolation to reconstruct the fine grid velocity field as the sum of the homogenized solution $u$ and the cell velocity field $w$. We can see that the reconstructed velocity field (plotted only on the coarse grid) captures very well the fine grid velocity field obtained by DNS using a 512 × 512 grid. We also compare the accuracy
Figure 7. Horizontal velocity fields at t = 0: (a) u + w on the fine grid; (b) u + w on the coarse grid.
Figure 8. Horizontal velocity fields at t = 0.5: (a) u + w from DNS on the fine grid; (b) u + w interpolated on the coarse grid.
in Fourier space, as shown in Fig. 9b. The agreement between the well-resolved solution and the reconstructed solution from the homogenized equation is excellent at both low and high frequencies. Further, we compare the mean velocity field obtained by the homogenized equation with that obtained by direct simulation using a low pass filter. The results are plotted in Figs. 10 and 11, respectively. We can see that the agreement between the two calculations is very good up to $t = 1.0$, and similar results are obtained for longer time calculations. The above multiscale analysis can be generalized to problems with general multiscale initial data without scale separation and periodic structure. This can be done by using the reparameterization technique in Fourier space, which we described earlier for the initial velocity. This reparameterization technique
Figure 9. Spectra of the velocity fields at t = 0 and t = 0.5, respectively, comparing the DNS solution (512 × 512) with the reconstructed solution U + W (512 × 512).
Figure 10. Mean velocity fields u at t = 1.0 with filter scale k = 0.01: (a) DNS, fine grid; (b) homogenized equation, coarse grid.
can be used repeatedly in time. The dynamic reparameterization also accounts for the dynamic interactions between the large and small scales, and the difficulty associated with finding the local microscopic boundary condition can be overcome. Preliminary computational results show that the multiscale method can accurately capture the large scale solution and the spectral properties of the small scale solution for relatively long time computations. Our ultimate goal is to use the multiscale analysis to design an effective coarse grid model that can capture accurately the large scale behavior, but with a computational cost comparable to traditional large eddy simulation
Figure 11. Cross-sections of the mean velocity field u at t = 1.0 for two filter scales, comparing the DNS and two-scale solutions with the initial (t = 0) profile.
(LES) models [16, 17]. To achieve this, we need to take into account special structures of the fully mixed flow, such as homogeneity and possible local self-similarity of the flow in the interior of the domain. When the flow is fully mixed, we expect the Reynolds stress term $\overline{ww}$ to reach a statistical equilibrium relatively quickly. As a consequence, after updating the effective velocity over one coarse grid time step, we may need to solve the cell problem in $\tau$ for only a small number of time steps. Moreover, for homogeneous flow we need not solve the cell problem on every coarse grid cell: it should be sufficient to solve one or a few representative cell problems for the fully mixed flow and use these representative cell solutions to compute the Reynolds stress term in the homogenized velocity equation. If this can be achieved, it would lead to significant computational savings.
8. Discussions
Multiscale methods offer several advantages over direct numerical simulation on a fine grid. First, the multiscale bases are completely local, which makes the method very easy to implement on parallel computers; the memory requirement is also less stringent than for direct numerical simulation, since the base functions can be computed locally and independently. Secondly, we can use an effective adaptive strategy to update the multiscale bases only in the regions where this is needed. Thirdly, multiscale methods offer an effective tool for deriving upscaled equations. In oil reservoir simulations, it is often the
case that multiple simulations of the same reservoir model must be carried out in order to validate the fine grid reservoir model. After the upscaled model has been obtained, it can be used repeatedly with different boundary conditions and source distributions for management purposes. In this case, the cost of computing the multiscale base functions is just an overhead. If one can coarsen the fine grid by a factor of 10 in each dimension, the computational saving of the upscaled model over the original fine model could be as large as a factor of 10 000 (three space dimensions plus time). It remains a great challenge to develop a systematic multiscale analysis to upscale convection-dominated transport in heterogeneous media. While the upscaled saturation equation based on a perturbation argument and moment closure approximation is simple and easy to implement, it is hard to estimate its modeling error, as the fluctuations in velocity or saturation are not small in practice. New multiscale analysis needs to be developed to account for the long-range interaction of small scales (the memory effect). Recently, we have developed a novel, delicate multiscale analysis for the convection-dominated transport equation [21]. The multiscale analysis for two-phase flows is not as complicated as that for the incompressible Euler equation: there is no need to introduce a multiscale phase function, and the fast variable $y = x/\epsilon$, which characterizes the small scale solution, enters only as a parameter. This makes it easier to generalize the analysis to problems which do not have scale separation. We remark that there are other approaches to multiscale problems; see, e.g., [22–27]. Some of these methods assume that the media have periodic microstructures or scale separation and exploit these properties in their multiscale methods, while others use wavelet approximations, renormalization group techniques, or variational methods.
9. Outlook
Looking forward, the main challenge in developing multiscale methods seems to be the lack of analytical tools for studying nonlinear dynamic problems that are convection-dominated and whose solutions have neither scale separation nor periodic microstructure. For convection-dominated transport problems without scale separation, it is very difficult to construct local multiscale base functions as we did for the elliptic- or diffusion-dominated problems. Incorrect local microscopic boundary conditions for the local multiscale base functions can lead to order one errors propagating downstream and can create fluid dynamical instabilities. Systematic multiscale analysis needs to be carried out to account for the long-range interaction of small scales.
To bridge the gap between classical homogenization theory, where scale separation is required, and practical applications, where we do not have scale separation, we need to develop a new type of multiscale analysis that does not require a large separation of scales. By using the dynamic reparameterization technique, we can always divide a multiscale solution into a large scale component and a small scale component. The interaction of the large and small scales can then be effectively modeled by a two-scale analysis over a short time increment, after which the reparameterization technique decomposes the solution again into large and small scale components. The interaction of the large and small scale solutions thus occurs iteratively at every small time increment, and over a long time we can account for the interactions of all scales. We are currently pursuing this approach in the hope of developing a systematic multiscale analysis for incompressible flow at high Reynolds number.
References

[1] T.Y. Hou and X. Wu, "A multiscale finite element method for elliptic problems in composite materials and porous media," J. Comput. Phys., 134, 169–189, 1997.
[2] T.Y. Hou, X. Wu, and Z. Cai, "Convergence of a multiscale finite element method for elliptic problems with rapidly oscillating coefficients," Math. Comput., 68, 913–943, 1999.
[3] Y.R. Efendiev, T.Y. Hou, and X. Wu, "Convergence of a nonconforming multiscale finite element method," SIAM J. Numer. Anal., 37, 888–910, 2000b.
[4] Z. Chen and T. Hou, "A mixed finite element method for elliptic problems with rapidly oscillating coefficients," Math. Comput., 72, 541–576, 2002.
[5] I. Babuska, G. Caloz, and E. Osborn, "Special finite element methods for a class of second order elliptic problems with rough coefficients," SIAM J. Numer. Anal., 31, 945–981, 1994.
[6] F. Brezzi and A. Russo, "Choosing bubbles for advection-diffusion problems," Math. Models Methods Appl. Sci., 4, 571–587, 1994.
[7] T.J.R. Hughes, "Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles and the origins of stabilized methods," Comput. Methods Appl. Mech. Engrg., 127, 387–401, 1995.
[8] A. Bensoussan, J.L. Lions, and G. Papanicolaou, Asymptotic Analysis for Periodic Structures, 1st edn., North-Holland, Amsterdam, 1978.
[9] T.Y. Hou, "Multiscale modeling and computation of incompressible flow," In: J.M. Hill and R. Moore (eds.), Applied Mathematics Entering the 21st Century, Invited Talks from the ICIAM 2003 Congress, SIAM, Philadelphia, pp. 177–209, 2004.
[10] P. Park and T.Y. Hou, "Multiscale numerical methods for singularly-perturbed convection–diffusion equations," Int. J. Comput. Meth., 1(1), 17–65, 2004.
[11] L.J. Durlofsky, "Numerical calculation of equivalent grid block permeability tensors for heterogeneous porous media," Water Resour. Res., 27, 699–708, 1991.
[12] P. Jenny, S.H. Lee, and H. Tchelepi, "Multi-scale finite volume method for elliptic problems in subsurface flow simulation," J. Comput. Phys., 187, 47–67, 2003.
[13] P. Jenny, S.H. Lee, and H. Tchelepi, "Adaptive multi-scale finite volume method for multi-phase flow and transport in porous media," Multiscale Model. Simul., 3, 50–64, 2005.
[14] L. Tartar, "Nonlocal effects induced by homogenization," In: F. Colombini (ed.), PDE and Calculus of Variations, Birkhäuser, Boston, pp. 925–938, 1989.
[15] Y.R. Efendiev, L.J. Durlofsky, and S.H. Lee, "Modeling of subgrid effects in coarse-scale simulations of transport in heterogeneous porous media," Water Resour. Res., 36, 2031–2041, 2000a.
[16] J. Smagorinsky, "General circulation experiments with the primitive equations," Mon. Weather Rev., 91, 99–164, 1963.
[17] M. Germano, U. Piomelli, P. Moin, and W. Cabot, "A dynamic subgrid-scale eddy viscosity model," Phys. Fluids A, 3, 1760–1765, 1991.
[18] D.W. McLaughlin, G.C. Papanicolaou, and O. Pironneau, "Convection of microstructure and related problems," SIAM J. Appl. Math., 45, 780–797, 1985.
[19] T.Y. Hou, D. Yang, and K. Wang, "Homogenization of incompressible Euler equations," J. Comput. Math., 22(2), 220–229, 2004b.
[20] T.Y. Hou, D. Yang, and H. Ran, "Multiscale analysis in the Lagrangian formulation for the 2-D incompressible Euler equation," Discr. Continuous Dynam. Sys., 12, to appear, 2005.
[21] T.Y. Hou, A. Westhead, and D. Yang, "Multiscale analysis and computation for two-phase flows in strongly heterogeneous porous media," in preparation, 2005a.
[22] M. Dorobantu and B. Engquist, "Wavelet-based numerical homogenization," SIAM J. Numer. Anal., 35, 540–559, 1998.
[23] T. Wallstrom, S. Hou, M.A. Christie, L.J. Durlofsky, and D. Sharp, "Accurate scale up of two-phase flow using renormalization and nonuniform coarsening," Comput. Geosci., 3, 69–87, 1999.
[24] T. Arbogast, "Numerical subgrid upscaling of two-phase flow in porous media," In: Z. Chen (ed.), Numerical Treatment of Multiphase Flows in Porous Media, Springer, Berlin, pp. 35–49, 2000.
[25] A. Matache, I. Babuska, and C. Schwab, "Generalized p-FEM in homogenization," Numer. Math., 86, 319–375, 2000.
[26] L.Q. Cao, J.Z. Cui, and D.C. Zhu, "Multiscale asymptotic analysis and numerical simulation for the second order Helmholtz equations with rapidly oscillating coefficients over general convex domains," SIAM J. Numer. Anal., 40, 543–577, 2002.
[27] W. E and B. Engquist, "The heterogeneous multi-scale methods," Comm. Math. Sci., 1, 87–133, 2003.
4.15 CERTIFIED REAL-TIME SOLUTION OF PARAMETRIZED PARTIAL DIFFERENTIAL EQUATIONS

Nguyen Ngoc Cuong, Karen Veroy, and Anthony T. Patera
Massachusetts Institute of Technology, Cambridge, MA, USA
1. Introduction
Engineering analysis requires the prediction of (say, a single) selected "output" $s^e$ relevant to ultimate component and system performance:* typical outputs include energies and forces, critical stresses or strains, flowrates or pressure drops, and various local and global measures of concentration, temperature, and flux. These outputs are functions of system parameters, or "inputs", $\mu$, that serve to identify a particular realization or configuration of the component or system: these inputs typically reflect geometry, properties, boundary conditions, and loads; we shall assume that $\mu$ is a $P$-vector (or $P$-tuple) of parameters in a prescribed closed input domain $\mathcal{D} \subset \mathbb{R}^P$. The input–output relationship $s^e(\mu) : \mathcal{D} \to \mathbb{R}$ thus encapsulates the behavior relevant to the desired engineering context. In many important cases, the input–output function $s^e(\mu)$ is best articulated as a (say) linear functional $\ell$ of a field variable $u^e(\mu)$. The field variable, in turn, satisfies a $\mu$-parametrized partial differential equation (PDE) that describes the underlying physics: for given $\mu \in \mathcal{D}$, $u^e(\mu) \in X^e$ is the solution of

$g(u^e(\mu), v; \mu) = 0, \qquad \forall\, v \in X^e,$    (1)
where $g$ is the weak form of the relevant partial differential equation† and $X^e$ is an appropriate Hilbert space defined over the physical domain $\Omega \subset \mathbb{R}^d$.

* Here the superscript "e" shall refer to "exact." We shall later introduce a "truth approximation," which will bear no superscript.
† We shall restrict our attention in this paper to second-order elliptic partial differential equations; see the Outlook for a brief discussion of parabolic problems.
Note that in the linear case, $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$, where $a(\cdot, \cdot; \mu)$ and $f$ are continuous bilinear and linear forms, respectively; for any given $\mu \in \mathcal{D}$, $u^e(\mu) \in X^e$ now satisfies

$a(u^e(\mu), v; \mu) = f(v), \qquad \forall\, v \in X^e \quad \text{(linear)}.$    (2)
Relevant system behavior is thus described by an implicit "input–output" relationship

$s^e(\mu) = \ell(u^e(\mu)),$    (3)

evaluation of which necessitates solution of the partial differential equation (1) or (2). Many problems in materials and materials processing can be formulated as particular instantiations of the abstraction (1) and (3), or perhaps (2) and (3). Typical field variables and associated second-order elliptic partial differential equations include temperature and steady conduction (Poisson); displacement and equilibrium or Helmholtz elasticity; {velocity, temperature} and the steady Boussinesq incompressible Navier–Stokes equations; and the wavefunction and the stationary Schrödinger equation via (say) the Hartree–Fock approximation. The latter two equations are nonlinear, while the former two are linear; in subsequent sections we shall provide detailed examples of both nonlinear and linear problems. Our particular interest – or certainly the best way to motivate our approach – is "deployed" systems: components or processes that are in service, in operation, or in the field. For example, in the materials and materials processing context, we may be interested in the assessment, evolution, and accommodation of a crack in a critical component of an in-service jet engine; in real-time characterization and optimization of the heat treatment protocol for a turbine disk; or in online thermal "control" of Bridgman semiconductor crystal growth. Typical computational tasks include robust parameter estimation (inverse problems) and adaptive design (optimization problems): in the former – for example, assessment of the current crack length or in-process heat transfer coefficient – we must deduce inputs $\mu$ representing system characteristics based on outputs $s^e(\mu)$ reflecting measured observables; in the latter – for example, prescription of the allowable load or best thermal environment – we must deduce inputs $\mu$ representing "control" variables based on outputs $s^e(\mu)$ reflecting current process objectives. Both of these demanding activities must support an action in the presence of continually evolving environmental and mission parameters. The computational requirements on the forward problem are thus formidable: the evaluation must be real-time, since the action must be immediate; and the evaluation must be certified – endowed with a rigorous error bound – since the action must be safe and feasible. For example, in our aerospace crack example, we must predict in the field – without recourse to a lengthy computational investigation – the load that the potentially damaged structure
can unambiguously safely carry. Similarly, in our materials processing examples, we must predict in operation – in response to deduced environmental variation – temperature boundary conditions that will preserve the desired material properties. Classical approaches such as the finite element method cannot typically satisfy these requirements. In the finite element method, we first introduce a piecewise-polynomial "truth" approximation subspace $X\ (\subset X^e)$ of dimension $\mathcal{N}$. The "truth" finite element approximation is then found by (say) Galerkin projection: given $\mu \in \mathcal{D}$,

$s(\mu) = \ell(u(\mu)),$    (4)

where $u(\mu) \in X$ satisfies

$g(u(\mu), v; \mu) = 0, \qquad \forall\, v \in X,$    (5)

or, in the linear case $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$,

$a(u(\mu), v; \mu) = f(v), \qquad \forall\, v \in X \quad \text{(linear)}.$    (6)
We assume that (5) and (6) are well-posed; we articulate the associated hypotheses more precisely in the context of a posteriori error estimation. We shall assume – hence the appellation "truth" – that $X$ is sufficiently rich that $u(\mu)$ (respectively, $s(\mu)$) is sufficiently close to $u^e(\mu)$ (respectively, $s^e(\mu)$) for all $\mu$ in the parameter domain $\mathcal{D}$. Unfortunately, for any reasonable error tolerance, the dimension $\mathcal{N}$ needed to satisfy this condition – even with the application of appropriate (parameter-dependent) adaptive mesh refinement strategies – is typically extremely large, and in particular much too large to provide real-time response in the deployed context. Deployed systems thus present no shortage of unique computational challenges; however, they also provide many unique computational opportunities – opportunities that must be exploited. We first consider the "approximation opportunity." The critical observation is that, although the field variable $u^e(\mu)$ generally belongs to the infinite-dimensional space $X^e$ associated with the underlying partial differential equation, in fact $u^e(\mu)$ resides on a very low-dimensional manifold $\mathcal{M}^e \equiv \{u^e(\mu) \mid \mu \in \mathcal{D}\}$ induced by the parametric dependence; for example, for a single parameter, $\mu \in \mathcal{D} \subset \mathbb{R}^{P=1}$, $u^e(\mu)$ will describe a one-dimensional filament that winds through $X^e$. Furthermore, the field variable $u^e(\mu)$ will typically be extremely regular in $\mu$ – the parametrically induced manifold $\mathcal{M}^e$ is very smooth – even when the field variable enjoys only limited regularity with respect to the spatial coordinate $x \in \Omega$.*

* The smoothness in $\mu$ may be deduced from the equations for the sensitivity derivatives; the stability and continuity properties of the partial differential operator are crucial.
In the finite element method, the approximation space $X$ is much too general – $X$ can approximate many functions that do not reside on $\mathcal{M}^e$ – and hence much too expensive. This observation presents a clear opportunity: we can effect significant dimension reduction in state space if we restrict attention to $\mathcal{M}^e$; the field variable can then be adequately approximated by a space of dimension $N \ll \mathcal{N}$. However, since the manipulation of even one "point" on $\mathcal{M}^e$ is expensive, we must identify further structure. We thus next consider the "computational opportunities"; here there are two critical observations. The first observation derives from the mathematical formulation: very often, the parameter dependence of the partial differential equation can be expressed as the sum of $Q$ products of (known, easily evaluated) parameter-dependent functions and parameter-independent continuous forms; we shall denote this structure as "affine" parameter dependence. In our linear case, (2), affine parameter dependence reduces to

$a(w, v; \mu) = \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(w, v),$    (7)

for $\Theta^q : \mathcal{D} \to \mathbb{R}$ and $a^q : X \times X \to \mathbb{R}$, $1 \le q \le Q$. The second observation derives from our context: rapid deployed response perforce places a predominant emphasis on very low marginal cost – we must minimize the additional effort associated with each new evaluation $\mu \to s(\mu)$ "in the field." These two observations present a clear opportunity: we can exploit the underlying affine parametric structure (7) to design effective offline–online computational procedures which willingly accept greatly increased initial preprocessing – offline, pre-deployed – expense in exchange for greatly reduced marginal – online, deployed – "in service" cost.* The two essential components of our approach are (i) rapidly, uniformly (over $\mathcal{D}$) convergent reduced-basis (RB) approximations, and (ii) associated rigorous and sharp a posteriori error bounds. Both components exploit the affine parametric structure and offline–online computational decompositions to provide extremely rapid deployed response – real-time prediction and associated error estimation. We next describe these essential ingredients.

* Clearly, low marginal cost implies low asymptotic average cost; our methods are thus also relevant to (non real-time) many-query multi-optimization studies – and, in fact, to any situation characterized by extensive exploration of parameter space.
2. Reduced-Basis Method

2.1. Approximation
The reduced-basis method was introduced in the late 1970s in the context of nonlinear structural analysis [1, 2] and subsequently abstracted, analyzed,
and extended to a much larger class of parametrized PDEs [3, 4] – including the incompressible Navier–Stokes equations [5–7] relevant to many materials processing applications. The RB method explicitly recognizes and exploits the dimension reduction afforded by the low-dimensional and smooth parametrically induced solution manifold. We note that the RB approximation is constructed not as an approximation to the exact solution, $u^e(\mu)$, but rather as an approximation to the (finite element) truth approximation, $u(\mu)$. As already discussed, $\mathcal{N}$, the dimension of $X$, will be very large; our RB formulation and associated error estimation procedures must be stable and (online) efficient as $\mathcal{N} \to \infty$. We shall consider in this section the linear case, $g(w, v; \mu) \equiv a(w, v; \mu) - f(v)$, in which $s(\mu)$ and $u(\mu)$ are given by (4) and (6), respectively; recall that $a$ is bilinear and $f$, $\ell$ are linear. We shall consider a "primal–dual" formulation particularly well-suited to good approximation and error characterization of the output; towards this end, we introduce a dual, or adjoint, problem: given $\mu \in \mathcal{D}$, $\psi(\mu) \in X$ satisfies

$a(v, \psi(\mu); \mu) = -\ell(v), \qquad \forall\, v \in X.$    (8)
(9)
where u N (µ) ∈ W N and ψ N du (µ) ∈ W Ndudu satisfy a(u N (µ), v; µ) = f (v),
∀ v ∈ WN ,
(10)
and a(v, ψ N du (µ); µ) = −(v),
∀ v ∈ W Ndudu ,
(11)
* In actual practice, the primal and dual bases should be orthogonalized with respect to the inner product $(\cdot, \cdot)_X$ associated with the Hilbert space $X$; the algebraic systems then inherit the "conditioning" properties of the underlying partial differential equation.
respectively. We emphasize that we are interested in global approximations that are uniformly valid over the finite parameter domain $\mathcal{D}$. We note that, in the compliance case – $a$ symmetric and $\ell = f$, such that $\psi(\mu) = -u(\mu)$ – we may simply take $N^{du} = N$, $S^{du}_N = S_N$, $W^{du}_N = W_N$, and hence $\psi_N(\mu) = -u_N(\mu)$. In practice, in such a case we need never actually form the dual problem – we simply identify $\psi_N(\mu) = -u_N(\mu)$ – with a corresponding 50% reduction in computational effort. Typically [8, 9], and in some very special cases provably [10], $u_N(\mu)$, $\psi_N(\mu)$, and $s_N(\mu)$ converge to $u(\mu)$, $\psi(\mu)$, and $s(\mu)$ uniformly and extremely rapidly – thanks to the smoothness in $\mu$ – and thus we may achieve the desired accuracy for $N, N^{du} \ll \mathcal{N}$. The critical ingredients of the a priori theory are (i) the optimality properties of Galerkin projection,* and (ii) the good approximation properties of $W_N$ (respectively, $W^{du}_{N^{du}}$) for the manifold $\mathcal{M} \equiv \{u(\mu) \mid \mu \in \mathcal{D}\}$ (respectively, $\mathcal{M}^{du} \equiv \{\psi(\mu) \mid \mu \in \mathcal{D}\}$).
2.2. Offline–Online Computational Procedure
Even though $N$, $N^{du}$ may be small, the elements of (say) $W_N$ are in some sense "large": $\zeta_n \equiv u(\mu^{pr}_n)$ will be represented in terms of $\mathcal{N} \gg N$ truth finite element basis functions. To eliminate the $\mathcal{N}$-contamination of the deployed performance, we must consider offline–online computational procedures [7–9, 11]. For our purposes here, we continue to assume that our PDE is linear, (6), and furthermore exactly affine, (7), for some modest $Q$. In future sections we shall consider a nonlinear example as well as the possibility of nonaffine operators. To begin, we expand our reduced-basis approximations as
N
du
u N j (µ)ζ j ,
ψ N du (µ) =
j =1
N
ψ N du j (µ)ζ jdu .
(12)
j =1
It then follows from (9) and (12) that the reduced-basis output can be expressed as

$s_N(\mu) = \sum_{j=1}^{N} u_{Nj}(\mu)\, \ell(\zeta_j) - \sum_{j'=1}^{N^{du}} \psi_{N^{du} j'}(\mu)\, f(\zeta^{du}_{j'}) + \sum_{j=1}^{N} \sum_{j'=1}^{N^{du}} \sum_{q=1}^{Q} u_{Nj}(\mu)\, \psi_{N^{du} j'}(\mu)\, \Theta^q(\mu)\, a^q(\zeta_j, \zeta^{du}_{j'}),$    (13)
* Galerkin optimality relies on stability of the discrete equations. The latter is only assured for coercive problems; for noncoercive problems, Petrov–Galerkin methods may thus be preferred [12].
where the coefficients $u_{Nj}(\mu)$, $1 \le j \le N$, and $\psi_{N^{du} j}(\mu)$, $1 \le j \le N^{du}$, satisfy the $N \times N$ and $N^{du} \times N^{du}$ linear algebraic systems

$\sum_{j=1}^{N} \left( \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta_j, \zeta_i) \right) u_{Nj}(\mu) = f(\zeta_i), \qquad 1 \le i \le N,$    (14)

$\sum_{j=1}^{N^{du}} \left( \sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta^{du}_i, \zeta^{du}_j) \right) \psi_{N^{du} j}(\mu) = -\ell(\zeta^{du}_i), \qquad 1 \le i \le N^{du}.$    (15)

The offline–online decomposition is now clear. For simplicity, below we assume that $N^{du} = N$. In the offline stage – performed once – we first solve for the $\zeta_i$, $\zeta^{du}_i$, $1 \le i \le N$; we then form and store $\ell(\zeta_i)$, $f(\zeta_i)$, $\ell(\zeta^{du}_i)$, and $f(\zeta^{du}_i)$, $1 \le i \le N$, as well as $a^q(\zeta_j, \zeta_i)$, $a^q(\zeta^{du}_i, \zeta^{du}_j)$, and $a^q(\zeta_i, \zeta^{du}_j)$, $1 \le i, j \le N$, $1 \le q \le Q$.* Note that all quantities computed in the offline stage are independent of the parameter $\mu$. In the online stage – performed many times, for each new value of $\mu$ "in the field" – we first assemble and subsequently invert the $N \times N$ "stiffness matrices" $\sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta_j, \zeta_i)$ of (14) and $\sum_{q=1}^{Q} \Theta^q(\mu)\, a^q(\zeta^{du}_i, \zeta^{du}_j)$ of (15) – this yields the $u_{Nj}(\mu)$, $\psi_{N^{du} j}(\mu)$, $1 \le j \le N$; we next perform the summation (13) – this yields $s_N(\mu)$. The operation count for the online stage is, respectively, $O(QN^2)$ and $O(N^3)$ to assemble (recall that the $a^q(\zeta_j, \zeta_i)$, $1 \le i, j \le N$, $1 \le q \le Q$, are pre-stored) and invert the stiffness matrices, and $O(N) + O(QN^2)$ to evaluate the output (recall that the $\ell(\zeta_j)$ are pre-stored); note that the RB stiffness matrix is, in general, full. The essential point is that the online complexity is independent of $\mathcal{N}$, the dimension of the underlying truth finite element approximation space. Since $N, N^{du} \ll \mathcal{N}$, we expect – and often realize – significant, orders-of-magnitude computational economies relative to classical discretization approaches.
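A minimal sketch of this decomposition in the compliance case (so the dual problem need not be formed and $s_N(\mu) = f(u_N(\mu))$); the stored offline quantities Aq (the matrices $a^q(\zeta_j, \zeta_i)$) and fN (the vector $f(\zeta_i)$) and the coefficient functions $\Theta^q$ are assumed given:

    # Online stage: assemble the N x N RB stiffness matrix from the pre-stored
    # parameter-independent blocks, solve (14), and evaluate the output.
    import numpy as np

    def online_evaluate(mu, theta, Aq, fN):
        th = theta(mu)                              # Theta^q(mu), q = 1..Q
        A_N = sum(t*A for t, A in zip(th, Aq))      # O(Q N^2) assembly
        u_N = np.linalg.solve(A_N, fN)              # O(N^3) solve for u_Nj(mu)
        return fN @ u_N                             # s_N(mu) = f(u_N) = l(u_N)

    # Hypothetical usage with Q = 2 affine terms and N = 3 basis functions:
    rng = np.random.default_rng(0)
    B = rng.standard_normal((3, 3)) * 0.01
    Aq, fN = [np.eye(3), B + B.T], np.ones(3)
    print(online_evaluate(0.5, lambda mu: np.array([1.0, mu]), Aq, fN))

Note that the online cost involves only the small N x N system; nothing in the online stage touches the truth dimension.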
3. A Posteriori Error Estimation

3.1. Motivation
A posteriori error estimation procedures are very well developed for classical approximations of, and solution procedures for, (say) partial differential equations [13–15] and algebraic systems [16]. However, until quite recently,

* In actual practice, in the offline stage we consider $N = N_{max}$ and $N^{du} = N^{du}_{max}$; then, in the online stage, we extract the necessary subvectors and submatrices.
there has been essentially no way to rigorously, quantitatively, sharply, and efficiently assess the accuracy of RB approximations. As a result, for any given new $\mu$, the RB (say, primal) solution $u_N(\mu)$ typically raises many more questions than it answers. Is there even a solution $u(\mu)$ near $u_N(\mu)$? This question is particularly crucial in the nonlinear context, in which in general we are guaranteed neither existence nor uniqueness. Is $|s(\mu) - s_N(\mu)| \le \epsilon_{tol}$, where $\epsilon_{tol}$ is the maximum acceptable error? Is a crucial feasibility condition $s(\mu) \le C$ (say, in a constrained optimization exercise) satisfied – not just for the RB approximation, $s_N(\mu)$, but also for the "true" output, $s(\mu)$? If these questions cannot be affirmatively answered, we may propose the wrong – and unsafe or infeasible – action in the deployed context. A fourth question is also important: Is $N$ too large, $|s(\mu) - s_N(\mu)| \ll \epsilon_{tol}$, with an associated steep $O(N^3)$ penalty on computational efficiency? An overly conservative approximation may jeopardize the real-time response and associated action – with corresponding detriment to the deployed system. We may also consider the approximation properties and efficiency of the (say, primal) parameter samples and associated RB approximation spaces, $S_N$ and $W_N$, $1 \le N \le N_{max}$. Do we satisfy our global "acceptable error level" condition, $|s(\mu) - s_N(\mu)| \le \epsilon_{tol}$, $\forall \mu \in \mathcal{D}$, for (close to) the smallest possible value of $N$? And a related question: For our given tolerance $\epsilon_{tol}$, are the RB stiffness matrices (or, in the nonlinear case, Newton Jacobians) as well-conditioned as possible – given that, by construction, $W_N$ will be increasingly colinear with increasing $N$? If the answers are not affirmative, then our RB approximations are more expensive (and unstable) than necessary – and perhaps too expensive to provide real-time response. In short, the pre-asymptotic and essentially ad hoc or empirical nature of reduced-basis discretizations, the strongly superlinear scaling (with $N$, $N^{du}$) of the reduced-basis online complexity, and the particular needs of deployed real-time systems virtually demand rigorous a posteriori error estimators. Absent such certification, we must either err on the side of computational pessimism – and compromise real-time response – or err on the side of computational optimism – and risk sub-optimal, infeasible, or potentially unsafe decisions. In Refs. [8, 9, 17, 18], we introduce a family of rigorous error estimators for reduced-basis approximations of a wide class of partial differential equations (see also Ref. [19] for an alternative approach). As in almost all error estimation contexts, the enabling (trivial) observation is that, whereas a 100% error in the field variable $u(\mu)$ or output $s(\mu)$ is clearly unacceptable, a 100% or even larger (conservative) error in the error is tolerable and not at all useless; we may thus pursue "relaxations" of the equations governing the error and residual that would be bootless for the original equation governing the field variable $u(\mu)$. We now present further details for the particular case of elliptic linear problems with exact affine parameter dependence (7): the truth solution satisfies
(4), (6), and (8), and the corresponding reduced-basis approximation satisfies (9)–(11). (In subsequent sections we shall consider the extension to nonlinear problems through a detailed example; we shall also briefly discuss nonaffine problems.)
3.2. Error Bounds
We shall need several preliminary definitions. To begin, we denote the inner product and norm associated with our Hilbert space $X$ as $(w, v)_X$ and $\|v\|_X = \sqrt{(v, v)_X}$, respectively; we further define the dual norm of any bounded linear functional $h$ as

$\|h\|_{X'} \equiv \sup_{v \in X} \frac{h(v)}{\|v\|_X}.$    (16)

We recall that we restrict our attention here to second-order elliptic partial differential equations: thus, for a scalar problem (such as heat conduction), $H^1_0(\Omega) \subset X^e \subset H^1(\Omega)$, where $H^1(\Omega)$ (respectively, $H^1_0(\Omega)$) is the usual space of derivative-square-integrable functions (respectively, derivative-square-integrable functions that vanish on $\partial\Omega$, the boundary of $\Omega$) [20]. A typical choice for $(\cdot, \cdot)_X$ is

$(w, v)_X = \int_\Omega \nabla w \cdot \nabla v + w v,$    (17)

which is simply the standard $H^1(\Omega)$ inner product.
which is simply the standard H 1 () inner product. We next introduce [12, 18] the operator T µ : X → X such that, for any w in X , (T µ w, v) X = a(w, v; µ), ∀ v ∈ X . We then define σ (w; µ) ≡
T µ w X ,
w X
and note that β(µ) ≡ inf sup
a(w, v; µ) = inf σ (w; µ),
w X v X w∈X
(18)
γ (µ) ≡ sup sup
a(w, v; µ) = sup σ (w; µ);
w X v X w∈X
(19)
w∈X v∈X
w∈X v∈X
we also recall that β(µ) w X T µ w X ≤ a(w, T µ w; µ),
∀ w ∈ X.
(20)
Here β(µ) is the Babuška "inf–sup" stability constant – the minimum singular value associated with our differential operator (and transpose operator) – and
γ(µ) is the standard continuity constant. We suppose that γ(µ) is bounded ∀µ ∈ D, and that β(µ) ≥ β₀ > 0, ∀µ ∈ D. We note that for a symmetric, coercive bilinear form, β(µ) = α_c(µ), where

    α_c(µ) ≡ inf_{w∈X} a(w, w; µ)/‖w‖²_X

is the standard coercivity constant.
Given our reduced-basis primal solution u_N(µ), it is readily derived that the error e(µ) ≡ u(µ) − u_N(µ) ∈ X satisfies

    a(e(µ), v; µ) = −g(u_N(µ), v; µ),  ∀v ∈ X,    (21)

where −g(u_N(µ), v; µ) ≡ f(v) − a(u_N(µ), v; µ) (in this linear case) is the familiar residual. It then follows from (16), (20), and (21) that

    ‖e(µ)‖_X ≤ ε_N(µ)/β(µ),

where

    ε_N(µ) ≡ ‖g(u_N(µ), · ; µ)‖_{X′}    (22)
is the dual norm of the residual.
We now assume that we are privy to a nonnegative lower bound for the inf–sup parameter, β̃(µ), such that β(µ) ≥ β̃(µ) ≥ ε_β β(µ), ∀µ ∈ D, where ε_β ∈ ]0, 1[. We then introduce our "energy" error bound

    Δ_N(µ) ≡ ε_N(µ)/β̃(µ),    (23)

the effectivity of which is defined as

    η_N(µ) ≡ Δ_N(µ)/‖e(µ)‖_X.

It is readily proven [9, 18] that, for any N, 1 ≤ N ≤ N_max,

    1 ≤ η_N(µ) ≤ γ(µ)/β̃(µ),  ∀µ ∈ D.    (24)
From the left inequality, we deduce that ‖e(µ)‖_X ≤ Δ_N(µ), ∀µ ∈ D, and hence that Δ_N(µ) is a rigorous upper bound for the true error* measured in the ‖·‖_X norm – this provides certification: feasibility and "safety" are guaranteed. From the right inequality, we deduce that Δ_N(µ) overestimates the true error by at most γ(µ)/β̃(µ), independent of N** – this relates to efficiency: an overly conservative error bound will be manifested in an unnecessarily large N and unduly expensive RB approximation, or (even worse) an overly conservative or expensive decision or action "in the field."
We now turn to error bounds for the output of interest. To begin, we note that the dual satisfies an "energy" error bound very similar to the primal result: for 1 ≤ N^du ≤ N^du_max,
    ‖ψ(µ) − ψ_{N^du}(µ)‖_X ≤ Δ^du_N(µ),  ∀µ ∈ D;

here Δ^du_N(µ) ≡ ε^du_N(µ)/β̃(µ), and ε^du_N(µ) = ‖−ℓ(·) − a(·, ψ_{N^du}(µ); µ)‖_{X′} is the dual norm of the dual residual. It then follows† that

    |s(µ) − s_N(µ)| ≤ Δ^s_N(µ),  ∀µ ∈ D,    (25)

where

    Δ^s_N(µ) ≡ ε_N(µ) Δ^du_N(µ).    (26)

It is critical to note that Δ^s_N(µ) = β̃(µ) Δ_N(µ) Δ^du_N(µ): the output error (and output error bound) vanishes as the product of the primal and dual errors (bounds), and hence much more rapidly than either the primal or dual error. From the perspective of computational efficiency, a good choice is ε_N(µ) ≈ ε^du_N(µ); the latter also (roughly) ensures that the bound (25), (26) will be quite sharp.
In the compliance case, a symmetric and ℓ = f, we immediately obtain Δ^du_N(µ) = Δ_N(µ), and hence (25) obtains for

    Δ^s_N(µ) ≡ ε²_N(µ)/β̃(µ),  ∀µ ∈ D (compliance);    (27)

here, we obtain the "square" effect even without (explicit) introduction of the dual problem. For a coercive operator, further improvements are possible [9].
The real challenge in a posteriori error estimation is not the presentation of these rather classical results, but rather the development of efficient computational approaches for the evaluation of the necessary constituents. In our particular deployed context, "efficient" translates to "online complexity independent of 𝒩," where 𝒩 denotes the dimension of the underlying truth finite element approximation space, and "necessary constituents" translates to "dual norm of the primal residual, ε_N(µ) ≡ ‖g(u_N(µ), ·; µ)‖_{X′}, dual norm of the dual residual, ε^du_N(µ) ≡ ‖−ℓ(·) − a(·, ψ_{N^du}(µ); µ)‖_{X′}, and lower bound for the inf–sup constant, β̃(µ)." We now turn to these issues.

* Note, however, that these error bounds are relative to our underlying "truth" approximation, u(µ) ∈ X, not to the exact solution, u^e(µ) ∈ X^e.
** The upper bound on the effectivity can be large. In many cases, this effectivity bound is in fact quite pessimistic; in many other cases, the effectivity (bound) may be improved by judicious choice of (multipoint) inner product (·,·)_X – in effect, a "bound conditioner" [21].
† The proof is simple: |s(µ) − s_N(µ)| = |ℓ(e(µ)) − g(u_N(µ), ψ_{N^du}(µ); µ)| = |−a(e(µ), ψ(µ); µ) − g(u_N(µ), ψ_{N^du}(µ); µ)| = |g(u_N(µ), ψ(µ) − ψ_{N^du}(µ); µ)| ≤ ε_N(µ) Δ^du_N(µ).
3.3. Offline–Online Computational Procedures
3.3.1. The dual norm of the residual

We consider only the primal residual; the dual residual admits a similar treatment. To begin, we note from standard duality arguments that

    ε_N(µ) ≡ ‖g(u_N(µ), ·; µ)‖_{X′} = ‖ê(µ)‖_X,    (28)

where ê(µ) ∈ X satisfies

    (ê(µ), v)_X = −g(u_N(µ), v; µ),  ∀v ∈ X.    (29)
We next observe from our reduced-basis representation (12) and affine assumption (7) that −g(u_N(µ), v; µ) may be expressed as

    −g(u_N(µ), v; µ) = f(v) − Σ_{q=1}^{Q} Σ_{n=1}^{N} Θ^q(µ) u_{Nn}(µ) a^q(ζ_n, v),  ∀v ∈ X.    (30)

It thus follows from (29) and (30) that ê(µ) ∈ X satisfies

    (ê(µ), v)_X = f(v) − Σ_{q=1}^{Q} Σ_{n=1}^{N} Θ^q(µ) u_{Nn}(µ) a^q(ζ_n, v),  ∀v ∈ X.    (31)

The critical observation [8, 9] is that the right-hand side of (31) is a sum of products of parameter-dependent functions and parameter-independent linear functionals. In particular, it follows from linear superposition that we may write ê(µ) ∈ X as

    ê(µ) = C + Σ_{q=1}^{Q} Σ_{n=1}^{N} Θ^q(µ) u_{Nn}(µ) L^q_n,

for C ∈ X satisfying (C, v)_X = f(v), ∀v ∈ X, and L^q_n ∈ X satisfying (L^q_n, v)_X = −a^q(ζ_n, v), ∀v ∈ X, 1 ≤ n ≤ N, 1 ≤ q ≤ Q; note from (17) that the latter are simple parameter-independent (scalar or vector) Poisson, or Poisson-like, problems. It thus follows that

    ‖ê(µ)‖²_X = (C, C)_X + Σ_{q=1}^{Q} Σ_{n=1}^{N} Θ^q(µ) u_{Nn}(µ) { 2(C, L^q_n)_X + Σ_{q′=1}^{Q} Σ_{n′=1}^{N} Θ^{q′}(µ) u_{Nn′}(µ) (L^q_n, L^{q′}_{n′})_X }.    (32)
The expression (32) – which we relate to the requisite dual norm of the residual through (28) – is the sum of products of parameter-dependent (simple, known) functions and parameter-independent inner products. The offline–online decomposition is now clear.
In the offline stage – performed once – we first solve for C and the L^q_n, 1 ≤ n ≤ N, 1 ≤ q ≤ Q; we then evaluate and save the relevant parameter-independent inner products (C, C)_X, (C, L^q_n)_X, (L^q_n, L^{q′}_{n′})_X, 1 ≤ n, n′ ≤ N, 1 ≤ q, q′ ≤ Q. Note that all quantities computed in the offline stage are independent of the parameter µ.
In the online stage – performed many times, for each new value of µ "in the field" – we simply evaluate the sum (32) in terms of the Θ^q(µ), u_{Nn}(µ) and the precalculated and stored (parameter-independent) (·,·)_X inner products. The operation count for the online stage is only O(Q²N²) – again, the essential point is that the online complexity is independent of 𝒩, the dimension of the underlying truth finite element approximation space. We further note that, unless Q is quite large, the online cost associated with the calculation of the dual norm of the residual is commensurate with the online cost associated with the calculation of s_N(µ).
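To make the two stages concrete, here is a minimal sketch of the decomposition in a generic matrix setting – an illustration under stated assumptions, not the authors' implementation. We assume `X` is the sparse SPD Gram matrix of (·,·)_X on the truth space, `F` the truth load vector, and `Aq[q]` a dense array whose nth column represents a^q(ζ_n, ·); all names are placeholders.

```python
import numpy as np
import scipy.sparse.linalg as spla

def offline_residual_data(X, F, Aq):
    """Offline stage: solve (C, v)_X = f(v) and (L^q_n, v)_X = -a^q(zeta_n, v)
    -- one sparse truth solve each -- then store all (.,.)_X inner products
    among C and the L^q_n.  X: sparse SPD Gram matrix; F: truth load vector;
    Aq[q][:, n]: truth vector representing a^q(zeta_n, .)."""
    solve = spla.factorized(X.tocsc())          # factor X once, reuse Q*N + 1 times
    C = solve(F)
    L = np.column_stack([solve(-Aq[q][:, n])
                         for q in range(len(Aq))
                         for n in range(Aq[q].shape[1])])
    XC, XL = X @ C, X @ L
    return C @ XC, L.T @ XC, L.T @ XL           # (C,C)_X, (L,C)_X, (L,L)_X

def online_residual_dual_norm(theta, uN, data):
    """Online stage: evaluate eps_N(mu) = ||e_hat(mu)||_X via (32) in
    O(Q^2 N^2) operations -- independent of the truth dimension."""
    cc, lc, ll = data
    w = np.outer(theta, uN).ravel()             # Theta^q(mu) u_Nn(mu), (q, n)-ordered
    return np.sqrt(max(cc + 2.0 * (w @ lc) + w @ (ll @ w), 0.0))  # guard round-off
```

The offline stage thus performs QN + 1 truth solves but stores only O(Q²N²) scalars; the online stage never touches an object of truth dimension.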
3.3.2. Lower bound for the inf–sup parameter

Obviously, from the definition (18), we may readily obtain by a variety of techniques effective upper bounds for β(µ); however, lower bounds are much more difficult to construct. We do note that in the case of symmetric coercive operators we can often determine β̃(µ) (≤ β(µ) = α_c(µ), ∀µ ∈ D) "by inspection." For example, if we verify Θ^q(µ) > 0, ∀µ ∈ D, and a^q(v, v) ≥ 0, ∀v ∈ X, 1 ≤ q ≤ Q, then we may choose [8, 21] for our coercivity lower bound

    β̃(µ) = [ min_{q∈{1,...,Q}} Θ^q(µ)/Θ^q(µ̄) ] α_c(µ̄),    (33)
for some µ̄ ∈ D. Unfortunately, these hypotheses are rather restrictive, and hence more complicated (and offline-expensive) recipes must often be pursued [17, 18].
We consider here a construction which is valid for general noncoercive operators (and thus also relevant in the nonlinear context [22]); for simplicity, we assume our problem remains well-posed over a convex parameter set that includes D. To begin, given µ̄ ∈ D and t = (t_(1), ..., t_(P)) ∈ R^P – note t_(j) is the value of the jth component of t – we introduce the bilinear form

    T(w, v; t; µ̄) = (T^µ̄ w, T^µ̄ v)_X + Σ_{p=1}^{P} t_(p) Σ_{q=1}^{Q} (∂Θ^q/∂µ_(p))(µ̄) [ a^q(w, T^µ̄ v) + a^q(v, T^µ̄ w) ]    (34)

and associated Rayleigh quotient

    F(t; µ̄) = min_{v∈X} T(v, v; t; µ̄) / ‖v‖²_X;    (35)

it is readily demonstrated that F(t; µ̄) is concave in t [24], and hence D^µ̄ ≡ {µ ∈ R^P | F(µ − µ̄; µ̄) ≥ 0} is perforce convex. We next introduce semi-norms |·|_q : X → R_{+,0} such that

    |a^q(w, v)| ≤ Γ^q |w|_q |v|_q,  ∀w, v ∈ X,  1 ≤ q ≤ Q,    (36)

    C_X = sup_{w∈X} ( Σ_{q=1}^{Q} |w|²_q ) / ‖w‖²_X,

for positive parameter-independent constants Γ^q, 1 ≤ q ≤ Q, and C_X; it is often the case that Θ^1(µ) = Constant, in which case the q = 1 contribution to the sums in (34) and (36) may be discarded. (Note that C_X is typically independent of Q, since the a^q are often associated with non-overlapping subdomains of Ω.) Finally, we define

    Φ(µ; µ̄) ≡ C_X max_{q∈{1,...,Q}} Γ^q | Θ^q(µ) − Θ^q(µ̄) − Σ_{p=1}^{P} (µ − µ̄)_(p) (∂Θ^q/∂µ_(p))(µ̄) |,    (37)

for µ ≡ (µ_(1), ..., µ_(P)) ∈ R^P.
We now introduce points µ̄_j and associated polytopes P^{µ̄_j} ⊂ D^{µ̄_j}, 1 ≤ j ≤ J, such that

    D ⊂ ∪_{j=1}^{J} P^{µ̄_j},    (38)

    min_{ν∈V^{µ̄_j}} F(ν − µ̄_j; µ̄_j) − max_{µ∈P^{µ̄_j}} Φ(µ; µ̄_j) ≥ ε_β β(µ̄_j),  1 ≤ j ≤ J.    (39)

Here V^{µ̄_j} is the set of vertices associated with the polytope P^{µ̄_j} – for example, P^{µ̄_j} may be a simplex with |V^{µ̄_j}| = P + 1 vertices; and ε_β ∈ ]0, 1[ is a prescribed accuracy constant. Our lower bound is then given by

    β̃(µ) = max_{j∈{1,...,J} | µ∈P^{µ̄_j}} ε_β β(µ̄_j).    (40)
In fact, β̃(µ) of (40) may not strictly honor our condition β̃(µ) ≥ ε_β β(µ); however, as the latter relates to accuracy, approximate satisfaction suffices.
(Recall that β̃(µ) appears in the denominator of our error bound; hence, even a relative inf–sup discrepancy of 80%, ε_β ≈ 1/5, is acceptable.) It can be easily demonstrated that β(µ) ≥ β̃(µ) ≥ ε_β β₀ > 0, ∀µ ∈ D, which thus ensures well-posed and rigorous error bounds.
We now turn to the offline–online decomposition. The offline stage comprises two parts: the generation of a set of points and polytopes–vertices, µ̄_j and P^{µ̄_j}, V^{µ̄_j}, 1 ≤ j ≤ J; and the verification that (38) (trivial) and (39) (nontrivial) are indeed satisfied. We focus on verification; generation – quite involved – is described in detail in [23]. To verify (39), the essential observation is that the expensive terms – "truth" eigenproblems associated with F, (35), and β, (18) – are limited to a finite set of vertices,

    J + Σ_{j=1}^{J} |V^{µ̄_j}|

in total; only for the extremely inexpensive – and typically algebraically very simple – Φ(µ; µ̄_j) terms must we consider minimization over the polytopes. The online stage (40) is very simple: a search/look-up table, with complexity logarithmic in J and polynomial in P.
We close by remarking on the properties of F(µ − µ̄; µ̄) that play an important role. First, F(µ − µ̄; µ̄) ≤ β²(µ), ∀µ ∈ D^µ̄ (say, for the case in which Θ^q(µ) = µ_(q), 1 ≤ q ≤ Q = P): this ensures the lower bound result. Second, F(t; µ̄) is concave in t (note that in general β(µ) is neither (quasi-)concave nor (quasi-)convex in µ [24]): this ensures a tractable offline computation. Third, F(µ − µ̄; µ̄) is "tangent"* to β²(µ) at µ = µ̄ – the cruder estimate Φ(µ; µ̄) is a second-order correction: this controls the growth of J (for example, relative to simpler continuity bounds [17]).

* To make this third property rigorous we must in general consider non-smooth analysis and also possibly a continuous spectrum as 𝒩 → ∞.
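As an illustration of the online stage (40), the following sketch assumes the offline stage has produced centers µ̄_j, polytopes (simplified here to axis-aligned boxes; the examples later in this chapter use hexahedrons and segments), and truth inf–sup values β(µ̄_j); all names are illustrative.

```python
import numpy as np

def beta_lower_bound(mu, boxes, beta_bar, eps_beta=0.2):
    """Online evaluation of the inf-sup lower bound (40).  boxes[j] =
    (lo_j, hi_j) represents the polytope P^{mu_bar_j} (taken here, for
    simplicity, as an axis-aligned box); beta_bar[j] = beta(mu_bar_j)
    was computed offline by a truth eigenproblem.  A linear scan is
    shown; a tree over the boxes yields the logarithmic-in-J lookup."""
    mu = np.asarray(mu, dtype=float)
    candidates = [eps_beta * beta_bar[j]
                  for j, (lo, hi) in enumerate(boxes)
                  if np.all(np.asarray(lo) <= mu) and np.all(mu <= np.asarray(hi))]
    if not candidates:
        raise ValueError("mu not covered by any polytope: covering (38) violated")
    return max(candidates)
```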
3.4. Sample Construction: A Greedy Algorithm

Our error estimation procedures also allow us to pursue more rational constructions of our parameter samples S_N, S^du_{N^du} (and hence spaces W_N, W^du_{N^du}) [18]. We consider here only the primal problem – in which our error criterion is ‖u(µ) − u_N(µ)‖_X ≡ ‖e(µ)‖_X ≤ ε_tol; similar approaches may be developed for the dual – ‖ψ(µ) − ψ_{N^du}(µ)‖_X ≤ ε^du_tol – and hence the output – |s(µ) − s_N(µ)| ≤ ε^s_tol. We denote the smallest primal error tolerance anticipated as ε_{tol,min} – this must be determined a priori offline; we then permit ε_tol ∈ [ε_{tol,min}, ∞[ to be specified online. We also introduce Ξ_F ∈ D^{n_F}, a very fine random sample over the parameter domain D of size n_F ≫ 1.
We first consider the offline stage. We assume that we are given a sample S_N, and hence space W_N and associated reduced-basis approximation (procedure to determine) u_N(µ), ∀µ ∈ D. We then calculate µ*_N = arg max_{µ∈Ξ_F} Δ_N(µ) – Δ_N(µ) is our "online" error bound (23) that, in the limit of n_F → ∞ queries, may be evaluated (on average) in O(N²Q²) operations; we next append µ*_N to S_N to form S_{N+1}, and hence W_{N+1}. We now continue this process until N = N_max such that Δ*_{N_max} ≤ ε_{tol,min}, where Δ*_N ≡ Δ_N(µ*_N), 1 ≤ N ≤ N_max. (A schematic sketch of this offline loop is given at the end of this section.)
In the online stage, given any desired ε_tol ∈ [ε_{tol,min}, ∞[ and any new value of µ ∈ D "in the field," we first choose N from a pre-tabulated array such that Δ*_N ≡ Δ_N(µ*_N) ≤ ε_tol. We next calculate u_N(µ) and Δ_N(µ), and then verify that – and if necessary, subsequently increase N such that – the condition Δ_N(µ) ≤ ε_tol is indeed satisfied. (We should not and do not rely on the finite sample Ξ_F for either rigor or sharpness.)
The crucial point is that Δ_N(µ) is an accurate and "online-inexpensive" – O(1) effectivity and 𝒩-independent asymptotic complexity – surrogate for the true (very-expensive-to-calculate) error ‖u(µ) − u_N(µ)‖_X. This surrogate permits us to (i) offline – here we exploit low average cost – perform a much more exhaustive (n_F ≫ 1) and, hence, meaningful search for the best samples S_N and, hence, most rapidly uniformly convergent spaces W_N,* and (ii) online – here we exploit low marginal cost – determine the smallest N, and hence the most efficient approximation, for which we rigorously achieve the desired accuracy.

* We may in fact view our offline sampling process as a (greedy, parameter space, "L^∞(D)") variant of the POD economization procedure [25] in which – thanks to Δ_N(µ) – we need never construct the "rejected" snapshots.
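The schematic sketch promised above follows; `truth_solve`, `rb_solve`, and `error_bound` are hypothetical stand-ins for the truth solver, the RB Galerkin projection, and the online bound Δ_N(µ) of (23).

```python
import numpy as np

def greedy_sample(Xi_F, mu_0, eps_tol_min, truth_solve, rb_solve, error_bound):
    """Schematic offline greedy loop of Section 3.4.  truth_solve(mu)
    returns the truth snapshot u(mu); rb_solve(W, mu) returns the RB
    approximation u_N(mu) in W_N = span(W); error_bound(uN, mu) returns
    the inexpensive online bound Delta_N(mu)."""
    S, W = [mu_0], [truth_solve(mu_0)]               # S_1 and W_1
    while True:
        # inexpensive exhaustive sweep over the fine sample Xi_F
        bounds = np.array([error_bound(rb_solve(W, mu), mu) for mu in Xi_F])
        j = int(np.argmax(bounds))                   # mu*_N = arg max Delta_N(mu)
        if bounds[j] <= eps_tol_min:                 # Delta*_Nmax <= eps_tol,min
            return S, W
        S.append(Xi_F[j])                            # S_{N+1} = S_N U {mu*_N}
        W.append(truth_solve(Xi_F[j]))               # one expensive truth solve
```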
4. A Linear Example: Helmholtz Elasticity

4.1. Problem Description

We consider a two-dimensional thin plate with a horizontal crack at the (say) interface of two laminae: the (original) domain Ω^o(b, L) ⊂ R², shown in Fig. 1, is defined as [0, 2] × [0, 1] \ C^o, where C^o ≡ {x_1 ∈ [b − L/2, b + L/2], x_2 = 1/2} defines the idealized crack. The left surface of the plate is secured; the top and bottom boundaries are stress-free; and the right boundary is subjected to a vertical oscillatory uniform traction at frequency ω. We model the plate as plane-stress linear isotropic elastic with (scaled) density unity, Young's modulus unity, and Poisson ratio 0.25; the latter determine the (parameter-independent) constitutive tensor E_{ijkl}. Our P = 3 input is µ ≡ (µ_(1), µ_(2), µ_(3)) ≡ (ω², b, L); our output is the (oscillatory) amplitude of the average vertical displacement on the right edge of the plate.
Figure 1. (Original) domain Ω^o(b, L) for the Helmholtz elasticity example.
The governing equation for the displacement u^o(x^o; µ) ∈ X^o(µ) is therefore a^o(u^o(µ), v; µ) = f^o(v), ∀v ∈ X^o(µ), where X^o(µ) is a quadratic finite element truth approximation subspace (of dimension 𝒩 = 14,662) of X^e(µ) ≡ {v ∈ (H^1(Ω^o(b, L)))² | v|_{x_1^o = 0} = 0}; here

    a^o(w, v; µ) ≡ ∫_{Ω^o(b,L)} w_{i,j} E_{ijkl} v_{k,l} − ω² w_i v_i

(v_{i,j} denotes ∂v_i/∂x_j, and repeated physical indices imply summation), and f^o(v) ≡ ∫_{x_1^o = 2} v_2. The crack surface is hence modeled extremely simplistically – as a stress-free boundary. The output s^o(µ) is given by s^o(µ) = ℓ^o(u^o(µ)), where ℓ^o(v) = f^o(v); we are thus "in compliance."
We now map Ω^o(b, L) via a continuous piecewise-affine transformation to a fixed domain Ω. This new problem can now be cast precisely in the desired abstract form, in which Ω, X, and (w, v)_X are independent of the parameter µ: as required, all parameter dependence now enters through the bilinear and linear forms; in particular, our affine assumption (7) applies for Q = 10. In the Appendix we summarize the Θ^q(µ), a^q(w, v), 1 ≤ q ≤ Q; the bound conditioner (·,·)_X; and the resulting continuity constants Γ^q and semi-norms |·|_q, 1 ≤ q ≤ Q, and norm equivalence parameter C_X.
The (undamped, nonradiating) Helmholtz equation exhibits resonances. Our techniques can treat near resonances, as well as large frequency ranges, quite well [18, 23]. For our illustrative purposes here, we choose the parameter domain D (⊂ R^{P=3}) ≡ (ω² ∈ [3.2, 4.8]) × (b ∈ [0.9, 1.1]) × (L ∈ [0.15, 0.25]); D contains no resonances – β(µ) ≥ β₀ > 0, ∀µ ∈ D – however, ω² = 3.2 and 4.8 are close to corresponding natural frequencies, and hence the problem is distinctly noncoercive.
4.2. Numerical Results

We first consider the inf–sup lower bound construction. We show in Fig. 2 β²(µ) and F(µ − µ̄; µ̄) for µ̄ = µ̄_1 = (4.0, 1.0, 0.2); for purposes of presentation we keep µ_(1) = ω² = 4.0 fixed and vary µ_(2) = b and µ_(3) = L. We observe
Figure 2. β²(µ) and F(µ − µ̄; µ̄) for µ̄ = (4, 1, 0.2) as a function of (b, L); ω² = 4.0.
that (in this particular case, even without Φ(µ; µ̄)) F(µ − µ̄; µ̄) is a lower bound for β²(µ); that F(µ − µ̄; µ̄) is concave; and that F(µ − µ̄; µ̄) is tangent to β²(µ) at µ = µ̄. Thanks to the latter, we can cover D (for ε_β = 0.2) such that (38) and (39) are satisfied with only J = 84 polytopes; in this particular case the P^{µ̄_j}, 1 ≤ j ≤ J, are hexahedrons such that |V^{µ̄_j}| = 8, 1 ≤ j ≤ J.
Armed with the inf–sup lower bound, we can now pursue the adaptive sampling strategy described in the previous section. We recall that our problem is compliant, and hence we need only consider the primal variable (and subsequently set ψ_{N^du}(µ) = −u_N(µ), N^du = N, and ε^du_N(µ) = ε_N(µ)). For ε_{tol,min} = 10⁻³ and n_F = 729 we obtain N_max = 32 such that Δ*_{N_max} ≡ Δ_{N_max}(µ*_{N_max}) = 9.03 × 10⁻⁴.
We present in Table 1 Δ_{N,max,rel}, η_{N,ave}, Δ^s_{N,max,rel}, and η^s_{N,ave} as a function of N. Here Δ_{N,max,rel} is the maximum over Ξ_Test of Δ_N(µ)/‖u_{N_max}‖_max, η_{N,ave} is the average over Ξ_Test of Δ_N(µ)/‖u(µ) − u_N(µ)‖_X, Δ^s_{N,max,rel} is the maximum over Ξ_Test of Δ^s_N(µ)/|s_{N_max}|_max, and η^s_{N,ave} is the average over Ξ_Test of Δ^s_N(µ)/|s(µ) − s_N(µ)|. Here Ξ_Test ∈ D^{343} is a random parameter sample of size 343; ‖u_{N_max}‖_max ≡ max_{µ∈Ξ_Test} ‖u_{N_max}(µ)‖_X = 2.0775 and |s_{N_max}|_max ≡ max_{µ∈Ξ_Test} |s_{N_max}(µ)| = 0.089966; and Δ_N(µ) and Δ^s_N(µ) are given by (23) and (27), respectively. We observe that the RB approximation – in particular, for the output – converges very rapidly, and that our rigorous error bounds are in fact quite sharp. The effectivities are not quite O(1), primarily due to the relatively crude inf–sup lower bound; but note that, thanks to the rapid convergence of the RB approximation, O(10) effectivities do not significantly affect efficiency – the induced increase in RB dimension N is quite modest.
Table 1. Numerical results for Helmholtz elasticity

    N     Δ_{N,max,rel}    η_{N,ave}    Δ^s_{N,max,rel}    η^s_{N,ave}
    12    1.54 × 10⁻¹      13.41        3.31 × 10⁻²        15.93
    16    3.40 × 10⁻²      12.24        2.13 × 10⁻³        14.86
    20    1.58 × 10⁻²      13.22        4.50 × 10⁻⁴        15.44
    24    5.91 × 10⁻³      12.56        4.81 × 10⁻⁵        14.45
    28    2.42 × 10⁻³      12.44        9.98 × 10⁻⁶        14.53
We turn now to computational effort. For (say) N = 24 and any given µ (say, (4.0, 1.0, 0.2)) – for which the error in the reduced-basis output s_N(µ) relative to the truth approximation s(µ) is certifiably less than Δ^s_N(µ) (= 4.94 × 10⁻⁷) – the Online Time (marginal cost) to compute both s_N(µ) and Δ^s_N(µ) is less than 0.0030 times the Total Time to directly calculate the truth result s(µ) = ℓ(u(µ)). The savings will be even larger for problems with more complex geometry and solution structure, in particular in three space dimensions. As desired, we achieve efficiency due to (i) our choice of sample, (ii) our rigorous stopping criterion Δ^s_N(µ), and (iii) our affine parameter dependence and associated offline–online computational procedures; and we achieve rigorous certainty – the reduced-basis predictions may serve in "deployed" decision processes with complete confidence (or at least with the same confidence as the underlying physical model and associated truth finite element approximation). The true merit of the approach is best illustrated in the deployed, real-time context of parameter identification (crack assessment) and adaptive mission optimization (load maximization); see Ref. [24] for an example.
5. A Nonlinear Example: Natural Convection
Obviously nonlinear equations do not admit the same degree of generality as linear equations. We thus present our approach to nonlinear equations for a particular quadratically nonlinear elliptic problem: the steady Boussinesq incompressible Navier–Stokes equations. This example permits us to identify the key new computational and theoretical ingredients; then, in Outlook, we contemplate more general (higher-order) nonlinearities.
5.1. Problem Description

We consider Prandtl number Pr = 0.7 Boussinesq natural convection in a square cavity (x_1, x_2) ∈ Ω ≡ [0, 1] × [0, 1]; the Pr = 0 limit is described in greater detail in [22, 26]. The governing equations for the velocity U = (U_1, U_2), pressure p, and temperature θ are the (coupled) incompressible steady Navier–Stokes and thermal convection–diffusion equations. Our single parameter
(P = 1) is the Grashof number, µ ≡ Gr, which is the ratio of the buoyancy forces (induced by the temperature field) to the momentum dissipation mechanisms; we consider Gr ∈ D ≡ [1.0, 1.0 × 10⁴]. This flow is a model problem for Bridgman growth of semiconductor crystals; future work shall address geometric (angle, aspect ratio) and Pr variation, and higher Gr – all of which are important in actual materials processing applications.
In terms of the general mathematical formulation, (5), u(µ) ≡ (U_1, U_2, p, θ, λ)(µ), where λ is a Lagrange multiplier associated with the pressure zero-mean condition. Our solution u(µ) resides in the space X ≡ X^U × X^p × X^θ × R, where X^U ⊂ (H^1_0(Ω))², X^p ⊂ L²(Ω) (respectively, X^θ ⊂ {v ∈ H^1(Ω) | v|_{x_1=0} = 0}) is a classical P2–P1 Taylor–Hood Stokes (respectively, P2 scalar) finite element approximation subspace [5]; X is of dimension 𝒩 = 2869. We associate to X the inner product

    (w, v)_X = ∫_Ω ( (∂W_i/∂x_j)(∂V_i/∂x_j) + W_i V_i + rq + (∂χ/∂x_i)(∂φ/∂x_i) + χφ + κα )

and norm ‖w‖_X = (w, w)_X^{1/2}, respectively, where w = (W_1, W_2, r, χ, κ) and v = (V_1, V_2, q, φ, α).
The strong (or distributional) form of the governing equations is then

    √Gr u_j ∂u_i/∂x_j = −√Gr ∂p/∂x_i + √Gr θ δ_{i2} + ∂²u_i/∂x_j∂x_j,  i = 1, 2,

    ∂u_i/∂x_i = λ,

    √Gr Pr u_j ∂θ/∂x_j = ∂²θ/∂x_j∂x_j,

with boundary–normalization conditions u|_∂Ω = 0 on the velocity, ∫_Ω p = 0 on the pressure, and ∂θ/∂n|_{Γ_1} = 1, θ|_{Γ_0} = 0, ∂θ/∂n|_{Γ_s} = 0 on the temperature; the flow is thus driven by the flux imposed on Γ_1. Here δ_{ij} is the Kronecker delta, ∂Ω is the boundary of Ω, and Γ_0 = {x_1 = 0, x_2 ∈ [0, 1]} (left side), Γ_1 = {x_1 = 1, x_2 ∈ [0, 1]} (right side), and Γ_s = {x_1 ∈ ]0, 1[, x_2 = 0} ∪ {x_1 ∈ ]0, 1[, x_2 = 1} (top and bottom). It is readily derived that λ = 0; however, we retain this term as a computationally convenient and stable fashion by which to impose the zero-mean pressure condition on the truth finite element solution.
Our output of interest is the average temperature over Γ_1: s(Gr) = ℓ(u(Gr)), where

    ℓ(v = (V_1, V_2, q, φ, α)) ≡ ∫_{Γ_1} φ;    (41)

note that s⁻¹(Gr) is the traditional "Nusselt number."
The weak form of our partial differential equations is then given by (5), where

    g(w, v; Gr) ≡ a₀(w, v; Gr) + ½ a₁(w, w, v; Gr) − f(v),    (42)

    a₀(w¹, v; Gr) ≡ ∫_Ω (∂W¹_i/∂x_j)(∂V_i/∂x_j) − √Gr ∫_Ω χ¹ V_2 − ∫_Ω r¹ ∂V_i/∂x_i − ∫_Ω q ∂W¹_i/∂x_i + κ¹ ∫_Ω q + α ∫_Ω r¹ + ∫_Ω (∂χ¹/∂x_i)(∂φ/∂x_i),    (43)

    a₁(w¹, w², v; Gr) ≡ √Gr [ −∫_Ω (W¹_j W²_i + W²_j W¹_i) ∂V_i/∂x_j + Pr ( ∫_Ω W²_j (∂χ¹/∂x_j) φ + ∫_Ω W¹_j (∂χ²/∂x_j) φ ) ],    (44)

    f(v) ≡ ∫_{Γ_1} φ;    (45)
here w¹ = (W¹_1, W¹_2, r¹, χ¹, κ¹), w² = (W²_1, W²_2, r², χ², κ²), and v = (V_1, V_2, q, φ, α). Note that, even though ℓ = f, we are not in "compliance" as g is not bilinear, symmetric; however, we are "close" to compliance, and thus might anticipate rapid output convergence.
We next observe that a₀(w¹, v; Gr) and a₁(w¹, w², v; Gr) satisfy (a nonlinear version of) our assumption of affine parameter dependence (7). In particular, we may write

    a₀(w¹, v; Gr) = Σ_{q=1}^{Q₀} Θ₀^q(Gr) a₀^q(w¹, v),    (46)

    a₁(w¹, w², v; Gr) = Σ_{q=1}^{Q₁} Θ₁^q(Gr) a₁^q(w¹, w², v),    (47)

for Q₀ = 2 and Q₁ = 1. In particular, Θ₀¹(Gr) = 1, Θ₀²(Gr) = √Gr, and Θ₁¹(Gr) = √Gr; the corresponding parameter-independent bilinear and trilinear forms should be clear from (43) and (44). We shall exploit (46) and (47) in our offline–online decomposition.
We define the derivative (about z ∈ X) bilinear form dg(·, ·; z; Gr) : X × X → R as

    dg(w, v; z; Gr) ≡ a₀(w, v; Gr) + a₁(w, z, v; Gr),
which clearly inherits the affine structure (46) and (47) of g; we note that, for our simple quadratic nonlinearity, g(z + w, v; Gr) = g(z, v; Gr) + dg(w, v; z; Gr) + ½ a₁(w, w, v; Gr). We then associate to dg(·, ·; z; Gr) our Babuška inf–sup and continuity "constants"

    β(z; Gr) ≡ inf_{w∈X} sup_{v∈X} dg(w, v; z; Gr)/(‖w‖_X ‖v‖_X),

    γ(z; Gr) ≡ sup_{w∈X} sup_{v∈X} dg(w, v; z; Gr)/(‖w‖_X ‖v‖_X),

respectively; these constants now depend on the state z about which we linearize. We shall confirm a posteriori that a solution to our problem does indeed exist for all Gr in the chosen D; we can further demonstrate [22] that the manifold {u(Gr) | Gr ∈ D} upon which we focus is a nonsingular (isolated) solution branch,* and thus β(u(Gr); Gr) ≥ β₀ > 0, ∀Gr ∈ D. We can also verify γ(z; Gr) ≤ 2√Gr (1 + ρ_U(ρ_U + Pr ρ_θ) ‖z‖_X), where

    ρ_U ≡ sup_{V∈X^U} ‖V‖_{L⁴(Ω)}/‖V‖_{X^U},   ρ_θ ≡ sup_{φ∈X^θ} ‖φ‖_{L⁴(Ω)}/‖φ‖_{H¹(Ω)}    (48)

are Sobolev embedding constants [27, 28]; for V ∈ X^U, ‖V‖_{L^n(Ω)} ≡ (∫_Ω (V_i V_i)^{n/2})^{1/n}, 1 ≤ n < ∞, (W, V)_{X^U} ≡ ∫_Ω (∂W_i/∂x_j)(∂V_i/∂x_j) + W_i V_i, and ‖V‖_{X^U} ≡ (V, V)_{X^U}^{1/2}.
We present in Fig. 3(a) a plot of s(Gr); as expected, for low Gr we obtain the conduction solution, s(Gr) = 1; at higher Gr, the larger buoyancy terms create more vigorous flows and hence more effective heat transfer. We show in Fig. 3(b) the velocity and temperature distribution at Gr = 10⁴; we observe the familiar "S"-shaped natural convection profile.
5.2. Reduced-Basis Approximation

For simplicity of exposition we shall not address here the adjoint in the nonlinear (approximation or error estimation) context [22], and we shall thus only consider RB treatment of the primal problem, (5) and (42). Our RB (Galerkin)
* We note that our truth approximation is div-stable in the sense that the "Brezzi" inf–sup parameter, β^Br, is bounded from below (independent of 𝒩):

    β^Br ≡ inf_{q∈X^p, q≠0} sup_{V∈X^U} ( ∫_Ω q ∂V_i/∂x_i ) / ( ‖V‖_{X^U} ‖q‖_{L²(Ω)} ) > 0;

this is a necessary condition for "Babuška" inf–sup stability of the linearized operator dg(·, ·; z; Gr).
Figure 3. (a) Inverse Nusselt number s(Gr) as a function of Gr; and (b) velocity and temperature field for Gr = 10⁴.
approximation is thus: for given Gr ∈ D, evaluate s_N(Gr) = ℓ(u_N(Gr)), where u_N(Gr) ≡ (U_N, p_N, θ_N, λ_N)(Gr) ∈ W_N ≡ W^U_N × W^p_N × W^θ_N × W^λ_N satisfies

    g(u_N(Gr), v; Gr) = 0,  ∀v ∈ W_N,

for ℓ and g defined in (41) and (42)–(45). There are two new ingredients: correct choice of W_N to ensure div-stability; and efficient offline–online treatment of the nonlinearity.
We first address W_N. To begin, we assume that N = 4m for m a positive integer, and we introduce a sequence of nested parameter samples S^pr_N ≡ {µ^pr_1 ∈ D, ..., µ^pr_{N/4} ∈ D} in terms of which we may then define the components of W_N. It is simplest to start with W^p_N ≡ span{p(µ^pr_n), 1 ≤ n ≤ N/4, and p̄}, where p̄ = 1 is the constant function; we then choose W^U_N ≡ span{U(µ^pr_n), S p(µ^pr_n), 1 ≤ n ≤ N/4}, where for q ∈ L²(Ω), the supremizer Sq ∈ X^U satisfies

    (Sq, V)_{X^U} = ∫_Ω q ∂V_i/∂x_i,  ∀V ∈ X^U

(a minimal matrix sketch of this supremizer solve is given below); we next define W^θ_N ≡ span{θ(µ^pr_n), 1 ≤ n ≤ N/4}; and, finally, W^λ_N ≡ R. Note that W^U_N must be chosen such that the RB approximation satisfies the Brezzi div-stability condition; for our problem, the domain Ω and hence the span of the supremizers do not depend on the parameter, and therefore the choice of W^U_N is simple – the more general case is addressed in [29]. We observe that dim(W^U_N) = N/2, dim(W^p_N) = N/4 + 1, dim(W^θ_N) = N/4, and dim(W^λ_N) = 1, and hence dim(W_N) = N + 2.*

* In fact, we can explicitly eliminate (the zero coefficient of) p̄ and λ (= 0) from our RB discrete equations, and thus the effective dimension of W_N is N. In the RB context, for which each member p(µ^pr_n) of W^p_N is explicitly zero-mean, the services of the Lagrange multiplier are no longer required.
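In discrete form, each supremizer is a single SPD solve against the velocity Gram matrix; a minimal sketch follows (the matrix names are assumptions for the illustration, not the chapter's notation).

```python
import numpy as np
import scipy.sparse.linalg as spla

def supremizer(Ku, B, q):
    """One supremizer S q per pressure snapshot: (S q, V)_{X^U} = int q dV_i/dx_i.
    Ku: sparse SPD velocity Gram matrix of (.,.)_{X^U}; B[m, k]: the
    pressure-velocity divergence matrix int psi_m d(phi_k)_i/dx_i;
    q: coefficient vector of the pressure snapshot p(mu_n^pr)."""
    return spla.spsolve(Ku.tocsc(), B.T @ q)

# W_N^U then collects velocity snapshots and supremizers,
#   W_N^U = span{ U(mu_n^pr), supremizer(Ku, B, p_n), 1 <= n <= N/4 },
# which secures the Brezzi div-stability of the RB velocity-pressure pair.
```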
For our nonlinear problem, the essential computational kernel is the inner Newton update: given a kth iterate u^k_N(Gr), the Newton increment δu^k_N(Gr) satisfies

    dg(δu^k_N(Gr), v; u^k_N(Gr); Gr) = −g(u^k_N(Gr), v; Gr),  ∀v ∈ W_N.

If we now expand u^k_N(Gr) = Σ_{n=1}^{N} u^k_{Nn}(Gr) ζ_n – where W_N = span{ζ_n, 1 ≤ n ≤ N} – and δu^k_N(Gr) = Σ_{j=1}^{N} δu^k_{Nj}(Gr) ζ_j, we obtain [17] the linear set of equations

    Σ_{j=1}^{N} { Σ_{q=1}^{Q₀} Θ₀^q(Gr) a₀^q(ζ_j, ζ_i) + Σ_{n=1}^{N} Σ_{q=1}^{Q₁} Θ₁^q(Gr) u^k_{Nn}(Gr) a₁^q(ζ_j, ζ_n, ζ_i) } δu^k_{Nj}(Gr) = −g(u^k_N(Gr), ζ_i; Gr),  1 ≤ i ≤ N,

where (from (42))

    −g(u^k_N(Gr), ζ_i; Gr) = f(ζ_i) − Σ_{j=1}^{N} { Σ_{q=1}^{Q₀} Θ₀^q(Gr) a₀^q(ζ_j, ζ_i) + ½ Σ_{n=1}^{N} Σ_{q=1}^{Q₁} u^k_{Nn}(Gr) Θ₁^q(Gr) a₁^q(ζ_j, ζ_n, ζ_i) } u^k_{Nj}(Gr)

is the residual for v = ζ_i. We can now directly apply the offline–online procedure [7–9] described earlier for linear problems, except now we must perform summations both "over the affine parameter dependence" and "over the reduced-basis coefficients" (of the current Newton iterate about which we linearize).* The operation count for the predominant Newton update component of the online stage is then – per Newton iteration – O(N³) to assemble the residual, −g(u^k_N(Gr), ζ_i; Gr), 1 ≤ i ≤ N, and O(N³) to assemble and invert the N × N Jacobian. The essential point is that the online complexity is independent of 𝒩, thanks to offline generation and storage of the requisite parameter-independent quantities (for example, a₁^q(ζ_j, ζ_n, ζ_i)).
For this particular nonlinear problem, there is relatively little additional cost associated with the nonlinearity. However, our success depends crucially on the low-order polynomial nature of our nonlinearity: in general, standard Galerkin procedures will yield N^{n+1} complexity for an nth-order (n ≥ 2) polynomial nonlinearity. Although symmetries can be invoked to modestly improve the scaling with N and n [18], in any event new approaches will be required for nonpolynomial nonlinearities; we discuss these new procedures for efficient treatment of general nonaffine and nonlinear operators in Outlook.

* In essence – we shall see this again in the error estimation context – our quadratic nonlinearity effectively introduces N additional "parameter-dependent functions" and "parameter-independent forms" associated with the coefficients of our field-variable expansion and our trilinear form, respectively; however, these new parameter contributions are correlated in ways that we can gainfully exploit.
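For the quadratic case at hand, the online Newton kernel described above admits a compact dense-algebra sketch; the arrays below are assumed to have been generated and stored offline, and the implementation is illustrative rather than the authors' code.

```python
import numpy as np

def rb_newton(theta0, theta1, A0q, A1q, f, u0, tol=1e-12, max_it=25):
    """Online Newton kernel for the quadratically nonlinear RB system.
    Offline-stored arrays (illustrative names): A0q[q] is N x N with
    entries a_0^q(zeta_j, zeta_i); A1q[q] is N x N x N with entries
    a_1^q(zeta_j, zeta_n, zeta_i); f holds f(zeta_i).  Per-iteration
    cost is O(N^3), independent of the truth dimension."""
    A0 = sum(t * A for t, A in zip(theta0, A0q))     # sum_q Theta_0^q a_0^q
    A1 = sum(t * T for t, T in zip(theta1, A1q))     # sum_q Theta_1^q a_1^q
    u = np.array(u0, dtype=float)
    for _ in range(max_it):
        Cu = A1 @ u                  # Cu[i, j] = sum_n A1[i, j, n] u_n
        res = f - A0 @ u - 0.5 * (Cu @ u)            # -g(u_N^k, zeta_i; Gr)
        du = np.linalg.solve(A0 + Cu, res)           # Jacobian: dg = a_0 + a_1(., u, .)
        u += du
        if np.linalg.norm(du) <= tol * max(np.linalg.norm(u), 1.0):
            return u
    raise RuntimeError("RB Newton iteration did not converge")
```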
5.3. A Posteriori Error Estimation

The motivation for rigorous a posteriori error estimation is even more self-evident in the case of nonlinear problems. Fortunately, there is a rich mathematical foundation upon which to build the necessary computational structure. We first introduce the former; we then describe the latter. For simplicity, we develop here error bounds only for the primal energy norm, ‖u(µ) − u_N(µ)‖_X; we can also develop error bounds for the output – however, good effectivities will require consideration of the dual [22].
5.3.1. Error bounds

We require some slight modifications to our earlier (linear) preliminaries. In particular, we introduce T^µ_N : X → X such that, for any w ∈ X, (T^µ_N w, v)_X = dg(w, v; u_N(µ); µ), ∀v ∈ X; we then define σ_N(w; µ) ≡ ‖T^µ_N w‖_X/‖w‖_X. Our inf–sup and continuity constants – now linearized about the reduced-basis solution – can then be expressed as β_N(µ) ≡ β(u_N(µ); µ) = inf_{w∈X} σ_N(w; µ) and γ_N(µ) ≡ γ(u_N(µ); µ) = sup_{w∈X} σ_N(w; µ), respectively; as before, we shall need a nonnegative lower bound for the inf–sup parameter, β̃_N(µ), such that β_N(µ) ≥ β̃_N(µ) ≥ 0, ∀µ ∈ D.
As in the linear case, the dual norm of the residual, ε_N(µ) of (22), shall play a central role; the (negative of the) residual for our current nonlinear problem is given by (42) for w = u_N(µ). We also introduce a new combination of parameters, τ_N(µ) ≡ 2ρ(µ)ε_N(µ)/β̃²_N(µ), where ρ(µ) = 2√Gr ρ_U(ρ_U + Pr ρ_θ) depends on the Sobolev embedding constants ρ_U and ρ_θ of (48); in essence, τ_N(µ) is an appropriately "nondimensionalized" measure of the residual. Finally, we define N*(µ) such that τ_N(µ) < 1 for N ≥ N*(µ); we require N*(µ) ≤ N_max, ∀µ ∈ D. (The latter is a condition on N_max that reflects both the convergence rate of the RB approximation and the quality of our inf–sup lower bound.) We recall that µ ≡ Gr ∈ D ≡ [1.0, 1.0 × 10⁴]. Our error bound is then expressed, for any µ ∈ D and N ≥ N*(µ), as

    Δ_N(µ) = (β̃_N(µ)/ρ(µ)) [ 1 − (1 − τ_N(µ))^{1/2} ].    (49)

The main result can be very simply stated: if N ≥ N*(µ), there exists a unique solution u(µ) to (5) in the open ball

    B(u_N(µ), β̃_N(µ)/ρ(µ)) ≡ { z ∈ X | ‖z − u_N(µ)‖_X < β̃_N(µ)/ρ(µ) };    (50)
furthermore,

    ‖u(µ) − u_N(µ)‖_X ≤ Δ_N(µ).    (51)

The proof, given in Ref. [22], is a slight specialization of a general abstract result [30, 31] that in turn derives from the Brezzi–Rappaz–Raviart (BRR) framework for the analysis of variational approximations of nonlinear partial differential equations [32]; the central ingredient is the construction of an appropriate contraction mapping which then forms the foundation for a standard fixed-point argument. On the basis of the main proposition (50) and (51) we can further prove several important corollaries related to the well-posedness of the truth approximation (5), and – similar to the linear result (24) – the effectivity of our error bound (49) [22]. We note that, as ε_N(µ) → 0, we shall certainly satisfy N ≥ N*(µ); furthermore the upper bound to the true error, Δ_N(µ) of (49), is asymptotic to ε_N(µ)/β̃_N(µ). We may derive these limits directly and rigorously from (49) and (51), or more heuristically from the equation for the error e(µ) ≡ u(µ) − u_N(µ),

    dg(e(µ), v; u_N(µ); µ) = −g(u_N(µ), v; µ) − ½ a₁(e(µ), e(µ), v; µ).    (52)

We conclude that the nonlinear case shares much in common with the limiting linear case. However, there are also important differences: even for τ_N(µ) < 1, we must (in general) admit the possibility of other solutions to (5) – solutions outside B(u_N(µ), β̃_N(µ)/ρ(µ)) – that are not near u_N(µ); and for τ_N(µ) ≥ 1, we cannot even be assured that there is indeed any solution u(µ) near u_N(µ). This conclusion is not surprising: for "noncoercive" nonlinear problems the error equation (52) may in general admit no or several solutions; we can only be certain that a small (isolated) solution exists, (50) and (51), if the residual is sufficiently small. The theory informs us that the appropriate measure of the residual is τ_N(µ), which reflects both the stability of the operator (β̃_N(µ)) and the strength of the nonlinearity (ρ(µ)).
As in the linear case, the real computational challenge is the development of efficient procedures for the calculation of the necessary a posteriori quantities:* the dual norm of the residual, ε_N(µ); the inf–sup lower bound, β̃_N(µ); and – new to our nonlinear problem – the Sobolev constants, ρ_U and ρ_θ. We now turn to these considerations.
* Typically, the BRR framework provides a nonquantitative a priori or a posteriori justification of asymptotic convergence. In our context, there is a unique opportunity to render the BRR theory completely predictive: actual a posteriori error estimators that are quantitative, rigorous, sharp, and (online) inexpensive.
5.3.2. Offline–online computational procedures

The dual norm of the residual. Fortunately, the duality relation of the linear case, (29), still applies – g(w, v; µ) of (42) is nonlinear in w, but of course linear in v. For our nonlinear problem, the negative of the residual, (42), for w = u_N(µ), may be expressed in terms of the reduced-basis expansion (12) as

    −g(u_N(µ), v; µ) = f(v) − Σ_{n=1}^{N} u_{Nn}(µ) { Σ_{q=1}^{Q₀} Θ₀^q(µ) a₀^q(ζ_n, v) + ½ Σ_{q=1}^{Q₁} Σ_{n′=1}^{N} Θ₁^q(µ) u_{Nn′}(µ) a₁^q(ζ_n, ζ_{n′}, v) },    (53)

where we recall that µ ≡ Gr. If we insert (53) in (29) and apply linear superposition, we obtain

    ê(µ) = C + Σ_{n=1}^{N} u_{Nn}(µ) { Σ_{q=1}^{Q₀} Θ₀^q(µ) L^q_n + Σ_{q=1}^{Q₁} Σ_{n′=1}^{N} Θ₁^q(µ) u_{Nn′}(µ) Q^q_{nn′} },

where C ∈ X satisfies (C, v)_X = f(v), ∀v ∈ X; L^q_n ∈ X satisfies (L^q_n, v)_X = −a₀^q(ζ_n, v), ∀v ∈ X, 1 ≤ n ≤ N, 1 ≤ q ≤ Q₀; and Q^q_{nn′} ∈ X satisfies (Q^q_{nn′}, v)_X = −a₁^q(ζ_n, ζ_{n′}, v)/2, ∀v ∈ X, 1 ≤ n, n′ ≤ N, 1 ≤ q ≤ Q₁; the latter are again simple (vector) Poisson problems. It thus follows that [22]
    ‖ê(µ)‖²_X = (C, C)_X + 2 Σ_{q=1}^{Q₀} Σ_{n=1}^{N} Θ₀^q(µ) u_{Nn}(µ) (C, L^q_n)_X
        + 2 Σ_{q=1}^{Q₁} Σ_{n=1}^{N} Σ_{n′=1}^{N} Θ₁^q(µ) u_{Nn}(µ) u_{Nn′}(µ) (C, Q^q_{nn′})_X
        + Σ_{q=1}^{Q₀} Σ_{q′=1}^{Q₀} Σ_{n=1}^{N} Σ_{n′=1}^{N} Θ₀^q(µ) Θ₀^{q′}(µ) u_{Nn}(µ) u_{Nn′}(µ) (L^q_n, L^{q′}_{n′})_X
        + 2 Σ_{q=1}^{Q₀} Σ_{q′=1}^{Q₁} Σ_{n=1}^{N} Σ_{n′=1}^{N} Σ_{n″=1}^{N} Θ₀^q(µ) Θ₁^{q′}(µ) u_{Nn}(µ) u_{Nn′}(µ) u_{Nn″}(µ) (L^q_n, Q^{q′}_{n′n″})_X
        + Σ_{q=1}^{Q₁} Σ_{q′=1}^{Q₁} Σ_{n=1}^{N} Σ_{n′=1}^{N} Σ_{n″=1}^{N} Σ_{n‴=1}^{N} Θ₁^q(µ) Θ₁^{q′}(µ) u_{Nn}(µ) u_{Nn′}(µ) u_{Nn″}(µ) u_{Nn‴}(µ) (Q^q_{nn′}, Q^{q′}_{n″n‴})_X,
from which we can directly calculate the requisite dual norm of the residual through (28). We can now readily adapt the offline–online procedure developed in the linear case; however, our summation "over the affine dependence" now involves a double summation "over the reduced-basis coefficients." The operation count for the online stage is thus (to leading order) O(Q₁²N⁴); the essential point is that
the online complexity is again independent of 𝒩 – thanks to offline generation and storage of the requisite parameter-independent inner products (for example, (Q^q_{nn′}, Q^{q′}_{n″n‴})_X, 1 ≤ n, n′, n″, n‴ ≤ N, 1 ≤ q, q′ ≤ Q₁). Although the N⁴ online scaling is certainly less than pleasant, the error bound is calculated only once – at the termination of the Newton iteration – and hence in actual practice the additional online cost attributable to the residual dual-norm computation is in fact not too large. However, the quartic scaling with N is again a memento mori that, for higher-order (than quadratic) nonlinearities, standard Galerkin procedures are not viable; we discuss the alternatives further in Outlook.

Lower bound for the inf–sup parameter. Our procedure for the linear case can be readily adapted: we need "only" incorporate the N additional parameter-dependent "coefficient functions" – in fact, the RB coefficients – that appear in the linearized-about-u_N(µ) derivative operator. Hence, for our nonlinear problem, the bilinear form T of (34) and Rayleigh quotient F of (35) now contain sensitivity derivatives of these additional "coefficient functions"; furthermore, the Φ(µ; µ̄) function of (37) – our second-order remainder term – now includes the deviation of the RB coefficients from linear parameter dependence. Further details are provided in Ref. [22] (for Pr = 0) for the case in which W_N ≡ W^U_N is divergence-free.

Sobolev continuity constant. We present here the procedure for calculation of ρ_U; the procedure for ρ_θ is similar. We first note [27, 28] that ρ_U = (1/δ̂_min)^{1/2}, where (δ̂, ξ̂) ∈ (R₊, X^U) satisfies
ξˆ j ξˆ j ξˆi Vi ,
∀V ∈ X U ,
ξˆ 4L 4 () = 1,
and (δˆmin , ξˆmin ) denotes the ground state. To solve this eigenproblem, and in particular to ensure that we realize the ground state, we pursue a homotopy procedure. Towards that end, we introduce a parameter h ∈ [0, 1] (and associated small increment h) and look for (δ(h), ξ(h)) ∈ (R+ , X U ) that satisfies
(ξ(h), V ) X U = δ(h) h
ξ j (h)ξ j (h)ξi (h)Vi
+ (1 − h)
ξi (h)Vi , ∀V ∈ X U ,
h ξ 4L 4 () + (1 − h) ξ 2L 2 () = 1;
(54)
(δmin (h), ξmin (h)) denotes the ground state. We observe that (δmin (1), ξmin (1))= (δˆmin , ξˆmin ); and that (δmin (0), ξmin (0)) is the lowest eigenpair of the standard
(vector) Laplacian "linear" eigenproblem. Our homotopy procedure is simple: we first set h_old = 0 and find (δ_min(0), ξ_min(0)) by standard techniques; then, until h_new = 1, we set h_new ← h_old + Δh, solve (54) for (δ_min(h_new), ξ_min(h_new)) by Newton iteration initialized to (δ_min(h_old), ξ_min(h_old)), and update h_old ← h_new. For our domain, we find (offline) ρ_U = 0.6008 and ρ_θ = 0.2788; since ρ_U and ρ_θ are parameter-independent, no online computation is required.
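To illustrate the homotopy, the following toy sketch computes the analogous best constant sup ‖v‖_{L⁴}/‖v‖_{H¹} for a scalar field on a 1-D grid; a damped fixed-point (inverse-iteration-style) update stands in for the Newton solve of the text, and the discretization, solver, and parameters are purely illustrative assumptions.

```python
import numpy as np

def sobolev_constant_1d(n=201, dh=0.05, inner_it=500, tol=1e-10):
    """Toy 1-D analog of the homotopy (54): estimate the best constant
    rho = sup ||v||_{L4} / ||v||_{H1} on (0, 1), discretized by finite
    differences with trapezoidal quadrature (illustrative only)."""
    x = np.linspace(0.0, 1.0, n)
    hx = x[1] - x[0]
    w = np.full(n, hx); w[[0, -1]] *= 0.5            # trapezoid weights
    main = np.full(n, 2.0); main[[0, -1]] = 1.0
    A = (np.diag(main) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / hx         # stiffness: int v' v'
    K = A + np.diag(w)                               # H^1 Gram matrix
    xi = np.ones(n)                                  # initial guess at h = 0
    h = 0.0
    while True:
        for _ in range(inner_it):                    # fixed point at this h
            g = h * w * xi**3 + (1.0 - h) * w * xi   # RHS functional of (54)
            y = np.linalg.solve(K, g)
            a = h * (w @ y**4)                       # quartic normalization part
            b = (1.0 - h) * (w @ y**2)               # quadratic normalization part
            c2 = 1.0 / b if a == 0.0 else (-b + np.sqrt(b * b + 4.0 * a)) / (2.0 * a)
            xi_new = np.sqrt(c2) * y                 # enforce h||.||^4 + (1-h)||.||^2 = 1
            step = xi_new - xi
            xi = xi + 0.5 * step                     # damped update
            if np.linalg.norm(step) < tol * np.linalg.norm(xi):
                break
        if h >= 1.0:
            break
        h = min(1.0, h + dh)                         # continue in the homotopy parameter
    return 1.0 / np.sqrt(xi @ K @ xi)                # rho = (1 / delta_min(1))^(1/2)

print(sobolev_constant_1d())                         # offline, parameter-independent
```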
5.3.3. Sample construction

The greedy algorithm developed in the linear case requires some modification in the nonlinear context. The first issue is that, to evaluate our error bound Δ_N(µ), we must appeal to our inf–sup lower bound; however, in the nonlinear case, this inf–sup lower bound, β̃_N(µ), is defined with respect to the linearized state u_{N_max}(µ) [22]. In short, to determine the "next" sample point µ_{N+1} we must already know S_{N_max} – and hence µ_{N+1}. To avoid this circular reference during the offline sample generation process, we replace our inf–sup lower bound with a crude (for example, piecewise constant over D) approximation to β(u(µ)); once the samples are constructed, we revert to our rigorous (and now calculable) lower bound, β̃_N(µ). The second issue is that, in the nonlinear context, our error bound is not operative until τ_N(µ) < 1; hence, the greedy procedure must first select on arg max_{µ∈Ξ_F} τ_N(µ) – until τ_N(µ) < 1 over D – and only subsequently select on arg max_{µ∈Ξ_F} Δ_N(µ) [Prud'homme, private communication]. The resulting sample will ensure not only rapid convergence to the exact solution, but also rapid convergence to a certifiably accurate solution.
5.4. Numerical Results
We present in Table 2 ‖u(µ̃_N) − u_N(µ̃_N)‖_X/‖u(µ̃_N)‖_X, Δ_{N,rel}(µ̃_N) ≡ Δ_N(µ̃_N)/‖u_N(µ̃_N)‖_X, and η_N(µ̃_N) ≡ Δ_N(µ̃_N)/‖e(µ̃_N)‖_X for 8 ≤ N ≤ N_max = 40; here

    µ̃_N ≡ arg max_{µ∈Ξ_Test} ‖u(µ) − u_N(µ)‖_X / ‖u(µ)‖_X

and Ξ_Test is a random parameter grid of size n_Test = 500. We observe very rapid convergence of u_N(µ) to u(µ) over D (more precisely, Ξ_Test) – our samples S_N are optimally constructed to provide uniform convergence. The output error decreases even more rapidly: max_{µ∈Ξ_Test} |s(µ) − s_N(µ)|/s(µ) = 1.34 × 10⁻¹, 2.80 × 10⁻⁴, and 9.79 × 10⁻⁷ for N = 8, 16, and 24, respectively; this "superconvergence" is a vestige of near compliance.
Table 2. Convergence and effectivity results for the natural convection problem; the "*" signifies that N*(µ̃_N) > N, which in turn indicates that τ_N(µ̃_N) ≥ 1

    N     ‖u(µ̃_N) − u_N(µ̃_N)‖_X/‖u(µ̃_N)‖_X    Δ_{N,rel}(µ̃_N)    η_N(µ̃_N)
    8     3.28 × 10⁻¹                             *                  *
    16    1.45 × 10⁻²                             *                  *
    24    1.80 × 10⁻⁴                             7.47 × 10⁻⁴        4.15
    32    8.05 × 10⁻⁷                             7.60 × 10⁻⁶        9.44
    40    4.60 × 10⁻⁸                             8.69 × 10⁻⁷        18.93
As regards a posteriori error estimation, we observe that N*(µ̃_N) = 24 is relatively small – we can (respectively, cannot) provide a definitive error bound for N ≥ 24 (respectively, N < 24); more generally, we find that N*(µ) ≤ 24, ∀µ ∈ D. We note that the effectivities are quite good* – in fact, considerably better than the worst-case predictions of our effectivity corollary. (The higher effectivity at N = 40 is undoubtedly due to round-off in the online summation.)
The results of Table 2 are based on an inf–sup lower bound construction with J = 28 elements: points µ̄_j and polytopes (here segments) P^{µ̄_j}, 1 ≤ j ≤ J. The accuracy of the resulting lower bound is reflected in the modest N*(µ) and the good effectivities reported in Table 2. Most of the points µ̄_j are clustered at larger Gr, as might be expected.
Finally, we note that the total online computational time on a Pentium M 1.6 GHz processor to predict u_N(Gr), s_N(Gr), and Δ_N(Gr) to a relative accuracy (in the energy norm) of 10⁻³ is – ∀Gr ∈ D – 300 ms; this should be compared to 50 s for direct finite element calculation of the truth solution, u(Gr), s(Gr). We achieve computational savings of O(100): N is very small thanks to (i) the good convergence properties of S_N and hence W_N, and (ii) the rigorous and sharp stopping criterion provided by Δ_N(Gr); and the marginal computational complexity to evaluate s_N(Gr) and Δ_N(Gr) depends only on N and not on 𝒩 – thanks to the offline–online decomposition. The computational savings will be even more significant for more complex problems, particularly in three spatial dimensions; it is critical to recall that we realize these savings without compromising rigorous certainty.†

* It is perhaps surprising that the BRR theory – not really designed for quantitative service – yields such sharp results. However, it is important to note that, as ε_N(µ) → 0, Δ_N(µ) ~ ε_N(µ)/β̃_N(µ), and thus the more pessimistic bounds (in particular ρ) are absent – except in τ_N(µ).
† We admit that the extension of our results to much larger Gr is not without difficulty. The more complex flow structures and the stronger nonlinearity will degrade the convergence rate and a posteriori error bounds – and increase N and J; and (inevitable) limit points and bifurcations will require special precautions.
6. Outlook
We address here some of the more obvious questions that arise in reviewing the current state of affairs.
As a first question: How many parameters P can we consider – for P how large are our techniques still viable? It is undeniably the case that ultimately we should anticipate exponential scaling (of both N and certainly J) as P increases, with a concomitant unacceptable increase certainly in offline but also perhaps in online computational effort. Fortunately, for smaller P, the growth in N is rather modest, as (good) sampling procedures will automatically identify the more interesting regions of parameter space. Unfortunately, the growth in J is more problematic: we shall require more efficient construction and verification procedures for our inf–sup lower bound samples. In any event, treatment of hundreds (or even many tens) of truly independent parameters by the global methods described in this chapter is clearly not practicable; in such cases, more local approaches must be pursued.*
A second question: How can we efficiently treat problems with nonaffine parameter dependence and (more than quadratic) state-space nonlinearity? Both these issues are satisfactorily addressed by a new "empirical interpolation" approach [33]. In this approach, we replace a general nonaffine nonlinear function of the parameter µ, spatial coordinate x, and field variable u(x; µ), H(u; x; µ), by a collateral RB expansion: in particular, we approximate H(u_N(x; µ); x; µ) – as required in our RB projection for u_N(µ) – by H_M(x; µ) = Σ_{m=1}^{M} d_m(µ) ξ_m(x). The critical ingredients of the approach are (i) a "good" collateral RB sample, S^H_M = {µ^H_1, ..., µ^H_M}, and approximation space, span{ξ_m = H(u(µ^H_m); x; µ^H_m), 1 ≤ m ≤ M}, (ii) a stable and inexpensive interpolation procedure by which to determine (online) the d_m(µ), 1 ≤ m ≤ M (a minimal sketch of this online step is given at the end of this section), and (iii) effective a posteriori error bounds with which to quantify the effect of the newly introduced truncation. It is perhaps only in the latter that the technique is somewhat disappointing: the error estimators – though quite sharp and very efficient – are completely (provably) rigorous upper bounds only in certain restricted situations.
Finally, a third question, again related to generality: What class of PDEs can be treated? In addition to the elliptic equations discussed in this paper, parabolic equations can also be addressed satisfactorily from both the approximation and error estimation points of view [24, 34, 35]:† much of the elliptic technology directly applies, except that time now appears as an additional parameter; this parabolic framework can be viewed as an extension of
time-domain model reduction procedures [19, 25, 36]. Unfortunately, treatment of hyperbolic problems does not look promising: although RB methods can perform quite well anecdotally, in general the underlying smoothness (in parameter µ) and stability will no longer obtain; as a result, both the approximation properties and error estimators will suffer.
We close by noting that the offline aspects of the approaches described are both complicated and computationally expensive. The former can be at least partially addressed by appropriate software and architectures [37]; however, the latter will in any event remain. It follows that these techniques will really only be viable in situations in which there is truly an imperative for real-time certified response: a real premium on (i) greatly reduced marginal cost (or asymptotic average cost), and (ii) rigorous characterization of certainty; or equivalently, a very high (opportunity) cost associated with (i) slow response – long latency times – and (ii) incorrect (or unsafe) decisions or actions. There are many classes of materials and materials processing problems and contexts for which the methods are appropriate; and certainly there are many classes of materials and materials processing problems and contexts for which more classical techniques remain distinctly preferred.

* We do note that at least some problems with ostensibly many parameters in fact involve highly coupled or correlated parameters: certain classes of shape optimization certainly fall into this category. In these situations, global progress can be made.
† To date we have experience with only stable parabolic systems such as the heat equation; unstable systems present considerable difficulty, in particular if long-time solutions are desired.
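As promised above, here is a minimal sketch of the online interpolation step of the empirical interpolation approach; the function and variable names are assumptions for the illustration. By construction of the offline greedy of [33], the matrix of basis functions evaluated at the interpolation ("magic") points is lower triangular with unit diagonal, so the d_m(µ) follow from an O(M²) triangular solve.

```python
import numpy as np
from scipy.linalg import solve_triangular

def eim_online_coefficients(Bmat, g_at_points):
    """Online step of empirical interpolation: solve for d_m(mu) in
    H_M(x; mu) = sum_m d_m(mu) xi_m(x) by matching H at the M offline-
    selected interpolation points t_k.  Bmat[k, m] = xi_m(t_k) is lower
    triangular with unit diagonal by construction of the offline greedy;
    g_at_points[k] = H(u_N(t_k; mu); t_k; mu)."""
    return solve_triangular(Bmat, g_at_points, lower=True, unit_diagonal=True)
```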
Appendix A. Helmholtz Elasticity Example

We first define a reference domain corresponding to the geometry b = b_r = 1 and L = L_r = 0.2. We then map Ω^o(b, L) → Ω ≡ Ω^o(b_r, L_r) by a continuous piecewise-affine (in fact, piecewise-dilation-in-x_1) transformation. We define three subdomains, Ω_1 ≡ ]0, b_r − L_r/2[ × ]0, 1[, Ω_2 ≡ ]b_r − L_r/2, b_r + L_r/2[ × ]0, 1[, and Ω_3 ≡ ]b_r + L_r/2, 2[ × ]0, 1[, such that Ω̄ = Ω̄_1 ∪ Ω̄_2 ∪ Ω̄_3.
We may then express the resulting bilinear form a(w, v; µ) as an affine sum (7) for Q = 10; the particular Θ^q(µ), a^q(w, v), 1 ≤ q ≤ 10, are shown in Table 3. (Recall that w = (w_1, w_2) and v = (v_1, v_2).) The constitutive constants in Table 3 are given by

    c_11 = 1/(1 − ν²),   c_22 = c_11,   c_12 = ν/(1 − ν²),   c_66 = 1/(2(1 + ν)),

where ν = 0.25 is the Poisson ratio (and the normalized Young's modulus is unity); recall that we consider plane stress and a linear isotropic solid. We now define our inner product-cum-bound conditioner as

    (w, v)_X ≡ ∫_Ω [ c_11 (∂v_1/∂x_1)(∂w_1/∂x_1) + c_22 (∂v_2/∂x_2)(∂w_2/∂x_2) + c_66 (∂v_2/∂x_1)(∂w_2/∂x_1) + c_66 (∂v_1/∂x_2)(∂w_1/∂x_2) + w_1 v_1 + w_2 v_2 ] = Σ_{q=2}^{Q} a^q(w, v);
Table 3. Parametric functions Θ^q(µ) and parameter-independent bilinear forms a^q(w, v) for the two-dimensional crack problem

    q = 1:   Θ¹(µ) = 1
             a¹(w, v) = ∫_Ω c_12 [ (∂v_1/∂x_1)(∂w_2/∂x_2) + (∂v_2/∂x_2)(∂w_1/∂x_1) ]
                        + c_66 [ (∂v_1/∂x_2)(∂w_2/∂x_1) + (∂v_2/∂x_1)(∂w_1/∂x_2) ]

    q = 2:   Θ²(µ) = (b_r − L_r/2)/(b − L/2)
             a²(w, v) = ∫_{Ω_1} c_11 (∂v_1/∂x_1)(∂w_1/∂x_1) + c_66 (∂v_2/∂x_1)(∂w_2/∂x_1)

    q = 3:   Θ³(µ) = L_r/L
             a³(w, v) = ∫_{Ω_2} c_11 (∂v_1/∂x_1)(∂w_1/∂x_1) + c_66 (∂v_2/∂x_1)(∂w_2/∂x_1)

    q = 4:   Θ⁴(µ) = (2 − b_r − L_r/2)/(2 − b − L/2)
             a⁴(w, v) = ∫_{Ω_3} c_11 (∂v_1/∂x_1)(∂w_1/∂x_1) + c_66 (∂v_2/∂x_1)(∂w_2/∂x_1)

    q = 5:   Θ⁵(µ) = (b − L/2)/(b_r − L_r/2)
             a⁵(w, v) = ∫_{Ω_1} c_22 (∂v_2/∂x_2)(∂w_2/∂x_2) + c_66 (∂v_1/∂x_2)(∂w_1/∂x_2)

    q = 6:   Θ⁶(µ) = L/L_r
             a⁶(w, v) = ∫_{Ω_2} c_22 (∂v_2/∂x_2)(∂w_2/∂x_2) + c_66 (∂v_1/∂x_2)(∂w_1/∂x_2)

    q = 7:   Θ⁷(µ) = (2 − b − L/2)/(2 − b_r − L_r/2)
             a⁷(w, v) = ∫_{Ω_3} c_22 (∂v_2/∂x_2)(∂w_2/∂x_2) + c_66 (∂v_1/∂x_2)(∂w_1/∂x_2)

    q = 8:   Θ⁸(µ) = −ω² (b − L/2)/(b_r − L_r/2)
             a⁸(w, v) = ∫_{Ω_1} w_1 v_1 + w_2 v_2

    q = 9:   Θ⁹(µ) = −ω² L/L_r
             a⁹(w, v) = ∫_{Ω_2} w_1 v_1 + w_2 v_2

    q = 10:  Θ¹⁰(µ) = −ω² (2 − b − L/2)/(2 − b_r − L_r/2)
             a¹⁰(w, v) = ∫_{Ω_3} w_1 v_1 + w_2 v_2
thanks to the Dirichlet conditions at x_1 = 0 (and also the w_i v_i term), (·,·)_X is appropriately coercive. We now observe that Θ¹(µ) = 1 (and hence we may take Γ¹ = 0), and we can thus disregard the q = 1 term in our continuity bounds. We may then choose |v|²_q = a^q(v, v), 2 ≤ q ≤ Q, since the a^q(·,·) are positive semi-definite; it thus follows from the Cauchy–Schwarz inequality that Γ^q = 1, 2 ≤ q ≤ Q; furthermore, from (36), we directly obtain C_X = 1.
Acknowledgments

We would like to thank Professor Yvon Maday of University Paris VI for his many invaluable contributions to this work. We would also like to thank Dr Christophe Prud'homme of EPFL, Mr Martin Grepl of MIT, Mr Gianluigi Rozza of EPFL, and Professor Liu Gui-Rong of NUS for many helpful recommendations. This work was supported by DARPA and AFOSR under Grant F49620-03-1-0356, DARPA/GEAE and AFOSR under Grant F49620-03-1-0439, and the Singapore-MIT Alliance.
References

[1] B.O. Almroth, P. Stern, and F.A. Brogan, "Automatic choice of global shape functions in structural analysis," AIAA J., 16, 525–528, 1978.
[2] A.K. Noor and J.M. Peters, "Reduced basis technique for nonlinear analysis of structures," AIAA J., 18, 455–462, 1980.
[3] J.P. Fink and W.C. Rheinboldt, "On the error behavior of the reduced basis technique for nonlinear finite element approximations," Z. Angew. Math. Mech., 63, 21–28, 1983.
[4] T.A. Porsching, "Estimation of the error in the reduced basis method solution of nonlinear equations," Math. Comput., 45, 487–496, 1985.
[5] M.D. Gunzburger, Finite Element Methods for Viscous Incompressible Flows: A Guide to Theory, Practice, and Algorithms, Academic Press, Boston, 1989.
[6] J.S. Peterson, "The reduced basis method for incompressible viscous flow calculations," SIAM J. Sci. Stat. Comput., 10, 777–786, 1989.
[7] K. Ito and S.S. Ravindran, "A reduced-order method for simulation and control of fluid flows," J. Comput. Phys., 143, 403–425, 1998.
[8] L. Machiels, Y. Maday, I.B. Oliveira, A.T. Patera, and D. Rovas, "Output bounds for reduced-basis approximations of symmetric positive definite eigenvalue problems," C. R. Acad. Sci. Paris, Série I, 331, 153–158, 2000.
[9] C. Prud'homme, D. Rovas, K. Veroy, Y. Maday, A.T. Patera, and G. Turinici, "Reliable real-time solution of parametrized partial differential equations: Reduced-basis output bound methods," J. Fluids Eng., 124, 70–80, 2002.
[10] Y. Maday, A.T. Patera, and G. Turinici, "Global a priori convergence theory for reduced-basis approximation of single-parameter symmetric coercive elliptic partial differential equations," C. R. Acad. Sci. Paris, Série I, 335, 289–294, 2002.
[11] E. Balmes, "Parametric families of reduced finite element models: Theory and applications," Mech. Syst. Signal Process., 10, 381–394, 1996.
[12] Y. Maday, A.T. Patera, and D.V. Rovas, "A blackbox reduced-basis output bound method for noncoercive linear problems," In: D. Cioranescu and J. Lions (eds.), Nonlinear Partial Differential Equations and Their Applications, Collège de France Seminar Volume XIV, Elsevier Science B.V., pp. 533–569, 2002.
[13] R. Becker and R. Rannacher, "Weighted a posteriori error control in finite element methods," ENUMATH 95 Proceedings, World Science Publications, Singapore, 1997.
[14] M. Paraschivoiu and A.T. Patera, "A hierarchical duality approach to bounds for the outputs of partial differential equations," Comp. Meth. Appl. Mech. Eng., 158, 389–407, 1998.
[15] M. Ainsworth and J.T. Oden, A Posteriori Error Estimation in Finite Element Analysis, Pure and Applied Mathematics, Wiley-Interscience, New York, 2000.
[16] J.W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
[17] K. Veroy, C. Prud'homme, and A.T. Patera, "Reduced-basis approximation of the viscous Burgers equation: Rigorous a posteriori error bounds," C. R. Acad. Sci. Paris, Série I, 337, 619–624, 2003.
[18] K. Veroy, C. Prud'homme, D.V. Rovas, and A.T. Patera, "A posteriori error bounds for reduced-basis approximation of parametrized noncoercive and nonlinear elliptic partial differential equations (AIAA Paper 2003-3847)," Proceedings of the 16th AIAA Computational Fluid Dynamics Conference, 2003.
[19] M. Meyer and H.G. Matthies, "Efficient model reduction in non-linear dynamics using the Karhunen–Loève expansion and dual-weighted-residual methods," Comput. Mech., 31, 179–191, 2003.
[20] A. Quarteroni and A. Valli, Numerical Approximation of Partial Differential Equations, 2nd edn., Springer, 1997.
[21] K. Veroy, D. Rovas, and A.T. Patera, "A posteriori error estimation for reduced-basis approximation of parametrized elliptic coercive partial differential equations: 'Convex inverse' bound conditioners," Control, Optim. Calculus Var., 8, 1007–1028, Special Volume: A tribute to J.-L. Lions, 2002.
[22] K. Veroy and A.T. Patera, "Certified real-time solution of the parametrized steady incompressible Navier–Stokes equations: Rigorous reduced-basis a posteriori error bounds," Submitted to International Journal for Numerical Methods in Fluids (Special Issue – Proceedings for 2004 ICFD Conference on Numerical Methods for Fluid Dynamics, Oxford), 2004.
[23] N.C. Nguyen, Reduced-Basis Approximation and A Posteriori Error Bounds for Nonaffine and Nonlinear Partial Differential Equations: Application to Inverse Analysis, PhD Thesis, Singapore-MIT Alliance, National University of Singapore, in progress, 2005.
[24] M.A. Grepl, N.C. Nguyen, K. Veroy, A.T. Patera, and G.R. Liu, "Certified rapid solution of parametrized partial differential equations for real-time applications," Proceedings of the 2nd Sandia Workshop of PDE-Constrained Optimization: Towards Real-Time and On-Line PDE-Constrained Optimization, SIAM Computational Science and Engineering Book Series, Submitted, 2004.
[25] L. Sirovich, "Turbulence and the dynamics of coherent structures, Part 1: Coherent structures," Q. Appl. Math., 45, 561–571, 1987.
[26] B. Roux (ed.), Numerical Simulation of Oscillatory Convection in Low-Pr Fluids: A GAMM Workshop, vol. 27 of Notes on Numerical Fluid Mechanics, Vieweg, 1990.
[27] N. Trudinger, "On imbedding into Orlicz spaces and some applications," J. Math. Mech., 17, 473–484, 1967.
[28] G. Talenti, "Best constant in Sobolev inequality," Ann. Mat. Pura Appl., 110, 353–372, 1976.
[29] G. Rozza, "Proceedings of the Third M.I.T. Conference on Computational Fluid and Solid Mechanics, June 14–17, 2005," In: K. Bathe (ed.), Computational Fluid and Solid Mechanics, Elsevier, Submitted, 2005.
[30] G. Caloz and J. Rappaz, "Numerical analysis for nonlinear and bifurcation problems," In: P. Ciarlet and J. Lions (eds.), Handbook of Numerical Analysis, vol. V, Techniques of Scientific Computing (Part 2), Elsevier Science B.V., pp. 487–637, 1997.
[31] K. Ito and S.S. Ravindran, "A reduced basis method for control problems governed by PDEs," In: W. Desch, F. Kappel, and K. Kunisch (eds.), Control and Estimation of Distributed Parameter Systems, Birkhäuser, pp. 153–168, 1998.
[32] F. Brezzi, J. Rappaz, and P. Raviart, "Finite dimensional approximation of nonlinear problems. Part I: Branches of nonsingular solutions," Numerische Mathematik, 36, 1–25, 1980.
1564
N.N. Cuong et al.
[33] M. Barrault, N.C. Nguyen, Y. Maday, and A.T. Patera, “An “empirical interpolation” method: application to efficient reduced-basis discretization of partial differential equations,” C. R. Acad. Sci. Paris, S´erie I, 339, 667–672, 2004. [34] D. Rovas, Reduced-Basis Output Bound Methods for Parametrized Partial Differential Equations, PhD Thesis, Massachusetts Institute of Technology, Cambridge, MA, 2002. [35] M.A. Grepl and A.T. Patera, A posteriori error bounds for reduced-basis approximations of parametrized parabolic partial differential equations, M2AN Math. Model. Numer. Anal., To appear, 2005. [36] Z.J. Bai, “Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems.”, Appl. Numer. Math., 43, 9–44, 2002. [37] C. Prud’homme, D.V. Rovas, K. Veroy, and A.T. Patera, “A mathematical and computational framework for reliable real-time solution of parametrized partial differential equations,” M2AN Math. Model. Numer. Anal., 36, 747–771, 2002.
Chapter 5
RATE PROCESSES
5.1 INTRODUCTION: RATE PROCESSES

Horia Metiu
University of California, Santa Barbara, CA, USA
We can divide the time evolution of a system into two classes. In one, a part of the system changes its state from time to time; chemical reactions, polaron mobility, diffusion of adsorbates on a surface, and protein folding belong to this class. In the other, the change of state takes place continuously; electrical conductivity, the diffusion of molecules in gases, and the thermoelectric effect in doped semiconductors belong to this class. Chemical kinetics deals with phenomena of the first kind; the second kind is studied by transport theory.

It is in the nature of a many-body system that its parts share energy with each other, creating a state of approximate equality. This leads to stagnation: each part tends to hover near the bottom of a bowl in the potential energy surface. Occasionally, the inherent thermal fluctuations put enough energy in a part of the many-body system to cause it to escape from its bowl and travel away from home. But the tendency to lose energy rapidly, once a part acquires more than its average share, will trap the traveler in another bowl. When this happens, the system has undergone a chemical reaction, or the polaron took a jump to another lattice site, or an impurity in a solid changed location. The rate of these events is described by well-known, generic, phenomenological rate equations. The parameter characterizing the rate of a specific system is the rate constant k.

In the past 30 years great progress has been made in our ability to calculate the rate constant by atomic simulations. The machinery for performing such calculations is described in the first articles in this chapter. Doll presents the modern view on the old and famous transition state theory (TST), which is still one of the most useful and widely used procedures for calculating rate constants. The atomic motion in a many-body system takes place on a scale of femtoseconds, while the lifetime of a system in a potential energy bowl is much longer. This discrepancy led to the misconception that the dynamics of a chemical reaction is slow.
The main insight of TST is that a system acquires enough energy to undergo a reaction only "once in a blue moon". If enough energy is acquired, in the right coordinates, the dynamics of the reaction is very rapid. The rate of reaction is low not because its dynamics is slow, but because the system has enough energy very rarely. In modern parlance the reaction is a rare event. This causes problems for a brute-force simulation of a reaction. One can follow a group of atoms, in the many-body system, for a nanosecond, because of limitations in computing power, and not observe a reactive event.

The second insight of TST is that the only parameter out of equilibrium, in a chemical kinetics experiment, is the concentration. Each molecule participating in the reaction is in equilibrium with its environment at all times. Therefore, one can calculate, from equilibrium statistical mechanics, the probability that the system reaches the transition state and the rate with which the system crosses the ridge separating the bowl in which the system is initially located from the one that is the final destination. This is all it takes to build a theory of the rate constant. The only approximation is the assumption that once the system crosses the ridge, it will turn back only after a long time, on the order of $k^{-1}$. This late event is part of the backward reaction and it does not affect the forward rate constant. Given the propensity of many-body systems to share energy among degrees of freedom, this is not a bad assumption: once it crosses the ridge the system has a high energy in the reaction coordinate and it is likely to lose it. There are, however, cases in which the shape of the potential energy around the ridge is peculiar or the reaction coordinate is weakly coupled to the other degrees of freedom. When this happens, recrossing is not negligible and TST makes errors. In my experience these errors are small and rarely affect the prefactor A, in the expression $k = A\exp[-E/RT]$, by more than 30%. Given that we are unable to calculate the activation energy E accurately and that E appears in the exponent, it seems unwise to try to obtain an accurate value for A when one makes substantial errors in E (a 0.2-eV error is not rare). This is why TST is still popular in spite of the fact that one could calculate the rate constant exactly, sometimes without a great deal of additional computational effort.

TST reduces the calculation of k to the calculation of partition functions, which can be performed by Monte Carlo simulations. There is no longer any need to perform extremely long molecular dynamics calculations in the hope of observing a transition of the system from one bowl to another. Because recrossing is neglected, the rate constant calculated by TST is always larger than the exact rate constant. This does not mean that the TST rate constant is always larger than the measured one. It is only larger than the rate constant calculated exactly on the potential energy surface used in the TST calculations. This inequality led to the development of variational transition state theory, developed and used extensively in Truhlar's work. In this procedure one
varies the position of the surface dividing the initial and the final bowls until the transition state theory rate constant has a minimum. The rate constant obtained in this way is more accurate (assuming that the potential energy is accurate) than the one calculated by placing the dividing surface on the ridge separating the two bowls. These issues are discussed and explained in Doll's article.

The next two articles, by Dellago and by Ciccotti, Kapral and Sergi, describe the methods used for exact calculations of the rate constant k. Here "exact" means the exact rate constant for a given potential energy surface. If the potential energy surface is erroneous, the exact rate constant has nothing to do with reality. However, it is important to have an exact theory, since our ability to generate reasonable (and sometimes accurate) potential energy surfaces is improving each year. The exact theory of the rate constant is based on the so-called correlation function theory, which first appeared in a paper by Yamamoto. Since this theory does not assume the absence of recrossing, it must use molecular dynamics to determine which trajectories recross and which do not. It does this very cleverly, to avoid the "rare event" trap. It uses equilibrium statistical mechanics to place the system on the dividing surface, with the correct probability. Then it lets the system evolve to cross the dividing surface and follows its evolution to determine whether it will recross the dividing surface. If it does, that particular crossing event is discarded. If it does not, it is kept as a reactive event. Averages over many such reactive events, used in a specific equation provided by the theory, give the exact rate constant. The advantage of this method, over ordinary molecular dynamics, is that it must follow the trajectory only for the time when the reaction coordinate loses energy and the system becomes unable to recross the dividing surface. As many experiments and simulations show, this time is shorter than a picosecond, which is quite manageable in computations. Moreover, the procedure generates a large number of reactive trajectories with the appropriate probability. Since reactive trajectories are very improbable, a brute-force molecular dynamics simulation, starting with the reactants, will generate roughly one reactive trajectory in 100 000 calculations, each requiring a very long trajectory. This is why brute-force calculations of the rate constant are not possible.

The two articles mentioned above discuss two different ways of implementing the theory. The theory presented by Dellago is new and has not been extensively tested. The one presented by Ciccotti, Kapral, and Sergi is the workhorse used for all systems that can be described by classical mechanics. While in principle the method is simple, the implementation is full of pitfalls and "small" technical difficulties, and these are clarified in the articles. Application of the correlation function theory to the diffusion of impurities in solids is discussed by Wahnström.
The statements made above, about the time scales reached by molecular dynamics, were true until a few years ago, when Voter proposed several methods that allow us to accelerate molecular dynamics to the point that we can follow the evolution of a complex system for microseconds. This has brought unexpected benefits. To use the transition state theory, or the correlation function theory of rare events, one must know what the events are; we need to know the initial and final state of the system. There are systems for which this is not easy to do. For example, Jónsson discovered, while studying the evolution of the shape of an "island" made by adsorbed atoms, that six atoms move in concert with relative ease. It is very unlikely that anyone would have proposed the existence of this "reaction" on the basis of chemical intuition. In general, in the complex systems encountered in biology and materials science, a group of molecules may move coherently and rapidly together in ways that are not intuitively expected. The accelerated dynamics method often finds such events, since it does not make assumptions about the final state of the system. The article by Uberuaga and Voter discusses this aspect of kinetics.

Since Kramers' classic work, it has been realized that in many systems chemical reactions can be described by a stochastic method that involves the Brownian motion of the representative point of the system on the potential energy surface. Since then, the theory has been expanded and used to explain chemical kinetics in condensed phases. Its advantage is that it expresses chemical kinetics in complex media in terms of a few parameters: the strength of thermal fluctuations in the system and the "friction" causing the system to lose energy from the reaction coordinate. This reductionist approach appeals to many experimentalists, who have used it to analyze the chemical kinetics of molecules in liquids. Much work has also been done to connect the friction and the fluctuations to the detailed dynamics of the system. Nitzan's article reviews the status of this field.

All theories mentioned above assume that the motion of the system can be described by classical mechanics. This is not the case in reactions involving proton or electron transfer. The generalization of the correlation function theory of the rate constant to a fully quantum theory has been made by Miller, Schwartz, and Tromp, who extended considerably the early work of Yamamoto. Some of the first computational methods using this theory were proposed by Wahnström and Metiu. Since then, approximate methods that allow calculations for systems with many degrees of freedom have been invented. These are reviewed by Schwartz and Voth, who have both contributed substantially to this field. The review of the quantum theory of rates is rounded off by an article by Gross on reactive scattering and adsorption at surfaces. This discusses the dynamics of such reactions in more detail than is usual in kinetics, since it examines the rate of reaction (dissociation or adsorption) when the molecule approaching the surface has a well-defined quantum
state. One can obtain the rate constant from this information by averaging the state-specific rates over a thermal distribution of initial states.

Many people familiar with statistical mechanics have realized that chemical kinetics is, like any other phenomenon in a many-body system, subject to fluctuations that might be observable if one could detect the kinetic behavior of a small number of molecules. It was believed that light scattering might be able to study such fluctuations, since it can detect the evolution of concentration in the very small volume illuminated by light. It turned out that the volume was not small enough and, as far as I know, the fluctuations have not been detected by this method. Undaunted by this lack of experimental observations, Gillespie went ahead and developed the methodology needed for studying the stochastic evolution of the concentration in a system undergoing chemical reactions. This methodology assumed that the rate constants are known and examined the evolution of the concentrations in space and time. Later on, scanning tunneling microscopy studies of the evolution of atoms deposited on a surface and a variety of single-molecule kinetic experiments provided examples of systems in which fluctuations in rate processes play a very important role. Gillespie's article reviews the methods dealing with fluctuating chemical kinetics. Evans reviews the stochastic algorithms needed for studying the kinetics of adsorbates, with applications to crystal growth and catalysis. Jensen's article studies specific kinetic models used in crystal growth.

The chapter ends with three articles on kinetic phenomena of interest in biology. The rate of protein folding, studied with minimalist models that try to capture the essential features causing proteins to fold, is reviewed by Chan. Pande examines the use of models in which the interatomic interactions are treated in full detail. The two approaches are complementary and much can be learned by comparing their conclusions. Tajkhorshid, Zhu, and Schulten review the transport of water through the pores of cell membranes. A dominant feature of this transport is that water forms a quasi one-dimensional "wire". For this reason, transport in biological channels is closely related to water transport through a carbon nanotube, and the article reviews both.

Kinetics, one of the oldest and most useful branches of chemical physics, is undergoing a quiet revolution and is penetrating all areas of materials science and biochemistry. There is a very good reason for this: most systems we are interested in are metastable. To understand what they are, we need to use kinetics to simulate how they are made. Moreover, we need to use kinetics to understand how they function and how they are degraded by outside influences or by inner instabilities. Finally, a well-formulated kinetic model contains thermodynamics as the long-time limit.
5.2 A MODERN PERSPECTIVE ON TRANSITION STATE THEORY

J.D. Doll
Department of Chemistry, Brown University, Providence, RI, USA
Chemical rates, the temporal evolution of the populations of species of interest, are of fundamental importance in science. Understanding how such rates are determined by the microscopic forces involved is, in turn, a basic focus of the present discussion. Before delving into the details, it is valuable to consider the general nature of the problem we face when considering the calculation of chemical rates. In what follows we shall assume that we know:
• the relevant physical laws (classical or quantum) governing the system,
• the molecular forces at work,
• the identity of the chemical species of interest, and
• the formal statistical-mechanical expressions for the desired rates.
Given all that, what is the "problem?" In principle, of course, there is none. "All" that we need do is to work out the "details" of our formal expressions and we have our desired rates. The kinetics of any conceivable physical, chemical, or biological process are thus within our reach. We can predict fracture kinetics in complex materials, investigate the effects of arbitrary mutations on protein folding rates, and optimize the choice of catalyst for the decomposition/storage of hydrogen in metals, right?

Sadly, "no." Even assuming that all of the above information is at our disposal, at present it is not possible in practice to carry out the "details" at the level necessary to produce the desired rates for arbitrary systems of interest. Why not? The essential problem we face when discussing chemical rates is one of greatly differing time scales. If, for example, a species is of sufficient interest that it makes sense to monitor its population, it is, by default, generally relatively "stable." That is, it is a species that tends to live a "long" time on the scale
of something like a molecular vibration. On the other hand, if we are to understand the details of chemical events of interest, then we must be able to describe the dynamics of those events on a time scale that is "short" on the molecular level. If we do otherwise, we risk losing the ability to understand how those detailed molecular motions influence and/or determine the rates at issue. What happens then when we confront the problem of describing a rate process whose natural time scale is on the order of seconds? If we are not careful we end up drowning in the detail imposed by being forced to describe events on macroscopic time scales using microscopic dynamical methods. In short, we spend a great deal of time (and effort) watching things "not happen."

Is there a better way to proceed? Fortunately, "yes." Using methods developed by a number of investigators [1–9], it is possible to formulate practical and reliable methods for estimating chemical rates for systems of realistic complexity. While there are often assumptions involved in the practical implementation of these approaches, it is increasingly feasible to quantify and often remove the effects of these assumptions, albeit at the expense of additional work. It is our purpose to review and illustrate these methods. Our discussion will focus principally on classical-level implementations. Quantum formulations of these methods are possible and are considered elsewhere in this monograph. While much effort has been devoted to the quantum problem, it remains a particularly active area of current research. In the present discussion, we purposely utilize a sometimes nonstandard language in order to unify the discussion of a number of historically separate topics and approaches.

The starting point for any discussion of chemical rates is the identification of various species of interest whose population will be monitored as a function of time. While there are many possible ways in which to do this, it is convenient to consider an approach based on the Stillinger/Weber inherent structure ideas [10, 11]. In this formulation, configuration space is partitioned by assigning each position to a unique potential energy basin ("inherent structure") based on a steepest descent quench procedure. The relevant mental image is that of watching a "ball" roll slowly "downhill" on the potential energy surface under the action of an over-damped dynamics. In many applications the Stillinger/Weber inherent structures are themselves of primary interest. Although the number of such structures grows rapidly (exponentially) with system size [12], this type of analysis and the associated graphical tools it has spawned [13] provide a valuable language for characterizing potential energy surfaces. Wales, in particular, has utilized variations of the technique to great advantage in his study of the minimization problem [14]. In our discussion, it is the evolution of the populations of the inherent structures rather than the structures themselves that are of primary concern.
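To make the quench procedure concrete, the following minimal sketch (in Python) assigns configurations to inherent structures by steepest descent; the double-well potential, step size, and convergence tolerance are illustrative assumptions, not part of the original discussion.

```python
import numpy as np

def V(x):
    """Illustrative bistable potential with minima near x = -1 and x = +1."""
    return (x**2 - 1.0)**2

def grad_V(x):
    return 4.0 * x * (x**2 - 1.0)

def quench(x0, step=1e-3, gtol=1e-8, max_iter=200_000):
    """Steepest-descent (over-damped) quench: let the configuration roll
    'downhill' until the gradient vanishes, then label the basin reached."""
    x = float(x0)
    for _ in range(max_iter):
        g = grad_V(x)
        if abs(g) < gtol:
            break
        x -= step * g
    return "A" if x < 0.0 else "B"   # inherent structure label

# Assign a set of sampled configurations to their inherent structures.
labels = [quench(x) for x in np.random.uniform(-1.5, 1.5, size=10)]
```

In a many-particle simulation the same quench, applied to snapshots of a trajectory, yields the inherent structure populations whose evolution is the object of the kinetic analysis below.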
Inherent structures, by construction, are associated with local minima in the potential energy surface. They thus have an intrinsic equilibrium population that can, if desired, be estimated using established statistical-mechanical techniques. Since the dynamics in the vicinity of the inherent structures is locally stable, the inherent structure populations tend to be (relatively) slowly varying and thus provide us with a natural set of populations for kinetic study. If followed as a function of time under the action of the dynamics generated by the potential energy surface to which the inherent structures belong, the populations of the inherent structures will, aside from fluctuations, tend to remain constant at their various equilibrium values. Fluctuations in these populations, on the other hand, will result in a net flow of material between the various inherent structures. Such flows are the mechanism by which such fluctuations, either induced or spontaneous, "relax." Consequently, they contain sufficient information to establish the desired kinetic parameters.

To make the discussion more explicit, we consider the simple situation of a particle moving on the bistable potential energy depicted in Fig. 1. Performing a Stillinger/Weber quench on this potential energy will obviously produce two inherent structures. Denoted A and B in the figure, these correspond to the regions to the left and right of the potential energy maximum, respectively. We now imagine that we follow the dynamics of a statistical ensemble of N particles moving on this potential energy surface. For the purposes of discussion, we assume that the physical dynamics involved includes a solvent or "bath" (here unspecified) that provides fluctuating forces that act on the system of interest.
Figure 1. A prototypical, bistable potential energy. The two inherent structures, A and B, are separated by an energy barrier.
The bath dynamics acts both to energize the system (permitting it to acquire sufficient energy to sometimes cross the potential barrier) as well as to dissipate that energy once it has been acquired. It is important to note that these fluctuations and dissipations must, in some sense, be balanced if an equilibrium state is to be produced and sustained [7]. Were the dynamics in our example purely conservative and one-dimensional in nature, for example, the notion of rates would be ill-posed.

We now assume in what follows that we can monitor the populations of the inherent structures as a function of time. Denoting these populations $N_A(t)$ and $N_B(t)$, we further assume, following Chandler [7], that the overall kinetics of the system can be described by the phenomenological rate equations

$$\frac{dN_A(t)}{dt} = -k_{A\to B}\,N_A(t) + k_{B\to A}\,N_B(t), \qquad \frac{dN_B(t)}{dt} = +k_{A\to B}\,N_A(t) - k_{B\to A}\,N_B(t). \tag{1}$$

If the total number of particles is conserved, then the two inherent structure populations are trivially related: the fluctuation in the population of one inherent structure is the negative of that for the other. Assuming a fixed number of particles, it is thus a relatively simple matter to show that

$$\frac{d\,\delta N_A(t)}{dt} = -(k_{A\to B} + k_{B\to A})\,\delta N_A(t), \tag{2}$$

where $\delta N_A(t)$ indicates the deviation of $N_A(t)$ from its equilibrium value. The decay of a fluctuation in the population of inherent structure A, relative to an initial value at time zero, is thus given by

$$\delta N_A(t) = \delta N_A(0)\, e^{-k_{\mathrm{eff}}\, t}, \tag{3}$$
where $k_{\mathrm{eff}}$ is given by the sum of the "forward" and "backward" rate constants

$$k_{\mathrm{eff}} = k_{A\to B} + k_{B\to A}. \tag{4}$$
As noted by Onsager [15], it is physically reasonable to assume that small fluctuations, whether induced or spontaneous, are damped in a similar manner. Accepting this hypothesis, we conclude from the above analysis that the decay of the equilibrium population autocorrelation function, denoted here by $\langle\cdots\rangle$, is given in terms of $k_{\mathrm{eff}}$ by

$$\frac{\langle \delta N_A(0)\,\delta N_A(t)\rangle}{\langle \delta N_A(0)\,\delta N_A(0)\rangle} = e^{-k_{\mathrm{eff}}\, t}. \tag{5}$$
Equivalently, taking the time derivative of both sides of this expression, we see that $k_{\mathrm{eff}}$ is given explicitly as

$$k_{\mathrm{eff}} = -\,\frac{\langle \delta N_A(0)\,\delta \dot N_A(t)\rangle}{\langle \delta N_A(0)\,\delta N_A(t)\rangle}. \tag{6}$$
Equations (5) and (6) are formally exact expressions that relate the sum of the basic rate constants of interest to various dynamical objects that can be computed. Since we also know the ratio of these two rate constants (it is given by the corresponding ratio of the equilibrium populations), the desired rate parameters can be obtained from either expression provided that we can obtain the relevant time correlation functions involved.

Although formally equivalent, Eqs. (5) and (6) differ with respect to their implicit computational demands. Computing the rate parameters via Eq. (5), for example, entails monitoring the decay of the population autocorrelation function. To obtain reliable estimates of the rate parameters from Eq. (5), we have to follow the system dynamics over a time scale that is an appreciable fraction of the reciprocal of $k_{\mathrm{eff}}$. If the barriers separating the inherent structures involved are "large", this time scale can become macroscopic. Simply stated, the disparate time-scale problem makes it difficult to study directly the dynamics of infrequent events using the approach suggested by Eq. (5). Equation (6), on the other hand, offers a more convenient route to the desired kinetic parameters. In particular, it indicates that we might be able to obtain these parameters from short- as opposed to long-time dynamical information. If the phenomenological rate expressions are formally correct for all times, then the ratio of the two time correlation functions in Eq. (6) is time-independent. However, since it is generally likely that the phenomenological rate expressions accurately describe only the longer-time motion between inherent structures, we expect in practice that the ratio on the right-hand side of Eq. (6) will approach a constant "plateau" value only at times long on the scale of detailed molecular motions. The critical point, however, is that this transient period will be of molecular, not macroscopic, duration. With Eq. (6), we thus have a route to the desired kinetic parameters that requires only molecular or short time-scale dynamical input.

A valuable practical point concerning kinetic formulations based on Eq. (6) is that for many applications the final plateau value of the correlation function ratio involved is often relatively well approximated by its zero-time value. Because the correlation functions required depend only on time differences, such zero-time quantities are purely equilibrium objects. Consequently, an existing and extensive set of equilibrium tools can be invoked to produce approximations to kinetic parameters.

The approach to the calculation of chemical rates based on Eq. (6) has several desirable characteristics. Most importantly, it has a refinable nature and can be implemented in stages. At the simplest level, we can estimate chemical rate parameters using purely zero-time, or equilibrium, methods. Such approximate methods alone may be adequate for many applications. We are, however, not restricted to accepting such approximations blindly. With additional effort we can "correct" such preliminary estimates by performing additional dynamical studies.
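As an illustration of how Eq. (6) can be used in practice, the sketch below estimates $k_{\mathrm{eff}}$ from the plateau of the correlation-function ratio, given a recorded 0/1 indicator time series of residence in inherent structure A. The trajectory array, sampling interval, and lag window are assumptions made for this example.

```python
import numpy as np

def k_eff_estimate(hA, dt, max_lag):
    """Estimate k_eff via Eq. (6): -<dN_A(0) dN_A_dot(t)> / <dN_A(0) dN_A(t)>,
    where dN_A(t) = h_A(t) - <h_A> is the population fluctuation along a
    long equilibrium trajectory sampled every dt."""
    dNA = hA - hA.mean()
    lags = np.arange(1, max_lag)
    # population autocorrelation function <dN_A(0) dN_A(t)> at each lag
    c = np.array([np.mean(dNA[:-lag] * dNA[lag:]) for lag in lags])
    cdot = np.gradient(c, dt)      # its time derivative, <dN_A(0) dN_A_dot(t)>
    return lags * dt, -cdot / c    # this ratio should plateau at k_eff

# hA: indicator array from a simulation, e.g., hA = (x_traj < 0.0).astype(float)
# times, ratio = k_eff_estimate(hA, dt=1e-3, max_lag=5000)
# k_eff is read off in the "plateau" region, after the molecular transient.
```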
Because such calculations involve "corrections" to equilibrium estimates of rate parameters, as opposed to the entire rate parameters themselves, the dynamical input required is only that necessary to remove the errors induced by the initial equilibrium assumptions. Because such errors tend to involve simplified assumptions concerning the nature of transition state dynamics, the input required to estimate the corrections is of a molecular, not macroscopic, time scale.

We now focus our discussion on some of the practical issues involved in generating equilibrium estimates of the rates. We shall illustrate these using the simple two-state example described above. We begin by imagining that we have at our disposal the time history of a reaction coordinate of interest, $x(t)$. As a function of time, $x(t)$ moves back and forth between inherent structures A and B, which we assume to be separated by the position $x = q$. Using one of the basic properties of the delta function,

$$\delta(ax) = \frac{1}{|a|}\,\delta(x), \tag{7}$$
it is easy to show that $N(\tau,[x(t)])$, defined by

$$N(\tau,[x(t)]) = \int_0^{\tau} dt\, \left|\frac{dx(t)}{dt}\right|\,\delta(x(t)-q), \tag{8}$$

is a functional of the path whose value is equal to the (total) number of crossings of the $x(t) = q$ surface in the interval $(0,\tau)$. Every time $x(t)$ crosses $q$, the delta-function argument takes on a zero value. Because the delta function in Eq. (8) is in coordinate space while the integral is with respect to time, the Jacobian factor in Eq. (8) creates a functional whose value jumps by unity each time $x(t) - q$ sweeps through a value of zero. If we form a statistical ensemble corresponding to various possible histories of the motion of our system and bath, we can compute the average number of crossings of the $x(t) = q$ surface in the $(0,\tau)$ interval, $\langle N(\tau,[x(t)])\rangle$, using the expression

$$\langle N(\tau,[x(t)])\rangle = \int_0^{\tau} dt\, \langle |\dot x(t)|\,\delta(x(t)-q)\rangle. \tag{9}$$
Here $\dot x(t)$ represents the time derivative of $x(t)$. Because we are dealing with a "stationary" or equilibrium process, the time correlation function that appears on the right-hand side of Eq. (9) can be a function only of time differences. Consequently, the integrand on the right-hand side of Eq. (9) is time-independent and can be brought outside the integral. The result thus becomes

$$\langle N(\tau,[x(t)])\rangle = \langle |\dot x|\,\delta(x-q)\rangle \int_0^{\tau} dt, \tag{10}$$
where the (now unnecessary) time labels have been dropped. We thus see that the number of crossings of the $x(t) = q$ surface in this system per unit time is given by

$$\frac{\langle N(\tau,[x(t)])\rangle}{\tau} = \langle |\dot x|\,\delta(x-q)\rangle. \tag{11}$$

Recalling that $N$ measures the total number of crossings, the number of crossings per unit time in the direction from A to B (the number of "up zeroes" of $x(t) - q$ in the language of Slater) is half the value in Eq. (11). Thus, the equilibrium estimate of the rate constant for the A to B transition (i.e., the number of crossings per unit time from A to B per atom in inherent structure A) is given by
$$k_{A\to B}^{\mathrm{TST}} = \frac{1}{2}\,\frac{\langle |\dot x|\,\delta(x-q)\rangle}{\langle N_A\rangle}. \tag{12}$$
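For a one-dimensional system in the canonical ensemble, the average in Eq. (12) factorizes into a Maxwellian velocity part and a configurational part, giving $k^{\mathrm{TST}} = \sqrt{k_B T/2\pi m}\; e^{-\beta V(q)} / \int_A e^{-\beta V(x)}\,dx$. The following minimal sketch evaluates this by quadrature; the double-well potential and the parameter values are illustrative assumptions.

```python
import numpy as np

def V(x):
    """Illustrative bistable potential; the dividing surface sits at q = 0."""
    return (x**2 - 1.0)**2

def k_tst_1d(beta=8.0, m=1.0, q=0.0, a=-3.0):
    """Canonical 1D TST rate, Eq. (12):
    k = (1/2)<|v|> exp(-beta*V(q)) / integral over A of exp(-beta*V(x)) dx,
    where (1/2)<|v|> = sqrt(1/(2*pi*beta*m)) for a Maxwell distribution."""
    x = np.linspace(a, q, 20001)                    # inherent structure A: x < q
    ZA = np.trapz(np.exp(-beta * V(x)), x)          # configurational weight of A
    flux = np.sqrt(1.0 / (2.0 * np.pi * beta * m))  # one-way thermal flux factor
    return flux * np.exp(-beta * V(q)) / ZA

print(k_tst_1d())   # rate in the model's reduced units
```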
Equation (12) gives an approximate expression for the rate constant that involves an equilibrium flux between the relevant inherent structures. Because the relevant flux is associated with the "transition" of one inherent structure into another, the approach to chemical rates suggested by Eq. (12) is typically termed "transition state" theory (TST). Along with its multi-dimensional generalizations, it represents a convenient and useful approximation to the desired chemical rate constants. Being an equilibrium approximation to the dynamical objects of interest, it permits the powerful machinery of Monte Carlo methods [16, 17] to be brought to bear on the computational problem. The significance of this is that the required averages can be computed to any desired accuracy for arbitrary potential energy models. One can proceed analytically by making secondary, simplifying assumptions concerning the potential. Such approximations are, however, controllable in that their quality can be tested. Furthermore, Eq. (12) provides a unified treatment of the problem that is independent of the nature of the statistical ensemble that is involved. Applications involving canonical, microcanonical, and other ensembles are treated within a common framework. It is historically interesting in this regard to note that if the reaction coordinate of interest is expressed as a superposition of normal modes, Eq. (12) leads naturally to the unimolecular reaction expressions of Ref. [4].

There is a technical aspect concerning the calculation of the averages appearing in Eq. (12) that merits discussion. In particular, it is apparent from the nature of the average involved that, if they are to be computed accurately, the numerical methods involved must be capable of accurately describing the reactant's concentration profile in the vicinity of the transition state. If we are dealing with activated processes where the difference between transition-state and inherent-structure energies is "large", then such concentrations can become quite small and difficult to treat by standard methods.
This is simply the equilibrium, "sparse-sampling" analog of the disparate time-scale dynamical problem. Fortunately, there are a number of well-defined techniques for coping with this technical issue. These include, to name a few, umbrella methods [18], Bennett/Voter techniques [19, 20], J-walking [21, 22], and parallel tempering approaches [23]. These and related methods make it possible to compute the required, transition-state-constrained averages.

The basic approach outlined above can be extended in a number of ways. One immediate extension involves problems in which there are multiple, rather than two, states involved. Adams has considered such problems in the context of his studies on the effects of precursor states on thermal desorption [24]. A second extension involves using the fundamental kinetic parameters produced to study more complex events. Voter, in a series of developments, has formulated a computationally viable method for studying diffusion in solids based on such an approach [25]. In its most complete form (including dynamical corrections), this approach produces a computationally exact procedure for surface or bulk diffusion coefficients of a point defect at arbitrary temperatures in a periodic system [26]. In related developments, Voter [25] and Henkelman and Jónsson [27] have discussed using "on-the-fly" determinations of TST kinetic parameters in kinetic Monte Carlo studies. Such methods make it possible to explore a variety of lattice dynamical problems without resorting to ad hoc assumptions concerning the mechanisms of various elementary events. In a particularly promising development, they also appear to offer a valuable tool for the study of long-time dynamical events [28, 29].

An important practical issue in the calculation of TST approximations to rates is the identification of the transition state itself. In many problems, such as the simple two-state problem discussed previously, locating the transition state is trivial. In others, it is not. Techniques designed to locate explicit transition states in complex systems have been discussed in the literature. One popular technique, developed by Cerjan and Miller [30] and extended by others [31–33], is based on an "eigenvector following" method. In this approach, one basically moves "up-hill" from a selected inherent structure using local mode information to determine the transition state. Other approaches, including methods that do not require explicit second-order derivatives of the potential, have been discussed [34]. It is also important to mention a different class of methods suggested by Pratt [35]. Borrowing a page from path integral applications, this technique attempts to locate transition states by working with paths that build in proper initial and final inherent structure character from the outset. Expanding upon the spirit of the original Pratt suggestion, recent efforts have considered sampling barrier-crossing paths directly [36].

We wish to close by pointing out what we feel may prove to be a potentially useful link between inherent structure decomposition methods and the problem of "probabilistic clustering" [37, 38].
An important problem in applied mathematics is the reconstruction of an unknown probability distribution given a known statistical sampling of that distribution. So stated, the probabilistic clustering problem is effectively the inverse of the Monte Carlo sampling problem. Rather than producing a statistical sampling of a given distribution, we seek instead to reconstruct the unknown distribution from a known statistical sampling. This clustering problem is of broad significance in information technology and has received considerable attention. Our point in emphasizing the link between probabilistic clustering and inherent structure methods is that our increased ability to sample arbitrary, sparse distributions would appear to offer an alternative to the Stillinger/Weber quench approach to the inherent structure decomposition problem. In particular, one could use clustering methods both to "identify" and to "measure" the concentrations of inherent structures present in a system.
Acknowledgments

The author would like to thank the National Science Foundation for support through awards CHE-0095053 and CHE-0131114 and the Department of Energy through award DE-FG02-03ER46704. He also wishes to thank the Center for Advanced Scientific Computing and Visualization (TCASCV) at Brown University for valuable assistance with respect to some of the numerical simulations described in the present paper.
References

[1] M. Polanyi and E. Wigner, "The interference of characteristic vibrations as the cause of energy fluctuations and chemical change," Z. Phys. Chem., 139(Abt. A), 439, 1928.
[2] H. Eyring, "Activated complex in chemical reaction," J. Chem. Phys., 3, 107, 1935.
[3] H.A. Kramers, "Brownian motion in a field of force and the diffusion model of chemical reactions," Physica (The Hague), 7, 284, 1940.
[4] N.B. Slater, Theory of Unimolecular Reactions, Cornell University Press, Ithaca, 1959.
[5] P.J. Robinson and K.A. Holbrook, Unimolecular Reactions, Wiley-Interscience, 1972.
[6] D.G. Truhlar and B.C. Garrett, "Variational transition state theory," Ann. Rev. Phys. Chem., 35, 159, 1984.
[7] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford, New York, 1987.
[8] P. Hänggi, P. Talkner, and M. Borkovec, "Reaction-rate theory: fifty years after Kramers," Rev. Mod. Phys., 62, 251, 1990.
[9] M. Garcia-Viloca, J. Gao, M. Karplus, and D.G. Truhlar, "How enzymes work: analysis by modern rate theory and computer simulations," Science, 303, 186, 2004.
[10] F.H. Stillinger and T.A. Weber, "Dynamics of structural transitions in liquids," Phys. Rev. A, 28, 2408, 1983.
[11] F.H. Stillinger and T.A. Weber, "Packing structures and transitions in liquids and solids," Science, 225, 983, 1984.
[12] F.H. Stillinger, "Exponential multiplicity of inherent structures," Phys. Rev. E, 59, 48, 1999.
[13] O.M. Becker and M. Karplus, "The topology of multidimensional potential energy surfaces: theory and application to peptide structure and kinetics," J. Chem. Phys., 106, 1495, 1997.
[14] D.J. Wales and J.P.K. Doye, "Global optimization by basin-hopping and the lowest energy structures of Lennard–Jones clusters containing up to 110 atoms," J. Phys. Chem. A, 101, 5111, 1997.
[15] L. Onsager, "Reciprocal relations in irreversible processes, II," Phys. Rev., 38, 2265, 1931.
[16] M.H. Kalos and P.A. Whitlock, Monte Carlo Methods, Wiley-Interscience, New York, 1986.
[17] M.P. Nightingale and C.J. Umrigar, Quantum Monte Carlo Methods in Physics and Chemistry, Kluwer, Dordrecht, 1998.
[18] J.P. Valleau and G.M. Torrie, "A guide to Monte Carlo for statistical mechanics: 2. byways," In: B.J. Berne (ed.), Statistical Mechanics: Equilibrium Techniques, Plenum, New York, 1977.
[19] C.H. Bennett, "Exact defect calculations in model substances," In: A.S. Nowick and J.J. Burton (eds.), Diffusion in Solids: Recent Developments, Academic Press, New York, pp. 73, 1975.
[20] A.F. Voter, "A Monte Carlo method for determining free-energy differences and transition state theory rate constants," J. Chem. Phys., 82, 1890, 1985.
[21] D.D. Frantz, D.L. Freeman, and J.D. Doll, "Reducing quasi-ergodic behavior in Monte Carlo simulations by J-walking: applications to atomic clusters," J. Chem. Phys., 93, 2769, 1990.
[22] J.P. Neirotti, F. Calvo, D.L. Freeman, and J.D. Doll, "Phase changes in 38 atom Lennard–Jones clusters: I: a parallel tempering study in the canonical ensemble," J. Chem. Phys., 112, 10340, 2000.
[23] C.J. Geyer and E.A. Thompson, "Annealing Markov chain Monte Carlo with applications to ancestral inference," J. Am. Stat. Assoc., 90, 909, 1995.
[24] J.E. Adams and J.D. Doll, "Dynamical aspects of precursor state kinetics," Surf. Sci., 111, 492, 1981.
[25] J.D. Doll and A.F. Voter, "Recent developments in the theory of surface diffusion," Ann. Rev. Phys. Chem., 38, 413, 1987.
[26] A.F. Voter, J.D. Doll, and J.M. Cohen, "Using multistate dynamical corrections to compute classically exact diffusion constants at arbitrary temperature," J. Chem. Phys., 90, 2045, 1989.
[27] G. Henkelman and H. Jónsson, "Long time scale kinetic Monte Carlo simulations without lattice approximation and predefined event table," J. Chem. Phys., 115, 9657, 2001.
[28] A.F. Voter, F. Montalenti, and T.C. Germann, "Extending the time scale in atomistic simulation of materials," Ann. Rev. Mater. Res., 32, 321, 2002.
[29] V.S. Pande, I. Baker, J. Chapman, S.P. Elmer, S. Khaliq, S.M. Larson, Y.M. Rhee, M.R. Shirts, C.D. Snow, E.J. Sorin, and B. Zagrovic, "Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing," Biopolymers, 68, 91, 2003.
[30] C.J. Cerjan and W.H. Miller, "On finding transition states," J. Chem. Phys., 75, 2800, 1981.
[31] C.J. Tsai and K.D. Jordan, "Use of an eigenmode method to locate the stationary points on the potential energy surfaces of selected argon and water clusters," J. Phys. Chem., 97, 11227, 1993.
[32] J. Nichols, H. Taylor, P. Schmidt, and J. Simons, "Walking on potential energy surfaces," J. Chem. Phys., 92, 340, 1990.
[33] D.J. Wales, "Rearrangements of 55-atom Lennard–Jones and (C60)55 clusters," J. Chem. Phys., 101, 3750, 1994.
[34] G. Henkelman and H. Jónsson, "A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives," J. Chem. Phys., 111, 7010, 1999.
[35] L.R. Pratt, "A statistical method for identifying transition states in high dimensional problems," J. Chem. Phys., 85, 5045–5048, 1986.
[36] P.G. Bolhuis, D. Chandler, C. Dellago, and P.L. Geissler, "Transition path sampling: throwing ropes over rough mountain passes, in the dark," Ann. Rev. Phys. Chem., 53, 291, 2002.
[37] B.G. Mirkin, Mathematical Classification and Clustering, Kluwer, Dordrecht, 1996.
[38] D. Sabo, D.L. Freeman, and J.D. Doll, "Stationary tempering and the complex quadrature problem," J. Chem. Phys., 116, 3509, 2002.
5.3 TRANSITION PATH SAMPLING

Christoph Dellago
Institute of Experimental Physics, University of Vienna, Vienna, Austria
Often, the dynamics of complex condensed materials is characterized by the presence of a wide range of different time scales, complicating the study of such processes with computer simulations. Consider, for instance, dynamical processes occurring in liquid water. Here, the fastest molecular processes are intramolecular vibrations with periods in the 10–20 fs range. The translational and rotational motions of water molecules occur on a significantly longer time scale. Typically, the direction of translational motion of a molecule persists for about 500 fs, corresponding to 50 vibrational periods. Hydrogen bonds, responsible for many of the unique properties of liquid water, have an average lifetime of about 1 ps, and the rotational motion of water molecules stays correlated for about 10 ps. Much longer time scales are typically involved if covalent bonds are broken and formed. For instance, the average lifetime of a water molecule in liquid water before it dissociates and forms hydroxide and hydronium ions is on the order of 10 h. This enormous range of time scales, spanning nearly 20 orders of magnitude, is a challenge for the computer simulator who wants to study such processes.

In general, the dynamics of molecular systems can be explored on a computer with molecular dynamics simulation (MD), a method in which the underlying equations of motion are solved in small time steps. In such simulations the size of the time step must be shorter than the shortest characteristic time scale in the system. Thus, many molecular dynamics steps must be carried out to explore the dynamics of a molecular system for times that are long compared with the basic time scale of molecular vibrations. Depending on specific system properties and the available computer equipment, one can carry out from 10 000 to millions of such steps. In ab initio simulations, where interatomic forces are determined by solving the electronic structure problem on the fly, total simulation times typically do not exceed dozens of picoseconds. Longer simulations of nanosecond, or, in some rare cases, microsecond length can be achieved if forces are determined from computationally less expensive empirical force fields often
used to simulate biological systems. But many interesting and important processes still lie beyond the time scale accessible with MD simulations even on today's fastest computers. Indeed, an ab initio molecular dynamics simulation of liquid water long enough to observe a few dissociations of water molecules would require a multiple of the age of the universe of computing time even on state-of-the-art parallel high-performance computers. The computational effort needed to study many other interesting processes, ranging from protein folding to the nucleation of phase transitions and transport in and on solids, in straightforward molecular dynamics simulations with atomistic resolution may be less extreme, but still surpasses the capabilities of current computer technology.

Fortunately, many processes occurring on long time scales are rare rather than slow. Consider, for instance, a chemical reaction during which the system has to overcome a large energy barrier on its way from reactants to products. Before the reaction occurs, the system typically spends a long time in the reactant state and only a rare fluctuation can drive the system over the barrier. If this fluctuation happens, however, the barrier is crossed rapidly. For example, it is now known from transition path sampling simulations that the dissociation of a water molecule in liquid water takes place in a few hundred femtoseconds once a rare solvent fluctuation drives the transition between the stable states, the intact water molecule and the separated ion pair. As mentioned earlier, the waiting time for this event, however, is of the order of 10 h. Other examples of rare but fast transitions between stable states include the nucleation of first-order phase transitions, conformational transitions of biopolymers, and transport in and on solids.

In such cases it is computationally advantageous to focus on those segments of the time evolution during which the rare event takes place rather than wasting large amounts of computing time following the dynamics of the system waiting for the rare event to happen. Several computational techniques to accomplish that have been put forward [1–4]. One approach consists in locating (or postulating) the bottleneck separating the stable states between which the rare transition occurs. Molecular dynamics trajectories initiated at this bottleneck, or transition state, can then be used to study the reaction mechanism in detail and to calculate reaction rate constants [5]. In small or highly ordered systems transition states can often be associated with saddle points on the potential energy surface. Such saddle points can be located with appropriate algorithms. Particularly in complex, disordered systems such as liquids, however, such an approach is frequently unfeasible. The number of saddle points on the potential energy surface may be very large and most saddle points may be irrelevant for the transition one wants to study. Entropic effects can further complicate the problem. In this case, a technique called transition path sampling provides an alternative approach [6].

Transition path sampling is a computational methodology based on a statistical mechanics of trajectories. It is designed to study rare transitions between
known and well-defined stable states. In contrast to other methods, transition path sampling does not require any a priori knowledge of the mechanism. Instead, it is sufficient to unambiguously define the stable states between which the transition occurs. The basic idea of transition path sampling consists in assigning a probability, or weight, to every pathway. This probability is a statistical description of all possible reactive trajectories, the transition path ensemble. Trajectories are then generated according to their probability in the transition path ensemble. Analysis of the harvested pathways yields detailed mechanistic information on the transition mechanism. Reaction rate constants can be determined within the framework of transition path sampling by calculating "free energy" differences between different ensembles of trajectories. In the following, we will give a brief overview of the basic concepts and algorithms of the transition path sampling technique. For a detailed description of the methodology and for practical issues regarding the implementation of transition path sampling simulations the reader is referred to two recent review articles [7, 8].
1. The Transition Path Ensemble
Imagine a system with two long-lived stable states, call them A and B, between which rare transitions occur (see Fig. 1). The system spends much of its time fluctuating in the stable states A and B, but only rarely do transitions between A and B occur. In the transition path sampling method one focuses on short
Figure 1. Several transition pathways connecting stable states A and B which are separated by a rough free energy barrier.
trajectories $x(T)$ of length $T$ (in time) represented by a time-ordered discrete sequence of states:

$$x(T) \equiv \{x_0, x_{\Delta t}, x_{2\Delta t}, \ldots, x_T\}. \tag{1}$$
Here, $x_t$ is the state of the system at time $t$. Each trajectory may be thought of as a chain of states obtained by taking snapshots at regular time intervals of length $\Delta t$ as the system evolves according to the rules of the underlying dynamics. If the time evolution of the system follows Newton's equations of motion, $x \equiv \{r, p\}$ is a point in phase space and consists of the coordinates, $r$, and momenta, $p$, of all particles. For systems evolving according to a high-friction Langevin equation or a Monte Carlo procedure the state $x$ may include only coordinates and no momenta.

The probability of a certain trajectory to be observed depends on the probability $\rho(x_0)$ of its initial point $x_0$ and on the probability to observe the subsequent sequence of states starting from that initial point. For a Markovian process, that is, for a process in which the probability of state $x_t$ to evolve into state $x_{t+\Delta t}$ over a time $\Delta t$ depends only on $x_t$ and not on the history of the system prior to $t$, the probability $P[x(T)]$ of a trajectory $x(T)$ can simply be written as a product of single-step transition probabilities $p(x_t \to x_{t+\Delta t})$:

$$P[x(T)] = \rho(x_0) \prod_{i=0}^{T/\Delta t - 1} p(x_{i\Delta t} \to x_{(i+1)\Delta t}). \tag{2}$$
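As an illustration of Eq. (2), the sketch below evaluates the log-weight of a discretized path when the single-step transition probabilities are the Gaussians generated by an overdamped Langevin equation. The propagator and the harmonic force are assumptions chosen for concreteness, not part of the original text.

```python
import numpy as np

def log_path_weight(path, force, dt, beta, gamma=1.0):
    """log P[x(T)] for an overdamped Langevin process:
    x_{t+dt} = x_t + (dt/gamma)*F(x_t) + Gaussian noise of variance 2*dt/(beta*gamma),
    so each single-step transition probability p(x_t -> x_{t+dt}) is Gaussian."""
    var = 2.0 * dt / (beta * gamma)
    logw = 0.0
    for x_old, x_new in zip(path[:-1], path[1:]):
        mean = x_old + dt * force(x_old) / gamma
        logw += -0.5 * (x_new - mean) ** 2 / var - 0.5 * np.log(2.0 * np.pi * var)
    return logw   # add log rho(x_0) for the full path probability of Eq. (2)

# Example: harmonic force F(x) = -x, applied to a short illustrative path
path = np.cumsum(np.random.randn(100)) * 0.05
print(log_path_weight(path, force=lambda x: -x, dt=0.01, beta=1.0))
```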
For an equilibrium system in contact with a heat bath at temperature $T$ the distribution of starting points is canonical, i.e., $\rho(x_0) \propto \exp\{-H(x_0)/k_B T\}$, where $H(x)$ is the Hamiltonian of the system and $k_B$ is Boltzmann's constant. Depending on the process under study other distributions of initial conditions may be appropriate.

The path distribution of Eq. (2) describes the probability to observe a particular trajectory regardless of whether it connects the two stable states A and B. Since in the transition path approach the focus is on reactive trajectories, the path distribution $P[x(T)]$ is restricted to the subset of pathways starting in A and ending in B:

$$P_{AB}[x(T)] \equiv h_A(x_0)\,P[x(T)]\,h_B(x_T)\,/\,Z_{AB}(T). \tag{3}$$
The functions $h_A(x)$ and $h_B(x)$ are unity if their argument $x$ lies in region A or B, respectively, and they vanish otherwise. Accordingly, only reactive trajectories starting in A and ending in B can have a weight different from zero in the path distribution $P_{AB}[x(T)]$. The factor

$$Z_{AB}(T) \equiv \int \mathcal{D}x(T)\; h_A(x_0)\,P[x(T)]\,h_B(x_T), \tag{4}$$
which has the form of a partition function, normalizes the path distribution of Eq. (3). The notation $\int \mathcal{D}x(T)$, familiar from path integral theory, denotes a summation over all pathways.
The function $P_{AB}[x(T)]$, which is a probability distribution function in the high-dimensional space of all trajectories, describes the set of all reactive trajectories with their correct weight. This set of pathways is the transition path ensemble.

In transition path sampling simulations care must be exercised in defining the stable states A and B. Both A and B need to be large enough to accommodate most equilibrium fluctuations, i.e., the system should spend the overwhelming fraction of its time in either A or B. At the same time, A and B should not overlap with each other's basin of attraction. Here, the basin of attraction of region A consists of all configurations that relax predominantly into that region. The basin of attraction of region B is defined analogously. If state A is incorrectly defined in such a way that it also contains points belonging to the basin of attraction of B, the transition path ensemble includes pathways only apparently connecting the two stable states. This situation is illustrated in Fig. 2.

In many cases the stable states A and B can be defined through specific limits of a one-dimensional order parameter $q(x)$. Although there is no general rule guiding the construction of such order parameters, this step in setting up a
Figure 2. Regions A and B must be defined in a way to avoid overlap of A and B with each other's basin of attraction. On this two-dimensional free energy surface, region A, defined through $q < q_A$, includes points belonging to the basin of attraction of B (defined through $q > q_B$). Thus, the transition path ensemble $P_{AB}[x(T)]$ contains paths which start in A and end in B, but which never cross the transition state region marked by TS (dashed line). This problem can be avoided by also using the second variable, $q'$, in the definition of the stable states.
transition path sampling simulation can usually be completed quite easily with a trial-and-error procedure. Note, however, that an appropriate order parameter is not necessarily a good reaction coordinate capable of describing the whole transition. In general, finding such a reaction coordinate is a difficult problem.
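A minimal way to encode such order-parameter-based stable-state definitions is sketched below (Python; the order parameter and the thresholds q_A and q_B are illustrative assumptions):

```python
def h_A(x, q, q_A=-0.8):
    """Indicator of stable state A: unity when the order parameter q(x) < q_A."""
    return 1.0 if q(x) < q_A else 0.0

def h_B(x, q, q_B=0.8):
    """Indicator of stable state B: unity when q(x) > q_B."""
    return 1.0 if q(x) > q_B else 0.0
```

These are the functions $h_A$ and $h_B$ entering Eqs. (3) and (4), and they are reused by the path moves described in the next section.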
2. Sampling the Transition Path Ensemble
In the transition path sampling method a biased random walk through path space is performed in such a way that pathways are visited according to their weight in the transition path ensemble $P_{AB}[x(T)]$. This can be accomplished in an efficient way with Monte Carlo methods, proceeding in analogy to a conventional Monte Carlo simulation of, say, a simple liquid at a given temperature $T$ [9]. In that case a random walk through configuration space is constructed by carrying out a sequence of discrete steps. In each step, a new configuration is generated from an old one, for instance by displacing a single particle in a random direction by a random amount. Then, the new configuration, also called the trial configuration, is accepted or rejected depending on how the probability of the new configuration compares to that of the old one. This is most easily done by applying the celebrated Metropolis rule [10], designed to enforce detailed balance between the move and its reverse. As a result, the trial move is always accepted if the energy of the new configuration is lower than that of the old one, and accepted with a probability $\exp(-\Delta E/k_B T)$ if the trial move is energetically uphill (here, $\Delta E$ is the energy difference between the new and the old configuration). Execution of a long sequence of such random moves followed by the acceptance or rejection step yields a random walk of the system through configuration space during which configurations are sampled with a frequency proportional to their weight in the canonical ensemble. Ensemble averages of structural and thermodynamic quantities can then be straightforwardly computed by averaging over this sequence of configurations.
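A sketch of the Metropolis rule just described (Python; the energies and temperature are placeholders):

```python
import math
import random

def metropolis_accept(E_old, E_new, kT):
    """Always accept downhill moves; accept uphill moves with probability
    exp(-dE/kT), enforcing detailed balance w.r.t. the canonical distribution."""
    dE = E_new - E_old
    return dE <= 0.0 or random.random() < math.exp(-dE / kT)
```

The same acceptance logic carries over to path space, with the configurational weight replaced by the path weight $P_{AB}[x(T)]$.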
each pathway is visited according to its weight in the transition path ensemble. It is important to note that while pathways are sampled with a Monte Carlo procedure, each single pathway is a genuinely dynamical pathway generated according to the rules of the underlying dynamics. To implement the procedure outlined above, one needs to specify how to generate a new pathway from an old one. This can be done efficiently with algorithms called shooting and shifting. For simplicity we will explain these algorithms for Newtonian dynamics (as used in most MD simulations), although they can easily be applied to other types of dynamics as well. So, imagine a Newtonian trajectory of length T as obtained from a molecular dynamics simulation of L = T/Δt steps starting in region A and ending in region B (see Fig. 3). From this existing transition pathway a new trajectory is generated by first randomly selecting a state of the existing trajectory. Then, the momenta belonging to the selected state are changed by a small random amount. Starting from this state with modified momenta, the equations of motion are integrated forward to time T and backward to time 0. As a result, one obtains a complete new trajectory of length T which crosses (in configuration space) the old trajectory at one point. By keeping the momentum displacement small, the new trajectory can be made to resemble the old one closely. As a consequence, the new pathway is likely to be reactive as well and to have a nonzero weight in the transition path ensemble. Any new trajectory with starting point in A and ending point in B can be accepted with high likelihood (in fact, for constant-energy trajectories with a microcanonical distribution of initial conditions, all new trajectories connecting A and B can be accepted). If the new trajectory does not begin in A or does not end in B, it is rejected. For optimum efficiency, the magnitude of the momentum displacement should be selected such that the average acceptance probability is in the range from 40 to 60%. Shooting moves can be complemented with shifting moves, which consist of shifting the starting point of the path in time. This kind of move is computationally inexpensive since typically only a small part of the pathway needs to
Figure 3. In a shooting move one generates a new trajectory (dashed line) from an old one (solid line) by integrating the equations of motion forward and backward starting from a point, randomly selected along the old trajectory, whose momenta have been changed by a small random amount. The acceptance probability of the newly generated path can be controlled by varying the magnitude of the momentum displacement (thin arrow).
be regrown. If the starting point of the path is shifted forward in time, a path segment of appropriate length has to be appended at the end of the path by integration of the equations of motion. If, on the other hand, the starting point is shifted backward in time, the trajectory must be completed by integrating the equations of motion backward in time starting from the initial point of the original pathway. Depending on the time by which the path is shifted, the new path can have large parts in common with the old path. Since ergodic sampling is not possible with shifting moves alone, path shifting always needs to be combined with path shooting. Although shifting moves cannot generate a truly new path, they can increase sampling efficiency, especially for the calculation of reaction rate constants. To start the Monte Carlo path sampling procedure one needs a pathway that already connects A with B. This initial pathway is not required to be a high-weight dynamical trajectory, but can be an artificially constructed chain of states. Shooting and shifting will then rapidly relax this initial pathway towards regions of higher probability in path space. The generation of an initial trajectory is strongly system dependent and usually does not pose a serious problem.
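The shooting move is compact enough to sketch in a few lines of code. The following Python fragment illustrates it for a particle in a one-dimensional double well integrated with the velocity Verlet algorithm; the potential, the state definitions in_A and in_B, and all parameter values are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def force(x):                      # toy double-well potential V(x) = (x^2 - 1)^2
    return -4.0 * x * (x**2 - 1.0)

def in_A(x): return x < -0.8       # assumed definition of stable state A
def in_B(x): return x > 0.8        # assumed definition of stable state B

def integrate(x0, v0, n_steps, dt=0.01, m=1.0):
    """Velocity Verlet trajectory of n_steps starting from (x0, v0)."""
    xs, vs = np.empty(n_steps + 1), np.empty(n_steps + 1)
    xs[0], vs[0] = x0, v0
    f = force(x0)
    for i in range(n_steps):
        xs[i + 1] = xs[i] + vs[i] * dt + 0.5 * f / m * dt**2
        f_new = force(xs[i + 1])
        vs[i + 1] = vs[i] + 0.5 * (f + f_new) / m * dt
        f = f_new
    return xs, vs

def shooting_move(path_x, path_v, dv=0.05):
    """Generate a trial path from an old one and apply the acceptance rule."""
    L = len(path_x) - 1
    j = int(rng.integers(1, L))                  # random slice on the old path
    v_new = path_v[j] + dv * rng.normal()        # small random momentum change
    xf, vf = integrate(path_x[j], v_new, L - j)  # forward segment up to time T
    xb, vb = integrate(path_x[j], -v_new, j)     # time-reversed backward segment
    trial_x = np.concatenate([xb[::-1], xf[1:]])
    trial_v = np.concatenate([(-vb)[::-1], vf[1:]])
    if in_A(trial_x[0]) and in_B(trial_x[-1]):   # reactive trial path: accept
        return trial_x, trial_v, True
    return path_x, path_v, False                 # non-reactive: reject, keep old
```

For microcanonical (constant-energy) dynamics this all-or-nothing acceptance suffices, as noted above; for other path ensembles the acceptance would additionally contain a Metropolis ratio of path weights.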
3. Analyzing Transition Pathways
Pathways harvested with the transition path sampling method are full dynamical trajectories in the space spanned by the positions and momenta of all particles. In such high-dimensional many-particle systems it is usually difficult to identify the relevant degrees of freedom and to distinguish them from those which might be regarded as random noise. In the case of a chemical reaction occurring in a solvent, for instance, the specific role of solvent molecules during the reaction is often unclear. Although direct inspection of transition pathways with molecular visualization tools may yield some insight, detailed knowledge of the transition mechanism can only be gained through systematic analysis of the collected pathways. In the following, we will briefly review two approaches to carry out such an analysis: the transition state ensemble and the distribution of committors. In simple systems with a few degrees of freedom, for instance a small molecule undergoing an isomerization in the gas phase, one can study transition mechanisms by locating minima and saddle points on the potential energy surface of the system. While the potential energy minima are the stable states in which the system spends most of its time, the saddle points are configurations the system must cross on its way from one potential energy well to another. These so-called transition states are the lowest points on the ridges separating the stable states from each other. From the transition states the system can relax
into either one of the two stable states depending on the initial direction of motion. In a high-dimensional complex system, local potential energy minima and saddle points do not carry the same significance as in simple systems. In a large, disordered system many local potential energy minima and saddle points may belong to one single stable state, and free energy barriers may not be related to a single saddle point. Nevertheless, the concept of a transition state is still meaningful if defined in a statistical way. In this definition, configurations are considered to be transition states if trajectories started from them with random initial momenta have equal probability to relax into either one of the stable states between which transitions occur. Naturally, along each transition pathway there is at least one (but sometimes several) configuration with this property. Performing such an analysis for many transition pathways yields the transition state ensemble, the set of all configurations on transition pathways which relax into A and B with equal probability. Inspection of this set of configurations is simpler than scrutiny of the complete set of harvested pathways. As a result of the analysis described above one may be led to guess which degrees of freedom are most important during the transition, or, in other words, which degrees of freedom contribute to the reaction coordinate. Such a guessed reaction coordinate, q(x), can be tested with the following procedure. The first step consists in calculating the free energy F(q), for instance by using umbrella sampling [9] or constrained molecular dynamics [11]. The free energy profile F(q) will possess minima at values of q typical for the stable states A and B and a barrier located at q = q* separating these two minima. If q is a good reaction coordinate, trajectories started from configurations with q = q* relax into A and B with equal probability. To verify the quality of the postulated reaction coordinate, a set of configurations with q = q* is generated. Then, for each of these configurations one calculates p_B, the probability to relax into state B, also called the committor. This can be done by initiating many short trajectories at the configuration and observing which state they relax to. As a result, one obtains a distribution P(p_B) of committors. For a good reaction coordinate, this distribution should peak at a value of p_B ≈ 1/2. If this is not the case, other degrees of freedom need to be taken into account for a correct description of the transition [7].
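The committor test described above is easy to express in code. The sketch below estimates p_B for a single configuration by launching short Langevin trajectories with Maxwell-Boltzmann momenta on the same toy double well used earlier; the friction, temperature, and state boundaries are assumptions for illustration, not values from the text.

```python
import numpy as np

def force(x):                       # toy double-well potential V(x) = (x^2 - 1)^2
    return -4.0 * x * (x**2 - 1.0)

def committor(x0, n_traj=500, n_steps=4000, dt=0.01, kT=0.15,
              gamma=1.0, m=1.0, seed=1):
    """Fraction of short trajectories from x0 that relax into state B."""
    rng = np.random.default_rng(seed)
    noise = np.sqrt(2.0 * gamma * kT * dt / m)
    n_B = 0
    for _ in range(n_traj):
        x = x0
        v = rng.normal(scale=np.sqrt(kT / m))    # Maxwell-Boltzmann momentum
        for _ in range(n_steps):
            # simple Euler-type Langevin step; crude but adequate for a sketch
            v += (force(x) / m - gamma * v) * dt + noise * rng.normal()
            x += v * dt
            if x < -0.8:                         # committed to A
                break
            if x > 0.8:                          # committed to B
                n_B += 1
                break
    return n_B / n_traj

# at a genuine transition state the committor should be close to 1/2
print(committor(0.0))
```

Repeating this estimate for many configurations drawn from the surface q = q* yields the distribution P(p_B) used to judge the postulated reaction coordinate.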
4. Reaction Rate Constants
Since trajectories collected in the transition path sampling method are genuine dynamical trajectories, they can be used to study the kinetics of reactions. The phenomenological description of the kinetics in terms of reaction rate constants is related to the underlying microscopic dynamics by time correlation functions of appropriate population functions that describe how the system
relaxes after a perturbation [12]. In particular, for transitions from A to B the relevant correlation function is

$$C(t) = \frac{\langle h_A(x_0)\, h_B(x_t)\rangle}{\langle h_A\rangle},\qquad(5)$$

where the angular brackets ⟨···⟩ denote equilibrium averages. The correlation function C(t) is the conditional probability to observe the system in region B at time t provided it was in region A at time 0. To understand the general features of this function, let us imagine that we prepare a large number of identical and independent systems in such a way that at time t = 0 all of them are located in A. Then we let all systems evolve freely and observe the fraction of systems in region B as a function of time. This fraction is the correlation function C(t). Initially, all systems are in A and, therefore, C(0) = 0. As time goes on, some systems cross the barrier due to random fluctuations and contribute to the population in region B. So C(t) grows, and it keeps growing until equilibrium sets in, i.e., until the flow of systems from A to B is compensated by the flow of systems moving from B back to A. For very long times, correlations are lost and the probability to find a system in B is just given by the equilibrium population ⟨h_B⟩. For first-order kinetics C(t) approaches its asymptotic value exponentially, C(t) = ⟨h_B⟩[1 − exp(−t/τ_rxn)], where the reaction time τ_rxn can be written in terms of the forward and backward reaction rate constants, τ_rxn = (k_AB + k_BA)^(−1). For times short compared to the reaction time τ_rxn (but longer than the time necessary to cross the barrier) the correlation function C(t) grows linearly, C(t) ≈ k_AB t, and the slope of this curve is the forward rate constant k_AB. Thus, to determine reaction rate constants one has to calculate the correlation function C(t). To determine the correlation function C(t) in the transition path sampling method we rewrite it in the suggestive form

$$C(t) = \frac{\int \mathcal{D}x(t)\; h_A(x_0)\, P[x(t)]\, h_B(x_t)}{\int \mathcal{D}x(t)\; h_A(x_0)\, P[x(t)]}.\qquad(6)$$

Here, both numerator and denominator are integrals over path distributions and can be viewed as partition functions belonging to two different path ensembles. The integral in the denominator has the form of a partition function of the ensemble of pathways starting in region A and ending anywhere. The integral in the numerator, on the other hand, is more restrictive and places a condition also on the final point of the path. This integral can be viewed as the partition function of the ensemble of pathways starting in region A and ending in region B. Thus, the path ensemble in the numerator is a subset of the path ensemble in the denominator. The ratio of partition functions can be related to the free energy difference ΔF between the two ensembles of pathways,

$$C(t) \equiv \exp(-\Delta F).\qquad(7)$$
This free energy difference ΔF is the generalized reversible work necessary to confine the endpoints of pathways starting in A to region B. Exploiting this viewpoint, one can calculate the time correlation function C(t), and hence determine reaction rate constants, by adapting conventional free energy estimation methods to work in trajectory space. So far, reaction rate constants have been calculated in the framework of transition path sampling with umbrella sampling, thermodynamic integration, and fast switching methods. In principle, the forward reaction rate constant k_AB can be determined by carrying out a free energy calculation for different times t and taking a numerical derivative. In the time range where C(t) grows linearly, this derivative has a plateau which coincides with k_AB. Proceeding in such a way one has to perform several computationally expensive free energy calculations. Fortunately, C(t) can be factorized in such a way that only one such calculation needs to be carried out, for a particular time t′. The value of C(t) at all other times in the range [0, T] can then be determined from a single transition path sampling simulation of trajectories of length T. Thus, calculating reaction rate constants in the transition path sampling method is a two-step procedure. First, C(t′) is determined for a particular time t′ using a free energy estimation method in path space. In a second step, one additional transition path sampling simulation is carried out to determine C(t) at all other times. The reaction rate constant can finally be calculated by determining the time derivative of C(t).
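The final step, extracting k_AB from C(t), is a simple piece of numerical post-processing. The sketch below uses synthetic first-order kinetics data in place of a computed correlation function; ⟨h_B⟩, τ_rxn, and the plateau window are assumed numbers chosen only to make the example self-contained.

```python
import numpy as np

t = np.linspace(0.0, 20.0, 2001)
h_B_eq, tau_rxn = 0.3, 200.0
C = h_B_eq * (1.0 - np.exp(-t / tau_rxn))   # stands in for the computed C(t)

k_of_t = np.gradient(C, t)                  # numerical time derivative dC/dt
plateau = (t > 2.0) & (t < 10.0)            # after the molecular transient
k_AB = k_of_t[plateau].mean()               # plateau value = forward rate constant
print(k_AB, h_B_eq / tau_rxn)               # agree when t_mol << t << tau_rxn
```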
5. Outlook
Transition path sampling is a practical and very general methodology to collect and analyze rare pathways. In equilibrium, such rare but important trajectories may arise due to free energetic barriers impeding the motion of the system through configuration space. Transition path sampling, however, can be used equally well to study rare trajectories occurring in non-equilibrium processes, such as solvent relaxation following excitation, or rare pathways arising in new methodologies for the computation of free energy differences. Different types of dynamics, ranging from Monte Carlo and Brownian dynamics to Newtonian and non-equilibrium dynamics, can be treated on the same footing. To date, the transition path sampling method has been applied to many processes in physics, chemistry and materials science. Examples include chemical reactions in solution, conformational changes of biomolecules, isomerizations of small clusters, the dynamics of hydrogen bonds, ionic dissociation, transport in solids, proton transfer in aqueous systems, the dynamics of non-equilibrium systems, base pair binding in DNA, hydrophobic collapse, and cavitation between solvophobic surfaces. Furthermore, transition path sampling has been combined with other approaches such as parallel tempering, master equation methods, and the Jarzynski method for the computation of free energy
differences. Due to the generality of the transition path sampling method it is likely that in the future this approach will be used fruitfully to study new problems in a variety of complex systems.
References
[1] D. Wales, Energy Landscapes: Applications to Clusters, Biomolecules and Glasses, Cambridge University Press, Cambridge, 2003.
[2] R. Elber, A. Ghosh, and A. Cárdenas, "Long time dynamics of complex systems," Acc. Chem. Res., 35, 396, 2002.
[3] H. Jónsson, G. Mills, and K.W. Jacobsen, "Nudged elastic band method for finding minimum energy paths of transitions," In: B.J. Berne, G. Ciccotti, and D.F. Coker (eds.), Computer Simulation of Rare Events and Dynamics of Classical and Quantum Condensed-Phase Systems – Classical and Quantum Dynamics in Condensed Phase Simulations, World Scientific, Singapore, p. 385, 1998.
[4] W. E, W. Ren, and E. Vanden-Eijnden, "String method for the study of rare events," Phys. Rev. B, 66, 052301, 2002.
[5] D. Chandler, "Barrier crossings: classical theory of rare but important events," In: B.J. Berne, G. Ciccotti, and D.F. Coker (eds.), Computer Simulation of Rare Events and Dynamics of Classical and Quantum Condensed-Phase Systems – Classical and Quantum Dynamics in Condensed Phase Simulations, World Scientific, Singapore, p. 3, 1998.
[6] C. Dellago, P.G. Bolhuis, F.S. Csajka, and D. Chandler, "Transition path sampling and the calculation of rate constants," J. Chem. Phys., 108, 1964, 1998.
[7] C. Dellago, P.G. Bolhuis, and P.L. Geissler, "Transition path sampling," Adv. Chem. Phys., 123, 1, 2002.
[8] P.G. Bolhuis, D. Chandler, C. Dellago, and P.L. Geissler, "Transition path sampling: throwing ropes over mountain passes, in the dark," Annu. Rev. Phys. Chem., 53, 291, 2002.
[9] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic Press, San Diego, 2002.
[10] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., 21, 1087, 1953.
[11] G. Ciccotti, "Molecular dynamics simulations of nonequilibrium phenomena and rare dynamical events," In: M. Meyer and V. Pontikis (eds.), Proceedings of the NATO ASI on Computer Simulation in Materials Science, Kluwer, Dordrecht, p. 119, 1991.
[12] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
5.4 SIMULATING REACTIONS THAT OCCUR ONCE IN A BLUE MOON

Giovanni Ciccotti¹, Raymond Kapral², and Alessandro Sergi²
¹ INFM and Dipartimento di Fisica, Università "La Sapienza", Piazzale Aldo Moro, 2, 00185 Roma, Italy
² Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON, Canada, M5S 3H6
The computation of the rates of condensed phase chemical reactions poses many challenges for theory. Not only do condensed phase systems possess a large number of degrees of freedom, so that computations are lengthy, but typically chemical reactions are activated processes, so that transitions between metastable states are rare events that occur on time scales long compared to those of most molecular motions. This time scale separation makes it almost impossible to determine reaction rates by straightforward simulation of the equations of motion. Furthermore, condensed phase reactions often involve collective degrees of freedom where the solvent participates in an important way in the reactive process. Consequently, the choice of a reaction coordinate to describe the reaction is often far from a trivial task. Various methods for determining reaction paths have been devised (see Refs. [1, 2] and references therein). These methods have the goal of determining how the system passes from one metastable state to another and thus finding the reaction path or reaction coordinate. In many situations one has some knowledge of how to construct a reaction coordinate (or set of reaction coordinates) for a particular physical problem. One example is the use of the many-body solvent polarization reaction coordinate to describe electron or proton transfer in solution. In almost all situations investigated to date, the dynamics of condensed phase activated reaction rates can be described in terms of a small number of reaction coordinates (often involving collective degrees of freedom). In this chapter, we describe how to simulate the rates of activated chemical reactions that occur on slow time scales, assuming that some set of suitable reaction coordinates is known. In order to compute the rates of rare reactive
events we need to be able to sample regions of configuration space that are rarely visited since the interconversion between reactants and products entails passage through such regions of low probability. We show that by applying holonomic constraints to the reaction coordinate in a molecular dynamics simulation we can force the system to visit unfavorable configuration space regions. Through such constraints we can generate an ensemble of configurations (the Blue Moon ensemble) that allows one to efficiently estimate the rate constant for activated chemical processes [3].
1. Reactive Flux Correlation Function Formalism
We begin with a sketch of the reactive flux correlation function formalism in order to specify the quantities that must be computed to obtain the reaction rate constant. In order to simplify the notation, we consider a molecular system containing N atoms with Hamiltonian H = K(p) + V(r), where K(p) is the kinetic energy, V(r) is the potential energy and (p, r) denotes the 6N momenta and coordinates defining the phase space of the system. A chemical reaction A ⇌ B is assumed to take place in the system. The reaction dynamics is described phenomenologically by the mass action rate law

$$\frac{dn_A(t)}{dt} = -k_f\, n_A(t) + k_r\, n_B(t),\qquad(1)$$

where n_A(t) is the mean number density of species A. The task is to compute the forward k_f and reverse k_r = k_f K_eq^(−1) (K_eq is the equilibrium constant) rate constants by molecular dynamics simulation. (The formalism is easily generalized to other reaction schemes.) To this end, we assume that the progress of the reaction can be characterized on a microscopic level by a scalar reaction coordinate ξ(r) which is a function of the positions of the particles in the system. A dividing surface at ξ‡ serves to partition the configuration space of the system into two domains, A and B, that contain the metastable A and B species. The microscopic variable corresponding to the fraction of systems in the A domain is n_A(r) = θ(ξ‡ − ξ(r)), where θ is the Heaviside function. Similarly, the fraction of systems in the B domain is n_B(r) = θ(ξ(r) − ξ‡). The time rate of change of n_A(r) is

$$\dot{n}_A(r) = -\dot{\xi}(r)\,\delta(\xi(r) - \xi^{\ddagger}).\qquad(2)$$
The rate at which the A and B species interconvert can be determined from the well-known reactive flux formula for the rate constant [4–6]. Using this formalism the time-dependent forward rate coefficient can be expressed in terms of the equilibrium correlation function of the initial flux of A with the
A species density at time t as

$$k_f(t) = \frac{1}{n_A^{\rm eq}}\,\langle \dot{n}_A(r)\, n_A(r,t)\rangle = \frac{1}{n_A^{\rm eq}}\,\langle \dot{\xi}\,\delta(\xi(r)-\xi^{\ddagger})\,\theta(\xi(r(t))-\xi^{\ddagger})\rangle.\qquad(3)$$

Here, the angular brackets denote an equilibrium canonical average, ⟨···⟩ = Q^(−1)∫dp dr exp{−βH}···, where Q is the partition function and n_A^eq is the equilibrium density of species A. The forward rate constant can be determined from the plateau value of this time-dependent forward rate coefficient [6]. We can separate the static and dynamic contributions to the rate coefficient by multiplying and dividing each term on the right-hand side of Eq. (3) by ⟨δ(ξ(r) − ξ‡)⟩ to obtain

$$k_f(t) = \frac{\langle \dot{\xi}\,\delta(\xi(r)-\xi^{\ddagger})\,\theta(\xi(r(t))-\xi^{\ddagger})\rangle}{\langle \delta(\xi(r)-\xi^{\ddagger})\rangle}\;\frac{\langle \delta(\xi(r)-\xi^{\ddagger})\rangle}{n_A^{\rm eq}}.\qquad(4)$$
The equilibrium average ⟨δ(ξ(r) − ξ‡)⟩ = P(ξ‡) is the probability density of finding the value of the reaction coordinate ξ(r) = ξ‡. We may introduce the free energy W(ξ) associated with the reaction coordinate by the definition W(ξ) = −β^(−1) ln(P(ξ)/P_u), where P_u is a uniform probability density of ξ. For an activated process the free energy will have the form shown schematically in Fig. 1. A high free energy barrier at ξ = ξ‡ separates the metastable reactant and product states. The equilibrium density of species A is
$$n_A^{\rm eq} = \langle \theta(\xi^{\ddagger} - \xi(r))\rangle = \int d\xi'\,\theta(\xi^{\ddagger} - \xi')\,\langle\delta(\xi(r) - \xi')\rangle = \int_{\xi' < \xi^{\ddagger}} d\xi'\, P(\xi').\qquad(5)$$
Figure 1. Sketch of the free energy versus ξ showing the free energy maximum at ξ = ξ ‡ and specification of the A and B domains.
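Both P(ξ), W(ξ) and n_A^eq of Eq. (5) can, in principle, be estimated directly from a histogram of a very long unconstrained trajectory, which is exactly what becomes impractical when the barrier is high. The following sketch shows the bookkeeping on synthetic bimodal data standing in for a real time series of ξ; the dividing surface value and all distribution parameters are assumptions made for illustration.

```python
import numpy as np

kT = 1.0
rng = np.random.default_rng(2)
# synthetic time series of the reaction coordinate: mostly A, occasionally B
xi = np.concatenate([rng.normal(-1.0, 0.25, 80000),
                     rng.normal(+1.0, 0.25, 20000)])

P, edges = np.histogram(xi, bins=200, density=True)     # estimate of P(xi)
centers = 0.5 * (edges[:-1] + edges[1:])
W = -kT * np.log(np.where(P > 0, P, np.nan))            # W(xi) up to a constant

xi_dd = 0.0                                             # dividing surface value
width = edges[1] - edges[0]
n_A_eq = P[centers < xi_dd].sum() * width               # Eq. (5): integral of P over A
print(n_A_eq)                                           # close to 0.8 here
```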
Using these results, the expression for the time-dependent rate coefficient may be written as

$$k_f(t) = \left\langle \dot{\xi}\,\theta(\xi(r(t)) - \xi^{\ddagger})\right\rangle^{\rm cd}_{\xi^{\ddagger}}\;\frac{e^{-\beta W(\xi^{\ddagger})}}{\displaystyle\int_{\xi<\xi^{\ddagger}} d\xi\; e^{-\beta W(\xi)}},\qquad(6)$$

where ⟨···⟩^cd_{ξ‡} defines an average conditional on ξ(r) = ξ‡,

$$\langle \cdots\rangle^{\rm cd}_{\xi'} = \frac{\langle \cdots\,\delta(\xi(r)-\xi')\rangle}{\langle \delta(\xi(r)-\xi')\rangle}.\qquad(7)$$

Often it is useful to represent the results in terms of the time-dependent transmission coefficient κ(t), which is defined as κ(t) = k_f(t)/k_f^TST, where the transition-state theory value of the rate constant is given by the limit t → 0+ of Eq. (3) as [5]

$$k_f^{\rm TST} = \frac{1}{n_A^{\rm eq}}\,\langle \dot{\xi}\,\delta(\xi(r)-\xi^{\ddagger})\,\theta(\dot{\xi})\rangle.\qquad(8)$$

The transmission coefficient κ(t) measures the deviations from k_f^TST due to dynamical recrossing events. From Eq. (6) we see that the computation of the rate coefficient requires the calculation of conditional averages depending on specified, rarely visited, values of the reaction coordinate. The ensemble of such configurations, which are visited "once in a blue moon," is termed the blue moon ensemble [3]. In Section 2, we describe how the conditional averages which play an essential role in the computation of the rate constant may be determined from constrained molecular dynamics trajectories, which allow one to efficiently sample rarely visited regions of configuration space.
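The content of Eqs. (3)-(8) can be made concrete with a small reactive-flux calculation. The sketch below estimates κ(t) for a one-dimensional double well by releasing trajectories from the dividing surface with Maxwell-Boltzmann velocities into a Langevin bath (the bath stands in for the solvent degrees of freedom); in a real condensed-phase application the initial configurations would instead be drawn from the constrained (Blue Moon) ensemble of Section 2. The model and all parameter values are illustrative assumptions.

```python
import numpy as np

def force(x):                     # toy double-well potential V(x) = (x^2 - 1)^2
    return -4.0 * x * (x**2 - 1.0)

def kappa_of_t(n_traj=2000, n_steps=400, dt=0.01, kT=0.15, m=1.0,
               gamma=1.0, xi_dd=0.0, seed=3):
    """kappa(t) = <v0 theta(xi(t) - xi_dd)> / <v0 theta(v0)> from released trajectories."""
    rng = np.random.default_rng(seed)
    noise = np.sqrt(2.0 * gamma * kT * dt / m)
    num, denom = np.zeros(n_steps), 0.0
    for _ in range(n_traj):
        v0 = rng.normal(scale=np.sqrt(kT / m))   # Maxwell-Boltzmann velocity
        denom += v0 * (v0 > 0.0)
        x, v = xi_dd, v0
        for i in range(n_steps):
            # Euler-type Langevin step standing in for the full solvent dynamics
            v += (force(x) / m - gamma * v) * dt + noise * rng.normal()
            x += v * dt
            num[i] += v0 * (x > xi_dd)           # flux-side correlation at (i+1)*dt
    return num / denom

kappa = kappa_of_t()
print(kappa[0], kappa[-1])   # ~1 as t -> 0+, plateau value k_f/k_f^TST at late times
```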
2. The Blue Moon Ensemble
The presence of the delta function δ(ξ(r) − ξ′) in the conditional equilibrium averages fixes the reaction coordinate to have the specified value ξ′. We shall now show that such conditional averages of observables depending only on configuration space variables can be computed by applying holonomic constraints to the equations of motion. For simplicity, we assume that no other bond constraints are present, but this more general case is easily treated [3]. To this end, we consider a system described by the Cartesian coordinates (r, p) subject to a holonomic constraint σ(r) = ξ(r) − ξ′ = 0 on the reaction coordinate. When there are constraints one usually introduces a set of generalized coordinates q and conjugate momenta p_q such that r = r(q). In general, it is not possible to invert this relation since there are more r coordinates than q
coordinates. However, by adding the expression for the constraint there is an extra generalized coordinate and the one-to-one correspondence r = r(q, σ) is recovered. The statistical mechanics of the system may be formulated in terms of the generalized coordinates q; however, it is useful to have an equivalent formulation in terms of the original Cartesian coordinates. The dynamics of the system is described in Cartesian coordinates by the Lagrangian

$$L(r, \dot{r}) = K(\dot{r}) - V(r) = \sum_{i=1}^{N} \tfrac{1}{2} m_i \dot{r}_i^{2} - V(r),\qquad(9)$$

to which we add the constraint σ = 0. The set of 3N − 1 generalized coordinates q plus σ can be taken as a new set of equivalent coordinates denoted collectively by u. We have r = r(q, σ) = r(u), or u = u(r). In the new variables the Lagrangian is given by

$$L'(u, \dot{u}) = \tfrac{1}{2}\,\dot{u}^{T}\mathbf{M}\dot{u} - V(u),\qquad(10)$$

where u^T is the transpose of the vector u and M = J^T m J is the metric matrix with elements given by

$$M_{\mu\nu} = \sum_{i=1}^{N} m_i\,\frac{\partial r_i}{\partial u_\mu}\cdot\frac{\partial r_i}{\partial u_\nu}.\qquad(11)$$

Here, J is the Jacobian matrix of the transformation r ↔ u and m is a diagonal matrix of the masses. The Lagrangian of the constrained motion is easily obtained by putting σ = σ̇ = 0,

$$L_c(q, \dot{q}) \equiv L'(q, \sigma = 0, \dot{q}, \dot{\sigma} = 0).\qquad(12)$$

To derive the statistical mechanical ensemble we need the Hamiltonian description of the dynamical system. The Hamiltonian corresponding to the description with coordinates u is given by

$$H(u, p_u) = \tfrac{1}{2}\, p_u^{T}\mathbf{M}^{-1} p_u + V(u),\qquad(13)$$

where

$$p_u = \frac{\partial L'}{\partial \dot{u}} = \mathbf{M}\dot{u}.\qquad(14)$$

The inverse of the metric matrix, M^(−1), can be written explicitly as

$$\left[\mathbf{M}^{-1}\right]_{\mu\nu} = \sum_{i=1}^{N}\frac{1}{m_i}\,\frac{\partial u_\mu}{\partial r_i}\cdot\frac{\partial u_\nu}{\partial r_i}.\qquad(15)$$

To obtain the constrained motion we have to compute the Hamiltonian at σ = 0 and p_σ satisfying the constraints σ = σ̇ = 0. Since

$$\dot{\sigma} = \left[\mathbf{M}^{-1} p_u\right]_{3N} = \mathbf{E}\, p_q + Z\, p_\sigma,\qquad(16)$$
where the subscript 3N denotes that element of the vector M^(−1)p_u, and E and Z are submatrices in the block form of M^(−1),

$$\mathbf{M}^{-1} = \begin{pmatrix} \boldsymbol{\Gamma} & \mathbf{E}^{T} \\ \mathbf{E} & Z \end{pmatrix},\qquad(17)$$

the above condition corresponds to taking p_σ = −Z̃^(−1)Ẽ p_q, where the tilde means that the matrices have to be evaluated at σ = 0. The explicit form of Z is needed below and is given by

$$Z = \sum_{i=1}^{N}\frac{1}{m_i}\,\frac{\partial\sigma}{\partial r_i}\cdot\frac{\partial\sigma}{\partial r_i}.\qquad(18)$$

The constrained Hamiltonian may now be written as

$$H_c(q, p_q) \equiv H\!\left(q, \sigma = 0, p_q, p_\sigma = -\tilde{Z}^{-1}\tilde{\mathbf{E}}\, p_q\right).\qquad(19)$$

Note that Eq. (16) implies p_σ + Z^(−1)E p_q = Z^(−1)σ̇. Letting ρ_c(q, p_q) be the probability density for the constrained dynamical system, we have

$$\rho_c(q, p_q)\, dq\, dp_q = \rho(u, p_u)\,\delta(\sigma)\,\delta(p_\sigma + Z^{-1}\mathbf{E} p_q)\, du\, dp_u = \rho(r, p_r)\,\delta(\sigma)\,\delta(Z^{-1}\dot{\sigma})\, dr\, dp_r \equiv \rho_{\xi'}(r, p_r)\, dr\, dp_r.\qquad(20)$$

In the penultimate equality, we used the fact that the point contact transformation (u, p_u) ↔ (r, p_r) is a canonical, phase-space-volume conserving, transformation [7]. We may rewrite this probability density as

$$\rho_{\xi'}(r, p_r) = \rho^{r}_{\xi'}(r)\,\rho^{p}_{\xi'}(p_r | r),\qquad(21)$$

where the configurational probability density, ρ^r_{ξ′}(r), is obtained by performing the momentum integration of the full probability density ρ_{ξ′}(r, p_r), and the conditional probability density of the momenta given the configuration, ρ^p_{ξ′}(p_r|r), is defined in Eq. (21). In the canonical ensemble, the configurational probability density is given by

$$\rho^{r}_{\xi'}(r)\, dr = Q_c^{-1}\,|Z|^{1/2}\, e^{-\beta V(r)}\,\delta(\sigma)\, dr = Q_c^{-1}\,|Z|^{1/2}\, e^{-\beta V(r)}\,\delta(\xi(r) - \xi')\, dr,\qquad(22)$$

where Q_c is the partition function of the constrained system. The factor |Z|^{1/2} arises from performing the momentum integration of Eq. (20). The conditional probability density of the momenta given the configuration is

$$\rho^{p}_{\xi'}(p_r | r)\, dp_r = |Z|^{-1/2}\, e^{-\beta K}\,\delta(Z^{-1}\dot{\sigma})\, dp_r.\qquad(23)$$
The physical interpretation of Z, which enters this expression for the probability density, has been discussed by several authors [8, 9]; it has its origin in the restriction imposed in momentum space by the constraint σ = 0 which, holding at all times, implies that the generalized velocity σ̇ must vanish. The configurational probability density in Eq. (22) should be compared with the joint configurational probability density to be at r and at ξ = ξ′,

$$\rho^{r}(r)\,\delta(\xi(r) - \xi')\, dr = Q^{-1}\, e^{-\beta V(r)}\,\delta(\xi(r) - \xi')\, dr.\qquad(24)$$

We may express the conditional average of any configurational property of our system in terms of the ξ-constrained ensemble introduced above. While the value ξ′ we wish to sample is rare in the original ensemble, only configurations with ξ = ξ′ are sampled in the ξ-constrained ensemble. This feature is illustrated schematically in Fig. 2. By comparison of Eqs. (22) and (24) we may write

$$\frac{\langle O(r)\,\delta(\xi(r)-\xi')\rangle}{\langle\delta(\xi(r)-\xi')\rangle} = \frac{\langle |Z|^{-1/2}\, O(r)\rangle_{\xi'}}{\langle |Z|^{-1/2}\rangle_{\xi'}},\qquad(25)$$

where the observable O(r) is any function of the configuration space, ⟨···⟩ denotes a canonical ensemble average, and ⟨···⟩_{ξ′} denotes an average over the constrained ensemble with ξ = ξ′. Equation (25) allows one to estimate the conditional average on the left-hand side in terms of averages in the constrained ensemble, with evident statistical advantages. The constrained ensemble we have constructed can be generalized to give the biased (whose bias may be removed) configurational sample determined above and the correct distribution of momenta. This Blue Moon ensemble can be easily obtained by multiplying the ξ-constrained configurational probability
Figure 2. Schematic representation of the sampling procedure in the Blue Moon ensemble. The bold line depicts the constrained (ξ(r) = ξ′) dynamical evolution in phase space. The unconstrained natural evolution of the system is shown as a dashed line. The open circles represent common points in configuration space which are the initial conditions of the activated trajectory sampling. These points are not real crossings in phase space since the two trajectories differ in momentum space. Interruptions of the dashed line denote lengthy segments in the natural trajectory and indicate that "crossings" are rare events. The dynamics represented by the solid line segments of the unconstrained trajectory in the vicinity of the crossing points provide the dynamical information needed to compute averages.
density times the correct (Maxwellian) conditional probability of the momenta (most often the momenta and coordinates are independent),

$$\rho_{\rm BM}(r, p_r) = \rho^{r}_{\xi'}(r)\,\rho^{p}(p_r | r).\qquad(26)$$

This ensemble provides the basis for a natural method that can be used to compute time correlation functions. Given two arbitrary dynamical variables O′(r, p_r) and O″(r, p_r) we may write

$$\frac{\langle O'(r, p_r)\, O''(r(t), p_r(t))\,\delta(\xi(r)-\xi')\rangle}{\langle\delta(\xi(r)-\xi')\rangle} = \frac{\langle |Z|^{-1/2}\, O'(r, p_r)\, O''(r(t), p_r(t))\rangle_{\xi'}}{\langle |Z|^{-1/2}\rangle_{\xi'}}.\qquad(27)$$

The result in Eq. (27) allows one to compute the contribution to the time-dependent rate coefficient in the first set of brackets in Eq. (4) using the Blue Moon ensemble. To complete the calculation of the time-dependent rate coefficient, we must be able to calculate P_ξ(ξ′) = ⟨δ(ξ − ξ′)⟩ = P_u exp(−βW(ξ′)) in an efficient manner. The free energy W(ξ) is the reversible work needed to bring the system from a given reference state to ξ = ξ′. The associated thermodynamic force

$$F(\xi') = -\frac{dW(\xi')}{d\xi'},\qquad(28)$$

can be expressed as the conditional average of a suitable observable, and from the thermodynamic integration of F(ξ′) over ξ′ we can obtain the potential of mean force

$$W(\xi') = \int^{\xi'}\! d\xi''\,\frac{dW(\xi'')}{d\xi''} = \int^{\xi'}\! d\xi''\,\frac{\langle(\partial H/\partial\xi)\,\delta(\xi-\xi'')\rangle}{\langle\delta(\xi-\xi'')\rangle}.\qquad(29)$$

The explicit form of the thermodynamic force can be obtained by performing the derivative in Eq. (28),

$$F(\xi') = -\frac{\langle(\partial H/\partial\xi)\,\delta(\xi-\xi')\rangle}{\langle\delta(\xi-\xi')\rangle} = \frac{\langle\left(\beta^{-1}(\partial/\partial\xi)\ln|J| - (\partial V/\partial\xi)\right)\delta(\xi-\xi')\rangle}{\langle\delta(\xi-\xi')\rangle} \equiv \frac{\langle \hat{F}\,\delta(\xi-\xi')\rangle}{\langle\delta(\xi-\xi')\rangle} = \frac{\langle Z^{-1/2}\hat{F}\rangle_{\xi'}}{\langle Z^{-1/2}\rangle_{\xi'}},\qquad(30)$$

where |J| is the Jacobian of the transformation r → u resulting from the explicit integration over the momenta. The quantity F̂ whose conditional average determines the mean force is the sum of two terms: the first term, β^(−1)(∂/∂ξ) ln|J|, represents the apparent forces acting on the system due to the use of generalized (non-inertial) coordinates, while the second term,
−(∂V/∂ξ), corresponds to the component along the generalized coordinate ξ of the force coming from the potential V. This result expresses the thermodynamic force as a conditional average which can be computed numerically by using the Blue Moon ensemble as indicated in the last line of Eq. (30). It is possible to obtain another, more convenient, expression for the mean force that obviates the need to compute the Jacobian and the ξ derivative of the potential, two quantities that are often difficult to determine [10]. The alternative form involves the Lagrange multiplier λ associated with the constraint on ξ, one of the variables in the (u, p_u) representation of phase space:

$$\dot{u} = \frac{\partial H}{\partial p_u}, \qquad \dot{p}_u = -\frac{\partial H}{\partial u} - \lambda\,\delta_{\xi u}.\qquad(31)$$

In this approach, one keeps the momentum-dependent observable (∂H/∂ξ) and computes the difference between the configurationally unbiased constrained average and the corresponding conditional average. The (negative of the) mean force can be written as

$$\frac{dW(\xi')}{d\xi'} = \frac{\langle(\partial H/\partial\xi)\,\delta(\xi-\xi')\rangle}{\langle\delta(\xi-\xi')\rangle} = \frac{\langle Z^{-1/2}\left[-\lambda - \dot{p}_\xi + (1/2\beta)\,(\partial\ln|Z|/\partial\xi)\right]\rangle_{\xi'}}{\langle Z^{-1/2}\rangle_{\xi'}} = \frac{\langle Z^{-1/2}\left[(1/\beta)\,G - \lambda\right]\rangle_{\xi'}}{\langle Z^{-1/2}\rangle_{\xi'}},\qquad(32)$$

where

$$G = \frac{1}{Z^{2}}\sum_{i,j=1}^{N}\frac{1}{m_i m_j}\,\frac{\partial\xi}{\partial r_i}\cdot\frac{\partial^{2}\xi}{\partial r_i\,\partial r_j}\cdot\frac{\partial\xi}{\partial r_j}.\qquad(33)$$

The computations leading to this result are straightforward but lengthy [10]. This formula provides a much more convenient route for the computation of the mean force and, hence, the potential of mean force, since it uses quantities that are automatically provided by SHAKE [11, 12] in the constrained molecular dynamics simulation. From these considerations, we see that all quantities needed to estimate the rate coefficient may be determined efficiently in the Blue Moon ensemble. It is straightforward to include other constraints, such as bond constraints, in the formalism [3]. In particular, the expression for the correlation function takes the form

$$\frac{\langle O'(r, p_r)\, O''(r(t), p_r(t))\,\delta(\xi(r)-\xi')\rangle}{\langle\delta(\xi(r)-\xi')\rangle} = \frac{\langle D^{-1/2}\, O'(r, p_r)\, O''(r(t), p_r(t))\rangle_{\xi',M}}{\langle D^{-1/2}\rangle_{\xi',M}},\qquad(34)$$
where the subscript M refers to the other bond constraints and D = |Z|/|Z′|, with Z defined as in Eq. (18) but extended to include all constraints,

$$Z_{mn} = \sum_{i=1}^{N}\frac{1}{m_i}\,\frac{\partial\sigma_m}{\partial r_i}\cdot\frac{\partial\sigma_n}{\partial r_i},\qquad(35)$$

while Z′ is defined by a similar equation with the restriction to only bond constraints. To illustrate how the formalism can be applied, we consider the adiabatic transfer of a proton in an [AHA]⁻ complex, with fixed internuclear separation between the A⁻ anions, solvated by a polar liquid [13]:

$$\mathrm{A{-}H\cdots A^{-} \;\rightleftharpoons\; A^{-}\cdots H{-}A}.\qquad(36)$$
Let r_p be the quantum coordinate of the proton and R the remainder of the complex and the solvent classical degrees of freedom. The total potential energy of the system is V = V_ps(r_p, R) + V_s(R), where V_ps and V_s are the proton–solvent and solvent–solvent interactions, respectively. In the adiabatic approximation, the proton wave function satisfies the following Schrödinger equation,

$$\left[-\frac{\hbar^{2}}{2m_p}\nabla^{2}_{r_p} + V_{ps}(r_p, R)\right]\psi_n(r_p; R) = \varepsilon_n(R)\,\psi_n(r_p; R),\qquad(37)$$

where m_p is the mass of the proton, 2πħ is Planck's constant, ε_n(R) is the nth adiabatic eigenvalue and ψ_n(r_p; R) is the corresponding wave function. The classical coordinates follow Newton's equations of motion, m_i R̈_i = −∇_{R_i}(V_s + ε_n(R)), on the nth adiabatic energy surface. In particular, the adiabatic dynamics on the ground-state surface can be calculated easily by solving the Schrödinger equation for each solvent configuration in order to obtain the ground-state energy ε₀(R) and wave function ψ₀. A convenient reaction coordinate for this problem is the solvent polarization, ξ(R) = ΔE(R),

$$\Delta E(R) = \sum_{i} z_i\left(\frac{1}{|R_i - u|} - \frac{1}{|R_i - u'|}\right),\qquad(38)$$

where z_i is the charge on site i and u and u′ are two chosen reference positions. The Blue Moon expression for the time-dependent transmission coefficient is

$$\kappa_f(t) = \frac{\left\langle D^{-1/2}\,(\Delta\dot{E})\,\theta(\Delta E(t) - \Delta E^{\ddagger})\right\rangle_{\xi',M}}{\left\langle D^{-1/2}\,(\Delta\dot{E})\,\theta(\Delta\dot{E})\right\rangle_{\xi',M}}.\qquad(39)$$
Here, D^(−1/2) is the Blue Moon unbiasing factor with D = (2m)^(−1)∑_i(∇_i ΔE)², where the sum extends over all classical degrees of freedom, assumed to have equal mass m. The time-dependent transmission coefficient κ(t) was calculated by constraining the system to ΔE(t) = ΔE‡ to generate the Blue Moon ensemble and releasing the constraint to determine the time evolution of
Figure 3. The time-dependent transmission coefficient κ(t) for adiabatic proton transfer, plotted as a function of t (ps).
θ(ΔE(t) − ΔE‡) that appears in the correlation function expression for κ(t). The results of this calculation are shown in Fig. 3 [13]. The simulations show that a plateau is reached on the order of 1 ps and that the forward rate constant k_f is about 0.6 of its k_f^TST value as a result of recrossings of the ΔE = ΔE‡ surface. The transition-state rate constant can also be determined in the Blue Moon ensemble using the expression
$$k_f^{\rm TST} = (2\pi\beta)^{-1/2}\,\left\langle D^{-1/2}\right\rangle^{-1}_{\xi',M}\;\frac{e^{-\beta W(\Delta E^{\ddagger})}}{\displaystyle\int_{\Delta E < \Delta E^{\ddagger}} d\Delta E\; e^{-\beta W(\Delta E)}}.\qquad(40)$$
The free energy at the barrier top ΔE‡ was estimated from quadratic approximations to the free energy function near the metastable states, and the expectation value of D^(−1/2) in the above formula was evaluated in the Blue Moon ensemble. The resulting value for the transition-state rate constant is k_f^TST = 4 × 10⁹ s⁻¹. The direct molecular dynamics simulation of the rate constant would be a difficult task without the use of a rare event sampling technique, in view of the activated nature of this proton transfer reaction.
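As a minimal illustration of the machinery of Eqs. (29), (32) and (25), the sketch below assembles the potential of mean force by thermodynamic integration of the Blue Moon mean force. The per-window samples of the Lagrange multiplier λ and of the factors Z and G, which a SHAKE-constrained simulation would provide, are replaced here by synthetic placeholders; all numbers are assumptions, not data from the proton transfer study.

```python
import numpy as np

beta = 1.0 / 0.6
xi_grid = np.linspace(1.0, 3.0, 21)    # constrained values xi' of the coordinate
rng = np.random.default_rng(4)

dW = np.empty_like(xi_grid)
for k, xi in enumerate(xi_grid):
    # placeholders for per-step output of a constrained run at xi(r) = xi'
    lam = rng.normal(loc=-4.0 * (xi - 2.0), scale=0.5, size=5000)  # multipliers
    Z = np.full(5000, 1.5)   # Eq. (18); constant, e.g., for a distance constraint
    G = np.zeros(5000)       # Eq. (33); vanishes for a coordinate linear in r
    w = Z ** -0.5            # Blue Moon unbiasing factor |Z|^(-1/2)
    dW[k] = np.sum(w * (G / beta - lam)) / np.sum(w)   # Eq. (32): dW/dxi at xi'

# thermodynamic integration of the mean force, Eq. (29): trapezoidal rule
W = np.concatenate(([0.0], np.cumsum(0.5 * (dW[1:] + dW[:-1]) * np.diff(xi_grid))))
print(W - W.min())           # potential of mean force up to an additive constant
```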
3. Vectorial Reaction Coordinate
In some instances, the reaction path the system takes to go from reactants to products may not be simply related to a single scalar reaction coordinate which is chosen on physical grounds. An inappropriate choice of a reaction coordinate can lead to difficulties in the computation of the rate. The underlying structure of the reaction path may often be revealed by extending the description to vectorial reaction coordinates. For example, if the free energy surface as a function of two reaction coordinates ξ1 and ξ2 has the structure
shown in Fig. 4, then a description based on projections of the free energy along ξ₁ will lead to misleading results. The discovery of the appropriate set of reaction coordinates, or more generally the reaction path, may not be a simple task for some systems. To describe the free energy of such more complex situations, we suppose that the system can be characterized by a set of reaction coordinates which are functions of the positions of the particles in the system, ξ(r) = {ξ₁, ξ₂, . . . , ξ_n}. The probability density of finding ξ(r) = ξ′ is

$$P(\boldsymbol{\xi}') = \langle\delta(\boldsymbol{\xi}(r) - \boldsymbol{\xi}')\rangle = \left\langle\prod_{\alpha=1}^{n}\delta(\xi_\alpha(r) - \xi'_\alpha)\right\rangle.\qquad(41)$$

The free energy W(ξ′) associated with the vectorial reaction coordinate is

$$W(\boldsymbol{\xi}') = -\beta^{-1}\ln\left(P(\boldsymbol{\xi}')/P_u\right),\qquad(42)$$

where P_u is again a uniform probability density of ξ. This free energy W, or reversible work, needed to take the system from the vectorial reaction coordinate value ξ_a to ξ_b can be calculated by means of an n-dimensional line integral,

$$W(\boldsymbol{\xi}_b) - W(\boldsymbol{\xi}_a) = \int_{C(\boldsymbol{\xi}_a, \boldsymbol{\xi}_b)} d\boldsymbol{\xi}\cdot\frac{\partial W}{\partial\boldsymbol{\xi}},\qquad(43)$$

where C(ξ_a, ξ_b) is the path taken from ξ_a to ξ_b. Using Eqs. (41) and (42) and the properties of the delta function, one may show that

$$-\frac{\partial W}{\partial\boldsymbol{\xi}} = -\left\langle\frac{\partial H}{\partial\boldsymbol{\xi}}\right\rangle^{\rm cd}_{\boldsymbol{\xi}} = \mathbf{F}_{\boldsymbol{\xi}},\qquad(44)$$
Figure 4. Sketch of a free energy surface showing two metastable regions depicted as shaded domains in the figure. To obtain such a plot, the free energy is computed for specified values of two reaction coordinates, ξ₁(r) and ξ₂(r). The saddle point at (ξ₁‡, ξ₂‡) is indicated by a × symbol.
where F_ξ is the mean force associated with ξ. Following the procedure outlined in Section 2, we may write the (negative of the) mean force in the form [14]

$$\frac{\partial W}{\partial\boldsymbol{\xi}} = \frac{\left\langle|\mathbf{Z}|^{-1/2}\left[(1/\beta)\,\mathbf{G} - \boldsymbol{\lambda}\right]\right\rangle_{\boldsymbol{\xi}^{\ddagger}}}{\left\langle|\mathbf{Z}|^{-1/2}\right\rangle_{\boldsymbol{\xi}^{\ddagger}}},\qquad(45)$$

where

$$Z_{\alpha\beta} = \sum_{k}\frac{1}{m_k}\,\frac{\partial\xi_\alpha}{\partial r_k}\cdot\frac{\partial\xi_\beta}{\partial r_k},\qquad(46)$$

and we have defined the vector G with elements

$$G_\alpha = \sum_{i,n}\sum_{\mu\gamma\nu}\frac{1}{m_i m_n}\left[\mathbf{Z}^{-1}\right]_{\mu\alpha}\frac{\partial\xi_\mu}{\partial r_i}\cdot\frac{\partial^{2}\xi_\gamma}{\partial r_i\,\partial r_n}\cdot\frac{\partial\xi_\nu}{\partial r_n}\left[\mathbf{Z}^{-1}\right]_{\gamma\nu},\qquad \alpha = 1, \ldots, n,\qquad(47)$$

and λ is the vector of Lagrange multipliers appearing in the constrained equations of motion. The formalism for the most general treatment, in which there are both other constraints and vectorial reaction coordinate constraints, has been given in Ref. [15]. The Blue Moon ensemble has been used to compute the free energy as a function of several reaction coordinates for ionization reactions of [NaCl₂]⁻ ion complexes in water clusters [16]. The multidimensional reaction coordinate formalism described above has also been applied to study the interaction between monomers in the superoxide dismutase protein of Photobacterium leiognathi [14], which we now briefly describe. This protein provides a good example of macromolecular recognition since the monomers are able to form the dimeric enzyme in water. Calculation of the binding force for different mutant proteins, obtained by substituting the amino acids at the monomer–monomer interface, together with structural analysis, could provide insight into the recognition process. In Fig. 5, we give a pictorial view of the protein as found in nature.
Figure 5. Photobacterium leiognathi Cu,Zn SOD structure. The ribbon shows the fold of the two identical subunits constituting the dimer. Arrows represent the β-strands while thin wires represent the random-coil structure and the turns. The copper and zinc ions are shown as dark and light gray labelled spheres, respectively.
Figure 6. The graph shows the potential of mean force W (kcal/mol) as a function of the separation (Å).
The separation of the two monomers can be studied as a function of their relative displacement and orientation. This requires a six-dimensional reaction coordinate. A complete sampling of the phase space as a function of this six-dimensional coordinate is not feasible for the system under investigation (there are 2694 atoms in the proteins, which were solvated by 9944 water molecules). However, the principal component of the binding force can be calculated by freezing the slow orientational modes of the two monomers and studying their separation at a fixed orientation. One can choose the initial orientation of the monomers to be that minimizing the energy of the system; hence, the mean force along the relative separation distance can be calculated. The result of such a calculation is shown in Fig. 6. While the separation path one obtains is not fully realistic, it can be used to perform a series of identical numerical experiments on different mutants of Photobacterium leiognathi to investigate the important structural and dynamical features in the recognition process.
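The only nontrivial new object in the vectorial case is the matrix Z of Eq. (46). The sketch below evaluates it, and the unbiasing factor |Z|^(−1/2) entering Eq. (45), for a made-up four-atom system whose two reaction coordinates are taken to be two interatomic distances; the coordinates and masses are purely illustrative.

```python
import numpy as np

def dist_grad(r, i, j):
    """Gradient with respect to all atoms of xi = |r_i - r_j|."""
    g = np.zeros_like(r)
    d = r[i] - r[j]
    u = d / np.linalg.norm(d)
    g[i], g[j] = u, -u
    return g

r = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0],
              [1.2, 1.0, 0.0], [0.0, 1.0, 0.3]])   # made-up configuration
m = np.array([12.0, 1.0, 16.0, 1.0])               # made-up masses
grads = [dist_grad(r, 0, 1), dist_grad(r, 2, 3)]   # grad xi_1 and grad xi_2

n = len(grads)
Z = np.empty((n, n))
for a in range(n):
    for b in range(n):
        # Eq. (46): Z_ab = sum_k (1/m_k) grad_k xi_a . grad_k xi_b
        Z[a, b] = np.sum((grads[a] * grads[b]).sum(axis=1) / m)

print(Z)                          # diagonal 1/m_i + 1/m_j, off-diagonal zero here
print(np.linalg.det(Z) ** -0.5)   # |Z|^(-1/2), the unbiasing factor of Eq. (45)
```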
4. Outlook
The description of condensed phase activated rate processes is a challenging problem. The choice of a suitable reaction coordinate or set of reaction coordinates is a central feature of such descriptions. Often this choice is made on physical grounds, but schemes for determining reaction paths are needed to provide results when physical considerations are inadequate. Once such reaction coordinates are known, the methods described in this chapter provide algorithms for the computation of reaction rates. As briefly described in the text, the methods presented here have also been used to investigate adiabatic quantum rate processes. Non-adiabatic reaction rates may also be treated using the techniques developed here [17].
References
[1] C. Dellago, P.G. Bolhuis, and P.L. Geissler, "Transition path sampling," Adv. Chem. Phys., 123, 1–78, 2002.
[2] W. E and E. Vanden-Eijnden, "Conformational dynamics and transition pathways in complex systems," In: S. Attinger and P. Koumoutsakos (eds.), Lecture Notes in Computational Science and Engineering, vol. 39, Springer, Berlin, to be published, 2004.
[3] E. Carter, G. Ciccotti, J.T. Hynes, and R. Kapral, "Constrained reaction coordinate dynamics for the simulation of rare events," Chem. Phys. Lett., 156, 472–477, 1989.
[4] T. Yamamoto, "Quantum statistical mechanical theory of the rate of exchange chemical reactions in the gas phase," J. Chem. Phys., 33, 281–289, 1960.
[5] D. Chandler, "Statistical mechanics of isomerization dynamics in liquids and the transition-state approximation," J. Chem. Phys., 68, 2959–2970, 1978.
[6] R. Kapral, S. Consta, and L. McWhirter, "Chemical rate laws and rate constants," In: B. Berne, G. Ciccotti, and D. Coker (eds.), Classical and Quantum Dynamics in Condensed Phase Systems, World Scientific, Singapore, pp. 583–616, 1998.
[7] H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, MA, 1980.
[8] M. Fixman, "Classical statistical mechanics of constraints – a theorem and application to polymers," Proc. Natl. Acad. Sci. USA, 71, 3050–3053, 1974.
[9] N.G. van Kampen and J.J. Lodder, "Constraints," Am. J. Phys., 52, 419–424, 1984.
[10] M. Sprik and G. Ciccotti, "Free energy from constrained molecular dynamics," J. Chem. Phys., 109, 7737–7744, 1998.
[11] J.P. Ryckaert, G. Ciccotti, and H.J.C. Berendsen, "Numerical integration of the Cartesian equations of motion of a system with constraints – molecular dynamics of n-alkanes," J. Comput. Phys., 23, 327–341, 1977.
[12] G. Ciccotti and J.P. Ryckaert, "Molecular dynamics simulation of rigid molecules," Comput. Phys. Rep., 4, 345–392, 1986.
[13] D. Laria, G. Ciccotti, M. Ferrario, and R. Kapral, "Molecular dynamics study of adiabatic proton transfer reactions in solution," J. Chem. Phys., 97, 378–388, 1992.
[14] A. Sergi, G. Ciccotti, M. Falconi, A. Desideri, and M. Ferrario, "Effective binding force calculation in a dimeric protein by molecular dynamics simulation," J. Chem. Phys., 116, 6329–6338, 2002.
[15] I. Coluzza, M. Sprik, and G. Ciccotti, "Constrained reaction coordinate dynamics for systems with constraints," Mol. Phys., 101, 2885–2894, 2003.
[16] S. Consta and R. Kapral, "Ionization reactions of ion complexes in mesoscopic water clusters," J. Chem. Phys., 111, 10183–10191, 1999.
[17] A. Sergi and R. Kapral, "Quantum-classical dynamics of non-adiabatic chemical reactions," J. Chem. Phys., 118, 8566–8575, 2003.
5.5 ORDER PARAMETER APPROACH TO UNDERSTANDING AND QUANTIFYING THE PHYSICO-CHEMICAL BEHAVIOR OF COMPLEX SYSTEMS

Ravi Radhakrishnan¹ and Bernhardt L. Trout²
¹ Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
² Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Many physico-chemical processes such as nucleation events in phase transitions, chemical reactions, conformational changes of biomolecules, and protein folding are activated processes that involve rare transitions between stable or metastable states in the free energy surface. Understanding the underlying mechanism and computing the rates associated with such processes is central to many applications. For instance, the familiar process of nucleation of ice from supercooled water is encountered in several scientifically and technologically relevant processes. The formation of ice microcrystals in clouds via nucleation is a phenomenon that has a large impact in terms of governing global climatic changes. The key to the survival of Antarctic fish and certain species of beetles through harsh winters is their ability to inhibit nucleation of intracellular ice with the aid of antifreeze proteins. At the other end of the spectrum, certain protein assemblies called ice-nucleation agents are believed to be responsible for catalyzing ice nucleation, a phenomenon which is exploited by certain bacteria to derive nutrients from their host plants. Controlling the formation and propagation of intracellular ice is finding importance in the cryopreservation of natural and biosynthetic tissues. Similarly, one can cite many technologically relevant self-assembly and transport processes in the context of the fabrication of advanced materials for specific applications in drug delivery, biosensing, and chemical catalysis. A unifying feature among various activated events is that they can be understood in terms of transitions between a series of stable (global minimum)
or metastable (local minima) basins in the free energy landscape separated by free energy barriers or bottlenecks (also known as transition states). The metastable states represent high-probability regions and the transition states represent low-probability regions of phase space. For a system consisting of N atoms, the free energy landscape F = −k_B T ln Q is 3N-dimensional and related to the configurational partition function Q given by

$$Q = \int dr_1\, dr_2\cdots dr_N\;\exp(-\beta H(r_1, r_2, \ldots, r_N)).\qquad(1)$$
Here, kB is the Boltzmann constant, T is the temperature, r1 ,r2 , . . . , r N are coordinates of the N atoms in the system, β = 1/kB T , and H is the classical Hamiltonian giving the energy of the system for a given configuration. An activated process can be described by a set of pathways connecting the relevant metastable states in the free energy landscape (see Fig. 1). A Monte Carlo path or a molecular dynamics trajectory that captures an activated process is likely to be representative of a pathway for the transition in the sense that the trajectory will show many characteristics that are unique to the pathway. However, an ensemble of molecular dynamics trajectories (or Monte Carlo paths) connecting the metastable states of the free energy landscape, rather than a single trajectory or path best describes the mechanism of transition. Consequently, the intermediate states along the pathway can be
Figure 1. Model Hamiltonian H(r₁, r₂) for a system with two degrees of freedom. The stable (A) and metastable (B) states are shown along with the transition state. Three different paths connecting states A and B are shown on the contour projection.
characterized by unifying patterns (structural or energetic) that are common to the ensemble of molecular dynamics trajectories. In a statistical sense, identifying the dynamical variables to quantify the patterns and averaging over the different molecular configurations along the transition pathway can yield insight into the relationship between the evolution of the patterns and the free energy landscape. These dynamical variables, also referred to as order parameters, are quantities that can classify different metastable states according to their distinguishing characteristics (such as symmetries associated with different phases). In addition, the order parameters depend on the nature of intermolecular forces, solvent degrees of freedom, etc., and consequently, the chemical environment will impact their choice. In the simple example in Fig. 1, the intermediate states and the transition pathway are easily described in terms of the variables r1 and r2 , which serve as good order parameters. As we shall see throughout this article, the information about the free energy landscape obtained from such an approach is useful in quantifying the rate of the activated process while the evolution of the order parameters along the pathway is useful in understanding the underlying mechanism. Furthermore, equilibrium properties of the system, and how they depend on the control variables, can be inferred from the relationship between the order parameters and the free energy landscape. Conceptual understanding of the nature of the process, which arises from the relationship between the order parameters and the free energy landscape, generally goes by the name of “a phenomenological theory”. More specifically, the phenomenology arises from identifying how the order parameters (and hence the rate of the process) are influenced by the chemical environment, state variables, and control variables, so that the order parameters can themselves be ascribed physical meaning and interpretation.
1. Relationship between the Order Parameters and the Free Energy Function
The use of order parameters to construct phenomenological theories (top-down approach) of phase transitions in condensed matter and solid-state systems was pioneered by Landau, Ginzburg, De Gennes, and others. In this approach, the order parameter is chosen based on physical grounds and intuition, and the free energy functional (as a function of the order parameter) is constructed based on symmetry arguments. The literature in this class of problems is extensive, with applications ranging from superconductivity and superfluidity to magnetic and liquid-gas transitions and the theory of liquid crystals. A comprehensive treatise on the study of phase transitions using
phenomenological approaches, including universality associated with critical behavior and the renormalization group method, is provided in Refs. [1, 2]. Order parameters can also be used to construct the free energy functional by coarse graining (bottom-up approach) of the microscopic Hamiltonian. For complex systems, for which the construction of an analytic free energy functional may be nontrivial, the free energy as a function of the order parameters can be obtained via density functional theory [3], or via molecular simulations. In this article, we discuss the latter approach. Starting from a set of n order parameters (φ₁, φ₂, . . . , φ_n), the free energy density Φ[φ′₁, φ′₂, . . . , φ′_n] (also called the potential of mean force or Landau free energy) along the order parameters is related to the microscopic Hamiltonian by [1]

$$\exp(-\beta\Phi[\phi'_1, \phi'_2, \ldots, \phi'_n]) = \int dr_1\, dr_2\cdots dr_N\,\exp(-\beta H(r_1, r_2, \ldots, r_N))\;\delta(\phi_1 - \phi'_1)\,\delta(\phi_2 - \phi'_2)\cdots\delta(\phi_n - \phi'_n).\qquad(2)$$

Here, δ is the Dirac delta function, and Φ[φ′₁, φ′₂, . . . , φ′_n] is to be interpreted as Φ[φ₁ = φ′₁, φ₂ = φ′₂, . . . , φ_n = φ′_n]. The free energy F is then given by

$$\exp(-\beta F) = \int d\phi'_1\, d\phi'_2\cdots d\phi'_n\;\exp(-\beta\Phi[\phi'_1, \phi'_2, \ldots, \phi'_n]).\qquad(3)$$

The domain of integration in the above equation covers the range of order parameter values characterizing the particular state. In the example given in Fig. 1, exp(−βΦ[φ₁, φ₂]) ≡ exp(−βΦ[r₁, r₂]) = exp(−βH(r₁, r₂)), and the free energy of state A is obtained by integrating over r₁ in the range (0–30) and over r₂ in the range (30–60). The value of the free energy is insensitive to the exact values defining the domain of integration, as long as the region containing the minimum of the function is included. In the course of a microscopic simulation (such as molecular dynamics or Monte Carlo), the free energy density is calculable by collecting histograms of the distribution of the order parameters. If the sampling in the simulations is ergodic (i.e., encompasses the relevant phase space) and sufficiently long, these histograms are proportional to the joint probability distribution of the order parameters, P[φ₁, φ₂, . . . , φ_n]. The free energy density is related to P[φ₁, φ₂, . . . , φ_n] by

$$\beta\Phi[\phi_1, \phi_2, \ldots, \phi_n] = -\ln(P[\phi_1, \phi_2, \ldots, \phi_n]) + {\rm constant}.\qquad(4)$$
In order to circumvent the problem associated with ergodicity, the histograms are evaluated in separate windows of the order parameter ranges using the procedure of umbrella sampling [4]. The umbrella sampling can be understood as performing the simulations in an extended ensemble whose free
energy Ψ is related to the free energy F of the original thermodynamic ensemble by

$$\Psi = F - \sum_i h_i \phi_i,\qquad(5)$$

where Σ_i h_iφ_i is chosen a priori as a weighting function W{φ₁, φ₂, . . . , φ_n}. The umbrella sampling scheme amounts to simulations being performed using a modified Hamiltonian H′ = H + W{φ₁, φ₂, . . . , φ_n}. The probability distribution P′[φ₁, φ₂, . . . , φ_n] in the modified ensemble defined by Ψ is related to P[φ₁, φ₂, . . . , φ_n] by

$$-k_B T \ln(P[\phi_1, \phi_2, \ldots, \phi_n]) = -k_B T \ln(P'[\phi_1, \phi_2, \ldots, \phi_n]) - W\{\phi_1, \phi_2, \ldots, \phi_n\}.\qquad(6)$$
In the simplest case, the weighting function is a hard wall: W{φ1, φ2, ..., φn} = 0 for φ1,min < φ1 < φ1,max, with configurations outside the window excluded from the sampling. This enables the calculation of P[φ1, φ2, ..., φn] in the range φ1,min < φ1 < φ1,max. Performing such calculations over several windows covering the entire range of φ1 (of relevance) enables an accurate calculation of P[φ1, φ2, ..., φn]. In addition, enhanced sampling methods such as configurational bias sampling [5], parallel tempering [6], density-of-states Monte Carlo [7], and methods based on Tsallis statistics [8] can be used to improve the accuracy of the calculations.
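A minimal sketch of the window procedure follows (our own construction; `run_mc` stands for a user-supplied, hypothetical Monte Carlo sampler that rejects any move taking φ1 outside the window, i.e., the hard-wall weight above):

import numpy as np

def window_profile(run_mc, phi_min, phi_max, n_samples, bin_edges):
    """beta*Lambda(phi) inside one umbrella window, on a global bin grid;
    bins outside the window remain NaN. Known only up to a constant."""
    phi = np.fromiter(run_mc(phi_min, phi_max), dtype=float, count=n_samples)
    hist, _ = np.histogram(phi, bins=bin_edges, density=True)
    return np.where(hist > 0, -np.log(np.where(hist > 0, hist, 1.0)), np.nan)

def stitch(profiles):
    """Join per-window profiles by shifting each so that it agrees with the
    accumulated curve on the bins where the two windows overlap."""
    total = profiles[0].copy()
    for f in profiles[1:]:
        both = ~np.isnan(total) & ~np.isnan(f)
        shifted = f + np.nanmean(total[both] - f[both])  # match in overlap
        total = np.where(np.isnan(total), shifted, total)
    return total

The stitching step exploits the fact that each window determines βΛ only up to an additive constant, which is fixed by requiring agreement in the overlap regions.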
2. Types of Order Parameters
Order parameters have been used extensively in conjunction with molecular simulations in applications involving solid and liquid-crystalline (LC) phases, in which one of the phases is characterized by long-range order. In such cases, the order parameter can be chosen on the basis of the symmetry of the ordered phase. A few examples are given in Fig. 2. For phases with long-range order (i.e., ⟨φ(0)φ(r)⟩ → nonzero constant as r → ∞), the order parameter assumes a nonzero value. For disordered phases (i.e., ⟨φ(0)φ(r)⟩ ∼ exp(−r/λ), λ being the correlation length), the order parameter is zero for an infinite system. For phases with quasi-long-range order (i.e., ⟨φ(0)φ(r)⟩ ∼ r^(−η)), the order parameter in a finite system assumes a value intermediate between those of the disordered and ordered phases, with a system-size dependence characterized by the exponent η. The Mermin order parameter [9] was introduced to quantify order in a two-dimensional crystal of circular disks, for which the only close packing possible is hexagonal, i.e., leading to a triangular lattice. More generally, for "N-atic" order in two-dimensional systems, the pair correlation function g(r) ≡ g(r, θ) in a cylindrical coordinate system (r ≡ re^{iθ}) can be expressed as a Fourier series in the angular (θ) variable
Figure 2. Order parameters describing bond-orientational order in condensed phases.
$$g(r, \theta) = \sum_{j} g_j(r)\, \exp(iNj\theta), \qquad (7)$$
where the summation over j runs from 0 to ∞. The expansion coefficients (i.e., the g_j(r)'s) are suitable order parameters. For hexatic (6-fold) symmetry, the dominant order parameter is given by the first term in the expansion
evaluated at the nearest-neighbor distance, i.e., g1(r = rnn), which is the same as the Mermin order parameter Ψ6(r) in Fig. 2. The Steinhardt order parameters [10] are a generalization of the above definition to three-dimensional systems. The pair correlation function g(r) ≡ g(r, θ, φ) in spherical polar coordinates is expanded in terms of a series of spherical harmonics,

$$g(\mathbf{r}) = g_0(r)\left[1 + \sum_{l}\sum_{m} Q_{lm}(r)\, Y_{lm}(\theta, \phi)\right], \qquad (8)$$
where the summation over l runs from 0 to ∞ and that over m runs from −l to +l. The coefficients Q_lm are related to the Steinhardt order parameters (Fig. 2), which are useful in differentiating between the various crystal types in three-dimensional systems. For water-like molecules (which have a propensity for tetrahedral coordination, owing to their hydrogen-bonding nature), the tetrahedral order parameter [11] in Fig. 2 measures the degree to which the nearest-neighbor molecules are tetrahedrally coordinated with respect to a given molecule. The tetrahedral order parameter is a three-body order parameter, which probes the local tetrahedral symmetry around each (water-like) molecule, and it is sensitive to the formation of crystalline structures in water. Similarly, the nematic order parameter [1] in Fig. 2 quantifies the degree of nematic order (i.e., parallel ordering of anisotropic molecules along their longitudinal axis) in liquid-crystalline systems. The nematic order parameter is characterized by a two-fold symmetry (the longitudinal axis is a headless director, i.e., with no up or down direction), and it is therefore closely related to the Y_2m terms of the Steinhardt order parameters. In the examples given in Fig. 2, the definitions of the order parameters are based on bond-orientational order, i.e., the orientation of nearest-neighbor (or molecular) bonds. In each case, the order parameters quantify the degree of crystalline order in the system; they therefore assume nonzero (distinct) values in the crystalline phase, which reduce (mostly to zero) in the disordered phase.
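As an illustration, here is a minimal sketch (our own, not code from the article) of a local bond-orientational order parameter of the Mermin type for a two-dimensional configuration; `points` is an (N, 2) coordinate array and `neighbor_lists` holds each particle's nearest neighbors (e.g., from a Voronoi construction) — both are assumed inputs:

import numpy as np

def psi6(points, neighbor_lists):
    """Local hexatic order parameter |<exp(6 i theta_jk)>| over the bonds
    from particle j to its neighbors k; close to 1 on a triangular lattice
    and near 0 in a disordered fluid."""
    out = np.empty(len(points))
    for j, nbrs in enumerate(neighbor_lists):
        bonds = points[list(nbrs)] - points[j]        # neighbor bond vectors
        theta = np.arctan2(bonds[:, 1], bonds[:, 0])  # bond angles
        out[j] = np.abs(np.exp(6j * theta).mean())
    return out

The three-dimensional Steinhardt invariants are built analogously, with exp(6iθ) replaced by spherical harmonics Y_lm of the bond directions, averaged over bonds and summed over m.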
3. Applications of Order Parameters

3.1. Quantification of Disorder
Torquato and coworkers [12] have used the three-dimensional Steinhardt order parameters, along with a translational order parameter t (based on the radial distribution function g(r)), to quantify the degree of disorder in densely packed materials. For a system of hard spheres in three dimensions, the authors
computed order parameter maps (Q6 vs. t) for the liquid and crystal phases at equilibrium, and for a series of metastable states with jammed configurations (i.e., configurations in which a given particle cannot be displaced when the rest of the particles in the system are fixed). The authors found that the translational order parameter was always positively correlated with the bond-orientational order parameter (i.e., an increase in one led to an increase in the other) for the hard-sphere system. Additionally, the bond-orientational order parameter (being the more sensitive measure of the two) increased monotonically with increasing volume packing fraction for the jammed structures, over the entire range of packing fraction between the equilibrium liquid and crystal. The authors concluded that the concept of random close packing is ill-defined, based on the observation that the packing fraction of jammed structures can always be increased further at the cost of an arbitrarily small increase in the bond-orientational order parameter. The results also supported the view that glassy structures are not merely liquid-like structures with "frozen-in" disorder, because they were characterized by distinctly different values of Q6, intermediate between those of the liquid and crystal phases.
3.2. Anomalies of Liquid Water
Errington and Debenedetti [13] have advanced a formalism for understanding the structure–property relationship in liquid water on the basis of order parameter maps. Based on the values of the translational order parameter t and the tetrahedral order parameter ξ for liquid water at equilibrium, the authors traced paths of the system in ξ–t space. Each path was obtained at constant temperature as the density was gradually increased in their computer simulations. Unlike the hard-sphere case (where Q6 and t are positively correlated and increase monotonically with increasing volume fraction), the authors found structurally anomalous regions (of state space, e.g., in the temperature–density plot) of water in which "order" decreased with increasing density. The boundary of the structurally anomalous region was identified by the ξ and t extrema on the ξ–t traces at different temperatures. The authors also found that the structural anomaly in water was correlated with a transport anomaly (in which the diffusion coefficient increases with increasing density) and with a thermodynamic anomaly (in which the coefficient of thermal expansion is negative); in particular, the anomalous regions occur as a cascade, i.e., the structurally anomalous region encompasses the region characterized by the transport anomaly, which in turn encompasses the region showing the thermodynamic anomaly. In a subsequent work, the authors also investigated order parameter maps for a system of Lennard-Jones particles and found that the LJ system displayed the same qualitative behavior as the system of hard spheres, which further
supports the view that the anomalies in water are directly related to the chemical structure of the water molecule.
3.3. Nucleation of Crystalline Forms from Liquid
In a pioneering study, Frenkel and coworkers [14] calculated the free energy barrier to crystal nucleation in a system of Lennard-Jones particles in three dimensions by employing the Steinhardt order parameters and using the formalism in Section 2. The authors showed that the path to nucleation of the stable face-centered cubic phase (when the liquid is supercooled below the freezing temperature) can be described in terms of increasing values of Q6 while the increase of W4 is simultaneously suppressed. The path in order parameter space along which both Q6 and W4 increase simultaneously leads the system into a metastable body-centered cubic phase. Radhakrishnan and Trout [15–17] extended the above approach to describe ice nucleation in a variety of homogeneous and inhomogeneous environments, including hexagonal ice in the bulk, cubic ice under an external electric field and in a confined system, and clathrate hydrates in a supersaturated aqueous solution containing the hydrophobic solute CO2. To calculate the free energy barrier to nucleation, the authors employed the two-body Steinhardt order parameters and the three-body tetrahedral order parameters for the one-component systems, and additionally used a translational order parameter based on g(r) for the two-component aqueous solution of CO2. The authors found that as the successive density modes in the liquid (quantified by peaks in the direct correlation function) became more correlated – owing to a decrease in temperature, the influence of an external potential, or increased inhomogeneity – the free energy barrier to nucleation decreased. Interpreting their results in light of the density functional theory of freezing, the authors discovered an inverse correlation between the degree of coupling of the successive density modes in the liquid phase and the free energy barrier to nucleation. Rutledge and coworkers [18], using molecular dynamics simulations, have studied crystal nucleation in a polymer melt. The time scale on which nucleation of a crystal phase in a polymer melt can be observed is normally beyond the scope of molecular dynamics simulations. The authors, however, found that under an applied uniaxial (extensional) stress the barrier to nucleation is reduced considerably, to the extent that crystal nucleation can be captured in the simulations. The crystalline domains in the simulations were identified based on the local value of the nematic order parameter. The authors concluded that the addition of a large deforming stress accelerates the crystallization process by driving the individual chains into a low-energy torsional conformation and by aligning them in a single direction, which leads to a lowering of the nucleation barrier.
Shetty and coworkers [19] have proposed a formalism, based on pattern recognition and genetic algorithms, for constructing new types of order parameters that quantify local order in inhomogeneous systems (such as a crystal–melt interface).
3.4. Solvation in Biomolecules
The contribution of hydration to molecular assembly and enzyme catalysis has long been recognized. Many biomolecules are characterized by surfaces containing extended nonpolar regions, and the aggregation and subsequent removal of such surfaces from water is believed to play a critical role in biomolecular assembly in cells. Conventional views hold that the hydration shell of small hydrophobic solutes is clathrate-like, characterized by local cage-like hydrogen-bonding structures and a distinct loss in entropy. Using molecular dynamics simulations of the solvated polypeptide melittin, Cheng and Rossky [20] found that the hydration of extended nonpolar planar surfaces appears to involve structures that are orientationally inverted relative to clathrate-like hydration shells, with unsatisfied hydrogen bonds directed towards the hydrophobic surface. The authors employed bond-orientational order parameters to classify the local structuring of the solvent. Based on the correlation between the observed values of the order parameters and the average binding energy (i.e., the interaction of a molecule with all other molecules in the system) of proximal water molecules in each surface set, they concluded that the clathrate-like and inverted clathrate-like structures are distinguished by a substantial difference in the water–water interaction enthalpy, and that their relative contributions depend strongly on the surface topography of the melittin molecule. Clathrate-like structures dominate near convex surface patches, whereas the hydration shell near flat surfaces fluctuates between clathrate-like and less-ordered or inverted structures. The strong influence of surface topography on the structure and free energy of hydrophobic hydration is likely to be a generic feature, which may be important for many biomolecules.
3.5. Freezing of Inhomogeneous Fluids in Porous Media
Molecular simulations of simple fluids confined in model slit-shaped pores show a freezing behavior that is governed by the strength of the fluid–wall interaction relative to the fluid–fluid interaction (quantified by a parameter α) and by the pore width H. The shift in the freezing temperature, Tf,pore − Tf,bulk, is found to be positive if the fluid–wall interaction is more strongly attractive than the fluid–fluid interaction, and negative if the fluid–wall interaction is less attractive than the fluid–fluid interaction.
Using the Mermin order parameters and the free energy formalism in Section 2, Radhakrishnan and coworkers [21] discovered the presence of several thermodynamically stable intermediate phases lying between the liquid phase and the solid phase in computer simulation studies of Lennard-Jones molecules in slit pores. Their studies led to the conclusion that the contact layers, i.e., the layers closest to the pore walls, freeze at a higher temperature than the inner layers for strongly attractive pores (large values of α); thus the intermediate phase has the structure termed "contact-crystalline", i.e., the contact layers are crystalline while the inner layers are liquid-like. For moderate values of α, the contact layers are liquid-like while the inner layers are crystalline, and the intermediate phase exists as a "contact-liquid" phase. The authors also found that for repulsive and weakly attractive walls the intermediate (contact-layer) phase is at best metastable, and thus only the liquid and crystal phases are stable. Based on these observations, the authors constructed "global phase diagrams" which present a unifying picture of the freezing behavior of confined phases in terms of the parameter α and the pore width H. In a later study [22], the authors also found evidence for the existence of a hexatic phase as an intermediary between the fluid and crystalline phases. The hexatic phase is a manifestation of the fact that, in a continuous symmetry-breaking transition such as the freezing transition, the translational symmetry and the rotational symmetry can break at two different temperatures. Thus, in the liquid-to-hexatic transition the rotational symmetry is broken, and in the hexatic-to-crystalline transition the translational symmetry is broken. Hexatic phases, which retain long-range orientational, but not positional, order are known to occur in infinite quasi-two-dimensional systems; the authors established their presence in the simulations using a system-size scaling analysis. They also found that for pore sizes accommodating more than three adsorbed molecular layers, a "contact-hexatic" phase was the stable phase (in the simulations), in which the contact layers are hexatic while the inner layers are liquid-like.
4. Assumptions in the Order Parameter Approach and Outlook

As illustrated in this article, the order parameter approach can be gainfully employed in studies of complex systems. In problems where an obvious symmetry is involved, defining a suitable set of order parameters is an easy task. The various examples described here demonstrate that these order parameters can be used to associate a defining characteristic of the free energy landscape with the observed phenomenon. The increased physical insight and phenomenological understanding, however, come at the expense of several inherent assumptions in the order parameter approach described here, which we discuss below.
4.1. Physical Significance of the Order Parameters
Based on the relationship between the order parameters and the free energy landscape, it is tempting to associate the order parameters with physical variables. However, in most cases the order parameters appear as coefficients of expansion of an extensive thermodynamic variable. Consequently, depending on the choice of expansion, the definitions of the order parameters vary and are certainly not unique. Therefore, it is not guaranteed a priori that the order parameters represent physical variables. In the examples described here, the physicality of the order parameters can be understood on the basis of density functional theory. Since the free energy F is a unique functional of the spatially varying density ρ(r), and the order parameters in Fig. 2 are all based on expansions of g(r) (where ρg(r) = ρ(r)), the order parameters can be ascribed to physical variables if the particular density mode they characterize can be identified. In general, for other choices of order parameters, this connection must be established. Another underlying assumption is that the phenomena of interest (occurring in 3N-dimensional configuration space) can be described in terms of a small number of order parameters. Although a rigorous proof in support of this assumption may not be possible, an argument based on a few general characteristics of physical systems may be put forward in its defense. If our objective is to correlate the equilibrium properties of the system using the order parameters, the order parameters can once again be chosen as the coefficients of expansion of g(r). At first glance, it appears that one needs to include an infinite set of order parameters to have a rigorous theory. However, correlations in g(r) most often die out beyond molecular length scales (even in the ordered phase, at finite temperatures), and therefore only order parameters corresponding to density modes relevant to these length scales need to be included, which reduces their number to a few. If the objective is to describe an activated process, only those dynamical variables corresponding to the slowest modes (which are identifiable by principal component analysis of a molecular dynamics trajectory) in the transition pathway need to be included, greatly reducing the number of relevant order parameters.
4.2. Rate Processes
Within the order parameter framework, additional assumptions must be invoked to compute the rates of activated processes. A straightforward and conceptually appealing description is provided by transition state theory, according to which the rate k is given by

$$k = A \exp(-\beta \Delta F), \qquad (9)$$
where ΔF is the free energy barrier along the reaction pathway (i.e., the free energy of the transition state relative to that of the reactants), and A is a pre-factor related to the inherent frequency of barrier crossing. For an ideal gas, A = (βh)⁻¹, h being Planck's constant. More generally, A is given by the inverse of the time scale over which the order parameter correlation function ⟨φ(0)φ(t)⟩, evaluated at the transition state, decays. Both terms (i.e., ΔF and A) can be important in calculating the rate. In closing, we note that methods exist for verifying the existence of a transition state in a multidimensional free energy landscape independently of the order parameter formalism, which can validate the findings of the order parameter approach [23]. Alternative approaches to treating activated processes, which are independent of the existence of order parameters (and hence of the related approximations), have also been developed [24]. Nevertheless, the order parameter approach continues to be widely employed because of its simplicity, computational efficiency, and phenomenological appeal.
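To put numbers to Eq. (9), a small sketch (our own illustration; the 0.5 eV barrier and 300 K temperature are arbitrary, not values from the article):

import numpy as np

kB = 8.617e-5    # Boltzmann constant (eV/K)
h = 4.136e-15    # Planck constant (eV s)

def tst_rate(dF, T, A=None):
    """Transition-state-theory rate k = A exp(-beta*dF), Eq. (9). If no
    pre-factor is given, the ideal-gas value A = 1/(beta*h) is used."""
    beta = 1.0 / (kB * T)
    if A is None:
        A = 1.0 / (beta * h)        # ~6.3e12 s^-1 at 300 K
    return A * np.exp(-beta * dF)

print(tst_rate(0.5, 300.0))         # ~2.5e4 barrier crossings per second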
References
[1] P.M. Chaikin and T.C. Lubensky, Principles of Condensed Matter Physics, Cambridge University Press, Cambridge, 1995.
[2] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, Addison-Wesley, New York, 1992.
[3] D. Henderson (ed.), Fundamentals of Inhomogeneous Fluids, Marcel Dekker, New York, 1992.
[4] D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, Oxford, 1987.
[5] D. Frenkel and B. Smit, Understanding Molecular Simulations: From Algorithms to Applications, 2nd edn., Academic Press, San Diego, 2001.
[6] R.H. Zhou, B.J. Berne, and R. Germain, "The free energy landscape for beta hairpin folding in explicit water," Proc. Natl. Acad. Sci. USA, 98, 14931–14936, 2001.
[7] F.G. Wang and D.P. Landau, "Efficient, multiple-range random walk algorithm to calculate the density of states," Phys. Rev. Lett., 86, 2050–2053, 2001.
[8] C. Tsallis, "Possible generalization of the Boltzmann–Gibbs statistics," J. Stat. Phys., 52, 479–487, 1988.
[9] N.D. Mermin, "Crystalline order in 2 dimensions," Phys. Rev., 176, 250, 1968.
[10] P.J. Steinhardt, D.R. Nelson, and M. Ronchetti, "Bond-orientational order in liquids and glasses," Phys. Rev. B, 28, 784–805, 1983.
[11] P.L. Chau and A.J. Hardwick, "A new order parameter for tetrahedral configurations," Mol. Phys., 93, 511–518, 1998.
[12] S. Torquato, T.M. Truskett, and P.G. Debenedetti, "Is random close packing of spheres well defined?" Phys. Rev. Lett., 84, 2064–2067, 2000.
[13] J.R. Errington and P.G. Debenedetti, "Relationship between structural order and the anomalies of liquid water," Nature, 409, 318–321, 2001.
[14] R.M. Lynden-Bell, J.S. van Duijneveldt, and D. Frenkel, "Free-energy changes on freezing and melting ductile metals," Mol. Phys., 80, 801–814, 1993.
[15] R. Radhakrishnan and B.L. Trout, "A new approach for studying nucleation phenomena using molecular simulations: application to CO2 hydrate clathrates," J. Chem. Phys., 117(4), 1786, 2002.
[16] R. Radhakrishnan and B.L. Trout, "Nucleation of crystalline phases of water in homogeneous and inhomogeneous environments," Phys. Rev. Lett., 90, 158301, 2003.
[17] R. Radhakrishnan and B.L. Trout, "Nucleation of hexagonal ice Ih in liquid water," J. Am. Chem. Soc., 125, 7743, 2003.
[18] M.S. Lavine, N. Waheed, and G.C. Rutledge, "Molecular dynamics simulation of orientation and crystallization of polyethylene during uniaxial extension," Polymer, 44, 1771–1779, 2003.
[19] R. Shetty, F. Escobedo, D. Choudhary, and P. Clancy, "Characterization of order in simple materials. A pattern recognition approach," J. Chem. Phys., 117, 4000–4009, 2002.
[20] Y.K. Cheng and P.J. Rossky, "Surface topography dependence of biomolecular hydrophobic hydration," Nature, 392, 696–699, 1998.
[21] R. Radhakrishnan, K.E. Gubbins, and M. Sliwinska-Bartkowiak, "Global phase diagrams for freezing in porous media," J. Chem. Phys., 116, 1147–1155, 2002.
[22] R. Radhakrishnan, K.E. Gubbins, and M. Sliwinska-Bartkowiak, "Existence of a hexatic phase in porous media," Phys. Rev. Lett., 89, 076101, 2002.
[23] P.G. Bolhuis, C. Dellago, and D. Chandler, "Reaction coordinates of biomolecular isomerization," Proc. Natl. Acad. Sci. USA, 97, 5877–5882, 2000.
[24] P.G. Bolhuis, D. Chandler, C. Dellago, and P. Geissler, "Transition path sampling: throwing ropes over rough mountain passes, in the dark," Ann. Rev. Phys. Chem., 53, 291–318, 2002.
5.6 DETERMINING REACTION MECHANISMS

Blas P. Uberuaga and Arthur F. Voter
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
The articles in this chapter review methods for determining reaction rates for systems in which transitions from state to state are infrequent. Building on seminal concepts such as transition state theory, advances in the last 25 years have given us efficient techniques for finding saddle points and computing accurate rate constants, both classically and quantum mechanically. One motivation for calculating rate constants is that they can be supplied as input to higher-level simulation models, such as the kinetic Monte Carlo (KMC) method, which are the subject of the next chapter. For a KMC simulation to provide dynamical predictions at a desired level of quality, the accuracy of the rate constants should be of equal quality, but, just as important, the list of rates needs to be complete; it should contain all of the relevant reaction mechanisms. This latter challenge is often more daunting than the former. In this article, we discuss specific examples illustrating the difficulty in identifying the important reaction mechanisms. Through these examples, we demonstrate the use of methods for searching for these mechanisms and the importance of doing so. Realistic systems often behave in ways we least expect, and if we rely on our intuition about what reaction pathways exist, we are almost certain to miss important mechanisms. We focus on infrequent-event systems in which each state corresponds to a basin in the potential energy surface, the typical case for materials processes, although some of the concepts can be generalized to free-energy basins (i.e., entropically trapped states). As we consider making up a list of rates, these systems generally fall into one (or more) of four categories:

Category 0: systems in which all the reaction mechanisms are known, and can be specified in advance.
Category 1: systems with reaction mechanisms that are unexpected, lying outside our intuition.
Category 2: systems with reaction mechanisms that only become important (or come into existence) when the system conditions are varied.
Category 3: systems with reaction mechanisms that become important (or come into existence) after evolution of the system causes it to change its character.

Systems from category 0 are ideal for kinetic Monte Carlo simulation. Knowing all the reaction mechanisms, we can, in principle, compute the rate constant for each one, forming a complete rate catalog [1]. If the rate constants provided to the KMC algorithm are exact, the resulting dynamics will also be exact – an extremely appealing situation. However, in essence, category 0 exists only for model systems. Systems for which all the reaction mechanisms are known in advance are ones we have made that way by construction. Pure model systems play an important role in understanding materials properties and testing methodology, but often we are interested in accurately modeling realistic systems or obtaining accurate results for the dynamics of a particular interatomic potential. As physical systems are allowed to evolve freely, they inevitably show us unexpected reaction mechanisms. This puts them into category 1 at least, and varying the conditions may also reveal additional mechanisms (category 2). The problem is that, often, a system must evolve for times longer than those accessible via MD before this behavior becomes evident. Worse, if the system also belongs to category 3, which we typically do not know in advance, we will need to follow the dynamics for much longer times. Frustratingly, this time-scale issue is the very reason we appealed to KMC in the first place, and now we are faced with it again just to find the relevant rate constants.
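For orientation, the KMC step that consumes such a rate catalog can be sketched in a few lines. This is a generic residence-time (BKL-type) algorithm of our own, not code from the work discussed here; it makes plain why a missing escape path corrupts both the event sequence and the simulated clock:

import numpy as np

rng = np.random.default_rng(0)

def kmc_step(rates):
    """One residence-time KMC step. `rates` must list the rate constant of
    every escape path out of the current state; an event is chosen with
    probability proportional to its rate, and the clock advances by an
    exponentially distributed waiting time with mean 1/k_tot. If a relevant
    path is absent from `rates`, both quantities are systematically biased."""
    rates = np.asarray(rates, dtype=float)
    k_tot = rates.sum()
    event = rng.choice(len(rates), p=rates / k_tot)
    dt = rng.exponential(1.0 / k_tot)   # stochastic escape time
    return event, dt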
1. Approaches to Exploring the System
Accelerated molecular dynamics (AMD) methods [2], described elsewhere in this handbook [3], offer a way around this dilemma. They give us a way to probe the system dynamics without the limitations of our imposed intuition. In the accelerated dynamics approach, the system is allowed to evolve according to the classical equations of motion, as in a direct molecular dynamics simulation. However, by design, during this evolution, state-to-state transitions are coaxed into occurring more rapidly, though in a way still faithful to the dynamics of the system. The result is that, with these methods, the system evolution can be followed over much longer time scales than are accessible to direct molecular dynamics simulation. Reaction mechanisms that are too slow to be observed with direct MD occur naturally during the evolution of an AMD simulation. An alternative to AMD for discovering reaction mechanisms and probing long-time dynamics is to collect a set of escape paths from a given potential
basin. One way of doing this is to search for saddle points bounding the basin. For example, in the dimer method, described elsewhere in this chapter [4], efficient mode-following searches are initiated at random positions in the basin. Such methods offer a convenient way to scan for possible reaction mechanisms out of a given state and can be employed to probe long-time dynamics. By performing a large number of saddle searches, the majority of relevant escape paths can be found. If the escape rate is then computed for each of these paths, one can be chosen in an on-the-fly kinetic Monte Carlo (OFKMC) step. This takes the system to a new state, and the procedure is then repeated. One must keep in mind, however, that if any of the escape paths are missed in the saddle searches, then the OFKMC step to the next state may not advance the system in a dynamically correct way. If low-barrier escape paths are missed, the dynamical evolution will almost certainly be corrupted. The philosophy of this saddle-finding approach differs from that of the accelerated dynamics methods. In AMD, no attempt is made to determine all possible mechanisms for escape from a state (potential basin). Instead, the goal is to propagate the system from state to state as fast as possible while maintaining accuracy, in the sense that the probability of choosing a given escape path from a state is proportional to the rate constant for that path. In exact evolution (e.g., via MD), the system chooses an appropriate escape path without knowing about the other possible escape paths, and the power of AMD comes from adopting this philosophy. As tools for determining reaction mechanisms to be supplied as input to KMC, the AMD approach and the saddle-finding approach are complementary. To probe long times accurately, e.g., for category-3 systems, the higher fidelity of the AMD approach may be advantageous. To generate a list of reaction mechanisms out of a given type of state, the saddle-finding approach is typically more efficient. To illustrate the main point of this discussion, we now give examples of systems for each of the three categories listed above; in each case, an incomplete (intuition-based) KMC table would result in incorrect dynamics. The category-1 example is historical, concerning the diffusion behavior on the fcc(100) surface, perhaps the simplest of metal surfaces. For the surface-diffusion community, at least, this was where the issue of mechanism complexity and rate-catalog incompleteness first became apparent. The second example, the diffusion of interstitial hydrogen in an fcc lattice of fullerenes, illustrates a category-2 system in which parallel replica dynamics (an AMD method) was instrumental in uncovering an unexpected mechanism that, in turn, qualitatively altered the KMC predictions. In the last (category-3) example, taken from a study of radiation damage annealing in MgO, a surprising and unexpected mechanism turned up only after a temperature accelerated dynamics simulation was used to evolve interstitial clusters for seconds.
2. Category 1: Surface Exchange on FCC(100)
The simplicity of the fcc(100) surface has long made it a prototype case for studies of metallic surface diffusion and growth. An isolated adatom rests in a four-fold hollow with four available ⟨011⟩ directions for hopping to an adjacent site. Because atoms in the surface layer make four nearest-neighbor bonds, i.e., the surface is tightly packed, it was originally assumed that these hopping events were the only diffusive mechanism available to the adatom, and kinetic Monte Carlo models [1] were based on this assumption. However, density functional theory calculations on Al [5] showed that an adatom can exchange with a surface-layer atom, leading to a ⟨100⟩ displacement to a second-nearest-neighbor position, as shown in Fig. 1. For a number of fcc metals (e.g., Al, Pt, Au), this mechanism is not only available, but substantially favored over hopping. The exchange event can easily be observed in a high-temperature molecular dynamics simulation using embedded-atom method potentials [6], but up to that time the community was so confident in its intuition about adatoms hopping on fcc(100) that no such simulation had been performed. In subsequent years, exchange processes were discovered to play a dominant role, providing low-energy paths for surface smoothing, diffusion over step edges, and interface mixing (e.g., see Refs. [7, 8]), even for the metals where hopping is favored for an isolated adatom. Many more complicated processes have been discovered as well (e.g., [9] and Refs. [30–44] in Ref. [2]). Clearly, exchange events should be included in kinetic Monte Carlo simulations to obtain a proper description of the long-time behavior during growth or annealing.
Figure 1. Adatom diffusion mechanisms on Ag(100). Top row: adatom hop. Bottom row: adatom exchange mechanism. In each case, the middle frame is the saddle point and the initial adatom is highlighted for clarity.
3. Category 2: H2 Diffusion in FCC C60
Understanding the behavior of H2 and other light gases in carbonaceous materials is of great interest because of potential technological applications such as fuel cells. Using parallel replica dynamics, the diffusive behavior of H2 in crystalline C60 was examined in an effort to determine the important diffusive events [10]. As expected, H2 moves through the interstitial sites of the C60 lattice by single molecular hops, jumping between octahedral and tetrahedral sites in the lattice. The parallel replica simulations revealed, however, that under higher H2 loading more than one H2 molecule can occupy a single octahedral site, and this fact dramatically changes the diffusive behavior. The dependence of the diffusive behavior on loading makes this system an example of a category-2 system. After the parallel replica simulations had explored the system long enough that all of the relevant mechanisms were uncovered, the rates of these mechanisms were provided as input to a KMC simulation. If multiple occupancy of octahedral sites is not allowed, the self-diffusivity of H2 in C60 falls off quickly as the loading of H2 is increased, as seen in Fig. 2. However, once the fact that multiple H2 molecules can share an interstitial site is included in the KMC simulation, the behavior reverses and the self-diffusivity increases substantially with loading. The system had to be allowed to fully explore the state space available to it under high-loading conditions.
[Figure 2 plots the normalized self-diffusivity D/D0 (logarithmic axis, 0.01–100) against fractional loading (0–1), with curves for T = 700, 500, and 300 K and restricted-model curves for T = 500 and 300 K.]
Figure 2. Self-diffusivity of H2 predicted by a lattice-gas KMC model with and without the existence of interstitial dimers and trimers (filled and open symbols, respectively). At each temperature, the self-diffusivity is shown normalized by the self-diffusivity of an isolated interstitial H2, D0. Error bars, shown for the 700 K case with dimers, are similar for the other cases. The overall self-diffusivity changes completely with the addition of reactions involving interstitial dimers (after Ref. [10]).
4. Category 3: Radiation-damage Annealing in MgO
Oxide ceramics are important materials in nuclear applications, for example as fuels in light water reactors and as host materials for storing nuclear waste. Thus, an understanding of radiation damage behavior is critical for predicting the aging properties of these materials. In a study using pairwise Coulombic potentials for MgO and temperature accelerated dynamics (TAD) augmented by lowest-barrier information from the dimer method (dimer-TAD [3]), the room-temperature annealing of cascade-generated defects was investigated [11]. Because the barriers in this system were typically high (relative to T = 300 K), dimer-TAD yielded substantial boost factors, allowing simulations on very long time scales. With this interatomic potential, interstitials and vacancies formed in low-energy cascades are charged, and thus interact strongly. Interstitials, mobile on the ns time scale, annihilate with vacancies (which are immobile) or aggregate with other interstitials to form clusters In containing n interstitials. Dimers (I2) are mobile on the time scale of seconds at T = 300 K, and two dimers encountering each other form an I4 cluster, which is both stable against dissociation and essentially immobile. This suggested that all larger clusters would be stable and immobile as well, so that only the mobility of the smaller clusters (n < 4) needed to be accounted for in order to understand the long-time behavior of damage in MgO. However, pursuing the cascade annealing evolution further showed that this was not true, and revealed several unusual characteristics that probably would not have been discovered in a static study. As shown in Fig. 3, a diffusing I2 that encounters an I4 can create a highly mobile I6 entity. After the initial encounter, the I6 system passed through 32 states over a time of 2.9 s before the onset of mobility. Further, this mobile state travels one-dimensionally along a single ⟨110⟩ direction, and diffuses faster than a single interstitial. Each diffusive jump is a concerted event involving 12 atoms – the six interstitial atoms and six lattice atoms. Finally, this kinetically formed state is not the ground state, but a metastable state with a lifetime of years at T = 300 K. This is a clear example of category-3 behavior. Simulations following the coalescence of interstitials into dimers, which in turn formed tetramers and finally hexamers, on the time scale of seconds, were required before the unexpectedly mobile I6 cluster emerged. This behavior clearly impacts the cascade annealing dynamics, and an accurate KMC model will need to include the formation and diffusion kinetics of I6, and perhaps of larger clusters as well.
5. Conclusions
As these examples illustrate, care must be taken in constructing a rate table for higher-level simulations such as KMC. Often, human intuition is not
Figure 3. TAD simulation of the formation of the interstitial cluster I6 (three Mg cation interstitials and three O anion interstitials) from I2 and I4 at T = 300 K. Only defects in the lattice are shown. The scheme is: large spheres are interstitials and small spheres are vacancies; light spheres are O and dark spheres are Mg, where a vacancy is defined to be any lattice site with no atom within 0.83 Å and, conversely, an interstitial is any atom not within 0.83 Å of a lattice site. (a) An I2 and I4 begin about 1.2 nm apart. (b) By t = 1.2 s, the I2 approaches the immobile I4. (c) By t = 4.1 s, the combined cluster anneals to form the metastable I6, (d) which diffuses on the ns time scale with a barrier of 0.24 eV (after Ref. [11]).
adequate for determining what reaction mechanisms might be important. A much more reliable approach is to allow the system to explore the space available to it, considering both the varying initial conditions that might be relevant and the states it might enter (unexpectedly) at much longer times. Finally, we note that for some systems the behavior will be so complex as to defy condensation into an accurate rate catalog. Highly concerted mechanisms can make even the specification of events difficult. Moreover, if the rates depend strongly on the environment beyond a short distance, the size of the catalog, which grows exponentially with the neighborhood size [1], becomes prohibitively large. This becomes a serious problem more quickly for alloy systems. In this situation, direct simulation using accelerated dynamics or OFKMC methods offers an alternative.
Acknowledgments

This work was supported by the United States Department of Energy (DOE), Office of Science, Office of Basic Energy Sciences, Division of
Materials Sciences; and through a cooperative research and development agreement (CRADA) with Motorola, Inc.
References
[1] A.F. Voter, "Classically exact overlayer dynamics – diffusion of rhodium clusters on Rh(100)," Phys. Rev. B, 34, 6819–6829, 1986.
[2] A.F. Voter, F. Montalenti, and T.C. Germann, "Extending the time scale in atomistic simulation of materials," Annu. Rev. Mater. Res., 32, 321–346, 2002.
[3] B.P. Uberuaga, F. Montalenti, T.C. Germann, and A.F. Voter, "Accelerated molecular dynamics methods," Handbook of Materials Modeling, 2004.
[4] H. Jónsson, Handbook of Materials Modeling, 2004.
[5] G.L. Kellogg and P.J. Feibelman, "Surface self-diffusion on Pt(001) by an atomic exchange mechanism," Phys. Rev. Lett., 64, 3143–3146, 1990.
[6] M.S. Daw, S.M. Foiles, and M.I. Baskes, "The embedded-atom method: a review of theory and applications," Mater. Sci. Rep., 9, 251–310, 1993.
[7] T. Ala-Nissila, R. Ferrando, and S.C. Ying, "Collective and single particle diffusion on surfaces," Adv. Phys., 51, 949–1078, 2002.
[8] A.K. Schmid, J.C. Hamilton, N.C. Bartelt, and R.Q. Hwang, "Surface alloy formation by interdiffusion across a linear interface," Phys. Rev. Lett., 77, 2977–2980, 1996.
[9] H. Jónsson, "Theoretical studies of atomic-scale processes relevant to crystal growth," Ann. Rev. Phys. Chem., 51, 623–653, 2000.
[10] B.P. Uberuaga, A.F. Voter, K.K. Sieber, and D.S. Sholl, "Mechanisms and rates of interstitial H2 diffusion in crystalline C60," Phys. Rev. Lett., 91, 105901, 2003.
[11] B.P. Uberuaga, R. Smith, A.R. Cleave, F. Montalenti, G. Henkelman, R.W. Grimes, A.F. Voter, and K.E. Sickafus, "Structure and mobility of defects formed from collision cascades in MgO," Phys. Rev. Lett., 92, 115505, 2004.
5.7 STOCHASTIC THEORY OF RATE PROCESSES

Abraham Nitzan
Tel Aviv University, Tel Aviv, 69978, Israel
1. Stochastic Modeling of Physical Processes

1.1. Introduction
This chapter is written under the assumption that the reader has a basic knowledge of probability theory, such as is needed in an elementary course in statistical mechanics, and at least an intuitive feeling for random numbers such as those generated by tossing a coin or throwing a die. A random function is a function that assigns a random number to each value of its argument. Using this argument as an ordering parameter, each realization of this function is an ordered sequence of such random numbers. When the ordering parameter is time we have a time series of random variables, which is called a stochastic process. For example, the random function F(t) that assigns to each time t the number of cars on a given highway segment is a random function of time, i.e., a stochastic process. Time is a continuous ordering parameter; however, if observations of the random function z(t) are made at discrete times 0 < t1 < t2 < · · · < tn < T, then the sequence {z(ti)} is a discrete sample of the continuous function z(t). Stochastic processes are ubiquitous in descriptions of observed phenomena. Here we focus on systems of classical particles. Given the initial conditions of a classical system of N particles (i.e., all initial 3N positions and 3N momenta), its time evolution is determined by the Newton equations of motion. For a quantum system, the corresponding N-particle wavefunction is determined by evolving the initial wavefunction according to the Schrödinger equation. In fact these initial conditions are generally not known, but they can often be characterized by a probability distribution (e.g., the Boltzmann distribution for an equilibrium system). The (completely deterministic) time evolution
associated with any given initial state should be averaged over this distribution. This is the usual starting point of non-equilibrium statistical mechanics. In many cases, however, we seek simplified descriptions of physical processes by focusing on small subsystems or on a few observables that characterize the process of interest. These observables can be macroscopic, e.g., the energy, pressure, or temperature of the system, or microscopic, e.g., the center-of-mass position, a particular bond length, or the internal energy of a single molecule. In the reduced space of these "important" observables, the microscopic influence of the other ∼10²³ degrees of freedom appears as random fluctuations that give these observables an apparently random character. For example, the energy of an individual molecule behaves as a random function of time (i.e., a stochastic process) even in a closed system whose total energy is strictly constant. Let us consider a particular example in which we are interested in the center-of-mass position ri of an isotopically substituted molecule i in an equilibrium homogeneous fluid containing a macroscopic number N of normal molecules. The trajectory ri(t) of this molecule shows an erratic behavior, changing direction (and velocity) after each collision. Indeed, this trajectory is just the projection of a deterministic trajectory in the 6N-dimensional phase space onto the coordinate of interest; however, solving this 6N-dimensional problem may be intractable and, moreover, may constitute a huge waste of effort, because it yields the time dependence of the 6N momenta and positions of all N particles while we are interested only in ri(t), the position of the single particle i. Instead we may look for a reduced description of ri(t) only. We may attempt to get it by a systematic reduction of the 6N coupled equations of motion. Alternatively, we may construct a phenomenological model for the motion of this coordinate under the influence of all other motions. As we shall see, both ways lead to the characterization of ri(t) as a stochastic process. As another example, consider the internal vibrational energy of a diatomic solute molecule, e.g., CO, in a simple atomic solvent (e.g., Ar). This energy can be monitored by spectroscopic methods, and we can follow processes such as thermal (or optical) excitation and relaxation, energy transfer, and energy migration. The quantity of interest may be the time evolution of the average vibrational energy per molecule, where the average is taken over all molecules of this type in the system (or in the observation zone). At low concentration these molecules do not affect each other, and all dynamical information can be obtained by observing (or theorizing on) the energy Ej(t) of a single molecule j. An average over many such molecules, or over repeated independent observations on a single molecule, is an ensemble average. Following vibrational excitation, it is often observed that the subsequent relaxation is exponential, ⟨E(t)⟩ = ⟨E(0)⟩ exp(−γt). A single trajectory Ej(t) (also observable by a method called single-molecule spectroscopy) is, however, much more complicated. As before, to predict its exact course of evolution we need to know the initial
positions and velocities of all the particles in the system, and then to solve the Newton or the Schrödinger equation with these initial conditions. Again, the resulting trajectory in phase space is completely deterministic; however, Ej(t) appears random. In particular, it will look different in repeated experiments, because in setting up such experiments only the initial value of Ej is specified, while the other degrees of freedom are subject only to a few conditions (such as temperature and density). In this reduced description Ej(t) may be viewed as a stochastic variable. The role of the theory is to set up its statistical properties and to investigate their consequences. Obviously, once the usefulness of describing physical processes as stochastic processes in a small subspace of variables is realized, different models of this type can be examined, in which different subspaces are considered. Being interested in the time evolution of the vibrational energy of a single diatomic molecule, we may focus just on this variable, or on the coordinate (the internuclear distance) and momentum of the intramolecular nuclear motion, or on the atomic coordinates and velocities associated with the molecule and its nearest neighbors, etc. These increasingly detailed reduced descriptions lead to greater accuracy at the cost of bigger calculations. The choice of the level of reduction is guided by the information designated as relevant based on available experiments, and by considerations of accuracy based on physical arguments. In particular, time-scale and interaction-range considerations are central to the theory and practice of reduced descriptions. The relevance of stochastic descriptions brings out the issue of their theoretical and numerical evaluation. Instead of solving the equations of motion for ∼6 × 10²³ degrees of freedom we now face the much less demanding, but still challenging, need to construct and solve stochastic equations of motion for the few relevant variables. The next section describes a particular example.
1.2. An Example: The Random Walk Problem
An example of a stochastic process associated with a reduced molecular description is the random walk process, in which a particle starts from a given position and moves randomly. This is a model for the motion of a molecule that is assumed to change direction randomly after each collision and then move a certain length l (of the order of the mean free path) before the next collision. For simplicity let us consider a one-dimensional (1D) model: during a time interval Δt the particle is assumed to move to the right with probability pr = kr Δt and to the left with probability pl = kl Δt, so that the probability that it stays in its original position is 1 − pr − pl. Here kl and kr are rate coefficients, measuring the probabilities per unit time that the corresponding events will occur. In a homogeneous system the rates to move to the right and to the left are the
same, kl = kr. An inequality may reflect the existence of some force that makes the probability to move in a particular direction larger than in the opposite one. Obviously pl and pr are linear in Δt only for Δt sufficiently small that these numbers are substantially less than 1. Starting from t = 0, we seek the probability P(n, t) = P(n, NΔt) that the particle has made a net number of n steps to the right (a negative n implies that the particle has actually moved to the left); i.e., if it starts at position n = 0 we ask what is the probability that it is found at position n (i.e., at a distance nl from the origin) at time t, after making a total of N steps. An equation for P(n, t) can be found by considering the propagation from time t to t + Δt:

$$P(n, t + \Delta t) = P(n, t) + k_r \Delta t\,(P(n-1, t) - P(n, t)) + k_l \Delta t\,(P(n+1, t) - P(n, t)) \qquad (1.1)$$
In Eq. (1.1) the terms that add to P(n, t) on the right are associated with the different random steps. Thus, for example, kr Δt P(n − 1, t) is the increase in P(n, t) due to the possibility of a jump, during a time interval Δt, from position n − 1 to position n, while −kr Δt P(n, t) is the decrease in P(n, t) resulting from transitions from n to n + 1 in the same period. Rearranging Eq. (1.1) and dividing by Δt we get, when Δt → 0,

$$\frac{\partial P(n, t)}{\partial t} = k_r (P(n-1, t) - P(n, t)) + k_l (P(n+1, t) - P(n, t)) \qquad (1.2)$$

Note that in (1.2) time is a continuous variable while position, expressed by n, is discrete. We may also go over to a continuous representation in position space by substituting n → nΔx = x, n − 1 → x − Δx, n + 1 → x + Δx to get

$$\frac{\partial P(x, t)}{\partial t} = k_r (P(x - \Delta x, t) - P(x, t)) + k_l (P(x + \Delta x, t) - P(x, t)) \qquad (1.3)$$

Here P(x, t) may be understood as the probability to find the particle in an interval of length Δx about x. Introducing the density f(x, t), so that P(x, t) = f(x, t)Δx, and expanding the right-hand side of (1.3) up to second order in Δx (the expansion is spelled out below), we obtain

$$\frac{\partial f(x, t)}{\partial t} = -v \frac{\partial f(x, t)}{\partial x} + D \frac{\partial^2 f(x, t)}{\partial x^2} \qquad (1.4)$$

where

$$v = (k_r - k_l)\Delta x = (p_r - p_l)\frac{\Delta x}{\Delta t} \qquad (1.5)$$

and where

$$D = \frac{1}{2}(k_r + k_l)\Delta x^2 = (p_r + p_l)\frac{\Delta x^2}{2\Delta t} \qquad (1.6)$$
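For completeness, the expansion step reads

$$P(x \pm \Delta x, t) = P(x, t) \pm \Delta x\,\frac{\partial P}{\partial x} + \frac{\Delta x^2}{2}\,\frac{\partial^2 P}{\partial x^2} + O(\Delta x^3),$$

so that the right-hand side of (1.3) becomes

$$-(k_r - k_l)\,\Delta x\,\frac{\partial P}{\partial x} + \frac{1}{2}(k_r + k_l)\,\Delta x^2\,\frac{\partial^2 P}{\partial x^2},$$

and with P(x, t) = f(x, t)Δx this is Eq. (1.4), with v and D given by (1.5) and (1.6).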
Note that even though in (1.4) we use a continuous representation of position and time, the nature of our physical problem implies that Δx and Δt are finite, of the order of the mean free path and the mean free time, respectively. To get a feeling for the nature of the solution of Eq. (1.4), consider first the case D = 0. The solutions of the equation ∂f/∂t = −v ∂f/∂x have the form f(x, t) = f(x − vt); i.e., any structure defined by f moves forward (to the right) with speed (drift velocity) v. This is what is expected under the influence of a constant force that makes kr and kl different. The first term of (1.4) is thus seen to reflect the systematic motion resulting from this force. Next consider Eq. (1.4) for the case v = 0, i.e., when kr = kl. In this case Eq. (1.4) becomes the diffusion equation

$$\frac{\partial f(x, t)}{\partial t} = D \frac{\partial^2 f(x, t)}{\partial x^2} \qquad (1.7)$$

The solution of this equation for the initial condition f(x, 0) = δ(x − x0) is
$$f(x, t \mid x_0, t = 0) = \frac{1}{(4\pi D t)^{1/2}} \exp\left(-\frac{(x - x_0)^2}{4 D t}\right) \qquad (1.8)$$
(Note that the left-hand side is written as a conditional probability density: we have found the probability density about point x at time t given that the particle was at x0 at time t = 0. Note also that the initial density f(x, 0) = δ(x − x0) reflects the initial condition that the particle was, with probability 1, in a neighborhood of length Δx about x0, taken in the limit Δx → 0.) The diffusion process is the actual manifestation of the random walk, leading to a symmetric spread of the density about the initial position. Equation (1.8) implies that the solution of (1.4) under the initial condition f(x, 0) = δ(x − x0) is
$$f(x, t \mid x_0, t = 0) = \frac{1}{(4\pi D t)^{1/2}} \exp\left(-\frac{(x - vt - x_0)^2}{4 D t}\right) \qquad (1.9)$$
showing both the drift and the diffusive spread. Further insight into the nature of the drift–diffusion process that we are studying can be obtained by considering moments of the probability distribution. Equation (1.2) readily yields equations that describe the time evolution of these moments. Both sides of Eq. (1.2) yield zero when summed over all n from −∞ to ∞, while multiplying this equation by n and n² and then performing the summation leads to

$$\frac{d\langle n \rangle}{dt} = k_r - k_l \qquad (1.10)$$

and

$$\frac{d\langle n^2 \rangle}{dt} = 2\langle n \rangle (k_r - k_l) + k_r + k_l \qquad (1.11)$$
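The first of these follows from a shift of the summation index: since

$$\sum_{n} n\,[P(n \mp 1, t) - P(n, t)] = \sum_{n} (n \pm 1)P(n, t) - \sum_{n} n P(n, t) = \pm\sum_{n} P(n, t) = \pm 1,$$

multiplying (1.2) by n and summing over all n gives d⟨n⟩/dt = kr − kl, i.e., Eq. (1.10); repeating the manipulation with n² yields Eq. (1.11).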
Assuming ⟨n⟩(t = 0) = ⟨n²⟩(t = 0) = 0, i.e., that the particle starts its walk from the origin n = 0, Eq. (1.10) results in

$$\langle n \rangle_t = (k_r - k_l) t \qquad (1.12)$$

while Eq. (1.11) leads to

$$\langle n^2 \rangle_t = (k_r - k_l)^2 t^2 + (k_r + k_l) t \qquad (1.13)$$
From Eqs. (1.12) and (1.13) it follows that

$$\langle \delta n^2 \rangle_t = \langle n^2 \rangle_t - \langle n \rangle_t^2 = (k_r + k_l) t = (p_r + p_l)\frac{t}{\Delta t} = (p_r + p_l) N \qquad (1.14)$$

for a walker that has executed a total of N steps during time t = NΔt. When we are interested in length scales that are large relative to Δx we may consider a continuous representation of these results. Putting x = nΔx, Eq. (1.12) becomes
$$\langle x \rangle_t = vt \qquad (1.15)$$
while Eq. (1.14) gives

$$\langle \delta x^2 \rangle_t = 2Dt \qquad (1.16)$$
where v and D are given by Eqs. (1.5) and (1.6), respectively. Together, Eqs. (1.15) and (1.16) express the essential features of a biased random walk: a drift with speed v associated with the bias kr ≠ kl, and a spread with a diffusion coefficient D. The linearity of the spread ⟨δx²⟩ with time is a characteristic feature of normal diffusion. Note that for a random walk in an isotropic three-dimensional space the corresponding relationship is

$$\langle \delta r^2 \rangle = \langle \delta x^2 \rangle + \langle \delta y^2 \rangle + \langle \delta z^2 \rangle = 6Dt \qquad (1.17)$$
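These results are easy to check numerically. A minimal sketch (our own illustration; the step probabilities and walker counts are arbitrary choices) that simulates many independent walkers and compares with Eqs. (1.15) and (1.16):

import numpy as np

rng = np.random.default_rng(1)

pr, pl, dx, dt = 0.06, 0.04, 1.0, 1.0         # small per-step probabilities
n_walkers, n_steps = 20000, 1000

u = rng.random((n_walkers, n_steps))
steps = np.where(u < pr, 1, np.where(u < pr + pl, -1, 0))
x = steps.sum(axis=1) * dx                    # final positions
t = n_steps * dt

v = (pr - pl) * dx / dt                       # Eq. (1.5)
D = (pr + pl) * dx**2 / (2 * dt)              # Eq. (1.6)
print(x.mean(), v * t)                        # drift,  Eq. (1.15): ~20
print(x.var(), 2 * D * t)                     # spread, Eq. (1.16): ~100

Keeping pr and pl small per step matters here: Eq. (1.14) neglects a correction of order (pr − pl)² per step, which is negligible in this regime.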
2. Some Concepts from the General Theory of Stochastic Processes
In Section 1 we defined a stochastic process as a time series of random variable(s). If observations are made at discrete times 0 < t1 < t2 < · · · < tn < T, then the sequence {z(ti)} is a discrete sample of the continuous function z(t). In the random walk problem discussed in Section 1, z(t) was the position at time t of a particle that executes such a walk. We can measure and discuss z(t) directly, keeping in mind that we will obtain different realizations (stochastic trajectories) of this function from different experiments performed under identical conditions. Alternatively, we can characterize the process using the probability distributions associated with it. We can consider many such distributions: P(z, t)dz is the probability that the
realization of the random variable z at time t is in the interval between z and z + dz. P₂(z₂t₂; z₁t₁)dz₁dz₂ is the probability that z will have a value between z₁ and z₁ + dz₁ at t₁ and between z₂ and z₂ + dz₂ at t₂, etc. The time evolution of the process, if recorded at times t₀, t₁, t₂, …, tₙ, is most generally represented by the joint probability distribution P(zₙtₙ; …; z₀t₀). The concept of conditional probability is useful:

$$P_1(z_1 t_1 \mid z_0 t_0)\,dz_1 = \frac{P_2(z_1 t_1; z_0 t_0)\,dz_1}{P_1(z_0 t_0)} \qquad (2.1)$$

is the probability that the variable z will have a value in the interval z₁, …, z₁ + dz₁ at time t₁ if it assumed the value z₀ at time t₀. Similarly

$$P_2(z_4 t_4; z_3 t_3 \mid z_2 t_2; z_1 t_1)\,dz_3\,dz_4 = \frac{P_4(z_4 t_4; z_3 t_3; z_2 t_2; z_1 t_1)\,dz_3\,dz_4}{P_2(z_2 t_2; z_1 t_1)} \qquad (2.2)$$
is the conditional probability that z is in z₄, …, z₄ + dz₄ at t₄ and in z₃, …, z₃ + dz₃ at t₃, given that its values were z₂ at t₂ and z₁ at t₁. In the absence of time correlations, the values taken by z(t) at different times are independent. In this case P(zₙtₙ; zₙ₋₁tₙ₋₁; …; z₀t₀) = ∏ₖ₌₀ⁿ P(zₖ, tₖ), and time correlation functions, e.g., C(t₂, t₁) = ⟨z(t₂)z(t₁)⟩, are given by products of simple averages, C(t₂, t₁) = ⟨z(t₂)⟩⟨z(t₁)⟩, where ⟨z(t₁)⟩ = ∫dz z P₁(z, t₁). This is often the case when the sampling times tₖ are placed far from each other – farther than the longest correlation time of the process. More generally, the time correlation functions can be obtained from the joint distributions by the obvious expressions

$$C(t_2, t_1) = \int dz_1 \int dz_2\; z_2 z_1\, P_2(z_2 t_2; z_1 t_1) \qquad (2.3a)$$

$$C(t_3, t_2, t_1) = \int dz_1 \int dz_2 \int dz_3\; z_3 z_2 z_1\, P_3(z_3 t_3; z_2 t_2; z_1 t_1) \qquad (2.3b)$$
In practice, numerical values of time correlation functions are obtained by averaging over an ensemble of realizations. Let z⁽ᵏ⁾(t) be the kth realization of the random function z(t). Such realizations are obtained by observing z as a function of time in many experiments done under identical conditions. The correlation function C(t₂, t₁) is then given by

$$C(t_2, t_1) = \lim_{N\to\infty} \frac{1}{N}\sum_{k=1}^{N} z^{(k)}(t_2)\, z^{(k)}(t_1) \qquad (2.4)$$
If the stochastic process is stationary, the time origin is of no importance. In this case P1 (z 1 , t1 ) = P1 (z 1 ) does not depend on time, while P2 (z 2 t2 ; z 1 t1 ) = P2 (z 2 , t2 − t1 ; z 1 , 0) depends only on the time difference t21 = t2 − t1 . Also in this case the correlation function C(t2 , t1 ) = C(t21 ) can be obtained by taking
a time average over different origins along a single stochastic trajectory according to

$$C(t) = \lim_{N\to\infty} \frac{1}{N}\sum_{k=1}^{N} z(t_k + t)\, z(t_k) \qquad (2.5)$$

Here the average is over a sample of reference times tₖ that spans a region of time much larger than the longest correlation time of the process.
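As a minimal numerical sketch of the time average (2.5), the following uses a simple stationary linear recursion as a stand-in for a stochastic trajectory; the recursion itself and its exact correlation function are assumptions of this example, chosen only because they are easy to verify.

```python
import numpy as np

# Sketch: estimating C(t) by averaging over time origins, Eq. (2.5), along
# one long stationary trajectory (here an illustrative linear process with
# exact correlation C(lag) = a**lag/(1 - a**2)).
rng = np.random.default_rng(0)
a, nsamp = 0.95, 200_000
z = np.zeros(nsamp)
for j in range(nsamp - 1):
    z[j+1] = a*z[j] + rng.normal()     # stationary after a short transient

lag = 20
C = np.mean(z[:nsamp-lag]*z[lag:])     # Eq. (2.5): average over time origins
print("C(lag) =", C, "   exact for this process:", a**lag/(1 - a**2))
```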
Further progress can be made by specifying particular kinds of processes of physical interest. In particular we focus on two such processes:

(1) Markovian stochastic processes. The process z(t) is called Markovian if knowledge of the value of z (say z₁) at a given time (say t₁) fully determines the probability of observing z at any later time:

$$P(z_2 t_2 \mid z_1 t_1; z_0 t_0) = P(z_2 t_2 \mid z_1 t_1); \qquad t_2 > t_1 > t_0 \qquad (2.6)$$
Markov processes have no memory of earlier information. The Newton equations describe deterministic Markovian processes by this definition, since knowledge of the system state (all positions and momenta) at a given time is sufficient for determining it at any later time. The random walk problem discussed in Section 1.2 is an example of a stochastic Markov process. The Markovian property can be expressed by

$$P(z_2 t_2; z_1 t_1; z_0 t_0) = P(z_2 t_2 \mid z_1 t_1)\, P(z_1 t_1; z_0 t_0); \qquad t_0 < t_1 < t_2 \qquad (2.7)$$

or

$$P(z_2 t_2; z_1 t_1 \mid z_0 t_0) = P(z_2 t_2 \mid z_1 t_1)\, P(z_1 t_1 \mid z_0 t_0); \qquad t_0 < t_1 < t_2 \qquad (2.8)$$
because, by definition, the probability to go from (z₁, t₁) to (z₂, t₂) is independent of the probability to go from (z₀, t₀) to (z₁, t₁). The above relation holds for any intermediate point between (z₀, t₀) and (z₂, t₂). As with any joint probability, integrating the left hand side of Eq. (2.8) over z₁ yields P(z₂t₂ | z₀t₀). Thus for a Markovian process

$$P(z_2, t_2 \mid z_0, t_0) = \int dz_1\, P(z_2, t_2 \mid z_1, t_1)\, P(z_1, t_1 \mid z_0, t_0) \qquad (2.9)$$
This is the Chapman–Kolmogorov equation. What is the significance of the Markovian property of a physical process? Note that the Newton equations of motion as well as the time-dependent Schrödinger equation are Markovian in the sense that the future evolution of a system described by these equations is fully determined by the present (“initial”) state of the system. Non-Markovian dynamics results from the same reduction procedure that we use in order to focus on the “relevant” subsystem which, as argued above, leads us to consider stochastic time evolution. To see
this, consider a “universe” described by two variables, x₁ and x₂, that satisfy the Markovian equations of motion

$$\frac{dx_1}{dt} = F_1(x_1(t), x_2(t), t) \qquad (2.10a)$$

$$\frac{dx_2}{dt} = F_2(x_1(t), t) \qquad (2.10b)$$

We have taken F₂ to depend only on the instantaneous value of x₁ for simplicity. If x₁ is the “relevant” subsystem, a description of the dynamics in the subspace of this variable can be achieved if we integrate Eq. (2.10b) to get

$$x_2(t) = x_2(t=0) + \int_0^t dt'\, F_2(x_1(t'), t') \qquad (2.11)$$

Inserting this into (2.10a) gives

$$\frac{dx_1}{dt} = F_1\!\left(x_1(t),\; x_2(t=0) + \int_0^t dt'\, F_2(x_1(t'), t'),\; t\right) \qquad (2.12)$$
This equation describes the dynamics in the x₁ subspace, and its non-Markovian nature is evident. Starting at time t, the future evolution of x₁ is seen to depend not only on its value at time t, but also on its past history, since the right hand side depends on all values of x₁(t′) starting from t′ = 0.¹ Why has the Markovian time evolution (2.10) of a system of two degrees of freedom become a non-Markovian description in the subspace of one of them? Equation (2.11) shows that this results from the fact that x₂(t) responds to the historical time evolution of x₁, and therefore depends on past values of x₁, not only on its value at time t. More generally, consider a system A+B made of a part (subsystem) A that is relevant to us as observers, and another part B, “the environment”, that is uninteresting to us except for its effect on the relevant subsystem A. A non-Markovian behavior of the reduced description of the physical subsystem A reflects the fact that at any time t subsystem A interacts with the rest of the total system, i.e., with B, whose state is affected by its past interaction with A. In effect, the present state of B carries the memory of past states of the relevant subsystem A.

¹ Equation (2.12) also shows the origin of the stochastic nature of reduced descriptions. Focusing on x₁, we have no knowledge of the initial state x₂(t = 0) of the “rest of the universe”. At most we may know the distribution (e.g., Boltzmann) of different initial states. Different values of x₂(t = 0) correspond to different realizations of the “relevant” trajectory x₁(t). When the number of “irrelevant” degrees of freedom becomes huge, their initial state, and therefore also this trajectory, assume a stochastic character.
This observation is very important because it points to a way to consider this memory as a qualitative attribute of system B (the environment or the bath) that determines the physical behavior of system A. In the example of Eq. (2.10), where system B comprises the single degree of freedom x₂, the dynamics of B is solely determined by its interaction with system A (coordinate x₁), and the memory can be as long as the duration of the observed motion. In practical applications, however, system A represents only a few degrees of freedom while B is the rest of the universe. B is so large relative to A that its dynamics may be dominated by interactions between B particles. To be more precise, suppose first that the interaction between subsystems A and B suddenly disappeared at some time t′. From this time on B evolves under its internal interactions, and our physical experience tells us that it will eventually, say after a characteristic relaxation time τ_B, reach thermal equilibrium. Therefore at times of order t′ + τ_B subsystem B will carry no memory of its interaction with A at times up to t′. This implies that also in the real case, where A and B interact continuously, the state of B at time t does not depend on states of A at times earlier than t′ = t − τ_B. Consequently, dynamics in the A subspace at time t will depend on the history of A at earlier times going back only as far as this t′. The relaxation time τ_B can therefore be identified with the memory time of the environment B. We can now state the condition for the reduced dynamics of subsystem A to be Markovian: this will be the case if the characteristic timescales of the evolution of A are slow relative to the characteristic relaxation time associated with the environment B. When this condition holds, measurable changes in the A subsystem occur slowly enough so that on this relevant timescale B appears to be always at thermal equilibrium, and independent of its historical interaction with A. Reiterating, denoting the characteristic time for the evolution of subsystem A by τ_A, the condition for the time evolution within the A subspace to be Markovian is

$$\tau_B \ll \tau_A \qquad (2.13)$$
(2) Gaussian stochastic processes. The special status of the Gaussian (“normal”) distribution in reduced descriptions of physical processes stems from the central limit theorem of probability theory and the fact that in a coarse-grained description a “time evolution step” taken by the system is affected by many more or less independent random events. A succession of such evolution steps, whether Markovian or not, constitutes a Gaussian stochastic process. As a general definition, a stochastic process z(t) is Gaussian if the probability distribution of its observed values z₁, z₂, …, zₙ at any n time points t₁, t₂, …, tₙ (for any value of the integer n) is an n-dimensional Gaussian distribution,

$$P_n(z_1 t_1, z_2 t_2, \ldots, z_n t_n) = c\, \exp\left[-\frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} a_{jk}(z_j - m_j)(z_k - m_k)\right]; \qquad -\infty < z_j < \infty \qquad (2.14)$$
where the matrix (a j k ) = A is symmetric and positive definite (i.e., u† Au > 0 for any vector u) and where c is a normalization factor.
A Gaussian process can be Markovian. Consider, for example, the process defined by Eq. (2.8) together with

$$P(z_k, t_k \mid z_l, t_l) = \frac{1}{\sqrt{2\pi\Delta_{kl}^2}} \exp\left[-\frac{(z_k - z_l)^2}{2\Delta_{kl}^2}\right] \qquad (2.15)$$

This process is Gaussian by definition since (2.14) is satisfied for any pair of times. It follows that

$$\int dz_1\, P(z_2, t_2 \mid z_1, t_1)\, P(z_1, t_1 \mid z_0, t_0) = \frac{1}{\sqrt{2\pi(\Delta_{21}^2 + \Delta_{10}^2)}}\, \exp\left[-\frac{(z_2 - z_0)^2}{2(\Delta_{21}^2 + \Delta_{10}^2)}\right] \qquad (2.16)$$

so the Markovian property (2.9) implies

$$\Delta_{20}^2 = \Delta_{21}^2 + \Delta_{10}^2 \qquad (2.17)$$
If we further assume that Δ_{kl} is a function only of t_k − t_l, it follows that its form must be

$$\Delta(t) = \sqrt{Dt} \qquad (2.18)$$

where D is some constant. Noting that Δ²₁₀ is the variance of the probability distribution in Eq. (2.15), we have found that in a Markovian process described by (2.15) this variance is proportional to the elapsed time. Comparing this result with (1.8) we see that we have just identified diffusion as a Gaussian Markovian stochastic process. Taken independently of the time ordering, the distribution (2.14) is a multivariable, n-dimensional Gaussian distribution Pₙ(z₁, z₂, …, zₙ). We state without proof two important properties of this distribution:

$$\langle z_j\rangle = m_j; \qquad \langle \delta z_j\, \delta z_k\rangle = (\mathbf{A}^{-1})_{jk}; \qquad \text{where } \delta z = z - \langle z\rangle \qquad (2.19)$$
These relationships imply that a Gaussian distribution is completely characterized by the first two moments of its variables. In particular, since in (2.14) z_j and z_k are associated with the times t_j and t_k, respectively, the matrix elements (A⁻¹)_{jk} are seen to be the time correlation functions ⟨δz(t_j)δz(t_k)⟩.
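The variance additivity (2.17), and with it the Chapman–Kolmogorov property (2.9) of the Gaussian propagator (2.15), are easy to verify numerically. In the following minimal sketch the grid and parameter values are illustrative assumptions; two Gaussian propagators are composed by quadrature and compared with the direct propagator whose variance is the sum.

```python
import numpy as np

# Sketch: numerical check of Chapman-Kolmogorov, Eq. (2.9), for the Gaussian
# conditional probability (2.15) with variance Delta^2 = D*t, i.e., Eq. (2.17).
D, t10, t21 = 1.0, 0.5, 1.5
z = np.linspace(-20.0, 20.0, 2001)
dz = z[1] - z[0]

def P(z2, z1, var):
    # Gaussian propagator (2.15) with variance 'var'
    return np.exp(-(z2 - z1)**2/(2.0*var))/np.sqrt(2.0*np.pi*var)

# left-hand side of (2.9): integrate over the intermediate point z1
lhs = np.array([np.sum(P(z2, z, D*t21)*P(z, 0.0, D*t10))*dz for z2 in z])
rhs = P(z, 0.0, D*(t10 + t21))   # variances add, Eq. (2.17)
print("max |LHS - RHS| =", np.abs(lhs - rhs).max())   # ~0 up to grid error
```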
3. Stochastic Equations of Motion
We have already observed that the full phase space description of a system of N particles (taking all 6N coordinates and velocities into account) requires the solution of the Newton (or Schrödinger) equations of motion in this phase
space, and is deterministic in nature, while the time evolution of a small subsystem is stochastic in nature. Focusing on the latter, we would like to derive or construct appropriate equations of motion that will describe this stochastic motion. Two routes can be taken towards this goal: (1) Derive such equations from first principles. In this approach we start with the equations of motion for the entire system, and derive from them equations of motion for the subsystem under consideration. The stochastic nature of the latter stems from the fact that the state of the complementary system, “the rest of the world”, is not known precisely, and is given only in probabilistic terms. (2) Construct such equations using physical arguments, experimental observations and intuition. The resulting equations must be considered phenomenological. We have already encountered an example of such a phenomenological model in Eq. (1.2) for the random walk problem,

$$\frac{\partial P(n,t)}{\partial t} = k_r\big(P(n-1,t) - P(n,t)\big) + k_l\big(P(n+1,t) - P(n,t)\big) \qquad (3.1)$$

Such an equation has been termed a master equation. The master equation is a generalized kinetic equation, giving the time evolution of the probability distribution in terms of the transition rates between different “states” of the system. In this section we take the second route. The equations introduced and discussed below should be viewed as models for physical processes. The input to these models is the nature of the process and the choice of stochastic variables, available information such as the temperature or known average rates, some knowledge of the time and length scales involved (such as needed to choose between a Markovian or non-Markovian description), basic principles such as known symmetries and conservation rules, and, when applicable, the requirement that at long time the system described should approach thermal equilibrium.
3.1. Langevin Equations

3.1.1. Langevin equation for one particle in one dimension

Sometimes we find it advantageous to focus our stochastic description not on the probability but on the random variable itself. This makes it possible to describe in a more direct way the source of randomness in the system and its effect on the time evolution of the interesting subsystem. In this case the basic stochastic input is not a set of transition probabilities, but the actual
effect of the “environment” on the “interesting subsystem”. Obviously this effect is random in nature, reflecting the fact that we do not have a complete microscopic description of the environment. As discussed above, we could attempt to derive these stochastic equations of motion from first principles, i.e., from the full Hamiltonian of the system + environment. Alternatively we can construct the equation of motion using intuitive arguments and as much available physical information as possible. Again, we are taking the second route. As a simple example consider the equation of motion of a particle moving in a 1-dimensional potential,

$$\ddot{x} = -\frac{1}{m}\frac{\partial V(x)}{\partial x} \qquad (3.2)$$
and consider the effect on this particle's dynamics of putting it in contact with a “thermal environment”. Obviously the effect depends on the strength of interaction between the particle and its environment. A useful measure of the latter, within a simple intuitive model, is the friction force, proportional to the particle velocity, which acts to slow the particle:

$$\ddot{x} = -\frac{1}{m}\frac{\partial V(x)}{\partial x} - \gamma\dot{x} \qquad (3.3)$$
The effect of friction is to damp the particle energy. This can most easily be seen by multiplying Eq. (3.3) by mẋ and using mẋẍ + ẋ(∂V(x)/∂x) = (d/dt)[E_K + E_P] = Ė to get Ė = −2γE_K. Here E, E_K and E_P are the total energy of the particle and its kinetic and potential components, respectively. Equation (3.3) thus describes a process of energy dissipation, and leads to zero energy (when measured from a local minimum on the potential surface) at infinite time. It therefore cannot in itself describe the time evolution in a thermal system. What is missing is the random “kicks” that the particle can occasionally obtain from the surrounding thermal particles. These kicks can be modeled by an additional random force in Eq. (3.3):

$$\ddot{x} = -\frac{1}{m}\frac{\partial V(x)}{\partial x} - \gamma\dot{x} + \frac{1}{m}R(t) \qquad (3.4)$$
The function R(t) describes the effects of random collisions between our particle (“system”) and the molecules of the thermal environment (“bath”). This force is obviously a stochastic process, and a full stochastic description of our system is obtained once we define its statistical nature. What can be said about the statistical character of the stochastic process R(t)? First, from symmetry arguments valid for stationary systems, ⟨R(t)⟩ = 0, where the average can be either a time or an ensemble average. Secondly, since Eq. (3.3) seems to describe the relaxation of the system at temperature T = 0, R should be related to the finite temperature of the thermal environment. Next, at T = 0 the time evolution of x is Markovian (knowledge of x and ẋ fully
determines the future of x), so the system–bath coupling introduced in (3.3) is of Markovian nature (i.e., what the bath does to the system at time t does not depend on the history of the system or the bath; in particular, the bath has no memory of what the system did in the past). The additional, finite temperature term R(t) has to be consistent with the Markovian form of the damping term since both arise from the same source, the bath motion. Finally, in the absence of further knowledge, and because R is envisioned as a combined effect of many environmental motions, it makes sense to assume that, for each time t, R(t) is a Gaussian variable, and that the stochastic process R(t) is a Gaussian process. We have already argued that the Markovian nature of the system evolution implies that the relaxation dynamics of the bath is much faster than that of the system: the bath loses its memory fast relative to the timescale of interest for the system dynamics. Still, the timescale of the bath motion is not unimportant. If, for example, the sign of R(t) changed infinitely fast, it would have no effect on the system; indeed, in order for a finite force R to move the particle it has to have a finite duration. It is convenient to introduce a timescale τ_B, which characterizes the bath motion, and to consider an approximate picture in which R(t) is constant in the interval τ_B, while R(t₁) and R(t₂) are independent Gaussian random variables if |t₁ − t₂| ≥ (1/2)τ_B. Accordingly,

$$\langle R(t_1)\, R(t_1 + t)\rangle = C\, S(t) \qquad (3.5)$$
where S(t) is 1 if |t| < (1/2)τ_B and is 0 otherwise. Since R(t) was assumed to be a Gaussian process, the first two moments completely specify its statistical nature. The assumption that the bath is fast relative to the timescales that characterize the system implies that τ_B is much shorter than all timescales (inverse frequencies) derived from the potential V(x) and much smaller than the time γ⁻¹ for the energy relaxation. In Eqs. (3.4) and (3.5), both γ and C are related to the strength of the system–bath coupling, and should therefore be somehow related to each other. In order to obtain this relation it is sufficient to consider Eq. (3.4) for the case where the potential V is a position-independent constant. In this case the equation

$$\dot{u} = -\gamma u + \frac{1}{m}R(t) \qquad (3.6)$$
(here u = ẋ is the particle velocity) can be solved as a first order inhomogeneous differential equation, to yield

$$u(t) = u(t=0)\, e^{-\gamma t} + \frac{1}{m}\int_0^t dt'\, e^{-\gamma(t-t')}\, R(t') \qquad (3.7)$$
For long times, as the system reaches equilibrium, only the second term on the right of (3.7) contributes. For the average ⟨u⟩ at thermal equilibrium this gives zero, while for ⟨u²⟩ we get

$$\langle u^2\rangle = \frac{1}{m^2}\int_0^t dt' \int_0^t dt''\; e^{-\gamma(t-t') - \gamma(t-t'')}\, C\, S(t' - t'') \qquad (3.8)$$
Since the integrand is non-vanishing only for |t′ − t″| ≤ τ_B ≪ 1/γ, ⟨u²⟩ in Eq. (3.8) can be approximated by

$$\langle u^2\rangle = \frac{1}{m^2}\int_0^t dt'\, e^{-2\gamma(t-t')} \int_0^t dt''\, C\, S(t' - t'') = \frac{C\tau_B}{2m^2\gamma} \qquad (3.9)$$
where to get the final result we took the limit t → ∞. Since in this limit the system should be in thermal equilibrium we have ⟨u²⟩ = k_BT/m, whence

$$C = \frac{2m\gamma k_B T}{\tau_B} \qquad (3.10)$$
Using this result in Eq. (3.5) we find that the correlation function of the Gaussian random force R has the form

$$\langle R(t_1)\, R(t_1 + t)\rangle = 2m\gamma k_B T\, \frac{S(t)}{\tau_B} \;\xrightarrow{\;\tau_B \to 0\;}\; 2m\gamma k_B T\, \delta(t) \qquad (3.11)$$
The last result indicates that, since for the system's motion to be Markovian τ_B has to be much shorter than the relevant system timescales, its actual magnitude is not important and the random force may be thought of as δ-correlated. The limiting process described above indicates that mathematical consistency requires that as τ_B → 0 the second moment of the random force diverge, and the proper limiting form of the correlation function is a Dirac δ function of the time difference. In analytical treatments of the Langevin equation this limiting form is usually convenient. In numerical solutions, however, the random force is generated at time intervals Δt determined by the integration routine; it is then generated as a Gaussian random variable with zero average and variance equal to 2mγk_BT/Δt. The result (3.11) shows that the requirement that the friction γ and the random force R(t) together act to bring the system to thermal equilibrium at long time naturally leads to a relation between them. This is a relation between the fluctuations and the dissipation in the system; it constitutes an example of the fluctuation–dissipation theorem of non-equilibrium statistical mechanics.
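The numerical prescription just described is easy to put into code. The following minimal sketch integrates the Langevin equation (3.4) by a simple Euler scheme for an illustrative harmonic potential (the potential and all parameter values are assumptions), drawing the random force at each step with variance 2mγk_BT/Δt as stated above; the long-time average ⟨v²⟩ should approach k_BT/m up to the O(Δt) bias of this crude integrator.

```python
import numpy as np

# Sketch: Euler integration of the Langevin equation (3.4) with
# V(x) = 0.5*m*w0^2*x^2 (illustrative). The random force has zero mean and
# variance 2*m*gamma*kB*T/dt, per the discussion of Eq. (3.11).
rng = np.random.default_rng(0)
m, w0, gamma, kT = 1.0, 1.0, 0.5, 1.0     # assumed model parameters
dt, nsteps = 1e-3, 200_000

x, v, v2sum = 0.0, 0.0, 0.0
for _ in range(nsteps):
    R = rng.normal(0.0, np.sqrt(2.0*m*gamma*kT/dt))
    v += (-w0**2*x - gamma*v + R/m)*dt    # Eq. (3.4)
    x += v*dt
    v2sum += v*v
print("<v^2> =", v2sum/nsteps, "   expected kB*T/m =", kT/m)
```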
3.1.2. The high friction limit

The friction coefficient γ defines the timescale, γ⁻¹, of the thermal relaxation of the system described by (3.4). A simpler stochastic description can be obtained for a system in which this time is shorter than any other characteristic timescale of our system. This situation is often referred to as the overdamped limit. In this limit of large γ, the relaxation of the velocity is so fast that we can assume that v quickly reaches a steady state for any value of the random force R(t), i.e., v̇ = ẍ = 0. This statement is not obvious, and a supporting (though not rigorous) argument is provided below. If it is true then Eqs. (3.4) and (3.11) yield

$$\frac{dx}{dt} = \frac{1}{\gamma m}\left(-\frac{dV}{dx} + R(t)\right); \qquad \langle R\rangle = 0; \qquad \langle R(0)R(t)\rangle = 2m\gamma k_B T\, \delta(t) \qquad (3.12)$$
This is a Langevin-type equation that describes strong coupling between the system and its environment. Obviously, the limit γ → 0 of deterministic motion cannot be identified here. Why can we, in this limit, neglect the acceleration term in (3.4)? Consider a particular realization of the random force in this equation and denote F(t) = (1/m)[−dV/dx + R(t)]. Consider then Eq. (3.4) in the form

$$\dot{v} = -\gamma v + F(t) \qquad (3.13)$$
If F(t) is constant, then after some transient (short for large γ) the solution of (3.13) reaches the constant velocity state

$$v = F/\gamma \qquad (3.14)$$
The neglect of the v̇ term in (3.13) is equivalent to the assumption that Eq. (3.14) provides a good approximation for the solution of (3.13) even when F depends on time. To find the conditions under which this assumption holds, consider the solution of (3.13) for a particular Fourier component of the time dependent force,

$$F(t) = F_\omega e^{i\omega t} \qquad (3.15)$$

Disregarding any initial transient amounts to looking for a solution of (3.13) of the form

$$v(t) = v_\omega e^{i\omega t} \qquad (3.16)$$

Inserting (3.15) and (3.16) into (3.13) we find

$$v_\omega = \frac{F_\omega}{i\omega + \gamma} = \frac{1}{\gamma}\,\frac{F_\omega}{1 + i\omega/\gamma} \qquad (3.17)$$
which implies

$$v(t) = \frac{F(t)}{\gamma}\big(1 + O(\omega/\gamma)\big) \qquad (3.18)$$

We found that Eq. (3.14) holds, with corrections of order ω/γ. It should be emphasized that this argument is not rigorous because the random part of F(t) is in principle fast, i.e., contains Fourier components with large ω. More rigorously, the transition from Eq. (3.4) to (3.12) should be regarded as coarse-graining in time that leads to a description in which the fast components of the random force are averaged to zero and the velocity distribution follows the instantaneous applied force.
3.2. Master Equations

As discussed at the beginning of Section 3, a phenomenological stochastic evolution equation can be constructed using a model that describes the relevant states of the system and the transition rates between them. For example, in the one-dimensional random walk problem discussed in Section 1.2 we described the position of the walker using a discrete set of equally spaced points nΔx (n = −∞ … ∞) on the real axis. Denoting by P(n, t) the probability that the particle is at position n at time t, and by k_r and k_l the probabilities per unit time (i.e., the rates) that the particle moves from a given site to the neighboring site on its right and left, respectively, we obtained a kinetic equation for the time evolution of P(n, t):

$$\frac{\partial P(n,t)}{\partial t} = k_r\big(P(n-1,t) - P(n,t)\big) + k_l\big(P(n+1,t) - P(n,t)\big) \qquad (3.19)$$

This is an example of a master equation.² More generally, transition rates between any two states can be given, and the master equation then takes the form
$$\frac{\partial P(m,t)}{\partial t} = \sum_{n\neq m} k_{mn} P(n,t) - \sum_{n\neq m} k_{nm} P(m,t) \qquad (3.20)$$

where k_{mn} ≡ k_{m←n} is the rate to go from state n to state m. Note that we may write (3.20) in the form

$$\frac{\partial P(m,t)}{\partial t} = \sum_n K_{mn} P(n,t); \qquad \text{i.e., } \frac{\partial \mathbf{P}}{\partial t} = \mathbf{K}\mathbf{P} \qquad (3.21)$$

² Many science texts refer to a 1928 paper by W. Pauli [W. Pauli, Festschrift zum 60. Geburtstage A. Sommerfelds (Hirzel, Leipzig, 1928), p. 30] as the first derivation of this type of kinetic equation. Pauli used this approach to construct a model for the time evolution of a many-state quantum system, using expressions derived from quantum perturbation theory for the transition rates.
provided we define

$$K_{mn} = k_{mn} \;\text{ for } m \neq n; \qquad K_{mm} = -\sum_{n\neq m} k_{nm} \qquad (3.22)$$

Note that (3.22) implies that Σ_m K_{mn} = 0 for all n. This is compatible with the fact that Σ_m P(m, t) = 1 is independent of time. The nearest neighbor random walk process is described by a special case of this master equation with

$$K_{mn} = k_l\,\delta_{n,m+1} + k_r\,\delta_{n,m-1} - (k_l + k_r)\,\delta_{n,m} \qquad (3.23)$$
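A minimal numerical sketch of Eqs. (3.21)–(3.23): the lattice size, rates, and Euler time step below are illustrative assumptions, with the lattice truncated far from where the walker ever goes.

```python
import numpy as np

# Sketch: build the rate matrix K of Eq. (3.23) on a finite lattice and
# propagate dP/dt = K P, Eq. (3.21), by small Euler steps.
kr, kl = 1.5, 1.0
nsites = 201
K = np.zeros((nsites, nsites))
for m in range(nsites):
    if m > 0:          K[m, m-1] = kr     # gain from the site on the left
    if m < nsites - 1: K[m, m+1] = kl     # gain from the site on the right
K -= np.diag(K.sum(axis=0))               # K_mm per Eq. (3.22); columns sum to 0

P = np.zeros(nsites); P[nsites//2] = 1.0  # walker starts at the central site
dt, t = 1e-3, 2.0
for _ in range(int(t/dt)):
    P += dt*(K @ P)

n = np.arange(nsites) - nsites//2
mean, var = P @ n, P @ n**2 - (P @ n)**2
print("<n>    =", mean, "   expected (kr-kl)*t =", (kr - kl)*t)
print("<dn^2> =", var,  "   expected (kr+kl)*t =", (kr + kl)*t)
```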
In what follows we consider several examples.
3.2.1. The random walk problem

We first consider the 1-dimensional random walk problem described by Eq. (3.19). It can easily be seen that summing either side of this equation over all m from −∞ to ∞ yields zero, while multiplying this equation by m and m² and then performing the summation yields

$$\frac{\partial\langle m\rangle}{\partial t} = k_r - k_l \qquad (3.24)$$

and

$$\frac{\partial\langle m^2\rangle}{\partial t} = 2\langle m\rangle(k_r - k_l) + k_r + k_l \qquad (3.25)$$

Therefore (assuming ⟨m⟩(t = 0) = ⟨m²⟩(t = 0) = 0, i.e., that the particle starts its walk from the origin, m = 0)

$$\langle m\rangle_t = (k_r - k_l)t = (p - q)t/\tau = (p - q)N; \qquad \langle m^2\rangle_t = (k_r - k_l)^2 t^2 + (k_r + k_l)t \qquad (3.26)$$

and

$$\langle \delta m^2\rangle_t = \langle m^2\rangle_t - \langle m\rangle_t^2 = (k_r + k_l)t = (p + q)\frac{t}{\tau} = (p + q)N \qquad (3.27)$$
for a walker that has executed a total of N steps during the time t = Nτ, with probabilities p and q to jump to the right and to the left, respectively. We can do more by introducing the generating function, defined by

$$F(s,t) = \sum_{m=-\infty}^{\infty} P(m,t)\, s^m; \qquad (0 < |s| < 1) \qquad (3.28)$$
Its usefulness stems from the fact that it can be used to generate all moments of the probability distribution according to

$$\left(s\frac{\partial}{\partial s}\right)^{k} F(s,t)\Bigg|_{s=1} = \langle m^k\rangle \qquad (3.29)$$
We can get an equation for the time evolution of F by multiplying the master equation by s^m and summing over all m. Using Σ_{m=−∞}^∞ s^m P(m − 1, t) = sF(s) and Σ_{m=−∞}^∞ s^m P(m + 1, t) = F(s)/s leads to

$$\frac{\partial F(s,t)}{\partial t} = k_r\, s\, F(s,t) + k_l\, \frac{1}{s}\, F(s,t) - (k_r + k_l)\, F(s,t) \qquad (3.30)$$

whose solution is

$$F(s,t) = A\, e^{[k_r s + (k_l/s) - (k_r + k_l)]t} \qquad (3.31)$$
If the particle starts from m = 0, i.e., P(m, t = 0) = δm,0 , Eq. (3.28) implies that F(s, t = 0) = 1. In this case the integration constant in Eq. (3.31) is A=1. It is easily verified that using (3.31) in Eq. (3.29) with k = 1, 2 leads to Eqs. (3.26). Using it with larger k’s leads to the higher moments of the time dependent distribution.
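The moment relation (3.29) applied to (3.31) can also be checked symbolically. A minimal sketch, keeping the rates and time symbolic:

```python
import sympy as sp

# Sketch: verify Eq. (3.29) on the generating function (3.31) with A = 1
# (particle starting at m = 0).
s, t, kr, kl = sp.symbols('s t k_r k_l', positive=True)
F = sp.exp((kr*s + kl/s - (kr + kl))*t)          # Eq. (3.31), A = 1

m1 = sp.expand((s*sp.diff(F, s)).subs(s, 1))                 # <m>
m2 = sp.expand((s*sp.diff(s*sp.diff(F, s), s)).subs(s, 1))   # <m^2>
print(m1)   # t*(k_r - k_l), in agreement with (3.26)
print(m2)   # expands to t**2*(k_r - k_l)**2 + t*(k_r + k_l), per (3.26)
```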
3.2.2. Chemical kinetics

Consider the simple first order chemical reaction A → B with rate constant k. The corresponding kinetic equation,

$$\frac{dA}{dt} = -kA \;\Rightarrow\; A(t) = A(t=0)\, e^{-kt} \qquad (3.32)$$

describes the time evolution of the average number of molecules A in the system. Without averaging, the time evolution of this number is a random process, because the moment at which each specific A molecule transforms into B is undetermined. (The stochastic nature of radioactive decay, which is described by a similar first order kinetics, can be realized by listening to a Geiger counter.) In addition, fluctuations from the average can also be observed if we monitor the reaction in a small enough volume, e.g., in a biological cell. We can derive a master equation for the probability P(n, t) that the number of A molecules is n at time t using considerations similar to those we used above:

$$P(n, t+\Delta t) = P(n,t) + k(n+1)P(n+1,t)\Delta t - knP(n,t)\Delta t$$

$$\Rightarrow\; \frac{\partial P(n,t)}{\partial t} = k(n+1)P(n+1,t) - knP(n,t) \qquad (3.33)$$

Unlike the random walk problem, the rate at which the probability to be in a given state n changes depends on the state: the probability per unit time to go from n + 1 to n is k(n + 1), and the probability per unit time to go from n to n − 1 is kn. The process described by Eq. (3.33) is an example of a birth-and-death process. In this particular example there is no source feeding A molecules into the system, so only death steps take place.
The solution of Eq. (3.33) is easily achieved using the generating function method. The random variable n can take only non-negative integer values, and the generating function is therefore

$$F(s,t) = \sum_{n=0}^{\infty} s^n P(n,t) \qquad (3.34)$$
Multiplying (3.33) by s^n and summing leads to

$$\frac{\partial F(s,t)}{\partial t} = k\frac{\partial F}{\partial s} - ks\frac{\partial F}{\partial s} = k(1-s)\frac{\partial F}{\partial s} \qquad (3.35)$$
Here we have used identities such as

$$\sum_{n=0}^{\infty} s^n\, n\, P(n,t) = s\frac{\partial F(s,t)}{\partial s} \qquad (3.36)$$

and

$$\sum_{n=0}^{\infty} s^n (n+1)\, P(n+1,t) = \frac{\partial F}{\partial s} \qquad (3.37)$$
If P(n, t = 0) = δ_{n,n₀} then F(s, t = 0) = s^{n₀}. For this initial condition the solution of Eq. (3.35) is

$$F(s,t) = \left[1 + (s-1)e^{-kt}\right]^{n_0} \qquad (3.38)$$
This again gives all the moments using Eq. (3.29). For example it is easy to get

$$\langle n\rangle_t = n_0\, e^{-kt} \qquad (3.39)$$

$$\langle \delta n^2\rangle_t = n_0\, e^{-kt}\big(1 - e^{-kt}\big) \qquad (3.40)$$
The first moment gives the familiar evolution of the average A population. The second moment shows that the variance of the fluctuations from this average is zero at t = 0 and t = ∞, and goes through a maximum at some intermediate time.
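The moments (3.39)–(3.40) can be checked against direct stochastic realizations of the death process. A minimal kinetic Monte Carlo sketch, with illustrative parameter values:

```python
import numpy as np

# Sketch: stochastic realizations of the A -> B death process, Eq. (3.33),
# compared with the mean (3.39) and variance (3.40).
rng = np.random.default_rng(1)
k, n0, t_obs, ntraj = 1.0, 50, 0.7, 5000

n_at_t = np.empty(ntraj, dtype=int)
for j in range(ntraj):
    n, t = n0, 0.0
    while n > 0:
        t += rng.exponential(1.0/(k*n))   # waiting time to the next decay
        if t > t_obs:
            break
        n -= 1
    n_at_t[j] = n

ekt = np.exp(-k*t_obs)
print("mean:", n_at_t.mean(), "   expected n0*e^-kt          =", n0*ekt)
print("var: ", n_at_t.var(),  "   expected n0*e^-kt(1-e^-kt) =", n0*ekt*(1 - ekt))
```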
3.3. The Fokker–Planck Equation
In many practical situations the random process under observation is continuous in the sense that (a) the space of possible states x is continuous (or it can be transformed to a continuous-like representation by a coarse graining procedure), and (b) the change in the system state during a small time interval is small, i.e., if the system is found in a state x at time t then the
probability to find it in a state y ≠ x at time t + δt vanishes when δt → 0.³ When these, and some other conditions detailed below, are satisfied, we can derive a partial differential equation for the probability distribution. The result is the Fokker–Planck equation. As an example without mathematical justification consider the master equation for the random walk problem

$$\frac{\partial P(n,t)}{\partial t} = k_r P(n-1,t) + k_l P(n+1,t) - (k_r + k_l)P(n,t)$$
$$= -k_r\big(P(n,t) - P(n-1,t)\big) - k_l\big(P(n,t) - P(n+1,t)\big)$$
$$= -k_r\big(1 - e^{-\partial/\partial n}\big)P(n,t) - k_l\big(1 - e^{\partial/\partial n}\big)P(n,t) \qquad (3.41)$$

In the last step we have regarded n as a continuous variable and have used the Taylor expansion

$$e^{a(\partial/\partial n)} P(n) = P(n) + a\frac{\partial P}{\partial n} + \frac{1}{2}a^2\frac{\partial^2 P}{\partial n^2} + \cdots = P(n+a) \qquad (3.42)$$
In practical situations n is a very large number – it is the number of microscopic steps taken on the timescale of a macroscopic observation. This implies that ∂ᵏP/∂nᵏ ≫ ∂ᵏ⁺¹P/∂nᵏ⁺¹.⁴ We therefore expand the exponential operators according to

$$1 - e^{\pm\partial/\partial n} = \mp\frac{\partial}{\partial n} - \frac{1}{2}\frac{\partial^2}{\partial n^2} \qquad (3.43)$$
and neglect higher order terms, to get

$$\frac{\partial P(n,t)}{\partial t} = -A\frac{\partial P(n,t)}{\partial n} + B\frac{\partial^2 P(n,t)}{\partial n^2} \qquad (3.44)$$
where A = k_r − k_l and B = (1/2)(k_r + k_l). We can give this result a more physical form by transforming from the number-of-steps variable n to the position variable x = nΔx, using P_n(n) = P_x(x)·Δx. Here Δx is the step length, and the subscripts n and x denote the probability in the space of position indices and the probability density on the axis x (we omit these subscripts above and below when the nature of the distribution is clear from the text). This leads to

$$\frac{\partial P(x,t)}{\partial t} = -v\frac{\partial P(x,t)}{\partial x} + D\frac{\partial^2 P(x,t)}{\partial x^2} \qquad (3.45)$$
³ In fact we will require that this probability vanishes faster than δt when δt → 0.
⁴ For example, if f(n) = nᵃ then ∂f/∂n = anᵃ⁻¹, which is of order f/n. The situation is less obvious in cases such as the Gaussian distribution f(n) ∼ exp(−(n − ⟨n⟩)²/2⟨δn²⟩). Here each derivative with respect to n adds a factor ∼ (n − ⟨n⟩)/⟨δn²⟩ that is much smaller than 1 as long as n − ⟨n⟩ ≪ ⟨n⟩, because ⟨δn²⟩ is of order ⟨n⟩.
where v = ΔxA and D = Δx²B. Note that we have just repeated, using a somewhat different form, the derivation of Eq. (1.4). The result (3.45) (or (1.4)) is a Fokker–Planck type equation. As already discussed below Eq. (1.4), these equations describe a drift–diffusion process: for a symmetric walk, k_r = k_l, v = 0 and (3.45) becomes the diffusion equation with the diffusion coefficient D = (1/2)Δx²(k_r + k_l) = Δx²/2τ. Here τ is the hopping time defined by τ = (k_r + k_l)⁻¹. When k_r ≠ k_l the parameter v is non-zero and represents the drift velocity that may be induced by an external force that creates an asymmetry in the flow through the system. Additional insight can be obtained by rewriting Eq. (3.45) in the form

$$\frac{\partial P(x,t)}{\partial t} = -\frac{\partial J(x,t)}{\partial x} \qquad (3.46a)$$

$$J(x,t) = vP(x,t) - D\frac{\partial P(x,t)}{\partial x} \qquad (3.46b)$$
Equation (3.46a) expresses the fact that the probability distribution P is a conserved quantity, and therefore its time dependence can only stem from boundary fluxes. Indeed, Eq. (3.46a) implies that P_{ab}(t) = ∫_a^b dx P(x, t), a < b, satisfies dP_{ab}(t)/dt = J(a, t) − J(b, t). This identifies J(x, t) as the probability flux at point x: J(a, t) is the flux entering the [a, b] interval (for positive J) at a, and J(b, t) is the flux leaving the interval (if positive) at b. In one dimension J is of dimensionality t⁻¹, and when multiplied by the total number of walkers it gives the number of such walkers that pass the point x per unit time in a direction determined by the sign of J. Equation (3.46b) shows that J is a combination of the drift flux vP, associated with the net local velocity v, and the diffusion flux −D∂P/∂x, associated with the spatial inhomogeneity of the distribution. In a 3-dimensional system the analog of Eq. (3.46) is

$$\frac{\partial P(\mathbf{r},t)}{\partial t} = -\nabla\cdot\mathbf{J}(\mathbf{r},t); \qquad \mathbf{J}(\mathbf{r},t) = \mathbf{v}P(\mathbf{r},t) - D\nabla P(\mathbf{r},t) \qquad (3.47)$$
Now P(r, t) is of dimensionality l⁻³ and the vector J has the dimensionality l⁻²t⁻¹; J expresses the flux of walkers (number per unit time and area) in the J direction.⁵ The derivation of the Fokker–Planck (FP) equation described above is far from rigorous since the conditions for neglecting higher order terms in the

⁵ It is important to emphasize that, again, the first of Eqs. (3.47) is just a conservation law. Integrating it over some volume Ω enclosed by a surface S and denoting P_Ω(t) = ∫_Ω d³r P(r, t) we find, using the divergence theorem of vector calculus, dP_Ω(t)/dt = −∮_S dS·J(r, t), where dS is a vector whose magnitude is a surface element and whose direction is normal to this element, outward from the volume Ω.
expansion of exp(±∂/∂x) were not established. A rigorous derivation of the Fokker–Planck equation for a Markov process can be obtained from the Chapman–Kolmogorov equation, Eq. (2.9), under fairly general continuity conditions that are satisfied in most physical situations. Since the latter describes a continuous stochastic process, a Fokker–Planck equation is indeed expected in the Markovian case. Here we outline this derivation for the simpler case of the high friction limit, where the Langevin equation is (cf. Eq. (3.12))

$$\frac{dx}{dt} = \frac{1}{\gamma m}\left(-\frac{dV}{dx} + R(t)\right) \qquad (3.48)$$
where R(t) again satisfies Eq. (3.11). In this case we will derive an equation for P(x, t), the probability density to find the position at x (the velocity distribution is assumed equilibrated on the timescale considered). It is convenient to redefine the time scale,

$$\tau = t/(\gamma m) \qquad (3.49)$$
Denoting the random force on this timescale by ρ(τ) = R(t), we have ⟨ρ(τ₁)ρ(τ₂)⟩ = 2mγk_BT δ(t₁ − t₂) = 2k_BT δ(τ₁ − τ₂). The new Langevin equation becomes

$$\frac{dx}{d\tau} = -\frac{dV(x)}{dx} + \rho(\tau) \qquad (3.50a)$$

$$\langle\rho\rangle = 0; \qquad \langle\rho(0)\rho(\tau)\rangle = 2k_B T\,\delta(\tau) \qquad (3.50b)$$
The friction γ does not appear in these scaled equations; however, any rate evaluated from this scheme will be inversely proportional to γ when described on the real (i.e., unscaled) time axis. The starting point of the derivation of an equation for the time evolution of the probability density P(x, t) is the statement of the fact that the integrated probability is conserved. As already discussed, this implies that the time derivative of P should be given by the gradient of the flux ẋP, i.e.,

$$\frac{\partial P(x,\tau)}{\partial\tau} = -\frac{\partial}{\partial x}(\dot{x}P) = -\frac{\partial}{\partial x}\left[\left(-\frac{\partial V}{\partial x} + \rho(\tau)\right)P\right] \qquad (3.51)$$

Rewrite this in the form

$$\frac{\partial P(x,\tau)}{\partial\tau} = \hat{L}(\tau)P; \qquad \hat{L}(\tau) = \frac{\partial}{\partial x}\left(\frac{\partial V}{\partial x} - \rho(\tau)\right) \qquad (3.52)$$
and integrate between τ and τ + Δτ to get

$$P(x, \tau + \Delta\tau) = P(x,\tau) + \int_\tau^{\tau+\Delta\tau} d\tau_1\, \hat{L}(\tau_1)\, P(x,\tau_1) \qquad (3.53)$$
The operator L̂ contains the random function ρ(τ). Repeated iterations in the integral and averaging over all realizations of ρ lead to

$$P(x,\tau+\Delta\tau) - P(x,\tau) = \left[\int_\tau^{\tau+\Delta\tau} d\tau_1\,\langle\hat{L}(\tau_1)\rangle + \int_\tau^{\tau+\Delta\tau} d\tau_1 \int_\tau^{\tau_1} d\tau_2\,\langle\hat{L}(\tau_1)\hat{L}(\tau_2)\rangle + \cdots\right] P(x,\tau) \qquad (3.54)$$
τ +τ
dτ1 τ
τ
∂2 dτ2 ρ(τ1 )ρ(τ2 ) 2 = ∂x
τ +τ
dτ1 k B T τ
∂2 ∂2 = k T τ B ∂x2 ∂x2 (3.55)
With a little effort we can convince ourselves that higher order terms in the expansion (3.54) contribute only terms of order Δτ² or higher. Consider for example the third order term

$$\int_\tau^{\tau+\Delta\tau} d\tau_1 \int_\tau^{\tau_1} d\tau_2 \int_\tau^{\tau_2} d\tau_3\,\langle\hat{L}(\tau_1)\hat{L}(\tau_2)\hat{L}(\tau_3)\rangle \qquad (3.56)$$

Evaluating the integrals with a deterministic ÂÂÂ term will yield a result of order Δτ³ that can be disregarded. The ÂÂB̂ and B̂B̂B̂ terms appear in products with terms like ⟨ρ⟩ and ⟨ρρρ⟩, which are zero. The only terms that may potentially contribute are of the type ÂB̂B̂. However they do not: such terms appear with a function like ⟨ρ(τ₁)ρ(τ₂)⟩ that yields a δ-function that eliminates one of the three time integrals; however, the remaining two yield a Δτ² term and do not contribute to order Δτ. Similar considerations show that all higher order terms in Eq. (3.54) may be disregarded. Equations (3.54) and (3.55) finally lead to

$$\frac{\partial P(x,\tau)}{\partial\tau} = \left[\frac{\partial}{\partial x}\frac{dV}{dx} + k_B T\frac{\partial^2}{\partial x^2}\right] P(x,\tau) \qquad (3.57)$$
Transforming back to the original time variable t = γmτ results in the Smoluchowski equation

$$\frac{\partial P(x,t)}{\partial t} = D\frac{\partial}{\partial x}\left(\frac{\partial}{\partial x} + \beta\frac{\partial V}{\partial x}\right) P(x,t); \qquad \beta = \frac{1}{k_B T} \qquad (3.58)$$

$$D = \frac{k_B T}{m\gamma} \qquad (3.59)$$
The Smoluchowski equation (3.58) is the overdamped limit of the original Fokker–Planck equation. When the potential V is constant it becomes the well-known diffusion equation. Equation (3.59) is a relationship between the diffusion constant D and the friction coefficient γ, and is known as the Einstein relation. Let us consider some properties of Eq. (3.58). First note that it can be rewritten in the form

$$\frac{\partial P(x,t)}{\partial t} = -\frac{\partial}{\partial x}J(x,t) \qquad (3.60)$$
where the probability flux J is given by

$$J = -D\left(\frac{\partial}{\partial x} + \beta\frac{\partial V}{\partial x}\right) P(x,t) \qquad (3.61)$$
As discussed above (see Eq. (3.46) and the discussion below it), Eq. (3.60) has the form of a conservation rule, related to the fact that the overall probability is conserved.⁶ The 3-dimensional generalization of (3.58),

$$\frac{\partial P(\mathbf{r},t)}{\partial t} = D\nabla\cdot(\beta\nabla V + \nabla)\, P(\mathbf{r},t) \qquad (3.62)$$
can similarly be written as the divergence of a flux,

$$\frac{\partial P(\mathbf{r},t)}{\partial t} = -\nabla\cdot\mathbf{J} \qquad (3.63a)$$

$$\mathbf{J} = -D(\beta\nabla V + \nabla)\, P(\mathbf{r},t) \qquad (3.63b)$$
Again, Eq. (3.63a) is just a conservation law: it may be shown to be equivalent to the integral form

$$\frac{dP_\Omega}{dt} = -\oint_S \mathbf{J}(\mathbf{r})\cdot d\mathbf{s} \qquad (3.64)$$
⁶ If N is the total number of particles, NP(x) is the particle number density. The conservation of the integrated probability, i.e., ∫dx P(x, t) = 1, is equivalent to the conservation of the total number of particles: in the process under discussion particles are neither destroyed nor created; they only move in position space.
where

$$P_\Omega = \int_\Omega d^3r\, P(\mathbf{r}) \qquad (3.65)$$

is the probability to be in the region of space Ω surrounded by the closed surface S, and where ds is a vector whose magnitude equals that of the surface element and whose direction is normal to that element in the outward direction. (The flux through a surface that defines a closed subspace is defined to be positive when it is directed outward.) Equation (3.64) expresses the fact that the change in the probability to be in the region Ω is associated with the fluxes that enter or leave through the boundary S of Ω. The equivalence of the differential form (3.63a) and the integral form (3.64) of the conservation condition is a direct result of the divergence theorem of vector calculus,
$$\oint_S \mathbf{J}(\mathbf{r})\cdot d\mathbf{s} = \int_\Omega (\nabla\cdot\mathbf{J})\, d^3r \qquad (3.66)$$
that implies

$$\int_\Omega d^3r\,\frac{\partial P(\mathbf{r},t)}{\partial t} = -\int_\Omega d^3r\,(\nabla\cdot\mathbf{J}) \;\Rightarrow\; \frac{\partial P}{\partial t} = -\nabla\cdot\mathbf{J} \qquad (3.67)$$
Secondly, the flux is seen to be a sum of two terms, J = J_D + J_F, where J_D = −D∂P/∂x (or, in 3D, J_D = −D∇P) is the diffusion flux, while J_F = Dβ(−∂V/∂x)P (or, in 3D, J_F = βD(−∇V)P) is the flux caused by the force F = −∂V/∂x (or F = −∇V). The latter corresponds to the term vP in (3.46b), where the drift velocity v is proportional to the force, i.e., J_F = uFP. This identifies the mobility u as

$$u = \beta D = (m\gamma)^{-1} \qquad (3.68)$$
Finally note that at equilibrium the flux should be zero. Equation (3.61) then leads to a Boltzmann distribution:

$$\frac{\partial P}{\partial x} = -\beta\frac{\partial V}{\partial x}P \;\Rightarrow\; P(x) = \text{const}\cdot e^{-\beta V(x)} \qquad (3.69)$$
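A minimal numerical sketch of the overdamped dynamics (3.12): a Brownian-dynamics trajectory in an illustrative harmonic potential (the potential and all parameter values are assumptions), using the mobility u = βD of Eq. (3.68) for the drift, should sample the Boltzmann distribution (3.69).

```python
import numpy as np

# Sketch: Brownian (overdamped) dynamics for V(x) = 0.5*m*w0^2*x^2, checking
# the equilibrium result (3.69): <x^2> -> kB*T/(m*w0^2). D follows the
# Einstein relation (3.59).
rng = np.random.default_rng(2)
m, w0, gamma, kT = 1.0, 1.0, 10.0, 1.0
D = kT/(m*gamma)                      # Eq. (3.59)
dt, nsteps = 1e-2, 500_000

x, x2sum = 0.0, 0.0
for _ in range(nsteps):
    force = -m*w0**2*x
    x += (D/kT)*force*dt + np.sqrt(2.0*D*dt)*rng.normal()   # drift + diffusion
    x2sum += x*x
print("<x^2> =", x2sum/nsteps, "   expected kB*T/(m*w0^2) =", kT/(m*w0**2))
```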
The procedure to find the evolution equation for the probability distribution equivalent to the general 1D Langevin Eq. (3.4) is similar in spirit to that described for the derivation of the Smoluchowski equation, and will not be repeated here. It leads to the Fokker–Planck equation

$$\frac{\partial P(x,v,t)}{\partial t} = \left[-v\frac{\partial}{\partial x} + \frac{1}{m}\frac{\partial V}{\partial x}\frac{\partial}{\partial v} + \gamma\frac{\partial}{\partial v}\left(v + \frac{k_B T}{m}\frac{\partial}{\partial v}\right)\right] P(x,v,t) \qquad (3.70)$$
To understand the physical content of this equation consider first the case where γ vanishes. In this case the Langevin Eq. (3.4) becomes the deterministic Newton equation v̇ = −(1/m)∂V/∂x, and Eq. (3.70) is seen to take the form

$$\frac{\partial P}{\partial t} = -\dot{x}\frac{\partial P}{\partial x} - \dot{v}\frac{\partial P}{\partial v} \qquad (3.71)$$

This is, again, an expression for the conservation of probability, since it implies

$$\frac{dP}{dt} = \frac{\partial P}{\partial t} + \frac{dx}{dt}\frac{\partial P}{\partial x} + \frac{dv}{dt}\frac{\partial P}{\partial v} = 0 \qquad (3.72)$$

Equation (3.72) is known as the Liouville equation (written here for a 1D single particle system), and is completely equivalent to the Newton equation of motion. Next consider the term proportional to γ in Eq. (3.70). This term is responsible for the system–bath coupling. We see that γ(v + (k_BT/m)∂/∂v)P(x, v, t) is an expression for a dissipative probability flux: it is a flux in the v direction that results from the coupling of our particle to the thermal environment. This coupling cannot change P if P is already an equilibrium distribution. Indeed,

$$\left(v + \frac{k_B T}{m}\frac{\partial}{\partial v}\right) e^{-\beta\frac{1}{2}mv^2} = 0 \qquad (3.73)$$

i.e., the dissipative flux vanishes at equilibrium just as the deterministic flux does. Finally note that the dissipative flux does not depend on the potential.
4. Applications to Chemical Reactions in Condensed Phases

In this section we apply the mathematical apparatus developed above to study the dynamics of chemical reactions in condensed phases. Only two simple examples will be treated in some detail, while other physical situations will only be briefly outlined. A chemical reaction may generally be viewed as taking place in two principal stages. In the first the reactants are brought together, and in the second the assembled chemical system undergoes the structural/chemical change. In a condensed phase the first process involves diffusion, sometimes (e.g., when the species involved are charged) in a force field. The second stage often involves the crossing of a potential barrier. In unimolecular reactions the species that undergoes the chemical change is already assembled and only the latter process is relevant. Also, when the barrier is high the latter process is rate determining. On the other hand, in bimolecular reactions, if the barrier is low (of order k_BT or less) the diffusion process that brings the reactants together may be rate determining. In this case the reaction is controlled by this diffusion process. In what follows we treat these two processes separately.
4.1. Diffusion Controlled Reactions
To treat the case where the reaction rate is determined by the rate at which reactants approach each other we model the molecular motion by the Smoluchowski equation. To be specific we consider two species, A and B, where the A molecules are assumed to be static while the B molecules undergo diffusion characterized by a diffusion coefficient D.⁷ A chemical reaction in which B disappears occurs when B reaches a critical distance R* from A. We will assume that A remains intact in this reaction. The macroscopic rate equation is

$$\frac{d[B]}{dt} = -k[B][A] \qquad (4.1)$$

where [A] and [B] are molar concentrations. We want to relate the rate coefficient k to the diffusion coefficient D. It is convenient to define A = 𝒜[A] and B = 𝒜[B], where 𝒜 is the Avogadro number and A and B are molecular number densities of the two species. In terms of these quantities Eq. (4.1) takes the form

$$\frac{dB}{dt} = -\frac{kBA}{\mathcal{A}} \qquad (4.2)$$

Macroscopically the system is homogeneous. Microscopically, however, as the reaction proceeds, the concentration of B near any A center becomes depleted and the rate becomes dominated by the diffusion process that brings fresh supply of B into the neighborhood of A. Focusing on one particular A molecule we consider the distribution of B molecules, B(r) = N_B P(r), in its neighborhood. Here N_B is the total number of B molecules and P(r) is the probability density for finding B molecules at position r given that an A molecule resides at the origin. P(r), and therefore B(r), satisfy the Smoluchowski equation

$$\frac{\partial B(\mathbf{r},t)}{\partial t} = -\nabla\cdot\mathbf{J}; \qquad \mathbf{J} = -D(\beta\nabla V + \nabla)\, B(\mathbf{r},t) \qquad (4.3)$$

where V is the A–B interaction potential. At long time a steady state is established, in which B disappears as it reaches a distance R* from A and a constant diffusion flux of B molecules is maintained towards A. If D is small (i.e., if the reaction is diffusion controlled) this steady state is established well before B is consumed, and we are interested in the corresponding steady state rate. For simplicity we assume that the A and B molecules are spherical, so that the interaction between them depends only on their relative distance r and

⁷ It can be shown that if the molecules A diffuse as well, the same formalism applies, with D replaced by D_A + D_B.
the steady state distribution is spherically symmetric. This implies that only the radial part of J is non-zero,

$$J(r) = -D\left(\beta\frac{dV(r)}{dr} + \frac{d}{dr}\right) B(r) \qquad (4.4)$$

Furthermore, conservation of the B density in each spherical shell surrounding A implies that the integral of J(r) over any sphere centered about A is a constant independent of the sphere radius. Denoting this constant by −J₀ gives

$$J(r) = -\frac{J_0}{4\pi r^2} \qquad (4.5)$$
Using this in Eq. (4.4) leads to

$$J_0 = 4\pi D r^2\left(\beta\frac{dV(r)}{dr} + \frac{d}{dr}\right) B(r) \qquad (4.6)$$
It is convenient at this point to change variable, putting B(r) = b(r) exp(−βV(r)). Equation (4.6) then becomes

$$\frac{db(r)}{dr} = \frac{J_0}{4\pi D}\,\frac{e^{\beta V(r)}}{r^2} \qquad (4.7)$$
which may be integrated from R* to ∞ to yield

$$b(\infty) - b(R^*) = \frac{J_0}{4\pi D\lambda}; \qquad \text{with } \lambda^{-1} \equiv \int_{R^*}^{\infty} dr\,\frac{e^{\beta V(r)}}{r^2} \qquad (4.8)$$
λ is a parameter of dimension length. Note that in the absence of an A − B interaction (i.e., V (r) = 0) λ = R ∗ . We will assume that V (r) → 0 as r → ∞. This implies that b(∞) = B(∞) = B, the bulk number density of the B species. Denoting B ∗ = B(R ∗ ) and V ∗ = V (R ∗ ), Eq. (4.8) finally gives
$$B^* = \left(B - \frac{J_0}{4\pi D\lambda}\right) e^{-\beta V^*} \qquad (4.9)$$
Consider now Eq. (4.2). The rate dB/dt at which B is consumed (per unit volume) is equal to the integrated B flux towards any A center, multiplied by the number of such centers per unit volume:

$$-\frac{kBA}{\mathcal{A}} = 4\pi r^2 J(r)\, A = -J_0 A \qquad (4.10)$$

whence

$$J_0 = \frac{kB}{\mathcal{A}} \qquad (4.11)$$
Using this in Eq. (4.9) leads to

$$B^* = B\, e^{-\beta V^*}\left(1 - \frac{k}{4\pi D\lambda\mathcal{A}}\right) \qquad (4.12)$$
If upon reactive contact, i.e., when r = R*, reaction occurs instantaneously with unit probability, then B* = 0. The steady state rate is then

$$k = 4\pi\mathcal{A} D\lambda \qquad (4.13)$$
The result (4.13) is the simplest form of rate expression for diffusion controlled reactions. Note again that λ = R* if V(r) = 0. More generally, it is possible that B disappears at R* with a rate that is proportional to B*, i.e.,

$$\frac{dB}{dt} = -\frac{kBA}{\mathcal{A}} = -k^* B^*\frac{A}{\mathcal{A}}, \qquad \text{i.e., } kB = k^* B^* \qquad (4.14)$$

Using this in Eq. (4.12) leads to

$$k = \frac{4\pi D\lambda\mathcal{A}}{1 + \left(4\pi D\lambda\mathcal{A}/k^* e^{-\beta V^*}\right)} \qquad (4.15)$$
which yields the result (4.13) in the limit k*e^{−βV*} → ∞. V* is the interaction potential between the A and B species at the critical separation distance R* (on a scale where V(∞) = 0), and can be positive or negative. A strongly positive V* amounts to a potential barrier to reaction. In the limit k*e^{−βV*} → 0 we get k = k*e^{−βV*}. In this case, however, we need independent information about the rate k* of the process (usually barrier crossing) that takes place once the reactants have been assembled. This rate is the subject of our discussion next.
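The parameter λ of Eq. (4.8) is a one-dimensional quadrature and is easily evaluated numerically. In the following minimal sketch, the screened-Coulomb form of V(r) and all parameter values are illustrative assumptions:

```python
import numpy as np

# Sketch: evaluate lambda, Eq. (4.8), for an attractive screened interaction
# V(r) = (q/r)*exp(-r/rs), then form the diffusion-controlled rate (4.13)
# per A center (the Avogadro factor is omitted here).
beta, D, Rstar = 1.0, 1.0, 1.0
q, rs = -2.0, 3.0

r = np.linspace(Rstar, 200.0, 400_001)       # finite upper cutoff for infinity
V = (q/r)*np.exp(-r/rs)
lam = 1.0/np.trapz(np.exp(beta*V)/r**2, r)   # Eq. (4.8)
print("lambda =", lam, "   (lambda -> R* =", Rstar, "for V = 0)")
print("4*pi*D*lambda =", 4.0*np.pi*D*lam)
```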
4.2. Barrier Crossing Processes
Once the reactants are assembled, the reaction may take place. This usually involves some configurational change within the assembled species. In many cases the initial and final configurations are relatively stable (if not, this configurational change will take place almost instantaneously and we would be in the diffusion controlled limit). This implies that en route between the reactant and product configurations the system has to go through a maximum of the potential energy (in fact, the free energy). The reaction rate then becomes the rate of the barrier crossing process. If we model the process as the stochastic motion of a particle along the “reaction coordinate”, with a potential characterized by minima for the reactant and product species and a barrier between them, we can treat the dynamics of this process using the Fokker–Planck equation, Eq. (3.70). Before we do this we discuss a particular situation in which knowledge of this dynamics is not needed.
4.2.1. Transition state theory (TST)

Consider a system of particles moving in a box at thermal equilibrium, under their mutual interactions. In the absence of any external forces the system will be homogeneous, characterized by the equilibrium particle density. From the Maxwell velocity distribution for the particles we can easily calculate the equilibrium flux in any direction inside the box, say in the positive x direction, Jₓ = ρ⟨υₓ⟩, where ρ is the density of particles and ⟨υₓ⟩ = ∫₀^∞ dυₓ υₓ e^{−βmυₓ²/2} / ∫_{−∞}^∞ dυₓ e^{−βmυₓ²/2}. Obviously this quantity has no relation to the kinetic processes observed in the corresponding non-equilibrium system. For example, if we disturb the homogeneous distribution of particles, the rate of the resulting diffusion process is associated with the net particle flux (the difference between fluxes in opposing directions), which is zero at equilibrium. There are, however, situations where the equilibrium current calculated as described above, through a carefully chosen surface, provides a good approximation for an observed non-equilibrium rate. In fact, for many chemical processes characterized by transitions through high energy barriers, this approximation is so successful that dynamical effects result in relatively small corrections to the so-called transition state theory, which is based on the calculation of just that equilibrium flux. To understand the reason for this success we need to carefully examine the assumptions underlying this theory: (1) The rate can be calculated for a system in thermal equilibrium. This assumption is based on the observation that the timescale of processes characterized by high barriers is much longer than the timescale for achieving local thermal equilibrium in the reactant and product regions. The only quantity which remains in a non-equilibrium state on this long timescale is the relative concentration of reactants and products. (2) The rate is associated with the equilibrium flux across the boundary separating reactants and products. The fact that this boundary is characterized by a high energy barrier is again essential here. Suppose that the barrier is impenetrable at first and we start with all particles in the “reactant state”, say to the left of the barrier in Fig. 1. On a very short timescale thermal equilibrium is achieved in this state. Assumption (1) assures us that this thermal equilibrium is maintained also after we lower the barrier to its actual (large) height. Assumption (2) suggests that if we count the number of barrier crossing events per unit time in the direction reactants → products using the Maxwell distribution of velocities (implied by assumption (1)), we get a good representation of the rate. For this to be true, the event of barrier crossing has to be the deciding factor concerning the transformation of reactants to products. This is far less simple than it sounds: if the particle is infinitesimally to the left or to the right of the barrier its identity as reactant or product is
Figure 1. A schematic view of the potential surface for a unimolecular reaction. x = 0 is the bottom of the reactant well; x_B is the position of the barrier.
not assured. It is only after subsequent relaxation leads it towards the bottom of the corresponding well that its identity is determined. Assumption (2) in fact states that all equilibrium trajectories crossing the barrier are reactive, i.e., indeed go from well defined reactants to well defined products. For this to be approximately true two conditions should be satisfied: (a) the barrier region should be small relative to the mean free path of the particles along the reaction coordinate, so that their transition from a well defined left to a well defined right is undisturbed and can be calculated from the thermal velocity, and (b) once the particles cross the barrier they relax quickly to the equilibrium reactant or product states. These conditions are inconsistent with each other: the fast relaxation required by the latter will make the mean free path small, in contrast to the requirement of the former. Indeed, this is the origin of the failure of assumption (2) in a barrierless process. Such a process proceeds by diffusion, which is defined over lengthscales large relative to the mean free path of the particles. Therefore, the transition region needed to define the “final” location of the particle to the left or to the right cannot be smaller than this mean free path. For a transition region located at the top of a high barrier we have a more favorable situation: once the particle has crossed the barrier, it gains kinetic energy as it goes into the well region, and since the rate of energy loss due to friction is proportional to the kinetic energy, the particle may lose energy quickly and become identified as a product before it is reflected back to the reactant well. This fast energy loss from the reaction coordinate is strongly accelerated in real chemical systems, where the reaction coordinate is usually strongly coupled, away from the barrier region, to other non-reactive molecular degrees of freedom.
We next derive the transition state rate of escape for a simple example. Consider an activated rate process represented by the escape of a particle from a 1D potential well (Fig. 1). The Hamiltonian of the particle is

$$H = \frac{p^2}{2m} + V(x) \qquad (4.16)$$
where V(x) is characterized by a potential well with a minimum at x = 0 and a potential barrier peaked at x = x_B > 0, separating reactants (x < x_B) from products (x > x_B) (Fig. 1). Under the above assumptions the rate coefficient for the escape of a particle out of the well is given by the forward flux at the transition state x = x_B,

$$k_{TST} = \langle v_f\rangle\, P(x_B) \qquad (4.17)$$
where P(x B )dx is the equilibrium probability that the particle is within dx of x B , P(x B ) = E B
−∞
exp (β E B ) dx exp (−βV (x))
;
E B = V (x B )
(4.18)
and where \langle v_f \rangle is the average of the forward velocity,

\langle v_f \rangle = \frac{\int_0^{\infty} dv\, v\, e^{-(1/2)\beta m v^2}}{\int_{-\infty}^{\infty} dv\, e^{-(1/2)\beta m v^2}} = \frac{1}{\sqrt{2\pi\beta m}}    (4.19)
Note that the fact that only half the particles move in the forward direction is taken into account in the normalization of Eq. (4.19). For a high barrier, most of the contribution to the integral in the denominator of (4.18) comes from regions of the coordinate x for which V(x) is well represented by a harmonic well, V(x) = (1/2) m \omega_0^2 x^2. Under this approximation the denominator becomes
\int_{-\infty}^{\infty} dx\, e^{-(1/2)\beta m \omega_0^2 x^2} = \sqrt{\frac{2\pi}{\beta m \omega_0^2}}    (4.20)
Inserting Eqs. (4.18)–(4.20) into (4.17) leads to

k_{TST} = \frac{\omega_0}{2\pi}\, e^{-\beta E_B}    (4.21)
The transition state rate is of a typical Arrhenius form: a product of a frequency factor that may be interpreted as the number of attempts, per unit time, that
the particle makes to exit the well, and an activation term associated with the height of the barrier. It is important to note that it does not depend on the coupling between the molecule and its environment, only on parameters that determine the equilibrium distribution. A similar derivation can be done in the multidimensional case. For a molecular system with N nuclear degrees of freedom, represented below by their coordinates x_1, x_2, ..., the analog of the frequency \omega_0 is a set of N frequencies that characterize the bottom of the well. These are the normal mode frequencies obtained by diagonalizing the force constant matrix \partial^2 V/\partial x_i \partial x_j evaluated at the position of the well bottom. The barrier in this case is a saddle point on this potential surface, and the normal mode frequencies evaluated at the saddle configuration include at least one imaginary frequency. It is the normal mode coordinate associated with this frequency that defines the reaction coordinate at the barrier. Denoting the normal mode frequencies evaluated at the well bottom by {\omega_{0j}; j = 1, ..., N} and the frequencies of the stable normal modes evaluated at the saddle point by {\omega_{Bj}; j = 1, ..., N - 1}, the generalization of Eq. (4.21) reads
k_{TST} = \frac{1}{2\pi} \frac{\prod_{i=1}^{N} \omega_{0i}}{\prod_{i=1}^{N-1} \omega_{Bi}}\, e^{-\beta E_B}    (4.22)
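As a concrete numerical illustration of Eqs. (4.21) and (4.22), consider the following minimal Python sketch; the frequencies, barrier height, and temperature below are invented for illustration and are not taken from the text.

```python
import numpy as np

kB = 1.380649e-23   # J/K
T = 300.0           # K (illustrative)
beta = 1.0 / (kB * T)

# --- 1D TST, Eq. (4.21): k = (omega0 / 2 pi) * exp(-beta * E_B)
omega0 = 1.0e13     # rad/s, well frequency (illustrative)
E_B = 0.5e-19       # J, barrier height (illustrative)
k_1d = omega0 / (2.0 * np.pi) * np.exp(-beta * E_B)
print(f"1D TST rate: {k_1d:.3e} s^-1")

# --- Multidimensional TST, Eq. (4.22): ratio of products of normal mode
# frequencies at the well bottom (N of them) and at the saddle (N-1 stable ones)
omega_well   = np.array([1.0e13, 1.5e13, 2.0e13])   # N frequencies at the well bottom
omega_saddle = np.array([1.4e13, 1.9e13])           # N-1 stable frequencies at the saddle
k_nd = np.prod(omega_well) / np.prod(omega_saddle) / (2.0 * np.pi) * np.exp(-beta * E_B)
print(f"Multidimensional TST rate: {k_nd:.3e} s^-1")
```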
Some observations: In view of the simplifying assumptions which form the basis for transition state theory, its success in many practical situations may come as a surprise. Bear in mind, however, that transition state theory accounts quantitatively for the most important factor affecting the rate – the activation energy. Dynamical theories which account for deviations from TST often deal with effects which are orders of magnitude smaller than that associated with the activation barrier. Environmental effects on the dynamics of chemical reactions in solution are therefore often masked by solvent effects on the activation free energy. Transition state theory is important in one additional respect: it is clear from the formulation above that the rate (4.21), or (4.22), constitutes an upper bound to the exact rate. The reason for this is that the correction factor discussed above, essentially the probability that an escaping equilibrium trajectory is indeed a reactive trajectory, is smaller than unity. This observation forms the basis of the so-called variational transition state theory, which exploits the freedom of choosing the dividing surface between reactants and products: since any dividing surface will yield an upper bound to the exact rate, the best choice is that which minimizes the TST rate. Corrections to TST arise from dynamical effects on the rate. They appear when the coupling to the thermal environment is either too large or too small. In the framework of the Fokker–Planck Eq. (3.70) these are the limits γ → ∞ and γ → 0, respectively, where the friction γ measures the strength of the
system coupling to its environment. In the first case the total outgoing flux out of the reactant region is not a good representative of the reactive flux, because most of the trajectories cross the dividing surface many times – a general characteristic of a diffusive process. In the extreme strong coupling case the system cannot execute any large amplitude motion, and the actual rate vanishes even though the transition state rate is still given by the expressions derived above. In the opposite limit of a very small coupling between the system and its thermal environment, it is the assumption that thermal equilibrium is maintained in the reactant region that breaks down [8]. In the extreme limit of this situation the rate is controlled not by the time it takes a thermal particle to traverse the barrier, but by the time it takes the reactant particle to accumulate enough energy to reach the barrier. As just stated, both limits can be obtained from the general solution of Eq. (3.70). The rate obtained from this analysis is found to vanish like γ when γ → 0 and like γ^{-1} when γ → ∞. In the following subsection we show how this result is obtained in one of these limits, that of large friction.
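These limiting behaviors can be pieced together numerically. The sketch below combines the high-friction result derived in the next subsection, Eq. (4.37), with a commonly quoted weak-coupling (energy-diffusion) form, k ≈ γ β E_B e^{−βE_B}, valid when the action at the barrier energy is approximately 2πE_B/ω_0; the parameters and the crude min() interpolation are purely illustrative of the turnover sketched in Fig. 2.

```python
import numpy as np

# Illustrative parameters in reduced units (not taken from the text).
omega0 = 1.0        # well frequency
omegaB = 1.0        # barrier frequency
betaEB = 10.0       # dimensionless barrier height, beta*E_B
boltz = np.exp(-betaEB)

k_tst = omega0 / (2.0 * np.pi) * boltz       # Eq. (4.21); independent of friction

for gamma in np.logspace(-4, 4, 9):
    k_low = gamma * betaEB * boltz           # weak-coupling (energy-diffusion) limit, k ~ gamma
    k_high = (omegaB / gamma) * k_tst        # strong-coupling limit, Eq. (4.37), k ~ 1/gamma
    k_est = min(k_low, k_tst, k_high)        # crude interpolation between the limits
    print(f"gamma = {gamma:9.1e}   k ~ {k_est:.3e}")
```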
4.2.2. Barrier crossing in the high friction limit

The theory of unimolecular rates based on the Fokker–Planck Eq. (3.70) is known as the Kramers theory. The Fokker–Planck equation

\frac{\partial P(x, v; t)}{\partial t} = -v \frac{\partial P}{\partial x} + \frac{1}{m} \frac{dV}{dx} \frac{\partial P}{\partial v} + \gamma \frac{\partial}{\partial v}\left( v P + \frac{k_B T}{m} \frac{\partial P}{\partial v} \right)    (4.23)
with a potential V(x) characterized by a barrier (Fig. 1) is sometimes referred to in the present context as the Kramers equation. In the high friction limit this equation yields the Smoluchowski equation, Eq. (3.58), that we write in the form

\frac{\partial P(x, t)}{\partial t} = -\frac{\partial}{\partial x} J(x, t)    (4.24)

J = -D \left( \frac{\partial}{\partial x} + \beta \frac{dV}{dx} \right) P(x, t)    (4.25)

Denoting by A the amount of reactant in the well [9],

A \equiv \int_{well} dx\, P(x, t)    (4.26)
[8] If the product space is bound, another source of error is the breakdown of the assumption of fast equilibration in the product region. Unrelaxed trajectories may find their way back into the reactant subspace.
[9] If N is the total number of particles, their number in the well region is NA. A is the fraction of particles in the well, or the probability per particle to be in the well.
The rate of escape from this initial well is given by k = A^{-1}(-dA/dt). The reaction, i.e., the escape from the well, depletes the number of particles in the well. However, the rate coefficient k is well defined as a time independent constant when it does not depend on this number. We can therefore evaluate the rate by considering an artificial situation in which A is maintained strictly constant (so the quasi-steady state is replaced by a true one) by imposing a source at the bottom of the well and a sink outside it. For the mathematical derivation it may be more convenient to put the source at x = −∞ and the sink at x = +∞, where the geometry of Fig. 1 is considered. In fact, this source does not have to be described in detail: we simply impose the condition that the population (or probability) inside the well, far from the barrier region, is fixed, while outside the well we impose the condition that it is zero. Under such conditions the system will approach, at long time, a steady state in which ∂P/∂t = 0 but J ≠ 0. The desired rate k is then given by J/A. To carry out this program we start from the Smoluchowski equation (3.58), written in the form

\frac{\partial P(x, t)}{\partial t} = -\frac{\partial}{\partial x} J(x, t)    (4.27)

J = -D \left( \frac{\partial}{\partial x} + \beta \frac{dV}{dx} \right) P(x, t)    (4.28)
At steady state J is constant and

D \left( \frac{d}{dx} + \beta \frac{dV}{dx} \right) P_{ss}(x) = -J    (4.29)
J can be viewed as one of the integration constants to be determined by the boundary conditions. Without extra effort the case D = D(x) can be considered. The equation for P_{ss}(x) is then

\left( \frac{d}{dx} + \beta \frac{dV}{dx} \right) P_{ss}(x) = -\frac{J}{D(x)}    (4.30)
Looking for a solution of the form

P_{ss}(x) = f(x)\, e^{-\beta V(x)}    (4.31)

we find

\frac{df}{dx} = -\frac{J}{D(x)}\, e^{\beta V(x)}    (4.32)

f(x) = -J \int_{\infty}^{x} dx' \frac{e^{\beta V(x')}}{D(x')} = J \int_{x}^{\infty} dx' \frac{e^{\beta V(x')}}{D(x')}    (4.33)
The choice of ∞ as an integration limit amounts to the choice of boundary condition f(x → ∞) = 0, which represents a sink at infinity. This leads to

P_{ss}(x) = J\, e^{-\beta V(x)} \int_{x}^{\infty} dx' \frac{e^{\beta V(x')}}{D(x')}    (4.34)
Integrating both sides from x = −∞ to x = x_B finally leads to

k = \frac{J}{\int_{-\infty}^{x_B} dx\, P_{ss}(x)} = \left[ \int_{-\infty}^{x_B} dx\, e^{-\beta V(x)} \int_{x}^{\infty} dx' \frac{e^{\beta V(x')}}{D(x')} \right]^{-1}    (4.35)
This result can be further simplified by using the high barrier assumption, β(V(x_B) − V(0)) ≫ 1, that was already recognized as a condition to get a meaningful unimolecular behavior with a time independent rate constant. In this case the largest contribution to the inner integral comes from the neighborhood of the barrier, x' = x_B, so exp[βV(x')] can be replaced by exp[β(E_B − (1/2) m ω_B^2 (x' − x_B)^2)], while the main contribution to the outer integral comes from the bottom of the well at x = 0, so exp[−βV(x)] can be replaced by exp[−(1/2) β m ω_0^2 x^2]. This then gives
k = \left[ \int_{-\infty}^{\infty} dx\, e^{-(1/2)\beta m \omega_0^2 x^2} \int_{-\infty}^{\infty} dx' \frac{e^{\beta [E_B - (1/2) m \omega_B^2 (x' - x_B)^2]}}{D(x_B)} \right]^{-1}    (4.36)
The integrals are now straightforward, and the result is (using D = (\beta m \gamma)^{-1})

k = \frac{\omega_0 \omega_B}{2\pi\gamma}\, e^{-\beta E_B} = \frac{\omega_B}{\gamma}\, k_{TST}    (4.37)

The resulting rate is expressed as a corrected TST rate. Recall that we have considered a situation where the damping γ is faster than any other characteristic rate in the system. Therefore the correction term is smaller than unity, as expected.
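The double integral of Eq. (4.35) can also be evaluated numerically and compared with the closed form (4.37). The following sketch does this for an invented cubic potential (for which ω_B = ω_0) with illustrative parameters; the two numbers should agree closely for a high barrier.

```python
import numpy as np

# Cubic potential V(x) = a x^2 - b x^3: harmonic well at x = 0 (frequency omega0),
# barrier at x_B = 2a/(3b); for this form the barrier curvature gives omegaB = omega0.
m, beta = 1.0, 1.0
a = 0.5                          # => omega0 = sqrt(2a/m) = 1
E_B = 10.0                       # beta*E_B = 10, a "high" barrier (illustrative)
x_B = np.sqrt(3.0 * E_B / a)
b = 2.0 * a / (3.0 * x_B)
V = lambda x: a * x**2 - b * x**3

D = 0.01                         # diffusion coefficient; gamma = 1/(beta*m*D) = 100 >> omega0
gamma = 1.0 / (beta * m * D)

# Eq. (4.35): k = [ int_{-inf}^{x_B} dx e^{-beta V} int_x^{inf} dx' e^{beta V}/D ]^{-1}
x = np.linspace(-6.0, x_B + 8.0, 40001)
dx = x[1] - x[0]
w_in = np.exp(beta * V(x)) / D
inner = np.cumsum(w_in[::-1])[::-1] * dx        # inner[i] ~ integral from x[i] to x_max
outer = np.exp(-beta * V(x)) * inner
k_num = 1.0 / np.sum(outer[x <= x_B] * dx)

omega0 = np.sqrt(2.0 * a / m)
omegaB = omega0                                  # property of the cubic form used here
k_kramers = omega0 * omegaB / (2.0 * np.pi * gamma) * np.exp(-beta * E_B)
print(f"numerical Eq. (4.35): {k_num:.4e}   high-friction Eq. (4.37): {k_kramers:.4e}")
```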
4.2.3. Other results

The simplicity of the 1D Smoluchowski Eq. (3.58) has made it possible for us to obtain a practically exact rate expression for the large friction, high barrier case. Other situations can be handled by solving the Fokker–Planck Eq. (3.70). We have already mentioned that the rate obtained in the low friction limit (γ → 0) is proportional to γ. Since the transition state theory result, Eq. (4.21), which does not depend on γ, is an upper bound to the rate, the overall picture of the dependence of the rate on the molecule–environment coupling, as expressed by the friction, has to look like Fig. 2, where the dotted line interpolates between the limiting cases.
Figure 2. A schematic view of the dependence of the reaction rate k on the friction coefficient γ: k rises as k ∼ γ at low friction, is bounded by k_TST = (ω_0/2π) exp(−βE_B), and falls off as k ∼ γ^{−1} at high friction.
A full stochastic theory of chemical reactions has to consider further generalizations of this model. First, multidimensional models have to be considered. Second, the Markovian approximation, which was founded on the assumption that the bath dynamics is much faster than all other degrees of freedom in the system, is not a good approximation for many molecular processes, since periods of molecular frequencies are considerably shorter than characteristic timescales of a typical thermal environment. Finally, for some reactions, in particular those involving motions of hydrogen atoms at low temperature, tunneling may be important. The scope of this chapter does not allow the coverage of these fascinating issues. Rather, it provides the introduction and the necessary conceptual ingredients for further studies of these subjects.
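The non-Markovian generalization mentioned above can be illustrated with a short numerical sketch. A generalized Langevin equation with an exponential memory kernel ζ(t) = (γ_0/τ) e^{−t/τ} admits a standard Markovian embedding through one auxiliary variable that satisfies the fluctuation–dissipation relation; the potential, kernel parameters, and simple Euler-type integrator below are all illustrative choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Markovian embedding of a GLE with memory kernel zeta(t) = (gamma0/tau) exp(-t/tau).
# The limit tau -> 0 recovers ordinary (Markovian) Langevin friction of strength gamma0.
m, kBT = 1.0, 1.0
omega0 = 1.0                      # harmonic test potential frequency
gamma0, tau = 2.0, 5.0            # total friction and memory time (illustrative)
dt, nsteps = 2e-3, 500_000

x, v, u = 1.0, 0.0, 0.0           # u is the auxiliary (memory) force
v2_sum, n_acc = 0.0, 0
for step in range(nsteps):
    F = -m * omega0**2 * x
    v += dt * (F + u) / m         # semi-implicit Euler update
    x += dt * v
    u += dt * (-u / tau - (gamma0 / tau) * v) \
         + np.sqrt(2.0 * gamma0 * kBT * dt) / tau * rng.standard_normal()
    if step > nsteps // 10:       # skip initial transient
        v2_sum += v * v
        n_acc += 1

print(f"<v^2> = {v2_sum/n_acc:.3f}  (equipartition predicts kBT/m = {kBT/m})")
```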
5.8 APPROXIMATE QUANTUM MECHANICAL METHODS FOR RATE COMPUTATION IN COMPLEX SYSTEMS

Steven D. Schwartz
Departments of Biophysics and Biochemistry, Albert Einstein College of Medicine, New York, USA
1. Introduction
The last 20 years have seen qualitative leaps in the complexity of chemical reactions that have been studied using theoretical methods. While methodologies for small molecule scattering are still of great importance and under active development [1], two important trends have allowed the theoretical study of the rates of reaction in complex molecules, condensed phase systems, and biological systems. First, there has been the explicit recognition that the type of state-to-state information obtained by rigorous scattering theory is not only not possible for complex systems, but, more importantly, not meaningful. Thus, methodologies have been developed that compute averaged rate data directly from a Hamiltonian. Perhaps the most influential of these approaches has been the correlation function formalisms developed by Bill Miller et al. [2]. While these formal expressions for rate theories are certainly not the only correlation function descriptions of quantum rates [3, 4], these expressions of rates directly in terms of evolution operators, and in their coordinate space representations as Feynman propagators, have lent themselves beautifully to complex systems, because many of the approximation methods that have been devised are for Feynman propagator computation. This fact brings us to the second contributor to the blossoming of these approximate methods: the development of a wide variety of approximate mathematical methods to compute the time evolution of quantum systems. The marriage of these two developments has created the powerful tools needed to probe systems of complexity unimagined just a few decades ago.
The structure of this chapter will be as follows: in Section 2, we will briefly review correlation function theories of chemical reaction rates. Particular attention will be paid to specific features that allow approximation. Section 3 will begin the description of approximation methods for rate computation. The first approach studied will be that developed in our group – the quantum Kramers methodology coupled with approximate propagator computation via operator expansion and resummation. This is a fully quantum mechanical approach which gains efficiency both by approximation of the system by a simplified model, and by evaluation of the necessary quantum mechanical propagators using approximate methodologies. This approach has proven to yield results that are accurate and relatively simple to implement. The next set of approximation methods will be semiclassical computations of the propagator. Broadly classed, semiclassical mechanics is among the earliest of approximate quantum methodologies, stretching back to the JWKB approximation [5]. New realizations of semiclassical mechanics have seen a significant expansion of the viability of the concept. We will include recent advances in the initial value representation (IVR) approach and wavepacket propagation, including coherent state representations. Next, we will examine mixed quantum classical time evolution methodologies – so-called surface hopping approaches. Finally, we will briefly describe other methods for approximate quantum computation, such as transition state theory (TST). We then conclude with challenges faced by these approaches in the future.
2. Correlation Function Theories of Rates
As was mentioned above, for a complex system, such as a large molecular reaction, a reaction in a condensed phase, or a reaction in a macromolecule such as an enzyme, one would like to compute quantum rate data at least approximately, but this will never be possible from summing and averaging state-to-state information. For this reason it is desirable to have a direct quantum rate methodology. Bill Miller recognized many years ago that in classical mechanics TST provided such a formalism – albeit an approximate one [6]. Such considerations led to a fully quantum theory of reaction rates expressed as correlation functions. Three correlation functions were derived: one which correlates flux across a dividing surface with itself, a flux–side correlation function (by side we mean reactant or product), and a side–side correlation function; in fact all three were shown to be related by simple derivative relations. The two forms that have seen the most practical application have been the flux autocorrelation function and the flux–side correlation function. The rate
of a reaction is directly expressed as a time integral of the flux autocorrelation function:

k(T) = Q_r(T)^{-1} \int_0^{\infty} dt\, C_{ff}(t),    (1)
where

C_{ff}(t) = \mathrm{tr}\left[ e^{-\beta H/2} F\, e^{-\beta H/2}\, e^{iHt/\hbar} F\, e^{-iHt/\hbar} \right].    (2)
F is the quantum flux operator, given by F = (i/\hbar)[H, h], where h is a quantum operator which is just the Heaviside function of the quantum position operator corresponding to a reaction coordinate or order parameter. Similarly, the rate is defined via the flux–side correlation function as a long time limit,

k(T) = Q_r(T)^{-1} \lim_{t \to \infty} C_{fs}(t),    (3)
with the flux–side correlation function given by

C_{fs}(t) = \mathrm{tr}\left[ e^{-\beta H/2} F\, e^{-\beta H/2}\, e^{iHt/\hbar} h\, e^{-iHt/\hbar} \right].    (4)
Application of this formalism reduces to the evaluation of matrix elements of the correlation functions, which in turn reduces to matrix elements of the evolution operator, or the Feynman propagator [7]. Thus, using these types of expressions for the rate, approximations are needed for the Feynman propagator. We now begin our brief survey of these approximation methods for both Feynman propagators and, more generally, rates.
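Once C_ff(t) is available on a time grid, the quadrature in Eq. (1) is elementary. The sketch below uses an invented damped-oscillatory model function and an assumed reactant partition function purely to illustrate the bookkeeping; in a real calculation C_ff(t) would come from (approximate) propagator matrix elements.

```python
import numpy as np

# Hypothetical stand-in data: a model flux autocorrelation function on a grid.
t = np.linspace(0.0, 50.0, 5001)
dt = t[1] - t[0]
C_ff = np.exp(-0.3 * t) * np.cos(0.5 * t)   # invented model form (decays to zero)
Q_r = 12.7                                  # reactant partition function (assumed)

# Eq. (1): k(T) = Q_r^{-1} * integral_0^inf dt C_ff(t); simple Riemann sum here.
k = np.sum(C_ff) * dt / Q_r
print(f"k(T) ~ {k:.4e} (model units)")
```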
3. The Quantum Kramers Methodology and Evolution Operator Expansions and Resummations
While there have been great strides made in the exact evaluation of Feynman propagators for multiple degrees of freedom, exact evaluation of propagators in real time is still largely limited to few-degree-of-freedom problems. For this reason we have evaluated rates using an approximation to a full many body potential, employing a system–bath approach. It is known that for a purely classical system [8, 9], an accurate approximation of the dynamics of a tagged degree of freedom in a condensed phase can be obtained through the use of a generalized Langevin equation. The generalized Langevin equation is given by Newtonian dynamics plus the effects of the environment in the form of a memory
friction and a random force [10]. The Fluctuation–Dissipation theorem yields the relation between the friction and the random force. The quantum Kramers approach is dependent on an observation of Zwanzig [11], that given a specific set of mathematical requirements on an interaction potential for a condensed phase system, the Generalized Langevin equation can be formally related to a microscopic Hamiltonian in which the system is coupled to an infinite set of harmonic oscillators via simple bilinear coupling
H = \frac{P_s^2}{2m_s} + V_0 + \sum_k \left[ \frac{P_k^2}{2m_k} + \frac{1}{2} m_k \omega_k^2 \left( q_k - \frac{c_k s}{m_k \omega_k^2} \right)^2 \right],    (5)
and a discrete spectral density gives the distribution of bath modes in the harmonic environment, weighted by their coupling to the reaction coordinate:

J(\omega) = \frac{\pi}{2} \sum_k \frac{c_k^2}{m_k \omega_k} \left[ \delta(\omega - \omega_k) - \delta(\omega + \omega_k) \right].    (6)
The direct relation between the physical system represented by the generalized Langevin equation and the harmonic Hamiltonian shown above is via the friction on the reaction coordinate: the spectral density is obtained from the cosine transform of the time dependent friction on the reaction coordinate. Implicit in this assumption is that the friction felt by the reaction coordinate is independent of the position along the reaction coordinate. Thus, the way this approach is implemented in a classical system [8] is that a molecular dynamics (MD) calculation is performed with the reaction coordinate fixed at a specific location. For a reactive system, it is usually assumed that a region close to the transition state dominates the dynamics [12, 13], and so the reacting particle is clamped at the transition state. Because the rigorous transition state may not be at hand for a complex reaction, the location is often defined as the top of the potential barrier to reaction. The idea of the quantum Kramers approach we have employed is that, if a classical Langevin equation accurately describes the classical dynamics, and is equivalent to a microscopic Hamiltonian, then a quantum analogue is found by solving for the distribution of bath modes classically to yield a spectral density, but then solving the dynamics of the resulting Zwanzig Hamiltonian quantum mechanically. Thus, if one wants to evaluate a rate, then a Zwanzig Hamiltonian in which the distribution of bath modes is obtained from an accurate classical MD simulation on the exact system replaces the exact Hamiltonian in the evaluation of the flux autocorrelation function of Eq. (2). Even with this approximation, exact solution of the quantum mechanical propagator for this many degree of freedom system is not possible, and so we evaluate the propagator using an operator methodology we developed that allows an approximate adiabatic evolution operator to be easily corrected to an almost exact
many-body propagator. The details are given in the literature [14–19], but in short, we write a general Hamiltonian as
H = H_a + H_b + f(a, b),    (7)
where a and b are shorthand for any number of degrees of freedom. f (a, b) is a coupling, usually only a function of coordinates, but this is not required. The operator resummation idea rests on the fact that because these three terms are operators, the exact evolution operator may not be expressed as a product
e^{-iHt} \neq e^{-iH_a t}\, e^{-i(H_b + f(a,b))t},    (8)
but in fact equality may be achieved by application of an infinite order product of nth order commutators:

e^{-iHt} = e^{-iH_a t}\, e^{-i(H_b + f(a,b))t}\, e^{C_1} e^{C_2} e^{C_3} \cdots.    (9)
This is usually referred to as the Zassenhaus expansion or the Baker–Campbell–Hausdorff theorem [20]. As an aside, a symmetrized version of this expansion terminated at the C_1 term results in the Feit and Fleck [21] approximate propagator. We have shown that an infinite order subset of these commutators may be resummed exactly as an interaction propagator:

U_{resum}(t) = U_{H_a}(t)\, U_{H_b + f(a,b)}(t)\, U^{-1}_{H_a + f(a,b)}(t)\, U_{H_a}(t).    (10)
The first two terms are just the adiabatic approximation, and the second two terms the correction. For example, if we have a fast subsystem labeled by the "coordinate" a, and a slow subsystem labeled by b, then the approximate evolution operator – to first order in commutators with respect to the slow subsystem Hamiltonian H_b, and to infinite order in the commutators of the "fast" Hamiltonian H_a with the coupling f(a, b) – is given by

e^{-i(H_a + H_b + f(a,b))t/\hbar} \approx e^{-iH_a t/\hbar}\, e^{-i(H_b + f(a,b))t/\hbar}\, e^{+i(H_a + f(a,b))t/\hbar}\, e^{-iH_a t/\hbar}.    (11)

The advantage of this formulation is that higher dimensional evolution operators are replaced by a product of lower dimensional evolution operators. This is always a far easier computation. In addition, because products of evolution operators replace the full evolution operator, a variety of mathematical properties are retained, such as unitarity, and thus time reversal symmetry. Combination of the quantum Kramers idea with the resummed evolution operators results in a largely analytic formulation for the flux autocorrelation function for a chemical reaction in a condensed phase. After a lengthy but not complex computation, the quantum Kramers flux autocorrelation function has been shown to be [22, 23]
C 0f
B1 Z bath −
∞ 0
dωκ 0f J (ω)B2 Z bath .
(12)
Here C_f^0 is the gas phase (uncoupled) flux autocorrelation function, Z_{bath} is the bath partition function, J(ω) is the bath spectral density (computed as described above from a classical MD computation), B_1 and B_2 are combinations of trigonometric functions of the frequency ω and the inverse barrier frequency, and finally

\kappa_f^0 = \frac{1}{4 m_s^2} \left| \langle s = 0 | e^{-iH_s t_c/\hbar} | s = 0 \rangle \right|^2.    (13)
As in other flux correlation function computations, t_c is the complex time t − iħβ/2. Thus, given the quantum Kramers model for the reaction in the complex system, and the resummed operator expansion as a practical way to evaluate the necessary evolution operators needed for the flux autocorrelation function, the quantum rate in the complex system is reduced to a simple combination of gas phase correlation functions with simple algebraic functions. This methodology has now been applied to models of condensed phase rates [22, 23], proton transfer in benzoic acid crystals [24], proton transfer in polar solvent [25, 26], quantum control of reaction rates in solution [27, 28], a quantum theory of polar solvation [29], and chemical reaction rates in enzymes [30]. It can produce results of experimental accuracy, and has allowed the identification of crucial physical features not previously identified in complex systems such as enzymes. One example is found in the concept of promoting vibrations [31], which we now briefly review. The Hamiltonian of Eq. (5) contains only antisymmetrically coupled environmental modes. We have described how these modes are mathematically equivalent to the well symmetrizing modes of Marcus theory [32, 33]. These modes cause well bottoms to fluctuate in depth, and so, as in the case of electron transfer theory, environmental reorganization energy is required to bring well depths into equivalence, and results in the activation barrier. Symmetrically coupled modes, on the other hand, move the positions of the well bottoms closer and further away. We first noticed such an addition to the theory was needed in our work on proton transfer in benzoic acid crystals, where the physical motion is obvious: the motion of the carboxylic oxygens (see Fig. 1). Motions of this type can be incorporated into the framework of the Hamiltonian of Eq. (5) with the addition of another oscillator to the Hamiltonian:
H = \frac{P_s^2}{2m_s} + V_0 + \sum_k \left[ \frac{P_k^2}{2m_k} + \frac{1}{2} m_k \omega_k^2 \left( q_k - \frac{c_k s}{m_k \omega_k^2} \right)^2 \right] + \frac{1}{2} M \Omega^2 \left( Q - \frac{C s^2}{M \Omega^2} \right)^2 + \frac{P_Q^2}{2M}.    (14)
Figure 1. A benzoic acid dimer showing how the symmetric motion of the oxygen atoms will affect the potential for hydrogen transfer.
Again, employing operator expansion and resummation techniques, we were able to compute rates from correlation functions for this physical system. Interestingly, various properties indicative of chemistry in regimes where quantum effects are large, such as tunneling reactions, are dramatically affected by the addition of a single symmetrically coupled mode. For example, in tunneling regimes, where kinetic isotope effects (KIEs) are expected to be large, such a symmetrically coupled vibration suppresses KIEs even in the presence of deep tunneling. This odd physical effect is due to differential corner cutting in hydrogen and deuterium reactions, and has now been seen in a variety of materials, including enzymatically-catalyzed reactions. These computations have been among the first to explain a puzzling feature of enzymatic hydrogen transfer reactions: how can it be that certain indicators of tunneling, such as secondary and tertiary KIEs, strongly indicate tunneling of a reacting hydrogen, while the primary KIE on that particular hydrogen transfer is quite low, seemingly classical? A current topic of some controversy is whether enzymes might in fact have evolved to maximize tunneling in some cases as the primary mechanism of catalysis, rather than the more standard biochemical view of barrier lowering through transition state stabilization [34, 35].
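The barrier modulation by a symmetrically coupled mode can be seen directly from the structure of Eq. (14). The sketch below clamps the promoting coordinate Q at several values and reports the effective barrier along the transfer coordinate s; the double-well form and all parameters are invented for illustration.

```python
import numpy as np

# Illustrative sketch: a symmetric double well in the transfer coordinate s plus
# the symmetrically coupled oscillator of Eq. (14),
# 0.5*M*Omega^2*(Q - C*s^2/(M*Omega^2))^2. Clamping Q at different values shows
# this mode modulating the barrier height -- the "promoting vibration" effect.
a = 5.0                        # bare double-well barrier: V_dw(s) = a*(s^2 - 1)^2
M, Omega, C = 1.0, 1.0, 2.0    # invented promoting-mode parameters
c = C / (M * Omega**2)

def V_eff(s, Q):
    return a * (s**2 - 1.0)**2 + 0.5 * M * Omega**2 * (Q - c * s**2)**2

s = np.linspace(-1.5, 1.5, 601)
for Q in (0.0, 0.5 * c, c):
    prof = V_eff(s, Q)
    barrier = prof[np.argmin(np.abs(s))] - prof.min()   # V at s=0 minus the well depth
    print(f"Q = {Q:4.2f}:  effective barrier along s = {barrier:.3f}")
```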
4. Semiclassical Propagators and Rate Theory
As was mentioned previously, semiclassical mechanics is one of the oldest methodologies for the computation of quantum dynamics. It has gone through a variety of incarnations, each more usable than the previous. The current revivification revolves around the IVR. The utility of this method as compared to previous methods is simple to see. All modern semiclassical treatments [36–39] end up deriving a quantum propagator from classical trajectories. This is not surprising, given that the full coordinate space matrix element of the evolution operator, the Feynman propagator, is built from classical trajectories weighted by the phase factor of the action integral. The difficulty is found in the boundary conditions of these trajectories. The early approaches of semiclassical mechanics required a search in phase space for trajectories that connected initial and final states. This can easily be seen from the well-known Van Vleck propagator [40]:

K(x_2, x_1; t) = (2\pi i\hbar)^{-F/2} \left| \frac{\partial x_2}{\partial p_1} \right|^{-1/2} e^{iS(x_2, x_1; t)/\hbar}.    (15)
Here, S is the classical action integral for a trajectory that goes from x_1 to x_2 in time t. The practical difficulty in implementation is that one must find a trajectory that evolves from an initial condition in phase space (x_1, p_1) to the final point x_2. As is well known, this is a non-linear search, difficult to implement in multiple dimensions. The IVR approach pioneered by Miller [41] and then discussed by Marcus [42] relies on exact integration over the x_2 coordinate (rather than semiclassical stationary phase integration). This in turn allows a change of variables from x_2 to p_1 – in other words, all initial conditions. The resulting wavefunction matrix element of the evolution operator is given by:
K_{n_2, n_1}(t) = \int dx_0\, dp_0 \left[ \frac{|\partial x_t(x_0, p_0)/\partial p_0|}{(2\pi i\hbar)^F} \right]^{1/2} e^{iS_t(x_0, p_0)/\hbar}\, \psi^*_{n_2}(x_t)\, \psi_{n_1}(x_0).    (16)
Thus a semiclassical stationary phase integral is traded for an exact numerical one, but the root search is avoided. This is of the utmost importance in complex, many dimensional systems, where the integrals needed are well suited to Monte Carlo methodologies, but the older "primitive" semiclassical approach would present a root search that was essentially impossible to accomplish. Other approaches to semiclassical time evolution that end up having a similar functional form are important to mention. Perhaps one of the most influential is the semiclassical propagator of Herman and Kluk [43]. Though it ends up in a similar form, it was derived from a very different starting point.
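The contrast between the boundary-value root search and the initial-value strategy can be made concrete with a toy 1D example: a Gaussian barrier with invented parameters (this illustrates only the trajectory bookkeeping, not a rate calculation).

```python
import numpy as np

# Boundary-value ("primitive") semiclassical trajectories require a root search
# for the p1 that connects x1 to x2 in time T; the IVR strategy simply samples
# all initial conditions. Gaussian barrier V(x) = exp(-x^2); parameters invented.
m, dt = 1.0, 1e-3
force = lambda x: 2.0 * x * np.exp(-x * x)      # F = -dV/dx for V = exp(-x^2)

def x_final(x1, p1, t_tot):
    """Velocity-Verlet propagation; returns x(t_tot) starting from (x1, p1)."""
    x, p, f = x1, p1, force(x1)
    for _ in range(int(t_tot / dt)):
        p += 0.5 * dt * f
        x += dt * p / m
        f = force(x)
        p += 0.5 * dt * f
    return x

x1, x2, T = -3.0, 6.0, 6.0
lo, hi = 1.5, 3.0                               # bracket for the root search
for _ in range(40):                             # bisection on the initial momentum
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if x_final(x1, mid, T) < x2 else (lo, mid)
p_root = 0.5 * (lo + hi)
print(f"root search: p1 = {p_root:.5f} gives x(T) = {x_final(x1, p_root, T):.4f}")

# IVR viewpoint: no search is needed -- every initial condition contributes.
for p1 in (1.6, 2.0, 2.4, 2.8):
    print(f"initial p1 = {p1:.1f} -> x(T) = {x_final(x1, p1, T):+.4f}")
```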
Herman and Kluk investigated another semiclassical approximation – that of frozen Gaussians. This simple approach had been used to great effect by Heller et al. [44]. The idea here is to expand any initial wavefunction which must be propagated in a nonorthogonal, overcomplete set of coherent states:

g(x, t; x_t, p_t) = N e^{-\gamma (x - x_t)^2 + i p_t (x - x_t)}.    (17)
Here N is a normalization factor, and x_t and p_t are the center and momentum of the wavepacket. The center and momentum may be thought of as a continuous quantum number space, and one may resolve the identity as

\delta(x - x') = (2\pi)^{-1} \int dx_1\, dp_1\, g(x, t; x_1, p_1)\, g^*(x', t; x_1, p_1).    (18)
The idea behind the frozen Gaussian approach is to assume that, after an initial decomposition of a wavefunction into a swarm of Gaussians, the time evolution of the actual wavefunction is given simply by classical motion of the individual wavepackets via classical mechanics. Thus all evolution is found in the classical motions of the centers and momenta of the wavepackets. It is certainly well known that in a non-harmonic potential a wavepacket will spread and deform, but the frozen Gaussian approach is successful because the collective motion of the wavepackets approximates the evolution of the initial wavefunction [44]. The Herman–Kluk propagator was derived as an attempt to improve on the frozen Gaussian approximation. That is, an initial wavefunction is expanded in a coherent state basis, but rather than use purely classical propagation of the centers, a semiclassically exact propagation with steepest descent integration is used to compute the evolution. The Herman–Kluk version of IVR results in the expression for the time evolved wavefunction

\langle x | e^{-iHt/\hbar} | x' \rangle = (2\pi\hbar)^{-1} \int dx_0\, dp_0\, g(x, t; x_t, p_t)\, g^*(x', t; x_0, p_0)\, e^{iS_t(x_0, p_0)/\hbar}\, C_{HK}.    (19)
Here S_t(x_0, p_0) is the classical action for the unique trajectory that evolves from the initial point (x_0, p_0), and C_{HK} is the Herman–Kluk prefactor, given in terms of the monodromy matrix elements by
C_{HK} = \left[ \frac{1}{2} \left( M_{xx} + M_{pp} - i\hbar\gamma\, M_{xp} + \frac{i}{\hbar\gamma} M_{px} \right) \right]^{1/2}.    (20)
Here

M_{xx} = \frac{\partial x_t}{\partial x_0}, \quad M_{xp} = \frac{\partial x_t}{\partial p_0}, \quad M_{px} = \frac{\partial p_t}{\partial x_0}, \quad M_{pp} = \frac{\partial p_t}{\partial p_0}.    (21)
It is apparent from the structure of Eq. (19) that this is again an initial value semiclassical method – the propagator is built up from the coherent state basis, weighted by the classical action along a trajectory beginning at a specific phase point, in combination with the partial variations in position or momenta at intermediate time t with variation in the initial phase point. In fact, Miller has shown a formal relation between the Herman–Kluk formulation, more standard IVR, and an exact coherent state representation [45]. It is also worth mentioning that Filinov filtering has been used to great effect to damp out unwanted oscillations in the integrand of the propagator expression [46]. As with the resummed operator expansion method, a semiclassical expression for the coordinate space matrix element of the propagator may be used in correlation function rate expressions. Currently, such methods have not been applied to chemical reactions in truly complex materials such as condensed phases or enzymes, but they have been applied to computations of rates in systems of hundreds of degrees of freedom [47], and such computations make application to real material systems seem possible in the foreseeable future. It should be pointed out that there have also been recent approaches that employ coherent state expansions to exactly rewrite the Schrödinger equation [48]. The motivating idea is then that either new approximations are presented given the new mathematical formulation, or the equivalent but mapped set of equations is more amenable to numerical solution than the original Schrödinger equation. There is of course a long history of such approaches – time dependent perturbation theory arises from rewriting the Schrödinger equation in terms of time dependent basis set occupations. It is also worth mentioning the recent applications of Bohm's quantum formulation [49]. Wyatt et al. have made great strides in this area [50]. Writing the time evolution of the wavefunction as the time evolution of a prefactor and a phase allows one to cast the quantum evolution in terms of fluid-mechanics-like equations, and so one may hope to take advantage of the large body of approximations built up in this area of investigation.
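A phase-free caricature of the frozen Gaussian idea can be coded in a few lines: sample Gaussian centers from the initial packet's phase-space distribution, move each center by pure classical mechanics, and watch the collective (swarm) motion track the quantum expectation value. The harmonic well, unit parameters, and omission of the e^{iS/ħ} phase factors are all simplifications for illustration; the phases are essential for interference effects and for the full Herman–Kluk construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Harmonic well, hbar = m = omega = 1 (illustrative). For a coherent state the
# swarm mean position should track the exact quantum <x>(t) exactly.
m = omega = hbar = 1.0
x0, p0 = 1.0, 0.0                       # initial wavepacket center
sig_x = np.sqrt(hbar / (2 * m * omega)) # initial position spread
sig_p = np.sqrt(hbar * m * omega / 2)   # initial momentum spread

n = 20000
q = rng.normal(x0, sig_x, n)            # sample Gaussian centers from the
p = rng.normal(p0, sig_p, n)            # initial phase-space distribution

for t in (0.0, np.pi / 2, np.pi):
    # classical harmonic evolution of each center (analytic trajectories)
    qt = q * np.cos(omega * t) + (p / (m * omega)) * np.sin(omega * t)
    exact = x0 * np.cos(omega * t) + (p0 / (m * omega)) * np.sin(omega * t)
    print(f"t = {t:5.3f}:  swarm <x> = {qt.mean():+.4f}   exact <x> = {exact:+.4f}")
```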
5. Mixed Quantum Classical Propagation Methods
The final general approach to rate computation in complex systems we describe is actually a broad class of approaches that are unified by a mixing of quantum and classical mechanics. This is in contradistinction to the previously described methods, which handle all degrees of freedom equivalently, but approximately. The idea behind the mixed quantum classical methods is simply that there are either degrees of freedom, or more generally components of a computation, that are essentially classical in nature, while there are other degrees of freedom which are inherently quantum mechanical. This approach is quite appealing. No one would seriously question the lack of quantum
character of, say, a large protein molecule, but it is equally true that individual chemical events within that protein molecule could exhibit significant quantum character. In addition, there are some formulations of rate theory that are inherently classical, TST being the most prominent, to which one would like to add quantum corrections to maintain simplicity but improve accuracy. One of the more successful mixed quantum classical methods is the classical trajectory with quantum transitions method pioneered by Tully et al. [51]. This method derives from the original surface hopping idea, again of Tully [52], designed to include electronic nonadiabaticity in dynamics computations. The idea here was that in the simulation of an electronically non-adiabatic system, rather than attempt to solve the Schrödinger equation for both the electronic and nuclear degrees of freedom, classical mechanics is used, but equations of motion are propagated on two or more adiabatic Born–Oppenheimer potential surfaces. The inherent nonadiabaticity in the problem is included by allowing the trajectories to "hop" from one Born–Oppenheimer surface to another at regions of strong non-adiabatic coupling – avoided crossings. The probability of hops is given by the off diagonal elements of the electronic Hamiltonian. In some sense it may seem inappropriate to include one type of quantum effect – nonadiabaticity in the electronic degrees of freedom – while ignoring all the others (tunneling, zero-point motions, interference of trajectories), but this is in fact exactly the goal. In many systems not involving, for example, hydrogen transfer, one expects these other quantum effects to be minimal in magnitude, while the mixing of electronic surfaces can actually have a significant impact on the computed rate. It is well known that, given the large disparity in nuclear and electronic mass, electronic non-adiabatic effects tend to be quite small and localized to specific regions of adiabatic potential surfaces, and so the localized surface hopping approach is well justified. For problems in which electronic nonadiabaticity does not play a role, there may still be degrees of freedom for which quantum mechanics is crucial, while other degrees of freedom may be treated classically. The difficulty is to find a way to mix these two inherently dissimilar descriptions: although it is well known that classical mechanics may be derived from quantum mechanics as an ħ → 0 limit, it is not clear how to take only certain degrees of freedom to this limit. In fact, there is no "basic" derivation to do this; rather, each prescription must be tested. Metiu et al. [53] proposed just such a method that proved successful in computing thermal rate constants. They also start from a correlation function definition for the rate constant. The full Hamiltonian operator is partitioned into a reaction coordinate plus coupling to all other degrees of freedom, with the rest of the degrees of freedom labeled as "spectator" coordinates. The assumption is made that the reaction coordinate Hamiltonian and the spectator Hamiltonian (soon to be taken to a classical limit) commute, and this evolution operator factors. Then the algorithm proceeds by assigning classical methodologies to all manipulations of the classical degrees of
freedom. For example, traces over quantum operators become integrals over phase space. Classical degrees of freedom appearing in the quantum Hamiltonian as parameters are treated as time dependent quantities evolving according to classical mechanics – a fairly common classical path assumption. There are a variety of levels of possible approximation – for example, allowing the quantum systems to feed back self-consistently to the classical system – and the reader is referred to the original work. More recently, there have been advances in the surface hopping technology that allow the concept to be applied to nuclear quantum effects when there is no electronic nonadiabaticity. In many ways this is a more complex problem, because the quantum effects are not limited to specific regions of potential energy surfaces, such as avoided crossings or seams, but rather are present throughout the entire phase space. In fact, the way to handle this problem was to apply a more sophisticated surface-hopping algorithm, again developed by Tully [51]. The central feature of this methodology is the "fewest switches" algorithm. This algorithm determines, for a given trajectory at each step, whether a switch should be made to another electronic potential energy surface. It is designed so that the statistical distribution of the different electronic states is maintained. The switches are sudden – in quantum language, all probability density is suddenly transferred from one electronic state to another – but because the actual implementation involves the propagation of swarms of trajectories, flux slowly evolves from one state to another. The method is also designed to conserve energy, and so an electronic transition requires alteration of the momentum of a specific trajectory. Once the possibility exists for quantum transitions at any point on the potential energy surface, it is a straightforward extension to allow quantum transitions for specific nuclear degrees of freedom. For example, in the initial application to proton transfer in polar solvent [54], the proton is treated quantum mechanically, while all other degrees of freedom are treated classically. Once the dynamics is run, quantum expectation values for the position of the quantum degree of freedom may be computed as a function of time. A reaction is said to have occurred if the system is started in the reactant region and then undergoes a transition to the product region. Coker et al. [55, 56] have carried out extensive simulations, using the advanced surface hopping technologies, for electronically non-adiabatic chemical reactions. In these reactions dozens of electronic potential energy surfaces are involved. In addition, Coker has shown formally how one may rigorously justify the Tully approach from "first principles" [57, 58]. Recently Kapral and Ciccotti have provided more formal first principles work deriving surface hopping methodologies [59]. Rossky and co-workers [60] have extensively investigated the importance of quantum decoherence in mixed quantum classical simulations. This effect needs to be "reintroduced" to quantum classical simulations, as it is certainly not present in classical mechanics. MD with
quantum transitions has also found use in the simulation of chemical reactions in biological materials. Hammes-Schiffer and her group have recently published papers applying this method to liver alcohol dehydrogenase and dihydrofolate reductase [61, 62]. In addition to application of the Tully method, this group mixes quantum and classical mechanics in a variety of other ways as well. Computations of classical TST rates, the transmission coefficient, and KIEs augment the surface hopping computations. Quantum mechanics is included in the TST results via a partially quantum computed free energy profile from which the TST Boltzmann factor is derived. It is to be admitted that these reactions in biological materials are monumentally complex, and in this case the methods that have been devised to deal with them are a complex mixture of classical, dynamical, and statistical techniques. Having described the previous approaches to quantum corrected TST, our final section on mixed quantum classical techniques would not be complete without a description of some of the more recent applications of TST to reactions in complex materials. TST is an inherently classical theory, but there are a variety of ways in which it can be augmented with quantum information to give fairly simple to use, but fairly accurate, computations of rates. Standard gas phase TST gives the rate as

k = \frac{k_B T}{h} \frac{Q^{\ddagger}}{Q_{reacts}}\, e^{-V_0/k_B T},    (22)
where Q^{\ddagger} is the partition function for the transition state, Q_{reacts} is similarly that for the reactants, and the Boltzmann factor contains the gas phase barrier height. For a reaction in a complex material, a more reasonable approach is to employ the potential of mean force in the Boltzmann factor, thereby taking account of statistical fluctuations from the surrounding extended environment. In such cases the "theory" is really more of an empirically tested algorithm, and one may write a condensed matter transition state rate expression as

k(T) = \kappa \frac{k_B T}{h}\, e^{-\Delta G_{pmf}/RT}.    (23)

Truhlar, Hillier and Nicoll [63] employed such a formula recently. The question of practical application then is what the appropriate free energy of reaction should be. There are a great variety of ways to decide on this. For example, one may compute it either variationally, or based on surface energetics. There are other sections of this book that describe exactly this problem, so we will not describe the alternatives in detail. The final required component is the factor labeled κ. This factor contains everything else that is not included in TST. It might for example include barrier recrossings, or quantum tunneling through the barrier. For a complex, multidimensional system there is really no way to rigorously compute such a "tunneling correction", but lack of derivational rigor has not prevented the application of what has become an
effective method. There have been a great many tunneling schemes derived that provide rate constants of entirely reasonable accuracy. As another example, Gao, Truhlar and co-workers [64] have applied TST analysis to hydride transfer in an enzyme system, with a potential energy surface derived via mixed quantum classical approaches: central active site atoms are treated quantum mechanically, while the surrounding atoms are modeled via a molecular mechanics potential.
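A hedged sketch of the PMF-based TST estimate of Eq. (23): the free energy profile below is an invented analytic stand-in for one computed by, e.g., umbrella sampling, and κ is an assumed transmission/tunneling factor that would be supplied by a separate calculation.

```python
import numpy as np

kB = 0.0019872041                 # kcal/(mol K); R in the same units per mole
h_over_kBT = 6.62607015e-34 / (1.380649e-23 * 300.0)   # h/(kB*T) in s at T = 300 K
T = 300.0

s = np.linspace(-2.0, 2.0, 2001)  # reaction coordinate (arbitrary units)
# Invented PMF profile (kcal/mol): a barrier near s = 0, reactant well near s = -1.5.
G_pmf = 15.0 * np.exp(-s**2 / 0.5) - 2.0 * np.exp(-(s + 1.5)**2 / 0.2)

dG = G_pmf.max() - G_pmf[s < -1.0].min()   # activation free energy from the PMF
kappa = 0.6                                # assumed recrossing/tunneling factor

# Eq. (23): k(T) = kappa * (kB*T/h) * exp(-dG/(R*T))
k = kappa / h_over_kBT * np.exp(-dG / (kB * T))
print(f"Delta G(pmf) = {dG:.2f} kcal/mol  ->  k(T) ~ {k:.3e} s^-1")
```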
6. Conclusions and Future Directions
This brief section has provided an overview of some of the dominant approaches to the computation of rates in complex materials. The vast increase in the complexity of the systems studied has not simply been due to the advancement of mathematical technique or the increase in computer power. It has also been due to investigators' increased willingness to study, using approximate techniques, problems for which there can never be an exact solution. Care must, of course, be continuously exercised to check results against appropriate experiment when possible. It is also important to realize that the use of theoretical prediction is at times more valid for distinguishing between different physical models than for exact quantitative prediction. This is a paradigm shift from the early gas phase days of rate computation, where matching of experimental numbers was the dominant goal. This having been said, we assert that the main challenge to rate computation in particular, and dynamics in general, for complex material systems over the next 20 years will be the continued development of basic methodologies such as the flux correlation function formalism, operator resummations for evolution operators, and the IVR semiclassical methodologies. These formal theories are not always clearly applicable to new, more complex materials (as was the case for the flux correlation function formalism), but it is only through basic development that new, more complex materials will be studied in the decades to come.
References

[1] Y.M. Li and J.Z.H. Zhang, "Theoretical dynamical treatment of chemical reactions," In: X. Yang and K. Liu (eds.), Modern Trends in Chemical Reaction Dynamics, Part I: Experiment and Theory, 2003.
[2] Wm.H. Miller, S.D. Schwartz, and J.W. Tromp, "Quantum mechanical rate constants for bimolecular reactions," J. Chem. Phys., 79, 4889–4898, 1983.
[3] T. Yamamoto, "Quantum statistical mechanical theory of the rate of exchange chemical reactions in the gas phase," J. Chem. Phys., 33, 281, 1960.
[4] D. Chandler, "Statistical mechanics of isomerization dynamics in liquids and the transition state approximation," J. Chem. Phys., 68, 2959–2970, 1978.
[5] For an older but excellent review see: R.B. Bernstein, "Quantum effects in elastic molecular scattering," Adv. Chem. Phys., 10, 75, 1966.
[6] Wm.H. Miller, "Quantum mechanical transition state theory and a new semiclassical model for reaction rate constants," J. Chem. Phys., 61, 1823–1834, 1974.
[7] R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals, McGraw Hill, New York, 1965.
[8] J.E. Straub, M. Borkovec, and B.J. Berne, "Molecular dynamics study of an isomerizing diatomic in a Lennard–Jones fluid," J. Chem. Phys., 89, 4833, 1988.
[9] B.J. Gertner, K.R. Wilson, and J.T. Hynes, "Nonequilibrium solvation effects on reaction rates for model SN2 reactions in water," J. Chem. Phys., 90, 3537, 1988.
[10] E. Cortes, B.J. West, and K. Lindenberg, "On the generalized Langevin equation: classical and quantum mechanical," J. Chem. Phys., 82, 2708–2717, 1985.
[11] R. Zwanzig, "The nonlinear generalized Langevin equation," J. Stat. Phys., 9, 215, 1973.
[12] Wm.H. Miller, "Quantum mechanical transition state theory and a new semiclassical model for reaction rate constants," J. Chem. Phys., 61, 1823–1834, 1974.
[13] P. Pechukas, In: Wm.H. Miller (ed.), Dynamics of Molecular Collisions, Part B, Plenum, New York, 1976.
[14] S.D. Schwartz, "Accurate quantum mechanics from high order resummed operator expansions," J. Chem. Phys., 100, 8795–8801, 1994.
[15] S.D. Schwartz, "Vibrational energy transfer from resummed evolution operators," J. Chem. Phys., 101, 10436–10441, 1994.
[16] D. Antoniou and S.D. Schwartz, "Vibrational energy transfer in linear hydrocarbon chains: new quantum results," J. Chem. Phys., 103, 7277–7286, 1995.
[17] S.D. Schwartz, "The interaction representation and non-adiabatic corrections to adiabatic evolution operators," J. Chem. Phys., 104, 1394–1398, 1996.
[18] D. Antoniou and S.D. Schwartz, "Nonadiabatic effects in a method that combines classical and quantum mechanics," J. Chem. Phys., 104, 3526–3530, 1996.
[19] S.D. Schwartz, "The interaction representation and non-adiabatic corrections to adiabatic evolution operators II: nonlinear quantum systems," J. Chem. Phys., 104, 7985–7987, 1996.
[20] W. Magnus, "On the exponential solution of differential equations for a linear operator," Comm. Pure and Appl. Math., VII, 649, 1954.
[21] M.D. Feit and J.A. Fleck Jr., "Solution of the Schrödinger equation by a spectral method II: vibrational energy levels of triatomic molecules," J. Chem. Phys., 78, 301, 1983.
[22] S.D. Schwartz, "Quantum activated rates – an evolution operator approach," J. Chem. Phys., 105, 6871–6879, 1996.
[23] S.D. Schwartz, "Quantum reaction in a condensed phase – turnover behavior from new adiabatic factorizations and corrections," J. Chem. Phys., 107, 2424–2429, 1997.
[24] D. Antoniou and S.D. Schwartz, "Proton transfer in benzoic acid crystals: another look using quantum operator theory," J. Chem. Phys., 109, 2287–2293, 1998.
[25] D. Antoniou and S.D. Schwartz, "A molecular dynamics quantum Kramers study of proton transfer in solution," J. Chem. Phys., 110, 465–472, 1999.
[26] D. Antoniou and S.D. Schwartz, "Quantum proton transfer with spatially dependent friction: phenol-amine in methyl chloride," J. Chem. Phys., 110, 7359–7364, 1999.
[27] P. Gross and S.D. Schwartz, "External field control of condensed phase reactions," J. Chem. Phys., 109, 4843–4851, 1998.
[28] R. Karmacharya, P. Gross, and S.D. Schwartz, "The effect of coupled nonreactive modes on laser control of quantum wavepacket dynamics," J. Chem. Phys., 111, 6864–6868, 1999.
[29] R. Karmacharya, D. Antoniou, and S.D. Schwartz, "Nonequilibrium solvation and the quantum Kramers problem: proton transfer in aqueous glycine," J. Phys. Chem. B (Bill Miller festschrift), 105, 2563–2567, 2001.
[30] D. Antoniou, S. Caratzoulas, C. Kalyanaraman, J.S. Mincer, and S.D. Schwartz, "Barrier passage and protein dynamics in enzymatically catalyzed reactions," European Journal of Biochemistry, 269, 3103–3112, 2002.
[31] D. Antoniou and S.D. Schwartz, "Internal enzyme motions as a source of catalytic activity: rate promoting vibrations and hydrogen tunneling," J. Phys. Chem. B, 105, 5553–5558, 2001.
[32] R.A. Marcus, "Chemical and electrochemical electron transfer theory," Ann. Rev. Phys. Chem., 15, 155–181, 1964.
[33] V. Babamov and R.A. Marcus, "Dynamics of hydrogen atom and proton transfer reactions: symmetric case," J. Chem. Phys., 74, 1790, 1981.
[34] V.L. Schramm, "Enzymatic transition state analysis and transition-state analogues," Methods in Enzymology, 308, 301–354, 1999.
[35] R.L. Schowen, Transition States of Biochemical Processes, Plenum Press, New York, 1978.
[36] Wm.H. Miller, "Classical limit quantum mechanics and the theory of molecular collisions," Adv. Chem. Phys., 25, 69–177, 1974.
[37] P. Pechukas, "Semiclassical scattering theory I," Phys. Rev., 181, 166–173, 1969.
[38] P. Pechukas, "Semiclassical scattering theory II: atomic collisions," Phys. Rev., 181, 174–181, 1969.
[39] R.A. Marcus, "Theory of semiclassical transition probabilities (S matrix) for inelastic and reactive collisions," J. Chem. Phys., 54, 3965, 1971.
[40] M.C. Gutzwiller, Chaos in Classical and Quantum Mechanics, Springer, New York, 1990.
[41] Wm.H. Miller, "Classical S matrix: numerical application to inelastic collisions," J. Chem. Phys., 53, 3578–3587, 1970.
[42] R.A. Marcus, "Theory of semiclassical transition probabilities (S matrix) for inelastic and reactive collisions," J. Chem. Phys., 56, 3548, 1972.
[43] M.F. Herman and E. Kluk, "A semiclassical justification for the use of non-spreading wavepackets in dynamics calculations," Chem. Phys., 91, 27–34, 1984.
[44] E.J. Heller, "Frozen Gaussians: a very simple semiclassical approximation," J. Chem. Phys., 75, 2923–2931, 1981.
[45] Wm.H. Miller, "On the relation between the semiclassical initial value representation and an exact quantum expansion in time-dependent coherent states," J. Phys. Chem. B, 106, 8132–8135, 2002.
[46] V.I. Filinov, Nucl. Phys., B271, 717–725, 1986.
[47] H. Wang, X. Sun, and Wm.H. Miller, "Semiclassical approximations for the calculation of thermal rate constants for chemical reactions in complex molecular systems," J. Chem. Phys., 108, 9726–9736, 1998.
[48] D.V. Shalashilin and M.S. Child, "Nine-dimensional quantum molecular dynamics simulation of intramolecular vibrational energy redistribution in the CHD3 molecule with the help of coupled coherent states," J. Chem. Phys., 119, 1961–1969, 2003.
[49] D. Bohm, "A suggested interpretation of the quantum theory in terms of "hidden" variables I," Phys. Rev., 85, 166, 1952.
[50] E.R. Bittner and R.E. Wyatt, "Integrating the quantum Hamilton–Jacobi equations by wavefront expansion and phase space analysis," J. Chem. Phys., 113, 8888–8897, 2000.
[51] J.C. Tully, "Molecular dynamics with electronic transitions," J. Chem. Phys., 93, 1061–1071, 1990.
[52] J.C. Tully, In: Wm.H. Miller (ed.), Dynamics of Molecular Collisions, Part B, Plenum, New York, pp. 217, 1976.
[53] G. Wahnstrom and H. Metiu, "The calculation of the thermal rate coefficient by a method combining classical and quantum mechanics," J. Chem. Phys., 88, 2478–2491, 1988.
[54] S. Hammes-Schiffer and J.C. Tully, "Proton transfer in solution: molecular dynamics with quantum transitions," J. Chem. Phys., 101, 4657–4667, 1994.
[55] N. Yu, C.J. Margulis, and D.F. Coker, "Influence of solvation environment on excited state avoided crossings and photo-dissociation dynamics," J. Phys. Chem. B, 105, 6728–6737, 2001.
[56] C.J. Margulis and D.F. Coker, "Nonadiabatic molecular dynamics simulations of photofragmentation and geminate recombination dynamics in size-selected I2-(CO2)n cluster ions," J. Chem. Phys., 110, 5677–5690, 1999.
[57] D.F. Coker and L. Xiao, "Methods for molecular dynamics with non-adiabatic transitions," J. Chem. Phys., 102, 496–510, 1995.
[58] H.S. Mei and D.F. Coker, "Quantum molecular dynamics studies of H2 transport in water," J. Chem. Phys., 104, 4755–4767, 1996.
[59] S. Nielsen, R. Kapral, and G. Ciccotti, "Mixed quantum-classical surface hopping dynamics," J. Chem. Phys., 112, 6543–6553, 2000.
[60] B.J. Schwartz, E.R. Bittner, O.V. Prezhdo, and P.J. Rossky, "Quantum decoherence and the isotope effect in condensed phase nonadiabatic molecular dynamics simulations," J. Chem. Phys., 104, 5942–5955, 1996.
[61] S.R. Billeter, S.P. Webb, P.K. Agarwal, T. Iordanov, and S. Hammes-Schiffer, "Hydride transfer in liver alcohol dehydrogenase: quantum dynamics, kinetic isotope effects, and role of enzyme motion," J.A.C.S., 123, 11262–11272, 2001.
[62] P.K. Agarwal, S.R. Billeter, and S. Hammes-Schiffer, "Nuclear quantum effects and enzyme dynamics in dihydrofolate reductase catalysis," J. Phys. Chem. B, 106, 3238–3293, 2002.
[63] R.M. Nicoll, I. Hillier, and D.G. Truhlar, "Quantum mechanical dynamics of hydride transfer in polycyclic hydroxy ketones in the condensed phase," J.A.C.S., 123, 1459–1463, 2001.
[64] C. Alhambra, J.C. Corchado, M.L. Sanchez, J. Gao, and D.G. Truhlar, "Quantum dynamics of hydride transfer in enzyme catalysis," J.A.C.S., 122, 8197–8203, 2000.
5.9 QUANTUM RATE THEORY: A PATH INTEGRAL CENTROID PERSPECTIVE

Eitan Geva¹, Seogjoo Jang², and Gregory A. Voth³
¹ Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, USA
² Chemistry Department, Brookhaven National Laboratory, Upton, New York 11973-5000, USA
³ Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112-0850, USA
1. Introduction
The dynamics of many important processes that take place in condensed phase hosts can be described in terms of rate kinetics, with well-defined rate constants. The calculation of such rate constants from first principles has presented theoretical chemistry with an ongoing challenge. Nonequilibrium statistical mechanics provides a framework within which one can derive explicit expressions for those rate constants, via either linear response theory or Fermi's golden rule. In both cases, one finds that the rate constants are given in terms of equilibrium correlation functions [1–6]. Those correlation functions can be evaluated with relative ease from classical molecular dynamics (MD) simulations, even for complex anharmonic many-body systems such as molecular liquids and biopolymers. However, classical mechanics is not valid in the case of many important processes, such as electron and proton transfer and vibrational relaxation. In those cases, one needs to compute the quantum-mechanical correlation functions. A numerically exact calculation of the latter lies far beyond the reach of currently available computer resources, due to the exponential scaling of the computational effort with the number of degrees of freedom [7, 8]. The challenge therefore lies in finding ways to compute quantum mechanical rate constants which are based either on bypassing the explicit simulation of the quantum dynamics (e.g., transition state theory (TST)), or on using reliable and computationally feasible approximate techniques for computing quantitatively accurate quantum mechanical correlation functions.
Feynman's path integral formulation of quantum mechanics [9–11] provides a powerful framework for understanding the statics and dynamics of quantum-mechanical condensed phase systems [12]. The propagator, which corresponds to the probability amplitude for going from one position to another over a time period $t$, is the central quantity in this formulation. The propagator is given by a sum, over all possible classical paths that satisfy those boundary conditions, of $e^{iS/\hbar}$, where $S$ is the corresponding classical action. The very same propagator can also be related to statistical-mechanical averages at thermal equilibrium, by replacing the real time, $t$, with an "imaginary time" $-i\beta\hbar$ ($\beta = 1/k_BT$). The propagator is known in closed form only for a small number of relatively simple systems. Unfortunately, a numerical calculation of the real-time propagator of a general many-body system is not possible in practice, due to the highly oscillatory nature of the integrand, $e^{iS/\hbar}$, in the real-time path integral (the so-called "sign problem"). It is possible to simplify the calculation by using semiclassical approximations in order to minimize the number of classical paths summed over, and thereby extend the applicability of this approach to somewhat larger systems [13]. However, the application of those techniques to condensed phase systems has proven to be extremely difficult, and requires rather drastic additional approximations.

The situation is quite different in the case of the imaginary-time path-integral propagator, where a numerically exact calculation is possible, even in the case of relatively complex and anharmonic many-body systems. This is attributed to a fascinating equivalence between the imaginary-time propagator and the partition function of a classical system of cyclic chains of beads, which are connected by harmonic springs (each quantum degree of freedom gives rise to one chain, and the effective number of beads in a chain increases as the temperature and mass are decreased). This equivalence between quantum degrees of freedom and classical chains implies that one can obtain the imaginary-time path-integral propagator from scalable classical MD and Monte Carlo (MC) simulations of the corresponding cyclic chains [14, 15].

The center of mass of those chains is known as their centroid. Feynman [9, 10] was the first to point out that the centroid can be thought of as a classical-like variable moving on a classical-like effective potential, thereby serving as a basis for a classical-like formulation of quantum-mechanical equilibrium statistical mechanics. Several workers have since used the centroid perspective to construct variational approximations for the quantum partition function [16, 17]. The first application of the centroid concept to rate theory was in the context of quantum mechanical TST [18–29], and led to the development of path-integral QTST (PI-QTST) [21, 22, 24, 25, 28]. The structure of PI-QTST is similar to that of classical TST [30], except that the classical positions are replaced by the centroids of the corresponding chains. The path integral centroid approach was later extended to the calculation of time-dependent correlation functions, with the introduction of centroid
molecular dynamics (CMD) by Cao and Voth [25, 31–37]. CMD is based on the hypothesis that the centroid follows classical-like dynamics, and that quantum effects can be incorporated by modifying the initial sampling and the force fields, as well as by representing dynamical observables by suitably defined "centroid symbols". Important progress has been made over recent years in clarifying the assumptions underlying CMD [36–41]. CMD has also been shown to be useful and computationally feasible for realistic, complex, many-body systems (see, e.g., Refs. [42–51]). This chapter provides an overview of recent progress in the application of the path integral centroid approach to the calculation of quantum mechanical rates. Section 2 provides an overview of the formal theory of the path integral centroid and rate processes. Applications to the calculation of reaction rate constants, diffusion constants, and vibrational energy relaxation rate constants are described in Sections 3, 4.1, and 4.2, respectively. We close in Section 5 with conclusions and some discussion of future prospects and open problems.
2. Formal Theory

2.1. The Centroid Formulation of Quantum Statistical Mechanics
In its most recent formulation [36, 37], centroid dynamics has been shown to be based on the following phase-space operator (given here in 1D, for simplicity)

$$\hat{\phi}(x_c, p_c) = \frac{\hbar}{2\pi}\int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{i\xi(\hat{x}-x_c)+i\eta(\hat{p}-p_c)-\beta\hat{H}}, \tag{1}$$
where $x_c$ and $p_c$ are the centroid position and momentum, respectively, and $\beta = 1/k_BT$ is the inverse temperature. A central role is reserved for the trace of this operator, which corresponds to the centroid density

$$\rho_c(x_c, p_c) = \mathrm{Tr}\,[\hat{\phi}(x_c, p_c)]. \tag{2}$$
The centroid approach also associates a classical-like centroid symbol, $A_c(x_c, p_c)$, with each quantum dynamical observable, $\hat{A}(\hat{x},\hat{p})$, defined by

$$A_c(x_c, p_c) = \frac{\mathrm{Tr}\,[\hat{\phi}(x_c, p_c)\,\hat{A}(\hat{x},\hat{p})]}{\rho_c(x_c, p_c)}. \tag{3}$$
The centroid density, $\rho_c(x_c, p_c)$, turns out to have a classical-like form, which is similar to that of the classical Boltzmann distribution

$$\rho_c(x_c, p_c) = e^{-\beta p_c^2/2m}\,\rho_c(x_c) \equiv e^{-\beta p_c^2/2m}\,e^{-\beta V_{cm}(x_c)}. \tag{4}$$
$V_{cm}(x_c) = -\ln[\rho_c(x_c)]/\beta$ in Eq. (4) is called the centroid potential. It is distinctly different from the classical potential and can be written in terms of a constrained imaginary-time path integral

$$e^{-\beta V_{cm}(x_c)} \equiv \rho_c(x_c) = \left(\frac{2\pi\beta\hbar^2}{m}\right)^{1/2} \oint_{x(0)=x(\beta\hbar)} \mathcal{D}x(\lambda)\; \delta\!\left[x_c - (\beta\hbar)^{-1}\!\int_0^{\beta\hbar}\! d\lambda\, x(\lambda)\right] \exp\{-S[x(\lambda)]/\hbar\}$$
$$= \lim_{P\to\infty} \left(\frac{2\pi\beta\hbar^2}{m}\right)^{1/2} \left(\frac{mP}{2\pi\beta\hbar^2}\right)^{P/2} \int dx_1 \cdots \int dx_P\; \delta\!\left[x_c - \frac{1}{P}\sum_{k=1}^{P} x_k\right] \exp\{-S[x_1,\ldots,x_P]/\hbar\}, \tag{5}$$
with

$$\frac{1}{\hbar}S[x(\lambda)] = \lim_{P\to\infty}\frac{1}{\hbar}S[x_1,\ldots,x_P] = \frac{1}{\hbar}\int_0^{\beta\hbar} d\lambda \left\{\frac{1}{2}m[\dot{x}(\lambda)]^2 + V[x(\lambda)]\right\} \tag{6}$$

and

$$\frac{1}{\hbar}S[x_1,\ldots,x_P] = \beta\sum_{k=1}^{P}\left\{\frac{mP}{2\beta^2\hbar^2}(x_k - x_{k+1})^2 + \frac{1}{P}V(x_k)\right\}. \tag{7}$$
In Eq. (7), $x_{P+1} = x_1$. It should be noted that $\rho_c(x_c)$ is proportional to the probability density of finding a classical cyclic chain polymer consisting of $P$ beads, which are connected by harmonic springs and subject to the potential $V(x)/P$, with their center of mass (the centroid) at $x = x_c$. The centroid also corresponds to the zero-frequency normal mode of the Fourier representation of the imaginary-time propagator. The constrained imaginary-time path integral in Eq. (5) can be computed using classical MD or MC simulations (PIMD and PIMC, respectively), even for relatively complex many-body systems [14, 15].
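As an illustration of this classical isomorphism, the following is a minimal sketch (our illustration, not part of the original text) of a path-integral Monte Carlo estimate of the centroid potential: an unconstrained cyclic chain is sampled with the discretized action of Eq. (7), and the histogram of its centroid yields $V_{cm}(x_c)$ up to an additive constant, per Eq. (5). The quartic potential and all numerical parameters are illustrative choices, and no efficiency devices (staging or normal-mode moves) are included.

```python
import numpy as np

hbar, m, beta, P = 1.0, 1.0, 8.0, 32          # model units; P = number of beads
def V(x):
    return 0.25 * x**4                         # illustrative anharmonic potential

def action_over_hbar(x):
    """Dimensionless action S/hbar of Eq. (7) for one cyclic chain x[0..P-1]."""
    spring = m * P / (2.0 * beta**2 * hbar**2) * np.sum((x - np.roll(x, -1))**2)
    return beta * (spring + np.sum(V(x)) / P)

rng = np.random.default_rng(0)
bins = np.linspace(-3.0, 3.0, 61)
hist = np.zeros(bins.size - 1)
x = np.zeros(P)
s = action_over_hbar(x)
for sweep in range(20000):
    for k in range(P):                         # single-bead Metropolis moves
        trial = x.copy()
        trial[k] += 0.3 * rng.uniform(-1.0, 1.0)
        s_trial = action_over_hbar(trial)
        if rng.random() < np.exp(s - s_trial):
            x, s = trial, s_trial
    hist += np.histogram(x.mean(), bins)[0]    # accumulate the centroid density

centers = 0.5 * (bins[1:] + bins[:-1])
mask = hist > 0
V_cm = -np.log(hist[mask]) / beta              # V_cm(centers[mask]), up to a constant
```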
The above definitions form the basis for an exact classical-like formulation of quantum statistical mechanics, which is summarized in Table 1.

Table 1. The centroid formulation of quantum statistical mechanics. $\hat{B}$ is an arbitrary linear combination of $\hat{x}$ and $\hat{p}$.

Partition function: standard $Z = \mathrm{Tr}\, e^{-\beta\hat{H}}$; centroid $Z = \int \frac{dx_c\, dp_c}{2\pi\hbar}\, \rho_c(x_c, p_c)$.
Thermal average: standard $\langle\hat{A}\rangle = \mathrm{Tr}\,[e^{-\beta\hat{H}}\hat{A}]/Z$; centroid $\frac{1}{Z}\int \frac{dx_c\, dp_c}{2\pi\hbar}\, \rho_c(x_c, p_c)\, A_c(x_c, p_c)$.
Kubo-transformed correlation function: standard $C^{Kubo}_{BA}(t) = \frac{1}{\beta Z}\int_0^\beta d\lambda\, \mathrm{Tr}\,[e^{-\beta\hat{H}}\hat{B}\,\hat{A}(t+i\hbar\lambda)]$; centroid $\frac{1}{Z}\int \frac{dx_c\, dp_c}{2\pi\hbar}\, \rho_c(x_c, p_c)\, B_c\, A_c[x_c, p_c; t]$.
The last line in Table 1 is of particular importance, since it relates the classical-like two-time centroid correlation function to the exact Kubo-transformed quantum-mechanical correlation function. More specifically, in the case where $\hat{B} = \hat{x}$ or $\hat{p}$ (or any linear combination of $\hat{x}$ and $\hat{p}$), the following identity holds

$$\frac{1}{2\pi\hbar}\int dx_c \int dp_c\, \rho_c(x_c, p_c)\, x_c\, A_c[x_c, p_c; t] = \frac{1}{\beta}\int_0^{\beta} d\lambda\, \mathrm{Tr}\!\left[e^{-\beta\hat{H}}\,\hat{x}\,\hat{A}(t + i\hbar\lambda)\right]. \tag{8}$$
It should be noted that Kubo-transformed correlation functions can be related to the corresponding regular correlation functions via a well-known identity [39]. However, the relationship in Eq. (8) is of little practical use, since the exact time dependence of the centroid symbol $A_c[x_c, p_c; t]$ is given by

$$A_c(x_c, p_c; t) = \mathrm{Tr}\!\left[e^{-i\hat{H}t/\hbar}\,\frac{\hat{\phi}_c(x_c, p_c)}{\rho_c(x_c, p_c)}\,e^{i\hat{H}t/\hbar}\,\hat{A}\right], \tag{9}$$

which requires the same amount of numerical effort to compute as its standard quantum mechanical analogue. Several computationally feasible approximations of the centroid dynamics have been proposed [37]. In all of them, the centroid is assumed to move on an effective potential, obtained by averaging over the higher normal modes of the imaginary-time path. Hence, the centroid is assumed to be effectively decoupled from the higher normal modes [38]. The CMD method, which is by far the most popular of those methods, is based on the following approximation [37]:

$$e^{-i\hat{H}t/\hbar}\,\frac{\hat{\phi}_c(x_c, p_c)}{\rho_c(x_c, p_c)}\,e^{i\hat{H}t/\hbar} \approx \frac{\hat{\phi}_c[x_c(t), p_c(t)]}{\rho_c(x_c(t), p_c(t))}, \tag{10}$$

such that

$$A_c[x_c, p_c; t] \approx A_c[x_c(t), p_c(t)]. \tag{11}$$
Here, $x_c(t)$ and $p_c(t)$ are propagated as classical-like position and momentum variables on the centroid potential, $V_{cm}(x_c)$ (cf. Eqs. (4) and (5)). We also note for later use that a centroid correlation function similar to that in Eq. (8), except for the fact that $x_c$ is replaced by $x_c^n$, where $n$ is an integer, can be shown to be identical to the corresponding high-order Kubo-transformed correlation function [39]. For example,

$$\frac{1}{2\pi\hbar}\int dx_c \int dp_c\, e^{-\beta[p_c^2/2m + V_{cm}(x_c)]}\, x_c^2\, A_c[x_c, p_c; t] = \frac{2}{\beta^2}\int_0^\beta d\beta_1 \int_0^{\beta_1} d\beta_2\, \mathrm{Tr}\!\left[e^{-\beta\hat{H}}\,\hat{x}(-i\beta_1\hbar)\,\hat{x}(-i\beta_2\hbar)\,\hat{A}(t)\right]. \tag{12}$$
The CMD approximation for the correlation function in Eq. (12) can then be obtained by applying Eq. (11) to $A_c[x_c, p_c; t]$. The calculation of correlation functions which do not have the same form as the ones considered above requires one or more of the following additional approximations [25, 32, 34, 39, 44, 52–54]: (1) approximate analytic continuation; (2) a second-order cumulant approximation; (3) approximate semiclassical representation of nonlinear operators; and (4) approximate classical representation of nonlinear operators. The results obtained via those methods should be used and interpreted with care. For example, the cumulant approximation will fail if the dynamics is not Gaussian, and will not lead to the correct classical limit. In a recent paper, Reichman et al. have derived a formal relationship between nonlinear centroid time correlation functions and high-order Kubo-transformed quantum correlation functions (cf. Eq. (12)) [39]. However, in practice, a numerically exact transformation of these high-order Kubo-transformed correlation functions into standard ones can become very difficult, particularly so in the case of highly nonlinear and/or many-body operators.
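To make the prescription of Eqs. (10) and (11) concrete, here is a minimal sketch (our illustration, not part of the original text) of the classical-like centroid propagation. It assumes that the centroid potential $V_{cm}$ has already been tabulated on a grid (e.g., by imaginary-time path-integral sampling, as in the sketch above); in production CMD calculations the centroid force is instead evaluated on the fly from constrained path averages. Correlation functions such as Eq. (8) are then estimated by averaging $x_c(0)\,A_c(t)$ over initial conditions drawn from $\rho_c(x_c, p_c)$.

```python
import numpy as np

def cmd_trajectory(x0, p0, grid, V_cm, m=1.0, dt=0.01, nstep=1000):
    """Velocity-Verlet propagation of the centroid pair (x_c, p_c) on a
    tabulated centroid potential V_cm(x), interpolated linearly."""
    dVdx = np.gradient(V_cm, grid)               # numerical derivative of V_cm
    force = lambda xx: -np.interp(xx, grid, dVdx)
    xs = np.empty(nstep)
    x, p = x0, p0
    f = force(x)
    for i in range(nstep):
        p += 0.5 * dt * f
        x += dt * p / m
        f = force(x)
        p += 0.5 * dt * f
        xs[i] = x
    return xs
```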
3. Reaction Rate Constants

3.1. Path Integral Quantum Transition State Theory (PI-QTST)
Activated barrier crossing events usually satisfy rare event statistics. In those processes, the exponentially small probability of activation, i.e., the probability of visiting a reactive configuration, is the dominant factor in determining the rate of the reaction. Advances in importance sampling methodology and MD simulation techniques have made the calculation of classical reaction rate constants into a routine procedure, which is able to provide accurate
results in condensed phase systems [55]. The situation is far less satisfactory when quantum effects such as tunneling and zero-point energy are significant, which is particularly relevant in the case of proton and electron transfer reactions. Numerous theoretical attempts have been made to devise approximate, yet reliable, methods for calculating quantum-mechanical reaction rate constants. One of the most successful among these is PI-QTST, which was proposed by Voth, Chandler, and Miller [21], and partially based on an idea due to Gillan [19, 20]. PI-QTST stands out in its unique ability to introduce an effective procedure for carrying out importance sampling in the context of quantum barrier crossing. It has also been found to be surprisingly reliable in applications to a wide range of condensed matter systems. The purpose of the present section is to introduce the essential characteristics of PI-QTST, and to overview various ways of improving it that were introduced over the past decade.

We will restrict ourselves, for the sake of simplicity, to the case of a Cartesian one-dimensional reaction coordinate. In the case where the coupling to the other degrees of freedom of the overall system (the bath) is weak, such that the motion along the reaction coordinate is almost ballistic near the reactive zone, one can apply the following PI-QTST rate expression

$$k_{\mathrm{PI\text{-}QTST}} = \frac{1}{h\beta\, Z_r}\, e^{-\beta V_{cm}(x_c^*)}, \tag{13}$$
where $x_c^*$ corresponds to the position of the barrier top in the centroid potential, $V_{cm}(x_c)$, as defined by Eq. (4), and $Z_r$ is the reactant partition function. Test calculations [21] show that this expression is virtually exact if the quantum tunneling is confined to the close vicinity of the barrier top. A simple mathematical criterion for this situation is that $\beta < 2\pi/(\hbar\omega_b)$, where $\omega_b$ is the barrier frequency (i.e., the angular frequency of the inverted harmonic oscillator potential that can be fitted to the barrier top). In the low-temperature deep tunneling regime of $\beta > 2\pi/(\hbar\omega_b)$, unless the potential barrier is very asymmetric, Eq. (13) still provides good qualitative estimates of the exponential factors, although the prefactors tend to be underestimated by about $2\pi/(\beta\hbar\omega_b)$.

In many respects, the role of PI-QTST in the quantum regime is similar to that of TST in the classical regime. The evaluation of Eq. (13) does not require any quantum dynamics; it only requires a feasible imaginary-time path integral simulation for the calculation of $V_{cm}(x_c)$ as a function of the reaction coordinate centroid. Thus, PI-QTST represents a rather accurate and affordable approach to calculating quantum reaction rate constants in realistic condensed phase systems. It is expected to provide quantitative predictions if the conditions underlying its validity are met, and valuable qualitative insight otherwise. In this respect, it is not surprising that attempts have been made to improve PI-QTST in ways analogous to those employed for improving classical TST. For example, improvements in the spirit of variational TST
(VTST) [22, 56, 57] and the reactive flux formulation [58] have been incorporated into PI-QTST, and have been shown to lead to better results. Corrections due to quantum friction, within the Kramers model, have also been incorporated into PI-QTST [22, 59]. However, unlike classical TST, the dynamical basis underlying PI-QTST has not been clear from the beginning. More specifically, the assumptions underlying PI-QTST are somewhat ambiguous, which makes it difficult to systematically correct for its shortcomings in the deep tunneling regime and in the case of a very asymmetric barrier. Suggestions for quantum dynamical corrections of Eq. (13) have been made, based either on a rigorous quantum dynamical formulation [21, 60], which has not been tested even for model systems, or on an empirical relation [61] that does not seem to have general validity. Notable progress has been made in two respects. The first is the formulation of a unified rate theory [62], which is based on a rate expression suggested by Affleck [63]. When implemented within the path integral centroid formalism, this theory provides an improved dynamical correction factor. The second is the modification of PI-QTST for very asymmetric barriers [64], which also clarified the source of errors within the semiclassical approximation.
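To make the evaluation of Eq. (13) concrete, the following is a minimal numerical sketch (our illustration, not from the original text). We work in units with $\hbar = 1$, so $h = 2\pi$; the double-well stand-in for $V_{cm}$, the dividing point $x_c^* = 0$, and the definition of $Z_r$ as a one-dimensional reactant partition function including the momentum integral are all illustrative assumptions, since conventions for $Z_r$ vary.

```python
import numpy as np

beta, m = 8.0, 1.0
h = 2.0 * np.pi                                # Planck's constant with hbar = 1
x = np.linspace(-4.0, 4.0, 4001)
V_cm = 0.25 * (x**2 - 1.0)**2                  # stand-in for a computed centroid potential
V_star = V_cm[np.argmin(np.abs(x))]            # barrier top at x_c* = 0 by construction

left = x < 0.0                                 # reactant side of the dividing point
Z_r = np.sqrt(2.0 * np.pi * m / beta) / h * np.trapz(np.exp(-beta * V_cm[left]), x[left])

k_piqtst = np.exp(-beta * V_star) / (h * beta * Z_r)   # Eq. (13)
print(k_piqtst)
```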
3.2. A Centroid Linear-response Approach to Rate Constants
It has recently been argued that linear response theory provides an alternative route for calculating rate constants from CMD simulations, without resorting to additional approximations [40, 41]. In this approach, one starts with the system at a nonequilibrium state, which, in its most general form, is given by

$$\hat{\rho}(0) = \frac{e^{-\beta\hat{H}}}{Z} + \hat{\Delta}. \tag{14}$$
The deviation from equilibrium, $\hat{\Delta}$, must obviously satisfy $\mathrm{Tr}(\hat{\Delta}) = 0$, $\hat{\Delta}^{\dagger} = \hat{\Delta}$, and keep $\hat{\rho}(0)$ positive. The relaxation of the quantity of interest, $\hat{A}$, to equilibrium is then given by

$$\delta A(t) = \mathrm{Tr}[\hat{\Delta}\,\delta\hat{A}(t)], \tag{15}$$
where $\delta\hat{A} = \hat{A} - \langle\hat{A}\rangle_{eq}$. Assuming that the relaxation of $\delta A(t)$ is exponential, i.e., that $\delta A(t)/\delta A(0) = e^{-kt}$, then leads to the following expression for the rate constant

$$k = -\frac{\mathrm{Tr}\{\hat{\Delta}\,\dot{\hat{A}}(t_p)\}}{\mathrm{Tr}\{\hat{\Delta}\,\delta\hat{A}\}}, \tag{16}$$
where $\dot{\hat{A}} = i[\hat{H}, \hat{A}]/\hbar$ is the operator that represents the flux of $A$, and $0 < t_p \ll k^{-1}$ is a relatively short transient time preceding the onset of rate kinetics. The important point is that the actual value of the rate constant is, by definition, insensitive to the details of the initial state. This translates into flexibility in the choice of $\hat{\Delta}$, which one can take advantage of when it comes to methods like CMD that are more directly applicable to specific types of correlation functions. More specifically, substituting $\hat{\Delta} = \int_0^\beta d\lambda\, e^{-(\beta-\lambda)\hat{H}}\,\delta\hat{x}\,e^{-\lambda\hat{H}}/Z$ into Eq. (16) yields an expression for the rate constant in terms of a correlation function that can be evaluated directly from CMD simulations

$$k = -\frac{C^{Kubo}_{\delta x,\dot{A}}(t)}{C^{Kubo}_{\delta x,\delta A}} = -\frac{\int dx_c\, dp_c\, \rho_c(x_c, p_c)\,\delta x_c\,\dot{A}_c[x_c, p_c; t]}{\int dx_c\, dp_c\, \rho_c(x_c, p_c)\,\delta x_c\,\delta A_c[x_c, p_c]}. \tag{17}$$
In cases where another type of perturbation is needed, it is possible to take advantage of the relationship between $\int dx_c\, dp_c\, \rho_c(x_c, p_c)\, x_c^n\, A_c[x_c, p_c; t]$ and higher-order Kubo-transformed correlation functions [39]. For example, substituting

$$\hat{\Delta} = \frac{e^{-\beta\hat{H}}}{Z}\int_0^\beta d\beta_1 \int_0^{\beta_1} d\beta_2\, \delta\hat{x}(-i\beta_1\hbar)\,\delta\hat{x}(-i\beta_2\hbar) \tag{18}$$
into Eq. (16) would yield

$$k = -\frac{\int_0^\beta d\beta_1 \int_0^{\beta_1} d\beta_2\, \left\langle \delta\hat{x}(-i\beta_1\hbar)\,\delta\hat{x}(-i\beta_2\hbar)\,\dot{\hat{A}}(t) \right\rangle_{eq}}{\int_0^\beta d\beta_1 \int_0^{\beta_1} d\beta_2\, \left\langle \delta\hat{x}(-i\beta_1\hbar)\,\delta\hat{x}(-i\beta_2\hbar)\,\delta\hat{A} \right\rangle_{eq}} = -\frac{\int dx_c\, dp_c\, \rho_c(x_c, p_c)\,\delta x_c^2\,\dot{A}_c[x_c, p_c; t]}{\int dx_c\, dp_c\, \rho_c(x_c, p_c)\,\delta x_c^2\,\delta A_c[x_c, p_c]}. \tag{19}$$
To summarize, the linear response approach outlined above allows us to express any quantum mechanical rate constant in terms of a correlation function that can be obtained directly from centroid dynamics simulations. It should be noted that the above formulation is completely general, and can be incorporated into any approximate scheme for simulating the centroid dynamics. It should also be emphasized that the accuracy of the result will generally depend on the ability of the particular approximation of the centroid dynamics to capture the quantum aspects which are relevant to the specific relaxation process under consideration.
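In practice, once the centroid correlation functions appearing in Eq. (17) or Eq. (19) have been accumulated from an ensemble of CMD trajectories, the rate constant is read off the plateau of the running estimate $k(t)$. The following is a minimal sketch of that final step (our illustration; the 20% plateau window is a heuristic choice, and the correlation arrays are assumed to have been computed already):

```python
import numpy as np

def rate_from_plateau(t, C_num, C_den, window=0.2):
    """k(t) = -C_num(t)/C_den; average over the final `window` fraction of the
    time axis, where the transient preceding rate kinetics has decayed."""
    k_t = -np.asarray(C_num) / C_den
    sel = np.asarray(t) >= (1.0 - window) * t[-1]
    return k_t[sel].mean(), k_t
```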
3.3. Reaction Rate Constants from CMD Simulations
In this subsection, we consider the application of the linear response approach outlined in Section 3.2 to the calculation of reaction rate constants.
In the case of reaction kinetics, the observed quantity is the instantaneous mole fraction of the product. Within the linear response framework, this dictates that $\hat{A} = h(\hat{s})$ in Eq. (15), where $\hat{s}$ is the operator that represents the reaction coordinate and $h(\hat{s})$ is the Heaviside operator. The expectation value of $h(\hat{s})$ corresponds to the mole fraction of the product, and its flux is given by $\dot{h}(\hat{s}) = [\hat{p}\,\delta(\hat{s}) + \delta(\hat{s})\,\hat{p}]/2m$. The standard reactive flux method [1, 18, 65, 66] can be derived by choosing an initial state such that $\hat{\Delta} = \int_0^\beta d\lambda\, e^{-(\beta-\lambda)\hat{H}}\,\delta h(\hat{s})\,e^{-\lambda\hat{H}}/Z$. It leads to equivalent expressions for the reaction rate constant in terms of either the flux-Heaviside or flux-flux correlation functions, which cannot be obtained directly from centroid dynamics simulations. However, we may choose another initial state, since the rate constant is independent of it. More specifically, setting $\hat{\Delta} = \int_0^\beta d\lambda\, e^{-(\beta-\lambda)\hat{H}}\,\delta\hat{s}\,e^{-\lambda\hat{H}}/Z$ leads to the following expression for the reaction rate constant

$$k = -\frac{C^{Kubo}_{\hat{s},\hat{F}}(t_p)}{C^{Kubo}_{\delta\hat{s},\delta\hat{h}}(0)}. \tag{20}$$
The position-flux correlation function in Eq. (20) can then be evaluated directly from simulations of the centroid dynamics (cf. Eq. (17)). Quantum corrections are introduced into the resulting centroid approximation of the reaction rate constant in two distinctly different ways:

• The centroid symbol of the flux, which is given explicitly in Ref. [40], requires that the initial value of the reaction coordinate centroid, $s_c$, be sampled from a distribution of finite width. This should be contrasted with the classical analogue, where all trajectories start at the barrier top. This initial distribution of $s_c$ reflects quantum delocalization, and becomes wider as the temperature and friction are lowered.

• The dynamics takes place on the centroid potential, rather than the classical potential. The barrier on the centroid potential is lower than its classical counterpart, and increasingly so as the temperature and friction are decreased, which is a reflection of tunneling and zero-point energy effects.

The linear response centroid methodology outlined above for calculating the unimolecular reaction rate constant from CMD simulations has been applied to a symmetrical [40] and an asymmetrical [67] double-well potential, both of which are bilinearly coupled to a bath of harmonic oscillators. It was found that CMD is able to quantitatively capture most of the quantum enhancement of the reaction rate over a wide range of temperatures and frictions. It was also found that the reaction rate constants obtained from CMD coincide with those obtained from PI-QTST at high frictions. At intermediate frictions, the predictions of PI-QTST were found to be in slightly better agreement with the exact result, which is likely to be accidental. However, CMD, being a
dynamical method, could capture the turnover behavior at low frictions, which PI-QTST, being a TST method, could not.

The delocalized nature of the initial distribution of the reaction coordinate centroid makes it increasingly more difficult to calculate the rate constant at very low temperatures. As the temperature decreases, the initial distribution acquires a bi-modal shape with sharp peaks on both sides of, and far away from, the barrier top. This distribution results in a situation where the large majority of the trajectories start far from the barrier top and therefore have a very small likelihood of crossing the barrier. At the same time, a small minority of the trajectories, which start in the close vicinity of the barrier top, are very likely to cross the barrier. As it turns out, the two types of trajectories make comparable contributions to the rate constant (the low likelihood of crossing the barrier is compensated for by the high probability of starting far from the barrier top). Efficient sampling of the trajectories that start in the close vicinity of the barrier top is possible via umbrella sampling. However, sampling of the trajectories that start far from the barrier top is made increasingly more demanding by the inherent rare event statistics. In other words, more and more trajectories need to be sampled in order to obtain good statistics by having enough of them cross the barrier.

Shi and Geva have shown that, at least for the case of bilinear coupling to a harmonic bath, one can overcome this problem by resorting to classical-like sampling, where all the trajectories start at the barrier top, as defined with respect to the centroid potential [68]. Although this is an approximation, it was shown that the error is given by a factor whose value is of the order of unity (except at extremely low temperatures and frictions). The expression for the reaction rate constant, within this approximation, is identical to the classical one, except for the fact that the classical potential is replaced by the centroid potential. This approximation was found to perform well when tested on the above-mentioned benchmark problems [67, 68], and extends the applicability of the CMD-based method to very low temperatures that would have been difficult to access via the original method.
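The classical-like sampling just described reduces, in this approximation, to an ordinary reactive-flux calculation carried out on the centroid potential. The following minimal sketch (our illustration, with an illustrative double-well $V_{cm}$ and arbitrary parameters) launches all trajectories from the barrier top of $V_{cm}$ with Maxwell-Boltzmann momenta and accumulates a flux-weighted transmission coefficient:

```python
import numpy as np

beta, m, dt, nstep, ntraj = 8.0, 1.0, 0.01, 2000, 4000
force = lambda s: -s * (s * s - 1.0)           # -dV_cm/ds for V_cm = (s^2 - 1)^2 / 4
s_star = 0.0                                    # barrier top of the centroid potential

rng = np.random.default_rng(1)
p0 = rng.normal(0.0, np.sqrt(m / beta), ntraj)  # Maxwell-Boltzmann momenta
num = 0.0
for p in p0:
    s, v, f = s_star, p / m, force(s_star)
    for _ in range(nstep):                      # velocity Verlet on V_cm
        v += 0.5 * dt * f / m
        s += dt * v
        f = force(s)
        v += 0.5 * dt * f / m
    num += (p / m) * (s > s_star)               # initial flux times product indicator
kappa = num / (np.abs(p0).sum() / (2.0 * m))    # normalized transmission coefficient
```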
3.4. Relationship Between PI-QTST and CMD
The formulation of path integral centroid dynamics [36] and the identification of the mathematical procedure leading to the CMD approximation [37] raise a natural question about the possibility of understanding the dynamical basis of PI-QTST in a similar manner. The major theoretical difficulty in this effort has been that the standard expression for the reaction rate involves population-flux or flux-flux time correlation functions, all of which are nonlinear functions of the coordinate. According to the formulation of real-time centroid dynamics [36], at least one physical observable should be linear in the position or momentum
for the calculation of a time correlation function based on the centroid approach to be possible. An alternative formulation was developed by Jang and Voth [28], where the reaction rate is defined as the steady-state decay rate of the reactant population. Approximation of the reactant state in terms of an effective reactant centroid density leads to a rate expression given by the correlation between the position centroid and the time-dependent flux operator. Application of the path integral centroid dynamics formalism [36], and of the CMD approximation for the time evolution of the flux operator, then leads to the following, surprisingly simple, expression for the reaction rate constant:

$$k_{\mathrm{CMD}} = \kappa(d)\, k_{\mathrm{PI\text{-}QTST}}, \tag{21}$$
which involves a new quantum transmission factor defined as

$$\kappa(d) = \int_{-\infty}^{\infty} dx_c\, \frac{\langle d|\hat{\phi}_c(x_c, p_c)|d\rangle}{\rho_c(x_c, p_c)}, \tag{22}$$
where the dividing surface is positioned at $d$. The occurrence of $\kappa(d)$ in Eq. (21) and its dependence on the dividing surface is due to the approximate nature of CMD. In the high-temperature limit, numerical tests show that $\kappa(d)$ is independent of the value of $d$. In the low-temperature limit, $\kappa(d)$ depends on the choice of $d$. It was argued that the best choice is the value of $d$ maximizing $\kappa(d)$, based on the rationale that the CMD approximation tends to underestimate the true tunneling rate. Test calculations [28] for symmetric and cut-off asymmetric Eckart barriers showed that this choice indeed provides improved estimates in comparison to PI-QTST.

As was shown in Section 3.2, the linear response CMD expression for the reaction rate constant can be approximated by the classical expression for the rate constant, as long as we replace the classical potential with the centroid potential. Taking advantage of the fact that the rate constant is independent of the initial condition, one may then revert from an expression which is given in terms of the position-flux correlation function to one that is given in terms of the Heaviside-flux correlation function. The resulting approximation coincides with that previously proposed by Schenter et al. as a way of introducing dynamical corrections into PI-QTST [58]. One can then go in the opposite direction, and establish a general relationship between the CMD-based expression for the rate constant and PI-QTST. This result extends the relationship between CMD and PI-QTST developed by Jang and Voth in Ref. [28] to situations that involve bounded reactive potentials and coupling to a bath.
3.5. Applications of PI-QTST
In the present subsection, we provide some examples of realistic systems to which PI-QTST or its variations have been applied. Our intent is not to present an extensive list of references, but rather to emphasize the range of systems for which the application of PI-QTST is possible. Our choice of examples is thus rather subjective, but we believe these cases are representative of the range of applications of PI-QTST to date.
3.5.1. Hydrogen diffusion on metallic surfaces

One of the first applications of PI-QTST was to study hydrogen diffusion on metal surfaces [69]. Because of the general nature of the PI-QTST method, the problem could be studied for the first time in a highly realistic fashion, i.e., by simultaneously including the effects of anharmonic interactions, lattice distortions, and phonon fluctuations. It was found that these effects were large and could not be neglected (e.g., through frozen surface approximations). Subsequently, Rick et al. [70] applied PI-QTST to the calculation of hydrogen and deuterium diffusion rates on the Pd(111) surface. They found significant quantum effects for the surface and subsurface transitions. Their calculations showed that the quantum effects for hydrogen increase the diffusion rate by a factor of two even at room temperature. Mattsson and Wahnström [71] also calculated the isotope effects for the quantum diffusion of hydrogen on the Ni(001) surface. They found that the calculated results are in quantitative agreement with experimental results at room temperature, and that the crossover to the temperature-independent deep tunneling regime occurs at about T = 120 K.
3.5.2. Diffusion of helium atoms in zeolites

Murphy et al. [72] have calculated the diffusion rate of helium atoms in zeolites. Between the two competing effects of tunneling and zero-point motion, they found, surprisingly, that the latter dominates below 100 K. The resulting quantum effect was found to lower the diffusion rate by about a factor of 9 at 50 K.
3.5.3. Recombination of atomic impurities in solid hydrogen

Jang and Voth [73] applied PI-QTST to the calculation of the recombination rate of two lithium atoms in solid para-hydrogen at 4 K. Both the lithium atoms and the hydrogen molecules exhibit significant quantum behavior in this low temperature limit. The result of the calculation was also found to be
consistent with experimental observations. A comparison with a calculation for classical lithium atoms showed that the quantum effects of the lithium atoms alone enhance the recombination rate by about a factor of 200 at 4 K. Similar calculations were also performed for boron atom recombination in solid hydrogen [74].
3.5.4. Proton transfer in water

Schmitt and Voth [43, 75, 76] have presented an extensive study of the hydrated proton in liquid water using their multi-state empirical valence bond model. In these studies, they performed calculations of the PI-QTST quantum free energy barrier along the proton transfer reaction coordinate that represents the variation between the two distinct forms of the solvated proton, called the Eigen and Zundel cations. Comparison with classical calculations showed that the quantum effects lower the free energy barrier by about 0.4 kcal/mol, thereby predicting that the quantum effect enhances the proton transport rate by about a factor of two at 300 K. This enhancement was confirmed by direct CMD simulation [43, 76]. Earlier work by Lobaugh and Voth [77, 78] had already demonstrated the power and generality of the PI-QTST approach for elucidating the various complex features of proton transfer reactions in polar solvents.
3.5.5. Enzymatic reactions

In principle, one of the most significant future applications of PI-QTST will be in the calculation of accurate isotope effects for enzymatic reactions, especially those involving proton or hydride transfers, as increasingly accurate potential energy functions or hybrid "QM/MM" methods become available for describing such reactions. As a recent example, Feierberg et al. [79] have applied an approximate version of PI-QTST to a rate-limiting proton abstraction reaction in the enzyme glyoxalase I (see also references cited therein for other examples). They found that significant isotope effects exist even at physiological temperatures. Namely, the H/D kinetic isotope effect was found to be about a factor of five. However, these authors also found a similar isotope effect for the uncatalyzed reaction, and thus they concluded that the quantum effect, while significant, does not have a net effect on the enzyme catalysis.
3.5.6. Intramolecular proton transfer

Iftimie and Schofield [80] have applied PI-QTST to the tautomerization reaction of the enol form of malonaldehyde, and studied the effects of the
quantization of the proton and also of other secondary nuclear degrees of freedom, i.e., those involving carbon and oxygen. They found that the quantization of the proton lowers the free energy barrier by 2.5 kcal/mol at 300 K. This study was similar to the earlier work published by Hinsen and Roux on proton transfer in acetylacetone [81].
3.5.7. Completely first-principles reaction rates

In a "proof of concept" paper, Pavese et al. [49] showed how ab initio MD could be combined with CMD to calculate completely "first principles" rate constants and other dynamical properties (albeit at a considerable computational cost). In this approach, both the quantization of the electrons (in their adiabatic ground state) and the quantization of the nuclear motions are included within a single computational algorithm. This method was subsequently used by Tuckerman and Marx [82] to study the simultaneous effects of skeleton atom quantization and proton tunneling in an intramolecular hydrogen transfer reaction.
4. Related Topics
In this section, two related topics (self-diffusion rates and vibrational energy relaxation rates) will be discussed.
4.1. Self-diffusion Rates
The accurate determination of the quantum transport properties, specifically the self-diffusion coefficient, of many-body systems remains at the forefront of scientific effort. The velocity time autocorrelation function, which serves as one of the most relevant quantities for characterizing the behavior of disordered liquids, provides one means of obtaining the self-diffusion coefficient, through the well-known Green–Kubo relation,

$$D = \frac{1}{3}\int_0^{\infty} dt\, C_v(t). \tag{23}$$
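As a minimal sketch of how Eq. (23) is evaluated in practice (our illustration, not from the original text), the following routine estimates $D$ from a velocity time series, e.g., one produced by a CMD or classical MD run; the time integral is necessarily truncated at the trajectory length, which is an assumption that must be checked against the decay of $C_v(t)$:

```python
import numpy as np

def diffusion_coefficient(v, dt):
    """Green-Kubo estimate of D from a velocity time series v of shape
    (nsteps, natoms, 3), using an FFT-based autocorrelation."""
    nt = v.shape[0]
    vf = np.fft.rfft(v, n=2 * nt, axis=0)                 # zero-padded transform
    acf = np.fft.irfft(vf * np.conj(vf), axis=0)[:nt]     # linear autocorrelation
    acf /= (nt - np.arange(nt))[:, None, None]            # unbiased lag normalization
    C_v = acf.sum(axis=2).mean(axis=1)                    # <v(0).v(t)>, atom-averaged
    return np.trapz(C_v, dx=dt) / 3.0                     # Eq. (23)
```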
As such, a great deal of theoretical effort has been applied to developing methods which are capable of accurately determining the velocity time autocorrelation function in systems which exhibit quantum properties. CMD time correlation functions, which are exact for quadratic potentials in the classical limit and provide well-defined approximations to the exact Kubo-transformed time correlation functions in the anharmonic case, allow one to obtain accurate approximate quantum velocity autocorrelation functions through the frequency-space identity relating the Kubo-transformed time correlation function to its quantum counterpart.
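For the reader's convenience, the frequency-space identity invoked here is the standard relation between the Fourier transforms of the ordinary and Kubo-transformed autocorrelation functions (quoted here in the notation of Section 2; it is not spelled out in the original text):

$$\tilde{C}_{AA}(\omega) = \frac{\beta\hbar\omega}{1 - e^{-\beta\hbar\omega}}\,\tilde{C}^{Kubo}_{AA}(\omega).$$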
The unique quantum characteristics of liquid hydrogen have led to intensive studies, both experimental and theoretical [46–48, 51, 83–95]. Liquid hydrogen exhibits important nuclear quantum effects, and yet particle exchange is negligible. While much progress has been made, an accurate characterization of the dynamical behavior of such systems remains a challenging problem. The CMD method has proven to be an invaluable tool in accurately characterizing the dynamical behavior of liquid hydrogen systems. Indeed, recent CMD studies [95] have accurately determined the self-diffusion coefficients for several such systems, as compared to the measured experimental values. For example, these CMD studies have determined the following self-diffusion coefficients: 0.35 Å²/ps for liquid para-hydrogen at T = 14 K; 1.52 Å²/ps for higher-temperature, T = 25 K, liquid para-hydrogen; and 0.40 Å²/ps for liquid ortho-deuterium at T = 20.7 K. The experimental values are 0.4, 1.6, and 0.36 Å²/ps, respectively. The good agreement between the CMD results and those of experiment reflects the ability of CMD to account for the most prominent dynamical quantum effects, where classical MD simulations are known to fail. It is also worth noting that the self-diffusion coefficients predicted by CMD are more accurate than those predicted by other theoretical methods [84, 85, 88].

The experimental determination of the actual velocity time autocorrelation functions is a difficult problem, due to the difficulties in measuring the dynamic structure factor at low wave vectors. The CMD studies have provided reliable approximations for these functions as well, and have shown that hard, velocity-reversing collisions are more prominent in colder, more dense, liquid para-hydrogen [42, 45, 95]. The relaxation time for the liquid para-hydrogen and ortho-deuterium systems was also established. The frequency representation of the velocity autocorrelation function further elucidated the distribution of the single-phonon density of states in both liquid para-hydrogen and ortho-deuterium.

Experimentally, neutron scattering techniques have become a prominent source of dynamical information for liquid hydrogen. Recently, both the coherent and incoherent dynamic structure factors of liquid hydrogen have been determined experimentally [51, 83, 92]. Celli et al. [83] compared their experimentally determined incoherent dynamic structure factor of liquid para-hydrogen with that predicted by CMD and found excellent agreement for all values of the wave vectors and frequencies considered. Similarly, good agreement was found between the experimentally determined coherent dynamic structure factors and those predicted by CMD over a range of wave vectors [48, 51].
Of particular interest was the ability of CMD to predict well-defined collective density fluctuations in liquid para-hydrogen which are consistent with quantum effects in the low-temperature regime. Furthermore, CMD was able to reasonably predict the de Gennes narrowing phenomenon resulting from the reduction of collective excitations as the wave vector increases. CMD has been equally successful in determining the dynamic structure factors in liquid ortho-deuterium [95].
4.2. Vibrational Energy Relaxation Rate Constants
The problem of vibrational energy relaxation (VER) in the condensed phase has received much attention over the last few decades [96–103]. The VER rate provides a sensitive probe of intramolecular dynamics and solute-solvent interactions, which are known to have a crucial impact on other important properties, such as chemical reactivity, solvation dynamics, and transport coefficients. The standard approach to VER is based on the Landau–Teller (LT) formula, which gives the VER rate constant in terms of the Fourier transform (FT), at the vibrational frequency, of a certain short-lived force-force correlation function (FFCF), which can be calculated from equilibrium MD simulations with a rigid solute. It should be noted that the derivation of the LT formula is based on several assumptions, namely: weak coupling between the solute and solvent, separation of time scales (such that the VER lifetime is much longer than the correlation time of the FFCF), and the rotating wave approximation (RWA) [104]. The fact that the frequency of most molecular vibrations is high, in the sense that $\hbar\omega/k_BT \gg 1$, dictates that the quantum-mechanical FFCF, rather than the classical FFCF, should be used in the LT formula. The most popular approach for dealing with this difficulty is to first evaluate the FT of the classical FFCF, and then multiply the result by a frequency-dependent quantum correction factor (QCF) [96, 105–120]. Various approximate QCFs have been proposed in the literature. Unfortunately, estimates obtained from different QCFs can differ by orders of magnitude, particularly so when high-frequency vibrations are involved. Thus, finding more rigorous ways of computing VER rate constants is clearly highly desirable.
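The QCF route just described can be sketched in a few lines (our illustration, not from the original text). We use the so-called harmonic QCF, $\beta\hbar\omega/(1 - e^{-\beta\hbar\omega})$, as one common choice among the many proposed in the cited literature; as noted above, different QCFs can give very different answers at high frequency. The remaining LT prefactor (reduced mass, frequency, and the convention adopted for the FFCF) is deliberately left out, since it varies between formulations:

```python
import numpy as np

def quantum_corrected_ffcf_spectrum(t, C_ff_cl, omega, beta, hbar=1.0):
    """One-sided cosine transform of a classical force-force correlation
    function at the vibrational frequency, multiplied by the 'harmonic'
    quantum correction factor. The VER rate 1/T1 is proportional to this
    quantity, up to a convention-dependent prefactor omitted here."""
    ft_cl = 2.0 * np.trapz(C_ff_cl * np.cos(omega * t), t)
    qcf = beta * hbar * omega / (1.0 - np.exp(-beta * hbar * omega))  # harmonic QCF
    return qcf * ft_cl
```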
Previous attempts [44, 52–54] to apply CMD to the calculation of VER rate constants were complicated by the fact that the force in the FFCF involves a highly nonlinear function of the coordinates, and therefore cannot be directly obtained from CMD simulations without additional approximations for the nonlinear operators. Shi and Geva have recently proposed a linear-response-based approach to VER [104], which made it possible to calculate the VER rate constant from CMD simulations, without the introduction of further approximations [41]. As was shown in Refs. [41] and [104], an LT-type VER requires that the initial deviation from equilibrium be quadratic in the vibrational coordinate, $q$. To this end, one may make the following particular choice of $\hat{\Delta}$:

$$\hat{\Delta} = \frac{e^{-\beta\hat{H}}}{Z}\int_0^\beta d\beta_1 \int_0^{\beta_1} d\beta_2 \left[\delta\hat{q}(-i\beta_1\hbar)\,\delta\hat{q}(-i\beta_2\hbar) - \left\langle\delta\hat{q}(-i\beta_1\hbar)\,\delta\hat{q}(-i\beta_2\hbar)\right\rangle_{eq}\right]. \tag{24}$$
This $\hat{\Delta}$ is obviously quadratic in $q$, and at the same time leads to an expression
for the VER rate constant in terms of a correlation function that can be directly obtained from CMD simulations (cf. Eq. (19)). Geva and Shi have also derived a centroid LT formula for the VER rate constant, by making the same assumptions as in the derivation of the original LT formula. The resulting centroid LT formula turned out to be similar to the classical LT formula, except for the fact that the force is replaced by its centroid symbol, and the dynamics takes place on the centroid potential, rather than on the original classical potential (see Ref. [41] for more details).

By virtue of this approach, the centroid VER rate constant has been calculated for several models: (1) a vibrational mode coupled to a harmonic bath, with the coupling exponential in the bath coordinates; (2) a diatomic molecule coupled to a short linear chain of helium atoms; and (3) a "breathing sphere" diatomic molecule in a two-dimensional monoatomic Lennard–Jones liquid. It was confirmed that CMD is able to capture the main features of the quantum mechanical force-force correlation function rather well, in both the time and frequency domains. At the same time, it was observed that CMD was unable to accurately predict the high-frequency tail of the quantum-mechanical power spectrum of this correlation function, which limits its usefulness for calculating VER rate constants of high-frequency molecular vibrations. Interestingly, a recent calculation of the VER rate constant which was based on the linearized semiclassical initial-value-representation (LSC-IVR) method revealed that the high-frequency tail of the FFCF power spectrum is dominated by a non-classical term which is very sensitive to quantum fluctuations of the force around its average value [121]. The centroid symbol of the force corresponds to the average force over the corresponding imaginary-time cyclic path, and as a result seems to miss this effect. Another fact in favor of this interpretation comes from a recently established relationship between LSC-IVR and CMD [38], where it was shown that the centroid correlation function can be obtained from the LSC-IVR correlation function by decoupling the centroid, which corresponds to the zero-frequency normal mode of the corresponding imaginary-time cyclic path, from the higher normal modes. These higher normal modes appear to be responsible for the very same quantum
fluctuations that seem to play a key role in VER. This suggests that more advanced centroid dynamics approaches in the future should take these issues into consideration.
5. Conclusions, Future Prospects and Open Problems
The path integral centroid approach combines the feasibility of imaginary-time path integral simulations with classical-like centroid dynamics, with the result that real-time quantum dynamical information can be obtained from imaginary-time simulations. The development of these methods has followed a rather unusual path, in which the practical applications came first and the justifications behind these applications followed later. These theories have clearly established the conditions under which the application of path integral centroid methods is reliable, but, on the other hand, they have also opened up theoretical issues that must be resolved for these methods to become even more applicable. Our current level of understanding is that all of the existing centroid methods for actual calculations can be derived from the formulation of Ref. [36] and the application of the CMD approximation. Model calculations and tests for realistic systems show that these practicable centroid methods provide reliable information on the incoherent and stationary quantum dynamics for a broad range of condensed phase systems. Considering the generality of this approach, which does not rely on the specific nature of the system, this clearly constitutes an important advance. However, as is the case with any newly developing methodology, important theoretical issues are pending, and need to be resolved in order to make the centroid methods even more accurate and general. We will therefore conclude this chapter by listing the following four prominent issues: (i) developing more accurate, yet feasible, schemes for simulating the time evolution of centroid variables; (ii) extending the methodology to include quantum statistics and electronically nonadiabatic dynamics (significant progress has already been made on both the former [122–128] and latter [129] fronts); (iii) extension of the existing centroid theories to general nonlinear time correlation functions; and (iv) generalization of the centroid theories to nonequilibrium scenarios.
References

[1] T. Yamamoto, J. Chem. Phys., 33, 281, 1960. [2] A.G. Redfield, Adv. Mag. Reson., 1, 1, 1965. [3] B.J. Berne and G.D. Harp, Adv. Chem. Phys., 17, 63, 1970. [4] D.A. McQuarrie, Statistical Mechanics (Harper and Row, New York, 1976).
[5] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II - Nonequilibrium Statistical Mechanics (Springer-Verlag, Berlin, 1983). [6] W.T. Pollard, A.K. Felts, and R.A. Friesner, Adv. Chem. Phys., XCIII, 77, 1996. [7] N. Makri, Annu. Rev. Phys. Chem., 50, 167, 1999. [8] P. Jungwirth and R.B. Gerber, Chem. Rev., 99, 1583, 1999. [9] R.P. Feynman and A.R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill Book Company, New York, 1965). [10] R.P. Feynman, Statistical Mechanics (Addison-Wesley Publishing Company, New York, 1972). [11] H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, and Polymer Physics (World Scientific, Singapore, 1995). [12] U. Weiss, Series in Modern Condensed Matter Physics Vol. 2: Quantum Dissipative Systems (World Scientific, Singapore, 1993). [13] W.H. Miller, J. Phys. Chem. A, 105, 2942, 2001. [14] B.J. Berne and D. Thirumalai, Annu. Rev. Phys. Chem., 37, 401, 1986. [15] D.M. Ceperley, Rev. Mod. Phys., 67, 279, 1995. [16] R. Giachetti and V. Tognetti, Phys. Rev. Lett., 55, 912, 1985. [17] R.P. Feynman and H. Kleinert, Phys. Rev. A, 34, 5080, 1986. [18] W.H. Miller, J. Chem. Phys., 61, 1823, 1974. [19] M.J. Gillan, Phys. Rev. Lett., 58, 563, 1987. [20] M.J. Gillan, J. Phys. C, 20, 3621, 1987. [21] G.A. Voth, D. Chandler, and W.H. Miller, J. Chem. Phys., 91, 7749, 1989. [22] G.A. Voth, Chem. Phys. Lett., 170, 289, 1990. [23] R.P. McRae, G.K. Schenter, B.C. Garrett, G.R. Haynes, G.A. Voth, and G.C. Schatz, J. Chem. Phys., 97, 7392, 1992. [24] G.A. Voth, J. Phys. Chem., 97, 8365, 1993. [25] G.A. Voth, Adv. Chem. Phys., 93, 135, 1996. [26] N. Fisher and H.C. Andersen, J. Phys. Chem., 100, 1137, 1996. [27] E. Pollak and J. Liao, J. Chem. Phys., 108, 2733, 1998. [28] S. Jang and G.A. Voth, J. Chem. Phys., 112, 8747, 2000. [29] J.L. Liao and E. Pollak, Chem. Phys., 268, 295, 2001. [30] P. Pechukas, in Dynamics of Molecular Collisions, Part 2 (Plenum Press, N.Y., 1976), p. 269. [31] J. Cao and G.A. Voth, J. Chem. Phys., 100, 5093, 1994. [32] J. Cao and G.A. Voth, J. Chem. Phys., 100, 5106, 1994. [33] J. Cao and G.A. Voth, J. Chem. Phys., 101, 6157, 1994. [34] J. Cao and G.A. Voth, J. Chem. Phys., 101, 6168, 1994. [35] J. Cao and G.A. Voth, J. Chem. Phys., 101, 6184, 1994. [36] S. Jang and G.A. Voth, J. Chem. Phys., 111, 2357, 1999. [37] S. Jang and G.A. Voth, J. Chem. Phys., 111, 2371, 1999. [38] Q. Shi and E. Geva, J. Chem. Phys., 118, 8173, 2003. [39] D.R. Reichman, P.-N. Roy, S. Jang, and G.A. Voth, J. Chem. Phys., 113, 919, 2000. [40] E. Geva, Q. Shi, and G.A. Voth, J. Chem. Phys., 115, 9209, 2001. [41] Q. Shi and E. Geva, J. Chem. Phys., 119, 9030, 2003. [42] A. Calhoun, M. Pavese, and G.A. Voth, Chem. Phys. Lett., 262, 415, 1996. [43] U.W. Schmitt and G.A. Voth, J. Chem. Phys., 111, 9361, 1999. [44] S. Jang, Y. Pak, and G.A. Voth, J. Phys. Chem. A, 103, 10289, 1999. [45] M. Pavese and G.A. Voth, Chem. Phys. Lett., 249, 231, 1996. [46] K. Kinugawa, P.B. Moore, and M.L. Klein, J. Chem. Phys., 106, 1154, 1997. [47] K. Kinugawa, P.B. Moore, and M.L. Klein, J. Chem. Phys., 109, 610, 1998.
[48] K. Kinugawa, Chem. Phys. Lett., 292, 454, 1998. [49] M. Pavese, D.R. Bernard, and G.A. Voth, Chem. Phys. Lett., 300, 93, 1999. [50] S. Miura, S. Okazaki, and K. Kinugawa, J. Chem. Phys., 110, 4523, 1999. [51] F.J. Bermejo, K. Kinugawa, C. Cabrillo, S.M. Bennington, B. Fak, M.T. Fernandez-Diaz, P. Verkerk, J. Dawidowski, and R. Fernandez-Perea, Phys. Rev. Lett., 84, 5359, 2000. [52] J. Poulsen and P.J. Rossky, J. Chem. Phys., 115, 8014, 2001. [53] J. Poulsen, S.R. Keiding, and P.J. Rossky, Chem. Phys. Lett., 336, 488, 2001. [54] J. Poulsen and P.J. Rossky, J. Chem. Phys., 115, 8024, 2001. [55] J.B. Anderson, Adv. Chem. Phys., 91, 381, 1995. [56] M. Messina, G.K. Schenter, and B.C. Garrett, J. Chem. Phys., 98, 8525, 1993. [57] M. Messina, G.K. Schenter, and B.C. Garrett, J. Chem. Phys., 99, 8644, 1993. [58] G.K. Schenter, M. Messina, and B.C. Garrett, J. Chem. Phys., 99, 1674, 1993. [59] E. Pollak, J. Chem. Phys., 103, 973, 1995. [60] N. Chakrabarti, T.C. Jr., and B. Roux, Chem. Phys. Lett., 293, 209, 1998. [61] R. Ramirez, J. Chem. Phys., 107, 3550, 1997. [62] J. Cao and G.A. Voth, J. Chem. Phys., 105, 6856, 1996. [63] I. Affleck, Phys. Rev. Lett., 46, 388, 1981. [64] S. Jang, C.D. Schwieters, and G.A. Voth, J. Phys. Chem. A, 103, 9527, 1999. [65] W.H. Miller, S.D. Schwartz, and J.W. Tromp, J. Chem. Phys., 79, 4889, 1983. [66] W.H. Miller, J. Phys. Chem. A, 102, 793, 1998. [67] I. Navrotskaya, Q. Shi, and E. Geva, Isr. J. Chem., 42, 225, 2002. [68] Q. Shi and E. Geva, J. Chem. Phys., 116, 3223, 2002. [69] Y.-C. Sun and G.A. Voth, J. Chem. Phys., 98, 7451, 1993. [70] S.W. Rick, D.L. Lynch, and J.D. Doll, J. Chem. Phys., 99, 8183, 1993. [71] T.R. Mattsson and G. Wahnström, Phys. Rev. B, 56, 14944, 1997. [72] M.J. Murphy, G.A. Voth, and A.L.R. Bug, J. Phys. Chem. B, 101, 491, 1997. [73] S. Jang and G.A. Voth, J. Chem. Phys., 108, 4098, 1998. [74] S. Jang, S. Jang, and G.A. Voth, J. Phys. Chem. A, 103, 9512, 1999. [75] U.W. Schmitt and G.A. Voth, Israeli J. Chem., 39, 483, 1999. [76] U.W. Schmitt and G.A. Voth, Chem. Phys. Lett., 329, 36, 2000. [77] J. Lobaugh and G.A. Voth, J. Chem. Phys., 100, 3039, 1994. [78] J. Lobaugh and G.A. Voth, J. Chem. Phys., 104, 2056, 1996. [79] I. Feierberg, V. Luzhkov, and J. Åqvist, J. Bio. Chem., 275, 22657, 2000. [80] R. Iftimie and J. Schofield, J. Chem. Phys., 115, 5891, 2001. [81] K. Hinsen and B. Roux, J. Chem. Phys., 106, 3567, 1997. [82] M.E. Tuckerman and D. Marx, Phys. Rev. Lett., 86, 4946, 2001. [83] M. Celli, D. Colognesi, and M. Zoppi, Phys. Rev. E, 66, 021202, 2002. [84] A. Nakayama and N. Makri, J. Chem. Phys., 119, 8592, 2003. [85] E. Rabani, D.R. Reichman, G. Krylov, and B.J. Berne, Proc. Natl. Acad. Sci. USA, 99, 1129, 2002. [86] D.R. Reichman and E. Rabani, J. Chem. Phys., 116, 6279, 2002. [87] E. Rabani and D.R. Reichman, J. Chem. Phys., 120, 2004. [88] E. Rabani and D.R. Reichman, Europhys. Lett., 60, 656, 2002. [89] K. Carneiro, M. Nielsen, and J.P. McTague, Phys. Rev. Lett., 30, 481, 1973. [90] M. Mukherjee, F.J. Bermejo, B. Fak, and S.M. Bennington, Europhys. Lett., 40, 153, 1997. [91] F.J. Bermejo, F.J. Mompean, M. Garcia-Hernandez, J.L. Martinez, D. Martin-Marero, A. Chahid, G. Senger, and M.L. Ristig, Phys. Rev. B, 47, 15097, 1993.
[92] F.J. Bermejo, B. Fak, S.M. Bennington, R. Fernandez-Perea, C. Cabrillo, J. Dawidowski, M.T. Fernandez-Diaz, and P. Verkerk, Phys. Rev. B, 60, 15154, 1999. [93] Y. Yonetani and K. Kinugawa, J. Chem. Phys., 119, 9651, 2003. [94] M. Mukherjee, F.J. Bermejo, S.M. Bennington, and B. Fak, Phys. Rev. B, 57, 11031, 1998. [95] T.D. Hone and G.A. Voth, J. Chem. Phys., (submitted). [96] D.W. Oxtoby, Adv. Chem. Phys., 47 (Part 2), 487, 1981. [97] D.W. Oxtoby, Annu. Rev. Phys. Chem., 32, 77, 1981. [98] D.W. Oxtoby, J. Phys. Chem., 87, 3028, 1983. [99] J. Chesnoy and G.M. Gale, Adv. Chem. Phys., 70 (Part 2), 297, 1988. [100] R.M. Stratt and M. Maroncelli, J. Phys. Chem., 100, 12981, 1996. [101] T. Elsaesser and W. Kaiser, Annu. Rev. Phys. Chem., 42, 83, 1991. [102] A. Laubereau and W. Kaiser, Rev. Mod. Phys., 50, 607, 1978. [103] P. Hamm, M. Lim, and R.M. Hochstrasser, J. Chem. Phys., 107, 1523, 1997. [104] Q. Shi and E. Geva, J. Chem. Phys., 118, 7562, 2003. [105] B.J. Berne, J. Jortner, and R. Gordon, J. Chem. Phys., 47, 1600, 1967. [106] J.S. Bader and B.J. Berne, J. Chem. Phys., 100, 8359, 1994. [107] S.A. Egorov, K.F. Everitt, and J.L. Skinner, J. Phys. Chem. A, 103, 9494, 1999. [108] S.A. Egorov and J.L. Skinner, J. Chem. Phys., 112, 275, 2000. [109] J.L. Skinner and K. Park, J. Phys. Chem. B, 105, 6716, 2001. [110] D. Rostkier-Edelstein, P. Graf, and A. Nitzan, J. Chem. Phys., 107, 10470, 1997. [111] D. Rostkier-Edelstein, P. Graf, and A. Nitzan, J. Chem. Phys., 108, 9598, 1998. [112] K.F. Everitt, J.L. Skinner, and B.M. Ladanyi, J. Chem. Phys., 116, 179, 2002. [113] P.H. Berens, S.R. White, and K.R. Wilson, J. Chem. Phys., 75, 515, 1981. [114] L. Frommhold, Collision-Induced Absorption in Gases, Vol. 2 of Cambridge Monographs on Atomic, Molecular, and Chemical Physics (Cambridge University Press, England, 1993), 1st ed. [115] J.L. Skinner, J. Chem. Phys., 107, 8717, 1997. [116] S.C. An, C.J. Montrose, and T.A. Litovitz, J. Chem. Phys., 64, 3717, 1976. [117] S.A. Egorov and J.L. Skinner, Chem. Phys. Lett., 293, 439, 1998. [118] P. Schofield, Phys. Rev. Lett., 4, 239, 1960. [119] P.A. Egelstaff, Adv. Phys., 11, 203, 1962. [120] G.R. Kneller, Mol. Phys., 83, 63, 1994. [121] Q. Shi and E. Geva, J. Phys. Chem. A, 107, 9059, 2003. [122] P.-N. Roy and G.A. Voth, J. Chem. Phys., 110, 3647, 1999. [123] P.-N. Roy, S. Jang, and G.A. Voth, J. Chem. Phys., 111, 5303, 1999. [124] N.V. Blinov, P.-N. Roy, and G.A. Voth, J. Chem. Phys., 115, 4484, 2001. [125] N.V. Blinov and P.-N. Roy, J. Chem. Phys., 115, 7822, 2001. [126] N.V. Blinov and P.-N. Roy, J. Chem. Phys., 116, 4808, 2002. [127] P.-N. Roy and N.V. Blinov, Isr. J. Chem., 42, 183, 2002. [128] P. Moffatt, N. Blinov, and P.-N. Roy, J. Chem. Phys., in press, 2004. [129] J.L. Liao and G.A. Voth, J. Phys. Chem. B, 106, 8449, 2002.
5.10 QUANTUM THEORY OF REACTIVE SCATTERING AND ADSORPTION AT SURFACES
Axel Groß
Physik-Department T30, TU München, 85747 Garching, Germany
1. Introduction
The interaction of atoms and molecules with surfaces is of great technological relevance [1]. Both advantageous and harmful processes can occur at surfaces: catalytic reactions at surfaces represent a desired process, while corrosion is an unwanted one. If light atoms and molecules such as hydrogen or helium interact with a surface, then quantum effects in the interaction dynamics between the incoming beam and the substrate have to be taken into account.

First of all, there are quantum effects in the energy transfer to the substrate vibrations, the phonons. While classically there will always be an energy loss of the incident particles to the substrate, quantum mechanically there is a certain probability for elastic scattering, i.e., without any energy transfer between the substrate and the scattered particles. This also has important consequences for the sticking probabilities of weakly bound species such as rare gases at low kinetic energies.

Furthermore, in elastic scattering at a periodic surface, the wave vector parallel to the surface can only be changed by reciprocal lattice vectors, because of the quasi-momentum conservation. If the de Broglie wavelength of the incident particles is of the order of the lattice spacing of the substrate, the angular distribution of the scattered particles exhibits a characteristic pattern of well-resolved reflection peaks. The resulting diffraction pattern depends only on the geometry of the surface. Therefore it has been used extensively as a tool to determine surface structures [2, 3].

I first address quantum effects in the sticking of weakly bound species, namely rare gas atoms, at surfaces. Depending on the mass of the rare gas atoms, the whole range between almost purely classical and almost purely
quantum behavior can be observed [4, 5]. The lighter the atom, the higher the probability for elastic scattering and therefore the lower the trapping probability. We also briefly mention quantum effects in the adsorption dynamics which, in fact, lead to a vanishing trapping probability in the limit of very low incident kinetic energies and surface temperatures [6–9].

As far as quantum effects in the dynamics of the scattered particles are concerned, I use the interaction of hydrogen with palladium surfaces as a model system. This system has been well studied both experimentally and theoretically. Initially, these studies were motivated, among other reasons, by the fact that bulk palladium can absorb huge amounts of hydrogen, which made it a possible candidate for a hydrogen storage device in the context of fuel cell technology. Besides, palladium is also used as a catalyst material for hydrogenation and dehydrogenation reactions. The strong corrugation and anisotropy of the H2/Pd(100) potential energy surface (PES) lead to significant off-specular and rotationally inelastic diffraction intensities [10]. These effects have been verified for related reactive systems [11, 12]. Furthermore, the diffraction intensities exhibit a pronounced oscillatory structure because of threshold effects and resonances in the scattering process. This structure is also visible in the quantum mechanically determined adsorption probability of H2/Pd(100) [10, 13, 14]; it has, however, not been found in experiments yet [15]. Further quantum effects in activated systems are due to the localization and quantization of the wave function in the barrier region [16, 17], which causes a steplike structure in the reaction probabilities.

This chapter is structured as follows. In Section 2, a general introduction into the phenomena occurring in quantum scattering at surfaces is given. Quantum effects in the trapping at surfaces and in diffraction are then addressed in Sections 3 and 4. The influence of quantum phenomena on the reaction dynamics at surfaces is discussed in Section 5. The chapter ends with some concluding remarks.
2. Quantum Scattering at Surfaces
A schematic summary of possible collision processes in the scattering of atoms and molecules at surfaces is presented in Fig. 1. We consider a monoenergetic beam of atoms or molecules impinging on a periodic surface; in the following, we refer to both atoms and molecules by just calling them molecules. A monoenergetic incident beam is characterized by the wave vector $\vec{K}_i = \vec{P}_i/\hbar$, where $\vec{P}_i$ is the initial momentum of the particles. When the incoming particles hit the surface, the substrate atoms will recoil. Therefore, classically there will always be a certain energy transfer from the molecules to the substrate. Quantum mechanically, however, there will be a certain probability for elastic scattering, i.e., with no energy transfer to the substrate. This probability is given by the so-called Debye–Waller factor.
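To fix orders of magnitude for the wave vector just introduced, the de Broglie wavelength $\lambda = 2\pi/|\vec{K}_i| = h/\sqrt{2ME}$ of typical beams can be estimated with a few lines of code. This sketch is added here for illustration and is not part of the original chapter; the beam energies and the lattice spacing (of the order of the Pd(100) surface lattice constant) are assumed, representative values.

```python
# Sketch: de Broglie wavelengths of typical thermal beams compared to a
# surface lattice spacing, to see when diffraction becomes observable.
import math

H = 6.62607015e-34        # Planck constant (J s)
EV = 1.602176634e-19      # 1 eV in J
AMU = 1.66053906660e-27   # atomic mass unit (kg)

def de_broglie(mass_amu, energy_ev):
    """Wavelength lambda = h / sqrt(2 M E), in meters."""
    return H / math.sqrt(2.0 * mass_amu * AMU * energy_ev * EV)

a = 2.75e-10  # assumed lattice spacing (m), of the order of Pd(100)
for label, mass, energy in [("He, 20 meV", 4.0, 0.020),
                            ("H2, 76 meV", 2.0, 0.076)]:
    lam = de_broglie(mass, energy)
    print(f"{label}: lambda = {lam*1e10:.2f} A, lambda/a = {lam/a:.2f}")
```

With these assumed numbers the wavelengths come out at roughly 1 Å, i.e., of the order of the lattice spacing, which is why diffraction is readily observable for such light, slow particles.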
Figure 1. Summary of the different collision processes in reactive scattering at surfaces: specular reflection (I00), diffraction (Imn), inelastic reflection, selective adsorption, and (dissociative) adsorption.
Furthermore, if the de Broglie wavelength $\lambda = 2\pi/|\vec{K}_i|$ of the incident beam is of the order of the lattice spacing a, quantum effects in the momentum transfer parallel to the surface become important. In the case of elastic scattering, the component of the wave vector parallel to the surface can only be changed by a reciprocal lattice vector of the periodic surface. This means that the wave vector $\vec{K}_f$ after the scattering is given by

$$\vec{K}_f = \vec{K}_i + \vec{G}_{mn}, \qquad (1)$$

where $\vec{G}_{mn}$ is a reciprocal lattice vector of the periodic surface. The conservation of the quasi-momentum parallel to the surface in elastic scattering leads to diffraction, which means that there is only a discrete number of allowed scattering angles. The intensity of the elastic diffraction peak mn according to Eq. (1) is denoted by $I_{mn}$. The scattering peak $I_{00}$ with $\vec{K}_f = \vec{K}_i$ is called the specular peak. From the diffraction pattern, the periodicity and lattice constant of the surface can be derived. The coherent scattering of atoms or molecules from surfaces has been known as a tool for probing surface structures since 1930 [18]. In particular, helium atom scattering (HAS) has been used intensively to study surface crystallography (see, e.g., [2] and references therein).

The main source for the energy transfer between the impinging molecules and the substrate is the excitation and deexcitation of substrate phonons. Phonons also carry momentum, so the conservation of quasi-momentum parallel to the surface then reads

$$\vec{K}_f = \vec{K}_i + \vec{G}_{mn} + \sum_{\rm exch.\ phon.} \pm\, \vec{Q}, \qquad (2)$$
where $\vec{Q}$ is a two-dimensional phonon momentum vector parallel to the surface. The plus signs in the sum correspond to the excitation or emission of a phonon, while the minus signs represent the deexcitation or absorption of a phonon. The energy balance in phonon-inelastic scattering can be expressed as

$$\frac{\hbar^2 \vec{K}_f^2}{2M} = \frac{\hbar^2 \vec{K}_i^2}{2M} + \sum_{\rm exch.\ phon.} \pm\, \hbar\omega_{\vec{Q},j}, \qquad (3)$$
where $\hbar\omega_{\vec{Q},j}$ corresponds to the energy of the phonon with momentum $\vec{Q}$ and mode index j. In fact, helium atom scattering has been used extensively to determine surface phonon spectra in one-phonon collisions via Eqs. (2) and (3) [2, 19]. The excitation of phonons usually leads to a reduced normal component of the kinetic energy of the back-scattered atoms or molecules. Thus the reflected beam is in general shifted to larger angles with respect to the surface normal compared to the angle of incidence. The resulting supraspecular scattering is indicated in Fig. 1 as the inelastic reflection event.

In the scattering of weakly interacting particles at smooth surfaces, resonances in the intensity of the specular peak as a function of the angle of incidence are often observed [20]. These so-called selective adsorption resonances, which are also indicated in Fig. 1, occur when the scattered particle can make a transition into one of the discrete bound states of the adsorption potential [21]. This is only possible if the motion of the particle is temporarily entirely parallel to the surface. The interference of different possible paths along the surface causes the resonance effects. Energy and momentum conservation yields the selective adsorption condition

$$\frac{\hbar^2 \vec{K}_i^2}{2M} = \frac{\hbar^2 (\vec{K}_i + \vec{G}_{mn})^2}{2M} - |E_l|, \qquad (4)$$
where $E_l$ is a bound level of the adsorption potential. From the scattering resonances, bound-state energies can be obtained using Eq. (4) without any detailed knowledge about the scattering process.

The coherent elastic scattering of molecules is more complex than atom scattering. Additional peaks may appear in the diffraction pattern. They are a consequence of the fact that, in addition to parallel momentum transfer, the internal degrees of freedom of the molecule, rotations and vibrations, can be excited during the collision process. The total energy balance in molecular scattering is

$$\frac{\hbar^2 \vec{K}_f^2}{2M} = \frac{\hbar^2 \vec{K}_i^2}{2M} + \Delta E_{\rm rot} + \Delta E_{\rm vib} + \sum_{\rm exch.\ phon.} \pm\, \hbar\omega_{\vec{Q},j}. \qquad (5)$$
Usually the excitation of molecular vibrations in molecule–surface scattering is negligible, in contrast to the phonon excitation. This is due to the fact that the period of the molecular vibrations is usually much shorter than the scattering time or the rotational period; the corresponding vibrational energies are too high for the vibrations to become excited in a typical scattering experiment. Molecular rotations, on the other hand, can be excited rather efficiently in the scattering at highly corrugated and anisotropic surfaces. Because of energy conservation, rotational excitation in scattering reduces the kinetic energy perpendicular to the surface. This leads to additional peaks in the diffraction spectrum, the rotationally inelastic diffraction peaks.

Experimentally, rotationally inelastic diffraction of hydrogen molecules was first observed in the scattering at inert ionic solids such as MgO [22] or NaF [23]. At metal surfaces with a high barrier for dissociative adsorption, the molecules are scattered at the tails of the metal electron density, which are usually rather smooth. In addition, the interaction of the molecules with these tails does not depend very strongly on the orientation of the molecules. Hence relatively weak diffraction and hardly any rotationally inelastic diffraction has been observed for, e.g., the scattering of H2 from Cu(001) [24, 25]. This is different for the case of HD scattering, where the displacement of the center of mass from the center of the charge distribution leads to a strong rotational anisotropy [26]. At reactive surfaces where non-activated adsorption is possible, the scattering occurs rather close to the surface, where the potential energy surface is already strongly corrugated and anisotropic. For such systems, rotationally inelastic peaks in the diffraction pattern have been clearly resolved experimentally [11, 12] and predicted theoretically in six-dimensional quantum dynamical calculations [10], as discussed below.

At reactive surfaces, the particles can of course also adsorb. As indicated in Fig. 1, molecules can adsorb either molecularly, i.e., intact, or dissociatively. In the case of atomic or molecular adsorption, the particles can only remain trapped at the surface if their initial kinetic energy is transferred to the surface and dissipated. For light projectiles, the quantum nature of the substrate phonons becomes important. This is the topic of Section 3.
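The kinematics of Eqs. (1)–(3) can be made concrete with a small sketch, added here for illustration (not part of the original chapter): for normal incidence on a square lattice, the diffraction channel (m, n) becomes energetically open once the total energy exceeds the parallel kinetic energy $\hbar^2 |\vec{G}_{mn}|^2/2M$ that the channel requires. The mass (H2) and the lattice spacing below are assumed, illustrative values of the order of H2/Pd(100).

```python
# Sketch: threshold energies at which diffraction channels (m, n) open for
# normal incidence on a square surface lattice, cf. Eqs. (1)-(3).
import math

HBAR = 1.054571817e-34    # J s
EV = 1.602176634e-19      # J
AMU = 1.66053906660e-27   # kg

M = 2.0 * AMU             # H2 mass
a = 2.75e-10              # assumed lattice spacing (m)
G = 2.0 * math.pi / a     # length of the reciprocal lattice unit vector

# A channel (m, n) is open once E exceeds hbar^2 |G_mn|^2 / 2M.
thresholds = {}
for m in range(3):
    for n in range(3):
        e_mev = (HBAR * G) ** 2 * (m * m + n * n) / (2.0 * M) / EV * 1e3
        thresholds.setdefault(round(e_mev, 1), []).append((m, n))

for e_mev in sorted(thresholds):
    print(f"E_threshold = {e_mev:5.1f} meV opens {thresholds[e_mev]}")
```

With these assumed values, the (10), (11), and (20) channels open near 5.5, 11, and 22 meV; energies of this order reappear below in the discussion of the dips in Fig. 5.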
3. Quantum Effects in the Trapping at Surfaces
Let us consider an atom impinging on a surface. Even in the absence of any chemical binding, there will always be an attractive interaction between the surface and the atom due to the van der Waals forces [27]. Let us further assume that there is no energetic barrier hindering access to the atomic adsorption well. A particle impinging on a surface can only become trapped
in an attractive adsorption well if it transfers its entire initial kinetic energy to the surface, because only then can it no longer escape back to the gas phase. Hence the sticking probability can be expressed as

$$S(E) = \int_E^{\infty} P_E(\epsilon)\, {\rm d}\epsilon, \qquad (6)$$
where $P_E(\epsilon)$ is the probability that an incoming particle with kinetic energy E transfers the energy $\epsilon$ to the surface. If the adsorption process is treated purely classically, then no matter how small the adsorption well and no matter how small the mass ratio between the impinging atom and the substrate oscillator, for E → 0 and Ts → 0 the sticking probability will always reach unity if there is no barrier before the adsorption well. This is due to the fact that every impinging particle will transfer energy to the substrate at zero temperature, and in the limit of zero initial kinetic energy any energy transfer will be sufficient to keep the particle in the adsorption well. Quantum mechanically, however, there is a non-zero probability for elastic scattering at the surface. Hence the sticking probabilities should become less than unity in the zero-energy limit, in particular for light atoms impinging on a surface. This has indeed been observed in the sticking of rare gas atoms at cold Ru(0001) surfaces [4, 5].

In order to reproduce elastic scattering, the quantum nature of the phonon system has to be taken into account. In the simplest approach, the substrate phonons can be modeled by an ensemble of independent quantum surface oscillators. Since the oscillators are assumed to be independent, the essential physics can be captured by just considering an atomic projectile interacting via linear coupling with a single surface oscillator. In the so-called trajectory approximation, the particle's motion is treated classically. Assuming that the motion of the atom is hardly influenced by the excitation of the surface oscillator, the equations of motion are solved without taking the coupling to the oscillator into account. The classical trajectory then introduces a time-dependent force in the Hamiltonian of the oscillator. In this forced oscillator model [28], the energy transfer probability $P_E(\epsilon)$ can be evaluated. In fact, a compact expression can be derived for the energy distribution in the scattering of an atom at a system of phonon oscillators with a Debye spectrum at a temperature Ts [29, 30]. Assuming some analytical form for the potential, this expression depends on a small set of parameters such as the potential well depth, the potential range, the mass of the surface oscillator, and the surface Debye temperature.

This model was used to reproduce the measured sticking probabilities of rare gas atoms on a Ru(0001) surface at a temperature of Ts = 6.5 K [5]. A comparison between the measured and calculated sticking probabilities for Ne, Ar, Kr, and Xe on Ru(0001) is shown in Fig. 2. The lighter the atoms, the
smaller the sticking probability. At small energies, the sticking probabilities do not reach unity, due to the quantum nature of the substrate phonons, except for the heaviest rare gas atom, Xe. Indeed, attempts to reproduce the measured sticking probabilities with purely classical methods have failed, at least for Ne and Ar [4, 5]. A classical treatment of the solid is only appropriate if the energy transfer to the surface is large compared to the Debye energy of the solid [6].

At even lower kinetic energies than reached in the experiments [5] shown in Fig. 2, the quantum nature of the impinging particles cannot be neglected any longer, and the trajectory approximation can no longer be applied. In fact, in the limit E → 0 the de Broglie wavelength of the incoming particle tends to infinity. In the case of a short-range attractive potential this means that the amplitude of the particle's wave function vanishes in the attractive region [6, 7]. Thus there is no coupling, and consequently no energy transfer, between the particle and the substrate vibrations. Therefore the quantum mechanical sticking probability also vanishes for E → 0. However, in order to see this effect, extremely small kinetic energies, corresponding to temperatures below 0.1 K, are required [6]. Nevertheless, this quantum phenomenon in the sticking at surfaces has been verified experimentally for the adsorption of atomic hydrogen on thick liquid ⁴He films [31].

There is yet another effect that also leads to zero sticking at very low energies. Quantum mechanically, particles can also be reflected at attractive
Figure 2. Sticking (trapping) probability of rare gas atoms (Ne, Ar, Kr, Xe) on Ru(0001) as a function of the kinetic energy, at a surface temperature of Ts = 6.5 K. Stars: experiment; lines: theoretical results obtained with the forced oscillator model (after [5]; not all measured data points are included).
parts of the potential. If the potential falls off asymptotically faster than $1/Z^2$, the reflection amplitude R exhibits the universal behavior [9, 32]

$$|R| \xrightarrow[\ k \to 0\ ]{} 1 - bk, \qquad (7)$$

where k is the wave number corresponding to the asymptotic kinetic energy $E = \hbar^2 k^2/2M$. This means that in the low-energy limit the reflection probability $|R|^2$ goes to unity even if the particle does not reach a classical turning point. In fact, such quantum reflection has been observed in the scattering of an ultracold beam of metastable neon atoms from silicon and glass surfaces [8]. In order to reproduce the measured reflectivities, a $1/Z^4$ dependence of the potential has to be assumed [8, 9], where Z is the distance to the surface. This indicates that the atoms are scattered at the long-range tail of the so-called Casimir–van der Waals potential.
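A minimal numerical sketch of the consequence of Eq. (7), added here for illustration (the mass and the length parameter b are assumed values): since $S = 1 - |R|^2 \approx 2bk$ for small k, the sticking probability vanishes like $\sqrt{E}$ in the zero-energy limit.

```python
# Sketch: low-energy sticking implied by quantum reflection, Eq. (7).
import math

HBAR = 1.054571817e-34
EV = 1.602176634e-19
AMU = 1.66053906660e-27

M = 20.0 * AMU   # e.g., a neon atom (assumed)
b = 1.0e-9       # assumed length parameter of the potential tail (m)

for e_ev in [1e-12, 1e-11, 1e-10, 1e-9]:
    k = math.sqrt(2.0 * M * e_ev * EV) / HBAR  # asymptotic wave number
    s = 1.0 - (1.0 - b * k) ** 2               # S = 1 - |R|^2, small-k regime
    print(f"E = {e_ev:.0e} eV: k = {k:.2e} 1/m, S = {s:.2e}")
```

Each factor of 100 in energy changes S by roughly a factor of 10, the square-root-of-energy signature of quantum reflection.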
4. Diffraction
In order to describe diffraction, the wave nature of the scattered molecules has to be taken into account by solving the appropriate Schrödinger equation. Either the time-dependent Schrödinger equation

$$i\hbar\, \frac{\partial}{\partial t}\, \Psi(\vec{R}, t) = H\, \Psi(\vec{R}, t) \qquad (8)$$

or the time-independent Schrödinger equation

$$H\, \Psi(\vec{R}) = E\, \Psi(\vec{R}) \qquad (9)$$
may be considered to treat the scattering process. The time-dependent Schrödinger equation is typically solved on a numerical grid using the wave-packet formalism [33–36]. In the time-independent formulation, the wave function is usually expanded in some suitable set of eigenfunctions, leading to so-called coupled-channel equations [27, 37]. One important prerequisite for the determination of scattering intensities is the knowledge of the interaction potential between the scattered particles and the surface. While about a decade ago most interaction potentials had to be guessed based on experimental information, it has now become possible to map out whole potential energy surfaces by ab initio total-energy calculations [37, 38], as illustrated in Fig. 3. This development has been driven by the progress in computer power and by the development of efficient electronic structure codes (see, e.g., Refs. [39–42]).
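To illustrate the wave-packet route to Eq. (8) in its simplest form, the sketch below propagates a one-dimensional Gaussian packet against a barrier with the split-operator method [cf. 34] and reads off the reflection probability. This example is added here for illustration only; the dimensionless units (ℏ = m = 1), the Gaussian barrier, and all parameter values are invented and are not meant to represent a real molecule–surface PES, which would be multidimensional.

```python
# Sketch: split-operator wave-packet propagation in 1D for Eq. (8).
import numpy as np

hbar, m = 1.0, 1.0
npts, L = 2048, 400.0
x = np.linspace(-L / 2, L / 2, npts, endpoint=False)
dx = x[1] - x[0]
k = 2.0 * np.pi * np.fft.fftfreq(npts, d=dx)   # grid of wave numbers

V = 0.1 * np.exp(-(x / 2.0) ** 2)              # Gaussian barrier, height 0.1

# Incoming Gaussian packet; mean energy k0^2/2 = 0.08 lies below the barrier
x0, k0, sigma = -60.0, 0.4, 10.0
psi = np.exp(-((x - x0) ** 2) / (4.0 * sigma ** 2) + 1j * k0 * x)
psi /= np.sqrt((np.abs(psi) ** 2).sum() * dx)  # normalize

dt, nsteps = 0.5, 900
expV = np.exp(-0.5j * V * dt / hbar)           # half step in the potential
expT = np.exp(-0.5j * hbar * dt * k ** 2 / m)  # full kinetic step in k-space
for _ in range(nsteps):
    psi = expV * np.fft.ifft(expT * np.fft.fft(expV * psi))

prob = (np.abs(psi) ** 2) * dx
print("reflection  R =", round(float(prob[x < 0.0].sum()), 4))
print("transmission T =", round(float(prob[x > 0.0].sum()), 4))
```

The same splitting generalizes to more dimensions, with the kinetic step applied in the multidimensional Fourier domain; the six-dimensional calculations discussed below instead follow the coupled-channel route.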
Figure 3. Contour plots of the PES along two two-dimensional cuts through the six-dimensional coordinate space of H2/Pd(100) (from [49]). The contour spacing is 0.1 eV per H2 molecule. The considered coordinates are indicated in the inset. The lateral position of the H2 molecule and its orientation are indicated above the contour plots.
hydrogen degrees of freedom; this can now be done routinely [13, 43–47]. In particular, hydrogen/palladium represents a system that is well suited both for experiments under ultra-high vacuum conditions and for a theoretical treatment in the framework of modern electronic structure methods. There is a wealth of microscopic information that is well established and double-checked through the fruitful combination of state-of-the-art experiments with ab initio total-energy calculations and related simulations.

The interaction potential of hydrogen with palladium surfaces has been determined in detail by several total-energy calculations [48–51] based on density-functional theory. Parametrizations of the ab initio potential energy surfaces have been used for quantum and classical molecular dynamics simulations of the scattering and adsorption of H2 on Pd(100) [10, 13, 14], Pd(111) [52–54] and Pd(110) [55]. In this contribution, I will particularly focus on H2/Pd(100). Two so-called elbow potentials of this system, determined by DFT calculations [49, 56], are shown in Fig. 3. The H2/Pd(100) PES, which is a six-dimensional hyperplane in the H2 degrees of freedom when the substrate atoms
are kept fixed, is usually analysed in terms of these elbow potentials. They correspond to two-dimensional cuts through the six-dimensional PES as a function of the molecular distance from the surface Z and the interatomic H–H distance r, for fixed lateral center-of-mass coordinates and molecular orientation.

Hydrogen molecules usually adsorb dissociatively on metal surfaces [57, 58]. As Fig. 3(a) indicates, H2 dissociates spontaneously at Pd(100), i.e., there are non-activated paths for dissociative adsorption. However, dissociative adsorption corresponds to a bond-making/bond-breaking process that depends sensitively on the local chemical environment. Consequently, the PES is strongly corrugated, which means that the interaction varies strongly as a function of the lateral coordinates of the molecule. This is illustrated in Fig. 3(b): if the molecule comes down over the on-top site, the shape of the elbow looks entirely different, and along this pathway the adsorption is no longer non-activated. We will see that the strong corrugation leads to significant intensities in the off-specular peaks. The PES does not only depend on the lateral position of the H2 molecule, i.e., the PES is not only corrugated, but it is also strongly anisotropic. Only molecules with their axis parallel to the surface can dissociate; for molecules approaching the Pd surface in an upright orientation the PES is purely repulsive [49]. Because of the anisotropy of the PES, in addition to the elastic diffraction peaks there will be large intensities in rotationally inelastic diffraction peaks, which correspond to rotational transitions in the collision process.

The six-dimensional ab initio PES of H2/Pd(100) has been parametrized using a suitable analytical form [14]. Using this fit, the six-dimensional quantum dynamics of H2 interacting with a fixed substrate has been determined [10] by solving the time-independent Schrödinger equation in a coupled-channel scheme using the concept of the so-called local reflection matrix (LORE) [59, 60]. This is a numerically very efficient and stable scheme that is based on a fine step-wise representation of the PES.

One typical calculated angular distribution of H2 molecules scattered at Pd(100) is shown in Fig. 4 [10]. The total initial kinetic energy is $E_i = 76$ meV. The incident parallel momentum equals 2G along the $[01\bar{1}]$ direction, which corresponds to an incident angle of $\theta_i = 32°$. The molecules are initially in the rotational ground state, $j_i = 0$. Figure 4(a) shows the so-called in-plane scattering distribution, i.e., the diffraction peaks in the plane spanned by the wave vector of the incident beam and the surface normal. The label (m, n) denotes the parallel momentum transfer $\vec{G}_{mn} = (mG, nG)$. The specular peak is the most pronounced one, but the first-order diffraction peak (10) is only a factor of four smaller. Note that in a typical helium atom scattering experiment the off-specular peaks are about two orders of magnitude smaller than the specular peak [2]. This is due to the fact that the chemically inert helium atoms are scattered at the smooth tails of the surface electron distribution.
Figure 4. Six-dimensional quantum results for the rotationally inelastic scattering of H2 on Pd(100) at a kinetic energy of 76 meV and an incidence angle of 32° along the [10] direction of the square surface lattice. Panel (a) shows the in-plane diffraction spectrum, with all peaks labeled according to the transition; panel (b) shows both in-plane and out-of-plane diffraction peaks, where open and filled circles correspond to rotationally elastic and rotationally inelastic scattering, respectively. The radius of each circle is proportional to the logarithm of the scattering intensity (after [10]).
In addition, rotationally inelastic diffraction peaks corresponding to the rotational excitation j = 0 → 2 are plotted in Fig. 4(a). They have been summed over all final azimuthal quantum numbers $m_j$. Note that the excitation probability of the so-called cartwheel rotation with $m_j = 0$ is, for all peaks, approximately one order of magnitude larger than that of the so-called helicopter rotation with $m_j = j$, since the polar anisotropy of the PES is stronger than the azimuthal one. The intensity of the rotationally inelastic diffraction peaks in Fig. 4 is comparable to that of the rotationally elastic ones. Except for the specular peak, they
are even larger than the corresponding rotationally elastic diffraction peak with the same momentum transfer (m, n). Because of the particular conditions, with the incident parallel momentum corresponding to the reciprocal lattice vector $\vec{G} = (2G, 0)$, the rotationally elastic and inelastic $(\bar{2}0)$ diffraction peaks fall upon each other.

The out-of-plane scattering intensities are not negligible, as demonstrated in Fig. 4(b). The open circles represent the rotationally elastic, the filled circles the rotationally inelastic diffraction peaks; the radii of the circles are proportional to the logarithm of the scattering intensity. The sum of all out-of-plane scattering intensities is approximately equal to the sum of all in-plane scattering intensities. Interestingly, some diffraction peaks with a large parallel momentum transfer still show substantial intensities. This phenomenon is well known from helium atom scattering and has been discussed within the concept of so-called rainbow scattering [61].

The intensities of the scattering peaks for normal incidence are analysed in detail in Fig. 5, where the intensities of four diffraction peaks are plotted as a function of the kinetic energy for rotationally elastic (Fig. 5(a)) and rotationally inelastic (Fig. 5(b)) scattering. In molecular beam experiments, the beams are not monoenergetic but have a certain velocity spread. In order to allow a better comparison with experiment, an initial velocity spread of Δv/v = 0.05, typical for such experiments [12], has been applied to the results of the quantum dynamical simulations. The theoretical results still exhibit a rather strong oscillatory structure, which is a consequence of the quantum nature of H2 scattering.

Let us first focus on the specular peak. An analysis of the energetic position of the oscillations reveals that they occur whenever new diffraction channels open up. This process is illustrated in Fig. 6. For a particular kinetic energy, there is a discrete number of diffraction peaks; the final wave vectors $\vec{K}_f$ differ by multiples of the reciprocal lattice unit vectors $\vec{G}$. At certain threshold energies $E_{\rm threshold}$, the energy becomes large enough that additional diffraction channels open up. At exactly $E = E_{\rm threshold}$, the new channel corresponds to a wave that propagates parallel to the surface. Thus the oscillations in the scattering intensities are a consequence of the fact that at the threshold energies the number of diffraction peaks changes discontinuously. In detail, the first pronounced dip in the intensity of the specular peak, at $E_i = 12$ meV, coincides with the emergence of the (11) diffraction peak, and the small dip at $E_i = 22$ meV with the opening up of the (20) diffraction channel. The huge dip at approximately $E_i = 50$ meV reflects the threshold for rotationally inelastic scattering. Interestingly, the rotationally elastic (10) and (11) diffraction peaks show pronounced maxima at this energy. This indicates a strong coupling between parallel motion and rotational excitation.

Figure 5(b) shows the intensities of the rotationally inelastic diffraction peaks. The specular peak is still the largest; however, some off-specular peaks
Figure 5. Calculated scattering intensity versus total incident kinetic energy Ei for H2 molecules in the rotational ground state impinging under normal incidence on Pd(100), with an initial velocity spread of Δv/v = 0.05: (a) rotationally elastic (j = 0 → 0) and (b) rotationally inelastic (j = 0 → 2) intensities of the (00), (10), (20), and (11) diffraction peaks (after [10]).
Figure 6. Schematic illustration of the opening up of new scattering channels for normal incidence as a function of the energy.
become larger than the (00) peak at higher energies. This is due to the fact that the rotationally anisotropic component of the potential is more corrugated than the isotropic component [49]. Besides, it is apparent that the oscillatory structure for rotationally inelastic scattering is somewhat less pronounced than for rotationally elastic scattering.

Not all peaks in the scattering amplitudes can be unambiguously attributed to the emergence of new scattering channels. As already mentioned in Section 2, additional structures in the scattering intensities can also be caused by selective adsorption resonances: molecules become temporarily trapped in metastable molecular adsorption states at the surface due to the transfer of normal momentum into parallel and angular momentum, which resonantly enhances the scattering intensities. Such resonances have been clearly resolved, e.g., in the physisorption of H2 on Cu [62]. For the strongly corrugated and anisotropic H2/Pd(100) system it is difficult to identify the nature of possible scattering resonances from the quantum calculations. Classically, one observes dynamic trapping in the H2/Pd interaction dynamics [52, 53, 63], which is the equivalent of the selective adsorption resonances: impinging molecules neither scatter directly nor dissociate immediately, but transfer energy from the translation perpendicular to the surface into internal degrees of freedom and motion parallel to the surface. In this transient state, they can spend several picoseconds at the surface. Although most of the dynamically trapped molecules eventually dissociate, this process still influences the reflection probabilities. Oscillatory structures have been known for years in He and H2 scattering [20] and also in low-energy electron diffraction (LEED) [64].

For reactive systems such as H2/Pd(100), the experimental observation of diffraction is not trivial. Because of the reactivity, an adsorbate layer builds up very rapidly during the experiment. These layers destroy the perfect periodicity of the surface and thus suppress diffraction effects. In order to keep the surface relatively clean, one has to use rather high surface temperatures so that adsorbates quickly desorb again. High surface temperatures, on the other hand, also smear out the diffraction pattern. Still, experimentalists have managed to clearly resolve rotationally inelastic peaks in the diffraction patterns of D2/Ni(110) [12] and D2/Rh(110) [11], in addition to rotationally elastic peaks.
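As a concrete illustration of how rotational excitation shifts the diffraction peaks (an example added here, not part of the original chapter), the following sketch evaluates the in-plane peak angles implied by parallel momentum conservation and by Eq. (5) with phonons neglected, for normal incidence. The H2 rotational constant B and the lattice spacing are assumed values; with B ≈ 7.5 meV the j = 0 → 2 excitation energy 6B ≈ 45 meV lies close to the ~50 meV threshold seen in Fig. 5.

```python
# Sketch: angles of rotationally elastic and inelastic in-plane diffraction
# peaks for H2 at normal incidence, from energy and momentum conservation.
import math

HBAR = 1.054571817e-34
EV = 1.602176634e-19
AMU = 1.66053906660e-27

M = 2.0 * AMU             # H2 mass
a = 2.75e-10              # assumed lattice spacing (m)
G = 2.0 * math.pi / a
B = 7.5e-3                # assumed H2 rotational constant (eV)
E_i = 0.076               # total incident kinetic energy (eV)

for de, label in [(0.0, "j = 0 -> 0"), (6.0 * B, "j = 0 -> 2")]:
    e_f = E_i - de        # translational energy left after the collision
    if e_f <= 0.0:
        continue          # channel closed below the rotational threshold
    k_f = math.sqrt(2.0 * M * e_f * EV) / HBAR
    for mm in range(5):   # in-plane peaks (m, 0); K_i,parallel = 0 here
        if mm * G <= k_f:
            theta = math.degrees(math.asin(mm * G / k_f))
            print(f"{label}, peak ({mm}0): theta_f = {theta:5.1f} deg")
```

The inelastic peaks emerge at larger angles than their elastic counterparts because the rotational excitation reduces the translational energy and hence the length of the final wave vector.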
5. Quantum Effects in Reaction Probabilities
While elastic scattering and diffraction are purely quantum phenomena that cannot be understood and reproduced within classical physics, reaction probabilities can be calculated by both classical and quantum molecular dynamics calculations. In a multidimensional situation, classical reaction probabilities are obtained by averaging over molecular dynamics simulations for statistically distributed initial conditions. For example, to determine the probability
for dissociative adsorption, trajectories with different initial lateral positions within the surface unit cell and different molecular orientations have to be run. A quantum wave function, on the other hand, is always delocalized to a certain degree. One could say that quantum reaction probabilities correspond to a coherent average over initial conditions, while classically the average is done incoherently. This coherent averaging causes quantum effects, for example in the dissociative adsorption probability, that will be discussed in this section.

We continue to focus on the system H2/Pd(100). In the determination of the diffraction pattern, we had neglected the substrate motion. This approximation is indeed justified in the description of the interaction of hydrogen with densely packed metal surfaces. There is only a small energy transfer from the light hydrogen molecule to the heavy substrate atoms. Furthermore, usually no significant surface rearrangement occurs during the interaction time. Even in the description of the dissociative adsorption of H2 on metal surfaces, in contrast to molecular adsorption, the substrate motion can be safely neglected [36, 37, 58]. The crucial process in the dissociative adsorption dynamics is the bond-breaking process, i.e., the conversion of translational and internal energy of the hydrogen molecule into translational energy of the atomic fragments on the surface relative to each other. The fragments will of course eventually thermalize at the surface by transferring their excess energy to the substrate, but this only occurs after the dissociation step. Thus the dissociation dynamics can be described by a six-dimensional PES which takes only the molecular degrees of freedom into account.

In this framework, the dissociation probability can be regarded as a quantum transmission probability from the entrance channel of the impinging molecule to the dissociation channel at the surface. Because of the conservation of particle flux, the adsorption probability for some particular initial state i can be evaluated by

$$S_i = 1 - \sum_j |R_{ji}|^2, \qquad (10)$$
where the $R_{ji}$ are the amplitudes of all final scattering states j. The calculated dissociative adsorption probability of H2/Pd(100) as a function of the kinetic energy is shown in Fig. 7 and compared to the results of molecular beam experiments [15, 65]; the inset shows more recent results using an improved ab initio potential energy surface [45]. Because of the unitarity relation, Eq. (10), scattering and adsorption probabilities are closely linked to each other. Indeed, the adsorption probability also exhibits a pronounced oscillatory structure at exactly the same kinetic energies as the scattering intensities. This means that this structure is also due to threshold effects caused by the emergence of new scattering channels. In addition, resonance phenomena contribute to the oscillatory structure. However, if one again assumes a velocity spread of the incident molecules typical
Figure 7. Sticking probability of H2/Pd(100) as a function of the initial kinetic energy. Circles: H2 molecular beam adsorption experiment under normal incidence (Rendulic et al. [65]); dash–dotted line: H2 effusive beam scattering experiment with an incident angle of θi = 15° (Rettner and Auerbach [15]); dashed and solid lines: theory for H2 initially in the ground state and with a thermal distribution appropriate for a molecular beam, respectively [13]. The inset shows the theoretical results using an improved ab initio potential energy surface [45].
for molecular beam experiments [65], the calculated sticking probability becomes rather smooth (solid line in Fig. 7). This means that the quantum effects in the dissociative adsorption probability are hardly visible. The predicted quantum oscillations have been searched for experimentally in an effusive beam experiment at an angle of incidence of 15° [15, 66], but no oscillations could be detected. As already pointed out, surface imperfections at a reactive substrate, such as adatoms or steps, reduce the coherence of the scattering process and thus smooth out the oscillatory structure [10, 67]. But more importantly, the angle of incidence also has a decisive influence on the symmetry and the scattering intensities [14]. The calculations were done for normal incidence, while the experiment was done for non-normal incidence [15, 66]. At non-normal incidence, the number of symmetrically equivalent diffraction channels is reduced compared to normal incidence. This makes the effect of the opening up of new diffraction channels less dramatic and thus also smooths the adsorption probabilities.

Experiment [65] and theory agree well as far as the qualitative trend of the adsorption probability as a function of the kinetic energy is concerned: first there is an initial decrease, and after a minimum the sticking probability rises again. The initial decrease of the sticking probability is typical for H2 adsorption at transition metal surfaces [15, 65, 68–70]. In these systems, the PES shows purely attractive paths towards dissociative adsorption, but the majority of reaction paths for different molecular orientations and impact points exhibit energetic barriers hindering the dissociation. However, at low
kinetic energies most impinging molecules are steered towards the attractive dissociation channels, leading to a high adsorption probability. This steering effect [13, 71, 72] is suppressed at higher kinetic energies, causing the decrease in the adsorption probability.

While diffraction is a consequence of the periodicity of the surface, there are also more local quantum effects occurring within the surface unit cell, in particular if the wave function has to propagate through a narrow transition state. The consequences of such a situation will be illustrated using simple low-dimensional model calculations [17]. In Fig. 8(a), an idealized two-dimensional potential energy surface for activated adsorption is plotted as a function of one lateral coordinate and a reaction path coordinate. This PES has features
Figure 8. Activated dissociation of molecular hydrogen at a two-dimensional corrugated surface. (a) Potential energy surface; (b) sticking probability versus kinetic energy for a hydrogen beam under normal incidence. Full line: classical sticking probability, which as a function of the kinetic energy is independent of the mass [14]; dashed line: H2 quantum sticking probability; long-dashed line: D2 quantum sticking probability [17].
appropriate for, e.g., the hydrogen dissociation at the (2×2) sulfur-covered Pd(100) surface [48, 73]: the minimum barrier has a height of 0.09 eV, the adsorption energy is E_ad = 1 eV, and the square surface unit cell has a lattice constant of a = 5.5 Å. The calculated dissociation probability at such a surface is plotted in Fig. 8(b), where the classical sticking probability is compared to the quantum sticking probabilities of the hydrogen isotopes H2 and D2. Please note that there is no isotope effect in the dissociation probability as a function of the kinetic energy for hydrogen moving classically on a PES, as long as there are no energy transfer processes to, e.g., substrate phonons [14]. This is caused by the fact that at the same kinetic energy H2 and D2 follow exactly the same trajectories in space.

The quantum results show a very regular oscillatory structure as a function of the kinetic energy. This is not due to any resonance phenomenon but rather to the existence of quantized states at the transition state [16, 74]. At the minimum barrier position, the wave function has to pass through a narrow valley of the corrugated PES. This leads to a localization of the wave function and thereby to a quantization of the allowed states that can pass through this valley. In the harmonic approximation, the energy levels correspond to harmonic oscillator eigenstates, which are equidistant in energy. Their spacing ℏω is determined by the curvature of the PES in the coordinates perpendicular to the reaction path. For H2 passing through the transition state shown in Fig. 8(a), the curvature of the potential perpendicular to the reaction path corresponds to an energy of ℏω = 104 meV. And indeed, the oscillations in the H2 sticking probability exhibit a period of about 100 meV. The level spacing of the quantized states depends on the mass of the traversing particles: for D2, the energetic separation between the quantized states at the transition state is smaller by a factor of 1/√2 compared to H2. This is indeed reflected in Fig. 8(b) by the smaller period of the oscillations in the D2 quantum sticking probability.

The existence of quantized states is closely related to the zero-point energies. Because of the Heisenberg uncertainty principle, there is a minimum energy required for any localized quantum state, namely the zero-point energy; for a harmonic oscillator it is given by ℏω/2. It leads to an effectively higher minimum barrier for the quantum propagation through a transition state region. Consequently, the onset of sticking occurs at higher energies in the quantum calculation than in the classical calculations (see Fig. 8(b)). However, this onset is not shifted by ℏω/2, but by a smaller amount. This is caused by another quantum phenomenon, tunneling: quantum mechanically, particles can also traverse a barrier region at energies below the minimum barrier height. This promoting effect partially counterbalances the hindering effect of the zero-point energies.
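A minimal numerical sketch of this picture, added here for illustration, using the model values quoted above and the harmonic-level assumption $E_n = V_{\rm b} + (n + 1/2)\hbar\omega$ for the effective thresholds, with the D2 frequency scaled by $1/\sqrt{2}$:

```python
# Sketch: effective threshold energies from quantized transition states,
# neglecting the tunneling shift discussed in the text.
import math

V_b = 0.09     # minimum barrier height (eV), model value from the text
hw_H2 = 0.104  # hbar*omega perpendicular to the reaction path (eV)

for label, hw in [("H2", hw_H2), ("D2", hw_H2 / math.sqrt(2.0))]:
    levels = [V_b + (n + 0.5) * hw for n in range(4)]
    print(label, "step energies (eV):", ["%.3f" % e for e in levels])
```

The H2 steps come out spaced by 104 meV and the D2 steps by about 74 meV, consistent with the shorter oscillation period of the heavier isotope in Fig. 8(b).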
Figure 8(b) demonstrates that the quantum sticking probabilities oscillate around the classical result, which means that tunneling and quantization effects almost cancel each other on average. In addition, if more degrees of freedom are considered, there will be further quantization effects, and the combined effect will be a smoothing of the oscillatory structure. Indeed, in six-dimensional quantum calculations of the dissociative adsorption of H2 on Cu(100) [43], hardly any steplike structure is visible in the adsorption probability. Therefore it is very hard to detect these quantum effects in molecular beam experiments, because of the limited energy resolution of the beams and the unavoidable presence of surface imperfections.
6. Conclusions
In this review, we have presented an overview of the quantum effects in the interaction dynamics of atoms and molecules with surfaces. They are of particular importance for light atoms and molecules such as helium or hydrogen. The quantum nature of the substrate phonons leads to the phenomenon of elastic scattering at surfaces, and hence to trapping probabilities that are less than one in the non-activated sticking of weakly bound species at surfaces. Another quantum effect, namely diffraction, is a consequence of the periodicity of surfaces together with elastic scattering. It occurs when the de Broglie wavelength of the incident beam is of the order of the lattice spacing of the substrate and can be used as a tool to determine surface structures. The opening up of new scattering channels leads to an oscillatory structure in the intensities of the diffraction peaks and in the dissociative adsorption probabilities of H2 at reactive surfaces. Furthermore, there are quantum effects due to the existence of quantized states at the transition states of the multidimensional potential energy surface. However, all these additional quantum effects are suppressed by substrate imperfections and surface temperature effects. Hence they can hardly be resolved in experiments.
References
[1] C.B. Duke and E.W. Plummer (eds.), Frontiers in Surface and Interface Science, North-Holland, 2002.
[2] E. Hulpke (ed.), Helium Atom Scattering from Surfaces, volume 27 of Springer Series in Surface Sciences, Springer, Berlin, 1992.
[3] D. Farías and K.-H. Rieder, Rep. Prog. Phys., 61, 1575, 1998.
[4] H. Schlichting, D. Menzel, T. Brunner, W. Brenig, and J.C. Tully, Phys. Rev. Lett., 60, 2515, 1988.
[5] H. Schlichting, D. Menzel, T. Brunner, and W. Brenig, J. Chem. Phys., 97, 4453, 1992.
[6] R. Sedlmeir and W. Brenig, Z. Phys. B, 36, 245, 1980.
[7] W. Brenig, Z. Phys. B, 36, 227, 1980.
[8] F. Shimizu, Phys. Rev. Lett., 86, 987, 2001.
[9] H. Friedrich, G. Jacoby, and C.G. Meister, Phys. Rev. A, 65, 032902, 2002.
[10] A. Groß and M. Scheffler, Chem. Phys. Lett., 263, 567, 1996.
[11] D. Cvetko, A. Morgante, A. Santaniello, and F. Tommasini, J. Chem. Phys., 104, 7778, 1996.
[12] M.F. Bertino, F. Hofmann, and J.P. Toennies, J. Chem. Phys., 106, 4327, 1997.
[13] A. Groß, S. Wilke, and M. Scheffler, Phys. Rev. Lett., 75, 2718, 1995.
[14] A. Groß and M. Scheffler, Phys. Rev. B, 57, 2493, 1998.
[15] C.T. Rettner and D.J. Auerbach, Chem. Phys. Lett., 253, 236, 1996.
[16] A.D. Kinnersley, G.R. Darling, S. Holloway, and B. Hammer, Surf. Sci., 364, 219, 1996.
[17] A. Groß, J. Chem. Phys., 110, 8696, 1999.
[18] I. Estermann and O. Stern, Z. Phys., 61, 95, 1930.
[19] B. Gumhalter, Phys. Rep., 351, 1, 2001.
[20] R. Frisch and O. Stern, Z. Phys., 84, 430, 1933.
[21] M. Patting, D. Farías, and K.-H. Rieder, Surf. Sci., 429, L503, 1999.
[22] R.G. Rowe and G. Ehrlich, J. Chem. Phys., 4648, 1975.
[23] G. Brusdeylins and J.P. Toennies, Surf. Sci., 126, 647, 1983.
[24] J. Lapujoulade, Y. Lecruer, M. Lefort, Y. Lejay, and E. Maurel, Surf. Sci., 103, L85, 1981.
[25] M.F. Bertino and D. Farías, J. Phys.: Condens. Matter, 14, 6037, 2002.
[26] K.B. Whaley, C.-F. Yu, C.S. Hogg, J.C. Light, and S. Sibener, J. Chem. Phys., 83, 4235, 1985.
[27] A. Groß, Theoretical Surface Science – A Microscopic Perspective, Springer, Berlin, 2002.
[28] R.W. Fuller, S.M. Harris, and E.L. Slaggie, Am. J. Phys., 31, 431, 1963.
[29] W. Brenig, Z. Phys. B, 36, 81, 1979.
[30] J. Böheim and W. Brenig, Z. Phys. B, 41, 243, 1981.
[31] I.A. Yu, J.M. Doyle, J.C. Sandberg, C.L. Cesar, D. Kleppner, and J.T. Greytak, Phys. Rev. Lett., 71, 1589, 1993.
[32] C. Eltschka, H. Friedrich, and M.J. Moritz, Phys. Rev. Lett., 86, 2693, 2001.
[33] R. Newton, Scattering Theory of Waves and Particles, 2nd edn., Springer, New York, 1982.
[34] J.A. Fleck, J.R. Morris, and M.D. Feit, Appl. Phys., 10, 129, 1976.
[35] H. Tal-Ezer and R. Kosloff, J. Chem. Phys., 81, 3967, 1984.
[36] G.-J. Kroes, Prog. Surf. Sci., 60, 1, 1999.
[37] A. Groß, Surf. Sci. Rep., 32, 291, 1998.
[38] A. Groß, Surf. Sci., 500, 347, 2002.
[39] G. Kresse and J. Furthmüller, Phys. Rev. B, 54, 11169, 1996.
[40] G. Kresse and D. Joubert, Phys. Rev. B, 59, 1758, 1999.
[41] B. Hammer, L.B. Hansen, and J.K. Nørskov, Phys. Rev. B, 59, 7413, 1999.
[42] B. Kohler, S. Wilke, M. Scheffler, R. Kouba, and C. Ambrosch-Draxl, Comput. Phys. Commun., 94, 31, 1996.
[43] G.-J. Kroes, E.J. Baerends, and R.C. Mowrey, Phys. Rev. Lett., 78, 3583, 1997.
[44] A. Groß, C.-M. Wei, and M. Scheffler, Surf. Sci., 416, L1095, 1998.
[45] A. Eichler, J. Hafner, A. Groß, and M. Scheffler, Phys. Rev. B, 59, 13297, 1999.
[46] Y. Miura, H. Kasai, and W. Diño, J. Phys.: Condens. Matter, 14, L479, 2002.
[47] W. Brenig and M.F. Hilf, J. Phys.: Condens. Matter, 13, R61, 2001.
[48] S. Wilke, D. Hennig, R. Löber, M. Methfessel, and M. Scheffler, Surf. Sci., 307, 76, 1994.
[49] S. Wilke and M. Scheffler, Phys. Rev. B, 53, 4926, 1996.
[50] W. Dong and J. Hafner, Phys. Rev. B, 56, 15396, 1997.
[51] V. Ledentu, W. Dong, and P. Sautet, Surf. Sci., 412, 518, 1998.
[52] H.F. Busnengo, W. Dong, and A. Salin, Chem. Phys. Lett., 320, 328, 2000.
[53] C. Crespos, H.F. Busnengo, W. Dong, and A. Salin, J. Chem. Phys., 114, 10954, 2001.
[54] H.F. Busnengo, E. Pijper, M.F. Somers, G.J. Kroes, A. Salin, R.A. Olsen, D. Lemoine, and W. Dong, Chem. Phys. Lett., 356, 515, 2002.
[55] M.A. Di Cesare, H.F. Busnengo, W. Dong, and A. Salin, J. Chem. Phys., 118, 11226, 2003.
[56] S. Wilke and M. Scheffler, Surf. Sci., 329, L605, 1995.
[57] K. Christmann, Surf. Sci. Rep., 9, 1, 1988.
[58] G.R. Darling and S. Holloway, Rep. Prog. Phys., 58, 1595, 1995.
[59] W. Brenig, T. Brunner, A. Groß, and R. Russ, Z. Phys. B, 93, 91, 1993.
[60] W. Brenig and R. Russ, Surf. Sci., 315, 195, 1994.
[61] U. Garibaldi, A.C. Levi, R. Spadacini, and G.E. Tommei, Surf. Sci., 48, 649, 1975.
[62] S. Andersson, L. Wilzen, M. Persson, and J. Harris, Phys. Rev. B, 40, 8146, 1989.
[63] A. Groß and M. Scheffler, J. Vac. Sci. Technol. A, 15, 1624, 1997.
[64] E.G. McRae, Rev. Mod. Phys., 51, 541, 1979.
[65] K.D. Rendulic, G. Anger, and A. Winkler, Surf. Sci., 208, 404, 1989.
[66] C.T. Rettner and D.J. Auerbach, Phys. Rev. Lett., 77, 404, 1996.
[67] A. Groß and M. Scheffler, Phys. Rev. Lett., 77, 405, 1996.
[68] K.D. Rendulic and A. Winkler, Surf. Sci., 299/300, 261, 1994.
[69] M. Beutl, M. Riedler, and K.D. Rendulic, Chem. Phys. Lett., 256, 33, 1996.
[70] M. Gostein and G.O. Sitz, J. Chem. Phys., 106, 7378, 1997.
[71] D.A. King, CRC Crit. Rev. Solid State Mater. Sci., 7, 167, 1978.
[72] M. Kay, G.R. Darling, S. Holloway, J.A. White, and D.M. Bird, Chem. Phys. Lett., 245, 311, 1995.
[73] C.M. Wei, A. Groß, and M. Scheffler, Phys. Rev. B, 57, 15572, 1998.
[74] D.C. Chatfield, R.S. Friedman, and D.G. Truhlar, Faraday Discuss., 91, 289, 1991.
5.11 STOCHASTIC CHEMICAL KINETICS

Daniel T. Gillespie
Dan T Gillespie Consulting, 30504 Cordoba Place, Castaic, CA 91384
The time evolution of a well-stirred chemically reacting system is traditionally described by a set of coupled, first-order, ordinary differential equations. Obtained through heuristic, phenomenological reasoning, these equations characterize the evolution of the molecular populations as a continuous, deterministic process. But a little reflection reveals that the system actually possesses neither of those attributes: molecular populations are whole numbers, and when they change they always do so by discrete, integer amounts. Furthermore, in excusing ourselves from the arduous task of tracking the positions and velocities of all the molecules in the system, which we hope to justify on the grounds that the system is "well-stirred", we preclude a deterministic description of the system's evolution, because a knowledge of the system's current molecular populations is not by itself sufficient to predict with certainty the future molecular populations. Just as rolled dice are essentially random or "stochastic" when we do not precisely track their positions and velocities and all the forces acting on them, so is the time evolution of a well-stirred chemically reacting system for all practical purposes stochastic.

That said, discreteness and stochasticity are usually not noticeable in chemical systems of "test-tube" size or larger, and for most such systems the traditional continuous, deterministic description seems to be adequate. But if the molecular populations of some reactant species are very small, as is often the case for instance in cellular systems in biology, discreteness and stochasticity can sometimes play an important role. Whenever that happens, the ordinary differential equations approach will not be able to accurately describe the true behavior of the system.

Stochastic chemical kinetics attempts to describe the time evolution of a well-stirred chemically reacting system as an overtly discrete, stochastic process, evolving in real (continuous) time. And it tries to do this in a way that accurately reflects how chemical reactions physically occur at the molecular level. This article will outline the theoretical foundations of stochastic chemical kinetics, and then derive and interrelate its principal equations and
computational methods. It will also show how it happens that the resulting discrete, stochastic description usually gives way to the traditional continuous, deterministic description in a special limiting approximation.
1. Microphysical Foundations of Stochastic Chemical Kinetics
We consider a well-stirred system of molecules of N chemical species {S1, . . . , SN}, which interact through M chemical reaction channels {R1, . . . , RM}. We assume the system to be confined to a constant volume Ω, and to be in thermal (but not necessarily chemical) equilibrium at some constant absolute temperature T. We let Xi(t) denote the number of molecules of species Si in the system at time t. Our goal is to estimate, as best we can, the state vector X(t) = (X1(t), . . . , XN(t)), given that the system was in state X(t0) = x0 at some initial time t0 < t.¹

Each reaction channel Rj is assumed to be "elemental" in the sense that it describes a distinct physical event which happens essentially instantaneously. This assumption restricts us to two general types of reaction: unimolecular reactions of the form Si → product(s), and bimolecular reactions of the form Si + Si′ → product(s), where i′ may or may not be the same as i.²

A given reaction channel Rj is characterized mathematically by two quantities. The first is its state-change vector νj = (ν1j, . . . , νNj), where νij is defined to be the change in the Si molecular population caused by one Rj reaction event; thus, if the system is in state x and an Rj reaction occurs, the system immediately jumps to state x + νj. The two-dimensional array νij is commonly known as the stoichiometric matrix. Its elements are practically always confined to the values 0, ±1, and ±2. The other defining quantity for reaction channel Rj is its propensity function aj. It is defined as follows:
aj(x) dt = the probability, given X(t) = x, that one Rj reaction will occur somewhere inside Ω in the next infinitesimal time interval [t, t + dt).    (1)
¹ Boldface variables will always be understood here to be N-component vectors, with the components corresponding to the N chemical species in the system.
² A set of three elemental reactions of the form S1 + S2 ⇄ S4 and S3 + S4 → S5 can often be regarded as the single trimolecular reaction S1 + S2 + S3 → S5 if the first two reactions are much faster than the third. But this is always an approximation.
This definition might be said to be the fundamental premise of stochastic chemical kinetics, because everything else follows from it. It is important to recognize that this probabilistic definition has a solid basis in physical theory, more solid in fact than the reasoning that is traditionally used to justify the deterministic differential equations mentioned earlier. Since the microphysical basis of Eq. (1) ultimately determines the forms of the propensity functions, it is appropriate to describe it briefly here.

If Rj is the unimolecular reaction S1 → product(s), the underlying physics, which might be quantum mechanical, generally dictates the existence of some constant, which we shall call cj, such that cj dt gives the probability that any particular S1 molecule will so react in the next infinitesimal time dt. If there is currently a finite number x1 of S1 molecules in the system, we can take dt to be so small that no more than one of them will undergo that reaction in the next dt. This allows us to invoke the addition law of probability theory for mutually exclusive events, and so calculate the probability for any S1 molecule in the system to undergo the Rj reaction by simply summing the individual reaction probabilities. That sum gives x1 × cj dt, from which we may conclude that the propensity function in Eq. (1) is aj(x) = cj x1.

If Rj is a bimolecular reaction of the form S1 + S2 → product(s), stochasticity manifests itself in two ways, both stemming from the fact that we do not know the exact position and velocity of any molecule in the system: first, we can predict only the probability that an S1 molecule and an S2 molecule will collide in the next dt; and second, we can predict only the probability that such a collision will actually produce an Rj reaction. Consider a randomly chosen pair of S1 and S2 molecules. The assumption of thermal equilibrium implies that the S2 molecule will see the S1 molecule moving with an average relative speed $\bar v_{12} = \sqrt{8 k_B T/\pi m_{12}}$, where kB is Boltzmann's constant and $m_{12} = m_1 m_2/(m_1 + m_2)$. Denote the effective collision cross section of the molecular pair by σ12 (which would equal π(r1 + r2)² if the molecules were hard spheres with radii r1 and r2). In the next infinitesimal time dt, the S1 molecule will sweep out relative to the S2 molecule an infinitesimally small "collision volume" of size $(\bar v_{12}\, dt)\,\sigma_{12}$ – so called because if the center of the S2 molecule happens to lie inside that volume, then the two molecules will collide in the next dt. (We take dt to be so small that there is virtually no chance that the collision will be preempted by an earlier collision with some third molecule.) By our assumption that the system is "well-stirred" – a condition that can be secured either by an externally driven stirrer or by the inevitable self-stirring effects of the many non-reactive (bounce-off) molecular collisions that typically occur in such a system – the probability that the center of the S2 molecule will lie inside the collision volume is just the ratio of that volume to the total system volume: $(\bar v_{12}\, dt)\,\sigma_{12}/\Omega$. This ratio is therefore the probability that the pair will collide in the next dt. Denoting by pj the probability that a colliding S1–S2 molecular pair will actually
react according to R_j, we conclude by the multiplication law of probability theory that

$$ \frac{(\bar v_{12}\,dt)\,\sigma_{12}}{\Omega} \times p_j = \frac{\bar v_{12}\,\sigma_{12}\,p_j}{\Omega}\, dt \equiv c_j\, dt \tag{2} $$

gives the probability that a randomly chosen S1–S2 molecular pair will undergo the R_j reaction in the next dt. Now taking dt to be so small that no more than one of the x1 x2 S1–S2 pairs in the system will react in the next dt, we can invoke the addition law of probability for mutually exclusive events to compute the probability for some pair to so react as x1 x2 × c_j dt. Thus we conclude that the propensity function in Eq. (1) is a_j(x) = c_j x1 x2. If this bimolecular reaction had instead been S1 + S1 → product(s), we would have reckoned the number of distinct S1 molecular pairs to be x1(x1 − 1)/2, and so obtained for the propensity function a_j(x) = c_j (1/2) x1(x1 − 1), which properly vanishes if there is only one S1 molecule.

The foregoing analysis shows two things: first, an elemental reaction channel R_j can indeed be described by a function a_j(x) in the manner prescribed by Eq. (1); and second, a_j(x) can usually be written as the product of some constant c_j, called the specific reaction probability rate constant, times the number of distinct combinations of R_j reactant molecules that are available when the system's state is x. Our subsequent work here will depend critically on the first point, but will tolerate considerable variance with respect to the second; hence, we shall be less concerned here with the forms of the propensity functions than with the fact that they exist and satisfy Eq. (1).

But we should note in passing that the task of evaluating c_j entirely from first principles is a very challenging one. An interesting result for bimolecular reactions emerges in the idealized case in which the colliding molecules will react if and only if the kinetic energy associated with the component of their relative velocity along their line of centers at contact exceeds some threshold value ε*; in that case, it can be proved from elementary kinetic theory that the conditional reaction probability p_j in Eq. (2) is given by p_j = exp(−ε*/k_B T), thus providing a physically transparent interpretation of the familiar Arrhenius factor.3 If it were also the case that the reaction can occur only if the point of collisional contact between the two molecules lies inside specific solid angles ω1 on molecule 1 and ω2 on molecule 2, then in the absence of any orienting forces p_j would contain the additional probability factors (ω1/4π) and (ω2/4π). It turns out that c_j for a unimolecular reaction is numerically equal to the reaction rate constant k_j of conventional deterministic chemical kinetics, while c_j for a bimolecular reaction is equal to k_j/Ω if the reactants are
different species, or 2k_j/Ω if they are the same. Contemplating this result by itself, one might be tempted to conclude that the fundamental premise (1) and the mathematical forms of the propensity functions all follow from some simple heuristic "stochastic extrapolation" of the mass-action equations of deterministic chemical kinetics. But the foregoing analysis shows that that is not the case: the existence and forms of the propensity functions are rooted in the realities of molecular dynamics. The equations of stochastic chemical kinetics cannot be derived in a logically rigorous way from the equations of deterministic chemical kinetics; rather, as we shall see later, the derivation goes the other way. In what follows, we shall simply assume that the propensity functions a_j(x), like the state-change vectors ν_j, are all given.

3 See R. Present, Kinetic Theory of Gases (McGraw-Hill, New York, 1958), and D. Gillespie, Physica A, 188, 404–425, 1992.
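To fix notation for what follows, here is a minimal Python sketch (ours, not the chapter's; the rate constants and populations are invented for illustration) of the propensity functions and state-change vectors for a toy two-channel system, the reversible reaction S1 + S2 ⇌ S3:

```python
import numpy as np

# Specific probability rate constants c_j (assumed values, for illustration only).
c = np.array([1.0e-3, 1.0e-2])

# State-change vectors nu_j: row j gives the change in (x1, x2, x3) when R_j fires.
nu = np.array([[-1, -1, +1],    # R1: S1 + S2 -> S3
               [+1, +1, -1]])   # R2: S3 -> S1 + S2

def propensities(x):
    """Return a_j(x) for the two reaction channels above."""
    return np.array([c[0] * x[0] * x[1],   # bimolecular: a1(x) = c1*x1*x2
                     c[1] * x[2]])         # unimolecular: a2(x) = c2*x3

x0 = np.array([100, 80, 0])                # initial molecular populations
print(propensities(x0))                    # -> [8. 0.]
```

The later simulation sketches all reuse this `propensities` function and `nu` array.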
2. The Chemical Master Equation
Although the probabilistic nature of Eq. (1) precludes making an exact prediction of X(t) given that X(t0 ) = x0 for any t > t0 , we might reasonably hope to infer the probability
$$ P(x, t \mid x_0, t_0) = \operatorname{Prob}\{X(t) = x,\ \text{given}\ X(t_0) = x_0\}. \tag{3} $$
In technical terms, P(x, t | x0 , t0 ) is the “probability density function” of the time-dependent “random variable” X(t), and X(t) in turn is, by virtue of the dynamics prescribed by Eq. (1), a “jump Markov process”.4 It is not difficult to deduce a time-evolution equation for the function (3) by using the laws of probability theory to write P(x, t + dt | x0 , t0 ) as the sum of the probabilities of all the mutually exclusive ways in which the system could evolve from state x0 at time t0 to state x at time t + dt, via specified states at time t:
$$ P(x, t + dt \mid x_0, t_0) = P(x, t \mid x_0, t_0) \times \Bigl[ 1 - \sum_{j=1}^{M} a_j(x)\, dt \Bigr] + \sum_{j=1}^{M} P(x - \nu_j, t \mid x_0, t_0) \times \bigl[ a_j(x - \nu_j)\, dt \bigr]. $$
4 A stochastic process – a random variable that depends on time – is said to be Markov if its future values depend on its past values only through its present value. (A Markov process is distinguished from a Markov chain by the fact that time is a real or continuous variable in a process and an integer variable in a chain.) A jump Markov process changes discontinuously at isolated instants in time, and remains constant between such jumps. There are also continuous Markov processes, which evolve in a way that is mathematically continuous but often not differentiable.
Here, the first term on the right is the probability that the system is already in state x at time t and then no reaction of any kind occurs in [t, t + dt). And the generic second term is the probability that the system is one R_j reaction removed from state x at time t and then one R_j reaction occurs in [t, t + dt). That these M + 1 routes to the final state x are mutually exclusive and collectively exhaustive is ensured by taking dt to be so small that no more than one reaction of any kind can occur in [t, t + dt). Subtracting P(x, t | x0, t0) from both sides of the above equation, dividing through by dt, and then taking the limit dt → 0, we obtain what is known as the chemical master equation (CME):

$$ \frac{\partial P(x, t \mid x_0, t_0)}{\partial t} = \sum_{j=1}^{M} \bigl[ a_j(x - \nu_j)\, P(x - \nu_j, t \mid x_0, t_0) - a_j(x)\, P(x, t \mid x_0, t_0) \bigr]. \tag{4} $$
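As a concrete illustration (our example, not the chapter's), consider the single decay channel S → (inert products), with propensity a(x) = cx and state change ν = −1. For this system the CME (4) reduces to

$$ \frac{\partial P(x, t \mid x_0, t_0)}{\partial t} = c\,(x + 1)\, P(x + 1, t \mid x_0, t_0) - c\, x\, P(x, t \mid x_0, t_0), $$

and it can be solved exactly: each of the x_0 initial molecules survives to time t independently with probability $e^{-c(t - t_0)}$, so

$$ P(x, t \mid x_0, t_0) = \binom{x_0}{x}\, e^{-c(t - t_0)\, x}\, \bigl(1 - e^{-c(t - t_0)}\bigr)^{x_0 - x}. $$

As the next paragraph notes, however, such exactly solvable cases are rare.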
In principle, the CME completely determines the function P(x, t | x0, t0). But a closer inspection of Eq. (4) reveals that it is actually a set of coupled, ordinary differential equations in t; in fact, there is one equation for each possible value (0, 1, 2, . . .) of each of the N components of the variable x – roughly as many equations as there are combinations of molecules in the system! So it is perhaps not surprising that the CME can be solved analytically for only a very few very simple systems, and numerical solutions are usually prohibitively difficult in other cases. One might hope, less ambitiously, to learn something from the CME about the behavior of functional averages like ⟨f(X(t))⟩ ≡ Σ_x f(x) P(x, t | x0, t0), but this too turns out to be practically impossible if any of the reaction channels are bimolecular. For example, it can be proved from Eq. (4) that

$$ \frac{d\langle X(t)\rangle}{dt} = \sum_{j=1}^{M} \nu_j \left\langle a_j(X(t)) \right\rangle. \tag{5} $$
Now, if all the reactions were monomolecular, the propensity functions would all be linear in the state variables, and we would have ⟨a_j(X(t))⟩ = a_j(⟨X(t)⟩). Equation (5) would then become a closed ordinary differential equation for the first moment or mean ⟨X(t)⟩. But if any reaction is bimolecular, the right hand side of Eq. (5) will contain at least one quadratic moment of the form ⟨X_i(t) X_{i′}(t)⟩, and Eq. (5) would then be merely the first of an infinite, open-ended set of equations for all the moments. In the hypothetical case in which there are no fluctuations, i.e., if X(t) were a deterministic or sure process, we would have ⟨f(X(t))⟩ = f(X(t)) for all functions f. Equation (5) would then reduce to

$$ \frac{dX(t)}{dt} = \sum_{j=1}^{M} \nu_j\, a_j(X(t)). \tag{6} $$
This is just the well-known reaction rate equation (RRE) of traditional deterministic chemical kinetics – a set of coupled first-order ordinary differential equations for the components X_i(t), which are now continuous (real) variables. The RRE is more commonly written in terms of the concentration variable Z(t) = X(t)/Ω, but that simple scalar transformation is inconsequential for our purposes here. Although the foregoing analysis shows that the deterministic RRE (6) would be valid if all fluctuations were simply ignored, it does not tell us how or why the fluctuations might ever be "ignorable". We shall later prove that the RRE can actually be derived from Eq. (1) through a series of physically transparent approximating assumptions.
3. The Stochastic Simulation Algorithm
Since the CME (4) is practically never useful for calculating the probability density function of X(t), we need another approach. Let us look for a way to construct a numerical realization of X(t), i.e., a simulated trajectory of X(t) vs. t. Note that this is not the same as solving the CME numerically; however, much the same effect can be achieved by either histogramming or averaging the results of many realizations. For example, the nth moment ⟨X_i^n(t_1)⟩, which would be given in terms of the solution to the CME as $\sum_x x_i^n P(x, t_1 \mid x_0, t_0)$, can also be estimated by generating L trajectories x^(1)(t), . . . , x^(L)(t) from state x_0 at time t_0 to time t_1, and then computing $L^{-1} \sum_{l=1}^{L} \bigl[x_i^{(l)}(t_1)\bigr]^n$. This estimate will have an associated uncertainty which decreases with the number of realizations L like L^{−1/2}. In practice, it is often found that as few as two or three simulated trajectories can convey as good a picture of the dynamical behavior of X(t) as would be afforded by an exact expression for P(x, t | x_0, t_0). The key to generating simulated trajectories of X(t) is actually not the CME, but rather a new probability function, p(τ, j | x, t), which is defined as follows:
p(τ, j | x, t) dτ = the probability, given X(t) = x, that the next reaction in the system will occur in the infinitesimal time interval [t + τ, t + τ + dτ), and will be an R_j reaction. (7)

Formally, this function is the joint probability density function of the two random variables "time to the next reaction" (τ) and "index of the next reaction" (j), given that the system is currently in state x. If we can derive an analytical expression for this function, we could use Monte Carlo techniques
to generate simultaneous samples of τ and j , and that would enable us to advance the system in time from one reaction event to the next. Happily, it turns out that we can do all this fairly easily, and without having to make any approximations. To derive an analytical expression for p(τ, j | x, t), we begin by introducing yet another probability function, P0 (τ | x, t), which is defined as the probability, given X(t) = x, that no reaction of any kind occurs in the time interval [t, t +τ ). By the definition (1) and the laws of probability theory, this function satisfies
$$ P_0(\tau + d\tau \mid x, t) = P_0(\tau \mid x, t) \times \Bigl[ 1 - \sum_{j=1}^{M} a_j(x)\, d\tau \Bigr], $$
since the right side gives the probability that no reaction occurs in [t, t + τ) and then no reaction occurs in [t + τ, t + τ + dτ) (as usual, we take the infinitesimal time span dτ to be so small that it can contain no more than one reaction). A simple algebraic rearrangement of this equation and passage to the limit dτ → 0 results in the differential equation

$$ \frac{dP_0(\tau \mid x, t)}{d\tau} = -a_0(x)\, P_0(\tau \mid x, t), $$

where we have defined
$$ a_0(x) = \sum_{j=1}^{M} a_j(x). \tag{8} $$
The solution to this differential equation for the initial condition P_0(τ = 0 | x, t) = 1 is P_0(τ | x, t) = exp(−a_0(x) τ). Now we observe that the probability defined in Eq. (7) can be written p(τ, j | x, t) dτ = P_0(τ | x, t) × (a_j(x) dτ), since the right side gives the probability that no reactions occur in [t, t + τ) and then one R_j reaction occurs in [t + τ, t + τ + dτ). When we insert the above formula for P_0(τ | x, t) into this last equation and cancel the dτ's, we obtain

$$ p(\tau, j \mid x, t) = a_j(x) \exp(-a_0(x)\, \tau), \tag{9a} $$
or equivalently

$$ p(\tau, j \mid x, t) = a_0(x) \exp(-a_0(x)\, \tau) \times \frac{a_j(x)}{a_0(x)}. \tag{9b} $$
Equation (9a) is the desired explicit formula for the joint probability density function of τ and j . The equivalent form (9b) shows that this joint density function can be factored as the product of a τ -density function and a j -density function; more precisely, it shows that τ is an exponential random variable with mean and standard deviation 1/a0 (x), while j is a statistically independent integer random variable with point probabilities a j (x)/a0 (x). There are several exact Monte Carlo procedures for generating samples of these random variables. Perhaps the most direct is the procedure that follows by applying to each of the two probability density functions in Eq. (9b) the so-called inversion generating method:5 Draw two random numbers r1 and r2 from the uniform distribution in the unit-interval, and take
$$ \tau = \frac{1}{a_0(x)} \ln\!\left(\frac{1}{r_1}\right), \tag{10a} $$

$$ j = \text{the smallest integer satisfying}\ \sum_{j'=1}^{j} a_{j'}(x) > r_2\, a_0(x). \tag{10b} $$
And so we arrive at the following exact procedure for constructing a numerical realization of the process X(t), a procedure called the stochastic simulation algorithm (SSA):

0. Initialize the time t = t_0 and the system's state x = x_0.
1. With the system in state x at time t, evaluate all the a_j(x) and their sum a_0(x).
2. Generate values for τ and j using Eqs. (10) (or an equivalent procedure).
3. Effect the next reaction by replacing t ← t + τ and x ← x + ν_j.
4. Record (x, t) as desired. Return to Step 1, or else end the simulation.

The X(t) trajectory that is produced by the SSA might be thought of as a "stochastic version" of the trajectory that would be obtained by solving the RRE (6). (But note that the time step τ in the SSA is exact, and is not a finite approximation to some infinitesimal dt, as is the time step in most numerical solvers for the RRE.) If it is found that every SSA-generated trajectory is practically indistinguishable from the RRE trajectory, then we may conclude that microscale randomness is negligible for this system. But if the SSA trajectories are found to deviate significantly from the RRE trajectory, then we must conclude that microscale randomness is not negligible, and the deterministic RRE does not provide an accurate description of the system's true behavior.

The SSA and the CME are logically equivalent to each other, since each is derived without approximation from premise (1). But even when the CME is completely intractable, the SSA is quite straightforward to implement. In fact, as a numerical procedure, the SSA is even simpler than the procedures that are typically used to numerically solve the RRE (6). The catch is that the SSA is often very slow. The source of this slowness can be traced to the factor 1/a_0(x) in Eq. (10a), which as mentioned earlier is the mean of the random variable τ: since a_0(x) is at least linear and more commonly quadratic in the reactant populations, a_0(x) can be very large, and τ correspondingly very small, whenever any reactant species is present in large numbers, and that is nearly always the case in practice.

One notable attempt to speed up the SSA is the Gibson–Bruck procedure, which advances the system in exact accord with the function p(τ, j | x, t) in Eq. (9) but using a different scheme than Eqs. (10).6 Although this procedure is more complicated to code than the procedure described above, it is significantly faster and more efficient for systems having many species and many reaction channels. But any procedure that simulates every reaction event, exactly and one at a time, will inevitably be too slow for many practical applications. This prompts us to consider the possibility of giving up some of the exactness of the SSA in return for greater simulation speed.

5 It can be proved that a sample x of the random variable X can be obtained from a sample r of the unit-interval uniform random variable by solving $\int_{-\infty}^{x} P(x')\, dx' = r$, where P is the density function of X. This is known as the "inversion" generating procedure.
6 For details, see M. Gibson and J. Bruck, J. Phys. Chem., 104, 1876–1889, 2000.
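The SSA translates almost line for line into code. The following minimal Python sketch (ours, not the chapter's) implements the direct method of Eqs. (10); it assumes a `propensities(x)` function and a state-change array `nu` with one row per reaction channel, as in the earlier toy sketch:

```python
import numpy as np

def ssa(x0, nu, propensities, t0, t_end, rng=np.random.default_rng()):
    """Exact SSA, Steps 0-4: returns the reaction times and the states visited."""
    t, x = t0, np.array(x0, dtype=int)            # Step 0: initialize
    times, states = [t], [x.copy()]
    while t < t_end:
        a = propensities(x)                       # Step 1: all a_j(x) and a_0(x)
        a0 = a.sum()
        if a0 <= 0.0:                             # no reaction can ever occur again
            break
        r1, r2 = rng.random(), rng.random()
        tau = np.log(1.0 / r1) / a0               # Step 2: Eq. (10a)
        j = np.searchsorted(np.cumsum(a), r2 * a0, side="right")   # Eq. (10b)
        t, x = t + tau, x + nu[j]                 # Step 3: effect the next reaction
        times.append(t)                           # Step 4: record (x, t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

Because the search in Eq. (10b) scans all M channels, each step costs O(M); the Gibson–Bruck procedure mentioned above reorganizes this bookkeeping to lower the per-step cost for large reaction networks.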
4. Tau Leaping
One approximate accelerated simulation strategy is tau-leaping, which tries to advance the system by a pre-selected time interval τ that encompasses more than one reaction event. To properly accomplish that feat when the system is in state x at time t, we need to know how to generate sample values of the M random variables
K_j(τ; x, t) = the number of times reaction channel R_j fires in [t, t + τ), given that X(t) = x (j = 1, . . . , M). (11)
For then, we could simply insert those sample values into the update formula

$$ X(t + \tau) = x + \sum_{j=1}^{M} K_j(\tau; x, t)\, \nu_j \tag{12} $$
to "leap" the system ahead by the chosen time τ. Unfortunately, that is easier said than done. In general, the M random variables (11) are statistically dependent, and it is not altogether clear even how to calculate their joint probability density function, much less generate random samples according to that function. Suppose, however, that τ is chosen small enough that the following Leap Condition is satisfied: the expected state change induced by the leap is sufficiently small that no propensity function changes its value by a significant amount. In that case, we should be able to approximate each K_j(τ; x, t) by a statistically independent Poisson random variable:

$$ K_j(\tau; x, t) \approx P_j(a_j(x), \tau) \qquad (j = 1, \ldots, M). \tag{13} $$
This is because the generic Poisson random variable P(a, τ) is by definition the number of events that will occur in time τ, given that a dt is the probability that an event will occur in any infinitesimal time dt, where a may be any positive constant (hence the need for the Leap Condition).7 Therefore, if we can find a value for τ that is small enough that the Leap Condition is satisfied, yet large enough that many reaction events occur in time τ, we may indeed have a faster, albeit approximate, simulation strategy. The practical question arises: how can we determine in advance the largest value of τ that is compatible with the Leap Condition? Although there is as yet no unequivocal answer to this question, the following recipe for choosing τ will approximately ensure that no propensity function is likely to change its value in the leap by more than εa_0(x), where ε is some pre-chosen accuracy control parameter satisfying 0 < ε ≪ 1: with
$$ f_{jj'}(x) = \sum_{i=1}^{N} \frac{\partial a_j(x)}{\partial x_i}\, \nu_{ij'} \qquad (j, j' = 1, \ldots, M) \tag{14a} $$

and

$$ \mu_j(x) = \sum_{j'=1}^{M} f_{jj'}(x)\, a_{j'}(x), \qquad \sigma_j^2(x) = \sum_{j'=1}^{M} f_{jj'}^2(x)\, a_{j'}(x) \qquad (j = 1, \ldots, M), \tag{14b} $$

take8

$$ \tau = \min_{j \in [1, M]} \left\{ \frac{\varepsilon\, a_0(x)}{|\mu_j(x)|},\; \frac{\varepsilon^2\, a_0^2(x)}{\sigma_j^2(x)} \right\}. \tag{15} $$
7 It can be shown that the probability that the random variable P(a, τ) as so defined will equal any non-negative integer n is $e^{-a\tau}(a\tau)^n/n!$, and also that the mean and variance of P(a, τ) are both equal to aτ.
8 For a derivation, see D. Gillespie and L. Petzold, J. Chem. Phys., 119, 8229–8234, 2003.
The explicit tau-leaping simulation procedure thus goes as follows:

1. In state x at time t, and with a value chosen for ε, evaluate τ from Eq. (15).
2. For j = 1, . . . , M, generate the number of firings k_j of reaction R_j in time τ as a sample of the Poisson random variable P(a_j(x), τ).9
3. Leap, by replacing t ← t + τ and x ← x + Σ_{j=1}^{M} k_j ν_j.

Smaller values of ε will result in a better satisfaction of the Leap Condition, and hence a leap that is more accurate, but of course shorter. In the limit ε → 0, tau-leaping becomes mathematically equivalent to the SSA; however, tau-leaping will be very inefficient in that limit because all the k_j's will usually be zero, giving a very small time step without any change of state. Therefore, it is advisable to abort the above procedure after Step 1 if τ is found to be less than a few multiples of 1/a_0(x), the mean time to the next reaction, and instead use the SSA to step directly to that next reaction.

A variation on the foregoing tau-leaping procedure allows us to advance the system to the moment of the next firing of some particular reaction channel Rα, which perhaps initiates some critical sequence of events in the system. To do that, we start by computing a tentative τ from Eq. (15), and then computing aα(x)τ, the expected number of Rα firings in that time τ. If aα(x)τ < 1, we should not try to leap ahead to the next Rα reaction, because that would violate the Leap Condition. But if aα(x)τ ≥ 1, then a leap with kα = 1 should be okay. In that case, we would generate the actual time τ to the next Rα reaction as τ = aα^{−1}(x) ln(1/r), where r is a unit-interval uniform random number. Using that value for τ, we would then generate Poisson values for all the other k_{j ≠ α} as in Step 2, and finally effect the leap as in Step 3.

If the system happens to be "dynamically stiff" – meaning that it has widely varying dynamical modes, the fastest of which are stable – the explicit tau-leaping procedure will be computationally unstable for time steps that are larger than the fastest time scale, and that may severely restrict the size of τ. Stiffness is very common in chemical systems. Recently, an implicit tau-leaping procedure has been proposed which shows promise of overcoming the instability problem for stiff systems.10

It should be noted that tau-leaping is not as foolproof as the SSA. If one takes leaps that are too large, bad things can happen; e.g., some species populations might be driven negative. The underlying philosophy of tau-leaping is to leap over "unimportant" reaction events but not the "important" ones, and in some circumstances special measures must be taken to ensure that outcome. Much more work in this area is needed.

9 Numerical procedures for generating Poisson random numbers can be found, for instance, in W. Press, B. Flannery, S. Teukolsky, and W. Vetterling, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, New York, 1986.
10 For details, see M. Rathinam, L. Petzold, Y. Cao, and D. Gillespie, J. Chem. Phys., 119, 12784–12794, 2003.
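In the same hedged spirit as the earlier sketches, the following Python fragment (ours) implements the core of explicit tau-leaping with a fixed, user-supplied τ; the adaptive τ-selection of Eq. (15) and the SSA fallback for small τ are omitted for brevity:

```python
import numpy as np

def tau_leap(x0, nu, propensities, t0, t_end, tau, rng=np.random.default_rng()):
    """Explicit tau-leaping with a fixed, user-chosen leap time tau.
    A production code would choose tau adaptively from Eq. (15) and fall
    back to the SSA whenever tau is only a few multiples of 1/a0(x)."""
    t, x = t0, np.array(x0, dtype=int)
    times, states = [t], [x.copy()]
    while t < t_end:
        a = propensities(x)
        if a.sum() <= 0.0:
            break
        k = rng.poisson(a * tau)            # k_j ~ P(a_j(x), tau), Eq. (13)
        x = x + nu.T @ k                    # x <- x + sum_j k_j * nu_j (Step 3)
        if np.any(x < 0):                   # the "too large a leap" failure mode
            raise RuntimeError("negative population: reduce tau")
        t += tau
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

The explicit negativity check mirrors the caution above: a leap that drives some population negative is the telltale sign that the chosen τ violates the Leap Condition.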
5. The Chemical Langevin Equation
In the previous section we noted that, when the system is in state x at time t, if we choose a time step Δt that is small enough that none of the propensity function values changes significantly during Δt, then the system's state at time t + Δt can be decently approximated by
$$ X(t + \Delta t) \doteq x + \sum_{j=1}^{M} P_j\bigl(a_j(x), \Delta t\bigr)\, \nu_j, \tag{16} $$
where the P_j's are statistically independent Poisson random variables. Suppose the system admits a Δt that satisfies not only that condition, but also the condition that the expected (or mean) number of firings of each reaction channel in time Δt is ≫ 1; i.e.,

$$ a_j(x)\, \Delta t \gg 1 \quad \text{for all } j = 1, \ldots, M. \tag{17} $$
It will usually be possible to find such a Δt if the molecular populations of all reactant species are "sufficiently large". Now, it is well known that the Poisson random variable P(a, τ), which has mean and variance aτ, can be approximated when aτ ≫ 1 by the normal random variable with the same mean and variance.11 Therefore, denoting the normal random variable with mean m and variance σ² by N(m, σ²), condition (17) allows Eq. (16) to be further approximated as follows:
$$ X(t + \Delta t) \doteq x + \sum_{j=1}^{M} N_j\bigl(a_j(x)\Delta t,\; a_j(x)\Delta t\bigr)\, \nu_j \tag{18a} $$

$$ \phantom{X(t + \Delta t)} = x + \sum_{j=1}^{M} \Bigl[ a_j(x)\Delta t + \sqrt{a_j(x)\Delta t}\; N_j(0, 1) \Bigr]\, \nu_j, $$

$$ X(t + \Delta t) \doteq x + \sum_{j=1}^{M} \nu_j\, a_j(x)\, \Delta t + \sum_{j=1}^{M} \nu_j \sqrt{a_j(x)}\; N_j(0, 1)\, \sqrt{\Delta t}, \tag{18b} $$
where the second line invokes the fact that N(m, σ²) = m + σN(0, 1).

11 That $e^{-a\tau}(a\tau)^n/n! \approx (2\pi a\tau)^{-1/2} \exp\bigl(-(n - a\tau)^2/2a\tau\bigr)$ when aτ ≫ 1 follows from Stirling's approximation and the small-x approximation for ln(1 + x).
We have thus established the following result: if the system admits a macroscopically infinitesimal time increment dt, defined so that during dt (i) no propensity function changes its value significantly yet (ii) every reaction channel fires many more times than once, then we can approximate the t to t + dt system evolution by

$$ X(t + dt) \doteq X(t) + \sum_{j=1}^{M} \nu_j\, a_j(X(t))\, dt + \sum_{j=1}^{M} \nu_j \sqrt{a_j(X(t))}\; N_j(t)\, \sqrt{dt}, \tag{19} $$
where the N_j(t) are M statistically independent, temporally uncorrelated, normal random variables with means 0 and variances 1. Equation (19) is called the chemical Langevin equation (CLE). The dot over its equal sign reminds us that it is an approximation, valid only to the extent that dt is small enough to satisfy condition (i) and simultaneously large enough to satisfy condition (ii). It is usually possible to find such a dt if all the reactant populations are sufficiently large. But if that is not possible, Eq. (19) has no basis and should not be invoked. The approximate character of the CLE (19) is underscored by the fact that the state vector X(t) therein is no longer discrete (integer-valued), but instead is continuous (real-valued); in fact, the name "Langevin" is applied because Eq. (19) has the exact mathematical form of the like-named generic equation that governs the time-evolution of any continuous Markov process.

For the sake of completeness, two pertinent but unobvious results from the formal theory of continuous Markov processes should be noted here:12 First, Eq. (19) can be written in the mathematically equivalent form

$$ \frac{dX(t)}{dt} \doteq \sum_{j=1}^{M} \nu_j\, a_j(X(t)) + \sum_{j=1}^{M} \nu_j \sqrt{a_j(X(t))}\; \Gamma_j(t). \tag{20} $$

Here, the Γ_j(t) are statistically independent "Gaussian white noise" processes satisfying ⟨Γ_j(t) Γ_{j′}(t′)⟩ = δ_{jj′} δ(t − t′), where the first delta function is Kronecker's and the second is Dirac's. Equation (20) is called the "white noise form" of the CLE. Second, the time evolution of X(t) prescribed by Eq. (19) induces a time evolution in the probability density function of X(t) according to the partial differential equation

$$
\frac{\partial P(x, t \mid x_0, t_0)}{\partial t} \doteq
- \sum_{i=1}^{N} \frac{\partial}{\partial x_i}\!\left[\sum_{j=1}^{M} \nu_{ij}\, a_j(x)\, P(x, t \mid x_0, t_0)\right]
+ \frac{1}{2} \sum_{i=1}^{N} \frac{\partial^2}{\partial x_i^2}\!\left[\sum_{j=1}^{M} \nu_{ij}^2\, a_j(x)\, P(x, t \mid x_0, t_0)\right]
+ \sum_{\substack{i, i' = 1 \\ (i < i')}}^{N} \frac{\partial^2}{\partial x_i\, \partial x_{i'}}\!\left[\sum_{j=1}^{M} \nu_{ij}\, \nu_{i'j}\, a_j(x)\, P(x, t \mid x_0, t_0)\right]. \tag{21}
$$

This equation is called the chemical Fokker–Planck equation (CFPE). Essentially, we have approximated the jump Markov process governed by the master Eq. (4) by the continuous Markov process governed by the Fokker–Planck Eq. (21).

All this somewhat complicated and possibly unfamiliar mathematics should not be allowed to obscure the genuine simplicity of the logical arguments underlying the foregoing derivation of the CLE (19): condition (i) allowed us to infer, essentially from the fundamental premise (1), the Poisson approximation (16), and condition (ii) then allowed us to make the normal approximation (18), whence the CLE (19).13

Before examining some interesting theoretical implications of the CLE, we should note that it has a very practical numerical application: in either of its forms (18), the CLE enables us to approximately advance the system in time by a macroscopically infinitesimal time increment Δt. By virtue of condition (17), that would allow us to leap over very many individual reactions, thus producing a very substantial increase in simulation speed over the SSA. The Langevin update formula (18) is computationally more attractive than the explicit tau-leaping update formula (16) simply because normal random numbers are easier to generate than Poisson random numbers.14 But it should be clear that the Langevin update formula (18) is really just a limiting approximation of the explicit tau-leaping update formula (16): whenever conditions (17) hold, tau-leaping inevitably reduces to Langevin leaping.

12 For a proof of the equivalence of the mathematical forms (19), (20) and (21), see D. Gillespie, Am. J. Phys., 64, 1246–1257, 1996.
13 Note that this derivation of the CLE does not proceed in the ad hoc manner of many Langevin equation derivations, in which the forms of the coefficients of Γ_j(t) in Eq. (20) are simply assumed with an eye to obtaining some pre-conceived outcome.
14 See the reference cited in footnote 9.
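As a minimal illustration (ours) of the Langevin update formula (18b), the following Python sketch advances a real-valued state with normal random numbers; it is meaningful only when condition (17) holds for the chosen dt:

```python
import numpy as np

def langevin_leap(x0, nu, propensities, t0, t_end, dt, rng=np.random.default_rng()):
    """Langevin-leaping per Eq. (18): valid only when a_j(x)*dt >> 1 for all j.
    The state is treated as continuous (real-valued), as in the CLE."""
    t, x = t0, np.array(x0, dtype=float)
    times, states = [t], [x.copy()]
    while t < t_end:
        a = np.maximum(propensities(x), 0.0)   # clamp: a real-valued x can make a_j < 0
        noise = rng.standard_normal(len(a))    # independent N_j(0, 1) at each step
        x = x + nu.T @ (a * dt + np.sqrt(a * dt) * noise)
        t += dt
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

Structurally this is identical to the tau-leaping sketch with the Poisson draw replaced by its normal approximation, which is precisely the point made in the text.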
6. The Reaction Rate Equation Limit
In practice, most chemical systems contain huge numbers of molecules, and are thus well on their way to the so-called thermodynamic limit, in which the species populations X_i and the system volume Ω all approach infinity in such a way that the species concentrations X_i/Ω remain constant. The large molecular populations of such systems usually mean that their dynamical behavior is well described by the CLE (19).

An inspection of the CLE (19) shows that it separates the state increment X(t + dt) − X(t) in a macroscopically infinitesimal time step dt into two components: a deterministic component proportional to dt, and a fluctuating component proportional to √dt. The deterministic component is evidently linear in the propensity functions, while the fluctuating component is proportional to the square root of the propensity functions. Now it happens that all propensity functions grow, in the thermodynamic limit, in direct proportion to the size of the system. For a unimolecular propensity function of the form c_j x_i this is obvious; for a bimolecular propensity function of the form c_j x_i x_{i′} this follows because c_j is inversely proportional to the system volume [cf. Eq. (2)], which offsets one of the population variables. Therefore, as the thermodynamic limit is approached, the deterministic component of the state increment in the CLE (19) grows like the system size, whereas the fluctuating component grows like the square root of the system size. The fluctuating component thus scales, relative to the deterministic component, as the inverse square root of the system size. This establishes, in a logically deductive way, the conventional rule-of-thumb that relative fluctuations in a chemically reacting system typically scale as the inverse square root of the system size.

This scaling behavior also implies that, in the full thermodynamic limit, the fluctuating term in the CLE (19) usually becomes vanishingly small compared to the deterministic term, and hence can be dropped. The CLE therefore becomes, in the full thermodynamic limit,

$$ X(t + dt) \doteq X(t) + \sum_{j=1}^{M} \nu_j\, a_j(X(t))\, dt. \tag{22} $$
This is just the conventional RRE (6). But we have now derived it within the theoretical framework of stochastic chemical kinetics. Notice how our description of the system’s dynamical behavior has progressed: The CME (4) and the SSA (9) describe X(t) as a discrete stochastic process. The CLE (19) and the CFPE (21) describe X(t) as a continuous stochastic process. And the RRE (22) describes X(t) as a continuous deterministic process. At each level, the description is an approximation of the description at the previous level, valid only under certain specific conditions.
One instance in which the limiting form (22) can be misleading is when the sum on the right hand side is zero, which happens whenever the system evolves to a “stable state”. In such a circumstance, the fluctuating term in the CLE (19) will inevitably be larger than the deterministic term, and hence not entirely negligible. Another instance of inadequacy of the RRE concerns the long-time behavior of an open or driven system that has more than one stable state: Such a system will in fact perpetually visit all of those stable states, whereas the RRE contrarily implies that the system will go to the nearest (downhill) stable state and stay there forever. But in the many cases where the approximating assumptions leading to the RRE are warranted, the RRE provides a very efficient description of the system’s temporal behavior.
7. The Chemical Kinetics Modeling Hierarchy
We conclude by summarizing the hierarchy of schemes that are available for modeling the time evolution of a chemically reacting system, proceeding from the slowest and most accurate to the fastest and most approximate. The most exact procedure for simulating the time evolution of a chemically reacting system is molecular dynamics (MD), wherein the position and velocity of every molecule in the system are tracked precisely. This results in a simulation of every molecular collision that occurs in the system, not only the reactive collisions but also the non-reactive (elastic) collisions. MD is thus able to show very accurately the evolution of not only the species populations, but also their spatial distributions. But of course, this essentially exact approach requires an enormous investment of computation time and resources. If the system is such that reactive collisions are usually separated in time by many non-reactive collisions, and the predominant effect of the latter is simply to “stir” the system, then we may back away from an MD simulation and use instead the SSA. The SSA simulates only the reactive collisions. Because it skips over all the non-reactive collisions, and also avoids computing spatial distributions (which are assumed to be uniform in the statistical sense), the SSA is computationally much faster than MD. Tau-leaping is based on the same assumptions as the SSA, but it proceeds approximately from those assumptions; more specifically, it uses a special Poisson approximation to advance the system by a pre-selected time τ during which more than one reaction event may occur. The size of τ is restricted by the condition that no propensity function may change its value during τ by a “significant” amount. Whenever that condition is satisfied and at least some of the reaction channels fire very many times in τ , tau-leaping will be faster than the SSA. A tau-leap in which all of the reaction channels fire many more times than once is approximately described by the CLE. In a Langevin-leap, the number
of firings of each reaction channel is approximated by a normal random number instead of a Poisson random number. Since by hypothesis many reaction events are skipped over, and since also normal random numbers are easier to generate than Poisson random numbers, Langevin-leaping is faster than ordinary tau-leaping. Finally, if the system admits a description by the CLE and is for all practical purposes at the thermodynamic or “large system” limit, then the random term in the CLE will usually be negligibly small compared to the deterministic term. The CLE then reduces to the deterministic RRE. This RRE limit is usually justified for macroscopic systems, and when it is, it provides the most efficient way to simulate the evolution of the system.
Acknowledgments

This work was supported by the Air Force Office of Scientific Research and the California Institute of Technology under DARPA Award No. F3060201-2-0558, and also by the Molecular Sciences Institute under Contract No. 244725 with the Sandia National Laboratories and the Department of Energy's "Genomes to Life Program."
5.12 KINETIC MONTE CARLO SIMULATION OF NON-EQUILIBRIUM LATTICE-GAS MODELS: BASIC AND REFINED ALGORITHMS APPLIED TO SURFACE ADSORPTION PROCESSES

J.W. Evans
Ames Laboratory – USDOE, and Department of Mathematics, Iowa State University, Ames, Iowa, 50011, USA
For many growth, transport, or reaction processes occurring on the surfaces or in the bulk of crystalline solids, atoms reside primarily at a discrete periodic array or lattice of sites, actually vibrating about such sites. These atoms make occasional "sudden" transitions between nearby sites due to diffusive hopping, or may populate or depopulate sites due to adsorption and desorption, possibly involving reaction. Most of these microscopic processes are thermally activated, the rates having an Arrhenius form reliably determined by transition state theory [1]. In general, these rates will depend on the local environment (i.e., the occupancy of nearby sites) thus introducing cooperativity into the process, and they may vary over many orders of magnitude. Such systems are naturally described by lattice-gas (LG) models wherein the sites of a periodic lattice are designated as either occupied (perhaps by various types of particles) or vacant. A specification of all possible transitions between different configurations of particles, together with the associated rates, completely prescribes the evolution of the LG model for the process of interest. Such models are called Interacting Particle Systems (IPS) in the mathematical statistics community [2]. They correspond to stochastic Markov processes for evolution between different possible configurations of the system, and their evolution is rigorously described by appropriate master equations [3]. Since it is typically not possible to precisely determine the behavior of the solutions of these equations with analytic methods, Kinetic Monte Carlo (KMC) simulation is the most common tool for analysis. This approach, described below, implements on computer the "typical" evolution of the LG model
through a specific sequence of configurations using a random number generator to select processes with the appropriate weights [4]. The great advantage of KMC is that usually it can treat these processes on the physically relevant time and length scales, contrasting conventional Molecular Dynamics. Another advantage of KMC is its versatility with respect to model modification, allowing systematic testing of the effect of various processes on behavior. The focus here is on simulation of non-equilibrium processes, hence the "kinetic" in KMC. This contrasts conventional Monte Carlo (MC) simulation for equilibrium (Gibbs) states of Hamiltonian systems. For the latter, the independence of the equilibrium state on the history or dynamics of the system provides considerable flexibility in optimizing the simulation procedure [4]. For example, to speed up the simulation, one can use artificial dynamics provided that is consistent with detailed-balance. Also, other tools are available for analysis of simulation data, e.g., histogram re-weighting, based on the features of the Gibbs distribution. These techniques are not available for non-equilibrium systems, where the simulation must incorporate the physical dynamics to correctly predict the possibly non-trivial competition between various kinetic pathways. This requirement was not fully appreciated in earlier studies of the approach to equilibrium, where generic rules for rates were often used. The philosophy adopted here is that the atomistic LG model should first be clearly defined, as distinct from the simulation algorithm used to analyze the model. Thus, below we first give a general description of non-equilibrium LG models, together with the master equations which describe their evolution, and only then describe the two types of generic KMC simulation algorithms. It is most instructive to illustrate these basic algorithms and various refinements to them in the context of specific classes of examples. We choose models for evolution of homoepitaxial thin film systems and for catalytic surface reactions.
1. Evolution of Stochastic Lattice-Gas Models: Master Equations
The basic master equation formulation described below applies to the case of finite systems, i.e., lattices with L^d sites, where d is the spatial dimension. Any simulation is of course also restricted to such finite systems. Usually, finite-size effects are minimized by choosing periodic boundary conditions. Below, we let n_j denote the occupancy of site j, {n_j} the configuration of the entire system, and P({n_j}, t) the probability for the system to be in this configuration at time t. Implicitly, these probabilities involve ensemble averaging which, in the context of KMC simulation, may correspond to averaging over a
large number of simulation trials. Then, evolution is described exactly by the master equations [3]

$$ \frac{d}{dt} P(\{n_j\}, t) = \sum_{\{n_j'\}} W(\{n_j'\} \to \{n_j\})\, P(\{n_j'\}, t) - \sum_{\{n_j'\}} W(\{n_j\} \to \{n_j'\})\, P(\{n_j\}, t), \tag{1} $$
where W({n_j} → {n_j'}) denotes the prescribed rate of transitions from configuration {n_j} to {n_j'}. These two configurations will differ only in the occupancy of a single site for adsorption or desorption, but in the occupancy of a pair of sites for diffusion. On the right hand side of (1), the first (second) term reflects gain (loss) in the population of configuration {n_j}. As an aside, we note that for a Markov process, specifying a rate for each microscopic process actually means there is an exponential waiting-time distribution between events associated with this process, with the mean waiting-time between consecutive events given by the inverse of the rate. The solutions of the master equations satisfy conservation of probability, positivity, etc. The eigenvalues of the evolution matrix associated with these linear equations have non-positive real parts to avoid blow-up of probabilities, but they can in general be complex-valued. The latter scenario corresponds to time-oscillatory solutions, which can occur in open non-equilibrium systems. In cases where the energy of configuration {n_j} is described by a Hamiltonian, H({n_j}), selecting rates to satisfy the detailed-balance condition (Landau and Binder, 2000)
(2)
guarantees that the solution will evolve to the Gibbs equilibrium state Peq ({n j })∝ exp(−H {n j }/kT ). In this case, one can also show that the evolution matrix has only real (non-positive) eigenvalues, so solutions of (1) exhibit only decay in time, not oscillatory behavior [3]. In both analytic and simulation studies, the P({n j }, t) typically contain too much information to be manageable. It is thus common to focus on reduced quantities such as the probability that a single site k is occupied, P(n k , t) = P({n j }, t), and higher-order quantities such as spatial pair-correlations. nj = /k From (1), one can obtain a hierarchy of rate equations for these, which can be analyzed using approximate factorization relations to truncate the hierarchy at some low order, or by exact Taylor series expansions for short-time behavior [5]. Often, one has translational invariance due to periodic boundary conditions, so site quantities are independent of location, and pair-correlations depend only on separation of the pair of sites. Furthermore, behavior in the
limit of infinite system size, L → ∞, is of primary interest. Then, in the context of KMC simulation, such reduced quantities can be obtained precisely from a single simulation for a sufficiently large system, rather than by averaging over several trials.
2. Generic Kinetic Monte Carlo Simulation Algorithms
We first describe the two types of generic KMC algorithms applied to LG models. We then compare features of the two algorithms, and give an example of their application to a simple deposition process. Finally, we discuss some issues associated with the finite size of the simulation system. Below, we assume that these models incorporate a variety of distinct atomistic processes, which we label by α (e.g., α = adsorption, desorption, diffusion, reaction, etc.). Furthermore, we suppose that each process, α, occurs with only a finite number of microscopic rates, Wα (m), for m = 1, 2, . . . , depending on the local environment.
2.1. Basic Algorithm
Here, we let Wα(max) denote the maximum of the Wα(m), for each α. We then set W_tot = Σ_α Wα(max), and define p_α = Wα(max)/W_tot, so that Σ_α p_α = 1. In the basic algorithm, one first randomly selects a site, then selects a process, α, with probability p_α, reflecting the maximum rate for that process α. Finally, one implements this process (if allowed) with a probability q_α = Wα/Wα(max) ≤ 1, where Wα is the actual rate for process α at site j. This means that Wα is one of the Wα(m), with m determined by the local environment of site j. It is also essential to connect the "simulation time," i.e., the number of times a site is chosen, with the "physical time" in the stochastic LG model. On each occasion a site is chosen, we increment the physical time by δt, where L^d W_tot δt = 1. Thus, after one attempt per site, the physical time has increased by 1/W_tot. A minimal sketch of this procedure is given below.
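The following Python sketch (ours; the rates and lattice size are invented, and island-formation rules are omitted so that only the site/process selection and the time bookkeeping are shown) applies the basic algorithm to the two-process deposition/hopping model discussed in Sect. 2.4 below:

```python
import numpy as np

rng = np.random.default_rng(1)
L, F, h, z = 64, 1.0e-2, 1.0e5, 4       # illustrative values; see Sec. 2.4
W_tot = F + z * h                       # sum over processes of the maximum per-site rate
p_dep = F / W_tot                       # selection probability for deposition

grid = np.zeros((L, L), dtype=np.int8)  # 1 marks an adsorbed atom (no island rules here)
t = 0.0
for _ in range(100_000):
    i, j = rng.integers(L), rng.integers(L)          # first, randomly select a site
    if rng.random() < p_dep:                         # then select a process alpha
        grid[i, j] = 1                               # deposition always succeeds (q = 1)
    elif grid[i, j] == 1:                            # hop: fails unless the chosen site
        di, dj = ((1, 0), (-1, 0), (0, 1), (0, -1))[rng.integers(z)]
        grid[i, j] = 0                               # actually holds a mobile atom
        grid[(i + di) % L, (j + dj) % L] = 1         # (overwriting collisions ignored)
    t += 1.0 / (L * L * W_tot)                       # physical time: L**d * W_tot * dt = 1
```

Note that, exactly as the text warns, almost every attempt here selects "hop" and then fails, because the density of mobile atoms is low; the clock nevertheless advances on every attempt.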
2.2. Bortz Algorithm
Here, we let Nα(m) denote the (finite) number of particles which can partake in process α with the mth rate, Wα(m). Then, the total rate for all particles in the system associated with this process α occurring at the mth rate is Rα(m) = Wα(m)Nα(m), and the total rate for all processes is R_tot = Σ_β Σ_n R_β(n). The Bortz (or Bortz–Kalos–Lebowitz) simulation algorithm [6] maintains a list of these particles for each α and m. The simulation proceeds by selecting a sub-process (α, m) with probability p_α(m) = Rα(m)/R_tot, then
randomly selecting one of the Nα(m) particles capable of making this move from the corresponding list, and then implementing the process (after which the lists have to be updated). Again, one must connect the "simulation time," i.e., the number of times a process is chosen, with the "physical time" in the stochastic LG model. On each occasion when a process is implemented, one increments the physical time by δt = 1/R_tot.
2.3. Comparison of Algorithms
In comparing standard and Bortz algorithms, it is appropriate to first note that often the rates Wα(m), described above, vary over many orders of magnitude. Furthermore, processes with high rates often have a low population of available particles, a feature which can apply not just under quasi-equilibrium conditions, but more generally. Thus, in the basic algorithm, after selecting a site, usually one selects a process α with a large Wα(max), but then typically fails to implement this process due to the small population of particles in this class. Thus, the basic algorithm is simple, but possibly inefficient due to the large fraction of failed attempts. In contrast, in the Bortz algorithm, one always implements the chosen process, so in this sense the algorithm has optimal efficiency. However, there is a substantial "book-keeping" penalty in that one must maintain and continually update lists of length Nα(m) of particles for each sub-process (labeled by α and m). In practice, for complex models where processes have many rates, one may compromise between the two approaches, accepting some fraction of failed attempts to avoid substantial additional complexity or cost in book-keeping.
2.4. A Simple Example
To illustrate these features, consider irreversible island formation during submonolayer deposition [7]. Here, atoms deposit randomly at rate F per unit time at the adsorption sites on the surface, represented by an L × L site square lattice (so d = 2) with coordination number z = 4, and with periodic boundary conditions. Adsorbed atoms (adatoms) then hop randomly to adjacent sites at rate h per unit time (in each of z = 4 directions) until meeting other diffusing atoms and irreversibly nucleating new (immobile) islands, or until irreversibly aggregating with existing islands. We assume some simple rule for incorporating into islands adatoms which land on top of islands, or which diffuse to island edges, where this rule does not involve additional processes with finite rates. Thus, the model is characterized by just two rates. Typically, F ∼ 10^−2/s, but h ∼ 10^5–10^7/s is many orders of magnitude higher, and this leads to a very low density of diffusing adatoms on the surface (∼10^−5–10^−7 atoms per site).
For this deposition model, we write α = "dep" (deposition) or "hop" (diffusive hopping), where each process is described by a single rate. In the basic algorithm, one has p_dep = F/(F + zh) ≪ 1 and p_hop = zh/(F + zh) ≈ 1. Thus, after choosing a site, typically one attempts to hop, but fails due to the very low probability of that site being occupied by a diffusing adatom. Also, one increments time by δt = (F + zh)^−1 L^−2. In the Bortz algorithm, one maintains a list of the N_hop diffusing adatoms and their positions. Then, one has R_dep = FL^2 (as all sites are adsorption sites) and R_hop = zhN_hop. Thus, at each Monte Carlo step, one chooses either to deposit with probability p_dep = FL^2/(FL^2 + zhN_hop), or to hop with probability p_hop = zhN_hop/(FL^2 + zhN_hop). For deposition, one randomly chooses a lattice site and deposits. For hopping, one randomly chooses one of the N_hop diffusing atoms from the list, and then implements the hop in a randomly chosen direction. After either event, the list of diffusing adatoms is updated. In particular, one must check for incorporation into an island, which leads to removal of the adatom from the list. A minimal implementation is sketched below.
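The following Python sketch (ours, not the article's code; the rates and the deliberately crude incorporation rule are illustrative assumptions) implements the Bortz version of this model, maintaining the list of diffusing adatoms and drawing exponential waiting times as described in Sect. 2.5 below:

```python
import numpy as np

rng = np.random.default_rng(0)
L, F, h, z = 128, 1.0e-2, 1.0e5, 4      # illustrative rates; see the text for realistic values
grid = np.zeros((L, L), dtype=np.int8)  # 0 empty, 1 mobile adatom, 2 island atom
hoppers = []                            # Bortz list of the N_hop diffusing adatoms

def neighbors(i, j):
    """Periodic nearest neighbors of site (i, j)."""
    return [((i + 1) % L, j), ((i - 1) % L, j), (i, (j + 1) % L), (i, (j - 1) % L)]

def settle(i, j):
    """Place an adatom at (i, j): freeze it (and any adjacent mobile adatoms)
    if it touches an occupied site, otherwise add it to the hopper list."""
    occupied = [(a, b) for a, b in neighbors(i, j) if grid[a, b] > 0]
    if occupied:                        # nucleation or aggregation: irreversible
        grid[i, j] = 2
        for a, b in occupied:
            if grid[a, b] == 1:
                grid[a, b] = 2
                hoppers.remove((a, b))  # update the Bortz list
    else:
        grid[i, j] = 1
        hoppers.append((i, j))

t = 0.0
for _ in range(50_000):
    R_dep, R_hop = F * L * L, z * h * len(hoppers)   # total rates of the two processes
    R_tot = R_dep + R_hop
    t += -np.log(rng.random()) / R_tot               # exponential waiting time (Sec. 2.5)
    if rng.random() < R_dep / R_tot:                 # select process with prob R/R_tot
        i, j = int(rng.integers(L)), int(rng.integers(L))
        if grid[i, j] == 0:                          # (atoms landing on occupied sites
            settle(i, j)                             #  are simply ignored in this sketch)
    elif hoppers:
        i, j = hoppers.pop(int(rng.integers(len(hoppers))))
        grid[i, j] = 0
        a, b = neighbors(i, j)[rng.integers(z)]      # hop in a random direction
        settle(a, b)
```

Freezing an adatom the moment it touches any occupied site is a crude stand-in for the incorporation rule described above; the essential Bortz features are the per-process total rates, the explicit hopper list, and its update after every event.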
2.5. Finite Size Effects
For large systems, the time increments δt described above are small. Thus, the above algorithms accurately represent the continuous-time dynamics of the stochastic lattice-gas models. These algorithms also automatically produce an exponential waiting-time distribution between consecutive events for each particle. However, for small systems, the increments δt become significant on the time scale of the slowest process. To recover an accurate description of continuous kinetics and waiting-time distributions in the basic algorithm, one could simply reduce all the p_α by some factor ε ≪ 1, and correspondingly reduce all the δt by the same factor. Analogous refinements are possible in the Bortz algorithm. Instead, one can recover the exponential waiting-time distribution by setting δt = −ln(x) L^−d/W_tot (basic algorithm), or δt = −ln(x)/R_tot (Bortz algorithm), where x is a random number chosen uniformly in [0, 1].

For KMC simulation (in finite systems), there are fluctuations between different runs or trials in predictions of quantities at some specific time. Simplistically, fluctuations in some number, N (e.g., of adsorbed particles, of islands, etc.), should vary like the square root of that number, √N. Such numbers typically scale linearly with the system size (i.e., the number of sites = L^d), so the corresponding densities ρ = N/L^d are roughly size-independent. Thus, it follows that uncertainties in numbers (densities) should scale like L^{d/2} (L^{−d/2}). A more sophisticated analysis comes from applying general fluctuation–correlation relations (the presentation in Landau and Binder, 2000, for equilibrium systems is readily generalized): ⟨(δN)²⟩ = L^d C_tot, or equivalently ⟨(δρ)²⟩ = L^−d C_tot, where C_tot represents the pair-correlations for the quantity of interest (e.g., adsorbed atoms, islands, etc.) summed over all separations.
Finally, we discuss the effects of finite system size on the mean behavior of quantities of interest. Usually, the choice of periodic boundary conditions is motivated by the desire to minimize such effects, and specifically to remove "edge effects". In general, one expects finite-size effects to be negligible when the linear system size, L, significantly exceeds the relevant spatial correlation length, L_c. This condition is violated near "critical points", where L_c → ∞.
3. Simulation of Homoepitaxial Thin Film Growth and Relaxation
Homoepitaxial growth [8] involves random deposition of adatoms on a surface, and their subsequent diffusion. Adatom diffusion mediates nucleation of new islands, when a suitable number of adatoms meet, in competition with growth of existing islands, when adatoms aggregate with island edges. In addition, the details of interlayer transport are critical in determining multilayer morphologies. Post-deposition relaxation often occurs on a much longer timescale than growth, and different processes may dominate, e.g., 2D evaporation-condensation at island edges.
3.1. Tailored Models and Algorithms
Rather than developing generic models which might handle both growth and relaxation, often a more effective strategy is to develop "tailored" models. These focus on the essential atomistic processes (for the conditions of interest) which are described by a few key parameters. As an example, we describe a simple but effective model for metal(100) homoepitaxial growth with irreversible island formation [9]. As in the simple example used above, deposition occurs at rate F and subsequent hopping to adjacent sites at rate h. Diffusing adatoms irreversibly nucleate new islands upon meeting, and irreversibly aggregate with existing islands. Islands have compact near-square shapes in these systems due to efficient edge diffusion and kink rounding. Thus, once a diffusing atom reaches an island edge, it is immediately moved to a nearby doubly-coordinated kink site. This produces near-square individual islands, and reasonably describes growth coalescence shapes for impinging islands. Atoms landing on top of islands diffuse until nucleating new islands in higher layers, or until reaching island edges. In the latter case, adatoms can hop down to lower layers, also with rate h, if the step edge is kinked, but with a reduced rate h′ < h for a straight close-packed step edge. Finally, we incorporate "downward funneling" of atoms deposited right at step edges to adsorption sites in lower layers. See Fig. 1 for a schematic of these processes.
Figure 1. Schematic of metal(100) homoepitaxy with irreversible island formation, indicating downward funneling deposition (F), nucleation, aggregation, terrace diffusion (h), and restricted downward transport (h′ < h). The square grid represents the lattice of substrate adsorption sites. Adatoms reaching island edges are moved immediately to nearby kink sites [9].
Thus, the model has only three rates, h, h′, and F. One would naturally apply a Bortz-type algorithm maintaining a list of all hopping atoms in all layers. Rather than maintain a separate list of atoms just above close-packed step edges which can hop down at a distinct reduced rate h′, it is easier to include them in a single list of "hoppers", but if hopping down is selected, then implement this process with probability p_down = h′/h < 1. One can determine h by matching the observed submonolayer island density, and h′ by matching, e.g., the second layer population after deposition of 1 ml. Corresponding activation barriers come from assuming an Arrhenius form with a prefactor of ∼10^13/s. Then, matching F to experiment, the model has no free parameters. How does it do? For Ag/Ag(100) homoepitaxy at 300 K, a purported classic case of smooth quasi-layer-by-layer growth, it predicts initial smooth growth up to ∼30 ml, but then extremely rapid roughening up to ∼1500 ml. For lower temperatures, initial growth is rougher (as expected), but growth of thicker films is smoother than at 300 K (contrasting expectations). These predictions are supported by recent experiments, i.e., the tailored model works!
3.2. Classically Exact Models and Algorithms (with Look-up Tables for Rates)
In contrast to tailored models, one could attempt to describe exactly adatom diffusion in all possible local environments during or after growth. Typically, the barrier for intra-layer diffusion will depend only on the occupancy of sites which are neighbors or next-neighbors to either the initial or final site of the hopping particle. For metal(100) surfaces represented by a square
lattice of adsorption sites, there are 10 such sites. Then, one should specify rates or barriers for 2^10 = 1024 possible local environments, ignoring symmetries [10, 11]. Thus, in the simulation algorithm, if hopping is chosen, one must assess the local environment of the selected adatom, and determine the relevant rate, which will be stored in a large look-up table. It is not possible to precisely determine so many barriers, and in fact film morphology may not be sensitive to the precise values of many of these: too low means the process is essentially instantaneous, too high means inoperative on the relevant time scale. For efficiency in simulation with look-up tables, it is reasonable to not implement processes with barriers above a certain threshold value, and perhaps to divide up all diffusing particles into a few classes (fast, medium, slow diffusers) for Bortz-type treatment [11]. This approach was introduced by Voter [10] to treat post-deposition diffusion of 2D islands in metal(100) homoepitaxial systems, and then adapted to treat film growth [11]. Originally, the values of barriers for rates were determined from Lennard–Jones or semi-empirical many-body potentials. Effort has been made to decompose this large set of diffusion processes into a few basic classes (which can aid simulation, as indicated above), and to develop reliable approximate formulae for barriers in various environments. Recently, at least a subset of key rates have been extracted from higher-level DFT calculations. However, we caution that even DFT may not have the accuracy to allow quantitative prediction of film morphologies.
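The look-up-table machinery itself is easy to sketch. In the following Python fragment (ours; the Arrhenius prefactor, the temperature, and the uniform placeholder barriers are assumptions for illustration), the occupancies of the ten relevant neighboring sites are packed into a 10-bit integer that indexes a precomputed table of 2^10 = 1024 rates:

```python
import numpy as np

NU0 = 1.0e13       # assumed Arrhenius attempt frequency, 1/s
KT = 0.026         # eV, roughly room temperature (assumed)

# Placeholder barrier table: a real code would fill these 2**10 entries from
# semi-empirical potentials or DFT, as discussed in the text.
barriers = np.full(2**10, 0.5)                 # eV, uniform placeholder values
rates = NU0 * np.exp(-barriers / KT)           # the look-up table of hop rates

def environment_index(occupancies):
    """Pack the occupancies (0/1) of the 10 relevant neighboring sites
    into a single integer index into the 1024-entry table."""
    idx = 0
    for bit, n in enumerate(occupancies):
        idx |= (n & 1) << bit
    return idx

occ = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]           # one example local environment
print(rates[environment_index(occ)])           # rate used if this hop is selected
```

Exploiting lattice symmetries, or grouping environments into the few fast/medium/slow classes mentioned above, would shrink this table considerably.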
3.3. Self-Teaching or On-the-Fly Algorithms
There are a vast number of possible local configurations and rates for diffusing adatoms, but how many of these are practically important? Usually, most of these processes are associated with diffusion and detachment of adatoms at step edges, and one has some idea as to which are the most dominant processes. Thus, one strategy is to start with a smaller look-up table containing these key rates. Then, run the simulation using these rates, and any time a new local environment is generated in which an atom attempts to hop, stop the simulation, calculate the rate, insert the configuration and barrier value into the table, and continue the simulation [12, 13]. This approach could even be utilized to search for possible many-atom concerted moves in addition to probing single-atom hops.
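In code, such a self-teaching scheme amounts to a lazily grown look-up table. The sketch below assumes a hypothetical (and expensive) compute_barrier routine, e.g., a saddle-point search with a semi-empirical potential or DFT; it is not part of the original description.

import math

NU0 = 1.0e13   # assumed prefactor (1/s)
KB = 8.617e-5  # Boltzmann constant (eV/K)

rate_cache = {}   # environment index -> rate, grown during the run

def rate_for(env_idx, T, compute_barrier):
    """Return the hop rate for a local environment; on the first encounter,
    pause to compute the barrier and insert it into the table."""
    if env_idx not in rate_cache:
        E = compute_barrier(env_idx)              # expensive, done once
        rate_cache[env_idx] = NU0 * math.exp(-E / (KB * T))
    return rate_cache[env_idx]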
3.4. Hybrid Algorithms
Many thin film deposition systems exhibit large characteristic lateral lengths (e.g., large island separations). Consequently, rather than atomistic
simulation of deposition and diffusion-mediated aggregation at island edges, it makes sense to adopt a continuum PDE description of the adatom density [14, 15]. The local nucleation rate can be determined from this density, and the nucleation process implemented stochastically with this rate. This approach has a significant advantage for reversible island formation with a high density of diffusing adatoms, as it is computationally expensive to follow all these particles in KMC. However, a continuum description of island growth can be problematic. Growth shapes are very sensitive to noise in the aggregation process for inefficient shape relaxation (the Mullins–Sekerka or DLA instability), and reliable continuum formulations are lacking for compact growth shapes due to efficient shape relaxation. Thus, it is natural to combine a continuum description of deposition, diffusion, and aggregation with an atomistic description of island shape evolution [16]. To grow islands, one tracks the cumulative total aggregation flux, and adds an atom when this reaches unity at a location chosen with a probability reflecting the local aggregation flux. Treatment of detachment from island edges is similar. Edge diffusion is treated atomistically as in a standard simulation.
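The bookkeeping for converting the continuum aggregation flux into discrete attachment events might look as follows; agg_flux and the island object's add_atom method are hypothetical interfaces to the continuum solver and to the atomistic shape representation, respectively.

import random

def attach_from_continuum(island, agg_flux, dt):
    """Accumulate the total aggregation flux onto one island; each time the
    cumulative flux reaches unity, add one atom at an edge site chosen with
    probability proportional to the local flux (agg_flux: site -> flux)."""
    island.cumulative += sum(agg_flux.values()) * dt
    while island.cumulative >= 1.0:
        island.cumulative -= 1.0
        sites = list(agg_flux)
        weights = [agg_flux[s] for s in sites]
        site = random.choices(sites, weights=weights)[0]
        island.add_atom(site)   # atomistic update of the island shape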
3.5. Other Algorithms
For island formation during deposition, island growth rates can be characterized precisely in terms of the areas of the “capture zones” (CZs) which surround islands [14]. Combining this CZ-based formulation of island growth with a reliable characterization of the spatial aspects of nucleation, e.g., as occurring primarily along CZ boundaries, one could imagine implementing a purely Geometry-Based Simulation (GBS) algorithm for island formation. As in the above hybrid approach, one here retains a stochastic component in the prescription of island nucleation [17]. Finally, we discuss tailored simulation algorithms for post-deposition coarsening of submonolayer island distributions in metal(100) homoepitaxial systems, where coarsening is mediated by the diffusion and coalescence of islands. Given the diffusion rates versus island size, one could develop the following simulation algorithm [18]: adopt a simple characterization of islands as squares of various sizes; let these undergo random walks with the appropriate diffusion rates; after each collision, replace the two islands by a single island so as to conserve total size (a minimal sketch of one such step is given below).
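The following sketch implements one step of this coarsening algorithm under the stated idealizations (square islands, size-dependent diffusion rate D_of_size, coalescence on collision); random_step and find_collision are hypothetical lattice helpers.

import math
import random

def coarsening_step(islands, D_of_size, random_step, find_collision, t):
    """islands: list of [position, size] squares; D_of_size(s): diffusion
    rate of an island versus its size s."""
    rates = [D_of_size(size) for _, size in islands]
    R = sum(rates)
    t += -math.log(random.random()) / R
    i = random.choices(range(len(islands)), weights=rates)[0]
    islands[i][0] = random_step(islands[i][0])     # move the selected square
    j = find_collision(islands, i)                 # overlapping island, if any
    if j is not None:                              # coalesce, conserving size
        islands[i][1] += islands[j][1]
        del islands[j]
    return t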
4. Simulation of Catalytic Surface Reactions
In catalytic surface reaction systems, the reactants are continually introduced as a gas above the surface. They adsorb (sometimes reversibly), usually diffuse across the surface, and react with coadsorbed species, producing
product(s) which desorb. The reactant and product species are continually removed from the system by pumping. Thus, one has an open system which might achieve a time-independent steady-state, but this is not a Gibbs state. In simple models, it may also be possible to develop absorbing (or poisoned) states where the surface is completely covered by some non-desorbing species [19]. In more complex models, one may develop oscillatory states, although fluctuations preclude perfect periodic behavior.
4.1. Basic Algorithms
If adsorption, desorption, diffusion, and reaction rates are comparable, then the basic KMC algorithm is effective. Consider the canonical monomer (A)–dimer (B2) reaction [19–21], which mimics CO-oxidation (A = CO and B2 = O2): A adsorbs reversibly at single empty sites; B2 adsorbs dissociatively and irreversibly at nearby pairs of empty sites; A may diffuse on the surface; adjacent A and B react to produce the product AB (= CO2), which immediately desorbs. For limited (non-reactive) desorption of A, upon increasing the adsorption rate of A relative to B2, one finds a discontinuous non-equilibrium phase transition from a reactive steady state with low A-coverage, θA−, to a nearly A-poisoned steady state with high θA+. This discontinuous transition disappears at a non-equilibrium critical point upon increasing the A desorption rate. See Fig. 2 for a schematic of the monomer–dimer reaction model and its steady-state behavior. As an aside, in the absence of desorption of A, this model exhibits a completely A-poisoned absorbing state [19]. From the general properties of finite-state Markov processes, any finite system must eventually evolve to such a state [3], while infinite systems can avoid such states indefinitely by remaining in other non-trivial steady states. Thus, KMC simulation must eventually reach such absorbing states (there are no true non-trivial steady states). However, in practice, this can take an immense amount of time, and the system resides in a pseudo-steady state which accurately reflects the true steady state of the corresponding infinite system.
Figure 2. Schematic of the monomer (A)–dimer (B2) surface reaction model which mimics CO-oxidation, indicating the adsorption rates pA and pB2, the A desorption rate dA, the A hop rate hA, and the reaction rate kAB. Also shown is the variation of the steady-state coverage of A, θA, with adsorption rate, pA: a discontinuous transition between θA− and θA+ for low hA and low dA, and bistability for high hA and low dA. Note the emergence of bistability with increasing hop rate, hA, of A.
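A minimal sketch of one adsorption attempt in this type of monomer-dimer model is given below; A desorption and A diffusion are omitted for brevity, and react is a hypothetical routine that removes an adjacent A–B pair (the product AB desorbs immediately).

import random

EMPTY, A, B = 0, 1, 2

def adsorption_attempt(lattice, L, yA, react):
    """One attempt on an L x L lattice: with probability yA try to adsorb A
    at a random site, otherwise try to adsorb B2 dissociatively at a random
    pair of adjacent empty sites (periodic boundaries assumed)."""
    i, j = random.randrange(L), random.randrange(L)
    if random.random() < yA:
        if lattice[i][j] == EMPTY:
            lattice[i][j] = A
            react(lattice, i, j)                    # A + adjacent B -> AB(g)
    else:
        di, dj = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        ni, nj = (i + di) % L, (j + dj) % L
        if lattice[i][j] == EMPTY and lattice[ni][nj] == EMPTY:
            lattice[i][j] = lattice[ni][nj] = B
            react(lattice, i, j)
            react(lattice, ni, nj)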
The above example illustrates that non-equilibrium steady states can exhibit phase transitions analogous to those of classic equilibrium systems. One cannot apply thermodynamic concepts geared to Hamiltonian systems, but KMC simulation combined with finite-size-scaling ideas borrowed from equilibrium theory is an effective tool to analyze their behavior. This remains true for more realistic reaction models which incorporate rapid diffusion of CO, and interactions between adsorbed CO and O, although refined algorithms are needed for efficient simulation [20].
4.2. “Constant-Coverage” Simulation Algorithms
In the conventional “constant-adsorption-rate” simulations of the above monomer-dimer model, if adsorption is selected as the process to be implemented, one chooses between attempting deposition of A or of B2 with probabilities reflecting their adsorption rates. A distinct “constant-coverage” simulation approach was suggested by Ziff and Brosilow [22]. Here, the structure of the conventional simulation algorithm is retained, except that now, if adsorption is selected, one attempts to adsorb A (B2) if the current coverage is below (above) some prescribed target “constant-coverage” value, θA*, say. Furthermore, during the simulation, one tracks the fraction of attempts to adsorb A (rather than B2). The long-time value of this fraction determines the A adsorption rate corresponding to the prescribed coverage θA*. Thus, it determines the adsorption rate exactly at the discontinuous transition if one chooses θA− < θA* < θA+. In summary, in conventional simulations of steady-state behavior, one prescribes the A adsorption rate and extracts the A coverage. In constant-coverage simulations, one prescribes the A coverage and extracts the A adsorption rate. Other variations are possible. Are the constant-adsorption-rate and constant-coverage simulations entirely consistent? Clearly, for conventional simulations in a small finite system, there are significant fluctuations in the steady-state A coverage. Such fluctuations are “artificially” removed in the constant-coverage approach, so one should also expect some differences in the mean values of various quantities. However, in the limit of large system size, where fluctuations in conventional simulations diminish, the two simulation approaches should converge.
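In code, the constant-coverage rule replaces the fixed adsorption probabilities by a feedback on the current coverage; the long-time fraction of A attempts then estimates the conjugate adsorption rate, as described above. A minimal sketch:

def constant_coverage_choice(theta_A, theta_target, counts):
    """Choose the species for an adsorption attempt: A if the current A
    coverage is below the target, B2 otherwise; `counts` accumulates the
    attempt statistics whose long-time A-fraction gives the A adsorption
    rate corresponding to theta_target."""
    species = 'A' if theta_A < theta_target else 'B2'
    counts[species] = counts.get(species, 0) + 1
    return species

# After a long run:
#   pA_at_target = counts['A'] / (counts['A'] + counts['B2'])
# Choosing theta_minus < theta_target < theta_plus locates the adsorption
# rate exactly at the discontinuous transition.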
4.3. Hybrid Algorithms
In “real” CO-oxidation or related reactions, the surface diffusion or hop rate for CO is often many orders of magnitude above other rates. Also, since removal of CO from the surface is not diffusion-limited, but reaction-limited, there is a significant build-up of rapidly hopping CO molecules. This makes
conventional simulation inefficient. However, rapid mobility and reaction-limited removal of CO also mean that the CO should be quasi-equilibrated within the complex geometry of the relatively immobile coadsorbed reactant O. This suggests a hybrid approach wherein the distribution of CO is described by some simple analytic equilibrium procedure, and the O distribution is described by conventional LG KMC simulation [20, 23]. Here, the reaction of a specific O to form CO2 is determined from the equilibrium probability of finding an adjacent CO. Next, we discuss application of the hybrid approach to the monomer–dimer reaction with infinitely mobile adsorbed A, which does not interact with other adsorbed A or B (other than through reaction with B). Now A will be randomly distributed on sites not occupied by B. Thus, in our hybrid simulation procedure, we track the locations of all adsorbed Bs with a LG simulation, but just the total number of adsorbed A. From this number, one can readily determine the (spatially uniform) probability that any non-B site is occupied by A, and thus determine reaction rates, etc. The most dramatic consequence of replacing finite mobility of A with infinite mobility is that the discontinuous transition described above is replaced by bistability, i.e., stable reactive and near-poisoned states coexist for a range of A adsorption rates [24]. Bistability is also obtained from a mean-field rate-equation treatment of the chemical kinetics. This is not surprising since mean-field equations apply to a well-stirred system (i.e., rapid surface diffusion). In this mean-field treatment, the two stable steady states are smoothly joined by a coexisting unstable state, all of which are readily determined from a steady-state rate-equation analysis. In our hybrid model, one expects that an unstable steady state may also exist. However, it will have a non-trivial distribution of adsorbed O, and cannot be readily analyzed by conventional (constant-adsorption-rate) simulations, for which the system will always evolve away from the unstable state. However, efficient analysis of the non-trivial unstable-state behavior is possible by simply implementing a constant-coverage version of the hybrid simulation code [20, 24]. By varying the target θA*, one maps out both stable and unstable steady states.
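The core simplification of this hybrid scheme is a one-line bookkeeping step, sketched below under the stated assumptions (infinitely mobile, non-interacting A): only the total A number is tracked, and the reaction rate for each O follows from a uniform occupation probability. For brevity, the sketch does not exclude neighbors that are themselves B.

def reaction_rate_per_O(n_A, n_sites, n_B, k_react, n_neighbors=4):
    """Rate for a given adsorbed O (=B) to react, when the n_A adsorbed A
    are randomly distributed over the n_sites - n_B non-B sites
    (square lattice assumed, hence 4 neighbors)."""
    p_A = n_A / (n_sites - n_B)   # uniform probability a non-B site holds A
    return k_react * n_neighbors * p_A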
5. Outlook
KMC simulation has proved a tremendously successful tool for analyzing and elucidating the evolution of non-equilibrium LG models for a broad variety of cooperative phenomena (not just in physical sciences). This approach will continue to be applied effectively to analyze more complex and realistic models in traditional areas of investigation, as well as in new areas of cooperative phenomena. Recent variations and hybrid algorithms show great promise not only in more efficiently connecting atomistic processes with
resulting behavior on far larger length scales, but just as significantly in providing fundamental insight into the key physics.
Acknowledgments

Prof. Evans is supported by the USDOE BES SciDAC Computational Chemistry program (simulation algorithms) and Division of Chemical Sciences (surface reactions), and by NSF Grant CHE-0414378 (thin films). This work was performed at Ames Laboratory, which is operated for the USDOE by Iowa State University under contract No. W-7405-Eng-82.
References

[1] A.F. Voter, F. Montalenti, and T.C. Germann, “Extending the time scale in atomistic simulation of materials,” Ann. Rev. Mater. Sci., 32, 321, 2002.
[2] T. Liggett, Interacting Particle Systems, Springer-Verlag, Berlin, 1985.
[3] N.G. Van Kampen, Stochastic Processes in Physics and Chemistry, North-Holland, Amsterdam, 1981.
[4] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
[5] J.W. Evans, “Random and cooperative sequential adsorption,” Rev. Mod. Phys., 65, 1281, 1993.
[6] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, “A new algorithm for Monte Carlo simulation of Ising spin systems,” J. Comp. Phys., 17, 10, 1975.
[7] M.C. Bartelt and J.W. Evans, “Nucleation and growth of square islands during deposition: sizes, coalescence, separations, and correlations,” Surf. Sci., 298, 421, 1993.
[8] Z. Zhang and M.G. Lagally (eds.), Morphological Organization in Epitaxial Growth and Removal, World Scientific, Singapore, 1998.
[9] K.J. Caspersen, A.R. Layson, C.R. Stoldt, V. Fournee, P.A. Thiel, and J.W. Evans, “Development and ordering of mounds in metal(100) homoepitaxy,” Phys. Rev. B, 65, 193407, 2002.
[10] A.F. Voter, “Classically exact overlayer dynamics: diffusion of rhodium clusters on Rh(100),” Phys. Rev. B, 34, 6819, 1986.
[11] H. Mehl, O. Biham, I. Furman, and M. Karimi, “Models for adatom diffusion on fcc(001) metal surfaces,” Phys. Rev. B, 60, 2106, 1999.
[12] G. Henkelman and H. Jonsson, “Long time scale kinetic Monte Carlo simulations without lattice approximation and predefined event table,” J. Chem. Phys., 115, 9657, 2001.
[13] O. Trushin, A. Kara, and T.S. Rahman, “A self-teaching KMC method,” to be published, 2005.
[14] M.C. Bartelt, A.K. Schmid, J.W. Evans, and R.Q. Hwang, “Island size and environment dependence of adatom capture: Cu/Co islands on Ru(0001),” Phys. Rev. Lett., 81, 1901, 1998.
[15] C. Ratsch, M.F. Gyure, R.E. Caflisch, F. Gibou, M. Petersen, M. Kang, J. Garcia, and D.D. Vvedensky, “Level set method for island dynamics in epitaxial growth,” Phys. Rev. B, 65, 195403, 2002.
[16] G. Russo, L.M. Sander, and P. Smereka, “Quasicontinuum Monte Carlo: a method for surface growth simulations,” Phys. Rev. B, 69, 121406, 2004.
[17] M. Li, M.C. Bartelt, and J.W. Evans, “Geometry-based simulation of submonolayer film growth,” Phys. Rev. B, 68, 121401, 2003.
[18] T.R. Mattsson, G. Mills, and H. Metiu, “A new method for simulating late stages of coarsening in island growth: the role of island diffusion and evaporation,” J. Chem. Phys., 110, 12151, 1999.
[19] R.M. Ziff, E. Gulari, and Y. Barshad, “Kinetic phase transitions in an irreversible surface reaction model,” Phys. Rev. Lett., 56, 2553, 1986.
[20] J.W. Evans, M. Tammaro, and D.-J. Liu, “From atomistic lattice-gas models for surface reactions to hydrodynamic reaction-diffusion equations,” Chaos, 12, 131, 2002.
[21] E. Loscar and E.Z. Albano, “Critical behavior of irreversible reaction systems,” Rep. Prog. Phys., 66, 1343, 2003.
[22] R.M. Ziff and B.J. Brosilow, “Investigation of the first-order transition in the A-B2 reaction model using a constant-coverage kinetic ensemble,” Phys. Rev. A, 46, 4630, 1992.
[23] M. Silverberg and A. Ben Shaul, “Adsorbate islanding in surface reactions: a combined Monte Carlo – lattice gas approach,” J. Chem. Phys., 87, 3178, 1989.
[24] M. Tammaro, M. Sabella, and J.W. Evans, “Hybrid treatment of spatiotemporal behavior in surface reactions with coexisting immobile and highly mobile reactants,” J. Chem. Phys., 103, 10277, 1995.
5.13 SIMPLE MODELS FOR NANOCRYSTAL GROWTH

Pablo Jensen
Laboratoire de Physique de la Matière Condensée et des Nanostructures, CNRS and Université Claude Bernard Lyon-1, 69622 Villeurbanne Cédex, France
1. Introduction
Growth of new materials with tailored properties is one of the most active research directions for physicists. As pointed out by Silvan Schweber in his brilliant analysis of the evolution of physics after World War II [1]: “An important transformation has taken place in physics: As had previously happened in chemistry, an ever larger fraction of the efforts in the field were being devoted to the study of novelty rather than to the elucidation of fundamental laws and interactions [. . .] The successes of quantum mechanics at the atomic level immediately made it clear to the more perspicacious physicists that the laws behind the phenomena had been apprehended, that they could therefore control the behavior of simple macroscopic systems and, more importantly, that they could create new structures, new objects and new phenomena [. . .] Condensed matter physics has indeed become the study of systems that have never before existed. Phenomena such as superconductivity are genuine novelties in the universe.” Among these new materials, those obtained as thin films are of outstanding importance. Indeed, the possibility of growing thin films with desired properties is at the heart of the electronics technological revolution (for a nice introduction to the history of that revolution, see Ref. [2]). Thin film technology combines the three precious advantages of miniaturization, assembly-line production (leading to low-cost materials) and growth flexibility (depositing different materials successively to grow complex devices). Recently, the search for smaller and smaller devices led to the new field of nanostructure growth, where one tries to obtain structures containing a few hundred atoms. As a consequence, an impressive number of deposition techniques has been developed to grow carefully controlled thin films and nanostructures from atomic deposition [3].
While most of these techniques are complex and keyed to specific applications, Molecular Beam Epitaxy (MBE) [4] has received much attention from physicists [5], mainly because of its (relative) simplicity. A younger technique, which seems promising for growing nanostructured materials with tailored properties, is cluster deposition. Here, instead of atoms, one uses a beam of preformed large “molecules” containing typically 10–2000 atoms, the clusters. This technique has been shown to produce films with properties different from those obtained with the usual atomic beams. It is reviewed in Ref. [6] and will be considered no further in this short chapter. Due to the technological impetus, a tremendous amount of both experimental and theoretical work has been carried out in this field, and it is impossible to summarize every aspect of it here. I will therefore concentrate on simple models adapted to understanding the first stages of growth (the submonolayer regime).
2. Nanostructures: Why and How
As argued in the Introduction, the miniaturization logic naturally leads to trying to grow devices at the nanometer scale. This domain is very fashionable nowadays, and the interested reader can find several information sources: for a simple and enjoyable introduction to the progressive miniaturization of electronics devices, see Ref. [7]. For more technical discussions, see for example Refs. [8, 9] and the journals entirely devoted to this field [10]. The reader is also referred to the enormous number of World Wide Web pages, especially those quoted in Ref. [13]. Besides the obvious advantages of miniaturization (for device speed and density on a chip), it has been argued [9] that the (magnetic, optical and mechanical) properties of nanostructured films can be intrinsically different from those of their macrocrystalline counterparts. For example, recent studies of the mechanical deformation properties of nanocrystalline copper [11] have shown that high strain can be reached before the appearance of plastic deformation, thanks to the high density of grain boundaries. Nanoparticles are also interesting as model catalysts [12]. The usual technology to grow thin films is deposition of atoms from a vapor onto a flat substrate. This technique was mainly used to grow relatively thick films (thickness typically larger than 100 nm). Recent developments with MBE have made it possible to control the growth at the atomic level, and for several materials it is possible to grow atomically flat surfaces over many micrometers. The same is true for interfaces in multilayer films, which are interesting for applications in electronics and magnetism. I refer the reader interested in the techniques and applications of atom deposition to several reviews [3]. I will focus here on a particular direction: the control of the submonolayer regime, i.e., before deposition of a single monolayer. There are two main
interests: from the fundamental point of view, this regime allows a clearer determination of the atomic processes present during growth (the “elementary” processes to be described below). The models presented in this paper are useful in this regime and have made it possible to understand and quantify many aspects of this regime of growth. One can also justify the study of the formation of the first layer by the fact that it serves as a template for the subsequent growth of the film [14, 15]. To grow a periodic array of nanometer islands of well-defined sizes, a promising direction seems to be the growth of strained islands by heteroepitaxy, stress being an ordering force which can lead to order [16]. However, growth in the presence of elastic forces is beyond the capabilities of the present models, which only take into account some of their effects (see below and Ref. [17]). Therefore I will not discuss this important subfield further. There are already many good reviews on atomic deposition with different emphases: for a simple introduction, see Ref. [5]; for more technical presentations, see Refs. [18–21]. One can also find a comprehensive compilation of measurements and analysis of atomic diffusion [22], or ones more specific to metal surfaces [23], to metal atoms deposited on amorphous substrates [24], or on oxides [25].
3. Models of Atom Deposition

3.1. Introduction to Kinetic Monte Carlo Simulations
Given an experimental system, how can one predict the growth characteristics for a given set of parameters (substrate temperature, incoming flux of particles . . . )?
3.1.1. A bad idea: molecular dynamics simulations

A first idea – the “brute-force” approach – would be to run a molecular dynamics simulation (see Ref. [26]). It should be clear however that such an approach is bound to fail, since the calculation time is far too large. The problem is that there is an intrinsically large time scale in the growth problem: the mean time needed to fill a significant fraction of the substrate with the incident atoms. An estimate of this time is fixed by t_ML, the time needed to fill a monolayer: t_ML ≈ 1/F, where F is the atom flux expressed in monolayers per second (ML/s). Typically, the experimental values of the flux are lower than 1 ML/s, leading to t_ML ≥ 1 s. Therefore, there is a time span of about 13 decades between the typical vibration time of an atom (approximately given
by the Debye frequency, 10^−13 s, the lower time scale for the simulations) and t_ML, rendering hopeless any “brute-force” approach.
3.1.2. Choosing clever elementary processes

To reduce this time span, a more clever approach is needed. The idea is to “coarsen” the description by using elementary processes such as those sketched in Fig. 1. This idea is similar to the usual renormalization technique, but here one hides the shortest times (as one hides the highest energies) in “effective” parameters (see [1, 27] for a simple discussion of this point). For a discussion of the most relevant elementary processes for atomic deposition, see below and [18]. The rates of the different processes could in principle be calculated using empirical or ab initio potentials, or be taken as parameters in the analysis. However, given the high number of possible processes, it is more convenient to choose only some of them in the analysis. The advantage of this approach is that using a limited number of elementary processes makes it possible to understand in detail their respective roles in determining the growth characteristics. Moreover, a model with too many parameters can reproduce almost any experiment, and it is then dubious that meaningful comparisons can be obtained. The drawback of the “elementary processes” approach is that before interpreting an experiment in the framework of one of these models, one has to be sure that no processes other than those chosen are present, for otherwise the interpretation could be meaningless. The case-in-point example warning against a too rapid interpretation of experiments by elementary processes is
Figure 1. Main elementary processes considered in this paper for the growth of films by atom deposition: (a) adsorption of an atom by deposition; (b) and (d) diffusion of the isolated atoms on the substrate; (c) formation of an island of two monomers by juxtaposition of two monomers (nucleation); (d) growth of a supported island by incorporation of a diffusing atom; (e) evaporation of an adsorbed atom.
the growth in the Pt/Pt(111) system. The initial experimental observations by the Comsa group had been thoroughly interpreted with a variety of elementary processes, only to discover, after several years, that the experimental results were determined by an unexpected process, not included in any of the simulations: contamination by CO adsorbates... See the full story in Ref. [28]. A simple physical rationale for choosing only a limited set of parameters is the following (see Fig. 2). For any given system, there will be a “hierarchy” of time scales, and the relevant ones for a growth experiment are those much smaller than t_ML ≈ 1/F. The others are too slow to act and can be neglected. The problem is that which processes are relevant depends on the precise system under study. For example, for typical metal-on-metal systems, the evaporation time is larger than the time needed to break a single bond. Thus, evaporation can be neglected in the analysis even at high temperatures where atoms can detach from islands. For metal atoms deposited on some insulating surfaces, the contrary might be true: since the bond between an adatom and the substrate may be weaker than the bond between two metal adatoms,
Figure 2. Time scales of some elementary processes considered in this paper for the growth of films by atomic deposition (characteristic times, from longest to shortest: diffusion inside the substrate, island diffusion, detachment, evaporation, edge diffusion, and diffusion on the substrate). The relevant processes are those whose time scales are smaller than the deposition time scale 1/F, shown by the arrow on the left. In this case, models including only atom diffusion on the substrate and along the island (or step) edges are appropriate.
evaporation from the substrate occurs even at low temperatures for which islands are still stable and there is no adatom detachment.
3.1.3. Combining the elementary processes: Kinetic Monte Carlo

Now, given a set of elementary processes, there are two possibilities for predicting the growth. The oldest is to write “rate-equations” which describe in a mean-field way the effect of these processes on the number of isolated atoms (called monomers) and islands of a given size. The first author to attempt such an approach for growth was Zinsmeister [29] in 1966, but the general approach is similar to the old rate-equations first used by Smoluchovsky for particle aggregation [30]. Recently, Bales and Chrzan [35] have developed a more sophisticated self-consistent rate-equations approach which gives better results and makes it possible to justify many of the approximations made in the past. However, these analytical approaches are mean-field in nature and cannot reproduce all the characteristics of the growth. Two known examples are the island morphology and the island size distribution (see [35], and also recent developments to improve the mean-field approach in [31]). There is an alternative approach to predicting the growth: Kinetic Monte Carlo (KMC) simulations. Here one simply implements the chosen processes in a computer program with their respective rates and lets the computer simulate the growth. KMC simulations are an exact way to reproduce the growth, in the sense that they avoid any mean-field approximation. Given the calculation speed of present-day computers, systems containing up to 4000 × 4000 lattice sites can be simulated in a reasonable time (a few hours), which limits the finite-size effects usually observed in this kind of simulation. Let me now discuss in some detail the way KMC simulations are implemented to reproduce the growth, once a set of processes has been defined, with their respective rates ν_pro taking arbitrary values or being derived from known potentials. There are two main points to discuss here: the physical correctness of the dynamics and the calculation speed. Concerning the first point, it should be noted that, originally [32], Monte Carlo simulations aimed at the description of the equations of state of a system. The MC method then performs a “time” averaging of a model with (often artificial) stochastic kinetics: time plays the role of a label characterizing the sequential order of states, and need not be related to physical time. One should therefore be cautious about the precise Monte Carlo scheme used for the simulation when attempting to describe the kinetics of a system, as in KMC simulations. Note that the KMC approach is fundamentally different from the usual Monte Carlo algorithm, where one looks for the equilibrium properties of a system, using the energy differences of the different configurations. Instead, in KMC, one is interested in the kinetics, using the different energy barriers for the transitions between the different configurations.
Let me now address the important problem of calculation speed. One could naively think of choosing a time interval Δt smaller than all the relevant times in the problem, and then repeat the following procedure:

(1) choose one atom randomly;
(2) choose randomly one of the possible processes for this atom;
(3) calculate the probability p_pro of this process happening during the time interval Δt (p_pro = ν_pro Δt);
(4) throw a random number p_r and compare it with p_pro: if p_r < p_pro, perform the process; if not, go to the next step;
(5) increase the time by Δt and go to (1).

This procedure leads to the correct kinetic evolution of the system but can be extremely slow if there is a large range of probabilities p_pro for the different processes (and therefore some p_pro ≪ 1). The reason is that a significant fraction of the loops leads to rejected moves, i.e., to no evolution at all of the system. Instead, Bortz et al. [33] have proposed a clever approach to eliminate all the rejected moves and thus reduce dramatically the computational time. The point is to choose not the atoms but the processes, according to their respective rates ν_pro and the number of possible ways of performing each process (called Ω_pro). This procedure can be represented schematically as follows (a minimal implementation is sketched after the next paragraph):

(1) update the list of the possible ways Ω_pro of performing every possible process;
(2) randomly select one of the processes, weighting the probability of selection by the process rate ν_pro and Ω_pro: p_pro = ν_pro Ω_pro / Σ_processes ν_pro Ω_pro;
(3) randomly select an atom for performing this process;
(4) move the atom;
(5) increase the time by dt = (Σ_processes ν_pro Ω_pro)^−1;
(6) go to (1).
This procedure implies a less intuitive increment of time, and one has to constantly create (and update) a list of all the Ω_pro, but the acceleration of the calculations is worth the effort. A serious limitation of KMC approaches is that one has to assume a finite number of local environments to obtain a finite number of parameters. This confines KMC approaches to regular lattices, thus preventing a rigorous consideration of elastic relaxation, stress effects, and everything else that affects not only the number of first or second nearest neighbors but also their precise positions. Indeed, considering the precise positions as in MD simulations introduces a continuous variable and leads to an infinite number of possible configurations or processes. Stress effects can be introduced approximately in KMC simulations [17] by allowing a variation of the bonding energy of an atom to an island as
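Here is the promised minimal sketch of one rejection-free (Bortz-type) step, following the second list above. The `perform` argument is a hypothetical routine that executes the move and updates the Ω_pro lists; the symbol Ω_pro appears in the code as len(ways[p]).

import math
import random

def bortz_step(rates, ways, perform, t):
    """rates[p]: rate nu_pro of process p; ways[p]: list of the Omega_pro
    possible realizations of p (e.g., the atoms that can undergo it)."""
    weights = {p: rates[p] * len(w) for p, w in ways.items() if w}
    R = sum(weights.values())
    x = random.random() * R
    for p, w in weights.items():   # select a process with prob. nu*Omega/R
        x -= w
        if x <= 0.0:
            break
    realization = random.choice(ways[p])   # one realization, uniformly
    perform(p, realization)                # move the atom, update the lists
    return t + 1.0 / R                     # time increment dt = 1/R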
a function of the island size (the stress depending on the size), but it is unclear how meaningful these approaches are.
3.2. Basic Ingredients of the Growth
What is likely to occur when atoms are deposited on a surface? I will analyze in detail the following elementary processes: deposition, diffusion and evaporation of the atoms, and their interaction on the surface (Fig. 1). The influence of surface defects which could act as traps for the atoms is also addressed. The first ingredient of the growth, deposition, is quantified by the flux F, i.e., the number of atoms that are deposited on the surface per unit area and unit time. The flux is usually uniform in time, but in some experimental situations it can be pulsed, i.e., change from a constant value to 0 over a given period. Chopping the flux can affect the growth of the film significantly [36]. The second ingredient is the diffusion of the atoms which have reached the substrate. I assume that the diffusion is Brownian, i.e., the atom undergoes a random walk on the substrate. To quantify the diffusion, one can use either the usual diffusion coefficient D or the diffusion time τ, i.e., the time needed by an atom to move by one diameter. These two quantities are connected by D ∼ d²/(4τ), where d is the hop length. The diffusion is here supposed to occur on a perfect substrate. Real surfaces always present some defects such as steps [37], vacancies or adsorbed chemical impurities. The presence of these defects on the surface can significantly alter the diffusion of the atoms and therefore the growth of the film. A third process which could be present in growth is re-evaporation of the atoms from the substrate after a time τ_e. It is useful to define X_S = √(Dτ_e), the mean diffusion length on the substrate before desorption. The last simple process I will consider is the interaction between atoms. The simplest case is when (a) atoms ignore each other as long as they are not immediate neighbors, and (b) atoms attach irreversibly upon contact. Point (a), commonplace in all simulations until recently, has been challenged by precise calculations of the potential felt by an atom approaching another atom or an island [38]. It has been shown that, for some systems, past the short range, a repulsive ring is formed around the adatoms (Fig. 3). The magnitude of the repulsion can be comparable to the diffusion barrier. Therefore, not taking this repulsive effect into account can lead to island densities much larger than experimentally observed. It remains to be seen how general this repulsive ring is. Point (b) is not correct at high temperatures, because atom-atom bonds can be broken. This situation is discussed in Section 4.2. The usual game for theoreticians is to combine these elementary processes and predict the growth of the film. However, experimentalists are interested in
Figure 3. Arrhenius plot of the island density N_X (in ML) as a function of inverse temperature 1/T, for an impermeable repulsive ring (ε_R → ∞, squares), a DFT-based KMC model including the repulsion (circles), a simplified KMC model including a repulsive ring with ε_R = 25 meV (diamonds), and nucleation theory, not including the repulsion effect (triangles). After Ref. [38].
the reverse strategy: from (a set of) experimental results, they wish to understand which elementary processes are actually present in their growth experiments and what the magnitude of each of them is (this is what physicists call “understanding a phenomenon”). The problem, of course, is that with so many processes, many combinations will reproduce the same experiments. Then, some clever guesses are needed to first identify which processes are present. I gave several hints in a previous review [6] and will not address this question in detail here.
4. Predicting Growth with Computer Simulations
“Classical” studies [19] have focused on the evolution of the concentration of islands on the surface as a function of time, and especially on the
saturation island density, i.e., the maximum of the island density observed before reaching a continuous film. The reason is of course that it can both be calculated from rate-equations and measured experimentally by conventional microscopy. I will show other interesting quantities, such as island size distributions, which are measurable experimentally and have recently been calculated by computer simulations [40, 41]. Since I am only interested in the submonolayer regime, there is no need to take into account atoms falling on preexisting islands, except for the asymptotic case of strong evaporation discussed in Ref. [40]. Most metal-on-metal growth corresponds to this case, while metal on insulating surfaces grows by forming 3d islands (this is called the Volmer–Weber growth mode, see for example Ref. [42]).
4.1. Two Dimensional Growth: Irreversible Aggregation
I first study the formation of the islands in the limiting case of irreversible aggregation, under two growth hypotheses: negligible or important evaporation.
4.1.1. Complete condensation

Let me start with the simplest case, where only diffusion takes place on a perfect substrate (no evaporation). Figure 4a shows the evolution of the monomer (i.e., isolated atom) and island densities as a function of deposition time. We see that the monomer density rapidly grows, leading to a rapid increase of the island density by monomer-monomer encounters on the surface. This goes on until the islands occupy a significant fraction of the surface, roughly 0.1%. Then, the islands capture the monomers efficiently, and the monomer density decreases. As a consequence, it becomes less probable to create more islands, and their number increases more slowly. When the coverage reaches a value close to 15%, coalescence starts to decrease the number of islands. The maximum number of islands at saturation, N_sat, is thus reached for coverages around 15%. Concerning the dependence of N_sat on the model parameters, it has been shown that the maximum number of islands per unit area formed on the surface scales as N_sat ∼ (F/D)^1/3 [19]. Simulations [6, 35, 39] and theoretical analysis [34] have shown (Fig. 6) that the precise relation is N_sat = 0.53(Fτ)^0.36 for the ramified islands produced by pure juxtaposition. This relation is very important since it allows one, from an experimental measurement of N_sat, to determine the value of τ (F is generally known), provided one knows that the simple hypotheses made are appropriate for the experiments. To show that this limiting case is not only of theoretical interest, let me show an experimental example. Thanks to a technological innovation, a scanning
Figure 4. Evolution of the monomer and island densities as a function of the thickness (in monolayers), for islands formed by irreversible aggregation: (a) complete condensation, F = 10^−8, τ_e = 10^10, τ = 1 (leading to X_S = 10^5 and ℓ_CC = 22); (b) important evaporation, F = 10^−8, τ_e = 600 (τ = 1) (X_S = 25 and ℓ_CC = 22). ℓ_CC represents the mean island separation at saturation for the given fluxes when there is no evaporation [40]. The length units correspond to the atomic diameter. In (b) the “condensation” curve represents the total number of particles actually present on the surface divided by the total number of particles sent onto the surface (Ft). It would be 1 in the complete condensation case, neglecting the monomers that are deposited on top of the islands. The solid line represents the constant value expected for the monomer concentration (equal to Fτ_e).
tunneling microscope operating at very low temperatures, a group at Lausanne University could observe, for the first time, the beginning of the growth of a film at the atomic scale [43].
4.1.2. Evaporation What happens when evaporation is also included? Figure 4b shows that now the monomer density becomes roughly a constant, since it is now mainly determined by the balancing of deposition and evaporation. As expected, the constant concentration equals Fτe (solid line). The number of islands increases linearly with time, since the island creation rate is given by the probability of atom-atom encounter, which is roughly proportional to the square atom concentration. We also notice that only a small fraction (1/100) of the monomers do effectively remain on the substrate, as shown by the low condensation
Figure 5. Comparison of the morphologies of experimental (silver atoms deposited on platinum, a–c) and KMC-predicted (d–f) submonolayer films of different thicknesses (see text). These figures show a small portion of the surface, 160 atomic diameters wide. To fit the experimental results, we had to take the following rates for the elementary processes: a diffusion hop every 2 ms, with thirty atoms deposited every second on this square.
coefficient value at early times. This can be understood by noting that the islands grow by capturing only the monomers that are deposited within their “capture zone” (the region between two circles of radius R and R + X_S). The other monomers evaporate before reaching the islands. As in the case of complete condensation, when the islands occupy a significant fraction of the surface, they capture the monomers rapidly. This has two effects: the monomer density starts to decrease, and the condensation coefficient starts to increase. Shortly after, the island density saturates and starts to decrease because of island-island coalescence. Figure 6 shows the evolution of the maximum island density in the presence of evaporation. A detailed analysis of the effect of monomer evaporation on the growth is given in Ref. [40], where the regime of “direct impingement”, which arises when X_S ≤ 1, is also discussed: islands are formed by direct impingement of incident atoms as first neighbors of adatoms, and grow by direct impingement of adatoms on the island boundary. An experimental observation of the evaporation regime can be found in Ref. [45].
Figure 6. Saturation island density N_sat (per site) as a function of the flux for the different growth hypotheses indicated on the figure, always in the case of island growth by irreversible aggregation. “no evap” (circles) means complete condensation. Triangles show the densities obtained if τ_e = 100τ. In the preceding cases, islands are supposed to be immobile. This hypothesis is relaxed for the last set of data, “mobile islands” (squares), where island mobility is supposed to decrease as the inverse island size [39] (there is no evaporation). The dashed line is an extrapolation of the data for the low normalized fluxes. Fits of the different curves in the low-flux region give: “no evap” (solid line): N_sat = 0.53(Fτ)^0.36; “evap” (dotted line): N_sat = 0.26 F^0.67 τ^−1/3 τ_e (for the τ and τ_e exponents, see [40]); and “mobile islands” (dashed line): N_sat = 0.33(Fτ)^0.42.
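As a usage note, the fit N_sat = 0.53(Fτ)^0.36 quoted above can be inverted to extract the diffusion time τ from a measured saturation island density, as the text suggests. A minimal sketch; the numerical values in the example are illustrative only.

def tau_from_Nsat(Nsat, F, prefactor=0.53, exponent=0.36):
    """Invert Nsat = 0.53*(F*tau)**0.36 (complete condensation,
    irreversible aggregation) to estimate the diffusion time tau;
    F in ML/s, Nsat per site, tau in seconds."""
    return (Nsat / prefactor) ** (1.0 / exponent) / F

# Illustrative example: Nsat = 1e-3 per site measured at F = 1e-2 ML/s
# tau = tau_from_Nsat(1e-3, 1e-2)   # about 3e-6 s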
4.2. Reversible Aggregation
Previous results were obtained by assuming that atom-atom aggregation is irreversible. It is physically clear that at high temperatures atoms can detach from islands, and this has to be included in the models. The rate-equations approach [19] introduces a critical size i* defined as follows: islands containing up to i* atoms decay, while larger islands are stable. This means that only the concentration of sub-critical islands is in equilibrium with the gas of monomers. The concept of a critical size was adopted for practical reasons (it simplifies the mathematical treatment), even if the macroscopic thermodynamical notions implicitly employed are difficult to justify for such small systems [18]. A more
satisfactory approach was developed with the help of KMC simulations [41, 46]: instead of defining a critical size arbitrarily, one uses binding energies for atoms and studies which islands grow and which decay. KMC simulations have shown that the morphology of the submonolayer films changes dramatically from ramified to compact islands as the ratio of bond energy to substrate temperature is varied (Fig. 7), and that the critical size is ill defined, the control parameter being the ratio of the dimer dissociation rate to the rate of adatom capture by dimers [41, 46]:

λ = (N_2/τ_1) / (DρN_2)    (1)
where τ_1 is the mean time for a dimer to dissociate, D is the diffusion constant for monomers, and ρ and N_2 represent the densities of adatoms and dimers, respectively. The case λ ∼ 0 represents irreversible aggregation, whereas large λ values mean that islands can dissociate easily. In the case of reversible atomic aggregation, the goal is to determine the aggregation parameter λ (defined in Eq. 1). This can be done in several ways [41]:

(1) by studying the flux dependence of N_sat: the exponent depends on λ;
(2) by measuring the island size distribution, which also uniquely depends on λ;
(3) by measuring the nucleation rate and studying its dependence on the incident flux.
Figure 7. Morphology of the films obtained with reversible aggregation for atomic deposition with different atom-atom bond energies. The temperature is fixed at 400 K, the activation energy for diffusion of isolated atoms at 0.45 eV, the flux at 1 ML/s and the thickness at 0.03 ML. The bond energies are: (a) 0.5 eV, (b) 0.2 eV and (c) 0.1 eV.
Simple models for nanocrystal growth
1783
Once λ has been found, it is in principle possible to extract the microscopic parameters, even if in practice uncertainties remain because of the limited amount of experimental data generally available and the high number of fit parameters (for examples of such fits, see [15, 41, 46]).
5. Conclusion
Modeling crystal growth is a rapidly evolving field. This is due to rapid developments on the experimental side: near-field microscopy (for example, scanning tunneling microscopy) and control of the growth conditions (low temperature, vacuum). Thanks to all these improvements, experiments can be carried out on “theoretical” surfaces, namely carefully controlled surfaces similar to those that theoreticians can study. On the theoretical side, better algorithms to combine the different growth ingredients have been developed, and we now have better methods to predict atom-atom interactions (mainly the ab initio approach). For a recent informal review, see [47]. Many challenges remain, however: predicting, from atomistic-level simulations, the behavior of the system on a macroscopic scale, which is difficult mainly when several intermediate scales are relevant (for example if elastic interactions are important); and predicting, from precise simulations carried out over static configurations or, at best, nanoseconds, the behavior of a system over seconds or hours. These are challenges not only for surface science but also for physics in general (modeling of brittle or ductile fracture, ageing phenomena, ...), which leaves some hope that other fields will help us solve our problems!
References

[1] S.S. Schweber, Physics Today, p. 34, November, 1993.
[2] M. Riordan and L. Hoddeson, Crystal Fire: The Invention of the Transistor and the Birth of the Information Age, Sloan Technology Series, W.W. Norton, 1998.
[3] For an introduction to this enormous field, see for example: F. Rosenberger, Fundamentals of Crystal Growth, Springer, 1979; J.A. Venables, Surf. Sci., 299/300, 798, 1994.
[4] M.A. Herman and H. Sitter, Molecular Beam Epitaxy, Springer-Verlag, Berlin, 1989.
[5] M. Lagally, Physics Today, 46(11), 24, 1993; Z. Zhang and M.G. Lagally, Science, 276, 377, 1997; P. Jensen, La Recherche, 283, 42, 1996; A.-L. Barabási and H.E. Stanley, Fractal Concepts in Surface Growth, Cambridge University Press, 1995; A. Pimpinelli and J. Villain, Physics of Crystal Growth, Cambridge University Press, 1998.
[6] P. Jensen, Rev. Mod. Phys., 71, 1695, 1999.
[7] R. Turton, The Quantum Dot, W.H. Freeman and Company Ltd., 1995.
[8] H. Gleiter, Nanostructured Materials, 1, 1, 1992.
[9] G.W. Nieman, J.R. Weertman, and R.W. Siegel, J. Mater. Res., 6, 1012, 1991.
[10] Nanostructured Materials, Pergamon Press; Physica E, Elsevier Science; NanoLetters, American Chemical Society; Materials Today, www.materialstoday.com is a free magazine, which often deals with nanoscience.
[11] J. Schiotz, T. Rasmussen, K.W. Jacobsen et al., Phil. Mag. Lett., 74, 339, 1996.
[12] C.R. Henry, Surf. Sci. Rep., 31, 235, 1998.
[13] http://nano.gov; http://nanoweb.mit.edu; http://www.nano.org.uk
[14] M.C. Bartelt and J.W. Evans, Phys. Rev. Lett., 75, 4250, 1995.
[15] J.W. Evans and M.C. Bartelt, In: Surface Diffusion and Collective Processes, M.C. Tringides (ed.), Plenum, New York, 1997.
[16] H. Brune et al., Phys. Rev. B, 52, R14380, 1995; H. Ibach, Surf. Sci. Rep., 29, 195, 1997.
[17] M. Schroeder and D.E. Wolf, Surf. Sci., 375, 129, 1997; C. Ratsch et al., Phys. Rev. B, 55, 6750, 1997.
[18] S.-L. Chang and P.A. Thiel, Critical Reviews in Surface Chemistry, 3, 239–296, 1994.
[19] J.A. Venables, G.D.T. Spiller, and M. Hanbücken, Rep. Prog. Phys., 47, 399, 1984. Note that some of the growth regimes predicted in this paper have been shown to be wrong (see [40]).
[20] G.L. Kellogg, Surf. Sci. Rep., 21, 1, 1994.
[21] H. Brune, Surf. Sci. Rep., 31, 125, 1998.
[22] E.G. Seebauer and C.E. Allen, Prog. Surf. Sci., 49, 265, 1995.
[23] R. Gomer, Rep. Prog. Phys., 53, 917, 1990.
[24] A.A. Schmidt, H. Eggers, K. Herwig et al., Surf. Sci., 349, 301, 1996.
[25] C.T. Campbell, Surf. Sci. Rep., 27, 1, 1994.
[26] D. Frenkel and B. Smit, Understanding Molecular Simulation, Academic Press, 1996; M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[27] P. Jensen, Phys. Today, July, 58, 1998.
[28] P. Feibelman, Phys. Rev. B, 60, 4972, 1999; J. Wu et al., Phys. Rev. Lett., 89, 146103, 2002.
[29] G. Zinsmeister, Vacuum, 16, 529, 1966; Thin Solid Films, 2, 497, 1968; Thin Solid Films, 4, 363, 1969; Thin Solid Films, 7, 51, 1971.
[30] M. Smoluchovsky, Phys. Z., 17, 557 and 585, 1916.
[31] J.W. Evans and M.C. Bartelt, Phys. Rev. B, 66, 235410, 2002.
[32] N. Metropolis et al., J. Chem. Phys., 21, 1087, 1953.
[33] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, J. Comp. Phys., 17, 10, 1975.
[34] J. Villain, A. Pimpinelli, L.-H. Tang et al., J. Phys. I France, 2, 2107, 1992; J. Villain, A. Pimpinelli, and D.E. Wolf, Comments Cond. Mat. Phys., 16, 1, 1992.
[35] G.S. Bales and D.C. Chrzan, Phys. Rev. B, 50, 6057, 1994.
[36] P. Jensen and B. Niemeyer, Surf. Sci. Lett., 384, 823, 1997.
[37] H.-C. Jeong and E.D. Williams, Surf. Sci. Rep., 34, 171, 1999.
[38] K.A. Fichthorn and M. Scheffler, Phys. Rev. Lett., 84, 5371, 2000.
[39] P. Jensen, A.-L. Barabási, H. Larralde et al., Nature, 368, 22, 1994; Phys. Rev. B, 50, 15316, 1994.
[40] P. Jensen, H. Larralde, and A. Pimpinelli, Phys. Rev. B, 55, 2556, 1997. Note that in this paper a mistake was made in the normalization of the island size distributions (Fig. 9); this mistake is corrected in [6].
[41] C. Ratsch, P. Smilauer, A. Zangwill et al., Surf. Sci. Lett., 329, L599, 1995.
[42] A. Zangwill, Physics at Surfaces, Cambridge University Press, Cambridge, 1988.
[43] H. Röder et al., Nature, 366, 141, 1993.
[44] P. Jensen, Entrer en matière: les atomes expliquent-ils le monde? (in French), Seuil, 2001.
[45] R. Anton and P. Kreutzer, Phys. Rev. B, 61, 16077, 2000.
[46] M.C. Bartelt, L.S. Perkins, and J.W. Evans, Surf. Sci. Lett., 344, L1193, 1995.
[47] P.J. Feibelman, J. Vac. Sci. Tech., A21, S64, 2003.
5.14 DIFFUSION IN SOLIDS

Göran Wahnström
Chalmers University of Technology and Göteborg University, Materials and Surface Theory, SE-412 96 Göteborg, Sweden
A knowledge of diffusion in solids is necessary in order to describe the kinetics of various solid state reactions such as phase transformations, creep, annealing, precipitation, oxidation, corrosion, etc., all fundamental processes in materials science. There are two main approaches to diffusion in solids [1–5]: (i) the atomistic approach, where the atomic nature of the diffusing entities is explicitly considered; and (ii) the continuum approach, where the diffusing entities are treated as a continuous medium and the atomic nature of the diffusion process is ignored. Many useful results and general relations can be obtained within the continuum approach, but a more complete picture is obtained if the atomic motions are considered. Macroscopic quantities, such as diffusion fluxes, can then be related to microscopic quantities, such as atomic jump frequencies. Knowledge of how atoms move in solids is also intimately connected with the study of defects in solids.
1. The Diffusion Equation
In the continuum approach the diffusion coefficient D is introduced through Fick's law, which expresses the flux of particles j(r, t) in terms of the gradient of the particle concentration n(r, t) at the same position r and time t:

j(r, t) = −D∇n(r, t)    (1)

To arrive at the standard diffusion equation, Fick's law is combined with the equation describing the conservation of particles,

∂n(r, t)/∂t + ∇ · j(r, t) = 0    (2)

which implies that

∂n(r, t)/∂t = D∇²n(r, t)    (3)
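As an illustration, Eq. (3) can be integrated numerically by an explicit finite-difference scheme. The one-dimensional sketch below is a minimal example; the stability condition DΔt/Δx² ≤ 1/2 and the zero-flux boundaries are standard choices, not taken from the text.

import numpy as np

def diffuse_1d(n0, D, dx, dt, steps):
    """Explicit finite-difference integration of dn/dt = D d2n/dx2."""
    r = D * dt / dx**2
    assert r <= 0.5, "time step too large for the explicit scheme"
    n = n0.astype(float)
    for _ in range(steps):
        n[1:-1] += r * (n[2:] - 2.0 * n[1:-1] + n[:-2])
        n[0], n[-1] = n[1], n[-2]      # zero-flux (Neumann) boundaries
    return n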
We have here assumed that D itself is independent of concentration. The solution to this equation is obtained by exploiting the Fourier transform method. It can be written in the form

n(r, t) = (4πDt)^{−3/2} ∫ n₀(r′) e^{−(r−r′)²/4Dt} dr′    (4)

where n₀(r) = n(r, t = 0) is the initial particle concentration. If boundary conditions have to be specified at finite distances, the Fourier series expansion method has to be used.

Several different diffusion coefficients can be defined. The tracer or self-diffusion coefficient Ds describes the diffusive behavior of a given, or tagged, particle. Experimentally, it can be measured using a small amount of radioactive isotopes. The density of the tagged particle is described by the probability p(r, t)dr to find the particle at time t in the volume element dr at r, and is given by

∂p(r, t)/∂t = Ds ∇²p(r, t)    (5)

which is identical to Eq. (3) except that D is replaced by Ds. The probability to find the tagged particle at position r at time t, given that it was located at r = 0 at time t = 0, can be obtained from the general solution (4), i.e.,

p(r, t) = (4πDs t)^{−3/2} e^{−r²/4Ds t}    (6)

This Gaussian function describes the diffusive spreading of the probability distribution. The width is equal to the mean squared displacement of the tagged particle motion, ⟨R²(t)⟩ = 6Ds t, and can be used as a definition of the self-diffusion coefficient,

Ds = ⟨R²(t)⟩ / 6t    (7)

where the notation ⟨· · ·⟩ is used for the averaging procedure. Equation (5) is based on the assumption that the motion is diffusive. For short times, the particle motion deviates from purely diffusive behavior and Eq. (7) becomes invalid. Therefore, the definition of Ds should be supplemented with the condition that t > τ₀, where τ₀ is a suitable microscopic time-scale.

The various diffusion coefficients depend on the thermodynamic variables, i.e., temperature, pressure and composition. It is well known that diffusion coefficients in solids generally depend rather strongly on temperature, being very low at low temperatures but appreciable at high temperatures. Empirically, this dependence can often be described by the Arrhenius formula

D = D₀ e^{−Ea/kB T}    (8)

where D₀ is commonly referred to as the pre-exponential factor and Ea the activation energy for diffusion.
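In practice, the two Arrhenius parameters are extracted from measured diffusion coefficients by a linear fit of ln D versus 1/T. A minimal sketch in Python (the numerical values are invented for illustration; NumPy is assumed):

```python
import numpy as np

kB = 8.617e-5  # Boltzmann constant (eV/K)

# Hypothetical measured diffusion coefficients (cm^2/s) at temperatures (K)
T = np.array([700.0, 800.0, 900.0, 1000.0, 1100.0])
D = np.array([2.1e-14, 1.2e-12, 3.0e-11, 3.9e-10, 3.2e-9])

# D = D0 exp(-Ea / kB T) becomes linear after taking the logarithm:
# ln D = ln D0 - (Ea / kB) * (1 / T)
slope, intercept = np.polyfit(1.0 / T, np.log(D), 1)
Ea = -slope * kB        # activation energy (eV)
D0 = np.exp(intercept)  # pre-exponential factor (cm^2/s)
print(f"Ea = {Ea:.2f} eV, D0 = {D0:.2e} cm^2/s")
```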
2. The Continuum Approach
In the general case the situation can be quite complicated [1–3]. In a multi-component system one has to introduce material fluxes for each component i,

jᵢ(r, t) = −Σⱼ Dᵢⱼ ∇nⱼ(r, t)    (9)

The gradient in the concentration of one species may contribute to the flux of another species, described by the off-diagonal components of Dᵢⱼ. The diffusion coefficient is a function of composition as well as of temperature and pressure (or, more generally, stress). If a temperature or pressure gradient is present, that may also introduce material fluxes, and Fick's law of diffusion has to be generalized.

Thermodynamic equilibrium demands not only that temperature and pressure be the same throughout a system but also that the chemical potential be everywhere the same. Therefore, the gradient of the chemical potential should enter a more general description of diffusion. The theory of non-equilibrium thermodynamics is used to derive the general formalism for diffusion [3]. The theory puts different phenomenological diffusion treatments together into a coherent structure. It is a linear theory and expresses the fluxes of the different species Jᵢ in terms of suitably defined forces Xⱼ acting on these species, according to

Jᵢ = Σⱼ Lᵢⱼ Xⱼ    (10)

where the phenomenological coefficients Lᵢⱼ are the basic kinetic parameters in the theory. In general, they will be functions of the usual thermodynamic variables, but they are independent of the forces Xⱼ. An important theorem, the Onsager reciprocity theorem, states that the matrix L is symmetric, i.e., Lᵢⱼ = Lⱼᵢ. This relation derives from the underlying atomic dynamics of the system and ultimately from the principle of detailed balance in statistical mechanics. In an isothermal, isobaric system the appropriate force is the gradient of the chemical potential, Xⱼ = −∇µⱼ, and the corresponding transport coefficient Lᵢⱼ is related, but not equal, to the diffusion coefficient Dᵢⱼ. For instance, although by Onsager's theorem Lᵢⱼ = Lⱼᵢ, it does not follow that Dᵢⱼ = Dⱼᵢ. In a non-isothermal system the equations must also include the heat flow Jq and a corresponding thermal force Xq = −∇T/T. The corresponding set of coupled diffusion equations is derived by supplementing Eq. (10) with the particle conservation law.

Numerical software packages for the solution of multi-component diffusion equations have been developed [6]. An important application is the simulation of diffusion-controlled transformations in alloys of practical importance. Necessary input is kinetic and thermodynamic data. These are derived by collecting and selecting experimental data from the literature. The progress of various solid state phase transformations can then be simulated. In Fig. 1 a result from a diffusion simulation produced by the software package DICTRA [6] is shown.
Figure 1. Simulated carbon concentration profile (weight-percent C versus distance) in a weld between two steels with initially similar carbon but different silicon contents, after 13 days at T = 1323 K; experimental points from L.S. Darken, AIME 180 (1948) 430–438 (see http://www.thermocalc.com/Products/Dictra.html).
3. The Atomic Mechanism of Diffusion

The continuum approach is phenomenological. It does not give information on the nature of the diffusive motion. In order to describe the diffusion phenomena properly, a knowledge of the underlying atomic mechanisms is required. Atoms in a solid vibrate around their equilibrium positions. Occasionally these oscillations become large enough to allow an atom to change site. It is these jumps from one site to another which give rise to diffusion in a solid.

The atomic jumps in a solid are rare on a microscopic time scale. The self-diffusion coefficient is about 10⁻⁸ cm²/s near the melting point in most close-packed metals. The lattice spacing is of the order of 10⁻⁸ cm, which implies, using Eq. (7), that the atoms change site about 10⁷ times per second. This should be compared with the vibrational frequency, which is 10¹³–10¹⁴ per second. Thus, even near the melting point, the atom spends the great majority of the time oscillating about its equilibrium position in the crystal. It changes site only on one oscillation in 10⁴ or 10⁵.
There are two common mechanisms by which atoms can diffuse through a crystalline solid, the vacancy and the interstitial mechanism. These are schematically illustrated in Fig. 2. For bulk diffusion in close-packed metals the vacancy mechanism is most important. Near the melting point the vacancy concentration is about 10⁻³–10⁻⁴ site fraction in most metals. These vacancies allow the atoms to move, and this mechanism operates in most cases with jumps to nearest neighbor sites, or also to next nearest neighbor sites in bcc crystals. At high temperatures vacancy aggregates such as divacancies may be present and influence the diffusivity. Curvature in the Arrhenius plot of self-diffusion is commonly interpreted as resulting from a monovacancy jump process at low temperatures with an increasing contribution from a divacancy jump process at higher temperatures [1]. That interpretation has recently been questioned based on computer simulations, and it is argued that the curvature could equally well be interpreted by a single vacancy mechanism with a temperature-dependent activation energy [7]. At high temperatures interstitials may also be present, but due to their high formation energy these defects are in most cases assumed to give no contribution at equilibrium. Substitutional atoms usually also diffuse by the vacancy mechanism. Other mechanisms, such as various exchange mechanisms, have been suggested [1]. At present there is no experimental support for any such mechanisms in crystalline metals and alloys. However, in disordered solids these more cooperative motions are more likely to be operating. In the interstitial mechanism the atoms move from interstitial site to interstitial site. Usually small interstitial atoms, like hydrogen or carbon atoms in metals, diffuse through the lattice by this mechanism. The surrounding solvent atoms are not greatly displaced from their normal lattice sites. If the interstitial atom is nearly equal in size to the lattice atoms, diffusion is more likely to occur by the interstitialcy mechanism [1]. Here the interstitial atom does not move directly to another interstitial site. Instead it moves into a normal lattice site, and the atom that was originally at the lattice site is pushed into a neighboring interstitial site.
Figure 2. Mechanisms of diffusion in crystals: (a) the vacancy mechanism; (b) the interstitial mechanism.
4. The Random Walk Model
The aim of the random walk model is to describe the observed macroscopic diffusion in terms of the atomic jumps, which are the elementary processes in diffusion. It has been noted that the atomic jumps in a solid are rare on a microscopic time scale. The actual duration of an atomic jump is, however, short and can be neglected compared with the mean residence time at a site. This justifies an assumption of randomness of the atomic jumps. On the other hand, the total number of jumps over a period of hours or days is immense, about 10⁸ each second, and a statistical treatment becomes justified. In the random walk models these aspects of the diffusive motion are taken into account.

Consider a random walk on a simple cubic lattice with lattice spacing a. We assume that all sites are equally available and that the diffusing entities perform a series of uncorrelated jumps, i.e., we assume that interaction between diffusing entities and correlation effects can be neglected. If the jump vector for the ith jump is denoted by sᵢ, the total displacement after N jumps can be written as

R_N = Σ_{i=1}^{N} sᵢ    (11)

From symmetry considerations the mean displacement will be zero, ⟨R_N⟩ = 0, while the mean-squared displacement is proportional to the number of jumps,

⟨R_N²⟩ = Σ_{i=1}^{N} Σ_{j=1}^{N} ⟨sᵢ · sⱼ⟩ = Σᵢ ⟨sᵢ · sᵢ⟩ + Σ_{i≠j} ⟨sᵢ · sⱼ⟩ = N a²    (12)

where the last equality follows from the fact that we have assumed the jumps to be uncorrelated, Σ_{i≠j} ⟨sᵢ · sⱼ⟩ = 0. In many situations this is not the case and the analysis becomes much more complicated [5]. We can also write this in terms of the jump rate k between two neighboring sites,

⟨R_N²⟩ = k a² t    (13)

The jump rate is related to the mean residence time τ at a site according to 1/τ = nk, where n is the number of nearest neighboring sites. Furthermore, it can be related to the self-diffusion coefficient by comparing with Eq. (7), i.e.,

Ds = a² k / 6    (14)

This very simple random walk model can be extended in many different directions [8], and the more complicated models are most often solved using the numerical Monte Carlo (MC) simulation technique.
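A minimal sketch of such an MC calculation, for the simplest case treated above (uncorrelated nearest-neighbor jumps on a simple cubic lattice; all parameter values arbitrary), recovers Ds from the mean-squared displacement via Eq. (7):

```python
import numpy as np

rng = np.random.default_rng(0)

a = 1.0            # lattice spacing (arbitrary units)
dt_jump = 1.0      # time per jump (arbitrary units)
n_walkers = 2000   # independent random walkers
n_jumps = 500      # jumps per walker

# The six nearest-neighbor jump vectors of the simple cubic lattice
moves = a * np.array([[1, 0, 0], [-1, 0, 0],
                      [0, 1, 0], [0, -1, 0],
                      [0, 0, 1], [0, 0, -1]])

# Uncorrelated jumps: each step is drawn independently, so Eq. (12) applies
steps = moves[rng.integers(0, 6, size=(n_walkers, n_jumps))]
R = steps.sum(axis=1)                  # total displacement R_N of each walker
msd = np.mean(np.sum(R**2, axis=1))    # <R_N^2>, expected to equal N a^2

t = n_jumps * dt_jump
Ds = msd / (6.0 * t)                   # Eq. (7)
print(f"<R^2> = {msd:.1f} (N a^2 = {n_jumps * a**2:.0f}), Ds = {Ds:.4f}")
```

More elaborate MC models add site-dependent jump rates, blocking by other diffusing particles, or correlated jump sequences, all of which destroy the simple result ⟨R_N²⟩ = Na² [5, 8].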
The random walk modeling can also be generalized by writing down the equation for the rate of change of the probability distribution directly. We obtain the following rate equation, or Master equation,

∂p(rᵢ, t)/∂t = Σⱼ [ k_{j→i} p(rⱼ, t) − k_{i→j} p(rᵢ, t) ]    (15)

with k_{i→j} equal to the transition rate from i to j. If nearest-neighbor jumps with the same jump rate are assumed, it simplifies to

∂p(r, t)/∂t = (1/nτ) Σ_k [ p(r + a_k, t) − p(r, t) ]    (16)

where a_k is the set of vectors which connect a site with its nearest neighboring sites. This equation gives more detailed spatial information on the diffusive motion compared with the ordinary diffusion equation (5). To recover the latter equation we may expand the probability distribution around r and use the symmetry. The diffusion equation (5) is then obtained with Ds = (1/6)(a²/τn). Equation (16) is most easily solved in Fourier space. We obtain

I^s(q, t) ≡ ∫ dr e^{iq·r} p(r, t) = e^{−Δ(q)t}    (17)

and

S^s(q, ω) ≡ (1/2π) ∫ dt e^{−iωt} I^s(q, t) = (1/π) Δ(q) / (ω² + Δ²(q))    (18)

with

Δ(q) = (1/nτ) Σ_k (1 − e^{−iq·a_k})    (19)
Quasi-elastic neutron scattering can be used to study diffusion [3]. In that case the incoherent scattering cross-section is directly related to S^s(q, ω), and by determining the width of the quasi-elastic peak as a function of the scattering wave-vector, a very detailed description of the diffusive motion may be obtained. In practice only relatively fast-diffusing atoms can be studied with neutrons. Interstitial solutions of hydrogen in metals and fast ion conductors are among those which have been extensively studied in this way. The same quantities can also be obtained using the numerical molecular-dynamics (MD) simulation technique. In Fig. 3 results from an MD simulation of hydrogen diffusion in palladium are compared with quasi-elastic neutron scattering data [9]. The width of the quasi-elastic peak is shown as a function of wave-vector. The temperature is 623 K, and a classical description of the hydrogen motion should be quite reasonable. The simulation data agree with experiments provided that energy dissipation to both the lattice vibrations and the electron excitations is taken into account.
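To give a feel for Eq. (19), the sketch below evaluates Δ(q)τ numerically for a simple cubic lattice along [100]. This is only an illustration of the formula with arbitrary units (the H/Pd system of Fig. 3 involves a different interstitial lattice, so the numbers are not comparable to the figure):

```python
import numpy as np

tau = 1.0  # mean residence time at a site (arbitrary units)
a = 1.0    # lattice spacing (arbitrary units)

# Nearest-neighbor vectors a_k of a simple cubic lattice (n = 6)
a_k = a * np.array([[1, 0, 0], [-1, 0, 0],
                    [0, 1, 0], [0, -1, 0],
                    [0, 0, 1], [0, 0, -1]])
n = len(a_k)

def Delta(q):
    """Quasi-elastic half-width, Eq. (19); imaginary parts cancel by symmetry."""
    return np.sum(1.0 - np.exp(-1j * a_k @ q)).real / (n * tau)

for aq in [0.1, 1.0, 2.0, np.pi]:
    q = np.array([aq / a, 0.0, 0.0])     # wave vector along [100]
    print(f"aq = {aq:5.2f}:  Delta*tau = {Delta(q) * tau:.4f}")
# At small q the width grows as q^2 (diffusive limit); near the zone
# boundary it saturates, reflecting the discreteness of the jumps.
```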
Figure 3. The dimensionless half-width (Δ ≡ Δ(q)a²/Ds) of the quasi-elastic peak as a function of wave-vector, in units of a, along (a) the [100] direction and (b) the [110] direction at T = 623 K. Symbols: quasi-elastic neutron scattering data (•); molecular-dynamics simulation data with coupling to lattice vibrations; and molecular-dynamics simulation data with coupling to lattice vibrations and electronic excitations. The dotted line shows the results from Eq. (19). Reprinted with permission from Ref. [9]. Copyright (1992) by the American Physical Society.
5. The Atomic Jump Frequency
The random walk model relates the atomic jumps to the macroscopic diffusion phenomena. An understanding of the parameters entering the expression for the atomic jump frequency and related quantities is therefore of great interest. Direct calculations of those parameters are important, in particular if accurate calculations can be performed without fitting to experimental data, so-called first-principles or ab initio calculations.

In vacancy and interstitial diffusion the diffusion coefficient will depend on the concentration of defects and the atomic jump frequency k. In vacancy diffusion the relevant jump frequency is that of an atom into an adjacent vacancy, and in interstitial diffusion it is the jump rate between different interstitial sites. Using equilibrium statistical mechanics, the defect concentration can be expressed in terms of formation enthalpies and entropies. The atomic jump frequency k is most often approximated using absolute rate theory, or transition state theory, according to

k = (kB T / h) (Q^# / Q)    (20)

where Q and Q^# are the statistical mechanical partition functions evaluated for the system at a stable site and at the transition site, respectively. The transition site is defined as the hypersurface separating two stable sites.
Assuming harmonic lattice vibrations, Vineyard [10] derived the following expression for the transition rate,

k = [ Π_{j=1}^{N} νⱼ / Π_{j=1}^{N−1} νⱼ* ] e^{−ΔE/kB T}    (21)

where the activation energy ΔE (cf. Eq. (8)) is the energy difference between the system located at a stable site and at the transition site or saddle-point configuration. The νⱼ are the N normal mode frequencies of the entire system at the stable site and the νⱼ* the N − 1 normal mode frequencies of the system constrained to the transition site.

The various parameters entering the expressions for the defect concentration and the jump frequencies can be evaluated from first principles. In particular, density functional theory has been applied extensively. Dynamics and finite temperature effects have also been considered from first principles. In Fig. 4 we show the result from such a calculation [7]. It is found that for aluminum mono-vacancy diffusion alone dominates over diffusion due to divacancies and interstitials at all temperatures up to the melting point. The calculated diffusion rate agrees with experimental data over 11 orders of magnitude.
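A direct transcription of Eq. (21) is straightforward once the normal mode frequencies and the activation energy are available (e.g., from a density-functional calculation). The sketch below uses invented frequencies purely for illustration:

```python
import numpy as np

kB = 8.617e-5  # Boltzmann constant (eV/K)

def vineyard_rate(nu_min, nu_saddle, dE, T):
    """Harmonic transition-state-theory jump rate, Eq. (21).

    nu_min    : N normal mode frequencies at the stable site (Hz)
    nu_saddle : N-1 real frequencies at the saddle point (Hz)
    dE        : activation energy (eV); T : temperature (K)
    """
    # Evaluate the ratio of products as exp(sum of log-ratios) for
    # numerical stability when N is large
    log_prefactor = np.sum(np.log(nu_min)) - np.sum(np.log(nu_saddle))
    return np.exp(log_prefactor) * np.exp(-dE / (kB * T))

# Hypothetical 3-mode example; the unstable saddle mode is excluded
nu_min = np.array([8.0e12, 6.0e12, 6.0e12])
nu_saddle = np.array([7.0e12, 5.0e12])
print(f"k = {vineyard_rate(nu_min, nu_saddle, dE=0.6, T=900.0):.3e} jumps/s")
```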
Figure 4. Temperature dependence of the self-diffusion coefficient in aluminum as a function of the inverse temperature. Open and filled circles are experimental (tracer and NMR) data and the lines ("present work") are from molecular-dynamics simulations. The inset shows calculated diffusion coefficients for vacancies (v), divacancies (2v), and interstitials (i). The contribution from divacancies and interstitials is less than 1% of that from mono-vacancies at the melting temperature. Reprinted with permission from Ref. [7]. Copyright (2002) by the American Physical Society.
6. Outlook
In the past, diffusion studies have been dominated by various experimental techniques and the development of the theoretical description. Software has been developed for accurate simulation of diffusion in solids based on experimental input. More recently, ab initio computations and computer simulations have gained in importance. The first-principles or ab initio methods can be used to gain insight and to obtain data for various elementary properties related to diffusion. If the diffusivity is high, the MD simulation technique can be used to study diffusion in a very direct way. It provides well-controlled "experiments" and allows a proper check of the validity of the various theoretical descriptions. The method requires a description of the inter-atomic interaction as input, and if that is sufficiently reliable the method provides a fairly reliable substitute for actual experiments. The Monte Carlo simulation technique can also be used to study diffusion. In that case a model for the kinetic description has to be established. The method is particularly useful for the study of diffusion in complex systems, like concentrated alloys and disordered materials. To conclude: it is not unlikely that the present period of diffusion studies will be characterized as the computational period.
References

[1] J.L. Bocquet, G. Brebec, and Y. Limoge, "Diffusion in metals and alloys," in: R.W. Cahn and P. Haasen (eds.), Physical Metallurgy, 4th edn., Elsevier Science BV, Amsterdam, pp. 535–668, 1996.
[2] C.P. Flynn, Point Defects and Diffusion, Clarendon Press, Oxford, 1972.
[3] A.R. Allnatt and A.B. Lidiard, Atomic Transport in Solids, Cambridge University Press, 1993.
[4] P. Shewmon, Diffusion in Solids, 2nd edn., The Minerals, Metals and Materials Society, Pennsylvania, 1989.
[5] J.R. Manning, Diffusion Kinetics for Atoms in Crystals, D. van Nostrand, Princeton, 1968.
[6] A. Borgenstam, A. Engström, L. Höglund et al., "DICTRA, a tool for simulation of diffusional transformations in alloys," J. Phase Equilibria, 21, 269, 2000.
[7] N. Sandberg, B. Magyari-Köpe, and T.R. Mattsson, "Self-diffusion rates in Al from combined first-principles and model-potential calculations," Phys. Rev. Lett., 89, 065901, 2002.
[8] J.W. Haus and K.W. Kehr, "Diffusion in regular and disordered lattices," Phys. Rep., 150, pp. 263–416, 1983.
[9] Y. Li and G. Wahnström, "Nonadiabatic effects in hydrogen diffusion in metals," Phys. Rev. Lett., 68, 3444, 1992.
[10] G.H. Vineyard, "Frequency factors and isotope effects in solid state processes," J. Phys. Chem. Solids, 3, 121, 1957.
5.15 KINETIC THEORY AND SIMULATION OF SINGLE-CHANNEL WATER TRANSPORT

Emad Tajkhorshid, Fangqiang Zhu, and Klaus Schulten
Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Water translocation between various compartments of a system is a fundamental process in the biology of all living cells and in a wide variety of technological problems. The process is of interest in different fields of physiology, physical chemistry, and physics, and many scientists have tried to describe the process through physical models. Owing to advances in computer simulation of molecular processes at an atomic level, water transport has been studied in a variety of molecular systems ranging from biological water channels to artificial nanotubes. While simulations have successfully described various kinetic aspects of water transport, offering a simple, unified model to describe trans-channel translocation of water has turned out to be a nontrivial task.

Owing to its small molecular size and its high concentration in the environment, water is able to achieve significant permeation rates through different membranes, including biological cell membranes, which are primarily composed of lipid bilayers. As such, water is generally exchangeable between various compartments of living organisms. However, due to the hydrophobic nature of the core of lipid bilayers, high permeation rates can only be achieved through additional pores in the bilayer that increase the permeability of water. These pores, known as channels, are primarily formed by folding and aggregation of one or more polypeptide chains inside the membrane. Aquaporins (AQPs) are the most prominent family of biological channels that facilitate transport of water across membranes in a selective manner. Other porins and channels also allow water molecules to pass, but they are either nonselective channels or mainly used for transport of other substrates, i.e., water is co-transported with other substrates through these channels.
Water permeation through biological channels, such as AQPs, has been the subject of theoretical and experimental studies for many years [1]. Molecular dynamics (MD) simulations provide an ideal tool for investigating water transport through channels [2–5], since the movement of every single water molecule can be closely monitored in the simulations. A large body of evidence, including the recently solved structures of these channels and extensive MD simulations, has indicated that the pore region of selective water channels confines water molecules to a single-file configuration, in which a highly correlated motion of neighboring, hydrogen-bonded water molecules governs the rate of diffusion and permeation of water through the channel. A very similar behavior of water has been reported in artificial water channels formed by carbon nanotubes (CNTs). This chapter presents a detailed description of water motion and permeation through water channels, through a comprehensive survey of the theory associated with single-channel water transport, methodologies developed to simulate such events, and a comparison of experimental and calculated observables. The main objective is to provide the reader with a clear description of experimentally measurable properties of water channels. Our description links these properties to the microscopic structure and dynamics of channels. We show how observables like channel permeabilities can be examined by computer simulations, and we present a mathematical theory of single-channel water transport.
1. Structurally Known Biological Water Channels
AQPs are a family of membrane water channels for which crystallographic structures are available. They are present in nearly all life forms. In humans, AQPs have been found in multiple tissues, such as the kidneys, the eye, and the brain. They form homo-tetramers in cell membranes, each monomer forming a functionally independent water pore, which does not conduct protons, ions or other charged solutes (Fig. 1a). A fifth pore, formed in the center of the tetramer, has been proposed to conduct ions under certain circumstances [6]. However, passive transport of water across cell membranes remains the main established physiological function of AQPs. Atomic resolution structures of aquaporin-1 (AQP1) [7–9] and the E. coli glycerol channel (GlpF) [10] have been employed in MD simulations characterizing the structure–function relationship of these channels, in particular regarding their selectivity [2–4, 11–14].
Figure 1. (a) Top view of a tetrameric AQP surrounded by lipid molecules of a membrane. Each monomer constitutes an independent water pore. (b) An array of CNTs as a simplified model for the study of single-channel water transport.
2. Nanotubes as Simple Models of Water Channels

Synthetic pore-forming molecules, such as nanotubes, have attracted a great deal of attention recently. Due to their chemical simplicity, these artificial channels have been the subject of numerous experimental and theoretical studies [15, 16]. Simulation studies have employed CNTs as models for complicated biological channels, as they can be investigated more readily by MD simulations [17, 18] due to their simplicity, stability, and small size (Fig. 1b). Biological water channels are much more complex than CNTs, with irregular surfaces and highly inhomogeneous charge distributions. For example, MD simulations have revealed that water molecules in AQPs adopt a bipolar orientation which is induced electrostatically and is linked to the need that proton conduction must be prevented in AQP channels [3]. CNTs are electrically neutral, and may not reproduce some important features of biological channels. However, one may modify CNTs through the introduction of charges [18] to mimic various aspects of biological water channels. Computational studies have suggested that CNTs can be designed as molecular channels to transport water. Single-walled CNTs (with a diameter of 8.1 Å) have been studied recently by MD simulations. Simulations revealed that the CNTs spontaneously fill with a single file of water molecules and that water diffuses through the tube concertedly at a fast rate.
3. Experimental Measurement of Transmembrane Water Transport
The key characteristics accounting for transport through water channels are the osmotic permeability (pf) and the diffusion permeability (pd) [1], both measurable experimentally. pf is measured through application of osmotic pressure differences, while pd is measured through isotope labeling, e.g., the use of heavy water. In this section, we explain how water transport is characterized experimentally, and what the most important properties used to characterize the rate of transport of water through channels are. We will introduce and define pf and pd of water channels and, in particular, investigate the relationship between the two for single-file water channels.

When the solutions on the two sides of a membrane have different concentrations of an impermeable solute, water flows from the low-concentration side to the other side. In dilute solutions, the net water flux through a single water channel is linearly proportional to the solute concentration difference:

jW = pf ΔCS,    (1)

where ΔCS (mol/cm³) is the concentration difference of the impermeable solute between the two reservoirs connected by the channel, jW (mol/s) is the net molar water flux through the channel, and pf (cm³/s) is defined as the osmotic permeability of the channel [1].

In contrast, no net water flux is expected in equilibrium, i.e., when no solute concentration difference is present. It is, however, still of interest to study water diffusion through the channels for ΔCS = 0. For this purpose, experiments have been designed where a fraction of water molecules is labeled, e.g., by isotopic replacement or by monitoring nuclear spin states, so that they can be traced. Assuming that the interactions of these so-called tracers with the membrane and with other water molecules are identical to those of normal water molecules, tracers can be used to study diffusion of water molecules through channels at equilibrium conditions. When the reservoirs on the two sides of a membrane have different concentrations of tracers, a diffusional tracer flux will be established down the concentration gradient, although the average net water flux (consisting of both tracers and normal water molecules) remains zero. The tracer flux jtr (mol/s) through a single channel is linearly proportional to the tracer concentration difference ΔCtr (mol/cm³):

jtr = pd ΔCtr,    (2)

where pd (cm³/s) is defined as the diffusion permeability of the channel [1] (Fig. 2).
Figure 2. Schematic presentation of experimental procedures to measure diffusion and osmotic permeability of channels. (Top) Addition of an impermeable solute to one side of the channel establishes a chemical potential difference of water that drives water transport (jW) to the solute-rich side. (Bottom) In the absence of a chemical potential difference of water across the channel, labeled water molecules (tracers) can be used to monitor random diffusion of water (jtr) from one side to the other side of the channel.
Different experimental techniques are used for the measurement of Pf and Pd [19]. It is important to note that, due to difficulties in measuring water transport through single channels, almost all of the experimental setups measure water permeation through a membrane, and the measured permeabilities (Pf and Pd, with capital P) are those of the entire membrane. To obtain single-channel permeabilities, pf or pd, one needs to know the density of the channels in the membrane, i.e., the number of channels per unit area. However, the ratio pf/pd can be measured without knowledge of the channel density [20]. Pd is measured in the absence of a chemical potential difference of water (balanced osmotic/hydrostatic pressure on the two sides of the membrane). There is no net transport of water under these conditions. In order to monitor
random translocation of water molecules from one side to the other side of the membrane, special water molecules (tracers) are needed. Isotopic water (such as ³H₂O) or water molecules with different nuclear spin states can be used for this purpose. Immediately after introduction of tracers, the tracer concentrations of the two reservoirs are monitored directly or indirectly over time [19], and Pd can be determined from the decay rate of the concentration difference. Pf is usually measured in the presence of an osmotic pressure difference, i.e., a difference in solute concentration. Typical Pf measurements are performed on cells or liposomes (small lipid vesicles with embedded water channels), by exploiting the stopped-flow technique. In this setup, the solute concentration of the extracellular solution is suddenly changed, resulting in volume changes of the cells (or vesicles) due to the net water flux. The volume change can be inferred by monitoring light scattering from the suspension [21], and the net water flux determined from the rate of volume change. Pf for a planar membrane can be determined by measuring the ionic concentration distribution near the surfaces of the membrane [22].
4. Theory of Single-file Water Transport
The theory and derivations presented in this section closely follow Zhu et al. [5, 23]. We define a permeation event as a complete transport of a water molecule through the channel from one reservoir to the other. Let q0 be the average number of such permeation events in one direction per unit time; the number of permeation events in either direction should be identical, resulting in a total number of 2q0. q0 is an intrinsic property of a water channel and is independent of tracer concentration. Let us assume that one reservoir has a tracer concentration of Ctr, and (for the sake of convenience) that the other reservoir has zero tracer concentration. The ratio of tracers to all water molecules in the first reservoir is Ctr/CW, where CW = 1/VW is the concentration of water, and VW (18 cm³/mol) is the molar volume of water, which is usually assumed to be constant. Since according to our assumption tracers move just like normal water molecules, the same proportion (i.e., Ctr/CW) should characterize water molecules permeating the channel. Consequently, the tracer flux can be related to the total number of water molecules permeating the channel (q0) by jtr = (1/NA) q0 (Ctr/CW), where NA is Avogadro's number. Therefore, pd and q0 are related by a constant factor:

pd = (VW/NA) q0 = vW q0,    (3)

where vW = VW/NA is the average volume of a single water molecule.
Within narrow channels water molecules form a single file, and their movement along the channel axis accordingly is highly correlated. Recently, a continuous-time random-walk (CTRW) model was proposed [24] to describe the transport of single-file water in channels. This model assumes that the channel is always occupied by N water molecules, and the whole water file moves in hops (translocations that shift all water molecules by the distance separating two neighboring water molecules) simultaneously and concertedly, with leftward and rightward hopping rates kl and kr, respectively. In equilibrium, kl and kr have the same value, denoted as k0. Due to strong coupling between the water molecules, local effects (energetic barriers arising from interaction with certain parts of the channel wall, access resistance at channel entrances, etc.) contribute to the hopping rate of the whole water file. Consequently, all factors affecting the kinetics of water movement are effectively integrated into this single parameter (k0). In the following, we will show that both pd and pf can be predicted by this model, in terms of N and k0.

Since the complete permeation of a water molecule from one end of the channel to the other end includes at least N + 1 hops (shifts) of the single file, one expects the rate of permeation events at equilibrium to be smaller than the hopping rate. Indeed, the number of uni-directional permeation events per unit time, q0, is given by

q0 = k0 / (N + 1).    (4)

Equation (4) has been proven from kinetics [24] as well as using a state diagram [18], and its validity was verified by MD simulations of CNTs [17, 18]. Combining Eqs. (3) and (4), pd can be expressed as:

pd = vW k0 / (N + 1).    (5)

pf is measured when a net water flux is induced by different solute concentrations in two reservoirs. In this case, the chemical potentials of water in the two reservoirs are different (the difference denoted as Δµ). Consequently, the hopping rates (kr and kl) of the two directions are no longer the same. We note that the yield of a hop is the transfer of one water molecule from one reservoir to the other, resulting in a free energy change of Δµ in the system. In analogy to the forward and backward rates of a chemical reaction, the ratio of kr to kl can be expressed by [25]:

kr/kl = exp(−Δµ/kB T),    (6)

where kB is the Boltzmann constant and T is the temperature. We note now that kr and kl are both functions of Δµ/kB T. Since under physiological conditions Δµ is much smaller than kB T (e.g., Δµ/kB T = 0.0036 for a 200 mM solution of sucrose, according to Eq. (10)), we can expand kr and kl to first order:

kr = k0 (1 + α Δµ/kB T),   kl = k0 (1 + β Δµ/kB T)    (7)

(for a symmetric channel α = −β also holds). The net water flux can be expressed by the difference between kr and kl:

jW = (1/NA)(kr − kl) = [k0 (α − β)/NA] (Δµ/kB T).    (8)
Substituting Eq. (7) into Eq. (6) and comparing the first-order terms in Δµ/kB T leads to β − α = 1. The net water flux is then:

jW = −(k0/NA)(Δµ/kB T).    (9)

For dilute solutions, Δµ is linearly proportional to the solute concentration difference [1]:

Δµ = −kB T VW ΔCS.    (10)

From Eqs. (9) and (10), we obtain

jW = k0 vW ΔCS,    (11)

and using Eq. (1),

pf = vW k0.    (12)

According to Eqs. (5) and (12), the ratio of pf to pd predicted by the CTRW model is

pf/pd = N + 1.    (13)
The difference between pf and pd can be further elaborated as follows. For single-file water transport, a hop results in the net transfer of one water molecule from one side of the channel to the other side. pf is related to the rate of net water transfer under a chemical potential difference and, therefore, is determined by the hopping rate (see Eq. (12)). In contrast, pd is determined by the rate of permeation events (see Eq. (3)). A permeation event requires an individual water molecule to traverse all the way through the channel, and is not the same as a hop. Actually, the pf/pd ratio is exactly determined by the relative rates of hops and permeation events. Most models proposed for single-file water transport predict this ratio to be N or N + 1 [1].

As stated before, most experimental techniques take advantage of osmotic pressure to establish the chemical potential difference that is needed for the determination of pf. A hydrostatic pressure difference ΔP between the two reservoirs can also give rise to a difference in the chemical potential of water [1],

Δµ = vW ΔP.    (14)

In fact, the osmotic pressure difference between two solutions is defined as the hydrostatic pressure difference that would generate the same Δµ. Therefore, the osmotic pressure difference between two dilute solutions is given by van't Hoff's law [1]:

ΔP = RT ΔCS,    (15)

where R = kB NA is the gas constant. It is also known experimentally that equal osmotic and hydrostatic pressure differences produce the same water flux through water channels [19]. The hopping rates and, hence, the water flux are functions of Δµ alone (Eqs. (7) and (9)), regardless of whether Δµ arises from osmotic or hydrostatic pressure differences.

According to the CTRW model, when an osmotic or hydrostatic pressure difference exists, the water file performs a biased random walk, characterized by the hopping rates kr and kl. In this section, we will determine the statistical distribution of hops as a function of time. Within any infinitesimally small time dt, the probability of the water file to make a rightward hop is kr dt, independent of its history, i.e., of when and how many rightward hops were made before. Such a process is referred to as a Poisson process, and the total number of rightward hops within time t, mr(t), obeys the well-known Poisson distribution, whose mean and variance are both kr t. Similarly, the number of leftward hops, ml(t), also obeys the Poisson distribution, with kl t being its mean and variance. The net number of hops, m(t), is defined as the difference of the numbers of rightward and leftward hops, i.e., m(t) = mr(t) − ml(t). Since the probabilities of making rightward and leftward hops are independent of each other, we obtain:

⟨m(t)⟩ = (kr − kl) t,    (16)

Var[m(t)] = (kr + kl) t,    (17)
where Var[m] = ⟨m²⟩ − ⟨m⟩². Equations (16) and (17) show that both the mean and the variance of m(t) increase linearly with time. These expressions show that monitoring the average number of hops and its variance permits one to determine both kr and kl [5].
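A toy realization of the CTRW model at equilibrium (with arbitrarily chosen N and k0) illustrates these hop statistics and, at the same time, verifies the permeation-event rate of Eq. (4):

```python
import numpy as np

rng = np.random.default_rng(1)

N = 5           # water molecules in the single file
k0 = 1.0        # equilibrium hopping rate in each direction
t_total = 5e4   # total simulated time (units of 1/k0)

# Hops form a Poisson process of total rate 2*k0; at equilibrium each hop
# is rightward or leftward with equal probability (cf. Eqs. (16), (17))
n_hops = rng.poisson(2.0 * k0 * t_total)
directions = rng.choice([1, -1], size=n_hops)

# Track which side each molecule of the file entered from; a permeation
# event is an exit on the side opposite to the entry side
side = ['L'] * N
permeations = 0
for d in directions:
    if d == 1:                  # rightward hop: rightmost molecule exits right
        if side.pop() == 'L':
            permeations += 1
        side.insert(0, 'L')     # a fresh molecule enters from the left
    else:                       # leftward hop: leftmost molecule exits left
        if side.pop(0) == 'R':
            permeations += 1
        side.append('R')

q0 = permeations / (2.0 * t_total)   # uni-directional permeation rate
print(f"q0 = {q0:.4f}; CTRW prediction k0/(N+1) = {k0 / (N + 1):.4f}")
print(f"net hops m = {directions.sum()}, "
      f"sqrt(Var[m]) = {np.sqrt(2 * k0 * t_total):.0f}")
```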
5. Collective Diffusion Model of Single-channel Water Transport
Following its definition, pf is measured in experiments under nonequilibrium conditions, for systems with nonzero Δµ. In principle, the same conditions (a chemical potential difference across the channel) can be established in MD simulations of water transport. Two of the techniques for doing so are (1) introduction of solutes to one side of the membrane to generate an osmotic pressure difference [25], and (2) application of a hydrostatic pressure difference across the channel through mechanically manipulating individual water molecules in the bulk [4, 5]. Through adjustment of the salt concentration or of the pressure difference, one may reach different values of Δµ in the simulations. Due to the presently accessible (ns) time scale of MD simulations, however, one has to adopt a large Δµ to obtain sufficient statistics of water permeation. This leads to situations that are far from actual experimental conditions, and it is not clear whether the results represent the normal kinetics of the water channel under study. If one can establish a quantitative relationship between water conduction under equilibrium and nonequilibrium conditions, this problem can be circumvented. In this section we demonstrate that water permeation obeys a linear current–Δµ relationship over a very wide range of Δµ values and that equilibrium MD simulations (Δµ = 0) can be used to characterize the osmotic permeability of a channel.

Water permeation usually involves multiple water molecules in a channel whose movements are coupled to each other. As a result, a complicated multidimensional representation seems to be necessary to model this process. In the following, we introduce a collective coordinate, n, which offers a much simplified description of water translocation in channels. The derivation follows closely Zhu et al. [23].

Consider a channel (of length L) aligned along the z-direction. The collective coordinate n is defined in its differential form as follows: let S(t) denote the set of water molecules in the channel at time t, and let us assume that the displacement of water molecule i in the z-direction during dt is dzᵢ; then we define

dn = (1/L) Σ_{i∈S(t)} dzᵢ.    (18)

By demanding n = 0 at t = 0, n(t) can be uniquely determined by integrating dn. Note that S(t) changes with time, and that a water molecule i contributes to n only when it is in the channel, i.e., if i ∈ S(t) at time t. We further note that every water molecule crossing the channel from one reservoir to the other contributes to n a total increment of exactly +1 or −1. Therefore, n quantifies the net amount of water permeation, and the trajectory n(t) describes the time evolution of the permeation.
An important scenario is the stationary state in which a steady water flux through the channel exists. In this case, n(t) on average grows linearly with t, and the water flux is given by

jW = (1/NA) jn = (1/NA) ⟨n(t)⟩/t,    (19)

where NA is Avogadro's number, and jn = NA jW is the water flux in units of number of water molecules per second.

At equilibrium, the net amount of water permeation through the channel vanishes on average, i.e., ⟨n(t)⟩ = 0. Spontaneous, random water transport, however, may occur due to thermal fluctuation. Such microscopic fluctuations may not be detectable in experiments, but can be readily observed in MD simulations through n(t). At equilibrium, n(t) can be described as a one-dimensional unbiased random walk, with a diffusion coefficient Dn that obeys

⟨n²(t)⟩ = 2Dn t.    (20)

Dn has dimension t⁻¹ since n is dimensionless. Intuitively, Dn is related to the rate at which the net transport of one water molecule happens spontaneously. All factors affecting water kinetics contribute to Dn and are effectively integrated into this single parameter.

In the presence of a chemical potential difference (Δµ) of water between the two reservoirs, n obeys a biased random walk. We note that the net transport of one water molecule from one reservoir to the other results in a change of ±Δµ in the free energy, and that the total free energy change is proportional to the net amount of water transported. The free energy can then be expressed as a linear function in n:

U(n) = Δµ n.    (21)

Consequently, the trajectory of n can be described as a one-dimensional diffusion in a linear potential. Therefore, on average n is drifting with a constant velocity [23]:

⟨n(t)⟩ = −(Δµ/kB T) Dn t,    (22)

which corresponds to a stationary water flow through the channel. According to Eq. (19), the water flux is given by

jn = −(Δµ/kB T) Dn.    (23)

From Eqs. (1), (10), (19) and (23) one then obtains for the osmotic permeability of the channel

pf = vW Dn.    (24)
Equation (24) shows that one can determine pf using the Dn value obtained from equilibrium MD simulations (cf. Eq. (20)) [23].

The CTRW model proposed for single-file water channels assumes that the whole water file moves in discrete hops simultaneously and concertedly, with rightward and leftward hopping rates kr and kl, respectively. k0, defined as the value of kr or kl at equilibrium, is the major kinetic parameter in the model. Since each hop changes the collective coordinate n by +1 or −1, it holds that n(t) = mr(t) − ml(t), where mr(t) and ml(t) are the numbers of rightward and leftward hops during time t, respectively. Because mr(t) and ml(t) obey a Poisson distribution (see also above) whose mean and variance are both k0t at equilibrium [5], one obtains ⟨n²(t)⟩ = 2k0t. Comparison with Eq. (20) yields Dn = k0. Therefore, for the discrete water movement described by the CTRW model, Dn is identical to the hopping rate k0, and the expression derived from the CTRW model, namely pf = vW k0 [5], is actually equivalent to Eq. (24) in the collective diffusion model [23]. However, while the CTRW model is only valid for single-file channels, the collective diffusion model is applicable to any water channel, since it makes no assumption regarding water configuration or water movement inside the channel.

In the CTRW model, in order to determine the net water flux (jn = kr − kl) as a function of Δµ, the rate theory expression kr/kl = exp(−Δµ/kB T) was exploited [5, 25], along with the linear response approach, which assumes that Δµ is much smaller than kB T [5]. The model, however, is not able to predict how jn relates to Δµ when Δµ is comparable to or larger than kB T. In contrast, the collective diffusion model (Eq. (23)) predicts a linear relationship between jn and Δµ even when Δµ exceeds kB T [23].
6. Simulation of Water Transport and Calculation of pd and pf

Equilibrium MD simulations provide an ideal tool to study free water diffusion through channels, since all water molecules can be easily traced in the simulations, and q0 counted [3, 23]. pd can then be calculated from the simulations according to Eq. (3). In order to determine pf in a fashion similar to experiments, one needs to produce different osmotic or hydrostatic pressures on the two sides of the membrane. Figure 3 illustrates a scheme to induce a hydrostatic pressure difference in MD simulations [4, 5].

In order to avoid inaccuracies at the boundaries, applying periodic boundary conditions has become a common practice in MD simulations of molecular systems, particularly those that involve a considerable amount of solvent like water. In a periodic system, the unit cell is replicated in three dimensions; therefore, water layers and membranes alternate along the z-direction, defined as the membrane normal. Figure 3 shows a water layer sandwiched by adjacent membranes.
Figure 3. Illustration of the method to produce a pressure difference in MD simulations. The unit cell contains a membrane and a water layer divided into regions I (at pressure P1), II (at pressure P2), and III; the two membranes shown in the figure are "images" of each other under periodic boundary conditions. A constant force f is applied only to water molecules in region III.
We define three regions (I, II, III) in the water layer, as shown in the figure. Region III is isolated from the two sides of the membrane by regions I and II, respectively. A constant force f along the z-direction is exerted on all water molecules in region III, generating a pressure gradient in this region that, consequently, results in a pressure difference between regions I and II, i.e., on the two sides of the membrane [4]:

ΔP = P1 − P2 = nf/A,    (25)

where n is the number of water molecules in region III, and A the area of the membrane. Consequently, a net water flux jW through the channels embedded in the membrane can be induced, and pf calculated from jW and ΔP. We note that the membrane needs to be held in its position, e.g., by constraints, to prevent an overall translation of the whole system along the direction of the applied forces.

Assuming that the thickness of region III is d, the number of water molecules in this region is n = Ad/vW. Substituting this into Eq. (25) and the result into Eq. (14), we obtain for the chemical potential difference of water between regions I and II:

Δµ = f d.    (26)

The external force field generates a mechanical potential difference of f d between regions I and II, which must be exactly balanced by the chemical
potential difference Δµ under a stationary population distribution of water, therefore also giving Eq. (26).

In an earlier approach [4], all water molecules in the bulk region, including those adjacent to the entrances of the channels, were subject to external forces, a setup which might artificially affect the number of water molecules permeating the channel. This shortcoming was overcome later [5] through application of external forces only to water molecules in region III (Fig. 3), which leaves regions I and II under uniform hydrostatic pressures and, hence, represents experimental conditions more closely. In order to keep the membrane in place, one can either apply constant counter forces on the membrane to balance the effect of hydrostatic pressure gradients experienced by the membrane [4], or constrain the membrane in the z-direction to prevent an overall translation of the system. The latter is superior, because the number of water molecules (n) in region III and, therefore, the total external force on the water (nf) experience slight fluctuations during the simulation, and application of a fixed counter force on the membrane may not always exactly balance nf. Moreover, for very long simulations, applying constraints can also eliminate drifting of the membrane along the z-direction that may happen due to thermal motion. Too strong constraints, however, may restrict the dynamics of channel-lining groups, which might, particularly in proteins, influence the kinetics of water transport, and one must carefully choose the constraints so as to minimize this undesired effect.

An interesting method, which we refer to as the "two-chamber setup", has also been used to study osmotically driven water flow in MD simulations [25], where the unit cell consists of two membranes and two water layers containing different concentrations of solutes. We chose our proposed method rather than the two-chamber setup for two reasons. First, in order to observe on the ns time scale a statistically significant water flux through channels, one has to induce in the two-chamber setup a large chemical potential difference (Δµ) of water. However, it is noteworthy that Eq. (10) is valid only for dilute solutions; when the solute concentration is high, Δµ is no longer linearly proportional to the concentration difference. In contrast, in our method, Δµ can be linearly controlled (see Eq. (26)). Second, the osmotic water flux in the two-chamber setup will decrease with time and eventually stop [25], while application of a hydrostatic pressure gradient maintains a stationary flux that permits sampling for as long as one can afford.
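As a numerical check of Eqs. (25) and (26), the sketch below computes the per-molecule force needed for a given target pressure difference, taking the system dimensions of the AQP1 example below (Table 1) as assumed inputs:

```python
# Force per molecule required for a target pressure difference, Eqs. (25)-(26).
# System dimensions are assumptions taken from the AQP1 example (Table 1).
A = 9.35e-17        # membrane area in the unit cell (m^2)
n = 2470            # average number of water molecules in region III
d = 7.68e-10        # thickness of region III (m)

dP = 195e6          # target pressure difference (Pa), i.e., 195 MPa
f = dP * A / n      # Eq. (25) solved for f
print(f"f = {f * 1e12:.2f} pN per molecule")   # ~7.4 pN, cf. Table 1

dmu = f * d         # Eq. (26), chemical potential difference in J
print(f"dmu = {dmu * 6.022e23 / 4184.0:.3f} kcal/mol")  # ~0.81 kcal/mol
```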
7. Calculation of Water Permeability of Aquaporins

As discussed earlier, using the CTRW model [24], one can demonstrate that pf and pd of a single-file water channel are related, but differ in value. Equilibrium MD simulations yield the pd value, and applying hydrostatic pressure
differences across the membrane allows one to determine pf of membrane channels from MD simulations. We will now present the application of the described method to the example of a real biological channel, namely AQP1. Pressure-induced water permeation will be used to determine the channel's pf value, which, as we will see, is found to agree well with experimental measurements. The simulations presented in this section are taken from [5].

The AQP1 [9] tetramer was embedded in a POPE lipid bilayer and solvated by adding layers of water molecules on both sides of the membrane. The whole system (shown in Fig. 4) contains 81 065 atoms. The system was first equilibrated for 500 ps with the protein fixed, under constant temperature (310 K) and constant pressure (1 atm) conditions. Then the protein was released and another 450 ps of equilibration was performed. Starting from the last frame of the equilibration, four simulations were initiated. In these simulations (sim1, sim2, sim3 and sim4), a constant force (f) was applied to the oxygen atoms of water molecules in region III, defined as a 7.7 Å-thick layer (shown in Fig. 4) in our system, to induce a pressure difference across the membrane. In principle, the position and thickness of region III can be arbitrarily defined and should not affect the results, as long as the induced pressure difference is set to the same value (by choosing a proper f); in practice, one would partition the bulk water in such a way that each of the three regions (I, II, III) has a sufficiently large thickness (relative to the diameter of a water molecule).
Figure 4. Side view of the unit cell including the AQP1 tetramer (tube representation), and lipid and water molecules (line representation). Hydrogen atoms of lipids are not shown and the phosphorus atoms are drawn as vdW spheres. Water molecules in region III (see Fig. 3) are drawn in a slightly darker shade.
The constant forces used in the four simulations differ in their direction or magnitude, generating four pressure differences, as summarized in Table 1. The simulations were performed under constant temperature (310 K) and constant volume conditions.

Table 1. Summary of the four simulations reported in this study^a

        f (pN)    ΔP (MPa)    Δµ (kcal/mol)
sim1    −7.36     −195        −0.814
sim2    −3.68     −97         −0.407
sim3     3.68      97          0.407
sim4     7.36      195         0.814

^a The thickness of region III is d = 7.68 Å, containing on average 2470 water molecules. f is the constant force applied on individual water molecules. The area of the membrane in the unit cell is A = 9.35 × 10⁻¹⁷ m². The induced pressure difference ΔP and chemical potential difference Δµ of water are calculated according to Eqs. (25) and (26), respectively.

As mentioned earlier, the membrane needs to be constrained to prevent the overall translocation of the system under the external forces. This was done by applying harmonic constraints to the Cα atoms of the protein and the phosphorus atoms of the lipid molecules, with spring constants of 0.12 kcal/mol/Å² and 0.8 kcal/mol/Å², respectively. These spring constants are chosen to fully balance the external forces when the whole membrane is displaced by about Å along z from its reference position under a pressure difference of 200 MPa (as in sim1 and sim4). The constraints are applied only in the z-direction, and all atoms are free to move in the x- and y-directions. Note that the constraints on the protein are fairly weak and act only on the backbone Cα atoms; therefore, significant flexibility of the protein side chains, which may influence the kinetics of water permeation, was maintained during the simulations.

All simulations were performed using the CHARMM27 force field [26], the TIP3P water model, and the MD program NAMD2 [27]. Full electrostatics was employed using the Particle Mesh Ewald (PME) method [28]. Simulations sim1, sim2, sim3 and sim4 were each run for 5 ns, with the first 1 ns discarded and the remaining 4 ns used for analysis. 1 ns of simulation took 22.4 h on 128 1-GHz Alpha processors.

During the simulations, the water density distribution in regions I, II, and III exhibited different patterns, as shown in Fig. 5, where the dashed lines are the boundaries separating these regions. In region III, where the external forces are applied, a gradient of water density is observed; in regions I and II, the density of water is roughly constant, indicating that the hydrostatic pressure in these regions is uniform. The water density gradient in region III and, hence, the density difference between regions I and II, differ in the four simulations.
Figure 5. Water density distribution along the z-direction in region III (bracketed by dashed lines) and part of regions I and II. Data points marked by circles, diamonds, stars, and squares represent sim1, sim2, sim3, and sim4, respectively. The density is measured by averaging the number of water molecules within a 1 Å-thick slab over the last 4 ns of each trajectory.
From the observed water density difference and the calculated pressure difference (see Table 1) in these simulations, the compressibility of water is estimated to be 4.9 × 10⁻⁵ atm⁻¹, which is in satisfactory agreement with its experimental value of 4.5 × 10⁻⁵ atm⁻¹ [19].

Water molecules in the channels were usually found in the single-file configuration (as shown in Fig. 6a) and moved concertedly during the simulations (Fig. 6b). Occasionally, a larger number of water molecules was accommodated in the channel, or the water file appeared broken in part of the channel. Nevertheless, the CTRW model can be used to provide a simplified quantitative description of water movement in AQP1 channels, as demonstrated in [5].

The net water fluxes, directly determined from the simulations, are shown in Table 2. These values are plotted vs. the applied pressure difference in Fig. 7. From their best-fit slope, and according to Eqs. (1) and (15), the osmotic permeability was determined to be pf = (7.1 ± 0.9) × 10⁻¹⁴ cm³/s. Different experiments have reported pf values for AQP1 monomers in the range of 1–16 × 10⁻¹⁴ cm³/s, the variation probably being due to uncertainties in the number of channels per unit membrane area; typically referenced pf values range from 5.43 × 10⁻¹⁴ cm³/s [29] to 11.7 × 10⁻¹⁴ cm³/s [21]. In light of this, the pf value calculated from our simulations agrees satisfactorily with experiments.

In equilibrium MD simulations of AQP1, a total of 16 permeation events (in four AQP1 monomers, in either direction) were observed in 10 ns [11]. Therefore, the rate of uni-directional permeation events in a monomer is q0 = 0.2 H₂O/ns.
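The slope extraction amounts to a one-parameter linear regression through the origin; jn = pf ΔP/(kB T) follows from combining Eqs. (1), (14) and (15). A sketch reproducing it from the tabulated values:

```python
import numpy as np

kB, T = 1.381e-23, 310.0                        # J/K, K
dP = np.array([-195., -97., 97., 195.]) * 1e6   # Pa (Table 1)
jn = np.array([-3.8, -1.8, 2.1, 2.4]) * 1e9     # molecules/s (Table 2, per monomer)

# Fit jn = slope * dP through the origin (least squares)
slope = np.sum(jn * dP) / np.sum(dP**2)         # molecules s^-1 Pa^-1
pf = slope * kB * T                             # m^3/s
print(f"pf = {pf * 1e6:.1e} cm^3/s")            # ~7e-14 cm^3/s, cf. the text
```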
Figure 6. (a) An AQP1 monomer with channel water and nearby bulk water. Water molecules in the constriction (single-file) region, the vestibules of the channel, and in the bulk are rendered in vdW, CPK and line representations, respectively. (b) Trajectories (from sim1) of seven water molecules in the constriction region during 500 ps.

Table 2. Water flux observed in the four simulations^a

        Water count/4 ns                  Flux (#/ns)
        M1      M2      M3      M4        Mean     SD
sim1    −13.5   −14.5   −15     −17.5     −3.8     0.4
sim2    −9.5    −6      −1      −12.5     −1.8     1.2
sim3     11.5    8.5     5       8         2.1     0.7
sim4     11.5    9       10.5    7         2.4     0.5

^a To obtain the net water transfer through a channel, a plane normal to its axis is defined, and when a water molecule crosses the plane, a count of +1 or −1 is accumulated, depending on its crossing direction. Two such planes were defined in the central part of the channel, and the average of their net counts is listed as the water count of the channel. The mean and standard deviation (SD) of the flux were calculated from the water counts of the four AQP1 monomers (M1 to M4) during 4 ns.
According to Eq. (3), this q0 value translates into a diffusion permeability of pd = 6.0 × 10⁻¹⁵ cm³/s. Using this pd value and the pf value calculated in this study, one obtains a pf/pd ratio of 11.9, in good agreement with the experimentally measured ratio of 13.2 for AQP1 [20]. The ratio corresponds to the number of effective steps in which a water molecule needs to participate in order to cross AQP1. This number (∼12) of effective steps in a complete permeation event should be interpreted as follows. In the bulk, water conduction is essentially uncorrelated, i.e., the bulk phase does not contribute to the pf/pd ratio.
Figure 7. Relation between the water flux and the applied pressure difference. Values of the pressure differences and water fluxes are taken from Tables 1 and 2, respectively. A line with the best-fit slope for the four data points is also shown.
In the constriction region of the channel, however, on average N = 7 water molecules move essentially in single file, i.e., in a correlated and concerted fashion, such that N + 1 = 8 steps are needed to transport a water molecule through. Water molecules in the vestibules (also shown in Fig. 6a) at the termini of the channel do not form a single file, but nevertheless move in a somewhat concerted fashion, accounting for the remainder of the pf/pd ratio. For AQP1, the average number of water molecules in the single-file region is about 7, corresponding to a pf/pd ratio of 8, but the experimentally measured pf/pd ratio is 13.2 [20]. To understand the difference, we note that water molecules in an AQP1 channel may occasionally deviate from the single-file configuration due to conformational fluctuations of the protein. Furthermore, the behavior of water in the vestibule regions of AQP channels [3, 13, 30] suggests that the single-file model is too simple and that water transport effectively involves vestibular water at the channel entrances, such that the latter water cannot be counted as bulk water (see Fig. 6a).
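The osmotic permeability quoted above can be recovered directly from Tables 1 and 2 with a linear fit. The sketch below assumes that the relation implied by Eqs. (1) and (15), whose full forms appear earlier in this contribution, reduces to j = pf ∆P/(kB T); this linear form is an assumption of the sketch.

```python
# Minimal sketch: osmotic permeability p_f from the best-fit slope of the
# mean flux vs. pressure difference (Tables 1 and 2), assuming j = p_f*dP/kBT.
import numpy as np

dP = np.array([-195.0, -97.0, 97.0, 195.0]) * 1e6   # Pa     (Table 1)
j = np.array([-3.8, -1.8, 2.1, 2.4]) * 1e9          # 1/s    (Table 2, mean flux)
kBT = 1.381e-23 * 310.0                             # J at 310 K

slope = np.polyfit(dP, j, 1)[0]                     # 1/(s Pa)
pf = slope * kBT * 1e6                              # m^3/s -> cm^3/s
print(f"p_f = {pf:.1e} cm^3/s")                     # about 7e-14 cm^3/s
```

The fitted value reproduces the pf ≈ 7.1 × 10⁻¹⁴ cm³/s quoted above.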
8. Nanotube Simulations and the Collective Diffusion Model
In order to illustrate the validity of the collective diffusion model, we consider MD simulations performed on two channels [23], denoted a and b and shown in Fig. 8. The simulation systems and results presented in this section are taken from [23].
Figure 8. Side view of the unit cells in systems a and b, with dimensions of 18.0 Å × 18.0 Å × 41.4 Å and 46.0 Å × 46.0 Å × 42.1 Å, respectively. Half of the CNT channels and the membranes are removed in order to reveal the water molecules in the channels. The dashed lines and the bars indicate the layers where constant forces were applied to the water molecules in the nonequilibrium simulations (see text).
In each system, two layers of carbon atoms mimicking a membrane partition the bulk water, and a CNT serves as the water channel. The CNT in system a is of the (6,6) armchair type, with a C–C diameter of ∼8 Å. Previous simulations [17, 18] showed that this CNT conducts water strictly in single-file manner. The CNT in system b is of the (15,15) armchair type, with a C–C diameter of ∼20 Å and with disordered, bulk-like water molecules in it. Systems a and b contain 276 (∼5 in the pore) and 1923 (∼90 in the pore) water molecules, respectively. The length of the channel is L = 13.2 Å in both systems. All nanotube simulations were performed under periodic boundary conditions at constant volume. The temperature was kept constant (T = 300 K) by Langevin dynamics with a damping coefficient of 5/ps. The TIP3P model was used for the water molecules, and we employed the MD program NAMD2 [27] for the simulations, with full electrostatics calculated by the PME method. The CNT and the membrane were fixed and kept rigid throughout the simulations; this ensured that the channel maintains its structure under the large pressure exerted by the bulk water molecules. Equilibrium MD simulations of 40 ns and 20 ns were performed on systems a and b, respectively, with coordinates recorded every picosecond. We took the sum of the one-dimensional displacements of all water molecules in the channel, divided by L, as the increment of n in each picosecond (cf. Eq. (18)). If a water molecule enters or exits the channel within a picosecond, only the portion of its displacement within the channel contributes to the sum. The trajectories of n(t), as shown in Fig. 9, were obtained by summing up (integrating) these increments. The mean square deviation (MSD) of n(t) for each system is presented in Fig. 10.
Figure 9. Trajectories of n for equilibrium MD simulations of systems a and b.
According to Eq. (20), the diffusion coefficient Dn is one-half of the slope of the MSD–t curve. From the best-fit slopes, the Dn values were determined to be (16.5 ± 2.1)/ns and (524 ± 40)/ns for systems a and b, respectively. In order to test the key aspect of the collective diffusion model, namely Eq. (23), we need to perform nonequilibrium simulations in the presence of a chemical potential difference (∆µ) of water across the membrane. This was achieved by application of a hydrostatic pressure difference, which corresponds to a chemical potential difference ∆µ = f d across the membrane. The defined layers in systems a and b are shown in Fig. 8, with thicknesses d = 7.4 Å and 8.1 Å, respectively. By choosing a proper f, one can select any desired value of ∆µ. For each system, we performed six nonequilibrium simulations, with ∆µ set to 0.2 kBT, 0.5 kBT, 1 kBT, 2 kBT, 5 kBT, and 10 kBT. The simulation times (1–40 ns) varied between simulations, but were long enough to observe a net transport of at least 100 water molecules in each case. Figure 11 shows both the predicted water fluxes (solid lines) from Eq. (23) and the observed water fluxes (squares) in the simulations, from which one can discern excellent agreement between prediction and simulation. The results demonstrate the validity of the collective diffusion model. It is remarkable that the water flux induced by a ∆µ as large as 10 kBT can still be predicted by the Dn value determined from equilibrium simulations. In light of this, one is not surprised that the calculated osmotic permeability (pf) of AQP1 obtained from the nonequilibrium simulations [5] reported in the previous section agrees with the experimental data, despite the fact that the ∆µ values (∼1 kBT) in those simulations were much larger than experimental values (e.g., a solute concentration difference of 200 mM, as is typical in actual measurements, corresponds to a ∆µ of 0.0036 kBT).
Figure 10. Mean square deviations (MSDs) of n for systems a and b. For each system, the trajectory n(t) shown in Fig. 9 was evenly divided into M (400 for system a, 1000 for system b) short time periods. n(t) in each period was treated as an independent sub-trajectory n_i(t) and was shifted so that n_i(t)|t=0 = 0. The average of n_i²(t) over i = 1, . . . , M was then taken as MSD(t). A line with the best-fit slope is superimposed on each MSD curve.
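The sub-trajectory averaging just described is straightforward to reproduce. The sketch below is a minimal implementation assuming n(t) is available as an array sampled every picosecond; the function and parameter names are illustrative, not from the original study.

```python
import numpy as np

def collective_Dn(n, dt_ps=1.0, n_sub=400, fit_window_ps=100.0):
    """Estimate the collective diffusion coefficient Dn from a trajectory
    n(t) of the collective water coordinate (one sample per dt_ps ps),
    following the sub-trajectory averaging of the Fig. 10 caption."""
    n = np.asarray(n, dtype=float)
    seg_len = len(n) // n_sub
    segs = n[: n_sub * seg_len].reshape(n_sub, seg_len)
    segs = segs - segs[:, :1]          # shift each sub-trajectory to start at 0
    msd = (segs ** 2).mean(axis=0)     # MSD(t), averaged over sub-trajectories
    t = np.arange(seg_len) * dt_ps
    k = min(int(fit_window_ps / dt_ps), seg_len)
    slope = np.polyfit(t[:k], msd[:k], 1)[0]   # best-fit slope of MSD vs t
    return 0.5 * slope                 # Eq. (20): Dn = slope/2, in units of 1/ps
```

Multiplying the returned value by 1000 gives Dn in 1/ns, the units quoted in the text.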
Figure 11. The dependence of the water flux (jn) on the chemical potential difference (∆µ) of water. Each data point (marked as a square) represents the jn value obtained from a nonequilibrium simulation by dividing the total displacement of n in the simulation by the simulation time (cf. Eq. (19)). The solid lines show the jn–∆µ relations predicted from Eq. (23), with Dn = 16.5/ns for system a and Dn = 524/ns for system b, both values determined from the equilibrium simulations.
In this section we have mainly focused on the collective movement of water inside the channel. The movement of individual water molecules also deserves attention. In particular, some water molecules may permeate all the way through the channel, an event described as a full permeation event. One can count the number of such permeation events in each direction per unit time, denoted q0, from equilibrium simulations. We observed q0 values of about 3/ns and 110/ns in our equilibrium simulations of systems a and b, respectively. While the Dn value, which quantifies the collective water movement, determines the osmotic permeability pf (see Eq. (24)), the q0 value determines another experimental quantity for water channels, namely the diffusion permeability pd [5]. The ratio pf/pd is in fact equal to Dn/q0. We obtained Dn/q0 ratios of 5.5 and 4.8 for systems a and b, respectively. The pf/pd ratio for a single-file channel can be interpreted as the number of effective steps a water molecule needs to take to completely cross the channel, i.e., the number of water molecules inside the channel plus 1 [5]; interestingly, despite the much larger number of water molecules in the pore region of system b, the pf/pd values for the two channels turned out to be similar. It is of interest to determine this ratio for different types of water channels in future studies. The collective diffusion model establishes a quantitative relationship between spontaneous water transport at equilibrium and the stationary water flux under nonequilibrium conditions. Using this model, pf can be determined readily from equilibrium MD simulations. Since the model does not make specific assumptions about water channels, it can be used to characterize water permeation in any channel.
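To make the predicted lines in Fig. 11 concrete, the sketch below evaluates Eq. (23) assuming it has the linear form jn = Dn ∆µ/(kB T); this form is consistent with the straight lines in Fig. 11 but is stated here as an assumption, since the equation itself appears earlier in this contribution.

```python
# Minimal sketch: stationary flux predicted from the equilibrium D_n,
# assuming Eq. (23) reads j_n = D_n * dmu / kBT (see lead-in).
Dn = {"a": 16.5, "b": 524.0}   # collective diffusion coefficients (1/ns)

for system, D in Dn.items():
    for dmu_kBT in (0.2, 0.5, 1, 2, 5, 10):   # applied dmu, in units of kBT
        jn = D * dmu_kBT                      # predicted net flux (1/ns)
        print(f"system {system}: dmu = {dmu_kBT:4} kBT -> j_n = {jn:7.1f} /ns")
```

For system a at ∆µ = 10 kBT, for example, this gives jn = 165/ns, matching the scale of the corresponding panel in Fig. 11.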
9. Outlook
Biological water channels, even though only recently discovered, have rapidly been brought to a level of rather complete characterization, through both observation and theory. A key reason for the successful investigations is that the structure of the channel has been solved for key members of the AQP family. Other reasons for the success are the relatively simple function (water transport), the lack of significant motion needed for function, and the related very rigid structure of water channel proteins. Yet there are still fascinating research problems connected with water channels. Most pressing is an understanding of the mechanism of proton exclusion, which is vital for the biological function, since the channels must not
dissipate cell membrane potentials. Much progress has been achieved recently [3, 31–34]. Another interesting aspect of water channel research is to develop an understanding of the diversity of water channels across the whole kingdom of life. Humans have 11 different AQPs [34] in various tissues, some being pure water channels, others conducting glycerol as well as water. The differences among the human AQPs might be related to their function, e.g., possibly to a different ability to gate the channel, but are more likely connected with the transport, storage, and deployment of the channels in cells, e.g., as controlled through the antidiuretic hormone. Likewise, the existence of many different AQPs in other species, such as plants, yeast, and bacteria, and their involvement in the membrane transport of materials ranging from O2 and CO2 gases to substrates like nitrate, poses important questions in terms of their selectivity that need to be understood. A fascinating opportunity for the study of AQPs has been opened up recently through the solution of the structures of both an aquaglyceroporin (GlpF) and a pure water channel (AqpZ) from a single organism, namely E. coli [10, 36]. A comparison of the two structures provides a fundamental chance to understand the design of this important class of membrane channels in terms of selectivity, transport rates, and role in the survival of cells.
Acknowledgments
We acknowledge grants from the National Institutes of Health (NIH P41RR05969 and R01-GM067887) and from the National Science Foundation (NSF CCR 02-10843). The authors also acknowledge computer time provided at the NSF centers by grant NRAC MCA93S028. F.Z. acknowledges a graduate fellowship awarded by the UIUC Beckman Institute. Molecular images in this paper were generated with the molecular graphics program VMD [37].
References
[1] A. Finkelstein, Water Movement Through Lipid Bilayers, Pores, and Plasma Membranes, John Wiley & Sons, New York, 1987.
[2] F. Zhu, E. Tajkhorshid, and K. Schulten, “Molecular dynamics study of aquaporin-1 water channel in a lipid bilayer,” FEBS Lett., 504, 212–218, 2001.
[3] E. Tajkhorshid, P. Nollert, M.Ø. Jensen, L.J.W. Miercke, J. O’Connell, R.M. Stroud, and K. Schulten, “Control of the selectivity of the aquaporin water channel family by global orientational tuning,” Science, 296, 525–530, 2002.
[4] F. Zhu, E. Tajkhorshid, and K. Schulten, “Pressure-induced water transport in membrane channels studied by molecular dynamics,” Biophys. J., 83, 154–160, 2002.
[5] F. Zhu, E. Tajkhorshid, and K. Schulten, “Theory and simulation of water permeation in aquaporin-1,” Biophys. J., 86, 50–57, 2004.
[6] A.J. Yool and A.M. Weinstein, “New roles for old holes: ion channel function in aquaporin-1,” News Physiol. Sci., 17, 68–72, 2002.
[7] K. Murata, K. Mitsuoka, T. Hirai, T. Walz, P. Agre, J.B. Heymann, A. Engel, and Y. Fujiyoshi, “Structural determinants of water permeation through aquaporin-1,” Nature, 407, 599–605, 2000.
[8] G. Ren, V.S. Reddy, A. Cheng, P. Melnyk, and A.K. Mitra, “Visualization of a water-selective pore by electron crystallography in vitreous ice,” Proc. Natl. Acad. Sci. U.S.A., 98, 1398–1403, 2001.
[9] H. Sui, B.-G. Han, J.K. Lee, P. Walian, and B.K. Jap, “Structural basis of water-specific transport through the AQP1 water channel,” Nature, 414, 872–878, 2001.
[10] D. Fu, A. Libson, L.J.W. Miercke, C. Weitzman, P. Nollert, J. Krucinski, and R.M. Stroud, “Structure of a glycerol conducting channel and the basis for its selectivity,” Science, 290, 481–486, 2000.
[11] B.L. de Groot and H. Grubmüller, “Water permeation across biological membranes: mechanism and dynamics of aquaporin-1 and GlpF,” Science, 294, 2353–2357, 2001.
[12] M.Ø. Jensen, E. Tajkhorshid, and K. Schulten, “The mechanism of glycerol conduction in aquaglyceroporins,” Structure, 9, 1083–1093, 2001.
[13] M.Ø. Jensen, S. Park, E. Tajkhorshid, and K. Schulten, “Energetics of glycerol conduction through aquaglyceroporin GlpF,” Proc. Natl. Acad. Sci. U.S.A., 99, 6731–6736, 2002.
[14] M.Ø. Jensen, E. Tajkhorshid, and K. Schulten, “Electrostatic tuning of permeation and selectivity in aquaporin water channels,” Biophys. J., 85, 2884–2899, 2003.
[15] S. Iijima, “Helical microtubules of graphitic carbon,” Nature, 354, 56–58, 1991.
[16] R. Saito, G. Dresselhaus, and M.S. Dresselhaus, Physical Properties of Carbon Nanotubes, Imperial College Press, 1998.
[17] G. Hummer, J.C. Rasaiah, and J.P. Noworyta, “Water conduction through the hydrophobic channel of a carbon nanotube,” Nature, 414, 188–190, 2001.
[18] F. Zhu and K. Schulten, “Water and proton conduction through carbon nanotubes as models for biological channels,” Biophys. J., 85, 236–244, 2003.
[19] N. Sperelakis, Cell Physiology Source Book, Academic Press, San Diego, 1998.
[20] J.C. Mathai, S. Mori, B.L. Smith, G.M. Preston, N. Mohandas, M. Collins, P.C.M. van Zijl, M.L. Zeidel, and P. Agre, “Functional analysis of aquaporin-1 deficient red cells,” J. Biol. Chem., 271, 1309–1313, 1996.
[21] M.L. Zeidel, S.V. Ambudkar, B.L. Smith, and P. Agre, “Reconstitution of functional water channels in liposomes containing purified red cell CHIP28 protein,” Biochemistry, 31, 7436–7440, 1992.
[22] P. Pohl, S.M. Saparov, M.J. Borgnia, and P. Agre, “Highly selective water channel activity measured by voltage clamp: analysis of planar lipid bilayers reconstituted with purified AqpZ,” Proc. Natl. Acad. Sci. U.S.A., 98, 9624–9629, 2001.
[23] F. Zhu, E. Tajkhorshid, and K. Schulten, “Collective diffusion model for water permeation through microscopic channels,” Phys. Rev. Lett., 2004, submitted.
[24] A. Berezhkovskii and G. Hummer, “Single-file transport of water molecules through a carbon nanotube,” Phys. Rev. Lett., 89, 064503, 2002.
[25] A. Kalra, S. Garde, and G. Hummer, “Osmotic water transport through carbon nanotube membranes,” Proc. Natl. Acad. Sci. U.S.A., 100, 10175–10180, 2003.
[26] A.D. MacKerell Jr., D. Bashford, M. Bellott et al., “All-hydrogen empirical potential for molecular modeling and dynamics studies of proteins using the CHARMM22 force field,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[27] L. Kalé, R. Skeel, M. Bhandarkar, R. Brunner, A. Gursoy, N. Krawetz, J. Phillips, A. Shinozaki, K. Varadarajan, and K. Schulten, “NAMD2: greater scalability for parallel molecular dynamics,” J. Comp. Phys., 151, 283–312, 1999.
[28] U. Essmann, L. Perera, M.L. Berkowitz, T. Darden, H. Lee, and L.G. Pedersen, “A smooth particle mesh Ewald method,” J. Chem. Phys., 103, 8577–8593, 1995.
[29] T. Walz, B.L. Smith, M.L. Zeidel, A. Engel, and P. Agre, “Biologically active two-dimensional crystals of aquaporin CHIP,” J. Biol. Chem., 269, 1583–1586, 1994.
[30] D. Lu, P. Grayson, and K. Schulten, “Glycerol conductance and physical asymmetry of the Escherichia coli glycerol facilitator GlpF,” Biophys. J., 85, 2977–2987, 2003.
[31] B.L. de Groot, T. Frigato, V. Helms, and H. Grubmüller, “The mechanism of proton exclusion in the aquaporin-1 water channel,” J. Mol. Biol., 333, 279–293, 2003.
[32] A. Burykin and A. Warshel, “What really prevents proton transport through aquaporins,” Biophys. J., 85, 3696–3706, 2003.
[33] N. Chakrabarti, E. Tajkhorshid, B. Roux, and R. Pomès, “Molecular basis of proton blockage in aquaporins,” Structure, 12, 65–74, 2004.
[34] B. Ilan, E. Tajkhorshid, K. Schulten, and G.A. Voth, “The mechanism of proton exclusion in aquaporin channels,” Proteins: Struct. Func. Bioinf., 55, 223–228, 2004.
[35] J.B. Heymann and A. Engel, “Aquaporins: phylogeny, structure, and physiology of water channels,” News Physiol. Sci., 14, 187–193, 1999.
[36] D.F. Savage, P.F. Egea, Y. Robles-Colmenares, J.D. O’Connell III, and R.M. Stroud, “Architecture and selectivity in aquaporins: 2.5 Å x-ray structure of aquaporin Z,” PLoS Biol., 1, 334–340, 2003.
[37] W. Humphrey, A. Dalke, and K. Schulten, “VMD – visual molecular dynamics,” J. Mol. Graphics, 14, 33–38, 1996.
5.16 SIMPLIFIED MODELS OF PROTEIN FOLDING
Hue Sun Chan
University of Toronto, Toronto, Ont., Canada
Protein folding is one of the most basic physico-chemical self-assembly processes in biology. Elucidation of its underlying physical principles requires modeling efforts at multiple levels of complexity [1]. As in all theoretical endeavors, the degree of simplification in modeling protein behavior depends on the questions to be addressed. The motivations for using simplified models to study protein folding are at once practical and intellectual. Realistically, a truly ab initio solution to the Schrödinger equation for a protein and its surrounding solvent molecules is currently out of the question. Although classical (Newtonian) descriptions based on geometrically high-resolution all-atom molecular dynamics have provided much useful insight, these models are computationally costly. Moreover, it is unclear whether common empirical potential functions used in such all-atom approaches are ultimately adequate. In this context, simplified models offer a complementary and efficient means for posing questions and testing hypotheses. Similar in spirit to the Ising model of ferromagnetism, simplified models of protein folding are designed to capture essential physics, and are geared towards the discovery of higher organizing principles [2] while omitting details deemed unimportant for the question at hand.
1. Lattice Protein Models
Experimental protein folding data have long been analyzed in terms of “native”, “denatured”, and “intermediate” states, often without a clear delineation of the relationship between the empirically defined states and their underlying conformational ensembles. While enlightening, the physical pictures that emerged from such interpretations are far from complete. Proteins are chain molecules. Obviously, a microscopic physical picture of their folding
must involve at least a rudimentary account of chain connectivity, conformational freedom, and the excluded-volume constraint that two amino acid residues cannot be at the same place at the same time [3]. Lattice protein models fulfill this minimal requirement by representing protein chains as self-avoiding walks on a lattice. The multitude of lattice walks corresponds to the many conformations accessible to a protein molecule. The most commonly applied lattice protein models are based upon two-dimensional (2D) square and three-dimensional (3D) simple cubic lattices (Fig. 1). Bond angles in these models are restricted to either 90° or 180°. Lattices with higher coordination numbers, which allow for a larger set of possible bond angles, have also been used extensively to provide more realistic representations of protein geometry (see the review by Chan et al. [1]). Lattice protein models are closely related to lattice approaches in many areas of polymer physics. Although homopolymer models (with identical monomer units along the chain) are valuable in addressing certain issues in protein folding, the main emphasis in lattice modeling of proteins is on heteropolymer models (wherein the monomers along the chain can be different), because protein sequences are made up of different types of amino acids, and intraprotein interactions are correspondingly heterogeneous. A number of different schemes have been used to model intraprotein interactions. These include allowing different numbers of possible monomer types (different numbers of letters in an alphabet), ranging from reduced two-letter models designed for simplicity and tractability (e.g., Fig. 1) to 20-letter models that aim to better capture the energetics of real proteins. Some interaction schemes are general and transferable in that they depend solely on the model sequence, while others have explicit biases towards a particular target structure (see below). The field of simplified lattice modeling of proteins has expanded dramatically in the last decade, and many of the models and their applications have been reviewed (e.g., [4–8]). One distinct advantage of these self-contained polymer models with explicit-chain representations [1] is that they provide a clear deductive relationship between the premises of a model and its predictions. Ideas and hypotheses can be efficiently verified or falsified by simplified protein folding models because of this logical clarity and their computational tractability.
2. Energetics, Chain Moves and Density of States
Besides simplified lattice models, continuum (off-lattice) models with simplified representations of the polypeptide chain have also been used to study protein folding [6, 9–11]. The rationale for both simplified on- and off-lattice models is to enhance the capability to broadly sample conformational space by sacrificing geometrical accuracy of the model chains. A central quantity governing the energetics of a simplified protein model is its density of states g(E), defined to be the number of conformations as a function of the intraprotein interaction energy E.
[Figure 1 data: the logarithmic density of states ln g(h) plotted in the figure corresponds to g(0) = 2565772, g(1) = 2059356, g(2) = 855549, g(3) = 252356, g(4) = 61831, g(5) = 11629, g(6) = 1670, g(7) = 162, g(8) = 9, and g(9) = 1.]
Figure 1. A 2D HP lattice model illustration of protein folding thermodynamics. Filled and open circles correspond, respectively, to H and P residues. The given model protein sequence can have a maximum of nine HH contacts, achievable only by a single native conformation (lower left). The double arrow indicates the thermodynamic equilibrium between the single native conformation and the denatured ensemble (right) that consists of a total of 5,808,334 conformations, schematically depicted using one representative conformation for each h value. The corresponding logarithmic density of states is shown at the upper left.
In general, intraprotein energies in simplified models should be viewed as an effective potential. In addition to the direct interactions between chemical groups along the protein chain, an effective
potential also includes energetic contributions arising from protein–solvent and solvent–solvent interactions by implicitly averaging over solvent degrees of freedom. As a result of solvent averaging, effective potentials can depend on temperature [1]. For many applications, however, one may simplify the calculation by taking E as temperature independent, as long as the above consideration is taken into account in relating model predictions to experiments. As for any physical system, a model protein's thermodynamics is controlled by its partition function

$$Q = \sum_{E} g(E)\, e^{-E/k_B T}, \qquad (1)$$
where kB is the Boltzmann constant and T is the absolute temperature. The summation here, which may be replaced by an integration for continuum models, is over all possible energies E. In protein folding, a quantity of central interest is the ratio between the native (N) and denatured (D) populations:
$$\frac{[\mathrm{N}]}{[\mathrm{D}]} = \frac{\sum_{E\in\{E_\mathrm{N}\}} g(E)\, e^{-E/k_B T}}{\sum_{E\in\{E_\mathrm{D}\}} g(E)\, e^{-E/k_B T}}, \qquad (2)$$
where E ∈ {E_N} and E ∈ {E_D} indicate that the respective summations are over conformations defined to be in the native and denatured states. In formulations with T-independent E's, the native energies are lower than the denatured energies, such that the chain population is concentrated in the native (folded) state at low T but shifts to the denatured state at high T. This provides a model description of the protein folding/unfolding transition. In general, the denatured state comprises many more conformations than the native state. In some highly simplified lattice models (e.g., Fig. 1), the native state is defined to be a single conformation ({E_N} contains only a single energy). The average energy

$$\langle E \rangle = \frac{\sum_E E\, g(E)\, e^{-E/k_B T}}{\sum_E g(E)\, e^{-E/k_B T}} \qquad (3)$$
and the (constant-volume) heat capacity

$$C_V(T) = \frac{\partial \langle E\rangle}{\partial T} = \frac{1}{k_B T^{2}}\left[\frac{\sum_E E^{2}\, g(E)\, e^{-E/k_B T}}{\sum_E g(E)\, e^{-E/k_B T}} - \left(\frac{\sum_E E\, g(E)\, e^{-E/k_B T}}{\sum_E g(E)\, e^{-E/k_B T}}\right)^{2}\right] \qquad (4)$$

of the model protein can be readily computed once the density of states is determined. The unfolding transition of a protein is often associated with a prominent peak in the heat capacity as a function of T. This peak of heat absorption may be viewed as a finite-system analog of latent heat, whereby energy is taken up to propel protein conformations from the lower-energy native state to the higher-energy denatured state. The heat capacity of a model
protein provides important information about the cooperativity of the protein folding/unfolding transition, i.e., the degree to which it can be viewed as an “all-or-none” process [12]. It should be noted that most calorimetric data provide the constant-pressure heat capacity CP rather than CV. Therefore, to facilitate comparison with experiments, an expression for CP(T) may be obtained by replacing the energy E in Eq. (4) with the enthalpy H. In lattice models, the energy E in Eq. (4) is taken to be the potential energy alone (since kinetic energy is not defined). In continuum models with Newtonian or Langevin dynamics, the energy in Eq. (4) should be the total energy, including both the potential and kinetic energies, although the effect of including the kinetic energy on the heat capacity function is small in the transition peak region. If the effective energy E is temperature dependent, the differentiation in Eq. (4) leads to extra terms in the heat capacity expression. This can result in different intrinsic heat capacities for the native and denatured states, a feature similar to that observed experimentally for real proteins [13]. For short-chain lattice models, g(E) may be enumerated exactly (see below). For longer chains, exact enumeration is not practical; in such cases, g(E) is estimated instead by conformational sampling using Monte Carlo techniques (for continuum as well as lattice models) or Newtonian/Langevin dynamics (for certain continuum models). A common Monte Carlo technique is the Metropolis algorithm, which consists of two basic steps: (i) starting with any conformation of the model protein, attempt to randomly change its conformation by applying a chain move from a set of elementary coordinate transformations (move sets are chosen primarily for their efficiency in reaching every region of conformational space); and (ii) evaluate the attempted conformational transition. In the Metropolis algorithm, the probability of accepting an attempted transition from conformation a with energy Ea to conformation b with energy Eb (b ≠ a) is given by

$$P_{ab} = \min\left\{1,\; e^{-(E_b - E_a)/k_B T}\right\}. \qquad (5)$$
If the attempted conformational transition is accepted, b is added to the conformational sample. If the attempted transition is not accepted, a is counted one more time in the conformational sample. After sufficient sampling, the resulting collection of conformations visited is expected to converge to a Boltzmann distribution, from which an estimate of the density of states g(E) can be readily extracted. It should be emphasized that the Metropolis procedure was originally designed for thermodynamic sampling, not for kinetic simulation. The sequence of events in a Monte Carlo run does not necessarily correspond to a physical kinetic process. For certain applications, a computationally efficient move set may bear little resemblance to actual chain dynamics. Nonetheless, when the set of moves and their relative probabilities are judiciously chosen so that one can reasonably assume that the sampling moves correspond to physically plausible elementary chain motions, the series of conformations
sampled in a Metropolis Monte Carlo run may be taken as a model kinetic trajectory. Indeed, this working assumption has been heavily utilized in simplified lattice models of protein folding kinetics.
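As an illustration of steps (i) and (ii), a single Metropolis update can be sketched as follows. This is a minimal, generic sketch: the energy and propose callbacks are hypothetical stand-ins for a concrete interaction scheme and move set, not part of any specific model discussed here.

```python
import math
import random

def metropolis_step(conf, energy, propose, kBT=1.0):
    """One Metropolis Monte Carlo update implementing Eq. (5).

    conf    -- current chain conformation (any representation)
    energy  -- callback returning the effective energy E of a conformation
    propose -- callback applying a randomly chosen elementary chain move and
               returning a trial conformation, or None if the move is blocked
               (e.g., by excluded volume)
    """
    trial = propose(conf)
    if trial is None:
        return conf                      # blocked move: count conf again
    dE = energy(trial) - energy(conf)
    if dE <= 0.0 or random.random() < math.exp(-dE / kBT):
        return trial                     # accept: add trial to the sample
    return conf                          # reject: count conf one more time
```

Repeated application of such updates yields the Boltzmann-distributed sample described above; interpreting the resulting sequence of conformations as a kinetic trajectory rests on the working assumption just discussed.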
3. Reduced Alphabets and Exact Lattice Enumerations
Exact enumeration of lattice conformations has been an investigative tool in polymer physics since the late 1940s. The methodology has contributed to the development of renormalization group analyses of excluded volume effects (see the review by Chan and Dill [3]). Since the late 1980s, simple exact models of protein folding have inherited this rigorous polymer physics technique of exhaustively accounting for all possible conformations [5]. For certain simplified lattice protein models with few-letter alphabets, such as the HP model described below, the method can be extended to exhaustively account for all possible sequences as well [14]. A widely applied simplified lattice protein model is the two-letter HP (hydrophobic–polar) model. Sequences in this model are made up of two residue types – hydrophobic (H) and polar (P) – and chains are configured on 2D square or 3D simple cubic lattices. The HP model is designed to capture the interplay between chain conformational freedom and hydrophobic interactions. For this purpose, the 2D HP model is exceptionally versatile and instructive, its reduced number of spatial dimensions notwithstanding. The hydrophobic effect is a main driving force for protein folding because hydrophobic residues tend to cluster together to avoid water and hence drive a protein to adopt a globular folded form. In the HP model, this effect is minimally modeled by assigning a favorable energy ε (ε < 0) to each non-bonded nearest-neighbor HH (hydrophobic–hydrophobic) contact. All other contacts are neutral (have zero energy). It follows that the energy of a conformation equals E = hε, where h is the number of HH contacts in the conformation. Hence, the density of states of a model sequence is given by g(h), the number of conformations as a function of h. The number of lowest-energy (maximum-h) conformations is often denoted by g and termed the ground-state degeneracy. Only a small fraction, approximately 2.5%, of 2D HP sequences have a unique ground-state conformation (g = 1); these are used as model proteins. Most HP sequences have g > 1. This echoes the experimental observation that an overwhelming majority of random amino acid sequences do not fold like natural proteins. Despite its simplicity, the HP model exhibits many proteinlike properties and has provided numerous insights into protein structure and stability [3, 5]. An example HP model protein sequence and its exactly enumerated g(h) are shown in Fig. 1.
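The exact-enumeration procedure is simple enough to sketch in full. The following sketch enumerates self-avoiding walks on the 2D square lattice for a short, hypothetical HP sequence (the sequence below is illustrative, not the 18-mer of Fig. 1), tallies g(h), and evaluates the native/denatured population ratio of Eq. (2) with E = hε in reduced units (ε = −1, kB = 1). Fixing the first bond removes lattice symmetries from the absolute conformation counts but leaves the shape of g(h) intact.

```python
import math

def enumerate_g(seq):
    """Exactly enumerate self-avoiding walks on the 2D square lattice for an
    HP sequence seq and tally g(h), where h is the number of non-bonded
    nearest-neighbor HH contacts."""
    n, g = len(seq), {}
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def h_contacts(path):
        index = {p: i for i, p in enumerate(path)}
        h = 0
        for i, (x, y) in enumerate(path):
            if seq[i] == 'H':
                for dx, dy in steps:
                    j = index.get((x + dx, y + dy))
                    if j is not None and j > i + 1 and seq[j] == 'H':
                        h += 1      # j > i+1 skips chain-bonded neighbors
        return h                    # and counts each contact pair once

    def grow(path, occupied):
        if len(path) == n:
            h = h_contacts(path)
            g[h] = g.get(h, 0) + 1
            return
        x, y = path[-1]
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:                  # excluded volume
                grow(path + [nxt], occupied | {nxt})

    grow([(0, 0), (1, 0)], {(0, 0), (1, 0)})         # first bond fixed along +x
    return g

g = enumerate_g("HPHPPHHPHH")       # hypothetical 10-mer example sequence
h_max = max(g)                      # maximum-h (ground-state) conformations
for T in (0.2, 0.5, 1.0, 2.0):      # reduced temperature kB*T/|eps|
    w = {h: c * math.exp(h / T) for h, c in g.items()}  # E = -h, weight e^(h/T)
    ratio = w[h_max] / sum(v for h, v in w.items() if h < h_max)
    print(f"T = {T}: [N]/[D] = {ratio:.3g} (native taken as h = {h_max})")
```

As expected from Eq. (2), the population shifts from the maximum-h (native) conformations at low T to the vastly more numerous low-h (denatured) conformations at high T.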
Together with the HP model, a wide diversity of simplified lattice protein models has been extensively investigated during the past decade. Because these models are highly simplified, caution has to be exercised in relating their predictions to experiments; simple-minded interpretations of simplified model results can be misleading. Notably, the mere fact that a model alphabet has 20 letters does not by itself mean that its interactions resemble the physical interactions among the 20 real amino acid types. To address such issues of model interpretation, a physical evaluation of the energetics embodied by some of the recent simplified lattice protein models was provided by Chan et al. [1].
4. Sequence-Structure Mapping and Evolution
Although simplified lattice models with pairwise additive contact energies (these include the HP model) have been instrumental in making fundamental conceptual advances, their minimalist interaction schemes are not sufficient to provide quantitative rationalizations for several key generic thermodynamic and kinetic properties of protein folding ([12, 15]; see below). Nevertheless, despite this limitation, the HP model continues to be valuable, particularly for evolutionary studies, because it offers an exactly enumerable yet physically motivated mapping between sequences and their ground-state conformations. Energetic contributions in a natural protein are “minimally frustrated” in that they tend to consistently favor the same native structure [4]. It follows that even though the HP potential is incomplete, the correspondence between a model sequence's H/P pattern and its ground-state conformation(s) is expected to mimic that of real proteins. This interpretive framework is supported by the observation that the H/P patterns of short 2D HP model protein sequences are similar to those observed among real proteins (reviewed in Ref. [14]). A useful evolutionary concept elucidated by HP and other simplified lattice models is that of the neutral net (Fig. 2). A neutral net is a set of sequences interconnected by single-point mutations and encoding the same native structure. For the examples in Fig. 2, the top left conformation is identical to the native conformation in Fig. 1 (with the same sequence), whereas the top right conformation (encoded by a different sequence) is structurally identical to the h = 6 example conformation in Fig. 1. Figure 2 shows how a sequence encoding the top left conformation may evolve into a sequence encoding the top right conformation by undergoing only two single-point mutations (dashed lines). Not surprisingly, one of the changes necessary for favoring the top right conformation is replacing the P residue in the core of the h = 6 conformation in Fig. 1 with an H residue. Other aspects of sequence-structure mapping can be subtle and less obvious. For instance, the above P → H substitution alone leads only to the g = 4 sequence in Fig. 2, which has four ground-state conformations; one of these is the top left conformation, but the top right conformation is not among them.
[Figure 2 content: neutral-net diagrams listing the HP sequences of the two nets (not reproduced here); the recombined sequence shown at the bottom of the figure is HHHPPHPPPHPPHPHPHH.]
Figure 2. Modeling protein evolution. The two lattice conformations at the top are encoded by two neutral nets (shown below the conformations) consisting, respectively, of the 12 (left) and 9 (right) unique (g = 1) HP sequences listed under the nets. Net topologies are depicted by lines connecting pairs of sequences (represented by diamonds) that differ by a single-point H → P or P → H substitutive mutation. Sequences of the same neutral net are inter-connected by solid lines. The two neutral nets in this figure are linked via a g = 4 sequence (diamond between two dashed lines), which is connected by single-point substitutions (dashed lines) to one sequence in each of the two neutral nets. The latter sequences correspond to the two that are first on the two unique sequence lists, and are shown with their respective native conformations (top). In this example, a crossover between the pair of sequences that are last on the two sequence lists begets a recombined sequence that encodes for a different conformation, which is shown at the bottom of the figure with a dotted box highlighting the 10-residue sub-sequence that originates from the first 10 residues of the parent sequence on the right. The rest of the recombined sequence originates from the last eight residues of the parent sequence on the left.
A further single-point H → P mutation is needed in this case to create a sequence encoding the top right conformation. HP and other model results indicate that sequences in individual neutral nets often conform to a “superfunnel” paradigm, in that native stability tends to increase as a sequence's mutational difference from a centrally located prototype sequence decreases. A hallmark of prototype sequences is their mutational stability. The prototype sequences of the two neutral nets in Fig. 2 correspond, respectively, to the first listed sequence on the left (shown with the top left conformation) and the seventh listed sequence on the right (which is one single-point mutation away from the sequence shown with the top right conformation). The exhaustive coverage of sequence-structure mappings in simplified lattice protein models provides a means to explore the effects of sequence-space topology on evolutionary population dynamics. Recombinatoric evolution has been modeled with simplified lattice protein models as well (Fig. 2, bottom). Here it is noteworthy that in the recombined sequence's native structure (bottom center), the conformation adopted by a sub-sequence (enclosed in the dotted box in Fig. 2) is identical to that of the same sub-sequence in a different native structure (top right) encoded by the parent sequence on the right. More generally, HP model results suggest that a sequence's power to encode a unique structure resides partly in its sequentially local H/P patterns. This observation is reminiscent of, and provides a rationalization for, the autonomous folding units in real proteins. Further details of simple exact models of protein evolution can be found in a recent review [14].
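The neutral-net construction can be sketched on top of an enumeration like the one in the previous section. In the sketch below, the helper native_structures(seq) is a hypothetical callback assumed to return the set of ground-state conformations of a sequence in some canonical representation (e.g., lattice paths with the first bond fixed), so that set comparison between sequences is meaningful; it is not part of any published code.

```python
def point_mutants(seq):
    """All single-point H <-> P mutants of an HP sequence."""
    flip = {'H': 'P', 'P': 'H'}
    return [seq[:i] + flip[seq[i]] + seq[i + 1:] for i in range(len(seq))]

def neutral_neighbors(seq, native_structures):
    """Single-point mutants encoding the same unique native structure.

    native_structures(s) is assumed to return the set of ground-state
    conformations of sequence s (e.g., via exact enumeration)."""
    target = native_structures(seq)
    if len(target) != 1:
        raise ValueError("seq must encode a unique (g = 1) native structure")
    return [m for m in point_mutants(seq) if native_structures(m) == target]
```

Starting from a g = 1 sequence and repeatedly collecting neutral neighbors traverses its neutral net; recording the mutants whose ground states change, or become degenerate as in the g = 4 example above, maps out the connections between nets.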
5. Generic Protein Properties as Stringent Modeling Constraints
An approach for probing real protein energetics is to apply generic experimental protein folding properties as constraints on postulated simplified model interaction schemes. This investigative protocol is effective because even some apparently mundane features of protein folding turn out to be extremely stringent modeling constraints. Therefore, by rigorously evaluating model interactions against experiments, inferences can be made about the functional form of the interactions operating in real proteins (i.e., their “energy landscapes”). A prime example is the thermodynamic cooperativity associated with the thermal denaturation of many small proteins, quantified by a van't Hoff to calorimetric enthalpy ratio ∆HvH/∆Hcal ≈ 1 deduced from experimental CP(T). This condition implies that a protein's density of states is sharply bimodal [10, 12]. It is noteworthy that several simplified lattice protein models deviate significantly from this quantitative criterion. Although not all
real proteins exhibit thermodynamic cooperativity, this observation strongly suggests that the energetics of certain simplified models differ substantially from those of the large class of real proteins that do fold and unfold cooperatively. In general, thermodynamic cooperativity tends to be facilitated by enhancing interaction heterogeneity, which is often achievable by adopting a larger model alphabet [12]. The folding kinetic cooperativity of a growing number of small, single-domain proteins provides an even more stringent modeling constraint. The hallmarks of these proteins' folding/unfolding kinetics are: (i) single-exponential kinetic relaxation; (ii) the logarithms of the folding and unfolding rates (ln kf and ln ku) at constant T are essentially linear in chemical denaturant concentration, i.e., both arms of the “chevron plot” are linear; and (iii) at equilibrium, [N]/[D] = kf/ku. Taken together, conditions (ii) and (iii) imply that for these proteins, ln kf and ln ku are linear in the free energy of unfolding (native stability) ∆Gu ≡ kBT ln([N]/[D]). Apparently, thermodynamic cooperativity is a prerequisite of kinetic cooperativity. The kinetic cooperativity conditions are highly discriminating modeling constraints; Fig. 3 shows that they are not satisfied by the common Gō model. Even though this model has explicit energetic biases favoring only contact interactions present in the native structure (see below), kinetic trapping remains significant. A popular 20-letter lattice model as well as continuum Gō models with pairwise additive contact interactions also fail to exhibit folding kinetic cooperativity [11, 12]. This is a likely cause of the fact that the diversity in the folding rates of these models does not exhibit trends similar to those observed experimentally for real, small, single-domain proteins [12, 15, 16]. In contrast, Fig. 3 shows that a chevron plot with essentially linear arms and single-exponential relaxation is obtainable from a model with a local–non-local coupling mechanism involving many-body interactions. Here “local” or “non-local” refers, respectively, to a small or large separation along the protein chain between the chemical groups involved in a given interaction [3]. The postulated coupling effect in Fig. 3 stipulates that contact interactions (mostly non-local) are strongly favorable only when short stretches of the chain around the pair of contacting residues adopt nativelike local conformations. The success of models embodying this mechanism (e.g., Fig. 3) in producing generic protein thermodynamic and kinetic behavior suggests that similar mechanisms are likely operative in real proteins [12, 16]. From these and related analyses, thermodynamic and kinetic cooperativities emerge as powerful modeling constraints on simplified protein folding models. They complement, and for the case of small, single-domain proteins supersede, several previously proposed modeling criteria, such as the ratio between the folding temperature and the glass transition temperature. The earlier criteria tackled similar energetic issues insightfully but made less quantitative connections with experiments [12].
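Conditions (ii) and (iii) can be combined into a simple numerical self-consistency check. The sketch below uses made-up rate values in reduced units, purely for illustration: given folding and unfolding rates, the two-state relation [N]/[D] = kf/ku yields the native stability ∆Gu = kBT ln(kf/ku), so if both chevron arms are linear in denaturant concentration, ∆Gu is linear in it as well.

```python
# Minimal sketch: two-state consistency between chevron rates and stability.
# The rates below are hypothetical numbers in reduced units, for illustration.
import math

kBT = 1.0
chevron = [        # (denaturant concentration, kf, ku)
    (0.0, 1.0e-4, 1.0e-8),
    (2.0, 1.0e-5, 1.0e-7),
    (4.0, 1.0e-6, 1.0e-6),
    (6.0, 1.0e-7, 1.0e-5),
]
for c, kf, ku in chevron:
    dGu = kBT * math.log(kf / ku)    # condition (iii): [N]/[D] = kf/ku
    print(f"[denaturant] = {c}: ln kf - ln ku = dGu/kBT = {dGu/kBT:+.2f}")
```

With log-linear arms, as here, ∆Gu/kBT decreases linearly with denaturant concentration; deviations from this pattern, such as the trapped folding arm of the Gō-model chevron in Fig. 3, signal a breakdown of kinetic cooperativity.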
Figure 3. Modeling protein folding cooperativity. Chevron plots showing the dependence of ln kf (squares and circles) and ln ku (triangles and diamonds) on native stability in units of kBT, computed for two 3D lattice 27mer models using Metropolis Monte Carlo dynamics. The models have the same unique ground-state conformation (structure shown on the right) but different interaction schemes. The upper plot is obtained using the common pairwise additive Gō potential, whereas the lower plot incorporates a many-body mechanism with local–non-local coupling. The ∆Gu/kBT scale is for the lower plot. A lack of discrepancy between the open and filled symbols indicates that kinetic relaxation is essentially single-exponential. Kinetic trapping causes the folding arm of the common Gō-model chevron plot to deviate significantly from linearity. In contrast, the chevron plot of the model with local–non-local coupling exhibits features similar to those of real, small, single-domain proteins. A hypothetical rate dependence (fitted V-shape) consistent with the apparent two-state equilibrium thermodynamics of the latter model is shown to coincide substantially with the quasi-linear regime of the lower chevron plot (see Ref. [16] for further details).
6. Continuum (Off-lattice) Models: Limitations of Native-centric Methods
As with simplified lattice models, simplified continuum chain models of protein folding have utilized general, transferable potentials [6, 10] as well as “native-centric” (Gō-like) potentials with explicit energetic biases towards a particular target native structure [8, 9, 11]. Models with transferable potentials are reductionist in nature: their starting point is the general microscopic interactions presumed by the model, and in that sense they conform more closely to a deductive general physical theory. However, it is not yet computationally feasible to address many thermodynamic and kinetic questions of interest using
explicit-solvent all-atom molecular dynamics models. At the same time, simplified continuum models with general transferable potentials often lead to much more kinetic trapping than is observed experimentally in real, small proteins, indicating that such model potentials might not have captured certain key ingredients of real protein energetics. In this context, native-centric continuum models emerge as a complementary approach. These models stipulate that a protein's known native structure contains significant information about its actual energetics, even if the precise nature of that energetics remains to be elucidated. Assuming that the energetics can be approximately captured by a Gō-like potential that explicitly favors a set of native contacts, these models are then applied to explore aspects of protein folding that are not immediately obvious from the presumed Gō-like potential itself. Native-centric continuum modeling has provided much useful insight into folding kinetics, especially the transition state barrier to protein folding [8, 9]. However, because a Gō-like potential cannot be totally realistic, and there are inevitably many arbitrary features in the definition of a native-centric model, extra caution has to be used in assessing the robustness of their predictions [11]. Figure 4 examines this question by comparing two alternate definitions of native contacts in the literature. It shows that the free energy profiles predicted by two native-centric Langevin dynamics models with the same Cα chain representation of the same protein are sensitive to the choice of native contact set. Whereas the upper profile has a single peak region between the N and D states, the lower profile exhibits a dip on top of the overall peak. This difference may be interpreted to imply that while a high-energy folding intermediate is present on the lower profile, it is absent from the upper profile. This discrepancy underscores that predictions from a particular formulation of a Gō-like potential can only be regarded as tentative.
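As a concrete, deliberately minimal illustration of what "native-centric" means at the level of the potential, a pairwise additive Gō-like contact energy can be written in a few lines. This is a generic sketch, not the specific potential of Refs. [8, 9, 11]; conformations and contact sets are assumed to be represented as sets of residue-index pairs.

```python
def go_contact_energy(current_contacts, native_contacts, eps=-1.0):
    """Pairwise additive Go-like potential: every native contact present in
    the current conformation contributes a favorable energy eps (< 0);
    non-native contacts are neutral in this minimal scheme."""
    return eps * len(current_contacts & native_contacts)
```

The sensitivity illustrated in Fig. 4 enters precisely through the choice of native_contacts: two defensible definitions of the native contact set give the same functional form but different free energy profiles.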
7. Outlook
Simplified models have proven to be a powerful analytical tool for conceptual development and semi-quantitative rationalization of protein folding. Their ability to capture essential physics is well appreciated. At the same time, their intrinsic limitations should be recognized. Simplified models are most effective when attention is directed towards both their failures and their successes in reproducing experimental protein properties. As in the case of cooperativity discussed above, rigorous evaluations of model predictions against experiments and persistent efforts to resolve their discrepancies are indispensable for progress. As the field advances, it is expected that an increasingly positive feedback between simplified and detailed modeling will ensue. Mesoscopic organizing principles emerging from simplified models should be used to provide novel ideas for detailed modeling. Detailed
Figure 4. Continuum Gō-like model predictions are sensitive to the definition of native contacts. This figure compares the native contact sets NCS1 and NCS2 given in Ref. [11] for the 64-residue truncated form of chymotrypsin inhibitor 2. Results here are obtained using the “with-solvation” contact potential that incorporates rudimentary pairwise desolvation barriers. Thin lines connecting Cα positions along the protein's backbone traces (thick line segments) show native contacts common to both NCS1 and NCS2 (middle drawing), as well as those in NCS2 but not in NCS1 (left drawing) and those in NCS1 but not in NCS2 (right drawing). The upper and lower curves spanning the denatured (D) and native (N) minima are free energy profiles for NCS2 and NCS1, respectively, as functions of the fractional number of native contacts Q. (Figure courtesy of Hüseyin Kaya; see Ref. [11] for further details.)
modeling and experiment in turn are necessary for verifying or falsifying those very ideas, and for determining whether the proposed organizing principles are atomistically feasible. Thus, a solution to the protein folding problem may be incrementally approached by systematically bridging the gap between simplified and detailed models.
References
[1] H.S. Chan, H. Kaya, and S. Shimizu, “Computational methods for protein folding: scaling a hierarchy of complexities,” In: T. Jiang, Y. Xu, and M.Q. Zhang (eds.), Current Topics in Computational Molecular Biology, The MIT Press, Cambridge, MA, pp. 403–447, 2002.
[2] R.B. Laughlin and D. Pines, “The theory of everything,” Proc. Natl. Acad. Sci. USA, 97, 28–31, 2000.
[3] H.S. Chan and K.A. Dill, “Polymer principles in protein structure and stability,” Annu. Rev. Biophys. Biophys. Chem., 20, 447–490, 1991.
[4] J.D. Bryngelson, J.N. Onuchic, N.D. Socci, and P.G. Wolynes, “Funnels, pathways, and the energy landscape of protein folding – a synthesis,” Proteins Struct. Funct. Genet., 21, 167–195, 1995.
[5] K.A. Dill, S. Bromberg, K. Yue, K.M. Fiebig, D.P. Yee, P.D. Thomas, and H.S. Chan, “Principles of protein folding – a perspective from simple exact models,” Protein Sci., 4, 561–602, 1995.
[6] D. Thirumalai and S.A. Woodson, “Kinetics of folding of proteins and RNA,” Acc. Chem. Res., 29, 433–439, 1996.
[7] J.N. Onuchic, H. Nymeyer, A.E. García, J. Chahine, and N.D. Socci, “The energy landscape theory of protein folding: insights into folding mechanisms and scenarios,” Adv. Protein Chem., 53, 87–152, 2000.
[8] L. Mirny and E. Shakhnovich, “Protein folding theory: from lattice to all-atom models,” Annu. Rev. Biophys. Biomol. Struct., 30, 361–396, 2001.
[9] C. Clementi, H. Nymeyer, and J.N. Onuchic, “Topological and energetic factors: what determines the structural details of the transition state ensemble and ‘en-route’ intermediates for protein folding? An investigation for small globular proteins,” J. Mol. Biol., 298, 937–953, 2000.
[10] T. Head-Gordon and S. Brown, “Minimalist models for protein folding and design,” Curr. Opin. Struct. Biol., 13, 160–167, 2003.
[11] H. Kaya and H.S. Chan, “Solvation effects and driving forces for protein thermodynamic and kinetic cooperativity: how adequate is native-centric topological modeling?,” J. Mol. Biol., 326, 911–931, 2003. [Corrigendum: 337, 1069–1070, 2004].
[12] H.S. Chan, S. Shimizu, and H. Kaya, “Cooperativity principles in protein folding,” Meth. Enzymol., 380, 350–379, 2004.
[13] S. Shimizu and H.S. Chan, “Anti-cooperativity and cooperativity in hydrophobic interactions: three-body free energy landscapes and comparison with implicit-solvent potential functions for proteins,” Proteins: Struct. Funct. Genet., 48, 15–30, 2002.
[14] H.S. Chan and E. Bornberg-Bauer, “Perspectives on protein evolution from simple exact models,” Appl. Bioinformat., 1, 121–144, 2002.
[15] H.S. Chan, “Protein folding: matching speed with locality,” Nature, 392, 761–763, 1998.
[16] H. Kaya and H.S. Chan, “Contact order dependent protein folding rates: kinetic consequences of a cooperative interplay between favorable nonlocal interactions and local conformational preferences,” Proteins Struct. Funct. Genet., 52, 524–533, 2003.
5.17 PROTEIN FOLDING: DETAILED MODELS
Vijay Pande
Department of Chemistry and of Structural Biology, Stanford University, Stanford, CA 94305-5080, USA
1. Goals and Challenges of Atomistic Simulation
Proteins play a fundamental role in biology. With their ability to perform numerous biological roles, including acting as catalysts, antibodies, and molecular signals, proteins today realize many of the goals that modern nanotechnology aspires to. However, before proteins can carry out these remarkable molecular functions, they must perform another amazing feat – they must assemble themselves. This process of protein self-assembly into a particular shape, or “fold”, is called protein folding. Due to the importance of the folded state in the biological activity of proteins, recent interest in misfolding-related diseases [1], and a fascination with just how this process occurs [2–4], much work has been performed in order to unravel the mechanism of protein folding [5]. There are two approaches one can take in molecular simulation. One direction is to perform coarse-grained simulations, using simplified models. These models typically make either simplifying assumptions (such as Gō models, which use simplified Hamiltonians [6]), or use coarse-grained representations (such as an alpha-carbon-only model of the protein [7]), or potentially both. While these methods are often first considered for their computational efficiency, perhaps an even greater benefit of simplified models is their ability to yield insight into general properties involved in protein folding. However, with any model or approach there are limitations, and the cost of potential insight into general properties of folding is restricted applicability to any particular protein system. Insight from folding simulations with simplified models is discussed in detail in Hue Sun Chan's contribution. Alternatively, one can examine more detailed models. These models typically have full atomic detail, often for both the protein and the solvent. Detailed models have the obvious benefit of potentially greater fidelity to
experiment. However, this comes at two great costs. First, the computational demands of performing the simulation at all become enormous. Second, the added degrees of freedom lead to an explosion of extra detail and simulation-generated data; the act of gleaning insight from this sea of data is no simple task and is often underestimated, especially in light of the more straightforward (although still often overwhelming) task of simply performing the simulations.
1.1. Why are Detailed Models Worth This Enormous Effort in Both Simulation and Analysis?
First, quantitative comparison between theory and experiment is critical for validating simulation as well as lending interpretation to experimental results. While it is generally held that experiments will not be able to yield the detail and precision available in simulations (and that simulations may well be the only way one can fully understand the folding mechanism [8]), without quantitative validation of simulations there is no way to know whether the simulation model or methodology is sufficiently accurate to yield a faithful reproduction of reality. Indeed, without a quantitative comparison to experiment, there is no way to decisively arbitrate the relative predictive merits of one model over another. Second, detailed models potentially have greater predictive power. In principle, a detailed model should allow one to start purely from the protein sequence and, by simulating the physical dynamics of protein folding, yield everything that one can measure experimentally, including folding and unfolding rates, free energies, and the detailed geometry of the folded state. In practice, the ability of detailed models to achieve these lofty goals rests both on the ability to carry out the computationally demanding kinetics simulations and on the ability of current models (force fields) to yield sufficiently accurate representations of interatomic interactions.
1.2. What are the Challenges for Atomistic Simulation?
First, one must consider the source of the great computational demands of molecular simulation at atomic detail. To simulate dynamics, one typically performs molecular dynamics, numerically integrating Newton’s equations of motion for all the atoms in the system. By choosing to model with atomic degrees of freedom, one must simulate the dynamics at the timescales of atomic motion; typically, this means that one must resolve motions down to the femtosecond timescale. Indeed, if the timestep involved in numerical integration
is pushed too high (without constraining degrees of freedom), the numerical integration becomes unstable. This leads to the simple problem that if one wants to reach the millisecond timescale by taking femtosecond steps, many ($10^{12}$) steps must be taken. While modern molecular dynamics codes are extremely well optimized and typically perform millions of steps per CPU day, this clearly falls short of what is needed (see Fig. 1).

However, even if one could reach the relevant timescales, the next question is whether our models would be sufficiently accurate. In particular, would we reach the folded state, would the folded state be stable (with a free energy of stability comparable to experiment), and would we reach the folded state at a rate comparable to experiment? Indeed, if one could quantitatively predict protein folding rates, free energies of stability, and structures, one would be able to predict essentially everything that one can measure experimentally. While rates and free energies themselves can only indirectly detail the nature of how proteins fold, clearly the ability to quantitatively predict all experimental observables is a necessary prerequisite for any successful theory or simulation of protein folding. However, a quantitative prediction of all experimental observables is not sufficient. If a simulation could only reproduce experiments, the simulation would not yield any new insight, which is of course the goal of simulations in the first place. This leads to a third important challenge for simulation: gaining insight from simulations. Indeed, as one adds detail to simulations, the burden of analysis becomes greater and greater. Atomistic simulations can easily generate gigabytes of data to be processed, and the vast number of degrees of freedom from time-resolved protein and water coordinates can obscure any simple, direct analysis of the folding mechanism.
Figure 1. Relevant timescales for protein folding. While detailed simulations must start with femtosecond timesteps, the timescales one would like to reach are much longer, requiring billions (microseconds) to trillions (milliseconds) of iterations. Typical fast, modern CPUs can do approximately a million iterations in a day, posing a major challenge for detailed simulation.
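To make the gap concrete, a back-of-the-envelope estimate using only the figures quoted above (femtosecond steps, roughly a million MD steps per CPU day) is sketched below; the numbers are the text's, not a benchmark.

```python
# Back-of-the-envelope estimate of the timescale gap described above:
# femtosecond timesteps vs. millisecond folding times.
timestep_fs = 1.0                       # integration timestep (fs)
target_fs = 1.0e12                      # 1 ms expressed in fs
steps_needed = target_fs / timestep_fs  # ~1e12 iterations

steps_per_cpu_day = 1.0e6               # "millions of steps per CPU day"
cpu_days = steps_needed / steps_per_cpu_day
print(f"{steps_needed:.1e} steps -> {cpu_days:.1e} CPU days "
      f"(~{cpu_days / 365:.0f} CPU years)")
```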
2. Models: Atomistic Models for Protein Folding

2.1. Atomic Force Field
Atomistic models for protein folding typically utilize a classical force field which attempts to reproduce the physical interactions between the atoms in the protein and solvent. The energy of the system is defined as the sum of interatomic potentials, which consists of several terms:

$$E = E_{\mathrm{LJ}} + E_{\mathrm{Coulomb}} + E_{\mathrm{bonded}} \qquad (1)$$
The van der Waals interaction between atoms is modeled by a Lennard–Jones energy ($E_{\mathrm{LJ}}$):

$$E_{\mathrm{LJ}} = \sum_{ij} \varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] \qquad (2)$$
where $\sigma_{ij}$ is related to the size of atoms i and j, and $\varepsilon_{ij}$ to the strength of their interaction. While the van der Waals attraction is relatively weak, the LJ potential also serves an important role in providing the hard-core repulsion between atoms. The bonded interactions modeled in $E_{\mathrm{bonded}}$ handle the specific stereochemistry of the molecule – in particular, the nature of the covalent bonds and the steric constraints on the angles and dihedral angles of the molecule. While these interactions are clearly local, they play a very important role in determining the conformational space of the molecule. Finally, $E_{\mathrm{Coulomb}}$ corresponds to the familiar Coulomb's law:

$$E_{\mathrm{Coulomb}} = \sum_{ij} \frac{q_i q_j}{r_{ij}} \qquad (3)$$

where $q_i$ is the charge on atom i and $r_{ij}$ the distance between atoms i and j. It is perhaps most natural to handle the pairwise interactions explicitly as in the equations above, but this of course leads to simulation codes whose performance scales like $N^2$, where N is the number of atoms. Clearly, this is very computationally demanding, and ideally the calculation can be made O(N). For inherently short-range interactions, it is natural to do this with cutoffs and long-range corrections, i.e., to set the potential to zero smoothly beyond some cutoff distance, e.g., 12 Å. However, this cutoff procedure has been shown to lead to qualitatively incorrect results for Coulomb interactions [9], and instead reaction-field or Ewald-based methods have been suggested, with significantly better results [10].

Clearly, there are many parameters in the above formulae. Indeed, their number grows further when one considers the fact that the chemical environment renders even the same chemical element (e.g., carbon) very different in different settings. For example, carbon in a hydrocarbon chain will behave fundamentally differently than carbon in an aromatic
ring. In order to handle this purely quantum mechanical effect in a classical model, one creates multiple atom types (corresponding to the different relevant environments) for each physical atomic element. In this example, one would define different carbon atom types. Thus, while there are only a handful of relevant physical elements involved (primarily carbon, hydrogen, oxygen, and nitrogen), there can be tens of different atom types. While this is clearly the natural way to handle the role of chemical environment in a classical model, it leads to an explosion of parameters needed in the model, creating the modeling challenge of determining these parameters. Several groups have risen to this challenge and have developed parameterizations for force field functionals similar to the form above. Typically, these parameterizations are divided into terms for the protein (such as AMBER [11], CHARMM [12], and OPLS [13]) and for the solvent (e.g., TIP3P or SPC). These force fields are often associated with molecular dynamics codes of the same name, and one should not confuse the two.
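As a concrete, if toy, illustration of Eqs. (1)–(3), the sketch below sums the nonbonded LJ and Coulomb terms over all atom pairs with the naive O(N²) double loop mentioned above. All parameter values and the combination rules are illustrative placeholders, not values from AMBER, CHARMM, or OPLS.

```python
import numpy as np

def nonbonded_energy(pos, q, sigma, eps):
    """Naive O(N^2) sum of the LJ (Eq. 2) and Coulomb (Eq. 3) terms."""
    e_lj = e_coul = 0.0
    n = len(pos)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(pos[i] - pos[j])
            s = 0.5 * (sigma[i] + sigma[j])   # illustrative combination rule
            e = np.sqrt(eps[i] * eps[j])
            e_lj += e * ((s / r) ** 12 - (s / r) ** 6)
            e_coul += q[i] * q[j] / r         # Coulomb term, as in Eq. (3)
    return e_lj + e_coul

# Three hypothetical atoms (arbitrary but mutually consistent units)
pos = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [0.0, 4.2, 0.0]])
q = np.array([0.4, -0.4, 0.0])
sigma = np.array([3.4, 3.4, 3.4])
eps = np.array([0.1, 0.1, 0.1])
print(nonbonded_energy(pos, q, sigma, eps))
```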
2.2. Implicit Solvation Models
With the parameterization above for the physical forces between atoms, one can simulate the relevant interactions: protein–protein, protein–solvent, and solvent–solvent. However, in typical simulations with the solvent represented explicitly (i.e., directly simulating the solvent atom by atom), the number of solvent atoms is much larger than the number of protein atoms, and thus most (e.g., 90%) of the computational time goes into simulating the solvent. Clearly, the solvent plays an important role, since the hydrophobic and dielectric properties of water play a fundamental role in protein stability. However, an alternative to explicit simulation of water is to include these properties implicitly, i.e., by using a continuum model. Typically, these models account for hydrophobicity in terms of some free energy price for solvent-exposed area on the protein. These surface area (SA) based methods vary somewhat in terms of how the SA is calculated. We stress that one should not a priori expect that a simpler (and perhaps less accurate) calculation of the SA yields worse results than a more geometrically accurate SA calculation. Indeed, since SA is itself an approximation, what is important for the fidelity of the model is not the geometric accuracy of the SA but rather whether the SA term faithfully reproduces the physical effect, as judged by comparison to experiment.

The dielectric contribution of water to the free energy is in some ways a more difficult contribution to consider. The canonical method follows the Poisson–Boltzmann equation. Consider a protein in solvent. We can model the protein as a dielectric medium with a dielectric constant εin and water as a medium with a dielectric constant εout (thus making the dielectric a function of spatial
position, ε(x)). Also, consider that the protein will likely have charges with a spatial density $\rho_{\mathrm{protein}}(x)$, and that there will be counter ions in the solvent with a charge density $\rho_{\mathrm{counter}}(x)$. In this case, we can describe the interaction between the charges and the dielectrics as:

$$\nabla \cdot [\varepsilon(x)\nabla\phi(x)] = -4\pi\rho(x) = -4\pi[\rho_{\mathrm{protein}}(x) + \rho_{\mathrm{counter}}(x)]$$

where the total charge density ρ(x) comprises both the protein and counter-ion charges. If one assumes that the counter-ion density is driven thermodynamically to its free energy minimum, we can make a mean-field-like approximation and state that

$$\rho_{\mathrm{counter}}(x) = \sum_i n_i q_i \exp\left(-\frac{q_i \phi(x)}{kT}\right) \qquad (4)$$

where $n_i$ is the bulk number density of counter-ion species i and $q_i$ is its charge. Thus, this method handles the counter ions implicitly, as well as the water. Including this term leads to the so-called nonlinear Poisson–Boltzmann equation. If the Boltzmann term is Taylor expanded for small $q_i\phi(x)/kT$ (i.e., high temperature, low counter-ion concentration, or low potential strength), one gets the so-called linearized Poisson–Boltzmann equation.

In general, the Poisson–Boltzmann equation is considered by many to be the “gold standard” for implicit solvation calculations. It can be used for both energy and force calculations [14] and thus can be used for molecular dynamics. However, PB calculation is typically very computationally demanding, and there has been much effort to develop more computationally tractable, empirical approximations to the PB equation. For example, Still and coworkers developed an empirical approximation to PB [15]. Based on a generalization of the Born equation for the potential of atoms, Still's Generalized Born (GB) model (and its subsequent variants from Still's group and other groups) has been shown to be both computationally tractable and quantitatively accurate for some problems, including the solvation free energy of small molecules [15] and protein folding kinetics [16].
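To spell out the linearization step just mentioned: expanding the Boltzmann factor in Eq. (4) to first order and using bulk charge neutrality ($\sum_i n_i q_i = 0$) gives (a standard derivation, shown here in the same Gaussian units as above):

$$\rho_{\mathrm{counter}}(x) \approx \sum_i n_i q_i\left(1 - \frac{q_i\phi(x)}{kT}\right) = -\left(\sum_i \frac{n_i q_i^2}{kT}\right)\phi(x),$$

so that

$$\nabla\cdot[\varepsilon(x)\nabla\phi(x)] - \frac{4\pi}{kT}\left(\sum_i n_i q_i^2\right)\phi(x) = -4\pi\rho_{\mathrm{protein}}(x).$$

For a uniform dielectric ε, the coefficient of φ defines the familiar inverse Debye screening length, $\kappa^2 = (4\pi/\varepsilon kT)\sum_i n_i q_i^2$.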
2.3. How Accurate are these Models?
Any question of accuracy must consider the desired quantity to be calculated. While the GB model is in some ways empirically derived, it has been shown to agree reasonably well with PB calculations. More importantly, GB models have been able to accurately predict experiment, such as the solvation free energies of small molecules [15, 17]. In the end, experiment must of course be the final arbiter of any theoretical method. Moreover, while PB is on much firmer mathematical footing (i.e., one can derive it directly from the Poisson equation), one must consider that PB itself is empirical in
nature in some respects. Clearly, the dielectric constant is a macroscopic concept, and it is an approximation of sorts to apply such macroscopic concepts to the microscopic world of small molecules and proteins (hundreds to thousands of atoms). However, PB's great success as a predictive tool demonstrates the validity (or at the very least the predictive power) of such methods and approximations.
3. Sampling: Methods to Tackle the Long Timescales Involved in Folding
Simulating the mechanism of protein folding is a great computational challenge due to the long timescales involved. Below, we briefly summarize some methods which have been used to address this challenge. As in any computational method, each has its own limitations and it is natural to consider the regime of applicability of each method.
3.1. Diffusion–Collision Models
Due to the intrinsic hierarchical nature of the structure of proteins, it is intriguing to consider whether folding kinetics may also be hierarchical [18, 19]. Indeed, a natural hypothesis for the mechanism of protein folding is that secondary structural elements (such as alpha helices and beta hairpins) form first, and then diffuse and subsequently collide to form tertiary contacts and thus finally the folded state. A generalization of this idea is that one can drastically simplify the conformational space of proteins into a series of specific states, perhaps determined by the formation of secondary structure, but not necessarily so. In this model, protein elements form, diffuse, and then collide to form larger tertiary structure. Weaver and Karplus first proposed this model for folding and demonstrated its ability to predict folding kinetics [20]. With this simplification, one can model the folding mechanism (and predict folding rates) by kinetically connecting these states, including the rates between the states in a diffusion equation. This allows one to predict the overall rate (to compare with experiment) as well as the rate of formation of any relevant accumulating intermediate states [21, 22].
3.2. High Temperature Unfolding
While folding times are clearly long from a simulation point of view, unfolding (especially under strongly denaturing conditions) can be very fast – on the nanosecond timescale. Under extreme denaturing conditions (e.g., ∼400 K
temperature), one would expect that the folded state would become only metastable, with a low barrier to unfolding. If one is considering an energy (not free energy) barrier, increasing the temperature speeds kinetics in a relatively simple manner, in that the thermal energy to cross the barrier is simply higher, increasing the rate of barrier crossing. Protein unfolding at high temperatures is more complex, since entropy plays a dominant role in protein folding and unfolding. Thus, as one changes the temperature, one changes the nature of the underlying free energy landscape and barrier. Daggett and Levitt [24] first took advantage of this scenario, and Daggett's group has subsequently used this method to examine a variety of proteins and compare their results to experiment, especially by comparing phi values calculated from high-temperature unfolding with experimental measurements [8, 25].

It is interesting to ask: what is the regime of applicability of high-temperature unfolding? Clearly, if the temperature is too high, it can lead to qualitative changes of the underlying free energy landscape. For example, at very high temperatures, the free energy barrier will be completely lost, and unfolding will occur as a “downhill” process. However, at less severe temperatures, the free energy barrier will still be present, although it is not guaranteed that alternate pathways will not become the rate-limiting step in unfolding, and one may expect the transition state to shift to be more native-like. Nevertheless, this method has been used to predict experimental phi values, demonstrating agreement with experiment [8, 25]. The interesting next step is to understand the reasons for this intriguing level of agreement.
3.3. Low-Viscosity Simulation
Another common means of tackling long timescales is to use an implicit solvation model with low viscosity. In implicit solvation models, one typically uses the Langevin equation for dynamics and employs a damping term consistent with water-like viscosity. However, water is relatively viscous, and one need not use the full water viscosity. Instead, many groups have proposed using viscosities 1/100 to 1/1000 that of water – or even no viscosity at all. Clearly, lowering the viscosity will speed the kinetics, but potentially at the risk of altering the nature of the kinetics, not only the rate [26]. Indeed, consider the case of protein folding from a random-walk configuration. In a water-viscosity simulation, collapse is slowed by the viscosity, and the protein anneals its conformation (presumably forming native-like structure) along the way to collapse, thus collapsing into a partially native (or potentially completely native) structure. In a low-viscosity simulation, collapse is no longer rate limiting, and thus one quickly reaches a collapsed but random globule; folding then proceeds via conformational rearrangements of this collapsed globule, which is a very different mechanism than that of the
water-like viscosity simulations. Moreover, this change in mechanism is likely responsible for the nonlinear relationship between folding time and Langevin damping constant γ found in recent simulations [26]: near water-like viscosity, there is a linear correlation between folding time and γ. However, at low γ values, this correlation becomes nonlinear, and thus one cannot reliably extrapolate in γ from low-viscosity simulations to predict water-like-viscosity rates [26]. Nevertheless, for applications in which kinetic information is not needed, this approach may offer a significant speedup (and a competitive advantage of implicit solvation models in general over explicit solvation models).
3.4. Sampling of Paths Using Relatively Short Trajectories
One fundamental property of all barrier-crossing systems is that the time to cross the barrier on an activated trajectory will be considerably shorter than the mean folding time. Indeed, such a property is also a direct consequence of single-exponential kinetics. In this case, the probability that a given molecule folds at time t is given by p(t) = k exp(−kt), where k is the rate of folding. This probability can be integrated to get the more familiar fraction of molecules which fold by time t: f(t) = 1 − exp(−kt). By either measure, we see that the probability density for a molecule to fold is highest at short times (much shorter than 1/k). Of course, eventually this single-exponential approximation will break down, leaving the more direct question of the ratio of the barrier-crossing time to 1/k. In simple systems, such as molecular isomerization, this time is likely very short, on the picosecond timescale [27]. For protein folding, these timescales are likely much longer, on the nanosecond timescale [28–30]. For systems in which the barrier-crossing time is amenable to simulation (but perhaps simulating timescales of 1/k is not), path-sampling methodologies can likely play a useful role [31–33]. The path-sampling methods pioneered by Pratt's and Chandler's groups seek to sample a series of uncorrelated paths connecting reactant and product species (unfolded and folded states in the case of protein folding, for example). These paths can be generated by taking an initial molecular dynamics trajectory and then “shooting” off new trajectories, starting a new trajectory from an existing part of the original trajectory (i.e., with the same coordinates) but with a variation in the momenta, such that the new trajectory takes a different path. Thus, shooting leads to a series of decorrelated trajectories. Path sampling-based methods have been successfully used by Bolhuis' group to simulate the folding of the C-terminal beta hairpin of protein G [34]. Bolhuis' work agreed with the previously proposed mechanism, and the predicted rate agreed quantitatively with experiment.
One caveat with these path-sampling methods is the need to characterize the reactant and product states for the generation of folding trajectories. Indeed, this characterization itself is difficult, and a change in this characterization would require rerunning the path-sampling simulation. Also, one must typically choose a timescale for barrier-crossing trajectories, and this quantity is difficult to assess a priori, especially in the case of protein folding. To avoid these complications, one can take a far simpler approach and simply run many uncoupled trajectories. This method is another application of the separation between the barrier-crossing time and the folding time: many relatively short trajectories can be naturally analyzed in the single-exponential folding regime to yield information on folding mechanisms and rates. This method has been applied by Pande's group to several small, fast-folding proteins. By running tens of thousands of atomistic molecular dynamics simulations, each on the tens-of-nanoseconds timescale, Pande's group has been able to simulate the folding of small, fast-folding proteins with quantitative agreement with experiment [29, 35–37]. The primary benefits of this method are that it (1) can fold directly from sequence, since it does not require knowledge of the native state to generate the simulations (although knowledge of the native state was used in the analysis of the data), and (2) can yield quantitative agreement with experimental rates, typically within the experimental uncertainty.

There are a few important caveats to this method. First, one must make sure that each trajectory exceeds the typical barrier-crossing time (typically on the nanosecond timescale) [38]; this may or may not be possible for a given system and is difficult to know a priori. Second, while the analysis of multiple short trajectories need not assume single-exponential kinetics, the usefulness of this method is strongest in cases where the second most rate-limiting step is faster than the simulated trajectory timescale (i.e., faster than tens of nanoseconds). This will likely pose a challenge for larger, more complex proteins, which may appear single-exponential experimentally but have second most rate-limiting steps with long-timescale phases (e.g., hundreds of nanoseconds). Finally, one major caveat of this method is its great computational demand, typically requiring distributed computing methods [39]. However, with grid and distributed computing resources becoming popular, one might expect this to become less of a burden in the coming years; moreover, for this class of problems the approach will always be more computationally efficient than traditional tightly coupled parallel molecular dynamics.
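A hedged numerical sketch of this many-short-trajectories idea: under single-exponential kinetics, the folded fraction at short times grows as f(t) ≈ kt, so the rate can be estimated as k ≈ n_folded/(N·t_sim). All numbers below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

k_true = 1.0 / 10_000.0   # "true" rate: mean folding time 10 us (units of 1/ns)
n_traj = 20_000           # number of uncoupled trajectories
t_sim = 20.0              # length of each trajectory (ns)

# Draw first-passage (folding) times from the exponential distribution
fold_times = rng.exponential(1.0 / k_true, size=n_traj)
n_folded = np.sum(fold_times < t_sim)

k_est = n_folded / (n_traj * t_sim)   # valid when k * t_sim << 1
print(f"folded {n_folded}/{n_traj}; k_est = {k_est:.2e} /ns "
      f"(true {k_true:.2e} /ns)")
```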
References
[1] C.M. Dobson, Trends Biochem. Sci., 24, 329–332, 1999. [2] V. Grantcharova, E.J. Alm, D. Baker, and A.L. Horwich, Curr. Opin. Struct. Biol., 11, 70–82, 2001.
[3] C.M. Dobson, A. Sali, and M. Karplus, Angew. Chem. Int. Ed. Engl., 37, 868–893, 1998. [4] V.S. Pande, A. Grosberg, T. Tanaka, and D.S. Rokhsar, Curr. Opin. Struct. Biol., 8, 68–79, 1998. [5] M. Levitt, Nat. Struct. Biol., 8, 392–393, 2001. [6] H. Abe and N. Go, Biopolymers, 20, 1013, 1981. [7] V.S. Pande and D.S. Rokhsar, Proc. Natl. Acad. Sci. U.S.A., 95, 1490–1494, 1998. [8] U. Mayor, N.R. Guydosh, C.M. Johnson, J.G. Grossmann, S. Sato, G.S. Jas, S.M. Freund, D.O. Alonso, V. Daggett, and A.R. Fersht, Nature, 421, 863–867, 2003. [9] P.J. Steinbach and B.R. Brooks, J. Comput. Chem., 15, 667–683, 1994. [10] I.G. Tironi, R. Sperb, P.E. Smith, and W.F. van Gunsteren, J. Chem. Phys., 102, 5451–5459, 1995. [11] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, and P.A. Kollman, J. Am. Chem. Soc., 117, 5179–5197, 1995. [12] B.R. Brooks, R.E. Bruccoleri, B.D. Olafson, D.J. States, S. Swaminathan, and M. Karplus, J. Comp. Chem., 4, 187–217, 1983. [13] W.L. Jorgensen and J. Tirado-Rives, J. Am. Chem. Soc., 110, 1657–1666, 1988. [14] J.A. Grant, B.T. Pickup, and A. Nicholls, J. Comput. Chem., 22, 608–640, 2000. [15] D. Qiu, P.S. Shenkin, F.P. Hollinger, and W.C. Still, J. Phys. Chem. A, 101, 3005–3014, 1997. [16] V.S. Pande, I. Baker, J. Chapman, S. Elmer, S. Kaliq, S. Larson, Y.M. Rhee, M.R. Shirts, C. Snow, E.J. Sorin, and B. Zagrovic, Biopolymers, 68, 91–109, 2003. [17] V. Tsui and D.A. Case, Biopolymers, 56, 275–291, 2001. [18] R.L. Baldwin and G.D. Rose, TIBS, 24, 26–33, 1999. [19] R.L. Baldwin and G.D. Rose, TIBS, 24, 77–83, 1999. [20] M. Karplus and D.L. Weaver, Protein Sci., 3, 650–668, 1994. [21] R.V. Pappu and D.L. Weaver, Protein Sci., 7, 480–490, 1998. [22] R.E. Burton, J.K. Meyers, and T.G. Oas, Biochemistry, 37, 5337–5343, 1998. [23] V. Daggett, A. Li, L.S. Itzhaki, D.E. Otzen, and A.R. Fersht, J. Mol. Biol., 257, 430–440, 1996. [24] V. Daggett and M. Levitt, J. Mol. Biol., 232, 600–619, 1993. [25] V. Daggett and A.R. Fersht, Trends Biochem. Sci., 28, 18–25, 2003. [26] B. Zagrovic and V. Pande, J. Comput. Chem., 24, 1432–1436, 2003. [27] D. Chandler, J. Chem. Phys., 68, 2959–2970, 1978. [28] V.S. Pande and D.S. Rokhsar, Proc. Natl. Acad. Sci. U.S.A., 96, 9062–9067, 1999. [29] C. Snow, H. Nguyen, V.S. Pande, and M. Gruebele, Nature, 420, 102–106, 2002. [30] C. Snow, B. Zagrovic, and V.S. Pande, J. Am. Chem. Soc., 2002. [31] C. Dellago, P.G. Bolhuis, F.S. Csajka, and D. Chandler, J. Chem. Phys., 108, 1964–1977, 1998. [32] D. Chandler, from the lectures given at the Euroconference on Computer Simulations of Rare Events, Lerici, Villa Marigola, Italy, 1997. [33] L.R. Pratt, J. Chem. Phys., 85, 5045–5048, 1986. [34] P.G. Bolhuis, Proc. Natl. Acad. Sci. U.S.A., 100, 12129–12134, 2003. [35] V.S. Pande, I. Baker, J. Chapman, S.P. Elmer, S. Khaliq, S.M. Larson, Y.M. Rhee, M.R. Shirts, C.D. Snow, E.J. Sorin, and B. Zagrovic, Biopolymers, 68, 91–109, 2003.
[36] C.D. Snow, B. Zagrovic, and V.S. Pande, J. Am. Chem. Soc., 124, 14548–14549, 2002. [37] B. Zagrovic, E.J. Sorin, and V. Pande, J. Mol. Biol., 313, 151–169, 2001. [38] E. Paci, A. Cavalli, M. Vendruscolo, and A. Caflisch, Proc. Natl. Acad. Sci. U.S.A., 100, 8217–8222, 2003. [39] M. Shirts and V.S. Pande, Science, 290, 1903–1904, 2000.
Chapter 6 CRYSTAL DEFECTS
6.1 POINT DEFECTS C.R.A. Catlow Royal Institution of Great Britain, London W1S 4BS, UK
1. Introduction
Point defects are pervasive. They are present in all materials, as a result of the intrinsic thermodynamic equilibrium and of the inevitable levels of impurities. Indeed, elementary chemical thermodynamics shows that if a defect formation reaction (e.g., Frenkel or Schottky formation, as discussed below) is associated with a free energy $g_D$, then the equilibrium mole fraction, $x_D$, of defects is given by:
$$x_D = \exp\left(-\frac{g_D}{nkT}\right) \qquad (1)$$
where n is the number of defects created in the defect formation process. The equation shows that $x_D > 0$ for T > 0, and also demonstrates that properties dependent on defects will show “Arrhenius”-like temperature dependence. The three broad classes of point defects are vacancies (unoccupied lattice sites), interstitials (extra lattice or impurity atoms/ions at sites which are not normally occupied) and substitutionals (impurity atoms/ions at normal lattice sites). They may be created thermally (i.e., by the intrinsic thermodynamic equilibrium discussed above), chemically (owing to the presence of impurity species), and by irradiation or mechanical damage.

In the case of thermally generated defects, there are two basic models of disorder referred to above. Frenkel disorder involves the displacement of lattice ions to interstitial sites; anion Frenkel disorder is the predominant mode of disorder in, e.g., CaF2. Schottky disorder involves the creation of vacancies in stoichiometric proportions, as in NaCl, where Schottky pair formation (sodium and chlorine vacancies) is the dominant mode of disorder.

Detailed discussions of point defects and defect-dependent properties are given in the monograph of Agullo-Lopez, Catlow and Townsend [1] and in the proceedings of several NATO Advanced Study Institutes (e.g., [2, 3]). This
chapter will focus on the calculation of point defect structures, energies and properties, with a strong emphasis on ionic materials where computational techniques have had a major impact.
2. Defect Dependent Properties
Even small defect concentrations can have a major influence on key physical and chemical properties of materials. Here we introduce some of the most widely studied and significant aspects of the field.

Atomic transport, including both diffusion and ionic conductivity, is invariably affected by defect processes, as both vacancy migration (in which neighboring atoms/ions jump into vacant sites) and interstitial migration effect the transport of atoms and charge. Conductivity and diffusion have been very extensively studied over the last 50 years, and calculations of defect formation and migration energies have played a crucial role in the development of the field, as discussed later.

Spectroscopic properties of solids may be drastically modified by defects, which may provide localized hole or electron states within the energy band gap of the solid. The classic example is the “F center,” in which one (or more) electron(s) are trapped at an anion vacancy site; it has been known since the 1930s that the presence of such centers in, e.g., NaCl (induced by additive coloration, i.e., reaction of the crystal with metal vapor, or by irradiation damage) imparts an intense color to the crystal, owing to optically induced excitations of the trapped electron. Substitutional impurities, especially transition metal ions, can also provide absorption/emission centers in crystals, and indeed transition metal impurities are responsible for the colors of many gemstones, e.g., ruby (Al2O3 containing Cr3+ impurities). Spectroscopic properties of defective solids have been intensively investigated by theoretical/computational methods. An excellent account of the basic theory is given by Stoneham [4].

Equilibrium with the gas phase and nonstoichiometry. Nonstoichiometry is widespread amongst the solid compounds of transition metals and of lanthanide and actinide metals; examples are Fe1−xO, Ni1−xO, UO2+x, and CeO2−x. In such compounds, nonstoichiometry can be understood in terms of variable cation valence, accompanied by defect formation. A simple example is Ni1−xO, where Ni2+ may be oxidized to Ni3+ – a process which is compensated by the formation of nickel vacancies. Indeed, oxidation of the crystal involves the following reaction:

$$3\mathrm{Ni_{Ni}} + \tfrac{1}{2}\mathrm{O_2} \rightarrow 2\mathrm{Ni^{\bullet}_{Ni}} + \mathrm{V''_{Ni}} + \text{“NiO”}$$

where we have used the Kröger–Vink notation for defects: $\mathrm{Ni_{Ni}}$ represents a regular lattice Ni ion, $\mathrm{Ni^{\bullet}_{Ni}}$ an oxidized cation, and $\mathrm{V''_{Ni}}$ a nickel vacancy.
It is clear that the vacancy concentration equals the deviation from stoichiometry, x, in Ni1−xO, and that it is controlled by the oxygen partial pressure, $P_{\mathrm{O_2}}$. Indeed, a simple mass action analysis shows that if the above equilibrium operates:

$$x = \tfrac{1}{4}\, K\, P_{\mathrm{O_2}}^{1/6},$$
where K is the equilibrium constant for the reaction. If, however, more complex processes are involved (e.g., defect clustering), then a different dependence of x on $P_{\mathrm{O_2}}$ will be observed. Extensive analyses have been performed on the variation of x with $P_{\mathrm{O_2}}$ for nonstoichiometric compounds with the aim of elucidating their defect structures (see, e.g., [5]). Nonstoichiometric compounds can in many cases accommodate very high levels of disorder; for example, Fe1−xO materials with x = 0.3 can be prepared, and in UO2+x, x varies from 0.0 to 0.25. Heavily disordered systems invariably contain defect clusters, and defect clustering is, in general, an important phenomenon in doped and nonstoichiometric solids. Examples will be given in the sections which follow, and reviews of earlier work can be found in the monograph edited by Sorensen [6].
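To make the origin of the one-sixth power explicit, here is a short mass-action sketch (written as proportionalities, since the precise prefactor depends on the convention chosen for K). The reaction above, together with charge neutrality, requires $[\mathrm{Ni^{\bullet}_{Ni}}] = 2[\mathrm{V''_{Ni}}] = 2x$, so that

$$[\mathrm{Ni^{\bullet}_{Ni}}]^2\,[\mathrm{V''_{Ni}}] \propto P_{\mathrm{O_2}}^{1/2} \;\Rightarrow\; 4x^3 \propto P_{\mathrm{O_2}}^{1/2} \;\Rightarrow\; x \propto P_{\mathrm{O_2}}^{1/6}.$$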
3. Defect Calculations
Attempts to calculate defect energies go back to the origins of the field, with the pioneering work of Mott in the 1930s and of Coulson in the 1950s. Details will be given in the sections which follow; here we review general aspects of the different methods available, which are as follows:
3.1. Simple Cluster Calculations
Here a cluster is defined which contains the defect and the surrounding coordination shells. The energy and structure of the cluster are calculated with and without the defect. The calculations are usually performed using an appropriate quantum mechanical method. The approach, of course, omits long-range effects of the surrounding lattice (which may, however, be relatively unimportant in covalent materials), and may also be problematic regarding the treatment of the “dangling bonds” of the atoms on the periphery of the cluster, where a common approach is to saturate them by attaching hydrogen atoms. The method has, however, been widely used to study defects in semiconductors.
3.2. Embedded Cluster Methods
This approach attempts to remedy some of the deficiencies of simple cluster methods by embedding the cluster in a more approximate representation of the surrounding lattice. For example, if the cluster is described quantum mechanically, the embedding region may use interatomic potentials. Care must be taken in interfacing the two regions and in ensuring that the effects of the embedding region on the inner region containing the defect are accurately represented. An older, simpler variant of the approach is the Mott–Littleton method discussed in later sections, where the inner region is treated using interatomic potentials, and the outer region is a quasi-dielectric continuum. The method has enjoyed widespread success in modeling defects in ionic crystals.
3.3. Periodic Methods
The approach here is particularly simple: a supercell is defined with the defect in the center, and periodic boundary conditions are applied. The calculations are performed both with and without the defect. The resulting defect energy and structure include, of course, the effects of the interactions between the periodic images of the defects, and the use of the method may be problematic for charged defects (although there are prescriptions for overcoming this latter difficulty). The major advantage of the method is that it may use the extensive range of procedures and methods developed for modeling periodic solids. Further discussions of methods and applications of defect simulation will be given in the sections which follow.
References
[1] F. Agullo-Lopez, C.R.A. Catlow, and P. Townsend, Point Defects in Materials, Academic Press, London, 1988. [2] F. Bénière and C.R.A. Catlow (eds.), Mass Transport in Solids, NATO ASI Series, Plenum Press, vol. 97, 1983. [3] C.R.A. Catlow (ed.), Defects and Disorder in Crystalline and Amorphous Solids, NATO ASI Series C, vol. 418, Kluwer, Netherlands, 1993. [4] A.M. Stoneham, Theory of Defects in Solids, Clarendon Press, Oxford, 1976. [5] P. Kofstad, Non-Stoichiometry, Diffusion and Electrical Conductivity in Non-Stoichiometric Metal Oxides, John Wiley and Sons, 1972. [6] O.T. Sorensen, Non-Stoichiometric Oxides, Academic Press, New York, 1980.
6.2 POINT DEFECTS IN METALS Kai Nordlund and Robert Averback Accelerator Laboratory, P.O. Box 43 (Pietari Kalmin k. 2), 00014, University of Helsinki, Finland Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, Illinois, USA
Calculations of the properties of point defects in metals by computer simulation enjoy an enormous advantage over those for other classes of materials since, to first order, charge exchange and angular forces can be neglected, thereby greatly reducing the computational effort. Moreover, the structure, concentration and diffusivity of both the vacancy and interstitial defects are known experimentally in a number of metals, so that a solid base of information is available to validate the simulation models employed. In this article we will first briefly summarize this base of experimental knowledge and then discuss current state-of-the-art methods for calculating the properties of defects in metals, citing notable successes and failures. Each subsection is laid out to discuss first the ground-state energies of the defect structures, then the defect concentration (or entropy), and finally the migration properties of the defects. In the last section, we will briefly examine attempts to apply the concepts of defect structures in crystalline structures to metallic glasses.
1. Experimental Background

1.1. Defect Structure
The structure of the single vacancy in metals is widely accepted to be the simplest possible one, that obtained when a single atom is removed from the lattice. In an elastic continuum, which works quite well for metals, the excess energy derives mostly from the creation of surface energy associated with the cavity, or breaking of bonds in a discrete lattice. Within this picture, the positive surface energy leads to a slight inward relaxation of surrounding atoms,
giving rise to a negative vacancy relaxation volume and a change in the lattice parameter, as generally observed experimentally [1, 2]. The structure of the interstitial has also been measured directly in several metals [1, 2]. The interstitials in metals with the FCC, BCC or HCP structures are generally so-called “split” structures, where two atoms share a single lattice site (Fig. 1). The two atoms are symmetrically arranged on both sides of the ideal site, so that it is not possible to state which one of the two is the extra atom. They strongly displace the surrounding atoms outwards, giving rise to a large positive relaxation volume. We can estimate the relaxation volume by simply noting that for close-packed hard spheres, the insertion of an atom increases the total volume by approximately two atomic volumes, one owing to the extra atom itself, and the other from breaking the close-packing arrangement. The crystal direction defined by the two atoms and the lattice site is used to characterize the defect, so that one can, for instance, refer to a ⟨1 1 0⟩ split interstitial. Frequently, these interstitials are called dumbbell interstitials.
Figure 1. Some sample defect structures in a pure element. Atoms on the front side of the polygons are shown as open circles, and atoms on the back sides as dashed open circles. Atoms forming dumbbell interstitials are shown as solid circles. Vacancy positions are indicated as thin dashed open circles. (a) ⟨1 0 0⟩ dumbbell interstitial in an FCC metal. (b) ⟨1 1 0⟩ dumbbell interstitial in a BCC metal. (c) ⟨0 0 0 1⟩ dumbbell interstitial in an HCP metal. (d) Vacancy in an FCC metal. (e) Vacancy in a BCC metal. (f) Divacancy with atoms missing from second-nearest-neighbor sites in a BCC metal.
This curious name stems from illustrating the two atoms as two large spheres joined by a narrow bonding rod, making the structure resemble a dumbbell weight-lifting device. Whether interstitials always have a split structure has long been questioned. In several BCC metals, calculations indicate that the energy of the so-called ⟨1 1 1⟩ crowdion structure is close to, or even lower than, that of the ⟨1 1 0⟩ split structure [3, 4]. The extra atom in the crowdion structure creates a string-like defect along the ⟨1 1 1⟩ direction, encompassing some tens of atoms. Even when this structure does not represent the ground state, it may play a major role in interstitial migration; in the crowdion structure an interstitial can move several atomic distances very rapidly, due to the collective and linear nature of the defect.

The most obvious structure of the divacancy in the elastic continuum model is two vacant nearest-neighbor sites, and indeed this is the structure experimentally observed in the BCC metal W [2]; it is believed to be the structure in most FCC metals as well [2]. Recent DFT calculations, however, suggest that this structure is not always the ground state, with the stable divacancy in Al being formed by two vacant next-nearest-neighbor sites [5].

In intermetallics, as in ionic crystals, several additional point defect types arise. The designation of vacancies and interstitials must specify which constituent has been added or removed. A missing Al atom in NiAl, for example, can be denoted $\mathrm{V_{Al}}$, and a Ni interstitial $\mathrm{I_{Ni}}$. In addition, entirely new defect types can exist. An antisite defect represents an atom located on the wrong sublattice; e.g., a Ni atom on an Al site is denoted $\mathrm{Ni_{Al}}$. A pair of adjacent antisites of opposite types is called an exchange defect. In addition, more complex point defect types can exist; examples of these are mentioned in Section 4.2.
1.2. Concentration
The equilibrium concentration, c, of one defect type can be written [2, 6] as:

$$c = g\, e^{S^f/k}\, e^{-H^f/kT} \qquad (1)$$
where k is Boltzmann's constant, $H^f$ the defect formation enthalpy, and $S^f$ the formation entropy. g is a geometrical factor which depends on the number of equivalent crystallographic orientations of the defect. For instance, the ⟨1 0 0⟩ dumbbell interstitial in FCC metals has the three equivalent orientations [1 0 0], [0 1 0], and [0 0 1], and hence g = 3. Since the g factor represents configurational entropy, it is commonly absorbed into the term $S^f$. The type of the defect is conventionally given as a subscript, such that the monovacancy concentration is denoted $c_v$ or $c_{1v}$, the interstitial formation entropy $S_i^f$, the divacancy formation enthalpy $H_{2v}^f$, and so on.

The total equilibrium defect concentration and its temperature dependence have been measured experimentally for a number of metals using different techniques [2]. The predominant defect in all common transition metals in
thermodynamic equilibrium is the vacancy. This was shown by Simmons and Balluffi [7] and has later been confirmed by independent experimental methods, such as positron annihilation spectroscopy, which detects open volume in lattices. The reason vacancies dominate is easy to understand: most transition metals have atomically dense structures such as FCC, HCP and BCC, where it is very difficult to squeeze in an extra atom. The alkali metals appear to form an exception; their interatomic bonding distances are relatively large, and hence it has been suggested that it is also relatively easy to add atoms on interstitial sites [8]. While vacancies are the predominant defect in most metals, non-Arrhenius diffusion behavior suggests that other defect types are also important [6, 9, 10]. Most often divacancies are assumed to contribute to diffusion at high temperatures [6, 9], although trivacancies and interstitials have also been considered. Since these other defects have much higher enthalpies of formation than does the vacancy, they can only be important if they have exceptionally high entropies of formation and low enthalpies of migration. As will be discussed below, the interstitial has an enthalpy of migration of ≈0.1 eV and an entropy of formation of ≈15 $k_B$ [11, 12]. Simulation results indicating interstitials are present near the melting point are shown in Fig. 2.
[Figure 2: defect concentrations $c_v$, $c_{2v}$, and $c_i$ on a logarithmic scale ($10^{-10}$–$10^{-4}$) vs. $T_{\mathrm{melt}}/T$.]
Figure 2. Defect concentrations in Cu around the melting temperature. The vacancy concentration is the experimental one, while the interstitial and divacancy concentrations are predictions from our computer simulations [39]. These results suggest that the interstitial concentration may in fact exceed the divacancy concentration at the highest temperatures.
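In the spirit of Fig. 2, the minimal numerical sketch of Eq. (1) below illustrates why interstitials can only matter near the melting point if their formation entropy is exceptionally high. All parameter values are illustrative order-of-magnitude choices (the vacancy values roughly Cu-like, the interstitial entropy taken from the ≈15 k_B quoted above), not fitted data.

```python
import numpy as np

k_B = 8.617e-5  # Boltzmann constant, eV/K

def conc(g, S_f_over_k, H_f_eV, T):
    """Equilibrium defect site fraction, Eq. (1)."""
    return g * np.exp(S_f_over_k) * np.exp(-H_f_eV / (k_B * T))

for T in (700.0, 1000.0, 1350.0):
    c_v = conc(g=1, S_f_over_k=2.4, H_f_eV=1.28, T=T)  # vacancy (Cu-like)
    c_i = conc(g=3, S_f_over_k=15.0, H_f_eV=3.0, T=T)  # interstitial (illustrative)
    print(f"T = {T:6.0f} K:  c_v = {c_v:.1e},  c_i = {c_i:.1e}")
```

At low temperatures the interstitial concentration is suppressed by many orders of magnitude, while near the melting point the large entropy prefactor brings it within reach of the vacancy concentration, as in Fig. 2.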
1.3. Defect Migration
The jump rate w of a single defect can be written [6] as:

$$w = \nu\, e^{S^m/k}\, e^{-H^m/kT} \qquad (2)$$

where ν is the vibrational frequency of the lattice, $H^m$ the migration enthalpy of the defect, and $S^m$ the migration entropy. The total jump rate Γ due to all defects of the same type is the defect concentration c times the jump rate [6, 9]:

$$\Gamma = Z c w \qquad (3)$$
where Z is the number of directions available for the jump. If the defect migration occurs by a purely random-walk process, the self-diffusion coefficient for a single defect type is given by the Einstein relationship [10, 13]:

$$D = \frac{1}{6} r^2 \Gamma \qquad (4)$$

where r is the jump distance in the lattice. If one measures the self-diffusion coefficient using, for example, tracer atoms, then one has to correct this equation for correlation effects. This can be accounted for by including a jump correlation factor f [6] to obtain the tracer self-diffusion equation,

$$D = \frac{1}{6} r^2 \Gamma f \qquad (5)$$
The correlation factor is normally in the range 0.1–1; for vacancies on an FCC lattice, for example, f = 0.781. The migration mechanisms cannot be determined directly experimentally, but they can for many simple defect types be deduced from simple geometrical arguments, treating the atoms as soft spheres or with a pair potential; see Fig. 3. Modern DFT calculations have, generally speaking, given defect migration mechanisms consistent with those obtained from the simple geometrical arguments [14, 15]. For instance, the vacancy (or in actuality one atom next to the vacancy) in FCC metals moves along the straight line connecting one lattice site with a nearest-neighbor site. The split-interstitial migration is by necessity somewhat more complex, but occurs by one of the atoms A in the split configuration filling the original lattice site, while simultaneously the other atom B pushes away an atom C in a nearby site and forms a new split interstitial [14]. Thus, the split interstitial formed by atoms A and B becomes a new split interstitial formed by atoms B and C; see Fig. 3(b). The factors Z and f are known for a given defect structure in a given lattice. Hence the factors determining the defect migration, $S^m$ and $H^m$, can be deduced by measuring the temperature-dependent self-diffusion of the material, if c, and hence $S^f$ and $H^f$, are known. This procedure, however, is complicated by curvature in the Arrhenius plots.
Figure 3. Migration of defects in an FCC metal. (a) Migration of a vacancy. The arrow indicates how an atom moves from one site to the next, and the atom is shown in the saddle point configuration in the middle of the line joining the two sites. The initial and final empty site are shown with thin dashed lines. (b) Migration of an interstitial. For clarity only atoms on the front side of the cube are shown. The initial positions of the two atoms forming the dumbbell are shown in black, and the final ones in a striped pattern. Arrows indicate the movement of the three atoms involved.
If several defect types are present, they can contribute to the curvature of the self-diffusion coefficient D both via their concentrations and via differences in their migration coefficients. Moreover, since self-diffusion is hard to measure at low temperatures, this method does not work well for defects with low migration energies. More typically, the migration enthalpy of defects, $H^m$, has been determined using either quenching or irradiation experiments. In quenching experiments, for example, specimens are first quenched from high temperature to a temperature at which the defects are immobile. The specimen is then warmed to successively higher temperatures according to some annealing schedule, while monitoring some property that is proportional to the defect concentration. By following the rate at which the defects annihilate at their sinks, the migration enthalpy can be deduced; for details see Ref. [16]. For interstitial defects, which have immeasurably small concentrations at high temperatures, irradiation experiments are necessary. The irradiations are performed at low temperatures and are followed by an annealing program as just described; see Fig. 4. The difficulty with irradiation experiments, however, is that interstitials and vacancies are created in pairs, and their initial locations are correlated. These seemingly simple considerations have led to much confusion in the field over the years (see, for example, the collection of articles in J. Nucl. Mater., 69–70, 1978). From these and other alternative measurement methods, it has become clear that defect migration in transition metals shows some general trends. The interstitials are highly mobile, with activation energies typically in the range 0.05–0.5 eV. In Au, stage I has not been observed, even for measurements well below 1 K [17, 18].
[Figure 4: relative damage level (0–1.0) vs. temperature (10–10³ K, logarithmic scale), showing annealing stages I–V.]

Figure 4. Annealing stages in Cu. After Ehrhart [2]. The fine structure in stage I is due to recombination of close, bound Frenkel pairs.
This has been interpreted to mean that the Au interstitial is mobile at all temperatures due to quantum mechanical tunneling. The vacancies have migration enthalpies of the order of 1 eV, and thus are much less mobile than interstitials at low temperatures.
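Putting Eqs. (1)–(5) together, the following sketch estimates a vacancy-mediated tracer self-diffusion coefficient. Every parameter value here (lattice constant, formation and migration enthalpies and entropies, attempt frequency) is an illustrative, roughly Cu-like order of magnitude, not a fitted constant.

```python
import numpy as np

k_B = 8.617e-5                 # Boltzmann constant (eV/K)
a = 3.615e-10                  # FCC lattice parameter (m), Cu-like
r = a / np.sqrt(2.0)           # nearest-neighbor jump distance (m)
Z, f = 12, 0.781               # FCC jump directions and vacancy correlation factor
nu = 5.0e12                    # lattice vibration frequency (1/s), phonon scale

T = 1000.0                                        # K
c = np.exp(2.4) * np.exp(-1.28 / (k_B * T))       # Eq. (1), illustrative H_f, S_f
w = nu * np.exp(1.0) * np.exp(-0.70 / (k_B * T))  # Eq. (2), illustrative H_m, S_m
Gamma = Z * c * w                                 # Eq. (3)
D = r**2 * Gamma * f / 6.0                        # Eq. (5), tracer self-diffusion
print(f"c = {c:.1e}, D = {D:.1e} m^2/s at T = {T:.0f} K")
```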
2. Methods of Defect Simulation

2.1. Structure
The determination of the ground state structure of a point defect is made difficult by the need to solve the energy minimization problem in 3N dimensions, where N is the number of atoms in the simulation cell. Moreover, it is generally not possible to formally prove that a particular defect structure represents the ground state in any given model (it took close to 400 years even to prove Kepler’s conjecture, i.e., that the perfect FCC and HCP structures are the densest possible packing of hard spheres [19]). In practice, however, the number of reasonable structures that can be obtained in the common crystal structures, FCC, BCC and HCP, is small, since all of these structures are relatively close-packed.
The computationally least demanding approach to finding the ground state is to use all of the high-symmetry defect positions in a lattice as initial trials for the defect structure. For each trial, the positions of the atoms surrounding the defect are then relaxed to the closest minimum. This is achieved efficiently using conjugate gradient (CG) energy minimization [20], or by carrying out a molecular dynamics (MD) simulation and scaling the temperature toward zero (MD methods are described elsewhere in this handbook series). A common trick for speeding up the latter method is to set the velocity of an atom to zero if the dot product of its velocity and force vectors is less than zero, i.e., the force is opposite to the velocity. In our experience these two methods are of comparable efficiency in small cells, while in large cells an adaptive CG method [21] is much faster than either plain CG or MD with the velocity–force trick. The lowest-energy state thus obtained yields the ground-state structure of the defect.

The main drawback of this trial-and-error approach is that one is starting from educated guesses, and it is possible that some other state exists that has an even lower energy. An approach to finding the ground-state structure that circumvents this problem starts with a randomly chosen interstitial position (for interstitials, inserting one extra atom at a random position in a periodic cell), and then carries out an MD or MC simulated-annealing run to find the ground state. In practice, this might be implemented by starting at half the melting temperature, simulating for 10 ps in this state, and then slowly cooling the system to 0 K. If this process always gives the same ground state for different starting positions and for a wide range of cooling rates, below some practical maximum rate, one can be quite confident that the true ground state has been obtained. Although simulated annealing is somewhat cumbersome, not using it may give wrong answers. For instance, the original Foiles embedded-atom method (EAM) potential for Pt [22] yields the tetrahedral position for the ground state of the interstitial, which would not be found by starting from the common ⟨1 0 0⟩ split interstitial and simply relaxing it to the closest minimum [23]. Also note that even for vacancies, just removing an atom may not result in the correct ground state of the model. Recent DFT simulations of graphite have revealed a Jahn–Teller effect, which breaks the vacancy symmetry and leads to a distorted vacancy structure [24]. In our experience, surprisingly long cooling times may be needed to obtain the true minimum; in our recent development of a bond-order potential for W, we found that times of the order of 100 ps were needed to arrive at the correct ground-state structure of the interstitial. This makes it computationally very difficult to apply the simulated-annealing approach in DFT simulations, and in practice DFT methods remain limited to the first approach of testing a large number of high-symmetry structures.

Another complication arises from finite-size effects. Since interstitial defects especially have a far-extending strain field, a periodic simulation cell will
always lead to an artificial interaction between a defect and its periodic images due to their strain fields. While it is not possible to completely avoid size effects, the magnitude of the error can be checked by using different-sized or -shaped simulation cells, and different pressure relaxation approaches (fixed vs. pressure-relaxed). Particular care must be exercised for calculations of extended defect structures, such as the crowdion interstitial. These defects can contain strings of some tens of atoms, and large cells are required to model them properly, sometimes with some thousands of atoms. Again this is particularly limiting for DFT methods, which are usually limited to a few hundred atoms.

Once the ground-state structure has been found, the relaxation volume of the defect can be easily obtained by subtracting the volume of the perfect cell from that of the cell containing the defect, for cells relaxed to zero pressure. The formation volume $\Omega^f$ and relaxation volume $\Delta V$ are related by the expression:

$$\Omega^f = \pm\Omega_0 + \Delta V \qquad (6)$$

where $\Omega_0$ is the atomic volume, the plus sign is for vacancy formation, and the minus sign for interstitials [2].
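The simulated-annealing ground-state search described above can be summarized in pseudocode. Here `make_cell`, `insert_atom_randomly`, `run_md`, and `minimize` are hypothetical helper functions standing in for the corresponding calls of whatever MD engine is used (e.g., LAMMPS driven through a script); they are not real library APIs.

```python
import random

def anneal_interstitial(n_repeats=10, t_melt=1350.0, anneal_ps=10.0,
                        cool_ps=100.0):
    """Search for the interstitial ground state by repeated annealing."""
    best_energy, best_structure = float("inf"), None
    for seed in range(n_repeats):
        cell = make_cell()                               # perfect periodic crystal
        insert_atom_randomly(cell, rng=random.Random(seed))
        run_md(cell, T=0.5 * t_melt, time_ps=anneal_ps)  # equilibrate hot
        run_md(cell, T_start=0.5 * t_melt, T_end=0.0,    # slow cooling to 0 K
               time_ps=cool_ps)
        energy = minimize(cell)                          # final CG relaxation
        if energy < best_energy:
            best_energy, best_structure = energy, cell
    return best_energy, best_structure
```

If the loop returns the same structure for all seeds and for progressively slower cooling rates, one can be reasonably confident the true ground state has been found, as discussed above.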
2.2. Concentration
From Eq. (1), one can see that the equilibrium concentration of defects is determined by the formation enthalpy $H^f$ and formation entropy $S^f$. At zero external pressure, and in the limit of low defect concentration, the formation energy $E^f$ of a defect in a system with $N_a$ atom types can be defined as [25]:

$$E^f = E_D - Q(E_v - \mu_e) - \sum_{i=1}^{N_a} n_i \mu_i \qquad (7)$$
where $E_D$ is the energy of the defect cell at 0 K, Q the charge of the defect, $E_v$ the valence band maximum, $\mu_e$ the chemical potential of the electrons, $n_i$ the number of atoms of each type i in the defect cell, and $\mu_i$ the chemical potential of atom type i. In monoatomic metal systems the charge state can be ignored and the other terms simplified, giving the formation energy as:
$$E^f = \left(\frac{E_d}{N_d} - \frac{E_u}{N_u}\right) N_d \qquad (8)$$
where $E_d$ and $N_d$ are the potential energy and number of atoms in the defect cell, and $E_u$ and $N_u$ the same in a defect-free cell. Thus, the procedure needed for determining $E^f$, in practice, requires knowing the potential energy per atom $E_u/N_u$ in a defect-free cell, and then relaxing a simulation cell with a defect to
zero temperature and pressure to determine $E_d$. The only complications arise from possible finite-size effects, and from the need to carry out the calculation of $E_d$ to high precision, since Eq. (8) involves the subtraction of two large, almost equal numbers.

The determination of the formation entropy $S^f$ is more difficult, since the entropy involves temperature effects and thus the simulations must be long enough to average out the thermal fluctuations. This can be especially cumbersome for small simulation cells, where the fluctuations are large. The conceptually simplest way of determining $S^f$ is to determine the defect concentration c by direct simulation, then use Eq. (1) to calculate $S^f$ once $E^f$ has been determined as explained above. The concentration can be simulated by placing a crystal in contact with a particle source/sink, e.g., a liquid, and simulating until the defect concentration equilibrates in the crystal part. Since c seldom exceeds ∼0.001 even at the melting temperature, one needs systems with at least tens of thousands of crystal atoms to observe a significant vacancy concentration. Moreover, since the crystal part needs to be large, one also needs long simulation times to allow for in-diffusion of the defect from the particle source. This makes the direct approach quite cumbersome computationally. Nevertheless, it does have the advantage that in principle a single ordinary NPT-ensemble MD simulation is enough to determine c, and the method has been successfully employed, at least for vacancies in Cu [11].

Another approach is to use Frenkel–Ladd thermodynamic integration [26–28] to determine explicitly the entropy of a simulation cell with a defect, and then compare it with the entropy of a defect-free cell. This requires constructing a special interaction model where one can move smoothly between an Einstein harmonic solid and the “true” solid of interest, where “true” now signifies the solid modeled by a classical or quantum mechanical method. One needs to simulate a range of systems between the Einstein solid and the true solid. Integration of the obtained enthalpy data gives the free energy difference between the Einstein and true solids, which can be subtracted from the known free energy of an Einstein crystal to give the free energy, and hence the entropy, of a defect. For defects, the reference Einstein lattice should be the fully relaxed defect lattice [26, 29]. The simulations that provide the enthalpies can be either MC or MD simulations in the NPT ensemble.

Both previously described methods require long simulations to obtain good statistical accuracy, and are hence not conducive to DFT calculations or calculations using complicated classical potentials. For such methods, one can still determine the defect entropy in the harmonic approximation by explicitly evaluating the change in the vibrational modes (phonon spectrum) caused by the defect. This can be achieved by determining the eigenfrequencies via direct diagonalization of the force-constant matrix in systems with and without the defect [2, 28, 30–32]. Anharmonic effects and electronic contributions to the defect entropy are, of course, neglected by this procedure [33].
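A minimal sketch of the bookkeeping in Eq. (8); the cell energies here are placeholders standing in for the relaxed 0 K energies produced by a classical-potential or DFT code.

```python
def formation_energy(e_defect, n_defect, e_perfect, n_perfect):
    """E^f = (E_d/N_d - E_u/N_u) * N_d, Eq. (8)."""
    return (e_defect / n_defect - e_perfect / n_perfect) * n_defect

# Illustrative numbers: a 500-atom perfect FCC cell at -3.54 eV/atom
# (EAM-like cohesive energy) and a relaxed 499-atom vacancy cell.
e_u, n_u = -3.54 * 500, 500
e_d, n_d = -1765.20, 499
print(f"E^f = {formation_energy(e_d, n_d, e_u, n_u):.2f} eV")  # -> 1.26 eV
```

Note how the result (here ~1.3 eV) is the small difference of two energies of order a thousand eV, which is why both cells must be converged to high precision.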
2.3. Migration
Equation (2) can be put in the form:

w = w_0 e^{−H^m/kT},   (9)
where the geometrical factor Z, the lattice vibration rate ν, and the migration entropy S^m have been embedded in the term w_0, which can be considered the attempt frequency of the jump. Since Z and ν are usually known for a given material, one can determine both S^m and H^m if the jump rate is known as a function of temperature. Thus a straightforward approach to determining the defect migration properties is to directly count defect jumps in an MD simulation. For simple point defects, such as the single vacancy, this can often be achieved by using some automated means of tracking the position of the defect as a function of time. The exact number of defect jumps can then be determined, yielding w for a given T.

For more complicated defects, such as the interstitial, this becomes difficult, since at high temperatures the "interstitialcy" is quite extended in space. In this case it is easier to find the mean square displacement of all atoms, ⟨R²(t)⟩, in the system, deduce the diffusion coefficient through the Einstein relation [13], which in three dimensions is

⟨R²(t)⟩ = 6 f D t,   (10)
and then obtain w using Eqs. (2)–(4) for a single defect. Here f is the correlation factor defined earlier.

A completely different approach employs transition state theory to determine the migration rates. Within this approximation, which is generally valid for metals, one can determine both the migration enthalpy H^m and the prefactor w_0 by non-dynamic simulations, making this approach particularly attractive for time-constrained simulation methods such as DFT. To obtain H^m, the height of the potential energy barrier separating a defect site from an adjacent site is first determined. For instance, if the migration path of a vacancy is a straight line connecting the two sites, this can be achieved simply by moving an atom stepwise along the straight line from an adjacent site to the vacancy site. In these calculations the position of the moving atom is fixed at each step, while adjacent atoms are relaxed to the nearest potential energy minimum. The corners of the simulation cell are fixed to prevent motion of the whole cell. This yields an energy-vs.-distance curve, with H^m given by the maximum height. If the migration mechanism is more complicated, a conceptually similar but more refined approach is required; two popular recent methods are the nudged elastic band and dimer methods [34, 35]. The attempt frequency w_0 is determined by evaluating the eigenfrequencies of the ground state of the defect and of the migration saddle-point state, using the Vineyard approach [36].
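As an illustration of Eqs. (9) and (10), the sketch below extracts D from the slope of a mean-square-displacement curve and then H^m and w_0 from an Arrhenius fit of counted jump rates; all numerical inputs are hypothetical stand-ins for MD output.

    import numpy as np

    # (a) Diffusion coefficient from the Einstein relation, Eq. (10):
    # <R^2(t)> = 6 f D t, so D is the slope of <R^2> vs. t divided by 6f.
    t   = np.array([1.0, 2.0, 3.0, 4.0, 5.0])       # ps (hypothetical MD data)
    msd = np.array([0.31, 0.59, 0.92, 1.18, 1.52])  # Angstrom^2
    f = 0.78                                        # correlation factor (e.g., FCC vacancy)
    D = np.polyfit(t, msd, 1)[0] / (6.0 * f)        # Angstrom^2/ps

    # (b) H^m and w0 from Eq. (9): ln w = ln w0 - H^m/(kT), a line in 1/T.
    kB = 8.617e-5                                   # eV/K
    T = np.array([800.0, 1000.0, 1200.0])           # K
    w = np.array([2.1e9, 4.0e10, 2.9e11])           # counted jumps/s (hypothetical)
    slope, intercept = np.polyfit(1.0 / T, np.log(w), 1)
    H_m = -slope * kB                               # migration enthalpy, ~1 eV here
    w0 = np.exp(intercept)                          # attempt frequency, 1/s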
3. Accuracy of Defect Simulations
The accuracy of present defect simulations has been checked by comparisons with experiments in several systems. Due to the large number of studies on this topic, it is impossible to review the entire list of successes and failures, but a few of the most studied cases illustrate the current state of the field.
3.1. Structure
In FCC metals, both DFT methods and most modern EAM-like potentials reproduce, at least qualitatively, the structure of the vacancy and the interstitial [37–40]. They predict that the atoms surrounding the vacancy relax slightly inwards (leading to a small negative relaxation volume), and that the interstitial has the ⟨0 0 1⟩ dumbbell structure and a large positive relaxation volume, in agreement with experiments [2]. Similarly, the properties of vacancies on BCC and HCP lattices are generally also treated well [4, 41–44]. Predicting the properties of the BCC metal interstitial has proven more difficult, however, probably because of the more complex nature of the atomic interactions [45, 46], as well as magnetic effects. For instance, in Fe many classical models predict that the ground state of the interstitial is the ⟨1 1 1⟩ crowdion [47, 48], in disagreement with other classical potentials, DFT calculations, and experimental information [32, 49–51]. On the other hand, recent DFT calculations indicate that the ⟨1 1 1⟩ dumbbell would be the stable interstitial in V [4].
3.2. Concentration
The formation energy of vacancies, E^f_v, in metals is most often given quite accurately by EAM-like classical models [37, 38, 40, 43], and thus naturally also by more advanced DFT methods [15, 52–54]. This is in part because this quantity is often fitted directly during EAM potential construction, and in part because the formation energy is often closely related to other basic properties of the material. For instance, in all common BCC metals E^f_v scales with the melting point of the material to within an accuracy of about 10%. It is far more difficult to make a general statement about interstitial formation energies, since they are not well known experimentally. Practically all models, however, consistently predict that E^f_i is higher than E^f_v, in agreement with experimental observations. The defect formation entropy has been studied even less with simulations, due to the difficulties associated with carrying out the calculations; see above
and Ref. [31]. The classical potential studies that have been carried out tend to predict low values, ∼1k, for the vacancy formation entropy, in good agreement with experiments [11, 32]. The situation for the interstitial is particularly interesting, since the Granato model and recent experiments predict that (at least) the FCC metal interstitials should have very high values of S^f_i, ∼10k, due to the resonance modes associated with the dumbbell interstitial structure. This has been supported by DFT and classical simulations giving similar results [11, 14].
3.3. Migration
The migration properties of metal vacancies have been studied in both classical and quantum mechanical frameworks [15, 38, 45, 55, 56]. The activation energy for vacancy migration is generally described satisfactorily by both classical and DFT models. As in the case of the vacancy formation energy, this is largely due to the simple character of the migration and to the fact that the migration energy is related to the elastic properties of the material [2]. For BCC metals it can be predicted quite well by Flynn's simple analytical formula [57]. Interstitial migration is more complicated than vacancy migration, and generally very rapid [2]. This is reproduced at least qualitatively by both EAM and DFT models, since the basic reason is related to the geometry of the defect migration path [4, 39, 49, 58]. One notable exception is the aforementioned quantum mechanical migration of the interstitial in gold, which cannot be reproduced by conventional EAM or DFT models, since these always work within the Born–Oppenheimer approximation.
4. Examples of Simulated Predictions
The results described in Section 3 are not all that interesting from a general scientific point of view, since they merely reproduce already known experimental results. The real value of simulations lies in predicting effects and properties that have not yet been measured experimentally, or that can be measured only indirectly. We provide here a few examples of recent simulation results which, in our opinion, are especially interesting in that they have wider implications.
4.1. Temperature-dependent Defect Formation Energy
As mentioned in the beginning of this article, the curvature in Arrhenius plots of diffusion data is usually interpreted to be due to divacancies. Recently,
an alternative explanation has been offered, one that does not require the presence of any defect other than the vacancy. Using DFT calculations, Carling et al. report that the divacancy in Al is energetically unstable [30, 59]. Instead, they propose that the anharmonicity of the lattice vibrations leads to a temperature-dependent vacancy formation energy H^f_v(T). When H^f_v(T) is used in Eq. (1), a curved Arrhenius plot is obtained that agrees well with experiments. Later, Sandberg showed that this can also explain the non-Arrhenius behaviour of vacancy diffusion [60]. Later work on the same systems, however, indicates that even if the nearest-neighbor divacancy is unstable, the second-nearest-neighbor one may be stable [5]. Also, Khellaf, Seeger, and Emrick have argued that Carling's result of a negative divacancy binding energy is simply in contradiction with experiments [61]. The analysis of the experiments, however, assumes that the enthalpy of defect formation is temperature-independent, which, Carling argues, it is not [59]. Hence the controversy can be regarded as unresolved at present.
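A toy calculation can illustrate the effect: any smooth temperature dependence of the formation enthalpy bends the Arrhenius plot away from a straight line. The functional form and constants below are invented for illustration and are not those of Carling et al.; the sign of the coefficient a controls the direction of the curvature.

    import numpy as np

    kB = 8.617e-5                        # eV/K
    H0, a = 0.70, 1.0e-7                 # eV and eV/K^2 (invented)
    T = np.linspace(400.0, 900.0, 6)     # K

    H = H0 - a * T**2                    # toy temperature-dependent enthalpy
    ln_c = -H / (kB * T)                 # entropy prefactor omitted for clarity
    ln_c_arrhenius = -H0 / (kB * T)      # what a constant enthalpy would give
    print(np.round(ln_c - ln_c_arrhenius, 2))   # grows with T: a curved plot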
4.2. Defects in NiAl
Point defects in intermetallic compounds, particularly NiAl, have been studied intensively by computer simulations during the last five years [61–65]. These studies have revealed the surprisingly complex nature that point defects can have in intermetallics. The general rule that the simple vacancy is the predominant point defect at all temperatures in elemental metals cannot hold in intermetallic compounds, owing to the need to preserve stoichiometry. For NiAl, the dominant thermal defect is found to be the so-called "triple (Ni) defect" [62, 63]. This defect is a divacancy consisting of one missing Ni and one missing Al atom, but reconstructed such that there are two vacancies on Ni sites, and between these vacancies an Al antisite atom on a Ni site.

Moreover, in nonstoichiometric alloys the situation becomes even more complicated. The ground state of a nonstoichiometric alloy can be viewed as having an intrinsic concentration of vacancy or antisite defects. For instance, Ni-rich NiAl has Ni antisite atoms on Al sites [62, 65]. These defects are called constitutional defects. At finite temperatures thermal activation can produce thermal defects, some of which may be produced by removing the constitutional defects. Such defects are called interbranch defects; for example, one constitutional Ni antisite can be replaced by two Al vacancies [62]. Since thermal activation thus removes defects that are part of the alloy ground state, one arrives at the peculiar situation of a negative defect concentration, at least for one species of defect. These defects are also interesting because the entropy term becomes the dominant contribution to the Gibbs free energy [65].
The migration of atoms in NiAl is also interesting, as one has to distinguish between tracer diffusion and chemical diffusion (i.e., diffusion in the presence of a chemical composition gradient). For elemental metals, tracer atoms are conventionally used to measure diffusion. For an intermetallic, however, tracers may be misleading since, for instance, a divacancy in a stoichiometric B2-structure lattice will contribute to tracer diffusion but not to chemical diffusion [75]. Also interesting is that the vacancy cannot make an ordinary jump from one site to the nearest-neighbor site in the B2 structure of NiAl. Instead, it must always make a double jump to the next-nearest-neighbor site [64].
4.3. Defects in Metallic Glasses
Hitherto we have dealt only with defects in crystalline metals. In crystalline materials the underlying lattice provides a straightforward means of characterizing the structure of point defects, point defect clusters, and the Burgers vectors of dislocations. The same is not possible for metallic glasses, since lattice sites need not be conserved. Consequently, there are no experimental methods available to determine defect structures and properties directly, although positron annihilation does offer some help in identifying vacancies [66]. As a consequence, very little is known about the nature of defects in metallic glasses, and indeed about whether localized defects can be clearly defined at all. Most of what is known derives from MD computer simulations. Nevertheless, MD suffers from many of the same problems encountered by experiments in identifying the structure of defects. Also similar to experiments, most work has focused on the question of vacancies, since vacancies can be simply identified by a void volume.

Vacancy-like defects in metallic glasses. The first MD simulation work probing the nature of defects in metallic glasses was performed by Bennett et al. [67], who examined the stability of vacancies in Lennard–Jones glasses. In these simulations, like many that followed, a glass was created by quenching a liquid from high temperatures to a temperature sufficiently low to suppress crystallization, using various stages of relaxation along the way. These early investigations showed that while vacancies created by removing an atom from the assembly were stable near 0 K, the localized excess volume quickly dissipated at temperatures quite low in comparison to the migration temperature of vacancies in the Lennard–Jones crystal. While these investigations provided an initial understanding of point defects in metallic glasses, they did not provide a systematic evaluation of the environment of the vacancy and its stability. In a simple Bravais lattice all atoms have identical configurations, and statistical studies are thus unnecessary; in a glass, however, all sites are unique, and statistical investigations are necessary to search for patterns of behavior.
Limoge and Delaye [68–70] performed such a statistical investigation of vacancy-like entities in a Lennard–Jones glass. Like Bennett et al., they removed an atom from the glass and monitored the response. Their primary results are illustrated in Fig. 5(a)–(c), where the local excess volume around the vacancy is plotted as a function of time. These volumes were obtained by forming Voronoi polyhedra for the atoms neighboring the removal site and calculating the average volume of these polyhedra. At 0 K three different histories are observed. As shown in case (a), the volume initially undergoes a brief relaxation, but then remains constant, denoting a stable vacancy-like entity. In contrast, case (b) shows a situation where the excess volume shrinks slowly with time and returns nearly to the value it had before the vacancy was created. Lastly, case (c) illustrates the case where the volume is at first stable, but then after some time relaxes quickly to its initial value. In this case, it was observed that a neighboring atom jumped into the vacancy, annihilating it, but creating a new vacancy at a neighboring site.

Figure 5. Local excess volume in a Lennard–Jones glass as a function of time (local volume in Å³; time in units of 10⁻¹⁴ s). The three lines correspond to the three different cases (a)–(c) discussed in the text. The data are from Ref. [70].

In addition to characterizing the nature of the different types of vacancies, Delaye and Limoge calculated the thermodynamic and kinetic properties of these defects. In Fig. 6 the formation enthalpy is plotted vs. the local hydrostatic pressure. It is noteworthy that when the local pressure is low, suggestive of a crystal-like environment, the formation enthalpy is close to the value in the crystal, 0.088 eV [70].

Figure 6. Formation enthalpy (in eV) vs. the local hydrostatic pressure (in kbar) for vacancies in glasses. The data are from Ref. [68].

Delaye and Limoge also observed that the formation enthalpy increases linearly with the formation volume, with a ratio of 0.14 eV per atomic volume. On the other hand, the formation entropy was found to decrease with increasing volume, reaching a value of 4–5 k_B at the largest volumes [68]. These entropies are notably larger than in the crystal, for which a value of ∼2.7 k_B [68] has been determined.

Similar results regarding the formation of a stable point defect were recently observed in simulations of quenched amorphous copper using EAM potentials [71]. Calculations of the system evolution following either the addition or the removal of an atom showed the same three types of system evolution: stable defect formation (with a positive formation energy and volume), localized defect annihilation, and extended cooperative relaxation. The results are summarized in Table 1. The formation volumes and energies were considerably lower than the corresponding values for the crystalline case, which are 1.2 eV for the vacancy and 3.2 eV for the interstitial. The lower formation enthalpy of the interstitial is understood within the Granato model of glasses as a consequence of the reduced shear modulus (see Ref. [72] and below). Currently, there is still little agreement regarding the role of such "point defects" in diffusion in amorphous materials, as it seems that diffusion in the glassy state happens through a collective process [73].
Table 1. Formation volume V^f (in units of the atomic volume Ω_0) and formation energy E^f for stable point defects in quenched amorphous Cu, as described by our EAM potential

                              Stable defect    Localized annihilation    Extended relaxation
Atom addition   V^f (Ω_0)       0.3 ± 0.15          0.0 ± 0.1               −0.7 ± 0.5
                E^f (eV)        0.5 ± 0.25         −0.1 ± 0.2               −1.0 ± 0.5
Atom removal    V^f (Ω_0)       0.5 ± 0.2           0.0 ± 0.1               −0.4 ± 0.3
                E^f (eV)        0.7 ± 0.3           0.1 ± 0.2               −0.7 ± 0.5
However, the "interstitial-like" defects in the amorphous system caused large-scale cooperative motion as they relaxed following insertion. This is consistent with simulations showing extended non-Gaussian behavior during glass relaxation [74].

Granato interstitial model of a liquid. This chapter has dealt extensively with point defects in metals and with how they should be simulated on a computer. We have mentioned application areas where point defects are important only in passing; naturally, point defects in metals play a central role in many aspects of metals processing. We finish this chapter, however, on a speculative note by mentioning a recent theory, the Granato model, which proposes that point defects are of quite fundamental importance in the theory of liquids and glasses. This model essentially states that the structure of a liquid is, in fact, that of a crystal containing a few percent of interstitial defects, and that a glass is a frozen liquid. The basis for the model derives from two properties of interstitials in metals: (i) the entropy of formation of the interstitial is ∼15 k_B (see Section 1.2); (ii) the shear modulus of the crystal decreases rapidly with interstitial concentration. Since the formation enthalpy of the interstitial depends almost linearly on the shear modulus, every added interstitial reduces the energy required to add the next one. Granato has calculated the free energy of the solid as a function of interstitial concentration, and indeed finds that above the melting temperature the crystal with a small concentration of interstitials (10⁻⁷–10⁻⁸) is metastable with respect to the liquid, i.e., with respect to the crystal with a large number of interstitials. While defect models of melting have been proposed much earlier, these have mostly considered vacancy defects, presumably to account for the excess volume of the liquid. The entropy of a vacancy is, however, far too small to account for the entropy of melting. The interstitial, with its large entropy and large formation volume, satisfies both criteria. While it is beyond the scope of this article to critically evaluate the model, we note that computer simulations are well suited to test its predictions.
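The feedback described above – each added interstitial softens the shear modulus and thereby cheapens the next – can be caricatured in a few lines. The functional forms and constants below are invented for illustration only and are not Granato's actual model.

    import numpy as np

    kB = 8.617e-5                       # eV/K
    def free_energy(c, T, H0=4.0, beta=30.0, S=15.0):
        """Toy F(c): the formation enthalpy H0 softens as exp(-beta*c) with
        interstitial concentration c (mimicking the shear-modulus drop);
        each defect also carries a vibrational entropy of S*kB."""
        H = H0 * np.exp(-beta * c)                 # invented softening law
        s_mix = -kB * (c * np.log(c) - c)          # dilute mixing entropy
        return c * H - T * (c * S * kB) - T * s_mix

    c = np.logspace(-8, -0.5, 400)
    F = free_energy(c, T=1400.0)
    # Scanning F(c) for competing local minima (one at very small c, one at
    # a few percent) is the kind of test of the model the text alludes to.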
References

[1] P. Ehrhart, K.H. Robrock, and H.R. Schober, "Basic defects in metals," In: R.A. Johnson and A.N. Orlov (eds.), Physics of Radiation Effects in Crystals, Elsevier, Amsterdam, p. 3, 1986.
[2] P. Ehrhart, "Properties and interactions of atomic defects in metals and alloys," In: Landolt–Börnstein, New Series III, vol. 25, Springer, Berlin, Chapter 2, p. 88, 1991.
[3] M.H. Carlberg, E.P. Munger, and L. Hultman, "Self-interstitial structures in body-centred-cubic W studied by molecular dynamics simulation," J. Phys. Condens. Matt., 11, 6509–6514, 1998.
[4] S. Han, L.A. Zepeda-Ruiz, G.J. Ackland, R. Car, and D.J. Srolovitz, "Interatomic potential for vanadium suitable for radiation damage simulations," J. Appl. Phys., 93, 3328–3335, 2003.
[5] T. Uesugi, M. Kohyama, and K. Higashi, "Ab initio study on divacancy binding energies in aluminum and magnesium," Phys. Rev. B, 68, 184103, 2003.
[6] N.L. Peterson, "Self-diffusion in pure metals," J. Nucl. Mater., 69/70, 3, 1978.
[7] R.O. Simmons and R.W. Balluffi, "Measurements of equilibrium vacancy concentrations in aluminum," Phys. Rev., 117, 52–61, 1960.
[8] F. Flores and N.H. March, "Effects of relaxation round point defects in the alkali metals on formation energies," In: J.I. Takamura, M. Doyama, and M. Kiritani (eds.), Point Defects and Defect Interactions in Metals, North-Holland, Amsterdam, The Netherlands, pp. 85–92, 1981.
[9] R.W. Siegel, "Vacancy concentrations in metals," J. Nucl. Mater., 69/70, 117, 1978.
[10] H. Mehrer, "Atomic jump processes in self-diffusion," J. Nucl. Mater., 69/70, 38, 1978.
[11] K. Nordlund and R.S. Averback, "The role of self-interstitial atoms on the high temperature properties of metals," Phys. Rev. Lett., 80, 4201–4204, 1998.
[12] C.A. Gordon, A.V. Granato, and R.O. Simmons, "Evidence for the self-interstitial model of liquid and amorphous states from lattice parameter measurements in krypton," J. Non-Cryst. Sol., 205–207, 216, 1996.
[13] A. Einstein, "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen," Ann. Phys., 17, 549, 1905.
[14] B.J. Jesson, M. Foley, and P.A. Madden, "Thermal properties of the self-interstitial in aluminum: an ab initio molecular-dynamics study," Phys. Rev. B, 55, 4941, 1997.
[15] F. Willaime, "Impact of electronic structure calculations on the study of diffusion in metals," Rev. Metall., 98, 1065–1071, 2001.
[16] G.J. Dienes and G.H. Vineyard, Radiation Effects in Solids, Interscience Publishers, New York, 1957.
[17] R.C. Birtcher, W. Hertz, G. Fritsch, and J.E. Watson, "Very low temperature electron irradiation and annealing of gold and lead," Proceedings of the International Conference on Fundamental Aspects of Radiation Damage in Metals, CONF-751006-P1, vol. 1, p. 405, 1975.
[18] H. Schroeder and B. Stritzker, "Resistivity annealing of gold after 150 keV proton irradiation at 0.3 K," Radiation Effects, 33, 125–126, 1977.
[19] T. Hales, "The Kepler conjecture," Ann. Math., to appear.
[20] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, 2nd edn., Cambridge University Press, New York, 1995.
[21] K. Nordlund, "Diffuse X-ray scattering from 3 1 1 defects in Si," J. Appl. Phys., 91, 2978, 2002.
[22] S.M. Foiles, "Application of the embedded-atom method to liquid transition metals," Phys. Rev. B, 32, 3409, 1985.
[23] K. Albe, K. Nordlund, and R.S. Averback, "Modeling metal–semiconductor interaction: analytical bond-order potential for platinum–carbon," Phys. Rev. B, 65, 195124, 2002.
[24] R.H. Telling, C.P. Ewels, A.A. El-Barbary, and M.I. Heggie, "Wigner defects bridge the graphite gap," Nature Mater., 2, 333, 2003.
[25] G.-X. Qian, R.M. Martin, and D.J. Chadi, "First-principles study of the atomic reconstruction and energies of Ga- and As-stabilized GaAs(1 0 0) surfaces," Phys. Rev. B, 38, 7649, 1988.
[26] D. Frenkel and A.J.C. Ladd, "New Monte Carlo method to compute the free energy of arbitrary solids. Application to the fcc and hcp phases of hard spheres," J. Chem. Phys., 81, 3188, 1984.
[27] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic Press, San Diego, 2002.
[28] J.M. Rickman and R. LeSar, "Free-energy calculations in materials research," Annu. Rev. Mater. Sci., 32, 195–217, 2002.
[29] S.M. Foiles, "Evaluation of harmonic methods for calculating the free energy of defects in solids," Phys. Rev. B, 49, 14930, 1994.
[30] K. Carling, G. Wahnström, T.R. Mattsson, A.E. Mattsson, N. Sandberg, and G. Grimvall, "Vacancies in metals: from first-principles calculations to experimental data," Phys. Rev. Lett., 85, 3862, 2000.
[31] Y. Mishin, M.R. Sorensen, and A.F. Voter, "Calculation of point-defect entropy in metals," Philos. Mag. A, 81, 2591–2612, 2001.
[32] J. Wallenius, P. Olsson, C. Lagerstedt, N. Sandberg, R. Chakarova, and V. Pontikis, "Modelling of chromium precipitation in Fe–Cr alloys," Phys. Rev. B, 69, 094103, 2003.
[33] A. Satta, F. Willaime, and S. de Gironcoli, "Vacancy self-diffusion parameters in tungsten: finite electron-temperature LDA calculations," Phys. Rev. B, 57, 11184–11192, 1998.
[34] M. Villarba and H. Jonsson, "Diffusion mechanisms relevant to metal crystal growth: Pt/Pt(111)," Surf. Sci., 317, 15, 1994; G. Mills, H. Jonsson, and G.K. Schenter, Surf. Sci., 324, 305, 1995.
[35] G. Henkelman and H. Jonsson, "A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives," J. Chem. Phys., 111, 7010, 1999.
[36] G.H. Vineyard, "Frequency factors and isotope effects in solid state rate processes," J. Phys. Chem. Solids, 3, 121–127, 1957.
[37] N.Q. Lam, L. Dagens, and N.V. Doan, "Calculations of the properties of self-interstitials and vacancies in the face-centred cubic metals Cu, Ag and Au," J. Phys. F: Met. Phys., 13, 2503–2516, 1983.
[38] M.S. Daw, S.M. Foiles, and M.I. Baskes, "The embedded-atom method: a review of theory and applications," Mat. Sci. Engr. Rep., 9, 251, 1993.
[39] K. Nordlund, M. Ghaly, R.S. Averback, M. Caturla, T. Diaz de la Rubia, and J. Tarus, "Defect production in collision cascades in elemental semiconductors and FCC metals," Phys. Rev. B, 57, 7556–7570, 1998.
[40] B.-J. Lee, J.-H. Shim, and M.I. Baskes, "Semiempirical atomic potentials for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, Al, and Pb based on first and second nearest-neighbor modified embedded atom method," Phys. Rev. B, 68, 144112, 2003.
[41] F. Willaime and C. Massobrio, "A molecular dynamics study of zirconium based on an n-body potential: HCP/BCC phase transformation and diffusion mechanisms in the BCC phase," MRS Symp. Proc., 193, 295, 1990.
[42] U. Breier, W. Frank, C. Elsässer, M. Fähnle, and A. Seeger, "Properties of monovacancies and self-interstitials in bcc Na: an ab initio pseudopotential study," Phys. Rev. B, 50, 5928, 1994.
[43] A.S. Goldstein and H. Jonsson, "An embedded atom method potential for the h.c.p. metal Zr," Philos. Mag. B, 71, 1041–1056, 1995.
[44] Q.M. Hu, D.S. Xu, and D. Li, "First principles calculations for vacancy formation energy and solute-vacancy interaction energy in alpha-Ti," Int. J. Mater. Prod. Technol., 622–627, 2001.
[45] F. Willaime, A. Satta, M. Nastar, and O. Le Bacq, "Electronic structure calculations of vacancy parameters in transition metals: impact on the BCC self-diffusion anomaly," Int. J. Quant. Chem., 77, 927–939, 2000.
[46] W. Xu and J. Moriarty, "Atomistic simulation of ideal shear strength, point defects, and screw dislocations in bcc transition metals: Mo as a prototype," Phys. Rev. B, 54, 6941, 1996.
[47] Y.N. Osetsky, A. Serra, V. Priego, F. Gao, and D.J. Bacon, "Mobility of self-interstitials in fcc and bcc metals," In: Y. Mishin, G. Vogl, N. Cowern, R. Catlow, and D. Farkas (eds.), Diffusion Mechanisms in Crystalline Materials, MRS Symposium Proceedings, MRS, Warrendale, pp. 49–58, 1998.
[48] G. Simonelli, R. Pasianot, and E.J. Savino, "Self-interstitial configuration in B.C.C. metals. An analysis based on many-body potentials for Fe and Mo," Phys. Stat. Sol. B, 217, 747–758, 2000.
[49] B.D. Wirth, G.R. Odette, D. Maroudas, and G.E. Lucas, "Energetics of formation and migration of self-interstitials and self-interstitial clusters in alpha-iron," J. Nucl. Mater., 244, 185–194, 1997.
[50] Y. Kawazoe, K. Ohno, K. Shiga, H. Kamiyama, Z. Tang, M. Hasegawa, and H. Matsui, "How accurate the first-principles calculations can be applied to nuclear reactor materials research?" Nucl. Instr. Meth. Phys. Res. B, 153, 77–86, 1999.
[51] H. Bilger, V. Hivert, J. Verdone, J. Leveque, and J. Soulie, "Point defects in iron," In: International Conference on Vacancies and Interstitials in Metals, Kernforschungsanlage Jülich, Jülich, pp. 751–767, 1968.
[52] T. Korhonen, First-principles Electronic Structure Calculations: Defects in Metals, Nitrides and Carbides, Ph.D. Thesis, Helsinki University of Technology, Espoo, Finland, 1996.
[53] P.A. Korzhavyi, I.A. Abrikosov, B. Johansson, A.V. Ruban, and H.L. Skriver, "First-principles calculations of the vacancy formation energy in transition and noble metals," Phys. Rev. B, 59, 11693–11703, 1999.
[54] T.R. Mattsson and A.E. Mattsson, "Calculating the vacancy formation energy in metals: Pt, Pd, and Mo," Phys. Rev. B, 66, 214110, 2002.
[55] M.J. Sabochick and S. Yip, "Migration energy calculations for small vacancy clusters in copper," J. Phys. F: Met. Phys., 18, 1689–1701, 1988.
[56] J.B. Adams, S.M. Foiles, and W.G. Wolfer, "Self-diffusion and impurity diffusion of fcc metals using the five-frequency model and the embedded atom method," J. Mater. Res., 4, 102, 1989.
[57] C.P. Flynn, Point Defects and Diffusion, Clarendon Press, Oxford, UK, 1972.
[58] S.M. Foiles, M.I. Baskes, and M.S. Daw, "Embedded-atom-method functions for the fcc metals Cu, Ag, Au, Ni, Pd, Pt, and their alloys," Phys. Rev. B, 33, 7983, 1986; Erratum: ibid., 37, 10378, 1988.
[59] K. Carling, G. Wahnström, T.R. Mattsson, N. Sandberg, and G. Grimvall, "Vacancy concentration in Al from combined first-principles and model potential calculations," Phys. Rev. B, 67, 054101, 2003.
[60] N. Sandberg, B. Magyari-Köpe, and T.R. Mattsson, "Self-diffusion rates in Al from combined first-principles and model-potential calculations," Phys. Rev. Lett., 89, 065901, 2002.
[61] A. Khellaf, A. Seeger, and R.M. Emrick, "Quenching studies of lattice vacancies in high-purity aluminium," Mater. Trans., 43, 186–189, 2002.
[62] P.A. Korzhavyi, A.V. Ruban, A.Y. Lozovoi, Y.K. Vekilov, I.A. Abrikosov, and B. Johansson, "Constitutional and thermal point defects in B2 NiAl," Phys. Rev. B, 61, 6003, 2000.
[63] Y. Mishin, M.J. Mehl, and D.A. Papaconstantopoulos, "Embedded-atom potential for B2-NiAl," Phys. Rev. B, 65, 224114, 2002.
[64] Y. Mishin, A.Y. Lozovoi, and A. Alavi, "Evaluation of diffusion mechanisms in NiAl by embedded-atom and first-principles calculations," Phys. Rev. B, 67, 014201, 2003.
[65] A.Y. Lozovoi and Y. Mishin, "Point defects in NiAl: the effect of lattice vibrations," Phys. Rev. B, 68, 184113, 2003.
[66] X.D. Liu, J. Zhu, Z.Q. Hu, and J.T. Wang, "Investigation of defective structure of nanocrystalline Fe–Mo–Si–B alloys by the positron annihilation technique," J. Mater. Sci. Lett., 12, 1826–1828, 1993.
[67] C. Bennett, P. Chaudhari, V. Moruzzi, and P. Steinhardt, "On the stability of vacancy and vacancy clusters in amorphous solids," Philos. Mag. A, 40, 485, 1979.
[68] J.M. Delaye and Y. Limoge, "Molecular dynamics study of vacancy-like defects in a model glass: dynamical behavior and diffusion," J. Phys. I, 3, 2079–2097, 1993.
[69] J.M. Delaye and Y. Limoge, "Molecular dynamics study of vacancy-like defects in a model glass: static behaviour," J. Phys. I, 3, 2063–2077, 1993.
[70] Y. Limoge, "Microscopic and macroscopic properties of diffusion in metallic glasses," Mater. Sci. Eng. A, 226–228, 228, 1997.
[71] Y. Ashkenazy, R.S. Averback, and A. Granato, "Point defects in supercooled amorphous Cu," to be published, 2004.
[72] A.V. Granato, "Interstitialcy model for condensed matter states of face-centered-cubic metals," Phys. Rev. Lett., 68, 974, 1992.
[73] F. Faupel, W. Frank, M.-P. Macht, H. Mehrer, V. Naundorf, K. Rätzke, H.R. Schober, S.K. Sharma, and H. Teichler, "Diffusion in metallic glasses and supercooled melts," Rev. Mod. Phys., 75, 237–280, 2003.
[74] H.R. Schober, C. Oligschleger, and B.B. Laird, "Low-frequency vibrations and relaxations in glasses," J. Non-Cryst. Solids, 156–158, 965–968, 1993.
[75] G.E. Murch and I.V. Belova, "Chemical diffusion by vacancy pairs in intermetallic compounds with the B2 structure," Phil. Mag. Lett., 80, 569–575, 2000.
6.3 DEFECTS AND IMPURITIES IN SEMICONDUCTORS

Chris G. Van de Walle
Materials Department, University of California, Santa Barbara, California, USA
Impurities are essential for giving semiconductors the properties that render them useful for electronic and optoelectronic devices. The intrinsic carrier concentrations in most semiconductors are quite low. Adding small amounts of impurities allows control of the conductivity of the semiconductor: shallow donors, such as phosphorus in silicon, produce n-type conductivity (carried by electrons), and shallow acceptors, such as boron in silicon, produce p-type conductivity (carried by holes). These doped layers and the junctions between them control carrier confinement, carrier flow, and ultimately the device characteristics. Commonly used semiconductors such as Si and GaAs can be doped both p-type and n-type. Constraints on doping still limit device performance, however. For instance, the shrinking size of Si field-effect transistors requires higher doping densities, with donors exhibiting deactivation when the doping increases above ∼3 × 10^20 cm⁻³. Doping problems have been more severe in wide-band-gap semiconductors such as ZnSe, GaN, or ZnO, which typically exhibit unintentional n-type conductivity, and in which p-type conductivity has been difficult to achieve. Point defects (vacancies, self-interstitials, and antisites) have often been invoked to explain these difficulties.

Computational studies have made important contributions to solving many of these problems. One application of the computational results is to provide a microscopic identification of a defect or impurity, by comparing the computed properties with experiment. For instance, atomic relaxations can be compared with EXAFS (extended X-ray absorption fine structure) measurements, hyperfine parameters (based on calculated wave functions) with EPR (electron paramagnetic resonance), and vibrational frequencies with infrared or Raman spectroscopy. In addition, the ability provided by first-principles calculations to calculate total energies enables us to attack the problems related to doping limitations. These limitations can be attributed to four causes: solubility; ionization energy; incorporation of impurities in undesired configurations; and
compensation by native point defects or foreign impurities. Each of these issues can be addressed by performing total-energy calculations, as illustrated in the following sections. Unless otherwise specified, I will use the word "defect" to denote both native point defects and impurities, since the methodologies are the same.
1. Geometry
For most purposes, the key issue is to calculate the properties of a single, isolated point defect in an infinite solid. Interactions between defects may be important, but these can often be modeled once the properties of the isolated, noninteracting species are known. There are three basic approaches to handling the geometry corresponding to a single defect: clusters, supercells, and Green's functions.

The cluster approach is based on the assumption that most of the essential physics is captured if the local environment of the defect is well described. It focuses on the interactions of the defect with the surrounding shells of host atoms, using a cluster geometry. The cluster should be large enough to provide a reasonable description of the band structure of the host and to minimize interactions between the defect and the surface of the cluster. Passivation of the surface is required in order to suppress surface states that would otherwise dominate the results; hydrogen atoms are commonly used for this purpose.

In a supercell geometry, the impurity is surrounded by a finite number of semiconductor atoms, and that whole structure is periodically repeated. This has the advantage of allowing the use of various techniques that require translational periodicity of the system. Supercells need to be large enough to provide adequate separation of the defects, which can be explicitly tested. Another major advantage is that the band structure of the host crystal is well described: a calculation for a supercell that is simply filled with the host crystal, in the absence of any defect, produces the band structure of the host. This contrasts with cluster approaches, where even fairly large clusters still produce sizeable quantum-confinement effects that significantly affect the band structure.

For the zinc-blende (ZB) structure, typical supercells contain 16, 32, 54, 64, 128, 216, or 256 atoms. The 16-, 54-, and 128-atom supercells are based on simply extending the basis vectors of the fcc lattice; the shape of these cells is ill suited to providing adequate separation between defects in all directions. The 64- and 216-atom supercells are based on enlarging the "conventional" 8-atom simple cubic cell. The 32- and 256-atom supercells, finally, have bcc symmetry and are the most appropriate for separating defects in all spatial directions. Defect calculations should be carried out at the theoretical lattice constant, in order to avoid spurious elastic interactions with defects in neighboring supercells. Atomic relaxations of several shells of neighbors around
the defect should be allowed, and it may be necessary to break the symmetry in order to explore low-symmetry configurations of the defect.

Another approach that provides a good description of the band structure of the host crystal is based on the Green's function determined for the perfect crystal. This function is then used to calculate the changes induced by the presence of the defect. The Green's function approach is more cumbersome and less physically transparent than the supercell technique; indeed, the supercell approach has become the method of choice for state-of-the-art defect calculations.
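As a concrete illustration of the supercell bookkeeping described above, the sketch below builds a 64-atom zinc-blende cell by tiling the conventional 8-atom cubic cell 2 × 2 × 2 and then creates a vacancy; it is a geometry-only toy, with no energetics.

    import numpy as np

    # 64-atom zinc-blende supercell: tile the conventional 8-atom cubic cell
    # 2 x 2 x 2, then delete one site to create a vacancy. Coordinates are in
    # units of the cubic lattice constant a.
    cation = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
    anion = cation + 0.25                          # zinc-blende offset (1/4,1/4,1/4)
    conventional = np.vstack([cation, anion])      # 8 atoms

    shifts = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)]
    supercell = np.concatenate([conventional + s for s in shifts])
    assert len(supercell) == 64

    defect_cell = np.delete(supercell, 0, axis=0)  # remove one cation: a vacancy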
2. Hamiltonian
Any Hamiltonian for a defect in a semiconductor must include terms that describe the interactions between the nuclei, the interactions of the electrons with the nuclei, and the electron–electron interactions. The latter are the most difficult part of the problem. Historically, Hartree–Fock methods were the first to attack many-electron problems, with considerable success for atoms and molecules. Their main problems are the neglect of correlation and the computational demands: ab initio Hartree–Fock methods can only be applied to systems with small numbers of atoms, because they require the evaluation of a large number of multicenter integrals. Quantum chemists have developed simpler semi-empirical methods that either neglect or approximate some of these integrals, but the accuracy and reliability of such methods are hard to assess.

Tight-binding calculations have also often been used for defects. These methods take advantage of the fact that, within a local basis set, the Hamiltonian matrix elements decrease rapidly with increasing distance between the orbitals. Most of the matrix elements therefore vanish, and only a sparse matrix has to be diagonalized instead of the full Hamiltonian matrix. Depending on how the remaining matrix elements are determined, one can distinguish two main approaches: (i) empirical tight-binding methods use parameters obtained by fitting a set of experimental or computed quantities, a procedure for which no consistent prescription exists; (ii) first-principles tight-binding methods, on the other hand, use local orbitals to explicitly calculate the Hamiltonian matrix elements [1].

The most rigorous calculations for defects in semiconductors are based on density-functional theory (DFT). DFT in the local density approximation (LDA) [2, 3] allows a description of the many-body electronic ground state in terms of single-particle equations and an effective potential. The effective potential consists of the ionic potential due to the atomic cores, the Hartree potential describing the electrostatic electron–electron interaction, and the exchange-correlation potential that takes into account the many-body effects. This approach has proven to describe with high accuracy such quantities as atomic geometries,
charge densities, formation energies, etc. For calculations of defects and impurities in semiconductors, the use of the local density approximation seems to be well justified; the generalized gradient approximation offers few advantages, either for bulk properties or for the formation energies of point defects [4, 5]. One shortcoming of DFT is its failure to produce accurate excited-state properties – the band gap is commonly underestimated. No method is currently available that goes beyond DFT and provides total-energy capability for the large supercell calculations required to investigate defects. Even methods aimed solely at calculating the band structure, such as quasiparticle calculations in the GW approximation [6], are currently prohibitively expensive for large cells.

Defect calculations can, in principle, be performed in an all-electron approach, such as the FLAPW (full-potential linearized augmented plane wave) or the full-potential LMTO (linearized muffin-tin orbital) methods. Computationally, however, a pseudopotential approach is most efficient, particularly for the large-scale calculations required for defects. Most properties of molecules and solids are determined by the valence electrons, i.e., those electrons in the outer shells that take part in the bonding between atoms. The core electrons can be removed from the problem by representing the ionic core (i.e., the nucleus plus the inner shells of electrons) by a pseudopotential. State-of-the-art calculations employ nonlocal, norm-conserving pseudopotentials that are generated from atomic calculations and do not contain any fitting to experiment [7]. Such calculations can therefore be called "ab initio" or "first-principles".

A plane-wave basis set is most commonly used in the pseudopotential approach. Convergence as a function of the plane-wave cutoff is straightforward to check. Integrations over the Brillouin zone are performed using the standard Monkhorst–Pack scheme [8], with a regularly spaced mesh of n × n × n points in the reciprocal unit cell, shifted from the origin to avoid picking up the Γ point as one of the sampling points. Symmetry reduces this set to a set of points in the irreducible part of the Brillouin zone.

Most problems require calculations not only of the electronic wave functions, but also of the atomic positions. An important advance in this respect was the development of the Car–Parrinello method [9], which allows simultaneous optimization of the electronic and atomic degrees of freedom. The ability to move atoms also allows performing first-principles molecular dynamics.
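A minimal generator for the Monkhorst–Pack sampling described above might look as follows; the standard mesh formula is used, and the symmetry reduction to the irreducible wedge is omitted.

    import numpy as np

    # n x n x n Monkhorst-Pack mesh in fractional reciprocal coordinates:
    # u_r = (2r - n - 1) / (2n), r = 1..n. For even n the mesh avoids the
    # Gamma point; symmetry reduction to the irreducible wedge is omitted.
    def mp_mesh(n):
        u = (2.0 * np.arange(1, n + 1) - n - 1) / (2.0 * n)
        return np.array([[x, y, z] for x in u for y in u for z in u])

    print(mp_mesh(2))   # 8 points at (+-1/4, +-1/4, +-1/4); Gamma excluded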
3. Formation Energies
The formation energy is a key quantity determining the properties of a defect or impurity. Indeed, the equilibrium concentration of a defect is given by

c = N_sites e^{−E^f/kT},   (1)
where E^f is the formation energy, N_sites is the number of sites on which the defect or impurity can be incorporated, k is the Boltzmann constant, and T the temperature. Equation (1) shows that defects with a high formation energy will occur in low concentrations. In principle, the free energy should be used in Eq. (1); contributions from vibrational entropy are often neglected, however. Explicit calculations of entropies are quite demanding, and the entropies also often cancel to some extent, for instance when solubilities are calculated.

The formation energy is not a constant but depends on the growth conditions. We illustrate the key concepts with the example of a Mg acceptor (substituting on a Ga site) in GaN. Its formation energy is determined by the relative abundance of Mg, Ga, and N atoms, as expressed by the chemical potentials µ_Mg, µ_Ga, and µ_N, respectively. If the Mg acceptor is charged (as is expected when it is electrically active), the formation energy also depends on the Fermi level (E_F), which acts as a reservoir for electrons. Forming a substitutional Mg acceptor requires the removal of one Ga atom and the addition of one Mg atom; the formation energy is therefore

E^f(GaN:Mg_Ga^−) = E_tot(GaN:Mg_Ga^−) − E_tot(GaN, bulk) − µ_Mg + µ_Ga − [E_F + E_v + ΔV].   (2)
First-principles calculations allow explicit derivation of E_tot(GaN:Mg_Ga^−), the total energy of a system containing substitutional Mg on a Ga site. Similar expressions apply to other impurities and to the various native point defects.

The chemical potentials depend on the experimental growth conditions, which can be either Ga-rich or N-rich. For the Ga-rich case, µ_Ga = µ_Ga[bulk] places an upper limit on µ_Ga: pushing µ_Ga beyond this limit results in precipitation of bulk Ga rather than growth of GaN. In equilibrium, µ_Ga + µ_N = E_tot[GaN], where E_tot[GaN] is the total energy of a two-atom unit of bulk GaN; the upper limit on µ_Ga therefore places a lower limit on µ_N. For the N-rich case, the upper limit on µ_N is given by µ_N = µ_N[N2], i.e., the energy of N in an N2 molecule; this yields a lower limit on µ_Ga. The total energy of GaN can also be expressed as E_tot[GaN] = µ_Ga[bulk] + µ_N[N2] + ΔH_f[GaN], where ΔH_f[GaN] is the enthalpy of formation, which is negative for a stable compound. The host chemical potentials thus vary over a range corresponding to the magnitude of the enthalpy of formation of the compound.

The Mg atom in our example occurs in the negative charge state (i.e., it has donated a hole to the valence band). The electron that is placed on the Mg is taken out of a reservoir of electrons, the energy of which is the electron chemical potential, or Fermi level, E_F. Here E_F is referenced to the valence-band maximum in the bulk. Due to this choice of reference, Eq. (2) needs to contain explicitly a term representing the energy of the bulk valence-band maximum, E_v, when expressing the formation energy of a charged state. However, we also need to add a correction term, ΔV, to align the reference potential in
the defect calculation with that in the bulk, as discussed in Van de Walle and Neugebauer, 2004 [10].

Another issue regarding calculations for charged states is the treatment of the G = 0 term in the total energy of the supercell. This term would diverge for a charged system; we therefore assume the presence of a compensating uniform background (jellium) and evaluate the G = 0 term as if the system were neutral [11]. Correction terms have been proposed [12], but they tend to overestimate the effect, essentially because screening is more efficient than assumed in a simple dielectric model.

The Fermi level E_F is not an independent parameter, but is always determined by the condition of charge neutrality. In principle, equations such as Eq. (2) can be formulated for every native defect and impurity in the material; the complete problem (including free-carrier concentrations in the valence and conduction bands) can then be solved self-consistently, imposing charge neutrality. However, it is often instructive to plot formation energies as a function of E_F in order to examine the behavior of defects and impurities when the doping level changes.

An example of such a plot is given in Fig. 1. It shows results for defects and impurities relevant for p-type GaN. The zero of E_F is located at the top of the valence band.

Figure 1. Formation energies as a function of Fermi level for point defects and impurities relevant for p-type GaN. Energies are shown for Mg and Be in different configurations (Ga-substitutional, N-substitutional, and interstitial). Also included are the dominant native defect (V_N) and hydrogen, a foreign impurity. Nitrogen-rich conditions and equilibrium with Mg3N2 or Be3N2 are assumed.

For ease of presentation, the chemical potentials are set equal to fixed values; however, a general case can always be addressed by referring back to Eq. (2). The fixed values correspond to N-rich conditions and to maximum incorporation of the impurities. For Mg, this is determined by equilibrium with Mg3N2. Indeed, increasing the chemical potential of Mg results in a lowering of the formation energy of Mg_Ga [Eq. (2)] and an increase in its concentration [Eq. (1)]; at some point, however, it becomes more favorable to form Mg3N2 instead of incorporating Mg as an impurity, and this determines the solubility limit.

For each defect in Fig. 1, only the line segment is shown that corresponds to the charge state with the lowest energy at a particular value of E_F. The slope of a line segment represents the charge state, and a kink (change in slope) therefore represents a change in the charge state of the defect [see Eq. (2)]. The Fermi-level position at which this change occurs corresponds to a transition level that can be experimentally measured.
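The construction of plots like Fig. 1 from Eq. (2) can be sketched in a few lines: each charge state q contributes a straight line of slope q, and the reported curve is the lower envelope. The formation energies below are invented placeholders, not the actual GaN values.

    import numpy as np

    # Each charge state q gives E^f(E_F) = E^f(E_F = 0) + q*E_F [cf. Eq. (2)];
    # the plotted curve is the lower envelope over charge states, and kinks
    # mark thermodynamic transition levels. Energies here are placeholders.
    charge_states = {0: 1.6, -1: 1.8}       # q -> E^f at E_F = 0 (eV), an acceptor
    E_F = np.linspace(0.0, 3.4, 200)        # Fermi level across the gap (eV)

    lines = np.vstack([Ef0 + q * E_F for q, Ef0 in charge_states.items()])
    envelope = lines.min(axis=0)            # what is actually drawn in Fig. 1

    # The (0/-) transition level is where the two lines cross:
    eps_0_m = charge_states[-1] - charge_states[0]
    print(eps_0_m)                          # 0.2 eV above the VBM here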
4. Doping-Limiting Mechanisms
In the introduction we mentioned that first-principles calculations can elucidate doping limitations and suggest ways to overcome them. We now illustrate this claim with the example of Fig. 1.

(1) Solubility. A high free-carrier concentration obviously requires a high concentration of the dopant impurity. The solubility corresponds to the maximum concentration that the impurity can attain in the semiconductor, a quantity that depends on the growth temperature and on the abundance of the impurity, as well as of the host constituents, in the growth environment. As discussed above, the solubility of Mg in GaN is limited by the formation of Mg3N2. Figure 1 shows that, under comparable conditions, the formation energy of beryllium is markedly lower than that of magnesium, implying that Be has a higher solubility.

(2) Ionization energy. The ionization energy of a dopant determines the fraction of dopants that contribute free carriers at a given temperature; a high ionization energy limits the doping efficiency. Ionization energies of acceptors in GaN are typically so large (around 200 meV) that at room temperature only about 1% of the acceptor impurities are ionized. Ionization energies are mainly determined by intrinsic properties of the semiconductor, such as effective masses, the dielectric constant, etc.; switching to a different acceptor therefore has no dramatic effect on the ionization energy. The ionization energy of a substitutional acceptor corresponds to the Fermi-level position where the neutral and negative charge states have equal energies, i.e., it is given by the kink in the
curves for Mg_Ga and Be_Ga in Fig. 1. Error bars on ionization energies are typically quite large. Still, Fig. 1 shows that the ionization energy of Be is slightly lower than that of Mg, suggesting that from this point of view, too, Be would be a superior acceptor.

(3) Incorporation of impurities in other configurations. Most donor and acceptor impurities reside on substitutional sites, i.e., they replace one of the host atoms. In order for Mg in GaN to act as an acceptor, it needs to be incorporated on the gallium site. If Mg is located in other positions in the lattice, such as an interstitial position or substituting for a nitrogen atom, it actually behaves as a donor. For GaN doped with Mg, Fig. 1 shows that these other configurations (Mg_N and Mg_i) are energetically unfavorable, and hence will not form. The situation is different for Be: as shown in Fig. 1, the formation energy of beryllium interstitials is quite low, particularly when the Fermi level approaches the valence band. Self-compensation is therefore a serious risk when doping with Be. Another instance of impurities incorporating in undesirable configurations is the so-called DX centers. The prototype DX center is Si in AlGaAs. In GaAs and in AlGaAs with low Al content, Si behaves as a shallow donor, but when the Al content exceeds a critical value, Si behaves as a deep level. This has been explained in terms of Si moving off the substitutional site towards an interstitial position [13]. It has been found that oxygen forms a DX center in AlGaN when the Al content exceeds about 30% [10].

(4) Compensation by native point defects. Native defects have frequently been invoked to explain doping problems in semiconductors. Recent studies have shown that this problem is not necessarily more severe in wide-band-gap semiconductors than in, e.g., GaAs. For GaN, compensation by vacancies can in some cases limit the doping level: gallium vacancies are acceptors and compensate n-type GaN, and nitrogen vacancies are donors and compensate p-type GaN (as illustrated in Fig. 1). Self-interstitials and antisites were found to be too high in energy in GaN to form in significant concentrations. Figure 1 also shows that the nitrogen vacancy has a transition level (between the 3+ and + charge states) at 0.5 eV above the valence band. Transfer of electrons from the conduction band (or from a shallow donor level) to this level of V_N may therefore give rise to blue luminescence, roughly 0.5 eV below the band gap.

Note that in order to calculate the transition energy more accurately, the Franck–Condon shift should be taken into account. This is illustrated, for a general case, in Fig. 2. During the emission process the atomic configuration of the defect remains fixed, i.e., in the final state the defect has undergone a change in charge state but still has the atomic configuration corresponding to the initial charge state. Calculations can provide a complete configuration coordinate diagram showing the energy of the defect in different charge states as a function of the atomic configuration, from which the energies of all relevant transitions can be obtained. Figure 2 shows that the emission energy (e.g., in a photoluminescence experiment) is smaller than expected based on the position of the thermodynamic transition level ε(q/q + 1).

Figure 2. Schematic configuration coordinate diagram illustrating the difference between thermal and optical ionization energies for a defect X. The curve for X^q is vertically displaced from that for X^{q+1}, assuming the presence of an electron in the conduction band. E_rel is the Franck–Condon shift, i.e., the relaxation energy that can be gained by relaxing from configuration z_q (equilibrium configuration for charge state q) to configuration z_{q+1} (equilibrium configuration for charge state q + 1). E_a is the absorption energy, E_e the emission energy, ε(q/q + 1) the thermodynamic transition level, and E_g the band gap.

(5) Compensation by foreign impurities. This source of compensation may seem rather obvious, but it should be mentioned for completeness: when doping with acceptors in order to obtain p-type conductivity, for instance, impurities that act as donors should be tightly controlled. Hydrogen, which is often unintentionally introduced during growth, is a prime example of such an impurity. In p-type GaN, H behaves as a donor (H⁺) (see Fig. 1); it thus compensates acceptors, and a post-growth annealing step is required to render the acceptors electrically active. The presence of hydrogen during growth is actually beneficial. Indeed, as shown in Fig. 1 [14], the formation energy of hydrogen is lower than that of the nitrogen vacancy; hydrogen is therefore the preferred source of compensation, and suppresses the formation of V_N. The presence of hydrogen during growth also shifts the Fermi level
higher in the band gap, since it acts as a donor. This shift lowers the formation energy of the Mg or Be substitutional acceptors, thus enhancing the solubility. Controlling the presence of hydrogen during growth thus provides the capability to affect the concentrations of desired impurities (acceptors) and undesirable point defects (nitrogen vacancies). This example of defect and impurity engineering was based directly on the insights obtained from first-principles calculations.
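The energies in a diagram like Fig. 2 follow from simple bookkeeping once the curves for the two charge states are known. The sketch below uses two harmonic curves with invented parameters to recover the absorption energy, the emission energy, and the Franck–Condon shift.

    # Two harmonic configuration-coordinate curves (parameters invented):
    # X^q, and X^(q+1) plus an electron in the conduction band.
    E = lambda E0, k, z0, z: E0 + k * (z - z0) ** 2

    E0_q,  k_q,  z_q  = 0.0, 2.0, 0.0   # minimum energy (eV), curvature, position
    E0_q1, k_q1, z_q1 = 1.8, 2.0, 0.3   # E0_q1 - E0_q is the thermal level

    E_abs = E(E0_q1, k_q1, z_q1, z_q)  - E(E0_q, k_q, z_q, z_q)    # absorb at z_q
    E_em  = E(E0_q1, k_q1, z_q1, z_q1) - E(E0_q, k_q, z_q, z_q1)   # emit at z_(q+1)
    E_rel = E_abs - (E0_q1 - E0_q)                                 # Franck-Condon shift
    print(E_abs, E_em, E_rel)   # 1.98, 1.62, 0.18 eV: emission < thermal level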
5. Discussion
Strictly speaking, the relationship [Eq. (1)] between concentrations and formation energies only holds in thermodynamic equilibrium. Materials growth is obviously a non-equilibrium situation; however, many growth techniques are close enough to equilibrium to warrant the use of the equilibrium approach. It should be emphasized that not all aspects of the process need to be in equilibrium in order to justify the use of equilibrium expressions for defects and impurities. What is required is a sufficiently high mobility of the relevant impurities and point defects to allow them to equilibrate at the temperatures of interest. Computations can be used to assess the mobility of defects. The migration barrier of an impurity is the energy difference between the saddle point and the ground state. Identifying the saddle point can be accomplished, e.g., by using the nudged elastic band method [15]. It is also possible to map the complete total energy surface for an interstitial impurity moving through a solid [11]. Our discussion so far has focused on isolated point defects and impurities. Complexes between defects and/or impurities can also be important. It is often assumed that such complexes automatically form when two defects exhibit a positive binding energy. In fact, it can be shown that for the equilibrium concentration of complexes to exceed the concentration of either constituent, the binding energy needs to be greater than the larger of the formation energies of the constituents [10]. We mentioned in the introduction that calculations of quantities that can be directly compared with experiment provide a powerful means of microscopic identification of a defect or impurity. The first-principles calculations explicitly produce wave functions; it is therefore possible to calculate hyperfine parameters. The wave functions have to be obtained from a spin-polarized calculation, i.e., spin-up and spin-down electrons have to be treated independently. It has been shown that it is important to take contributions to the spin density from all the occupied states into account. Hyperfine parameters are particularly sensitive to the wave functions in the core region. When using a pseudopotential approach, the wave function in the core region is replaced by a smooth pseudo-wave function. The information contained in the
pseudo-wave function, in conjunction with information about wave functions in the free atom, is actually sufficient to calculate hyperfine parameters with a high degree of accuracy [16]. Defects or impurities often give rise to localized vibrational modes (LVMs). Light impurities, in particular, exhibit distinct LVMs that are often well above the bulk phonon spectrum. The value of the observed frequency often provides some indication as to the chemical nature of the atoms involved in the bond, but a direct comparison with first-principles calculations has proven to be very valuable. Evaluating the vibrational frequency corresponding to a stretching or wagging mode of a particular bond can be accomplished by using calculated forces to construct a dynamical matrix. In the case of light impurities, where a large mass difference exists between the impurity and the surrounding atoms, it is often a good approximation to focus on the displacement of the light impurity alone, keeping all other atoms fixed. A fit to the calculated energies as a function of displacement then produces a force constant. This approach lends itself well to taking higher-order terms (anharmonic corrections) into account. In the case of an impurity such as hydrogen the anharmonic terms can be on the order of several hundred cm−1, and therefore an accurate treatment is essential [17].
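A bare-bones version of this fitting procedure is sketched below. The energy data are synthetic stand-ins for calculated total energies at a series of displacements, and all parameter values are invented for illustration.

```python
import numpy as np

# Synthetic energy-vs-displacement data for a light impurity (e.g., H),
# mimicking first-principles total energies; includes a cubic term so the
# anharmonicity discussed in the text is present.
u = np.linspace(-0.10, 0.10, 9)          # displacement along the bond, Angstrom
E = 0.5 * 33.0 * u**2 - 40.0 * u**3      # eV (illustrative coefficients)

# Fit a cubic polynomial; the quadratic coefficient gives the force constant.
c3, c2, c1, c0 = np.polyfit(u, E, 3)
k = 2.0 * c2                             # eV / Angstrom^2

m_H = 1.008                              # amu; heavier neighbours held fixed
to_cm1 = 521.47                          # cm^-1 per sqrt(eV amu^-1 A^-2)
nu_harm = to_cm1 * np.sqrt(k / m_H)
print(f"harmonic stretch frequency ~ {nu_harm:.0f} cm^-1")
# Corrections from c3 (and higher-order terms) can shift this by several
# hundred cm^-1 for hydrogen, as noted in the text.
```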
6. Outlook
We have provided an overview of first-principles computational methods for calculating defects and impurities in semiconductors. The power of the approach was illustrated with examples for GaN, an important wide-band-gap semiconductor. However, the methodology is entirely general and can be applied to any material. First-principles calculations for defects and impurities in semiconductors are playing an increasingly important role in interpreting and guiding experiments. In fact, in a number of areas theory has led experiment. Examples include the prediction of the behavior of hydrogen and its interactions with dopant impurities, and the study of diffusion of point defects in GaN. New developments in methodology could make the approach even more powerful. The band-gap problem inherent in density-functional calculations limits the accuracy in some cases, and solving this problem is an important goal. Other developments are aimed at rendering the exploration of migration paths or of possible low-symmetry configurations less cumbersome. The latter capability would make it easier to study complexes between point defects and impurities. Another area of increasing interest is the interaction between point defects or impurities on the one hand, and extended defects, interfaces or surfaces on the other. Just as point defects in the bulk play an important role in diffusion, point defects at surfaces determine atomic mobilities at surfaces,
and hence play a decisive role in growth. Likewise, a full understanding of impurity incorporation requires comprehensive calculations of the behavior of impurities at and near the surface. Such first-principles calculations can then form the foundation for realistic simulations of the actual growth process.
References
[1] M. Elstner, D. Porezag, G. Jungnickel et al., "Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties," Phys. Rev. B, 58, 7260–7268, 1998.
[2] P. Hohenberg and W. Kohn, "Inhomogeneous electron gas," Phys. Rev., 136, B864–B871, 1964.
[3] W. Kohn and L.J. Sham, "Self-consistent equations including exchange and correlation effects," Phys. Rev., 140, A1133–A1138, 1965.
[4] C. Stampfl and C.G. Van de Walle, "Density-functional calculations for III-V nitrides using the local-density approximation and the generalized gradient approximation," Phys. Rev. B, 59, 5521–5535, 1999.
[5] C. Stampfl and C.G. Van de Walle, "Theoretical investigation of native defects, impurities, and complexes in aluminum nitride," Phys. Rev. B, 65, 155212-1–10, 2002.
[6] W.G. Aulbur, L. Jönsson, and J.W. Wilkins, "Quasiparticle calculations in solids," Solid State Physics, 54, 1–218, 2000.
[7] D.R. Hamann, M. Schlüter, and C. Chiang, "Norm-conserving pseudopotentials," Phys. Rev. Lett., 43, 1494–1497, 1979.
[8] H.J. Monkhorst and J.D. Pack, "Special points for Brillouin-zone integrations," Phys. Rev. B, 13, 5188–5192, 1976.
[9] R. Car and M. Parrinello, "Unified approach for molecular dynamics and density-functional theory," Phys. Rev. Lett., 55, 2471–2474, 1985.
[10] C.G. Van de Walle and J. Neugebauer, "Applied Physics Review: First-principles calculations for defects and impurities: applications to III-nitrides," J. Appl. Phys., 95, 3851–3879, 2004.
[11] C.G. Van de Walle, P.J.H. Denteneer, Y. Bar-Yam et al., "Theory of hydrogen diffusion and reactions in crystalline silicon," Phys. Rev. B, 39, 10791–10808, 1989.
[12] G. Makov and M.C. Payne, "Periodic boundary conditions in ab initio calculations," Phys. Rev. B, 51, 4014–4022, 1995.
[13] D.J. Chadi and K.J. Chang, "Theory of the atomic and electronic structure of DX centers in GaAs and AlxGa1−xAs alloys," Phys. Rev. Lett., 61, 873, 1988.
[14] J. Neugebauer and C.G. Van de Walle, "Theory of hydrogen in GaN," In: N.H. Nickel (ed.), R.K. Willardson and E.R. Weber (treatise eds.), Hydrogen in Semiconductors II, Semiconductors and Semimetals, vol. 61, Academic Press, Boston, pp. 479–502, 1999.
[15] H. Jónsson, G. Mills, and K.W. Jacobsen, "Nudged elastic band method for finding minimum energy paths of transitions," In: B.J. Berne, G. Ciccotti, and D.F. Coker (eds.), Classical and Quantum Dynamics in Condensed Phase Simulations, World Scientific, Singapore, Chapter 16, 1998.
[16] C.G. Van de Walle and P.E. Blöchl, "First-principles calculations of hyperfine parameters," Phys. Rev. B, 47, 4244–4255, 1993.
[17] S. Limpijumnong, J.E. Northrup, and C.G. Van de Walle, "Identification of hydrogen configurations in p-type GaN through first-principles calculations of vibrational frequencies," Phys. Rev. B, 68, 075206-1–14, 2003.
6.4 POINT DEFECTS IN SIMPLE IONIC SOLIDS
John Corish
Department of Chemistry, Trinity College, University of Dublin, Dublin 2, Ireland
1. Nature, Occurrence and Modelling of Point Defects in Simple Ionic Solids
Apart from man’s innate need to model and the satisfaction that a good model can bring, the real purpose of scientific modelling is to increase our understanding of a system. More importantly, it provides the basis from which to move forward to understand more complex systems and to design such new systems for specific applications by making predictions about their properties. Simple ionic solids, such as the alkali and silver halides, some fluorite-structured crystals and binary oxides, provide the most accessible and well-developed testing grounds for the study, both experimental and theoretical, of point defects in crystalline materials. This is because defects in these crystals typically carry a charge different from those on the ions that comprise the normal components of the matrix. Their presence, nature, interactions and movements can therefore be rather easily quantitatively determined through the measurement of readily observed macroscopic properties such as ionic conductivity. These charges can be present whether the defects are intrinsic or extrinsic. Intrinsic defects, such as Schottky or Frenkel defects, are equilibrium thermodynamic defects and exist in all materials because the balance between the enthalpy required for their formation in a perfect lattice and the resulting increase in the entropy of the system gives rise to a minimum in the Gibbs free energy. There is a corresponding equilibrium intrinsic defect concentration at each temperature. Extrinsic defects involve ionic impurities, often aliovalent, that are either adventitiously present in the lattice or that are introduced purposely as dopants to confer particular desired properties. They can enter the lattice substitutionally, in which case they displace a regular ion and take up its position on the lattice, or they may occupy an interstitial position that is not normally occupied in the pure lattice.
It is generally advantageous to consider the difference between the charge on a particular site and the charge that it would carry in the perfect crystal. This is termed the "virtual charge" and is the charge used in the Kröger–Vink notation system that is recommended to describe defects [1]. It is clear that the preservation of charge neutrality in a crystal requires that the overall sum of virtual charges be zero. This means that the total sum of the virtual charges on intrinsic defects must be zero and that the introduction of aliovalent ions always requires the simultaneous formation of charge-compensating defects. These constraints, coupled with the variety of extrinsic defects that can be realised, are again very useful in defining the range of permissible defect structures in simple ionic solids. The concentration of aliovalent extrinsic dopants that can enter simple ionic lattices varies from perhaps only a few hundred parts per million, or even less, for alkali halides to up to tens of per cent in fluorite-structured crystals. In the latter, each alternate cube is empty in the pristine lattice and is therefore available to accommodate charge-compensating fluoride ions when the cation is replaced by trivalent dopants such as the rare-earth cations. The most common classical ionic conductors like the alkali halides typically have much less than 1% intrinsic defect concentrations, even at very high temperatures approaching their melting points. However, other simple ionic solids contain defects such as vacancies, often in relatively large concentrations, as an integral part of the structure of a particular phase, e.g., α-Ag2S, in which only about two-thirds of the available cation sites are occupied. Such large concentrations of vacancies, which do not need to be formed thermodynamically, can render ionic transport between them very facile and so give rise to fast-ion conduction. The modelling of this type and of other fast-ion conducting materials will be discussed in Article 6.5 below.
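One consequence of this equilibrium is worth making numerical: for Schottky disorder the vacancy site fraction varies as exp(−hS/2kBT) (folding the entropy into an optional prefactor). The sketch below, with a formation enthalpy merely representative of an alkali halide, shows why intrinsic concentrations stay far below 1% even near the melting point.

```python
import numpy as np

k_B = 8.617e-5   # eV/K

def schottky_site_fraction(h_S, T, s_S_over_k=0.0):
    """Equilibrium site fraction of Schottky vacancies,
    x = exp(s_S / 2k) * exp(-h_S / (2 k_B T)); entropy term optional."""
    return np.exp(s_S_over_k / 2.0) * np.exp(-h_S / (2.0 * k_B * T))

# h_S ~ 2.4 eV is of the order of an alkali-halide value (illustrative):
for T in (600.0, 900.0, 1050.0):    # K; ~1050 K is near a melting point
    print(T, schottky_site_fraction(2.4, T))
# Even at the highest temperature the site fraction is only ~1e-6.
```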
2. Modelling Techniques
Because of their simplicity and usefulness as a test bed for novel advances in modelling, virtually all simulation techniques have been applied, usually at an early stage in their development, to simple ionic solids. These systems have also been extensively studied experimentally, so that reliable data are available with which the results of the simulations can be compared. A necessary prerequisite for the use of a defect modelling technique as a predictive tool is that it can first accurately calculate the macroscopic properties of the defect-free material. It should also reproduce the energies for the formation of the basic defects, for their interactions and for their migrations, all in good agreement with their experimentally determined values. Applications of these techniques, particularly in inorganic crystallography, have recently been reviewed [2].
2.1. Static Lattice Calculations
This is the technique that is most widely used in the systems of interest here. A foundation for the modelling of defects in ionic crystals was laid relatively early, when Mott and Littleton [3] showed that a defect could be modelled successfully and its energy calculated by dividing the crystal into two regions. This two-region strategy overcomes the essentially infinite nature of the problem. In the inner region, that directly surrounding the defect, the interactions between the species are calculated explicitly, while the species in the outer region are treated as a continuum using macroscopic response functions. This strategy may be written mathematically by expressing the energy of the defective crystal as:

E = E1(r) + E2(r, ξ) + E3(ξ)   (1)
Here E1(r) is the energy of the inner region, with r denoting the explicitly determined displacements, E2(r, ξ) is the interaction energy between the two regions, and E3(ξ) is the energy of the outer region, with ξ denoting its displacements. If the inner region is sufficiently large it may be assumed that the outer region consists of perfect lattice with harmonic displacements. The calculations done by Mott and Littleton themselves involved only very small numbers of ions in the inner region, but the method has since been incorporated, starting some 35 years later, into powerful simulation codes: HADES [4], CASCADE [5] and GULP [6]. In common with others, these programmes can generate complete crystal structures from the specified lattice vectors and the contents of a unit cell. They then accept the detailed atomistic specifications for both simple and complex defects and use the two-region strategy, with sufficiently large numbers of ions in the inner region to ensure convergence, to calculate the defect energies of the most relaxed defect configuration. Their output includes complete atomistic information on the detailed structure of the defect, with the coordinates of the positions of the surrounding ions, and the energy of the system. Although the earlier programmes were restricted both in terms of the choice of simple interionic two-body potentials that could be used and of the crystal symmetries they could handle, they were, in many applications, adequate for simple ionic solids. These static lattice programmes have, of course, been continuously developed, and current codes provide for very wide choices of interaction potentials to represent the force fields in the materials and include the complete range of crystal symmetries. The programmes are relatively undemanding in terms of computational power, and so the energies of large and complex defect structures can be calculated. In addition, quite sophisticated potentials, including the treatment of the polarisation of the ions that is essential for reliable results, can be employed. Defect formation energies are determined by comparison with the energies, also calculated by the programme, for the perfect lattices. If migration
pathways are assumed and calculations carried out for the saddle-point energy, the activation energy for a jumping ion can also be determined by subtracting the energy of the ground state from that calculated for the assumed saddle point. In such cases it is always advisable to search for the lowest-energy pathway by fixing the jumping ion at a variety of appropriate sites in the region of any assumed saddle points. The energies of association of complex defects are determined as the differences between the energy calculated for the complexes and the sum of the energies calculated for their isolated constituent parts. Static lattice calculations are essentially constant-volume calculations and so provide a value for uv(0). The thermodynamic relationships between the results obtained and the corresponding experimental quantities, hp(T), which are measured at constant pressure, have been set out by Catlow et al. [7]. The essential equations are:

gp = fv   (2)

hp = uv − T (∂V/∂T)p (∂fv/∂V)T   (3)

sp = sv − (∂V/∂T)p (∂fv/∂V)T   (4)
Later work [8, 9] has shown that the internal energy of a defect at zero temperature can be identified with its enthalpy at a finite temperature because two terms cancel each other, making hp(T) ≈ uv(0). This is the comparison that is made between experimental and calculated values in normal practice.
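To put a number on the two-region idea, the toy sketch below evaluates only the continuum contribution of the outer region, using the classical estimate E3 ≈ −q²(1 − 1/ε0)/(2R1) for a defect of effective charge q inside an inner region of radius R1. This closed form illustrates the strategy but is not the working of HADES, CASCADE or GULP, and the dielectric constant used is merely of the order of an alkali-halide value.

```python
def outer_region_energy(q, eps0, R1):
    """Continuum estimate of the outer-region polarisation energy E3, in eV;
    q in units of |e|, R1 in Angstrom; 14.4 = e^2/(4 pi eps0) in eV*Angstrom."""
    return -14.3996 * q**2 * (1.0 - 1.0 / eps0) / (2.0 * R1)

for R1 in (6.0, 8.0, 10.0, 12.0):
    print(f"R1 = {R1:4.1f} A  ->  E3 ~ {outer_region_energy(1.0, 5.9, R1):6.3f} eV")
# As R1 grows this continuum term shrinks and the explicitly relaxed inner
# region must pick up the difference; the total defect energy converges.
```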
2.2. The Supercell Method
This technique can also be used to calculate defect energies, by introducing the defect into the supercell to which periodic boundary conditions are applied. However, there are some inherent difficulties, because an artificial ordering is introduced for the defects and because the presence of charges on the defects causes the Coulombic summation to diverge. Summing the Coulombic energy requires a neutral cell containing multiple defects that maintain charge neutrality; these defects interact with each other and with their images in the periodic summation. To reduce these effects a number of corrections have been suggested that allow the simulation of single charged point defects, with subtraction of defect–defect interactions, to yield a simulation that is very close to infinite dilution.
The Ewald method used to sum the Coulombic terms assumes that the cell is charge neutral. However, a small modification [10], which assumes that there is a counter charge distributed evenly throughout the cell and enters as the g = 0 term of the reciprocal-space summation,

−πQ²/(2Vη)   (5)

made it possible to consider charged cells. Here Q is the total charge on the cell, V is the cell volume and η is the Gaussian half-width used in the Ewald sum. In addition, the Coulombic interaction energy from defect–defect interactions can be calculated [11] as:

Σ″i,j qiqj/(2εR rij)   (6)
where the ″ indicates that the summation does not include pairs of defects within the same unit cell, rij is the distance between defects i and j, and εR is the static dielectric constant matrix. Parker and co-workers extended these post-calculation corrections to all symmetries and, more importantly, corrected the energies and forces during the simulation, rather than applying them retrospectively. The forces due to defect–defect interactions were also calculated and subtracted, and were therefore excluded from affecting the final configuration and energy. The techniques were implemented in the free energy minimisation code PARAPOCS [12, 13] and used to calculate the free energy of defect formation and migration in MgO. Watson et al. [14] later used the method to investigate defect formation enthalpies as a function of pressure in MgSiO3 perovskite, including defect volumes. Such calculations are difficult using the Mott–Littleton approach.
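The two corrections can be written down in a few lines. The fragment below implements the functional forms of Eqs. (5) and (6) with invented cell parameters; it is illustrative only: a production code evaluates the full image summation convergently and, following Parker and co-workers, corrects forces as well as energies during the simulation.

```python
import numpy as np

KE = 14.3996  # e^2/(4 pi eps0) in eV*Angstrom

def g0_term(Q, V, eta):
    """Eq. (5): the g = 0 reciprocal-space term for a cell of net charge Q
    (in |e|), volume V (A^3) and Ewald Gaussian parameter eta."""
    return -np.pi * Q**2 * KE / (2.0 * V * eta)

def screened_pair_term(q_i, q_j, r_ij, eps_R):
    """One term of the defect-defect sum in Eq. (6): charges in |e|,
    separation in Angstrom, isotropic static dielectric constant."""
    return KE * q_i * q_j / (2.0 * eps_R * r_ij)

# Example: a +1 defect in a 12 A cubic cell with eta = 0.1, and its
# interaction with one periodic image of a -1 partner (eps_R = 10):
print(g0_term(Q=1.0, V=12.0**3, eta=0.1))          # ~ -0.13 eV
print(screened_pair_term(+1.0, -1.0, 12.0, 10.0))  # ~ -0.06 eV
```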
2.3. Molecular Dynamics
This technique is of rather limited use in the study of simple ionic solids, because it is not suited to the modelling of the hopping motion that is typical of ionic migration through these materials and which is too slow to yield results on a reasonable timescale. A related approach that has proved useful involves pushing ions from one site to another using forced molecular dynamics. A force is applied to an ion, pushing it toward a vacant site. Molecular dynamics is then applied so that the system can relax around the moving ion. In this way a trace of the diffusion pathway and the value of the activation energy can be obtained. This approach has been tested on bulk MgO and has also been applied to diffusion in grain boundaries of MgO [15], NiO [16] and Al2O3.
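A deliberately stripped-down illustration of the forced-MD idea is given below: a single ion in a 1-D periodic model potential is driven by a constant bias force, with mild velocity damping standing in for energy flow into the surrounding (here implicit) lattice. All units and parameter values are invented.

```python
import numpy as np

dt, m = 1.0e-3, 1.0
barrier = 0.5     # U(x) = (barrier/2) * (1 - cos 2 pi x)
F_bias = 2.0      # chosen to exceed the peak restoring force (~1.57 here)

def force_lattice(x):
    return -0.5 * barrier * 2.0 * np.pi * np.sin(2.0 * np.pi * x)

x, v, path = 0.0, 0.0, [0.0]
a = (force_lattice(x) + F_bias) / m
for step in range(50000):                      # damped velocity-Verlet
    x += v * dt + 0.5 * a * dt * dt
    a_new = (force_lattice(x) + F_bias) / m
    v = 0.999 * (v + 0.5 * (a + a_new) * dt)   # damping mimics relaxation
    a = a_new
    path.append(x)

# path traces the migration route; evaluating the *unbiased* potential
# along it gives an estimate of the activation energy for the jump.
print(f"ion advanced {path[-1]:.1f} lattice periods")
```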
2.4. Electronic Structure Calculations
As with all quantum mechanical calculations, those for defects in crystals are limited by the size of the system that can be handled. Periodic calculations result in defect–defect interactions and also exhibit problems with charge neutrality. These effects are more difficult to correct for than in force-field calculations because of the extra complication of the electronic rearrangement. However, some basic defects in simple ionic materials, such as MgO and Li2O, have been studied [17, 18].
2.5. Calculation of Defect Entropies
Simulation techniques to calculate the entropies of defect formation have also been developed using both the two-region approach and the supercell method. The calculations are a natural follow-on from the calculation of defect energies and require the detailed relaxed positions that are calculated for the ions in the defective region. They then treat the ions in the defective lattice within the harmonic approximation and the calculation effectively estimates the effect of the defect on the lattice phonon spectrum. In the two-region strategy the ions in the outer region are not allowed to move while the ions in the inner region immediately about the defect may vibrate. Provided that sufficiently large numbers of ions are used the two techniques give similar answers [9, 19].
3. Interionic Potentials
The static lattice codes now used to calculate defect properties are sufficiently well developed and flexible that the accuracy of the calculated energies, for both defect formation and defect processes, depends crucially on the quality of the interionic potentials used. This, in turn, depends on two factors. The first of these is the form of the potential and whether it can adequately represent the real forces that exist and act in the crystal. The second is the accuracy with which the parameters in that potential can be determined and their appropriateness to the calculation of defect properties. The potential most commonly used in simple ionic solids is the Buckingham potential, which has the form:

Φ(r) = A exp(−r/ρ) − C/r⁶   (7)
This comprises a repulsive term, with a pre-exponential parameter A and a hardness parameter ρ, and an attractive term with a parameter C. For a pure
binary crystal three such interactions, cation–cation, cation–anion and anion–anion, must be specified. The introduction of a dopant ion will necessitate at least two more analogous interactions, dopant–cation and dopant–anion, and a third to represent dopant–dopant interactions if the dopant ions are sufficiently close to each other. For accurate calculations it is also essential to take account in the simulation of the polarisability of the ions. This polarisability has been represented by a number of models, of which the shell model of Dick and Overhauser [20] is the most successful and widely used. In this model the ion is represented by a spring-coupled core and shell, each of which carries a charge: the sum of these charges is the overall charge on the ion. The cores and shells can move independently, thus giving rise to the polarisation of the ions. The use of the shell model in the simulation of a binary crystal introduces four additional parameters, a shell charge, Y, and a core–shell coupling constant, k, for each ion, the values of which must also be determined in addition to those of the other parameters in the potential. The polarisability of the free ion, α, is related to Y and k by the equation:

α = Y²/k   (8)
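For concreteness, Eqs. (7) and (8) are evaluated directly below. The Buckingham parameters are of the order of published oxide values but serve here only as placeholders, as do the shell-model numbers; the factor 14.4 = e²/4πε0 in eV Å converts α to Å³ when Y is in units of e and k in eV Å⁻².

```python
import numpy as np

def buckingham(r, A, rho, C):
    """Eq. (7): short-range pair energy in eV for r in Angstrom."""
    return A * np.exp(-r / rho) - C / r**6

A, rho, C = 22764.0, 0.149, 27.9       # eV, Angstrom, eV*Angstrom^6
print(buckingham(2.8, A, rho, C))      # short-range energy at 2.8 A

# Eq. (8): free-ion polarisability from the shell-model parameters.
Y, k = -2.0, 30.0                      # shell charge (|e|), spring (eV/A^2)
alpha = 14.3996 * Y**2 / k             # converted to Angstrom^3
print(f"alpha ~ {alpha:.2f} A^3")
```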
The parameters required for the interionic potentials in simple ionic solids are determined by two principal means. In the first of these they are evaluated through fitting of the potentials, using appropriate lattice dynamics and other equations, to available experimentally determined macroscopic properties of the crystal, such as the lattice separation, cohesive energy and elastic and dielectric constants. Potentials for which the parameters have been determined in this way are called empirical potentials. Despite successful applications of such potentials for specific crystals, there are significant disadvantages to their more general use. Even for the simplest and most thoroughly experimentally investigated crystals, the parameters evaluated in this way are underdetermined. Furthermore, it is not always possible to discriminate between the extents of the contributions of each of the various ion-pair interactions to the overall measured properties. In addition, the values obtained for the parameters depend on properties measured when all the ions in the crystal are occupying equilibrium lattice sites, and so they are unlikely to be completely appropriate to defect structures in which the constituents are displaced from such positions. This effect is further aggravated in calculations of the energies of defect processes, during which ions may be expected to be substantially displaced from their equilibrium separations. The second method is to carry out electronic structure calculations at some suitable level of approximation to yield the energies of interaction for each ion pair in question as a function of their distance of separation. One such calculation used for simple ionic solids is the electron gas approximation [21], in which the crystal field effects corresponding to the environment in which the ion is placed in the crystal are sometimes included. The resulting
energy-separation relationship can then be fitted to one of the analytical forms of potential available in the defect energy code. Alternatively, the numerical values can be used directly by making use of splining techniques to provide values of the potential and its derivatives at separations intermediate between those at which its value was calculated. It is clear that neither of these methods is entirely satisfactory and that, in almost every case, some degree of model building is required to complete the potential. The parameters that describe the interaction between a particular pair of ions determined in one crystal may be transferred for use in another crystal in which the same two ions interact. However, care must be exercised to ensure that all the pair potentials assembled for use in a calculation were originally determined in the same way, e.g., using an electronic structure calculation at the same level of approximation. Libraries of such interactions have been built up and can provide the parameters for the interactions when a new material is to be studied; for example, the GULP programme includes a library of such potentials. The two-body potentials discussed here allow only central forces to be treated in the calculations. Whereas these are adequate for fully ionic or almost fully ionic materials, they fail to properly represent the forces in crystals with deformable and highly polarisable ions, in which non-central forces can play a significant role. Among the simple ionic solids the silver halides, both the chloride and the bromide, provide examples of materials that cannot be adequately represented by two-body potentials. The current defect codes include bond-bending terms, usually of the form:

Φijk = ½ kijk(ϑ − ϑ0)²   (9)

so that energy is required to depart from the equilibrium bond angle, ϑ0. Such terms introduce directionality into the bonding and may be appropriate for use when modelling semi-covalent and covalent systems or molecular ions; kijk is the bond-bending force constant between the bonds ik and ij. The codes also contain torsional terms, also called four-body potentials, to model systems that have a planar structure due, for example, to π-bonding. Bond-stretching terms to model bonded interactions, such as the hydroxide ion in ionic systems, often written in the well-known Morse potential form, are also included. Further refinements include the extension of the shell model to breathing-shell and deformable-shell models, which can improve the representation of the polarisability of some ions. Because molecular dynamics codes are substantially more demanding of computational power, it is usually necessary to carry out simulations using potentials that are simpler in form. However, it has proved possible to use full shell model potentials to calculate the diffusion energy barriers [22]. As has been mentioned above, molecular dynamics techniques are more suited to
systems in which the energy barriers are small and in which there is a high concentration of defects, i.e., fast-ion conductors (see Article 6.5).
4. Discussion
Where adequate interionic potentials have been determined, the energies calculated using static lattice techniques for the formation of point defects, for their interactions and for their migration in simple ionic crystals are generally in very good agreement with the analogous experimentally determined values. This is well demonstrated even in the early extensive compilations of calculated and experimental defect energies made by Corish et al. [27, 28]. Indeed, in the case of the most-studied of the alkali halides, in which Schottky defects predominate, such calculations were sufficiently accurate and reliable to show that the experimental values for the activation energies for anion vacancy migration, which were reported as being substantially larger than those for the corresponding cation vacancies, were not correct but rather resulted from an artefact of the non-linear fitting procedures used to determine the defect parameters from the conductivity data. In the identically structured silver halides, in which the dominant defect is the cationic Frenkel defect, the use of the quasi-harmonic approximation (Eqs. (2)–(4) above) provided the answer to what had been a long-standing problem: the unusually rapid increase in the conductivities of these materials at higher temperatures, as they approach their melting points, was explained when the Frenkel defect formation energy was shown to be temperature dependent [23]. More recently, DFT methods have been used to examine the defect formation energies as well as the very facile motion of the polarisable and readily deformable silver ion through these crystals [24]. In simple ionic crystals simulation techniques have been particularly useful in enabling us to understand the interactions between defects, their aggregation into larger, more complex defect clusters and the eventual formation of small domains with a different structure. In many cases these processes or the entities formed are not open to easy experimental identification or investigation. A particularly striking example is the work of Tomlinson et al. [25] on the rock-salt structured transition metal oxides. They calculated the free energies, and hence the equilibrium constants, for the aggregation of a range of complex defects that are formed when the valences of the cations increase in the non-stoichiometric materials. These equilibrium constants were then used in a mass action analysis to estimate values for the oxygen partial pressure as a function of the deviation from stoichiometry, and the results were compared with experimental determinations to identify the nature of the defect clustering occurring in the material. In the rare-earth doped fluorites, simulation studies were used to correlate the structures of the complex defects formed between
several dopant ions and charge-compensating interstitial fluoride ions and the observed EXAFS spectra from the dopants [26].
5. Summary
The study of point defects in simple ionic crystals has been a very productive test-bed for both experimental and theoretical investigations of defect chemistry and physics in solid materials. In terms of modelling, the development of the techniques and their embodiment into codes of ever-increasing flexibility, reliability and scope have depended on the availability of accurately known experimental data. The close interplay between calculation and experiment that has been achieved in these systems has been crucial to progress in both areas. It has been pursued vigorously and has now brought us to a very detailed understanding of the nature of defect formation and interaction, and of the dynamics of defect processes. The codes that have been developed and tested, as well as the techniques that have emerged to identify realistic and appropriate interionic potentials and to determine the values of the parameters for specific interactions, have been invaluable in enabling progress towards the modelling of more technologically important materials in which the force-fields that operate are more complex in nature. Current developments in quantum mechanical/molecular modelling techniques promise Mott–Littleton type quantum calculations for defects. Success in this endeavour would solve many of the difficulties that now surround the quantum mechanical investigation of defects and offers the prospect of calculating even more accurate defect energies.
References
[1] F.A. Kröger and H.J. Vink, "Relations between the concentrations of imperfections in crystalline solids," Solid State Phys., 3, 307, 1956.
[2] C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, 1997.
[3] N.F. Mott and M.J. Littleton, "Conduction in polar crystals. I. Electrolytic conduction in solid salts," Trans. Faraday Soc., 34, 485, 1938.
[4] M.J. Norgett, Harwell Report AERE-R 7650, AEA Technology, Harwell, Didcot, OX11 0RA, U.K., 1974.
[5] M. Leslie, SERC Daresbury Laboratory Report DL-SCI-TM31T, Daresbury Laboratory, Warrington, WA4 4AD, U.K., 1982.
[6] J.D. Gale, General Utility Lattice Programme, Imperial College London, U.K., 1993.
[7] C.R.A. Catlow, J. Corish, P.W.M. Jacobs et al., "The thermodynamics of characteristic defect parameters," J. Phys. C, 14, L121, 1981.
[8] M.J. Gillan, "The volume of formation of defects in ionic crystals," Phil. Mag. A, 43, 301, 1981.
[9] J.H. Harding, "Calculation of the free energy of defects in calcium fluoride," Phys. Rev. B, 32, 6861, 1985.
[10] M. Leslie and M.J. Gillan, "The energy and elastic dipole tensor of defects in ionic crystals calculated by the supercell method," J. Phys. C: Solid State Phys., 18, 973, 1985.
[11] N.L. Allan, W.C. Mackrodt, and M. Leslie, "Calculated point defect entropies in MgO," Advances in Ceramics, 23, 257, 1989.
[12] S.C. Parker and G.D. Price, "Computer modelling of phase transitions in minerals," Adv. Solid State Chem., 1, 295, 1989.
[13] G.W. Watson, T. Tschaufeser, R.A. Jackson et al., "Modelling the crystal structures of inorganic solids using lattice energy and free-energy minimisation," In: C.R.A. Catlow (ed.), Computer Modelling in Inorganic Crystallography, Academic Press, London, 1997.
[14] G.W. Watson, A. Wall, and S.C. Parker, "Atomistic simulation of the effect of temperature and pressure on point defect formation in MgSiO3 perovskite and the stability of CaSiO3 perovskite," J. Phys.: Condens. Matter, 12, 8427, 2000.
[15] D.J. Harris, G.W. Watson, and S.C. Parker, "Vacancy migration at the {410}/[001] symmetric tilt grain boundary of MgO: an atomistic simulation study," Phys. Rev. B, 56, 11477, 1997.
[16] D.J. Harris, J.H. Harding, and G.W. Watson, "Computer simulation of the reactive element effect in NiO grain boundaries," Acta Mater., 48, 3309, 2000.
[17] A. De Vita, M.J. Gillan, J.S. Lin et al., "Defect energies in MgO treated by first-principles methods," Phys. Rev. B, 46, 12964, 1992.
[18] A. De Vita, I. Manassidis, J.S. Lin et al., "The energetics of Frenkel defects in Li2O from first principles," Europhys. Lett., 19, 605, 1992.
[19] M.J. Gillan and P.W.M. Jacobs, "Entropy of a point defect in an ionic crystal," Phys. Rev. B, 28, 759, 1983.
[20] B.G. Dick and A.W. Overhauser, "Theory of the dielectric constants of alkali halide crystals," Phys. Rev., 112, 90, 1958.
[21] Y.S. Kim and R.G. Gordon, "Theory of binding of inorganic crystals: application to alkali-halides and alkaline-earth-dihalide crystals," Phys. Rev. B, 9, 3548, 1974.
[22] P.J.D. Lindan and M.J. Gillan, "Shell-model molecular dynamics simulation of superionic conduction in CaF2," J. Phys.: Condens. Matter, 5, 1019, 1993.
[23] J. Corish, "Calculated and experimental defect parameters for silver halides," J. Chem. Soc., Faraday Trans., 85, 437, 1989.
[24] D.J. Wilson, S.A. French, and C.R.A. Catlow, "Computational studies of intrinsic defects in silver chloride," Radiat. Eff. Defects Solids, 157, 857, 2002.
[25] S. Tomlinson, C.R.A. Catlow, and J.H. Harding, "Computer modelling of the defect structure of non-stoichiometric binary transition metal oxides," J. Phys. Chem. Solids, 51, 477, 1990.
[26] C.R.A. Catlow, A.V. Chadwick, J. Corish et al., "Defect structure of doped CaF2 at high temperatures," Phys. Rev. B, 39, 1897, 1989.
[27] J. Corish and P.W.M. Jacobs, "Surface and defect properties of solids," In: M.W. Roberts and J.M. Thomas (eds.), Specialist Periodical Reports, vol. 2, The Chemical Society, London, p. 160, 1973.
[28] J. Corish, P.W.M. Jacobs, and S. Radhakrishna, "Surface and defect properties of solids," In: M.W. Roberts and J.M. Thomas (eds.), Specialist Periodical Reports, vol. 6, The Chemical Society, London, p. 219, 1977.
6.5 FAST ION CONDUCTORS
Alan V. Chadwick
Functional Materials Group, School of Physical Sciences, University of Kent, Canterbury, Kent CT2 7NR, UK
1. Introduction
Fast ion conductors, sometimes referred to as superionic conductors or solid electrolytes, are solids with ionic conductivities that are comparable to those found in molten salts and aqueous solutions of strong electrolytes, i.e., 10−2–10 S cm−1. Such materials have been known for a very long time, and some typical examples of the conductivity are shown in Fig. 1, along with sodium chloride as the archetypal normal ionic solid. Faraday [1] first noted the high conductivity of solid lead fluoride (PbF2) and silver sulphide (Ag2S) in the 1830s, and silver iodide was known to German physicists early in the 1900s to be an unusually good ionic conductor. However, the materials were regarded as anomalous until the mid 1960s, when they became the focus of intense interest to academics and technologists, and they have remained at the forefront of materials research [2–4]. The academic aim is to understand the fundamental origin of fast ion behaviour and the technological goal is to utilize the properties in applications, particularly in energy applications such as the electrolyte membranes in solid-state batteries and fuel cells, and in electrochemical sensors. The last four decades have seen an expansion of the types of material that exhibit fast ion behaviour, which now extend beyond simple binary ionic crystals to complex solids and even polymeric materials. Over this same period computer simulation of solids has also developed (in fact these methods and the interest in fast ion conductors were almost coincidental in their time of origin) and the techniques have played a key rôle in this area of research. The electrical conduction in fast ion conductors occurs via point defects and the modelling of these defects is covered in Article 6.4 in this Volume by
Figure 1. The experimental conductivity plots (log σ/S cm−1 versus 1000 K/T) for several fast ion conductors – RbAg4I5, AgI, β-PbF2, δ-Bi2O3, Ce1.8Gd0.2O1.9, Zr1.8Y0.2O1.9 and NaSCN·P(EO)8 – together with NaCl.
Corish. In an ionic solid the conductivity, σ, can be expressed as:

σ = Σr=1…i nr |qr| μr   (1)
where nr is the number of defects of type r per unit volume, qr the effective charge and μr the mobility of defects of type r. Equation (1) provides a simple, phenomenological interpretation of fast ion conduction. The unusually high σ could arise from an unusually large nr and/or μr. Most theoretical models have focused on a high concentration of defects as the cause of fast ion behaviour. In fact an early attempt at a general explanation of the phenomenon was the molten sub-lattice model, which assumed that a high concentration of defects on one of the sub-lattices led to its melting, with concomitant liquid-like motion of the ions [5]. Although the focus still remains on an unusually high defect concentration, this general model has been rejected and massive disorder is not a requirement: normal ionic conductors, like NaCl, typically have a site fraction of defects of ∼10−3 and σ ∼10−3 S cm−1 at the melting point, and therefore defect site fractions of ∼1% would lead to anomalous behaviour.
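A back-of-the-envelope illustration of Eq. (1) is given below, with the mobility taken from the standard Nernst–Einstein relation μr = qrDr/kBT; all numbers are order-of-magnitude guesses, not fitted to any material in Fig. 1.

```python
k_B = 1.381e-23   # J/K
e = 1.602e-19     # C

def sigma(n, q, D, T):
    """Eq. (1) for one defect type, with mu = q*D/(k_B*T) (Nernst-Einstein).
    n in m^-3, q in C, D in m^2/s; returns S/m."""
    return n * abs(q) * (abs(q) * D / (k_B * T))

N_sites = 2.2e28   # m^-3, order of an anion-site density in a halide
T = 1000.0         # K
# Normal conductor near melting: site fraction ~1e-3, defect D ~ 1e-9 m^2/s
print(sigma(1e-3 * N_sites, e, 1e-9, T) * 1e-2, "S/cm")   # ~4e-4 S/cm
# Fast ion conductor: site fraction ~3e-2, defect D ~ 1e-8 m^2/s
print(sigma(3e-2 * N_sites, e, 1e-8, T) * 1e-2, "S/cm")   # ~0.1 S/cm
```

The hundred-fold jump between the two estimates comes almost entirely from the defect population and its diffusivity, which is the point of the discussion above.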
Given the wide range of materials that exhibit fast ion conduction, attention has turned to the development of models that are system specific. Detailed descriptions of the computer simulation methods are presented elsewhere in this Volume. Hence only a brief summary of the techniques is necessary here, and this will be presented in the next section. This will be followed by the results that have been obtained for fast ion conductors. Given the large number of systems that are now known to exhibit the phenomenon, the discussion will be restricted to the major material types. The technical details of the simulations, unless vital to the discussion, will not be presented; the focus will be on the major findings, the rôle the simulations have played in understanding the experimental data and the insights they can provide. The final section will present a summary and a look at how simulations could develop in this field.
2. Modelling Techniques
As mentioned earlier, the studies of fast ion conductors and computer modelling share a common and interlinked history, and most of the modelling techniques have been applied to these materials.
2.1. Static Lattice Calculations
This technique, described by Corish [6] in this volume, has been widely used in the study of fast ion conductors. The two-region strategy of the Mott–Littleton procedure has been used as implemented in the codes HADES [7], CASCADE [8] and GULP [9]. A variety of interatomic potentials have been employed, ranging from simple point charge models to shell-model and ab initio potentials. In more recent work on complex systems, DFT methods have been used to derive potentials using codes such as CASTEP [10]. In a similar manner to the study of normal ionic solids, the codes have been used to calculate the energies of defect formation and migration, the energies of solution of impurities and the energies of defect clusters. In a few cases the calculation of the entropy has been included, so that free energies and hence absolute values of defect concentrations and diffusion coefficients could be obtained.
2.2. Molecular Dynamics Simulations
Molecular dynamics (MD) simulations have been extensively used in the study of liquids [11, 12] and have proved the most informative of the methods applied to fast ion conductors. A variety of codes have been employed, the
most recent being the DL_POLY code from the Daresbury Laboratory [13]. The rapid transit of the ions in these materials, as in liquids, happens on such a timescale that a significant number of displacements occur in the period of a typical calculation. Although heavily demanding of computing resources, the number of ions in the simulation box, the length of the calculation and the complexity of the interatomic potential have all increased with the increase of computer power, to the extent that modern calculations are directly comparable to experiment. The obvious advantage of this technique is that it produces parameters that can be correlated with the measured experimental parameters without the need to invoke a theoretical model. For example, the radial distribution functions (RDFs) can be compared with diffraction or EXAFS data, and the mean square displacements can be compared with diffusion coefficients from tracer or NMR data. However, there is also additional mechanistic information, in that the MD produces a detailed picture of the ion motion at the atomic scale. Hence it is possible to decide whether the motion of the mobile ion is liquid-like (i.e., the ion is in continuous motion) or solid-like (i.e., the ion jumps from site to site with the transit time much less than the site residence time), and, in the latter case, to determine the nature of the point defects involved and the degree of correlation between successive diffusive jumps.
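A minimal sketch of the mean-square-displacement analysis mentioned above is given below; the trajectory is a synthetic random walk standing in for unwrapped MD coordinates of the mobile ions, and all parameter values are illustrative.

```python
import numpy as np

# 'positions' stands for unwrapped coordinates of the mobile ions,
# shape (n_frames, n_ions, 3), in Angstrom; frame spacing in ps.
rng = np.random.default_rng(1)
n_frames, n_ions, dt_frame = 2000, 64, 0.1
steps = rng.normal(scale=0.05, size=(n_frames, n_ions, 3))
positions = np.cumsum(steps, axis=0)            # random-walk stand-in

disp = positions - positions[0]                 # displacement from frame 0
msd = np.mean(np.sum(disp**2, axis=2), axis=1)  # average over mobile ions

t = np.arange(n_frames) * dt_frame
# Einstein relation in 3-D: MSD = 6 D t at long times; fit the linear tail.
slope = np.polyfit(t[n_frames // 2:], msd[n_frames // 2:], 1)[0]
D = slope / 6.0                                 # A^2 / ps
print(f"D ~ {D:.3e} A^2/ps")
```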
2.3. Monte Carlo Simulations
Monte Carlo (MC) modelling methods are well suited to the investigation of highly disordered systems and have been used in the study of some classes of fast ion conductor where there is a range of defect sites and possible jump mechanisms for an ion. A typical use is to couple the calculation of jump activation energies for the different defect environments (for example, obtained from static lattice simulations) with an MC simulation. In a simulation box a defect and its jump direction are selected at random and, using the knowledge of the environment, the Boltzmann factor is obtained. A standard Metropolis sampling procedure is used to decide if a jump succeeds or fails. After several thousand moves, in which the configuration is continually updated, an average diffusion coefficient and corresponding activation energy can be obtained.
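The scheme just described condenses into a few lines. The sketch below uses a 1-D lattice for brevity, with invented barrier values standing in for static-lattice results.

```python
import numpy as np

rng = np.random.default_rng(0)
kT = 0.0862                          # eV, ~1000 K
barriers = {"free": 0.5, "near_dopant": 0.8}   # illustrative values, eV

def environment(site):
    return "near_dopant" if site % 5 == 0 else "free"  # toy dopant pattern

site, visited = 0, [0]
for move in range(200_000):
    step = rng.choice([-1, 1])                 # random jump direction
    E_a = barriers[environment(site)]
    if rng.random() < np.exp(-E_a / kT):       # Metropolis acceptance
        site += step
    visited.append(site)

traj = np.array(visited)
print(np.mean((traj - traj[0]) ** 2))          # crude mean square displacement
# With a jump distance a and attempt frequency nu, D ~ MSD * a^2 * nu / (2 N)
# in 1-D; repeating at several temperatures yields an effective activation
# energy, as described in the text.
```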
3. Discussion
Fast ion conductors have been classified in a variety of ways, all of which have their advantages. For example, they have been classified on the nature of the mobile ion (i.e., H+, Li+, Na+, O2−, etc.), or on the type of lattice structure (i.e., fluorite type, layered, tunnel, glasses, polymers, etc.), or in terms of their applications (i.e., batteries, fuel cells, sensors, etc.). The classification used
here is that developed by Catlow [14], as it is the most useful in terms of the mechanisms of the ion transport [15]. For each of these classes the rôle computer simulations have played in understanding the structures and processes will be outlined.
3.1. Solids with a Phase Transition
Many solids exhibit fast ion conduction after a phase transition, the low temperature behaviour being interpretable in terms of classical defect theories [16, 17]. Examples include AgI and crystals with the fluorite structure. In AgI there is a first-order transition at 146 °C from the low-temperature β-form to the high-temperature cubic α-form, with an increase in σ of two orders of magnitude, as shown in Fig. 1. The α-form has a bcc arrangement of the I− ions with the two Ag+ ions per unit cell distributed over 42 possible lattice sites, which are occupied with the preference order 12d (tetrahedral) > 24h (trigonal) > 6b (octahedral) in the Im-3m space group. Conduction takes place by Ag+ ions moving between tetrahedral and trigonal sites and has been studied in MD simulations. The most extensively studied fast ion conductors are those with the fluorite structure [18–21], and a general feature of the structure is a broad thermal anomaly, often referred to as the Bredig transition, similar to a diffuse λ-type order–disorder transition at a temperature Tc which is approximately 0.8 of the melting temperature. The changes in ionic conductivity are quite subtle, as indicated by the data for β-PbF2 in Fig. 1. At low temperatures the conductivity is normal and can be interpreted with classical point defect models [22]; however, in the same temperature regime as the thermal anomaly the conductivity plot shows an increasing upward curvature, which then plateaus around 1 S cm−1 in the fast ion region. Similar behaviour, though not extensively investigated, is found in the rare-earth fluorides with the tysonite structure. There is a wealth of experimental data for the fluorite-structured materials, particularly the fluorides (CaF2, SrF2, BaF2), as they are among the easiest systems to study (e.g., large single crystals are readily available, they are thermally stable, etc.). When this is coupled with the simplicity of the structure, it is not surprising that these systems have been the subject of more computer modelling studies than any other system. Very early static lattice simulations using shell model potentials showed that the defects in the low temperature phase were Frenkel defects, anion vacancies and interstitials. The calculated formation and migration energies of these defects were in good agreement with experiment. Thus reliable potentials were available at the outset. More difficult was the successful modelling of the transition to the superionic phase, which is due to a rapid generation of Frenkel defects around Tc. This is believed to be due to a cooperative effect in which the energy of
formation of the Frenkel defects lowers as the concentration of these defects increases, resulting in a rapid rise in their concentration as the temperature increases. An early explanation was that this was due to the formation of defect clusters, which are well documented in doped systems (as will be discussed in Section 3.2), but here involving simply vacancies and interstitial anions. There was evidence from neutron scattering experiments for the existence of these clusters in the superionic phase, and static lattice simulations showed that these clusters had significant binding energies [23]. However, it was the MD studies of the fluorides, notably by Gillan and co-workers [20], that led to a better understanding of these systems. These simulations used shell model potentials to treat the polarizability of the ions and were able to reproduce the experimental data, notably the specific heat peak and the conductivity. The calculations also suggested an alternative explanation of the neutron data: the features observed were not necessarily due to defect clusters but arose from the dynamic nature of the system, i.e., a snapshot showed many anions off their normal sites. In addition, the MD work showed that the motion of the anions was a solid-state hopping process, the residence time on a lattice site being considerably longer than the transit time between sites, and that the vacancy concentration was only of the order of a few per cent, consistent with conductivity data [22]. Work continues on these systems, with more experimental data and MD simulations using ab initio potentials to treat the ion polarizability more effectively [21, 24].
3.2. Massively Disordered and Heavily Doped Solids
In a few materials there is massive disorder due to an exceptionally high defect concentration. An excellent example is bismuth oxide, Bi2O3, which at high temperature undergoes a phase transition to a fluorite-structured δ-phase that has the highest known O2− ion conductivity. This is hardly surprising, as the oxygen sublattice contains 25% vacancies, and it is clearly different from the fluorite-structured materials described in the preceding section. The silver chalcogenides (e.g., Ag2S, Ag2Se and Ag2Te) are similar to AgI in that there are more available cation sites than Ag+ ions, and in these systems the motion of the cations appears to be truly liquid-like. The addition of aliovalent impurities to an ionic crystal is the traditional method of increasing the defect concentration [6, 16, 17]. However, in simple ionic solids the changes are limited by the extremely low solubilities of the impurities, typically less than 1 mole per cent. Exceptions to this general rule are open structures, particularly the fluorite structure, which is capable of dissolving up to 50 mole per cent of cation impurities. These heavily doped systems include the technologically important fluorite-structured oxides zirconia (ZrO2) and ceria (CeO2). In fact, pure ZrO2 has a monoclinic
structure and is stabilised in the cubic, fluorite phase by doping with Y3+ or Ca2+ (hence the name cubic-stabilised zirconia, CSZ) and the creation of anion vacancies. Another group of compounds that can usefully be included in this class is the mixed fluorides, such as PbSnF4 and RbBiF4, where the mixing of cations seems to generate defects and leads to fast fluoride conductivity at temperatures around 100 °C [25]. A considerable research effort has focused on the fluorite-structured oxides due to their technological importance, particularly for solid oxide fuel cells (SOFC). Doping with lower-valence cations creates charge-compensating anion vacancies, and these vacancies are responsible for the high ionic conductivity. The doping can be expressed by the reaction (Kröger–Vink notation):

2T3+ (dissolving in MO2) → 2TM′ + VO••   (2)
where T is a trivalent cation (e.g., a rare earth or yttrium) and MO2 is the oxide, ceria or zirconia. However, it was found that the conductivity did not increase monotonically with dopant concentration but peaked around 10 mole per cent, and that the conductivity depended on the nature of the dopant. The origin of the effect was clear: as the concentration of the dopant increased, impurity–anion vacancy clusters (TM′–VO••) increased in concentration, size and complexity, "trapping" the vacancies and reducing their mobility. The binding in these clusters has been treated very successfully by static lattice calculations, where the key factor is found to be the elastic strain due to a mismatch of ion sizes. Therefore the best dopants, in terms of minimising the reduction in conductivity at high concentrations, are those of the same size as the host cation. A detailed simulation of Y3+-doped ceria has been made in which the various clusters were modelled in a static lattice calculation and the transport of the vacancies by an MC simulation [26]. The qualitative agreement with the experimental data was excellent, showing the maximum in conductivity as a function of dopant concentration and the change in activation energy, as shown in Fig. 2. In addition, the simulation also showed that at high concentration the vacancies were still making jumps, but mainly from site to site within the clusters.
3.3. Layered and Tunnel Structure Solids
In this group of materials the ions move along rapid diffusion pathways that are defined by the crystal structure. The β- and β″-aluminas are the archetypal 2-D conducting materials of this class and were originally explored as membranes for the sodium–sulfur battery [2]. These compounds are nonstoichiometric sodium aluminates with the mobile Na+ ions located in the planes between spinel blocks of alumina that are bridged by oxygen.
Figure 2. Calculated and experimental conductivity studies of CeO2 doped with Y2O3 [26]: (a) conductivity plots and (b) Arrhenius energies, both shown as a function of the fraction of vacant anion sites, at 455 K and 833 K.
The unit cell of the conduction plane is shown in Fig. 3. In this conduction plane the Na+ ions in the stoichiometric material occupy alternate sites on a hexagonal network, the so-called Beevers–Ross (BR) sites. However, most experimental samples contain between 15 and 30% excess Na+, and the nominal nonstoichiometric composition is (Na2O)1+x·Al22O33 (where 0.15 < x < 0.3), with the additional Na+ and the charge-compensating additional oxygen ions located in the conduction plane. Both the structure and Na+ migration have been the subject of computer simulation, as summarised in the recent paper by Beckers et al. [27]. Early static lattice simulations showed that the stable position for excess Na+ ions is the anti-Beevers–Ross (aBR) site. However, recent diffraction studies have shown that the preferred site is displaced from this position, between the mid-oxygen (mO) and aBR sites: the so-called A site. Cation migration proceeds by jumps between lattice sites, and MD simulations have been particularly illuminating on the sequence of diffusive steps. The calculations show the basic process is hopping between BR and aBR sites,
Figure 3. The unit cell in the conduction plane of β-alumina. Lattice sites are labelled BR (Beevers–Ross), aBR (anti-Beevers–Ross), mO (mid-oxygen) and O (bridging oxygen).
In stoichiometric material all BR sites are occupied and all aBR sites are vacant; thus Na+ migration is via intermediate aBR defect sites, diffusion is slow and the conductivity is low. In the non-stoichiometric material there is a strong correlation between jumps, particularly at low temperatures; an aBR to BR jump is immediately followed by a BR to aBR jump and so on, creating trains of mobile ions. The calculations also show that the role of the excess oxygen in interstitial sites is to stabilise the A site, and that their presence ‘blocks’ the motion of Na+ ions. There are other layered materials in which the conduction is 2D but on the whole they have not been thoroughly explored. A rare example of a simple binary 2D conductor is lithium nitride, Li3N, which is the result of an unusual crystal structure. An example of a 3D ionic conductor in which the ions move through channels is the Na+ ion conductor Na3Zr2PSi2O12, which is now generally referred to as NASICON (Na superionic conductor). Like the β-aluminas this is a ceramic material, but it has a higher Na+ conductivity at 300 °C.
3.4. Amorphous Solids
There are recognised applications for vitreous electrolytes, and a number of glass systems do exhibit high ionic conductivities [28–30]. Glasses with conductivities of the order of 10−2 S cm−1 at room temperature are known and have been referred to as “superionic conducting glasses (SIG)” or “vitreous electrolytes”. The highly mobile ion is usually Li+, Ag+ or F−, and a variety of network-forming materials have been identified as producing useful electrolytes, e.g., silicates, borates, phosphates and sulfides. To date, there have been relatively few computer simulation investigations of these systems; however, it appears that the fast ion migration occurs along preferential pathways in the glassy matrix.
3.5. Solid Polymer Electrolytes
A number of polymers will dissolve ionic salts and yield dry films that have reasonable magnitudes of ionic conductivity. The most useful and widely investigated are based on high molecular weight polyethylene oxide (PEO; [–CH2–CH2–O–]n), where the ether oxygen atoms are co-ordinated to the cation of the salt and effect solvation in the same manner as crown ethers [31]. The high conductivity is found above the melting point of PEO (∼65 °C), when the material is amorphous and in an elastomer phase. Although the specific conductivity of these materials is only ∼10−3 S cm−1 (two orders of magnitude lower than the best fast ion conductors), they have the mechanical advantages
of being flexible and strong, compared to brittle ceramics, and are easily processed as very thin films. These materials have attracted major interest as the membranes for lithium ion batteries. The PEO-based electrolytes have been extremely difficult systems to model, due to both the dynamic nature of the system and the lack of long-range order. The similarity between the activation energies for ionic conduction and for the reptation motion of the polymer backbone had led to the assumption that the two processes were inter-related [31]. However, there was a continuing debate concerning the nature of the mobile species; the relatively low dielectric constant led to the speculation that there was considerable clustering of the ions and that transport could involve ion dimers, trimers, etc. A recent computer simulation study of PEO–lithium triflate (Li+CF3SO3−) has been particularly revealing [32]. Firstly, cubic amorphous cells were constructed using the amorphous cell module and force fields in the Insight code (Molecular Simulations Inc.). The salt was then introduced and the ion–polymer interactions were represented by potentials derived from high-level ab initio calculations, which were validated by reference to crystallographic parameters. The system was equilibrated and the configurations used as input to the DL_POLY MD code, with the NVT ensemble, periodic boundary conditions and the Ewald summation. There was reasonable quantitative agreement of the calculated diffusion coefficients with those obtained from NMR measurements, as shown in Fig. 4, and there was unique qualitative information on the system. RDFs were evaluated for the various pairs and there were strong Li+–Li+ and Li+–triflate correlations at long distances, whereas ion–polymer correlations were weaker.
Figure 4. Comparison of the calculated and experimental (NMR) diffusion coefficients of Li+ and CF3SO3− ions in 8:1 PEO–lithium triflate polymer electrolyte [32] (log D/m2 s−1 against 1000 K/T).
The first shell of Li+ consisted of 2.5 triflate oxygens and 1.5 ether oxygens, indicating that the ion is not located at a specific crystallographic-type site in the polymer structure. Molecular graphics showed that, rather than isolated clusters of ions, there was an extended network of ions which cross-linked the polymer chains and through which the diffusion occurred. There was little evidence of ion hopping, and the motion of ions in the network was faster than that of ions close to the polymer. Local motion of the polymer chain was more rapid than the ion transport, but there was negligible diffusion of the chain, suggesting that polymer reptation is not important in the diffusion of the ions.
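The diffusion coefficients compared with NMR in Fig. 4 follow from the long-time slope of the mean-square displacement (the Einstein relation, D = ⟨|r(t) − r(0)|²⟩/6t as t → ∞). A minimal sketch of that analysis is given below; it assumes an array of unwrapped ion positions is already in hand, and the array name and the synthetic random-walk input are placeholders, not DL_POLY output.

    import numpy as np

    def diffusion_coefficient(traj, dt, fit_start=0.5):
        """Einstein-relation estimate of D from unwrapped positions.

        traj : array of shape (n_frames, n_ions, 3), in metres
        dt   : time between frames, in seconds
        """
        disp = traj - traj[0]                       # displacement from frame 0
        msd = (disp ** 2).sum(axis=2).mean(axis=1)  # MSD averaged over ions
        t = np.arange(len(msd)) * dt
        i0 = int(fit_start * len(msd))              # fit the late, linear regime
        slope = np.polyfit(t[i0:], msd[i0:], 1)[0]
        return slope / 6.0                          # D in m^2 s^-1

    # illustrative use on a synthetic 3-D random walk (not real PEO data)
    rng = np.random.default_rng(0)
    traj = np.cumsum(rng.normal(0, 1e-11, size=(2000, 50, 3)), axis=0)
    print("log10 D =", np.log10(diffusion_coefficient(traj, dt=1e-12)))

Fitting only the late-time portion of the MSD avoids the ballistic and cage-rattling regimes at short times.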
3.6. Proton Conductors
The mechanisms that are thought to operate in these systems are different from those in other fast ion conductors, and these materials are treated as a separate class. In the case of hydrated salts like hydrogen uranyl phosphate, HUO2PO4·4H2O (HUP), the protons are believed to migrate by the classical Grotthuss mechanism used to explain proton mobility in water, which involves proton exchange between the water molecules [33]. A number of doped ceramic oxides with the perovskite structure, mainly cerates and zirconates, also exhibit high proton conductivity, and here the protons were believed to migrate by a tunnelling process [34]. The best example is Yb-doped SrCeO3, where the dopant is charge compensated by the creation of anion vacancies. Water molecules can be incorporated into these vacancies and generate hydroxyl ions according to the reaction (Kröger–Vink notation):

    O×_O + V••_O + H2O → 2(OH)•_O    (3)
Static lattice simulations have been used successfully in a conventional manner to study the solution of dopants in the perovskite oxides. However, more revealing has been the simulation of the proton motion in calcium zirconate, CaZrO3 [35]. This involved an ab initio calculation of the electronic structure using a DFT approach in the CASTEP code [10], coupled with a classical MD simulation. The calculations showed that the proton moved via hops between oxygen anions. Thus the process was analogous to the Grotthuss mechanism, and there was no evidence of the migration of hydroxyl ions. As in the Grotthuss mechanism, the reorientation and alignment of the OH group towards the next oxygen anion was a rate-determining factor. This was consistent with the experimental observation that distorted perovskites have a lower conductivity than cubic perovskites.
4. Conclusions
Fast ion conduction has provided challenges to conventional theories in terms of both defect structure and the mechanisms of ion migration. Computer simulation methods have provided a means of attacking these problems and have played a key role in developing the current level of understanding of the materials. Particularly successful have been the MD simulations. It is clear that there will be further developments with the expected increases in computing power. For example, this will allow better representation of the interatomic potentials with the greater use of ab initio methods, the use of larger simulation boxes and, in the case of MD calculations, longer run times. In terms of new areas that will develop, it is reasonable to expect more work on nanocrystalline fast ion conductors. Enhanced diffusion is a feature of nanocrystals, and recent experimental work has shown that the conductivity is increased when the particle size or film thickness of an ionic material is in the nanometre range [36]. It should soon be possible to explicitly simulate a nanocrystal a few nanometres in diameter on the computer and explore the morphology and ion transport directly. Finally, it is worth mentioning developments in methodology. The common approach used for fast ion conductors has been to construct a structure for the simulation based on crystallographic information and then introduce defects. Recently, alternative approaches have been developed in which the simulation involves some kind of structural evolution [37]. For example, in the study of films on substrates the technique can involve an MD simulation in which the system is melted or amorphised, followed by a recrystallization. In these methods the system will evolve its own natural morphology, and defects, such as grain boundaries, dislocations and point defects, are generated naturally in the simulation and depend solely on the interatomic potentials, rather than on the intuitive input of the researcher. This can be very useful in exploring complex structures, such as the role of the substrate in film morphology. The methods should find applications in the study of fast ion conductors, particularly for nanocrystals and thin films, where the materials are being employed in technological applications.
References [1] M. Faraday, “Experimental researches in electricity,” J.M. Dent, London; Faraday’s Diaries 1820–1862, G. Bell, London, entries for 21st February, 1833 and 19th February, 1835, 1939. [2] S. Chandra, “Superionic solids,” North-Holland, Amsterdam, 1981. [3] A.M. Stoneham (ed.), “Ionic solids at high temperatures,” World Scientific, Singapore, 1989.
[4] A. Laskar and S. Chandra (eds.), “Superionic solids and solid electrolytes,” Academic Press, New York, 1990. [5] S. Huberman, Phys. Rev. Lett., 32, 1000, 1974. [6] J. Corish, “Point defects in simple ionic solids,” Article 6.4, this volume. [7] M.J. Norgett, Harwell Report AERE-R 7650, AEA Technology, Harwell, Didcot, OX11 0RA, U.K., 1974. [8] M. Leslie, SERC Daresbury Laboratory Report DL-SCI-TM3IT, CCLRC Daresbury Laboratory, Warrington WA4 4AD, U.K., 1982. [9] J.D. Gale, General Utility Lattice Programme, Imperial College, London, U.K., 1992. [10] M.D. Segall, P.J.D. Lindan, M.J. Probert, C.J. Pickard, P.J. Hasnip, S.J. Clark, and M.C. Payne, J. Phys. Condens. Matter, 14, 2717, 2002. [11] M.P. Allen and D.J. Tildesley, “Computer simulation of liquids,” OUP, Oxford, 1987. [12] S. Yip, this volume, Chapter 6.5, 2004. [13] W. Smith and T.R. Forester, J. Mol. Graph., 14, 136, 1996. [14] C.R.A. Catlow, J. Chem. Soc. Faraday Trans., 86, 1167, 1990. [15] A.V. Chadwick, in “Diffusion in materials – unsolved problems,” G. Murch (ed.), Transtech, Zurich, 1992. [16] A.B. Lidiard, Handbuch der Physik, XX, 157, 1957. [17] J. Corish and P.W.M. Jacobs, in: M.W. Roberts and J.M. Thomas (eds.), Surface and Defect Properties of Solids, The Chemical Society, London, vol. 2, p. 184, 1973. [18] C.R.A. Catlow, Comments in Solid State Phys., 9, 157, 1980. [19] A.V. Chadwick, Solid State Ionics, 8, 209, 1983. [20] M.J. Gillan, in: A.M. Stoneham (ed.), “Ionic Solids at High Temperatures,” World Scientific, Singapore, p. 169, 1989. [21] D.A. Keen, J. Phys.: Condens. Matter, 14, R819, 2002. [22] A. Azimi, V.M. Carr, A.V. Chadwick et al., J. Phys. Chem. Solids, 45, 23, 1984. [23] A.R. Allnatt, A.V. Chadwick, and P.W.M. Jacobs, Proc. Roy. Soc., A410, 385, 1987. [24] M.J. Castiglione and P.A. Madden, J. Phys. Condens. Matter, 13, 9963, 2001. [25] J.M. Reau and J. Grannec, in: P. Hagenmüller (ed.), Inorganic Solid Fluorides, Academic Press, New York, p. 423, 1985. [26] G.E. Murch, C.R.A. Catlow, and A.D. Murray, Solid State Ionics, 18–19, 196, 1986. [27] J.V.L. Beckers, K.J. van der Bent, and S.W. de Leeuw, Solid State Ionics, 133, 217, 2000. [28] C.A. Angell, Solid State Ionics, 9–10, 3, 1983. [29] C.A. Angell, Solid State Ionics, 18–19, 72, 1986. [30] J.L. Soucquet, Solid State Ionics, 28–30, 693, 1988. [31] F.M. Gray, “Solid polymer electrolytes,” VCH, New York, 1991. [32] C.R.A. Catlow, A.V. Chadwick, and G. Morrison, Radiat. Eff. Defects Solids, 156, 331, 2001. [33] A.N. Fitch, Mater. Sci. Forum, 39, 113, 1986. [34] H. Iwahara, H. Uchida, and K. Tanaka, Solid State Ionics, 9–10, 1021, 1983. [35] M.S. Islam, R.A. Davies, and J.D. Gale, Chem. Mater., 13, 2049, 2001. [36] N. Sata, K. Eberl, K. Eberman et al., Nature, 408, 946, 2000. [37] D.C. Sayle and R.L. Johnston, Curr. Opin. Solid State Mater. Sci., 7, 3, 2003.
6.6 DEFECTS AND ION MIGRATION IN COMPLEX OXIDES
M. Saiful Islam
Chemistry Division, SBMS, University of Surrey, Guildford GU2 7XH, UK
1. Introduction
Ionic or mixed conductivity in complex ternary oxides has attracted considerable attention owing to both the range of applications (e.g., fuel cells, oxygen generators, oxidation catalysts) and the fundamental fascination of fast oxygen transport in solid state ionics [1, 2]. In particular, the ABO3 perovskite structure has been dubbed an “inorganic chameleon” since it displays a rich diversity of chemical compositions and properties. For instance, the mixed conductor La1−xSrxMnO3 finds use as the cathode material in solid oxide fuel cells (SOFCs) and also exhibits colossal magnetoresistance (CMR), whereas Sr/Mg doped LaGaO3 shows superior oxygen ion conductivity relative to the conventional zirconia-based electrolyte at moderate temperatures. A range of perovskite-structured ceramics, particularly cerates (ACeO3) and zirconates (AZrO3), also exhibit proton conductivity with potential fuel cell and sensor applications. It has become increasingly clear that the investigation of defect phenomena and atomistic diffusion mechanisms in these complex oxides underpins both the fundamental understanding of macroscopic behaviour and the ability to predict their transport parameters. Computer modelling techniques are now well established tools in this field of solid state ionics, and have been applied successfully to studies of structures and dynamics of solids at the atomic level. A major theme of modelling work has been the strong interaction with experimental studies, which is evolving in the direction of increasingly complex systems. In general, three main classes of technique have been employed in the study of complex oxide materials: atomistic (static lattice) based on energy minimisation, molecular dynamics (MD) and quantum mechanical (ab initio) methods. Our focus here is to highlight the major findings of recent
modelling studies since detailed descriptions of these computational methods are presented in Chapter 1 (Cohen), Chapter 2 (Gale; Catlow) and Chapter 6 (Corish) of this Volume. This review addresses recent trends in computational studies of the defect, dopant and ionic transport properties of topical perovskite oxide materials. In particular, we highlight contemporary work on different oxygen ion and proton-conducting perovskites (such as LaGaO3 and CaZrO3 ) to illustrate part of the wide scope of information that can be obtained.
2. Dopants in LaMO3 Perovskites
The series of compounds based on LaMO3 (where, for example, M = Mn, Co, Ga) are some of the most fascinating members of the perovskite family, due to their applications in SOFCs, ceramic membranes and heterogeneous catalysis [1, 3, 4]. The addition of aliovalent cation dopants is crucial to the ionic (or mixed) conductivity in these oxides. These materials are typically acceptor-doped with divalent ions at the La3+ site, resulting in extrinsic oxygen vacancies at low vapour pressures. Considering Sr2+ substitution of La3+ as an example, this doping process can be represented by the following defect reaction:

    SrO + La×_La + ½O×_O = Sr′_La + ½V••_O + ½La2O3    (1)
where, in Kröger–Vink notation, Sr′_La signifies a dopant substitutional and V••_O an oxygen vacancy. Atomistic simulations can be used to evaluate the energies of this “solution” reaction by combining appropriate defect and lattice energy terms. In this way, the modelling approach provides a useful systematic guide to the relative energies for different dopant species at the same site. The starting point is the modelling of the ABO3 perovskite structure, which is built upon a framework of corner-linked BO6 octahedra with the A cation in a 12-coordinate site; the orthorhombic phase can be considered as due purely to tilts of these octahedra from the ideal cubic configuration (shown in Fig. 1). The interatomic potential parameters were derived by empirical procedures (as discussed by Gale in Chapter 2) using the observed structures and crystal properties; these energy minimization methods produce good agreement between experimental and simulated structures [5], which provides a reliable starting point for the defect calculations. Detailed studies have calculated solution energies for a series of alkaline-earth metal ions in the LaMO3 materials (M = Mn, Co, Ga). The defect calculations are based on the well-established Mott–Littleton methodology [6] embodied in the GULP code [7]. Figure 2 reveals that the lowest energy values are predicted for Sr and Ca at the La site. The favourable incorporation of these ions will therefore enhance transport properties owing to the increase in the concentration of mobile defects.
Figure 1. Perovskite structure showing corner-linked MO6 octahedra: (a) cubic, (b) orthorhombic.
Figure 2. Calculated energies of solution (eV per dopant) as a function of dopant ion radius (Å) for alkaline-earth cations (Mg, Ca, Sr, Ba) substituting on the La site in LaMO3 perovskites (LaMnO3, LaCoO3, LaGaO3).
These results accord well with experimental work in which Sr is the dopant commonly used to generate ionic (or mixed) conductivity in these perovskites, while Ca is often used to generate mixed-valent Mn3+/Mn4+ in the magnetoresistive (CMR) manganates. It is also apparent from Fig. 2 that a degree of correlation is found between the calculated solution energy and the size of the alkaline-earth dopant, with minima near the radius of the host La3+. However, ion size is not the sole factor, as previous studies show that the solution energies for alkali metal dopants with similar ionic radii are appreciably endothermic, in line with their observed low solubility.
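The solution energies plotted in Fig. 2 are simple combinations of Mott–Littleton defect energies and lattice energies according to reaction (1). A sketch of the bookkeeping is given below; all numerical values are made-up placeholders standing in for actual GULP output.

    # Energy bookkeeping for the doping reaction (1):
    #   SrO + La_La^x + 1/2 O_O^x = Sr_La' + 1/2 V_O^.. + 1/2 La2O3
    # The perfect-lattice species carry no separate terms: their removal is
    # already contained in the Mott-Littleton defect energies.
    # All numbers below are placeholders, not calculated GULP values.

    E_defect = {
        "Sr_La'": 25.0,    # dopant substitutional defect energy (eV)
        "V_O..": 17.0,     # oxygen vacancy defect energy (eV)
    }
    E_lattice = {
        "SrO": -33.0,      # lattice energy per formula unit (eV)
        "La2O3": -129.0,
    }

    def solution_energy():
        # products minus reactants, per dopant ion
        products = (E_defect["Sr_La'"] + 0.5 * E_defect["V_O.."]
                    + 0.5 * E_lattice["La2O3"])
        reactants = E_lattice["SrO"]
        return products - reactants

    print("solution energy: %.2f eV per dopant" % solution_energy())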
3. Oxygen Defect Migration in LaGaO3
The LaGaO3 material doped with Sr and Mg has attracted growing attention as a solid electrolyte competitive with Y/ZrO2 (and Gd/CeO2 ) due to its extremely high oxygen ion conductivity at lower operating temperatures [1, 3]. As discussed by Chadwick in Chapter 6, simulation methods have been able to investigate fundamental mechanistic problems of ion migration in complex ionic conductors.
As part of an extensive survey of perovskite oxides, we first probed the energy profiles for oxygen ion migration by calculating the defect energy of the migrating ion along possible diffusion paths, and allowing relaxation of the lattice at each position. It should be noted that interstitial formation and migration are calculated to be highly unfavourable, as expected for the closely packed ABO3 perovskite lattice. The simulations therefore confirm the migration of oxygen ion vacancies as the lowest energy path, as well as predicting that any oxygen hyperstoichiometry will not involve interstitial defects. For the topical LaGaO3 oxygen ion conductor, the simulations find that the calculated migration energy (0.73 eV) is in good agreement with experimental activation energies of about 0.7 eV from high temperature dc conductivity [8] and 0.79 eV from SIMS data [9]. In terms of the precise diffusion mechanism, however, it has not been clear from experiment whether the migrating ion takes a direct linear path along the edge of the MO6 octahedron into a neighbouring vacancy. An important modelling result is that a small deviation from the direct path for vacancy migration is revealed, illustrated schematically in Fig. 3. The calculations therefore indicate a curved route around the octahedron edge with the “saddle-point” away from the adjacent B site cation. Indeed, recent neutron diffraction and scattering studies of doped LaGaO3 [10] provide evidence for our predicted curved pathway. In the saddle-point configuration, the migrating ion must pass through the opening of a triangle defined by two A site (La) ions and one B site (Ga) ion. The simulation approach is able to treat ionic polarizability and lattice relaxation, generating valuable information on local ion movements. From our analysis we find significant displacements (0.1 Å) of these cations away from the mobile oxygen ion. These results emphasise that neglecting lattice relaxation effects at the saddle-point may be a serious flaw in previous ion size approaches based on a rigid hard-sphere model, in which the ‘critical radius’ of the opening is derived. It is worth mentioning that, in addition to dopant ion incorporation, these modelling techniques have been used to investigate dopant–vacancy association, where we find that a minimum in the binding energy occurs for Sr2+ on La3+ in LaGaO3, which would be beneficial to oxygen ion conductivity. Recent studies have also examined larger complex clusters (xMg′_Ga·(x/2)V••_O) within 2D and 3D structures, which may be related to possible “nano-domain” formation at higher dopant regimes [11]. This work on nano-clusters is currently being extended to other materials.
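The energy profiles described above amount to dragging the migrating oxygen ion along a trial path and relaxing the surrounding lattice at each step. The skeleton below illustrates such a scan; relax_lattice is a hypothetical callable standing in for the Mott–Littleton energy minimiser, and the demonstration energy surface is invented.

    import numpy as np

    def migration_profile(start, end, relax_lattice, n_images=11, bow=0.3):
        """Defect energy along a (possibly curved) migration path.

        start, end    : end points of the oxygen jump (Angstrom)
        relax_lattice : placeholder callable returning the relaxed defect
                        energy with the migrating ion held at a position
        bow           : sideways offset (Angstrom) giving the curved route
                        around the octahedron edge
        """
        start, end = np.asarray(start, float), np.asarray(end, float)
        jump = end - start
        perp = np.cross(jump, [0.0, 0.0, 1.0])   # assumes jump not along z
        perp /= np.linalg.norm(perp)
        energies = []
        for f in np.linspace(0.0, 1.0, n_images):
            pos = start + f * jump + bow * np.sin(np.pi * f) * perp
            energies.append(relax_lattice(pos))
        return np.array(energies)

    # demonstration with an invented sinusoidal energy surface (barrier 0.73 eV)
    demo = migration_profile([0, 0, 0], [2.7, 0, 0],
                             lambda p: 0.73 * np.sin(np.pi * p[0] / 2.7) ** 2)
    print("migration energy: %.2f eV" % (demo.max() - demo[0]))

With a real code supplying relax_lattice, the migration energy is simply the maximum of the profile relative to its end points, and the bow parameter can be varied to locate the lowest-energy (curved) route.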
4. Proton Transport in AZrO3
In addition to oxygen ion conduction, perovskite oxides have received considerable attention as high-temperature proton conductors with promising use in fuel cell and hydrogen sensor technologies [12–14].
Figure 3. Curved path for oxygen vacancy migration between adjacent anion sites in LaMO3.
Most attention has focused on A2+B4+O3 perovskites, particularly ACeO3 and AZrO3. An important example is the development of a sensor for hydrogen in molten metal based upon doped CaZrO3 as the proton-conducting electrolyte. The CaZrO3 material is typically acceptor-doped with trivalent ions (e.g., In3+) at the Zr4+ site. When these perovskite oxides are exposed to water vapour, the oxygen vacancies are replaced by hydroxyl groups, described as follows:

    H2O(g) + V••_O + O×_O → 2OH•_O    (2)
In an attempt to gain further insight into the mechanistic features of proton diffusion, we have focused on the orthorhombic phase of CaZrO3, which comprises inequivalent O(1) and O(2) oxygen sites; this study extends earlier simulation work on ideal cubic perovskites [15]. Here the DFT-pseudopotential approach has been utilised to perform ab initio dynamics calculations [16] using the
Figure 4. Sequence of three snapshots, (i)–(iii), from ab initio MD simulations showing inter-octahedra proton hopping in orthorhombic CaZrO3. (The Ca ions are omitted for clarity.)
CASTEP code [17], which essentially combines the solution of the electronic structure with classical molecular dynamics (MD) for the nuclei. Graphical analysis of the evolution of the system with time shows proton hopping events during the simulation run. Figure 4 presents “snapshots” of one of these proton hops between neighbouring O(1) oxygen ions, illustrating both the initial and barrier (transition) states. This confirms that proton conduction occurs via a simple transfer of a proton from one oxygen ion to the next (Grotthuss mechanism). These simulations provide no evidence for the migration of hydroxyl ions (“vehicle” mechanism) or for the existence of “free protons”. We also find rapid rotational and stretching motion of the O–H group, which allows the reorientation of the proton towards the next oxygen ion before the transfer process. Interestingly, our simulations reveal predominantly inter-octahedra proton hopping, rather than hopping within octahedra, which is influenced by the [ZrO6] tilting within the orthorhombic structure of CaZrO3. This work is consistent with the experimental observation that proton mobilities are lower in perovskite structures deviating strongly from cubic symmetry.
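Hop events of the kind shown in Fig. 4 can be extracted from an MD trajectory simply by tracking which oxygen each proton is bonded to at every frame. A sketch is given below, using synthetic arrays in place of real CASTEP output; the function and array names are placeholders, and minimum-image corrections are omitted for brevity.

    import numpy as np

    def count_proton_hops(h_traj, o_traj):
        """Count proton transfer events between oxygen ions.

        h_traj : (n_frames, 3) positions of one proton
        o_traj : (n_frames, n_oxygen, 3) positions of the oxygen sublattice
        A 'hop' is recorded whenever the nearest oxygen to the proton changes.
        """
        d = np.linalg.norm(o_traj - h_traj[:, None, :], axis=2)  # H-O distances
        owner = d.argmin(axis=1)         # index of nearest oxygen at each frame
        hops = np.count_nonzero(owner[1:] != owner[:-1])
        return hops, owner

    # illustrative use on synthetic data (not a real trajectory)
    rng = np.random.default_rng(0)
    o = rng.uniform(0, 8, size=(1, 6, 3)).repeat(500, axis=0)   # static oxygens
    h = o[:, 0, :] + rng.normal(0, 0.5, size=(500, 3))          # rattling proton
    print("hops detected:", count_proton_hops(h, o)[0])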
5. Proton–Dopant Association in AZrO3
Proton-conducting oxides are typically doped with aliovalent ions. However, there has been some debate as to whether there is any significant interaction between the dopant ion and the protonic defect (a hydroxyl ion at an oxygen site), which may lead to proton “trapping”. In an attempt to probe this question of proton–dopant association, DFT-based methods [17] have been
used to examine defect pairs (OH•_O · M′_Zr) consisting of a hydroxyl ion and a neighbouring dopant substitutional (shown in Fig. 5). Attention was focused on three commonly used dopants in CaZrO3, namely Sc3+, Ga3+ and In3+. The resulting binding energies (with respect to the two isolated defects) are in the range −0.2 to −0.3 eV, which suggests that all the hydroxyl–dopant pairs are favourable configurations. Although there are no experimental data on CaZrO3 for direct comparison, the calculated values are in accord with proton “trapping” energies of about −0.2 and −0.4 eV for Sc-doped SrZrO3 and Yb-doped SrCeO3 respectively, derived from muon spin relaxation (µSR) and quasi-elastic neutron scattering (QENS) experiments [18]. These studies postulate that in the course of their diffusion, protons are temporarily trapped at single dopant ions. It is noted, however, that defect pairs do not necessarily preclude the presence of isolated protons and dopant ions, since clusters will be in equilibrium with single defects. This picture can be viewed as analogous
Figure 5. Dopant-OH pair at nearest-neighbour sites in the [Zr–O] plane in CaZrO3 .
to oxygen ion conductivity in fluorite oxides and the well-known importance of dopant–vacancy interactions [19].
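For concreteness, the binding energies quoted above are defined relative to the isolated defects,

    E_bind = E(OH•_O · M′_Zr) − [ E(OH•_O) + E(M′_Zr) ],

so that a negative value, as found here, signals a bound (proton-trapping) pair.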
6. Concluding Remarks
Computational techniques now play an important role in contemporary studies of complex oxides such as perovskite-structured ionic conductors. Such modelling tools, acting as powerful “computational microscopes,” have been used here to provide deeper fundamental insight into the defect and ion transport properties of some topical examples of complex oxides. Our simulations, using both energy minimization and quantum mechanical methods, have aimed to guide and stimulate further experimental work on these perovskite materials, relevant to their applications in solid oxide fuel cells, oxygen separation membranes and partial oxidation reactors. Future developments in the modelling of ternary oxides are likely to encompass the atomistic simulation of interfaces and nanocrystals, the greater use of shell-model MD over longer time-scales, and the extension of quantum mechanical techniques to more complex oxide systems, with increasing emphasis on predictive calculations. These developments will be assisted by the constant growth in computer power, and will also draw on the strong interaction with complementary experimental techniques.
Acknowledgments
The author is grateful for valuable discussions with C.R.A. Catlow, A.V. Chadwick, J.D. Gale, P.R. Slater and J.R. Tolchard. The work has been supported by the EPSRC and the Royal Society.
References
[1] B.C.H. Steele, Solid State Ionics, 134, 3, 2000.
[2] P. Knauth and H.L. Tuller, J. Am. Ceram. Soc., 85, 1654, 2002.
[3] T. Norby, J. Mater. Chem., 11, 11, 2001.
[4] S.W. Tao and J.T.S. Irvine, Chem. Record, 4, 83, 2004.
[5] M.S. Islam, J. Mater. Chem., 10, 1027, 2000.
[6] C.R.A. Catlow (ed.), “Computer Modelling in Inorganic Crystallography,” Academic Press, London, 1997.
[7] J.D. Gale, J. Chem. Soc., Faraday Trans., 93, 629, 1997.
[8] K. Huang, R.S. Tichy, and J.B. Goodenough, J. Am. Ceram. Soc., 81, 2565, 1998.
[9] T. Ishihara, J.A. Kilner, M. Honda, and T. Takita, J. Am. Chem. Soc., 119, 2747, 1997.
[10] M. Yashima, K. Nomura, H. Kageyama, Y. Miyazaki, N. Chitose, and K. Adachi, Chem. Phys. Lett., 380, 391, 2003. [11] M.S. Islam and R.A. Davies, J. Mater. Chem., 14, 86, 2004. [12] H. Iwahara, H. Matsumoto, and K. Takeuchi, Solid State Ionics, 136–137, 133, 2000. [13] S.M. Haile, Acta Materialia, 51, 5981, 2003. [14] K.D. Kreuer, Ann. Rev. Mater. Res., 33, 333, 2003. [15] W. Münch, K.D. Kreuer, G. Seifert, and J. Maier, Solid State Ionics, 136–137, 183, 2000. [16] M.S. Islam, R.A. Davies, and J.D. Gale, Chem. Mater., 13, 2049, 2001. [17] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, Rev. Mod. Phys., 64, 1045, 1992. [18] R. Hempelmann, M. Soetratmo, O. Hartmann, and R. Wäppling, Solid State Ionics, 107, 269, 1998. [19] J.A. Kilner, Solid State Ionics, 129, 13, 2000.
6.7 INTRODUCTION: MODELING CRYSTAL INTERFACES
Sidney Yip1 and Dieter Wolf2
1 Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2 Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
Interfaces represent an integral part of the understanding, design, and processing of modern materials [1, 4–6]. Many phenomena and properties, ranging from electronic and optical to thermal and mechanical in nature, are known to be dominated by the presence of interfaces. The basic paradigm for understanding and controlling interfacial phenomena is the concept of structure–property correlation, well known in materials science. In the remaining sections of this chapter, we will discuss the application of this concept to grain boundaries. The reader should keep in mind that while the modeling of grain boundaries is a significant problem in and of itself, the modeling concepts and simulation methods developed and applied in this context apply equally well to other types of interfacial materials. One way of classifying solid interfaces is shown in Fig. 1, where three types of interfacial systems are distinguished. The interfacial region (assumed to be infinite in the x–y plane) is embedded in the z direction between two perfect, semi-infinite bulk crystals. The lower and upper halves of this bicrystal generally consist of different materials, A and B. Only flat interfaces are considered in Fig. 1; however, from high-resolution electron microscopy we know that macroscopically curved interfaces are usually faceted on an atomic scale. The presence or absence of one or both of the bulk regions in Fig. 1 has a strong effect on the lattice parameters, and hence the physical properties in the interfacial region. By examining the different ways in which the interfacial region may or may not be sandwiched between the two bulk regions, three types of interfacial systems can be distinguished: (a) Bulk (or buried, internal) interfaces: Systems in which the interface region is surrounded by bulk material on both sides. Examples are interphase (for A ≠ B) and grain (for A = B) boundaries.
Figure 1. Distinction of three types of interfacial systems. Depending on whether the system is embedded in bulk material on both sides of the interface, on only one side or not at all, we distinguish “bulk”, “epitaxial” and “thin-film” interfaces. A and B are generally different materials.
(b) Epitaxial interfaces: Systems with bulk material on only one side of the interface and a thin film on the other. For A = B, a bulk free surface is obtained. (c) Thin-film interfaces: Systems with both bulk regions removed. For A = B, a free-standing thin film, generally containing a grain boundary and bordered by free surfaces, is obtained. Strained-layer superlattices are included here for the case in which the structure of an A|B thin-film sandwich is periodic not only in the x–y plane but also in the z direction: . . . |A|B|A|B| . . . . Since no bulk embedding is left, the material near the interface would not “know” its bulk lattice parameter. Following this line of thinking one can go further and distinguish between coherent versus incoherent, or commensurate versus incommensurate interfaces (D. Wolf, in [1], Chapter 1). By grain boundaries we mean the regions between adjacent crystalline grains. Because the adjacent grains can have different shapes or orientations, atoms in the region of mismatch, the grain boundary, will be less well packed than those in the grain interior. This open structure means that the grain boundary region can act like a source or sink for defects, an easier path for atomic transport, or a more active site for mechanical deformation and even chemical
reaction. The details of the structural openness and how the local atomic arrangements can affect the various physical properties of polycrystalline material are fundamental topics for modeling studies. In this sense the studies of crystal defects in the present chapter provide illustrations of the basic utility of materials theory and simulation that permeates throughout this volume. Many special interests in interface materials stem from their inherent inhomogeneity, i.e., the physical properties at or near an interface can differ dramatically from those of the nearby bulk material. For example, the thermal expansion, electrical resistivity or elastic response near an interface can be highly anisotropic in an otherwise isotropic material, and differ by orders of magnitude from those of the adjacent bulk regions. Typically these gradients extend over only a few atomic layers, so their experimental investigation requires techniques capable of atomic-level resolution and detection. For surfaces and thin films suitable characterization methods have been developed [2]. On the other hand, buried interfaces continue to be a major challenge, because the presence of the interfaces affects only a small fraction of the atoms (D. Seidman, in [1], Chapter 2). This inherent difficulty in the experimental investigation of buried interfaces actually presents an opportunity for atomic- and electronic-level calculations to contribute to our understanding of solid interfaces by means of simulations. Because the properties in the interfacial region are controlled by relatively few atoms, electronic- and atomic-level simulations can provide a close-up view of the most critical part of the material. The limitations of these simulations, our incomplete knowledge of electronic structure and interatomic interactions, the finite size of any simulation cell and its embedding in the surrounding material, and the finite duration of any simulation, are well known. Nonetheless, we now have the means to study the positions and movements of atoms, local stresses, etc., and relate this information to the physical properties of the material. For example, in molecular dynamics simulation it is as if one had an atomic-level camera with a field of view of about 100 Å at a speed of 10^14–10^15 frames per second. The unique features of such simulations offer opportunities for a joint approach combining atomic-level experimental techniques with computer simulations. In atomistic simulations one has complete information about the model systems under study. This is a very significant advantage over experiments, where either the microstructure of the sample has not been fully determined, or the phenomenon of interest cannot be measured in sufficient detail. Because both structure and properties can be well characterized in simulation, the results of such studies are particularly useful for establishing correlations. This feature will be illustrated repeatedly in the following sections. The layout of the remaining sections of this chapter follows the paradigm of structure–property correlation. We summarize in Chapter 6.8 the simulation concepts and methods that are relevant for the study of interfaces in
general and grain boundaries in particular, making use of the treatments that can be found in Chapters 1 and 2. Even though grain boundaries have a finite depth, they are considered to be planar defects. As with all studies of extended defects in crystals, one begins with the structural aspects, such as classification of geometrical features in a lattice and considerations of atomic distributions and configurations, and then correlates various physical properties to the interfacial structure. From the modeling perspective, a pertinent question is which of the many possible structures that a grain boundary can have should one study. This is a nontrivial issue since in a simulation one must have a way of specifying the energy of the system in terms of the atomic coordinates, the information usually embodied in an interatomic potential, an energy function which depends on the positions of all the atoms in the system. Knowing the potential is generally not enough because border conditions for the simulation cell, which necessarily has to be finite, also must be specified. In Chapter 6.9 we will discuss static calculation of grain-boundary energy and the correlation of the results with the structure of the interface. This investigation illustrates how one can determine the stability of an interface, a procedure that can be applied to any system of atoms. To study finite-temperature behavior we will switch to molecular dynamics, in Chapter 6.10, in order to obtain the trajectories of the atoms as the system microstructure evolves in a thermal environment. We will consider results on grain growth and plastic deformation in this context. In the following section, Chapter 6.11, we go into even more fundamental details to probe an extreme form of thermal response, the crystal-to-liquid transition or melting. The question we ask is: at the atomic level, what are the mechanisms by which a crystal lattice collapses at a certain critical temperature? Another way to raise the issue is to ask about the thermodynamic significance of the melting point and its relation to the kinetics of melting. We will see that atomistic simulation can provide the mechanistic details to give us a fuller understanding of melting, and to address the thermodynamic and the kinetic aspects of the phenomena in a unifying manner. This kind of investigation can be extended in several directions, such as melting at a grain boundary and the connection with a related phenomenon, the crystal-to-amorphous transition or solid-state amorphization. In Chapter 6.12 we take up the elastic behavior of interface materials. Conceptually, crystals with interfaces are inherently inhomogeneous systems. We will find that their elastic behavior can be opposite to what is expected of homogeneous systems. For example, we see from a study of the elastic behavior of interfaces that the elastic moduli of a material do not necessarily soften when the material density decreases. Simulations reveal that the net elastic response of interfacial materials is the result of complex, highly nonlinear competition between interfacial structural disordering and consequent volume expansion. Volume expansion and the related increase in average
interatomic distances usually give rise to elastic softening, whereas the structurally disordered interface region can cause either strengthening or softening [3]. The latter is readily seen via the underlying radial distribution function, as shown in Fig. 2. In spite of a decrease in the average density, indicated by the shift of the peaks from the solid arrows towards the larger distances marked by the open arrows, some atoms near an interface may be pushed closer together, causing elastic strengthening, while others are moved further apart, giving rise to a weakened elastic response. Because of anharmonicity, the former are weighted much more heavily than the latter, the net result being that some elastic moduli may actually strengthen even though the average density decreases. The final section of the chapter, Chapter 6.13, is a discussion of the role of grain boundaries in nanocrystalline materials. Nanocrystals have emerged in recent years as materials systems with unique structures and properties. They are of interest not only for fundamental understanding, but also because they can be functionalized for exciting technological applications. From the standpoint of modeling material interfaces, nanocrystals constitute an increasingly important topic of study. Nanostructures may be viewed as the confluence of clusters of a few to tens of atoms and microstructural entities on the length scales of microns. Using atomistic simulation techniques one can probe the structural and thermodynamic features of nanocrystals and relate them to the
Figure 2. Zero-temperature radial distribution function, G(r ), vs. distance r (in units of the lattice parameter a) for a superlattice of (001) twist grain boundaries in Cu (see Chapter 4) with six (001) planes between the interfaces [3]. The dashed lines delineate the peaks associated with the nearest, second-nearest, etc., neighbors in the fcc lattice. G(r ) is normalized such that the areas under the peaks correspond to the numbers of nearest (12), second-nearest (6), etc., neighbors; the peak centers are indicated by the open arrows. The solid arrows indicate the positions of the corresponding perfect-crystal (δ-function type) peaks. The difference between open and solid arrows signals a volume expansion of the multilayer over the perfect crystal.
corresponding features characteristic of amorphous materials. Another aspect to be discussed is how one can relate the behavior of nanocrystals to those of bicrystals and polycrystals. We have emphasized the modeling of crystal defects using the atomistic methods discussed in Chapter 2. It should be clear that the electronic structure methods discussed in Chapter 1 are also applicable so long as the number of atoms in the simulation is not so large as to preclude their use. There are significant connections between this chapter and the one following, Chapter 7, on microstructure, and also parts of Chapter 9 on soft matter.
Acknowledgments
DW is supported by the US Department of Energy, BES Materials Sciences, under Contract W-31-109-Eng-38.
References
[1] D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-Level Structure and Properties, Chapman and Hall, London, 1992.
[2] E. Meyer, S.P. Jarvis, and N.D. Spencer, “Scanning probe microscopy in materials science,” MRS Bull., 29, 443–445, 2004.
[3] D. Wolf and J.F. Lutsko, Phys. Rev. Lett., 60, 1170, 1988.
[4] A.P. Sutton and R.W. Balluffi, Interfaces in Crystalline Materials, Clarendon Press, Oxford, 1994.
[5] D. Wolf and S. Yip, MRS Bull., 15, 21–23, 1990a.
[6] D. Wolf and S. Yip, MRS Bull., 15, 23, 1990b.
6.8 ATOMISTIC METHODS FOR STRUCTURE–PROPERTY CORRELATIONS
Sidney Yip
Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
There is a general belief that physical properties of crystals can be classified, to a first approximation, according to their structure. The basis of this thinking is that there is a close correlation between structure and the chemical bonding between atoms, which in turn controls the properties [1]. Although it is not guaranteed to be always successful, this can be a good starting point toward the understanding of materials properties and behavior. In this section we discuss the use of atomistic techniques to study interfaces, primarily grain boundaries, in the context of structure–property correlation. As we will see, these methods are a subset of the multiscale techniques treated extensively in Chapters 1–4. Using the grain boundary as a prototypical crystal defect, we examine how atomistic simulation techniques can be brought together to determine the physical properties of crystalline materials with well-characterized defect microstructure. This section also serves as an introduction to the subsequent sections, which are concerned, in one way or another, with probing the structure and associated properties of grain boundaries. An integrated approach to establish the correlations between interfacial structure and the physical properties of grain boundaries has been proposed using intergranular fracture as a target application [2]. We adopt the same point of view here, not so much to study intergranular fracture as to illustrate the capabilities of four related atomistic techniques: lattice statics (LS), lattice dynamics (LD), Monte Carlo (MC), and molecular dynamics (MD). A fundamental characteristic of all interfacial systems is that such systems are intrinsically inhomogeneous. The presence of an interface means the
immediate region surrounding the interface can have properties quite different from the bulk. It follows that any understanding of interfacial properties must explicitly take into account the local behavior of the interfacial region, for example, the local volume expansion at the boundaries or the local elastic constants. Because the interfacial region is usually only a few atomic spacings in extent, calculations of local properties necessarily involve details of displacements and forces at the molecular level. Such information is directly available from discrete-particle (atomistic) simulations in which the interfacial region is modeled as an assembly of particles interacting through specified interatomic potentials. In the case of intergranular fracture there are formidable difficulties in the experimental approach to structure–property correlations because the complexity and diversity of relevant phenomena involve the specification and determination of large numbers of parameters and variables. One can imagine such problems at every stage of an experiment, from sample preparation, where one needs to control the interfacial structure, to sample characterization and measurement of details of the fracture process. On the other hand, these difficulties point to opportunities for the modeling approach. It is the inherent nature of atomistic calculations that one can specify interatomic structure and forces, follow the system evolution in dynamical detail, and analyze the results with regard to various physical properties of interest. Simulation therefore becomes a unique complement to experiment and the traditional theoretical methods of materials research. In assessing the capabilities of atomistic simulations, certain fundamental limitations of this approach also should be recognized at the outset. These are concerned with the availability of realistic interatomic potential models for the material system under study, and the finite system size and simulation duration that can be studied. The former has to be addressed in every individual study, since the significance (meaningfulness) of the results is always limited by the adequacy of the potential model adopted. With continuing research toward developing potentials for metals, ionics and semiconductors at first, and alloys and compounds more recently, and the increasing use of electronic-structure methods to produce databases for fitting and validation, the question of reliable potentials for atomistic simulation is not as critical as before. Nonetheless, the use of empirical potentials in atomistic simulation will always remain a compromise between tractability and predictive accuracy. The limitation of system size and simulation duration is an issue of computational resources. With computer power still increasing steadily, this is also becoming less serious in that what was not feasible only a year or two ago is now within reach of current capabilities. It is important to recognize that the development of ingenious boundary and initial conditions can be very effective in mitigating artifacts due to finite size and duration of simulation.
1. Linking Atomistic Techniques for Structure–Property Correlations
We begin with a brief review of the basic concepts in atomistic simulations. Consider a collection of N interacting atoms and denote their positions as {r^N} ≡ (r_1, r_2, ..., r_N), where r_i is the position of atom i. We will call {r^N} the atomic configuration or the system configuration. Suppose the system is in an initial configuration {r^N}_i which is not an equilibrium configuration; then the atoms will move under the effects of interatomic interactions once the system is allowed to evolve in time. At any given instant the state of the system is characterized by a potential energy, U({r^N}), which depends on the instantaneous positions of all the atoms, and a kinetic energy, K({v^N}), where {v^N} is the set of atomic velocities (v_1, v_2, ..., v_N). It is understood that the evolution of the N-particle system occurs as if the N atoms are actually part of a larger body, one of macroscopic dimensions, and a set of border conditions will be specified to describe the embedding of the simulation cell with N atoms in the larger medium. From a conceptual point of view three basic ingredients are required to predict the system evolution under the effects of interatomic forces. First, the interatomic potential has to be specified. In contrast to electronic-structure calculations, the potentials are assumed to depend only on the atom positions, with the electrons considered as an effective medium which mediates the atom–atom coupling. Secondly, to account for interactions between the system and its environment, border conditions have to be invoked. Such conditions may be a periodic extension of the simulation cell in three, two, or one dimension, for example. Or they may be a certain embedding of the system, such as enclosing the system in a linear elastic medium. Finally, an algorithm has to be used to determine the system response of interest, whether it is deterministic or stochastic, and whether the response is a static relaxation or a dynamical evolution. These three aspects will be considered separately.
2. Interatomic Interactions
The key to the computational efficiency of atomistic simulations lies in the description of the interaction between the atoms through a potential energy. The task of reducing the complex many-body problem of a system of interacting electrons and ions to a description involving only atomic coordinates is highly nontrivial and not unique. The successes of this approach, which have been extensively documented thus far, justify the belief that meaningful results can be obtained. Another practical motivation is that this is the only tractable way of investigating a large class of significant problems in materials research
for which the use of first-principles methods is not yet feasible. The development of an empirical potential usually consists of two steps. Some analytical form that can be conceptually rationalized is first chosen for the potential energy function of the system of N atoms, U({r^N}), then the adjustable parameters in this function are determined by fitting to a database of properties, typically lattice structure and zero-temperature lattice parameters, elastic constants, the sublimation or vacancy formation energy, and others. The database used can be purely experimentally or theoretically determined, or a combination of the two. Current trend is to use only theoretical results. See Chapters 1 and 2 for extensive discussions. From a conceptual viewpoint one can discuss empirical potential models by writing the energy U({r^N}) of the system as an expansion in n, where n is the number of particles interacting with each other,

    U({r^N}) = Σ_i V1(r_i) + Σ_{i<j} V2(r_i, r_j) + Σ_{i<j<k} V3(r_i, r_j, r_k) + ...    (1)
where V1(r_i) is the one-body potential, which depends only on the position of atom i, V2(r_i, r_j) is the two-body potential, and so on. The simplest possible representation of many-body interactions is the sum of two-body potentials. See the discussions on interatomic potentials in Chapters 2 and 9. For metals the two-body approximation is known to be inadequate. A more reasonable approach is to adopt a many-body potential in the following sense. One writes

    U({r^N}) = Σ_i F_i(ρ_i) + Σ_{i<j} ϕ(r_ij)    (2)
where F_i is the cohesive energy (attractive) contribution, which depends on the electron density at atom i, ρ_i, and ϕ is the repulsive two-body interaction, which depends on the separation between atoms i and j, r_ij = |r_i − r_j|. See in particular Article 2.2. Many of the results on grain-boundary (GB) studies discussed in this chapter are based on the two-body potential model.
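As a concrete illustration of the two-body term in Eq. (1), the sketch below sums Lennard-Jones pair energies over a four-atom cluster; the parameters are arbitrary reduced units, not a potential fitted to any material.

    import numpy as np

    def pair_energy(positions, epsilon=1.0, sigma=1.0):
        """U = sum over pairs i<j of the Lennard-Jones two-body term V2(r_ij)."""
        U = 0.0
        n = len(positions)
        for i in range(n):
            for j in range(i + 1, n):
                r = np.linalg.norm(positions[i] - positions[j])
                U += 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
        return U

    # four atoms at the vertices of a regular tetrahedron with all bonds at the
    # potential minimum r0 = 2^(1/6) sigma, so U = 6 bonds x (-epsilon) = -6
    r0 = 2.0 ** (1.0 / 6.0)
    scale = r0 / (2.0 * np.sqrt(2.0))
    pos = scale * np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], float)
    print("U =", pair_energy(pos))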
3. Border Conditions
By its very nature an interfacial system is composed of two coupled regions, the interface and the surrounding bulk regions. Formulating proper border conditions is a compromise in trying to satisfy two requirements: (i) keeping the simulation cell size small without incurring artifacts, and (ii) treating the coupling to the bulk surroundings realistically. This problem is discussed extensively in several subsequent sections in this chapter. The so-called Region I–Region II method has been adopted in a number of calculations. In this case, the interface (GB) is embedded in two semi-infinite bulk ideal
crystals, with periodic border conditions in the directions parallel to the interface. For an isolated interface, one which is embedded in two semi-infinite bulk ideal crystals, two-dimensional periodic border conditions (2-d PBC) in the two directions parallel to the interface are clearly appropriate. This embedding is accomplished by surrounding the interface region, denoted as Region I in Fig. 1, by two semi-infinite bulk ideal crystals, denoted as Region II. With the same 2-d PBC applied in directions x and y to both Regions I and II, the system has no free surfaces. When thermal motions are considered in the calculation, the Region I–Region II treatment together with 2-d PBC is still appropriate. But now the periodicity enforced by the border conditions gives rise to a limitation on the dynamical processes that can be studied. As is well known in MD simulations, the system under investigation cannot propagate phonons with wavelengths greater than the dimensions of Region I. In strained-layer superlattice materials one has a periodic arrangement of interface planes in the direction of the interface-plane normal (z-direction). Such systems can be represented by periodically repeating the atom positions
Figure 1. Schematic of simulation cell showing the simulation region (Region I) and the border regions (Region II). The grain-boundary is a planar interface which is infinite in the x- and y-directions by virtue of the periodic border conditions. The border condition in the z-direction, normal to the interfacial plane, depends on the specification of Region II.
in Region I in the direction normal to the interface, and by eliminating Region II. Instead of embedding a single interface in semi-infinite ideal crystals, the atoms near the surface of Region I are now surrounded by their own periodic images. The result is a three-dimensionally periodic arrangement of atoms (3-d PBC) in which the computational unit cell, Region I, now contains two identical interface planes. For studying an isolated interface, 3-d PBC have the undesirable feature that the two interface planes in the simulation cell can interact, so that their mutual influence cannot be easily separated from the behavior of a single interface. A method for treating the z-borders at finite temperatures in a less-constraining manner has been proposed [3]. Unlike a 3-d PBC this approach gives a simulation cell containing a single interface, and in contrast to the condition of a fixed z-border, it accommodates dimensional changes normal to the interface as well as translational motions parallel to the interface plane. This method also makes use of the Region I–Region II configuration shown in Fig. 1, where in Region I the particles are treated explicitly and Region II consists of two semi-infinite blocks of atoms held fixed at their ideal-crystal positions. The novel feature is that the rigid blocks are allowed to move by translations parallel to the interface plane, such motions being determined by the force exerted on the blocks across the Region I–Region II border. The blocks are also allowed translations in the z-direction; these movements are treated separately from the parallel translations and are governed by the pressure exerted on the blocks by Region I. At zero temperature and for finite temperature simulations, other borders have been used for different specific applications. For the simulation of equilibrium segregation, a 2-d PBC three-region simulation cell was used for the purpose of allowing the solute concentration at the interface to be calculated while keeping the solute concentration in the surrounding bulk region fixed (see below). For the simulation of crack-tip systems, different borders involving 1-d PBC which allow the external stress to be transmitted to the simulation cell have been proposed.
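In practice, the 2-d PBC of the Region I–Region II geometry reduce to a minimum-image convention applied in the x and y directions only, while z is left open to the border regions. A minimal sketch is given below; the function name and cell dimensions are illustrative.

    import numpy as np

    def minimum_image_2d(dr, Lx, Ly):
        """Minimum-image separation for 2-d PBC in x and y.

        dr : raw separation vector r_i - r_j, shape (3,)
        The z component is left untouched: normal to the interface the
        system is bounded by Region II, not by periodic images.
        """
        dr = dr.copy()
        dr[0] -= Lx * np.rint(dr[0] / Lx)
        dr[1] -= Ly * np.rint(dr[1] / Ly)
        return dr

    dr = np.array([9.0, -7.5, 3.0])
    print(minimum_image_2d(dr, Lx=10.0, Ly=10.0))   # -> [-1.   2.5  3. ]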
4. Lattice Statics
The atomistic methods we will link together are all concerned with a common model system. We will briefly examine the physical basis of each method, indicate their complementarity, and consider how they can be used synergistically. The method of lattice statics enables one to determine the zero-temperature relaxed structure of the simulation system by minimization of the potential energy U({r^N}). It is widely used in problems dealing with the low-temperature structure and energetics of defects in crystals [4].
The basis of the method is the expansion of U about a certain configuration {r_o^N},

$$U(\{r^N\}) = U(\{r_o^N\}) + \sum_i \nabla_{r_i} U\big|_o \cdot (r_i - r_i^o) + \frac{1}{2}\sum_{i,j} \nabla_{r_i}\nabla_{r_j} U\big|_o : (r_i - r_i^o)(r_j - r_j^o) + \cdots \tag{3}$$
where terms beyond the quadratic order in the displacement are not shown. To determine the equilibrium (relaxed) configuration, one can solve the equation given by the requirement that the force on each atom must vanish,
$$F_i(\{r_o^N\}) = -\nabla_{r_i} U(\{r^N\})\big|_o = 0,$$

which is the second term in Eq. (3). In a lattice-statics calculation, energy minimization is carried out by moving each atom a small amount in the direction of the force acting on it. How small this step should be is governed by the force constants, the third term in Eq. (3), or in simpler schemes it is chosen arbitrarily. This process of relaxation is continued until all the forces are reduced below some prescribed value, at which point the system is considered fully relaxed. The configuration {r_o^N} and the corresponding energy U({r_o^N}) thus obtained are the equilibrium structure and energy of the system at zero temperature. When the system contains an interface, it is not sufficient to relax only the atoms. The border conditions may need adjustment in response to the atomic relaxations at the interface; otherwise a residual strain can build up in the interfacial region.
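A minimal sketch of this force-based relaxation is given below, assuming a Lennard-Jones pair potential on a small open cluster; the fixed step size and convergence threshold are illustrative choices, not the force-constant-based scheme mentioned above.

```python
import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Lennard-Jones energy and forces for a small open cluster (no PBC)."""
    n = len(pos)
    energy = 0.0
    forces = np.zeros_like(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r2 = rij @ rij
            sr6 = (sigma**2 / r2) ** 3
            energy += 4 * eps * (sr6**2 - sr6)
            fmag = 24 * eps * (2 * sr6**2 - sr6) / r2   # -dU/dr along rij / r
            forces[i] += fmag * rij
            forces[j] -= fmag * rij
    return energy, forces

def relax(pos, step=1e-3, ftol=1e-6, max_iter=100_000):
    """Steepest-descent relaxation: move each atom along its force."""
    for _ in range(max_iter):
        energy, forces = lj_forces(pos)
        if np.abs(forces).max() < ftol:      # all forces below threshold
            break
        pos = pos + step * forces            # small displacement along forces
    return pos, energy
```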
5. Lattice Dynamics
Although not generally considered as simulation in the sense of moving particles, this widely used method for calculating the thermal properties of crystals can be computationally quite efficient [5]. In its simplest form, the harmonic approximation, the approach is most useful at low temperatures. However, with various modifications, known as quasiharmonic approximations, one can obtain useful results up to temperatures approaching the melting point in some cases [6]. The basis of the method is the expansion given in Eq. (3). With {r_o^N} taken to be the equilibrium configuration, the first term in Eq. (3) is the cohesive energy of the crystal and the second term vanishes. The harmonic approximation consists of ignoring all terms cubic and higher in the displacements. The crystal lattice is thus treated as a system of coupled oscillators with force constants derived from the second derivatives of the potential energy. The combined use of lattice statics and lattice dynamics is quite clear. Starting with
a model system with an initial configuration, one first performs energy minimization to obtain the relaxed configuration {r_o^N} and the energy U_o. This then provides the input structure for the lattice dynamics calculation, which involves the diagonalization of the dynamical matrix formed from the coefficients of the displacements in Eq. (3), giving the frequencies and eigenvectors of the various vibrational modes in the system. All the thermal properties of a harmonic solid can be calculated from the partition function once the normal modes are determined. There is some freedom in choosing the unit cell for the formulation of the dynamical matrix; however, it is intrinsic in the method that the system be periodic.
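The construction and diagonalization of a dynamical matrix can be illustrated on the simplest possible case, a one-dimensional harmonic chain with nearest-neighbor springs; the function name and parameters below are ours, chosen only to show the structure of the calculation.

```python
import numpy as np

def chain_phonons(n_atoms=8, k_spring=1.0, mass=1.0):
    """Phonon frequencies of a 1-D harmonic chain with periodic ends.

    The dynamical matrix D_ij = (1/m) d2U/dx_i dx_j is built from the
    nearest-neighbor spring force constants, then diagonalized; its
    eigenvalues are the squared vibrational frequencies omega^2.
    """
    D = np.zeros((n_atoms, n_atoms))
    for i in range(n_atoms):
        D[i, i] = 2 * k_spring / mass
        D[i, (i + 1) % n_atoms] = -k_spring / mass
        D[i, (i - 1) % n_atoms] = -k_spring / mass
    eigvals, eigvecs = np.linalg.eigh(D)
    omega = np.sqrt(np.clip(eigvals, 0, None))
    return omega, eigvecs

omega, modes = chain_phonons()
print(omega)   # includes one zero mode (uniform translation)
```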
6. Molecular Dynamics
This method is the realization of the classical-dynamics description of a model system through the Newtonian equations of motion [7],

$$m_i \frac{d^2 r_i(t)}{dt^2} = -\nabla_{r_i} U(\{r^N(t)\}), \qquad i = 1, \ldots, N \tag{4}$$

where m_i is the mass of atom i. Given the initial configuration of the atoms and the border conditions, these equations are integrated numerically to give the positions of the atoms at later times in incremental steps. The basic output of molecular dynamics (MD) is a set of particle trajectories, {r^N(t)} = (r_1(t), r_2(t), . . . , r_N(t)), the complete description of the model system as formulated in classical mechanics. Knowing how the system evolves in time one can proceed to determine all physical properties, equilibrium as well as dynamical, of interest. In molecular dynamics all the particles are displaced from one time step to the next in accordance with Eq. (4). Each particle therefore has an instantaneous velocity and kinetic energy. One can define an instantaneous temperature for the system as being proportional to the total kinetic energy. This quantity will fluctuate in time as the particles move through regions of different potential interaction. It is through these fluctuations that entropic effects enter into the simulation. Because of this property, MD is valid for classical systems at any temperature. In practice, the validity of the MD approach is limited to temperatures near or above the Debye temperature of the solid because it neglects zero-point vibration effects of a quantum mechanical nature. This neglect is not very serious since one can modify the MD procedure to take these effects into account. To the extent that the interatomic potential used holds for all interparticle separations, the simulation is valid for arbitrary deformation. It also follows that one can simulate the system behavior at any temperature desired, as well as the system response to a temperature change such as in a phase transition.
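As a sketch of how Eq. (4) is integrated in incremental steps, the following shows the widely used velocity-Verlet scheme; the force function, time step, and array shapes are assumptions of this illustration rather than a prescription from the text.

```python
import numpy as np

def velocity_verlet(pos, vel, masses, force_fn, dt, n_steps):
    """Integrate Eq. (4) with the velocity-Verlet scheme.

    force_fn(pos) must return the forces -grad U for the current
    configuration; dt is the time step, typically on the order of
    femtoseconds for solids. pos, vel are (N, 3) arrays.
    """
    traj = [pos.copy()]
    forces = force_fn(pos)
    for _ in range(n_steps):
        vel += 0.5 * dt * forces / masses[:, None]   # half-step velocity
        pos += dt * vel                              # full-step position
        forces = force_fn(pos)                       # forces at new positions
        vel += 0.5 * dt * forces / masses[:, None]   # second half-step
        traj.append(pos.copy())
    return np.array(traj)
```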
Relative to lattice statics, MD can be regarded as a method for determining how a model system which is relaxed at zero temperature behaves at finite temperature and under external stress. The effects of external stress can be treated either through the border conditions or through modifications of the equations of motion, Eq. (4), by introducing an appropriate Lagrangian [8]. By ‘behavior’ we mean here both the equilibrium properties, such as thermal expansion, and the mechanical responses, such as the elastic constants. Relative to lattice dynamics, MD provides a means to study anharmonic effects associated with large amplitudes of atomic displacements at elevated temperatures, arising from thermal activation or external stress. It is quite feasible for MD to verify the results of both lattice statics and lattice dynamics. The version of MD discussed here is based on classical mechanics; extensions to quantum MD are discussed in Chapter 2.
7. Monte Carlo
This is a companion atomistic simulation method to MD in that both require the same input of potential energy, initial atomic configuration of the system, and border conditions [23]. In contrast to MD, particle displacements in MC are chosen by sampling from a prescribed probability distribution function. In the 3N-configurational space defined by the positions of the atoms, the system is represented by a phase point at any step during the simulation. As the system evolves, this point moves through the configurational space following a certain trajectory. Unlike MD, there is no velocity involved in the simulation. The system temperature is therefore predetermined by its appearance in the probability distribution and does not change for the entire simulation. Whereas the particle trajectory in MD is determined by Hamiltonian or Newtonian dynamics, the trajectory in MC is determined by stochastic dynamics. Stochastic simulations using a sequence of random (actually only pseudo-random) numbers are sometimes also called Monte Carlo. In our usage, we will mean the method of simulation at finite temperature where the assignment of atom displacement is based on a procedure known as the Metropolis method, or any of its variations involving the concept of importance sampling. The Metropolis procedure is well known for sampling the canonical distribution,
$$f(\{r^N\}) \propto \exp\left(\frac{-U(\{r^N\})}{k_B T}\right) \tag{5}$$
where k_B is Boltzmann's constant and T the system temperature. The result of a Monte Carlo simulation is a sequence of N_s particle configurations, {r^N}_j, j = 1, . . . , N_s, where N_s specifies how many configurations one wishes to sample. This is the output corresponding to the particle trajectories, {r^N(t_k)},
produced by MD, where k = 1, . . . , N_t is the time-step index, N_t being the number of time steps one wishes to simulate. Since all the equilibrium properties of interest are calculated as ensemble averages over an appropriate distribution of particle positions, they can be obtained using either MC or MD. The assumption is that the system is ergodic, for which ensemble and time averages are in principle equivalent. In practice this requires that the number of configurations sampled and the number of time steps are both large enough. It is generally believed that MD and MC, when both are properly carried out, will give the same description of the equilibrium properties of the model system, although it is quite rare that this correspondence is explicitly demonstrated in any study [20]. For dynamical properties, the two methods are not expected to give the same results; the time-dependent response given by the MD trajectories is regarded as the physically meaningful one in the sense of classical dynamics. On the other hand, MD is intrinsically bound to the microscopic time scale determined by the interatomic forces in condensed matter, typically a fraction of a picosecond. In the MC approach a step in the sequence is governed by the transition probability, which can be formulated to describe any kind of particle displacement of interest. Thus, the stochastic method is more flexible in terms of the time step of simulation; it can be used to treat kinetic phenomena which occur on longer time scales than the natural time scales of MD, such as impurity or solute segregation at an interface. From the standpoint of computational efficiency, one can compare the four techniques discussed in terms of the number of times the system energy and interatomic forces are evaluated in each method. In this respect, lattice dynamics is the most efficient since the dynamical matrix has to be evaluated only once. In lattice statics, energy minimization is achieved typically in a few hundred iterations or less. In MD, the force calculation is made every time step unless some bookkeeping device is introduced to reduce the frequency of updating the system configuration. Roughly speaking, the efficiency of MC is comparable to that of MD in calculations of equilibrium properties. Where MC can have a significant advantage is in the study of ‘slow’ time-dependent phenomena; it can be the only viable means of atomistic simulation in special cases. Besides CPU-time efficiency, one should also consider storage requirements in comparing the different techniques. Here lattice dynamics is the least efficient since, for a system of N particles, it requires the storage of the dynamical matrix of order 3N × 3N. In practice this has limited LD to relatively small systems, containing typically 1000 atoms. Moreover, the CPU time required for matrix diagonalization increases approximately as N³, rendering the method inefficient for large systems. By contrast, MC simulation requires the least storage in that, unlike MD and LS, the interatomic forces are not required. Storage requirements for MC, LS and MD increase linearly with the number of particles.
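A minimal sketch of a Metropolis displacement sweep, sampling the canonical distribution of Eq. (5), is shown below. The energy function, maximum displacement, and temperature are placeholders; for efficiency, a real implementation would evaluate only the local energy change of the moved atom rather than the full system energy.

```python
import numpy as np

def metropolis_sweep(pos, energy_fn, kT, dmax=0.1, rng=np.random.default_rng()):
    """One Metropolis sweep: attempt a random displacement of every atom.

    A trial move is accepted with probability min(1, exp(-dE/kT)),
    which samples the canonical distribution of Eq. (5).
    """
    n = len(pos)
    for i in rng.permutation(n):
        old = pos[i].copy()
        e_old = energy_fn(pos)
        pos[i] += rng.uniform(-dmax, dmax, size=3)   # trial displacement
        dE = energy_fn(pos) - e_old
        if dE > 0 and rng.random() >= np.exp(-dE / kT):
            pos[i] = old                             # reject: restore position
    return pos
```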
It should be evident that each of the four methods discussed provides unique capabilities for the study of structure–property correlations in interface materials. We have shown in Fig. 2 how they can be linked together synergistically. There are three levels of correlations. Beginning with a given interface geometry, specified by five macroscopic degrees of freedom, one prepares the initial atomic configuration, the unrelaxed positions of all the atoms, by rotating two semi-infinite perfect-crystal blocks with respect to each other by an angle θ about the normal to the interface plane, n̂. At the second level, lattice statics methods are used to determine the relaxed structure and energy of the interfacial system, including the three microscopic degrees of freedom associated with translations parallel and perpendicular to the interface plane. Lastly, the relaxed structure is passed on to the third level at which finite-temperature properties are investigated by means of LD, MD, or MC, including the effects of externally applied stress. Suppose this approach is applied to intergranular fracture. LS methods are well suited to explore the zero-temperature correlation between the interface
[Flowchart: the interface geometry (n̂₁, n̂₂, θ; a₀(1)/a₀(2), crystal structure(s)) and the interatomic potential(s) define the planar unit cell, which is passed, with 2-d or 3-d PBCs, to lattice statics (LS), yielding the relaxed structure and energy at T = 0. This feeds lattice dynamics (LD), molecular dynamics (MD), and Monte Carlo (MC), which give, respectively, quasi-harmonic and harmonic elastic and thermodynamic properties, anharmonic effects and 'fast' dynamical phenomena, and equilibrium properties and 'slow' kinetic processes.]
Figure 2. Methods for atomistic modeling of grain-boundary systems showing their interrelated roles in the study of structure-property correlations [2].
geometry and the corresponding energy. Insights gained can be an invaluable guide to probe the effects of external stress and temperature, to unravel the correlation between the interface structure and its fracture properties. LD calculations are best suited to obtain local elastic constants, which play a central role in the mechanical response of a stressed system. MC simulations constitute the most powerful approach to the determination of the equilibrium distribution of solutes in the interface region, which are known to control the embrittlement behavior. The actual dynamical process of crack extension can be simulated by MD, thus leading to quantifiable insights that connect interfacial geometry and chemistry with fracture resistance.
8. Structural Disorder at an Interface
We now consider basic atomic structural and mobility properties that have been studied by atomistic simulations. The most common form of structural disorder in a crystal lattice is caused by the thermal movements of the atoms, usually leading to thermal expansion. This homogeneous type of disorder, and the consequent volume increase, originate in the anharmonicity of the interatomic interactions. In the presence of planar defects, structural disorder occurs even at zero temperature. The effect is now localized, usually also in the form of volume expansion. Thus, volume expansion may be viewed as a measure of structural disorder in the system, homogeneous and inhomogeneous. The radial distribution function, r²g(r), is a useful tool for characterizing the effects of structural disorder. One can look for two effects of thermal disorder here. The δ-function-like zero-temperature peaks associated with the shells of nearest, second-nearest, and more distant neighbors are broadened, and because of the volume expansion the peak centers are shifted toward larger distances. As illustrated in Fig. 3, the structural disorder at a solid interface gives rise to the same two effects even at zero temperature. Figures 3(a) and 3(b) show the radial distribution function for atoms in the planes nearest and next-nearest to a high-angle twist boundary on the (100) plane of an fcc metal, respectively [9]. In the plane closest to the GB, the perfect-crystal δ-function peaks have been replaced by a broad distribution, whereas in the second-closest plane, the ideal-crystal peaks are largely recovered. This illustrates the highly localized nature of the structural disorder at the interface. Figure 3(a) also illustrates the cause for the local expansion at the GB. The distances to the left of the arrows represent atoms squeezed more closely together than in the perfect crystal; because of anharmonicity these atoms repel each other more strongly, thereby giving rise to a local expansion. A commonly used measure to characterize atomic mobility is the atomic mean-squared displacement (MSD). MSD is essentially time independent in
Figure 3. Radial distribution function, r²g(r), for the two planes nearest to a (100) θ = 43.60° (Σ29) grain boundary as described by an EAM potential for Au. Arrows indicate the corresponding perfect-crystal peak positions. While atoms in the plane nearest to the interface (a) are very strongly affected by the presence of the interface, the atoms in the second-nearest plane (b) have an environment much closer to that of an ideal crystal.
the solid, while in the liquid it increases approximately linearly with time, with a proportionality constant that is a direct measure of the liquid diffusion constant. To study the structural disorder at grain boundaries, MSD is less useful than the magnitude of the static structure factor, S(k). We define S²(k) as

$$S^2(\mathbf{k}) \equiv |S(\mathbf{k})|^2 = \left[\frac{1}{N}\sum_{i=1}^{N}\cos(\mathbf{k}\cdot\mathbf{r}_i)\right]^2 + \left[\frac{1}{N}\sum_{i=1}^{N}\sin(\mathbf{k}\cdot\mathbf{r}_i)\right]^2 \tag{6}$$
where r_i is the position of atom i. Because a grain boundary is a planar defect, it is more appropriate to focus on the spatial ordering along the normal to the interface plane. We divide the simulation cell into slices along the z-direction
(perpendicular to the interface plane), with each slice chosen to contain a single atomic plane in the crystal, and define the planar structure factor, S_p²(k), using in Eq. (6) only those atoms lying in a given lattice plane. For an ideal crystal at zero temperature, S_p²(k) is unity for any wave vector, k, which is a reciprocal lattice vector in the plane p. In the liquid state (without long-range order in plane p), S_p²(k) fluctuates near zero. As the two halves of a bicrystal are rotated with respect to each other about the GB-plane normal, two different wave vectors, k₁ and k₂, are required, each corresponding to a principal direction in the respective half. For a well-defined crystalline lattice plane, say in semicrystal 1, S_p²(k₁) then fluctuates near a finite value (∼1) appropriate for that temperature, whereas S_p²(k₂) ∼ 0. In the GB region, due to the local disorder, one expects somewhat lower values for S_p²(k₁). By monitoring S_p²(k₁) and S_p²(k₂), every slice may be characterized as (a) belonging to semicrystal 1 (for S_p²(k₁) finite, S_p²(k₂) ∼ 0), (b) belonging to semicrystal 2 (for S_p²(k₁) ∼ 0, S_p²(k₂) finite), or (c) disordered or liquid (for S_p²(k₁) ∼ 0, S_p²(k₂) ∼ 0). See Chapters 6.9, 6.10 and 6.11 for further discussions of structural order parameters.
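A minimal sketch of the per-slice evaluation of Eq. (6) is given below; the slicing scheme and function name are ours, and the in-plane wave vector is assumed to be a reciprocal lattice vector of one of the semicrystals.

```python
import numpy as np

def planar_structure_factor(pos, k_vec, z_edges):
    """Planar structure factor S_p^2(k) of Eq. (6), one value per z-slice.

    pos     : (N, 3) atomic positions
    k_vec   : in-plane reciprocal lattice vector of one semicrystal
    z_edges : bin edges along z, one slice per lattice plane
    """
    s2 = []
    for z0, z1 in zip(z_edges[:-1], z_edges[1:]):
        plane = pos[(pos[:, 2] >= z0) & (pos[:, 2] < z1)]
        if len(plane) == 0:
            s2.append(0.0)
            continue
        phases = plane[:, :2] @ k_vec[:2]   # k . r for the in-plane k
        c = np.cos(phases).mean()
        s = np.sin(phases).mean()
        s2.append(c * c + s * s)            # ~1 crystalline, ~0 disordered
    return np.array(s2)
```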
9. Grain Boundary Sliding and Migration
When a bicrystal is brought into equilibrium at a finite temperature, thermal stresses can develop in the GB core and give rise to deformation or activated cooperative motions. Of particular interest are motions which result in a displacement of the interface with the GB structure remaining intact. One can imagine two modes of boundary displacement: a translation along the boundary plane by the upper half of the bicrystal relative to the lower half (sliding), and a movement of the boundary plane in a perpendicular direction (migration). It turns out that the individual atoms can move by relatively small distances and yet collectively they cause the boundary to slide and migrate. Because no large-amplitude atomic displacements are involved, these collective motions can be readily thermally activated and, therefore, observed in a dynamic simulation. Furthermore, it is reasonable to expect that sliding and migration will be coupled if the boundary is to continually reconstitute itself during its movements [10]. The coupled motions of sliding and migration, observed in two- and three-dimensional simulations, can be analyzed using the so-called DSC lattice. Figure 4 shows an example of the atomic displacements that have been observed in the bicrystal. One sees that, relative to the lower crystal, the atoms in the upper crystal, except for the atoms in the transition region, all undergo a lateral displacement equal to one unit of the DSC lattice in the x-direction. The
Figure 4. Atomic displacements in a molecular dynamics simulation of a symmetric tilt bicrystal which are associated with coupled GB sliding and migration [10]. As a result of these displacements which occurred under either thermal or shear activation, the boundary plane moved from position A to position B.
displacements of the atoms in the transition region are more complicated because they have to change allegiance from being part of the upper crystal to belonging to the lower crystal. In addition to thermal activation, boundary sliding and migration can be induced by external stress. This has been demonstrated in an MC study by applying a shear stress to the same two-dimensional bicrystal model discussed above [11]. With either thermal or stress activation, it was found that a threshold value exists below which boundary motion was not observed. See Article 6.7 for further discussions of sliding and migration.
10. Atomic Mobility
The study of atomic migration by MD is appropriate provided the time required for migration is not greater than the time interval of simulation. One measure of the necessary level of diffusivity is given by the behavior of the mean squared displacement ⟨r²⟩. If this displacement shows a linear increase with time over a period long compared to local fluctuations and correlations, then the motion can be considered as diffusive, and the diffusion coefficient can be obtained from the slope of the linear portion of ⟨r²⟩. Alternatively, if the discrete jumps between lattice sites can be monitored, one can deduce the diffusion coefficient from the observation of several hundred jumps. In the study of liquid-state dynamics, the mean squared displacement procedure is routinely used to calculate the diffusion coefficient, which typically has values of order 10⁻⁵ cm²/s.
Atomic diffusion along grain boundaries is an important metallurgical process for matter transport, especially at temperatures well below T_m, where it may be orders of magnitude more rapid than bulk diffusion. It is generally believed that GB diffusion occurs via a point-defect mechanism [12]. A related question concerns the action of the grain boundary as a source or sink for defects, and, in the presence of impurities, the role of diffusion in grain-boundary segregation. MD studies of GB diffusion have been carried out on a symmetrical tilt boundary in the fcc structure using the Lennard-Jones potential with 3-d PBC [13] and the Morse potential for Cu with 2-d PBC and fixed z-borders [14], as well as in the bcc structure using an empirical potential for α-Fe with fixed z-borders [15]. Figure 5 shows the bcc bicrystal model in the form of a stack of ten layers of (001) atomic planes, each containing 40 atoms. After the system was relaxed, a vacancy was introduced into one of the sites in the GB core (labeled as A, B, C, or D). The different sites are clearly not equivalent, as can be seen from the corresponding vacancy formation energies calculated separately by LS methods. Long simulation runs were carried out at several temperatures during which vacancy migration from one site to another was monitored. A typical jump sequence at 1500 K (the observed melting point of iron is about 1800 K) is shown in Fig. 6. From this kind of data one can extract a vacancy migration energy by assuming Arrhenius behavior for the jump frequency; a reasonable value of 0.51 eV was obtained in this case. In such simulations one also has the necessary details to investigate the relative frequency with which the vacancy visits the different sites, information that
Figure 5. Bicrystal model of a symmetric tilt boundary in an MD simulation of vacancy migration in bcc iron, (a) view showing the simulation cell, and (b) view of one of the (001) planes and the border regions, enclosed by dashed lines, containing fixed particles.
Figure 6. A portion of the vacancy migration trajectory observed during MD simulation at an elevated temperature. The length scale along [001] has been expanded by a factor of 5. Sites in the GB core are labeled A, B, C, D, with equivalent sites denoted by a prime. Three sequences are shown: vacancy migration predominantly in the GB plane (left sequence), migration involving an interstitial position I (middle sequence), and migration resulting in exchange of atoms at sites B and B′ (right sequence).
may be potentially useful for correlating the diffusion properties of a given GB with structural features of the boundary core. To obtain a diffusion coefficient from the simulation data one can use the jump-frequency information, or resort to a more fundamental quantity, the mean squared displacement (MSD),

$$\langle r^2(t)\rangle = \frac{1}{N}\sum_{i=1}^{N}\left\langle\left[\mathbf{r}_i(t) - \mathbf{r}_i(0)\right]^2\right\rangle \tag{7}$$
where ⟨···⟩ on the right-hand side means an average over different time origins t = 0 if one is dealing with a steady-state process; otherwise, the bracket notation should be ignored. The mean squared displacement therefore is a measure of how far an atom in the system migrates during a time interval t. The way it is defined in Eq. (7) assumes that all the atoms in the system behave in the same
Figure 7. Mean squared displacement of atoms observed in an MD study of GB diffusion [15], (A) atoms in the bulk region only, (B) all atoms in the simulation cell, and (C) atoms in the GB core only (sites A–D in Fig. 6).
way on average. If there is a particular atom or a small group of atoms one wishes to follow, then the factor (1/N) and the summation over atoms would have to be adjusted accordingly. Figure 7 illustrates the difference between atomic mobility in the bulk crystal and that in the GB core. The grain-boundary diffusion coefficient, D_GB, can be obtained from the definition D = ⟨r²(t)⟩/6t, for t large compared to any local relaxation times. However, in this approach there is an ambiguity in deciding which atoms should be included in calculating the MSD. Clearly, only those atoms that can be considered as belonging to the GB core should be counted in determining D_GB. The basic difficulty is that what is considered the GB region is not precisely defined; in other words, we do not have a unique way of defining the width of the GB. This problem becomes worse when the GB core becomes deformed and starts to migrate, which can happen at sufficiently high temperatures. See Article 6.7 for further discussion of GB diffusion.
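The MSD-to-diffusion-coefficient procedure of Eq. (7) can be sketched as follows; the trajectory layout, the atom mask used to restrict the sum to GB-core atoms, and the choice of fitting the second half of the run are all illustrative assumptions.

```python
import numpy as np

def diffusion_coefficient(traj, dt, atom_mask=None):
    """Estimate D from the MSD of Eq. (7), using D = <r^2(t)>/6t at large t.

    traj      : (n_frames, N, 3) unwrapped positions from an MD run
    dt        : time between stored frames
    atom_mask : boolean selector, e.g. atoms assigned to the GB core,
                so D_GB can be estimated separately from the bulk value
                (subject to the width ambiguity discussed in the text)
    """
    if atom_mask is not None:
        traj = traj[:, atom_mask, :]
    disp = traj - traj[0]                         # displacement from t = 0
    msd = (disp ** 2).sum(axis=2).mean(axis=1)    # average over atoms
    t = dt * np.arange(len(msd))
    half = len(msd) // 2                          # fit late-time linear part
    slope = np.polyfit(t[half:], msd[half:], 1)[0]
    return slope / 6.0
```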
11. Grain Boundary Segregation
Grain-boundary segregation is known to strongly influence not only the mechanical properties of interface materials [16], but also the atomic mobility at the GBs [21]. By segregation we mean the migration of solute atoms from the bulk to the GB and the subsequent distribution of these particles. As discussed above, MD is appropriate for the simulation of atomic migration,
provided the diffusivity is such that significant motions can be captured during the interval of simulation. In polycrystals with micron grain size or larger, atomic diffusion in the bulk would be too slow for MD to follow. On the other hand, in nanocrystals the possibility is more promising [see Article 6.10]. For the equilibrium distribution of the segregants, the problem is basically one of determining the proper chemical composition of the GB region, and allowing for atomic and overall volume relaxations to relieve the lattice strain. The Monte Carlo method [23] is well suited for this kind of study because it affords an efficient sampling of various lattice configurations in which pairs of atoms of different species are exchanged sequentially. If one wishes to perform only structural relaxation of a GB with a given impurity concentration and distribution, MD also can be used. The fundamental reason that MC results are the most physically meaningful is that entropy effects are taken into account so the free energy is minimized. There are different ways of performing the MC simulation depending on the ensemble distribution to be generated [17]. For problems where it is not necessary to let the total number of atoms fluctuate, it is convenient to work with either the isochoric canonical ensemble (NVT), where the particle number N, the system volume V, and the temperature T are held constant, or the isobaric canonical ensemble (NPT). Since the number of segregants in the GB is not known a priori, a grand canonical ensemble, either (µVT) or (µPT), with µ being the chemical potential, should be used. However, determining the chemical potential, which is required in applying the grand-canonical-ensemble MC (GCEMC) method, is itself a challenging task. The problem of simulating equilibrium segregation begins with finding the concentration of each atom species at the interface subject to the condition that the interfacial region is in chemical equilibrium with the bulk. Chemical equilibrium means the two regions have the same chemical potential, which must be known before one can apply GCEMC. If the chemical potentials have not been determined, then one needs to first perform a simulation of the bulk system where the chemical potentials are adjusted to give the desired concentrations in the bulk. The effect of the GCEMC procedure then is to generate system configurations according to the distribution
$$f(\{r^N\}, N_A, N_B) \propto \exp\left[-\frac{U(\{r^N\}) - \mu_A N_A - \mu_B N_B}{k_B T}\right] \tag{8}$$
in the case of a binary system, with N_A being the number of atoms of species A. Such a calculation is capable of giving the proper chemical composition at the interface while also taking care of the relaxation of lattice strain effects. We give an illustration of using MC to study the distribution of oversized Bi impurities in symmetrical and asymmetrical tilt boundaries in Cu [18]. Since these were rather early studies, very simple two-body interatomic potentials were employed; in particular, the cross interactions between atoms of
different species were fitted to the heat of mixing. A simulation cell consisting of three regions was set up in a manner similar to Fig. 1. In the interface region the atoms were allowed to exchange lattice sites as well as species identity. In the adjacent regions the atoms could exchange sites, but the average solute concentration was kept constant. The arrangement just described did not include local atomic relaxation effects, since the atoms were always assigned to lattice sites. To study relaxation effects a different cell was used in which there was only the central region and all borders were 3-d periodic. In this case the impurities were randomly distributed initially, and atoms were allowed to exchange positions with one of their nearest neighbors; in addition they were allowed to move according to the usual Metropolis procedure. The simulations showed that segregation was localized in the first few layers near the GB, with an increasing tendency to segregate as the temperature was decreased. The net effect of a large solute atom was an overall outward movement of the boundary atoms. The amount and distribution of segregants were different for different GBs. In the case of the asymmetrical tilt boundary the extent of segregation was slightly higher and the segregant profile at the interface became asymmetrical. A final comment is that solute-atom segregation at internal interfaces is a problem where simulation has been integrated into the interpretation of experiments [19].
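As a schematic illustration of how configurations distributed according to Eq. (8) can be generated, the sketch below shows one species-transmutation (identity-swap) Metropolis move for a binary system. The energy function, the species encoding, and the pre-determined chemical-potential difference are assumptions of this sketch, not details of the studies cited above.

```python
import numpy as np

def identity_swap_move(species, pos, energy_fn, mu_diff, kT,
                       rng=np.random.default_rng()):
    """One transmutation move sampling the distribution of Eq. (8).

    A randomly chosen atom is tentatively switched A <-> B; the move is
    accepted with probability min(1, exp(-(dU - dmu)/kT)), where
    mu_diff = mu_B - mu_A is the (pre-determined) chemical-potential
    difference between the two species.
    """
    i = rng.integers(len(species))
    e_old = energy_fn(pos, species)
    old = species[i]
    species[i] = 'B' if old == 'A' else 'A'
    dU = energy_fn(pos, species) - e_old
    dmu = mu_diff if old == 'A' else -mu_diff    # A -> B gains mu_B - mu_A
    if rng.random() >= np.exp(-(dU - dmu) / kT):
        species[i] = old                         # reject: restore identity
    return species
```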
12. An Outlook
The correlation between structure and property is a longstanding concept in materials science. Just as the study of materials has become a broad enterprise that encompasses many traditional disciplines, the concept of correlation has expanded from the classical route of processing–properties–performance relations, mostly for bulk materials, to include both fundamentals and innovations in areas that may be called chem–bio–nano [20]. The increased emphasis on the functionality of materials will mean correspondingly more opportunities for multiscale modeling and simulation, as discussed throughout this volume.
References

[1] C. Kittel, Introduction to Solid State Physics, 3rd edn., John Wiley & Sons, New York, 1966.
[2] S. Yip and D. Wolf, “Atomistic concepts for simulation of grain boundary fracture,” Mater. Sci. Forum, 46, 77–168, 1989.
[3] J.F. Lutsko, D. Wolf, S. Yip, S.R. Phillpot, and T. Nguyen, “Molecular-dynamics method for the simulation of bulk-solid interfaces at high temperatures,” Phys. Rev. B, 38, 11572–11581, 1988.
[4] J.R. Beeler, in: H. Herman (ed.), Advances in Materials Research, vol. 5, Wiley, New York, p. 295, 1970.
[5] A.A. Maradudin, E.W. Montroll, G.H. Weiss, and I. Ipatova, Theory of Lattice Dynamics in the Harmonic Approximation, Academic, New York, 1971.
[6] D.C. Wallace, Thermodynamics of Crystals, Wiley, New York, 1972.
[7] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon, Oxford, 1987.
[8] M. Parrinello and A. Rahman, “Polymorphic transitions in single crystals: a new molecular dynamics method,” J. Appl. Phys., 52, 7182–7190, 1981.
[9] S.R. Phillpot, D. Wolf, and S. Yip, “Effects of atomic-level disorder at solid interfaces,” MRS Bull., XV, 38–45, 1990.
[10] G.H. Bishop, R.J. Harrison, T. Kwok, and S. Yip, “Simulation of grain boundaries at elevated temperature by computer molecular dynamics,” in: J.W. Christian, P. Haasen, and T.B. Massalski (eds.), Progress in Materials Science, Chalmers Anniversary Volume, Pergamon, Oxford, pp. 49–95, 1981.
[11] R. Najafabadi and S. Yip, Scripta Metall., 18, 159, 1984.
[12] R.W. Balluffi, Metall. Trans. B, 13, 527, 1982; L. Peterson, Int. Metall. Rev., 28, 66, 1983.
[13] G. Ciccotti, M. Guillope, and V. Pontikis, Phys. Rev. B, 27, 5576, 1983.
[14] C. Nitta, “Computer simulation study of grain boundary diffusion in aluminum and aluminum–copper systems,” PhD Thesis, MIT, 1986.
[15] T. Kwok, P.S. Ho, and S. Yip, “Molecular-dynamics studies of grain-boundary diffusion, II. Vacancy migration, diffusion mechanism, and kinetics,” Phys. Rev. B, 29, 5363–5371, 1984.
[16] M.P. Seah, J. Phys. F, 10, 1043, 1985.
[17] J. Ray, “Elastic constants and statistical ensembles in molecular dynamics,” Comput. Phys. Rep., 8, 109–151, 1988.
[18] S. Foiles, Phys. Rev. B, 32, 7685, 1985.
[19] S. Foiles and D. Seidman, “Solute-atom segregation at internal interfaces,” MRS Bull., XV, 51–57, 1990.
[20] M.C. Flemings and S. Suresh, “Materials education for the new century,” MRS Bull., November, 918–924, 2001.
[21] W.G. Hoover and B.J. Alder, “Studies in molecular dynamics. IV. The pressure, collision rate, and their number dependence for hard disks,” J. Chem. Phys., 46, 686–691, 1967.
[22] H. Gleiter and B. Chalmers, Progr. Mater. Sci., 16, 77, 1972.
[23] K. Binder (ed.), Applications of the Monte Carlo Method in Statistical Physics, Springer-Verlag, Berlin, 1979.
6.9 STRUCTURE AND ENERGY OF GRAIN BOUNDARIES

Dieter Wolf
Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
Grain boundaries (GBs) are internal interfaces formed when two crystals that are misoriented relative to each other are brought into intimate contact. Together with the grain junctions (i.e., the lines and points where three or more GBs meet), GBs represent the elementary building blocks of polycrystalline microstructures. Their structure and physical behavior therefore control many of the thermal, mechanical and electrical properties of polycrystalline solids. A central theme in GB research addresses this interrelation between GB structure and physical properties. In this context it is usually understood that the term “structure” includes both the GB geometry and the underlying GB atomic structure. While the GB geometry involves the eight crystallographic degrees of freedom of the interface (Section 1.), the atomic structure is determined by the physics of the material, i.e., by the nature of the interactions between the atoms, leading to phenomena such as misfit localization or the amorphous structure of high-energy GBs (Section 3.). At an intermediate level, the concept of the “atomic-level geometry” (Section 2.) provides a natural link between the macroscopic geometry and the atomic structure. Here we review the key concepts for the description of GB “structure” at each of these three levels, with particular emphasis on how they can be connected to certain key GB physical properties, such as the GB energy.
1. Macroscopic Geometry
In addition to the crystal structure, the geometry of a bicrystalline GB is characterized by five “macroscopic” and three “translational” or “microscopic” degrees of freedom (DOFs). Their experimental characterization represents a major challenge. The translational DOFs are represented by the three components of a vector, T, associated with rigid-body translations parallel to
(two DOFs) and perpendicular to the GB (one DOF). In principle, a sixth parameter, representing the position of the GB, should be added to the five macroscopic DOFs; however, for an immobile GB this DOF is irrelevant.
1.1. Macroscopic Degrees of Freedom
The five macroscopic DOFs can be defined in different ways. The ultimate measure of any particular choice is its ability to expose GB physical properties in terms of the terminology thus introduced. Here we describe two methods for their definition (for details, see Ref. [1]). The traditional “CSL-misorientation” terminology focuses on the misorientation between two grains in terms of the coincident-site lattice [2–4]. While limited to commensurate interfaces (see Section 2.3.), this terminology is particularly useful for the description of dislocation boundaries (i.e., low-angle and vicinal GBs; see Section 3.2.). By contrast, the “interface-plane” terminology [1, 5] focuses on the GB plane and thus exposes the similarities between GBs, free surfaces and stacking faults. While not limited to commensurate GBs, it is particularly useful for the description of high-angle GBs (see Section 3.3.).
1.1.1. CSL-misorientation Method

Within the framework of the CSL description of GBs, three DOFs are identified with the CSL rotation, R(n̂_CSL, φ_CSL), about a rotation axis, n̂_CSL, by the angle φ_CSL; the remaining two DOFs are assigned to the GB plane. A GB is then thought of as having been generated by a relative rotation, R, of two identical, interpenetrating crystal lattices such that a three-dimensional superlattice in common to the two is generated. The unit-cell volume of this superlattice (the CSL) is Σ times larger than that of the two space lattices; the integer number Σ may therefore be thought of as the inverse volume density of CSL sites obtained for a given rotation. Following generation of the CSL, a GB is created by choosing as the GB plane any plane of the CSL and subsequently removing corresponding halves from each crystal lattice. The five DOFs are thus defined as follows:

$$\{\text{DOFs}\} = \{\hat{n}_{\mathrm{CSL}}, \phi_{\mathrm{CSL}}, \hat{n}_1\} \quad (\text{“CSL-misorientation method”}), \tag{1}$$
where n̂₁ represents the GB-plane normal in either of the two halves of the bicrystal (here chosen to be grain 1). Although redundant, as it is uniquely determined by the GB misorientation, Σ = Σ{R(n̂_CSL, φ_CSL)} is usually added as a sixth parameter in Eq. (1). Due to the requirement that, at least prior to allowing for rigid-body translations, a superlattice in common to the two grains exists, the CSL method is limited to commensurate GBs.
For dislocation boundaries the distinction between pure tilt (n̂_CSL ⊥ n̂₁), pure twist (n̂_CSL ∥ n̂₁) and mixed (or “general”) GBs is extremely useful as it provides information about the types and planar densities of dislocations present in the GB atomic structure, with the tilt and twist components, respectively, defining the edge- and screw-dislocation content. For “general” GBs (i.e., those with five DOFs, thus having both twist and tilt components), the CSL rotation may be viewed as consisting of the tilt rotation, R(n̂_T, ψ), followed by a twist rotation, R(n̂₁, θ); i.e.,

$$R(\hat{n}_{\mathrm{CSL}}, \phi_{\mathrm{CSL}}) = R(\hat{n}_1, \theta)\,R(\hat{n}_T, \psi), \tag{2}$$

where n̂_T and ψ are the tilt axis and tilt angle. The three rotation matrices in Eq. (2) involve a total of nine parameters for an overall misorientation characterized by only the three DOFs in n̂_CSL and φ_CSL; six relationships must therefore exist among n̂_CSL, φ_CSL, n̂_T, ψ, n̂₁ and θ. Due to this myriad of parameters in the CSL terminology, the correct number of macroscopic DOFs of any given GB and its atomic-level geometry are not readily apparent. For example, the fact that asymmetric tilt boundaries (ATGBs) have only four DOFs and hence represent a subset of general boundaries, and that, with only two DOFs, symmetric tilt boundaries (STGBs) are a subset of twist boundaries, is not obvious. In fact, because the misorientation represents already three DOFs, it is difficult to envision any GB having only two DOFs. To illustrate these difficulties, we start with the conceptually very simple pure (or symmetric) twist boundary, for which n̂_CSL ≡ ±n̂₁ (i.e., the CSL rotation axis is parallel to the GB plane normal). Equation (1) then becomes

$$\{\text{DOFs}\} = \{\pm\hat{n}_1, \phi_{\mathrm{CSL}}, \hat{n}_1\} \quad (\text{symmetric twist GB}), \tag{4}$$
revealing that such a GB has only three DOFs. The fact that, compared to a general GB with five DOFs, an asymmetric tilt boundary (ATGB; see Fig. 1) has only four DOFs while an STGB has only two is not as readily apparent, however. Generally, for tilt boundaries n̂_CSL ≡ n̂_T and φ_CSL ≡ ψ; Eq. (1) then yields

$$\{\text{DOFs}\} = \{\hat{n}_T, \psi, \hat{n}_1\} \quad (\text{tilt GB}). \tag{5}$$
From Eq. (5) it appears that a tilt boundary may have all five DOFs, similar to a general boundary (see Eq. (2)). However, because of the constraint that the tilt axis is perpendicular to the GB plane normal (i.e., n̂_T ⊥ n̂₁, or (n̂_T · n̂₁) = 0), the number of independent variables in Eq. (5) is reduced from five to only four; an ATGB has therefore only four DOFs. In a symmetric GB, generally n̂₂ = ±n̂₁; in a symmetric tilt boundary, the normals n̂₁ and n̂₂ are related by the rotation:

$$\hat{n}_2 = \pm\hat{n}_1 = R(\hat{n}_T, \psi)\,\hat{n}_1. \tag{6}$$
Figure 1. Definition of the four macroscopic degrees of freedom of an asymmetric tilt boundary (ATGB) [1, 5].
This constraint further reduces the number of independent variables in Eq. (5) from four to only two. This little-recognized fact means that an STGB has only two macroscopic DOFs, as do free surfaces and stacking faults. Within the CSL terminology this intimate geometrical relationship between these three rather different types of interfacial systems, each having only two DOFs, is far from obvious. Fortunately, however, within the interface-plane terminology, these similarities become readily obvious.
1.1.2. Interface-plane Method

GBs are planar defects on crystallographically well-defined lattice planes. Conceptually a bicrystal may therefore be thought of as having been formed by bringing two free surfaces, with normals n̂₁ and n̂₂, into contact, followed by a rotation about their common normal (see Fig. 2). The five macroscopic DOFs may then be simply defined as follows [5]:

$$\{\text{DOFs}\} = \{\hat{n}_1, \hat{n}_2, \theta\} \quad (\text{“interface-plane method”}). \tag{7}$$
The unit vectors n̂₁ and n̂₂ characterize the common GB-plane normal, n̂, with respect to the two principal (say, cubic) coordinate systems, (x₁, y₁, z₁) and (x₂, y₂, z₂), in the two halves of the bicrystal, i.e., the normals of the two free surfaces brought into contact (see Fig. 2). Since each unit vector represents two DOFs, the GB plane characterized with respect to each of the two grains represents four DOFs. The only remaining DOF is the one associated with a “twist” rotation, i.e., a rotation about n̂ by the angle θ; any other rotation would change n̂₁ and n̂₂.
Figure 2. Definition of the five macroscopic DOFs of a general (or “asymmetric twist”) GB within the interface-plane terminology [1, 5].
1.2. Symmetric vs. Asymmetric Tilt and Twist GBs
By contrast with the CSL terminology, the five DOFs defined in Eq. (7) readily permit the correct number of macroscopic DOFs of any given GB to be identified and its atomic-level geometry to be visualized. By definition, the angle θ describes a twist rotation. We therefore define θ = 0° as the angle for which the GB is of a pure tilt type (i.e., an asymmetric tilt GB, ATGB), such that the GB atomic structure contains only edge dislocations (see Fig. 1(c)). For θ > 0°, screw dislocations are introduced in addition to the edge dislocations already present for θ = 0°, leading to an increase in the GB planar unit-cell area. Hence, if the lattice planes associated with n̂₁ and n̂₂ are commensurate (see Section 2.3), the ATGB at θ = 0° has the smallest possible planar unit cell of any GB on the (n̂₁, n̂₂) GB plane.
Because of the inversion symmetry of Bravais lattices, a twist rotation by θ = 180° produces another ATGB with an identical planar unit-cell area and shape (see Section 2.2); however, its atomic structure differs from that at θ = 0° in that the planar stacking of lattice planes on one side of the GB is inverted with respect to that at θ = 0° [1, 5, 6]. The two ATGB configurations that exist on a given (n̂₁, n̂₂) GB plane are thus characterized by:

$$\{\text{DOFs}\} = \{\hat{n}_1, \hat{n}_2, \theta = 0^\circ \text{ or } \theta = 180^\circ\} \quad (\text{asymmetric tilt GB}). \tag{8}$$
It is then rather obvious that, with a fixed value of θ and hence only four DOFs, ATGBs represent a geometric subset of general (or “asymmetric twist”) GBs having five DOFs. Given the above definition of θ = 0°, the twist component, R(n̂₁, θ), of the general GB defined by Eq. (7) is also readily apparent. Its tilt component, R(n̂_T, ψ) (see Eq. (2)), is governed by the condition that n̂_T ⊥ n̂₁, n̂₂ (i.e., that the tilt axis lies in the GB plane; see also Fig. 2). Therefore

$$\hat{n}_T = [\hat{n}_1 \times \hat{n}_2]/\sin\psi, \tag{9}$$

with the tilt angle given by

$$\sin\psi = |[\hat{n}_1 \times \hat{n}_2]|. \tag{10}$$
Equations (9) and (10) illustrate the fact that the tilt component of a general GB is solely determined by the four DOFs associated with the (n̂₁, n̂₂) GB plane. For symmetric GBs, n̂₂ = ±n̂₁, reducing the number of DOFs in Eq. (7) from five to three, according to (see Eq. (8))

$$\{\text{DOFs}\} = \{\hat{n}_1, \pm\hat{n}_1, \theta\} \quad (\text{symmetric GB}). \tag{11}$$
θ = 0 is then conveniently chosen to represent the perfect crystal. As discussed in Section 2.2, in lattices with inversion symmetry θ = 180◦ yields the STGB configuration on a given lattice plane, characterized by the inversion of stacking at the GB. STGBs are therefore a subset of twist GBs, with only two DOFs since θ is fixed at 180◦ ; by contrast, in twist GBs θ represents a DOF.
2. Atomic-level Geometry
To better visualize GB geometry, the interface-plane description can be taken all the way down to the level of the underlying Bravais and crystal lattices. This leads naturally to the atomic-level description of GB geometry and provides a useful link between the macroscopic geometry and GB atomic structure. In addition to the five DOFs, two useful quantities in this description are the size of the GB planar unit cell (for periodic GBs) and the interplanar spacing parallel to the GB.
2.1. Planar Stacking in Bravais Lattices
A three-dimensional Bravais lattice is usually defined by three vectors, a₁, a₂ and a₃, such that any lattice point is given by r = l a₁ + m a₂ + n a₃ (l, m, n = 0, ±1, . . .) (see Fig. 3(a)). To visualize the planar arrangement of sites perpendicular to some arbitrary (rational) direction, n̂, a new “plane-based” set of Bravais vectors, c₁(n̂), c₂(n̂) and c₃(n̂), can be defined such that c₁(n̂) and c₂(n̂) lie within the planes with normal n̂ while c₃(n̂) connects points within different planes (see Fig. 3(b)). c₁(n̂) and c₂(n̂) thus define the primitive planar unit cell of the n̂ planes, while the out-of-plane component, d ∥ n̂, of c₃(n̂) = d + e enables one to proceed from one lattice plane to the next.
(αhkl = 0.5 or 1)
(12)
where a is the cubic lattice parameter and the value of α_hkl depends on the combination of odd and even Miller indices. (For example, in the fcc lattice, α_hkl = 1 if all (hkl) are odd but 0.5 otherwise.) The number of planes in the repeat stacking sequence, referred to as the stacking period P(hkl), is given by [1, 5, 6]

$$P(hkl) = \beta_{hkl}\,(h^2 + k^2 + l^2), \qquad (\beta_{hkl} = 1 \text{ or } 2), \tag{13}$$
a1 c1 x'
Figure 3. Two alternate methods for defining a three-dimensional Bravais lattice. (a) Conventional and (b) plane-based Bravais lattice.
1960
D. Wolf A
e B
d(hkl)
d C D E
Figure 4. Stacking of Bravais-lattice planes, schematically illustrated for a hypothetical five-plane stacking sequence, . . .|ABCDE|. . . [1, 5]. Table 1. Interplanar spacing, d(hkl), in units of the lattice parameter a (see Eq. (12)), for the most widely spaced planes in the fcc lattice. Also listed are the stacking period, P(hkl) (see Eq. (13)), and the values of (hkl) for an STGB on the (hkl) plane (see Eq. (18) below). [1] (hkl) 1 (1 1 1) 2 (1 0 0) 3 (1 1 0) 4 (1 1 3) 5 (3 3 1) 6 (2 1 0) 7 (1 1 2) 8 (1 1 5) 9 (5 1 3) 10 (2 2 1)
P(hkl)
d(hkl/a)
3 2 2 11 38 10 6 27 35 18
0.5774 0.5000 0.3535 0.3015 0.2294 0.2236 0.2041 0.1925 0.1690 0.1667
(hkl)
3 1 1 11 19 5 3 27 35 9
where, similar to αhkl , the value of βhkl depends on the combinations of odd and even (hkl). For example, the well-known . . . |ABC|ABC|. . . stacking of (111) planes in the fcc lattice is an example for P(hkl) = 3; the . . .|AB|AB|. . . stacking of the (100) and (110) planes is an example for P(hkl) = 2. Since each Bravais plane contains exactly one lattice site in the planar unit cell (see Figs. 3 and 4), its area, Amin (hkl), is related to d(hkl) via the unit-cell volume, : Amin (hkl)d(hkl) = .
(14)
The most widely spaced planes are therefore the ones with the smallest planar unit-cell areas; i.e., the densest planes of the crystal lattice. As an example, Table 1 lists the values of d(hkl) and P(hkl) for the 10 densest planes of the fcc lattice, in which = a 3 /4.
Structure and energy of grain boundaries
2.2.
1961
Symmetric Tilt Boundaries in Cubic Crystals
From a strictly geometrical viewpoint STGBs are fascinating, yet little understood objects since they are so closely related to free surfaces and stacking faults. These simplest of all crystalline interfaces share the common property that, apart from having different rigid-body translations, T, their geometry is fully characterized by only the two DOFs associated with the interface (hkl) plane. Consideration of their atomic-level geometry exposes this commonality (see Fig. 5). The atomic-level geometry of the STGB sketched in Fig. 5(c) is characterized by the familiar inversion at the GB of the stacking of the perfect-crystal lattice planes in Fig. 5(a) (see the shaded arrows on the right), a feature in common to all STGBs. In practice, the inverted configuration in Fig. 5(c) is unstable, giving rise to a rigid-body translation (Tx , Ty ) parallel to the GB accompanied by a volume expansion per unit GB area, Tz = δV , that minimizes the GB energy. A basic property of all lattices with inversion symmetry is that the stacking sequence in a given direction, hkl, may be inverted by a 180◦ twist rotation about hkl. In such lattices STGBs therefore represent special 180◦ twist boundaries [5, 6]. (a)
(b)
(c)
T⫽(Tx,Ty) A
E
B
D C B
e
C
d
D
A
d(hkl)⫹␦V
E
E
Ψ/2
E E
D
D
D
C
C
C
B
B
B
A
A Perfect crystal
Ψ/2 Ψ/2
A
Stacking Fault
Symmetrical Tilt ("Twin")
Figure 5. Geometrical relation between (a) planar stacking in the perfect crystal, (b) the stacking fault and (c) the STGB on the same (hkl) plane. (a) shows schematically two unit stacks of lattice planes, labeled |ABCDE|, in a (hypothetical) direction with a 5-plane stacking period (P(hkl) = 5), illustrating the constant in-plane and out-of-plane translations, e and d, as one proceeds from one plane to the next. In the stacking fault in (b), the upper grain has been merely translated parallel to the (hkl) interface plane. By contrast, in the STGB configuration in (c), the upper stack is inverted with respect to the lower one while preserving the shape and area of the perfect-crystal planar unit cell. In most STGBs, rigid-body translations parallel and perpendicular to the GB will destroy the mirror symmetry in (c) and give rise to a volume expansion per unit GB area, δV [1, 5].
Apart from having a different rigid-body translation, the stacking fault on the (hkl) plane (sketched in Fig. 5(b)) differs from the STGB merely by the absence of this stacking inversion (i.e., θ = 0°), while the free surface may be viewed as the T_z → ∞ limit. Obviously the STGB, stacking fault and free surface have equal planar unit-cell areas, with unit-cell dimensions that are identical to those of the perfect-crystal (hkl) plane in Fig. 5(a); this area is hence the smallest possible for any planar defect on the (hkl) plane. A special STGB translational state is the coherent-twin configuration, in which the GB is not only a mirror plane (see Fig. 5(c)) but also an atom plane. In the fcc lattice there is only one such STGB, namely the (111) twin, with the stacking sequence . . . |ABCBA| . . ., in which the C plane is the mirror plane (see Fig. 6). Starting from the . . . |ABC|ABC| . . . perfect-crystal stacking for P(111) = 3 (see Fig. 6(a)), a 180° twist rotation (or, because of symmetry, a 60° twist rotation) yields the inverted . . . |CBA|ABC| . . . STGB configuration in Fig. 6(b) which, after a rigid-body translation of the upper crystal relative to the lower one such that A → B (and hence B → C and C → A), results in the coherent-twin configuration in Fig. 6(c). The fact that only (hkl) planes with P(hkl) ≥ 3 can accommodate an STGB is also readily seen (see Fig. 7). For example, starting from the . . . |AB|AB| . . . perfect-crystal stacking for P(hkl) = 2 in (a) (e.g., for the (100) and (110) planes in the fcc lattice), a 180° twist rotation yields the inverted . . . |AB|BA| . . . “STGB” configuration in (b) which, after a rigid-body translation such that B → A (and hence A → B), converts back to the perfect crystal (see Fig. 7(c)). Given that an STGB has only two DOFs, namely those associated with the GB (hkl) plane, all six CSL parameters in n̂_CSL ≡ n̂_T, φ_CSL ≡ ψ, n̂₁, and
Figure 6. Generation of the coherent-twin configuration in (c) by a 180◦ twist rotation of the perfect-crystal configuration in (a). The STGB configuration in (b) is energetically unstable, inducing the rigid-body translation that results in the coherent twin [1, 5].
Figure 7. Hypothetical generation of an STGB on a lattice plane with P(hkl) = 2. The 180◦ twist rotation of the perfect-crystal configuration in (a) results in the characteristic, inverted stacking sequence in the STGB configuration in (b). However, a simple rigid-body translation of this configuration parallel to the interface plane re-establishes the perfect-crystal translational state in (c) [1, 5].
Σ(n̂_T, ψ) can be expressed in terms of the Miller indices [6]. However, applying Eqs. (9) and (10) to express for example n̂_T in terms of (hkl), at first sight it appears that STGBs have a vanishing tilt component because [n̂₁ × n̂₁] = 0. This apparent discrepancy originates from the fact that the underlying condition for symmetry, n̂₂ = n̂₁, is too narrow as it should include all combinations of crystallographically equivalent planes, (hkl)(±h ±k ±l) (see Section 1.2). In cubic crystals an arbitrary combination of Miller indices, (±h, ±k, ±l), can always be converted, by a sequence of 90° rotations about ⟨100⟩ which transform the crystal into itself, e.g., into a form (h k ±l). The three DOFs of a symmetric GB in a cubic crystal are therefore given by [1]

$$\{\text{DOFs}\} = \{(hkl), (hkl), \theta\} = \{(hkl), (hk\!-\!l), 180^\circ - \theta\} \quad (\text{symmetric twist}). \tag{15}$$
For θ = 180°, Eq. (15) describes the (one and only) STGB on the (hkl) plane. According to Eq. (15), it can be viewed either as a pure twist GB, generated by a 180° twist rotation of two perfect-crystal stacks, (hkl)(hkl), or as a pure tilt boundary, starting from two already inverted stacks, (hkl)(hk−l), with no twist component at all [see also Fig. 1(b)]. In the latter case, Eqs. (5) and (6) readily yield its tilt component in terms of (hkl) [1, 6]:

n̂_T = (h² + k²)^(−1/2) (−k, h, 0),   (16)

sin ψ = 2l(h² + k²)^(1/2) / (h² + k² + l²).   (17)
Finally, Σ may be expressed in terms of the Miller indices as well. As shown earlier [5, 6], Σ = Σ(hkl) is determined by the stacking period P(hkl). For an odd value of P(hkl), in the STGB configuration only two planes out of the 2P(hkl) planes in a double stack of (hkl) planes remain in perfect registry; i.e., Σ = 2P(hkl)/2 = P(hkl). More generally, for both odd and even values of P(hkl), one can show that [5, 6]

Σ = δ_hkl P(hkl) = β_hkl δ_hkl (h² + k² + l²) ≡ δ′_hkl (h² + k² + l²),   (18)

where Eq. (13) was used and δ′_hkl = β_hkl δ_hkl = 0.5 or 1, ensuring that Σ is always odd. Combining Eqs. (12) and (14), the STGB (or perfect-crystal) planar unit-cell area, A_min(hkl), may be related directly to Σ given by Eq. (18) as follows [5, 6]:

A_min(hkl) = a² α_hkl^(−1) (δ′_hkl)^(−1/2) Σ^(1/2) ≡ a² ε_hkl Σ^(1/2),   (19)

where ε_hkl = α_hkl^(−1) (δ′_hkl)^(−1/2) can assume the following four values: ε_hkl = 1, √2, 2 and 2√2. A small value of Σ is therefore necessary but not sufficient for the STGB on the (hkl) plane to have a small planar unit cell; the necessary and sufficient condition requires that for a particular (hkl) plane ε_hkl is small as well (for more on this distinction, see Section 3.1).
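Equations (16)–(18) lend themselves to direct evaluation. The minimal sketch below assumes coprime Miller indices with h and k not both zero; the single division by two implements δ′_hkl = 0.5, which suffices because a coprime index triple carries at most one factor of two in h² + k² + l²:

    import math

    def stgb_parameters(h, k, l):
        """Tilt component (Eqs. 16, 17) and Sigma (Eq. 18) of the STGB on (hkl)."""
        S = h * h + k * k + l * l
        sigma = S // 2 if S % 2 == 0 else S          # Eq. (18): Sigma is always odd
        norm = math.sqrt(h * h + k * k)
        n_T = (-k / norm, h / norm, 0.0)             # Eq. (16): tilt axis
        psi = math.degrees(math.asin(2 * l * norm / S))   # Eq. (17): tilt angle
        return n_T, psi, sigma

    for hkl in [(1, 1, 1), (1, 1, 3), (2, 2, 1)]:
        n_T, psi, sigma = stgb_parameters(*hkl)
        print(hkl, "axis", tuple(round(x, 3) for x in n_T),
              f"psi = {psi:.2f} deg, Sigma = {sigma}")

For (111), (113) and (221) this reproduces the familiar ⟨110⟩-tilt values ψ = 70.53°, 50.48° and 38.94°, with Σ = 3, 11 and 9, respectively.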
2.3. Commensurate Grain Boundaries
Within the CSL terminology, naturally only commensurate GBs are considered, i.e., those with special misorientations that result in a common superlattice, the CSL. Within the interface-plane terminology, a criterion for the commensurability of the lattice planes forming the GB is readily derived [1]. Two lattice planes, (h₁k₁l₁) and (h₂k₂l₂), are commensurate if under a general twist rotation a common planar unit cell exists which describes their periodic structures. This implies that n A_min(h₁k₁l₁) = m A_min(h₂k₂l₂), where n and m are positive integers. By definition, all symmetric GBs [for which (h₂k₂l₂) = (h₁k₁ ± l₁)] are therefore commensurate. Using Eqs. (12) and (14), this condition can be written as follows:

ε_{h₁k₁l₁}² (h₁² + k₁² + l₁²) / [ε_{h₂k₂l₂}² (h₂² + k₂² + l₂²)] = m²/n².   (20)

For two lattice planes to be commensurate, the ratio of the sums of the squares of their Miller indices therefore has to be itself a ratio, m²/n², of squares of integers m and n (for details, see Ref. [1]). As an example, Table 2 lists pairs of cubic lattice planes with the smallest planar unit cells that are commensurate with the (111), (100) and (110) planes.
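If the ε_hkl factors in Eq. (20) are taken to be equal for the two planes, commensurability reduces to the requirement that the ratio of the sums of squared Miller indices be, in lowest terms, a ratio of perfect squares. A brute-force sketch in this spirit reproduces the (111) column of Table 2:

    from math import gcd, isqrt

    def commensurate(p1, p2):
        """Eq. (20) with equal epsilon factors: S1/S2 in lowest terms must
        be a ratio of perfect squares."""
        S1, S2 = (sum(i * i for i in p) for p in (p1, p2))
        g = gcd(S1, S2)
        a, b = S1 // g, S2 // g
        return isqrt(a) ** 2 == a and isqrt(b) ** 2 == b

    hits = [(h, k, l)
            for h in range(20) for k in range(h, 20) for l in range(k, 20)
            if (h, k, l) != (0, 0, 0)
            and gcd(gcd(h, k), l) == 1
            and commensurate((1, 1, 1), (h, k, l))]
    print(hits[:8])   # the eight (111)-column entries of Table 2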
Table 2. Cubic lattice planes with the smallest unit cells which are compatible with the (111), (001) and (011) planes, respectively (see Eq. (20)) [1]

No   (h₂k₂l₂)   (h₁k₁l₁)    m²/n²     (h₂k₂l₂)   (h₁k₁l₁)   m²/n²     (h₂k₂l₂)   (h₁k₁l₁)   m²/n²
1    (1 1 1)    (1 1 1)     1         (0 0 1)    (0 0 1)    1         (0 1 1)    (0 1 1)    1
2    (1 1 1)    (1 1 5)     9         (0 0 1)    (2 2 1)    9         (0 1 1)    (1 1 4)    9
3    (1 1 1)    (1 5 7)     25        (0 0 1)    (4 3 0)    25        (0 1 1)    (0 7 1)    25
4    (1 1 1)    (1 5 11)    49        (0 0 1)    (2 3 6)    49        (0 1 1)    (3 4 5)    49
5    (1 1 1)    (1 11 11)   81        (0 0 1)    (1 4 8)    81        (0 1 1)    (1 4 9)    81
6    (1 1 1)    (5 7 13)    81        (0 0 1)    (4 4 7)    81        (0 1 1)    (3 5 8)    81
7    (1 1 1)    (1 1 19)    121       (0 0 1)    (6 7 7)    121       (0 1 1)    (4 5 11)   121
8    (1 1 1)    (5 7 17)    121       (0 0 1)    (2 6 9)    121       (0 1 1)    (7 7 8)    121

3. Grain-boundary Atomic Structure
The atomic structure of GBs is determined by the physics of the material. In accordance with their fundamentally different structures, we distinguish between three types of GBs. Special boundaries, involving the densest lattice planes and smallest planar unit cells, give rise to energy cusps in the phase space spanned by the five macroscopic DOFs. For small angular deviations from these, arrays of well-separated dislocations are formed. The resulting dislocation (i.e., low-angle or vicinal) GBs are to be distinguished from high-angle GBs, in which the dislocation cores overlap completely.
3.1. “Special” Boundaries
It has often been stated that the geometric reason for the existence of special GBs is the low value of Σ associated with special misorientations. Much evidence to the contrary suggests, however, that a low value of Σ does not guarantee an especially low GB energy. Instead, the main reasons for the appearance of energy cusps are the existence of (i) special GB planes, i.e., planes with large values of the interplanar spacing, d(hkl), parallel to the GB, and (ii) GBs with especially small planar unit cells (see, e.g., Refs. [5] and [7]). Similar to special (i.e., step-free or flat) free surfaces, special GBs contain no dislocations. The comparison of the structure of a stepped surface with that of a low-angle symmetric tilt boundary on the same lattice plane (see Fig. 8) demonstrates the close similarities between these simple types of interfaces. The steps in the free surface, with height h, are merely replaced by GB dislocations, with Burgers vector b.
Figure 8. Step and dislocation structures of vicinal free surfaces and STGBs (schematic). Although both systems have only two DOFs (those associated with the interface plane), for vicinal interfaces the three-parameter description in terms of the tilt (or pole) axis, n̂_T, and angle, ψ, associated with the vicinal misorientation is more useful. n̂_cusp and n̂_v denote the normals of the “cusped” (or “special”, i.e., line-defect-free) and vicinal interface planes (see also Ref. [10]).
3.1.1. Special GB planes

STGBs represent ideal model systems for deconvoluting the distinct roles of Σ(hkl) and d(hkl) in the appearance of special GBs, since both parameters are governed by (hkl) [see Eqs. (12) and (18)]. Due to their close similarity, it is also instructive to compare STGBs with free surfaces. The appearance of GB dislocations or surface steps for small vicinal deviations from an energy cusp provides a fingerprint enabling the identification of special STGBs and surfaces. As illustrated in Fig. 8, in spite of having only two DOFs, the dislocation and step structures of these vicinal interfaces are best characterized by the three CSL parameters associated with the tilt
axis, n̂_T, and vicinal angle, Δψ = arccos(n̂_v · n̂_cusp), where n̂_cusp and n̂_v are the “special” and vicinal normals. As an example, Fig. 9 compares the simulated energies of fcc free surfaces and STGBs with a common ⟨110⟩ tilt axis, plotted against the tilt angle, ψ = ψ_cusp + Δψ. Given that each data point in Fig. 9 represents a different plane, the appearance of cusps indicates an extreme sensitivity of the surface and STGB energies to the interface plane. In fact, independent of the interatomic potential used [7] and in agreement with experiments [8], the STGBs exhibit cusps for the four fcc planes with the largest d(hkl) values (see Table 1). (We note that, with P(hkl) = 2, the (110) and (100) cusps at ψ = 0 and 180° correspond to perfect-crystal configurations; see Table 1 and Fig. 7.) That special free surfaces usually involve the two or three densest lattice planes has long been known (see, e.g., Ref. [9]); indeed, Fig. 9 reveals energy cusps associated with the (111) and (100) planes. All other STGBs and surfaces are vicinal to these special ones, and it is the spacing between the line defects (i.e., Δψ = ψ − ψ_cusp) rather than d(hkl) that determines their energy [5].
Figure 9. Simulated energies of fcc free surfaces and STGBs with a common ⟨110⟩ tilt axis plotted against the tilt angle, ψ [7]. For STGBs such plots were first obtained from both experiments and computer simulations by Hasson et al. [8].
In spite of their low Σ values, remarkably absent in Fig. 9 are major cusps associated with the Σ3(112), Σ9(221), Σ9(114) and Σ11(332) STGBs. The reason is that these planes have rather small d(hkl) values, demonstrating that a low Σ value alone is not sufficient to ensure a low STGB energy; a necessary and sufficient condition is that d(hkl) be large (see also Eq. (19)). The reason for the existence of special interface planes is readily understood in terms of a broken-bond model. The number of nearest-neighbor (nn) bonds within one of the densest planes is particularly large, while the number of bonds per unit area across these planes is particularly small. Forming a free surface on such a plane therefore requires the smallest number of surface bonds to be broken; similarly, forming a GB on such a plane involves breaking, bending or stretching the smallest number of interplanar bonds per unit GB area. Another explanation, valid for GBs only, is that rigid-body translations parallel to the GB provide a particularly powerful relaxation mechanism if the GB area is very small [5].
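The large-d(hkl) criterion can be checked numerically. For the fcc lattice referred to its conventional cubic cell of lattice parameter a, the spacing of a coprime (hkl) plane stack is a/√S when h, k and l are all odd, and a/(2√S) otherwise (S = h² + k² + l²; this parity factor plays the role of the δ_hkl entering Eq. (12)). Ranking the low-index planes then recovers the four densest ones:

    import math
    from math import gcd

    def d_fcc(h, k, l, a=1.0):
        """Interplanar spacing of a coprime (hkl) stack in the fcc lattice."""
        S = h * h + k * k + l * l
        delta = 1.0 if all(i % 2 for i in (h, k, l)) else 0.5
        return a * delta / math.sqrt(S)

    planes = [(h, k, l)
              for h in range(4) for k in range(h, 4) for l in range(k, 4)
              if (h, k, l) != (0, 0, 0) and gcd(gcd(h, k), l) == 1]
    for p in sorted(planes, key=lambda q: -d_fcc(*q))[:4]:
        print(p, round(d_fcc(*p), 3))   # (111), (001), (011), (113)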
3.1.2. Boundaries with small planar unit cells: STGB and ATGB cusps

For STGBs a large value of d(hkl) is equivalent to a small GB planar unit-cell area, A_min(hkl) (see Eq. (19)). By contrast, in twist boundaries d and A are independent of each other. For twist angles other than those of the perfect crystal at θ = 0° and of the STGB at 180°, the GB area is given by

A(hkl, θ) = Γ(θ) A_min(hkl),   (21)

where the planar density of CSL sites, Γ(θ) ≥ 1, indicates by how much A(hkl, θ) exceeds A_min. By definition, A(hkl, 0°) = A(hkl, 180°) = A_min(hkl); i.e., Γ(0°) = Γ(180°) ≡ 1. For planes with P(hkl) ≤ 2, Γ(θ) is identical to Σ, the inverse volume density of CSL sites. For vicinal deviations from θ = 0° or 180°, screw dislocations are formed; any twist GB on the (hkl) plane therefore has a larger GB area (Γ(θ) > 1) than either the perfect crystal or the STGB. As seen in Fig. 10, this geometrical uniqueness gives rise to energy cusps at θ = 0° and 180°. The STGB on any given lattice plane is therefore special with respect to all other twist GBs on the same plane. While all STGBs are therefore special GBs with respect to vicinal twist deviations, some STGBs are clearly more special than others. As seen from Fig. 9, the STGBs on the four densest planes stand out in that they are special also with respect to vicinal tilt deviations; i.e., they represent cusps with respect to both tilt and twist vicinal deviations. Figure 10 also demonstrates that the large-d(hkl) criterion for low GB energy is valid even for high-angle twist GBs. An expression analogous to Eq. (21), in which A(hkl, θ) is replaced by A(h₁k₁l₁, h₂k₂l₂, θ), is valid also for asymmetric GBs provided the two sets of planes are commensurate; θ = 0° and 180° then correspond to the two ATGB
Figure 10. Simulated energies of twist boundaries on the four densest planes of the fcc lattice, plotted against mθ (m = 2 for the (001) plane, m = 3 for the (111) plane); Cu(LJ) [7].
configurations with the same, smallest planar GB area possible for this combination of planes, A_min(h₁k₁l₁, h₂k₂l₂), for which Γ(0°) = Γ(180°) ≡ 1. These two ATGB configurations differ from one another merely by the inversion of the stacking at the GB in one relative to the other. As in the symmetric case (Fig. 10), due to their small GB area these geometrically unique GBs give rise to energy cusps [7]. ATGBs are therefore special with respect to asymmetric twist (or general) GBs involving the same combination of lattice planes. In summary, due to their small GB areas, all tilt GBs, symmetric or asymmetric, are special with respect to vicinal twist rotations. Moreover, similar to the STGBs, those ATGBs on GB planes with a large “effective” interplanar spacing, d_eff = [d(h₁k₁l₁) + d(h₂k₂l₂)]/2, are special even with respect to vicinal tilt misorientations. In practice, d_eff is relatively large for any combination of lattice planes involving one of the densest planes on one side of the GB [7].
3.2. Dislocation Boundaries
The CSL terminology in Eq. (1) is particularly useful for the description of dislocation GBs. For small vicinal deviations from a special GB, dislocations separated by strained perfect-crystal regions are formed to accommodate the misfit between the two grains. Depending on whether the vicinal deviation involves a pure tilt or twist rotation, the misfit dislocations are of pure edge or screw type; in general, however, they have mixed character.
3.2.1. Grain-boundary dislocations vs. surface steps

Since their atomic structures are determined by the vicinal deviation, Δψ, from a nearby “special” interface, it is instructive to compare the atomic structures of vicinal STGBs and free surfaces (Fig. 8). The main geometric difference between them is the replacement of the Burgers vector, b, in Fig. 8(a) by the step height, h ≡ |b| = b, in Fig. 8(b). The spacing, δ, between the steps, or between each of the two sets of edge dislocations, is given by the Frank formula, δ = b/sin Δψ. Denoting the line energy per unit length by Γ(δ), the interface energy may be written as follows [10]:

γ(Δψ) − γ_cusp cos Δψ = n [Γ(δ)/b] sin Δψ,   (22)

where n = 1 for surfaces and n = 2 for STGBs. The mutual repulsion between steps, on one hand, and dislocations, on the other, due to their overlapping elastic strain fields may be incorporated by decomposing Γ(δ) = Γ_core + Γ_el(δ) into its core and strain-field contributions and assuming that only Γ_el depends on δ. For steps, Γ_el(δ) = G_el^s−s/δ² [11], while for dislocations Γ_el(δ) = −G_el^d−d ln(b/δ) [12], with the “elastic interaction strengths”, G_el^s−s and G_el^d−d, governed by the elastic constants. Inserting these expressions into Eq. (22), for the surfaces we obtain [10]

γ^S(Δψ) − γ^S_cusp cos Δψ = γ^S_core(Δψ) + γ^S_el(Δψ) = (Γ^S_core/h) sin Δψ + (G_el^s−s/b³) sin³ Δψ,   (23)

while for the STGBs [10]

γ^GB(Δψ) − γ^GB_cusp cos Δψ = γ^GB_core(Δψ) + γ^GB_el(Δψ) = 2 [(Γ^GB_core/b) sin Δψ − (G_el^d−d/b) sin Δψ ln sin Δψ].   (24)
Here γ_core and γ_el are the core and strain-field contributions to the interface energy. We note that Eq. (24) is valid for both tilt and twist boundaries. In the low-angle limit (Δψ → 0), Eq. (24) reduces to the famous expression

γ^GB(Δψ) − γ^GB_cusp cos Δψ = 2Δψ [Γ^GB_core/b − (G_el^d−d/b) ln(Δψ)],   (25)
first derived by Read and Shockley [12], a derivation that also yields expressions for G_el^d−d for symmetric tilt and twist GBs in terms of the elastic moduli. The qualitatively different behaviors of STGBs and free surfaces for small vicinal angles Δψ predicted by these expressions are clearly visible in Fig. 9.
According to the Read–Shockley formula (25), for Δψ → 0 the GB energy, γ^GB(Δψ) ∼ γ^GB_el(Δψ) ∼ Δψ ln(Δψ), is dominated by the elastic strain-field term and decreases towards the cusp with a logarithmically diverging slope at Δψ = 0. By contrast, due to the much weaker step–step interactions, for Δψ → 0 the surface energy is dominated by the core term and therefore decreases sinusoidally, γ^S(Δψ) ∼ γ^S_core(Δψ) ∼ sin(Δψ), with a finite slope at each cusp (see Eq. (23)). Combined with Eqs. (23) and (24), Fig. 8 demonstrates the little-known fact that cleavage decohesion of an STGB into free surfaces may be conceptualized as the reversible conversion of edge dislocations into surface steps, i.e., of long-ranged dislocation strain fields into the short-ranged elastic strain fields near surface steps. The work of adhesion, W_ad = 2γ^S − γ^GB, is therefore given by the difference between their respective core and elastic line energies [10].
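The contrasting cusp shapes follow directly from Eqs. (23) and (24). The sketch below evaluates the right-hand sides of both equations, i.e., γ(Δψ) − γ_cusp cos Δψ, with arbitrary illustrative values for the core and elastic parameters, and prints the slopes near the cusp:

    import numpy as np

    d_psi = np.radians(np.linspace(0.01, 15.0, 300))   # vicinal angle in radians

    core_S, G_ss = 1.0, 0.3      # surface: step-core and step-step parameters
    gamma_S = core_S * np.sin(d_psi) + G_ss * np.sin(d_psi) ** 3       # Eq. (23)

    core_GB, G_dd = 1.0, 0.3     # GB: dislocation-core and interaction parameters
    gamma_GB = 2 * (core_GB * np.sin(d_psi)
                    - G_dd * np.sin(d_psi) * np.log(np.sin(d_psi)))    # Eq. (24)

    # finite slope for the surface, logarithmically divergent slope for the GB
    print("surface slope near cusp:", np.gradient(gamma_S, d_psi)[0])
    print("GB slope near cusp:     ", np.gradient(gamma_GB, d_psi)[0])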
3.2.2. The principle of misfit localization

The formation of line defects may be viewed as a manifestation of a simple and rather general physical principle according to which a system responds to a structural inhomogeneity by localizing (“screening”) the disorder; i.e., a localized type of disorder is energetically preferred over a more spread-out, delocalized and inhomogeneous type of disorder. This fundamental principle originates from the anharmonicity present in virtually all interatomic interactions. Due to anharmonicity, interatomic distances shorter than the nearest-neighbor distance cause a much greater increase in energy than longer distances do. By localizing the structural disorder, the number of these energetically most unfavorable atoms is minimized while the vast majority of atoms is enabled to relax into perfect-crystal-like, elastically distorted local environments, thus minimizing the energy cost of the structural disorder. For the case of vicinal GBs, this translates into the localization of the misfit within the highly defected dislocation cores separated by perfect-crystal-like regions.
3.3. High-angle Boundaries
There has been much debate on the distinction between low-angle and high-angle GBs, in particular on whether a dislocation picture is appropriate at high angles and how the Read–Shockley model might possibly be extended to high-angle GBs. For example, in 1952 C.S. Smith stated [13]: “The success of dislocation theory simply and rationally applied to GBs is a great achievement . . .”, and he suggested that “. . . there is a continuous transition from the pure dislocation boundary to whatever exists at higher angles . . .”. In support of this statement he noted that the experimental evidence “. . . does not in any way preclude a nearly horizontal curve of energy vs. misorientation once
the maximum value has been achieved”, implying that the energy of high-angle GBs is independent of the misorientation between the grains: “[Their structure] can undoubtedly be described formally in terms of dislocations, [although they are] clearly of a different type and the strain involved in the misfit has at least the appearance of being of shorter range”. That Smith’s intuition was correct is illustrated in Fig. 10, which shows that the energy increases monotonically with misorientation and levels off at a maximum value, suggesting that there is, indeed, a continuous transition from low-angle to high-angle GBs. Figure 11 shows a least-squares fit of the Read–Shockley expression (24) to the (100) data in Fig. 10, demonstrating that, in spite of the rather restrictive underlying assumptions, Eq. (24) (in which Γ^GB_core/b and G_el^d−d/b are treated as adjustable parameters) represents the simulation data surprisingly well. Remarkably, the strain-field contribution, γ^GB_el(Δψ), does not vanish at Δψ ∼ 15°, where the dislocation cores start to overlap; rather, it goes through a maximum and then gradually decreases toward zero. The sinusoidal core contribution, γ^GB_core(Δψ), saturates only as the angle Δψ ≡ 2θ approaches 90°. By comparison, the total GB energy, γ^GB_core(Δψ) + γ^GB_el(Δψ), converges much more rapidly, reaching a plateau already at Δψ ∼ 40° (see Fig. 10 for other GB planes). According to Eq. (24), the plateau energy is given by γ^GB(90°) ≡ γ^GB_core(90°) = 2Γ^GB_core/b, i.e., by the dislocation-core energy, which depends on b and, hence, on the GB plane (see Fig. 11).
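A fit of the kind shown in Fig. 11 takes only a few lines; in the sketch below, synthetic data stand in for the simulation points of Fig. 10, and Γ^GB_core/b and G_el^d−d/b are the adjustable parameters mentioned above:

    import numpy as np
    from scipy.optimize import curve_fit

    def read_shockley(psi_deg, core, g_el):
        """Right-hand side of Eq. (24)."""
        s = np.sin(np.radians(psi_deg))
        return 2.0 * (core * s - g_el * s * np.log(s))

    # synthetic stand-in for the (100) twist data of Fig. 10 (mJ/m^2)
    psi = np.array([5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90], float)
    gamma = read_shockley(psi, 260.0, 90.0) + np.random.normal(0.0, 5.0, psi.size)

    (core_fit, g_fit), _ = curve_fit(read_shockley, psi, gamma, p0=(200.0, 50.0))
    print(f"core term ~ {core_fit:.0f} mJ/m^2, elastic term ~ {g_fit:.0f} mJ/m^2")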
Figure 11. Least-squares fit of the Read–Shockley expression (24) to the (100) twist-boundary data in Fig. 10 [7].
As suggested by Smith [13], the above results demonstrate that, indeed, the energy of “true” high-angle GBs, i.e., those with a misorientation-independent energy, consists of both core and elastic strain-field contributions. Although for large angles the core energy clearly dominates, a residual strain energy arises from the elastic distortion of the dislocation cores.
3.3.1. Polyhedral-unit and broken-bond models

That the atomic structure of high-angle GBs is composed of elastically distorted dislocation cores (“structural units”) is captured in the polyhedral-unit model [14, 15]. Clearly, the requirements of space filling at the GB and of the compatibility of the structural units with the adjoining grains cannot be satisfied simultaneously unless the polyhedra are elastically distorted in a manner that depends systematically on the misorientation. Unfortunately, because the model ignores these distortions and, moreover, offers no quantifiable measure of even the undistorted GB structure, it has virtually no predictive power as far as GB properties are concerned. The model can be at least partially quantified by identifying the number of broken bonds per unit GB area associated with each type of structural unit [16]. Although elastic distortions of the units are thus neglected, this approach captures the broken-bond nature of the dislocation cores (i.e., the sinusoidal core contribution, γ^GB_core(Δψ), in Eq. (24) and Fig. 11) and hence becomes more quantitative as the elastic strain-field contribution, γ^GB_el(Δψ), decreases towards zero for Δψ → 90°.

In broken-bond models the atomic structure is generally quantified via the number of broken nearest-neighbor bonds per unit interface area. Given the rather weak elastic interactions between steps, such models work particularly well for surfaces because they quantify the dominating core contribution in Eq. (23). By contrast, these models are less quantitative for internal interfaces because they cannot capture the dominant elastic contribution due to the dislocations [16]. For example, whereas for small angles the GB energy is dominated by the (logarithmic) elastic contribution in Eq. (24), GB diffusion takes place within the cores (“pipe diffusion”) and is hence dominated by the (sinusoidal) core contribution. By contrast with the energy, GB diffusion can therefore be quantitatively predicted from a broken-bond description.

In another group of models, known as hard-sphere models, the optimum translation parallel to the interface plane is assumed to be the one that minimizes the volume expansion at the GB. Although based on unrelaxed atomic structures, these models provide, via the volume expansion at the GB, at least a rough quantitative measure of the degree of GB structural disorder.
3.3.2. Short-range vs. long-range structural disorder

In simulations, atomic-level GB structural disorder is best characterized in terms of the plane-by-plane radial distribution function, G(r), or its Fourier transform, S(k), averaged over the atoms in each of the lattice planes near the GB. The two structural measures are illustrated in Figs. 12 and 13 for the simulated (110) θ = 50.48° (Σ11) twist GB and for the STGB on the (123) plane in fcc Pd. Figures 12(a) and (b) compare the G(r)’s for the atoms in the two center planes of each GB with that of bulk amorphous Pd (obtained by MD simulation of a rapidly quenched melt; see Ref. [17]). While the structure of the twist boundary in (a) is virtually indistinguishable from that of the bulk glass, with the characteristic split second peak indicated by the arrows, the more pronounced peak structure for the (123) STGB in (b) indicates a significant degree of residual crystallinity.

These differences in the degree of long-range GB structural disorder can be quantified via the square of the planar structure factor, |S(k_α)|² (α = 1 or 2), where the wave vectors k₁ and k₂ lying within each lattice plane are chosen to be the reciprocal-lattice vectors of grains 1 and 2, respectively, with the smallest magnitude |k_α|. Then, for a perfect-crystal plane at zero temperature belonging to grain 1, |S(k₁)|² = 1 and |S(k₂)|² = 0; similarly, |S(k₁)|² = 0 and |S(k₂)|² = 1 for planes belonging to grain 2. At finite temperature, due to the vibration of the atoms, the planar long-range order decreases by the Debye–Waller factor. By contrast, in a liquid or an amorphous solid, due to the absence of long-range order, both |S(k₁)|² and |S(k₂)|² have near-zero values.

The plane-by-plane structure factors for the two GBs are compared in Fig. 13. The center of the (110) twist GB is, indeed, highly disordered, as evidenced by the low values of |S(k)|² ≈ 0.25 in the two center planes, indicating only a 25% residual crystallinity; by comparison, the (123) STGB is 94% crystalline even in the two planes immediately at the GB. In spite of these significant differences in the degree of long-range structural disorder, the two GBs are remarkably similar as far as short-range disorder is concerned, as evidenced by the rather similar shapes and widths of the nn peaks in G(r) in Figs. 12(a) and (b), consistent with the rather similar GB energies of 1027 mJ/m² for the twist GB and 881 mJ/m² for the STGB (see Ref. [17]). This comparison demonstrates that, in spite of fundamentally different long-range structures, the two GBs exhibit comparable degrees of short-range structural disorder, translating into similar energies. For GB properties dominated by the short-range disorder, such as the energy and self-diffusion behavior (and probably most other properties), the degree of long-range structural periodicity therefore appears to be of little importance (see also Ref. [17]). Hence, although the information in |S(k)| is equivalent to that in G(r), the
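The planar structure factor defined above is straightforward to evaluate from simulation coordinates; a minimal numpy sketch (the square-lattice plane and the random “liquid-like” plane below are placeholders for actual snapshot data):

    import numpy as np

    def planar_structure_factor(positions, k):
        """|S(k)|^2 = |(1/N) sum_j exp(i k . r_j)|^2 for the atoms of one plane."""
        return abs(np.exp(1j * positions @ k).mean()) ** 2

    a0 = 1.0
    x, y = np.meshgrid(np.arange(10), np.arange(10))
    plane = np.column_stack([x.ravel() * a0, y.ravel() * a0, np.zeros(100)])
    k1 = np.array([2 * np.pi / a0, 0.0, 0.0])    # smallest in-plane reciprocal vector
    print(planar_structure_factor(plane, k1))    # perfect-crystal plane -> 1.0

    liquid_like = np.column_stack([np.random.rand(100, 2) * 10.0, np.zeros(100)])
    print(planar_structure_factor(liquid_like, k1))   # near zero (~ 1/N)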
Figure 12. Radial distribution function, G(r), determined by molecular-dynamics simulation for the atoms in the two central planes of (a) the (110) θ = 50.48° (Σ11) twist GB and (b) the STGB on the (123) plane in fcc Pd at zero temperature [22]. The dashed line indicates G(r) for the glass obtained by quenching the melt to zero temperature.
latter provides a more useful structural measure as it is more directly related to GB properties. Finally, in a broken-bond model the nn peak in G(r) is replaced by the area under the peak. Because all the elastic information contained in the detailed peak shape is thereby lost, strain-field effects due to line defects are ignored.
Figure 13. Square of the planar structure factor, |S(k_α)|², in the two halves (α = 1 or 2) of the same two GBs considered in Fig. 12 [22].
G(r) therefore provides the most complete measure of GB structural disorder while a broken-bond model represents only a subset of this information.
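A plane-resolved G(r) can be sketched along the same lines. The function below assumes raw snapshot coordinates, omits periodic images (a production analysis would apply the minimum-image convention), and uses a normalization in which an ideal gas of number density rho gives G(r) = 1:

    import numpy as np

    def radial_distribution(plane_atoms, all_atoms, r_max, nbins=100, rho=1.0):
        """Histogram-based G(r) for the atoms of selected GB planes."""
        d = np.linalg.norm(all_atoms[None, :, :] - plane_atoms[:, None, :], axis=-1)
        d = d[(d > 1e-9) & (d < r_max)]
        hist, edges = np.histogram(d, bins=nbins, range=(0.0, r_max))
        r = 0.5 * (edges[1:] + edges[:-1])
        shell = 4.0 * np.pi * r ** 2 * (edges[1] - edges[0])   # ideal-gas shell count
        return r, hist / (len(plane_atoms) * rho * shell)

    # usage (placeholder names):
    # r, g = radial_distribution(gb_planes_xyz, snapshot_xyz, r_max=2.0)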
3.3.3. Amorphous high-energy GBs

The commonly observed existence of thin disordered intergranular films of nanometer thickness in ceramics, such as SiC and Si₃N₄, represents one of the most intriguing features of the atomic structures of GBs [18, 19]. Because these impurity-based films are usually formed during liquid-phase sintering, their presence at room temperature is usually thought to be a strictly kinetic, non-equilibrium phenomenon that has little to do with the structure of GBs in pure materials. However, recent simulations of pure materials raise the intriguing possibility that these films may be a manifestation of the thermodynamic-equilibrium structure of high-energy GBs and thus may not require impurities for their stabilization.

Early in the last century, based on the inherent low-temperature brittleness and high-temperature ductility of glasses, Rosenhain and his coworkers suggested that the GBs in a polycrystalline material represent an “amorphous cement” that holds the grains together (see, e.g., Refs. [20, 21]). However, based on observations in bicrystals and coarse-grained polycrystals of (a) good long-range structural periodicity within the GBs, (b) well-defined rigid-body
translations, and (c) the variation of the GB energy with misorientation, Rosenhain’s model has been thoroughly discredited. That Rosenhain appears to have been at least partially correct was suggested recently by extensive computer simulations of GBs in silicon, diamond and fcc palladium [22–24]. In complete qualitative agreement with the observations on the impurity-based amorphous intergranular films in ceramics, the simulations of impurity-free Si bicrystals revealed a common energy for all high-energy GBs and a universal, highly disordered (“confined amorphous”) structure of uniform thickness that is virtually indistinguishable from that of bulk amorphous Si. By contrast, low-energy GBs were found to exhibit very good crystalline order, also in agreement with the intergranular-film experiments.

In these simulations, zero-temperature-relaxed GB structures, such as that in Fig. 14(a), were compared with the related high-temperature-equilibrated structures obtained either by high-temperature annealing or by growth from the melt containing well-oriented seeds, followed by cooling to zero temperature. That the amorphous structure in Fig. 14(b) thus obtained has a significantly lower GB energy than the crystalline structure in (a) is demonstrated in Fig. 14(c), in which the plane-by-plane average energies per atom in excess of that in the perfect crystal are compared for the two GBs. The peak excess energy of 0.48 eV/atom for the crystalline structure in (a) far exceeds the average excess energy of bulk amorphous Si of 0.19 eV/atom (dashed line). By contrast, the most disordered plane of the amorphous GB in (b) exhibits an excess energy of only about 0.26 eV/atom, much closer to that of amorphous Si. More importantly, despite a broadening of the GB, the disordering results in an approximately 15% lower GB energy (given by the area under each peak).

Having thus elucidated the thermodynamic and structural origins of the underlying driving forces, one might expect that all high-energy GBs should have similar, highly disordered “confined amorphous” structures. That this is, indeed, true is seen in Fig. 15, in which the energy profile of the Σ29 GB in Fig. 14(c) is compared with the profiles obtained for three other high-angle twist boundaries on different Si planes: (100) φ = 31.89° (Σ17), (110) φ = 44.00° (Σ57) and (112) φ = 35.26° (Σ35).

Based on the above insights, one might expect that a low-energy GB remains crystalline even after high-temperature equilibration. To test this hypothesis, Keblinski et al. [22] studied the (111) φ = 42.10° (Σ31) twist GB which, like the Σ29 GB, is a high-angle boundary, however on the most widely spaced (and hence lowest-energy) plane in the diamond structure, with the rather low lattice-statics-relaxed energy of only 638 erg/cm². These simulations revealed that, indeed, no restructuring took place during the heat treatment, and the GB energy remained totally unchanged. The projected “edge-on” zero-temperature structure of this GB shown in Fig. 16 indicates a highly ordered structure, which was confirmed by a quantitative characterization of the
Figure 14. (a) Crystalline, zero-temperature-relaxed structure and (b) amorphous, high-temperature-relaxed structure of the (100) θ = 43.60° (Σ29) twist GB in Si. (c) shows the underlying plane-by-plane average-excess-energy-per-atom profiles for the two GBs by comparison with the average excess energy of bulk amorphous Si (dashed line) [22].
Figure 15. Energy profiles along the z direction for four different high-energy GBs: (i) (100) φ = 43.60° (Σ29), solid squares; (ii) (100) φ = 31.89° (Σ17), open squares; (iii) (110) φ = 44.00° (Σ57), solid circles; (iv) (112) φ = 35.26° (Σ35), triangles [22].
structure via the corresponding bond-angle distribution function and a plane-by-plane structure-factor analysis [22]. The atomic structure of the low-angle (100) φ = 10.60° (Σ101) twist boundary on the (100) plane is also rather revealing. As seen in Fig. 17, the high-temperature-annealed structure of this dislocation boundary, on the same lattice plane as the high-angle GB in Fig. 14(b), consists of an array of highly disordered dislocation cores connected by more ordered, i.e., more perfect-crystal-like, regions. Upon increasing the twist angle towards 45°, the continuously disordered structure in Fig. 14(b) is evidently recovered. The simulations of Keblinski et al. [22] also revealed that the “universal” GB energy of the structure in Fig. 14(b), obtained from the profiles in Figs. 14(c) and 15, is similar in magnitude to the excess energy of two noninteracting bulk crystal/amorphous interfaces brought into contact. The latter exhibit atomic structures, widths and energy profiles that are remarkably independent of the orientation of the crystalline substrate, an observation that may explain the practically uniform width of the constrained amorphous GB phase from one high-energy GB to another [18]. This comparison also demonstrates a slightly greater degree of confinement (i.e., a smaller width) of an amorphous GB, coupled with a higher peak energy, compared to a crystal/amorphous interface; this provides further evidence for the stability of the
Figure 16. Zero-temperature structure obtained after high-temperature equilibration of the (111) φ = 42.10° (Σ31) twist GB [22].
Figure 17. Zero-temperature structure obtained after high-temperature equilibration of the (100) φ = 10.60° (Σ101) twist GB [22].
amorphous GB phase against decomposition into two crystal/amorphous interfaces. These observations identify a simple criterion for the existence of an equilibrium disordered GB film: if the atoms in an ordered GB have significantly higher energies than the atoms in the related bulk amorphous phase, the introduction of an amorphous film into the crystalline interface is energetically favorable. Although these results have yet to be confirmed experimentally for impurity-free GBs, they suggest that Rosenhain’s historic amorphous-cement model describes at least qualitatively the misorientation-independent structures of “true” high-angle GBs on high-energy GB planes (i.e., on non-special and non-vicinal planes; see also Fig. 11 and Section 3.2). These conclusions, deduced from simulations of pure Si, were later confirmed in similar studies of GBs in fcc Pd [17] (see also Figs. 12 and 13) and in diamond, in which, however, structural disorder is partially replaced by coordination disorder [23].
4. Conclusions
In this chapter we have reviewed the key concepts for the description of GB “structure” at three distinct levels, with particular emphasis on how at each level the underlying terminology can be used to connect “structure” with GB physical properties. At the level of the macroscopic GB geometry, two complementary types of descriptions are useful. The coincident-site lattice terminology is particularly suited for the description of dislocation (i.e., low-angle or vicinal) GBs because the spacing between the dislocations is governed by the misorientation. By contrast, the interface-plane terminology is best suited for the description of high-angle GBs in which the crystallographic orientation of the GB plane is the key parameter; four out of the five macroscopic degrees of freedom are therefore assigned to the GB plane. The concept of the atomic-level GB geometry provides a useful link between the macroscopic geometry and GB atomic structure. The two key parameters connecting to GB properties are the spacing of the lattice planes parallel to the GB and the magnitude of the GB planar unit-cell area. The intuitive connection here is that the “special” combination of a large interplanar spacing with a small planar unit-cell area minimizes the degree of GB structural disordering, giving rise to “special” GB properties. Finally, the degree of atomic-level GB structural disordering is captured quantitatively by two complementary structural measures, the plane-by-plane radial distribution function, G(r), and the related planar structure factor, S(k). Because most GB properties are governed by the degree of short-range structural disorder, G(r) is clearly the more useful of the two measures; by contrast,
S(k) captures the degree of long-range disorder, which appears to be relatively unimportant for most GB properties. This is the reason for the relative success of broken-bond models, which utilize the information contained in the area under the nearest-neighbor peak in G(r). However, because they ignore the information on the elastic distortions of the bonds contained in the detailed peak shape, these models capture only the effects of the dislocation cores and are hence most suited for the description of high-angle GBs. A remaining challenge lies in the establishment of a closer connection between the highest-energy GBs in pure materials and the amorphous, impurity-based GB phases found, for example, in ceramic materials. Computer simulations suggest that Rosenhain’s amorphous-cement idea must be at least partially correct, although the width of his “amorphous glue”, typically less than 1 nm, is clearly much smaller than what he anticipated. Experimental verification of the existence of such extremely thin amorphous GB films in the highest-energy GBs in single-phase materials would finally, after a century-long discussion, unify our understanding of GB structure. Such a unified picture would include both low- and intermediate-energy GBs, with well-defined rigid-body translations and an atomic structure and properties that depend strongly on the misorientation and the GB plane, and high-energy GBs (“true high-angle” or “Rosenhain boundaries”), with a universal, highly confined amorphous structure and misorientation-independent properties.
Acknowledgments

This work was supported by the US Department of Energy, BES-Materials Science, under Contract W-31-109-Eng-38.
References

[1] D. Wolf, Chapter 1 in Materials Interfaces: Atomic-level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman & Hall, New York, pp. 1–57, 1992.
[2] D.G. Brandon, B. Ralph, S. Ranganathan, and M.S. Wald, Acta Metall., 12, 813, 1964.
[3] W. Bollmann, Crystal Defects and Crystalline Interfaces, Springer, New York, 1970.
[4] A.P. Sutton and R.W. Balluffi, Interfaces in Crystalline Materials, Clarendon Press, Oxford, 1995.
[5] D. Wolf, J. Phys. Colloque, C4 46, C4-197, 1984.
[6] D. Wolf and J.F. Lutsko, Z. Kristallographie, 189, 239, 1989.
[7] D. Wolf and K.L. Merkle, Chapter 3 in Materials Interfaces: Atomic-level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman & Hall, New York, pp. 87–150, 1992.
[8] G. Hasson, J.-Y. Boos, I. Herbeuval, M. Biscondi, and C. Goux, Surf. Sci., 31, 115, 1972.
[9] C. Herring, in Structure and Properties of Solid Surfaces, R. Gomer and C.S. Smith (eds.), University of Chicago Press, Chicago, 1953.
[10] D. Wolf and J. Jaszczak, Chapter 26 in Materials Interfaces: Atomic-level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman & Hall, New York, pp. 662–690, 1992.
[11] V.I. Marchenko and A.Y. Parshin, Sov. Phys.-JETP, 52, 129, 1980.
[12] W.T. Read and W. Shockley, Phys. Rev., 78, 275, 1950.
[13] C.S. Smith, in Metal Interfaces, ASM, Cleveland, OH, p. 62, 1952.
[14] M. Weins, H. Gleiter, and B. Chalmers, J. Appl. Phys., 42, 2639, 1971.
[15] A.P. Sutton and V. Vitek, Phil. Trans. R. Soc. Lond. A, 309, 1, 1983.
[16] D. Wolf, J. Appl. Phys., 68, 3221, 1990.
[17] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Phil. Mag. A, 79, 2735, 1999.
[18] D.R. Clarke, in Surfaces and Interfaces of Ceramic Materials, L.C. Dufour, C. Monty, and G. Petot-Ervas (eds.), Kluwer Academic Publishers, Boston, MA, 1989.
[19] H.-J. Kleebe, M.K. Cinibulk, R.M. Cannon, and M. Rühle, J. Am. Ceram. Soc., 76, 1969, 1993.
[20] W. Rosenhain and D. Ewen, J. Inst. Metals, 10, 119, 1913.
[21] K.T. Aust and B. Chalmers, Metal Interfaces, ASM, Cleveland, OH, 1952.
[22] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, J. Am. Ceram. Soc., 80, 717, 1997.
[23] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, J. Mater. Res., 13, 2077, 1999.
[24] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, MRS Bull., 23(9), 36, 1998.
6.10 HIGH-TEMPERATURE STRUCTURE AND PROPERTIES OF GRAIN BOUNDARIES

Dieter Wolf
Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
The evolution of polycrystalline microstructures under the driving forces of temperature and stress, i.e., during grain growth and plastic deformation, is controlled by the high-temperature structure and dynamical behavior of grain boundaries (GBs). These microstructural processes, driven by temperature and/or stress, involve three distinct types of dynamical GB phenomena, namely GB diffusion, GB migration and GB sliding. All three are known to depend strongly on the GB energy. A better understanding of microstructural evolution in terms of the behavior of the individual interfaces therefore requires insight into the interrelation among GB structure, GB energy and these dynamical GB processes.

A polycrystal generally contains GBs of different types, covering a wide spectrum of energies and, hence, GB properties. As described in Chapter 6.9, much work of recent decades has suggested a distinction among three qualitatively different types of GBs: (a) special boundaries, (b) dislocation boundaries and (c) high-angle GBs [1]. Among these, the high-angle boundaries are the most important – yet least understood – ones because their high-temperature behavior is rate-controlling in many phenomena, such as grain growth and plastic deformation, particularly in nanocrystalline materials, i.e., polycrystals with a grain size of nanometer dimensions (see Chapter 6.13). The atomic structure of high-angle GBs is characterized by a complete overlap of the GB dislocation cores [2, 3], giving rise to a more or less continuously distributed type of structural disorder along the interface and a GB energy which is entirely independent of the GB misorientation (see, e.g., Figs. 10, 14(b) and 17 in Chapter 6.9). Whereas high-angle GBs on the
lowest-index plane of a given crystal lattice, such as the (111) plane in fcc crystals, exhibit relatively little structural disorder – and hence a rather low GB energy (see, e.g., Fig. 16 in Chapter 6.9) – boundaries on most other lattice planes are highly disordered, with a consequently very high energy (see Figs. 12(a) and 13 in Chapter 6.9). Although these high-energy, high-angle GBs may represent only a relatively small fraction of the GBs in a polycrystal, their very high mobility and diffusivity, coupled with a rather low sliding resistance, give them a potentially rate-controlling role in many high-temperature properties of polycrystalline materials. A structural model that accounts for the highly non-perfect-crystal-like properties of these GBs, and one that connects also with other types of structural disorder, such as the amorphous state, is therefore clearly needed.

Here we focus on recent molecular-dynamics (MD) simulations investigating the high-temperature structure and self-diffusion behavior of such high-energy, high-angle GBs. Many of the earlier simulations of high-temperature GB structure remain controversial (for a critical review, see [4]) in that they predict “premelting” at the GBs, i.e., GB disordering below the bulk melting point, Tm, with a width of the liquid-like GB layer that diverges as T → Tm, where the GB is completely wetted by the liquid. It was not until simulations had provided a clearer understanding of the melting process itself [5, 6], of the related superheating limit [6, 7], and of the dependence of the low-temperature GB structure on the GB energy [3, 8] that a clearer picture of high-temperature GB structure and dynamical GB behavior started to develop.

These simulations demonstrate that above the glass-transition temperature, Tg, the high-energy boundaries undergo a reversible structural and dynamical transition from a confined amorphous solid to a confined liquid. By contrast with the bulk glass transition, however, this equilibrium transition is continuous and thermally activated, starting at Tg and completed only at the melting point, Tm, at which the entire amorphous film is liquid. The coexistence of the confined amorphous and liquid phases in this two-phase region of less than 1 nm thickness has a profound effect on grain-boundary self-diffusion.

The picture that emerges suggests that upon heating from zero temperature through the melting point, the highest-energy tilt and twist GBs in the microstructure undergo a reversible transition from a low-temperature, solid structure with a rather high, GB-energy-dependent activation energy, e.g., for GB diffusion, to a highly confined, liquid GB structure with a universal activation energy that is independent of the GB misorientation and related to self-diffusion in the melt. This reversible structural transition should also manifest itself in an extremely high GB mobility and a low sliding resistance above the solid-to-liquid transition. It is this unique combination of high-temperature properties that appears to be the cause of the rate-controlling role of these GBs in the grain growth and deformation of polycrystalline materials.
1. High-Energy, High-Angle Grain Boundaries in Silicon
As discussed in Chapter 6.9, recent simulations of silicon GBs equilibrated at high temperatures and subsequently cooled to zero temperature revealed a highly disordered equilibrium structure of uniform thickness and energy for all the high-energy boundaries, while low-energy boundaries remained crystalline (see Figs. 14–17 in Chapter 6.9) [8, 9]. In Figs. 1(a) and (b) the local radial and bond-angle distribution functions for the atoms in the central two (100) planes of the (100) φ = 43.60° (Σ29) twist boundary in Si, determined by means of the Stillinger–Weber potential [9], are compared with the related distributions obtained for bulk amorphous Si. (The structure and energy of this GB are representative of those of other high-energy GBs [8].) The comparison in Fig. 1 reveals a virtually complete absence of crystalline order in the GB, in favor of short-range order that is strikingly similar to that seen in bulk amorphous Si. Qualitatively identical results were obtained via a tight-binding description of Si bonding [10] and the Tersoff potential [11], suggesting the robustness of these conclusions. Consistent with this, fully quantum-mechanical calculations for the (100) φ = 36.87° (Σ5) twist GB in Ge yielded a disordered structure remarkably similar to that of amorphous Ge [12]. Although these simulation results have yet to be confirmed experimentally for impurity-free Si GBs, the misorientation-independent structures of these GBs seem to be what Rosenhain described in his historic “amorphous-cement” model (see also Chapter 6.9) [13].

Using the idea that liquid Si is significantly denser and better coordinated than either crystalline or amorphous Si, Keblinski et al. [14] investigated the possibility of an amorphous-to-liquid transition in high-energy GBs in Si. Their idea was that a transformation of an amorphous into a liquid-like GB structure should be detectable via a contraction at the GB, accompanied by an increase in the average nearest-neighbor (nn) coordination and changes in the nn bond angles. Figure 2 compares the GB excess volume per unit GB area, δV(T), between T = 0 K and Tm for two high-angle GBs: the high-energy (100) φ = 43.60° (Σ29) twist GB with the amorphous structure characterized in Figs. 1(a) and (b), and the (111) φ = 42.10° (Σ31) twist GB whose structure remains crystalline all the way up to Tm [8]. It is interesting to note that at 0 K the (100) GB is expanded relative to the perfect crystal, whereas the (111) GB exhibits a volume contraction. The latter arises from the fact that all Si bonds point directly across the (111) GB and are therefore stretched during the ⟨111⟩ twist rotation, with a consequent contraction during relaxation so as to recover, as much as possible, perfect-crystal bond lengths [8]. More strikingly, however, as temperature increases the value of δV(T) for the (111) GB remains unchanged; by
Figure 1. (a) Radial distribution function, G(r), for the (100) φ = 43.60° (Σ29) twist GB. The solid line shows G(r) for bulk amorphous silicon obtained for the same interatomic potential. The atomic structure of this boundary is shown in Fig. 14(b) of Chapter 6.9. (b) Angular distribution function, P(cos θ), for the same GB. The solid line shows P(cos θ) for bulk amorphous silicon (redrawn from Keblinski et al., 1999).
Figure 2. Temperature dependence of the volume expansion per unit GB area, δV (in units of the zero-temperature lattice parameter, a₀), for two high-angle GBs in silicon: the high-energy (100) φ = 43.60° (Σ29) twist boundary and the low-energy (111) φ = 42.10° (Σ31) twist GB. The bulk glass-transition temperature, Tg, and melting point, Tm, are indicated at the top [14].
contrast, at about 1000 K the (100) GB starts to contract continuously all the way up to Tm, indicating the formation of a liquid intergranular film. Finally, for T > Tm both systems melt by the propagation of two crystal–liquid interfaces from the GBs until all crystalline material is consumed by the liquid [5–7].

Figure 3 demonstrates that the response of the (100) GB to temperature changes is completely reversible, indicating the equilibrium nature of the amorphous-to-liquid transition [14]. Also, the fact that with increasing temperature the contraction progresses continuously, rather than occurring suddenly at a certain temperature, indicates that the structural transition involves a two-phase mixture and progresses continuously between Tg and Tm. A detailed analysis of the underlying structural changes revealed a significant increase with increasing temperature in the average GB-atom coordination (from 4.21 at Tg to 4.55 at Tm), accompanied by a decrease in the fraction of four-coordinated GB atoms (from 0.78 at Tg to 0.49 at Tm). Simultaneously, as shown in Fig. 4, the bond-angle distribution function changes from being similar to that of bulk amorphous Si, with a broad peak near 110° indicating predominantly tetrahedral bonding (see Fig. 1(b)), to one that is similar to that of
Figure 3. Response to thermal cycling of the volume expansion per unit area, δV, normal to the (100) Σ29 twist GB. The plot illustrates the reversibility of the transition between the confined amorphous and liquid GB phases; t is the simulation time [14].
molten Si, with its peak shifted towards 90°, a characteristic of the almost six-coordinated high-temperature liquid [14].

The amorphous-to-liquid GB transition should profoundly affect the mechanism and activation energy for GB self-diffusion. Unfortunately, below Tg both lattice and GB diffusion are too slow to be observable on a typical MD time scale; in fact, even above Tg the atom mobility in the perfect-crystal regions surrounding the GB is negligible compared to that in the GB. The total observed mean-square displacement (MSD), summed over all the atoms, is therefore dominated by the in-plane (x–y) motions of the GB atoms, Σ[(Δx)² + (Δy)²]. In analogy to the Gibbsian excess energy of the GB, this MSD normalized to the GB area represents the integrated, Gibbsian excess MSD of the GB atoms, from which the diffusion flux in the GB can be determined as follows:

D_GB δ_D = (Ω/4tA) Σ[(Δx)² + (Δy)²],   (1)

where Ω is the atomic volume and t the simulation time; D_GB is the GB self-diffusion constant and δ_D is the “diffusional width” of the GB (to be
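Equation (1) translates directly into an analysis function; the names and array layout below are placeholders for actual MD-trajectory output:

    import numpy as np

    def gb_diffusion_flux(disp_xy, t, area, omega):
        """Eq. (1): D_GB*delta_D = (Omega / 4 t A) * sum_i [(dx_i)^2 + (dy_i)^2].
        disp_xy: (N, 2) in-plane displacements accumulated over time t;
        area: GB area A; omega: atomic volume Omega. The factor of 4 reflects
        two-dimensional (in-plane) diffusion."""
        return omega * np.sum(disp_xy ** 2) / (4.0 * t * area)

    # usage (placeholder names):
    # flux = gb_diffusion_flux(displacements[:, :2], t_run, A_gb, omega_atom)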
Figure 4. Comparison of the bond-angle distribution functions, P(θ), for the confined amorphous and confined liquid GB phases with those for bulk amorphous and supercooled liquid Si, respectively [14]. In perfect-crystal Si at T = 0 K, P(θ) exhibits a single, δ-function peak at the tetrahedral angle, θt = 109.47°.
distinguished from its “structural width”, δ_S; see below). We note that D_GB δ_D has the dimensions of (length)³/time.

As seen from the Arrhenius plot in Fig. 5, the simulated values of D_GB δ_D (right scale) yield an activation energy of 1.4 ± 0.1 eV. By comparison, the self-diffusion constant of the bulk liquid, D_liq (left scale in Fig. 5), exhibits the much lower activation energy of 0.70 ± 0.05 eV. If one were to assume that δ_D is temperature independent, thus assigning the entire temperature dependence of D_GB δ_D to that of D_GB alone, one would be led to conclude that the rather different activation energies for D_GB and D_liq dispel the notion that the GB is, indeed, liquid. However, by contrast with this interpretation, Fig. 6 reveals a strong, thermally activated increase of δ_D with increasing temperature, with an activation energy of 0.65 ± 0.05 eV [14]. These values of δ_D were determined from the expression [14]

δ_D = N_GB Ω / A,   (2)

where N_GB is the number of diffusing GB atoms with atomic volume Ω. The structural GB width shown in Fig. 6 was defined [14] via the excess potential-energy profile of the GB, as the width-at-half-maximum of planar profiles like the ones shown in Fig. 7. According to Fig. 7, the excess-energy profiles at T = 0 and 900 K are practically indistinguishable. By contrast, the
Figure 5. Arrhenius plot for self-diffusion in the (100) Σ29 Si GB (right scale) and in the bulk, supercooled Si melt (left scale) [14]. D_GB δ_D is the diffusion flux in the GB with a “diffusional width” δ_D.
profile broadens above Tg, from approximately four (100) planes (or δ_S = a₀ = 5.5 Å) at low temperatures to approximately six planes at 1600 K (δ_S = 8.2 Å), indicating that two initially crystalline planes have become liquid, in addition to the initially amorphous GB phase (see Fig. 14(b) in Chapter 6.9). The rather small value of δ_S = δ_D (= 8.2 Å) near the melting point is noteworthy, as it underscores the severe confinement of the liquid. To further elucidate this confinement, Keblinski et al. compared this value of the GB width with the width of a bulk crystal–liquid (100) interface in Si, δ_c–liq = 6 Å, at T = Tm, determined for the SW potential [14]. The fact that all the way up to the melting point 2δ_c–liq > δ_S = δ_D demonstrates the stability of the confined liquid against decomposition into two unbound crystal–liquid interfaces.

The results in Figs. 5 and 6 for D_GB δ_D, δ_D and D_liq suggest that the activation energy of 1.4 ± 0.1 eV for D_GB δ_D in the Σ29 GB above Tg involves two distinct processes: first, the thermally activated formation of the confined liquid from the confined-amorphous phase, with an activation energy E_F = 0.65 ± 0.05 eV (see Fig. 6), and second, the subsequent bulk-liquid-like atom migration, with an activation energy of E_liq = 0.70 ± 0.05 eV (see Fig. 5). It follows that above Tg the mechanism for self-diffusion in the Σ29 GB should be rather similar to that in the bulk melt. As a consequence, at Tm
Figure 6. Arrhenius plot for the “diffusional width”, δ_D, of the GB, indicating that the formation of the confined liquid in the confined-amorphous matrix is a thermally activated process. For comparison, the only weakly temperature-dependent “structural width” of the GB, δ_S, is also shown [14].
As a consequence, at Tm the self-diffusion constants DGB and Dliq should be of comparable magnitude. In fact, the value of δD = 8.2 Å, combined with the value of DGB δD = 1.43 × 10⁻¹² cm³/s obtained by extrapolation to Tm in Fig. 5, yields DGB = 2 × 10⁻⁵ cm²/s, compared to the value Dliq = 6 × 10⁻⁵ cm²/s [14]. Given the uncertainties in the underlying activation energies and the degree of arbitrariness in the definition of the GB width, this agreement is rather good.

Figure 7. Plane-by-plane excess-potential-energy profiles for the (100) Σ29 GB at T = 0, 900, and 1600 K, indicating that the transition from a confined-amorphous to a confined-liquid GB structure is accompanied by an increase in the "structural width" of the GB, δS, defined as the width-at-half-maximum of such profiles [14].
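The estimate of DGB above (the extrapolated flux divided by the diffusional width) is easy to verify numerically. A minimal sketch in Python, using only the values quoted from Ref. [14] (the variable names are ours):

```python
# Consistency check of the Si GB diffusion estimates near Tm,
# using only the values quoted above from Ref. [14].

flux = 1.43e-12      # DGB*deltaD extrapolated to Tm [cm^3/s]
delta_D = 8.2e-8     # diffusional GB width of 8.2 Angstrom [cm]
D_liq = 6.0e-5       # bulk-melt self-diffusion constant at Tm [cm^2/s]

D_GB = flux / delta_D
print(f"D_GB  = {D_GB:.1e} cm^2/s")    # ~2e-5, comparable to D_liq
print(f"D_liq = {D_liq:.1e} cm^2/s")

# The activation energies decompose additively:
# 0.65 eV (formation of the confined liquid) + 0.70 eV (liquid-like
# migration) = 1.35 eV, consistent with the simulated 1.4 +/- 0.1 eV.
print(0.65 + 0.70)
```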
2. Evidence for Structural Transition in fcc-Metal Grain Boundaries
The above simulation results for silicon suggest the intriguing possibility that, in the spirit of Rosenhain’s amorphous-cement model [13], even in fcc metals the high-energy GBs might exhibit a universal, highly disordered structure that is liquid-like above a certain critical temperature and more or less disordered, but solid at lower temperatures. Unfortunately, compared to Si for which the coordination and density of the melt differ significantly from
those in the amorphous and crystalline phases, in fcc metals the structural fingerprint of such a transition is much more subtle, because the coordination and density of the glass (producible only in simulations, by rapidly quenching the melt) are essentially the same as those of the melt. Recent self- and impurity-diffusion experiments by Budke et al. [15] on Cu [001] tilt GBs near the Σ5 (36.87°) misorientation (see Fig. 8) revealed a crossover between a strongly misorientation-dependent low-temperature diffusion mechanism with a high activation energy and a high-temperature mechanism with a 60% lower activation energy that is independent of the GB misorientation. This transition was interpreted as a GB structural transition, from an ordered low-temperature GB structure, with well-defined atomic jump vectors, to a disordered high-temperature structure with randomly directed jump vectors [15]. Consistent with these experimental observations, recent MD simulations of GB migration (see Fig. 9a) and GB diffusion (see Fig. 9b) in a (100) φ = 43.60° (Σ29) twist bicrystal, described by a Lennard–Jones potential fitted to Cu, suggest a crossover from a solid-like mechanism at low temperatures to a liquid-like mechanism at high temperatures.
Figure 8. Arrhenius plots for Au diffusion along [001] tilt GBs in Cu near the Σ = 5 misorientation of θ = 36.87° [15]. The three data sets correspond to θ = 35.88°, 36.53°, and 37.74°.
According to Fig. 9, in both types of GB processes this crossover occurs at a temperature of about T ∼ 750 K (or ∼0.62 Tm). Detailed analysis revealed a crossover in the underlying atomic-level mechanisms, from ones based on solid-like atom hopping at low temperatures to ones involving liquid-like reshuffling of the GB atoms at high temperatures [16]. As seen in Figs. 9(a) and (b), this change in mechanism is accompanied by a decrease of about 50% in the activation energies for both processes [16].
3. Self-Diffusion in High-Energy fcc-Metal GBs
To systematically investigate the existence of a crossover from a solid low-temperature to a liquid-like high-temperature GB structure, Keblinski et al. [17] have recently performed extensive simulations of self-diffusion in several well-chosen high-energy tilt and twist boundaries in Pd.
Figure 9. Arrhenius plots for (a) the GB mobility, m (in units of 10⁻⁸ m⁴/Js), and (b) the GB diffusivity, DGB δD (see Eq. (1)), for the (001) φ = 43.60° (Σ29) twist GB in Cu as described by a Lennard–Jones potential [16]. The activation energies indicated are 0.40 and 0.20 eV in (a) and 1.1 and 0.5 eV in (b).
These simulations not only confirm that, analogous to their Si work, a GB structural transition indeed takes place even in fcc-metal GBs, but also show (i) that the transition proceeds from a solid-like low-temperature to a liquid-like high-temperature structure and (ii) that, consistent with the experiments [15], the transition temperature depends strongly on the GB energy. Consistent with their earlier study of Si GBs [14], at high temperatures four different high-energy GBs (two tilt and two twist GBs; see Table 1) were found to exhibit very similar activation energies (see Fig. 10), with practically the same absolute values of the product of the GB diffusivity and the diffusional GB width, DGB δD (right-hand scale in Fig. 10). For comparison, Fig. 10 also shows the Arrhenius plot for self-diffusion in the bulk melt, Dliq (left-hand scale in Fig. 10).
Table 1. GBs chosen for the diffusion study in fcc Pd of Keblinski et al. [17], listed in order of decreasing GB energy. (Σ denotes the familiar inverse density of coincidence sites.) Also listed are the activation energies for GB self-diffusion in the high- and low-temperature regimes and the crossover temperature, Tc. Values of the square of the planar structure factor, |S(k)|², are given for the most disordered plane in the center of each GB.

GB                       Zero-temp. GB       Zero-temp.   Activation energy,   Activation energy,   Tc (K)
                         energy (erg/cm²)    |S(k)|²      high T (eV)          low T (eV)
(110) φ = 50.48° (Σ11)   1027                0.25         0.61 ± 0.05          —                    < 700
(113) φ = 67.11° (Σ9)    1021                0.28         0.56 ± 0.05          —                    < 700
(310) (Σ5) STGB          952                 0.82         0.65 ± 0.05          0.88 ± 0.05          900
(123) (Σ7) STGB          881                 0.94         0.70 ± 0.10          1.50 ± 0.10          1300
Figure 10. Comparison of self-diffusion constants in Pd grain boundaries and in the melt [17]. Right-hand scale: high-temperature regime (1000–1400 K) in the Arrhenius plots for the diffusion flux, DGB δD, for three high-energy GBs (see also Table 1). Left-hand scale: self-diffusion constant in the bulk, 3d-periodic Pd melt, supercooled all the way down to 1000 K. The melting point for the Pd embedded-atom-method potential is estimated to be about Tm ∼ 1500 K.
We note that the "universal" activation energy for GB diffusion in this high-temperature regime (EGB = 0.60 ± 0.05 eV) is distinctly higher than that associated with self-diffusion in the supercooled melt (Emelt = 0.41 ± 0.03 eV). On the other hand, EGB is significantly lower than that for self-diffusion in bulk perfect-crystal Pd via monovacancies (with a simulated activation energy of 2.41 eV [18]).
To elucidate the difference between the activation energies for self-diffusion in these high-energy GBs and in the supercooled melt, we recall that GB diffusion involves the diffusion flux, DGB δD, rather than DGB alone. As first pointed out in the Si work (see Figs. 5 and 6) [14], only under the assumption that δD is temperature independent can the entire temperature dependence of DGB δD be assigned solely to that of DGB. To test this assumption, as in their earlier Si study, Keblinski et al. [17] followed the diffusion of individual GB atoms over time. These Pd simulations [14, 17] also exposed the important distinction between the structural width of the GB, δS (defined, for example, via the excess-energy profile across the GB), and the diffusional width of the GB, δD (defined by the number of GB atoms actually responsible for the observed diffusion flux, i.e., for the excess mean-square displacement of the GB atoms [14, 17]). As shown in Fig. 11, they observed that δS increases only slightly with increasing temperature (by about a factor of two between T = 0 K and Tm). However, the fraction of mobile GB atoms, which governs δD, was found to increase much more rapidly, involving a thermally activated process, as evidenced by the Arrhenius plot in Fig. 11 for the (110) Σ11 twist GB in Pd. Consistent with the Si simulations [14], the assumption of a temperature-independent diffusional width is therefore inappropriate for fcc metals as well.
Figure 11. Arrhenius plot for the "diffusional GB width", δD, of the (110) φ = 50.48° (Σ11) twist GB in Pd, indicating that the number of actually diffusing GB atoms is thermally activated. For comparison, the "structural GB width", δS, with a much weaker temperature dependence, is also shown [17].
Adding the activation energy of Eδ = 0.22 ± 0.02 eV for δD (Fig. 11) to the value for diffusion in the melt, Emelt = 0.41 ± 0.03 eV (Fig. 10), yields an activation energy of 0.63 ± 0.05 eV, in remarkable agreement with the "universal" high-temperature activation energy of ∼0.60 ± 0.05 eV for DGB δD (see Fig. 10 and Table 1). Similar to their Si work, Keblinski et al. therefore concluded that the high-temperature GB diffusion mechanism is liquid-like, involving two distinct processes: first, the thermally activated formation of clusters of liquid from the solid GB, followed by liquid-like migration of the atoms within these clusters with an activation energy similar to that in the bulk melt; i.e.,

DGB δD = D⁰GB δ⁰D exp[−(Eδ + Emelt)/kT].   (3)
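As a numerical illustration of Eq. (3), the sketch below generates the flux from the two activation energies quoted above and recovers their sum as the slope of the Arrhenius plot; the prefactor is a hypothetical placeholder, since only the slope matters here:

```python
import numpy as np

# Illustration of Eq. (3): the Arrhenius slope of DGB*deltaD is the
# composite barrier E_delta + E_melt. The prefactor is hypothetical.
E_delta, E_melt = 0.22, 0.41            # [eV], from Figs. 11 and 10
prefactor = 1.0e-15                     # hypothetical DGB0*deltaD0 [m^3/s]

inv_kT = np.linspace(8.0, 12.0, 20)     # 1/kT [1/eV], range of Fig. 11
flux = prefactor * np.exp(-(E_delta + E_melt) * inv_kT)

# A straight-line fit of ln(flux) vs 1/kT recovers the composite barrier:
slope, _ = np.polyfit(inv_kT, np.log(flux), 1)
print(f"effective activation energy = {-slope:.2f} eV")   # 0.63 eV
```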
Some insight into the underlying self-diffusion mechanism can be gained from an analysis of the temporal evolution of the pattern of the diffusing GB atoms. Figures 12(a) and (b) show two edge-on snapshots of the (110) Σ11 twist GB.
Figure 12. Initial positions of those atoms that moved by at least one nearest-neighbor distance within 10 000 simulation time steps, projected onto the x–z plane, for the (110) Σ11 twist boundary. (a) T = 1000 K; (b) T = 1400 K. The thin horizontal lines indicate the structural width of the GB, δS, determined via the excess-energy profile of the GB (see also Fig. 11).
The figure shows the initial positions of only those GB atoms that moved by at least one nearest-neighbor distance within a time interval of 10 000 time steps at (a) T = 1000 K and (b) T = 1400 K. The horizontal lines indicate the structural width of the GB, δS, determined in the simulations via the excess-energy profile of the GB (see Fig. 11).
Figure 13. Radial distribution functions, g(r), for the atoms in the two central planes of the STGB on the (123) plane of Pd at (a) T = 1400 K, compared to that of the bulk, supercooled melt, and (b) T = 0 K, compared to that of the bulk glass produced by MD simulation [17].
A comparison between the two snapshots reveals not only that δS increases somewhat with increasing temperature but, more importantly, that the fraction of mobile GB atoms increases significantly with increasing temperature. The diffusional width, δD, therefore increases much more rapidly with increasing temperature than the structural width, δS. That the high-temperature GB structure is, indeed, liquid-like was demonstrated [17] via the comparison of the radial distribution function of the bulk melt with the local radial distribution functions at T = 1400 K in Figs. 13(a) and 14(a) for the atoms in the centers of, respectively, the symmetric tilt boundary (STGB) on the (123) plane (Fig. 13a) and the (110) φ = 50.48° (Σ11) twist GB (Fig. 14a). That the diffusion mechanism is, indeed, liquid-like in this high-temperature regime was shown directly by MD simulation, revealing a virtually complete absence of angular correlations between successive jump vectors of the diffusing atoms [17]. For comparison, Figs. 13(b) and 14(b) show the corresponding 0 K GB structures together with the radial distribution function of the bulk Pd glass (obtained by quenching from the melt). By contrast with these high-temperature results, at lower temperatures the activation energy for GB self-diffusion can be significantly higher and strongly dependent on the GB energy. According to Fig. 15 and Table 1, the two STGBs exhibit a crossover in the diffusion behavior, at different transition temperatures Tc, from a high-temperature regime with the relatively low, universal activation energy to a low-temperature regime with a significantly higher activation energy that depends on the GB energy. For the case of the STGB on the (123) plane, Keblinski et al. demonstrated that the diffusion mechanism in this low-temperature regime indeed involves atom jump vectors in discrete directions, i.e., on a crystal lattice. The fact that the twist boundaries do not exhibit such a transition in Fig. 15 indicates that their Tc lies below 700 K, the temperature below which the mean-square displacement of the GB atoms became too small to be detectable by MD simulation [17]. On the other hand, that a structural transition must take place, for example, in the (110) Σ11 twist GB is seen from the comparison in Figs. 14(a) and (b) of the local radial distribution functions at 1400 and 0 K, respectively: the 0 K structure in (b) clearly exhibits the familiar split second peak of the amorphous solid (indicated by arrows).
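The angular-correlation test mentioned above lends itself to a compact implementation: for each diffusing atom one forms the mean cosine of the angle between successive jump vectors, which should be near zero for a liquid-like, memoryless mechanism and markedly negative for lattice hopping dominated by correlated return jumps. The sketch below is our own illustration of such an analysis, not the actual script of Ref. [17]:

```python
import numpy as np

def mean_jump_cosine(jumps):
    """Mean cosine of the angle between successive jump vectors.

    jumps: (M, 3) array of chronologically ordered displacement vectors
    of one atom. A value near 0 indicates directionally uncorrelated
    (liquid-like) jumps; strongly negative values indicate correlated
    return jumps, as expected for solid-like hopping on a lattice.
    """
    a, b = jumps[:-1], jumps[1:]
    cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1)
                                   * np.linalg.norm(b, axis=1))
    return cos.mean()

# Quick check with random jump directions (the liquid-like limit):
rng = np.random.default_rng(0)
print(mean_jump_cosine(rng.normal(size=(1000, 3))))   # close to 0
```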
4. Short-Range vs. Long-Range Structural Disorder
The comparison between the (123) STGB and the (110) Σ11 twist GB in Figs. 13 and 14 reveals that, in spite of having rather similar GB energies (see Table 1) and similar, liquid-like high-temperature structures, the two GBs have very different low-temperature atomic structures.
Figure 14. Radial distribution functions, g(r), for the atoms in the two central planes of the (110) φ = 50.48° (Σ11) twist GB in Pd at (a) T = 1400 K, compared to that of the bulk, supercooled melt, and (b) T = 0 K, compared to that of the bulk glass produced by MD simulation [17]. The arrows in (b) indicate the familiar split second peak associated with the glass.
Figure 15. Arrhenius plots for the diffusion flux, DGB δD, between 700 and 1400 K for all four Pd GBs considered in the study of Keblinski et al. (compare with Fig. 10) [17].
Whereas the STGB in Fig. 13(b) is basically crystalline, the twist GB in Fig. 14(b) is essentially amorphous. This raises the question of why, in spite of these qualitatively different 0 K atomic structures, the high-temperature behavior of these GBs is so similar. As demonstrated by Keblinski et al. [17], elucidating the precise manner in which GB structure affects GB diffusion requires a better understanding of the interrelation between GB energy and atomic structure, with particular emphasis on the distinction between long-range and short-range GB structural disorder. Given plane-by-plane radial distribution functions, such as those in Figs. 13 and 14 for the center of each GB, the degree of long-range GB structural order can be quantified by the square of the planar structure factors, S(kα), within each lattice plane of the two rotated crystals (α = 1 and 2) forming a given bicrystal [5, 6]. For each plane, S(kα) is defined as the Fourier transform of the average g(r) for that particular plane. For a perfect-crystal plane at zero temperature, |S(k1)|² = 1 and |S(k2)|² = 0 for planes belonging to the bulk of grain 1; similarly, |S(k1)|² = 0 and |S(k2)|² = 1 for planes belonging to the bulk of grain 2. At finite temperature, due to the vibration of the atoms, the long-range order within the planes decreases by the Debye–Waller factor. By contrast, in a liquid or an amorphous solid, due to the absence of
long-range order, both |S(k1)|² and |S(k2)|² fluctuate near zero at any temperature. The combination of |S(k1)|² and |S(k2)|² thus provides a convenient quantitative measure of the overall degree of crystallinity in each lattice plane [5, 6]. The zero-temperature plane-by-plane structure factors associated with the two GBs in Figs. 13 and 14 are compared in Fig. 16. Consistent with the g(r) in Fig. 14(b), the center of the (110) twist GB is, indeed, highly disordered at 0 K, as evidenced by the low value of |S(k)|² = 0.25 in the two center planes, indicating only 25% residual crystallinity. By comparison, the (123) STGB is 94% crystalline even in the two planes immediately at the GB. This qualitative difference is mostly due to the extremely high degree of structural periodicity exhibited by all STGBs: on any given lattice plane, the STGB has the smallest possible planar unit cell of any GB on that plane; by contrast, any twist boundary on the same plane has a significantly larger unit cell, i.e., less periodicity [19].
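The text defines S(kα) through the Fourier transform of the planar g(r); an equivalent and more direct evaluation, which we sketch below as an assumption on our part, sums phase factors over the atomic positions in the plane:

```python
import numpy as np

def planar_S2(positions, k):
    """Square of the planar structure factor, |S(k)|^2.

    positions: (N, 3) array of atoms in one lattice plane;
    k: a reciprocal-lattice vector of the corresponding perfect-crystal
    plane. Returns 1 for a perfect plane at 0 K and fluctuates near
    zero for a liquid-like or amorphous plane.
    """
    phases = np.exp(1j * positions @ k)
    return np.abs(phases.mean()) ** 2

# Illustration: a perfect square-lattice plane vs. random positions.
a0 = 1.0
x, y = np.meshgrid(np.arange(10), np.arange(10))
plane = np.stack([x.ravel() * a0, y.ravel() * a0, np.zeros(100)], axis=1)
k = np.array([2.0 * np.pi / a0, 0.0, 0.0])
print(planar_S2(plane, k))                             # 1.0: crystalline
rng = np.random.default_rng(1)
print(planar_S2(rng.uniform(0.0, 10.0, (100, 3)), k))  # ~0: disordered
```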
Figure 16. Square of the planar structure factors, |S(kα)|², in the two halves (α = 1 or 2) of the (123) STGB and the (110) φ = 50.48° (Σ11) twist GB in Pd at zero temperature. The underlying radial distribution functions in the center of each GB are shown in Figs. 13(b) and 14(b) [17].
Whereas the degree of long-range GB structural disorder is governed by the higher-order peaks in g(r), the GB energy is dominated by the nearest-neighbor (nn) peak, since the energy represents a weighted, highly damped sum over the peaks in g(r). Assuming, for example, that the interatomic interactions can be approximated by a pair potential, V(r), the excess energy, γ, per unit GB area, A, of a bicrystal may be written as

γ = (1/A) { (1/N) Σi Σj [gbi(rij) − gid(rij)] V(rij) },   (4)

with both sums running over i, j = 1, …, N, where gbi(rij) and gid(rij) are the radial distribution functions of all N atoms in the bicrystal and in the ideal reference crystal, respectively. Since in virtually all materials the interatomic interactions fall off rather quickly with increasing separation between the atoms, the GB energy is controlled by the contribution to Eq. (4) from the nearest-neighbor (nn) peak, i.e., by the shape and width of the peak and the area under it. The magnitude of the GB energy is therefore a measure of the degree of short-range GB structural disorder that is rather insensitive to any long-range structural order present in the GB. The fact that the (110) Σ11 twist GB and the (123) STGB have somewhat similar energies (of 1027 and 881 mJ/m²; see Table 1) therefore indicates comparable degrees of broadening of the nn peak in their respective g(r)'s, as confirmed by the rather similar shapes and widths of the nn peaks in Figs. 13(b) and 14(b). Hence, although in principle the structural information contained in |S(k)|² is equivalent to that in g(r), the GB energy represents a more direct predictor of GB properties, particularly of the diffusion behavior, than the magnitude of the planar structure factor. In fact, the success of broken-bond models for predicting surface and GB energies and physical properties [20] arises mostly from their focus on the nn peak in g(r), replacing its detailed shape by the area under the peak. Because all the elastic information contained in the detailed peak shape is lost in this way, strain-field effects associated, for example, with line defects in vicinal interfaces are ignored. Nevertheless, the change in the area under the peak captures the total number of broken nn bonds, which is directly related to the excess energy of the average atom in the system. This comparison suggests that the zero-temperature GB energy is a much more reliable predictor of high-temperature GB behavior than the degree of long-range order present in the low-temperature GB structure. The above comparison also demonstrates that the degree of long-range structural disorder already present in the zero-temperature GB structure provides little indication as to whether or not the GB will undergo a solid-to-liquid transition at elevated temperatures. The fact that even the two tilt boundaries undergo the transition, in spite of being structurally much more ordered at low temperatures than the two twist boundaries, suggests that the degree of long-range order in the zero-temperature structure of the GB has little, if any, effect on the transition temperature, Tc. The decrease of Tc with increasing GB energy (see Table 1 and Fig. 15) therefore suggests that a higher degree of nn structural disorder in the 0 K structure favors a transition into the liquid phase already at lower temperatures.
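The atomistic content of Eq. (4) can be mimicked directly: sum a pair potential over the bicrystal, subtract the same sum for an equal number of atoms in the ideal crystal, and divide by the GB area. The sketch below uses a Lennard–Jones potential as a stand-in for V(r) and ignores periodic boundary conditions for brevity; both are assumptions for illustration only:

```python
import numpy as np

def pair_energy(positions, cutoff=2.5):
    """Total energy of a configuration for a Lennard-Jones pair
    potential (eps = sigma = 1), a stand-in for V(r) in Eq. (4).
    Non-periodic for brevity; a real GB calculation would use
    periodic boundary conditions."""
    E = 0.0
    for i in range(len(positions) - 1):
        r = np.linalg.norm(positions[i + 1:] - positions[i], axis=1)
        r = r[r < cutoff]
        E += np.sum(4.0 * (r**-12 - r**-6))
    return E

def gb_excess_energy(bicrystal, ideal, area):
    """Excess energy per unit GB area: the energy of the bicrystal
    minus that of an ideal crystal with the same number of atoms,
    divided by the GB area A (cf. Eq. (4))."""
    return (pair_energy(bicrystal) - pair_energy(ideal)) / area
```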
In other words, it seems that a certain, critical amount of local GB structural disorder is required to trigger the transition to a liquid-like high-temperature structure. At finite temperatures the amount of GB structural disorder has two origins: the static disorder already present at 0 K and a second, dynamical contribution due to thermal disorder. The above simulations suggest that the sum of both types of disorder in the GB region must reach a critical level in order to trigger the solid-to-liquid structural transition. This behavior is reminiscent of the Lindemann criterion for melting, which requires a certain amount of (vibrational) disorder to trigger the melting transition. The increase in the activation energy in the low-temperature regime with decreasing GB energy (see Table 1 and Fig. 15) can be explained similarly: irrespective of the detailed diffusion mechanism, any amount of static, short-range structural disorder already present in the GB lowers the barrier for nn diffusion jumps of the atoms. Conversely, as evidenced by the perfect-crystal-like activation energy for self-diffusion in low-energy GBs (such as the (111) twin), the absence of short-range GB structural disorder increases the saddle-point energy for diffusion jumps, until a maximum activation energy is reached, namely the perfect-crystal value associated with zero GB energy.
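The Lindemann analogy invoked above can be made quantitative by monitoring the ratio of the rms thermal displacement to the nearest-neighbor distance; melting is empirically expected once the ratio exceeds roughly 0.1–0.15. The sketch below is a generic implementation of that criterion, not an analysis performed in Ref. [17]:

```python
import numpy as np

def lindemann_ratio(trajectory, r_nn):
    """Lindemann ratio: rms displacement of the atoms about their
    time-averaged positions, divided by the nearest-neighbor
    distance r_nn.

    trajectory: (T, N, 3) array of atomic positions over T snapshots.
    Melting (or, by analogy, the GB solid-to-liquid transition) is
    empirically expected for ratios above roughly 0.1-0.15.
    """
    mean_pos = trajectory.mean(axis=0)                 # (N, 3) averages
    u2 = ((trajectory - mean_pos) ** 2).sum(axis=2).mean()
    return np.sqrt(u2) / r_nn
```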
5. Conclusions
The picture that emerges from the above simulations of GB self-diffusion in Si and Pd bicrystals suggests that upon heating from zero temperature through the melting point, the highest-energy tilt and twist GBs in a polycrystalline microstructure undergo a reversible transition from a low-temperature, solid structure with a rather high, GB energy-dependent activation energy, to a highly confined liquid-like GB structure with a universal activation energy that is related to, and slightly higher than, that in the melt. In parallel, the diffusion mechanism changes from being solid-like to liquid-like; the latter is practically indistinguishable from that in the melt. The temperature at which the transition occurs increases with decreasing GB energy until it reaches the melting point. Lower-energy GBs, with less built-in short-range GB structural disorder, never reach the critical level in the overall (static and dynamic) structural disorder that is required to trigger the transition below Tm ; their diffusion behavior is therefore solid-like all the way up to Tm , with an activation energy that increases with decreasing GB energy until it reaches the perfect-crystal value. One should keep in mind that this classification of GB diffusion based on the GB energy applies only to high-angle GBs, i.e., boundaries with completely overlapping dislocation cores and hence a more or less homogeneously distributed type of structural disorder along the interface. This can clearly not be the whole picture, however. For example, in dislocation boundaries (i.e., low-angle or vicinal GBs) local structural disordering is distributed in a highly
inhomogeneous manner: the disorder is localized in the dislocation cores, which are well separated by elastically strained, perfect-crystal-like regions. Two GBs with the same, relatively low energy, one a dislocation boundary and the other a "true" high-angle GB (with a homogeneously distributed type of GB structural disorder), will therefore exhibit rather different diffusion behaviors: the dislocation boundary will exhibit rather fast diffusion down the dislocation pipes but no diffusion at all in the perfect-crystal regions; by contrast, diffusion in the low-energy, high-angle boundary will probably be much slower but homogeneously distributed throughout the GB.
Acknowledgments

This work was supported by the US Department of Energy, BES-Materials Science, under Contract W-31-109-Eng-38.
References

[1] D. Wolf, "Grain boundaries: structure," In: R. Cahn (principal editor), The Encyclopedia of Materials Science and Technology, Pergamon Press, pp. 3597–3609, 2001.
[2] W.T. Read and W. Shockley, Phys. Rev., 78, 275, 1950.
[3] D. Wolf and K.L. Merkle, Chapter 3 in: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-level Structure and Properties, Chapman and Hall, pp. 87–150, 1992.
[4] V. Pontikis, J. de Physique (Paris), 49, C5-327, 1988.
[5] S.R. Phillpot, J.F. Lutsko, D. Wolf, and S. Yip, Phys. Rev. B, 40, 2831, 1989.
[6] S.R. Phillpot, S. Yip, and D. Wolf, Comput. in Phys., 3, 20, 1989.
[7] D. Wolf, P. Okamoto, S. Yip, J.F. Lutsko, and M. Kluge, J. Mater. Res., 5, 286, 1990.
[8] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, Phys. Rev. Lett., 77, 2965, 1997; see also J. Am. Ceram. Soc., 80, 717, 1997.
[9] F.H. Stillinger and T.A. Weber, Phys. Rev. B, 31, 5262, 1985.
[10] F. Cleri, P. Keblinski, L. Colombo, S.R. Phillpot, and D. Wolf, Phys. Rev. B, 57, 6247, 1998.
[11] J. Tersoff, Phys. Rev. B, 38, 9902, 1988.
[12] E. Tarnow, P. Dallot, P.D. Bristowe, J.D. Joannopoulos, G.P. Francis, and M.C. Payne, Phys. Rev. B, 42, 3644, 1990.
[13] W. Rosenhain and D. Ewen, J. Inst. Metals, 10, 119, 1913; for a review of this work, see K.T. Aust and T. Chalmers in: Metal Interfaces, ASM, Cleveland, OH, 1952.
[14] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, Phil. Mag. Letters, 76, 143, 1999.
[15] E. Budke, T. Surholt, S.I. Prokofjev, L. Shvindlerman, and C. Herzig, Acta Mater., 47, 385, 1999.
[16] B. Schoenfelder, P. Keblinski, D. Wolf, and S.R. Phillpot, Intergranular and Interphase Boundaries in Materials, Trans. Tech. Publ., pp. 9–16, 1999.
[17] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Phil. Mag. A, 79, 2735, 1999.
[18] J.B. Adams, S.M. Foiles, and W.G. Wolfer, J. Mater. Res., 4, 102, 1989.
[19] D. Wolf, Chapter 1 in: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-level Structure and Properties, Chapman and Hall, pp. 1–57, 1992.
[20] D. Wolf and J. Jaszczak, Chapter 26 in: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-level Structure and Properties, Chapman and Hall, pp. 662–690, 1992.
6.11 CRYSTAL DISORDERING IN MELTING AND AMORPHIZATION

Sidney Yip¹, Simon R. Phillpot², and Dieter Wolf³
¹Department of Nuclear Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
²Department of Materials Science and Engineering, University of Florida, Gainesville, FL 32611, USA
³Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
Among the structural phase transitions that evolve from an initially crystalline state, melting is the most common and most extensively studied. Another transformation that produces a disordered final state is solid-state amorphization. In this section the underlying thermodynamic and kinetic features of these two phenomena, in a bulk lattice and at surfaces and grain boundaries, will be discussed [1]. By focusing on the insights derived from molecular-dynamics simulations, we are led quite naturally to a view of structural disordering that unifies the crystal-to-liquid (C–L) and crystal-to-amorphous (C–A) transitions at high and low temperatures, respectively. On the one hand, a variety of models have been developed to describe melting in which intrinsic lattice defects, produced spontaneously within the crystal, play a central role [2]. On the other hand, it is known from experiments that melting generally occurs at extrinsic defects such as free surfaces, grain boundaries, and voids [3]. From the standpoint of understanding the mechanisms of melting, the simulation technique of molecular dynamics (MD) offers a way to follow the dynamics of the disordering transitions in molecular detail. In addition to offering precise control over the initial atomic configuration of the system and the manner in which the sample is heated, MD simulation allows one to fully characterize the atomistic details associated with the onset of disorder. As we discuss here, MD simulations show that there are two distinct paths in melting. In the presence of an extrinsic defect, melting is a heterogeneous process in which a small region of disorder is first nucleated at the defect and then propagates through the system. In the absence of extrinsic defects, melting is a homogeneous process in
which the crystal lattice becomes mechanically unstable to shear deformation. Heterogeneous melting takes place at the thermodynamic melting point Tm, which can be independently determined by equating the free energies of the crystal and the liquid. Homogeneous melting, on the other hand, generally occurs at a higher temperature, Ts, which can be determined from an analysis of the elastic constants of the crystal [4]. (Additionally, entropy arguments may be used to predict crystalline instabilities at temperatures above the thermodynamic melting temperature.) If, however, surface melting is suppressed, then the solid can be substantially superheated, as has been observed in the case of crystalline spheres of silver coated with gold. In terms of the familiar thermodynamic phase diagram, conventional melting has a clear and simple representation. When the equation of state is projected onto the temperature–volume plane, the melting curve is seen to terminate at the triple-point temperature Tt. Conceptually, one may think of an extension of the melting curve below Tt as suggesting the existence of a metastable crystalline phase in which sublimation, a thermally activated process, is kinetically suppressed. Following this point of view one can construct an effective phase diagram which also describes solid-state amorphization, in the sense of a combined representation of thermodynamics and kinetics.
1. How Crystals Melt – Thermodynamics vs. Kinetics
In thermodynamics the melting point Tm is the temperature at which the crystal and the liquid phases coexist in equilibrium, with coexistence being governed by the equality of the Gibbs free energies, G, of the crystal and the liquid,

G = E − TS + PV,   (1)
where E is the internal energy and T, S, P, and V are the system temperature, entropy, pressure, and volume, respectively. For a material where the interatomic interactions are specified, the free energies can be evaluated using the atomistic simulation methods discussed in Chapter 2; in this way one can essentially predict the melting point from the intersection of the two free-energy curves. Notice, however, that while thermodynamics tells us when the system should melt, it says nothing about how the melting process should occur. To see the disordering of a crystal lattice actually taking place at the molecular level, a question of kinetics, it is appropriate to also turn to molecular-dynamics simulation.
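In code, the thermodynamic prescription amounts to locating the crossing of two free-energy curves. A minimal sketch, with hypothetical linearized G(T) curves standing in for actual free-energy data:

```python
from scipy.optimize import brentq

# Tm from the intersection of crystal and liquid Gibbs free energies.
# The two curves are hypothetical linearizations (G ~ G0 - S*T near
# the crossing), not actual Si free-energy data.
def G_crystal(T):
    return -0.10 - 2.0e-4 * T      # [eV/atom]: lower energy, lower entropy

def G_liquid(T):
    return 0.20 - 3.8e-4 * T       # higher energy, higher entropy

# The melting point is where the two free energies are equal:
Tm = brentq(lambda T: G_crystal(T) - G_liquid(T), 500.0, 3000.0)
print(f"Tm = {Tm:.0f} K")
```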
Consider the case of silicon, where the free-energy calculation of Tm was made for a three-body potential for Si in the diamond cubic phase, giving a surprisingly accurate prediction of 1691 ± 30 K [5] (the experimental value is 1683 K). Given this result, it would appear that using the same potential in a straightforward molecular-dynamics simulation of crystal heating would readily reveal the kinetics of crystal melting. This, however, was not what happened [6, 7]. In heating a perfect lattice of Si atoms with periodic boundary conditions in incremental steps, it was observed that the crystal remained stable well past 1691 K, even up to 2200 K; finally, at T ∼ 2500 K the ordered lattice collapsed abruptly, with the disordering of atomic positions appearing to occur everywhere in the system. The observation of structural collapse thus seemed to contradict the above definition of melting. Why did the crystal not melt at 1691 K as predicted by the free-energy calculations? What kind of transition took place at 2500 K when structural disordering set in? The answer to the first question was obtained by repeating the simulations with the periodic boundary condition turned off along one direction, which introduced two free surfaces at the ends of the simulation cell. The results showed that in the temperature range above 1700 K, structural disordering appeared first in the free-surface region and then propagated into the bulk crystal, as shown in Fig. 1. By evaluating the static structure factor for each layer of atoms (see Chapter 6.8) one could follow the melt–crystal interface and deduce the speed of the moving front, v(T), T being the temperature at which the system was being heated. Figure 2 shows the resulting variation of the speed of the interface with temperature. The significance of this apparently simple behavior lies in the fact that by extrapolating the data to zero speed one obtains a temperature of 1710 K, essentially the melting temperature predicted by the free-energy calculations. With a bit of hindsight it became apparent that this way of analyzing the simulation results amounts to a practical method of implementing the thermodynamic definition of phase coexistence: at the melting point the melt–crystal interface should not move in either direction. It is worth pointing out a general implication, namely, that rather than trying to find a phase-transition point precisely, it is often easier to approach it by extrapolation. Although not very elegant, this method is simple and reasonably robust [6, 7]. We determined the melting point of a crystal by using a free surface to nucleate the structural transition and then extrapolating the "melting speed" to zero. From the standpoint of a thermodynamic process, melting is seen to require the presence of a nucleation site, the free surface in the present discussion. In the absence of a nucleation site, the transition can be kinetically suppressed, as was the case in the simulations using fully periodic boundary conditions, forcing the system to go into a superheated state. Other defects could serve just as well as the necessary nucleation site. For example, a grain boundary or a 13-vacancy void has been found to lead to the same extrapolated melting temperature as obtained in the free-surface simulations [8].
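The extrapolation itself takes only a few lines: fit the front velocities to a quadratic in T, as in Fig. 2, and solve for the zero crossing. The data points below are illustrative stand-ins read off qualitatively from Fig. 2, not the published values:

```python
import numpy as np

# Extrapolating the melt-front velocity v(T) to zero to estimate Tm.
# (T, v) pairs are illustrative stand-ins for the data of Fig. 2.
T = np.array([1800.0, 1900.0, 2000.0, 2100.0, 2200.0])   # [K]
v = np.array([18.0, 35.0, 55.0, 82.0, 110.0])            # [m/s]

coeffs = np.polyfit(T, v, 2)      # quadratic fit, as used in Fig. 2
roots = np.roots(coeffs)
Tm = roots[(roots.real > 1500) & (roots.real < 1800)].real[0]
print(f"extrapolated Tm ~ {Tm:.0f} K")
```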
Figure 1. Heterogeneous thermodynamic melting of a silicon bicrystal containing in its center a (110) θ = 50.48° (Σ11) twist boundary (GB). (The two semicrystals are pulled apart to facilitate visualization of the structural disorder.) After the bicrystal was heated from 1600 K (T < Tm) to 2200 K (T > Tm) over a period of 600 time steps (1000 steps correspond to 1.15 ps of real time), the simulation time was set to t = 0. Shading of the atoms indicates nearest-neighbor coordination, C, with white, gray, and black circles denoting C = 4, 5, and 3, respectively. (a) After 2700 time steps, a number of planes on either side of the GB plane have melted. Near-zero values of the structure factors S(k1) and S(k2) show a breakdown of long-range order in approximately the seven (110) planes closest to the GB. By contrast, at t = 0 only a few atoms at the GB had coordination greater than four, and the structure-factor profiles showed a well-defined GB region consisting of about four (110) planes. (b) After 8100 time steps, melting has spread over half of the system, with loss of long-range order in the 20 central planes.
Figure 2. Propagation velocity of the silicon solid–liquid interface as a function of temperature after nucleation from a grain boundary (solid squares with error bars) and from a free surface (open squares). The curve, representing a quadratic fit to the data points, extrapolates to zero velocity at T = 1710 ± 30 K.
2. Mechanical Melting via Elastic Instability
It still remains to clarify the nature of the abrupt structural disordering observed at 2500 K in the simulation runs with fully periodic boundary conditions. We know that in the absence of a nucleation site the crystal can be readily superheated to a metastable state. The upper limit of metastability is the mechanical stability limit of the lattice, the temperature at which the system collapses without waiting for the nucleation and growth of a liquid layer. This is what happened when the lattice disordered uniformly at 2500 K. We will henceforth refer to this homogeneous process as mechanical melting, in contrast to the heterogeneous process of thermodynamic melting. Normally the former is not observed, since the latter invariably occurs at a lower temperature. One can understand the nature of mechanical melting by recalling the simple criterion for crystal melting set forth by Born, in which melting is defined by the loss of shear rigidity [9]. Accordingly, the melting point Tm is the temperature at which the shear modulus G vanishes,

G(Tm) = 0.   (2)
In contrast to the thermodynamic definition based on free energies, this is a thermoelastic description based on elastic stability. Indeed Born extended
Eq. (2) to a set of conditions governing the structural stability of a cubic lattice [10],

C11 + 2C12 > 0,   C11 − C12 > 0,   C44 > 0,   (3)
where C11, C12, and C44 (= G) are the three distinct elastic constants (in Voigt notation). The melting criterion, Eq. (2), was not successful in explaining the experimental results at the time; it was also not clear how it could account for the existence of a latent heat and a latent volume associated with a first-order thermodynamic phase transition. Given that molecular dynamics allows one both to observe structural disordering directly and to calculate the elastic constants, the validity of Eq. (2) can be tested. This was carried out by simulating the isobaric heating of a perfect crystal with fully periodic boundary conditions. Such a study showed that an abrupt structural disordering was indeed triggered by the vanishing of a shear modulus, although it was C11 − C12 instead of C44 [11]. It was shown that the temperature at which the lattice collapsed was in good agreement with that predicted by the elastic stability conditions. Since the crystal analyzed was initially defect free, the structural disordering was unlikely to correspond to the free-energy-based process, which should occur at a lower temperature. The simulations employed an interatomic potential model for fcc Au (the details of the potential are not important for the present discussion) and a cell containing 1372 atoms with deformable (Parrinello–Rahman) periodic boundary conditions imposed at constant stress (see Chapter 2, Basic MD). A series of isostress–isothermal simulations (with velocity rescaling) was carried out over a range of temperatures. At each temperature the atomic trajectories generated were used to compute the elastic constants at the current state using fluctuation formulas [12]. Figure 3 shows the variation with temperature of the lattice strain a/a0 along the three cubic symmetry directions. The slight increase with increasing temperature merely indicates that the lattice is expanding normally with temperature. Also, the results for the three directions are the same, as they should be for a cubic crystal. At T = 1350 K one sees a sharp bifurcation in the lattice dimensions, where the system elongates in two directions while contracting in the third. This is a clear signal of symmetry breaking, from cubic to tetragonal. To see whether the simulation results are in agreement with the prediction based on Eq. (3), we show in Fig. 4 the variation of the elastic moduli with temperature, or equivalently with lattice strain (the one-to-one correspondence is seen in Fig. 3); the three moduli of interest are the bulk modulus BT = (C11 + 2C12)/3, the tetragonal shear modulus G′ = (C11 − C12)/2, and the rhombohedral shear modulus G = C44. On the basis of Fig. 4 one would predict the incipient instability to be the vanishing of G′, occurring at the theoretical (predicted) lattice strain of (a/a0)th = 1.025. From the simulation at T = 1350 K the observed strain is (a/a0)obs = 1.024.
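The stability conditions (3) translate directly into a check on the three moduli. A minimal sketch, with hypothetical elastic constants chosen only for demonstration (in a real analysis they would come from the fluctuation formulas at each temperature):

```python
def born_stability(C11, C12, C44):
    """Born stability moduli of a cubic crystal, Eq. (3):
    bulk modulus BT, tetragonal shear G', rhombohedral shear G = C44.
    Returns the moduli and a flag that turns False once any vanishes.
    Units follow the inputs (e.g., Mbar)."""
    BT = (C11 + 2.0 * C12) / 3.0
    Gp = (C11 - C12) / 2.0
    G = C44
    return BT, Gp, G, (BT > 0.0 and Gp > 0.0 and G > 0.0)

# Hypothetical constants in Mbar, loosely in the range of Fig. 4:
print(born_stability(1.6, 1.2, 0.35))   # all moduli positive: stable
print(born_stability(1.3, 1.3, 0.30))   # G' = 0: tetragonal shear instability
```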
Figure 3. Temperature variation of the lattice strain a/a0 along the three directions of the cubic simulation cell in a series of isobaric (zero-pressure) simulations of incremental heating of a perfect crystal of Au atoms, showing a sudden symmetry-breaking structural response at 1350 K.
Figure 4. Variation of the elastic moduli BT, G, and G′ with lattice strain a/a0 in the same isobaric heating simulation as in Fig. 3.
Thus we can conclude that the vanishing of the tetragonal shear modulus is responsible for the structural bifurcation behavior. For more details of the system behavior at T = 1350 K, we show in Fig. 5 the time evolution of the lattice strain, the off-diagonal elements of the cell matrix H, and the system volume. It is clear from Fig. 5(a) that the vanishing of G′ triggers both a shear deformation (cf. Fig. 5b) and a lattice decohesion (Fig. 5c), the latter providing the characteristic volume expansion
Figure 5. Overall system response in time in the same simulation as in Fig. 3, revealing further characteristic behavior beyond the onset of the triggering instability: (a) lattice strain a/a0 along the initially cubic simulation cell, (b) off-diagonal elements of the cell matrix H, and (c) normalized system volume Ω/Ω0. Arrows indicate the onset of the Born instability in (a), the shear instability in (b), and lattice decohesion in (c).
associated with melting. This sequence of behavior implies that the signature of a first-order transition, namely the latent volume change, is not necessarily associated with the incipient instability. Our results also provide evidence supporting Born's picture of melting being driven by a thermoelastic instability, interpreted to involve a combination of the loss of shear rigidity and the vanishing of the compressibility [13]. Moreover, it is essential to recognize that this mechanism applies only to the process of mechanical instability (homogeneous melting) of a crystal without defects, and not to the coexistence of solid and liquid phases at a specific temperature. Although our results for an fcc lattice with metallic interactions show that homogeneous melting is triggered by G′ = 0 rather than by Eq. (2), they nevertheless constitute clear-cut evidence that a shear instability is responsible for initiating the transition. The fact that simulation reveals a sequence of responses apparently linked to the competing modes of instability (cf. Fig. 5) implies that it is no longer necessary to explain all the known characteristic features of melting on the basis of the vanishing of a single modulus. In other words, independent of whether G′ = 0 is the initiating mechanism, the system will in any event undergo volume change and latent-heat release in sufficiently rapid order (on the time scale of physical observation) that these processes are all identified as part of the melting phenomenon. Generalizing this observation further, one may entertain the notion of a hierarchy of interrelated stability catastrophes of different origins: elastic, thermodynamic, vibrational, and entropic [14]. It may be mentioned that generalizing the stability criteria to arbitrary external load has led to the identification of the elastic instability triggering a particular structural transition [15]. In hydrostatic compression of Si, the instability which causes the transition from the diamond cubic to the β-tin structure is the vanishing of G′(P) = (C11 − C12 − 2P)/2. By contrast, compression of crystalline SiC in the zinc-blende structure results in an amorphization transition associated with the vanishing of G(P) = C44 − P. For behavior under tension, crack nucleation in SiC and cavitation in a model binary intermetallic, both triggered by the spinodal instability, i.e., the vanishing of BT(P) = (C11 + 2C12 + P)/3, are results analogous to the observations reported here. Notice also that in the present study a crossover from spinodal to shear instability can take place at sufficiently high temperature [15].
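The load-generalized criteria quoted above can be encoded in the same spirit; the sketch below evaluates the three pressure-dependent moduli of Ref. [15] and reports which one softens to zero first under increasing hydrostatic compression (the elastic constants used are hypothetical):

```python
def moduli_under_load(C11, C12, C44, P):
    """Pressure-dependent stability moduli of a cubic crystal under
    hydrostatic load P (> 0 for compression), following the generalized
    criteria of Ref. [15]:
      G'(P) shear instability (diamond cubic -> beta-tin in Si),
      G(P)  shear instability (amorphization of zinc-blende SiC),
      BT(P) spinodal instability (relevant under tension)."""
    return {"G'": (C11 - C12 - 2.0 * P) / 2.0,
            "G": C44 - P,
            "BT": (C11 + 2.0 * C12 + P) / 3.0}

# Hypothetical constants (arbitrary units): under compression here,
# G'(P) is the first modulus to vanish.
for P in (0.0, 0.1, 0.2):
    print(P, moduli_under_load(1.0, 0.6, 0.45, P))
```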
3. Premelting and Thermal Disordering at an Interface
The question of whether a liquid-like layer can form at a grain boundary at temperatures distinctly below the melting point Tm is of long-standing interest [16]. A number of molecular dynamics simulations have been performed to address this issue. The early studies had found significant structural
disordering at T ≤ Tm, which led to conflicting interpretations of the existence of "grain-boundary premelting". A perceptive review of the factors that could account for the apparent discrepancy is a good reminder, for students interested in atomistic simulations, of the need to guard against numerical artifacts [17]. The present consensus concerning the high-temperature structural stability of grain boundaries is that there is no evidence for a premelting transition or a stable disordered phase below the melting temperature of the bulk. This is consistent with the foregoing discussion of melting in Si, where the same melting point was obtained from simulations of a single crystal with free surfaces and simulations of a bicrystal with no free surfaces. It is also consistent with a TEM study of aluminum bicrystals in which no premelting was observed up to 0.999 Tm [18]. The surface-induced melting analysis discussed here has been applied to further investigate thermal structural disordering at a crystalline interface in aluminum [19]. Besides showing that melting is nucleated at the grain boundary in a similar manner as at the free surface, the results provided details of metastable behavior commencing at about 0.93 Tm. Three structural models were examined: a single crystal with 3D periodic boundary conditions (model A), a single crystal with 2D periodic boundary conditions and free surfaces along the third direction (model B), and a bicrystal with the same boundary conditions as model B (model C). The interatomic potential used was a many-body potential of the EAM type for Al. From model A the mechanical melting temperature Ts was established to be 950 K. Using model B, the thermodynamic melting point Tm was found by extrapolating to zero the propagation speed of the surface-nucleated disordered layer; the result was 865 ± 15 K. Meanwhile, the bulk, ordered region of model B was observed to behave in the same manner as model A. (The experimental melting point of Al is 930 K; the difference between this value and Tm is attributed to the interatomic interaction model used, an EAM-type potential with no fitting to any thermal property.) With model C having two kinds of interfaces, the grain boundary and the free surfaces, the analysis of interface-induced melting could be applied separately to the disordered layers nucleated in the vicinity of each interface to produce two extrapolated temperatures. Both extrapolations gave the same result for the thermodynamic melting point, the value of 865 K that was also obtained from model B. Concerning the nature of disordering in the free-surface or grain-boundary region at temperatures just below Tm, results on the energy–temperature variation and the re-emergence of structural order after a long run reveal the onset of metastable behavior. It is tempting to regard such data, along with the rapid growth of the interfacial thickness as one approaches Tm, as supporting the theoretical prediction of a continuous process, in contrast to the first-order melting transition in the bulk. On the other hand, one should keep in mind that on the short time and distance scales of molecular-dynamics simulation,
kinetic-barrier effects associated with local energy minima could mask any “intrinsic metastability behavior”. Since the former are expected to be sensitive to system size, some assessment of their presence could be made by using larger simulation cells. To explicitly demonstrate metastability, one could examine potential-energy surfaces obtained by applying energy minimization to a sufficient number of system configurations generated during molecular dynamics simulation (see Chapter 2), or reaction pathway sampling methods (see Chapter 5). The concept of inherent structure [20] may well be relevant to the understanding of competing effects between ordering in the bulk and disordering at the interface.
4. Parallels Between Melting and Amorphization
There are similarities in the underlying thermodynamic and kinetic features of the structural disordering transitions to the liquid and the amorphous states that are worth exploring. We have already commented on such features in discussing the two types of melting transitions: the heterogeneous process of nucleation and growth at defect sites and the homogeneous process of lattice instability. It should not be surprising that similar distinctions can be made in the disordering transition where a destabilized lattice ends up in an amorphous state. The traditional methods of producing amorphous materials have been rapid solidification of a melt and quenching from the vapor state. More recent methods involving solid-state processes include ion and electron irradiation, chemical interdiffusion, mechanical deformation, and pressure-induced amorphization [21]. Just as the discussion of the mechanisms of melting continues to be a topic of current interest, the understanding of solid-state amorphization is by no means complete, despite a considerable body of investigations [22]. Here we point out certain parallels between melting and amorphization by combining the mechanistic insights on melting derived from simulations with related observations from amorphization experiments [1]. Since melting occurs at elevated temperature, it is tempting to think that amorphization occurs more readily at low temperatures. Because solid-state amorphization experiments are usually carried out at temperatures where point-defect mobility is limited, both heterogeneous and homogeneous transitions have been observed. In the case of beam irradiation, the amorphous phase was nucleated at defect sites when the temperature was close to a threshold value, whereas at lower temperatures a homogeneous transformation of the entire irradiated volume occurred. In experiments on hydrogen charging, distinctly heterogeneous or homogeneous processes were observed to occur in regions that are, respectively, poor or rich in hydrogen. These observations show that the two mechanisms of melting occur also in amorphization.
Besides temperature, another state variable which comes into play is the elastic strain, or volume change (usually expansion), accompanying the onset of structural disorder. The existence of a volume change associated with melting and amorphization is clearly seen in the decrease of the shear modulus in both types of transitions. There is an additional effect of chemical disordering in the amorphization of alloys; however, this is not an issue in the case of the hydrogenation experiments, where little or no chemical disordering takes place. We can combine our considerations of temperature and volume expansion by giving a unified interpretation of melting and amorphization in terms of a thermodynamic phase diagram. In Fig. 6 we show the phase boundaries delineating the crystal, liquid, and vapor states of an elemental substance in the temperature–volume plane. The condition for thermodynamic melting is expressed by the melting curve Tm(V), which terminates at the triple-point temperature Tt, the lowest temperature at which the crystal can coexist with the liquid. The freezing curve Tf(V) lies more or less parallel to the melting curve, also terminating at Tt. For many materials the melting point Tm is only a little higher than the triple-point temperature (often by less than 1 K).
Figure 6. Schematic temperature–volume phase diagram of a monatomic substance showing the single-phase regions of crystal (C), liquid (L), and vapor (Vap), and the various two-phase regions. On the horizontal triple line, at temperature Tt, the crystal (at volume VtC) and the liquid (at volume VtL) coexist with the vapor. The points on the thermodynamic melting line, Tm(V), and the freezing curve, Tf(V), indicate conditions of ambient pressure. As discussed in Chapter 5, to a good approximation, the freezing curve and the mechanical-stability line, Ts(V), coincide.
Thus any experiment performed at T < Tm is also performed at T < Tt, which calls into question the meaning of "surface premelting" in the sense of equilibrium thermodynamics. Although the phase diagram indicates that below Tt the C–L transition cannot occur as an equilibrium thermodynamic process, one can nevertheless define a critical volume Vs(T), with T < Tt, at which the crystal becomes mechanically unstable upon uniform volume expansion, as shown in Fig. 7. In effect this extends the instability curve for mechanical melting to arbitrarily low temperatures. That this is still consistent with the vanishing of the shear modulus can be demonstrated by calculating the shear modulus as a function of temperature, down to zero temperature. The extended curve therefore represents the existence of a metastable crystalline phase in which the thermally activated process of sublimation is kinetically suppressed. In other words, we justify the crossing of the phase boundaries by restricting the interpretation to time scales short compared to the relevant kinetics. With sublimation kinetically suppressed, a similar extension of the thermodynamic melting curve Tm(V) can be introduced, as also shown in Fig. 7. Notice that neither the triple line separating the C–L region from the C–Vap region nor the sublimation curve appears in this diagram. The region lying to the left of Tm(V) now becomes the effective single-phase region for the crystalline state, and it is important to keep in mind that the region between the original sublimation curve and Tm(V) defines a metastable, over-expanded
Figure 7. An effective T–V phase diagram showing the extensions of both the thermodynamic-melting and the mechanical-stability (or freezing) curves below the triple-point temperature. Along the extension of the former, an expanded crystal becomes unstable against the disordered phase by heterogeneous amorphization; by contrast, along the extension of the mechanical-melting curve, homogeneous amorphization can occur.
solid. Similarly, the extended two-phase region below the triple-point temperature is where the over-expanded solid can coexist with the metastable supercooled liquid. The extended (effective) phase diagram applies equally well to solid-state amorphization. From this standpoint the C–A transformation is viewed as an isothermal melting process driven by volume expansion at T < Tm. The analogy with the C–L transition is that volume expansion produced by external forces at constant temperature plays the same role as thermal expansion during isobaric heating to melting. When the expansion is sufficient to cross Tm(V), the crystal becomes unstable against structural disordering, and heterogeneous amorphization may take place. If, in addition to sublimation, heterogeneous amorphization is also kinetically suppressed, then the expansion may continue up to the curve Ts(V), at which point homogeneous amorphization triggered by the Born instability is expected to set in.
5. Further Issues
With structural transitions affecting virtually all properties and behavior, it is unlikely that all the issues relevant to a complete understanding of how a crystal lattice undergoes disordering will be resolved any time soon. We have addressed the thermodynamic meaning of melting as conventionally measured, which we now distinguish as thermodynamic melting, and demonstrated its relation to melting by an elastic instability, which we denote as mechanical melting. Whereas the former is governed by the kinetics of extrinsic defects, the latter is clearly a process intrinsic to all crystals. Further developments along the lines of our considerations are to be expected. The equivalence between the determination of melting by free-energy calculations for a defect-free crystal and a liquid and by the method of defect nucleation described above has been demonstrated in a much more extensive manner in studies that map out the melting curve of a noble-gas element, showing how well simulation can predict experiments [23]. Through molecular-dynamics simulation the connection between the elastic stability criterion and the empirical rule known as Lindemann's law can be clarified, along with an investigation of surface melting [24]. As our understanding of melting mechanisms deepens, the issue of how quickly melting can occur becomes relevant. It is now feasible to observe melting on the time scale of electronic excitations by femtosecond laser spectroscopy. To interpret such measurements, the studies described here would have to be extended to incorporate the role of electronic processes. A more complete understanding of structural disordering phenomena will then have to deal with a multiscale rate process.
References
[1] S.R. Phillpot, S. Yip, P.R. Okamoto, and D. Wolf, "Role of interfaces in melting and solid-state amorphization," In: D. Wolf and S. Yip (eds.), Materials Interfaces, Chapman and Hall, London, pp. 228–254, 1992.
[2] A.R. Ubbelohde, Molten State of Matter: Melting and Crystal Structure, Wiley, Chichester, 1978.
[3] R.W. Cahn, "Melting and the surface," Nature, 323, 668–669, 1986.
[4] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
[5] J. Broughton and X.P. Li, "Phase diagram of silicon by molecular dynamics," Phys. Rev. B, 35, 9120–9127, 1987.
[6] S.R. Phillpot, J.F. Lutsko, D. Wolf, and S. Yip, "Molecular-dynamics simulation of lattice-defect-nucleated melting in silicon," Phys. Rev. B, 40, 2831–2840, 1989.
[7] S.R. Phillpot, S. Yip, and D. Wolf, "How crystals melt," Comput. in Phys., 3, 20, 1989.
[8] J.F. Lutsko, D. Wolf, S.R. Phillpot, and S. Yip, "Molecular-dynamics study of lattice-defect-nucleated melting in metals using an embedded-atom-method potential," Phys. Rev. B, 40, 2841–2855, 1989.
[9] M. Born, J. Chem. Phys., 7, 591, 1939.
[10] M. Born, Proc. Cambridge Philos. Soc., 36, 160, 1940.
[11] J. Wang, J. Li, S. Yip, D. Wolf, and S. Phillpot, "Unifying two criteria of Born: elastic instability and melting of homogeneous crystals," Physica A, 240, 396–403, 1997.
[12] J. Ray, "Elastic constants and statistical ensembles in molecular dynamics," Comput. Phys. Rep., 8, 111–151, 1988.
[13] L.L. Boyer, Phase Transitions, 5, 1, 1985.
[14] J.L. Tallon, "Crystal instability and melting," Nature, 342, 658, 1989.
[15] J. Wang, J. Li, S. Yip, S. Phillpot, and D. Wolf, "Mechanical instabilities of homogeneous crystals," Phys. Rev. B, 52, 12627–12635, 1995.
[16] H. Gleiter and B. Chalmers, High-Angle Grain Boundaries, Pergamon, Oxford, p. 113, 1972.
[17] V. Pontikis, "Grain-boundary structure and phase transformations – a critical review of computer-simulation studies and comparison with experiments," J. de Phys., 49, C5, 327–338, 1988.
[18] T.E. Hsieh and R.W. Balluffi, "Experimental study of grain-boundary melting in aluminum," Acta Metall., 37, 1637–1644, 1989.
[19] T. Nguyen, P.S. Ho, T. Kwok, C. Nitta, and S. Yip, "Thermal structural disorder and melting at a crystalline interface," Phys. Rev. B, 46, 6050–6060, 1992.
[20] F.H. Stillinger and T.A. Weber, "Computer simulation of local order in condensed phases of silicon," Phys. Rev. B, 31, 5262–5271, 1985.
[21] W.L. Johnson, "Thermodynamic and kinetic aspects of the crystal to glass transition in metallic materials," Prog. Mater. Sci., 30, 81–134, 1986.
[22] P.R. Okamoto and M. Meshii, In: H. Wiedersich and M. Meshii (eds.), Science of Advanced Materials, ASM International, Metals Park, OH, p. 33, 1990.
[23] M. de Koning, A. Antonelli, and S. Yip, "Single-simulation determination of phase boundaries: a dynamic Clausius–Clapeyron integration method," J. Chem. Phys., 115, 11025–11035, 2001.
[24] Z.H. Jin, P. Gumbsch, K. Lu, and E. Ma, "Melting mechanisms at the limit of superheating," Phys. Rev. Lett., 87, 055703, 2001.
6.12 ELASTIC BEHAVIOR OF INTERFACES
Dieter Wolf
Materials Science Division, Argonne National Laboratory, Argonne, IL 60439
A large body of work on epitaxial thin films has focused on controlling interfacial structure, particularly on preventing the formation of interface dislocations, which can lead to diffusion and electrical breakdown in semiconductor devices [1]. However, in other types of interfacial materials, dislocations might actually be desirable for controlling or enhancing certain mechanical properties, such as toughness and ductility. In this section we illustrate – by means of atomistic computer simulations – the important role of the atomic structure and, in particular, of misfit dislocations in the elastic behavior of metallic interface materials. In particular, we review atomic-level simulations that elucidate the causes of the anomalous elastic behavior of thin films and composition-modulated superlattice materials. (For earlier reviews, see Refs. [2] and [3].)
The investigation of thin-film superlattices composed of grain boundaries (GBs) shows that the elastic anomalies are not necessarily an electronic but a structural interface effect that is intricately connected with the local atomic disorder at the interfaces [4]. The consequent predictions that (i) coherent strained-layer superlattices should show the smallest elastic anomalies and (ii) incoherent interfaces should exhibit much larger anomalies are validated by simulations of dissimilar-material superlattices. We hope to demonstrate that such simulations can be an effective aid in tailoring the elastic behavior of composite materials because, by contrast with experiments, they allow one to systematically investigate simple, but well characterized, model systems with increasing complexity. This unique capability of simulations has enabled elucidation of the underlying driving forces and, in particular, (i) deconvolution of the distinct effects due to the inhomogeneous atomic-level disorder localized at the interfaces from the consequent interface-stress-induced anisotropic lattice-parameter changes and (ii) separation of the homogeneous effects of thermal disordering from the inhomogeneous effects due to the interfaces.
This case study of the interfacial elastic behavior also demonstrates the intuitive insights into the physical behavior of inhomogeneous systems obtainable from atomistic simulations. This behavior is often opposite to our usual intuition gained from the investigation of homogeneous systems. The simulations discussed below suggest that the usual intuition that the elastic constants and moduli soften when the density of a material decreases (for example, by thermal expansion) does not necessarily apply to inhomogeneous systems. Instead, these simulations demonstrate that the net elastic response of an interface material is the result of a rather complex, highly non-linear competition between two influences: structural disordering and the consequent volume expansion.
Our focus is mostly on multilayered systems (see Fig. 1) [5]. The elastic behavior of multilayered materials is known to exhibit significant anomalies, in that some elastic constants and moduli are significantly strengthened (the so-called "supermodulus effect") while others may actually be softened [6]. In the past, much controversy has focused on the actual magnitude of these anomalies [7–11]; however, it now seems rather widely accepted that, although no longer thought to be quite as striking as suggested in the original paper [7], these anomalies (typically of the order of 10–50%) are significantly larger than predictions from continuum elasticity theory based on the underlying anisotropic lattice-parameter changes (typically of the order of a few percent) which usually accompany the elastic anomalies.
Figure 1. (a) Coherent or "strained-layer" and (b) incoherent thin-film superlattice (schematic). The lattice parameters parallel and perpendicular to the interfaces will adjust in response to interfacial stresses.
The physical origin of these anomalies has also been the subject of much controversy, including suggestions that this behavior arises from (i) the modified net electronic structure of the multilayered material, (ii) the anisotropic – but homogeneous – state of strain of the two constituent films forming the multilayer, or (iii) the greatly disordered atomic structure of the inhomogeneous regions surrounding the interfaces (see Fig. 1) [6]. In this section we present evidence that the anomalies are intimately connected with the detailed atomic structure of the interfaces, while the homogeneous strains and the related modification of the electronic structure appear to play only secondary roles.
The fundamental importance of composition-modulated layered materials as model interfacial materials is due to the fact that, by controlling the composition modulation wavelength, Λ (see Fig. 1), the fraction of atoms at or near the interfaces can be varied systematically. As a consequence, the physical response of these model systems consists of a tunable mixture of homogeneous and inhomogeneous effects: by decreasing Λ gradually, more and more atoms in the system experience the presence of the interfaces, and their behavior resembles less and less that of a homogeneous system, thus gradually exposing the behavior characteristic of the inhomogeneous parts of the material. These characteristics consist mainly of a highly non-linear modification of the atomic structure and of the related elastic behavior with decreasing Λ.
In "real" multilayers the simultaneous presence of both structural and chemical disorder at the interfaces (the latter associated, for example, with interfacial reactions and segregation) gives rise to considerable difficulties in both the full characterization of the material and the interpretation of any observed elastic anomalies in terms of the underlying atomic structure and composition of the interfaces. In simulations this difficulty can be avoided entirely by choosing model systems that are atomically and chemically flat, thus greatly simplifying the interpretation of the observed elastic behavior of these model systems; such a simplification is not easily possible in experiments.
The outline of the paper follows a simple building-block concept in which the thin slab, delimited by flat surfaces and of variable thickness, is considered as the basic building block of the multilayered system (Fig. 1). Following a brief discussion of the computational approach in Section 1, we first consider the structure and elastic behavior of superlattices of GBs (see Fig. 2), thus avoiding any effects due to materials and interfacial chemistry (Section 2). These grain-boundary superlattices (GBSLs) represent ideal model systems for the investigation of the purely structural aspects of the anomalous elastic behavior, including the structure-property correlation. Next, by periodically stacking up thin films of different materials, the grain boundaries will be replaced with phase boundaries, thus modeling the more complex behavior of composition-modulated superlattices (Section 3). Finally, by comparing the temperature dependence of the elastic behavior of multilayers with that of a
perfect crystal (Section 4), the homogeneous effects of temperature will be compared with the inhomogeneous effects due to the interfaces.
Figure 2. Schematic showing the periodic arrangement of thin slabs, A and B, to form a grain-boundary superlattice (GBSL). A and B are identical materials of equal thickness, but rotated with respect to each other about the interface normal (||z) to form a periodic array of GBs. In a composition-modulated superlattice, A and B are different materials.
1. Computational Approach
A hierarchy of atomistic simulation codes, schematically shown in Fig. 3, has been used to determine the structure and elastic behavior of multilayered systems. Following the choice of a suitable interatomic potential and the desired multilayer geometry, the system of atoms is first relaxed at zero temperature using an iterative energy-minimization algorithm ("lattice statics"). The periodic border conditions are chosen to be consistent with the geometry of the system. For example, three-dimensional (3D) periodic borders are used in the simulation of perfect crystals and superlattices, while 2D periodic border conditions are imposed in the plane of the interfaces for thin films and bicrystalline interfaces (i.e., individual grain or phase boundaries embedded between two bulk perfect crystals). The relaxation may be performed under conditions of either constant volume or constant stress, permitting one to elucidate the role of the anisotropic lattice-parameter changes in the elastic behavior. Following the complete relaxation of the system, the 6 × 6 elastic-constant and -compliance tensors at T = 0 are evaluated using a lattice-dynamics-like method [12]. The elastic constants thus obtained can be tested and verified by direct comparison with those extracted from stress-strain curves. Finally, the relaxed structures thus obtained can be used as input into molecular-dynamics (MD) simulations to study the effects of temperature.
Figure 3. Schematic representation of the hierarchy of computer codes used to atomistically simulate the structure and elastic behavior of interface materials: from the interatomic potential(s) and the unrelaxed structure, via force relaxation ("lattice statics"), to lattice dynamics and molecular dynamics.
A non-trivial conceptual problem in the evaluation of elastic constants for inhomogeneous systems arises from the internal relaxations that occur following the application of an external strain or stress to the system. This relaxation effect, absent when homogeneously deforming, for example, a perfect monatomic cubic crystal, gives rise to a contribution to the zero-temperature elastic constants [12], in addition to the well-known Born term [13]. In MD simulations of elastic constants this relaxation contribution is part of the so-called fluctuation term [12, 14], which for inhomogeneous systems does not vanish in the T → 0 limit.
In order to elucidate the degree to which the simulated results depend on the potential and the material being simulated, the results obtained by means of two conceptually different fcc-metal potentials will be compared: a Lennard–Jones (LJ) pair potential fitted for Cu (with ε = 0.167 eV and σ = 2.315 Å) and an embedded-atom-method (EAM) many-body potential fitted for Au [15]. As discussed in detail elsewhere [16], the two types of potentials yield qualitatively the same behavior for most interfacially controlled materials properties, indicating that interfacial behavior is dominated by the (usually pair-wise) repulsive interactions in these potentials. Those properties that do seem to vary with the potential can usually be understood in terms of the rather different interfacial stresses that are generated with the two potentials [17]. The different magnitudes of the stresses associated with the two potentials can be seen by considering the variation of the cohesive energy of the two potentials with
the deviation of the lattice parameter from its equilibrium T = 0 value (see, e.g., Refs. [17] and [18]). The steeper slope of the EAM cohesive-energy curve shows a greater resistance to straining and is reflected in the larger interfacial stresses this potential generates [13, 14]. To avoid discontinuities in the energy and forces, both potentials were shifted smoothly to zero at their respective cutoff radii (Rc/a = 1.32 and 1.49 for the EAM and LJ potentials, respectively). The zero-temperature perfect-crystal lattice parameters, a, for these potentials were determined to be 4.0828 Å (EAM) and 3.6160 Å (LJ). The three elastic constants of the perfect fcc crystal at zero temperature in the principal cubic coordinate system (with the x, y, and z axes parallel to the principal <001> directions) are summarized in Table 1 for the two potentials. We note that, because of the Cauchy relation for an equilibrium pair potential, C12 = C44; i.e., the LJ potential has only two independent elastic constants.

Table 1. Zero-temperature elastic constants, selected moduli and Poisson ratios for perfect crystals of each potential in the principal cubic coordinate system. Units are 10^12 dyn/cm^2, except for the Poisson ratios, which are dimensionless.

    Elastic property           LJ potential    EAM potential
    C11 = C22 = C33            1.808           1.807
    C12 = C13 = C23            1.016           1.571
    C44 = C55 = C66            1.016           0.440
    Young's modulus, Yz        1.076           0.346
    Biaxial modulus, Yb        1.681           0.647
    Poisson ratio, ν           0.360           0.465
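To make this setup concrete, the following minimal Python/numpy sketch (our reconstruction, not the authors' code) builds the LJ(Cu) lattice sum and extracts the T = 0 elastic constants from finite-difference second derivatives of the energy density. The smooth cutoff shift used in the original work is not specified, so a plain energy shift at Rc is assumed here; the LJ column of Table 1 and the quoted lattice parameter are therefore reproduced only approximately, while the Cauchy relation C12 ≈ C44 should hold at the potential's own equilibrium.

    import numpy as np
    from itertools import product

    # Assumed LJ(Cu) parameters from the text; plain energy shift at the cutoff.
    EPS, SIG = 0.167, 2.315            # eV, Angstrom
    A0 = 3.6160                        # quoted T = 0 lattice parameter (Angstrom)
    RC = 1.49 * A0                     # cutoff radius, Rc/a = 1.49

    def lj(r):
        s6 = (SIG / r) ** 6
        return 4.0 * EPS * (s6 * s6 - s6)

    def e_atom(a, strain=None, nrep=3):
        """Energy per atom of a homogeneously strained fcc crystal (direct pair sum)."""
        base = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
        cells = np.array(list(product(range(-nrep, nrep + 1), repeat=3)))
        pos = (cells[:, None, :] + base[None, :, :]).reshape(-1, 3) * a
        if strain is not None:
            pos = pos @ (np.eye(3) + strain).T      # apply small strain tensor
        r = np.linalg.norm(pos, axis=1)
        r = r[(r > 1e-9) & (r < RC)]
        return 0.5 * np.sum(lj(r) - lj(RC))         # energy-shifted at the cutoff

    # Finite-difference second derivatives of the energy density give the T = 0
    # (Born) elastic constants; fcc is a Bravais lattice, so no inner relaxation.
    d, omega = 1e-3, A0 ** 3 / 4.0                  # strain step; atomic volume
    def curv(eps):
        return (e_atom(A0, d * eps) + e_atom(A0, -d * eps)
                - 2.0 * e_atom(A0)) / (d * d * omega)

    e11 = np.diag([1., 0., 0.])
    ebi = np.diag([1., 1., 0.])
    e12 = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 0.]])

    TO_CGS = 1.602                                  # 1 eV/A^3 = 1.602e12 dyn/cm^2
    c11 = curv(e11) * TO_CGS                        # density = (1/2) C11 eps^2
    c12 = curv(ebi) / 2.0 * TO_CGS - c11            # biaxial: (C11 + C12) eps^2
    c44 = curv(e12) / 4.0 * TO_CGS                  # shear eps12 = eps21: 2 C44 eps^2
    print(f"C11={c11:.3f}  C12={c12:.3f}  C44={c44:.3f}  (10^12 dyn/cm^2)")

A grid scan of e_atom over a (say 3.55–3.70 Å) locates the equilibrium lattice parameter of this particular shifted potential, which should fall close to the quoted 3.6160 Å; the small residual pressure at A0 is the main reason the computed C12 and C44 agree only approximately.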
2. Grain-boundary Superlattices (GBSLs)
As illustrated in Fig. 2, GBSLs represent idealized, somewhat hypothetical layered materials consisting of a periodic arrangement, . . . | A | B | A | B | . . . , of thin slabs A and B of equal thickness, Λ/2. By contrast with a composition-modulated superlattice, however, A and B consist of the same material, thus avoiding any effects that might arise from materials chemistry. The GBSLs described below contain symmetrical twist boundaries on the three principal cubic lattice planes; i.e., the thin slabs A and B in Fig. 2 are merely rotated with respect to each other about the interface normal by an angle θ (between A and B) and −θ (between B and A). Unencumbered by any effects due to interfacial chemistry, these idealized model systems were shown to capture the essential interfacial phenomena of inhomogeneous structural disorder coupled with anisotropic lattice-parameter changes [4, 19].
Figure 4. Energies (in mJ/m^2) of twist boundaries versus twist angle for the three densest planes in the fcc lattice using the LJ(Cu) potential; the EAM results are qualitatively identical (see also Section 6.9) [18]. The factor m is related to the rotation symmetries of the (111) (m = 3) and (001) (m = 2) planes.
For the (001) GBSLs considered below, the twist angle θ was chosen to be 36.87° (forming so-called Σ5 (001) twist GBs); for the (011) GBSLs, θ = 50.48° (Σ11 (011)); finally, for the (111) GBSLs, θ = 21.79° (Σ7 (111)). The choice of these twist angles was motivated by our desire to maximize the degree of structural disordering on each GB plane in order to achieve the largest possible elastic anomalies on that plane. As seen in Fig. 4, on its respective plane each of these particular twist angles (arrows) represents a high-angle twist GB (in which, by definition, the dislocation cores overlap and the energy is therefore independent of θ; see also Section 6.9) [18].
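The quoted angle–Σ pairs for the (001) plane follow from standard coincidence-site-lattice (CSL) geometry for twist rotations about <001> in a cubic crystal; the short enumeration below is our illustration of that construction, not a recipe from the text (the (011) and (111) planes obey analogous but different rules, which are not covered here).

    from math import atan, degrees, gcd

    def csl_001_twist(max_m=5):
        """Twist angle theta = 2*arctan(n/m) and Sigma = m^2 + n^2 (halved while
        even) for coprime m > n >= 1: the CSL misorientations of twist GBs on
        (001) in a cubic crystal.  theta and 90 - theta label the same boundary
        because of the fourfold symmetry of the (001) plane."""
        out = []
        for m in range(2, max_m + 1):
            for n in range(1, m):
                if gcd(m, n) == 1:
                    sigma = m * m + n * n
                    while sigma % 2 == 0:
                        sigma //= 2
                    out.append((sigma, degrees(2 * atan(n / m))))
        return sorted(out)

    for sigma, theta in csl_001_twist():
        print(f"Sigma {sigma:3d}   theta = {theta:6.2f} deg")
    # Sigma 5 at 36.87 deg (the angle used here) and at the equivalent 53.13 deg;
    # Sigma 29 at 43.60 deg is the boundary analyzed in Fig. 11 below.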
2.1. Simulation Results
As shown in Fig. 5(a), both the (001) and (111) GBSLs show monotonic, isotropic contractions in the x-y plane with decreasing Λ, while the (anisotropic) (011) plane shows an expansion in one direction and a contraction in the other; a similar behavior was observed for (011) free-standing thin films [20]. The accompanying Poisson expansions normal to the interfaces are shown in Fig. 5(b). Interestingly, analogous to the results obtained for free-standing thin films [20], the (001) GBSLs show the largest expansions normal and the greatest contractions parallel to the interfaces.
Figure 5. (a) Change in the average lattice parameters parallel to the interfaces for GBSLs on the three principal fcc planes, as a function of the modulation wavelength Λ, using the Au(EAM) potential. (b) Average lattice parameter perpendicular to the interfaces for the same three GBSLs [19].
In Ref. [20] it was shown that linear elasticity theory, using bulk elastic moduli and the bulk free-surface stress, could account very well for the lattice-parameter changes of the (001) and (111) free-standing thin films. Figure 6, showing Δaz/a from Fig. 5(b) plotted as a function of the planar contraction Δax/a of Fig. 5(a), illustrates that linear elasticity theory works equally well for the anisotropic lattice-parameter changes in the GBSLs on the (001) and (111) planes.
Figure 6. Poisson expansion, Δaz/a, in the direction of the surface normal resulting from the stress-induced in-plane lattice-parameter changes, Δax/a, for the GBSLs. The straight lines, with slopes −1.738 for (001) and −1.212 for (111), are predicted from linear-elasticity theory based on the surface stress [19, 20].
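The (001) slope in Fig. 6 can be checked directly against the perfect-crystal data of Table 1. The following is a reconstruction from standard linear elasticity, not a formula quoted from Refs. [19, 20]: for equal in-plane strains on (001) and vanishing stress normal to the interface,

\[
\sigma_{zz} = C_{11}\,\varepsilon_{zz} + 2C_{12}\,\varepsilon_{\parallel} = 0
\;\;\Longrightarrow\;\;
\frac{\Delta a_z}{a} = -\frac{2C_{12}}{C_{11}}\,\frac{\Delta a_x}{a}
= -\frac{2\nu}{1-\nu}\,\frac{\Delta a_x}{a},
\qquad \nu = \frac{C_{12}}{C_{11}+C_{12}}.
\]

With the EAM values of Table 1 (C11 = 1.807, C12 = 1.571, ν = 0.465) the slope is −1.738, exactly the value labeling the (001) line in Fig. 6; the (111) slope of −1.212 presumably follows in the same way from the elastic constants rotated into the (111) frame.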
The slopes of the solid lines in Fig. 6 are governed by a combination of Poisson ratios [19, 20]. As a consequence of these lattice-parameter contractions and expansions, the average GB area, A, decreases while the average atomic volume, Ω, increases with decreasing Λ [19].
Based on the behavior of homogeneous systems, one might expect some strengthening of the in-plane elastic behavior combined with a softening of the out-of-plane response. However, as was observed for the thin films [20], some moduli are strengthened (the "supermodulus effect") while others are softened, regardless of whether they are in-plane or out-of-plane (see Fig. 7). (Notice that the moduli in the figures have been normalized to the corresponding Λ → ∞ values [19], which are governed by the appropriate averages over two perfect fcc crystals rotated with respect to one another about the GB normals.)
According to Fig. 7(a), the in-plane Young's modulus, Yx, for the (111) GBSLs does, indeed, strengthen as one would expect from the related in-plane contraction; however, its out-of-plane Young's modulus, Yz (Fig. 7(b)), strengthens as well with decreasing Λ, despite the z expansion. On the other hand, in spite of their by far largest in-plane contraction, the (001) GBSLs exhibit only a very small increase in Yx for the larger values of Λ, followed by a decrease; moreover, in spite of their by far largest increase in Δaz and in the atomic volume Ω, Yz in the (001) GBSLs nevertheless strengthens. This comparison demonstrates that, even based on a complete knowledge of the
interface-stress-induced anisotropic lattice-parameter changes, it is impossible to predict the magnitude or even the sign of the observed elastic anomalies. It is striking that even when the x-y lattice-parameter changes are suppressed in the simulation, the Young's moduli still show anomalous behavior, albeit less pronounced, pointing again to the structural disorder due to the interfaces as the cause of the elastic anomalies [19].
Figure 7. (a) In-plane and (b) out-of-plane Young's moduli for fully relaxed Au(EAM) GBSLs, as a function of Λ, normalized to the related bulk moduli. For symmetry reasons, Yx = Yy for the (001) and (111) GBSLs [19].
An important aspect of the elastic behavior of the GBSLs is their resistance to shear strains parallel to the interfaces. According to Fig. 8, the shear moduli for the (001) and (111) GBSLs (for which Gxz = Gyz) exhibit a pronounced softening with decreasing Λ. However, by contrast with the above Young's moduli, the shear moduli are of the same magnitude for the different GB planes. Why, by contrast with Yx and Yz, the shear moduli are rather insensitive functions of the detailed atomic structure and energy of the GBs was discussed in detail in Ref. [21], where it was argued that all high-angle GBs should exhibit a greatly reduced shear resistance right at the GBs because in such boundaries virtually all correlation is lost between the atom positions on opposite sides of the interface. In fact, in a bicrystal study [22, 23] the degree of structural disorder was systematically varied by considering a range of (001) twist angles between 0° and 45°. In correspondence with the rapidly increasing GB energy (see Fig. 4), the related shear moduli were shown to decrease dramatically with increasing twist angle.
We finally mention that, because of the virtually complete loss of stacking order across the interfaces, the modulus for shear parallel to the GBs is a rather insensitive function of the x-y contractions (see Fig. 9(b)), by contrast with the related Young's moduli (see Fig. 9(a)). Figure 9(a) also emphasizes the point made earlier that even when the lattice-parameter changes are suppressed in the simulation, the Young's modulus exhibits anomalous behavior, albeit less pronounced.
Figure 8. Normalized moduli for shear parallel to the interface planes for the (001) and (111) GBSLs [21].
Figure 9. Normalized (a) Young's moduli, Yz, and (b) shear moduli, Gxz = Gyz, as obtained with the Au(EAM) potential, for (001) GBSLs with and without allowing for the stress-induced x-y lattice-parameter changes, as a function of Λ.
2.2. Role of the Atomic Structure of the Interfaces
We now investigate the relationship between the atomic structure of the interfaces and the elastic anomalies. The degree of structural disorder is best illustrated by radial distribution functions like the ones shown in Figs. 10(a)–(c), associated with high-angle GBs on the three principal fcc planes (see also Fig. 4 and Section 6.9).
Figure 10. Radial distribution functions, r^2 G(r), for twist boundaries on the three densest planes of the fcc lattice at zero temperature: (a) (111), θ = 17.90° (Σ31); (b) (100), θ = 43.60° (Σ29); (c) (110), θ = 50.48° (Σ11). For the two densest planes ((a) and (b)) [24], only atoms in the planes immediately at the GB were included in G(r); by contrast, because of the smaller spacing of the (110) planes and the larger degree of atomic-level "rumpling" in these planes, the two planes nearest to the GB were considered in the case of the (110) twist GB ((c)) [25]. The full arrows indicate the perfect-crystal δ-function peak positions at zero temperature.
The comparison of the substantially broadened peaks with the corresponding zero-temperature δ-function peaks of heights 12, 6, 24, etc., at the nearest-neighbor (nn), 2nd-nn, 3rd-nn, etc., distances of 0.707a, a, 1.225a, etc., in the fcc lattice demonstrates the strongly defected local environments of the atoms near the GBs. Also, consistent with the related GB energies (see Fig. 4), the (110) boundaries show by far the greatest broadening, followed by the (001) and (111) GBs, the latter with the lowest energy.
A detailed analysis, given in Figs. 11(a)–(c) for the three planes nearest to an (001) high-angle twist boundary (labeled Σ29, with θ = 43.60°), shows two effects [4]. First, as evidenced by the rapid recovery of sharp perfect-crystal peaks only three planes away from the GB (Fig. 11(c)), the structural disorder is highly localized at the GBs. Second, compared to the perfect crystal (solid arrows), the peak centers in the GBSLs (open arrows) are shifted slightly towards larger distances, by an amount approximately proportional to the corresponding volume expansion at the GBs [4].
The apparently paradoxical question is this: how can at least some elastic moduli of an interface material strengthen in spite of the decrease in its average density? Based on our usual intuition, gained from the study of homogeneous systems, one would expect all elastic moduli and constants to weaken upon expansion. As first pointed out in Ref. [4], although the overall volume of the system expands upon introduction of the interfaces (i.e., the average distance between the atoms increases), some atoms are in closer proximity to one another, by up to about 10%, than they are in the perfect crystal (see Figs. 10 and 11). These shorter distances are expected to strengthen the local elastic response, whereas longer distances give rise to a softening, with the net effect apparently being a strengthening of some moduli. However, as illustrated above, the net outcome of this complex averaging process seems to depend strongly on the detailed atomic structure of the interfaces and on the particular elastic modulus considered.
We finally consider the difference between the elastic constants and the related moduli, a rather fundamental distinction from both a conceptual and an experimental viewpoint. When determining a modulus, an external stress is applied to the system and the ensuing strains are monitored; i.e., the stress is fixed and the strains are variables. In an elastic-constant measurement, by contrast, a strain is imposed on the system and the ensuing stresses are monitored. Hence, while a modulus describes the physical response of the system while permitting all lattice-parameter changes of the system in response to the applied stress to take place, an elastic constant describes the system response while all strains are fixed. The moduli are consequently given by the elastic compliances, thus representing combinations of elastic constants. Consequently, while the anomalies in the elastic constants may be rather small (see Fig. 12 for the (001) GBSLs), the anomalies in the related moduli may be much larger by comparison (see Fig. 7).
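The perfect-crystal reference values just quoted (shell distances 0.707a, a, 1.225a with multiplicities 12, 6, 24) are easy to regenerate; the short numpy sketch below (ours, not code from Refs. [4, 24, 25]) lists the zero-temperature δ-peak positions and coordination numbers against which the broadened interfacial G(r) curves are compared.

    import numpy as np
    from itertools import product

    def fcc_shells(nshells=4, nrep=2):
        """First neighbor shells of a perfect fcc crystal (a = 1): positions and
        heights of the zero-temperature delta-function peaks of G(r) against
        which the broadened interfacial peaks of Figs. 10 and 11 are compared."""
        base = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
        cells = np.array(list(product(range(-nrep, nrep + 1), repeat=3)))
        pos = (cells[:, None, :] + base[None, :, :]).reshape(-1, 3)
        r = np.round(np.linalg.norm(pos, axis=1), 6)
        dist, count = np.unique(r[r > 0], return_counts=True)
        return list(zip(dist, count))[:nshells]

    for d, n in fcc_shells():
        print(f"r/a = {d:.3f}   coordination = {n}")
    # r/a = 0.707 -> 12,  1.000 -> 6,  1.225 -> 24,  1.414 -> 12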
Figure 11. Plane-by-plane zero-temperature radial distribution functions, G(r), for the three lattice planes closest to the (001) θ = 43.60° (Σ29) symmetrical twist GB surrounded by bulk perfect crystals [4]. The full arrows indicate the perfect-crystal δ-function peak positions. The open arrows mark the average neighbor distance in each shell. The widths of these shells are indicated by the dashed lines. (We note that Fig. 2 in Section 6.7 represents the average G(r) for a GBSL of (001) twist GBs with six (001) planes between the interfaces.)
Figure 12. Elastic constants (in 10^12 dyn/cm^2) for the (001) GBSLs whose Young's moduli are shown in Figs. 7 and 8, compared with those of a correspondingly strained perfect crystal [19, 26].
It therefore appears that the reduced elastic constants reported in numerous experiments may not contradict experiments in which enhanced moduli were observed. The supermodulus effect may therefore be very aptly named, since a "super elastic-constant" effect may not exist [26].
3. Dissimilar-material Superlattices
Based on the interpretation of the elastic behavior of thin films [20] and GBSLs [4, 9, 21–23, 26] as structural interface effects, several predictions can be made regarding the tailoring of elastic behavior by controlling interface structure. Most importantly, one would expect that coherent interfaces ("perfect epitaxy") should exhibit the smallest elastic anomalies. Moreover, similar to the increase in the elastic anomalies upon replacing (111) by (001) GBs, or free surfaces by GBs, the introduction of structural disorder into a relatively perfect interface, via misfit dislocations, should increase these anomalies significantly. In order to test these predictions, Jaszczak and Wolf [27, 28] investigated the role of coherency in the elastic behavior of composition-modulated superlattices on the (001) plane of fcc metals. Again, in order to eliminate materials and interfacial chemistry as contributing factors as much as possible, these simulations were performed using Lennard–Jones potentials with a 10% [27] and a 20% lattice-parameter mismatch [28] but with the same cohesive energies. We here focus on the (001) superlattices with a 20% lattice-parameter
mismatch because the results are more dramatic than those obtained for a 10% mismatch [28]. The fully relaxed structures of the three types of dissimilar-material superlattices studied here, containing varying degrees of interfacial structural disorder, are shown in Figs. 13(a)–(c). The coherent superlattices (COHSLs), while strained due to the lattice-parameter mismatch, are highly ordered at the interfaces (Fig. 13(a)). In the incoherent superlattices (INCSLs) shown in Fig. 13(b), the mismatch strains are relieved via the introduction of a square network of misfit dislocations, with a consequent increase in the degree of interfacial structural disorder. In order to introduce even more structural disorder, the (001) planes in (b) were twisted relative to each other, thus introducing screw dislocations in addition to the misfit dislocations already present. In the superlattices shown in Fig. 13(c), a twist angle of ±16.26° between alternating A and B layers (Fig. 2) was chosen; because the resulting planar unit cell contains five times as many atoms as that of the INCSLs, these superlattices are designated Σ5 INCSLs.
As expected, the incoherent superlattices exhibit greater elastic anomalies than the coherent superlattices for both the 10 and the 20% lattice-parameter mismatch [27, 28]. For example, Fig. 14(a) illustrates the systematic increase in the anomalous stiffening of Yz with decreasing Λ as the incoherency of the superlattices increases from COHSL, to INCSL, to Σ5 INCSL. Figure 14(b) shows a parallel decrease in the shear moduli, Gxz = Gyz, most notably the virtual absence of any shear resistance in the incoherent superlattices [27, 28].
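For orientation, the spacing S of such a square misfit-dislocation network follows from a standard geometric estimate (textbook interface geometry, not a result taken from Refs. [27, 28]):

\[
S \;\approx\; \frac{b}{\delta}, \qquad \delta = \frac{a_{B}-a_{A}}{\bar{a}},
\]

where b is the edge component of the misfit-dislocation Burgers vector and δ the lattice-parameter misfit. For the 20% mismatch considered here, S is only about five Burgers vectors, i.e., a few atomic spacings, so the dislocation cores in Fig. 13(b) necessarily overlap strongly, consistent with the large interfacial structural disorder and the nearly vanishing shear resistance of the INCSLs.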
Figure 13. Illustration of the various degrees of structural disorder in the three lattice-mismatched superlattice types considered by Jaszczak and Wolf [27, 28], with increasing degrees of interfacial structural disorder from (a) to (c). Shown are the fully relaxed atom positions in one plane of material A (solid circles) and one plane of material B (open circles) for superlattices containing eight planes of each material per modulation wavelength: (a) coherent superlattice (COHSL), (b) incoherent superlattice (INCSL), and (c) Σ5 incoherent superlattice (Σ5 INCSL) [27, 28].
Figure 14. (a) Young's modulus, Yz, and (b) shear modulus, Gxz, as a function of the modulation wavelength Λ for the three superlattice types of Fig. 13. The normalization constants, which are the Λ → ∞ values for the unstrained systems (i.e., the INCSLs), are Yz∞ = 0.793 × 10^12 dyn/cm^2 and Gxz∞ = 0.727 × 10^12 dyn/cm^2 [27, 28].
Similar to the thin films [20] and GBSLs, the Σ5 INCSLs show a pronounced contraction parallel to the interfaces with a consequent Poisson expansion perpendicular to the interfaces. Jaszczak and Wolf [27, 28] showed that the interface-stress-induced changes in the lattice parameters cannot account for the anomalous elastic behavior. However, in complete analogy to the observations for the thin films and GBSLs, they do enhance the anomalies caused by the structural disorder.
In order to differentiate between the effects of the interfaces and those caused by the changes in lattice parameters, we compare the elastic response
of the Σ5 INCSLs with that of corresponding interface-free perfect-crystal reference systems with the same unit-cell size and shape as the Σ5 INCSLs themselves. The latter were determined as follows [27, 28]: the elastic constants of ideal crystals of A and B, strained to the appropriate anisotropic lattice parameters of the Σ5 INCSLs, were individually determined and then averaged in the appropriate manner for superlattices [19, 29, 30] to give the "interface-free-superlattice" (IFSL) elastic constants and moduli. As shown in Figs. 15 and 16 for the Young's and shear moduli, respectively, the IFSLs show a behavior that is very close to that of a homogeneous system and that cannot even nearly account for either the anomalous strengthening of Yz or the extreme softening of Gxz.
Figure 15. Comparison of the Young's moduli, Yz, as a function of the modulation wavelength Λ, for the Σ5 INCSLs with those of the superlattices of (001) θ = 36.87° (Σ5) twist grain boundaries (Σ5 GBSLs; see Section 2.1), of thin slabs with and without a Σ5 grain boundary, and of interface-free composition-modulated superlattices (IFSLs) with the lattice-parameter changes of the Σ5 INCSLs. (We note that all results were obtained for the LJ potential.) The normalization factors are Yz∞ = 0.793 × 10^12 dyn/cm^2 and 1.076 × 10^12 dyn/cm^2, while ā = 3.9776 Å and 3.6160 Å, respectively, for the composite systems (Σ5 INCSL and IFSL) and for the monatomic systems (GBSLs and SLABs) [29, 30].
Figure 16. Shear modulus, Gxz, as a function of Λ for the same systems as in Fig. 15. The normalization factors are Gxz∞ = 0.727 × 10^12 dyn/cm^2 and 1.016 × 10^12 dyn/cm^2, respectively, for the composite and the monatomic systems [29, 30].
In particular, although Yz does show some stiffening in the IFSLs with decreasing Λ (Fig. 15), while Gxz softens slightly (Fig. 16), the effects are very small compared with those in the Σ5 INCSLs. For comparison, Figs. 15 and 16 also show the LJ results obtained for the (001) GBSLs [19, 21] and slabs [20], and for bicrystalline (001) slabs containing a single Σ5 (001) twist GB in their center [26]. It is interesting to observe that, except for the appearance of a maximum in Yz and a minimum in Gxz, the Σ5 GBSL results agree very well with the dissimilar-material Σ5 INCSL results, indicating that some saturation level is reached in the elastic behavior as the dislocation cores overlap completely, similar to the interface energies in Fig. 4. As the degree of interfacial structural disorder is systematically decreased – by going from the superlattice of Σ5 GBs, via a bicrystalline slab containing a single Σ5 GB, to the single-crystal slab – the interface-induced stiffening of Yz and softening of Gxz disappear gradually, with the slab even exhibiting a softening in Yz.
The main conclusion is therefore that increasing the degree of structural disorder in the superlattices, either by increasing the lattice-parameter mismatch or by introducing a relative rotation between the two materials (thus introducing screw dislocations), will dramatically enhance the small elastic anomalies present in the coherent system. That the transition from a coherent
to an incoherent interface structure is, indeed, associated with enhanced elastic anomalies has been verified experimentally [31, 32].
4. Effects of Temperature
So far we have shown that, by controlling the structural disorder due to the interfaces, one can "engineer" the elastic behavior of the material. While interfacial structural disorder is inhomogeneous, i.e., localized at the interfaces, one might ask whether homogeneous structural disorder has the same effect as inhomogeneous disorder. The best-known form of homogeneous disorder arises from the thermal movements of the atoms at finite temperature. Another way of formulating the same question is therefore to ask whether the temperature dependence of the elastic behavior can be understood in terms of the same underlying causes as the zero-temperature elastic behavior of interface materials, namely a competition between the strengthening due to the atomic-level disorder and the softening due to the consequent volume expansion.
In an attempt to elucidate whether such a temperature-induced competition indeed exists, Jaszczak and Wolf [33] performed extensive MD simulations of the thermo-elastic behavior both of perfect crystals and of the (001) GBSLs described above. By comparing the results of zero-stress simulations (in which volume expansion is permitted) with constant-volume simulations (in which thermal expansion is suppressed), they were able to deconvolute the homogeneous effects induced by thermal disorder from those of the consequent thermal expansion. Following the full equilibration of the system, the elastic constants and moduli were calculated using the stress-fluctuation formula [14, 34, 35] (for details, see Ref. [33]).
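For reference, the stress-fluctuation approach combines three contributions. The following is the standard canonical-ensemble (fixed-cell) form found in the literature cited here [14, 34, 35], reproduced for orientation (sign and ensemble conventions vary between papers):

\[
C_{ijkl} \;=\; \bigl\langle C^{\mathrm{B}}_{ijkl} \bigr\rangle
\;-\; \frac{V}{k_{\mathrm{B}}T}\Bigl( \langle \sigma_{ij}\sigma_{kl} \rangle - \langle \sigma_{ij} \rangle \langle \sigma_{kl} \rangle \Bigr)
\;+\; \frac{N k_{\mathrm{B}} T}{V}\bigl( \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk} \bigr),
\]

where C^B_ijkl is the instantaneous Born (second-derivative) term, σ_ij the instantaneous virial stress, V the volume, and N the number of atoms. The fluctuation (middle) term contains the internal-relaxation contribution mentioned in Section 1, which for inhomogeneous systems remains finite as T → 0.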
4.1. Thermo-elastic Behavior of a Perfect fcc Crystal
It is well known that most materials soften thermally, typically by a factor of two between absolute zero and melting, as a consequence of their thermal expansion. If the thermal and interface-induced types of disorder had the same physical effects, one would expect temperature to provide causes for both elastic softening and strengthening. More specifically, since the thermal fluctuations increase the degree of (homogeneous) structural disorder, as evidenced by the well-known broadening of the peaks in the radial distribution function (see Fig. 17), they should actually give rise to an elastic strengthening. However, as the temperature is increased, the peaks in Fig. 17 shift to larger separations due to the thermal expansion; thermal expansion, the "price" of thermal disorder, thus provides a mechanism for elastic softening.
Figure 17. Pair distribution function of a perfect fcc crystal at 100, 400, and 1000 K under zero stress. G(r) is normalized such that at zero temperature the nearest-neighbor peak of the perfect crystal is a δ-function of height 12 [33].
According to Fig. 18(a), the elastic constants of a perfect crystal indeed soften dramatically (and approximately linearly) with increasing temperature if the thermal expansion is permitted. As illustrated in Fig. 18(b), however, if the thermal expansion is suppressed, even the perfect crystal stiffens elastically with increasing temperature. Although this stiffening, by about 8% in C11 and C44 at 1000 K (i.e., at about 80% of the melting temperature, Tm ≈ 1200 K, for this potential), is much less pronounced than the net softening when the volume expansion is permitted, this comparison illustrates that homogeneous (i.e., thermal) disorder in a perfect crystal affects the elastic behavior in much the same way as inhomogeneous (i.e., interfacial) disorder in interface materials.
4.2. Thermo-elastic Behavior of (001) GBSLs
We now superimpose the homogeneous effects of thermal disorder on the already inhomogeneously disordered superlattices of grain boundaries investigated above. The peaks in G(r), already broadened even at zero temperature (see Figs. 10 and 11), can then be expected to broaden further with increasing temperature. By selectively fixing the temperature or the modulation wavelength, Jaszczak and Wolf [33] were able to deconvolute the homogeneous from the inhomogeneous effects: at a fixed temperature, i.e., for a fixed degree of thermal disorder, the amount of interfacial disorder in the GBSLs can be systematically varied by changing the modulation wavelength; conversely, by fixing Λ, the amount of interfacial disorder can be fixed while the thermal disorder is varied with the temperature.
Figure 18. Isothermal elastic constants as a function of temperature for a perfect fcc crystal (a) under zero external stress and (b) at constant (T = 0 K) volume (LJ potential). The elastic constants in (a) are in units of 10^12 dyn/cm^2; those in (b) are normalized to the zero-temperature, zero-stress perfect-crystal values given in Table 1. The solid lines represent least-squares fits to the data [33].
Figure 19 shows G(r) for a GBSL composed of four (001) planes between the GBs at T = 100 K and 400 K under zero external stress. (At higher temperatures the GBSLs tend to crystallize by grain-boundary migration to form a perfect crystal [33].) As is evident from Fig. 19, the peaks in G(r) for the GBSLs both broaden and shift to larger separations with increasing temperature, in a fashion completely analogous to the perfect-crystal peaks in Fig. 17.
Figure 19. Pair distribution function of a GBSL composed of four (001) planes between GBs at two temperatures (100 K and 400 K) under zero stress [33].
Figure 20. Selected isothermal elastic constants (C11 and C33) as a function of temperature for a GBSL composed of four (001) planes between GBs under zero stress. The moduli of the GBSLs soften similarly. The GBSL values are normalized to the zero-temperature elastic constants for this particular GBSL: C11(0) = 1.788 × 10^12 dyn/cm^2, C33(0) = 1.859 × 10^12 dyn/cm^2 [33].
Moreover, due to the interfaces the peaks are broader in the GBSLs than in the related perfect crystal [33]. Figure 20 shows the combined effects of the thermal disordering and consequent thermal expansion on the elastic properties of the highly inhomogeneous
GBSLs with four (001) planes between the GBs. Interestingly, in complete analogy to the perfect crystal (see Fig. 18(a)), the net effect is a nearly linear softening of the elastic constants and moduli with increasing temperature. Hence, despite their inherently inhomogeneous structure, the GBSLs behave homogeneously (i.e., perfect-crystal like) in response to homogeneous (thermal) disorder when the thermal expansion is allowed to take place; i.e., they soften elastically as their G(r) peaks broaden and shift to larger separations with increasing temperature. Also, the centers of the G(r) peaks for the GBSLs are at nearly the same positions as the peaks for a perfect crystal at the same temperature [33].
To demonstrate that this softening is mostly due to the thermal expansion of these highly inhomogeneous systems, in Fig. 21 the values of C33 obtained under conditions of constant volume and constant (zero) pressure, respectively, are compared for the same GBSLs. According to the figure, the role of thermal disorder alone (i.e., as the temperature is increased) is, indeed, a slight strengthening of C33; the Young's modulus, Yz, strengthens similarly [33]. Moreover, similar to the behavior of the perfect crystal (Fig. 18), the enhancement of C33 and Yz over the zero-stress values at the same temperature is quite significant (over 40% at 400 K). The thermal softening under zero stress is therefore predominantly due to the volume expansion.
Finally, by changing the modulation wavelength, Λ, the fraction of atoms "seeing" the interfaces can be systematically varied. The corresponding variation of the Young's moduli with Λ is shown in Fig. 22 for several temperatures.
Figure 21. Comparison of the isothermal elastic constant, C33, as a function of temperature for a GBSL with four (001) planes between GBs at fixed zero-temperature volume and at fixed zero external stress [33].
Figure 22. Isothermal Young's moduli (a) Yz and (b) Yx as a function of Λ at T = 0, 100, and 400 K. The moduli are normalized to their T = 0 values in the Λ → ∞ limit: 1.076 and 1.429 × 10^12 dyn/cm^2, respectively [33].
Even at non-zero temperatures, the GBSLs show the same generic behavior as a function of Λ as they do at zero temperature. In particular, despite the large, anisotropic thermal expansion, there remains at small Λ an anomalous enhancement of the Young's modulus, Yz (Fig. 22(a)). By contrast, the in-plane Young's moduli, Yx = Yy, in Fig. 22(b) show only a softening with decreasing Λ at both zero and non-zero temperatures. On the other hand, for a fixed value of Λ, and therefore a fixed degree of interfacial structural disorder, the effect of increasing the temperature under zero stress is to soften all the moduli.
We therefore conclude that the effect of thermal disorder and the consequent volume expansion is largely to soften the elastic moduli by the same degree, independent of the amount of inhomogeneous structural disorder present (i.e., independent of Λ).
The above investigation demonstrates that atomic-level structural disorder, be it homogeneous or inhomogeneous, can lead to elastic stiffening, provided that the related volume expansions do not dominate the elastic behavior and result in a softening. Whether the disorder or the consequent volume expansion dominates the elastic response depends on the detailed nature of the disorder, on the anisotropy of the volume expansion, and on the nature of the interatomic interactions. However, even using the simple Lennard–Jones potential, Jaszczak and Wolf [33] were able to demonstrate a full spectrum of behaviors, from stiffening to softening, by varying the degrees of homogeneous and inhomogeneous disorder and by controlling the resulting volume expansions.
5. Conclusions
The above simulations illustrate the unique capabilities of atomic-level computer simulations for exploring the physical origin of the anomalous elastic behavior of composition-modulated superlattice materials. These capabilities enable the investigation of simple, but well characterized, model systems which exhibit the same generic behavior as that observed experimentally in "real" composition-modulated alloys [6–11, 31, 32]. In particular, this behavior usually includes (i) a pronounced change in the lattice parameter in the plane of the interfaces accompanied by a related change in the normal direction (the Poisson effect) and (ii) a strengthening of some elastic moduli and constants accompanied by a softening of others.
The most pronounced feature observed both experimentally and in the simulations is a strong variation of both the structure and the elastic behavior as a function of the composition modulation wavelength, Λ. This dependence expresses the fact that the average physical response of the material consists of a tunable mixture of homogeneous and inhomogeneous effects: by decreasing Λ gradually, more and more atoms in the system experience the presence of the interfaces, and their behavior resembles less and less that of a homogeneous system, thus gradually exposing the behavior characteristic of the inhomogeneous parts of the system.
By contrast with the typical experimental situation, in the simple model systems investigated above the atomic structure and any effects associated with materials and interfacial chemistry can be carefully controlled and systematically varied. Combined with the ability to apply external stresses, these simulations were able to
(a) deconvolute the distinct effects due to the inhomogeneous atomic disorder localized at the interfaces from the consequent interface-stress-induced anisotropic lattice-parameter changes, and (b) separate the homogeneous effects of thermal disordering from the inhomogeneous effects due to the interfaces.
These simulations thus provide insight into the atomic-level phenomena and processes governing interfacial elasticity and expose the physical causes of the elastic anomalies of interface materials. Such atomic-level insights are difficult to obtain by experimental means alone or from theoretical methods based on continuum mechanics. They can provide a fundamental basis for the design of composite materials with desired elastic properties.
The three major conclusions of this work may be summarized as follows. First, while the average interface-stress-induced structure of thin films and multilayers can be predicted from linear elasticity theory, their elastic behavior, showing some moduli to be hardened while others are softened, is considerably more complex. The latter is the result of a highly complex interplay between two competing causes, namely (i) the built-in structural disorder at the interfaces, as evidenced by a broadening of the radial distribution function, G(r), even at zero temperature, and (ii) the consequent anisotropic lattice-parameter changes, giving rise to a shift in the G(r) peaks. The broadening results in some atoms near the interfaces being pushed more closely together than in the perfect crystal, providing the ingredient necessary for elastic strengthening; the peak shifts, by contrast, usually soften the material. The key to elastic strengthening therefore lies in minimizing the volume expansion per degree of structural disordering.
Second, based on the interpretation that the elastic anomalies arise from the interfacial structural disorder, several predictions have been made and verified by simulations. Most importantly, coherent (i.e., epitaxial) interfaces were shown to exhibit the smallest elastic anomalies; introduction of structural disorder via misfit dislocations increases these anomalies significantly, a prediction that was verified experimentally [31, 32].
Third, the effect of temperature on the elastic anomalies has been investigated. It was found that the elastic moduli of grain-boundary superlattices soften with increasing temperature, as one would expect for a homogeneous system. Considering that the elastic anomalies arise from the inhomogeneous structural disorder localized at the interfaces, this result is somewhat of a surprise. Ultimately the elastic anomalies of interfacial materials arise from a competition between structural disorder and the consequent (usually anisotropic) lattice-parameter and net volume changes. This competition can be seen even in a perfect crystal at finite temperature: increasing the temperature (i.e., broadening the radial distribution function) without permitting the crystal to expand actually strengthens the elastic constants and moduli. In superlattices, by
contrast, such a broadening of the radial distribution function is present even at zero temperature. This broadening may lead to an elastic strengthening perpendicular to the interfaces provided the related volume expansion, and the associated elastic softening, is not too large.
Throughout these simulations the interfaces were assumed to be atomically flat and chemically sharp. By contrast with the typical experimental situation, any effects that might arise from chemical disordering at the interfaces have thus been avoided. Based on the insights gained from these simple model systems, one would expect that chemical disorder, as evidenced by a broadening of the partial radial distribution functions of the material, should play the same role in the anomalous elastic behavior of multilayers as does structural disorder.
Acknowledgment

This work was supported by the U.S. Department of Energy, BES Materials Sciences, under Contract W-31-109-Eng-38.
References

[1] R. Hull and J.C. Bean, Crit. Rev. Solid State and Mater. Sci., 17, 507, 1992.
[2] D. Wolf and J.A. Jaszczak, Chapter 14 in Materials Interfaces: Atomic-Level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman and Hall, London, pp. 364 ff, 1992.
[3] D. Wolf and J.A. Jaszczak, J. Comput. Aided Mats. Design, 1, 111, 1993.
[4] D. Wolf and J.F. Lutsko, Phys. Rev. Lett., 60, 1170, 1988.
[5] See, for example, Chapter 1 in Materials Interfaces: Atomic-Level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman and Hall, London, p. 1 ff, 1992.
[6] For recent reviews, see R.G. Brandt, Mater. Sci. Eng. B, 6, 95, 1990; M. Grimsditch and I.K. Schuller, Chapter 13 in Materials Interfaces: Atomic-Level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman and Hall, London, pp. 354 ff, 1992; M. Grimsditch in Topics in Applied Physics: Light Scattering in Solids V, M. Cardona and G. Güntherodt (eds.), Springer, Berlin, p. 285, 1989; B.Y. Yin and J.B. Ketterson, Adv. Phys., 38, 189, 1989.
[7] W.M.C. Yang, T. Tsakalakos, and J.E. Hilliard, J. Appl. Phys., 48, 876, 1977.
[8] A. Kueny, M. Grimsditch, K. Miyano et al., Phys. Rev. Lett., 48, 166, 1982.
[9] U. Helmersson, S. Todorova, S.A. Barnett et al., J. Appl. Phys., 62, 491, 1987.
[10] B.M. Davis, D.N. Seidman, A. Moreau et al., Phys. Rev. B, 43, 9304, 1991.
[11] A. Fartash, E.E. Fullerton, I.K. Schuller et al., Phys. Rev. B, 44, 13760, 1991.
[12] J.F. Lutsko, J. Appl. Phys., 65, 2991, 1989.
[13] M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
[14] J. Ray and A. Rahman, J. Chem. Phys., 80, 4423, 1984; Phys. Rev. B, 32, 733, 1985.
[15] S.M. Foiles, M.I. Baskes, and M.S. Daw, Phys. Rev. B, 33, 7983, 1986.
[16] D. Wolf, J.F. Lutsko, and M. Kluge, in Atomistic Simulation of Materials – Beyond Pair Potentials, V. Vitek and D. Srolovitz (eds.), Plenum Press, New York, p. 245, 1989.
[17] D. Wolf, Surf. Sci., 226, 389, 1989; Phil. Mag. A, 63, 337, 1991.
[18] D. Wolf and K.L. Merkle, Chapter 3 in Materials Interfaces: Atomic-Level Structure and Properties, D. Wolf and S. Yip (eds.), Chapman and Hall, London, pp. 87 ff, 1992.
[19] D. Wolf and J.F. Lutsko, J. Mater. Res., 4, 1427, 1989.
[20] D. Wolf, Appl. Phys. Lett., 58, 2081, 1991.
[21] D. Wolf, Mater. Sci. Eng. A, 126, 1, 1990.
[22] D. Wolf and M.D. Kluge, Scripta Metall. Mater., 24, 907, 1990.
[23] M.D. Kluge, D. Wolf, J.F. Lutsko et al., J. Appl. Phys., 67, 2370, 1990.
[24] D. Wolf, Acta Metall., 37, 1983, 1989.
[25] D. Wolf, Acta Metall., 37, 2823, 1989.
[26] D. Wolf and J.F. Lutsko, J. Appl. Phys., 66, 1961, 1989.
[27] J.A. Jaszczak, S.R. Phillpot, and D. Wolf, J. Appl. Phys., 68, 4573, 1990.
[28] J.A. Jaszczak and D. Wolf, J. Mater. Res., 6, 1207, 1991.
[29] M.H. Grimsditch, Phys. Rev. B, 31, 6818, 1985.
[30] M.H. Grimsditch and F. Nizzoli, Phys. Rev. B, 33, 5891, 1986.
[31] G. Carlotti, D. Fioretto, G. Socino et al., J. Appl. Phys., 71, 4897, 1992.
[32] E.E. Fullerton, I.K. Schuller, F.T. Parker et al., J. Appl. Phys., 73, 7370, 1993.
[33] J.A. Jaszczak and D. Wolf, Phys. Rev. B, 46, 2473, 1992.
[34] J. Ray, Comp. Phys. Reports, 8, 111, 1988.
[35] D.R. Squire, A.C. Holt, and W.G. Hoover, Physica, 42, 388, 1969.
6.13 GRAIN BOUNDARIES IN NANOCRYSTALLINE MATERIALS

Dieter Wolf
Materials Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
The rapidly developing ability to manipulate matter at the atomic level is making considerable impact in materials-science related technologies because it permits novel microstructures and metastable phases to be atomically designed, often resulting in unusual materials properties. The importance of metastable phases with novel microstructures, such as nanocrystalline materials, metallic glasses, multilayers and epitaxially-stabilized thin-film phases, is based primarily on their intrinsic inhomogeneity. This inhomogeneity originates from the atomic-level disorder generated by manipulating the microstructure, for example, via introduction of grain or phase boundaries or, in the case of amorphous materials, by creation of a frozen-liquid-like structure. By design, these materials systematically exploit the altered, yet to-date usually unknown local physical behavior in the heavily defected, inhomogeneous structural environments experienced by a large fraction of the atoms. Atomic-level computer simulations are uniquely suited to provide insight into what constitutes "typically inhomogeneous" behavior in heavily disordered materials because they provide information on local structures and properties.

The nature of the grain boundaries (GBs) in nanocrystalline materials (NCMs) has been the subject of intense debate ever since the first ultrafine-grained polycrystals, with a typical grain size of 5–50 nm, were synthesized over a decade ago by consolidation of small clusters formed via gas condensation [1, 2]. For some time it had been thought that the atomic structures of the severely constrained GBs in NCMs differ fundamentally from those observed in coarse-grained or bicrystalline materials because, for a grain diameter of nanometer dimensions, a significant fraction of the atoms is situated in or near GBs and grain junctions [3, 4]. The suggestion of a "frozen-gas" like structure of the GBs [3, 4] was aimed at explaining the rather unusual diffraction results; such a model might also explain some of the unusual thermal properties reported for NCMs, such as their enhanced specific heat [5, 6] and lower Debye
temperature [4, 7]. However, more recent experimental observations involving Raman spectroscopy [8], atomic-resolution TEM [9] and X-ray diffraction [10] indicate that the atomic structures of GBs in NCMs are, in fact, rather similar to those in coarse-grained polycrystalline or bicrystalline materials.

Any attempt to elucidate the properties of NCMs requires, at the outset, a GB structural model that incorporates the severe microstructural constraints in NCMs and yet connects with the large body of knowledge on extended GBs in bicrystalline materials. Two key questions that have evolved from the experimental studies are: (1) What is the structural and thermodynamic relationship between nanocrystalline microstructures and amorphous solids? (2) To what extent can the atomic structure of the GBs in nanocrystalline materials be extrapolated from those of coarse-grained polycrystalline materials and bicrystals?

Here we review the insights gained from atomic-level computer simulations on these two important questions. In Section 1, we discuss a simple model that is designed to capture both the severe microstructural constraints associated with the small grain size and the physical inhomogeneity associated with the GBs. Simulations performed for this simple model system expose important parallels that exist in the dynamical properties of nanocrystalline materials and glasses. In Section 2, we review results obtained by a method for the computational synthesis of less idealized nanocrystalline microstructures, by growth from the melt into which randomly oriented crystal nuclei were inserted. This method enables a fuller comparison of the structures of the highly constrained GBs in nanocrystalline materials with those of entirely unconstrained, bicrystalline GBs. The simulations of GB diffusion creep reviewed in Section 3 represent an important test for the GB structural model suggested on the basis of the low-temperature observations discussed in Section 2. Finally, our most important conclusions are summarized in Section 4.
1. A Simple Model
The problem of relating the structure and properties of the GBs in a polycrystal to those of corresponding bicrystalline GBs is extremely difficult as it requires three types of microstructural averages to be performed. These arise from the microstructural constraints present in a polycrystal and involve averages over (a) the five macroscopic degrees of freedom that each individual GB contributes to the total number of degrees of freedom of the polycrystal (three degrees associated with the misorientation between the grains and two characterizing the GB-plane normal [12]), (b) the various grain shapes and (c) the distribution of grain sizes invariably present in polycrystals.
In an early attempt to develop a structural model for nanocrystalline materials, Wolf et al. [12–14] presented a simple model that was tailored to capture the two essential structural features of NCMs, namely (i) the microstructural constraints associated with the finite grain size and (ii) the structural inhomogeneity due to the GBs and grain junctions. The question Wolf et al. [12] asked at the outset is this: As far as the total number of GB degrees of freedom, grain shapes and grain sizes are concerned, what is the conceptually simplest polycrystal that one can, at least in principle, construct? In other words: What is the smallest number of geometrically distinct types of GBs, grain shapes and grain sizes that a polycrystal has to contain and still be space filling? Figure 1 shows that it is geometrically possible to construct a space-filling, three-dimensional polycrystal with a uniform (and unique) rhombohedral grain shape in which all GBs are crystallographically equivalent; i.e., a monodisperse polycrystal with exactly the same number of macroscopic degrees of freedom (at most 5) as the corresponding bicrystal, the only difference being the finite, variable grain size. Having thus eliminated the distributions in the types of GBs and grain shapes, simulations of this model focus entirely on the effect of the grain size, i.e., on the role of the microstructural constraints, on the atomic structure and physical properties of a well-defined GB.
Figure 1. Idealized space-filling polycrystal model. The three-dimensionally periodic simulation cell shown here contains eight identically shaped rhombohedral grains delimited by two sets of crystallographically distinct surfaces (indicated by 1 or 2), forming a total of 24 crystallographically equivalent asymmetric tilt boundaries [12–14].
Naturally, in the limit of infinite grain size, the model reproduces the corresponding bicrystal structure [12–14]. The simulation cell in Fig. 1 contains two sets of distinct rhombohedral grains, each set delimited by six crystallographically equivalent surfaces, {hkl}1 or {hkl}2, respectively, and therefore forming 24 crystallographically equivalent asymmetric tilt boundaries (for details see [12]). A series of four such model NCMs of increasing size (d = 8.2, 14.4, 20.6 and 26.9 Å, corresponding to 416, 2408, 7280 and 16 328 atoms in the simulation cell) was investigated. In these systems one rhombohedron was chosen to be bounded by {111} and the other by {115} planes; all the GBs are therefore asymmetric {115}{111} tilt boundaries. This particular GB, with an energy of 674 erg/cm² in the bicrystal, is reasonably representative of all types of asymmetric tilt boundaries [11].
1.1. Structure
Lattice-statics simulations, in which the force on each atom and the external stresses on the simulation cell are iteratively reduced to zero by energy minimization, were used to determine the zero-temperature equilibrium structure of the NCMs. A Lennard–Jones potential was used to describe the interatomic interactions. The energy and length scales in this potential (ε = 0.167 eV, σ = 2.315 Å) were fitted to the melting point, Tm = 1356 K, and zero-temperature lattice parameter, a0 = 3.616 Å, of Cu, with a cohesive energy of E_id = −1.0378 eV/atom. Extensive comparisons with physically better justified many-body potentials have demonstrated that this simple potential represents face-centered cubic (fcc) metals remarkably well [15]. This similarity is due to the fact that most interfacial phenomena are dominated by the short-range repulsions between the atoms (which are of a central-force type in both types of potentials).

The energy distribution function in Fig. 2(a) for the largest of the three fully relaxed {115}{111} model NCMs shows three peaks, indicating three distinct types of crystal environments experienced by the atoms. (For comparison, Fig. 2(b) shows the distribution function for a Lennard–Jones glass produced by molecular-dynamics simulation of a 500-atom quenched liquid; see Section 1.2 below.) The lowest-energy peak in Fig. 2(a), centered at E_id, corresponds to the grain interiors that, although elastically distorted, are essentially single crystalline. Detailed structural analysis showed the second peak to be due to the GBs, while the third peak arises from the grain junctions (i.e., the lines where four grain edges and the points where eight grain corners meet). As the grain size is reduced, the area under the perfect-crystal peak decreases until it disappears completely for the smallest grain size, indicating overlapping GBs; simultaneously the areas under the two remaining peaks increase. Interestingly, the NCMs with the three largest grain diameters exhibit well-defined GBs with an atomic structure, energy, volume expansion and
width (of about 1.5a0) which differ remarkably little from those of the corresponding bicrystalline asymmetric {115}{111} tilt boundary [14]. The radial distribution function for the third-largest grain size is shown in Fig. 3, indicating excellent crystallinity of the material; similar plots were obtained even for the two smallest grain sizes. Corresponding to the reduced density (of 94.2% of that of the perfect crystal), the mean peak positions are shifted slightly to larger values with respect to the δ-function peaks in the perfect fcc crystal at T = 0 K (situated at 0.707, 1.0, 1.225a0, etc.). The radial distribution function for the NCM shown in Fig. 3 differs remarkably little from that obtained for the bicrystal (for the atoms within a distance of ±a0 from the GB; see also [14]). More importantly, the broadened peaks and the shift of the peak centers towards larger distances are generic features observed for virtually all GBs [15–17], suggesting that, as far as structural disorder is concerned, the particular GB considered in the model system in Fig. 1 is indeed reasonably representative.

Figure 2. Energy distribution functions, g(E), for (a) the fully relaxed {115}{111} model nanocrystalline material (see Fig. 1) with a grain size of 20.6 Å and (b) a Lennard–Jones glass produced by molecular-dynamics simulation of a 500-atom quenched liquid. The corresponding radial distribution functions are shown in Fig. 3 below. The arrows indicate the ideal-crystal cohesive energy, E_id = −1.0378 eV/atom, for the Lennard–Jones potential used here [13].
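Distributions such as those in Fig. 2 are straightforward to generate once per-atom energies are available. The following Python sketch illustrates the bookkeeping for the Lennard–Jones parameters quoted above; it is a minimal illustration rather than the authors' analysis code: the O(N²) loop ignores periodic boundary conditions, and the 2.5σ cutoff is an assumed, conventional choice.

    import numpy as np

    EPS = 0.167     # eV, Lennard-Jones energy scale fitted to Cu
    SIGMA = 2.315   # Angstrom, Lennard-Jones length scale fitted to Cu

    def lj_pair_energy(r):
        """12-6 Lennard-Jones pair energy (eV) at separation r (Angstrom)."""
        sr6 = (SIGMA / r) ** 6
        return 4.0 * EPS * (sr6 ** 2 - sr6)

    def per_atom_energies(positions, cutoff=2.5 * SIGMA):
        """Assign half of each pair energy to each partner (no periodic images)."""
        n = len(positions)
        energies = np.zeros(n)
        for i in range(n):
            for j in range(i + 1, n):
                r = np.linalg.norm(positions[i] - positions[j])
                if r < cutoff:
                    e = 0.5 * lj_pair_energy(r)
                    energies[i] += e
                    energies[j] += e
        return energies

    def energy_distribution(energies, bins=100):
        """Normalized histogram g(E), the quantity plotted in Fig. 2."""
        hist, edges = np.histogram(energies, bins=bins, density=True)
        return 0.5 * (edges[:-1] + edges[1:]), hist

Applied to the relaxed polycrystal coordinates, the three populations of Fig. 2(a) – grain interiors, GBs and grain junctions – would be expected to emerge as the three peaks of such a histogram.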
Figure 3. Radial distribution functions for the fully relaxed {115}{111} model nanocrystalline material (see Fig. 1) with a grain size of 20.6 Å and for a Lennard–Jones glass produced by molecular-dynamics simulation of a 500-atom quenched liquid [13].
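The radial distribution functions of Fig. 3 follow the standard pair-counting recipe: count pairs in spherical shells and normalize by the ideal-gas expectation. A rough Python sketch, assuming a cubic periodic cell of side box_length (schematic, not the production analysis):

    import numpy as np

    def radial_distribution(positions, box_length, r_max, n_bins=200):
        """g(r) for a cubic periodic cell, normalized by the ideal-gas count."""
        n = len(positions)
        rho = n / box_length ** 3
        dr = r_max / n_bins
        counts = np.zeros(n_bins)
        for i in range(n):
            # minimum-image displacements to all later atoms
            d = positions[i + 1:] - positions[i]
            d -= box_length * np.round(d / box_length)
            r = np.linalg.norm(d, axis=1)
            counts += np.histogram(r[r < r_max], bins=n_bins, range=(0.0, r_max))[0]
        r_mid = (np.arange(n_bins) + 0.5) * dr
        shell_volumes = 4.0 * np.pi * r_mid ** 2 * dr
        # each pair was counted once; the factor 2 restores the per-atom average
        return r_mid, 2.0 * counts / (n * rho * shell_volumes)

The broadening and outward shift of the peaks described above then appear directly in the returned g(r) when the NCM is compared with the perfect crystal.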
1.2. Vibrational Behavior and Relationship to the Glass
Three types of experimental evidence suggest an intricate, possibly thermodynamic connection between the nanocrystalline and amorphous microstructures. However, partly due to the different kinetic processes leading to one or the other microstructure and partly because the experiments were performed on different types of materials (such as pure metals, alloys and covalent semiconductors), a common thermodynamic framework connecting these observations has not been developed.

(a) It is well established that mechanical alloying of two elements by ball milling ("mechanical attrition") may lead to an amorphous phase [18], usually via a nanocrystalline precursor state with a lower limit in the grain size that depends on the concentration of the solid solution [19]. Also, the grain size in a series of nanocrystalline fcc metals produced by ball milling saturates to a minimum value which scales with the melting point of the elements [19]. Moreover, the excess energy due to the large number of GBs in the NCM has been shown to be of similar magnitude to the energy increase associated with solid-state amorphization [20].

(b) The crystallization of amorphous alloys provides a well-known method for the synthesis of fully dense, pore-free NCMs [21]. The NCMs thus obtained exhibit a smallest grain size below which crystallization does not occur [21, 22]. Conversely, the coexistence of nanocrystalline and amorphous phases observed, for example, in ball-milled silicon has been attributed to the existence of a "critical" grain size above which solid-state amorphization cannot take place [23].

(c) The synthesis of pure, submicron-grained polycrystalline metals by severe plastic deformation [24], although conceptually similar to ball milling, does not result in solid-state amorphization. While it is well known that pure metals cannot, in practice, be amorphized [25], the highly disordered non-equilibrium GBs thus obtained [24] – and their recovery to attain equilibrium structures upon annealing – suggest the existence of kinetically induced, locally disordered (and possibly amorphous) GB phases that disappear upon annealing.

By contrast with the experiments, in computer simulations even single-component materials can be amorphized, via very fast quenching of a liquid, i.e., orders of magnitude faster than in experiments. By capturing only the microstructural differences between the nanocrystalline and amorphous states, simulations of such simple systems provide an opportunity to compare these phases directly, unencumbered by the mostly kinetic effects of materials chemistry. The model of a fully dense, three-dimensional nanocrystalline material discussed above (see Fig. 1) is well suited to elucidate how the low-temperature vibrational modes and the related thermal properties of this model NCM
deviate from those of the perfect crystal and the glass. As seen in Fig. 3, the structures of the glass and the NCM obviously differ fundamentally as far as long-range order is concerned. However, it is well established that the broadening of the nearest-neighbor peak in the related radial distribution function represents a generic feature of all heavily disordered systems. Because many of their physical properties are governed by the short-range repulsions between the atoms, i.e., ion size, the presence or absence of long-range structural periodicity plays only a minor role in most of their properties. The similar nearest-neighbor peaks of the NCM and the glass (see Fig. 3) can therefore be expected to give rise to similarities, for example, in their phonon properties. Using the fully relaxed zero-temperature structures as starting points, Wolf et al. [13, 14] performed lattice-dynamics simulations to determine the phonon spectrum, from which the low-temperature thermodynamic and elastic properties of the material can be determined. Figure 4 compares the phonon density of states for the smallest {115}{111} model NCM with the phonon spectrum of a perfect fcc crystal and with that of the glass. Notice that, in agreement with
Raman [26] and neutron-scattering [27] experiments, the densities of states of both the NCM and the glass exhibit low- and high-frequency tails which are not present in the perfect crystal. These were shown [16] to originate from the broadening and shift of the peaks in the radial distribution function: distances shorter than in the perfect crystal give rise to local elastic stiffening – or higher phonon frequencies – while longer distances and the overall volume expansion cause elastic – or vibrational – softening. The net elastic and phonon responses of the system are therefore the result of a highly non-linear averaging process over these competing contributions [16, 17], seen here explicitly as tails in the phonon spectrum.

Figure 4. Comparison of the vibrational density of states, g(ν), of the {115}{111} model nanocrystalline material with a grain size of 8.2 Å (solid line), the 500-atom glass (dash-dotted line) and a 500-atom perfect fcc crystal (dashed line); ν is the phonon frequency. The degree of broadening of g(ν) relative to the perfect crystal observed for the NCM decreases with increasing grain size, as a more and more perfect-crystal-like microstructure is obtained [13].

Figure 5 shows the excess specific heat over that of the perfect crystal for the model NCMs and for the glass, determined from the related frequency spectra. In the context of lattice dynamics, it is well known [28] that only the lower-frequency modes in Fig. 4 should contribute significantly to the thermodynamic properties. This is confirmed by the complete disappearance of the specific-heat anomaly for the NCMs and the glass when the lower-frequency modes were excluded; by contrast, omission of the high-frequency modes had no effect [14].

To test whether the {115}{111} asymmetric tilt boundaries in our model NCMs are, indeed, reasonably representative of high-angle GBs in general, we have used molecular-dynamics (MD) simulations to crystallize from the melt a fully dense, three-dimensional NCM with random grain orientations and an average grain size of 43 Å [15]. Detailed structural characterization of the 55 296-atom system at zero temperature and stress gives a system-averaged theoretical density of 97.5% and a radial distribution function very similar to that of the model NCM in Fig. 3; most of the GBs were identified as "general" GBs, i.e., with random misorientations between the grains [15]. Subsequent lattice-dynamics simulations give the specific-heat anomaly shown also in Fig. 5 (solid line), with a maximum at virtually the same temperature as in the model NCMs, however with a diminished peak height, in accordance with the larger grain size [15].

Figure 5. Comparison of the temperature and grain-size dependence of the excess specific heat (in units of Boltzmann's constant, kB) of the idealized {115}{111} model nanocrystalline material (see Fig. 1) with a grain size of 8.2 Å (open symbols) with that of the 500-atom glass (dash-dotted line) and of a 55 296-atom nanocrystalline material with randomly oriented grain boundaries and an average grain size of 43 Å (solid line) [13].

These similarities in the specific-heat anomalies of NCMs and the glass, and their common origin in low-frequency phonon modes associated with the structural disorder in the material, strongly support the intuition that led to the design of the nanocrystalline model system in Fig. 1. Namely, at least as far as the low-temperature thermodynamic properties of NCMs are concerned, detailed microstructural averaging is not as important as the incorporation of the finite grain size and a realistic description of the structure and physics of the inhomogeneous regions.

Lattice-dynamics calculations can also provide the free energy of the system. In Fig. 6, the temperature and grain-size dependence of the free energies of the {115}{111} model NCMs are compared with those of the glass and of the perfect crystal. Because of the use of the quantum-harmonic approximation, these results are strictly valid only at lower temperatures, typically up to about half the melting point [28]. According to Fig. 6, the glass and the NCMs exhibit a much higher free energy than the perfect crystal, as one would expect for these metastable phases. Interestingly, however, below a grain size of about 14 Å, the NCMs are unstable with respect to the glass. The grain size at which this free-energy-based transition occurs is remarkably independent of the temperature; it can be expected to depend, however, on the particular GB incorporated into the monodisperse model system in Fig. 1 and on the interatomic potential chosen for the simulations. Also, because the model NCM is so highly idealized, the actual grain size at which this transition occurs in a real material will be a function of the degree of microstructural averaging (i.e., of the distributions in the grain shapes, grain sizes and the types of GBs) and of the actual material considered, including the effects of interfacial chemistry. Irrespective of these factors, however, the existence of a reversible, free-energy-based transition between the NCM and the glass appears physically reasonable, given the common origin of the observed effects in the atomic-level structural disorder and in the related phonon spectra. The existence of such a transition has, indeed, been reported for nanocrystalline silicon if the grain size is reduced below about 20 Å [29].

Figure 6. Comparison of the temperature and grain-size dependence of the free energies of {115}{111} model nanocrystalline materials (open symbols) with those of the 500-atom glass (dashed line) and the perfect, 500-atom crystal (full symbols) [13, 14].

Analogous to the crystal-to-liquid ("melting") and crystal-to-glass ("solid-state amorphization") transitions [30] (see also Chapter 6.11), the existence of a free-energy-based transition from the NCM to the glass below a critical minimum grain size can be expected to involve the nucleation of disorder at lattice defects and the subsequent growth of the amorphous phase into the grain interiors. To investigate the regions where this nucleation can actually occur, the zero-temperature energy distribution functions for the {115}{111} model NCM and for the glass are compared in Fig. 2(a) and (b). As discussed in Section 1.1, the NCM exhibits three peaks, indicating three distinct types of crystal environments experienced by the atoms. The lowest-energy peak in Fig. 2(a) (near the perfect-crystal value indicated by the arrow) corresponds to the grain interiors which, although slightly distorted, are essentially single crystalline. This comparison suggests that nucleation of the amorphous phase is energetically possible at the grain junctions and the highest-energy GBs, which give rise to the two peaks in Fig. 2(a) with the highest energies.
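The quantities in Figs. 5 and 6 follow from the vibrational frequencies via the standard quantum-harmonic expressions: with x = hν/k_B T, each mode contributes k_B x² e⁻ˣ/(1 − e⁻ˣ)² to the heat capacity and hν/2 + k_B T ln(1 − e⁻ˣ) to the free energy. A minimal Python sketch of this post-processing (the frequency list is an assumed input, e.g., from diagonalizing the dynamical matrix; this is not the authors' code):

    import numpy as np

    H = 4.135667e-15   # Planck constant, eV s
    KB = 8.617333e-5   # Boltzmann constant, eV/K

    def heat_capacity(freqs_thz, T):
        """Harmonic heat capacity per mode, in units of kB."""
        x = H * np.asarray(freqs_thz) * 1e12 / (KB * T)
        return np.mean(x ** 2 * np.exp(-x) / np.expm1(-x) ** 2)

    def free_energy(freqs_thz, T):
        """Quantum-harmonic free energy per mode (eV): zero-point + thermal."""
        nu = np.asarray(freqs_thz) * 1e12
        x = H * nu / (KB * T)
        return np.mean(0.5 * H * nu + KB * T * np.log(-np.expm1(-x)))

The excess specific heat of Fig. 5 is then the difference between heat_capacity evaluated with the NCM (or glass) spectrum and with the perfect-crystal spectrum.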
The above thermodynamics-based results have a number of important implications for the structure and properties of both nanocrystalline materials and glasses.

(a) By analogy with the formation of a two-phase region upon melting, the smaller grains, with diameters below the critical size, can become amorphous while the larger grains remain crystalline, resulting in a two-phase microstructure seen, for example, in ball-milled Si [23]. Similarly, the observation that NCMs synthesized by crystallization of amorphous alloys exhibit a smallest grain size below which crystallization cannot occur [21, 22] follows naturally from Fig. 6.

(b) The observation that solid-state amorphization via mechanical attrition proceeds via a nanocrystalline precursor state with a lower limit in the grain size [19] finds a natural explanation in terms of the above free-energy plots, as does the observation of similar excess energies of the nanocrystalline precursor state and the amorphous phase [20].

(c) It appears that the synthesis of pure, submicron-grained polycrystalline metals by severe plastic deformation [24] does not result in solid-state amorphization because, according to Fig. 6, the grain size required for a nanocrystalline-to-amorphous transition to be possible is orders of magnitude too large. However, even for grains that are larger than the minimum critical size, one can expect that amorphous nuclei can kinetically be formed at the grain junctions and the highest-energy GBs (see Fig. 2a). Upon annealing, these thermodynamically unstable disordered regions disappear, as observed in the experiments [24].

Finally, one might speculate on how the transition from the glass to the NCM actually takes place at the atomic level, i.e., on the mechanism by which the atoms reshuffle when the material transforms from a short-range-ordered glass structure to a long-range-ordered polycrystalline structure. Clearly, only minor changes are necessary in the nearest-neighbor environment of the atoms (see Fig. 3). A simple and rather natural way of establishing long-range order from the glass would be possible if the intermediate-range structure of the glass contained the "microstructural fingerprint" of the NCM, namely connected, less well coordinated regions of lower density and higher energy density into which the better coordinated, more perfect-crystal-like "grains" are embedded. That intermediate-range density fluctuations lead to an inhomogeneous distribution of the free volume in liquids and glassy materials has, indeed, been reported from Fabry–Perot and Raman spectroscopy [31]. These fluctuations were interpreted in terms of a "microstructure" consisting of nanometer-sized clusters with different densities, dynamics, etc.

In conclusion, in spite of its conceptual simplicity, the highly idealized model of a nanocrystalline material shown in Fig. 1 combines the severe microstructural constraints associated with a small grain size with a realistic
treatment of the GBs, and thus provides insights not readily obtained from experiments. Most notably, simulations of this model demonstrate remarkably similar phonon spectra of the nanocrystalline and amorphous phases of the same material, giving rise also to similar thermodynamic properties at low temperatures, in particular an anomaly in their specific heats. The possibility of a reversible, free-energy-based transition between the nanocrystalline material and the glass suggests that their atomic structures may share common elements; these may kinetically enable local amorphous-phase formation in NCMs and, conversely, give the glass an NCM-like intermediate-range structure.
2. Molecular-Dynamics Synthesis of Nanocrystalline Model Microstructures
In addition to the distributions in the grain size and grain shapes, a coarse-grained polycrystal contains GBs with very different structures and a wide spectrum of energies and properties. In close analogy to the classification of the structure of free surfaces, much recent work has suggested the usefulness of distinguishing among three different types of GBs: special high-angle GBs (analogous to flat surfaces), dislocation boundaries (analogous to stepped surfaces), and general high-angle GBs (analogous to surfaces with overlapping steps). (For recent reviews, see Ref. [32] and Section 6.9.)

The least understood of these are the general high-angle GBs, i.e., GBs with a structure consisting of completely overlapping dislocation cores. Although in a coarse-grained polycrystal this type usually represents only a relatively small fraction of the GBs, the pronounced properties of these GBs, particularly their high mobility and diffusivity coupled with a low sliding resistance and cohesion, can dominate the evolution of polycrystalline microstructures [33]. Because of the completely overlapping dislocation cores, the structural disorder in these GBs is distributed rather homogeneously, in a manner analogous to the surfaces of amorphous materials [34]. By contrast, similar to stepped surfaces, in dislocation boundaries the structural disorder is inhomogeneously distributed, consisting of well-defined, usually highly disordered dislocation cores separated by elastically distorted, perfect-crystal-like regions [35].

In an attempt to incorporate a distribution of GBs into nanocrystalline microstructures, Phillpot et al. [36, 37] used molecular-dynamics (MD) simulations to synthesize NCMs from a supercooled melt into which small crystalline seeds with more or less random orientations had been inserted. The subsequent crystal-growth simulation resulted in fully dense, impurity- and porosity-free microstructures with fully equilibrated GBs that were subjected to full structural characterization.
Figure 7 shows a cross-section through an fcc microstructure thus synthesized [36, 37]. The two crystallites in the lower half of the figure are of particular interest, as their seeds were oriented so as to form a coherent twin boundary (i.e., the symmetric tilt boundary on the (111) plane of the fcc lattice). In its optimum translational state, this GB has mirror-plane symmetry and, hence, an extremely low energy (of 1 mJ/m² for the Lennard–Jones potential used in these simulations; this potential is identical to that used in the studies described in Section 1). This energy is so low because only the third-nearest neighbors of the atoms at the GB are affected by the presence of the interface. However, in these simulations a small rigid-body translation away from the optimum, mirror-plane translational state was imposed on the original seeds; such translations might be present during the initial stages of the powder processing of nanocrystalline materials. The simulations revealed that this translation could not be optimized during the crystal-growth simulation, resulting in the highly
Figure 7. Structural cross-sections of thickness 0.4a0 through the centers of four of the eight grains in the cubic, 3d periodic simulation cell. (a0 = 3.616 Å is the lattice parameter of the Lennard–Jones potential for Cu used in these simulations.) The different symbols denote different nearest neighbor miscoordinations, ranging between −3 (open squares), −2 (open triangles), −1 (open circles), 0 (small dots) and +1 (solid circles). (For more details, see Ref. [36].)
disordered structure of this GB seen in Fig. 7, with the rather high energy of 701 mJ/m² for the same Lennard–Jones potential. This high energy arises from the highly constrained nanocrystalline microstructure, in which the rigid-body translations of the grains parallel to the GB plane cannot be fully optimized, by contrast with an entirely unconstrained GB in a bicrystal.

The GB structural disorder in this nanocrystalline microstructure was characterized in a rather simplistic way involving nearest-neighbor coordination, by simply determining the number of missing or extra nearest neighbors of each atom. In their first study of the deformation of nanocrystalline materials, Schiotz et al. [38] applied the much more powerful method of common-neighbor analysis (CNA), in which atoms are characterized as being either in a perfect-crystal fcc environment, in an hcp environment (distinguished from fcc by the third neighbors) or miscoordinated (see Fig. 8).

The extension of the work on fcc metals to silicon [39–41] further elucidated the connection between the GBs present in nanocrystalline and coarse-grained microstructures. The simulations of nanocrystalline Si [41], involving grain sizes of up to about 7 nm, revealed the presence of highly disordered GBs with a more or less uniform thickness.
Figure 8. Structural characterization of a deformed nanocrystalline microstructure using common-neighbor analysis (CNA). The atoms with perfect fcc coordination are shown as white; those that are miscoordinated in the first neighbor shell are shown as dark. The atoms shown as gray have perfect hcp coordination, which differs from fcc coordination only in the third neighbor shell [38].
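The simpler miscoordination measure used for Fig. 7 reduces to a neighbor count within a first-shell cutoff; a full common-neighbor analysis additionally inspects the bond topology among shared neighbors and is omitted here. A schematic Python version of the coordination count (the cutoff at 1.2 times the fcc nearest-neighbor distance is an assumed, conventional choice):

    import numpy as np

    def miscoordination(positions, box_length, a0):
        """Excess (+) or missing (-) nearest neighbors relative to fcc (12)."""
        cutoff = 1.2 * a0 / np.sqrt(2.0)  # fcc first-neighbor distance is a0/sqrt(2)
        n = len(positions)
        coord = np.zeros(n, dtype=int)
        for i in range(n):
            d = positions - positions[i]
            d -= box_length * np.round(d / box_length)   # minimum image
            r = np.linalg.norm(d, axis=1)
            coord[i] = np.count_nonzero(r < cutoff) - 1  # exclude the atom itself
        return coord - 12   # 0 = perfect fcc, as in the symbol code of Fig. 7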
For comparison, extensive simulations of microstructurally unconstrained, bicrystalline Si GBs [39, 40] revealed a universal, highly disordered atomic structure of all the high-angle, high-energy GBs. Quantitative structural characterization in terms of the radial distribution function, g(r), revealed that this universal structure was virtually indistinguishable from that of both the GBs present in nanocrystalline Si with random grain orientations and bulk amorphous Si (compare with Fig. 1(c) for Pd) [39, 40]; by contrast, high-angle but low-energy bicrystalline GBs were found to exhibit good crystallinity (see Figs. 14 and 16 in Chapter 6.9). This work also revealed that the disordering of the high-energy GBs is driven by a lowering of the GB energy. It was therefore concluded that the existence of a highly disordered ("confined amorphous") GB phase in NCMs as well as in high-energy bicrystals represents a thermodynamic rather than a kinetic effect. This work also demonstrated that nanocrystalline-Si microstructures with randomly oriented grains contain mostly high-energy, large-unit-cell or incommensurate GBs with atomic structures that are qualitatively identical to those of high-energy bicrystalline GBs [36–42].

Similar simulations for nanocrystalline Pd [43] using the Pd EAM potential of Foiles and Adams [44] yielded qualitatively identical results although, by contrast with Si, fcc metals do not have a stable bulk amorphous phase in terms of which the degree of GB structural disorder can be quantified. As in the Si simulations [41], the three-dimensionally (3d) periodic, cubic simulation cell used in the Pd simulations contained four randomly oriented seed grains in an fcc arrangement; these seeds were embedded in the melt filling the rest of the cell. The fully dense microstructure thus obtained after the crystal-growth simulation consists of dodecahedral grains delimited by GBs, triple lines and higher-fold point junctions [43]. In order to generate a driving force for crystal growth, the melt containing the seeds was cooled down to 800 K (i.e., well below the melting point of Tm ∼ 1500 K [44, 45]). The entire system (melt plus seeds) was subsequently allowed to evolve freely at constant temperature and under zero pressure. To ensure that the microstructure thus obtained was indeed stable, prior to structural characterization the sample was subjected to thermal annealing at 600 K under zero external stress for 30 000 MD steps, followed again by cooling to 0 K. The final structure thus obtained, with a grain size of 8 nm, was found to be practically the same as the unannealed one, demonstrating that it is thermally stable against grain growth, at least on an MD time scale.

To characterize the microstructures thus obtained, planar cuts of thickness a0 were made. Figure 9 shows gray-scale contours of equal energy per atom for a slice parallel to the microstructural (111) planes; due to the 3d periodicity imposed on the simulation cell, this cut slices through all four grain centers. Clearly all GBs (seen as dark regions) have roughly the same width, while the triple lines appear to be slightly wider. Although the grain interiors appear to be perfect-crystal like, close inspection reveals a number of (111) twins
developed during the growth process. Their formation is not surprising, given their extremely low energy (of ∼3 mJ/m² for this Pd potential); this low energy is also the reason that they are not visible in the energy contour plot [43].

Figure 9. Energy-per-atom gray-scale contour plot for a (111) slice through an fcc Pd microstructure that contains the centers of all four grains of uniform grain shape and size. Dark regions indicate high excess energy. (Periodic images of some of the grains are also shown [43].)

The average structure of the material, as characterized by the system-averaged, overall radial distribution function, g(r), is shown in Fig. 10(a). The sharp crystalline peaks originate from the ordered grain interiors; the non-vanishing background between the peaks indicates the presence of structural disorder. To elucidate the origin of this non-crystalline signal, the local radial distribution functions associated with the GBs, triple lines and point junctions were determined by considering only those atoms in the system with the highest excess energies. According to Fig. 10(b), these local distribution functions reveal a complete absence of long-range order in the highly disordered GBs. In particular, the second crystalline peak has almost completely disappeared. Remarkably, this distribution function is virtually indistinguishable from the radial distribution function of the bulk Pd glass (dashed line; see also Fig. 12(a) in Chapter 6.9).
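The local distribution functions of Fig. 10(b) are obtained by restricting the pair counting to the most disordered atoms. A sketch of that selection step (Python, reusing the radial_distribution helper sketched in Section 1.1; the 10% energy threshold is an assumed illustration, and the density normalization of the subset is only schematic):

    import numpy as np

    def local_rdf(positions, energies, box_length, r_max, frac=0.10):
        """g(r) restricted to the fraction of atoms with the highest excess energies."""
        n_keep = max(2, int(frac * len(positions)))
        idx = np.argsort(energies)[-n_keep:]   # highest-energy (GB/junction) atoms
        return radial_distribution(np.asarray(positions)[idx], box_length, r_max)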
Figure 10. (a) System-averaged radial distribution function for a Pd microstructure with a grain size of 8 nm (see also Fig. 9). (b) Local radial distribution function for the GB atoms; similar distributions were obtained for the atoms located in the line and point grain junctions. For comparison, g(r) for the bulk Pd glass is also shown [43].

A comparison of Fig. 10(b) with Fig. 12(a) in Chapter 6.9 reveals that the GBs in the nanocrystalline microstructure have a structure that is virtually identical to that of the high-angle (110) twist GB, i.e., to the universal structure of the high-energy GBs in coarse-grained Pd. The fact that, with an energy of 1025 mJ/m², the (110) GB is, indeed, representative of the high-energy GBs in the nanocrystalline material is supported by the histogram of GB energies in the nanocrystalline microstructure (see Fig. 11).
Figure 11. Histogram of GB energies obtained for the 24 GBs in the microstructure in Fig. 9. The energies of several bicrystalline GBs are also indicated [43].

The narrow spread of GB energies, with none of the 24 GBs in the system having an energy lower than 800 mJ/m² or higher than 1300 mJ/m², originates from the random grain misorientations that give rise to GBs having both tilt and twist components, thus effectively inhibiting the formation of low-angle and "special" low-energy GBs (i.e., high-angle GBs on special, low-index lattice planes [32]). There is, however, no reason to assume that low-angle and special low-energy GBs do not exist in real materials, in which the misorientations between neighboring grains may not be random.
3. Grain-Boundary Diffusion Creep in Nanocrystalline Pd
If the structure and properties of the high-energy GBs in a nanocrystalline microstructure are, indeed, identical to those of high-energy bicrystalline GBs, the simulation of GB diffusion creep in the idealized microstructures considered in Section 2 (see, e.g., Figs. 7 and 9) should provide an important test case. From experimental studies of coarse-grained polycrystals it is well known that Coble creep involves homogeneous grain elongation via GB diffusion, with a creep rate that is linear in the stress, proportional to d⁻³, and governed by the activation energy of GB diffusion [49]. Due to this d⁻³ increase of the strain
rate with decreasing grain size, d, according to the well-known Coble-creep formula [49],

$$\dot{\varepsilon} = A \, \frac{\sigma \, \Omega_D}{k_B T} \, \frac{\delta_D D_{GB}}{d^{3}} \qquad (1)$$
Coble creep should be observable in nanocrystalline materials even during the short observation window of ∼10⁻⁹ s typically accessible by MD simulation. (In Eq. (1), δ_D D_GB is the diffusion flux in the GBs, with diffusion constant D_GB, width δ_D and activation volume Ω_D; k_B T is the thermal energy, and A is a geometric constant depending on the grain shape.) To test the GB structural model described above, MD simulations of 3d-periodic nanocrystalline Si [50] and Pd [51] microstructures were performed. So as to enable steady-state diffusion creep to be observed unencumbered by grain growth, the model microstructures were tailored to have a uniform grain size and shape, with random grain orientations. As illustrated in Fig. 12 for the case of nanocrystalline Pd [51], these microstructures, indeed, exhibited steady-state diffusion creep that is homogeneous, linear in the stress, and with a strain rate that agrees quantitatively with the Coble-creep formula in Eq. (1) [49].
Figure 12. Total (elastic + plastic) strain vs. simulation time for a 7.6 nm grain size at 1200 K under different tensile stresses [51].
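To see why Eq. (1) brings diffusion creep within reach of an MD observation window, one can simply evaluate the expected strain rate at nanometer grain sizes. A back-of-the-envelope Python sketch; all numerical values below are illustrative placeholders, not the quantities extracted in Ref. [51]:

    KB = 1.380649e-23  # J/K

    def coble_strain_rate(stress, d, T, delta_D_gb, omega, A=50.0):
        """Coble-creep strain rate, Eq. (1): linear in stress, proportional to d**-3."""
        return A * stress * omega * delta_D_gb / (KB * T * d ** 3)

    # Illustrative numbers only (assumed, not fitted values):
    stress = 0.4e9            # Pa
    omega = 1.5e-29           # m^3, approximate atomic volume of Pd
    delta_D_gb = 1e-9 * 1e-9  # m^3/s, GB width (~1 nm) times an assumed GB diffusivity
    for d in (3.8e-9, 7.6e-9, 15.2e-9):
        print(d, coble_strain_rate(stress, d, 1200.0, delta_D_gb, omega))

With these assumed inputs the rate at d = 7.6 nm comes out in the 10⁷ s⁻¹ range, i.e., strains of order one percent within a nanosecond, consistent with the observation window quoted above.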
The grain-size dependence of the creep rate, ε˙, represents an important characteristic of the deformation mechanism. The observed linear relation between ε˙ and σ [51] suggests that, for a given grain size and temperature, the strain rates obtained for different stresses (Fig. 12) should – within the error bars – collapse onto a single point, ε˙/σ. The log–log plot of ε˙/σ vs. the grain size, d, in Fig. 13 collects all the data points obtained at 1200 K. In this representation the strain rates, indeed, fall on a universal curve (dashed line), showing a d⁻³ dependence for the larger grain sizes. By contrast with Eq. (1), however, the d⁻² dependence seen for the smallest grain sizes is a characteristic of Nabarro–Herring rather than Coble creep. This apparent discrepancy is readily resolved by recognizing that in the small-grain-size limit, Coble and Nabarro–Herring creep become essentially indistinguishable. In fact, Yamakov et al. [51] were able to show that the d⁻² dependence follows naturally from Coble's derivation as the limit in which δ_D/d ∼ 1.

Figure 13. Scaling plot of ε˙/σ vs. d showing that all the data points obtained for different stresses at 1200 K collapse onto a single curve, thus indicating that the strain rate increases linearly with stress (see Eq. (1)). The dashed curve is merely a guide to the eye. The increase in the error bars with decreasing grain size is due to the greater equilibrium fluctuations in the smaller systems [51].

The activation energy represents an important fingerprint of the underlying deformation mechanism. The Arrhenius plot in Fig. 14 for the strain rates extracted from these simulations (see, e.g., Fig. 12) yields an activation energy of 0.61 ± 0.1 eV [51]. This value is in remarkable agreement with the universal high-temperature activation energy of 0.60 ± 0.1 eV for bicrystalline GBs determined in separate simulations of Pd self-diffusion in high-angle, high-energy GBs [45], i.e., for GB diffusion in the absence of applied stress and of any microstructural constraints associated with the small grain size. This comparison reveals that GB diffusion is, indeed, the deformation-rate-limiting process. Moreover, the fact that the activation energy for the creep in the nanocrystalline microstructure is the same as that for diffusion in high-energy, bicrystalline GBs confirms that the high-energy GBs predominantly present in nanocrystalline materials have a structure and properties that are virtually indistinguishable from those of the microstructurally unconstrained, extended GBs present in coarse-grained materials.

Figure 14. Arrhenius plot of the strain rates at a stress of 0.4 GPa. The melting point for this potential is about 1500 K [51].
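The activation energy quoted above is simply the negative slope of ln ε˙ vs. 1/k_B T. A minimal least-squares sketch (Python; the sample data are constructed placeholders, not the simulation values of Ref. [51]):

    import numpy as np

    KB = 8.617333e-5  # eV/K

    def activation_energy(temperatures, strain_rates):
        """Arrhenius fit: ln(rate) = ln(rate0) - Q/(kB*T), so Q = -slope."""
        x = 1.0 / (KB * np.asarray(temperatures))
        y = np.log(np.asarray(strain_rates))
        slope, _ = np.polyfit(x, y, 1)
        return -slope  # eV

    # Constructed placeholder data with Q = 0.61 eV built in:
    T = np.array([900.0, 1000.0, 1100.0, 1200.0, 1300.0])
    rates = 2e9 * np.exp(-0.61 / (KB * T))
    print(activation_energy(T, rates))  # recovers ~0.61 eV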
4. Grain-Boundary Structural Model for Nanocrystalline Materials
The significant body of atomic-level simulations described above suggests important similarities and some differences between the GBs in coarse-grained polycrystalline and nanocrystalline microstructures. The most important differences seem to arise from the severe microstructural constraints present in
nanocrystalline materials. As indicated by the highly disordered structure of the coherent twin boundary in Fig. 7, these constraints can have the effect of converting what would otherwise be a "special" (i.e., dislocation-free, low-energy) GB into a highly disordered, high-energy interface. However, as evidenced by the observation in nanocrystalline Pd of perfectly ordered coherent twins transecting the grain interiors [43], the structures of "special" boundaries in a nanocrystalline microstructure may not be unique, but rather may depend on the nature of the microstructural confinement locally at the GB, and hence on the synthesis conditions and the degree of "thermal relaxation" of the material following its synthesis. Therefore, while nanocrystalline and coarse-grained polycrystalline microstructures appear to contain the same three types of GBs (see Section 2 above), the most important measure of the differences between them probably lies in their GB energy distribution functions. Whereas coarse-grained materials usually exhibit a broad distribution of GB energies, the severe microstructural constraints present in nanocrystalline microstructures seem to have the effect of significantly increasing the fraction of high-energy GBs at the expense of the special boundaries: the more severe the microstructural constraints become (e.g., by decreasing the grain size or by the use of a highly non-equilibrium synthesis route), the larger appears to be the fraction of the high-energy boundaries in the system. Valiev and coworkers [52, 53] have clearly demonstrated that in nanocrystalline metallic materials produced by severe plastic deformation, one observes highly non-equilibrium, high-angle GBs with a structure that facilitates significant GB sliding in the context of nanocrystalline superplasticity. Since, due to their high mobility and diffusivity coupled with a low sliding resistance and high GB energy, these boundaries dominate the evolution of these microstructures in response to stress and temperature, their effect should be particularly pronounced in nanocrystalline materials.

The above simulations also suggest an intriguing structural and thermodynamic relation between nanocrystalline materials and bulk amorphous solids. Although the high-angle, high-energy GBs represent only a (synthesis- and processing-dependent) fraction of the GBs in a nanocrystalline microstructure, their structure and behavior strongly resemble Rosenhain's historic "amorphous-cement" model [46–48]. In fact, in the (hypothetical) limit in which all GBs in a nanocrystalline material are of this type, the two-phase microstructure consists of crystalline grains embedded in a glassy, intergranular, glue-like phase. In practice, this intergranular amorphous phase is disrupted by more inhomogeneously disordered dislocation boundaries and, perhaps, a few special boundaries. Given that in bicrystals this amorphous intergranular phase seems to be of thermodynamic, rather than purely kinetic, origin, it appears that a rather intimate relation exists between nanocrystalline microstructures and the bulk amorphous phase. These observations are also consistent with the
simulations of the phonon density of states and of the related free energy described in Section 1 [12–14], which demonstrated that below a certain critical grain size (of ∼ 1.5–2 nm) nanocrystalline microstructures are thermodynamically unstable with respect to the amorphous phase.
Acknowledgment

This work was supported by the US Department of Energy, BES-Materials Science, under contract W-31-109-Eng-38.
References

[1] H. Gleiter, in: N. Hansen et al. (eds.), Proceedings of the Second Risø International Symposium on Metallurgy and Materials Science, Roskilde, Denmark, p. 15, 1981.
[2] R. Birringer et al., Phys. Lett. A, 102, 365, 1984.
[3] X. Zhu et al., Phys. Rev. B, 35, 9085, 1987.
[4] H. Gleiter, Prog. Mater. Sci., 33, 223, 1989.
[5] J. Rupp and R. Birringer, Phys. Rev. B, 36, 7888, 1987; A. Tschöpe and R. Birringer, Acta Metall. Mater., 41, 2791, 1993; Phil. Mag. B, 68, 2223, 1993.
[6] H.G. Klein, Diplom Thesis, Universität des Saarlandes, November 1992 (unpublished).
[7] J. Jiang et al., Solid State Commun., 80, 525, 1991; U. Herr et al., Appl. Phys. Lett., 50, 472, 1987.
[8] C.A. Melendres et al., J. Mater. Res., 4, 1246, 1989.
[9] G.J. Thomas et al., Scripta Metall., 24, 201, 1990.
[10] J. Eastman et al., Phil. Mag. B, 66, 667, 1992.
[11] See, for example, D. Wolf, in: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-Level Structure and Properties, Chapman and Hall, London, p. 16, 1992.
[12] D. Wolf, J. Wang, S.R. Phillpot, and H. Gleiter, Phys. Rev. Lett., 74, 4686, 1995.
[13] D. Wolf, J. Wang, S.R. Phillpot, and H. Gleiter, Phys. Lett. A, 205, 274, 1995.
[14] D. Wolf, J. Wang, S.R. Phillpot, and H. Gleiter, Phil. Mag. A, 73, 517, 1996.
[15] D. Wolf and K.L. Merkle, Chapter 3 in: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-Level Structure and Properties, Chapman and Hall, London, pp. 87–150, 1992.
[16] D. Wolf and J. Lutsko, Phys. Rev. Lett., 60, 1170, 1988.
[17] D. Wolf and J. Jaszczak, J. Comput. Aided Mater. Des., 1, 111, 1993.
[18] See, for example, P.J. Desre, Nanostruct. Mater., 4, 957, 1994.
[19] J. Eckert, J.C. Holzer, C.E. Krill, and W.L. Johnson, J. Mater. Res., 7, 1751, 1992.
[20] A.R. Yavari, Mater. Sci. Eng. A, 179/180, 20, 1994.
[21] Y. Yoshizawa, S. Oguma, and K. Yamauchi, J. Appl. Phys., 64, 6044, 1988.
[22] K. Lu, Phys. Rev. B, 51, 18, 1995.
[23] T.D. Shen, C.C. Koch, T.L. McCormick, R.J. Nemanich, J.Y. Huang, and J.G. Huang, J. Mater. Res., 10, 139, 1995.
[24] R.Z. Valiev, E.V. Kozlov, Y.F. Ivanov, J. Lian, A.A. Nazarov, and B. Baudelet, Acta Metall. Mater., 42, 2467, 1994.
[25] Y.-W. Kim, H.-M. Lin, and T.F. Kelley, Acta Metall., 37, 247, 1989.
[26] E.D. Obraztsova et al., Nanostruct. Mater., 6, 827, 1995.
[27] J. Trampeneau, K. Bauszus, W. Petry, and U. Herr, Nanostruct. Mater., 6, 551, 1995.
[28] See, e.g., A.A. Maradudin et al., in: H. Ehrenreich et al. (eds.), Solid State Physics, Suppl. 3, 1971.
[29] S. Veprek, Z. Iqbal, H.R. Oswald, and A.P. Webb, J. Phys. C, 14, 295, 1981.
[30] D. Wolf, P.R. Okamoto, S. Yip, J.F. Lutsko, and M. Kluge, J. Mater. Res., 5, 286, 1990.
[31] R.W. Fischer, Physica A, 201, 183, 1993.
[32] D. Wolf, "Grain boundaries: structure," in: R.W. Cahn (principal ed.), The Encyclopedia of Materials: Science and Technology, Pergamon Press, 2001.
[33] F.J. Humphreys and M. Hatherly, Recrystallization and Annealing Phenomena, Pergamon, 1995.
[34] J.C.M. Li, J. Appl. Phys., 32, 525, 1961.
[35] W.T. Read and W. Shockley, Phys. Rev., 78, 275, 1950.
[36] S.R. Phillpot, D. Wolf, and H. Gleiter, J. Appl. Phys., 78, 847, 1995.
[37] S.R. Phillpot, D. Wolf, and H. Gleiter, Scripta Metall. Mater., 33, 1245, 1995.
[38] J. Schiotz, F.D. DiTolla, and K.W. Jacobsen, Nature, 391, 561, 1998.
[39] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, Phys. Rev. Lett., 77, 2965, 1996.
[40] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, J. Am. Ceram. Soc., 80, 717, 1997.
[41] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, Acta Mater., 45, 987, 1997.
[42] P. Keblinski, S.R. Phillpot, D. Wolf, and H. Gleiter, Phil. Mag. Lett., 76, 143, 1997.
[43] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Scripta Mater., 41, 631, 1999.
[44] S.M. Foiles and J.B. Adams, Phys. Rev. B, 40, 5909, 1989.
[45] P. Keblinski, D. Wolf, S.R. Phillpot, and H. Gleiter, Phil. Mag. A, 79, 2735, 1999.
[46] W. Rosenhain and J.C.W. Humfrey, J. Iron Steel Inst., 87, 219, 1913.
[47] W. Rosenhain and D. Ewen, J. Inst. Metals, 10, 119, 1913.
[48] For an excellent description of Rosenhain's amorphous-cement model, see K.T. Aust and B. Chalmers, in Metal Interfaces, ASM, p. 153, 1952.
[49] R.L. Coble, J. Appl. Phys., 34, 1679, 1963.
[50] P. Keblinski, D. Wolf, and H. Gleiter, Interface Sci., 6, 205, 1998.
[51] V. Yamakov, D. Wolf, S.R. Phillpot, and H. Gleiter, Acta Mater., 50, 61, 2002.
[52] R.Z. Valiev, R.K. Islamgaliev, and I.V. Alexandrov, Prog. Mater. Sci., 45, 103, 2000.
[53] R.Z. Valiev, R. Islamgaliev, and N. Yunusova, Mater. Sci. Forum, 357–359, 449, 2001.
Chapter 7 MICROSTRUCTURE
7.1 INTRODUCTION: MICROSTRUCTURE David J. Srolovitz1 and Long-Qing Chen2 1 Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ 08544, USA 2 Department of Materials Science and Engineering, Penn State University, University Park, PA 16802, USA
There is a very common observation that two nominally identical samples of a material may exhibit remarkably different properties. Such differences can often be traced back to how a material was synthesized or subsequently processed. If two materials have exactly the same composition, how can their properties differ? The answer is generally associated with heterogeneities in the material. These heterogeneities may be associated with spatial distributions of phases of different compositions and/or crystal structures, grains of different orientations, domains of different structural variants, domains of different electrical or magnetic polarizations, as well as with structural defects such as dislocations. Materials scientists routinely manipulate these inhomogeneities to optimize the properties of materials. Contrary to common usage, "microstructure" does not refer to the scale of the structure (e.g., the micrometer scale). Rather, "microstructure" describes the arrangement of the defects or the compositional inhomogeneity within a material. In this sense, microstructure exists on scales ranging from tens of atoms up to the scale above which the material behaves as a homogeneous continuum (often millimeters). What types of defects constitute the microstructure? These include, for example, dislocations, grain boundaries, interphase boundaries, domain walls, precipitates/inclusions, cracks, and surfaces. The shape of an individual extended defect is itself a type of microstructure, since shape is simply a statement of the spatial arrangement of segments of the continuous defect; however, the term "morphology" is often used to describe this type of microstructure. In order to characterize a particular type of microstructure we must first choose a description. Practical descriptions of microstructure resolve scales that are fine on the scale of the spacing between defects (e.g., the grain size), but
not those that characterize the defect itself (e.g., the width of a grain boundary). Such descriptions may simply identify the location of a defect, but could also carry additional information, such as the Burgers vector of a dislocation. Generally, the second issue is to characterize the thermodynamics of the system. In cases where the defect only disrupts the crystal lattice on a scale that is very small compared with the spacing between defects, a thermodynamic description may only account for the total length or area of the defects and, possibly, be a function of the defect character (e.g., the change in crystal orientation upon crossing a boundary and/or the boundary inclination). However, in many cases, defects and volume elements in a microstructure interact with one another over very large distances. A classical case is a dislocation microstructure. Other examples include coherent microstructures, ferroelectric domain structures, and ferromagnetic domain structures, in which long-range elastic, electrostatic, and magnetic interactions exist. In such cases, a thermodynamic description may require summation or integration over the entire volume of the material. The third issue that we must decide is how to describe the temporal evolution of the microstructure. In most types of microstructure evolution problems, the defect dynamics are over-damped. In such cases, it is reasonable to assume that the defect velocity scales with the derivative of the local chemical potential with respect to defect position (i.e., a thermodynamic force). In cases where the variables describing the defect are not conserved, this relation may simply be a proportionality between velocity and thermodynamic force. In the conserved case, the rate of change of the microstructure descriptor will be proportional to the Laplacian of the thermodynamic force; a minimal sketch of this distinction is given at the end of this introduction. In both cases, the proportionality constant is some type of mobility. Mobilities may be isotropic or anisotropic. Similarly, the rate of change of one variable may depend on gradients of the free energy with respect to non-conjugate variables, i.e., an Onsager form. There are times when the simulator faces a situation in which the thermodynamic properties of the system are unavailable, but experimental measurements of the defect velocity as a function of external variables have been made. In such situations, some form of front-tracking method may be applied. The fourth issue (and often the most difficult one) is where to get all of the parameters needed to describe the thermodynamics and kinetics. This issue is especially difficult in situations in which the key parameters are functions of many variables – e.g., the properties of grain boundaries depend on (at least) five distinct parameters (three to describe the misorientation between grains and two to specify the boundary plane). In cases where the parameter space is high-dimensional, we can (sometimes) use symmetry to reduce the volume of the space. However, in most cases, we employ the spherical chicken approximation. That is, we arbitrarily reduce the number of parameters we use until the solution can no longer describe the physics we care about
(i.e., until you can no longer distinguish a chicken from a sphere). Even after we have reduced the volume of the parameter space as much as possible, we will still be faced with the problem of choosing the parameters. In a multi-scale modeling framework, this is where we reach down to smaller scales. For example, many thermodynamic properties can be determined directly from first-principles calculations, and kinetic parameters may be determined from atomistic simulations (perhaps combined with rate-theory ideas). In many cases, these parameters may be extracted from experimental measurements (e.g., diffusivities) or combinations of experimental measurements and empirical calculations (e.g., CALPHAD databases). Finally, the microstructure simulator will be faced with determining the properties of the microstructure that was found through the simulations (or from experiment). It is difficult to make general statements as to how to do this, since the procedures depend greatly on the properties of interest. However, this step is key to many activities, because the ultimate goal is often to determine how to process a material to achieve a desired set of properties. In other cases, it is to determine the optimal microstructure for a particular set of objectives; this is so-called "materials by design." In practice, many microstructure simulation activities stop short of this objective. Before leaving this general discussion of microstructure simulations, it is important to note that disparate microstructural features often interact in significant and surprising ways. For example, the evolution of the grain microstructure in a polycrystalline material is sensitive to the existence of precipitates (and the structure of their interfaces), the existence of domain structures within each grain, the evolution of the dislocation microstructure, and the distribution of solute within the material. Therefore, many of the most important issues in microstructure evolution can only be simulated through combinations of simulation methods. As such, microstructure evolution is itself a multi-physics, multi-scale modeling activity. This chapter is organized as follows. We begin with a series of contributions on (7.2) phase field modeling and its application to (7.3) solidification, (7.4) precipitation, (7.5) ferroic domain structures and (7.6) grain growth. The phase field grain growth contribution is followed by a description of recrystallization using cellular automata (7.7). Next, (7.8) is a discussion of the coarsening of multi-phase systems using a front-tracking approach. Diffusion-controlled phase transformation modeling by kinetic Monte Carlo and microscopic kinetic simulations are the topics of the following two contributions (7.9 and 7.10). A series of four papers then addresses dislocation dynamics and/or dislocation microstructure using (7.11) front tracking, (7.12) phase field, (7.13) level-set methods and (7.14) coarse graining ideas. Then, a series of three papers describe the evolution of thin films during deposition using (7.15) level-set, (7.16) stochastic equations, and (7.17) Monte Carlo approaches. This chapter then concludes with discussions of (7.18) microstructure optimization and
(7.19) microstructure characterization. While microstructures are being simulated today using too wide an array of approaches to be adequately surveyed in a single chapter of a handbook, we have strived to represent many of the major classes of microstructure simulation methods, many different approaches to the same microstructure evolution problem, and many uses of microstructure simulations.
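Because the distinction drawn above between non-conserved and conserved over-damped dynamics recurs throughout this chapter, a minimal one-dimensional sketch may help fix ideas. Everything in it is illustrative rather than taken from any contribution in this chapter: the double-well free energy f(φ) = (1 − φ²)²/4, the gradient coefficient, the mobility M, and all numerical values are assumptions for demonstration only.

import numpy as np

N, dx, dt, M, kappa = 256, 1.0, 0.05, 1.0, 1.0      # illustrative values

def lap(a):                                          # periodic 1-D Laplacian
    return (np.roll(a, 1) - 2.0 * a + np.roll(a, -1)) / dx**2

def mu(phi):                                         # thermodynamic force dF/dphi
    return (phi**3 - phi) - kappa * lap(phi)         # for f(phi) = (1 - phi^2)^2 / 4

rng = np.random.default_rng(0)
phi_nc = 0.01 * rng.standard_normal(N)               # non-conserved descriptor (e.g., a structural variant)
phi_c = phi_nc.copy()                                # conserved descriptor (e.g., a composition)
mean0 = phi_c.mean()

for _ in range(4000):
    phi_nc -= dt * M * mu(phi_nc)                    # velocity proportional to the force itself
    phi_c += dt * M * lap(mu(phi_c))                 # rate proportional to the Laplacian of the force

print("non-conserved mean (unconstrained):", phi_nc.mean())   # domains of phi = +/-1 form freely
print("conserved mean (preserved):", mean0, "->", phi_c.mean())

The conserved update can be read as the divergence of a flux −M∇µ, which is why the mean of the field is preserved while the pattern coarsens; the non-conserved update carries no such constraint.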
7.2 PHASE-FIELD MODELING Alain Karma Northeastern University, Boston, MA, USA
The phase-field method is a powerful simulation tool for describing the complex evolution of interfaces in a wide range of contexts without explicitly tracking these interfaces. Its main application to date has been to problems in materials science where the evolution of interfaces and defects in the interior or on the surface of a material has a profound impact on its behavior [8]. A partial list of applications to date in this general area includes alloy solidification [5], where models combine elements of the first phase-field models of the solidification of pure materials [9, 32] and the Cahn–Hilliard equation [7], solid-state precipitation [66], stress-driven interfacial instabilities [29, 41, 58], microstructural evolution in polycrystalline materials [17, 31, 36, 60], crystal nucleation [16], surface growth [13, 25, 44], thin film patterning [34], ferroelectric materials [57], dislocation dynamics [22, 49, 52, 55], and fracture [3, 11, 27, 56]. Interface tracking is avoided by making interfaces spatially diffuse with the help of order parameters that vary smoothly in space. Evolution equations for these order parameters are derived variationally from a Lyapunov functional that represents the total free-energy of the system. This theoretical construct provides great flexibility to model simultaneously various physical processes on different length and time scales within a single self-consistent set of coupled partial differential equations.
1. Phase and Grain Boundaries
The formation of alloy microstructures during solidification, and the subsequent evolution of these microstructures in the solid state, are controlled by the motion of boundaries between thermodynamically distinct phases, or between regions of the same crystalline phase with different orientations. The key ingredient of the phase-field approach is to make these interfaces spatially diffuse over a region of finite thickness. As a first example of a phase boundary,
consider a pure crystal at equilibrium with its melt. The total free-energy of this system can be written in the phenomenological form
F[φ] = ∫ dV [ (κ/2)|∇φ|² + h f(φ) ] ≡ ∫ dV Fv ,    (1)
where the integral is over the entire volume of the two-phase system, assumed to be at the melting point. The free-energy density Fv is the sum of a gradient-square term and a double-well potential f(φ) = φ²(1 − φ)² with two minima that correspond to solid and liquid. In this context, φ can be interpreted physically as a phenomenological measure of crystalline order, which varies continuously through the diffuse interface region from φ = 1 in the crystal with perfect long-range order to φ = 0 in the disordered liquid. Equivalently, φ can be seen as the envelope of the density wave that has a constant amplitude in the solid and decays into the liquid [40, 43], as illustrated by the dashed line in Fig. 1(a). For a flat interface in equilibrium, the phase-field φ only depends on the coordinate x that is normal to this interface. This equilibrium profile, φ0(x), and the interface thickness are obtained by minimization of the total free-energy, which is carried out by the standard calculus of variations. One substitutes φ(x) = φ0(x) + δφ(x) into Eq. (1) and expands the integrand to linear order in δφ, with the substitution dV = A dx, where A is the total interface area, and |∇φ|² = (dφ/dx)² ≡ φx². After integrating once by parts, and using the fact that φx vanishes away from the interface, Eq. (1) can be written in the form

δF = A ∫ dx δφ(x) [δF/δφ]_{φ(x)=φ0(x)} ,    (2)
Figure 1. Schematic plots of crystalline order parameter φ through spatially diffuse interfaces: (a) solid–liquid interface, and (b) grain boundary. The other order parameters ρ in (a) and θ in (b) are the density and crystal orientation, respectively.
where we have defined the functional derivative

δF/δφ ≡ −(d/dx)(∂Fv/∂φx) + ∂Fv/∂φ .    (3)
For φ = φ0(x) to minimize F, δF must vanish for an arbitrary smooth variation δφ(x). This yields the equilibrium condition δF/δφ = 0, and hence the equation

−κ d²φ0/dx² + h f′(φ0) = 0 ,    (4)
where f′ denotes the derivative of the double-well potential, f′(φ0) = 2φ0 − 6φ0² + 4φ0³. This equation has the kink-shaped solution φ0(x) = (1 − tanh[x/(√2 W)])/2, where W = √(κ/h) is a measure of the interface thickness. This solution interpolates smoothly between the values of φ in solid and liquid as shown in Fig. 1(a). In order to relate the diffuse interface model to a real physical system, one needs to calculate the excess free-energy of the solid–liquid interface per unit area, γsl, which can be measured experimentally or computed from atomistic simulations [21]. This excess is the total free-energy of the two-phase system minus the total free-energy of a single-phase system (either solid or liquid) occupying the same volume, divided by A, or here

γsl = ∫_{−∞}^{+∞} dx [ (κ/2)φ0x² + h f(φ0) ] .    (5)
By multiplying Eq. (4) by φ0x, one finds that κφ0x²/2 and h f(φ0) yield equal contributions to this integral, such that Eq. (5) reduces to

γsl = W h ∫_{−∞}^{+∞} dy φ0y² ,    (6)
where y = x/W, and the integral is dimensionless. Consequently, γsl scales as the product of the interface thickness and the height of the double-well free-energy density. The crystalline nature of the solid is manifest in the fact that γsl varies with the orientation of the interface with respect to a fixed set of crystal axes. The degree of variation depends generally on the atomic-scale structure of the solid–liquid interface. This anisotropy can be incorporated into the phase-field model by letting the coefficient of the gradient-square term in Eq. (1) be a function, κ(∇φ/|∇φ|), of the direction normal to the interface [38, 63, 65], where n̂ = −∇φ/|∇φ| is the unit normal pointing from solid to liquid. This approach has been applied to model dendritic evolution in materials with atomically rough interfaces and a smooth variation of γsl with orientation
[6, 24], and in materials with faceted solid–liquid interfaces that have cusps in the γ-plot [10]. Grain boundaries in two dimensions have been modeled phenomenologically by coupling two order parameters [31, 36, 60]. The first, θ, measures the local crystal orientation; θ is constant in the interior of each grain, and the jump Δθ across a grain boundary is the misorientation. The second, φ, is analogous to the order parameter used to describe the crystal–melt interface. It differentiates between perfect crystalline order in the interior of a grain (φ = 1) and disordered material at the grain boundary (φ < 1). The two-order-parameter free-energy functional is written as [36, 60]

F[φ, θ] = ∫ dV [ (κ/2)|∇φ|² + h f(φ) + s g(φ)|∇θ| + (ε²/2) H(φ)|∇θ|² ] ,    (7)
where the singular dependence on |∇θ| in the third term inside the integrand is necessary to ensure that the grain boundary is spatially localized [31]. This free-energy is rotationally invariant. Hence, both grain boundary migration and grain rotation can be modeled in dynamical applications of this model. One-dimensional stationary profiles of the two order parameters in the diffuse interface region, obtained from the equilibrium conditions δF/δφ = δF/δθ = 0, are shown schematically in Fig. 1(b). For ε ≠ 0 (dashed lines), the profiles of both order parameters are smooth, which is desired in numerical applications of this model. For ε = 0 (solid lines), θ is a step function and φ has a cusp shape. While not suited for numerics, this limit is useful for gaining insight into the static properties of the model. For the single-well function f(φ) = (1 − φ)²/2, which penalizes energetically the formation of liquid at φ = 0, and g(φ) = φ², the stationary φ-profile is

φ0(x) = 1 − (1 − φm) exp(−|x|/W) ,    (8)
where W = (κ/h)^{1/2} is the interface thickness. This expression is easily seen to be a solution of δF/δφ = κ d²φ0/dx² + h(1 − φ0) = 0 for x ≠ 0, and the balance of the θ-jump and the jump of the first derivative of φ0 at x = 0 fixes φm = 1/(1 + W s Δθ/κ). This expression implies that disorder at the center of the grain boundary is larger (φm is smaller) for a larger misorientation. Furthermore, the grain boundary energy is given by

γgb = s Δθ / (1 + W s Δθ/κ)² ,    (9)
which is the sum of the contributions in Eq. (7) due to spatial disorder (φ variation) and the θ-jump across the boundary. For a different analytical form of g(φ), the model can also reproduce the form of the Read–Shockley energy,
γgb ≈ −s Δθ ln Δθ, of a low-angle tilt grain boundary composed of a stack of edge dislocations [31]. More importantly, both solid–liquid and grain boundaries can be described simultaneously in the model by choosing again f(φ) to be a double-well potential with minima at φ = 0 and φ = 1. In this case, grain boundary wetting can occur if, above a critical misorientation, the excess free-energy of the crystal–crystal interface exceeds twice the excess free-energy of the solid–liquid interface. While the exact determination of this misorientation requires repeating the analysis of the stationary φ-profile for the double-well potential [60], its scaling can be obtained by comparing γsl (Eq. (6)) and γgb (Eq. (9)), which yields Δθ ∼ W h/s ∼ √(κh)/s. Both the crystal–melt interface and grain boundaries have also been simulated using a more microscopic "phase-field crystal" model [13], where the order parameter is the density ρ of the material. A typical profile of the density averaged spatially in the plane perpendicular to the normal to the solid–liquid interface is plotted schematically in Fig. 1(a), where each peak corresponds to a single plane of atoms. This approach is rooted in classical density functional theory [43], but does not constrain the density to be a sum of density waves corresponding to reciprocal lattice vectors of a perfect crystal. Hence, it can naturally describe dislocations and vacancies in the solid. Furthermore, because phonons are averaged out in this mean-field approach, the evolution of crystalline defects can be followed on time scales several orders of magnitude larger than in molecular dynamics simulations.
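As a quick numerical check of Eqs. (4)–(6), the following sketch (assuming κ = h = 1; all values illustrative) builds the kink profile quoted above and integrates the excess free-energy density of Eq. (5). For this quartic double-well, the dimensionless integral in Eq. (6) evaluates to 1/(3√2) ≈ 0.2357, so γsl = Wh/(3√2).

import numpy as np

kappa, h = 1.0, 1.0                                    # illustrative units
W = np.sqrt(kappa / h)                                 # interface thickness, W = sqrt(kappa/h)
x = np.linspace(-20.0 * W, 20.0 * W, 40001)

phi0 = 0.5 * (1.0 - np.tanh(x / (np.sqrt(2.0) * W)))  # kink solution of Eq. (4)
f = phi0**2 * (1.0 - phi0)**2                          # double-well potential
dphi = np.gradient(phi0, x)

integrand = 0.5 * kappa * dphi**2 + h * f              # excess free-energy density of Eq. (5)
gamma_sl = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))

print("numerical gamma_sl:", gamma_sl)
print("W*h/(3*sqrt(2))   :", W * h / (3.0 * np.sqrt(2.0)))

The two printed numbers agree to several digits, confirming that the gradient and double-well terms contribute equally to the excess, as stated below Eq. (5).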
2. Cracks
The nucleation and propagation of crack surfaces has been a topic of long-standing interest in the materials science and engineering community. To illustrate the extension of the phase-field method to fracture, let us consider the one-dimensional problem of a stationary, infinitely long crack in an elastic solid, as illustrated in Fig. 2. The original solid is stretched along the x-axis by displacing its boundaries at x = ±W by ±Δ. The crack splits this solid into two equal parts. In the traditional approach, where the crack surfaces are treated as sharp boundaries, the standard displacement field u(x) of mass points measured from their original positions is simply u(x) = xΔ/W for the stretched solid. After complete fracture, the displacement is a step function, u(x) = +Δ for x > 0 and u(x) = −Δ for x < 0, and is discontinuous at the origin, u(0±) = ±Δ. A phase-field model for cracks can be formulated by introducing a scalar field φ(x) which describes the state of the material [27]. The model retains the same parametrization of linear elasticity, where u(x) measures the displacement of mass points from their original positions. Hence, φ measures the
Figure 2. Schematic phase-field profiles vs. the material coordinate x (thick solid line) and vs. the spatial coordinate x + u(x) (dashed line), where u(x) (thin solid line) is the displacement of mass points with respect to their original positions in the unstretched material. The thick vertical solid lines denote the spatial locations of the two crack surfaces.
state of the material at a spatial location x + u(x). The unbroken solid, which behaves purely elastically, corresponds to φ = 1, whereas the fully broken material that cannot support stress corresponds to φ = 0. The total energy per unit area of crack surfaces is taken to be
E = ∫ dx [ (κ/2)(dφ/dx)² + h f(φ) + (µ/2) g(φ)(ε² − εc²) ] ,    (10)
where ε = du/dx is the strain, f(φ) = φ²(1 − φ)² is the same double-well potential as before with minima at φ = 1 and φ = 0, µ is the elastic modulus, and εc is the critical strain magnitude such that the unbroken (broken) state is energetically favored for |ε| < εc (|ε| > εc). The function g(φ) is a monotonically increasing function of φ with limits g(0) = 0 and g(1) = 1, which controls the softening of the elastic energy at large strain. In equilibrium, the energy must be a minimum, which implies that δE/δφ = 0 and δE/δu = 0. The second condition is equivalent to uniform stress in the material. It implies that d(g(φ)ε)/dx = 0, and hence that ε = ε0/g(φ), where ε0 is the value of the remanent strain in the bulk of the material far from the crack. The first condition, δE/δφ = 0, after the substitution ε = ε0/g(φ), can
be written in the form of a one-dimensional mechanical problem of a rolling ball with coordinate φ and mass κ,

κ d²φ/dx² = −dVeff(φ)/dφ ,    (11)

in an effective potential

Veff(φ) = −h f(φ) + (µ/2) [ εc² g(φ) + ε0²/g(φ) ] .    (12)
This potential (Fig. 3) has a repulsive part because g(φ) vanishes for small φ. In this mechanical analog, the stationary phase-field profile φ(x) shown in Fig. 2 corresponds to the ball rolling down this potential, starting from φ = 1 at x = −W, to the turning point located close to φ = 0, and then back to φ = 1 at x = +W. This mechanical problem must be solved under the constraint that the integral of the strain equals the total displacement of the fracture surfaces, ∫_{−W}^{+W} dx ε0/g(φ) = 2Δ. An analysis of the solutions in the large-system-size limit [27] shows that the remanent strain is determined by the behavior of the function g(φ) for small φ. If this function has the form of a power law g ∼ φ^{2+α}
Figure 3. Plots of the effective potential Veff for different values of the remanent strain in the bulk material, ε0 = 0.01, 0.001 and 0.0001, for one-dimensional static fracture (µ = h = 1 and εc = 1/2).
near φ = 0, the result is ε0 ∼ W^{−(2+α)/α}. Hence, as long as α is positive, ε0 will vanish in the large-system-size limit, such that the local contribution of the crack to the overall displacement is dominant compared to the bulk contribution, which scales as ε0 W. In this limit, the width of the φ-profile remains finite and scales as ∼ √(κ/µ). The u-profile is also continuous in the diffuse interface region, but its width vanishes as an inverse power of the system size, such that the strain ε = du/dx becomes a Dirac delta function in the large-system-size limit. In addition, this analysis yields the expression for the surface energy [27]

γ = √(µ εc² κ) ∫_0^1 dφ √( 1 − g(φ) + 2h f(φ)/(µ εc²) ) .    (13)
In contrast to the interface energy for a phase boundary (Eq. (6)), γ for a crack remains finite when the height h of the double-well potential vanishes. Therefore, the inclusion of this potential is not a prerequisite for modeling cracks within this model.
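The shape of the effective potential in Eq. (12), and in particular how the repulsive wall at small φ sets the turning point of the rolling-ball analogy, can be tabulated with a few lines of code. The softening function g(φ) = φ⁴ used here (i.e., α = 2 in the power-law form just discussed) is an assumption for illustration; the other parameter values match the caption of Fig. 3.

import numpy as np

h, mu, eps_c = 1.0, 1.0, 0.5                   # values quoted in the caption of Fig. 3
g = lambda p: p**4                             # assumed softening function, g ~ phi^(2+alpha), alpha = 2
f = lambda p: p**2 * (1.0 - p)**2              # double-well potential

def V_eff(p, eps0):                            # Eq. (12)
    return -h * f(p) + 0.5 * mu * (eps_c**2 * g(p) + eps0**2 / g(p))

phi = np.linspace(0.01, 1.0, 2000)
for eps0 in (1e-2, 1e-3, 1e-4):
    v1 = V_eff(1.0, eps0)                      # "energy" of the ball starting at rest at phi = 1
    turn = phi[np.argmax(V_eff(phi, eps0) <= v1)]   # first point past the repulsive wall
    print(f"eps0 = {eps0:g}: turning point near phi = {turn:.3f}")

As ε0 decreases, the turning point moves toward φ = 0; that is, the material at the crack center becomes more fully broken.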
3. Interface Dynamics
The preceding sections focused on flat static interfaces and their energies. This section examines the application of the phase-field method to simulate the motion of curved interfaces out of equilibrium, when spatially inhomogeneous distributions of temperature, alloy concentration, or stress are present in the material. The effect of these inhomogeneities is straightforward to incorporate into the model by adding bulk internal energy and entropic contributions to the free-energy functional. Furthermore, the Ginzburg–Landau form [15, 18] of the equations is prescribed by conservation laws and by requiring that the total free-energy relax to a minimum. Three illustrative examples are considered: the solidification of a pure substance, the solidification of a binary alloy, and crack propagation. For the solidification of a pure melt [32], the total free-energy that includes the contribution due to the variation of the temperature field is a generalization of Eq. (1),
F[φ, T] = ∫ dV [ (κ/2)|∇φ|² + h f(φ) + (α/2)(T − Tm)² ] ,    (14)
which is minimum at the melting point T = Tm . Dynamical equations which guarantee that F decreases monotonically in time (dF/dt ≤ 0), and which
conserve the total energy ∫ dV e in a closed system with no energy flux through the boundaries, are [32]

∂φ/∂t = −Kφ δF/δφ ,    (15)

∂e/∂t = Ke ∇²(δF/δe) ,    (16)
where the energy density e = C(T − Tm) − p(φ)L and φ are chosen as the independent field variables in Eq. (14), C is the specific heat per unit volume, L is the latent heat of melting per unit volume, and p(φ) is a function that increases monotonically with φ, with limits p(0) = 0 and p(1) = 1. The energy equation (Eq. (16)) yields

∂T/∂t = D∇²T + (L/C) ∂p(φ)/∂t ,    (17)
where we have defined the thermal diffusivity D = α Ke/C². This is the standard heat diffusion equation with an extra source term ∼ ∂p/∂t corresponding to latent heat production. The equation of motion for the phase-field (Eq. (15)), in turn, gives

Kφ^{−1} ∂φ/∂t = κ∇²φ − h f′(φ) − α(L/C) p′(φ)(T − Tm) ,    (18)
where the prime denotes differentiation with respect to φ. In the region near the interface, where T is locally constant, Eq. (18) implies that the phase change is driven isothermally by the modified double-well potential h f(φ) + α(L/C) p(φ)(T − Tm). This potential has a "bias" introduced by the undercooling of the interface, which lowers the free-energy of the solid well relative to that of the liquid well. A one-dimensional analysis of this equation [9, 32] shows that the velocity of the interface is simply proportional to the undercooling, V = µsl (Tm − T), where the interface kinetic coefficient µsl ∼ α Kφ (κ/h)^{1/2} (L/C). Further refinement of this phase-field model [24] and algorithmic developments have made it possible to simulate dendritic evolution quantitatively, both in a low-undercooling regime where the scale of the diffusion field is much larger than the scale of the dendrite tip [45, 47, 48], and in the opposite limit of rapid solidification [6]. Parameter-free results obtained for the latter case, using anisotropic forms of γsl and µsl computed from atomistic simulations [20, 21], are compared with experimental data in Fig. 4. Next, let us consider the isothermal solidification of a binary alloy [5, 26, 30, 59, 61]. The total free-energy of the system can be written in the form
F[φ, c, T] = ∫ dV [ (κ/2)|∇φ|² + fpure(φ, T) + fsolute(φ, c, T) ] ,    (19)
Figure 4. Example of application of the phase-field method to simulate the dendritic crystallization of deeply undercooled nickel [6]. A snapshot of the solid–liquid interface is shown for an undercooling of 87 K. The dendrite growth rate versus undercooling obtained from the simulations (filled triangles and solid line) is compared to two sets of experimental data from Ref. [37] (open squares) and Ref. [64] (open circles).
where c denotes the solute concentration, defined as the mole fraction of B in a binary mixture of A and B molecules, fpure = h f(φ) + α(L/C) p(φ)(T − Tm) is the double-well potential of the pure material, and fsolute(φ, c, T) is the contribution due to solute addition. Dynamical equations that relax the system to a free-energy minimum are

∂φ/∂t = −Kφ δF/δφ ,    (20)

∂c/∂t = ∇ · [ Kc ∇(δF/δc) ] ,    (21)
where Eq. (21) is equivalent to the mass continuity relation, with µc ≡ δF/δc identified as the chemical potential and −Kc ∇µc as the solute current density. The smooth spatial variation of φ in the diffuse interface can be exploited to interpolate between known forms of the free-energy densities in solid and liquid (fs and fl, respectively), by writing

fsolute(φ, c, T) = g(φ) fs(c, T) + (1 − g(φ)) fl(c, T) ,    (22)
where g(φ) has limits g(0) = 0 and g(1) = 1. For example, for a dilute alloy, fs,l = εs,l c + (RTm/v0)(c ln c − c), where εs,l c is the change of internal energy density due to solute addition in solid or liquid, and the second term is the standard entropy of mixing, where R is the gas constant and v0 is the molar volume. This interpolation describes the thermodynamic properties of the diffuse interface region as an admixture of the thermodynamic properties of the bulk
solid and liquid phases. The static phase-field and solute profiles through the interface are then obtained from the equilibrium conditions ∂φ/∂t = ∂c/∂t = 0. The limits of c in bulk solid (cs ) and liquid (cl ) are the same as the equilibrium values obtained by the standard common tangent construction of the alloy phase diagram. The method has been extended to non-isothermal conditions, multicomponent alloys, and polyphase transformations, as illustrated in Fig. 5 for the solidification of a ternary eutectic alloy. The first models of polyphase solidification used either the concentration field [23] or a second non-conserved order parameter [35, 62] to distinguish between the two solid phases in addition to the usual phase field that distinguishes between solid and liquid. The more recent multi-phase-field approach interprets the phase fields as local phase fractions and therefore assigns one field to each phase present [14, 42, 53, 54]. This approach provides a more general formulation of multi-phase solidification. The simplest nontrivial example of dynamic brittle fracture is antiplane shear (mode III) crack propagation where the displacement field u(x, y) perpendicular to the x–y plane is purely scalar. The total energy (defined here
Figure 5. Phase-field simulation of two-phase cell formation in a ternary eutectic alloy [46].
per unit length of the crack front) must now include both kinetic and elastic contributions to this energy, yielding the form
E = ∫ dx dy [ (ρ/2) u̇² + (κ/2)|∇φ|² + h f(φ) + (µ/2) g(φ)( |∇u|² − εc² ) ] ,    (23)
where the dot denotes the derivative with respect to time, ρ is the density, ∇u is the strain, and all the other functions and parameters are as previously defined. The dynamical equations of motion are then obtained variationally from this energy in the forms

∂φ/∂t = −χ δE/δφ ,    (24)

ρ ∂²u/∂t² = −δE/δu .    (25)
These equations describe both the microscopic physics of material failure and macroscopic linear elasticity. Figure 6 shows examples of cracks obtained in phase-field simulations of this model in a strip of width 2W, with a fixed displacement u(x, ±W) = ±Δ at the strip edges. The stored elastic energy per unit area ahead of the crack tip is G = µΔ²/W. The Griffith threshold for the onset of fracture is well reproduced in this model [27]. This approach has recently been used to study instabilities of mode III [28] and mode I cracks [19].
Figure 6. Example of dynamic crack patterns for mode III brittle fracture [28] with increasing load from (a) to (c). Plots correspond to φ = 1/2 contours at different times.
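The following rough two-dimensional sketch integrates Eqs. (24)–(25) explicitly for the mode III strip geometry of Fig. 6. It is only a schematic of the scheme, not the authors' code: the softening function g(φ) = φ³(4 − 3φ), the notch geometry, the load Δ, and all numerical values are assumptions, and the boundary handling (periodic wrap at the re-clamped strip edges) is a deliberate crudeness.

import numpy as np

Nx, Ny, dx, dt = 200, 80, 1.0, 0.02
kappa, h, mu_el, rho, chi, eps_c = 1.0, 1.0, 1.0, 1.0, 0.5, 0.3
Delta = 13.0                                    # imposed edge displacement (assumed above threshold)

def dxc(a): return (np.roll(a, -1, 0) - np.roll(a, 1, 0)) / (2 * dx)
def dyc(a): return (np.roll(a, -1, 1) - np.roll(a, 1, 1)) / (2 * dx)
def lap(a): return (np.roll(a, 1, 0) + np.roll(a, -1, 0)
                    + np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4 * a) / dx**2

g  = lambda p: p**3 * (4.0 - 3.0 * p)           # monotonic softening, g(0) = 0, g(1) = 1 (assumed form)
gp = lambda p: 12.0 * p**2 * (1.0 - p)
fp = lambda p: 2.0 * p - 6.0 * p**2 + 4.0 * p**3   # f'(phi) for f = phi^2 (1 - phi)^2

y = (np.arange(Ny) - (Ny - 1) / 2.0) * dx
u = np.tile(Delta * y / y[-1], (Nx, 1))         # pre-stretched strip, u(x, +/-W) = +/-Delta
v = np.zeros_like(u)
phi = np.ones((Nx, Ny))
phi[:Nx // 4, Ny // 2 - 2:Ny // 2 + 2] = 0.0    # initial notch along the centerline

for _ in range(4000):
    ux, uy = dxc(u), dyc(u)
    phi += dt * chi * (kappa * lap(phi) - h * fp(phi)          # Eq. (24)
                       - 0.5 * mu_el * gp(phi) * (ux**2 + uy**2 - eps_c**2))
    np.clip(phi, 0.0, 1.0, out=phi)
    v += dt / rho * (dxc(mu_el * g(phi) * ux) + dyc(mu_el * g(phi) * uy))  # Eq. (25)
    u += dt * v
    u[:, 0], u[:, -1] = -Delta, Delta           # re-clamp the strip edges each step

tip = np.where(phi[:, Ny // 2] < 0.5)[0]
print("crack tip x ~", (tip.max() if tip.size else 0) * dx)

Whether the notch advances depends on Δ relative to the Griffith-like threshold discussed above; the printed φ = 1/2 tip position is a crude diagnostic of propagation.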
4. Discussion
The preceding examples illustrate the power of the phase-field method to simulate a host of complex interfacial pattern formation phenomena in materials. Making quantitative predictions on experimentally relevant length and time scales, however, remains a major challenge. This challenge stems from the fact that, in most applications, the interface thickness and the time scale of the phase-field kinetics need to be chosen orders of magnitude larger than in a real material for the simulations to be feasible. Because of this constraint, phase-field results often depend on interface thickness and are only qualitative. Over the last decade, progress has been made in achieving quantitative simulations despite this constraint [12, 24, 26, 51, 66]. One important property of the phase-field model is that the interfacial energy (Eq. (6)) scales as W h. Hence, the correct magnitude of capillary effects can be modeled even with a thick interface by lowering the height h of the double-well potential. For alloys, the coupling of the phase field and solute profiles through the diffuse interface makes the interface energy dependent on interface thickness. This dependence, however, can be eliminated by a suitable choice of free-energy density [12, 26]. More difficult to eliminate are nonequilibrium effects that become artificially magnified because of diffusive transport across a thick interface. These effects can compete with, or even supersede, capillary effects, and dramatically alter microstructural evolution. To illustrate these nonequilibrium effects, consider the solidification of a binary alloy. The effect best characterized experimentally and theoretically is solute trapping [1, 4], which is associated with a chemical potential jump across the interface. The magnitude of this effect scales with the interface thickness. Since W is orders of magnitude larger in simulations than in reality, solute trapping will prevail at growth speeds where it is completely negligible in a real material. Additional effects modify the mass conservation condition at the interface,

cl (1 − k) Vn = −D ∂c/∂n + · · ·    (26)
where cl is the concentration on the liquid side of the interface, k is the partition coefficient, Vn is the normal interface velocity, and "· · ·" is the sum of a correction ∼ cl(1 − k) W Vn κ, where κ is the local interface curvature, a correction ∼ W D ∂²cl/∂s², corresponding to surface diffusion, and a correction ∼ k cl(1 − k) W Vn²/D, proportional to the chemical potential jump at the interface. All three corrections can be shown to originate physically from the surface excess of various quantities [12], such as the excess of solute illustrated in Fig. 7. These corrections are negligible in a real material. For this reason, they have not traditionally been considered in the standard free-boundary problem of alloy solidification. For a mesoscopic interface thickness, however, the
Figure 7. Illustration of the surface excess associated with a diffuse interface. The excess of solute is the integral, along the coordinate r normal to the interface, of the actual solute profile (thick solid line) minus its step-profile idealization (thick dashed line), with the Gibbs dividing surface at r = 0. This excess is negative in the depicted example. The thin solid line depicts the phase-field profile. The use of a thick interface in simulations artificially magnifies the surface excess of several quantities and alters the results [12].
magnitude of these corrections becomes large. Thus, the phase-field model must be formulated to make these corrections vanish. Achieving this goal requires a detailed asymptotic analysis of the thin-interface limit of diffuse interface models [2, 12, 24, 26, 39]. This analysis provides the formal guide to formulating models free of these corrections. So far, however, progress has only been possible for dilute [12, 26] and eutectic alloys [14]. Thus, it is not yet clear whether or not it will always be possible to make phase-field models quantitative in more complicated applications.
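The surface-excess construction of Fig. 7 is easy to reproduce numerically. In the sketch below, the solute dip at the interface and all profile shapes are assumptions purely for illustration; the point is that the computed excess is finite and proportional to the interface width W, which is why an artificially thick interface magnifies the corrections in Eq. (26).

import numpy as np

W, cs, cl = 1.0, 0.2, 0.5                      # interface width and bulk compositions (illustrative)
r = np.linspace(-20.0 * W, 20.0 * W, 40001)
phi = 0.5 * (1.0 - np.tanh(r / (np.sqrt(2.0) * W)))          # phase field, solid at r < 0
c = cs * phi + cl * (1.0 - phi) - 0.15 * np.exp(-(r / W)**2)  # assumed solute dip at the interface

step = np.where(r < 0.0, cs, cl)               # sharp-interface idealization, dividing surface at r = 0
diff = c - step
excess = np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(r))
print("solute surface excess:", excess)        # negative here, and it scales linearly with W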
5. Outlook
The phase-field method has emerged as a powerful computational tool for modeling a wide range of interfacial pattern formation phenomena. The success of the approach can be judged by the rapidly growing list of fields in which it has been used, from materials science to biology. It can also be judged by the wide range of scales that have been modeled, from crystalline defects to nanostructures to microstructures. As with any simulation method, however, obtaining quantitative results remains a major challenge. The core of this challenge is the disparity of length and time scales between phenomena on the
scale of the diffuse interface and on the scale of energy or mass transport in the bulk material. For well-established problems like solidification, and a few others, quantitative simulations have been achieved in a few cases following two decades of research since the introduction of the first models. In more recent applications like fracture, with no clear separation between microscopic and macroscopic scales, results remain so far qualitative. In the future, one can expect phase-field simulations to be useful both to gain new qualitative insights into pattern formation mechanisms and to make quantitative predictions in mature applications.
Acknowledgments
The author thanks the US Department of Energy and NASA for financial support.
References
[1] N.A. Ahmad, A.A. Wheeler, W.J. Boettinger, and G.B. McFadden, Phys. Rev. E, 58, 3436, 1998.
[2] R.F. Almgren, SIAM J. Appl. Math., 59, 2086, 1999.
[3] I.S. Aranson, V.A. Kalatsky, and V.M. Vinokur, Phys. Rev. Lett., 85, 118, 2000.
[4] M.J. Aziz, Metall. Mater. Trans. A, 27, 671, 1996.
[5] W.J. Boettinger, J.A. Warren, C. Beckermann, and A. Karma, Ann. Rev. Mater. Res., 32, 163, 2002.
[6] J. Bragard, A. Karma, Y.H. Lee, and M. Plapp, Interface Sci., 10, 121, 2002.
[7] J.W. Cahn and J.E. Hilliard, J. Chem. Phys., 28, 258, 1958.
[8] L.Q. Chen, Ann. Rev. Mater. Res., 32, 113, 2002.
[9] J.B. Collins and H. Levine, Phys. Rev. B, 31, 6119, 1985.
[10] J.M. Debierre, A. Karma, F. Celestini, and R. Guerin, Phys. Rev. E, 68, 041604, 2003.
[11] L.O. Eastgate, J.P. Sethna, M. Rauscher, T. Cretegny, C.-S. Chen, and C.R. Myers, Phys. Rev. E, 65, 036117, 2002.
[12] B. Echebarria, R. Folch, A. Karma, and M. Plapp, Phys. Rev. E, 70, 061604, 2004.
[13] K.R. Elder, M. Katakowski, M. Haataja, and M. Grant, Phys. Rev. Lett., 88, 245701, 2002.
[14] R. Folch and M. Plapp, Phys. Rev. E, 68, 010602, 2003.
[15] V.L. Ginzburg and L.D. Landau, Soviet Phys. JETP, 20, 1064, 1950.
[16] L. Granasy, T. Borzsonyi, and T. Pusztai, Phys. Rev. Lett., 88, 206105, 2002.
[17] L. Gránásy, T. Pusztai, J.A. Warren, J.F. Douglas, T. Börzsönyi, and V. Ferreiro, Nat. Mater., 2, 92, 2003.
[18] B.I. Halperin, P.C. Hohenberg, and S.-K. Ma, Phys. Rev. B, 10, 139, 1974.
[19] H. Henry and H. Levine, Phys. Rev. Lett., 93, 105504, 2004.
[20] J.J. Hoyt, B. Sadigh, M. Asta, and S.M. Foiles, Acta Mater., 47, 3181, 1999.
[21] J.J. Hoyt, M. Asta, and A. Karma, Phys. Rev. Lett., 86, 5530, 2001.
[22] S.Y. Hu, Y.L. Li, Y.X. Zheng, and L.Q. Chen, Int. J. Plasticity, 20, 403, 2004.
[23] A. Karma, Phys. Rev. E, 49, 2245, 1994.
[24] A. Karma and W.-J. Rappel, Phys. Rev. E, 57, 4323, 1998.
[25] A. Karma and M. Plapp, Phys. Rev. Lett., 81, 4444, 1998.
[26] A. Karma, Phys. Rev. Lett., 87, 115701, 2001.
[27] A. Karma, D. Kessler, and H. Levine, Phys. Rev. Lett., 87, 045501, 2001.
[28] A. Karma and A. Lobkovsky, Phys. Rev. Lett., 92, 245510, 2004.
[29] K. Kassner and C. Misbah, Europhys. Lett., 46, 217, 1999.
[30] S.-G. Kim, W.T. Kim, and T. Suzuki, Phys. Rev. E, 60, 7186, 1999.
[31] R. Kobayashi, J.A. Warren, and W.C. Carter, Physica D, 140, 141, 2000.
[32] J.S. Langer, In: G. Grinstein and G. Mazenko (eds.), Directions in Condensed Matter, World Scientific, Singapore, p. 164, 1986.
[33] F. Liu and H. Metiu, Phys. Rev. E, 49, 2601, 1994.
[34] Z.R. Liu, H.J. Gao, L.Q. Chen, and K.J. Cho, Phys. Rev. B, 035429, 2003.
[35] T.-S. Lo, A. Karma, and M. Plapp, Phys. Rev. E, 63, 031504, 2001.
[36] A.E. Lobkovsky and J.A. Warren, Phys. Rev. E, 63, 051605, 2001.
[37] J.W. Lum, D.M. Matson, and M.C. Flemings, Metall. Mater. Trans. B, 27, 865, 1996.
[38] G.B. McFadden, A.A. Wheeler, R.J. Braun, S.R. Coriell, and R.F. Sekerka, Phys. Rev. E, 48, 2016, 1993.
[39] G.B. McFadden, A.A. Wheeler, and D.M. Anderson, Physica D, 154, 144, 2000.
[40] L.V. Mikheev and A.A. Chernov, J. Cryst. Growth, 112, 591, 1991.
[41] J. Muller and M. Grant, Phys. Rev. Lett., 82, 1736, 1999.
[42] B. Nestler and A.A. Wheeler, Physica D, 138, 114, 2000.
[43] D.W. Oxtoby and P.R. Harrowell, J. Chem. Phys., 96, 3834, 1992.
[44] O. Pierre-Louis, Phys. Rev. E, 68, 021604, 2003.
[45] M. Plapp and A. Karma, Phys. Rev. Lett., 84, 1740, 2000; M. Plapp and A. Karma, J. Comp. Phys., 165, 592, 2000.
[46] M. Plapp and A. Karma, Phys. Rev. E, 66, 061608, 2002.
[47] N. Provatas, N. Goldenfeld, and J.A. Dantzig, Phys. Rev. Lett., 80, 3308, 1998.
[48] N. Provatas, N. Goldenfeld, and J.A. Dantzig, J. Comp. Phys., 148, 265, 1999.
[49] D. Rodney, Y. Le Bouar, and A. Finel, Acta Mater., 51, 17, 2003.
[50] Y. Shen and D.W. Oxtoby, J. Chem. Phys., 104, 4233, 1996.
[51] C. Shen, Q. Chen, Y.H. Wen et al., Scripta Mater., 50, 1029, 2004.
[52] C. Shen and Y. Wang, Acta Mater., 52, 683, 2004.
[53] I. Steinbach, F. Pezzolla, B. Nestler, M. Seeßelberg, R. Prieler, G.J. Schmitz, and J.L.L. Rezende, Physica D, 94, 135, 1996.
[54] J. Tiaden, B. Nestler, H.J. Diepers, and I. Steinbach, Physica D, 115, 73, 1998.
[55] Y.U. Wang, Y.M. Jin, A.M. Cuitino, and A.G. Khachaturyan, Acta Mater., 49, 1847, 2001.
[56] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, J. Appl. Phys., 91, 6435, 2002.
[57] J. Wang, S.Q. Shi, L.Q. Chen et al., Acta Mater., 52, 749, 2004.
[58] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, Acta Mater., 52, 81, 2004.
[59] J.A. Warren and W.J. Boettinger, Acta Metall. Mater., 43, 689, 1995.
[60] J.A. Warren, R. Kobayashi, A.E. Lobkovsky, and W.C. Carter, Acta Mater., 51, 6035, 2003.
[61] A.A. Wheeler, W.J. Boettinger, and G.B. McFadden, Phys. Rev. A, 45, 7424, 1992.
[62] A.A. Wheeler, G.B. McFadden, and W.J. Boettinger, Proc. Royal Soc. Lond. A, 452, 495, 1996.
[63] A.A. Wheeler and G.B. McFadden, Eur. J. Appl. Math., 7, 367, 1996.
[64] R. Willnecker, D.M. Herlach, and B. Feuerbacher, Phys. Rev. Lett., 62, 2707, 1989.
[65] R.K.P. Zia and D.J. Wallace, Phys. Rev. B, 31, 1624, 1985.
[66] J.Z. Zhu, T. Wang, S.H. Zhou, Z.K. Liu, and L.Q. Chen, Acta Mater., 52, 833, 2004.
7.3 PHASE-FIELD MODELING OF SOLIDIFICATION
Seong Gyoon Kim¹ and Won Tae Kim²
¹ Kunsan National University, Kunsan 573-701, Korea
² Chongju University, Chongju 360-764, Korea
1. Pattern Formation in Solidification and Classical Model
Pattern formation in solidification is one of the best-known free-boundary problems [1, 2]. During solidification, solute partitioning and release of the latent heat take place at the moving solid/liquid interface, resulting in a build-up of solute atoms and heat ahead of the interface. The diffusion field ahead of the interface tends to destabilize the plane-front interface. Conversely, the role of the solid/liquid interface energy, which tends to decrease by reducing the total interface area, is to stabilize the plane solid/liquid interface. Therefore the solidification pattern is determined by a balance between the destabilizing diffusion-field effect and the stabilizing capillary effect. Anisotropy of the interfacial energy or interface kinetics in a crystalline phase contributes to forming an ordered pattern with a unique characteristic length scale, rather than a fractal pattern. The key ingredients in pattern formation during solidification thus are the contributions of the diffusion field, interfacial energy and crystalline anisotropy [2]. The classical moving boundary problem for solidification of alloys assumes that the interface is mathematically sharp. The governing equations for isothermal alloy solidification [1] are given by

∂cS/∂t = DS ∇²cS ;  ∂cL/∂t = DL ∇²cL    (1)

V (cL^i − cS^i) = DS ∂cS/∂n − DL ∂cL/∂n    (2)

f_c^S(cS^i) = f_c^L(cL^i) = f_c^e − [1/(cL^e − cS^e)] (Hm/Tm)(βV + σκ)    (3)
where c is composition and D is diffusivity. The subscripts S and L on c and D denote the values in the solid and liquid phases, respectively. The superscripts i and e on compositions denote the interfacial and equilibrium compositions, respectively. f_c^S and f_c^L are the chemical potentials of bulk solid and liquid, respectively, and f_c^e is the equilibrium chemical potential. Here the chemical potential denotes the difference between the chemical potentials of solute and solvent. β is the interface kinetics coefficient, V the interface velocity, Hm the latent heat of melting, Tm the melting point of the pure solvent, σ the interface energy, κ the interface curvature, and ∂/∂t and ∂/∂n are the time and interface-normal derivatives, respectively. The solidification of pure substances, involving latent heat release at the interface instead of solute partitioning, can be described by the same set of Eqs. (1)–(3), expressed by replacing the variables c → H/Hm, f_c → T Hm/Tm, and D → DT, where T, H and DT are temperature, enthalpy density and thermal diffusivity, respectively, with the same meanings for the superscripts and subscripts L, S and i.
2. Phase-field Model
Many numerical approaches have been proposed to solve Eqs. (1)–(3). These include direct front-tracking methods and boundary-integral formulations, where the interface is treated as mathematically sharp. However, these sharp-interface approaches lead to significant difficulties, due to the requirement of tracking the interface position at every time step, especially in handling topological changes in the interface pattern or in extending to 3D computation. An alternative technique for modeling pattern formation in solidification is the phase-field model (PFM) [3, 4]. This approach adopts a diffuse interface model, where a solid phase changes gradually into a liquid phase across an interfacial region of finite width. The phase state is defined by an order parameter φ as a function of position and time. The phase field φ takes on a constant value in each bulk phase, e.g., φ = 0 in the liquid phase and φ = 1 in the solid phase, and it changes smoothly from φ = 0 to φ = 1 across the interfacial region. Any point within the interfacial region is assumed to be a mixture of the solid and liquid phases, whose fractions vary gradually from 0 to 1 across the transient region. All the thermodynamic and kinetic variables then are assumed to follow a mixture rule. A set of equations for the PFM can be derived in a thermodynamically consistent way. Let us consider the isothermal solidification of an alloy. It is
assumed that the total free energy of the system of volume Ω is given by a functional

F = ∫_Ω [ (ε²/2)|∇φ|² + ω g(φ) + f(φ, c, T) ] dΩ    (4)
During solidification, which is a non-equilibrium process, the system evolves toward a more stable state by reducing the total free energy. To decrease the total free energy, the first term (the phase-field gradient energy) in the functional (4) makes the phase-field profile spread out, i.e., widens the transient region. The second term (the double-well potential ωg(φ)) makes the bulk phases stable, i.e., sharpens the transient region. The diffuse interface maintains a stable width by a balance between these two opposite effects. Once the stable diffuse interface is formed, the two terms start to cooperate to decrease the total volume (in 3D) or area (in 2D) of the diffuse interfacial region where |∇φ| and g(φ) are not vanishing. This corresponds to the curvature effect in the classical sharp interface model. Thus the gradient energy and the double-well potential play two-fold roles: formation of a stable diffuse interface, and incorporation of the curvature effect. As a result, the interface width scales as the ratio of the coefficients (ε/√ω), whereas the interface energy scales as their product (ε√ω). The last term in the functional (4) is a thermodynamic potential assumed to follow a mixture rule

f(φ, c, T) = h(φ) fS(cS, T) + [1 − h(φ)] fL(cL, T)    (5)
where c is the average composition of the mixture, cS and cL are the compositions of the coexisting solid and liquid phases in the mixture, respectively, and fS and fL are the free energy densities of the solid and liquid phases, respectively. It is natural to take c(x) at a given point x to be

c(x) = h(φ) cS(x) + [1 − h(φ)] cL(x)    (6)
The monotonic function h(φ), satisfying h(0) = 0 and h(1) = 1, has the meaning of a solid fraction. One more restriction is required for h(φ): to ensure that the solid and liquid phases are stable or metastable (i.e., exhibit energy minima), the function ωg(φ) + f in the functional (4) must have local minima at φ = 0 and φ = 1. This leads to the condition h′(0) = h′(1) = 0, which confines the phase change to within the interfacial region. Finally, the anisotropy effect in the interface energy can be incorporated into the functional (4) by allowing ε to depend on the local orientation of the phase-field gradient [5]. Note that all thermodynamic components controlling pattern formation during solidification are incorporated into the single functional (4). The kinetic components controlling pattern formation are incorporated into the dynamic equations of the phase and diffusion fields. In a solidifying
system, whose total free energy decreases monotonically with time, the total amount of solute is conserved, whereas the total volume of each phase is not conserved. Therefore the phase field and concentration are assumed to follow the relaxation dynamics
(7)
δF ∂c = −∇ Mc · ∇ ∂t δc
(8)
where Mφ and Mc are mobilities of the phase and concentration fields, respectively. From the variational derivatives of the functional (4), it follows 1 ∂φ = 2 ∇ 2 φ − ωg (φ) − f φ Mφ ∂t ∂c = ∇ Mc · ∇ f c ∂t
(9) (10)
where the subscripts in f denote the partial derivatives by the specific variables. Mc is related to the chemical diffusivity D(φ) by Mc = D/ f cc , where f cc ≡ ∂ 2 f /∂c2 , D(1) = D S and D(0) = D L . The PFM for isothermal solidification of alloys thus consists of Eqs. (9) and (10). To solve these equations, we need f φ , f c and f cc . For the given thermodynamic data f S (cS ) and f L (c L ) at a given temperature, the above functions are obtained by differentiating Eq. (5). For this differentiation, relationships between c(x), cS (x) and c L (x) are required. Two alternative ways have been proposed for these relationships [6]: equal composition condition; and equal chemical potential condition. In the former case, which has been widely adopted [3], it is assumed that cS (x) = c L (x) and so f c S (cS (x)) =/ f c L (c L (x)), resulting in c = cS = c L from Eq. (6). Under this condition, it is straightforward to find fφ , f c and fcc from Eq. (5). In the latter case, it is assumed that fc S (cS (x)) = f c L (c L (x)) and so cS (x) =/ c L (x), resulting in f c = f c S = f c L from Eqs. (5) and (6). Under this condition, f φ in the phase-field Eq. (9) is given by f φ = h (φ)[ f S (cS , T ) − f L (c L , T ) − (cS − c L ) f c ]
(11)
and the diffusion Eq. (10) can be modified into the form ∂c = ∇ · D(φ)[h(φ)∇cS + (1 − h(φ))∇c L ] (12) ∂t Note that this diffusion equation is derived in a thermodynamically consistent way, even though the same equation has been introduced in an ad hoc manner previously [3]. In case of nonisothermal solidification of alloys, the evolution equations for thermal, solutal and phase fields can also be derived in a thermodynamically
Phase-field modeling of solidification
2109
consistent way, where positive entropy production is guaranteed [7]. The resulting evolution equations are dependent on the detailed form of the adopted entropy functional. With a simple form of the entropy functional, the thermal diffusion equation is given by ∂H = ∇k(φ) · ∇T (13) ∂t where H is the enthalpy density, k(φ) is the thermal conductivity, and the phase field and chemical diffusion equations remain identical with Eqs. (9) and (10). In the simplest case where the thermal conductivities and the specific heats of the liquid and solid are same and independent of temperature, the thermal diffusion equation can be written into ∂φ Hm ∂T h (φ) (14) − = ∇ DT · ∇T ∂t cp ∂t where c p is the specific heat.
3.
Sharp Interface Limit
Equations (9) and (10) in the PFM of alloy solidification can be mapped onto the classical free boundary problem, described in Eqs. (1)–(3). The relationships between the parameters in the phase-field equations and material’s parameters are obtained from the mapping procedure. It can be done at two different limit conditions: a sharp interface limit where the interface width 2ξ p is vanishingly small; and a thin interface limit where the interface width is finite, but much smaller than the characteristic length scales of diffusion field and the interface curvature. At first we deal with the sharp interface analysis. To find the interface width, consider an alloy system at equilibrium, with a 1D diffuse interface between solid (φ = 1 at x < − ξ p ) and liquid (φ = 0 at x > ξ p ) phases. Then the phase-field equation can be integrated and the equilibrium phase-field profile φ0 (x) [8] is the solution of
2 dφ0 2 = ωg(φ0 ) + Z (φ0 ) − Z (0) (15) 2 dx where Z (φ0 ) = f −c fc . The function Z (φ0 )−Z (0) in the right side of this equation has a double-well form under the equal composition condition, whereas it disappears under the equal chemical potential condition for alloys or the equal temperature condition for pure substances [6]. Integrating Eq. (15) again gives the interface width 2ξ p , corresponding to a length over which the phase field changes from φa to φb ; 2ξ p = √ 2
φb φa
√
dφ0 ωg(φ0 ) + Z (φ0 ) − Z (0)
(16)
2110
S.G. Kim and W.T. Kim
The interface energy is obtained by considering an equilibrium system with a cylindrical solid in liquid matrix, maintaining a diffuse interface between them. Integrating the phase-field equation in the cylindrical coordinate gives the chemical potential shift from the equilibrium value, which recovers the curvature effect in Eq. (3), if the interface energy σ is given by σ =
∞ 2 −∞
dφ0 dr
2
dr =
√
2
1
ωg(φ0 ) + Z (φ0 ) − Z (0) dφ0
(17)
0
The same expression for σ can be obtained directly from the excess free energy arising from the nonuniform phase-field distribution in the functional (4). In the sharp interface limit, the interface width is vanishingly small, while the interface energy should remain finite. From Eqs. (16) and (17), it appears that this limit can be attained when ε → 0 and ω → ∞. This leads to ωg(φ0) ≫ Z(φ0) − Z(0), and then the interface width and energy in the sharp interface limit are given by

2ξp = 2√2 ln 3 (ε/√ω)    (18)

σ = ε√ω / (3√2)    (19)

when we use φa = 0.1, φb = 0.9 and g(φ) = φ²(1 − φ)². In the sharp interface limit, Eq. (10) for chemical diffusion recovers not only the usual diffusion equations in the bulk phases, but also the mass conservation condition at the interface. Similarly, the thermal diffusion Eq. (13) reproduces the usual thermal diffusion equation in the bulk phases and the energy balance condition at the interface. The remaining step is to find a relationship between the mobility Mφ and the kinetic coefficient β. Consider a moving plane-front interface with a steady velocity V. The 1D phase-field equation in a moving coordinate system can be integrated over the interfacial region, in which the chemical potential at the interface is regarded as a constant, because its variation within the interfacial region can be ignored in the sharp interface limit. The integration yields a linear relationship between the interface velocity and the thermodynamic driving force, which recovers the kinetic effect in Eq. (3) if we put

β = √ω Tm / (3√2 ε Hm Mφ)    (20)

For given 2ξp, σ and β, all the parameters ε, ω and Mφ in the phase-field Eq. (9) are thus determined from the three relationships (18)–(20). For the model of pure substances, consisting of Eqs. (9) and (13), exactly the same relationships between the phase-field parameters and the material parameters hold. When the phase-field parameters are determined with these equations,
Phase-field modeling of solidification
2111
special care must be taken to avoid interface-width effects on the computational results. It is often computationally too demanding to choose 2ξ_p small enough to approach the desired sharp interface limit.
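In practice one inverts the relationships (18)–(20): given the physical inputs 2ξ_p, σ and β, the phase-field parameters follow algebraically. A minimal sketch (the numerical values are illustrative placeholders, not data for any particular alloy):

```python
import numpy as np

# Sharp interface limit: invert Eqs. (18)-(20) to get the phase-field
# parameters (epsilon, omega, M_phi) from the physical inputs
# (interface width 2*xi_p, interface energy sigma, kinetic coefficient beta).
W     = 8.0e-9     # interface width 2*xi_p [m] (illustrative)
sigma = 0.3        # interface energy [J/m^2] (illustrative)
beta  = 1.0e-2     # kinetic coefficient [K s/m] (illustrative)
Tm    = 1700.0     # melting temperature [K]
Hm    = 2.0e9      # latent heat per unit volume [J/m^3]

eps   = np.sqrt(3.0 * sigma * W / (2.0 * np.log(3.0)))                # from (18)*(19)
omega = (3.0 * np.sqrt(2.0) * sigma / eps)**2                         # from (19)
M_phi = np.sqrt(omega) * Tm / (3.0 * np.sqrt(2.0) * eps * Hm * beta)  # from (20)
print(eps, omega, M_phi)
```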
4. Thin Interface Limit
Remarkable progress has recently been made in overcoming the stringent restriction on the interface width by using a thin-interface analysis of the PFM [5, 9]. This analysis maps the PFM with a finite interface width 2ξ_p onto the classical moving boundary problem in the limits ξ_p ≪ D̃/V and ξ_p ≪ R, where D̃ and R are the average diffusivity in the interfacial region and the local radius of interface curvature, respectively. Furthermore, it makes it possible to eliminate the interface kinetic effect by a specific choice of the phase-field mobility. The mapping of the thin interface PFM onto the classical moving boundary problem is based on the following two ideas. First, due to the finite interface width, there can exist anomalous effects in (1) the interface energy, (2) diffusion in the interfacial region, (3) release of the latent heat and (4) solute partitioning. Across an interface of finite width 2ξ_p, the anomalous effects vary sigmoidally and change their signs around the middle of the interface. By specific choices of the functions in the PFM, such as h(φ) and D(φ), these anomalous effects can be eliminated upon summing them over the interface width. Second, the thermodynamic variables such as the temperature T and the chemical potential f_c at the interface are not uniquely defined, but rather vary smoothly within the finite interface width. When the condition ξ_p ≪ D̃/V is satisfied, however, their profiles on the solid and liquid sides, at ξ_p ≪ |x| ≪ D̃/V, are linear. Extrapolating the two straight profiles into the interfacial region, we get the value of the thermodynamic variable at the intersection point, which corresponds to its value in the sharp interface limit. In this way, we can find the unique thermodynamic driving force for the thin interface. First we deal with a symmetric model [5] for pure substances, where the specific heat c_p and the thermal diffusivity D_T are constant throughout the whole system. In this case, all the anomalous effects arising from the finite interface width vanish when φ_0(x) − 1/2 and h(φ_0(x)) are odd functions of x. Because the extra potential disappears in Eq. (15) for pure substances, the usual choices of g(φ) and h(φ) satisfy these conditions, for example, g(φ) = φ²(1 − φ)² and h(φ) = φ³(6φ² − 15φ + 10); furthermore, the relationships (18) and (19) remain unchanged. The next step of the thin interface analysis is to find the linear relationship between the interface velocity and the thermodynamic driving force, which leads to

$$\beta = \frac{\sqrt{\omega}\, T_m}{3\sqrt{2}\;\epsilon\, H_m\, M_\phi} - \frac{\epsilon\, H_m}{\sqrt{2\omega}\; D_T\, c_p}\, J \qquad (21)$$
and J is a constant given by 1
J= 0
h p (φ)[1 − h d (φ)] √ dφ g(φ)
(22)
where the subscripts p and d on h(φ) discriminate the solid fractions appearing in the phase-field and diffusion equations, respectively. The discrimination is made because a model with h_p(φ) ≠ h_d(φ) can also be mapped onto the classical moving boundary problem, although both functions become identical when the model is derived from the functional (4). The second term on the right side of Eq. (21) is the correction from the finite interface width, which disappears in the sharp interface limit 2ξ_p ∼ ε/√ω → 0. For given 2ξ_p, σ and β, all the parameters ε, ω and M_φ in the phase-field Eq. (9) are thus determined in the thin interface limit from the three relationships (18), (19) and (21). Note that M_φ can be determined at the vanishing interface kinetic condition by putting β = 0 in Eq. (21).
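The constant J is a pure number once h_p, h_d and g are chosen; for h_p(φ) = φ³(6φ² − 15φ + 10), h_d(φ) = φ and g(φ) = φ²(1 − φ)² it evaluates to 47/60. A short sketch evaluating Eq. (22) by quadrature and then solving Eq. (21) with β = 0 for the kinetics-free mobility (all numerical inputs are illustrative assumptions):

```python
import numpy as np

# Thin interface limit: evaluate the constant J of Eq. (22) by quadrature and
# then the mobility M_phi that cancels interface kinetics (beta = 0 in Eq. (21)).
phi = np.linspace(1e-6, 1.0 - 1e-6, 200001)   # open interval avoids 0/0 endpoints
g   = phi**2 * (1.0 - phi)**2
h_p = phi**3 * (6.0 * phi**2 - 15.0 * phi + 10.0)   # solid fraction, phase-field eq.
h_d = phi                                           # solid fraction, diffusion eq.
J = np.trapz(h_p * (1.0 - h_d) / np.sqrt(g), phi)   # -> 47/60 for these choices

eps, omega = 4.0e-8, 1.0e8     # phase-field parameters from Eqs. (18)-(19) (illustrative)
Tm, Hm     = 1700.0, 2.0e9     # melting temperature [K], latent heat [J/m^3]
DT, cp     = 1.0e-5, 5.0e6     # thermal diffusivity [m^2/s], specific heat [J/(m^3 K)]

# Setting beta = 0 in Eq. (21) and solving for M_phi:
M_phi = omega * Tm * DT * cp / (3.0 * eps**2 * Hm**2 * J)
print(J, M_phi)
```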
5. One-sided Model
When the specific heat c_p and the thermal diffusivity D_T of the solid and liquid phases differ from each other, the thin interface analysis becomes more delicate, because one must take care of the anomalous effects associated with the asymmetric functions c_p(φ) and D_T(φ). Similar difficulties exist in the analysis for the PFM of alloys, which requires additional care regarding the solute trapping arising from a finite relaxation time for solute partitioning in the interfacial region. The thin interface analysis is, however, still tractable for a one-sided system in which diffusion in the solid phase is ignored, as described below. When the interface width is finite, the interface width and energy are given by Eqs. (16) and (17), respectively; they are influenced by the extra potential Z(φ) − Z(0). For a given interface energy, this potential imposes a restriction on the interface width [8, 10]. The restriction is often so tight that it prevents us from exploiting the merit of the thin interface analysis, namely enhancing the computational efficiency by increasing the interface width. For high computational efficiency, therefore, it is desirable to adopt the equal chemical potential condition instead of the equal composition condition, under which the extra potential Z(φ_0) − Z(0) disappears [6, 10]. In a dilute solution, the equal chemical potential condition reduces to the simple relationship c_S(x)/c_L(x) = c_S^e/c_L^e ≡ k,
and the diffusion equation and the phase-field equation become [9, 10]

$$c = [1 - (1-k)\,h_d(\phi)]\,c_L \equiv A(\phi)\,c_L \qquad (23)$$

$$\frac{\partial c}{\partial t} = \nabla\cdot\left[ D(\phi)\,A(\phi)\,\nabla c_L \right] \qquad (24)$$

$$\frac{1}{M_\phi}\frac{\partial\phi}{\partial t} = \epsilon^2\nabla^2\phi - \omega\, g'(\phi) - \frac{RT(1-k)}{v_m}\,(c_L - c_L^e)\, h_p'(\phi) \qquad (25)$$
where the last term in Eq. (25) is the dilute solution approximation of Eq. (11) and v_m is the molar volume. The coefficient RT(1 − k)/v_m may be replaced by H_m/(m_e T_m), following the van't Hoff relation, where m_e is the equilibrium liquidus slope in the phase diagram. The mapping of Eqs. (24) and (25) in the thin interface limit onto the classical moving boundary problem can be performed under the assumption D_S ≪ D_L [9]. The following results remove the anomalous interfacial effects in the thin interface limit. The anomalous interface energy vanishes if dφ_0(x)/dx is an even function of x, where the origin x = 0 is taken as the position with φ = 1/2; this is fulfilled by taking a symmetric potential such as g(φ) = φ²(1 − φ)². The anomalous solute partitioning vanishes if h_d(φ_0) dφ_0/dx is an even function of x. This requirement is fulfilled when h_d(φ_0(x)) is an even function of x, because dφ_0(x)/dx is already even by the first condition. The usual choices for h_d(φ) satisfy this condition, for example, h_d(φ) = φ or h_d(φ) = φ³(6φ² − 15φ + 10). The anomalous surface diffusion in the interfacial region vanishes if D(φ(x))A(φ(x)) − D_L/2 is an odd function of x, which can be fulfilled by putting D(φ)A(φ) = (1 − φ)D_L. In addition, a condition of vanishing chemical potential jump at the imaginary sharp interface at x = 0 is required. The chemical potential jump is directly related to the solute trapping effects arising from the finite interface width. Solute trapping is an important physical phenomenon in rapid solidification of alloys, but it is negligible under normal, slow solidification conditions. When a thick interface is adopted for high efficiency in the phase-field computation, however, a strong artificial solute trapping effect often appears even under such normal conditions. These artificial effects can be remedied by introducing an anti-trapping mass flux into the diffusion Eq. (24) [4], which is proportional to the interface velocity (∼ ∂φ/∂t) and directed along the interface normal (∇φ/|∇φ|). The modified diffusion equation then has the form
$$\frac{\partial c}{\partial t} = \nabla\cdot\left[ D(\phi)\,A(\phi)\,\nabla c_L + \alpha(c_L)\,\frac{\partial\phi}{\partial t}\,\frac{\nabla\phi}{|\nabla\phi|} \right] \qquad (26)$$
The coupling coefficient α(c_L) can be found from the condition of vanishing chemical potential jump:

$$\alpha(c_L) = \frac{\epsilon\,(1-k)\,c_L}{\sqrt{2\omega}} \qquad (27)$$

with the previous choices g(φ) = φ²(1 − φ)², h_d(φ) = φ and D(φ)A(φ) = (1 − φ)D_L. The linear relationship between the thermodynamic driving force and the interface velocity leads to a relationship between β and M_φ similar to that of the symmetric model, but with H_m/(D_T c_p) replaced by m_e c_L^e(1 − k)/D_L in the second term on the right side of Eq. (21).
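To make the structure of Eqs. (24)–(27) concrete, the following is a minimal 1D explicit sketch of the one-sided model with the anti-trapping flux. It is dimensionless, uses illustrative parameter values, and lumps the thermodynamic coefficient RT(1 − k)/v_m into a single assumed constant; a production code would use the quantitative parameter relations above and the efficient schemes cited later in the text.

```python
import numpy as np

# Minimal 1D explicit sketch of the one-sided model, Eqs. (24)-(27).
# Dimensionless, illustrative parameters; 'coef' stands in for RT(1-k)/v_m.
N, dx, dt = 400, 0.5, 0.005
k, eps, omega, M_phi, D_L = 0.2, 1.0, 1.0, 1.0, 1.0
coef, c_L_eq = 1.0, 1.0          # assumed driving-force coefficient and c_L^e

x = (np.arange(N) - N // 2) * dx
phi = 0.5 * (1.0 - np.tanh(x / (np.sqrt(2.0) * eps)))   # solid (phi=1) to liquid (phi=0)
c_L = 0.8 * np.ones(N)                                   # supersaturated liquid branch
c = (1.0 - (1.0 - k) * phi) * c_L                        # Eq. (23)

def d_dx(f):
    return np.gradient(f, dx)

for step in range(2000):
    dphi_dt = M_phi * (eps**2 * d_dx(d_dx(phi))
                       - omega * 2.0 * phi * (1.0 - phi) * (1.0 - 2.0 * phi)     # g'(phi)
                       - coef * (c_L - c_L_eq) * 30.0 * phi**2 * (1.0 - phi)**2) # h_p'(phi)
    alpha = eps * (1.0 - k) * c_L / np.sqrt(2.0 * omega)     # Eq. (27)
    normal = np.sign(d_dx(phi) + 1e-30)                      # grad(phi)/|grad(phi)| in 1D
    flux = (1.0 - phi) * D_L * d_dx(c_L) + alpha * dphi_dt * normal  # D*A = (1-phi)*D_L
    phi = phi + dt * dphi_dt
    c = c + dt * d_dx(flux)                                  # Eq. (26)
    c_L = c / (1.0 - (1.0 - k) * phi)                        # invert Eq. (23)
```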
6. Multiphase and/or Multicomponent Models
The PFM explained above describes solidification of binary alloys into a single solid phase. Solidification of industrial alloys often involves more solid phases and/or more components. In multiphase systems, eutectic and peritectic solidification involving one liquid and two solid phases are of particular importance, not only from the engineering point of view but also scientifically, because of their richness in interface patterns. Extending the number of phases for eutectic/peritectic solidification can be done in several ways: introducing three phase fields, one for each phase; introducing two phase fields, one distinguishing the solid from the liquid phase and the other distinguishing the two solid phases; or coupling the PFM with a spinodal decomposition model in which the two solid phases are discriminated by two different compositions. Each approach has its own merits, yielding fruitful information for understanding pattern formation. For quantitative computation in real alloys with enhanced numerical efficiency, however, it is desirable for the models to have the following properties. First, the thermodynamic and kinetic properties of the three different interfaces in the system should be controllable independently. Second, the force balance at the triple junction should be maintained because it plays an essential role in pattern formation. Third, imposing the equal chemical potential condition is preferable because it significantly improves the numerical efficiency compared to the equal composition condition. Fourth, all the parameters should be determined so as to map the model onto the classical moving boundary problem of eutectic/peritectic solidification. Such multiphase-field models are still at the development stage [10, 11]. The PFMs for binary alloys can be straightforwardly extended to multicomponent systems under the equal composition or equal chemical potential conditions. However, utilizing the advantage of the latter condition requires extra computation to find the compositions of coexisting solid and liquid phases having equal chemical potentials. If the thermodynamic databases, which are usually given as functions of the compositions, are transformed into
functions of the chemical potential as a preliminary step of the computation, the extra cost may be significantly reduced. When the dilute solution approximation is adopted, in particular, the cost is negligible because the condition reduces to constant partition coefficients with respect to a reference phase, e.g., the liquid phase. Although multicomponent PFMs have been developed with constant partition coefficients, the complete mapping of these models onto the classical sharp interface model has not yet been done. Presently, the multicomponent PFMs remain tools for qualitative simulation.
7. Simulations
The PFMs can easily be implemented in a numerical code by finite difference or finite element schemes, and various simulations of dendritic, eutectic, peritectic and monotectic solidification have been performed; examples can be found in [3]. The large disparity between the interface width, the microstructural scale and the diffusion boundary layer width hinders simulation under physically relevant growth conditions, and early simulations therefore focused on qualitative computations of the basic patterns. However, recent advances in hardware resources and in the thin interface analysis have greatly improved the computational power and efficiency of phase-field simulation. For modeling free dendritic growth at low undercooling, where the diffusion field reaches far beyond the dendrite itself, computationally efficient methods such as adaptive mesh refinement [12, 13] and the multi-scale hybrid method [14] combining a finite difference scheme with a Monte Carlo scheme have been developed. Through a combination of such advances, not only qualitative but also quantitative phase-field simulations are possible under experimentally relevant growth conditions. The earliest quantitative phase-field simulation [5] was of free dendritic growth in the symmetric model in 2D and 3D. This was the first numerical test of the microscopic solvability theory for free dendritic growth, and it left little doubt about the theory's validity. Quantitative 3D simulations of free dendritic growth in pure substances are being further refined to answer long-standing questions, for example, the role of fluid flow in dendritic growth [3] and the origin of the abrupt changes of growth velocity and morphology in highly undercooled pure melts [4]. In spite of the variety of simulations of alloy solidification, quantitative simulations for alloys have been limited. Recent advances in the thin interface analysis for a one-sided model opened the window for quantitative calculations. One example is the 2D multiphase-field simulation of directional eutectic solidification in CBr4–C2Cl6 alloys [10]. The 2D experimental results of solidification in this alloy may be used for benchmarking quantitative simulations because all the material parameters were measured with reasonable accuracy, and the various oscillatory/tilting
instabilities occur with varying lamellar spacing, growth velocity and composition. The 2D phase-field simulations of eutectic solidification under real experimental conditions quantitatively reproduced all the lamellar patterns and the morphological changes observed in the experiments. In view of the recent success of the thin interface analysis and the engineering and scientific importance of alloy solidification, application of the one-sided PFM will soon be one of the most active fields in modeling alloy solidification. The quantitative application of PFMs to the solidification of real alloys is hindered by the lack of information on thermophysical properties such as the interface energy, the interface kinetic coefficient and their anisotropies. Combining the PFMs with atomistic modeling to determine these properties will provide powerful tools for studying the solidification behavior of real alloys.
References

[1] J.S. Langer, "Instabilities and pattern formation in crystal growth," Rev. Mod. Phys., 52, 1–28, 1980.
[2] P. Meakin, Fractals, Scaling and Growth Far From Equilibrium, 1st edn., Cambridge University Press, UK, 1998.
[3] W.J. Boettinger, J.A. Warren, C. Beckermann, and A. Karma, "Phase-field simulation of solidification," Annu. Rev. Mater. Res., 32, 163–194, 2002.
[4] J.J. Hoyt, M. Asta, and A. Karma, "Atomistic and continuum modeling of dendritic solidification," Mater. Sci. Eng. R, 41, 121–163, 2003.
[5] A. Karma and W.-J. Rappel, "Quantitative phase-field modeling of dendritic growth in two and three dimensions," Phys. Rev. E, 57, 4323–4349, 1998.
[6] S.G. Kim, W.T. Kim, and T. Suzuki, "Phase-field model for binary alloys," Phys. Rev. E, 60, 7186–7197, 1999.
[7] A.A. Wheeler, G.B. McFadden, and W.J. Boettinger, "Phase-field model for solidification of a eutectic alloy," Proc. R. Soc. London A, 452, 495–525, 1996.
[8] S.G. Kim, W.T. Kim, and T. Suzuki, "Interfacial compositions of solid and liquid in a phase-field model with finite interface thickness for isothermal solidification in binary alloys," Phys. Rev. E, 58, 3316–3323, 1998.
[9] A. Karma, "Phase-field formulation for quantitative modeling of alloy solidification," Phys. Rev. Lett., 87, 115701, 2001.
[10] S.G. Kim, W.T. Kim, T. Suzuki, and M. Ode, "Phase-field modeling of eutectic solidification," J. Cryst. Growth, 261, 135–158, 2004.
[11] R. Folch and M. Plapp, "Towards a quantitative phase-field model of two-phase solidification," Phys. Rev. E, 68, 010602, 2003.
[12] N. Provatas, N. Goldenfeld, and J. Dantzig, "Adaptive mesh refinement computation of solidification microstructures using dynamic data structures," J. Comp. Phys., 148, 265–290, 1999.
[13] C.W. Lan, C.C. Liu, and C.M. Hsu, "An adaptive finite volume method for incompressible heat flow problems in solidification," J. Comp. Phys., 178, 464–497, 2002.
[14] M. Plapp and A. Karma, "Multiscale random-walk algorithm for simulating interfacial pattern formation," Phys. Rev. Lett., 84, 1740–1743, 2000.
7.4 COHERENT PRECIPITATION – PHASE FIELD METHOD

C. Shen and Y. Wang
The Ohio State University, Columbus, Ohio, USA
Phase transformation is still the most efficient and effective way to produce various microstructures at mesoscales and to control their evolution over time. In crystalline solids, phase transformations are usually accompanied by coherency strain generated by the lattice misfit between coexisting phases. The accommodation of coherency strain alters both the thermodynamics and the kinetics of phase transformations and, in particular, produces various self-organized, quasi-periodical arrays of precipitates such as the tweed [1], twin [2] and chessboard structures [3], and fascinating morphological patterns such as the stars, fans and windmill patterns [4], to name a few (Fig. 1). These microstructures have puzzled materials researchers for decades. Incorporating the strain energy in models of phase transformations not only allows for developing a fundamental understanding of the formation of these microstructures, but also provides the opportunity to engineer new microstructures with salient features for novel applications. Therefore, it is desirable to have a model that is able to predict the formation and time evolution of coherent microstructural patterns. Yet coherent transformation in solids is the toughest nut to crack [5]. In a non-uniform (either compositionally or structurally) coherent solid, where lattice planes are continuous on passing from one phase to another (Fig. 2), the lattice misfit between adjacent non-uniform regions has to be accommodated by displacement of atoms from their regular positions along the boundaries. This sets up elastic strain fields within the solid. Being long-range and strongly anisotropic, the mechanical interactions among these strain fields are very different from the short-range chemical interactions. For example, the bulk chemical free energy and the interfacial energy, both of which are associated with short-range chemical interactions, depend solely on the volume fraction of the precipitates and on the total area and inclination of their interfaces, respectively. The elastic strain energy, on the other hand, depends on the size, shape, spatial orientation and mutual arrangement of the precipitates. When
Figure 1. Various strain-accommodating morphological patterns produced by coherent precipitation: (a) tweed, (b) twin, (c) chessboard structures, and (d) stars, fans and windmill patterns.
the elastic strain energy is included in the total free energy, every single precipitate (its size, shape and spatial position) contributes to the morphological changes of all other precipitates in the system through its influence on the stress field and the corresponding diffusion process. Therefore, many of the thermodynamic principles and rate equations obtained for incoherent precipitation may not be applicable anymore to coherent precipitation. A rigorous treatment of coherent precipitation requires a self-consistent description of microstructural evolution without any a priori assumptions about possible particle shapes and their spatial arrangements along a phase transformation path. The phase field method seems to satisfy this requirement. Over the past two decades, it has been demonstrated to have the ability to deal with arbitrary coherent microstructures produced by diffusional and
Figure 2. Schematic drawing of coherent interfaces (dashed lines). In (a) the precipitate (in grey) and the matrix have the same crystal structure but different lattice parameters, while in (b) the precipitate has a different crystal structure from the matrix.
displacive transformations with arbitrary transformation strains. Many complicated strain-induced morphological patterns, such as those shown in Fig. 1, have been predicted (for recent reviews see [6–8]). A variety of new and intriguing kinetic phenomena underlying the development of these morphological patterns have been discovered, including correlated and collective nucleation [6, 7, 9, 10], local inverse coarsening, precipitate drifting and particle splitting [11–14]. These predictions have contributed significantly to our fundamental understanding of many experimental observations [15]. The purpose of this article is to provide an overview of the phase field method in the context of its applications to coherent transformations. We start with a discussion of the fundamentals of coherent precipitation, including how the coherency strain affects phase equilibrium (e.g., the equilibrium compositions of coexisting phases and their equilibrium volume fractions), the driving forces for nucleation, growth and coarsening, the thermodynamic factor in diffusivity, and precipitate shape and spatial distribution. This is followed by an introduction to the microelasticity of an arbitrary coherent heterogeneous microstructure and its incorporation in the phase field method. Finally, the implementation of the method in modeling coherent precipitation is illustrated through three examples of progressively increasing complexity. For simplicity and clarity, we limit our discussion to bulk materials of homogeneous modulus (i.e., all coexisting phases have the same elastic constants). Applications to more complicated problems, such as small confined systems (thin films, multilayers, and nanoparticles) and elastically inhomogeneous systems, will not be presented; interested readers can find these applications in the references listed under Further Reading.
1. Fundamentals of Coherent Precipitation
In-depth coverage of this subject can be found in the monograph by Khachaturyan [16] and the book chapter by Johnson [17]. Below we discuss some of the basic concepts related to coherent precipitation. In a series of classical papers [18–21], Cahn laid the theoretical foundation for coherent transformations in crystalline solids. He distinguished the atomic misfit energy (part of the mixing energy of a solid solution) from the coherency elastic strain energy, and incorporated the latter into the total free energy to study coherent processes. He analyzed the effect of the coherency strain energy on phase equilibrium, nucleation, and spinodal decomposition. Since the free energy is formulated within the framework of gradient thermodynamics [22], these studies are actually the earliest applications of the phase field method to coherent transformations.
1.1. Atomic-misfit Energy and Coherency Strain Energy
A macroscopically stress-free solid solution with uniform composition can be in a "strained" state if the constituent atoms differ in size. The elastic energy associated with this microscopic origin is often referred to as the atomic-misfit energy in solid-solution theory [23]. It is the difference between the free energy of a real, homogeneous solution and the free energy of a hypothetical solution of the same system in which all the atoms have the same size. This atomic-misfit energy, even though mechanical in origin and long-range in character, is part of the physically measurable chemical free energy (e.g., the free energy of mixing) and is included in the thermodynamic databases in the literature. The elastic energy associated with compositional or structural non-uniformity (such as fluctuations and precipitates) in a coherent system is referred to as the coherency strain energy. The reference state for measuring the coherency strain energy is a system of identical fluctuations or an identical precipitate–matrix mixture, but with the fluctuations or precipitates/matrix separated into stress-free portions [21] (i.e., the incoherent counterpart). Since the coherency strain energy is in general a function of the size, shape, spatial orientation and mutual arrangement of the precipitates [16], it cannot be incorporated into the chemical free energy except in very special cases [18]. Thus the coherency strain energy is usually not included in the free energy from thermodynamic databases.
1.2. Coherent and Incoherent Phase Diagrams
In contrast to the atomic-misfit energy, the coherency strain energy is zero for a homogeneous solid solution and positive for any non-uniform coherent system. It always promotes a homogeneous solid solution and suppresses
phase separation. For a given system, the phase diagram determined by minimizing solely the bulk chemical free energy (including the contribution from the atomic misfit energy), or measured at a stage when the precipitates have already lost their coherency with the matrix, is referred to as the incoherent phase diagram. Correspondingly, the phase diagram determined by minimizing the sum of the bulk chemical free energy and the coherency strain energy, or measured from coherent stages of the system, is referred to as the coherent phase diagram. A coherent phase diagram, which is the relevant one for the study of coherent precipitation, can differ significantly from an incoherent one. This has been demonstrated clearly by Cahn [18] using an elastically isotropic system with a linear dependence of the lattice parameter on composition. In this particular case the equilibrium compositions and volume fractions of coherent precipitates can be determined by the common-tangent rule applied to the total bulk free energy (Fig. 3). Cahn showed that a coherent miscibility gap lies within the incoherent miscibility gap, with the differences in critical point and width of the miscibility gap determined by the amount of lattice misfit. In an elastically anisotropic system, the coherency strain energy becomes a function of precipitate size, shape and spatial location. In this case precipitates of different configurations will have different coherency strain energies, leading to a series of miscibility gaps lying within the incoherent one.
Figure 3. Incoherent (solid line) and coherent (dotted line) free energy as a function of composition for a regular solution that is elastically isotropic and whose lattice parameter depends linearly on concentration. The equilibrium compositions in the two cases (c1, c2 and c′1, c′2) are determined by the common tangent construction. c0 is the average composition of the solid solution (after [21]).
1.3. Coherent Precipitation
Precipitation typically involves the nucleation and growth of new-phase particles out of a parent matrix, and the subsequent coarsening of the resulting two-phase mixture. In the absence of coherency strain, nucleation is controlled by the interplay between the bulk chemical free energy and the interfacial energy, while growth and coarsening are dominated, respectively, by the bulk chemical free energy and the interfacial energy. For coherent precipitation, the coherency strain energy enters the driving forces for all three processes because it depends on both the volume and the morphology of the precipitates. In this case, nucleation is determined by the interplay among the bulk chemical free energy, the coherency strain energy, and the interfacial energy, while growth is dominated by the interplay between the chemical free energy and the coherency strain energy, and coarsening by the interplay between the coherency strain energy and the interfacial energy. Therefore, many of the thermodynamic principles and rate equations derived for incoherent precipitation have to be modified for coherent processes. First of all, one has to pay attention to how the phase diagram and thermodynamic database for a given system were developed. For an incoherent phase diagram the thermodynamic data do not include the contribution from the coherency strain energy. In this case one needs to add the coherency strain energy to the chemical free energy from the database to obtain the total free energy for coherent transformations. However, if the phase diagram is determined for coherent precipitates and the thermodynamic database is developed by fitting the "chemical" free energy model to the coherent phase diagram, the "chemical" free energy already includes the coherency strain energy corresponding to the precipitate configuration encountered in the experiment. Adding the coherency strain energy again to such a "chemical" free energy would overestimate its contribution, and extra effort has to be made to formulate the total free energy function correctly in this case (see the next section). Phase diagrams reported in the literature are usually incoherent phase diagrams, but exceptions are not uncommon. For example, most existing Ni–Ni3Al (γ/γ′) phase diagrams are actually coherent ones because the incoherent equilibrium between γ and γ′ is rarely observed in usual experiments [24]. An accurate chemical free energy model is essential for the construction of an accurate total free energy in the phase field method, which determines the coherent phase diagram and the driving forces for coherent precipitation. Even though the coherency strain energy always suppresses phase separation, reducing the driving force for nucleation and growth, coherent precipitation is still the preferred path at the early stages of transformations in many material systems. This is because the nucleation barrier for a coherent precipitate is usually significantly lower than that for an incoherent precipitate, owing to the order-of-magnitude difference in interfacial energy between a
coherent and an incoherent interface. Precipitates may lose their coherency at later stages when they grow to certain sizes; by then the strain-induced interactions among the coherent fluctuations and precipitates may already have fixed the spatial distribution of the precipitates. Therefore, the development of any model for coherent precipitation has to start with coherent nucleation. Classical treatments of the strain energy effect on nucleation (for reviews see [5, 25, 26]) considered an isolated precipitate and calculated the strain energy per unit volume of the precipitate as a function of its shape. The strain energy was then added to the chemical free energy. In these approaches, the interaction of a nucleating particle with the existing microstructure was ignored. However, the strain fields associated with coherent particles interact strongly with each other in elastically anisotropic crystals. In this case the strain energy of a coherent precipitate depends not only on its own strain field but also on the strains due to all other particles in the system (for a review, see [16]). This may have a profound influence on the nucleation process, e.g., making certain locations preferred nucleation sites [21]. In fact, many of the strain-induced morphological patterns observed (e.g., Fig. 1) may have been inherited from the nucleation stage and further developed during growth and coarsening. For example, correlated nucleation (where the position of a nucleus is determined by its interaction with the existing microstructure) [3, 6, 9] and collective nucleation (where particles appear in groups) [10, 27] have been predicted for the formation of various self-organized, quasi-periodical morphological patterns such as those shown in Fig. 1. Cahn [18, 21] analyzed coherent nucleation using the phase field method. He showed that one can derive analytical expressions for the coherent interfacial energy, the activation energy and the critical size of a coherent nucleus in an elastically isotropic system. These expressions have exactly the same forms as those derived for incoherent precipitation, but with the chemical free energy replaced by the sum of the chemical free energy and the coherency strain energy. Although no solution was given for coherent nucleation in elastically anisotropic systems, Cahn illustrated qualitatively the effect of elastic interactions among coherent precipitates on the nucleation process in an elastically anisotropic cubic crystal: the driving force for nucleation reaches a maximum at a nearby location in an elastically soft direction from an existing precipitate. In computer simulations using the phase field method, nucleation has been implemented in two ways: (a) by solving numerically the stochastic phase field equations with Langevin noise terms [6] and (b) by stochastically seeding nuclei in an evolving microstructure according to nucleation rates calculated as a function of local concentration and temperature [28] following the classical or non-classical nucleation theory. Recently the latter has been extended to coherent nucleation, where the effect of the elastic interaction of a nucleating particle with the existing microstructure is considered [29]. The Langevin approach is self-consistent with the phase field method but computationally
intensive, because observation of nucleation requires sampling at very high frequency in the simulation. It has been applied successfully to the study of collective and correlated nucleation under site-saturation conditions [6, 9, 10, 27]. The explicit algorithm is computationally more efficient and has been applied successfully to concurrent nucleation and growth processes under either isothermal or continuous cooling conditions [28, 30]. Because the interfacial energy scales with surface area while the coherency strain energy scales with volume, the shape of a precipitate tends to be dominated by the interfacial energy when it is small and by the coherency strain energy when it grows to larger sizes. Therefore, shape transitions during the growth and coarsening of coherent precipitates are expected. The long-range and highly anisotropic elastic interactions give rise to directionality in precipitate growth and coarsening, promoting spatial correlation among precipitates. Extensive discussions of these subjects can be found in the references listed in the Further Reading section. Indeed, significant shape transitions (including splitting) and strong spatial alignment of precipitates have been observed (see reviews [6, 15, 31]). The shape transition of a growing particle may further induce growth instability, leading to faceted dendrites [32]. One of the major advantages of the phase field model is that it describes growth and coarsening seamlessly in a single, self-consistent methodology. Incorporation of the coherency strain energy in the phase field model allows for capturing all possible microstructural features developing during the growth and coarsening of coherent precipitates. For example, precipitate drifting, local inverse coarsening, and particle splitting have been predicted during the growth and coarsening of coherent precipitates [11–14]. Incorporation of the coherency strain energy also alters the thermodynamic factor in the diffusivity, which is the second derivative of the total free energy with respect to concentration. Since atomic mobility rather than diffusivity is employed in the phase field model, the effect of coherency strain on the thermodynamic factor is included automatically. Note that the thermodynamic factor used in the calculation of the atomic mobility from the diffusivity should include the elastic energy contribution if the diffusivity was measured in a coherent system.
2. Theoretical Formulation

2.1. Phase Field Microelasticity of Coherent Precipitation
In the phase field approach [7, 8], microstructural evolution during a phase transformation is characterized self-consistently by the temporo-spatial evolution of a set of continuum order parameters or phase fields. One of the major
advantages of the method is its ability to describe effectively and efficiently an arbitrary microstructure at the mesoscale without explicitly tracking moving interfaces. In order to apply such a method to coherent transformations, one needs to formulate the coherency strain energy as a functional of the phase fields without any a priori assumptions about possible particle shapes and their spatial arrangements along the transformation path. The theoretical treatment of this elasticity problem is due to Khachaturyan and Shatalov [16, 33, 34], who derived a closed form of the coherency strain energy for an arbitrary coherent multi-phase mixture in an elastically anisotropic crystal under the homogeneous modulus assumption. The theory essentially solves the equation of mechanical equilibrium in reciprocal space for the well-known virtual process of Eshelby [35, 36] (Fig. 4). The process consists of five steps: (1) isolate portions of the parent phase matrix; (2) transform the isolated portions into precipitate phases in a stress-free state (e.g., outside the parent phase matrix), where the deformation involved, assuming a certain lattice correspondence between the precipitate and parent phases, is defined as the stress-free transformation strain (SFTS) ε⁰_ij; (3) apply an opposite stress −C_ijkl ε⁰_kl to the precipitates to restore their original shapes and sizes; (4) place them back into the spaces they originally occupied in the matrix; (5) allow both the precipitates and the matrix to relax to minimize the elastic strain energy subject to the requirement of interface coherency. Step (1) is traditionally taken prior to the phase transformation. If the precipitates
Figure 4. Eshelby's virtual procedure for calculating the coherency strain energy of coherent precipitates.
differ in composition from the matrix, the transformation in Step (2) will also change the matrix composition because of mass conservation. To be consistent with the definition of the coherency strain energy given in Section 2, we may modify the Eshelby cycle as follows: (1′) consider a coherent microstructure consisting of arbitrary concentration or structural non-uniformities produced along a phase transformation path; (2′) decompose the microstructure into its incoherent counterpart (i.e., with all the microstructural constituents in their stress-free states); (3′) apply counter stresses to force the lattices of all the constituents to be identical, nullifying the SFTS; (4′) put them back together by re-stitching their corresponding lattice planes at the interfaces; (5′) let the system relax to minimize the elastic strain energy. The SFTS field associated with arbitrary compositional or structural inhomogeneities can be expressed either in terms of shape functions for a sharp-interface approximation [16] of an arbitrary multi-phase mixture, or in terms of phase fields for a diffuse-interface approximation of arbitrary concentration or structural non-uniformities:

$$\varepsilon^{0}_{ij}(\mathbf{x}) = \sum_{p=1}^{N} \varepsilon^{00}_{ij}(p)\,\phi_p(\mathbf{x}) \qquad (1)$$
which is a linear superposition of all N types of non-uniformities, with φ_p(x) the phase field characterizing the p-th type of non-uniformity and ε⁰⁰_ij(p) the corresponding SFTS measured from a given reference state. Note that ε⁰⁰_ij(p) (i, j = 1, 2, 3) depends on the lattice correspondence between the precipitate and parent phases. The calculation of ε⁰⁰_ij(p) is an important step towards formulating the coherency strain energy, and it will be described in detail later in several examples. The equilibrium elastic strain, and hence the strain energy, can be found from the condition of mechanical equilibrium [37]

$$\frac{\partial \sigma_{ij}(\mathbf{x})}{\partial x_j} + f_i(\mathbf{x}) = 0 \qquad (2)$$
subject to boundary conditions. Here σ_ij(x) is the ij component of the coherency stress at position x, and f_i(x) is a body force per unit volume exerted by, e.g., an external field. In Eq. (2) we have used the Einstein convention, in which the repeated index j implies summation over all its possible values. The boundary conditions include constraints on external surfaces and internal interfaces. At external surfaces the boundary conditions are determined by physical constraints on the macroscopic body of a sample, such as its shape, surface traction, or a combination of the two. At internal interfaces,
continuity of both the displacement and the coherency stress is required to ensure the coherency of the interfaces. The Green's function solution of Eq. (2) under the homogeneous modulus assumption gives the equilibrium elastic strain [16, 38]:

$$e_{ij}(\mathbf{x}) = \bar{\varepsilon}_{ij} + \frac{1}{2}\fint \frac{d\mathbf{g}}{(2\pi)^3}\,\left[\, n_j\,\Omega_{ki}(\mathbf{n}) + n_i\,\Omega_{kj}(\mathbf{n})\,\right] n_l \sum_{p=1}^{N}\sigma^{00}_{kl}(p)\,\tilde{\phi}_p(\mathbf{g})\, e^{i\mathbf{g}\cdot\mathbf{x}} - \sum_{p=1}^{N}\varepsilon^{00}_{ij}(p)\,\phi_p(\mathbf{x}) \qquad (3)$$
where ε̄_ij is a homogeneous strain that represents the macroscopic shape change of the material body, g is a vector in reciprocal space and n ≡ g/g. [Ω⁻¹(n)]_ik ≡ C_ijkl n_j n_l is the inverse of the Green's function in reciprocal space, σ⁰⁰_ij(p) ≡ C_ijkl ε⁰⁰_kl(p), and φ̃_p(g) is the Fourier transform of φ_p(x). The barred integral \fint represents a principal-value integral that excludes a small volume (2π)³/V in reciprocal space at g = 0, where V is the total volume of the system. The total coherency strain energy of the system at equilibrium is then readily obtained as

$$E^{el} = \frac{1}{2}\int C_{ijkl}\, e_{ij}(\mathbf{x})\, e_{kl}(\mathbf{x})\, d\mathbf{x}$$
$$= \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\int C_{ijkl}\,\varepsilon^{00}_{ij}(p)\,\varepsilon^{00}_{kl}(q)\,\phi_p(\mathbf{x})\,\phi_q(\mathbf{x})\,d\mathbf{x} + \frac{V}{2}\,C_{ijkl}\,\bar{\varepsilon}_{ij}\,\bar{\varepsilon}_{kl} - \bar{\varepsilon}_{ij}\sum_{p=1}^{N} C_{ijkl}\,\varepsilon^{00}_{kl}(p)\int \phi_p(\mathbf{x})\,d\mathbf{x}$$
$$- \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\fint n_i\,\sigma^{00}_{ij}(p)\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}(q)\,n_l\,\tilde{\phi}_p(\mathbf{g})\,\tilde{\phi}^{*}_{q}(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3} \qquad (4)$$
The asterisk in the last term stands for complex conjugation. Equations (3) and (4) contain the homogeneous strain ε̄_ij, which is appropriate when the external boundary condition is given as a constrained macroscopic shape. Corresponding to the Eshelby cycle above, the first term on the right-hand side of Eq. (4) is the energy required to "squeeze" the microstructural constituents to nullify the stress-free transformation strain in Step (3′), and the remaining terms represent the energy reductions associated with relaxations of the "squeezed" state in Step (5′). In particular, the second and third terms describe the homogeneous (macroscopic shape) relaxation and the fourth term
describes the local heterogeneous relaxation. For a constrained stress condition at the external surface, ε̄_ij is determined by minimizing the total elastic energy with respect to ε̄_ij itself, which yields [38]

$$\bar{\varepsilon}_{ij} = S_{ijkl}\,\sigma^{appl}_{kl} + \frac{1}{V}\sum_{p=1}^{N}\varepsilon^{00}_{ij}(p)\int \phi_p(\mathbf{x})\, d\mathbf{x} \qquad (5)$$
where S_ijkl is the elastic compliance tensor and σ^appl_ij is the applied stress, related to the surface traction T and the surface normal s by T_i = σ^appl_ij s_j. Combining Eqs. (3)–(5) gives

$$E^{el} = \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\int C_{ijkl}\,\varepsilon^{00}_{ij}(p)\,\varepsilon^{00}_{kl}(q)\,\phi_p(\mathbf{x})\,\phi_q(\mathbf{x})\,d\mathbf{x}$$
$$- \frac{1}{2V}\, C_{ijkl}\left[\sum_{p=1}^{N}\varepsilon^{00}_{ij}(p)\int\phi_p(\mathbf{x})\,d\mathbf{x}\right]\left[\sum_{q=1}^{N}\varepsilon^{00}_{kl}(q)\int\phi_q(\mathbf{x}')\,d\mathbf{x}'\right]$$
$$- \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\fint n_i\,\sigma^{00}_{ij}(p)\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}(q)\,n_l\,\tilde{\phi}_p(\mathbf{g})\,\tilde{\phi}^{*}_q(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3}$$
$$- \sigma^{appl}_{ij}\sum_{p=1}^{N}\varepsilon^{00}_{ij}(p)\int\phi_p(\mathbf{x})\,d\mathbf{x} - \frac{V}{2}\,S_{ijkl}\,\sigma^{appl}_{ij}\,\sigma^{appl}_{kl} \qquad (6)$$
The expression for mixed constrained-shape and surface-traction boundary conditions can be derived in a similar way [38]. The above equations were derived under the assumption of constant C_ijkl, i.e., the homogeneous modulus assumption. In cases with spatially dependent C_ijkl the solution is contained in an implicit equation and thus requires a suitable solver, such as an iterative method; readers are referred to the recent development by Wang et al. [39]. Equations (2)–(6) provide closed forms of the coherency strain energy for a general elastically anisotropic system with arbitrary coherent precipitates described by the phase fields. In such formulations, the coherency strain energy can be added directly to the chemical free energy in the phase field method, because both are functionals of the same phase field variables. As mentioned earlier, the "chemical" free energy contains part of the coherency strain energy if it is obtained by fitting the free energy model to a coherent phase diagram. In order to avoid possible double counting, it is necessary to subtract this part of the coherency strain energy from Eq. (4) or (6). Therefore, it is useful to separate the coherency strain energy into a self-energy
and an interaction energy. Following the same treatment as in the microscopic elasticity theory of solid solutions [16], we can rewrite Eq. (4) as

$$E^{el} = E^{el}_{self} + E^{el}_{int},$$

$$E^{el}_{self} = \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\int C_{ijkl}\,\varepsilon^{00}_{ij}(p)\,\varepsilon^{00}_{kl}(q)\,\phi_p(\mathbf{x})\,\phi_q(\mathbf{x})\,d\mathbf{x} + \frac{V}{2}\,C_{ijkl}\,\bar{\varepsilon}_{ij}\,\bar{\varepsilon}_{kl}$$
$$- \bar{\varepsilon}_{ij}\sum_{p=1}^{N} C_{ijkl}\,\varepsilon^{00}_{kl}(p)\int\phi_p(\mathbf{x})\,d\mathbf{x} - \frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\fint Q\,\delta_{pq}\,\tilde{\phi}_p(\mathbf{g})\,\tilde{\phi}^{*}_q(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3},$$

$$E^{el}_{int} = -\frac{1}{2}\sum_{p=1}^{N}\sum_{q=1}^{N}\fint \left[\, n_i\,\sigma^{00}_{ij}(p)\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}(q)\,n_l - Q\,\delta_{pq}\,\right]\tilde{\phi}_p(\mathbf{g})\,\tilde{\phi}^{*}_q(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3},$$

where Q = ⟨n_i σ⁰⁰_ij(p) Ω_jk(n) σ⁰⁰_kl(p) n_l⟩_g is the average of n_i σ⁰⁰_ij(p) Ω_jk(n) σ⁰⁰_kl(p) n_l over the entire reciprocal space and δ_pq is the Kronecker delta, equal to unity when p = q and zero otherwise. E^el_self is configuration-independent and equals the elastic energy of placing a coherent precipitate of unit volume into a uniform matrix, multiplied by the total volume of the precipitates (assumed small compared to the volume of the system). E^el_int is configuration-dependent and contains the pair-wise interactions between precipitates and between volume elements within a finite precipitate. Since the self-energy depends only on the total volume of the precipitates and is independent of their morphology and spatial arrangement, it can be incorporated into, and renormalizes, the chemical free energy. Clearly, the self-energy should not be included in the calculation of the coherency strain energy if the "chemical" free energy of the system was obtained by fitting to a coherent phase diagram.
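As a concrete illustration of this split, the configuration-dependent kernel n_i σ⁰⁰_ij(p) Ω_jk(n) σ⁰⁰_kl(q) n_l − Qδ_pq can be tabulated once on a Fourier grid. The sketch below does this in 2D for a single, purely dilatational misfit and cubic elastic constants under the homogeneous modulus assumption; all numerical values are illustrative.

```python
import numpy as np

# Sketch: build the configuration-dependent interaction kernel of E_int,
# B(n) - Q, on a 2D Fourier grid for a purely dilatational misfit
# eps00_ij = eps0*delta_ij and cubic moduli (homogeneous modulus assumption).
N, eps0 = 128, 0.005
C11, C12, C44 = 231e9, 149e9, 117e9           # cubic elastic constants [Pa] (illustrative)

g = np.fft.fftfreq(N)                          # dimensionless wave numbers
gx, gy = np.meshgrid(g, g, indexing="ij")
gnorm = np.sqrt(gx**2 + gy**2); gnorm[0, 0] = 1.0
n1, n2 = gx / gnorm, gy / gnorm                # unit vectors n = g/|g|

s = (C11 + C12) * eps0                         # sigma00_ij = s*delta_ij in 2D
# acoustic tensor Omega^{-1}_ik = C_ijkl n_j n_l for a cubic crystal (plane problem)
A11 = C11 * n1**2 + C44 * n2**2
A22 = C11 * n2**2 + C44 * n1**2
A12 = (C12 + C44) * n1 * n2
det = A11 * A22 - A12**2
# B(n) = n_i sigma00_ij Omega_jk(n) sigma00_kl n_l, with Omega the inverse of A
B = s**2 * (A22 * n1**2 - 2.0 * A12 * n1 * n2 + A11 * n2**2) / det
B[0, 0] = 0.0                                  # principal value: exclude g = 0

mask = np.ones((N, N), dtype=bool); mask[0, 0] = False
Q = B[mask].mean()                             # average of B(n) over reciprocal space
kernel = B - Q                                 # enters E_int of the self/interaction split
```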
2.2. Incorporation of Coherency Strain Energy into Phase Field Equations
The chemical free energy of a non-uniform system in the phase field approach is formulated as a functional of the field variable based on gradient thermodynamics [22]
$$F^{ch} = \int \left[\, f(\phi(\mathbf{x})) + \kappa\,|\nabla\phi(\mathbf{x})|^2 \,\right] d\mathbf{x} \qquad (7)$$
where the first term in the integrand is the local chemical free energy density that depends only on local values of the field, φ(x), while the second term is
the gradient energy that accounts for contributions from the spatial variation of φ(x). More complex systems may require multiple phase fields, as will be seen in the examples given in the next section. For a coherent system, the total free energy is the sum of the chemical free energy F^ch and the coherency strain energy E^el:

$$F = F^{ch} + E^{el} \qquad (8)$$
where the chemical free energy is usually measured from the stress-free reference state mentioned earlier, and the coherency strain energy contains both the self- and interaction-energy discussed above. The time evolution of the phase fields, and thus of the coherent microstructure, is described by an Onsager-type kinetic equation that assumes a linear dependence of the rate of evolution, ∂φ/∂t, on the driving force, δF/δφ:

$$\frac{\partial\phi(\mathbf{x},t)}{\partial t} = -\hat{M}\,\frac{\delta F}{\delta\phi(\mathbf{x},t)} + \xi(\mathbf{x},t) \qquad (9)$$

where the operator M̂ is a kinetic coefficient matrix and ξ is a Langevin random force term that describes thermal fluctuations. The kinetic coefficient matrix is often simplified to M̂ = M if the phase field is non-conserved and M̂ = −M∇² if the phase field is conserved, where M is a scalar. Note that the total free energy F is a functional of the spatial distribution of the phase field, so the energy minimization is a variational process.
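A minimal sketch of how Eqs. (7)–(9) and the reciprocal-space form of Eq. (4) combine in practice: one semi-implicit Fourier-spectral time step for a single conserved concentration field with a double-well local free energy and a dilatational misfit (homogeneous modulus, periodic boundaries). The scheme, functional forms and all parameter values are illustrative assumptions, not the specific choices of the simulations discussed below.

```python
import numpy as np

# One semi-implicit spectral step of conserved (Cahn-Hilliard-type) dynamics,
# Eq. (9) with M_hat = -M*grad^2, for F = F_ch (Eq. (7)) + E_el (Eq. (4)).
N, dx, dt = 256, 1.0, 0.1
M, kappa, Wdw = 1.0, 1.0, 1.0          # mobility, gradient coefficient, well height
eps0 = 0.01                            # dilatational misfit coefficient (assumed)
C11, C12, C44 = 2.5, 1.5, 1.2          # cubic elastic constants (dimensionless)

rng = np.random.default_rng(0)
c = 0.5 + 0.02 * (rng.random((N, N)) - 0.5)   # supersaturated solution + noise

k = 2.0 * np.pi * np.fft.fftfreq(N, d=dx)
kx, ky = np.meshgrid(k, k, indexing="ij")
k2 = kx**2 + ky**2
kn = np.sqrt(k2); kn[0, 0] = 1.0
n1, n2 = kx / kn, ky / kn

# elastic kernel B(n) for sigma00 = (C11 + C12)*eps0*delta (2D), cf. Eq. (4)
s = (C11 + C12) * eps0
A11 = C11 * n1**2 + C44 * n2**2
A22 = C11 * n2**2 + C44 * n1**2
A12 = (C12 + C44) * n1 * n2
B = s**2 * (A22 * n1**2 - 2 * A12 * n1 * n2 + A11 * n2**2) / (A11 * A22 - A12**2)
B[0, 0] = 0.0
E0 = 2.0 * s * eps0                    # C_ijkl eps00_ij eps00_kl in 2D (local part)

for step in range(1000):
    mu_chem = 2.0 * Wdw * c * (1.0 - c) * (1.0 - 2.0 * c)   # f'(c), double well
    c_hat = np.fft.fft2(c)
    rhs = np.fft.fft2(mu_chem) + (E0 - B) * c_hat           # chemical + elastic force
    c_hat = (c_hat - dt * M * k2 * rhs) / (1.0 + 2.0 * dt * M * kappa * k2**2)
    c = np.real(np.fft.ifft2(c_hat))
```

The gradient-energy term is treated implicitly for numerical stability, while the nonlinear chemical and elastic driving forces are treated explicitly, a standard design choice for this class of solvers.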
3. Examples of Applications

3.1. Cubic → Cubic Transformation
For a simple cubic → cubic transformation the SFTS is dilatational. If we assume that the coherency strain is caused by concentration inhomogeneity, which is the case for most cubic alloys, the SFTS tensor becomes a function of concentration, e.g., ε⁰_ij = ε⁰(c)δ_ij. The compositional dependence of ε⁰(c) can be written as a Taylor series around the average composition c̄ of the parent phase matrix:
$$\varepsilon^{0}(c) = \varepsilon^{0}(\bar c) + \left.\frac{d\varepsilon^{0}}{dc}\right|_{c=\bar c}(c-\bar c) + \cdots \qquad (10)$$
By choosing the stress-free reference state at c̄, ε⁰(c) = [a(c) − a(c̄)]/a(c̄) and the leading term on the right-hand side of Eq. (10) vanishes. The SFTS may then be approximated by the first non-vanishing term:
$$\varepsilon^{0}_{ij}(\mathbf{x}) = \frac{1}{a(\bar c)}\left.\frac{da}{dc}\right|_{c=\bar c}\,[c(\mathbf{x}) - \bar c\,]\,\delta_{ij} \qquad (11)$$
where we have added the explicit dependence of the SFTS on the spatial position x. Accordingly, ε⁰⁰_ij = a⁻¹(c̄)(da/dc)|_{c=c̄} δ_ij. With the stress-free condition for the external boundary applied in this and the subsequent examples, the coherency strain energy is reduced from Eq. (4), with φ_p replaced by c(x) − c̄, to

$$E^{el} = \frac{1}{2}\,C_{ijkl}\,\varepsilon^{00}_{ij}\,\varepsilon^{00}_{kl}\int [c(\mathbf{x})-\bar c\,]^2\, d\mathbf{x} + \frac{V}{2}\,C_{ijkl}\,\bar{\varepsilon}_{ij}\,\bar{\varepsilon}_{kl} - \bar{\varepsilon}_{ij}\,C_{ijkl}\,\varepsilon^{00}_{kl}\int [c(\mathbf{x})-\bar c\,]\,d\mathbf{x}$$
$$- \frac{1}{2}\fint n_i\,\sigma^{00}_{ij}\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}\,n_l\,\tilde{c}(\mathbf{g})\,\tilde{c}^{*}(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3},$$

where ε̄_ij is determined by the boundary condition and c̃(g) is the Fourier transform of c(x). The kinetics of coherent precipitation is then described by Eqs. (7)–(9). A typical example of such a cubic → cubic coherent transformation is the precipitation of an ordered intermetallic phase (γ′, L1₂ Ni₃Al) from a disordered matrix (γ, fcc solid solution) in Ni–Al (Fig. 5). The coherency strain is caused by the difference in composition between γ and γ′, which modifies the lattice parameters of the two phases. Since the two-phase equilibrium in this system is a coherent equilibrium, the coherency strain energy should include only the configuration-dependent part, as discussed earlier:

$$E^{el} = -\frac{1}{2}\fint \left[\, n_i\,\sigma^{00}_{ij}\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}\,n_l - \langle n_i\,\sigma^{00}_{ij}\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}\,n_l\rangle_{\mathbf{g}}\,\right]\tilde{c}(\mathbf{g})\,\tilde{c}^{*}(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3}.$$

Figure 6 shows the simulated microstructural evolution during coherent precipitation by the phase field method [40]. The chemical free energy is
Figure 5. Crystal structures of the γ (fcc solid solution) (a) and γ′ (ordered L1₂) (b) phases in a nickel–aluminum alloy. In (b) the solid circles indicate nickel atoms and the open circles indicate aluminum atoms.
approximated by a Landau-type expansion polynomial, which provides appropriate descriptions of the equilibrium thermodynamic properties (such as the equilibrium compositions and the driving force) and reflects the symmetry relationship between the parent and product phases (for a general discussion see [41, 42]). The elastic constants of the cubic crystal, c₁₁ (= C₁₁₁₁), c₁₂ (= C₁₁₂₂) and c₄₄ (= C₂₃₂₃), are 231, 149 and 117 GPa, respectively [43]. ε⁰⁰_ij is chosen as 0.049δ_ij, which corresponds to an SFTS of 0.56%. The simulation is performed on a 512 × 512 mesh with a grid size of 1.7 nm. The starting microstructure is a homogeneous supersaturated solid solution with an average composition of 0.17 (atomic fraction of Al). The nucleation processes in this and the subsequent examples were simulated by the Langevin noise terms ξ in Eq. (9); the noise was applied only for a short period of time at the beginning, corresponding to the site-saturation approximation. According to the group–subgroup relationship between the crystal lattice symmetries of the parent and precipitate phases, three long-range order parameter fields were used in addition to the concentration field, which automatically introduces the four types of antiphase domains of the ordered γ′ phase. Periodic boundary conditions were employed. Because of the strong elastic anisotropy, the precipitates evolved into cuboidal shapes and aligned themselves into a quasi-periodical array, with both the interface inclination and the spatial alignment along the elastically soft ⟨100⟩ directions. The simulated γ/γ′ microstructure agrees well with experimental observations (Fig. 6(b)). This example makes clear that the phase field method is able to handle high volume fractions of diffusionally and elastically interacting precipitates with complicated shapes and spatial distributions.
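For orientation, the Landau-type chemical free energy density used in such γ/γ′ models commonly couples the concentration to the three lro parameters; one representative form (the coefficients A₁–A₄ and the reference compositions c₁, c₂ below are schematic placeholders, not the values used in [40]) is

$$f(c,\eta_1,\eta_2,\eta_3) = \frac{A_1}{2}(c - c_1)^2 + \frac{A_2}{2}(c_2 - c)\sum_{p=1}^{3}\eta_p^2 - \frac{A_3}{3}\,\eta_1\eta_2\eta_3 + \frac{A_4}{4}\sum_{p=1}^{3}\eta_p^4,$$

whose cubic term η₁η₂η₃ reflects the fcc → L1₂ symmetry reduction and yields four degenerate ordered minima, i.e., the four antiphase domains.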
3.2. Hexagonal → Orthorhombic Transformation
The hexagonal → orthorhombic transformation is a typical example of a structural transformation with a reduction of crystal lattice symmetry. Unlike a cubic → cubic transformation, there are several symmetry-related orientation variants of the precipitate phase. Experimental observations [44–46] have shown remarkably similar morphological patterns formed by the low-symmetry orthorhombic phase in different materials systems, indicating that the accommodation of coherency strain among different orientation variants dominates the microstructural evolution during the precipitation reaction. In this example we present a generic transformation of a disordered hexagonal phase to an ordered orthorhombic phase with three lattice correspondence variants [27]. The atomic rearrangement during ordering occurs primarily on the (0001) plane of the parent hexagonal phase; therefore, the essential features of the microstructural evolution can be well represented by ordering of the (0001) planes (Fig. 7) and effectively modeled in two dimensions.
Figure 6. (a) Simulated γ/γ′ microstructure from the phase field method; the lattice misfit is taken as (a_γ′ − a_γ)/a_γ ≈ 0.0056. (b) Experimental observation in a Ni–Al–Mo alloy (courtesy of M. Fährmann).
[010]O
[12 10]H
(b)
bO [12 10]H [100]O
aH
[100]O
bO
[12 10]H
Figure 7. Correspondence of the lattices of (a) the disordered hexagonal phase and (b) the ordered orthorhombic phase (with three orientation variants).
The lattice correspondence between the parent and product phases is shown in Fig. 7. For the first variant in Fig. 7(b) we have

$$\tfrac{1}{3}[2\bar{1}\bar{1}0]_{H} \rightarrow [100]_{O},\qquad \tfrac{1}{3}[11\bar{2}0]_{H} \rightarrow \tfrac{1}{2}[110]_{O},\qquad [0001]_{H} \rightarrow [001]_{O},$$

and the corresponding SFTS tensor is

$$\varepsilon^{0}_{ij} = \begin{pmatrix} \alpha & 0 & 0\\ 0 & \beta & 0\\ 0 & 0 & \gamma \end{pmatrix},\qquad \alpha = \frac{a_O - a_H}{a_O},\quad \beta = \frac{b_O - \sqrt{3}\,a_H}{\sqrt{3}\,a_H},\quad \gamma = \frac{c_O - c_H}{c_H},$$

where a_H and c_H are the lattice parameters of the hexagonal phase and a_O, b_O and c_O are the lattice parameters of the orthorhombic phase. If we assume
no volume change for the transformation and the lattice parameter difference between the hexagonal and orthorhombic phases along the c-axis is negligible, the SFTS is simplified to
$$\varepsilon^{0}_{ij} = \varepsilon^{0}\begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 0 \end{pmatrix} \qquad (12)$$
where ε⁰ = (a_O − a_H)/a_O is the magnitude of the shear deformation. The three lattice correspondence variants of the orthorhombic phase are related to each other by 120° rotations around the c-axis (Fig. 7(b)). The SFTS of the remaining two variants can thus be obtained by rotational operations (±120° about the c-axis) on the strain tensor given in Eq. (12). Furthermore, since the deformation along the c-axis is assumed zero, the SFTS of the three variants can be written as 2 × 2 tensors:

$$\varepsilon^{00}_{ij}(1) = \varepsilon^{0}\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix},\qquad \varepsilon^{00}_{ij}(2) = \varepsilon^{0}\begin{pmatrix} -1/2 & \sqrt{3}/2\\ \sqrt{3}/2 & 1/2 \end{pmatrix},\qquad \varepsilon^{00}_{ij}(3) = \varepsilon^{0}\begin{pmatrix} -1/2 & -\sqrt{3}/2\\ -\sqrt{3}/2 & 1/2 \end{pmatrix} \qquad (13)$$

In the phase field method, the three variants are described by three long-range order (lro) parameters (η₁, η₂, η₃), each representing one variant. Since there is no composition change during the ordering reaction, the structural inhomogeneity is characterized solely by the lro parameters; correspondingly, the chemical free energy is formulated as a Landau polynomial expansion with respect to the lro parameters. Substituting η²_p (p = 1, 2, 3) for φ_p in Eq. (4), the elastic energy becomes

$$E^{el} = \frac{1}{2}\sum_{p=1}^{3}\sum_{q=1}^{3} C_{ijkl}\,\varepsilon^{00}_{ij}(p)\,\varepsilon^{00}_{kl}(q)\int \eta_p^2(\mathbf{x})\,\eta_q^2(\mathbf{x})\,d\mathbf{x} + \frac{V}{2}\,C_{ijkl}\,\bar{\varepsilon}_{ij}\,\bar{\varepsilon}_{kl} - \bar{\varepsilon}_{ij}\sum_{p=1}^{3} C_{ijkl}\,\varepsilon^{00}_{kl}(p)\int \eta_p^2(\mathbf{x})\,d\mathbf{x}$$
$$- \frac{1}{2}\sum_{p=1}^{3}\sum_{q=1}^{3}\fint n_i\,\sigma^{00}_{ij}(p)\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}(q)\,n_l\,\widetilde{\eta_p^2}(\mathbf{g})\,\widetilde{\eta_q^2}^{\,*}(\mathbf{g})\,\frac{d\mathbf{g}}{(2\pi)^3}.$$
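The variant tensors of Eq. (13) follow directly from rotating the first variant; a minimal sketch (ε⁰ is an illustrative misfit magnitude):

```python
import numpy as np

# Sketch: generate the SFTS tensors of the three orthorhombic variants,
# Eq. (13), by rotating the first variant (Eq. (12), in-plane part)
# by +/-120 degrees about the c-axis.
eps0 = 0.01
e1 = eps0 * np.array([[1.0, 0.0], [0.0, -1.0]])   # variant 1

def rotate(e, theta):
    ct, st = np.cos(theta), np.sin(theta)
    R = np.array([[ct, -st], [st, ct]])
    return R @ e @ R.T                             # eps'_ij = R_ik R_jl eps_kl

e2 = rotate(e1, -2.0 * np.pi / 3.0)   # variant 2 of Eq. (13)
e3 = rotate(e1, +2.0 * np.pi / 3.0)   # variant 3 of Eq. (13)
print(np.round(e2 / eps0, 3))   # [[-0.5, 0.866], [0.866, 0.5]]
print(np.round(e3 / eps0, 3))   # [[-0.5, -0.866], [-0.866, 0.5]]
```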
Figure 8 shows the microstructures simulated by the phase field method [27]. The system was discretized into a 1024 × 1024 mesh with a grid size of 0.5 nm. The initial microstructure is a homogeneous hexagonal phase. Strong spatial correlation among the orthorhombic phase particles developed during nucleation (Fig. 8(a)). The subsequent growth and coarsening of the orthorhombic phase particles produced various special domain patterns
Figure 8. Microstructures obtained during hexagonal → orthorhombic ordering by 2D phase field simulation at (a) t* = 20, (b) t* = 1000 and (c) t* = 3000. Specific patterns (highlighted by circles, ellipses, and squares) are also found in experimental observations (Fig. 1(d)).
as a result of elastic strain accommodation among the different orientation variants. These patterns show excellent agreement with experimental observations (Fig. 1(d)), and the typical sizes of these configurations were also found to be in good agreement with the experiments. When the coherency strain energy was not included, completely different domain patterns were obtained. This indicates that elastic strain accommodation among the different orientation variants dominates the morphological pattern formation during the hexagonal → orthorhombic transformation. The coarsening kinetics of the domain structure deviates significantly from that observed for an incoherent system [47].
3.3. Cubic → Trigonal (ζ2) Martensitic Transformation in Polycrystalline Au–Cd Alloy
In the two examples presented above, single crystals with relatively simple lattice rearrangements during precipitation were considered. In this example we present one of the most complicated cases that has been studied by the phase field method [48]. The trigonal lattice of the ζ2 martensite in Au–Cd can be visualized as a cubic lattice stretched along one of the body-diagonal (i.e., [111]) directions. Four lattice correspondence variants are associated with the transformation, corresponding to the four ⟨111⟩ directions of the cube. In the phase field method, the spatial distribution of the four variants is characterized by four lro parameter fields, and the chemical free energy is approximated by a Landau expansion polynomial with respect to the lro parameters. If we represent the trigonal phase in hexagonal indices, the lattice correspondence
between the parent and product phases is [49]:

Variant 1: $[\bar{2}11]_{\beta_2} \to [\bar{2}110]_{\zeta_2}$, $[\bar{1}2\bar{1}]_{\beta_2} \to [\bar{1}2\bar{1}0]_{\zeta_2}$, $[111]_{\beta_2} \to [0001]_{\zeta_2}$;
Variant 2: $[12\bar{1}]_{\beta_2} \to [\bar{2}110]_{\zeta_2}$, $[211]_{\beta_2} \to [\bar{1}2\bar{1}0]_{\zeta_2}$, $[\bar{1}11]_{\beta_2} \to [0001]_{\zeta_2}$;
Variant 3: $[21\bar{1}]_{\beta_2} \to [\bar{2}110]_{\zeta_2}$, $[121]_{\beta_2} \to [\bar{1}2\bar{1}0]_{\zeta_2}$, $[\bar{1}1\bar{1}]_{\beta_2} \to [0001]_{\zeta_2}$;
Variant 4: $[\bar{1}21]_{\beta_2} \to [\bar{2}110]_{\zeta_2}$, $[\bar{2}1\bar{1}]_{\beta_2} \to [\bar{1}2\bar{1}0]_{\zeta_2}$, $[11\bar{1}]_{\beta_2} \to [0001]_{\zeta_2}$.
Correspondingly, the SFTS for the four lattice correspondence variants are:
$$\varepsilon^{00}_{ij}(1) = \begin{pmatrix} \alpha & \beta & \beta \\ \beta & \alpha & \beta \\ \beta & \beta & \alpha \end{pmatrix}, \quad \varepsilon^{00}_{ij}(2) = \begin{pmatrix} \alpha & -\beta & -\beta \\ -\beta & \alpha & \beta \\ -\beta & \beta & \alpha \end{pmatrix}, \quad \varepsilon^{00}_{ij}(3) = \begin{pmatrix} \alpha & -\beta & \beta \\ -\beta & \alpha & -\beta \\ \beta & -\beta & \alpha \end{pmatrix}, \quad \varepsilon^{00}_{ij}(4) = \begin{pmatrix} \alpha & \beta & -\beta \\ \beta & \alpha & -\beta \\ -\beta & -\beta & \alpha \end{pmatrix}, \qquad (14)$$

where $\alpha = (\sqrt{6}\,a_{\rm h} + \sqrt{3}\,c_{\rm h} - 9a_{\rm c})/9a_{\rm c}$ and $\beta = (-\sqrt{6}\,a_{\rm h} + 2\sqrt{3}\,c_{\rm h})/18a_{\rm c}$; here $a_{\rm c}$ is the lattice parameter of the cubic parent phase, and $a_{\rm h}$ and $c_{\rm h}$ are the lattice parameters of the trigonal phase represented in hexagonal indices. The SFTS field that characterizes the structural inhomogeneity is a linear superposition of the SFTS of each variant, as given by Eq. (2). Thus the elastic energy (Eq. (4)) reduces to
$$E^{\rm el} = \frac{1}{2}\sum_{p=1}^{4}\sum_{q=1}^{4}\int \left[ C_{ijkl}\,\varepsilon^{00}_{ij}(p)\,\varepsilon^{00}_{kl}(q) - n_i\,\sigma^{00}_{ij}(p)\,\Omega_{jk}(\mathbf{n})\,\sigma^{00}_{kl}(q)\,n_l \right] \tilde{\eta}_p(\mathbf{g})\,\tilde{\eta}_q^{*}(\mathbf{g})\,\frac{\mathrm{d}\mathbf{g}}{(2\pi)^3}.$$
Figure 9(a) shows the 3D microstructure simulated on a 128 × 128 × 128 mesh for a single crystal. The grid size is 0.5 µm. The simulation started from a homogeneous cubic solid solution characterized by η1(x) = η2(x) = η3(x) = η4(x) = 0. The four orientation variants are represented by four shades of gray in the figure. The typical “herring-bone” feature of the microstructure, formed by self-assembly of the four variants, is readily seen and agrees well with experimental observations (Fig. 9(c)). The treatment of a polycrystalline material may take the strain tensors in Eq. (14) as those defined in the local coordinate frame of each constituent single-crystal grain. Expressing the SFTS in the global coordinate frame then requires a rotational operation,

$$\varepsilon^{0,g}_{ij}(\mathbf{x}) = R_{ik}(\mathbf{x})\,R_{jl}(\mathbf{x})\,\varepsilon^{0}_{kl}(\mathbf{x}), \qquad (15)$$
where R_ij(x) is a 3 × 3 matrix that defines the orientation of the grain in the global coordinate frame; it has a constant value within a grain but differs from
Figure 9. Microstructures developed in a cubic → trigonal (ζ2) martensitic transformation in (a) a single crystal and (b) a polycrystal from 3D phase field simulations. The “herring-bone” structure observed in simulation (a) agrees well with experimental observations (c).
one grain to another. The microstructure in Fig. 9(b) was obtained for a polycrystal with eight randomly oriented grains. The resulting multi-domain structure is quite different from that obtained for the single crystal. Because of the constraint from the neighboring, randomly oriented grains, the martensitic transformation does not go to completion, and the multi-domain structure is stable against further coarsening. This is in contrast to the single-crystal case, where the martensitic transformation goes to completion and the multi-domain microstructure coarsens until a single-domain state is reached for the entire system. This example demonstrates well the capability of the phase field method in predicting the very complicated strain-accommodating microstructural patterns produced by a coherent transformation in polycrystals.
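A sketch of how Eq. (15) might be applied on a grid (illustrative only: the grain map, the single-axis rotations and all numerical values are placeholder assumptions, not the construction used in Ref. [48]):

import numpy as np

def rotation_matrix_z(angle_rad):
    """3x3 rotation about z (a stand-in for a general grain orientation)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Placeholder inputs: a grain label at every grid point and one random
# orientation per grain (a full treatment would use three Euler angles).
grain_id = np.random.randint(0, 8, size=(64, 64, 64))       # eight grains
angles = np.random.uniform(0.0, 2 * np.pi, size=8)
R_per_grain = np.array([rotation_matrix_z(a) for a in angles])

alpha, beta = 0.02, 0.01                                    # placeholder values
eps_local = np.full((3, 3), beta) + (alpha - beta) * np.eye(3)  # variant 1, Eq. (14)

# Eq. (15): eps^{0,g}_{ij}(x) = R_ik(x) R_jl(x) eps^0_{kl}(x)
R_field = R_per_grain[grain_id]                  # shape (64, 64, 64, 3, 3)
eps_global = np.einsum('...ik,...jl,kl->...ij', R_field, R_field, eps_local)
print(eps_global.shape)                          # (64, 64, 64, 3, 3)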
4. Summary
In this article we reviewed some of the fundamentals related to coherent transformations, the microelasticity theory of coherent precipitation and
its implementation in the phase field method. Through three examples, we discussed the formulation of the stress-free transformation strain field associated with the compositional or structural non-uniformity produced by diffusional and diffusionless transformations. For any given coherent transformation, if the lattice correspondence between the parent and product phases, their lattice parameters and their elastic constants are known, the coherency strain energy can be formulated in a rather straightforward fashion as a functional of the same field variables chosen to characterize the microstructure in the phase field method. The flexibility of the method in treating various coherent precipitations involving simple and complex atomic rearrangements has been well demonstrated through these examples. The description of microstructures in terms of phase fields allows for complexity at a level close to that encountered in real materials. The evolution of the microstructures is treated in a self-consistent framework in which the variational principle is applied to the total free energy of the system. Owing to these benefits, it would not be surprising to see in the near future a significant increase in attempts to explore various kinds of complex coherent phenomena with the phase field method. The formulation of the chemical free energy for solid state phase transformations is not emphasized in this review, but can be found elsewhere (see, e.g., [6–8]).

The numerical techniques employed in current phase field modeling of coherent transformations involve uniform finite difference schemes, which pose serious limitations on length scales. As a physical model, the affordable system size that can be considered in a phase field simulation is limited by the thickness of the actual interfaces when real material parameters are used as inputs. To overcome this length scale limit, one has to either employ more efficient algorithms, such as the adaptive [50] and wavelet [51] methods that are currently under active development, or produce artificially diffuse interfaces at the length scales of interest without altering the velocity of interface motion, by properly modifying certain model parameters [52–55]. Since the closed form of the coherency strain energy is given in reciprocal space, a Fourier transform is required in solving the partial differential equations, which may pose serious challenges to the adaptive or wavelet methods. A common approach to scaling up the length scale of phase field modeling of a coherent transformation is to increase the contribution of the coherency strain energy relative to the chemical free energy [40, 56]. While this seems to be a reasonable approach for qualitative studies, it may result in serious artifacts in quantitative studies. For example, it may produce an artificially high strain-induced concentration non-uniformity, which may affect the kinetics of nucleation, growth and coarsening. This issue has received increasing attention as the phase field method is applied to quantitative simulation studies.
5. Further Reading
Monographs and Reviews on Coherent Phase Transformations

1. A.G. Khachaturyan, Theory of Structural Transformations in Solids, John Wiley & Sons, New York, 1983.
2. Y. Wang, L.Q. Chen, and A.G. Khachaturyan, "Computer simulation of microstructure evolution in coherent solids," Solid → Solid Phase Transformations, TMS, Warrendale, PA, 1994.
3. W.C. Johnson, "Influence of elastic stress on phase transformations," In: H.I. Aaronson (ed.), Lectures on the Theory of Phase Transformations, The Minerals, Metals & Materials Society, 35–134, 1999.
4. L.Q. Chen, "Phase field models for microstructure evolution," Annu. Rev. Mater. Res., 32, 113–140, 2002.

Articles on Elastically Inhomogeneous Solids and Thin Films

5. A.G. Khachaturyan, S. Semenovskaya, and T. Tsakalakos, "Elastic strain energy of inhomogeneous solids," Phys. Rev. B, 52, 15909–15919, 1995.
6. S.Y. Hu and L.Q. Chen, "A phase-field model for evolving microstructures with strong elastic inhomogeneity," Acta Mater., 49, 1879, 2001.
7. Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, "Phase field microelasticity theory and modeling of elastically and structurally inhomogeneous solid," J. Appl. Phys., 92, 1351–1360, 2002.
8. Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, "Phase field microelasticity modeling of dislocation dynamics near free surface and in heteroepitaxial thin films," Acta Mater., 51, 4209–4223, 2003.
References

[1] L. Wang, D.E. Laughlin et al., "Magnetic domain structure of Fe-55 at.% Pd alloy at different stages of atomic ordering," J. Appl. Phys., 93, 7984–7986, 2003.
[2] V.I. Syutkina and E.S. Jakovleva, Phys. Stat. Sol., 21, 465, 1967.
[3] Y. Le Bouar and A. Loiseau, "Origin of the chessboard-like structures in decomposing alloys: theoretical model and computer simulation," Acta Mater., 46, 2777, 1998.
[4] C. Manolikas and S. Amelinckx, "Phase-transitions in ferroelastic lead orthovanadate as observed by means of electron-microscopy and electron-diffraction. 1. Static observations," Phys. Stat. Sol., A60(2), 607–617, 1980.
[5] K.C. Russell, "Introduction to: Coherent fluctuations and nucleation in isotropic solids by John W. Cahn," In: W. Craig Carter and William C. Johnson (eds.), The Selected Works of John W. Cahn, The Minerals, Metals & Materials Society, Warrendale, PA, 105–106, 1998.
[6] Y. Wang, L.Q. Chen et al., "Computer simulation of microstructure evolution in coherent solids," Solid → Solid Phase Transformations, TMS, Warrendale, PA, 1994.
[7] Y. Wang and L. Chen, "Simulation of microstructural evolution using the phase field method," In: E.N. Kaufman (editor-in-chief), Methods in Materials Research, Current Protocols, Unit 2a.3, John Wiley & Sons, Inc., 2000.
[8] L.Q. Chen, "Phase field models for microstructure evolution," Annu. Rev. Mater. Res., 32, 113–140, 2002.
[9] Y. Wang, H.Y. Wang et al., "Microstructural development of coherent tetragonal precipitates in Mg-partially stabilized zirconia: a computer simulation," J. Am. Ceram. Soc., 78, 657, 1995.
[10] Y. Wang and A.G. Khachaturyan, "Three-dimensional field model and computer modeling of martensitic transformation," Acta Metall. Mater., 45, 759, 1997.
[11] Y. Wang, L.Q. Chen et al., "Particle translational motion and reverse coarsening phenomena in multiparticle systems induced by a long-range elastic interaction," Phys. Rev. B, 46, 11194, 1992.
[12] Y. Wang, L.Q. Chen et al., "Kinetics of strain-induced morphological transformation in cubic alloys with a miscibility gap," Acta Metall. Mater., 41, 279, 1993.
[13] D.Y. Li and L.Q. Chen, "Shape evolution and splitting of coherent particles under applied stresses," Acta Mater., 47(1), 247–257, 1998.
[14] J.D. Zhang, D.Y. Li et al., "Shape evolution and splitting of a single coherent precipitate," Materials Research Society Symposium Proceedings, 1998.
[15] M. Doi, "Coarsening behavior of coherent precipitates in elastically constrained systems – with particular emphasis on gamma-prime precipitates in nickel-base alloys," Mater. Trans. Japan. Inst. Metals, 33, 637, 1992.
[16] A.G. Khachaturyan, Theory of Structural Transformations in Solids, John Wiley & Sons, New York, 1983.
[17] W.C. Johnson, "Influence of elastic stress on phase transformations," In: H.I. Aaronson (ed.), Lectures on the Theory of Phase Transformations, The Minerals, Metals & Materials Society, 35–134, 1999.
[18] J.W. Cahn, "Coherent fluctuations and nucleation in isotropic solids," Acta Met., 10, 907–913, 1962.
[19] J.W. Cahn, "On spinodal decomposition in cubic solids," Acta Met., 10, 179, 1962.
[20] J.W. Cahn, "Coherent two-phase equilibrium," Acta Met., 14, 83, 1966.
[21] J.W. Cahn, "Coherent stress in elastically anisotropic crystals and its effect on diffusional processes," In: The Mechanism of Phase Transformations in Crystalline Solids, The Institute of Metals, London, 1, 1969.
[22] J.W. Cahn and J.E. Hilliard, "Free energy of a nonuniform system. I. Interfacial free energy," J. Chem. Phys., 28(2), 258–267, 1958.
[23] J.W. Christian, The Theory of Transformations in Metals and Alloys, Pergamon Press, Oxford, 1975.
[24] A.J. Ardell, "The Ni-Ni3Al phase diagram: thermodynamic modelling and the requirements of coherent equilibrium," Modell. Simul. Mater. Sci. Eng., 8, 277–286, 2000.
[25] K.C. Russell, "Nucleation in solids," In: Phase Transformations, ASM, Materials Park, OH, 219–268, 1970.
[26] H.I. Aaronson and J.K. Lee, "The kinetic equations of solid → solid nucleation theory and comparisons with experimental observations," In: H.I. Aaronson (ed.), Lectures on the Theory of Phase Transformations, TMS, 165–229, 1999.
[27] Y.H. Wen, Y. Wang et al., "Phase-field simulation of domain structure evolution during a coherent hexagonal-to-orthorhombic transformation," Phil. Mag. A, 80(9), 1967–1982, 2000.
[28] J.P. Simmons, C. Shen et al., "Phase field modeling of simultaneous nucleation and growth by explicitly incorporating nucleation events," Scripta Mater., 43, 935–942, 2000.
[29] C. Shen, J.P. Simmons et al., "Modeling nucleation during coherent transformations in crystalline solids," (to be submitted), 2004.
[30] Y.H. Wen, J.P. Simmons et al., "Phase-field modeling of bimodal particle size distributions during continuous cooling," Acta Mater., 51(4), 1123–1132, 2003.
[31] W.C. Johnson and P.W. Voorhees, Solid State Phenomena, 23–24, 87, 1992.
[32] Y.S. Yoo, Ph.D. dissertation, Korea Advanced Institute of Science and Technology, Taejon, Korea, 1993.
[33] A.G. Khachaturyan, "Some questions concerning the theory of phase transformations in solids," Sov. Phys. Solid State, 8, 2163, 1967.
[34] A.G. Khachaturyan and G.A. Shatalov, "Elastic interaction potential of defects in a crystal," Sov. Phys. Solid State, 11, 118, 1969.
[35] J.D. Eshelby, "The determination of the elastic field of an ellipsoidal inclusion, and related problems," Proc. R. Soc. A, 241, 376–396, 1957.
[36] J.D. Eshelby, "The elastic field outside an ellipsoidal inclusion," Proc. R. Soc. A, 252, 561, 1959.
[37] L.E. Malvern, Introduction to the Mechanics of a Continuous Medium, Prentice-Hall, Englewood Cliffs, 1969.
[38] D.Y. Li and L.Q. Chen, "Shape of a rhombohedral coherent Ti11Ni14 precipitate in a cubic matrix and its growth and dissolution during constrained aging," Acta Mater., 45(6), 2435–2442, 1997.
[39] Y.U. Wang, Y.M. Jin et al., "Phase field microelasticity theory and modeling of elastically and structurally inhomogeneous solid," J. Appl. Phys., 92(3), 1351–1360, 2002.
[40] Y. Wang, D. Banerjee et al., "Field kinetic model and computer simulation of precipitation of L12 ordered intermetallics from fcc solid solution," Acta Mater., 46(9), 2983–3001, 1998.
[41] L.D. Landau and E.M. Lifshitz, Statistical Physics, Pergamon Press, Oxford, New York, 1980.
[42] P. Tolédano and V. Dimitriev, Reconstructive Phase Transitions: In Crystals and Quasicrystals, World Scientific, Singapore, River Edge, NJ, 1996.
[43] H. Pottebohm, G. Neite et al., "Elastic properties (the stiffness constants, the shear modulus and the dislocation line energy and tension) of Ni-Al solid solutions and of the Nimonic alloy PE16," Mat. Sci. Eng., 60, 189, 1983.
[44] J. Vicens and P. Delavignette, Phys. Stat. Sol., A33, 497, 1976.
[45] R. Sinclair and J. Dutkiewicz, Acta Met., 25, 235, 1977.
[46] L.A. Bendersky and W.J. Boettinger, "Transformation of bcc and B2 high-temperature phases to hcp and orthorhombic structures in the Ti-Al-Nb system. 2. Experimental TEM study of microstructures," J. Res. Natl. Inst. Stand. Technol., 98(5), 585–606, 1993.
[47] Y.H. Wen, Y. Wang et al., "Coarsening dynamics of self-accommodating coherent patterns," Acta Mater., 50, 13–21, 2002.
[48] Y.M. Jin, A. Artemev et al., "Three-dimensional phase field model of low-symmetry martensitic transformation in polycrystal: simulation of ζ2 martensite in AuCd alloys," Acta Mater., 49, 2309–2320, 2001.
[49] S. Aoki, K. Morii et al., "Self-accommodation of ζ2 martensite in a Au-49.5%Cd alloy," Solid → Solid Phase Transformations, TMS, Warrendale, PA, 1994.
[50] N. Provatas, N. Goldenfeld et al., "Efficient computation of dendritic microstructures using adaptive mesh refinement," Phys. Rev. Lett., 80, 3308–3311, 1998.
[51] D. Wang and J. Pan, "A wavelet-Galerkin scheme for the phase field model of microstructural evolution of materials," Comput. Mat. Sci., 29, 221–242, 2004.
[52] A. Karma and W.-J. Rappel, "Quantitative phase-field modeling of dendritic growth in two and three dimensions," Phys. Rev. E, 57(4), 4323–4349, 1998.
[53] K.R. Elder and M. Grant, "Sharp interface limits of phase-field models," Phys. Rev. E, 64, 021604, 2001.
[54] C. Shen, Q. Chen et al., "Increasing length scale of quantitative phase field modeling of growth-dominant or coarsening-dominant process," Scripta Mater., 50, 1023–1028, 2004.
[55] C. Shen, Q. Chen et al., "Increasing length scale of quantitative phase field modeling of concurrent growth and coarsening processes," Scripta Mater., 50, 1029–1034, 2004.
[56] J.Z. Zhu, Z.K. Liu et al., "Linking phase-field model to CALPHAD: application to precipitate shape evolution in Ni-base alloys," Scripta Mater., 46, 401–406, 2002.
7.5 FERROIC DOMAIN STRUCTURES USING GINZBURG–LANDAU METHODS

Avadh Saxena and Turab Lookman
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
We present a strain-based formalism for domain wall formation and microstructure in ferroic materials within a Ginzburg–Landau framework. Certain components of the strain tensor serve as the order parameter for the transition. Elastic compatibility is explicitly included as an anisotropic, long-range interaction between the order parameter strain components. Our method is compared with the phase-field method and with that used by the applied mathematics community. We consider representative free energies for a two-dimensional triangle-to-rectangle transition and a three-dimensional cubic-to-tetragonal transition. We also provide illustrative simulation results for the two-dimensional case and compare the constitutive response of a polycrystal with that of a single crystal.

Many minerals and materials of technological interest, in particular martensites [1] and shape memory alloys [2], undergo a structural phase transformation from one crystal symmetry to another as the temperature or pressure is varied. If the two structures have a simple group–subgroup relationship, such a transformation is called displacive, e.g., the cubic-to-tetragonal transformation in FePd. However, if the two structures do not have such a relationship, the transformation is referred to as replacive or reconstructive [3, 4]. An example is the body-centered cubic (BCC) to hexagonal close-packed (HCP) transformation in titanium.

Structural phase transitions in solids [5, 6] have aroused a great deal of interest for over a century, owing to the crucial role they play in the fundamental understanding of physical concepts as well as to their central importance in developing technologically useful properties. Both the diffusion-controlled replacive (or reconstructive) and the diffusionless displacive martensitic transformations have been studied, although the former have received far more
attention, simply because their reaction kinetics are much more conducive to control and manipulation than those of the latter. We consider here a particular class of materials known as ferroelastic martensites. Ferroelastics are a subclass of materials known as ferroics [4], i.e., a non-zero tensor property appears below a phase transition. Some examples include ferromagnetic and ferroelectric materials. In some cases more than one ferroic property may coexist, e.g., in magnetoelectrics; such materials are called multi-ferroics. The term martensitic refers to a diffusionless first order phase transition which can be described in terms of one (or several successive) shear deformation(s) from a parent to a product phase [1]. The transition results in a characteristic lamellar microstructure due to transformation twinning. The morphology and kinetics of the transition are dominated by the strain energy.

Ferroelasticity is defined by the existence of two or more stable orientation states of a crystal that correspond to different arrangements of the atoms but are structurally identical or enantiomorphous [4, 5]. In addition, these orientation states are degenerate in energy in the absence of mechanical stress. Salient features of ferroelastic crystals include mechanical hysteresis and mechanically (reversibly) switchable domain patterns. Usually ferroelasticity occurs as a result of a phase transition from a non-ferroelastic high-symmetry “prototype” phase and is associated with the softening of an elastic modulus with decreasing temperature or increasing pressure in the prototype phase. Since the ferroelastic transition is normally weakly first order, or second order, it can be described to a good approximation by the Landau theory [7] with the spontaneous strain as the order parameter. Depending on whether the spontaneous strain, which describes the deviation of a given ferroelastic orientation state from the prototype phase, is the primary or a secondary order parameter, the low symmetry phase is called a proper or an improper ferroelastic, respectively. While martensites are proper ferroelastics, examples of improper ferroelastics include ferroelectrics and magnetoelastics.

There is a small class of materials (either metals or alloy systems) which are both martensitic and ferroelastic and exhibit the shape memory effect [2]. They are characterized by highly mobile twin boundaries and (often) show precursor structures (such as tweed and modulated phases) above the transition. Furthermore, these materials have a small Bain strain, elastic shear modulus softening, and a weakly to moderately first order transition. Some examples include In1−xTlx, FePd, CuZn, CuAlZn, CuAlNi, AgCd, AuCd, CuAuZn2, NiTi and NiAl. In many of these transitions intra-unit cell distortion modes (or shuffles) can couple to the strain as either a primary or a secondary order parameter. NiTi and titanium represent two such examples of technological importance. Additional examples include actinide alloys: the UNb6 shape memory alloy and Ga-stabilized δ-Pu.
1. Landau Theory
To understand the thermodynamics of the phase transformation and the phase diagram, a free energy of the transformation is needed. This Landau free energy (LFE) is a symmetry-allowed polynomial expansion in the order parameter that characterizes the transformation [7], e.g., strain tensor components and/or (intra-unit cell) shuffle modes. A minimization of this LFE with respect to the order parameter components leads to conditions that give the phase diagram. Derivatives of the LFE with respect to temperature, pressure and other relevant thermodynamic variables provide information about the specific heat, entropy, susceptibility, etc. To study domain walls between different orientational variants (i.e., twin boundaries) or different shuffle states (i.e., antiphase boundaries), symmetry-allowed strain gradient terms or shuffle gradient terms must be added to the Landau free energy. These gradient terms are called Ginzburg terms, and the augmented free energy is referred to as the Ginzburg–Landau free energy (GLFE). Variation of the GLFE with respect to the order parameter components leads to (Euler–Lagrange) equations [8] whose solution gives the microstructure.

In two dimensions we define the symmetry-adapted dilatation (area change), deviatoric and shear strains [8, 9], respectively, as functions of the Lagrangian strain tensor components $\epsilon_{ij}$:

$$e_1 = \frac{1}{\sqrt{2}}(\epsilon_{xx} + \epsilon_{yy}), \qquad e_2 = \frac{1}{\sqrt{2}}(\epsilon_{xx} - \epsilon_{yy}), \qquad e_3 = \epsilon_{xy}. \qquad (1)$$
As an example, the Landau free energy for a triangular to (centered) rectangular transition is given by [10, 11]

$$F(e_2, e_3) = \frac{A}{2}(e_2^2 + e_3^2) + \frac{B}{3}(e_2^3 - 3e_2 e_3^2) + \frac{C}{4}(e_2^2 + e_3^2)^2 + \frac{A_1}{2}e_1^2, \qquad (2)$$
where A is the shear modulus, A1 is the bulk modulus, and B and C are third and fourth order elastic constants, respectively. Below the transition temperature (Tc), this free energy without the non-order-parameter strain (e1) term has three minima in (e2, e3), corresponding to the three rectangular variants. Above Tc it has only one global minimum at e2 = e3 = 0, associated with the stable triangular lattice. Since the shear modulus softens (partially) above Tc, we have A = A0(T − Tc). In three dimensions we define the symmetry-adapted strains as [8]

$$e_1 = \frac{1}{\sqrt{3}}(\epsilon_{xx} + \epsilon_{yy} + \epsilon_{zz}), \quad e_2 = \frac{1}{\sqrt{2}}(\epsilon_{xx} - \epsilon_{yy}), \quad e_3 = \frac{1}{\sqrt{6}}(\epsilon_{xx} + \epsilon_{yy} - 2\epsilon_{zz}), \quad e_4 = \epsilon_{xy}, \quad e_5 = \epsilon_{yz}, \quad e_6 = \epsilon_{xz}. \qquad (3)$$
As an example, the Landau part of the elastic free energy for a cubic to tetragonal transition in terms of the symmetry-adapted strain components is given by [8, 12, 13]

$$F(e_2, e_3) = \frac{A}{2}(e_2^2 + e_3^2) + \frac{B}{3}(e_2^3 - 3e_2 e_3^2) + \frac{C}{4}(e_2^2 + e_3^2)^2 + \frac{A_1}{2}e_1^2 + \frac{A_4}{2}(e_4^2 + e_5^2 + e_6^2), \qquad (4)$$
where A1, A and A4 are the bulk, deviatoric and shear moduli, respectively, B and C denote third and fourth order elastic constants, and (e2, e3) are the order parameter deviatoric strain components. The non-order-parameter dilatation (e1) and shear (e4, e5, e6) strains are included to harmonic order. For studying domain walls (i.e., twinning) and microstructure, this free energy must be augmented [12] by symmetry-allowed gradients of (e2, e3). The plot of the free energy in Eq. (4) without the non-order-parameter strain contributions (i.e., the compression and shear terms) is identical to that of the two-dimensional case, Eq. (2), except that the three minima now correspond to the three tetragonal variants.

The coefficients in the GLFE are determined from a combination of experimental structural (lattice parameter variation as a function of temperature or pressure), vibrational (e.g., phonon dispersion curves along different high symmetry directions) and thermodynamic data (entropy, specific heat, elastic constants, etc.). Where sufficient experimental data are not available, electronic structure calculations and molecular dynamics simulations (using appropriate atomistic potentials) can provide the relevant information to determine some or all of the coefficients in the GLFE. For simple phase transitions (e.g., the two-dimensional square-to-rectangle transition [8, 9] or those involving only a one-component order parameter [14]) the GLFE can be written down by inspection (from the symmetry of the parent phase). However, in general the GLFE must be determined by group-theoretic means, which are now readily available for all 230 crystallographic space groups in three dimensions and (by projection) for all 17 space groups in two dimensions [14] (see the computer program ISOTROPY by Stokes and Hatch [15]).
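The triple-well structure of Eq. (2) below Tc is easy to visualize numerically. The sketch below (with arbitrary placeholder values for A, B and C, and the e1 term dropped) evaluates F(e2, e3) on a grid and locates the three degenerate wells, which sit at 120° intervals in the (e2, e3) plane:

import numpy as np

def landau_F(e2, e3, A=-1.0, B=-3.0, C=2.0):
    """Landau free energy of Eq. (2) with the e1 term dropped; A < 0 below Tc."""
    return (A / 2) * (e2**2 + e3**2) + (B / 3) * (e2**3 - 3 * e2 * e3**2) \
        + (C / 4) * (e2**2 + e3**2) ** 2

e = np.linspace(-2.0, 2.0, 801)
E2, E3 = np.meshgrid(e, e, indexing='ij')
F = landau_F(E2, E3)

# The three degenerate minima correspond to the three product variants.
i, j = np.unravel_index(np.argmin(F), F.shape)
print(f"well radius ~ {np.hypot(E2[i, j], E3[i, j]):.3f}")
mask = np.isclose(F, F.min(), atol=1e-3)
print("well angles (deg):", np.unique(np.round(np.degrees(np.arctan2(E3[mask], E2[mask])))))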
2. Microstructure
There are several different but related ways of modeling the microstructure in structural phase transformations: (i) GLFE-based, as described above [8]; (ii) the phase-field model, in which strain variables are coupled in a symmetry-allowed manner to the morphological variables [6]; (iii) sharp-interface models used by applied mathematicians [16, 17].
The natural order parameters in the GLFE are strain tensor components. However, until recent years researchers have simulated the microstructure in displacement variables by rewriting the free energy in displacement variables [10, 13]. This procedure leads to the microstructure without providing direct physical insight into the evolution. A natural way to bring out the insight is to work in strain variables only. However, if the lattice integrity is maintained during the phase transformation, that is, no dislocation (or topological defect) generation is allowed, then one must obey the St. Venant elastic compatibility constraints, because the various strain tensor components are derived from the displacement field and are not all independent. This can be achieved by minimizing the free energy with the compatibility constraints treated with Lagrange multipliers [9, 11]. This procedure leads to an anisotropic long-range interaction between the order parameter strain components. The interaction (or compatibility potential) provides direct insight into the domain wall orientations and various aspects of the microstructure in general. Mathematically, the elastic compatibility condition on the "geometrically linear" strain tensor $\boldsymbol{\epsilon}$ is given by [18]

$$\nabla \times (\nabla \times \boldsymbol{\epsilon}) = 0, \qquad (5)$$

which is one equation in two dimensions connecting the three components of the symmetric strain tensor, $\epsilon_{xx,yy} + \epsilon_{yy,xx} = 2\epsilon_{xy,xy}$. In three dimensions it is two sets of three equations, each connecting the six components of the symmetric strain tensor ($\epsilon_{yy,zz} + \epsilon_{zz,yy} = 2\epsilon_{yz,yz}$ and two permutations of x, y, z; $\epsilon_{xx,yz} + \epsilon_{yz,xx} = \epsilon_{xy,xz} + \epsilon_{xz,xy}$ and two permutations of x, y, z). For periodic boundary conditions it becomes an algebraic equation in Fourier space, which is then easy to incorporate as a constraint.

For the free energy in Eq. (2), the Euler–Lagrange variation of [F − G] with respect to the non-OP strain e1 is then [11, 14] δ(F^c − G)/δe1 = 0, where G denotes the constraint equation, Eq. (5), Λ is a Lagrange multiplier, and $F^c = (A_1/2)\int e_1^2\,\mathrm{d}\mathbf{r}$, identically equal to $\sum_{\mathbf{k}} F^c(\mathbf{k})$. The variation gives (in k space, assuming periodic boundary conditions)

$$e_1(\mathbf{k}) = \frac{(k_x^2 + k_y^2)\,\Lambda(\mathbf{k})}{A_1}. \qquad (6)$$

We then put e1(k) back into the compatibility constraint condition, Eq. (5), and solve for the Lagrange multiplier Λ(k). Thus e1(k) is expressed in terms of e2(k) and e3(k), and

$$F^c(\mathbf{k}) = \frac{A_1}{2}\left|\frac{(k_x^2 - k_y^2)\,e_2(\mathbf{k})}{k^2} + \frac{2\sqrt{2}\,k_x k_y\,e_3(\mathbf{k})}{k^2}\right|^2, \qquad (7)$$
identically equal to $(1/2)\,A_1\,U_{ll'}(\hat{\mathbf{k}})\,e_l(\mathbf{k})\,e_{l'}(\mathbf{k})$ with l, l' = 2, 3, which is used in a (static) free energy variation of the order parameter strains. The (static) "compatibility kernel" $U(\hat{\mathbf{k}})$ is independent of $|\mathbf{k}|$ and therefore only orientationally dependent: $U(\mathbf{k}) \to U(\hat{\mathbf{k}})$. In coordinate space this is an anisotropic long-range ($\sim 1/r^2$) potential mediating the elastic interactions of the primary order parameter strain. From these compatibility kernels one can obtain domain wall orientations, parent-product interface (i.e., "habit plane") orientations and local rotations [14] consistent with those obtained previously using macroscopic matching conditions and symmetry considerations [19, 20].

The concept of elastic compatibility in a single crystal can be readily generalized to polycrystals by defining the strain tensor components in a global frame of reference [21]. By adding a stress term (bilinear in strain) to the free energy, one can compute the stress-strain constitutive response in the presence of microstructure for both single crystals and polycrystals and compare the recoverable strain upon cycling. Grain rotation and grain boundaries play an important role when polycrystals are subjected to external stress in the presence of a structural transition. Similarly, the calculation of the constitutive response can be generalized to improper ferroelastic materials, such as those driven by shuffle modes, ferroelectrics and magnetoelastics.
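A short numerical check makes the orientation-only dependence of the kernel explicit. In the sketch below (illustrative only; the entries shown are simply products of the orientation factors read off from Eq. (7), and the k-point radii are arbitrary), the factors are evaluated on rings of two different radii and the printed rows coincide:

import numpy as np

def kernel_factors(kx, ky):
    """Orientation factors entering F^c(k) = (A1/2)|O2 e2 + O3 e3|^2 of Eq. (7)."""
    k2 = kx**2 + ky**2
    return (kx**2 - ky**2) / k2, 2.0 * np.sqrt(2.0) * kx * ky / k2

theta = np.linspace(0.0, 2.0 * np.pi, 9)[:-1]          # eight k-hat directions
for r in (1.0, 10.0):                                  # two very different |k|
    O2, O3 = kernel_factors(r * np.cos(theta), r * np.sin(theta))
    print(np.round(O2 * O2, 3), np.round(O2 * O3, 3))  # identical rows for both r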
3. Dynamics and Simulations
The overdamped (or relaxational) dynamics, $\dot{e} = -(1/A')\,\delta(F + F^c)/\delta e$, where A' is a friction coefficient and F^c is the long-range contribution to the free energy due to elastic compatibility, can be used in simulations to obtain the equilibrium microstructure. However, if the evolution of an initial non-equilibrium structure to the equilibrium state is important, one can use inertial strain dynamics with appropriate dissipation terms included in the free energy. The strain dynamics for the order parameter strain tensor components $e_l$ is given by [11]

$$\rho_0\,\ddot{e}_l = \frac{c_l^2}{4}\,\nabla^2\left[\frac{\delta(F + F^c)}{\delta e_l} + \frac{\delta(R + R^c)}{\delta \dot{e}_l}\right], \qquad (8)$$
where ρ0 is a scaled mass density, c_l is a symmetry-specific constant, $R = (A'/2)\,\dot{e}_l^2$ is the Rayleigh dissipation, and R^c is the contribution to the dissipation due to the long-range elastic interaction. We replace the compressional free energy in Eq. (2) with the corresponding long-range elastic energy in the order parameter strains and include a gradient term $F_G = (K/2)[(\nabla e_2)^2 + (\nabla e_3)^2]$, where the gradient coefficient K determines the elastic domain wall energy and can be estimated from the phonon dispersion curves. Simulations performed with the full underdamped dynamics for the triangle to centered rectangle transition are depicted in Fig. 1. The equilibrium microstructure is essentially the same as that found from the overdamped dynamics. The three shades of gray represent the three rectangular variants (or orientations) in the martensite phase.
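For the overdamped limit quoted at the start of this section, a minimal relaxational sketch is given below (illustrative only: all parameter values are placeholders, and the long-range compatibility contribution F^c is omitted, so only the Landau force from Eq. (2) and the Ginzburg gradient term drive the evolution):

import numpy as np

N, dx, dt, Aprime, K = 128, 1.0, 0.01, 1.0, 1.0      # placeholder parameters
A, B, C = -1.0, -3.0, 2.0                            # Landau coefficients, T < Tc
e2 = 0.01 * np.random.randn(N, N)                    # small random initial strains
e3 = 0.01 * np.random.randn(N, N)

def laplacian(f):
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f) / dx**2

for step in range(2000):
    # delta F / delta e_l from Eq. (2), plus -K lap(e_l) from the Ginzburg term
    dF2 = A * e2 + B * (e2**2 - e3**2) + C * (e2**2 + e3**2) * e2 - K * laplacian(e2)
    dF3 = A * e3 - 2 * B * e2 * e3 + C * (e2**2 + e3**2) * e3 - K * laplacian(e3)
    e2 -= dt / Aprime * dF2                          # overdamped update
    e3 -= dt / Aprime * dF3

# e2, e3 now hold a relaxed variant pattern (without elastic compatibility).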
Figure 1. A simulated microstructure below the transition temperature for the triangle to rectangle transition. The three shades of gray represent the three rectangular variants.
A similar microstructure has been observed in lead orthovanadate Pb3(VO4)2 crystals [22]. This has also been simulated in the overdamped limit by phase-field [23] and displacement-based simulations of Ginzburg–Landau models [10]. The 3D cubic to tetragonal transition (free energy in Eq. (4)) can be simulated either using the strain-based formalism outlined here [12] or directly using the displacements [13]. In Fig. 2 we depict the microstructure evolution for the cubic to tetragonal transition in FePd, mimicked by a square to rectangle transition. To simulate mechanical loading of a polycrystal [21], an external tensile stress σ is applied quasi-statically: starting from the unstressed configuration of panel (a), the applied stress σ is increased in steps of 5.13 MPa, allowing the configuration to relax for t* = 25 time steps after each increment. The loading is continued until a maximum stress of σ = 200 MPa is reached in panel (e). Thereafter, the system is unloaded by
Figure 2. Comparison of the constitutive response for a single crystal and a polycrystal for FePd parameters. The four right panels show the single crystal microstructure and the four left panels depict the polycrystal microstructure.
decreasing σ to zero at the same rate at which it was loaded; see panel (g). Panel (c) relates to a stress level of σ = 46.15 MPa during the loading process. The favored (rectangular) variants have started to grow at the expense of the unfavored (differently oriented rectangular) variants. The orientation distribution does not change much. As the stress level is increased further, the favored variants grow. Even at the maximum stress of 200 MPa, some unfavored variants persist, as is clear from panel (e). We note that the grains with large misorientation with the loading direction rotate. Grains with lower misorientation do not undergo significant rotation. The mechanism of this rotation is the tendency of the system to maximize the transformation strain in the direction of loading so that the total free energy is minimized [21]. Within the grains that rotate, sub-grain bands are present which correspond to the unfavored strain variants that still survive. Panel (g) depicts the situation after unloading to σ = 0. Upon removing the load, a domain structure is nucleated again due to the local strains at the grain boundaries and the surviving unfavored variants in the loaded polycrystal configuration in panel (e). This domain structure is not the same as that prior to loading, see panel (a), and thus there is an underlying hysteresis. The unloaded configuration has non-zero average strain. This average strain is recovered by heating to the austenite phase, as per the shape memory effect [2]. Note also that the orientation distribution reverts to its preloading state as the grains rotate back when the load is removed. We compare the above mechanical behavior of the polycrystal to the corresponding single crystal. The recoverable strain for the polycrystal is smaller than that for the single crystal due to nucleation of domains at grain boundaries upon unloading. In addition, the transformation in the stress–strain curve for the polycrystal is not abrupt because the response of the polycrystal is averaged over all grain orientations.
4. Comparison with Other Methods
We compare our approach, which is based on the work of Barsch and Krumhansl [8], with two other methods that make use of Landau theory to model structural transformations. Here we provide a brief outline of the differences; the methods are compared and reviewed in detail in Ref. [24]. Khachaturyan and coworkers [6, 23, 25] have used a free energy in which a “structural” or “morphological” order parameter, η, is coupled to strains. This order parameter is akin to a “shuffle” order parameter [26], and the inhomogeneous strain contribution is evaluated using the method of Eshelby [6]. The strains are then effectively removed in favor of the η's, and the minimization is carried out for these variables. This approach (sometimes referred to as
“phase-field”) applied to improper ferroelastics is essentially the same as our approach, with minor differences in the way the inhomogeneous strain contribution is evaluated. However, for the proper ferroelastics that are driven by strain rather than shuffle, essentially the same procedure is used with phase-field; that is, the minimization (through relaxation methods) is ultimately for the η's rather than the strains. In our approach, the non-linear free energy is written up front in terms of the relevant strain order parameters, with the discrete symmetry of the transformation taken into account. Terms that are gradients in strains, which provide the cost of creating domain walls, are also added according to the symmetries. The free energy is then minimized with respect to the strains. That the microstructure for proper ferroelastics obtained from either method appears qualitatively similar is not surprising. Although the free energy minima or equilibrium states are the same from either procedure, differences in the details of the free energy landscape would be expected to exist. These could affect, for example, the microstructure associated with metastable states.

Our method and that developed by the applied mechanics community [16, 17] share the common feature of minimizing a free energy written in terms of strains. That method is ideally suited for laminate microstructures with domain walls that are atomistically sharp. This sharp-interface limit means that incoherent strains are incorporated through the use of the Hadamard jump condition [16, 17]. The method takes finite deformation into account and has served as an optimization procedure for obtaining static, equilibrium structures, given certain volume fractions of variants. Our approach differs in that we use a continuum formulation with interfaces of finite width, so that the incoherent strains are taken into account through the compatibility relation [9, 11]. In addition, we solve the full evolution equations, so that we can study kinetics and the effects of inertia.
5. Ferroic Transitions
Above we considered proper ferroelastic transitions. This method can be readily extended (including the Ginzburg–Landau free energy and elastic compatibility) to the study of improper ferroelastics (e.g., shuffle-driven transitions such as in NiTi [26]), proper ferroelectrics such as BaTiO3 [27–29], improper ferroelectrics such as SrTiO3 [30], and magnetoelastics and magnetic shape memory alloys, e.g., Ni2MnGa [31], by including a symmetry-allowed coupling between the shuffle modes (or polarization or magnetization) and the appropriate strain tensor components. However, now the elastic energy is considered only up to harmonic order, whereas the primary order parameter has anharmonic contributions. For example, for a two-dimensional
ferroelectric transition on a square lattice the Ginzburg–Landau free energy is given by [25, 32]:

$$F(\vec{P}) = \alpha_1(P_x^2 + P_y^2) + \alpha_{11}(P_x^4 + P_y^4) + \alpha_{12} P_x^2 P_y^2 + \alpha_{111}(P_x^6 + P_y^6) + \alpha_{112}(P_x^2 P_y^4 + P_x^4 P_y^2) + \frac{g_1}{2}(P_{x,x}^2 + P_{y,y}^2) + \frac{g_2}{2}(P_{x,y}^2 + P_{y,x}^2) + g_3 P_{x,x} P_{y,y} + \frac{1}{2}A_1 e_1^2 + \frac{1}{2}A_2 e_2^2 + \frac{1}{2}A_3 e_3^2 + \beta_1 e_1(P_x^2 + P_y^2) + \beta_2 e_2(P_x^2 - P_y^2) + \beta_3 e_3 P_x P_y,$$

where Px and Py are the polarization components. The free energy for a two-dimensional magnetoelastic transition is very similar, with the magnetization (mx, my) replacing the polarization (Px, Py). For specific physical geometries the long-range electric (or magnetic) dipole interaction must be included. Certainly, ferroelectric (and magnetoelastic) transitions can also be modeled by phase-field [33] and other methods [34].

We have presented a strain-based formalism for the study of domain walls and microstructure in ferroic materials within a Ginzburg–Landau free energy framework with the elastic compatibility constraint explicitly taken into account. The latter induces an anisotropic long-range interaction in the primary order parameter (strain in proper ferroelastics such as martensites and shape memory alloys [9, 11], or shuffle, polarization or magnetization in improper ferroelastics [28, 32]). We compared this method with the widely used phase-field method [6, 23, 25] and with the formalism used by the applied mathematics and mechanics community [16, 17, 34]. We also discussed the underdamped strain dynamics for the evolution of microstructure and compared the constitutive response of a single crystal with that of a polycrystal.

Finally, we briefly mention four other related topics that can be modeled within the Ginzburg–Landau formalism. (i) Some martensites show strain modulation (or tweed precursors) above the martensitic phase transition. These are believed to be caused by disorder such as compositional fluctuations, and they can be modeled and simulated by including a symmetry-allowed coupling of strain to compositional fluctuations in the free energy [9, 35, 36]. Similarly, symmetry-allowed couplings of polarization (magnetization) with polar (magnetic) disorder can lead to polar [37] (magnetic [38]) tweed precursors. (ii) Some martensites exhibit supermodulated phases [39] (e.g., 5R, 7R, 9R), which can be modeled within the Landau theory in terms of a particular phonon softening [40] (and its harmonics) and its coupling to the transformation shear. (iii) Elasticity at the nanoscale can differ from macroscopic continuum elasticity. In this case one must go beyond the usual elastic tensor components and include intra-unit cell modes [41]. (iv) The results presented here are relevant for displacive transformations, i.e., when the parent and product crystal
structures have a group-subgroup symmetry relationship. However, reconstructive transformations [3], e.g., BCC to HCP transitions, do not have a group– subgroup relationship. Nevertheless, the Ginzburg–Landau formalism can be generalized to these transformations [42]. Notions of a transcendental order parameter [3] and irreversibility [43] have also been invoked to model the reconstructive transformations.
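As an illustration of the polynomial part of the ferroelectric free energy above, the following sketch (all coefficient values are placeholders chosen only to yield four degenerate wells on the ±x and ±y axes; the gradient and strain-coupling terms are dropped) evaluates F(P) on a grid and locates its minima:

import numpy as np

def F_homog(Px, Py, a1=-1.0, a11=0.5, a12=1.5, a111=0.1, a112=0.1):
    """Polynomial part of F(P) with gradient and strain-coupling terms dropped."""
    return (a1 * (Px**2 + Py**2) + a11 * (Px**4 + Py**4) + a12 * Px**2 * Py**2
            + a111 * (Px**6 + Py**6) + a112 * (Px**2 * Py**4 + Px**4 * Py**2))

p = np.linspace(-1.5, 1.5, 601)
PX, PY = np.meshgrid(p, p, indexing='ij')
F = F_homog(PX, PY)
mask = np.isclose(F, F.min(), atol=1e-4)
print(np.round(PX[mask], 2), np.round(PY[mask], 2))  # four wells on the axes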
Acknowledgments

We acknowledge collaboration with R. Ahluwalia, K.H. Ahn, R.C. Albers, A.R. Bishop, T. Castán, D.M. Hatch, A. Planes, K.Ø. Rasmussen and S.R. Shenoy. This work was supported by the US Department of Energy.
References

[1] Z. Nishiyama, Martensitic Transformations, Academic, New York, 1978.
[2] K. Otsuka and C.M. Wayman (eds.), Shape Memory Materials, Cambridge University Press, Cambridge, 1998; MRS Bull., 27, 2002.
[3] P. Tolédano and V. Dimitriev, Reconstructive Phase Transitions, World Scientific, Singapore, 1996.
[4] V.K. Wadhawan, Introduction to Ferroic Materials, Gordon and Breach, Amsterdam, 2000.
[5] E.K.H. Salje, Phase Transformations in Ferroelastic and Co-elastic Solids, Cambridge University Press, Cambridge, UK, 1990.
[6] A.G. Khachaturyan, Theory of Structural Transformations in Solids, Wiley, New York, 1983.
[7] J.C. Tolédano and P. Tolédano, The Landau Theory of Phase Transitions, World Scientific, Singapore, 1987.
[8] G.R. Barsch and J.A. Krumhansl, Phys. Rev. Lett., 53, 1069, 1984; G.R. Barsch and J.A. Krumhansl, Metallurg. Trans., A18, 761, 1988.
[9] S.R. Shenoy, T. Lookman, A. Saxena, and A.R. Bishop, Phys. Rev. B, 60, R12537, 1999.
[10] S.H. Curnoe and A.E. Jacobs, Phys. Rev. B, 63, 094110, 2001.
[11] T. Lookman, S.R. Shenoy, K.Ø. Rasmussen, A. Saxena, and A.R. Bishop, Phys. Rev. B, 67, 024114, 2003.
[12] K.Ø. Rasmussen, T. Lookman, A. Saxena, A.R. Bishop, R.C. Albers, and S.R. Shenoy, Phys. Rev. Lett., 87, 055704, 2001.
[13] A.E. Jacobs, S.H. Curnoe, and R.C. Desai, Phys. Rev. B, 68, 224104, 2003.
[14] D.M. Hatch, T. Lookman, A. Saxena, and S.R. Shenoy, Phys. Rev. B, 68, 104105, 2003.
[15] H.T. Stokes and D.M. Hatch, Isotropy Subgroups of the 230 Crystallographic Space Groups, World Scientific, Singapore, 1988. (The software package ISOTROPY is available at http://www.physics.byu.edu/~stokesh/isotropy.html, 1991.)
[16] J.M. Ball and R.D. James, Arch. Rational Mech. Anal., 100, 13, 1987.
[17] R.D. James and K.F. Hane, Acta Mater., 48, 197, 2000.
[18] S.F. Borg, Fundamentals of Engineering Elasticity, World Scientific, Singapore, 1990; M. Baus and R. Lovett, Phys. Rev. Lett., 65, 1781, 1990; M. Baus and R. Lovett, Phys. Rev. A, 44, 1211, 1991.
[19] J. Sapriel, Phys. Rev. B, 12, 5128, 1975.
[20] C. Boulesteix, B. Yangui, M. Ben Salem, C. Manolikas, and S. Amelinckx, J. Phys., 47, 461, 1986.
[21] R. Ahluwalia, T. Lookman, and A. Saxena, Phys. Rev. Lett., 91, 055501, 2003; R. Ahluwalia, T. Lookman, A. Saxena, and R.C. Albers, Acta Mater., 52, 209, 2004.
[22] C. Manolikas and S. Amelinckx, Phys. Stat. Sol., (a) 60, 607, 1980; C. Manolikas and S. Amelinckx, Phys. Stat. Sol., 61, 179, 1980.
[23] Y.H. Wen, Y.Z. Wang, and L.Q. Chen, Philos. Mag. A, 80, 1967, 2000.
[24] T. Lookman, S.R. Shenoy, and A. Saxena, to be published.
[25] H.L. Hu and L.Q. Chen, Mater. Sci. Eng., A238, 182, 1997.
[26] G.R. Barsch, Mater. Sci. Forum, 327–328, 367, 2000.
[27] W. Cao and L.E. Cross, Phys. Rev. B, 44, 5, 1991.
[28] S. Nambu and D.A. Sagala, Phys. Rev. B, 50, 5838, 1994.
[29] A.J. Bell, J. Appl. Phys., 89, 3907, 2001.
[30] W. Cao and G.R. Barsch, Phys. Rev. B, 41, 4334, 1990.
[31] A.N. Vasil'ev, A.D. Bozhko, V.V. Khovailo, I.E. Dikshtein, V.G. Shavrov, V.D. Buchelnikov, M. Matsumoto, S. Suzuki, T. Takagi, and J. Tani, Phys. Rev. B, 59, 1113, 1999.
[32] R. Ahluwalia and W. Cao, Phys. Rev. B, 63, 012103, 2001.
[33] Y.L. Li, S.Y. Hu, Z.K. Liu, and L.Q. Chen, Appl. Phys. Lett., 78, 3878, 2001.
[34] Y.C. Shu and K. Bhattacharya, Phil. Mag. B, 81, 2021, 2001.
[35] S. Kartha, J.A. Krumhansl, J.P. Sethna, and L.K. Wickham, Phys. Rev. B, 52, 803, 1995.
[36] T. Castán, A. Planes, and A. Saxena, Phys. Rev. B, 67, 134113, 2003.
[37] O. Tikhomirov, H. Jiang, and J. Levy, Phys. Rev. Lett., 89, 147601, 2002.
[38] Y. Murakami, D. Shindo, K. Oikawa, R. Kainuma, and K. Ishida, Acta Mater., 50, 2173, 2002.
[39] K. Otsuka, T. Ohba, M. Tokonami, and C.M. Wayman, Scr. Metall. Mater., 19, 1359, 1993.
[40] R.J. Gooding and J.A. Krumhansl, Phys. Rev. B, 38, 1695, 1988; R.J. Gooding and J.A. Krumhansl, Phys. Rev. B, 39, 1535, 1989.
[41] K.H. Ahn, T. Lookman, A. Saxena, and A.R. Bishop, Phys. Rev. B, 68, 092101, 2003.
[42] D.M. Hatch, T. Lookman, A. Saxena, and H.T. Stokes, Phys. Rev. B, 64, 060104, 2001.
[43] K. Bhattacharya, S. Conti, G. Zanzotto, and J. Zimmer, Nature, 428, 55, 2004.
7.6 PHASE-FIELD MODELING OF GRAIN GROWTH

Carl E. Krill III
Materials Division, University of Ulm, Albert-Einstein-Allee 47, D-89081 Ulm, Germany
When a polycrystalline material is held at elevated temperature, the boundaries between individual crystallites, or grains, can migrate, thus permitting some grains to grow at the expense of others. Planar sections taken through such a specimen reveal that the net result of this phenomenon of grain growth is a steady increase in the average grain size and, in many cases, the evolution toward a grain size distribution manifesting a characteristic shape independent of the state prior to annealing.

Recognizing the tremendous importance of microstructure to the properties of polycrystalline samples, materials scientists have long struggled to develop a fundamental understanding of the microstructural evolution that occurs during materials processing. In general, this is an extraordinarily difficult task, given the structural variety of the various elements of microstructure, the topological complexities associated with their spatial arrangement and the range of length scales that they span. Even for single-phase samples containing no other defects besides grain boundaries, experimental and theoretical efforts have met with surprisingly limited success, with observations deviating significantly from the predictions of the best analytic models. Consequently, researchers are turning increasingly to computational methods for modeling microstructural evolution.

Perhaps the most impressive evidence for the power of the computational approach is found in its application to single-phase grain growth, for which several successful simulation algorithms have been developed, including Monte Carlo Potts and cellular automata models (both discussed elsewhere in this chapter), and phase-field, front-tracking and vertex approaches. In particular, the phase-field models have proven to be especially versatile, lending themselves to the simulation of growth occurring not only in single-phase systems, but also in the presence of multiple phases or gradients of concentration, strain or temperature. It is no exaggeration to claim that these simulation techniques
have revolutionized the study of grain growth, offering heretofore unavailable insight into the statistical properties of polycrystalline grain ensembles and the detailed nature of the microstructural evolution induced by grain boundary migration.
1. Fundamentals of Grain Growth
From a thermodynamic standpoint, grain growth occurs in a polycrystalline sample because the network of grain boundaries is a source of excess energy with respect to the single-crystalline state. The interfacial excess free energy G_int can be written as the product of the total grain boundary area A_GB and the average excess energy per unit boundary area, γ:

$$G_{\rm tot} = G_{\rm bulk} + G_{\rm int} = G_{\rm X}(T, P) + A_{\rm GB}\,\gamma(T, P), \qquad (1)$$
where G_X(T, P, ...) denotes the free energy of the single-crystalline grain interiors at temperature T and pressure P. Because the specific grain boundary energy γ is a positive quantity, there is a thermodynamic driving force to reduce A_GB or, owing to the inverse relationship between A_GB and the average grain size ⟨R⟩, to increase ⟨R⟩. Consequently, grain boundaries tend to migrate such that smaller grains are eliminated in favor of larger ones, resulting in steady growth of the average grain size. The kinetics of this process of grain growth follow one of two qualitatively different pathways [1]: during so-called normal grain growth, the grain size distribution f(R, t) maintains a unimodal shape, shifting to larger R with increasing time t. In abnormal grain growth, on the other hand, only a subpopulation of grains in the sample coarsens, leading to the development of a bimodal size distribution. Although abnormal grain growth is far from rare, the factors responsible for its occurrence are poorly understood at best, depending strongly on properties specific to the sample in question [2]. In contrast, normal grain growth obeys two laws of apparently universal character: power-law evolution of the average grain size and the establishment of a quasistationary scaled grain size distribution [1, 3]. The first entails a relationship of the form

$$\langle R \rangle^m(t) - \langle R \rangle^m(t_0) = k\,(t - t_0), \qquad (2)$$
where k is a rate constant (with a strong dependence on temperature), and m denotes the growth exponent [Fig. 1(a)]. Experimentally, m is found to take on a value between 2 and 4, tending toward the lower end of this scale in materials of the highest purity annealed at temperatures near the melting point [2]. The second feature of normal grain growth encompasses the fact that, with increasing annealing time, f (R, t) evolves asymptotically toward a
Figure 1. Normal grain growth in polycrystalline Fe. [Data obtained from Ref. [30].] (a) Average grain size ⟨R⟩ (mm) as a function of annealing time (min) for samples annealed at 700, 750 and 800 °C. Dashed lines are fits of Eq. (2) with m = 2 (fit function modified slightly to take the 'size effect' into account). (b) Self-similar evolution of the scaled grain size distribution f(R/⟨R⟩, t) in the sample annealed at 800 °C for 2.5, 5 and 12 min, plotted against R/⟨R⟩. Solid line is a least-squares fit of a lognormal function to the scaled distributions. Dashed line is the prediction of Hillert's analytic model for grain growth in 3D.
time-invariant shape when plotted as a function of the normalized grain size R/⟨R⟩ [Fig. 1(b)]; that is,

$$f(R, t) \longrightarrow \tilde{f}(R/\langle R \rangle), \qquad (3)$$
with the quasistationary distribution f̃(R/⟨R⟩) generally taking on a lognormal shape [4]. Analytical efforts to explain the origin of Eqs. (2) and (3) generally begin with the assumption that the migration rate v_GB of a given grain boundary is proportional to its local curvature, with the proportionality factor defining the grain boundary mobility M [5]. Hillert [6] derived a simple expression for the resulting growth kinetics of a single grain embedded in a polycrystalline matrix. Solving the Hillert model self-consistently for the entire ensemble of grains leads directly to a power-law growth equation with m = 2 and to self-scaling behavior of f(R, t), but the shape predicted for f̃(R/⟨R⟩), plotted in Fig. 1(b), has never been confirmed experimentally. This failure is typical of all analytic growth models, which, owing to their statistical mean-field nature, do not properly account for the influence of the grain boundary network's local topology on the migration of individual boundaries. Computer simulations are able to circumvent this limitation, either by calculating values for v_GB from instantaneous local boundary curvatures (cellular automata, vertex, front-tracking methods) or by determining the excess free energy stored in the grain boundary network and then allowing this energy to relax in a physically plausible manner (Monte Carlo, phase-field approaches) [7, 8].
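In practice, Eq. (2) is tested by fitting it to measured ⟨R⟩(t) curves such as those in Fig. 1(a). The sketch below (with synthetic data standing in for measurements; R0, k and the noise level are placeholders) recovers the rate constant for an assumed m = 2 by a linear fit of ⟨R⟩² against t:

import numpy as np

# Synthetic "measurements" standing in for annealing data:
# normal growth with m = 2, R0 = 0.2, k = 0.005 (placeholder values).
t = np.linspace(0.0, 200.0, 21)
R0, k_true = 0.2, 0.005
R = np.sqrt(R0**2 + k_true * t) * (1 + 0.01 * np.random.randn(t.size))

# For m = 2, Eq. (2) predicts <R>^2(t) - <R>^2(t0) = k (t - t0), so a
# linear fit of <R>^2 against t recovers the rate constant k.
k_fit, intercept = np.polyfit(t, R**2, 1)
print(f"fitted k = {k_fit:.4f}  (true value {k_true})")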
2. Phase-field Representation of Polycrystalline Microstructure
The phase-field model for simulating grain growth takes its cue from Eq. (1), expressing the total free energy F_tot as the sum of contributions arising from the grain interiors, F_bulk, and the grain boundary (interface) regions, F_int [9]:

$$F_{\rm tot} = F_{\rm bulk} + F_{\rm int} = \int \left[ f_{\rm bulk}(\{\phi_i\}) + f_{\rm int}(\{\phi_i\}, \{\nabla\phi_i\}) \right] \mathrm{d}\mathbf{r}. \qquad (4)$$
Both F_bulk and F_int are specified as functionals of a set of phase fields {φ_i(r, t)} (also called order parameters), which are continuous functions defined for all times t at all points r in the simulation cell. The energy density f_bulk describes the free energy per unit volume of the grain interior regions, whereas f_int accounts for the free energy contributed by the grain boundaries. As discussed below, grain boundaries in the phase-field model have a finite (i.e., non-zero) thickness; therefore, the interfacial energy density f_int, like f_bulk, is an energy per unit volume and must be integrated over the entire volume of the simulation cell to recover the total interfacial energy. The function f_bulk({φ_i}) can be constructed such that each of the phase fields φ_i takes on one of two constant values, such as zero or unity, in the interior region of each crystallite [9]. Only when a boundary between two crystallites is crossed do one or, generally, more order parameters change continuously from one value to the other; consequently, grain boundaries are locations of large gradients in one or more φ_i, suggesting that the grain boundary energy term f_int should be defined as a function of {∇φ_i}. The specific functional forms chosen for f_bulk and f_int, however, depend on considerations of computational efficiency, the physics underlying the growth model and, to a certain extent, personal taste. Over the past several years, two general approaches have emerged in the literature for simulating grain growth by means of Eq. (4).
2.1. Discrete-orientation Models
In the discrete-orientation approach [10, 11], each order parameter φ_i is viewed as a continuous-valued component of a vector φ(r, t) = (φ_1(r, t), φ_2(r, t), . . . , φ_Q(r, t)) specifying the local crystalline orientation throughout the simulation cell. Stipulating that the phase fields φ_i take on constant values of 0 or 1 within the interior of a grain, this model clearly allows at most 2^Q distinct grain orientations, with Q denoting the total number of phase fields. In the most common implementation of the discrete-orientation method, f_bulk({φ_i}) is defined to have local minima when one and only one component of φ equals unity in a grain interior, thus reducing the total number of allowed
orientations to Q. For example, in a simulation with Q = 4, a given grain might be represented by the contiguous set of points at which φ = (0, 0, 1, 0), and a neighboring grain by φ = (0, 1, 0, 0) [Fig. 2(a)]. As illustrated in Fig. 2(b), upon crossing from one grain to the other, φ_2 changes continuously from 0 to 1 and φ_3 from 1 to 0; minimization of f_int, which is defined to be proportional to $\sum_{i=1}^{Q} (\nabla\phi_i)^2$, leads to a smooth, rather than instantaneous, variation in the order-parameter values. The width of the resulting interfacial region is prevented from expanding without bound by the increase in f_bulk that occurs when φ deviates from the orientations belonging to the set of local minima of f_bulk. Thus, the mathematical representation of each grain boundary is determined by a competition between the bulk and interfacial components of F_tot, a common feature of phase-field representations of polycrystalline microstructures.
Figure 2. Phase-field representations of polycrystalline microstructure. (a) Discrete-orientation model: grain orientations are specified by a vector-valued phase field φ having four components in this example. (b) Smooth variation of φ_2 and φ_3 along the dashed arrow in (a). (c) Continuous-orientation model: grain orientations are specified by the angular order parameter θ, and local crystalline order by the value of φ. (d) Smooth variation of θ and φ along the dashed arrow in (c).
Restricting the grains to a set of discrete orientations may simplify the task of constructing expressions for f_bulk and f_int in Eq. (4), but it also introduces some conceptual as well as practical limitations to the model. Clearly, it is unphysical for the free energy density of the grain interiors, f_bulk, to favor specific grain orientations defined relative to a fixed reference frame, for the free energy of the bulk phase must be invariant with respect to rotation in laboratory coordinates [12]. Even more seriously, the energy barrier in f_bulk that separates allowed orientations prohibits the rotation of individual grains during a simulation of grain growth. Since the rotation rate rises dramatically with decreasing grain size [13], grain rotations may be important to the growth process even when ⟨R⟩ is large, given that there is always a subpopulation of smaller grains losing volume to their growing nearest neighbors.
2.2. Continuous-orientation Models
In an effort to avoid the undesirable consequences of a finite number of allowed grain orientations, a number of researchers have attempted to express Eq. (4) in terms of continuous, rather than discrete, grain orientations [14–16]. In two dimensions, the orientation of a given grain can be specified completely by a single continuous parameter θ representing, say, the angle between the normal to a particular set of atomic planes and a fixed direction in the laboratory reference frame [Fig. 2(c)]. In 3D, the same specification can be accomplished with three such angular fields. By choosing f_bulk to be independent of the orientational order parameters, one ensures that grains are free to take on arbitrary orientations rather than only those corresponding to local minima of the bulk energy density. Because of this independence, however, there is no orientational energy penalty preventing grain boundaries from widening without bound during a growth simulation; thus, it is necessary to introduce an additional phase field that couples the width of the interfacial region to the value of f_bulk. Generally, one defines an order parameter φ specifying the degree of crystallinity at each point in the simulation cell, with a value of unity signifying perfect crystalline order (such as obtains in the grain interior) and lower values (0 ≤ φ < 1) denoting the disorder characteristic of the boundary core. As illustrated in Fig. 2(d), both φ and θ manifest gradients in the interfacial regions, but only the value of the orientational coordinate distinguishes a given grain from its neighbors. Despite the conceptual advantages enjoyed by continuous-orientation models over their discrete counterparts, the numerical implementation of the former has proven to be far more challenging, with significant difficulties arising from an unavoidable singularity in the expression for f_int [12, 14]. Only recently have continuous-orientation simulations of 2D grain growth been
reported [16], and the authors of this formalism anticipate having to overcome significant additional hurdles to extend it to 3D.
3. Microstructural Evolution
In both the discrete and continuous-orientation models, the time evolution of the phase fields φ_i(r, t) is assumed to be governed by the Allen–Cahn equation for non-conserved order parameters [17]:

$$ \frac{\partial \phi_i}{\partial t} = -L_i(\{\phi_j\})\,\frac{\delta F_{\rm tot}}{\delta \phi_i}, \qquad (5) $$
where the kinetic coefficients L_i (in general, themselves functions of the phase fields) are related to the interface mobility. For example, in a discrete-orientation model for ideal grain growth with Q allowed orientations, where $f_{\rm int} = (\kappa/2)\sum_{i=1}^{Q}(\nabla\phi_i)^2$, F_tot is given by Eq. (4), and L_i ≡ L for all i, Eq. (5) takes on the form [10]

$$ \frac{\partial \phi_i}{\partial t} = -L\,\frac{\partial f_{\rm bulk}(\{\phi_j\})}{\partial \phi_i} + L\kappa\nabla^2\phi_i \qquad (i, j = 1, 2, \ldots, Q). \qquad (6) $$
This coupled set of differential equations can be discretized and solved numerically at each site of a grid spanning the simulation cell, thus generating the time development of the microstructure represented by the phase fields. The resulting velocity of each grain boundary segment is exactly that expected for curvature-driven boundary migration, as can be demonstrated by applying Eq. (6) to the analytically solvable case of a spherical grain embedded in a homogeneous matrix: the calculated behavior is identical to the analytic solution, with the product Lκ of model parameters equalling the product Mγ of physical parameters [18]. Thus, the microstructural evolution calculated by the phase-field approach should indeed correspond to experimental findings for normal grain growth. Practical issues to consider when solving Eq. (5) numerically include the initial conditions of the calculation (i.e., the starting grain configuration), the boundary conditions of the simulation cell, the spacing of grid points and the numerical scheme chosen for solving the differential equations. The starting microstructure can be obtained by any method for dividing the simulation cell into a space-filling ensemble of grains, such as the Poisson–Voronoi tessellation, or one can begin with an ‘undercooled liquid,’ in which grains nucleate homogeneously and grow to impingement. In the latter case, once nucleation and growth have annihilated the last remnants of liquid phase, subsequent evolution of the microstructure can occur only by boundary migration and grain rotation, as in real polycrystalline materials. Usually, it is most convenient to assume periodic boundary conditions at the edges of the simulation cell, thus
avoiding complications arising from surface effects. In contrast to the other grid-based method for simulating grain growth (Monte Carlo Potts model), the results of a phase-field simulation are insensitive to the symmetry chosen for the lattice of grid points, largely because the finite width of the boundary regions essentially ‘averages out’ the local symmetry of the grid. Thus, one is free to perform the calculation on a uniform lattice of grid points; however, care must be taken to ensure that the mesh size is small enough to include on the order of seven or more grid points within the thickness of a boundary, if boundary motion is to be independent of the detailed structure of the grid [19]. Finally, when solving Eq. (5) or (6), one must deal with the usual issues of stability and convergence encountered in any numerical solution of nonlinear differential equations. Surprisingly, even a simple numerical scheme like the explicit forward Euler method yields qualitatively correct behavior when applied to analytically solvable cases; however, if absolute rates of boundary migration are of interest, then it is advisable to implement more sophisticated techniques, like a semi-implicit Fourier scheme [20], although this inevitably entails additional computational complexity.
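A minimal sketch of the explicit forward Euler scheme mentioned above, applied to Eq. (6) with periodic boundary conditions and the same illustrative bulk energy as in the earlier sketch, might look as follows. The time step and coefficients are arbitrary demonstration values; as the text notes, a semi-implicit Fourier scheme would be preferable when absolute migration rates matter.

```python
import numpy as np

def laplacian(field, dx):
    """Five-point finite-difference Laplacian with periodic boundaries."""
    return (np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0) +
            np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
            - 4.0 * field) / dx**2

def euler_step(phi, L=1.0, kappa=2.0, alpha=1.0, beta=1.0, gamma=2.0,
               dx=1.0, dt=0.05):
    """One forward-Euler step of Eq. (6) for all Q phase fields.

    d(phi_i)/dt = -L df_bulk/dphi_i + L kappa lap(phi_i), with the same
    illustrative multi-well f_bulk as in the earlier sketch. The explicit
    scheme is only stable for sufficiently small dt.
    """
    sum_sq = np.sum(phi**2, axis=0)
    new_phi = np.empty_like(phi)
    for i in range(phi.shape[0]):
        # df_bulk/dphi_i; the cross term sums over all j != i
        df_bulk = (-alpha * phi[i] + beta * phi[i]**3
                   + 2.0 * gamma * phi[i] * (sum_sq - phi[i]**2))
        new_phi[i] = phi[i] + dt * (-L * df_bulk
                                    + L * kappa * laplacian(phi[i], dx))
    return new_phi
```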
4. “Ideal” Grain Growth
In a real polycrystalline material, the grain boundary mobility M and the specific grain boundary energy γ vary from grain boundary to grain boundary, depending on (i) the relative orientation of the crystalline regions on either side of the boundary (called the misorientation) and on (ii) the orientation of the boundary plane with respect to the adjacent crystal lattices (the boundary inclination) [2, 21]. For example, both M and γ manifest sharp minima or maxima (‘cusps’) at low-angle boundaries and at certain special misorientations [2, 22]. Analytic models for grain growth have traditionally ignored these dependencies, attributing the same values of M and γ to each grain boundary regardless of the nature of its misorientation or inclination. Grain growth occurring under this simplified scenario is sometimes considered to be “ideal,” and it is this case that, to date, has been investigated most extensively by computer simulation. Of the various phase-field approaches to simulating ideal grain growth, only the discrete-orientation models have been applied to 2D and 3D simulation cells containing a statistically significant number of grains (i.e., at least of order 10^3) [4, 18]. The coarsening observed during a phase-field simulation of normal grain growth in 3D is illustrated by the sequence of images in Fig. 3. The positions of grain boundaries can be established by locating the contours of an appropriate function like $\varphi(\mathbf{r}, t) = \sum_{i=1}^{Q}\phi_i^2(\mathbf{r}, t)$, which, for the model employed in the calculations of Fig. 3, takes on a value of unity within individual grains and smaller values in the boundary regions. This allows
[Figure 3 image panels: t = 30.0, N = 5901; t = 100.0, N = 2107; t = 300.0, N = 696; t = 800.0, N = 216.]
Figure 3. Phase-field simulation of three-dimensional grain growth using a discrete-orientation model [4]. The elapsed simulation time t and the number of grains N in the simulation cell are specified under each image.
topological properties like the volume, number of faces, number of edges, etc., of each grain in the simulation cell to be evaluated, thus enabling quantification of the evolution of both local and averaged topological properties. For example, in Fig. 4(a) the square of the average grain size, ⟨R⟩², is plotted against the time t for the simulation of Fig. 3; the resulting straight line reveals that the simulated grain growth follows Eq. (2) with m = 2, consistent with the prediction of Hillert’s analytic model for ideal grain growth. Figure 4(b) illustrates that the grain size distribution evolves in a self-similar manner as well, but the shape of the quasistationary distribution f̃(R/⟨R⟩) disagrees with the Hillert prediction. Finally, the average topological parameters of the 3D microstructures generated by phase-field simulation can be calculated and compared to measurements performed on real polycrystalline materials (an extraordinarily tedious task!) and to the results of other algorithms for simulating grain growth (Table 1).
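The boundary-locating function ϕ(r, t) introduced above also lends itself to simple numerical post-processing: thresholding ϕ and labeling the connected regions recovers the individual grains and their sizes. The sketch below, which assumes SciPy's ndimage module and an illustrative threshold of 0.8, is one plausible way to do this; it is not the specific procedure used in Ref. [4].

```python
import numpy as np
from scipy import ndimage

def grain_statistics(phi, threshold=0.8, dx=1.0):
    """Identify grains from the indicator function varphi = sum_i phi_i^2.

    varphi is close to unity inside grains and dips in the diffuse boundary
    regions, so thresholding and labeling connected components recovers the
    individual grains. The threshold value is an illustrative choice.
    """
    varphi = np.sum(phi**2, axis=0)
    grains, n_grains = ndimage.label(varphi > threshold)
    # grain areas from label counts, converted to equivalent-circle radii (2D)
    areas = ndimage.sum(np.ones_like(varphi), grains,
                        index=np.arange(1, n_grains + 1)) * dx**2
    radii = np.sqrt(areas / np.pi)
    return n_grains, radii
```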
5. “Anisotropic” Grain Growth
With recent advances in computing power, it has become feasible to extend the phase-field approach to the “non-ideal” case in which the grain boundary mobility M and energy γ depend on misorientation and boundary inclination. Such a step is particularly straightforward for a continuous-orientation model, as the angular parameters needed to specify the orientation of each grain appear explicitly as phase fields in the free energy of Eq. (4). With discrete-orientation models, the misorientations and boundary inclinations must be calculated from the parameterization of allowed orientations represented by the array of values of the order parameters {φi }. In both cases, however, it is possible to introduce physically plausible expressions for the anisotropy of M and γ into the equations of motion for microstructural evolution [Eq. (5)].
[Figure 4 plot panels: (a) ⟨R⟩² versus time; (b) f̃(R/⟨R⟩, t) at t = 200.0 (N = 5306), t = 400.0 (N = 2534), and t = 800.0 (N = 1113), together with the Hillert (3D) prediction.]
Figure 4. Quantitative analysis of microstructural evolution generated by a discrete-orientation grain growth model in 3D, averaged over five simulation runs [4]. (a) Linear evolution of ⟨R⟩², illustrating parabolic growth kinetics following an initial transient. (Dashed line is a guide to the eye.) (b) The grain size distribution f̃(R/⟨R⟩, t), plotted at various times t for the indicated number of grains N. The prediction of Hillert’s analytic model is included for comparison.
To date, all phase-field investigations of the influence of anisotropic M and γ on the kinetics of grain growth have been based on a discrete-orientation model [23, 24]. The results of these studies point to a qualitative difference in the consequences of the two types of anisotropy: whereas both mobility and
Table 1. Topological parameters of 3D microstructures generated by various grain growth algorithms, measured in cellular materials, or modeled as tessellations of space. The quantity ⟨F⟩ denotes the average number of faces per cell (grain), and ⟨E_F⟩ the average number of edges per face. See Refs. [4, 29] for references.

                      ⟨F⟩       ⟨E_F⟩
Phase field           13.7      5.12
Monte Carlo           13.7      5.05
Surface Evolver       13.5      5.05
Vertex                13.8      5.01
Al98Sn2               13.9      5.14
Fe                    13.4      5.11
Soap froth            13.4      5.11
Tetrakaidecahedra     14        5.143
Poisson–Voronoi       15.535    5.228
Figure 5. Microstructures generated during a 2D phase-field simulation of grain growth, calculated assuming (a) isotropic grain boundary mobility M and energy γ; (b) anisotropic M and isotropic γ; (c) anisotropic M and γ. Microstructures are drawn such that darker boundaries correspond to larger misorientations. Only the introduction of misorientation-dependent grain boundary energies alters the topology of the microstructure. [After Ref. [25]. Reprinted with permission.]
energy anisotropy affect the overall growth kinetics, only the latter alters the topology of the microstructure generated by grain growth (Fig. 5) [24]. This initially puzzling result can be understood as a consequence of the establishment of local equilibrium at grain boundary junctions: when both higher- and lower-energy boundaries meet at a junction, equilibrium favors the lengthening of the lower-energy boundaries at the expense of the higher-energy ones, and it leads to the replacement of threefold-coordinated junctions (the only stable junctions in the isotropic case) with higher-order junctions [25]. Both of these effects have a major impact on the topology of the grain boundary network. In contrast, the introduction of a spectrum of boundary mobilities merely tends to increase or decrease the growth rate of ⟨R⟩. Since control over microstructural topology is the primary goal of the processing of
polycrystalline materials, the future improvement of anisotropic growth models will depend to a large extent on the development of more accurate treatments of energy anisotropy. Further extension of the phase-field approach to even more complex systems, such as those containing multiple phases or gradients of concentration or temperature, is likewise an active area of research [11, 16, 26].
6. Comparison to Other Grain Growth Simulation Algorithms
The phase-field method is just one of many approaches to simulating grain growth in two and three dimensions. Broadly speaking, the various computational algorithms can be divided into two classes: boundary-tracking models, in which the equations of motion are solved numerically for a set of points describing the network of grain boundaries in the simulation cell, thus permitting the position of each boundary to be calculated as a function of time; and volumetric-relaxation models, in which local microstructural changes are calculated from an equation governing the evolution of the free energy of the overall system. Examples of the boundary-tracking approach include vertex, front-tracking, Surface Evolver and cellular automaton models, whereas the primary representatives of the volumetric approach are the Monte Carlo Potts models and the phase-field method [3, 8]. The relative strengths and weaknesses of the two classes of algorithms are clearest for the conceptually and computationally demanding case of 3D growth. Computational considerations seem to favor the boundary-tracking approach, because the dimensionality of the grain boundary network is one less than that of the space in which it is embedded; thus, the computational resources required to simulate a given growth process are potentially far smaller than for a volumetric computation. Moreover, there is no intrinsic limit to the accuracy to which the boundary motion can be determined, as the boundary positions are not restricted to lie along the points of a discrete grid, as they are in the volumetric techniques. However, the topological richness of the grain growth process is the source of a fundamental weakness of boundary-tracking models, at least when applied to 3D growth, as determining the precise topological consequences of singular events like the disappearance of a grain is a currently unsolved problem in 3D. Volumetric-relaxation algorithms avoid this problem entirely, because in those models all topological changes occur naturally as a result of global energy minimization, not in a biased manner through the application of an ad hoc set of rules for local topological changes. The boundary positions need not be tracked explicitly in the volumetric approach, as they can always be determined from the instantaneous state of the simulation cell; however, this
requires performing calculations throughout the cell volume. Consequently, the volumetric models are computationally tractable only on a discrete lattice, the symmetry and spacing of which can under certain circumstances influence the calculated growth kinetics. Most importantly, it is necessary to construct the energy function in such a manner that the energy-minimization pathway yields physically plausible equations of motion for individual boundaries, i.e., curvature-driven migration in the case of grain growth. For both the Monte Carlo Potts and the phase-field models, this has been verified by comparing the calculated shrinkage of a circular or spherical grain embedded in a homogeneous matrix against the analytic solution for curvature-driven boundary migration. In the case of ideal grain growth, boundary-tracking and volumetric-relaxation simulations yield essentially identical results for growth occurring in two dimensions [27, 28] as well as in three [4]. All calculations predict a value of m = 2 for the growth exponent in Eq. (2), and the simulated microstructures manifest nearly the same quasistationary grain size distribution (Fig. 6) and topological averages (Table 1). It is interesting to note that the simulated grain size distributions universally disagree with the prediction of Hillert’s analytic model, but, unlike in the experimental case illustrated in Fig. 1(b), this discrepancy cannot be blamed on the presence of impurities in the grain boundaries or on the existence of mobility and energy anisotropy. Although far less extensive comparisons have been carried out between various simulation algorithms for anisotropic grain growth, here, too, the initial results point to general agreement [25]. Thus, it appears that the choice of
[Figure 6 plot: frequency versus R/⟨R⟩ for phase field, Monte Carlo, Surface Evolver, and vertex simulations, together with the Hillert (3D) prediction.]
Figure 6. Comparison of quasistationary grain size distributions generated by various algorithms for simulating ideal grain growth in 3D [4]. The prediction of Hillert’s analytic model is included for comparison.
an algorithm for simulating grain growth can safely be made on the basis of computational convenience for the specific problem at hand.
References

[1] H.V. Atkinson, “Theories of normal grain growth in pure single phase systems,” Acta Metall., 36, 469–491, 1988.
[2] F.J. Humphreys and M. Hatherly, Recrystallization and Related Annealing Phenomena, Pergamon Press, Oxford, 1996.
[3] C.V. Thompson, “Grain growth and evolution of other cellular structures,” Solid State Phys., 55, 269–314, 2001.
[4] C.E. Krill III and L.-Q. Chen, “Computer simulation of 3-D grain growth using a phase-field model,” Acta Mater., 50, 3057–3073, 2002.
[5] J.E. Burke and D. Turnbull, “Recrystallization and grain growth,” Prog. Metal Phys., 3, 220–292, 1952.
[6] M. Hillert, “On the theory of normal and abnormal grain growth,” Acta Metall., 13, 227–238, 1965.
[7] H.J. Frost and C.V. Thompson, “Computer simulation of grain growth,” Curr. Op. Solid State Mater. Sci., 1, 361–368, 1996.
[8] M.A. Miodownik, “A review of microstructural computer models used to simulate grain growth and recrystallisation in aluminium alloys,” J. Light Metals, 2, 125–135, 2002.
[9] L.-Q. Chen, “Phase-field models for microstructure evolution,” Ann. Rev. Mater. Res., 32, 113–140, 2002.
[10] L.-Q. Chen and W. Yang, “Computer simulation of the domain dynamics of a quenched system with a large number of nonconserved order parameters: The grain growth kinetics,” Phys. Rev. B, 50, 15752–15756, 1994.
[11] I. Steinbach, F. Pezzolla, B. Nestler, M. Seeßelberg, R. Prieler, G.J. Schmitz, and J.L.L. Rezende, “A phase field concept for multiphase systems,” Physica D, 94, 135–147, 1996.
[12] R. Kobayashi, J.A. Warren, and W.C. Carter, “A continuum model of grain boundaries,” Physica D, 140, 141–150, 2000.
[13] D. Moldovan, D. Wolf, and S.R. Phillpot, “Theory of diffusion-accommodated grain rotation in columnar polycrystalline microstructures,” Acta Mater., 49, 3521–3532, 2001.
[14] R. Kobayashi, J.A. Warren, and W.C. Carter, “Vector-valued phase field model for crystallization and grain boundary formation,” Physica D, 119, 415–423, 1998.
[15] M.T. Lusk, “A phase-field paradigm for grain growth and recrystallization,” Proc. R. Soc. London A, 455, 677–700, 1999.
[16] J.A. Warren, R. Kobayashi, A.E. Lobkovsky, and W.C. Carter, “Extending phase field models of solidification to polycrystalline materials,” Acta Mater., 51, 6035–6058, 2003.
[17] S.M. Allen and J.W. Cahn, “A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening,” Acta Metall., 27, 1085–1095, 1979.
[18] D. Fan and L.-Q. Chen, “Computer simulation of grain growth using a continuum field model,” Acta Mater., 45, 611–622, 1997.
[19] D. Fan, L.-Q. Chen, and S.P. Chen, “Effect of grain boundary width on grain growth in a diffuse-interface field model,” Mater. Sci. Eng. A, A238, 78–84, 1997.
[20] L.-Q. Chen and J. Shen, “Applications of semi-implicit Fourier-spectral method to phase field equations,” Comput. Phys. Commun., 108, 147–158, 1998.
[21] G. Gottstein and L.S. Shvindlerman, Grain Boundary Migration in Metals: Thermodynamics, Kinetics, Applications, CRC Press, Boca Raton, FL, 1999.
[22] D. Wolf and K.L. Merkle, “Correlation between the structure and energy of grain boundaries in metals,” In: D. Wolf and S. Yip (eds.), Materials Interfaces: Atomic-Level Structure and Properties, Chapter 3, pp. 87–150, Chapman & Hall, London, 1992.
[23] A. Kazaryan, Y. Wang, S.A. Dregia, and B.R. Patton, “Generalized phase-field model for computer simulation of grain growth in anisotropic systems,” Phys. Rev. B, 61, 14275–14278, 2000.
[24] A. Kazaryan, Y. Wang, S.A. Dregia, and B.R. Patton, “Grain growth in anisotropic systems: comparison of effects of energy and mobility,” Acta Mater., 50, 2491–2502, 2002.
[25] M. Upmanyu, G.N. Hassold, A. Kazaryan, E.A. Holm, Y. Wang, B. Patton, and D.J. Srolovitz, “Boundary mobility and energy anisotropy effects on microstructural evolution during grain growth,” Interface Sci., 10, 201–216, 2002.
[26] D. Fan and L.-Q. Chen, “Topological evolution during coupled grain growth and Ostwald ripening in volume-conserved 2-D two-phase polycrystals,” Acta Mater., 45, 4145–4154, 1997.
[27] V. Tikare, E.A. Holm, D. Fan, and L.-Q. Chen, “Comparison of phase-field and Potts models for coarsening processes,” Acta Mater., 47, 363–371, 1999.
[28] C. Maurice, “Numerical modelling of grain growth: Current status,” In: G. Gottstein and D.A. Molodov (eds.), Recrystallization and Grain Growth, Vol. 1, pp. 123–134, Springer-Verlag, Berlin, 2001.
[29] K.M. Döbrich, C. Rau, and C.E. Krill III, “Quantitative characterization of the three-dimensional microstructure of polycrystalline Al-Sn using x-ray microtomography,” Metall. Mater. Trans. A, 35A, 1953–1961, 2004.
[30] H. Hu, “Grain growth in zone-refined iron,” Can. Metall. Q., 13, 275–286, 1974.
7.7 RECRYSTALLIZATION SIMULATION BY USE OF CELLULAR AUTOMATA
Dierk Raabe
Max-Planck-Institut für Eisenforschung, Max-Planck-Str. 1, 40237 Düsseldorf, Germany
1. Introduction to Cellular Automata

1.1. Basic Setup of Cellular Automata
Cellular automata are algorithms that describe the discrete spatial and temporal evolution of complex systems by applying local (or sometimes long-range) deterministic or probabilistic transformation rules to the cells of a regular (or non-regular) lattice. The space variable in cellular automata usually stands for real space, but orientation space, momentum space, or wave vector space can be used as well. Cellular automata can have arbitrary dimensions.

Space is defined on a regular array of lattice points which can be regarded as the nodes of a finite difference field. The lattice maps the elementary system entities that are regarded as relevant to the model under investigation. The individual lattice points can represent continuum volume units, atomic particles, lattice defects, or colors depending on the underlying model. The state of each lattice point is characterized in terms of a set of generalized state variables. These could be dimensionless numbers, particle densities, lattice defect quantities, crystal orientation, particle velocity, blood pressure, animal species or any other quantity the model requires. The actual values of these state variables are defined at each of the individual lattice points. Each point assumes one out of a finite set of possible discrete states. The initial state of the automaton, which can be derived from experiment (for instance from a microtexture experiment) or theory (for instance from crystal plasticity finite element simulations), is defined by mapping the starting distribution of the values of the chosen state variables onto the lattice.
The dynamical evolution of the automaton takes place through the application of deterministic or probabilistic transformation rules (also referred to as switching rules) that act on the state of each lattice point. These rules determine the state of a lattice point as a function of its previous state and the state of the neighboring sites. The number, arrangement, and range of the neighbor sites used by the transformation rule for calculating a state switch determine the range of the interaction and the local shape of the areas which evolve. Cellular automata work in discrete time steps. After each time interval the values of the state variables are updated for all lattice points in synchrony, mapping the new (or unchanged) values assigned to them through the transformation rule. Owing to these features, cellular automata provide a discrete method of simulating the evolution of complex dynamical systems which contain large numbers of similar components on the basis of their local (or long-range) interactions.

Cellular automata do not have restrictions in the type of elementary entities or transformation rules employed. They can map such different situations as the distribution of the values of state variables in a simple finite difference simulation, the colors in a blending algorithm, the elements of fuzzy sets, or elementary growth and decay processes of cells. For instance, the Pascal triangle, which can be used to calculate higher-order binomial coefficients or the Fibonacci numbers, can be regarded as a one-dimensional cellular automaton where the value that is assigned to each site of a regular triangular lattice is calculated through the summation of the two numbers above it. In this case the entities of the automaton are dimensionless integer numbers and the transformation rule is a summation.

Cellular automata were introduced by von Neumann [1] for the simulation of self-reproducing Turing automata and population evolution. In his early contributions von Neumann denoted them as cellular spaces. Other authors used notions like tessellation automata, homogeneous structures, tessellation structures, or iterative arrays. Later applications were particularly in the field of describing non-linear dynamic behavior of fluids and reaction-diffusion systems. During the last decade cellular automata increasingly gained momentum for the simulation of microstructure evolution in the materials sciences.
1.2. Formal Description and Classes of Cellular Automata
The local interaction of neighboring lattice sites in a cellular automaton is specified through a set of transformation (switching) rules. While von Neumann’s original automata were designed with deterministic transformation rules, probabilistic transformations are conceivable as well. The value of an arbitrary state variable ξ assigned to a particular lattice site at a time (t_0 + Δt) is determined by its present state (t_0) (or its last few states t_0, t_0 − Δt, etc.) and the state of its neighbors [1–4].
Considering the last two time steps for the evolution of a one-dimensional cellular automaton, this can be put formally by writing

$$ \xi_j^{t_0+\Delta t} = f\left(\xi_{j-1}^{t_0-\Delta t}, \xi_j^{t_0-\Delta t}, \xi_{j+1}^{t_0-\Delta t}, \xi_{j-1}^{t_0}, \xi_j^{t_0}, \xi_{j+1}^{t_0}\right), $$

where $\xi_j^{t_0}$ indicates the value of the variable at a time t_0 at the node j. The positions (j + 1) and (j − 1) indicate the nodes in the immediate neighborhood of position j (for one dimension). The function f specifies the set of transformation rules, for instance such as provided by standard discrete finite difference algorithms. If the state of the node depends only on its nearest neighbors (NN) the array is referred to as von Neumann neighboring (Fig. 1a). If both the NN and the next-nearest neighbors (NNN) determine the ensuing state of the node, the array is called Moore neighboring (Fig. 1b). Due to the discretization of space, the type of neighboring affects the local transformation rates and the evolving morphologies [1–4]. For the Moore and other extended configurations, which allow one to introduce a certain medium-range interaction among the sites, the transformation rule can, in one dimension and for interaction with the last two time steps, be rewritten as

$$ \xi_j^{t_0+\Delta t} = f\left(\xi_{j-n}^{t_0-\Delta t}, \xi_{j-n+1}^{t_0-\Delta t}, \ldots, \xi_{j+n}^{t_0-\Delta t}, \xi_{j-n}^{t_0}, \ldots, \xi_{j+n}^{t_0}\right), $$

where n indicates the range of the transformation rule in units of lattice cells. Even for very simple automata there exists an enormous variety of possible transformation rules. If in a one-dimensional Boolean cellular automaton with von Neumann neighboring and reference to the preceding time step each node can have one of two possible ground states, say ξ_j = 1 or ξ_j = 0, the
[Fig. 1 schematic: 3 × 3 arrays of cells (i, j). (a) von Neumann: ξ(2,2) = f{ξ(1,2), ξ(2,1), ξ(2,3), ξ(3,2)}. (b) Moore: ξ(2,2) = g{ξ(1,1), ξ(1,2), ξ(1,3), ξ(2,1), ξ(2,3), ξ(3,1), ξ(3,2), ξ(3,3)}.]
Figure 1. (a) Example of a 2D von Neumann configuration considering nearest neighbors. (b) Example of a 2D Moore configuration considering both nearest and next-nearest neighbors.
transformation rule assumes the form $\xi_j^{t_0+\Delta t} = f\left(\xi_{j-1}^{t_0}, \xi_j^{t_0}, \xi_{j+1}^{t_0}\right)$. This simple Boolean configuration defines 2^8 = 256 possible transformation rules. One of them has the form

if ξ_{j−1}^{t0} = 1, ξ_j^{t0} = 1, ξ_{j+1}^{t0} = 1   then ξ_j^{t0+Δt} = 0     (1, 1, 1) → 0
if ξ_{j−1}^{t0} = 1, ξ_j^{t0} = 1, ξ_{j+1}^{t0} = 0   then ξ_j^{t0+Δt} = 1     (1, 1, 0) → 1
if ξ_{j−1}^{t0} = 1, ξ_j^{t0} = 0, ξ_{j+1}^{t0} = 1   then ξ_j^{t0+Δt} = 0     (1, 0, 1) → 0
if ξ_{j−1}^{t0} = 1, ξ_j^{t0} = 0, ξ_{j+1}^{t0} = 0   then ξ_j^{t0+Δt} = 1     (1, 0, 0) → 1
if ξ_{j−1}^{t0} = 0, ξ_j^{t0} = 1, ξ_{j+1}^{t0} = 1   then ξ_j^{t0+Δt} = 1     (0, 1, 1) → 1
if ξ_{j−1}^{t0} = 0, ξ_j^{t0} = 1, ξ_{j+1}^{t0} = 0   then ξ_j^{t0+Δt} = 0     (0, 1, 0) → 0
if ξ_{j−1}^{t0} = 0, ξ_j^{t0} = 0, ξ_{j+1}^{t0} = 1   then ξ_j^{t0+Δt} = 1     (0, 0, 1) → 1
if ξ_{j−1}^{t0} = 0, ξ_j^{t0} = 0, ξ_{j+1}^{t0} = 0   then ξ_j^{t0+Δt} = 0     (0, 0, 0) → 0
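For concreteness, this transformation rule can be implemented in a few lines. The sketch below applies the eight-entry table synchronously to a one-dimensional lattice with periodic boundaries; the lattice size, seed placement, and number of time steps are arbitrary choices for illustration.

```python
import numpy as np

# The transformation rule tabulated above, i.e., (01011010)_2:
RULE_TABLE = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 0, (1, 0, 0): 1,
              (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 1, (0, 0, 0): 0}

def step(state, rule=RULE_TABLE):
    """Apply the rule synchronously to every node (periodic boundaries)."""
    left, right = np.roll(state, 1), np.roll(state, -1)
    return np.array([rule[(l, c, r)] for l, c, r in zip(left, state, right)])

state = np.zeros(64, dtype=int)
state[32] = 1                  # a single seed cell
for _ in range(10):            # ten discrete time steps
    state = step(state)
```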
This particular transformation rule can be encoded by (01011010)_2. Its digital description is of course only valid for a given arrangement of the corresponding basis. This order is commonly chosen as a decimal row with decreasing value, i.e., (1,1,1) translates to 111 (one hundred eleven), (1,1,0) to 110 (one hundred ten), and so on. Transforming the binary code into decimal numbers using

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
 0     1     0     1     1     0     1     0

leads to the decimal code number $90_{10}$. The digital coding system is commonly used for compactly describing transformation rules for cellular automata in the literature [2–4]. In general terms the number of rules can be calculated by $k^{(kn)}$, where k is the number of states for the cell and n is the number of neighbors including the core cell. For a two-dimensional automaton with Moore neighborhood and two possible cell states (i.e., k = 2, n = 9), $2^{2\cdot 9}$ = 262 144 different transformation rules exist. If the state of a node is determined by the sum of the neighbor site values, the model is referred to as a totalistic cellular automaton. If the state of a node has a separate dependence on the state itself and on the sum of the values taken by the variables of the neighbors, the model is referred to as an outer totalistic cellular automaton.

Cellular automata fall into four basic classes of behavior [2–4]. Class 1 cellular automata evolve for almost any initial configuration after a finite number of time steps to a homogeneous and unique state from which they do not evolve further. Cellular automata in this class exhibit the maximal possible order both at the global and local scale. The geometrical analogy for this class is a limit point in the corresponding phase space. Class 2 cellular automata
usually create short-period patterns that repeat periodically, typically recurring after small periods, or are stable. The local and global order exhibited by such automata is, however, not maximal. Class 2 automata can be interpreted as filters, which derive the essence from discrete data sets for a given set of transformation rules. In phase space such systems form a limit cycle.

Class 3 cellular automata lead from almost all possible initial states to aperiodic chaotic patterns. The statistical properties of these patterns and the statistical properties of the starting patterns are almost identical, at least after a sufficient period of time. The patterns created by class 3 automata are usually self-similar fractal arrays. After sufficiently many time steps, the statistical properties of these patterns are typically the same for almost all initial configurations. Geometrically, class 3 automata form so-called strange attractors in phase space. Class 3 is the most frequent type of cellular automaton: with increasing neighborhood and increasing number of possible cell states, the probability that an arbitrarily selected rule yields a class 3 automaton increases. Cellular automata in this class can exhibit maximal disorder on both global and local scales.

Class 4 cellular automata yield stable, periodic, and propagating structures which can persist over arbitrary lengths of time. Some class 4 automata dissolve after a finite number of time steps, i.e., the state of all cells becomes zero. In some class 4 automata a small set of stable periodic figures can occur (such as, for instance, in Conway’s game of life [5]). By properly arranging these propagating structures, final states with any cycle length may be obtained. Class 4 automata show a high degree of irreversibility in their time development. They usually reveal more complex behavior and very long transient lengths, having no direct analogue in the field of dynamical systems. The cellular automata in this class can exhibit significant local (but not global) order.

These introductory remarks show that the cellular automaton concept is defined in a very general and versatile way. Cellular automata can be regarded as a generalization of discrete calculation methods [1, 2]. Their flexibility is due to the fact that, besides the use of crisp mathematical expressions as variables and discretized differential equations as transformation rules, automata can incorporate practically any kind of element or rule that is deemed relevant.
2. Application of Cellular Automata in Materials Science
Transforming the abstract rules and properties of general cellular automata into a materials-related simulation concept consists in mapping the values of relevant state variables onto the points of a cellular automaton lattice and using the local finite difference formulations of the partial differential equations of the underlying model as local transformation rules. The particular versatility of the cellular automaton approach for microstructure simulations particularly in the fields of recrystallization, grain growth, and phase transformation
phenomena is due to its flexibility in considering a large variety of state variables and transformation laws. The design of such time- and space-discretized simulations of materials microstructures, which track kinetics and energies in a local fashion, is of interest for two reasons. First, from a fundamental point of view it is desirable to understand better the dynamics and the topology of microstructures that arise from the interaction of large numbers of lattice defects which are characterized by a wide spectrum of intrinsic properties and interactions in spatially heterogeneous materials. For instance, in the fields of recrystallization and grain growth the influence of local grain boundary characteristics (mobility, energy), local driving forces, and local crystallographic textures on the final microstructure is of particular interest. Second, from a practical point of view it is necessary to predict microstructure parameters such as grain size or texture, which determine the mechanical and physical properties of real materials subjected to industrial processes, on a phenomenological though physically sound basis.

Apart from cellular automata, a number of excellent models for discretely simulating recrystallization and grain growth phenomena have been suggested. They can be grouped as multi-state kinetic Potts Monte Carlo models, topological boundary dynamics and front-tracking models, and Ginzburg–Landau-type phase field kinetic models (see overview in Ref. [6]). However, in comparison to these approaches the strength of scalable kinetic cellular automata is that they combine the computational simplicity and scalability of a switching model with the physical stringency of a boundary dynamics model. Their objective lies in providing a numerically efficient and at the same time phenomenologically sound method of discretely simulating recrystallization and grain growth phenomena.

As far as computational aspects are concerned, cellular automata can be designed to minimize calculation time and reduce code complexity in terms of storage and algorithm. As far as microstructure physics is concerned, they can be designed to provide kinetics, texture, and microstructure on a real space and time scale on the basis of realistic or experimental input data for microtexture, grain boundary characteristics, and local driving forces. The possible incorporation of realistic values, particularly for grain boundary energies and mobilities, deserves particular attention since such experimental data are increasingly available, enabling one to make quantitative predictions.

Cellular automaton simulations are often carried out at an elementary level using atoms, clusters of atoms, dislocation segments, or small crystalline or continuum elements as underlying units. It should be emphasized that particularly those variants that discretize and map microstructure in continuum space are not intrinsically calibrated by a characteristic physical length or time scale. This means that a cellular automaton simulation of continuum systems requires the definition of elementary units and transformation rules
that adequately reflect the system behavior at the level addressed. If some of the transformation rules refer to different real time scales (e.g., recrystallization and recovery, bulk diffusion and grain boundary diffusion), it is essential to achieve a correct common scaling of the entire system. The requirement for an adjustment of time scaling among various rules is due to the fact that the transformation behavior of a cellular automaton is sometimes determined by non-coupled Boolean routines rather than by the exact local solutions of coupled differential equations. The same is true when underlying differential equations with entirely different time scales enter the formulation of a set of transformation rules. The scaling problem becomes particularly important in the simulation of non-linear systems (which applies for most microstructure-based cellular automata). During the simulation it can be useful to refine or coarsen the scale according to the kinetics (time re-scaling) and spatial resolution (space re-scaling). Since the use of cellular automata is not confined to the microscopic regime, it provides a convenient numerical means for bridging various space and time scales in microstructure simulation.

Important fields where microstructure-based cellular automata have been successfully used in the materials sciences are primary static recrystallization and recovery [6–19], formation of dendritic grain structures in solidification processes [20–26], as well as related nucleation and coarsening phenomena [27–36]. In what follows, this chapter is devoted to the simulation of primary static recrystallization. For related microstructural topics the reader is referred to the references given above.
3. Example of a Recrystallization Simulation by Use of a Probabilistic Cellular Automaton

3.1. Lattice Structure and Transformation Rule
The model for the present recrystallization simulation is designed as a cellular automaton with a probabilistic transformation rule [16–18]. Independent variables are time t and space x = (x_1, x_2, x_3). Space is discretized into an array of equally shaped cells (2D or 3D depending on input data). Each cell is characterized in terms of the dependent variables. These are scalar (mechanical, electromagnetic) and configurational (interfacial) contributions to the driving force and the crystal orientation g = g(ϕ_1, φ, ϕ_2), where g is the rotation matrix and ϕ_1, φ, ϕ_2 the Euler angles. The driving force is the negative change in Gibbs enthalpy ΔG_t per transformed cell. The starting data, i.e., the crystal orientation map and the spatial distribution of the driving force, can be provided by experiment, e.g., orientation imaging microscopy via electron backscatter diffraction, or by simulation, e.g., a crystal plasticity finite element
simulation. Grains or sub-grains are mapped as regions of identical crystal orientation, but the driving force may vary inside these areas. The kinetics of the automaton result from changes in the state of the cells (cell switches). They occur in accord with a switching rule (transformation rule) which determines the individual switching probability of each cell as a function of its previous state and the state of its neighbor cells. The switching rule is designed to map the phenomenology of primary static recrystallization in a physically sound manner. It reflects that the state of a non-recrystallized cell belonging to a deformed grain may change due to the expansion of a recrystallizing neighbor grain which grows according to the local driving force and boundary mobility. If such an expanding grain sweeps a non-recrystallized cell the stored dislocation energy of that cell drops to zero and a new orientation is assigned to it, namely that of the expanding neighbor grain. To put this formally, the switching rule is cast in a probabilistic form of a linearized symmetric rate equation, which describes grain boundary motion in terms of isotropic single-atom diffusion processes perpendicular through a homogeneous planar grain boundary segment under the influence of a decrease in Gibbs energy,
$$ \dot{\mathbf{x}} = \mathbf{n}\,\nu_{\rm D}\,\lambda_{\rm gb}\,c\,\left[\exp\left(-\frac{\Delta G - \Delta G_t/2}{k_{\rm B}T}\right) - \exp\left(-\frac{\Delta G + \Delta G_t/2}{k_{\rm B}T}\right)\right] \qquad (1) $$

where ẋ is the grain boundary velocity, ν_D the Debye frequency, λ_gb the jump width through the boundary, c the intrinsic concentration of grain boundary vacancies or shuffle sources, n the normal of the grain boundary segment, ΔG the Gibbs enthalpy of motion through the interface, ΔG_t the Gibbs enthalpy associated with the transformation, k_B the Boltzmann constant, and T the absolute temperature. Replacing the jump width by the Burgers vector and the Gibbs enthalpy terms by the total entropy, S, and total enthalpy, H, leads to a linearized form

$$ \dot{\mathbf{x}} \approx \mathbf{n}\,\nu_{\rm D}\,b\,\frac{pV}{k_{\rm B}T}\,\exp\left(\frac{S}{k_{\rm B}}\right)\exp\left(-\frac{H}{k_{\rm B}T}\right) \qquad (2) $$

where p is the driving force and V the atomic volume, which is of the order of b³ (b is the magnitude of the Burgers vector). Summarizing these terms reproduces Turnbull’s rate expression

$$ \dot{\mathbf{x}} = \mathbf{n}\,m\,p = \mathbf{n}\,m_0\,\exp\left(-\frac{Q_{\rm gb}}{k_{\rm B}T}\right)p \qquad (3) $$

where m is the mobility. These equations provide a well-known kinetic picture of grain boundary segment motion, where the atomistic processes (including thermal fluctuations, i.e., random thermal backward and forward jumps) are statistically described in terms of the pre-exponential factor of the mobility m_0 = m_0(g, n) and of the activation energy of grain boundary mobility Q_gb = Q_gb(g, n).
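As a rough numerical illustration of Eq. (3), the sketch below evaluates the boundary velocity using the mobility parameters quoted later in this chapter (m_0 = 6.2 × 10^-6 m^3/(N s), Q_gb = 1.3 eV, T = 800 K); the driving force of about 1 MPa is consistent with the stored-energy values mentioned elsewhere in the chapter but is used here only as an order-of-magnitude input.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # J per eV

def boundary_velocity(p, m0, q_gb_ev, temperature):
    """Magnitude of Turnbull's rate expression, Eq. (3): x_dot = m0 exp(-Q/kT) p."""
    mobility = m0 * math.exp(-q_gb_ev * EV / (K_B * temperature))
    return mobility * p

# m0 and Q_gb as quoted in Sec. 3.3; p ~ 1 MPa is an assumed stored-energy
# driving force for a dislocation density of roughly 1e15 m^-2.
v = boundary_velocity(p=1.0e6, m0=6.2e-6, q_gb_ev=1.3, temperature=800.0)
print(f"boundary velocity ~ {v:.1e} m/s")
```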
For dealing with competing switches affecting the same cell, the deterministic rate equation can be replaced by a probabilistic analogue which allows one to calculate switching probabilities. For this purpose Eq. (3) is separated into a deterministic part, ẋ_0, which depends weakly on temperature, and a probabilistic part, w, which depends strongly on temperature:
$$ \dot{\mathbf{x}} = \dot{\mathbf{x}}_0\,w = \mathbf{n}\,\frac{k_{\rm B}T\,m_0}{V}\,\frac{pV}{k_{\rm B}T}\exp\left(-\frac{Q_{\rm gb}}{k_{\rm B}T}\right) \quad \text{with} \quad \dot{\mathbf{x}}_0 = \mathbf{n}\,\frac{k_{\rm B}T\,m_0}{V}, \quad w = \frac{pV}{k_{\rm B}T}\exp\left(-\frac{Q_{\rm gb}}{k_{\rm B}T}\right) \qquad (4) $$
The probability factor w represents the product of the linearized part pV/(k_B T) and the non-linearized part exp(−Q_gb/(k_B T)) of the original Boltzmann terms. According to this expression, non-vanishing switching probabilities occur for cells which have neighbors with a different orientation and a driving force which points in their direction. The automaton considers the first, second (2D), and third (3D) neighbor shells for the calculation of the total driving force acting on a cell. The local value of the switching probability depends on the crystallographic character of the boundary segment between such unlike cells.
3.2. Scaling and Normalization
Microstructure-based cellular automata are usually applied to starting data which have a spatial resolution far above the atomic scale. This means that the automaton lattice has a lateral scaling of λ_m ≫ b, where λ_m is the scaling length of the cellular automaton lattice and b the Burgers vector. If a moving boundary segment sweeps a cell, the grain thus grows (or shrinks) by λ_m³ rather than b³. Since the net velocity of a boundary segment must be independent of the scaling value λ_m, an increase in jump width must lead to a corresponding decrease in the grid attack frequency, i.e., to an increase of the characteristic time step, and vice versa. For obtaining a scale-independent grain boundary velocity, the grid frequency must be chosen in such a way that the attempted switch of a cell of length λ_m occurs with a frequency much below the atomic attack frequency which attempts to switch a cell of length b. This scaling condition, which is prescribed by an external scaling length λ_m, leads to the equation

$$ \dot{\mathbf{x}} = \dot{\mathbf{x}}_0\,w = \mathbf{n}\,(\lambda_m\,\nu)\,w \quad \text{with} \quad \nu = \frac{k_{\rm B}T\,m_0}{V\,\lambda_m} \qquad (5) $$
where ν is the eigenfrequency of the chosen lattice characterized by the scaling length λm .
The eigenfrequency represents the attack frequency for one particular grain boundary with constant mobility. In order to use a whole spectrum of mobilities and driving forces in one simulation it is necessary to normalize the eigenfrequency by a common grid attack frequency ν0 rendering it into
$$ \dot{\mathbf{x}} = \dot{\mathbf{x}}_0\,w = \mathbf{n}\,\lambda_m\,\nu_0\,\frac{\nu}{\nu_0}\,w = \hat{\dot{\mathbf{x}}}_0\,\frac{\nu}{\nu_0}\,w = \hat{\dot{\mathbf{x}}}_0\,\hat{w} \qquad (6) $$
The value of the attack frequency ν_0, which is characteristic of the lattice, can be calculated from the condition that the maximum occurring switching probability cannot be larger than one:

$$ \hat{w}^{\rm max} = \frac{m_0^{\rm max}\,p^{\rm max}}{\lambda_m\,\nu_0^{\rm min}}\exp\left(-\frac{Q_{\rm gb}^{\rm min}}{k_{\rm B}T}\right) \overset{!}{\le} 1 \qquad (7) $$
where m_0^max is the maximum occurring pre-exponential factor of the mobility, p^max the maximum possible driving force, ν_0^min the minimum allowed grid attack frequency, and Q_gb^min the minimum occurring activation energy. With ŵ^max = 1 one obtains the normalization frequency as a function of the upper-bound input data:
$$ \nu_0^{\rm min} = \frac{m_0^{\rm max}\,p^{\rm max}}{\lambda_m}\exp\left(-\frac{Q_{\rm gb}^{\rm min}}{k_{\rm B}T}\right) \qquad (8) $$
This frequency and the local values of the mobility and the driving force lead to
$$ \hat{w}^{\rm local} = \frac{m_0^{\rm local}\,p^{\rm local}}{\lambda_m\,\nu_0^{\rm min}}\exp\left(-\frac{Q_{\rm gb}^{\rm local}}{k_{\rm B}T}\right) = \frac{m_0^{\rm local}}{m_0^{\rm max}}\,\frac{p^{\rm local}}{p^{\rm max}}\exp\left(-\frac{Q_{\rm gb}^{\rm local}-Q_{\rm gb}^{\rm min}}{k_{\rm B}T}\right) = \frac{m^{\rm local}\,p^{\rm local}}{m^{\rm max}\,p^{\rm max}} \qquad (9) $$

This expression is the central switching equation of the algorithm. One can interpret this equation also in terms of the local time Δt = λ_m/ẋ which is required by a grain boundary with velocity ẋ to sweep an automaton cell of size λ_m:
$$ \hat{w}^{\rm local} = \frac{m^{\rm local}\,p^{\rm local}}{m^{\rm max}\,p^{\rm max}} = \frac{\dot{x}^{\rm local}}{\dot{x}^{\rm max}} = \frac{\Delta t^{\rm max}}{\Delta t^{\rm local}} \qquad (10) $$
Equation (9) shows that the local switching probability can be quantified by the ratio of the local and the maximum mobility, m^local/m^max, which is a function of the grain boundary character, and by the ratio of the local and the maximum driving pressure, p^local/p^max. The probability of the fastest occurring
boundary segment (characterized by m_0^local = m_0^max, p^local = p^max, Q_gb^local = Q_gb^min) to realize a cell switch is equal to 1. Equation (9) shows that an increasing cell size does not influence the switching probability but only the time step elapsing during an attempted switch. This relationship is obvious since the volume to be swept becomes larger, which requires more time. The characteristic time constant of the simulation, Δt, is 1/ν_0^min. While Eq. (9) allows one to calculate the switching probability of a cell as a function of its previous state and the state of the neighbor cells, the actual decision about a cell switch is made by a Monte Carlo step. The use of random numbers ensures that all cell switches are sampled according to their proper statistical weight, i.e., according to the local driving force and mobility between cells. The simulation proceeds by calculating the individual local switching probabilities ŵ^local for each cell and evaluating them using a Monte Carlo algorithm. This means that for each cell the calculated switching probability is compared to a randomly generated number r which lies between 0 and 1. The switch is accepted if the random number is equal to or smaller than the calculated switching probability. Otherwise the switch is rejected:
$$ \text{accept switch if } r \le \frac{m^{\rm local}\,p^{\rm local}}{m^{\rm max}\,p^{\rm max}}, \qquad \text{reject switch if } r > \frac{m^{\rm local}\,p^{\rm local}}{m^{\rm max}\,p^{\rm max}} \qquad (11) $$

where r is a random number between 0 and 1. Except for the probabilistic evaluation of the analytically calculated transformation probabilities, the approach is entirely deterministic. Thermal fluctuations other than those already included via Turnbull’s rate equation are not permitted. The use of realistic or even experimental input data for the grain boundaries enables one to make predictions on a real time and space scale. The switching rule is scalable to any mesh size and to any spectrum of boundary mobility and driving force data. The state update of all cells is made in synchrony.
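The switching logic of Eqs. (8)–(11) can be rendered compactly as follows. This is a schematic sketch, not the author's production code: the local and extremal mobilities, driving forces, and activation energies are passed in as plain numbers, whereas a full implementation would derive them cell by cell from the misorientations in the boundary network.

```python
import math
import random

K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # J per eV

def nu0_min(m0_max, p_max, q_min_ev, lam, temperature):
    """Normalization (grid attack) frequency of Eq. (8)."""
    return (m0_max * p_max / lam) * math.exp(-q_min_ev * EV / (K_B * temperature))

def attempt_switch(m0_loc, q_loc_ev, p_loc,
                   m0_max, q_min_ev, p_max, temperature):
    """Monte Carlo evaluation of the switching probability, Eqs. (9)-(11).

    Returns True if the attempted cell switch is accepted, i.e., if the
    random number r does not exceed the normalized local probability.
    """
    w_local = ((m0_loc / m0_max) * (p_loc / p_max)
               * math.exp(-(q_loc_ev - q_min_ev) * EV / (K_B * temperature)))
    return random.random() <= w_local    # accept if r <= w_local
```

In each characteristic time step Δt = 1/ν_0^min, every cell adjacent to a recrystallized neighbor would be subjected to one such probabilistic switch attempt, with all accepted switches applied in synchrony.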
3.3. Simulation of Primary Static Recrystallization and Comparison to Avrami-type Kinetics
Figure 2 shows the kinetics and 3D microstructures of a recrystallizing aluminum single crystal. The initial deformed crystal had a uniform Goss orientation (011)[100] and a dislocation density of 10^15 m^-2. The driving force was due to the stored elastic energy provided by the dislocations. In order to compare the predictions with analytical Avrami kinetics, recovery
Figure 2. Kinetics and microstructure of recrystallization in a plastically strained aluminum single crystal. The deformed crystal had a (011)[100] orientation and a uniform dislocation density of 10^15 m^-2. Simulation parameters: site-saturated nucleation; lattice size: 10 × 10 × 10 µm³; cell size: 0.1 µm; activation energy of large-angle grain boundary mobility: 1.3 eV; pre-exponential factor of large-angle boundary mobility: m_0 = 6.2 × 10^-6 m³/(N s); temperature: 800 K; time constant: 0.35 s.
and driving forces arising from local boundary curvature were not considered. The simulation used site-saturated nucleation conditions, i.e., the nuclei were at t = 0 s statistically distributed in physical space and orientation space. The grid size was 10 × 10 × 10 µm³. The cell size was 0.1 µm. All large-angle grain boundaries had the same mobility, with an activation energy of the grain boundary mobility of 1.3 eV and a pre-exponential factor of the boundary mobility of m_0 = 6.2 × 10^-6 m³/(N s) [37]. Small-angle grain boundaries had a mobility of zero. The temperature was 800 K. The time constant of the simulation was 0.35 s. Figure 3 shows the kinetics for a number of 3D recrystallization simulations with site-saturated nucleation conditions and identical mobility for all grain boundaries. The different curves correspond to different initial numbers of nuclei. The initial number of nuclei varied between 9624 (pseudo-nucleation energy of 3.2 eV) and 165 (pseudo-nucleation energy of 6.0 eV). The curves (Fig. 3a) all show a typical Avrami shape and the logarithmic plots (Fig. 3b)
Figure 3. Kinetics for various 3D recrystallization simulations with site-saturated nucleation conditions and identical mobility for all grain boundaries. The different curves correspond to different initial numbers of nuclei. The initial number of nuclei varied between 9624 (pseudo-nucleation energy of 3.2 eV) and 165 (pseudo-nucleation energy of 6.0 eV). (a) Avrami diagrams. (b) Logarithmic diagrams showing Avrami exponents between 2.86 and 3.13.
reveal Avrami exponents between 2.86 and 3.13, which is in very good accord with the analytical value of 3.0 for site-saturated conditions. The simulations with a very high initial density of nuclei reveal a more pronounced deviation of the Avrami exponent, with values around 2.7 during the beginning of recrystallization. This deviation from the analytical behavior is due to lattice effects: while the analytical derivation assumes a vanishing volume for newly formed nuclei, the cellular automaton has to assign one lattice point to each new nucleus. Figure 4 shows the effect of grain boundary mobility on growth selection. While in Fig. 4a all boundaries had the same mobility, in Fig. 4b one grain boundary had a larger mobility than the others (activation energy of the mobility of 1.35 eV instead of 1.40 eV) and consequently grew much faster than the neighboring grains, which finally ceased to grow. The grains in this simulation all grew into a heavily deformed single crystal.
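The Avrami exponents quoted above correspond to the slope of ln(ln(1/(1 − X))) versus ln t, as plotted in Fig. 3(b). A minimal sketch of that extraction, checked here against a synthetic site-saturated curve whose exponent is 3 by construction, might read:

```python
import numpy as np

def avrami_exponent(t, x):
    """Least-squares slope of ln(ln(1/(1-X))) versus ln(t)."""
    y = np.log(np.log(1.0 / (1.0 - np.asarray(x))))
    slope, _ = np.polyfit(np.log(np.asarray(t)), y, 1)
    return slope

# synthetic check: site-saturated kinetics X(t) = 1 - exp(-k t^3)
t = np.linspace(1.0, 40.0, 50)
x = 1.0 - np.exp(-1.0e-4 * t**3)
print(avrami_exponent(t, x))   # prints ~3.0
```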
Figure 4. Effect of grain boundary mobility on growth selection. All grains grow into a deformed single crystal. (a) All grain boundaries have the same mobility. (b) One grain boundary has a larger mobility than the others (activation energy of the mobility of 1.35 eV instead of 1.40 eV) and grows faster than the neighboring grains.
4. Examples of Coupling Cellular Automata with Crystal Plasticity Finite Element Models for Predicting Recrystallization

4.1. Motivation for Coupling Different Spatially Discrete Microstructure and Texture Simulation Methods
Simulation approaches such as the crystal plasticity finite element method or cellular automata are increasingly gaining momentum as tools for spatially and temporally discrete prediction of microstructures and textures. The major advantage of such approaches is that they consider material heterogeneity, as opposed to classical statistical approaches which are based on the assumption of material homogeneity. Although the average behavior of materials during deformation and heat treatment can sometimes be sufficiently well described without considering local effects, prominent examples exist where substantial progress in understanding and tailoring material response can only be attained by taking material heterogeneity into account. For instance, in the field of plasticity the quantitative investigation of ridging and roping or related surface defects observed in sheet metals requires knowledge about local effects such as the grain topology or the form and location of second phases. In the field of heat treatment, the origin of the Goss texture in transformer steels, the incipient stages of cube texture formation during primary recrystallization of aluminum, the reduction of the grain size in microalloyed low carbon steel sheets, and the development of strong {111}⟨uvw⟩ textures in steels can hardly be predicted without incorporating local effects such as the orientation and location of recrystallization nuclei and the character and properties of the grain boundaries surrounding them. Although spatially discrete microstructure simulations have already profoundly enhanced our understanding of microstructure and texture evolution over the last decade, their potential is sometimes simply limited by an insufficient knowledge about the external boundary conditions which characterize the process and an insufficient knowledge about the internal starting conditions which are, to a large extent, inherited from the preceding process steps. It is thus an important goal to improve the incorporation of both types of information into such simulations. External boundary conditions prescribed by real industrial processes are often spatially non-homogeneous. They can be investigated using experiments or process simulations which consider spatial resolution. Spatial heterogeneities in the internal starting conditions, i.e., in the microstructure and texture, can be obtained from experiments or microstructure simulations which include spatial resolution.
4.2. Coupling, Scaling and Boundary Conditions
In the present example the results obtained from a crystal plasticity finite element simulation were used to map a starting microstructure for a subsequent discrete recrystallization simulation carried out with a probabilistic cellular automaton. The finite element model was used to simulate a plane strain compression test conducted on aluminum with a columnar grain structure to a total logarithmic strain of ε = −0.434. Details about the finite element model are given elsewhere [34, 35, 38, 39]. The values of the state variables (dislocation density, crystal orientation) given at the integration points of the finite element mesh were mapped onto the regular lattice of a 2D cellular automaton. While the original finite element mesh consisted of 36 977 quadrilateral elements, the cellular automaton lattice consisted of 217 600 discrete points. The values of the state variables at each of the integration points were assigned to the new cellular automaton lattice points which fell within the Wigner–Seitz cell corresponding to that integration point. The Wigner–Seitz cells of the finite element mesh were constructed from cell walls which were the perpendicular bisecting planes of all lines connecting neighboring integration points, i.e., the integration points were in the centers of the Wigner–Seitz cells. In the present example the original size of the specimen which provided the input microstructure to the crystal plasticity finite element simulations gave a lattice point spacing of λm = 61.9 µm. The maximum driving force in the region arising from the stored dislocation density amounted to about 1 MPa. The temperature dependence of the shear modulus and of the Burgers vector was considered in the calculation of the driving force. The grain boundary mobility in the region was characterized by an activation energy of 1.46 eV and a pre-exponential factor of m0 = 8.3 × 10⁻³ m³/(N s). Together with the scaling length λm = 61.9 µm these data were used for the calculation of the time step Δt = 1/ν₀^min and of the local switching probabilities ŵ_local. The annealing temperature was 800 K. Large angle grain boundaries were characterized by an activation energy for the mobility of 1.3 eV. Small angle grain boundaries were assumed to be immobile.
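The Wigner–Seitz construction described above is equivalent to assigning each automaton cell the state of its nearest finite element integration point. A minimal sketch of this mapping (NumPy and SciPy assumed; all array and function names are illustrative, not from the original code):

import numpy as np
from scipy.spatial import cKDTree

def map_fe_to_ca(ip_coords, ip_rho, ip_orientation, nx, ny, cell_size):
    """Assign to each cell of a regular 2D automaton lattice the state
    variables (dislocation density, orientation) of the nearest finite
    element integration point; nearest-point assignment is equivalent
    to sorting the cells into the Wigner-Seitz cells of the
    integration-point mesh."""
    xs = (np.arange(nx) + 0.5) * cell_size
    ys = (np.arange(ny) + 0.5) * cell_size
    cells = np.array([(x, y) for y in ys for x in xs])
    _, nearest = cKDTree(ip_coords).query(cells)   # index of nearest IP
    rho = ip_rho[nearest].reshape(ny, nx)
    orientation = ip_orientation[nearest].reshape(ny, nx, -1)
    return rho, orientation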
4.3. Nucleation Criterion
The nucleation process during primary static recrystallization has been explained for pure aluminum in terms of discontinuous subgrain growth [40]. According to this model nucleation takes place in areas which reveal high misorientations among neighboring subgrains and a high local driving force
for curvature driven discontinuous subgrain coarsening. The present simulation approach works above the subgrain scale, i.e., it does not explicitly describe cell walls and subgrain coarsening phenomena. Instead, it incorporates nucleation on a more phenomenological basis using the kinetic and thermodynamic instability criteria known from classical recrystallization theory (see e.g., [40]). The kinetic instability criterion means that a successful nucleation process leads to the formation of a mobile large angle grain boundary which can sweep the surrounding deformed matrix. The thermodynamic instability criterion means that the stored energy changes across the newly formed large angle grain boundary, providing a net driving force pushing it forward into the deformed matter. Nucleation in this simulation is performed in accord with these two aspects, i.e., potential nucleation sites must fulfill both the kinetic and the thermodynamic instability criteria. The nucleation model used here does not create any new orientations: at the beginning of the simulation the thermodynamic criterion, i.e., the local value of the dislocation density, was first checked for all lattice points. If the dislocation density was larger than some critical fraction of the maximum value occurring in the sample, the cell was spontaneously recrystallized without any orientation change, i.e., a dislocation density of zero was assigned to it and the original crystal orientation was preserved. In the next step the ordinary growth algorithm was started according to Eqs. (1)–(11), i.e., the kinetic conditions for nucleation were checked by calculating the misorientations between all spontaneously recrystallized cells (preserving their original crystal orientation) and the cells in their immediate neighborhood, considering the first, second, and third neighbor shells. If any such pair of cells revealed a misorientation above 15°, the cell flip of the unrecrystallized cell was calculated according to its actual transformation probability, Eq. (8). In the case of a successful cell flip the orientation of the first recrystallized neighbor cell was assigned to the flipped cell.
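In code, the two instability checks reduce to a threshold test on the stored dislocation density followed by a misorientation test against the neighbor shells. A sketch (NumPy assumed; the misorientation helper is simplified in that it omits the cubic crystal symmetry operators a production code would apply):

import numpy as np

def misorientation_deg(g1, g2):
    """Misorientation angle between two orientation matrices; simplified
    in that the 24 cubic symmetry operators, over which a production
    code would minimize, are omitted."""
    dg = g1 @ g2.T
    cos_theta = np.clip((np.trace(dg) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))

def thermodynamic_nuclei(rho, fraction=0.5):
    """Thermodynamic instability criterion: cells above a fraction of
    the maximum stored dislocation density recrystallize spontaneously
    (their dislocation density is then reset to zero)."""
    return rho > fraction * rho.max()

def kinetically_unstable(g_nucleus, neighbor_orientations, theta_c=15.0):
    """Kinetic instability criterion: the nucleus may grow only if at
    least one cell in the first to third neighbor shells is misoriented
    by more than the critical angle theta_c."""
    return any(misorientation_deg(g_nucleus, g) > theta_c
               for g in neighbor_orientations)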
4.4. Predictions and Interpretation
Figures 5–7 show simulated microstructures for site saturated spontaneous nucleation in all cells with a dislocation density larger than 50% of the maximum value (Fig. 5), larger than 60% of the maximum value (Fig. 6), and larger than 70% of the maximum value (Fig. 7). Each figure shows a set of four subsequent microstructures during recrystallization. The upper graphs in Figs. 5–7 show the evolution of the stored dislocation densities. The gray areas are recrystallized, i.e., the stored dislocation content of the affected cells was set to zero. The lower graphs represent the microtexture images, where each color represents a specific crystal orientation.
Figure 5. Consecutive stages of a 2D simulation of primary static recrystallization in a deformed aluminum polycrystal on the basis of crystal plasticity finite element starting data. The figure shows the change in dislocation density (top) and in microtexture (bottom) as a function of the annealing time during isothermal recrystallization. The texture is given in terms of the magnitude of the Rodrigues orientation vector using the cube component as reference. The gray areas in the upper figures indicate a stored dislocation density of zero, i.e., these areas are recrystallized. The thick white lines in both types of figures indicate grain boundaries with misorientations above 15° irrespective of the rotation axis. The thin green lines indicate misorientations between 5° and 15° irrespective of the rotation axis. The simulation parameters are: 800 K; thermodynamic instability criterion: site-saturated spontaneous nucleation in cells with at least 50% of the maximum occurring dislocation density (threshold value); kinetic instability criterion for further growth of such spontaneous nuclei: misorientation above 15°; activation energy of the grain boundary mobility: 1.46 eV; pre-exponential factor of the grain boundary mobility: m0 = 8.3 × 10⁻³ m³/(N s); mesh size of the cellular automaton grid (scaling length): λm = 61.9 µm.
The color level is determined as the magnitude of the Rodrigues orientation vector using the cube component as reference. The thick white lines in both types of figures indicate grain boundaries with misorientations above 15° irrespective of the rotation axis. The thin green lines indicate misorientations between 5° and 15° irrespective of the rotation axis.
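The color coding can be reproduced by expressing each cell orientation as a rotation relative to the cube reference and taking the magnitude of the corresponding Rodrigues vector. A minimal sketch (orientations as rotation matrices; NumPy assumed; crystal symmetry reduction omitted for brevity):

import numpy as np

def rodrigues_magnitude(g, g_ref=np.eye(3)):
    """Magnitude of the Rodrigues vector of the rotation carrying the
    cube reference orientation g_ref into the cell orientation g,
    |r| = tan(theta/2) with theta the rotation angle (crystal symmetry
    reduction omitted for brevity)."""
    dg = g @ g_ref.T
    cos_theta = np.clip((np.trace(dg) - 1.0) / 2.0, -1.0, 1.0)
    return np.tan(np.arccos(cos_theta) / 2.0)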
Figure 6. Parameters as in Fig. 5, but site-saturated spontaneous nucleation occurred in all cells with at least 60% of the maximum occurring dislocation density.
Figure 7. Parameters as in Fig. 5, but site-saturated spontaneous nucleation occurred in all cells with at least 70% of the maximum occurring dislocation density.
The incipient stages of recrystallization in Fig. 5 (cells with 50% of the maximum occurring dislocation density undergo spontaneous nucleation without orientation change) reveal that nucleation is concentrated in areas with large accumulated local dislocation densities. As a consequence the nuclei form clusters of similarly oriented new grains (e.g., Fig. 5a). Less deformed areas between the bands reveal a very small density of nuclei. Consequently, the subsequent stages of recrystallization (Fig. 5b–d) reveal that the nuclei do not sweep the surrounding deformation structure freely as described by Avrami–Johnson–Mehl theory but impinge upon each other and thus compete at an early stage of recrystallization. Figure 6 (using 60% of the maximum occurring dislocation density as threshold for spontaneous nucleation) also reveals strong nucleation clusters in areas with high dislocation densities. Owing to the higher threshold value for a spontaneous cell flip, nucleation outside of the deformation bands occurs very rarely. Similar observations hold for Fig. 7 (70% threshold value). It also shows an increasing grain size as a consequence of the reduced nucleation density. The deviation from Avrami–Johnson–Mehl type growth, i.e., the early impingement of neighboring crystals, is also reflected by the overall kinetics, which differ from the classical sigmoidal curve found for homogeneous nucleation conditions. Figure 8 shows the kinetics of recrystallization
[Figure 8: recrystallized volume fraction [vol.%] vs. annealing time [s]; curves for the 50%, 60%, and 70% max. disloc. density thresholds.]
Figure 8. Kinetics of the recrystallization simulations shown in Figs. 5–7, annealing temperature: 800 K; scaling length λm = 61.9 µm.
(for the simulations with different threshold dislocation densities for spontaneous nucleation, Figs. 5–7). All curves reveal a very flat shape compared to the analytical model. The high offset value for the curve with 50% critical dislocation density is due to the small threshold value for a spontaneous initial cell flip. This means that 10% of all cells undergo initial site saturated nucleation. Figure 9 shows the corresponding Cahn–Hagel diagrams. It is found that the curves increasingly flatten and drop with an increasing threshold dislocation density for spontaneous recrystallization. It is an interesting observation in all three simulation series that in most cases where spontaneous nucleation took place in areas with large local dislocation densities, the kinetic instability criterion was usually also well enough fulfilled to enable further growth of these freshly recrystallized cells. In this context one should take notice of the fact that both instability criteria were treated entirely independently in this simulation. In other words, only those spontaneously recrystallized cells which subsequently found a misorientation above 15° to at least one non-recrystallized neighbor cell were able to expand further. This is the essential difference between a potential nucleus and a successful nucleus. Translating this observation into the initial deformation microstructure means that in the present example high dislocation densities
[Figure 9: interface area between recrystallized and non-recrystallized matter divided by sample volume [1/cell size] vs. recrystallized volume fraction [%]; curves for the 50%, 60%, and 70% max. disloc. density thresholds.]
Figure 9. Simulated interface fractions between recrystallized and non-recrystallized material for the recrystallization simulations shown in Figs. 5–7, annealing temperature: 800 K; scaling length λm = 61.9 µm.
and large local lattice curvatures typically occurred in close neighborhood or even at the same sites. Another essential observation is that the nucleation clusters are particularly concentrated in macroscopic deformation bands which were formed as diagonal instabilities through the sample thickness. Generic intrinsic nucleation inside heavily deformed grains, however, occurs rarely. Only the simulation with a very small threshold value of only 50% of the maximum dislocation density as a precondition for a spontaneous energy drop shows some successful nucleation events outside the large bands. But even then nucleation is only successful at former grain boundaries where orientation changes occur naturally. Summarizing this argument, there might be a transition from extrinsic nucleation, such as inside bands or related large scale instabilities, to intrinsic nucleation inside grains or close to existing grain boundaries. It is likely that both types of nucleation deserve separate attention. As far as the strong nucleation in macroscopic bands is concerned, future consideration should be placed on issues such as the influence of external friction conditions and sample geometry on nucleation. Both aspects strongly influence through-thickness shear localization effects. Another result of relevance is the partial recovery of deformed material. Figures 5d, 6d, and 7d reveal small areas where moving large angle grain boundaries did not entirely sweep the deformed material. An analysis of the state variable values at these coordinates and of the grain boundaries involved substantiates that not insufficient driving forces but insufficient misorientations between the deformed and the recrystallized areas (entailing a drop in grain boundary mobility) were responsible for this effect. This mechanism is referred to as orientation pinning.
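The Cahn–Hagel interface fractions plotted in Fig. 9 can be measured directly on the automaton grid by counting cell faces shared by recrystallized and non-recrystallized cells. A minimal 2D sketch (NumPy assumed; a 3D version would count faces along the third axis as well):

import numpy as np

def cahn_hagel_point(rx):
    """One point of a Cahn-Hagel diagram from a boolean 2D cell array
    rx (True = recrystallized): returns the recrystallized volume
    fraction and the interface area between recrystallized and
    non-recrystallized matter per sample volume, in 1/(cell size)."""
    faces = (np.count_nonzero(rx[1:, :] != rx[:-1, :]) +
             np.count_nonzero(rx[:, 1:] != rx[:, :-1]))
    return rx.mean(), faces / rx.size

# sampling this at successive automaton steps traces out one curve of Fig. 9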
4.5. Simulation of Nucleation Topology within a Single Grain
Recent efforts in simulating recrystallization phenomena on the basis of crystal plasticity finite element or electron microscopy input data are increasingly devoted to tackling the question of nucleation. In this context it must be stated clearly that mesoscale cellular automata can neither directly map the physics of a nucleation event nor develop any novel theory for nucleation at the sub-grain level. However, cellular automata can predict the topological evolution and competition among growing nuclei during the incipient stages of recrystallization. The initial nucleation criterion itself must be incorporated in a phenomenological form. This section deals with such an approach for investigating nucleation topology. The simulation was again started using a crystal plasticity finite
element approach. The crystal plasticity model set-up consisted of a single aluminum grain with face centered cubic crystal structure and 12 {111}⟨110⟩ slip systems which was embedded in a plastic continuum which had the elastic-plastic properties of an aluminum polycrystal with random texture. The crystallographic orientation of the aluminum grain in the center was ϕ1 = 32°, φ = 85°, ϕ2 = 85°. The entire aggregate was plane strain deformed to 50% thickness reduction (given as d/d0, where d is the actual sample thickness and d0 its initial thickness). The resulting data (dislocation density, orientation distribution) were then used as input data for the ensuing cellular automaton recrystallization simulation. The distribution of the dislocation density taken from all integration points of the finite element simulation is given in Fig. 10. Nucleation was initiated as outlined in detail in Section 4.3, i.e., each lattice point which had a dislocation density above a critical value (500 × 10¹³ m⁻² in the present case, see Fig. 10) was
[Figure 10: frequency [1] vs. dislocation density [10¹³ m⁻²]; inset: Δd/d = 50%, FCC, orientation ϕ1 = 32°, φ = 85°, ϕ2 = 85°.]
Figure 10. Distribution of the simulated dislocation density in a deformed aluminum grain embedded in a plastic aluminum continuum. The simulation was performed by using a crystal plasticity finite element approach. The set-up consisted of a single aluminum grain (orientation: ϕ1 = 32°, φ = 85°, ϕ2 = 85° in Euler angles) with face centered cubic crystal structure and 12 {111}⟨110⟩ slip systems which was embedded in a plastic continuum which had the elastic-plastic properties of an aluminum polycrystal with random texture. The sample was plane strain deformed to 50% thickness reduction. The resulting data (dislocation density, orientation distribution) were used as input data for a cellular automaton recrystallization simulation.
spontaneously recrystallized without orientation change. In the ensuing step the growth algorithm was started according to Eqs. (1)–(11), i.e., a nucleus could only expand further if it was surrounded by lattice points of sufficient misorientation (above 15°). In order to concentrate on recrystallization in the center grain, the nuclei could not expand into the surrounding continuum material. Figures 11a–c show the change in dislocation density during recrystallization (Fig. 11a: 9% of the entire sample recrystallized, 32.1 s; Fig. 11b: 19% of the entire sample recrystallized, 45.0 s; Fig. 11c: 29.4% of the entire sample recrystallized, 56.3 s). The color scale marks the dislocation density of each lattice point in units of 10¹³ m⁻². The white areas are recrystallized. The surrounding blue area indicates the continuum material in which the grain is embedded (and into which recrystallization was not allowed to proceed). Figures 12a–c show the topology of the evolving nuclei without coloring the as-deformed volume. All recrystallized grains are colored indicating their crystal orientation. The non-recrystallized material and the continuum surrounding the grain are colored white. Figure 13 shows the volume fractions of the growing nuclei during recrystallization as a function of annealing time (800 K). The data reveal that two groups of nuclei occur. The first class of nuclei shows some growth in the beginning but no further expansion during the later stages of the anneal. The second class of nuclei shows strong and steady growth during the entire recrystallization time. One could refer to the first group as non-relevant nuclei, while the second group could be termed relevant nuclei. The reasons for such a spread in the evolution of nucleation topology after the initial formation of the nuclei are nucleation clustering, orientation pinning, growth selection, or driving force selection phenomena. Nucleation clustering means that areas which reveal localization of strain and misorientation produce high local nucleation rates. This entails clusters of newly formed nuclei where competing crystals impinge on each other at an early stage of recrystallization, so that only some of the newly formed grains of each cluster can expand further. Orientation pinning is an effect where not insufficient driving forces but insufficient misorientations between the deformed and the recrystallized areas (entailing a drop in grain boundary mobility) are responsible for the limitation of further growth. In other words, some nuclei expand during growth into areas where the local misorientation drops below 15°. Growth selection is a phenomenon where some grains grow significantly faster than others due to a local advantage originating from higher grain boundary mobility, such as shown in Fig. 4b. Typical examples are the 40° ⟨111⟩ rotation relationship in aluminum or the 27° ⟨110⟩ rotation relationship in iron–silicon, which are known to have a growth advantage (e.g., Ref. [40]). Driving force selection is a phenomenon where some grains grow significantly faster than others due to a local advantage in driving force (shear bands, microbands, heavily deformed grains).
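The distinction between relevant and non-relevant nuclei can be made quantitative by recording each grain's volume over the annealing time, as in Fig. 13, and testing whether it keeps growing in the later stages. A sketch (assuming the automaton stores an integer grain index per cell; names illustrative):

import numpy as np

def grain_volumes(grain_id, n_grains):
    """Volume (cell count) of every new grain in one automaton snapshot;
    grain_id holds an integer grain index per cell."""
    return np.bincount(grain_id.ravel(), minlength=n_grains)

def classify_nuclei(volume_history, late_window=5):
    """volume_history: (n_steps, n_grains) array of grain volumes.
    A nucleus counts as 'relevant' if it is still growing over the last
    late_window snapshots and 'non-relevant' if it has stagnated."""
    late_growth = volume_history[-1] - volume_history[-late_window]
    return late_growth > 0          # boolean mask over all grains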
Figure 11. Change in dislocation density during recrystallization (800 K). The color scale indicates the dislocation density of each lattice point in units of 10¹³ m⁻². The white areas are recrystallized. The surrounding blue area indicates the continuum material in which the grain is embedded. (a) 9% of the entire sample recrystallized, 32.1 s; (b) 19% of the entire sample recrystallized, 45.0 s; (c) 29.4% of the entire sample recrystallized, 56.3 s.
Figure 12. Topology of the evolving nuclei of the microstructure given in Fig. 11 without coloring the as-deformed volume. All newly recrystallized grains are colored indicating their crystal orientation. The non-recrystallized material and the continuum surrounding the grain are colored white. (a) 9% of the entire sample recrystallized, 32.1 s; (b) 19% of the entire sample recrystallized, 45.0 s; (c) 29.4% of the entire sample recrystallized, 56.3 s.
[Figure 13: volume of new grains [cell³] vs. annealing time [s].]
Figure 13. Volume fractions of the growing nuclei in Fig. 11 during recrystallization as a function of annealing time (800 K).
5. Conclusions and Outlook
A review was given of the fundamentals and some applications of cellular automata in the field of microstructure research. Special attention was placed on reviewing the fundamentals of mapping rate formulations for interfaces and driving forces on cellular grids. Some applications were discussed from the field of recrystallization theory. The future of the cellular automaton method in the field of mesoscale materials science lies most likely in the discrete simulation of equilibrium and non-equilibrium phase transformation phenomena. The particular advantage of automata in this context is their versatility with respect to the constitutive ingredients, to the consideration of local effects, and to the modification of the grid structure and the interaction rules. In the field of phase transformation simulations the constitutive ingredients are the thermodynamic input data and the kinetic coefficients. Both sets of input data are increasingly available from theory and experiment, rendering cellular automaton simulations more and more realistic. The second advantage, i.e., the incorporation of local effects, will improve our insight into cluster effects, such as those arising from the spatial competition of expanding neighboring spheres already in the incipient stages of transformations. The third advantage, i.e., the flexibility of automata with respect to the grid structure and the interaction rules, is probably the most
important aspect for novel future applications. Introducing more global interaction rules and long-range or even statistical elements in addition to the local rules for the state update might establish cellular automata as a class of methods to solve some of the intricate scale problems that are often encountered in the materials sciences. It is conceivable that for certain mesoscale problems, such as the simulation of transformation phenomena in heterogeneous materials in dimensions far beyond the grain scale, cellular automata can occupy a role between the discrete atomistic approaches and statistical Avrami-type approaches. The major drawback of the cellular automaton method in the field of transformation simulations is the absence of solid approaches for the treatment of nucleation phenomena. Although basic assumptions about nucleation sites, nucleation rates, and nucleation textures can often be included on an empirical basis as a function of the local values of the state variables, intrinsic physically based phenomenological concepts, such as are available to a certain extent in the Ginzburg–Landau framework (in case of the spinodal mechanism), are not yet available for automata. It might hence be beneficial in future work to combine Ginzburg–Landau-type phase field approaches with the cellular automaton method. For instance, the (spinodal) nucleation phase could then be treated with a phase field method and the resulting microstructure could be further treated with a cellular automaton simulation.
References

[1] J. von Neumann, "The general and logical theory of automata," In: W. Aspray and A. Burks (eds.), Papers of John von Neumann on Computing and Computer Theory, vol. 12 in the Charles Babbage Institute Reprint Series for the History of Computing, MIT Press, Cambridge, 1987 (original 1963).
[2] S. Wolfram, Theory and Applications of Cellular Automata, Advanced Series on Complex Systems, selected papers 1983–1986, vol. 1, World Scientific Publishing Co. Pte. Ltd, Singapore, 1986.
[3] S. Wolfram, "Statistical mechanics of cellular automata," Rev. Mod. Phys., 55, 601–622, 1983.
[4] M. Minsky, Computation: Finite and Infinite Machines, Prentice-Hall, Englewood Cliffs, NJ, 1967.
[5] J.H. Conway, Regular Algebra and Finite Machines, Chapman & Hall, London, 1971.
[6] D. Raabe, Computational Materials Science, Wiley-VCH, Weinheim, 1998.
[7] H.W. Hesselbarth and I.R. Göbel, "Simulation of recrystallization by cellular automata," Acta Metall., 39, 2135–2144, 1991.
[8] C.E. Pezzee and D.C. Dunand, "The impingement effect of an inert, immobile second phase on the recrystallization of a matrix," Acta Metall., 42, 1509–1522, 1994.
[9] R.K. Sheldon and D.C. Dunand, "Computer modeling of particle pushing and clustering during matrix crystallization," Acta Mater., 44, 4571–4582, 1996.
[10] C.H.J. Davies, "The effect of neighbourhood on the kinetics of a cellular automaton recrystallisation model," Scripta Metall. et Mater., 33, 1139–1154, 1995.
[11] V. Marx, D. Raabe, and G. Gottstein, "Simulation of the influence of recovery on the texture development in cold rolled BCC-alloys during annealing," In: N. Hansen, D. Juul Jensen, Y.L. Liu, and B. Ralph (eds.), Proceedings 16th Risø Int. Sympos. on Mat. Science: Microstructural and Crystallographic Aspects of Recrystallization, Risø Nat. Lab, Roskilde, Denmark, pp. 461–466, 1995.
[12] D. Raabe, "Cellular automata in materials science with particular reference to recrystallization simulation," Ann. Rev. Mater. Res., 32, 53–76, 2002.
[13] V. Marx, D. Raabe, O. Engler, and G. Gottstein, "Simulation of the texture evolution during annealing of cold rolled bcc and fcc metals using a cellular automaton approach," Textures Microstruct., 28, 211–218, 1997.
[14] V. Marx, F.R. Reher, and G. Gottstein, "Simulation of primary recrystallization using a modified three-dimensional cellular automaton," Acta Mater., 47, 1219–1230, 1998.
[15] C.H.J. Davies, "Growth of nuclei in a cellular automaton simulation of recrystallisation," Scripta Mater., 36, 35–46, 1997.
[16] C.H.J. Davies and L. Hong, "Cellular automaton simulation of static recrystallization in cold-rolled AA1050," Scripta Mater., 40, 1145–1152, 1999.
[17] D. Raabe, "Introduction of a scaleable 3D cellular automaton with a probabilistic switching rule for the discrete mesoscale simulation of recrystallization phenomena," Philos. Mag. A, 79, 2339–2358, 1999.
[18] D. Raabe and R. Becker, "Coupling of a crystal plasticity finite element model with a probabilistic cellular automaton for simulating primary static recrystallization in aluminum," Modell. Simul. Mater. Sci. Eng., 8, 445–462, 2000.
[19] D. Raabe, "Yield surface simulation for partially recrystallized aluminum polycrystals on the basis of spatially discrete data," Comput. Mater. Sci., 19, 13–26, 2000.
[20] D. Raabe, F. Roters, and V. Marx, "Experimental investigation and numerical simulation of the correlation of recovery and texture in bcc metals and alloys," Textures Microstruct., 26–27, 611–635, 1996.
[21] M.B. Cortie, "Simulation of metal solidification using a cellular automaton," Metall. Trans. B, 24, 1045–1052, 1993.
[22] S.G.R. Brown, T. Williams, and J.A. Spittle, "A cellular automaton model of the steady-state free growth of a non-isothermal dendrite," Acta Metall., 42, 2893–2906, 1994.
[23] C.A. Gandin and M. Rappaz, "A 3D cellular automaton algorithm for the prediction of dendritic grain growth," Acta Metall., 45, 2187–2198, 1997.
[24] C.A. Gandin, "Stochastic modeling of dendritic grain structures," Adv. Eng. Mater., 3, 303–306, 2001.
[25] C.A. Gandin, J.L. Desbiolles, and P.A. Thevoz, "Three-dimensional cellular automaton-finite element model for the prediction of solidification grain structures," Metall. Mater. Trans. A, 30, 3153–3172, 1999.
[26] J.A. Spittle and S.G.R. Brown, "A cellular automaton model of steady-state columnar-dendritic growth in binary alloys," J. Mater. Sci., 30, 3989–3402, 1995.
[27] S.G.R. Brown, G.P. Clarke, and A.J. Brooks, "Morphological variations produced by cellular automaton model of non-isothermal free dendritic growth," Mater. Sci. Technol., 11, 370–382, 1995.
[28] J.A. Spittle and S.G.R. Brown, "A 3D cellular automaton model of coupled growth in two component systems," Acta Metall., 42, 1811–1820, 1994.
[29] M. Kumar, R. Sasikumar, P. Nair, and R. Kesavan, "Competition between nucleation and early growth of ferrite from austenite: studies using cellular automaton simulations," Acta Mater., 46, 6291–6304, 1998.
[30] S.G.R. Brown, "Simulation of diffusional composite growth using the cellular automaton finite difference (CAFD) method," J. Mater. Sci., 33, 4769–4782, 1998.
[31] T. Yanagita, "Three-dimensional cellular automaton model of segregation of granular materials in a rotating cylinder," Phys. Rev. Lett., 3488–3492, 1999.
[32] E.M. Koltsova, I.S. Nenaglyadkin, A.Y. Kolosov, and V.A. Dovi, "Cellular automaton for description of crystal growth from the supersaturated unperturbed and agitated solutions," Rus. J. Phys. Chem., 74, 85–91, 2000.
[33] J. Geiger, A. Roosz, and P. Barkoczy, "Simulation of grain coarsening in two dimensions by cellular-automaton," Acta Mater., 49, 623–629, 2001.
[34] Y. Liu, T. Baudin, and R. Penelle, "Simulation of grain growth by cellular automata," Scripta Mater., 34, 1679–1686, 1996.
[35] T. Karapiperis, "Cellular automaton model of precipitation/dissolution coupled with solute transport," J. Stat. Phys., 81, 165–174, 1995.
[36] M.J. Young and C.H.J. Davies, "Cellular automaton modelling of precipitate coarsening," Scripta Mater., 41, 697–708, 1999.
[37] O. Kortluke, "A general cellular automaton model for surface reactions," J. Phys. A, 31, 9185–9198, 1998.
[38] G. Gottstein and L.S. Shvindlerman, Grain Boundary Migration in Metals: Thermodynamics, Kinetics, Applications, CRC Press, Boca Raton, 1999.
[39] R.C. Becker, "Analysis of texture evolution in channel die compression, I: Effects of grain interaction," Acta Metall. Mater., 39, 1211–1230, 1991.
[40] R.C. Becker and S. Panchanadeeswaran, "Effects of grain interactions on deformation and local texture in polycrystals," Acta Metall. Mater., 43, 2701–2719, 1995.
[41] F.J. Humphreys and M. Hatherly, Recrystallization and Related Annealing Phenomena, Pergamon Press, New York, 1995.
7.8 MODELING COARSENING DYNAMICS USING INTERFACE TRACKING METHODS

John Lowengrub
University of California, Irvine, California, USA
In this paper, we will discuss the current state-of-the-art in numerical models of coarsening dynamics using a front-tracking approach. We will focus on coarsening during diffusional phase transformations. Many important structural materials such as steels, aluminum and nickel-based alloys are products of such transformations. Diffusional transformations occur when the temperature of a uniform mixture of materials is lowered into a regime where the uniform mixture is unstable. The system responds by nucleating second phase precipitates (e.g., crystals) that then evolve diffusionally until the process either reaches equilibrium or is quenched by further reducing the temperature. The diffusional evolution consists of two phases: growth and coarsening. Growth occurs in response to a local supersaturation in the primary (matrix) phase, and a local mass balance relation is satisfied at each precipitate interface. Coarsening occurs when a global mass balance is achieved and involves a dynamic rearrangement of the fixed total mass in the system so as to minimize a global energy. Typically, the global energy consists of the surface energy. If the transformation occurs between components in the solid state, there is also an elastic energy that arises due to the presence of a misfit stress between the precipitates and the matrix, as their crystal structures are often slightly different. Diffusional phase transformations are responsible for producing the material microstructure, i.e., the detailed arrangement of distinct constituents at the microscopic level. The details of the microstructure greatly influence the material properties of the alloy (i.e., stiffness, strength, and toughness). In many alloys, an in situ coarsening process can occur at high temperatures in which a dispersion of very small precipitates evolves to a system consisting of a few very large precipitates in order to decrease the surface energy of the system. This coarsening severely degrades the properties of the alloy and can lead to in-service failures. The details of this coarsening process depend strongly
on the elastic properties and crystal structure of the alloy components. Thus, one of the goals of this line of research is to use elastic stress to control the evolution process so as to achieve desirable microstructures. Numerical simulations of coarsening two-phase microstructures have followed two directions: interface capturing and interface tracking. In capturing methods, the precipitate/matrix interfaces are implicitly determined through an auxiliary function that is introduced to delineate between the precipitate and matrix phases. Examples include phase-field and level-set methods. Typically, sharp interfaces are smoothed out and the elasticity and diffusion systems are replaced by mesoscopic approximations that mimic the true field equations together with interface jump conditions. These methods have the advantage that topological changes such as precipitate coalescence and splitting are easily described. A disadvantage of this approach is that the results can be sensitive to the parameters that determine the thickness of the interfacial regions, and care needs to be taken to reconcile the results with those of sharp-interface and tracking methods. In interface tracking methods, which are the subject of this article, a specific mesh is introduced to approximate the interface. The evolution of the interface is tracked by explicitly evolving the interface mesh in time. Examples include boundary integral, immersed interface [1], ghost-fluid [2], and front-tracking [3, 4] methods. In boundary integral, immersed interface and ghost-fluid methods, for example, the interfaces remain sharp and the true field equations and jump conditions are solved. These methods have the advantage that high order accurate solutions can be obtained. Thus, in addition to their intrinsic value, results from these algorithms can also be used as benchmarks to validate interface-capturing methods. Boundary integral methods have the additional advantage that the field equations and jump conditions are mapped to the precipitate/matrix interfaces, thereby reducing the dimensionality of the problem. However, boundary integral methods typically apply only in the limited situation where the physical domains and parameters are piecewise homogeneous. The other tracking methods listed above do not suffer from this difficulty, although they are generally not as accurate as boundary integral methods. A general disadvantage of the tracking approach is that ad hoc cut-and-connect procedures are required to handle changes in interface topologies. In this article, we will focus primarily on a description of the state-of-the-art in boundary integral methods.
1. Coarsening
One of the central assumptions of mathematical models of coarsening is that the system evolves so as to decrease the total energy. This energy consists of an interfacial part, associated with the precipitate/matrix interfaces, and a
bulk part due to the elasticity of the constituent materials. In the absence of elastic stress, precipitates tend to be roughly spherical and interfacial area is reduced by the diffusion of matter from regions of high interfacial curvature to regions of low curvature. During coarsening, this leads to a survival of the fattest, since large precipitates grow at the expense of small ones. This coarsening process may severely degrade the properties of the alloy. In the early 1960s, an asymptotic theory, now referred to as the LSW theory, was developed by Lifshitz and Slyozov [5], and Wagner [6] to predict the temporal power law of precipitate growth and in particular the scaling at long times of the precipitate radius distribution. In this LSW theory, only surface energy is considered and it is found that the average precipitate radius R ∼ t^{1/3} at long times. The LSW theory has two major restrictions, however. First, precipitates are assumed to be circular (spherical in 3D) and second, the theory is valid only in the zero (precipitate) volume fraction limit. Extending the results of LSW to account for non-spherical precipitates, finite volume fractions and elastic interactions has been a subject of intense research interest and is one of the primary reasons for the development of accurate and efficient numerical methods to study microstructure evolution. See the recent reviews by Johnson and Voorhees [7], Voorhees [7] and Thornton et al. [8].
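When comparing simulations or experiments with this prediction, the exponent is usually estimated by a least-squares fit in log-log coordinates. A minimal sketch (NumPy assumed; the data arrays are illustrative):

import numpy as np

def coarsening_exponent(t, r_avg):
    """Least-squares slope of log <R> versus log t; the LSW theory
    predicts a value of 1/3 at long times."""
    slope, _ = np.polyfit(np.log(t), np.log(r_avg), 1)
    return slope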
2. Governing Equations
For the purposes of illustration, let us focus on a two-phase microstructure in a binary alloy. We further assume that the matrix phase Ω_M extends to infinity (or in 2D may be contained in a large domain Ω_∞), while the precipitate phase Ω_P consists of isolated particles occupying a finite volume. The interface between the two phases is a collection of closed surfaces Γ. The evolution of the precipitate-matrix interface is controlled by diffusion of matter across the interface. Assuming quasi-static diffusion, the normalized composition c is governed by Laplace's equation

∇²c = 0    (1)

in both phases. The composition on a precipitate-matrix interface is given by the Gibbs–Thomson boundary condition [9]

c = −(τ I + ∇_n ∇_n τ) : K − Z g_el − λ Vn,    (2)

where τ = τ(n) is the non-dimensional anisotropic surface tension, n is the normal vector directed towards Ω_M, I is the identity tensor, (∇_n ∇_n τ)_ij = ∂²τ/∂n_i ∂n_j, and

K = −[(L/E) s1 s1 + (M/√(EG)) (s1 s2 + s2 s1) + (N/G) s2 s2]
is the curvature tensor, where s1 and s2 are tangent vectors to the interface and the definitions of L, E, M, G, F and N depend on the interface parametrization and can be found in standard differential geometry texts [10]. Note that tr(K) = 2H, where H is the mean curvature. In addition, Z characterizes the relative strength of the surface and elastic energies, g_el is the elastic energy density (defined below), Vn is the normal velocity of the precipitate/matrix interface and λ is a non-dimensional linear kinetic coefficient. Roughly speaking, this boundary condition reflects the idea that changing the shape of a precipitate changes the energy of the system both through the additional surface energy, (τ I + ∇_n ∇_n τ) : K, and also through the change in elastic energy of the system, Z g_el. We note that the composition is normalized differently in the precipitate and matrix, so that the normalized composition is continuous across the interface; the actual dimensional composition undergoes a jump. The normal velocity is given by the flux balance

Vn = k (∂c/∂n)_P − (∂c/∂n)_M,    (3)
where (∂c/∂n)_P and (∂c/∂n)_M denote the values of the normal derivative of c evaluated on the precipitate side and the matrix side of the interface, respectively, and k is the ratio of thermal diffusivities. Two different far-field conditions for the diffusion problem can be posed. In the first, the mass flux J into the system is specified:

J = (1/4π) ∫_Γ Vn dΓ = (1/4π) ∫_{∂Ω∞} (∂c/∂n) d(∂Ω∞),    (4)
where Ω∞ is a large domain containing all the precipitates. As a second, alternative boundary condition, the far-field composition c∞ is specified:

lim_{|x|→∞} c(x) = c∞.    (5)
In 2D, the limit in Eq. (5) is taken only to ∂Ω∞ since c diverges logarithmically at infinity (see the 2D Green's function below). Since the elastic energy density g_el appears in the Gibbs–Thomson boundary condition (2), one must solve for the elastic fields in the system before finding the diffusion fields. The elastic fields arise if there is a misfit strain, denoted by ε^T, between the precipitate and matrix. This misfit is taken into account through the constitutive relations between the stress σ_ij and strain ε_ij. These are σ^P_ij = C^P_ijkl (ε^P_kl − ε^T_kl) in the precipitate and σ^M_ij = C^M_ijkl ε^M_kl in the matrix, where we have taken the matrix lattice as the reference. The superscripts P and M refer to the values in the precipitate and matrix, respectively. The elastic stiffness tensor C_ijkl may be different in the matrix and precipitate (elastically inhomogeneous) and may also reflect different material symmetries of the two phases.
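In Voigt notation these constitutive relations reduce to matrix-vector products, which is how they typically appear in code. A minimal sketch for a cubic phase with a unit dilatational misfit (NumPy assumed; the constants c11, c12, c44 and the strain vector are illustrative placeholders):

import numpy as np

def cubic_stiffness_voigt(c11, c12, c44):
    """6x6 stiffness matrix of a cubic phase in Voigt notation."""
    c = np.full((3, 3), c12)
    np.fill_diagonal(c, c11)
    stiff = np.zeros((6, 6))
    stiff[:3, :3] = c
    stiff[3:, 3:] = np.eye(3) * c44
    return stiff

def precipitate_stress(c_p, eps_p):
    """sigma^P = C^P : (eps^P - eps^T) for a unit dilatational misfit,
    eps^T = (1, 1, 1, 0, 0, 0) in Voigt (engineering-strain) form."""
    eps_t = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
    return c_p @ (eps_p - eps_t)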
The equations of elastic equilibrium require that

σ_ij,j = 0    (6)

in both phases (in the absence of body forces). We also assume the interfaces are coherent, so the displacement u (i.e., ε_ij = (u_i,j + u_j,i)/2) and the traction t (i.e., t_i = σ_ij n_j) are continuous across them. For simplicity, we suppose that the far-field tractions and displacements vanish. The elastic energy density g_el is given by

g_el = (1/2) [σ^P_ij (ε^P_ij − ε^T_ij) − σ^M_ij ε^M_ij + σ^M_ij (ε^M_ij − ε^P_ij)].    (7)

Finally, the total energy of the system is the sum of the surface and elastic energies

W_tot = W_s + W_el,    (8)

where

W_s = ∫_Γ τ(n) dΓ,  and  W_el = (Z/2) [∫_{Ω_P} σ^P_ij (ε^P_ij − ε^T_ij) dΩ + ∫_{Ω_M} σ^M_ij ε^M_ij dΩ].    (9)

For details on the isotropic formulation, derivation and non-dimensionalization, see Li et al. [11], the review articles [8, 12] and the references therein.
3. The Boundary Integral Formulation
We first consider the diffusion problem. If the interface kinetics λ > 0, then a single-layer potential can be used. That is, the composition is given by

c(x) = ∫_Γ σ(x′) G(x − x′) dΓ(x′) + c̄∞,    (10)

where σ(x) is the single-layer potential, G(x) is the Green's function (i.e., 2D: G(x) = (1/2π) log|x|; 3D: G(x) = 1/(4π|x|)) and c̄∞ is a constant. Then, taking the limit as x → Γ, and using Eq. (2), we get the Fredholm boundary integral equation

−(τ I + ∇_n ∇_n τ) : K − Z g_el = ∫_Γ σ(x′) G(x − x′) dΓ(x′) + λ Vn + c̄∞,    (11)

where the normal velocity Vn is related to σ(x). In fact, if the ratio of diffusivities k = 1, then Vn = σ(x) and the equation is a 2nd kind Fredholm integral
equation. See [13]. For simplicity, let us suppose this is the case. Then, if the flux is specified, Eq. (11) is solved together with Eq. (4) to determine Vn and c̄∞. In 3D, if far-field condition (5) is imposed, then c̄∞ = c∞. In 2D, if (5) is imposed, then another single-layer potential must be introduced at the far-field boundary ∂Ω∞ [14]. If the interface kinetics λ = 0, then a double-layer potential should be used:

c(x_i) = ∫_Γ μ_i(x′) (∂G/∂n)(x_i − x′) dΓ(x′) + Σ_{k=1}^{n_p} A_k G(x_i − S_k),    (12)
in each domain i, where i = p, m, n_p is the number of precipitates and S_k is a point inside the kth precipitate. Taking the limit x → Γ leads to the system of 2nd kind Fredholm equations
∫_Γ μ_i(x′) (∂G/∂n)(x_i − x′) dΓ(x′) + Σ_{k=1}^{n_p} A_k G(x_i − S_k) ± μ_i/2 = −(τ I + ∇_n ∇_n τ) : K − Z g_el,    (13)
where the plus sign is taken when i = m [13]. The A_k are determined from the equations

∫_Γ μ_i(x′) dΓ(x′) = 0,  for i = 1, …, n_p − 1,  and  Σ_{k=1}^{n_p} A_k = J.
The normal velocity Vn is obtained by taking the normal derivative of Eq. (12), taking care to treat the singularity of the Green's function [13], and thus depends on μ_i(x′). Equation (13) is then solved together with the far-field conditions in either Eq. (4) or (5) to obtain μ_i(x′), c̄∞ and Vn. We note that in 3D, we have recently found that a vector potential formulation [15] rather than a dipole formulation gives better numerical accuracy for computing Vn in this case (Pham, Lowengrub, Nie, Cristini, in preparation). Finally, once Vn is known, the interface is updated by

n · dx/dt = Vn.    (14)
To actually solve the boundary integral equations, the elastic energy density g_el must be determined first. This requires the solution of the elasticity equations. The boundary integral equations for the continuous displacement field u(x) and traction field t(x) on the interface involve Cauchy-principal-value
integrals over the interface. The equations can, using a direct formulation, be written as

∫_Γ (u_i(y) − u_i(x)) T^P_ijk(y − x) n_k(y) dΓ(y) − ∫_Γ t_i(y) G^P_ijk(y − x) dΓ(y) = ∫_Γ t^T_i(y) G^P_ijk(y − x) dΓ(y),    (15)

and

u_i(x) − ∫_Γ (u_i(y) − u_i(x)) T^M_ijk(y − x) n_k(y) dΓ(y) − ∫_Γ t_i(y) G^M_ijk(y − x) dΓ(y) = 0,    (16)
where T_ijk and G_ijk are the Green's functions associated with the traction and displacement, respectively, and t^T_i = C^P_ijkp ε^T_kp n_j is the misfit traction. For isotropic elasticity, the Green's functions are given by the Kelvin solution. For general 3D anisotropic materials, the Green's functions cannot be written explicitly and are formulated in terms of line integrals. In 2D, explicit formulas exist for the Green's functions. See for example [16, 17]. From the components of the displacements and tractions, the elastic energy density g_el can be calculated [12].
4. Numerical Implementation
The numerical procedure to simulate the evolution is as follows. Given the precipitate shapes, the elasticity Eqs. (15) and (16) are solved and the elastic energy g_el is determined. Then the diffusion equation is solved, the normal velocity is calculated and the interfaces are advanced one step in time. Precipitates whose volume falls below a certain tolerance are removed from the simulation. In 2D, very efficient, spectrally accurate numerical methods have been developed to solve this problem [12]. The integrals with smooth integrands are discretized with spectral accuracy using the trapezoid rule. The Cauchy principal value integrals are discretized with spectral accuracy using the alternating point trapezoid rule. The fast multipole method [18] is used to evaluate the discrete sums in O(N) work, where N is the total number of collocation points on all the interfaces. Further efficiency is gained by neglecting particle-particle interactions if the particles are well-separated. The iterative method GMRES is then used to solve the discrete nonsymmetric, non-definite elasticity and diffusion matrix systems. The surface tension introduces a severe third-order time step constraint for stability: Δt ≤ C Δs³, where C is a constant and Δs is the minimum spacing in
arclength along all the interfaces. To overcome this difficulty, Hou, Lowengrub and Shelley [12] performed a mathematical analysis of the equations of motion at small length-scales (the "small-scale decomposition"). This analysis shows that when the equations of motion are properly formulated, surface tension acts through a linear operator at small length-scales. This contribution, when combined with a special reference frame in which the collocation points remain equally spaced in arclength, can then be treated implicitly and efficiently in a time-integration scheme, and the high-order constraints removed. In 3D, efficient algorithms have been recently developed by Li et al. [11] and Cristini and Lowengrub [19]. In these approaches, the surfaces are discretized using an adaptive triangulated surface mesh [20]. As in 2D, the integral equations are solved using the collocation method and GMRES. In Li et al. [11], local quadratic Lagrange interpolation is used to represent field quantities (i.e., u, t, Vn, and the position of the interface x) in triangle interiors. The normal vector is derived from the local coordinates using the Lagrange interpolants of the interface position. The curvature is determined by performing a local quadratic fit to the triangulated surface. This combination was found to yield the best accuracy for a given resolution. On mesh triangles where the integrand is singular, a nonlinear change of variables (Duffy's transformation) is used to map the singular triangle to a unit square and to remove the 1/r divergence of the integrand. For triangles in a region close to the singular triangle, the integrand is nearly singular, and so each of these triangles is divided into four smaller triangles, and a high-order quadrature is used on each subtriangle individually. On all other mesh triangles, the high-order quadrature is used to approximate the integrals. In Cristini and Lowengrub [19], there are no effects of elasticity (Z = 0); the collocation method is used to solve the diffusion integral equation together with GMRES, and the nonlinear Duffy transformation removes the singularity of the integrand in the singular triangle. Away from the singular triangle, the trapezoid rule is used and no interpolations are used to represent the field quantities in triangle interiors. As in Li et al., the curvature is still determined by performing a local quadratic fit to the triangulated surface. In both Li et al., and Cristini and Lowengrub, a second-order Runge–Kutta method is used to advance the triangle nodes. The time-step size is proportional to the smallest diameter of the triangular elements raised to the 3/2 power: Δt = C h^{3/2}. This scaling is due to the fact that the adaptive mesh uniformly resolves the solid angle. Since the shape of the precipitate can change substantially during its evolution, one of the keys to the success of these algorithms is the use of the adaptive-mesh refinement algorithm developed originally by Cristini, Blawzdziewicz, and Loewenberg [20]. In this algorithm, the solid angle is uniformly resolved throughout the simulation using the following local-mesh restructuring operations to achieve an optimal mesh density: grid equilibration,
edge-swapping, and node addition and subtraction. This results in a density of node points that is proportional to the maximum of the curvature (in absolute value), so that grid points cluster in highly curved regions of the interface. Further, each of the mesh triangles is nearly equilateral. Finally, to further increase efficiency, a parallelization algorithm is implemented for the diffusion and elasticity solvers. The computational strategy for the parallelization is similar to the one designed for the microstructural evolution in 2D elastic media [12]. A new feature of the algorithm implemented by Li et al. is that the diffusion and elasticity matrices are also divided among the different processors in order to reduce the amount of memory required on each individual processor.
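The local quadratic fit used above for the curvature can be sketched as follows: the neighbors of a mesh node are expressed in a local frame aligned with the node normal, a paraboloid is fitted by least squares, and the mean curvature follows from the fitted coefficients. A sketch assuming the unit normal is already known (NumPy assumed; names illustrative):

import numpy as np

def mean_curvature_quadratic_fit(node, normal, neighbors):
    """Local quadratic fit z = a x^2 + b x y + c y^2 + d x + e y in a
    frame whose third axis is the (known) unit normal of the node;
    the mean curvature of the fitted paraboloid at the node is
    H = (kappa1 + kappa2)/2 = a + c.  neighbors: (m, 3) array of
    nearby mesh nodes with m >= 5."""
    n = normal / np.linalg.norm(normal)
    t1 = np.cross(n, [1.0, 0.0, 0.0])
    if np.linalg.norm(t1) < 1e-8:         # normal parallel to the x-axis
        t1 = np.cross(n, [0.0, 1.0, 0.0])
    t1 /= np.linalg.norm(t1)
    t2 = np.cross(n, t1)
    q = neighbors - node
    x, y, z = q @ t1, q @ t2, q @ n
    a_mat = np.column_stack((x * x, x * y, y * y, x, y))
    coef = np.linalg.lstsq(a_mat, z, rcond=None)[0]
    return coef[0] + coef[2]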
5. Two-dimensional Results
The state-of-the-art in 2D simulations of purely diffusional evolution in the absence of elastic stress (Z = 0) is the work of [21]. In metallic alloy systems, this corresponds to simulating systems of very small precipitates where the surface energy dominates the elastic energy. Using the methods described above, Akaiwa and Meiron performed simulations containing over 5000 precipitates. Akaiwa and Meiron divided the computational domain into sub-domains each containing 50–150 precipitates. Inside each sub-domain, the full diffusion field is computed. The influence of particles outside each sub-domain is restricted to only involve those lying within a distance of 6–7 times the average precipitate radius from the sub-domain. This was found to give at most a 1% error in the diffusion field and significantly reduces the computational cost. In Fig. 1, two snapshots of a typical simulation are shown at the very late stages of coarsening. In this simulation, the precipitate area fraction is 0.5 and periodic boundary conditions are applied. In Fig. 1 (left), there are approximately 130 precipitates remaining, while in Fig. 1 (right) there are only approximately 70 precipitates left. Note that there is no discernible alignment of precipitates. Further, as the system coarsens, the typical shape of a precipitate shows significant deviation from a circle. The simulation results of Akaiwa and Meiron agree with the classical Lifshitz–Slyozov–Wagner (LSW) theory in which the average precipitate radius R is predicted to scale as R ∝ t^{1/3} at large times t. It was found that certain statistics, such as the particle size distribution functions, are insensitive to the non-circular particle shapes even at moderate volume fractions. Simulations were restricted to volume fractions less than 0.5 due to the large computational costs associated with refining the space and time scales to resolve particle-particle near-contact interactions at larger volume fractions. The current state of the art in simulating diffusional evolution in homogeneous, anisotropic elastic media is the recent work of [22], who studied alloys
Figure 1. The late stages of coarsening in the absence of elastic forces (Z = 0). Left: Moderate time; Right: Late time. After [21]. Reproduced with permission.
with cubic symmetry. In metallic alloys, such a system can be considered as a model for nickel–aluminum alloys. In the homogeneous case, one need not solve Eqs. (15)–(16). Instead, the derivatives of the displacement field and hence the elastic energy density g el due to a misfitting precipitate may be evaluated directly from the Green’s function tensor via the boundary integral [22]
$$u_{j,k}(\mathbf{x}) = \left(C_{il11} + C_{il22}\right)\oint_{\partial\Omega} g_{ij,k}(\mathbf{x},\mathbf{x}')\, n_l(\mathbf{x}')\,\mathrm{d}\Sigma(\mathbf{x}'), \tag{17}$$
where the misfit is a unit dilatation, x may lie either in the matrix or in the precipitate, and C_ijkl is the stiffness tensor. Using the methods described above, together with a fast summation method to calculate the integral in Eq. (17), Akaiwa, Thornton and Voorhees, 2001 performed simulations involving over 4000 precipitates. See Fig. 2 for results with isotropic surface tension and dilatational misfits. The value of Z is allowed to vary dynamically through the average precipitate radius: as precipitates coarsen and grow larger, Z increases correspondingly. The initial volume fraction of precipitates is 0.1. Thornton, Akaiwa and Voorhees find that the morphological evolution is significantly different in the presence of elastic stress. In particular, large-scale alignment of particles along the [100] and [010] directions is seen during the evolution. In addition, there is a significant shape dependence: nearly circular precipitates are seen at small Z, and as Z increases, precipitates become squarish and then rectangular. It is found that in the elastically homogeneous system, elastic stress does not modify the t^{1/3} temporal exponent of the LSW coarsening law, even though the precipitate morphologies are far from circular. Surprisingly, as long as the
Figure 2. Coarsening in homogeneous, cubic elasticity. The volume fraction is 10%. The left column shows the computational domain, while the right column is scaled with the average particle size. After Thornton, Akaiwa and Voorhees, 2001. Reproduced with permission.
shapes remain fourfold symmetric, the kinetics (the coefficient of the temporal power law) also remain unchanged. It is only when a majority of the particles have a two-fold, rectangular shape that the coarsening kinetics changes [23]. The inhomogeneous elasticity problem is much more difficult to solve than the homogeneous one because, in the inhomogeneous case, the integral equations (15)–(16) must be solved in order to obtain the elastic fields and the elastic energy density g^el. For this reason, theory and simulation are less well developed for the inhomogeneous case than for the homogeneous problem. The current state of the art in simulating microstructure evolution in inhomogeneous, anisotropic elastic media is the work of Leo, Lowengrub and Nie, 2000. Although the system (15), (16) is a Fredholm equation of mixed type with smooth, logarithmic, and Cauchy-type kernels, it was shown by Leo, Lowengrub and Nie, 2000 that, in the anisotropic case, the system may be transformed directly into a Fredholm system of the second kind with smooth kernels. The transformation relies on an analysis of the equations at small spatial scales. Leo, Lowengrub and Nie, 2000 found that even small elastic inhomogeneities may have a strong effect on precipitate evolution in systems with small numbers of precipitates. For instance, in systems where the elastic constants of the precipitates are smaller than those of the matrix (soft precipitates), the precipitates move toward each other. In the opposite case (hard precipitates), the precipitates tend to repel one another. The rate of approach or repulsion depends on the amount of inhomogeneity. Anisotropic surface energy may either enhance or reduce this effect. The evolutions of two sample inhomogeneous systems in 2D are shown in Fig. 3. The solid curves correspond to Ni3Al precipitates (soft: elastic constants less than those of the Ni matrix) and the dashed curves correspond to Ni3Si precipitates (hard: elastic constants larger than those of the Ni matrix). In both cases, the matrix is Ni. Note that only the Ni3Si precipitates are shown at time t = 20.09, for reasons explained below. From a macroscopic point of view, there seems to be little difference between the results of the two simulations over the times considered. The precipitates become squarish at very early times and there is only a small amount of particle translation. One can observe that the upper and lower pairs of relatively large precipitates tend to align locally along the horizontal direction. The global alignment of all precipitates along the horizontal and vertical directions appears to occur on a longer time scale. On the time scale presented, the kinetics appears to be primarily driven by the surface energy, which favors coarsening: the growth of large precipitates at the expense of the small ones to reduce the total surface energy. Upon closer examination, differences between the simulations are observed. For example, consider the result at time t = 15.77, shown in Fig. 3. In the Ni3Al case, the two upper precipitates attract one another and will likely merge. In the Ni3Si case, on the other hand, it does not appear that these two
Figure 3. Evolution of 10 precipitates in a Ni matrix, shown at times t = 0, 2.5, 5.0, 15.0, 15.77 and 20.09. Solid curves, Ni3Al; dashed curves, Ni3Si; Z = 1. After Leo, Lowengrub and Nie, 2000. Reproduced with permission.
precipitates will merge. This is consistent with the results of smaller precipitate simulations [24]. In addition, the interacting pairs of Ni3Al precipitates tend to be "flatter" than their Ni3Si counterparts. Also observe that the lower two precipitates in the Ni3Al case attract one another. In the process, the lower right precipitate develops very high curvature (note its flat bottom), which ultimately prevents the simulation from being continued much beyond this time. This is why no Ni3Al precipitates are shown in Fig. 3 at time t = 20.09. Finally, more work is needed to simulate larger inhomogeneous systems in order to reliably determine coarsening rate constants.
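As a side note, the t^{1/3} scaling quoted above is straightforward to test on simulation output. The sketch below, with hypothetical data, fits the cube of the mean radius against time, following the classical LSW form ⟨R⟩³ − ⟨R(0)⟩³ = k³t; none of the numbers are taken from the simulations discussed here.

```python
import numpy as np

# Hypothetical mean radii <R>(t) extracted from a coarsening simulation.
t = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])           # time
R_mean = np.array([0.63, 0.72, 0.85, 1.03, 1.27, 1.58])  # mean radius

# LSW: <R>^3 - <R(0)>^3 = k^3 * t, so <R>^3 should be linear in t.
slope, intercept = np.polyfit(t, R_mean**3, 1)
k = slope ** (1.0 / 3.0)       # coarsening rate constant
R0 = intercept ** (1.0 / 3.0)  # effective initial radius
print(f"k = {k:.3f}, R(0) = {R0:.3f}")
```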
6. Three-dimensional Results
Because of the difficulties of simulating the evolution of 2D surfaces in 3D, the simulation of microstructure evolution in 3D is much less developed than its 2D counterpart. Nevertheless, there has been promising recent work that is beginning to bridge the gap. The state of the art in 3D boundary integral simulations is the work of Cristini and Lowengrub, 2004 and Li et al., 2003. Using the adaptive simulation algorithms described above, Cristini and Lowengrub, 2004 simulated the diffusional evolution of systems with a
single precipitate growing under the influence of a driving force consisting of either an imposed far-field heat flux or a constant undercooling in the far field. Under conditions of constant heat flux, Cristini and Lowengrub demonstrated that the Mullins–Sekerka instability can be suppressed and precipitates can be grown with compact shapes. An example simulation from Cristini and Lowengrub, 2004 is shown in Fig. 4. In this figure, the precipitate morphologies, together with the shape factor δ/R, are shown for precipitates grown under constant undercooling and constant flux conditions. R is the effective precipitate radius (i.e., the radius of an equivalent sphere enclosing the same volume) and δ/R measures the shape deviation from the equivalent sphere. In Fig. 5, the coarsening of a system of 8 precipitates in 3D is shown in the absence of elastic effects (Z = 0), from Li, Lowengrub and Cristini, 2004. This adaptive simulation uses the algorithms described above and is performed in an infinitely large domain. Because the precipitates are spaced relatively far from one another, there is little apparent deviation of the morphologies from spherical; however, this is not assumed or required by the algorithm. In Fig. 5 we see the classical "survival of the fattest", as mass is transferred from small precipitates to large ones. Work is ongoing to develop simulations at finite
Figure 4. Precipitate morphologies grown under constant undercooling and constant flux conditions. After Cristini and Lowengrub, 2004. Reproduced with permission.
Figure 5. The coarsening of a system of 8 precipitates in 3D in the absence of elastic effects (Z = 0), shown at times t = 0, 0.75, 1.5, 4.0, 6.75 and 7.5. Figure courtesy of Li, Lowengrub and Cristini, 2004.
Figure 6. The evolution of a Ni3 Al precipitate in a Ni matrix (Z = 4). Left: early time. Right: late time (equilibrium). After Li et al., 2003. Reproduced with permission.
volume fractions, of precipitate coarsening in periodic geometries [25], in order to determine statistically meaningful coarsening rates. The current state of the art in simulations of coarsening in 3D with elastic effects is the work of Li et al., 2003. To date, simulations have been performed with single precipitates. A sample simulation from Li et al., 2003 is shown in Fig. 6 for the evolution of a Ni3Al precipitate in a Ni matrix with Z = 4. For this value of Z, and those above it (for Ni3Al precipitates), there is a transition from cuboidal to elongated shapes, as seen in the figure. Such elongated
Figure 7. Left and middle: Growth shapes of a Ni3Al precipitate in a Ni matrix. After Li et al., 2003. Reproduced with permission. Right: An experimental precipitate from a Ni-based superalloy, after Yoo, Yoon and Henry, 1995. Reproduced with permission.
shapes are often seen in experiments. Finally, in Fig. 7, we present growth shapes (left and middle) of a Ni3Al precipitate in a Ni matrix with Z = 4 under a driving force consisting of a constant flux of Al atoms [11]. In contrast to the precipitate in Fig. 6, during growth the Ni3Al precipitate retains its cuboidal shape, although it develops concave faces. On the right, an image from an experiment [26] shows Ni-based precipitates with concave faces similar to those observed in the simulation.
7. Outlook
In this paper, we have presented a brief description of the state of the art in simulating microstructure evolution, and in particular coarsening, using boundary integral interface tracking methods. In general, the methods are quite well developed in 2D. In particular, large-scale coarsening studies have been performed in the absence of elastic effects and when the elastic medium is homogeneous and anisotropic. Although methods have been developed to study coarsening in fully inhomogeneous, anisotropic elastic media, so far the computational expense of the current methods has prevented large-scale studies from being performed. There have been exciting developments in 3D and, although the state of the art in 3D simulations still lags well behind that in 2D, this direction looks very promising for the future. It is also an important future direction, as coarsening in metallic alloys, for example, is a fully 3D phenomenon. Efforts in this direction have a significant potential payoff: they will allow, for the first time, not only a rigorous check of the LSW coarsening kinetics in 3D but also an assessment of the effects of finite volume fraction and elastic forces on the coarsening kinetics.
References

[1] Z. Li and R. Leveque, "Immersed interface methods for Stokes flow with elastic boundaries or surface tension," SIAM J. Sci. Comput., 18, 709, 1997.
[2] S. Osher and R. Fedkiw, "Level set methods: An overview and some recent results," J. Comp. Phys., 169, 463, 2001.
[3] J. Glimm, M.J. Graham, J. Grove et al., "Front tracking in two and three dimensions," Comput. Math. Appl., 35, 1, 1998.
[4] G. Tryggvason, B. Bunner, A. Esmaeeli et al., "A front tracking method for the computations of multiphase flow," J. Comp. Phys., 169, 708, 2001.
[5] I.M. Lifshitz and V.V. Slyozov, J. Phys. Chem. Solids, 19, 35, 1961.
[6] C. Wagner, Z. Elektrochem., 65, 581, 1961.
[7] W.C. Johnson and P.W. Voorhees, "Elastically-induced precipitate shape transitions in coherent solids," Solid State Phenom., 23, 87, 1992.
[8] K. Thornton, J. Agren, and P.W. Voorhees, "Modelling the evolution of phase boundaries in solids at the meso- and nano-scales," Acta Mater., 51, 5675–5710, 2003.
[9] C. Herring, "Surface tension as a motivation for sintering," In: W.E. Kingston (ed.), The Physics of Powder Metallurgy, McGraw-Hill, p. 143, 1951.
[10] M. Spivak, "A Comprehensive Introduction to Differential Geometry," Vol. 4, Publish or Perish, 3rd edn., 1999.
[11] Li Xiaofan, J.S. Lowengrub, Q. Nie et al., "Microstructure evolution in three-dimensional inhomogeneous elastic media," Metall. Mater. Trans. A, 34A, 1421, 2003.
[12] T.Y. Hou, J.S. Lowengrub, and M.J. Shelley, "Boundary integral methods for multicomponent fluids and multiphase materials," J. Comp. Phys., 169, 302–362, 2001.
[13] S.G. Mikhlin, "Integral equations and their applications to certain problems in mechanics, mathematical physics, and technology," Pergamon, 1957.
[14] P.W. Voorhees, "Ostwald ripening of two phase mixtures," Annu. Rev. Mater. Sci., 22, 197, 1992.
[15] W.T. Scott, "The physics of electricity and magnetism," Wiley, 1959.
[16] A.E.H. Love, "A treatise on the mathematical theory of elasticity," Dover, 1944.
[17] T. Mura, "Micromechanics of defects in solids," Martinus Nijhoff, 1982.
[18] J. Carrier, L. Greengard, and V. Rokhlin, "A fast adaptive multipole algorithm," SIAM J. Sci. Stat. Comput., 9, 669, 1988.
[19] V. Cristini and J.S. Lowengrub, "Three-dimensional crystal growth II. Nonlinear simulation and control of the Mullins–Sekerka instability," J. Crystal Growth, in press, 2004.
[20] V. Cristini, J. Blawzdziewicz, and M. Loewenberg, "An adaptive mesh algorithm for evolving surfaces: Simulations of drop breakup and coalescence," J. Comp. Phys., 168, 445, 2001.
[21] N. Akaiwa and D.I. Meiron, "Two-dimensional late-stage coarsening for nucleation and growth at high-area fractions," Phys. Rev. E, 54, R13, 1996.
[22] N. Akaiwa, K. Thornton, and P.W. Voorhees, "Large scale simulations of microstructure evolution in elastically stressed solids," J. Comp. Phys., 173, 61–86, 2001.
[23] K. Thornton, N. Akaiwa, and P.W. Voorhees, "Dynamics of late stage phase separation in crystalline solids," Phys. Rev. Lett., 86(7), 1259–1262, 2001.
[24] P.H. Leo, J.S. Lowengrub, and Q. Nie, "Microstructure evolution in inhomogeneous elastic media," J. Comp. Phys., 157, 44, 2000.
[25] Li Xiangrong, J.S. Lowengrub, and V. Cristini, "Direct numerical simulations of coarsening kinetics in three-dimensions," In preparation, 2004.
[26] Y.S. Yoo, D.Y. Yoon, and M.F. Henry, "The effect of elastic misfit strain on the morphological evolution of γ′-precipitates in a model Ni-base superalloy," Metals Mater., 1, 47, 1995.
7.9 KINETIC MONTE CARLO METHOD TO MODEL DIFFUSION CONTROLLED PHASE TRANSFORMATIONS IN THE SOLID STATE

Georges Martin¹ and Frédéric Soisson²
¹Commissariat à l'Énergie Atomique, Cab. H.C., 33 rue de la Fédération, 75752 Paris Cedex 15, France
²CEA Saclay, DMN-SRMP, 91191 Gif-sur-Yvette, France
The classical theories of diffusion-controlled transformations in the solid state (precipitate nucleation, growth and coarsening, order–disorder transformations, domain growth) involve several kinetic coefficients: diffusion coefficients (for the solute to cluster into nuclei, or to move from smaller to larger precipitates, ...), transfer coefficients (for the solute to cross the interface, in the case of interface-reaction controlled kinetics) and ordering kinetic coefficients. If we restrict ourselves to coherent phase transformations, i.e., transformations that keep the underlying lattice unchanged, all such events (diffusion, transfer, ordering) are nothing but jumps of atoms from site to site on the lattice. Recent progress has made it possible to model, by various techniques, diffusion-controlled phase transformations in the solid state starting from the jumps of atoms on the lattice. The purpose of the present chapter is to introduce one of these techniques, the Kinetic Monte Carlo method (KMC). While the atomistic theory of diffusion blossomed in the second half of the 20th century [1], establishing the link between the diffusion coefficient and the jump frequencies of atoms, nothing as general and powerful occurred for phase transformations, because of the complexity of the latter at the atomic scale. A major exception is ordering kinetics (at least in the homogeneous case, i.e., avoiding the question of the formation of microstructures), which has been described by the atomistically based Path Probability Method [2]. In contrast, supercomputers have made it possible to simulate the formation of microstructures by simply letting the lattice-site occupancies change in the course of time following a variety of rules: the Kinetic Ising model (KIM) in particular has been (and
still is) extensively studied and is summarized in the appendix [3]; other models include "Diffusion Limited Aggregation", etc. Such models stimulate a whole field of the statistical physics of non-equilibrium processes. However, we choose here a distinct point of view, closer to materials science. Indeed, a unique skill of metallurgists is to master the formation of a desired microstructure simply by well-controlled heat treatments, i.e., by imposing a strictly defined thermal history on the alloy. Can we model diffusion-controlled phase transformations at a level of sophistication capable of reproducing the expertise of metallurgists? Since Monte Carlo techniques were of common use in elucidating delicate points of the theory of diffusion in the solid state [4, 5], it was quite natural to use the very same technique to simulate diffusion-controlled coherent phase transformations. Doing so, one is certain to retain the full wealth that the intricacies of diffusion mechanisms might introduce into the kinetic pathways of phase transformations. In particular, the question of the time scale is a crucial one, since the success of a heat treatment in stabilizing a given microstructure, or in ensuring the long-term integrity of that microstructure, is of key importance in materials science. In the following, we first recall the physical foundation of the expression for the atomic jump frequency; we then recall the connection between jump frequencies and the kinetic coefficients describing phase transformation kinetics; the KMC technique is then introduced, and typical results pertaining to metallurgy-relevant issues are given in the last section.
1. Jumps of Atoms in the Solid State
With a few exceptions, outside the scope of this introduction, atomic jumps in solids are thermally activated processes. Whenever an atom jumps, say from site α to α′, the configuration of the alloy changes from i to j. The probability per unit time for the transition to occur is:

$$W_{i,j} = \nu_{i,j} \exp\left(-\frac{\Delta H_{i,j}}{k_B T}\right) \tag{1}$$
In Eq. (1), ν_{i,j} is the attempt frequency, k_B is the Boltzmann constant, T is the temperature and ΔH_{i,j} is the activation barrier for the transition between configurations i and j. According to rate theory [6], the attempt frequency is, in the (quasi-)harmonic approximation:

$$\nu_{i,j} = \frac{\prod_{k=1}^{3N-3} \nu_k}{\prod_{k=1}^{3N-4} \nu'_k} \tag{2}$$
In Eq. (2), ν_k and ν′_k are the vibration eigenfrequencies of the solid, respectively in the initial configuration i and at the saddle point between configurations i and j. Notice that for a solid with N atoms, the number of eigenmodes
is 3N. However, the vibrations of the centre of mass (3 modes) are irrelevant to the diffusion process, hence the upper bound 3N−3 in the product in the numerator. At the saddle-point position between configurations i and j, one of the modes is a translation rather than a vibration mode, hence the upper bound 3N−4 in the denominator. Therefore, provided we know the values of ΔH_{i,j} and ν_{i,j} for each pair of configurations i and j, we need to implement some algorithm which propagates the system in its configuration space, as the jumps of atoms actually do in the real solid. Notice that the algorithm must be probabilistic, since W_{i,j} in Eq. (1) is a jump probability per unit time. Before we discuss this algorithm, we give some more details on diffusion mechanisms in solids, since the latter deeply affect the values of W_{i,j} in Eq. (1). The most common diffusion mechanisms in crystalline solids are vacancy, interstitial and interstitialcy diffusion [7]. Vacancies (vacant lattice sites) allow for the jumps of atoms from site to site on the lattice; in alloys, vacancy diffusion is responsible for the migration of solvent and substitutional solute atoms. Therefore, the transition from configuration i to j implies that one atom and one (nearest-neighbor) vacancy exchange their positions. As a consequence, the higher the vacancy concentration, the more numerous the configurations which can be reached from configuration i: indeed, starting from configuration i, any jump of any vacancy destroys that configuration. Therefore the transformation rate depends both on the jump frequencies of vacancies, as given by Eq. (1), and on the concentration of vacancies in the solid. This fact is commonly taken advantage of in practical metallurgy. At equilibrium, the vacancy concentration depends on the temperature, the pressure and, in alloys, on the chemical potential differences between the species:
$$C_v^e = \exp\left(-\frac{g_v^f}{k_B T}\right) \tag{3}$$
In Eq. (3), C_v^e = N_v/(N + N_v), with N the number of atoms, and g_v^f is the free enthalpy of formation of the vacancy. At equilibrium, the probability for an atom to jump equals the product of the probability for a vacancy to be a nearest neighbor of that atom (deduced from Eq. (3)) times the jump frequency given by Eq. (1). In real materials, vacancies form and annihilate at lattice discontinuities (free surfaces, dislocation lines and other lattice defects). If, in the course of the phase transformation, the equilibrium vacancy concentration changes, e.g., because of vacancy trapping in one of the phases, it takes some time for the vacancy concentration to adjust to its equilibrium value. This point, of common use in practical metallurgy, is poorly known from the basic point of view [8] and will be discussed later. Interstitial diffusion occurs when an interstitial atom (like carbon or nitrogen in steels) jumps to a nearest-neighbor unoccupied interstitial site.
Interstitialcy diffusion implies that a substitutional atom is "pushed" into an interstitial position by a nearest-neighbor interstitial atom, which itself becomes substitutional. This mechanism prevails, in particular, in metals under irradiation, where collisions of lattice atoms with the incident particles produce Frenkel pairs; a Frenkel pair is made of one vacancy and one dumb-bell interstitial (two atoms competing for one lattice site). The migration of the dumb-bell occurs by the interstitialcy mechanism. The concentration of dumb-bell interstitials results from the competition between the production of Frenkel pairs by nuclear collisions and their annihilation, either by recombination with vacancies or by elimination at some lattice discontinuity. The interstitialcy mechanism may also prevail in some ionic crystals, and in the diffusion of some gas atoms in metals.
2. From Atomic Jumps to Diffusion and to the Kinetics of Phase Transformations
The link between the jump frequencies and the diffusion coefficients has been established in detail in limiting cases [1]. The expressions are useful for adjusting the values of the jump frequencies to experimental data. As an illustration, we give below some expressions for the vacancy diffusion mechanism in crystals with cubic symmetry (with a the lattice parameter):

– In a pure solvent, the tracer diffusion coefficient is:

$$D^* = a^2 f_0 W_0 C_v^e, \tag{4a}$$
with f_0 the correlation factor (a purely geometrical factor) and W_0 the jump frequency of the vacancy in the pure metal.

– In a dilute solution with a face-centered cubic (FCC) lattice, with non-interacting solutes, and assuming that the spectrum of vacancy jump frequencies is limited to 5 distinct values (W_i, i = 0 to 4, for the vacancy jumps respectively in the solvent, around one solute atom, toward the solute, toward a solvent atom nearest neighbor of the solute, and away from the solute atom; see Fig. 1), the solute diffusion coefficient is:

$$D_{solute} = a^2 C_v^e \frac{W_4}{W_3} f_2 W_2, \tag{4b}$$

where the correlation factor f_2 can be expressed as a function of the W_i's. In dilute solutions, the solvent as well as the solute diffusion coefficient depends linearly on the solute concentration C, as:
$$D^*(C) = D^*(0)(1 + bC). \tag{4c}$$

The expression for b is given in [1, 9].
Figure 1. The five-frequency model in dilute FCC alloys: the five types of vacancy jumps are represented in a (111) plane (light gray: solvent atoms; dark gray: solute atom; open squares: vacancies).
– In concentrated alloys, approximate expressions have been derived recently [10].

The atomistic foundation of the classical models of diffusion-controlled coherent phase transformations is far less clear. For precipitation problems, two main techniques are of common use: nucleation theory (and its atomistic variant, sometimes named "cluster dynamics") and the Cahn–Hilliard diffusion equation [11]. In nucleation theory, one defines the formation free energy (or enthalpy, if the transformation occurs under fixed pressure) F(R) of a nucleus with size R (volume vR^3 and interfacial area sR^2, v and s being geometric factors computed for the equilibrium shape):

$$F(R) = \delta\mu\, v R^3 + \sigma\, s R^2. \tag{5}$$
In Eq. (5), δµ and σ are respectively the gain in chemical potential on forming one unit volume of the second phase, and the interfacial free energy (or free enthalpy) per unit area. If the solid solution is supersaturated, δµ is negative and F(R) first increases as a function of R, goes through a maximum at the critical size R* (R* = (2s/3v)(σ/|δµ|)), and then decreases (Fig. 2). F(R) can be given a more precise form, in particular for small values of R. More details may be found in Perini et al. [12]. For the critical nucleus,

$$F^* = F(R^*) \approx \sigma^3/(\delta\mu)^2. \tag{6}$$
F(R) can be seen as an energy hill which opposes the growth of sub-critical nuclei (R < R*) and drives the growth of super-critical nuclei (R > R*). The higher the barrier, i.e., the larger F*, the more difficult nucleation is. F* is very sensitive to the gain in chemical potential: the higher the supersaturation, the larger the gain, the shallower the barrier, and the easier the
Figure 2. Free energy change on forming a nucleus with radius R.
nucleation. F* also strongly depends on the interfacial energy, a poorly known quantity which, in principle, depends on the temperature. With the above formalism, the nucleation rate (i.e., the number of supercritical nuclei which form per unit time in a unit volume) is, under stationary conditions:

$$J_{steady} = \beta^* Z N_0 \exp\left(-\frac{F^*}{k_B T}\right) \tag{7a}$$

with N_0 the number of lattice sites and Z the Zeldovich constant:

$$Z = \left(-\frac{1}{2\pi k_B T}\left.\frac{\partial^2 F}{\partial n^2}\right|_{n=n^*}\right)^{1/2}, \tag{7b}$$
with n the number of solute atoms in a cluster and β* the sticking rate of solute atoms on the critical nucleus. If the probability of encounter of one solute atom with one nucleus is diffusion controlled:

$$\beta(R) = 4\pi D R C \tag{7c}$$
For a detailed discussion, see Waite [13]. In Eq. (7c), D is the solute diffusion coefficient in the (supersaturated) matrix with solute concentration C. An interesting quantity is the incubation time for precipitation, τ_inc, i.e., the time below which the nucleation current is much smaller than J_steady. It is given by:

$$\tau_{inc} \propto \frac{1}{\beta^* Z^2} \tag{7d}$$

When the supersaturation is small and/or the interfacial energy is high, the incubation time gets very large. Also, the incubation time is scaled by the diffusion coefficient of the solute.
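The quantities in Eqs. (5)–(7) are easy to evaluate numerically once δµ, σ and β* are known. The sketch below uses purely illustrative reduced units and an assumed order-of-magnitude Zeldovich constant; none of the numbers are fitted to any alloy.

```python
import numpy as np

kB_T = 0.07       # k_B * T in eV (roughly 800 K), illustrative
dmu = -0.05       # chemical potential gain per unit volume (supersaturated: < 0)
sigma = 0.05      # interfacial energy per unit area (reduced units)
v, s = 4.0 * np.pi / 3.0, 4.0 * np.pi  # shape factors of a sphere
N0 = 1e22         # lattice sites per unit volume, illustrative
beta_star = 1e6   # sticking rate on the critical nucleus (1/s), illustrative
Z = 0.05          # Zeldovich constant, a typical order of magnitude

F = lambda R: dmu * v * R**3 + sigma * s * R**2         # Eq. (5)
R_star = (2.0 * s / (3.0 * v)) * sigma / abs(dmu)       # critical size
F_star = F(R_star)                                      # barrier, cf. Eq. (6)
J_steady = beta_star * Z * N0 * np.exp(-F_star / kB_T)  # Eq. (7a)
tau_inc = 1.0 / (beta_star * Z**2)                      # Eq. (7d), up to a prefactor
print(R_star, F_star, J_steady, tau_inc)
```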
The nucleation process can also be described by the technique named "cluster dynamics". The microstructure is described, at any time, by the number density ρ_n of clusters made of n solute atoms. The latter varies in time as:

$$\frac{\mathrm{d}\rho_n}{\mathrm{d}t} = -\rho_n(\alpha_n + \beta_n) + \rho_{n+1}\alpha_{n+1} + \rho_{n-1}\beta_{n-1} \tag{8}$$
where α_n and β_n are respectively the rates of solute evaporation from and sticking to a cluster of n solute atoms. Again, α_n and β_n can be expressed in terms of solute diffusion or transfer coefficients. At later stages, when the second-phase precipitation has exhausted the solute supersaturation, Ostwald ripening takes place: because the chemical potential of the solute close to a precipitate increases with the curvature of the precipitate–matrix interface (δµ(R) = 2σ/R), the smaller precipitates dissolve to the benefit of the larger ones. According to Lifshitz and Slyozov and to Wagner [14], the mean precipitate volume increases linearly with time, i.e., the mean radius (as well as the mean precipitate spacing) grows as:

$$R(t) - R(0) = k\, t^{1/3} \tag{9a}$$
with

$$k^3 = \frac{8}{9}\,\frac{D \sigma C_s \Omega}{k_B T} \tag{9b}$$
In Eq. (9b), D is again the solute diffusion coefficient, C_s the solubility limit, and Ω the atomic volume. The problem of multicomponent alloys has been addressed by several authors [15]. The above models do not actually generate a full microstructure: they give the size distribution of precipitates as a function of time, as well as the mean precipitate spacing (since the total amount of solute is conserved), provided that the precipitates do not change composition in the course of the phase separation process. The formation of a full microstructure (i.e., including the variability of precipitate shapes, the correlations in the positions of precipitates, etc.) is best described by Cahn's diffusion equation [16]. In the latter, the chemical potential, the gradient of which is the driving force for diffusion, includes an inhomogeneity term, i.e., is a function, at each point, both of the concentration and of the curvature of the concentration field. The diffusion coefficient was originally given the form due to Darken. Based on a simple model of W_{i,j} and a mean-field approximation, an atomistically based expression of the mobility has been proposed, both for binary [17] and multicomponent alloys [18]. When precipitation occurs together with ordering, Cahn's equation is complemented with an equation for the relaxation of the degree of order; the latter relaxation occurs at a rate proportional to the gain in free energy due to the on-site relaxation of the degree of order. The rate constant is chosen arbitrarily [19]. Since in a crystalline
sample the ordering reaction proceeds by the very same diffusion mechanism as the precipitation, both rate constants (for the concentration field and for the degree-of-order field) should be expressed from the same set of W_{i,j}. This introduces couplings which have been ignored by classical theories [20]. In summary, despite their efficiency, the theories of coherent phase separation introduce rate constants (diffusion coefficients, interfacial transfer coefficients, rate constants for ordering) whose microscopic definition is not fully settled. The KMC technique offers a means to bypass the above difficulties and to directly simulate the formation of a microstructure in an alloy where atoms jump with the frequencies defined by Eq. (1).
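Before introducing the KMC algorithm, note that the cluster-dynamics description, Eq. (8), can be integrated directly. The sketch below takes the linear form of Eq. (8) at face value, with hypothetical rates α_n and β_n; a quantitative code would derive these from diffusion or transfer coefficients and couple β_n to the instantaneous monomer concentration.

```python
import numpy as np

def cluster_dynamics(rho0, alpha, beta, dt, n_steps):
    """Explicit Euler integration of Eq. (8) for the densities rho[n]
    of clusters of n solute atoms (index 0 corresponds to n = 1)."""
    rho = rho0.copy()
    for _ in range(n_steps):
        gain = np.zeros_like(rho)
        gain[:-1] += alpha[1:] * rho[1:]   # rho_{n+1} * alpha_{n+1}
        gain[1:]  += beta[:-1] * rho[:-1]  # rho_{n-1} * beta_{n-1}
        loss = (alpha + beta) * rho        # -rho_n (alpha_n + beta_n)
        rho += dt * (gain - loss)
    return rho

# Illustrative rates: evaporation falling with size, constant sticking.
n_max = 200
n = np.arange(1, n_max + 1)
alpha = 0.5 * n ** (-1.0 / 3.0)   # hypothetical evaporation rate
beta = 0.1 * np.ones(n_max)       # hypothetical sticking rate
rho0 = np.zeros(n_max)
rho0[0] = 1e-2                    # start from monomers only
rho = cluster_dynamics(rho0, alpha, beta, dt=0.01, n_steps=5000)
```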
3. Kinetic Monte Carlo Technique to Simulate Coherent Phase Transformations
The KMC technique can be implemented in various manners. The one we present here has a transparent physical meaning.
3.1. Algorithm
Consider a computational cell with N_s sites, N_a atoms and N_v = N_s − N_a vacancies; each lattice site is linked to Z neighbor sites with which atoms may be exchanged (usually, but not necessarily, nearest-neighbor sites). A configuration is defined by the labels of the sites occupied respectively by A, B, C, ... atoms and by vacancies. Each configuration "i" can be escaped from by N_ch channels (N_ch = N_v Z minus the number of vacancy–vacancy bonds, if any), leading to N_ch new configurations "j_1" to "j_{N_ch}". The probability per unit time that the transition "i → j_q" occurs is given by Eq. (1), which can be computed a priori provided a model is chosen for ΔH_{i,j} and ν_{i,j}. Since the configuration "i" may disappear by N_ch independent channels, the probability per unit time for the configuration to disappear, W_i^out, is the sum of the probabilities of decay by each channel (W_{i,j_q}, q = 1 to N_ch), and the lifetime τ_i of the configuration is the inverse of W_i^out:
$$\tau_i = \left(\sum_{q=1}^{N_{ch}} W_{i,j_q}\right)^{-1} \tag{10a}$$
The probability that the configuration "j_q" is reached among the N_ch target configurations is simply given by:

$$P_{i,j_q} = \frac{W_{i,j_q}}{\sum_{q'=1}^{N_{ch}} W_{i,j_{q'}}} = W_{i,j_q} \times \tau_i \tag{10b}$$
Assuming all possible values of W_{i,j_q} are known (see below), the code proceeds as follows. Start at time t = 0 from the configuration "i_0", and set i = i_0; then:

1. Compute τ_i (Eq. (10a)) and the N_ch partial sums S_{i,k} = Σ_{q=1}^{k} P_{i,j_q}, k = 1 to N_ch.
2. Generate a random number R on ]0; 1].
3. Find the value f of k such that S_{i,k−1} < R ≤ S_{i,k}, and choose f as the final configuration.
4. Increment the time by τ_i (t_MC → t_MC + τ_i) and repeat the process from step 1, giving i the value f.
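A minimal sketch of steps 1–4 (one residence-time step) is given below; it assumes the caller supplies the N_ch frequencies W_{i,j_q} of Eq. (1) as an array, and it leaves out the bookkeeping that builds the channel list and updates the configuration.

```python
import numpy as np
rng = np.random.default_rng(seed=42)

def kmc_step(rates, t_mc):
    """One residence-time KMC step (steps 1-4 of the algorithm above).

    rates: array with the N_ch escape frequencies W_{i,j_q} of the
           current configuration, computed from Eq. (1).
    Returns (index f of the chosen channel, updated Monte Carlo time).
    """
    tau_i = 1.0 / rates.sum()        # lifetime of configuration i, Eq. (10a)
    S = np.cumsum(rates) * tau_i     # partial sums S_{i,k} of the P_{i,j_q}
    R = 1.0 - rng.random()           # random number on ]0, 1]
    f = np.searchsorted(S, R)        # first k with S_{i,k-1} < R <= S_{i,k}
    f = min(f, rates.size - 1)       # guard against rounding at the top end
    return f, t_mc + tau_i           # step 4: t_MC -> t_MC + tau_i

# Toy usage: three channels with frequencies (in 1/s) from Eq. (1).
f, t_mc = kmc_step(np.array([2.0, 0.5, 0.1]), t_mc=0.0)
```

A common variant draws the time increment as τ_i ln(1/R′) with a second random number, reproducing the exponential distribution of residence times; incrementing by the mean lifetime τ_i, as in step 4 above, gives the same time evolution on average.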
3.2. Models for the Transition Probabilities W_{i,j} (Eq. (1))
For a practical use of the above algorithm, we need a model for the transition probabilities per unit time, W_{i,j}. In principle at least, given an interatomic potential, all quantities appearing in Eqs. (1)–(3) can be computed for any pair of configurations, hence W_{i,j}. The computational cost of this is so high that most studies use simplified models for the parameters entering Eqs. (1)–(3); the values of the parameters are obtained by fitting appropriate quantities to available experimental data, such as phase boundaries and tie lines in the equilibrium phase diagram, vacancy formation energies and diffusion coefficients. We describe below the most commonly used models, starting from the simplest one.

Model (a): The energy of any configuration is a sum of pair interactions ε with a finite range (nearest or farther neighbors). The configurational energy is the sum of the contributions of two types of bonds: those which are modified by the jump, and those which are not. We name e_sp the contribution of the bonds created in the saddle-point configuration. This model is illustrated in Fig. 3. The simplest version of this model is to assume that e_sp depends neither on the atomic species undergoing the jump, nor on the composition in the surroundings of the saddle point [17].

Model (b): Same as above, but with e_sp depending on the atomic species at the saddle point. This approximation turned out to be necessary to account for the contrast in diffusivities in ternary Ni–Cr–Al [21].

Model (c): Same as above, but with e_sp written as a sum of pair interactions [22]. This turned out to provide an excellent fit to the activation barriers computed in Fe(Cu) from fully relaxed atomistic simulations based on an EAM potential. As shown in Fig. 4, the
Figure 3. Computing the migration barrier between configurations i and j (Eq. (1)) from the contributions of broken and restored bonds.
Figure 4. The six nearest neighbors (labeled 1 to 6) of the saddle point in the BCC lattice (left). Contribution to the configurational energy of one Fe atom, e_Fe(SP), or of one Cu atom, e_Cu(SP), at the saddle point, as a function of the number of Cu atoms that are nearest neighbors of the saddle point (right).
contribution to the energy of one Cu atom at the saddle point, e_Cu(SP), does not depend on the number of Cu atoms around the saddle point, while that of one Fe atom, e_Fe(SP), increases linearly with the latter.

Model (d): The energy of each configuration is a sum of pair and multiplet interactions [18]. Taking higher-order interactions into account makes it possible to reproduce phase diagrams beyond the regular solution
model. The attempt frequency (Eq. (2)) was adjusted based on an empirical correlation between the pre-exponential factor and the activation enthalpy. Complex experimental interdiffusion profiles in four-component alloys (AgInCdSn) could be reproduced successfully. Multiplet interactions have also been used in KMC to model phase separation and ordering in AlZr alloys [23].

Model (e): The energies of each configuration and at the saddle point, as well as the vibration frequency spectrum (entering Eq. (2)), are computed from a many-body interaction potential [24]. The vibration frequency spectrum can be estimated either with the Einstein model [25] or the Debye approximation [26, 27].

The above list of approximations pertains to the vacancy diffusion mechanism. Fewer studies involve interstitial diffusion, as for carbon in iron, or dumb-bell diffusion in metals under irradiation, as will be seen in the next section. The models used for the activation barrier are then of the type of model (b) described above.
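As an illustration of the simplest scheme, the sketch below encodes a minimal reading of model (a) and Fig. 3: configurational energies are sums of nearest-neighbor pair energies, the saddle-point contribution e_sp is a constant, and the barrier is e_sp minus the energy of the bonds broken by the jumping atom. All numerical values are hypothetical, and vacancy-atom interactions are ignored.

```python
import math

def barrier_model_a(eps, e_sp, jumper, neighbors):
    """Model (a) barrier: Delta H_{i,j} = e_sp - (energy of the bonds broken
    by the jumping atom); eps maps unordered species pairs to pair energies."""
    broken = sum(eps[tuple(sorted((jumper, s)))] for s in neighbors)
    return e_sp - broken

def jump_frequency(dH, nu=1e13, kB_T=0.07):
    """Eq. (1): W_{i,j} = nu * exp(-Delta H_{i,j} / k_B T)."""
    return nu * math.exp(-dH / kB_T)

# Hypothetical pair energies (eV; negative = cohesive) and saddle term.
eps = {('A', 'A'): -0.55, ('A', 'B'): -0.50, ('B', 'B'): -0.45}
dH = barrier_model_a(eps, e_sp=-1.20, jumper='B',
                     neighbors=['A', 'A', 'A', 'B'])  # dH = 0.75 eV here
W = jump_frequency(dH)
```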
3.3. Physical Time and Vacancy Concentration
Consider the vacancy diffusion mechanism. If the simulation cell contains only one vacancy, the vacancy concentration is 1/N_s, often much larger than a typical equilibrium vacancy concentration C_v^e. From Eq. (10), we conclude that the time evolution in the cell is faster than the real one by a factor equal to the vacancy supersaturation in the cell, (1/N_s)/C_v^e. The physical time t is therefore longer than the Monte Carlo time t_MC computed above:

$$t = \frac{t_{MC}}{N_s C_v^e} \tag{11}$$
Equation (11) works as long as the equilibrium vacancy concentration does not vary much in the course of the phase separation process, a point which we discuss now. Consider an alloy made of N_A atoms A and N_B atoms B on N_s lattice sites. For any atomic configuration of the alloy, there is an optimum number of lattice sites, N_s^e, that minimizes the configurational free energy; the vacancy concentration in equilibrium with that configuration is C_v^e = (N_s^e − N_A − N_B)/N_s^e. For example, assume that the configurations can be described by K types of sites onto which the vacancy is bound by an energy E_b^k (k = 1 to K), with k = 1 corresponding to sites surrounded by pure solvent (E_b^1 = 0). We name N_1, ..., N_K the respective numbers of such sites. The equilibrium concentrations of vacancies on the sites of type 1 to K are respectively:
$$C_{vk}^e = \frac{N_{vk}}{N_k + N_{vk}} = \exp\left(-\frac{E^f + E_b^k}{k_B T}\right) \tag{12a}$$
In Eq. (12a), E^f is the formation energy of a vacancy in pure A. The total vacancy concentration in equilibrium with the configuration as defined by N_1, ..., N_K is thus (in the limit of small vacancy concentrations):

$$C_v^e = \frac{\sum_k N_k C_{vk}^e}{\sum_k N_k} \approx C_{v0}^e\,\frac{1 + \sum_{k=2}^{K} X_k \exp(-E_b^k/k_B T)}{1 + \sum_{k=2}^{K} X_k}; \qquad X_k = N_k/N_1 \tag{12b}$$
In Eq. (12b), C_{v0}^e is the equilibrium vacancy concentration in the pure solvent, and X_k depends on the advancement of the phase separation process: e.g., in the early stages of solute clustering, we expect the proportion of sites surrounded by a small number of solute atoms to decrease. The overall equilibrium vacancy concentration thus changes in time (Eq. (12b)), while it remains unaffected for each type of site (Eq. (12a)). Imposing a fixed number of vacancies in the simulation cell creates the opposite situation: in the simulation, the overall vacancy concentration is kept constant, so the vacancy concentration on each type of site must change in the course of time, and the kinetic pathway will be altered. This problem can be faced in various ways. We quote two of them below:
– Rescaling the time from an estimate of the free vacancy concentration, i.e., the concentration of those vacancies with no solute as neighbor [22]. The vacancy concentration in the solvent is estimated in the course of the simulation, on a certain time scale Δt, from the fraction of the time where the vacancy is surrounded by solvent atoms only. Each time interval Δt is rescaled by the vacancy supersaturation which prevails during that time interval.

– Modeling a vacancy source (sink) in the simulation cell [28]: in real materials, vacancies are formed and destroyed at lattice discontinuities (extended defects), such as dislocation lines (more precisely, jogs on the dislocation line), grain boundaries, incoherent interfaces and free surfaces. The simplest scheme is as follows: creating one vacancy implies that one atom on the lattice close to the extended defect jumps into the latter in such a way as to extend the lattice by one site; eliminating one vacancy implies that one atom at the extended defect jumps into the nearby vacancy. Vacancy formation and elimination are a few more channels by which a configuration may change. The transition frequencies are still given by Eq. (1) with appropriate activation barriers: Fig. 5 gives a generic energy diagram for the latter transitions.

As shown by the above scheme, while the equilibrium vacancy concentration is dictated by the formation energy E^f, the time to adjust to a change in the equilibrium vacancy concentration involves the three parameters E^f,
Figure 5. Configurational energy as a function of the position of the vacancy. When one vacancy is added to the crystal, the energy is increased by E^f.
E^m and δ. In other words, a given equilibrium concentration can be achieved either by frequent or by rare vacancy births and deaths. The consequences of this fact for the formation of metastable phases during alloy decomposition are not yet fully understood.
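The first strategy (time rescaling) fits in a few lines. The function below is an illustration rather than the implementation of Ref. [22]; it assumes the simulation records, for each MC interval, the fraction of time the vacancy was free.

```python
def physical_time(mc_intervals, free_fraction, Ns, Cv_eq):
    """Rescale Monte Carlo time to physical time, interval by interval.

    mc_intervals : MC time increments Delta t_MC.
    free_fraction: per interval, fraction of the time the vacancy was
                   surrounded by solvent atoms only ("free").
    Ns, Cv_eq    : lattice sites in the cell and equilibrium vacancy
                   concentration in the pure solvent.
    """
    t = 0.0
    for dt_mc, f_free in zip(mc_intervals, free_fraction):
        supersat = (f_free / Ns) / Cv_eq  # instantaneous free-vacancy supersaturation
        t += dt_mc * supersat             # reduces to Eq. (11) when f_free = 1
    return t

# Toy usage: two intervals, vacancy free 100% then 40% of the time.
t = physical_time([1.0, 1.0], [1.0, 0.4], Ns=128**3, Cv_eq=1e-9)
```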
3.4. Tools to Characterize the Results
The output of a KMC simulation is a string of atomistic configurations as a function of time. These can be inspected visually (e.g., to recognize specific features in the shapes of solute clusters); one can also measure various characteristics (short-range order, cluster size distribution, cluster composition and type of ordering, ...); one can simulate the signals one would get from classical techniques such as small- or large-angle scattering, or process the data (the location of each type of atom) with the very same tools as used in atom probe field ion microscopy. Some examples are given below.
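As one concrete example of such a measurement, the sketch below extracts the cluster-size distribution from a single configuration, assuming solute occupancy stored on a simple cubic lattice with periodic boundaries (a BCC or FCC neighbor table would replace the shifts list).

```python
import numpy as np
from collections import deque

def cluster_sizes(occ):
    """Sizes of connected solute clusters on a periodic simple cubic lattice.

    occ: 3D boolean array, True where a solute atom sits.
    """
    shifts = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = np.zeros_like(occ, dtype=bool)
    sizes = []
    L = occ.shape
    for start in zip(*np.nonzero(occ)):
        if seen[start]:
            continue
        # Breadth-first search over occupied neighboring sites.
        q, size = deque([start]), 0
        seen[start] = True
        while q:
            site = q.popleft()
            size += 1
            for d in shifts:
                nb = tuple((site[k] + d[k]) % L[k] for k in range(3))
                if occ[nb] and not seen[nb]:
                    seen[nb] = True
                    q.append(nb)
        sizes.append(size)
    return sizes

# Toy usage: random 10% solute on a 16^3 cell; print the five largest clusters.
rng = np.random.default_rng(1)
occ = rng.random((16, 16, 16)) < 0.10
print(sorted(cluster_sizes(occ))[-5:])
```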
3.5. Comparison with the Kinetic Ising Model
The KIM, of common use in the statistical physics community, is summarized in the appendix. It is easily checked that the models presented above for the transition probabilities introduce new features which are not addressed by the KIM. In particular, the only energetic parameter to appear in the KIM is what is named, in the community of alloy thermodynamics, the ordering energy: ω = ε_AB − (ε_AA + ε_BB)/2 (for the sake of simplicity, we restrict ourselves here to two-
component alloys). While ω is indeed the only parameter to enter equilibrium thermodynamics, the models we introduced show that the kinetic pathways are affected by a second, independent energetic parameter, the asymmetry between the cohesive energies of the pure elements: ε_AA − ε_BB. This point is discussed in detail by Athènes and coworkers [29–31]. Also, the description of the activated state between two configurations is more flexible in the present model than in the KIM. For these reasons, the present model offers unique possibilities to study complex kinetic pathways, a common feature in real materials.
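The two energetic parameters are trivial to extract from a pair-interaction model; the sketch below does so for hypothetical pair energies (the same illustrative values as in the model (a) sketch above, with ε_AB adjusted).

```python
def energetic_parameters(eps):
    """Ordering energy and asymmetry parameter of a two-component alloy
    described by pair energies (illustrative values only)."""
    omega = eps[('A', 'B')] - 0.5 * (eps[('A', 'A')] + eps[('B', 'B')])
    u = eps[('A', 'A')] - eps[('B', 'B')]
    return omega, u

eps = {('A', 'A'): -0.55, ('A', 'B'): -0.52, ('B', 'B'): -0.45}
omega, u = energetic_parameters(eps)
# omega = -0.02 eV (< 0: ordering tendency); equilibrium thermodynamics
# sees only omega, while the kinetic pathway also feels u = -0.10 eV.
```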
4. Typical Results: What Has Been Learned
In the 1970s, the early KMC simulations were devoted to the study of simple ordering and phase separation kinetics in binary systems with conserved or non-conserved order parameters. Based on the Kinetic Ising model and the so-called "Kawasaki dynamics" (direct exchange between nearest-neighbor atoms, with a probability proportional to exp[−(H_final − H_initial)/2k_B T]), with no point defects and no migration barriers, they could mainly reproduce some generic features of intermediate-time behaviors, taking the number of Monte Carlo steps as an estimate of physical time: the coarsening regime of precipitation with R − R_0 ∝ t^{1/3}, the growth rate of ordered domains R − R_0 ∝ t^{1/2}, dynamical scaling laws, etc. [3, 32]. However, such models cannot reproduce important metallurgical features such as the role of distinct solute and solvent mobilities, of point-defect trapping, or of correlations among successive atomic jumps. In the frame of models (a)–(e) previously described, these features are mainly controlled by the asymmetry parameters for the stable-configuration and saddle-point energies (respectively ε_AA − ε_BB and e_A^sp − e_B^sp). We give below typical results which illustrate the sensitivity of the kinetic pathways of phase transformations to the above features.
4.1. Diffusion in Ordered Phases
Since precipitates are often ordered phases, the ability of the transition probability models to describe diffusion in ordered phases correctly must be assessed. As an example, diffusion in B2-ordered phases presents specific features which have been related to the details of the diffusion mechanism: at a given composition, the Arrhenius plot displays a break at the order/disorder temperature and an upward curvature in the ordered phase; at a given temperature, the tracer diffusion coefficients are minimum close to the stoichiometric composition. The reason for this is as follows: starting from a perfectly
ordered B2 phase, any vacancy jump creates an antisite defect, so that the most probable next jump is the reverse one, which annihilates the defect. As a consequence, it has been proposed that diffusion in B2 phases occurs via highly correlated vacancy jump sequences, such as the so-called six-jump cycle (6JC), which corresponds to 6 effective vacancy jumps (resulting from many more jumps, most of them being canceled by opposite jumps). Based on the above model (a) for the jump frequency, Athènes' KMC simulations [29] show that other mechanisms (e.g., the antisite-assisted 6JC) contribute to long-range diffusion, in addition to the classical 6JC (see Fig. 6). Their relative importance increases with the asymmetry parameter u = ε_AA − ε_BB, which controls the respective vacancy concentrations on the two B2 sublattices and the relative mobilities of A and B atoms. Moreover, while diffusion by the 6JC only would imply a D_A*/D_B* ratio between 1/2 and 2, the newly discovered antisite-assisted cycles yield a wider range, as observed experimentally in some B2 alloys, such as Co–Ga. Moreover, high asymmetry parameters produce an upward curvature of the Arrhenius plot in the B2 domain. A similar KMC model has been applied to L12-ordered structures and successfully explains some particular diffusion properties of these phases [30].
4.2. Simple Unmixing: Iron–Copper Alloys
Copper precipitation in α-Fe has been extensively studied with KMC: although pure copper has an FCC structure, experimental observations show that the first step of precipitation is indeed fully coherent, up to precipitate radii of the order of 2 nm, with a Cu BCC lattice parameter very close to that of iron. The composition of the small BCC copper clusters has long been debated: early atom probe or field ion microscopy studies or small-angle neutron scattering experiments suggested that they might contain more than 50%
Figure 6. Classical six-jump cycle (a) and antisite-assisted six-jump cycle (b) in B2 compounds [29].
iron, while other experimental techniques suggested pure copper clusters. Using the above simple model (a), KMC simulations suggest almost pure copper precipitates, but with very irregular shapes [33]: the significant iron content measured in some experiments could then be due to the contribution of atoms at the precipitate–matrix interface, if a simple smooth shape is attributed to a precipitate while the small Cu clusters actually have very irregular shapes. This explanation is in agreement with the most direct observations using a 3D atom probe [34]. The simulations have also shown that, with the parameter values used, fast migration of small Cu clusters occurs: the latter induces direct coagulation between nuclei, yielding ramified precipitate morphologies. On the same Fe–Cu system, Le Bouar and Soisson [22] have used an EAM potential to parameterize the activation barriers in Eq. (1). In dilute alloys, the EAM-computed energies of stable and saddle-point relaxed configurations can be reproduced with pair interactions on a rigid lattice (including some vacancy–atom interactions). The saddle-point binding energies of Fe and Cu are shown in Fig. 4 and have already been discussed. Such a dependence of the SP binding energies does not modify the thermodynamic properties of the system (the solubility limit, the precipitation driving force, the interfacial energies and the vacancy concentrations in the various phases do not depend on the SP properties), and it only slightly affects the diffusion coefficients of Fe and Cu in pure iron. Nevertheless, such details strongly affect the precipitation kinetic pathway, by changing the diffusion coefficients of small Cu clusters and thus the balance between the two possible growth mechanisms: classical emission–adsorption of single solute atoms, and direct coagulation between precipitates. This is illustrated by Fig. 7, where two simulations of copper precipitation are displayed: one which takes into account the dependence of e_sp^Fe on the local atomic composition, and one with a constant e_sp^Fe. In the second case, small copper clusters (with typically less than 10 Cu atoms) are more mobile than in the first case, which results in an acceleration of the precipitation. Moreover, the nucleation regime in Fig. 7(b) almost vanishes, because two small clusters can merge as rapidly as, or even more rapidly than, a Cu monomer and a precipitate. The dashed line of Fig. 7 represents the results obtained with the empirical parameter values described in the previous paragraph [33]: as can be seen, these results do not differ qualitatively from those obtained by Le Bouar et al. [22], so that the qualitative interpretation of the experimental observations is preserved. The competition between the classical solute emission–adsorption and direct precipitate coagulation mechanisms observed in dilute Fe–Cu alloys appears indeed to be quite general and to have important consequences for the whole kinetic pathway. First studies [35] focused on the role of the atomic jump mechanism (Kawasaki dynamics versus vacancy jumps), but recent KMC simulations based on the transition probability models (a)–(c) above have shown that both single-solute-atom and cluster diffusion are observed when
Figure 7. Precipitation kinetics in a Fe-3at.%Cu alloy at T = 573 K [22]. Evolution of (a) the copper short-range order parameter, (b) the number of supercritical precipitates and (c) the average size of supercritical precipitates. Monte Carlo simulations with e_sp^Fe depending on the local atomic configuration (•) or not (♦). The dashed lines correspond to the results of Soisson et al. [33].
vacancy diffusion is carefully modeled. Indeed, the balance between the two mechanisms is controlled by:

– the asymmetry parameter, which controls the relative vacancy concentrations in the various phases [31]. Vacancy trapping in the precipitates (e.g., in Fe–Cu alloys) or at the precipitate–matrix interface tends to favor direct coagulation, while if the vacancy concentration is higher in the matrix, as is the case for Co precipitation in Cu [36], the migration of monomers and the emission–adsorption of single solute atoms are dominant.
– the saddle-point energies, which, together with the asymmetry parameter, control the correlations between successive vacancy jumps and the migration of solute clusters [22].
4.3. Nucleation/Growth/Coarsening: Comparison with Classical Theories
The classical theories of nucleation, growth and coarsening, as well as the theory of spinodal decomposition in highly supersaturated solid solutions, can be assessed using KMC simulations [37]. For the nucleation regime, the thermodynamic and kinetic data involved in Eqs. (5)–(7) (the driving force for precipitation δµ, the interfacial energy σ, the adsorption rate β, etc.) can be computed from the atomistic parameters used in KMC (pair interactions, saddle-point binding energies, attempt frequencies): a direct assessment of the classical theories is thus possible. For low supersaturations, and in cases where only the solute monomers are mobile, the incubation time and the steady-state nucleation rate measured in the KMC simulations are very close to those predicted by the classical theory of nucleation. On the contrary, when small solute clusters are mobile (keeping the overall solute diffusion coefficient the same), the classical theory strongly overestimates the incubation time and weakly underestimates the nucleation rate, as exemplified in Fig. 8.
Figure 8. Incubation time and steady-state nucleation rate in a binary model alloy A–B, as a function of the supersaturation S_0 = C_B^0/C_B^eq (initial/equilibrium B concentration in the solid solution). Comparison of KMC (symbols) and the classical theory of nucleation (lines). On the left, the dotted lines refer to two classical expressions of the incubation time (Eq. (7d)), while the plain line is obtained by numerical integration of Eq. (8); symbols distinguish KMC with mobile monomers only from KMC with small mobile clusters. On the right, the dotted and plain lines refer to Eq. (7a) with, respectively, Z = 1 or Z from Eq. (7b); the symbols refer to KMC with mobile monomers. For more details, see Ref. [37].
The above general argument has been assessed in the case of Al3Zr and Al3Sc precipitation in dilute aluminum alloys: the best estimates of the parameters suggest that diffusion of Zr and Sc in Al occurs by monomer migration [38]. When the precipitation driving force and interfacial energy are computed in the frame of the cluster variation method, the classical theory of nucleation predicts nucleation rates in excellent agreement with the results of the KMC simulations, for various temperatures and supersaturations. Similarly, the distribution of cluster sizes in the solid solution, ρ_n ∼ exp(−F_n/k_B T), with F_n given by the capillarity approximation (Eq. (5)), is well reproduced, even for very small precipitate sizes.
4.4. Precipitation in Ordered Phases
The kinetic pathways become more complex when ordering occurs in addition to simple unmixing. Such kinetics have been explored by Athènes [39] in model BCC binary alloys, in which the phase diagram displays a tricritical point and a two-phase field (between a solute-rich B2 ordered phase and a solute-depleted A2 disordered phase). The simulations were able to reproduce qualitatively the main experimental features reported from transmission electron microscopy observations during the decomposition of Fe–Al solid solutions: (i) for small supersaturations, a nucleation-growth-coarsening sequence of small B2-ordered precipitates in the disordered matrix occurs; (ii) for higher supersaturations, short-range ordering starts before any modification of the composition field, followed by a congruent ordering with a very high density of antiphase boundaries (APBs). In the center of the two-phase field, this homogeneous state then decomposes by a thickening of the APBs, which turn into the A2 phase. Close to the B2 phase boundary, the decomposition process also involves the nucleation of iron-rich A2 precipitates inside the B2 phase. Varying the asymmetry parameter u mainly affects the time scale. However, qualitative differences are observed at very early stages in the formation of ordered microstructures: if the value of u preferentially enhances the vacancy exchanges with the majority atoms (u > 0), ordering proceeds everywhere, in a diffuse manner; while if u favors vacancy exchanges with the solute atoms (u < 0), ordering proceeds locally, by patches. This could explain the experimental observation of small B2-ordered domains in as-quenched Fe–Al alloys, in cases where phenomenological theories predict a congruent ordering [39]. Precipitation and ordering in Ni(Cr,Al) FCC alloys have been studied by Pareige et al. [21], with MC parameters fitted to thermodynamic and diffusion properties of Ni-rich solid solutions (Fig. 9a). For relatively small Cr and Al
Figure 9. (a) Microstructure of a Ni-14.9 at.% Cr-5.2 at.% Al alloy after thermal ageing for 1 h at 600 °C. Monte Carlo simulation (left) and 3D atom-probe image (right). Each dot represents an Al atom (for the sake of clarity, Ni and Cr atoms are not represented). One observes the Al-rich (100) planes of the γ′ precipitates, with an average diameter of 2 nm [21]. (b) Monte Carlo simulation of NbC precipitation in ferrite, with transient precipitation of a metastable iron carbide, shown faintly in the snapshots at 1.5, 11 and 25 seconds [28].
contents, at 873 K, the phase transformation occurs in three stages: (i) a short-range ordering of the FCC solid solution, with two kinds of ordering symmetry (a “Ni3Cr” symmetry, corresponding to the one observed at high temperature in binary Ni–Cr alloys, and an L12 symmetry), followed by a nucleation-growth-coarsening sequence; (ii) the formation of the Al-rich γ′ precipitates (with L12 structure); (iii) the growth and coarsening of the precipitates. In the γ′ phase, Cr atoms substitute for both Al and Ni atoms, with a preference for the Al sublattice. The simulated kinetics of precipitation are in good agreement with 3D atom-probe observations during a thermal ageing of the same alloy at the same temperature [21]. For higher Cr and Al contents, MC simulations predict a congruent L12 ordering (with many small antiphase domains) followed by the γ–γ′ decomposition, as in the A2/B2 case discussed above.
4.5. Interstitial and Vacancy Diffusion in Parallel
Advanced high-purity steels offer a field of application of KMC with practical relevance. In so-called High-Strength Low-Alloy (HSLA) steels, Nb is used as a means to retain carbon in niobium carbide precipitates, out of solution in the BCC ferrite. The precipitation of NbC implies the migration, in the BCC Fe lattice, of both Nb, by a vacancy mechanism, and C, by a direct interstitial mechanism. At very early stages, the formation of coherent NbC clusters on the BCC iron lattice is documented from 3D atom-probe observations. The very same Monte Carlo technique can be used [28]; the new feature is the large number of channels by which a configuration can decay, because of the many a priori possible jumps of the numerous carbon atoms. This makes step 3 of the algorithm above very time-consuming. A proper grouping of the channels, as a function of their respective decay times, helps speed up this step. Among several interesting features, KMC simulations revealed the possibility for NbC nucleation to be preceded by the formation of a transient iron carbide, due to the rapid diffusion of C atoms compared with Nb and Fe diffusion (Fig. 9b). This latter kinetic pathway is found to be sensitive to the ability of the microstructure to provide the proper equilibrium vacancy concentration during the precipitation process.
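The grouping idea can be sketched as follows: if each channel is assigned to a bin according to the order of magnitude of its rate, an event can be selected in a time proportional to the number of bins rather than the number of channels, with a cheap rejection step restoring exactness. The sketch below is a generic two-level (composition-rejection) selection, not the specific grouping used in Ref. [28]; event identifiers and rates are placeholders.

```python
import math, random, collections

# Two-level ("grouped channels") event selection for KMC: channels are
# binned by the order of magnitude of their rate, so that picking one event
# among very many (e.g., the numerous possible carbon jumps) costs a time
# proportional to the number of bins, with an exact rejection correction.
# Event identifiers and rates are placeholders, not the NbC model of [28].

def kmc_select(events, rng=random):
    """events: dict event_id -> rate (> 0). Returns (event_id, time_step)."""
    bins = collections.defaultdict(list)
    for ev, r in events.items():
        bins[math.floor(math.log2(r))].append(ev)  # bin k holds rates in [2^k, 2^(k+1))
    caps = {k: 2.0 ** (k + 1) for k in bins}       # rate upper bound of each bin
    total = sum(events.values())
    while True:
        x = rng.random() * sum(caps[k] * len(bins[k]) for k in bins)
        for k, members in bins.items():
            w = caps[k] * len(members)
            if x < w:
                ev = rng.choice(members)
                if rng.random() < events[ev] / caps[k]:   # rejection step
                    return ev, -math.log(rng.random()) / total
                break
            x -= w
```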
4.6. Driven Alloys
KMC offers a unique tool to explore the stability and the evolution of the microstructure in “driven alloys”, i.e., alloys exposed to a steady flow of energy, such as alloys under irradiation, ball milling, or cyclic loading [40]. Atoms in such alloys change position as a function of time because of two mechanisms acting in parallel: one of the thermal diffusion mechanisms discussed above, on the one hand, and forced, or “ballistic”, jumps
on the other hand. The latter occur with a frequency imposed by the coupling of the system with its surroundings: their frequency is proportional to some “forcing intensity” (e.g., the irradiation flux). This situation is reminiscent of the “kinetic Ising model with two competing dynamics”, much studied in the late 1980s. However, one observes a strong sensitivity of the results to the details of the diffusion mechanism and of the ballistic jumps. The main results are:
– the solubility limit is a function of both the temperature and the ratio of the frequencies of ballistic to thermally activated jumps (i.e., of the forcing intensity);
– at a given temperature and forcing intensity, the solubility limit may also depend on the number of ballistic jumps occurring at once (the “cascade size effect”);
– the “replacement distance”, i.e., the distance of the ballistic jumps, has a crucial effect on the phase diagrams, as shown in Fig. 10. For appropriate replacement distances, self-patterning can occur, with a characteristic length that depends on the forcing intensity and on the replacement distance [41].
What has been said of the solubility limit also applies to the kinetic pathways followed by the microstructure when the forcing conditions are changed. Such KMC studies and the associated theoretical work helped to understand, for alloys under irradiation, the respective effects of the time and space structure of the elementary excitation, of the dose rate, and of the integrated dose (or “fluence”).
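A minimal illustration of the two competing dynamics follows: one Monte Carlo event is either a thermally activated exchange or a ballistic exchange over the replacement distance R, chosen in proportion to the respective total rates. The 1D ring lattice, the thermal_jump callback and all rate values are schematic assumptions, not the irradiation models of Refs. [40, 41].

```python
import math, random

# One event of the two competing dynamics in a driven alloy: a thermally
# activated exchange (total rate W_th) or a forced "ballistic" exchange at
# frequency Gamma_b per site over the replacement distance R. The 1D ring
# lattice, the thermal_jump callback and all rates are schematic.

def driven_step(lattice, W_th, Gamma_b, R, thermal_jump, rng=random):
    N = len(lattice)
    W_bal = Gamma_b * N                      # total ballistic rate (forcing intensity)
    if rng.random() < W_bal / (W_bal + W_th):
        i = rng.randrange(N)                 # ballistic: exchange two sites
        j = (i + rng.choice([-R, R])) % N    # separated by the replacement distance
        lattice[i], lattice[j] = lattice[j], lattice[i]
    else:
        thermal_jump(lattice)                # one thermally activated exchange
    return -math.log(rng.random()) / (W_bal + W_th)   # residence time of the step
```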
[Figure 10 shows: (a) steady-state microstructures at ballistic jump frequencies Γ of order 10^4, 10^3, 10^2, and 1 s^-1; (b) a dynamical phase diagram, replacement distance R (in nearest-neighbor distances) versus forcing intensity Γ (s^-1), with “Patterning”, “Solid Solution”, and “Macroscopic Phase Separation” regions.]
Figure 10. (a) Steady-state microstructures in KMC simulations of the phase separation in a binary alloy, for different ballistic jump frequencies Γ. (b) Dynamical phase diagram showing the steady-state microstructure as a function of the forcing intensity Γ and the replacement distance R [41].
5. Conclusion and Future Trends
The above presentation is by no means exhaustive. It aimed mainly at showing the necessity of modeling the diffusion mechanism carefully, and the techniques for doing so, in order to obtain a realistic kinetic pathway for solid-state transformations. All the examples we gave are based on a rigid-lattice description. The latter is correct as long as strain effects are not too large, as shown by the discussion of the Fe(Cu) alloy. Combining KMC for the configuration with some technique to handle the relaxation of atomic positions is quite feasible, but for the time being it carries a heavy computational cost if the details of the diffusion mechanism are to be retained. Interesting results have been obtained, e.g., for the formation of strained hetero-epitaxial films [42]. A field of growing interest is the first-principles determination of the parameters entering the transition probabilities. In view of the lack of experimental data for relevant systems, and of the fast improvement of such techniques, such calculations will no doubt be of extreme importance. Finally, at the atomic scale, all the transitions modeled so far are either thermally activated or forced at some imposed frequency. A field of practical interest is where “stick-and-slip”-type processes operate: such is the case in shear transformations, in coherency loss, etc. Incorporating such processes in the KMC treatment of phase transformations has not yet been attempted to our knowledge, and certainly deserves attention.
Acknowledgments
We gratefully acknowledge many useful discussions with our colleagues at Saclay and at the Atom Probe Laboratory of the University of Rouen, as well as with Profs. Pascal Bellon (UIUC) and David Seidman (NWU).
Appendix: The Kinetic Ising Model
In the KIM, the kinetic version of the model proposed by Ising for magnetic materials, the configurational Hamiltonian is written
$$H = \sum_{i \neq j} J_{ij}\,\sigma_i\sigma_j + \sum_i h_i\sigma_i$$
with $\sigma_i = \pm 1$ the spin at site i, $J_{ij}$ the interaction parameter between the spins at sites i and j, and $h_i$ the external field on site i. The probability of a transition per unit time between two configurations $\{\sigma_i\}$ and $\{\sigma'_i\}$ is chosen as $W_{\{\sigma'\},\{\sigma\}} = w\,\exp[-(H' - H)/2k_BT]$, with w the inverse time unit. Two models are studied:
KIM with conserved total spin, for which $\sum_i \sigma_i = \text{const}$, so that the configuration after the transition is obtained by permuting the spins on two (nearest-neighbor) sites;
KIM with non-conserved total spin, for which the new configuration is obtained by flipping one spin on one given site.
When treated by the Monte Carlo technique, two types of algorithms are commonly applied to the KIM:
Metropolis' algorithm, where the final configuration is accepted with probability one if $(H' - H) \le 0$, and with probability $\exp[-(H' - H)/k_BT]$ if $(H' - H) > 0$;
Glauber's algorithm, where the final configuration is accepted with probability $\tfrac{1}{2}\big[1 + \tanh\big(-(H' - H)/2k_BT\big)\big]$.
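As an illustration, the sketch below implements one spin-exchange (conserved total spin) move of the KIM on a 2D square lattice with nearest-neighbor coupling J and $h_i = 0$, accepting it with either the Metropolis or the Glauber rule quoted above; the temperature is measured in energy units ($k_B = 1$).

```python
import math, random

# One spin-exchange (Kawasaki) move of the KIM on an L x L square lattice
# with nearest-neighbor coupling J (H = J * sum over bonds of s_i*s_j, h = 0)
# and temperature T in energy units (k_B = 1).

def kim_exchange(s, J, T, rule="metropolis", rng=random):
    L = len(s)
    i, j = rng.randrange(L), rng.randrange(L)
    di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    k, l = (i + di) % L, (j + dj) % L
    if s[i][j] == s[k][l]:
        return                               # exchanging equal spins does nothing

    def field(a, b):                         # sum of the four neighbor spins
        return sum(s[(a + u) % L][(b + v) % L]
                   for u, v in ((1, 0), (-1, 0), (0, 1), (0, -1)))

    # H' - H when the two unequal spins are swapped (each flips sign)
    dH = -2 * J * (s[i][j] * (field(i, j) - s[k][l])
                   + s[k][l] * (field(k, l) - s[i][j]))
    if rule == "metropolis":
        accept = dH <= 0 or rng.random() < math.exp(-dH / T)
    else:                                    # Glauber rule quoted above
        accept = rng.random() < 0.5 * (1.0 + math.tanh(-dH / (2.0 * T)))
    if accept:
        s[i][j], s[k][l] = s[k][l], s[i][j]
```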
References
[1] A.R. Allnatt and A.B. Lidiard, “Atomic transport in solids,” Cambridge University Press, Cambridge, 1994.
[2] T. Morita, M. Suzuki, K. Wada, and M. Kaburagi, “Foundations and applications of cluster variation method and path probability method,” Prog. Theor. Phys. Suppl., 115, 1994.
[3] K. Binder, “Applications of Monte Carlo methods to statistical physics,” Rep. Prog. Phys., 60, 1997.
[4] Y. Limoge and J.-L. Bocquet, “Monte Carlo simulation in diffusion studies: time scale problems,” Acta Met., 36, 1717, 1988.
[5] G.E. Murch and L. Zhang, “Monte Carlo simulations of diffusion in solids: some recent developments,” In: A.L. Laskar et al. (eds.), Diffusion in Materials, Kluwer Academic Publishers, Dordrecht, 1990.
[6] C.P. Flynn, “Point defects and diffusion,” Clarendon Press, Oxford, 1972.
[7] J. Philibert, “Atom movements, diffusion and mass transport in solids,” Les Editions de Physique, Les Ulis, 1991.
[8] D.N. Seidman and R.W. Balluffi, “Dislocations as sources and sinks for point defects in metals,” In: R.R. Hasiguti (ed.), Lattice Defects and their Interactions, Gordon–Breach, New York, 1968.
[9] J.-L. Bocquet, G. Brebec, and Y. Limoge, “Diffusion in metals and alloys,” In: R.W. Cahn and P. Haasen (eds.), Physical Metallurgy, North-Holland, Amsterdam, 1996.
[10] M. Nastar, V.Y. Dobretsov, and G. Martin, “Self consistent formulation of configurational kinetics close to equilibrium: the phenomenological coefficients for diffusion in crystalline solids,” Philos. Mag. A, 80, 155, 2000.
[11] G. Martin, “The theories of unmixing kinetics of solid solutions,” In: Solid State Phase Transformation in Metals and Alloys, pp. 337–406, Les Editions de Physique, Orsay, 1978.
[12] A. Perini, G. Jacucci, and G. Martin, “Interfacial contribution to cluster free energy,” Surf. Sci., 144, 53, 1984.
[13] T.R. Waite, “Theoretical treatment of the kinetics of diffusion-limited reactions,” Phys. Rev., 107, 463–470, 1957.
[14] I.M. Lifshitz and V.V. Slyozov, “The kinetics of precipitation from supersaturated solid solutions,” Phys. Chem. Solids, 19, 35, 1961.
[15] C.J. Kuehmann and P.W. Voorhees, “Ostwald ripening in ternary alloys,” Metall. Mater. Trans., 27A, 937–943, 1996.
[16] J.W. Cahn, W. Craig Carter, and W.C. Johnson (eds.), The Selected Works of J.W. Cahn, TMS, Warrendale, 1998.
[17] G. Martin, “Atomic mobility in Cahn's diffusion model,” Phys. Rev. B, 41, 2279–2283, 1990.
[18] C. Desgranges, F. Defoort, S. Poissonnet, and G. Martin, “Interdiffusion in concentrated quaternary Ag–In–Cd–Sn alloys: modelling and measurements,” Defect Diffus. For., 143, 603–608, 1997.
[19] S.M. Allen and J.W. Cahn, “A macroscopic theory for antiphase boundary motion and its application to antiphase domain coarsening,” Acta Metall., 27, 1085–1095, 1979.
[20] P. Bellon and G. Martin, “Coupled relaxation of concentration and order fields in the linear regime,” Phys. Rev. B, 66, 184208, 2002.
[21] C. Pareige, F. Soisson, G. Martin, and D. Blavette, “Ordering and phase separation in Ni–Cr–Al: Monte Carlo simulations vs three-dimensional atom probe,” Acta Mater., 47, 1889–1899, 1999.
[22] Y. Le Bouar and F. Soisson, “Kinetic pathways from EAM potentials: influence of the activation barriers,” Phys. Rev. B, 65, 094103, 2002.
[23] E. Clouet and M. Nastar, “Monte Carlo study of the precipitation of Al3Zr in Al–Zr,” Proceedings of the Third International Alloy Conference, Lisbon, in press, 2002.
[24] J.-L. Bocquet, “On the fly evaluation of diffusional parameters during a Monte Carlo simulation of diffusion in alloys: a challenge,” Defect Diffus. For., 203–205, 81–112, 2002.
[25] R. LeSar, R. Najafabadi, and D.J. Srolovitz, “Finite-temperature defect properties from free-energy minimization,” Phys. Rev. Lett., 63, 624–627, 1989.
[26] A.P. Sutton, “Temperature-dependent interatomic forces,” Philos. Mag., 60, 147–159, 1989.
[27] Y. Mishin, M.R. Sorensen, and A.F. Voter, “Calculation of point-defect entropy in metals,” Philos. Mag. A, 81, 2591–2612, 2001.
[28] D. Gendt, Cinétiques de précipitation du carbure de niobium dans la ferrite, CEA Report, 0429–3460, 2001.
[29] M. Athènes, P. Bellon, and G. Martin, “Identification of novel diffusion cycles in B2 ordered phases by Monte Carlo simulations,” Philos. Mag. A, 76, 565–585, 1997.
[30] M. Athènes and P. Bellon, “Antisite diffusion in the L12 ordered structure studied by Monte Carlo simulations,” Philos. Mag. A, 79, 2243–2257, 1999.
[31] M. Athènes, P. Bellon, and G. Martin, “Effects of atomic mobilities on phase separation kinetics: a Monte Carlo study,” Acta Mater., 48, 2675, 2000.
[32] R. Wagner and R. Kampmann, “Homogeneous second phase precipitation,” In: P. Haasen (ed.), Phase Transformations in Materials, VCH, Weinheim, 1991.
[33] F. Soisson, A. Barbu, and G. Martin, “Monte Carlo simulations of copper precipitation in dilute iron-copper alloys during thermal ageing and under electron irradiation,” Acta Mater., 44, 3789, 1996.
[34] P. Auger, P. Pareige, M. Akamatsu, and D. Blavette, “APFIM investigation of clustering in neutron irradiated Fe–Cu alloys and pressure vessel steels,” J. Nucl. Mater., 225, 225–230, 1995.
[35] P. Fratzl and O. Penrose, “Kinetics of spinodal decomposition in the Ising model with vacancy diffusion,” Phys. Rev. B, 50, 3477–3480, 1994.
[36] J.-M. Roussel and P. Bellon, “Vacancy-assisted phase separation with asymmetric atomic mobility: coarsening rates, precipitate composition and morphology,” Phys. Rev. B, 63, 184114, 2001.
[37] F. Soisson and G. Martin, Phys. Rev. B, 62, 203, 2000.
[38] E. Clouet, M. Nastar, and C. Sigli, “Nucleation of Al3Zr and Al3Sc in aluminium alloys: from kinetic Monte Carlo simulations to classical theory,” Phys. Rev. B, 69, 064109, 2004.
[39] M. Athènes, P. Bellon, G. Martin, and F. Haider, “A Monte Carlo study of B2 ordering and precipitation via vacancy mechanism in BCC lattices,” Acta Mater., 44, 4739–4748, 1996.
[40] G. Martin and P. Bellon, “Driven alloys,” Solid State Phys., 50, 189, 1997.
[41] R.A. Enrique and P. Bellon, “Compositional patterning in immiscible alloys driven by irradiation,” Phys. Rev. B, 63, 134111, 2001.
[42] C.H. Lam, C.K. Lee, and L.M. Sander, “Competing roughening mechanisms in strained heteroepitaxy: a fast kinetic Monte Carlo study,” Phys. Rev. Lett., 89, 216102, 2002.
7.10 DIFFUSIONAL TRANSFORMATIONS: MICROSCOPIC KINETIC APPROACH
I.R. Pankratov and V.G. Vaks
Russian Research Centre “Kurchatov Institute”, Moscow 123182, Russia
The term “diffusional transformations” is used for the phase transformations (PTs) of phase separation or ordering of alloys, as these PTs are realized via atomic diffusion, i.e., by the interchange of positions of atoms of different species in the crystal lattice. Studies of the kinetics of diffusional PTs attract interest from both fundamental and applied points of view. From the fundamental side, the creation and evolution of ordered domains or precipitates of a new phase provide classical examples of the self-organization phenomena studied in many areas of physics and chemistry. From the applied side, the macroscopic properties of such alloys, such as their strength, plasticity, or the coercivity of ferromagnets, depend crucially on their microstructure, in particular on the distribution of the antiphase or interphase boundaries separating the differently ordered domains or different phases, while this microstructure, in turn, depends sharply on the thermal and mechanical history of an alloy, in particular on the kinetic path taken during the PT. Therefore, the kinetics of diffusional PTs is also an important area of materials science. Theoretical treatments of these problems usually employ either Monte Carlo simulation or various phenomenological kinetic equations for the local concentrations and local order parameters. However, Monte Carlo studies in this field are difficult, and until now they have provided limited information about the microstructural evolution. The phenomenological equations are more feasible, and they are widely used to describe the diffusional PTs, see, e.g., Turchi and Gonis [1], part I. However, a number of arbitrary assumptions are usually employed in such equations, and their validity region is often unclear [2]. Recently, a microscopic statistical approach has been suggested to treat the diffusional PTs [3–5]. It aims to develop theoretical methods which can describe non-equilibrium alloys as consistently and generally as the canonical Gibbs method describes equilibrium systems. This approach has been used for simulations of many different PTs. The simulations revealed a number
of new and interesting microstructural effects, many of them agreeing well with experimental observations. Below we describe this approach.
1. Statistical Theory of Non-equilibrium Alloys
1.1. Master Equation Approach: Basic Equations
A consistent microscopic description of non-equilibrium alloys can be based on the fundamental master equation for the probabilities of various atomic distributions over lattice sites [3, 4]. For definiteness, we consider a binary alloy $A_cB_{1-c}$ with $c \le 0.5$. Various distributions of atoms over lattice sites i are described by the sets of occupation numbers $\{n_i\}$, where the operator $n_i = n_{Ai}$ is unity when site i is occupied by an atom A and zero otherwise. The interaction Hamiltonian H has the form
$$H = \sum_{i>j} v_{ij}\, n_i n_j + \sum_{i>j>k} v_{ijk}\, n_i n_j n_k + \cdots \qquad (1)$$
where $v_{i\ldots j}$ are effective interactions. The fundamental master equation for the probability P of finding the occupation number set $\{n_i\} = \alpha$ is
$$\frac{dP(\alpha)}{dt} = \sum_\beta \big[W(\alpha, \beta)\, P(\beta) - W(\beta, \alpha)\, P(\alpha)\big] \equiv \hat{S} P \qquad (2)$$
where $W(\alpha, \beta)$ is the $\beta \to \alpha$ transition probability per unit time. Adopting for this probability the conventional “thermally activated atomic exchange model”, we can express the transfer matrix $\hat{S}$ in Eq. (2) in terms of the probability $W^{AB}_{ij}$ of an elementary inter-site exchange $A_i \rightleftarrows B_j$:
$$W^{AB}_{ij} = n_i n'_j\, \omega_{ij} \exp\big[-\beta\big(E^s_{ij} - \hat{E}^{in}_{ij}\big)\big] \equiv n_i n'_j\, \gamma_{ij} \exp\big(\beta \hat{E}^{in}_{ij}\big). \qquad (3)$$
Here $n'_j = n_{Bj} = (1 - n_j)$; $\omega_{ij}$ is the attempt frequency; $\beta = 1/T$ is the reciprocal temperature; $E^s_{ij}$ is the saddle-point energy; $\gamma_{ij}$ is $\omega_{ij}\exp(-\beta E^s_{ij})$; and $\hat{E}^{in}_{ij}$ is the initial (before the jump) configurational energy of the jumping atoms. The most general expression for the probability $P\{n_i\}$ in (2) can be conveniently written in the “generalized Gibbs” form:
$$P\{n_i\} = \exp\Big[\beta\Big(\Omega + \sum_i \lambda_i n_i - Q\Big)\Big]. \qquad (4)$$
Here the parameters $\lambda_i$ can be called the “site chemical potentials”; the “quasi-Hamiltonian” Q is an analogue of the Hamiltonian H in (1); and the generalized grand-canonical potential $\Omega = \Omega\{\lambda_i, a_{i\ldots j}\}$ is determined by the normalizing condition:
$$Q = \sum_{i>j} a_{ij}\, n_i n_j + \sum_{i>j>k} a_{ijk}\, n_i n_j n_k + \cdots$$
$$\Omega = -T \ln \mathrm{Tr}\, \exp\Big[\beta\Big(\sum_i \lambda_i n_i - Q\Big)\Big] \qquad (5)$$
where Tr(...) means the summation over all configurations $\{n_i\}$. Multiplying Eq. (2) by the operators $n_i$, $n_i n_j$, etc., and summing over all configurations, we obtain the set of exact kinetic equations for the averages $g_{ij\ldots k} = \langle n_i n_j \ldots n_k \rangle$, in particular for the mean site occupation $c_i \equiv g_i = \langle n_i \rangle$, where $\langle\ldots\rangle$ means $\mathrm{Tr}[(\ldots)P]$:
$$\frac{dg_{i\ldots j}}{dt} = \langle n_i \ldots n_j\, \hat{S} \rangle. \qquad (6)$$
These equations enable us to derive an explicit expression for the free energy of a non-equilibrium state, $F = F\{c_i, g_{i\ldots j}\}$, which obeys both the “generalized first” and the second law of thermodynamics:
$$F = \langle H + T \ln P \rangle = \Omega + \sum_i \lambda_i c_i + \langle H - Q \rangle$$
$$dF = \sum_i \lambda_i\, dc_i + \sum_{i>\ldots>j} (v_{i\ldots j} - a_{i\ldots j})\, dg_{i\ldots j}$$
$$\lambda_i = \frac{\partial F}{\partial c_i}; \qquad (v_{i\ldots j} - a_{i\ldots j}) = \frac{\partial F}{\partial g_{i\ldots j}}; \qquad \frac{dF}{dt} \le 0. \qquad (7)$$
The stationary state (not necessarily uniform) corresponds to the minimum of F with respect to its variables $c_i$ and $g_{i\ldots j}$, provided the total number of atoms $N_A = \sum_i c_i$ is fixed. Then the relations (7) yield the usual Gibbs equilibrium equations:
$$\lambda_i = \mu = \text{constant}; \qquad (8)$$
$$a_{i\ldots j} = v_{i\ldots j}, \quad \text{or:} \quad Q = H. \qquad (9)$$
Non-stationary atomic distributions arising under the usual conditions of diffusional PTs appear to obey “quasi-equilibrium” relations, which correspond to the approximate validity, in the course of the evolution, of the second
equilibrium Eq. (9), while the site chemical potentials generally differ from each other [2]. Then the free energy F in (7) takes the form:
$$F = \Omega + \sum_i \lambda_i c_i \qquad (10)$$
while the system of Eq. (6) reduces to the “quasi-equilibrium” kinetic equation (QKE) for the mean occupations $c_i = c_i(t)$ [3]:
$$\frac{dc_i}{dt} = \sum_j M_{ij}\, 2\sinh\frac{\beta(\lambda_j - \lambda_i)}{2}. \qquad (11)$$
Here the quantities $\lambda_j$ are related to $c_i$ by the self-consistency equation:
$$c_i = \langle n_i \rangle = \mathrm{Tr}\big[n_i\, P\{\lambda_j\}\big] \qquad (12)$$
while the “generalized mobility” $M_{ij}$ for the pair-interaction case, when the Hamiltonian (1) includes only the first term, can be written as [6]:
$$M_{ij} = \gamma_{ij}\Big\langle n'_i n'_j \exp\Big\{\frac{\beta}{2}\Big[\lambda_i + \lambda_j - \sum_k (v_{ik} + v_{jk})\, n_k + \sum_k (u_{ik} + u_{jk})\, n_k\Big]\Big\}\Big\rangle. \qquad (13)$$
Here $\gamma_{ij}$, $n'_i$ and $v_{ij} = V^{AA}_{ij} - 2V^{AB}_{ij} + V^{BB}_{ij}$ are the same as in Eqs. (3) and (1), while $u_{ij} = V^{AA}_{ij} - V^{BB}_{ij}$ is the so-called asymmetric potential. The description of the diffusional PTs in terms of the mean occupations $c_i$ given by Eqs. (11)–(13) seems to be sufficient for most situations of practical interest, in particular for the “mesoscopic” stages of such PTs when the local fluctuations of occupations are insignificant. At the same time, to treat fluctuative phenomena, such as the formation and evolution of critical nuclei in metastable alloys, one should modify the QKE (11), for example by the addition of some “Langevin-noise”-type terms [4].
1.2. Kinetic Mean-field and Kinetic Cluster Approximations
To find explicit expressions for the functions $F\{c_i\}$, $\lambda_i\{c_j\}$, and $M_{ij}\{c_k\}$ in Eqs. (10)–(12), one should employ some approximate method of statistical physics. Several such methods have been developed [4]. For simplicity, we consider the pair-interaction model and write the interaction $v_{ij}$ in (1) as $\delta_{ij,n}\, v_n$, where the symbol $\delta_{ij,n}$ is unity when sites i and j are nth neighbors in the lattice and zero otherwise, while $v_n$ is the interaction constant. Then the simplest,
“kinetic mean-field” approximation (KMFA, or simply MFA) corresponds to the following expressions for $\Omega$, $\lambda_i$ and $M_{ij}$:
$$\Omega^{\mathrm{MFA}} = \sum_i T \ln c'_i - \frac{1}{2}\sum_{i,j,n} \delta_{ij,n}\, v_n c_i c_j; \qquad \lambda^{\mathrm{MFA}}_i = T \ln(c_i/c'_i) + \sum_{j,n} \delta_{ij,n}\, v_n c_j \qquad (14)$$
$$M^{\mathrm{MFA}}_{ij} = \gamma_{ij}\, \big(c_i c'_i\, c_j c'_j\big)^{1/2} \exp\Big[\frac{\beta}{2}\sum_k (u_{ik} + u_{jk})\, c_k\Big]. \qquad (15)$$
Here $c'_i = 1 - c_i$, while the free energy F is related to $\Omega$ and $\lambda_i$ by Eq. (10). For the more refined, and usually more accurate, kinetic pair-cluster approximation (KPCA, or simply PCA), the expressions for $\Omega$ and $\lambda_i$ are more complex but can still be written analytically:
$$\Omega^{\mathrm{PCA}} = \sum_i T \ln c'_i + \frac{1}{2}\sum_{i,j,n} \delta_{ij,n}\, \Omega^{ij}_n; \qquad \lambda^{\mathrm{PCA}}_i = T \ln(c_i/c'_i) + \sum_{j,n} \delta_{ij,n}\, \lambda^{ij}_{ni}. \qquad (16)$$
Here $\Omega^{ij}_n = -T\ln(1 - c_i c_j g^{ij}_n)$; $\lambda^{ij}_{ni} = -T\ln(1 - c_j g^{ij}_n)$; and the function $g^{ij}_n$ is expressed via the Mayer function $f_n = \exp(-\beta v_n) - 1$ and the mean occupations $c_i$ and $c_j$:
$$g^{ij}_n = \frac{2 f_n}{R^{ij}_n + 1 + f_n(c_i + c_j)}; \qquad R^{ij}_n = \big\{[1 + (c_i + c_j) f_n]^2 - 4 c_i c_j f_n (f_n + 1)\big\}^{1/2}. \qquad (17)$$
For weak interaction, $\beta v_n \ll 1$, the function $g^{ij}_n$ tends to $(-\beta v_n)$, $\Omega^{ij}_n \simeq -v_n c_i c_j$, $\lambda^{ij}_{ni} \simeq v_n c_j$, and the PCA expressions (16) reduce to the MFA ones (14). The MFA or the PCA is usually sufficient to describe the PTs between the disordered phases and/or the BCC-based ordered phases, such as the B2 and D03 phases. However, these simple methods are insufficient to describe the FCC-based L12 and L10 ordered alloys, as strong many-particle correlations are characteristic of such systems. These alloys can be adequately described by the cluster variation method (CVM), which takes into account the correlations mentioned within at least the 4-site nearest-neighbor tetrahedron cluster. However, the CVM is cumbersome, and it is difficult to use for non-homogeneous systems. At the same time, a simplified version of the CVM, the tetrahedron cluster-field approximation (TCA), usually combines the high accuracy of the CVM with a great simplification of the calculations [6].
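Equation (17) is easy to transcribe directly into code; the small sketch below does so and checks the weak-interaction limit $g^{ij}_n \to -\beta v_n$ quoted above (units with $k_B = 1$ are assumed).

```python
import math

# Direct transcription of Eq. (17): the pair-cluster function g_n^{ij} from
# the Mayer function f_n = exp(-beta*v_n) - 1 and the mean occupations
# c_i, c_j (temperature in energy units, beta = 1/T).

def g_pair(v_n, c_i, c_j, T):
    f = math.exp(-v_n / T) - 1.0
    R = math.sqrt((1.0 + (c_i + c_j) * f) ** 2
                  - 4.0 * c_i * c_j * f * (f + 1.0))   # R_n^{ij} of Eq. (17)
    return 2.0 * f / (R + 1.0 + f * (c_i + c_j))

# weak-interaction limit: g tends to -beta*v_n
print(g_pair(0.001, 0.3, 0.4, 1.0))   # ~ -0.001
```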
The TCA expressions for $\Omega$ and $\lambda_i$ can be written explicitly and are similar to those in Eq. (16), but to find the functions $\Omega(c_i)$ and $\lambda_i(c_j)$ explicitly one should solve a system of four algebraic equations for each tetrahedron cluster. In practice, these equations can easily be solved numerically using the conjugate gradient method [4, 7]. We can also use the PCA or the TCA to calculate the mobility $M_{ij}$ in expression (13) more accurately [4]. However, in this expression the above-mentioned correlations of atomic positions result only in some quantitative factors that depend weakly on the local composition and ordering, and seem to be of little importance for the microstructural evolution. Therefore, the simple MFA expression (15) for $M_{ij}$ was employed in the previous KTCA-based simulations of the L12 and L10-type orderings [4, 7].
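To make the structure of Eqs. (11), (14) and (15) concrete, the following minimal sketch integrates the MFA-closed QKE on a 1D ring with a single nearest-neighbor interaction constant $v_1$ and symmetric exchange (u = 0), absorbing $\gamma_{nn}$ into the reduced time $t' = t\gamma_{nn}$. The lattice, the parameter values and the simple forward-Euler stepping are illustrative choices only; the actual simulations described later in the text use adaptive Runge–Kutta integration.

```python
import numpy as np

# MFA-closed QKE, Eqs. (11), (14), (15), on a 1D ring with a single
# nearest-neighbor interaction constant v1 and symmetric exchange (u = 0);
# gamma_nn is absorbed into the reduced time t' = t*gamma_nn, T is in
# energy units. Forward-Euler stepping is used only for brevity.

def qke_rhs(c, v1, T):
    cp = 1.0 - c
    lam = T * np.log(c / cp) + v1 * (np.roll(c, 1) + np.roll(c, -1))  # Eq. (14)
    dc = np.zeros_like(c)
    for shift in (1, -1):                       # the two neighbors j of each i
        M = np.sqrt(c * cp * np.roll(c, shift) * np.roll(cp, shift))  # Eq. (15), u = 0
        dc += M * 2.0 * np.sinh((np.roll(lam, shift) - lam) / (2.0 * T))  # Eq. (11)
    return dc

rng = np.random.default_rng(0)
c = 0.35 + 0.01 * (rng.random(64) - 0.5)        # as-quenched solid solution
for _ in range(2000):
    c += 0.005 * qke_rhs(c, v1=-1.5, T=0.4)     # v1 < 0 drives phase separation
```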
1.3. Deformational Interactions in Dilute and Concentrated Alloys
The effective interaction $v_{i\ldots j}$ in the Hamiltonian (1) includes the “chemical” contribution $v^c_{i\ldots j}$, which describes the energy change under the substitution of some atoms A by atoms B in the rigid lattice, and the “deformational” term $v^d_{i\ldots j}$, due to the difference in the lattice deformation under such a substitution. The interaction $v^d$ includes the long-range elastic forces which can significantly affect the microstructural evolution, see, e.g., Turchi and Gonis [1]. A microscopic model to calculate the interaction $v^d$ in dilute alloys was suggested by Khachaturyan [8]. In concentrated alloys, the deformational interaction can lead to some new effects, in particular to a lattice symmetry change under the PT, such as the tetragonal distortion under L10 ordering. Below we describe the generalization of Khachaturyan's model of deformational interactions to the case of a concentrated alloy [9]. Supposing the displacement $\mathbf{u}_k$ of site k relative to its position $\mathbf{R}_k$ in the “average” crystal $A_cB_{1-c}$ to be small, we can write the alloy energy H as
$$H = H_c\{n_i\} - \sum_{\alpha k} u_{\alpha k} F_{\alpha k} + \frac{1}{2}\sum_{\alpha k,\beta l} u_{\alpha k} u_{\beta l} A_{\alpha k,\beta l} \qquad (18)$$
where α and β are Cartesian indices, and both the Kanzaki force $\mathbf{F}_k$ and the force-constant matrix $A_{\alpha k,\beta l}$ are some functions of the occupation numbers $n_i$. For the force-constant matrix, the conventional average-crystal approximation usually seems to be sufficient: $A_{\alpha k,\beta l}\{n_i\} \to A_{\alpha k,\beta l}\{c\} \equiv \bar{A}_{\alpha k,\beta l}$. The Kanzaki force $F_{\alpha k}$ can be written as a series in the occupation numbers $n_i$:
$$F_{\alpha k} = \sum_i f^{(1)}_{\alpha k,i}\, n_i + \sum_{i>j} f^{(2)}_{\alpha k,ij}\, n_i n_j + \cdots \qquad (19)$$
where the coefficients $f^{(n)}$ do not depend on $n_i$. Minimizing the energy (18) with respect to the displacements $\mathbf{u}_k$, we obtain for the deformational Hamiltonian $H^d$:
$$H^d = -\frac{1}{2}\sum_{\alpha k,\beta l} F_{\alpha k}\, (\bar{A}^{-1})_{\alpha k,\beta l}\, F_{\beta l} \qquad (20)$$
where $(\bar{A}^{-1})_{\alpha k,\beta l}$ means the matrix inverse to $\bar{A}_{\alpha k,\beta l}$, which can be written explicitly using the Fourier transformation of the force-constant matrix $\bar{A}(\mathbf{k})$. For dilute alloys, one can retain in (19) only the first sum, which corresponds to the pairwise $H^d$ of Khachaturyan [8]. The next terms in (19) lead to non-pairwise interactions which describe, in particular, the above-mentioned effects of a lattice symmetry change. To describe these effects, for example for the case of the L10 ordering in the FCC lattice, we can retain in (19) only the terms with $f^{(1)}$ and $f^{(2)}$ and estimate them from the experimental data on the concentration dilatation in the disordered phase and on the lattice parameter changes under the L12 and L10 orderings [6, 9].
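In practice, the pairwise (first-order Kanzaki) part of Eq. (20) is evaluated wavevector by wavevector. The sketch below shows this generic linear-algebra step; the functions supplying $\bar{A}(\mathbf{k})$ and the Fourier-transformed Kanzaki forces $f(\mathbf{k})$ are assumed to be user-supplied model inputs and are not specified by the theory quoted here.

```python
import numpy as np

# Pairwise (first-order Kanzaki) deformational interaction of Eq. (20) in
# Fourier space: each wavevector k contributes -f(k)^+ A(k)^{-1} f(k), with
# A(k) the 3x3 Fourier transform of the average-crystal force-constant
# matrix and f(k) that of the Kanzaki forces. A_of_k and f_of_k are
# user-supplied model functions; k = 0 must be excluded by the caller.

def v_deform(kpoints, A_of_k, f_of_k):
    vd = np.empty(len(kpoints))
    for m, k in enumerate(kpoints):
        A = A_of_k(k)                # 3x3 Hermitian, positive definite
        f = f_of_k(k)                # length-3 complex vector
        # solve() instead of an explicit inverse, for numerical stability
        vd[m] = -np.real(np.vdot(f, np.linalg.solve(A, f)))
    return vd
```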
1.4. Vacancy-mediated Kinetics and Equivalence Theorem
In most theoretical treatments of the kinetics of diffusional PTs, as well as in the previous part of this paper, the simplified direct-exchange model is used, which assumes a direct exchange of positions between unlike neighboring atoms in an alloy. Actually, the exchange occurs between a main-alloy-component atom, e.g., an A or B atom in an ABv alloy, and a neighboring vacancy “v”. As the vacancy concentration $c_v$ in alloys is actually quite small, $c_v \lesssim 10^{-4}$, employing the direct-exchange model greatly simplifies theoretical studies by reducing the computation times by several orders of magnitude. However, it is not clear a priori whether using the unrealistic direct-exchange model results in some errors or misses some effects. In particular, a notable segregation of vacancies at interphase or antiphase boundaries was observed in a number of simulations, and the problem of the possible influence of this segregation on the microstructural evolution was discussed by a number of authors. To clarify these problems, the statistical approach described above has been generalized to the vacancy-mediated kinetics case [5]. In particular, the QKE for an ABv alloy, instead of Eq. (11), takes the form of a set of equations for the A-atom and vacancy mean occupations, $c_{Ai} = c_i$ and $c_{vi}$:
$$\frac{dc_i}{dt} = \sum_j \gamma^{Av}_{ij} B_{ij}\big[e^{\beta(\lambda_{Aj} + \lambda_{vi})} - e^{\beta(\lambda_{Ai} + \lambda_{vj})}\big] \qquad (21)$$
$$\frac{dc_{vi}}{dt} = \sum_j B_{ij}\, e^{\beta\lambda_{vj}}\big(\gamma^{vB}_{ij} + \gamma^{vA}_{ij}\, e^{\beta\lambda_{Ai}}\big) - \{i \to j\}. \qquad (22)$$
Here $B_{ij}$ is an analogue of the second factor in Eq. (13), while $\lambda_{Ai}$ and $\lambda_{vi}$ are the site chemical potentials for the A atom and the vacancy, respectively, which in the MFA have a form similar to $\lambda^{\mathrm{MFA}}_i$ in (14):
$$\lambda^{\mathrm{MFA}}_{Ai} = T\ln\frac{c_i}{c'_i} + \sum_j v^{AA}_{ij} c_j; \qquad \lambda^{\mathrm{MFA}}_{vi} = T\ln\frac{c_{vi}}{c'_i} + \sum_j v^{vA}_{ij} c_j \qquad (23)$$
where $v^{vA}_{ij}$ is an effective interaction between a vacancy and an A atom. The main-alloy-component kinetics determined by the QKE (21) can usually be described in terms of a certain equivalent direct-exchange model; this statement can be called “the equivalence theorem”. To prove it, we first note that the factor $\exp(\beta\lambda_{vi})$ in Eqs. (21) and (22) is proportional to the vacancy concentration $c_{vi}$, which is illustrated by (23) and is actually a general relation of the thermodynamics of dilute solutions. Thus the time derivatives of the mean occupations are proportional to the local vacancy concentration $c_{vi}$ or $c_{vj}$, which is natural for vacancy-mediated kinetics. As $c_{vi}$ is quite small, this implies that the main-component relaxation times are larger, by a factor $1/c_{vi}$, than the time of relaxation of the vacancies to their “quasi-equilibrium” distribution $c_{vi}\{c_i\}$ minimizing the free energy $F\{c_{vi}, c_i\}$ at the given main-component distribution $\{c_i\}$. Therefore, neglecting small corrections of relative order $c_{vi} \ll 1$, we can find this “adiabatic” vacancy distribution $c_{vi}$ by equating the left-hand side of (22) to zero. Employing for simplicity the conventional nearest-neighbor vacancy-exchange model, $\gamma^{vB}_{ij} = \delta_{ij,1}\gamma^{vB}_{nn}$ and $\gamma^{vA}_{ij} = \delta_{ij,1}\gamma^{vA}_{nn}$, we can solve this equation explicitly. The solution corresponds to the first term in square brackets in (22) being a constant, independent of the site number i, though it can generally depend on time:
$$\nu_i = \frac{\gamma^{vB}_{nn}\exp(\beta\lambda_{vi})}{\big[\gamma^{vB}_{nn} + \gamma^{Av}_{nn}\exp(\beta\lambda_{Ai})\big]\,\bar{c}_v} = \nu(t) \qquad (24)$$
where the common factor $\gamma^{vB}_{nn}$ and the average concentration of vacancies $\bar{c}_v$ are introduced for convenience. Relations (24) determine the adiabatic vacancy distribution $c_{vi}\{c_i\}$ mentioned above. Substituting these relations into (21), we obtain the QKE for the main alloy component, which has the “direct exchange” form (11) with an effective rate
$$\gamma^{\mathrm{eff}}_{ij} = \gamma^{vA}_{ij}\, \bar{c}_v\, \nu(t). \qquad (25)$$
Physically, the possibility of reducing the vacancy-mediated kinetics to an equivalent direct-exchange kinetics is connected with the above-mentioned fact that, in the course of the alloy evolution, the vacancy distribution adiabatically fast follows that of the main components. Thus it is natural to believe that
for the quasi-equilibrium stages of evolution under consideration such an equivalence holds not only for the nearest-neighbor vacancy-exchange model but is actually a general feature of any vacancy-mediated kinetics. In more detail, the features of vacancy-mediated kinetics for both the phase separation and the ordering case have been discussed by Belashchenko and Vaks [5], who used computer simulations based on Eqs. (21) and (22). The simulations confirmed the equivalence theorem for the “quasi-equilibrium” stages of evolution, $t \gg \tau_{AB}$, where $\tau_{AB}$ is the mean time needed for an exchange of neighboring A and B atoms. The function $\nu(t)$ in (24) was found to increase monotonically with the PT time t, and in the course of the PT this function slowly approaches its equilibrium value $\nu_\infty$. At the same time, at very early stages of the PT, for times t less than the vacancy distribution equilibration time $\tau_{ve}$, the equivalence theorem does not hold, as the spatial fluctuations in the initial vacancy distribution are important here. These fluctuations can lead, in particular, to the peculiar phenomenon of “localized ordering” observed by Allen and Cahn [10] in Fe–Al alloys. However, at later times $t \gg \tau_{ve} \sim \tau_{AB}\, c_v^{1/3}$, the vacancy distribution equilibrates and the equivalence theorem holds.
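The content of Eqs. (24) and (25) can be checked numerically in a few lines: for a quasi-equilibrium vacancy distribution, the site dependence of $\nu_i$ should disappear, and the surviving scalar $\nu(t)$ rescales the direct-exchange rate. All input arrays in the sketch below are placeholders to be supplied by an actual simulation.

```python
import numpy as np

# Numerical check of Eqs. (24)-(25): for a quasi-equilibrium vacancy
# distribution the site quantity nu_i must be site-independent, and the
# vacancy-mediated kinetics then maps onto a direct-exchange model with an
# effective rate gamma_eff. All inputs (site chemical potentials lam_A,
# lam_v, exchange rates g_vB, g_Av, mean vacancy concentration cv_bar,
# and T in energy units) are placeholders from a simulation.

def equivalence_check(lam_A, lam_v, g_vB, g_Av, cv_bar, T):
    nu = g_vB * np.exp(lam_v / T) / ((g_vB + g_Av * np.exp(lam_A / T)) * cv_bar)
    gamma_eff = g_Av * cv_bar * nu.mean()      # Eq. (25), taking gamma^vA = g_Av
    return nu.std() / nu.mean(), gamma_eff     # relative spread ~ 0 if Eq. (24) holds
```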
2. Applications of the Statistical Approach for Simulation of Diffusional Transformations
Numerous applications of the above-described statistical methods for simulation of diffusional PTs are discussed and compared to experimental observations in reviews [4, 7]. Below we illustrate these applications with some examples.
2.1. Methods of Simulation
Most of these simulations were based on the QKE (11). For the mobility $M_{ij}$ in this equation, the MFA expression (15) with the “nearest-neighbor symmetric atomic exchange”, $\gamma_{ij} = \delta_{ij,1}\gamma_{nn}$ and $u_{ij} = 0$, was usually used. Vaks, Beiden and Dobretsov [11] also considered the effect of an asymmetric potential $u_{ij} \neq 0$ on spinodal decomposition. For the site chemical potential $\lambda_i$ in the disordered phase and in the BCC-based ordered phases, the MFA expression (14) was employed, which is usually sufficient to describe PTs between these phases. The simulations of the L12- and L10-type orderings in FCC alloys were based on the KTCA expressions. Equations (11) were usually solved by the 4th-order Runge–Kutta method [12] with the dimensionless time variable $t' = t\gamma_{nn}$ and a variable time step $\Delta t'$. This time step was chosen so that the maximum variation $|\delta c_i| = |c_i(t' + \Delta t') - c_i(t')|$ for one time step does not exceed 0.01. The typical $\Delta t'$ values were 0.01–0.1, depending on
the evolution stage. For the PTs after a quench of a disordered alloy, the initial as-quenched distribution $c_i = c(\mathbf{R}_i)$ at $t' = 0$ was characterized by its mean value c and small random fluctuations $\delta c_i \sim \pm 0.01$. Most of the simulations were performed on 2D lattices with periodic boundary conditions, as this enables one to study more sizable structures. However, some main conclusions were also verified by 3D simulations with periodic boundary conditions.
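The time-step control described above amounts to a simple accept/shrink loop around the integrator, as in the following sketch; rk4_step is an assumed user-supplied function returning the RK4 increment of the occupation array for a trial step $\Delta t'$.

```python
import numpy as np

# Step-size control as described above: a trial RK4 step is halved until
# the largest single-step change of any c_i is below dc_max = 0.01, and
# grown again when the step turns out to be comfortably small. rk4_step is
# an assumed user-supplied function returning the RK4 increment of c.

def adaptive_step(c, dt, rk4_step, dc_max=0.01):
    while True:
        dc = rk4_step(c, dt)
        if np.max(np.abs(dc)) <= dc_max:
            break
        dt *= 0.5                      # too large a change: halve and retry
    if np.max(np.abs(dc)) < 0.5 * dc_max:
        dt *= 1.2                      # next step may be larger
    return c + dc, dt
```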
2.2. Spinodal Decomposition of Disordered Alloys
Vaks, Beiden and Dobretsov [11] simulated the spinodal decomposition (SD) of a disordered alloy after its quench into the spinodal instability area in the (c, T) plane. The interaction $v_{ij} = v(r_{ij}) = v(\mathbf{R}_i - \mathbf{R}_j)$ was assumed to be Gaussian and long-ranged: $v(r) = -A\exp(-r^2/r_v^2)$ with $r_v^2 \gg a^2$ and the constant A proportional to the critical temperature $T_c$. Some results of this simulation are presented in Figs. 1 and 2. The figures illustrate the transition from the initial stage of SD, corresponding to the development of non-interacting Cahn's concentration waves with growing amplitudes (see, e.g., [8]), to the next stages: first to the stage of non-linear interaction of concentration waves (Fig. 1), and then to the stage of interaction and fusion of newly formed precipitates via a peculiar “bridge” mechanism
Figure 1. Profiles of the concentration c(r) during spinodal decomposition for the 2D model described in the text at c = 0.35; $T' = T/T_c = 0.4$, $u_{ij} = 0$, and the following values of the reduced time $t' = t\gamma_{nn}$: (a) 5; and (b) 10. Distances on the horizontal axes are given in units of the interaction radius $r_v$.
Figure 2. Distribution of c(r) for the same model as in Fig. 1 at the following $t'$: (a) 20, (b) 120, (c) 130, (d) 140, (e) 160, (f) 180, (g) 200, and (h) 5000. The grey level varies linearly with c(r) for c between 0 and 1, from completely dark to completely bright.
illustrated by Fig. 2. This mechanism was discussed in detail by Vaks, Beiden and Dobretsov [11], while the microstructures shown in Fig. 2 reveal a striking similarity with those observed in the recent experimental studies of SD in some liquid mixtures [4].
2.3. Kinetics of B2 and D03-type Orderings
The B2 order corresponds to the splitting of the BCC lattice into two cubic sublattices, a and b, with the displacement vector $\mathbf{r}_{ab} = [1,1,1]a/2$ and the mean occupations $c_a = c + \eta$ and $c_b = c - \eta$, where η is the order parameter. There are two types of antiphase ordered domain (APD), differing in the sign of η, and one type of antiphase boundary (APB) separating these APDs. The inhomogeneously ordered alloy states including APBs can be conveniently described in terms of the local order parameter $\eta_i = \eta(\mathbf{R}_i)$ and the local concentration $\bar{c}_i = \bar{c}(\mathbf{R}_i)$, obtained by averaging the mean occupations $c_i$ over site i and its nearest neighbors:
$$\bar{c}_i = \frac{1}{2}\Big[c_i + \frac{1}{z_{nn}}\sum_{j=nn(i)} c_j\Big]; \qquad \eta_i = \frac{1}{2}\Big[c_i - \frac{1}{z_{nn}}\sum_{j=nn(i)} c_j\Big]\exp(i\mathbf{k}_1\mathbf{R}_i). \qquad (26)$$
Here the index nn(i) means summation over the nearest neighbors of site i; $z_{nn}$ is the number of such neighbors, i.e., 4 for the 2D square lattice and 8 for the 3D BCC lattice; and the superstructure vector $\mathbf{k}_1$ is $(1,1)2\pi/a$ or $(1,1,1)2\pi/a$ for the 2D or 3D case, respectively. Dobretsov, Martin and Vaks [13] investigated the kinetics of phase separation with B2 ordering using KMFA-based 2D simulations on a square lattice of 128 × 128 sites and the Fe–Al-type interaction model. The simulations enabled one to refine the earlier phenomenological considerations [10] and to find a number of new effects. As an illustration, in Fig. 3 we show the evolution after a quench of an alloy from the disordered A2 phase to the two-phase state in which SD into the B2 and A2 phases takes place. The volume ratio of these two phases in the final mixture is the same as that of the disordered “dark” and “bright” phases in Fig. 2, and so one might expect a similarity of the microstructural evolution for these two transformations. However, the formation of numerous APBs at the initial, “congruent ordering” stage of the PT A2 → A2 + B2 (which occurs at an approximately unchanged initial concentration $\bar{c}$) and the subsequent “wetting” of these APBs by the A2 phase lead to significant structural differences with respect to the SD into disordered phases. In particular, the concentration $\bar{c}(\mathbf{r})$ and the order parameter $\eta(\mathbf{r})$ at the first stages of SD, shown in Figs. 3(a)–3(c), form a “ridge-valley”-like pattern, rather
Figure 3. Temporal evolution of the mean occupations $c_i = c(\mathbf{r}_i)$ for the Fe–Al-type alloy model under the PT A2 → A2 + B2 at c = 0.175, $T' = 0.424$, and the following $t'$: (a) 50, (b) 100, (c) 200, (d) 1000, (e) 4000, and (f) 9000.
than the “hill-like” pattern seen in Fig. 1. For the PT B2 → A2 + B2, the simulations reveal some peculiar microstructural effects in the vicinity of the initial APBs: the formation of wave-like distributions, “broken layers” of ordered and disordered domains parallel to the initial APB; these results agree well with experimental observations for Fe–Al alloys [4, 10]. For the homogeneous D03 phase, the mean occupation $c_i$ can be written as
$$c_i = c + \eta\exp(i\mathbf{k}_1\mathbf{R}_i) + \zeta\big[\exp(i\mathbf{k}_2\mathbf{R}_i)\,\mathrm{sgn}(\eta) + \exp(-i\mathbf{k}_2\mathbf{R}_i)\big]. \qquad (27)$$
Here $\mathbf{R}_i$ is the BCC lattice vector of site i; $\mathbf{k}_2 = [111]\pi/a$ is the D03 superstructure vector; and η or ζ is the B2- or the D03-type order parameter. Both η and ζ in (27) can be positive or negative; thus there are four types of ordered domain and two types of APB, which separate either the APDs differing in the sign of η (“η-APBs”), or the APDs differing in the sign of ζ (“ζ-APBs”). Using relations analogous to (26), one can also define the local parameters $\eta_i$, $\zeta_i$ and $\bar{c}_i$, in particular the local order parameter $\eta_i^2$ used in Figs. 4 and 5:
$$\eta_i^2 = \frac{1}{16}\Big(c_i - \frac{2}{z_{nn}}\sum_{j=nn(i)} c_j + \frac{1}{z_{nnn}}\sum_{j=nnn(i)} c_j\Big)^2. \qquad (28)$$
Here nn(i) or nnn(i) means summation over the nearest or next-nearest neighbors of site i, and $z_{nn}$ or $z_{nnn}$ is the total number of such neighbors.
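On the 2D square lattice ($z_{nn} = 4$), the local averages of Eq. (26) reduce to a four-neighbor average and a staggered sign, since $\exp(i\mathbf{k}_1\mathbf{R}_i)$ with $\mathbf{k}_1 = (1,1)2\pi/a$ is just $(-1)^{x+y}$. A minimal sketch (periodic boundaries assumed):

```python
import numpy as np

# Local averages of Eq. (26) on the 2D square lattice (z_nn = 4):
# c is an L x L array of mean occupations c_i with periodic boundaries.

def local_fields(c):
    nn_avg = 0.25 * (np.roll(c, 1, axis=0) + np.roll(c, -1, axis=0)
                     + np.roll(c, 1, axis=1) + np.roll(c, -1, axis=1))
    x, y = np.indices(c.shape)
    sign = (-1.0) ** (x + y)            # exp(i k1.R_i) for k1 = (1,1)2*pi/a
    c_bar = 0.5 * (c + nn_avg)          # local concentration, Eq. (26)
    eta = 0.5 * (c - nn_avg) * sign     # local order parameter, Eq. (26)
    return c_bar, eta
```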
Figure 4. Temporal evolution of model I for the PT A2 → A2 + D03 at c = 0.187, $T' = T/T_c = 0.424$, and the following $t'$: (a) 10, (b) 30, (c) 100, (d) 500, (e) 1000, and (f) 2000. The grey level varies linearly with $\eta_i^2$ defined by (28), between its minimum and maximum values, from completely dark to completely bright.
Figure 5. As Fig. 4, but for model II and the PT A2 → A2 + B2 at c = 0.325, $T' = 0.424$.
The distribution of $\eta_i^2$ is similar to that observed in transmission electron microscopy (TEM) images with the reflection vector $\mathbf{k}_1$ [14]. To study the kinetics of D03 ordering, Belashchenko, Samolyuk and Vaks [15] simulated the PTs A2 → D03, A2 → A2 + D03, A2 → B2 + D03 and D03 → B2 + D03 using the Fe–Al-type interaction models. They also considered two more models, I and II, in which the deformational interaction $v^d$ was taken into account, for the PTs A2 → A2 + D03 and A2 → A2 + B2, respectively. The simulations reveal a number of microstructural features related to the “multivariance” of the D03 orderings. Some of them are illustrated in Figs. 4 and 5, where the PT A2 → A2 + D03 for model I is compared to the PT A2 → A2 + B2 for model II. The first stage of both PTs corresponds to a congruent ordering at an approximately unchanged initial concentration. Frame 4a illustrates the transient state in which only the B2-ordered APDs (“η-APDs”) are present. Frame 4b shows the formation of the D03-ordered APDs (“ζ-APDs”) within the initial η-APDs, and these ζ-APDs are much more regularly shaped than the η-APDs in frame 5b. Frames 4b–4d also illustrate the wetting of both the η-APBs and the ζ-APBs by the disordered A2 phase. Later on, the deformational interaction tends to align the ordered precipitates along the elastically soft (100) directions, and frame 4f shows an array of approximately rectangular D03-ordered precipitates, unlike the rod-like structures seen in frame 5f. The microstructure in frame 4f is similar to those observed for the PT A2 → A2 + D03 in Fe–Ga alloys, while the microstructure in frame 5f is similar to those observed for the PT B2 → B2 + D03 in Fe–Si alloys. The latter similarity reflects the topological equivalence of the A2 → A2 + B2 and B2 → B2 + D03 PTs [4].
2.4. Kinetics of L12 and L10-type Orderings
For the FCC-based L12- or L10-ordered structures, the occupation $c_i$ of the FCC lattice site $\mathbf{R}_i$ is described by three order parameters $\eta_\alpha$ corresponding to three superstructure vectors $\mathbf{k}_\alpha$:
$$c_i = c + \eta_1\exp(i\mathbf{k}_1\mathbf{R}_i) + \eta_2\exp(i\mathbf{k}_2\mathbf{R}_i) + \eta_3\exp(i\mathbf{k}_3\mathbf{R}_i)$$
$$\mathbf{k}_1 = (1,0,0)\,2\pi/a, \quad \mathbf{k}_2 = (0,1,0)\,2\pi/a, \quad \mathbf{k}_3 = (0,0,1)\,2\pi/a \qquad (29)$$
where a is the FCC lattice constant. For the cubic L12 structure, $|\eta_1| = |\eta_2| = |\eta_3|$, $\eta_1\eta_2\eta_3 > 0$, and four types of ordered domain are possible. In the L10 phase with the tetragonal axis α, a single nonzero parameter $\eta_\alpha$ is present, which is either positive or negative. Thus six types of ordered domain are possible, with two types of APB. The APB separating two APDs with the same tetragonal axis can be called, for brevity, the “shift-APB”, and that separating APDs with perpendicular tetragonal axes can be called the “flip-APB”. The inhomogeneously ordered alloy states can be described by the local parameters $\eta^2_{\alpha i}$, similar to those in Eqs. (26) and (28), and by the quantity $\eta_i^2$ characterizing the total degree of the local order:
$$\eta^2_{\alpha i} = \frac{1}{16}\Big[c_i + \frac{1}{4}\sum_{j=nn(i)} c_j \exp(i\mathbf{k}_\alpha\mathbf{R}_{ji})\Big]^2; \qquad \eta_i^2 = \eta^2_{1i} + \eta^2_{2i} + \eta^2_{3i} \qquad (30)$$
where $\mathbf{R}_{ji}$ is $\mathbf{R}_j - \mathbf{R}_i$. Belashchenko et al. [6, 9] simulated the PTs A1 → L12, A1 → A1 + L12, and A1 → L10 after a quench of an alloy from the disordered FCC phase A1. The simulations were performed in FCC simulation boxes of size $V_b = L^2 \times H$, and the value H = 1 (in units of the lattice constant a) corresponds to a quasi-2D simulation in which the simulation box contains two atomic planes. A number of different models have been considered: the short-range-interaction models 1, 2, and 3; the intermediate-range-interaction model 4, with $v_n$ estimated from the experimental data for Ni–Al alloys; and the extended-interaction model 5. In studies of the PTs A1 → L10, the models 1′–5′ were also considered, in which the deformational interaction $v^d$ was added to the “chemical” interactions $v_n$ of models 1–5. This $v^d$ was found with the use of Eq. (20) and the experimental data for Co–Pt alloys. The simulations revealed many interesting microstructural features for both the L12 and L10-type orderings. It was found, in particular, that the character of the microstructural evolution strongly depends on the type of the interaction $v_{ij}$, particularly on its interaction range $r_{int}$, as well as on the temperature T and the degree of non-stoichiometry δc, which is (c − 0.25) for the L12 phase and (c − 0.5) for the L10 phase. With increasing $r_{int}$, T, or δc, the microstructures become more isotropic and the APBs become more diffuse and mobile. At the same time, for the short-range-interaction systems at not too high T and small δc,
the microstructures are highly anisotropic, while most of the APBs are thin and have low mobility. Figures 6 and 7 illustrate these features for the L12-type orderings. Figure 6 shows the evolution under the A1 → L12 PT for the intermediate-interaction-range model 4 at the non-stoichiometric c = 0.22. We see that the distribution of APBs is virtually isotropic. The main evolution mechanism is the growth of larger domains at the expense of smaller ones, which is also typical for the simple B2 ordering. At the same time, one more mechanism, the fusion of in-phase domains, is also important for the multivariant orderings under consideration. For the later stages of evolution, Fig. 6 also reveals many approximately equiangular triple junctions of APDs, with angles of about 120°; this agrees with TEM observations for Cu–Pd alloys [14]. The kinetics of the A1 → L12 PT for a short-range-interaction system is illustrated in Fig. 7. The distribution of APBs here reveals a high anisotropy: a tendency to the formation of thin “conservative” APBs with (100)-type orientations. One also observes many “step-like” APBs with conservative segments; triple junctions of APBs with one non-conservative APB and two conservative APBs; and “quadruple” junctions of APDs. All these features were
Figure 6. Temporal evolution of model 4 under the PT A1 → L12 for the simulation box size $V_b = 128^2 \times 1$ at c = 0.22, $T' = 0.685$ and the following $t'$: (a) 5; (b) 50; (c) 120; (d) 125; (e) 140; and (f) 250. The grey level varies linearly with $\eta_i^2 = \eta_{1i}^2 + \eta_{2i}^2 + \eta_{3i}^2$ between its minimum and maximum values, from completely dark to completely bright. The symbol A, B, C or D indicates the type of the ordered domain, and the thick arrow indicates the fusion-of-domains process.
Figure 7. As Fig. 6, but for model 1 and $V_b = 64^2 \times 1$ at c = 0.25, $T' = 0.57$ and the following $t'$: (a) 2, (b) 3, (c) 20, (d) 100, (e) 177 and (f) 350.
observed in the electron microscopy studies of Cu3Au alloys [14]. Figure 7 also illustrates the peculiar kinetic processes related to conservative APBs, discussed by Vaks [4, 7]. The L10 structure, unlike the cubic L12 structure, is tetragonal and has a tetragonal distortion. Depending on the importance of this distortion, the evolution in the course of the A1 → L10 PT can be divided into three stages.
I. The initial stage, when the L10-ordered APDs are quite small, their tetragonal distortion is insignificant, and all six types of APD are present in the same proportion.
II. The intermediate stage, when the tetragonal distortion of the APDs leads to some predominance of the (110)-type orientations of the flip-APBs and to a decrease of the fraction of APDs with the unfavorable orientation (001).
III. The final, “twin” stage, when well-defined twin bands delimited by the flip-APBs with (110)-type orientations are formed. Each band includes only two types of APD with the same tetragonal axis, and these axes in adjacent bands are “twin”-related, i.e., have alternating (100) and (010) orientations.
The thermodynamic driving force for the (110)-type orientation of the flip-APBs is the gain in the elastic energy: at other orientations this energy increases proportionally to the volume of the adjacent APDs [8].
The simulations of the PTs A1 → L10 [9] revealed a number of peculiar microstructural features for each of the stages mentioned above. Figures 8 and 9 illustrate some of these features. Frame 8a corresponds to stage I; frames 8b–8c, to stage II; and frames 8d–8f and 9a–9d, to stage III. The following processes and configurations are seen to be characteristic of both stage I and stage II: (1) abundant processes of fusion of in-phase domains, which are among the main mechanisms of domain growth at these stages; (2) peculiar long-living configurations, the quadruple junctions of APDs (4-junctions), of the type $A_1A_2A'_1A_3$, where $A_2$ and $A_3$ can correspond to any two of the four types of APD different from $A_1$ and $A'_1$; (3) many processes of “splitting” of a shift-APB into two flip-APBs, which leads either to the fusion of in-phase domains or to the formation of a 4-junction. For the final, “nearly equilibrium” twin stage, Figs. 8f and 9a–9d demonstrate a peculiar alignment of the shift-APBs: within a (100)-oriented twin band in a (110)-type polytwin, the APBs tend to align normally to some direction $\mathbf{n} = (\cos\alpha, \sin\alpha, 0)$ characterized by a “tilting” angle α, which is mainly
Figure 8. Temporal evolution of model 4 under the PT A1 → L10 for $V_b = 128^2 \times 1$ at c = 0.5, $T' = 0.67$, and the following $t'$: (a) 10; (b) 20; (c) 50; (d) 400; (e) 750; and (f) 1100. The symbol A or Ā, B or B̄, and C or C̄ indicates an APD with the tetragonality axis along (100), (010) and (001), respectively. The thick, the thin, and the single arrow indicate the fusion-of-domains process, the quadruple junction of APDs, and the APB-splitting process, respectively.
Figure 9. As Fig. 8, but for model 2 at the following values of c, $T'$, and $t'$: (a) c = 0.5, $T' = 0.77$, and $t' = 350$; (b) c = 0.5, $T' = 0.95$, and $t' = 300$; (c) c = 0.46, $T' = 0.77$, and $t' = 350$; and (d) c = 0.44, $T' = 0.77$, and $t' = 300$.
determined by the type of the chemical interaction. For the short-range-interaction systems this angle is close to zero, in agreement with observations for CuAu. For the intermediate-interaction-range systems, the scale of α is illustrated by Fig. 8f, and the alignment of APBs shown in this figure is very similar to that observed for a Co0.4Pt0.6 alloy [4]. Figure 9 also illustrates sharp changes of the alignment type under variation of the temperature T and the non-stoichiometry δc, including “faceting-tilting”-type morphological transitions.
3. Outlook
Over the last decade, the statistical theory of diffusional PTs has been formulated in terms of both approximate and exact kinetic equations and has been applied to studies of many concrete problems. These applications yielded numerous new results, many of them agreeing well with experimental observations. Many predictions of this theory are still awaiting experimental verification. At the same time, a number of further problems in this approach remain to be solved, such as the elaboration of a microscopic “phase-field-type” approach suitable for treatments of sizeable and complex structures [2]; the consistent treatment of fluctuative effects, including the problem of the nucleation of embryos of a new phase within the metastable one; and others. Work on some of these problems is now underway, and in the near future one can expect further progress in this field.
References
[1] P.E.A. Turchi and A. Gonis (eds.), “Phase transformations and evolution in materials,” TMS, Warrendale, 2000.
[2] I.R. Pankratov and V.G. Vaks, “Generalized Ginzburg–Landau functionals for alloys: general equations and comparison to the phase-field method,” Phys. Rev. B, 68, 134208, 2003.
[3] V.G. Vaks, “Master equation approach to the configurational kinetics of nonequilibrium alloys: exact relations, H-theorem and cluster approximations,” JETP Lett., 78, 168–178, 1996.
[4] V.G. Vaks, “Kinetics of phase separation and orderings in alloys,” Physics Reports, 391, 157–242, 2004.
[5] K.D. Belashchenko and V.G. Vaks, “Master equation approach to configurational kinetics of alloys via vacancy exchange mechanism: general relations and features of microstructural evolution,” J. Phys.: Condens. Matter, 10, 1965–1983, 1998.
[6] K.D. Belashchenko, V.Yu. Dobretsov, I.R. Pankratov et al., “The kinetic cluster-field method and its application to studies of L12-type orderings in alloys,” J. Phys.: Condens. Matter, 11, 10593–10620, 1999.
[7] V.G. Vaks, “Kinetics of L12-type and L10-type orderings in alloys,” JETP Lett., 78, 168–178, 2003.
[8] A.G. Khachaturyan, “Theory of structural phase transformations in solids,” Wiley, New York, 1983.
[9] K.D. Belashchenko, I.R. Pankratov, G.D. Samolyuk et al., “Kinetics of formation of twinned structures under L10-type orderings in alloys,” J. Phys.: Condens. Matter, 14, 565–589, 2002.
[10] S.M. Allen and J.W. Cahn, “Mechanisms of phase transformations within the miscibility gap of Fe-rich Fe–Al alloys,” Acta Metall., 24, 425–437, 1976.
[11] V.G. Vaks, S.V. Beiden, and V.Yu. Dobretsov, “Mean-field equations for configurational kinetics of alloys at arbitrary degree of nonequilibrium,” JETP Lett., 61, 68–73, 1995.
[12] G. Korn and T. Korn, “Mathematical handbook for scientists and engineers,” McGraw-Hill, New York, 1961.
[13] V.Yu. Dobretsov, V.G. Vaks, and G. Martin, “Kinetic features of phase separation under alloy ordering,” Phys. Rev. B, 54, 3227–3239, 1996.
[14] A. Loiseau, C. Ricolleau, L. Potez, and F. Ducastelle, “Order and disorder at interfaces in alloys,” In: W.C. Johnson, J.M. Howe, D.E. Laughlin, and W.A. Soffa (eds.), Solid–Solid Phase Transformations, pp. 385–400, TMS, Warrendale, 1994.
[15] K.D. Belashchenko, G.D. Samolyuk, and V.G. Vaks, “Kinetic features of alloy ordering with many types of ordered domain: D03-type ordering,” J. Phys.: Condens. Matter, 10, 10567–10592, 1999.
7.11 MODELING THE DYNAMICS OF DISLOCATION ENSEMBLES
Nasr M. Ghoniem
Department of Mechanical and Aerospace Engineering, University of California, Los Angeles, CA 90095-1597, USA
1. Introduction
A fundamental description of plastic deformation is under development by several research groups as a result of dissatisfaction with the limitations of continuum plasticity theory. The reliability of continuum plasticity descriptions is dependent on the accuracy and range of available experimental data. Under complex loading situations, however, the database is often hard to establish. Moreover, the lack of a characteristic length scale in continuum plasticity makes it difficult to predict the occurrence of critical localized deformation zones. It is widely appreciated that plastic strain is fundamentally heterogeneous, displaying high strains concentrated in small material volumes, with virtually undeformed regions in between. Experimental observations consistently show that plastic deformation is internally heterogeneous at a number of length scales [1–3]. Depending on the deformation mode, heterogeneous dislocation structures appear with definitive wavelengths. It is common to observe persistent slip bands (PSBs), shear bands, dislocation pile-ups, dislocation cells and subgrains. However, a satisfactory description of realistic dislocation patterning and strain localization has been rather elusive. Since dislocations are the basic carriers of plasticity, the fundamental physics of plastic deformation must be described in terms of the behavior of dislocation ensembles. Moreover, the deformation of thin films and nanolayered materials is controlled by the motion and interactions of dislocations. For all these reasons, there has been significant recent interest in the development of robust computational methods to describe the collective motion of dislocation ensembles. Studies of the mechanical behavior of materials at a length scale larger than what can be handled by direct atomistic simulations, and smaller than what allows macroscopic continuum averaging, represent particular difficulties. Two
2270
N.M. Ghoniem
complimentary approaches have been advanced to model the mechanical behavior in this meso length scale. The first approach, commonly known as dislocation dynamics (DD), was initially motivated by the need to understand the origins of heterogeneous plasticity and pattern formation. In its early versions, the collective behavior of dislocation ensembles was determined by direct numerical simulations of the interactions between infinitely long, straight dislocations [3–9]. Recently, several research groups extended the DD methodology to the more physical, yet considerably more complex 3D simulations. Generally, coarse resolution is obtained by the Lattice Method, developed by Kubin et al. [10] and Moulin et al. [11], where straight dislocation segments (either pure screw or edge in the earliest versions, or of a mixed character in more recent versions) are allowed to jump on specific lattice sites and orientations. Straight dislocation segments of mixed character in the The Force Method, developed by Hirth et al. [12] and Zbib et al. [13] are moved in a rigid body fashion along the normal to their mid-points, but they are not tied to an underlying spatial lattice or grid. The advantage of this method is that the explicit information on the elastic field is not necessary, since closed-form solutions for the interaction forces are directly used. The Differential Stress Method developed by Schwarz and Tersoff [14] and Schwarz [15] is based on calculations of the stress field of a differential straight line element on the dislocation. Using numerical integration, Peach–Koehler forces on all other segments are determined. The Brown procedure [16] is then utilized to remove the singularities associated with the self-force calculation. The method of The Phase Field Microelasticity [17–19] is of a different nature. It is based on Khachaturyan–Shatalov (KS) reciprocal space theory of the strain in an arbitrary elastically homogeneous system of misfitting coherent inclusions embedded into the parent phase. Thus, consideration of individual segments of all dislocation lines is not required. Instead, the temporal and spatial evolution of several density function profiles (fields) are obtained by solving continuum equations in Fourier space. The second approach to mechanical models at the mesoscale has been based on statistical mechanics methods [20–24]. In these developments, evolution equations for statistical averages (and possibly for higher moments) are to be solved for a complete description of the deformation problem. We focus here on the most recent formulations of 3D DD, following the work of Ghoniem et al. We review here the most recent developments in computational DD for the direct numerical simulation of the interaction and evolution of complex, 3D dislocation ensembles. The treatment is based on the parametric dislocation dynamics (PDD), developed by Ghoniem et al. In Section 2, we describe the geometry of dislocation loops with curved, smooth, continuous parametric segments. The stress field of ensembles of such curved dislocation loops is then developed in Section 3. Equations of motion for dislocation loops
Modeling the dynamics of dislocation ensembles
2271
are derived on the basis of irreversible thermodynamics, where the time rate of change of generalized coordinates will be given in Section 4. Extensions of these methods to anisotropic materials and multi-layered thin films are discussed in Section 5. Applications of the parametric dislocation dynamics methods are given in Section 6, and a discussion of future directions is finally outlined in Section 7.
2.
Computational Geometry of Dislocation Loops
Assume that the dislocation line is segmented into (n s ) arbitrary curved segments, labeled (1 ≤ i ≤ n s ). For each segment, we define rˆ (ω)=P(ω) as the position vector for any point on the segment, T(ω) = T t as the tangent vector to the dislocation line, and N(ω) = N n as the normal vector at any point (see Fig. 1). The space curve is then completely described by the parameter ω, if one defines certain relationships which determine rˆ (ω). Note that the position of any other point in the medium (Q) is denoted by its vector r, and that the vector connecting the source point P to the field point is R, thus R = r − rˆ . In the following developments, we restrict the parameter 0 ≤ ω ≤ 1, although we map it later on the interval −1 ≤ ωˆ ≤ 1, and ωˆ = 2ω − 1 in the numerical quadrature implementation of the method. To specify a parametric form for rˆ (ω), we will now choose a set of gen( j) eralized coordinates qi for each segment ( j ), which can be quite general. If one defines a set of basis functions C i (ω), where ω is a parameter, and allows
g3 ⫽ b冫 冩 b 冩 P ω⫽ 0
R
g2 ⫽ t
g2 ⫽ e r
Q
ω⫽ 1
1z
1x
Figure 1. segment.
1y
Differential geometry representation of a generalparametric curved dislocation
2272
N.M. Ghoniem
for index sums to extend also over the basis set (i = 1, 2, . . . , I ), the equation of the segment can be written as ( j) rˆ ( j ) (ω) = qi Ci (ω)
2.1.
(1)
Linear Parametric Segments
The shape functions of linear segments Ci (ω), and their derivatives Ci,ω take the form: C1 = 1 − ω, C2 = ω and C1,ω = −1, C2,ω = 1. Thus, the available degrees of freedom for a free, or unconnected linear segment ( j ) are just the position vectors of the beginning ( j ) and end ( j + 1) nodes. ( j)
( j)
q1k = Pk
2.2.
and
( j)
( j +1)
q2k = Pk
(2)
Cubic Spline Parametric Segments
For cubic spline segments, we use the following set of shape functions, their parametric derivatives, and their associated degrees of freedom, respectively: C1 = 2ω3 − 3ω2 + 1, C2 = −2ω3 + 3ω2 , C3 = ω3 − 2ω2 + ω, and C4 = ω3 − ω2 C1,ω = 6ω2 − 6ω, C2,ω = −6ω2 + 6ω2 , C3,ω = 3ω2 − 4ω + 1, and C4,ω = 3ω2 − 2ω ( j) q1k
=
( j) Pk ,
( j) q2k
=
( j +1) Pk ,
( j) q3k
=
( j) Tk ,
and
( j) q4k
=
( j +1) Tk
(3) (4) (5)
Extensions of these methods to other parametric shape functions, such as circular, elliptic, helical, and composite quintic space curves are discussed by Ghoniem et al. [25]. Forces and energies of dislocation segments are given per unit length of the curved dislocation line. Also, line integrals of the elastic field variables are carried over differential line elements. Thus, if we express the Cartesian ( j) ( j) ( j) differential in the parametric form: dk = rˆk, ω dω = qsk Cs, ω dω. The arc length differential for segment j is then given by
( j)
( j ) 1/2
| d( j ) | = dk dk
( j)
( j)
( j ) ( j ) 1/2
= rˆk, ω rˆk, ω
= q pk C p, ω qsk Cs, ω
1/2
dω
dω
(6) (7)
Modeling the dynamics of dislocation ensembles
3.
2273
Elastic Field Variables as Fast Sums
3.1.
Formulation
In materials that can be approximated as infinite and elastically isotropic, the displacement vector u, strain ε and stress σ tensor fields of a closed dislocation loop are given by deWit [26] ui = −
εi j =
σi j
bi 4π
1 8π
=
Ak dlk +
C
ikl bl R, pp +
C
1 kmn bn R,mi dlk 1−ν
−
1 j kl bi R,l + ikl b j R,l − ikl bl R, j − j kl bl R,i , pp 2
×
kmn bn R,mi j dlk 1−ν
C
µ 4π
1 8π
(8)
C
(9)
1 1 R,mpp j mn dli + imn dl j + kmn 2 1−ν
× R,i j m − δi j R, ppm dlk
(10)
where µ and ν are the shear modulus and Poisson’s ratio, respectively, b is Burgers vector of Cartesian components bi , and the vector potential Ak (R) = i j k X i s j /[R(R+R· s)] satisfies the differential equation: pik Ak, p (R) = X i R −3 , where s is an arbitrary unit vector. The radius vector R connects a source point on the loop to a field point, as shown in Fig. 1, with Cartesian components Ri , successive partial derivatives R,i j k... , and magnitude R. The line integrals are carried along the closed contor C defining the dislocation loop, of differential arc length dl of components dlk . Also, the interaction energy between two closed loops with Burgers vectors b1 and b2 , respectively, can be written as µb1i b2 j EI = − 8π
R,kk C (1) C (2)
2ν dl2i dl1 j dl2 j dl1i + 1−ν
2 (R,i j − δi j R,ll )dl2k dl1k + 1−ν
(11)
The higher order derivatives of the radius vector, R,i j and R,i j k are components of second and third order Cartesian tensors that are explicitly known [27]. The dislocation segment in Fig. 1 is fully determined as an affine mapping on the scalar interval ∈ [0, 1], if we introduce the tangent vector T,
2274
N.M. Ghoniem
the unit tangent vector t, the unit radius vector e, and the vector potential A, as follows T=
dl , dω
t=
T , |T|
e=
R , R
A=
e×s R(1 + e · s)
Let the Cartesian orthonormal basis set be denoted by 1 ≡ {1x , 1 y , 1z }, I = 1 ⊗ 1 as the second order unit tensor, and ⊗ denotes tensor product. Now define the three vectors (g1 = e, g2 = t, g3 = b/|b|) as a covariant basis set for the curvilinear segment, and their contravariant reciprocals as: gi · g j = δ ij , where δ ij is the mixed Kronecker delta and V = (g1 × g2 ) · g3 the volume spanned by the vector basis, as shown in Fig. 1. When the previous relationships are substituted into the differential forms of Eqs. (8), (10), with V1 = (s × g1 ) · g2 , and s an arbitrary unit vector, we obtain the differential relationships (see Ref. [27] for details)
|b||T|V (1 − ν)V1 / V du = g3 + (1 − 2ν)g1 + g1 dω 8π(1 − ν)R 1 + s · g1 V |T| d 1 1 =− −ν g ⊗ g + g ⊗ g 1 1 dω 8π(1 − ν)R 2
+ (1 − ν) g3 ⊗ g3 + g3 ⊗ g3 + (3g1 ⊗ g1 − I) µV |T| dσ 1 1 = g ⊗ g + g ⊗ g 1 1 dω 4π(1 − ν)R 2
+ (1 − ν) g2 ⊗ g2 + g2 ⊗ g2 − (3g1 ⊗ g1 + I)
µ|T1 ||b1 ||T2 ||b2 | d2 E I =− (1 − ν) g2I · g3I g2II · g3II dω1 dω2 4π(1 − ν)R
+ 2ν g2II · g3I
+ g3I · g1
g2I · g3II − g2I · g2II
g3II · g1
g3I · g3II
µ|T1 ||T2 ||b|2 d2 E S =− (1 + ν) g3 · g2I g3 · g2II dω1 dω2 8π R (1 − ν)
− 1 + (g3 · g1 )2
g2I · g2II
(12)
The superscripts I and II in the energy equations are for loops I and II , respectively, and g1 is the unit vector along the line connecting two interacting points on the loops. The self energy is obtained by taking the limit of 1/2 the interaction energy of two identical loops, separated by the core distance. Note that the interaction energy of prismatic loops would be simple, because g3 · g2 = 0. The field equations are affine transformation mappings of the scalar interval neighborhood dω to the vector (du) and second order tensor (d, dσ)
Modeling the dynamics of dislocation ensembles
2275
neighborhoods, respectively. The maps are given by covariant, contravariant and mixed vector, and tensor functions.
3.2.
Analytical Solutions
In some simple geometry of Volterra-type dislocations, special relations between b, e, and t can be obtained, and the entire dislocation line can also be described by one single parameter. In such cases, one can obtain the elastic field by proper choice of the coordinate system, followed by straight-forward integration. Solution variables for the stress fields of infinitely-long pure and edge dislocations are given in Table 1, while those for the stress field along the 1z -direction for circular prismatic and shear loops are shown in Table 2. Note that for the case of a pure screw dislocation, one has to consider the product of V and the contravariant vectors together, since V = 0. When the parametric equations are integrated over z from −∞ to +∞ for the straight dislocations, and over θ from 0 to 2π for circular dislocations, one obtains the entire stress field in dyadic notation as: 1. Infinitely-long screw dislocation µb − sin θ 1x ⊗ 1z + cos θ 1 y ⊗ 1z + cos θ 1z ⊗ 1 y 2πr − sin θ 1z ⊗ 1x }
σ=
(13)
Table 1. Variables for screw and edge dislocations Screw dislocation
Edge dislocation
g2
1 (r cos θ1x + r sin θ1 y + z1z ) R 1z
1 (r cos θ1x + r sin θ1 y + z1z ) R 1z
g3
1z
1x
g1
0
1 1y V
g1
g2 V g3 V T R V
r
r
r 2 + z2 r 2 + z2
dz 1z dω
(− sin θ1x + cos θ1 y ) V (sin θ1x − cos θ1 y ) V
r 2 + z2 r
r 2 + z2
dz 1z dω
0
r 2 + z2
1
r 2 + z2
r sin θ r 2 + z2
(−z1 y + r sin θ1z ) (sin θ1x − cos θ1 y )
2276
N.M. Ghoniem
Table 2. Variables for circular shear and prismatic loops Shear loop 1
Prismatic loop
g2
− sin θ1x + cos θ1 y
− sin θ1x + cos θ1 y
g3
1x
1z
g1
−
r 2 + z2
g2 V g3 V
T R V
(r cos θ1x + r sin θ1 y + z1z )
cos θ 1y V 1
(−z1 y + r sin θ1z )
(−z cos θ1x − z sin θ1 y
r 2 + z2 1 r 2 + z2
−r sin θ
+ r 1z )
dθ dθ 1x + r cos θ 1y dω dω
1
g1
r 2 + z2
1 (cos θ1x + sin θ1 y ) V r (− sin θ1x + cos θ1 y ) V r 2 + z2 1 (−z cos θ1x − z sin θ1 y V r 2 + z2 + r 1z ) −r sin θ
r 2 + z2
(r cos θ1x + r sin θ1 y + z1z )
dθ dθ 1x + r cos θ 1y dω dω
r 2 + z2
z cos θ − r 2 + z2
r
r 2 + z2
2. Infinitely-long edge dislocation µb sin θ(2 + cos 2θ )1x ⊗1x − (sin θ cos 2θ )1 y ⊗1 y 2π(1 − ν)r + (2ν sin θ)1z ⊗ 1z − (cos θ cos 2θ)(1x ⊗ 1 y + 1 y ⊗ 1x ) (14)
σ=−
3. Circular shear loop (evaluated on the 1z -axis)
σ=
µbr 2 2 2 2 (ν − 2)(r + z ) + 3z 4(1 − ν)(r 2 + z 2 )5/2 × 1 x ⊗ 1z + 1z ⊗ 1 x
(15)
4. Circular prismatic loop (evaluated on the 1z -axis)
σ=
µbr 2 (2(1 − ν)(r 2 + z 2 ) − 3r 2 ) 4(1 − ν)(r 2 + z 2 )5/2
× 1x ⊗ 1x + 1 y ⊗ 1 y − 2(4z 2 + r 2 ) 1z ⊗ 1z
(16)
As an application of the method in calculations of self- and interaction energy between dislocations, we consider here two simple cases. First, the
Modeling the dynamics of dislocation ensembles
2277
interaction energy between two parallel screw dislocations of length L and with a minimum distance ρ between them is obtained by making the following substitutions in Eq. (12) g2I = g2II = g3I = g3II = 1z ,
|T| =
dl = 1, dz
z2 − z1 1z · g1 = 2 ρ + (z 2 − z 1 )2
where z 1 and z 2 are distances along 1z on dislocations 1 and 2, respectively, connected along the unit vector g1 . The resulting scalar differential equation for the interaction energy is µb2 d2 E I =− dz 1 dz 2 4π(1 − ν)
(z 2 − z 1 )2 ν − 2 ρ 2 + (z 2 − z 1 )2 [ρ + (z 2 − z 1 )2 ] 3/2
(17) Integration of Eq. (17) over a finite length L yields identical results to those obtained by deWit [26] and by application of the more standard Blin formula [28]. Second, the interaction energy between two coaxial prismatic circular dislocations with equal radius can be easily obtained by the following substitutions g3I = g3II = 1z , g2I = − sin ϕ1 1x + cos ϕ1 1 y , g2II = − sin ϕ2 1x + cos ϕ2 1 y ϕ1 − ϕ2 2 z ) , 1z · g1 = 1z · g2I = 0, R 2 = z 2 + (2ρ sin 2 R Integration over the variables ϕ1 and ϕ2 from (0 − 2π ) yields the interaction energy.
4.
Dislocation Loop Motion
Consider the virtual motion of a dislocation loop. The mechanical power during this motion is composed of two parts: (1) change in the elastic energy stored in the medium upon loop motion under the influence of its own stress (i.e., the change in the loop self-energy), (2) the work done on moving the loop as a result of the action of external and internal stresses, excluding the stress contribution of the loop itself. These two components constitute the Peach– Koehler work [29]. The main idea of DD is to derive approximate equations of motion from the principle of Virtual Power Dissipation of the second law of thermodynamics Ghoniem et al. [27]. Once the parametric curve for the dislocation segment is mapped onto the scalar interval {ω ∈ [0, 1]}, the stress field everywhere is obtained as a fast numerical quadrature sum [30]. The Peach– Koehler force exerted on any other dislocation segment can be obtained from the total stress field (external and internal) at the segment as [30]. F P K = σ · b × t.
2278
N.M. Ghoniem
The total self-energy of the dislocation loop is determined by double line integrals. However, Gavazza and Barnett [31] have shown that the first variation in the self-energy of the loop can be written as a single line integral, and that the majority of the contribution is governed by the local line curvature. Based on these methods for evaluations of the interaction and self-forces, the weak variational form of the governing equation of motion of a single dislocation loop was developed by Ghoniem et al. [25] as
Fkt − Bαk Vα δrk |ds| = 0
(18)
Here, Fkt are the components of the resultant force, consisting of the Peach– Koehler force F P K (generated by the sum of the external and internal stress fields), the self-force Fs , and the Osmotic force F O (in case climb is also considered [25]). The resistivity matrix (inverse mobility) is Bαk , Vα are the velocity vector components, and the line integral is carried along the arc length of the dislocation ds. To simplify the problem, let us define the following dimensionless parameters r r∗ = , a
f∗ =
F , µa
t∗ =
µt B
Here, a is lattice constant, and t is time. Hence Eq. (18) can be rewritten in dimensionless matrix form as dr∗ ∗ ∗ ∗ δr f − ∗ ds = 0 (19) dt ∗
Here, f∗ = [ f 1∗ , f 2∗ , f 3∗ ] and r∗ = [r1∗ , r2∗ , r3∗ ] , which are all dependent on the dimensionless time t ∗ . Following Ghoniem et al. [25], a closed dislocation loop can be divided into Ns segments. In each segment j , we can choose a set of generalized coordinates qm at the two ends, thus allowing parametrization of the form r∗ = CQ
(20)
Here, C = [C1 (ω), C2 (ω), . . . , Cm (ω)], Ci (ω), (i = 1, 2, . . . , m) are shape functions dependent on the parameter (0 ≤ ω ≤ 1) and Q = [q1 , q2 , . . . , qm ] , qi are a set of generalized coordinates. Substituting Eq. (20) into Eq. (19), we obtain Ns j =1
Let,
δQ
dQ C f − C C ∗ |ds| = 0 dt ∗
j
fj = j
C f∗ |ds| ,
kj = j
C C |ds|
(21)
Modeling the dynamics of dislocation ensembles
2279
Following a similar procedure to the FEM, we assemble the EOM for all contiguous segments in global matrices and vectors, as F=
Ns j =1
fj,
K=
Ns
kj
j =1
then, from Eq. (21) we get, dQ =F (22) dt ∗ The solution of the set of ordinary differential Eq. (22) describes the motion of an ensemble of dislocation loops as an evolutionary dynamical system. However, additional protocols or algorithms are used to treat: (1) strong dislocation interactions (e.g., junctions or tight dipoles), (2) dislocation generation and annihilation, (3) adaptive meshing as dictated by large curvature variations [25]. In the The Parametric Method [25, 27, 32, 33] presented above, the dislocation loop can be geometrically represented as a continuous (to second derivative) composite space curve. This has two advantages: (1) there is no abrupt variation or singularities associated with the self-force at the joining nodes in between segments, (2) very drastic variations in dislocation curvature can be easily handled without excessive re-meshing. K
5.
Dislocation Dynamics in Anisotropic Crystals
Extension of the PDD to anisotropic linearly elastic crystals follows the same procedure described above, with the exception of two aspects [34]. First, calculations of the elastic field, and hence forces on dislocations, is computationally more demanding. Second, the dislocation self-force is obtained from non-local line integrals. Thus PDD simulations in anisotropic materials are about an order of magnitude slower than in isotropic materials. Mura [35] derived a line integral expression for the elastic distortion of a dislocation loop, as u i, j (x)= ∈ j nk C pqmn bm
G ip,q (x − x )νk dl(x ),
(23)
L
where νk is the unit tangent vector of the dislocation loop line L, dl is the dislocation line element, ∈ j nh is the permutation tensor, Ci j kl is the fourth order elastic constants tensor, G i j ,l (x − x ) = ∂ G i j (x − x )/∂ xl , and G i j (x − x ) are the Green’s tensor functions, which correspond to displacement component along the xi -direction at point x due to a unit point force in the x j -direction applied at point x in an infinite medium.
2280
N.M. Ghoniem
The elastic distortion formula (23) involves derivatives of the Green’s functions, which need special consideration. For general anisotropic solids, analytical expressions for G i j,k are not available. However, these functions can be expressed in an integral form (see, e.g., Refs. [36–39]), as G i j ,k (x − x ) =
1 2 8π |r|2
Ck
¯ −1 (k) ¯ − r¯k Ni j (k)D
¯ j m (k)D ¯ −2 (k) ¯ dφ + k¯k Clpmq (¯r p k¯q + k¯ p r¯q )Nil (k)N (24) where r = x − x , r¯ = r/|r|, k¯ is the unit vector on the plane normal to r, the integral is taken around the unit circle Ck on the plane normal to r, Ni j (k) and D(k) are the adjoint matrix and the determinant of the second order tensor Cikj l kk kl , respectively. The in-plane self-force at the point P on the loop is also obtained in a manner similar to the external Peach–Koehler force, with an additional contribution from stretching the dislocation line upon a virtual infinitesimal motion [40] F S = κ E(t) − b · σ¯ S · n
(25)
where E(t) is the pre-logarithmic energy factor for an infinite straight dislocation parallel to t: E(t) = 12 b · (t) · n, with (t) being the stress tensor of an infinite straight dislocation along the loop’s tangent at P. σ S is self stress tensor due to the dislocation L, and σ¯ = 12 [σ S (P + m) + σ S (P − m)] is the average self-stress at P, κ is the in-plane curvature at P, and = |b|/2. Barnett [40] and Gavazza and Barnett [31] analyzed the structure of the self-force as a sum 8 S − J (L , P) + Fcore (26) F = κ E(t) − κ E(t) + E (t) ln κ where the second and third terms are line tension contributions, which usually account for the main part of the self-force, while J (L , P) is a non-local contribution from other parts of the loop, and Fcore is due to the contribution to the self-energy from the dislocation core.
6.
Selected Applications
Figure 2 shows the results of computer simulations of plastic deformation in single crystal copper (approximated as elastically isotropic) at a constant strain rate of 100 s−1 . The initial dislocation density of ρ = 2 × 1013 m−2 has been divided into 300 complete loops. Each loop contains a random number
Modeling the dynamics of dislocation ensembles
2281
Figure 2. Results of computer simulations for dislocation microstructure deformation in copper deformed to increasing levels of strain (shown next to each microstructure).
of initially straight glide and superjog segments. When a generated or expanding loop intersects the simulation volume of 2.2 µm side length, the segments that lie outside the simulation boundary are periodically mapped inside the simulation volume to preserve translational strain invariance, without loss of dislocation lines. The number of nodes on each loop starts at five, and is then increased adaptively proportional to the loop length, with a maximum number of 20 nodes per loop. The total number of Degrees of Freedom (DOF) starts at 6000, and is increased to 24 000 by the end of the calculation. However, the number of interacting DOF is determined by a nearest neighbor criterion, within a distance of 400a (where a is the lattice constant), and is based on a binary tree search. The dislocation microstructure is shown in Fig. 2 at different total strain. It is observed that fine slip lines that nucleate at low strains evolve into more pronounced slip bundles at higher strains. The slip bundles are well-separated in space forming a regular pattern with a wavelength of approximately one micron. Conjugate slip is also observed, leading to the formation of dislocation junction bundles and stabilization of a cellular structures. Next, we consider the dynamic process of dislocation dipole formation in anisotropic single crystals. To measure the degree of deviation from elastic isotropy, we use the anisotropy ratio A, defined in the usual manner: A = 2C44 /(C11 − C12 ) [28]. For an isotropic crystal, A = 1. Figure 3(a) shows the configurations (2D projected on the (111)-plane) of two pinned dislocation segments, lying on parallel (111)-planes. The two dislocation segments are
2282
N.M. Ghoniem (a) 300
A⫽1 A⫽2 A ⫽ 0.5
200
[⫺1 ⫺1 2]
b Stable dipole
100
0
⫺500
0
500
[⫺1 1 0]
(b) 0.4 A⫽1 A⫽1
0.35
A⫽2 A⫽1
τ/µ (%)
0.3 A ⫽ 0.5
0.25
0.2
0.15 Backward break up Forward break up Infinite dipole
0.1
0.05
0.04
0.08
0.12
a/h
Figure 3. Evolution of dislocation dipoles without applied loading (a) and dipole break up shear stress (b).
¯ initially straight, parallel, and along [110], but of opposite line directions, ¯ have the same Burgers vector b = 1/2[101], and are pinned √ at both ends. Their 3a, L : d : h = 800 : glide planes are separated by h. In this figure, h = 25 √ 300 : 25 3, with L and d being the length of the initial dislocation segments and the horizontal distance between them, respectively. Without the application of any external loading, the two lines attract one another, and form an equilibrium state of a finite-size dipole. The dynamic shape of the segments during the dipole formation is seen to be dependent on the anisotropy ratio A, while the final configuration appears to be insensitive to A. Under external loading, the dipole may be unzipped, if applied forces overcome binding forces between dipole arms. The forces (resolved shear stresses τ , divided by µ = (C11 − C12 )/2) to break up the dipoles are shown in Fig. 3(b). It can be seen that the break up stress is inversely proportional to the separation distance h, consistent with the results of infinite-size dipoles. It is easier to break up dipoles in crystals with smaller A-ratios (e.g., some BCC crystals). It is also noted that two ways to break up dipoles are possible: in backward direction (where the self-force assists the breakup), or forward direction (where the
Modeling the dynamics of dislocation ensembles
2283
self-force opposes the breakup). For a finite length dipole, the backward break up is obviously easier than the forward one, due to the effects of self forces induced by the two curved dipole arms, as can be seen in Fig. 3(b). As a final application, we consider dislocation motion in multi-layer anisotropic thin films. It has been experimentally shown that the strength of multilayer thin films is increased as the layer thickness is decreased, and that maximum strength is achieved for layer thickness on the order of 10–50 nm. Recently, Ghoniem and Han [41] developed a new computational method for the simulation of dislocation ensemble interactions with interfaces in anisotropic, nanolaminate superlattices. Earlier techniques in this area use cumbersome and inaccurate numerical resolution by superposition of a regular elastic field obtained from a finite element, boundary element, surface dislocation or point force distributions to determine the interaction forces between 3D dislocation loops and interfaces. The method developed by Ghoniem and Han [41] utilizes two-dimensional Fourier Transforms to solve the full elasticity problem in the direction transverse to interfaces, and then by numerical inversion, obtain the solution for 3D dislocation loops of arbitrary complex geometry. Figure 4 shows a comparison between the numerical simulations (stars) for the critical yield strength of a Cu/Ni superlattice, compared to Freund’s analytical solution (red solid line) and the experimental data of the Los Alamos group (solid triangles). The saturation of the nanolayered system strength (and hardness) with a nanolayer thickness less than 10–50 nm is a result of dislocations overcoming the interface Koehler barrier and loss of dislocation confinement within the soft Cu layer.
4.0 Freund critical stress Experiment (Misra, et al.,1998) Simulation, image force
Critical yield stress (GPa)
3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 1
Figure 4.
10 Cu layer thickness h (nm)
100
Dependence of a Cu/Ni superlattice strength onthe thickness of the Cu layer [41].
2284
7.
N.M. Ghoniem
Future Outlook
As a result of increased computing power, new mathematical formulations, and more advanced computational methodologies, tremendous progress in modeling the evolution of complex 3D dislocation ensembles has been recently realized. The appeal of computational dislocation dynamics lies in the fact that it offers the promise of predicting the dislocation microstructure evolution without ad hoc assumptions, and on sound physical grounds. At this stage of development, many physically-observed features of plasticity and fracture at the nano- and micro-scales have been faithfully reproduced by computer simulations. Moreover, computer simulations of the mechanical properties of thin films are at an advanced stage now that they could be predictive without ambiguous assumptions. Such simulations may become very soon standard and readily available for materials design, even before experiments are performed. On the other hand, modeling the constitutive behavior of polycrystalline metals and alloys with DD computer simulations is still evolving and will require significant additional developments of new methodologies. With continued interest by the scientific community in achieving this goal, future efforts may well lead to new generations of software, capable of materials design for prescribed (within physical constraints) strength and ductility targets.
Acknowledgments Research is supported by the US National Science Foundation (NSF), grant #DMR-0113555, and the Air Force Office of Scientific Research (AFOSR), grant #F49620-03-1-0031 at UCLA.
References [1] H. Mughrabi, “Dislocation wall and cell structures and long-range internal-stresses in deformed metal crystals,” Acta Met., 31, 1367, 1983. [2] H. Mughrabi, “A 2-parameter description of heterogeneous dislocation distributions in deformed metal crystals,” Mat. Sci. & Eng., 85, 15, 1987. [3] R. Amodeo and N.M. Ghoniem, “A review of experimental observations and theoretical models of dislocation cells,” Res. Mech., 23, 137, 1988. [4] J. Lepinoux and L.P. Kubin, “The dynamic organization of dislocation structures: a simulation,” Scripta Met., 21(6), 833, 1987. [5] N.M. Ghoniem and R.J. Amodeo, “Computer simulation of dislocation pattern formation,” Sol. St. Phen., 3&4, 377, 1988. [6] A.N. Guluoglu, D.J. Srolovitz, R. LeSar, and R.S. Lomdahl, “Dislocation distributions in two dimensions,” Scripta Met., 23, 1347, 1989.
Modeling the dynamics of dislocation ensembles
2285
[7] N.M. Ghoniem and R.J. Amodeo, “Numerical simulation of dislocation patterns during plastic deformation,” In: D. Walgreaf and N. Ghoniem (eds.), Patterns, Defects and Material Instabilities, Kluwer Academic Publishers, Dordrecht, p. 303, 1990. [8] R.J. Amodeo and N.M. Ghoniem, “Dislocation dynamics I: a proposed methodology for deformation micromechanics,” Phys. Rev., 41, 6958, 1990a. [9] R.J. Amodeo and N.M. Ghoniem, “Dislocation dynamics II: applications to the formation of persistent slip bands, planar arrays, and dislocation cells,” Phy. Rev., 41, 6968, 1990b. [10] L.P. Kubin, G. Canova, M. Condat, B. Devincre, V. Pontikis, and Y. Brechet, “Dislocation microstructures and plastic flow: a 3D simulation,” Diffusion and Defect Data–Solid State Data, Part B (Solid State Phenomena), 23–24, 455, 1992. [11] A. Moulin, M. Condat, and L.P. Kubin, “Simulation of frank-read sources in silicon,” Acta Mater., 45(6), 2339–2348, 1997. [12] J.P. Hirth, M. Rhee, and H. Zbib, “Modeling of deformation by a 3D simulation of multi pole, curved dislocations,” J. Comp.-Aided Mat. Des., 3, 164, 1996. [13] R.M. Zbib, M. Rhee, and J.P. Hirth, “On plastic deformation and the dynamics of 3D dislocations,” Int. J. Mech. Sci., 40(2–3), 113, 1998. [14] K.V. Schwarz and J. Tersoff, “Interaction of threading and misfit dislocations in a strained epitaxial layer,” Appl. Phys. Lett., 69(9), 1220, 1996. [15] K.W. Schwarz, “Interaction of dislocations on crossed glide planes in a strained epitaxial layer,” Phys. Rev. Lett., 78(25), 4785, 1997. [16] L.M. Brown, “A proof of lothe’s theorem,” Phil. Mag., 15, 363–370, 1967. [17] A.G. Khachaturyan, “The science of alloys for the 21st century: a hume-rothery symposium celebration,” In: E. Turchi and a. G.A. Shull, R.D. (eds.), Proc. Symp. TMS, TMS, 2000. [18] Y.U. Wang, Y.M. Jin, A.M. Cuitino, and A.G. Khachaturyan, “Presented at the international conference, Dislocations 2000, the National Institute of Standards and Technology,” Gaithersburg, p. 107, 2000. [19] Y. Wang, Y. Jin, A.M. Cuitino, and A.G. Khachaturyan, “Nanoscale phase field microelasticity theory of dislocations: model and 3D simulations,” Acta Mat., 49, 1847, 2001. [20] D. Walgraef and C. Aifantis, “On the formation and stability of dislocation patterns. I. one-dimensional considerations,” Int. J. Engg. Sci., 23(12), 1351–1358, 1985. [21] J. Kratochvil and N. Saxlo`va, “Sweeping mechanism of dislocation patternformation,” Scripta Metall. Mater., 26, 113–116, 1992. [22] P. H¨ahner, K. Bay, and M. Zaiser, “Fractal dislocation patterning during plastic deformation,” Phys. Rev. Lett., 81(12), 2470, 1998. [23] M. Zaiser, M. Avlonitis, and E.C. Aifantis, “Stochastic and deterministic aspects of strain localization during cyclic plastic deformation,” Acta Mat., 46(12), 4143, 1998. [24] A. El-Azab, “Statistical mechanics treatment of the evolution of dislocation distributions in single crystals,” Phys. Rev. B, 61, 11956–11966, 2000. [25] N.M. Ghoniem, S.-H. Tong, and L.Z. Sun, “Parametric dislocation dynamics: a thermodynamics-based approach to investigations of mesoscopic plastic deformation,” Phys. Rev., 61(2), 913–927, 2000. [26] R. deWit, “The continuum theory of stationary dislocations,” In: F. Seitz and D. Turnbull (eds.), Sol. State Phys., 10, Academic Press, 1960. [27] N.M. Ghoniem, J. Huang, and Z. Wang, “Affine covariant-contravariant vector forms for the elastic field of parametric dislocations in isotropic crystals,” Phil. Mag. Lett., 82(2), 55–63, 2001.
2286
N.M. Ghoniem
[28] J. Hirth and J. Lothe, Theory of Dislocations, 2nd edn, McGraw–Hill, New York, 1982. [29] M.O. Peach and J.S. Koehler, “The forces exerted on dislocations and the stress fields produced by them,” Phys. Rev., 80, 436, 1950. [30] N.M. Ghoniem and L.Z. Sun, “Fast sum method for the elastic field of 3-D dislocation ensembles,” Phys. Rev. B, 60(1), 128–140, 1999. [31] S. Gavazza and D. Barnett, “The self-force on a planar dislocation loop in an anisotropic linear-elastic medium,” J. Mech. Phys. Solids, 24, 171–185, 1976. [32] R.V. Kukta and L.B. Freund, “Three-dimensional numerical simulation of interacting dislocations in a strained epitaxial surface layer,” In: V. Bulatov, T. Diaz de la Rubia, R. Phillips, E. Kaxiras, and N. Ghoniem (eds.), Multiscale Modelling of Materials, Materials Research Society, Boston, Massachusetts, USA, 1998. [33] N.M. Ghoniem, “Curved parametric segments for the stress field of 3-D dislocation loops,” Transactions of ASME. J. Engrg. Mat. & Tech., 121(2), 136, 1999. [34] X. Han, N.M. Ghoniem, and Z. Wang, “Parametric dislocation dynamics of anisotropic crystalline materials,” Phil. Mag. A., 83(31–34), 3705–3721, 2003. [35] T. Mura, “Continuous distribution of moving dislocations,” Phil. Mag., 8, 843–857, 1963. [36] D. Barnett, “The precise evaluation of derivatives of the anisotropic elastic green’s functions,” Phys. Status Solidi (b), 49, 741–748, 1972. [37] J. Willis, “The interaction of gas bubbles in an anisotropic elastic solid,” J. Mech. Phys. Solids, 23, 129–138, 1975. [38] D. Bacon, D. Barnett, and R. Scattergodd, “Anisotropic continuum theory of lattice defects,” In: C.J.M.T. Chalmers, B (ed.), Progress in Materials Science, vol. 23, Pergamon Press, Great Britain, pp. 51–262, 1980. [39] T. Mura, Micromechanics of Defects in Solids, Martinus Nijhoff, Dordrecht, 1987. [40] D. Barnett, “The singular nature of the self-stress field of a plane dislocation loop in an anisotropic elastic medium,” Phys. Status Solidi (a), 38, 637–646, 1976. [41] X. Han and N.M. Ghoniem, “Stress field and interaction forces of dislocations in anisotropic multilayer thin films,” Phil. Mag., in press, 2005.
7.12 DISLOCATION DYNAMICS – PHASE FIELD Yu U. Wang,1 Yongmei M. Jin,2 and Armen G. Khachaturyan2 1 Department of Materials Science and Engineering, Virginia Tech., Blacksburg, VA 24061, USA 2 Department of Ceramic and Materials Engineering, Rutgers University, 607 Taylor Road, Piscataway, NJ 08854, USA
Dislocation, as an important category of crystal defects, is defined as a one-dimensional line (curvilinear in general) defect. It not only severely distorts the atomic arrangement in a region (called core) around the mathematical line describing its geometrical configuration, but also in a less severe manner (elastically) distorts the lattice beyond its core region. Dislocation core structure is studied by using the methods and models of atomistic scale (see Chapter 2). The long-range strain and stress fields generated by dislocation are well described by linear elasticity theory. In the elasticity theory of dislocations, dislocation is defined as a line around which a line integral of the elastic displacement yields a non-zero vector (Burgers vector). The elastic fields, displacement, strain and stress, of an arbitrarily curved dislocation are known in the form of line integrals. For complex dislocation configurations, the exact elasticity solution is quite difficult. A conventional alternative is to approximate a curved dislocation by a series of straight line segments or spline fitted curved segments. This involves explicit tracking of each segment of the dislocation ensemble (see “Dislocation Dynamics – Tracking Methods” by Ghoniem). In a finite body, the strains and stresses depend on the external surface. For general surface geometries, the elastic fields of dislocations are difficult to determine. In this article we discuss an alternative to the front-tracking methods in modeling dislocation dynamics. This is the structure density phase field method, which is a more general version of the phase field method used to describe solidification process. Instead of explicitly tracking the dislocation lines, the phase field method describes the slipped (plastically deformed by shear) and unslipped regions in a crystal by using field variables (structure density functions or, less accurately but more conventionally called, phase 2287 S. Yip (ed.), Handbook of Materials Modeling, 2287–2305. c 2005 Springer. Printed in the Netherlands.
2288
Y.U. Wang et al.
fields). Dislocations are the boundaries between the regions of different degrees of slipping. One of the advantages of the phase field approach is that it treats the system with arbitrarily complex microstructures as a whole and automatically describes the evolution events producing changes of the microstructure topology (e.g., nucleation, multiplication, annihilation and reaction of dislocations) without explicitly tracking the moving segments. Therefore, it is easy for numerical implementation even in three-dimension (a front-tracking scheme often results in difficult and untidy numerical algorithm). No ad hoc assumptions are required on evolution path. The micromechanics theory proposed by Khachaturyan and Shatalov (KS) [1–3] and recently further developed by Wang, Jin and Khachaturyan (WJK) in a series of works [4–9] is formulated in such a form that it is easily incorporated in the phase field theory. It allows one to determine the elastic interactions at each step of the dislocation dynamics. In the case of elastically homogeneous systems, the exact elasticity solution for an arbitrary dislocation configuration can be formulated as a closed-form functional of the Fourier transforms of the phase fields describing the dislocation microstructure irrespective of its geometrical complexity (the number of the phase fields is equal to the number of operative slip systems that is determined by the crystallography instead of by a concrete dislocation microstructure). This fact makes it easy to achieve high computational efficiency by using Fast Fourier Transform technique, which is also suitable for parallel computing. The Fourier space solution is formulated in terms of arbitrary elastic modulus tensor. This means that the solution for dislocations in single crystal of elastic anisotropy practically does not impose more difficulty. By simply introducing a grain rotation matrix function that describes the geometry and orientation of each grain and the entire multi-grain structure, the phase field method is readily extended to model dislocation dynamics in polycrystal composed of elastically isotropic grains. If the grains are elastically anisotropic, their misorientation makes the polycrystal an elastically inhomogeneous body. The limitation of grain elastic isotropy could be lifted without serious complication of the theory and model by an introduction of additional virtual misfit strain field. This field acting in the equivalent system with the homogeneous modulus produces the same mechanical effect as that produced by elastic modulus heterogeneity. The introduction of the virtual misfit strain greatly simplifies a treatment of elastically inhomogeneous system of arbitrary complexity, in particular, a body with voids, cracks, and free surfaces. The structural density phase field model of multi-crack evolution can be developed in the formalism similar to the phase field model of multidislocation dynamics. This development of the theory has been an extension of the corresponding phase field theories of diffusional and displacive phase transformations (e.g., decomposition, ordering, martensitic transformation, etc.). All these structure density field theories are conceptually similar and
Dislocation dynamics – phase field
2289
are formulated in the similar theoretical and computational framework. The latter facilitates an integration of multi-physics such as dislocations, cracks and phase transformations into one unified structure density field model, where multiple processes are described by simultaneous evolution of various relaxing density fields. Such a unified model would be highly desirable for simulations of complex materials behaviors. The following sections will discuss the basic ingredients of the phase field model of dislocation dynamics. Single crystalline system is considered first, followed by the extension to polycrystal composed of elastically isotropic grains. Finite body with free surfaces is discussed next. The phase field model of cracks, in many respects, is similar to the dislocation model. It is also discussed. The article concludes with a brief outlook on the structural density field models for integration of multiple solid-state physical phenomena and connections between mesoscale phase field modeling and atomistic as well as continuum models.
1.
Dislocation Loop as Thin Platelet Misfitting Inclusion
Consider a simple two-dimensional lattice of circles representing atoms, as shown in Fig. 1(a). Imagine that we cut and remove from the lattice a thin platelet consisting of two monolayers indicated by shaded circles, deform it by gliding the top layer with respect to the bottom layer by one interatomic distance, as shown in Fig. 1(b), then reinsert the deformed thin platelet back into the original lattice, and allow the whole lattice to relax and reach mechanical equilibrium. In doing so, we create an edge dislocation that is located at (a)
(c)
(d)
(b)
Figure 1. Illustration of dislocations as thin platelet misfitting inclusions. (a) A 2D lattice. (b) A thin platelet misfitting inclusion generated by transformation. (c) Bragg–Nye bubble model of an edge dislocation in mechanical equilibrium (after Ref. [10], reproduced with permission). (d) Continuum presentation of the dislocation line ABC ending on the crystal surface at points A and C and a dislocation loop by the thin platelet misfitting inclusions (after Ref. [4], reproduced with permission). b is the Burgers vector, d is the thickness of the inclusion equal to the interplanar distance of the slip plane, and n is the unit vector normal to the inclusion habit plane coinciding with the slip plane.
2290
Y.U. Wang et al.
the edge of the thin platelet. The equilibrium state of such a lattice is demonstrated in Fig. 1(c), which shows the Bragg–Nye bubble model of an edge dislocation [10]. In the continuum elasticity theory of dislocations, dislocation loop can be created in the same way by transforming thin platelet in the matrix of untransformed solid. Consider an arbitrary-shaped plate-like misfitting inclusion, whose habit plane (interface between inclusion and matrix) coincides with slip plane, as shown in Fig. 1(d). The misfit strain (also called stress-free transformation strain or eigenstrain describing the homogeneous deformation of the transformed stress-free state) of the platelet is a dyadic prod inclusion under = b n + b n 2d, where b is a Burgers vector, n is the normal and d uct, εidis i j j i j is the platelet thickness equal to the interplanar distance of the slip plane. Such a misfitting thin platelet generates stress that is exactly the same as generated by a dislocation loop of Burgers vector b encircling the platelet [2]. This fact, as will be shown in next two sections, greatly facilitates the description of dislocation microstructure and the solution of the elasticity problem, which is the basis of the WJK phase field microelasticity (PFM) theory of dislocations [4]. This theory was extended by Shen and Wang [11] and WJK [7, 9, 12]. In fact, the dislocation-associated misfit strain εidis j characterizes the plastic strain of the transformed (plastically deformed) platelet inclusion.
2.
Structure Density Field Description of Dislocation Ensemble
As discussed above, by treating dislocation loops as thin platelet misfitting inclusions, instead of describing dislocations by lines, we describe the transformed regions in the untransformed matrix. The transformed regions are the regions that have been plastically deformed by slipping. Dislocations correspond to the boundaries separating the regions of different degrees of slipping. In this description, we track a spatial and temporal evolution of the dislocationassociated misfit strain (plastic strain), which is the structure density field. This field describes the evolution of individual dislocations in an arbitrary ensemble. For an arbitrary dislocation ensemble involving all operative slip systems, the total dislocation-associated misfit strain εidis j (r) is the sum over all slip planes numbered by α: εidis j (r) =
1 α
2
bi (α, r) H j (α) + b j (α, r) Hi (α) ,
(1)
where b(α, r) is the slip displacement vector, H(α) = n(α)/d(α) is the reciprocal lattice vector of the slip plane α, n(α) and d(α) are the normal and interplanar distance, respectively, of the slip plane α. Therefore, a set of
Dislocation dynamics – phase field
2291
vector fields, {b(α, r)}, completely characterizes the dislocation configuration. Slipped (plastically deformed) regions are the ones where b(α, r) =/ 0. The vector b(α, r) can be expressed as a sum of the slip displacement vectors numbered by m α corresponding to the operative slip modes within the same slip plane α: b (α, r) =
b (α, m α , r).
(2)
mα
It is convenient to present each field b(α, m α , r) in terms of an order parameter η (α, m α , r) through the following relation b (α, m α , r) = b (α, m α ) η (α, m α , r),
(3)
where η (α, m α , r) is a scalar field, and b (α, m α ) is the corresponding elementary Burgers vector of the slip mode m α in the slip plane α. Thus, an arbitrary dislocation configuration involving all possible slip systems is completely characterized by a set of order parameter fields (phase fields), {η(α, m α , r)}. The number of the fields is equal to the number of the operative slip systems that is determined by the crystallography rather than a concrete dislocation configuration. For example, face-centered cubic (fcc) crystal has four {111} slip planes (α=1, 2, 3, 4) and three 110 slip modes in each slip plane (m α =1, 2, 3), thus has 12 slip systems. A total number of 12 phase fields are used to characterize an arbitrary dislocation ensemble in a fcc crystal if all possible slip systems are involved. An in-depth discussion on the choice of Phase Fields (dislocation density fields) is presented in Ref. [12]. It is noteworthy that the structural density phase field (order parameter) here has the physical meaning of structure (dislocation) density, which is more general than the order parameter used in the phase field model of solidification that assumes 1 in solid and 0 in liquid.
3.
Phase Field Microelasticity Theory
As discussed in the preceding section, the micromechanics of an arbitrary dislocation ensemble involving all operative slip systems is characterized by the dislocation-associated misfit strain εidis j (r) defined in Eq. (1). Substituting Eqs. (2) and (3) into Eq. (1) expresses εidis j (r) as a linear function of a set of phase fields, {η (α, m α , r)}: εidis j (r) =
1 α
mα
2
bi (α, m α ) H j (α) + b j (α, m α ) Hi (α) η (α, m α , r). (4)
2292
Y.U. Wang et al.
The elastic (strain) energy generated by such a dislocation ensemble is
E
elast
= V
1 Ci j kl εi j (r) − εidis εkl (r) − εkldis (r) d 3r, j (r) 2
(5)
where Ci j kl is elastic modulus tensor, V is body volume, and εi j (r) is the equilibrium strain that minimizes the elastic energy (5) under the compatibility (continuity) condition. The exact elastic energy E elast can be expressed as closed-form functional of εidis j (r). This is obtained by using the KS theory developed for arbitrary multi-phase and multi-domain misfitting inclusions in the homogeneous anisotropic elastic modulus case. The total elastic energy for an arbitrary multidislocation ensemble described by a set of phase fields {η(α, m α , r)} in an appl elastically homogeneous anisotropic body under applied stress σi j is E elast =
1 d 3k α,m α β,m β
−
α,m α
−
2
−
(2π )
3
K α, m α , β, m β , e
∗
×η˜ (α, m α , k) η˜ β, m β , k appl σi j
bi (α, m α ) H j (α)
η (α, m α , r) d 3r
V
V −1 appl appl C σ σkl , 2 i j kl i j
(6)
where η˜ (α, m α , k) = V η (α, m α , r) e−ik·r d 3r is the Fourier transform of η(α, m α , r), the superscript asterisk (*) indicates complex conjugate, e = k/k is
a unit directional vector in the reciprocal (Fourier) space, and the integral as a principal value excluding the point – in the reciprocal space is evaluated k = 0. The scalar function K α, m α , β, m β , e is defined as
K α, m α , β, m β , e = Ci j kl −em Ci j mn np (e) Cklpq eq × bi (α, m α ) H j (α) bk β, m β Hl (β),
(7)
where i j (e) is the Green function tensor inverse to the tensor −1 i j (e) = Cikj l ek el . The elastic energy (6) is a closed-form functional of η(α, m α , r) and their Fourier transform η˜ (α, m α , k) irrespective of dislocation geometrical complexity. This fact makes it easy to achieve high computational efficiency in solving elasticity problem of dislocations. In computer simulations, elasticity solution is obtained numerically. The fields η˜ (α, m α , k) are evaluated by using fast Fourier transform technique, which is also suitable for parallel computing. Since the functional (6) is formulated for arbitrary elastic modulus tensor Ci j kl , a consideration of elastic anisotropy does not impose more difficulty. In fact, in simulations the function K(α, m α , β, m β , e) defined in Eq. (7)
Dislocation dynamics – phase field
2293
needs to be evaluated only once and stored in computer memory. Therefore, elastic anisotropy practically does not affect computational efficiency. The elastic energy E elast consists of dislocation self-energy and interaction appl energy as well as the energy generated by the applied stress σi j and the (potential) energy associated with the external loading device. The elastic energy is calculated by using the linear elasticity theory. Equation (6) provides the exact solution for the long-range elastic interactions between individual dislocations in an arbitrary configuration, which is the same as described by the Peach–Koehler equation.
4.
Crystalline Energy and Gradient Energy
In the phase field model, individual dislocations of an arbitrary configuration are completely described by a set of phase fields, {η(α, m α , r)}. For perfect dislocations, each slip displacement vector b(α, m α , r) should relax to a discrete set of values that are multiples of the corresponding elementary Burgers vector b(α, m α ). Thus according to Eq. (3), the order parameter η(α, m α , r) should relax to integer values. The elementary Burgers vectors b(α, m α ) correspond to the shortest crystal lattice translations in the slip planes. For partial dislocations, b(α, m α , r) do not correspond to crystal lattice translations, and η(α, m α , r) may assume non-integer values. The integers η(α, m α , r) are equal to the number of perfect dislocations with Burgers vector b(α, m α ) sweeping through the point r. The sign of the integer determines the slip direction with respect to b(α, m α ). The above-discussed behavior of η(α, m α , r) is automatically achieved by a choice of the Landau-type coarse-grained “chemical” free energy functional of a set of phase fields {η(α, m α , r)}. In the case of dislocations, this free energy is the crystalline energy that reflects the periodic properties of the host crystal lattice:
E
cryst
=
f cryst ({η (α, m α , r)})d 3r,
(8)
V
which should be minimized at {η(α, m α )} equal to integers. The integrand f cryst ({η(α, m α )}) is a periodical function of all parameters {η(α, m α )} with periods equal to any integers. This property follows from the fact that the Burgers vectors b(α, m α , r) in Eq. (3) corresponding to the integers η(α, m α , r) are lattice translation vectors that do not change the crystal lattice. The crystalline energy characterizes an interplanar potential during a homogeneous gliding of one atomic plane above another atomic plane by a slip displacement vector b(α). In the case of one slip mode, say (α1 , m α1 ), the
2294
Y.U. Wang et al.
local specific crystalline energy function f cryst ({η(α, m α )}) can be reduced to the simplest form by keeping the first non-vanishing term of its Fourier series: f cryst [b(α1 , m α1 )η (α1 , m α1 ) , 0, . . . , 0] = A sin2 π η (α1 , m α1 ),
(9)
where A is a positive constant providing the shear modulus at small strain limit. Its general behavior is schematically illustrated in Fig. 2(a). Any deviation of the slip displacement vector b from the lattice translation vectors is penalized by the crystalline energy. In the case where all slip modes are operative, the general expression of the multi-periodical function f cryst({η(α, m α )}) can also be presented as a Fourier series summed over the reciprocal lattice vectors of the host lattice, which reflects the symmetry of the crystal lattice (see, for detailed discussion, Refs. [4, 9, 11, 12]). The energy E cryst characterizes an interplanar potential of a homogeneous slipping. If the interplanar slipping is inhomogeneous, correction should be made to the crystalline energy (8). This is done by gradient energy E grad that characterizes the energy contribution associated with the inhomogeneity of the slip displacement. For one dislocation loop characterized by the phase field η(α1 , m α1 , r), as shown in Fig. 3(a) where η = 1 inside the disc domain describing the slipped region and 0 outside, E grad is formulated as
E
grad
= V
1 β [n (α1 ) × ∇η (α1 , m α1 , r)]2 d 3r, 2
(10)
where β is a positive coefficient, and ∇ is the gradient operator. As shown in Fig. 3(a), the term n (α) × ∇η (α, m α , r) defines the dislocation sense at point r. The gradient energy (10) is proportional to the dislocation loop perimeter and vanishes over the slip plane. For an arbitrary dislocation configuration characterized by a set of phase fields {η(α, m α )}, the general form of the gradient energy is
E
grad
=
ϕi j (r) d 3r,
(11)
V
(a)
(b) f(b)
f(h)
2γ/d b0
2b0
b
h hc
Figure 2. Schematic illustration of the general behavior of Landau-type coarse-grained “chemical” energy function for (a) dislocation (crystalline energy) and (b) crack (cohesion energy).
Dislocation dynamics – phase field (a)
2295 (b)
Figure 3. (a) A thin platelet domain describing the slipped region. The term n×∇η (r) defines the dislocation sense along the dislocation line (plate edge) and vanishes over the slip plane (plate surface). (b) Schematic of a polycrystal model. Each grain has a different orientation described by its rotation matrix Qi . The rotation matrix function Q (r) completely describes the geometry and orientation of each grain and the entire multi-grain structure.
where the argument of the integrand, ϕi j (r), is defined as ϕi j (r) =
α
[H(α) × ∇η(α, m α , r)]i b j (α, m α ).
(12)
mα
The choice of the tensor ϕi j (r) is dictated by the physical requirements that (i) the gradient energy is proportional to the dislocation length and vanishes over the slip planes and (ii) the gradient energy depends on the total Burgers vector of the dislocation. Following the Landau theory, we can approximate the function ϕi j (r) by the Taylor expansion, which reflects the symmetry of the crystal lattice. As discussed in the preceding section, the elastic energy of dislocations is calculated by using the linear elasticity theory. The nonlinear effects associated with dislocation cores are described in the phase field model by both the crystalline energy E cryst and the gradient energy E grad , which produce significant contributions only near dislocation cores. More detailed discussion on the crystalline and gradient energies is presented in Refs. [4, 9, 11, 12].
5. Time-dependent Ginzburg–Landau Kinetic Equation
The total energy of a dislocation system is the sum of the elastic energy (6), crystalline energy (8) and gradient energy (11):

$$ E = E^{\mathrm{elast}} + E^{\mathrm{cryst}} + E^{\mathrm{grad}}, \qquad (13) $$
which is a functional of a set of phase fields {η(α, m_α, r)}. The temporal-spatial dependence of η(α, m_α, r, t) describes the collective motion of the dislocation ensemble. The evolution of η(α, m_α, r, t) is characterized by a
phenomenological kinetic equation, the time-dependent Ginzburg–Landau equation:

$$ \frac{\partial \eta(\alpha, m_\alpha, \mathbf{r}, t)}{\partial t} = -L\, \frac{\delta E}{\delta \eta(\alpha, m_\alpha, \mathbf{r}, t)} + \xi(\alpha, m_\alpha, \mathbf{r}, t), \qquad (14) $$

where L is the kinetic coefficient characterizing dislocation mobility, E is the total system energy (13), and ξ(α, m_α, r, t) is the Langevin Gaussian noise term reproducing the effect of thermal fluctuations (an in-depth discussion of the invariant form of the time-dependent Ginzburg–Landau kinetic equation is presented in Ref. [12]). A numerical solution η(α, m_α, r, t) of the kinetic Eq. (14) automatically takes into account dislocation multiplication, annihilation, interaction and reaction without ad hoc assumptions. Figure 4 shows one example of a PFM simulation of self-multiplying and self-organizing dislocations during plastic deformation of a single crystal ([4]; more simulations are presented therein, and also in Ref. [13] on dislocations in polycrystals, Ref. [11] on network formation, Ref. [14] on solute–dislocation interaction, and Ref. [15] on alloy hardening). The kinetic Eq. (14) is based on the assumption that the relaxation rate of a field is proportional to the thermodynamic driving force. Note that Eq. (14) assumes a linear dependence between the dislocation glide velocity v and the local resolved shear stress τ along the Burgers vector, i.e., v = mτb, where m is a constant. In fact, ∂η/∂t = −L δE^elast/δη − L δ(E^cryst + E^grad)/δη, where the first term on the right-hand side gives the linear dependence (L/d) σ_ij n_j b_i, with σ_ij being the local stress. The second term provides the effect of lattice friction on dislocation motion. It is worth noting that the WJK theory is an interpolating theory providing a bridge between high and low spatial resolutions.
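As an illustration of how Eq. (14) can be advanced in time, the following Python/NumPy sketch applies a forward Euler step to a discretized phase field. The functional derivative delta_E is a hypothetical caller-supplied helper (in practice it follows from Eqs. (6), (8) and (11)), and the noise discretization is a standard Langevin approximation rather than the authors' scheme.

    import numpy as np

    def tdgl_step(eta, delta_E, L=1.0, dt=1e-3, kT=0.0, rng=None):
        # One forward Euler step of Eq. (14) for a discretized phase field eta.
        # delta_E(eta) must return the functional derivative dE/d(eta) on the grid.
        rng = rng or np.random.default_rng()
        drift = -L * delta_E(eta)                       # relaxational driving force
        if kT > 0.0:
            # Langevin noise: increment standard deviation sqrt(2 L kT dt) per point
            noise = rng.normal(0.0, np.sqrt(2.0 * L * kT / dt), eta.shape)
        else:
            noise = 0.0
        return eta + dt * (drift + noise)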
Figure 4. PFM simulation of stress–strain curve and the corresponding 3D dislocation microstructures during plastic deformation of fcc single crystal under uniaxial loading (after Ref. [4], reproduced with permission).
In the high-resolution limit, it is a 3D generalization of the Peierls–Nabarro (PN) theory [16] to arbitrary dislocation configurations: the WJK theory reproduces the results of the PN theory in the particular case considered by that theory, i.e., a 2D model of a single straight dislocation. The gradient energy (11) is one ingredient that the PN theory lacks. As discussed in the preceding section, the gradient term is necessary as an energy correction associated with slip inhomogeneity and, together with the crystalline energy, describes the core radius and the nonlinear core energy. Like the PN theory, the WJK theory is applicable at atomic resolution as well. However, to make the PN and WJK theories fully consistent with atomic-resolution modeling, the atomistic Green function of crystal lattice statics should be used instead of the continuum Green function [2]. To obtain atomic resolution in computer simulations, the computational grid sites should be the crystal lattice sites. Another option is to use a subatomic-scale phase field model where the density function models individual atoms [17]. In the low-resolution limit, the WJK theory gives a natural transition to the continuum dislocation theory, where the local dislocation density ε^dis_ij(r), which is related to the dislocation density fields η(α, m_α, r) by Eq. (4), is smeared over volume elements corresponding to a computational grid cell whose size l is much larger than the crystal lattice parameter. Then the reciprocal lattice vectors should be defined as H(α) = n(α)/l. In such situations, an individual dislocation's position is uncertain within one grid cell. The dislocation core width, which is of the order of the crystal lattice parameter, is too small to be resolved by low-resolution computational grids. To effectively eliminate the inaccuracy associated with the Burgers vector relaxation (the core effect) in the dislocation interaction energies at distances exceeding a computational grid length, a non-linear relation between the slip displacement vector b(α, m_α, r) and the order parameter η(α, m_α, r), rather than the linear relation (3), should be used in the low-resolution cases. One simple example of such a non-linear relation is [14]:
$$ b(\alpha, m_\alpha, \mathbf{r}) = b(\alpha, m_\alpha) \left[ \eta(\alpha, m_\alpha, \mathbf{r}) - \frac{1}{2\pi} \sin 2\pi \eta(\alpha, m_\alpha, \mathbf{r}) \right], \qquad (15) $$
which shrinks the effective radius of the dislocation core to improve the accuracy in the mesoscale diffuse-interface modeling. If the resolution of the simulation is microscopic, the use of the non-linear relation becomes unnecessary and the linear dependence (3) of the Burgers vector on the order parameter should be used.
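A minimal sketch of the two alternative b–η relations, the linear dependence (3) and the non-linear form (15), might read as follows; the function name and array layout are illustrative assumptions.

    import numpy as np

    def slip_displacement(eta, b_vec, nonlinear=True):
        # Slip displacement vector b(alpha, m_alpha, r) from the order parameter.
        # nonlinear=True: low-resolution relation (15); False: linear relation (3).
        shape_fn = eta - np.sin(2.0 * np.pi * eta) / (2.0 * np.pi) if nonlinear else eta
        return shape_fn[..., None] * np.asarray(b_vec)  # broadcast onto the (3,) vector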
6. Dislocation Dynamics in Polycrystals
Equation (4) completely characterizes the dislocation configuration in a single crystal, where the elementary Burgers vectors b(α, m α ) and reciprocal
lattice vectors H(α) are defined in the coordinate system related to the crystallographic axes of the crystal. However, it should be modified to characterize a dislocation microstructure in a polycrystal: in a single global coordinate system, the components of the vectors b(α, m_α) and H(α) have different values in different grains because of the mutual rotations of the crystallographic axes of the grains. In this case, we have to describe the orientation of each grain in the polycrystal. To do this, we introduce a static rotation matrix function Q_ij(r) that is constant within each grain but assumes different constant values in different grains [13]. In fact, Q_ij(r) describes the geometry and orientation of each grain and the entire multi-grain structure, as shown in Fig. 3(b). Then the misfit strain ε^dis_ij(r) of a dislocation microstructure in a polycrystal is given by

$$ \varepsilon^{\mathrm{dis}}_{ij}(\mathbf{r}) = \sum_{\alpha} \sum_{m_\alpha} \frac{1}{2}\, Q_{ik}(\mathbf{r})\, Q_{jl}(\mathbf{r}) \left[ b_k(\alpha, m_\alpha)\, H_l(\alpha) + b_l(\alpha, m_\alpha)\, H_k(\alpha) \right] \eta(\alpha, m_\alpha, \mathbf{r}). \qquad (16) $$
For a single crystal, Q_ij(r) = δ_ij and Eq. (16) reduces to Eq. (4). Therefore, a dislocation microstructure consisting of all possible slip systems in both single crystals and polycrystals can be completely described by a set of phase fields {η(α, m_α, r)}. The elastic energy E^elast is still determined by Eq. (6) if the polycrystal is composed of elastically isotropic grains, since the KS theory is applicable to an elastically homogeneous body. If, however, the grains are elastically anisotropic, their mutual rotations make the polycrystal an elastically inhomogeneous body. The limitation to elastically isotropic grains could be lifted without serious complication of the theory and computational model by using the PFM theory of elastically inhomogeneous solids [6]. A special case of this theory, viz., a discontinuous body with voids, cracks and free surfaces, will be discussed in the following sections. With the simple modification (16), the above-discussed theory is applicable to dislocation dynamics in polycrystals composed of elastically isotropic grains. Simulation examples are presented in Ref. [13].
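The misfit strain (16) is straightforward to assemble pointwise. The sketch below is an illustrative NumPy implementation (not the authors' code); for Q equal to the identity it reduces to the single-crystal expression (4).

    import numpy as np

    def dislocation_misfit_strain(eta, b, H, Q):
        # Pointwise misfit strain of Eq. (16).
        #   eta[a][m] : order parameter for slip plane a, slip mode m
        #   b[a][m]   : (3,) elementary Burgers vector
        #   H[a]      : (3,) reciprocal lattice vector n(a)/d(a)
        #   Q         : (3,3) grain rotation matrix (identity -> Eq. (4))
        eps = np.zeros((3, 3))
        for a in range(len(H)):
            for m in range(len(b[a])):
                dyad = np.outer(b[a][m], H[a])
                eps += 0.5 * (dyad + dyad.T) * eta[a][m]
        return Q @ eps @ Q.T  # rotate from crystal to global axes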
7. Free Surfaces and Heteroepitaxial Thin Films
The free surface is one common type of defect shared by all real materials. The stress field is significantly modified near free surfaces (the so-called image force effect), which produces important effects on dislocation dynamics. It is generally a difficult task to calculate the image force corrections to the stress field and elastic energy for an arbitrary dislocation configuration in the vicinity of arbitrarily shaped free surfaces. To address this problem, the WJK theory has been extended to deal with finite systems with arbitrarily shaped free surfaces,
based on the theory of a stressed discontinuous body with arbitrarily shaped voids and free surfaces [5]. The latter provides an effective method to solve the elasticity problem without sacrificing accuracy. In this section, we discuss the application of phase field dislocation dynamics to a system with free surfaces. We first discuss a recently established variational principle that makes this extension possible. A body containing voids is no longer continuous. The elasticity problem for this discontinuous body under applied stress can be solved by using the following variational principle [5]: if a virtual misfit strain ε^virtual_ij(r), located within the domains of an equivalent continuous body, minimizes its elastic energy, then the generated strain and elastic energy of this equivalent continuous body are the equilibrium strain and elastic energy of the original discontinuous body with voids. This variational principle is equally applicable to the cases of voids within a solid and of a finite body with arbitrarily shaped free surfaces. The latter can be considered as a body fully "immersed in a void", where the vacuum around the body can be regarded as the domain defined in the variational principle. The position, shape and size of the domains with ε^virtual_ij(r) coincide with those of the voids and surrounding vacuum. Together with the externally applied stress, the strain energy minimizer ε^virtual_ij(r) generates a stress that vanishes within the domains. The latter allows one to remove the domains without disturbing the strain field and thus return to the initial externally loaded discontinuous body. This variational principle enables one to reduce the elasticity problem of a stressed discontinuous elastically anisotropic body to a much simpler equivalent problem of a continuous body. The above-discussed variational principle leads to a method for determining the virtual misfit strain ε^virtual_ij(r) through a numerical minimization of the strain energy functional E^elast_equiv for the equivalent continuous body with ε^virtual_ij(r) under external stress. The explicit form of this functional of ε^virtual_ij(r) is given by the KS theory. We may employ a Ginzburg–Landau type equation for energy minimization, similar to Eq. (14):

$$ \frac{\partial \varepsilon^{\mathrm{virtual}}_{ij}(\mathbf{r}_d, t)}{\partial t} = -K_{ijkl}\, \frac{\delta E^{\mathrm{elast}}_{\mathrm{equiv}}}{\delta \varepsilon^{\mathrm{virtual}}_{kl}(\mathbf{r}_d, t)}, \qquad (17) $$

where K_ijkl is a "kinetic" coefficient, t is "time", and r_d represents the points inside the void domains. The "kinetic" Eq. (17) leads to a steady-state solution ε^virtual_ij(r) that is the energy minimizer and generates vanishing stress in the void domains. Equation (17) provides a general approach to determining the 3D elastic field, displacement and elastic energy of an arbitrary finite multi-void system in an elastically anisotropic body under applied stress. In particular, it can be used to calculate the elasticity solution for a body with mixed-mode cracks of arbitrary configuration, which enables us to develop a phase field model of cracks, as discussed in the next section.
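A schematic of the "kinetic" minimization (17), assuming for simplicity a scalar kinetic coefficient K_ijkl = K δ_ik δ_jl, is sketched below; the variational derivative dE_deps of the equivalent-body KS energy functional is a hypothetical helper supplied by the caller.

    import numpy as np

    def relax_virtual_strain(eps_virtual, dE_deps, mask, K=1.0, dt=1e-2, steps=500):
        # Steady-state iteration of the "kinetic" Eq. (17).
        #   eps_virtual : (..., 3, 3) virtual misfit strain field
        #   dE_deps     : returns the variational derivative of the equivalent-body
        #                 KS elastic energy with respect to eps_virtual
        #   mask        : boolean field, True inside the void/vacuum domains r_d
        for _ in range(steps):
            grad = dE_deps(eps_virtual)
            eps_virtual[mask] -= dt * K * grad[mask]  # isotropic K_ijkl assumed
        return eps_virtual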
The system with free surfaces is also structurally inhomogeneous if defects generate a crystal lattice misfit. In the case of dislocations in a heteroepitaxial film, the structural inhomogeneity is characterized by the dislocation-associated misfit strain ε^dis_ij(r) as well as by the epitaxial misfit strain ε^epitax_ij(r) associated with the crystal lattice misfit between film and substrate. The effective misfit strain ε^effect_ij(r) of the equivalent system is the sum

$$ \varepsilon^{\mathrm{effect}}_{ij}(\mathbf{r}) = \varepsilon^{\mathrm{epitax}}_{ij}(\mathbf{r}) + \varepsilon^{\mathrm{dis}}_{ij}(\mathbf{r}) + \varepsilon^{\mathrm{virtual}}_{ij}(\mathbf{r}). \qquad (18) $$

The elastic energy E^elast_equiv of the equivalent system is expressed in terms of ε^effect_ij(r). For a given dislocation microstructure characterized by ε^dis_ij(r), the virtual misfit strain ε^virtual_ij(r) can be determined using Eq. (17), which has to be solved only at points r_d inside the domains corresponding to the vacuum around the body. As discussed above, ε^virtual_ij(r) generates vanishing stress in the vacuum domains. Since the whole equivalent system (the regions corresponding to vacuum and to film/substrate) is in elastic equilibrium, the vanishing stress in the vacuum region automatically satisfies the free surface boundary condition. The total energy of a dislocation ensemble near free surfaces is also given by Eq. (13), where the elastic energy is given by E^elast_equiv. Since the role of the virtual misfit strain ε^virtual_ij(r) is just to satisfy the free surface boundary condition, it does not enter the crystalline energy (8) or the gradient energy (11). As discussed above, the dislocation-associated misfit strain ε^dis_ij(r) is a function of the set of phase fields {η(α, m_α, r)} given by Eq. (4). Since the epitaxial misfit strain ε^epitax_ij(r) is a static field describing the heteroepitaxial structure, the total energy is a functional of two sets of evolving fields, i.e., E[{η(α, m_α, r)}, ε^virtual_ij(r)]. Following Wang et al. [5], the evolution of dislocations in a heteroepitaxial film is characterized by the simultaneous solution of Eqs. (14) and (17), driven by epitaxial stress relaxation under the influence of image forces near free surfaces. Figure 5 shows one example of a PFM simulation of misfit dislocation formation through threading dislocation motion in an epitaxial film [5].
Figure 5. PFM simulation of motion of a threading dislocation and formation of misfit dislocation at film/substrate interface during stress relaxation in heteroepitaxial film. The numbers indicate the time sequence of dislocation configurations (after Ref. [5], reproduced with permission).
8. Phase Field Model of Cracks
According to the variational principle discussed in the preceding section, the effect of voids can be fully reproduced by an appropriately chosen virtual misfit strain ε^virtual_ij(r) defined inside the domains corresponding to the voids. In particular, the domains corresponding to cracks are thin platelets of interplanar thickness. To model moving cracks, which can spontaneously nucleate, propagate and coalesce, the virtual misfit strain ε^virtual_ij(r) is no longer constrained to fixed domains but is allowed to evolve, driven by a reduction of the total system free energy. In this formalism, ε^virtual_ij(r) describes the evolving cracks: the regions where ε^virtual_ij(r) ≠ 0 are the laminar domains describing cracks. The crack-associated virtual misfit strain is also a dyadic product, ε^crack_ij = (h_i n_j + h_j n_i)/2d, where n is the normal and d the interplanar distance of the cleavage plane, and h(r) is the crack opening vector. As in the phase field model of dislocations, individual cracks of arbitrary configuration are completely described by a set of fields {h(α, r)}, where α numbers the operative cleavage planes [15]. The total number of fields is determined by the crystallography rather than by a concrete crack configuration. For an arbitrary crack configuration in a polycrystal involving all operative cleavage planes, the total virtual misfit strain is expressed as a function of the fields h(α, r):

$$ \varepsilon^{\mathrm{crack}}_{ij}(\mathbf{r}) = \sum_{\alpha} \frac{1}{2}\, Q_{ik}(\mathbf{r})\, Q_{jl}(\mathbf{r}) \left[ h_k(\alpha, \mathbf{r})\, H_l(\alpha) + h_l(\alpha, \mathbf{r})\, H_k(\alpha) \right], \qquad (19) $$
where H(α) = n(α)/d(α) is the reciprocal lattice vector of the cleavage plane α, and Q_ij(r) is the grain rotation matrix field function that describes the polycrystalline structure. Under stress, the opposite surfaces of cracks undergo opening displacements h(α, r). For a given crack configuration, the h(α, r) are a priori unknown and vary under varying stress. The crack-associated virtual misfit strain ε^crack_ij(r) defined in Eq. (19), and thus the fields h(α, r), can be obtained through a numerical minimization procedure similar to that in Eq. (17), where the elastic energy E^elast_equiv of such a crack system is also given by the KS elastic energy functional in terms of ε^crack_ij(r).
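For concreteness, the crack strain (19) can be assembled pointwise in the same way as the dislocation misfit strain (16); the following sketch is illustrative.

    import numpy as np

    def crack_misfit_strain(h, H, Q):
        # Pointwise crack virtual misfit strain of Eq. (19).
        #   h[a] : (3,) crack-opening vector on cleavage plane a
        #   H[a] : (3,) reciprocal lattice vector n(a)/d(a)
        #   Q    : (3,3) grain rotation matrix at this point
        eps = np.zeros((3, 3))
        for a in range(len(H)):
            dyad = np.outer(h[a], H[a])
            eps += 0.5 * (dyad + dyad.T)
        return Q @ eps @ Q.T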
The non-linear effect of cohesive forces resisting crack opening is described by the Landau-type coarse-grained "chemical" energy, which in the case of cracks is the cohesion energy,

$$ E^{\mathrm{cohes}} = \int_V f^{\mathrm{cohes}}[\{h(\alpha, \mathbf{r})\}]\, d^3 r, \qquad (20) $$
whose integrand is a function of the set of fields {h(α, r)}. The specific cohesion energy f^cohes(h) characterizes the energy required to separate two pieces of crystal, cut along the cleavage plane, by the distance h. From a microscopic point of view, f^cohes(h) is the atomistic energy required for the continuous breaking of atomic bonds across the cleavage plane, thus creating two free surfaces during the process of crack formation. A specific approximation of this function, similar to the one first proposed by Orowan, is formulated by Wang et al. [5]. The general behavior of the specific cohesion energy is schematically illustrated in Fig. 2(b); it introduces a crack tip cohesive force acting in small crack tip zones. The cohesion energy E^cohes defined in Eq. (20) describes a homogeneous separation in which both boundaries of the crack opening are kept flat and parallel to the cleavage plane. The energy correction associated with the effect of crack surface curvature is taken into account by the gradient energy
$$ E^{\mathrm{grad}} = \int_V \phi_{ij}(\mathbf{r})\, d^3 r, \qquad (21) $$
where the argument of the integrand, φ_ij(r), is defined as

$$ \phi_{ij}(\mathbf{r}) = \sum_{\alpha} \left[\, \mathbf{H}(\alpha) \times \nabla \,\right]_i h_j(\alpha, \mathbf{r}), \qquad (22) $$
which is similar to the tensor φ_ij(r) defined in Eq. (12) for dislocations. The choice of the tensor φ_ij(r) is dictated by similar physical requirements: the gradient energy is significant only near the crack tip, where the surface curvature is large, is proportional to the crack front length, and vanishes at the flat surfaces of a homogeneous opening. Following the Landau theory approach, we can also approximate the function φ_ij(r) by a Taylor expansion that reflects the symmetry of the crystal lattice (see Refs. [5, 9, 12] for a detailed discussion). The total free energy of the crack system characterized by the fields h(α, r) is the sum of the elastic energy (in terms of ε^crack_ij(r)), cohesion energy (20) and gradient energy (21):

$$ E = E^{\mathrm{elast}} + E^{\mathrm{cohes}} + E^{\mathrm{grad}}, \qquad (23) $$
which is a functional of a set of fields, {h(α, r)}. The temporal-spatial dependences of h(α, r, t) describe the collective motion of the crack ensemble.
Figure 6. PFM simulation of crack propagation during cleavage fracture in a 2D polycrystal composed of elastically isotropic grains (after Ref. [5], reproduced with permission). Different grain orientations are shown in gray scales.
The evolution of h(α, r, t) is obtained as a solution of the time-dependent Ginzburg–Landau kinetic equation:

$$ \frac{\partial h_i(\alpha, \mathbf{r}, t)}{\partial t} = -L_{ij}\, \frac{\delta E}{\delta h_j(\alpha, \mathbf{r}, t)} + \xi_i(\alpha, \mathbf{r}, t), \qquad (24) $$
where L_ij is the kinetic coefficient characterizing crack propagation mobility, E is the system free energy (23), and ξ_i(α, r, t) is the Gaussian noise term reproducing the effect of thermal fluctuations. As shown by Wang et al. [5], a numerical solution h(α, r, t) of the kinetic Eq. (24) automatically takes into account crack evolution without ad hoc assumptions on the possible path. Figure 6 shows one example of a PFM simulation of a self-propagating crack during cleavage fracture in a polycrystal [5].
9. Multi-physics and Multi-scales
This article discusses recent developments of the phase field theory and models of structurally inhomogeneous systems and their applications to the modeling of multi-dislocation dynamics and multi-crack evolution. The phase field approach can be used to simulate diffusional and displacive phase transformations (see "Phase Field Method – General Description and Computational Issues" by Karma and Chen, "Coherent Precipitation – Phase Field" by Wang, "Ferroic Domain Structures/Martensite" by Saxena and Chen, and the references therein), dislocation dynamics during plastic deformation and crack development during fracture, as well as dislocation dynamics and morphology evolution [7, 8] of heteroepitaxial thin films driven by the relaxation of epitaxial stress. These computational models are formulated in the same PFM formalism of structure density dynamics. The difference between them lies only in the analytical form of the Landau-type coarse-grained energy reflecting the physical nature and invariance properties of the structural heterogeneities. This common analytical framework makes it easy to integrate the models of
physically different processes into one unified structure density dynamics model. The cost of this would be just an increase in the number of evolving fields. The use of such a unified model allows one to address problems of arbitrary multi-mode microstructure evolution in complex materials systems. In particular, it enables one to investigate structure–property relationships of structurally inhomogeneous materials in situations where structural heterogeneities of different kinds, which determine the mechanical properties of these materials, evolve simultaneously. The PFM theories and models presented in this article show that, while challenges remain, significant advances have been achieved in integrating multiple physical phenomena for the simulation of complex materials behavior. A second issue of equal importance is bridging multiple length and time scales in materials modeling and simulation. Since the PFM approach is based on continuum theory, PFM simulations are performed at the mesoscale, from a few nanometers to hundreds of micrometers. The PFM theory can also be applied at atomic scales, in which case the role of the structure density fields is played by the occupation probabilities of the crystal lattice sites [18]. Recently the phase field model has been further extended to the subatomic scale, where the field is the subatomic-scale continuum density describing individual atoms [17]. The latter model bridges the molecular dynamics approach and the phase field theories discussed in this article. At intermediate length scales, mesoscale PFM theory and modeling bridge the gap between the modeling of atomistic-level physical processes and macroscopic-level material behavior. The input information to the mesoscale modeling consists of macroscopic material constants such as crystallographic data, elastic moduli, bulk chemical energy, interfacial energy, equilibrium composition, domain wall mobility, diffusivity, etc., which could be obtained via atomistic calculations (first-principles, molecular dynamics), experimental measurements, or both. Its output could be used directly to formulate the continuum constitutive relations for macroscopic materials theory and modeling. In particular, the PFM theory and models require a determination of the functional forms of the Landau-type energy for different physical processes. These could be obtained through atomistic-scale calculations. Incorporation of the results of atomistic simulations into the mesoscale PFM theories is a feasible route to multi-scale modeling.
References
[1] A.G. Khachaturyan, Fiz. Tverd. Tela, 8, 2710, 1966 [Sov. Phys. Solid State, 8, 2163, 1967].
[2] A.G. Khachaturyan, Theory of Structural Transformations in Solids, John Wiley & Sons, New York, 1983.
[3] A.G. Khachaturyan and G.A. Shatalov, Sov. Phys. JETP, 29, 557, 1969.
[4] Y.U. Wang, Y.M. Jin, A.M. Cuitiño, and A.G. Khachaturyan, Acta Mater., 49, 1847, 2001.
[5] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, J. Appl. Phys., 91, 6435, 2002.
[6] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, J. Appl. Phys., 92, 1351, 2002.
[7] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, Acta Mater., 51, 4209, 2003.
[8] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, Acta Mater., 52, 81, 2004.
[9] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, "Mesoscale modeling of mobile crystal defects – dislocations, cracks and surface roughening: phase field microelasticity approach," accepted to Phil. Mag., 2005.
[10] W.L. Bragg and J.F. Nye, Proc. R. Soc. Lond. A, 190, 474, 1947.
[11] C. Shen and Y. Wang, Acta Mater., 51, 2595, 2003.
[12] Y.U. Wang, Y.M. Jin, and A.G. Khachaturyan, "Structure density field theory and model of dislocation dynamics," unpublished, 2005.
[13] Y.M. Jin and A.G. Khachaturyan, Phil. Mag. Lett., 81, 607, 2001.
[14] S.Y. Hu, Y.L. Li, Y.X. Zheng, and L.Q. Chen, Int. J. Plast., 20, 403, 2004.
[15] D. Rodney, Y. Le Bouar, and A. Finel, Acta Mater., 51, 17, 2003.
[16] F.R.N. Nabarro, Proc. Phys. Soc. Lond., 59, 256, 1947.
[17] K.R. Elder and M. Grant, "Modeling elastic and plastic deformations in nonequilibrium processing using phase field crystals," unpublished, 2003.
[18] L.Q. Chen and A.G. Khachaturyan, Acta Metall. Mater., 39, 2533, 1991.
7.13 LEVEL SET DISLOCATION DYNAMICS METHOD
Yang Xiang(1) and David J. Srolovitz(2)
(1) Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
(2) Princeton Materials Institute and Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544, USA
1. Introduction
Although dislocation theory had its origins in the early years of the last century and has been an active area of investigation ever since (see [1–3]), our ability to describe the evolution of dislocation microstructures has been limited by the inherent complexity and anisotropy of the problem. This complexity has several contributing features. The interactions between dislocations are extraordinarily long-ranged and depend on the relative positions of all dislocation segments and on the orientations of their Burgers vectors and line directions. Dislocation mobility depends on the orientations of the Burgers vector and line direction with respect to the crystal structure. A description of the dislocation structure within a solid is further complicated by such topological events as annihilation, multiplication and reaction. As a result, analytical descriptions of dislocation structure have been limited to a small number of the simplest geometrical configurations. More recently, several dislocation dynamics simulation methods have been developed that account for complex dislocation geometries and/or the motion of multiple, interacting dislocations. The first class of these dislocation dynamics simulation methods is based upon front tracking methods. Three-dimensional simulations based upon these methods were first performed by Kubin et al. [4, 5] and later augmented by other researchers [6–11]. In these simulation methods, dislocation lines are discretized into individual segments. During the simulations, each segment is tracked and the forces on each segment from all other segments are calculated at each time increment (usually through the Peach–Koehler formula
[12]). Three-dimensional front tracking methods made it possible to simulate dislocation motion with a degree of reality heretofore not possible. Such methods require, however, large computational investments because they track each segment of each dislocation line and calculate the force on each segment due to all other segments at each time increment. Moreover, special rules are needed to describe the topological changes that occur when segments of the same or different dislocations annihilate or merge [8, 9, 11]. Another class of dislocation dynamics models employs a phase field description of dislocations, as proposed by Khachaturyan et al. [13, 14]. In their phase field model, density functions are used to model the evolution of a three-dimensional dislocation system. Dislocation loops are described as the perimeters of thin platelets determined by density functions. Since this method is based upon the evolution of a field in the full dimensions of the space, there is no need to track individual dislocation line segments, and topological changes occur automatically. However, contributions to the energy that are normally not present in dislocation theory must be included within the phase field model to keep the dislocation core from expanding. In addition, dislocation climb is not easily incorporated into this type of model. Recently, a three-dimensional level set method for dislocation dynamics has been proposed [15, 16]. In this method, dislocation lines in three dimensions are represented as the intersection of the zero levels (or zero contours) of two three-dimensional scalar functions (see [17–19] for a description of the level set method). The two three-dimensional level set functions are evolved using a velocity field extended smoothly from the velocity of the dislocation lines. The evolution of the dislocation lines is implicitly determined by the evolution of the two level set functions. Linear elasticity theory is used to compute the stress field generated by the dislocations; the elasticity equations are solved using a fast Fourier transform (FFT) method, assuming periodic boundary conditions. Since the level set method does not track individual dislocation line segments, it easily handles the topological changes associated with dislocation multiplication and annihilation. This level set method for dislocation dynamics is capable of simulating the three-dimensional motion of dislocations, naturally accounting for dislocation glide, cross-slip and climb through the choice of the ratio of the glide and climb mobilities. Unlike previous field-based methods [13, 14], no unconventional contributions to the system energy are required to keep the dislocation core localized. Numerical implementation of the level set method is through simple and accurate finite difference schemes on uniform grids. The results of simulation examples using this method agree very well with theoretical predictions and with the results obtained using other methods [15]. The method has also been used to simulate dislocation–particle bypass mechanisms [16]. Here we review this level set dislocation dynamics method and present some of the simulation results of [15, 16].
2. Continuum Dislocation Theory
We first briefly review the aspects of the continuum theory of dislocations that are relevant to the development of the level set description of dislocation dynamics. More complete descriptions of the continuum theory of dislocations can be found in, e.g., [2, 3, 20, 21]. Dislocations are line defects in crystals for which the elastic displacement vector satisfies
$$ \oint_L d\mathbf{u} = \mathbf{b}, \qquad (1) $$
where L is any contour enclosing the dislocation line with Burgers vector b and u is the elastic displacement vector. We can rewrite Eq. (1) in terms of the distortion tensor w, w_ij = ∂u_j/∂x_i for i, j = 1, 2, 3, as

$$ \nabla \times \mathbf{w} = \boldsymbol{\xi}\, \delta(\gamma) \otimes \mathbf{b}, \qquad (2) $$
where ξ is the unit vector tangent to the dislocation line, δ(γ) is the two-dimensional delta function in the plane perpendicular to the dislocation (zero everywhere except on the dislocation), and the operator ⊗ denotes the tensor product of two vectors. While the Burgers vector is constant along any individual dislocation line, different dislocation lines may have different Burgers vectors. Equation (2) is valid only for dislocations with the same Burgers vector. In crystalline materials, the number of possible Burgers vectors, N, is finite (e.g., typically N = 12 for an FCC metal). Equation (2) may be extended to account for all possible Burgers vectors:

$$ \nabla \times \mathbf{w} = \sum_{i=1}^{N} \boldsymbol{\xi}_i\, \delta(\gamma_i) \otimes \mathbf{b}_i, \qquad (3) $$
where γ_i represents all of the dislocations with Burgers vector b_i, and ξ_i is the tangent to dislocation line i. Next, we consider the tensors describing the strain and stress within the body containing the dislocations. The strain tensor is defined as

$$ \epsilon_{ij} = \tfrac{1}{2}\,(w_{ij} + w_{ji}) \qquad (4) $$
for i, j = 1, 2, 3. The stress tensor σ is determined from the strain tensor by the linear elastic constitutive equations (Hooke's law)

$$ \sigma_{ij} = \sum_{k,l=1}^{3} C_{ijkl}\, \epsilon_{kl} \qquad (5) $$
for i, j = 1, 2, 3, where {C_ijkl} is the elastic constant tensor. For an isotropic medium, the constitutive equations can be written as

$$ \sigma_{ij} = 2G\epsilon_{ij} + G\,\frac{2\nu}{1 - 2\nu}\,(\epsilon_{11} + \epsilon_{22} + \epsilon_{33})\,\delta_{ij} \qquad (6) $$

for i, j = 1, 2, 3, where G is the shear modulus, ν is the Poisson ratio, and δ_ij is equal to 1 if i = j and equal to 0 otherwise. In the absence of body forces, the equilibrium equation is simply

$$ \nabla \cdot \boldsymbol{\sigma} = 0. \qquad (7) $$
Finally, the stress and strain tensors associated with a dislocation can be found by combining Eqs. (2), (4), (5) and (7). Dislocations can be driven by stresses within the body. The driving force for dislocation motion, referred to as the Peach–Koehler force, is

$$ \mathbf{f} = \boldsymbol{\sigma}^{\mathrm{tot}} \cdot \mathbf{b} \times \boldsymbol{\xi}, \qquad (8) $$

where the total stress field σ^tot includes the applied stress σ^appl and the self-stress σ obtained by solving Eqs. (2), (4), (5) and (7):

$$ \boldsymbol{\sigma}^{\mathrm{tot}} = \boldsymbol{\sigma} + \boldsymbol{\sigma}^{\mathrm{appl}}. \qquad (9) $$
Dislocation migration can, at low velocities, be thought of as purely dissipative, such that the local dislocation velocity can be written as

$$ \mathbf{v} = \mathbf{M} \cdot \mathbf{f}, \qquad (10) $$

where M is the mobility tensor. The interpretation of the mobility tensor M is deferred to the next section.
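Gathering Eqs. (8)–(10), the local velocity follows from the stress in a few lines; the sketch below is an illustrative NumPy translation, not the authors' implementation.

    import numpy as np

    def dislocation_velocity(sigma_tot, b, xi, M):
        # Eqs. (8)-(10): Peach-Koehler force f = (sigma_tot . b) x xi, then v = M f.
        #   sigma_tot : (3,3) total stress;  b, xi : (3,) vectors;  M : (3,3) mobility
        f = np.cross(sigma_tot @ b, xi)  # force per unit length of dislocation
        return M @ f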
3. The Level Set Dislocation Dynamics Method
The level set framework was devised by Osher and Sethian [17] in 1988 and has been successfully applied to a wide range of physical and computer graphics problems [18, 19]. In this section, we present the level set approach to dislocation dynamics. More details and applications of this method can be found in [15, 16]. A level set is defined as a surface on which the level set function has a particular constant value. Therefore, an arbitrary scalar level set function can be used to describe a surface in three-dimensional space, a line in two-dimensional space, etc. In the level set method for dislocation dynamics, a dislocation in three-dimensional space γ(t) is represented by the intersection of the zero levels of two level set functions φ(x, y, z, t) and ψ(x, y, z, t) defined in the three-dimensional space, i.e., where

$$ \phi(x, y, z, t) = \psi(x, y, z, t) = 0, \qquad (11) $$
see Fig. 1. The evolution of the dislocation is described by

$$ \phi_t + \mathbf{v} \cdot \nabla \phi = 0, \qquad \psi_t + \mathbf{v} \cdot \nabla \psi = 0, \qquad (12) $$
where v is the velocity of the dislocation extended smoothly to the three-dimensional space, as described below. The reason this system of partial differential equations gives the correct motion of the dislocation can be understood in the following way. Assume that the dislocation γ(s, t), described in parametric form using the variable s, is given by

$$ \phi(\gamma(s, t), t) = 0, \qquad \psi(\gamma(s, t), t) = 0, \qquad (13) $$

where t is time. The derivative of Eq. (13) with respect to t gives

$$ \nabla\phi(\gamma(s, t), t) \cdot \gamma_t(s, t) + \phi_t(\gamma(s, t), t) = 0, \qquad \nabla\psi(\gamma(s, t), t) \cdot \gamma_t(s, t) + \psi_t(\gamma(s, t), t) = 0. \qquad (14) $$

Comparing this result with Eq. (12) shows that

$$ \gamma_t(s, t) = \mathbf{v}, \qquad (15) $$
which means the velocity of the dislocation is equal to v, as required. The velocity field of a dislocation is computed from the stress field using Eqs. (8), (9) and (10). The self-stress field is obtained by solving the elasticity equations (2), (4), (5) and (7). The unit vector locally tangent to the dislocation line, ξ, in Eqs. (2) and (8), is calculated from the level set functions φ and ψ using

$$ \boldsymbol{\xi} = \frac{\nabla\phi \times \nabla\psi}{|\nabla\phi \times \nabla\psi|}. \qquad (16) $$
Figure 1. A dislocation in three-dimensional space γ (t) is the intersection of the zero levels of the two level set functions φ(x, y, z, t) and ψ(x, y, z, t).
The self-stress obtained by solving the elasticity equations (2), (4), (5) and (7) is singular on the dislocation line. This singularity is artificial, because of the discreteness of the atomic lattice and the non-linearities in the stress–strain relation not included in the linear elastic formulation. This non-linear region corresponds to the dislocation core. One approach to handling this problem is to use a smeared delta function instead of the exact delta function in Eq. (2) near each point on the dislocation line. The smeared delta function, like the exact one, is defined in the plane perpendicular to the dislocation line, and the vector ξ is defined everywhere in this plane to be the dislocation line tangent vector. This smeared delta function can be considered to be the distribution of the Burgers vector in the plane perpendicular to the dislocation line. The width of the smeared delta function is the diameter of the core region of the dislocation line. We use this approach to treat the dislocation core and its smeared delta function description. More precisely, the smeared delta function in Eq. (2) is given by

$$ \delta(\gamma) = \delta(\phi)\,\delta(\psi), \qquad (17) $$
where the delta functions on the right-hand side are one-dimensional smeared delta functions

$$ \delta(x) = \begin{cases} \dfrac{1}{2\epsilon}\left(1 + \cos\dfrac{\pi x}{\epsilon}\right), & -\epsilon \le x \le \epsilon, \\[1ex] 0, & \text{otherwise}, \end{cases} \qquad (18) $$

and ε scales the distance over which the delta function is smeared. The level set functions φ and ψ are usually chosen to be signed distance functions to their zero levels (i.e., the magnitude of the function is the distance from the closest point on the surface, and the sign changes as we cross the zero level), and their zero levels are kept perpendicular to each other. A procedure called reinitialization is used to retain these properties of φ and ψ during their temporal evolution (see the next section for details). Therefore, the delta function defined by (17) is a two-dimensional smeared delta function in the plane perpendicular to the dislocation line. Moreover, the size and the shape of the core region do not change during the evolution of the system.
We now define the mobility tensor M. A dislocation line can glide conservatively (i.e., without diffusion) only in the plane containing both its tangent vector and the Burgers vector (i.e., the slip plane). A screw segment on a dislocation line can move in any plane containing the dislocation segment, since the tangent vector and Burgers vector are parallel. The switching of a screw segment from one slip plane to another is known as cross-slip. At high temperatures, non-screw segments of a dislocation can also move out of the slip plane by a non-conservative (i.e., diffusive) process, i.e., climb. The following form of the mobility tensor satisfies these constraints:

$$ \mathbf{M} = \begin{cases} m_g\, (\mathbf{I} - \mathbf{n} \otimes \mathbf{n}) + m_c\, \mathbf{n} \otimes \mathbf{n}, & \text{non-screw } (\boldsymbol{\xi} \text{ not parallel to } \mathbf{b}), \\ m_g\, \mathbf{I}, & \text{screw } (\boldsymbol{\xi} \text{ parallel to } \mathbf{b}), \end{cases} \qquad (19) $$

where

$$ \mathbf{n} = \frac{\boldsymbol{\xi} \times \mathbf{b}}{|\boldsymbol{\xi} \times \mathbf{b}|} \qquad (20) $$
is the unit vector normal to the slip plane (i.e., the plane that contains the tangent vector ξ of the dislocation and its Burgers vector b), I is the identity matrix, I − n ⊗ n is the orthogonal matrix that projects vectors onto the plane with normal vector n, m_g is the mobility constant for dislocation glide, and m_c is the mobility constant for dislocation climb. Typically,

$$ 0 \le \frac{m_c}{m_g} \ll 1. \qquad (21) $$

The mobility tensor M, defined above, can account for the relatively high glide mobility and slow climb mobility. The present method is equally applicable to all crystal systems and all crystal orientations through appropriate choice of the Burgers vector and the mobility tensor (which can be rotated into any arbitrary orientation). In the present model, the dislocation can slip on all mathematical slip planes (i.e., planes containing the Burgers vector and line direction) and is not constrained to a particular set of crystal planes {hkl}, although it would be relatively simple to impose this constraint. Finally, while we implicitly assume that the glide mobilities of screw and non-screw segments are identical, this restriction is also easily relaxed.
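The mobility tensor (19)–(20) is a small pointwise computation; the sketch below also applies the screw-segment threshold |ξ × b| < 0.1 quoted in Section 4.1. The function name and interface are assumptions of this sketch.

    import numpy as np

    def mobility_tensor(xi, b, m_g, m_c, screw_tol=0.1):
        # Mobility tensor of Eqs. (19)-(20) at one point; segments with
        # |xi x b| < screw_tol are treated as screw (threshold of Section 4.1).
        cross = np.cross(xi, b)
        norm = np.linalg.norm(cross)
        if norm < screw_tol:
            return m_g * np.eye(3)              # screw: glide in any plane
        n = cross / norm                        # slip-plane normal, Eq. (20)
        P = np.eye(3) - np.outer(n, n)          # projector onto the slip plane
        return m_g * P + m_c * np.outer(n, n)   # glide part + climb part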
For simplicity, we restrict our description of the problem throughout the rest of this discussion to the case of isotropic elasticity. While anisotropy does not cause any essential difficulties in the model, the added complexity clouds the description of the method. If we further assume periodic boundary conditions, the stress field can be solved analytically from the elasticity system (2), (4), (6) and (7) in Fourier space; the formulation can be found in [15]. A necessary condition for the elasticity system to have a periodic solution is that the total Burgers vector in the simulation cell is equal to zero. If the total Burgers vector is not equal to zero, the stress is equal to a periodic function plus a linear function in x, y and z [22, 23]. In this case, we also use the above-mentioned expression for the stress field, as it only gives the periodic part of that field. This is consistent with the approach suggested by Bulatov et al. for computing periodic image interactions in the front tracking method [22, 23]. The above description of the method can only be applied to the case where all dislocations have the same Burgers vector b. For a more general case, where dislocation lines have different Burgers vectors, we would use different level set functions φ_i and ψ_i for each unique Burgers vector b_i, i = 1, 2, . . . , N, where N is the total number of possible Burgers vectors, and use Eq. (3) instead of Eq. (2) in the elasticity equations.
4. Numerical Implementation

4.1. Computing the Elastic Fields and the Dislocation Velocity
We solve the elasticity equations associated with the dislocations, (2), (4), (6) and (7), using the FFT approach. The first step is to compute the dislocation tangent vector field ξδ(γ) from the level set functions φ and ψ. The delta function δ(γ) is computed using Eq. (17) with core radius ε = 3 dx, where dx is the spacing of the numerical grid. The tangent vector ξ is computed using a regularized form of Eq. (16) (to avoid division by zero), i.e.,

$$ \boldsymbol{\xi} = \frac{\nabla\phi \times \nabla\psi}{\sqrt{|\nabla\phi \times \nabla\psi|^2 + dx^2}}, \qquad (22) $$

as is standard in level set methods. The gradients of φ and ψ in Eq. (22) are computed using the third-order weighted essentially nonoscillatory (WENO) method [24]. Since WENO derivatives are one-sided, we switch sides after several time steps to reduce the error caused by asymmetry. After we obtain the stress field, we compute the velocity field using Eqs. (8)–(10). We now use central differencing to compute the gradients of φ and ψ in (22) to obtain the tangent vector ξ in Eqs. (8) and (20). The mobility tensor in Eq. (10) is computed using Eqs. (19) and (20). We also regularize the denominator in Eq. (20) to avoid division by zero, as we did in Eq. (22). For the mobility tensor (19), we use the mobility for a screw dislocation when |ξ × b| < 0.1 and the mobility for a non-screw dislocation otherwise.
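Putting the pieces of this subsection together, a sketch of the source-term assembly ξδ(γ) (with core radius ε = 3 dx and the regularized tangent (22)) might read as follows; central differences stand in for the one-sided WENO derivatives used in the paper.

    import numpy as np

    def line_source(phi, psi, dx):
        # Assemble xi * delta(gamma) on the grid: core radius eps = 3 dx,
        # tangent regularized as in Eq. (22).
        eps = 3.0 * dx
        d1 = lambda f: np.where(np.abs(f) <= eps,
                                (1.0 + np.cos(np.pi * f / eps)) / (2.0 * eps), 0.0)
        delta = d1(phi) * d1(psi)                           # Eqs. (17)-(18)
        gphi = np.stack(np.gradient(phi, dx), axis=-1)
        gpsi = np.stack(np.gradient(psi, dx), axis=-1)
        cross = np.cross(gphi, gpsi)
        xi = cross / np.sqrt(np.sum(cross**2, axis=-1, keepdims=True) + dx**2)
        return xi * delta[..., None]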
4.2. Numerical Implementation of the Level Set Method

4.2.1. Solving the evolution equations

The level set evolution equations are commonly solved using high-order essentially nonoscillatory (ENO) or WENO methods for the spatial discretization [17, 25, 24] and total variation diminishing (TVD) Runge–Kutta methods for the time discretization [26, 27]. Here we compute the spatial upwind derivatives using the third-order WENO method [24] and use the fourth-order TVD Runge–Kutta method [27] to solve the temporal evolution equations (12).
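For orientation, a first-order upwind, forward Euler analogue of the advection equations (12) is sketched below (the paper uses WENO3 in space and fourth-order TVD Runge–Kutta in time); np.roll implies periodic boundaries, an assumption of this sketch.

    import numpy as np

    def advect(phi, v, dx, dt):
        # One forward Euler step of phi_t + v . grad(phi) = 0 (Eq. (12)) with
        # first-order upwind differences.
        phi_new = phi.copy()
        for axis in range(3):
            dminus = (phi - np.roll(phi, 1, axis)) / dx   # backward difference
            dplus = (np.roll(phi, -1, axis) - phi) / dx   # forward difference
            va = v[..., axis]
            phi_new -= dt * (np.maximum(va, 0.0) * dminus +
                             np.minimum(va, 0.0) * dplus)
        return phi_new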
4.2.2. Reinitialization

In level set methods for three-dimensional curves, the desired level set functions φ and ψ are signed distance functions to their zero levels (i.e., the value at each point in the scalar field is equal to the distance from the closest point on the zero-level contour surface, with a positive sign on one side of the zero level and a negative sign on the other). Ideally, the zero-level surfaces of these two functions should be perpendicular to each other. Initially, we choose φ and ψ to be such signed distance functions. However, there is no guarantee that the level set functions will remain orthogonal signed distance functions during their evolution. This has the potential for causing large numerical errors. Standard level set techniques are used to reconstruct new level set functions from old ones with the dislocations unchanged. The resultant new level set functions are signed distance functions and their zero levels are perpendicular to each other. It has been shown [28, 29, 18, 30] that this procedure does not change the evolution of the lines represented by the intersection of the two level set functions, which here are the dislocations.

(1) Signed Distance Functions. To obtain a new signed distance function φ̃ from φ, we solve the following evolution equation to steady state [29]:

$$ \tilde\phi_t + \frac{\tilde\phi}{\sqrt{\tilde\phi^2 + |\nabla\tilde\phi|^2\, dx^2}}\, \bigl(|\nabla\tilde\phi| - 1\bigr) = 0, \qquad \tilde\phi(t = 0) = \phi. \qquad (23) $$
The new signed distance function ψ̃ is found from the level set function ψ similarly. We solve for the steady-state solutions of these equations using fourth-order TVD Runge–Kutta [27] in time and Godunov's scheme [25, 31] combined with third-order WENO [24] in space. We iterate these equations for several steps of the fourth-order TVD Runge–Kutta method [27] using a time increment equal to half of the Courant–Friedrichs–Lewy (CFL) limit (i.e., the numerical stability limit). We solve for the new level set functions φ̃ and ψ̃ at each time step for use in solving the evolution equation (12).

(2) Perpendicular Zero Levels. Theoretically, the following equation resets the zero level of φ perpendicular to that of ψ [18, 30]:

$$ \tilde\phi_t + \frac{\psi}{\sqrt{\psi^2 + |\nabla\psi|^2\, dx^2}}\; \frac{\nabla\psi}{\sqrt{|\nabla\psi|^2 + dx^2}} \cdot \nabla\tilde\phi = 0, \qquad \tilde\phi(t = 0) = \phi. \qquad (24) $$

We solve for the steady-state solution of this equation using fourth-order TVD Runge–Kutta [27] in time and third-order WENO [24] for the upwind one-sided derivatives of φ̃. The gradient of ψ in the equation is computed using
the average of the third-order WENO [24] derivatives on both sides. We iterate this equation for several steps of the fourth-order TVD Runge–Kutta method [27] using a time increment of half of the CFL limit. We reset the zero level of ψ perpendicular to that of φ similarly. We perform this perpendicular resetting procedure once every few time steps in the integration of the level set evolution equations (Eq. (12)).
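A stripped-down version of the reinitialization iteration (23), using central differences and forward Euler in place of the Godunov/WENO3 and TVD Runge–Kutta discretizations described above, might look like this:

    import numpy as np

    def reinitialize(phi, dx, dt, steps=20):
        # Relax Eq. (23) towards a signed distance function (|grad phi| = 1)
        for _ in range(steps):
            g = np.stack(np.gradient(phi, dx), axis=-1)
            gnorm = np.sqrt(np.sum(g**2, axis=-1))
            S = phi / np.sqrt(phi**2 + gnorm**2 * dx**2)  # smoothed sign function
            phi = phi - dt * S * (gnorm - 1.0)
        return phi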
4.2.3. Visualization

The plotting of the dislocation line configurations is complicated by the fact that the dislocation lines are determined implicitly by the two level set functions. We use the following plotting method, described in more detail in [18]. Each cube in the grid is divided into six tetrahedra. Inside each tetrahedron, the level set functions φ and ψ are approximated by linear functions. If the intersection of the zero levels of the two linear functions is not empty, it is a line segment inside the tetrahedron (we need only compute the two end points of the line segment on the tetrahedron surface); see Fig. 2. The union of all of these segments is the dislocation configuration.
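The face-by-face intersection test described above can be written compactly; the sketch below treats a single tetrahedron and is an illustrative reconstruction, not the code of Ref. [18].

    import numpy as np

    def tetra_segment(verts, phi_vals, psi_vals, tol=1e-12):
        # Dislocation segment in one tetrahedron: phi and psi are interpolated
        # linearly from the vertex values; the endpoints are the points on the
        # faces where both interpolants vanish. Returns (2,3) endpoints or None.
        pts = []
        for (i, j, k) in [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]:
            # on a face, p = p_i + u (p_j - p_i) + v (p_k - p_i)
            A = np.array([[phi_vals[j] - phi_vals[i], phi_vals[k] - phi_vals[i]],
                          [psi_vals[j] - psi_vals[i], psi_vals[k] - psi_vals[i]]])
            if abs(np.linalg.det(A)) < tol:
                continue                       # zero line parallel to this face
            u, v = np.linalg.solve(A, [-phi_vals[i], -psi_vals[i]])
            if u >= -tol and v >= -tol and u + v <= 1.0 + tol:
                pts.append(verts[i] + u * (verts[j] - verts[i])
                           + v * (verts[k] - verts[i]))
        return np.array(pts[:2]) if len(pts) >= 2 else None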
4.2.4. Velocity interpolation and extension

We use a smeared delta function (rather than an exact delta function) to compute the self-stress of the dislocations in order to smooth the singularity in the dislocation self-stress. The region near the dislocations where the smeared delta function is non-zero is the core region of the dislocations. The size of the core region is set by the discretization of space rather than by the physical
Figure 2. A cube in the grid, a tetrahedron ABCD and a dislocation line segment EF inside the tetrahedron. Point G is on the segment EF and the length of CG is the distance from the grid point C to the segment EF.
core size. The leading order of the self-stress near the dislocations, when using a smeared delta function, is of order 1/ε, where ε is the dislocation core size. This O(1/ε) self-stress near the dislocations does not contribute to the motion of the dislocations. We remove this contribution to the self-stress by a procedure which we call velocity interpolation and extension. We first interpolate the velocity on the dislocation line and then extend the interpolated values to the whole space using the fast sweeping method [32–36]. In the velocity interpolation, we use a method similar to that used in the plotting of dislocation lines. For any grid point, the dislocation line segments in nearby cubes can be found by the plotting method. The distance from this grid point to the dislocation line is the minimum distance to any dislocation segment. The remainder of the procedure is most simply described by consideration of the example in Fig. 2. The distance from the grid point of interest, point C for example, to the dislocation line is the distance from C to the segment EF. We locate a point G on the segment EF such that the length of CG is the minimum distance from C to EF. We know the velocity on the grid points of the cube in Fig. 2. We compute the velocity at the points E and F by trilinear interpolation of the velocity at these grid points. Then, we compute the velocity at the point G using linear interpolation of the velocity at E and F. The velocity of grid point C is approximated as that at point G.
(25)
using the Godunov scheme with Gauss-Seidel iterations [35, 36]. Velocity extension is incorporated into this algorithm by updating the velocity v = (v 1 , v 2 , v 3 ) at each gridpoint after the distance function is determined such that the velocity is constant in the directions normal to the dislocations (the gradient directions of the distance function). This involves solving equations ∇v i (x) · ∇d(x) = 0,
(26)
for i = 1, 2, 3 simultaneously d(x) [32–34].
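A simple PDE-relaxation sketch of the extension step (26) is shown below; it advects each velocity component along ∇d with central differences and forward Euler, whereas the paper couples the extension to the Godunov fast sweeping iteration.

    import numpy as np

    def extend_velocity(v, d, dx, dt, steps=50):
        # Relax v_t + (grad d / |grad d|) . grad v = 0 to steady state so each
        # velocity component becomes constant along the normals of the distance
        # function d (Eq. (26)); information flows outward from the dislocations.
        gd = np.stack(np.gradient(d, dx), axis=-1)
        w = gd / np.sqrt(np.sum(gd**2, axis=-1, keepdims=True) + 1e-12)
        for _ in range(steps):
            for i in range(v.shape[-1]):
                g = np.stack(np.gradient(v[..., i], dx), axis=-1)
                v[..., i] -= dt * np.sum(w * g, axis=-1)
        return v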
4.2.5. Initialization

Initially, we choose the level set functions φ and ψ such that (1) the intersection of their zero levels gives the initial configuration of the dislocation lines; (2) φ and ψ are signed distance functions to their zero levels, respectively; and (3) the zero levels of φ and ψ are perpendicular to each other.
Though we solve the elasticity equations assuming periodicity, the level set functions are not necessarily periodic and may be defined in a region smaller than the periodic simulation box.
5. Applications
Figures 3–10 show several applications of the level set method for dislocation dynamics described above. Additional simulation details and results can be found in [15, 16]. The simulations were performed within simulation cells that were l × l × l (where l = 2) in arbitrary units. The simulation cell is discretized into 64 × 64 × 64 grid points (for Fig. 6, the simulation cell is 2l × 2l × l, discretized into 128 × 128 × 64 grid points). We set the Poisson ratio ν = 1/3 and the climb mobility m_c = 0, except in Figs. 3 and 4. The simulations described in Fig. 3, performed with these parameters, required less than five hours on a personal computer with a 450 MHz Pentium II microprocessor.
Figure 3. A prismatic loop shrinking under its self-stress by climb. The Burgers vector b is pointing out of the paper. The loop is plotted at uniform time intervals starting with the outermost circle. The loop eventually disappears.
Figure 4. An initially circular glide loop in the xy plane, with Burgers vector b in the x direction, expanding under a complex applied stress (σ_xz, σ_xy ≠ 0) with mobility ratios m_c/m_g of (a) 0, (b) 0.25, (c) 0.5, (d) 0.75, and (e) 1.0. The loop is plotted at regular intervals in time.
The computational efficiency is independent of the absolute value of the glide mobility and of the grid spacing. Figure 3 shows a prismatic loop (Burgers vector perpendicular to the plane containing the loop) shrinking under its self-stress by climb (the climb mobility m_c > 0). The simulation result agrees with the well-known fact that the leading-order shrinking force in this case is proportional to the curvature of the loop. Figure 4 shows an initially circular glide loop expanding under a complex applied stress with mobility ratios m_c/m_g of 0, 0.25, 0.5, 0.75, and 1.0. The applied stress generates a finite force on all the dislocation segments that tends to move them out of the initial slip plane. However, if the climb mobility m_c = 0, only the screw segments move out of the slip plane; the non-screw segments cannot, because the mobility in that direction is zero (Fig. 4(a)). If the climb mobility m_c > 0, both the screw and non-screw segments move out of the slip plane (Fig. 4(b)–(e)). Figure 5 shows the intersection of two initially straight screw dislocations with different Burgers vectors. One dislocation is driven by an applied stress towards the other and then cuts through it. Two pairs of level set functions are used, and the elastic fields are described using Eq. (3) instead of Eq. (2). Figure 6 shows the simulation of a Frank–Read source. Initially the dislocation segment is an edge segment; it bows out under the applied stress and generates a new loop outside. The initial configuration in this simulation is a rectangular loop. Of its four segments, two opposite ones operate as the Frank–Read source in the plane perpendicular to the initial loop, and the other two are fixed.
Figure 5. Intersection of two initially straight screw dislocations with Burgers vectors b1 and b2. Dislocation 1 is driven in the direction of the −x axis by the applied stress σ_yz.
Figure 6. Simulation of the Frank–Read source. Initially the dislocation segment is an edge segment in the xy plane (the z axis is pointing out of the paper). The Burgers vector is parallel to the x axis and a stress σ_xz is applied. The configuration in the slip plane is plotted at different times during the evolution.
2320
Y. Xiang and D.J. Srolovitz
Frank-Read source in the plane perpendicular to the initial loop and the other two are fixed. Figure 7 shows an edge dislocation bypassing a linear array of impenetrable particles, leaving Orowan loops [37] around the particles behind. The dislocation moves towards the particles under an applied stress. The glide plane of the dislocation intersects the centers of the particles (the particles are coplanar). The impenetrable particles are assumed to exert a strong short-range repulsive force on dislocations, see [15] for details. Figure 8 shows a screw dislocation bypassing an impenetrable particle by a combination of Orowan looping [37] and cross-slipping [38]. The dislocation moves towards the particle under an applied stress. It leaves two loops behind on the two sides of the particle. The plane in which the screw dislocation would glide in the absence of the particle is above the particle center. Figure 9 shows an edge dislocation bypassing a misfitting spherical particle by cross-slip [38], where the slip plane of the dislocation is above the particle center. The misfit > 0. The dislocation moves towards the particles under an applied stress. Two loops are left behind: one is behind the particle and the other is around the particle. They have the same Burgers vector but opposite line directions. The stress fields generated by a (dilatational) misfitting spherical particle (isotropic elasticity) were given by Eshelby [39]. Figure 10 shows the critical stress for an edge dislocation to bypass co-planar impenetrable particles by the Orowan mechanism. The stress is 3
3
Figure 7. An edge dislocation bypassing a linear array of impenetrable particles, leaving Orowan loops [37] around the particles behind. The Burgers vector b is in the x direction. The applied stress σ_xz ≠ 0, where the z direction is pointing out of the paper.
Figure 8. A screw dislocation bypassing an impenetrable particle by a combination of Orowan looping [37] and cross-slipping [38]. The Burgers vector b is in the y direction, the applied stress is σ_yz ≠ 0, and the plane in which the screw dislocation would glide in the absence of the particle is above the particle center (in the +z direction).
Figure 9. An edge dislocation bypassing a misfitting spherical particle by cross-slip [38], where the slip plane of the dislocation is above the particle center. The Burgers vector b is in the x direction, the applied stress is σ_xz ≠ 0. The misfit ε > 0.
Figure 10. The critical stress for an edge dislocation to bypass co-planar impenetrable particles by the Orowan mechanism. The stress is plotted in units of $Gb/L$ against $\log(D_1/r_0)$; the fitted line has slope $1/2\pi$.
plotted in units of $Gb/L$ against $\log(D_1/r_0)$, where $G$ is the shear modulus, $b$ is the length of the Burgers vector, $L$ is the inter-particle distance, $D$ is the diameter of the particles, $D_1$ is the harmonic mean of $L$ and $D$, and $r_0$ is the inner cut-off radius associated with the dislocation core. The data points represent the simulation results and the straight line is the best fit to our data using the classic expression $(Gb/2\pi L)\log(D_1/r_0)$ [37, 40, 41]. This demonstrates good agreement between the simulation results obtained with the level set method and the theoretical estimates.
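For readers who wish to evaluate the classic Orowan estimate directly, a minimal numerical sketch is given below. The material parameters are illustrative copper-like values chosen here (they are not the parameters used to generate Fig. 10), and a natural logarithm is assumed in the fit expression.

```python
import math

def orowan_critical_stress(G, b, L, D, r0):
    """Classic Orowan bypass stress (Gb / (2*pi*L)) * log(D1/r0)
    [37, 40, 41], with D1 the harmonic mean of the inter-particle
    distance L and the particle diameter D."""
    D1 = 2.0 * L * D / (L + D)   # harmonic mean of L and D
    return G * b / (2.0 * math.pi * L) * math.log(D1 / r0)

# Illustrative copper-like values (not the parameters behind Fig. 10):
G, b = 48e9, 2.56e-10            # shear modulus (Pa), Burgers vector (m)
L, D, r0 = 200 * b, 50 * b, b    # spacing, particle diameter, core cut-off
tau_c = orowan_critical_stress(G, b, L, D, r0)
print(f"critical stress: {tau_c / 1e6:.0f} MPa "
      f"= {tau_c * L / (G * b):.2f} Gb/L")
```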
References

[1] V. Volterra, Ann. Ec. Norm., 24, 401, 1905.
[2] F.R.N. Nabarro, Theory of Crystal Dislocations, Clarendon Press, Oxford, England, 1967.
[3] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., John Wiley, New York, 1982.
[4] L.P. Kubin and G.R. Canova, In: U. Messerschmidt et al. (eds.), Electron Microscopy in Plasticity and Fracture Research of Materials, Akademie Verlag, Berlin, p. 23, 1990.
[5] L.P. Kubin, G. Canova, M. Condat, B. Devincre, V. Pontikis, and Y. Brechet, Solid State Phenomena, 23/24, 455, 1992.
[6] H.M. Zbib, M. Rhee, and J.P. Hirth, Int. J. Mech. Sci., 40, 113, 1998.
[7] M. Rhee, H.M. Zbib, J.P. Hirth, H. Huang, and T. de la Rubia, Modelling Simul. Mater. Sci. Eng., 6, 467, 1998.
[8] K.W. Schwarz, J. Appl. Phys., 85, 108, 1999.
[9] N.M. Ghoniem, S.H. Tong, and L.Z. Sun, Phys. Rev. B, 61, 913, 2000.
[10] B. Devincre, L.P. Kubin, C. Lemarchand, and R. Madec, Mat. Sci. Eng. A-Struct., 309, 211, 2001.
[11] D. Weygand, L.H. Friedman, E. Van der Giessen, and A. Needleman, Modelling Simul. Mater. Sci. Eng., 10, 437, 2002.
[12] M. Peach and J.S. Koehler, Phys. Rev., 80, 436, 1950.
[13] A.G. Khachaturyan, In: E.A. Turchi, R.D. Shull, and A. Gonis (eds.), Science of Alloys for the 21st Century, TMS Proceedings of a Hume-Rothery Symposium, TMS, p. 293, 2000.
[14] Y.U. Wang, Y.M. Jin, A.M. Cuitino, and A.G. Khachaturyan, Acta Mater., 49, 1847, 2001.
[15] Y. Xiang, L.T. Cheng, D.J. Srolovitz, and W. E, Acta Mater., 51, 5499, 2003.
[16] Y. Xiang, D.J. Srolovitz, L.T. Cheng, and W. E, Acta Mater., 52, 1745, 2004.
[17] S. Osher and J.A. Sethian, J. Comput. Phys., 79, 12, 1988.
[18] P. Burchard, L.T. Cheng, B. Merriman, and S. Osher, J. Comput. Phys., 170, 720, 2001.
[19] S. Osher and R.P. Fedkiw, J. Comput. Phys., 169, 463, 2001.
[20] R.W. Lardner, Mathematical Theory of Dislocations and Fracture, University of Toronto Press, Toronto and Buffalo, 1974.
[21] L.D. Landau and E.M. Lifshitz, Theory of Elasticity, 3rd edn., Pergamon Press, New York, 1986.
[22] V.V. Bulatov, M. Rhee, and W. Cai, In: L. Kubin et al. (eds.), Multiscale Modeling of Materials – 2000, Materials Research Society, Warrendale, PA, 2001.
[23] W. Cai, V.V. Bulatov, J. Chang, J. Li, and S. Yip, Phil. Mag., 83, 539, 2003.
[24] G.S. Jiang and D. Peng, SIAM J. Sci. Comput., 21, 2126, 2000.
[25] S. Osher and C.W. Shu, SIAM J. Numer. Anal., 28, 907, 1991.
[26] C.W. Shu and S. Osher, J. Comput. Phys., 77, 439, 1988.
[27] R.J. Spiteri and S.J. Ruuth, SIAM J. Numer. Anal., 40, 469, 2002.
[28] M. Sussman, P. Smereka, and S. Osher, J. Comput. Phys., 114, 146, 1994.
[29] D. Peng, B. Merriman, S. Osher, H.K. Zhao, and M. Kang, J. Comput. Phys., 155, 410, 1999.
[30] S. Osher, L.T. Cheng, M. Kang, H. Shim, and Y.H.R. Tsai, J. Comput. Phys., 179, 622, 2002.
[31] M. Bardi and S. Osher, SIAM J. Math. Anal., 22, 344, 1991.
[32] H.K. Zhao, T. Chan, B. Merriman, and S. Osher, J. Comput. Phys., 127, 179, 1996.
[33] S. Chen, M. Merriman, S. Osher, and P. Smereka, J. Comput. Phys., 135, 8, 1997.
[34] D. Adalsteinsson and J.A. Sethian, J. Comput. Phys., 148, 2, 1999.
[35] Y.H.R. Tsai, L.T. Cheng, S. Osher, and H.K. Zhao, SIAM J. Numer. Anal., 41, 673, 2003.
[36] H.K. Zhao, Math. Comp., to appear.
[37] E. Orowan, In: Symposium on Internal Stress in Metals and Alloys, The Institute of Metals, London, p. 451, 1948.
[38] P.B. Hirsch, J. Inst. Met., 86, 13, 1957.
[39] J.D. Eshelby, In: F. Seitz and D. Turnbull (eds.), Solid State Physics, vol. 3, Academic Press, New York, 1956.
[40] M.F. Ashby, Acta Metall., 14, 679, 1966.
[41] D.J. Bacon, U.F. Kocks, and R.O. Scattergood, Phil. Mag., 28, 1241, 1973.
7.14 COARSE-GRAINING METHODOLOGIES FOR DISLOCATION ENERGETICS AND DYNAMICS

J.M. Rickman¹ and R. LeSar²

¹Department of Materials Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
²Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
1. Introduction
Recent computational advances have permitted mesoscale simulations, wherein individual dislocations are the objects of interest, of systems containing on the order of $10^6$ dislocations [1–4]. While such simulations are beginning to elucidate important energetic and dynamical features, it is worth noting that the large-scale deformation response in, for example, well-worked metals having dislocation densities ranging between $10^{10}$ and $10^{14}$ m$^{-2}$ can be accurately described by a relatively small number of macrovariables. This reduction in the number of degrees of freedom required to characterize plastic deformation implies that a homogenization, or coarse-graining, of variables is appropriate over some range of length and time scales. Indeed, there is experimental evidence that, at least in some cases, the mechanical response of materials depends most strongly on the macroscopic density of dislocations [5] while, in others, the gross substructural details may also be of importance. A successful, coarse-grained theory of dislocation behavior requires the identification of the fundamental homogenized variables from among the myriad dislocation coordinates, as well as the time scale for overdamped defect motion. Unfortunately, there has been, to date, little effort to devise workable coarse-graining strategies that properly reflect the long-ranged nature of dislocation–dislocation interactions. Thus, in this topical article, we review salient work in this area, highlighting the observation that seemingly unrelated problems are, in fact, part of a unified picture of coarse-grained dislocation behavior that is now emerging. More specifically, a prescription is given for identifying a relevant macrovariable set that describes a collection of mutually interacting dislocations. This set follows from a real-space
analysis involving the subdivision of a defected system into volume elements and subsequent multipole expansions of the dislocation density. It is found that the associated multipolar energy expansion converges quickly (i.e., usually at dipole or quadrupole order) for well-separated elements. Having formulated an energy functional for the macrovariables, the basic ingredients of temporal coarse-graining schemes are then outlined to describe dislocation–dislocation interactions at finite temperature. Finally, we suggest dynamical models to describe the time evolution of the coarse macrovariables. This article is organized as follows. In Section 2 we outline spatial coarse-graining strategies that permit one to link mesoscale dislocation energetics and dynamics with the continuum. In Section 3 we review some temporal coarse-graining procedures that make it possible to reduce the number of macrovariables needed in a description of thermally induced kinks and jogs on dislocation lines. Section 4 contains a summary of the paper and a discussion of coarse-grained dynamics.
2. Spatial Coarse-Graining Strategies
A homogenized description of the energetics of a collection of dislocations in, for example, a well-worked metal is complicated by the long-ranged, anisotropic nature of dislocation–dislocation interactions. Such interactions lead to the formation of patterns at multiple length scales as dislocations polygonize to lower the energy of the system [6, 7]. This tendency to form dislocation walls can be quantified via the calculation of an orientationally weighted pair correlation function [8, 9] from a large-scale, two-dimensional mesoscale simulation of edge dislocations, as shown in Fig. 1. As is evident from the figure, both 45° and 90° walls are dominant (with other orientations also represented), consistent with the propensity to form dislocation dipoles with these relative orientations. Thus, a successful coarse-graining strategy must preserve the essential features of these dislocation structures while reducing systematically the number of degrees of freedom necessary for an accurate description. There are different, although complementary, avenues to pursue in formulating a self-consistent, real-space numerical coarse-graining strategy in which length scales shorter than some prescribed cutoff are eliminated from the problem. One such approach involves subdividing the system into equally sized blocks and then, after integrating out information on scales less than the block size, inferring the corresponding coarse-grained free energy from probability histograms compiled during finite-temperature mesoscale simulations [10–12]. In this context, each block contains many dislocations, and so the free energy extracted from histograms will be a function of a block-averaged dislocation density. This method is motivated by Monte Carlo coarse-graining (MCCG) studies of spin systems and can be readily applied, for example, to a
Figure 1. An angular pair-correlation function. In the white (black) region there is a relatively high probability of finding a dislocation with positive (negative) Burgers vector, given that a dislocation with positive Burgers vector is located at the origin. From [9].
two-dimensional dislocation system, modeled as a “vector” lattice gas, once the long-ranged nature of the dislocation–dislocation interaction is taken into account. Unfortunately, however, the energy scale associated with dislocation interactions is typically much greater than $k_B T$, where $k_B$ is Boltzmann’s constant and $T$ is the temperature, and therefore the finite-temperature sampling inherent in the MCCG technique is not well-suited to the current problem. To develop a more useful technique that reflects the many frustrated, low-energy states relevant here, consider first the ingredients of a coarse-graining strategy based on continuous dislocation theory. The theory of continuous dislocations follows from the introduction of a coarse-graining volume $\Omega$ over which the dislocation density is averaged. The dislocation density is a tensor field defined at $\vec{r}$ with components $\rho_{ki}(\vec{r})$, where $k$ indicates the component of the line direction and $i$ indicates the component of the Burgers vector. In this development it is generally assumed that $\Omega$ is large relative to the dislocation spacing, yet small relative to the system size [13]. However, the exact meaning of this averaging prescription is unclear, and it is not obvious at what scales a continuum theory should hold. In particular, if one takes the above assumption that a continuum theory holds for length scales much greater than the typical dislocation spacing, then the applicability of the method is restricted to
scales much greater than the dislocation structures known to be important for materials response [14]. Clearly, if the goal is to apply this theory at smaller length scales so as to capture substructures relevant to mechanical response, then one must build the ability to represent such substructures into the formalism. As previous work focused on characterizing these dislocation structures (calculated from two-dimensional simulations) through the use of pair correlation functions [8, 9], we outline here an extension to the continuous dislocation theory that incorporates important spatial correlations. The starting point for this development is the work of Kosevich, who showed that the interaction energy of systems of dislocations (in an isotropic linear elastic medium) can be written in terms of Kröner's incompatibility tensor [15]. From that form one can derive an energy expression in terms of the dislocation density tensor [16]

$$E_I = \frac{\mu}{16\pi} \int\!\!\int \epsilon_{ipl}\,\epsilon_{jmn}\, R_{,mp}(\vec{r}, \vec{r}\,') \left[ \rho_{jl}(\vec{r})\rho_{in}(\vec{r}\,') + \delta_{ij}\,\rho_{kl}(\vec{r})\rho_{kn}(\vec{r}\,') + \frac{2\nu}{1-\nu}\,\rho_{il}(\vec{r})\rho_{jn}(\vec{r}\,') \right] d\vec{r}\; d\vec{r}\,', \qquad (1)$$
where the integrals are over the entire system, $\delta_{ij}$ is the Kronecker delta, and repeated indices are summed. The notation $a_{,i}$ denotes the derivative of $a$ with respect to $x_i$; $R_{,mk}$ indicates the derivative $\partial^2 |\vec{r} - \vec{r}\,'|/\partial x_m \partial x_k$. It should be noted here that the energy expression in Eq. (1) includes very limited information about dislocation structures at scales smaller than the averaging volume. Here we summarize results from one approach to incorporate the effects of lower-scale structures, with a more complete derivation given elsewhere [17]. The basic plan is to divide space into small averaging volumes, calculate the local multipole moments of the dislocation microstructure (as described next), and then to write the energy as an expansion over the multipoles. Consider a small region of space with volume $\Omega$ containing $n$ distinct dislocation loops, not necessarily entirely contained within $\Omega$. We can define a set of moment densities of the distribution of loops in $\Omega$ as [17]

$$\rho_{lj}^{(\Omega)} = \frac{1}{\Omega} \sum_{q=1}^{n} b_j^{(q)} \int_{(C^{(q)},\,\Omega)} dl_l^{(q)}, \qquad (2)$$

$$\rho_{lj\alpha}^{(\Omega)} = \frac{1}{\Omega} \sum_{q=1}^{n} b_j^{(q)} \int_{(C^{(q)},\,\Omega)} r_\alpha^{(q)}\, dl_l^{(q)}, \qquad (3)$$
where $\vec{b}$ is the Burgers vector and the notation $(C^{(q)}, \Omega)$ indicates that we integrate over those parts of dislocation line $q$ that lie within the volume $\Omega$. Here $\rho_{lj}^{(\Omega)}$ is the dislocation density tensor and $\rho_{lj\alpha}^{(\Omega)}$ is the dislocation dipole moment tensor for volume $\Omega$. Higher-order moments can also be constructed. Consider next two regions in space denoted by A and B. We can write the interaction energy between the dislocations in the two regions as sums of pair interactions or, equivalently, as line integrals over the dislocation loops [18, 19]. Now, if the volumes are well separated, then the interaction energy can be written as a multipole expansion [17]. Upon truncating this expansion at zeroth order (i.e., the “charge–charge” term) one finds

$$E_{AB}^{(0)} = \frac{\mu}{8\pi} \int_{\Omega_A}\!\!\int_{\Omega_B} \epsilon_{ipl}\,\epsilon_{jmn}\, R_{,mp} \left[ \rho_{jl}^{(\Omega_B)}\rho_{in}^{(\Omega_A)} + \delta_{ij}\,\rho_{kl}^{(\Omega_B)}\rho_{kn}^{(\Omega_A)} + \frac{2\nu}{1-\nu}\,\rho_{il}^{(\Omega_B)}\rho_{jn}^{(\Omega_A)} \right] d\vec{r}_A\; d\vec{r}_B, \qquad (4)$$
where $R$ connects the centers of the two regions. Summing the interactions between all regions of space and then taking the limit that the averaging volumes $\Omega_A$ and $\Omega_B$ go to differential volume elements, the Kosevich form for continuous dislocations in Eq. (1) is recovered and the dislocation density tensor approaches asymptotically the continuous result. Corrections to the Kosevich form associated with a finite averaging volume can now be obtained by including higher-order moments in the expansion. For example, the first-order term (“charge–dipole”) has the form

$$E_I^{(\mathrm{dipole-charge})} = \frac{\mu}{16\pi} \int\!\!\int \epsilon_{ipl}\,\epsilon_{jmn}\, R_{,mp\alpha} \left\{ \left[ \rho_{jl}(\vec{r})\rho_{in\alpha}(\vec{r}\,') + \delta_{ij}\,\rho_{kl}(\vec{r})\rho_{kn\alpha}(\vec{r}\,') + \frac{2\nu}{1-\nu}\,\rho_{il}(\vec{r})\rho_{jn\alpha}(\vec{r}\,') \right] - \left[ \rho_{jl\alpha}(\vec{r})\rho_{in}(\vec{r}\,') + \delta_{ij}\,\rho_{kl\alpha}(\vec{r})\rho_{kn}(\vec{r}\,') + \frac{2\nu}{1-\nu}\,\rho_{il\alpha}(\vec{r})\rho_{jn}(\vec{r}\,') \right] \right\} d\vec{r}\; d\vec{r}\,', \qquad (5)$$
where $R_{,mp\alpha}$ is the next higher-order derivative of $R$ [17]. We note that inclusion of terms that depend on the local dipole is equivalent to gradient corrections to the Kosevich form. The expression in Eq. (5) (and higher-order terms) can be used as a basis for a continuous dislocation theory with local structure by including the dipole (and higher) dislocation moment tensors as descriptors. For a systematic analysis of the terms in a dislocation multipolar energy expansion and their dependence on coarse-grained cell size, the reader is referred to a review elsewhere [20].
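To make the moment definitions of Eqs. (2) and (3) concrete, the following sketch evaluates them for dislocation lines discretized into straight segments. The function name, the midpoint-rule quadrature for the $r_\alpha$ weight, and the assumption that the segments have already been clipped to the averaging volume are ours, not part of the formalism of [17].

```python
import numpy as np

def loop_moments(segments, burgers, volume):
    """Zeroth and first moment densities of dislocation segments lying
    inside one averaging volume Omega, per Eqs. (2)-(3).  `segments` is
    a list of (r_start, r_end) endpoint pairs (3-vectors) already
    clipped to Omega; `burgers` lists the corresponding Burgers vectors.
    Returns rho[l, j] and the dipole moment rho_dip[l, j, alpha]."""
    rho = np.zeros((3, 3))
    rho_dip = np.zeros((3, 3, 3))
    for (ra, rb), b in zip(segments, burgers):
        dl = np.asarray(rb) - np.asarray(ra)            # integral of dl over segment
        rmid = 0.5 * (np.asarray(ra) + np.asarray(rb))  # midpoint rule for r_alpha
        rho += np.outer(dl, b)                          # dl_l * b_j
        rho_dip += np.einsum('l,j,a->lja', dl, b, rmid) # dl_l * b_j * r_alpha
    return rho / volume, rho_dip / volume
```

For well-separated volumes, moments computed this way could then be contracted with the kernel of Eq. (4) to approximate the pairwise interaction energy.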
3. Temporal Coarse-graining – Finite-temperature Effects
At finite temperatures dislocation lines may be perturbed by thermally induced kinks and jogs. While such perturbations are inherent in 3D mesoscale dislocation dynamics simulations at elevated temperatures, it is of interest here to explore methods to integrate out these modes to arrive at a simpler description of dislocation interactions. For example, motivated by calculations of the fluctuation-induced coupling of dipolar chains in electrorheological fluids and flux lines in superconductors [21], one can determine the interaction free energy between fluctuating dislocation lines that are in contact with a thermal bath and thereby deduce the effective force between dislocations. Indeed, the impact of temperature-induced fluctuations on the interaction of two (initially) parallel screw dislocations was the focus of a recent paper [16]. In this work it was assumed that perturbations in the dislocation lines that arise from thermal fluctuations in the medium can be viewed as a superposition of modes having screw, edge and mixed character. The impact of these fluctuations on the force between the dislocations at times greater than those associated with the period of a fluctuation was then examined by integrating out the vibrational modes of the dislocation lines. The procedure employed was similar to that used to construct quasiharmonic models of solids, in which vibrational atomic displacements are eliminated in favor of their corresponding frequency spectrum in the canonical partition function [22]. In both cases the resulting free energy then depends on a small set of coarse-grained variables. To see how a finite-temperature force may be constructed, consider a prototypical system in which harmonic perturbations are added to two straight screw dislocation lines without changing the Burgers vector, which remains along the $z$ (i.e., $x_3$) axis. We describe those fluctuations by parameterizing the line position in the $x_1$–$x_2$ plane with a Fourier series, $\vec{r} = \hat{x}_1 F_{\parallel}(x_3) + \hat{x}_2 F_{\perp}(x_3)$, where

$$F_\kappa(x_3) = \sum_{n_\kappa=1}^{n_{\max}} \left( C_{+,n,\kappa}\, e^{i n_\kappa \pi x_3/L} + C_{-,n,\kappa}\, e^{-i n_\kappa \pi x_3/L} \right), \qquad (6)$$
$\kappa$ is either $\perp$ or $\parallel$, $L$ is a maximal length characterizing the system, and $n_{\max}$ is related to a minimum characteristic length. An expression for the dislocation density tensor $\rho_{ij}(\vec{r})$ corresponding to the expansion in Eq. (6) can be written in terms of Dirac delta functions indicating the line position. The next step in the analysis is to calculate the Fourier transform of the dislocation density for the perturbed dislocation lines. While it is possible to write these densities in terms of infinite series expansions, it is more useful here to restrict attention to the lowest-order terms in the fluctuation amplitudes that are excited at low temperatures. Having determined the dislocation density tensor, the aim is then to calculate the interaction energy between two
perturbed dislocation lines. This energy will, in turn, determine the corresponding Boltzmann weight for the fluctuating pair of lines and, hence, the equilibrium statistical mechanics of this system. The interaction energy can be obtained from an expression for the total energy, $E$, based on ideas from continuous dislocation theory [23]. For this purpose it is again convenient to write the Kosevich energy functional, this time as an integral in reciprocal space [13, 15], as

$$E[\bar{\rho}] = \frac{1}{2} \int \frac{d^3k}{(2\pi)^3}\; \tilde{\rho}_{ij}(\vec{k})\, K_{ijkl}(\vec{k})\, \tilde{\rho}_{kl}(-\vec{k}), \qquad (7)$$
where the integration is over reciprocal space (tilde denoting a Fourier transform), with the kernel (without core energy contributions)

$$K_{ijkl} = \frac{\mu}{k^2} \left( Q_{ik} Q_{jl} + C_{il} C_{kj} + \frac{2\nu}{1-\nu}\, C_{ij} C_{kl} \right), \qquad (8)$$
and $\bar{Q}$ and $\bar{C}$ are longitudinal and transverse projection operators, respectively. (The energetics of the disordered core regions near each line can be incorporated, at least approximately, by the inclusion of a phenomenological energy penalty term in the kernel above.) The Helmholtz free energy and, therefore, the associated finite-temperature forces can be obtained by first constructing the partition function $Z(\vec{k}, \vec{k}\,')$ for the system of two perturbed screw dislocations with associated $\vec{k} = \hat{i}\,\bar{k}_\parallel + \hat{j}\,\bar{k}_\perp$ and $\vec{k}\,' = \hat{i}\,\bar{k}'_\parallel + \hat{j}\,\bar{k}'_\perp$. This is accomplished by considering the change in energy, $\Delta e(a)$, associated with fluctuations on the (initially straight) dislocations and noting that it can be written as a sum of contributions, $(\Delta e)_\parallel$ and $(\Delta e)_\perp$, corresponding to in-plane and transverse fluctuation modes. One then finds that the factorized partition function is

$$Z(\vec{k}, \vec{k}\,') = N \int d\omega_\parallel \exp\!\left[\frac{-L(\Delta e)_\parallel}{k_B T}\right] \int d\omega_\perp \exp\!\left[\frac{-L(\Delta e)_\perp}{k_B T}\right] = Z_\parallel Z_\perp, \qquad (9)$$
where $N$ is a normalization factor and $\omega$ is the eight-dimensional configuration space described by the complex fluctuation amplitudes. The Helmholtz free energy associated with the interactions between the fluctuating screws is then given by

$$A = -k_B T \ln(Z) = -k_B T \left\{ \ln(Z_\parallel) + \ln(Z_\perp) \right\}. \qquad (10)$$
In our earlier work [16] we gave analytic expressions for both $Z_\perp$ and $Z_\parallel$. Upon integrating $A$ over all possible perturbation wavevectors one finally
Figure 2. The contributions to the normalized force versus normalized separation for two perturbed dislocations. The parallel (perpendicular) contribution is denoted by triangles (circles). From [16].
arrives at the total free energy, now a function of coarse-grained variables (i.e., the average line locations). From the development above it is clear that the average force between the dislocations is obtained by differentiating the total free energy with respect to the line separation $a$. For the purposes of illustration it is convenient to decompose this force into a sum of components parallel and perpendicular to a line joining the dislocations. For concreteness, we evaluate the resulting force for dislocations embedded in copper and having the same properties. The maximum size of the system is taken to be $L = 200b$, where $b$ is the magnitude of the Burgers vector of a dislocation. As can be seen from Fig. 2, a plot of the normalized force contributions versus normalized separation $a^*$ ($a^* = a/b$), the parallel (perpendicular) contribution to the force is repulsive (attractive), both components being of similar magnitude. Further analysis indicates that the net thermal force at a temperature of 600 K and a separation of $a^* = 22$ is approximately $1.3 \times 10^{-4}$ J/m$^2$ for $b = 2.56$ Å. This thermal force is approximately 1000 times smaller in magnitude than the direct (Peach–Koehler) force at the same separation.
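The quoted thousand-fold disparity is easy to check against the textbook Peach–Koehler result for two parallel screw dislocations, for which the interaction force per unit length is $\mu b^2/(2\pi a)$. The parameter values in the sketch below are approximate copper numbers chosen here for illustration; they are not necessarily the values used to generate Fig. 2.

```python
import math

# Direct (Peach-Koehler) force per unit length between two parallel screw
# dislocations a distance a apart: F/L = mu * b^2 / (2 * pi * a).
mu = 48e9        # shear modulus of Cu, Pa (approximate; our assumption)
b = 2.56e-10     # Burgers vector magnitude, m
a = 22 * b       # separation a* = 22, as in the text
f_pk = mu * b**2 / (2.0 * math.pi * a)
print(f"Peach-Koehler force per unit length: {f_pk:.2e} N/m")
# ~9e-2 N/m, roughly three orders of magnitude larger than the thermal
# force (~1.3e-4 N/m at 600 K) quoted above.
```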
4. Discussion
Several applications of spatial and temporal coarse graining to systems containing large numbers of dislocations have been outlined here. A common
theme linking these strategies is the classification of relevant state variables and the subsequent elimination of a subset of degrees of freedom (via averaging, etc.) in favor of those associated with a coarser description. For example, in the case of the straight screw dislocations interacting with a thermal bath (see Section 3), the vibrational modes of the dislocation lines can be identified as “fast” variables that can be integrated out of the problem, with the resultant free energy based on the long-time, average location of these lines. Furthermore, the spatial coarse-graining schemes proposed above involve the identification of a dislocation density, based on localized collections of dislocations, and the separation of interaction length scales (i.e., in terms of a multipolar decomposition and associated gradient expansions) with the aim of developing a model based solely on the dislocation density and other macrovariables. It remains to link coarse-grained dislocation energetics with the corresponding dynamics. While the history of the theory of dislocation dynamics goes back to the early work of Frank [24], Eshelby [25], Mura [26] and others, who deduced the inertial response for isolated edge and screw dislocations in an elastically isotropic medium, we note that the formulation of equations of motion for an ensemble of mutually interacting dislocations at finite temperature is an ongoing enterprise that presents numerous challenges. We therefore merely outline promising approaches here. The construction of a kinetic model is, perhaps, best motivated by earlier work in the field of critical dynamics [27, 28]. More specifically, in this approach, one formulates a set of differential equations that reflect any conservation laws that constrain the evolution of the variables (e.g., conservation of Burgers vector in the absence of sources). Different workers have employed variations of this formalism in dislocation dynamics simulations. For example, in early work in this area, Holt [29] postulated a dissipative equation of motion for the scalar dislocation density, subject to the constraint of conservation of Burgers vector, with a driving force given by gradients of fluctuations in the dislocation interaction energy. Rickman and Vinals [30], following an earlier statistical-mechanical treatment of free dislocation loops [13] and guided by hydrodynamic descriptions of condensed systems, considered a dynamics akin to a noise-free Model B [28] to track the time evolution of the dislocation density tensor in an elastically isotropic medium. Equations of motion for dislocation densities have also been advanced by Marchetti and Saunders [31] in a description of a viscoelastic medium containing unbound dislocations, by Haataja et al. [32] in a continuum model of misfitting heteroepitaxial films and, recently, by Khachaturyan and coworkers [33–35] in several phase-field simulations. The elegant approach of this group is, however, an alternative formulation of overdamped discrete dislocation models, as opposed to a spatially coarse-grained description. As indicated above, work in this area continues, with some current efforts directed at incorporating dislocation substructural information in the dynamics.
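As a schematic of the conserved, dissipative dynamics invoked above, the sketch below advances a scalar density field by one explicit step of a noise-free Model B. The double-well bulk free energy is a generic placeholder chosen by us; it is not a dislocation energy functional.

```python
import numpy as np

def model_b_step(rho, dt, dx, M=1.0, c=1.0, fprime=lambda r: r**3 - r):
    """One explicit Euler step of noise-free Model B dynamics,
    d(rho)/dt = M * lap(mu) with mu = f'(rho) - c * lap(rho), on a 1D
    periodic grid.  Because the update is a Laplacian of mu, the
    integral of rho (a stand-in for a conserved quantity such as net
    Burgers vector content) is preserved."""
    lap = lambda f: (np.roll(f, 1) + np.roll(f, -1) - 2 * f) / dx**2
    mu = fprime(rho) - c * lap(rho)  # derivative of an assumed free energy
    return rho + dt * M * lap(mu)    # conserved update
```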
Acknowledgments J.M. Rickman would like to thank the National Science Foundation for its support under grant number DMR-9975384. The work of R. LeSar was performed under the auspices of the United States Department of Energy (US DOE under Contract No. W-7405-ENG-36) and was supported by the Office of Science/Office of Basic Energy Sciences/Division of Materials Science of the US DOE.
References

[1] E. Van der Giessen and A. Needleman, “Micromechanics simulations of fracture,” Ann. Rev. Mater. Res., 32, 141, 2002.
[2] R. Madec, B. Devincre, and L. Kubin, “Simulation of dislocation patterns in multislip,” Scripta Mater., 47, 689–695, 2002.
[3] M. Rhee, D.H. Lassila, V.V. Bulatov, L. Hsiung, and T.D. de la Rubia, “Dislocation multiplication in BCC molybdenum: a dislocation dynamics simulation,” Phil. Mag. Lett., 81, 595, 2001.
[4] M. Koslowski, A.M. Cuitino, and M. Ortiz, “A phase-field theory of dislocation dynamics, strain hardening and hysteresis in ductile single crystals,” J. Mech. Phys. Solids, 50, 2597, 2002.
[5] A. Turner and B. Hasegawa, “Mechanical testing for deformation model development,” ASTM, 761, 1982.
[6] J.P. Hirth and J. Lothe, Theory of Dislocations, Krieger, Malabar, Florida, 1982.
[7] D.A. Hughes, D.C. Chrzan, Q. Liu, and N. Hansen, “Scaling of misorientation angle distributions,” Phys. Rev. Lett., 81, 4664–4667, 1998.
[8] A. Gulluoglu, D.J. Srolovitz, R. LeSar, and P.S. Lomdahl, “Dislocation distributions in two dimensions,” Scripta Metall., 23, 1347–1352, 1989.
[9] H.Y. Wang, R. LeSar, and J.M. Rickman, “Analysis of dislocation microstructures: impact of force truncation and slip systems,” Phil. Mag. A, 78, 1195–1213, 1998.
[10] K. Binder, “Critical properties from Monte Carlo coarse graining and renormalization,” Phys. Rev. Lett., 47, 693–696, 1981.
[11] K. Kaski, K. Binder, and J.D. Gunton, “Study of cell distribution functions of the three-dimensional Ising model,” Phys. Rev. B, 29, 3996–4009, 1984.
[12] M.E. Gracheva, J.M. Rickman, and J.D. Gunton, “Coarse-grained Ginzburg–Landau free energy for Lennard–Jones systems,” J. Chem. Phys., 113, 3525–3529, 2000.
[13] D.R. Nelson and J. Toner, “Bond-orientational order, dislocation loops and melting of solids and smectic-A liquid crystals,” Phys. Rev. B, 24, 363–387, 1981.
[14] U.F. Kocks, A.S. Argon, and M.F. Ashby, Thermodynamics and Kinetics of Slip, Prog. Mat. Sci., 19, 1975.
[15] A.M. Kosevich, In: F.R.N. Nabarro (ed.), Dislocations in Solids, New York, p. 37, 1979.
[16] J.M. Rickman and R. LeSar, “Dislocation interactions at finite temperature,” Phys. Rev. B, 64, 094106, 2001.
[17] R. LeSar and J.M. Rickman, Phys. Rev. B, 65, 144110, 2002.
[18] N.M. Ghoniem and L.Z. Sun, “Fast-sum method for the elastic field of three-dimensional dislocation ensembles,” Phys. Rev. B, 60, 128, 1999.
[19] R. de Wit, Solid State Phys., 10, 249, 1960.
[20] R. LeSar and J.M. Rickman, “Coarse-grained descriptions of dislocation behavior,” Phil. Mag., 83, 3809–3827, 2003.
[21] T.C. Halsey and W. Toor, “Fluctuation-induced couplings between defect lines or particle chains,” J. Stat. Phys., 61, 1257–1281, 1990.
[22] J.M. Rickman and D.J. Srolovitz, “A modified local harmonic model for solids,” Phil. Mag. A, 67, 1081–1094, 1993.
[23] E. Kröner, Kontinuumstheorie der Versetzungen und Eigenspannungen, Ergeb. Angew. Math. 5, Springer-Verlag, Berlin, 1958. English translation: Continuum Theory of Dislocations and Self-Stresses, translated by I. Raasch and C.S. Hartley, United States Office of Naval Research, 1970.
[24] F.C. Frank, “On the equations of motion of crystal dislocations,” Proc. Phys. Soc., 62A, 131–134, 1949.
[25] J.D. Eshelby, “Supersonic dislocations and dislocations in dispersive media,” Proc. Phys. Soc., B69, 1013–1019, 1956.
[26] T. Mura, “Continuous distribution of dislocations,” Phil. Mag., 8, 843–857, 1963.
[27] J.D. Gunton and M. Droz, Introduction to the Theory of Metastable and Unstable States, Springer-Verlag, New York, pp. 34–42, 1983.
[28] P.C. Hohenberg and B.I. Halperin, “Theory of dynamic critical phenomena,” Rev. Mod. Phys., 49, 435–479, 1977.
[29] D.L. Holt, “Dislocation cell formation in metals,” J. Appl. Phys., 41, 3197, 1970.
[30] J.M. Rickman and J. Vinals, “Modeling of dislocation structures in materials,” Phil. Mag. A, 75, 1251, 1997.
[31] M.C. Marchetti and K. Saunders, “Viscoelasticity from a microscopic model of dislocation dynamics,” Phys. Rev. B, 66, 224113, 2002.
[32] M. Haataja, J. Müller, A.D. Rutenberg, and M. Grant, “Dislocations and morphological instabilities: continuum modeling of misfitting heteroepitaxial films,” Phys. Rev. B, 65, 165414, 2002.
[33] Y.U. Wang, Y.M. Jin, A.M. Cuitino, and A.G. Khachaturyan, “Nanoscale phase field microelasticity theory of dislocations: model and 3D simulations,” Acta Mater., 49, 1847–1857, 2001.
[34] Y.M. Jin and A.G. Khachaturyan, “Phase field microelasticity theory of dislocation dynamics in a polycrystal: model and three-dimensional simulations,” Phil. Mag. Lett., 81, 607–616, 2001.
[35] S.Y. Hu and L.-Q. Chen, “Solute segregation and coherent nucleation and growth near a dislocation – a phase-field model for integrating defect and phase microstructures,” Acta Mater., 49, 463–472, 2001.
7.15 LEVEL SET METHODS FOR SIMULATION OF THIN FILM GROWTH

Russel Caflisch and Christian Ratsch

University of California at Los Angeles, Los Angeles, CA, USA
The level set method is a general approach to numerical computation for the motion of interfaces. Epitaxial growth of a thin film can be described by the evolution of island boundaries and step edges, so that the level set method is applicable to simulation of thin film growth. In layer-by-layer growth, for example, this includes motion of the island boundaries, merger or breakup of islands, and creation of new islands. A system of size 100 × 100 nm may involve hundreds or even thousands of islands. Because it does not require smoothing or discretization of individual island boundaries, the level set method can accurately and efficiently simulate the dynamics of a system of this size. Moreover, because it does not resolve individual hopping events on the terraces or island boundaries, the level set method can take longer time steps than those of an atomistic method such as kinetic Monte Carlo (KMC). Thus the level set approach can simulate some systems that are computationally intractable for KMC.
1. The Level Set Method
The level set method is a numerical technique for computing interface motion in continuum models, first introduced in [11]. It provides a simple, accurate way of computing complex interface motion, including merger and pinchoff. This method enables calculations of interface dynamics that are beyond the capabilities of traditional analytical and numerical methods. For general references on level set methods, see the books [12, 21]. The essential idea of the method is to represent the interface as a level set of a smooth function, $\phi(x)$ – for example, the set of points where $\phi = 0$. For numerical purposes, the interface velocity is smoothly extended to all points $x$ of the domain, as $v(x)$. Then, the interface motion is captured simply by
convecting the values of the smooth function $\phi$ with the smooth velocity field $v$. Numerically, this is accomplished by solving the convection equation

$$\frac{\partial \phi}{\partial t} + \vec{v} \cdot \nabla \phi = 0 \qquad (1)$$
on a fixed, regular spatial grid. The main advantage of this approach is that interface merger or pinch off is captured without special programming logic. The merger of two disjoint level sets into one occurs naturally as this equation is solved, through smooth changes in the function $\phi(x, t)$. For example, two disjoint interface loops would be represented by a $\phi$ with two smooth humps, and their merging into a single loop is represented by the two humps of $\phi$ smoothly coming together to form a single hump. Pinch off is the reverse process. In particular, the method does not involve smoothing out of the interface. The normal component of the velocity $v = \vec{n} \cdot \vec{v}$ contains all the physical information of the simulated system, where $\vec{n}$ is the outward normal of the moving boundary and $\vec{v} \cdot \nabla \varphi = v |\nabla \varphi|$. Another advantage of the method is that the local interface geometry – normal direction, $\vec{n}$, and curvature, $\kappa$ – can be easily computed in terms of partial derivatives of $\phi$. Specifically,

$$\vec{n} = \frac{-\nabla \phi}{|\nabla \phi|}, \qquad (2)$$

$$\kappa = \nabla \cdot \vec{n} \qquad (3)$$

provide the normal direction and curvature at points on the interface.
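A minimal numerical realization of Eq. (1) with the geometric quantities of Eqs. (2) and (3) on a periodic grid might look as follows. The first-order upwind differencing and the small regularization of $|\nabla\phi|$ are simplifications chosen here for brevity; production level set codes typically use higher-order (e.g., ENO/WENO) discretizations.

```python
import numpy as np

def advect_phi(phi, vx, vy, dt, dx):
    """One first-order upwind step of d(phi)/dt + v . grad(phi) = 0
    (Eq. (1)) on a periodic 2D grid; a sketch, not a production scheme."""
    dmx = (phi - np.roll(phi, 1, axis=0)) / dx    # backward difference in x
    dpx = (np.roll(phi, -1, axis=0) - phi) / dx   # forward difference in x
    dmy = (phi - np.roll(phi, 1, axis=1)) / dx
    dpy = (np.roll(phi, -1, axis=1) - phi) / dx
    phix = np.where(vx > 0, dmx, dpx)             # upwind selection
    phiy = np.where(vy > 0, dmy, dpy)
    return phi - dt * (vx * phix + vy * phiy)

def interface_geometry(phi, dx):
    """Normal direction (Eq. (2)) and curvature (Eq. (3)) from central
    differences of phi; eps regularizes |grad(phi)| near extrema."""
    eps = 1e-12
    gx = (np.roll(phi, -1, axis=0) - np.roll(phi, 1, axis=0)) / (2 * dx)
    gy = (np.roll(phi, -1, axis=1) - np.roll(phi, 1, axis=1)) / (2 * dx)
    norm = np.sqrt(gx**2 + gy**2) + eps
    nx, ny = -gx / norm, -gy / norm
    kappa = ((np.roll(nx, -1, axis=0) - np.roll(nx, 1, axis=0)) +
             (np.roll(ny, -1, axis=1) - np.roll(ny, 1, axis=1))) / (2 * dx)
    return nx, ny, kappa
```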
2. Epitaxial Growth
Epitaxy is the growth of a thin film on a substrate in which the crystal properties of the film are inherited from those of the substrate. Since an epitaxial film can (at least in principle) grow as a single crystal without grain boundaries or other defects, this method produces crystals of the highest quality. In spite of its ideal properties, epitaxial growth is still challenging to mathematically model and numerically simulate because of the wide range of length and time scales that it encompasses, from the atomistic scale of Ångstroms and picoseconds to the continuum scale of microns and seconds. The geometry of an epitaxial surface consists of step edges and island boundaries, across which the height of the surface increases by one crystal layer, and adatoms which are weakly bound to the surface. Epitaxial growth involves deposition, diffusion and attachment of adatoms on the surface. Deposition is from an external source, such as a molecular beam. The principal dimensionless parameter (for growth at low temperature) is the ratio $D/(a^4 F)$,
in which $a$ is the lattice constant and $D$ and $F$ are the adatom diffusion coefficient and deposition flux. It is conventional to refer to this parameter as $D/F$, with the understanding that the lattice constant serves as the unit of length. Typical values for $D/F$ are in the range of $10^4$–$10^8$. The models that are typically used to describe epitaxial growth include the following. Molecular dynamics (MD) consists of Newton's equations for the motion of atoms on an energy landscape. A typical kinetic Monte Carlo (KMC) method simulates the dynamics of the epitaxial surface through the hopping of adatoms along the surface. The hopping rate comes from an Arrhenius rate of the form $e^{-E/k_B T}$, in which $E$ is the energy barrier for going from the initial to the final position of the hopping atom. Island dynamics and level set methods, the subject of this article, describe the surface through continuum scaling in the lateral directions but atomistic discreteness in the growth direction. Continuum equations approximate the surface using a smooth height function $h = h(x, y, t)$, obtained by coarse graining in all directions. Rate equations describe the surface through a set of bulk variables without spatial dependence. Within the level set approach, the union of all boundaries of islands of height $k + 1$ can be represented by the level set $\varphi = k$, for each $k$. For example, the boundaries of islands in the submonolayer regime then correspond to the set of curves $\varphi = 0$. A schematic representation of this idea is given in Fig. 1, where two islands on a substrate are shown. Growth of these islands is described by a smooth evolution of the function $\varphi$ (cf. Figs. 1(a) and (b)).
Figure 1. A schematic representation of the level-set formalism. Shown are island morphologies (left side), and the level-set function ϕ (right side) that represents this morphology.
The boundary curve $\Gamma(t)$ generally has several disjoint pieces that may evolve so as to merge (Fig. 1(c)) or split. Validation of the level set method will be detailed in this article by comparison to results from an atomistic KMC model. The KMC model employed is a simple cubic pair-bond solid-on-solid (SOS) model [24]. In this model, atoms are randomly deposited at a deposition rate $F$. Any surface atom is allowed to move to its nearest neighbor site at a rate that is determined by $r = r_0 \exp\{-(E_S + n E_N)/k_B T\}$, where $r_0$ is a prefactor which is chosen to be $10^{13}$ s$^{-1}$, $k_B$ is the Boltzmann constant, and $T$ is the surface temperature. $E_S$ and $E_N$ represent the surface and nearest neighbor bond energies, and $n$ is the number of nearest neighbors. In addition, the KMC simulations include fast edge diffusion, where singly bonded step edge atoms diffuse along the step edge of an island with a rate $D_{edge}$, to suppress roughness along the island boundaries.
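The Arrhenius rate above translates directly into code. In the sketch below, the bond energies are placeholder values chosen by us (the text does not specify them); only the prefactor $r_0 = 10^{13}$ s$^{-1}$ is as stated.

```python
import math

def hopping_rate(n_neighbors, T, E_S=1.3, E_N=0.3, r0=1e13):
    """SOS pair-bond hopping rate r = r0 * exp(-(E_S + n*E_N)/(kB*T)).
    E_S and E_N are in eV and are illustrative placeholders."""
    kB = 8.617e-5   # Boltzmann constant, eV/K
    return r0 * math.exp(-(E_S + n_neighbors * E_N) / (kB * T))

# A free adatom (n = 0) hops many orders of magnitude faster than a
# fully coordinated surface atom (n = 4):
print(hopping_rate(0, 700.0), hopping_rate(4, 700.0))
```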
3. Island Dynamics
Burton, Cabrera and Frank [5] developed the first detailed theoretical description for epitaxial growth. In this “BCF” model, the adatom density solves a diffusion equation with an equilibrium boundary condition ($\rho = \rho_{eq}$), and step edges (or island boundaries) move at a velocity determined from the diffusive flux to the boundary. Modifications of this theory were made, for example in [9], to include line tension, edge diffusion and nonequilibrium effects. These are “island dynamics” models, since they describe an epitaxial surface by the location and evolution of the island boundaries and step edges. They employ a mixture of coarse graining and atomistic discreteness, since island boundaries are represented as smooth curves that signify an atomistic change in crystal height. Adatom diffusion on the epitaxial surface is described by a diffusion equation of the form

$$\partial_t \rho - D \nabla^2 \rho = F - 2\,\frac{dN_{nuc}}{dt} \qquad (4)$$

in which the last term represents loss of adatoms due to nucleation, and desorption from the epitaxial surface has been neglected. Attachment of adatoms to the step edges and the resulting motion of the step edges are described by boundary conditions at an island boundary (or step edge) for the diffusion equation and a formula for the step-edge velocity $v$. For the boundary conditions and velocity, several different models are used. The simplest of these is

$$\rho = \rho_*, \qquad v = D \left[ \frac{\partial \rho}{\partial n} \right] \qquad (5)$$
in which the brackets indicate the difference between the value on the upper side of the boundary and the lower side. Two choices for $\rho_*$ are $\rho_* = 0$, which corresponds to irreversible aggregation in which all adatoms that hit the boundary stick to it irreversibly, and $\rho_* = \rho_{eq}$ for reversible aggregation. For the latter case, $\rho_{eq}$ is the adatom density for which there is local equilibrium between the step and the terrace [5]. Line tension and edge diffusion can be included in the boundary conditions and interface velocity, as in

$$\left.\frac{\partial \rho}{\partial n}\right|_{\pm} = D_T (\rho_\pm - \rho_*) - \mu \kappa, \qquad v = D_T\, \vec{n} \cdot [\nabla \rho] + \beta\, (\rho_*)_{ss} + \frac{\mu}{D_E}\, \kappa_{ss}, \qquad (6)$$

in which $\kappa$ is curvature, $s$ is the variable along the boundary, and $D_E$ is the coefficient for diffusion along and detachment from the boundary. Snapshots of the results from a typical level-set simulation are shown in Fig. 2. Shown are the level-set function (a) and the corresponding adatom concentration (b) obtained from solving the diffusion Eq. (4). The island boundaries that correspond to the integer levels of panel (a) are shown in (c). Dashed (solid) lines represent the boundaries of islands of height 1 (2). Comparison of panels (a) and (b) illustrates that $\rho$ is indeed zero at the island boundaries (where $\varphi$ takes an integer value). Numerical details on implementation of the level set method for thin film growth are provided in [7]. The figures in this article are taken from [17] and [15].
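As an illustration of how Eq. (4) might be advanced in time for irreversible aggregation ($\rho_* = 0$), consider the following explicit sketch. The masking of boundary grid points is a crude stand-in for the sharp-interface boundary treatment detailed in [7], and all names here are ours.

```python
import numpy as np

def adatom_step(rho, boundary_mask, F, D, dNdt, dt, dx):
    """One explicit Euler step of Eq. (4):
    d(rho)/dt = D * lap(rho) + F - 2 * dN_nuc/dt, with rho = 0 enforced
    on grid points flagged (True) as island boundaries."""
    lap = (np.roll(rho, 1, 0) + np.roll(rho, -1, 0) +
           np.roll(rho, 1, 1) + np.roll(rho, -1, 1) - 4 * rho) / dx**2
    rho = rho + dt * (D * lap + F - 2.0 * dNdt)
    rho[boundary_mask] = 0.0   # rho* = 0: irreversible aggregation
    return rho
```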
4. Nucleation and Submonolayer Growth
For the case of irreversible aggregation, a dimer (consisting of two atoms) is the smallest stable island, and the nucleation rate is

$$\frac{dN_{nuc}}{dt} = D \sigma_1 \langle \rho^2 \rangle, \qquad (7)$$

where $\langle \cdot \rangle$ denotes the spatial average of $\rho(x, t)^2$ and

$$\sigma_1 = \frac{4\pi}{\ln[(1/\alpha) \langle \rho \rangle D/F]} \qquad (8)$$
is the adatom capture number as derived in [4]. The parameter $\alpha$ reflects the island shape, with $\alpha \simeq 1$ for compact islands. Expression (7) for the nucleation rate implies that the time of a nucleation event is chosen deterministically. Whenever $N_{nuc} L^2$ passes the next integer value ($L$ is the system size), a new island is nucleated. Numerically, this is realized by raising the level-set function to the next level at a number of grid points chosen to represent a dimer.
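A sketch of this deterministic-rate, probabilistic-site nucleation rule, combining Eqs. (7) and (8), is given below. Raising $\varphi$ at a single grid point stands in for the dimer footprint, and the bookkeeping (tracking the expected island count for the whole system) is our simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

def maybe_nucleate(phi, rho, count, D, alpha, F, dt):
    """Accumulate the expected number of islands N_nuc * L^2 from
    Eqs. (7)-(8); when it passes the next integer, seed a new island at
    a site drawn with probability proportional to rho^2."""
    sigma1 = 4.0 * np.pi / np.log(rho.mean() * D / (alpha * F))  # Eq. (8)
    count += dt * D * sigma1 * (rho**2).mean() * rho.size        # Eq. (7) * L^2
    if count >= 1.0:
        w = (rho**2).ravel()
        site = rng.choice(rho.size, p=w / w.sum())  # retained fluctuation
        i, j = np.unravel_index(site, rho.shape)
        phi[i, j] += 1.0    # crude stand-in for raising phi over a dimer
        count -= 1.0
    return phi, count
```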
Figure 2. Snapshots of a typical level-set simulation. Shown are a 3D view of the level-set function (a) and the corresponding adatom concentration (b). The island boundaries as determined from the integer levels in (a) are shown in (c), where dashed (solid) lines correspond to islands of height 1 (2).
The choice of the location of the new island is determined probabilistically, with spatial density proportional to the local nucleation rate $\rho^2$. This probabilistic choice constitutes an atomistic fluctuation that must be retained in the level set model for faithful simulation of the epitaxial morphology. For growth with compact islands, computational tests have shown that additional atomistic fluctuations can be omitted [18]. Additions to the basic level set method, such as finite lattice constant effects and edge diffusion, are easily included [17]. The level set method with these corrections is in excellent agreement with the results of KMC simulations. For example, Fig. 3 shows the scaled island size distribution (ISD)
$$n_s = \frac{\theta}{s_{av}^2}\, g\!\left(\frac{s}{s_{av}}\right), \qquad (9)$$
where $n_s$ is the density of islands of size $s$, $s_{av}$ is the average island size, $\theta$ is the coverage, and $g(x)$ is a scaling function. The top panel of Fig. 3 is for irreversible attachment; the other two panels include reversibility, which will be discussed below. All three panels show excellent agreement between the results from level set simulations, KMC and experiment.
5. Multilayer Growth
In ideal layer-by-layer growth, a layer is completed before nucleation of a new layer starts. In this case, growth on subsequent layers would essentially be identical to growth on previous layers. In reality, however, nucleation on higher layers starts before the previous layer has been completed and the surface starts to roughen. This roughening transition depends on the growth conditions (i.e., temperature and deposition flux) and the material system (i.e., the value of the microscopic parameters). At the same time, the average lateral feature size increases in higher layers, which we will refer to as coarsening of the surface. These features of multilayer growth and the effectiveness of the level set method in reproducing them are illustrated in Fig. 4, which shows the island number density $N$ as a function of time for two different values of $D/F$ from both a level set simulation and from KMC. The results show near perfect agreement. The KMC results were obtained with a value for the edge diffusion that is 1/100 of the terrace diffusion constant. The island density decreases as the film height increases, which implies that the film coarsens. The surface roughness $w$ is defined as

$$w^2 = \langle (h_i - \bar{h})^2 \rangle, \qquad (10)$$

where the index $i$ labels the lattice site. Figure 5 shows the increase of surface roughness for various different values of the edge diffusion, which implies that
Figure 3. The island size distribution, as given by KMC (squares) and LS (circles) methods, in comparison with STM experiments (triangles) on Fe/Fe(001) [23]. The reversibility increases from top to bottom.
Figure 4. Island densities $N$ on each layer for $D/F = 10^6$ (lower panel) and $D/F = 10^7$ (upper panel) obtained with the level-set method and KMC simulations. For each data set there are 10 curves in the plot, corresponding to the 10 layers.
edge diffusion contributes to roughening, as also observed in KMC studies. This suggests that faster edge diffusion leads to more compact island shapes, and as a result the residence time of an atom on top of compact islands is extended. This promotes nucleation at earlier times on top of higher layers, and thus enhanced roughening. Effects of edge diffusion were included in these simulations through a term of the form $\kappa - \bar{\kappa}$ rather than $\kappa_{ss}$ as in (6).
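Equation (10) translates into a one-line helper; the check below uses a synthetic height array of our choosing.

```python
import numpy as np

def roughness(h):
    """Surface roughness w from Eq. (10): w^2 = <(h_i - hbar)^2>."""
    return np.sqrt(np.mean((h - h.mean())**2))

# Sanity check: one extra monolayer covering half the sites gives w = 0.5.
h = np.zeros((64, 64)); h[:, :32] = 1.0
print(roughness(h))   # 0.5
```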
6. Reversibility
The simulation results presented above have been for the case of irreversible aggregation. If aggregation is reversible, the KMC method must simulate a large number of events that do not affect the time-average of the system: atoms detach from existing islands, diffuse on the terrace for a short period of time, and, most of the time, reattach to the same island. These processes can slow down KMC simulations significantly. On the other hand, in a level set simulation these events can directly be replaced by their time average
Figure 5. Time evolution of the surface roughness $w$ for different values of the edge diffusion $D_{edge}$.
and therefore the simulation only needs to include detachment events that do not lead to a subsequent reattachment, making the level set method much faster than KMC. Reversibility does not necessarily depend only on purely local conditions (e.g., local bond strength) but often on more global quantities such as strain or chemical environment. Including these kinds of effects is a rather hard task in a KMC simulation, but they can be quite naturally included in a mean field picture. Reversibility can be included in the level set method using the boundary conditions (5) with $\rho_* = \rho_{eq}$, in which $\rho_{eq}$ depends on the local environment of the island, in particular the edge atom density [6]. For islands consisting of only a few atoms, however, the stochastic nature of detachment becomes relevant and is included through random detachment and breakup for small islands, as detailed in [14]. Figure 3 shows that the level set method with reversibility nicely reproduces the trends in the scaled ISD found in the KMC simulations and experiment. In particular, the scaled ISD depends only on the degree of reversibility, and it narrows and sharpens in agreement with the earlier prediction of [19].
Figure 6. Time dependence (in seconds) of the average island radius $\bar{R}$ (in units of the lattice constant) for two different coverages on a log–log plot. The straight lines have slope 1/3, which was the theoretical prediction.
In [15], the level set method with reversibility was used to determine the long-time asymptotics of Ostwald ripening. A similar computation was performed in [8]. Figure 6 shows that the average island size $\bar{R}$ grows as $t^{1/3}$, which was an earlier theoretical prediction. Because reversibility greatly increases the number of hopping events and thus lowers the time step for an atomistic computation, KMC simulations have been unable to reach this asymptotic regime. The longer time steps in the level set simulation give it a significant advantage over KMC for this problem.
7. Hybrid Methods and Additional Applications
As described above, the level set method does not include island boundary roughness or fractal island shapes, which can be significant in some applications. One way of incorporating boundary roughness is by including additional state variables $\phi$ for the density of edge atoms and $k$ for the density of kinks along an island boundary or step edge. A detailed step edge model was derived
in [6] and used in the determination of $\rho_{eq}$ for the level set method with reversibility. While adequate for simulating reversibility, this approach will not extend to fractal island shapes. A promising alternative is a hybrid method that combines island dynamics with KMC; e.g., the adatom density is evolved through diffusion of a continuum density function, but attachment at island boundaries is performed by Monte Carlo [20]. In a different approach [10], where diffusion is described and the adatom density is evolved by explicit solution of the master equation, the atoms are resolved explicitly only once they attach to an island boundary. While these methods do not use a level set function, they are sufficiently similar to the method discussed here to warrant mention in this discussion. Level set methods have been used for a number of thin film growth problems that are related to the applications described above. In [22] a level set method was used to describe spiral growth in epitaxy. A general level set approach to material processing problems, including etching, deposition and lithography, was developed in [1], [2] and [3]. A similar method was used in [13] for deposition in trenches and vias.
8. Outlook
The simulations described above have established the validity of the level set method for simulation of epitaxial growth. Moreover, the level set method makes possible simulations that would be intractable for atomistic methods such as KMC. This method can now be used with confidence in many applications that include epitaxy along with additional phenomena and physics. Examples that seem promising for future developments include strain, faceting and surface chemistry: Elastic strain is generated in heteroepitaxial growth due to lattice mismatch between the substrate and the film. It modifies the material properties and surface morphology, leading to many interesting growth phenomena such as quantum dot formation. Strained growth could be simulated by combining an elasticity solver with the level set method, and this would have significant advantages over KMC simulations for strained growth. Faceting occurs in many epitaxial systems, e.g., corrugated surfaces and quantum dots, and can be an important factor in the energy balance that determines the kinetic pathways for growth and structure. The coexistence of different facets can be represented in a level set formulation using two level set functions, one for crystal height and the second to mark the boundaries between adjacent facets [16]. Determination of the velocity for a facet boundary, as well for the nucleation of new facets, should be performed using energetic arguments. Similarly, surface chemistry such as the effects of different surface reconstructions could in principle be represented using two level set functions.
References

[1] D. Adalsteinsson and J.A. Sethian, “A level set approach to a unified model for etching, deposition, and lithography. 1. Algorithms and two-dimensional simulations,” J. Comp. Phys., 120, 128–144, 1995.
[2] D. Adalsteinsson and J.A. Sethian, “A level set approach to a unified model for etching, deposition, and lithography. 2. 3-dimensional simulations,” J. Comp. Phys., 122, 348–366, 1995.
[3] D. Adalsteinsson and J.A. Sethian, “A level set approach to a unified model for etching, deposition, and lithography. 3. Redeposition, reemission, surface diffusion, and complex simulations,” J. Comp. Phys., 138, 193–223, 1997.
[4] G.S. Bales and D.C. Chrzan, “Dynamics of irreversible island growth during submonolayer epitaxy,” Phys. Rev. B, 50, 6057–6067, 1994.
[5] W.K. Burton, N. Cabrera, and F.C. Frank, “The growth of crystals and the equilibrium structure of their surfaces,” Phil. Trans. Roy. Soc. London Ser. A, 243, 299–358, 1951.
[6] R.E. Caflisch, W. E, M. Gyure, B. Merriman, and C. Ratsch, “Kinetic model for a step edge in epitaxial growth,” Phys. Rev. E, 59, 6879–6887, 1999.
[7] S. Chen, M. Kang, B. Merriman, R.E. Caflisch, C. Ratsch, R. Fedkiw, M.F. Gyure, and S. Osher, “Level set method for thin film epitaxial growth,” J. Comp. Phys., 167, 475–500, 2001.
[8] D.L. Chopp, “A level-set method for simulating island coarsening,” J. Comp. Phys., 162, 104–122, 2000.
[9] B. Li and R.E. Caflisch, “Analysis of island dynamics in epitaxial growth,” Multiscale Model. Sim., 1, 150–171, 2002.
[10] L. Mandreoli, J. Neugebauer, R. Kunert, and E. Schöll, “Adatom density kinetic Monte Carlo: a hybrid approach to perform epitaxial growth simulations,” Phys. Rev. B, 68, 155429, 2003.
[11] S. Osher and J.A. Sethian, “Front propagation with curvature dependent speed: algorithms based on Hamilton–Jacobi formulations,” J. Comp. Phys., 79, 12–49, 1988.
[12] S.J. Osher and R.P. Fedkiw, Level Set Methods and Dynamic Implicit Surfaces, Springer Verlag, New York, 2002.
[13] P.L. O'Sullivan, F.H. Baumann, G.H. Gilmer, J.D. Torre, C.S. Shin, I. Petrov, and T.Y. Lee, “Continuum model of thin film deposition incorporating finite atomic length scales,” J. Appl. Phys., 92, 3487–3494, 2002.
[14] M. Petersen, C. Ratsch, R.E. Caflisch, and A. Zangwill, “Level set approach to reversible epitaxial growth,” Phys. Rev. E, 64, 061602, 2001.
[15] M. Petersen, A. Zangwill, and C. Ratsch, “Homoepitaxial Ostwald ripening,” Surf. Sci., 536, 55–60, 2003.
[16] C. Ratsch, C. Anderson, R.E. Caflisch, L. Feigenbaum, D. Shaevitz, M. Sheffler, and C. Tiee, “Multiple domain dynamics simulated with coupled level sets,” Appl. Math. Lett., 16, 1165–1170, 2003.
[17] C. Ratsch, M.F. Gyure, R.E. Caflisch, F. Gibou, M. Petersen, M. Kang, J. Garcia, and D.D. Vvedensky, “Level-set method for island dynamics in epitaxial growth,” Phys. Rev. B, 65, 195403, 2002.
[18] C. Ratsch, M.F. Gyure, S. Chen, M. Kang, and D.D. Vvedensky, “Fluctuations and scaling in aggregation phenomena,” Phys. Rev. B, 61, 10598–10601, 2000.
[19] C. Ratsch, P. Smilauer, A. Zangwill, and D.D. Vvedensky, “Submonolayer epitaxy without a critical nucleus,” Surf. Sci., 329, L599–L604, 1995.
[20] G. Russo, L. Sander, and P. Smereka, “A hybrid Monte Carlo method for surface growth simulations,” preprint, 2003.
[21] J.A. Sethian, Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science, Cambridge U. Press, Cambridge, 1999.
[22] P. Smereka, “Spiral crystal growth,” Physica D, 138, 282–301, 2000.
[23] J.A. Stroscio and D.T. Pierce, “Scaling of diffusion-mediated island growth in iron-on-iron homoepitaxy,” Phys. Rev. B, 49, 8522–8525, 1994.
[24] D.D. Vvedensky, “Atomistic modeling of epitaxial growth: comparisons between lattice models and experiment,” Comp. Materials Sci., 6, 182–187, 1996.
7.16 STOCHASTIC EQUATIONS FOR THIN FILM MORPHOLOGY Dimitri D. Vvedensky Imperial College, London, United Kingdom
Many physical phenomena can be modeled as particles on a lattice that interact according to a set of prescribed rules. Such systems are called "lattice gases". Examples include the non-equilibrium statistical mechanics of driven systems [1, 2], cellular automata [3, 4], and interface fluctuations of growing surfaces [5, 6]. The dynamics of lattice gases are generated by transition rates for site occupancies that are determined by the occupancies of neighboring sites at the preceding time step. This provides the basis for a multi-scale approach to non-equilibrium systems in that atomistic processes are expressed as transition rates in a master equation, while a partial differential equation, derived from this master equation, embodies the macroscopic evolution of the coarse-grained system. There are many advantages to a continuum representation of the dynamics of a lattice system: (i) the vast analytic methodology available for identifying asymptotic scaling regimes and performing stability analyses; (ii) extensive libraries of numerical methods for integrating deterministic and stochastic differential equations; (iii) the extraction of macroscopic properties by coarse-graining the microscopic equations of motion, which, in particular, enables (iv) the discrimination of inherently atomistic effects from those that find a natural expression in a coarse-grained framework; (v) the qualitative behavior of a lattice model, which is more readily discernible from a continuum representation than from its transition rules, which (vi) helps to establish connections between different models and thereby facilitates the transfer of concepts and methods across disciplines; and (vii) the ability to examine the effect of apparently minor modifications to the transition rules on the coarse-grained evolution which, in turn, facilitates the systematic reduction of full models to their essential components.
1. Master Equation
The following discussion is confined to one-dimensional systems to demonstrate the essential elements of the methodology without the formal complications introduced by higher-dimensional lattices. Every site i of the lattice has a column of h_i atoms, so every configuration H is specified completely by the array H = {h_1, h_2, . . .}. The system evolves from an initial configuration according to transition rules that describe processes such as particle deposition and relaxation, surface diffusion, and desorption. The probability P(H, t) of configuration H at time t is a solution of the master equation [7],

$$\frac{\partial P}{\partial t} = \sum_{r}\bigl[W(H-r;\,r)\,P(H-r,t) - W(H;\,r)\,P(H,t)\bigr], \tag{1}$$
where W(H; r) is the transition rate from H to H + r, r = {r_1, r_2, . . .} is the array of all jump lengths r_i, and the summation over r is the joint summation over all the r_i. For particle deposition, H and H + r differ by the addition of one particle to a single column. In the simplest case, random deposition, the deposition site is chosen randomly and the transition rate is

$$W(H;r) = \frac{1}{\tau_0}\sum_i \delta(r_i,1)\prod_{j\neq i}\delta(r_j,0), \tag{2}$$
where τ_0^{-1} is the deposition rate and δ(i, j) is the Kronecker delta. A particle may also relax immediately upon arrival on the substrate to a nearby site within a fixed range according to some criterion. The two most common relaxation rules are based on identifying the local height minimum, which leads to the Edwards–Wilkinson equation, and the local coordination maximum, i.e., the site with the greatest number of lateral nearest neighbors, which is known as the Wolf–Villain model [5]. If the search range extends only to nearest neighbors, the transition rate becomes

$$W(H;r) = \frac{1}{\tau_0}\sum_i\left[w_i^{(1)}\,\delta(r_i,1)\prod_{j\neq i}\delta(r_j,0) + w_i^{(2)}\,\delta(r_{i-1},1)\prod_{j\neq i-1}\delta(r_j,0) + w_i^{(3)}\,\delta(r_{i+1},1)\prod_{j\neq i+1}\delta(r_j,0)\right], \tag{3}$$
where the w_i^{(k)} embody the rules that determine the final deposition site. The sum rule

$$w_i^{(1)} + w_i^{(2)} + w_i^{(3)} = 1 \tag{4}$$
expresses the requirement that the deposition rate per site is τ_0^{-1}. The transition rate for the hopping of a particle from a site i to a site j is

$$W(H;r) = k_0\sum_{ij} w_{ij}\,\delta(r_i,-1)\,\delta(r_j,1)\prod_{k\neq i,j}\delta(r_k,0), \tag{5}$$

where k_0 is the hopping rate and the w_{ij} contain the hopping rules. Typically, hopping is considered between nearest neighbors (j = i ± 1).
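As a concrete illustration of this bookkeeping, the following minimal Python sketch (ours, not from the article) enumerates the deposition and hopping transitions of Eqs. (2) and (5) on a one-dimensional periodic lattice and realizes them as a kinetic Monte Carlo process; the unbiased hopping rule w_ij = 1/2 and the rate values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmc_step(h, tau0=1.0, k0=0.2):
    """One lattice-gas move: enumerate the transitions of Eqs. (2) and (5),
    pick one with probability proportional to its rate, advance the clock."""
    L = len(h)
    rates, moves = [], []
    for i in range(L):                       # deposition, Eq. (2)
        rates.append(1.0 / tau0)
        moves.append(("dep", i, i))
    for i in range(L):                       # hopping, Eq. (5)
        if h[i] > 0:                         # only occupied columns can hop
            for j in ((i - 1) % L, (i + 1) % L):
                rates.append(0.5 * k0)       # w_ij = 1/2: unbiased hops
                moves.append(("hop", i, j))
    rates = np.asarray(rates)
    total = rates.sum()
    kind, i, j = moves[rng.choice(len(rates), p=rates / total)]
    if kind == "dep":
        h[i] += 1
    else:
        h[i] -= 1
        h[j] += 1
    return -np.log(1.0 - rng.random()) / total   # exponential waiting time

h = np.zeros(16, dtype=int)
t = 0.0
while t < 5.0:
    t += kmc_step(h)
print(h.tolist(), round(t, 3))
```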
2. Lattice Langevin Equation
Master equations provide the same statistical information as kinetic Monte Carlo (KMC) simulations [8], and so are not generally amenable to analytic solution. Accordingly, we will use a Kramers–Moyal–Van Kampen expansion [7] of the master equation to obtain an equation of motion that is a more manageable starting point for detailed analysis. This requires expanding the first term on the right-hand side of Eq. (1) which, in turn, relies on two criteria. The first is that W is a sharply peaked function of r in that there is a quantity δ > 0 such that W(H; r) ≈ 0 for |r| > δ. For the transition rates in Eqs. (2), (3) and (5), this "small jump" condition is fulfilled because the difference between successive configurations is at most a single unit on one site (for deposition) or two sites (for hopping). The second condition is that W is a slowly varying function of H, i.e.,

$$W(H + \Delta H;\, r) \approx W(H;\, r) \qquad \text{for} \qquad |\Delta H| < \delta. \tag{6}$$

In most growth models, the transition rules are based on comparing neighboring column heights to determine, for example, local height minima or coordination maxima, as discussed above. Thus, an arbitrarily small change in the height of a particular column can lead to an abrupt change in the transition rate at a site, in clear violation of Eq. (6). Nevertheless, this condition can be accommodated by replacing the unit jumps in Eqs. (2), (3) and (5) with rescaled jumps of size Ω^{-1}, where Ω is a "largeness" parameter that controls the magnitude of the intrinsic fluctuations. The time is then rescaled as t → τ = t/Ω to preserve the original transition rates. The transformed master equation reads

$$\frac{\partial P}{\partial \tau} = \Omega \int \left[\widetilde{W}(H-r;\,r)\,P(H-r,\tau) - \widetilde{W}(H;\,r)\,P(H,\tau)\right] dr, \tag{7}$$
where the transition rates $\widetilde{W}$ corresponding to those in Eqs. (2), (3) and (5) are given by

$$\widetilde{W}(H;r) = \tau_0^{-1}\sum_i \delta\!\left(r_i - \Omega^{-1}\right)\prod_{j\neq i}\delta(r_j), \tag{8}$$

$$\widetilde{W}(H;r) = \tau_0^{-1}\sum_i\left[w_i^{(1)}\,\delta\!\left(r_i - \Omega^{-1}\right)\prod_{j\neq i}\delta(r_j) + w_i^{(2)}\,\delta\!\left(r_{i-1} - \Omega^{-1}\right)\prod_{j\neq i-1}\delta(r_j) + w_i^{(3)}\,\delta\!\left(r_{i+1} - \Omega^{-1}\right)\prod_{j\neq i+1}\delta(r_j)\right], \tag{9}$$

$$\widetilde{W}(H;r) = k_0\sum_{ij} w_{ij}\,\delta\!\left(r_i + \Omega^{-1}\right)\delta\!\left(r_j - \Omega^{-1}\right)\prod_{k\neq i,j}\delta(r_k), \tag{10}$$
in which δ(x) is the Dirac δ-function. The central quantities for extracting a Langevin equation from the master equation in Eq. (7) are the moments of $\widetilde{W}$:

$$K_i^{(1)}(H) = \Omega\int r_i\,\widetilde{W}(H;r)\,dr \sim O(1), \tag{11}$$

$$K_{ij}^{(2)}(H) = \Omega\int r_i r_j\,\widetilde{W}(H;r)\,dr \sim O(\Omega^{-1}), \tag{12}$$

and, in general, K^{(n)} ∼ O(Ω^{1-n}). With these orderings in Ω, a limit theorem due to Kurtz [9] states that, as Ω → ∞, the solution of the master equation (1) is approximated, with an error of O(ln Ω/Ω), by that of the Langevin equation

$$\frac{dh_i}{d\tau} = K_i^{(1)}(H) + \eta_i, \tag{13}$$

where the η_i are Gaussian noises that have zero mean, ⟨η_i(τ)⟩ = 0, and covariance

$$\langle \eta_i(\tau)\,\eta_j(\tau')\rangle = K_{ij}^{(2)}(H)\,\delta(\tau - \tau'). \tag{14}$$
The solutions of this stochastic equation of motion are statistically equivalent to those of the master equation (1).
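When each transition changes the occupancy of only one site, the covariance in Eq. (14) is site-diagonal and Eqs. (13)–(14) can be integrated directly by the Euler–Maruyama scheme. A minimal sketch (ours), assuming user-supplied functions for the first moment and the diagonal of the second:

```python
import numpy as np

def integrate_langevin(h0, K1, K2diag, dtau, nsteps, seed=0):
    """Euler-Maruyama integration of Eq. (13), dh_i/dtau = K1_i + eta_i,
    for noises with the site-diagonal covariance of Eq. (14):
    <eta_i(tau) eta_j(tau')> = K2diag_i delta_ij delta(tau - tau')."""
    rng = np.random.default_rng(seed)
    h = np.array(h0, dtype=float)
    for _ in range(nsteps):
        drift = K1(h)
        var = np.maximum(K2diag(h), 0.0)   # guard against round-off negatives
        h = h + drift * dtau + np.sqrt(var * dtau) * rng.standard_normal(h.size)
    return h

# e.g., random deposition with tau0 = 1: both moments are unity at every site
out = integrate_langevin(np.zeros(8), np.ones_like, np.ones_like,
                         dtau=0.01, nsteps=1000)
print(out)
```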
3. The Edwards–Wilkinson Model
There are several applications of the Langevin equation (13). If the occupancy of only a single site is changed with each transition, the correlation
matrix in Eq. (14) is site-diagonal, in which case the numerical integration of Eq. (13) provides a practical alternative to KMC simulations. More important for our purposes, however, is that this equation can be used as a starting point for coarse-graining to extract the macroscopic properties produced by the transition rules. We consider the Edwards–Wilkinson model as an example. The Edwards–Wilkinson model [10], originally proposed as a continuum equation for sedimentation, is one of the standard models used to investigate morphological evolution during surface growth. There are several atomistic realizations of this model, but all are based on identifying the minimum height or heights near a randomly chosen site. In the version we study here, a particle incident on a site remains there only if its height is less than or equal to that of both nearest neighbors. If only one nearest neighbor column is lower than that of the original site, deposition is onto that site. However, if both nearest neighbor columns are lower than that of the original site, the deposition site is chosen randomly between the two. The transition rates in Eq. (3) are obtained by applying these relaxation rules to local height configurations. These configurations can be tabulated by using the step function
$$\theta(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases} \tag{15}$$
to express the pertinent relative heights between nearest neighbors as an identity:
$$\bigl[\theta(h_{i-1}-h_i) + \bar\theta(h_{i-1}-h_i)\bigr]\bigl[\theta(h_{i+1}-h_i) + \bar\theta(h_{i+1}-h_i)\bigr] = 1, \tag{16}$$

where $\bar\theta(h_i - h_j) = 1 - \theta(h_i - h_j)$. The expansion of this identity produces four configurations, which are shown in Fig. 1 together with the deposition rules described above. Each of these is assigned to one of the $w_i^{(j)}$, so the sum rule in Eq. (4) is satisfied by construction, and we obtain the following expressions:

$$w_i^{(1)} = \theta(h_{i-1}-h_i)\,\theta(h_{i+1}-h_i),$$

$$w_i^{(2)} = \theta(h_{i+1}-h_i)\bigl[1-\theta(h_{i-1}-h_i)\bigr] + \tfrac{1}{2}\bigl[1-\theta(h_{i-1}-h_i)\bigr]\bigl[1-\theta(h_{i+1}-h_i)\bigr],$$

$$w_i^{(3)} = \theta(h_{i-1}-h_i)\bigl[1-\theta(h_{i+1}-h_i)\bigr] + \tfrac{1}{2}\bigl[1-\theta(h_{i-1}-h_i)\bigr]\bigl[1-\theta(h_{i+1}-h_i)\bigr]. \tag{17}$$
The lattice Langevin equation for the Edwards–Wilkinson model is, therefore, from Eq. (13), given by

$$\frac{dh_i}{d\tau} = \frac{1}{\tau_0}\left[w_i^{(1)} + w_{i+1}^{(2)} + w_{i-1}^{(3)}\right] + \eta_i, \tag{18}$$

where the η_i have mean zero and covariance

$$\langle\eta_i(\tau)\,\eta_j(\tau')\rangle = \frac{1}{\tau_0}\left[w_i^{(1)} + w_{i+1}^{(2)} + w_{i-1}^{(3)}\right]\delta_{ij}\,\delta(\tau-\tau'). \tag{19}$$

Figure 1. The relaxation rules of the Edwards–Wilkinson model. The rule in (a) corresponds to w_i^{(1)}, those in (b) and (d) to w_i^{(2)}, and those in (c) and (d) to w_i^{(3)}. The broken lines indicate sites where greater heights do not affect the deposition site.
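For the Edwards–Wilkinson rules this prescription is completely explicit. The sketch below (ours; it assumes Ω = 1 and τ_0 = 1) evaluates the weights of Eq. (17) and the drift of Eq. (18), and integrates the lattice Langevin equation using the fact that, by Eq. (19), the local noise variance equals the drift:

```python
import numpy as np

def theta(x):
    """Step function of Eq. (15), acting elementwise."""
    return (x >= 0).astype(float)

def ew_weights(h):
    """Deposition weights of Eq. (17) on a periodic one-dimensional lattice."""
    tm = theta(np.roll(h, 1) - h)    # theta(h_{i-1} - h_i)
    tp = theta(np.roll(h, -1) - h)   # theta(h_{i+1} - h_i)
    w1 = tm * tp
    w2 = tp * (1 - tm) + 0.5 * (1 - tm) * (1 - tp)
    w3 = tm * (1 - tp) + 0.5 * (1 - tm) * (1 - tp)
    return w1, w2, w3

def ew_drift(h, tau0=1.0):
    """K^(1) of Eq. (18): (w1_i + w2_{i+1} + w3_{i-1}) / tau0."""
    w1, w2, w3 = ew_weights(h)
    return (w1 + np.roll(w2, -1) + np.roll(w3, 1)) / tau0

# Euler-Maruyama steps of Eqs. (18)-(19); by Eq. (19) the local noise
# variance equals the drift, since both are the local deposition rate.
rng = np.random.default_rng(1)
h, dtau = np.zeros(100), 0.01
for _ in range(10_000):
    k1 = ew_drift(h)
    h = h + k1 * dtau + np.sqrt(k1 * dtau) * rng.standard_normal(h.size)
print(np.sqrt(np.mean(h**2) - np.mean(h)**2))   # rms surface width
```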
The statistical equivalence of solutions of this Langevin equation and those of the master equation, as determined by KMC simulations, can be demonstrated by examining correlation functions of the heights. One such quantity is the surface roughness, defined as the root-mean-square of the heights,
$$W(L,t) = \left[\langle h^2(t)\rangle - \langle h(t)\rangle^2\right]^{1/2}, \tag{20}$$
where ⟨h^k(t)⟩ = L^{-1} Σ_i h_i^k(t) for k = 1, 2, and L is the length of the substrate. For sufficiently long times and large substrate sizes, W is observed to conform to the dynamical scaling hypothesis [5], W(L,t) ∼ L^α f(t/L^z), where f(x) ∼ x^β for x ≪ 1 and f(x) → constant for x ≫ 1, α is the roughness exponent, z = α/β is the dynamic exponent, and β is the growth exponent. The comparison of W(L,t) obtained from KMC simulations with that computed from the Langevin equation in (18) is shown in Fig. 2 for systems
Figure 2. Surface roughness obtained from the lattice Langevin Eq. (18) and KMC simulations for systems of size L = 100 and 1000 for the indicated values of Ω: (a) L = 100, with Ω = 1, 2, 50; (b) L = 1000, with Ω = 1, 20. Data sets for L = 100 were averaged over 200 independent realizations. Those for L = 1000 were obtained from a single realization. The time is measured in units of monolayers (ML) deposited. Figure courtesy of A.L.-S. Chua.
of lengths L = 100 and 1000, each for several values of Ω. Most apparent is that the roughness increases with time, a phenomenon known as "kinetic roughening" [5], prior to a system-size-dependent saturation. The roughness obtained from the Langevin equation is greater than that of the KMC simulation at all times, but with the difference decreasing with increasing Ω. The greater roughness is due, in large part, to the noise in Eq. (19): the variance includes information about nearest neighbors, but the noise is uncorrelated between sites. Thus, as the lattice is scanned, the uncorrelated noise produces a larger variance in the heights than the simulations. But even apart from the rougher growth front, the discrepancies for smaller Ω are appreciable. For L = 100 and Ω = 1, 2, the saturation of the roughness is delayed to later times and the slope prior to saturation differs markedly from that of the KMC simulation. There are remnants of these discrepancies for L = 1000, though the slope of the roughness does approach the correct value at sufficiently long times even for Ω = 1.
4. Coarse-grained Equations of Motion
The non-analyticity of the step functions in Eq. (17), which reflects the threshold character of the relative column heights on neighboring sites, presents a major obstacle to coarse graining the lattice Langevin equation in Eq. (18), as well as those corresponding to other growth models [11, 12]. To address this problem, we begin by observing that θ(x) is required only at the discrete values h k±1 − h k = n, where n is an integer. Thus, we are free to interpolate between these points at our convenience. Accordingly, we use the following representation of θ(x) [13]:
$$\theta(x) = \lim_{\Delta\to 0^+} \Delta \ln\!\left(\frac{e^{(x+1)/\Delta}+1}{e^{x/\Delta}+1}\right). \tag{21}$$
For finite Δ, the right-hand side of this expression is a smooth function that represents a regularization of the step function (Fig. 3). This regularization can be expanded as a Taylor series in x and, to second order, we obtain

$$\theta(x) = A + \frac{Bx}{2} - \frac{B^2 x^2}{8\Delta} - \frac{C x^3}{6\Delta^2} + \cdots, \tag{22}$$

where

$$A = \Delta \ln\!\left[\tfrac{1}{2}\bigl(1+e^{1/\Delta}\bigr)\right], \qquad B = \frac{e^{1/\Delta}-1}{e^{1/\Delta}+1}, \qquad C = \frac{e^{1/\Delta}\bigl(e^{1/\Delta}-1\bigr)}{\bigl(e^{1/\Delta}+1\bigr)^3}. \tag{23}$$

As Δ → 0, A → 1 − Δ ln 2 + · · · , B → 1, and C → 0.
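The regularization (21) and the constants (23) are straightforward to verify numerically. A small sketch (ours) comparing A and B against finite-difference values of θ(0) and θ′(0):

```python
import numpy as np

def theta_reg(x, D):
    """Regularized step function of Eq. (21) at finite Delta = D."""
    return D * np.log((np.exp((x + 1.0) / D) + 1.0) / (np.exp(x / D) + 1.0))

for D in (1.0, 0.5, 0.25):
    A = D * np.log(0.5 * (1.0 + np.exp(1.0 / D)))          # Eq. (23)
    B = (np.exp(1.0 / D) - 1.0) / (np.exp(1.0 / D) + 1.0)  # Eq. (23)
    eps = 1e-6
    A_num = theta_reg(0.0, D)                    # theta(0) = A by Eq. (22)
    # The linear coefficient in Eq. (22) is B/2, so the symmetric difference
    # (theta(eps) - theta(-eps)) / eps = 2 theta'(0) recovers B:
    B_num = (theta_reg(eps, D) - theta_reg(-eps, D)) / eps
    print(D, A - A_num, B - B_num)   # both differences are essentially zero
```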
Figure 3. The regularization in (21) for Δ = 1, 0.5, and 0.25, showing how, with decreasing Δ, the step function (shown emboldened) is recovered.
We now introduce the coarse-grained space and time variables x = εi and t = ε^z τ/τ_0, where z is to be determined and ε parametrizes the extent of the coarse-graining, with ε = 1 corresponding to a smoothed lattice model (with no coarse-graining) and ε → 0 corresponding to the continuum limit. The coarse-grained height function u is

$$u(x,t) = \varepsilon^{\alpha}\left(h_i - \frac{\tau}{\tau_0}\right), \tag{24}$$
where α is to be determined and τ/τ_0 is the average growth rate. Upon applying these transformations and the expansion in Eq. (22) to Eqs. (18) and (19), we obtain the following leading terms in the equation of motion:

$$\varepsilon^{z-\alpha}\frac{\partial u}{\partial t} = \nu\,\varepsilon^{2-\alpha}\frac{\partial^2 u}{\partial x^2} + K\,\varepsilon^{4-\alpha}\frac{\partial^4 u}{\partial x^4} + \lambda_1\,\varepsilon^{4-2\alpha}\frac{\partial^2}{\partial x^2}\!\left(\frac{\partial u}{\partial x}\right)^{\!2} + \lambda_2\,\varepsilon^{4-3\alpha}\frac{\partial}{\partial x}\!\left(\frac{\partial u}{\partial x}\right)^{\!3} + \cdots + \varepsilon^{(1+z)/2}\,\xi, \tag{25}$$

where

$$\nu = B, \qquad K = \tfrac{1}{12}(4-3A), \qquad \lambda_1 = \frac{B^2}{8} - \frac{B^2}{8}(1-A), \qquad \lambda_2 = -\frac{C}{3}, \tag{26}$$

and ξ is a Gaussian noise with mean zero and covariance

$$\langle \xi(x,t)\,\xi(x',t')\rangle = \delta(x-x')\,\delta(t-t'). \tag{27}$$
The most direct approach to the continuum limit is obtained by requiring (i) that the coefficients of u_t, u_{xx}, and ξ have the same scale in ε and (ii) that these are the dominant terms as ε → 0. The first of these necessitates setting z = 2 and α = 1/2. To satisfy condition (ii), we first write Δ = ε^δ. A lower bound of the scale of the nth-order term in the expansion in Eq. (25) can be estimated from Eq. (22) as

$$\Delta^{1-n}\left(\frac{\partial h}{\partial x}\right)^{\!n} \sim \varepsilon^{\,n(1-\alpha)-(n-1)\delta} = \varepsilon^{\,\frac{1}{2}n-(n-1)\delta}. \tag{28}$$

This yields the condition δ < 1/2, and satisfies condition (ii) for λ_1 and λ_2 as well. Thus, in the limit ε → 0, we obtain the Edwards–Wilkinson equation:

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \xi. \tag{29}$$
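Equation (29) is the stochastic heat equation driven by space–time white noise and can be integrated by standard finite differences. A minimal sketch (ours; the grid spacing, time step, and system size are illustrative choices):

```python
import numpy as np

def ew_continuum(L=128, dx=1.0, dt=0.05, steps=20_000, seed=2):
    """Explicit finite-difference, Euler-Maruyama integration of Eq. (29),
    u_t = u_xx + xi. Space-time white noise is discretized as independent
    Gaussian deviates of variance 1/(dx*dt) per grid cell and time step."""
    rng = np.random.default_rng(seed)
    u = np.zeros(L)
    for _ in range(steps):
        lap = (np.roll(u, 1) - 2.0 * u + np.roll(u, -1)) / dx**2
        xi = rng.standard_normal(L) / np.sqrt(dx * dt)
        u = u + dt * (lap + xi)
    return u

u = ew_continuum()
# Saturated width grows with system size as L**(1/2), i.e., alpha = 1/2.
print(np.sqrt(np.mean(u**2) - np.mean(u)**2))
```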
The method used to obtain this equation can be applied to other models and in higher spatial dimensions. There have been several simulation studies of the Edwards–Wilkinson [14] and Wolf–Villain [15, 16] models that suggest intriguing and unexpected behavior that is not present for one-dimensional substrates. Taking a broader perspective, if a direct coarse-graining transformation is not suitable, our method can be used to generate an equation of motion as the initial condition for a subsequent renormalization group analysis. This will provide the basis for an understanding of continuum growth models as the natural expression of particular atomistic processes.
5. Outlook
There are many phenomena in science and engineering that involve a disparity of length and time scales [17]. As a concrete example from materials science, the formation of dislocations within a material (atomic scale) and their mobility across grain boundaries of the microstructure ("mesoscopic" scale) are important factors for the deformation behavior of the material (macroscopic scale). A complete understanding of mechanical properties thus requires theoretical and computational tools that range from the atomic-scale detail of density functional methods to the more coarse-grained picture provided by continuum elasticity theory. One approach to addressing such problems is a systematic analytic and/or numerical coarse-graining of the equations of motion for one range of length and time scales to obtain equations of motion that are valid over much longer length and time scales. A number of approaches in this direction have already been taken. Since driven lattice models are simple examples of atomic-scale systems, the approach described here may serve as a paradigm for such efforts.
References
[1] C. Godrèche (ed.), Solids far from Equilibrium, Cambridge University Press, Cambridge, England, 1992.
[2] H.J. Jensen, Self-Organized Criticality: Emergent Complex Behavior in Physical and Biological Systems, Cambridge University Press, Cambridge, England, 2000.
[3] S. Wolfram (ed.), Theory and Applications of Cellular Automata, World Scientific, Singapore, 1986.
[4] G.D. Doolen (ed.), Lattice Gas: Theory, Application, and Hardware, MIT Press, Cambridge, MA, 1991.
[5] A.-L. Barabási and H.E. Stanley, Fractal Concepts in Surface Growth, Cambridge University Press, Cambridge, England, 1995.
[6] J. Krug, "Origins of scale invariance in growth processes," Adv. Phys., 46, 139–282, 1997.
[7] N.G. Van Kampen, Stochastic Processes in Physics and Chemistry, North-Holland, Amsterdam, 1981.
[8] M.E.J. Newman and G.T. Barkema, Monte Carlo Methods in Statistical Physics, Oxford University Press, Oxford, England, 1999.
[9] R.F. Fox and J. Keizer, "Amplification of intrinsic fluctuations by chaotic dynamics in physical systems," Phys. Rev. A, 43, 1709–1720, 1991.
[10] S.F. Edwards and D.R. Wilkinson, "The surface statistics of a granular aggregate," Proc. R. Soc. London Ser. A, 381, 17–31, 1982.
[11] D.D. Vvedensky, A. Zangwill, C.N. Luse, and M.R. Wilby, "Stochastic equations of motion for epitaxial growth," Phys. Rev. E, 48, 852–862, 1993.
[12] M. Předota and M. Kotrla, "Stochastic equations for simple discrete models of epitaxial growth," Phys. Rev. E, 54, 3933–3942, 1996.
[13] D.D. Vvedensky, "Edwards–Wilkinson equation from lattice transition rules," Phys. Rev. E, 67, 025102(R), 2003.
[14] S. Pal, D.P. Landau, and K. Binder, "Dynamical scaling of surface growth in simple lattice models," Phys. Rev. E, 68, 021601, 2003.
[15] M. Kotrla and P. Šmilauer, "Nonuniversality in models of epitaxial growth," Phys. Rev. B, 53, 13777–13792, 1996.
[16] S. Das Sarma, P.P. Chatraphorn, and Z. Toroczkai, "Universality class of discrete solid-on-solid limited mobility nonequilibrium growth models for kinetic surface roughening," Phys. Rev. E, 65, 036144, 2002.
[17] D.D. Vvedensky, "Multiscale modelling of nanostructures," J. Phys.: Condens. Matter, 16, R1537–R1576, 2004.
7.17 MONTE CARLO METHODS FOR SIMULATING THIN FILM DEPOSITION Corbett Battaile Sandia National Laboratories, Albuquerque, NM, USA
1. Introduction
Thin solid films are used in a wide range of technologies. In many cases, strict control over the microscopic deposition behavior is critical to the performance of the film. For example, today's commercial microelectronic devices contain structures that are only a few microns in size, and emerging microsystems technologies demand stringent control over dimensional tolerances. In addition, internal and surface microstructures can greatly influence thermal, mechanical, optical, electronic, and many other material properties. Thus it is important to understand and control the fundamental processes that govern thin film deposition at the nano- and micro-scale. This challenge can only be met by applying different tools to explore the various aspects of thin film deposition. Advances in computational capabilities over recent decades have allowed computer simulation in particular to play an invaluable role in uncovering atomic- and microstructure-scale deposition and growth behavior. Ab initio [1] and molecular dynamics (MD) calculations [2, 3] can reveal the energetics and dynamics of processes involving individual atoms and molecules in very fine temporal and spatial resolution. This information provides the fundamentals – the "unit processes" – that work in concert to deposit a solid film. The environmental conditions in the deposition chamber are commonly simulated using either the basic processing parameters directly (e.g., temperature and flux for simple physical vapor deposition systems); or continuum transport/reaction models [4] or direct simulation Monte Carlo methods [5] for more complex chemically active environments. These methods offer a wealth of information about the conditions inside a deposition chamber, but perhaps most important to the modeling of film growth itself are the fluxes and identities of species arriving at the deposition surface. All of
this information, including atomic-scale information about unit processes and chamber-scale information about surface fluxes and chemistry, must be used to construct a comprehensive model of deposition. Many methods have been used to model film growth. These range from one-dimensional solutions of coupled rate equations, which usually provide only growth rate information; to time-intensive MD simulations of the arrival and incorporation of many atoms at the growth surface, which yield detailed structural and energetic information at the atomic scale. This chapter addresses an intermediate approach, namely kinetic Monte Carlo (KMC) [6], that has been widely and successfully used to model a variety of deposition systems. The present discussion is restricted to lattice-based KMC approaches, i.e., those that employ a discrete (lattice) representation of the material, which can provide a wealth of structural information about the deposited material. In addition, the underlying KMC foundation allows the treatment of problems spanning many time and length scales, depending primarily on the nature of the input kinetic data. These kinetic data are often derived using transition state information from experiments or from atomistic simulations. The growth model is often coupled to information about the growth environment such as temperature, pressure, vapor composition, and flux, and these data can be measured experimentally or computed using reactive transport models. The following discussion begins with a brief theoretical background of the Monte Carlo (MC) method in the context of thin film deposition, then continues with a discussion of its implementation, and concludes with an overview of both historical and current applications of KMC (and related variants) to the modeling of thin film growth. The intent is to instill in the reader a basic understanding of the foundations and implementation of the MC method in the context of thin film deposition simulations, and to provide a starting point in their exploration of this broad and rich topic.
2. The Monte Carlo Method
Many collective phenomena in nature are essentially deterministic. For example, a ball thrown repeatedly with a specific initial velocity (in the absence of wind, altitudinal air density variations, and other complicating factors) will follow virtually the same trajectory each time. Other behaviors appear stochastic, as evidenced by the seemingly random behavior of a pachinko ball. Nanoscopically (i.e., on the time and length scale of atomic motion), most processes behave stochastically rather than deterministically. The vibrations of an atom or molecule as it explores the energetic landscape near the potential energy minimum created by the interactions with its environment are, for all practical purposes, random, i.e., stochastic. When that atom is in the vicinity of others, e.g., in a solid or liquid, the energetic landscape is very
complex and consists of many potential energy minima separated by energy barriers (i.e., maxima). Given enough time, a vibrating atom will eventually happen to “hop over” one of these barriers and “fall into” an adjacent potential energy “well.” In doing so, the atom has transitioned from one state (i.e., energy) to a new one. The energetics of such a transition are depicted in Fig. 1, where the states are described by their free energies (i.e., both enthalpic and entropic contributions). These concepts apply not only to vibrating atoms but also to the fundamental transitions of any system that has energy minima in configurational space. Transition state theory describes the frequency of any transition that can be described energetically by a curve like the one in Fig. 1. Although a detailed account of transition state theory is beyond the scope of this chapter, suffice it to say that the average rate of transitioning from State A to State B is described by the rate constant
$$k_{A\to B} = A \exp\!\left(-\frac{E}{kT}\right), \tag{1}$$
where A is the frequency with which the system attempts the transition, E is the activation barrier, k is Boltzmann’s constant equal to 1.3806503 × 10−23 J K−1 = 8.617269 × 10−5 eV K−1 , and T is the temperature. Likewise, the
Figure 1. Free energy along the reaction coordinate for a transition from State A over the activation barrier E to State B, with free-energy change ΔG.
average rate of the reverse transition from State B to State A is described by the rate constant

$$k_{A\leftarrow B} = A \exp\!\left(-\frac{E - \Delta G}{kT}\right), \tag{2}$$

where ΔG is the change in free energy on transitioning from State A to B (notice from Fig. 1 that ΔG is negative), and the reuse of the symbol, A, implies that the attempt frequencies for the forward (A → B) and reverse (A ← B) transitions are assumed equal. (The rate constants are obviously in the same units as the attempt frequency. If these units are not those of an absolute rate, i.e., sec⁻¹, then the rate constant can be converted into an absolute rate by multiplying by the appropriate quantity, e.g., concentrations in the case of chemical reactions.) Whereas Eqs. (1) and (2) describe the average rates for the transitions in Fig. 1, the actual rates for each instance of a particular transition will vary because the processes are stochastic. The state of the system will vary (apparently) randomly inside the energy well at State A until, by chance, the system happens to make an excursion that reaches the activated state, at which point (according to transition state theory) the system has a 50% chance of returning to State A and a 50% chance of transitioning into State B. The Monte Carlo (MC) method, named after the casinos in the Principality of Monaco (an independent sovereign state located between the foot of the Southern Alps and the Mediterranean Sea), is ideally suited to modeling not only realistic instantiations of individual state transitions (provided the relevant kinetic parameters are known) but also time- and ensemble-averages of complex and collective phenomena. The MC method is essentially an efficient method for numerically estimating complex and/or multidimensional integrals [7]. It is commonly used to find a system's equilibrium configuration via energy minimization. Early MC algorithms involved choosing system configurations at random, and weighting each according to its potential energy via the Boltzmann equation,

$$P = \exp\!\left(-\frac{E}{kT}\right), \tag{3}$$

where P is the weight (i.e., the probability the configuration would actually be realized). The configuration with the most weight corresponds to equilibrium. Metropolis et al. [7] improved on this scheme with an algorithm that, instead of choosing configurations randomly and Boltzmann-weighting them, chooses configurations with the Boltzmann probability in Eq. (3) and weights them equally. In this manner, the model system wastes less time in configurations that are highly unlikely to exist. Bortz et al. [8] introduced yet another rephrasing of the MC method, and termed the new algorithm the N-Fold Way (NFW). This algorithm always accepts the chosen changes to the system's configuration, and shifts the stochastic component of the computation into the time
incrementation (which can thereby vary at each MC step). Independent discoveries of essentially the same algorithm were presented shortly thereafter by Gillespie [9], and more recently by Voter [10]. The NFW is only applicable in situations where the Boltzmann probability is nonzero only for a finite and enumerable set of configurational transitions. So, for example, it cannot be used (without adaptation of either the algorithm or the model system) to find the equilibrium positions of atoms in a liquid, since the phase space representing these positional configurations is continuous and thus contains a virtually infinite number of possible transitions. Both the Metropolis algorithm (in its kinetic variation, described below) and the NFW can treat kinetic phenomena, but the NFW is better suited to generating physically realistic temporal sequences of configurational transitions [6] provided the rates of all possible state transitions are known a priori. To illustrate the concepts behind these techniques, it is useful to consider a simple example. Imagine a system that can exist in one of three states: A, B, or C. All the possible transitions for this system are therefore A ↔ B ↔ C. When the system is in State A, it can undergo only one transition, i.e., conversion to State B. When in State C, the system is only eligible for conversion to State B. When in State B, the system can either convert to State A, or convert to State C. Assume that the energetics of the transition paths are described by Fig. 2. The symbol *IJ denotes the activated state for the transition between
Figure 2. Free energy along the reaction coordinate for the three-state system A ↔ B ↔ C, showing the activated states *AB and *CB, the activation barriers E_AB and E_CB, and the free-energy changes ΔG_AB and ΔG_CB.
States I and J, E_IJ is the activation barrier encountered upon the transition from State I to State J, and ΔG_IJ is the difference in the free energies between States I and J. (Note that both ΔG_AB and ΔG_CB are negative because the free energy decreases upon transitioning from State A to B, and from State C to B.) The lowest-energy state is B, the highest is C, and A is intermediate. Simply by examining Fig. 2, it is clear that the thermodynamic equilibrium for this system is State B. However, the kinetic properties of the system depend on the transition rates, which in turn depend not only on the energies but also on the attempt frequencies. If the attempt frequencies of all four transitions are equal, then the state with the maximum residence time (in steady state) would certainly be State B, and that with the minimum time would be State C. Otherwise the residence properties might be quite different. As aforementioned, the Metropolis algorithm proceeds by choosing configurations at random, and accepting or rejecting them based on the change to the system energy that is incurred by changing the system's configuration. So, in the present example, such an algorithm would randomly choose one of the three states – A, B, or C – and accept or reject the chosen state with a probability based on the energy difference between it and the previous state. Specifically, the probability of accepting a new State J when the system is in State I is

$$P_{I\to J} = \begin{cases} \exp\!\left(-\dfrac{\Delta G_{IJ}}{kT}\right) & \Delta G_{IJ} > 0 \\[1ex] 1 & \Delta G_{IJ} \le 0 \end{cases} \tag{4}$$
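The rule in Eq. (4) takes only a few lines of code. The sketch below (ours) applies it to the three-state system of Fig. 2; the numerical free energies are illustrative assumptions chosen only to respect the ordering described in the text:

```python
import math, random

def metropolis_accept(dG, kT):
    """Acceptance rule of Eq. (4): accept downhill moves outright and
    uphill moves with the Boltzmann probability."""
    return dG <= 0 or random.random() < math.exp(-dG / kT)

# Three-state system of Fig. 2; the free energies (in eV) are illustrative
# values chosen only so that G_B < G_A < G_C, as in the text.
G = {"A": 0.10, "B": 0.00, "C": 0.25}
kT, state = 0.1, "A"
counts = dict.fromkeys(G, 0)
for _ in range(100_000):
    trial = random.choice([s for s in G if s != state])  # propose a new state
    if metropolis_accept(G[trial] - G[state], kT):
        state = trial
    counts[state] += 1
print(counts)   # State B dominates: the thermodynamic equilibrium
```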
This so-called thermodynamic Metropolis MC approach clearly utilizes only the states' energy differences, and does not account for the properties of the activated states or the dynamics that lead to transition attempts. As such, it can reveal the equilibrium state of the system, but provides no information about the kinetics of the system's evolution. However, the same algorithm can be adapted into a kinetic Metropolis MC scheme in order to capture kinetic information. This is accomplished by introducing time into the approach, and by using the transition rate information from Eqs. (1) and (2). Specifically, the rate constants for the "forward" transition, I → J, and the "backward," I ← J, are

$$k_{I\to J} = A_{IJ} \exp\!\left(-\frac{E_{IJ}}{kT}\right) \tag{5}$$

and

$$k_{I\leftarrow J} = A_{JI} \exp\!\left(-\frac{E_{IJ} - \Delta G_{IJ}}{kT}\right). \tag{6}$$

Assuming that the rate and the rate constant are the same, the probability of accepting a new State J when the system is in an "adjacent" State I is

$$P_{I\to J} = k_{I\to J}\,\Delta t, \tag{7}$$
where Δt is a constant time increment that is chosen a priori to accommodate the fastest transitions in the problem. Generally, Δt is chosen to be near 0.5/k_max. Thus, at each step in a kinetic Metropolis MC calculation, a transition is chosen at random from those that are possible given the state of the system, the chosen transition is realized with a probability according to Eq. (7), and the time is incremented by the constant Δt. Notice that while the thermodynamic Metropolis scheme allows the system to change its configuration to a state that is not directly accessible (e.g., A → C), the kinetic Metropolis approach considers only transitions between accessible states (i.e., the transitions A ↔ C in Fig. 2 would be forbidden). Similarly, the NFW deals only with accessible transitions, but unlike the kinetic Metropolis formulation, the NFW realizes state transitions with unit probability. Specifically, at each step in an NFW computation, a transition is chosen at random from those that are possible given the state of the system. The probability of choosing a particular transition depends on its relative rate. As such,

$$\sum_{j=1}^{i-1} k_j < \zeta\,\Gamma \le \sum_{j=1}^{i} k_j, \qquad \Gamma = \sum_{j=1}^{M} k_j, \tag{8}$$

where j merely indexes each transition, i denotes the chosen transition, M is the number of possible transitions, ζ is a random number between zero (inclusive) and one (exclusive) such that ζ ∈ [0, 1), and Γ is the sum of the rates of all the transitions that are possible given the state of the system. (Recall that the transition rates are equal to the rate constants in the present example, as aforementioned.) The chosen transition is always realized, and the time is incremented by

$$\Delta t = -\frac{\ln(\xi)}{\Gamma}, \tag{9}$$

where ξ is another random number between zero and one (exclusive of both bounds) such that ξ ∈ (0, 1). On closer inspection, it is apparent that the NFW is simply a rearrangement of the kinetic Metropolis MC algorithm [8]. Consider a system in some arbitrary State I. Assume that the system can exist in multiple states, so that the system will eventually (at non-zero temperature) transition out of State I. Because the transitioning process is stochastic, the time that the system spends in State I will vary each time it visits that state. (This "fact" is evident in the kinetic MC algorithms discussed above.) Let P₋(dt) denote the probability that the system remains in State I for a time of at least dt, and P₊(dt) be the probability that the system leaves State I before dt has elapsed, where dt = 0 refers to the moment that the system entered State I. Since the system has no other choices but to either stay in State I during dt or leave State I sometime during dt, it is clear that

$$P_-(dt) + P_+(dt) = 1. \tag{10}$$
Consider some value of time, t ≠ dt, where again t = 0 refers to the moment that the system entered State I. Multiplying Eq. (10) by P₋(t) yields

$$P_-(dt)\,P_-(t) + P_+(dt)\,P_-(t) = P_-(t). \tag{11}$$

Notice that P₋(dt)P₋(t) = P₋(t)P₋(dt) is simply the probability that the system is still in State I after t and also after the following dt, i.e., it is the probability that the system remains in State I for at least a time of t + dt. Therefore,

$$P_-(t+dt) + P_+(dt)\,P_-(t) = P_-(t). \tag{12}$$

Also notice that

$$P_+(dt) = \Gamma\,dt, \tag{13}$$

where Γ is the average number of transitions from State I per unit time, i.e., the sum of the rates of all the transitions that the system can make from State I. Substituting Eqs. (12) and (13) into Eq. (11) yields

$$P_-(t+dt) + \Gamma\,P_-(t)\,dt = P_-(t). \tag{14}$$

Rearranging Eq. (14) produces

$$\frac{P_-(t+dt) - P_-(t)}{dt} = -\Gamma\,P_-(t). \tag{15}$$

In the limit that dt → 0, Eq. (15) becomes

$$\left.\frac{dP_-}{dt}\right|_t = -\Gamma\,P_-(t). \tag{16}$$

Integrating Eq. (16), and realizing that P₋(0) = 1, yields

$$\ln P_-(t) = -\Gamma t, \tag{17}$$

hence

$$t = -\frac{\ln P_-(t)}{\Gamma}. \tag{18}$$

Let t* be the residence time in State I. On each visit that the system makes to State I, it remains there for a different amount of time, and the associated residence probabilities for each visit follow a uniform random distribution such that P₋(t*) ∈ (0, 1). Therefore, the individual residence times from visit to visit follow a distribution of the form

$$t^* = -\frac{\ln(\xi)}{\Gamma}, \tag{19}$$

where ξ is a random number such that ξ ∈ (0, 1), and thus Eq. (9) is obtained. Clearly the time that elapses between one transition and the next is stochastic
and is a function only of the sum of the rates of all available transitions. When in any given state, the probability that the system will actually make a particular transition, provided it is accessible, is equal to the rate of the transition relative to the sum of the rates of all accessible transitions, as described in Eq. (8).
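The complete NFW update therefore reduces to a cumulative search over the rate list plus one logarithm. A minimal sketch (ours) implementing the selection rule of Eq. (8) and the waiting time of Eq. (9):

```python
import math, random

def nfw_step(rates):
    """One N-Fold Way move: select transition i by Eq. (8) and return it
    together with the stochastic time increment of Eq. (9)."""
    gamma = sum(rates)                    # total activity Gamma
    target = random.random() * gamma      # zeta * Gamma, with zeta in [0, 1)
    acc, i = 0.0, 0
    for i, k in enumerate(rates):
        acc += k
        if target < acc:                  # cumulative-rate test of Eq. (8)
            break
    xi = random.random()
    while xi == 0.0:                      # Eq. (9) needs xi in (0, 1)
        xi = random.random()
    return i, -math.log(xi) / gamma

# e.g., three accessible transitions with rates (in sec^-1):
print(nfw_step([1.0, 3.0, 0.5]))
```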
3. Implementing the N-Fold Way
One can readily see the utility of the NFW for simulating the fundamental processes involved in thin film deposition. Simply put, one need only apply the algorithms described above, illustrated for the idealized system in Fig. 2, onto each fundamental location on the model deposition surface. For example, consider the simple two-dimensional surface in Fig. 3a. Assume that each square represents a fundamental unit of the solid structure (e.g., an atom), that there is a flux of material toward the substrate, and that gray denotes a static substrate. If the incoming material is appropriate for coherent epitaxy on the substrate, then the evolution of the surface in Fig. 3a will begin by the attachment of material to the surface, i.e., the filling of one of the sites above the surface denoted by dotted squares in Fig. 3b, by a unit of incoming material. Consider only one of these candidate sites, e.g., the one labeled d in Fig. 3c. Site d represents a subsystem of the entire surface, and that subsystem is in a particular state whose configuration is defined by the “empty” site just above the surface. Site d can transition into another state, namely one in which the site contains a deposited unit of material, as depicted in Fig. 3d. This local transition occurs just as described in the simple example for Fig. 2 above. In fact, the evolution of the entire surface can be modeled by collectively considering the local transitions of each fundamental unit (i.e., site) in the system. Consider the behavior of the entire system from the initial configuration shown in Fig. 3c. Each site above the surface can be filled by incoming
Figure 3. Deposition of a single particle onto a simple two-dimensional substrate. Gray squares are substrate sites, white dotted squares are sites into which particles can potentially deposit, and black squares are deposited particles.
material. The NFW algorithm suggests that the time that passes before a particular site, e.g., Site a, transitions is

$$t_a = -\frac{X}{F}\ln(\xi), \tag{20}$$
where X is the areal density of surface sites in units of length⁻² and F is the deposition flux in length⁻² sec⁻¹ (taking into account such factors as the sticking probability), so that F/X is the average rate at which material can deposit into Site a. But how much time passes before any of the sites makes a transition? In other words, how long will the system remain in the configuration of Fig. 3c? By way of analogy, consider rolling six-sided dice. If only one die is rolled, the chance that a particular side faces upwards (after the die comes to rest) is 1/6. So the chance of rolling "a three" is 1/6, as is the chance of rolling a five, etc. If three dice are rolled, the expected number of threes showing is 3/6 = 1/2. Thus, since there are seven sites in Fig. 3c that can accept incoming material, the probability that at least one of them will transition during some small time increment is seven times the probability that a specific isolated site will transition in the same increment. Because more probable events obviously occur more often, i.e., require less time, the time that passes before the entire system leaves the configuration in Fig. 3c, i.e., the time it takes for a unit of material to deposit somewhere on the surface, is

$$\Delta t = \frac{t_d}{7} = -\frac{X}{7F}\ln(\xi). \tag{21}$$
Notice that 7F/X is simply the sum of the rates of all the per-site transitions that can occur in the entire system, i.e., the system's activity, and thus it is clear that the general form of Eq. (21) is Eq. (9). As described above, the NFW algorithm prescribes that the choice of transition at each time step be randomized, with the probability of choosing a particular transition proportional to its relative rate. Since one of only seven transitions can occur on the surface in Fig. 3c, and each has the same rate, then the selection of a transition from the configuration in Fig. 3c involves simply choosing at random one of the seven sites marked with dotted outlines. Duplicating Fig. 3c as Fig. 4a, and assuming that Site d is randomly selected to transition, then the configuration after the first time step is that in Fig. 4b. If the per-site flux (i.e., F/X) is 1 sec⁻¹, then the time increment that elapses before the first transition is dictated by Eq. (9) to be

$$\Delta t_1 = -\frac{\ln(\xi_1)}{7}\ \text{sec}. \tag{22}$$
A random number of ξ₁ = 0.631935 yields a time increment for the first step of Δt₁ = 0.065567 sec for a total time after the first step of, obviously, t₁ = 0.065567 sec (where the starting time is naturally t₀ = 0 sec).
Assume that the deposited material at Site d can either diffuse to Site c, diffuse to Site e, or desorb. Assume further that the temperature is T = 1160 K such that kT = 0.1 eV; the attempt frequency and activation barrier for diffusion are A_D = 1×10⁴ sec⁻¹ and E_D = 0.70 eV, respectively; and those for desorption are A_R = 1×10⁴ sec⁻¹ and E_R = 0.85 eV. Then the per-site rate of diffusion is approximately 9 sec⁻¹, and that of desorption is 2 sec⁻¹. The set of transitions available to the configuration in Fig. 4b includes seven deposition events at the dotted sites, two diffusion events, and one desorption event. To illustrate the process of choosing one of these ten transitions in the NFW algorithm, it is useful to visualize them on a graph. Figure 5b shows the ten possible transitions on a line plot, with the width of each corresponding to its relative rate. A transition can be selected in accord with Eq. (8) simply by generating a random number ζ₂ ∈ [0, 1), plotting it on the graph in Fig. 5b, and selecting the appropriate transition. (Figure 5a depicts the same type of plot for the configuration in Fig. 4a, assuming a value of ζ₁ = 0.500923.) For example, if a random number of ζ₂ = 0.652493 is generated, then the black atom at Site d in Fig. 4b would diffuse into Site e, yielding the configuration in Fig. 4c. Since the activity of the system in Fig. 4b is Γ = 27 sec⁻¹, a random number of ξ₂ = 0.548193 yields a time increment for the second step of Δt₂ = 0.022264 sec, yielding a time value of t₂ = 0.087831 sec. (Notice that when fast transitions are available to the system, as in Fig. 5b, the activity of the system increases and the time resolution in the NFW becomes finer to accommodate the fast processes.) By repeating this recipe, the evolution of the surface from its initial state in Fig. 4a can be simulated, as shown in Figs. 4 and 5. The random numbers (for transition selection) corresponding to the system's evolution from Fig. 4c are ζ₃ = 0.132087, ζ₄ = 0.327872, and ζ₅ = 0.891473, and the simulation time would be calculated as prescribed above. This NFW approach can be straightforwardly extended into three dimensions, and all manner of complex, collective, and environment- and structure-dependent transitions can be modeled provided their rates are known.
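The numbers quoted in this example are easy to reproduce. A short sketch (ours) using the rounded per-site rates adopted in the text:

```python
import math

kT = 0.1                                    # eV, i.e., T = 1160 K
def rate(A, E):                             # Arrhenius form of Eqs. (1)-(2)
    return A * math.exp(-E / kT)

k_diff = rate(1e4, 0.70)                    # ~9.1 /sec, quoted as ~9 /sec
k_des = rate(1e4, 0.85)                     # ~2.0 /sec, quoted as ~2 /sec
print(k_diff, k_des)

# Activity of the configuration in Fig. 4b, with the rounded per-site rates
# used in the text: seven depositions, two diffusions, one desorption.
gamma = 7 * 1.0 + 2 * 9.0 + 1 * 2.0         # = 27 /sec
dt2 = -math.log(0.548193) / gamma           # = 0.022264 sec, as in the text
print(gamma, dt2)
```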
Figure 4. Possible configurations for the first few steps of deposition onto a simple two-dimensional substrate. Gray squares are substrate sites, white dotted squares are sites into which particles can potentially deposit, and black squares are deposited particles.
Figure 5. Lists of transitions for use in the NFW algorithm applied to the surface evolution depicted in Fig. 4. The numerals at the upper right of each plot indicate the total rate in sec⁻¹, those below each plot demark relative rates, and the letters above each plot denote transition classes and locations. The letter F corresponds to particle deposition, D to diffusion, and R to desorption. Lowercase italic letters correspond to the site labels in Fig. 4, and the notation i ⇒ j indicates diffusion of the particle at site i into site j. The thick gray lines below each plot mark the locations of the random numbers used to select a transition from each configuration.
4. Historical Perspective
Thousands of papers have been published on Monte Carlo simulations of thin film deposition. They encompass a wide range of thin film applications
and employ a variety of methods. This section contains a brief overview of some selected examples. No attempt is made here to provide a comprehensive review; instead, the goal is to present selected sources for further exploration. Some of the earliest applications of MC to the computer simulation of deposition used simple models of deposition on an idealized surface. One of the first of these is attributed to Young and Schubert [11], who simulated the multilayer adsorption (without desorption or migration) of tungsten onto a tungsten substrate. Chernov and Lewis [12] performed MC calculations of kink migration during binary alloy deposition using a 1000-particle linear chain in one dimension, a 99×49 square grid in two dimensions (where the grid represented a cross-section of the film), and a 99×49×32 cubic grid in three dimensions. Gordon [13] simulated the monolayer adsorption and desorption of particles onto a 10×10 grid of sites with hexagonal close-packing (where the grid represented a plan view of the film). Abraham and White [14] considered the monolayer adsorption, desorption, and migration of atoms onto a 10×10 square grid (again in plan view), with atomic transitions modeled using a scheme that resembles the NFW. (Notice that Abraham's and White's publication appeared five years before the first publication of the NFW algorithm.) Leamy and Jackson [15], and Gilmer and Bennema [16], used the solid-on-solid (SOS) model [17–19] to analyze the roughness of the solid–vapor interface on a three-dimensional film represented by a 20×20 square grid. The SOS model represents the film by columns of atoms (or larger solid quanta) so that no subsurface voids or vacancies can exist. One major advantage of this approach is that the three-dimensional film can be represented digitally by a two-dimensional matrix of integers that describe the height of the film at each location on the free surface. Their approach was later extended [20] to alleviate the restrictions of the SOS model so that the structure and properties of the diffuse solid–vapor interface could be examined.
the same simulation framework to create a hybrid approach [22]. Common applications of these hybrids involve relaxing atomic positions near the surface, usually by means of energy minimization or molecular dynamics, and performing the MC calculations at off-lattice locations that are identified as potential transition sites on the relaxed structure.
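A continuous-space displacement move of the kind just described takes only a few lines. The sketch below (ours) uses a Lennard-Jones pair potential with unit parameters and the Metropolis acceptance rule of Eq. (4); all parameter values are illustrative assumptions:

```python
import math, random

def lj(r2):
    """Lennard-Jones pair energy for squared distance r2 (sigma = eps = 1)."""
    inv6 = 1.0 / r2**3
    return 4.0 * (inv6 * inv6 - inv6)

def displacement_sweep(pos, kT=0.5, dmax=0.1):
    """Continuous-space MC: attempt a small random displacement of each
    particle and accept it with the Metropolis probability of Eq. (4)."""
    n = len(pos)
    for i in range(n):
        old = pos[i]
        new = tuple(c + random.uniform(-dmax, dmax) for c in old)
        dE = 0.0
        for j in range(n):                  # energy change against all others
            if j != i:
                r2_old = sum((a - b) ** 2 for a, b in zip(old, pos[j]))
                r2_new = sum((a - b) ** 2 for a, b in zip(new, pos[j]))
                dE += lj(r2_new) - lj(r2_old)
        if dE <= 0 or random.random() < math.exp(-dE / kT):
            pos[i] = new                    # accept the attempted move

pos = [(random.uniform(0, 5), random.uniform(0, 5)) for _ in range(10)]
for _ in range(100):
    displacement_sweep(pos)
print(pos[:3])
```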
5. Summary
The preceding discussion should demonstrate clearly that the topic of MC deposition simulations is broad and rich. Unfortunately, a comprehensive review of existing methods and past research is beyond the scope of this article, and the reader is referred to the works mentioned herein and to the numerous reviews on the subject [23–30] for further study. As the techniques for applying MC methods to the study of thin film deposition continue to mature, novel approaches and previously inaccessible technologies will emerge. Hybrid MC methods seem particularly promising, as they allow for a physically based description of the fundamental surface structure, can allow for the real-time calculation of transition rates via physically accurate methods, and are able to access spatial and temporal scales that are well beyond the reach of more fundamental approaches. Whatever the future holds, it is certain that our ability to study thin film processing using computer simulations will continue to evolve and improve, yielding otherwise unobtainable insights into the physics and phenomenology of deposition, and that MC methods will play a crucial role in that process.
References
[1] J. Fritsch and U. Schröder, "Density functional calculation of semiconductor surface phonons," Phys. Lett. C – Phys. Rep., 309, 209–331, 1999.
[2] M.P. Allen, Computer Simulation of Liquids, Oxford University Press, Oxford, 1989.
[3] J.M. Haile, Molecular Dynamics Simulation: Elementary Methods, John Wiley and Sons, New York, 1992.
[4] C.K. Harris, D. Roekaerts, F.J.J. Fosendal, F.G.J. Buitendijk, P. Daskopoulos, A.J.N. Vreenegoor, and H. Wang, "Computational fluid dynamics for chemical reactor engineering," Chem. Eng. Sci., 51, 1569–1594, 1996.
[5] G.A. Bird, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University Press, Oxford, 1994.
[6] K.A. Fichthorn and W.H. Weinberg, "Theoretical foundations of dynamical Monte Carlo simulations," J. Chem. Phys., 95, 1090–1096, 1991.
[7] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., 21, 1087–1092, 1953.
[8] A.B. Bortz, M.H. Kalos, and J.L. Lebowitz, "A new algorithm for Monte Carlo simulation of Ising spin systems," J. Comp. Phys., 17, 10–18, 1975.
[9] D.T. Gillespie, "Exact stochastic simulation of coupled chemical reactions," J. Phys. Chem., 81, 2340–2361, 1977.
[10] A.F. Voter, "Classically exact overlayer dynamics: diffusion of rhodium clusters on Rh(100)," Phys. Rev. B, 34, 6819–6829, 1986.
[11] R.D. Young and D.C. Schubert, "Condensation of tungsten on tungsten in atomic detail – Monte Carlo and statistical calculations vs experiment," J. Chem. Phys., 42, 3943–3950, 1965.
[12] A.A. Chernov and J. Lewis, "Computer model of crystallization of binary systems – kinetic phase transitions," J. Phys. Chem. Solids, 28, 2185–2198, 1967.
[13] R. Gordon, "Adsorption isotherms of lattice gases by computer simulation," J. Chem. Phys., 48, 1408–1409, 1968.
[14] F.F. Abraham and G.W. White, "Computer simulation of vapor deposition on two-dimensional lattices," J. Appl. Phys., 41, 1841–1849, 1970.
[15] H.J. Leamy and K.A. Jackson, "Roughness of crystal–vapor interface," J. Appl. Phys., 42, 2121–2127, 1971.
[16] G.H. Gilmer and P. Bennema, "Simulation of crystal-growth with surface diffusion," J. Appl. Phys., 43, 1347–1360, 1972.
[17] T.L. Hill, "Statistical mechanics of multimolecular adsorption 3: introductory treatment of horizontal interactions – capillary condensation and hysteresis," J. Chem. Phys., 15, 767–777, 1947.
[18] W.K. Burton, N. Cabrera, and F.C. Frank, "The growth of crystals and the equilibrium structure of their surfaces," Phil. Trans. Roy. Soc. A, 243, 299–358, 1951.
[19] D.E. Temkin, "Crystallization processes," Consultants Bureau, New York, 1966.
[20] H.J. Leamy, G.H. Gilmer, K.A. Jackson, and P. Bennema, "Lattice–gas interface structure: a Monte Carlo simulation," Phys. Rev. Lett., 30, 601–603, 1973.
[21] B.W. Dodson and P.A. Taylor, "Monte Carlo simulation of continuous-space crystal growth," Phys. Rev. B, 34, 2112–2115, 1986.
[22] M.D. Rouhani, A.M. Gué, M. Sahlaoui, and D. Estève, "Strained semiconductor structures: simulation of the first stages of the growth," Surf. Sci., 276, 109–121, 1992.
[23] K. Binder, Monte Carlo Methods in Statistical Physics, Springer-Verlag, Berlin, 1986.
[24] T. Kawamura, "Monte Carlo simulation of thin-film growth on Si surfaces," Prog. Surf. Sci., 44, 67–99, 1993.
[25] J. Lapujoulade, "The roughening of metal surfaces," Surf. Sci. Rep., 20, 195–249, 1994.
[26] M. Kotrla, "Numerical simulations in the theory of crystal growth," Comp. Phys. Comm., 97, 82–100, 1996.
[27] G.H. Gilmer, H. Huang, and C. Roland, "Thin film deposition: fundamentals and modeling," Comp. Mat. Sci., 12, 354–380, 1998.
[28] M. Itoh, "Atomic-scale homoepitaxial growth simulations of reconstructed III–V surfaces," Prog. Surf. Sci., 66, 53–153, 2001.
[29] H.N.G. Wadley, A.X. Zhou, R.A. Johnson, and M. Neurock, "Mechanisms, models, and methods of vapor deposition," Prog. Mat. Sci., 46, 329–377, 2001.
[30] C.C. Battaile and D.J. Srolovitz, "Kinetic Monte Carlo simulation of chemical vapor deposition," Ann. Rev. Mat. Res., 32, 297–319, 2002.
7.18 MICROSTRUCTURE OPTIMIZATION S. Torquato Department of Chemistry, PRISM, and Program in Applied & Computational Mathematics Princeton University, Princeton, NJ 08544, USA
1. Introduction
An important goal of materials science is to have exquisite knowledge of structure-property relations in order to design material microstructures with desired properties and performance characteristics. Although this objective has been achieved in certain cases through trial and error, a systematic means of doing so is currently lacking. For certain physical phenomena at specific length scales, the governing equations are known and the only barrier to achieving the aforementioned goal is the development of appropriate methods to attack the problem. Optimization methods provide a systematic means of designing materials with tailored properties for a specific application. This article focuses on two optimization techniques: (1) the topology optimization procedure used to design composite or porous media, and (2) stochastic optimization methods employed to reconstruct or construct material microstructures.
2. Topology Optimization
A promising method for the systematic design of composite microstructures with desirable macroscopic properties is the topology optimization method. The topology optimization method was developed almost two decades ago by Bendsøe and Kikuchi [1] for the design of mechanical structures. It is now also being used in smart and passive material design, mechanism design, microelectro-mechanical systems (MEMS) design, target optimization, multifunctional optimization, and other design problems [2–7]. Consider a two-phase composite material consisting of a phase with a property K 1 and volume fraction φ1 and another phase with a property K 2 and
volume fraction φ2 (= 1 − φ1). The property Ki is perfectly general: it may represent a transport, mechanical or electromagnetic property, or properties associated with coupled phenomena, such as piezoelectricity or thermoelectricity. For steady-state situations, the generalized flux F(r) at some local position r in the composite obeys the following conservation law in the phases:

∇ · F(r) = 0.  (1)
In the case of electrical conduction and elasticity, F represents the current density and stress tensor, respectively. The local constitutive law relates F to a generalized gradient G, which in the special case of a linear relationship is given by

F(r) = K(r)G(r),  (2)
where K(r) is the local property. In the case of electrical conduction, relation (2) is just Ohm's law, and K and G are the conductivity and electric field, respectively. For elastic solids, relation (2) is Hooke's law, and K and G are the stiffness tensor and strain field, respectively. For piezoelectricity, F is the stress tensor, K embodies the compliance and piezoelectric coefficients, and G embodies both the electric field and strain tensor. The generalized gradient G must also satisfy a governing differential equation. For example, in the case of electrical conduction, G must be curl free. One must also specify the appropriate boundary conditions at the two-phase interface. One can show that the effective properties are found by homogenizing (averaging) the aforementioned local fields [8, 9]. In the case of a linear material, the effective property Ke is given by

⟨F(r)⟩ = Ke ⟨G(r)⟩,  (3)
where angular brackets denote a volume average and/or an ensemble average. For additional details, the reader is referred to the other article (“Theory of Random Heterogeneous Materials”) by the author in this encyclopedia.
2.1. Problem Statement
The basic topology optimization problem can be stated as follows: distribute a given amount of material in a design domain such that an objective function is extremized [1, 2, 4, 7]. The design domain is the periodic base cell and is initialized by discretizing it into a large number of finite elements (see Fig. 2) under periodic boundary conditions. The problem consists of finding the optimal distribution of the phases (solid, fluid, or void) such that the objective function is minimized. The objective function can be any combination of the individual components of the relevant effective property tensor
subject to certain constraints [2, 7]. For target optimization [5] and multifunctional optimal design [6], the objective function can be appropriately modified, as described below. In the most general situation, it is desired to design a composite material with N different effective properties, which we denote by Ke(1), Ke(2), . . . , Ke(N), given the individual properties of the phases. In principle, one wants to know the region (set) in the multidimensional space of effective properties in which all composites must lie (see Fig. 1). The size and shape of this region depend on how much information about the microstructure is specified and on the prescribed phase properties. One could begin by making an initial guess for the distribution of the two phases among the elements, solve for the local fields using finite elements, and then evolve the microstructure to the targeted properties. However, even for a small number of elements, this integer-type optimization problem becomes a huge and intractable combinatorial problem. For example, for a small design problem with 100 elements, the number of different distributions of the three material phases would be astronomically large (3^100 ≈ 5 × 10^47). As each function evaluation requires a full finite element analysis, it is hopeless to solve the optimization problem using random search methods such as genetic algorithms or simulated annealing, which use a large number of function evaluations and do not make use of sensitivity information. Following the idea of standard topology optimization procedures, the problem is therefore relaxed by allowing the material at a given point to be a gray-scale mixture of the two phases. This makes it possible to find sensitivities with respect to design changes, which in turn allows one to use linear programming methods to solve the optimization problem. The optimization
Figure 1. Schematic illustrating the allowable region in which all composites with specified phase properties must lie for the case of two different effective properties, Ke(1) and Ke(2).
procedure solves a sequence of finite element problems followed by changes in the material type (density) of each of the finite elements, based on sensitivities of the objective function and constraints with respect to design changes. At the end of the optimization procedure, however, we desire to have a design where each element is either phase 1 or phase 2 material (Fig. 2). This is achieved by imposing a penalization for gray phases at the final stages of the simulation.

Figure 2. Design domain and discretization for a two-phase, three-dimensional topology optimization problem. Each cube represents one finite element, which can consist of either phase 1 material or phase 2 material.

In the relaxed system, let xi ∈ [0, 1] be the local density of the ith element, so that when xi = 0 the element corresponds to phase 1 and when xi = 1 the element corresponds to phase 2. Let x = (xi, i = 1, . . . , n) be the vector of design variables, which satisfies the constraint of fixed volume fraction φ2 = (1/n) Σi xi. For any x, the local fields are computed using the finite element method, and the effective property Ke(K; x), which is a function of the material property K and x, is obtained by the homogenization of the local fields. The optimization problem is specified as follows:

Minimize: Ke(x)
subject to: (1/n) Σ_{i=1}^{n} xi = φ2,  (4)
0 ≤ xi ≤ 1, i = 1, . . . , n, and prescribed symmetries.

The objective function Ke(x) is generally nonlinear. To solve this problem, the objective function is linearized, enabling one to take advantage of powerful sequential linear programming techniques. Specifically, the objective function is expanded in a Taylor series about a given microstructure x0:

Ke(x) ≈ Ke(x0) + ∇Ke · Δx,  (5)
where Δx = x − x0 is the vector of density changes. In each iteration, the microstructure evolves to the optimal state by determining the small change Δx. One can use the simplex method [2] or the interior-point method [5] to minimize the linearized objective function in Eq. (5). In each iteration, the homogenization step to obtain the effective property Ke(K; x0) is carried out numerically via the finite-element method on the given configuration x0. Derivatives of the objective function (∇Ke) are calculated by a sensitivity analysis, which requires one finite element calculation per iteration. One can use topology optimization to design, at will, composite microstructures with targeted effective properties under required constraints [5]. The objective function for such a target optimization problem has been chosen to be the least-square form involving the effective property Ke(x) at any point in the simulation and a target effective property K0:

[Ke(x) − K0]².  (6)
The method can also be employed for multifunctional optimization problems. The objective function in this instance has been chosen to be a weighted average of each of the effective properties [6].
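To make the structure of this relaxed problem concrete, the following Python sketch mimics the sequential-linear-programming loop of Eqs. (4)–(6). It is only a schematic: the finite-element homogenization and the sensitivity analysis are replaced by a deliberately fictitious effective-property model (the weights w and all numerical values are invented stand-ins), so the skeleton runs as written but designs nothing physical.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
K1, K2, K_target = 1.0, 10.0, 4.0    # phase properties and target value (invented numbers)
n, phi2 = 64, 0.4                    # number of elements, prescribed volume fraction
w = rng.uniform(0.5, 1.5, n)         # fictitious per-element weights, standing in for geometry

def homogenize(x):
    # placeholder for the finite-element homogenization step
    return K1 + (K2 - K1) * np.dot(w, x) / w.sum()

def objective(x):
    # target-optimization objective, Eq. (6)
    return (homogenize(x) - K_target) ** 2

def gradient(x, h=1.0e-6):
    # finite differences stand in for the sensitivity analysis
    f0, g = objective(x), np.empty(n)
    for i in range(n):
        xp = x.copy(); xp[i] += h
        g[i] = (objective(xp) - f0) / h
    return g

x, move = np.full(n, phi2), 0.2      # gray-scale start; move limit on each linearized step
for it in range(50):
    g = gradient(x)
    lower = np.maximum(-move, -x)        # keep 0 <= x + dx
    upper = np.minimum(move, 1.0 - x)    # keep x + dx <= 1
    lp = linprog(g, A_eq=np.ones((1, n)), b_eq=[0.0],   # sum(dx) = 0 holds phi2 fixed
                 bounds=list(zip(lower, upper)), method="highs")
    x = x + lp.x
    move *= 0.95                         # shrink the trust region as the design settles
print(homogenize(x), objective(x))

The linearized subproblem, the volume equality constraint, and the box constraints with move limits are exactly the ingredients that carry over to a real topology optimization code, where homogenize and gradient become finite element computations.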
2.2. Illustrative Examples
The topology optimization procedure has been employed to design composite materials with extremal properties [2, 3, 10], targeted properties [5, 11], and multifunctional properties [6]. To illustrate the power of the method, we briefly describe microstructure designs in which thermal expansion and piezoelectric behaviors are optimized, the effective conductivity achieves a targeted value, and the thermal conduction demands compete with the electrical conduction demands. Materials with extreme or unusual thermal expansion behavior are of interest from both a technological and a fundamental standpoint. Zero thermal expansion materials are needed in structures subject to temperature changes, such as space structures, bridges, and piping systems. Materials with large thermal displacement or force can be employed as “thermal” actuators. A negative thermal expansion material has the counterintuitive property of contracting upon heating. A fastener made of a negative thermal expansion material, upon heating, can be inserted easily into a hole. Upon cooling, it will expand, fitting tightly into the hole. All three types of expansion behavior have been designed [2]. In the negative expansion case, one must consider a three-phase material: a high expansion material, a low expansion material, and a void region. Figure 3 shows the two-dimensional optimal design that was found; the main mechanism behind the negative expansion behavior is the reentrant cell
Figure 3. Optimal microstructure for minimization of effective thermal expansion coefficient [2]. White regions denote void, black regions consist of low expansion material and cross-hatched regions consist of high expansion material.
structure having bimaterial components which bend (into the void space) and cause large deformation when heated. In the case of piezoelectricity, actuators that maximize the delivered force or displacement can be designed. Moreover, one can design piezocomposites (consisting of an array of parallel piezoceramic rods embedded in a polymer matrix) that maximize the sensitivity to acoustic fields. The topology optimization method has been used to design piezocomposites with optimal performance characteristics for hydrophone applications [3]. When designing for maximum hydrostatic charge coefficient, the optimal transversely isotropic matrix material has negative Poisson's ratio in certain directions. This matrix material itself turns out to be a composite, namely, a special porous solid. Using an AutoCAD file of the three-dimensional matrix material structure and a stereolithography technique, such negative Poisson's ratio materials have actually been fabricated [3]. For the case of a two-phase, two-dimensional, isotropic composite, the popular effective-medium approximation (EMA) formula for the effective electrical conductivity σe is given by

φ1 (σe − σ1)/(σe + σ1) + φ2 (σe − σ2)/(σe + σ2) = 0,  (7)
where φi and σi are the volume fraction and conductivity of phase i, respectively. Milton [12] showed that the EMA expression is exactly realized by granular aggregates of the two phases such that spherical grains (in any dimension) of comparable size are well separated with self-similarity on all length scales. This is why the EMA formula breaks down when applied to dispersions of identical circular inclusions. An interesting question is the following: Can the EMA formula be realized by simple structures with a single
length scale? Using the target optimization formulation in which the target effective conductivity σ0 is given by the EMA function (7), Torquato et al. [6] found a class of periodic, single-scale dispersions that achieve it at a given phase conductivity ratio for a two-phase, two-dimensional composite over all volume fractions. Moreover, to an excellent approximation (but not exactly), the same structures realize the EMA for almost the entire range of phase conductivities and volume fractions. The inclusion shapes are given analytically by the generalized hypocycloid, which in general has a non-smooth interface (see Fig. 4). Minimal surfaces necessarily have zero mean curvature, i.e., the sum of the principal curvatures at each point on the surface is zero. Particularly fascinating are minimal surfaces that are triply periodic because they arise in a variety of systems, including block copolymers, nanocomposites, micellar materials, and lipid-water systems [6]. These two-phase composites are bicontinuous in the sense that the surface (two-phase interface) divides space into two disjoint
Figure 4. Unit cells of generalized hypocycloidal inclusions in a matrix that realize the EMA relation (7) for selected values of the volume fraction in the range 0 < φ2 < 1 (φ2 = 0.001, 0.05, 0.089, 0.3, 0.5, 0.7, 0.911, 0.95, and 0.999). Phases 1 and 2 are the white and black phases, respectively.
but intertwining phases that are simultaneously continuous. This topological feature of bicontinuity is rare in two dimensions and therefore virtually unique to three dimensions [8]. Using multifunctional optimization [6], it has been discovered that triply periodic two-phase bicontinuous composites with interfaces that are the Schwartz P and D minimal surfaces (see Fig. 5) are not only geometrically extremal but extremal for simultaneous transport of heat and electricity. More specifically, these are the optimal structures when a weighted sum of the effective thermal and electrical conductivities (λe + σe) is maximized for the case in which phase 1 is a good thermal conductor but poor electrical conductor and phase 2 is a poor thermal conductor but good electrical conductor, with φ1 = φ2 = 1/2. The demand that this sum is maximized sets up a competition between the two effective transport properties, and this demand is met by the Schwartz P and D structures. By mathematical analogy, the optimality of these bicontinuous composites applies to any pair of the following scalar effective properties: electrical conductivity, thermal conductivity, dielectric constant, magnetic permeability, and diffusion coefficient. It will be of interest to investigate whether the optimal structures when φ1 ≠ φ2 are bicontinuous structures with interfaces of constant mean curvature, which would become minimal surfaces at the point φ1 = φ2 = 1/2. The topological property of bicontinuity of these structures suggests that they would be mechanically stiff even if one of the phases is a compliant solid or a liquid, provided that the other phase is a relatively stiff material. Indeed, it has recently been shown that the Schwartz P and D structures are extremal when a competition is set up between the bulk modulus and electrical (or thermal) conductivity of the composite [13].
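As a small numerical aside, relation (7) is quadratic in σe once the denominators are cleared, so the physically relevant (positive) root is available in closed form. The Python snippet below, with arbitrary illustrative numbers, evaluates it and verifies the root by substitution.

import math

def ema_2d(phi1, sigma1, sigma2):
    # positive root of the quadratic obtained by clearing the denominators in Eq. (7)
    b = (phi1 - (1.0 - phi1)) * (sigma2 - sigma1)
    return (-b + math.sqrt(b * b + 4.0 * sigma1 * sigma2)) / 2.0

phi1, s1, s2 = 0.5, 1.0, 10.0
se = ema_2d(phi1, s1, s2)
residual = (phi1 * (se - s1) / (se + s1)
            + (1.0 - phi1) * (se - s2) / (se + s2))
print(se, residual)   # at phi1 = 1/2 the root reduces to sqrt(sigma1 * sigma2)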
Figure 5. Unit cells of two different minimal surfaces with a resolution of 64 × 64 × 64. Left panel: Schwartz simple cubic surface. Right panel: Schwartz diamond surface.
3. Reconstruction Techniques
The reconstruction of realizations of disordered materials, such as liquids, glasses, and random heterogeneous materials, from a knowledge of limited microstructural information (lower-order correlation functions) is an intriguing inverse problem. Clearly, one can never reconstruct the original material perfectly in the infinite-system limit, i.e., such reconstructions are nonunique. Thus, the objective here is not the same as that of data decompression algorithms that efficiently restore complete information, such as the gray scale of every pixel in an image. The generation of realizations of random media with specified lower-order correlation functions can:
1. shed light on the nature of the information contained in the various correlation functions that are employed;
2. ascertain whether the standard two-point correlation function, accessible experimentally via scattering, can accurately reproduce the material and, if not, what additional information is required to do so;
3. identify the class of microstructures that have exactly the same lower-order correlation functions but widely different effective properties;
4. probe the interesting issue of nonuniqueness of the generated realizations;
5. construct structures that correspond to specified correlation functions and categorize classes of random media;
6. provide guidance in ascertaining the mathematical properties that physically realizable correlation functions must possess [14]; and
7. attempt three-dimensional reconstructions from slices or micrographs of the sample: a poor man's X-ray microtomography experiment.
The first reconstruction procedures applied to heterogeneous materials were based on thresholding Gaussian random fields. This approach to reconstructing random media originated with Joshi [15], and was extended by Adler [16] and Roberts and Teubner [17]. This method is currently limited to the standard two-point correlation function, and is not suitable for extension to non-Gaussian statistics.
3.1. Optimization Problem
It has recently been suggested that reconstruction problems can be posed as optimization problems [18, 19]. A set of target correlation functions are prescribed based upon experiments, theoretical models, or some ansatz. Starting from some initial realization of the random medium, the method proceeds to find a realization by evolving the microstructure such that the calculated correlation functions best match the target functions. This is achieved by minimizing
an error based upon the distance between the target and calculated correlation functions. The medium can be a dispersion of particles [18] or, more generally, a digitized image of a disordered material [19]. For simplicity, we will introduce the problem for the case of digitized heterogeneous media here and consider only a single two-point correlation function for statistically isotropic two-phase media (the generalization to multiple correlation functions is straightforward [18, 19]). It is desired to generate realizations of two-phase isotropic media that have a target two-point correlation function f2(r) associated with phase i, where r is the distance between the two points and i = 1 or 2. Let f̂2(r) be the corresponding function of the reconstructed digitized system (with periodic boundary conditions) at some time step. It is this system that we will attempt to evolve towards f2(r) from an initial guess of the system realization. Again, for simplicity, we define a fictitious “energy” (or norm-2 error) E at any particular stage as

E = Σr [f̂2(r) − f2(r)]²,  (8)
where the sum is over all discrete values of r. Potential candidates for the correlation functions [8] include the standard two-point probability function S2(r), the lineal path function L(z), the pore-size density function P(δ), and the two-point cluster function C2(r). For statistically isotropic materials, S2(r) gives the probability of finding the end points of a line segment of length r in one of the phases (say phase 1) when the segment is randomly tossed into the system, whereas L(z) gives the probability of finding an entire line segment of length z in phase 1 (or 2) when randomly tossed into the system.
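For a digitized, periodic medium, S2 along the lattice axes can be measured with a pair of fast Fourier transforms, since S2 at a given lattice separation is simply the autocorrelation of the phase-indicator field. The minimal Python sketch below (array sizes and the random test image are illustrative) provides the measurement kernel assumed by the annealing sketch in the next section.

import numpy as np

def S2_axes(I):
    # autocorrelation of the 0/1 indicator field, normalized per pixel;
    # row 0 and column 0 give S2(r) along the two axes of the periodic image
    F = np.fft.fft2(I)
    corr = np.fft.ifft2(F * np.conj(F)).real / I.size
    return corr[0, :], corr[:, 0]

rng = np.random.default_rng(1)
img = (rng.random((128, 128)) < 0.5).astype(float)  # random checkerboard-like start
s2x, s2y = S2_axes(img)
print(s2x[0], img.mean())   # S2(0) must equal the volume fraction phi1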
3.2. Solution of Optimization Problem
The aforementioned optimization problem is very difficult to solve due to the complicated nature of the objective function, which involves complex microstructural information in the form of correlation functions of the material, and due to the combinatorial nature of the feasible set. Standard mathematical programming techniques are therefore likely to be inefficient and to get trapped in local minima. In fact, the complexity and generality of the reconstruction problem make it difficult to devise deterministic algorithms of wide applicability. One therefore often resorts to heuristic techniques for global optimization, in particular, the simulated annealing method. Simulated annealing has been applied successfully to many difficult combinatorial problems, including NP-hard ones such as the “traveling salesman” problem. The utility of the simulated annealing method stems from its simplicity, in that it requires only “black-box” cost function evaluations, and from its built-in ability to escape local minima by accepting locally
unfavorable configurations. In its simplest form, the states of two selected pixels of different phases are interchanged, automatically preserving the volume fraction of both phases. The change in the error (or “energy”) ΔE = E′ − E between the two successive states is computed. This phase interchange is then accepted with some probability p(ΔE) that depends on ΔE. One reasonable choice is the Metropolis acceptance rule, i.e.,

p(ΔE) = 1 if ΔE ≤ 0, and p(ΔE) = exp(−ΔE/T) if ΔE > 0,  (9)
where T is a fictitious “temperature”. The concept of finding the lowest error (lowest energy) state by simulated annealing is based on a well-known physical fact: if a system is heated to a high temperature T and then slowly cooled down to absolute zero, the system equilibrates to its ground state. We note that there are various ways of appreciably reducing computational time. For example, computational cost can be significantly lowered by using other stochastic optimization schemes, such as the “Great Deluge” algorithm, which can be adjusted to accept only “downhill” energy changes, and the “threshold acceptance” algorithm [20]. Further savings can be attained by developing strategies that exploit the fact that pixel interchanges are local, and thus one can reuse the correlation functions measured in the previous time step instead of recomputing them fully at every step [19]. Additional cost savings have been achieved by interchanging pixels only at the two-phase interface [8].
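A bare-bones rendition of the annealing loop just described is sketched below in Python. For clarity it recomputes the full correlation function at every step and swaps arbitrary pixel pairs; as noted above, efficient implementations instead update the correlation functions locally and restrict interchanges to the interface. The temperature schedule, system size, and toy target are all illustrative choices, not values from the cited studies.

import numpy as np

rng = np.random.default_rng(2)

def energy(I, target):
    # Eq. (8), with f2 taken as S2 sampled along one axis of the periodic image
    F = np.fft.fft2(I)
    s2 = (np.fft.ifft2(F * np.conj(F)).real / I.size)[0, :]
    return np.sum((s2 - target) ** 2)

def anneal(I, target, T=1.0e-5, cooling=0.999, steps=5000):
    E = energy(I, target)
    for _ in range(steps):
        ones, zeros = np.argwhere(I == 1), np.argwhere(I == 0)
        i = tuple(ones[rng.integers(len(ones))])
        j = tuple(zeros[rng.integers(len(zeros))])
        I[i], I[j] = 0.0, 1.0                    # trial interchange preserves phi
        dE = energy(I, target) - E
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            E += dE                              # accept, per the Metropolis rule (9)
        else:
            I[i], I[j] = 1.0, 0.0                # reject: undo the interchange
        T *= cooling
    return I, E

ref = np.zeros((64, 64)); ref[16:48, 16:48] = 1.0          # toy target microstructure
F = np.fft.fft2(ref)
target = (np.fft.ifft2(F * np.conj(F)).real / ref.size)[0, :]
img = rng.permutation(ref.ravel()).reshape(ref.shape)      # same phi, scrambled pixels
img, E = anneal(img, target)
print(E)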
3.3. Illustrative Examples
Lower-order correlation functions generally do not contain complete information and thus cannot be expected to yield perfect reconstructions. Of course, the judicious use of combinations of lower-order correlation functions can yield more accurate reconstructions than any single function alone. Yeong and Torquato [19, 21] clearly showed that the two-point function S2 alone is not sufficient to accurately reconstruct random media. By also incorporating the lineal-path function L, they were able to obtain better reconstructions. They studied one-, two- and three-dimensional digitized isotropic media. Each simulation began with an initial configuration of pixels (white for phase 1 and black for phase 2) in the random checkerboard arrangement at a prescribed volume fraction. A two-dimensional example illustrating the insufficiency of S2 in reconstructions is a target system of overlapping disks at a disk volume fraction of φ2 = 0.5; see Fig. 6(a). Reconstructions that accurately match S2 alone, L alone, and both S2 and L are shown in Fig. 6. The S2-reconstruction is not very accurate; the cluster sizes are too large, and the system actually percolates.
Figure 6. (a) Target system: a realization of random overlapping disks. System size = 400 × 400 pixels, disk diameter = 31 pixels, and volume fraction φ2 = 0.5. (b) S2 -reconstruction. (c) Corresponding L-reconstruction. (d) Corresponding hybrid (S2 + L)-reconstruction.
(Note that overlapping disks percolate at a disk area fraction of φ2 ≈ 0.68 [8].) The L-reconstruction does a better job than the S2-reconstruction in capturing the clustering behavior. However, the hybrid (S2 + L)-reconstruction is the best. The optimization method can be used in the construction mode to find the specific structures that realize a specified set of correlation functions. An interesting question in this regard is the following: Is any correlation function physically realizable, or must the function satisfy certain conditions? It turns out that not all correlation functions are physically realizable. For example, what are the existence conditions for a valid (i.e., physically realizable) autocovariance function χ(r) ≡ S2(r) − φ1² for statistically homogeneous two-phase media? It is well known that there are certain nonnegativity conditions involving the spectral representation of the autocovariance χ(r) that must be obeyed [14]. However, it is not well known that these nonnegativity conditions are necessary but not sufficient conditions that a valid autocovariance χ(r) of a statistically homogeneous two-phase random medium (i.e., a binary stochastic spatial process) must meet. Some of these “binary” conditions are described by Torquato [8], but the complete characterization is a very difficult problem. Suffice it to say that the algorithm in the construction mode can be used
to provide guidance on the development of the mathematical conditions that a valid autocovariance χ(r) must obey. Cule and Torquato [20] considered the construction of realizations having the following autocovariance function:

[S2(r) − φ1²] / (φ1 φ2) = e^(−r/a) sin(qr)/(qr),  (10)
where q = 2π/b and the positive parameter b is a characteristic length that controls oscillations in the term sin(qr)/(qr), which also decays with increasing r. This function possesses phase-inversion symmetry [8] and exhibits a considerable degree of short-range order; it generalizes the purely exponentially decaying function studied by Debye et al. [22]. This function satisfies the nonnegativity condition on the spectral function but may not satisfy the “binary” conditions, depending on the values of a, b, and φ1 [14]. Two structures possessing the correlation function (10) are shown in Fig. 7 for φ2 = 0.2 and 0.5, in which a = 32 pixels and b = 8 pixels. For these sets of parameters, all of the aforementioned necessary conditions on the function are met. At φ2 = 0.2, the system resembles a dilute particle suspension with “particle” diameters of order b. At φ2 = 0.5, the resulting pattern is labyrinthine such that the characteristic sizes of the “patches” and “walls” are of order a and b, respectively. Note that S2(r) was sampled in all directions during the annealing process. In all of the previous two-dimensional examples, however, both S2 and L were sampled along two orthogonal directions to save computational time. This time-saving step should be implemented only for isotropic media, provided that there is no appreciable short-range order; otherwise, it leads to unwanted anisotropy [20, 23]. However, this artificial anisotropy can be reduced by optimizing along additional selected directions [24].
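For completeness, the target function of Eq. (10) can be tabulated in a few lines of Python, e.g., to serve as the target array in a construction-mode version of the annealing sketch above; the pixel values of a and b follow the example in the text.

import numpy as np

def S2_target(r, phi1, a=32.0, b=8.0):
    # Eq. (10): S2(r) = phi1*phi2*exp(-r/a)*sin(qr)/(qr) + phi1**2, q = 2*pi/b;
    # np.sinc(x) = sin(pi x)/(pi x), so sin(qr)/(qr) = np.sinc(q*r/pi)
    q = 2.0 * np.pi / b
    return phi1 * (1.0 - phi1) * np.exp(-r / a) * np.sinc(q * r / np.pi) + phi1 ** 2

r = np.arange(64.0)
print(S2_target(r, phi1=0.8)[:3])   # S2(0) = phi1, here with phi2 = 0.2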
Figure 7. Structures corresponding to the target correlation function given by (10) for φ2 = 0.2 and 0.5. Here a = 32 pixels and b = 8 pixels.
To what extent can information extracted from two-dimensional cuts through a three-dimensional isotropic medium, such as S2 and L, be employed to reproduce intrinsic three-dimensional information, such as connectedness? This question was studied for a Fontainebleau sandstone for which the full three-dimensional structure is known via X-ray microtomography [21]. The three-dimensional reconstruction that results from using a single slice of the sample and matching both S2 and L is shown in Fig. 8. The reconstructions accurately reproduce certain three-dimensional properties of the pore space, such as the pore-size functions, the mean survival time of a Brownian particle, and the fluid permeability. The degree of connectedness of the pore space also compares remarkably well with that of the actual sandstone, although this is not always the case [25]. As noted earlier, the aforementioned algorithm was originally applied to reconstruct realizations of many-particle systems [18]. The hard-sphere system, in which pairs of particles interact only with an infinite repulsion when they overlap, is one of the simplest interacting particle systems [8]. Importantly, the impenetrability constraint does not uniquely specify the statistical ensemble. The hard-sphere system can be in thermal equilibrium or in one of the infinitely many nonequilibrium states, such as the random sequential addition (or adsorption) (RSA) process that is produced by randomly, irreversibly, and sequentially placing nonoverlapping objects into a volume [8]. While particles in equilibrium have thermal motion such that they sample the configuration space uniformly, particles in an RSA process do not sample the configuration space uniformly, since their positions are forever “frozen” (i.e., they do not diffuse) after they have been placed into the system.
Figure 8. Hybrid reconstruction of a sandstone (described in Ref. [8]) using both S2 and L obtained from a single “slice”. System size is 128 × 128 × 128 pixels. Left panel: Pore space is white and opaque, and the grain phase is black and transparent. Right panel: 3D perspective of the surface cuts.
The geometrical blocking effects and the irreversible nature of the process result in structures that are distinctly different from corresponding equilibrium configurations, except at low densities. The saturation limit (the final state of this process whereby no particles can be added) occurs at a particle volume fraction φ2 ≈ 0.55 in two dimensions [8]. We now consider the reconstruction of the two-dimensional RSA disk system in which the target correlation function is the well-known radial distribution function (RDF) g(r). In two dimensions, the quantity ρ2πr g(r) dr gives the average number of particle centers in an annulus of thickness dr at a radial distance r from the center of a particle (where ρ is the number density). The RDF is of central importance in the study of equilibrium liquids in which the particles interact with pairwise-additive forces, since all of the thermodynamics
Figure 9. (a) A portion of a sample RSA system at φ2 = 0.543. (b) A portion of the reconstructed RSA system at φ2 = 0.543.
Figure 10. Configurations of 289 particles for φ2 = 0.2 in two dimensions. The equilibrium hard-disk system (left) shows more clustering and larger pores than the annealed step-function system (right).
can be expressed in terms of the RDF. The RDF can be ascertained from scattering experiments, which makes it a likely candidate for the reconstruction of a real system. The initial configuration was 5000 disks in equilibrium. Figure 9 shows a realization of the RSA system at φ2 = 0.543 in (a) and the reconstructed system in (b). As a quantitative comparison of how well the original and reconstructed systems matched, it was found that the corresponding pore-size distribution functions [8] were similar. This conclusion gives one confidence that a reasonable facsimile of the actual structure can be produced from the RDF for a class of many-particle systems in which there is no significant clustering of the particles. For the elementary unit step-function g2, previous work [26] indicated that this function is achievable by hard-sphere configurations up to a terminal covering fraction of particle exclusion diameters equal to 2^−d in d dimensions. To test whether the unit step g2 is actually achievable by hard spheres for nonzero densities, the aforementioned stochastic optimization procedure was applied in the construction mode. Calculations for d = 1 and 2 confirmed that the step-function g2 is indeed realizable up to the terminal density [27]. Figure 10 compares an equilibrium hard-disk configuration at φ2 = 0.2 to a corresponding annealed step-function system.
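Both ingredients of this example, an RSA configuration of hard disks and its radial distribution function, can be sketched compactly in Python. This is a hedged illustration rather than the production code of Refs. [18, 27]: the insertion loop uses a fixed attempt budget and will generally stop short of the saturation limit, and the O(N²) pair loop is adequate only for small samples.

import numpy as np

rng = np.random.default_rng(3)

def periodic_dist(p, q, L=1.0):
    d = np.abs(p - q)
    d = np.minimum(d, L - d)              # minimum-image convention
    return np.hypot(d[0], d[1])

def rsa_disks(diameter, L=1.0, attempts=20000):
    # random sequential addition: irreversible insertion of nonoverlapping disks
    pts = []
    for _ in range(attempts):
        p = rng.random(2) * L
        if all(periodic_dist(p, q, L) >= diameter for q in pts):
            pts.append(p)
    return np.array(pts)

def rdf(pts, L=1.0, nbins=40, rmax=0.25):
    n = len(pts); rho = n / L**2
    edges = np.linspace(0.0, rmax, nbins + 1)
    counts = np.zeros(nbins)
    for i in range(n):
        for j in range(i + 1, n):
            r = periodic_dist(pts[i], pts[j], L)
            if r < rmax:
                counts[np.searchsorted(edges, r) - 1] += 2.0
    shells = np.pi * (edges[1:]**2 - edges[:-1]**2)   # annulus areas
    return 0.5 * (edges[:-1] + edges[1:]), counts / (n * rho * shells)

pts = rsa_disks(diameter=0.05)
r, g = rdf(pts)
print(len(pts), np.pi * (0.05 / 2)**2 * len(pts))     # disk count and phi2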
4. Summary
The fundamental understanding of the microstructure/properties connection is the key to designing new materials with tailored properties for
specific applications. Optimization methods combined with novel synthesis and fabrication techniques provide a means of accomplishing this goal systematically and could make optimal design of real materials a reality in the future. The topology optimization technique and the stochastic reconstruction (construction) method address only a small subset of optimization issues of importance in materials science, but the results that are beginning to emerge from these relatively new methods bode well for future progress.
References [1] M.P. Bendsøe and N. Kikuchi, “Generating optimal topologies in structural design using a homogenization method,” Comput. Methods Appl. Mech. Eng., 71, 197–224, 1988. [2] O. Sigmund and S. Torquato, “Design of materials with extreme thermal expansion using a three-phase topology optimization method,” J. Mech. Phys. Solids, 45, 1037–1067, 1997. [3] O. Sigmund, S. Torquato, and I.A. Aksay, “On the design of 1-3 piezocomposites using topology optimization,” J. Mater. Res., 13, 1038–1048, 1998. [4] M.P. Bendsøe, Optimization of Structural Topology, Shape and Material, SpringerVerlag, Berlin, 1995. [5] S. Hyun and S. Torquato, “Designing composite microstructures with targeted properties,” J. Mater. Res., 16, 280–285, 2001. [6] S. Torquato, S. Hyun, and A. Donev, “Multifunctional composites: optimizing microstructures for simultaneous transport of heat and electricity,” Phys. Rev. Lett., 89, 266601, 1–4, 2002. [7] M.P. Bendsøe and O. Sigmund, Topology Optimization, Springer-Verlag, Berlin, 2003. [8] S. Torquato, Random Heterogeneous Materials: Microstructure and Macroscopic Properties, Springer-Verlag, New York, 2002. [9] G.W. Milton, The Theory of Composites, Cambridge University Press, Cambridge, England, 2002. [10] U.D. Larsen, O. Sigmund, and S. Bouwstra, “Design and fabrication of compliant mechanisms and material structures with negative Poisson’s ratio,” J. Microelectromechanical Systems, 6(2), 99–106, 1997. [11] S. Torquato and S. Hyun, “Effective-medium approximation for composite media: realizable single-scale dispersions,” J. Appl. Phys., 89, 1725–1729, 2001. [12] G.W. Milton, “Multicomponent composites, electrical networks and new types of continued fractions. I and II,” Commun. Math. Phys., 111, 281–372, 1987. [13] S. Torquato and A. Donev, “Minimal surfaces and multifunctionality,” Proc. R. Soc. Lond. A, 460, 1849–1856, 2004. [14] S. Torquato, “Exact conditions on physically realizable correlation functions of random media,” J. Chem. Phys., 111, 8832–8837, 1999. [15] M.Y. Joshi, A Class of Stochastic Models for Porous Media, Ph.D. thesis, University of Kansas, Lawrence, 1974. [16] P.M. Adler, Porous Media – Geometry and Transports, Butterworth-Heinemann, Boston, 1992. [17] A.P. Roberts and M. Teubner, “Transport properties of heterogeneous materials derived from Gaussian random fields: bounds and simulation,” Phys. Rev. E, 51, 4141–4154, 1995.
[18] M.D. Rintoul and S. Torquato, “Reconstruction of the structure of dispersions,” J. Colloid Interface Sci., 186, 467–476, 1997. [19] C.L.Y. Yeong and S. Torquato, “Reconstructing random media,” Phys. Rev. E, 57, 495–506, 1998a. [20] D. Cule and S. Torquato, “Generating random media from limited microstructural information via stochastic optimization,” J. Appl. Phys., 86, 3428–3437, 1999. [21] C.L.Y. Yeong and S. Torquato, “Reconstructing random media: II. Three-dimensional media from two-dimensional cuts,” Phys. Rev. E, 58, 224–233, 1998b. [22] P. Debye, H.R. Anderson, and H. Brumberger, “Scattering by an inhomogeneous solid. II. The correlation function and its applications,” J. Appl. Phys., 28, 679–683, 1957. [23] C. Manwart and R. Hilfer, “Reconstruction of random media using Monte Carlo methods,” Phys. Rev. E, 59, 5596–5599, 1999. [24] N. Sheehan and S. Torquato, “Generating microstructures with specified correlation function,” J. Appl. Phys., 89, 53–60, 2001. [25] C. Manwart, S. Torquato, and R. Hilfer, “Stochastic reconstruction of sandstones,” Phys. Rev. E, 62, 893–899, 2000. [26] F.H. Stillinger, S. Torquato, J.M. Eroles, and T.M. Truskett, “Iso-g (2) processes in equilibrium statistical mechanics,” J. Phys. Chem. B, 105, 6592–6597, 2001. [27] J.R. Crawford, S. Torquato, and F.H. Stillinger, “Aspects of correlation function realizability,” J. Chem. Phys., 2003.
7.19 MICROSTRUCTURAL CHARACTERIZATION ASSOCIATED WITH SOLID–SOLID TRANSFORMATIONS J.M. Rickman¹ and K. Barmak²
¹Department of Materials Science and Engineering, Lehigh University, Bethlehem, PA 18015, USA
²Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
1. Introduction
Materials scientists have long been interested in the characterization of complex polycrystalline systems, as embodied in the distribution of grain size and shape, and have sought to link microstructural features with observed mechanical, electronic and magnetic properties [1]. The importance of detailed microstructural characterization is underscored by systems of limited spatial dimensionality, with length scales of the order of nanometers to microns, as their reliability and performance are greatly influenced by specific microstructural features rather than by average, bulk properties [2]. For example, the functionalities of many electronic devices depend critically on the microstructure of thin metallic films via the film deposition process and the occurrence of reactive phase formation at metallic contacts. Various tools are available for quantitative microstructural characterization. Most notably, microstructural analyses often employ stereological techniques [1] and the related formalism of stochastic geometry [3] to interrogate grain populations and to deduce plausible growth scenarios that led to the observed grain morphologies. In this effort computer simulation is especially valuable, permitting one to implement various growth assumptions and to generate a large number of microstructures for subsequent analysis. The acquisition of comparable grain size and shape information from experimental images is, however, often problematic given difficulties inherent in grain recognition. The case of polycrystalline thin films is illustrative here. In these systems transmission
electron microscopy (TEM) is necessary to resolve pertinent microstructural features. Unfortunately, complex contrast variations peculiar to TEM images plague grain recognition and therefore image interpretation. As a result, the tedious, state-of-the-art analysis, until quite recently [4, 6], involved human intervention to trace grain boundaries and to collect grain statistics. In this topical article we review methods for quantitative microstructural analysis, focusing first on systems that evolve from nucleation and subsequent growth processes. As indicated above, computer simulation of these processes provides considerable insight into the link between initial conditions and product microstructures, and so we will highlight some recent work in this area on evolving first-order phase transformations in the absence of grain growth (i.e., coarsening). The analysis here will involve several important descriptors that are sensitive to different microstructural details and that can be used to infer the conditions that led to a given structure. Finally, we conclude with a discussion of new, automated image processing techniques that permit one to acquire information on large grain populations and to make useful comparisons of the associated grain-size distributions with those advanced in theoretical investigations of grain growth [6–8].
2. Phase Transformations
Computer simulations are particularly useful for investigating the impact of nucleation conditions on the product grain morphology resulting from a first-order phase transformation [9, 3]. Several schemes for modeling such transformations have been discussed in the literature [10, 11], and it is generally possible to use them to describe a variety of nucleation scenarios, including those involving site saturation (e.g., a burst) and a constant nucleation rate. To illustrate a typical microstructural analysis, consider the constant radial growth to impingement of product grains originating from a burst of nuclei that are randomly distributed in two dimensions. The resulting microstructure consists of a set of Voronoi grains that tile the system, as shown in Fig. 1.
2.1. Grain Area Distribution
Our characterization of this microstructure begins with the compilation of a frequency histogram of normalized grain areas, A′ = A/Ā, where the bar denotes a microstructural average. The corresponding probability distribution P(A′), as obtained for a relatively large grain population (∼10⁶ grains), is
Figure 1. A fully coalesced product microstructure produced by a burst of nuclei that subsequently grow at a constant radial rate until impingement.
shown in Fig. 2. While no exact analytical form for this distribution is known, approximate expressions based on the gamma distribution

Pγ(A′) = (A′)^(α−1) exp(−A′/β) / [β^α Γ(α)]  (1)
follow from stochastic geometry [12, 13], where α and β are parameters such that α = 1/β. For the Voronoi structure β is then the variance, as obtained analytically by Gilbert [14]. As can be seen from Fig. 2, the agreement between the simulated and approximate distributions is excellent. As P(A′) is a quantity of central importance in most microstructural analyses, it is of interest to determine whether it can be used to deduce, a posteriori, nucleation conditions. For this purpose, consider next the product microstructure resulting from nuclei that are randomly distributed on an underlying microstructure. A systematic analysis of such structures follows from a comparison of the relevant length scales here, namely the average underlying cell diameter, lu, and the average internuclear separation along the boundary, lb. For this discussion it is convenient to define a relative measure of these length scales, r = lb/lu, and so one would intuitively expect that in the limit r > 1 (r < 1) the product microstructure comprises largely equiaxed (elongated) grains. Several product microstructures corresponding to different values of r, shown in Fig. 3, confirm these expectations.
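A quick numerical illustration of Eq. (1) in Python: sample Voronoi cell areas for randomly placed points, normalize them, and read off β as the variance (with α = 1/β). Discarding unbounded cells and cells near the box edge, as done below, is a crude guard against boundary effects rather than the careful periodic treatment used in large-scale studies.

import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(4)
vor = Voronoi(rng.random((4000, 2)))       # random (Poisson) points in the unit square

def polygon_area(v):
    # shoelace formula; scipy returns ordered vertices for 2D regions
    x, y = v[:, 0], v[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

areas = []
for ridx in vor.point_region:
    region = vor.regions[ridx]
    if ridx == -1 or not region or -1 in region:
        continue                           # skip unbounded cells
    verts = vor.vertices[region]
    if verts.min() < 0.05 or verts.max() > 0.95:
        continue                           # skip cells near the border
    areas.append(polygon_area(verts))

A = np.array(areas) / np.mean(areas)       # normalized areas A' = A / Abar
beta = A.var()                             # Eq. (1) parameter, with alpha = 1/beta
print(len(A), beta)                        # beta lands near Gilbert's 2D value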
Figure 2. The corresponding probability distribution, P(A′), of normalized grain areas, A′, and an approximate representation, Pγ(A′) (solid line), based on the gamma distribution. Note the excellent agreement between the simulated and approximate distributions.
Figure 3. Product microstructures corresponding, from left to right, to r < 1, r ∼ 1, and r > 1. Note that in the limit r > 1 (r < 1) the product microstructure comprises largely equiaxed (elongated) grains.
Upon examining the probability distributions for large collections of grains with these values of r (see Fig. 4), it is evident that, upon decreasing r, the distribution shifts to the left and a greater number of both relatively small and large grains is created. A more detailed analysis of these distributions demonstrates, again, that the gamma distribution is a good approximation in many cases, and a calculation of lower-order moments reveals a scaling regime for intermediate values of r [9]. Despite these features, it is found that, in general, P(A′) lacks the requisite sensitivity to variations in r needed for an unambiguous identification of nucleation conditions. As an alternative to the grain-area distribution, one can obtain descriptors that focus on the spatial distribution of the nucleation sites themselves, regarded here as microstructural generators. The utility of such descriptors depends, of course, on the ability to extract from a product microstructure the spatial distribution of these generators. As a reverse Monte Carlo method was recently devised to accomplish this task in some circumstances [3], we merely
Figure 4. The probability distribution P(A′) for different values of the ratio of length scales r. Although there is a discernible shift in curve position and attendant change in curve shape upon changing r, the distribution is not very sensitive to these changes.
outline here the use of one such descriptor. Now, from the theory of point processes one can define a neighbor distribution wk(r), the generalization of a waiting-time distribution to space, that measures the probability of finding the kth neighbor at a distance r (not to be confused with the dimensionless microstructural parameter defined above) away from a given nucleus [15]. Consider then the first moment of this distribution, rk, for the kth neighbor. For points randomly distributed in d dimensions one finds that

rk = [Γ(1 + d/2)]^(1/d) Γ(k + 1/d) / [√π (λd)^(1/d) Γ(k)],  (2)
where λd is the d-dimensional volume density of points. Thus, the dependence of rk on k is a signature of the effective spatial dimensionality of the site-saturated nucleation process. Figure 5 shows the dependence of the normalized first moment on k for several cases of catalytic nucleation on an underlying microstructure, each corresponding to a different value of ζ = 1/r. An interpretation of Fig. 5 follows upon examining Fig. 6, the latter showing the dependence of the moment on k for the small and large ζ along with the predicted results for strictly oneand two-dimensional random distributions of nuclei. For low linear nucleation densities (e.g., ζ = 0.1) the underlying structure is effectively unimportant and so rk follows the theoretical two dimensional random curve for small to intermediate k. By contrast, at high nucleation densities, nuclei have many neighbors along a given edge and so rk initially exhibits pseudo-one-dimensional behavior. As more distant neighbors are considered, rk is consistent with
Figure 5. The first moment of the neighbor distribution, rk , as a function of neighbor number k for different values of ζ = 1/r.
two-dimensional random behavior as these neighbors are now on other boundary segments distributed throughout the system. With this information it is possible to infer different nucleation scenarios from rk vs. k [3]. Finally, it is worth noting that other, related descriptors are useful in distinguishing different nucleation conditions. For example, as is often done in atomistic simulation, one can calculate the pair correlation function, g(r), for the nuclei. The results of such a calculation are presented in Fig. 7 for nucleation on the corners of an underlying grain structure. A measure of the nonrandomness of this spatial distribution of nuclei at a particular r is given by g(r) − 1. Thus, g(r) is a sensitive measure of deviations from randomness, and has been employed to investigate spatial correlations among nuclei formed at underlying grain junctions [3, 16].
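Equation (2) is easy to check numerically for the spatially random (Poisson) case. The Python sketch below, using a k-d tree, compares the measured mean kth-neighbor distances in d = 2 against the Gamma-function prediction; boundary effects in the unit square account for the small residual discrepancy.

import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gamma as G

rng = np.random.default_rng(5)
d, n = 2, 20000
pts = rng.random((n, d))
lam = float(n)                            # point density in the unit square
dist, _ = cKDTree(pts).query(pts, k=11)   # column 0 is each point itself

for k in range(1, 6):
    measured = dist[:, k].mean()
    theory = (G(1 + d / 2) ** (1 / d) * G(k + 1 / d)
              / (np.sqrt(np.pi) * lam ** (1 / d) * G(k)))   # Eq. (2)
    print(k, measured, theory)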
3. Image Processing and Grain-size Distributions
As indicated above, the acquisition of statistically significant grain size and shape information from experimental micrographs is difficult owing to
Figure 6. The dependence of rk on k for small and large ζ, along with the predicted results for strictly one- and two-dimensional random distributions of nuclei.
problems associated with grain recognition. Nevertheless, it is essential to obtain such information to enable meaningful comparisons with simulated structures and to investigate various nucleation and growth scenarios. With this in mind, we outline below recent progress toward automated analysis of experimental micrographs. In this short review, our focus will be on assessing models of grain growth (i.e., coarsening) in thin films that describe microstructural kinetics after transformed grains have grown to impingement. The grain size of thin films is known to scale with the thickness of the film. Thus, for films with thicknesses of 1 nm to 1 µm it is necessary to employ transmission electron microscopy (TEM) to image the film grain structure. Although the grain structure of these films is easily discernible by eye from TEM micrographs, the image contrast is often quite complex. Such image contrast arises primarily from variations in the diffraction condition that result from: (1) changes in crystal orientation as a grain boundary is traversed, (2) localized straining of the lattice, and (3) long-wavelength bending of the sample. The latter two sources of contrast cannot be easily deconvoluted from the first, and, as a result, conventional image processing algorithms have been of limited utility in thin film grain structure analysis.
Figure 7. The pair correlation function g(r) versus internuclear separation, r, for nucleation on the corners of an underlying grain structure. A measure of the nonrandomness of this spatial distribution of nuclei at a particular r is given by g(r) − 1.
Recently we have developed and used a practical, automated scheme for the acquisition of microstructural information from a statistically significant population of grains imaged by TEM [4]. Our overall philosophy for automated detection of grain boundaries is to first optimize the microscope operating conditions and the resultant image, and to then eliminate, as much as possible, false features in the processed images, even sometimes at the expense of real microstructural features. The true information deleted in this manner is recovered by optimally combining processed images of the same field of view taken at different sample tilts. The new algorithms developed to independently process the images taken at different sample tilts are automated thresholding and three filters for removal of (i) short, disconnected pixel segments, (ii) excessively connected or “tangled” lines, and (iii) isolated clusters. The segment and tangle filters employ a length scale specified by the user that is estimated as the length, in
pixels, of the shortest real grain boundary segment in the TEM image. These newly developed filters are used in combination with other existing image processing routines including, for example, gray scale and binary operations such as the median noise filter, the Sobel and Kirsch edge detection filters, dilation and erosion operations, skeletonization, and opening and closing operations, to generate the binary image seen in Fig. 8. The images at different sample tilts are then combined to generate a single processed image that can then be analyzed using available software (e.g., NIH Image, Rasband, US National Institutes of Health, or Scion Image, http://www.scioncorp.com). Additional details of our automated image processing methodology can be found elsewhere [4, 5]. The experimentally determined grain size data for 8185 Al grains obtained using our automated methodology are shown in Fig. 9. The figure also shows three continuous probability density functions, corresponding to the lognormal (l), gamma (g), and Rayleigh (r) distributions, respectively, that have been fitted to the experimental data. The functional forms of these distributions are given by

pl(d) = [1/(d(2πβ²)^(1/2))] exp[−(ln(d) − α)²/(2β²)],  (3)

pg(d) = [d^(α−1)/(β^α Γ(α))] exp(−d/β),  (4)

pr(d) = (αd/β) exp(−d²/(4β)),  (5)
where α and β are fitting parameters that are different for each distribution and, in the case of the Rayleigh distribution, normalization requires that α = 1/2. In these expressions, d represents an equivalent circular grain diameter, i.e., the diameter of a circle with equal area to that of the grain. The figure clearly demonstrates that the Rayleigh density is a poor representation of the experimental data, while both the lognormal and gamma densities fall within the error of the experimental distribution. It should be emphasized that large data sets, acquired here via automated methodologies, are needed to examine quantitatively the predictions of various grain growth models.
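Fitting Eqs. (3)–(5) to a list of equivalent circular diameters is routine with scipy; a Python sketch follows, with synthetic lognormal numbers standing in for the measured grain sizes (the data are placeholders, not the 8185 Al grains of Fig. 9). The comments relate scipy's parameterizations to the α and β of the text.

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
dia = rng.lognormal(mean=np.log(0.1), sigma=0.5, size=8185)   # placeholder diameters

# stats.lognorm(s, scale=exp(mu)) matches Eq. (3) with beta = s and alpha = mu;
# stats.gamma(a, scale=b) matches Eq. (4) with alpha = a and beta = b;
# stats.rayleigh(scale=s) matches Eq. (5) (alpha = 1/2) with s**2 = 2*beta.
fits = {
    stats.lognorm: stats.lognorm.fit(dia, floc=0),
    stats.gamma: stats.gamma.fit(dia, floc=0),
    stats.rayleigh: stats.rayleigh.fit(dia, floc=0),
}
for law, params in fits.items():
    loglike = np.sum(law.logpdf(dia, *params))    # crude goodness-of-fit ranking
    print(law.name, params, loglike)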
4. Conclusions
Various techniques for the analysis of microstructures generated both experimentally and by computer simulation were described above. In the case
Figure 8. A bright-field scanning transmission electron micrograph and processed images of a 100 nm thick Al film. (B1–D1) Results from conventional image processing, after (B1) median noise filter and Sobel edge detection operator, (C1) dilation, skeletonization. (B2–D2) Results from using a combination of new and conventional image processing operations, after (B2) hybrid median noise filter and Kirsch edge detection filter, (C2) dilation, skeletonization, segment filter and tangle filter, and (D2) cluster filter and final consolidation. Note that conventional image processing results in a number of false grains.
of computer simulation, the focus was on developing descriptors that can be used to infer nucleation and growth conditions associated with a first-order phase transformation from a final, coalesced product microstructure. We also described a methodology for the automated analysis of experimental TEM
Figure 9. (a) Lognormal, (b) gamma, and (c) Rayleigh distributions fitted to experimental grain size data comprising 8185 Al grains in a thin film. Error bars represent a 95% confidence level.
micrographs. The purpose of such an analysis is to obtain statistically significant size and shape data for a large grain population. Finally, we use the information from the automated analysis to assess the validity of different grain growth models.
Acknowledgments The authors are grateful for support under DMR-9256332, DMR-9713439 and DMR-9996315.
References [1] E.E. Underwood, Quantitative Stereology, Addison-Wesley, Reading, Massachusetts, 1970. [2] J. Harper and K. Rodbell, J. Vac. Sci. Technol. B, 15, 763, 1997. [3] W.S. Tong, J.M. Rickman, and K. Barmak, Acta Mater., 47, 435, 1999. [4] D.T. Carpenter, J.M. Rickman, and K. Barmak, J. Appl. Phys., 84, 5843, 1998. [5] D.T. Carpenter, J.R. Codner, K. Barmak, and J.M. Rickman, Mater. Lett., 41, 296, 1999. [6] N. Louat, Acta Metall., 22, 721, 1974. [7] P. Feltham, Acta Metall., 5, 97, 1957. [8] W.W. Mullins, Acta Mater., 46, 6219, 1998. [9] W.S. Tong, J.M. Rickman, and K. Barmak, “Impact of boundary nucleation on product grain size distribution,” J. Mater. Res., 12, 1501, 1997. [10] K.W. Mahin, K. Hanson, and J.W. Morris, Jr., Acta Metall., 28, 443, 1980. [11] H.J. Frost and C.V. Thompson, Acta Metall., 35, 529, 1987. [12] T. Kiang, Z. Astrophys, 48, 433, 1966. [13] D. Weaire, J.P. Kermode, and J. Wejchert, Phil. Mag. B, 53, L101–105, 1986. [14] E.N. Gilbert, Ann. Math. Stat., 33, 958, 1962. [15] N.G. van Kampen, Stochastic Processes in Physics and Chemistry, North-Holland, New York, 1992. [16] D. Stoyan and H. Stoyan, Appl. Stoch. Mod. Data Anal., 6, 13, 1990.
Chapter 8 FLUIDS
8.1 MESOSCALE MODELS OF FLUID DYNAMICS Bruce M. Boghosian¹ and Nicolas G. Hadjiconstantinou²
¹Department of Mathematics, Tufts University, Bromfield-Pearson Hall, Medford, MA 02155, USA
²Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
During the last half century, enormous progress has been made in the field of computational materials modeling, to the extent that in many cases computational approaches are used in a predictive fashion. Despite this progress, modeling of general hydrodynamic behavior remains a challenging task. One of the main challenges stems from the fact that hydrodynamics manifests itself over a very wide range of length and time scales. On one end of the spectrum, one finds the fluid's “internal” scale characteristic of its molecular structure (in the absence of quantum effects, which we omit in this chapter). On the other end, the “outer” scale is set by the characteristic sizes of the problem's domain. The resulting scale separation, or lack thereof, as well as the existence of intermediate scales, are key to determining the optimal approach. Successful treatments require a judicious choice of the level of description, which is a delicate balancing act between the conflicting requirements of fidelity and manageable computational cost: a coarse description typically requires models for underlying processes occurring at smaller length and time scales; on the other hand, a fine-scale model will incur a significantly larger computational cost. When no molecular or intermediate length scales are important, e.g., for simple fluids, modeling the fluid at the outer scale and as a continuum results in the most efficient approach. The most well known example of these “continuum” approaches is the Navier–Stokes description of a viscous fluid. Continuum hydrodynamic descriptions are typically derived from conservation laws, which require transport models before they can be solved. The resulting mathematical model is in the form of partial differential equations. A variety of methods have been developed for the solution of these, including finite-difference, finite-element, finite-volume, and spectral-element methods, such
as are described in Ref. [1]. All of these methods require that the physical domain be discretized using a mesh, the generation of which can be fairly involved, depending on the complexity of the problem. More recent efforts have culminated in the development of meshless methods for solving partial differential equations, an exposition of which can be found in Ref. [2].

In certain circumstances, the separation between the molecular and macroscopic scales of length and time is lost. This happens locally in, inter alia, liquid droplet coalescence, amphiphilic membranes and monolayers, contact-line dynamics in immiscible fluids, and shock formation. It may also happen globally, for example if ultra-high frequency waves are excited in a fluid. In these cases, one is forced to use a particulate description, the most well known of which is Molecular Dynamics (MD), in which particle orbits are tracked numerically. An extensive description of MD can be found in Chapter 2 [3], while a discussion of its applications to hydrodynamics can be found in Ref. [4].

The Navier–Stokes equations on one hand and MD on the other represent two extreme possibilities. Typical problems of interest, and in particular problems of practical interest involving complex fluids and inhomogeneities, are significantly more complex, leading to a wide range of intermediate scales that need to be addressed. For the foreseeable future, MD can be applied only to very small systems and for very short periods of time due to the computational cost associated with this approach. The principal purpose of this chapter is to describe numerous intermediate or “mesoscale” approaches between these extremes, which attempt to coarse-grain the particulate description to varying degrees to address modeling needs.

An example of a mesoscale approach can be found in descriptions of a dilute gas, in which particles travel in straight-line orbits for the great majority of the time. In this situation, calculating trajectories between collisions in an exact fashion is unnecessary and therefore inefficient. A particularly ingenious method, known as Direct Simulation Monte Carlo (DSMC), takes advantage of this observation to split particle motion into successive collisionless advection and collision events. The collisionless advection occurs in steps on the order of a collision time, in contrast to MD, which may require on the order of 10² time steps per collision time; likewise, collision events are processed in a stochastic manner in DSMC, in contrast to MD, which tracks detailed trajectories of colliding particles. The result of this coarse graining is a description which is many orders of magnitude more computationally efficient than MD, but sacrifices atomic-level detail and precise representation of interparticle correlations. The method is described in Ref. [5]. An extension of this method, called Direct Simulation Automata (DSA), includes multiparticle collisions that make it suitable for the description of liquids and complex fluids; this is described in Ref. [6].

For a wider range of materials, including dense liquids and complex fluids, thermal fluctuations and viscous dissipation are among the essential emergent
properties captured by MD. For example, these are the principal ingredients of a Langevin description of a colloidal suspension. In a physical system with microscopically reversible particle orbits, these quantities are related by the Fluctuation–Dissipation Theorem of statistical physics. Dissipative Particle Dynamics (DPD) takes advantage of this to include these ingredients in a physically realistic fashion. It modifies the conservative forces of an MD simulation by introducing effective potentials, as well as fluctuating and frictional forces that represent the degrees of freedom that are lost by coarse-graining. The result is a description that is orders of magnitude more computationally efficient than MD, but which sacrifices the precise treatment of correlations and fluctuations, and requires the use of effective potentials. The DPD model is described in Ref. [7]. If one is willing to dispense with all representation of thermal fluctuations and multiparticle correlations, one may retain only the single-particle distribution function, as for example in the Boltzmann equation of kinetic theory. It was discovered in the late 1980s that the Navier–Stokes limit of the Boltzmann description is surprisingly robust with respect to radical discretizations of the velocity space. In particular, it is possible to adopt a velocity space that consists only of a small discrete set of velocities, coincident with lattice vectors of a particular lattice. For certain choices of lattice and of collision operator, the resulting Boltzmann equation, which describes the transport of particles on a lattice with collisions localized to lattice sites, can be rigorously shown to give rise to Navier–Stokes behavior. These lattice Boltzmann models are described in Ref. [8]. Since their discovery, they have been extended to deal with compressible flow, adaptive mesh refinement on structured and unstructured grids, multiphase flow, and complex fluids. In a number of problems, the hydrodynamics evolves over a wide range of length and time scales. If this range of scales is sufficiently wide that no single description can be used, hybrid methods, which combine more than one description, can be used. The motivation for hybrid methods stems from the fact that, invariably, the “higher fidelity” description is also more computationally expensive, and thus it becomes advantageous to limit its use to the regions in which it is necessary. Clearly, hybrid methods in this respect make sense only when the “higher fidelity” description is required in small regions of space. Although hybrid methods coupling any of the methods described in this chapter can be envisioned, currently most effort has focused on the development of Navier–Stokes/MD and Navier–Stokes/DSMC hybrids. These are described in detail in Ref. [9]. The list of topics chosen for inclusion in this chapter is representative but not exhaustive. In particular, space limitations have precluded us from including much interesting and excellent work in the areas of mesh generation, adaptive mesh refinement, and boundary element methods for the Navier–Stokes equations. Also missing are descriptions of certain mesoscale methods, such
as lattice-gas automata and smoothed-particle hydrodynamics. Nevertheless, we feel that the topics included provide a representative cross section of this fast developing and exciting area of materials modeling research.
References

[1] S. Sherwin and J. Peiró, “Finite difference, finite element and finite volume methods for partial differential equations,” Article 8.2, this volume.
[2] G. Li, X. Jin, and N.R. Aluru, “Meshless methods for numerical solution of partial differential equations,” Article 8.3, this volume.
[3] J. Li, “Basic molecular dynamics,” Article 2.8, this volume.
[4] J. Koplik and J.R. Banavar, “Continuum deductions from molecular hydrodynamics,” Ann. Rev. Fluid Mech., 27, 257–292, 1995.
[5] F.J. Alexander, “The direct simulation Monte Carlo method: going beyond continuum hydrodynamics,” Article 8.7, this volume.
[6] T. Sakai and P.V. Coveney, “Discrete simulation automata: mesoscopic fluid models endowed with thermal fluctuations,” Article 8.5, this volume.
[7] P. Español, “Dissipative particle dynamics,” Article 8.6, this volume.
[8] S. Succi, W. E, and E. Kaxiras, “Lattice Boltzmann methods for multiscale fluid problems,” Article 8.4, this volume.
[9] H.S. Wijesinghe and N.G. Hadjiconstantinou, “Hybrid atomistic-continuum formulations for multiscale hydrodynamics,” Article 8.8, this volume.
8.2 FINITE DIFFERENCE, FINITE ELEMENT AND FINITE VOLUME METHODS FOR PARTIAL DIFFERENTIAL EQUATIONS

Joaquim Peiró and Spencer Sherwin
Department of Aeronautics, Imperial College, London, UK
There are three important steps in the computational modelling of any physical process: (i) problem definition, (ii) mathematical model, and (iii) computer simulation. The first natural step is to define an idealization of our problem of interest in terms of a set of relevant quantities which we would like to measure. In defining this idealization we expect to obtain a well-posed problem, that is, one that has a unique solution for a given set of parameters. It might not always be possible to guarantee the fidelity of the idealization since, in some instances, the physical process is not totally understood. An example is the complex environment within a nuclear reactor where obtaining measurements is difficult. The second step of the modelling process is to represent our idealization of the physical reality by a mathematical model: the governing equations of the problem. These are available for many physical phenomena. For example, in fluid dynamics the Navier–Stokes equations are considered to be an accurate representation of the fluid motion. Analogously, the equations of elasticity in structural mechanics govern the deformation of a solid object due to applied external forces. These are complex general equations that are very difficult to solve both analytically and computationally. Therefore, we need to introduce simplifying assumptions to reduce the complexity of the mathematical model and make it amenable to either exact or numerical solution. For example, the irrotational (without vorticity) flow of an incompressible fluid is accurately represented by the Navier–Stokes equations but, if the effects of fluid viscosity are small, then Laplace’s equation of potential flow is a far more efficient description of the problem.
After the selection of an appropriate mathematical model, together with suitable boundary and initial conditions, we can proceed to its solution. In this chapter we will consider the numerical solution of mathematical problems which are described by partial differential equations (PDEs). The three classical choices for the numerical solution of PDEs are the finite difference method (FDM), the finite element method (FEM) and the finite volume method (FVM). The FDM is the oldest and is based upon the application of a local Taylor expansion to approximate the differential equations. The FDM uses a topologically square network of lines to construct the discretization of the PDE. This is a potential bottleneck of the method when handling complex geometries in multiple dimensions. This issue motivated the use of an integral form of the PDEs and subsequently the development of the finite element and finite volume techniques. To provide a short introduction to these techniques we shall consider each type of discretization as applied to one-dimensional PDEs. This will not allow us to illustrate the geometric flexibility of the FEM and the FVM to their full extent, but we will be able to demonstrate some of the similarities between the methods and thereby highlight some of the relative advantages and disadvantages of each approach. For a more detailed understanding of the approaches we refer the reader to the section on suggested reading at the end of the chapter. The chapter is structured as follows. We start by introducing the concept of conservation laws and their differential representation as PDEs and the alternative integral forms. We next discuss the classification of partial differential equations: elliptic, parabolic, and hyperbolic. This classification is important since the type of PDE dictates the form of boundary and initial conditions required for the problem to be well-posed. In some cases, e.g., for hyperbolic equations, it also permits the identification of suitable schemes to discretize the differential operators. The three types of discretization, FDM, FEM and FVM, are then discussed and applied to different types of PDEs. We then end our overview by discussing the numerical difficulties which can arise in the numerical solution of the different types of PDEs using the FDM, and by providing an introduction to the assessment of the stability of numerical schemes using a Fourier or von Neumann analysis. Finally we note that, given the scientific background of the authors, the presentation has a bias towards fluid dynamics. However, we stress that the fundamental concepts presented in this chapter are generally applicable to continuum mechanics, both solids and fluids.
1. Conservation Laws: Integral and Differential Forms

The governing equations of continuum mechanics representing the kinematic and mechanical behaviour of general bodies are commonly referred
to as conservation laws. These are derived by invoking the conservation of mass and energy and the momentum equation (Newton’s law). Whilst they are equally applicable to solids and fluids, their differing behaviour is accounted for through the use of a different constitutive equation. The general principle behind the derivation of conservation laws is that the rate of change of u(x, t) within a volume V plus the flux of u through the boundary A is equal to the rate of production of u denoted by S(u, x, t). This can be written as

$$\frac{d}{dt}\int_V u(x,t)\,dV + \int_A f(u)\cdot n\,dA - \int_V S(u,x,t)\,dV = 0 \qquad (1)$$
which is referred to as the integral form of the conservation law. For a fixed (independent of t) volume and, under suitable conditions of smoothness of the intervening quantities, we can apply Gauss’ theorem

$$\int_V \nabla\cdot f\,dV = \int_A f\cdot n\,dA$$
to obtain

$$\int_V \left(\frac{\partial u}{\partial t} + \nabla\cdot f(u) - S\right) dV = 0. \qquad (2)$$

For the integral expression to be zero for any volume V, the integrand must be zero. This results in the strong or differential form of the equation

$$\frac{\partial u}{\partial t} + \nabla\cdot f(u) - S = 0. \qquad (3)$$
An alternative integral form can be obtained by the method of weighted residuals. Multiplying Eq. (3) by a weight function w(x) and integrating over the volume V we obtain

$$\int_V \left(\frac{\partial u}{\partial t} + \nabla\cdot f(u) - S\right) w(x)\,dV = 0. \qquad (4)$$

If Eq. (4) is satisfied for any weight function w(x), then Eq. (4) is equivalent to the differential form (3). The smoothness requirements on f can be relaxed by applying Gauss’ theorem to Eq. (4) to obtain

$$\int_V \left[\left(\frac{\partial u}{\partial t} - S\right) w(x) - f(u)\cdot\nabla w(x)\right] dV + \int_A f\cdot n\,w(x)\,dA = 0. \qquad (5)$$

This is known as the weak form of the conservation law.
Although the above formulation is more commonly used in fluid mechanics, similar formulations are also applied in structural mechanics. For instance, the well-known principle of virtual work for the static equilibrium of a body [1] is given by

$$\delta W = \int_V (\nabla\cdot\sigma + f)\cdot\delta v\,dV = 0$$

where δW denotes the virtual work done by an arbitrary virtual velocity δv, σ is the stress tensor and f denotes the body force. The similarity with the method of weighted residuals (4) is evident.
2. Model Equations and their Classification

In the following we will restrict ourselves to the analysis of one-dimensional conservation laws representing the transport of a scalar variable u(x, t) defined in the domain Ω = {x, t : 0 ≤ x ≤ 1, 0 ≤ t ≤ T}. The convection–diffusion–reaction equation is given by

$$L(u) = \frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(au - b\frac{\partial u}{\partial x}\right) - ru = s \qquad (6)$$

together with appropriate boundary conditions at x = 0 and x = 1 to make the problem well-posed. In the above equation L(u) simply represents a linear differential operator. This equation can be recast in the form (3) with f(u) = au − b ∂u/∂x and S(u) = s + ru. It is linear if the coefficients a, b, r and s are functions of x and t, and non-linear if any of them depends on the solution u. In what follows, we will use for convenience the convention that the presence of a subscript x or t in an expression indicates a derivative or partial derivative with respect to that variable, for example

$$u_x(x) = \frac{du}{dx}(x); \qquad u_t(x,t) = \frac{\partial u}{\partial t}(x,t); \qquad u_{xx}(x,t) = \frac{\partial^2 u}{\partial x^2}(x,t).$$

Using this notation, Eq. (6) is re-written as

$$u_t + (au - bu_x)_x - ru = s.$$
2.1. Elliptic Equations

The steady-state solution of Eq. (6) when advection and source terms are neglected, i.e., a = 0 and s = 0, is a function of x only and satisfies the Helmholtz equation

$$(bu_x)_x + ru = 0. \qquad (7)$$
This equation is elliptic and its solution depends on two families of integration constants that are fixed by prescribing boundary conditions at the ends of the domain. One can either prescribe Dirichlet boundary conditions at both ends, e.g., u(0) = α_0 and u(1) = α_1, or substitute one of them (or both if r ≠ 0) by a Neumann boundary condition, e.g., u_x(0) = g. Here α_0, α_1 and g are known constant values. We note that if we introduce a perturbation ε into a Dirichlet boundary condition, e.g., u(0) = α_0 + ε, we will observe an instantaneous modification to the solution throughout the domain. This is indicative of the elliptic nature of the problem.
2.2. Parabolic Equations

Taking a = 0, r = 0 and s = 0 in our model, Eq. (6) leads to the heat or diffusion equation

$$u_t - (b\,u_x)_x = 0, \qquad (8)$$

which is parabolic. In addition to appropriate boundary conditions of the form used for elliptic equations, we also require an initial condition at t = 0 of the form u(x, 0) = u_0(x), where u_0 is a given function. If b is constant, this equation admits solutions of the form u(x, t) = A e^{βt} sin kx if β + k²b = 0. A notable feature of the solution is that it decays when b is positive, as the exponent β < 0. The rate of decay is a function of b: the more diffusive the equation (i.e., the larger b), the faster the decay of the solution. In general the solution can be made up of many sine waves of different frequencies, i.e., a Fourier expansion of the form

$$u(x,t) = \sum_m A_m e^{\beta_m t} \sin k_m x,$$

where A_m and k_m represent the amplitude and the frequency of a Fourier mode, respectively. The decay of the solution depends on the Fourier contents of the initial data since β_m = −k_m² b. High frequencies decay at a faster rate than low frequencies, which physically means that the solution is being smoothed. This is illustrated in Fig. 1, which shows the time evolution of u(x, t) for an initial condition u_0(x) = 20x for 0 ≤ x ≤ 1/2 and u_0(x) = 20(1 − x) for 1/2 ≤ x ≤ 1. The solution shows a rapid smoothing of the slope discontinuity of the initial condition at x = 1/2. The presence of a positive diffusion (b > 0) physically results in a smoothing of the solution which stabilizes it. On the other hand, negative diffusion (b < 0) is de-stabilizing, but most physical problems have positive diffusion.
Figure 1. Rate of decay of the solution to the diffusion equation (profiles u(x) at times t = 0, T, 2T, …, 6T showing the progressive smoothing).
2.3. Hyperbolic Equations

A classic example of a hyperbolic equation is the linear advection equation

$$u_t + a\,u_x = 0, \qquad (9)$$

where a represents a constant velocity. The above equation is also clearly equivalent to Eq. (6) with b = r = s = 0. This hyperbolic equation also requires an initial condition, u(x, 0) = u_0(x). The question of what boundary conditions are appropriate for this equation can be answered more easily after considering its solution. It is easy to verify by substitution in (9) that the solution is given by u(x, t) = u_0(x − at). This describes the propagation of the quantity u(x, t) moving with speed “a” in the x-direction as depicted in Fig. 2. The solution is constant along the characteristic line x − at = c with u(x, t) = u_0(c). From the knowledge of the solution, we can appreciate that for a > 0 a boundary condition should be prescribed at x = 0 (e.g., u(0) = α_0), where information is being fed into the solution domain. The value of the solution at x = 1 is determined by the initial conditions or the boundary condition at x = 0 and cannot, therefore, be prescribed. This simple argument shows that, in a hyperbolic problem, the selection of appropriate conditions at a boundary point depends on the solution at that point. If the velocity is negative, the previous treatment of the boundary conditions is reversed.
Figure 2. Solution of the linear advection equation: the initial profile u(x, 0) is transported unchanged along the characteristic x − at = c.
The propagation velocity can also be a function of space, i.e., a = a(x), or even the same as the quantity being propagated, i.e., a = u(x, t). The choice a = u(x, t) leads to the non-linear inviscid Burgers’ equation

$$u_t + u\,u_x = 0. \qquad (10)$$
An analogous analysis to that used for the advection equation shows that u(x, t) is constant if we are moving with a local velocity also given by u(x, t). This means that some regions of the solution advance faster than other regions, leading to the formation of sharp gradients. This is illustrated in Fig. 3. The initial velocity is represented by a triangular “zig-zag” wave. Peaks and troughs in the solution will advance, in opposite directions, with maximum speed. This will eventually lead to an overlap, as depicted by the dotted line in Fig. 3. This results in a non-uniqueness of the solution, which is obviously non-physical, and to resolve this problem we must allow for the formation and propagation of discontinuities when two characteristics intersect (see Ref. [2] for further details).

Figure 3. Formation of discontinuities in the Burgers’ equation.
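The breaking of the solution can be checked numerically. The following minimal Python sketch (our illustration, not part of the original text; the smooth profile u_0(x) = −sin(πx) is an arbitrary stand-in for the triangular wave) follows the characteristics x_0 + t u_0(x_0) and tests when the map first folds over, which for Burgers’ equation occurs at t = −1/min(u_0′):

```python
import numpy as np

# Characteristics of u_t + u*u_x = 0: each point x0 moves with speed u0(x0).
x0 = np.linspace(-1.0, 1.0, 2001)
u0 = -np.sin(np.pi * x0)                      # assumed smooth initial profile

# The map x0 -> x0 + t*u0(x0) stops being monotonic (characteristics cross
# and a discontinuity forms) at t_shock = -1/min(u0').
t_shock = -1.0 / np.min(np.gradient(u0, x0))  # ~ 1/pi for this profile
print("shock formation time ~", t_shock)

for t in (0.5 * t_shock, 1.5 * t_shock):
    xt = x0 + t * u0
    print(f"t = {t:.3f}: characteristic map monotonic? "
          f"{bool(np.all(np.diff(xt) > 0))}")
```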
3. Numerical Schemes

There are many situations where obtaining an exact solution of a PDE is not possible and we have to resort to approximations in which the infinite set of values in the continuous solution is represented by a finite set of values referred to as the discrete solution. For simplicity we consider first the case of a function of one variable, u(x). Given a set of points x_i; i = 1, …, N in the domain of definition of u(x), as shown in Fig. 4, the numerical solution that we are seeking is represented by a discrete set of function values {u_1, …, u_N} that approximate u at these points, i.e., u_i ≈ u(x_i); i = 1, …, N. In what follows, and unless otherwise stated, we will assume that the points are equally spaced along the domain with a constant distance Δx = x_{i+1} − x_i; i = 1, …, N − 1. This way we will write u_{i+1} ≈ u(x_{i+1}) = u(x_i + Δx). This partition of the domain into smaller subdomains is referred to as a mesh or grid.

Figure 4. Discretization of the domain (points x_1, …, x_N; the control volume Ω_i extends from x_{i−1/2} to x_{i+1/2}).
3.1. The Finite Difference Method (FDM)
This method is used to obtain numerical approximations of PDEs written in the strong form (3). The derivative of u(x) with respect to x can be defined as

$$u_x|_i = u_x(x_i) = \lim_{\Delta x \to 0} \frac{u(x_i + \Delta x) - u(x_i)}{\Delta x} = \lim_{\Delta x \to 0} \frac{u(x_i) - u(x_i - \Delta x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{u(x_i + \Delta x) - u(x_i - \Delta x)}{2\Delta x}. \qquad (11)$$

All these expressions are mathematically equivalent, i.e., the approximation converges to the derivative as Δx → 0. If Δx is small but finite, the various terms in Eq. (11) can be used to obtain approximations of the derivative u_x of the form

$$u_x|_i \approx \frac{u_{i+1} - u_i}{\Delta x} \qquad (12)$$

$$u_x|_i \approx \frac{u_i - u_{i-1}}{\Delta x} \qquad (13)$$

$$u_x|_i \approx \frac{u_{i+1} - u_{i-1}}{2\Delta x}. \qquad (14)$$

The expressions (12)–(14) are referred to as the forward, backward and centred finite difference approximations of u_x|_i, respectively. Obviously these approximations of the derivative are different.
3.1.1. Errors in the FDM

The analysis of these approximations is performed by using Taylor expansions around the point x_i. For instance, an approximation to u_{i+1} using n + 1 terms of a Taylor expansion around x_i is given by

$$u_{i+1} = u_i + u_x|_i\,\Delta x + u_{xx}|_i\,\frac{\Delta x^2}{2} + \cdots + \frac{d^n u}{dx^n}\bigg|_i \frac{\Delta x^n}{n!} + \frac{d^{n+1} u}{dx^{n+1}}(x^*)\,\frac{\Delta x^{n+1}}{(n+1)!}. \qquad (15)$$

The last term is called the remainder, with x_i ≤ x* ≤ x_{i+1}, and represents the error in the approximation if only the first n terms in the expansion are kept. Although the expression (15) is exact, the position x* is unknown.
To illustrate how this can be used to analyse finite difference approximations, consider the case of the forward difference approximation (12) and use the expansion (15) with n = 1 (two terms) to obtain

$$\frac{u_{i+1} - u_i}{\Delta x} = u_x|_i + \frac{\Delta x}{2}\,u_{xx}(x^*). \qquad (16)$$

We can now write the approximation of the derivative as

$$u_x|_i = \frac{u_{i+1} - u_i}{\Delta x} + \epsilon_T \qquad (17)$$

where ε_T is given by

$$\epsilon_T = -\frac{\Delta x}{2}\,u_{xx}(x^*). \qquad (18)$$

The term ε_T is referred to as the truncation error and is defined as the difference between the exact value and its numerical approximation. This term depends on Δx but also on u and its derivatives. For instance, if u(x) is a linear function then the finite difference approximation is exact and ε_T = 0, since the second derivative is zero in (18). The order of a finite difference approximation is defined as the power p such that lim_{Δx→0}(ε_T/Δx^p) = γ ≠ 0, where γ is a finite value. This is often written as ε_T = O(Δx^p). For instance, for the forward difference approximation (12), we have ε_T = O(Δx) and it is said to be first-order accurate (p = 1). If we apply this method to the backward and centred finite difference approximations (13) and (14), respectively, we find that, for constant Δx, their errors are

$$u_x|_i = \frac{u_i - u_{i-1}}{\Delta x} + \frac{\Delta x}{2}\,u_{xx}(x^*) \;\Rightarrow\; \epsilon_T = O(\Delta x) \qquad (19)$$

$$u_x|_i = \frac{u_{i+1} - u_{i-1}}{2\Delta x} - \frac{\Delta x^2}{12}\,u_{xxx}(\bar{x}) \;\Rightarrow\; \epsilon_T = O(\Delta x^2) \qquad (20)$$

with x_{i−1} ≤ x* ≤ x_i and x_{i−1} ≤ x̄ ≤ x_{i+1} for Eqs. (19) and (20), respectively. This analysis is confirmed by the numerical results presented in Fig. 5, which displays, in logarithmic axes, the exact and truncation errors against Δx for the backward and the centred finite differences. Their respective truncation errors ε_T are given by (19) and (20), calculated here, for lack of a better value, with x* = x̄ = x_i. The exact error is calculated as the difference between the exact value of the derivative and its finite difference approximation. The slopes of the lines are consistent with the order of the truncation error, i.e., 1:1 for the backward difference and 1:2 for the centred difference. The discrepancies between the exact and the numerical results for the smallest values of Δx are due to the use of finite precision computer arithmetic, or round-off error. This issue and its implications are discussed in more detail in numerical analysis textbooks such as Ref. [3].
Figure 5. Truncation and rounding errors in the finite difference approximation of derivatives (log–log plot of exact and truncation errors against Δx; slopes 1:1 for the backward difference and 1:2 for the centred difference, with round-off error dominating at the smallest Δx).
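These convergence rates are easy to verify numerically. The following minimal Python sketch (ours, not part of the chapter; the test function u(x) = sin x and the evaluation point x = 1 are arbitrary choices) prints the errors of the forward and centred approximations (12) and (14) as Δx is reduced:

```python
import numpy as np

# Forward (12) and centred (14) differences of u(x) = sin(x) at x0 = 1,
# where the exact derivative is cos(1).
x0, exact = 1.0, np.cos(1.0)
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    fwd = (np.sin(x0 + dx) - np.sin(x0)) / dx             # O(dx)
    cen = (np.sin(x0 + dx) - np.sin(x0 - dx)) / (2 * dx)  # O(dx^2)
    print(f"dx = {dx:.0e}: forward error = {abs(fwd - exact):.2e}, "
          f"centred error = {abs(cen - exact):.2e}")
# The forward error falls by ~10 per decade of dx and the centred error by
# ~100, consistent with first- and second-order truncation errors.
```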
3.1.2. Derivation of approximations using Taylor expansions

The procedure described in the previous section can be easily transformed into a general method for deriving finite difference schemes. In general, we can obtain approximations to higher order derivatives by selecting an appropriate number of interpolation points that permits us to eliminate the highest term of the truncation error from the Taylor expansions. We will illustrate this with some examples. A more general description of this derivation can be found in Hirsch [13]. A second-order accurate finite difference approximation of the derivative at x_i can be derived by considering the values of u at three points: x_{i−1}, x_i and x_{i+1}. The approximation is constructed as a weighted average of these values {u_{i−1}, u_i, u_{i+1}}, such as

$$u_x|_i \approx \frac{\alpha u_{i+1} + \beta u_i + \gamma u_{i-1}}{\Delta x}. \qquad (21)$$

Using Taylor expansions around x_i we can write

$$u_{i+1} = u_i + \Delta x\,u_x|_i + \frac{\Delta x^2}{2}u_{xx}|_i + \frac{\Delta x^3}{6}u_{xxx}|_i + \cdots \qquad (22)$$

$$u_{i-1} = u_i - \Delta x\,u_x|_i + \frac{\Delta x^2}{2}u_{xx}|_i - \frac{\Delta x^3}{6}u_{xxx}|_i + \cdots \qquad (23)$$

Putting (22), (23) into (21) we get

$$\frac{\alpha u_{i+1} + \beta u_i + \gamma u_{i-1}}{\Delta x} = \frac{1}{\Delta x}(\alpha + \beta + \gamma)\,u_i + (\alpha - \gamma)\,u_x|_i + \frac{\Delta x}{2}(\alpha + \gamma)\,u_{xx}|_i + \frac{\Delta x^2}{6}(\alpha - \gamma)\,u_{xxx}|_i + \frac{\Delta x^3}{24}(\alpha + \gamma)\,u_{xxxx}|_i + O(\Delta x^4). \qquad (24)$$

We require three independent conditions to calculate the three unknowns α, β and γ. To determine these we impose that the expression (24) is consistent with increasing orders of accuracy. If the solution is constant, the approximation should be zero; this requires the coefficient of (1/Δx)u_i to be zero and therefore α + β + γ = 0. If the solution is linear, we must have α − γ = 1 to match u_x|_i. Whilst the first two conditions are necessary for consistency of the approximation, in this case we are free to choose the third condition. We can therefore select the coefficient of (Δx/2)u_{xx}|_i to be zero to improve the accuracy, which means α + γ = 0. Solving these three equations we find the values α = 1/2, β = 0 and γ = −1/2, and recover the second-order accurate centred formula

$$u_x|_i = \frac{u_{i+1} - u_{i-1}}{2\Delta x} + O(\Delta x^2).$$
Other approximations can be obtained by selecting a different set of points; for instance, we could have also chosen three points to one side of x_i, e.g., u_i, u_{i−1}, u_{i−2}. The corresponding approximation is known as a one-sided formula. This is sometimes useful for imposing boundary conditions on u_x at the ends of the mesh.
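The three consistency conditions can also be solved mechanically. The short Python sketch below (our illustration, not from the chapter) casts the conditions derived from (24) as a 3 × 3 linear system and recovers the centred weights:

```python
import numpy as np

# Consistency conditions from the expansion (24):
#   alpha + beta + gamma = 0   (constant solutions),
#   alpha - gamma        = 1   (linear solutions, matching u_x),
#   alpha + gamma        = 0   (eliminate the O(dx) term).
A = np.array([[1.0, 1.0,  1.0],
              [1.0, 0.0, -1.0],
              [1.0, 0.0,  1.0]])
b = np.array([0.0, 1.0, 0.0])
alpha, beta, gamma = np.linalg.solve(A, b)
print(alpha, beta, gamma)   # 0.5, 0.0, -0.5: the centred formula (14)
```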
3.1.3. Higher-order derivatives

In general, we can derive an approximation of the second derivative using the Taylor expansion

$$u_{xx}|_i \approx \frac{\alpha u_{i+1} + \beta u_i + \gamma u_{i-1}}{\Delta x^2} = \frac{1}{\Delta x^2}(\alpha + \beta + \gamma)\,u_i + \frac{1}{\Delta x}(\alpha - \gamma)\,u_x|_i + \frac{1}{2}(\alpha + \gamma)\,u_{xx}|_i + \frac{\Delta x}{6}(\alpha - \gamma)\,u_{xxx}|_i + \frac{\Delta x^2}{24}(\alpha + \gamma)\,u_{xxxx}|_i + O(\Delta x^4). \qquad (25)$$

Using similar arguments to those of the previous section we impose

$$\begin{cases}\alpha + \beta + \gamma = 0\\ \alpha - \gamma = 0\\ \alpha + \gamma = 2\end{cases} \;\Longrightarrow\; \alpha = \gamma = 1,\quad \beta = -2. \qquad (26)$$

The first and second conditions require that there are no u or u_x terms on the right-hand side of Eq. (25), whilst the third condition ensures that the right-hand side approximates the left-hand side as Δx tends to zero. The solution of Eq. (26) leads us to the second-order centred approximation

$$u_{xx}|_i = \frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x^2} + O(\Delta x^2). \qquad (27)$$

The term (α − γ)Δx\,u_{xxx}|_i/6 in the Taylor expansion has the same coefficient as the u_x term and cancels out to make the approximation second-order accurate. This cancellation does not occur if the points in the mesh are not equally spaced. The derivation of a general three-point finite difference approximation with unevenly spaced points can also be obtained through Taylor series. We leave this as an exercise for the reader and proceed in the next section to derive a general form using an alternative method.
3.1.4. Finite differences through polynomial interpolation

In this section we seek to approximate the values of u(x) and its derivatives by a polynomial P(x) at a given point x_i. By way of example, we will derive expressions similar to the centred differences presented previously by considering an approximation involving the set of points {x_{i−1}, x_i, x_{i+1}} and the corresponding values {u_{i−1}, u_i, u_{i+1}}. The polynomial of minimum degree that satisfies P(x_{i−1}) = u_{i−1}, P(x_i) = u_i and P(x_{i+1}) = u_{i+1} is the quadratic Lagrange polynomial

$$P(x) = u_{i-1}\frac{(x - x_i)(x - x_{i+1})}{(x_{i-1} - x_i)(x_{i-1} - x_{i+1})} + u_i\frac{(x - x_{i-1})(x - x_{i+1})}{(x_i - x_{i-1})(x_i - x_{i+1})} + u_{i+1}\frac{(x - x_{i-1})(x - x_i)}{(x_{i+1} - x_{i-1})(x_{i+1} - x_i)}. \qquad (28)$$

We can now obtain an approximation of the derivative, u_x|_i ≈ P_x(x_i), as

$$P_x(x_i) = u_{i-1}\frac{x_i - x_{i+1}}{(x_{i-1} - x_i)(x_{i-1} - x_{i+1})} + u_i\frac{(x_i - x_{i-1}) + (x_i - x_{i+1})}{(x_i - x_{i-1})(x_i - x_{i+1})} + u_{i+1}\frac{x_i - x_{i-1}}{(x_{i+1} - x_{i-1})(x_{i+1} - x_i)}. \qquad (29)$$

If we take x_i − x_{i−1} = x_{i+1} − x_i = Δx, we recover the second-order accurate finite difference approximation (14), which is consistent with a quadratic
interpolation. Similarly, for the second derivative we have

$$P_{xx}(x_i) = \frac{2u_{i-1}}{(x_{i-1} - x_i)(x_{i-1} - x_{i+1})} + \frac{2u_i}{(x_i - x_{i-1})(x_i - x_{i+1})} + \frac{2u_{i+1}}{(x_{i+1} - x_{i-1})(x_{i+1} - x_i)} \qquad (30)$$

and, again, this approximation leads to the second-order centred finite difference (27) for a constant Δx. This result is general, and the approximation via finite differences can be interpreted as a form of Lagrangian polynomial interpolation. The order of the interpolating polynomial is also the order of accuracy of the finite difference approximation using the same set of points. This is also consistent with the interpretation of a Taylor expansion as an interpolating polynomial.
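As an illustration of these formulas, the following Python sketch (ours, not from the chapter; the cubic test function and the uneven points are arbitrary choices) implements Eqs. (29) and (30) directly and evaluates them on a non-equispaced stencil:

```python
import numpy as np

def lagrange_derivs(xm, xi, xp, um, ui, up):
    """First and second derivatives at xi of the quadratic Lagrange
    interpolant through (xm, um), (xi, ui), (xp, up); Eqs. (29)-(30)."""
    d1 = (um * (xi - xp) / ((xm - xi) * (xm - xp))
          + ui * ((xi - xm) + (xi - xp)) / ((xi - xm) * (xi - xp))
          + up * (xi - xm) / ((xp - xm) * (xp - xi)))
    d2 = (2 * um / ((xm - xi) * (xm - xp))
          + 2 * ui / ((xi - xm) * (xi - xp))
          + 2 * up / ((xp - xm) * (xp - xi)))
    return d1, d2

u = lambda x: x**3                      # test function: u_x(1)=3, u_xx(1)=6
d1, d2 = lagrange_derivs(0.9, 1.0, 1.15, u(0.9), u(1.0), u(1.15))
print(d1, d2)                           # close to 3 and 6 on an uneven mesh
```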
3.1.5. Finite difference solution of PDEs

We consider the FDM approximation to the solution of the elliptic equation u_xx = s(x) in the region Ω = {x : 0 ≤ x ≤ 1}. Discretizing the region using N points with constant mesh spacing Δx = 1/(N − 1), i.e., x_i = (i − 1)/(N − 1), we consider two cases with different sets of boundary conditions:

1. u(0) = α_1 and u(1) = α_2, and
2. u(0) = α_1 and u_x(1) = g.

In both cases we adopt a centred finite difference approximation at the interior points of the form

$$\frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x^2} = s_i; \qquad i = 2, \ldots, N-1. \qquad (31)$$
The treatment of the first case is straightforward, as the boundary conditions are easily specified as u_1 = α_1 and u_N = α_2. These two conditions together with the N − 2 equations (31) result in the linear system of N equations with N unknowns

$$\begin{pmatrix}
1 & 0 & & & & \\
1 & -2 & 1 & & & \\
 & 1 & -2 & 1 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & -2 & 1 \\
 & & & & 0 & 1
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_{N-1} \\ u_N \end{pmatrix}
=
\begin{pmatrix} \alpha_1 \\ \Delta x^2 s_2 \\ \Delta x^2 s_3 \\ \vdots \\ \Delta x^2 s_{N-1} \\ \alpha_2 \end{pmatrix}.$$
This matrix system can be written in abridged form as Au = s. The matrix A is non-singular and admits a unique solution u. This is the case for most discretizations of well-posed elliptic equations. In the second case the boundary condition u(0) = α_1 is treated in the same way, by setting u_1 = α_1. The treatment of the Neumann boundary condition u_x(1) = g requires more careful consideration. One possibility is to use a one-sided approximation of u_x(1) to obtain

$$u_x(1) \approx \frac{u_N - u_{N-1}}{\Delta x} = g. \qquad (32)$$

This expression is only first-order accurate and thus inconsistent with the approximation used at the interior points. Given that the PDE is elliptic, this error could potentially reduce the global accuracy of the solution. The alternative is to use a second-order centred approximation

$$u_x(1) \approx \frac{u_{N+1} - u_{N-1}}{2\Delta x} = g. \qquad (33)$$

Here the value u_{N+1} is not available, since it is not part of our discrete set of values, but we can use the finite difference approximation at x_N given by

$$\frac{u_{N+1} - 2u_N + u_{N-1}}{\Delta x^2} = s_N$$

and include the Neumann boundary condition (33) to obtain

$$u_N - u_{N-1} = g\,\Delta x - \frac{1}{2}\,s_N\,\Delta x^2. \qquad (34)$$

It is easy to verify that the introduction of either of the Neumann boundary conditions (32) or (34) leads to non-symmetric matrices.
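A minimal Python sketch of the Dirichlet case follows (our illustration, not from the chapter; the manufactured solution u = sin(πx) is an arbitrary test choice). It assembles the system of Eq. (31) row by row and confirms the O(Δx²) accuracy:

```python
import numpy as np

# Solve u_xx = s on [0,1] with u(0) = u(1) = 0, using the centred scheme (31).
# Manufactured solution u = sin(pi x), so s = -pi^2 sin(pi x).
N = 51
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]
s = -np.pi**2 * np.sin(np.pi * x)

A = np.zeros((N, N))
b = np.zeros(N)
A[0, 0] = A[-1, -1] = 1.0                 # Dirichlet rows: u_1 = u_N = 0
for i in range(1, N - 1):                 # interior rows of Eq. (31)
    A[i, i - 1 : i + 2] = 1.0, -2.0, 1.0
    b[i] = dx**2 * s[i]

u = np.linalg.solve(A, b)
print(np.max(np.abs(u - np.sin(np.pi * x))))   # O(dx^2) error
```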
3.2. Time Integration
In this section we address the problem of solving time-dependent PDEs in which the solution is a function of space and time, u(x, t). Consider for instance the heat equation

$$u_t - b\,u_{xx} = s(x) \quad \text{in} \quad \Omega = \{x, t : 0 \le x \le 1,\ 0 \le t \le T\}$$

with an initial condition u(x, 0) = u_0(x) and time-dependent boundary conditions u(0, t) = α_1(t) and u(1, t) = α_2(t), where α_1 and α_2 are known
functions of t. Assume, as before, a mesh or spatial discretization of the domain {x_1, …, x_N}.

3.2.1. Method of lines

In this technique we assign to our mesh a set of values that are functions of time, u_i(t) = u(x_i, t); i = 1, …, N. Applying a centred discretization to the spatial derivative of u leads to a system of ordinary differential equations (ODEs) in the variable t given by

$$\frac{du_i}{dt} = \frac{b}{\Delta x^2}\left\{u_{i-1}(t) - 2u_i(t) + u_{i+1}(t)\right\} + s_i; \qquad i = 2, \ldots, N-1$$

with u_1 = α_1(t) and u_N = α_2(t). This can be written as

$$\frac{d}{dt}\begin{pmatrix} u_2 \\ u_3 \\ \vdots \\ u_{N-2} \\ u_{N-1} \end{pmatrix}
= \frac{b}{\Delta x^2}
\begin{pmatrix}
-2 & 1 & & & \\
1 & -2 & 1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & 1 & -2 & 1 \\
 & & & 1 & -2
\end{pmatrix}
\begin{pmatrix} u_2 \\ u_3 \\ \vdots \\ u_{N-2} \\ u_{N-1} \end{pmatrix}
+
\begin{pmatrix} s_2 + b\alpha_1(t)/\Delta x^2 \\ s_3 \\ \vdots \\ s_{N-2} \\ s_{N-1} + b\alpha_2(t)/\Delta x^2 \end{pmatrix}$$

or in matrix form as

$$\frac{du}{dt}(t) = A\,u(t) + s(t). \qquad (35)$$

Equation (35) is referred to as the semi-discrete form or the method of lines. This system can be solved by any method for the integration of initial-value problems [3]. The numerical stability of time integration schemes depends on the eigenvalues of the matrix A resulting from the space discretization. For this example, the eigenvalues vary between 0 and −4b/Δx², and this could make the system very stiff, i.e., with large differences in eigenvalues, as Δx → 0.
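The stiffness claim can be checked directly. The following minimal Python sketch (ours, not from the chapter; N and b are arbitrary test values) forms the matrix A of Eq. (35) and verifies that its spectrum stretches from near 0 down to approximately −4b/Δx²:

```python
import numpy as np

# Semi-discrete matrix A of Eq. (35) for b = 1 on N interior points.
N, b = 39, 1.0
dx = 1.0 / (N + 1)
A = (b / dx**2) * (np.diag(-2.0 * np.ones(N))
                   + np.diag(np.ones(N - 1), 1)
                   + np.diag(np.ones(N - 1), -1))

lam = np.linalg.eigvalsh(A)           # A is symmetric, eigenvalues are real
print(lam.min(), -4 * b / dx**2)      # smallest approaches -4b/dx^2
print(lam.max())                      # largest approaches 0: a stiff system
```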
3.2.2. Finite differences in time

The method of finite differences can be applied to time-dependent problems by considering an independent discretization of the solution u(x, t) in space and time. In addition to the spatial discretization {x_1, …, x_N}, the discretization in time is represented by a sequence of times t^0 = 0 < ⋯ < t^n < ⋯ < T. For simplicity we will assume constant intervals Δx and Δt in space and time, respectively. The discrete solution at a point will be represented by u_i^n ≈ u(x_i, t^n), and the finite difference approximation of the time derivative follows the procedures previously described. For example, the forward difference in time is given by

$$u_t(x, t^n) \approx \frac{u(x, t^{n+1}) - u(x, t^n)}{\Delta t}$$

and the backward difference in time is

$$u_t(x, t^{n+1}) \approx \frac{u(x, t^{n+1}) - u(x, t^n)}{\Delta t},$$
both of which are first-order accurate, i.e., ε_T = O(Δt). Returning to the heat equation u_t − b u_xx = 0 and using a centred approximation in space, different schemes can be devised depending on the time level at which the equation is discretized. For instance, the use of forward differences in time leads to

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \frac{b}{\Delta x^2}\left(u_{i-1}^n - 2u_i^n + u_{i+1}^n\right). \qquad (36)$$

This scheme is explicit, as the values of the solution at time t^{n+1} are obtained directly from the corresponding (known) values at time t^n. If we use backward differences in time, the resulting scheme is

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \frac{b}{\Delta x^2}\left(u_{i-1}^{n+1} - 2u_i^{n+1} + u_{i+1}^{n+1}\right). \qquad (37)$$

Here, to obtain the values at t^{n+1}, we must solve a tri-diagonal system of equations. Schemes of this type are referred to as implicit schemes. The higher cost of the implicit schemes is compensated by their greater numerical stability with respect to the explicit schemes, which are stable in general only for some combinations of Δx and Δt.
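The two schemes are contrasted in the following minimal Python sketch (our illustration, not from the chapter; the initial condition and step sizes are arbitrary test choices), which advances the same initial data one step with the explicit update (36) and with the tridiagonal solve required by (37):

```python
import numpy as np

# One step of the heat equation u_t = b*u_xx with u = 0 at both ends.
N, b, dt = 21, 1.0, 1e-4
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]
u = np.sin(np.pi * x)                   # initial condition
lam = b * dt / dx**2

# Explicit scheme (36): new values follow directly from known ones.
u_exp = u.copy()
u_exp[1:-1] = u[1:-1] + lam * (u[:-2] - 2.0 * u[1:-1] + u[2:])

# Implicit scheme (37): solve a tridiagonal system for the interior values.
M = N - 2
A = ((1.0 + 2.0 * lam) * np.eye(M)
     - lam * np.eye(M, k=1) - lam * np.eye(M, k=-1))
u_imp = u.copy()
u_imp[1:-1] = np.linalg.solve(A, u[1:-1])

print(np.max(np.abs(u_exp - u_imp)))    # the two answers agree closely here
```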
3.3. Discretizations Based on the Integral Form

The FDM uses the strong or differential form of the governing equations. In the following, we introduce two alternative methods that use their integral form counterparts: the finite element and the finite volume methods. The use of integral formulations is advantageous as it provides a more natural treatment of Neumann boundary conditions, as well as of discontinuous source terms, due to their reduced requirements on the regularity or smoothness of the solution. Moreover, they are better suited than the FDM to deal with complex geometries in multi-dimensional problems, as the integral formulations do not rely on any special mesh structure.
These methods use the integral form of the equation as the starting point of the discretization process. For example, if the strong form of the PDE is L(u) = s, the integral form is given by

$$\int_0^1 L(u)\,w(x)\,dx = \int_0^1 s\,w(x)\,dx \qquad (38)$$

where the choice of the weight function w(x) defines the type of scheme.
3.3.1. The finite element method (FEM)

Here we discretize the region of interest Ω = {x : 0 ≤ x ≤ 1} into N − 1 subdomains or elements Ω_i = {x : x_{i−1} ≤ x ≤ x_i} and assume that the approximate solution is represented by

$$u^\delta(x,t) = \sum_{i=1}^{N} u_i(t)\,N_i(x)$$

where the set of functions N_i(x) is known as the expansion basis. Its support is defined as the set of points where N_i(x) ≠ 0. If the support of N_i(x) is the whole interval, the method is called a spectral method. In the following we will use expansion bases with compact support, which are piecewise continuous polynomials within each element, as shown in Fig. 6. The global shape functions N_i(x) can be split within an element into two local contributions of the form shown in Fig. 7. These individual functions are referred to as the shape functions or trial functions.
3.3.2. Galerkin FEM

In the Galerkin FEM we set the weight function w(x) in Eq. (38) to be the same as the basis function N_i(x), i.e., w(x) = N_i(x). Consider again the elliptic equation L(u) = u_xx = s(x) in the region Ω with boundary conditions u(0) = α and u_x(1) = g. Equation (38) becomes

$$\int_0^1 w(x)\,u_{xx}\,dx = \int_0^1 w(x)\,s(x)\,dx.$$

At this stage, it is convenient to integrate the left-hand side by parts to get the weak form

$$-\int_0^1 w_x u_x\,dx + w(1)\,u_x(1) - w(0)\,u_x(0) = \int_0^1 w(x)\,s(x)\,dx. \qquad (39)$$
Figure 6. A piecewise linear approximation u^δ(x, t) = Σ_{i=1}^N u_i(t) N_i(x).

Figure 7. Finite element expansion bases.
This is a common technique in the FEM because it reduces the smoothness requirements on u and it also makes the matrix of the discretized system symmetric. In two and three dimensions we would use Gauss’ divergence theorem to obtain a similar result. The application of the boundary conditions in the FEM deserves attention. The imposition of the Neumann boundary condition u_x(1) = g is straightforward: we simply substitute the value in Eq. (39). This is a very natural way of imposing Neumann boundary conditions, which also leads to symmetric
matrices, unlike the FDM. The Dirichlet boundary condition u(0) = α can be applied by imposing u_1 = α and requiring that w(0) = 0. In general, we will impose that the weight functions w(x) are zero at the Dirichlet boundaries. Letting u(x) ≈ u^δ(x) = Σ_{j=1}^N u_j N_j(x) and w(x) = N_i(x), Eq. (39) becomes

$$-\int_0^1 \sum_{j=1}^{N} u_j\,\frac{dN_i}{dx}(x)\,\frac{dN_j}{dx}(x)\,dx = \int_0^1 N_i(x)\,s(x)\,dx \qquad (40)$$
for i = 2, …, N. This represents a linear system of N − 1 equations with N − 1 unknowns {u_2, …, u_N}. Let us proceed to calculate the integral terms corresponding to the i-th equation. We calculate the integrals in Eq. (40) as sums of integrals over the elements Ω_i. The basis functions have compact support, as shown in Fig. 6; their value and their derivatives are different from zero only on the elements containing the node i, i.e.,

$$N_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{\Delta x_{i-1}} & x_{i-1} < x < x_i \\[6pt] \dfrac{x_{i+1} - x}{\Delta x_i} & x_i < x < x_{i+1} \end{cases} \qquad \frac{dN_i}{dx}(x) = \begin{cases} \dfrac{1}{\Delta x_{i-1}} & x_{i-1} < x < x_i \\[6pt] -\dfrac{1}{\Delta x_i} & x_i < x < x_{i+1} \end{cases}$$

with Δx_{i−1} = x_i − x_{i−1} and Δx_i = x_{i+1} − x_i. This means that the only integrals different from zero in (40) are

$$-\int_{x_{i-1}}^{x_i}\left(u_{i-1}\frac{dN_{i-1}}{dx}\frac{dN_i}{dx} + u_i\frac{dN_i}{dx}\frac{dN_i}{dx}\right)dx - \int_{x_i}^{x_{i+1}}\left(u_i\frac{dN_i}{dx}\frac{dN_i}{dx} + u_{i+1}\frac{dN_{i+1}}{dx}\frac{dN_i}{dx}\right)dx = \int_{x_{i-1}}^{x_i} N_i\,s\,dx + \int_{x_i}^{x_{i+1}} N_i\,s\,dx. \qquad (41)$$
The right-hand side of this equation, expressed as

$$F = \int_{x_{i-1}}^{x_i} \frac{x - x_{i-1}}{\Delta x_{i-1}}\,s(x)\,dx + \int_{x_i}^{x_{i+1}} \frac{x_{i+1} - x}{\Delta x_i}\,s(x)\,dx,$$

can be evaluated using a simple integration rule like the trapezium rule

$$\int_{x_i}^{x_{i+1}} g(x)\,dx \approx \frac{g(x_i) + g(x_{i+1})}{2}\,\Delta x_i$$
and it becomes

$$F = \left(\frac{\Delta x_{i-1}}{2} + \frac{\Delta x_i}{2}\right) s_i.$$

Performing the required operations on the left-hand side of Eq. (41) and including the calculated value of F leads to the FEM discrete form of the equation,

$$\frac{u_{i+1} - u_i}{\Delta x_i} - \frac{u_i - u_{i-1}}{\Delta x_{i-1}} = \frac{\Delta x_{i-1} + \Delta x_i}{2}\,s_i.$$

If we assume that Δx_{i−1} = Δx_i = Δx, the equispaced approximation becomes

$$\frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x} = \Delta x\,s_i,$$

which is identical to the finite difference formula. We note, however, that the general FE formulation did not require the assumption of an equispaced mesh. In general, the evaluation of the integral terms in this formulation is more efficiently implemented by considering most operations in a standard element Ω_st = {−1 ≤ x ≤ 1}, where a mapping is applied from the element Ω_i to the standard element Ω_st. For more details on the general formulation see Ref. [4].
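To illustrate the assembly process on a genuinely non-uniform mesh, the following Python sketch (ours, not from the chapter; the random mesh, the manufactured solution u = sin(πx), and the trapezium-rule load are our own choices) builds the Galerkin system for u_xx = s with u(0) = 0 and the natural condition u_x(1) = g:

```python
import numpy as np

# Galerkin FEM for u_xx = s on a non-uniform mesh, with u(0) = 0 and the
# natural condition u_x(1) = g. Manufactured solution u = sin(pi x).
rng = np.random.default_rng(0)
x = np.sort(np.concatenate(([0.0, 1.0], rng.random(19))))   # random mesh
N = len(x)
s = -np.pi**2 * np.sin(np.pi * x)
g = np.pi * np.cos(np.pi)                                   # u_x(1)

K = np.zeros((N, N))
F = np.zeros(N)
for e in range(N - 1):                        # element-by-element assembly
    h = x[e + 1] - x[e]
    K[e:e+2, e:e+2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h  # int Ni'Nj'
    F[e:e+2] += 0.5 * h * np.array([s[e], s[e + 1]])             # trapezium

# Weak form (39) with w(0) = 0: int w_x u_x dx = w(1)*g - int w*s dx.
rhs = -F
rhs[-1] += g
K[0, :] = 0.0
K[0, 0], rhs[0] = 1.0, 0.0                    # impose u(0) = 0

u = np.linalg.solve(K, rhs)
print(np.max(np.abs(u - np.sin(np.pi * x))))  # small discretization error
```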
3.3.3. Finite volume method (FVM)

The integral form of the one-dimensional linear advection equation is given by Eq. (1) with f(u) = au and S = 0. Here the region of integration is taken to be a control volume Ω_i, associated with the point of coordinate x_i and represented by x_{i−1/2} ≤ x ≤ x_{i+1/2}, following the notation of Fig. 4, and the integral form is written as

$$\int_{x_{i-1/2}}^{x_{i+1/2}} u_t\,dx + \int_{x_{i-1/2}}^{x_{i+1/2}} f_x(u)\,dx = 0. \qquad (42)$$
This expression could also have been obtained from the weighted residual form (4) by selecting a weight w(x) such that w(x) = 1 for x_{i−1/2} ≤ x ≤ x_{i+1/2} and w(x) = 0 elsewhere. The last term in Eq. (42) can be evaluated analytically to obtain

$$\int_{x_{i-1/2}}^{x_{i+1/2}} f_x(u)\,dx = f\left(u_{i+1/2}\right) - f\left(u_{i-1/2}\right)$$

and if we approximate the first integral using the mid-point rule we finally have the semi-discrete form

$$u_t|_i\left(x_{i+1/2} - x_{i-1/2}\right) + f\left(u_{i+1/2}\right) - f\left(u_{i-1/2}\right) = 0.$$

This approach produces a conservative scheme if the flux on the boundary of one cell equals the flux on the boundary of the adjacent cell. Conservative schemes are popular for the discretization of hyperbolic equations since, if they converge, they can be proven (by the Lax–Wendroff theorem) to converge to a weak solution of the conservation law.
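A minimal Python sketch of such a conservative semi-discrete scheme follows (ours, not from the chapter; the upwind flux, the forward-Euler time stepping and the periodic boundaries are added assumptions). The total “mass” Σ u_i Δx is conserved to round-off because interior fluxes cancel in pairs:

```python
import numpy as np

# FVM for u_t + (a u)_x = 0 with an upwind flux (a > 0), forward Euler in
# time and periodic boundaries.
N, a = 100, 1.0
dx = 1.0 / N
xc = (np.arange(N) + 0.5) * dx                 # cell centres
u = np.exp(-100.0 * (xc - 0.5)**2)             # initial cell averages
dt = 0.5 * dx / a                              # Courant number 0.5
mass0 = u.sum() * dx

for _ in range(200):
    flux = a * u                               # upwind: f_{i+1/2} = a*u_i
    u -= (dt / dx) * (flux - np.roll(flux, 1)) # conservative update
print(u.sum() * dx - mass0)                    # ~0: mass exactly conserved
```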
3.3.4. Comparison of FVM and FDM

To complete our comparison of the different techniques, we consider the FVM discretization of the elliptic equation u_xx = s. The FVM integral form of this equation over a control volume Ω_i = {x_{i−1/2} ≤ x ≤ x_{i+1/2}} is

$$\int_{x_{i-1/2}}^{x_{i+1/2}} u_{xx}\,dx = \int_{x_{i-1/2}}^{x_{i+1/2}} s\,dx.$$

Evaluating exactly the left-hand side and approximating the right-hand side by the mid-point rule we obtain
$$u_x|_{x_{i+1/2}} - u_x|_{x_{i-1/2}} = \left(x_{i+1/2} - x_{i-1/2}\right) s_i. \qquad (43)$$
If we approximate u(x) as a linear function between the mesh points i − 1 and i, we have

$$u_x|_{i-1/2} \approx \frac{u_i - u_{i-1}}{x_i - x_{i-1}}, \qquad u_x|_{i+1/2} \approx \frac{u_{i+1} - u_i}{x_{i+1} - x_i},$$

and introducing these approximations into Eq. (43) we now have

$$\frac{u_{i+1} - u_i}{x_{i+1} - x_i} - \frac{u_i - u_{i-1}}{x_i - x_{i-1}} = \left(x_{i+1/2} - x_{i-1/2}\right) s_i.$$

If the mesh is equispaced then this equation reduces to

$$\frac{u_{i+1} - 2u_i + u_{i-1}}{\Delta x} = \Delta x\,s_i,$$

which is the same as the FDM and FEM on an equispaced mesh. Once again we see the similarities that exist between these methods, although some assumptions in the construction of the FVM have been made. FEM and FVM allow a more general approach to non-equispaced meshes (although this can also be done in the FDM). In two and three dimensions, curvature is more naturally dealt with in the FVM and FEM due to the integral nature of the equations used.
4. High Order Discretizations: Spectral Element/p-Type Finite Elements
All of the approximation methods we have discussed thus far have dealt with what is typically known as the h-type approximation. If h = Δx denotes the size of a finite difference spacing or of the finite elemental regions, then convergence of the discrete approximation to the PDE is achieved by letting h → 0. An alternative method is to leave the mesh spacing fixed but to increase the polynomial order of the local approximation, which is typically denoted by p, hence the name p-type extension. We have already seen that higher order finite difference approximations can be derived by fitting polynomials through more grid points. The drawback of this approach is that the finite difference stencil gets larger as the order of the polynomial approximation increases. This can lead to difficulties when enforcing boundary conditions, particularly in multiple dimensions. An alternative approach to deriving high-order finite differences is to use compact finite differences, where a Padé approximation is used to approximate the derivatives. When using the finite element method in an integral formulation, it is possible to develop a compact high-order discretization by applying higher order polynomial expansions within every elemental region. So instead of using just a linear element in each piecewise approximation of Fig. 6 we can use a polynomial of order p. This technique is commonly known as the p-type finite element method in structural mechanics or the spectral element method in fluid mechanics. The choice of the polynomial has a strong influence on the numerical conditioning of the approximation, and we note that the choice of an equi-spaced Lagrange polynomial is particularly bad for p > 5. The two most commonly used polynomial expansions are Lagrange polynomials based on the Gauss–Lobatto–Legendre quadrature points, or the integrals of the Legendre polynomials in combination with the linear finite element expansion. These two polynomial expansions are shown in Fig. 8.
Figure 8. Shape of the fifth order (p = 5) polynomial expansions typically used in (a) spectral element and (b) p-type finite element methods.
Although this method is more involved to implement, the advantage is that for a smooth problem (i.e., one where the derivatives of the solution are well behaved) the computational cost increases algebraically whilst the error decreases exponentially fast. Further details on these methods can be found in Refs. [5, 6].
5. Numerical Difficulties
The discretization of linear elliptic equations with either FD, FE or FV methods leads to non-singular systems of equations that can easily be solved by standard methods of solution. This is not the case for time-dependent problems, where numerical errors may grow unbounded for some discretizations. This is perhaps best illustrated with some examples.

Consider the parabolic problem represented by the diffusion equation u_t − u_xx = 0 with boundary conditions u(0) = u(1) = 0, solved using the scheme (36) with b = 1 and Δx = 0.1. The results obtained with Δt = 0.004 and 0.008 are depicted in Figs. 9(a) and (b), respectively. The numerical solution corresponding to Δt = 0.008 is clearly unstable.

A similar situation occurs in hyperbolic problems. Consider the one-dimensional linear advection equation u_t + a u_x = 0 with a > 0 and various explicit approximations; for instance, the backward in space, or upwind, scheme is

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + a\,\frac{u_i^n - u_{i-1}^n}{\Delta x} = 0 \;\Rightarrow\; u_i^{n+1} = (1 - \sigma)u_i^n + \sigma u_{i-1}^n, \qquad (44)$$

the forward in space, or downwind, scheme is

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + a\,\frac{u_{i+1}^n - u_i^n}{\Delta x} = 0 \;\Rightarrow\; u_i^{n+1} = (1 + \sigma)u_i^n - \sigma u_{i+1}^n, \qquad (45)$$
Figure 9. Solution to the diffusion equation u_t − u_xx = 0 using a forward in time and centred in space finite difference discretization with Δx = 0.1 and (a) Δt = 0.004, (b) Δt = 0.008 (profiles at t = 0.20, 0.24, 0.28, 0.32). The numerical solution in (b) is clearly unstable.
Figure 10. A triangular wave as initial condition for the advection equation:

$$u(x,0) = \begin{cases} 0 & x \le -0.2, \\ 1 + 5x & -0.2 \le x \le 0, \\ 1 - 5x & 0 \le x \le 0.2, \\ 0 & x \ge 0.2. \end{cases}$$
and, finally, the centred in space scheme is given by

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + a\,\frac{u_{i+1}^n - u_{i-1}^n}{2\Delta x} = 0 \;\Rightarrow\; u_i^{n+1} = u_i^n - \frac{\sigma}{2}\left(u_{i+1}^n - u_{i-1}^n\right), \qquad (46)$$

where σ = aΔt/Δx is known as the Courant number. We will see later that this number plays an important role in the stability of hyperbolic equations. Let us obtain the solution of u_t + a u_x = 0 for all these schemes with the initial condition given in Fig. 10. As also indicated in Fig. 10, the exact solution is the propagation of this wave form to the right at a velocity a. Now we consider the solution of the three schemes at two different Courant numbers, σ = 0.5 and 1.5. The results are presented in Fig. 11, and we observe that only the upwind scheme with σ ≤ 1 gives a stable, although diffusive, solution. The centred scheme at σ = 0.5 appears almost stable, but the oscillations grow in time, leading to an unstable solution.

Figure 11. Numerical solution of the advection equation u_t + a u_x = 0 with the upwind, downwind and centred schemes at σ = 0.5 and σ = 1.5. Dashed lines: initial condition. Dotted lines: exact solution. Solid line: numerical solution.
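This behaviour is easy to reproduce. The following minimal Python sketch (our illustration, not from the chapter; periodic boundaries are an added assumption that does not change the conclusion) advances the triangular wave of Fig. 10 with the upwind scheme (44) and the centred scheme (46) at σ = 0.5:

```python
import numpy as np

# Advect the triangular wave of Fig. 10 with the upwind (44) and
# centred (46) schemes at sigma = 0.5, with periodic boundaries.
N = 200
x = np.linspace(-1.0, 1.0, N, endpoint=False)
u0 = np.where(np.abs(x) <= 0.2, 1.0 - 5.0 * np.abs(x), 0.0)
sigma = 0.5

up, ce = u0.copy(), u0.copy()
for _ in range(200):
    up = (1.0 - sigma) * up + sigma * np.roll(up, 1)            # (44)
    ce = ce - 0.5 * sigma * (np.roll(ce, -1) - np.roll(ce, 1))  # (46)

print(np.abs(up).max())  # stays ~1: stable but smeared by numerical diffusion
print(np.abs(ce).max())  # grows with the number of steps: unstable
```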
6. Analysis of Numerical Schemes

We have seen that different parameters, such as the Courant number, can affect the stability of a numerical scheme. We would now like to set up a more rigorous framework within which to analyse a numerical scheme, and to this end we introduce the concepts of consistency, stability and convergence of a numerical scheme.
6.1. Consistency

A numerical scheme is consistent if the discrete numerical equation tends to the exact differential equation as the mesh sizes (represented by Δx and Δt) tend to zero.
Consider the centred in space and forward in time finite difference approximation to the linear advection equation u_t + a u_x = 0 given by Eq. (46). Let us consider Taylor expansions of u_i^{n+1}, u_{i+1}^n and u_{i−1}^n around (x_i, t^n):

$$u_i^{n+1} = u_i^n + \Delta t\,u_t|_i^n + \frac{\Delta t^2}{2}u_{tt}|_i^n + \cdots$$

$$u_{i+1}^n = u_i^n + \Delta x\,u_x|_i^n + \frac{\Delta x^2}{2}u_{xx}|_i^n + \frac{\Delta x^3}{6}u_{xxx}|_i^n + \cdots$$

$$u_{i-1}^n = u_i^n - \Delta x\,u_x|_i^n + \frac{\Delta x^2}{2}u_{xx}|_i^n - \frac{\Delta x^3}{6}u_{xxx}|_i^n + \cdots$$

Substituting these expansions into Eq. (46) and suitably re-arranging the terms, we find that

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + a\,\frac{u_{i+1}^n - u_{i-1}^n}{2\Delta x} - (u_t + a u_x)|_i^n = \epsilon_T \qquad (47)$$

where ε_T is known as the truncation error of the approximation and is given by

$$\epsilon_T = \frac{\Delta t}{2}u_{tt}|_i^n + a\,\frac{\Delta x^2}{6}u_{xxx}|_i^n + O(\Delta t^2, \Delta x^4).$$

The truncation error tends to zero as Δt and Δx tend to zero. This means that the numerical scheme (46) tends to the exact equation at point x_i and time level t^n, and the approximation is therefore consistent.
6.2. Stability

We have seen in the previous numerical examples that errors in numerical solutions can grow uncontrolled and render the solution meaningless. It is therefore sensible to require that the solution is stable, that is, that the difference between the computed solution and the exact solution of the discrete equation should remain bounded as n → ∞ for a given Δx.
6.2.1. The Courant–Friedrichs–Lewy (CFL) condition

This is a necessary condition for the stability of explicit schemes, devised by Courant, Friedrichs and Lewy in 1928. Recalling the theory of characteristics for hyperbolic systems, the domain of dependence of a PDE is the portion of the domain that influences the solution at a given point. For a scalar conservation law, it is the characteristic passing through the point, for instance the line PQ in Fig. 12. The domain of dependence of a FD scheme is the set of points that affect the approximate solution at a given point. For the upwind scheme, the numerical domain of dependence is shown as a shaded region in Fig. 12. The CFL criterion states that a necessary condition for an explicit FD scheme for a hyperbolic PDE to be stable is that, for each mesh point, the domain of dependence of the FD approximation contains the domain of dependence of the PDE.
Figure 12. Solution of the advection equation by the upwind scheme. Physical and numerical domains of dependence: (a) σ = aΔt/Δx > 1, (b) σ ≤ 1.
For a Courant number σ = aΔt/Δx greater than 1, changes at Q will affect values at P, but the FD approximation cannot account for this. The CFL condition is necessary for the stability of explicit schemes but it is not sufficient. For instance, among the previous schemes, the upwind FD scheme is stable if the CFL condition σ ≤ 1 is imposed. The downwind FD scheme does not satisfy the CFL condition and is unstable. However, the centred FD scheme is unstable even if σ ≤ 1.
6.2.2. Von Neumann (or Fourier) analysis of stability

The stability of FD schemes for hyperbolic and parabolic PDEs can be analysed by the von Neumann or Fourier method. The idea behind the method is the following. As discussed previously, the analytical solutions of the model diffusion equation u_t − b u_xx = 0 can be found in the form

$$u(x,t) = \sum_{m=-\infty}^{\infty} e^{\beta_m t}\,e^{I k_m x}$$

if β_m + b k_m² = 0. This solution involves a Fourier series in space and an exponential decay in time, since β_m ≤ 0 for b > 0. Here we have included the complex version of the Fourier series, e^{I k_m x} = cos k_m x + I sin k_m x with I = √(−1), because this simplifies considerably the later algebraic manipulations. To analyse the growth of different Fourier modes as they evolve under the numerical scheme we can consider each frequency separately, namely u(x, t) = e^{β_m t} e^{I k_m x}.
A discrete version of this equation is u_i^n = u(x_i, t^n) = e^{β_m t^n} e^{I k_m x_i}. We can take, without loss of generality, x_i = iΔx and t^n = nΔt to obtain

$$u_i^n = e^{\beta_m n\Delta t}\,e^{I k_m i\Delta x} = \left(e^{\beta_m \Delta t}\right)^n e^{I k_m i\Delta x}.$$

The term e^{I k_m iΔx} = cos(k_m iΔx) + I sin(k_m iΔx) is bounded and, therefore, any growth in the numerical solution will arise from the term G = e^{β_m Δt}, known as the amplification factor. The numerical method will therefore be stable, i.e., the numerical solution u_i^n will remain bounded as n → ∞, if |G| ≤ 1 for solutions of the form u_i^n = G^n e^{I k_m iΔx}. We will now proceed to analyse, using the von Neumann method, the stability of some of the schemes discussed in the previous sections.

Example 1. Consider the explicit scheme (36) for the diffusion equation u_t − b u_xx = 0, expressed here as

$$u_i^{n+1} = \lambda u_{i-1}^n + (1 - 2\lambda)u_i^n + \lambda u_{i+1}^n; \qquad \lambda = \frac{b\Delta t}{\Delta x^2}.$$

We assume u_i^n = G^n e^{I k_m iΔx} and substitute in the equation to get

$$G = 1 + 2\lambda\left[\cos(k_m\Delta x) - 1\right].$$

Stability requires |G| ≤ 1. Using −2 ≤ cos(k_m Δx) − 1 ≤ 0 we get 1 − 4λ ≤ G ≤ 1, and to satisfy the left inequality we impose

$$-1 \le 1 - 4\lambda \;\Longrightarrow\; \lambda \le \frac{1}{2}.$$

This means that for a given grid size Δx the maximum allowable timestep is Δt = Δx²/(2b).

Example 2. Consider the implicit scheme (37) for the diffusion equation u_t − b u_xx = 0, expressed here as
$$\lambda u_{i-1}^{n+1} - (1 + 2\lambda)u_i^{n+1} + \lambda u_{i+1}^{n+1} = -u_i^n; \qquad \lambda = \frac{b\Delta t}{\Delta x^2}.$$

The amplification factor is now

$$G = \frac{1}{1 + 2\lambda\,(1 - \cos\beta_m)}, \qquad \beta_m = k_m\Delta x,$$

and we have |G| < 1 for any β_m if λ > 0. This scheme is therefore unconditionally stable for any Δx and Δt. This is obtained at the expense of solving a linear system of equations. However, there will still be restrictions on Δx
and Δt based on the accuracy of the solution. The choice between an explicit and an implicit method is not always obvious and should be made based on the computational cost of achieving the required accuracy in a given problem.

Example 3. Consider the upwind scheme for the linear advection equation u_t + a u_x = 0 with a > 0, given by

$$u_i^{n+1} = (1 - \sigma)u_i^n + \sigma u_{i-1}^n; \qquad \sigma = \frac{a\Delta t}{\Delta x}.$$

Let us denote β_m = k_m Δx and introduce the discrete Fourier expression in the upwind scheme to obtain

$$G = (1 - \sigma) + \sigma e^{-I\beta_m}.$$

The stability condition requires |G| ≤ 1. Recall that G is a complex number, G = ξ + Iη, so

$$\xi = 1 - \sigma + \sigma\cos\beta_m; \qquad \eta = -\sigma\sin\beta_m.$$

This represents a circle of radius σ centred at 1 − σ. The stability condition requires the locus of the points (ξ, η) to be interior to the unit circle ξ² + η² ≤ 1. If σ < 0 the centre of the locus lies outside the unit circle, 1 − σ > 1, and the scheme is unstable. If σ > 1 the leftmost point of the locus lies outside the unit circle, 1 − 2σ < −1, and it is also unstable. Therefore, for stability we require 0 ≤ σ ≤ 1; see Fig. 13.

Example 4. The forward in time, centred in space scheme for the advection equation is given by
$$u_i^{n+1} = u_i^n - \frac{\sigma}{2}\left(u_{i+1}^n - u_{i-1}^n\right); \qquad \sigma = \frac{a\Delta t}{\Delta x}.$$
Figure 13. Stability region of the upwind scheme (the locus of G is a circle of radius σ centred at 1 − σ in the (ξ, η) plane).
The introduction of the discrete Fourier solution leads to

$$G = 1 - \frac{\sigma}{2}\left(e^{I\beta_m} - e^{-I\beta_m}\right) = 1 - I\sigma\sin\beta_m.$$

Here we have |G|² = 1 + σ² sin² β_m > 1 for any σ ≠ 0, and the scheme is therefore unstable. We will require a different time integration scheme to make it stable.
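The four amplification factors derived above can be tabulated numerically. The following minimal Python sketch (ours, not from the chapter; the particular λ and σ values are arbitrary test choices) scans each G over the Fourier angle β_m and reports max|G|, reproducing the stability conclusions of Examples 1–4:

```python
import numpy as np

beta = np.linspace(0.0, 2.0 * np.pi, 2001)   # Fourier angles beta_m = k_m*dx

def report(name, G):
    print(f"{name:38s} max|G| = {np.abs(G).max():.3f}")

for lam in (0.25, 0.5, 0.55):                # Example 1: explicit diffusion
    report(f"explicit diffusion, lambda = {lam}",
           1.0 + 2.0 * lam * (np.cos(beta) - 1.0))

report("implicit diffusion, lambda = 10",    # Example 2: always stable
       1.0 / (1.0 + 2.0 * 10.0 * (1.0 - np.cos(beta))))

for sig in (0.5, 1.5):                       # Example 3: upwind advection
    report(f"upwind advection, sigma = {sig}",
           (1.0 - sig) + sig * np.exp(-1j * beta))

report("centred advection, sigma = 0.5",     # Example 4: |G| > 1, unstable
       1.0 - 0.5j * np.sin(beta))
```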
6.3. Convergence: Lax Equivalence Theorem

A scheme is said to be convergent if the difference between the computed solution and the exact solution of the PDE, i.e., the error E_i^n = u_i^n − u(x_i, t^n), vanishes as the mesh size is decreased. This is written as

$$\lim_{\Delta x, \Delta t \to 0} |E_i^n| = 0$$

for fixed values of x_i and t^n. This is the fundamental property to be sought from a numerical scheme, but it is difficult to verify directly. On the other hand, consistency and stability are easily checked, as shown in the previous sections. The main result that permits the assessment of the convergence of a scheme from the requirements of consistency and stability is the equivalence theorem of Lax, stated here without proof: stability is the necessary and sufficient condition for a consistent linear FD approximation to a well-posed linear initial-value problem to be convergent.
7. Suggestions for Further Reading
The basics of the FDM are presented in a very accessible form in Ref. [7]. More modern references are Refs. [8, 9]. An elementary introduction to the FVM can be found in the book by Versteeg and Malalasekera [10]. An in-depth treatment of the topic, with an emphasis on hyperbolic problems, can be found in the book by LeVeque [2]. Two well-established general references for the FEM are the books of Hughes [4] and Zienkiewicz and Taylor [11]. A presentation from the point of view of structural analysis can be found in Cook et al. [12]. The application of p-type finite elements to structural mechanics is dealt with in the book of Szabo and Babuška [5]. The treatment of both p-type and spectral element methods in fluid mechanics can be found in the book by Karniadakis and Sherwin [6]. A comprehensive reference covering the FDM, FVM, and FEM for fluid dynamics is the book by Hirsch [13]. These topics are also presented from a more mathematical perspective in the classical book by Quarteroni and Valli [14].
References
[1] J. Bonet and R. Wood, Nonlinear Continuum Mechanics for Finite Element Analysis, Cambridge University Press, 1997.
[2] R. LeVeque, Finite Volume Methods for Hyperbolic Problems, Cambridge University Press, 2002.
[3] W. Cheney and D. Kincaid, Numerical Mathematics and Computing, 4th edn., Brooks/Cole Publishing Co., 1999.
[4] T. Hughes, The Finite Element Method: Linear Static and Dynamic Finite Element Analysis, Dover Publishers, 2000.
[5] B. Szabo and I. Babuška, Finite Element Analysis, Wiley, 1991.
[6] G.E. Karniadakis and S. Sherwin, Spectral/hp Element Methods for CFD, Oxford University Press, 1999.
[7] G. Smith, Numerical Solution of Partial Differential Equations: Finite Difference Methods, Oxford University Press, 1985.
[8] K. Morton and D. Mayers, Numerical Solution of Partial Differential Equations, Cambridge University Press, 1994.
[9] J. Thomas, Numerical Partial Differential Equations: Finite Difference Methods, Springer-Verlag, 1995.
[10] H. Versteeg and W. Malalasekera, An Introduction to Computational Fluid Dynamics: The Finite Volume Method, Longman Scientific & Technical, 1995.
[11] O. Zienkiewicz and R. Taylor, The Finite Element Method: The Basis, vol. 1, Butterworth and Heinemann, 2000.
[12] R. Cook, D. Malkus, and M. Plesha, Concepts and Applications of Finite Element Analysis, Wiley, 2001.
[13] C. Hirsch, Numerical Computation of Internal and External Flows, vol. 1, Wiley, 1988.
[14] A. Quarteroni and A. Valli, Numerical Approximation of Partial Differential Equations, Springer-Verlag, 1994.
8.3 MESHLESS METHODS FOR NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL EQUATIONS
Gang Li, Xiaozhong Jin, and N.R. Aluru
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
A popular research topic in numerical methods recently has been the development of meshless methods as alternatives to the traditional finite element, finite volume, and finite difference methods. The traditional methods all require some a priori connectivity knowledge, such as the generation of a mesh, whereas the aim of meshless methods is to sprinkle only a set of points or nodes covering the computational domain, with no connectivity information required among the points. Multiphysics and multiscale analysis, which is a common requirement for microsystem technologies such as MEMS and Bio-MEMS, is radically simplified by meshless techniques, as we deal with only nodes or points instead of a mesh. Meshless techniques are also appealing because of their potential in adaptive techniques, where a user can simply add more points in a particular region to obtain more accurate results. Extensive research has been conducted in the area of meshless methods in recent years (see [1–3] for an overview). Broadly defined, meshless methods contain two key steps: construction of meshless approximation functions and their derivatives, and meshless discretization of the governing partial differential equations. Least-squares [4–6, 8–13], kernel-based [14–18] and radial basis function [19–23] approaches are three techniques that have gained considerable attention for the construction of meshless approximation functions (see [26] for a detailed discussion of least-squares and kernel approximations). The meshless discretization of the partial differential equations can be categorized into three classes: cell integration [5, 6, 12, 15, 16], local point integration [9, 24, 25], and point collocation [8, 10, 11, 17, 18, 20, 21]. Another important class of meshless methods has been developed for boundary-only analysis of partial differential equations. Boundary integral formulations
[27], especially when combined with fast algorithms based on multipole expansions [28], the Fast Fourier Transform (FFT) [29] and the singular value decomposition (SVD) [30, 31], are powerful computational techniques for the rapid analysis of exterior problems. Recently, several meshless methods for boundary-only analysis have been proposed in the literature. Some of these methods include the boundary node method [32, 33], the hybrid boundary node method [34] and the boundary knot method [35]. The boundary node method is a combined boundary integral/meshless approach for boundary-only analysis of partial differential equations. A key difficulty in the boundary node method is the construction of interpolation functions using moving least-squares methods. For 2-D problems, where the boundary is 1-D, Cartesian coordinates cannot be used to construct interpolation functions (see [36] for a more detailed discussion). Instead, a cyclic coordinate is used in the moving least-squares approach to construct interpolation functions. For 3-D problems, where the boundary is 2-D, curvilinear coordinates are used to construct interpolation functions. The definition of these coordinates is not trivial for complex geometries. Recently, we have introduced a boundary cloud method (BCM) [36, 37], which is also a combined boundary-integral/scattered-point approach for boundary-only analysis of partial differential equations. The boundary cloud method employs a Hermite-type or a varying polynomial basis least-squares approach to construct interpolation functions, which enables the direct use of Cartesian coordinates. Due to length restrictions, boundary-only methods are not discussed in this article. This paper summarizes the key developments in meshless methods and their implementation for interior problems. This material should serve as a starting point for the reader to venture into more advanced topics in meshless methods. The rest of the article is organized as follows: In Section 1, we introduce the general numerical procedure for solving partial differential equations. Meshless approximation and discretization approaches are discussed in Sections 2 and 3, respectively. Section 4 provides a brief summary of some existing meshless methods. The solution of an elasticity problem by using the finite cloud method is presented in Section 5. Section 6 concludes the article.
1. Steps for Solving Partial Differential Equations: An Example
Typically, the physical behavior of an object or a system is described mathematically by partial differential equations. For example, as shown in Fig. 1, an irregularly shaped 2-D plate is subjected to certain heat transfer conditions: it has a temperature distribution of g(x, y) on the left part of its boundary (denoted Γ_u) and a heat flux distribution of h(x, y) on the remaining part of the boundary (denoted Γ_q). At steady state, the temperature at any point on
Figure 1. Heat conduction within a plate: ∇²u = 0 in Ω, u = g(x, y) on Γ_u, u_{,n} = h(x, y) on Γ_q.
the plate is described by the steady-state heat conduction equation, i.e.,

    ∇²u = 0   (1)

where u is the temperature. The temperature and the flux prescribed on the boundary are defined as boundary conditions. The prescribed temperature is called the Dirichlet or essential boundary condition, i.e.,

    u = g(x, y) on Γ_u   (2)

and the prescribed flux is called the Neumann or natural boundary condition, i.e.,

    ∂u/∂n = h(x, y) on Γ_q   (3)

where n is the outward normal to the boundary. The governing equation along with the Dirichlet and/or Neumann boundary conditions determines a unique temperature field on the plate. There are various numerical techniques available to solve the simple example considered above. The finite difference method (FDM) [38], the finite element method (FEM) [39] and the boundary element method (BEM) [27, 40] are the most popular methods for solving PDEs. Recently, meshless methods have been proposed, and they have been successfully applied to solve many physical problems. Although the FDM, FEM, BEM and meshless methods differ in many aspects, all of these methods contain three common steps:
1. Discretization of the domain
2. Approximation of the unknown function
3. Discretization of the governing equation and the boundary conditions.
In the first step, a meshing process is often required for conventional methods such as the finite element and boundary element methods. For objects with complex geometry, the meshing step can be complicated and time-consuming. The key idea in meshless methods is to eliminate the meshing process to improve efficiency. Many authors have shown that this can be done through meshless approximation and meshless discretization of the governing equation and the boundary conditions.
2. Meshless Approximation
In meshless methods, as shown in Fig. 2, a physical domain is represented by a set of points. The points can be either structured or scattered as long as they cover the physical domain. An unknown function such as the temperature field in the domain is defined by the governing equation along with the appropriate boundary conditions. To obtain the solution numerically, one first needs to approximate the unknown function (e.g., temperature) at any location in the domain. There are several approaches for constructing the meshless approximation functions as will be discussed in the following sections.
2.1. Weighted Least-squares Approximations
Assume we have a 2-D domain and denote the unknown function as u(x, y). In a weighted moving least-squares (MLS) approximation [41], the unknown function can be approximated by

    u^a(x, y) = Σ_{j=1}^{m} a_j(x, y) p_j(x, y)   (4)
Figure 2. Meshless approximation: the unknown function is approximated over a support domain using a bell-shaped weighting function.
where a_j(x, y) are the unknown coefficients, p_j(x, y) are the basis functions and m is the number of basis functions. Polynomials are often used as the basis functions. For example, typical 2-D basis functions are given by

    linear basis:    p(x, y) = [1  x  y]^T,   m = 3
    quadratic basis: p(x, y) = [1  x  y  x²  xy  y²]^T,   m = 6
    cubic basis:     p(x, y) = [1  x  y  x²  xy  y²  x³  x²y  xy²  y³]^T,   m = 10   (5)
The basic idea in the weighted least-squares method is to minimize the weighted error between the approximation and the exact function. The weighted error is defined as

    E(u) = Σ_{i=1}^{NP} w_i(x, y) [u^a(x_i, y_i) − u_i]²
         = Σ_{i=1}^{NP} w_i(x, y) [Σ_{j=1}^{m} a_j(x, y) p_j(x_i, y_i) − u_i]²   (6)
where NP is the number of points, and w_i(x, y) is the weighting function centered at the point (x, y) and evaluated at the point (x_i, y_i). If the weighting function is a constant, the weighted least-squares approach reduces to the classical least-squares approach. The weighting function is used in meshless methods for two reasons: first, to assign the relative importance of the error as a function of distance from the point (x, y); second, by choosing weighting functions whose value vanishes outside a certain region, the approximation becomes local. The region where a weighting function has a non-zero value is called a support, a cloud or a domain of influence. The center point (x, y) is called a star point. As shown in Fig. 2, a typical weighting function is bell-shaped. Several popular weighting functions used in meshless methods are listed below [1, 17, 42]:

    cubic spline:      w_i(r̄) = 2/3 − 4r̄² + 4r̄³             r̄ ≤ 1/2
                       w_i(r̄) = 4/3 − 4r̄ + 4r̄² − (4/3)r̄³    1/2 ≤ r̄ ≤ 1
                       w_i(r̄) = 0                             r̄ > 1

    quartic spline:    w_i(r̄) = 1 − 6r̄² + 8r̄³ − 3r̄⁴        r̄ ≤ 1
                       w_i(r̄) = 0                             r̄ > 1

    Gaussian:          w_i(r) = [e^{−(r/c)²} − e^{−(r_max/c)²}] / [1 − e^{−(r_max/c)²}]   0 ≤ r ≤ r_max

    modified Gaussian: w̄_i(r) = w_i(r) / [1 − w_i(r) + ε]    0 ≤ r ≤ r_max   (7)

where r̄ = r/r_max, r is the distance from the point (x, y) to the point (x_i, y_i), i.e., r = ‖x − x_i‖ = √[(x − x_i)² + (y − y_i)²], r_max is the radius of the support, and c is the dilation parameter, which controls the sharpness of the weighting function. A typical value of c is between r_max/3 and r_max/2. The shape
of the support, which defines the region where the weighting function is non-zero, can be arbitrary. The parameter ε in the modified Gaussian weighting is a small number that prevents the weighting function from being singular at the center. Multidimensional weighting functions can be constructed as products of one-dimensional weighting functions. For example, it is possible to define the 2-D weighting function as the product of two 1-D weighting functions, one in each direction, i.e.,

    w_i(x, y) = w(x − x_i, y − y_i) = w(x − x_i) w(y − y_i)   (8)
In this case, the shape of the support/cloud is rectangular. The support size of the weighting function associated with a node i is selected to satisfy the following considerations [43]:
1. The support size should be large enough to cover a sufficient number of points, and these points should occupy all four quadrants around the star point (for boundary star points, the quadrants outside the domain are not considered).
2. The support size should be small enough to provide adequate local character to the approximation.
Algorithm 1 gives a procedure for determining the support size for a given point i; a code sketch follows the algorithm. Note that several other algorithms [8, 42, 44] are available for determining the support size. However, determining an "optimal" support size for a set of scattered points in meshless methods is still an open research topic.

Algorithm 1. Determination of the support size r_max for a given point i
1: Select the nearest N_E points in the domain (N_E is typically several times m).
2: For each selected point (x_j, y_j), j = 1, 2, ..., N_E, compute the distance from the point i, ρ_ij = √[(x_i − x_j)² + (y_i − y_j)²].
3: Sort the nodes in order of increasing ρ_ij and assign the first m nodes of the sort to a list.
4: Draw a ray from the point i to each node in the list.
5: If the angle between any two consecutive rays is greater than 90°, add the next node from the sort to the list and go to 4; if not, go to 6.
6: Set r_max = max(ρ_ij) over the list and multiply r_max by a scaling factor α_s. The value of the scaling factor is provided by the user.
Once the weighting function is selected, the unknown coefficients are computed by minimizing the weighted error (Eq. (6)):

    ∂E/∂a_j = 0,   j = 1, 2, ..., m   (9)
For a point (x, y), Eq. (9) leads to a linear system, which in matrix form is

    B W B^T a = B W u   (10)

where a is the m × 1 coefficient vector, u is the NP × 1 vector of nodal unknowns, and B is the m × NP matrix

    B = [ p_1(x_1, y_1)  p_1(x_2, y_2)  ···  p_1(x_NP, y_NP)
          p_2(x_1, y_1)  p_2(x_2, y_2)  ···  p_2(x_NP, y_NP)
          ⋮               ⋮              ⋱    ⋮
          p_m(x_1, y_1)  p_m(x_2, y_2)  ···  p_m(x_NP, y_NP) ]   (11)

W is the NP × NP diagonal matrix defined as

    W = diag[ w(x − x_1, y − y_1), w(x − x_2, y − y_2), ..., w(x − x_NP, y − y_NP) ]   (12)
Rewriting

    M(x, y) = B W B^T   (13)

and

    C(x, y) = B W   (14)

where the m × m matrix M(x, y) is called the moment matrix, the unknown coefficients follow from Eqs. (10), (13), and (14) as

    a = M^{−1} C u   (15)

Therefore, the approximation of the unknown function is given by

    u^a(x, y) = p^T (M^{−1} C) u   (16)

One can write Eq. (16) in short form as

    u^a(x, y) = N(x, y) u = Σ_{i=1}^{NP} N_i(x, y) u_i   (17)

Note that typically u_i ≠ u^a(x_i, y_i). In the moving least-squares method, the unknown coefficients a(x, y) are functions of (x, y). The approximation of the first derivatives of the unknown function is given by

    u^a_{,k}(x, y) = [p^T_{,k} (M^{−1} C) + p^T (M^{−1}_{,k} C + M^{−1} C_{,k})] u = N_{,k}(x, y) u   (18)
where k = 1 denotes the x-derivative and k = 2 the y-derivative. An alternative to the moving least-squares approximation is the fixed least-squares (FLS) approximation [10, 13]. In FLS, the unknown function u(x, y) is approximated by

    u^a(x, y) = Σ_{j=1}^{m} a_j p_j(x, y)   (19)

Note that a_j in Eq. (19) is not a function of (x, y), i.e., the coefficients a_j, j = 1, 2, ..., m, are constants for a given support or cloud. The weighting matrix W in the fixed least-squares approximation is

    W = diag[ w(x_K − x_1, y_K − y_1), w(x_K − x_2, y_K − y_2), ..., w(x_K − x_NP, y_K − y_NP) ]   (20)

where (x_K, y_K) is the center of the weighting function. Note that (x_K, y_K) can be arbitrary, and consequently the interpolation functions can be multivalued (see [18] for details). A unique set of interpolation functions can be constructed by fixing (x_K, y_K) at the center point (x, y), i.e., when computing N_i(x, y), i = 1, 2, ..., NP, and its derivatives, the center of the weighting function is always fixed at (x, y). Therefore, the moment matrix M and the matrix C are not functions of (x, y), and the derivatives of the function are given by

    u^a_{,k}(x, y) = p^T_{,k} (M^{−1} C) u,   k ∈ {1, 2}   (21)

Comparing Eqs. (18) and (21), it is easily seen that the cost of computing the derivatives in FLS is much lower than in MLS. However, it has been reported in the literature [6] that the derivatives obtained from FLS may be less accurate. Algorithm 2 gives the procedure for computing the moving least-squares approximation; in Algorithm 2, N_C is the number of points in a cloud, and a code sketch follows the algorithm.

Algorithm 2. Implementation of the moving least-squares approximation
1: Discretize the domain Ω into NP points covering the entire domain and its boundary Γ.
2: for each point (x_j, y_j) in the domain do
3:   Center the weighting function at the point.
4:   Search the nearby domain and determine the support size to get N_C points in the cloud by using Algorithm 1.
5:   Compute the matrices M, C and their derivatives.
6:   Compute the approximation functions N_i(x_j, y_j), i = 1, 2, ..., N_C, and their derivatives by using Eqs. (16) and (18).
7: end for
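A compact Python sketch of Algorithm 2 for a single evaluation point, restricted for brevity to the shape-function values of Eq. (16) with a linear basis (the derivative terms of Eq. (18) follow the same pattern); the Gaussian weight of Eq. (7) is assumed, and the cloud of points and its parameters are supplied by the caller:

```python
import numpy as np

def mls_shape_functions(x, y, nodes, rmax, c):
    """Moving least-squares shape functions N_i(x, y), Eq. (16).

    nodes : (NP, 2) coordinates of the points in the cloud of (x, y)
    Uses the linear basis p = [1, x, y]^T (m = 3).
    """
    def basis(px, py):
        return np.array([1.0, px, py])

    r = np.hypot(nodes[:, 0] - x, nodes[:, 1] - y)
    # truncated Gaussian weight of Eq. (7), centered at the star point (x, y)
    w = (np.exp(-(r/c)**2) - np.exp(-(rmax/c)**2)) / (1.0 - np.exp(-(rmax/c)**2))
    w = np.where(r <= rmax, w, 0.0)

    B = np.stack([basis(px, py) for px, py in nodes], axis=1)  # m x NP, Eq. (11)
    M = B @ np.diag(w) @ B.T                                   # moment matrix, Eq. (13)
    C = B @ np.diag(w)                                         # Eq. (14)
    return basis(x, y) @ np.linalg.solve(M, C)                 # N = p^T M^{-1} C, Eq. (16)

# usage (Eq. (17)): u_a = mls_shape_functions(x, y, cloud_nodes, rmax, c) @ u_nodal
```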
2.2. Kernel Approximations
Consider again an arbitrary 2-D domain, as shown in Fig. 2, and assume the domain is discretized into NP points or nodes. Then, for each node, an approximation function is generated by constructing a cloud about that node (also referred to as a star node). A support/cloud is constructed by centering a kernel (i.e., the weighting function in the case of the weighted least-squares
approximation) about the star point. The kernel is non-zero at the star point and at the few other nodes that are in the vicinity of the star point. Two types of kernel approximation can be considered: the reproducing kernel [15] and the fixed kernel [18]. In a 2-D reproducing kernel approach, the approximation u^a(x, y) to the unknown function u(x, y) is given by

    u^a(x, y) = ∫_Ω C(x, y, s, t) w(x − s, y − t) u(s, t) ds dt   (22)
where w is the kernel function centered at (x, y). Typical kernel functions are given by Eq. (7). C(x, y, s, t) is the correction function, which is given by

    C(x, y, s, t) = p^T(x − s, y − t) c(x, y)   (23)

where p^T = {p_1, p_2, ..., p_m} is an m × 1 vector of basis functions. In two dimensions, a quadratic polynomial basis vector is given by

    p^T = [1, x − s, y − t, (x − s)², (x − s)(y − t), (y − t)²],   m = 6   (24)

and c(x, y) is an m × 1 vector of unknown correction function coefficients. The correction function coefficients are computed by satisfying the consistency conditions, i.e.,

    ∫_Ω p^T(x − s, y − t) c(x, y) w(x − s, y − t) p_i(s, t) ds dt = p_i(x, y),   i = 1, 2, ..., m   (25)

In discrete form, Eq. (25) can be written as

    Σ_{I=1}^{NP} p^T(x − x_I, y − y_I) c(x, y) w(x − x_I, y − y_I) p_i(x_I, y_I) V_I = p_i(x, y),   i = 1, 2, ..., m   (26)
where NP is the number of points in the domain and V_I is the nodal volume of node I. Typically, a unit nodal volume is assumed for each node (see [18] for a discussion on nodal volumes). Equation (26) can be written in matrix form as

    M c(x, y) = p(x, y)   (27)

where M is the m × m moment matrix, which is a function of (x, y). The entries of the moment matrix are given by

    M_ij = Σ_{I=1}^{NP} p_j(x − x_I, y − y_I) w(x − x_I, y − y_I) p_i(x_I, y_I) V_I   (28)

From Eq. (27), the unknown correction function coefficients are computed as

    c(x, y) = M^{−1}(x, y) p(x, y)   (29)

Substituting the correction function coefficients into Eq. (23) and employing a discrete approximation for Eq. (22), we obtain

    u^a(x, y) = Σ_{I=1}^{NP} p^T(x, y) M^{−T}(x, y) p(x − x_I, y − y_I) w(x − x_I, y − y_I) V_I û_I = Σ_{I=1}^{NP} N_I(x, y) û_I   (30)

where û_I is the nodal parameter for node I, and N_I(x, y) is the reproducing kernel meshless interpolation function. The first derivatives of the correction function coefficients can be computed from Eq. (27):

    M_{,k}(x, y) c(x, y) + M(x, y) c_{,k}(x, y) = p_{,k}(x, y)   (31)

    c_{,k} = M^{−1} (p_{,k} − M_{,k} c)   (32)

where k = 1 (for the x-derivative) or k = 2 (for the y-derivative). Thus, the first derivatives of the approximation can be written as

    [u^a(x, y)]_{,k} = Σ_{I=1}^{NP} [(c^T)_{,k} p w + c^T p_{,k} w + c^T p w_{,k}] V_I û_I = Σ_{I=1}^{NP} N_{I,k}(x, y) û_I   (33)

Similarly, the second derivatives of the correction function coefficients are given by

    M_{,mn}(x, y) c(x, y) + M_{,m}(x, y) c_{,n}(x, y) + M_{,n}(x, y) c_{,m}(x, y) + M(x, y) c_{,mn}(x, y) = p_{,mn}(x, y)   (34)

    c_{,mn} = M^{−1} (p_{,mn} − M_{,mn} c − M_{,m} c_{,n} − M_{,n} c_{,m})   (35)
where m, n = x or y, and

    [u^a(x, y)]_{,mn} = Σ_{I=1}^{NP} [(c^T)_{,mn} p w + c^T p_{,mn} w + c^T p w_{,mn} + (c^T)_{,m} p_{,n} w + (c^T)_{,m} p w_{,n} + (c^T)_{,n} p w_{,m} + (c^T)_{,n} p_{,m} w + c^T p_{,m} w_{,n} + c^T p_{,n} w_{,m}] V_I û_I = Σ_{I=1}^{NP} N_{I,mn}(x, y) û_I   (36)

The other major type of kernel approximation is the fixed-kernel approximation. In a fixed-kernel approximation, the unknown function u(x, y) is approximated by

    u^a(x, y) = ∫_Ω C(x, y, x_K − s, y_K − t) w(x_K − s, y_K − t) u(s, t) ds dt   (37)

Note that in the fixed-kernel approximation, the center of the kernel is fixed at (x_K, y_K) for a given cloud. Following the same procedure as in the reproducing kernel approximation, one can obtain the discrete form of the fixed-kernel approximation

    u^a(x, y) = Σ_{I=1}^{NP} p^T(x, y) M^{−T}(x_K, y_K) p(x_K − x_I, y_K − y_I) w(x_K − x_I, y_K − y_I) V_I û_I = Σ_{I=1}^{NP} N_I(x, y) û_I   (38)

Since (x_K, y_K) can be arbitrary in Eq. (38), the interpolation functions obtained by Eq. (38) are multivalued. A unique set of interpolation functions can be constructed by computing N_I(x_K, y_K), I = 1, 2, ..., NP, when the kernel is centered at (x_K, y_K) (see [18] for more details). Equation (38) shows that only the leading polynomial basis vector is a function of (x, y). Therefore, the derivatives of the interpolation functions can be computed simply by differentiating the polynomial basis vector in Eq. (38). For example, the first and second x-derivatives are computed as

    N_{I,x}(x, y) = [0 1 0 2x y 0] M^{−T} p(x_K − x_I, y_K − y_I) w(x_K − x_I, y_K − y_I) V_I   (39)

    N_{I,xx}(x, y) = [0 0 0 2 0 0] M^{−T} p(x_K − x_I, y_K − y_I) w(x_K − x_I, y_K − y_I) V_I   (40)
It has been proved in [26] that, if the nodal volume is taken to be 1 for each node, the reproducing kernel approximation is mathematically equivalent to the moving least-squares approximation, and the fixed-kernel approximation is equivalent to the fixed least-squares approximation. The construction of the approximation functions by the fixed-kernel approximation method is given by Algorithm 3, and a code sketch follows the algorithm.

Algorithm 3. Implementation of the fixed-kernel approximation
1: Allocate NP points to cover the domain Ω and its boundary Γ.
2: for each point (x_j, y_j) in the domain do
3:   Center the weighting function at the point.
4:   Determine the support size to get N_C points in the cloud by using Algorithm 1.
5:   Compute the moment matrix M and the basis vector p(x, y).
6:   Solve M c = p.
7:   Compute the approximation functions N_I(x_j, y_j), I = 1, 2, ..., N_C, and their derivatives by using Eqs. (38)–(40).
8: end for
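A Python sketch of steps 5–7 of Algorithm 3 for a single cloud, assuming the quadratic basis of Eqs. (38)–(40), unit nodal volumes, the Gaussian weight of Eq. (7), and the kernel fixed at the cloud center (xc, yc); only the first x-derivative is shown:

```python
import numpy as np

def fixed_kernel_shape_functions(x, y, xc, yc, nodes, rmax, c, V=None):
    """Fixed-kernel shape functions N_I(x, y), Eq. (38), and N_{I,x}, Eq. (39).

    Quadratic basis p = [1, x, y, x^2, x*y, y^2]^T (m = 6).
    """
    V = np.ones(len(nodes)) if V is None else V

    def p(px, py):
        return np.array([1.0, px, py, px*px, px*py, py*py])

    dx, dy = xc - nodes[:, 0], yc - nodes[:, 1]
    r = np.hypot(dx, dy)
    w = (np.exp(-(r/c)**2) - np.exp(-(rmax/c)**2)) / (1.0 - np.exp(-(rmax/c)**2))
    w = np.where(r <= rmax, w, 0.0)

    Pg = np.stack([p(a, b) for a, b in nodes], axis=1)         # p_i(x_I, y_I)
    Ps = np.stack([p(a, b) for a, b in zip(dx, dy)], axis=1)   # p_j(xc-x_I, yc-y_I)
    M = (Pg * (w * V)) @ Ps.T                                  # moment matrix, Eq. (28)

    core = np.linalg.solve(M.T, Ps * (w * V))                  # M^{-T} p(.) w V per node
    N = p(x, y) @ core                                         # Eq. (38)
    N_x = np.array([0.0, 1.0, 0.0, 2*x, y, 0.0]) @ core        # Eq. (39)
    return N, N_x
```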
2.3. Radial Basis Approximation
In a radial basis meshless approximation, the approximation of an unknown function u(x, y) is written as a linear combination of NP radial functions [19],

    u^a(x, y) = Σ_{j=1}^{NP} α_j φ(x, y, x_j, y_j)   (41)

where NP is the number of points in the domain, φ is the radial basis function and α_j, j = 1, 2, ..., NP, are the unknown coefficients. The unknown coefficients α_1, ..., α_NP can be computed by solving the governing equation using either a collocation or a Galerkin method, as discussed in the following sections. The partial derivatives of the approximation function in multidimensional space can be calculated as

    ∂^k u^a(x, y)/∂x^a ∂y^b = Σ_{j=1}^{NP} α_j ∂^k φ(x, y, x_j, y_j)/∂x^a ∂y^b   (42)
where a, b ∈ {0, 1, 2} and k = a + b. The multiquadric [19–21] and thin-plate spline [45] functions are among the most popular radial basis functions. The multiquadric radial basis function is given by

    φ(x, y, x_j, y_j) = Φ(r_j) = (r_j² + c_j²)^{1/2}   (43)
where r_j = ‖x − x_j‖ is the Euclidean distance and c_j is a constant. The value of c controls the shape of the basis function. The reciprocal multiquadric radial basis function has the form

    Φ(r) = 1 / (r² + c²)^{1/2}   (44)

The thin-plate spline radial basis function is given by

    Φ(r) = r^{2m} log r   (45)

where m is the order of the thin-plate spline. To avoid singularity of the interpolation system, a polynomial function is often added to the approximation in Eq. (41) [46]. The modified approximation is given by

    u^a(x, y) = Σ_{j=1}^{NP} α_j φ(x, y, x_j, y_j) + Σ_{i=1}^{m} β_i p_i(x, y)   (46)

along with the m additional constraints

    Σ_{j=1}^{NP} α_j p_i(x_j, y_j) = 0,   i = 1, ..., m   (47)

where β_i, i = 1, 2, ..., m, are the unknown coefficients and the p_i(x, y) are the polynomial basis functions defined in Eq. (5). Equations (46) and (47) lead to a positive definite linear system which is guaranteed to be nonsingular; see the sketch below. The radial basis function approximation shown above is global, since the radial basis functions are non-zero everywhere in the domain. A dense linear system must be solved to obtain the unknown coefficients, and the computational cost can be very high when the domain contains a large number of points. Recently, compactly supported radial basis functions have been proposed and applied to solve PDEs with a largely reduced computational cost. For more details on compactly supported RBFs, please refer to [23].
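As a small illustration of Eqs. (46)–(47), the following Python sketch solves the augmented multiquadric system for scattered-data interpolation (the shape constant c is an illustrative choice; for PDE solution the coefficients would instead be determined by collocation or Galerkin discretization, as discussed in the next section):

```python
import numpy as np

def rbf_coefficients(nodes, u, c=1.0):
    """Solve the augmented multiquadric system of Eqs. (46)-(47).

    nodes : (NP, 2) coordinates; u : (NP,) nodal values.
    Uses linear polynomial augmentation p = [1, x, y] (m = 3).
    """
    NP = len(nodes)
    r = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
    Phi = np.sqrt(r**2 + c**2)                 # multiquadric, Eq. (43)
    P = np.column_stack([np.ones(NP), nodes])  # polynomial block
    A = np.block([[Phi, P], [P.T, np.zeros((3, 3))]])
    coef = np.linalg.solve(A, np.concatenate([u, np.zeros(3)]))
    return coef[:NP], coef[NP:]                # alpha_j, beta_i

def rbf_eval(x, y, nodes, alpha, beta, c=1.0):
    """Evaluate u^a(x, y) of Eq. (46)."""
    r = np.hypot(nodes[:, 0] - x, nodes[:, 1] - y)
    return alpha @ np.sqrt(r**2 + c**2) + beta @ np.array([1.0, x, y])
```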
3. Discretization
As shown in Eqs. (17), (18), (30), (33), (36), (38) and (41), although each approximation method has a different way of computing the approximation functions, all the methods presented in the previous sections represent u(x, y) in the same general form,

    u^a(x, y) = Σ_{I=1}^{NP} N_I(x, y) û_I   (48)
and the approximation of the derivatives can also be written in the general form

    ∂^k u^a(x, y)/∂x^a ∂y^b = Σ_{I=1}^{NP} [∂^k N_I(x, y)/∂x^a ∂y^b] û_I   (49)

where a, b ∈ {0, 1, 2} and k = a + b. After the approximation functions are constructed, the next step is to compute the unknown coefficients in Eq. (48) by discretizing the governing equations. The meshless discretization techniques can be broadly classified into three categories: (1) point collocation; (2) cell integration; and (3) local domain integration.
3.1. Point Collocation
Point collocation is the simplest and easiest way to discretize the governing equations. In a point collocation approach, the governing equations for a physical problem can be written in the following general form:

    L(u(x, y)) = f(x, y) in Ω   (50)
    G(u(x, y)) = g(x, y) on Γ_g   (51)
    H(u(x, y)) = h(x, y) on Γ_h   (52)

where Ω is the domain, Γ_g is the portion of the boundary on which Dirichlet boundary conditions are specified, Γ_h is the portion of the boundary on which Neumann boundary conditions are specified, and L, G and H are the differential, Dirichlet and Neumann operators, respectively. The boundary of the domain is given by Γ = Γ_g ∪ Γ_h. After the meshless approximation functions are constructed, for each interior node the point collocation technique simply substitutes the approximated unknown into the governing equation. For nodes with prescribed boundary conditions, the approximate solution or its derivative is substituted into the given Dirichlet or Neumann-type boundary condition, respectively. Therefore, the discretized governing equations are given by

    L(u^a) = f(x, y) for points in Ω   (53)
    G(u^a) = g(x, y) for points on Γ_g   (54)
    H(u^a) = h(x, y) for points on Γ_h   (55)
The point collocation approach gives rise to a linear system of equations of the form

    K û = F   (56)
The solution of Eq. (56) provides the nodal parameters at the nodes. Once the nodal parameters are computed, the unknown solution at each node can be computed from Eq. (48). Let us revisit the heat conduction problem presented in Section 1 as an example. The governing equation is the steady-state heat conduction equation along with the appropriate boundary conditions stated in Eqs. (1)–(3). As shown in Fig. 3(a), points are distributed over the domain and the boundary. Using the meshless approximation functions, the nodal temperature can be expressed by Eq. (48). If a node i is an interior node, the governing equation is satisfied, i.e.,

    ∇²(Σ_{I=1}^{NP} N_I(x_i, y_i) û_I) = Σ_{I=1}^{NP} [∇²N_I(x_i, y_i)] û_I = 0   (57)

If a node j is a boundary node with a Dirichlet boundary condition, we have

    Σ_{I=1}^{NP} N_I(x_j, y_j) û_I = g(x_j, y_j)   (58)

and if a node q is a boundary node with a Neumann boundary condition (heat flux at the boundary),

    ∂(Σ_{I=1}^{NP} N_I(x_q, y_q) û_I)/∂n = Σ_{I=1}^{NP} [∂N_I(x_q, y_q)/∂n] û_I = h(x_q, y_q)   (59)

Assuming that there are ni interior points, nd Dirichlet boundary points, and nn Neumann boundary points (NP = ni + nd + nn) in the domain, the final
Figure 3. Meshless discretization: (a) point collocation; (b) cell integration; (c) local domain integration.
linear system takes the form

\begin{bmatrix}
\nabla^2 N_1(x_1) & \nabla^2 N_2(x_1) & \cdots & \nabla^2 N_{NP}(x_1)\\
\vdots & \vdots & \ddots & \vdots\\
\nabla^2 N_1(x_{ni}) & \nabla^2 N_2(x_{ni}) & \cdots & \nabla^2 N_{NP}(x_{ni})\\
N_1(x_{ni+1}) & N_2(x_{ni+1}) & \cdots & N_{NP}(x_{ni+1})\\
\vdots & \vdots & \ddots & \vdots\\
N_1(x_{ni+nd}) & N_2(x_{ni+nd}) & \cdots & N_{NP}(x_{ni+nd})\\
\partial N_1(x_{ni+nd+1})/\partial n & \partial N_2(x_{ni+nd+1})/\partial n & \cdots & \partial N_{NP}(x_{ni+nd+1})/\partial n\\
\vdots & \vdots & \ddots & \vdots\\
\partial N_1(x_{NP})/\partial n & \partial N_2(x_{NP})/\partial n & \cdots & \partial N_{NP}(x_{NP})/\partial n
\end{bmatrix}
\begin{bmatrix} \hat u_1\\ \hat u_2\\ \vdots\\ \hat u_{NP} \end{bmatrix}
=
\begin{bmatrix} 0\\ \vdots\\ 0\\ g(x_{ni+1})\\ \vdots\\ g(x_{ni+nd})\\ h(x_{ni+nd+1})\\ \vdots\\ h(x_{NP}) \end{bmatrix}   (60)

where x_{ni} denotes the coordinates of node ni. Equation (60) can be solved to obtain the nodal parameters û. The nodal temperature can then be computed by using Eq. (48). Algorithm 4 summarizes the key steps involved in the implementation of a point collocation method for linear problems.

Algorithm 4. Implementation of a point collocation technique for the numerical solution of PDEs
1: Compute the meshless approximations for the unknown solution
2: for each point in the domain do
3:   if the node is in the interior of the domain then
4:     substitute the approximation of the solution into the governing equation
5:   else if the node is on the Dirichlet boundary then
6:     substitute the approximation of the solution into the Dirichlet boundary condition
7:   else if the node is on the Neumann boundary then
8:     substitute the approximation of the solution into the Neumann boundary condition
9:   end if
10:  assemble the corresponding row of Eq. (60)
11: end for
12: Solve Eq. (60) to obtain the nodal parameters
13: Compute the solution by using Eq. (48)

The point collocation steps are the same for nonlinear problems. However, a linear system such as Eq. (60) cannot be obtained directly by substituting the approximated unknown into the governing equation and the boundary conditions; Newton's method can be used to solve the discretized nonlinear system (please refer to [47] for details). The point collocation method provides a simple, efficient and flexible meshless method for interior domain numerical analysis. Many meshless methods, such as the finite point method [10], the finite cloud method [18] and the h–p meshless cloud method [8], employ the point collocation technique to discretize the governing equation. However, there are several issues one needs to pay attention to in order to improve the robustness of the point collocation method:
1. Ensuring the quality of clouds: We have found that, for scattered point distributions, the quality of the clouds is directly related to the numerical error in the solution. When the point distribution is highly scattered, it is likely that certain stability conditions, namely the positivity conditions (see [42] for details), could be violated for certain clouds. For this reason, the modified Gaussian, cubic or quartic inverse-distance functions [42] are better choices for the kernel/weighting function in point collocation. In [42], we have proposed quantitative criteria to measure the cloud quality, and approaches to ensure the satisfaction of the positivity conditions for 1-D and 2-D problems. However, for very bad point distributions, it can be difficult to satisfy the positivity conditions, and modification of the point distribution may be necessary.
2. Improving the accuracy for high-aspect-ratio clouds: As with the conventional finite difference and finite element methods, large errors can occur in collocation meshless methods when the point distribution has a high aspect ratio (i.e., an anisotropic cloud). Further investigation is needed to deal with the high-aspect-ratio problem. A sketch of the collocation procedure is given below.
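The following Python sketch illustrates the collocation structure of Eqs. (56)–(60) on a model problem. For self-containedness it uses global multiquadric radial basis functions (Section 2.3) rather than least-squares or kernel shape functions, and imposes Dirichlet data on the whole boundary, so the Neumann rows of Eq. (60) are absent; all parameter values are illustrative:

```python
import numpy as np

def multiquadric(r2, c):
    return np.sqrt(r2 + c*c)

def solve_laplace_collocation(interior, boundary, g, c=0.5):
    """Point collocation for Laplace's equation, Eq. (1), with global
    multiquadric basis functions, Eq. (41); Dirichlet data g on the boundary."""
    nodes = np.vstack([interior, boundary])
    r2 = ((nodes[:, None, :] - nodes[None, :, :])**2).sum(axis=2)
    phi = multiquadric(r2, c)
    ni = len(interior)
    K = np.empty_like(phi)
    # Laplacian of the multiquadric: (r^2 + 2c^2) / phi^3
    K[:ni] = (r2[:ni] + 2*c*c) / phi[:ni]**3   # interior rows, as in Eq. (57)
    K[ni:] = phi[ni:]                          # Dirichlet rows, as in Eq. (58)
    F = np.concatenate([np.zeros(ni),
                        np.array([g(x, y) for x, y in boundary])])
    coef = np.linalg.solve(K, F)               # expansion coefficients, Eq. (56)
    return lambda x, y: coef @ multiquadric(((nodes - [x, y])**2).sum(axis=1), c)

# quick check on the unit square with the harmonic solution u = x*y
xs = np.linspace(0.0, 1.0, 9)
pts = np.array([(x, y) for x in xs for y in xs])
onb = (pts[:, 0] % 1 == 0) | (pts[:, 1] % 1 == 0)
u = solve_laplace_collocation(pts[~onb], pts[onb], g=lambda x, y: x*y)
print(abs(u(0.5, 0.5) - 0.25))                 # error should be small
```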
3.2. Cell Integration
Another approach to discretizing the governing equation is the Galerkin method. The Galerkin approach is based on the weak form of the governing equations, which can be obtained by weighting the residual of the governing equation. For the heat conduction problem, a weak form of the governing equation can be written as

    ∫_Ω w ∇²u dΩ + ∫_{Γ_u} v (u − g(x, y)) dΓ = 0   (61)

where w and v are the test functions for the governing equation and the Dirichlet boundary condition, respectively. Note that the second integral in Eq. (61) is used to enforce the Dirichlet boundary condition. By applying the
2464
G. Li et al.
divergence theorem and imposing the natural boundary condition, Eq. (61) can be written as

    ∫_{Γ_u} (∂u/∂n) w dΓ + ∫_{Γ_q} (∂u/∂n) w dΓ − ∫_Ω u_{,i} w_{,i} dΩ + ∫_{Γ_u} v (u − g(x, y)) dΓ = 0   (62)

The approximation of the unknown function is given by the meshless approximation (Eq. (48)), and the normal derivative of the unknown function can be computed as

    ∂u^a/∂n = Σ_{I=1}^{NP} [(∂N_I/∂x) n_x + (∂N_I/∂y) n_y] û_I   (63)

Denoting

    Λ_I = (∂N_I/∂x) n_x + (∂N_I/∂y) n_y   (64)

the normal derivative of the unknown function can be rewritten as

    ∂u/∂n = Σ_{I=1}^{NP} Λ_I û_I   (65)

We choose the test functions w and v as

    w = Σ_{I=1}^{NP} N_I û_I   (66)

    v = Σ_{I=1}^{NP} Λ_I û_I   (67)
Substituting the approximations into the weak form, we obtain

    Σ_{I=1}^{NP} û_I [ Σ_{J=1}^{NP} ∫_Ω N_{I,i} N_{J,i} dΩ û_J − Σ_{J=1}^{NP} ∫_{Γ_u} N_I Λ_J dΓ û_J − Σ_{J=1}^{NP} ∫_{Γ_u} Λ_I N_J dΓ û_J ]
    = Σ_{I=1}^{NP} û_I [ ∫_{Γ_q} N_I h(x, y) dΓ − ∫_{Γ_u} Λ_I g(x, y) dΓ ]   (68)
Meshless methods for numerical solution
2465
Equation (68) can be simplified to

    Σ_{J=1}^{NP} [ ∫_Ω N_{I,i} N_{J,i} dΩ − ∫_{Γ_u} N_I Λ_J dΓ − ∫_{Γ_u} Λ_I N_J dΓ ] û_J = ∫_{Γ_q} N_I h(x, y) dΓ − ∫_{Γ_u} Λ_I g(x, y) dΓ   (69)

or, in matrix form,

    [K − G − G^T] û = h − g   (70)

where the entries of the coefficient matrix and the right-hand-side vector are given by

    K_IJ = ∫_Ω N_{I,i} N_{J,i} dΩ   (71)

    G_IJ = ∫_{Γ_u} N_I Λ_J dΓ   (72)

    h_I = ∫_{Γ_q} N_I h(x, y) dΓ   (73)

    g_I = ∫_{Γ_u} Λ_I g(x, y) dΓ   (74)
As shown in Eqs. (71)–(74), the entries of the matrices and the right-hand-side vector are integrals over the domain or over the boundary. Since there is no mesh available to compute these integrals, one approach is to use a background cell structure, as shown in Fig. 3(b). The integrations are computed by summing appropriately over the cells and using Gauss quadrature in each cell. The implementation of cell integration is summarized in Algorithm 5, and a sketch of the assembly loop follows the algorithm. In a cell integration approach, the approximation order is reduced, i.e., for a second-order PDE there is no need to compute the second derivatives of the approximation functions. However, the cell integration approach requires background cells, and the treatment of the boundary cells is not straightforward. The element-free Galerkin method [6], the partition of unity finite element method [12], the diffuse element method [5] and the reproducing kernel particle method [15] are among the meshless methods using the cell integration technique for discretizing the governing equation.
Algorithm 5. Implementation of the cell integration technique [48]
1: Compute the meshless approximations
2: Generate the background cells that cover the domain.
3: for each cell C_i do
4:   for each quadrature point x_Q in the cell do
5:     if the quadrature point is inside the physical domain then
6:       Check all nodes in the cell C_i and the surrounding cells to determine the nodes x_I in the domain of influence of x_Q
7:       if x_I − x_Q does not intersect the boundary segment then
8:         Compute N_I(x_Q) and N_{I,i}(x_Q) at the quadrature point.
9:         Evaluate the contributions to the integrals.
10:        Assemble the contributions to the coefficient matrix.
11:      end if
12:    end if
13:  end for
14: end for
15: Solve Eq. (70) to obtain the nodal parameters
16: Compute the solution by using Eq. (48)
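A Python sketch of the quadrature loop in steps 3–14 of Algorithm 5, restricted to the domain integral K_IJ of Eq. (71); it assumes rectangular background cells lying entirely inside the domain and a user-supplied routine grad_N (built, for example, with Algorithm 2) returning the non-zero shape-function gradients at a point:

```python
import numpy as np

# 2x2 Gauss rule on the reference cell [-1, 1]^2 (all weights equal to 1)
gp = np.array([-1.0, 1.0]) / np.sqrt(3.0)
quad_pts = [(a, b) for a in gp for b in gp]

def assemble_stiffness(cells, grad_N, NP):
    """Cell-integration assembly of K_IJ = int N_{I,i} N_{J,i} dOmega, Eq. (71).

    cells  : list of ((x0, y0), (x1, y1)) background-cell corners
    grad_N : grad_N(x, y) -> (ids, dNdx, dNdy) for the shape functions
             whose support covers (x, y)
    """
    K = np.zeros((NP, NP))
    for (x0, y0), (x1, y1) in cells:
        jac = 0.25 * (x1 - x0) * (y1 - y0)       # reference-to-cell Jacobian
        for xi, eta in quad_pts:
            x = 0.5 * ((1 - xi) * x0 + (1 + xi) * x1)
            y = 0.5 * ((1 - eta) * y0 + (1 + eta) * y1)
            ids, dNdx, dNdy = grad_N(x, y)
            K[np.ix_(ids, ids)] += jac * (np.outer(dNdx, dNdx) +
                                          np.outer(dNdy, dNdy))
    return K
```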
3.3. Local Domain Integration
Another method for discretizing the governing equation is based on the concept of local domain integration [9]. In the local domain integration method, the global domain is covered by local subdomains, as shown in Fig. 3(c). The local domains can be of arbitrary shape (typically circles or squares, which are convenient for integration) and can overlap with each other. In the heat conduction example, for a given node, a generalized local weak form over the node's subdomain Ω_s can be written as

    ∫_{Ω_s} v ∇²u dΩ − α ∫_{Γ_su} v (u − u_b) dΓ = 0   (75)

where Γ_su = ∂Ω_s ∩ Γ_u is the intersection of the boundary of Ω_s and the global Dirichlet boundary, and u_b is the prescribed boundary value. For nodes near or on the global boundary, ∂Ω_s = Γ_s + L_s, where Γ_s is the part of the local domain boundary that also lies on the global boundary and L_s is the remaining part of the local boundary, which is inside the global domain. α ≫ 1 is a penalty parameter used to impose the Dirichlet boundary conditions.
By applying the divergence theorem and imposing the Neumann boundary condition, for any local domain Ω_s we obtain the local weak form

    ∫_{L_s} v (∂u/∂n) dΓ + ∫_{Γ_su} v (∂u/∂n) dΓ − ∫_{Ω_s} u_{,k} v_{,k} dΩ + ∫_{Γ_sq} h(x, y) v dΓ − α ∫_{Γ_su} v (u − g(x, y)) dΓ = 0   (76)

in which Γ_sq is the intersection of the boundary of Ω_s and the global Neumann boundary. For a subdomain located entirely within the global domain, there is no intersection between ∂Ω_s and Γ, and the integrals over Γ_su and Γ_sq vanish. In order to simplify the above equation, one can deliberately select a test function v such that it vanishes over ∂Ω_s. This can easily be accomplished by using the weighting function of the meshless approximation as the test function, with the support of the weighting function set to the size of the corresponding local domain Ω_s. In this way, the test function vanishes on the boundary of the local domain. By substituting the test function and the meshless approximation of the unknown (Eq. (48)) into the local weak form (Eq. (76)), we obtain the matrix form

    K û = f   (77)
where

    K_ij = ∫_{Ω_si} N_{j,k} v_{i,k} dΩ + α ∫_{Γ_sui} N_j v_i dΓ − ∫_{Γ_sui} N_{j,n} v_i dΓ   (78)

and

    f_i = ∫_{Γ_sqi} h(x, y) v_i dΓ + α ∫_{Γ_sui} g(x, y) v_i dΓ   (79)

in which Ω_si, Γ_sui and Γ_sqi are the domain and boundaries of the local domain i, and N_{j,n} denotes the normal derivative of N_j. The integrations in Eqs. (78) and (79) can be computed within each local domain by using Gauss quadrature. The implementation of the local domain integration can be carried out as summarized in Algorithm 6. Meshless methods based on local domain integration include the meshless local Petrov–Galerkin method [9] and the method of finite spheres [24].
Algorithm 6. Implementation of the local domain integration technique
1: Compute the meshless approximations for the unknown solution
2: for each node (x_i, y_i) do
3:   Determine the local subdomain Ω_s and its corresponding local boundary ∂Ω_s
4:   Determine the Gauss quadrature points x_Q in Ω_s and on ∂Ω_s
5:   for each quadrature point x_Q in the local domain do
6:     Compute N_i(x_Q) and N_{i,j}(x_Q) at the quadrature point x_Q.
7:     Evaluate the contributions to the integrals.
8:     Assemble the contributions to the coefficient matrix.
9:   end for
10: end for
11: Solve Eq. (77) to obtain the nodal parameters
12: Compute the solution by using Eq. (48)
4. Summary of Meshless Methods
In this paper, we have introduced several approaches for constructing meshless approximations and three approaches for discretizing the governing equations. Many meshless methods published in the literature can be viewed as different combinations of the approximation and discretization approaches introduced in the previous sections. Table 1 lists the popular methods with their approximation and discretization components.
Table 1. Categories of meshless methods, by approximation technique and discretization technique:

Moving least-squares: finite point method [10] (point collocation); element-free Galerkin method [6] and partition of unity finite element method [12] (cell integration Galerkin); meshless local Petrov–Galerkin method [3] and method of finite spheres [24] (local domain integration Galerkin).
Fixed least-squares: generalized finite difference method [7], h–p meshless cloud method [8] and finite point method [10] (point collocation); diffuse element method [5] (cell integration Galerkin).
Reproducing kernel: finite cloud method [18] (point collocation); reproducing kernel particle method [15] (cell integration Galerkin).
Fixed kernel: finite cloud method [18] (point collocation).
Radial basis: many methods in both the point collocation and the cell integration categories.
5. Example: Finite Cloud Method for Solving Linear Elasticity Problems
As shown in Fig. 4, an elastic plate containing three holes and a notch is subjected to a uniform pressure at its right edge [49]. We solve this problem by using the finite cloud method to demonstrate the effectiveness of the meshless method. To assess the accuracy of the solution, the problem is solved both by the finite element method, using ANSYS, and by the finite cloud method. We construct the FCM discretization by employing the same set of nodes as the FEM. For two-dimensional elasticity, there are two unknowns associated with each node in the domain, namely the displacements u and v in the x and y directions. The governing equations, assuming zero body force, can be written as the Navier–Cauchy equations of elasticity

    ∇²u + [1/(1 − 2ν̄)] ∂/∂x (∂u/∂x + ∂v/∂y) = 0
    ∇²v + [1/(1 − 2ν̄)] ∂/∂y (∂u/∂x + ∂v/∂y) = 0   (80)

with

    ν̄ = ν for plane strain;   ν̄ = ν/(1 + ν) for plane stress   (81)

where ν is Poisson's ratio. In this paper we consider the plane stress situation. In the finite cloud method, the first step is to construct the fixed-kernel approximation for the displacements u and v by using Algorithm 3. In this example, the cloud size is set for each node to cover 25 neighboring nodes.
Figure 4. Plate with holes (thickness 1, ν = 0.3, E = 20, q = 1.0).
2470
G. Li et al.
The 2-D version of the modified Gaussian weighting function (Eq. (7)) is used as the kernel. After the approximation functions are computed, a point collocation approach is used to discretize the governing equation and the boundary conditions by using Algorithm 4, yielding the solution for the displacements. Figure 5 shows the deformed shape obtained by the FEM code ANSYS. The FEM mesh consists of 4474 nodes, and all 4474 ANSYS nodes are taken as the points in the FCM simulation. The deformed shape obtained by the FCM is shown in Fig. 6. The results obtained from the FEM and the FCM agree with each other quite well, and the difference in the maximum displacement is within 1%. Figure 7 shows a quantitative comparison of the computed σ_xx stress on the surfaces of the holes obtained from the two methods.

Figure 5. Deformed shape obtained by the finite element method.
Figure 6. Deformed shape obtained by the finite cloud method.
Meshless methods for numerical solution
2471
Figure 7. Comparison of results for σ_xx at the lower left circular boundary.
The results show very good agreement and demonstrate that the FCM approach provides accurate results for problems with complex geometries.
Remarks:
1. The construction of approximation functions is more expensive in meshless methods than the construction of interpolation functions in the FEM. The integration cost in Galerkin meshless methods is also higher: Galerkin meshless methods can be a few times slower (typically more than five times) than the FEM [25].
2. Collocation meshless methods are much faster since no numerical integrations are involved. However, they may need more points, and their robustness needs to be addressed [42].
3. Meshless methods introduce a lot of flexibility. One needs to sprinkle only a set of points or nodes covering the computational domain, as shown in Fig. 6, with no connectivity information required among the points. This property is very appealing because of its potential in adaptive techniques, where a user can simply add more points in a particular region to obtain more accurate results.
References
[1] T. Belytschko, Y. Krongauz, D. Organ, M. Fleming, and P. Krysl, "Meshless methods: an overview and recent developments," Comput. Methods Appl. Mech. Engrg., 139, 3–47, 1996.
[2] S. Li and W.K. Liu, "Meshfree and particle methods and their applications," Appl. Mech. Rev., 55, 1–34, 2002.
[3] S.N. Atluri, The Meshless Local Petrov–Galerkin (MLPG) Method, Tech Science Press, 2002.
[4] P. Lancaster and K. Salkauskas, "Surfaces generated by moving least squares methods," Math. Comput., 37, 141–158, 1981.
[5] B. Nayroles, G. Touzot, and P. Villon, "Generalizing the finite element method: diffuse approximation and diffuse elements," Comput. Mech., 10, 307–318, 1992.
[6] T. Belytschko, Y.Y. Lu, and L. Gu, "Element-free Galerkin methods," Int. J. Numer. Methods Eng., 37, 229–256, 1994.
[7] T.J. Liszka and J. Orkisz, "The finite difference method at arbitrary irregular grids and its application in applied mechanics," Comput. Struct., 11, 83–95, 1980.
[8] T.J. Liszka, C.A. Duarte, and W.W. Tworzydlo, "hp-meshless cloud method," Comput. Methods Appl. Mech. Eng., 139, 263–288, 1996.
[9] S.N. Atluri and T. Zhu, "A new meshless local Petrov–Galerkin (MLPG) approach in computational mechanics," Comput. Mech., 22, 117–127, 1998.
[10] E. Oñate, S. Idelsohn, O.C. Zienkiewicz, and R.L. Taylor, "A finite point method in computational mechanics. Applications to convective transport and fluid flow," Int. J. Numer. Methods Eng., 39, 3839–3866, 1996.
[11] E. Oñate, S. Idelsohn, O.C. Zienkiewicz, R.L. Taylor, and C. Sacco, "A stabilized finite point method for analysis of fluid mechanics problems," Comput. Methods Appl. Mech. Eng., 139, 315–346, 1996.
[12] I. Babuška and J.M. Melenk, "The partition of unity method," Int. J. Numer. Meth. Eng., 40, 727–758, 1997.
[13] P. Breitkopf, A. Rassineux, G. Touzot, and P. Villon, "Explicit form and efficient computation of MLS shape functions and their derivatives," Int. J. Numer. Methods Eng., 48(3), 451–466, 2000.
[14] J.J. Monaghan, "Smoothed particle hydrodynamics," Annu. Rev. Astron. Astrophys., 30, 543–574, 1992.
[15] W.K. Liu, S. Jun, S. Li, J. Adee, and T. Belytschko, "Reproducing kernel particle methods for structural dynamics," Int. J. Numer. Methods Eng., 38, 1655–1679, 1995.
[16] J.-S. Chen, C. Pan, C. Wu, and W.K. Liu, "Reproducing kernel particle methods for large deformation analysis of non-linear structures," Comput. Methods Appl. Mech. Eng., 139, 195–227, 1996.
[17] N.R. Aluru, "A point collocation method based on reproducing kernel approximations," Int. J. Numer. Methods Eng., 47, 1083–1121, 2000.
[18] N.R. Aluru and G. Li, "Finite cloud method: a true meshless technique based on a fixed reproducing kernel approximation," Int. J. Numer. Methods Eng., 50(10), 2373–2410, 2001.
[19] R.L. Hardy, "Multiquadric equations for topography and other irregular surfaces," J. Geophys. Res., 76, 1905–1915, 1971.
[20] E.J. Kansa, "Multiquadrics – a scattered data approximation scheme with applications to computational fluid dynamics – I. Surface approximations and partial derivative estimates," Comp. Math. Appl., 19, 127–145, 1990.
[21] E.J. Kansa, "Multiquadrics – a scattered data approximation scheme with applications to computational fluid dynamics – II. Solutions to parabolic, hyperbolic and elliptic partial differential equations," Comp. Math. Appl., 19, 147–161, 1990.
[22] M.A. Golberg and C.S. Chen, "A bibliography on radial basis function approximation," Boundary Elements Comm., 7, 155–163, 1996.
[23] H. Wendland, "Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree," Adv. Comput. Math., 4, 389–396, 1995.
[24] S. De and K.J. Bathe, "The method of finite spheres," Comput. Mech., 25, 329–345, 2000.
[25] S. De and K.J. Bathe, "Towards an efficient meshless computational technique: the method of finite spheres," Eng. Comput., 18, 170–192, 2001.
[26] X. Jin, G. Li, and N.R. Aluru, "On the equivalence between least-squares and kernel approximations in meshless methods," CMES: Comput. Model. Eng. Sci., 2(4), 447–462, 2001.
[27] J.H. Kane, Boundary Element Analysis in Engineering Continuum Mechanics, Prentice-Hall, 1994.
[28] L. Greengard and V. Rokhlin, "A fast algorithm for particle simulations," J. Comput. Phys., 73(2), 325–348, 1987.
[29] J.R. Phillips and J.K. White, "A precorrected-FFT method for electrostatic analysis of complicated 3-D structures," IEEE Trans. Comput.-Aided Des. Integrated Circuits Syst., 16(10), 1059–1072, 1997.
[30] S. Kapur and D.E. Long, "IES³: a fast integral equation solver for efficient 3-dimensional extraction," IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, 448–455, 1997.
[31] V. Shrivastava and N.R. Aluru, "A fast boundary cloud method for exterior 2-D electrostatics," Int. J. Numer. Methods Eng., 56(2), 239–260, 2003.
[32] Y.X. Mukherjee and S. Mukherjee, "The boundary node method for potential problems," Int. J. Numer. Methods Eng., 40, 797–815, 1997.
[33] M.K. Chati and S. Mukherjee, "The boundary node method for three-dimensional problems in potential theory," Int. J. Numer. Methods Eng., 47, 1523–1547, 2000.
[34] J. Zhang, Z. Yao, and H. Li, "A hybrid boundary node method," Int. J. Numer. Methods Eng., 53(4), 751–763, 2002.
[35] W. Chen, "Symmetric boundary knot method," Eng. Anal. Boundary Elements, 26(6), 489–494, 2002.
[36] G. Li and N.R. Aluru, "Boundary cloud method: a combined scattered point/boundary integral approach for boundary-only analysis," Comput. Methods Appl. Mech. Eng., 191(21–22), 2337–2370, 2002.
[37] G. Li and N.R. Aluru, "A boundary cloud method with a cloud-by-cloud polynomial basis," Eng. Anal. Boundary Elements, 27(1), 57–71, 2003.
[38] G.E. Forsythe and W.R. Wasow, Finite Difference Methods for Partial Differential Equations, Wiley, 1960.
[39] T.J.R. Hughes, The Finite Element Method, Prentice-Hall, 1987.
[40] C.A. Brebbia and J. Dominguez, Boundary Elements: An Introductory Course, McGraw-Hill, 1989.
[41] K. Salkauskas and P. Lancaster, Curve and Surface Fitting, Elsevier, 1986.
[42] X. Jin, G. Li, and N.R. Aluru, "Positivity conditions in meshless collocation methods," Comput. Methods Appl. Mech. Eng., 193, 1171–1202, 2004.
[43] W.K. Liu, S. Li, and T. Belytschko, "Moving least-square reproducing kernel methods (I): methodology and convergence," Comput. Methods Appl. Mech. Eng., 143, 113–154, 1997.
[44] P.S. Jensen, "Finite difference techniques for variable grids," Comput. Struct., 2, 17–29, 1972.
[45] G.E. Fasshauer, "Solving differential equations with radial basis functions: multilevel methods and smoothing," Adv. Comput. Math., 11, 139–159, 1999.
[46] M. Zerroukat, H. Power, and C.S. Chen, "A numerical method for heat transfer problems using collocation and radial basis functions," Int. J. Numer. Methods Eng., 42, 1263–1278, 1998.
[47] M.T. Heath, Scientific Computing: An Introductory Survey, McGraw-Hill, 1997.
[48] Y.Y. Lu, T. Belytschko, and L. Gu, "A new implementation of the element free Galerkin method," Comput. Methods Appl. Mech. Eng., 113, 397–414, 1994.
[49] G. Li, G.H. Paulino, and N.R. Aluru, "Coupling of the meshfree finite cloud method with the boundary element method: a collocation approach," Comput. Methods Appl. Mech. Eng., 192(20–21), 2355–2375, 2003.
8.4 LATTICE BOLTZMANN METHODS FOR MULTISCALE FLUID PROBLEMS
Sauro Succi¹, Weinan E², and Efthimios Kaxiras³
¹ Istituto Applicazioni Calcolo, National Research Council, viale del Policlinico, 137, 00161, Rome, Italy
² Department of Mathematics, Princeton University, Princeton, NJ 08544-1000, USA
³ Department of Physics, Harvard University, Cambridge, MA 02138, USA
1. Introduction
Complex interdisciplinary phenomena, such as drug design, crack propagation, heterogeneous catalysis, turbulent combustion and many others, raise a growing demand for simulation methods capable of handling the simultaneous interaction of multiple space and time scales. Computational schemes aimed at this type of complex application often involve multiple levels of physical and mathematical description, and are consequently referred to as multiphysics methods [1–3]. The opportunity for multiphysics methods arises whenever single-level methods, say molecular dynamics or the partial differential equations of continuum mechanics, expand their range of scales to the point where overlap becomes possible. In order to realize this multiphysics potential, specific efforts must be directed towards the development of robust and efficient interfaces dealing with the "hand-shaking" regions where the exchange of information between the different schemes takes place. Two-level schemes combining atomistic and continuum methods for crack propagation in solids or strong shock fronts in rarefied gases made their appearance in the early 90s. More recently, three-level schemes for crack dynamics have been demonstrated, combining a finite-element treatment of continuum mechanics far away from the crack with a molecular dynamics treatment of atomic motion in the near-crack region and a quantum mechanical description of bond-snapping at the crack tip. These methods represent concrete instances of composite algorithms which put in place seamless interfaces between the different mathematical models associated with different physical levels of description, say continuum and atomistic. An alternative approach is to explore methods
that can host multiple levels of description, say atomistic, kinetic, and fluid, within the same mathematical framework. A potential candidate is the lattice Boltzmann equation (LBE) method. The LBE is a minimal form of the Boltzmann kinetic equation in which all details of molecular motion are removed except those that are strictly needed to recover hydrodynamic behavior at the macroscopic scale (mass, momentum and energy conservation) [4, 5]. The result is an elegant and simple equation for the discrete distribution function f_i(x, t) describing the probability of finding a particle at lattice site x at time t with speed c_i. The LBE has the potential to combine the power of continuum methods with the geometrical flexibility of atomistic methods. However, as multidisciplinary problems of increasing complexity are tackled, it is evident that significant upgrades are called for, both in terms of extending the range of scales accessible to the LBE itself and in terms of coupling the LBE downwards/upwards with micro/macroscopic methods. In the sequel, we shall offer a cursory view of both of these research directions. Before proceeding further, a short review of the basic ideas behind LBE theory is in order.
2. Lattice Boltzmann Scheme: Basic Theory
The lattice Boltzmann equation is based on the idea of moving pseudoparticles along prescribed directions on a discrete lattice (the discrete particle speeds define the lattice connectivity). At each lattice site, these pseudoparticles undergo collisional events designed in such a way as to obey the basic mass, momentum and energy conservation principles which lie at the heart of fluid behavior. Historically, LBE was generated in response to the major problems of its ancestor, the lattice gas cellular automaton, namely statistical noise, high viscosity, and the exponential complexity of the collision operator with increasing number of speeds [6, 7]. A few years later, its mathematical connections with model kinetic equations of continuum theory were also clarified [8]. The most popular, although not necessarily the most efficient, form of the lattice Boltzmann equation (lattice BGK, for Bhatnagar, Gross, Krook) reads as follows [9]
$$f_i(\mathbf{x} + \mathbf{c}_i \Delta t,\; t + \Delta t) - f_i(\mathbf{x}, t) = -\omega \Delta t \left[ f_i - f_i^e \right](\mathbf{x}, t) + F_i \Delta t, \tag{1}$$
where $f_i(\mathbf{x}, t) = f(\mathbf{x}, \mathbf{v} = \mathbf{c}_i, t)$, $i = 1, \ldots, b$, is the discrete one-body distribution function moving along the lattice direction defined by the discrete speed $\mathbf{c}_i$. On the left hand side, we recognize the streaming operator of the Boltzmann equation, $\partial_t f + \mathbf{v} \cdot \nabla f$, advanced in discrete time from $t$ to $t + \Delta t$ along the characteristics $\Delta \mathbf{x}_i = \mathbf{c}_i \Delta t$. The right hand side represents the collision operator in the form of a single-time relaxation to the local equilibrium $f_i^e$. Finally, the effect of an external force, $F_i$, is also included. In order to recover fluid-dynamic
behavior, the set of discrete speeds must guarantee the basic symmetries of the fluid equations, namely mass, momentum and energy conservation, as well as rotational invariance. Only a limited subclass of lattices qualifies. A popular choice in three-dimensional space is the nineteen-speed lattice, consisting of one speed-zero ($c = 0$) particle sitting at the center of the cell, six speed-one ($c = 1$) particles connecting to the face centers of the cell, and twelve particles with speed $c = \sqrt{2}$ connecting the center of the cell with the edge centers. The local equilibrium is usually taken in the form of a quadratic expansion of a Maxwellian
$$f_i^e = \rho \omega_i \left[ 1 + \frac{\mathbf{u} \cdot \mathbf{c}_i}{c_s^2} + \frac{\mathbf{u}\mathbf{u} : \left( \mathbf{c}_i \mathbf{c}_i - c_s^2 I \right)}{2 c_s^4} \right], \tag{2}$$
where $\rho = \sum_i f_i$ is the fluid density and $\mathbf{u} = \sum_i f_i \mathbf{c}_i / \rho$ the flow speed. Here $c_s$ is the lattice sound speed, defined by the condition $c_s^2 I = \sum_i \omega_i \mathbf{c}_i \mathbf{c}_i$, where $I$ denotes the unit tensor. Finally, $\omega_i$ is a set of lattice-dependent weights normalized to unity. For athermal flows, the lattice sound speed is a constant of order one ($c_s^2 = 1/3$ for the 19-speed lattice of Fig. 1). Local equilibria obey the following conservation relations (mass made unity for convenience):
$$\sum_i f_i^e = \rho, \tag{3}$$
$$\sum_i f_i^e \mathbf{c}_i = \rho \mathbf{u}, \tag{4}$$
$$\sum_i f_i^e \mathbf{c}_i \mathbf{c}_i = \rho \left( \mathbf{u}\mathbf{u} + c_s^2 I \right). \tag{5}$$

Figure 1. The D3Q19 lattice.
Using linear transport theory, in the limit of long wavelengths as compared to the particle mean free path (small Knudsen number) and low fluid speed as compared to the sound speed (low Mach number), the fluid density and speed are shown to obey the Navier–Stokes equations for a quasi-incompressible fluid (with no external force for simplicity):
$$\partial_t \rho + \mathrm{div}\,\rho \mathbf{u} = 0, \tag{6}$$
$$\partial_t \rho \mathbf{u} + \mathrm{div}\,\rho \mathbf{u}\mathbf{u} = -\nabla P + \mathrm{div}\left[ \mu \left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right) + \lambda\, \mathrm{div}\,\mathbf{u}\, I \right], \tag{7}$$
where $P = \rho c_s^2$ is the fluid pressure, $\mu = \rho \nu$ is the dynamic viscosity, and $\lambda$ is the bulk viscosity (this latter term can be neglected for all practical purposes since we deal with quasi-incompressible fluids). Note that, according to the above relation, the LBE fluid obeys an ideal equation of state, as befits a system of molecules with no potential energy. Potential energy effects can be introduced via a self-consistent force $F_i$, but in this work we shall not deal with such non-ideal gas aspects. The kinematic viscosity of the LBE fluid turns out to be:
$$\nu = c_s^2 \left( \tau - \frac{\Delta t}{2} \right) \frac{\Delta x^2}{\Delta t^2}. \tag{8}$$
The term $\tau \equiv 1/\omega$ is the relaxation time around local equilibria, while the factor $-\Delta t/2$ is a genuine lattice effect which stems from the second-order spatial derivatives in the Taylor expansion of the discrete streaming operator. It is fortunate that such a purely numerical effect can be reabsorbed into a physical (negative) viscosity. In particular, by choosing $\omega \Delta t = 2 - \epsilon$, very small viscosities of order $O(\epsilon)$ (in lattice units) can be achieved, corresponding to the very challenging regime of fluid turbulence [10]. The main assets of LBE are:
• mathematical simplicity;
• physical flexibility;
• easy implementation of complex boundary conditions;
• excellent amenability to parallel processing.
Mathematical simplicity is related to the fact that, at variance with the Navier–Stokes equations, in which non-linearity and non-locality are lumped into a single term, $\mathbf{u} \cdot \nabla \mathbf{u}$, in LBE the non-local term (streaming) is linear and the non-linear term (the local equilibrium) is local. This disentangling proves beneficial from both the analytical and computational points of view. Physical flexibility relates to the opportunity of accommodating additional physics via generalizations of the local equilibria and/or the external source $F_i$, such as to include the effects of additional fields interacting with the fluid.
Easy implementation of complex boundary conditions results from the fact that the most common hydrodynamic boundary conditions, such as prescribed speed at solid boundaries or prescribed pressure at fluid outlets, can be imposed in terms of elementary mechanical operations on the discrete distributions. However, in the presence of curved boundaries, i.e., boundaries which do not fit onto the lattice sites, the boundary procedure may become considerably more involved. This represents one of the most active research topics in the field. It must be pointed out that, in addition to the fluid density and pressure, LBE also carries along the momentum flux tensor, whose equilibrium part corresponds to the fluid pressure. As a result, LBE does not need to solve a Poisson problem to compute the pressure distribution corresponding to a given flow configuration. This is a significant advantage as compared to explicit finite-difference schemes for incompressible flows. The price to pay is an extra amount of information as compared to a hydrodynamic approach. For instance, in two dimensions, the most popular LBE requires nine populations (one rest particle, four nearest neighbors and four next-to-nearest neighbors), to be contrasted with only three hydrodynamic fields (density, two velocity components). On the other hand, since LBE populations always stream “upwind” (from $\mathbf{x}$ to $\mathbf{x} + \mathbf{c}_i \Delta t$), only one time level needs to be stored, which saves a factor of two over hydrodynamic representations. As for efficiency on parallel computers, the key is again the locality of the collision operator, which can be advanced concurrently at each lattice site independently of all others. Owing to these features, LBE has been used for more than 10 years for the simulation of a large variety of flows, including flows in porous media, turbulence, and complex flows with phase transitions, to name but a few. Multiscale applications, on the other hand, have appeared only recently, as we shall discuss in the sequel.
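To make the stream-and-collide structure described above concrete, the following minimal sketch (ours, not from the original text) advances a lattice BGK fluid on the popular two-dimensional nine-speed (D2Q9) lattice, in lattice units $\Delta x = \Delta t = 1$, with periodic boundaries and no external force; all variable names are our own:

```python
import numpy as np

# D2Q9 lattice: discrete speeds c_i and weights w_i; lattice sound speed c_s^2 = 1/3
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)
cs2 = 1.0 / 3.0

def equilibrium(rho, ux, uy):
    """Quadratic local equilibrium of Eq. (2)."""
    feq = np.empty((9,) + rho.shape)
    usq = ux**2 + uy**2
    for i in range(9):
        cu = c[i, 0] * ux + c[i, 1] * uy
        feq[i] = rho * w[i] * (1.0 + cu / cs2 + cu**2 / (2 * cs2**2) - usq / (2 * cs2))
    return feq

def lbgk_step(f, omega):
    """One update of Eq. (1) with F_i = 0: BGK collision, then streaming."""
    rho = f.sum(axis=0)
    ux = np.tensordot(c[:, 0], f, axes=(0, 0)) / rho
    uy = np.tensordot(c[:, 1], f, axes=(0, 0)) / rho
    f -= omega * (f - equilibrium(rho, ux, uy))           # collide
    for i in range(9):                                    # stream (periodic)
        f[i] = np.roll(f[i], shift=(c[i, 0], c[i, 1]), axis=(0, 1))
    return f
```

The viscosity of the simulated fluid then follows from Eq. (8), which in these lattice units reduces to $\nu = c_s^2(\tau - 1/2)$ with $\tau = 1/\omega$.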
3. Multiscale Lattice Boltzmann
Multiscale versions of LBE were first proposed by Filippova and Hänel [11] in the form of an LBE working on locally embedded grids, namely regular grids in which the lattice spacing is locally refined or coarsened, typically in steps of two for practical purposes. The same option had been available even earlier in commercial versions of LB methods [12]. In the sequel, we shall briefly outline the main elements of multiscale LBE theory on locally embedded Cartesian grids.
3.1. Basics of the Multiscale LB Method
The starting point of multiscale LBE theory is the lattice BGK equation (1). Grid refinement is performed by introducing an $n$-times finer grid with spacing
$$\delta x = \frac{\Delta x}{n}, \qquad \delta t = \frac{\Delta t}{n}.$$
The kinematic viscosity on the coarse lattice is given by Eq. (8) from which we see that in order to achieve the same viscosity on both coarse and fine grids, the relaxation parameter in the fine grid has to be rescaled as follows
$$\tau_n = n \tau_1 \left( 1 - \frac{n-1}{n}\, \frac{\Delta t / 2}{\tau_1} \right), \tag{9}$$
where $\tau_n$ and $\tau_1 \equiv \tau$ are the relaxation parameters on the $n$-times refined and on the original coarse grids, respectively ($n = 2^l$ after $l$ levels of grid refinement). Next, we need to set up the interface conditions controlling the exchange of information between the coarse and fine grids. The guiding requirement is the continuity of the hydrodynamic quantities (density, flow speed) and of their fluxes. Since the hydrodynamic quantities are microscopically conserved, the corresponding interface conditions simply consist in setting the local equilibria in the fine grid equal to those in the coarse one. The fluxes, however, do not correspond to any microscopic invariant, and consequently their continuity implies requirements on the non-equilibrium component of the discrete distribution function. Therefore, the first step of the interface procedure consists in splitting the discrete distribution function into equilibrium and non-equilibrium components:
$$f_i = f_i^e + f_i^{ne}. \tag{10}$$
Upon expanding the left hand side of the LBE equation (1) to first order in $\Delta t$, the non-equilibrium component reads
$$f_i^{ne} = -\tau \left[ \partial_t + c_{ia} \partial_a \right] f_i^e + O(Kn^2), \tag{11}$$
where the latin index $a$ runs over spatial dimensions and repeated indices are summed upon. This is second-order accurate in the Knudsen number $Kn = \Delta x / L$, where $L$ is a typical macroscopic scale of the flow. In the low-frequency limit $\Delta t / \tau \sim Kn^2$, the time derivative can be neglected, and by combining the above relation with the continuity of the hydrodynamic variables at the interface between the two grids, one obtains the following scaling relations between the coarse and fine grid populations
$$f_i = \tilde{F}_i^e + \left( \tilde{F}_i' - \tilde{F}_i^e \right) \Lambda^{-1}, \tag{12}$$
$$F_i = f_i^e + \left( f_i' - f_i^e \right) \Lambda, \tag{13}$$
where capital letters denote coarse-grid quantities, primes denote post-collision values, and tildes stand for interpolation from the coarse grid. In the above,
$$\Lambda = n\, \frac{\tau_1 - \Delta t}{\tau_n - \Delta t}.$$
The basic one-step algorithm reads as follows:
1. Advance (stream and collide) $F$ on the coarse-grain grid.
2. For all subcycles $k = 0, 1, \ldots, n - 1$ do:
   a. Interpolate $F$ on the coarse-to-fine grid interface.
   b. Scale $F$ to $f$ via (12) on the coarse-to-fine grid interface.
   c. Advance (stream and collide) $f$ on the fine-grain grid.
3. Scale back $f$ to $F$ via (13) on the fine-to-coarse grid interface.
Step 1 applies to all nodes in the coarse grid, bulk and interface; Steps 2a and 2b apply to interface nodes which belong only to the fine grid; Step 2c applies to bulk nodes of the fine grid; and Step 3 applies to interface nodes which belong to both the coarse and fine grids. It is noted that $\Lambda$ becomes singular at $\tau_n = \Delta t$, corresponding to $n = (\Delta t/2)/(\tau_1 - \Delta t/2) = c_s^2 \Delta t / 2\nu$ (see Eq. (8)). For high-Reynolds applications, in which $\nu$ is of the order of $10^{-3}$ or less (in units of the original lattice), the above singularity is of no practical concern, for it would be met only after a hundred levels of refinement. For low-Reynolds flow applications, however, this flaw needs to be cured. To this purpose, a more general approach that avoids the singularity has recently been developed by Dupuis [13]. This work shows that, by defining the scale transformations between the coarse- and fine-grain populations before they collide, the singularity disappears ($\Lambda = n \tau_1 / \tau_n$). In practice, this means that, at variance with Filippova's model, the collision operator is also applied to the interface nodes which belong to the fine grid only.
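As a minimal illustration of Eqs. (9), (12) and (13), the helper functions below (our own sketch, with $\Delta t = 1$ on the coarse grid by default; the function names are hypothetical) compute the rescaled relaxation time, the scale factor $\Lambda$, and the two population transformations applied at the grid interface:

```python
def tau_fine(tau1, n, dt=1.0):
    """Fine-grid relaxation time, Eq. (9)."""
    return n * tau1 * (1.0 - (n - 1) / n * (dt / 2.0) / tau1)

def scale_factor(tau1, n, dt=1.0):
    """Lambda = n (tau_1 - dt) / (tau_n - dt); singular at tau_n = dt."""
    return n * (tau1 - dt) / (tau_fine(tau1, n, dt) - dt)

def coarse_to_fine(F_eq, F_post, lam):
    """Eq. (12): fine-grid populations from interpolated coarse-grid ones
    (F_eq: interpolated equilibrium; F_post: interpolated post-collision)."""
    return F_eq + (F_post - F_eq) / lam

def fine_to_coarse(f_eq, f_post, lam):
    """Eq. (13): coarse-grid populations from fine-grid post-collision ones."""
    return f_eq + (f_post - f_eq) * lam
```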
4. Multiscale LBE Applications
To date, multiscale LBEs have been applied mainly to macroscopic turbulent flows [14, 15]. Here, however, we focus our attention on microscale problems of more direct relevance to materials science applications.
4.1. Microscale Flows with Chemical Reactions
The LBE couples easily to finite difference/volume methods for continuum partial differential equations. Distinctive features of LBE in this context
are: (1) the use of very small time-steps, and (2) geometrical flexibility. Item (1) refers to the fact that, since LBE is an explicit method ticking at the particle speed, not the fluid one, it advances in much smaller time-steps than usual fluid-dynamics methods, typically by a factor of ten. (The flip side is that a large number of time-steps is required in long-time evolutions.) As an example, take a millimetric flow with, say, 100 grid points per side, yielding a mesh spacing dx = 10 µm. Assuming a sound speed of the order of 300 m/s, we obtain a time-step of the order of dt = 30 ns. Such a small time-step permits one to handle relatively fast reactions without resorting to implicit time stepping, thus avoiding the solution of large systems of algebraic equations. Item (2) is especially suited to heterogeneous catalysis, since the simplicity of particle trajectories permits the description of fairly irregular geometries and boundary conditions. Because of these two points, LBE is currently being used to simulate reactive flows over microscopically corrugated surfaces, an application of great interest for the design of chemical traps, catalytic converters and related devices [16, 17] (Fig. 2). These problems are genuinely multiphysics, since they involve a series of hydrodynamic and chemical time-scales. The major control parameters are the Reynolds number $Re = Ud/\nu$, the Peclet number $Pe = Ud/D$, and the Damkohler number $Da = d^2 / D\tau_c$. In the above, $U$ and $d$ are typical flow speed and size, $D$ is the mass diffusivity of the chemical species, and $\tau_c$ is a typical chemical reaction time-scale. Depending on the various physical and geometrical parameters, a wide separation of these time-scales can arise. In general, the LBE time-step is sufficiently small to resolve all the relevant time-scales.
Figure 2. A multiscale computation of a flow in a microscopic restriction of a catalytic converter. Local flow gradients may lead to significant enhancements of the fluid-wall mass transfer, with corresponding effects on the chemical reactivity of the device. Note that three levels of refinement are used.
Whenever faster time-scales develop, e.g., fast chemical reactions, the chemical processes are sub-cycled, i.e., advanced in multiple steps, each resolving the smallest time-scale, until completion of a single LBE step [18].
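The back-of-the-envelope estimates of this subsection are easy to reproduce; the following sketch (ours, with the sound speed and geometry quoted in the text and function names of our own choosing) evaluates the time-step and the three control parameters:

```python
def lbe_timestep(dx, c_sound=300.0):
    """Explicit LBE ticks at the (sound) particle speed: dt ~ dx / c_s."""
    return dx / c_sound

def control_parameters(U, d, nu, D, tau_c):
    """Reynolds, Peclet and Damkohler numbers as defined in the text."""
    return {"Re": U * d / nu, "Pe": U * d / D, "Da": d**2 / (D * tau_c)}

# The text's example: a millimetric flow, 100 points per side -> dx = 10 um,
# giving dt of the order of 30 ns.
print(lbe_timestep(10e-6))  # ~3.3e-8 s
```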
4.2. Nanoscale Flows
When the size of the micro/nanoscopic flow becomes comparable to the molecular mean free path, the Knudsen number is no longer small, and the whole fluid picture becomes questionable. A fundamental question then arises as to whether LBE can be more than a “Navier–Stokes solver in disguise”, namely capture genuinely kinetic information not available at the fluid-dynamic level. Mathematically, this possibility stems from the fact that, as already observed, the discrete populations $f_i$ consistently outnumber the set of hydrodynamic observables, so that the excess variables are potentially available to carry non-hydrodynamic information. This would represent a very significant advance, for it would show that LBE can be used as a tool for computational kinetic theory, beyond fluid dynamics. Indeed, a few numerical simulations of LBE microflows in micro-electro-mechanical systems (MEMS) seem to indicate that standard LBE can capture some genuinely kinetic features of rarefied gas dynamics, such as slip motion at solid walls [19]. LBE schemes for nanoflow applications will certainly require new types of boundary conditions. A simple way to accommodate slip motion within LBE is to allow a fraction of the LBE particles to be elastically reflected at the wall. A typical slip boundary condition for, say, southeast-propagating molecules entering the fluid domain from the north wall, $y = d$, would read as follows (lattice spacing made unity for simplicity):
$$f_{se}(x, d) = (1 - r)\, f_{ne}(x - 1, d - 1) + r\, f_{nw}(x + 1, d - 1).$$
Here $r$ is a bounce-back coefficient in the range $0 < r < 1$, and the subscripts $se$, $ne$, and $nw$ stand for south-east, north-east, and north-west propagation, respectively [20]. It is easily seen that the special case $r = 1$ corresponds to a complete bounce-back along the incoming direction, a simple option to implement zero fluid speed at the wall. More general conditions, borrowed from the “diffusive” boundary conditions used in rarefied gas dynamics for the solution of the “true” Boltzmann equation, have also been developed [21]. Much remains to be done to show that existing LBE models, extended with appropriate boundary conditions, can handle non-hydrodynamic flow regimes. This is especially true if thermal effects must be taken into account, as is often the case in nanoflow applications.
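In code, the slip rule above amounts to mixing specular reflection and bounce-back at the wall row. The sketch below (ours; it assumes periodicity in $x$ purely for compactness) applies the quoted condition to one-dimensional arrays of populations along the row $y = d - 1$:

```python
import numpy as np

def slip_reflection(f_ne, f_nw, r):
    """f_se(x, d) = (1 - r) f_ne(x - 1, d - 1) + r f_nw(x + 1, d - 1).
    f_ne, f_nw: populations along the row y = d - 1, indexed by x;
    r = 1 recovers full bounce-back (zero fluid speed at the wall)."""
    return (1.0 - r) * np.roll(f_ne, 1) + r * np.roll(f_nw, -1)
```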
Even if the use of LBE stand-alone turned out to be unviable, one could still think of coupling LBE with truly microscopic methods, such as direct simulation or kinetic Monte Carlo [22, 23]. A potential advantage of coupling LBE, instead of Navier–Stokes solvers, to atomistic or kinetic Monte Carlo descriptions of atomistic flows is that the shear tensor
$$S_{ab} = \frac{\nu}{2} \left( \partial_a u_b + \partial_b u_a \right) \tag{14}$$
can be computed locally as
$$S_{ab} = \frac{1}{2\mu} \sum_i \left( f_i - f_i^e \right) \left( c_{ia} c_{ib} - c_s^2 \delta_{ab} \right), \tag{15}$$
with no need to take spatial derivatives (a delicate, and often error-prone, task at solid interfaces). Moreover, while the expression (14) is only valid in the limit of small Knudsen number, no such restriction applies to the kinetic expression (15). Both aspects could significantly enhance the scope of sampling procedures converting fluid-kinetic information (the discrete populations) into atomistic information (the particle coordinates and momenta), and vice versa, at fluid–solid interfaces [24]. This type of coupling procedure represents one of the most exciting frontiers for multiscale LBE applications at the interface between fluid dynamics and materials science [25].
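A sketch of the kinetic evaluation (15) is given below (our own code and names); it is purely local, needing only the populations and their equilibria at each site:

```python
import numpy as np

def shear_tensor_local(f, feq, c, cs2, mu):
    """Eq. (15): S_ab = (1/2mu) sum_i (f_i - f_i^e)(c_ia c_ib - cs2 delta_ab).
    f, feq: arrays of shape (b, ...) over the b discrete speeds;
    c: (b, D) array of discrete speeds. No spatial derivatives are taken."""
    fneq = f - feq
    D = c.shape[1]
    S = np.zeros((D, D) + f.shape[1:])
    for a in range(D):
        for b_ in range(D):
            q = c[:, a] * c[:, b_] - (cs2 if a == b_ else 0.0)
            S[a, b_] = np.tensordot(q, fneq, axes=(0, 0)) / (2.0 * mu)
    return S
```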
5. Future Prospects
LBE has already given proof of significant versatility in addressing a wide range of problems involving complex fluid motion at disparate scales. Much remains to be done to further boost the power of the LB method towards multiphysics applications of increasing complexity. Important topics for future research are:
• robust interface conditions for strongly non-equilibrium flows;
• locally adaptive LBEs on unstructured, possibly moving, grids;
• acceleration strategies for long-time and steady-state calculations.
Finally, the development of a solid mathematical framework identifying the general conditions for the validity (what can go wrong and why!) of multiscale LBE techniques is also in great demand [26]. There are good reasons to believe that further upgrades of the LBE technique, as indicated above, hopefully stimulated by enhanced communication with allied sectors of computational physics, will make multiphysics LBE applications flourish in the near future.
References
[1] M. Seel, “Modelling of solid rocket fuel: from quantum chemistry to fluid dynamic simulations,” Comput. Phys., 5, 460–469, 1991.
[2] W. Hoover, A.J. de Groot, and C. Hoover, “Massively parallel computer simulation of plane-strain elastic–plastic flow via non-equilibrium molecular dynamics and Lagrangian continuum mechanics,” Comput. Phys., 6(2), 155–162, 1992.
[3] F.F. Abraham, J. Broughton, N. Bernstein, and E. Kaxiras, “Spanning the length scales in dynamic simulation,” Comput. Phys., 12(6), 538–546, 1998.
[4] R. Benzi, S. Succi, and M. Vergassola, “The lattice Boltzmann equation: theory and applications,” Phys. Rep., 222, 145–197, 1992.
[5] S. Succi, “The lattice Boltzmann equation for fluid dynamics and beyond,” Oxford University Press, Oxford, 2001.
[6] G. McNamara and G. Zanetti, “Use of the Boltzmann equation to simulate lattice gas automata,” Phys. Rev. Lett., 61, 2332–2335, 1988.
[7] F. Higuera, S. Succi, and R. Benzi, “Lattice gas dynamics with enhanced collisions,” Europhys. Lett., 9, 345–349, 1989.
[8] X. He and L.S. Luo, “A priori derivation of the lattice Boltzmann equation,” Phys. Rev. E, 55, R6333–R6336, 1997.
[9] Y.H. Qian, D. d’Humieres, and P. Lallemand, “Lattice BGK models for the Navier–Stokes equation,” Europhys. Lett., 17, 479–484, 1992.
[10] S. Succi, I.V. Karlin, and H. Chen, “Role of the H theorem in lattice Boltzmann hydrodynamic simulations,” Rev. Mod. Phys., 74, 1203–1220, 2002.
[11] O. Filippova and D. Hänel, “Grid-refinement for lattice BGK models,” J. Comput. Phys., 147, 219–228, 1998.
[12] H. Chen, C. Teixeira, and K. Molvig, “Realization of fluid boundary conditions via discrete Boltzmann dynamics,” Int. J. Mod. Phys. C, 9, 1281–1292, 1998.
[13] A. Dupuis, “From a lattice Boltzmann model to a parallel and reusable implementation of a virtual river,” PhD Thesis n. 3356, University of Geneva, 2002.
[14] O. Filippova, S. Succi, F.D. Mazzocco, C. Arrighetti, G. Bella, and D. Hänel, “Multiscale lattice Boltzmann schemes with turbulence modeling,” J. Comput. Phys., 170, 812–829, 2001.
[15] S. Chen, S. Kandasamy, S. Orszag, R. Shock, S. Succi, and V. Yakhot, “Extended Boltzmann kinetic equation for turbulent flows,” Science, 301, 633–636, 2003.
[16] A. Gabrielli, S. Succi, and E. Kaxiras, “A lattice Boltzmann study of reactive microflows,” Comput. Phys. Commun., 147, 516–521, 2002.
[17] S. Succi, G. Smith, O. Filippova, and E. Kaxiras, “Applying the lattice Boltzmann equation to multiscale fluid problems,” Comput. Sci. Eng., 3(6), 26–37, 2001.
[18] M. Adamo, M. Bernaschi, and S. Succi, “Multi-representation techniques for multiscale simulation: reactive microflows in a catalytic converter,” Mol. Simul., 25(1–2), 13–26, 2000.
[19] X.B. Nie, S. Chen, and G. Doolen, “Lattice Boltzmann simulations of fluid flows in MEMS,” J. Stat. Phys., 107, 279–289, 2002.
[20] S. Succi, “Mesoscopic modeling of slip motion at fluid–solid interfaces with heterogeneous catalysis,” Phys. Rev. Lett., 89(6), 064502, 2002.
[21] S. Ansumali and I.V. Karlin, “Kinetic boundary conditions in the lattice Boltzmann method,” Phys. Rev. E, 66, 026311–17, 2002.
[22] M. Silverberg, A. Ben-Shaul, and F. Rebentrost, “On the effects of adsorbate aggregation on the kinetics of surface-reactions,” J. Chem. Phys., 83, 6501–6513, 1985.
[23] T.P. Schulze, P. Smereka, and Weinan E, “Coupling kinetic Monte Carlo and continuum models with application to epitaxial growth,” J. Comput. Phys., 189, 197–211, 2003. [24] W. Cai, M. de Koning, V.V. Bulatov, and S. Yip, “Minimizing boundary reflections in coupled-domain simulations,” Phys. Rev. Lett., 85, 3213–3216, 2000. [25] D. Raabe, “Overview of the lattice Boltzmann method for nano and microscale fluid dynamics in material science and engineering,” Model. Simul. Mat. Sci. Eng., 12(6), R13–R14, 2004. [26] W. E, B. Engquist, Z.Y. Huang, “Heterogeneous multiscale method: a general methodology for multiscale modeling,” Phys. Rev. B, 67(9), 092101, 2003.
8.5 DISCRETE SIMULATION AUTOMATA: MESOSCOPIC FLUID MODELS ENDOWED WITH THERMAL FLUCTUATIONS
Tomonori Sakai¹ and Peter V. Coveney²,*
¹Centre for Computational Science, Queen Mary, University of London, Mile End Road, London E1 4NS, UK; ²Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK
1. Introduction
Until recently, theoretical hydrodynamics has largely dealt with relatively simple fluids which admit or are assumed to have an explicit macroscopic description. It has been highly successful in describing the physics of such fluids by analyses based on the Navier-Stokes equations, the classical equations of fluid dynamics which describe the motion of fluids, and usually predicated on a continuum hypothesis, namely that matter is infinitely divisible [1]. On the other hand, many real fluids encountered in our daily lives, in industrial, biochemical, and other fields are complex fluids made of molecules whose individual structures are themselves complicated. Their behavior is characterized by the presence of several important length and time scales. It must surely be among the more important and exciting research topics of hydrodynamics in the 21st century to properly understand the physics of such complex fluids. Examples of complex fluids are widespread – surfactants, inks, paints, shampoos, milk, blood, liquid crystals, and so on. Typically, such fluids are comprised of molecules and/or supramolecular components which have a non-trivial internal structure. Such microscopic and/or mesoscopic structures lead to a rich variety of unique rheological characteristics which not only make the study of complex fluids interesting but in many cases also enhance our quality of life.
* Corresponding author: P.V. Coveney, Email address: P.V. [email protected]
In order to investigate and model the behavior of complex fluids, conventional continuum fluid methods based on the governing macroscopic fluid dynamical equations are somewhat inadequate. The continuous, uniform, and isotropic assumptions on which the macroscopic equations depend are not guaranteed to hold in such fluids, where complex and time-evolving mesoscopic structures, such as interfaces, are present. As noted above, complex fluids are ones in which several length and time scales may be of importance in governing the large-scale dynamical properties, but these micro- and mesoscales are completely omitted in macroscopic continuum fluid dynamics, where empirical constitutive relations are instead shoe-horned into the Navier–Stokes equations. On the other hand, fully atomistic approaches based on molecular dynamics [2], which are the exact antithesis of conventional continuum methods, are in most cases not viable due to their vast computational cost. Thus, simulations which provide us with physically meaningful hydrodynamic results are out of reach of present-day molecular dynamics and will not be accessible within the near future. Mesoscopic models are good candidates for mitigating problems with both conventional continuum methods and fully atomistic approaches. Spatially and temporally discrete lattice gas automata (LGA) [3] and lattice Boltzmann (LB) [4–7] methods have proven to be of considerable applicability to complex fluids, including multi-phase [8, 9] and amphiphilic [8, 9] fluids, solid–fluid suspensions [10], and the effect of convection–diffusion on growth processes [11]. These methods have also been successfully applied to flow in complex geometries, in particular to flow in porous media, an outstanding contemporary scientific challenge that plays an essential role in many technological, environmental, and biological fields [12–16]. Another important advantage of LGA and LB is that they are ideally suited for high-performance parallel computing, due to the inherent spatial locality of the updating rules in their dynamical time-stepping algorithms [17]. However, lattice-based models have certain well-known disadvantages associated with their spatially discrete nature [4, 7]. Here, we describe another mesoscopic model worthy of study. The method, which we call discrete simulation automata (DSA), is a spatially continuous but still temporally discrete version of the conventional spatio-temporally discrete lattice gas method, whose prototype was proposed by Malevanets and Kapral [18]. Since the particles now move in continuous space, DSA has the advantage of eliminating the spatial anisotropy that plagues conventional lattice gases, while also providing conservation of energy, which enables one to deal with thermohydrodynamic problems not easily accessible by conventional lattice methods. We have coined the name DSA by analogy with the direct simulation Monte Carlo (DSMC) method [24], to which it is closely related, as we discuss further in Section 2. Some authors have referred to this method as a “real-coded lattice gas” [19–22]. Others have used the terms “Malevanets–Kapral
method”, “stochastic rotation method”, or “multiple particle collision dynamics”. We have proposed the term DSA, which we hope will be widely adopted in order to avoid further confusion [23]. The remainder of our paper is structured as follows. Starting from a review of single-phase DSA in Section 2, Section 3 describes how DSA can deal with binary immiscible and amphiphilic fluids. Two of the latest developments of DSA, flow in porous media and a parallel implementation, are discussed in Section 4. Section 5 concludes our paper with a summary of the method.
2. The Basic DSA Model and its Physical Properties
DSA are based on a microscopic, bottom-up approach and are comprised of Cartesian cells between which point particles of a given mass move. For a single-component DSA fluid, the state variables evolve by a two-step dynamical process: particle propagation and multi-particle collision. Each particle changes its location in the propagation process,
$$\mathbf{r}' = \mathbf{r} + \mathbf{v}, \tag{1}$$
and its velocity in a collision process,
$$\mathbf{v}' = \mathbf{V} + \sigma (\mathbf{v} - \mathbf{V}), \tag{2}$$
where $\mathbf{V}$ is the mean velocity of all particles within the cell in which the collision occurs and $\sigma$ is a random rotation, the same for all particles in one cell but differing between cells. In these equations, primes denote post-collision values, and the masses of all particles are set to unity for convenience. This collision operation is equivalent to that in the direct simulation Monte Carlo (DSMC) method [24], except that the pairwise collisions of DSMC are replaced by multi-particle collisions. The loss of molecular detail is an unavoidable consequence of the DSA algorithms, as with other mesoscale modeling methods; however, these details are not required in order to describe the universal properties of fluid flow. Evidently, the use of multi-particle collisions allows DSA to deal readily with phenomena on mesoscopic and macroscopic scales which would be much more costly to handle using DSMC. Mass, momentum and energy are locally, and hence globally, conserved during the collision process. The velocity distribution of DSA particles corresponds to a Maxwellian when the system has relaxed to an equilibrium state [18]. We can thus define a parameter which may be regarded as a measure of the average kinetic energy of the particles; this is the temperature $T$. For example, $T = 1.0$ specifies a state in which each Cartesian velocity component of the particles is described by a Maxwell distribution whose variance is equal to one lattice unit (i.e., one DSA cell length).
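The two-step dynamics of Eqs. (1)–(2) is short to write down explicitly. The following two-dimensional sketch (ours; cell size one lattice unit, periodic box, and one random rotation angle per cell) performs one DSA update:

```python
import numpy as np

rng = np.random.default_rng(0)

def dsa_step(r, v, L):
    """One DSA update in 2-D: propagation r' = r + v, Eq. (1), then the
    multi-particle collision v' = V + sigma(v - V), Eq. (2), where sigma
    is a random rotation, the same within a cell. r, v: (N, 2) arrays;
    L: number of unit cells per side of the periodic box."""
    r = (r + v) % L                                       # propagate
    cell = r.astype(int)                                  # cell of each particle
    idx = cell[:, 0] * L + cell[:, 1]                     # flattened cell label
    for k in np.unique(idx):
        members = np.where(idx == k)[0]
        V = v[members].mean(axis=0)                       # mean cell velocity
        a = rng.uniform(0.0, 2.0 * np.pi)                 # random angle
        R = np.array([[np.cos(a), -np.sin(a)],
                      [np.sin(a),  np.cos(a)]])
        v[members] = V + (v[members] - V) @ R.T           # rotate relative part
    return r, v
```

Mass, momentum and energy are conserved cell by cell, since the rotation preserves the magnitudes of the relative velocities.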
The existence of an H-theorem has been established using a reduced one-particle distribution function [18]. By applying a Chapman–Enskog asymptotic expansion to the reduced distribution function, the Navier–Stokes equations can be derived, as in the case of LGA [3] and LB [5]. When $\sigma$ rotates $\mathbf{v} - \mathbf{V}$ (see Eq. (2)) by a random angle in each cell, the fluid viscosity in DSA is written as
$$\nu = \frac{1}{12} + T\, \frac{\rho + 1 - e^{-\rho}}{2\left( \rho - 1 + e^{-\rho} \right)}, \tag{3}$$
where $\rho$ is the number density of particles.
3. DSA Models of Interacting Particles

3.1. Binary Immiscible Fluids
DSA have been extended to model binary immiscible fluids by introducing the notion of “color”, in both two and three dimensions [20]. Individual particles are assigned color variables, e.g., red or blue, and “color charges” which act rather like electrostatic charges. This notion of “color” was first introduced by Rothman and Keller [8]. With the color charge Cn of the nth particle given by
$$C_n = \begin{cases} +1 & \text{red particle,} \\ -1 & \text{blue particle,} \end{cases} \tag{4}$$
there is an attractive force between particles of the same color and a repulsive force between particles of different colors. To quantify this interaction, we define the color flux vector
$$\mathbf{Q}(\mathbf{r}) = \sum_{n=1}^{N(\mathbf{r})} C_n \left( \mathbf{v}_n - \mathbf{V}(\mathbf{r}) \right), \tag{5}$$
where the sum is over all particles, and the color field vector
$$\mathbf{F}(\mathbf{r}) = \sum_i w_i \frac{\mathbf{R}_i}{|\mathbf{R}_i|} \sum_n^{N(\mathbf{r}_i)} C_n, \tag{6}$$
where the first and second sums are over all nearest-neighbor cells and all particles, respectively. $N(\mathbf{r})$ is the number of particles in the local cell, $\mathbf{v}_n$ the velocity of the $n$th particle, and $\mathbf{V}(\mathbf{r})$ the mean velocity of particles in a cell. The weighting factors are defined as $w_i = 1/|\mathbf{R}_i|$, where $\mathbf{R}_i = \mathbf{r} - \mathbf{r}_i$ and $\mathbf{r}_i$ is the location of the centre of the $i$th nearest-neighbor cell. The range of the index $i$ differs according to the definition of the neighbors. With two- and three-dimensional Moore neighbors, for example, $i$ would range from 0 to 7 and 0 to
26, respectively. One can model the phase separation kinetics of an immiscible binary fluid by choosing a rotation angle for each collision process such that the color flux vector points in the same direction as the color field vector after the collision. The model exhibits complete phase separation in both two [20, 22] and three [22] dimensions and has been verified by investigating domain growth laws and the resultant surface tension between two immiscible fluids [21], see Figs. 1–3. Although the precise location of the spinodal temperature has not thus far been investigated within DSA, we have confirmed that all binary immiscible fluid simulations presented in this review operate below it.
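A compact sketch of the two color vectors for a two-dimensional cell grid is given below (ours; we take each Moore neighbour's total charge to enter Eq. (6), with $\mathbf{R}_i$ pointing from the cell to its $i$th neighbour — both the per-neighbour charge sum and the sign convention are our reading and should be checked against [20]):

```python
import numpy as np

def color_flux(C, v, V_cell, idx, ncells):
    """Color flux, Eq. (5), accumulated per cell. C: (N,) charges;
    v: (N, 2) velocities; V_cell: (ncells, 2) mean cell velocities;
    idx: (N,) flattened cell label of each particle."""
    Q = np.zeros((ncells, 2))
    np.add.at(Q, idx, C[:, None] * (v - V_cell[idx]))
    return Q

def color_field(Ctot):
    """Color field, Eq. (6), on a periodic 2-D grid of total charges Ctot.
    Each of the eight Moore neighbours at offset R_i contributes its net
    charge along R_i / |R_i| with weight w_i = 1 / |R_i|."""
    F = np.zeros(Ctot.shape + (2,))
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            R2 = float(dx * dx + dy * dy)
            Cn = np.roll(Ctot, shift=(-dx, -dy), axis=(0, 1))  # neighbour charge
            F[..., 0] += dx / R2 * Cn
            F[..., 1] += dy / R2 * Cn
    return F
```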
Figure 1. Two-phase separation in a binary immiscible DSA simulation [22], shown at 0, 50, 500, and 1000 steps. Randomly distributed particles of two different colors (dark grey for water, light grey for oil) in the initial state segregate from each other, until two macroscopic domains are formed. The system size is 32 × 32 × 32, and the number density of both water and oil particles is 5.0.

Figure 2. Verification of Laplace’s law for two-dimensional DSA [21]. The pressure difference between the inside and outside of a droplet of radius R, ΔP = P_in − P_out, was measured against 1/R in a system of size 4R × 4R (R = 16, 32, 64, 128), averaged over 10 000 time-steps, for T = 0.5, 1.0, and 2.0. The error bars are smaller than the symbols. T is the “temperature”, which can be regarded as an indicator of the averaged kinetic energy of the particles and is defined by T = kT*/m (k is Boltzmann’s constant, T* the absolute temperature, and m the mass of the particles).
Figure 3. Temporal evolution of the characteristic wave number [25] in two-dimensional DSA simulations of binary phase separation, averaged over seven independent runs [21]. The domain growth is characterized by two distinct rates, namely a slow growth rate R ∼ t^(1/2) in the initial stage and a faster growth rate R ∼ t^(2/3) at later times (the fitted decay laws of the wave number are 3.6 t^(−0.5) and 7.8 t^(−0.67)).
3.2. Ternary Amphiphilic Fluids
A typical surfactant molecule has a hydrophilic head and a hydrophobic tail. Within DSA this structure is described by introducing a dumbbell-shaped particle, in both two [21] and three [22] dimensions. Figure 4 is a schematic description of the two-dimensional particle model. A and B correspond to the hydrophilic head and the hydrophobic tail, and G is the centre of mass of the surfactant particle. Color charges $C_{phi}$ and $C_{pho}$ are assigned to A and B, respectively. If we take the other DSA particles to be water particles whose color charges are
Figure 4. The schematic description of the two-dimensional surfactant model. A and B, with color charges $C_{phi}$ and $C_{pho}$, correspond to the hydrophilic head and the hydrophobic tail, at distances $l_{phi}$ and $l_{pho}$ from G, respectively; $\theta$ is the orientation angle with respect to the x axis, and $\mathbf{F}(\mathbf{r})$ denotes the local color field. The mass of the surfactant particle is assumed to be concentrated at G, the centre of mass of the dumbbell particle.
positive, $C_{phi}$ and $C_{pho}$ should be set as $C_{phi} > 0$ and $C_{pho} < 0$. The attractive interaction between A and water particles, the repulsive interaction between A and oil particles, and conversely for B, are described in a similar way to those in the binary immiscible DSA. For simplicity, the mass of the surfactant particle is assumed to be concentrated at the centre of mass. This assumption provides the model with great simplicity, especially in describing the rotational motion of surfactant particles, while adequately retaining the ability to reproduce the essential properties of surfactant solutions. Since there is no need to consider the rotational motion of the surfactant particle explicitly, its degrees of freedom are reduced to only three, that is, its location, orientation angle, and translational velocity. Calculations of the color flux $\mathbf{Q}(\mathbf{r})$ and the color field $\mathbf{F}(\mathbf{r})$ resemble those in the binary immiscible DSA. For the calculation of $\mathbf{Q}(\mathbf{r})$, we use Eq. (5), without taking the contributions of surfactant particles into account. Note that the motions of A and B only result in suppressing the tendency of $\mathbf{F}(\mathbf{r})$ and $\mathbf{Q}(\mathbf{r})$ to overlap each other, because they would not influence the “non-color” momentum exchanges. $\mathbf{F}(\mathbf{r})$ is determined by considering both the distribution and the structure of the surfactant particles. When a surfactant particle is located at $\mathbf{r}_G$ with an orientation angle $\theta$ (see Fig. 4), the A and B ends of the particle are located at
$$\mathbf{r}_A = \begin{pmatrix} r_{Ax} \\ r_{Ay} \end{pmatrix} = \begin{pmatrix} r_{Gx} \\ r_{Gy} \end{pmatrix} + \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} l_{phi}, \tag{7}$$
$$\mathbf{r}_B = \begin{pmatrix} r_{Bx} \\ r_{By} \end{pmatrix} = \begin{pmatrix} r_{Gx} \\ r_{Gy} \end{pmatrix} - \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} l_{pho}. \tag{8}$$
In these equations, $l_{phi}$ and $l_{pho}$ are the distance between G and the hydrophilic end (A in Fig. 4) and the distance between G and the hydrophobic end (B in Fig. 4), respectively. We then add the color charges $C_{phi}$ and $C_{pho}$ to the cells located at $\mathbf{r}_A$ and $\mathbf{r}_B$, which corresponds to modifying Eq. (4) into
$$C_n = \begin{cases} +1 & \text{red particle,} \\ -1 & \text{blue particle,} \\ C_{phi} & \text{hydrophilic head,} \\ C_{pho} & \text{hydrophobic tail.} \end{cases} \tag{9}$$
After calculating the color flux and the color field in each cell, a rotation angle is chosen using the same method as for binary immiscible DSA fluids, namely such that the color flux vector overlaps the color field vector. Finally, the orientation angle $\theta$ of each surfactant particle, after the momentum exchange, is set in such a way that it overlaps with the color field, which can be expressed as
$$\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} = \frac{\mathbf{F}(\mathbf{r})}{|\mathbf{F}(\mathbf{r})|}. \tag{10}$$
Both two- and three-dimensional versions of this model have been derived in this way [21, 22]. Using this model, the formation of spherical micelles, of water-in-oil and oil-in-water droplet microemulsion phases, and of water/oil/surfactant sponge phases has been reported in both two [21] and three [22] dimensions (see Figs. 5 and 6). The suppression of phase separation and the resultant domain growth, the lowering of the interfacial tension between two immiscible fluids, and the connection between the mesoscopic model parameters and the macroscopic surfactant phase behavior have been studied within the model in both two and three dimensions [21, 22]. These studies have been primarily qualitative in nature, and correspond to some of the early papers published on ternary amphiphilic fluids using LGA [26, 27] and LB [17, 28] methods. Much more extensive work on the
Figure 5. A two-dimensional DSA simulation of a sponge phase in a ternary amphiphilic fluid starting from a random initial condition, shown initially and after 250, 2500, and 10 000 steps [21]. Surfactant is visible at the interface between oil (dark grey) and water (light grey) regions. The system size is 64 × 64, the number density of DSA particles 10, the concentration ratio of water/oil/surfactant 1 : 1 : 1, the temperature of the system 0.2, and the color charges for the hydrophilic and hydrophobic end groups $C_{phi} = 10.0$ and $C_{pho} = -10.0$.
Figure 6. The formation of spherical micelles in aqueous solvent [22]. The system size is 32 × 32 × 32, the concentration of surfactant particles is 10%.
quantitative aspects of self-assembly kinetics has already been published using these two techniques.
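A sketch of the geometric part of the surfactant model, Eqs. (7), (8) and (10), is given below (ours; vectorized over M dumbbells, with hypothetical function names):

```python
import numpy as np

def surfactant_ends(rG, theta, l_phi, l_pho):
    """Positions of the hydrophilic (A) and hydrophobic (B) ends,
    Eqs. (7)-(8). rG: (M, 2) centres of mass; theta: (M,) angles."""
    e = np.stack([np.cos(theta), np.sin(theta)], axis=-1)
    return rG + l_phi * e, rG - l_pho * e

def align_with_field(F):
    """Post-collision orientation, Eq. (10): the dumbbell axis is set
    parallel to the local color field F, one (Fx, Fy) pair per dumbbell."""
    return np.arctan2(F[..., 1], F[..., 0])
```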
4. Some Recent Developments and Applications
DSA is currently attracting growing attention; the most recent published works using the method include the modeling of colloids [29], a detailed quantitative analysis of single-phase fluid behavior [30], and studies on the theoretical and numerically determined viscosity [31, 32]. Here we describe our own latest developments, concerning flow in porous media and parallel implementation.
4.1. Flow in Porous Media
Within DSA, updating the state in porous media simulations requires close attention to be paid to the propagation process. This is due to the fact that particles are allowed to assume velocities of arbitrary directions and magnitudes: it frequently happens that a particle unphysically penetrates through an obstacle and reaches a fluid area on the other side. It is thus not enough to know only the information about the starting and ending sites of the moving particles, as is done in LGA and LB studies; rather, their entire trajectories need to be investigated. We detect whether a particle hits an obstacle or not in the following way. First, we look at the cell containing $\mathbf{r}' = \mathbf{r} + \mathbf{v}$. If the cell is inside an obstacle, the particle move is rejected and bounce-back boundary conditions are applied to update the particle velocity in the cell. When the cell is within a pore region, we extract a rectangular set of cells in which the cells including $\mathbf{r}$ and $\mathbf{r}'$ face each other across the diagonal, as shown in Fig. 7. From this set of cells we further extract the cells which intersect the trajectory of the particle. In order to do this, every cell $C_j$ in the “box” shown in Fig. 7, except those containing $\mathbf{r}$ and $\mathbf{r}'$, is investigated by taking the cross products $\mathbf{v} \times \mathbf{c}_{jk}$, where the $\mathbf{c}_{jk}$, $k = 1, 2, 3, 4$, denote the position vectors of the four corners of cell $C_j$. If the $\mathbf{v} \times \mathbf{c}_{jk}$ for all $k$ have the same sign, the whole of $C_j$ is located on one side of $\mathbf{v}$, that is, it does not intersect $\mathbf{v}$ and there is no need to check whether the site is inside a pore or the solid matrix. Otherwise, $C_j$ intersects $\mathbf{v}$, and the move is rejected if the site is inside the solid matrix; see Fig. 7.
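The cross-product test lends itself to a very small routine. The sketch below (ours) assumes the candidate cells have already been restricted to the bounding “box” of the trajectory, and takes the corner vectors $\mathbf{c}_{jk}$ relative to the particle's starting point $\mathbf{r}$:

```python
import numpy as np

def crosses_trajectory(r, v, corners):
    """True if cell C_j straddles the line of flight r -> r + v.
    corners: (4, 2) array with the four corner positions of C_j.
    Equal signs of all cross products v x c_jk mean no intersection."""
    cjk = corners - r                              # corner vectors c_jk
    cross = v[0] * cjk[:, 1] - v[1] * cjk[:, 0]    # z-component of v x c_jk
    return cross.min() < 0.0 < cross.max()         # corners on both sides
```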
Figure 7. Scheme for detecting particles’ collisions with obstacles within the discrete simulation automata model [23]. Assume a particle moves from r to r′ = r + v (v is the velocity of the particle) in the current time-step. This particle obviously collides with the obstacle, which is colored gray. However, the collision cannot be detected if we only take into account the information on r′, which is within the pore region. The whole trajectory of the particle must be investigated to accurately detect the collisions. In order to do this, we first extract all – in this case twelve – cells comprising the “box”, a rectangular set of cells in which the cells including r and r′ are aligned with each other on a diagonal line. Secondly, from within the box, we further extract the cells which overlap with the trajectory of the particle. The six cells comprising the region bordered with slashed lines are such cells in this case. These cells, except those which include r and r′, are finally checked to establish whether they are part of an obstacle or a pore region.

Using this method we have simulated single-phase and binary immiscible fluid flow in two-dimensional porous media [23]. Good linear force–flux relationships were observed in single-phase fluid flows, as is expected from Darcy’s law. In binary immiscible fluid flows, our findings are in good agreement with previous studies using LGA [12–14]: a well-defined linear force–flux relationship was obtained only when the forcing exceeded specified thresholds. We also found a one-to-one correspondence between these thresholds and the interfacial tension between the two fluids, which supports the interpretation from previous LGA studies that the existence of these thresholds is due to the presence of capillary effects within the pore space. In the study [23], we assumed that the binary immiscible fluids are uncoupled. However, a more general force–flux relationship allows for the fluids to be coupled, and there have been a few studies of two-phase flow taking such coupling into account [12–14, 33, 34].

Figure 8. Occluded particles [23]: some particles can be assigned to a pore completely surrounded by solid matrices in the initial state, like particle i. Other particles can be occluded in a depression on the surface of an obstacle, like particle j. By imposing a gravitational force on such particles, they will gain kinetic energy limitlessly, because their energy cannot be dissipated through interactions with other particles.

Within LGA, using the gravitational
forcing method, it is possible to apply the forcing to only one species of fluid and discuss similarities with the Onsager relations [12, 13]. In our DSA study, we have used pressure forcing [23] and thus have not been able to investigate the effect of the coupling of the two immiscible fluids. The difficulty in implementing gravitational forcing within DSA is partly due to the local heating effects caused by occluded particles which are trapped within pores and will gain kinetic energy in an unbounded manner by the gravitational force imposed on them at every time step; see Fig. 8.
4.2. Parallel Implementation
For large scale simulations in three dimensions, the computational cost of DSA is high, as with LGA and LB methods. Due to the spatially local updating rules, however, all basic routines in DSA algorithms are parallelizable. Good
computer performance can thus be expected given an efficient parallel implementation. We have parallelized our DSA codes in two and three dimensions, written in C++ and named DSA2D and DSA3D, respectively, by spatially decomposing the system and implementing the MPI libraries [35]. It is in the propagation process that the MPI library functions are mainly used. There are two key features which are worth pointing out here. First, in the propagation process, information on the particles which exit each domain is stored in arrays which are then handed over to a send function MPI_Isend. The size of the arrays depends on temperature and the direction of the target domain. Second, as the number of particles within a domain fluctuates, 10 – 20% of the memory allocated for particles in the domain is used as an absorber. (Particles are allocated at an initial stage up to 80 – 90% of the total capacity.) Figures 9 and 10 show the parallel performance in two and three dimensions, respectively. Although DSA2D scales superlinearly across all processor counts, DSA3D scales well only with a large number of CPU (DSA3D’s propagation() routine even slows down with increasing processor counts for certain sets of parameters). The difference here is due to the way the system is spatially decomposed: DSA2D has been domain decomposed in one direction whereas DSA3D has been decomposed in three directions. In order to realise good scalability in three dimensions for our current parallel
Figure 9. Scalability of two-dimensional DSA (DSA2D) for single-phase fluids on SGI Origin 3000 (400 MHz MIPS R12000) processors, broken down into the collision() and propagation() routines and the total. “Performance” (vertical axis) means “speed-up”, which is relative to the number of processors. The overall performance is indeed superlinear.
Figure 10. Parallel performance of DSA3D for single-phase fluids of varying system sizes: (A) 64³; (B) 128³; (C) 256³, on SGI Origin 3000 (400 MHz MIPS R12000) processors, broken down into the collision() and propagation() routines and the total. “Performance” (vertical axis) means “speed-up”, which is relative to the number of processors. For the 64³ and 128³ systems the performance of the propagation process actually decreases when the number of CPUs becomes large.
In order to realise good scalability in three dimensions for our current parallel implementation, a large system and a large number of CPUs are required. The present parallel implementation should be regarded only as preliminary; further optimization may be expected to result in better overall performance.
5. Summary
Discrete simulation automata (DSA) represent a mesoscopic fluid simulation method which, in common with lattice gas automata (LGA) and lattice Boltzmann (LB) methods, has several advantages over conventional continuum fluid dynamics. Beyond the beneficial aspects of LGA and LB, DSA’s most distinctive characteristic is that a temperature can be defined very naturally. It is thus a promising candidate for dealing with complex fluids in which fluctuations can often play an essential role in determining the macroscopic behavior. There remain, however, some drawbacks to the DSA technique. The existence of particles with continuously valued velocities, coupled to the intrinsic temporal discreteness of the model, leads to some problems in handling wall boundary collisions, including the trapping of particles with increasing energy in certain flow regimes, which do not arise with LGA and LB methods. Nonetheless, DSA appears to be a promising technique for the study of numerous complex fluids. We have reviewed a few examples here, including immiscible fluids, amphiphilic fluids, and flow in porous media. Most of these studies have not yet reached a maturity and quantitative level equivalent to that of LGA and LB publications. DSA is amenable to fairly straightforward parallel implementation. We therefore expect to see further fruitful explorations of complex fluid dynamics using DSA in the future.
References
[1] G.K. Batchelor, An Introduction to Fluid Dynamics, Cambridge University Press, Cambridge, 1967.
[2] D.C. Rapaport, The Art of Molecular Dynamics Simulation, Cambridge University Press, Cambridge, 1995.
[3] U. Frisch, B. Hasslacher, and Y. Pomeau, Phys. Rev. Lett., 56, 1505, 1986.
[4] S. Succi, The Lattice Boltzmann Equation, Oxford University Press, Oxford, 2001.
[5] R. Benzi, S. Succi, and M. Vergassola, Phys. Rep., 222, 145, 1992.
[6] S. Chen, Z. Wang, X. Shan, and G. Doolen, J. Stat. Phys., 68, 379, 1992.
[7] D.H. Rothman and S. Zaleski, Lattice Gas Cellular Automata, Cambridge University Press, Cambridge, 1997.
[8] D.H. Rothman and J. Keller, J. Stat. Phys., 52, 1119, 1988.
[9] D. Grunau, S. Chen, and K. Eggert, Phys. Fluids A, 5, 2557, 1993.
[10] A.J.C. Ladd, J. Fluid Mech., 271, 285, 1994.
[11] J.A. Kaandorp, C. Lowe, D. Frenkel, and P.M.A. Sloot, Phys. Rev. Lett., 77, 2328, 1996.
[12] P.V. Coveney, J.-B. Maillet, J.L. Wilson, P.W. Fowler, O. Al-Mushadani, and B.M. Boghosian, Int. J. Mod. Phys. C, 9, 1479, 1998. [13] J.-B. Maillet and P.V. Coveney, Phys. Rev. E, 62, 2898, 2000. [14] P.J. Love, J.-B. Maillet, and P.V. Coveney, Phys. Rev. E, 64, 061302, 2001. [15] N.S. Martys and H. Chen, Phys. Rev. E, 53, 743, 1996. [16] A. Koponen, D. Kandhai, E. Hellen, M. Alava, A. Hoekstra, M. Kataja, K. Niskanen, P. Sloot, and J. Timonen, Phys. Rev. Lett., 80, 716, 1998. [17] P.J. Love, M. Nekovee, P.V. Coveney, J. Chin, N. Gonzalez-Segredo and J.M.R. Martin, Comput. Phys. Commun., 153, 340, 2003. [18] A. Malevanets and R. Kapral, J. Chem. Phys., 110, 8605, 1999. [19] Y. Hashimoto, Y. Chen, and H. Ohashi, Int. J. Mod. Phys. C, 9(8), 1479, 1998. [20] Y. Hashimoto, Y. Chen, and H. Ohashi, Comput. Phys. Commun., 129, 56, 2000. [21] T. Sakai, Y. Chen, and H. Ohashi, Phys. Rev. E, 65, 031503, 2002. [22] T. Sakai, Y. Chen, and H. Ohashi, J. Coll. and Surf., 201, 297, 2002. [23] T. Sakai and P.V. Coveney, “Single phase and binary immiscible fluid flow in two-dimensional porous media using discrete simulation automata,” 2002 (preprint). [24] G.A. Bird, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Clarendon, Oxford, 1994. [25] T. Kawakatsu, K. Kawasaki, M. Furusaka, H. Okabayashi and T. Kanaya, J. Chem. Phys., 99, 8200, 1993. [26] B.M. Boghosian, P.V. Coveney, and A.N. Emerton, Proc. R. Soc. A, 452, 1221, 1996. [27] B.M. Boghosian, P.V. Coveney, and P.J. Love, Proc. R. Soc. A, 456, 1431, 2000. [28] H. Chen, B.M. Boghosian, P.V. Coveney, and M. Nekovee, Proc. R. Soc. A, 456, 2043, 2000. [29] S.H. Lee and R. Kapral, Physica A, 298, 56, 2001. [30] A. Lamura and G. Gompper, Eur. Phys. J.E, 9, 477, 2002. [31] T. Ihle and D.M. Kroll, Phys. Rev. E, 63, 020201(R), 2001. [32] A. Lamura, G. Gompper, T. Ihle, and D. M. Kroll, Europhys. Lett., 56, 319, 2001. [33] C. Zarcone and R. Lenormand, C.R. Acad. Sci. Paris, 318, 1429, 1994. [34] J.F. Olson and D.H. Rothman, J. Fluid Mech., 341, 343, 1997. [35] http://www-unix.mcs.anl.gov/mpi/index.html
8.6 DISSIPATIVE PARTICLE DYNAMICS
Pep Español
Dept. Física Fundamental, Universidad Nacional de Educación a Distancia, Aptdo. 60141, E-28080 Madrid, Spain
1. The Original DPD Model
In order to simulate a complex fluid like a polymeric or colloidal fluid, a molecular dynamics simulation is not very useful. The long time and space scales involved in the mesoscopic dynamics of large macromolecules or colloidal particles, as compared with molecular scales, imply following an exceedingly large number of molecules over exceedingly long times. On the other hand, at these long scales molecular details show up only in a rather coarse form, and the question arises whether it is possible to deal with coarse-grained entities that reproduce the mesoscopic dynamics correctly. Dissipative particle dynamics (DPD) is a fruitful modeling attempt in that direction. DPD is a stochastic particle model that was introduced originally as an off-lattice version of lattice gas automata (LGA) in order to avoid its lattice artifacts [1]. The method was put in a proper statistical mechanics context a few years later [2], and the number of applications since then is growing steadily. The original DPD model consists of a collection of soft, repelling, frictional and noisy balls. From a physical point of view, each dissipative particle is regarded not as a single molecule of the fluid but rather as a collection of molecules that move in a coherent fashion. In that respect, DPD can be understood as a coarse-graining of molecular dynamics. There are three types of forces between dissipative particles. The first type is a conservative force deriving from a soft potential that tries to capture the effects of the “pressure” between different particles. The second type is a friction force between the particles that aims to describe the viscous resistance in a real fluid. This force tries to reduce velocity differences between dissipative particles. Finally, there is a stochastic force that describes the degrees of freedom that have been eliminated from the description in the coarse-graining process.
This stochastic force is responsible for the Brownian motion of the polymer and colloidal particles simulated with DPD. The postulated stochastic differential equations (SDEs) that define the DPD model are [2]
$$d\mathbf{r}_i = \mathbf{v}_i\, dt,$$
$$m_i\, d\mathbf{v}_i = \sum_{j \neq i} \mathbf{F}^C_{ij}(r_{ij})\, dt - \gamma \sum_{j \neq i} \omega(r_{ij}) (\mathbf{e}_{ij} \cdot \mathbf{v}_{ij}) \mathbf{e}_{ij}\, dt + \sigma \sum_{j \neq i} \omega^{1/2}(r_{ij})\, \mathbf{e}_{ij}\, dW_{ij}. \tag{1}$$
Here, $\mathbf{r}_i, \mathbf{v}_i$ are the position and velocity of the dissipative particles, $m_i$ is the mass of particle $i$, $\mathbf{F}^C_{ij}$ is the conservative repulsive force between dissipative particles $i, j$, $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$, $\mathbf{v}_{ij} = \mathbf{v}_i - \mathbf{v}_j$, and the unit vector from the $j$th particle to the $i$th particle is $\mathbf{e}_{ij} = (\mathbf{r}_i - \mathbf{r}_j)/r_{ij}$, with $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$. The friction coefficient $\gamma$ governs the overall magnitude of the dissipative force, and $\sigma$ is a noise amplitude that governs the intensity of the stochastic forces. The weight function $\omega(r)$ provides the range of interaction for the dissipative particles and renders the model local, in the sense that the particles interact only with their neighbors. A usual selection for the weight function in the DPD literature is a linear function with the shape of a Mexican hat, but there is no special reason for such a selection. Finally, $dW_{ij} = dW_{ji}$ are independent increments of the Wiener process that satisfy the Itô calculus rule $dW_{ij}\, dW_{i'j'} = (\delta_{ii'} \delta_{jj'} + \delta_{ij'} \delta_{ji'})\, dt$. There are several remarkable features of the above SDEs. They are translationally, rotationally, and Galilean invariant. Most importantly, total momentum is conserved, $d(\sum_i \mathbf{p}_i)/dt = 0$, because the three types of forces satisfy Newton's Third Law. Therefore, the DPD model captures the essentials of mass and momentum conservation, which are responsible for the hydrodynamic behavior of a fluid at large scales [3, 4]. Despite their appearance as Langevin equations, Eqs. (1) are quite different from the ones used in Brownian dynamics simulations. In the Brownian dynamics method, the total momentum of the particles is not conserved and only mass diffusion can be studied. The above SDEs are mathematically equivalent to a Fokker–Planck equation (FPE) that governs the time-dependent probability distribution $\rho(r, v; t)$ of positions and velocities of the particles. The explicit form of the FPE can be found in Ref. [2]. Under the assumption that the noise amplitude and the friction coefficient are related by the fluctuation–dissipation relation $\sigma = (2 k_B T \gamma)^{1/2}$, the equilibrium distribution $\rho^{eq}$ of the FPE has the familiar form
$$
\rho^{eq}(r, v) = \frac{1}{Z}\exp\left\{-\frac{1}{k_BT}\left[\sum_i \frac{m_i v_i^2}{2} + V(r)\right]\right\}
\tag{2}
$$
where $V$ is the potential function that gives rise to the conservative forces $\mathbf{F}^C$, $k_B$ is Boltzmann's constant, $T$ is the equilibrium temperature, and $Z$ is the normalizing partition function.
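To make the structure of Eq. (1) concrete, the following minimal sketch (not part of the original article) advances a small DPD system with a simple Euler–Maruyama step. The soft conservative force $\mathbf{F}^C_{ij} = a(1 - r_{ij}/r_c)\,\mathbf{e}_{ij}$, the linear weight function, and all parameter values are illustrative choices commonly made in the DPD literature [22], not prescriptions from the text; production codes use cell lists and more careful integrators.

import numpy as np

def dpd_step(r, v, m, box, dt, a=25.0, gamma=4.5, kBT=1.0, rc=1.0):
    """One Euler-Maruyama step of the DPD SDEs, Eq. (1). O(N^2) pair loop."""
    sigma = np.sqrt(2.0 * kBT * gamma)            # fluctuation-dissipation relation
    N = r.shape[0]
    F = np.zeros_like(r)
    for i in range(N):
        for j in range(i + 1, N):
            rij = r[i] - r[j]
            rij -= box * np.round(rij / box)      # minimum-image convention
            d = np.linalg.norm(rij)
            if d >= rc or d == 0.0:
                continue
            e = rij / d
            w = 1.0 - d / rc                      # linear weight function omega(r)
            xi = np.random.randn()                # dW_ij = xi * sqrt(dt)
            f = (a * w * e                                    # conservative force
                 - gamma * w * np.dot(e, v[i] - v[j]) * e     # dissipative force
                 + sigma * np.sqrt(w) * xi * e / np.sqrt(dt))  # random force
            F[i] += f                             # equal and opposite pair forces:
            F[j] -= f                             # total momentum is conserved
    v = v + F * dt / m                            # advance velocities, then positions
    r = (r + v * dt) % box
    return r, v

Because the random and dissipative amplitudes are tied by $\sigma = (2k_BT\gamma)^{1/2}$, long runs of this update sample the equilibrium distribution of Eq. (2) at temperature $T$.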
2. DPD Simulations of Complex Fluids
One of the most attractive features of the model is its enormous versatility for constructing simple models of complex fluids. In DPD, the Newtonian fluid is made "complex" by adding additional interactions between the fluid particles. Just by changing the conservative interactions between the fluid particles, one can easily construct polymers, colloids, amphiphiles, and mixtures. Given this simplicity in modeling mesostructures, DPD appears as a competitive technique in the field of complex fluids. We now review some of the applications of DPD to the simulation of different complex fluid systems (see also Ref. [5]).

Colloidal particles are constructed by freezing fluid particles inside a certain region, typically a sphere or ellipsoid, and moving those particles as a rigid body. The idea was pioneered by Koelman and Hoogerbrugge [6] and has been explored in more detail by Boek et al. [7]. The simulation results for shear thinning curves of spherical particles compare very well with experimental results for volume fractions below 30%. At higher volume fractions somewhat inconsistent results are obtained, which can be attributed to several factors. The colloidal particles modeled in this way are to a certain degree "soft balls" that can interpenetrate, leading to unphysical interactions. At high volume fractions solvent particles are expelled from the region between two colloidal particles. Again, this misrepresents the hydrodynamic interaction, which is mostly due to lubrication forces [8]. Depletion forces appear [9, 10] which are unphysical and due solely to the discrete representation of the continuum solvent. It seems that a judicious selection of lubrication forces, taking into account the effects of the solvent when no dissipative particle exists between two colloidal particles, could eventually solve this problem. Finally, we note that DPD can resolve the time scales of fluid momentum transport on the length scale of the colloidal particles or their typical separations. These scales are probed experimentally by diffusing wave spectroscopy [11].

Polymer molecules are constructed in DPD by linking several dissipative particles with springs (either Hookean or FENE [12]). Dilute polymer solutions are modeled by a set of polymer molecules interacting with a sea of fluid particles. The solvent quality can be varied by fine tuning the solvent–solvent and solvent–monomer conservative interactions. In this way, a collapse transition has been observed in passing from a good solvent to a poor solvent [13]. Static scaling results for the radius of gyration and relaxation time with the number of beads are consistent with the Rouse/Zimm models [14]. The model displays hydrodynamic interactions and excluded volume interactions, depending on solvent quality. Rheological properties have also been studied, showing good agreement with known kinetic theory results [15, 16]. Polymer solutions confined between walls have also been modeled, showing anisotropic relaxation in nanoscale gaps [17]. Polymer melts have been simulated, showing that the static scaling and rheology correspond to the Rouse theory, as a result of the screening of hydrodynamic and excluded volume interactions in the melt [14]. The model is unable to simulate entanglements, because the soft interactions between beads allow polymer crossing [14], although this effect can be partially controlled by suitably adjusting the length and intensity of the springs. At this point, DPD appears as a well-benchmarked model for the simulation of polymer systems. Nevertheless, there is still no direct connection between the model parameters used in DPD and actual molecular parameters like molecular weight, torsion potentials, etc.

Immiscible fluid mixtures are modeled in DPD by assuming two types of particles [18]. Unlike particles repel each other more strongly than like particles, thus favoring phase separation. Starting from random initial conditions representing a high temperature miscible phase suddenly quenched, the domain growth has been investigated [19, 20]. Although lattice Boltzmann simulations allow exploration of larger time scales than DPD [21], the simplicity of DPD modeling allows one to generalize easily to more complex systems in a way that lattice Boltzmann cannot. For example, mixtures of homopolymer melts have been modeled with DPD [22]. Surface tension measurements allow for a mapping of the model onto the Flory–Huggins theory [22]. In this way, thermodynamic information has been used to fix the model parameters of DPD. A more detailed analysis of this procedure has been presented recently in Refs. [23, 24], where a calculation of the phase diagram of monomer and polymer mixtures of DPD particles allowed a discussion of the connection between the repulsion parameter difference and the Flory–Huggins parameter χ. Another successful application of DPD has been the simulation of microphase separation of diblock copolymers [25], which has allowed a discussion of the pathway to equilibrium; this pathway is strongly affected by hydrodynamics [26]. In a similar way, simulations of rigid DPD dimers in a solution of solvent monomers have allowed the study of the growth of amphiphilic mesophases and their behavior under shear [27], and of the self-assembly of model membranes [28]. DPD has also been applied to other complex situations like the dynamics of a drop at a liquid–solid interface [29], flow and rheology in the presence of polymers grafted to walls [30], vesicle formation of amphiphilic molecules [31], and polyelectrolytes [32].
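Two of the building blocks just described, species-dependent conservative repulsions for mixtures and bead–bead springs for polymers, are simple enough to sketch. In the fragment below (an illustration, not code from the cited works), the excess repulsion delta_a between unlike particles plays the role that is mapped onto the Flory–Huggins χ parameter in Refs. [22–24], and the FENE spring links consecutive beads of a chain; all numerical values are conventional choices, not values taken from the text.

import numpy as np

def repulsion_matrix(a_like=25.0, delta_a=5.0):
    """Conservative-force amplitudes a_ij for a two-species (A/B) DPD mixture;
    unlike pairs repel more strongly, which drives phase separation [18]."""
    a = np.full((2, 2), a_like)
    a[0, 1] = a[1, 0] = a_like + delta_a
    return a

def fene_spring_force(rij, k=30.0, r_max=1.5):
    """FENE spring force on bead i from its bonded neighbor j, used to link
    dissipative particles into polymer chains [12]."""
    r = np.linalg.norm(rij)
    if r >= r_max:
        raise ValueError("FENE spring overstretched")
    return -k * rij / (1.0 - (r / r_max) ** 2)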
3. Thermodynamically Consistent DPD Model
Despite its successes, the DPD model suffers from several conceptual shortcomings that originate from the oversimplification of the so-formulated
dissipative particles as representing mesoscopic portions of fluid. Several issues in the original model are unsatisfactory. For example, even though the macroscopic behavior of the model is hydrodynamic [3], it is not possible to relate the viscosity of the fluid to the model parameters in a simple, direct way. Only after recourse to the methods of kinetic theory can one estimate what input values of the friction coefficient should be imposed to obtain a given viscosity [4]. Another problem with the original model is that the conservative forces fix the thermodynamic behavior of the fluid [22]. The pressure equation of state, for example, is an outcome of the simulation, not an input. The model is isothermal and cannot be used to study energy transport. There are no rules for specifying the range and shape of the weight functions, which affect both thermodynamic and transport properties. Perhaps the biggest problem of the model is that the physical length and time scales actually simulated are unclear: how big a dissipative particle is cannot be inferred from the model parameters. DPD appeared as a quick way of getting hydrodynamics suitable for "mesoscales". Of course, the fact that there exists a well-defined Hamiltonian with a proper equilibrium ensemble still makes the DPD model useful, at least as a thermostatting device that respects hydrodynamics. In particular, when considering models of coarse-grained complex molecules (like amphiphiles or macromolecules), DPD as originally formulated can be very useful, despite the fact that an explicit correspondence between molecular parameters and DPD parameters is not known. However, the above-mentioned problems render DPD a poor tool for the simulation of Newtonian fluids at mesoscopic scales. One needs to simulate a Newtonian fluid when dealing with colloidal suspensions, dilute polymeric suspensions, or mixtures of Newtonian fluids. In these cases, one should use better models that are thermodynamically consistent. These models consider each dissipative particle as a fluid particle, that is, a small moving thermodynamic system with proper thermodynamic variables. The idea of introducing an internal energy variable in the DPD model was developed in Refs. [33, 34] in order to obtain an energy-conserving DPD model. Yet it is necessary to introduce a second thermodynamic variable to have a full thermodynamic description. This variable is the volume of the fluid particles. There have also been attempts to introduce a volume variable in the isothermal DPD model [35, 36], but a full non-isothermal and thermodynamically consistent model has appeared only recently [37]. One way to define the volume is with the help of a bell-shaped weight function $W(r)$ of finite range $h$, normalized to unity. We introduce the density of every fluid particle through the relation $d_i = \sum_j W(r_{ij})$. Clearly, if there are many particles $j$ around particle $i$, the density $d_i$ defined above will be large. One associates a volume $V_i = d_i^{-1}$ with the fluid particle. Another possibility for defining the volume of each fluid particle relies on the Voronoi tessellation [38–40].
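A minimal sketch of this density-based volume definition follows. The particular bell-shaped kernel (Lucy's, normalized to unity in three dimensions [43]) and the inclusion of the self term W(0) in d_i are illustrative assumptions, not requirements of the text.

import numpy as np

def lucy_kernel(r, h):
    """Bell-shaped weight function W(r) of finite range h, normalized to
    unity in 3D; Lucy's kernel [43] is one conventional choice."""
    return np.where(r < h,
                    (105.0 / (16.0 * np.pi * h**3))
                    * (1.0 + 3.0 * r / h) * (1.0 - r / h) ** 3,
                    0.0)

def particle_volumes(r, h):
    """Density d_i = sum_j W(r_ij) and volume V_i = 1/d_i per fluid particle."""
    diff = r[:, None, :] - r[None, :, :]       # pairwise displacement vectors
    dist = np.linalg.norm(diff, axis=-1)
    d = lucy_kernel(dist, h).sum(axis=1)       # the j = i term contributes W(0)
    return d, 1.0 / d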
The equations for the evolution of the position, velocity, and entropy of each fluid particle in the thermodynamically consistent DPD model are [37]

$$
\dot{\mathbf{r}}_i = \mathbf{v}_i
$$
$$
m\dot{\mathbf{v}}_i = \sum_j \left[\frac{P_i}{d_i^2} + \frac{P_j}{d_j^2}\right] F_{ij}\,\mathbf{r}_{ij}
 - \frac{5\eta}{3}\sum_j \frac{F_{ij}}{d_i d_j}\left[\mathbf{v}_{ij} + \mathbf{e}_{ij}(\mathbf{e}_{ij}\cdot\mathbf{v}_{ij})\right] + \tilde{\mathbf{F}}_i
$$
$$
T_i\dot{S}_i = -2\kappa\sum_j \frac{F_{ij}}{d_i d_j}\,T_{ij}
 + \frac{5\eta}{6}\sum_j \frac{F_{ij}}{d_i d_j}\left[v_{ij}^2 + (\mathbf{e}_{ij}\cdot\mathbf{v}_{ij})^2\right] + T_i\tilde{J}_i
\tag{3}
$$
Here, $P_i$, $T_i$ are the pressure and temperature of fluid particle $i$, given in terms of equations of state, and $T_{ij} = T_i - T_j$. We have introduced the function $F(r)$ through $\nabla W(r) = -\mathbf{r}\,F(r)$, and $\tilde{\mathbf{F}}_i$, $\tilde{J}_i$ are suitable stochastic forces that obey the fluctuation–dissipation theorem [37]. Some small terms have been neglected in Eq. (3) for the sake of presentation. It can be shown that the above model conserves mass, momentum, and energy, and that the total entropy is a non-decreasing function of time, rendering the model consistent with the laws of thermodynamics. What are the similarities and differences between the thermodynamically consistent DPD model in Eq. (3) and the original DPD model of Eq. (1)? As in DPD, particles of constant mass $m$ move according to their velocities and exert finite-range forces of different nature on each other. The conservative forces of DPD are now replaced by a repulsive force directed along the line joining the particles, with a magnitude given by the pressures $P_i$ and densities of the particles. Because the pressure $P_i$ depends on the density, this type of force is not pair-wise but multibody [35]. The friction forces still depend on velocity differences between neighboring particles, but there is an additional term directly proportional to $\mathbf{v}_{ij}$. This new term is necessary in order to have a faithful representation of the second space derivative terms that appear in the continuum equations of hydrodynamics [41]. In other words, it can be shown that, when thermal fluctuations can be neglected, Eq. (3) is a Lagrangian discretization of the continuum equations of hydrodynamics. Note that the friction coefficient is now given by the actual viscosity $\eta$ of the fluid to be modeled, and is not an arbitrary tuning parameter. Finally, there is an additional dynamic equation for the entropy $S_i$ of the fluid particles. The terms in the entropy equation have a simple meaning as heat conduction and viscous heating. The heat conduction term tends to reduce temperature differences between particles by suitable energy exchange [42], whereas the viscous heating term, proportional to the square of the velocities, ensures that the kinetic energy dissipated by the friction forces is transformed into internal energy of the fluid particles. The model solves all the conceptual problems of DPD mentioned at the beginning of this section. In particular, the pressure and any other thermodynamic information is introduced as an input. The conservative forces of the
original model become physically sound pressure forces. Arbitrary equations of state, in particular of the van der Waals type, can be used to study liquid–vapor coexistence in dynamic situations. Energy is conserved, and we can study the transport of energy in the system. The Second Law is satisfied. The transport coefficients are inputs of the model. The range functions of DPD enter in a very specific form, both in the conservative part of the dynamics, through the density and pressure, and in the dissipative part, through the function $F_{ij}$. The particles have a physical size given by their physical volume, and it is possible to specify the physical scale being simulated. The concept of resolution enters into play in the sense that one has to use many fluid particles per relevant length scale in order to recover the continuum results. Therefore, for resolving micron-sized objects one has to use very small fluid particles, whereas for resolving meter-sized objects large fluid particles are sufficient. In the model, it turns out that the amplitude of the thermal fluctuations scales with the inverse square root of the volume of the fluid particles, in accordance with the usual notions of equilibrium statistical mechanics. Therefore, we expect that thermal fluctuations can be neglected in a simulation of meter-sized objects, but that they are essential in the simulation of colloidal particles. This natural switching off of thermal fluctuations with increasing size is absent in the original DPD model.

The model in Eq. (3) (without thermal fluctuations) is actually a version of the smoothed particle hydrodynamics (SPH) model, a Lagrangian particle method introduced by Lucy [43] and Monaghan [44] in the 1970s in order to solve hydrodynamic problems in astrophysical contexts. Generalizations of SPH that include viscosity and thermal conduction and address laboratory-scale situations like viscous flow and thermal convection have been presented only quite recently [42, 45, 46]. In order to formulate the thermodynamically consistent DPD model in Eq. (3), we have resorted to the GENERIC framework, which is a very elegant and useful way of writing dynamic equations that, by structure, are thermodynamically consistent [47]. It is possible to derive new fluid particle models based on both the SPH methodology for discretizing continuum equations and the GENERIC framework for ensuring thermodynamic consistency.

Continuum models for complex fluids typically involve additional structural or internal variables that are coupled with the conventional hydrodynamic variables. The coupling renders the behavior of the fluid non-Newtonian and complex. For example, polymer melts are characterized by additional conformation tensors, colloidal suspensions can be described by further concentration fields, mixtures are characterized by several density fields (one for each chemical species), emulsions are described by the amount and orientation of interface, etc. All these continuum models rely on the hypothesis of local equilibrium and, therefore, the fluid particles are regarded as thermodynamic subsystems. The physical picture that emerges is that these fluid particles represent "large" portions of the fluid; therefore, the scale of these
fluid particles is supramolecular. This allows one to study large time scales. The price, of course, is the need for a deep understanding of the physics at this more coarse-grained level. In order to model polymer solutions, for example, ten Bosch [48] associated with each dissipative particle an elongation vector representing the average elongation of the polymer molecules. Although the ten Bosch model has all the problems of the original DPD model, it can be cast into a thermodynamically consistent model for non-isothermal dilute polymer solutions [49]. Another example where the strategy of internal variables can be successful is the simulation of chemically reacting mixtures. Chemically reacting mixtures are not easily implemented with the usual DPD approach to mixtures, in which the components are represented by "red" and "blue" particles: it is not trivial to specify a chemical reaction in which, for example, two red particles react with a blue particle to form a "green" particle. In this case, it is better to start from the well-established continuum equations for chemical reactions [41]. The fluid particles in the model then carry as an additional variable the fractions of the red and blue components inside the fluid particle. These two examples show how one can address viscoelastic flow problems and chemically reacting fluids with a simple methodology that involves fluid particles with internal variables. The idea can, of course, be applied to other complex fluids for which the continuum equations are known.
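The data layout implied by this internal-variable strategy is simple. The sketch below is an illustration with hypothetical field names (not taken from Refs. [41, 48, 49]); it augments a fluid particle with an elongation vector and a species fraction, and applies a first-order red → blue conversion as a minimal stand-in for the continuum reaction kinetics.

from dataclasses import dataclass, field
import numpy as np

@dataclass
class FluidParticle:
    """A fluid particle as a small thermodynamic subsystem carrying
    internal variables in addition to position, velocity, and entropy."""
    r: np.ndarray
    v: np.ndarray
    S: float
    c_red: float = 0.5                       # mass fraction of the 'red' species
    elongation: np.ndarray = field(default_factory=lambda: np.zeros(3))

def react(p: FluidParticle, k: float, dt: float) -> None:
    """First-order red -> blue conversion inside one fluid particle."""
    p.c_red *= np.exp(-k * dt)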
Acknowledgments

This work has been partially supported by the project BFM2001-0290 of the Spanish Ministerio de Ciencia y Tecnología.
References

[1] P.J. Hoogerbrugge and J.M.V.A. Koelman, "Simulating microscopic hydrodynamic phenomena with dissipative particle dynamics," Europhys. Lett., 19(3), 155–160, 1992.
[2] P. Español and P. Warren, "Statistical mechanics of dissipative particle dynamics," Europhys. Lett., 30, 191, 1995.
[3] P. Español, "Hydrodynamics from dissipative particle dynamics," Phys. Rev. E, 52, 1734, 1995.
[4] C. Marsh, G. Backx, and M.H. Ernst, "Static and dynamic properties of dissipative particle dynamics," Phys. Rev. E, 56, 1976, 1997.
[5] P.B. Warren, "Dissipative particle dynamics," Curr. Opinion Colloid Interface Sci., 3, 620, 1998.
[6] J.M.V.A. Koelman and P.J. Hoogerbrugge, "Dynamic simulations of hard-sphere suspensions under steady shear," Europhys. Lett., 21, 363–368, 1993.
[7] E.S. Boek, P.V. Coveney, H.N.W. Lekkerkerker, and P. van der Schoot, "Simulating the rheology of dense colloidal suspensions using dissipative particle dynamics," Phys. Rev. E, 55(3), 3124–3133, 1997.
[8] J.R. Melrose, J.H. van Vliet, and R.C. Ball, "Continuous shear thickening and colloid surfaces," Phys. Rev. Lett., 77, 4660, 1996.
[9] E.S. Boek and P. van der Schoot, "Resolution effects in dissipative particle dynamics simulations," Int. J. Mod. Phys. C, 9, 1307, 1997.
[10] M. Whittle and E. Dickinson, "On simulating colloids by dissipative particle dynamics: issues and complications," J. Colloid Interface Sci., 242, 106, 2001.
[11] M. Kao, A. Yodh, and D.J. Pine, "Observation of Brownian motion on the time scale of hydrodynamic interactions," Phys. Rev. Lett., 70, 242, 1993.
[12] A.G. Schlijper, P.J. Hoogerbrugge, and C.W. Manke, "Computer simulation of dilute polymer solutions with dissipative particle dynamics," J. Rheol., 39(3), 567–579, 1995.
[13] Y. Kong, C.W. Manke, W.G. Madden, and A.G. Schlijper, "Effect of solvent quality on the conformation and relaxation of polymers via dissipative particle dynamics," J. Chem. Phys., 107, 592, 1997.
[14] N.A. Spenley, "Scaling laws for polymers in dissipative particle dynamics," Mol. Simul., 49, 534, 2000.
[15] Y. Kong, C.W. Manke, W.G. Madden, and A.G. Schlijper, "Modeling the rheology of polymer solutions by dissipative particle dynamics," Tribol. Lett., 3, 133, 1997.
[16] A.G. Schlijper, C.W. Manke, W.G. Madden, and Y. Kong, "Computer simulation of non-Newtonian fluid rheology," Int. J. Mod. Phys. C, 8(4), 919–929, 1997.
[17] Y. Kong, C.W. Manke, W.G. Madden, and A.G. Schlijper, "Simulation of a confined polymer in solution using the dissipative particle dynamics method," Int. J. Thermophys., 15, 1093, 1994.
[18] P.V. Coveney and K. Novik, "Computer simulations of domain growth and phase separation in two-dimensional binary immiscible fluids using dissipative particle dynamics," Phys. Rev. E, 54, 5134, 1996.
[19] S.I. Jury, P. Bladon, S. Krishna, and M.E. Cates, "Test of dynamical scaling in three-dimensional spinodal decomposition," Phys. Rev. E, 59, R2535, 1999.
[20] K.E. Novik and P.V. Coveney, "Spinodal decomposition of off-critical quenches with a viscous phase using dissipative particle dynamics in two and three spatial dimensions," Phys. Rev. E, 61, 435, 2000.
[21] V.M. Kendon, J.-C. Desplat, P. Bladon, and M.E. Cates, "3D spinodal decomposition in the inertial regime," Phys. Rev. Lett., 83, 576, 1999.
[22] R.D. Groot and P.B. Warren, "Dissipative particle dynamics: bridging the gap between atomistic and mesoscopic simulation," J. Chem. Phys., 107, 4423, 1997.
[23] S.M. Willemsen, T.J.H. Vlugt, H.C.J. Hoefsloot, and B. Smit, "Combining dissipative particle dynamics and Monte Carlo techniques," J. Comput. Phys., 147, 50, 1998.
[24] C.M. Wijmans, B. Smit, and R.D. Groot, "Phase behavior of monomeric mixtures and polymer solutions with soft interaction potential," J. Chem. Phys., 114, 7644, 2001.
[25] R.D. Groot and T.J. Madden, "Dynamic simulation of diblock copolymer microphase separation," J. Chem. Phys., 108, 8713, 1998.
[26] R.D. Groot, T.J. Madden, and D.J. Tildesley, "On the role of hydrodynamic interactions in block copolymer microphase separation," J. Chem. Phys., 110, 9739, 1999.
[27] S. Jury, P. Bladon, M. Cates, S. Krishna, M. Hagen, N. Ruddock, and P.B. Warren, "Simulation of amphiphilic mesophases using dissipative particle dynamics," Phys. Chem. Chem. Phys., 1, 2051, 1999.
[28] M. Venturoli and B. Smit, "Simulating the self-assembly of model membranes," Phys. Chem. Commun., 10, 1, 1999.
[29] J.L. Jones, M. Lal, N. Ruddock, and N.A. Spenley, "Dynamics of a drop at a liquid/solid interface in simple shear fields: a mesoscopic simulation study," Faraday Discuss., 112, 129, 1999.
[30] P. Malfreyt and D.J. Tildesley, "Dissipative particle dynamics of grafted polymer chains between two walls," Langmuir, 16, 4732, 2000.
[31] S. Yamamoto, Y. Maruyama, and S. Hyodo, "Dissipative particle dynamics study of spontaneous vesicle formation of amphiphilic molecules," J. Chem. Phys., 116(13), 5842, 2002.
[32] R.D. Groot, "Electrostatic interactions in dissipative particle dynamics – simulation of polyelectrolytes and anionic surfactants," J. Chem. Phys., 118, 11265, 2003.
[33] J. Bonet-Avalós and A.D. Mackie, "Dissipative particle dynamics with energy conservation," Europhys. Lett., 40, 141, 1997.
[34] P. Español, "Dissipative particle dynamics with energy conservation," Europhys. Lett., 40, 631, 1997.
[35] I. Pagonabarraga and D. Frenkel, "Dissipative particle dynamics for interacting systems," J. Chem. Phys., 115, 5015, 2001.
[36] S.Y. Trofimov, E.L.F. Nies, and M.A.J. Michels, "Thermodynamic consistency in dissipative particle dynamics simulations of strongly nonideal liquids and liquid mixtures," J. Chem. Phys., 117, 9383, 2002.
[37] P. Español and M. Revenga, "Smoothed dissipative particle dynamics," Phys. Rev. E, 67, 026705, 2003.
[38] E.G. Flekkøy, P.V. Coveney, and G. DeFabritiis, "Foundations of dissipative particle dynamics," Phys. Rev. E, 62, 2140, 2000.
[39] M. Serrano and P. Español, "Thermodynamically consistent mesoscopic fluid particle model," Phys. Rev. E, 64, 046115, 2001.
[40] M. Serrano, G. DeFabritiis, P. Español, E.G. Flekkøy, and P.V. Coveney, "Mesoscopic dynamics of Voronoi fluid particles," J. Phys. A: Math. Gen., 35, 1605–1625, 2002.
[41] S.R. de Groot and P. Mazur, Non-equilibrium Thermodynamics, North Holland, Amsterdam, 1964.
[42] P.W. Cleary and J.J. Monaghan, "Conduction modelling using smoothed particle hydrodynamics," J. Comput. Phys., 148, 227, 1999.
[43] L.B. Lucy, "A numerical approach to the testing of the fission hypothesis," Astron. J., 82, 1013, 1977.
[44] J.J. Monaghan, "Smoothed particle hydrodynamics," Annu. Rev. Astron. Astrophys., 30, 543–574, 1992.
[45] H. Takeda, S.M. Miyama, and M. Sekiya, "Numerical simulation of viscous flow by smoothed particle hydrodynamics," Prog. Theor. Phys., 92, 939, 1994.
[46] O. Kum, W.G. Hoover, and H.A. Posch, "Viscous conducting flows with smooth-particle applied mechanics," Phys. Rev. E, 52, 4899, 1995.
[47] H.C. Öttinger and M. Grmela, "Dynamics and thermodynamics of complex fluids. II. Illustrations of a general formalism," Phys. Rev. E, 56, 6633, 1997.
[48] B.I.M. ten Bosch, "On an extension of dissipative particle dynamics for viscoelastic flow modelling," J. Non-Newtonian Fluid Mech., 83, 231, 1999.
[49] M. Ellero, P. Español, and E.G. Flekkøy, "Thermodynamically consistent fluid particle model for viscoelastic flows," Phys. Rev. E, 68, 041504, 2003.
8.7 THE DIRECT SIMULATION MONTE CARLO METHOD: GOING BEYOND CONTINUUM HYDRODYNAMICS

Francis J. Alexander
Los Alamos National Laboratory, Los Alamos, NM, USA
The Direct Simulation Monte Carlo method is a stochastic, particle-based algorithm for solving kinetic theory's Boltzmann equation. Materials can be modeled at a variety of scales. At the quantum level, for example, time-dependent density functional theory or quantum Monte Carlo may be used. At the atomistic level, typically molecular dynamics is used, while at the continuum level, partial differential equations describe the evolution of conserved quantities and slow variables. Between the atomistic and continuum descriptions lies the kinetic level. The ability to model at this level is crucial for electron and phonon transport in materials. For classical fluids, especially gases in certain regimes, modeling at this level is required. This article addresses computer simulations at the kinetic level.
1. Direct Simulation Monte Carlo
The equations of continuum hydrodynamics, such as Euler and Navier–Stokes, model fluids under a variety of conditions. From capillary flow, to river flow, to the flow of galactic matter, these equations describe the dynamics of fluids over a wide range of space and time scales. However, these equations do not apply in important situations such as gas flow in nanoscale channels and flight in rarefied atmospheric conditions. Because these flows may be collisionless, or out of equilibrium, or have sharp gradients, they require a finer-grained description than that provided by hydrodynamics. In these situations, the single particle distribution function $f(\mathbf{r}, \mathbf{v}, t)$ is used. Here, $f$ is the number density of atoms or molecules in an infinitesimal six-dimensional volume of phase space, centered at location $\mathbf{r}$ and with
velocity $\mathbf{v}$. For dilute gases, Boltzmann was the first to determine how this distribution changes in time. His insight led to the equation that bears his name [1]:

$$
\frac{\partial f(\mathbf{r},\mathbf{v},t)}{\partial t}
 + \mathbf{v}\cdot\frac{\partial f(\mathbf{r},\mathbf{v},t)}{\partial \mathbf{r}}
 + \frac{\mathbf{F}}{m}\cdot\frac{\partial f(\mathbf{r},\mathbf{v},t)}{\partial \mathbf{v}}
 = \int d\mathbf{v}_1 \int d\Omega\,
   \bigl(f(\mathbf{r},\mathbf{v}',t)\,f(\mathbf{r},\mathbf{v}_1',t)
       - f(\mathbf{r},\mathbf{v},t)\,f(\mathbf{r},\mathbf{v}_1,t)\bigr)\,
   |\mathbf{v}-\mathbf{v}_1|\,\sigma(\mathbf{v}-\mathbf{v}_1).
\tag{1}
$$
The Boltzmann equation for hard spheres (1) accounts for all of the processes which change the particle distribution function. The advection term, $\mathbf{v}\cdot(\partial f/\partial \mathbf{r})$, accounts for the change in $f$ due to particles' velocities carrying them into and out of a given region of space around $\mathbf{r}$. The force term, $(\mathbf{F}/m)\cdot(\partial f/\partial \mathbf{v})$, accounts for the change in $f$ due to forces acting on particles of mass $m$, carrying them into and out of a given region of velocity space around $\mathbf{v}$. The terms on the right hand side represent the changes due to collisions. The first term on the right accounts for particles at $\mathbf{r}$ with velocities $\mathbf{v}'$ and $\mathbf{v}_1'$ which, upon collision, are scattered into a small volume of velocity phase-space around $\mathbf{v}$. The second term accounts for particles at $\mathbf{r}$ which, upon collision, are scattered out of this region of velocity space. The collision rate is given by $\sigma$ and is a function of the relative velocity. Though it provides the level of detail necessary to describe many important flows, the Boltzmann equation (1) has several features which make solving it extremely difficult. First, it is a nonlinear integro-differential equation; only in special cases has it been amenable to exact analytic solution. Second, the distribution function lives in a high-dimensional (position plus velocity) phase space, so the grid-based methods which work so well for partial differential equations in ordinary space become impractical. As a result, approximate numerical methods are required. Monte Carlo methods are ideally suited for such high dimensional problems. In the early 1960s, Graeme Bird developed a Monte Carlo technique to solve the Boltzmann equation. This method, now known as Direct Simulation Monte Carlo (DSMC), has been extraordinarily successful in aerospace applications and is also gaining popularity with computational scientists in many fields. A brief outline of DSMC is given here; for more comprehensive descriptions, see Refs. [2–4]. The DSMC method solves the Boltzmann equation by using a representative sample of particles drawn from the actual single particle distribution function. Each DSMC particle represents $N_e$ molecules in the original physical system. For flows of practical interest, typically $N_e \gg 1$. This approach allows the modeling of extremely large systems while using a computationally tractable number of particles, $N_{tot} \leq 10^8$, instead of a macroscopic number, $10^{23}$.
A DSMC simulation is set up in the following way. First, the spatial domain, boundary conditions, and initial conditions of the simulation are specified. The domain is then partitioned into cells, typically, though not always, of uniform size; these cells are later used in the collision phase of the algorithm. Particles are placed according to a density distribution specified by the initial conditions. To guarantee accuracy, the number of particles used in the simulation should not be too small, i.e., not fewer than about 20 particles per cell [5]. Along with its spatial location $\mathbf{r}_i$, each particle is also initialized with a velocity $\mathbf{v}_i$. If the system is in equilibrium, this velocity distribution is Maxwellian; however, the velocity distribution can be set to accommodate any flow. The state of the DSMC system is given by the positions and velocities of the particles, $\{\mathbf{r}_i, \mathbf{v}_i\}$, for $i = 1, \ldots, N$. The DSMC method simulates the dynamics of the single particle distribution using a two-step splitting algorithm. These steps are advection and collision, and they model the two physical processes at work in the Boltzmann equation: advection models the free streaming between collisions, and the collision step models the two-body collisions. Each advection–collision step simulates a time $\Delta t$.
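The overall structure of one such step can be sketched as follows. This is an illustrative fragment, not Bird's code: the one-dimensional cell sort and periodic boundaries are simplifying assumptions, and `collide_cell` stands in for the collision routine developed in Section 3.

import numpy as np

def dsmc_step(r, v, dt, box, cell_size, collide_cell, vr_max):
    """One advection-collision step of the DSMC splitting algorithm.
    Cells are slabs along x for simplicity; real codes use 2D/3D grids
    and handle wall interactions during advection (Section 2)."""
    r += v * dt                                   # advection: free streaming
    r %= box                                      # periodic boundaries
    ncell = int(box / cell_size)
    cell = np.minimum((r[:, 0] / cell_size).astype(int), ncell - 1)
    for c in range(ncell):                        # collision phase, cell by cell
        members = np.flatnonzero(cell == c)
        vr_max = collide_cell(v, members, vr_max)
    return vr_max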
2. Advection Phase
During the advection phase, all particle positions are changed from $\mathbf{r}_i$ to $\mathbf{r}_i + \mathbf{v}_i\Delta t$. When a particle strikes a boundary or interface, it responds according to the appropriate boundary condition. The time of impact is determined by tracing the straight-line trajectory from the initial location $\mathbf{r}_i$ to the point of impact, $\mathbf{r}_w$. The time of flight from the particle's initial position to the point of impact is $t_w = (\mathbf{r}_w - \mathbf{r}_i)\cdot\hat{\mathbf{n}}/(\mathbf{v}_i\cdot\hat{\mathbf{n}})$, where $\hat{\mathbf{n}}$ is the unit normal to the surface. After striking the surface, the particle rebounds with a new velocity, which depends on the boundary conditions, and then propagates freely for the remaining time $\Delta t - t_w$. If, in the remaining time, the same particle again strikes a wall, this process is repeated until all of the time in that step has been exhausted. DSMC can model several types of boundaries (for example, specular surfaces, periodic boundaries, and thermal walls). Upon striking a specular surface, the component of a particle's velocity normal to the surface is reversed. If a particle strikes a perfect thermal wall at temperature $T_w$, then all three components of the velocity are reset according to a biased Maxwellian distribution. The resulting component normal to the wall is distributed as

$$
P_\perp(v_\perp) = \frac{m}{kT_w}\, v_\perp\, e^{-m v_\perp^2/2kT_w}.
\tag{2}
$$
The individual parallel components are distributed as

$$
P_\parallel(v_\parallel) = \sqrt{\frac{m}{2\pi k T_w}}\; e^{-m v_\parallel^2/2kT_w},
\tag{3}
$$
where Tw is the wall temperature, m is the particle’s mass and k is Boltzmann’s constant. Along with the tangential velocity component generated by thermal equilibration with the wall, an additional velocity is required to account for any translational motion of the wall. The distribution (3) is given in the rest frame of the wall. Assume the x and y axes are parallel to the wall. If the wall is moving in the lab frame, for example in the x-direction with velocity u w , then u w is added to the x-component of velocity for particles scattering off the wall. The components of the velocity of a particle leaving a thermal wall are then
$$
v_x = \sqrt{\frac{kT_w}{m}}\, R_G + u_w
\tag{4}
$$
$$
v_y = \sqrt{\frac{kT_w}{m}}\, R_G'
\tag{5}
$$
$$
v_\perp = \sqrt{-\frac{2kT_w}{m}\ln R}
\tag{6}
$$
where $R$ is a uniformly distributed random number in $[0, 1)$ and $R_G$, $R_G'$ are Gaussian distributed random numbers with zero mean and unit variance. For most engineering applications, gas-surface scattering is far more complicated. Nevertheless, these scattering rates can usually be effectively modeled in the gas-surface scattering part of the algorithm [6].
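A direct transcription of Eqs. (4)–(6) is straightforward; the sketch below is an illustration (function and argument names are ours, not from the text), and it draws $R$ from $(0, 1]$ so the logarithm in Eq. (6) is always finite.

import numpy as np

def thermal_wall_velocity(kTw_over_m, u_w=0.0, rng=np.random.default_rng()):
    """Resample the velocity of a particle leaving a thermal wall at
    temperature T_w moving with speed u_w along x, per Eqs. (4)-(6)."""
    vx = np.sqrt(kTw_over_m) * rng.standard_normal() + u_w   # Eq. (4)
    vy = np.sqrt(kTw_over_m) * rng.standard_normal()         # Eq. (5)
    R = 1.0 - rng.random()                                   # uniform in (0, 1]
    v_perp = np.sqrt(-2.0 * kTw_over_m * np.log(R))          # Eq. (6)
    return vx, vy, v_perp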
3. Collision Phase
Interparticle collisions are treated independently of the advection phase. For this to be accurate, the interaction potential between molecules must be short-range. While many short-range interaction models exist, the DSMC algorithm is formulated in this article for a dilute gas of hard-sphere particles with diameter $\sigma$. When required for specific engineering applications, more realistic representations of the molecular interaction may be used [2, 8]. During the collision phase, some of the particles are selected at random to undergo collisions with each other. The selection process is determined by classical kinetic theory. While there are many ways to accomplish this, a simple and effective method is to sort the particles into spatial cells, the size of which should be less than a mean free path. Only particles in the same cell
are allowed to collide. As with the particles themselves, the collisions are only statistical surrogates of the actual collisions that would occur in the system. At each time-step, and within each cell, sets of collisions are generated. All pairs of particles in a cell are eligible to become collision partners. This eligibility is independent of the particles' positions within the cell; only the magnitude of the relative velocity between particles is used to determine their collision probability. Even particles that are moving away from each other may collide. The collision probability for a pair of hard spheres, $i$ and $j$, is given by

$$
P_{coll}(i, j) = \frac{2\,|\mathbf{v}_i - \mathbf{v}_j|}{N_c(N_c - 1)\,\langle v_{rel}\rangle}
\tag{7}
$$
where $\langle v_{rel}\rangle$ is the mean magnitude of the relative velocities of all pairs of particles in the cell, and $N_c$ is the number of particles in the cell. To implement this in an efficient manner, a pair of potential collision partners, $i$ and $j$, is selected at random from the particles within the cell. The pair collides if

$$
\frac{|\mathbf{v}_i - \mathbf{v}_j|}{v_{r,max}} > \Re,
\tag{8}
$$

where $v_{r,max}$ is the maximum relative speed in the cell and $\Re$ is a uniform random variable chosen from the interval $[0, 1)$. (Rather than determining $v_{r,max}$ exactly each time step, it is sufficient to simply update it every time a relative velocity is actually calculated.) If the pair does not collide, then another pair is selected, and the process repeats until the required number of candidate pairs $M_{cand}$ (explained below) in the cell has been handled. If the pair does collide, then the new velocities of the particles are determined by the following procedure. In an elastic hard sphere collision, linear momentum and energy are conserved. These conserved quantities fix the magnitude of the relative velocity and the center of mass velocity:

$$
v_r = |\mathbf{v}_i - \mathbf{v}_j| = |\mathbf{v}_i' - \mathbf{v}_j'| = v_r',
\tag{9}
$$

and

$$
\mathbf{v}_{cm} = \tfrac{1}{2}(\mathbf{v}_i + \mathbf{v}_j) = \tfrac{1}{2}(\mathbf{v}_i' + \mathbf{v}_j') = \mathbf{v}_{cm}',
\tag{10}
$$

where $\mathbf{v}_i'$ and $\mathbf{v}_j'$ are the post-collision velocities. In three dimensions, Eqs. (9) and (10) constrain four of the six degrees of freedom. The two remaining degrees of freedom are chosen at random; these correspond to the polar and azimuthal angles, $\theta$ and $\phi$, of the post-collision relative velocity,

$$
\mathbf{v}_r' = v_r\,[(\sin\theta\cos\phi)\,\hat{\mathbf{x}} + (\sin\theta\sin\phi)\,\hat{\mathbf{y}} + \cos\theta\,\hat{\mathbf{z}}].
\tag{11}
$$
For the hard sphere model, these angles are uniformly distributed over the unit sphere. Specifically, the azimuthal angle $\phi$ is uniformly distributed between 0 and $2\pi$, and the angle $\theta$ has the distribution

$$
P(\theta)\,d\theta = \tfrac{1}{2}\sin\theta\,d\theta.
\tag{12}
$$
Since only $\sin\theta$ and $\cos\theta$ are required, it is convenient to change variables from $\theta$ to $\zeta = \cos\theta$: $\zeta$ is chosen uniformly from $[-1, 1]$, and one sets $\cos\theta = \zeta$ and $\sin\theta = \sqrt{1 - \zeta^2}$. These values are used in (11). The post-collision velocities are then given by

$$
\mathbf{v}_i' = \mathbf{v}_{cm} + \tfrac{1}{2}\mathbf{v}_r', \qquad
\mathbf{v}_j' = \mathbf{v}_{cm} - \tfrac{1}{2}\mathbf{v}_r'.
\tag{13}
$$
The mean number of collisions that take place in a cell during a time-step is given by

$$
M_{coll} = \frac{N_c(N_c - 1)\,\pi\sigma^2\,\langle v_r\rangle\, N_e\,\Delta t}{2V_c},
\tag{14}
$$

where $V_c$ is the volume of the cell and $\langle v_r\rangle$ is the average relative speed in the cell. The costly computation of $\langle v_r\rangle$ can be avoided: since the ratio of total accepted to total candidate pairs is

$$
\frac{M_{coll}}{M_{cand}} = \frac{\langle v_r\rangle}{v_{r,max}},
\tag{15}
$$

using (14) and (15),

$$
M_{cand} = \frac{N_c(N_c - 1)\,\pi\sigma^2\, v_{r,max}\, N_e\,\Delta t}{2V_c}
\tag{16}
$$

gives the number of candidate pairs to select over a time step $\Delta t$. Note that $M_{coll}$ will, on average, equal the acceptance probability (8) multiplied by (16), and is independent of $v_{r,max}$. Setting $v_{r,max}$ too high still processes the same number of collisions on average, but the program is inefficient because the acceptance probability is low. This procedure selects collision pairs according to (7). Even if the value of $v_{r,max}$ is overestimated, the method is still correct, though less efficient because too many potential collisions are rejected. A better option is to make a guess which slightly overestimates $v_{r,max}$ [7]. To maintain accuracy while using the two-step, advection–collision algorithm, $\Delta t$ should be only a fraction of the mean free time. If too large a time-step is used, then particles move too far between collisions. On the other hand, if the spatial cells are too large, then collisions can occur between particles which are "too far" from each other. Time steps beyond a mean free time and spatial cells larger than a mean free path have the effect of artificially enhancing transport coefficients such as viscosity and thermal conductivity [17, 18].
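The acceptance–rejection collision procedure of Eqs. (8)–(13) can be sketched compactly. This fragment is an illustration (names are ours), with `M_cand` assumed to have been computed from Eq. (16) for the cell:

import numpy as np

def collide_cell(v, members, vr_max, M_cand=10, rng=np.random.default_rng()):
    """Hard-sphere DSMC collision phase for one cell: test M_cand candidate
    pairs with acceptance rule (8) and scatter accepted pairs isotropically
    using Eqs. (11)-(13)."""
    if len(members) < 2:
        return vr_max
    for _ in range(int(M_cand)):
        i, j = rng.choice(members, size=2, replace=False)
        vr = np.linalg.norm(v[i] - v[j])
        vr_max = max(vr_max, vr)              # cheap running update of vr_max
        if vr > rng.random() * vr_max:        # acceptance test, Eq. (8)
            v_cm = 0.5 * (v[i] + v[j])        # conserved, Eq. (10)
            zeta = rng.uniform(-1.0, 1.0)     # cos(theta), from Eq. (12)
            phi = rng.uniform(0.0, 2.0 * np.pi)
            s = np.sqrt(1.0 - zeta**2)        # sin(theta)
            v_rel = vr * np.array([s * np.cos(phi), s * np.sin(phi), zeta])  # Eq. (11)
            v[i] = v_cm + 0.5 * v_rel         # Eq. (13)
            v[j] = v_cm - 0.5 * v_rel
    return vr_max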
4. Data Analysis
In DSMC, as with other stochastic methods, most quantities of interest are computed as averages. For example, the instantaneous, fluctuating mass density $\tilde{\rho}(\mathbf{r}, t)$, momentum density $\tilde{\mathbf{p}}(\mathbf{r}, t)$, and energy density $\tilde{e}(\mathbf{r}, t)$ are given by

$$
\begin{pmatrix} \tilde{\rho}(\mathbf{r},t) \\ \tilde{\mathbf{p}}(\mathbf{r},t) \\ \tilde{e}(\mathbf{r},t) \end{pmatrix}
 = \frac{1}{V_s}\sum_i \begin{pmatrix} m \\ m\mathbf{v}_i \\ \tfrac{1}{2}m|\mathbf{v}_i|^2 \end{pmatrix}.
\tag{17}
$$
The sum is over the particles in a volume $V_s$ of space surrounding $\mathbf{r}$. Because it contains the details of the single particle distribution function, DSMC can provide far more information than is contained in the hydrodynamic variables above. However, this extra information comes at a price. As with other Monte Carlo-based methods, DSMC suffers from errors due to the finite number of particles used; convergence is typically $O(1/\sqrt{N})$. These errors can be reduced by using more particles in the simulation, but for some systems that can be prohibitive. For a detailed discussion of the statistical errors in DSMC and the techniques to estimate them in a variety of flow situations, refer to the recent work of Hadjiconstantinou et al. [9]. To reduce the fluctuations in the averaged quantities, a large number of particles is used, or, in the case of time-independent flows, statistics are gathered over a long run after the system has reached its steady state. For time-dependent problems, a statistical ensemble of realizations of the simulation is used. Physical quantities of interest can be obtained from these averages. From the description of the algorithm above it should be clear that DSMC is computationally very expensive and should not be used in situations where Navier–Stokes or Euler PDE solvers apply. To check whether DSMC is necessary, one should determine the Knudsen number $Kn$. This dimensionless parameter is defined as $Kn = \lambda/L$, where $L$ is the characteristic length scale of the physical system and $\lambda$ is the molecular mean free path (i.e., the average distance between successive collisions of a given molecule). While there is no clear dividing line, a useful rule of thumb is that DSMC should be used when $Kn > 1/10$. For a dilute gas, the mean free path $\lambda$ is given by

$$
\lambda = \frac{1}{\sqrt{2}\,\pi\sigma^2 n},
\tag{18}
$$
where $n$ is the number density and $\sigma$ is the effective diameter of the molecule. Air at atmospheric pressure has $\lambda \approx 50$ nm. In the upper atmosphere, however (e.g., above 100 km altitude), the mean free path is several meters. The Knudsen number for air flow through a nanoscale channel or around a meter-scale space vehicle can therefore easily exceed $Kn \approx 1$. For these cases
continuum hydrodynamics is not an option and DSMC should be used. Other, more detailed criteria can also be used [7].
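As a quick worked example of this rule of thumb (using representative textbook values for air rather than numbers from the text):

import numpy as np

sigma = 3.7e-10    # effective molecular diameter of air [m] (assumed value)
n = 2.5e25         # number density at ~1 atm and room temperature [1/m^3]
lam = 1.0 / (np.sqrt(2.0) * np.pi * sigma**2 * n)   # Eq. (18): ~6.6e-8 m
Kn = lam / 100e-9                                   # flow in a 100 nm channel
print(f"lambda = {lam:.2e} m, Kn = {Kn:.2f}")       # Kn ~ 0.7 > 1/10: use DSMC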
5. Discussion
Despite obvious similarities, key differences exist between DSMC and molecular dynamics. In molecular dynamics, the trajectory of every particle in the gas is computed from Newton's equations, given an empirically determined interparticle potential; each MD particle represents one atom or molecule. In DSMC, each particle represents $N_e$ atoms or molecules, where $N_e$ is on the order of 1/20 of the number of atoms/molecules in a cubic mean free path. Using MD to simulate one cubic micron of air at standard temperature and pressure requires integrating Newton's equations for approximately $10^{10}$ molecules for $10^4$ time steps to model one mean free time. With DSMC, only $10^5$ particles and approximately 10 time steps are required. The DSMC method is therefore an efficient alternative for simulating a dilute gas. The method can be viewed as a simplified molecular dynamics (though DSMC is several orders of magnitude faster). DSMC can also be considered a Monte Carlo method for solving the time-dependent nonlinear Boltzmann equation. Instead of exactly calculating collisions as in molecular dynamics, the DSMC method generates collisions stochastically, with scattering rates and post-collision velocity distributions determined from the kinetic theory of a dilute gas. Although DSMC simulations are not correct at the length scale of an atomic diameter, they are accurate at scales smaller than a mean free path. However, if more detail is required, then MD is the best option.
6. Outlook
Though it originated in the aerospace community, since the mid-1980s DSMC has been used in a variety of other areas which demand a kinetic level formulation. These include the study of nonequilibrium fluctuations [10], nanoscale fluid dynamics [11], and granular gases [13]. Originally, DSMC was confined to dilute gases. Several advances, however, such as the consistent Boltzmann algorithm (CBA) [8] and Enskog simulation Monte Carlo (ESMC) [12], have extended DSMC's reach to nonideal, dense gases. Among other areas, CBA has found applications in heavy ion dynamics [14]. Similar methods are also used in transport theories of condensed matter physics [15]. While the DSMC method has been quite successful in these applications, only within the last decade has it been put on a firm mathematical foundation.
Wagner [16], for example, proved that the method, in the limit of infinite particle number, has a deterministic evolution which solves an equation “close” to the Boltzmann equation. Subsequent work has shown that DSMC and its variants converge to a variety of kinetic equations. Other analytical work has determined the error incurred in DSMC by the use of a space and time discretization [17, 18]. Efforts have been made to improve the computational efficiency of DSMC for flows in which some spatial regions are hydrodynamic and others kinetic. Pareschi and Caflisch [19] have developed an implicit DSMC method which seamlessly interpolates between the kinetic and hydrodynamic scales. Another hybrid approach optimizes performance by using DSMC where required and then using Navier-Stokes or Euler in regions where allowed. The two methods are then coupled across an interface to provide information to each other [20, 21]. This is currently a rapidly growing field.
Acknowledgments

This document was prepared at LANL under the auspices of the Department of Energy, LA-UR 03-7358.
References [1] C. Cercignani, The Boltzmann Equation and its Applications, Springer, New York, 1988. [2] G.A. Bird, Molecular Gas Dynamics, Clarendon, Oxford, 1976; G.A. Bird, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Clarendon, Oxford, 1994. [3] A.L. Garcia, Numerical Methods for Physics, Prentice Hall, Englewood Cliffs, 1994. [4] E.S. Oran, C.K. Oh, and B.Z. Cybyk, Annu. Rev. Fluid Mech., 30, 403, 1998. [5] M. Fallavollita, D. Baganoff, and J. McDonald, J. Comput. Phys., 109, 30, 1993; G. Chen and I. Boyd, J. Comput. Phys., 126, 434, 1996. [6] A.L. Garcia and F. Baras, Proceedings of the Third Workshop on Modeling of Chemical Reaction Systems, Heidelberg, 1997 (CD-ROM only). [7] I. Boyd, G. Chen, and G. Candler, Phys. Fluids, 7, 210, 1995. [8] F. Alexander, A.L. Garcia, and B. Alder, Phys. Rev. Lett., 74, 5212, 1995; F. Alexander, A.L. Garcia, and B. Alder, in 25 Years of Non-Equilibrium Statistical Mechanics, J.J. Brey, J. Marco, J.M. Rubi, and M. San Miguel (eds.), Springer, Berlin, 1995; A. Frezzotti, A particle scheme for the numerical solution of the Enskog equation, Phys. Fluids, 9(5), 1329–1335, 1997. [9] N. Hadjiconstantinou, A. Garcia, M. Bazant, and G. He, J. Comput. Phys., 187, 274–297, 2003. [10] F. Baras, M.M. Mansour, A.L. Garcia, and M. Mareschal, J. Comput. Phys., 119, 94, 1995. [11] F.J. Alexander, A.L. Garcia, and B.J. Alder, Phys. Fluids, 6, 3854, 1994. [12] J.M. Montanero and A. Santos, Phys. Rev. E, 54, 438, 1996; J. M. Montanero and A. Santos, Phys. Fluids, 9, 2057, 1997.
[13] H.J. Herrmann and S. Luding, Continuum Mechanics and Thermodynamics, 10, 189, 1998; J. Javier Brey, F. Moreno, R. García-Rojo, and M.J. Ruiz-Montero, “Hydrodynamic Maxwell demon in granular systems,” Phys. Rev. E, 65, 011305, 2002. [14] G. Kortemeyer, F. Daffin, and W. Bauer, Phys. Lett. B, 374, 25, 1996. [15] C. Jacoboni and L. Reggiani, Rev. Mod. Phys., 55, 645, 1983. [16] W. Wagner, J. Stat. Phys., 66, 1011, 1992. [17] F.J. Alexander, A.L. Garcia, and B.J. Alder, Phys. Fluids, 10, 1540, 1998; Phys. Fluids, 12, 731, 2000. [18] N.G. Hadjiconstantinou, Phys. Fluids, 12, 2634, 2000. [19] L. Pareschi and R.E. Caflisch, J. Comput. Phys., 154, 90, 1999. [20] H.S. Wijesinghe and N.G. Hadjiconstantinou, “Hybrid atomistic-continuum formulations for multiscale hydrodynamics,” Article 8.8, this volume. [21] A.L. Garcia, J.B. Bell, W.Y. Crutchfield, and B.J. Alder, J. Comput. Phys., 154, 134, 1999.
8.8 HYBRID ATOMISTIC–CONTINUUM FORMULATIONS FOR MULTISCALE HYDRODYNAMICS

Hettithanthrige S. Wijesinghe and Nicolas G. Hadjiconstantinou
Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Hybrid atomistic-continuum formulations allow the simulation of complex hydrodynamic phenomena at the nano and micro scales without the prohibitive cost of a fully atomistic approach. Hybrid formulations typically employ a domain decomposition strategy whereby the atomistic model is limited to regions of the flow field where required and the continuum model is implemented side-by-side in the remainder of the domain within a single computational framework. This strategy assumes that non-continuum phenomena are localized and that coupling of the two descriptions can be achieved in a spatial region where both formulations are valid. In this article we review hybrid atomistic-continuum methods for multiscale hydrodynamic applications. Both liquid and gas formulations are considered. The choice of coupling method and its relation to the fluid physics as well as the differences between incompressible and compressible hybrid methods are discussed using illustrative examples.
1. Background
While the fabrication of MEMS devices has received much attention, transport mechanisms in the nano and micro scale environment are currently poorly understood. Furthermore, efficient and accurate design capabilities for nano and micro engineering components are also somewhat limited, since design tools based on continuum formulations are increasingly reaching their limit of applicability.
For gases, deviation from the classical Navier–Stokes behavior is typically quantified by the Knudsen number, $Kn = \lambda/L$, where $\lambda$ is the atomistic mean free path ($\lambda = 4.9\times10^{-8}$ m for air) and $L$ is a characteristic dimension. The Navier–Stokes formulation is found to be invalid for $Kn \gtrsim 0.1$. Ducts of width 100 nm or less, which are common in N/MEMS, correspond to Knudsen numbers of order 1 or above [1]. The Knudsen number for helium leak detection devices and mass spectrometers can reach values of up to 200 [2]. Material processing applications such as chemical vapor deposition and molecular beam epitaxy also involve high Knudsen number flow regimes [3]. The Navier–Stokes description also deteriorates in the presence of sharp gradients. One example comes from Navier–Stokes formulations for high Mach number shock waves, which are known to generate spurious post-shock oscillations [4, 5]. In such cases, a Knudsen number can be defined using the characteristic length scale of the gradient. A significant challenge therefore exists to develop accurate and efficient design tools for flow modeling at the nano and micro scales.

Liquids in nanoscale geometries or under high stress, and liquids at material interfaces, may also exhibit deviation from Navier–Stokes behavior [6]. Examples of problems which require modeling at the atomistic scale include the moving contact-line problem between two immiscible liquids [6], corner singularities, the breakup and merging of droplets [7], dynamic melting processes [8], crystal growth from a liquid phase, and polymer/colloid wetting near surfaces. Accurate modeling of wetting phenomena is of particular concern in predicting microchannel flows.

While great accuracy can be obtained by an atomistic formulation over a broader range of length scales, a substantial computational overhead is associated with this approach. To mitigate this cost, "hybrid" atomistic-continuum simulations have been proposed as a novel approach to model hydrodynamic flows across multiple length and time scales. These hybrid approaches limit atomistic models to the regions of the flow field where they are needed, and allow continuum models to be implemented in the remainder of the domain within a single computational framework. A hybrid method therefore allows the simulation of complex hydrodynamic phenomena which require modeling at the microscale without the prohibitive cost of a fully atomistic calculation. In what follows we provide an overview of this rapidly expanding field and discuss recent developments. We begin by discussing the challenges associated with hybrid formulations, namely the choice of the coupling method and the imposition of boundary conditions on atomistic simulations. We then illustrate hybrid methods for incompressible and compressible flows by describing recent archetypal approaches. Finally we discuss the effect of statistical fluctuations in the context of developing robust criteria for adaptive placement of the atomistic description.
2. Challenges
Over the years a fair number of hybrid simulation frameworks have been proposed, leading to some confusion over the relative merits and applicability of each approach. The original hybrid methods focused on dilute gases [9–12], which are arguably easier to deal with within a hybrid framework than dense fluids, mainly because boundary condition imposition is significantly easier in gases. The first hybrid methods for dense fluids appeared a few years later [13–16]. These initial attempts have led to a better understanding of the challenges associated with hybrid methods.

Coupling the continuum and atomistic formulations requires a region of space where information exchange takes place between the two descriptions. This information exchange between the two subdomains is typically in the form of state variables and/or hydrodynamic fluxes, with the latter typically measured across the matching interface. This process may be viewed as a boundary condition exchange between subdomains. In some cases the transfer of information is facilitated by an overlap region. The transfer of information from the atomistic subdomain to the continuum subdomain is fairly straightforward: a hydrodynamic field can be obtained from atomistic simulation data through averaging, as in the sketch below (see also the article "The Direct Simulation Monte Carlo: going beyond continuum hydrodynamics" in the Handbook). The relative error due to statistical sampling in atomistic hydrodynamic formulations was also recently characterized [17]. Imposition of the latter data as boundary conditions on the continuum method is also well understood and depends on the numerical method used (see the article "Finite Difference, Finite Element and Finite Volume Methods for Partial Differential Equations" of the Handbook). As discussed below, the most challenging aspect of the information exchange lies in imposing the hydrodynamic field obtained from the continuum subdomain onto the atomistic description, a process which is not well defined in the absence of the complete distribution function (hydrodynamic fields correspond to the first few moments of the distribution function). Thus, to a large extent, the two major issues in developing a hybrid method are the choice of a coupling method and the imposition of boundary conditions on the atomistic simulation. Generally speaking, these two can be viewed as decoupled. The coupling technique can be developed on the basis of matching two hydrodynamic descriptions that are compatible and equivalent over some region of space, and can thus be borrowed from the already existing and extensive continuum-based numerical methods literature. Boundary condition imposition can be posed as the general problem of imposing "macroscopic" boundary conditions on an atomistic simulation. In our opinion this is a very challenging problem that has not, in general, been resolved completely satisfactorily to date. Boundary condition imposition on the atomistic subdomain is discussed shortly.
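The following fragment illustrates the straightforward direction of the exchange, atomistic → continuum, by bin-averaging particle data into a one-dimensional hydrodynamic field. It is a sketch under our own naming conventions; a real hybrid code would accumulate many samples to control the statistical noise quantified in Ref. [17].

import numpy as np

def cell_averages(x, v, m, box, ncell):
    """Estimate density and mean velocity on a 1D grid from particle data;
    this is the averaging step that supplies boundary data to the
    continuum subdomain."""
    dx = box / ncell
    cell = np.minimum((x / dx).astype(int), ncell - 1)
    rho = np.zeros(ncell)
    u = np.zeros(ncell)
    for c in range(ncell):
        sel = cell == c
        rho[c] = m * sel.sum() / dx                  # mass per unit length
        u[c] = v[sel].mean() if sel.any() else 0.0   # bulk velocity
    return rho, u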
3. Developing a Hybrid Method

3.1. The Choice of Coupling Method
Coupling a continuum to an atomistic description is meaningful in a region where both can be presumed valid. In choosing a coupling method it is therefore convenient to draw upon the wealth of experience and large cadre of coupling methods that nearly 50 years of continuum computational fluid dynamics have brought us. Coupling methods for the compressible and incompressible formulations generally differ, since the two correspond to two different physical and mathematical limits. Faithful to their mathematical formulations, the compressible formulation lends itself naturally to time-explicit flux-based coupling, while incompressible formulations are typically coupled using either state properties (Dirichlet) or gradient information (Neumann). Given that the two formulations have different limits of applicability and physical regimes in which each is significantly more efficient than the other, care must be exercised when selecting the ingredients of the hybrid method. In other words, the choice of a coupling method and continuum subdomain formulation needs to be based on the degree to which compressibility effects are important in the problem of interest, and not on a preset notion that a particular coupling method is more appropriate than all others. The latter approach was recently pursued in a variety of studies which enforce the use of the compressible formulation on steady and essentially incompressible problems in order to achieve coupling by time-explicit flux matching. This approach is not recommended. On the contrary, for an efficient simulation method, similarly to the case of continuum solution methods, it is important to allow the flow physics to dictate the appropriate formulation, while the numerical implementation is chosen to cater to the particular requirements of the latter. Below, we expand on some of the considerations which influence the choice of coupling method, under the assumption that the hybrid method is applied to problems of practical interest and the continuum subdomain is therefore appropriately large. Our discussion focuses on timescale considerations, which are more complex than, but equally important as, the limitations resulting from lengthscale considerations, such as the size of the atomistic region(s). It is well known that the timestep for explicit integration of the compressible Navier–Stokes formulation, $\tau_c$, scales with the physical timestep of the problem, $\tau_x$, according to [18]

$$
\tau_c \leq \frac{M}{1 + M}\,\tau_x,
\tag{1}
$$

where $M$ is the Mach number. As the Mach number becomes small, we are faced with the classical stiffness problem whereby the numerical efficiency of the solution method suffers [18] due to the disparity of the time scales in the
system of governing equations. For this reason, when the Mach number is small, the incompressible formulation is used, which allows integration at the physical timestep τx. In the hybrid case matters are complicated by the introduction of the atomistic integration timestep, τm, which is at most of the order of τc in gases (if the discretization scale is O(λ)) and in most cases significantly smaller. Thus as the global domain of interest grows, the total integration time grows, and transient calculations in which the atomistic subdomain is explicitly integrated in time become more computationally expensive and eventually infeasible. The severity of this problem increases with decreasing Mach number and makes unsteady incompressible problems very computationally expensive. New integrative frameworks which coarse-grain the time integration of the atomistic subdomain are therefore required. Fortunately, for low speed steady problems implicit (iterative) methods exist which provide solutions without the need for explicit integration of the atomistic subdomain to the global problem steady state. One such implicit method, known as the Schwarz method, is discussed in this review. This method decouples the global evolution timescale from the atomistic evolution timescale (and timestep) by achieving convergence to the global problem steady state through an iteration between steady state solutions of the continuum and atomistic subdomains. Since the atomistic subdomain is small, explicit integration to its steady state is feasible. Although the steady assumption may appear restrictive, it is interesting to note that the majority of both compressible and incompressible test problems solved to date have been steady. A variety of other iterative methods may be suitable if they provide for timescale decoupling. The choice of the Schwarz coupling method using state variables, versus a flux matching approach, was motivated by the fact (as explained below) that state variables suffer from smaller statistical noise and are thus easier to prescribe on a continuum formulation. The above observations do not preclude the use of the compressible formulation in the continuum subdomain for low speed flows. In fact, preconditioning techniques which allow the use of the compressible formulation at very low Mach numbers have been developed [18]. Such a formulation can, in principle, be used to solve for the continuum subdomain while this is being coupled to the atomistic subdomain via an implicit (e.g., Schwarz) iteration. What should be avoided is a time-explicit compressible flux-matching coupling procedure for solving essentially incompressible steady state problems. The issues discussed above have not been very apparent to date because in the typical test problems published so far the continuum and atomistic subdomains are of the same size (and, of course, small). In this case the large cost of the atomistic subdomain masks the cost of the continuum subdomain, and typical evolution timescales (or times to steady state) are also small. It should not be forgotten, however, that hybrid methods make sense when the continuum subdomain is significantly larger than the atomistic subdomain.
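The stiffness expressed by Eq. (1) is easy to see numerically; the snippet below (an illustration we add here) evaluates the bound to show how the admissible explicit compressible timestep becomes a vanishing fraction of the physical timescale as the Mach number decreases.

```python
def compressible_timestep_fraction(mach):
    """Eq. (1): tau_c <= M/(1+M) * tau_x for explicit compressible integration."""
    return mach / (1.0 + mach)

for mach in (1.0, 0.1, 0.01, 0.001):
    print(f"M = {mach:7.3f}:  tau_c/tau_x <= {compressible_timestep_fraction(mach):.5f}")
```

At M = 0.001, roughly a thousand compressible timesteps are needed per physical timestep, which is the classical low-Mach stiffness problem.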
The stiffness resulting from a small timestep in the atomistic subdomain may be remedied by implicit timestepping methods [19]. However, flux-based coupling additionally suffers from adverse signal-to-noise ratios in connection with the averaging required for imposition of boundary conditions from the atomistic subdomain on the continuum subdomain. In the case of an ideal gas it has been shown for low speed flows [17] that, for the same number of samples, flux (shear stress, heat flux) averaging exhibits relative noise, Ef, which scales as

$$E_f \propto \frac{E_{sv}}{Kn} \qquad (2)$$

where Esv is the relative noise in the corresponding state variable (velocity, temperature), which varies as $1/\sqrt{\text{number of samples}}$. Here Kn is based on the characteristic lengthscale of the transport gradients. Since, by assumption, a continuum description is appropriate in the matching region, we expect Kn = λ/L ≪ 1. It thus appears that flux coupling will be significantly disadvantaged in this case, since 1/Kn² times the number of samples required by state-variable averaging is required to achieve comparable noise levels in the matching region. Statistical noise has important implications for hybrid methods which will be discussed throughout this paper. The effect of statistical noise becomes of critical importance in unsteady incompressible flows, which are discussed later.
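The practical consequence of Eq. (2) can be packaged as a one-line estimate; the helper below (our illustration) returns the approximate factor by which flux-based matching multiplies the sampling cost relative to state-variable matching.

```python
def extra_samples_for_flux_coupling(kn):
    """Eq. (2) implies E_f ~ E_sv / Kn; since relative noise decays as
    1/sqrt(samples), flux matching needs ~1/Kn^2 times more samples than
    state-variable matching for equal noise in the matching region."""
    return 1.0 / kn ** 2

for kn in (0.1, 0.05, 0.01):
    print(f"Kn = {kn}: ~{extra_samples_for_flux_coupling(kn):,.0f}x more samples")
```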
4. Boundary Condition Imposition
Consider the boundary, ∂Ω, of the atomistic region Ω on which we wish to impose a set of hydrodynamic (macroscopic) boundary conditions. Typical implementations require the use of particle reservoirs R (see Fig. 1) in which particle dynamics may be altered in such a way that the desired boundary conditions appear on ∂Ω; the hope is that the influence of the perturbed dynamics in the reservoir regions decays sufficiently fast and does not propagate into the region of interest, that is, that the relaxation distance for both the velocity distribution function and the fluid structure is small compared to the characteristic scale of Ω. Since ∂Ω represents the boundary with the continuum region, R extends into the continuum subdomain. Knowledge of the continuum solution in R is typically used to aid imposition of the above boundary conditions on ∂Ω. In a dilute gas, the non-equilibrium distribution function in the continuum limit has been characterized [20] and is known as the Chapman–Enskog distribution. Use of this distribution to impose boundary conditions on atomistic simulations of dilute gases results in a robust, accurate and theoretically elegant approach. Typical implementations [21] require the use of particle generation and initialization within R. Particles that move into Ω within the
Figure 1. Continuum to atomistic boundary condition imposition using reservoirs.
simulation timestep are added to the simulation whereas particles remaining in R are discarded. For liquids, both the particle velocity and the fluid structure distribution functions are important and need to be imposed. Unfortunately no theoretical results for these distributions exist. A related issue is that of domain termination; due to particle interactions, Ω (or, in the presence of a reservoir, R) needs to be terminated in a way that does not have a significant effect on the fluid state inside Ω. As a result, researchers have experimented with possible methods to impose boundary conditions. It is now known that, similarly to a dilute gas, use of a Maxwell–Boltzmann distribution for particle velocities leads to slip [14]. In [22] a Chapman–Enskog distribution is used to impose boundary conditions that generate a liquid shear flow. In this approach, particles crossing ∂Ω acquire velocities that are drawn from a Chapman–Enskog distribution parametrized by the local values of the required velocity and stress boundary condition. Although this approach was only tested for a Couette flow, it appears to give reasonable results (within atomistic fluctuations). Since in Couette flow no flow normal to ∂Ω exists, ∂Ω can be used as a symmetry boundary separating two back-to-back shear flows; this sidesteps the issue of domain termination. Boundary conditions on MD simulations can also be imposed through the method of constraint dynamics [13]. Although the approach in [13] did not allow hydrodynamic fluxes across the matching interface, the latter feature can be integrated into this approach with a suitable domain termination. In a different approach [16], external forces are used to impose boundary conditions. More specifically, the authors apply an external field with a magnitude such that the total force on the fluid particles in R is that required by momentum conservation in the coupling procedure.
The outer boundary of the reservoir region is terminated by using a force preventing particles from leaving the domain, together with an ad hoc weighting factor for the force distribution on particles. This weighting factor diverges as particles approach the edge of R, which prevents their escape and also ensures that new particles introduced in R move towards Ω. Particles introduced into the reservoir are given velocities drawn from a Maxwell–Boltzmann distribution, while a Langevin thermostat keeps the temperature constant. The method appears to be successful, although the non-unique choice of force fields and the Maxwell–Boltzmann distribution make it less theoretically pleasing. It is also not clear what the effect of these forces is on the local fluid state (it is well known that even in a dilute gas, gravity-driven flow [23] exhibits significant deviations from Navier–Stokes behavior), but this effect is probably small since the force fields act only in the reservoir region. The above approach was refined [24] by using a version of the Usher algorithm to insert particles in the energy landscape such that they have the desired specific energy, which is beneficial for imposing a desired energy current while eliminating the risk of particle overlap, at some computational cost. This approach, however, uses a Maxwell–Boltzmann distribution for the initial velocities of the inserted particles. Temperature gradients are imposed by a small number of thermostats placed in the direction of the gradient. Although no proof exists that the disturbance to the particle dynamics is small, it appears that this technique is successful at imposing boundary conditions with moderate error [24]. A method for terminating incompressible molecular dynamics simulations with a small effect on particle dynamics has been suggested and used [14]. This simply involves making the reservoir region fully periodic. In this manner, the boundary conditions on ∂Ω also impose a boundary value problem on R, where the inflow to Ω is the outflow from R. As R becomes bigger, the gradients in R become smaller and thus the flowfield in R has a small effect on the solution in Ω. The disadvantage of this method is the number of particles needed to fill R as R grows, especially in high dimensions. We believe that significant contributions can still be made by developing methods to impose boundary conditions in hydrodynamically consistent and, most importantly, rigorous ways.
4.1. Particle Generation in Dilute Gases Using the Chapman–Enskog Velocity Distribution Function
In the case of dilute gases the atomistic structure is not important and the gas is characterized by the single-particle distribution function. This relative simplicity has led to solutions of the governing Boltzmann equation [25, 26] in the Navier–Stokes limit. The resultant Chapman–Enskog solution [20, 25] can be used to impose boundary conditions in a robust and rigorous manner.
In what follows we illustrate this procedure using the direct simulation Monte Carlo (DSMC) as our dilute gas simulation method. DSMC is an efficient method for the simulation of dilute gases which solves the Boltzmann equation [27] using a splitting approach. The time evolution of the system is approximated by a sequence of discrete timesteps, Δt, in which particles undergo, successively, collisionless advection and collisions. An appropriate number of collisions is performed between randomly chosen particle pairs within small cells of linear size Δx. DSMC is discussed further in the article "The Direct Simulation Monte Carlo Method: going beyond continuum hydrodynamics" of the Handbook. In a typical hybrid implementation, particles are created in a reservoir region in which the continuum field to be imposed is known. Correct imposition of boundary conditions requires generation of particles with the correct single-particle distribution function, which includes the local value of the number density [21]. Current implementations [21, 28, 29] show that linear interpolation of the density gradient within the reservoir region provides sufficient accuracy. Generation of particles according to a linear density gradient can be achieved with a variety of methods, including acceptance–rejection schemes (a sketch is given below). In the next section we outline an approach for generating particle velocities from a Chapman–Enskog distribution parametrized by the required flow variables. After particles are created in the reservoir they move for a single DSMC timestep. Particles that enter the atomistic region are incorporated into the standard convection/collision routines of the DSMC algorithm. Particles that remain in the reservoir are discarded. Particles that leave the atomistic region are also deleted from the computation.
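A minimal sketch (ours; names illustrative) of reservoir particle generation with a linearly interpolated density, using acceptance–rejection for the positions:

```python
import numpy as np

rng = np.random.default_rng(1)

def reservoir_positions(n, x0, x1, rho0, rho1):
    """Draw n positions in the reservoir [x0, x1] with number density varying
    linearly from rho0 at x0 to rho1 at x1, via acceptance-rejection."""
    rho_max = max(rho0, rho1)
    accepted = []
    while len(accepted) < n:
        x = rng.uniform(x0, x1, 2 * (n - len(accepted)))   # trial positions
        rho = rho0 + (rho1 - rho0) * (x - x0) / (x1 - x0)  # local density
        accepted.extend(x[rng.uniform(0.0, rho_max, x.size) < rho])
    return np.array(accepted[:n])

positions = reservoir_positions(10_000, 0.0, 1.0, 1.0, 1.5)  # arbitrary units
```

The number of particles n would itself be chosen from the mean density of the overlying continuum cell, so that both the density and its gradient are respected.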
4.2. Generation of Particle Velocities Using the Chapman–Enskog Velocity Distribution
The Chapman–Enskog velocity distribution function f(C) can be written as [30]

$$f(\mathbf{C}) = f_0(\mathbf{C})\,\Phi(\mathbf{C}) \qquad (3)$$

where $\mathbf{C} = \mathbf{C}'/(2kT/m)^{1/2}$ is the normalized thermal velocity,

$$f_0(\mathbf{C}) = \frac{1}{\pi^{3/2}}\, e^{-C^2} \qquad (4)$$

and

$$\Phi(\mathbf{C}) = 1 + (q_x C_x + q_y C_y + q_z C_z)\left(\tfrac{2}{5}C^2 - 1\right) - 2\left(\tau_{xy} C_x C_y + \tau_{xz} C_x C_z + \tau_{yz} C_y C_z\right) - \tau_{xx}\left(C_x^2 - C_z^2\right) - \tau_{yy}\left(C_y^2 - C_z^2\right) \qquad (5)$$
with

$$q_i = -\frac{\kappa}{P}\left(\frac{2m}{kT}\right)^{1/2}\frac{\partial T}{\partial x_i} \qquad (6)$$

$$\tau_{ij} = \frac{\mu}{P}\left(\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} - \frac{2}{3}\,\delta_{ij}\frac{\partial v_k}{\partial x_k}\right) \qquad (7)$$
where $q_i$ and $\tau_{ij}$ are the dimensionless heat flux vector and stress tensor, respectively, with µ, κ, P and v = (vx, vy, vz) being the viscosity, thermal conductivity, pressure and mean fluid velocity. An acceptance–rejection scheme [30, 31] can be utilized to generate Chapman–Enskog distributed velocities. In this scheme an amplitude parameter A = 1 + 30B is first chosen, where B = max(|τij|, |qi|). Next a trial velocity Ctry is drawn from the Maxwell–Boltzmann equilibrium distribution function f0 given by Eq. (4). Note f0 is a normal (Gaussian) distribution that can be generated using standard numerical techniques [32]. The trial velocity Ctry is accepted if it satisfies A R ≤ Φ(Ctry), where R is a uniform deviate in [0, 1). Otherwise a new trial velocity Ctry is drawn. The final particle velocity is given by

$$\mathbf{C}' = (2kT/m)^{1/2}\,\mathbf{C}_{\rm try} + \mathbf{v} \qquad (8)$$
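This acceptance–rejection scheme translates almost line for line into code. The sketch below is our illustration (not the authors' implementation); q and tau are supplied as the dimensionless heat flux vector and stress tensor of Eqs. (6)-(7), and all other names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def chapman_enskog_velocity(T, v_mean, q, tau, m, k=1.380649e-23):
    """Draw one particle velocity from the Chapman-Enskog distribution,
    Eqs. (3)-(8), by acceptance-rejection (after Garcia & Alder [30]).
    q: dimensionless heat flux (3-vector); tau: dimensionless stress (3x3)."""
    B = max(np.abs(tau).max(), np.abs(q).max())
    A = 1.0 + 30.0 * B                       # amplitude bound for rejection
    while True:
        # f0 in Eq. (4): each normalized component is N(0, variance 1/2)
        C = rng.normal(0.0, np.sqrt(0.5), 3)
        C2 = np.dot(C, C)
        phi = (1.0 + np.dot(q, C) * (0.4 * C2 - 1.0)            # Eq. (5)
               - 2.0 * (tau[0, 1]*C[0]*C[1] + tau[0, 2]*C[0]*C[2]
                        + tau[1, 2]*C[1]*C[2])
               - tau[0, 0]*(C[0]**2 - C[2]**2)
               - tau[1, 1]*(C[1]**2 - C[2]**2))
        if A * rng.uniform() <= phi:                            # accept?
            return np.sqrt(2.0*k*T/m) * C + np.asarray(v_mean)  # Eq. (8)
```

Note that for q = 0 and τ = 0 the scheme reduces to plain Maxwell–Boltzmann sampling, with every trial accepted.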
5. Incompressible Formulations
Although in some cases compressibility may be important, a large number of applications are typically characterized by flows where use of the incompressible formulation results in a significantly more efficient approach [18]. As explained earlier, our definition of incompressible formulation is based on the flow physics and not on the numerical method used. Although in our example implementation below we have used a finite element discretization based on the incompressible formulation, we believe that a preconditioned compressible formulation [18] could also be used to solve the continuum subdomain problem if it could be successfully matched to the atomistic solution through a coupling method which takes into account the incompressible nature of the (low speed) problem to provide solution matching consistent with the flow physics. From the variety of methods proposed to date, it is becoming increasingly clear that almost any continuum–continuum coupling method can be used so long as it is properly extended to account for boundary condition imposition. The challenge thus lies more in choosing a method that best matches the physics of the problem of interest (as explained above) rather than developing general methods for large classes of problems. Below we illustrate a hybrid implementation appropriate for incompressible steady flow using the Schwarz alternating coupling method.
Before we proceed with our example, a subtle numerical issue associated with the incompressible formulation should be discussed. Due to inherent statistical fluctuations, boundary conditions obtained from the atomistic subdomain may lead to mass conservation discrepancies. Although this phenomenon is an artifact of the finite sampling, in the sense that if a sufficiently large (infinite) number of samples were taken the mean field obtained from the atomistic simulation would be appropriately incompressible, it is sufficient to cause a numerical instability in the continuum calculation. A simple correction that can be applied consists of removing the discrepancy in mass flux equally across all normal velocity components of the atomistic boundary data that are to be imposed on the continuum subdomain. If ∂Ω₁ is the portion of the continuum subdomain boundary ∂Ωφ that receives boundary data from the atomistic subdomain (∂Ωφ ⊇ ∂Ω₁), n is the unit outward normal vector to ∂Ωφ, and dS is a differential element of ∂Ωφ, the correction to the atomistic data v₁ on ∂Ω₁ can be written as

$$(\mathbf{v}_1\cdot\mathbf{n})_{\rm corrected} = \mathbf{v}_1\cdot\mathbf{n} - \frac{\oint_{\partial\Omega_\phi} \mathbf{v}_\phi\cdot\mathbf{n}\, dS}{\int_{\partial\Omega_1} dS} \qquad (9)$$
Tests with various problems [14, 15, 28] indicate that this simple approach is successful at removing the numerical instability.
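In discrete form the correction is essentially a one-liner; the sketch below (our illustration, with names of our choosing) subtracts the area-normalized flux discrepancy uniformly from the normal velocities supplied by the atomistic subdomain.

```python
import numpy as np

def correct_normal_velocities(v_n, face_areas, net_mass_outflux):
    """Discrete analogue of Eq. (9): remove the net (spurious) mass outflux
    through the continuum boundary uniformly from the normal velocity data
    received from the atomistic subdomain. v_n and face_areas refer to the
    boundary faces on which atomistic data are imposed."""
    return v_n - net_mass_outflux / face_areas.sum()

v_n = np.array([1.02, 0.98, 1.01, 0.99])  # noisy atomistic normal velocities
areas = np.full(4, 0.25)
print(correct_normal_velocities(v_n, areas, net_mass_outflux=0.01))
```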
5.1. The Schwarz Alternating Method for Steady Flows
The Schwarz method was originally proposed for molecular dynamics–continuum methods [14, 15], but it is equally applicable to DSMC–continuum hybrid methods [28, 33]. This approach was chosen because of its ability to couple different descriptions through Dirichlet boundary conditions (easier to impose on liquid atomistic simulations than flux conditions, because fluxes are non-local in liquid systems), and its ability to reach the solution steady state in an implicit manner which requires only steady solutions from each subdomain. The importance of the latter characteristic cannot be overemphasized; the implicit convergence in time through steady solutions guarantees the timescale decoupling that is necessary for the solution of macroscopic problems. The integration of atomistic trajectories at the atomistic timestep for total times corresponding to macroscopic evolution times is, and will for a long time be, infeasible, while integration of the small molecular region to its steady state solution is feasible. A continuum–continuum domain decomposition can be used to illustrate the Schwarz alternating method, as shown graphically in Figs. 2–4 (adapted from [34]), for the velocity in a simple, one-dimensional problem, a pressure driven Poiseuille flow. Starting with a zero guess for the solution in domain 2, the first steady solution in domain 1 can be obtained. This provides the first boundary condition for a steady solution in domain 2 (Fig. 2). The
Figure 2. Schematic illustrating the Schwarz alternating method for Poiseuille flow. Solution at the first Schwarz iteration. Adapted from [34].
Figure 3. Schematic illustrating the Schwarz alternating method for Poiseuille flow. Solution at the second Schwarz iteration. Adapted from [34].
Figure 4. Schematic illustrating the Schwarz alternating method for Poiseuille flow. Solution at the third Schwarz iteration. Adapted from [34].
new solution in domain 2 provides an updated second boundary condition for domain 1 (Fig. 3). This process is repeated until the two solutions are identical in the overlap region. As seen in Fig. 4, the solution across the complete domain rapidly approaches the steady state solution. This method is guaranteed to converge for elliptic problems [35]. The Schwarz method was recently applied [33] to the simulation of flow through micromachined filters. These filters have passages that are sufficiently small that they require an atomistic description for the simulation of the flow through them. Depending on the geometry and the number of filter stages, the authors have reported computational savings ranging from a factor of 2 to 100.
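The alternating iteration is easy to demonstrate with two overlapping continuum subdomains. The sketch below is our construction (discretization choices are illustrative): it reproduces the Poiseuille example of Figs. 2-4, exchanging Dirichlet data in the overlap until the subdomain solutions agree.

```python
import numpy as np

def solve_poiseuille(x, u_left, u_right, G=8.0):
    """Steady 1D Poiseuille subproblem u'' = -G with Dirichlet ends,
    discretized by second-order finite differences."""
    n, h = x.size, x[1] - x[0]
    A = (np.diag(-2.0 * np.ones(n - 2)) + np.diag(np.ones(n - 3), 1)
         + np.diag(np.ones(n - 3), -1))
    b = -G * h * h * np.ones(n - 2)
    b[0] -= u_left
    b[-1] -= u_right
    u = np.empty(n)
    u[0], u[-1] = u_left, u_right
    u[1:-1] = np.linalg.solve(A, b)
    return u

# Domain [0, 1] split into two overlapping subdomains; walls at x = 0 and 1.
x1, x2 = np.linspace(0.0, 0.6, 61), np.linspace(0.4, 1.0, 61)
bc1_right = 0.0                                  # zero initial guess (Fig. 2)
for _ in range(15):                              # Schwarz iterations
    u1 = solve_poiseuille(x1, 0.0, bc1_right)    # no-slip wall at x = 0
    u2 = solve_poiseuille(x2, np.interp(x2[0], x1, u1), 0.0)  # wall at x = 1
    bc1_right = np.interp(x1[-1], x2, u2)        # updated BC for domain 1
print(u1[-1], "vs exact", 8.0 * 0.6 * (1.0 - 0.6) / 2.0)  # u = G x(1-x)/2
```

In the hybrid method one of the two subproblem solves is replaced by an atomistic simulation run to its steady state, with the overlap-region data exchange unchanged.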
5.2. Driven Cavity Test Problem
In this section we solve the steady driven cavity problem using the Schwarz alternating method. The driven cavity problem is used here as a test problem for verification and illustration purposes. In fact, although wall effects might be important in small scale flows, and a hybrid method which treats only the regions close to the walls using the molecular approach may be an interesting problem, the formulation chosen here is such that no molecular effects are present. This is achieved by placing the molecular description in the center of the computational domain such that it is not in contact with any of the system
walls (see Fig. 5). The rationale is that the hybrid solution of this problem should reproduce the full Navier–Stokes solution, and thus the latter can be used as a benchmark result. In our formulation the continuum subdomain is described by the Navier–Stokes equations solved by finite element discretization. Standard Dirichlet velocity boundary conditions for a driven cavity problem were applied on the system walls, which in this implementation are captured by the continuum subdomain; the horizontal velocity component on the left, right and lower walls was held at zero, while on the upper wall it was set to 50 m/s. The vertical velocity component on all boundaries was set to zero. Boundary conditions from the atomistic domain are imposed on nodes that have been centered on DSMC cells (see Fig. 6). The pressure is scaled by setting the middle node on the lower boundary at atmospheric pressure (1.013 × 10⁵ Pa). Despite the relatively high flow velocity, the flow is essentially incompressible and isothermal.
Figure 5. Continuum and atomistic subdomains for Schwarz coupling for the two-dimensional driven cavity problem.
Figure 6. Boundary condition exchange. Only the bottom left corner of the matching region is shown for clarity. Particles are created with probability density proportional to the local number density.
The imposition of boundary conditions on the atomistic subdomain is facilitated by a particle reservoir, as shown in Fig. 6. Note that in this particular implementation the reservoir region also serves as part of the overlap region, thus reducing the overall cost of the molecular description. Particles are created at locations x, y within the reservoir with velocities Cx, Cy drawn from a Chapman–Enskog velocity distribution. The Chapman–Enskog distribution is generated, as explained above, by using the mean and gradient of velocities from the continuum solution; the number and spatial distribution of particles in the reservoir are chosen according to the overlying continuum cell mean density and density gradients. The rapid convergence of the Schwarz approach is demonstrated in Fig. 7. The continuum numerical solution is reached to within ±10% at the 3rd Schwarz iteration and to within ±2% at the 10th Schwarz iteration. The error estimate, which includes the effects of statistical noise [17] and discretization error due to finite timestep and cell size, is approximately 2.5%. Similar convergence is also observed for the velocity field in the vertical direction. The close agreement with the fully continuum results indicates that the Chapman–Enskog procedure is not only theoretically appropriate but also robust. Despite a Reynolds number of Re ≈ 1, the Schwarz method converges
Figure 7. Convergence of the horizontal velocity component along the Y = 0.425 × 10−6 m plane with successive Schwarz iterations.
with negligible error. This is in agreement with recent findings [36] which show that the Schwarz method is expected to converge for Re ∼ O(1).
5.3. Unsteady Formulations
Unsteady incompressible calculations are particularly challenging for two reasons. First, due to the low flow speeds involved and the associated large number of samples required, the computational cost of the atomistic subdomain simulation rises sharply. Second, because of Eq. (1) and the fact that τm ≪ τc (typically), explicit time integration to the time of interest is very expensive. Approaches which use explicit time coupling based on compressible flux-matching schemes have been proposed for these flows, but it is not at all clear that these approaches provide the best solution. First, they suffer from signal-to-noise problems more than state-variable based methods. Second, integration of the continuum subdomain using the compressible formulation for an incompressible problem becomes both expensive and inaccurate [18]. On the other hand, iterative methods require a number of re-evaluations of the molecular
solution to achieve convergence. This is an additional computational cost that is not shared by the time-explicit coupling, and leads to a situation whereby (for incompressible unsteady problems) the choice between a time-explicit flux-matching coupling formulation and an iterative (Schwarz-type) coupling formulation is not clear and may be problem dependent. An alternative approach would be the adaptation of non-iterative continuum–continuum coupling techniques which take into account the incompressible nature of the problem and avoid the use of flux matching, such as the coupling approach presented in O'Connell and Thompson [13]. We should also recall that, from Eq. (1), unless time coarse-graining techniques are developed, large, low-speed, unsteady problems are currently too expensive to be feasible by any method.
6. Compressible Formulations
As discussed above, consideration of the compressible equations of motion leads to hybrid methods which differ significantly from their incompressible counterparts. The hyperbolic nature of compressible flows means that steady state formulations typically do not offer a significant computational advantage, and as a result, explicit time integration is the preferred solution approach and flux matching is the preferred coupling method. Given that the characteristic evolution time, τh , scales with the system size, the largest problem that can be captured by a hybrid method is limited by the separation of scales between the atomistic integration time and τh . Local mesh refinement techniques [21, 29] minimize the regions of space that need to be integrated at small CFL timesteps (due to a fine mesh), such as the regions adjoining the atomistic subdomain. Implicit timestepping methods [19] can also be used to speed up the time integration of the continuum subdomain. Unfortunately, although both approaches enhance the computational efficiency of the continuum sub-problem, they do not alleviate the issues arising from the disparity between the atomistic timestep and the total integration time. Compressible hybrid continuum-DSMC approaches are popular because compressible behavior is often observed in gases. In these methods, locally refining the continuum solution cells to the size of DSMC cells leads to a particularly seamless hybrid formulation in which DSMC cells differ from the neighboring continuum cells only by the fact that they are inherently fluctuating. The DSMC timestep required for accurate solutions [37–39] is very similar to the CFL timestep of a compressible formulation, and thus a finite volume formulation can be used to couple the two descriptions (for finite volume methods see the article, “Finite Difference, Finite Element and Finite Volume Methods for Partial Differential Equations” in the Handbook). In such a method [9, 10, 40] the flux of mass, momentum and energy from DSMC
to the continuum domain can be used directly for finite volume integration. Going from the continuum solution to DSMC requires the use of reservoirs. A DSMC reservoir extending into the continuum subdomain is attached at the end of the DSMC subdomain and initialized using the overlying continuum field properties. Flux of mass, momentum and energy is then provided by the particles entering the DSMC subdomain from the reservoir. The particles leaving the DSMC subdomain to the reservoir are discarded (after their contribution to the mass, momentum and energy flux to the continuum is recorded). Another characteristic inherent to compressible formulations is the possibility of describing parts of the domain by the Euler equations of motion [29]. In that case, consistent coupling to the atomistic formulation can be performed using a Maxwell–Boltzmann distribution [21]. It has been shown [41] that explicit time-dependent flux-based formulations preserve the fluctuating nature of the atomistic description within the atomistic regions but the fluctuation amplitude decays rapidly within the continuum regions; correct fluctuation spectra can be obtained in the entire domain by solving a fluctuating hydrodynamics formulation [42] in the continuum subdomain. Below we discuss a particular hybrid implementation to illustrate atomistic–continuum coupling in the compressible limit. We would like to emphasize again that a variety of methods can be used, although the compressible formulation is particularly well suited to flux matching. The method illustrated here is an extended version of the original Adaptive Mesh and Algorithm Refinement (AMAR) method [21]. This method was chosen both because it is the current state of the art in compressible fully adaptive hybrid methods and because it illustrates how existing continuum multiscale techniques can be used directly for atomistic–continuum coupling with minimum modification.
6.1. Fully Adaptive Mesh and Algorithm Refinement for a Dilute Gas
The compressible adaptive mesh and algorithm refinement formulation of Garcia et al. [21], referred to as AMAR, pioneered the use of mesh refinement as a natural framework for the introduction of the atomistic description in a hybrid formulation. In AMAR the typical continuum mesh refinement capabilities are supplemented by an algorithmic refinement (continuum to atomistic) based on continuum breakdown criteria. This seamless transition is both theoretically and practically very appealing. In what follows we briefly discuss a recently developed [29] fully adaptive AMAR method. In this method DSMC provides an atomistic description of the
flow while the compressible two-fluid Euler equations serve as the continuum-scale model. Consider the Euler equations in conservative integral form

$$\frac{d}{dt}\int_{\Omega_\phi} U\, dV + \oint_{\partial\Omega_\phi} \mathbf{F}\cdot\hat{\mathbf{n}}\, dS = 0 \qquad (10)$$

where

$$U = \begin{pmatrix} \rho \\ p_x \\ p_y \\ p_z \\ e \\ \rho c \end{pmatrix}; \qquad \mathbf{F}^x = \begin{pmatrix} \rho u_x \\ \rho u_x^2 + P \\ \rho u_x u_y \\ \rho u_x u_z \\ (e + P) u_x \\ \rho c u_x \end{pmatrix} \qquad (11)$$
Only the x-direction components of the flux terms are listed here; other directions are similar. A two-species gas is assumed with the mass concentrations being c and (1 − c). Discrete time integration is achieved by using a finite volume approximation to Eq. (10). This yields a conservative finite difference expression with $U_{ijk}^n$ appearing as a cell-centered quantity at each time level and $F_{i+1/2,j,k}^{x,n+1/2}$ located at faces between cells at half-time levels. A second-order version of an unsplit Godunov scheme is used to approximate the fluxes [43–45]. Time stepping on an AMR grid hierarchy involves interleaving time steps on individual levels [46]. Each level has its own spatial grid resolution and timestep (typically constrained by a CFL condition). The key to achieving a conservative AMR algorithm is to define a discretization for Eq. (10) that holds on every region of the grid hierarchy. In particular, the discrete cell volume integrals of U and the discrete cell face integrals of F must match on the locally-refined AMR grid. Thus, integration of a level involves two steps: solution advance and solution synchronization with other levels. Synchronizing the solution across levels assumes that fine grid values are more accurate than coarse grid values. So, coarse values of U are replaced by suitable cell volume averages of finer U data where levels overlap, and discrete fine flux integrals replace coarse fluxes at coarse-fine grid boundaries. Although the solution is computed differently in overlapping cells on different levels as each level is advanced initially, the synchronization procedure enforces conservation over the entire AMR grid hierarchy.
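As a schematic of the fine-to-coarse part of this synchronization (our one-dimensional illustration, not the actual AMR library code), coarse cells overlapping a finer level are reset to conservative averages of the fine data:

```python
import numpy as np

def average_down(u_fine, r):
    """Conservative fine-to-coarse averaging for refinement ratio r (1D):
    each coarse cell value becomes the mean of its r overlying fine cells."""
    return u_fine.reshape(-1, r).mean(axis=1)

u_fine = np.array([1.0, 1.2, 2.0, 2.2, 3.0, 3.2])
print(average_down(u_fine, 2))   # -> [1.1, 2.1, 3.1]
```

The companion coarse-fine flux replacement is sketched later, in the discussion of refluxing.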
6.2. Details of Coupling
During time integration of continuum grid levels, fluxes computed at each cell face are used to advance the solution U (Fig. 8b). Continuum values are
Figure 8. Outline of AMAR hybrid: (a) Beginning of a time step; (b) Advance the continuum grid; (c) Create buffer particles; (d) Advance DSMC particles; (e) Refluxing; (f) Reset overlying continuum grid. Adapted from [29].
advanced using a time increment Δtc appropriate for each level, including those that overlay the DSMC region. When the particle level is integrated, it is advanced to the new time on the finest continuum level using a sequence of particle time steps, Δtp. The relative magnitude of Δtp to the finest continuum grid Δtc depends on the finest continuum grid spacing Δx (typically a few λ) and the particle mean collision time. Euler solution information is passed to the particles via buffer (reservoir) cells surrounding the DSMC region. At the beginning of each DSMC integration step, particles are created in the buffer cells using the continuum hydrodynamic values (ρ, u, T) and their gradients (Fig. 8c), in a manner analogous to the incompressible case discussed above and following the guidelines of the section on particle generation in dilute gases. Since the continuum solution is advanced first, these values are time interpolated between continuum time steps for the sequence of DSMC time steps needed to reach the new continuum solution time. DSMC buffer cells are one mean free path wide; thus, the time step Δtp is constrained so that it is extremely improbable that a particle will travel further than one mean free path in a single time step. The particle velocities are drawn from an appropriate distribution for the continuum solver, such as the Chapman–Enskog distribution when coupling to a Navier–Stokes description and a Maxwell–Boltzmann distribution when coupling to an Euler description. During each DSMC time integration step, all particles are moved, including those in the buffer regions (Fig. 8d). A particle that crosses the interface
between continuum and DSMC regions will eventually contribute to the flux at the corresponding continuum cell face during the synchronization of the DSMC level with the finest continuum level. After moving particles, those residing in buffer regions are discarded. Then, collisions among the remaining particles are evaluated and new particle velocities are computed. After the DSMC region has advanced over an entire continuum grid timestep, the continuum and DSMC solutions are synchronized in a manner analogous to the AMR level synchronization process described earlier. First, the continuum values in each cell overlaying the DSMC region interior are set to the conservative averages of data from the particles within the continuum grid cell region (Fig. 8e). Second, the continuum solution in cells adjacent to the DSMC region is recomputed using a "refluxing" process (Fig. 8f). That is, a flux correction is computed using a space and time integral of particle flux data,

$$\delta F = -A\, F^{n+1/2} + \sum_{\rm particles} F_p \qquad (12)$$
The sum represents the flux of the conserved quantities carried by particles passing through the continuum cell face during the DSMC updates. Finally,

$$U^{n+1} = \tilde{U}^{n+1} + \frac{\Delta t_c}{\Delta x\, \Delta y\, \Delta z}\,\delta F \qquad (13)$$
is used to update the conserved quantities on the continuum grid, where $\tilde{U}^{n+1}$ is the coarse grid solution before computing the flux correction. Note, multiple DSMC parallelepiped regions (i.e., patches) are coupled by copying particles from patch interiors to buffer regions of adjacent DSMC patches (see Fig. 9). That is, particles in the interior of one patch supply boundary values (by acting as a reservoir) for adjacent particle patches. After copying particles into buffer regions, each DSMC patch may be integrated independently, in the same fashion that different patches in a conventional AMR problem are treated after exchanging boundary data. In summary, the coupling between the continuum and DSMC methods is performed in three operations. First, continuum solution values are interpolated to create particles in DSMC buffer cells before each DSMC step. Second, conserved quantities in each continuum cell overlaying the DSMC region are replaced by averages over particles in the same region. Third, fluxes recorded when particles cross the DSMC interface are used to correct the continuum solution in cells adjacent to the DSMC region (a sketch of this step is given below). This coupling procedure makes the DSMC region appear as any other level in the AMR grid hierarchy. Figure 10 shows the adaptive tracking of a shockwave of Mach number 10 used as a validation test for this method. Density gradient based mesh refinement ensures the DSMC region tracks the shock front accurately. Furthermore, as shown in Fig. 11, the density profile of the shock wave remains smooth and
Figure 9. Multiple DSMC regions are coupled by copying particles from one DSMC region (upper left) to the buffer region of an adjacent DSMC region (lower right). After copying, regions are integrated independently over the same time increment. Adapted from Wijesinghe et al. [29].
Figure 10. Moving Mach 10 shock wave through Argon. The AMAR algorithm tracks the shock by adaptively moving the DSMC region with the shock front. Note, dark Euler region shading corresponds to density = 0.00178 g/cm³, light Euler region shading corresponds to density = 0.00691 g/cm³.
Figure 11. Moving Mach 10 shock wave through Argon. The AMAR profile (dots) is compared with the analytical time evolution of the initial discontinuity (lines). τm is the mean collision time.
is devoid of oscillations that are known to plague traditional shock capturing schemes [4, 5]. Further details of the implementation using the Structured Adaptive Mesh Refinement Application Infrastructure (SAMRAI) developed at Lawrence Livermore National Laboratory [47] can be found in [29].
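The refluxing step described above, Eqs. (12)-(13), admits an almost literal transcription; the sketch below is our illustration for a single continuum cell face (bookkeeping and units deliberately simplified).

```python
def reflux_update(U_tilde, F_coarse, particle_flux_sum, area, dt_c, dx, dy, dz):
    """Eqs. (12)-(13): replace the coarse flux through a cell face adjoining
    the DSMC region by the flux actually carried by crossing particles.
    delta_F = -A * F^{n+1/2} + sum_p F_p  (Eq. 12)
    U^{n+1} = U~^{n+1} + dt_c * delta_F / (dx*dy*dz)  (Eq. 13)"""
    delta_F = -area * F_coarse + particle_flux_sum
    return U_tilde + dt_c * delta_F / (dx * dy * dz)

# Illustrative call for one conserved quantity in one cell:
U_new = reflux_update(U_tilde=1.0, F_coarse=0.5, particle_flux_sum=0.48,
                      area=1.0, dt_c=1e-3, dx=1.0, dy=1.0, dz=1.0)
```

In an actual implementation `particle_flux_sum` is accumulated over all DSMC substeps within the continuum timestep, once per conserved quantity (mass, momentum components, energy, concentration).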
7. Refinement Criteria
The AMAR scheme discussed above allows grid and algorithm refinement based on any combination of flow variables and their gradients. Density gradient based refinement has been found to be generally robust and reliable. However, refinement may be triggered by any number of user defined criteria. For example, concentration gradients or concentration values within some interval are also effective refinement criteria, especially for multispecies flows. In the AMAR formulation, refinement is triggered by spatial gradients exceeding user defined tolerances. This approach follows from the continuum breakdown parameter method [48]. Due to spontaneous stochastic fluctuations in atomistic computations, it is important to track gradients in a manner that does not allow the fluctuations
to trigger unnecessary refinement and excessively large atomistic regions. Let us consider a dilute gas for simplicity and the gas density as an example. For an ideal gas under equilibrium conditions, the number of particles in a given volume is Poisson distributed; the standard deviation in the normalized density gradient perceived by the calculation at cell i is

$$\frac{\sqrt{\langle (d\rho/dx)^2\rangle}}{\rho} \approx \frac{\sqrt{\langle (N_{i+1} - N_i)^2\rangle}}{\Delta x\, N_i} = \frac{\sqrt{2}}{\Delta x\,\sqrt{N}} \qquad (14)$$
where N is the number of particles in a cell where macroscopic properties are defined. The use of equilibrium fluctuations is sufficiently accurate as long as the deviation from equilibrium is not too large [17]. The fluid density fluctuation can thus only be reduced by increasing the number of simulation particles. This has consequences for the use of density gradient tolerances Rρ, whose value, as a result, must be based on the number of particles used in the atomistic subdomain. Let us illustrate this through an example. Consider the domain geometry shown in Fig. 12 where an atomistic region is in contact with a continuum region. Let the gas be in equilibrium. As stated above, the effect of non-equilibrium on fluctuations is small. In this problem, grid refinement occurs when the density gradient at the interface between the two descriptions exceeds a normalized threshold,

$$R_\rho < \frac{2\lambda}{\rho}\frac{d\rho}{dx} \qquad (15)$$
After such a "trigger" event the atomistic region grows by a single continuum cell width. Let us assume that we would like to estimate the value of the
Figure 12. 3D AMAR computational domain for investigation of tolerance parameter variation with number of particles in DSMC cells. From [29].
refinement threshold such that a given trigger rate, say 5–10%, is achieved. The interpretation of this trigger rate is that there is a probability of 5–10% of observing spurious growth of the atomistic subdomain due to density fluctuations. Following [29] we show how the trigger rate can be related to the number of particles per cell used in the calculation. For the geometry considered in the above test problem, each continuum cell consists of 8 DSMC cells and hence effectively the contribution of 8 × N particles is averaged to determine the density gradient between continuum cells. Applying Eq. (14) to these continuum cells we obtain

$$\sigma = \frac{\sqrt{\langle (d\rho/dx)^2\rangle_c}}{\rho} \approx \frac{1}{2\Delta x\,\sqrt{N}} \qquad (16)$$
Note that we are assuming that the fluctuation of the continuum cells across from the atomistic–continuum interface is approximately the same as that in the atomistic region. This was shown to be the case for the diffusion equation and a random walk model [41], and has been verified for the Euler–DSMC system [29] (see Fig. 14). This allows the use of Eq. (14), which was derived assuming two atomistic cells. Note that the observed trigger event is a composite of a large number of possible density gradient fluctuations that could exceed Rρ; gradients across all possible nearest neighbor cells, next-to-nearest neighbor cells and diagonally-nearest neighbor cells are all individually evaluated by the refinement routines and checked against Rρ. For a 10% trigger rate (or equivalent probability of trigger) the probability of an individual cell having a density fluctuation exceeding Rρ can be estimated as O(0.1/100) by observing that:
1. since the trigger event is rare, probabilities can be approximated as additive;
2. for the geometry considered, there are ≈ 300 nearest neighbor, next-nearest neighbor and diagonal cells that can trigger refinement; and
3. the rapid decay of the Gaussian distribution ensures that the decreasing probability (O(0.1/100) ∼ O(0.001)) of a single event does not significantly alter the corresponding confidence interval, and thus an exact enumeration of all possible trigger pairs with correct weighting factors is not necessary.
Our probability estimate of O(0.001) suggests that our confidence interval is 3σ–4σ. This is verified in Fig. 13. Smaller trigger rates can be achieved by increasing Rρ, that is, by increasing the number of particles per cell.
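This sizing argument can be packaged as a small helper; the sketch below is our illustration (parameter names are ours): the tolerance Rρ is placed a chosen number of standard deviations, Eq. (16), above the equilibrium noise floor so that fluctuations alone rarely trigger refinement.

```python
import math

def gradient_tolerance(n_per_cell, dx_in_mfp, n_sigma=3.5):
    """Refinement tolerance set n_sigma standard deviations above the
    equilibrium noise floor of the measured normalized density gradient,
    sigma = 1/(2*dx*sqrt(N)) from Eq. (16); dx is in mean free paths."""
    sigma = 1.0 / (2.0 * dx_in_mfp * math.sqrt(n_per_cell))
    return n_sigma * sigma

for n in (10, 40, 160):
    print(f"N = {n:4d}: R_rho ~ {gradient_tolerance(n, dx_in_mfp=2.0):.4f}")
```

Quadrupling the number of particles per cell halves the noise floor, consistent with the trend of Fig. 13.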
Figure 13. Variation of density gradient tolerance with number of DSMC particles. From [29].
Figure 14. Average density for stationary fluid Euler–DSMC hybrid simulation with 80 particles per cubic mean free path. Error bars give one standard deviation over 10 samples. From [29].
8. Outlook
Although hybrid methods provide significant savings by limiting atomistic solutions only to the regions where they are needed, solution of time-evolving problems which span a large range of timescales is still not possible if the atomistic subdomain, however small, needs to be integrated for the total time of interest. New frameworks are therefore required which allow timescale decoupling or coarse-grained time evolution of atomistic simulations. Significant computational savings can be obtained by using the incompressible formulation, when appropriate, for steady problems. Neglect of these simplifications can lead to a problem that is simply intractable when the continuum subdomain is appropriately large. It is interesting to note that, when a hybrid method was used to solve a problem of practical interest [33] while providing computational savings, the Schwarz method was preferred because it provides a steady solution framework with timescale decoupling. For dilute gases the Chapman–Enskog distribution provides a robust and accurate method for imposing boundary conditions. Further work is required for the development of similar frameworks for liquids.
Acknowledgments The authors wish to thank R. Hornung and A.L. Garcia for help with the computations and valuable comments and discussions, and A.T. Patera and B.J. Alder for helpful comments and discussions. This work was supported in part by the Center for Computational Engineering, and the Center for Advanced Scientific Computing, Lawrence Livermore National Laboratory, US Department of Energy, W-7405-ENG-48. The authors also acknowledge the financial support from the University of Singapore through the Singapore-MIT alliance.
References [1] A. Beskok and G.E. Karniadakis, “A model for flows in channels, pipes and ducts at micro and nano scales,” Microscale Thermophys. Eng., 3, 43–77, 1999. [2] S.A. Tison, “Experimental data and theoretical modeling of gas flows through metal capillary leaks,” Vacuum, 44, 1171–1175, 1993. [3] D.G. Coronell and K.F. Jensen, “Analysis of transition regime flows in low pressure CVD reactors using the direct simulation Monte Carlo method,” J. Electrochem. Soc., 139, 2264–2273, 1992. [4] M. Arora and P. L. Roe, “On postshock oscillations due to shock capturing schemes in unsteady flows,” J. Comput. Phys., 130, 25, 1997. [5] P.R. Woodward and P. Colella, “The numerical simulation of two-dimensional fluid flow with strong shocks,” J. Comput. Phys., 54, 115, 1984.
[6] J. Koplik and J.R. Banavar, "Continuum deductions from molecular hydrodynamics," Annu. Rev. Fluid Mech., 27, 257–292, 1995. [7] M.P. Brenner, X.D. Shi, and S.R. Nagel, "Iterated instabilities during droplet fission," Phys. Rev. Lett., 73, 3391–3394, 1994. [8] P.A. Thompson and M.O. Robbins, "Origin of stick–slip motion in boundary lubrication," Science, 250, 792–794, 1990. [9] D.C. Wadsworth and D.A. Erwin, "One-dimensional hybrid continuum/particle simulation approach for rarefied hypersonic flows," AIAA Paper 90-1690, 1990. [10] D.C. Wadsworth and D.A. Erwin, "Two-dimensional hybrid continuum/particle simulation approach for rarefied hypersonic flows," AIAA Paper 92-2975, 1992. [11] J. Eggers and A. Beylich, "New algorithms for application in the direct simulation Monte Carlo method," Prog. Astronaut. Aeron., 159, 166–173, 1994. [12] D. Hash and H. Hassan, "A hybrid DSMC–Navier Stokes solver," AIAA Paper 95-0410, 1995. [13] S.T. O'Connell and P. Thompson, "Molecular dynamics-continuum hybrid computations: A tool for studying complex fluid flows," Phys. Rev. E, 52, R5792–R5795, 1995. [14] N.G. Hadjiconstantinou and A.T. Patera, "Heterogeneous atomistic-continuum representations for dense fluid systems," Int. J. Mod. Phys. C, 8, 967–976, 1997. [15] N.G. Hadjiconstantinou, "Hybrid atomistic-continuum formulations and the moving contact-line problem," J. Comput. Phys., 154, 245–265, 1999. [16] E.G. Flekkoy, G. Wagner, and J. Feder, "Hybrid model for combined particle and continuum dynamics," Europhys. Lett., 52, 271–276, 2000. [17] N.G. Hadjiconstantinou, A.L. Garcia, M.Z. Bazant, and G. He, "Statistical error in particle simulations of hydrodynamic phenomena," J. Comput. Phys., 187, 274–297, 2003. [18] P. Wesseling, Principles of Computational Fluid Dynamics, Springer, 2001. [19] X. Yuan and H. Daiguji, "A specially combined lower-upper factored implicit scheme for three-dimensional compressible Navier-Stokes equations," Comput. Fluids, 30, 339–363, 2001. [20] S. Chapman and T.G. Cowling, The Mathematical Theory of Non-uniform Gases, Cambridge University Press, 1970. [21] A.L. Garcia, J.B. Bell, W.Y. Crutchfield et al., "Adaptive mesh and algorithm refinement using direct simulation Monte Carlo," J. Comput. Phys., 154, 134, 1999. [22] J. Li, D. Liao and S. Yip, "Nearly exact solution for coupled continuum/MD fluid simulation," J. Comput. Aided Mater. Design, 6, 95–102, 1999. [23] M.M. Mansour, F. Baras, and A.L. Garcia, "On the validity of hydrodynamics in plane Poiseuille flows," Physica A, 240, 255–267, 1997. [24] R. Delgado-Buscalioni and P.V. Coveney, "Continuum-particle hybrid coupling for mass, momentum and energy transfers in unsteady fluid flow," Phys. Rev. E, 67(4), 2003. [25] C. Cercignani, The Boltzmann Equation and its Applications, Springer-Verlag, New York, 1988. [26] G.A. Bird, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Clarendon Press, Oxford, 1994. [27] W. Wagner, "A convergence proof for Bird's direct simulation Monte Carlo method for the Boltzmann equation," J. Statist. Phys., 66, 1011, 1992. [28] H.S. Wijesinghe and N.G. Hadjiconstantinou, "A hybrid continuum-atomistic scheme for viscous incompressible flow," In: Proceedings of the 23rd International Symposium on Rarefied Gas Dynamics, 907–914, Whistler, British Columbia, 2002.
[29] H.S. Wijesinghe, R. Hornung, A.L. Garcia et al., "Three-dimensional hybrid continuum-atomistic simulations for multiscale hydrodynamics," ASME J. Fluids Eng., 126, 768–777, 2004. [30] A.L. Garcia and B.J. Alder, "Generation of the Chapman-Enskog distribution," J. Comput. Phys., 140, 66, 1998. [31] L. Devroye, "Non-uniform random variate generation," In: A.L. Garcia (ed.), Numerical Methods for Physics, Prentice Hall, New Jersey, 1986. [32] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.A. Vetterling, Numerical Recipes in Fortran, Cambridge University Press, 1992. [33] O. Aktas and N.R. Aluru, "A combined continuum/DSMC technique for multiscale analysis of microfluidic filters," J. Comput. Phys., 178, 342–372, 2002. [34] N.G. Hadjiconstantinou, Hybrid Atomistic-Continuum Formulations and the Moving Contact Line Problem, Ph.D. thesis, Mechanical Engineering Department, Massachusetts Institute of Technology, Cambridge, Massachusetts, 1998. [35] P.L. Lions, "On the Schwarz alternating method. I," In: R. Glowinski, G. Golub, G. Meurant, and J. Periaux (eds.), First International Symposium on Domain Decomposition Methods for Partial Differential Equations, pp. 1–42, SIAM, Philadelphia, 1988. [36] S.H. Liu, "On Schwarz alternating methods for the incompressible Navier-Stokes equations," SIAM J. Sci. Comput., 22(6), 1974–1986, 2001. [37] F.J. Alexander, A.L. Garcia, and B.J. Alder, "Cell size dependence of transport coefficients in stochastic particle algorithms," Phys. Fluids, 10, 1540, 1998. [38] N.G. Hadjiconstantinou, "Analysis of discretization in the direct simulation Monte Carlo," Phys. Fluids, 12, 2634–2638, 2000. [39] A.L. Garcia and W. Wagner, "Time step truncation error in direct simulation Monte Carlo," Phys. Fluids, 12, 2621–2633, 2000. [40] R. Roveda, D.B. Goldstein, and P.L. Varghese, "Hybrid Euler/direct simulation Monte Carlo calculation of unsteady slit flow," J. Spacecraft and Rockets, 37(6), 753–760, 2000. [41] F.J. Alexander, A.L. Garcia, and D. Tartakovsky, "Algorithm refinement for stochastic partial differential equations: I. Linear diffusion," J. Comput. Phys., 182(1), 47–66, 2002. [42] L.D. Landau and E.M. Lifshitz, Statistical Mechanics Part 2, Pergamon Press, Oxford, 1980. [43] P. Colella, "A direct Eulerian (MUSCL) scheme for gas dynamics," SIAM J. Sci. Statist. Comput., 6, 104–117, 1985. [44] P. Colella and H.M. Glaz, "Efficient solution algorithms for the Riemann problem for real gases," J. Comput. Phys., 59, 264–289, 1985. [45] J. Saltzman, "An unsplit 3D upwind method for hyperbolic conservation laws," J. Comput. Phys., 115, 153, 1994. [46] M. Berger and P. Colella, "Local adaptive mesh refinement for shock hydrodynamics," J. Comput. Phys., 82, 64, 1989. [47] CASC, "Structured adaptive mesh refinement application infrastructure," http://www.llnl.gov/CASC/, 2000. [48] G.A. Bird, "Breakdown of translational and rotational equilibrium in gaseous expansions," Am. Inst. Aeronautics and Astronaut. J., 8, 1998, 1970.
Chapter 9 POLYMERS AND SOFT MATTER
9.1 POLYMERS AND SOFT MATTER
L. Mahadevan¹ and Gregory C. Rutledge²
¹Division of Engineering and Applied Sciences, Department of Organismic and Evolutionary Biology, Department of Systems Biology, Harvard University, Cambridge, MA 02138, USA
²Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
1. Introduction
Within the context of this Handbook, the combined areas of polymers and soft matter encompass a vast range of complex materials, including both synthetic and natural polymers, many biological materials, and complex fluids such as colloids and viscoelastic media. What distinguishes these materials from most of those considered in other chapters of this Handbook is the macromolecular or supermolecular nature of the basic components of the material. In addition to the usual atomic level interactions responsible for chemically specific material behavior, as is found in all materials, these macromolecular and supermolecular objects exhibit topological features that lead to new, larger scale, collective nonlinear and nonequilibrium behaviors that are not seen in the constituents. As a consequence, these materials are typically characterized by a broad range of both length and time scales over which phenomena of both scientific and engineering interest can arise. In polymers, for instance, the organic nature of the molecules is responsible for both strong (intramolecular, covalent) and weak (intermolecular, van der Waals) interactions, as well as interactions of intermediate strength such as hydrogen bonds that are common in macromolecules of biological interest. In addition, however, the long chain nature of the molecule introduces a distinction between dynamics that occur along the chain or normal to it; one consequence of this is the observation of certain generic behaviors such as the "slithering snake" motion, or reptation, in polymer dynamics. It is often the very ability of polymers and soft matter to exhibit both atomic (or molecular) and macro- (or super-) molecular behavior that makes them so interesting and powerful as a class of materials and as building blocks for living systems.
Nevertheless, polymers and soft matter are, at their most basic level, collections of atomic and subatomic particles, like any other class of materials. They exhibit both liquid-like and crystalline (or at least semi-crystalline) order in their condensed forms. For polymers, vitrification and the glassy state are particularly important, as both the vitrification temperature and the kinetics of vitrification are strong functions of the inverse of molecular weight. For the most part, the methods developed for atomic and electronic level modeling described in the earlier chapters of this Handbook are equally applicable, at least in principle, to the descriptive modeling of polymers and soft matter. Electronic structure calculations, atomistic scale molecular dynamics and Monte Carlo simulations, coarse-grained and mesoscale models such as Lattice Boltzmann and Dissipative Particle Dynamics all have a role to play in modeling of polymers and soft matter. As materials, these interesting solids and fluids exhibit crystal plasticity, amorphous component viscoelasticity, rugged energy landscapes, and fascinating phase transitions. Indeed, block copolymers consisting of two or more covalently-joined but relatively incompatible chemical segments, and the competition they represent between intermolecular interactions and topological constraints, give rise to the rich field of microphase separation, with all its associated issues and opportunities regarding manipulation of microstructure, size and symmetry. It has not been our objective in assembling the contributions to this chapter to repeat any of the basic elements of modeling that have been developed to describe materials at any of these particular length and time scales, or strategies for generating thermodynamics information relevant to ensembles, phase transitions, etc. Rather, in recognition of those features which make polymers and soft matter distinct and novel with respect to their atomic or monomolecular counterparts, we have attempted to assemble a collection of contributions which highlight these features, and which describe methods developed specifically to handle the particular problems and complexities of dimensionality, time and length scale which are unique to this class of materials. With this in mind, the following sections in this chapter should be understood as extensions and revisions of what has gone before. We begin with a discussion of interatomic potentials specific to organic materials typical of synthetic and natural polymers and other soft matter. Accurate force fields lie at the heart of any molecular simulation intended to describe a particular material. Over the years, numerous apparently dissimilar force fields for organic materials have been proposed. However, certain motifs consistently reappear in such force fields, and common pitfalls in parameterization and guidelines for application of such force fields can be identified. These are discussed in the contribution by Smith. The recognition that one of the defining features of macromolecules is their very large conformation space motivated the relatively early development by Volkenstein in the late 1950s of the concept of rotational isomeric
states for each of the rotatable bonds along the backbone (i.e., the topologically connected dimension) of molecular chains. This essential discretization of conformation space allowed the development by Flory and others of what is now known as the rotational isomeric states (RIS) method, discussed in the section by Mattice. This method for evaluation of conformational averages is unique to polymers and provides an important alternative to the sampling strategies embodied by molecular dynamics and Monte Carlo simulation. What RIS gives up in assuming a simplified, discrete set of allowed rotational states for each bond, it more than makes up for in its computational efficiency and rigorous representation of contributions from all allowed conformers to the partition function and resulting conformational averages. The issues in sampling of phase space using molecular dynamics or Monte Carlo simulations for chain models are discussed by Mavrantzas. Molecular dynamics is of course applicable to the study of polymers and soft matter, but the broad range of length and, in particular, time scales alluded to earlier as being a consequence of the macromolecular and/or supermolecular nature of such matter, render this method of limited utility for many of the most interesting and unique behaviors in this class of materials. For this reason, Monte Carlo simulation has come to play a particularly important role in the modeling of polymers and soft matter. At the expense of detailed dynamics, the state of the art in Monte Carlo simulations of chain molecules and aggregates has advanced through the development of new sampling schemes that permit drastic, sometimes seemingly unphysical, moves through phase space. These moves are designed with both intermolecular interactions and intramolecular topology in mind. Without them, full equilibration and accurate simulation of complex materials are all but impossible. An alternative approach to accessing the long length and time scales of interest in polymers and soft matter is to coarse-grain the model description, gaining computational efficiency at the price of atomic scale detail. Such methods are useful for studying the generic, or universal, properties of polymers and aggregates. In the field of polymers and soft matter, lattice models have long been employed for rendering such coarse-grained models. The Bond Fluctuation Model, in particular, is typical of this class of methods and has enjoyed widespread application, due at least in part to the delicate compromise it achieves between the complexity of conformation space and the simplification inherent in rendering on a lattice. Importantly, it does so while retaining the essential topological connectivity. These methods are discussed by Müller and provide a link to continuum-based methods. Continuum-based methods start to become relevant when the number of particles involved is very large and one is interested in long wavelength, long time modes, as is typical of hydrodynamics. The dimensionalities of both the “material” component and the embedding component, or matrix, play important roles in determining the behavior of mesophases such as suspensions, colloids
and membranes. The article by Sierou provides an introduction to Stokesian dynamics, a molecular dynamics-like method for simulating the multi-phase behavior of particles suspended in a fluid. The particles are treated in a discrete sense, while the surrounding fluid is treated using a continuum approximation; the method is thus valid when the particle size is much larger than that of the molecules of the solvent. By accounting for Brownian motion, Stokesian dynamics provides a generalization of Brownian dynamics, treated by Doyle and Underhill in the next section, wherein the many-body contribution from hydrodynamics is accounted for properly. It thus paves the way for a study of the equilibrium and non-equilibrium rheology of colloids and other complex multiphase fluids. Moving up in dimensionality from particles to chains, the section by Doyle and Underhill discusses Brownian dynamics simulation of long chain polymers. The topological connectivity of these polymers implies a separation in time and energy scales for deformations tangential to and normal to the backbone. Coarse-grained models that account for this separation of scales range from bead-spring models to continuum semi-flexible polymers. While these models have been corroborated with each other and with simple experiments involving single molecules, the next frontier is clearly the use of these dynamical methods to probe the behavior of polymer solutions, a subject that still merits much attention. Next, Powers looks at the 2D generalization of polymers, i.e., membranes, which are assemblies of lipid molecules that are fluid-like in the plane but have an elastic response to bending out of the plane. In contrast to the previous sections, the focus here is on the continuum and statistical mechanics of these membranes using analytical tools via a coarse-grained free energy written in terms of the basic broken symmetries of the system. Once again the role of non-equilibrium dynamics comes up in the example of active membranes. The last section in this chapter offers a union of the molecular and continuum perspectives, in some sense, to address problems such as molecular structure-mediated microphase formation. Here again continuum models based on density fields and free energy functionals are most appropriate. It is a relatively recent development, however, that such models have been used as a starting point for computer simulations. The Field Theoretic Simulation method developed by Fredrickson and co-workers does just this, and is discussed by Ganesan and Fredrickson in this chapter. They provide a prescription by which a molecular model can be recast as a density field with its projected Hamiltonian, and then present appropriate methods for discretizing and sampling phase space during the simulation. Thus, polymers and soft matter are in some sense no different from hard matter, in that their constituents are atomic in nature. Yet they are distinguished by the predominance of weak interactions comparable to the thermal fluctuations, which makes them amenable to change. Looking to the future, the wide
variety of phases and broken symmetries that they embody is nowhere more abundant than in living systems that operate far from equilibrium and are eternally mutable. From a materials perspective, polymers and soft matter offer opportunities to mimic and understand nature in ways that we are only just beginning to appreciate. It is our hope that the sections in this chapter offer a glimpse of the techniques that one may use and the questions that motivate them.
9.2 ATOMISTIC POTENTIALS FOR POLYMERS AND ORGANIC MATERIALS

Grant D. Smith
Department of Materials Science and Engineering, Department of Chemical Engineering, University of Utah, Salt Lake City, Utah, USA
Accurate representation of the potential energy lies at the heart of all simulations of real materials. Accurate potentials are required for molecular simulations to accurately predict the behavior and properties of materials, and even qualitative conclusions drawn from simulations employing inaccurate or unvalidated potentials are problematic. Various forms of classical potentials (force fields) for polymers and organic materials can be found in the literature [1–3]. The most appropriate form of the potential depends largely upon the properties of interest to the simulator. When interest lies in reproducing the static, thermodynamic and dynamic (transport and relaxational) properties of non-reactive organic materials, the potential must accurately represent the molecular geometry, nonbonded interactions, and conformational energetics of the materials of interest. The relatively simple representation of the classical potential energy discussed below has been found to work remarkably well for these properties. More complicated potentials that can handle chemical reactions [4] or are designed to very accurately reproduce vibrational spectra [5] can be found in the literature. The form of the force field considered here has the advantage of being more easily parameterized than more complicated forms. Parameterization of even simple potentials is a challenging task, however, as discussed below.
1. Form of the Potential
The classical force field represents the total potential energy of an ensemble of atoms V(r), with positions given by the vector r, as a sum of nonbonded interactions V^nb(r) and energy contributions due to all bond, valence bend, and dihedral interactions:

$$V(\mathbf{r}) = V^{\mathrm{nb}}(\mathbf{r}) + \sum_{\mathrm{bonds}} V^{\mathrm{bond}}(r_{ij}) + \sum_{\mathrm{bends}} V^{\mathrm{bend}}(\theta_{ijk}) + \sum_{\mathrm{dihedrals}} V^{\mathrm{tors}}(\varphi_{ijkl}) \qquad (1)$$
The various interactions are illustrated in Fig. 1. The dihedral term also includes four-center improper torsion, or out-of-plane bending, interactions that occur at sp² hybridized centers. Commonly, the nonbonded energy V^nb(r) consists of a sum of the two-body repulsion and dispersion energy terms between atoms i and j, represented by the Buckingham (exponential-6) potential, the energy due to the interactions between fixed partial atomic or ionic charges (Coulomb interaction), and the energy due to many-body polarization effects:

$$V^{\mathrm{nb}}(\mathbf{r}) = V^{\mathrm{pol}}(\mathbf{r}) + \frac{1}{2}\sum_{i,j=1}^{N}\left[A_{ij}\exp(-B_{ij}r_{ij}) - \frac{C_{ij}}{r_{ij}^{6}} + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}\right] \qquad (2)$$
The generic behavior of the dispersion/repulsion energy for an atomic pair is shown in Fig. 2. The dispersion interactions are weak compared to repulsion, but are longer range, resulting in an attractive well with well depth ε at an interatomic separation of σ*. The separation where the net potential is zero, σ, is often used to define the atomic diameter.

[Figure 1. Schematic representation of the bonded (bond stretch, valence angle bend, dihedral twist) and nonbonded (intramolecular and intermolecular) interactions in a typical polymer.]

[Figure 2. Schematic representation of the dispersion/repulsion potential V^DIS-REP(r) between two atoms as a function of separation, showing the well depth ε, the zero crossing at σ, and the minimum at σ*.]

In addition to the exponential-6
form, the Lennard–Jones form of the dispersion–repulsion interaction,

$$V^{\mathrm{DIS-REP}}(r_{ij}) = \frac{A_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^{6}} = 4\varepsilon\left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right] = \varepsilon\left[\left(\frac{\sigma^{*}}{r_{ij}}\right)^{12} - 2\left(\frac{\sigma^{*}}{r_{ij}}\right)^{6}\right] \qquad (3)$$
is commonly used, although this form tends to yield a poorer (too stiff) description of repulsion. The relationship between the well depth and atomic diameter and the dispersion–repulsion parameters is particularly simple for the Lennard–Jones potential (ε = C²/4A, σ = (A/C)^(1/6), σ* = 2^(1/6)σ), allowing the dispersion–repulsion interaction to be expressed in terms of these parameters, as shown in Eq. (3). Nonbonded interactions are typically included between all atoms of different molecules and between atoms of the same molecule separated by more than two bonds (see Fig. 1). It is not uncommon, however, to scale intramolecular nonbonded interactions between atoms separated by three bonds. Care must therefore be taken in implementing a potential that the 1–4 intramolecular nonbonded interactions are correctly treated. Repulsion interactions have the shortest range and typically become negligible beyond 1.5σ. Dispersion interactions are longer range than the repulsion interactions, requiring cutoff distances of around 2.5σ. The Coulomb term is long-range, necessitating the use of special summing methods [6, 7]. While dispersion interactions are typically weaker and shorter range than Coulomb interactions, they are always attractive, independent of the configuration of the molecules, and typically make the dominant contribution to the cohesive energy even in highly polar polymers and organic materials.
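To make the parameter relations above concrete, the short sketch below (a minimal illustration in Python; the A and C values are arbitrary placeholders, not parameters for any real atom pair) evaluates the exponential-6 and Lennard–Jones forms and checks the quoted relations between (A, C) and (ε, σ, σ*).

```python
import numpy as np

def buckingham(r, A, B, C):
    """Exponential-6 dispersion/repulsion term appearing in Eq. (2)."""
    return A * np.exp(-B * r) - C / r**6

def lennard_jones(r, A, C):
    """Lennard-Jones form of Eq. (3): A/r^12 - C/r^6."""
    return A / r**12 - C / r**6

# For the Lennard-Jones form, the well depth and diameters follow from A and C:
A, C = 1.0e6, 1.0e3                     # arbitrary illustrative values
eps = C**2 / (4.0 * A)                  # well depth
sigma = (A / C)**(1.0 / 6.0)            # separation where the potential is zero
sigma_star = 2.0**(1.0 / 6.0) * sigma   # separation at the potential minimum

r = np.linspace(0.9 * sigma, 3.0 * sigma, 2001)
U = lennard_jones(r, A, C)
assert abs(U.min() + eps) < 1e-3 * eps                        # well depth is -eps
assert abs(r[np.argmin(U)] - sigma_star) < 1e-2 * sigma_star  # minimum at sigma*
```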
A further complication arises in cases where many-body dipole polarization needs to be taken into account explicitly. The potential energy due to dipole polarization is not pair-wise additive; it is given by a sum of the interaction energy between the induced dipoles μ_i and the electric field E_i⁰ at atom i generated by the permanent charges q_i in the system, the interaction energy between the induced dipoles, and the energy required to induce the dipole moments [7]:

$$V^{\mathrm{pol}}(\mathbf{r}) = -\sum_{i=1}^{N}\boldsymbol{\mu}_i\cdot\mathbf{E}_i^{0} - \frac{1}{2}\sum_{i,j}\boldsymbol{\mu}_i\cdot\mathbf{T}_{ij}\cdot\boldsymbol{\mu}_j + \sum_{i=1}^{N}\frac{\boldsymbol{\mu}_i\cdot\boldsymbol{\mu}_i}{2\alpha_i} \qquad (4)$$
where μ_i = α_i E_i^tot, α_i is the isotropic atomic polarizability, E_i^tot is the total electrostatic field at the atomic site i due to permanent charges and induced dipoles, and the second-order dipole tensor is given by

$$\mathbf{T}_{ij} = \nabla_i\nabla_j\,\frac{1}{4\pi\varepsilon_0 r_{ij}} = \frac{1}{4\pi\varepsilon_0 r_{ij}^{3}}\left(\frac{3\,\mathbf{r}_{ij}\mathbf{r}_{ij}}{r_{ij}^{2}} - \mathbf{1}\right) \qquad (5)$$
where r_ij is the vector from atom i to atom j. Because of the expense involved in simulations with explicit inclusion of many-body dipole polarization, it may be desirable to utilize a two-body approximation for these interactions [8].
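In practice the induced dipoles are obtained by iterating μ_i = α_i E_i^tot to self-consistency before Eq. (4) is evaluated. A minimal sketch of that loop follows (Gaussian-type units with the 1/4πε₀ factor omitted; the positions, charges, and polarizabilities in the usage example are illustrative inputs, not a validated parameter set). Direct matrix solution of the coupled linear equations is a common alternative to simple iteration.

```python
import numpy as np

def induced_dipoles(pos, q, alpha, tol=1e-10, max_iter=500):
    """Iterate mu_i = alpha_i * E_i^tot to self-consistency and return the
    dipoles together with the polarization energy of Eq. (4).
    Gaussian-type units: the 1/(4 pi eps0) factors are omitted."""
    n = len(pos)
    E0 = np.zeros((n, 3))               # field of the permanent charges
    T = np.zeros((n, n, 3, 3))          # dipole tensor of Eq. (5)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = pos[i] - pos[j]
            r = np.linalg.norm(d)
            E0[i] += q[j] * d / r**3
            T[i, j] = (3.0 * np.outer(d, d) / r**2 - np.eye(3)) / r**3
    mu = alpha[:, None] * E0            # zeroth-order guess
    for _ in range(max_iter):
        mu_new = alpha[:, None] * (E0 + np.einsum('ijab,jb->ia', T, mu))
        if np.max(np.abs(mu_new - mu)) < tol:
            mu = mu_new
            break
        mu = mu_new
    # Polarization energy, Eq. (4)
    V_pol = (-np.sum(mu * E0)
             - 0.5 * np.einsum('ia,ijab,jb->', mu, T, mu)
             + np.sum(mu**2 / (2.0 * alpha[:, None])))
    return mu, V_pol

# Illustrative three-site system (all values made up for the example)
pos = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
q = np.array([0.5, -0.25, -0.25])
alpha = np.array([1.0, 1.5, 1.5])
mu, V_pol = induced_dipoles(pos, q, alpha)
```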
The contributions due to bonded interactions are represented as

$$V^{\mathrm{bond}}(r_{ij}) = \tfrac{1}{2}k^{\mathrm{bond}}_{ij}\left(r_{ij} - r^{0}_{ij}\right)^{2} \qquad (6)$$

$$V^{\mathrm{bend}}(\theta_{ijk}) = \tfrac{1}{2}k^{\mathrm{bend}}_{ijk}\left(\theta_{ijk} - \theta^{0}_{ijk}\right)^{2} = \tfrac{1}{2}k'^{\mathrm{bend}}_{ijk}\left(\cos\theta_{ijk} - \cos\theta^{0}_{ijk}\right)^{2} \qquad (7)$$

$$V^{\mathrm{tors}}(\varphi_{ijkl}) = \tfrac{1}{2}\sum_{n}k^{\mathrm{tors}}_{ijkl}(n)\left(1 - \cos n\varphi_{ijkl}\right) \quad\text{or}\quad V^{\mathrm{tors}}(\varphi_{ijkl}) = \tfrac{1}{2}k^{\mathrm{oop}}_{ijkl}\,\varphi^{2}_{ijkl} \qquad (8)$$
Here, r⁰_ij is an equilibrium bond length and θ⁰_ijk is an equilibrium valence bend angle, while k^bond_ij, k^bend_ijk, k^tors_ijkl(n) and k^oop_ijkl are the bond, bend, torsion and out-of-plane bending force constants, respectively. Note that the two forms given for the valence bend interaction are entirely equivalent for sp² and sp³ bonding geometries for reasonably stiff bends at reasonable temperatures, with k′ = k/sin²θ⁰. The indices indicate which (bonded) atoms are involved in the interaction. These geometric parameters and force constants, combined with the nonbonded parameters q_i, α_i, A_ij, B_ij and C_ij, constitute the classical force field for a particular material. In contrasting the form of the potential for polymers and organics with potentials for other materials, the nature of bonding in organic materials becomes manifestly apparent. In organic materials the relatively strong covalent bonds and valence bends serve primarily to define the geometry of the
molecule. Much weaker/softer intramolecular degrees of freedom, namely torsions, and intermolecular nonbonded interactions, primarily determine the thermodynamic and dynamic properties of polymers and large organic molecules. Hence relatively weak (and consequently difficult to parameterize) torsional and repulsion/dispersion parameters must be determined with great accuracy in potentials for polymers and organics. However, this separation of scales of interaction strengths (strong intramolecular covalent bonding, weak intermolecular bonding) has the advantage of allowing many-body interactions, which often must be treated through explicit many-body nonbonded terms in simulations of other classes of materials, to be treated much more efficiently as separate intramolecular bonded interactions in organic materials.
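As a simple illustration of how the bonded contributions of Eqs. (6)–(8) are evaluated in practice, here is a minimal sketch in Python; the force constants and the sample geometry values are invented for illustration only, not a fitted parameter set.

```python
import numpy as np

def bond_energy(r, k_bond, r0):
    """Harmonic bond stretch, Eq. (6)."""
    return 0.5 * k_bond * (r - r0)**2

def bend_energy(theta, k_bend, theta0):
    """Harmonic valence bend, Eq. (7), angle form."""
    return 0.5 * k_bend * (theta - theta0)**2

def torsion_energy(phi, k_tors):
    """Cosine-series torsion, Eq. (8); k_tors[n-1] is the n-fold constant."""
    return 0.5 * sum(k * (1.0 - np.cos(n * phi))
                     for n, k in enumerate(k_tors, start=1))

# One term of each kind, with made-up parameters (kJ/mol, nm, rad):
E = (bond_energy(0.156, k_bond=2.5e5, r0=0.154)
     + bend_energy(np.radians(114.0), k_bend=520.0, theta0=np.radians(112.0))
     + torsion_energy(np.radians(65.0), k_tors=[6.0, -1.0, 14.0]))
print(f"sample bonded energy: {E:.2f} kJ/mol")
```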
2. Existing Potentials
By far the most convenient way to obtain a force field is to utilize an extant one. In general, force fields can be divided into three categories: (a) force fields parametrized based upon a broad training set of molecules, such as small organic molecules, peptides, or amino acids, including AMBER [1], COMPASS [9], OPLS-AA [3] and CHARMM [10]; (b) generic potentials, such as DREIDING [11] and UNIVERSAL [12], that are not parameterized to reproduce the properties of any particular set of molecules; and (c) specialized force fields carefully parametrized to reproduce the properties of a specific compound. A procedure for parameterizing the latter class of potential is described below. A summary of the data used in the parametrization of some of the most common force fields is presented in Table 1. Parametrized force fields (AMBER, OPLS and CHARMM) can work well within the class of molecules upon which they have been parametrized. However, when the force field parameters are utilized for compounds similar to those in the original training set but not contained in it, significant errors can appear, and the quality of force field predictions is often no better than that of the generic force fields [13]. Similar behavior is expected when parameterized force fields are transferred to new classes of compounds. Therefore, in choosing a potential, both the quality and the transferability of the potential need to be considered. The quality of a potential can be estimated by examining the quality and quantity of data used in its parameterization. For example, AMBER ff99 (Table 1) uses a much higher level of quantum chemistry calculation for the determination of dihedral parameters than the early AMBER ff94. The ability of a force field to describe the molecular and condensed-phase properties of its training set is another indicator of its quality. The issue of transferability of a potential is faced when a high-quality force field, adequately validated for compounds similar to the one of interest, is used in modeling
related compounds not in the training set, or in modeling entirely new classes of materials. Transferability varies tremendously from one potential function parameter to another, with some parameters being in general quite transferable between similar compounds and others being much less so.

Table 1. Summary of the primary data used in parameterization of popular force fields

| Interactions | AMBER [ff94, ff99, ff02] | OPLS-AA | CHARMM | DREIDING |
|---|---|---|---|---|
| repulsion/dispersion | PVT, Hvap | PVT, Hvap | PVT, Hvap, crystal structures, QC | crystal structures and sublimation energies |
| electrostatic | QC | PVT, Hvap | QC, experimental dipoles | predictive method |
| polarization | [N/A, N/A, experiment] | N/A | N/A | N/A |
| bond/bend | X-ray structures, IR, Raman | AMBER [ff94] with some values from CHARMM | IR, Raman, microwave and electron diffraction, X-ray crystal data, QC | generic |
| torsion | various experimental sources, QC | QC | microwave and electron diffraction, QC | generic, based on hybridization |
| training set | peptides, nucleic acids, organics | organic liquids | peptides | generic |
3. Sources of Data for Force Field Parametrization
In order to judge the quality of existing force fields for a compound of interest, or to undertake the demanding but often inevitable task of parameterizing or partially parameterizing a new force field, one requires data against which the force field parameters (or a subset thereof) can be tested and, if necessary, fit. As can be seen in Table 1, there are two primary sources of such data: experiment and ab initio quantum chemistry calculations. Experimentally measured structural, thermodynamic and dynamic data for condensed phases (liquid and/or crystal) of the material of interest or closely related compounds are particularly useful in force field parameterization and validation. High-level quantum chemistry calculations are the best source of molecular level information for force field parameterization. While such calculations are not yet possible on high polymers and very large organic molecules, they are feasible on small molecules representative of polymer repeat units and oligomers, fragments of large molecules, as well as molecular clusters that reproduce interactions between segments of polymers or organic molecules, or the interaction of these with surfaces, solvents, ions, etc. These calculations can provide the molecular geometries, partial charges, polarizabilities, conformational energy surface, and intermolecular nonbonded interactions critical for accurate prediction of structural, thermodynamic and dynamic properties of polymers. Of key importance in utilizing quantum chemistry calculations for force field parameterization is the use of an adequate level of theory and choice of basis set. As a rule of thumb, augmented correlation-consistent polarizable basis sets (e.g., aug-cc-pVDZ) utilizing DFT geometries (e.g., B3LYP) and correlated (MP2) energies work quite well, often providing molecular dipole moments within a few percent of experimental values, conformer energies within ±0.3 kcal/mol, rotational energy barriers between conformations within ±0.5 kcal/mol, and intermolecular binding energies, after basis set superposition error (BSSE) correction, within 0.1–1 kcal/mol. However, whenever force field parameterization is undertaken for any new class of molecule for which extensive quantum chemistry studies do not exist, a comprehensive study of the influence of basis set and electron correlation on molecular geometries, conformational energies, cluster energies, dipole moments, molecular polarizabilities and the electrostatic potential is warranted.
4. Determining Potential Function Needs
In examining candidate potentials for a material, one should ascertain whether they have been parameterized for the material of interest or for closely related materials. One should also determine what data (quantum chemistry and experimental) were used in the parametrization, the quality of the data employed, and how well the potential reproduces the “training set” data. Finally, one should determine what validation steps, if any, have been carried out by the originators of the potential or by others who have utilized it. Next, one should determine what force field parameters are missing or may need reparameterization for the material of interest. The parameters that have the most limited transferability from the training set to related compounds, and hence are most likely to need parameterization, are partial charges and dihedral parameters. Other parameters that may need parameterization, in order of decreasing probability (increasing transferability), are equilibrium bond lengths and angles; bond, bend and improper torsion force constants; dispersion/repulsion parameters; and atomic polarizabilities (for many-body polarizable potentials). A general procedure for systematic parameterization and validation of potential functions suitable for any polymer, organic compound or solution is provided below. Detailed derivations of quantum-chemistry-based potentials for organic compounds and polymers can be found in the literature [9, 14].
5. Establishing the Quantum Chemistry Data Set
Once it has been determined that parameterization or partial parameterization of a potential function is needed, it is necessary to determine the set of model molecules to be utilized in the potential function parameterization. If dispersion/repulsion parameters are needed, this may include molecular complexes containing the intermolecular interactions of interest. For smaller organic molecules, the entire molecule should be included in the data set. For polymers and larger organic molecules, oligomers/fragments containing all single conformations and conformational pairs extant in the polymer or large organic molecule should be included. A search for existing quantum chemistry studies of these and related molecules should be conducted before beginning quantum chemistry calculations. When a new class of material (one for which extensive quantum chemistry studies have not yet been conducted) is being investigated, the influence of basis set and level of theory should be systematically investigated. Comparison with experiment (binding energies, molecular geometry, conformational energies, etc.) can help establish what level of theory is adequate.
Once the level of theory is established, all important conformers and rotational energy barriers for the model molecule(s) in the data set should be found, as well as dipole moments and electrostatic potential for the lowest energy conformers. BSSE corrected binding energies for important configurations of molecular clusters should also be determined if parameterization of dispersion/repulsion interactions is required. These data provide the basis for parameterization of the potential as described briefly below.
6. Potential Function Parameterization and Validation

6.1. Partial Charges
Most organic molecules are sufficiently polar that Coulomb interactions must be accurately represented. Often it is sufficient to treat Coulomb interactions with fixed partial atomic charges (Eq. (2)) and to neglect explicit inclusion of many-body dipolar polarizability. The primary exception occurs when small ionic species are present. In such cases the force field needs to be augmented with additional terms describing polarization of the molecule (Eq. (4)). When needed, atomic polarizabilities can be determined straightforwardly from quantum chemistry [14, 15]. In parameterizing partial atomic charges, one attempts to reproduce, with a set of partial charges on the various atoms, the molecular dipole moment and the electrostatic potential in the vicinity of model molecules as determined from high-level quantum chemistry calculations. Fig. 3 illustrates the quality of agreement that can be achieved in representing the electrostatic potential with partial atomic charges.
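Operationally, this charge fitting is a constrained linear least-squares problem: choose the charges q_i that best reproduce the QC electrostatic potential on a set of grid points around the molecule, subject to the charges summing to the net molecular charge. A minimal sketch, assuming Gaussian-type units and placeholder inputs for the atom positions, grid, and target potential (with real molecules, grid points are usually placed on shells outside the van der Waals surface, and additional restraints, as in RESP-type schemes, are often added to tame buried-atom charges):

```python
import numpy as np

def fit_esp_charges(atom_xyz, grid_xyz, phi_qc, net_charge=0.0):
    """Least-squares fit of partial atomic charges to a target electrostatic
    potential on grid points, with a Lagrange multiplier enforcing the net
    charge. Gaussian-type units: phi(r) = sum_i q_i / |r - r_i|."""
    n = len(atom_xyz)
    # D[k, i]: potential at grid point k from a unit charge on atom i
    D = 1.0 / np.linalg.norm(grid_xyz[:, None, :] - atom_xyz[None, :, :], axis=2)
    A = np.zeros((n + 1, n + 1))        # normal equations bordered by the constraint
    A[:n, :n] = D.T @ D
    A[:n, n] = A[n, :n] = 1.0
    b = np.concatenate([D.T @ phi_qc, [net_charge]])
    return np.linalg.solve(A, b)[:n]

# Synthetic check: the potential of charges (+0.3, -0.3) is recovered exactly
rng = np.random.default_rng(0)
atoms = np.array([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0]])
grid = rng.normal(scale=4.0, size=(400, 3))
grid = grid[np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2).min(axis=1) > 2.0]
phi = 0.3 / np.linalg.norm(grid - atoms[0], axis=1) - 0.3 / np.linalg.norm(grid - atoms[1], axis=1)
q = fit_esp_charges(atoms, grid, phi)
assert np.allclose(q, [0.3, -0.3])
```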
6.2. Dispersion and Repulsion Interactions
Carrying out quantum chemistry studies of molecular clusters of sufficient accuracy to allow for a final determination of dispersion parameters is very computationally expensive. Fortunately, repulsion and dispersion parameters are highly transferable. Therefore, it is expedient to utilize literature values for repulsion and dispersion parameters where high-quality, validated values exist. Where necessary, BSSE-corrected Hartree–Fock binding energies of molecular clusters can be used to establish repulsion parameters, and initial values for dispersion parameters can be determined from fitting to correlated binding energies [14, 15]. Regardless of the source of data utilized to parameterize dispersion interactions (experimental thermodynamic or structural data, quantum chemistry data on molecular clusters, or direct use of existing parameters)
it may be necessary to make (hopefully) minor empirical adjustments (as large as ±10%) to the dispersion parameters so as to yield highly accurate thermodynamic properties for the material of interest. This can be accomplished by carrying out simulations of model molecules, comparing predicted thermodynamic properties (density, heat of vaporization, thermal expansion, compressibility) with experiment, and adjusting the dispersion parameters as needed to improve agreement.

[Figure 3. Electrostatic potential (contours in kcal/mol) in the plane of a 1,2-dimethoxyethane molecule from ab initio electronic structure calculations (QC) and from partial atomic charges (FF) parameterized to reproduce the potential.]
6.3. Bond and Bend Interactions
The covalent bond and valence bend force constants are also highly transferable between related compounds. As long as the dihedral potential (see
below) is parameterized with the chosen bond and bend force constants, the particular (reasonable) values of the force constants will not strongly influence the structural, thermodynamic, or dynamic properties of the material. It is therefore recommended that stretching and bending force constants be taken from the literature where available. When not available, stretching and bending force constants can be taken directly from quantum chemistry normal-mode frequencies determined for representative model molecules, with appropriate scaling of the force constants.
6.4. Molecular Geometry
The molecular geometry can strongly influence static, thermodynamic and dynamic properties and needs to be accurately reproduced. Therefore, accurate representation of bond lengths and angles is important. Equilibrium bond lengths and bond angles can be adjusted so as to accurately reproduce the bond lengths and bond angles of model compounds determined from high-level quantum chemistry.
6.5. Dihedral Potential
It is crucial that the conformational energies, specifically the relative energies of important conformations and the rotational energy barriers between them, be accurately represented for polymers and conformationally flexible organic compounds. At a minimum, a force field must be able to reproduce the relative energies of the important conformations of single dihedrals and dihedral pairs (dyads) in model molecules. The conformational energies and rotational energy barriers obtained from quantum chemistry for model molecules are quite sensitive to the level of theory utilized, both the basis set size and the treatment of electron correlation. Fortunately, it is typically not necessary to conduct geometry optimizations with electron correlation; for many compounds SCF or DFT geometries are sufficient. Unfortunately, relative conformational energies and rotational energy barriers obtained at the SCF and DFT levels are usually not sufficiently accurate, necessitating the calculation of MP2 energies at SCF or DFT geometries. In fitting the dihedral potential, it is sometimes possible to utilize only 1-, 2- and 3-fold dihedral terms (n = 1–3 in Eq. (8)). However, it is often necessary to include up to 6-fold dihedral terms to obtain a good representation of the conformational energy surface. One must be cognizant of possible artifacts (e.g., spurious minima and conformational energy barriers) that can be introduced into the conformational energy surface when higher-fold terms (n > 3) with large amplitudes are utilized. Fig. 4 shows the quality of agreement for conformational energies between quantum chemistry and molecular mechanics that is possible with a 1–3-fold potential for model molecules for poly(butadiene).

[Figure 4. The relative conformational energy (kcal/mol) for rotation about the β-dihedral in 1,5-hexadiene, over the range 0–360°, from ab initio electronic structure calculations (QC) and from a force field parameterized to reproduce the conformational energy surface (FF).]
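Once the nonbonded and other bonded contributions have been subtracted from a QC torsional scan, fitting the constants k^tors(n) of Eq. (8) reduces to linear least squares in the cosine basis. A sketch, with a made-up target profile standing in for the residual QC energy:

```python
import numpy as np

def fit_dihedral(phi, E_target, n_max=3):
    """Least-squares fit of k_tors(n), n = 1..n_max, in
    V = 0.5 * sum_n k_n (1 - cos(n*phi)) to a target torsional profile."""
    basis = np.column_stack([0.5 * (1.0 - np.cos(n * phi))
                             for n in range(1, n_max + 1)])
    k, *_ = np.linalg.lstsq(basis, E_target, rcond=None)
    return k

phi = np.radians(np.arange(0.0, 361.0, 10.0))
E_qc = 6.0 * np.sin(0.5 * phi)**2 + 1.5 * np.sin(1.5 * phi)**2  # made-up scan
k = fit_dihedral(phi, E_qc, n_max=3)
print("fitted 1- to 3-fold constants:", np.round(k, 3))
```

In this constructed example the target is exactly representable by 1- and 3-fold terms, so the fit recovers k₁ = 6 and k₃ = 1.5 with k₂ ≈ 0; with real QC data one would also inspect the fitted surface for the spurious minima warned about above.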
6.6. Validation of the Potential
As a final step, the potential, regardless of its source, should be validated through extensive comparison of structural, thermodynamic and dynamic properties obtained from simulations of the material of interest, closely related materials, and model compounds used in the parameterization, with available experimental data. The importance of potential function validation in simulation of real materials cannot be overemphasized.
References

[1] W.D. Cornell et al., “A second generation force field for simulations of proteins, nucleic acids, and organic molecules,” J. Am. Chem. Soc., 117, 5179–5197, 1995.
[2] J.W. Ponder and D.A. Case, “Force fields for protein simulation,” Adv. Prot. Chem., 66, 27–85, 2003.
[3] W.L. Jorgensen, D.S. Maxwell, and J. Tirado-Rives, “Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids,” J. Am. Chem. Soc., 118, 11225–11236, 1996.
[4] D.W. Brenner, “Empirical potential for hydrocarbons for use in simulating the chemical vapor deposition of diamond films,” Phys. Rev. B, 42, 9458–9471, 1990.
[5] T.A. Halgren, “Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94,” J. Comput. Chem., 17, 553–586, 1996.
[6] A. Toukmaji, C. Sagui, J. Board, and T. Darden, “Efficient particle-mesh Ewald based approach to fixed and induced dipolar interactions,” J. Chem. Phys., 113, 10912–10927, 2000.
[7] T.M. Nymand and P. Linse, “Ewald summation and reaction field methods for potentials with atomic charges, dipoles, and polarizabilities,” J. Chem. Phys., 112, 6152–6160, 2000.
[8] O. Borodin, G.D. Smith, and R. Douglas, “Force field development and MD simulations of poly(ethylene oxide)/LiBF4 polymer electrolytes,” J. Phys. Chem. B, 108, 6824–6837, 2003.
[9] H. Sun, “COMPASS: An ab initio force-field optimized for condensed-phase applications-overview with details on alkane and benzene compounds,” J. Phys. Chem. B, 102, 7338–7364, 1998.
[10] A.D. MacKerell et al., “All-atom empirical potential for molecular modeling and dynamics studies of proteins,” J. Phys. Chem. B, 102, 3586–3616, 1998.
[11] S.L. Mayo, B.D. Olafson, and W.A. Goddard, III, “DREIDING: A generic force field for molecular simulations,” J. Phys. Chem., 94, 8897–8909, 1990.
[12] A.K. Rappé, C.J. Casewit, K.S. Colwell, W.A. Goddard, and W.M. Skiff, “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations,” J. Am. Chem. Soc., 114, 10024–10035, 1992.
[13] F. Sato, S. Hojo, and H. Sun, “On the transferability of force field parameters-with an ab initio force field developed for sulfonamides,” J. Phys. Chem. A, 107, 248–257, 2003.
[14] O. Borodin and G.D. Smith, “Molecular modeling of poly(ethylene oxide) melts and poly(ethylene oxide)-based polymer electrolytes,” In: L. Curtiss and M. Gordon (eds.), Methods and Applications in Computational Materials Chemistry, Kluwer Academic Publishers, 35–90, 2004.
[15] O. Borodin and G.D. Smith, “Development of the quantum chemistry force fields for poly(ethylene oxide) with many-body polarization interactions,” J. Phys. Chem. B, 108, 6801–6812, 2003.
9.3 ROTATIONAL ISOMERIC STATE METHODS

Wayne L. Mattice
Department of Polymer Science, The University of Akron, Akron, OH 44325-3909
At very small degrees of polymerization, x, the conformation-dependent physical properties of a chain are easily evaluated by discrete enumeration of all allowed conformations. Each conformation can be characterized in terms of bond lengths, l, bond angles, θ, torsion angles, φ, and conformational energy, E. The rapid increase in the number of conformations as x → ∞ prohibits discrete enumeration when the chain reaches a degree of polymerization associated with a high polymer. This difficulty is overcome with the rotational isomeric state (RIS) model. This model provides a tractable method for computation of average conformation-dependent physical properties of polymers, based on knowledge of the properties of the members of the homologous series with very small values of x. The physical property most commonly computed with the RIS method is the mean square unperturbed end-to-end distance, ⟨r²⟩₀. Zero as a subscript denotes the unperturbed state, where the properties of the chain are controlled completely by the short-range interactions that are present at very small values of x. This assumption is appropriate for the polymer in its melt, which is a condition of immense importance both for modeling studies and for the use of polymers in practice. The assumption also applies in dilute solution in a Θ solvent, where the excluded volume effect is nil [1]. The second virial coefficient for the osmotic pressure is zero in this special solvent. In good solvents, where the second virial coefficient is positive, the mean square end-to-end distance is larger than ⟨r²⟩₀, due to the expansion of the chain produced by the excluded volume effect. The excluded volume effect is not incorporated in the usual applications of the RIS model. The first use of the RIS method was reported over five decades ago, well before the widespread availability of fast computers [2]. Given this date of origin of the method, it is not surprising that the correct numerical evaluation of a RIS model requires very little computer time, in comparison with newer
simulation methods that were developed after fast computers populated nearly every desktop.
1. Information Required for Calculation of ⟨r²⟩₀
The essential features of the RIS method are well illustrated by the classic calculation of ⟨r²⟩₀ for a long unperturbed polyethylene chain, using as input the properties of n-butane and n-pentane [3]. This illustration identifies the information that is required from the small molecules, and shows how that information is incorporated into the model in order to calculate ⟨r²⟩₀ for a very long chain. The information required for a successful RIS treatment of polyethylene is summarized in Table 1.

Table 1. Input from small n-alkanes to the RIS model for polyethylene

| Alkane | Information | Symbol | Value for polyethylene |
|---|---|---|---|
| Butane | C–C bond length | l | 0.154 nm |
| | C–C–C bond angle | θ | 112° |
| | Number of rotational isomeric states | ν | 3 |
| | Torsion angles | φ | 180° and ±(60° + Δφ), Δφ ≈ 7.5° |
| | First-order interaction energy | Eσ = Eg − Et | 2.1 kJ/mol |
| Pentane | Second-order interaction energy | Eω = E(g⁺g⁻) − E(g⁺g⁺) | 8.4 kJ/mol |

From n-butane we obtain the values for the length of the C–C bond, l = 0.154 nm, and the C–C–C bond angle, 112°. The internal C–C bond is subject to a symmetric torsion potential with three preferred conformations, ν = 3, denoted trans (t), gauche+ (g⁺), and gauche− (g⁻). When φ is defined to be zero in the cis state, the torsion angles are 180° and ±(60° + Δφ), with the value of Δφ being about 7.5°. The g states are higher in energy than the t state by Eσ = Eg − Et = 2.1 kJ/mol. This first-order (dependent on a single torsion angle) interaction energy specifies a temperature-dependent statistical weight of σ = exp(−Eσ/RT) for a g state relative to a t state.

The input from n-butane would be sufficient for the RIS model if the bonds in polyethylene were independent of one another. However, independence of bonds is not observed in polyethylene or in most other polymers. Information about the pair-wise interdependence of the bonds comes from the next higher alkane in the homologous series. Specifically, it comes from examination of the energies of the four conformations of n-pentane in which both internal C–C bonds adopt g states. If the two bonds were independent, the four gg states would have the same conformational energy, and that energy would be higher by 2Eσ than the conformational energy of the tt state. This expectation is realized if both g states are of the same sign. However, if they are of opposite sign, a strong repulsive interaction of the pendant methyl groups causes the energy to be higher than the energy of the tt conformation by 2Eσ + 8.4 kJ/mol. This important extra energy, denoted Eω, is termed a second-order interaction because it depends on two torsion angles. Examination of the remaining conformations of n-pentane reveals no other important second-order interactions.

Third- and higher-order interactions can be incorporated in the model, but often they are unnecessary. Polyethylene is an example of a chain where the performance of the model is not improved by the incorporation of third-order interactions. Third-order interactions occur between the methyl groups in n-hexane. Their interaction is prohibitively repulsive when the intervening C–C bonds are all in g states that alternate in sign. However, the g⁺g⁻g⁺ conformation of n-hexane is already severely penalized by the second-order interactions described in the previous paragraph. Penalizing it further by specifically incorporating the third-order interaction has a trivial effect on numerical results calculated from the model. Therefore the simpler approach, based on first- and second-order interactions only, is the one usually adopted.

All of the information in Table 1 is used in the calculation of ⟨r²⟩₀ for a long unperturbed polyethylene chain via the RIS method. Initially the thermodynamic (or energetic) and structural (bond lengths, bond angles, torsion angles) contributions are considered separately. Then these two pieces of the problem are combined for the final answer.
2. Thermodynamic (Energetic) Information: The Conformational Partition Function
The thermodynamic information appears in the conformational partition function, Z, which is the sum of the statistical weights of all ν^(n−2) conformations of an unperturbed chain with n bonds. The first- and second-order interactions from Table 1 are counted correctly in an expression for Z that uses a statistical weight matrix, U_i, for each bond:

$$Z = \mathbf{U}_1\mathbf{U}_2\cdots\mathbf{U}_n \qquad (1)$$
For internal bonds, Ui is a ν × ν matrix, with rows indexed by the state at bond i − 1, and columns indexed in the same order by the state at bond i. Each column contains the first-order statistical weight appropriate for the conformation that indexes that column, and each element contains the second-order statistical weight appropriate for the pair of states defined by that row and
column. If the order of indexing is t, g⁺, g⁻, U_i is specified by Eq. (2) for 1 < i < n.
$$\mathbf{U}_i = \begin{pmatrix} 1 & \sigma & \sigma \\ 1 & \sigma & \sigma\omega \\ 1 & \sigma\omega & \sigma \end{pmatrix}, \quad 1 < i < n \qquad (2)$$
The terminal row and column vectors in Eq. (1) are U₁ = [1 0 0] and U_n = [1 1 1]ᵀ, where T as a superscript denotes the transpose. The Z for n-butane and n-pentane are 1 + 2σ and 1 + 4σ + 2σ²(1 + ω), respectively. Equation (1) continues to count the contributions of σ and ω correctly at all higher n.
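Because Eq. (1) is just a serial matrix product, it takes only a few lines of code to evaluate; the sketch below (Python) reproduces the n-butane and n-pentane results quoted above, with the statistical weights evaluated at 413 K from the Table 1 energies.

```python
import numpy as np

R, T = 8.314e-3, 413.0                 # kJ/(mol K), K
sigma = np.exp(-2.1 / (R * T))         # weight of a g state, E_sigma = 2.1 kJ/mol
omega = np.exp(-8.4 / (R * T))         # g+g- penalty, E_omega = 8.4 kJ/mol

U = np.array([[1.0, sigma,         sigma],
              [1.0, sigma,         sigma * omega],
              [1.0, sigma * omega, sigma]])
U1 = np.array([1.0, 0.0, 0.0])         # terminal row vector
Un = np.array([1.0, 1.0, 1.0])         # terminal column vector

def Z(n_bonds):
    """Conformational partition function, Eq. (1), for a chain of n bonds."""
    z = U1
    for _ in range(n_bonds - 2):
        z = z @ U
    return z @ Un

assert np.isclose(Z(3), 1 + 2 * sigma)                               # n-butane
assert np.isclose(Z(4), 1 + 4 * sigma + 2 * sigma**2 * (1 + omega))  # n-pentane
```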
3. Geometric Information: r²
The geometric information is utilized first for a single, arbitrarily chosen, conformation of the chain. At this stage of the development of the model, the energy of this conformation is not relevant. The squared end-to-end distance of a specified conformation of a chain is often written in terms of the bond vectors, l_i, as shown in Eq. (3).
$$r^{2} = \mathbf{r}\cdot\mathbf{r} = \left(\sum_i \mathbf{l}_i\right)\cdot\left(\sum_i \mathbf{l}_i\right) = \sum_i l_i^{2} + 2\sum_{1\le i<j\le n}\mathbf{l}_i\cdot\mathbf{l}_j \qquad (3)$$
Evaluation of the last term in this equation requires knowledge of the angles between all pairs of bond vectors. The RIS method uses an alternative (but completely equivalent) formulation of r² in which every bond vector is written in its own coordinate system as l_i = [l_i 0 0]ᵀ. The angles between the bond vectors are treated with transformation matrices, defined so that T_i l_{i+1} expresses bond vector i + 1 in the local coordinate system of bond i. The local coordinate system for bond i is usually defined so that the x axis runs along this bond, the y axis is in the plane of bonds i and i − 1, with a positive projection on bond i − 1, and the z axis completes a right-handed Cartesian coordinate system. With this definition, T_i is given by Eq. (4).
$$\mathbf{T}_i = \begin{pmatrix} -\cos\theta & \sin\theta & 0 \\ -\sin\theta\cos\phi & -\cos\theta\cos\phi & -\sin\phi \\ -\sin\theta\sin\phi & -\cos\theta\sin\phi & \cos\phi \end{pmatrix} \qquad (4)$$
Equation (4) is written using the convention that φ = 0 in the cis state. With transformation matrices defined in this manner, r² can be written as shown in Eq. (5):

$$r^{2} = \sum_i l_i^{2} + 2\sum_{1\le i<j\le n}\mathbf{l}_i^{T}\,\mathbf{T}_i\mathbf{T}_{i+1}\cdots\mathbf{T}_{j-1}\,\mathbf{l}_j \qquad (5)$$
The most important difference between the two formulations is that every bond vector is expressed in its own local coordinate system as [l_i 0 0]ᵀ in Eq. (5), whereas in Eq. (3) none of the bond vectors can be written until their orientations in a common coordinate system have been established. The expression in Eq. (5) can be cast into a matrix form that is similar in structure to Eq. (1). The desired property, r², is obtained as a serial product of n matrices, and all necessary information about bond i is found in the ith matrix:

$$r^{2} = \mathbf{F}_1\mathbf{F}_2\cdots\mathbf{F}_n \qquad (6)$$
The internal F_i are 5 × 5 matrices that are more conveniently written in block form as 3 × 3 matrices.
$$\mathbf{F}_i = \begin{pmatrix} 1 & 2\mathbf{l}_i^{T}\mathbf{T}_i & l_i^{2} \\ \mathbf{0} & \mathbf{T}_i & \mathbf{l}_i \\ 0 & \mathbf{0} & 1 \end{pmatrix}, \quad 1 < i < n \qquad (7)$$
The F₁ and F_n in Eq. (6) are the first row and the last column, respectively, of the matrix in Eq. (7).
4. Combination of Matrix Expressions for Z and r²
The desired average of r² can be written as a sum over the κ conformations of the chain:

$$\langle r^{2}\rangle_0 = \sum_{\kappa} p_{\kappa}\, r_{\kappa}^{2} \qquad (8)$$
Here r²_κ is obtained from Eq. (6) with the assignment of the n − 2 torsion angles that are found in conformation κ. The information required for the normalized probability of this conformation, p_κ, is present in Z: it is the statistical weight for conformation κ divided by Z. The desired statistical weight is the product of σ, raised to a power given by the number of g states in conformation κ, and ω, raised to a power given by the number of adjacent pairs of bonds in g states of opposite sign in that conformation. The proper number of factors of σ and ω, and the sum over all κ conformations, are obtained with another serial product of n matrices:

$$\langle r^{2}\rangle_0 = Z^{-1}\,\mathbf{G}_1\mathbf{G}_2\cdots\mathbf{G}_n \qquad (9)$$
The internal G_i matrices in Eq. (9) are obtained by expansion of each element of U_i by the appropriate form of F_i. This form of F_i must use the torsion angle in T_i that is appropriate for the state indexed by that column of U_i. Using F_t, F_g⁺, and F_g⁻ to denote these forms of the F_i defined in Eq. (7), the internal G_i for Eq. (9) can be written in block form as ν × ν matrices:
$$\mathbf{G}_i = \begin{pmatrix} \mathbf{F}_t & \sigma\mathbf{F}_{g^+} & \sigma\mathbf{F}_{g^-} \\ \mathbf{F}_t & \sigma\mathbf{F}_{g^+} & \sigma\omega\mathbf{F}_{g^-} \\ \mathbf{F}_t & \sigma\omega\mathbf{F}_{g^+} & \sigma\mathbf{F}_{g^-} \end{pmatrix}, \quad 1 < i < n \qquad (10)$$
When written out element by element, the dimensions are 5ν × 5ν. The terminal G_i in Eq. (9) are given by G₁ = [F₁ 0 0] and G_n = [F_n F_n F_n]ᵀ. Numerical evaluation of Eq. (9) is fast, even for very long chains, because computers can rapidly perform the required matrix multiplications. Numerical results are usually reported as the dimensionless characteristic ratio, C_n, in which ⟨r²⟩₀ has been normalized by the value expected for a random flight chain with the same n and l. Thus C_n = ⟨r²⟩₀/nl². The values of ⟨r²⟩₀ increase without limit as n increases, but C_n approaches an asymptotic limit as n → ∞. The approach to this limit is linear in 1/n at large n. For this reason, C∞ is easily obtained by linear extrapolation to 1/n = 0 of a plot of C_n vs. 1/n. A short FORTRAN program that exploits this property of C_n can be found in Appendix C of reference [4]. Table 2 presents a comparison of the behavior of the RIS method with simpler analytical descriptions of C∞. The freely jointed chain has C∞ = 1 at all temperatures. Fixing the bond angle yields a temperature-independent value given by (1 − cos θ)(1 + cos θ)⁻¹. Introduction of the symmetric torsion, via C∞ = [(1 − cos θ)(1 + cos θ)⁻¹][(1 − ⟨cos φ⟩)(1 + ⟨cos φ⟩)⁻¹], produces a further increase in C∞ and correctly predicts the negative temperature coefficient of the mean square dimensions for unperturbed polyethylene chains. The value of C∞ remains too small, however. Introduction of the pair-wise interdependence of the bonds, achieved with the RIS method, is required if the computed C∞ is to be in agreement with experiment.
Table 2. C∞ for unperturbed polyethylene at 413 K, as evaluated by four methods

| Method | Information used for ⟨r²⟩₀ | C∞ | ∂ ln C∞/∂T |
|---|---|---|---|
| Freely jointed chain | n, l | 1 | 0 |
| Freely rotating chain | n, l, θ | 2.20 | 0 |
| Symmetric hindered rotation, independent bonds | n, l, θ, ν, φ, Eσ | 3.91 | −0.0011 K⁻¹ |
| RIS model, including bond interdependence | n, l, θ, ν, φ, Eσ, Eω | 7.95 | −0.0010 K⁻¹ |
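For readers who prefer a working transcription over the FORTRAN program cited above, here is a compact numerical sketch of Eqs. (1)–(10) in Python. It applies the same internal U_i to every internal bond and uses the t-state generator matrix at the chain ends, simplifications whose effect vanishes as n grows; with the Table 1 parameters it should extrapolate to a C∞ near the RIS entry of Table 2.

```python
import numpy as np

R, TEMP = 8.314e-3, 413.0                      # kJ/(mol K), K
l, theta = 0.154, np.radians(112.0)            # bond length (nm), C-C-C angle
g = np.radians(60.0 + 7.5)                     # gauche displacement from cis
phis = [np.pi, g, -g]                          # torsions for t, g+, g- (phi = 0 at cis)
sigma = np.exp(-2.1 / (R * TEMP))              # first-order weight
omega = np.exp(-8.4 / (R * TEMP))              # second-order weight

U = np.array([[1, sigma,         sigma],
              [1, sigma,         sigma * omega],
              [1, sigma * omega, sigma]])

def T_mat(phi):
    """Transformation matrix of Eq. (4)."""
    ct, st, cp, sp = np.cos(theta), np.sin(theta), np.cos(phi), np.sin(phi)
    return np.array([[-ct, st, 0.0],
                     [-st * cp, -ct * cp, -sp],
                     [-st * sp, -ct * sp, cp]])

def F_mat(phi):
    """5x5 generator matrix of Eq. (7)."""
    F = np.zeros((5, 5))
    F[0, 0] = F[4, 4] = 1.0
    F[0, 1:4] = 2.0 * l * T_mat(phi)[0, :]     # 2 l_i^T T_i with l_i = [l, 0, 0]^T
    F[0, 4] = l * l
    F[1:4, 1:4] = T_mat(phi)
    F[1, 4] = l
    return F

Fs = [F_mat(p) for p in phis]                  # F_t, F_g+, F_g-
G = np.zeros((15, 15))                         # internal G_i of Eq. (10)
for a in range(3):
    for b in range(3):
        G[5 * a:5 * a + 5, 5 * b:5 * b + 5] = U[a, b] * Fs[b]

def C_n(n):
    """C_n = <r^2>_0 / (n l^2) via Eqs. (1) and (9); t state at chain ends."""
    G1 = np.zeros(15)
    G1[:5] = Fs[0][0, :]                       # terminal [F1 0 0]
    Gn = np.tile(Fs[0][:, 4], 3)               # terminal [Fn Fn Fn]^T (phi-independent)
    v, z = G1, np.array([1.0, 0.0, 0.0])
    for _ in range(n - 2):
        v, z = v @ G, z @ U
    return (v @ Gn) / ((z @ np.ones(3)) * n * l * l)

# C_infinity from linear extrapolation of C_n vs 1/n to 1/n = 0
n1, n2 = 500, 1000
c_inf = C_n(n2) + (C_n(n2) - C_n(n1)) * (1.0 / n2) / (1.0 / n1 - 1.0 / n2)
print(f"C_inf ~ {c_inf:.2f}")                  # expected near the RIS value in Table 2
```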
5. Other Common Uses of the RIS Method
Other conformation-dependent properties of the unperturbed polyethylene chain can also be evaluated by simple modification of the method described above for ⟨r²⟩₀. Equation (1) for Z is still pertinent, and we also retain the form of Eq. (9). The only important change is alteration of Eq. (7) so that the F_i are the ones appropriate for the new property of interest. For example, the average of the end-to-end vector is obtained through replacement of Eq. (7) with the following expression:

$$\mathbf{F}_i = \begin{pmatrix} \mathbf{T}_i & \mathbf{l}_i \\ \mathbf{0} & 1 \end{pmatrix} \qquad (11)$$
The terminal matrices in the form of Eq. (6) that is appropriate for ⟨r⟩ instead of ⟨r²⟩ are F₁ = [T₁ l₁] and F_n = [l_n 1]ᵀ. Elaboration of this approach to other types of conformation-dependent properties may require the introduction of additional information into the F_i. For example, calculation of the mean square dipole moment for a polar chain such as polyoxyethylene or poly(vinyl chloride) requires introduction into F_i of the dipole moment for that bond, m_i. In addition to m_i, the anisotropic part of the polarizability tensor for the bond is also required in the calculation of the molar Kerr constant. Additional properties, such as the macrocyclization equilibrium constant and the stereochemical composition of a vinyl polymer after epimerization to stereochemical equilibrium, are also accessible. The construction of a RIS model takes its simplest form for a molecule such as polyethylene in which all of the bonds are identical. The method is not restricted to such chains, however. The bonds can be of different types, as they are in polyoxyethylene, and the bonds do not all need to have the same number of rotational isomeric states. Differences in the numbers of rotational isomeric states merely produce rectangular U_i, with the number of rows given by ν at bond i − 1 and the number of columns given by ν at bond i, as is seen in polycarbonate. Third-order interactions can be incorporated if the dimensions of U_i are expanded to ν_(i−2)ν_(i−1) × ν_(i−1)ν_i. Manipulation of Z yields probabilities for the local conformations that can be useful for comparison with experimental data, such as average bond conformations deduced from NMR spectra. This information can also be used to control the behavior of coarse-grained chains so that the chain, and all of its subchains, have distribution functions for their end-to-end distances that match those of the real polymer that the coarse-grained chain represents. These probabilities also provide the basis for Monte Carlo calculations that efficiently generate representative chains, such that the probability of the generation of a chain is directly proportional to its statistical weight. This procedure can be used to efficiently generate data such as the distribution function for the end-to-end distance.
Flory’s classic book [5] and review [6] are required reading for new users of the RIS model. The subject was updated 25 years after the first publication of Flory’s book, in another book that also includes problems, many with answers, that may facilitate self-instruction in the use of the method [4]. Detailed RIS models have been devised for an enormous number of polymers. The RIS models that appear in the literature through the mid-1990s have been tabulated in an extensive review [7].
Acknowledgment

Preparation of this manuscript was supported by NSF DMR 00-98321.
References

[1] P.J. Flory, Principles of Polymer Chemistry, Cornell University Press, Ithaca, New York, 1953.
[2] M.V. Volkenstein, Dokl. Akad. Nauk SSSR, 78, 879, 1951.
[3] A. Abe, R.L. Jernigan, and P.J. Flory, J. Am. Chem. Soc., 88, 631, 1966.
[4] W.L. Mattice and U.W. Suter, “Conformational theory of large molecules. The rotational isomeric state model in macromolecular systems,” Wiley-Interscience, New York, 1994.
[5] P.J. Flory, “Statistical mechanics of chain molecules,” Wiley-Interscience, New York, 1969; reprinted with the same title by Hanser, München, 1989.
[6] P.J. Flory, Macromolecules, 7, 381, 1974.
[7] M. Rehahn, W.L. Mattice, and U.W. Suter, Adv. Polym. Sci., 131/132, 1, 1997.
9.4 MONTE CARLO SIMULATION OF CHAIN MOLECULES

V.G. Mavrantzas
Department of Chemical Engineering, University of Patras, Patras, GR 26500, Greece
Molecular simulations differ from other forms of numerical computation in that the computer with which the calculations are carried out is not merely a machine but the virtual laboratory in which the system is studied. In such a “laboratory”, understanding is achieved by first constructing a theoretical model of molecular behavior able to reproduce and predict experimental observations, and then solving it using a suitable algorithm or computer program. Molecular dynamics and Monte Carlo are two such methods that provide exact results to statistical mechanics problems (for the given molecular model), in preference to approximate solutions. Monte Carlo, in particular, has developed into a powerful tool for simulating the properties of complex systems such as chain molecules, because of its capability to accelerate system equilibration through the implementation of large or unphysical moves that do not require the system to follow its natural trajectory.
1. The Monte Carlo Method
Monte Carlo (MC) is a computational method for simulating the properties of matter that relies on probabilities. Like molecular dynamics (MD), it aims at providing exact solutions to statistical mechanical problems through a rigorous calculation of the potential energy of interaction; simultaneously, many of the assumptions invoked in analytical or approximate approaches to the same problems are avoided. In contrast to MD, however, where the atoms are moved according to the inter- and intra-molecular forces derived from the potential function by solving Newton’s equations of motion, MC is a stochastic method: it relies on transition probabilities between different states of the simulated system [1, 2]. These transitions are traced through a scheme that
involves, in general, three steps: (a) generation of an initial configuration, (b) trial of a randomly generated system configuration, and (c) evaluation of an “acceptance criterion” for the trial configuration and comparison to a random number to decide whether the trial configuration will be accepted or rejected. The acceptance criterion is usually formulated in terms of the potential energy change between trial (new) and existing (old) states and some other properties of the new and old configurations. To accelerate sampling in a MC process, it is important to sample preferentially those states that make the most significant contributions to the configurational properties of the system. This is achieved by the technique of “importance sampling” [1]. According to this, the simulation proceeds by generating a Markov chain of states, i.e., a sequence of states in which the outcome of a trial state depends only on the state that immediately precedes it. Such random states are chosen from a certain distribution, ρ_X(Γ), where X denotes the macroscopic constraints of the statistical ensemble in which the simulation is carried out and Γ the phase space. This allows function evaluations to be concentrated in the region of space that makes important contributions to ensemble averages such as the energy. In MC (see details in Chapter 2: Basic MC, Volume 1 of the Handbook), two states Γ_m and Γ_n are linked by the element (mn) of a transition matrix π which gives the probability of going from state m to state n. To generate the phase space trajectory, Metropolis et al. [3] suggested the following rule for constructing the matrix π:
$$\pi_{mn} = \begin{cases} a_{mn} & \text{if } \rho_n \ge \rho_m,\ m \ne n \\[4pt] a_{mn}\,\dfrac{\rho_n}{\rho_m} & \text{if } \rho_n < \rho_m,\ m \ne n \end{cases} \qquad (1)$$
In Eq. (1), a is a symmetric (a_mn = a_nm) stochastic matrix designed to take the system from state m into any of its neighboring states n with equal probability; it is often called the underlying matrix of the Markov chain. According to the Metropolis rule, then, the probability of accepting the new state is in general:
$$P_{\mathrm{acc}} = \min\left[1,\ \exp\left(-\frac{\Delta U}{k_B T}\right)\right] \qquad (2)$$
where ΔU = U_new − U_old denotes the difference in potential energy between the new and old states, k_B is Boltzmann’s constant and T the temperature. Equation (2) guarantees that energetically more “favorable” states are accepted preferentially. By construction, there is considerable freedom in choosing a, the only requirement being that a_nm = a_mn. For example, instead of one atom (as in the original MC algorithms), several or all atoms of the system may be moved simultaneously. Moreover, one can devise totally unphysical ways of moving atoms
in space that substantially depart from the system’s natural trajectory. These allow moving through configuration space much more efficiently than by MD. For systems of chain molecules (e.g., synthetic polymers, highly branched macromolecules and biopolymers), this is of paramount importance. Chain systems present considerable difficulties for molecular simulation relative to either atoms or short polyatomic molecules, due to the wide spectrum of time and length scales characterizing their dynamics and structure. The computational difficulties are more acute for the dynamic methods, such as MD. These methods are plagued by the problem of long relaxation times, and their applicability is restricted to short chain-length systems. Of course, one can think of a domain decomposition method and algorithm execution on a parallel computer for extending the length scales accessible to MD. However, one is still faced with the problem of long relaxation times, since the time scale that can be spanned by a brute-force dynamic method today falls short of the longest relaxation time of real-life chain systems [4]. In this direction, MC methods can play a key role: through the design of clever (sometimes “unphysical”, from the point of view of true dynamics) moves for generating new, trial configurations, they can accelerate system equilibration many orders of magnitude more efficiently than MD, for the same model of molecular geometry and interatomic potential.
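In code, the Metropolis criterion of Eq. (2) is a few lines inside the move loop. A generic accept/reject skeleton is sketched below; `propose_move` and `potential_energy` are placeholders for whatever move set and force field are in use, and a symmetric underlying matrix a is assumed.

```python
import numpy as np

rng = np.random.default_rng()

def metropolis_step(config, potential_energy, propose_move, kT):
    """One MC step: generate a trial configuration with a symmetric proposal
    and accept it with probability min[1, exp(-dU/kT)], Eq. (2)."""
    trial = propose_move(config)
    dU = potential_energy(trial) - potential_energy(config)
    if dU <= 0.0 or rng.random() < np.exp(-dU / kT):
        return trial, True        # accepted
    return config, False          # rejected; the old state is counted again
```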
2. Simple Monte Carlo Moves
For systems of simple molecules, trial configurations in a MC process can typically be generated by displacing, exchanging, removing or adding a molecule [2]. For systems of chain molecules, the situation is more complex. Excluded volume interactions among polymer segments, connectivity of atoms along chains, and conformational stiffness of chain backbones make it very difficult to sample configuration space efficiently. Thus, the earliest MC simulations were conducted using lattice models, initially on single chains and later on multi-chain systems [5]. An early “unphysical” MC move that proved particularly useful in simulations of dense polymer systems was “reptation” [6]. Reptation is a “slithering snake” move that deletes a segment from one end of a randomly selected chain and appends it to the other end; through this, the chain “slides” along its contour by one segment. Other simple MC moves include [7]: (a) end-mer rotation, (b) libration or flip, (c) configurational bias, (d) concerted rotation, (e) generalized reptation and (f) parallel rotation [8]. Configurational bias (CB), in particular, has found extensive applications in simulations of phase equilibria formulated initially in the form of the Gibbs ensemble MC method [9] and more recently in the form of an expanded grand canonical ensemble [10]; the latter formulation
alleviates problems associated with the insertion/deletion or exchange of large chain molecules in dense systems. Of the above MC moves, reptation, configurational bias, end-mer rotation and generalized reptation operate only at chain ends. Because of this, when used with realistic continuum models, they benefit from the excess free volume available near chain ends and can significantly enhance system equilibration. However, their effectiveness degrades in longer-chain systems, where chain ends are scarce. As a result, only chains up to 70 units long can be simulated with the cocktail of simple MC moves mentioned above.
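Reptation itself admits a very compact implementation. The sketch below slithers a freely jointed chain (an illustrative model assumption) by one segment; in a dense system the regrown end segment would additionally be subjected to the acceptance criterion of Eq. (2) before the move is kept.

```python
import numpy as np

rng = np.random.default_rng()

def random_bond(length=1.0):
    """Random bond vector of fixed length (freely jointed chain)."""
    v = rng.normal(size=3)
    return length * v / np.linalg.norm(v)

def reptate(chain):
    """Slithering-snake move: delete a segment from a randomly chosen end
    of the (N, 3) coordinate array `chain` and append it to the other end."""
    if rng.random() < 0.5:                       # slide head-to-tail
        return np.vstack([chain[1:], chain[-1] + random_bond()])
    return np.vstack([chain[0] + random_bond(), chain[:-1]])
```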
3. Complex MC Moves: Variable Connectivity and Extended Configurational Bias MC Methods
To simulate longer-chain systems, moves capable of inducing drastic reconfiguration of large internal sections of the chains are also needed. Such moves were first introduced in the 1980s in the form of "chain breaking" or "pseudokinetic" MC algorithms [11]. These algorithms alter the connectivity among polymer segments in the lattice model at the expense of introducing some polydispersity (i.e., a distribution of chain lengths) into the polymer. Small alterations in chain connectivity are very desirable in a MC simulation because they substantially enhance the efficiency with which the long-range structural features of the system (chain end-to-end vectors, radii of gyration, etc.) are sampled. Guided by these "chain-breaking" lattice MC algorithms, variable connectivity MC methods for continuous-space models were designed in the mid-1990s for simulations of chain systems. The first such method to be developed was end-bridging [12]. End-bridging (EB) effects a change in chain connectivity by constructing a trimer bridge between a chain end and an interior segment of another chain. Simultaneously, one of the trimers adjacent to the bridged interior atom is excised; this ensures that the total polymer mass remains constant. EB is shown schematically in Fig. 1. EB, as well as the rest of the variable connectivity MC moves for chain molecules developed in the ensuing years, was founded on the geometric problem of trimer bridging, which is mathematically formulated as follows: given two dimers in space, connect them with a trimer such that the resulting heptamer has prescribed bond lengths and angles [7]. The problem has so far been addressed by two different methods [13, 14]. Other inter- or intra-molecular variable connectivity moves include intramolecular rebridging or concerted rotation, directed internal bridging, directed end-bridging, fusion and scission, and self-end-bridging. These moves belong to the class of chain connectivity-altering MC methods, whose introduction and
Figure 1. Schematic of the end-bridging (EB) move. The attacking chain is denoted as ich and the victim chain as jch. Trimer (ja, jb, jc) is to be excised from the victim chain. (ja′, jb′, jc′) is the trimer bridging end i of the attacking chain to internal mer j of the victim chain. The two new chains are labeled ich′ and jch′.
application in simulations of chain molecules have revolutionized the study of the conformational and thermodynamic properties of these systems [7]. To comply with the condition of microscopic reversibility in a variable connectivity MC move such as EB, care must be taken to: (a) evaluate all possible geometric solutions to the trimer bridging problem associated with the move, (b) incorporate the appropriate Jacobians of the transformations (involved in the solution of the geometric problem) into the acceptance criteria, and (c) calculate the attempt probabilities for both the forward and reverse moves. A typical acceptance criterion for an EB move, for example, reads:
$$P_{\mathrm{acc}} = \min\left[1,\ \frac{P_{\mathrm{select}}(\mathrm{new}\to\mathrm{old})\,J(\mathrm{new})\,\exp\!\left(-U(\mathrm{new})/k_B T\right)}{P_{\mathrm{select}}(\mathrm{old}\to\mathrm{new})\,J(\mathrm{old})\,\exp\!\left(-U(\mathrm{old})/k_B T\right)}\right] \qquad (3)$$
where J and P_select denote the Jacobian of the transformation and the attempt probability of the corresponding move (old → new or new → old). To deal with polydispersity effects, simulations with chain connectivity-altering MC algorithms are carried out in a semi-grand canonical ensemble, in which a spectrum µ* of chemical potentials is employed to control the chain length distribution. Such a semigrand ensemble is denoted as [N_ch n P T µ*], since the following variables are kept constant: the pressure P, the temperature T,
the total number of chains N_ch, the total number of mers n, and the spectrum of relative chemical potentials µ* of all chain species in the system except two, which are taken as reference species. Expressions for µ* that generate the most important chain length distributions have been derived by Pant and Theodorou [12] for bulk systems, and by Daoulas et al. [15] for chains at the interface with a substrate. Complex MC moves which do not alter chain connectivity and yet enable equilibration of rather long chains have also been formulated. These moves were first introduced by Escobedo and de Pablo [16] as extensions of the original continuum configurational bias (CCB) method. Today they have evolved into what are known as the internal configurational bias (ICB) and self-adapting fixed end-point configurational bias (SAFE-CB) methods which, by combining the philosophies of the ConRot and CB methods, enable sampling of arbitrarily long internal chain sections [17, 18]. Generalized CB algorithms have proven very useful in simulations of cross-linked network structures and, most importantly, in designing simulation strategies based on density-of-states sampling methods [19–21].
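The bookkeeping behind an acceptance criterion of the form of Eq. (3) is itself simple; all of the difficulty resides in the geometric bridging solver. The hedged sketch below therefore assumes that the solver supplies the energies, Jacobians and selection probabilities, and only evaluates the acceptance test:

```python
import numpy as np

rng = np.random.default_rng()

def accept_bridging(u_old, u_new, j_old, j_new, p_fwd, p_rev, kT):
    """Acceptance test of Eq. (3) for a connectivity-altering move.

    u_old, u_new : potential energies of the old and new configurations
    j_old, j_new : Jacobians of the trimer-bridging transformations
    p_fwd        : attempt probability P_select(old -> new)
    p_rev        : attempt probability of the reverse move, P_select(new -> old)
    """
    ratio = (p_rev * j_new * np.exp(-u_new / kT)) / \
            (p_fwd * j_old * np.exp(-u_old / kT))
    return rng.random() < min(1.0, ratio)
```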
4. Advanced Monte Carlo Moves: Simulation of Non-linear Chain Systems
The class of variable connectivity MC moves was initially developed for simulating model systems of linear chains and depended heavily on the presence of chain ends. MC moves that are independent of chain ends were first proposed by Balijepalli and Rutledge [22]: the intra-chain and inter-chain concerted rotation moves, developed as generalizations of the ConRot move for simulating the morphology and elasticity [23] of semicrystalline polymer interphases consisting of chain segments whose termini are fixed to the crystal surface. Through a novel design by Karayiannis et al. [24], these moves have since evolved into what are known as the double bridging (DB) and intramolecular double rebridging (IDR) moves. The two moves are generalizations of the EB move, since they involve the construction of two trimer bridges (instead of one); they are shown schematically in Figs. 2(a) and (b). DB and IDR are applicable to a variety of systems, such as nonlinear chain architectures, long-chain branched macromolecules, cyclic peptides, grafted polymers, chains with stiff backbones, and infinite-length chain molecules (see Fig. 3); the simulation of all these systems is almost impossible with MD or other MC methods. A typical example is that of long-chain branched molecules [see, for example, Fig. 4]. By properly re-designing the two moves to allow for bridgings between: (a) the main backbones of two different chains, (b) the branches of two different chains, and (c) the branches of the same
Figure 2a. Schematic of the double bridging (DB) move. (a): Local configuration of the two chains, ich and jch, prior to the DB move. Trimer (ja, jb, jc) is to be excised from jch and trimer (ia, ib, ic) from ich. (b): Local configurations of the two new chains after the DB move. Trimer (ja, jb, jc) connects atoms i and j in ich′. Trimer (ia, ib, ic) connects atoms i2 and j2 in jch′ (after Karayiannis et al. 2002).
chain, and by introducing special moves (such as the H-BR and the double ConRot) to effect atom displacements at the junction points, a novel MC algorithm arises that is capable of simulating H-shaped, comb and star-like chain molecules [25]. In addition to the development of new MC algorithms capable of more efficient sampling of long-chain systems, the key to balancing CPU time and code performance for a given system is the choice of an optimal (or near-optimal) mix of moves: with a whole host or "cocktail" of moves available for a given application, identifying the most efficient mix can substantially affect code performance. In searching for optimal move mixes, one can profit from simple scaling arguments that treat polymer chains as random coils. Such an analysis was carried out, for example, by Mavrantzas et al. [13], who quantified the performance of EB in simulations of bulk polymers. For polydisperse, linear-chain systems, a mix of moves which results in nearly optimal code performance includes: 5% reptations, 5% end-mer rotations, 5% CCBs, 5% flips, 30% ConRots, 48% EBs and 2% volume fluctuations. This mix combines the higher acceptance ratio of the simpler moves with the power of the more complex (though less frequently accepted) EB move. Such a scheme remains near-optimal for systems with a polydispersity index
Figure 2b. Schematic of the intramolecular double rebridging (IDR) move. Top: Local configuration of the chain prior to the IDR move. The attack shown by the dark gray arrow is combined either with attack a or with attack b represented by the light gray arrows. Bottom: trial configurations of the chain after both a and b attacks have been completed (after Karayiannis et al. 2002).
higher than 1.1. At lower polydispersities, or for strictly monodisperse systems, the EB moves should be replaced by DBs and IDRs in equal proportions. Including DBs and IDRs (in place of ConRot or EB moves) can also accelerate system equilibration: (a) in cases where one or both chain ends are permanently fixed at a surface (typical examples include end-grafted polymer chains, i.e., polymer brushes, and semicrystalline interphases); (b) in systems of non-linear polymer architectures (typical examples are long-chain branched and cyclic molecules); and (c) in systems of oriented chains, such as in the presence of a deforming tensorial field [26]. For chains bearing short, frequently spaced branches along their backbone, code performance can be enhanced by re-growing the short branches with extended CB moves. For
Figure 3. Application of the DB and/or IDR moves to MC simulations of chain molecules with a variety of chemical architectures. (a): H-shaped molecules, (b): cyclic molecules and (c): grafted molecules (after Karayiannis et al. 2002).
systems consisting of different chain species (such as bi-disperse melts, mixtures of end-grafted and free chains, or mixtures of linear and non-linear molecules), configurational sampling can be accelerated by allowing the available variable connectivity moves to operate also on pairs of chains that belong to different species.
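In a working code, a move mix such as the one quoted above is realized simply by drawing the type of each trial move from a fixed discrete distribution. A minimal sketch with the near-optimal proportions for polydisperse linear chains (the move names are placeholders for the routines discussed in this article):

```python
import numpy as np

rng = np.random.default_rng()

# Near-optimal mix for polydisperse, linear-chain systems (see main text)
MOVE_MIX = {
    "reptation": 0.05, "end_mer_rotation": 0.05, "ccb": 0.05,
    "flip": 0.05, "conrot": 0.30, "end_bridging": 0.48,
    "volume_fluctuation": 0.02,
}
MOVE_NAMES = list(MOVE_MIX)
MOVE_PROBS = np.array([MOVE_MIX[name] for name in MOVE_NAMES])

def next_move():
    """Draw the type of the next trial move according to the mix."""
    return rng.choice(MOVE_NAMES, p=MOVE_PROBS)
```

For strictly monodisperse systems, the "end_bridging" entry would be split equally between DB and IDR moves, as discussed above.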
5. Applications
An important test of the ability of any atomistic simulation method to describe liquids in a realistic manner is the calculation of atomic structure. This is usually quantified by calculating the pair radial distribution function g(r), describing the spatial correlations between two atoms at separation distance r in the liquid. For chain molecules, the (total) pair distribution function has contributions from both intra- and inter-molecular correlations:

$$g^{\mathrm{tot}}(r) = g(r) + \frac{w(r)}{\rho N} \qquad (4)$$
Figure 4. Schematic representation of possible combinations of the DB and IDR moves to be used in MC simulations of non-linear, H-shaped chain molecules. Scheme 1(a)–(c): bridges between the main backbones or the branches of two different molecules. Scheme 2(a)–(b): the H-BR move designed to effect displacements of the branch points (after Karayiannis et al. 2003).
where w(r) is the intra-chain pair density function and g(r) the intermolecular pair distribution function. In Eq. (4), ρ = N_ch/V is the chain number (or molecular) density and N the number of mers per chain. The total pair distribution function is of great significance because its Fourier transform gives the static structure factor S(k), which is measured experimentally by X-ray diffraction:

$$S(k) - 1 = \rho N \int_0^{\infty} 4\pi r^2 \left[g^{\mathrm{tot}}(r) - 1\right] \frac{\sin(kr)}{kr}\, dr \qquad (5)$$
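Given a tabulated g^tot(r), Eq. (5) reduces to a one-dimensional quadrature. The following minimal sketch uses the trapezoidal rule over the finite range on which g^tot(r) is sampled (truncation corrections at the upper limit of the integral are ignored):

```python
import numpy as np

def structure_factor(r, g_tot, rho, N, k_values):
    """Static structure factor S(k) from Eq. (5).

    r, g_tot : radial grid and total pair distribution function g_tot(r)
    rho      : chain number density N_ch / V
    N        : number of mers per chain
    """
    s = []
    for k in k_values:
        # sin(kr)/(kr) written via np.sinc, which is sin(pi x)/(pi x)
        integrand = 4.0 * np.pi * r**2 * (g_tot - 1.0) * np.sinc(k * r / np.pi)
        integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
        s.append(1.0 + rho * N * integral)
    return np.array(s)
```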
Figure 5 demonstrates the unique ability of the variable connectivity MC algorithms to reliably reproduce the atomic structure of a long-chain linear polyethylene (PE) melt, by comparing simulated against experimentally measured X-ray patterns. The agreement is excellent. The MC simulation was carried out on a strictly monodisperse, 25-chain C500 PE melt at T = 450 K and P = 1 atm, in a cubic box with periodic boundary conditions in all three dimensions, using the united-atom model [24]. The superiority of MC algorithms based on chain connectivity-altering moves over the conventional MD method in efficiently simulating systems of chain molecules is documented in Fig. 6. The figure compares the rate at which the long-length-scale characteristics of the melt are relaxed by the two methods for two systems, representative of a linear and a non-linear chain
Figure 5. Simulated (at T = 450 K) and experimental (at T = 430 K) X-ray diffraction patterns, S(k) − 1 vs. k (Å⁻¹), of linear polyethylene (P = 1 atm). The simulations have been executed with the variable connectivity MC algorithms discussed in the main text (after Karayiannis et al. 2002).
architecture, respectively. The first is a linear, strictly monodisperse C500 PE melt and the second a non-linear H-shaped PE melt containing on average 300 carbon atoms on the backbone and 50 carbon atoms on each of its four branches, denoted as PEH(50)2(300)(50)2. As a measure of method efficiency, we use the rate of decay of the end-to-end vector orientational autocorrelation function with CPU time. Figure 6 shows MC to be orders of magnitude more efficient than MD: even if the length scale of the MD simulation is decreased with a domain decomposition method and execution on a 16-node parallel computer, MC still remains the method of choice. Figure 7 shows a typical snapshot of a 4-chain PEH(70)2(400)(70)2 melt before and after the MC simulation with the advanced moves discussed above. Within about 2 × 10^6 CPU s on a dual 2.8 GHz Xeon system, the poor initial configuration (characterized by unusually large voids and an unphysical distribution of mass in the simulation box) is equilibrated to a structure fully representative of the real melt (characterized by a uniform distribution of the polymer mass over the box and excellent statistics of chain dimensions). The new variable-connectivity MC moves can also be designed for effective vector implementation through domain decomposition methods. This has
Figure 6. Decay of the chain end-to-end vector orientational autocorrelation function with CPU time (in units of 10^6 s) in MC and MD simulations of linear and H-shaped PE melts: (a) A monodisperse C500 PE melt simulated with brute-force MD. (b) The same system simulated with the variable connectivity MC moves. (c) The same system simulated with a parallel MD code on a 16-node Beowulf cluster. (d) A PEH(50)2(300)(50)2 PE melt simulated with brute-force MD. (e) The same system simulated with the new chain connectivity-altering MC moves.
Figure 7. Typical atomistic snapshots of the simulated PEH(70)2 (400)(70)2 system before (a) and after (b) the simulation with the variable connectivity MC algorithms (T = 450 K, P = 1 atm). Shown in black and dark gray are atoms belonging to the backbone and the four branches of an arbitrarily selected H-shaped molecule (after Karayiannis et al. 2003).
allowed the MC simulation of the equilibrium atomic structure of a linear PE melt with an average chain length of 6000 backbone carbon atoms (C6000); this is typical of the commercial grades widely used for injection-molded articles [27].
6. Outlook
Through the design of efficient (mostly "unphysical") moves, MC has developed into a principal research tool for simulating chain systems of a variety of chemical architectures. Polyolefin and polydiene melts, polymers grafted on substrates, cyclic peptides, oriented systems, as well as binary systems composed of chemically similar macromolecules are all amenable to MC simulation in continuous space and full atomistic detail. Of course, being a stochastic method, MC cannot provide any direct dynamic information. Owing to its dynamical nature, MD remains the only molecular simulation method that can give real-time information about the evolution of the system. MC can, however, be used to calculate dynamic properties indirectly. This can be achieved either in the context of non-equilibrium thermodynamics or by combining MC with MD simulations in a hierarchical scheme. In the former case, MC can be used to address questions related to the viscoelastic features of the melt [26, 28]; for example [26], the dependence of the entropy of the deformed chain system on its average conformation, or the functional form of its effective spring constant. In the latter case, model configurations thoroughly equilibrated with MC can serve as starting points for MD simulations aimed at estimating the spectrum of relaxation times of the chain system and its frictional and other dynamic properties [4]. Designing MC algorithms for simulations of beyond-equilibrium systems, and combining MC with MD in the context of coarse-grained methodologies, constitute some of the most active research directions in the field of molecular simulation; they offer a more promising avenue for probing both structure and dynamics in chain systems than direct MD. The simulation of polymers (either purely amorphous or semi-crystalline) beyond equilibrium, the prediction of mixing thermodynamics in polymer blends, the derivation of "coarse-grained" model representations of complex chain molecules (such as polyimides and polyesters) from atomistic information, and the simulation of the structural and conformational properties of peptide molecules constitute only a small part of the list of materials problems to be investigated by MC in the coming years. As large-scale scientific computing becomes more accessible and computing performance continues to grow, the popularity and usefulness of MC will only increase.
References

[1] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
[2] R.J. Sadus, Molecular Simulation of Fluids: Theory, Algorithms and Object-Orientation, Elsevier, Amsterdam, 1999.
[3] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., 21, 1087–1092, 1953.
[4] V.A. Harmandaris and V.G. Mavrantzas, "Molecular dynamics simulations of polymers," In: M. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, pp. 177–222, 2004.
[5] Z. Alexandrowicz and Y. Accad, "Monte Carlo of chains with excluded volume: distribution of intersegmental distances," J. Chem. Phys., 54, 5338–5345, 1971.
[6] M. Vacatello, G. Avitabile, P. Corradini, and A. Tuzi, "A computer model of molecular arrangement in a n-paraffinic liquid," J. Chem. Phys., 73, 548–552, 1980.
[7] D.N. Theodorou, "Variable-connectivity Monte Carlo algorithms for the atomistic simulation of long-chain polymer systems," In: P. Nielaba, M. Mareschal, and G. Ciccotti (eds.), Bridging Time Scales: Molecular Simulations for the Next Decade, Springer Verlag, Berlin, pp. 64–127, 2002.
[8] S. Santos, U.W. Suter, M. Müller, and J. Nievergelt, "A novel parallel-rotation algorithm for atomistic Monte Carlo simulation of dense polymer systems," J. Chem. Phys., 114, 9772–9779, 2001.
[9] A.Z. Panagiotopoulos, "Direct determination of fluid phase equilibria in the Gibbs ensemble: a review," Mol. Simul., 9, 1–23, 1992.
[10] A.P. Lyubartsev, A.A. Martsinovski, S.V. Shevkunov, and P.N. Vorontsov-Velyaminov, "New approach to Monte Carlo calculation of the free energy: method of expanded ensembles," J. Chem. Phys., 96, 1776–1783, 1992.
[11] O.F. Olaj and W. Lantschbauer, "Simulation of chain arrangement in bulk polymer. 1. Chain dimensions and distribution of the end-to-end distance," Makromol. Chem. Rapid Commun., 3, 847–858, 1982.
[12] P.V.K. Pant and D.N. Theodorou, "Variable connectivity method for the atomistic Monte Carlo simulation of polydisperse polymer melts," Macromolecules, 28, 7224–7234, 1995.
[13] V.G. Mavrantzas, T.D. Boone, E. Zervopoulou, and D.N. Theodorou, "End-bridging Monte Carlo: a fast algorithm for atomistic simulation of condensed phases of long polymer chains," Macromolecules, 32, 5072–5096, 1999.
[14] M.G. Wu and M.W. Deem, "Efficient Monte Carlo for cyclic peptides," Mol. Phys., 97, 559–580, 1999.
[15] K.Ch. Daoulas, A.F. Terzis, and V.G. Mavrantzas, "Variable connectivity methods for the atomistic Monte Carlo simulation of inhomogeneous and/or anisotropic polymer systems of precisely defined chain length distribution: tuning the spectrum of chain relative chemical potentials," Macromolecules, 36, 6674–6682, 2003.
[16] F.A. Escobedo and J.J. de Pablo, "Extended continuum configurational bias Monte Carlo methods for simulation of flexible molecules," J. Chem. Phys., 102, 2636–2652, 1995.
[17] A. Uhlherr, "Monte Carlo conformational sampling of the internal degrees of freedom of chain molecules," Macromolecules, 33, 1351–1360, 2000.
[18] C.D. Wick and J.I. Siepmann, "Self-adapting fixed end-point configurational-bias Monte Carlo method for the regrowth of interior segments of chain molecules with strong intramolecular interactions," Macromolecules, 33, 7207–7218, 2000.
[19] F. Wang and D.P. Landau, "Efficient, multiple-range random walk algorithm to calculate the density of states," Phys. Rev. Lett., 86, 2050–2053, 2001.
[20] T.S. Jain and J.J. de Pablo, "A biased Monte Carlo technique for calculation of the density of states of polymer films," J. Chem. Phys., 116, 7238–7243, 2002.
[21] M.S. Shell, P.G. Debenedetti, and A.Z. Panagiotopoulos, "An improved Monte Carlo method for direct calculation of the density of states," J. Chem. Phys., 119, 9406–9411, 2003.
[22] S. Balijepalli and G.C. Rutledge, "Simulation study of semi-crystalline polymer interphases," Macromol. Symp., 133, 71–99, 1998.
[23] P.J. in 't Veld and G.C. Rutledge, "Temperature-dependent elasticity of a semicrystalline interphase composed of freely rotating chains," Macromolecules, 36, 7358–7365, 2003.
[24] N.Ch. Karayiannis, V.G. Mavrantzas, and D.N. Theodorou, "A novel Monte Carlo scheme for the rapid equilibration of atomistic model polymer systems of precisely defined molecular architecture," Phys. Rev. Lett., 88, 105503:1–4, 2002.
[25] N.Ch. Karayiannis, A.E. Giannousaki, and V.G. Mavrantzas, "An advanced Monte Carlo method for the equilibration of model long-chain branched polymers with a well-defined molecular architecture: detailed atomistic simulation of an H-shaped polyethylene melt," J. Chem. Phys., 118, 2451–2454, 2003.
[26] V.G. Mavrantzas and H.-Ch. Öttinger, "Atomistic Monte Carlo simulations of polymer melt elasticity: their nonequilibrium thermodynamics GENERIC formulation in a generalized canonical ensemble," Macromolecules, 35, 960–975, 2002.
[27] A. Uhlherr, S.J. Leak, N.E. Adam, P.E. Nyberg, M. Doxastakis, V.G. Mavrantzas, and D.N. Theodorou, "Large scale atomistic polymer simulations using Monte Carlo methods for parallel vector processors," Comp. Phys. Commun., 144, 1–22, 2002.
[28] H.-Ch. Öttinger, Beyond Equilibrium Thermodynamics, Wiley, New York, 2004.
9.5 THE BOND FLUCTUATION MODEL AND OTHER LATTICE MODELS

Marcus Müller
Department of Physics, University of Wisconsin, Madison, WI 53706-1390
Lattice models constitute a class of coarse-grained representations of polymeric materials. They have enjoyed a longstanding tradition of use for investigating the universal behavior of long chain molecules by computer simulations and enumeration techniques. A coarse-grained representation is often necessary to investigate properties on large time and length scales. First, some justification for using lattice models will be given, and their benefits and limitations will be discussed. Then, the bond fluctuation model of Carmesin and Kremer [1] is placed into the context of other lattice models and compared to continuum models. Some specific techniques for measuring the pressure in lattice models will be described. The bond fluctuation model has been employed in more than 100 simulation studies in the last decade, and only a few selected applications can be mentioned here.
1. Coarse Graining and Universal Behavior
Lattice models are well suited to investigating universal behavior. Long chain molecules share many common mesoscopic characteristics which are independent of the atomistic structure of the chemical repeat units. For instance, in solutions or melts the self-similar structure at large length scales is characterized by a single length scale only, the chain's end-to-end distance R. This independence of the qualitative behavior from chemical details is borne out by many experimental observations. Therefore, one can use a coarse-grained description, which represents a small number of chemical repeat units by an effective segment. There are few examples of explicit coarse-graining [2] from a specific material onto a lattice model; often this mapping is only invoked conceptually. In that case, the question of which interactions are necessary to bring about the observed universal behavior is of great interest in itself [3]. The concept of universality is
not restricted to one-component polymer systems; it is also observed, for instance, in amphiphilic systems: diblock copolymers in the molten state, amphiphilic polymers in aqueous solution, and biological liquids all self-assemble into spatially periodic structures on the length scale of the molecule's extension. Although these systems differ strongly in their microscopic interactions, they share many qualitative features of their phase behavior [4]. For the self-similar structure of polymers in solutions (and melts) there exists a formal justification of coarse-grained models: de Gennes [5] related the structure of a polymer chain in a good solvent to a field theory of an n-component vector model in the limit n → 0. This class of models exhibits a continuous phase transition, and the properties close to this critical point have been investigated extensively with renormalization group calculations. The inverse chain length plays the role of the distance from the critical point of the n = 0 component vector model. As in the theory of critical phenomena, the behavior in the vicinity of this critical point (i.e., 1/N ≪ 1) is governed by a universal scaling behavior, which is brought about by only a few relevant interactions. This fact justifies the use of highly coarse-grained models that incorporate only two relevant interactions: connectivity along the chain and binary segmental interactions.
2. Lattice Models and Computational Techniques
Lattice models of polymer solutions are a particularly simple and computationally efficient realization, and they have therefore attracted abiding interest both for single-chain simulations and for simulations of polymer solutions and melts. In simple lattice models, a small group of atomistic repeat units is represented by a site on a simple cubic lattice. Other lattice structures have also been considered, e.g., face-centered cubic, square or triangular. Segments along a polymer occupy neighboring lattice sites, and multiple occupation of lattice sites is forbidden (excluded volume). The latter constraint corresponds to the repulsive binary interaction under good solvent conditions. Isolated chains on the lattice adopt configurations of self-avoiding walks. Lattice models and algorithms have been devised for multi-chain systems. Some analytical theories (e.g., the Flory–Huggins theory of polymer mixtures or the Flory equation of state) and enumeration techniques are particularly clearly formulated for lattice models and can be directly investigated by Monte Carlo simulations. Compared to coarse-grained models in continuum space (e.g., bead-spring type models), lattice models are computationally more efficient. The structure of the underlying lattice allows for a fast rejection of forbidden configurations and additional program optimizations. For instance, there is only a (small) number of allowed bond vectors in a lattice model, and excluded volume constraints can be checked very efficiently. In a generic off-lattice model, there are many bond vectors with a very large energy, and the excluded volume constraint is often modeled as a
large, but not infinite, energy of overlap. Even though configurations with a large repulsive energy in the off-lattice model have only a negligible statistical weight, they cannot be rejected upfront, and the whole energy change of a move has to be calculated to accept or reject the move. Therefore, lattice models are particularly suitable for investigating phenomena on mesoscopic length and time scales, which pose large computational challenges that cannot yet be addressed with atomistic or off-lattice models [4]. There are also disadvantages of lattice models: only the configurational part of the partition function can be investigated. There are no forces or momenta. Therefore, one cannot obtain the pressure via the virial expression, and one cannot use molecular dynamics simulations to study the ballistic motion of segments at short time scales or hydrodynamic behavior at large time scales. This holds a fortiori for non-equilibrium situations, e.g., shear flow. Random local displacements of segments can at most mimic a purely diffusive dynamics. Moreover, simulations at constant pressure or in the Gibbs ensemble are difficult. The equilibration of dense multi-chain systems is a challenging problem for computer simulations [6, 7], and lattice models have been a testing ground for many algorithms. Algorithms can be particularly easily formulated on a lattice (e.g., configurational bias Monte Carlo) and efficiently implemented. Some methods are tailored to isolated chains or very dilute systems (e.g., the pivot algorithm [8, 9]); other methods provide an effective relaxation of the overall chain dimensions in dense systems (e.g., configurational bias Monte Carlo [10, 11]). Special techniques have been devised to calculate the pressure in lattice models. Dickman [12] proposed a method to measure the pressure by calculating the free energy change associated with moving a hard wall by one lattice unit. This method is very efficient if one takes due account of the increase of the density that occurs upon compression [13]. An alternative method has been devised for systems with periodic boundary conditions, where one inserts or removes a whole slice of the lattice and regrows the chains affected by this Monte Carlo move via configurational bias [14]. This scheme has been applied in lattice Gibbs ensemble Monte Carlo simulations [15], but it is a tour de force, and the acceptance rate and complexity depend on the system size. If one is interested in the equation of state, it might be more convenient to obtain the pressure by integrating the excess chemical potential.
3. The Bond Fluctuation Model and Selected Applications
Although simple lattice models reproduce the universal features of polymer solutions and melts, it is difficult to incorporate some qualitative details
of molecular architecture. The simple lattice model allows only for two bond angles, which makes the investigation of orientational effects prone to lattice artifacts: there is a strong cubic anisotropy, and the lattice structure tends to stabilize liquid crystalline phases. Moreover, particles in real fluids arrange to form neighboring shells. This local packing structure of the fluid does not affect the universal scaling behavior, but it is pertinent to relating coarse-grained effective interactions to underlying microscopic potentials. Since the vacancies on the lattice and the polymer segments have the same size, packing effects in the density correlation function are largely absent. More sophisticated lattice models, in which monomers are represented by extended objects (e.g., a whole unit cube) on the lattice, have been explored.* These models exhibit packing effects, albeit much weaker than bead-spring type models in continuum space. A large number of bond vectors and angles (see also Ref. [16]) results in a better approximation of isotropic space while still retaining the computational advantages of lattice models. They also allow for a diffusive dynamics of the polymers on the lattice, which consists of random local displacements of the monomers. Moreover, the bond vectors can be chosen such that the excluded volume constraint prevents bonds from crossing through each other in the course of these local displacements. This non-crossability takes account of topological effects, which are important for the dynamical properties of linear chains and influence the conformational statistics of ring polymers (to avoid topological interactions, rings collapse in a concentrated solution) and networks. The bond fluctuation model can be formulated in two [1] and three [17] spatial dimensions. In three dimensions a monomer blocks all eight corners of a unit cell of a simple cubic lattice from further occupancy. This represents the segmental excluded volume. Monomers along a polymer chain are connected via one of 108 bond vectors of length 2, √5, √6, 3, and √10 in units of the lattice spacing. This realizes the connectivity of the monomers along a chain. The basic Monte Carlo move consists of randomly selecting a monomer and attempting to displace it by one lattice unit in a randomly chosen direction. While this move mimics a diffusive dynamics, other Monte Carlo moves (e.g., slithering snake, configurational bias Monte Carlo, or semi-grand-canonical identity switches) have been applied to investigate thermodynamic equilibrium properties. In addition to its computational efficiency, one of the key advantages of the bond fluctuation model is the knowledge of a variety of different quantities. The density and chain length dependence of many static and dynamic properties of the basic model are compiled in Refs. [18, 19]. (The allowed bond-vector set and the elementary move are sketched below.)
* The use of extended particles on a lattice has also been applied to fluids (cf. A.Z. Panagiotopoulos, “On the equivalence of continuum and lattice models for fluids,” J. Chem. Phys., 112, 7132–7137, 2000.)
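The bond-vector set of the three-dimensional model can be generated mechanically: in the standard notation it is the union of the classes P(2,0,0), P(2,1,0), P(2,1,1), P(2,2,1), P(3,0,0) and P(3,1,0), i.e., exactly the integer vectors whose squared length lies in {4, 5, 6, 9, 10}. The sketch below builds this set and attempts the elementary local move; it is a minimal illustration, and the excluded-volume test against all other monomers that a full simulation requires is omitted.

```python
import itertools
import numpy as np

# The 108 allowed bond vectors of the 3D bond fluctuation model.
ALLOWED_B2 = {4, 5, 6, 9, 10}
BONDS = {b for b in itertools.product(range(-3, 4), repeat=3)
         if sum(c * c for c in b) in ALLOWED_B2}
assert len(BONDS) == 108

def try_local_move(monomers, i, rng):
    """Elementary BFM move: displace monomer i of the (N, 3) integer array
    `monomers` by one lattice unit along a random axis, keeping the bonds
    to its chain neighbors inside the allowed set.  (The excluded-volume
    check of the blocked corner sites is omitted in this sketch.)"""
    step = np.zeros(3, dtype=int)
    axis = rng.integers(3)
    step[axis] = rng.choice([-1, 1])
    new = monomers[i] + step
    for j in (i - 1, i + 1):                  # bonded chain neighbors
        if 0 <= j < len(monomers):
            if tuple(new - monomers[j]) not in BONDS:
                return False                   # bond would leave allowed set
    monomers[i] = new
    return True
```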
Upon increasing the density from a dilute solution to a melt, one observes a cross-over from self-avoiding to Gaussian chain statistics, and from (renormalized) Rouse to reptation-like dynamics for very long chains. Inter- and intramolecular pair correlation functions in melts and semi-dilute solutions have been investigated. Different chain topologies have been studied: ring polymers [20], polymer networks [21, 22], end-tethered chains [23, 24], as well as equilibrium polymers [25]. Various additional interactions have been incorporated, e.g., bond-length potentials, chain stiffness, and binary interactions: attractive interactions between monomers result in a collapse of an isolated chain from a self-avoiding walk (for T > Tθ) to a dense globule (for T < Tθ). At the θ-temperature Tθ the chain conformations are Gaussian. A multi-chain system separates into liquid and vapor below the θ-temperature. The scaling of the critical temperature and density with chain length has attracted much interest [26]. At very low temperatures the liquid that coexists with the vapor becomes very dense and the lattice structure becomes important. Using a bond-length potential, Baschnagel [27] investigated the glass transition. The bond potential favors extended bonds, which block lattice sites and effectively decrease the free volume. At low temperatures or high densities, the competition between extending bonds and dense packing frustrates the system and leads to a glassy arrest of the dynamics, which has been investigated both in the bulk and in thin films [28]. Two-component mixtures have been modeled by incorporating a short-ranged repulsion between unlike segments. From the structure of the polymer fluid and the binary repulsion, the Flory–Huggins parameter can be extracted [29]. This makes the bond fluctuation model an ideal testing bed for comparing simulation results quantitatively to predictions of the self-consistent field theory for spatially inhomogeneous systems. Reister et al. [30] compared the dynamics of phase separation to different versions of dynamic mean field theory. The phase diagram of diblock copolymers in confinement [31] and reactive compatibilization [32] have been investigated, and the phase behavior of more complex architectures, e.g., triblock copolymers [33], is also accessible. Due to its extended structure, a polymer interacts with many neighbors, and mean field predictions are often accurate. There are, however, caveats: capillary waves broaden interface profiles [34], they renormalize the interaction between an interface and a boundary (wall) [35, 36], and they give rise to a microemulsion in homopolymer/diblock-copolymer mixtures [15, 37]. The phase diagram of random copolymers [38] in three spatial dimensions and the scaling of the critical temperature in two-dimensional homopolymer blends [39] exhibit large deviations from mean field predictions. This list of selected applications of the bond fluctuation model illustrates the versatility and range of applications of coarse-grained lattice models. By virtue of their computational efficiency and the availability of information on
a wide variety of properties, coarse-grained lattice models will continue to be important for tackling computationally challenging questions about the structure and the thermodynamics of polymer systems and for evaluating the approximations invoked in analytical approaches.
References

[1] I. Carmesin and K. Kremer, "The bond fluctuation method – a new effective algorithm for the dynamics of polymers in all spatial dimensions," Macromolecules, 21, 2819–2823, 1988.
[2] J. Baschnagel, K. Binder, P. Doruker, A.A. Gusev, O. Hahn, K. Kremer, W.L. Mattice, F. Müller-Plathe, M. Murat, W. Paul, S. Santos, U.W. Suter, and V. Tries, "Bridging the gap between atomistic and coarse-grained models of polymers: status and perspectives," Adv. Polym. Sci., 152, 41–156, 2000.
[3] M. Müller, "Mesoscopic and continuum models," In: J.H. Moore and N.D. Spencer (eds.), Encyclopedia of Physical Chemistry and Chemical Physics, vol. II, IOP, Bristol, pp. 2087–2110, 2001.
[4] M. Müller, K. Katsov, and M. Schick, "Coarse grained models and collective phenomena in membranes: computer simulation of membrane fusion," J. Polym. Sci. B: Polym. Phys., 41, 1441–1450, 2003 (highlight article).
[5] P.G. de Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, 1979.
[6] K. Kremer and K. Binder, "Monte Carlo simulations of lattice models for macromolecules," Comput. Phys. Rep., 7, 259–310, 1988.
[7] J.J. de Pablo and F.A. Escobedo, "Monte Carlo methods for polymeric systems," Adv. Chem. Phys., 105, 337–367, 1999.
[8] N. Madras and A.D. Sokal, "The pivot algorithm – a highly efficient Monte Carlo method for the self-avoiding walk," J. Stat. Phys., 50, 109–186, 1988.
[9] A.D. Sokal, "Monte Carlo methods for the self-avoiding walk," In: K. Binder (ed.), Monte Carlo and Molecular Dynamics Simulations in Polymer Science, Oxford University Press, New York, p. 47, 1995.
[10] J.I. Siepmann and D. Frenkel, "Configurational bias Monte Carlo – a new sampling scheme for flexible chains," Mol. Phys., 75, 59–70, 1992.
[11] D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd edn., Academic Press, Boston, 2001.
[12] R. Dickman, "New simulation method for the equation of state of lattice chains," J. Chem. Phys., 87, 2246–2248, 1987.
[13] M.R. Stukan, V.A. Ivanov, M. Müller, W. Paul, and K. Binder, "Finite size effects in pressure measurements for Monte Carlo simulations of lattice polymer systems," J. Chem. Phys., 117, 9934–9941, 2002.
[14] A.D. Mackie, A.Z. Panagiotopoulos, D. Frenkel, and S.K. Kumar, "Constant-pressure Monte Carlo simulations for lattice models," Europhys. Lett., 27, 549–554, 1994.
[15] A. Poncela, A.M. Rubio, and J.J. Freire, "Gibbs ensemble simulations of symmetric mixtures composed of the homopolymers AA and BB and their symmetric diblock copolymer," J. Chem. Phys., 118, 425–433, 2003.
[16] J.S. Shaffer, "Effects of chain topology on polymer dynamics: bulk melts," J. Chem. Phys., 101, 4205–4213, 1994.
[17] H.-P. Deutsch and K. Binder, "Interdiffusion and self-diffusion in polymer mixtures: a Monte Carlo study," J. Chem. Phys., 94, 2294–2304, 1991.
[18] W. Paul, K. Binder, D.W. Heermann, and K. Kremer, "Crossover scaling in semidilute polymer solutions: a Monte Carlo test," J. Phys. II, 1, 37–60, 1991.
[19] W. Paul, K. Binder, D.W. Heermann, and K. Kremer, "Dynamics of polymer solutions and melts – reptation predictions and scaling of relaxation times," J. Chem. Phys., 95, 7726–7740, 1991.
[20] M. Müller, J.P. Wittmer, and M.E. Cates, "Topological effects in ring polymers: a computer simulation study," Phys. Rev. E, 53, 5063–5074, 1996.
[21] J.U. Sommer and S. Lay, "Topological structure and nonaffine swelling of bimodal polymer networks," Macromolecules, 35, 9832–9843, 2002.
[22] Z. Chen, C. Cohen, and F.A. Escobedo, "Monte Carlo simulation of the effect of entanglements on the swelling and deformation behavior of end-linked polymeric networks," Macromolecules, 35, 3296–3305, 2002.
[23] J. Wittmer, A. Johner, J.F. Joanny, and K. Binder, "Chain desorption from a semidilute polymer brush – a Monte Carlo simulation," J. Chem. Phys., 101, 4379–4390, 1994.
[24] P.Y. Lai and K. Binder, "Structure and dynamics of grafted polymer layers: a Monte Carlo simulation," J. Chem. Phys., 95, 9288–9299, 1991.
[25] J.P. Wittmer, A. Milchev, and M.E. Cates, "Dynamical Monte Carlo study of equilibrium polymers: static properties," J. Chem. Phys., 109, 834–845, 1998.
[26] N.B. Wilding, M. Müller, and K. Binder, "Chain length dependence of the polymer-solvent critical point parameters," J. Chem. Phys., 105, 802–809, 1996.
[27] J. Baschnagel, "Analysis of the incoherent intermediate scattering function in the framework of the idealized mode-coupling theory – a Monte Carlo study for polymer melts," Phys. Rev. B, 49, 135–146, 1994.
[28] J. Baschnagel and K. Binder, "On the influence of hard walls on the structural properties in polymer glass simulation," Macromolecules, 28, 6808–6818, 1995.
[29] M. Müller, "Miscibility behavior and single chain properties in polymer blends: a bond fluctuation model study," Macromol. Theory Simul., 8, 343–374, 1999 (feature article).
[30] E. Reister, M. Müller, and K. Binder, "Spinodal decomposition in a binary polymer mixture: dynamic self-consistent field theory and Monte Carlo simulations," Phys. Rev. E, 64, 041804/1–17, 2001.
[31] K. Binder and M. Müller, "Monte Carlo simulation of block copolymers," Curr. Opin. Colloid Interface Sci., 5, 315–323, 2001.
[32] M. Müller, "Reactions at polymer interfaces: a Monte Carlo simulation," Macromolecules, 30, 6353–6357, 1997.
[33] G. Szamel and M. Müller, "Thin films of asymmetric triblock copolymers: a Monte Carlo study," J. Chem. Phys., 118, 905–913, 2003.
[34] A. Werner, F. Schmid, M. Müller, and K. Binder, "Intrinsic profiles and capillary waves at homopolymer interfaces: a Monte Carlo study," Phys. Rev. E, 59, 728–738, 1999.
[35] M. Müller and K. Binder, "Wetting and capillary condensation in symmetric polymer blends: a comparison between Monte Carlo simulations and self-consistent field calculations," Macromolecules, 31, 8323–8346, 1998.
[36] M. Müller and K. Binder, "Interface localization–delocalization transition in a symmetric polymer blend: a finite size scaling Monte Carlo study," Phys. Rev. E, 63, 021602/1–16, 2001.
[37] M. Müller and M. Schick, "Bulk and interfacial thermodynamics of a symmetric, ternary homopolymer–copolymer mixture: a Monte Carlo study," J. Chem. Phys., 105, 8885–8901, 1996.
[38] J. Houdayer and M. Müller, "Deviations from the mean field predictions for the phase behavior of random copolymers," Europhys. Lett., 58, 660–665, 2002.
[39] A. Cavallo, M. Müller, and K. Binder, "Anomalous scaling of the critical temperature of unmixing with chain length for two-dimensional polymer blends," Europhys. Lett., 61, 214–220, 2003.
9.6 STOKESIAN DYNAMICS SIMULATIONS FOR PARTICLE LADEN FLOWS Asimina Sierou University of Cambridge, Cambridge, UK
Stokesian Dynamics is a molecular-dynamics-like method for simulating the behavior of many particles suspended in a fluid. The method treats the suspended particles in a discrete sense while the continuum approximation remains valid for the surrounding fluid, i.e., the suspended particles are generally assumed to be significantly larger than the molecules of the solvent. The particles then interact through hydrodynamic forces transmitted via the continuum fluid, and when the particle Reynolds number is small, these forces are determined through the linear Stokes equations (hence the name of the method). In addition, the method can also resolve non-hydrodynamic forces, such as Brownian forces, arising from the fluctuating motion of the fluid, and interparticle or external forces. Stokesian Dynamics can thus be applied to a variety of problems, including sedimentation, diffusion and rheology, and it aims to provide the same level of understanding for multiphase particulate systems as molecular dynamics does for statistical properties of matter.
1. Equations of Motion
For N rigid particles of radius a suspended in an incompressible Newtonian fluid of viscosity η and density ρ, the motion of the fluid is governed by the Navier–Stokes equations, while the motion of the particles is described by the coupled equation of motion

$$m \cdot \frac{dU}{dt} = F^H + F^P + F^B, \qquad (1)$$
which simply states that mass times acceleration equals the sum of the forces. In this equation, m is a generalized mass/moment-of-inertia matrix, U is the particle translational/rotational velocity vector of dimension 6N,
and the 6N vector F represents the hydrodynamic (F^H), Brownian (F^B) and arbitrary interparticle or external (F^P) forces/torques acting on the particles. The method then aims to evaluate each force acting on the particles as a known function of the velocities and positions of all other particles, and then to integrate Eq. (1) in time to follow the dynamic evolution of the suspension microstructure.
2. Hydrodynamic Forces
When the particle Reynolds number is small, the hydrodynamic force exerted on the particles should scale linearly with the particles' velocity (due to the linearity of the Stokes equations). For particles in a suspension undergoing a bulk linear shear flow (e.g., simple shear) this hydrodynamic force can be written as (see Ref. [1], and references therein)

$$F^H = -R_{FU} \cdot (U - u^\infty) + R_{FE} : E^\infty, \qquad (2)$$
where u∞ is the imposed bulk flow evaluated at the particle center, x_p. For a simple shear flow u∞ = Γ̇ · x_p, and Γ̇ = E∞ + Ω∞ is the velocity gradient tensor of the bulk flow (split into a symmetric and an antisymmetric part). The resistance tensors R_FU and R_FE give the hydrodynamic force/torque on the particles due to their motion relative to the fluid and due to an imposed flow, respectively. Note that these resistance tensors can only depend on the positions of the particles (x_p) and not on their velocities, due to the linearity of the problem. Similarly, resistance tensors which relate the particle stresslet S (the symmetric first moment of the force density on a particle) to the velocity and rate of strain can be defined, and thus a "grand resistance" matrix constructed:
$$\mathcal{R} = \begin{pmatrix} R_{FU} & R_{FE} \\ R_{SU} & R_{SE} \end{pmatrix} \qquad (3)$$
with

$$\begin{pmatrix} F \\ S \end{pmatrix} = -\mathcal{R} \cdot \begin{pmatrix} U - u^\infty \\ -E^\infty \end{pmatrix}. \qquad (4)$$
The inverse of the grand resistance matrix is the grand mobility matrix, M; it gives the particle velocities and rate of strain in terms of the total forces/torques and stresslets. Note that in this notation R_SE is a fourth-order tensor, while R_FU is a second-order one (see also Refs. [2, 3] for a more detailed introduction to the resistance and mobility formulations). The calculation of the resistance tensors is trivial for the case of a single particle (where, for example, R_FU = 6πηa I for a particle of radius a, and the hydrodynamic drag simply corresponds to the Stokes drag), and exact analytical expressions are
known for the case of two particles [3]. Stokesian Dynamics extends these results and makes an accurate approximation for the resistance and mobility tensors for the case of an arbitrary number of particles N. This procedure is briefly outlined below. Initially, the force density on the surface of each particle is expanded in a series of moments around the center of the particle. (The zeroth moment is simply the net force acting on a particle, while the first moment is split into an antisymmetric torque and a symmetric stresslet.) Then the i-component of the velocity at any position x in the fluid can be simply expressed as

$$u_i(x) - u_i^\infty(x) = \sum_n G_{ij}(x, x_p^n)\, F_j^n, \qquad (5)$$
where F_j^n is the j-component of the hydrodynamic force/torque/stresslet (or even higher moment) on each particle n, and G_ij corresponds to an appropriate solution function for the Stokes equations, which again can only depend on the configuration of the particles (see Refs. [1, 4, 5] for details on the functional form of G_ij and issues of convergence, periodicity and infinite systems). To determine the motion of a particle immersed in a flow field given by Eq. (5), the method makes use of the Faxén formulae for spheres [6], which relate the force on a particle n to the velocity of the particle U_i^n and the fluid velocity at the particle center u_i(x_p^n):
$$F_i^n = -6\pi\eta a\left(U_i^n - u_i^\infty(x_p^n)\right) + 6\pi\eta a\left(1 + \frac{a^2}{6}\nabla^2\right) u_i(x_p^n) \qquad (6)$$
(similar expressions can be written for the torque and stresslet as functions of the angular velocities and rate of strain). By combining Eqs. (5) and (6), the method eliminates the fluid velocity u_i(x_p^n), and a mobility matrix can thus be constructed relating the velocity of each particle U_i^n to the force moments on all particles. This approach, however, becomes computationally prohibitive when a pair of particles is close to contact. The hydrodynamic forces then diverge as ε⁻¹ and ln ε⁻¹, where ε is the surface-to-surface distance of two particles (non-dimensionalized by the particle radius), and a large number of moments is required in order to capture the surface force density correctly. To circumvent this problem, higher-order moments in the expansion are neglected and their effect is instead approximated by an added lubrication-type force between particles. Since these lubrication forces are dominated by interactions between contact points rather than boundary surfaces, they can be approximated as two-body interactions and thus simply added to the far-field solution in a pairwise-additive manner. The method can then be used to simulate a realistic number of particles by decomposing the hydrodynamic interactions into a far-field and a near-field part, thus maintaining both a relatively small number of parameters (moments) and high accuracy for dense systems.
The algorithm, however, remains computationally expensive, as it requires O(N²) operations for the construction of the mobility matrix (Eq. (5) applied to all N particles) and, additionally, an O(N³) inversion of the resulting matrix (since for many problems the calculation of the resistance matrix is also required). Faster versions of Stokesian Dynamics have thus been developed [7] which utilize fast Fourier transform techniques in combination with iterative inversions to reduce the cost of the hydrodynamic calculations to an O(N ln N) scaling.
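Although the full grand mobility couples forces, torques and stresslets, the structure of the far-field calculation can be illustrated at the force-velocity level, where the pairwise contribution of Eq. (5) reduces, for well-separated equal spheres, to the familiar Rotne-Prager-Yamakawa tensor. The sketch below is a hedged stand-in for the far-field part only: it retains neither the higher moments nor the lubrication corrections described above, and is not the full Stokesian Dynamics mobility.

```python
import numpy as np

def rpy_mobility(x, a, eta):
    """Far-field (Rotne-Prager-Yamakawa) translational mobility for N
    non-overlapping spheres of radius a; x has shape (N, 3)."""
    n = len(x)
    M = np.zeros((3 * n, 3 * n))
    I3 = np.eye(3)
    for i in range(n):
        M[3*i:3*i+3, 3*i:3*i+3] = I3 / (6.0 * np.pi * eta * a)  # Stokes self term
        for j in range(i + 1, n):
            d = x[j] - x[i]
            r = np.linalg.norm(d)
            rr = np.outer(d, d) / (r * r)
            blk = ((1.0 + 2.0 * a * a / (3.0 * r * r)) * I3
                   + (1.0 - 2.0 * a * a / (r * r)) * rr) / (8.0 * np.pi * eta * r)
            M[3*i:3*i+3, 3*j:3*j+3] = blk
            M[3*j:3*j+3, 3*i:3*i+3] = blk
    return M
```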
3. Brownian Forces
When the suspended particles are sufficiently small, the fluctuating forces they receive from the fluid influence their dynamics and give rise to the familiar phenomenon of Brownian motion. Stokesian Dynamics successfully accounts for these Brownian forces, in combination with an accurate many-body description of the hydrodynamic forces, and is thus an extension of "Brownian Dynamics" [8], a method commonly used to examine the properties of colloidal systems in which the many-particle hydrodynamics are ignored. The Brownian force F^B arising from the thermal fluctuations of the fluid is a Gaussian stochastic variable defined by
$$\langle F^B \rangle = 0 \quad\text{and}\quad \langle F^B(0)\, F^B(t) \rangle = 2kT\, R_{FU}\, \delta(t), \qquad (7)$$
where the angle brackets denote an ensemble average, k is Boltzmann's constant, T is the absolute temperature and δ(t) the delta function. The correlation of the Brownian forces at times 0 and t is a direct consequence of the fluctuation–dissipation theorem for the N-particle system (see also Ref. [9] for an introduction to colloidal dynamics). Once the hydrodynamic resistance matrix is known, it is straightforward (although often computationally expensive) to calculate the "random" component of the displacement. Equation (1) can be integrated in time (assuming that the configuration of the particles does not change significantly during the time scale of the Brownian motion, i.e., during the time required for a particle's momentum to relax after a Brownian pulse) leading to [8]

$$\Delta X^B = kT\, \nabla\cdot R_{FU}^{-1}\, \Delta t + X(\Delta t) \qquad (8)$$
with

$$\langle X \rangle = 0, \qquad \langle X(\Delta t)\, X(\Delta t) \rangle = 2kT\, R_{FU}^{-1}\, \Delta t. \qquad (9)$$
Here ΔX^B is the change in the position of the particle associated with the Brownian motion, and X(Δt) is a Gaussian random displacement with zero mean and covariance given by the inverse of the resistance tensor. (Note that
due to the spatial variation of the resistance matrix and the simple forward time-stepping scheme, the random walk has a mean drift, with mean velocity kT ∇·R_FU⁻¹, in addition to any systematic velocity.) The motion and rheological properties arising from the Brownian forces can then be combined with the hydrodynamic contributions described in Section 2 to give an accurate description of hydrodynamically interacting colloidal suspensions. As was the case with the hydrodynamic force, newer versions of Stokesian Dynamics [10] have utilized iterative techniques and Chebyshev polynomial approximations (for calculating the square root of matrices) to reduce the computational cost to O(N^1.25 ln N) and thus make possible the simulation of larger systems.
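In practice, the correlated random displacement of Eqs. (8)-(9) is generated by factorizing the covariance 2kT R_FU⁻¹ Δt and applying the factor to uncorrelated Gaussian variates. The minimal sketch below uses a dense Cholesky factorization (precisely the expensive step that the Chebyshev-polynomial approach mentioned above avoids) and deliberately omits the deterministic drift term kT ∇·R_FU⁻¹ Δt of Eq. (8):

```python
import numpy as np

rng = np.random.default_rng()

def brownian_displacement(R_fu, kT, dt):
    """Random displacement X with <X> = 0 and <X X> = 2 kT R_FU^-1 dt,
    as required by Eq. (9).  The drift kT grad . R_FU^-1 dt of Eq. (8)
    must be added separately."""
    M = np.linalg.inv(R_fu)                      # mobility matrix R_FU^-1
    L = np.linalg.cholesky(2.0 * kT * dt * M)    # L @ L.T = 2 kT dt R_FU^-1
    return L @ rng.standard_normal(len(M))
```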
4.
Random Hard Sphere Dispersions
Stokesian Dynamics has been used successfully to calculate the transport properties of a random dispersion of hard spheres (see Refs. [7, 11, 12] for a full review). Figure 1 shows the high-frequency dynamic viscosity $\eta'_\infty$ (non-dimensionalized by the solvent viscosity η) for hard spheres determined by Stokesian Dynamics simulation and compared with experiment. The high-frequency dynamic viscosity is the dissipative part of the stress measured in a very small amplitude and high frequency oscillatory shear flow (where the distribution of the solid particles is unaffected by the shear flow and still corresponds to the equilibrium hard-sphere structure). In simulations, it is completely determined by the hydrodynamic interactions and is easily extracted from the particle contribution to the bulk stress, $\langle\Sigma^P\rangle$,
$$\langle\Sigma^P\rangle = n\langle S^H\rangle = -n\langle R_{SU}\cdot(U - u^\infty)\rangle - n\langle R_{SE} : E^\infty\rangle \tag{10}$$

Figure 1. High-frequency dynamic viscosity $\eta'_\infty/\eta$ vs. volume fraction φ. Simulation results for N = 125, 343, 512, 1000, and 2000 particles are compared with the simulations of Ladd (1990) and the experiments of van der Werff et al. (1989) and Shikata & Pearson (1994). The experiments are for sterically stabilized silica particles which behave to a very good approximation as hard spheres.
where n is the number density of particles, the non-zero particle velocities U are defined by Eqs. (1) and (2), and the resulting stress is ensemble-averaged over a number of equilibrium configurations. (New implementations of the method, which do not calculate the resistance tensors explicitly, obtain the far-field contribution to the hydrodynamic stress through an iterative inversion of an equivalent mobility matrix.) The results demonstrate that the hydrodynamic interactions are calculated accurately by Stokesian Dynamics for all volume fractions, capturing the singular nature of the viscosity as random close packing is approached.
5.
Rheology of Non-Colloidal Particles
The rheological properties of suspensions can be calculated with a dynamic simulation of a sheared system. In the absence of Brownian motion it has been demonstrated [13, 14] that the presence of a repulsive interparticle force is always necessary to prevent the system from forming infinite clusters. A pair-wise repulsive force of the form
$$F^P_{\alpha\beta} = F_0\,\frac{\tau\, e^{-\tau\epsilon}}{1 - e^{-\tau\epsilon}}\,\hat{e}_{\alpha\beta} \tag{11}$$
is commonly used [1]. In this equation, $F^P_{\alpha\beta}$ is the force exerted on sphere α by sphere β, $F_0$ represents its magnitude, $\tau^{-1}$ its range, $\epsilon$ is the dimensionless spacing between particle surfaces, and $\hat{e}_{\alpha\beta}$ the unit vector connecting the centers of the two spheres. Since now there are two driving forces affecting the motion of the particles, a non-dimensional parameter $\gamma^* = 6\pi\eta a^2\dot\gamma/F_0$ is defined, describing the ratio of the shear rate to the magnitude of the interparticle forces (where $\dot\gamma$ is the magnitude of the velocity gradient tensor). The particle contribution to the stress is now given by [1]
$$\langle\Sigma^P\rangle = -n\langle R_{SU}\cdot(U - u^\infty)\rangle + n\langle R_{SE} : E^\infty\rangle - n\langle x^p F^P\rangle, \tag{12}$$
Figure 2. Steady-shear viscosity $\eta_r$ for non-colloidal particles as a function of the volume fraction φ. Stokesian Dynamics results (N = 512) are compared with the experiments of Zarraga et al. (2000), Pätzold (1980), Gadala-Maria (1979), and Rutgers (1962).
where n is the particle number density, and the ensemble average denotes an average over a number of steady-state configurations, or equivalently a time average over a large strain. The suspension relative viscosity is presented in Fig. 2 for a given magnitude of the interparticle force and compared with a number of experimental results. The steady-shear viscosity is generally larger than the high-frequency dynamic viscosity, since under steady shear particle clusters form, which increase the total stress. The agreement between simulation and experiment is very good for low to moderate volume fractions, while for large volume fractions sizeable discrepancies exist, not only between the experimental and simulation results, but also between different sets of experiments. This is attributed to the fact that the exact form and magnitude of the interparticle force (e.g., surface roughness, particle material) can have a profound effect on the rheology, especially for large volume fractions, where the total stress is dominated by the near-particle interactions [15].
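For illustration, a minimal Python sketch of the pairwise repulsive force of Eq. (11). It assumes non-overlapping equal spheres of radius a (so the dimensionless gap ε is positive), and the helper name and default parameter values are hypothetical.

import numpy as np

def repulsive_forces(x, a=1.0, F0=1.0, tau=1000.0):
    """Pairwise repulsive forces, Eq. (11); x is an (N, 3) array of centers."""
    N = len(x)
    F = np.zeros_like(x)
    for alpha in range(N):
        for beta in range(alpha + 1, N):
            r = x[alpha] - x[beta]
            d = np.linalg.norm(r)
            eps = d / a - 2.0          # dimensionless surface-to-surface gap (> 0)
            e = r / d                  # unit vector from beta toward alpha
            mag = F0 * tau * np.exp(-tau * eps) / (1.0 - np.exp(-tau * eps))
            F[alpha] += mag * e        # equal and opposite: push spheres apart
            F[beta] -= mag * e
    return F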
6.
Rheology of Colloidal Particles
Stokesian dynamics has also been used extensively to study the behavior of colloidal systems, where in addition to the hydrodynamic forces, Brownian forces are also important. For such systems the action of an external driving
force such as shear disturbs their equilibrium microstructure and strongly affects the resulting behavior. Figure 3 shows the steady-shear viscosity $\eta_r$ as a function of the Péclet number for a volume fraction of φ = 0.45. The Péclet number measures the relative importance of shear to thermal forces and is given by $Pe = 6\pi\eta a^3\dot\gamma/(kT)$. The stress for such systems is still given by Eq. (10), where now the particle velocities include both a purely hydrodynamic and a Brownian contribution. It is thus instructive to decompose the total stress into two similar contributions: a hydrodynamic one, given by Eq. (10) with the particle velocities coming from only the deterministic displacements, and a Brownian one, given by [16]:
$$\langle S^B\rangle = -kT\,\nabla\cdot\langle R_{SU}\cdot R_{FU}^{-1}\rangle. \tag{13}$$
The overall viscosity appears to have a non-monotonic dependence on the shear rate, decreasing as a function of the shear rate (shear-thinning behavior) for small to moderate Péclet numbers, but increasing as a function of the shear rate at high Péclet numbers (shear-thickening behavior). The same transitional behavior has also been seen in experimental studies; as a measure of the quantitative predictive ability of Stokesian Dynamics, Fig. 4 shows the simulated steady-shear viscosity for a number of volume fractions compared with experiments.

Figure 3. The steady-shear viscosity and its different contributions (total $\eta^T$, hydrodynamic $\eta^H$, and Brownian $\eta^B$) as a function of the Péclet number, determined by Stokesian Dynamics for a system of N = 27 and φ = 0.45.

Figure 4. The steady-shear viscosity (normalized by the solvent viscosity) as a function of the Péclet number Pe for different volume fractions. Stokesian Dynamics results (φ = 0.316, 0.419, 0.470, 0.490) are compared with the experiments of van der Werff & de Kruif (1989) (φ = 0.316, 0.419, 0.470, 0.488) and D'Haene et al. (1993) (φ = 0.276, 0.389, 0.460, 0.484).
Despite the different types and sizes of particles, the agreement between the two sets of experiments and the simulation results is satisfactory. This complex behavior can be further clarified by analyzing the two contributions to the stress separately (see also Ref. [17]). The Brownian contribution to the stress is a result of the flow-induced deformation of the equilibrium structure (particles are trying to diffuse against the flow towards their unstressed/equilibrium configuration, and the resulting stress is proportional to this deformation). The Brownian viscosity (stress normalized by $\eta\dot\gamma$) then scales simply as the deformation over the Péclet number. As the Péclet number increases, however, it has been suggested that the particle motion can no longer keep up with the flow and the deformation of the microstructure saturates, leading to a Brownian viscosity which is a decreasing function of the Péclet number (as shown in Fig. 3). The hydrodynamic stress, on the other hand, remains constant for small values of the Péclet number, since, due to the Brownian forces, the particles are still well-dispersed; its value is thus very close to the high-frequency dynamic viscosity. As the shear rate is increased further, however, particle clusters form (since the shear flow pushes particles together and the Brownian forces are no longer strong enough to disperse them) and strong lubrication stresses are manifested, leading to an increase in the total particle stress.
7.
Discussion
The Stokesian Dynamics method provides a rigorous procedure for the simulation of hydrodynamically interacting particles and suspensions. The results discussed above demonstrated the ability of the method to make accurate predictions for a number of rheological properties, and shed light on the physical mechanisms dictating the particles’ motion. However, they only represent a small fraction of the problems to which the method has been applied, or could be applied in the future. The combination of simulation with experiment and theory has allowed for a very detailed picture to emerge, describing the structure and rheology of colloidal and non-colloidal systems. The system microstructure, expressed through the pair-distribution function, has been studied extensively [1, 15, 17] and the observed anisotropy in the microstructure has been correlated with the suspension non-Newtonian behavior (i.e., normal stress differences). The method has also been expanded to the study of bounded systems and pressure-driven flows and has been instrumental in developing macroscopic models for such systems (see Ref. [18]). The diffusive motion of the particles, and in particular the effect of shear on diffusion, has also been the subject of recent studies; such studies have demonstrated the ability of Stokesian Dynamics (and simulation in general) to investigate effects which are often hard to measure experimentally.
8.
Outlook
Stokesian Dynamics, and faster O(N ln N ) implementations inspired by it, have been used extensively over the last decade and have increased our understanding and ability to predict suspension behavior. Although the method has so far mainly been used for monodisperse, hard-sphere systems, it is relatively straightforward to extend it to deformable drops, to non-spherical particles, or polydisperse systems. The method has thus opened up opportunities for the study of a much larger class of new problems and flows, which hopefully will provide further insight into the physics of complex fluids.
References [1] J.F. Brady and G. Bossis, “Stokesian dynamics,” Annu. Rev. Fluid Mech., 20, 111–157, 1988. [2] J. Happel and H. Brenner, Low Reynolds Number Hydrodynamics, Prentice Hall, New York, 1965. [3] S. Kim and S.J. Karrila, Microhydrodynamics: Principles and Selected Applications, Butterworths, London, 1991.
[4] L.J. Durlofsky, J.F. Brady, and G. Bossis, “Dynamic simulations of hydrodynamically interacting particles,” J. Fluid Mech., 180, 21–49, 1987. [5] J.F. Brady, R.J. Phillips, J.C. Lester, and G. Bossis, “Dynamic simulation of hydrodynamically interacting suspensions,” J. Fluid Mech., 195, 257–280, 1988. [6] G.K. Batchelor and J.T. Green, “The hydrodynamic interaction of two small freely moving spheres in a linear flow field,” J. Fluid Mech., 56, 375–400, 1972. [7] A. Sierou and J.F. Brady, “Accelerated Stokesian dynamics simulations,” J. Fluid Mech., 448, 115–146, 2001. [8] D.L. Ermak and J.A. McCammon, “Brownian dynamics with hydrodynamic interactions,” J. Chem. Phys., 69, 1352–1360, 1978. [9] W.B. Russel, D.A. Saville, and W.R. Schowalter, Colloidal Dispersions, Cambridge University Press, Cambridge, 1989. [10] A.J. Banchio and J.F. Brady, “Accelerated Stokesian dynamics: Brownian motion,” J. Chem. Phys., 118, 10323–10332, 2003. [11] R.J. Phillips, J.F. Brady, and G. Bossis, “Hydrodynamic transport properties of hard-sphere dispersions. I. Suspensions of freely mobile particles,” Phys. Fluids, 31, 3462–3472, 1988. [12] A.J.C. Ladd, “Hydrodynamic transport coefficients of random dispersions of hard spheres,” J. Chem. Phys., 93, 3484–3494, 1990. [13] J.R. Melrose and R.C. Ball, “The pathological behavior of sheared hard-spheres with hydrodynamic interactions,” Europhys. Lett., 32, 535–546, 1995. [14] D.I. Dratler and W.R. Schowalter, “Dynamic simulation of suspensions of non-Brownian hard spheres,” J. Fluid Mech., 325, 53–77, 1996. [15] J.F. Brady and J.F. Morris, “Microstructure of strongly sheared suspensions and its impact on rheology and diffusion,” J. Fluid Mech., 348, 103–139, 1997. [16] J.F. Brady, “The rheological behavior of concentrated colloidal dispersions,” J. Chem. Phys., 99, 567–581, 1993. [17] D.R. Foss and J.F. Brady, “Structure, diffusion and rheology of Brownian suspensions by Stokesian dynamics simulations,” J. Fluid Mech., 407, 167–200, 2000. [18] P.R. Nott and J.F. Brady, “Pressure driven flow of suspensions: simulation and theory,” J. Fluid Mech., 275, 157–199, 1994.
9.7 BROWNIAN DYNAMICS SIMULATIONS OF POLYMERS AND SOFT MATTER Patrick S. Doyle and Patrick T. Underhill Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
The Brownian dynamics (BD) simulation technique is a mesoscopic method in which explicit solvent molecules are replaced by a stochastic force. The technique takes advantage of the fact that there is a large separation in time scales between the rapid motion of solvent molecules and the more sluggish motion of polymers or colloids. The ability to coarse-grain out these fast modes of the solvent allows one to simulate much larger time scales than in a molecular dynamics simulation. At the core of a Brownian dynamics simulation is a stochastic differential equation which is integrated forward in time to create trajectories of molecules. Time enters naturally into the scheme, allowing for the study of the temporal evolution and dynamics of complex fluids (e.g., polymers, large proteins, DNA molecules and colloidal solutions). Hydrodynamic and body forces, such as magnetic or electric fields, can be added in a straightforward way. Brownian dynamics simulations are particularly well suited to studying the structure and rheology of complex fluids in hydrodynamic flows and other nonequilibrium situations.
1.
Basic Brownian Dynamics
The technique of Brownian dynamics is used to simulate the dynamics of particles that undergo Brownian motion. Because of the small mass of these particles, it is common to neglect inertia. Using Newton’s Second Law for particle i, $F^{tot}_i = m_i a_i$, the neglect of inertia means that the total force is always approximately zero. The total force on a particle is composed of a drag force $F^d_i$ from the particle moving through the viscous solvent, a Brownian force $F^B_i$
due to random collisions of the solvent with the particle, and all nonhydrodynamic forces $F^{nh}_i$:
$$F^{tot}_i = F^d_i + F^B_i + F^{nh}_i \approx 0. \tag{1}$$
This total nonhydrodynamic force includes any external body forces, any spring forces, and any excluded volume interactions. In creeping flow and neglecting hydrodynamic interactions (free-draining), the drag force is taken as the Stokes drag on a sphere,
$$F^d_i = -\zeta\left(\frac{dr_i}{dt} - u^\infty(r_i)\right) \tag{2}$$
where ζ is the drag coefficient and $u^\infty(r_i)$ is the unperturbed velocity of the solvent evaluated at the position of the particle. The differential equation governing the motion of the particle then becomes
$$\frac{dr_i}{dt} = u^\infty(r_i) + \frac{1}{\zeta}\left[F^{nh}_i(\{r_j\}) + F^B_i(t)\right] \tag{3}$$
and is commonly called a Langevin equation. Note that the nonhydrodynamic force depends on the set of all particle positions $\{r_j\}$. This is a stochastic differential equation because the Brownian force is taken from a random distribution. In order for the dynamics to satisfy the fluctuation–dissipation theorem, the expectation values of the Brownian force are
(4)
FBi (t)FBj (t ) = 2kB T ζ δi j δ(t − t )δ
(5)
where $k_B$ is Boltzmann’s constant, T is the absolute temperature, $\delta_{ij}$ is the Kronecker delta, $\delta(t - t')$ is the Dirac delta function, and $\boldsymbol{\delta}$ is the unit second-order tensor. Equations (3)–(5) are equivalent to the Fokker–Planck equation description, which is a diffusion equation for the phase space probability density [1]. Having developed the governing stochastic differential equation, one performs a BD simulation by integrating this equation forward in time. The stochastic nature means that one must produce many independent trajectories that are averaged together, producing the time-evolution of an ensemble-averaged property. The repetition of many independent trajectories is a time-consuming but necessary step in following the time-evolution of a property. However, to calculate a steady-state property, one uses the ergodic hypothesis to time-average a single trajectory.
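A minimal free-draining BD step built from Eqs. (3)–(5) might look as follows. This is a Python sketch under simplifying assumptions: the drag coefficient and thermal energy default to one, there is no imposed flow, and `nonhydro_force` is a hypothetical user-supplied callable returning $F^{nh}_i(\{r_j\})$.

import numpy as np

def bd_step(r, nonhydro_force, dt, kBT=1.0, zeta=1.0,
            rng=np.random.default_rng()):
    """Advance bead positions r (N, 3) by one explicit Euler time step dt."""
    F = nonhydro_force(r)                        # F_i^nh({r_j}), Eq. (3)
    # Brownian displacement: variance 2 kBT dt / zeta per component, Eq. (5)
    dW = np.sqrt(2.0 * kBT * dt / zeta) * rng.standard_normal(r.shape)
    return r + (dt / zeta) * F + dW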
2.
Hydrodynamic Interactions
As a particle moves along its trajectory, it exerts a force on the solvent which changes the velocity field from its undisturbed value. The disturbance
velocity changes the viscous drag force exerted on the other particles. This interaction between particles mediated by the solvent is called hydrodynamic interaction (HI). The hydrodynamic interactions are included in Brownian dynamics through the use of an interaction tensor $\Omega_{ij}$ included as part of the diffusion tensor $D_{ij}$ [1]. The two tensors are related by
$$D_{ij}(r_i, r_j) = \frac{k_B T}{\zeta}\left[\delta_{ij}\,\boldsymbol{\delta} + \zeta\,\Omega_{ij}(r_i, r_j)\right] \tag{6}$$
where $\Omega_{ii} = 0$. The stochastic differential equation including HI then becomes
$$\frac{dr_i}{dt} = u^\infty(r_i) + \frac{1}{k_B T}\sum_{j=1}^{N} D_{ij}(r_i, r_j)\cdot F^{nh}_j(\{r_k\}) + \sqrt{2}\sum_{j=1}^{N} B_{ij}(\{r_k\})\cdot n_j(t) \tag{7}$$
where $n_j$ are random vectors with expectation values
$$\langle n_j(t)\rangle = 0 \tag{8}$$
$$\langle n_i(t)\,n_j(t')\rangle = \delta_{ij}\,\delta(t - t')\,\boldsymbol{\delta} \tag{9}$$
and the weighting factors $B_{ij}$ must be calculated from the diffusion tensor in order to satisfy the fluctuation–dissipation theorem
$$D_{ij}(r_i, r_j) = \sum_{p=1}^{N} B_{ip}(\{r_k\})\cdot B^T_{jp}(\{r_k\}). \tag{10}$$
This can be inverted to calculate $B_{ij}$ by Cholesky decomposition. However, a more efficient method has been developed by Fixman [2] and implemented in BD simulations. The interaction tensor in an unbounded solvent is taken as the Rotne–Prager–Yamakawa tensor, which is a regularized version of the Oseen–Burgers tensor. The Oseen–Burgers tensor represents the disturbance due to a point-force in creeping flow. However, it results in a nonpositive-definite diffusion tensor if particle separations are comparable to the particle radius. The Rotne–Prager–Yamakawa tensor modifies the small separation disturbance such that the diffusion tensor is always positive-definite.
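For illustration, the factorization step could be written as below (a Python sketch; `D` is assumed to be the assembled 3N × 3N diffusion matrix, and the straightforward Cholesky route is used rather than Fixman’s faster Chebyshev approach).

import numpy as np

def brownian_weights(D):
    """Return lower-triangular B with B @ B.T == D, satisfying Eq. (10)."""
    # Cholesky requires D to be positive-definite, which the Rotne-Prager-
    # Yamakawa regularization guarantees even at small particle separations.
    return np.linalg.cholesky(D)

# One HI time step then uses, following Eq. (7):
#   dr = (u_inf + D @ F_nh / kBT) * dt + np.sqrt(2.0 * dt) * (B @ n)
# with n a vector of independent variates obeying Eqs. (8)-(9).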
3.
Polymer Models used in Brownian Dynamics
The choice of polymer model is intrinsically a modeling decision which depends upon the real polymer one wants to model and the level of fine-scale molecular detail one needs to retain or can computationally afford to simulate. Polymers can be broadly separated into flexible and semiflexible chains. The
flexibility of a chain is determined by the ratio $L/l_p$, where $l_p$ is the persistence length and L the contour length of the chain. Flexible chains have $L/l_p \gg 1$ and semiflexible chains have $L/l_p \sim 1$. The most common coarse-grained models for flexible polymers are the freely jointed bead-rod and bead-spring chains. The polymer is modeled as a series of beads connected by either rods or springs, as shown in Fig. 1. The frictional drag on the chain is distributed at bead centers. The term “freely jointed” implies that there is no energetic penalty to rotating a spring or rod about a bead center. The spirit of these mesoscopic models is to coarse-grain out molecular details smaller than the finest length scale in the given model (rod or spring). We consider first the flexible bead-rod chain. Physically, the rod in a bead-rod chain corresponds to a Kuhn length $l_k$ (twice the value of the persistence length). Mathematically, the rods act as a constraint on the system which ensures that adjacent beads in the chain are maintained at a constant separation at all times. How one achieves this constraint is important, as there is a subtle difference between a completely rigid constraint and the approximation of that constraint using a very (infinitely) stiff potential [3]. For example, the equilibrium distribution of a bead-rod chain using stiff potentials yields a random walk configuration, while imposing rigid constraints gives rise to correlations in the rod directions, even in the absence of bending forces. Physically, one would argue that the random walk configuration is more realistic. However, rigid constraints are attractive from a computational standpoint as they freeze out the usually uninteresting rapid bond fluctuations. In practice, one chooses to simulate the stiff system, but does so by imposing rigid constraints (which introduces a new tension force $F^{tens}$) and adding a corrective pseudopotential metric force $F^{met}$ which makes the system equivalent to imposing an infinitely stiff
Figure 1. Schematic of the canonical polymer model used in Brownian dynamics simulations: a bead-spring chain. The continuous curve is the “real” polymer that the bead-spring chain is meant to represent.
potential. A recent detailed review of constrained Brownian motion and the implementation of bead-rod BD algorithms can be found in Morse [3]. With current computers typically one can only simulate chains with up to ∼100–200 Kuhn steps, which corresponds to a low molecular weight polystyrene polymer (∼75 000–150 000 Da). Large flexible polymers are more commonly modeled as a series of $N_s$ springs connected by beads. Each spring models a portion of the full chain and has a contour length $L_s = L/N_s$. The spring represents the entropic restoring force associated with stretching a subsection of the chain. The entropic force can be calculated starting from a fine-scale micromechanical model (e.g., freely jointed bead-rod chain) using equilibrium statistical mechanics and calculating the extension of a chain when subject to a constant force. The force-extension response of a freely jointed chain is exactly described by the inverse Langevin function, which is closely approximated by the more convenient FENE force law
$$F^{FENE}(r) = \frac{3 k_B T}{l_k}\,\frac{r/L_s}{1 - (r/L_s)^2} \tag{11}$$
where r is the spring extension and $L_s$ is the spring length when fully stretched [4]. A slightly more accurate approximation to the inverse Langevin function can be obtained using a Padé approximation (see for example Ref. [5]). Most flexible synthetic polymers (polystyrene, poly(ethylene oxide), etc.) and single-stranded DNA have significant bond rotations and should be modeled using the FENE force law. However, many biopolymers resist local bond torsion (e.g., duplex DNA) and are more accurately described by a wormlike chain model, which for $L_s/l_p \gg 1$ is well approximated by the Marko–Siggia spring law [6]
$$F^{wlc}(r) = \frac{k_B T}{l_p}\left[\frac{1}{4}\left(1 - \frac{r}{L_s}\right)^{-2} - \frac{1}{4} + \frac{r}{L_s}\right]. \tag{12}$$
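The two force laws translate directly into code. The following Python sketch assumes scalar extensions with $0 \le r < L_s$ and defaults $k_B T$ to one; the function names are hypothetical.

import numpy as np

def f_fene(r, Ls, lk, kBT=1.0):
    """FENE approximation to the inverse Langevin function, Eq. (11)."""
    s = r / Ls
    return (3.0 * kBT / lk) * s / (1.0 - s**2)

def f_marko_siggia(r, Ls, lp, kBT=1.0):
    """Marko-Siggia wormlike-chain spring law, Eq. (12)."""
    s = r / Ls
    return (kBT / lp) * (0.25 / (1.0 - s)**2 - 0.25 + s)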
It is important to note some limitations and assumptions when using common spring force laws such as Eqs. (11) and (12). These relations are derived assuming a large number of persistence or Kuhn lengths are contained in the spring. More refined calculations show, however, that corrections to the spring laws must be made when the spring has a contour length less than ∼20lk . Furthermore, if the underlying micromechanical model is freely jointed at each node connecting the springs (e.g., freely jointed chains), then spring force laws can be derived for springs containing an arbitrarily small number of Kuhn lengths which will exactly reproduce the force extension response of the micromechanical model. The use of equilibrium statistical mechanics to derive spring forces also raises a subtle point when using a bead-spring model in flow or other nonequilibrium situations. In such cases, it is implicitly assumed that the springs are deformed slowly enough such that the micromechanical model
describing the spring would be able to fully sample its configuration space; in a sense achieving a local equilibration in phase space. Lastly, it is important to note that parameters in the spring force are directly related to the physical polymer system being modeled (e.g., a certain subsection of a polymer chain of length $L_s$ with a given persistence length $l_p$). An extensive discussion of these and other issues in the coarse-graining of polymers into bead-spring chains is given in Underhill and Doyle [7]. The model for a semiflexible polymer is based on the premise that the polymer can be described as a homogeneous, isotropic elastic filament. The energy associated with bending a continuous filament described by a curve r(s) is $U^{bend} = (\kappa/2)\int ds\,|\partial^2 r/\partial s^2|^2$, where s is a contour distance along the chain. The bending rigidity κ is related to the persistence length by $l_p = \kappa/k_B T$. The chain is typically coarse-grained to a series of beads connected by rods which have a bending energy given by
$$U^{bend} = -\frac{\kappa}{a}\sum_{j=2}^{N-1} u_j\cdot u_{j-1} \tag{13}$$
where a is the length of the connecting rod and $u_j$ is the unit vector directed from bead j to j + 1. The bending force is $F^{bend}_i = -\partial U^{bend}/\partial r_i$. This description of a semiflexible polymer is commonly referred to as the wormlike chain or Kratky–Porod model [8]. Systems with constraints (rods between beads) which ensure local inextensibility of a chain are computationally expensive and are sometimes replaced by stiff Fraenkel springs of the form $F^s_i = H(|r_{i+1} - r_i| - a)\,u_i$, where H is the spring constant and a is the equilibrium length of the spring. Pasquali and coworkers have shown that rigid constraints and stiff Fraenkel springs give similar results for the collapse of DNA in poor solvent [9]. They found typical time savings in using stiff springs over rigid constraints are on the order of 10–50-fold. Most researchers use stiff springs only when simulating a polymer which is at or near equilibrium. Detailed comparisons between Fraenkel springs and constraints have not yet been performed for polymers in flow or under large tensions. In general, it is recommended to use rigid rods (with the corrective metric force discussed previously) unless one is certain stiff springs introduce no computational artifacts.
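As an illustration of Eq. (13), a short Python sketch of the bending energy for a discretized chain; bead positions are assumed to be stored as an (N, 3) array, and the helper name is hypothetical.

import numpy as np

def bending_energy(r, kappa, a):
    """Kratky-Porod bending energy, Eq. (13), for bead positions r (N, 3)."""
    bonds = np.diff(r, axis=0)                          # r_{j+1} - r_j
    u = bonds / np.linalg.norm(bonds, axis=1)[:, None]  # unit bond vectors u_j
    # Sum of u_j . u_{j-1} over the N-2 interior joints
    return -(kappa / a) * np.sum(np.einsum('ij,ij->i', u[1:], u[:-1]))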
4.
Numerical Algorithms
We have developed the stochastic differential equation and discussed the types of models and corresponding forces involved. To solve for the trajectory,
one must integrate forward in time. A simple explicit Euler time integration scheme is widely used
$$r_i(t + \delta t) \approx r_i(t) + \frac{dr_i(t)}{dt}\,\delta t. \tag{14}$$
In this discrete version, the random vectors have an expectation value
$$\langle n_i(t)\,n_j(t')\rangle = \delta_{ij}\,\delta_{t t'}\,\boldsymbol{\delta}/\delta t. \tag{15}$$
If the forces are steep, this would require very small time steps. This is particularly important for bead-spring chains with finitely extensible springs such as the FENE or Marko–Siggia force law. The above algorithm allows for the possibility of a spring being stretched beyond its fully extended length. A semi-implicit predictor–corrector method including HI has been developed that prevents this overstretching of springs [5]. The realization of a trajectory requires the sampling of random vectors with the appropriate expectation values. The calculation of these “random” numbers is a significant computational cost of BD simulations. Because of this cost, it is better to use uniform random numbers than Gaussian-distributed ones. These random number generators should be used with care. Öttinger [1] reviews important aspects of random number generators. The use of random variables in the simulation and a finite number of trajectories in the ensemble means that there is intrinsic statistical noise to the method. The size of this error is proportional to $N_T^{-1/2}$, where $N_T$ is the number of independent trajectories. An important technique for the reduction of this error by reducing the proportionality factor (not by increasing $N_T$) is called variance reduction. The exact way of performing variance reduction depends on the system of study. Two important types of variance reduction are importance sampling and subtracting off a control variable [1]. Another issue to be noted for the implementation of BD algorithms concerns the neglect of mass, which is a singular limit. This limit is discussed in detail by Grassia et al. [10]. We note here that one consequence of this singular limit is a drift that results if the diffusivity depends on position. To correct for this drift, a term with the gradient of the diffusivity must be added to the algorithm. However, if a higher order scheme is used such as a midpoint method or predictor–corrector method, the extra term should not be added.
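The uniform-variate substitution amounts to a one-liner; the sketch below only matches the first two moments of a standard Gaussian, which is what the weak (distributional) accuracy of an explicit Euler step requires.

import numpy as np

def unit_variance_uniform(shape, rng=np.random.default_rng()):
    # Uniform on [-sqrt(3), sqrt(3)] has zero mean and unit variance,
    # a cheap stand-in for Gaussian variates in the Euler step (14)-(15)
    return rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=shape)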
5.
Computing Stress
The rheology of most complex fluids is not described well by simple constitutive relations. Central to the study of rheology then is the calculation of the bulk stress tensor σ in a simulation. The bulk stress of the mixture is a linear combination of the solvent contribution σs (typically a simple
Newtonian fluid) and the polymer/colloid contribution $\sigma^p$. The exact form of the stress tensor depends upon the details of the microstructural model. However, the most common model for a polymer or colloid is a collection of discrete beads (with positions $r_i$) for which the particle contribution to the bulk stress is given by the Kramers–Kirkwood expression
$$\sigma^p = -\frac{1}{V}\left\langle \sum_{i=1}^{N} r_i F^h_i \right\rangle \tag{16}$$
where $F^h_i$ is the total hydrodynamic force exerted on bead i, V is the volume of the simulation box and $\langle\cdots\rangle$ denotes an ensemble average. Equation (16) is a quite general result and is still applicable for systems with constraints (e.g., bead-rod chains) and when hydrodynamic interactions are taken into account [3, 4]. While it is important to establish Eq. (16) as a rather general form for the stress tensor, it is useful to consider the specific form of the tensor for restricted classes of models and other commonly encountered cases. Consider first the canonical model employed in Brownian dynamics for a polymer: a series of N beads connected by springs. The polymer contribution to the stress is then
$$\sigma^p = n_p \sum_{i=1}^{N}\left\langle \left(F^s_i + F^{other}_i\right) R_i \right\rangle + n_p N k_B T\,\boldsymbol{\delta} \tag{17}$$
where $R_i$ is the position of a bead relative to the center of mass of the chain, $n_p$ is the number density of polymer molecules, $F^s_i$ is the net spring force on bead i, and $F^{other}_i$ is the sum of all the nonhydrodynamic and nonspring forces exerted on bead i, such as bead–bead excluded volume interactions or bending forces. Equation (17) is valid both when hydrodynamic interactions are accounted for or neglected. The last term in Eq. (17) is the familiar ideal gas-like internal energy contribution from each bead, which is a direct result of applying the usual “equilibration in momentum space” and assuming an isotropic friction tensor for a bead [4]. There can be computational advantages to using certain formulas over others to calculate the stress tensor in particular simulations. To gain some appreciation of this, consider the steady-state rheology of a bead-rod (or bead-spring) chain in a simple shear flow $u^\infty_x(r) = \dot\gamma\, r_y$ and neglect hydrodynamic interactions between beads. Using the Giesekus stress formula, the polymer contribution to the shear viscosity ($\eta^p = -\sigma^p_{xy}/\dot\gamma$) can be written
$$\eta^p = -\frac{n_p \zeta}{2\dot\gamma}\,\frac{d}{dt}\sum_{i=1}^{N}\left\langle R^i_x R^i_y\right\rangle + \frac{n_p \zeta}{2}\sum_{i=1}^{N}\left\langle R^i_y R^i_y\right\rangle. \tag{18}$$
However, because in this example we are only interested in the stress of the system at steady state, we can immediately set the first term on the right side
of Eq. (18) to zero and eliminate it from our simulation code. Not only is this a great simplification, but now one has a simple expression which gives physical insight into the connection between polymer viscosity and molecular configurations. For this example, the effective width of the chain in the shear-gradient direction is directly related to the contribution to the viscosity. Furthermore, simplifying the formula for the stress tensor can aid in the development of scaling relations which generalize the trends [11]. An extensive discussion of the various forms of the stress tensor can be found in Bird et al. [4]. We note that extra care must be taken when handling polymers containing constraint forces. In this case, a brute force calculation of the stress using Eq. (16) will contain some contributions which fluctuate about zero but are quite large (order $\delta t^{-1/2}$). Various efficient algorithms have been developed specifically for this case which essentially filter out these uninteresting “noise” terms [3, 11].
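For concreteness, a Python sketch of evaluating Eq. (17) over an ensemble of bead-spring chains. The array shapes and the helper name are assumptions; the spring and other nonhydrodynamic forces are taken as precomputed, and $k_B T$ defaults to one.

import numpy as np

def polymer_stress(positions, forces, n_p, kBT=1.0):
    """Eq. (17): positions, forces are (M, N, 3) for M chains of N beads."""
    R = positions - positions.mean(axis=1, keepdims=True)  # bead minus center of mass
    # Ensemble (chain) average of sum_i (F_i^s + F_i^other) R_i
    FR = np.einsum('mia,mib->ab', forces, R) / positions.shape[0]
    N = positions.shape[1]
    return n_p * FR + n_p * N * kBT * np.eye(3)             # ideal-gas-like term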
6.
Polymer Solutions in Hydrodynamic Flows
The ability to predict the dynamics and rheology of dilute polymer solutions in simple linear flows is an important benchmark by which researchers evaluate BD simulations and mesoscopic polymer models. In recent years, it has become possible to directly observe single polymer molecule dynamics in flow using double-stranded DNA [12]. DNA is a unique polymer in that its force-extension relation (entropic spring-like force) has been well characterized and for large DNA is accurately described by the Marko–Siggia force law. To date most experiments have been performed with a particular commercially available DNA molecule, λ-DNA, which when stained with typical fluorescent dyes contains approximately 400 persistence lengths. Figure 2 shows a comparison between BD simulations using a free-draining (no HI) bead-spring chain (spring law given by Eq. (12)) for free and tethered λ-DNA in simple linear flows. Excellent quantitative agreement between free-draining BD simulations and experiments of λ-DNA in shear, elongational and mixed flows has been attained (see for example Refs. [13, 14]). The ability to neglect HI in the BD simulations and obtain quantitative agreement with experiments is because λ-DNA contains a modest number of persistence lengths, and so even when HI is included there is little difference in the drag on a coiled and fully extended chain. Recent experiments using extremely long DNA (genomic length, ∼20 000 persistence lengths) have demonstrated conformational hysteresis in extensional flows [15]. The authors show the qualitative nature of this behavior is captured only if HI is included in BD simulations. As it stands now, quantitative description of most trends in the stretching of DNA with lengths up to ∼1000 persistence lengths can be captured without including HI in a simulation. Further simulations need to be done to
Figure 2. Comparison of Brownian dynamics simulations (lines) to single molecule DNA experiments (symbols). Mean fractional extension (stretch of the DNA scaled by the contour length of the molecule) vs. Weissenberg number (Wi) is shown for tethered and free DNA in linear hydrodynamic flows (free extensional flow, tethered shear flow, and free shear flow). Wi is the product of the shear rate (or elongation rate) and the longest relaxation time of the DNA. Figure adapted from Doyle et al. [13].
quantitatively test HI when comparing to larger genomic-length DNA. Furthermore, when polymers are placed near hard surfaces, the hydrodynamic interaction between segments will change due to satisfying the no-slip boundary condition at the surface. These wall effects can be taken into account by numerically solving for the Green’s function for creeping flow in an arbitrary geometry [16]. These modified hydrodynamic interactions are needed in a BD simulation to predict the correct trend of shear-induced migration in channels. However, it has been shown that quantitative agreement between free-draining BD simulations and experiments for the stretch and fluctuations of tethered λ-DNA is obtained if the drag coefficient is merely adjusted [13]. While the ability to observe single polymer molecules is very powerful, a measure of the polymer contribution to the stress is not attained in such experiments. Recently, it has become possible to accurately measure transient extensional stresses of high molecular weight polymer solutions using filament stretching rheometers [17]. The stresses developed in these flows are challenging to model because the elongational nature of the flow leads to large deformation of the initially coiled polymer and brings with it a large contribution by the polymer to the solution viscosity. Hsieh and Larson have performed the most detailed study of the role of HI in BD simulations of polystyrene and DNA in extensional flows [5]. A method was developed to determine the HI
parameters in the model which on one hand will keep the number of springs to a modest number while on the other hand will match the relaxation time or diffusivity of the experimental polymer system and the drag on the fully stretched polymer (estimated from Batchelor’s formula). They confirmed that including HI has little effect for λ-DNA. Hsieh and Larson find that it is necessary to include HI in order to quantitatively match stress–strain behavior up to strains of ∼6 in filament stretching experiments of polystyrene with a molecular weight of 2 million (∼5400 persistence lengths). However, the simulations do not properly predict the experimental values for the high-strain plateau in the stress.
7.
Outlook
Brownian dynamics is a powerful technique to simulate the nonequilibrium dynamics of polymers and other soft matter. Efficient and stable algorithms have been developed which allow for the simulation of a wide class of polymer models, ranging from flexible bead-spring chains to semiflexible bead-rod filaments. Furthermore, the strengths and deficiencies of springs representing a small number of persistence lengths are now well understood, at least for unconfined polymers. Quantitative comparisons of BD simulations to the rheology and dynamics of dilute polymer solutions show that our understanding of the importance and correct implementation of hydrodynamic interactions in a simulation is continuing to evolve. With the growing importance of processing biological and other complex fluids in micro- and even nanochannels, the BD simulation technique will continue to advance to properly treat molecules in tight spaces. Further quantitative comparisons to single molecule DNA experiments, both in ideal bulk flows and in microfluidic devices, will be critical in helping us to evaluate the state of the art in Brownian dynamics simulations.
References [1] H.C. Öttinger, Stochastic Processes in Polymeric Fluids, Springer-Verlag, New York, 1996. [2] M. Fixman, “Construction of Langevin forces in the simulation of hydrodynamic interaction,” Macromolecules, 19, 1204, 1986. [3] D.C. Morse, “Theory of constrained Brownian motion,” Adv. Chem. Phys., 128, 65, 2004. [4] R.B. Bird, C.F. Curtiss, R.C. Armstrong, and O. Hassager, Dynamics of Polymeric Liquids, Volume 2, 2nd edn., Wiley, New York, 1987. [5] C.-C. Hsieh, L. Li, and R.G. Larson, “Modeling hydrodynamic interaction in Brownian dynamics: simulations of extensional flows of dilute solutions of DNA and polystyrene,” J. Non-Newtonian Fluid Mech., 113, 147, 2003.
[6] J.F. Marko and E.D. Siggia, “Stretching DNA,” Macromolecules, 28, 8759, 1995. [7] P.T. Underhill and P.S. Doyle, “On the coarse-graining of polymers into bead-spring chains,” J. Non-Newtonian Fluid Mech., 122, 3, 2004. [8] H. Yamakawa, Helical Wormlike Chains in Polymer Solutions, Springer, Berlin, 1997. [9] A. Montesi, M. Pasquali, and F.C. MacKintosh, “Collapse of a semiflexible polymer in poor solvent,” Phys. Rev. E, 69, 021916, 2004. [10] P.S. Grassia, E.J. Hinch, and L.C. Nitsche, “Computer simulations of Brownian motion of complex systems,” J. Fluid Mech., 282, 373, 1995. [11] P.S. Doyle, A.P. Gast, and E.S.G. Shaqfeh, “Dynamic simulation of freely draining flexible polymers in steady linear flows,” J. Fluid Mech., 334, 251, 1997. [12] S. Chu, “Biology and polymer physics at the single molecule level,” Phil. Trans. R. Soc. Lond. A, 361, 689, 2003. [13] P.S. Doyle, B. Ladoux, and J.L. Viovy, “Dynamics of a tethered polymer in shear flow,” Phys. Rev. Lett., 84, 4769, 2000. [14] R.G. Larson, H. Hu, D.E. Smith, and S. Chu, “Brownian dynamics simulations of DNA molecules in an extensional flow field,” J. Rheol., 43, 267, 1999. [15] C.M. Schroeder, H.P. Babcock, E.S.G. Shaqfeh, and S. Chu, “Observation of polymer conformation hysteresis in extensional flow,” Science, 301, 1515, 2003. [16] R.M. Jendrejack, D.C. Schwartz, J.J. de Pablo, and M.D. Graham, “Shear-induced migration in flowing polymer solutions: simulation of long-chain DNA in microchannels,” J. Chem. Phys., 120, 2513, 2004. [17] G.H. McKinley and T. Sridhar, “Filament stretching rheometry of complex fluids,” Annu. Rev. Fluid Mech., 34, 375, 2002.
9.8 MECHANICS OF LIPID BILAYER MEMBRANES Thomas R. Powers Division of Engineering, Brown University, Providence, RI, USA
All cells have membranes. The plasma membrane encapsulates the cell’s interior, acting as a barrier against the outside world. In cells with nuclei (eukaryotic cells), membranes also form internal compartments (organelles) which carry out specialized tasks, such as protein modification and sorting in the case of the Golgi apparatus, and ATP production in the case of mitochondria. The main components of membranes are lipids and proteins. The proteins can be channels, carriers, receptors, catalysts, signaling molecules, or structural elements, and typically contribute a substantial fraction of the total membrane dry weight. The equilibrium properties of pure lipid membranes are relatively well-understood, and will be the main focus of this article. The framework of elasticity theory and statistical mechanics that we will develop will serve as the foundation for understanding biological phenomena such as the nonequilibrium behavior of membranes laden with ion pumps, the role of membrane elasticity in ion channel gating, and the dynamics of vesicle fission and fusion. Understanding the mechanics of lipid membranes is also important for drug encapsulation and delivery. Lipid molecules are amphiphilic, with a charged or polar hydrophilic head group, and one or two hydrophobic hydrocarbon tails. In water, lipid molecules can self-assemble into sheet-like bilayer structures, with the hydrocarbon tails sandwiched between the hydrophilic head groups [1]. Free edges that expose the hydrophobic core are energetically unfavorable, and tend to cause the membrane to close up into a vesicle. The membrane structures formed by lipid molecules range in size from the 50 nm-diameter vesicles used for internal transport in eukaryotic cells to the 100 µm-diameter giant vesicles studied in vitro as model systems. The discrepancy between the size of such vesicles and the small size of the individual molecules justifies a coarse-grained approach in which the membrane is treated as a continuum. In this approach,
most of the details of the chemistry or structure of the molecules are unimportant. The diffusion constant of lipid molecules in a lipid bilayer membrane at room temperature is roughly $D \approx 10^{-8}\ \mathrm{cm^2\,s^{-1}}$, much smaller than that of a small molecule in water, but still large enough that a single lipid molecule will diffuse to the other side of a 1 µm-diameter vesicle in about 1 s. Furthermore, above the chain-melting temperature [1], there is no crystalline order in the plane of the monolayers. Thus, lipid membranes are two-dimensional fluids. The hydrophobic nature of the hydrocarbon chains has several important consequences. For lipid molecules with two chains, the solubility in water is low, implying a constant total number of lipid molecules in a vesicle. Also, flip-flop of molecules from one monolayer to another is relatively rare since the polar or charged heads would have to pass through the hydrophobic inner region. Thus to a good approximation, the number of lipid molecules in each monolayer is also constant. Despite the hydrophobic chains, lipid membranes are relatively permeable to water. However, the permeability of ions is small; thus, changes in vesicle volume are resisted by osmotic pressure.
1.
Equilibrium Vesicle Shapes
Vesicles exhibit a rich variety of shapes (see Fig. 1): prolate and oblate surfaces of revolution, cup-shaped stomatocytes, pear shapes, budding shapes,
Figure 1. Sketches of observed vesicle shapes (not to scale): (a) prolate, (b) oblate, (c) stomatocyte, (d) pear-shaped, (e) budding, (f) nonaxisymmetric. Shapes (a)–(e) are surfaces of revolution with a vertical symmetry axis; shape (f) has a threefold symmetry axis normal to the plane of the paper.
and nonaxisymmetric “starfish”. In this section we will review the theory used to calculate these shapes (for more comprehensive reviews, see Refs. [2, 3]). In particular, we will see that accounting properly for the bilayer structure is crucial for generating the complete set of shapes. Unlike the interface between oil and water, lipid membranes have a resistance to bending. To develop the elastic theory of membranes, we exploit the small thickness of a membrane and take a purely two-dimensional approach. We begin by ignoring the bilayer structure and consider the minimal model in which the membrane is a monolayer. Once we see how the minimal model fails to account for all of the observed vesicle shapes, we will generalize the model to include bilayer structure. For a fluid membrane, there is no reference configuration for the positions of the constituent molecules (there is a reference density), and the bending energy depends on the current shape only. Therefore, the bending energy must depend on purely geometric quantities, such as curvature. To define the curvature at a point p of a surface, construct the tangent plane at p and measure the height h of the surface above the tangent plane as a function of local Cartesian coordinates with origin p (Fig. 2). To leading order in x and y, the height will be a quadratic form, $h(u) = \sum_{ij} L_{ij}\, u_i u_j$, where $u = (x, y)$. The invariants of the matrix L (the trace and the determinant) are independent of the choice of coordinate system and depend on shape only. Thus, we define the mean curvature $H \equiv \mathrm{tr}\,L/2$ and the Gaussian curvature $K \equiv \det L$; see Ref. [4]. Note that $2H = 1/R_1 + 1/R_2$ and $K = 1/(R_1 R_2)$, where the principal curvatures $1/R_1$ and $1/R_2$ are the eigenvalues of L. For a sphere of radius R, $R_1 = R_2 = R$. For a cylinder of radius R, $R_1 = \infty$ and $R_2 = R$. Note that we must choose a convention to define the sign of H. The direction of positive height could be along either the inward or outward surface normal $\hat{n}$. For a symmetric monolayer (or bilayer) with the same solvent on either side, there is no reason to prefer one choice over the other. The Gaussian curvature, on the other hand, is independent of the choice of normal. Thus, the energy must have the symmetry $\hat{n} \to -\hat{n}$, $H \to -H$, $K \to K$.
Figure 2. Height h(x, y) of the surface above the tangent plane at p.
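Given a fitted quadratic form L, the curvature invariants follow in two lines. A minimal Python sketch using the definitions in the text; the example matrix is illustrative only.

import numpy as np

def curvatures(L):
    """L: symmetric 2x2 matrix of the local quadratic form h = u . L . u."""
    H = np.trace(L) / 2.0     # mean curvature, H = tr L / 2
    K = np.linalg.det(L)      # Gaussian curvature, K = det L
    return H, K

L_example = np.array([[0.5, 0.0], [0.0, 0.25]])
H, K = curvatures(L_example)            # H = 0.375, K = 0.125
k1, k2 = np.linalg.eigvalsh(L_example)  # principal curvatures 1/R1, 1/R2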
We can think of H as a measure of strain due to bending. For small curvature ($d/R_1 \ll 1$ and $d/R_2 \ll 1$, where d is the membrane thickness), the leading terms of the bending energy are
$$F = \frac{\kappa}{2}\int (2H)^2\, dA + \frac{\kappa_G}{2}\int K\, dA. \tag{1}$$
Although the energy is quadratic in the “strains”, it depends nonlinearly on the membrane shape r(u). The Gaussian curvature term is usually disregarded, since the Gauss–Bonnet theorem implies that it is insensitive to continuous changes of shape for a closed vesicle [4]. If we relax the bilayer symmetry requirement, for example because the solution inside the vesicle is different from the solution outside, then we have the spontaneous curvature model [5, 6]: FSC = (κ/2) (2H − c0 )2 dA. We will set c0 = 0 for simplicity, but density-dependent spontaneous curvature terms will arise in our discussions of bilayer structure and active membranes. In real cell membranes, the two monolayers have different compositions, and generally c0 =/ 0. The deformation energy of a fluid membrane is similar to that of a solid plate. But there is a crucial difference between plates and membranes. It costs energy to shear a flat plate in its plane. On the other hand, no energy is required to shear a flat membrane in its plane, since this deformation rearranges the molecules without stretching any bonds. (Note that we assume the shear occurs so slowly that viscous effects are small.) For a solid plate, the ratio of stretching energy to bending energy diverges as the thickness vanishes [15]. To see why bending is easier than stretching, consider a thin solid plate bent into the shape of a section of a cylinder with small curvature 1/R (Fig. 3). The difference in length between the inside and outside faces of the plate is proportional to d/R, where d is the plate thickness. These small strains can add up to a deflection which is large compared to the plate thickness. But to get a similar magnitude deflection from pure stretching requires strains which are independent of the thickness and much larger. The result is that the lowest energy configurations accessible to a thin solid plate are those in which there is no stretching. In the limit of vanishing thickness, we can replace the high penalty for stretching with a constraint which forces any deformation to be isometric, in which the distance between any two nearby points on the surface is fixed. This effective stiffness is known as geometric rigidity. For example, it is easy to bend a sheet of paper into a cylinder or cone shape, but it is impossible to smoothly shape the paper into a spherical cap. However, a flat fluid membrane could be shaped into a spherical cap without stretching – the lipid molecules flow to adjust to the new configuration. Therefore, the fluid nature of membranes implies that the geometric constraint in the limit of vanishing thickness is weaker than the constraint for solid plates; the total area is fixed,
Figure 3. (a) The outer face of the bent plate is longer than the inner face by δd/R; therefore, the strain vanishes as d/R → 0. (b) The strain in the stretched plate is δ/L, independent of d.
but the deformation need not be isometric. This weaker constraint leads to a rich morphology of vesicle shapes. Membranes in solution undulate constantly due to Brownian motion. Although these thermal fluctuations are always present, they are usually not large, and they are often either disregarded or treated as a perturbation in discussions of vesicle shape. In this section, we use the equipartition theorem to estimate the amplitude of the fluctuations. Consider a membrane patch which is almost flat, parametrized in the Monge representation, $r(u) = (x, y, h(x, y))$, where h(x, y) is the height above the xy-plane. Then the bending energy (1) reduces to $F \approx (\kappa/2)\int (\nabla^2 h)^2\, dx\, dy$. Assuming that the patch is square with side length L, and imposing periodic boundary conditions for simplicity, we write the height as a Fourier series, $h(u) = \sum_q \exp(iq\cdot u)\, h_q/L^2$, where the sum over $q = (2\pi m/L, 2\pi n/L)$ runs over all the integers m and n less than a cutoff $m_{max}$. The cutoff corresponds to the shortest excitable wavelength, which is comparable to the thickness of the membrane. Applying the equipartition theorem (each mode of a quadratic energy function contributes $k_B T/2$ to the average energy [3]) to the bending energy yields the mean-square amplitude
$$\langle h_q h_{q'}\rangle = L^2\,\delta_{q+q',0}\,\frac{k_B T}{\kappa q^4}. \tag{2}$$
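As a numerical check of this equipartition result, the Python sketch below sums the mode contributions to ⟨δn·δn⟩ implied by Eq. (2) and compares them with the logarithmic estimate quoted in the text below; the parameter values are illustrative, and the helper name is hypothetical.

import numpy as np

def normal_fluctuation_estimate(L=1.0, a=0.01, kappa=10.0, kBT=1.0):
    """Compare the direct mode sum with (kBT / 2 pi kappa) ln(L/a)."""
    m_max = int(L / a)                    # shortest wavelength ~ molecular size a
    m = np.arange(-m_max, m_max + 1)
    qx, qy = np.meshgrid(2 * np.pi * m / L, 2 * np.pi * m / L)
    q2 = qx**2 + qy**2
    q2[m_max, m_max] = np.inf             # exclude the q = 0 mode
    mode_sum = np.sum(kBT / (kappa * q2)) / L**2
    log_formula = kBT / (2 * np.pi * kappa) * np.log(L / a)
    return mode_sum, log_formula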
The mean square $\langle\delta n\cdot\delta n\rangle$ of the deviation $\delta n \equiv \hat{n} - \hat{z}$ of the unit surface normal $\hat{n}$ from $\hat{z}$ is a measure of the size of the fluctuations. Since the membrane is almost flat, $\delta n \approx -\nabla h$, and [7]
$$\langle\delta n\cdot\delta n\rangle = \frac{1}{L^4}\sum_{q,q'} q\cdot q'\,\langle h_q h_{q'}\rangle \approx \int\frac{d^2 q}{(2\pi)^2}\,\frac{k_B T}{\kappa q^2}. \tag{3}$$
(3)
The integral in (3) is cut off at small wavenumbers by the lateral size of the membrane, and at large wavenumbers by a molecular scale a, such as the size of the molecules or the thickness of the membrane. Approximating the sum as an integral and supposing the wavevectors range from qmin ≈ 2π/L to qmax ≈ 2π/a leads to δn · δn ≈ kB T /(2πκ) ln(L/a). Thus, the magnitude of the fluctuations of the normal become comparable to the magnitude of the normal itself when L ≈ ξP , where the persistence length ξP ≈ a exp(2πκ/kB T ). For a ≈ 1 nm and κ ≈ 10kB T , the persistence length is enormous, and thermal fluctuations may be safely ignored in calculations of vesicle shapes. Perhaps the simplest approach to computing vesicle shapes is to treat the membrane as a monolayer with constant area and enclosed volume, and a resistance to bending. Minimizing the bending energy subject to these constraints leads to the Euler–Lagrange equation
p + 2σ H − 2κ 2H (H 2 − K ) − ∇ 2 H = 0,
(4)
where ∇ 2 is the Laplacian on the surface, and σ and p are the Lagrange multipliers for the constraints of fixed area and volume, respectively [2]. The first two terms of (4) correspond to the Young–Laplace law for an interface, and the rest are the contributions from the bending stresses. Note that the bending terms vanish for a perfectly spherical vesicle, since the bending energy of a sphere of any radius is 8πκ. More generally, if r(u) describes the shape of a vesicle, and if λ is a positive, constant scale factor, then the new shape λr(u) has the same bending energy. We can use this scale-invariance to reduce the two-parameter family of shapes (one for every area A and volume V ) to a one-parameter family. Thus, to find the critical points of F for area λ2 A and volume λ3 V , use the stationary solutions of F + σ A + pV and rescale the shapes is to use the all lengths by λ. A convenient way to parameterize √ reduced volume υ ≡ 3V /(4π R03 ), where R0 ≡ A/(4π) is the radius of the sphere with area A. Since the sphere has the maximum volume for a given area, 0 ≤ υ ≤ 1. Equivalently, since the sphere has minimum area for a given volume, any υ < 1 corresponds to a shape with excess area over the sphere that encloses the same volume. Numerical solution of the Euler–Lagrange equations (4) yields prolate, oblate, and stomatocyte shapes only, with the oblate shapes occuring only in a narrow range of υ. There are no budding shapes, pear shapes, or nonaxisymmetric shapes [2]. Thus, the minimal model captures some of the
Mechanics of lipid bilayer membranes
2637
features of the experimentally measured phase diagram of vesicle shape, but fails to predict some of the observed shapes. The shortcomings of the minimal model arise because it disregards an important physical effect. Bending a bilayer necessarily stretches and compresses the two monolayers, whereas bending a liquid monolayer does not stretch its midplane. For example, consider an element of area dA ≡ n · ∂r/∂u 1 × ∂r/∂u 2 on the midplane of the bilayer. By projecting all the points in dA along the positive and negative normal, we define the regions dA± on each leaf of the bilayer. If 2d is the distance between the two leaves, then a short calculation shows dA± = dA[1 ± 2d H + O(d 2 K )]. This formula is useful for writing ± relative the number densities φ ± of each monolayer in terms of densities φproj ± ± ± to the middle surface of the bilayer: φ dA = φproj dA. If φ0 is the equilibrium density on each monolayer, then the projected dimensionless density is ± /φ0 −1), and relative and total dimensionless projected densities are ρ ± = (φproj + ρ ≡ (ρ −ρ − )/2 and ρ¯ ≡ (ρ + −ρ − )/2. Note that under the bilayer symmetry ˆ H −→ −H, ρ −→ −ρ, and ρ¯ −→ ρ. operation, nˆ −→ −n, ¯ Thus, bilayer symmetry forbids a term in the elastic energy density of the form ρ¯ H, but allows ρ H. The total elastic energy is F=
κ 2
dA(2H )2 + km
dA (ρ − 2dH )2 + ρ −2 ,
(5)
where we have assumed that each monolayer has the same area compression modulus km [2]. The reader should check that (5) correctly gives no stretching or compression cost when the bilayer is bent with curvature H but each monolayer has been allowed to relax to its equilibrium density φ ± = φ0 . This relaxation always possible in a vesicle where the number of molecules is not ± in each leaf of the bilayer is fixed. N ± = dA φproj To reduce the energy density of (5) to a function of shape only, minimize over ρ and ρ¯ subject to the constraint of fixed N ± . Just as for the minimal model, the penalty for changing the total area is much larger than the cost of bending, and we may regard the area as constant. However, the coupling of curvature and density-difference leads to a new elastic term F=
κ 2
dA (2H )2 +
κβ π ( A − A0 )2 , 8Ad 2
(6)
where β = 2km d 2 /(πκ), A = 4d dAH, and A0 = (N + − N − )/φ0 . This model is known as the “area-difference elasticity” model. The parameter β is often regarded as an independent parameter, since its value depends on the model for stretching elasticity. If β → ∞, then (6) becomes the “bilayer couple model”, in which there is a constraint on the integrated mean curvature. Both of these models have an additional parameter, A0 , relative to the single parameter υ of the minimal model. Thus, the phase diagram is now
two-dimensional, and the upshot is that these models can predict the observed shapes. The reader is referred to Refs. [2, 8] for the details.
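A minimal numerical illustration may help here. The sketch below (in Python, with assumed shapes, grid size, and κ = 1 units; it is not part of the analysis above) evaluates the reduced volume υ and the bending energy (κ/2)∫dA(2H)² of axisymmetric spheroids by quadrature, and confirms both the 8πκ result for the sphere and the scale invariance of the bending energy under (a, c) → (λa, λc).

import numpy as np

def spheroid_quantities(a, c, kappa=1.0, n=4000):
    """Reduced volume and bending energy (kappa/2) Int (2H)^2 dA for the
    spheroid x^2/a^2 + y^2/a^2 + z^2/c^2 = 1 (axisymmetric)."""
    t = np.linspace(1e-6, np.pi - 1e-6, n)
    dt = t[1] - t[0]
    trap = lambda f: np.sum(0.5*(f[1:] + f[:-1]))*dt   # trapezoid rule
    rho, z = a*np.sin(t), c*np.cos(t)                  # meridian curve
    rho1, z1 = a*np.cos(t), -c*np.sin(t)               # first derivatives
    rho2, z2 = -a*np.sin(t), -c*np.cos(t)              # second derivatives
    g = np.sqrt(rho1**2 + z1**2)                       # meridian metric
    H = 0.5*((rho1*z2 - z1*rho2)/g**3 + z1/(rho*g))    # mean curvature
    dA = 2*np.pi*rho*g                                 # area element per dt
    A = trap(dA)
    V = abs(trap(np.pi*rho**2*z1))
    E = 0.5*kappa*trap((2*H)**2*dA)
    R0 = np.sqrt(A/(4*np.pi))
    return 3*V/(4*np.pi*R0**3), E

for a, c in [(1.0, 1.0), (1.0, 2.0), (2.0, 4.0)]:      # sphere, prolate, rescaled
    v, E = spheroid_quantities(a, c)
    print(f"a={a}, c={c}: reduced volume = {v:.4f}, E/(8*pi*kappa) = {E/(8*np.pi):.4f}")

The sphere line should print reduced volume 1.0000 and E/(8πκ) = 1.0000, and the last two lines should agree with each other, illustrating that υ and the bending energy depend only on shape, not on size.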
2. Entropic Elasticity
In this section and the next, we consider situations in which membrane stretching cannot be ignored. First we review Evans and Rawicz’s classic measurement of tension vs. apparent areal strain for a spherical vesicle [9]. We saw earlier that collisions of solvent molecules with a membrane excite undulations. If the membrane is stretched across a frame, these undulations will tend to contract the frame, leading to tension. The actual area of the membrane will be greater than the area of the frame. Likewise, a fluctuating vesicle will have an actual area which is larger than its apparent area, the area of the average vesicle shape. Applying tension will reduce the amplitude of the undulations, pulling some of the excess area out of the undulations and increasing the apparent area. Eventually, most of the undulations are smoothed out, and increasing the tension will stretch the membrane like an elastic material. To measure the relation between the tension and the excess area, Evans and Rawicz aspirated a vesicle onto a pipet. At low suction pressures, the vesicle shape fluctuates. With increased suction pressure, the outer portion of the vesicle becomes spherical, and the fluctuations are no longer optically visible. Nevertheless, as the suction pressure increases further, significant excess area is pulled from the microscopic fluctuations, which is reflected by an increase in the length of the vesicle in the pipet (Fig. 4). As the apparent area increases, the tension increases. The fluid nature of the membrane implies that the tension is uniform over the entire vesicle (the experimentalists take care that the membrane does not adhere to the pipet). There are two spherical portions where the Young–Laplace law [Eq. (4) with κ = 0] applies: the main
Figure 4. Vesicle aspirated onto a pipet; Δp = p₂ − p₁. (Sketch labels: vesicle radius R0 and pressure P1; projection radius Rp and pipet pressure P0.)
body of the vesicle with radius R0, and the little spherical cap with radius Rp inside the pipet. Eliminating the pressure inside the vesicle from these relations yields the tension

τ = Δp Rp / [2(1 − Rp/R0)],   (7)
where Δp is the pressure difference between the solvent and the pipet. To determine the apparent areal strain, define α ≡ (A − A0)/A0, where A and A0 are the final and initial apparent areas of the vesicle, respectively. The strain α is the (dimensionless) difference in the excess areas stored in the undulations at tensions τ0 and τ, where τ0 is the tension corresponding to the area A0. Typically A0 is the apparent area at the lowest suction that holds the vesicle onto the pipet. If R̃0 is the final vesicle radius, R0 is the initial vesicle radius, and L is the length of the membrane projection inside the pipet, then αA0 = 2πRp L + 4π(R̃0² − R0²). Using volume conservation, L/R0 ≪ 1, and Rp/R0 ≪ 1, it follows that

α = (1/2) [(Rp/R0)² − (Rp/R0)³] L/Rp.   (8)
Thus, α and τ may be determined by measuring R0 and L as Δp is varied. There are two regimes. For α less than a few percent, the resistance to stretching is purely entropic, and the tension is related nonlinearly to strain: τ ∝ exp α. For greater strains, most of the thermal undulations have been pulled out, and the intrinsic membrane elasticity leads to a linear (Hooke’s law) response.
To understand these two regimes, we present the self-consistent approach of Helfrich and Servuss [10] (see also Ref. [3]). The problem is considerably simplified if we consider a membrane attached to a square frame, and allow the membrane area (but not the area of the frame) to fluctuate. In this approach, the “tension” τ plays the role of a chemical potential per area. Our goal is to compute the excess area hidden in the fluctuations as a function of τ. Up to an additive constant, the free energy to quadratic order is

F ≈ (κ/2) ∫ d²u (∇²h)² + (τ/2) ∫ d²u (∇h)²,   (9)

where the second term arises from the expansion of the membrane area, Am = ∫ d²u √(1 + (∇h)²). The average excess area hidden in the undulations is

⟨Am⟩ − L² ≈ (1/2) ∫ d²u ⟨(∇h)²⟩,   (10)

or, using the equipartition theorem [now with ⟨|hq|²⟩/L² = kB T/(κq⁴ + τq²)],

(⟨Am⟩ − L²)/L² = (1/4π) ∫_{π/L}^{Λ} [kB T q/(κq² + τ)] dq,   (11)
where Λ ∼ π/a is the short-wavelength cutoff, and π/L is the long-wavelength cutoff. Evaluating the integral leads to the dimensionless excess area

(⟨Am⟩ − L²)/L² = [kB T/(8πκ)] ln[(κΛ² + τ)/(κ(π/L)² + τ)].   (12)
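Equation (12) is easy to evaluate numerically. The short sketch below plugs in representative numbers (the bending stiffness, molecular cutoff, and patch size are assumed order-of-magnitude values, not data from Ref. [10]):

import numpy as np

kB_T = 4.1e-21                  # J, room temperature
kappa = 20*kB_T                 # assumed bilayer bending stiffness
a, L = 1e-9, 10e-6              # assumed molecular cutoff and patch size (m)
Lam = np.pi/a                   # short-wavelength cutoff

def excess_area_fraction(tau):
    """Eq. (12): fractional area stored in undulations at tension tau (N/m)."""
    return kB_T/(8*np.pi*kappa)*np.log(
        (kappa*Lam**2 + tau)/(kappa*(np.pi/L)**2 + tau))

for tau in (1e-9, 1e-6, 1e-3):
    print(f"tau = {tau:.0e} N/m: excess area fraction = {excess_area_fraction(tau):.4f}")

Increasing the tension by several decades changes the stored area only logarithmically, which is the origin of the exponential tension–strain relation derived next.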
To relate this calculation to the experiment of Evans and Rawicz, define αF to be the difference in the areas hidden in the fluctuations at zero tension and at tension τ: αF ≡ (⟨Am⟩_{τ=0} − ⟨Am⟩_τ)/L². Except for geometrical factors which arise because αF applies to a planar geometry and α applies to a spherical geometry, αF and α both describe precisely the same difference in excess area (assuming τ0 ≈ 0). Thus, we identify α = αF, and let L² = A, the apparent area. In the low-tension regime, τ ≪ κΛ², which means that Λ drops out of the difference ⟨Am⟩_{τ=0} − ⟨Am⟩_τ. Finally, we combine the intrinsic membrane elasticity with the entropic elasticity by assuming these two effects can be modeled as springs in series:

α ≈ [kB T/(8πκ)] ln[1 + τA/(κπ²)] + τ/KA,   (13)

where KA is the intrinsic stretch modulus. For realistic values of tension, area, and bending stiffness, it turns out that τA/(κπ²) ≫ 1, and the logarithmic term dominates at low tension. Thus, τ ∝ exp(α) at low tension. Evans and Rawicz determined the bending stiffness κ and the stretch modulus KA by fitting (13) to their measurements of τ vs. α.
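In practice, Eqs. (7), (8), and (13) translate directly into a small analysis pipeline: compute τ and α from the measured Δp, Rp, R0, and L, then fit (13). The sketch below does this with synthetic "data" generated from the model itself; all parameter values are assumed for illustration, not taken from Ref. [9].

import numpy as np
from scipy.optimize import curve_fit

kBT = 4.1e-21                         # J

def tension(dp, Rp, R0):
    """Eq. (7): tension from the suction pressure and the two radii."""
    return dp*Rp/(2.0*(1.0 - Rp/R0))

def strain(Lproj, Rp, R0):
    """Eq. (8): apparent areal strain from the projection length."""
    x = Rp/R0
    return 0.5*(x**2 - x**3)*Lproj/Rp

def model(tau, kappa, KA, A=4*np.pi*(10e-6)**2):
    """Eq. (13): entropic (logarithmic) plus intrinsic (linear) response."""
    return kBT/(8*np.pi*kappa)*np.log(1.0 + tau*A/(kappa*np.pi**2)) + tau/KA

tau_data = np.logspace(-6, -3, 20)                 # N/m
alpha_data = model(tau_data, 20*kBT, 0.2)          # synthetic "measurements"
(kappa_fit, KA_fit), _ = curve_fit(model, tau_data, alpha_data,
                                   p0=(1.0e-19, 0.1))
print(f"kappa = {kappa_fit/kBT:.1f} kBT, K_A = {KA_fit:.3f} N/m")

Plotting ln τ against α for such data exhibits the two regimes directly: a straight line at small α (entropic), crossing over to the Hookean branch at larger strains.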
3. Active Membranes
Our discussion so far has been confined to membranes in thermal equilibrium, or “passive membranes”. However, since living systems are by definition out of equilibrium, it is desirable to extend our discussion to nonequilibrium processes. This area is largely unexplored. In this section, we review recent work on the effects of ion pumps on membrane fluctuations [11]. The fluctuations of red blood cell membranes have been observed for many years, and have been interpreted using the equilibrium theory discussed in Section 2 [12]. However, recent observations have cast doubt on the purely thermal origin of these fluctuations and shown them to depend on the concentration of ATP [13], the energy source for many of the active processes in the cell. The fluctuations increase with ATP concentration, and for a given ATP concentration, decrease as the solvent viscosity increases. When ATP is depleted, the fluctuations become independent of viscosity. Enhanced fluctuations due to non-thermal effects have also been seen in an artificial system
consisting of a lipid vesicle studded with bacteriorhodopsin (BR) [14]. BR is a light-driven proton pump purified from the purple membrane of the salt-loving bacterium Halobacterium salinarum. Manneville et al. aspirated the vesicle onto a pipet and measured the tension versus the areal strain to find that fluctuations were enhanced when the BR was “on” (illuminated by light of the proper wavelength). To begin to understand these phenomena, we outline a simple theory for an almost flat membrane embedded with diffusing pumps. When a pump transfers an ion across the membrane, the pump exerts a force on the membrane. We will neglect the random fluctuations of the force over time, and suppose each pump exerts a constant force Fp ≈ 10kB T/d, where 10kB T is the free energy given up by ATP hydrolysis at physiological concentrations, and d is the membrane thickness. Each pump acts in only one direction, and cannot flip across the membrane to change its orientation. Assume there is an equal number of upward- and downward-directed pumps, with number densities n⁺ and n⁻, respectively, so that on average n⁺ − n⁻ = 0. Just as in our discussion of bilayer structure, it is convenient to introduce the dimensionless total density φ ≡ (n⁺ + n⁻)/n0 and density difference ψ ≡ (n⁺ − n⁻)/n0, where n0 is the average of n⁺ + n⁻. Bilayer symmetry allows a coupling between the density difference and the mean curvature, but implies that φ is decoupled from the other fields (φ → φ and ψ → −ψ under n̂ → −n̂), leading to a quadratic energy similar to (5):

F = (1/2) ∫ d²u [κ(∇²h)² + τ(∇h)² + Bψ² − 2κc0 ψ∇²h],   (14)
where B is a compression or osmotic modulus. At low pump density, B ∼ n0 kB T, since the compression modulus of an ideal gas is the pressure. When the pumps are on, they consume ATP and the equipartition theorem cannot be used to determine the variance of h; instead, we must solve the dynamical equations of motion. The details of this calculation are technical and we refer the reader to the review of Ramaswamy and Rao [11]; here we just summarize the main points. At typical membrane length and time scales, inertia is unimportant, and the equation for the height field amounts to a balance of elastic, viscous, and pump forces. The elastic forces per unit area can be deduced from the elastic energy (14). To determine the pump force per unit area f_pumps, let Fp be the force exerted by an isolated pump when H = 0. The symmetry n̂ → −n̂ allows a coupling between density difference and mean curvature,

f_pumps = (n⁺ − n⁻)Fp + (n⁺ + n⁻)Ω 2H Fp,   (15)

where Ω is a coupling constant with units of length. Finally, the simplest model for the viscous forces is to disregard the hydrodynamics of the solvent and use a local model, analogous to the Rouse model for polymer dynamics.
In the membrane version of this model, the local membrane velocity is proportional to the elastic and pump forces per unit area, with the permeation constant μp acting as the constant of proportionality. The dynamics for h is completed with a random force density representing the effects of Brownian motion, leading to a Langevin equation [7]. The equation of motion for ψ is the diffusion equation modified with terms corresponding to the coupling of density and curvature in (14), and again with a random forcing term. Solving the coupled equations for the variance of h and computing the areal strain for small tensions leads to Δα ≡ α(τ0) − α(τ) ≈ (kB Teff/κeff) ln(τ/τ0), with

kB Teff = kB T + κeff μp Fp Ω̄ / D.   (16)

In (16), D is the diffusion constant of the pumps, κeff = κ − (κc0)²/B, and Ω̄ = Ω + κc0/B. In this model, the fluctuations are enhanced by the pumps, but have the same dependence on tension as in passive membranes. Note that increasing the solvent viscosity should decrease μp and thus Δα, in accord with the observations of Tuvia et al. [13]. Increasing the membrane viscosity should decrease D, enhancing the nonequilibrium fluctuations.
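A minimal check of the passive limit of this Langevin description (pumps off, local Rouse-like dynamics, reduced units; all numerical values below are assumed for illustration) is to evolve a single Fourier mode of the height field and compare its stationary variance with the equipartition result ⟨|hq|²⟩ = kB T L²/(κq⁴ + τq²) quoted in Section 2:

import numpy as np

rng = np.random.default_rng(0)
kBT, kappa, tau, L, mu = 1.0, 20.0, 5.0, 50.0, 1.0   # reduced units (assumed)
q = 3*(2*np.pi/L)                                     # one Fourier mode
k_eff = (kappa*q**4 + tau*q**2)/L**2                  # per-mode stiffness
dt = 0.05/(mu*k_eff)                                  # fraction of the relaxation time
h, samples = 0.0, []
for step in range(200_000):
    # overdamped Langevin update: drift toward h = 0 plus thermal kicks
    h += -mu*k_eff*h*dt + np.sqrt(2.0*mu*kBT*dt)*rng.standard_normal()
    if step > 20_000:                                 # discard the transient
        samples.append(h*h)
print("simulated  <|h_q|^2> =", np.mean(samples))
print("equipartition value  =", kBT*L**2/(kappa*q**4 + tau*q**2))

Turning the pumps on amounts to adding the curvature–density coupling of Eq. (15) and a second Langevin equation for ψ, which is precisely the calculation summarized above.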
4. Outlook
The good agreement between theoretical predictions and experimental observations of vesicle shapes demonstrates that the equilibrium elastic behavior of lipid bilayer membranes is well-understood. Likewise, we have seen that quantitative understanding of entropic tension has progressed to the point that measurements of thermally generated tension may be used as a tool for probing new phenomena such as the fluctuations of active membranes. These fluctuations are but one example requiring a deeper understanding of the dynamics of membranes, which is likely to be the main front for future developments.
Acknowledgment

Preparation of this review was supported in part by Grant No. CMS-0093658 from the National Science Foundation.
References

[1] J. Israelachvili, Intermolecular and Surface Forces, 2nd edn., Academic Press, London, 1992.
[2] U. Seifert, “Configurations of fluid membranes and vesicles,” Adv. Phys., 46, 13, 1997.
[3] D. Boal, Mechanics of the Cell, Cambridge University Press, Cambridge, 2002.
[4] R. Kamien, “The geometry of soft materials: a primer,” Rev. Mod. Phys., 74, 953, 2002.
[5] P. Canham, “The minimum energy of bending as a possible explanation of the biconcave shape of the human red blood cell,” J. Theor. Biol., 26, 61, 1970.
[6] W. Helfrich, “Elastic properties of lipid bilayers: theory and possible experiments,” Z. Naturforsch., 28c, 693, 1973.
[7] P. Chaikin and T. Lubensky, Principles of Condensed Matter Physics, Cambridge University Press, Cambridge, 1995.
[8] H.-G. Döbereiner, E. Evans, M. Kraus, U. Seifert, and M. Wortis, “Mapping vesicle shapes into the phase diagram: a comparison of experiment and theory,” Phys. Rev. E, 55, 4458, 1997.
[9] E. Evans and W. Rawicz, “Entropy-driven tension and bending elasticity in condensed-fluid membranes,” Phys. Rev. Lett., 64, 2094, 1990.
[10] W. Helfrich and R.-M. Servuss, “Undulations, steric interaction and cohesion of fluid membranes,” Nuovo Cimento D, 3, 137, 1984.
[11] S. Ramaswamy and M. Rao, “The physics of active membranes,” C.R. Acad. Sci. IV Paris, 2, 817, 2001.
[12] F. Brochard and J. Lennon, “Frequency spectrum of the flicker phenomenon in erythrocytes,” J. Phys. (Paris), 36, 1035, 1975.
[13] S. Tuvia, A. Almagor, A. Bitler, S. Levin, R. Korenstein, and S. Yedgar, “Cell membrane fluctuations are regulated by medium viscosity: evidence for a metabolic driving force,” Proc. Natl. Acad. Sci. USA, 94, 5045, 1997.
[14] J.-B. Manneville, P. Bassereau, S. Ramaswamy, and J. Prost, “Active membrane fluctuations studied by micropipet aspiration,” Phys. Rev. E, 64, 021908, 2001.
[15] J. Rayleigh, The Theory of Sound, vol. I, 2nd edn., Dover Publications, New York, 1945.
9.9 FIELD-THEORETIC SIMULATIONS

Venkat Ganesan¹ and Glenn H. Fredrickson²
¹ Department of Chemical Engineering, The University of Texas at Austin, Austin, TX, USA
² Department of Chemical Engineering & Materials, The University of California at Santa Barbara, Santa Barbara, CA, USA
The science and engineering of materials is entering a new era of so-called “designer materials”, wherein, based upon the properties required for a particular application, a material is designed by exploiting the self-assembly of appropriately chosen molecular constituents [1]. The desirable and marketable properties of such materials, which include plastic alloys, block and graft copolymers, and polyelectrolyte solutions, complexes, and gels, depend critically on the ability to control and manipulate morphology by adjusting a combination of molecular and macroscopic variables. For example, styrene–butadiene block copolymers can be devised that serve either as rigid, tough, transparent thermoplastics or as soft, flexible, thermoplastic elastomers, by appropriate control of copolymer architecture and styrene/butadiene ratio. In this case, the property profiles are intimately connected to the extent and type of nanoscale self-assembly that is established within the material. One of the main challenges confronting the successful design of nano-structured polymers is the development of a basic understanding of the relationship between the molecular details of the polymer formulation and the morphology that is achieved. Unfortunately, such relationships are still mainly determined by trial-and-error experimentation. A purely experiment-based program in pursuit of this objective proves cumbersome, primarily due to the broad parameter space accessible at the time of synthesis and formulation. Consequently, there is a significant motivation for the development of computational tools that can enable a rational exploration of the parameter space. Atomistically faithful computer simulations of self-assembly in dense phases of soft materials prove to be difficult or impossible for many systems of practical interest [2, 3]. Such methods typically involve building classical descriptions (with atomic resolution) of a complex fluid. Interactions in such models are
described by some combination of bonded and non-bonded potential functions, typically parameterized at the two-body and/or three-body level. Determining the equilibrium and non-equilibrium properties involves carrying out a computer simulation, usually by employing Monte Carlo (MC) or molecular dynamics (MD) techniques. The major drawback of such atomistic methods is that, except in rare instances, it is very difficult to equilibrate sufficiently large systems at realistic densities in order to extract meaningful information about structure and thermodynamics. Such limitations become particularly significant in the context of modeling inhomogeneous self-assembled phases, wherein the length scale of the self-assembled morphology corresponds to many atomic lengths. Consequently, most modern computer simulation methods for self-assembly in complex fluids have focused on coarse-grained, particle-based methods, i.e., particle-based simulations (PBS) [4–6], where atoms or groups of atoms are lumped into larger “particles”. This could simply amount to a “united atom” approach where, e.g., each CH₂ unit in a polyethylene chain is replaced by a single effective particle. Interactions in such a model are then effective interactions between lumped CH₂ particles, and standard MC or MD simulation methods can be employed. Often even more extensive coarse-graining is carried out. For example, bead-spring polymer chains are often employed in which each bead can represent the force center associated with 10 or more backbone atoms. Despite the great success and impact of such approaches, a difficulty is that there is no unique coarse-graining procedure and that the effective interactions between beads (particles) are often difficult to parameterize accurately. Moreover, PBS methods remain computationally expensive, especially at melt densities and for heterogeneous systems that exhibit nanoscale or macroscale phase separation [7]. Alternative methods such as dissipative particle dynamics (DPD) [8] speed up the simulations by introducing soft inter-particle potentials, but at the cost of producing artificially high compressibilities, loss of topological constraints between chains, and often also a loss of connection to the chemical details of the underlying complex fluid. In the above modeling strategies, the fundamental degrees of freedom to be sampled in a computer simulation are the generalized coordinates (including bond and torsional angles) associated with the atoms or particles. An alternative approach for computing equilibrium properties is the focus of this review and is termed field-theoretic simulations (FTS). Field theory based approaches have long been used as the basis for approximate analytical calculations on a variety of complex fluid systems including polymer solutions, melts, blends, and copolymers [9–11]. Further, lattice gauge simulation methods [12, 13] have also been applied to field theories in nuclear, high energy, and “hard” condensed matter physics. However, it is only very recently that such field-theory models have been considered as the starting point for a computer
simulation strategy for polymers or other soft condensed matter systems [14, 15]. In the field-theoretic framework one integrates out the particle coordinates in the partition function, replacing them instead with functional integrals over smoothed, coarse-grained density fields and/or chemical potential fields that are conjugate to such density fields. The degrees of freedom in the FTS scheme are the values of the density and the chemical potential fields at different positions in the simulation domain. In order to represent these fields with a finite number of degrees of freedom, a variety of techniques are available including spectral methods, finite differences, and finite elements. These techniques are used extensively for the numerical solution of partial differential equations, e.g., in computational fluid mechanics [16], and a vast literature exists for efficiently and accurately representing fields of physical interest. Subsequently, an appropriate sampling strategy is used to generate different configurations of the fields (“ensembles”) with a thermodynamically consistent statistical weight, allowing one to calculate the experimental observables and the equilibrium thermodynamic properties of the material through averages over such fluctuating fields. It is hard to delineate precisely the systems for which field-theory based approaches should work more effectively than PBS and vice versa. Small-molecule fluids and complex fluids that are characterized by harsh repulsions at small separations (like, for instance, suspensions of colloidal particles, liquid crystals, etc.) possess rich liquid structure at small length scales that plays an important role in determining the thermodynamic properties of such materials [17]. Field-theoretic approaches would typically require a high spatial resolution (and correspondingly larger computational effort) to capture such short-range effects, and hence such materials are better simulated by PBS methods. On the other hand, PBS methods are computationally expensive, especially at high densities and for heterogeneous systems that exhibit nanoscale or macroscale phase separation. In contrast, field theories, and simulations of such theories, work best for situations where the number densities of the particles are quite large, so that the fluctuations in a coarse-grained description are kept at a modest level. Consequently, field-theoretic approaches offer an attractive platform (from the standpoint of computational effort) to simulate inhomogeneous polymers in situations where the short-range details turn out to be unimportant in determining the macroscale behavior. Examples include the cases of concentrated polymer solutions and blends, multiblock copolymers, charge-stabilized colloidal suspensions, etc. In the following, we take up the simple case of a fluid of colloidal particles to illustrate the two steps involved in developing and implementing the FTS approach. Many details, which have been omitted in the development to preserve brevity, can be found in Ref. [15]. We conclude with an overview of some recent applications and some potential future directions.
1. Step 1: Developing the Field Theory
The first step of FTS requires the development of a model of the classical/complex fluid whose thermodynamic properties are desired. In principle, such a model could be constructed at the atomistic scale (with interactions determined from quantum chemical calculations) or at a coarse-grained PBS scale with effective interactions between the lumped “particles”. In some instances the microscopic model could have particles smeared into fields, as in polymeric fluids where chains are often represented as flexible space curves. As an illustrative example, we consider the case of a fluid of colloidal particles that interact by means of a specified general pairwise interaction potential v(r). For a collection of n such particles in a volume V, the configurational partition function can be written as

Z ∝ ∫ drⁿ exp[−(β/2) Σ_{j≠k} v(|rj − rk|)],   (1)
where rⁿ ≡ (r1, r2, . . . , rn) and rj represents the coordinates of particle j. In the above, β ≡ (kB T)⁻¹, where kB represents the Boltzmann constant and T the temperature. We note that the above framework also constitutes the starting point of PBS strategies. In such approaches, different configurations are characterized by different positions of the particles (or molecules), and the corresponding simulation strategies focus on generating ensembles of such configurations with the statistical weight exp[−(β/2) Σ_{j≠k} v(|rj − rk|)] [18].
The starting point for the FTS method smears out the coordinates of the particles, describing the thermodynamics of the system by means of a coarse-grained density field ρ(r). The appropriate statistical weights for such a description are obtained through two steps: (i) The first step is to specify the relationship between the microscopic density field and the positions of the particles. It is possible to postulate a variety of physically reasonable prescriptions for smearing out the positions of the particles into a density field. In the present context, we consider the simple case of “point” particles, where we can introduce a microscopic density field ρ̂(r) ≡ Σ_{i=1}^{n} δ(r − ri). We can rewrite the above partition function in terms of ρ̂(r) as

Z ∝ ∫ drⁿ exp[−(β/2) ∫ dr ∫ dr′ ρ̂(r) v(|r − r′|) ρ̂(r′)],   (2)
where an unimportant self-energy term has been omitted. (ii) The second step involves “projecting” the above microscopic description to an effective
“Hamiltonian” H[ρ] that is a functional of the real density fields ρ(r):

exp(−βH[ρ]) ∝ ∫ drⁿ Π_r δ[ρ(r) − ρ̂(r)] × exp[−(β/2) ∫ dr ∫ dr′ ρ̂(r) v(|r − r′|) ρ̂(r′)].   (3)

The above step can be simply understood as enumerating the microscopic configurations for which ρ̂(r) = ρ(r), and ascribing to them the weight exp(−βH[ρ]). Using the identity δ(ρ − ρ̂) = ∫ Dw exp[i ∫ dr w(ρ − ρ̂)], where i = √−1 and ∫ Dw denotes a functional integral over a scalar “chemical potential” field w(r), we obtain

exp(−βH[ρ]) ∝ ∫ Dw ∫ drⁿ exp[i ∫ dr w(ρ − ρ̂) − (β/2) ∫ dr ∫ dr′ ρ(r) v(|r − r′|) ρ(r′)].   (4)
After a few simple mathematical manipulations we obtain

exp(−βH[ρ]) ∝ ∫ Dw exp(−Ĥ[ρ, w]),   (5)

where

Ĥ[ρ, w] = −i ∫ dr wρ + (β/2) ∫ dr ∫ dr′ ρ(r) v(|r − r′|) ρ(r′) − n ln Q[iw],   (6)

and

Q[iw] = V⁻¹ ∫ dr exp[−iw(r)].   (7)
Note that the above mathematical manipulations have transformed the integrals over the coordinates of the particles rⁿ to a functional integral over a real chemical potential field w(r). This step has also had the effect of eliminating all the particle–particle interactions, instead replacing them with the interaction between the particles and the fluctuating field w(r). As a result the coordinate integrals ∫ drj are identical for each particle, allowing us to replace the n integrals by (Q[iw])ⁿ, where Q[iw] represents the single-particle partition function in a purely imaginary potential iw(r) [9, 15]. For the case of flexible polymers, Eq. (7) is replaced by an expression related to the solution of a modified diffusion equation describing the conformations of a polymer experiencing the potential iw(r). The formulation embodied in Eq. (5) represents the thermodynamic description at the scale of coarse-grained density fields ρ(r). Indeed, the partition function Z can be equivalently expressed as

Z ∝ ∫ Dρ ∫ Dw exp(−Ĥ[ρ, w]).   (8)
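To see how little machinery Eqs. (5)–(8) require once discretized, the following sketch (assumed one-dimensional periodic box, assumed soft Gaussian pair potential, and arbitrary trial fields, all chosen purely for illustration) evaluates Q[iw] of Eq. (7) and the effective Hamiltonian Ĥ[ρ, w] of Eq. (6) on a grid. Note that Ĥ comes out complex, which is the feature that Step 2 below must confront.

import numpy as np

Lbox, M, n, beta = 10.0, 128, 50, 1.0
x = np.linspace(0.0, Lbox, M, endpoint=False)
dx = Lbox/M
r = np.abs(x[:, None] - x[None, :])
r = np.minimum(r, Lbox - r)                        # periodic (minimum-image) distance
v0, sigma = 1.0, 0.5
v = v0*np.exp(-r**2/(2*sigma**2))                  # assumed soft Gaussian pair potential

w = 0.3*np.cos(2*np.pi*x/Lbox)                     # arbitrary trial potential field
rho = (n/Lbox)*(1.0 + 0.2*np.cos(2*np.pi*x/Lbox))  # arbitrary trial density field

def Q_of_iw(w):
    """Eq. (7): single-particle partition function in the potential i*w(r)."""
    return np.sum(np.exp(-1j*w))*dx/Lbox

def H_eff(rho, w):
    """Eq. (6), discretized on the grid; complex-valued in general."""
    term_w = -1j*np.sum(w*rho)*dx                  # -i Int dr w*rho
    term_v = 0.5*beta*(rho @ v @ rho)*dx*dx        # pair-interaction term
    return term_w + term_v - n*np.log(Q_of_iw(w))

print("Q[iw]     =", Q_of_iw(w))
print("H^[rho,w] =", H_eff(rho, w))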
Thermodynamic observables like the average densities, density correlations, and osmotic pressures can be computed as appropriate ensemble averages over the different configurations of the density fields ρ(r). For instance, the density–density correlations ⟨ρ(r)ρ(r′)⟩ can be expressed as

⟨ρ(r)ρ(r′)⟩ = Z⁻¹ ∫ Dρ ∫ Dw ρ(r)ρ(r′) exp(−Ĥ[ρ, w]).   (9)
The above approach can also be easily generalized to describe the thermodynamics of multicomponent systems in terms of the density fields of each of the components. As might be evident, the introduction of additional density fields would also necessitate the introduction of a conjugate chemical potential field for each such density [15, 19]. More recent applications have also generalized the above description to incorporate additional coarse-grained variables like stress fields to describe the thermodynamics of deformed systems [20].
2. Step 2: Discretizing and Sampling the Field Theory
The transformation of Eq. (1) into the field-theoretic formulation of Eqs. (5), (8) and (9) would seem to be a step in the wrong direction, since we are now faced with the task of evaluating infinite-dimensional functional integrals over the fields ρ(r) and w(r). However, upon discretization, the functional integrals reduce to finite-dimensional integrals that can be tackled with stochastic simulation methods. In such methods, large numbers of configurations of the fields ρ(r) and w(r) are generated with a probability weight exp(−Ĥ[ρ, w]), and the field configurations are used to evaluate thermodynamic averages like Eq. (9). The latter constitutes the basis of FTS, where the space (i.e., the simulation domain) is explicitly discretized and different configurations of the density and potential fields (as specified by their values at the different points in the domain) are generated with a weight exp(−Ĥ[ρ, w]) by an appropriate sampling scheme [15, 19]. The appropriate scheme for discretizing the spatial domain is a highly flexible component of FTS and can be accomplished by conventional finite difference or finite element representations of the fields. Spectral and pseudo-spectral techniques are particularly attractive for this purpose [21, 22]. Modern adaptive, unstructured finite element methods [16] could also be applied. On the other hand, the appropriate sampling scheme for generating the density and the potential fields involves some subtle issues which are discussed below. Conventional simulation strategies such as Monte Carlo schemes and molecular dynamics methods are well tuned to generating configurations with a given positive definite weight [18]. However, the effective “Hamiltonians” Ĥ[ρ, w] which accompany FTS are complex (due to the appearance of i = √−1 in Eq. (6)), despite the fact that the fields ρ and w and the partition function Z are
real. If Ĥ is decomposed as ĤR + iĤI, then it follows that Z can be expressed as Z = ∫ Dρ ∫ Dw exp(−ĤR) cos(ĤI), where the integrand is explicitly real, but is not positive semi-definite due to the phase factor cos(ĤI). To overcome this subtle, albeit important feature of the FTS weights, two modified simulation strategies have been developed.
2.1. Steepest Descent Sampling
This method [15, 23] tackles the issue of positive definiteness by generating the different configurations of ρ and w fields with a modified positive semi-definite weight, viz., exp(−ĤR). Averages of observables are then computed by using an explicit phase factor, cos(ĤI):

⟨φ([ρ])⟩ = ⟨φ([ρ]) cos(ĤI)⟩R / ⟨cos(ĤI)⟩R,   (10)
where the averages ⟨· · ·⟩R are now over the potential fields generated with a weight exp(−ĤR). While this approach is simple, in practice it is not very useful. The integration path, which in this case is the real axis for the density and potential fields, is not a constant-phase or steepest-descent (ascent) path, whence the phase factor oscillates in sign from state to state along a Monte Carlo trajectory. As a result, it proves very difficult to accurately compute the right-hand side of expressions such as Eq. (10). The steepest-descent (SD) simulation strategy adds one more step to the above idea, where such oscillations are minimized by deforming the path of integration onto a new path that passes through the relevant saddle point and is also a constant-phase path to quadratic order near the saddle point [24]. In this manner, provided that fluctuations about the relevant saddle point are weak, the phase factor cos(ĤI) is kept at (an almost) constant value, thereby damping the spurious oscillations encountered in evaluating the above averages. In situations of strong fluctuations, a global stationary phase technique has recently become available [25].
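The reweighting of Eq. (10) is easy to exercise on a toy problem. The one-variable analogue below (an assumed illustration, not part of the formalism above) uses Ĥ(x) = x²/2 + ibx, for which the exact value of ⟨x²⟩ is 1 − b². Metropolis sampling with weight exp(−ĤR) plus the cosine phase factor recovers it, though the signal degrades rapidly as b grows; this is the sign problem in miniature.

import numpy as np

rng = np.random.default_rng(1)
b = 0.5                                  # strength of the imaginary part
H_R = lambda x: 0.5*x*x                  # real part of the toy action
H_I = lambda x: b*x                      # imaginary part

x, num, den = 0.0, 0.0, 0.0
nsteps, nburn = 400_000, 10_000
for i in range(nsteps):
    xp = x + rng.normal(0.0, 1.0)                    # trial move
    if rng.random() < np.exp(H_R(x) - H_R(xp)):      # Metropolis on exp(-H_R)
        x = xp
    if i >= nburn:
        num += x*x*np.cos(H_I(x))
        den += np.cos(H_I(x))
print("reweighted <x^2> =", num/den, "   exact:", 1.0 - b*b)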
2.2. Complex Langevin Sampling
The Complex Langevin (CL) method was originally developed by Klauder [26] and Parisi [27] as a strategy for sampling quantum field theories on a lattice and for simulating more general types of lattice gauge theories with complex actions. The basic idea behind the CL method is to stochastically sample the relevant fields, i.e., ρ(r) and w(r), not just along the real axis, but
in the entire complex plane of ρ = ρR + iρI and w = wR + iwI. For any observable φ, the expectation (average) value can be expressed as

⟨φ([ρ])⟩ = Z⁻¹ ∫ D[ρR] ∫ D[wR] exp(−Ĥ[ρR, wR]) φ([ρR]).   (11)
In the CL method, the strategy is to instead express such an observable as

⟨φ([ρ])⟩ = ∫ D[ρR] ∫ D[ρI] ∫ D[wR] ∫ D[wI] P[ρR, ρI, wR, wI] × φ([ρR + iρI]),   (12)
where the complex weight Z⁻¹ exp(−Ĥ) has been replaced by a real, positive definite statistical weight P[ρR, ρI, wR, wI]. The statistical weight P[ρR, ρI, wR, wI] is generated as the steady-state distribution of a stochastic “complex Langevin” dynamics, which gives the probability of observing the field configurations w = wR + iwI and ρ = ρR + iρI at time t. The extension of the fields to the complex plane has a practical cost in that the number of configurational degrees of freedom in a simulation is doubled. However, applications of this approach have demonstrated that such an increase in the number of degrees of freedom is more than offset by the efficiency of this approach in generating statistically relevant field configurations.
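For the same toy action as above, a complex Langevin trajectory (a sketch, with assumed step size and run length) lets the variable wander into the complex plane; plain time averages then converge to the exact answer with no phase factor at all:

import numpy as np

rng = np.random.default_rng(2)
b, dt = 0.5, 0.01
z = 0.0 + 0.0j                            # complexified degree of freedom
acc, count = 0.0 + 0.0j, 0
for step in range(500_000):
    drift = -(z + 1j*b)                   # -dH/dz for H = z^2/2 + i*b*z
    z += drift*dt + np.sqrt(2.0*dt)*rng.standard_normal()   # real noise only
    if step >= 50_000:                    # discard the transient
        acc += z*z
        count += 1
print("CL <z^2> =", acc/count, "   exact:", 1.0 - b*b)

The stationary distribution concentrates around Im z = −b, which is exactly the constant-phase shift that the steepest-descent strategy seeks by hand.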
2.3. Saddle Points and Self-Consistent Field Theories
An important feature of the field theory models [11, 28] described in the previous sections is the identification of stationary field configurations that correspond to extrema of the complex effective Hamiltonian Ĥ. Such configuration(s) can correspond to a local minimum, maximum, or (usually) a saddle point in the field configuration space. While the above two simulation strategies focus on generating large numbers of configurations with appropriate weights, the extremal field configurations are in most cases the dominant configurations, and in many cases provide a wealth of information about the thermodynamic properties of the material. In the specific context of polymer models, the solutions corresponding to inhomogeneous saddle points are identical to those obtained using a mean-field theory known as self-consistent field theory (SCFT). Such theories have been applied with great success in the analysis of the excluded volume effect, the characteristics of polymer–polymer interfaces and polymer brushes, the self-assembly features of block copolymer mesophases, polymer blends, thin films of polymers, block copolymer–nanoparticle blends, etc. [29]. New pseudo-spectral algorithms for efficient numerical implementation of SCFT will undoubtedly further extend the impact of the saddle point approach [21, 22].
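For the colloid model of Section 1, the saddle-point (mean-field) conditions reduce to a closed pair of equations: W(r) = β ∫ dr′ v(|r − r′|) ρ(r′) and ρ(r) = n exp[−W(r)]/∫ dr exp[−W(r)], where W = iw is real at the saddle point. The sketch below (assumed 1D periodic geometry and Gaussian repulsion, as in the earlier sketch) solves them by Picard iteration with mixing, which is the same structure, mode for mode, as a full SCFT code:

import numpy as np

Lbox, M, n, beta = 10.0, 256, 50, 1.0
x = np.linspace(0.0, Lbox, M, endpoint=False)
dx = Lbox/M
r = np.abs(x[:, None] - x[None, :])
r = np.minimum(r, Lbox - r)                     # periodic distance
v = np.exp(-r**2/(2*0.5**2))                    # assumed soft Gaussian repulsion

W = 0.01*np.cos(2*np.pi*x/Lbox)                 # weak initial perturbation
for it in range(1000):
    boltz = np.exp(-W)
    rho = n*boltz/(np.sum(boltz)*dx)            # density from the field
    W_new = beta*(v @ rho)*dx                   # field from the density
    if np.max(np.abs(W_new - W)) < 1e-10:
        break
    W = 0.9*W + 0.1*W_new                       # Picard step with mixing
print(f"converged after {it} iterations; density relative variation:",
      (rho.max() - rho.min())/rho.mean())

For this purely repulsive potential the iteration relaxes to the homogeneous solution ρ = n/V; with attractive or multicomponent interactions the same loop produces the inhomogeneous SCFT morphologies discussed above.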
3. Outlook
In the present review we have described the idea behind field-theoretic computer simulation (FTS) tools for analyzing the equilibrium structure and thermodynamics of both simple and complex fluids. The preceding sections illustrated how field theory models can be formulated starting from conventional particle-based models of fluids at the atomistic, mesoscopic, or macroscopic scales. Thus, it is possible to connect the potential parameters used in traditional MD or MC simulations to the parameters in field theory models amenable to study by the methods described here. Further, field theory models are commonly used in analytical studies aimed at extracting “universal” features of the structure and thermodynamics of polymers and complex fluids [10]. FTS methods thus enable numerical studies of field theory models in parameter ranges or situations where approximate analytical tools are inadequate or fail. Finally, experimental studies of complex fluids are often interpreted in the context of parameters (e.g., Flory χ parameters) and predictions derived from field theory models [30, 31]. As a result, it is often more straightforward to connect experimental data to results from an FTS simulation than to numerical data from particle-based MD or MC simulations.
It is still too early in the development of FTS methods to predict their competitiveness against other theoretical and computational techniques. Nevertheless, the following problems appear particularly promising in the context of applying the FTS approach:
• Concentrated polymer systems, especially multiphase blends and copolymer melts, are ideally suited for study by FTS methods, especially when the atomic-scale structure is not of interest or relevant to mesoscopic/macroscopic self-assembly behavior. We list three such examples where FTS has been applied:
– Order–disorder transitions of block copolymers [15, 19]. FTS simulations have been used to explore quantitatively the effect of fluctuations and the polymerization index N of the polymer on shifting the order–disorder transition (ODT) in two-dimensional, symmetric diblock copolymer melts. The results of these studies matched quantitatively with approximate analytical calculations, but also extended the predictions to regimes where such calculations were not applicable. It is important to emphasize that such results would be very difficult to obtain by means of a conventional particle-based simulation of a highly incompressible block copolymer melt.
– Polymeric microemulsion phases [23]. A variant of the steepest descent Monte Carlo method was used to analyze a field theory model of a ternary blend of AB diblock copolymers with A and B homopolymers. These studies found a shift in the line of order–disorder
transitions from their mean-field values, as well as strong signatures of the existence of a polymeric bicontinuous microemulsion phase in the vicinity of the mean-field Lifshitz critical point. These results matched qualitatively with a series of experiments conducted with various three-dimensional realizations of this model system.
– Confined polymer solutions [32]. FTS simulations using the CL technique were used to explore the structure of semi-dilute and concentrated polymer solutions confined to a slit geometry. The crossover behavior between the semi-dilute and concentrated regimes was explicitly accessed and the role of concentration fluctuations on the structure and thermodynamics was examined.
• Systems with soft, long-ranged interactions such as electrolyte solutions, polyelectrolytes, block co-polyelectrolytes, etc. may prove to be easier to study using FTS techniques. A long-ranged Coulomb interaction, v(|r − r′|) ∼ |r − r′|⁻¹, can be transformed into a short-ranged interaction in the field-theoretic framework by the Hubbard–Stratonovich transformation (sketched after this list). Thus, computationally expensive approaches to treat long-range forces, for example Ewald summations, can be avoided in the FTS approach.
• The determination of potentials of mean force between colloidal particles can be conveniently addressed by FTS techniques. The stability of colloidal (and nanoparticle) suspensions with surface charges, grafted polymers, free polymers, counterions, and salts can in principle be investigated by the methods described here.
• The dynamical and rheological properties of complex fluids are currently best addressed with particle-based simulation methods. Recently, two field theory approaches have been advanced that appear promising for describing the dynamics and rheology of complex fluids. While the first approach generalizes the coarse-graining methodology to include new variables characterizing nonequilibrium states, the second approach combines particle-based simulations with field-theoretic ideas to effect dynamical simulations in dense systems. With further development, either or both of these methods could become the appropriate generalization of FTS for addressing nonequilibrium phenomena in complex fluids [20, 33].
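To indicate how the Coulomb remark above works (a sketch, with normalization constants suppressed): since the inverse of the Coulomb kernel v(|r − r′|) = |r − r′|⁻¹ is the local operator −∇²/(4π), the standard Hubbard–Stratonovich identity gives

exp[−(β/2) ∫ dr ∫ dr′ ρ(r) v(|r − r′|) ρ(r′)] ∝ ∫ Dφ exp[−(1/(8πβ)) ∫ dr |∇φ(r)|² + i ∫ dr φ(r)ρ(r)],

so the nonlocal pair interaction is traded for a purely local gradient-squared action for an auxiliary, electrostatic-potential-like field φ(r).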
References

[1] P. Ball, Made to Measure, Princeton University Press, Princeton, 1997.
[2] F. Muller-Plathe, “Combining quantum chemistry and molecular simulation,” Adv. Quantum Chem., 28, 81–87, 1997.
[3] J. Baschnagel, K. Binder, P. Doruker, A.A. Gusev, O. Hahn, K. Kremer, W.L. Mattice, F. Muller-Plathe, M. Murat, W. Paul, S. Santos, U.W. Suter, and V. Tries, “Bridging the gap between atomistic and coarse-grained models of polymers: status and perspectives,” Adv. Pol. Sci., 152, 41–156, 2000.
[4] K. Binder (ed.), Monte Carlo and Molecular Dynamics Simulations in Polymer Science, Oxford University Press, New York, 1995.
[5] K. Binder and W. Paul, “Monte Carlo simulations of polymer dynamics: recent advances,” J. Polym. Sci., Part B – Polym. Phys., 35, 1–31, 1997.
[6] K. Kremer and F. Muller-Plathe, “Multiscale problems in polymer science: simulation approaches,” MRS Bull., 26, 205–210, 2001.
[7] G.S. Grest, M.D. Lacasse, and M. Murat, “Molecular-dynamics simulations of polymer surfaces and interfaces,” MRS Bull., 22, 27–31, 1997.
[8] R.D. Groot and P.B. Warren, “Dissipative particle dynamics: bridging the gap between atomistic and mesoscopic simulation,” J. Chem. Phys., 107, 4423–4435, 1997.
[9] E. Helfand, “Theory of inhomogeneous polymers – fundamentals of Gaussian random-walk model,” J. Chem. Phys., 62, 999, 1975.
[10] M. Doi and S.F. Edwards, The Theory of Polymer Dynamics, Oxford University Press, New York, 1986.
[11] K. Freed, Renormalization Group Theory of Macromolecules, Wiley, New York, 1987.
[12] G.G. Batrouni, G.R. Katz, A.S. Kronfeld, G.P. Lepage, B. Svetitsky, and K.G. Wilson, “Langevin simulations of lattice-field theories,” Phys. Rev. D, 32, 2736, 1985.
[13] A.D. Kennedy, “The Hybrid Monte Carlo algorithm on parallel computers,” Parallel Comput., 25, 1311, 1999.
[14] P. Altevogt, O.A. Evers, J. Fraaije, N.M. Maurits, and B.A.C. van Vlimmeren, “The MesoDyn project: software for mesoscale chemical engineering,” Theochem – J. Mol. Struct., 463, 139–143, 1999.
[15] G.H. Fredrickson, V. Ganesan, and F. Drolet, “Field-theoretic computer simulation methods for polymers and complex fluids,” Macromolecules, 35, 16–39, 2002.
[16] T.J. Chung, Computational Fluid Dynamics, Cambridge University Press, Cambridge, 2002.
[17] J.-P. Hansen and I.R. McDonald, Theory of Simple Liquids, Academic Press, New York, 1986.
[18] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1987.
[19] V. Ganesan and G.H. Fredrickson, “Field-theoretic polymer simulations,” Europhys. Lett., 55, 814–820, 2001.
[20] G.H. Fredrickson, “Dynamics and rheology of inhomogeneous polymeric fluids: a complex Langevin approach,” J. Chem. Phys., 117, 6810–6820, 2002.
[21] G. Tzeremes, K.O. Rasmussen, T. Lookman, and A. Saxena, “Efficient computation of the structural phase behavior of block copolymers,” Phys. Rev. E, 65, 041806, 2002.
[22] S.W. Sides and G.H. Fredrickson, “Parallel algorithm for numerical self-consistent field theory simulations of block copolymer structure,” Polymer, 44, 5859–5866, 2003.
[23] D. Duechs, V. Ganesan, F. Schmid, and G.H. Fredrickson, “Fluctuation effects in ternary AB + A + B polymeric emulsions,” Macromolecules, 36, 9237, 2003.
[24] C.M. Bender and S.A. Orszag, Advanced Mathematical Methods for Scientists and Engineers, McGraw-Hill, New York, 1978.
[25] A.G. Moreira, S.A. Baeurle, and G.H. Fredrickson, “Global stationary phase and the sign problem,” Phys. Rev. Lett., 91, 150201, 2003.
[26] J.R. Klauder, “Coherent-state Langevin equations for canonical quantum systems with applications to the quantized Hall effect,” Phys. Rev. A, 29, 2036, 1984.
[27] G. Parisi, “On complex probabilities,” Phys. Lett. B, 131, 393, 1983.
[28] M.W. Matsen and M. Schick, “Stable and unstable phases of a diblock copolymer melt,” Phys. Rev. Lett., 72, 2660–2663, 1994.
[29] M.W. Matsen, “The standard Gaussian model for block copolymer melts,” J. Phys.: Condens. Matter, 14, R21–R47, 2002.
[30] F.S. Bates and G.H. Fredrickson, “Block copolymer thermodynamics – theory and experiment,” Annu. Rev. Phys. Chem., 41, 525–557, 1990.
[31] F.S. Bates, “Polymer–polymer phase behavior,” Science, 251, 898–905, 1991.
[32] A. Alexander-Katz, A.G. Moreira, and G.H. Fredrickson, “Field-theoretic simulations of confined polymer solutions,” J. Chem. Phys., 118, 9030–9036, 2003.
[33] V. Pryamitsyn and V. Ganesan, “Dynamical mean-field theory for inhomogeneous polymeric systems,” J. Chem. Phys., 118, 4345–4348, 2003.
PLENARY PERSPECTIVES

INTRODUCTION

The Plenary Perspectives is a special feature of the Handbook, a collection of commentaries written in an informal style to engage the materials modeling community, especially the young scientists whom this reference work primarily hopes to serve. Each author was invited to address the community in a manner he feels appropriate – to examine an idea, make an observation, or express an opinion. It is envisioned that despite their brevity and diverse themes, these Perspectives provide a collective complement to the longer articles in the Handbook, giving the reader another view of materials modeling, a sense of the dynamic multidisciplinary endeavor that is now just emerging.
Perspective 1

PROGRESS IN UNIFYING CONDENSED MATTER THEORY

Duane C. Wallace
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Over the years, experimentalists have uncovered an array of fascinating properties of condensed matter. The initial response of theorists has been to treat each new property as a separate problem to be solved, with the result that condensed matter theory presents the appearance of “a different Hamiltonian for every problem.” But there has always been an undercurrent of work aimed at searching out common ground, and of reconstructing disparate theories from a more universal basis. Ultimately, the same wavefunctions and energy levels should explain all the properties of a given material in a given state. Here we sketch development along this line, and note how it is useful in continuing research. In qualitative terms, the physical nature of condensed matter is well understood. In a single isolated atom, the negatively charged electrons are bound by the Coulomb force to the positively charged nucleus. If the Coulomb force were unopposed, the electron cloud would shrink to the size of the nucleus. But the electron wavefunctions have to be mutually orthogonal solutions of the Schrödinger equation, and since electrons have only two orthogonal spin states, their spatial wavefunctions must oscillate to achieve orthogonality in a many-electron system, and this oscillation produces positive kinetic energy. Hence, the balance between negative Coulomb energy and positive kinetic energy determines the binding energy of the atom. The same principle operates when several atoms are brought together to form a molecule, or when many atoms are brought together to form condensed matter. Here the atomic electron clouds can deform so as to move electron density into regions between nearest-neighbor nuclei, thus lowering the negative Coulomb energy of the condensed system, but this process is opposed by the positive kinetic energy that results from the increased localization of the electrons. Hence the equilibrium configuration of the condensed system, and its binding energy relative to
free atoms, is again controlled by the balance between Coulomb attraction and kinetic-energy repulsion of the electrons. From this qualitative picture, a broad part of condensed matter theory works out as follows [1]: we can calculate all the equilibrium properties if we know the energy of the electronic groundstate as a function of the nuclear positions, and specifically for metals, if we also know the density of excited electronic energy levels. To illustrate the breadth and accuracy of current theory, let us consider the material class composed of metallic elements in the crystalline state, where density functional theory provides the required electronic energy information. By calculating the groundstate energy for various crystal structures, we can account for the observed stable crystal structure, its equilibrium lattice parameters, and its binding energy relative to free atoms. By calculating the increase in groundstate energy when nuclei are displaced from equilibrium, we obtain the elastic constants and phonon frequencies. With these energy levels, and statistical mechanics theory, we understand the separate phonon and electron contributions to thermal properties, such as specific heat and thermal expansion. The theory also reproduces observed crystal–crystal phase transitions, both compression-induced transitions and the theoretically subtle temperature-induced transitions. With few exceptions, all these properties can be calculated to an accuracy on the order of 1%, or a few percent, for all the elemental crystals, leaving no doubt that the theory is correct in detail. Extending our view, we see a wide variety of material types and material properties. For insulating crystals, both elemental and polyatomic, density functional theory is still reliable, and from it we can again calculate equilibrium crystal properties to high accuracy. The rare gases are a special case, since their binding is extremely weak and is not accurately given by density functional theory, but once their interatomic forces are constructed, by the appropriate theory, all crystal properties can be calculated to high accuracy. Alloys are polyatomic metals, for which the atomic disorder presents difficulties for electronic structure theory, but these difficulties are currently being overcome. For the liquid state, the nuclear motion potential is given by precisely the same theory as for the crystal, but the problem has always been the complicated shape of this potential surface. The liquid problem is finally being solved, with development of an accurate theory for monatomic liquids, and with help from computer simulations. Similar techniques are being applied to the study of glasses and polymers. Much of modern condensed matter research is devoted to properties more delicate than those mentioned above, such as magnetism, ferroelectricity, and correlated electron effects including superconductivity, with the goal of making qualitative theories more accurate, and this inevitably calls for making theory more fundamental, hence more universal. Finally, we note that nonequilibrium properties can be understood from the same physical nature of condensed matter, but nonequilibrium theory
is more complicated because it requires a higher resolution of the nuclear and electronic motion. Our discussion leads to an important point about how we do research. Research is, after all, an enterprise in problem solving. The common situation calls for us to answer questions about a specific property of a given material, when the current understanding of this property for this material is insufficient for the purpose. The most advantageous way to proceed is to examine theory at its most fundamental and most universal level, to identify the physical processes which contribute to the property in question, in materials where it is well understood. This does not mean one cannot make approximations, since often they are unavoidable, but it is only by knowing the correct physics that one can make correct approximations. Suppose, for example, one needs to examine effects associated with thermal expansion of a polymer. The physical process involved is the same as for an insulating crystal, but the polymer is much more complicated owing to its greatly reduced symmetry. Consider the polymer nuclei, having equilibrium positions and normal modes of vibration about equilibrium. Under a small increase in the system volume, the equilibrium positions change, with a corresponding increase in the system potential energy, and the vibrational modes also change, with a corresponding decrease in their frequencies. With the decrease in frequencies the nuclei have larger vibrational amplitudes, hence they cover more space, or have more entropy, and the competition between higher potential energy and higher entropy determines the increase in volume which will accompany an increase in temperature. The equations expressing this picture are essentially exact, and the potential energy and frequencies involved are defined through the electronic structure. What makes the problem difficult is what also makes it interesting: we must find the variation with temperature of the entire atomic microstructure of the polymer. This microstructural change will then influence a host of other equilibrium and nonequilibrium properties.
Reference

[1] D.C. Wallace, Statistical Physics of Crystals and Liquids, World Scientific, New Jersey, 2002.
Perspective 2

THE FUTURE OF SIMULATIONS IN MATERIALS SCIENCE

D.P. Landau
Center for Simulational Physics, The University of Georgia, Athens, GA 30602
The early part of the 21st century is rapidly developing into the era of man-made “materials by design.” The full realization of this situation requires the requisite understanding of the microscopic origins of diverse phenomena and the subsequent incorporation of this knowledge into the process which leads to the production of new materials. The optimum approach to scientific research now often requires the interplay between theory, experiment, and simulation as shown schematically in Fig. 1 below. Of course, each vertex of this triangular array represents a spectrum of different methods, some of which are more sensitive than others. From the perspective of computer simulations the situation has been brightening rapidly with the passage of time. Nevertheless, even with the dramatic increase in algorithmic sophistication and computer speed that has occurred during the past several decades, the treatment of models with large numbers of atoms is quite difficult. Moreover, because of the need to examine diverse, imperfect systems over wide ranges of temperature and other applied “fields,” it is quite likely that fully classical simulations employing approximate potentials will continue to play an exceedingly important role for a number of years to come. Heretofore, these potentials have been empirical; but because of the limitations of many such potentials, the interplay between quantum studies of smaller systems and the extraction of more fundamentally based and quantitatively accurate effective interactions for classical simulations will become increasingly important. We should thus expect to see electronic structure methods used to provide information that is used to help parameterize interactions that are then used as input to molecular dynamics and Monte Carlo simulations. Certainly, with the increased emphasis on systems at the nanoscale, there will be a class of simulations for which it will be possible to fully examine models containing a number of atoms that approximates that which is present in the physical system under consideration. In other cases, it will be necessary
Figure 1. Schematic representation of the modern approach to scientific research in physics/materials science. (The diagram shows a triangle whose vertices are Simulation, Theory, and Experiment, surrounding Nature.)
to develop multiscale methods that will span length and time scales within a single coherent effort. As an example of the need for information over wide ranges of length and time scales, in Fig. 2 we portray the approximate length and times that are important for the study of fracture along with methods that are suitable to each range [1]. While this picture applies to one specific class of problems in material science, there are certainly many others for which the same degree of complexity exists. One pronounced feature of recent developments in computer simulations in statistical physics has been that new algorithms have succeeded in pushing back the frontier of accessible time scales quite substantially. This allowed advances to be made near phase transitions where very high resolution has been needed. Initially most of these new approaches have been applied to simple systems with discrete variables, but later methods were devised to allow them to be applied to models with continuous degrees of freedom. As an example, we can consider studies of the phase transitions in magnetic systems such as the Ising or XY-models. The combination of new “cluster flipping” simulation algorithms [2, 3] and new histogram reweighting analysis techniques [4] has permitted the extraction of extremely accurate values of critical temperatures (relative errors in the 6th significant digit) and critical exponents [5–7]. This lesson should be remembered when more complex models of real materials are being investigated, and brute force alone should not be relied on.
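As a concrete miniature of the histogram reweighting idea [4] (a sketch with assumed lattice size and couplings, not a reproduction of the cited high-resolution studies): sample the energy of a small 2D Ising model at one coupling β₀ by Metropolis Monte Carlo, then estimate ⟨E⟩ at nearby couplings from that single run by reweighting each sampled energy with exp[−(β − β₀)E].

import numpy as np

rng = np.random.default_rng(3)
L, beta0 = 8, 0.40                       # assumed lattice size and coupling
spins = rng.choice([-1, 1], size=(L, L))

def flip_cost(s, i, j):
    """Energy change for flipping spin (i, j) of the 2D Ising model (J = 1)."""
    nn = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
    return 2*s[i, j]*nn

E = -np.sum(spins*(np.roll(spins, 1, 0) + np.roll(spins, 1, 1)))
energies = []
for sweep in range(5000):
    for _ in range(L*L):
        i, j = rng.integers(L), rng.integers(L)
        dE = flip_cost(spins, i, j)
        if dE <= 0 or rng.random() < np.exp(-beta0*dE):
            spins[i, j] *= -1
            E += dE
    if sweep > 500:                      # discard equilibration sweeps
        energies.append(E)
energies = np.asarray(energies, dtype=float)

for beta in (0.38, 0.40, 0.42):          # reweight the single run
    wts = np.exp(-(beta - beta0)*energies)
    print(f"beta = {beta}: <E> = {np.dot(energies, wts)/np.sum(wts):.2f}")

One run thus yields ⟨E⟩ (and, with the same weights, any other sampled observable) over a window of couplings around β₀, which is what makes the high-resolution studies of critical behavior cited above feasible.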
Figure 2. Schematic view of the multiscale nature of fracture and the approaches that apply in different regions of length and time (from Ref. [1]).
A completely different problem centered on the study of crack propagation in silicon. Here, processes are occurring on quite disparate length and time scales. In a pioneering study Broughton et al. [8] used a tight binding method to examine the immediate region of the crack tip, classical molecular dynamics to simulate the region around the crack (excluding the crack tip), and a finite element method to treat the system at scales that are large compared to atomic dimensions. The “handshaking” between the different regions was an essential part of the calculation to allow the passage of energy between regions in a continuous fashion, and the entire algorithm was coded for a parallel platform. Although the implementation was for an “ideal” problem of brittle cleavage, it was an early indication of the manner in which significant problems in materials science will be treated in the future. Although the algorithms, in particular the “handshaking” techniques, will require further developments, the authors aptly describe the study as the “beginnings of computational atomistic engineering.”
References

[1] D.P. Landau, F.F. Abraham, G.G. Batrouni, J.M. Carlson, J.R. Chelikowsky, D.D. Koelling, S.G. Louie, C. Mailhiot, A.C. Switendick, P.R. Taylor, A.R. Williams, and B.L. Holian, Computational and Theoretical Techniques for Materials Science, NRL Strategic Series, National Academy Press, Washington, DC, 1995.
[2] R.H. Swendsen and J.-S. Wang, "Nonuniversal critical dynamics in Monte Carlo simulations," Phys. Rev. Lett., 58, 86, 1987.
[3] U. Wolff, "Collective Monte Carlo updating for spin systems," Phys. Rev. Lett., 62, 361, 1989.
[4] A.M. Ferrenberg and R.H. Swendsen, "New Monte Carlo technique for studying phase transitions," Phys. Rev. Lett., 61, 2635, 1988.
[5] A.M. Ferrenberg and D.P. Landau, "Critical behaviour of the three-dimensional Ising model: a high-resolution Monte Carlo study," Phys. Rev. B, 44, 5081, 1991.
[6] H.W.J. Blöte, E. Luijten, and J.R. Heringa, "Ising universality in three dimensions," J. Phys. A, 28, 6289, 1995.
[7] K. Chen, A.M. Ferrenberg, and D.P. Landau, "Static behavior of three-dimensional classical Heisenberg models: a high-resolution Monte Carlo study," Phys. Rev. B, 48, 239, 1993.
[8] J.Q. Broughton, F.F. Abraham, N. Bernstein, and E. Kaxiras, "Concurrent coupling of length scales," Phys. Rev. B, 60, 2391, 1999.
Perspective 3
MATERIALS BY DESIGN
Gregory B. Olson
Department of Materials Science and Engineering, Northwestern University, Evanston, IL
As a new millennium unfolds, a Science Age of three centuries draws to a close, replaced by a Technology Age based not in scientific discovery but in a revolution in engineering design led by U.S. industry. The resulting New Economy, which we now strive to sustain, is based in technology not found in a laboratory but deliberately created from the human mind in response to perceived needs. While we tend to be nostalgic about exploration ages, at this point in history humankind not only enjoys an unprecedented ability to create wealth from thought, but holds all the tools for a much-needed transformation from mere technology to responsible technology. At the strategic level, this revolution is founded in new systems-based design methodologies that accelerate the total product development cycle while achieving new levels of product reliability. At the tactical level, it integrates a new understanding of human team creativity with new opportunities in information technology to create powerful computational tools tailored to strategic needs. Meanwhile, the "modern" research universities we inherited from the Cold War continue to train explorers for a bygone era. Dominated by a culture of reductionist analysis, university engineering has for the past half century been replaced by engineering science, leaving industry on its own to advance engineering practice, and leaving the teaching of modern engineering to business schools and corporate universities. While this cultural incongruity has affected all fields of engineering, no field has suffered more damage than the materials profession. Left on its own, the industrial practice of materials development has languished in a slow and costly empirical discovery process that cannot keep pace with the compressed product development cycle, leaving no chance for participation in concurrent engineering. Despite a widespread, deeply held goal of scientific engineering from within the community, the academic materials enterprise has been diverted under external forces through
funding policies toward reductionism and the pursuit of novelty, leading to highly dissipative random-walk exploration that yields much paper but no materials. Against this background, the multi-institutional Steel Research Group (SRG) was founded in 1985 to build, within a modern systems engineering framework, the methods, tools, and databases to support the rapid computational design of materials, using high-performance steels as a first example. Treating materials as dynamic multilevel-structured systems, the SRG has integrated process/structure/property/performance relations into a hierarchy of design models. The methods, tools, and models, and their successful application in the thermodynamics-based parametric design of new alloys, are described in detail elsewhere [1–3]. Transfer of this technology to the commercial sector has been led by the university spinoff company QuesTek Innovations, commercializing both the design technology and the first cyber-materials emerging from it, and a range of small businesses now furnish software tools and supporting databases, as surveyed in a recent National Academy study [4]. The credibility of computational materials design established by the SRG/QuesTek success has helped bring about major initiatives supported by the Defense Advanced Research Projects Agency (DARPA) and other DoD agencies that are facilitating a much-needed transformation of the materials profession. Of particular note is the recently completed three-year DARPA-AIM initiative in Accelerated Insertion of Materials, which expanded the scope of computational materials design to accelerate the full development and qualification cycle. With both structural composites and aeroturbine disc superalloys as motivating use cases, a combination of integrated high-fidelity simulation and focused strategic testing demonstrated both accelerated process optimization at the component level and efficient prediction of part-to-part property variation in manufacturing, for accurate forecasting of minimum design allowables. A particularly historic achievement was the demonstration of enhanced performance in a subscale turbine disc, for which the disc designer was able to continuously vary materials processing within the component design. The industry-led AIM initiative also set a new standard for extracting useful work from academic research. This experience has in turn had a significant positive impact on DoD university-centered programs, notably the ongoing ONR Grand Challenge in Naval Materials by Design and the AFOSR-MEANS (Materials Engineering for Affordable New Systems) initiatives. Addressing the early stages of computational materials design, these initiatives have notably integrated emerging quantum engineering tools for both the accelerated assessment of fundamental databases and the predictive control of crucial interfacial phenomena. This significant investment in modern materials engineering has enabled a new learning environment on our campuses that has gone beyond traditional graduate education to invigorate undergraduate design, dominating the TMS national undergraduate design competition and allowing upper-level materials students
in multidisciplinary projects to participate in a new form of concurrent materials engineering that did not exist three years ago. The new AIM paradigm of industry-led university engineering projects supported by mission-driven agencies promises a rationalized research infrastructure meeting the needs of our times, where science toys can be intelligently fashioned into purposeful tools of engineering optimization, enabling materials modeling and simulation to reach their full potential for deliberate value creation. If it indeed ushers in a new era in which the physical sciences support engineering as well as the life sciences have supported medicine, we can expect to be as successful, and as valued by society, as we bring materials into this new Design Age.
References

[1] G.B. Olson, "Science of steel," In: G.B. Olson, M. Azrin, and E.S. Wright (eds.), Innovations in Ultrahigh-Strength Steel Technology, 34th Sagamore Army Materials Research Conference Proceedings, 3–66, 1990.
[2] G.B. Olson, "Computational design of hierarchically structured materials," Science, 277, 1237–1242, 1997.
[3] G.B. Olson, "Designing a new material world," Science, 288, 993–998, 2000.
[4] National Research Council, Accelerating Technology Transition, The National Academies Press, Washington, DC, 2004.
Perspective 4
MODELING AT THE SPEED OF LIGHT
J.D. Joannopoulos
Francis Wright Davis Professor of Physics, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
For over half a century, semiconductor physics has played a vital role in almost every aspect of modern technology. Advances in this field have allowed scientists to tailor the conducting properties of materials and have led to the transistor revolution in electronics. New research suggests that we may now be able to initiate a similar revolution by tailoring the properties of light. The key to achieving this goal lies in the use of a new class of materials called photonic crystals [1–3]. The basic idea is to design materials that can affect the properties of photon modes (or just photons, for brevity) in much the same way that ordinary semiconductor crystals affect the properties of electrons. That this is feasible becomes clear if one considers that Maxwell's equations for linear materials in the frequency domain, in the absence of external currents and sources, can be cast in a form reminiscent of the Schrödinger equation, namely
$$
\nabla \times \left( \frac{1}{\varepsilon(\mathbf{r})}\, \nabla \times \mathbf{H}(\mathbf{r}) \right) = \frac{\omega^2}{c^2}\, \mathbf{H}(\mathbf{r}), \qquad (1)
$$

where H(r) is the magnetic field, subject to the transversality constraint ∇ · H(r) = 0. Equation (1) represents a linear Hermitian eigenvalue problem whose solutions are determined entirely by the properties of the macroscopic dielectric function ε(r). Therefore, if one were to construct a crystal consisting of a periodic array of macroscopic metallic or uniform-dielectric "atoms", the photons in this crystal could be described in terms of a bandstructure, as in the case of electrons. Of particular interest is a photonic crystal whose bandstructure possesses a complete photonic band gap (PBG). A PBG defines a range of frequencies for which light is forbidden to exist inside the crystal, regardless of its direction of propagation. Forbidden, that is, unless there is a defect in the otherwise perfect crystal. A defect could lead to one or more localized photon states in
the gap, whose shapes and properties would be dictated by the nature of the defect. A point defect could act like a microcavity to confine light, and a line defect could act as a linear waveguide to guide light. Thus, deliberately designed structural defects are good things in photonic crystals, providing a new mechanism for molding and controlling the properties of light. Therein lies the exciting potential of photonic crystals. Moreover, given a new mechanism, one might expect to enable photon phenomena that have never been possible before. Recent examples include theoretical predictions of negative refraction [4], novel Cerenkov radiation [5], reversed Doppler shifts [6], and anomalous dispersion relations [7]. A very significant and attractive difference between photonic crystals and electronic semiconductor crystals is the former's inherent ability to provide complete tunability. A defect in a photonic crystal could, in principle, be designed to be of any size, shape, or form, and could be chosen to have any of a wide variety of dielectric constants. Thus, defect states in the gap can be tuned to any frequency and spatial extent of design interest. In addition to tuning the frequency, one also has some control over the symmetry of the localized photon state. For example, the very specific symmetry associated with each photon mode in a photonic crystal microcavity [3] translates into an effective orbital angular momentum for each photon, which can exist in addition to the intrinsic spin angular momentum [8]. This is a very intriguing notion that can have spectacular consequences for the selection rules of electronic transition rates in quantum dot or quantum well structures. In addition to changing the symmetry of a photon state, one can also change the density of states in order to affect spontaneous emission. The rate of spontaneous emission from a given initial state is proportional to the density of final photon states available at the transition frequency. Hence, operating within the gap of a finite photonic crystal, which eliminates nearly all photon modes at the transition frequency, greatly reduces the emission rate. Conversely, operating at a point-defect resonance can dramatically enhance the emission rate, owing to the large increase in the density of final states. In this way one could improve the efficiency of current lasers, or even design lasers that operate in frequency ranges not yet achieved. The similarity of Maxwell's equation (1) to the Schrödinger equation implies that computational techniques used to study electrons in solids, such as the conjugate gradients approach, may also be used to study photon modes in photonic crystals [9–11]. The main differences are that electrons are described by a complex scalar field and strongly interact with each other, whereas photons are described by a real vector field and do not interact with each other. Solution of the photon equations is thus only a single-particle problem and leads, for all intents and purposes, to an "exact" description of their properties.
To solve Maxwell's equations for periodic dielectric media in the frequency domain, one typically begins by expanding the fields in plane waves. As in the case of electrons, the use of a plane-wave basis set has a number of extremely desirable consequences. First, the transversality constraint is easily satisfied. Second, the set is complete and orthonormal. Third, finite sets can be systematically improved in a straightforward manner. Finally, a priori knowledge of the field distribution is not required to generate the set. The chief difficulty in using plane waves, however, is that huge numbers of them are typically required to describe the sudden changes in dielectric constant inherent in a photonic crystal structure. This problem can be overcome by a better treatment of the boundaries between the dielectric media. In particular, constructing a dielectric tensor [10] to interpolate in the boundary regions allows the proper screening of photons with different polarizations and improves the convergence of all eigenmodes by over an order of magnitude compared to a scalar dielectric. A variational functional can then be constructed from Eq. (1) whose iterative minimization leads to the required stationary solutions [3]. To solve Maxwell's equations for periodic dielectric media in the time domain, one typically employs Yee-lattice finite-difference time-domain (FDTD) methods [12, 13], which can include periodic as well as absorbing boundary conditions. Such computations are extremely useful for making direct comparisons with experiments, by modeling the experimental setup and measurement process. Moreover, complex frequency-dependent dielectric functions, as well as non-linear response, can be implemented straightforwardly in this approach. Future computational work will involve treating photons together with electrons and phonons within a single framework. This capability will be critical for future numerical studies of photonic crystals where material gain, acousto-optic response, or thermal response are of interest.
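As an illustration of the frequency-domain approach, the sketch below solves the plane-wave version of Eq. (1) in the simplest possible setting: a one-dimensional (layered) photonic crystal at normal incidence, where the Fourier coefficients of 1/ε(x) are known analytically and no smoothed dielectric tensor is needed. The dielectric contrast, fill fraction, and basis size are assumed, illustrative values.

```python
import numpy as np

eps1, eps2, fill = 13.0, 1.0, 0.5    # layer dielectrics and fill fraction of eps1
a = 1.0                              # lattice constant
nG = 32                              # plane waves with G = 2*pi*m/a, m = -nG..nG
G = 2.0 * np.pi / a * np.arange(-nG, nG + 1)

def inv_eps_coeff(m):
    """Fourier coefficient of 1/eps(x) for a step profile (layer centered at 0)."""
    if m == 0:
        return fill / eps1 + (1.0 - fill) / eps2
    return (1.0 / eps1 - 1.0 / eps2) * np.sin(np.pi * m * fill) / (np.pi * m)

def bands(k_bloch, nbands=4):
    """Lowest photonic bands at Bloch wavevector k; eigenvalues are (omega/c)^2."""
    kG = k_bloch + G
    n = len(G)
    M = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            M[i, j] = inv_eps_coeff(i - j) * kG[i] * kG[j]
    w2 = np.linalg.eigvalsh(M)       # real symmetric: the Hermitian problem of Eq. (1)
    return np.sqrt(np.abs(w2[:nbands]))

for kb in np.linspace(1e-3, np.pi / a, 5):
    freqs = bands(kb) * a / (2.0 * np.pi)    # dimensionless omega*a/(2*pi*c)
    print(f"k = {kb:5.3f}:", "  ".join(f"{w:.3f}" for w in freqs))
# For this contrast a gap opens between the first and second bands at the
# zone edge k = pi/a: the 1D analog of a photonic band gap.
```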
experimentally, with the ultimate goal of identifying novel and useful photon functionalities.
References

[1] E. Yablonovitch, Phys. Rev. Lett., 58, 2059, 1987.
[2] S. John, Phys. Rev. Lett., 58, 2486, 1987.
[3] J.D. Joannopoulos, R.D. Meade, and J.N. Winn, Photonic Crystals, Princeton University Press, Princeton, NJ, 1995.
[4] C. Luo, S. Johnson, and J.D. Joannopoulos, Appl. Phys. Lett., 81, 2352, 2002.
[5] C. Luo, M. Ibanescu, S. Johnson, and J.D. Joannopoulos, Science, 299, 368, 2003.
[6] E. Reed, M. Soljacic, and J.D. Joannopoulos, Phys. Rev. Lett., 91, 133901, 2003.
[7] M. Ibanescu, S.G. Johnson, D. Roundy, C. Luo, Y. Fink, and J.D. Joannopoulos, Phys. Rev. Lett., 92, 063903, 2004.
[8] J.D. Joannopoulos, P.R. Villeneuve, and S. Fan, Nature, 386, 143, 1997.
[9] K. Ho, C. Chan, and C. Soukoulis, Phys. Rev. Lett., 65, 3152, 1990.
[10] R. Meade, K. Brommer, A. Rappe, and J. Joannopoulos, Phys. Rev. B, 44, 13772, 1991; Erratum: Phys. Rev. B, 55, 15942, 1997.
[11] S. Johnson and J.D. Joannopoulos, Opt. Express, 8, 173, 2001.
[12] K.S. Yee, IEEE Trans. Antennas Propag., 14, 302, 1966.
[13] J.P. Berenger, J. Comput. Phys., 114, 185, 1994.
Perspective 5
MODELING SOFT MATTER
Kurt Kremer
MPI for Polymer Research, 55021 Mainz, Germany
Soft matter science, or soft materials science, is a relatively new term for the science of a huge class of rather different materials, such as colloids, polymers (of synthetic or biological origin), membranes, complex molecular assemblies, complex fluids, etc., and combinations thereof. Many of these systems are contained in, or are even the essential part of, everyday products: "simple" plastics such as yoghurt cups, plastic bags, CDs, and many car parts; gels and networks such as rubber, many low-fat foods, and "gummi" bears; colloidal systems such as milk, mayonnaise, paints, and almost all cosmetics or body-care products (the border lines between the different applications and systems are of course not sharp). Others, as biological molecules or assemblies (DNA, proteins, membranes and the cytoskeleton, etc.), are central to our existence, while still others are basic ingredients of current and future high-tech products (polymers with specific optical or electronic properties, conducting macromolecules, functional materials). Though the motivation for biomolecular simulation in the life sciences differs from that in materials science, the basic structure of the problems faced in the two fields is very similar. Combinations of the above-mentioned materials are often employed both in actual research and in technology. Though these systems are rather different from the outset, and thus call for rather different modeling tools, there is one unifying aspect which makes it very reasonable to treat them from a common point of view: compared to "hard matter," the characteristic energy density is much smaller. While the typical energy of a chemical bond (a C–C bond) is about 3 × 10^-19 J ≈ 80 kBT, the non-bonded interactions are of the order of kBT and allow for strong fluctuations, even though the molecular connectivity is never affected. Here kB is Boltzmann's constant and T the temperature. Two typical examples may illustrate this in comparison to a prototypical conventional crystal. To give a very rough and simple estimate, one can compare a typical crystal to soft matter.
Taking 10 nearest neighbors in a crystal (8 for bcc, 12 for fcc/hcp) and an interaction energy of 1–2 eV per bond (some 40–80 kBT), one reaches a typical energy density for a crystal made up of atoms of about E/V ≈ 5–10 kBT/Å^3. Comparing this to a typical polymer melt, where the strand–strand distance is about 3–6 Å and a chain typically has of the order of six neighbors, we arrive at (depending on the persistence length) not more than 0.1–0.3 kBT/Å^3. Thus, since to a first approximation the energy density corresponds to the elastic constants, polymeric systems are at least 10–100 times softer than classical crystals. For typical colloidal crystals the situation is even more dramatic: because of the typical size of the unit cells (often around 10^3 Å), the factor can reach 10^9 compared to conventional crystals. This is the reason why many colloidal systems can be "shear melted" just by turning the container by hand.

As a consequence, the thermal energy kBT is no longer a "small energy" for these systems; rather, it defines the essential energy scale. This means that entropy, which typically is of the order of kBT per degree of freedom, plays a crucial role. In the case of macromolecules this mainly means intramolecular entropy, which for a linear polymer of length N is of order kBT × O(N). As an immediate consequence it is clear that typical quantum chemical approaches cannot be sufficient to characterize such a material, and are even less sufficient to properly predict or interpret macroscopic properties.

How these different contributions influence each other can most easily be seen in Fig. 1. Any study of electronically excited states, reactions, etc. in principle requires quantum mechanical calculations; specific interactions may also require this approach. On the next level, all-atom or united-atom force-field simulations are typically performed. It should be kept in mind, however, that the many hydrogen atoms in typical soft matter systems can cause severe problems for solving classical equations of motion, and that there is no universal force field. In both regimes the energies of local bond lengths, bond angles, and torsions dominate the properties; non-bonded interactions pose special difficulties, as will be discussed below. If the view is coarsened further, the detailed atomistic picture is replaced, in the case of polymers, by a more or less flexible path in space. On that level the many possible conformations, i.e., the resulting intra-chain entropy, govern the overall conformation of the chains. The rather delicate interplay of entropic and energetic contributions gives rise to the huge variety of properties one encounters in soft matter, and any modeling attempt has to keep this in mind. How this affects the properties can be seen in two rather simple cases. Needless to say, in most experimental situations the picture is more complicated.
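A minimal numerical version of this crystal-versus-melt estimate is given below. The coordination numbers follow the text; the bond energy, contact energy, and the volumes per atom or per contact are assumed, order-of-magnitude inputs rather than values for any specific material.

```python
# Order-of-magnitude energy densities (roughly the elastic-modulus scale),
# in units of kB*T per cubic Angstrom at room temperature (1 eV ~ 40 kB*T).
z_xtal, e_bond = 10, 40.0        # ~10 neighbors, ~1 eV ~ 40 kB*T per bond
v_atom = 20.0                    # assumed volume per atom, in A^3
e_dens_xtal = 0.5 * z_xtal * e_bond / v_atom    # 0.5: each bond is shared

z_melt, e_contact = 6, 1.0       # ~6 neighboring strands, contacts ~ kB*T
v_contact = 15.0                 # assumed volume per strand contact, in A^3
e_dens_melt = 0.5 * z_melt * e_contact / v_contact

print(f"crystal ~ {e_dens_xtal:.1f} kT/A^3, melt ~ {e_dens_melt:.2f} kT/A^3, "
      f"ratio ~ {e_dens_xtal / e_dens_melt:.0f}")
# -> crystal ~ 10.0, melt ~ 0.20, ratio ~ 50: consistent with the 10-100x
#    softness difference quoted in the text.
```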
Figure 1. Sketch of the different time and length scales in a typical soft matter system such as polymers.
The first example is the miscibility of different polymers in a melt; Fig. 2 illustrates this in a cartoon-like manner. Consider a mixture of two different polymers of type A and B. In a homogeneous mixture, the average number of AA and BB as well as AB nearest-neighbor contacts is of order O(N), as is the intra-chain entropy, and to a first approximation the intra-chain entropy does not change in a pure A or a pure B melt. Thus, just as in a mixture of small molecules, an effective interaction energy of order E ≈ O(kBT) per molecule, which in our case is a whole polymer, is sufficient to drive phase separation. Because of the macromolecular structure of the molecules, for polymers this means E ∼ N [ε_AB − 0.5(ε_AA + ε_BB)], where the ε denote the specific interactions, in a lattice model for example. Compared to small molecules, the pairwise nearest-neighbor energy difference needed to drive phase separation is therefore proportional to N^-1, which eventually vanishes for long chains. This is generic and independent of the specific system, while the actual phase separation temperature (the "prefactor") for a given N depends on chemical details. This generic behavior was impressively demonstrated by computer simulations on lattice models and by experiments studying the phase separation of protonated and partially deuterated polystyrene [1, 2]. The prediction of the actual value of the critical interaction certainly is, and at least in the limiting case of large N will remain, a challenge.
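The N^-1 scaling can be made quantitative with the standard Flory–Huggins mean-field estimate, which for a symmetric A/B blend of chains of length N gives a critical interaction parameter χ_c = 2/N. The lattice coordination number used below is an assumed, illustrative value.

```python
# Flory-Huggins estimate for a symmetric A/B polymer blend: demixing occurs
# for chi > chi_c = 2/N, where chi ~ z * (eps_AB - 0.5*(eps_AA + eps_BB)) / (kB*T).
z = 6                                   # assumed lattice coordination number
for N in (1, 10, 100, 1000, 10000):
    chi_c = 2.0 / N                     # critical Flory-Huggins parameter
    d_eps = chi_c / z                   # per-contact energy difference, in kB*T
    print(f"N = {N:>5d}:  chi_c = {chi_c:.1e},  delta_eps ~ {d_eps:.1e} kB*T")
# A per-contact mismatch of only ~3e-5 kB*T already demixes chains of
# N = 10^4, which is why chemically different long polymers rarely mix.
```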
Figure 2. Cartoon of the origin of the very poor miscibility of different polymers.
In a similar way, dynamical properties are governed by a combination of generic aspects, originating from the connectivity and non-crossability of the chains, and local interaction parameters, which determine the bead friction and the local packing. For polymer chains exceeding the entanglement molecular weight, the reptation model, which roughly assumes a motion of the polymer beads along the coarse-grained contour of the chains ("reptating chains"), holds. Within this concept one can write for the melt viscosity η = A N^3.4, N again being the chain length and A a prefactor that absorbs everything that is difficult to determine and that is (to first order) independent of N. The viscosity can thus be modified by orders of magnitude in two ways: by varying N or by varying A. For example, changing N by a factor of two changes η by a factor of about 2^3.4 ≈ 10. In a similar fashion one can vary A by slightly changing the chemical structure, or simply by changing the temperature: for BPA-polycarbonate, the classical CD material, a shift in the processing temperature from 500 K down to 470 K also increases η by roughly a factor of 10 (the glass transition temperature of BPA-PC is around TG ≈ 420 K). Again one encounters two equally important ways to manipulate the system properties, one based on universal aspects and the other on local details.

These were two almost trivial examples illustrating the interplay of contributions originating from different length scales, and of course the above discussion is oversimplified. Actual theoretical, experimental, or technical problems often do not allow for such a well-defined separation of scales. On the other hand, in many cases investigations on one level of description have been, and still are, of high relevance and are pursued at a high level; they are thus a very active research topic in their own right. In the following I would like to briefly discuss a few examples illustrating the different directions.
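The two "knobs" on the viscosity are easy to quantify; the lines below simply evaluate the reptation scaling quoted above (the prefactor is arbitrary, since only ratios matter).

```python
# Melt viscosity in the reptation regime: eta = A * N**3.4.
def eta(N, A=1.0):
    return A * N**3.4

print(eta(200) / eta(100))   # doubling N: 2**3.4 ~ 10.6-fold increase
# A comparable 10x change can instead come from the prefactor A(T) alone,
# e.g., the 500 K -> 470 K processing shift quoted for BPA-PC.
```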
Attempts to understand soft matter systems theoretically follow a long tradition. Beginning with the early work of Flory, simplified models were studied that were able to explain many generic/universal properties but failed to provide a solid basis for a theoretical understanding. It was then up to the seminal works of Edwards and de Gennes to provide a link between the statistical mechanics of phase transitions (critical phenomena) and polymer chain conformations. This link to modern concepts of theoretical physics not only gave the field huge momentum but also marked a starting point for statistical mechanics computer simulations applied to soft matter problems on a larger scale. Computer simulations play a crucial role already for the simplest problems: the problem of an isolated self-avoiding walk cannot be solved exactly in three dimensions, and to this day the best data result from very extensive computer simulations. In a similar fashion, basic features such as the non-crossability of the chains are hard to deal with analytically and can only be included properly within a simulation approach. In this context, highly optimized and highly simplified models were and still are employed very extensively, and have contributed significantly to our present knowledge. Other typical examples are phase separation studies of polymer mixtures [1], wetting phenomena in polymers, generic aspects of the glass transition of polymeric systems, dynamics of long- and short-chain melts [3, 4], hydrodynamic interactions in polymer solutions [5], and, increasingly often, many-component complex fluids. These are just a few examples out of the huge literature that has appeared over the last 30 years. Increasing computer power, but especially ever-improved and optimized models and algorithms, allowed for this success and remain central to many research projects. For melts, typical current examples are the dynamics of polymeric melts under shear, or mixtures of linear and branched polymers [3]. Studying such systems even at the level of a simple bead-spring model (the chains represented by strings of beads and springs) is a challenging problem for modern supercomputers and is simultaneously of the highest technical relevance. Figure 3 shows an example of the determination of the backbone of the reptation tube in a melt of linear polymers. Other fields concern the coupling of the polymeric degrees of freedom to the hydrodynamic interaction in solution [5], or related questions for colloidal systems; in both cases, very recently developed algorithms allow a first glance at such problems. From a statistical mechanics point of view, the investigation of multi-component systems is just beginning. A typical example is the mixture of polystyrene and carbon dioxide, which turns out to give rise to a very complicated phase diagram [6]. This is a special polymer-solvent system, in which the solvent is not, as is usual in such studies, an inert species providing a background for a polymeric solute, but rather undergoes a liquid–gas transition itself. The most prominent example, the above-mentioned CO2 with polystyrene, is used to produce Styrofoam.
Figure 3. Original polymer melt (left) and the backbone of the reptation tube of the red chain together with the chains causing the confinement; all other chains are shown only as very thin lines [4]. From this the melt plateau modulus can be determined.
For systems of this kind, many qualitative aspects of the phase diagrams are still unknown, and they have to be established before going into too many specific details. On a more detailed level, namely force-field simulations employing models for polymers, membranes, or even proteins, all atoms are treated explicitly and Newton's equations of motion are solved numerically. The problems in determining such a force field have already been mentioned. But even then, such calculations can only provide insight on a very short time scale or for rather small systems; equilibrating such systems and deducing quantitative results for a macroscopic quantity is still a severe problem [7, 8] (see also the contribution by D.N. Theodorou). On the other hand, it is even questionable whether runs on very long time scales (the time scale on which a long polymer chain moves its own diameter in space) would be very useful at this level of detail, because the enormous amount of data would have to be analyzed and structured. It is therefore useful to study such problems at a lower level of detail. Another field which nowadays is still mostly confined to rather coarse-grained and simple models is polyelectrolytes in solution, owing to the long-range nature of the interaction combined with typically slow relaxation [9].

While these individual studies provide important insight into either atomistic details or generic aspects, such a separation is quite often not possible, or a link between different scales within a hierarchical simulation scheme is needed. This is the central part of what is nowadays called "multi-scale" or "scale-bridging" modeling, and for many future applications, especially for a close link to experiment, it is absolutely crucial to establish this bridge. To illustrate this I would again like to mention two examples. If one wants to study the statics and dynamics of a long-chain polymer melt, a highly simplified model is needed in order to reach the necessary time scales for the diffusion of the chains or for stress relaxation (cf. Fig. 3); even nowadays this requires highly optimized programs and top-of-the-line supercomputers.
However, to treat a specific system, enough details of the atomistic structure have to be carried along. One way would be to run two separate studies: one all-atom simulation for very short times, to determine the time scaling from small-length-scale mean-square displacements, and one simulation of a more coarse-grained model. This implies linking the two sets of data, which in that case is quite difficult, as both a time and a length scale mapping are needed. This traditional way still dominates the literature but also poses some conceptual problems. An alternative is to start from an all-atom description and derive from it directly a coarse-grained model that is efficient enough to study long-time dynamics. Since this fixes the length scaling, the time mapping obtained by comparing runs at the micro and meso levels becomes unique, and truly quantitative time-dependent "measurement" becomes possible. This approach has another advantage if one is interested in the atomistic structure of melts: by employing an inverse mapping back to the atomistic model, one can use the simulation at the coarse-grained level to efficiently equilibrate all-atom models of huge size. This strategy of hierarchical or multiscale modeling is illustrated in Fig. 4.
Figure 4. Cartoon-like illustration of the hierarchical modeling scheme.
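One concrete route from the all-atom description to the coarse-grained model described above is Boltzmann inversion of target distributions measured in the atomistic run. The sketch below inverts a coarse-grained bond-length distribution into a bond potential; the "atomistic" samples are synthetic stand-ins for real data, and in practice one iterates (iterative Boltzmann inversion) until the coarse-grained simulation reproduces the target distribution.

```python
import numpy as np

# Boltzmann inversion of a coarse-grained bond-length distribution P(r):
#   U(r) = -kB*T * ln[ P(r) / r**2 ]   (the r**2 removes the radial Jacobian).
kT = 1.0
rng = np.random.default_rng(1)
r_samples = rng.normal(loc=0.5, scale=0.05, size=100_000)  # assumed bond lengths

hist, edges = np.histogram(r_samples, bins=60, density=True)
r = 0.5 * (edges[:-1] + edges[1:])
ok = hist > 0
U = -kT * np.log(hist[ok] / r[ok] ** 2)
U -= U.min()                     # a potential is defined only up to a constant

for ri, Ui in list(zip(r[ok], U))[::10]:
    print(f"r = {ri:.3f}   U = {Ui:.2f} kB*T")
# For a Gaussian target this recovers a nearly harmonic spring, i.e., the bond
# term of a bead-spring model parameterized directly from "atomistic" data.
```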
However, in some cases this sequential ansatz of working on different levels is not sufficient. Consider the selective adsorption of parts of a macromolecule at a surface. Due to the constraints arising from the connectivity within the molecule and the structure of the surface, the molecule can only bind to the surface in a rather specific configuration, and this requires a significant amount of detail. On the other hand, the whole rest of the molecule, located in the polymer matrix where it is not in contact with the surface, has to be equilibrated as well. For the latter, more coarse-grained models are sufficient, and are also needed because of the CPU-time requirements. Such a problem (e.g., polycarbonate at a nickel surface) led us to develop a dual-scale simulation in which the level of description varies along the backbone [10, 11]. In that case the bond angles close to the chain ends had to be considered with atomistic resolution within a simple bead-spring model. The outcome of such a study, as well as the effect of different chain ends on the morphology of short-chain melts, is illustrated in Fig. 5. Typical for many problems is that regions in space or periods of time where higher resolution is needed can be identified; often these regions change significantly with time. For not-too-large systems it may be advantageous to switch completely between two levels of representation (e.g., fully atomistic vs. fully coarse-grained) depending on some intrinsic system parameters. For this reason, the above-mentioned dual-resolution approach (where two levels of description are applied simultaneously to different parts of a given system) and uniform-resolution approaches (where either one level or the other is used to describe the entire system, given a suitable mapping/back-mapping procedure) are being developed.
Figure 5. Illustration of the multiscale modeling approach for specific surface interactions of a polymer with a surface [10, 11]. Depending on the specific interactions, in this case of the (non-)phenolic chain ends with a Ni surface, the melt morphology of short chains close to the surface can differ significantly.
Alternatively, when there is no exchange of particles/atoms, regions of space with different resolutions are studied. An example of this is the so-called "QM/MM" (quantum/classical) hybrid approach, which is successfully applied in biochemistry and catalysis research: part of the system is treated by electronic structure methods while being coupled to a force field that represents the environment (e.g., the rest of a protein) at the level of classical mechanics [12]. At the classical atomistic (microscopic) level, the starting point is usually a parameterized force field, mostly based on experimental input. Different classical multi-level and coarse-graining techniques are currently being developed within the realm of materials science/statistical physics. Still an open issue is how to connect the quantum level via the microscopic up to the mesoscopic description quite independently of the specific problem at hand. Complicated as this already is, for many important questions it is not sufficient. An even deeper challenge is the development of adaptive schemes, which allow the description to be adjusted locally in space and/or time as required, thus coupling different levels of description in a dynamical manner. Many important questions in materials science, soft matter, and biophysical chemistry require an adaptive multi-scale approach for a deep overall understanding. Typical problems that would benefit significantly from such a development include: how does the atomistic structure of a functionalized polymer grafted onto a surface affect the materials properties of a composite system; how does the topological structure of a (block co-)polymer melt affect the overall rheological properties; how does the atomistic structure of photoactive molecules affect the optical properties of a cross-linked film of such chromophores; and how do concentration fluctuations and chain conformations of liquids with polymer additives influence the turbulent drag reduction by such additives? In particular, local electronic properties are often crucially and dynamically linked to global conformational properties. These are just a few typical challenges for the modeling of soft matter. The above examples illustrate how phenomena on different length and time scales are linked to each other. In this context, modern experimental work poses an additional challenge as well as an opportunity. The characteristic size of experimental systems is constantly shrinking; often structures and assemblies on the nanometer or micrometer scale are studied. Simultaneously, simulations become more and more efficient due to both hardware and software development. Eventually the two will meet. In this regime it is essential that simulations be able to tackle whole systems, not small parts in which one tries to eliminate surface effects by periodic boundary conditions. Such systems can be characterized in many ways, but one important characteristic is that the distinction between bulk and surface contributions to the free energy no longer makes sense; they are both (if separable) of the same order. Since such systems will still contain many thousands of atoms
and can be of high structural complexity (e.g., ion-conducting columns of dendritic molecules, decorated micelles, etc.), a hierarchical modeling scheme as discussed above will be essential. If available and versatile enough, such a development could open a path to a new and improved interaction with experiments in nanotechnology, physical chemistry, and biophysics. However, there is still a long way to go. As this book shows, many of the questions/problems discussed are subjects of very active research in many laboratories throughout the world. In the soft matter field we still face many challenges, which will keep researchers busy for many years. Here I name just a few, which certainly reflect my own experience, but also my own taste in important problems:
– Thorough studies of generic phase diagrams of many-component soft matter systems (solvent–polymer–colloid systems, etc.)
– Dynamics (equilibrium AND non-equilibrium) of complex branched systems and of mixtures of linear and branched polymers in melt and solution
– Multi-scale simulations of ordered structures such as hierarchical assemblies or proteins
– Adaptive simulation schemes for a wide class of systems, including quantum-level simulations
– Simulations of large systems of charged macromolecules, statics and dynamics, with explicit ions, taking dielectric contrast explicitly into account
These are just a few examples, and the list could easily be extended significantly. In conclusion, the development of adaptive multilevel simulation methods is an important, if not the most important, challenge in the multiscale modeling of complex systems, shared by fields as diverse as the materials and life sciences and beyond. Adaptive multilevel techniques should connect the atomistic description based on quantum mechanics via the microscopic to the mesoscopic level, and eventually also include the macroscopic view.
References

[1] H.P. Deutsch and K. Binder, J. Phys. II (France), 3, 1049, 1993.
[2] D.J. Londono et al., Macromolecules, 27, 2864, 1994.
[3] T.C.B. McLeish, Adv. Phys., 51, 1379, 2002.
[4] R. Everaers et al., Science, 303, 823, 2004.
[5] P. Ahlrichs, R. Everaers, and B. Dünweg, Phys. Rev. E, 64, 040501(R), 2001.
[6] K. Binder et al., "Polymer + solvent systems: phase diagrams, interface free energies, and nucleation," In: C. Holm and K. Kremer (eds.), 2004.
[7] D.R. Heine, G.S. Grest, and J.G. Curro, "Structure of polymer melts and blends: comparison of integral equation theory and computer simulation," In: C. Holm and K. Kremer (eds.), 2004.
[8] D.N. Theodorou, Mol. Phys., 102, 147, 2004.
[9] C. Holm et al., "Polyelectrolytes with defined molecular architecture II," In: M. Schmidt (ed.), Adv. Polym. Sci., vol. 166, p. 67, 2004.
[10] C.F. Abrams, L. Delle Site, and K. Kremer, Phys. Rev. E, 67, 021807, 2003.
[11] L. Delle Site, S. Leon, and K. Kremer, J. Am. Chem. Soc., 126, 2944, 2004.
[12] U.F. Röhrig, I. Frank, J. Hutter, A. Laio, J. VandeVondele, and U. Rothlisberger, ChemPhysChem, 4, 1177, 2003.
Suggested General Reading:
M. Doi and S.F. Edwards, The Theory of Polymer Dynamics, Clarendon Press, Oxford, 1986.
P.G. de Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, NY, 1979.
N. Attig, K. Binder, H. Grubmüller, and K. Kremer (eds.), Computational Soft Matter: From Synthetic Polymers to Proteins, NIC Series 23, Jülich, 2004.
D. Frenkel and B. Smit, Understanding Molecular Simulations: From Algorithms to Applications, Academic Press, San Diego, 2002.
J. Baschnagel et al., "Bridging the gap between atomistic and coarse-grained models of polymers: status and perspectives," Adv. Polym. Sci., 152, 2000.
K. Kremer and F. Müller-Plathe, "Multiscale problems in polymer science: simulation approaches," MRS Bulletin, 26, 205, 2001.
C.F. Abrams, L. Delle Site, and K. Kremer, "Multiscale computer simulations for polymeric materials in bulk and near surfaces," In: P. Nielaba, M. Mareschal, and G. Ciccotti (eds.), Bridging Time Scales: Molecular Simulations for the Next Decade, Proceedings SIMU Conference, Konstanz, August 2002, p. 143, Springer, Berlin, Heidelberg, 2002.
K. Kremer, "Multiscale aspects of polymer simulations," In: S. Attinger and P. Koumoutsakos (eds.), Multiscale Modelling and Simulation, Lecture Notes in Computational Science and Engineering, Springer Verlag, 2004.
C. Holm and K. Kremer (eds.), Advanced Computer Simulation Approaches for Soft Matter Sciences I, Adv. Polym. Sci., 179, 2004.
Perspective 6
DROWNING IN DATA – A VIEWPOINT ON STRATEGIES FOR DOING SCIENCE WITH SIMULATIONS
Dierk Raabe
Max-Planck-Institut für Eisenforschung, Düsseldorf, Germany
1. Introduction
Computational materials scientists are nowadays capable of producing an enormous wealth of simulation data. When analyzing such predictions, the challenge often consists in extracting meaningful observations from them and, wherever possible, in discovering general and representative principles behind the often huge data sets. Only the capability of condensing large data sets into the discovery of new microstructure principles turns materials simulation into computational materials science. This chapter is devoted to this topic; the following sections present some strategies for filtering new observations from materials simulations.
2. Microstructure or the Hunt for Mechanisms
While the evolutionary direction of microstructure is prescribed by thermodynamics, its actual evolution path is selected by kinetics. It is this strong influence of thermodynamic non-equilibrium mechanisms that entails the large variety and complexity of microstructures typically encountered in materials. An essential observation is that it is often not the microstructures close to equilibrium, but those in a highly non-equilibrium state, that provide advantageous and desired material properties. Following Haasen [1], microstructure can be understood as the sum of all thermodynamic non-equilibrium lattice defects on a space scale that ranges from Angstroms (point defects) to meters (sample surface) (Figs. 1 and 2).
Figure 1. Example of relevant scales occurring in automotive crash simulations [2, 3].
Figure 2. Example of some relevant scales in polymer mechanics [4].
Its temporal evolution ranges from picoseconds (dynamics of atoms) to years (fatigue, creep, corrosion, diffusion). Haasen's definition clearly underlines that microstructure does not mean micrometer, but non-equilibrium. Some of the size- and time-scale hierarchy classifications typically suggested for materials group microstructure research into macroscale, mesoscale, microscale, and nanoscale. These take a somewhat different perspective, in that they refer to the real length scale of microstructures (often ignoring the intrinsic time scales, which are more relevant when it comes to achieving relevant integration times). This may oversimplify the situation and suggest that we can linearly isolate the different space scales from each other. In other words, the classification of microstructures into a scale sequence reflects a merely spatial rather than a crisp physical classification. For instance, small defects, such as dopants, can have a larger influence on strength or conductivity than large defects such as precipitates. Or think of the highly complex phenomenon of shear banding: shear bands can be initiated not only by interactions among dislocations, or between dislocations and point defects, but also by macroscopic stress concentrations introduced by the local sample shape, surface topology, and contact situation. However, if we accept that everything is connected with everything, and that linear scale separation could blur the view of important scale-crossing mechanisms, what is the consequence of this insight? One clear answer: we do what materials scientists have always done – we look for mechanisms. Let us be more precise. While former generations of materials researchers often focused on mechanisms or effects pertaining to single lattice defects, and less on complex mechanisms not amenable to basic analytical theory or to the experiments available in those times, present materials researchers have three basic advantages in identifying new mechanisms. First, ground-state and molecular dynamics simulations have matured to a level at which we can exploit them to discover possible mechanisms with high resolution and reliability [5]. This means that materials theory stands, in terms of the addressed time and space scales, for the first time on robust quantitative ground, allowing us insights we could not obtain before. It is hardly necessary to mention the obvious benefits arising from increased computer power in this context. Second, experimental techniques have improved to such a level that, although sometimes only with enormous effort, new theoretical findings can be critically scrutinized by experiment (e.g., microscopy, nanomechanics, diffraction techniques). Third, due to advances in both theory and experiment, more complex, self-organizing, critical, and collective non-linear mechanisms can be elucidated that cannot be understood by studying only a single defect or a single length scale. All these comments can be condensed into the statement that microstructure simulation, as far as a fundamental understanding is concerned, consists in the hunt for key mechanisms. Only after identifying those can we (and should we) make scale classifications and decide how to integrate them into macroscopic
constitutive concepts or subject them to further detailed investigation. In other words, the mechanisms that govern microstructure kinetics do not know about scales. It has often been found in materials science that, once a basic new mechanism or effect was discovered, an avalanche of basic and phenomenological work followed, opening the path to new materials, new processes, new products, and sometimes even new industries. Well-known examples are the dislocation concept, transistors, aluminium reduction, austenitic stainless steels, superconductivity, and precipitation hardening. The identification of key mechanisms therefore serves a bottleneck function in microstructure research, and computational materials science plays a key role in it. This applies particularly when closed analytical expressions cannot be formulated and when the investigated problem is not easily accessible to experiments.
3. Drowned by Data – Handling and Analyzing Simulation Data
A very good multi-particle simulation nowadays faces the same problem as a good experiment, namely the handling, analysis, and interpretation of the resulting huge data sets [6]. Let us for a moment take the position of a quantum Laplace daemon and assume we can solve the Schrödinger equation for 10^23 particles over a period of time that covers significant processes in microstructure. What would we have learned at the end? The answer: not much more than from an equivalent experiment with high lateral and temporal resolution. The major common task of both operations would be to filter, analyze, and understand what we simulated or measured, respectively. We must not forget that the basic aim of most scientific initiatives consists in obtaining a general understanding of the principles that govern the processes and states we observe in nature. This means that the mere mapping and reproduction of 10^23 sets of single data packages (e.g., 10^23 positions and momenta of all particles as a function of time) can only build a quantitative bridge toward a basic understanding; it cannot replace it. However, the advantages of this quantitative bridge built by the Laplacian super-simulation introduced above are at hand. First, it would give us a complete and well-documented history record of all particles over the simulated period, i.e., more detail than any experiment. Second, once a simulation procedure is working properly, it is often a small effort to apply it to other cases. Third, simulations have the capability to predict rather than only describe. Fourth, in simulations the boundary conditions are usually exactly known (because they are mathematically imposed), as opposed to experimentation, where they are typically not so well known. My initial statement about the concern of being drowned by simulation data therefore aims at
encouraging the computational materials science community to cultivate an expertise in discovering the basic mechanisms and microstructural principles behind simulations, rather than getting lost in the details.
4. Scaling, Coarse Graining, and Renormalization in Computational Materials Science
Once we realize that quantum mechanics is not capable of directly treating 10^23 particles, the question arises how macroscopic material properties of microstructured samples can nonetheless be recovered from first principles. Numerous methods have been suggested to tackle such scale problems. They can be classified into two basic groups: multi-scale and scale-bridging methods [2]. The first set of approaches (multi-scale) repeatedly feeds into a simulation parameters or rules obtained from simulations at the next smaller scale. For instance, the interatomic potentials of a material can be approximated using local density functional theory. This result can enter molecular dynamics simulations through the design of embedded-atom or tight-binding potentials. The molecular dynamics code could then be used, say, for the simulation of a dislocation reaction. Reaction rules and the resulting force fields of reaction products obtained from such predictions could become part of a subsequent elastic discrete dislocation dynamics simulation. The results of this simulation could be used to derive the elements of a phenomenological hardening matrix formulation, and so on. Scale-bridging methods take a somewhat different approach to scaling: they try to identify, in phenomenological macroscopic constitutive laws, those few key parameters that mainly map the atomic-scale physical nature of the investigated material, and they try to skip some of the regimes between the atomic and the macroscopic scale. Both approaches suffer from the disadvantage that they do not follow any basic and general transformation or scaling rules but instead require complete heuristic and empirical guidance. This means that both multi-scale and scale-bridging methods must be directed by well-trained intuition and experience. An underlying, commonly agreed methodology for extracting meaningful information from one scale and implementing it at another simply does not exist. A similar but less arbitrary approach to this scale problem might lie in introducing suitable averaging methods capable of reformulating the forces among lattice defects accurately in terms of a new model with renormalized interactions obtained from an appropriate coarse-graining algorithm [7–9]. Such a method could render some of the multi-scale and scale-bridging
efforts more systematic, general, and reproducible. The basic idea of coarse graining and renormalization group theory consists in identifying a set of transformations (a group) that translate characteristic properties of a system from one resolution to another. This procedure is, as a rule, not symmetric (a semi-group). The transformations are conducted in such a way that they preserve the fundamental energetics of the system, for instance the value of the partition function. To understand the principle of renormalization, consider an Ising lattice model, defined by some spatial distribution of the (Boolean) spins and its corresponding partition function. The Ising model can be reformulated by applying Kadanoff's original real-space coarse-graining or block-spin approach [7]. This algorithm works by taking a small subset of neighboring lattice spins and replacing them by a single new lattice point carrying a single new spin, whose value is derived according to a decimation or majority rule. The technique can be applied to the system repeatedly until the system remains invariant under further transformation. The art of rendering these repeated transformations a meaningful operation lies in how the value of the partition function, or some other property of significance, is preserved throughout the repeated renormalizations. The effect of successive scale transformations is to generate a flow diagram in the corresponding parameter space; in our present example the flow appears as a gradual change of the renormalized coupling constant. Finally reaching scale invariance is equivalent to transforming the system into a fixed point. These are points in parameter space corresponding to a state where the system is self-similar, i.e., it remains unchanged upon further coarse graining and transformation; the system properties at fixed points are therefore truly fractal. Each fixed point has a surrounding region in parameter space in which all initial states finally end at that fixed point upon renormalization, i.e., they are attracted by it. The surface where such competing regions abut is referred to as a critical surface; it separates the regions of the phase diagram that scale toward different single-component limits. The approach outlined in this section is called direct or real-space renormalization. For states on the critical surface, all length scales coexist: the characteristic length of the system goes to infinity, becoming arbitrarily large with respect to atomic-scale lengths. For magnetic phase transitions, the measure of the diverging length scale is the correlation length; for percolation, it is set by the connectivity length of the clusters. The presence of a diverging length scale is what makes it possible to apply the technique of real-space renormalization to percolation. Although the idea of applying the principles of renormalization group theory to microstructures may seem appealing at first sight, it is essential to underline that there exists no general unified method of coarse graining microstructure problems. Approaches in this context must therefore also first take
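Kadanoff's block-spin construction described above fits in a few lines of code: the sketch below applies a majority-rule decimation to an Ising spin configuration. The lattice size, block size, and the infinite-temperature starting configuration are illustrative choices.

```python
import numpy as np

def block_spin(spins, b=3):
    """One real-space RG step: replace each b x b block by its majority spin.
    An odd block size b avoids ties (majority rule)."""
    L = spins.shape[0] - spins.shape[0] % b          # trim to a multiple of b
    blocks = spins[:L, :L].reshape(L // b, b, L // b, b)
    return np.sign(blocks.sum(axis=(1, 3))).astype(int)

rng = np.random.default_rng(0)
s0 = rng.choice(np.array([-1, 1]), size=(81, 81))    # T = infinity configuration
s1 = block_spin(s0)                                  # 27 x 27
s2 = block_spin(s1)                                  # 9 x 9
print(s0.mean(), s1.mean(), s2.mean())
# Off criticality the flow runs toward a trivial fixed point; only at T_c is
# the configuration statistically self-similar under this transformation.
```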
Approaches in this context must therefore also first take a heuristic view of scaling and carefully check which material property might usefully serve as the preserved quantity when translating the system to the respective coarser scale. Another challenge consists in identifying appropriate methods to describe the in-grain and grain-to-grain behavior of the system clusters obtained by coarse graining. On the other hand, coarse graining and renormalization group theory might offer an elegant opportunity to eliminate the empiricism encountered in some of the scaling approaches used in microstructure simulations [10–12]. Another advantage of applying renormalization group theory to microstructure ensembles might be the elucidation of universal scaling laws, which are common also to other materials problems.
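To make the block-spin idea concrete, the following minimal sketch applies repeated majority-rule coarse graining to a two-dimensional Ising configuration. It is only the geometric half of a renormalization analysis: the lattice and block sizes are arbitrary choices, and no attempt is made to track how an effective coupling constant flows under the map, which is the part that requires the machinery discussed in [7].

```python
import numpy as np

rng = np.random.default_rng(0)

def block_spin(spins, b=3):
    """One Kadanoff block-spin step: replace every b x b block of
    +/-1 spins by a single spin chosen by the majority rule
    (an odd block size b avoids ties)."""
    n = spins.shape[0] // b * b                 # trim to a multiple of b
    s = spins[:n, :n].reshape(n // b, b, n // b, b)
    block_sums = s.sum(axis=(1, 3))             # net spin of each block
    return np.where(block_sums >= 0, 1, -1)

# toy configuration: uncorrelated +/-1 spins on an 81 x 81 lattice
spins = rng.choice([-1, 1], size=(81, 81))

# repeated coarse graining; the magnetization per site serves here as a
# crude stand-in for a property tracked through the transformations
while spins.shape[0] >= 3:
    print(spins.shape, "m =", round(float(spins.mean()), 3))
    spins = block_spin(spins)
```

For such an uncorrelated starting configuration the blocked system flows toward a trivial high-temperature fixed point; only at criticality, with a suitably chosen transformation, do the statistics remain invariant in the sense of the fixed-point behavior described above.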
References
[1] P. Haasen, Physikalische Metallkunde, Springer-Verlag, Berlin, Heidelberg, 1984.
[2] D. Raabe, Computational Materials Science, Wiley-VCH, Weinheim, 1998.
[3] R.W. Cahn, The Coming of Materials Science, Pergamon Press, Amsterdam, New York, 2001.
[4] D. Raabe, "Mesoscale simulation of spherulite growth during polymer crystallization by use of a cellular automaton," Acta Mater., in press, 2004.
[5] M.P. Allen and D.J. Tildesley, Computer Simulation of Liquids, Oxford Science, 1987.
[6] H.O. Kirchner, L.P. Kubin, and V. Pontikis, Proceedings of NATO ASI on Computer Simulation in Materials Science, NATO Advanced Science Institutes Series, Series E: Applied Sciences, vol. 308, Kluwer Academic in cooperation with NATO Science Division, 1996.
[7] J. Cardy, Scaling and Renormalization in Statistical Physics, Cambridge University Press, Cambridge, 1996.
[8] D. Raabe, "Challenges in computational materials science," Adv. Mater., 14, 639–650, 2002.
[9] D. Raabe, "Don't Trust your Simulation – Computational materials science on its way to maturity?" Adv. Eng. Mater., 4, 255–267, 2002.
[10] K. Binder, Monte-Carlo Methods in Statistical Physics, Springer-Verlag, New York, 1986.
[11] H.E. Stanley, Phase Transitions and Critical Phenomena, Oxford University Press, London, 1971.
[12] J.M. Yeomans, Statistical Mechanics of Phase Transitions, Clarendon, Oxford, 1992.
Perspective 7
DANGERS OF “COMMON KNOWLEDGE” IN MATERIALS SIMULATIONS
Vasily V. Bulatov
Lawrence Livermore National Laboratory, University of California, Livermore, CA 94550
For someone entering the field of materials simulations, it may be difficult to navigate through the maze of various ideas and concepts existing in the literature and to make one's own judgment about their validity and certainty. Monographs and chapter books make it easier for a beginner to prepare for reading the literature describing the state of the art. Yet, even while reading a textbook, a novice may get the discomforting feeling of “not digging” a certain statement. If and when this happens, the first urge is usually to reread the passage and think harder and, if that fails, to review the preceding discussion, trying to pay more specific attention to the facts and logic behind the elusive idea. Then, depending on one's patience, it may become necessary to read other texts or talk to more experienced people. But what if all of this fails to clarify the point in question? What if the misunderstanding persists through the years and continues to nag even after most of the other, initially difficult, ideas happily find their proper place in one's mind? It is quite natural then to begin to doubt oneself: why does no one else have this difficulty? Is it only I who is stupid? Eventually, the feeling of desperation subsides, often replaced by a conditional acceptance: “I don't dig it but I can live with it”. Having changed my research area several times by now – from nuclear theory to statistical physics of polymers to the physics of dislocations – I have had my share of such moments of desperation. My most recent foray into computer simulations of dislocations began in 1993, when Ali Argon and Sid Yip suggested that I work on kink mechanisms of dislocation mobility in silicon. My prior exposure to dislocation theory consisted of half a lecture in an undergraduate course on solid-state theory. The only thing I carried away with me from that lecture was that dislocation theory is something exceptionally boring and involves a lot of tedious tensor algebra. And that, however boring, the theory of dislocations is a very well-established chapter of solid-state physics.
It is with this preconception that I started reading the literature on dislocations in 1993. Now, having read most of the existing textbooks and hundreds if not thousands of papers on the subject, I feel humbled and humiliated by the vast variety of ideas and concepts existing in this fascinating field of study. And, surely, though still a relative novice, I have accumulated my share of qualms, most of which I have learned to get over. Yet some of the nagging issues appeared more troubling than others, precisely because they seemed so simple and basic. Below I give a brief account of several such troubling statements and argue, at the risk of sounding contrarian, that some of the common understanding presented in the literature may be suspect. First, I will describe “a saga of misconceptions” around one technical issue in dislocation simulations, namely the use of periodic boundary conditions in dislocation dynamics simulations. Second, I will discuss one very basic idea concerning the tendency of dislocations to glide on the most widely spaced crystallographic planes. Finally, I will speculate on several other bits of common knowledge that grow increasingly suspicious in my mind the more I think about them.
The first issue may appear trivial but, mysteriously, remained unresolved until the year 2000. It concerns the alleged impossibility of using periodic boundary conditions (PBC) in 3D dislocation dynamics simulations. The following excerpts from the literature speak for themselves. “In order to avoid artifacts due to PBC, free surfaces are set” (1992). “In 3D, periodic boundary conditions should be implemented by mapping each set of slip planes on itself, which may be problematic...” (1992). “PBC can be used in 2D but not in 3D if continuity of each dislocation line is to be maintained across the boundary” (1996). “The building of PBC at the boundaries of the simulation box is an unsolved problem. Indeed, the Born von Karman periodic condition used at the atomic scale can not be geometrically reproduced since we are considering linear defects moving in more than three glide systems and inside a 3-D space volume” (1996). “However, 3D periodicity is geometrically impossible since continuity of the dislocation lines can never be maintained” (1999). Although this list is certainly incomplete, it is nevertheless representative of the thinking at the time among the developers of a relatively new method of dislocation line dynamics. Rather than examining what the perceived difficulties* in using PBC in 3D line dynamics actually were, it is curious to observe that the tone of the statements quoted above grew progressively more assertive, even outright aggressive. The tone changed from doubt to absolute certainty that PBC in 3D line dynamics are not usable, and the notion continued to reinforce itself over time.
Unfortunately, this misunderstanding was not entirely harmless: having been convinced that PBC were impossible to use, several groups pursued alternative ideas for boundary conditions for dislocation line dynamics in the bulk. These alternative developments led to some rather complicated boundary conditions, full of their own technical problems and artifacts, which are now essentially abandoned in favor of PBC. The most amusing aspect of this story is that, all the while, there were plenty of people (myself included) who knew from experience that PBC are not only possible but also very simple to use when one is concerned with line objects. Needless to say, I had many an agonizing thought about this issue while reading the literature on dislocation dynamics.
* There were none. In fact, after PBC were “acquitted” in 2000 [1], in most cases it took just a few extra lines of code to modify the existing 3D algorithms to handle PBC gracefully. Within a few weeks of the first demonstrated use of PBC in 3D dislocation dynamics, essentially every group actively working in this area had implemented the simple trick.
I now turn to another notion that appears well entrenched in the dislocation physics community. Unlike the technical issue discussed above, this one is much more basic and appears very early in the textbooks. Give or take a few words, it is usually stated approximately like this: “Dislocations tend to exist and glide in those crystallographic planes that are most widely separated from each other”. Over some 10 years of reading the literature, I have seen quite a few such statements. Granted, exceptions to this rule are known, and the notion itself is not pushed as aggressively among the specialists. Yet the logic behind it is flawed, and any reliance on such a geometric rule can lead to misunderstanding. Two factors are likely to have contributed to this preconceived notion. One is that thinking about dislocations is still dominated by observations of their behavior in FCC metals, where dislocations indeed move in the widely spaced {111} planes. The other possible culprit is the Peierls–Nabarro model which, in its formulation, invokes the notion of inter-planar sliding: it appears logical that the coupling between widely spaced planes, and hence the resistance to dislocation motion, is weakest. To see that this notion is dubious at best, it suffices to consider dislocation glide in BCC metals. It is straightforward to show, by direct atomistic simulations, that the Peierls stress of 1/2⟨111⟩ dislocations in BCC crystals depends only weakly on the choice of glide plane within the ⟨111⟩ zone [2]. Instead, within one and the same plane, the Peierls stress may vary by orders of magnitude as a function of the dislocation character angle [3]. In particular, in the {110} planes, the Peierls stress varies from 2 GPa for screw dislocations to below 20 MPa for edge dislocations. It appears that the single geometrical parameter of the lattice that matters most is the period of the Peierls potential in the direction perpendicular to the dislocation line: the longer the period, the higher the Peierls stress. This period is the same as the spacing between the atomic rows parallel to the line direction and depends not only on the character angle in the geometric glide plane but also on the glide plane itself. Our as yet unpublished results suggest that, even among edge dislocations, the Peierls stress may vary from several hundred MPa to barely detectable levels of several MPa, depending on the selected glide plane [2].
Coming back to dislocations in FCC metals, the fact that dislocations glide (mostly) in the {111} planes is because these are the planes where dislocations dissociate into Shockley partials. This dissociation confines dislocations to gliding in the {111} planes. Thus, it is not the fact that the {111} planes are most widely spaced but the fact that they are the only ones with low-energy stacking faults that defines this characteristic behavior. In pure Al, where the stacking fault energy is relatively high, dissociation is suppressed and glide on planes other than {111} is observed [4]. Surely, one can counter this argument by saying that low-energy stacking faults are more likely to exist between widely spaced planes. This conjecture is probably acceptable as a tendency, but not as a rule.
Having described two examples of the existing “common knowledge”, I would like to name a few other common perceptions with respect to which my initial doubts are now giving way to a sense of “growing discontent”. I realize that such a speculative argument is quite risky and may eventually fly back in my face at some later time. However, my sense of self-preservation is outweighed by a desire to call attention to possible misunderstandings and to persuade others to take a harder look at the issues at hand.
1. The traditional meaning of the Peierls stress does not seem to hold up against the recent results of direct MD simulations of screw dislocation motion in BCC metals. Simulation data from our group [5], as well as several unpublished observations [6, 7], suggest that screw dislocations in BCC metals are able to move under stress well below the (statically computed) Peierls threshold, even at temperatures as low as 10 K. Furthermore, the ability of screw dislocations to move appears to depend strongly on the length of the simulated dislocation segment.
2. First suggested by P. Hirsch, the idea that a three-way non-planar splitting makes it difficult for screw dislocations in BCC metals to glide has become rather pervasive in the literature. Yet there is an emerging new consensus, based on recent atomistic simulations, that the high resistance to glide of screw dislocations in BCC metals is not a consequence of the notorious three-way splitting of the screw dislocation core [8, 9]. It appears that, as long as the screws do not experience a planar splitting, their lattice resistance is bound to be high regardless of other details of the core structure.
3. The well-known Friedel–Escaig (FE) mechanism of cross-slip in FCC metals has recently been confirmed in a series of large-scale atomistic simulations [10, 11]. There was little doubt, from the outset, that FE is the mechanism by which dislocation motion in FCC metals becomes 3D. However, the setup of the atomistic simulations reported in [10, 11] made it virtually impossible to find any cross-slip mechanism other than FE. In particular, the choice of the method for finding the transition pathway, together with the choice of the initial and final states along the sampling path, restricted the search to FE-like transitions. Is this yet another case of self-reinforcing misconception? Notably, our recent simulations [12] suggest that cross-slip in FCC metals may proceed via a different mechanism suggested earlier by Fleischer [13].
4. One finds numerous references to cross-slip from dislocation pile-ups as the mechanism underlying the transition from stage II to stage III hardening in FCC metals. Having examined much of the literature on this subject, including the original papers by Seeger, I have been frustrated by my inability to understand this mechanism and to paint a clear picture in my mind of how exactly cross-slip from pile-ups defines this transition. The only weak consolation I had was that many other researchers, especially outside of Europe, have found this equally difficult to do. My frustration culminated in searching for and proposing an alternative mechanism for this transition [14, 15]. I am nearly certain that other researchers can offer further alternative ideas.
In this short remark, I have described my personal views on several aspects of dislocation theory. My intention was to share a few misgivings about some of the seemingly well-established knowledge in the field of dislocation physics. This is my current area of research and, obviously, the only one I can draw on for experience. Yet, given the vast amount of literature on dislocation physics, I will not be surprised to eventually find that other researchers have thought about and even, possibly, stated their reservations concerning the same or related issues. In any case, my only and rather evident conclusion is that it is good practice to remain agnostic about the ideas and concepts prevalent in any field of research, materials modeling included. Others may have their own “skeletons in the closet” waiting to get out into the open. This brief article was my way of doing just that.
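As a technical aside to the PBC story above: the “few extra lines of code” are essentially the standard minimum-image bookkeeping, applied to the nodes and segments of discretized dislocation lines. The sketch below is a generic illustration of that bookkeeping, not the actual algorithm of Ref. [1]; the box size, data layout, and function names are invented for the example.

```python
import numpy as np

L = 100.0  # edge length of the cubic periodic box (arbitrary units, hypothetical setup)

def wrap(x):
    """Map a node position back into the primary box [0, L)^3."""
    return x % L

def segment_vector(a, b):
    """Minimum-image vector from node a to node b, so that a line
    leaving one face of the box continues through the opposite face."""
    d = b - a
    return d - L * np.round(d / L)

# a segment crossing the +x face: its far node is stored wrapped,
# yet the segment geometry (length, direction) is unchanged
a = np.array([99.0, 50.0, 50.0])
b = wrap(np.array([101.0, 50.0, 50.0]))   # stored as [1, 50, 50]
print(segment_vector(a, b))               # [2. 0. 0.]: line continuity preserved
```

Once all forces and connectivity checks are evaluated on minimum-image vectors rather than raw coordinate differences, line continuity across the boundary poses no geometric problem.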
References
[1] V.V. Bulatov, W. Cai, and M. Rhee, "Periodic boundary conditions for dislocation dynamics simulations in three dimensions," In: L.P. Kubin, J.L. Bassani, K. Cho et al. (eds.), Mat. Res. Soc. Symp., vol. 653, 2001.
[2] V.V. Bulatov, W. Cai, and C. Krenn, unpublished.
[3] J.P. Chang, W. Cai, V.V. Bulatov, and S. Yip, "Molecular dynamics simulations of motion of edge and screw dislocations in a metal," Comput. Mater. Sci., 23, 111, 2002.
[4] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., Wiley, New York, p. 272, 1982.
[5] J. Marian, W. Cai, and V.V. Bulatov, "Dynamic transitions from smooth to rough to twinning in dislocation motion," Nat. Mater., 3, 158–163, 2004.
[6] V. Pontikis, private communication.
[7] Yu. Osetsky, private communication.
[8] S. Ismail-Beigi and T.A. Arias, "Ab initio study of screw dislocations in Mo and Ta: a new picture of plasticity in bcc transition metals," Phys. Rev. Lett., 84, 1499, 2000.
[9] C. Woodward and S.I. Rao, "Flexible ab initio boundary conditions: simulating isolated dislocations in bcc Mo and Ta," Phys. Rev. Lett., 88, 216402, 2002.
[10] T. Rasmussen, K.W. Jacobsen, T. Leffers, O.B. Pedersen, S.G. Srinivasan, and H. Jónsson, "Atomistic determination of cross-slip pathway and energetics," Phys. Rev. Lett., 79, 3676–3679, 1997.
[11] T. Rasmussen, K.W. Jacobsen, T. Leffers, and O.B. Pedersen, "Simulations of the atomic structure, energetics, and cross slip of screw dislocations in copper," Phys. Rev. B, 56, 2977, 1997.
[12] W. Cai, V.V. Bulatov, J.P. Chang, J. Li, and S. Yip, "Dislocation core effects on mobility," In: F.R.N. Nabarro and J.P. Hirth (eds.), Dislocations in Solids, Elsevier, Amsterdam, vol. 12, chap. 12, p. 17, 2004.
[13] R.L. Fleischer, "Cross slip of extended dislocations," Acta Metall., 7, 134, 1959.
[14] V.V. Bulatov, "Unlocking dislocation secrets – challenges in theory and simulations of crystal plasticity," In: J.V. Carstensen, T. Leffers, T. Lorentzen et al. (eds.), Modelling of Structure and Mechanics of Materials from Microscale to Product, Risø National Laboratory, Roskilde, Denmark, pp. 39–60, 1998.
[15] V.V. Bulatov, "Connecting the micro to the mesoscale: review and specific examples," In: J. Lepinoix et al. (eds.), Multiscale Phenomena in Plasticity, Kluwer Academic Publishers, Netherlands, pp. 259–269, 2000.
Perspective 8
QUANTUM SIMULATIONS AS A TOOL FOR PREDICTIVE NANOSCIENCE
Giulia Galli and François Gygi
Lawrence Livermore National Laboratory, CA, USA
In the last two decades, the coming of age of first-principles theories of condensed and molecular systems and the continuous increase in computer power have positioned physicists to address anew the complexity of matter at the microscopic level. Theoretical and algorithmic developments in ab initio molecular dynamics [1] and quantum Monte Carlo methods [2], together with optimized codes running on high-performance computers, have allowed many properties of matter to be inferred from the fundamental laws of quantum mechanics, without input from experiment. In particular, quantum simulations are playing an increasingly important role in understanding and controlling matter at the nanoscale and in predicting, with controllable, quantitative accuracy, the novel and complex properties of nanomaterials. In the next few years, we expect quantum simulations to acquire a central role in nanoscience, as further theoretical and algorithmic developments will allow one to simulate a wide variety of alternative nanostructures with specific, targeted properties. In turn, this will open the possibility of designing optimized materials entirely from first principles. Although the full accomplishment of this modeling revolution will be years in the making, its unprecedented benefits are already becoming clear. Indeed, ab initio simulations are providing key contributions to the understanding of a rapidly growing body of measurements at the nanoscale. A microscopic, fundamental understanding is very much in demand, as such experimental investigations are often controversial and cannot be explained on the basis of simple models. Quantum simulations provide simultaneous access to numerous physical properties (e.g., electronic, thermal, and vibrational), and they allow one to investigate properties which are not yet accessible to experiment. A notable example is represented by microscopic models of the structure of surfaces at the nanoscale, which cannot yet be characterized experimentally due to the lack of appropriate imaging techniques.
The characterization of nanoscale surfaces and interfaces is of paramount importance for predicting the function of nanomaterials and eventually their assembly into macroscopic solids: the surface-to-bulk ratio is very large in any nanostructure, and the key chemical reactions determining the properties of nanomaterials usually occur at surfaces and interfaces. In Fig. 1 we show some examples of predictions of nanostructure properties recently obtained by quantum simulations. The first illustrates the prediction of the existence of a new class of nanoparticles (bucky diamonds) [3], which has been experimentally verified; the second shows how quantum simulations can be used to understand the complex interplay between quantum confinement effects and surface properties (in simple Si dots) [4] and to predict yet unexplored solvation effects on Si clusters. The third example shows how accurate calculations for CdSe nanoparticles can help interpret and better understand a set of apparently well-established experimental data, and provide atomistic models which open the way to complex nanomaterials growth studies [5]. One common and important point shown by all of these examples is the unique ability of quantum simulations to separate different physical effects and assess their quantitative relevance in determining various properties (e.g., the relative importance of quantum confinement and surface structure in determining the stability and optical gaps of semiconductor nanoparticles, or the relative importance of thermal disorder and solvation effects in determining electronic properties). This ability to discern between various physical effects is a unique feature of quantum simulations and an essential prerequisite for the development of materials design tools.
Figure 1. Examples of nanostructures investigated with first-principles, state-of-the-art calculations: carbon nanoparticles, in particular the structure of bucky-diamond, which was predicted by ab initio molecular dynamics simulations (left hand side [3]); surface reconstructions of hydrogenated silicon nanoparticles, simulated by ab initio MD with electronic band gaps computed by highly accurate quantum Monte Carlo techniques (bottom right [4]); and self-healing of CdSe dots (upper right [5]).
The left hand side of Fig. 1 shows structural models of bare nanodiamonds as obtained using ab initio molecular dynamics (MD) simulations [3]. These calculations have shown that, in the 1–4 nm size range, nanodiamond has a fullerene-like surface and, unlike silicon and germanium, exhibits very weak quantum confinement effects. These carbon nanoparticles, characterized by a diamond core and a fullerene-like surface reconstruction, have been called bucky diamonds. The proposed microscopic structure of bucky diamonds has been experimentally verified by a series of X-ray absorption and EELS measurements. In addition, ab initio calculations of bare and hydrogenated nanodiamonds have shown that at about 3 nm, and in a broad range of pressures and temperatures, particles with bare, reconstructed surfaces become thermodynamically more stable than those with hydrogenated surfaces. These findings provided an explanation for the size distribution of extra-terrestrial nanodiamond found in meteorites and in outer space (e.g., proto-planetary nebulae) and of terrestrial nanodiamond found in ultradispersed and ultra-crystalline diamond films. Carbon is unique among group IV elements in exhibiting very weak quantum confinement effects at the nanoscale. Both Si and Ge are known to exhibit stronger quantum confinement (below 5–6 nm), although experimentally it has been difficult to disentangle the interplay between mere size reduction of the crystalline nanoparticle core and surface reconstruction effects. Using a combination of quantum Monte Carlo (QMC) and ab initio MD techniques, the relative stability of Si nanoparticles (up to 2 nm) with reconstructed and unreconstructed surfaces has been predicted. Interestingly, these simulations have permitted the identification of reconstructions which are unique to the highly curved surfaces of nanostructured materials and could not be guessed from a simple knowledge of the structure of solid surfaces. In addition, a clear connection between structure and function has been established: for example, calculations have shown that reconstructions of surface steps dramatically reduce the optical gap (right hand side of Fig. 1) of hydrogenated Si dots and decrease excitonic lifetimes by localizing the band-edge electronic states on the surface of the clusters. These predictions provided an explanation of both the measured photoluminescence spectra of colloidally synthesized nanoparticles and the deep gap levels observed in porous silicon. While surface reconstruction and some surface passivation (e.g., by oxygen) have been found to greatly influence the optical properties of Si dots, ab initio MD simulations of the solvation of oxygenated Si clusters in water have shown no observable impact of the solvent.
This is in contrast with the blue shifts observed for several organic molecules in polar solvents, and it indicates that the vacuum optical properties of Si dots are preserved in the presence of water. This information is extremely important for possible applications of Si dots as sensors in aqueous environments. The need for quantum simulations in investigations of group IV nanostructures is apparent, given the current experimental difficulties encountered in synthesizing and characterizing most of these nanoparticles. In addition, these simulations play a very important role in understanding other systems (such as CdSe dots) which are better characterized experimentally. For example, ab initio calculations of the structural and electronic properties of CdSe nanoparticles [5] (right hand side of Fig. 1) have shown significant geometrical rearrangements of the nanoparticle surface while the wurtzite core is maintained. Remarkably, these reconstructions, which are very different from those of group IV dots, are similar in vacuo and in the presence of the ligands used in colloidal synthesis. Surface rearrangements lead to the opening of an optical gap even in the absence of passivating ligands, thus “self-healing” the surface electronic structure. These calculations provided microscopic models which open the way to studying the growth of both spheres and wires and eventually the surface functionalization of CdSe nanostructures. The nanostructures illustrated in Fig. 1 all contain between 100 and 500 atoms, and they are representative of what can be dealt with today with ab initio MD and QMC tools. At present, state-of-the-art ab initio molecular dynamics can treat systems with a few hundred atoms (200–500, depending on the number of electrons and the accuracy required to describe the electronic wavefunctions) and simulation times of 10–100 ps (depending on the size of the systems involved). State-of-the-art quantum Monte Carlo (QMC) codes using newly developed linear-scaling algorithms [6] can now enable the calculation of the energies and optical gaps of sp-bonded systems with up to 100–300 atoms, as illustrated above in the case of Si clusters. We estimate that in the next few years, algorithmic developments (e.g., linear-scaling methods [7]), along with an anticipated surge in computational power, will enable ab initio simulations of systems comprising 3000–4000 atoms for several picoseconds, as well as of systems comprising 200–300 atoms in the nanosecond range. In addition, nearly linear-scaling QMC calculations [6] of systems containing several thousand atoms will be made possible with unprecedented levels of accuracy. This will permit realistic simulations of the organic/inorganic interfaces found in nanoscale devices for bio-detection, of transport properties of single-molecule electronic devices and semiconductor nanowires, of the properties of magnetic systems at the nanoscale, and, in general, of advanced materials. Finally, the application of ab initio MD and QMC techniques is extending beyond the traditional fields of condensed matter physics and physical chemistry into biochemistry and biology.
In the next decade we expect quantum simulations to effectively enter the realm of biology and to tackle problems such as the microscopic modeling of DNA repair mechanisms and drug/DNA interactions. In particular, nearly exact QMC results may provide invaluable theoretical benchmarks that help overcome some of the current limitations of experimental biology. Although promising, quantum simulations still require improvements in order to provide tools that theoreticians and experimentalists alike can use to design new materials, and many challenging problems remain to be solved. Besides the clear need for theoretical and algorithmic developments and for complex code optimizations to adapt to new and changing platform architectures, new strategies need to be developed to make the best use of these techniques in a way that is fully complementary to experiment. In particular, novel approaches to analyzing, storing, and using data obtained from quantum simulations (including visualization tools and simulation databases) need to be established. Progress in all of these areas will make quantum simulations robust predictive tools for the design of new materials with targeted properties. Large increases in computer power – together with efficient coupled classical/quantum-mechanical techniques (e.g., classical and ab initio molecular dynamics and quantum Monte Carlo) – will enable the design of new materials at the nanoscale by generating a vast amount of accurate data to be used in configurational-expansion searches. The creation of easily accessible libraries of ab initio data for the quantum design of materials will then allow one to predict systems with desired properties and quantities which are amenable to experimental validation. Many next-generation technologies will benefit from an ab initio computational design process, including optoelectronic materials, energy and information storage, detection of biological and chemical contaminants, and spintronic devices.
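For orientation, the crudest textbook estimate of the quantum confinement discussed in this perspective is the effective-mass particle-in-a-sphere model, in which the gap of a dot of radius R is the bulk gap plus a shift proportional to 1/R². The sketch below uses illustrative Si-like parameters; it deliberately omits the surface reconstruction, passivation, and solvation effects that the quantum simulations show to be decisive, which is precisely why such simple models cannot explain the measurements.

```python
import math

HBAR = 1.054571817e-34    # J*s
M_E  = 9.1093837015e-31   # free electron mass, kg
EV   = 1.602176634e-19    # J per eV

def dot_gap_ev(d_nm, e_gap_bulk=1.12, m_e=0.26, m_h=0.49):
    """Particle-in-a-sphere (effective-mass) gap of a dot of diameter d_nm.
    m_e, m_h: effective masses in units of the free electron mass
    (illustrative Si-like values). Surface reconstruction, passivation,
    solvation, and excitonic corrections are all neglected."""
    r = 0.5 * d_nm * 1e-9
    shift = (HBAR * math.pi) ** 2 / (2.0 * r * r) \
        * (1.0 / (m_e * M_E) + 1.0 / (m_h * M_E))
    return e_gap_bulk + shift / EV

for d in (1.0, 2.0, 3.0, 5.0):
    print(f"d = {d:.0f} nm  ->  gap ~ {dot_gap_ev(d):.2f} eV")
```

The 1/R² blow-up of such estimates at small diameters, compared with the much weaker size dependence found for carbon above, illustrates why confinement strength must be computed, not assumed.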
Acknowledgment This work was performed under the auspices of the U.S. Department of Energy by University of California, Lawrence Livermore National Laboratory under Contract W-7405-Eng-48.
References
[1] R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471, 1985.
[2] D.M. Ceperley and B. Alder, Phys. Rev. Lett., 45, 566, 1980.
[3] J.-Y. Raty, G. Galli, A. Van Buuren, C. Bostedt, and L. Terminello, Phys. Rev. Lett., 90, 037401, 2003; J.-Y. Raty and G. Galli, Nature Mat., 2, 792, 2003.
[4] A. Puzder, A.J. Williamson, F. Reboredo, and G. Galli, Phys. Rev. Lett., 91, 157405, 2003.
[5] A. Puzder, A.J. Williamson, F. Gygi, and G. Galli, Phys. Rev. Lett., 92, 217401, 2004.
[6] A. Williamson, R.Q. Hood, and J.C. Grossman, Phys. Rev. Lett., 87, 246406, 2001.
[7] J.-L. Fattebert and J. Bernholc, Phys. Rev. B, 62, 1713, 2000; J.-L. Fattebert and F. Gygi, Comp. Phys. Comm., 162, 24, 2004.
Perspective 9
A PERSPECTIVE OF MATERIALS MODELING
William A. Goddard III
Materials and Process Simulation Center, California Institute of Technology, Pasadena, California 91125, USA
1. The Vision and Opportunity of de novo Multi-Paradigm and Multiscale Materials Modeling
The impossible combinations of materials properties required for essential industrial applications have made the present paradigm of empirically based experimental synthesis and characterization increasingly untenable. Since all properties of all materials are in principle describable by quantum mechanics (QM), one could in principle replace the current empirical methods used to model materials properties by first-principles, or de novo, computational design of materials and devices. This would revolutionize materials technologies, with rapid computational design followed by synthesis and experimental characterization only for those materials and designs predicted to be optimum. From good candidate materials and processes, one could iterate between theory and experiment to optimize materials. The problem is that direct de novo applications of QM are practical for systems with ∼10² atoms, whereas the materials designer deals with systems of ∼10²² atoms. The solution to this problem is to factor the problem into several overlapping scales, each of which can achieve a scale factor of ∼10⁴. By adjusting the parameters of each scale to match the results of the finer scale, it is becoming possible to achieve de novo simulations of practical devices with just ∼5 levels. This would allow accurate predictions of the properties of novel materials never previously synthesized, and it would allow the intrinsic bounds on properties to be established so that one does not waste time on impossible challenges. The levels in this hierarchy of methods (theory, algorithms, and software) must overlap, as in the figure below, so that the results of the finer level can be used to determine the parameters and models suitable at the coarser level while retaining the accuracy of the finer level.
[Figure: De novo hierarchical strategy for predicting real, optimized materials. Overlapping simulation levels (QM, FF/MD, MESO, MACRO) are arranged along the length and time scales, with software integration across levels. Design process, stage 1: target properties at the macroscale determine the desired behavior at smaller scales. Stage 2: high-quality multiscale modeling of promising materials for accurate predictions of materials properties.]
This will allow a designer to modify in real time the choice of materials, microstructures, and assembly, and to obtain rapid feedback on the target properties until the designed performance is attained. Despite enormous progress, there remain enormous gaps in our ability to use theory and computation to address the prediction of optimized materials with reliability. Indeed, other articles in this handbook address some of the innovative ideas for extending current theory, algorithms, and computational methods to solve the remaining problems. What is clear now is that the progress of the last 40 years has brought us to the brink of a whole new age in materials science, in which theory and simulation are trusted to do the broad elements of designing bold new materials, with the optimized solutions refined experimentally to bring these designs into practice. Indeed, the chapters in this Handbook provide details of the progress that has been made in various parts of this problem. The methods involved include:
(a) First Principles Quantum Mechanics (QM) calculations to solve the Schrödinger equation, HΨ = iħ ∂Ψ/∂t, to predict the electronic, vibrational, excitation, and reactive properties of molecules, surfaces, and solids. The elements at this level are the wavefunctions Ψ describing the electrons; Ψ is a function of the coordinates of the N electrons in the system and of the N atom positions. There are various flavors of QM theory, as discussed below. However, it is not necessary to input any data about the system in order to predict the structures and properties. This allows us to predict the properties of novel materials never before synthesized or characterized. Unfortunately, practical use of these methods is limited to ∼10² atoms.
(b) Force Fields (FF) to describe the potential energy of the system in terms of the positions of the atoms, V(Ri, i = 1, . . . , N), where Ri is the 3D vector describing the location of atom i. Thus the electronic information in the QM description is captured in a far simpler form in terms of the atom locations (with the electron coordinates averaged out). Traditional FF were designed to describe the molecule or solid near equilibrium, making them useful for describing structures and vibrational levels, but not reactions or chemistry.
(c) Reactive Force Fields (ReaxFF). A major recent breakthrough is the ReaxFF force field [1–3], which provides an accurate description of reactive processes (including barriers) with a classical FF, allowing the simulation of complex materials and processes involving thousands of atoms (including chemical reactions, charge transfer, polarizabilities, and mechanical properties for metals, oxides, organics, and their interfaces). This is an essential part of the hierarchy of methods discussed above. The parameters for ReaxFF are obtained entirely from QM. Critical elements of ReaxFF are: a) a general model for electrostatics involving self-consistent charge transfer and atomic polarizability, in which the charge distribution is determined from the instantaneous environment of each atom and applied to all pairs [Goddard et al., 2002; Zhang et al., 2003; Rappé and Goddard, 1991] (a toy sketch of this charge-equilibration step is given at the end of this perspective); b) valence terms based on partial bond orders, allowing proper bond dissociation; c) Pauli repulsion and dispersion applied to all pairs (no exclusions). ReaxFF accurately describes various metals, oxides, and covalent systems and has been used for simulating shock-induced decomposition in RDX [Strachan et al., 2003]. We are excited that this will complement the methods discussed in this handbook by providing the chemical-reaction input for many important materials problems.
(d) Molecular Dynamics (MD) simulations. Given the FF, we can describe the dynamics of the system in terms of Newton's equations, F = Ma or −∇V = M ∂²R/∂t², properly modified to take into account temperature and pressure (a minimal integrator sketch is given after this methods list). With non-reactive FF, MD simulations are practical for systems with 10⁵–10⁷ atoms; for ReaxFF they are practical for ∼10⁵ atoms. This allows large-scale, ab initio-based force-field simulations on systems with millions of particles to predict the properties (yield strength, Peierls stresses, elasticity, electrical and thermal conductivity, dielectric loss, ferroelectric domain boundaries) relevant for materials design (phase fields, reaction models) to be input into mesoscale and continuum models.
(e) Mesoscale Dynamics (MesoD) (kinetic Monte Carlo, phase fields, chemical kinetics) of heterogeneous interfaces and structures, using parameters from the MD and predicting critical performance properties for continuum modeling. Here some cases may involve coarse-grained descriptions in which beads represent collections of atoms. In other cases (say, plasticity of metals) the mesoscale may involve only defects such as dislocations, with the atoms completely gone (of course, the atoms were there at the MD level to obtain the Peierls stresses, kink energies, and other quantities needed to describe plasticity).
(f) Macroscale and Continuum Modeling of real devices, to consider the packaging of the components and the effects of the environment. A model here is NEMO 3D [Klimeck, 2002], used successfully in the semiconductor industry to predict quantum transport in devices.
To obtain first-principles-based results for macroscale systems, we must ensure that each scale of simulation overlaps sufficiently with the finer description so that all input parameters and constitutive laws at each level of theory can be determined from more fundamental theory. Equally important, we must ensure that these relations are invertible, so that the results of coarse-level simulations can be used to suggest the best choices for finer-level parameters, which can in turn be used to suggest new choices of composition and structure.
Validation and error propagation. Essential to the design of materials is estimating the reliability (error bars) of the calculated results. Unfortunately, progress here has been slow, with much left to do in establishing methods to estimate likely uncertainties by comparing to finer, more accurate theory and by validating against available experimental data. Often in applications it is expedient to use less accurate levels than the best available in order to obtain answers quickly; this requires that the errors for the various levels of theory be estimated and propagated forward so that the designer knows what confidence to place in modified compositions and materials when optimizing performance. Of course, to be useful to designers, the simulation software must integrate the various computational methods (QM, FF, MD, MM, and FE) into a single framework so that designers can focus on the design issues. To provide this framework we developed the Computational Materials Design Facility (CMDF), which allows the multiscale software to be accessed transparently, with data automatically passed from one level to the next for the optimization and utilization of parameters.
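As the minimal integrator sketch promised in item (d): the following code integrates Newton's equations −∇V = M ∂²R/∂t² with the velocity-Verlet algorithm for two atoms coupled by a harmonic bond. The potential and parameters are toy stand-ins for a real force field, and the temperature/pressure controls mentioned above are omitted.

```python
import numpy as np

# illustrative harmonic bond standing in for a real force field
K, R0, M, DT = 1.0, 1.0, 1.0, 0.01   # stiffness, rest length, mass, time step

def forces(x):
    """Forces on two atoms from V = 0.5*K*(|r12| - R0)^2."""
    r12 = x[1] - x[0]
    d = np.linalg.norm(r12)
    f1 = -K * (d - R0) * r12 / d      # force on atom 1, along the bond
    return np.array([-f1, f1])        # Newton's third law for atom 0

# initial positions (bond stretched by 20%) and zero velocities
x = np.array([[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]])
v = np.zeros_like(x)
f = forces(x)

for step in range(1000):              # velocity-Verlet integration
    x = x + v * DT + 0.5 * (f / M) * DT**2
    f_new = forces(x)
    v = v + 0.5 * (f + f_new) / M * DT
    f = f_new
    if step % 200 == 0:
        ke = 0.5 * M * (v**2).sum()
        pe = 0.5 * K * (np.linalg.norm(x[1] - x[0]) - R0)**2
        print(f"t={step*DT:5.2f}  E={ke+pe:.6f}")   # total energy ~ conserved
```

The near-constant total energy in the printout is the usual check on the integrator; production MD replaces the toy potential with the full FF or ReaxFF energy expression and adds thermostat/barostat terms.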
Applications
A recent application using multiscale methods to design a new material is that of Deng et al. [8], in which a new, previously unsynthesized material capable of reversible binding of H2 up to 6% by weight (the DOE goal for 2010) was designed using FF developed from QM and using grand canonical MD to predict the pressure–temperature loading, including a viable synthetic strategy.
Other recent examples use a multiscale strategy to predict the 3D structures of membrane-bound proteins and to validate the structures by predicting the binding sites for agonists and antagonists [9, 10]. A recent application of the multiscale strategy (from first-principles QM through to modeling of the stress–strain behavior as a function of temperature and strain rate) is given in Cuitino et al. [11]. An implementation of the multiscale approach to rapid-throughput screening of catalysts is given in Muller et al. [12].
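Returning to item (c): the self-consistent charge-transfer ingredient of ReaxFF descends from the charge-equilibration (QEq) idea of Rappé and Goddard [6]. The following toy sketch shows only the core linear-algebra step (equalizing atomic chemical potentials subject to a total-charge constraint), with made-up electronegativities and hardnesses rather than calibrated QEq parameters, and with bare 1/r Coulomb couplings in place of the shielded integrals of the real method.

```python
import numpy as np

# Toy QEq-style charge equilibration for a 3-atom "molecule".
# All parameters are invented for illustration.
chi  = np.array([4.0, 2.5, 2.5])    # electronegativities (eV)
hard = np.array([7.0, 6.0, 6.0])    # atomic hardnesses (eV)
pos = np.array([[0.0, 0.0, 0.0],
                [3.0, 0.0, 0.0],
                [-3.0, 0.0, 0.0]])  # Angstrom
KE = 14.4                           # Coulomb constant (eV*Angstrom/e^2)

n = len(chi)
J = np.diag(hard)                   # self-interaction (hardness) on the diagonal
for i in range(n):
    for j in range(i + 1, n):       # bare 1/r coupling (real QEq shields this)
        J[i, j] = J[j, i] = KE / np.linalg.norm(pos[i] - pos[j])

# Equalize chemical potentials: chi_i + sum_j J_ij q_j = mu for all atoms i,
# subject to charge neutrality sum_i q_i = 0 (mu is a Lagrange multiplier).
A = np.block([[J, -np.ones((n, 1))],
              [np.ones((1, n)), np.zeros((1, 1))]])
b = np.concatenate([-chi, [0.0]])
q = np.linalg.solve(A, b)[:n]
print("equilibrated charges:", np.round(q, 3))   # electronegative atom goes negative
```

Because the charges depend on the instantaneous geometry through J, they are re-solved as atoms move, which is what makes the electrostatics in such schemes environment-dependent.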
References
[1] A.C.T. van Duin, S. Dasgupta, F. Lorant et al., "ReaxFF: A reactive force field for hydrocarbons," J. Phys. Chem. A, 105, 9396–9409, 2001.
[2] A.C.T. van Duin, A. Strachan et al., "ReaxFF SiO reactive force field for silicon and silicon oxide systems," J. Phys. Chem. A, 107, 3803–3811, 2003.
[3] A. Strachan, A.C.T. van Duin, D. Chakraborty et al., "Shock waves in high-energy materials: the initial chemical events in nitramine RDX," Phys. Rev. Lett., 91(9), art. no. 098301, 2003.
[4] W.A. Goddard III, Q. Zhang, M. Uludogan et al., "The ReaxFF polarizable reactive force fields for molecular dynamics simulation of ferroelectrics," In: R.E. Cohen and T. Egami (eds.), Fundamental Physics of Ferroelectrics, 45–55, 2002.
[5] Q. Zhang, T. Cagin, A. van Duin et al., "Adhesion and nonwetting-wetting transition in the Al/alpha-Al2O3 interface," Phys. Rev. B, 69(4), art. no. 045423, 2004.
[6] A.K. Rappé and W.A. Goddard, "Charge equilibration for molecular dynamics simulations," J. Phys. Chem., 95, 3358–3363, 1991.
[7] G. Klimeck, F. Oyafuso, T.B. Boykin, R.C. Bowen, and P.V. Allmen, Comput. Modeling Eng. Sci., 3, 5, 601–642, 2002.
[8] W.Q. Deng, X. Xu, and W.A. Goddard, "New alkali doped pillared carbon materials designed to achieve practical reversible hydrogen storage for transportation," Phys. Rev. Lett., 92(16), art. no. 166103, 2004.
[9] M. Yashar, S. Kalani, N. Vaidehi et al., "The predicted 3D structure of the human D2 dopamine receptor and the binding site and binding affinities for agonists and antagonists," PNAS, 101(11), 3815–3820, 2004.
[10] P.L. Freddolino, M.Y.S. Kalani, N. Vaidehi et al., "Predicted 3D structure for the human beta 2 adrenergic receptor and its binding site for agonists and antagonists," PNAS, 101(9), 2736–2741, 2004.
[11] A.M. Cuitino, L. Stainier, G. Wang et al., "A multiscale approach for modeling crystalline solids," J. Comput. Aided Mater. Des., 8, 127–149, 2001.
[12] R.P. Muller, D.M. Philipp, and W.A. Goddard III, "Quantum mechanical – rapid prototyping applied to methane activation," Top. Catal., 23, 81–98, 2003.
Perspective 10
AN APPLICATION ORIENTED VIEW ON MATERIALS MODELING
Peter Gumbsch
Institut für Zuverlässigkeit von Bauteilen und Systemen izbs, Universität Karlsruhe (TH), Kaiserstr. 12, 76131 Karlsruhe, Germany, and Fraunhofer Institut für Werkstoffmechanik IWM, Wöhlerstr. 11, 79194 Freiburg, Germany
Modeling and simulation have become a major part of materials science and engineering, in academia as well as in industrial research and development. Materials-oriented modeling and simulation, however, is not a well-established, monolithic area. It encompasses all the tools which physicists, chemists, mechanical engineers, and materials scientists have developed over the years to describe materials as such or their behavior. One usually distinguishes three distinct areas: materials modeling, process simulation, and component simulation. Materials modeling is directed towards the simulation of the material itself: its evolution, changes in its internal structure, and its properties. Almost all contributions in this volume are directly oriented towards this part of materials modeling. The final goal of all materials modeling is a better understanding of materials behavior and the development of simple models which accurately describe it. These models are the basis for all other materials-oriented modeling and simulation. Process simulation aims at describing the synthesis of a material and its further processing into a component. It takes into account the specific processing parameters and is often used to optimize individual processing steps. Component simulation aims at the characterization and evaluation of the properties of components or entire devices. This includes their behavior in service, possible changes during service, and their lifetime. In industry, computer simulation is used to reduce the number of cycles of trial and error during development and to transfer these cycles from the lab to the computer. Before the insertion of products into the market, computer simulation is also used to reduce the costly testing of prototypes to a few well-selected cases, as in the crash testing of cars.
Recent developments in process simulation aim at considering not only a single process but the chain of consecutive processing steps and the influence of one on the other. The need for the latter comes from the increasing demand for higher precision, which requires inhomogeneities in the material to be correctly accounted for. Such inhomogeneities are usually associated with alterations in the microstructure of the material, which in turn have their origin in locally different processing conditions in the previous processing step. Microstructural information is therefore explicitly or implicitly transferred when transferring locally different processing conditions. The need to couple simulations of the individual steps of a process chain also arises because potential improvements in complex production processes can only be exploited if the entire process chain is considered as a whole. Whatever the particular reason for modeling the entire process chain, the value added is due to the use of information which has already been acquired at a previous step. Making this information useful is highly non-trivial. In practical applications it often constitutes a major effort, since it requires that implicitly available information (e.g., the cooling rate in casting) be explicitly transformed into the relevant characteristics of the microstructure (e.g., the distribution of grain sizes and precipitates). The importance and the potential of the simulation of process chains is illustrated with the example of the production of a geometrically complex aluminum oxide ceramic component. (Details can be found in [1] and references therein.) The geometry of the component is predetermined by the design. The essential process steps are the filling of the powder into the die, the pressing of the green body, and the sintering of the ceramic. The difficulty lies in sintering, the last processing step, during which the component shrinks and bends and is often left with locally varying residual porosity. Differences in density in the green body as a consequence of the pressing, shown in Fig. 1, are responsible for this.
Figure 1. Simulation of the pressing process of an alumina ceramic and the resulting density distribution in the green body [4].
The changes in geometry during sintering have to be anticipated and counteracted through alterations of the die and/or the pressing route. Such alterations can ideally be simulated. The density distribution after pressing can be transferred directly to the simulation of sintering, and the outcome of this simulation can be compared directly against the specifications. Particularly valuable in this case is the possibility of inverting the order of these simulations. This allows questions to be addressed such as whether and how the desired design can be achieved with an existing tool by changes to the pressing alone. The experience with simulations of powder metallurgical processing steps is extremely positive. The results match experimental findings to a very good degree. Remaining differences can often be traced back to inadequate knowledge of the first process step, the filling of the die, which may give an inhomogeneous density distribution before pressing. The simulation of filling processes of granular media is being worked on intensively [2]. It makes use of discrete (atomistic) simulation methods similar to the large-scale MD methods described in this volume.
Recent developments in component simulation aim at accounting for the actual conditions in service, instead of the often much simplified testing conditions. They also aim at taking into account local inhomogeneities in the material, which of course result from the processing of the material. A particularly memorable example of locally different materials properties due to processing history is known to everyone who has ever tried to straighten a crooked nail. A similar experience can be gained if one tries to bend a paper clip back into a straight wire. One experiences local differences in strain hardening which make it quite difficult to bend the clip back at precisely the location where it had previously been bent. This is shown in Fig. 2.
Figure 2. Bending a copper paper clip back usually does not lead to a straight wire due to local differences in hardening.
In practical applications such local differences in strain hardening also lead to locally different elastic stresses in a component during forming, which after removal of the forming tool lead to very different spring-back behavior. The difficulty in correctly simulating the spring-back behavior originates from the difficulties in correctly describing the flow stress and the work hardening. Often the flow stress may be significantly lower when the direction of deformation is reversed. Changing the mode of deformation can lead to even more complex conditions, and parts of the hardening can also relax with time (depending on temperature). Altogether this results in a huge parameter space which can hardly be covered experimentally. In practical applications one partly circumvents this problem by fitting advanced material laws to a few experiments specifically designed to test different regions of parameter space (e.g., [3]). A strict reference to the microstructure of the material cannot, however, be drawn today. Materials modeling has seen enormous advances in multiscale simulation in recent years. Simulating materials properties and developing materials models often demands access to very different time and length scales.
Sticking with the example of plastic deformation, one first has to consider the grain structure with which the dislocations interact. Typical grain sizes are millimeters and below. Depending on the type of deformation, the dislocations arrange themselves in sub-grain structures of micrometer dimensions, and the development of these dislocation structures is substantially determined by the cutting processes of dislocations, which occur at the atomic scale. All these different processes cannot be described with one particular simulation tool. Thus different methods are established at different scales, from crystal plasticity via dislocation dynamics simulation to atomistic methods, all described in this volume. Until recently, all these methods were safely separated in space and time, but the enormous advances in computing power and the development of dedicated hybrid simulation methods which couple several of the methods have helped to close these gaps. The first atomistic simulations of dislocation intersections became possible five years ago [4, 5]. Of current interest are atomistic simulations that realistically describe the interaction between dislocations and grain boundaries [6, 7] and the details of dislocation–dislocation interactions and their consequences for strain hardening [8, 9]. Simulating the evolution of a multitude of dislocations is still a formidable task. However, recent advances in three-dimensional discrete dislocation simulations now enable the simulation of simple strain paths and small deformations.
Soon these simulations will reach large enough deformations in non-trivial geometries and thereby enable statistical analyses along the lines of recent two-dimensional studies [10], as well as analyses of dislocation density evolution for more complicated strain paths. Such statistical methods could in principle directly help to determine the flow stress of a material in classical engineering simulations. They will certainly aid the development of dislocation field theories, which will enable one to do so. These topics are currently under investigation, for example, in the research network SizeDepEn [11]. Even if the emphasis of this commentary is on aspects of the mechanical properties of materials, multiscale simulation is not limited to them. Various other examples can be found, for example, in reference [12]. A common trend in all the different aspects of computational materials science discussed so far is the attempt to include microstructural aspects of the material. Advanced materials modeling must in this respect provide the basis for the other disciplines. The drive may, however, come from the users in component and process simulation, for whom a physically based integral materials simulation is the goal, which will give them a qualitatively new basis for product design. So far the development of a new component is performed sequentially through the design, processing, and component evaluation phases, as schematically displayed in Fig. 3. Starting with a proposal for the component design, a first simple component evaluation can be done using standard component simulation tools. Next, the possibilities for processing are evaluated using process simulation tools. Accurate evaluation and simulation of the properties of the component and detailed assessment with respect to the specifications are, however, only possible after the production of a few prototypes, which have the precise microstructure that results from the particular processing steps chosen to produce the component. If intolerable deviations from the specifications occur, the entire design cycle has to be repeated. In general, an inversion of the design process is not possible today. Consequently, questions like "What type of microstructure is needed in which part of the component to fulfill the specifications?" and "How is the processing to be done to reach this microstructure?" cannot be answered today. Component design on the basis of an integral materials simulation which uses transferable, microstructure-based materials models at all steps will in the future allow precisely that. Besides qualitatively and quantitatively improved simulations of the specific steps, integral materials simulation will at least partly allow one to invert the design steps and to back-calculate how processing needs to be done to reach the desired properties. In full generality the picture drawn here is still futuristic. In particular cases, however, and this includes several of the aforementioned examples, this scenario is reachable in the near future if the still separate disciplines of materials, process, and component simulation come together on common subjects.
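As a toy illustration of the direction-dependent flow stress behind the paper-clip example above, the following sketch integrates a one-dimensional elastoplastic model with linear kinematic hardening through a load reversal. The parameters are arbitrary, and fitted industrial material laws such as those referred to in [3] are far richer; the point is only that, on reversal, yielding restarts at a stress whose magnitude is well below the forward flow stress (the Bauschinger effect).

```python
# 1D elastoplasticity with linear kinematic hardening (return mapping).
# All parameters are arbitrary illustrative numbers, not a fitted law.
E, H, SIG_Y = 200e3, 20e3, 250.0    # elastic modulus, hardening modulus, yield stress (MPa)

def simulate(strain_path):
    eps_p, back = 0.0, 0.0          # plastic strain and backstress
    history = []
    for eps in strain_path:
        sig_tr = E * (eps - eps_p)              # elastic trial stress
        f = abs(sig_tr - back) - SIG_Y          # yield function
        if f > 0.0:                             # plastic correction
            sign = 1.0 if sig_tr > back else -1.0
            dgamma = f / (E + H)
            eps_p += dgamma * sign
            back += H * dgamma * sign
        history.append((eps, E * (eps - eps_p), back))
    return history

# load to +1% strain, then reverse to -1%
n = 100
path = [0.01 * i / n for i in range(n + 1)] \
     + [0.01 - 0.02 * i / n for i in range(1, n + 1)]
hist = simulate(path)
eps1, sig1, back1 = hist[n]
print("forward flow stress at +1%% strain: %+.0f MPa" % sig1)
print("reverse yielding begins near:       %+.0f MPa" % (back1 - SIG_Y))
```

With these numbers the forward flow stress is roughly four times larger in magnitude than the stress at which reverse yielding begins, which is exactly the kind of history dependence that makes spring-back so hard to predict.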
Figure 3. Component design has so far relied on a sequence of steps from the geometrical design and specification to assessment of the processing steps and component evaluation. Each of these steps is aided by simulation. Today, integral modeling of the genesis of a component on a common simulation basis is not yet available. Integral materials simulation which takes into consideration the microstructure of the material in the individual processing steps and carries it over to component simulation will enable much more precise simulation and will help to partly reverse the design process.
References
[1] T. Kraft and H. Riedel, Powder Metall., 45, 227–231, 2002.
[2] H.J. Herrmann, J.-P. Hovi, and S. Luding (eds.), Physics of Dry Granular Media, Kluwer Academic Publishers, Netherlands, 1998.
[3] A.A. Krasowsky et al., in: K.J. Bathe (ed.), Computational Fluid and Solid Mechanics, Elsevier Science, New York, pp. 403–406, 2003.
[4] S.J. Zhou et al., Science, 279, 1525–1527, 1998.
[5] P. Gumbsch, Science, 279, 1489–1490, 1998.
[6] J. Schiotz and K.W. Jakobsen, Science, 301, 1357–1359, 2003.
[7] H. van Swygenhoven et al., Nat. Mater., 3, 399–403, 2004.
[8] R. Madec et al., Science, 301, 2003.
[9] P. Gumbsch, Science, 301, 1857–1858, 2003.
[10] M. Zaiser, M.-C. Miguel, and I. Groma, Phys. Rev. B, 64, 224102, 2001.
[11] www.sizedepen.net
[12] MRS Bulletin, 26/3, 169–221, 2001.
Perspective 11
THE ROLE OF THEORY AND MODELING IN THE DEVELOPMENT OF MATERIALS FOR FUSION ENERGY
Nasr M. Ghoniem
Mechanical and Aerospace Engineering Department, University of California, Los Angeles, CA 90095-1597, USA
The environmental and operational conditions of First Wall/Blanket (FW/B) structural materials in fusion energy systems are undoubtedly amongst the harshest in any technological application. These materials must operate reliably for extended periods of time without maintenance or repair. They must withstand the assaults of high particle and heat fluxes, as well as significant thermal and mechanical forces. Comparable conditions have not been experienced in other technologies, with possible exceptions in aerospace and defense applications. Moreover, the most significant dilemma here is that the actual operational environment cannot be experimentally established today, with all of the synergistic considerations of neutron spectrum, radiation dose, heat and particle flux, and gigantic FW/B module sizes. Because of these considerations, we may rely on a purely empirical and incremental boot-strapping approach (as in most human developments so far), on an approach based on data generation from non-prototypical setups (e.g., small samples, fission spectra, ion irradiation, etc.), or on a theoretical/computational methodology. The first approach would have been the most direct had it not been for the unacceptable risks in the construction of successively larger and more powerful fusion machines, learning from one how to do it better for the next. The last approach (theory and modeling alone) is not a very viable option, because we are not now in a position to predict materials behavior in all its aspects from purely theoretical grounds. The empirical, extrapolative approach has also proved itself to be very costly, because we cannot practically cover all types of material compositions, sizes, neutron spectra, temperatures, irradiation times, fluxes, etc. Major efforts had to be scrapped because of our inability to encompass all of these variations simultaneously. While all three approaches must be considered for the
development of fusion materials, the multi-scale materials modeling (MMM) framework that we propose here can provide tremendous advantages if coupled with experimental verification at every relevant length scale. A wide range of structural materials has been considered over the past 25–30 years for fusion energy applications [1]. This list includes conventional materials (e.g., austenitic stainless steel), low-activation structural materials (ferritic/martensitic steels, V-4Cr-4Ti, and SiC/SiC composites), oxide dispersion strengthened (ODS) ferritic steels, conventional high-temperature refractory alloys (Nb, Ta, Cr, Mo, W alloys), titanium alloys, Ni-based superalloys, ordered intermetallics (TiAl, Fe₃Al, etc.), high-strength, high-conductivity copper alloys, and various composite materials (C/C, metal-matrix composites, etc.). Numerous factors must be considered in the selection of structural materials, including material availability, cost, fabricability, joining technology, unirradiated mechanical and thermophysical properties, radiation effects (degradation of properties), chemical compatibility and corrosion issues, safety and waste disposal aspects (decay heat, etc.), and nuclear properties (impact on tritium breeding ratio, solute burnup, etc.). Strong emphasis has been placed within the past 10–15 years on the development of three reduced-activation structural materials: ferritic/martensitic steels containing 8–12% Cr, vanadium-base alloys (e.g., V-4Cr-4Ti), and SiC/SiC composites. Recently there has also been increasing interest in reduced-activation ODS ferritic steels. Additional alloys of interest for fusion applications include copper alloys (CuCrZr, Cu-NiBe, dispersion-strengthened copper), tantalum-base alloys (e.g., Ta-8W-2Hf), niobium alloys (Nb-1Zr), molybdenum, and tungsten alloys. In the following, we give a brief analysis of the most limiting mechanical properties based on our earlier work [1].
1. Lower Operating Temperature Limits
The lower temperature limits for FW/B structural materials (i.e., excluding copper alloys) are strongly influenced by radiation effects. For body-centered cubic (BCC) materials such as ferritic-martensitic steels and the refractory alloys, radiation hardening at low temperatures can lead to a large increase in the ductile-to-brittle transition temperature (DBTT) [2, 3]. For SiC/SiC composites, the main concerns at low temperatures are radiation-induced amorphization (with an accompanying volumetric swelling of ∼11%) [4] and radiation-induced degradation of thermal conductivity. The radiation hardening in BCC alloys at low temperatures (<0.3 T_M) is generally pronounced, even for doses as low as ∼1 dpa [3]. The amount of radiation hardening typically decreases rapidly with irradiation temperature above 0.3 T_M, and the radiation-induced increase in the DBTT may be anticipated to be acceptable at temperatures above ∼0.3 T_M. A Ludwig–Davidenkov relationship between hardening
and embrittlement was used to estimate the DBTT shift with increased irradiation dose. In this model, brittle behavior occurs when the temperature-dependent yield strength exceeds the cleavage stress. It is worth noting that operation at lower temperatures (i.e., within the embrittlement temperature regime) may be allowed for some low-stress fusion structural applications (depending on the value of the operational stress intensity factor relative to the fracture toughness). Numerous studies have been performed to determine the radiation hardening and embrittlement behavior of ferritic-martensitic steels. The hardening and DBTT shift are dependent on the detailed composition of the alloy. For example, the radiation resistance of Fe-9Cr-2WVTa alloys appears to be superior (less radiation hardening) to that of Fe-9Cr-1MoVNb. The radiation hardening and DBTT shift appear to approach saturation values following low-temperature irradiation to doses above 1–5 dpa, although additional high-dose studies are needed to confirm this apparent saturation behavior. At higher doses under fusion conditions, the effects of He bubble accumulation on radiation hardening and DBTT need to be addressed. Experimental observations revealed brittle behavior (K_IC ∼ 30 MPa·√m) in V-(4–5)%Cr-(4–5)%Ti specimens irradiated and tested at temperatures below 400 °C. From a comparison of the yield strength and Charpy impact data of unirradiated and irradiated V-(4–5)%Cr-(4–5)%Ti alloys, brittle fracture occurs when the tensile strength is higher than 700 MPa. Therefore, 400 °C may be adopted as the minimum operating temperature for V-(4–5)%Cr-(4–5)%Ti alloys in fusion reactor structural applications [5]. Further work is needed to assess the impact (if any) of fusion-relevant He generation rates on the radiation hardening and embrittlement behavior of vanadium alloys. Very little information is available on the mechanical properties of irradiated W alloys. Tensile elongations of ∼0 have been obtained for W irradiated at relatively low temperatures of 400 and 500 °C (0.18–0.21 T_M) and fluences of 0.5–1.5 × 10²⁶ n/m² (<2 dpa in tungsten) [6]. Severe embrittlement (DBTT ≥ 900 °C) was observed in un-notched bend bars of W and W-10%Re irradiated at 300 °C to a fluence of 0.5 × 10²⁶ n/m² (<1 dpa). Since mechanical properties data are not available for pure tungsten or its alloys irradiated at high temperatures, an accurate estimate of the DBTT versus irradiation temperature cannot be made. The minimum operating temperature which avoids severe radiation hardening embrittlement is expected to be 900 ± 100 °C.
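To make the Ludwig–Davidenkov construction concrete, the short Python sketch below estimates a DBTT shift by locating the temperature at which an irradiation-hardened yield stress falls below an assumed temperature-independent cleavage stress. All functional forms and parameter values here are hypothetical placeholders chosen only to illustrate the mechanics of the estimate; they are not fitted to any alloy discussed above.

    import numpy as np
    from scipy.optimize import brentq

    SIGMA_CLEAVAGE = 1200.0  # MPa; assumed temperature-independent cleavage stress

    def yield_stress(T, delta_sigma_irr):
        """Yield stress (MPa) at temperature T (K): an athermal part, a
        thermally activated part, and athermal irradiation hardening.
        Illustrative form and numbers only."""
        return 300.0 + 1500.0 * np.exp(-T / 150.0) + delta_sigma_irr

    def dbtt(delta_sigma_irr):
        """Temperature (K) where the yield stress crosses the cleavage stress."""
        return brentq(lambda T: yield_stress(T, delta_sigma_irr) - SIGMA_CLEAVAGE,
                      1.0, 2000.0)

    # DBTT shift produced by 200 MPa of assumed radiation hardening
    print(f"DBTT shift: {dbtt(200.0) - dbtt(0.0):.0f} K")

In this toy model the hardened curve crosses the cleavage stress at a higher temperature, so the estimated DBTT shift is positive, as observed for the irradiated BCC alloys discussed above.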
2. Upper Operating Temperature Limits
The upper temperature limit for structural materials in fusion reactors may be controlled by four different mechanisms (in addition to safety considerations): thermal creep, high-temperature helium embrittlement, void swelling, and
compatibility/corrosion issues. Void swelling is not anticipated to be significant in ferritic-martensitic steel [7] or V-Cr-Ti alloys [8] up to damage levels in excess of 100 dpa, although swelling data with fusion-relevant He:dpa generation rates are needed to confirm this expectation and to determine the lifetime dose associated with void swelling. The existing fission reactor database on high-temperature (Mo, W, Ta) refractory alloys (e.g., [6]) indicates low swelling (<2%) for doses up to 10 dpa or higher. Radiation-enhanced recrystallization (potentially important for stress-relieved Mo and W alloys) and radiation creep effects (for which data on the refractory alloys and SiC are lacking) need to be investigated. Void swelling is considered to be of particular importance for SiC (and also Cu alloys, which were shown to be unattractive fusion structural materials [1]). An adequate experimental database exists for thermal creep of ferritic-martensitic steels [7] and the high-temperature (Mo, W, Nb, Ta) refractory alloys [10]. Oxide-dispersion-strengthened ferritic steels offer significantly higher thermal creep resistance compared to ferritic-martensitic steels [11], with a steady-state creep rate at 800 °C as low as 3 × 10⁻¹⁰ s⁻¹ for an applied stress of 140 MPa. The V-4Cr-4Ti creep data suggest that the upper temperature limit lies between 700 and 750 °C, although strengthening effects associated with the pickup of 200–500 ppm oxygen during testing still need to be examined. The predicted thermal creep temperature limit for advanced crystalline SiC-based fibers is above 1000 °C [12]. One convenient method to determine the dominant creep process for a given stress and temperature is to construct an Ashby deformation map. Using the established constitutive equations for grain boundary sliding (Coble creep), dislocation creep (power-law creep) and self-diffusion (Nabarro–Herring) creep, the dominant deformation-mode regimes can be established [1].
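As an illustration of the Ashby-map construction just described, the sketch below compares textbook-style rate equations for Coble, Nabarro–Herring, and power-law creep and reports which mechanism dominates at each point of a small stress-temperature grid. The prefactors, activation energies, stress exponent, and grain size are invented placeholders, not data for any of the alloys above.

    import numpy as np

    kB = 8.617e-5  # Boltzmann constant, eV/K

    def coble(sigma, T, d):
        """Grain-boundary diffusion (Coble) creep: rate ~ sigma / d^3."""
        return 1e-2 * sigma / d**3 * np.exp(-1.5 / (kB * T))

    def nabarro_herring(sigma, T, d):
        """Lattice diffusion (Nabarro-Herring) creep: rate ~ sigma / d^2."""
        return 1e-4 * sigma / d**2 * np.exp(-2.5 / (kB * T))

    def power_law(sigma, T):
        """Dislocation (power-law) creep: rate ~ sigma^n with n = 5 assumed."""
        return 1e-20 * sigma**5 * np.exp(-2.5 / (kB * T))

    def dominant(sigma, T, d=1e-5):
        """Mechanism with the largest strain rate at (sigma, T), grain size d (m)."""
        rates = {"Coble": coble(sigma, T, d),
                 "Nabarro-Herring": nabarro_herring(sigma, T, d),
                 "power-law": power_law(sigma, T)}
        return max(rates, key=rates.get)

    for T in (600.0, 900.0, 1200.0):        # K
        for sigma in (1.0, 50.0, 300.0):    # MPa
            print(f"T = {T:6.0f} K, sigma = {sigma:5.0f} MPa -> {dominant(sigma, T)}")

A full deformation map is obtained by scanning a fine grid and drawing the boundaries where two of these rates become equal.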
3. Operating Temperature Windows
Figure 1 summarizes the operating temperature windows (based on thermal creep and radiation damage considerations) for nine structural materials considered by Zinkle and Ghoniem [1]. The temperature limits for Type 316 austenitic stainless steel are also included for the sake of comparison. In this figure, the light shaded regions on either side of the dark horizontal bands are an indication of the uncertainties in the temperature limits. Helium embrittlement may cause a reduction in the upper temperature limit, but sufficient data under fusion-relevant conditions are not available for any of the candidate materials. Due to a high density of matrix sinks, ferritic/martensitic steel appears to be very resistant to helium embrittlement [13]. An analysis of He diffusion kinetics in vanadium alloys predicted that helium embrittlement would be significant at temperatures above 700 °C [14].
Figure 1. Operating temperature windows (based on radiation damage and thermal creep considerations) for refractory alloys, Fe-(8-9%)Cr ferritic-martensitic steel, Fe-13%Cr oxide dispersion strengthened ferritic steel, Type 316 austenitic stainless steel, solutionized and aged Cu-2%Ni-0.3%Be, and SiC/SiC composites. The light shaded bands on either side of the dark bands represent the uncertainties in the minimum and maximum temperature limits.
The lower temperature limits in Fig. 1 for the refractory alloys and ferritic/martensitic steel are based on fracture toughness embrittlement associated with low-temperature neutron irradiation. An arbitrary fracture toughness limit of 30 MPa·√m was used as the criterion for radiation embrittlement. Further work is needed to determine the minimum operating temperature limit for oxide dispersion strengthened (ODS) ferritic steel. The value of 290 ± 40 °C used in Fig. 1 was based on results for HT-9 (Fe-12Cr ferritic steel). The minimum operating temperature for SiC/SiC was based on radiation-induced thermal conductivity degradation, whereas the minimum temperature limit for CuNiBe was simply chosen to be near room temperature. The low-temperature fracture toughness radiation embrittlement is not sufficiently severe to preclude using copper alloys near room temperature [15], although there will be a significant reduction in strain hardening capacity as measured by the uniform elongation in a tensile test. The high temperature limit was based on thermal creep for all of the materials except SiC and CuNiBe. Due to a lack of long-term (10 000 h), low-stress creep data for several of the alloy systems, a Stage II creep deformation limit of 1% in 1000 h for an applied stress of 150 MPa was used as an arbitrary criterion for determining the upper temperature limit associated with thermal creep. Further creep data are needed to establish the temperature limits for longer times and lower stresses in several of the candidate materials.
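The 1%-in-1000-h criterion translates directly into a bound on the steady-state creep rate, as the following short check shows; this is pure arithmetic on the numbers quoted above.

    # Stage II criterion: 1% strain in 1000 h at 150 MPa
    limit_rate = 0.01 / (1000 * 3600)   # allowed steady-state rate, 1/s
    print(f"{limit_rate:.2e} 1/s")      # ~2.8e-9 1/s

    # The ODS ferritic steel rate quoted earlier (3e-10 1/s at 800 C, 140 MPa)
    # lies roughly an order of magnitude below this bound.
    print(3e-10 < limit_rate)           # True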
4. Main Challenges
4.1. Deformation and Fracture
Problems of deformation and fracture stem from several phenomena that can render structural materials brittle at either low or high temperature. The interplay between these phenomena is complex, with the result that deformation and fracture properties depend on many intrinsic and extrinsic variables. We must therefore take environmental variables into account when we develop a database of materials properties, which can be functions of (T, dpa, dpa rate, He, H, stress, etc.). Since these properties involve many mechanisms at multiple scales, and sometimes small differences between large rates result in macroscopic changes (e.g., void swelling), we need to develop physically based property models. We also need a modeling-experiment integration strategy for validation of models at different scales. The models to be developed can be hierarchical, starting from atomic information all the way to the prediction of constitutive relationships and macroscopic fracture. The main crosscutting issues in deformation and fracture are:
1. Irradiation effects on stress-strain behavior, constitutive laws, and the consequences of flow localization;
2. Validity and physical basis of the Master Curve (MC) for predicting the ductile-to-brittle transition;
3. Embrittlement: MC shifts due to hardening and He effects;
4. Model-based designs for high-performance alloys;
5. Irradiation effects on constitutive properties: J2 laws linked to microstructure evolution;
6. Development of plasticity models for constitutive properties, for example bridging between dislocation dynamics, crystal plasticity, and polycrystalline plasticity;
7. Understanding flow localization and ductility loss of irradiated materials;
8. The apparent universality of the MC shape, and the physical basis for this universality;
9. Effects of helium on grain boundaries, and how these influence shifts in the DBTT;
10. Model-based design of alloys, for example including a high density of nano-clusters to trap helium in high-pressure bubbles and thus prevent it from going to grain boundaries.
4.2. Helium Effects
Several methods of modeling helium effects on irradiated materials have been developed over the past two decades. Atomistic MD simulations are
now being used more extensively to determine the energetics of binding and migration of various helium-vacancy complexes. Information on the fraction of residual defects has also been obtained from cascade simulations. Such atomic-level information is passed on to mesoscale simulations of microstructure evolution based on reaction rate theory. Most of these simulations have assumed that the microstructure is homogeneous in space and time. However, some of these assumptions have been relaxed, to include for example the effects of cascades on point defect diffusion, the formation of microstructure patterns, etc. One of the key advantages of rate theory is that the results of simulations can be directly compared to experiments, while the key parameters are obtained from either experiments or atomistic simulations. For example, KMC simulations can now be used to solve complex point defect diffusion problems in the stress field of dislocations, and thus derive more realistic values for the dislocation bias factors. At the same time, large systems of equations describing the nucleation and growth of void and bubble populations can be solved on present-day large-scale computers, thus providing more accurate descriptions of nucleation and growth. This level of detailed rate theory modeling is essential, because experiments show that several phenomena are influenced by helium in a complex fashion; for example, the swelling rate is not a monotonic function of the helium-to-dpa ratio. Likewise, the effects of small helium concentrations on grain boundary fracture depend on many details of the microstructure, while the effects of helium bubbles on hardening or embrittlement at low temperature are not yet clear.
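As a minimal illustration of the rate-theory approach described above, the sketch below integrates a mean-field balance for vacancy and interstitial concentrations with production, mutual recombination, and absorption at a fixed sink density. All parameter values are order-of-magnitude placeholders, not data for any material, and real cluster-dynamics codes evolve far larger equation sets covering void and bubble size distributions.

    import numpy as np
    from scipy.integrate import solve_ivp

    G = 1e-6                 # defect production rate (dpa/s), assumed
    Dv, Di = 1e-16, 1e-12    # vacancy / interstitial diffusivities (m^2/s), assumed
    k2 = 1e14                # total sink strength (1/m^2), assumed
    R = 1e2                  # recombination rate coefficient, assumed

    def rhs(t, y):
        """dc/dt for vacancy (cv) and interstitial (ci) concentrations."""
        cv, ci = y
        recomb = R * cv * ci
        return [G - recomb - Dv * k2 * cv,
                G - recomb - Di * k2 * ci]

    sol = solve_ivp(rhs, (0.0, 1e4), [0.0, 0.0], method="LSODA")
    print("quasi-steady (cv, ci):", sol.y[:, -1])

Because the interstitials reach their sinks much faster than the vacancies, the vacancy concentration builds up to a much higher quasi-steady value; it is this kind of imbalance that ultimately drives void nucleation and growth.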
4.3. Radiation Stability of Alloys
Real alloys are made of major and minor components. While the fate of minor elements under irradiation can be handled, in principle at least, with the same tools as point defects (i.e., bookkeeping of the mean or local concentration as a function of time), such is not the case for major alloy components. In particular, the cluster dynamics technique (rate theory) fails, because of percolation problems. A small community is working to develop a theoretical framework for assessing the stability of stationary phases under irradiation. At the present time, it is acknowledged that this stability depends on the temperature, composition, irradiation flux, and "cascade size". This implies that both the spatial extension of the cascade and the number of replacements per cascade are important factors. For the overall approach to be justified, the evolution of the precipitate population under irradiation must be fast compared to that of the defect sink structure (dislocation network, defect aggregates of various forms). Such is indeed the case, since the latter evolves at a rate proportional to the small difference between the vacancy and the interstitial fluxes
at sinks, two large quantities; the precipitates, by contrast, grow or shrink because of the coupling of the solute flux with the two above fluxes, an additive process.
5. Modeling Research Needs

5.1. Interatomic Potentials for Radiation Damage
The crucial properties for radiation damage simulations are:
(1) point defect formation, migration, and interaction energies;
(2) elastic constant anisotropy;
(3) grain boundary energetics;
(4) dislocation structure and response to stress;
(5) alloy phase stability.
None of these properties is automatically correct as a result of the physical basis of potentials. With pair-wise interactions, some are necessarily wrong. With many-body potentials (used here as a generic term covering glue, Finnis–Sinclair, embedded-atom, modified embedded-atom, and effective-medium-theory potentials) many can be fitted, provided "correct" values are available. These types of potentials have been the state of the art for twenty years. Historically, there has been an insufficient database for robust potential fitting: not all of this data is available experimentally for parameterization and verification of potentials. Recent renewed interest in interatomic potentials is based on the ability of ab initio calculations to provide this missing data; with teraflop machines, verification of predictions is finally possible. Where tested against new data, existing potentials have generally proved disappointing. Some common problems include poor interstitial formation energies, energy differences between configurations that are too small, and no satisfactory description of the austenitic-ferritic transition. Some of these problems can be traced to problems in parameterization of the potentials and have been addressed in recent work by simple reparameterization. Others, such as the absence of a physically sensible treatment of magnetization, point to more fundamental problems in the many-body potential concept. The majority of the effort in potentials for metallic phases has focused on elemental materials. Potentials for multi-component systems have been developed in isolated cases, but the predictive capability of these potentials is typically disappointing. Reasonable models for the mission-critical helium impurities exist, the inertness of helium making its behavior in MD somewhat insensitive to parameterization. There are two challenges in the development of potentials for alloy systems. First, there is generally much less data available, though this can be rectified through the use of ab initio methods.
Second, the appropriate functional forms are not as well developed. Most potentials are based on simple pictures of bonding. In alloy systems, the nature of the bonding is inherently more complex, suggesting that more sophisticated potentials are needed to describe the energetics reliably. Non-metallic impurities (carbon, phosphorus) are more problematic.
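The fitting workflow alluded to above, i.e., parameterizing an assumed functional form against an ab initio database, can be sketched in a few lines. Here the "database" is a set of synthetic pair energies standing in for DFT results, and a Morse form is the candidate potential; both choices are illustrative only.

    import numpy as np
    from scipy.optimize import least_squares

    # Synthetic targets standing in for ab initio pair energies (arbitrary units)
    r_data = np.linspace(2.0, 5.0, 40)
    e_data = 4.0 * ((2.6 / r_data)**12 - (2.6 / r_data)**6)

    def morse(params, r):
        """Candidate pair-potential form: Morse with depth D, width a, minimum r0."""
        D, a, r0 = params
        return D * (np.exp(-2.0 * a * (r - r0)) - 2.0 * np.exp(-a * (r - r0)))

    def residuals(params):
        return morse(params, r_data) - e_data

    fit = least_squares(residuals, x0=[1.0, 1.5, 2.8])
    print("fitted (D, a, r0):", fit.x)

A realistic fit would include forces, elastic constants, and defect energies in the residual vector, weighted by importance, which is precisely where the insufficient databases mentioned above have historically hurt.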
5.2. Dislocation Interactions & Dynamics
One of the critical problems for the development of radiation-resistant structural materials is embrittlement, loss of ductility, and plastic flow localization. Modeling the interaction between dislocations and radiation-induced obstacles is providing great insight into the physics of this problem, and will eventually lead to the design of radiation-resistant structural alloys. Models of dislocation-defect interactions are pursued at two levels: (1) the atomistic level, where MD simulations are playing a significant role; and (2) the mesoscopic level, where DD simulations are providing insights into larger-scale behavior. The two types of models are complementary, and provide direct information for experimental validation of the effects of irradiation on hardening, yield drop, plastic flow localization, etc. Atomic-scale models are used to "inform" DD models on the details of dislocation-defect interactions. Presently, MD models can simulate 1–10 million atoms on a routine basis. Both static and dynamic simulations are used. For static simulations, fixed displacement boundary conditions are applied, and conjugate gradient minimization is used. For dynamic simulations, on the other hand, Newtonian equations of motion are solved, and either force or velocity conditions are applied on boundary atoms. Atomistic simulations have shown the range where elasticity estimates are valid for dislocation-defect interactions, and where they break down due to new mechanisms. For example, the interaction of dislocations with small precipitates can result in local phase transitions and an associated energy cost that cannot be predicted from DD models. Also, it has been shown that dislocation-void interaction leads to dislocation climb, and to the formation of a dislocation dipole before the dislocation completely cuts through the void. These effects are all of an atomic nature, and the information should be passed on to DD simulations. A number of challenges remain in the area of dislocation-defect interactions, as described below:
(1) The strain rates in MD simulations are far in excess of experimentally achievable rates, and methods to incorporate slow rate events due to temperature or force field fluctuations have not yet been developed.
(2) The information passing between MD and DD is not systematic yet. For example, the "angle" between dislocation arms before the dislocation leaves the
obstacle is often used in DD simulations as a measure of obstacle strength. However, the definition of this angle in both MD and experiments is problematic for a variety of reasons. Force-displacement information will be necessary.
(3) Methods for incorporating lower-length-scale microstructure effects into DD simulations are not well developed. For example, we do not have information on obstacle dynamics, solute effects, dislocations near cracks, dislocation nucleation, etc.
(4) The size of atomistic simulations is very small, and cannot deal with complex dislocation structures. Methods for reducing the degrees of freedom are needed.
(5) The boundary conditions used in MD simulations are either periodic, fixed, or represented by elastic Green's functions. General methods for embedding MD simulations into the continuum are in an early stage of development.
(6) DD codes are limited to small crystals. To improve their speed and range of applicability, new methods of designing these codes on massively parallel computers are needed.
(7) The connection between DD and macroscopic plasticity has not yet been made through "coarse graining" and a systematic reduction of the degrees of freedom. Development of this area is essential to the prediction of constitutive relations and macroscopic plastic deformation.
References
[1] S. Zinkle and N. Ghoniem, "Operating temperature windows for fusion reactor structural materials," Fusion Engineering and Design, 2000.
[2] R. Klueh and D. Alexander, "Embrittlement of Cr-Mo steels after low fluence irradiation in HFIR," J. Nucl. Mater., 218, 151, 1995.
[3] M. Rieth, B. Dafferner, and H.-D. Rohrig, "Embrittlement behaviour of different international low activation alloys after neutron irradiation," J. Nucl. Mater., 258–263, 1147, 1998.
[4] L. Snead, S. Zinkle, J. Hay, and M. Osborne, "Amorphization of SiC under ion and neutron irradiation," Nucl. Instr. Methods B, 141, 123, 1998.
[5] S. Zinkle et al., "Research and development on vanadium alloys for fusion applications," J. Nucl. Mater., 258–263, 205, 1998.
[6] F. Wiffen, "Effects of irradiation on properties of refractory alloys with emphasis on space power reactor applications," Proc. Symp. on Refractory Alloy Technology for Space Nuclear Power Applications, CONF-8308130, Oak Ridge National Lab, p. 254, 1984.
[7] D. Gelles, "Microstructural examination of commercial ferritic alloys at 200 dpa," J. Nucl. Mater., 233–237, 293, 1996.
[8] B. Loomis and D. Smith, "Vanadium alloys for structural applications in fusion systems: a review of vanadium alloy mechanical and physical properties," J. Nucl. Mater., 191–194, 84, 1992.
[9] K. Shiba, A. Hishinuma, A. Tohyama, and K. Masamura, "Properties of low activation ferritic steel F82H IEA heat: interim report of IEA round-robin tests (1)," Japan Atomic Energy Research Institute Report JAERI-Tech.
[10] H. McCoy, "Creep properties of selected refractory alloys for potential space nuclear power applications," Oak Ridge National Lab Report ORNL/TM-10127.
[11] P. Maziasz et al., "New ODS ferritic alloys with dramatically improved high-temperature strength," J. Nucl. Mater., Proc. 9th Int. Conf. Fusion Reactor Materials, 1999.
[12] G. Youngblood, R. Jones, G. Morscher, and A. Kohyama, "Creep behavior for advanced polycrystalline SiC fibers," Fusion Materials Semiann. Prog. Report for period ending June 30, 1997, DOE/ER-0313/22, Oak Ridge National Lab, p. 81, 1997.
[13] H. Schroeder and H. Ullmaier, "Helium and hydrogen effects on the embrittlement of iron and nickel-based alloys," J. Nucl. Mater., 179–181, 118, 1991.
[14] A. Ryazanov, V. Manichev, and W. van Witzenburg, "Influence of helium and impurities on the tensile properties of irradiated vanadium alloys," J. Nucl. Mater., 227–263, 304, 1996.
[15] D. Alexander, S. Zinkle, and A. Rowcliffe, "Fracture toughness of copper-base alloys for fusion energy applications," J. Nucl. Mater., 271&272, 429, 1999.
Perspective 12
WHERE ARE THE GAPS?
Marshall Stoneham
Centre for Materials Research, and London Centre for Nanotechnology, Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT, UK
Reading a Handbook like this gives a vivid picture of the enormous vigour and power of materials modelling. One is tempted to believe that we can answer all the questions materials technology might pose. Even if that were partly true, we should be identifying just what we do not know how to do. Some gaps will depend on new hardware and software, especially when modelling quantum systems. Some gaps will be recognised only after some social or technological change has brought them into focus. Among the developments likely to stimulate innovation could be novel nanoelectronics, or the fields where physics meets biology. Still further gaps exist because we have been slaves to fashion, and have been drawn away from unpopular (roughly translating as "too difficult") fields; examples might include excited state spectroscopy, or electrical breakdown. Much pioneering work on materials modelling was based on very simple potentials and non-self-consistent electronic structure. Today, potentials are sophisticated and accurate, and self-consistent electronic structure for molecular dynamics is routine for adiabatic energy surfaces. This has given us confidence. But can almost any process be modelled with success? Sadly, this is not so. There are at least three groups of challenges. A first challenge concerns electronic excited states and other non-equilibrium systems. Predicting the crystal structure of lowest energy is not the major issue in modelling, and is rarely a performance characteristic by itself. For many systems, the key characteristics are those of metastable forms (e.g., most steels, diamond), not the state of lowest free energy. Predicting which crystal structure will result from a particular growth process can be very hard indeed, especially for larger organic molecules. A second challenge concerns timescales, from femtoseconds to tens of years, and length scales, and the link between microscopic (atomistic) and mesoscopic (microstructural) scales for hierarchical phenomena.
A third challenge lies in understanding the quantum physics of highly correlated systems, for which intuition based on the human scale is not a safe guide.
1. Special Excited States
Why bother with excited states? They offer new ways to control systems, especially at the nanoscale, ranging from materials modification to quantum coherence [1]. In modifying materials, the basic solid-state processes rely on energy localisation, whether ionic or electronic. If atoms are to be displaced, some minimum energy is needed, and this energy must be associated with specific local atomic motions. Self-trapping, the immobilisation of excitations by local lattice deformation, is often the key. The amount of energy available for a process is crucial, yet current methods are disappointing in predicting available energies. Often, electronic excited states have features qualitatively different from the ground state, notably their equilibrium geometry and their degeneracy or near-degeneracy. Their excess energy can be dissipated by radiative or non-radiative transitions. Charge localisation is another key process. It is less necessary for materials modification, but offers one simple means of guiding energy localisation. Again, current methods for predicting self-trapping disappoint: the local density approximation (LDA) often fails to predict it, even when experiment is unambiguous, and Hartree–Fock methods (HF) predict localisation when they should not. Energy transfer is a third process. The atomic displacements do not need to occur at the sites originally excited, but can occur at more distant sites. A common example, if not yet fully understood, is photosynthesis. Fourthly, there is energy storage. Energy sinks can delay damage processes, and sometimes change their character. Even conduction electrons in a metal can store energy for long enough to affect outcomes. Finally, there are the effects of charge transfer and space charge. Charge buildup can be important for indirect reasons, e.g., because a macroscopic field can influence subsequent damage events.
2. Away from Equilibrium and the Steady State
In a semiconductor, it is natural to assume that carrier injection will be followed by various processes leading to equilibrium or to some driven steady state. But when an exciton is created, or a muon implanted, these species have a finite lifetime. The exciton or the muon can decay before it reaches its most stable state. So experiments may measure the properties of metastable states. This should be good news, in that the variety of excited states includes a wealth of different behaviours. There may be opportunities to manipulate
and exploit the states and the branchings between competing excited state processes. The consequences might range from solid state reactions (such as the photographic process) and enhanced diffusion in semiconductors to minimally invasive dentistry. In radiation damage, it is a convenient metaphor to say that the centre of a collision cascade becomes liquid, since the energy deposited can be tens of eV per atom. How good is the "liquid" description? How well does it describe the more complex, but important, cases like Si, where solid Si is a semiconductor and liquid Si is a metal? Can one pretend that the system is neutral everywhere, even though electrons will have been scattered out of the central "liquid" zone? If there is a transiently positive core region, how soon is neutrality restored? And how quickly and where will the electrons deposit their energy? The Fermi level is not defined, so it is hard to know how to decide what to do about charge states. Life processes, as Schrödinger noted, are inherently non-equilibrium. There are many striking phenomena that are hard to mimic, let alone model. For example, is Davydov's model of efficient energy transfer along the α-helix of a protein by a soliton mechanism correct [2]? And how are we to relate the biologist's views of force [3] to the physicist's clear ideas on forces as derivatives of potentials, acting at well-defined sites? Modelling biosystems, going beyond mimicry to understanding, is an open field.
3. Beyond the Adiabatic Approximation
Even the commonest non-radiative transitions, the non-adiabatic processes by which many excited systems revert to the ground state, have special difficulties [1]. The normal input to such transition rates involves two components. One is an electronic matrix element of the non-adiabatic part of the Hamiltonian, and this needs wavefunctions. There, known (and subtle) difficulties have been identified, but there are few attempts to do state-of-the-art calculations. The other component concerns the quantum treatment of nuclear motion, going beyond the usual classical dynamics with quantum treatments of electrons. Unfortunately, the usual Fermi Golden Rule, or something broadly equivalent, does not suffice to predict the evolution of a system by self-consistent molecular dynamics. This is not a problem of the harmonic approximation, or of unusual adiabatic dynamics, such as solitons in conducting polymers, all of which can be handled. The non-adiabatic aspects are tricky, elusive in a convenient form. What is needed here is a means to avoid the Fermi Golden Rule. Various tools, like frozen Gaussian methods or energy surface hopping, offer ways forward, but there is little theory that addresses the remarkable experiments on transient excited state phenomena in halides and oxides, for example.
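For reference, the Fermi Golden Rule in question gives the rate of transitions from an initial state |i⟩ into a quasi-continuum of final states as

\[ w_{i \to f} = \frac{2\pi}{\hbar}\, \left| \langle f | H' | i \rangle \right|^2 \rho(E_f), \]

where H' is here the non-adiabatic part of the Hamiltonian and ρ(E_f) is the density of final states. The rule presumes weak, effectively constant coupling to an incoherent continuum, which is exactly what strongly driven, transient excited-state dynamics can violate.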
Situations involving multiple energy surfaces are also challenging. It is only too tempting to treat a Jahn–Teller system as if there were a single energy surface. Yet this would be wrong dynamically, and would also omit extra excitations, like orbitons in coupled Jahn–Teller systems. Such coupled systems can exhibit other behaviours that are hard to simulate, such as frustration, or distinctions between real and quasi spins when the interactions between component magnetic moments are significant.
4. Full Quantum Treatments, Including Predictive Decoherence
Serious predictions of electronic structure inevitably involve quantum electrons, usually through the ℏ in kinetic energy terms, through quantum statistics, or through the exclusion principle. Quantum information processing, and the even more demanding ideas of quantum computing, are widely discussed [4]. Quantum tunnelling has become a common phenomenon, although its modelling is often primitive, intended more to make use of simple analytical models than to represent the system accurately. When the adiabatic approximation is not enough, quantum nuclear motion is needed, as well as that of electrons. In some highly-correlated systems, and in superconductivity, the quantum effects become both more sophisticated and less intuitive. Indeed, it is not clear what practical limits exist to the modelling of quantum systems. Some of the most interesting cases involve entanglement, the quantum dance of one electron with another, and the way that is reduced by decoherence, the quantum analogue of classical dissipation. The most interesting aspects of quantum behaviour can also take one away from equilibrium phenomena. Quantum statistics, of course, is at its most useful at or close to equilibrium. Quantum entanglement, and the manipulations of quantum information processing, need the avoidance of decoherence, and this can often be achieved in highly non-equilibrium situations.
5. Open Systems, and Interfaces between Unlike Materials
Many important interfaces involve dissimilar materials: a metal and an oxide, a polymer and an indium tin oxide electrode, an adhesive and wood, a biomaterial and blood. Some of the challenges are to understand mechanical properties, like adhesion and friction. Sometimes simple ideas work well, like the image interaction picture of non-reactive metal/oxide adhesion, and can give at least a framework for discussion of complex interfaces. Other issues concern the processes of transferring charge (electrons, protons, etc.) across the interface. It is not simple to match absolute energies on the two
sides of most interfaces. There are often significant dipole moments, for instance. Tunnelling rates and injection phenomena need a dynamic description of screening. Electron emission needs to recognise the long-range electric field. Charge transfer, whether through a blocking electrode or through an ohmic contact, often implies an open system, whether it be a biosystem and its environment, or a nanocomponent within a device. In such cases, the boundary conditions demand subtle treatment. Moreover, the dynamics of screening can take one well away from the usual classical electrostatic descriptions.
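For orientation, the simplest version of the image-interaction picture mentioned above is the classical result that a charge q at distance z from an ideal metal surface has energy

\[ E(z) = -\frac{q^2}{16\pi\varepsilon_0 z}; \]

summing such image attractions over the ions of an oxide gives a first estimate of non-reactive metal/oxide adhesion. The dipole layers and dynamic screening discussed above are corrections on top of this zeroth-order picture.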
6. Mesoscopic Issues where Rare Events Dominate
Condensed matter processes exhibit an enormous range of timescales. Some are fast processes: electronic relaxation in metals (fs); photochemistry and fast non-radiative processes (0.1–1 ps); and allowed radiative transitions (say 10 ns). Others are slow processes: spin-forbidden transitions (ms) and diffusion-controlled processes (0.1 s to geological times). For a steel component in a nuclear reactor, the fastest processes in collision cascades take only a few femtoseconds, and most of the action is over in a few picoseconds. The consequences of these cascades and subsequent slower diffusion processes for the ductility of the steel can grow in importance over 30 years, i.e., the billion-second timescale, and may dramatically affect reactor economics. In some cases, energy densities can be extremely large, perhaps tens of eV per atom. Radiation-induced processes are hierarchical, as for polymers, for which scission or cross-linking affect subsequent damage. Such cross-linking can generate diamond-like carbon, and this can be modelled. Key features of radiation behaviour are radiochemical yields (what happens for every 100 eV put in?), gas production, and the influence of interstitial molecular oxygen. Embrittlement and tendency to fracture, plus a reduced electrical breakdown performance, are critical for applications like insulation in fission reactors. Yet fracture and breakdown predictions are classic examples of behaviour for which the phenomenology is well documented, but reliable predictions, even of behaviour statistics, are barely attempted. When there are hierarchical processes, one common symptom is non-Debye relaxation behaviour (stretched exponential, "universal response", etc.). The differences can be more dramatic, of course, with qualitative changes in some cases. Clearly, fracture is qualitatively distinct from modest shifts in elastic constants. And, if one is dealing with fracture, or electrical breakdown, for which extremal statistics apply, how do you ensure your model has the right Weibull parameters? Focussing one's model on the appropriate scale seems much more effective [5]. Whilst multiscale modelling is not uncommon, there should be real doubts about the wisdom of using an all-embracing code that
incorporates sophisticated calculations at several levels, e.g., finite-difference macroscopic, atomistic, and electronic structure. One problem is making sure the ideas are right at each stage: in a silicon MEMS device, for instance, is the atomistic treatment of the silicon to have priority over the treatment of the thermal oxide?
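As a small illustration of the Weibull point above: given a sample of fracture strengths (here generated synthetically, standing in for simulation output), the Weibull modulus can be estimated by the standard linearisation ln(-ln(1-F)) = m ln σ - m ln σ₀. The "true" parameters below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    m_true, s0 = 8.0, 300.0                   # assumed Weibull modulus and scale (MPa)
    strengths = s0 * rng.weibull(m_true, size=200)

    s = np.sort(strengths)
    F = (np.arange(1, len(s) + 1) - 0.5) / len(s)   # median-rank-style probabilities
    m_est, _ = np.polyfit(np.log(s), np.log(-np.log(1.0 - F)), 1)
    print(f"estimated Weibull modulus: {m_est:.2f} (true {m_true})")

The question posed above is whether a mesoscale fracture model, run many times, reproduces the measured modulus m, not merely the mean strength; extremal statistics are controlled by the tails, which average quantities do not constrain.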
7. Posing the Problem: Making Contact with Reality
For all the early days of materials modelling, there was an implicit assumption that either the computer hardware or the software was the most serious limitation. This no longer seems to be the case. Posing the problem effectively, and interpreting the results properly, were assumed to be straightforward. The emphasis has changed. Major developments in hardware and software are welcome, but the serious limits need brainware and experience, rather than computer power or software [6]. Can we frame the problem in a way that can be modelled at the right level of detail, precision and sophistication? Do we understand enough of the modelling output to give adequate answers to key questions? Have we the underlying science to recognise the limitations of the models used, and to assess the value of the answers obtained? Do we have enough confidence to use our models to take unpopular decisions? Could we tell an influential person that their favourite idea can't work? Anyone who has worked in a technology-based organisation will know that new and unexpected situations arise with remarkable regularity. Sometimes these are problems, sometimes opportunities. Usually, the window of opportunity is short, since timescales are fixed by non-scientific constraints. Posing the problem in a soluble form can be the biggest challenge. It can also be rewarding. As the thermodynamicist J. Willard Gibbs acknowledged, thermodynamics owed more to the steam engine than the steam engine owed to thermodynamics.
References
[1] N. Itoh and A.M. Stoneham, Materials Modification by Electronic Excitation, Cambridge University Press, Cambridge, 2001.
[2] P.-A. Lindgard and A.M. Stoneham, "Self trapping, biomolecules and free-electron lasers," J. Phys.: Condens. Matter, 15, V5–V9, 2003.
[3] M.E. Fisher and A.B. Kolomeisky, Proc. Nat. Acad. Sci., 96, 6597–6602, 1999.
[4] C.P. Williams and S.H. Clearwater, Ultimate Zero and One, Copernicus, New York, 2000.
[5] A.M. Stoneham and J.H. Harding, "Not too big, not too small: the appropriate scale," Nature Materials, 2, 77–83, 2003.
[6] A.M. Stoneham, A. Howe, and T. Chart, "Predictive Materials Modelling," UK Department of Trade and Industry/Office of Science and Technology Foresight Report DTI/Pub5344/02/01/NP, URN 01/630, 2001.
Perspective 13
BRIDGING THE GAP BETWEEN QUANTUM MECHANICS AND LARGE-SCALE ATOMISTIC SIMULATION
John A. Moriarty
Lawrence Livermore National Laboratory, University of California, Livermore, CA 94551-0808
The prospect of modeling across disparate length and time scales to achieve a predictive multiscale description of real materials properties has attracted widespread research interest in the last decade [1]. To be sure, the challenges in such multiscale modeling are many, and in demanding cases, such as mechanical properties or dynamic phase transitions, multiple bridges extending from the atomic level all the way to the continuum level must be built. Although often overlooked in this process, one of the most fundamental and important problems in multiscale modeling is that of bridging the gap between first-principles quantum mechanics, from which true predictive power for real materials emanates, and the large-scale atomistic simulation of thousands or millions of atoms, which is usually essential to describe the complex atomic processes that link to higher length and time scales. For example, to model single-crystal plasticity at micron length scales via dislocation-dynamics simulations that evolve the detailed dislocation microstructure requires accurate large-scale atomistic information on the mobility and interaction of individual dislocations. Similarly, modeling the kinetics of structural phase transitions requires linking accurate large-scale atomistic information on nucleation processes with higher length and time scale growth processes.
1. Electronic-atomic Gap
As indicated in Fig. 1, there currently exists a wide spectrum of atomic-scale simulation methods in condensed-matter and materials physics, extending from essentially exact quantum-mechanical techniques to classical descriptions with totally empirical force laws. All of these methods fall into one of two distinct categories, which are separated by a material-dependent gap.
Figure 1. Representative sample of the wide spectrum of electronic and atomistic simulation approaches used in condensed-matter and materials physics and the material-dependent gap separating them. (The figure arranges the methods from exact quantum mechanics, through self-consistent mean-field and coarse-grained electronic structure, to totally empirical descriptions: many-electron approaches such as QMC and DMFT treat 1–10 atoms; DFT-based quantum simulation (QMD, PP, FP-LMTO) treats 10–10²; quantum-based potentials (GPT, MGPT, BOP) treat 10²–10⁶; and empirical potentials (EAM, FS) treat 10⁴–10⁸ atoms.)
On one side of this gap are electronic methods based on direct quantum-mechanical treatments. These include quantum simulations that attempt to treat electron and ion motion on an equal footing, solving quantum-mechanical equations on the fly for both the electronic states of the system and the forces on the individual ions. In principle, such methods can provide a highly accurate description of the system and are chemically very robust, but they come at the price of being severely limited in the size and duration of the simulation. Typically, even efficient mean-field methods such as quantum molecular dynamics (QMD) [2, 3] can at best treat a hundred or so atoms for a few picoseconds. On the other side of the gap are methods used in atomistic simulations that treat only the ion motion, solving classical Newtonian equations of motion with the forces derived from explicit interatomic potentials, which may or may not be encoded with detailed quantum information about the electronic structure. For the simplest short-range empirical potentials, tens or hundreds of millions of atoms can be so simulated with molecular dynamics (MD) for time durations extending to tens or hundreds of nanoseconds. But this computational robustness often comes at the price of losing any connection to the underlying electronic structure of the material. For studying generic phenomena in simple systems this may not be a major drawback, but more generally, for the predictive multiscale modeling of real complex materials that we envision here, the retention of adequate quantum information is essential.
In practice, both quantum and atomistic simulations may be approached at many different levels of approximation. Many-body quantum-mechanical methods attempt to treat the full many-electron states of the system and provide a general means of addressing the fundamental issue of electron correlation. The valence electrons in the vast majority of systems of practical interest, however, including most metals and semiconductors, effectively exhibit only weak electron correlation. For such systems accurate total energies and forces can be achieved through self-consistent, mean-field electronic-structure methods based on modern density functional theory (DFT) [4, 5]. Indeed, today DFT-based electronic-structure and quantum simulation methods are usually described as "ab initio," even though significant approximations from exact quantum mechanics are involved. Nonetheless, for weakly correlated systems first-principles DFT methods are quantitatively predictive and rely on only the barest input information: the atomic numbers and masses of the material constituents. Within this category of computational method are all-electron, full-potential (e.g., FP-LMTO) techniques as well as pseudopotential (PP) techniques, which treat only valence electrons and are normally essential in QMD simulations. In addition to such direct computational approaches, simplified representations of DFT are also possible via orbital-basis-state approaches using plane waves, localized atomic orbitals, or a hybrid combination of the two. Such simplification provides a useful starting point to "coarse-grain" the electronic structure and actually bridge the electronic-atomic gap.

The interatomic potentials used in atomistic simulations have likewise been developed to many different levels of sophistication and quantum compatibility. One basic consequence of the quantum-mechanical nature of electronic bonding in both metals and semiconductors is that the total energy of the system, E_tot(R_1, ..., R_N), is inherently a many-body functional of its atomic coordinates R_i and must contain terms beyond radial-force pair interactions. This requirement is directly manifest in a number of measurable properties including the elastic moduli, where in metals it is well known that the Cauchy relations implied by pure pair potentials are not satisfied in general (e.g., C₁₂ ≠ C₄₄ in cubic metals), and in semiconductors non-radial forces are needed even to stabilize the basic diamond structure. Most modern empirical potentials satisfy this requirement by adding a more general functional to a pair-potential contribution in E_tot. In some cases, this additional contribution is inspired by specific quantum-mechanical considerations, but typically arbitrary short-ranged functional forms are still maintained in both pair and non-pair terms. In addition to such empirical potentials, however, there are also more rigorous quantum-based interatomic potentials (QBIPs), potentials that are actually derived in whole or in part from the underlying quantum mechanics by suitable coarse graining of the electronic structure. The gap between electronic and atomistic methods and between QMD and MD/QBIP simulations is then directly related to the additional approximations entailed in such
coarse graining. The size of the gap and our ability to bridge it depend both on the complexity of the material in question and on the complexity of its environment. For some materials and/or some environments this gap is actually relatively small and can be readily bridged, such as is the case for bulk simple metals. For other materials and/or environments, however, the gap is larger and bridging it is still a forefront challenge of current research. This is the case, for example, with directionally-bonded transition metals and semiconductors as well as for chemically reactive surfaces. Nonetheless, significant progress has been made in the last decade or so in many of these latter areas, inspired in part by the demands of multiscale modeling.
2. Quantum-based Interatomic Potentials
In the case of simple sp-bonded metals (e.g., Na, Mg, Al), a rigorous formulation of QBIPs in the bulk material has been available since the 1960s in terms of pseudopotential perturbation theory [6]. In this approach, it is recognized that the electronic structure of such materials is nearly free-electron in character and that the electron-ion interaction can be represented by a weak nonlocal pseudopotential. Using an orbital basis of plane waves, one can develop E_tot to second order in the pseudopotential and express the result explicitly in the real-space form

\[ E_{\mathrm{tot}}(\mathbf{R}_1,\ldots,\mathbf{R}_N) = N E_{\mathrm{vol}}(\Omega) + \frac{1}{2}\sum_{i,j} v_2(R_{ij};\Omega), \tag{1} \]

where Ω is the atomic volume of the metal, E_vol is a collective volume term that satisfies the many-body requirement for E_tot, and v_2 is a volume-dependent, but structure-independent and transferable, radial-force pair potential. The volume dependence of v_2 is a consequence of the self-consistent electron screening, which also gives rise to long-range Friedel oscillations in the potential tail. By the 1970s and 80s, first-principles DFT-based implementations of this approach had already become well developed [7–9]. This method is particularly effective in dealing with bulk structural properties, including phonons and elasticity, solid phase transitions, liquid structure and dynamics, and melting. This approach has also been readily and successfully extended to compounds and alloys as well as to high pressures, where atomic volume is a very compatible environmental variable.
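The structure of Eq. (1) is easy to demonstrate numerically. In the toy sketch below, the volume term and the pair potential are invented placeholders (the tail mimics a Friedel-like oscillation, v₂ ~ cos(2k_F r)/r³) and are not actual pseudopotential-theory output.

    import numpy as np
    from itertools import combinations

    def v2(r, omega):
        """Toy v2(r; Omega): short-range repulsion plus a Friedel-like
        oscillatory tail ~ cos(2 kF r)/r^3. Placeholder parameters only."""
        kF = (3.0 * np.pi**2 / omega)**(1.0 / 3.0)  # free-electron kF, one valence electron
        return 50.0 * np.exp(-3.0 * r) + 0.5 * np.cos(2.0 * kF * r) / r**3

    def etot_eq1(positions, omega, e_vol=-1.0):
        """Eq. (1): N*Evol(Omega) + (1/2) sum_{i,j} v2(Rij; Omega); looping
        over each unordered pair once absorbs the factor 1/2."""
        energy = len(positions) * e_vol
        for i, j in combinations(range(len(positions)), 2):
            energy += v2(np.linalg.norm(positions[i] - positions[j]), omega)
        return energy

    pos = 4.0 * np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]], float)
    print(etot_eq1(pos, omega=16.0))

Note that both v₂ and E_vol carry the volume Ω as a parameter, which is what makes the potential transferable across structures at fixed density but not across densities.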
In the 1980s, interest in simulating materials properties beyond the bulk environment, and particularly at surfaces, led to the development of alternative "glue" models for simple metals [10]. These include the radial-force potential models obtained from the embedded-atom method (EAM) [11] and from effective medium theory (EMT) [12]. The total-energy functional in the EAM or EMT is inspired by the DFT notion that the total energy of a system is a functional of its electron density, and is assumed to take the form of an attractive embedding contribution balanced by a repulsive pair-potential contribution:

\[ E_{\mathrm{tot}}(\mathbf{R}_1,\ldots,\mathbf{R}_N) = \sum_i F(\bar{n}_i) + \frac{1}{2}\sum_{i,j} v_2(R_{ij}), \tag{2} \]

where F is a nonlinear function of the average electron density n̄_i on the site i. The embedding contribution correctly accounts for the increased bond strength at the surface relative to the bulk, although Eq. (2) itself is an ansatz and cannot be directly derived from quantum mechanics. In the EMT, the ingredients of this equation are evaluated from first-principles DFT considerations within a well-defined prescription, starting from the embedding of an atom in a free-electron gas. In the EAM, on the other hand, these ingredients are all treated empirically, with convenient parameterized analytic forms chosen for F, n̄_i and v_2 to maximize flexibility and achieve high computational speed. Alternatively, F, n̄_i and v_2 have also been spline-fit to larger databases that also include DFT energies and forces. Regardless of how they are parameterized, the EAM and EMT models are most appropriate for simple sp-bonded metals and series-end transition metals (e.g., Cu), where a description of the bonding in terms of radial forces is reasonable, and to such systems these models have been extensively applied in the past twenty years.
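A corresponding sketch of the Eq. (2) functional, with invented exponential density contributions and a square-root embedding function in the Finnis–Sinclair spirit, is given below; it illustrates the structure of the EAM/EMT energy, not any published parameterization.

    import numpy as np

    def eam_energy(positions):
        """Eq. (2): sum_i F(n_i) + (1/2) sum_{i,j} v2(Rij), with toy choices
        F(n) = -sqrt(n) and exponential density and pair terms."""
        n = len(positions)
        rho = np.zeros(n)                  # host electron density at each site
        e_pair = 0.0
        for i in range(n):
            for j in range(i + 1, n):      # each pair once = the (1/2) double sum
                r = np.linalg.norm(positions[i] - positions[j])
                contrib = np.exp(-2.0 * (r - 2.5))         # assumed density function
                rho[i] += contrib
                rho[j] += contrib
                e_pair += 20.0 * np.exp(-3.0 * (r - 2.5))  # assumed repulsive v2
        return -np.sqrt(rho).sum() + e_pair

    pos = np.array([[0, 0, 0], [2.5, 0, 0], [0, 2.5, 0], [0, 0, 2.5]], float)
    print(eam_energy(pos))

Because F is nonlinear, an atom with fewer neighbors (e.g., at a surface) gains proportionally more energy per remaining bond, which is the qualitative surface physics the embedding term was designed to capture.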
A general approach to QBIPs that does allow one to go beyond radial-force interactions in a rigorous way makes use of a local-orbital, tight-binding (TB) representation of the electronic structure [13]. In principle, if an accurate TB representation can be found, one can then develop a QBIP model based on an expansion of the total energy in terms of moments of the local electronic density of states, $\mu_2, \mu_3, \mu_4, \ldots$. In practice, the notion of casting potentials in terms of moments has been used both to develop empirical models and in full quantum-mechanical derivations. Such approaches have been mainly directed at the d states in central transition metals (e.g., Mo) and at the s and p states in covalently bonded semiconductors (e.g., Si). The simplest empirical scheme in this category is the second-moment, radial-force Finnis–Sinclair (FS) model [14]. This model is formally similar to the “glue” models discussed above, with an assumed embedding function of the form $F(\mu_2) \propto (\mu_2)^{1/2}$, where the second moment $\mu_2$ is treated empirically as a short-ranged radial function about each atomic site. The FS model has been mostly applied to central bcc transition metals, although at this level of treatment there are in fact no angular-force terms to accommodate the d-state directional bonding. In this regard, empirical fourth-moment schemes for transition metals have also been developed that do implicitly include angular- as well as radial-force contributions. The most complete and fundamental TB approach to QBIPs, however, is the bond-order-potential (BOP) model of Pettifor [13], which is based on an explicit expansion of the total energy within
TB theory and has been considerably developed and applied over the last ten years [15]. In all the TB schemes, an empirical repulsive pair-potential contribution is included in the total energy, as in Eq. (2), and parameterization of the local-orbital matrix elements defining the moments is required. In the BOP model, the embedding energy of Eq. (2) is directly replaced by the full TB bond energy derived from quantum mechanics for the dominant bonding electrons, e.g., the d electrons in transition metals. In such a case an additional empirical environmental energy correction term to the pair potential is also added to $E_{\rm tot}$ to account for the fact that local s and p orbitals in a TB representation of the sp electrons are effectively environmentally dependent.

Another general approach to QBIPs in metals involves combining a plane-wave-based pseudopotential treatment for s and p electrons with a local-orbital-based tight-binding treatment for d electrons, allowing application to both simple and transition metals. The primary example of this approach is first-principles generalized pseudopotential theory (GPT), which has been rigorously developed from DFT quantum mechanics [16]. In the GPT applied to transition metals, a mixed basis of plane waves and localized d-state orbitals is used to self-consistently expand the electron density and total energy of the system in terms of weak sp pseudopotential, d-d tight-binding, and sp-d hybridization matrix elements, which in turn are all directly calculable from first principles. The GPT total-energy expansion has been carried out to the level of four-ion interactions and formally generalizes Eq. (1). In a bulk elemental transition metal, one obtains the explicit real-space form
$$E_{\rm tot}(\mathbf{R}_1, \ldots, \mathbf{R}_N) = N E_{\rm vol}(\Omega) + \frac{1}{2} \sum_{i,j} v_2(ij; \Omega) + \frac{1}{6} \sum_{i,j,k} v_3(ijk; \Omega) + \frac{1}{24} \sum_{i,j,k,l} v_4(ijkl; \Omega). \qquad (3)$$
The leading volume term in this expansion, $E_{\rm vol}$, as well as the two-, three-, and four-ion interatomic potentials, $v_2$, $v_3$, and $v_4$, are, as in Eq. (1), volume-dependent but structure-independent quantities and thus transferable to all bulk ion configurations, either ordered or disordered. This includes all structural phases as well as the deformed solid and the imperfect bulk solid with either point or extended defects present. The angular-force multi-ion potentials $v_3$ and $v_4$ in Eq. (3) reflect contributions from partially filled d bands and are generally important for central transition metals. In the full ab initio GPT, however, these potentials are long-ranged, nonanalytic and multidimensional functions, so that $v_3$ and $v_4$ cannot be readily tabulated for application purposes. This has led to the development of a simplified and complementary model GPT or MGPT applicable to central transition metals [17]. Within the MGPT, the multi-ion potentials $v_3$ and $v_4$ are systematically approximated by introducing canonical d bands and other simplifications to
achieve short-ranged, analytic forms, which can then be applied to large-scale atomistic simulations. To compensate for the approximations introduced into the MGPT, a limited amount of parameterization is allowed, in which the coefficients of the modeled potential terms are constrained by either DFT or experimental data. In practice, the ab initio GPT and the MGPT potentials have complementary ranges of application. The ab initio GPT is most effective in situations where the total-energy expansion (3) can be truncated at the pair-potential level, as in Eq. (1), since tabulation and interpolation of a nonanalytic pair potential $v_2(r; \Omega)$ represents no computational barrier for atomistic simulations. Thus ab initio GPT applications include simple metals and series-end transition metals as well as appropriate binary alloys, including the transition-metal aluminides [18]. The primary application range for the MGPT, on the other hand, is the bcc transition metals (e.g., Ta, Mo). Both GPT and MGPT potentials have been implemented in atomistic simulations and applied to a wide range of bulk structural, thermodynamic, defect and mechanical properties at both ambient and extreme conditions of temperature and pressure [17]. Extension of the bulk GPT and MGPT potentials to highly nonbulk situations, such as surfaces, voids and clusters, is also possible through appropriate environmental modulation [19]. This refinement has only been studied in detail, however, in the cases of free surfaces, where environmental corrections have been shown to be very important (∼50–70%), and for vacancies, where such corrections have been confirmed to be negligible (∼1–2%).
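As an illustration of how an expansion of the form of Eq. (3) is evaluated in practice, the following sketch sums a volume term and two-, three-, and four-ion contributions over a configuration. The potential callables here are toy placeholders with no physical content; in the GPT and MGPT they are derived from DFT and, in the MGPT, given short-ranged analytic forms.

```python
import numpy as np
from itertools import combinations

def gpt_like_energy(R, omega, E_vol, v2, v3, v4):
    """Evaluate an Eq. (3)-type expansion for positions R at atomic volume omega.
    E_vol, v2, v3, v4 are user-supplied callables (toy placeholders below)."""
    N = len(R)
    E = N * E_vol(omega)
    # Sums over distinct unordered pairs/triplets/quadruplets absorb the
    # 1/2, 1/6 and 1/24 prefactors of the ordered sums in Eq. (3).
    for i, j in combinations(range(N), 2):
        E += v2(R[i], R[j], omega)
    for i, j, k in combinations(range(N), 3):
        E += v3(R[i], R[j], R[k], omega)
    for i, j, k, l in combinations(range(N), 4):
        E += v4(R[i], R[j], R[k], R[l], omega)
    return E

# Toy placeholder potentials (purely illustrative):
E_vol = lambda om: -1.0
v2 = lambda a, b, om: 0.1 * np.exp(-np.linalg.norm(a - b))
v3 = lambda a, b, c, om: 0.0      # angular three-ion term omitted in this toy
v4 = lambda a, b, c, d, om: 0.0
R = np.random.rand(6, 3) * 4.0
print(gpt_like_energy(R, 16.0, E_vol, v2, v3, v4))
```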
3. Outlook
There are still many remaining challenges in bridging the gap between quantum and atomistic simulations. One inherent advantage that QBIPs have over empirical potentials in this quest is that they are systematically improvable in a manner consistent with quantum mechanics. In spite of the significant progress made over the past forty years, the collective amount of time and energy spent to date on developing QBIPs has actually been very small compared to that spent on developing advanced electronic-structure methods and quantum simulations themselves. As a result, this is still a young research field with much room to grow. Below we discuss three general areas in which there would seem to be great opportunities for major progress over the next decade.
3.1. Improved Accuracy and Computational Speed
In first-principles QBIPs such as GPT, the main additional approximation beyond DFT is the truncation of the total-energy expansion at finite order.
For both simple and transition metals, it is now computationally feasible to extend this expansion to higher order as needed. For transition metals in particular, it should be possible to extend Eq. (3), both in the GPT and MGPT representations, to include five- and six-ion d-state interactions. This would improve the description of certain structural properties, such as the hcp-fcc energy difference and corresponding stacking faults, and would enable accurate applications both to the left and right of the central bcc metals. Moreover, in the context of semi-empirical QBIPs such as MGPT or BOP, it should be possible to eliminate or improve many of the secondary approximations that are currently used. In the MGPT, for example, it has recently been possible to formulate a more general matrix version of the theory that allows one to go beyond simple canonical d bands and hence achieve a more accurate representation of the electronic structure. Also, in current MGPT and BOP applications to transition metals, explicit sp-d hybridization contributions have been dropped for convenience, but these should be included in the future, especially for non-central transition metals. The balancing consideration to increased accuracy is, of course, computational speed. One enormously appealing aspect of empirical EAM potentials is their extreme speed, which can be up to six orders of magnitude faster than first-principles DFT electronic-structure or QMD methods and up to two orders of magnitude faster than QBIPs. While angular-force QBIPs will be inherently slower than radial-force EAM potentials, a reasonable goal would be to come within one order of magnitude of their speed, at least at some basic level of operation. In the case of MGPT, recent algorithm improvements have dramatically increased computational speed, by up to a factor of six, and put us at or close to that goal. Adding higher-order interactions and/or sp-d hybridization will, of course, work to reverse that gain, but inevitably one must think in terms of having QBIPs at various levels of approximation and matching the level and speed with the intended application.
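To make the speed trade-off concrete, a rough operation count is instructive: for m neighbors inside the potential cutoff, each additional interaction order multiplies the work per atom by roughly m. The bookkeeping below, with assumed illustrative values N = 10^6 and m = 50, is consistent with the roughly two-orders-of-magnitude gap between radial-force and angular-force potentials quoted above; it is not a measured timing.

```python
# Rough per-step interaction counts for N atoms with m neighbors inside the
# potential cutoff (illustrative bookkeeping only, not measured timings).
N, m = 10**6, 50
pair_terms   = N * m // 2                        # v2: one term per pair
triple_terms = N * m * (m - 1) // 6              # v3: one term per triplet
quad_terms   = N * m * (m - 1) * (m - 2) // 24   # v4: one term per quadruplet
for name, t in [("pair", pair_terms), ("triple", triple_terms), ("quad", quad_terms)]:
    print(f"{name:>6}: {t:.2e} terms  (~{t / pair_terms:.0f}x pair cost)")
```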
3.2. Treatment of Complex Systems and Complex Environments
The general application area of intermetallic compounds and alloys would seem to be a potentially ideal one for QBIPs, since in general they are much better positioned to handle chemical and structural complexity than empirical potentials. In this regard, BOP and GPT treatments of transition-metal intermetallics have already had significant success, and these applications are expected to continue and grow in the future. Future MGPT and hybrid MGPT/GPT treatments also look very promising for transition-metal-rich systems. Similarly, another fruitful application area is expected to be high-pressure physics. Here
volume-dependent GPT and MGPT treatments to date have proven to be successful, both in terms of predicting new high-pressure phases and in describing how materials properties scale with pressure for an existing phase. More generally, QBIPs seem to be well suited to deal with changes in structural stability under pressure and the prediction of the thermodynamics and mechanical properties of newly encountered phases. The application of QBIPs to non-bulk environments, such as free surfaces, voids and clusters, would also seem to be a potentially important area, especially for doing large-scale simulations involving growth and interaction that are beyond the reach of QMD. In this case, however, significantly more developmental work may be needed, since applications to date have been limited and bulk assumptions and approximations often require modification near surfaces. Yet another promising application area for QBIPs is that of f-electron metals. The BOP and GPT/MGPT transition-metal methodologies are readily adaptable from d to f electrons, at least in the weak-correlation DFT regime. Initial MGPT applications in this area look promising, but an added challenge is the structural complexity of some f-electron phases. Treating strongly correlated systems poses another new challenge that will require going beyond DFT to correlated-electron theories such as dynamical mean-field theory (DMFT) and rebuilding the bridge to QBIPs from that starting point. This looks possible, but no work has yet been done in this area.
3.3. Temperature-dependent QBIPs and Direct Linkage with QMD
One interesting possible use of QMD simulations is to interface them with MD/QBIP simulations, both to extend the time scale of the QMD simulations and to develop improved QBIPs at finite temperature. This linkage has actually been tried with empirical potentials in a few cases, but mostly as an interpolation mechanism and without any regard as to whether or not the potentials had physical meaning at the conditions used. For d- and f-electron metals, however, the concept of temperature-dependent potentials is actually a very important one. In such materials there are large electron-thermal effects, arising from the high density of electronic states at the Fermi level, at temperatures as low as the melting point. These effects can have a dramatic impact on high-temperature properties, including the melt curve itself. Electron-thermal effects are normally treated separately from the more familiar ion-thermal effects that are associated with QBIPs constructed at zero temperature. It should be possible, however, to capture the coupled electron plus ion thermal effects simultaneously and self-consistently by building QBIPs on the basis of the total electron free energy at finite temperature. In principle, this can be done without resort
to QMD simulations, but adding the possibility of refining such temperature-dependent potentials by matching QMD and MD/QBIP simulations on the fly might substantially improve the accuracy of such potentials.
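The electron free energy on which such temperature-dependent potentials would be built can be illustrated with a simple Fermi–Dirac smearing calculation. The sketch below evaluates the electronic band energy, entropy, and free energy F = E − TS for a tabulated density of states; the flat model DOS and all numerical values are placeholder assumptions, not a real metal.

```python
import numpy as np

k_B = 8.617e-5  # Boltzmann constant, eV/K

def electron_free_energy(eps, g, mu, T):
    """Band energy E, electronic entropy S, and free energy F = E - T*S for a
    tabulated DOS g(eps) at chemical potential mu and temperature T (Kelvin)."""
    # Fermi-Dirac occupation written via tanh to avoid exponential overflow
    f = 0.5 * (1.0 - np.tanh((eps - mu) / (2.0 * k_B * T)))
    E = np.trapz(g * f * eps, eps)
    # Entropy of independent fermions: -k_B [f ln f + (1-f) ln(1-f)]
    with np.errstate(divide="ignore", invalid="ignore"):
        s = -(f * np.log(f) + (1.0 - f) * np.log(1.0 - f))
    S = k_B * np.trapz(g * np.nan_to_num(s), eps)   # f = 0 or 1 contributes zero
    return E, S, E - T * S

# Placeholder: flat DOS of 1 state/eV over a 10 eV band centered on mu = 0
eps = np.linspace(-5.0, 5.0, 2001)
g = np.ones_like(eps)
for T in (300.0, 3000.0):
    E, S, F = electron_free_energy(eps, g, mu=0.0, T=T)
    print(f"T = {T:6.0f} K:  T*S = {T * S:.4f} eV,  F = {F:+.4f} eV")
```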
Acknowledgment

This work was performed under the auspices of the U.S. Department of Energy by the University of California Lawrence Livermore National Laboratory under contract number W-7405-ENG-48.
References

[1] J.A. Moriarty, V. Vitek, V.V. Bulatov, and S. Yip, “Atomistic simulation of dislocations and defects,” J. Comput.-Aided Mater. Des., 9, 99–132, 2002.
[2] R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett., 55, 2471–2474, 1985.
[3] M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos, “Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients,” Rev. Mod. Phys., 64, 1045–1097, 1992.
[4] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev., 136, B864–B871, 1964.
[5] W. Kohn and L.J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev., 140, A1133–A1138, 1965.
[6] W.A. Harrison, “Pseudopotentials in the theory of metals,” Benjamin, Reading, 1966.
[7] L. Dagens, M. Rasolt, and R. Taylor, “Charge densities and interionic potentials in simple metals: nonlinear effects II,” Phys. Rev. B, 11, 2726–2734, 1975.
[8] A.K. McMahan and J.A. Moriarty, “Structural phase stability in third-period simple metals,” Phys. Rev. B, 27, 3235–3251, 1983.
[9] J. Hafner, “From Hamiltonians to phase diagrams,” Springer-Verlag, Berlin, 1987.
[10] R.M. Nieminen, M.J. Puska, and M.J. Manninen (eds.), “Many-atom interactions in solids,” Springer-Verlag, Berlin, 1990.
[11] M.S. Daw, S.M. Foiles, and M.I. Baskes, “The embedded atom method: a review of theory and applications,” Mat. Sci. Rep., 9, 251–310, 1993.
[12] K.W. Jacobsen, J.K. Norskov, and M.J. Puska, “Interatomic interactions in the effective-medium theory,” Phys. Rev. B, 35, 7423–7442, 1987.
[13] D.G. Pettifor, “Bonding and structure of molecules and solids,” Oxford University Press, Oxford, 1995.
[14] M.W. Finnis and J.E. Sinclair, “A simple N-body potential for transition metals,” Philos. Mag. A, 50, 45–55, 1984.
[15] M. Mrovec, D. Nguyen-Manh, D.G. Pettifor, and V. Vitek, “Bond-order potential for molybdenum: application to dislocation behavior,” Phys. Rev. B, 69, 094115, 2004.
[16] J.A. Moriarty, “Density-functional formulation of the generalized pseudopotential theory. III. Transition-metal interatomic potentials,” Phys. Rev. B, 38, 3199–3230, 1988.
[17] J.A. Moriarty, J.F. Belak, R.E. Rudd, P. Söderlind, F.H. Streitz, and L.H. Yang, “Quantum-based atomistic simulation of materials properties in transition metals,” J. Phys.: Condens. Matter, 14, 2825–2857, 2002.
[18] J.A. Moriarty and M. Widom, “First-principles interatomic potentials for transition-metal aluminides: theory and trends across the 3d series,” Phys. Rev. B, 56, 7905–7917, 1997.
[19] J.A. Moriarty and R. Phillips, “First-principles interatomic potentials for transition-metal surfaces,” Phys. Rev. Lett., 66, 3036–3039, 1991.
Perspective 14

BRIDGING THE GAP BETWEEN ATOMISTICS AND STRUCTURAL ENGINEERING

J.S. Langer
Department of Physics, University of California, Santa Barbara, CA 93106-9530, USA
When Sid Yip asked me to write a commentary for this section of the handbook, I promptly reminded him that I am a co-author of a longer article in the section on mathematical methods. I told him that my article on amorphous plasticity, written with Michael Falk and Leonid Pechenik, already is more of a departure from conventional ideas than may be appropriate for a book like this one, which should serve as a reliable reference for years into the future; and I asked whether I really ought to be given yet more space for expressing my opinions. Sid insisted that I should write the commentary anyway. So here are some remarks about one of the topics of interest in this book, the search for predictive models of deformation and failure of solids, and the role of nonequilibrium physics in this effort. Like many of my colleagues, I am impatient about the slow rate of progress in theoretical solid mechanics. We theorists have been given great opportunities. Remarkable developments in instrumentation and computation have advanced our knowledge about the atomic-scale behavior of solids far beyond what most of us could have imagined a decade or so ago; and yet it seems to me that our ability to bring that knowledge to bear on practical problems has not kept pace. I blame ourselves – the theorists – for this state of affairs. We have not been quick enough to explore new concepts that might move us from atomistic models and numerical simulations to engineering practice. To bridge this gap between atomistics and structural engineering, it seems almost trivially obvious that we need new phenomenologies. Our goal must be to develop predictive, quantitative and tractable descriptions of an enormously wide range of complex materials and processes. First-principles theories may be necessary to get us started, but they do not take us far enough by themselves, especially if they require that each physical situation be treated separately
as in molecular dynamics simulations or even the most powerful multi-scale analyses. Moving across these length and time scales, learning how the small-scale phenomena fit together to produce complex, larger scale behaviors, is every bit as important and challenging a research goal as is atomic-scale investigation. Phenomenological research of the kind needed here requires physical insight to extract the general principles from the less relevant details and, therefore, it necessarily involves a substantial amount of guesswork. The classic phenomenological models that are most relevant to deformation and failure in solids are Hookean elasticity for static stress analysis and the Navier–Stokes equation for fluid dynamics. In both of those cases, the atomic-scale theories can in principle be used to compute constitutive parameters such as elastic constants or viscosities; and those first-principles calculations are very valuable in themselves because they tell us about the limits of validity of the phenomenological descriptions. Much of the value of phenomenology, however, lies in the fact that it is usually easier and more reliable to obtain the constitutive parameters experimentally. The challenge is to make sure that the phenomenological framework truly captures the essential features of the systems that we need to describe. In elasticity and fluid dynamics, the essential ingredients are Newton’s laws of motion plus continuity and symmetry criteria. Those must be the basic ingredients of a theory of solid plasticity, but they are not sufficient. The Navier–Stokes analogy is particularly relevant to my argument because I want to talk mostly about noncrystalline materials. Deformable amorphous solids are very similar to fluids in all but a few, albeit very important, respects. Although their molecular structures look very much like those of fluids, they support shear stresses, they exhibit stress-driven transitions between jammed and flowing states, and they even exhibit memory effects. Nevertheless, because of their molecular-level similarities, I see no fundamental reason why amorphous solids should not be amenable to a level of analysis roughly similar to that which we use for fluids. Moreover, I suspect that, once we have found a useful way of describing the dynamics of deformation in amorphous solids, we shall be well on our way to a useful description of polycrystalline materials as well. When I make remarks like these in public lectures, I am invariably accused (not always politely) of ignoring the huge body of literature and tradition in plasticity theory. Indeed, conventional approaches to plasticity have been extremely successful in conventional engineering applications; but many of the problems that these theories must now confront – in biological materials, for example – are distinctly unconventional. Textbook treatments of plasticity generally appear in two different forms, one based on the usual formulation of elasticity supplemented by phenomenological stress-strain relations and plastic yield criteria, and another liquid-like approach that focuses on rheological relations between stresses and strain rates, usually with no reference to yield
stresses or the irreversible deformations that occur in low-stress, non-flowing regimes. The hydrodynamic analogy, however, tells us that we should not try to separate these two kinds of descriptions. Deformation or failure in a bounded material under loading is usually a localized phenomenon; plastic flow occurs where the stresses are large while, elsewhere, the stresses are relaxed and the material behaves nearly elastically. The material may harden in some places and soften in others. To be truly useful, therefore, our phenomenological equations of motion must incorporate all of those behaviors, and they must do so in a natural and relatively simple way. What new concepts and analytic tools will be needed in order to develop a unified theory of this kind? How might those techniques differ from the ones we have been using in the past? I shall try to start answering these questions by pointing to some puzzles and internal inconsistencies that persist in the conventional theories. These puzzles include the question of how breaking stresses can penetrate plastic zones near crack tips, and the possibly related question of why brittle fracture becomes dynamically unstable at high speeds. I know of no convincing solution to either of those problems, certainly not for the noncrystalline materials in which the definitive experiments have been performed; and the fact that these apparently simple problems have remained unsolved for such a long time is, by itself, enough to convince me that there is something seriously missing in our theories. Here, however, I would like to focus on a few more basic questions that I think lead us to a better understanding of what the missing ingredients might be. The most elementary and familiar of these questions is: What are the fundamental distinctions between brittle and ductile behaviors? A brittle solid breaks when subjected to a large enough stress, whereas a ductile material deforms plastically. Remarkably, we do not yet have a deep understanding of the distinction between these two behaviors. Conventional theories of crystalline solids say that dislocations form and move more easily through ductile materials than brittle ones, thus allowing deformation to occur in one case and fracture in the other. But the same behaviors occur in amorphous solids; thus the dislocation mechanism cannot be the essential ingredient of all theories. Moreover, the brittleness or ductility of some materials depends upon the speed of loading, which implies that a proper description of deformation and fracture must be dynamic; that is, it must be expressed in the form of equations of motion rather than the conventional static or quasistatic formulations. A second question that I find especially revealing is the following: What is the origin of memory effects in plasticity? Standard, hysteretic, stress-strain curves for deformable solids tell us that these materials – even the simplest amorphous ones – have rudimentary memories. For example, they “remember” the direction in which they most recently have been deformed. When unloaded and then reloaded in the original direction, they harden and respond elastically, whereas, when loaded in the opposite direction, they deform
plastically. The conventional way of dealing with such behavior is to specify rules for how the response to an applied stress is determined by the history of prior loading; but such rules provide little insight about the nature of a theory based more directly on atomic mechanisms. A much better way to deal with memory effects is to introduce internal state variables that carry information about prior history and determine the current response of the system to applied forces. The trick is to identify the relevant variables. I am coming to believe that this is one of the main points at which a gap opens between atomistic understanding and engineering practice. All too often, for example, the plastic strain itself is used as such a state variable. This procedure has its roots in the conventional Lagrangian formulation of solid mechanics, in which deformations of a material are described by displacements relative to fixed reference states. When applied to materials undergoing irreversible plastic deformations, such a procedure violates basic principles of nonequilibrium physics because, if taken literally, it implies that a material somehow must remember its configurations at times arbitrarily far in the past. That cannot be possible for an amorphous solid any more than it is for a liquid, where it is well understood that only displacement rates, and not the displacements themselves, may appear in equations of motion. When a solid undergoes a sequence of loadings and unloadings, bendings and stretchings, the displacement of an element of material from its original position cannot possibly be a physically meaningful quantity; thus it cannot be a sensible way of characterizing the internal state of the system. Nevertheless, the use of the total plastic strain, for example as a “hardening parameter,” appears frequently in the literature on plasticity. What, then, are the appropriate state variables for amorphous solids? My proposed answer to this question starts with the “flow-defect” or “shear-transformation-zone” (STZ) picture of Cohen et al. [1–4], in which plastic deformation occurs only at localized sites where molecules undergo irreversible rearrangements in response to applied stresses. Falk, Pechenik and I, in our paper in this volume (here denoted “FLP”), present a critical analysis of those earlier STZ theories, which I will not repeat in any detail here. On the plus side, these theories nicely satisfy my criteria for sensible phenomenological approaches. Their central ingredient is an internal state variable, i.e., the density of zones, and they generally postulate equations of motion for this density. On the other hand, they make a crucial assumption with which I disagree – that the plastic flow is simply equal to the density of zones multiplied by a stress- and temperature-dependent Eyring rate factor. The interesting behavior, then, is contained in various assumptions about rates of annihilation and creation of zones as functions of temperature and strain rate. Such theories can do reasonably well in accounting for some rheological and calorimetric behaviors of, say, metallic or polymeric glasses. They do not predict yield stresses, however,
nor can they convincingly account for the wide range of dynamic behavior observed recently by Johnson and coworkers in bulk metallic glasses. In the work described in FLP and in earlier papers [5–8], we have extended and modified the original STZ theories in two ways. First, instead of assuming that the STZ’s are structureless objects, we have modeled them as two-state systems; that is, we have assumed that they transform back and forth between two different orientations in response to applied stresses. This two-state picture is inspired by molecular-dynamics simulations [5]. Among other implications, it tells us that we must supplement the scalar density by a tensorial quantity that carries information about the orientations of the zones, and that the actual plastic flow will depend upon those orientations. The appearance of this internal state variable produces entirely new dynamical properties, specifically, jamming behavior when the zones are all aligned with the stress and cannot transform further in the same direction, and a yield stress at which the jammed state starts to flow. Although the STZ’s are structural irregularities that live in an unperturbed material for very long times, they are ephemeral in the sense that they are created and annihilated during irreversible deformations. These annihilation and creation terms play the same roles as those that appear in the original STZ theories. It is here that we have made the second of our basic changes, using a combination of phenomenological guesswork and the constraints imposed by symmetry and the second law of thermodynamics. In particular, we have argued that the simplest possible creation rate is proportional to the rate at which energy is dissipated during plastic deformation, which is necessarily a non-negative scalar quantity. This phenomenological assumption leads us to a rate factor that is substantially different from earlier versions, and which seems free from unphysical features. As described in more detail in FLP, our equations of motion for the STZ populations are best expressed in terms of two dimensionless state variables: a scalar field proportional to the density of STZ’s, and a traceless symmetric tensor field that describes the local orientation of the zones. The full theory is necessarily expressed in Eulerian coordinates, as in fluid dynamics. It consists of equations of motion for these two fields, supplemented by the usual acceleration equation relating the vector flow field $v_i$ to the divergence of the stress $\sigma_{ij}$, and an equation expressing the rate-of-deformation tensor as the sum of elastic and plastic parts. Because these equations refer to solids rather than liquids, they are necessarily more complicated than Navier–Stokes; but they are capable of serving similar purposes. The most unconventional result of this theory is the way in which the yield stress emerges. The equations of motion for an isotropic system have two kinds of steady state solutions at fixed applied stress. One of these solutions is jammed, i.e., non-flowing, and the other is unjammed. The jammed
solution is dynamically stable below some stress (a function of the material parameters) and is unstable above that stress. Conversely, the unjammed solution is unstable below this “yield” stress and stable above it. Thus the conventional yielding criterion is replaced by an exchange of dynamic stability between two branches of steady-state solutions of a set of coupled, nonlinear differential equations. The physical interpretation of this situation is that, at smaller stresses, the two-state STZ’s become saturated in the direction of the stress – the magnitude of the orientational bias reaches a stress-dependent maximum – and the motion stops. At larger stresses, jammed zones are annihilated and unjammed ones created fast enough to sustain steady-state plastic flow. The resulting dynamic version of an STZ theory reproduces a wide range of the phenomena observed in plastically deforming materials. Depending upon the choice of just a small number of material parameters and initial conditions, theoretical stress-strain curves may exhibit work hardening, strain softening (for annealed samples with low initial densities of STZ’s), strain recovery following unloading, Bauschinger effects, necking instabilities, and the like. With the addition of thermal fluctuations that cause spontaneous relaxation of the STZ state variables, the theory quantitatively explains the experimentally observed transition from Newtonian viscosity at small loading to superplastic flow at larger stresses as a transition from thermally assisted creep at small stress to plastic flow at the STZ yield stress. There are also some interesting shortcomings. In the form described in FLP, the theory does not predict the results of calorimetric measurements. Also, like almost all other theories of plasticity, this version of the STZ theory lacks an intrinsic length scale. The theory does show signs of shear banding instabilities; but a complete theory of shear banding will have to predict both the width of the bands and the thickness of the transition region between flowing and jammed material. I shall conclude my remarks by suggesting that these shortcomings may be associated with a second theoretical gap in solid mechanics. I am thinking of the largely unexplored possibility that the statistical physics of a nonequilibrium system such as a deforming solid, even when it is deforming very slowly, may be qualitatively different from that of a system in thermal and mechanical equilibrium. Specifically, I want to raise the possibility that, during irreversible plastic deformation, the slow, configurational degrees of freedom associated with molecular rearrangements may fall out of thermal equilibrium with the fast, vibrational degrees of freedom that couple strongly to a thermal reservoir. The statistical properties of both of these kinds of degrees of freedom may be described by “temperatures”; and the two temperatures may be quite different from one another under nonequilibrium conditions. This kind of effective temperature is formally similar to the free volume introduced by Spaepen and others, and can be used in much the same ways. The important difference is
that the effective temperature is a measure of the configurational disorder in the system. Like ordinary temperature, it is an intensive quantity, and does not necessarily carry any implication of volume changes. It does, however, have clear thermodynamic consequences. Evidence for the existence of well-defined effective disorder temperatures is emerging from recent studies of granular materials, foams, and related systems [9–13]. (So far, most of these studies are based on numerical simulations.) The elementary components of such systems, such as sand grains, are much too large for the ambient temperature to be relevant to their motions. Nevertheless, the use of fluctuation-dissipation relations in conjunction with measurements of diffusion constants, viscosities, stress fluctuations and the like yields estimates of effective temperatures that are nonzero and remarkably consistent with one another. How might the addition of an effective disorder temperature resolve the remaining shortcomings of the STZ equations? In principle, this concept should be closely related to the mechanism that we have postulated for the annihilation and creation of STZ’s. That is, the rate of energy dissipation associated with plastic deformation must also be the heat source for the effective temperature. There also must be a cooling mechanism by which the effective temperature decreases in the absence of driving forces. The latter two effects, which couple the effective temperature to the ordinary bath temperature, should determine the calorimetric properties of the material. Finally, it will be important that this disorder temperature diffuses from hotter to cooler regions of the material; but its diffusion constant must naturally be very much smaller than that for ordinary temperature, because the associated molecular rearrangements are very much slower than thermal vibrations. Higher effective disorder temperatures imply a higher density of STZ’s and thus a higher plastic strain rate at fixed stress, which in turn implies nonlinear amplification of the plastic response to driving forces. Preliminary investigations indicate that the resulting theory predicts experimentally interesting behavior of this kind. The effective thermal enhancement of plastic flow also appears to imply a picture of shear banding in which the flowing material inside the band is “hotter” than the jammed material outside, and the thickness of the boundary between the two regions is determined by the length scale contained in the effective – not the ordinary thermal – diffusion constant. In summary, I think that a dynamic version of the STZ theory has a good chance of closing the gap between atomistics and engineering applications. Essential elements of the theory are the identification of physically meaningful state variables, the choice of rate factors that are consistent with basic principles of nonequilibrium physics, and – perhaps – an effective disorder temperature to account for the fact that the configurational degrees of freedom may fall out of equilibrium with the heat bath in systems undergoing irreversible deformations.
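The exchange-of-stability picture described above can be made concrete with a deliberately minimal caricature of a two-state STZ model. To be clear, the sketch below is not the FLP theory: it is a scalar toy with made-up rate functions. It nevertheless shows the essential behavior: below a critical stress the zone orientations saturate and the plastic rate decays to zero (jamming), while above it creation sustained by dissipation keeps the system flowing.

```python
import numpy as np

def stz_toy(s, t_max=200.0, dt=1e-3):
    """Toy scalar two-state STZ model (a caricature, not the FLP equations).
    n_p, n_m: densities of zones oriented along/against the shear stress s."""
    Rp, Rm = np.exp(s), np.exp(-s)   # made-up stress-biased transition rates
    n_p = n_m = 0.5                  # start from an isotropic, unjammed state
    for _ in range(int(t_max / dt)):
        D = Rp * n_m - Rm * n_p                  # plastic strain rate
        gamma = max(s * D, 0.0)                  # dissipation rate drives creation
        dn_p = Rp * n_m - Rm * n_p + gamma * (0.5 - n_p)
        dn_m = Rm * n_p - Rp * n_m + gamma * (0.5 - n_m)
        n_p, n_m = n_p + dt * dn_p, n_m + dt * dn_m
    return Rp * n_m - Rm * n_p                   # steady-state plastic rate

# Below the exchange of stability (near s ~ 2 in this toy) the zones saturate
# and the flow jams; above it, dissipation-fed creation sustains steady flow.
for s in (0.5, 1.5, 2.5, 3.5):
    print(f"s = {s:.1f}  ->  steady plastic rate {stz_toy(s):.4f}")
```

In this toy the yield stress is simply the point at which a flowing branch of steady solutions first exists; in the full tensorial theory the analogous exchange of stability occurs between the jammed and unjammed solution branches described above.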
References

[1] D. Turnbull and M. Cohen, J. Chem. Phys., 52, 3038, 1970.
[2] F. Spaepen, Acta Metall., 25, 407, 1977.
[3] F. Spaepen and A. Taub, In: R. Balian and M. Kleman (eds.), Physics of Defects, Les Houches Lectures, North Holland, Amsterdam, p. 133, 1981.
[4] A.S. Argon, Acta Metall., 27, 47, 1979.
[5] M.L. Falk and J.S. Langer, Phys. Rev. E, 57, 7192, 1998.
[6] L.O. Eastgate, J.S. Langer, and L. Pechenik, Phys. Rev. Lett., 90, 045506, 2003.
[7] J.S. Langer and L. Pechenik, Phys. Rev. E, 2003.
[8] M.L. Falk, J.S. Langer, and L. Pechenik, Phys. Rev. E, 70, 011507, 2004.
[9] I.K. Ono, C.S. O’Hern, D.J. Durian, S.A. Langer, A. Liu, and S.R. Nagel, Phys. Rev. Lett., 89, 095703, 2002.
[10] L. Cugliandolo, J. Kurchan, and L. Peliti, Phys. Rev. E, 55, 3898, 1997.
[11] P. Sollich, F. Lequeux, P. Hebraud, and M. Cates, Phys. Rev. Lett., 78, 2020, 1997.
[12] L. Berthier and J.-L. Barrat, Phys. Rev. Lett., 89, 095702, 2002.
[13] D.J. Lacks, Phys. Rev. E, 66, 051202, 2002.
Perspective 15

MULTISCALE MODELING OF POLYMERS

Doros N. Theodorou
School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechniou Street, Zografou Campus, 157 80 Athens, Greece
Meeting today’s technological challenges calls for a quantitative understanding of structure-property-processing-performance relations in materials. Developing precisely this understanding constitutes the main objective of materials modeling and simulation. Along with novel experimental techniques, which probe matter at an increasingly finer scale, and new screening strategies, such as high-throughput experimentation, modeling has become an indispensable tool in the development of new materials and products. A challenge faced by materials modelers, which is especially serious in the case of polymeric materials, is that structure and dynamics are characterized by extremely broad spectra of length and time scales, ranging from tenths of nanometers to centimeters and from femtoseconds to years [1]. It is by now generally accepted that the successful solution of design problems involving polymers calls for hierarchical, or multiscale, modeling or simulation involving a judicious combination of atomistic (<10 nm), mesoscopic (10–1000 nm) and macroscopic methods. A rough sketch of how multiscale modeling approaches can be developed for polymeric materials is given in Fig. 1. Quantum mechanical methods can take us from chemical constitution to the bonded geometry, to electronic properties, as well as to potentials describing the interactions between building blocks in a material. Using molecular geometry and potentials as input, statistical mechanics-based theories and molecular simulations can provide estimates of macroscopic thermal, mechanical, rheological, electrical, optical, interfacial, permeability, and other properties, as well as information on equations of state and constitutive relations governing equilibrium and nonequilibrium behavior. In addition, molecular simulations provide a wealth of detailed information on molecular organization and motion and their consequences for properties; this information can be validated against, and used to interpret, contemporary microscopic measurements and is of great value for materials design.
Figure 1. Some methods and interconnections involved in multiscale modeling of polymeric materials [1].
Atomistic simulations are quite limited in terms of the length scales (typically on the order of 10 nm) and time scales (typically on the order of 10–100 ns for molecular dynamics (MD)) they can address. To deal with longer time- and length-scale phenomena, one can resort to mesoscopic simulations. These employ coarse-grained representations of the material, cast in terms of fewer degrees of freedom. They can use as input information derived from atomistic simulations. For example, atomistically calculated chain dimensions, bulk densities, and cohesive energy densities can be used to estimate the radii of gyration, bending energies, and interaction parameters invoked by self-consistent field (SCF) theories or dynamic density functional theories of inhomogeneous polymers; rate constants for elementary jumps, calculated by atomistic transition-state theory (TST), can be fed to kinetic Monte Carlo (KMC) simulations to track diffusion in glassy polymers over micro- or millisecond time scales; and potentials of mean force between coarse-grained moieties, computed atomistically, can be used within Brownian dynamics or dissipative particle dynamics (DPD) simulations. Nonequilibrium mesoscopic simulations are particularly helpful for addressing how the processing conditions imposed on a material affect its morphology and microstructure. Finally, macroscopic calculations, based on the continuum engineering sciences, can derive input from all previous levels of modeling to predict product performance under specific application conditions and address materials and product design issues. Multiscale modeling approaches may be either sequential, involving series of simulations at progressively coarser or finer levels, or parallel, involving simultaneous simulations of phenomena at different length and time scales
and passage of information between the different simulations. An example of a parallel approach is the simultaneous simulation of plastic deformation in a polymeric glass by continuum finite elements and atomistic molecular dynamics [2]; the atomistic simulation is carried out in a small volume embedded in the continuously represented material and provides the fundamental constitutive relation for the continuum deformation, which is tracked with finite element methods.

Efficient sampling of configuration space according to the probability density of specific equilibrium ensembles is a prerequisite for the reliable calculation of thermodynamic properties of melts and solutions. Recent breakthroughs in the design of Monte Carlo (MC) algorithms for polymers have dramatically enhanced our ability to sample configuration space, even for systems of long, entangled chains, where the longest relaxation time scales with the 3.4 power of the chain length for linear architectures. Examples of such algorithms are Configurational Bias MC, which offers itself for phase equilibrium calculations [3], Concerted Rotation MC, the connectivity-altering End-Bridging and Double-Bridging MC [4], and various combinations of the above with each other and with parallel tempering and expanded ensemble schemes. New density-of-states MC algorithms [5] open another promising avenue for bold sampling of configuration space; they have already been applied to single biological macromolecules with considerable success.
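The density-of-states algorithm of Ref. [5] is compact enough to sketch. The random walk below estimates ln g(E) for a small Ising model, a stand-in system chosen only to keep the example short; moves are accepted with probability min(1, g(E_old)/g(E_new)), the visited energy level is boosted by the modification factor f, and f is reduced whenever the energy histogram becomes approximately flat. The lattice size, sweep counts, and flatness threshold are all illustrative choices.

```python
import numpy as np
rng = np.random.default_rng(0)

L = 4                                    # 4x4 Ising model as a toy stand-in system
spins = rng.choice([-1, 1], size=(L, L))

def neighbors_sum(s, i, j):
    return s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]

bond_sum = sum(int(spins[i, j]) * int(neighbors_sum(spins, i, j))
               for i in range(L) for j in range(L))
E = -bond_sum // 2                       # each bond was counted twice above
lnf, lng, hist = 1.0, {}, {}
while lnf > 1e-3:                        # shrink f until ln g(E) has converged
    for _ in range(20000):
        i, j = rng.integers(L, size=2)
        E_new = E + 2 * int(spins[i, j]) * int(neighbors_sum(spins, i, j))
        # Accept with probability min(1, g(E_old) / g(E_new))
        if np.log(rng.random()) < lng.get(E, 0.0) - lng.get(E_new, 0.0):
            spins[i, j] *= -1
            E = E_new
        lng[E] = lng.get(E, 0.0) + lnf   # boost the level just visited
        hist[E] = hist.get(E, 0) + 1
    vals = list(hist.values())
    if min(vals) > 0.8 * (sum(vals) / len(vals)):   # crude flatness criterion
        lnf, hist = 0.5 * lnf, {}
print(f"{len(lng)} energy levels visited; ln g spans {max(lng.values()) - min(lng.values()):.1f}")
```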
For polymers of complex chemical constitution, a promising strategy for sampling configurations involves coarse-graining atomistic models into models cast in terms of fewer degrees of freedom, and therefore governed by smoother potentials of mean force; equilibrating at the coarse-grained level; and reverse-mapping back to the atomistic representation. Approaches for doing this using both lattice-based [6, 7] and continuous-space [8–10] models are under development.

The ability to equilibrate long-chain polymer models at all length scales has opened up promising avenues for predicting polymer melt viscoelastic properties, which are of key importance in processing operations. Recent accomplishments include the computation of segmental friction factors, entanglement tube diameters and zero-shear viscosities by mapping atomistic MD trajectories onto the Rouse and reptation models [1] and the determination of entanglement structures and plateau moduli through direct topological analysis of well-equilibrated melt configurations [11]. Entanglement network-based KMC approaches have already proven useful for the prediction of rheological [12] and mechanical [13] properties from the architecture and length distribution of chains in the bulk and at interfaces. Thus, recent developments generate excellent prospects for connecting these properties rigorously all the way through to atomic-level chemical constitution.

Modeling methods have already emerged as valuable tools for addressing molecular-level design issues. Examples include the computational
investigation of permeability properties of glassy and rubbery polymeric materials for industrial gas separations and packaging applications, based on MD simulations, TST analysis and KMC; the prediction of stable and metastable equilibrium morphologies resulting from self-organization in multicomponent block-copolymer-based systems, including self-adhesive materials, based on field-theoretic methods [14]; the development of new design principles for self-healing nanocomposites or nanocomposites with controlled optical properties, based on a combination of SCF and density functional theory; calculations of phase and microphase separation phenomena under conditions of flow in polymer, block copolymer, and surfactant solutions through dynamic density functional theory [15] and DPD simulations [16]; and computation of intercalation and exfoliation phenomena in clay-filled nanocomposites through coarse-grained MD.

There is still much to be done in establishing rigorous quantitative links between the various levels of simulation invoked in the applications mentioned above. Projection strategies, whereby one can pass from detailed atomistic to coarse-grained descriptions of thermodynamics and dynamics cast in terms of a small number of coarse-grained degrees of freedom or order parameters, and sampling strategies, whereby one can generate ensembles of detailed configurations consistent with a given coarse-grained description, are objects of active current research.

There are several frontier problems involving polymeric materials where methodological breakthroughs in multiscale modeling would be highly desirable. One such problem is polymer crystallization. Prediction of the correct crystal structure from chemical constitution requires the use of refined atomistic models; yet the long time scales of nucleation phenomena and the intricate, hierarchical, processing-history-dependent semicrystalline morphologies (e.g., spherulites, axialites) obtained from polymer melts can only be addressed at a mesoscopic level. At the latter level, recent coarse-grained MD simulations [17] appear promising for understanding, at least qualitatively, nucleation and growth, melting and crystallization phenomena. Another frontier problem is the generation of polymer glasses with a formation history that is both realistic and well-defined. MD cooling from the melt is restricted to cooling rates in excess of $10^9$ K s$^{-1}$, which are many orders of magnitude higher than those encountered in most applications. On the other hand, generating molecular packings at glassy densities through energy minimization and MD techniques may give satisfactory results for many purposes, but is difficult to map onto a well-defined, experimentally realizable vitrification history. Given the intense interest in and research talent devoted to multiscale modeling of polymers worldwide, there is every indication that challenges such as these will soon be overcome, and that computer-aided molecular design will steadily gain ground in future materials science and engineering.
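The TST-to-KMC route invoked in the permeability applications above, in which atomistically computed jump rates drive long-time diffusion, reduces to a short Gillespie-type loop. The jump network and the Arrhenius-like rate spread below are placeholders; in a real study the rate constants would come from atomistic transition-state theory for a specific penetrant-polymer pair.

```python
import numpy as np
rng = np.random.default_rng(1)

# Placeholder jump network: sorption sites on a ring, with made-up TST rates (s^-1)
n_sites = 20
rates = {i: {j: 1e9 * float(np.exp(-rng.uniform(2, 8)))
             for j in ((i - 1) % n_sites, (i + 1) % n_sites)}
         for i in range(n_sites)}

def kmc_trajectory(start, n_steps):
    """Gillespie (n-fold way) walk over the jump network; returns (times, sites)."""
    t, site = 0.0, start
    times, sites = [0.0], [start]
    for _ in range(n_steps):
        neighbors, ks = zip(*rates[site].items())
        k_tot = sum(ks)
        t += -np.log(rng.random()) / k_tot                      # exponential waiting time
        site = int(rng.choice(neighbors, p=np.array(ks) / k_tot))  # pick jump by rate
        times.append(t)
        sites.append(site)
    return np.array(times), np.array(sites)

times, sites = kmc_trajectory(start=0, n_steps=10000)
print(f"reached t = {times[-1]:.3e} s after 10,000 jumps")
```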
References

[1] D.N. Theodorou, “Understanding and predicting structure-property relations in polymeric materials through molecular simulations,” Mol. Phys., 102, 147–166, 2004.
[2] J.S. Yang, W.H. Jo, S. Santos, and U.W. Suter, “Plastic deformation in bisphenol-A polycarbonate: applying an atomistic-continuum model,” In: M. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, 2004.
[3] T.S. Jain and J.J. de Pablo, “Configurational bias techniques for simulation of complex fluids,” In: M. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, 2004.
[4] D.N. Theodorou, “Variable connectivity Monte Carlo algorithms for the atomistic simulation of long-chain polymer systems,” In: P. Nielaba, M. Mareschal, and G. Ciccotti (eds.), Bridging Time Scales: Molecular Simulations for the Next Decade, Springer-Verlag, Berlin, pp. 69–128, 2002.
[5] F.G. Wang and D.P. Landau, “Efficient, multiple-range random walk algorithm to calculate the density of states,” Phys. Rev. Lett., 86, 2050–2053, 2001.
[6] J. Baschnagel, K. Binder, and W. Paul, “On the construction of coarse-grained models for linear flexible polymer chains: distribution functions for groups of consecutive monomers,” J. Chem. Phys., 95, 6014–6025, 1991.
[7] R.F. Rapold and W.L. Mattice, “Introduction of short and long range energies to simulate real chains on the 2nnd lattice,” Macromolecules, 29, 2457–2466, 1996.
[8] W. Tschöp, K. Kremer, J. Batoulis, T. Bürger, and O. Hahn, “Simulation of polymer melts I. Coarse-graining procedures for polycarbonates,” Acta Polymerica, 49, 61–74, 1998.
[9] W. Tschöp, K. Kremer, O. Hahn, J. Batoulis, and T. Bürger, “Simulation of polymer melts II. From coarse-grained models back to atomistic description,” Acta Polymerica, 49, 75–79, 1998.
[10] D. Reith, M. Pütz, and F. Müller-Plathe, “Deriving effective mesoscale potentials from atomistic simulations,” J. Comput. Chem., 24, 1624–1636, 2003.
[11] R. Everaers, S.K. Sukumaran, G.S. Grest et al., “Rheology and microscopic topology of entangled polymeric liquids,” Science, 303, 823–826, 2004.
[12] M. Doi and J. Takimoto, “Molecular modeling of entanglement,” Philos. Trans. R. Soc. A, 361, 641–650, 2003.
[13] A.F. Terzis, D.N. Theodorou, and A. Stroeks, “Entanglement network of the polypropylene/polyamide interface 3. Deformation to fracture,” Macromolecules, 35, 508–521, 2002.
[14] G.H. Fredrickson, V. Ganesan, and F. Drolet, “Field-theoretic computer simulation methods for polymers and complex fluids,” Macromolecules, 35, 16–39, 2002.
[15] A.V. Zvelindovsky, G.J.A. Sevink, and J.G.E.M. Fraaije, “Dynamic mean-field DFT approach to morphology development,” In: M. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, 2004.
[16] W.K. den Otter and J.H.R. Clarke, “Simulation of polymers by dissipative particle dynamics,” In: M. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, 2004.
[17] H. Meyer and F. Müller-Plathe, “Formation of chain-folded structures in supercooled polymer melts examined by MD simulations,” Macromolecules, 35, 1241–1252, 2002.
Perspective 16

HYBRID ATOMISTIC MODELLING OF MATERIALS PROCESSES

Mike Payne,1 Gábor Csányi,2 and Alessandro De Vita3
1 Cavendish Laboratory, University of Cambridge, UK
2 Cavendish Laboratory, University of Cambridge, UK
3 King’s College London, UK, and Center for Nanostructured Materials (CENMAT) and DEMOCRITOS National Simulation Center, Trieste, Italy
Hybrid atomistic modelling schemes aim to combine the lengthscale and timescale capabilities of simulations performed using empirical potentials with the accuracy of first principles calculations. However, there are considerable challenges to developing a hybrid atomistic modelling scheme that can describe materials processes. In this article, we outline some of these challenges and describe a scheme we have developed that overcomes several of them.
1. Materials Simulations
Atomistic simulations using empirical potentials have gained an increasingly important role in the understanding of materials processes. However, empirical potentials are unable to capture some of the most basic features of the formation and breaking of atomic bonds, which can only be described accurately using first principles quantum mechanical techniques. First principles density functional calculations can presently be performed for system sizes up to a thousand atoms. Such calculations provide the capability of predicting many materials properties, one example being the prediction of superhard materials [1]. One obstacle to further progress in first principles simulations is that the computational time for conventional density functional theory calculations scales as the cube of the number of atoms in the system. We expect that over the next five years these approaches will be replaced with linear scaling techniques which will make it possible to perform quantum mechanical calculations on systems containing many tens of thousands of atoms for static problems, and thousands of atoms for dynamical simulations.
However, even these system sizes are far too small to study materials processes such as crack propagation which involve very strong coupling between atomistic effects, such as the breaking of a bond at the crack tip, and the long range elastic fields surrounding the crack. This coupling between short range quantum mechanical processes and long range elastic fields is common to many problems and appears to be beyond any conceivable future capability of first principles calculations alone, at least within a reasonable cost.
2. Hybrid Modelling Schemes
Hybrid modelling schemes provide a natural approach for dealing with the demands of including both short range chemistry and long range elasticity in a single simulation. A general hybrid modelling approach is schematically illustrated in Fig. 1, in this case applied to a system containing two cracks. Over most of the system the distortion of the material due to the elastic fields is extremely small and can be accurately described by a continuum description of the material or using a simple atomistic model based on empirical potentials. In Fig. 1, we actually show both descriptions, with the regions described using empirical potentials closer to the crack tips and a continuum region further away from the crack tips, where the distortion of the material is smaller.
Figure 1. Schematic illustration of a hybrid modelling scheme, showing quantum, empirical atomistic, and continuum regions.
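A hedged sketch of the partitioning step implied by Fig. 1: atoms are assigned to the quantum, empirical-atomistic, or continuum description purely by distance from the nearest crack tip. Real schemes use more sophisticated criteria; the radii and geometry here are arbitrary placeholders.

```python
import numpy as np
from collections import Counter

def assign_regions(positions, tips, r_qm=10.0, r_emp=50.0):
    """Label each atom 'qm', 'empirical' or 'continuum' by its distance to
    the nearest crack tip (radii in Angstrom are arbitrary placeholders)."""
    labels = []
    for x in positions:
        d = min(np.linalg.norm(x - tip) for tip in tips)
        labels.append("qm" if d < r_qm else "empirical" if d < r_emp else "continuum")
    return labels

# Example: 1000 random atoms in a 200 A box with two crack tips
pos = np.random.rand(1000, 3) * 200.0
tips = [np.array([50.0, 100.0, 100.0]), np.array([150.0, 100.0, 100.0])]
print(Counter(assign_regions(pos, tips)))
```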
Irrespective of the description of the material away from the crack tips, the region in the vicinity of each crack tip, where the bond breaking processes will occur, must be described quantum mechanically. For the purposes of this article we shall concentrate primarily on purely atomistic hybrid modelling schemes in which a region described quantum mechanically is combined with a region described using empirical interatomic potentials. Many approaches are being developed which embed a region described quantum mechanically within a region described empirically. One class consists of the so-called QM/MM techniques, which are being developed primarily for the study of biological systems [2]. In these approaches the active site of a protein would be described quantum mechanically while the remainder of the system is described using one of the standard potentials developed specifically for proteins.
3. Challenges of Materials Simulations
There are many ways that the results of first principles calculations may be used in simulations of materials processes on lengthscales and timescales that are far beyond those accessible to first principles techniques. One approach is to calculate some of the parameters that are relevant to the materials processes, such as surface energies or activation barriers, using first principles calculations and then to use these parameters in the large scale simulations, for instance in Bulatov et al.’s modelling of dislocation motion in silicon [3]. It has also become quite common to use the results of first principles calculations to construct the empirical potentials that will be used in the large scale atomistic simulations of materials processes, so that almost all modern potentials are fitted to reproduce data which is, at least in part, calculated. One can perform the first principles calculations on atomic configurations that are similar to those that will be encountered in the materials process, so that the resulting empirical potential is more likely to give an accurate description of the process being studied. Such approaches to studying materials processes are hierarchical approaches, in which simulations for larger lengthscales and/or longer timescales are performed using parameters obtained from more accurate simulations performed over smaller lengthscales and timescales. While this approach can be successful in many cases, as mentioned previously, the hierarchical approach breaks down when there is a complex interplay between phenomena at different lengthscales, as in the case of crack propagation. Such problems need large numbers of atoms to correctly describe the long range elastic fields but also require quantum mechanical accuracy to correctly describe the bond breaking processes, which however take place as a direct consequence of the near-singular stress enhancement at the crack tip. For these problems it appears that the only way of modelling these processes is to use hybrid schemes.
The application of hybrid modelling techniques to study materials processes is much more challenging than their application to biological systems. The active site of a protein does not move its position within the protein, and so the region to be treated quantum mechanically within a QM/MM simulation does not change with time. Unfortunately, as illustrated schematically in Fig. 2, this is not true in the case of most materials processes, and so any hybrid modelling scheme for such systems must be able to adjust the region treated quantum mechanically in response to the dynamical evolution of the system. A further challenge, also illustrated schematically in Fig. 2, is that it is often not clear which regions of the material should be treated quantum mechanically, since one cannot know in advance where or when a new crack may be initiated in the material in response to the continuous changes that take place in the material during the simulation. What these simplistic illustrations do not make clear is that, realistically, both of these issues must be addressed within the hybrid modelling scheme. It would be totally impossible to run a huge simulation of a complex materials process if one had to study the system after each step in the simulation in order to decide which regions should be treated quantum mechanically at the next step. However, choosing automatically which atoms must be treated quantum mechanically is beyond the capability of virtually every hybrid modelling scheme currently in use. It is, however, within the capability of a scheme originally proposed by De Vita and Car [4].
Figure 2. Challenges for hybrid modelling schemes for materials simulations. These schemes must be able to follow dynamically evolving systems and to identify which regions must be treated quantum mechanically. (Regions labelled: quantum, empirical atomistic, continuum; a "?" marks a region whose required treatment is not known in advance.)
4. The Learn-on-the-fly Scheme
The scheme proposed by De Vita and Car takes a very different approach to the problem of hybrid modelling. In contrast to other schemes, in which atoms are either in a quantum mechanical region or in a region described by empirical potentials, in their scheme all the atoms in the system are described using empirical potentials. However, the parameters in the empirical potential are updated on an atom-by-atom basis, where and when necessary, during the simulation. A schematic illustration of this scheme is shown in Fig. 3. The updating of the interatomic potentials is carried out by performing accurate quantum mechanical calculations to determine the correct forces on the atoms in each critical region of the material and then updating the parameters of the potentials for all the atoms in this region to reproduce these quantum mechanical forces. As the parameters are updated "on the fly", we have named this approach "Learn-on-the-Fly", or "LOTF". In the LOTF approach every atom in the system can have a different interatomic potential, as the parameters in the potential are assigned on an atom-by-atom basis. Furthermore, the potential of each atom can vary with time. For instance, in the case of crack propagation, an atom would initially be described by a simple potential that worked well for the bulk material. While the atom was in the vicinity of the crack tip, the atomic potential would be updated regularly, so that the potential is able to accurately describe the behaviour of this atom in its highly distorted and rapidly varying environment.
Figure 3. The Learn-on-the-Fly scheme. Atoms are represented by empirical potentials with parameters determined by quantum mechanical calculations.
Finally, when the crack has passed the atom, the empirical potential evolves to accurately describe an atom on the surface of the material, and at this point the potential no longer needs to be routinely updated until the next major event takes place in the vicinity of this atom.

The real test of the LOTF scheme is whether it can reproduce the results of an accurate quantum mechanical calculation despite using empirical interatomic potentials. To test this point we have calculated the self-diffusion rates for a vacancy in silicon using systems containing 64 atoms. This size is small enough for us to perform one set of simulations using tight-binding calculations for the entire system. We also performed a simulation using the Stillinger–Weber potential [5]. Finally, we performed a simulation using LOTF with a Stillinger–Weber type potential, where the potential parameters were fitted to the forces calculated using the tight-binding method for a number of atoms in the immediate vicinity of the defect. The results of these calculations are shown in Fig. 4. In each case the LOTF scheme gives values for the self-diffusion constants which are consistent with those generated when the tight-binding scheme is applied to the entire system.
Figure 4. Self-diffusivity of a vacancy in a 64-atom bulk silicon cell as a function of inverse temperature (log10 D [cm²/sec] versus 1000/T [K]). Diamonds and squares show the fully quantum mechanical (tight-binding) and fully classical (Stillinger–Weber) results, respectively, and triangles (and dashed line) represent the results of the hybrid simulation, which agree with the tight-binding reference within statistical error.
It can also be seen that the diffusion constants calculated using the fixed original Stillinger–Weber potential are incorrect. This figure demonstrates the ability of LOTF to inherit the accuracy of the quantum mechanical method used to generate the accurate forces on the atoms. This ability has been reproduced in every test of the method we have performed to date. It should be emphasized that LOTF will work with any form of empirical potential, even extremely unphysical forms. However, if very unphysical forms are used, then the parameters will have to be updated more frequently in order to continue to reproduce the results of the underlying quantum mechanical calculations.

To perform a LOTF simulation, one sets up the initial configuration of the system to be studied and chooses the empirical potential and the quantum mechanical scheme to be used to compute the accurate forces. There is, in principle, no reason why only one potential or quantum mechanical scheme has to be used. It could be more than one, and it could even change during the simulation. Then one selects the criteria that will be used to select the atoms whose potentials will be updated. During the simulation, when these criteria are met, the forces on these atoms and their neighbours are calculated quantum mechanically and the parameters describing the empirical potentials of these atoms are adjusted to reproduce the quantum mechanical forces on these atoms. We have found that it is important to use time-averaged atomic positions when selecting these atoms. If one selects on the basis of instantaneous atomic configurations, an enormous computational effort is expended following large thermal fluctuations of individual atoms that do not actually affect the materials process being studied. The selection criteria to be applied could include changes in interatomic distance above a chosen threshold, or changes in the number of nearest neighbours within a specific distance. One of the challenges of successfully applying LOTF to any particular problem will be the choice of suitable criteria that correctly identify the critical atomistic processes that give rise to the materials property one wishes to study. However, we emphasize that once these criteria are chosen, the simulation can proceed with no further user intervention.
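The workflow just described can be condensed into a sketch. The following is our illustration, not the authors' code: a toy one-dimensional chain in which qm_forces is a hypothetical stand-in for the expensive quantum mechanical call, selection uses time-averaged positions and a bond-strain threshold, and the parameter update is a deliberately crude force-matching step.

```python
import numpy as np

# Toy 1D chain standing in for a full 3D LOTF simulation (illustrative only).
n, r0, dt, alpha = 32, 1.0, 0.02, 0.1
x = np.arange(n, dtype=float) * r0
x[n // 2:] += 0.2                  # stretch one bond to create a "critical" region
v = np.zeros(n)
k = np.ones(n)                     # per-atom potential parameter, updated on the fly
x_avg = x.copy()                   # time-averaged positions used for selection

def empirical_forces(x, k):
    # Nearest-neighbour harmonic forces; each bond uses the mean of the two
    # atoms' parameters, so every atom effectively has its own potential.
    s = np.diff(x) - r0
    kb = 0.5 * (k[:-1] + k[1:])
    f = np.zeros_like(x)
    f[:-1] += kb * s
    f[1:] -= kb * s
    return f

def qm_forces(x, idx):
    # Hypothetical stand-in for the expensive quantum mechanical calculation,
    # here an anharmonic reference force evaluated only where requested.
    s = np.diff(x) - r0
    f = np.zeros_like(x)
    f[:-1] += s + 2.0 * s**2
    f[1:] -= s + 2.0 * s**2
    return f[idx]

def select_atoms(x_avg, threshold=0.05):
    # Selection criterion: atoms adjacent to a bond whose *time-averaged*
    # strain exceeds a threshold (a neighbour-count change works similarly).
    strained = np.abs(np.diff(x_avg) - r0) > threshold
    flag = np.zeros(len(x_avg), dtype=bool)
    flag[:-1] |= strained
    flag[1:] |= strained
    return np.flatnonzero(flag)

for step in range(2000):
    x_avg = (1.0 - alpha) * x_avg + alpha * x      # running time average
    idx = select_atoms(x_avg)
    if idx.size:
        f_ref = qm_forces(x, idx)
        f_emp = empirical_forces(x, k)[idx]
        # Crude one-parameter "learning" update per selected atom, nudging
        # the empirical force toward the quantum mechanical reference.
        k[idx] *= np.clip((np.abs(f_ref) + 1e-8) / (np.abs(f_emp) + 1e-8), 0.9, 1.1)
    f = empirical_forces(x, k)
    v += dt * f                                     # unit mass, symplectic Euler
    x += dt * v

print("updated parameters near the stretched bond:", np.round(k[n//2-2:n//2+3], 3))
```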
The major motivation for developing hybrid modelling schemes has generally been the wish to simulate ever larger systems. The same motivation lies behind attempts to develop linear scaling first principles calculations. However, increasing system size brings a further set of difficulties to simulation. These are primarily associated with the increased size and complexity of the phase space associated with the system. In general, the larger a physical system, the larger the number of local energy minima in the potential energy surface and hence the slower the physical processes that occur within it. Without techniques for vastly accelerating the search through phase space, larger systems will require intractably longer simulations. Until recently, this problem was addressed by very few researchers, but this situation now appears to be changing. LOTF does not directly address this problem of the complexity of the phase space of large systems. However, it is worth pointing out that the scheme allows simulations with quantum mechanical accuracy to be performed for simulation times which are normally only accessible to empirical simulations, routinely many nanoseconds or even microseconds. In contrast, due to fundamental limitations of first principles molecular dynamics approaches, hybrid techniques which explicitly include a large quantum mechanical region may be restricted to the picosecond timescale.
5. Outlook
LOTF appears to deal successfully with the hybrid modelling problem at the atomistic scale. But, as can be seen in Fig. 1, there is the further problem of matching the atomistic region to the continuum. In fact, this problem was addressed more than ten years ago by Kohlhoff, Gumbsch and Fischmeister in their simulations of crack propagation in metals [6]. Their method has yet to be tested on a system containing a number of interacting defects.

At present, our LOTF scheme can only deal with short-range interatomic potentials. We still need to extend the technique to deal with systems in which there are long-range electrostatic interactions, but this could be done in a way similar to most current QM/MM schemes. All the LOTF simulations performed to date have used very simple empirical potentials. We are currently investigating the best way to augment much more complicated classical potentials. This will be crucial for applying LOTF to biological systems. It would seem that changing the parameters of such elaborate potentials is technically difficult, and a better route is to retain a very simple functional form and tune its parameters to reproduce the forces of the empirical potential in most regions, and of the quantum mechanical model in the activated regions. This would make the scheme even more general than described above, capable of using a range of "black box" potentials (of varying degrees of accuracy), each used in its assigned region of the system.
References
[1] A.Y. Liu and M.L. Cohen, "Prediction of new low compressibility solids," Science, 245, 841–842, 1989.
[2] J. Gao, "Methods and applications of combined quantum mechanical and molecular mechanical potentials," In: K.B. Lipkowitz and D.B. Boyd (eds.), Reviews in Computational Chemistry, VCH Publishers, New York, vol. 7, pp. 119–185, 1995.
[3] V.V. Bulatov, J.F. Justo, W. Cai et al., "Parameter-free modelling of dislocation motion: the case of silicon," Philos. Mag. A, 81, 1257–1281, 2001.
[4] A. De Vita and R. Car, "A novel scheme for accurate MD simulations of large systems," Mat. Res. Soc. Symp. Proc., 491, 473–480, 1998.
[5] F.H. Stillinger and T.A. Weber, "Computer simulation of local order in condensed phases of silicon," Phys. Rev. B, 31, 5262–5271, 1985.
[6] S. Kohlhoff, P. Gumbsch, and H.F. Fischmeister, "Crack propagation in BCC crystals studied with a combined finite-element and atomistic model," Phil. Mag. A, 64, 851–878, 1991.
Perspective 17
THE FLUCTUATION THEOREM AND ITS IMPLICATIONS FOR MATERIALS PROCESSING AND MODELING
Denis J. Evans
Research School of Chemistry, Australian National University, Canberra, ACT, Australia
Thermodynamics describes the framework within which all macroscopic processes operate. Until the discovery of the Fluctuation Theorem [1], there was no equivalent framework for small (nano) systems observed for short times. The Fluctuation Theorem provides a generalisation of the Second Law of thermodynamics that applies to finite systems observed over finite times. The Second Law of thermodynamics states that for all macroscopic processes the total entropy of the Universe can only increase. The Fluctuation Theorem says that, for finite systems, the ratio of the probability that the entropy decreases over a finite time to the probability that it increases vanishes exponentially with system size and observation time. Thus in the so-called "thermodynamic limit", the entropy can only increase and we obtain the Second Law.

The Fluctuation Theorem places limits on the operation of nanomachines and biological processes taking place in small organelles. The Theorem states that as "engines" are made ever smaller, the probability that they will operate thermodynamically in reverse grows, since this probability decays exponentially with the size of the system and the duration of operation.

The Fluctuation Theorem also resolves another paradox. All the laws of mechanics (quantum or classical) are time reversible. If you look at the motion of the planets of the solar system, orbiting the sun, then if all the motions are reversed in time, the resulting motion is still a valid solution of the laws of mechanics. However, when the system involves billions upon billions of interacting molecules, say in a glass of water or a waterfall, thermodynamics says the motion can only occur in the direction which increases the total entropy! This is in spite of the fact that even for large systems (like a glass of water or a waterfall), the laws of mechanics are, as always, completely time reversible. The first satisfactory mathematical proof of a Fluctuation Theorem was given in [2].
This proof assumes the initial distribution of microstates is known and computes the relative probabilities of changes in entropy by explicitly using the time reversal symmetry of the equations of motion and the assumption of causality, namely that probabilities of observing final microstates can be computed from the probabilities of observing the initial states from which the final states are generated.
1. The Fluctuation Theorem(s) in More Detail
Because the Fluctuation Theorem (FT) deals with fluctuations, we expect that there will be different versions of the FT for systems at constant energy, constant temperature, constant volume, constant pressure, etc. A recent experiment performed to confirm the FT [3] serves as an instructive example. A single transparent colloid particle was trapped in the harmonic force field of a focused laser beam – a so-called optical trap. The colloid particle is in rather obvious contact with a heat bath: namely the surrounding water solvent. Wang et al. allowed the system to come to equilibrium; then, at an arbitrary zero time, the optical trap was suddenly translated at fixed velocity relative to the solvent. The resulting transient motion of the colloid particle was monitored as the particle responded to the sudden motion of the trap. This same experiment was repeated several hundreds of times. The transient FT for this system considers the quantity

$$\bar{\Omega}_t = (t k_B T)^{-1} \int_0^t ds\; \mathbf{v}_{\mathrm{opt}} \cdot \mathbf{F}_{\mathrm{opt}}(s) \qquad (1)$$
where $\mathbf{F}_{\mathrm{opt}}(t)$ is the optical force on the trapped particle, $\mathbf{v}_{\mathrm{opt}}$ is the (constant) velocity of the optical trap and $k_B T$ is Boltzmann's constant times the absolute temperature of the solvent. From its definition (1), we can see that $\bar{\Omega}_t$ is a quantity that is recognizable as the time average of the rate of entropy absorption by the solvent. In thermodynamic terms it is the time average of the work divided by the temperature of the solvent. For thermostatted dissipative systems, the argument of the FT, $\bar{\Omega}_t$, is always something that is recognizable as a rate of entropy absorption [4]. However, for more general systems the argument of the FT, the dissipation function, is not directly recognizable as an entropy production/absorption. The general expression for the dissipation function that is valid for arbitrary ergodically consistent combinations of initial ensemble and dynamics can be found in Evans and Searles, 2002 [4].

For the experiment of Wang et al., the transient FT makes a very simple prediction for the ratio of probabilities of observing complementary values of $\bar{\Omega}_t$, the time-averaged entropy production:

$$\frac{\Pr(\bar{\Omega}_t = A)}{\Pr(\bar{\Omega}_t = -A)} = \exp(At). \qquad (2)$$
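Prediction (2) is easy to probe numerically. The sketch below is ours, not the analysis of Wang et al.: it integrates an overdamped Langevin model of the dragged trap in reduced units with arbitrary parameter values; up to time-step and sampling error, the printed ln[P(+A)/P(−A)] values should track A·t.

```python
import numpy as np

rng = np.random.default_rng(1)
kT, k, gamma, v = 1.0, 1.0, 1.0, 2.0        # arbitrary reduced units
dt, nstep, ntraj = 1e-3, 500, 200000
t = nstep * dt

# Equilibrium initial condition in the stationary trap (variance kT/k).
x = rng.normal(0.0, np.sqrt(kT / k), ntraj)
W = np.zeros(ntraj)                          # accumulates the integral of v*F_opt

for i in range(nstep):
    x_trap = v * i * dt                      # trap translated at fixed velocity
    F = -k * (x - x_trap)                    # optical force on each particle
    W += v * F * dt
    x += (F / gamma) * dt + rng.normal(0.0, np.sqrt(2 * kT * dt / gamma), ntraj)

omega = W / (kT * t)                         # time-averaged dissipation, Eq. (1)

# Compare ln[P(A)/P(-A)] against A*t, as predicted by Eq. (2).
edges = np.linspace(0.0, 3.0 / t, 13)
for lo, hi in zip(edges[:-1], edges[1:]):
    p_pos = np.mean((omega > lo) & (omega <= hi))
    p_neg = np.mean((omega < -lo) & (omega >= -hi))
    if p_pos > 0 and p_neg > 0:
        A = 0.5 * (lo + hi)
        print(f"A*t = {A * t:5.2f}   ln[P(+A)/P(-A)] = {np.log(p_pos / p_neg):5.2f}")
```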
The original paper of Wang et al. [3] experimentally confirmed the integrated form of (2), namely,

$$\frac{\Pr(\bar{\Omega}_t < 0)}{\Pr(\bar{\Omega}_t > 0)} = \left\langle \exp(-\bar{\Omega}_t t) \right\rangle_{\bar{\Omega}_t > 0}. \qquad (3)$$
A direct check of the validity of the transient FT itself was recently performed for a simpler experiment. In this experiment, an equilibrium ensemble of trapped particles was subject to a step function change in the strength of the optical trap. In this case the dissipation function appearing in the transient FT is not directly related to entropy production. Instead the dissipation function $\bar{\Omega}_t$ is

$$\bar{\Omega}_t = (t k_B T)^{-1} (k_0 - k_1) \int_0^t ds\; \mathbf{r}(s) \cdot \dot{\mathbf{r}}(s) = (2 t k_B T)^{-1} (k_0 - k_1) \left( r(t)^2 - r(0)^2 \right) \qquad (4)$$
where $k_0$, $k_1$ are the initial and final values of the spring constant for the optical trap and $\mathbf{r}(t)$ is the vector position of the colloid particle relative to the centre of the optical trap. From its definition (4) we can see the dissipation function is like entropy production in that its ensemble average is always expected to be positive. For this system the transient FT states that

$$\frac{\Pr(\bar{\Omega}_t = A)}{\Pr(\bar{\Omega}_t = -A)} = \exp(At). \qquad (5)$$
This prediction was confirmed in the experiments of Carberry et al., 2004 [5]. A recent review of the theoretical status of the FT has been published by Evans and Searles, 2002 [4].

Fluctuation Theorems are extraordinarily general. There are stochastic versions of the FT, and the FT is completely consistent with Langevin dynamics [6]. There are quantum versions of the FT [7, 8]. The theorem can be used to derive Green–Kubo relations for linear transport coefficients and the Fluctuation Dissipation Theorem [9]. However, it is more general than either of these two relations, since the FT applies to the nonlinear regime far from equilibrium where Green–Kubo and Fluctuation Dissipation relations fail.

The transient FT can be applied to nonequilibrium paths that connect two equilibrium states. When this is done, the FT can be used to derive new expressions for free energy differences between equilibrium states in terms of sums over all nonequilibrium path integrals which connect those two equilibrium states [10–14]. We expect that over the next decade further extensions of the Fluctuation Theorem will be explored. The practical utility of these nonequilibrium free energy expressions is yet to be properly assessed. However, the mere fact that equilibrium free energy differences can be related to nonequilibrium path integrals is quite surprising.
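For orientation (our addition), the best known of these expressions is the Jarzynski equality [10, 11]:

$$\left\langle e^{-W/k_B T} \right\rangle = e^{-\Delta F/k_B T},$$

where $W$ is the work performed along a nonequilibrium path that begins in equilibrium, $\Delta F$ is the equilibrium free energy difference between the initial and final states, and the average is taken over repetitions of the path.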
So one hundred years after Boltzmann, I think we can finally say to our students that we understand how macroscopic irreversibility arises from reversible microscopic dynamics.
References
[1] D.J. Evans, E.G.D. Cohen, and G.P. Morriss, "Probability of second law violations in shearing steady states," Phys. Rev. Lett., 71, 2401–2402, 1993.
[2] D.J. Evans and D.J. Searles, "Equilibrium microstates which generate second law violating steady states," Phys. Rev. E, 50, 1645–1648, 1994.
[3] G.M. Wang, E.M. Sevick, E. Mittag et al., "Experimental demonstration of violations of the second law of thermodynamics for small systems and short time scales," Phys. Rev. Lett., 89, 050601, 2002.
[4] D.J. Evans and D.J. Searles, "The fluctuation theorem," Adv. Phys., 51, 1529–1585, 2002.
[5] D.M. Carberry et al., "Fluctuations and irreversibility: an experimental demonstration of a second-law-like theorem using a colloidal particle held in an optical trap," Phys. Rev. Lett., 92, 140601, 2004.
[6] J.C. Reid et al., "Reversibility in non-equilibrium trajectories of an optically trapped particle," Phys. Rev. E, 70, 016111, 2004.
[7] J. Kurchan, "A quantum fluctuation theorem," arXiv:cond-mat/0007360, 2001.
[8] T. Monnai and S. Tasaki, "Quantum correction of fluctuation theorem," J. Phys. A: Math. Gen., 37, L75–L79, 2004.
[9] D.J. Evans, D.J. Searles, and L. Rondoni, "On the application of the Gallavotti–Cohen fluctuation relation to thermostatted steady states near equilibrium," arXiv:cond-mat/0312353, 2004.
[10] C. Jarzynski, "Nonequilibrium equality for free energy differences," Phys. Rev. Lett., 78, 2690–2693, 1997.
[11] C. Jarzynski, "Equilibrium free energy differences from nonequilibrium measurements: a master equation approach," Phys. Rev. E, 56, 5018–5035, 1997.
[12] G.E. Crooks, "Entropy production fluctuation theorem and nonequilibrium work relation for free energy differences," Phys. Rev. E, 60, 2721–2726, 1999.
[13] G.E. Crooks, "Path-ensemble averages in systems driven far from equilibrium," Phys. Rev. E, 61, 2361–2366, 2000.
[14] D.J. Evans, "A non-equilibrium free energy theorem for deterministic systems," Mol. Phys., 101, 1551–1554, 2003.
Perspective 18
THE LIMITS OF STRENGTH
J.W. Morris, Jr.
Department of Materials Science and Engineering, University of California, Berkeley
In the usual case the strength of a crystalline material is determined by the motion of defects such as dislocations or cracks that are present within it. Materials scientists control strength by modifying the microstructure of the material to eliminate defects or flaws and inhibit the motion of dislocations. There is, however, an ultimate limit to the strength that can be obtained in this way. The mechanical stresses that are not relieved by plastic deformation or fracture are supported by elastic deformation, which is, essentially, the stretching of the interatomic bonds. These bonds have finite strength. There is a value of the stress at which bonding itself becomes unstable and the material must fracture or deform, whatever its microstructure. This elastic instability sets an upper bound on mechanical strength that cannot be exceeded, however creative a scientist may be.

There are several practical reasons to be interested in the limit of elastic stability [1]. First, elastic instability defines the ideal strength [2–4], and it is useful to know the highest strength a particular material could possibly have. This is particularly true in times when new concepts in materials, such as nanomaterials, have led to optimistic predictions that have not always been vetted against the limits nature has set. Second, the elastic limit is reached, or at least closely approached, in a number of experimental situations. A familiar example is deformation via stress-induced phase transformations, as in certain austenitic steels. However, even normal, ductile metals seem to approach the limit of strength in nanoindentation experiments, and stronger alloys may also do so in the region of stress concentration ahead of a sharp crack. Third, elastic instability is one of the few problems in solid mechanics that can actually be solved ab initio. Existing pseudopotential codes are capable of following elastic deformation to the point of instability with reasonable accuracy, and a number of calculations have been done [1, 5–9]. Fourth, as we shall see, the theoretical study of behavior at the limit of strength can provide new insight into mechanical behavior in more common situations.
From the perspective of understanding deformation, it is interesting that a wide variety of mechanical phenomena that are ordinarily attributed to the specific behavior of dislocations would also be found in a defect-free world. In the limited space available here we discuss three examples. (1) In a defect-free world, the common bcc metals would cleave on {100} and exhibit "pencil glide" in ⟨111⟩. Most would be brittle at low temperature, as they are. (2) The common fcc metals would glide in {111} and would not cleave under simple tensile loads. They would be ductile at low temperature, as most of them are. (3) The maximum values of the nanohardness of simple metals would be very nearly what they are.
1. Methodology
The calculation of ideal strength is based on modern methods that make it possible to compute the energy of a crystalline configuration of atoms with rather good accuracy. A number of computational techniques are available, almost all of them based on density functional theory. The Vienna Ab-Initio Simulation Package (VASP) [10] provides a set of readily available tools to accomplish these calculations, and other packages are also available. In most cases the local density approximation (LDA) to density functional theory yields good results. In certain circumstances (e.g., Fe) the generalized gradient approximation (GGA) gives significantly more accurate results [11]. The relative quality of the two approximations can be judged by comparing experimental data to theoretical predictions.

In these methods Schrödinger's equation is solved within a single-particle approximation. It is usually sufficient to employ pseudopotentials for the atomic cores and a plane wave expansion of the wave functions (this is the approach used in VASP). This technique works reasonably well in situations for which the core states are not strongly affected by the imposed strains. However, under severe compressive stresses, for example, all-electron methods, such as that employed in WIEN97, provide a better description of the solid's properties.

The elastic stress within a solid is related to the derivative of the energy with respect to the elastic strain. While there is some subtlety involved in doing this (there is no unique definition of the strain), it is usually sufficient to define the strain from the displacements of the crystal lattice points from a reference configuration [11, 12]. One can then compute the energy as a function of strain by incrementally displacing the atom positions within the unit cell to increment the strain along the desired path, and evaluate the associated stresses from the energy increment. The ultimate strength is associated with the maximum in the stress, which occurs at (or near) the inflection point of a plot of energy against strain.
The precise way in which the stress-strain relation is evaluated depends on the specific problem one is trying to solve. For example, one is often interested in the ideal strength under simple tension along a particular crystal axis, or in simple shear on a particular crystallographic plane. In these cases the only nonzero stress is the stress that is conjugate to the strain of interest: the uniaxial tensile stress in the case of simple tension, the conjugate shear stress in the case of simple shear. In these cases the atom positions must be adjusted after each increment to the strain so that all other stresses are relaxed to zero. Methods for doing this in a variety of load geometries and crystal types are described in the references cited at the end of this article. We shall now describe the results of some of these calculations with an emphasis on the physical insight they provide.
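As a sketch of this workflow (ours; the toy energy function below stands in for the relaxed ab initio total-energy calculation, e.g., a call to VASP, at each strain), one tabulates the energy on a strain grid, differentiates numerically to obtain the stress, and reads off the maximum:

```python
import numpy as np

e_grid = np.linspace(0.0, 0.25, 101)        # strain path (e.g., relaxed <100> tension)

def total_energy(e):
    """Stand-in for an ab initio total energy per unit reference volume.
    This toy form gives a sinusoidal stress with unit modulus and an
    instability strain of 0.13; in practice each point is a relaxed
    first-principles calculation."""
    eb = 0.26
    return (eb / np.pi)**2 * (1.0 - np.cos(np.pi * e / eb))

E = np.array([total_energy(e) for e in e_grid])
sigma = np.gradient(E, e_grid)              # stress = dE/de (per unit volume)
i_max = np.argmax(sigma)
print(f"ideal strength ~ {sigma[i_max]:.4f} at strain {e_grid[i_max]:.3f}")
```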
2. BCC Metals

2.1. Ideal Strength in Tension
Computations of the ideal tensile strengths of unconstrained bcc metals show that they are weakest when pulled in a ⟨100⟩ direction. The majority of those we have investigated fail in tension (cleave) on {100} planes, both in theory and in experiment. Ab initio calculations [1, 11–15] give an ideal tensile strength of about 30 GPa for W (0.07 E_100), 29 GPa for Mo (0.078 E_100) and 13 GPa for Fe (0.087 E_100), where E_100 is the tensile modulus in the ⟨100⟩ direction, and the stresses are computed at 0 K.

There is a simple crystallographic argument that explains both the cleavage plane and the ideal tensile strength (Fig. 1).
Figure 1. The Bain strain connecting the bcc and fcc structures. If bcc is pulled in tension on [001] while contracting along [100] and [010] it generates an fcc crystal as shown.
A relaxed tensile strain along ⟨100⟩ carries the bcc structure into an fcc structure with the same volume at a tensile strain of 0.26 (the "Bain strain"). By symmetry, both structures are unstressed, so the tensile stress must pass through at least one maximum along the transformation path, at a critical strain much less than 0.26. If we fit the stress-strain curve to a sinusoid that has the correct modulus at low strain, the tensile strength in ⟨100⟩ is given by

$$\sigma_m \approx \frac{e_B E_{100}}{\pi} = 0.08\, E_{100} \qquad (1)$$

where $e_B$ is the Bain strain and $E_{100}$ is Young's modulus for ⟨100⟩ tension. The same reasoning explains why the ideal strength increases when the normal tension is supplemented by a hydrostatic tension, as it is, for example, near the tip of a crack. Hydrostatic tension expands the unit cell, which increases the Bain strain and raises the stress at instability. Ab initio calculations for Fe [16] show that the ideal tensile strength increases by almost 50% when the tensile stress is supplemented by a hydrostatic tension that is equal in magnitude.

The element Nb is anomalous among the bcc metals we have studied [13, 17]. While the ideal tensile strength of Nb is lowest for ⟨100⟩ loading, as in the other bcc metals, the failure mode is not in tension across the {100} planes, but in shear on the ⟨111⟩{112} system. After some significant tensile strain, Nb deviates from the tetragonal, Bain-strain path onto an orthorhombic strain path that is characterized by unequal contractions in the ⟨100⟩ directions perpendicular to the axis of load. The eventual failure is in shear, rather than tension. This preference for shear failure is preserved, though with a smaller margin, when hydrostatic tension is superimposed. The results suggest that Nb may not exhibit the ductile-brittle transition that is typical of bcc metals. The experimental evidence is unclear [13].
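For orientation, the sinusoidal fit behind Eq. (1) can be written out explicitly: a stress-strain curve with the correct initial modulus that vanishes again at the Bain strain is

$$\sigma(e) = \frac{e_B E_{100}}{\pi}\, \sin\!\left(\frac{\pi e}{e_B}\right),$$

which has slope $E_{100}$ at $e = 0$ and reaches its maximum $\sigma_m = e_B E_{100}/\pi \approx 0.08\, E_{100}$ at $e = e_B/2$ (with $e_B = 0.26$), i.e., about halfway to the stress-free fcc configuration.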
2.2. Ideal Strength in Shear
The ideal shear strengths of bcc metals also reflect their symmetry. Calculations of the ideal strength of bcc W [12] in relaxed shear in the ⟨111⟩ direction on the {110}, {112} and {123} planes give almost identical values, τ_m ≈ 17.7 GPa ≈ 0.11 G_111, where G_111 is the shear modulus for shear in the ⟨111⟩ direction. In all three cases the shear strain at instability is about 0.17. Calculations for Fe [11] and Mo [13] give very similar results, with τ_m ≈ 7.2 GPa (0.11 G_111) for Fe and ≈ 15.8 GPa (0.12 G_111) for Mo. Nb is, again, unusual; the ideal shear strength in the ⟨111⟩{112} system is anomalously large, 6.4 GPa (0.15 G_111), and is still larger for the common alternative systems [13].
The symmetry rule that governs the shear strength of typical bcc metals is illustrated in Fig. 2 [1, 12]. Essentially, a shear in the ⟨111⟩ direction tilts planes that are perpendicular to the ⟨111⟩ axis. If we allow relaxation of the atom positions in these planes, they come into an atomic registry that changes the crystal symmetry at a shear strain of ∼0.34, irrespective of the plane of tilt. This common stress-free, saddle-point structure is body-centered tetragonal. There is a maximum in the shear strength at about half the saddle-point shear, at e ≈ 0.17. If we fit the stress-strain relation with a sinusoid that gives the correct modulus in the elastic limit, we obtain

$$\tau_m \approx 0.11\, G_{111} \qquad (2)$$
in good agreement with the ab initio calculations. A number of bcc metals have similar strengths for slip in the ⟨111⟩ direction on various planes, a phenomenon that is known as "pencil glide" and is attributed to the peculiarities of dislocation glide in bcc. These calculations show that defect-free bcc crystals would tend to behave in a very similar way.

The balance between the shear and tensile strengths suggested by Eqs. (1) and (2) is such that, in a defect-free world, the common bcc metals would cleave if loaded along ⟨100⟩, but not if loaded in other directions. Taking W as an example, a uniaxial load along ⟨100⟩ would reach the ideal cleavage strength, ∼30 GPa, when the shear stress in the most favorable slip system was only around 14 GPa, below the ideal shear strength. However, a uniaxial load along ⟨111⟩ or ⟨110⟩ would cause the shear strength to be exceeded before the tensile stress in ⟨100⟩ reached the ideal value. A single crystal would be ductile or brittle, depending on the direction of the load.
Figure 2. Instability in shear in [111]: shear tilts planes of atoms (equilateral triangles perpendicular to [111]) until they come into registry, as at right, creating a new symmetry.
3. FCC Metals
3.1. Ideal Strength in Tension
The tensile strength of defect-free fcc crystals could, in theory, also be governed by the Bain strain [18]. As illustrated in Fig. 3, an fcc crystal can be converted into bcc by straining in tension in the [110] direction. The tensile strain required to reach bcc is, in fact, relatively small, so the estimated strength would be small as well. However, reaching bcc from fcc with a [110] pull requires very substantial relaxations in the perpendicular directions. The crystal must expand along [1̄10] and contract to an even greater degree along [001]. These large relaxations are inconsistent with the Poisson contractions of typical fcc metals in the linear elastic limit. Fcc metals do not start out along this deformation path when pulled along [110] and, apparently, never find it.

Nonetheless, the ⟨110⟩ directions are the weak directions for tension in all of the fcc metals that have been studied to date: Al, Cu, Ir and Pd [19]. If the fcc crystal is pulled quasistatically to failure under uniaxial tension, the failure mode at the elastic limit is not a tensile failure across the perpendicular {110} plane, but rather a shear failure (the "flip strain"). In this deformation mode the ⟨110⟩ tensile direction is stretched while the perpendicular ⟨100⟩ direction contracts, with the ultimate consequence that the two directions are interchanged. It can be shown that the failure mode is, in fact, a failure in shear in the ⟨112⟩{111} system, which is the normal mode of shear failure in an fcc crystal. This result suggests a simple explanation for the fact that fcc crystals do not exhibit a conventional ductile-brittle transition on cooling; the inherent failure mechanism is in shear. Those fcc metals that do become brittle at low temperature, such as Ir and nitrided austenitic steels, do so only after significant plastic deformation; their "cleavage" is by decohesion on slip or twin planes.
Figure 3. Bain strain of an fcc lattice through tension along [110]. The crystal must expand equally along [1̄10] and contract dramatically along [001]. The alternative is the "flip strain" shown at right.
If we assume deformation at constant volume with a sinusoidal stress-strain curve, the "flip" instability occurs at an engineering strain of 0.08 and a stress of σ_m = 0.05 E_110, where E_110 is Young's modulus for tension in the ⟨110⟩ direction. The most recent ab initio calculations of the ideal tensile strengths of fcc metals under quasistatic loading give the following values: for Al, σ_m = 5.2 GPa = 0.07 E_110; for Cu, σ_m = 6.2 GPa = 0.05 E_110; for Pd, σ_m = 5.2 GPa = 0.06 E_110; and for Ir, σ_m = 36 GPa = 0.06 E_110 [19]. Note that the exceptional strength of Ir is a consequence of its large elastic modulus, and is in no way anomalous. The anomaly is the high dimensionless strength of Al. However, recent research has shown that the actual strength of Al is determined by a phonon instability that intrudes slightly before the elastic instability [20], decreasing the ideal strength. No similar instability has been found in other fcc metals.
3.2. The Ideal Strength in Shear
The mode of failure of fcc crystals in shear is conventional, though the behavior of at least some fcc metals, Al in particular, is not. The weak directions in shear are ⟨112⟩ directions in {111} planes, as one would expect from a rigid-ball model of the close-packed fcc structure. A sinusoidal model of that failure mode predicts a shear strength τ_m ≈ 0.085 G_111.

The ideal shear behavior of Al and Cu [20, 21] makes an interesting comparison. While the deformation of Cu remains nearly planar in the {111} shear plane as the instability is approached, Al expands significantly perpendicular to the shear plane. The consequence is that while Cu has an ideal strength near the estimate, τ_m = 2.7 GPa ≈ 0.11 G_111, the calculated shear strength of Al is much larger, both in absolute magnitude and in dimensionless terms: τ_m ≈ 3.4 GPa = 0.15 G_111. (This number is a bit of an overestimate, since Al experiences a phonon instability before reaching peak strength [20], but is qualitatively correct. It is significantly higher than the number reported in earlier work [8], which used a less accurate pseudopotential.)

Comparing the ideal tensile and shear strengths of Al and Cu leads to a curious result: at the limit of strength, Cu is stronger than Al in tension, though weaker in shear. This is true despite the fact that the failure mode is precisely the same in the two cases: a shear instability in the ⟨112⟩{111} system. The reason is the perpendicular expansion of Al during shear. Tension in the ⟨110⟩ direction imposes a tension across the {111} shear plane, assisting the normal displacement of {111} planes and lowering the stress required for Al to fail in shear.
4. Nanoindentation
A nanoindentation test is, essentially, a microhardness test done with a nano-tipped indenter. Until the substrate yields, the deformation field of the indenter should be approximately Hertzian, which makes it possible to use the data to infer the stresses and strains at which yielding occurred. Moreover, since the maximum shear in a Hertzian strain field is well beneath the surface, nanoindentation tests can sample defect-free volumes, and may, therefore, test the ideal strength.

Surprisingly, the shear strengths inferred from recent nanoindentation tests substantially exceed the computed ideal strengths. Thus, Bahr et al. [22] report data showing shear stresses as high as 28 GPa in W prior to yielding, well beyond the value (18 GPa) that corresponds to the ideal strength on any of the common slip planes. Nix [23] reported preliminary Mo data giving a maximum strength of 23 GPa, compared to the theoretical shear strength of 15.6 GPa.

The discrepancy between these values is almost entirely removed if one makes two corrections [24]. First, the Hertzian stress field is modified by nonlinearity as the ideal strength is approached. Finite-element calculations using a sinusoidal stress-strain relation show that the Hertzian stress field is correct except in the immediate vicinity of the maximum shear stress, even when the maximum shear stress approaches the ideal strength. However, the value of the maximum shear stress is significantly decreased, to τ_m ≈ 0.69 τ_H, where τ_H is the Hertzian value. Second, the triaxiality of the stress field near the point of maximum shear increases the ideal shear strength. When these (and a couple of other, minor) corrections are made, the maximum shear strengths that can be inferred from nanoindentation experiments on W (22.8–24.0 GPa) and Mo (16.0–16.8 GPa) are very close to the theoretical values of the ideal strength (W = 22.1–23.3 GPa; Mo = 17.6–18.8 GPa), as they should be. This result suggests that nanoindentation may provide a viable means for measuring ideal strength.
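The standard Hertz formulas make this geometry quantitative. The sketch below is ours, with purely illustrative input values (not data from the cited experiments); it locates the maximum shear stress below the surface and applies the nonlinearity correction τ_m ≈ 0.69 τ_H quoted above:

```python
import numpy as np

# Illustrative inputs for a spherical indenter (not data from the cited work).
R = 100e-9          # indenter tip radius [m]
P = 200e-6          # applied load [N]
E_star = 300e9      # reduced contact modulus [Pa]

# Hertz theory for a sphere on a flat, frictionless half-space.
a = (3 * P * R / (4 * E_star))**(1 / 3)    # contact radius
p0 = 3 * P / (2 * np.pi * a**2)            # peak contact pressure
tau_H = 0.31 * p0                          # max Hertzian shear stress (nu ~ 0.3)
z_max = 0.48 * a                           # located well beneath the surface

tau_corrected = 0.69 * tau_H               # nonlinearity correction near ideal strength
print(f"contact radius a = {a*1e9:.1f} nm, depth of max shear = {z_max*1e9:.1f} nm")
print(f"tau_H = {tau_H/1e9:.1f} GPa  ->  corrected estimate {tau_corrected/1e9:.1f} GPa")
```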
References
[1] J.W. Morris, Jr., C.R. Krenn, D. Roundy et al., In: P.E. Turchi and A. Gonis (eds.), Phase Transformations and Evolution in Materials, TMS, Warrendale, PA, pp. 187–208, 2000.
[2] R. Hill and F. Milstein, Phys. Rev. B, 15, 3087–3097, 1977.
[3] J. Wang, J. Li, S. Yip et al., Phys. Rev. B, 52, 12627–12635, 1995.
[4] J.W. Morris, Jr. and C.R. Krenn, Phil. Mag. A, 80, 2827–2840, 2000.
[5] A.T. Paxton, P. Gumbsch, and M. Methfessel, Phil. Mag. Lett., 63, 267–274, 1991.
[6] W. Xu and J.A. Moriarty, Phys. Rev. B, 54, 6941–6951, 1996.
[7] M. Sob, L.G. Wang, and V. Vitek, Mat. Sci. Eng., A234–236, 1075–1078, 1997.
[8] D. Roundy, C.R. Krenn, M.L. Cohen et al., Phys. Rev. Lett., 82, 2713–2716, 1999.
[9] S. Ogata, J. Li, and S. Yip, Phys. Rev. B, in press, 2004.
[10] G. Kresse and J. Hafner, J. Phys. Condens. Matter, 6, 8245, 1994.
[11] D.M. Clatterbuck, D.C. Chrzan, and J.W. Morris, Jr., Acta Mater., 51, 2271–2283, 2003.
[12] D. Roundy, C.R. Krenn, M.L. Cohen et al., Phil. Mag. A, 81, 1725–1747, 2001.
[13] W. Luo, D. Roundy, M.L. Cohen et al., Phys. Rev. B, 66, 094110, 2002.
[14] D.M. Clatterbuck, D.C. Chrzan, and J.W. Morris, Jr., Phil. Mag. Lett., 82, 141–147, 2002.
[15] M. Friak, M. Sob, and V. Vitek, Proc. Int. Conf. Juniormat 2000, Brno Univ. Technology, Brno, 2001.
[16] D.M. Clatterbuck, D.C. Chrzan, and J.W. Morris, Jr., Scripta Mat., 49, 1007, 2003.
[17] C.R. Krenn, D. Roundy, J.W. Morris, Jr. et al., Mat. Sci. Eng. A, A319–321, 111–114, 2001.
[18] J.W. Morris, Jr., C.R. Krenn, D. Roundy, and M.L. Cohen, Mat. Sci. Eng. A, 309–310, 121–124, 2001.
[19] J.W. Morris, Jr., D.M. Clatterbuck, D.C. Chrzan et al., Mat. Sci. Forum, 426–432, 4429–4434, 2003.
[20] D.M. Clatterbuck, C.R. Krenn, M.L. Cohen et al., Phys. Rev. Lett., 91, 135501, 2003.
[21] S. Ogata, J. Li, and S. Yip, Science, 298, 807, 2002.
[22] D.F. Bahr, D.E. Kramer, and W.W. Gerberich, Acta Mater., 46, 3605–3617, 1998.
[23] W.D. Nix, Dept. Materials Science, Stanford Univ., private communication, 1999.
[24] C.R. Krenn, D. Roundy, M.L. Cohen et al., Phys. Rev. B, 65, 134111, 2002.
Perspective 19
SIMULATIONS OF INTERFACES BETWEEN COEXISTING PHASES: WHAT DO THEY TELL US?
Kurt Binder
Institut für Physik, Johannes Gutenberg-Universität Mainz, Staudinger Weg 7, 55099 Mainz
Interfaces between coexisting phases are ubiquitous in the physics and chemistry of condensed matter: Bloch walls in ferromagnets; antiphase domain boundaries in ordered binary (AB) or ternary alloys; the surface of a liquid droplet against its vapor; boundaries between A-rich regions and B-rich regions in fluid binary mixtures, etc. These interfaces control material properties in many ways (e.g., when a fluid polymer mixture is frozen-in to form an amorphous material, the mechanical strength of this macromolecular glass is controlled by the extent to which A-polymers are entangled with B-polymers across the interface). For a detailed understanding of the properties of such interfaces, one must consider their structure from the scale of "chemical bonds" between atoms in the interfacial region up to much larger, mesoscopic, scales (e.g., when one tries to measure a concentration profile across an interface between coexisting phases in a partially unmixed polymer blend by suitable depth-profiling methods, typical results for the interfacial width are of the order of 50 nm [1]). So from the point of view of simulations, one deals with a multiscale problem [2]. However, the situation is even worse: interfaces may exhibit long-wavelength excitations, such as capillary waves [3], which lead to the effect that many properties associated with interfaces depend on the geometry used for the experiment [1] or the simulation [4]. This creates a real danger that the properties observed in the simulation (or experiment, respectively) are misinterpreted [2].

For simplicity, we shall consider in the following only interfaces in fluid systems (the gas–liquid interface or the interface in a fluid binary mixture, respectively); although many considerations can be carried over immediately
to interfaces in solid phases (e.g., Bloch walls in antiferromagnets, antiphase domain boundaries in ordered alloys with negligible mismatch between the lattice parameters of the constituent atomic species, etc.), there are many cases in solids where additional complications arise due to long-range elastic interactions, and it is clear that the latter effects are very important (consider, e.g., "coherent" vs. "incoherent" precipitation: in the latter case elastic strains destroy the coherence of the lattice structure between a precipitated grain and the surrounding matrix). These complications are beyond the scope of the present note, however.

Now the standard concept to describe an interfacial profile (i.e., a density or concentration variable c(z) observed as a function of the coordinate z across the interface) is the concept of an "intrinsic interfacial profile". This concept dates back to van der Waals in the 19th century (and to Cahn and Hilliard in the late 1950s, as far as ordinary binary mixtures are concerned, and to de Gennes, Joanny and Leibler in the late 1970s for polymer blends). The result for the interfacial profile near the critical point of the fluid (or fluid binary mixture, respectively) can be cast into the familiar tanh-form
$$c(z) = \frac{1}{2}\left[\, c_1 + c_2 + (c_2 - c_1) \tanh\!\left(\frac{z - z_0}{w_0}\right) \right]. \qquad (1)$$
Here c₁, c₂ are the concentrations (or densities, respectively) of the two coexisting phases very far away from the interface, which is located at z = z₀. The variation from c₁ to c₂ extends essentially over the distance Δz = w₀ around z₀, the "intrinsic width" w₀. Near the critical point, w₀ is just twice the correlation length of the order parameter fluctuations (remember that the order parameter is the concentration (density) difference c₂ − c₁ for the considered systems). Near the critical point, the correlation length is large, particularly for polymer blends (where the scale for this length is set by the gyration radius of the coils, see e.g., [5]). Also away from the critical point, Eq. (1) often is obtained, at least as a good approximation, though the required theories are more complicated than the Ginzburg–Landau type mean field theory required to derive Eq. (1) near the critical point [5]. However, these theories (such as the density functional theories for fluids, or the self-consistent field theories for polymeric systems) in a sense also have a mean field character, and all ignore lateral fluctuations in the (x, y) directions parallel to the interface.

For very large length scales of an interface in a fluid system, these lateral fluctuations are nothing but the well-known capillary waves [3] (for simplicity, effects of gravity are ignored here: this is anyway a very good approximation for polymer mixtures, but also reasonable for many other systems). One can show from the equipartition theorem that the thermally averaged mean square amplitude of interfacial height fluctuations due to capillary waves with wavelengths 2π/q scales like the inverse of the square of the wavenumber q. This causes a long-wavelength instability of the interface: on a lateral length scale L parallel to the interface, the mean square height grows proportional to the logarithm of L, ln(L/B), where the length B is a short-wavelength cutoff needed in the theory.
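The logarithm follows directly from equipartition (our sketch; prefactor conventions vary with how the cutoffs are defined): with interfacial tension γ, interface area A, and mode amplitudes ⟨|h_q|²⟩ = k_BT/(Aγq²),

$$\langle h^2 \rangle \;=\; \sum_{\mathbf{q}} \frac{k_B T}{A\,\gamma\, q^2} \;\simeq\; \frac{k_B T}{2\pi \gamma} \int_{2\pi/L}^{2\pi/B} \frac{dq}{q} \;=\; \frac{k_B T}{2\pi \gamma}\, \ln\!\left(\frac{L}{B}\right).$$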
In fact, on scales of the order of atomic distances (and near the critical point, already on scales of the order of the correlation length), the picture of a sharp interface locally displaced by capillary waves makes no sense anymore. However, a good understanding of what the cutoff length B is is still lacking [2–4, 6]. And when one considers the convolution of the intrinsic profile with this capillary wave broadening, there is no unique way to separate, in the resulting dependence of the total width w, the "intrinsic width" w₀ from the term involving ln(B) [2, 4, 6].

These considerations have dramatic consequences for simulations (and experiments [1]), as shown in [4]. Simulations intended to study interfacial profiles typically use one of the following two geometries:

1. A single interface can be stabilized in the geometry of a thin film of thickness D between two walls (of lateral linear dimensions L). The nature of the walls is chosen such that they distinguish between the phases: in a binary mixture, one wall prefers A, the other wall prefers B; for a gas–liquid interface, one wall has a purely repulsive interaction between the wall and the fluid particles (preferring the gas phase), the other wall exerts an attractive interaction (preferring the liquid phase). In the lateral directions, periodic boundary conditions are used. The disadvantage of this geometry, of course, is that D must be rather large, to avoid a too strong perturbation of the interfacial profile due to the forces from the walls (there must be room on both sides of the interface for the bulk phases unperturbed by the walls).

2. One chooses periodic boundary conditions in all directions, and D considerably larger than L, and creates via suitable initial conditions a slab configuration, e.g., a liquid separated by two parallel interfaces from the gas (e.g., [7]). D must be large enough, so that any interactions between the two interfaces are safely avoided. In addition, L should not be too small either, because otherwise the average orientation of the interfaces around the z-direction would also fluctuate, and hence typically only part of the capillary wave spectrum is suppressed by the periodic boundary condition.

In view of these problems, it is not clear to what extent observations of interfacial widths [7] should be compared with experiments, in particular when a careful study of the effects of varying both linear dimensions has not been made. A caveat needs also to be made with respect to the interfacial free energy f, which usually is computed from the profile of the anisotropy of the pressure tensor across the interface [3], since this profile is size-dependent as well.
A variant of this geometry is used for symmetric systems (e.g., Ising lattice-gas models, where there is a symmetry between particles and holes, or "spin up" and "spin down" phases); there one can simulate a single interface but dispense with walls in favour of an "antiperiodic" boundary condition (crossing the boundary in the z-direction, the sign of the spin is flipped) [8]. For a symmetric binary mixture, this means A turns into B (and vice versa) when a boundary in the z-direction is crossed [6]. For this geometry (as well as for the "slab geometry" with two parallel interfaces alluded to above) one has the problem already mentioned, that the mean squared interfacial width varies with ln(L), and a unique identification of the intrinsic width (and intrinsic profile) is not possible.

However, for the geometry with real walls (choice No. 1 above) there is an additional very strong effect of the linear dimension D: for large L and short-range forces due to the walls, the mean square width of the interface varies linearly with D (while it varies only logarithmically with D for long-range wall forces that decay with a power law of the distance from the wall and hence better stabilize the position of the interface in the middle of the thin film). This very strong size effect results from a kind of "soft mode": the local position of the interface can be pushed away from the center of the thin film at very little energy cost, and in thermal equilibrium these "cheap" fluctuations cause a very strong broadening of the interfacial profile. These effects have been seen in simulations of Ising models and of polymer mixtures [2] as well as in experiments [1].

We conclude these comments by noting that finite size effects, on the other hand, are useful for the estimation of interfacial free energies f (e.g., [8, 9]). Choosing a geometry with periodic boundary conditions in all spatial directions and the (semi-)grandcanonical ensemble, the distribution of the order parameter is sampled. While its maxima correspond to the pure phases, its minimum corresponds to the slab configuration mentioned above. To be able to sample the minimum accurately, "multicanonical Monte Carlo" or "umbrella sampling" techniques are needed. From the logarithm of the ratio of the probabilities at the maximum and at the minimum, one can extract the free energy excess due to the interfaces, which is twice the area of a single interface times f. Checking that the result indeed scales linearly with the interface area proves that the asymptotic regime has indeed been reached. This "first principles" method avoids the need for taking any "measurements" at the interfaces or even to observe them explicitly in the simulations! There is now ample evidence for the reliability of this approach [6].
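In code, the extraction step is short once the distribution has been sampled. The following is our sketch, assuming a periodic box of cross-section L × L so that the slab state carries two interfaces:

```python
import numpy as np

def interfacial_tension(m, P, kT, L):
    """Interfacial free energy per unit area from a sampled order-parameter
    distribution P over bins m (e.g., from multicanonical Monte Carlo).
    The overall normalization of P cancels in the peak-to-valley ratio."""
    i_mid = len(P) // 2
    p_max = 0.5 * (P[:i_mid].max() + P[i_mid:].max())   # the two pure-phase peaks
    p_min = P[np.argmin(np.abs(m))]                      # slab configurations, m ~ 0
    # Two interfaces of area L*L in the slab state: ln(p_max/p_min) = 2*L^2*f/kT
    return kT * np.log(p_max / p_min) / (2.0 * L**2)

# One should verify convergence: repeating the measurement for several L and
# checking that ln(p_max/p_min) grows linearly with L*L confirms that the
# asymptotic regime has been reached.
```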
References
[1] T. Kerle, J. Klein, and K. Binder, "Effects of finite thickness on interfacial widths in confined thin films of coexisting phases," Eur. Phys. J. B, 7, 401–410, 1999.
[2] K. Binder, "Simulations of interfaces between coexisting phases: multiscale aspects," In: Multiscale Computational Methods in Chemistry and Physics, IOS Press, Amsterdam, pp. 207–220, 2001.
[3] J.S. Rowlinson and B. Widom, Molecular Theory of Capillarity, Clarendon, Oxford, 1982.
[4] A. Werner, F. Schmid, M. Mueller, and K. Binder, "Anomalous size-dependence of interfacial profiles between coexisting phases of polymer mixtures in thin film geometry: a Monte Carlo simulation," J. Chem. Phys., 107, 8175–8188, 1997.
[5] K. Binder, "Phase transitions in polymer blends and block copolymer melts: some recent developments," Adv. Polym. Sci., 112, 181–299, 1994.
[6] A. Werner, F. Schmid, M. Mueller, and K. Binder, "Intrinsic profiles and capillary waves at homopolymer interfaces: a Monte Carlo study," Phys. Rev. E, 59, 728–738, 1999.
[7] J. Alejandre, D.J. Tildesley, and G.A. Chapela, "Molecular dynamics simulation of the orthobaric densities and surface tension of water," J. Chem. Phys., 102, 4574–4583, 1995.
[8] K. Binder, "Monte Carlo simulations of surfaces and interfaces in materials," In: A. Gonis, P.A. Turchi, and J. Kudrnovsky (eds.), Stability of Materials, Plenum, New York, pp. 3–37, 1996.
[9] M. Mueller, K. Binder, and W. Oed, "Structural and thermodynamic properties of interfaces between coexisting phases in polymer blends: a Monte Carlo investigation," J. Chem. Soc. Faraday Trans., 91, 2369–2379, 1995.
Perspective 20
HOW FAST CAN CRACKS MOVE?
Farid F. Abraham
IBM Almaden Research Center, San Jose, California
1. Molecular Dynamics Experiments
With present-day supercomputers, simulation is becoming a very powerful tool for providing important insights into the nature of materials failure. Atomistic simulations yield "ab initio" information about materials deformation at length and time scales unattainable by experimental measurement and unpredictable by continuum elasticity theory. Using our "computational microscope," we can see what is happening at the atomic scale. Our simulation tool is computational molecular dynamics [2], and it is very easy to describe. Molecular dynamics predicts the motion of a large number of atoms governed by their mutual interatomic interaction, and it requires the numerical integration of the equations of motion, "force equals mass times acceleration, or F = ma." We learn in beginning physics that the dynamics of two atoms can be solved exactly. Beyond two atoms, this is impossible except for a few very special cases, and we must resort to numerical methods.

A simulation study is defined by a model created to incorporate the important features of the physical system of interest. These features may be external forces, initial conditions, boundary conditions, and the choice of the interatomic force law. In the present simulations, we adopt simple interatomic force laws since we wish to investigate the generic features of a particular many-body problem common to a large class of real physical systems and not governed by the particular complexities of a unique molecular interaction. It is very important to emphasize that this is a conscious choice, since it is not uncommon to hear others object that one is not studying "real" materials when using simple potentials. Feynman summarized this viewpoint well on page two, volume I of his famous three-volume series, The Feynman Lectures on Physics [3]: "If in some cataclysm all scientific knowledge were to be destroyed and only one sentence passed on to the next generation of creatures, what statement would contain the most information in the fewest words? I believe it is
the atomic hypothesis that all things are made of atoms – little particles that move around in perpetual motion, attracting each other when they are a little distance apart, but repelling upon being squeezed into one another. In that one sentence, you will see there is an enormous amount of information about the world, if just a little imagination and thinking are applied". A simple interatomic potential may be thought of as a "model potential," and the model potentials for the present studies are harmonic and anharmonic springs and the Lennard–Jones 12:6 potential. Complaints of model approximations are not new. In his book entitled The New Science of Strong Materials, Gordon comments on Griffith's desire to have a simpler experimental material that would have an uncomplicated brittle fracture [4]. He writes, "In those days, models were all very well in the wind tunnel for aerodynamic experiments but, damn it, who ever heard of a model material?" In the mid-1960s, a few hundred atoms could be treated. In 1984, we reached 100,000 atoms. Before that time, computational scientists were concerned that the speed of scientific computers could not go much beyond 4 Gigaflops, or 4 billion arithmetic operations per second, and that this plateau would be reached by the year 2000! That became forgotten history with the introduction of concurrent computing. A modern parallel computer is made up of several (tens, hundreds or thousands of) small computers working simultaneously on different portions of the same problem and sharing information by communicating with one another. The communication is done through message-passing procedures. The present record is well over a few tens of Teraflops for optimized performance. Moore's Law states that computer speed doubles every one and one-half years. For 35 years, that translates into a computer speed increase by a factor of ten million. This is exactly the increase in the number of atoms that we could simulate over the last 35 years.
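To make the method concrete, the sketch below shows velocity-Verlet integration of "F = ma" for a small cluster of Lennard–Jones 12:6 particles in reduced units (epsilon = sigma = mass = 1). It is a minimal illustration, not the production code used for the simulations described here, which employ neighbor lists, potential cutoffs, and massively parallel domain decomposition; all parameter values below are arbitrary.

    import numpy as np

    def lj_forces(pos):
        """Pairwise Lennard-Jones 12:6 forces, reduced units;
        O(N^2) double loop, for illustration only."""
        forces = np.zeros_like(pos)
        n = len(pos)
        for i in range(n - 1):
            d = pos[i] - pos[i + 1:]           # vectors from partners j > i to i
            r2 = np.sum(d * d, axis=1)
            inv6 = r2 ** -3                    # (1/r)^6
            fmag = 24.0 * (2.0 * inv6**2 - inv6) / r2   # repulsive when positive
            forces[i] += np.sum(fmag[:, None] * d, axis=0)
            forces[i + 1:] -= fmag[:, None] * d          # Newton's third law
        return forces

    def velocity_verlet(pos, vel, forces, dt):
        """One integration step of F = ma (mass = 1 in reduced units)."""
        vel += 0.5 * dt * forces
        pos += dt * vel
        forces = lj_forces(pos)
        vel += 0.5 * dt * forces
        return pos, vel, forces

    # Tiny demonstration: a slightly perturbed 4x4x4 cubic cluster.
    rng = np.random.default_rng(1)
    pos = np.array([[x, y, z] for x in range(4) for y in range(4)
                    for z in range(4)], dtype=float) * 1.1
    pos += 0.05 * rng.standard_normal(pos.shape)
    vel = np.zeros_like(pos)
    f = lj_forces(pos)
    for _ in range(100):
        pos, vel, f = velocity_verlet(pos, vel, f, dt=0.005)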
2.
Supersonic Crack Propagation in Brittle Fracture
Our simulation study addresses the important question "how fast can cracks propagate?" In this study, we used system sizes of about twenty million atoms, very large by present-day standards but modest compared to our second study on ductile failure, which will follow [1]. Building on our current simulation model, we extend our earlier studies on transonic crack propagation in linear materials and supersonic crack propagation in nonlinear solids. Our new finding centers on a bilayer solid which behaves under large strain like an interface crack between a soft (linear) material and a stiff (nonlinear) material. In this mixed case, we observe that the initial mother crack propagating at the Rayleigh sound speed gives birth to a transonic daughter crack. Then, quite unexpectedly, we observe the birth of a supersonic granddaughter crack.
We verify that the crack behavior is dominated by the local (nonlinear) wave speeds, which can be in excess of the conventional sound speeds of a solid. In this problem, there are three important wave speeds in the solid. In order of increasing magnitude, they are the Rayleigh wave speed, or the speed of sound on a solid surface, the shear (transverse) wave speed, and the longitudinal wave speed. Predictions of continuum mechanics [5] suggest that a brittle crack cannot propagate faster than the Rayleigh wave speed. For a mode I (tensile) crack, the energy release rate vanishes for all crack velocities in excess of the Rayleigh wave speed, implying that the crack cannot propagate at a velocity greater than the Rayleigh wave speed. A mode II (shear) crack behaves similarly to a mode I crack in the subsonic velocity range; i.e., the energy release rate monotonically decreases to zero at the Rayleigh wave speed and remains zero between the Rayleigh and shear wave speeds. However, the predictions for the two loading modes differ for crack velocities greater than the shear wave speed. While the energy release rate remains zero for a mode I crack, it is positive for a mode II crack over the entire range of intersonic velocities. From these theoretical solutions, it has been concluded that a mode I crack's limiting speed is clearly the Rayleigh speed. The same conclusion has also been suggested for a mode II crack's limiting speed because the "forbidden velocity zone" between the Rayleigh and shear wave speeds acts as an impenetrable barrier for the shear crack to go beyond the Rayleigh wave speed. The first direct experimental observation of cracks moving faster than the shear wave speed was reported by Rosakis et al. [6]. They investigated shear-dominated crack growth along weak planes in a brittle polyester resin under dynamic loading. Around the same time, we performed 2D molecular dynamics simulations of crack propagation along a weak interface joining two strong crystals [7]. We assumed that the interatomic forces are harmonic except for those pairs of atoms with a separation cutting the centerline of the simulation slab. For these pairs, the interatomic potential is taken to be a simple model potential that allows the atomic bonds to break. Our simulations demonstrated intersonic crack propagation and the existence of a mother–daughter crack mechanism for a subsonic shear crack to jump over the forbidden velocity zone. We have since discovered that a crack can not only travel supersonically [8] but also that there exists a mother–daughter–granddaughter crack mechanism in bilayer slabs. The classical theories of fracture [5] are largely based on linear elastic solutions to the stress fields near cracks. An implicit assumption in such theories is that the dynamic behavior of cracks is determined by the linear elastic properties of a material. We have found [9] that the MD simulation results for harmonic atomic forces are indeed well interpreted by elasticity theories. However, the effects of anharmonic material properties on the dynamic behavior of cracks are not clearly understood. This is partly due to the
general difficulties in obtaining non-linear elastic solutions to dynamic crack problems. Molecular dynamics (MD) simulations can be easily adapted to the anharmonic case so that non-linear effects can be thoroughly investigated. We now discuss the anharmonic simulations.
3.
The Computer Experiments Setup
We consider a strongly non-linear elastic solid described by a tethered Lennard–Jones potential, where the compressive part of this potential is identical to that of the usual Lennard–Jones 12:6 function and the tensile part is the reflection of the compressive part with respect to the potential minimum. An fcc crystal formed by this potential exhibits a strongly non-linear stress–strain behavior resulting in elastic stiffening and an increase of the elastic modulus with strain, as shown in Fig. 1. Note that the elastic modulus increases by a factor of 10 at 13% elastic strain, indicating that the material properties of such a solid are strongly non-linear in the hyperelastic regime. We performed 3D MD simulations of two face-centered-cubic (fcc) crystals joined by a weak interface. In this study, we used system sizes of about twenty million atoms, though a billion atoms were used in a preliminary simulation [8]. For comparison, we consider the anharmonic tethered potential together with the harmonic potential whose spring constant matches that of the tethered potential at equilibrium.
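A plausible reading of the tethered potential described above is sketched below: the compressive branch (r below the minimum) is the ordinary Lennard–Jones 12:6 function, and the tensile branch is its mirror image about the minimum at r_min = 2^(1/6) sigma, so that the potential stiffens (and eventually diverges at r = 2 r_min) rather than softens under tension. This is an illustrative reconstruction, not the authors' code.

    import numpy as np

    R_MIN = 2.0 ** (1.0 / 6.0)   # location of the LJ minimum for sigma = 1

    def lj(r, eps=1.0, sigma=1.0):
        """Ordinary Lennard-Jones 12:6 potential."""
        x = (sigma / r) ** 6
        return 4.0 * eps * (x * x - x)

    def tethered_lj(r, eps=1.0, sigma=1.0):
        """Tethered LJ: LJ for r <= r_min; for r > r_min, the reflection of
        the compressive branch about r_min, i.e. V(r) = V_LJ(2*r_min - r).
        Valid for r < 2*r_min, where the reflected branch diverges."""
        rmin = R_MIN * sigma
        r_eff = np.where(r <= rmin, r, 2.0 * rmin - r)
        return lj(r_eff, eps, sigma)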
Figure 1. Variation of the elastic modulus as a function of strain under uniaxial stretching in the [110] direction of an fcc crystal formed by the anharmonic (tethered repulsive LJ) potential.
The interatomic force bonding the two crystals is given by the Lennard–Jones 12:6 potential. The simulation results are expressed in reduced units: lengths are scaled by the value of the interatomic separation where the LJ potential is zero, and energies are scaled by the depth of the minimum of the LJ potential. Atoms bond only with their original nearest neighbors; hence, rebonding of displaced atoms due to applied loading does not occur. For the adjoining two crystal slabs, we consider the following three simulation cases: (1) harmonic case: both crystal slabs are characterized by the harmonic potential (this is used as a control for the anharmonic studies); (2) anharmonic case: both crystal slabs are characterized by the anharmonic potential; (3) mixed case: one crystal slab is characterized by the anharmonic potential, and the other is characterized by the harmonic potential. In each case, a shear crack lies along the (110) plane and is oriented toward the [110] direction. The crack front is parallel to the [001] direction. The applied loading is dominated by shear. In order to interpret the results of the MD simulations, we need the following wave speeds in the fcc crystals formed by the harmonic and/or anharmonic potentials: the conventional longitudinal and shear wave speeds in the harmonic and anharmonic crystals in the direction of crack propagation; the longitudinal and shear wave speeds under applied loading in the anharmonic crystal in the direction of crack propagation; and the local longitudinal and shear wave speeds near the crack tip in the anharmonic crystal in the direction of crack propagation. The calculations of these wave speeds will be presented in detail in a forthcoming paper; we give the results in Table 1. We add some comments with regard to the concept of local wave speeds near the crack tip, which play a critically important role in explaining our simulation results. Conceptually, it is clear that the fracture process in brittle solids involves breaking of atomic bonds and is intrinsically a highly non-linear process. The anharmonic material properties of solids near the cohesive strength of atomic bonds would in general be quite different from the harmonic properties. In an earlier attempt to explain why the highest crack velocities recorded for mode I crack propagation in a homogeneous body are significantly lower than the Rayleigh wave speed, Broberg [5, 10] suggested that the reason could be some kind of "local" Rayleigh wave speed in the highly strained region near the crack tip, rather than the Rayleigh wave speed in the undisturbed material.
Table 1. Calculated wave speeds: bulk harmonic wave speeds, bulk anharmonic wave speeds due to applied loading, and local wave speeds near the crack tip

                     Bulk harmonic    Bulk anharmonic    Local
Longitudinal wave    9.49             10.36              13.4
Shear wave           4.24             5.95               10.4
Since the local strain is an extremely strong function of position from the crack tip, one might think that the local wave speed should depend on distance from the crack tip and cannot have one single value. It was not until 30 years after Broberg's suggestion that this issue was finally quantitatively studied by Gao [11], who made use of the Barenblatt cohesive model of a mode I crack and, for the first time, defined the local wave speed unambiguously as the wave speed at the location where the stress is exactly equal to the cohesive strength of the material, i.e., the true point of "fracture nucleation". The local wave speed characterizes how fast elastic energy is transported near the region of bond breaking in front of a crack tip. For example, for a mode I crack in a homogeneous isotropic elastic solid, the local wave speed is calculated to be $\sqrt{\sigma_{\max}/\rho}$ [11], where $\sigma_{\max}$ is the cohesive strength and $\rho$ is the density of the undisturbed material. The cohesive strength is typically around 1/10 of the shear modulus, suggesting that the local wave speed for a mode I crack is approximately 1/3 of the shear wave speed. Interestingly, experiments [12] and MD simulations [13] show that mode I cracks exhibit a dynamic instability at 30% of the shear wave speed, which suggests a possible dependence on the local wave speed [11]. We note that the local wave speeds differ from the conventional wave speeds both qualitatively and quantitatively. In the present study, the focus is shifted to the effect of stiffening anharmonic behavior of materials (as in many polymers) on a mode II crack propagating along a weak interface described by the Lennard–Jones potential. The solid itself is described by the tethered Lennard–Jones potential. We follow the same approach used in [11] and define the local wave speed as the wave speed of the solid at the location adjacent to the interface where the shear stress is exactly equal to the cohesive strength of the interface. Note that in this case the local wave speed depends on both the interface cohesive strength and the non-linear elastic properties of the solid.
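As a quick check of the mode I estimate quoted above, using the stated ratio $\sigma_{\max} \approx \mu/10$ and the shear wave speed $c_s = \sqrt{\mu/\rho}$:

c_{\mathrm{local}} = \sqrt{\frac{\sigma_{\max}}{\rho}} \approx \sqrt{\frac{\mu}{10\,\rho}} = \frac{1}{\sqrt{10}}\sqrt{\frac{\mu}{\rho}} \approx 0.32\, c_s ,

consistent with the observed onset of the dynamic instability at about 30% of the shear wave speed.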
4.
The Computer Experiments Results & Discussion
Figure 2 presents distance–time histories of a crack moving in the three different slab configurations: two weakly bonded harmonic crystals (designated "harmonic"); two weakly bonded anharmonic crystals (designated "anharmonic"); and a harmonic crystal weakly bonded to an anharmonic crystal (designated "mixed"). The cracks begin their motion when the Griffith criterion is satisfied. The respective dip–spike regions in each history represent the birth of a new crack. For the harmonic and anharmonic simulations (see Figs. 3 and 4), we observe one such region representing the birth of a daughter crack, the former traveling at the longitudinal sound speed of the harmonic solid and the latter achieving a supersonic speed of Mach 1.6. For the mixed simulation (see Fig. 5), we see two such dip–spike regions, where a daughter crack is born from the mother crack, which then gives birth to a granddaughter crack at a later time.
Figure 2. The space–time history of the crack tip for the three different simulations described in the text (reduced units are used).
Figure 3. Snapshot pictures of a crack traveling in the harmonic slab; the boxed snapshot pictures represent a progression in time from top to bottom.
Figure 4. Snapshot pictures of a crack traveling in the anharmonic slab; the boxed snapshot pictures represent a progression in time from top to bottom.
In Figs. 3 and 4, the harmonic and anharmonic simulations are shown, respectively. The boxed snapshot pictures in each figure represent a progression in time from top to bottom. In the top images, the mode II daughter cracks are born. The early-time occurrence of the Mach cone attached to the crack tip is evident in the anharmonic slab. In the middle image of Fig. 3, the crack in the harmonic slab has a single Mach cone and a circular stress-wave halo, which indicates a crack speed equal to the longitudinal sound speed of the linear solid. This is in contrast to the middle image of Fig. 4, where the two Mach cones for the crack in the anharmonic slab signify that the crack is traveling supersonically. From the bottom images of the two figures, we conclude that the crack in the anharmonic slab wins. In Fig. 5, snapshot pictures of the crack traveling in the mixed slab are presented, where the progression in time is from top to bottom. The material properties for the harmonic and anharmonic regions are labeled, and the different sound waves associated with the crack's dynamics are denoted. The sequence shows the following progression of events:
Figure 5. Snapshot pictures of a crack traveling in the mixed slab, where the progression in time is from top to bottom. The sound waves associated with the crack's dynamics and the material properties of the harmonic and anharmonic regions of the mixed slab are labeled.
(1) the daughter crack is born; (2) the daughter crack travels at the longitudinal sound speed of the harmonic slab; (3) the granddaughter crack is born; and (4) the granddaughter crack speeds ahead to Mach 1.6, matching the crack speed in the anharmonic slab. We have shown that the behavior of the crack in the harmonic crystal is controlled by the conventional elastic wave speeds [7]. In contrast, the crack behavior in the anharmonic crystal is controlled by the local wave speeds, which play an important role in the dynamic behavior of crack propagation. A similar dependence has been identified in a related phenomenon where hyperelasticity plays an important role in whether a solid will undergo brittle fracture or ductile failure at the crack tip [14]. The local wave speeds represent the non-linear, hyperelastic material properties near the cohesive failure of atomic bonds and can differ significantly from the conventional elastic (shear and longitudinal) wave speeds, which represent the material properties under infinitesimal deformation. The crack propagation speeds observed in the molecular dynamics simulations are tabulated in Table 2 for comparison.
Table 2. Observed speeds at which a daughter crack or a granddaughter crack nucleates, and the limiting speeds, for the harmonic, anharmonic and mixed simulation cases

                       Harmonic    Anharmonic    Mixed
Daughter crack         3.5         4.1           4.1
Granddaughter crack    –           –             10.35
Limiting speed         9.4         15            15
Comparing the MD simulation results with the calculated wave speeds for harmonic and anharmonic crystals, we reach the following conclusions. The harmonic case is consistent with the linear elastic theory of intersonic crack propagation. We have previously discussed this case in detail for the 2D harmonic MD simulation [7]. The same theory is found to apply for the present 3D case. Essentially, the initial crack starts to propagate when the Griffith criterion is satisfied. Near the Rayleigh wave speed, the crack encounters a velocity barrier and a vanishing of the stress singularity at the crack tip; i.e., both the stress intensity factor and the energy release rate vanish at the Rayleigh wave speed. This velocity barrier is overcome by the nucleation of a daughter crack at a distance ahead of the mother crack. This distance corresponds to the shear wave front at which the peak shear stress reaches the critical magnitude needed to cause cohesive failure of the interface. The daughter crack's speed is limited only by the longitudinal wave speed. Comparing Tables 1 and 2, we see that the daughter crack indeed nucleates at the Rayleigh wave speed and the limiting speed agrees very well with the longitudinal wave speed for the harmonic crystal. The mother–daughter mechanism described above is consistent with the Burridge–Andrews model of intersonic crack propagation [15]. In the anharmonic case, the nucleation of the daughter crack is consistent with linear elastic theory. The mother crack initiates according to the Griffith criterion and achieves a limiting velocity equal to the Rayleigh wave speed. At this point, it is necessary to nucleate a daughter crack to break the velocity barrier. The limiting speed of the daughter crack is more than 50% higher than the longitudinal wave speed and cannot be explained by the linear theory of intersonic fracture. In comparison, the calculated local wave speed is approximately equal to (only 10% lower than) the observed limiting speed. In calculating the local wave speeds, we have ignored the large gradient of the deformation field near the crack tip. In view of this simplification, we conclude that the local wave speed provides a reasonable explanation of the observed limiting crack speed. In the mixed case, the nucleation of the daughter crack still occurs at the Rayleigh wave speed for reasons discussed above. The daughter crack breaks the velocity barrier at the Rayleigh speed and propagates near the longitudinal wave speed. This behavior is similar to the harmonic case. However, at a
velocity of 10.35, we observe a granddaughter crack forming ahead of the daughter crack. The critical speed at which this transition occurs is very close to the local shear wave speed in the anharmonic crystal. The granddaughter crack rapidly accelerates toward the local longitudinal wave speed of the stretched non-linear solid. It is tempting to conclude that the nucleation of the granddaughter crack is controlled by the local Rayleigh wave speed magnified by the local stress concentration, although this issue is worth further investigation. In summary, we conclude that the local wave speeds play a dominant role in the behavior of cracks in the anharmonic crystals. The mixed case behaves somewhat like an interface crack between a soft material and a stiff material. Although the harmonic and anharmonic crystals have identical material properties under infinitesimal deformation, the local material properties near the crack tip resemble those of a bimaterial. Rosakis et al. [16] have previously studied crack propagation along an interface between PMMA (soft) and Al (hard) and found that the crack speed can significantly exceed the longitudinal wave speed of PMMA. Our present study shows that the crack behavior is dominated by the local (nonlinear) wave speeds. This is not only of theoretical interest, but also of practical importance. It is known that many polymeric materials, especially rubbers, increase their modulus significantly when stretched. The underlying physical mechanism is that initial elasticity in rubbers is due to entropic effects. When stretched to large deformation, the polymeric chains are straightened and covalent atomic bonds eventually dominate their hyperelastic response. In such solids, the elastic modulus increases with strain and the local wave speeds near a crack tip would be larger than the linear elastic wave speeds. Cracks in such solids should propagate at speeds exceeding the conventional wave speeds. Another point worth commenting on here is the remarkable success of continuum theories in predicting the behaviors obtained by atomistic simulations. We have found previously [9] that the MD simulation results for shear crack propagation along a weak interface in harmonic solids are well interpreted by elasticity theories. In particular, calculations based on linear elasticity were able to predict the time and location of the daughter crack as well as the initiation time of the mother crack. Our atomistic simulation of the mother–daughter crack mechanism for an intersonic shear crack is consistent with the continuum-mechanics-based discovery made earlier by Burridge and Andrews [15]. An important message of the present study is that the nonlinear continuum theory of local wave speeds is capable of predicting crack velocities in strongly nonlinear solids. Indeed, it is quite remarkable that the dynamic behavior of cracks may retain its basic nature over such a wide range of length scales, from atomistic calculations using interatomic potentials all the way up to macroscopic laboratory experiments and continuum elasticity treatments. This bridging of length scales in dynamic materials failure should be of great interest to the
general scientific audience since it points out the fantastic power of continuum mechanics.
References [1] F.F. Abraham, R. Walkup, H. Gao et al., Proc. Natl. Acad. Sci., 99, 5777, 2002. [2] M.P. Allen and D.J. Tildesley, "Computer simulation of liquids," Clarendon Press, Oxford, 1987. [3] R. Feynman, R. Leighton, and M. Sands, "The Feynman lectures on physics," Addison–Wesley, Redwood City, 1963. [4] J.E. Gordon, "The new science of strong materials or why you don't fall through the floor," Princeton University Press, Princeton, 1988. [5] L.B. Freund, "Dynamic fracture mechanics," Cambridge Univ. Press, New York, 1990; B. Broberg, Cracks and Fracture, Academic Press, San Diego, 1999. [6] A.J. Rosakis, O. Samudrala, and D. Coker, Science, 284, 1337, 1999. [7] F.F. Abraham and H. Gao, Phys. Rev. Lett., 84, 3113, 2000. [8] F.F. Abraham, J. Mech. Phys. Solids, 49, 2095, 2001. [9] H. Gao, Y. Huang, and F.F. Abraham, J. Mech. Phys. Solids, 49, 2113, 2001. [10] B. Broberg, J. Appl. Mech., 546, 1964. [11] H. Gao, J. Mech. Phys. Solids, 44, 1453, 1996. [12] J. Fineberg, S.P. Gross, M. Marder, and H.L. Swinney, Phys. Rev. Lett., 67, 457, 1991; Phys. Rev. B, 45, 5146, 1992. The dynamic instability is generally known as the "mirror-mist-hackle" instability; a quantitative study of the onset of this instability had not been done previously. [13] F.F. Abraham, D. Brodbeck, R. Rafey et al., Phys. Rev. Lett., 73, 272, 1994; J. Mech. Phys. Solids, 45, 1595, 1997. [14] F.F. Abraham, Phys. Rev. Lett., 77, 869, 1996; F.F. Abraham, D. Schneider, B. Land et al., J. Mech. Phys. Solids, 45, 1461, 1997. [15] R. Burridge, Geophys. J. Roy. Astron. Soc., 35, 439, 1973; D.J. Andrews, J. Geophys. Res., 81, 5679, 1976. [16] A.J. Rosakis, O. Samudrala, R.P. Singh et al., J. Mech. Phys. Solids, 46, 1789, 1998.
Perspective 21 LATTICE GAS AUTOMATON METHODS Jean Pierre Boon Center for Nonlinear Phenomena and Complex Systems, Université Libre de Bruxelles, 1050 Bruxelles, Belgium
When one is interested in studying the dynamical behavior of fluid systems starting at the microscopic level, a logical approach is to begin with a molecular dynamics description of the interactions between the constituting particles. This approach quite often turns into a formidable task when the fluid evolves into a non-linear regime where chaos, turbulence, or reactive processes take place. But one may question whether a 'realistic' description of the microscopic dynamics is indispensable to gain insight into the underlying mechanisms of large-scale non-linear phenomena. Around 1985, a considerable simplification was introduced [1] when pioneering studies established theoretically and computationally the feasibility of simulating fluid dynamics via a microscopic approach based on a new paradigm: a virtual simplified micro-world is constructed as an automaton universe based not on a realistic description of interacting particles, but merely on the laws of symmetry and of invariance of macroscopic physics. Suppose we implement point-like particles on a regular lattice where they move from node to node at each time step and undergo collisions when their trajectories meet at the same node. As the system evolves, we observe its collective dynamics by looking at the lattice from a distance. And the remarkable fact is that, if the collisions occur according to some simple logical rules (satisfying fundamental conservations) and if the lattice has the proper symmetry, this Lattice Gas Automaton shows global behavior very similar to that of a real fluid. So we can infer that, despite its simplicity at the microscopic scale, the lattice gas automaton (LGA) should contain, at the elementary level, the essentials that are responsible for the emergence of complex behavior, and thereby can help us understand the basic mechanisms from which complexity builds up. The LGA consists of a set of particles moving on a regular d-dimensional lattice L at discrete time steps, t = nΔt, with n an integer. The lattice is
composed of V nodes labeled by the d-dimensional position vectors r ∈ L. Associated with each node there are b channels (labeled by indices i, j, . . . , running from 1 to b). At a given time t, a channel can be either occupied by one particle or empty, so that the occupation variable n_i(r, t) = 1 or 0. When channel i at node r is occupied, the particle at the specified node r has velocity c_i. The set of allowed velocities is such that the condition r + c_i Δt ∈ L is fulfilled. The "exclusion principle" requirement that the maximum occupation be one particle per channel allows for a representation of the automaton configuration in terms of a set of bits {n_i(r, t)}, r ∈ L, i = 1, . . . , b. The evolution rules are thus simply logical operations over sets of bits. The time evolution of the automaton takes place in two stages: propagation and collision. In the propagation phase, particles are moved according to their velocity vectors, and in the (local) collision phase, the particles occupying a given node are redistributed amongst the channels associated with that node. So the microscopic evolution equation of the LGA reads

n_i(\mathbf{r} + \mathbf{c}_i \Delta t,\, t + \Delta t) = n_i(\mathbf{r}, t) + \Delta_i(\{n_j(\mathbf{r}, t)\}),     (1)
where Δ_i({n_j}) represents the collision term, which depends on all channel occupations at node r. By performing an ensemble average (denoted by angular brackets) over an arbitrary distribution of initial occupations, one obtains a hierarchy of coupled equations for the successive n-body distribution functions. This hierarchy can be truncated to yield the lattice Boltzmann equation for the single-particle distribution function f_i(r, t) = ⟨n_i(r, t)⟩:

f_i(\mathbf{r} + \mathbf{c}_i \Delta t,\, t + \Delta t) - f_i(\mathbf{r}, t) = \Delta_i^{\mathrm{Boltz}}(\{f_j(\mathbf{r}, t)\}),     (2)
The l.h.s. can easily be recognized as the discrete version of the l.h.s. of the classical Boltzmann equation for continuous systems, and the r.h.s. denotes the collision term, where the precollisional uncorrelated-state ansatz has been used to factorize the b-particle distribution function. The lattice Boltzmann Eq. (2) is one of the most important results in LGA theory. It can be used as the starting point for the derivation (via multi-scale analysis) of the macroscopic equations describing the long-wavelength behavior of the lattice gas. The LGA macroscopic equations are found to exhibit the same structure as the classical hydrodynamic equations, and under the incompressibility condition, one retrieves the Navier–Stokes equations for non-thermal fluids. Another important feature of the lattice Boltzmann equation is that it can be used as an efficient and powerful simulation algorithm. In practice one usually prefers to use a simplified equation where the collision term is approximated by a single relaxation-time process inspired by the Bhatnagar–Gross–Krook model, known in its lattice version as the LBGK equation:

f_i(\mathbf{r} + \mathbf{c}_i \Delta t,\, t + \Delta t) - f_i(\mathbf{r}, t) = -\frac{1}{\tau}\left[f_i(\mathbf{r}, t) - f_i^{\mathrm{leq}}(\mathbf{r}, t)\right],     (3)
where the r.h.s. is proportional to the deviation from the local equilibrium distribution function. There is a wealth of applications of the lattice gas methods which have established their validity and their usefulness. LGA simulations, based on Eq. (1), are most valuable for fundamental problems in statistical mechanics such as, for instance, the study of fluctuation correlations in equilibrium and non-equilibrium systems [2, 3]. As an example, Fig. 1 shows the trajectories of tracer particles suspended in a Kolmogorov flow (above the critical Reynolds number) produced by a lattice gas automaton, from which turbulent diffusion was analyzed [4]. Simulations of more direct practical interest, such as profile optimization in car design or turbulent drag problems, are most efficiently treated with the lattice Boltzmann method, in particular using the LBGK model. The examples given in Figs. 2 and 3 illustrate the method for the study of viscous fingering in Hele–Shaw geometry, showing the effect of reactivity between the two fluids as a determinant factor in the dynamics of the moving interface [5]. Applications of the LGA approach and of the lattice Boltzmann equation cover a wide variety of theoretical and practical problems.
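To illustrate Eq. (1) in its simplest form, the sketch below implements an HPP-type lattice gas on a square lattice with four channels per node, using boolean arrays and bitwise logic. Note that the square lattice lacks the symmetry needed to recover the Navier–Stokes equations (that requires the hexagonal FHP lattice of Ref. [1]); this is a minimal illustration of the propagation–collision structure, not a hydrodynamically faithful model, and all parameters are arbitrary.

    import numpy as np

    L = 64
    rng = np.random.default_rng(0)
    # Occupation numbers n_i(r, t): channels 0:+x, 1:-x, 2:+y, 3:-y.
    n = rng.random((4, L, L)) < 0.2

    def collide(n):
        """HPP rule: head-on pairs rotate by 90 degrees; mass and
        momentum are conserved at every node."""
        cx = n[0] & n[1] & ~n[2] & ~n[3]   # head-on collision along x
        cy = n[2] & n[3] & ~n[0] & ~n[1]   # head-on collision along y
        n[0] = (n[0] & ~cx) | cy
        n[1] = (n[1] & ~cx) | cy
        n[2] = (n[2] & ~cy) | cx
        n[3] = (n[3] & ~cy) | cx
        return n

    def propagate(n):
        """Move each channel one lattice unit along its velocity (periodic)."""
        n[0] = np.roll(n[0], +1, axis=1)
        n[1] = np.roll(n[1], -1, axis=1)
        n[2] = np.roll(n[2], +1, axis=0)
        n[3] = np.roll(n[3], -1, axis=0)
        return n

    for _ in range(100):        # one time step = collision + propagation
        n = propagate(collide(n))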
Figure 1. Lattice gas simulation of the Kolmogorov flow: the tracer trajectories reflect the topology of the ABC flow in the regime beyond the critical Reynolds number (Re = 2.5 Re_c).
Figure 2. Lattice Boltzmann (LBGK) simulation of viscous fingering in miscible fluids.
Figure 3. Lattice Boltzmann (LBGK) simulation of viscous fingering showing the interface sharpening effect of a reactive process between the two fluids (compare with Fig. 2).
These range from the dynamics of thermal fluctuations and quantum lattice gas automata to multiphase flow, complex fluids, reactive systems, and inhomogeneous turbulence. The list of selected references given below should guide the reader interested in specific topics related to the subject: statistical mechanics of lattice gas automata [2], lattice gas automaton applications [3], reactive lattice gas automata [6], quantum lattice gas automata [7], complex fluids [8], the lattice Boltzmann method [9], and recent developments [10].
References [1] U. Frisch, B. Hasslacher, and Y. Pomeau, "Lattice gas automata for the Navier–Stokes equation," Phys. Rev. Lett., 56, 1505–1508, 1986. [2] J.P. Rivet and J.P. Boon, Lattice Gas Hydrodynamics, Cambridge University Press, Cambridge, 2001. [3] D. Rothman and S. Zaleski, Lattice Gas Cellular Automata, Cambridge University Press, Cambridge, 1997. [4] J.P. Boon, D. Hanon, and E. Vanden Eijnden, "Lattice gas automaton approach to turbulent diffusion," Chaos, Solitons and Fractals, 11, 187–192, 2000. [5] P. Grosfils and J.P. Boon, "Viscous fingering in miscible, immiscible, and reactive fluids," J. Modern Phys. B, (to appear), 2002. [6] J.P. Boon, D. Dab, R. Kapral, and A. Lawniczak, "Lattice gas automata for reactive systems," Phys. Rep., 273(2), 55–148, 1996. [7] D. Meyer, "From quantum cellular automata to quantum lattice gases," J. Statist. Phys., 85, 551–574, 1996.
[8] B.M. Boghosian, P.V. Coveney, and A.N. Emerton, "A lattice gas model of microemulsions," Proc. R. Soc. A, 452, 1221–1250, 1996. [9] S. Succi, The Lattice Boltzmann Equation, Clarendon Press, Oxford, 2001. [10] P.V. Coveney and S. Succi (eds.), "Discrete modeling and simulation of fluid dynamics," Philos. Trans. R. Soc., 360, 291–573, 2002.
Perspective 22 MULTI-SCALE MODELING OF HYPERSONIC GAS FLOW Iain D. Boyd University of Michigan, Ann Arbor, MI, USA
On March 27, 2004, NASA successfully flew the X-43A hypersonic test flight vehicle at a velocity of 5000 mph, breaking the aeronautics speed record that had stood for over 35 years. The final flight of the X-43A on November 16, 2004 further increased the speed record to 6,600 mph, which is almost ten times the speed of sound. The very high speed attainable by hypersonic airplanes could revolutionize air travel by dramatically reducing inter-continental flight times. For example, a hypersonic flight from New York to Sydney, Australia, a distance of 10,000 miles, would take less than 2 h. Reusable hypersonic vehicles are also being researched to significantly reduce the cost of access to space. Computer modeling of the gas flows around hypersonic vehicles will play a critical part in their development. This article discusses the conditions that can prevail in certain hypersonic gas flows that require a multi-scale modeling approach.
1.
Hypersonic Flight
Hypersonic flight is flight in which the vehicle velocity is much greater than the speed of sound. For the remainder of this article, we will use the Galilean transformation that allows us to consider the hypersonic flow of air around a fixed vehicle as being the same as when a hypersonic vehicle flies through stationary air. In gas dynamics, the Mach number is the ratio of the flow velocity to the speed of sound. Although there is no fixed definition, hypersonic flow is generally considered to involve a Mach number greater than five. The fastest commercial passenger airplane ever developed is the Concorde, which cruised at a Mach number of about two. The fastest military aircraft ever developed is the SR-71 Blackbird, which cruised at a Mach number of three. Important examples of truly hypersonic vehicles include the X-15, powered by rockets and flown
in the 1960s, the recently flown X-43A, powered by a scramjet that reached a peak Mach number of about ten, and the space shuttle, which re-enters the atmosphere at about Mach 25. Note, however, that the space shuttle during its hypersonic re-entry is a glider and does not fly under its own power. Flights of hypersonic passenger aircraft will not happen any time in the near future, as there are many difficult technology issues faced in the development of such vehicles. Some of the most important of these issues concern the basic aerodynamics of the vehicle. Because hypersonic vehicles fly at very high speed, they generate a lot of aerodynamic drag. For cruise conditions, the drag must be overcome by the propulsion system and so directly affects the efficiency of the vehicle. Note that the lack of economic viability of flight at Mach 2 was the primary reason for the end of commercial service of the Concorde. On hypersonic vehicles, the problem is even more acute, calling for minimization of vehicle drag to the greatest extent possible. Also, at hypersonic speed, the high velocity is converted into high gas temperature at the vehicle surface, requiring the development of a thermal protection system (TPS) that may consist of special surface materials and active cooling strategies.
2.
Need for Computer Simulation of Hypersonic Flows
The development of hypersonic vehicles relies heavily on computational modeling. This is in part because laboratory experiments are both technically challenging and cost a lot of money. In order to generate in the laboratory an air flow at Mach 12 representative of a hypersonic flight condition, the air must be compressed to about 100 times atmospheric pressure and heated to a temperature of 7500 K. Expanding this hot, compressed gas through the hypersonic nozzle of a wind-tunnel will typically produce test times of less than one second making measurements difficult. Flight development programs for hypersonic vehicles such as the X-43A are very rare due to the orders of magnitude increase in cost compared to laboratory testing. By comparison, computer simulation techniques offer the potential to provide useful results to aid in the design of hypersonic vehicles in a fraction of the time and money required for laboratory and flight investigations. However, computer simulations are only useful if they are demonstrated to provide accurate results through direct comparison with measured data. Thus, laboratory testing remains an extremely important component in hypersonic gas dynamics research. Finally, similar to the development of any air vehicle, flight tests will always be required to address issues that cannot be foreseen in simulation and laboratory investigations.
3.
Modeling Hypersonic Gas Dynamics
The hypersonic flow of air over a vehicle shape involves a number of complex gas dynamic phenomena. The air becomes compressed as it is decelerated by the presence of the vehicle surfaces. The hypersonic character of the free stream results in an extremely compact region of strong compression called a shock wave. For hypersonic vehicle flight conditions, the shock wave typically has a thickness much less than 1 mm over which gas properties such as pressure, temperature, and density are increased by substantial factors. Downstream of the shock wave, the high-pressure and high-temperature gas may undergo excitation of vibrational energy modes of the air molecules and the molecules may even begin to react chemically. As the gas finally approaches the surface of the vehicle, it encounters a relatively cold surface that causes the density to rise further in another thin layer of fluid called the boundary layer. It is the properties of the gas in this final layer of fluid immediately adjacent to the vehicle surface that determine the drag force and heat transfer to the vehicle. The most fundamental model of dilute gas dynamics is the Boltzmann equation, which describes the evolution of the velocity distribution function of molecules. By taking moments of the Boltzmann equation, Maxwell derived the equation of change [1]:

\frac{\partial (n\langle W \rangle)}{\partial t} + \frac{\partial (n\langle c_i W \rangle)}{\partial x_i} = \Delta[W]     (1)
in which W(c_i) is some molecular property that depends on the random velocity c_i in the x_i direction, ⟨ ⟩ indicates the average over the velocity distribution function, otherwise called the moment, and Δ[ ] indicates the change due to collisions. For example, if W is the mass of a molecule, then in the absence of chemical reactions the mass of the particle is conserved in a collision, and Eq. (1) becomes the well-known continuity equation of continuum gas dynamics. In addition, if equilibrium is assumed, and W is set to the momentum vector and the energy of a molecule, Eq. (1) provides a set of five continuum transport equations that is often called the Euler equations. This equation set corresponds to the case where the velocity distribution function everywhere in the flow is of the equilibrium Maxwellian form. To derive higher-order sets of transport equations from Maxwell's equation of change, it must be noted from Eq. (1) that the temporal derivative of any moment ⟨W⟩ depends on the divergence of the next-higher velocity moment. This problem is addressed by one of two methods. In the Chapman–Enskog approach, a specific form of the velocity distribution function is assumed for flows perturbed slightly from the equilibrium state. In Grad's method of moments, specific relations are assumed between the second- and fourth-order velocity moments. Setting W = mc_i c_j and W = mc_i c_j c_k in Eq. (1), using either the Chapman–Enskog or
the Grad methods, leads to a set of 20 equations (the 20-moment equations) consisting of the five Euler equations, five further equations involving the shear stress tensor τ_ij, and ten further equations involving the symmetric heat flux tensor Q_ijk. For modeling hypersonic gas flow (and many other flows involving viscosity and thermal conductivity effects), a set of transport equations called the Navier–Stokes (NS) equations is widely employed. The NS equations are obtained from the 20-moment equations by replacing the heat flux tensor with a heat flux vector, q_i. As discussed above, one of the methods for deriving transport equation sets from Maxwell's equation of change is the Chapman–Enskog approach, in which the following form for the velocity distribution function is assumed explicitly:
f(C) = f_0(C)\,\Phi(C), \qquad C = \frac{c}{(2kT/m)^{1/2}}     (2)

f_0(C)\,dC = \pi^{-3/2}\, e^{-C^2}\, dC     (3)

\Phi(C) = 1 + q_i^* C_i \left(\tfrac{2}{5} C^2 - 1\right) - \tau_{ij}^* C_i C_j     (4)

q_i^* = -\frac{\kappa}{P}\left(\frac{2m}{kT}\right)^{1/2} \frac{\partial T}{\partial x_i}, \qquad \tau_{ij}^* = \frac{\mu}{P}\left(\frac{\partial c_i}{\partial x_j} + \frac{\partial c_j}{\partial x_i} - \frac{2}{3}\,\delta_{ij}\,\frac{\partial c_k}{\partial x_k}\right)     (5)
where f_0 is the equilibrium Maxwellian form of the distribution. When the Chapman–Enskog parameter Φ is only slightly perturbed from unity, the 20-moment (or perhaps the Navier–Stokes) equations are valid. When Φ is sufficiently far from unity, these equation sets can be expected to fail, and a more detailed approach is required. Examination of Eqs. (4) and (5) indicates that Φ will depart strongly from unity when gradients in temperature and/or velocity become significant. This is exactly the situation in the critical regions of the shock waves and boundary layers created around hypersonic vehicles. In these regions, the flow field gradients are so steep that the Chapman–Enskog distribution function cannot accurately describe the strong non-equilibrium behavior. Thus, higher-order terms must be included in the perturbation to the Maxwellian distribution. In a continuum sense, these additional terms introduce higher-order derivatives into the conservation equations. The next higher set of conservation equations produced in this manner is called the Burnett equations. Research on numerical solution of the Burnett equations has met with mixed success. The mathematical properties of the full set of Burnett equations present several difficulties in terms of numerical solution. Attempts to simplify the equations by eliminating "difficult" terms have resulted in unfortunate side effects such as a failure to obey the second law of thermodynamics. In addition, development of the
associated boundary conditions required for these higher-order equation sets presents problems of its own. An alternative approach to the numerical simulation of the strongly non-equilibrium conditions experienced in hypersonic shock waves and boundary layers is the direct simulation Monte Carlo (DSMC) method [2]. In the DSMC technique, model particles are used to simulate the motions and collisions of real molecules. In a single iteration of the DSMC technique, the model particles are first moved over the distance given by the product of their velocity vector and a time step Δt that is smaller than the mean free time, the average time that elapses between successive collisions of a molecule. After all particles are moved, any boundary interactions are processed, such as collisions of particles with a solid wall. Next, the particles are binned into computational cells. The size of each cell should be less than the local mean free path, λ, the average distance a molecule travels between successive collisions. Within each of the cells, a number of particles are paired together, and collision probabilities are determined for each pair to decide which pairs actually collide. For the pairs that collide, collision mechanics is applied that conserves linear momentum and energy. Finally, the particle properties in each cell are sampled to determine time-averaged, macroscopic properties such as density, flow velocity, temperature, and pressure. The DSMC technique has been successfully applied to a variety of non-equilibrium gas flow systems including hypersonic aerothermodynamics, spacecraft propulsion systems, materials processing, micro-scale gas flows, and space physics. Since the DSMC technique requires that the dimensions of its computational cells be of the order of a local mean free path (less than 1 mm under hypersonic vehicle flight conditions), the method becomes prohibitively expensive for continuum flows.
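The sketch below walks through one such DSMC iteration for a hard-sphere gas in a 1D array of cells. It is schematic: the collision-pair selection uses a simple acceptance probability rather than Bird's no-time-counter scheme, and all parameters (particle weight w, cross section sigma, the per-unit-area cell volume) are placeholders.

    import numpy as np

    def dsmc_iteration(x, v, dt, length, ncells, sigma, w, rng):
        """One schematic DSMC step: move, wall reflection, cell binning,
        pairwise hard-sphere collisions, then sampling."""
        # 1. Move particles (dt must be below the local mean free time).
        x += v[:, 0] * dt
        # 2. Boundary interaction: specular reflection at x = 0 and x = length.
        out = (x < 0.0) | (x > length)
        x[out] = np.clip(x[out], 0.0, length)
        v[out, 0] *= -1.0
        # 3. Bin into cells (cell size should be below the local mean free path).
        dx = length / ncells
        cell = np.minimum((x / dx).astype(int), ncells - 1)
        vol = dx   # cell volume per unit cross-sectional area (placeholder)
        # 4. Collide random pairs within each cell.
        for c in range(ncells):
            idx = rng.permutation(np.flatnonzero(cell == c))
            for i, j in zip(idx[::2], idx[1::2]):
                g = v[i] - v[j]
                gmag = np.sqrt(g @ g)
                if rng.random() < w * sigma * gmag * dt / vol:
                    # Hard-sphere mechanics: isotropic scattering, conserving
                    # linear momentum and energy (equal masses assumed).
                    cth = 2.0 * rng.random() - 1.0
                    sth = np.sqrt(1.0 - cth * cth)
                    phi = 2.0 * np.pi * rng.random()
                    gnew = gmag * np.array([cth, sth * np.cos(phi),
                                            sth * np.sin(phi)])
                    vcm = 0.5 * (v[i] + v[j])
                    v[i] = vcm + 0.5 * gnew
                    v[j] = vcm - 0.5 * gnew
        # 5. Sample a macroscopic property (here: mean velocity per cell).
        u = np.array([v[cell == c, 0].mean() if np.any(cell == c) else 0.0
                      for c in range(ncells)])
        return x, v, u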
4.
The Multi-scale Hybrid Simulation Approach
The above discussion indicates that, in aiming to develop a computer model of the gas flow around hypersonic vehicles, we are faced with a dilemma. Solution of the sets of continuum conservation equations (such as the Navier–Stokes equations) using standard methods from computational fluid dynamics (CFD) [3] is relatively efficient numerically but may be physically inaccurate in the important regions of the shock waves and boundary layers, where flow field gradients become very steep. The direct simulation Monte Carlo method is physically valid for the entire flow field, but is so numerically expensive as to make this approach impossible. One way of considering this problem is from a multi-scale perspective. Specifically, the continuum CFD approach is physically valid except in small, thin regions where the scale length changes so
dramatically that sub-scale physical processes become important. Since these sub-scale, molecular-level processes can be accurately simulated using the DSMC technique, a multi-algorithm approach to the problem presents itself as a natural way to proceed. This is a common solution technique in many areas of physics and engineering for multi-scale phenomena. Thus, a hybrid method that uses a CFD approach for most of the flow field, and the DSMC technique only in the high-gradient, sub-scale regions, appears to offer the potential to combine the physical accuracy and numerical efficiency of the component methods. There are two key issues in the development of a hybrid CFD–DSMC method: (1) how to determine the location of the domain boundaries between the CFD and DSMC regions; and (2) how to communicate information back and forth between continuum and discrete representations of the same gas flow. The first issue is resolved using continuum breakdown parameters that assess the physical accuracy of the continuum conservation equations. There are a number of such parameters in the literature [4, 5], and the most successful of these are defined in terms of flow field gradients similar to those that appear in the Chapman–Enskog distribution, as sketched below. The second issue is usually resolved by computing fluxes of conserved properties (mass, momentum, and energy) across the CFD/DSMC domain interfaces. This can either be achieved directly using a finite-volume CFD formulation, or by evaluating the fluxes from primitive variables such as density, velocity, and temperature.
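A minimal sketch of the first ingredient follows, using a gradient-length-scale breakdown parameter of the assumed form Kn_GL = λ|∇Q|/Q applied to density and temperature fields, in the spirit of Ref. [4]; the cutoff value of 0.05 is a commonly quoted threshold but should be treated as an assumption here.

    import numpy as np

    def dsmc_cells(x, rho, temp, mfp, cutoff=0.05):
        """Flag cells where a continuum (CFD) description is suspect.

        x    : 1D cell-center coordinates along the stagnation line
        rho  : density field; temp : temperature field
        mfp  : local mean free path in each cell
        Returns a boolean mask; True means 'hand this cell to DSMC'.
        """
        kn_gl = np.zeros_like(x)
        for q in (rho, temp):
            grad = np.gradient(q, x)             # finite-difference gradient
            kn_gl = np.maximum(kn_gl, mfp * np.abs(grad) / q)
        return kn_gl > cutoff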
5.
Illustrative Results
The primary objective of a hybrid CFD–DSMC simulation is to improve the physical accuracy of a pure CFD solution without paying the enormous computational penalty of a pure DSMC solution. A common approach is to first obtain a pure CFD solution and then use it to initialize a hybrid simulation. Figures 1 and 2 show some illustrative results for Mach 12 flow of molecular nitrogen over a blunt cone from a hybrid simulation performed in this manner. In Fig. 1, the decomposition of the flow domain between the CFD and DSMC methods is shown. A breakdown parameter based on flow field gradients is employed [4] and shows, as expected, that the DSMC technique is required in the bow shock wave and in the boundary layer next to the body surface. Figure 2 shows comparisons of three different numerical solutions along the axis of the flow (the stagnation streamline). Assuming that the pure DSMC solution is more physically accurate than the pure CFD solution, what we would like to see is that the hybrid simulation starts from the initial CFD solution and moves it to the pure DSMC solution. This is exactly the behavior achieved in the results shown in Fig. 2.
Figure 1. CFD (white) and DSMC (grey) domains for Mach 12 flow over a cone.
Figure 2. Numerical solutions of temperature along the flow axis.
6.
Outlook
The results shown in the figures represent a solid foundation for the further development of hybrid CFD-DSMC methods for hypersonic gas flows. The physical modeling capabilities of the hybrid methods need to be extended beyond the perfect gas description that is presently employed. Specifically,
models for vibrational excitation and chemical reactions must be developed. In addition, the numerical performance of the hybrid scheme must be improved dramatically: in the results shown here, the cost of the hybrid simulation was greater than that of the pure DSMC solution! Therefore, it is expected that refinement of hybrid CFD–DSMC methods will continue for several years. The development of such methods is intellectually challenging, and the need for accurate simulations of these complex hypersonic flows provides plenty of practical motivation.
References [1] T.I. Gombosi, Gaskinetic Theory, Cambridge University Press, Cambridge, 1994. [2] G.A. Bird, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University Press, New York, 1994. [3] J.C. Tannehill, D.A. Anderson, and R.H. Pletcher, Computational Fluid Mechanics and Heat Transfer, Taylor and Francis, Philadelphia, 1997. [4] W.-L. Wang and I.D. Boyd, "Predicting continuum breakdown in hypersonic viscous flows," Phys. Fluids, 15, 91–100, 2003. [5] A.L. Garcia, J.B. Bell, W.Y. Crutchfield, and B.J. Alder, "Adaptive mesh and algorithm refinement using direct simulation Monte Carlo," J. Comp. Phys., 154, 134–155, 1999.
Perspective 23 COMMENTARY ON LIQUID SIMULATIONS AND INDUSTRIAL APPLICATIONS Raymond D. Mountain Physical and Chemical Properties Division, Chemical Science and Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899-8380, USA
Molecular dynamics and Monte Carlo simulations have proven to be invaluable tools in the study of liquids. Statistical mechanics tells us what to calculate to obtain liquid state properties (equation of state, specific heat, and transport coefficients, for example) from a model of the interaction between molecules. Simulations are the tools needed to make those calculations for physically realistic models. While the impact of simulations on the investigation of fluid properties and structure cannot be overstated, there has been very little success in the use of these methods to predict liquid properties outside the range where model potentials have been fit to some property. One result of this is that industrial modelers, who are called upon to estimate properties for various compositions, temperatures, and pressures of liquids, have not found simulation methods to be of much value for their work. Some of the reasons for the lack of use of molecular-level simulations in the industrial sector will be examined here. These reasons point to opportunities and challenges for research that could lead to industrially useful molecular simulation methods and practices. A situation where simulations could be industrially useful involves conditions where data are scarce and measurements would be difficult, expensive, or hazardous. The solubility of oxygen in a combustible liquid at elevated temperature is an example. Another would involve compounds that are expensive to prepare. Specific properties that have industrial interest and where simulations could be useful are the vapor pressure, the solubility of gases in liquids, and the thermal conductivity and shear viscosity of liquids. Simulation methods exist for determining these properties, although they are not easily used by non-experts.
The ingredients that go into a simulation are model interaction potentials and state conditions. For this discussion, boundary conditions, initial conditions, and simulation software will not be discussed. At present, almost all model intermolecular potential functions are empirical or semiempirical constructs that have parameters adjusted so that some set of calculated properties match the experimental values of those properties over some range of state conditions. The accuracy of property predictions outside the range where the fit was made is unpredictable. This is not a satisfactory state of affairs, as uncertainty estimates are essential for industrial purposes. The way a molecule–molecule interaction is represented can be quite arbitrary. For example, interaction sites may be placed on atomic sites, or they may be placed elsewhere. Often, the location of "atomic sites" differs from the location of the sites on the isolated molecule. A related issue is the placement of partial charges to mimic the charge distribution in a molecule. When a mean-field approach to induced polarization is employed, the criteria for the placement and magnitude of the charges are not governed by unambiguous physical considerations. Since the electrostatic interactions are generally much stronger than dispersion interactions, small changes in the magnitude and/or position of charges can have significant consequences. How to improve the predictive ability of molecular simulations for thermodynamic and transport properties is an open topic. At the root of the problem is the quantum mechanical nature of the potentials, which are the result of effectively replacing electronic degrees of freedom by potential functions. This suggests that at least some of the electronic degrees of freedom should be explicitly included in the simulation. How to do this in a way that is computationally tractable for system sizes of interest, for simulation intervals of sufficient length that statistically adequate samples are obtained, and that includes the necessary physics so that the arbitrariness is reduced to acceptable levels is not yet understood, although there are continuing research efforts. One approach is to use high-level quantum chemistry methods to calculate directly the energy of interaction between pairs of molecules and then to map the results onto functions that represent the energy of interaction for various separations and orientations of a pair of molecules (a toy version of this fitting step is sketched below). This can result in an accurate representation of the energy of a pair of molecules, but may be quite complicated and may require considerable computational effort if used in a molecular dynamics or Monte Carlo simulation. Also, significant many-body effects associated with induced polarization require separate calculations. Extending this approach to explicitly consider three or more molecules simultaneously is beyond current capabilities for all but very simple molecules. A second scheme is to use quantum mechanics to calculate energies and forces as needed during a simulation. These methods are currently limited to using low-level methods such as density-functional-based quantum methods. Even so, the computational load is heavy, the number of molecules is restricted to 100 or so, and the accuracy of predicted properties relative to experimental values is often no better than that obtained from empirical models.
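As a toy illustration of the first approach, the sketch below fits a Lennard–Jones form to a set of dimer interaction energies; the "computed" energies here are synthetic placeholders standing in for quantum chemistry output, and all parameter values are arbitrary.

    import numpy as np
    from scipy.optimize import curve_fit

    def lj(r, eps, sigma):
        """Lennard-Jones 12:6 pair energy."""
        x = (sigma / r) ** 6
        return 4.0 * eps * (x * x - x)

    # Stand-in for ab initio dimer energies on a grid of separations.
    r = np.linspace(3.2, 8.0, 25)
    rng = np.random.default_rng(2)
    e_qc = lj(r, 0.40, 3.40) + 0.005 * rng.standard_normal(r.size)  # synthetic

    (eps_fit, sigma_fit), _ = curve_fit(lj, r, e_qc, p0=(0.5, 3.0))
    # eps_fit, sigma_fit now parameterize the pair potential used in a
    # simulation; many-body (induced polarization) effects are not captured.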
to 100 or so, and the accuracy of predicted properties relative to experimental values is often no better than that obtained from empirical models. The challenge and opportunity is to find and develop ways to incorporate the relevant quantum mechanical interactions into forms that are computationally manageable for molecular simulations. A second reason why industrial modelers have not embraced molecular simulations is that the computational time required to obtain statistically significant results, particularly for the viscosity and the thermal conductivity, can be quite long. This is the case whether the equilibrium time correlation function/Einstein relation approach is used or a nonequilibrium, externally applied gradient is imposed on the fluid. The challenge and opportunity is to develop alternative, efficient ways to extract transport coefficients with known uncertainties from simulations.
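To make the time-correlation-function route concrete, here is a minimal sketch of a Green–Kubo estimate of the shear viscosity from a time series of one off-diagonal pressure-tensor component, using the standard relation η = [V/(k_B T)] ∫₀^∞ ⟨P_xy(0)P_xy(t)⟩ dt. The array name, sampling interval, and the simple truncation rule below are illustrative assumptions; the stress time series would come from whatever simulation code is in use.

```python
import numpy as np

def green_kubo_viscosity(pxy, dt, volume, temperature, kB=1.380649e-23):
    """Estimate the shear viscosity (Pa s) from a time series of the
    off-diagonal pressure-tensor component P_xy (SI units assumed):
        eta = V/(kB*T) * integral_0^inf <P_xy(0) P_xy(t)> dt
    """
    n = len(pxy)
    pxy = pxy - pxy.mean()
    # Autocorrelation via FFT, zero-padded to avoid circular wrap-around.
    f = np.fft.rfft(pxy, 2 * n)
    acf = np.fft.irfft(f * np.conj(f))[:n]
    acf /= np.arange(n, 0, -1)          # unbiased normalization per lag
    # Truncate the integral before the noisy tail dominates (crude rule).
    cutoff = n // 10
    integral = np.trapz(acf[:cutoff], dx=dt)
    return volume / (kB * temperature) * integral

# Hypothetical usage: P_xy sampled every 10 fs from an MD run.
# pxy = np.loadtxt("pressure_xy.dat")
# eta = green_kubo_viscosity(pxy, dt=1.0e-14, volume=8.0e-26, temperature=300.0)
```

In practice one would average over the three independent off-diagonal components and several independent runs; the choice of where to truncate the noisy tail of the correlation function is exactly where the uncertainty estimates discussed above become difficult.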
Perspective 24 COMPUTER SIMULATIONS OF SUPERCOOLED LIQUIDS AND GLASSES Walter Kob Laboratoire des Verres, Université Montpellier 2, 34095 Montpellier, France
Glasses are materials that are ubiquitous in our daily life. We find them in such diverse items as window panes, optical fibers, computer chips, and ceramics, all of which are oxide glasses, as well as in food, foams, polymers, and gels, which are mainly of organic nature. Roughly speaking, glasses are solid materials that have no translational or orientational order on a scale beyond O(10) diameters of the constituent particles (atoms, colloids, . . .) [1]. Note that these materials are not necessarily homogeneous since, e.g., alkali glasses such as Na2O–SiO2 show (disordered!) structural features on the length scale of 6–10 Å (compare to the interatomic distance of 1–2 Å), and gels can have structural inhomogeneities that extend up to macroscopic length scales. Apart from their relevance as technological materials, glasses, or more generally glass-forming liquids, are also an important subject of fundamental research, since many of their unique properties are not understood well (or at all) from a microscopic point of view, and it is a great challenge to close this gap in our knowledge. For example, why does the addition of 100 ppm of water change the viscosity of silica, SiO2, by 2–3 orders of magnitude, a question that is important, e.g., for understanding the flow of magmatic material? What are the mechanisms that make a glass “age”, i.e., lead to a change in its properties with time, a phenomenon that is highly important for understanding material properties on long time scales? What are the “best” compositions to obtain metallic glasses with a prescribed mechanical/electric/magnetic/. . . property? The most pertinent question is, however, an embarrassingly simple one: What is a glass? To address this question we can consider the temperature dependence of the viscosity η of a glass-forming liquid, i.e., of a liquid that can be supercooled by a significant fraction of its melting temperature.
Experimentally it is found that η shows a very strong T-dependence, increasing upon cooling from values on the order of 10^-2 Pa·s to 10^14 Pa·s (these values are typical for atomic and molecular liquids). This is demonstrated in Fig. 1, where we show an Arrhenius plot of η(T) for various glass-forming liquids. Note that in order to take into account the different nature of the liquids, and hence the different relevant temperature scales, we have normalized the temperature axis by Tg, the temperature at which the viscosity of the liquid takes the value 10^12 Pa·s, i.e., a value that corresponds to a typical relaxation time of 1–100 s. From this plot we thus see that a change in temperature by a factor of two leads to an increase of the viscosity by about 10–15 decades! If the temperature is lowered a bit further, the viscosity, and hence the relaxation times, increase even more and the material no longer flows on human time scales, i.e., it has become a glass. The reason for the dramatic slowing down of the dynamics of glass-forming liquids is presently not really known. Although there are reliable theoretical approaches, such as the
Figure 1. Temperature dependence of the viscosity of various glass-forming liquids as a function of Tg/T, where Tg is the glass transition temperature. Adapted from Angell et al. [2] (reproduced with permission).
so-called “mode-coupling theory of the glass transition” [3], that are able to rationalize the observed slowing down in a semi-quantitative and sometimes even quantitative way, these approaches usually work only in a relatively narrow temperature range, i.e., in a T-window in which η is on the order of 10^-1–10^4 Pa·s and thus changes by “only” 3–5 decades. Finding a reliable theoretical description that is able to describe the slowing down over the whole T-range is one of the most important issues in the field. Despite the many experiments that have been made to determine the properties of glass-forming systems [4], and hence to shed some light on the reason for the occurrence of the glass transition, one is currently still far from having an answer. The problem is that in experiments the microscopic information needed to identify the mechanism for the slowing down, namely the trajectories of the particles as well as their statistical properties, is not really available. Therefore, in the last twenty years there have also been large efforts to use computer simulations to study glass-forming systems [5]. Roughly speaking, these numerical efforts can be divided into two large, not completely disjoint, groups. The first comprises simulations that aim to increase our understanding of a specific (real) material, or a narrow class of materials, such as silica, ortho-terphenyl, polyethylene, etc. Since the properties of these materials are often very specific, it is necessary to use force fields that are highly accurate. Although accurate effective force fields are often available for crystalline systems, this is unfortunately not the case for supercooled liquids, and therefore results obtained with these types of interactions must always be viewed with some caution. The alternative is to use ab initio calculations in which the forces between the ions are determined directly from the electronic degrees of freedom [6]. Although the interactions obtained in this way are very accurate, the price one has to pay is an increase of about a factor of 10^6 in the computational burden. Therefore, current state-of-the-art ab initio simulations are done on systems with only a few hundred particles and cover a time scale of just a few tens of picoseconds. The second large group of simulations of glass-forming materials comprises numerical investigations aimed at understanding the more universal aspects of these systems. Here one uses models that are relatively simple, such as Lennard-Jones particles, lattice gases, or spin models, i.e., systems in which the positions of the “particles” are frozen on a lattice and only their orientational degrees of freedom are relevant. Due to their simplicity these models are very useful for obtaining results with good statistics and for studying the dynamical behavior of the system at relatively long times, which presently means O(10^9) time steps. Although making 10^9 time steps for a system of O(1000) particles is still a huge numerical effort, it is significantly less than the number of steps needed to reach macroscopic time scales. For example, in the case of silica a typical time step is around 1 fs, and hence 10^9 steps correspond to only 1 µs! Thus the last 6–8 decades in η of the curves
shown in Fig. 1 are presently not accessible in (equilibrium!) simulations, and thus this technique is presently not able to make a significant contribution to our understanding of these highly supercooled liquids. Note that here we emphasize the notion of “equilibrium”. Of course it is possible to quench the system rapidly to a relatively low temperature, e.g., by coupling it to a heat bath, and then to start a microcanonical or canonical simulation. However, since the system is now out of equilibrium, the measured properties might be, and usually are, very different from the ones found for the same system at this temperature but in equilibrium [10]. Note that in principle one is of course not forced to use these 10^9 time steps to do a “realistic dynamics”, i.e., one in which Newton’s equations of motion are solved by means of a standard algorithm, such as the one of Verlet. Instead it would be perfectly reasonable to use a cleverly designed Monte Carlo scheme which allows one to equilibrate the system at relatively low temperatures, and then to use the equilibrium configurations so generated as starting points for a “conventional” simulation, i.e., a dynamics in which the particles follow Newton’s equations of motion. This approach is used very successfully, e.g., in the context of second-order phase transitions, another physical situation where the dynamics becomes exceedingly slow (critical slowing down). However, whereas in this latter case there are indeed efficient algorithms, such as the one by Swendsen and Wang [7], that allow one to equilibrate the system even very close to the critical point, this is not the case for glass-forming liquids, despite strong efforts to find such accelerating schemes. Nevertheless, some progress has already been made [8], and it can therefore be hoped that in the not too distant future it will indeed be possible to equilibrate glass-forming systems even at very low temperatures. (Note that, since from a mathematical point of view equilibrating a glass-forming liquid has many similarities with more formal optimization problems, such as the traveling salesman problem (in both cases one searches for low-lying minima of a complicated function, the potential energy/cost function), in recent years there have been quite a few very fruitful exchanges between these two communities that have led to an improvement of the algorithms [8].) Once such a method has been found, it will be possible to investigate on the microscopic level the vibrational and relaxation dynamics of such systems, and hence to make a much closer connection to experimental data, to interpret them in a more reliable way, and to make new predictions on the structure and dynamics of glass-forming systems. On the other hand, such simulations will also offer the possibility of obtaining a better theoretical (analytical) description of the dynamics of supercooled liquids. As a possible example we return to the above-mentioned mode-coupling theory. In this theory one discusses the temperature dependence of the so-called intermediate scattering function F(q, t), a space-time correlation function that is directly measurable
in light or neutron scattering experiments, and hence is of great theoretical and practical importance [9]. This function is defined by

$$F(q,t) = \frac{1}{N}\sum_{j=1}^{N}\sum_{k=1}^{N}\left\langle \exp\!\left[i\,\mathbf{q}\cdot\left(\mathbf{r}_j(t)-\mathbf{r}_k(0)\right)\right]\right\rangle, \qquad (1)$$
where r_j(t) is the location of particle j at time t and q is a wave-vector. Using a procedure called the “Zwanzig–Mori projection operator formalism”, it is possible to derive exact equations of motion for F(q, t), and one finds that they have the form [9]:

$$\ddot{F}(q,t) + \Omega_q^2\,F(q,t) + \Omega_q^2 \int_0^{t} M(q,\,t-t')\,\dot{F}(q,t')\,dt' = 0. \qquad (2)$$
Here the squared frequency Ω_q² is given by q²k_B T/(m S(q)), where m is the mass of the particles and S(q) is the static structure factor. The mode-coupling theory now makes the approximation that the so-called “memory function” M(q, t) is a bilinear product of F(q′, t) with coefficients that depend only on S(q′) [3]. As mentioned above, the theory works very well at temperatures corresponding to low and intermediate values of the viscosity. At lower temperatures, however, the approximation seems to be no longer accurate, and for the moment it is not at all clear how it can be improved. To advance, it will thus be necessary to determine the memory function and to see why the approximation is no longer good. This can be done, e.g., by making large-scale molecular dynamics simulations in order to determine F(q, t) with high precision. Using a simple Laplace transform it is then possible [9] to invert Eq. (2) in order to express M(q, t) as a function of F(q, t), and hence to see how the mode-coupling approximations can be improved. This procedure would thus finally allow an important step forward in our understanding of the relaxation dynamics of deeply supercooled liquids, a field that has not seen much theoretical progress for about 20 years, i.e., since the mode-coupling theory was proposed. Thus it is evident that for the next few years the big challenges in simulations of glass-forming liquids and glasses are to obtain potentials that are sufficiently accurate to describe these disordered structures on a quantitative level, and to develop new simulation algorithms that allow one to equilibrate these systems also at low temperatures. Once these two goals are attained, it will be possible on the one hand to make qualitative and quantitative predictions on the properties of glass-forming materials, and on the other hand to take an important step forward in our understanding of the nature of the glassy state.
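As an illustration of the numerical side of this program, the following is a minimal sketch of how F(q, t) of Eq. (1) can be estimated from stored configurations of a molecular dynamics run. The trajectory array shape, the wave-vector choice, and the averaging over time origins are placeholder assumptions about how the data are stored.

```python
import numpy as np

def intermediate_scattering(positions, q):
    """Coherent intermediate scattering function F(q, t) for one wave-vector.

    positions: array of shape (n_frames, N, 3) from an MD trajectory.
    Uses F(q,t) = <rho_q(t0+t) rho_q*(t0)> / N with
    rho_q(t) = sum_j exp(i q . r_j(t)), averaged over time origins t0,
    which is identical to the double sum in Eq. (1).
    """
    n_frames, N, _ = positions.shape
    rho = np.exp(1j * (positions @ q)).sum(axis=1)        # shape (n_frames,)
    F = np.empty(n_frames)
    for t in range(n_frames):
        F[t] = (rho[t:] * np.conj(rho[:n_frames - t])).mean().real / N
    return F

# Hypothetical usage: q must be commensurate with the periodic box of side L;
# |q| is usually chosen near the first peak of the structure factor S(q).
# F = intermediate_scattering(traj, q=np.array([2 * np.pi / L, 0.0, 0.0]))
```

Note that F(q, 0) reduces to the static structure factor S(q); for statistically meaningful results one averages over all wave-vectors of the same magnitude compatible with the simulation box.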
References
[1] J. Zarzycki (ed.), Materials Science and Technology, Vol. 9: Glasses and Amorphous Materials, VCH Publ., Weinheim, 1991.
[2] C.A. Angell, P.H. Poole, and J. Shao, “Glass-forming liquids, anomalous liquids, and polyamorphism in liquids and biopolymers,” Nuovo Cim. D, 16, 993–1025, 1994.
[3] W. Götze and L. Sjögren, “Relaxation processes in supercooled liquids,” Rep. Prog. Phys., 55, 241–376, 1992.
[4] K.L. Ngai, G. Floudas, A.K. Rizos, and E. Riande (eds.), “Relaxation in complex systems,” J. Non-Cryst. Solids, 307–310, 1–1080, 2002.
[5] W. Kob, “Supercooled liquids, the glass transition, and computer simulations,” In: J.-L. Barrat, M. Feigelman, J. Kurchan, and J. Dalibard (eds.), Slow Relaxations and Nonequilibrium Dynamics in Condensed Matter, Les Houches Session LXXVII, Springer, Berlin, pp. 199–270, 2003.
[6] M.E. Tuckerman, “Ab initio molecular dynamics: basic concepts, current trends and novel applications,” J. Phys. Condens. Matter, 14, R1297–R1355, 2002.
[7] D.P. Landau and K. Binder, Monte Carlo Simulations in Statistical Physics, Cambridge University Press, Cambridge, 2000.
[8] Y. Okamoto, “Generalized-ensemble algorithms: enhanced sampling techniques for Monte Carlo and molecular dynamics simulations,” J. Mol. Graphics Modelling, 22, 425–439, 2004.
[9] U. Balucani and M. Zoppi, Dynamics of the Liquid State, Oxford University Press, Oxford, 1994.
[10] J.-P. Bouchaud, L.F. Cugliandolo, J. Kurchan, and M. Mézard, “Out of equilibrium dynamics in spin glasses and other glassy systems,” In: A.P. Young (ed.), Spin Glasses and Random Fields, World Scientific, Singapore, pp. 161–224, 1998.
Perspective 25 INTERPLAY BETWEEN MATERIALS THEORY AND HIGH-PRESSURE EXPERIMENTS Raymond Jeanloz University of California, Berkeley, CA, USA
High-pressure experiments have played an important role in materials physics, both as a route for synthesizing new materials and as a means of validating theory. These roles are complementary, and have proven to be remarkably synergistic. Perhaps the most famous example of materials synthesis under pressure is that of diamond, the high-pressure form of carbon that is produced in the Earth’s mantle and – since the early 1950s – in the laboratory; industrial diamonds have subsequently found a wide range of applications [1, 2]. Focusing on a property rather than a particular material, superconductivity illustrates an important characteristic that has been systematically pursued in the laboratory. Not only do the examples discovered under pressure almost double the number of superconductors documented among the elements (Fig. 1) [3], but the highest-temperature superconducting transition found to date has been measured at high pressure [4]. Pressure is also significant for condensed-matter theory, which is founded on quantum mechanics but involves important approximations in actual implementation (e.g., of exchange and correlation effects; applicability of one-electron and density-functional approaches; separation of electronic and vibrational degrees of freedom; and considerations of electron spin and of relativistic effects). The best way to evaluate the reliability of theory is to have it predict the stability and properties of new materials, or of known materials under new conditions. If one changes composition to form a new material, however, that also changes the approximations embedded in the potential term of the Schrödinger equation; if one changes temperature, new complications arise as thermal excitations have to be reliably accounted for in the theory.
Figure 1. Periodic table identifying elements that have been found to be superconductors at high pressures (shaded), and those elements known as superconductors at zero pressure, with transition temperatures shown for bulk samples at zero applied magnetic field [3].
These and other means of perturbing the system can be experimentally applied and theoretically modeled, but none is as straightforward for theory as considering the effect of pressure. Quite simply, the inter-atomic spacing is changed, and the quantum mechanical problem is now solved (and optimized) again in order to predict the properties of matter under new conditions. Experiments then verify the predictions, or not.
1. Effect of Pressure on Materials Chemistry
As shown by many theoretical studies, pressure can dramatically change the properties of materials. One of the earliest examples is the 1935 prediction by E. Wigner and H. B. Huntington that hydrogen – normally a transparent, electrically insulating gas at ambient conditions – would become a metal at high pressures [5]. Ironically, this example has become notorious because, though there is no doubt that hydrogen becomes metallic at high enough pressures and temperatures, the conditions of metallization and the detailed processes involved remain controversial. It is unclear whether the metallization of
hydrogen can be “exactly solved” even with present-day quantum mechanical methods. That the periodic table of the elements is fundamentally altered under pressure is suggested by the following reasoning. The P ΔV pressure–volume work induced by compression to 1 million times atmospheric pressure (1 Mbar or 100 GPa), conditions readily achieved using either static (diamond-anvil cell) or dynamic (shock-wave) methods, amounts to an internal-energy change of order eV (∼10^5 J/mol of atoms). This is comparable to the energies of valence-shell electrons, and thus to chemical-bonding energies [6, 7]. In other words, the thermodynamic perturbation caused by megabar pressures is enough to alter chemical bonding. Numerous experiments document major changes in chemical properties under pressure. Oxygen becomes chalcogenide-like, for example, transforming from a transparent, electrically insulating gas to an opaque metal by about
100 GPa (and a superconductor at somewhat higher pressures) [7]. Perhaps more remarkable, the “inert” gas xenon becomes a close-packed metal by 100 GPa, with a crystal structure and equation of state matching those of the iso-electronic compound CsI (Fig. 2). Indeed, the salt (with a gap between valence- and conduction-electron energy bands exceeding 6 eV at zero pressure) becomes metallic by 100 GPa, and then superconducting by 200 GPa. Could it be that xenon will also become a superconducting metal at high pressures? Experiments have, so far, failed to confirm this.∗

Figure 2. Crystalline phases and equations of state of Xe (open symbols; face-centered cubic structure on extreme right) and CsI (closed symbols; CsCl structure on right) at low pressures converge at pressures above 50 GPa (hexagonal close-packed structure on left), and both materials are metallic at pressures above 120 GPa [7]. [Plot of pressure (GPa) versus volume (Å³/atom), with the CsI–Xe convergence, the insulator–metal transition, and structural transitions marked.]
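The order-of-magnitude estimate invoked above is worth writing out. Taking a representative compression of order 1 cm³ per mole of atoms (an illustrative round number):

$$P\,\Delta V \approx 10^{11}\ \mathrm{Pa} \times 10^{-6}\ \mathrm{m^3/mol} = 10^{5}\ \mathrm{J/mol} \approx 1\ \mathrm{eV\ per\ atom},$$

which is indeed comparable to valence-electron and chemical-bonding energies.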
2. Comparison of Theory with Experiment
One of the earliest high-pressure successes of modern electronic bandstructure calculations was the prediction that potassium would transform from an alkali to a transition metal at high pressure [8], and could therefore alloy with iron under compression.† Spectroscopic and other measurements confirmed that K does take on a transition-metal character by 25 GPa [9], and potassium has subsequently been alloyed with nickel and with iron at pressures above 25–30 GPa [10]. Quantitative predictions of structural transitions are notoriously difficult because the energy differences between bulk crystalline phases are often quite small: comparable, in many cases, to the energy resolution of calculations until the 1970s. By the 1980s, however, the feasibility of theoretically deriving transition pressures was demonstrated through studies on Si. Superconductivity was then predicted for some of the high-pressure phases of silicon, and experimentally confirmed shortly thereafter [11]. The role of theory relative to high-pressure experiment underwent a quiet but significant change in the 1990s, with predictions of phase stability being made well before they were experimentally studied, and not just for elements but also for compounds exhibiting more complex structures and variations in bonding. A subtle structural transition (from corundum to Rh2 O3 II structures) was predicted for Al2 O3 in 1987–1994 for example, and only experimentally confirmed in 1997 [12].
∗ The failure to find superconductivity in xenon is plausibly due to the difficulty of experiments in the “multi-megabar” pressure range, but might also be due to subtle effects being unexpectedly important. For example, the slightly different masses of Cs and I result in the compound having twice as many vibrational mode-branches as the element, with optic as well as acoustic modes, and the possible effect of this in xenon could be explored through studies of a solid solution consisting of a crystallographically ordered 50:50 mixture of the 126 and 132 isotopes of Xe.
† The most abundant naturally occurring radioactive isotope with a half-life in the billion-year range, 40K, is an important source of heat for planets. If it alloys with iron at high pressures, it could also be an important source of heat for the Earth’s iron-rich core and its dynamo-generated magnetic field.
At the same time, molecular dynamics, with interatomic forces derived from first-principles (quantum mechanical) calculations, was being developed to make predictions at high temperatures as well as high pressures. That crystalline CH4 methane would polymerize and ultimately form diamond at elevated pressures and temperatures was thus predicted, and subsequently observed experimentally [13]. By the end of the 20th century, it was becoming evident to high-pressure experimentalists that they had to pay attention to theory, not only to help identify interesting problems (determine which combination of elements to study, at what conditions and with what transformations or other phenomena to expect), but also to help interpret and go beyond the experimental measurements. At the highest pressures, many observations are indirect or require considerable interpretation, and theory can play a crucial role in validating and quantifying these interpretations. Also, to the degree that measured properties agree with theoretical predictions, one can have some confidence in the theoretical values for properties that cannot be measured (for example, elastic shear moduli are generally more difficult to measure under pressure than compressional moduli, so theory can be used to estimate the former if it has been shown to yield good estimates for the latter). Although it is useful to validate theory by demonstrating good agreement with measurement, comparisons between the two are especially powerful when disagreements are uncovered, because this is when the limitations in current modeling approaches or approximations are revealed. For example, the molecular-dynamics prediction that pressures of 100–300 GPa would polymerize methane and then transform it to diamond is experimentally found to be too high by approximately one order of magnitude. Such quantitative discrepancies provide important information about how theory can be improved.
3. Future Directions for Materials Modeling
Theoretical modeling is pursued in order to develop a fundamental understanding of material properties, including what controls those properties and how those properties can be tuned in practice. An important goal of theory is thus to be able to take a desired property, such as superconductivity or high elastic modulus, and deduce the range of materials that would have these properties as well as predicting the conditions under which the relevant phases are thermodynamically stable. Extension beyond classical thermodynamic equilibrium (the material’s ground-state) is important in at least two regards. First, non-equilibrium properties such as strength and absorption or reflectance spectra (i.e., excited electronic or vibrational states) are of great practical interest. There is again much
opportunity for comparing theoretical predictions against experimental observations, in particular with pressure as a variable over which to make the comparisons (as a function of compression for a given phase, as well as across pressure-induced phase transformations). Second, once materials having particular desirable properties have been identified, it becomes all-important to consider synthesis routes going beyond the bulk equilibrium of classical thermodynamics. After all, the value of a material is directly proportional both to the nature of its properties and to the ease with which it is synthesized. Carbon is a case in point, with formation by chemical vapor deposition (CVD) having revolutionized the utility of synthetic diamond. High-pressure experiments thus helped establish the equilibrium properties and conditions for synthesis of diamond, but it is the low-pressure method of growth that has taken over as one of the most effective means of making high-quality samples [14]. Indeed, the toughness properties of CVD diamond can be tuned, and this in turn will prove useful for future high-pressure experiments as the diamonds are used as anvils. The ultimate hope is that by fine-tuning theoretical modeling of materials, it will become possible to predict optimal routes for synthesis. Understanding the microscopic details of how materials are nucleated or can grow, and not only under conditions of bulk thermodynamic equilibrium, can then provide the practical answer to those needing a particular set of material properties for a given application: from a prediction of materials that have the required properties to a recipe for the optimal synthesis route. At that point, high-pressure experiments may no longer have their current utility for the study of materials.
References
[1] K. Nassau and J. Nassau, J. Crystal Growth, 46, 157–172; F.P. Bundy and R.C. DeVries, “Diamond: high-pressure synthesis,” In: Encyclopedia of Materials: Science and Technology, Elsevier, Amsterdam, 2001.
[2] R.M. Hazen, The Diamond Makers, Cambridge University Press, Cambridge, 1999.
[3] C. Buzea and K. Robbie, Supercond. Sci. Technol., 18, R1–R8, 2005.
[4] L. Gao, Y.Y. Xue, F. Chen, Q. Xiong, R.L. Meng, D. Ramirez, C.W. Chu, J.H. Eggert, and H.K. Mao, Phys. Rev. B, 50, 4260–4263, 1994.
[5] E. Wigner and H.B. Huntington, J. Chem. Phys., 3, 764–770, 1935.
[6] R. Jeanloz, Ann. Rev. Phys. Chem., 40, 237–259, 1989.
[7] R.J. Hemley and N.W. Ashcroft, Phys. Today, 51, 26, 1998; R.J. Hemley, Ann. Rev. Phys. Chem., 51, 763–800, 2000.
[8] M.S.T. Bukowinski, Geophys. Res. Lett., 3, 491–503, 1976.
[9] K. Takemura and K. Syassen, Phys. Rev. B, 28, 1193–1196, 1983.
[10] L.J. Parker, T. Atou, and J.V. Badding, Science, 273, 95–97, 1996; L.J. Parker, M. Hasegawa, T. Atou, and J.V. Badding, Eur. J. Solid-State Inorg. Chem., 34, 693–704, 1997; K.K.M. Lee and R. Jeanloz, Geophys. Res. Lett., 30, doi:10.1029/2003GL018515, 2003; K.K.M. Lee, G. Steinle-Neumann, and R. Jeanloz, Geophys. Res. Lett., 31, doi:10.1029/2004GL019839, 2004.
[11] D. Erskine, P.Y. Yu, K.J. Chang, and M.L. Cohen, Phys. Rev. Lett., 57, 2741–2744, 1986.
[12] R.E. Cohen, Geophys. Res. Lett., 14, 37, 1987; H. Cynn, D.G. Isaak, R.E. Cohen, M.F. Nicol, and O.L. Anderson, Am. Mineral., 75, 439, 1990; F.C. Marton and R.E. Cohen, Am. Mineral., 79, 789, 1994; K.T. Thompson, R.M. Wentzcovitch, and M.S.T. Bukowinski, Science, 274, 1880, 1996; N. Funamori and R. Jeanloz, Science, 278, 1109–1111, 1997.
[13] F. Ancilotto, G.L. Chiarotti, S. Scandolo, and E. Tosatti, Science, 275, 1288–1290, 1997; L.R. Benedetti, J.H. Nguyen, W.A. Caldwell, H. Liu, M. Kruger, and R. Jeanloz, Science, 286, 100–102, 1999.
[14] C.S. Yan, H.K. Mao, W. Li, J. Qian, Y. Zhao, and R.J. Hemley, Phys. Stat. Sol. (a), 201, R25–R27, 2004.
Perspective 26 PERSPECTIVES ON EXPERIMENTS, MODELING AND SIMULATIONS OF GRAIN GROWTH Carl V. Thompson Department of Materials Science and Engineering, M.I.T., Cambridge, MA 02139, USA
Grain growth is a process through which the average crystal, or grain, size in a fully crystalline material increases due to curvature-driven motion of individual grain boundaries. The average grain size increases as small grains shrink and disappear (see Fig. 1). Grain growth occurs in a very wide range of engineering materials, and because grain sizes affect almost all properties, grain growth must be understood and controlled in order to obtain polycrystalline materials with the properties, performance, and reliability required for a wide range of applications. Despite the long-understood importance of grain growth, and many attempts to model it, modeling has proven difficult in any but the most simple and quasi-empirical ways [1]. Computer simulation, on the other hand, has proven to be a powerful tool for the study of grain growth and has in fact precipitated improved modeling and improved understanding of experimental results. It is generally agreed that ‘normal’ grain growth in bulk materials has the following characteristics:

(1) the average grain size increases as the square root of time,

$$\bar{r} \sim t^{1/2}, \qquad (1)$$

(2) the distribution of grain sizes is time invariant (that is, the distribution of normalized grain radii is not a function of time),

$$f\!\left(\frac{r}{\bar{r}}\right) \text{ is not a function of time}, \qquad (2)$$

(3) the grain size distribution is well fit by a lognormal function (see Fig. 2).

The first characteristic leads to the postulate that

$$\frac{d\bar{r}}{dt} \propto \frac{1}{\bar{r}}, \qquad (3)$$
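Integrating Eq. (3) shows directly where characteristic (1) comes from:

$$\bar{r}\,\frac{d\bar{r}}{dt} = \text{const.} \quad\Longrightarrow\quad \bar{r}^2(t) = \bar{r}^2(0) + \text{const.}\times t \quad\Longrightarrow\quad \bar{r} \sim t^{1/2} \quad \text{for } \bar{r} \gg \bar{r}(0).$$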
Figure 1. Schematic illustration of grain growth in a polycrystalline material. The average grain size increases as some grains shrink and disappear.
Figure 2. Grain size distributions observed in experiments are often well fit by a lognormal distribution function. The time-invariant grain size distribution functions found by Hillert [2] and found in computer simulations of “normal” grain growth [1] differ significantly from lognormal distributions. [Plot of number fraction versus normalized grain size, comparing lognormal, Hillert, and simulation distributions.]
Equation (3) is rationalized by arguing that the driving force for grain growth is proportional to the average grain boundary curvature (proportional to 1/r̄) and to the average grain boundary energy γ_gb (included in the constant of proportionality in Eq. (3)). The average grain boundary mobility is also included in the constant of proportionality.
Hillert and others [1, 2] have followed coarsening theory in defining a grain-specific grain growth law and determined a time-invariant grain size distribution that satisfies the condition for flux continuity in size space. Hillert specifically assumed that the rate of growth of a grain of radius r is proportional to (1/r̄ − 1/r), and found the corresponding time-invariant size distribution function. However, this function is not well fit by a lognormal function and in fact goes to zero at a finite value of r (r_max = 2r̄ in the 3D case) (see Fig. 2). Others have found time-invariant distributions that fit lognormal functions, but only by searching for a growth function that yields this required result (often with little or no physical motivation). Even this unsatisfactory approach to modeling breaks down when grain growth is abnormal in some way, as it often is in the real world. The troublesome characteristics of grain growth that lead to difficulties in model development include the facts that the evolution of any individual grain depends directly on the characteristics of its neighbors, and that those neighbors often disappear relatively abruptly or are replaced throughout the grain growth process. While these phenomena are difficult to capture in statistical models of grain growth (at least in 3D), they are dealt with relatively simply in computer simulations. There are now a number of different approaches to the simulation of grain growth. The most physically intuitive approach is based on front tracking, in which points on grain boundaries (points on lines in 2D and on surfaces in 3D) are moved with local velocities that are proportional to the corresponding local boundary curvature. Front-tracking, and other simulations, must allow for grain disappearance (at some defined small size) and for the neighbor-switching events that occur when grain vertices meet [3]. A popular alternative to front-tracking models is modeling in which points in space are stochastically assigned to grains through algorithms that weight the assignment according to the assignments of neighboring points (this is often called the Potts model) [4, 5]. Other approaches include vertex tracking models [6] and phase field models [7].
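To indicate what such a simulation involves, here is a minimal two-dimensional Potts-model sketch of curvature-driven grain growth in the spirit of the models cited above [4, 5]. All boundary energies and mobilities are equal, trial reorientations are accepted only when they do not raise the boundary energy (a zero-temperature Metropolis rule), and the lattice size, number of orientations, and the grain-size proxy are illustrative choices rather than the published parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
L, Q, SWEEPS = 128, 64, 200      # lattice size, grain orientations, MC sweeps
spins = rng.integers(0, Q, size=(L, L))   # random initial grain assignment

def boundary_count(s, i, j, q):
    """Number of unlike nearest neighbors if site (i, j) has orientation q
    (periodic boundaries); grain-boundary energy is proportional to this."""
    nbrs = (s[(i - 1) % L, j], s[(i + 1) % L, j],
            s[i, (j - 1) % L], s[i, (j + 1) % L])
    return sum(n != q for n in nbrs)

for sweep in range(SWEEPS):
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        new = rng.integers(0, Q)
        dE = boundary_count(spins, i, j, new) - boundary_count(spins, i, j, spins[i, j])
        if dE <= 0:                  # accept only downhill or neutral flips
            spins[i, j] = new
    if sweep % 20 == 0:
        # Crude grain-size proxy: inverse grain-boundary density.
        boundary = (spins != np.roll(spins, 1, 0)) | (spins != np.roll(spins, 1, 1))
        print(sweep, 1.0 / boundary.mean())
```

Production studies typically restrict trial flips to orientations already present among a site's neighbors and use a finite temperature; on a square lattice with only nearest-neighbor interactions, a zero-temperature rule like the one above coarsens but suffers from lattice pinning and anisotropy, which is one reason the published models use larger neighbor shells.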
consistent with simulations [1]. Unfortunately, such models require numerical solution, and the required knowledge of the topological characteristics of evolving cellular structures is available for 2D structures but not for 3D structures. However, these 2D mean field models provide time-invariant grain-size and number-of-sides distributions that are consistent with those found using various simulation techniques, supporting the validity of both approaches, at least under the assumed conditions that all grain boundaries have the same energies and mobilities, and that the grain boundary mobilities are time-invariant. The greatest advantage of simulations, compared to analytic or statistical models, is that the simplifying assumptions required for the latter can be relaxed in simulations. Armed with simulations that recover the expected characteristics of normal grain growth, researchers have gone on to investigate the effects of boundary-by-boundary variations in boundary energies and mobilities; the effects of solute, precipitate and thermal-groove induced drag; and the effects of surface energy and surface energy anisotropy, as well as strain energy and strain energy anisotropy [1]. It has been demonstrated, for example, that thermal grooves can lead to stagnant structures with grain size distributions that are well fit by lognormal functions [9], and that solute drag can lead to transient distributions that are also well fit by lognormal functions [10]. This leads to the important insight that the grain structures commonly observed in experiments are likely affected by grain boundary drag. Simulations have also been used to analyze the competing effects of surface and strain energy minimization during abnormal grain growth in polycrystalline thin films [11]. In this case, simulations were shown to be quantitatively consistent with a body of experiments showing variations in crystallographic texture evolution, depending on which energy minimization dominates. In addition, the combination of thermal-groove-induced grain boundary drag with surface- or strain-energy-biased growth of a subpopulation of grains has been shown to lead to bimodal grain size distributions similar to those often seen in experiments on thin films (see Fig. 3). Simulations can therefore provide predictive tools for both texture and grain size evolution, at least in thin film systems. Simulations have seen their most extensive use in analysis of 2D and quasi-2D systems such as thin films. However, as simulation techniques advance, and computation capabilities continue to rapidly improve, it can be expected that simulations of 3D systems, with the real complexities of variable boundary energies and mobilities [14] and various drag effects, should become more readily available and practical. It seems almost certain that the ability to simulate complex grain growth behavior will always outpace the ability to model it, so that it will be simulations that are used to aid experimentalists in the interpretation of their results, and, increasingly, it will be simulations that aid engineers in developing processes that lead to materials with desired structures and properties.
Figure 3. (a) A transmission electron microscope image showing a bimodal grain size distribution resulting from abnormal grain growth in a thin film of Ge [12]. (b) An image generated from a simulation of surface-energy-driven abnormal grain growth coupled with thermal-groove-induced grain boundary drag in a thin film [13].
References
[1] C.V. Thompson, Solid State Phys., 55, 2000. [This paper can serve as a general reference for much of the content of this perspective.]
[2] M. Hillert, Acta Metall., 13, 227, 1965.
[3] H.J. Frost, C.V. Thompson, C.L. Howe, and J. Whang, Scr. Metall., 22, 65, 1988.
[4] M.P. Anderson, D.J. Srolovitz, G.S. Grest, and P.S. Sahni, Acta Metall., 32, 783, 1984.
[5] D.J. Srolovitz, M.P. Anderson, P.S. Sahni, and G.S. Grest, Acta Metall., 32, 793, 1984.
[6] K. Kawasaki and Y. Enomoto, Physica, 150A, 462, 1988.
[7] C.E. Krill and L.Q. Chen, Acta Mater., 50, 3057, 2002.
[8] M.P. Anderson, G.S. Grest, and D.J. Srolovitz, Philos. Mag. B, 59, 293, 1989.
[9] H.J. Frost, C.V. Thompson, and D.T. Walton, Acta Metall. Mater., 38, 1455, 1990.
[10] H.J. Frost, Y. Hayashi, C.V. Thompson, and D.T. Walton, MRS Symp. Proc., 317, 431, 1994.
[11] R. Carel, C.V. Thompson, and H.J. Frost, Acta Metall. Mater., 44, 2479, 1996.
[12] J.E. Palmer, C.V. Thompson, and H.I. Smith, J. Appl. Phys., 62, 2492, 1987.
[13] H.J. Frost, C.V. Thompson, and D.T. Walton, Acta Metall., 40, 779, 1992.
[14] M.C. Demirel, A.P. Kuprat, D.C. George, and A.D. Rollett, Phys. Rev. Lett., 90, 016106, 2003.
Perspective 27 ATOMISTIC SIMULATION OF FERROELECTRIC DOMAIN WALLS I-Wei Chen Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104-6282, USA
1. Introduction
Atomistic simulation of moving discommensurations is most useful when atomic details have a strong influence on the outcome. A classical example is the problem of the dislocation core and its movement, in which the nature of atomic bonding has a direct effect on the configuration of the dislocation core, which in turn affects the manner in which the dislocation moves [1]. The final macroscopic characteristics of plastic deformation are intimately related to these details, of which a complete understanding can only come from the atomistic simulation of both the static and the (activated) non-equilibrium configurations of dislocations. The extension to the interface problem is encountered in martensitic transformations, involving either the parent/product interface or the product/product interface, which we call a variant interface [2]. Variant interfaces are mobile and may be regarded as a group of dislocations, and like dislocations they move under a stress. The problem is relevant to shape memory alloys, in which the stress and rate dependence of shape change is ultimately controlled by the atomic configurations at variant interfaces during their movement, even though the final (presumably equilibrium) multi-variant configurations are dictated by crystallography and elastic energy minimization. As a further extension of the interface problem, domain walls in ferroelectric crystals are like variant interfaces in that, crystallographically, both belong to the class of twin boundaries, but domain walls are complicated by electrostatic considerations [3, 4]. Since the origin of ferroelectricity lies in the instability of covalent bonding, the atomic configurations of domain walls are necessarily sensitive to both covalent bonding and long-range elastic and electrostatic interactions. Such complexity makes the problem a good candidate for atomistic simulation studies, from which one may hope to better understand domain switching
characteristics such as coercivity and fatigue, which are important for ferroelectric applications. To date, however, there have been very few atomistic simulation studies on domain walls, none being relevant to domain wall movement. Nevertheless, the recent development of phenomenological but realistic potentials for Pb(Zr,Ti)O3 (PZT) [5] makes such studies entirely feasible within today’s computational resources. The purpose of this commentary is to outline the basic domain wall problem to stimulate the community to undertake atomistic simulations.
2. Synopsis
Some general considerations based on our theoretical understanding of dislocations and ferroelectricity provide the following synopsis. A domain wall separates two domains of different polarization vectors. As in twinning, ferroelectric distortion causes internal strains that vary from domain to domain, albeit the net shear component of such strains can be greatly reduced by alternating the shear direction. Therefore, the simplest domain wall (called a 180° domain wall) has opposite polarization on the two sides, and it has no elastic distortion at all. All the non-180° domain walls, however, are like twin boundaries and have some short-range strains associated with them. Therefore, they interact with stresses, including the self-stress of the walls and internal stresses due to impurities and heterogeneities. A stress-free domain wall, in turn, is electrically neutral, which requires the normal component of the polarization vector to be continuous across the domain wall. This is illustrated for a tetragonal crystal (e.g., BaTiO3) for both 180° (BB′ in Fig. 1) and 90° (AA′ in Fig. 1) domain walls. This neutrality condition is not satisfied if the domain wall is aligned in other directions. (For example, BB″ in Fig. 1 has a
negatively charged kink at CD.) Therefore, curved or kinked domain walls are generally charged. Such is the case at the edge of a lens-shaped domain where two domain walls bend to meet each other. Also, when a domain wall moves, it prefers to first form charged kinks (or ledges in three dimensions), as shown in Fig. 1, and then propagate the kinks sideways, rather than moving the entire domain wall forward all at once. (The electrostatic forces on the charged kinks always move them in a way that enlarges the side where the polarization vector is electrically favored.) These kinks and ledges are structurally similar to those found on moving dislocations, twin boundaries and variant interfaces, and their formation determines the activation energy for the wall movement. Since all moving walls are charged and all stationary non-180° walls are strained, they will attract or repel defects either electrically or elastically. Oxygen vacancies, which are positively charged and elastically soft, are known to be a potent defect that strongly interacts with domain walls and greatly influences the ferroelectric behavior of BaTiO3 and PZT. Second phases can similarly interact with domain walls and modify ferroelectric behavior.

Figure 1. [Schematic of domain walls in a tetragonal crystal; the 90° wall AA′, the 180° wall BB′, the kinked wall BB″, and the charged kink CD referred to in the text are indicated.]
3. Stationary Domain Wall and Wulff Plot
The above synopsis makes evident the path to follow in the atomistic simulation of stationary and moving domain walls. In the following, we outline the case primarily for (strain-free) 180° domain walls to focus on the electrostatic aspect that is unique to the domain wall problem. To begin with, we recognize that there is a strong anisotropy due to both crystallography and charge. Take the case of BaTiO3, for example, in which the polarization is along ⟨001⟩. Then all the (hk0) domain walls are neutral, so their different domain wall energies are due to crystallographic anisotropy. On the other hand, the (hkl) domain walls are charged, so with increasing l there is a rapid rise in energy from the electrostatic contribution. The Wulff plot of the domain wall is therefore a highly elongated rod around the four-fold symmetry axis ⟨001⟩. Also of interest is the second derivative of the domain wall energy, which determines the torque on a curved domain wall. These aspects, as well as the atomic positions and polarization distribution across the domain wall, can be readily studied by atomistic simulation. Our current understanding is that domain walls are relatively sharp, across which a near discontinuity of the polarization vector exists. In this respect, ferroelectric domain walls are different from magnetic domain walls, which can be quite diffuse. The sharpness of ferroelectric domain walls is due to the strong preference for the polarization to align in certain crystallographic orientations and with a constant magnitude, given by the minimum of the “double-well”. Ionic displacements at intermediate positions and with intermediate orientations may simply have too high an energy to be allowed even in the
transition layer at the domain wall. This should be verified by atomistic simulation, however, and any deviation from the double-well minimum will modify the domain wall energy due to the so-called “gradient energy” term in the Ginsburg Landau equation. Moreover, the possibility of a more extended domain wall cannot be ruled out in some ferroelectric crystals. In BCC metals, extended dislocation cores with three-fold symmetry have been seen in both electron microscopy and atomistic simulations.
4. Activation and Domain Wall Movement
As a first step toward understanding domain wall movement, we consider a translation that brings a domain wall initially to a higher-energy state, passes a maximum, and finally returns to the equilibrium state after a displacement of one unit cell. The periodic energy landscape along such a translation can be studied by atomistic simulation, and its maximum slope gives the theoretical coercive field (at 0 K) of a flat domain wall. This energy landscape corresponds to the Peierls (lattice) energy for dislocations, and the theoretical coercive field corresponds to the Peierls stress. Lattice periodicity similarly manifests itself in the energy profiles of a kinked/ledged domain wall when the entire wall is translated forward or when the kink/ledge is translated sideways. The polarization distribution across the translated wall/kink/ledge is again of interest. Both an electric and a stress field can drive domain wall movement, but the energetics of domain wall movement can be studied without applying an external field. For a flat wall, the energy barrier for first forming a bulge, then enlarging the bulge sideways, is lower than the “Peierls” energy for translation. This bulge has a thickness of a unit cell, is electrically neutral as a whole, but is locally charged on some portions of its perimeter. The total energy of a bulged domain wall as a function of the size and shape of the bulge can be studied by atomistic simulation, which determines the activation barrier (and the coercive field at 0 K) for the flat wall. We expect the shape of the bulge to be highly anisotropic in order to minimize the domain wall energy of the charged perimeter and to keep the oppositely charged segments as far apart as possible. For the latter reason, the shape optimization of a bulge (of constant area) involves non-local interaction and does not follow immediately from the Wulff plot. In addition, the incline of the perimeter may not be vertical but, instead, may be quite gradual, especially when the wall energy of the segment is low. Therefore, atomistic simulation is again useful for elucidating these details. Of course, the simulation can also be repeated under an applied field. Since the activated nature of the process implies that wall movement is a dissipative, frictional process even at 0 K, a finite wall velocity is obtained under an applied field once it exceeds the coercive field.
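As a toy numerical illustration of this “Peierls” picture, one can posit a periodic wall-energy landscape and read the 0 K coercive field off its maximum slope. For a flat 180° wall, displacing the wall by dx switches a volume dx per unit wall area and releases electrostatic work 2P_s E dx, so depinning requires 2P_s E > max(dγ/dx). The sinusoidal landscape, its amplitude, and the material numbers below are assumed illustrative values, not results of an atomistic calculation.

```python
import numpy as np

a = 4.0e-10      # lattice period along the wall normal (m), illustrative
W = 5.0e-3       # amplitude of the wall-energy variation (J/m^2), illustrative
Ps = 0.26        # spontaneous polarization (C/m^2), roughly BaTiO3-like

x = np.linspace(0.0, a, 2001)
gamma = 0.5 * W * (1.0 - np.cos(2.0 * np.pi * x / a))   # periodic landscape

# Depinning condition for a flat 180-degree wall: 2*Ps*E > max(d gamma/dx).
E_c = np.max(np.gradient(gamma, x)) / (2.0 * Ps)
print(f"theoretical coercive field at 0 K: {E_c:.3e} V/m")
# Analytic check: max slope = pi*W/a, so E_c = pi*W/(2*Ps*a).
print(f"analytic value:                    {np.pi * W / (2.0 * Ps * a):.3e} V/m")
```

An atomistic simulation would replace the assumed γ(x) by the computed energy of the translated wall, which is the point of the exercise described above.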
5. Extension to Finite Temperature and Non-180° Domain Walls
Clearly, the simulation studies outlined above can be extended to finite temperature. Information on domain wall entropy, roughening, and thermally activated movement can then be extracted. Because of the non-local electrostatic interaction (and, more generally, the elastic interaction for non-180° walls) and the strong anisotropy of the domain wall energy, the simulated volume should be sufficiently large to avoid artifacts that either suppress or promote a particular mode of fluctuation. This size limitation is the main concern for atomistic domain wall simulations, but it should be within reach of today’s computational resources. We have thus far emphasized the electrostatic aspect, which is unique to the domain wall problem and is appropriate for 180° domain walls, which respond only to the electric field. The insensitivity to the stress field makes 180° domain walls less likely to be trapped and more mobile. In a real ferroelectric crystal, however, other domain walls are also present and they may well control the ferroelectric properties. For these domain walls, incorporation of both electrostatic and elastic interactions is obviously needed. Two sets of scaling parameters are therefore necessary: for the electrostatic interaction, the polarization; for the elastic interaction, the shear modulus and strain, which scales as (polarization)². Since the polarization vanishes at the Curie temperature whereas the shear modulus softens only gradually with temperature, the temperature dependence in the domain wall problem is much stronger than in the dislocation problem, where only the shear modulus matters while the strain remains constant. The additional influence of crystal chemistry and crystallography also needs to be considered. This can best be incorporated into the simulation by parameterizing the semi-empirical potentials employed to allow simple interpretation in terms of charge, covalency, elasticity and polarization. No additional computational difficulty is foreseen with these studies.
References
[1] J.P. Hirth and J. Lothe, Theory of Dislocations, 2nd edn., John Wiley & Sons, New York, 1982.
[2] Y.-H. Chiao and I-W. Chen, “Martensitic growth in ZrO2 – an in situ, small-particle TEM study of a single-interface transformation,” Acta Metall., 38, 1163–1174, 1990.
[3] F. Jona and G. Shirane, Ferroelectric Crystals, Pergamon Press, New York, 1962.
[4] M.E. Lines and A.M. Glass, Principles and Applications of Ferroelectrics and Related Materials, Clarendon, Oxford, 1977.
[5] I. Grinberg, V.R. Cooper, and A.M. Rappe, “Relationship between local structure and phase transitions of a disordered solid solution,” Nature, 419, 909–911, 2002.
Perspective 28 MEASUREMENTS OF INTERFACIAL CURVATURES AND CHARACTERIZATION OF BICONTINUOUS MORPHOLOGIES Sow-Hsin Chen Department of Nuclear Engineering, MIT, Cambridge, MA 02139, USA
1. Introduction
Studies of bicontinuous and interpenetrating domain structures developed in the process of spinodal decomposition (SD) of binary mixtures of molecular fluids, binary alloys, and polymer blends have been an attractive research theme over the past several decades [1]. Complex structures can be observed in the phase separation process induced by a temperature quench from the stable one-phase region into the unstable two-phase spinodal region. The fact that the dynamical processes in complex fluids such as polymer blends, glasses, and gels are extremely slow, due to their long characteristic relaxation times, allows one to study their morphologies with better accuracy than those of simple liquid mixtures [2]. Similar kinds of bicontinuous structures have been observed in the water/oil/surfactant three-component microemulsion system in the one-phase region close to the three-phase boundary (often called the “fish tail” in the microemulsion literature because of the similarity in shape of the phase boundary of this region to a fish tail) and in the vicinity of the hydrophile–lipophile balance temperature [3, 4]. It is interesting and physically relevant to examine the common and universal features of these bicontinuous structures observed in phase-separated polymer blends [5] and in microemulsions [6]. During the course of the past three decades our comprehension of amphiphile solutions has been profoundly transformed [7]. A major conceptual force behind these developments has been the realization that the bulk phases of amphiphiles in solution could be visualized as systems of interfacial films and
could therefore be successfully described in terms of their properties. Thus, the structural and thermodynamic features of micellar or vesicular solutions, of microemulsions, and of lyotropic liquid crystals began to be progressively rationalized in terms of the properties of fluctuating surfaces [8]. Because interfaces formed by amphiphiles often have very small or vanishing interfacial tensions, the dominant contribution to their free energy is the bending elastic energy. The latter, in turn, is given by the following expression,
$$F_H = \int dS \left[ \frac{\kappa}{2}\,(H - c_0)^2 + \bar{\kappa}\,K \right] \qquad (1)$$
where dS is the element of surface area, H and K are, respectively, the mean and the Gaussian curvatures of the surface that represents the interface, and c₀, κ and κ̄ are, respectively, the spontaneous curvature, the bending rigidity, and the saddle-splay constant. F_H is known as the Helfrich free energy [9] in the literature. It is thus seen from the above formula that the basic physical variables describing the elastic properties of an interface are the mean and Gaussian curvatures, much the same as those describing the mechanical properties of an interacting particle system are the kinetic and potential energies. It follows that a statistical mechanical description of an interface must involve the average mean and Gaussian curvatures over the whole interface. It is then clear that devising a method to measure and compute these average curvatures for physically relevant interfacial systems is of the utmost importance. At any point on the oil–water interface (with the surfactant monolayer in between), the interface may be characterized by its two principal radii of curvature R₁ and R₂, which by convention are positive when the interface is curved towards the oil phase and negative when it is curved towards the water phase. One then defines a mean curvature by H = (1/2)(1/R₁ + 1/R₂) and a Gaussian curvature by K = 1/(R₁R₂) at that point. The geometry of the surface is completely specified by giving a pair of values (H, K) at every point on the surface. The first practical method for measuring the average mean curvature ⟨H⟩ was proposed and implemented by Lee and Chen [10]. The system they studied was a three-component non-ionic microemulsion system composed of water/octane/C10E4 (tetraethylene glycol monodecyl ether) with equal volume fractions of water and oil. The phase diagram in the temperature–φ_s (surfactant volume fraction) plane shows that the hydrophile–lipophile balance (HLB) temperature is about 24 °C and that the one-phase microemulsion begins to form at the minimum volume fraction φ_s = 0.10. Hence at φ_s = 0.20, if one varies the temperature from 15 to 30 °C, the one-phase microemulsion should transform from a water-in-oil droplet microemulsion to an oil-in-water microemulsion through an intermediate state of lamellar microemulsion. In this process the average mean curvature ⟨H⟩ should switch sign from an initial
positive value to a final negative value, passing through zero at the HLB temperature. They indeed found that the average mean curvature varies linearly in this temperature interval, starting with a value ⟨H⟩ = 0.008 Å⁻¹ at 15 °C and ending with a value ⟨H⟩ = −0.006 Å⁻¹ at 30 °C. The method was based on the measurements of three separate specific interfacial areas using a contrast variation technique in a small-angle neutron scattering (SANS) experiment. Because the surfactant layer in between the water and oil phases of a microemulsion has a finite thickness d, the water–surfactant interfacial area A_W and the oil–surfactant interfacial area A_O are generally not equal if the film is curved. This can most easily be seen by considering the case of a spherical shell: the inner surface has less area than the outer surface, and this difference is related to the thickness of the shell and the radius of the sphere. The exact geometrical relationships connecting A_W, A_O and the surface area A_S, measured at the midpoints of the surfactant molecules, to the average curvatures ⟨H⟩ and ⟨K⟩ are given by:
A_W = A_S \left( 1 + d\langle H\rangle + \frac{d^2}{4}\langle K\rangle \right), \qquad A_O = A_S \left( 1 - d\langle H\rangle + \frac{d^2}{4}\langle K\rangle \right)    (2)

The average mean curvature can thus be calculated from the two measured areas A_W and A_O by the formula ⟨H⟩ = (A_W − A_O)/[(A_W + A_O) d]. With this method, the errors involved in the specific interfacial area measurements were large enough that the average Gaussian curvature could not be deduced with sufficient accuracy. In order to measure the average Gaussian curvature, Chen et al. proposed another method, based on what is called the Clipped Random Wave (CRW) model [6]. Cahn in 1965 proposed a scheme for generating a three-dimensional morphology of a phase-separated A–B alloy by clipping a Gaussian random field generated by superposing many isotropically propagating sinusoidal waves with random phases [11]. The Gaussian random field can be normalized in such a way that it fluctuates continuously between −1 and +1. One can then realize the bicontinuous two-phase morphology by clipping the continuous random process, namely, by assigning, say, 0 to all negative signals, representing A, and +1 to all positive signals, representing B. In this scheme, the essential features of the morphology depend only on the "spectral density function" (SDF), which is the three-dimensional Fourier transform of the two-point correlation function g(r) of the Gaussian random field. The SDF gives the distribution of the magnitudes of the propagation wave vectors of the sinusoidal waves. Berk in 1987 further developed the idea of Cahn mathematically for the purpose of analyzing scattering data [12]. In particular, he derived an important relation connecting the two-point correlation function and the Debye correlation function, which determines the scattering intensity. In his
original paper, however, Berk discussed only an SDF which is a delta function. This resulted in generating a morphology which is only partially disordered and is not realistic. Chen, Chang and Strey [13] later pointed out that a broader but peaked SDF is necessary to generate a more disordered morphology that reproduces measured small-angle neutron scattering (SANS) intensity distributions. In particular, it was recognized that the SDF should be chosen in such a way that it has the correct second moment to ensure agreement with the measured finite interfacial area per unit volume (the specific interface area S/V).
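Since Eq. (2) and the curvature definitions above recur throughout this chapter, a minimal numerical sketch may help fix the conventions. The following Python snippet evaluates H and K from the principal radii and inverts Eq. (2) to leading order in d⟨H⟩; all numerical input values are hypothetical, chosen only to illustrate the magnitudes involved.

```python
# Minimal sketch of the curvature conventions and the area-based inversion of
# Eq. (2). All numerical values below are hypothetical illustrations.

def curvatures_from_radii(R1, R2):
    """H = (1/R1 + 1/R2)/2 and K = 1/(R1*R2); radii are signed positive
    when the interface curves toward the oil phase."""
    H = 0.5 * (1.0 / R1 + 1.0 / R2)
    K = 1.0 / (R1 * R2)
    return H, K

def mean_curvature_from_areas(A_w, A_o, d):
    """Invert Eq. (2) to leading order: <H> = (A_w - A_o)/((A_w + A_o)*d)."""
    return (A_w - A_o) / ((A_w + A_o) * d)

# Example: hypothetical areas differing by 2% across a 15 Å thick film.
print(mean_curvature_from_areas(1.02, 0.98, 15.0))  # ~1.3e-3 Å^-1
```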
2. Theory of Scattering From Bicontinuous Porous Materials in Bulk Contrast
The intensity distribution of SANS from an isotropic, disordered two-component porous material can be calculated generally from a Debye correlation function Γ(r) by the following formula [14]:

I(Q) = \langle \eta^2 \rangle \int_0^\infty dr\, 4\pi r^2\, j_0(Qr)\, \Gamma(r)    (3)
where ⟨η²⟩ = φ₁φ₂(ρ₁ − ρ₂)² is the mean square fluctuation of the local scattering length density, φ₁ and φ₂ are the volume fractions of components 1 and 2, and ρ₁ and ρ₂ the corresponding scattering length densities. There are two physical boundary conditions that the Debye correlation function, a function of the scalar distance between the two points under consideration, must satisfy: it is normalized to unity at the origin, r = 0, and it should decay to zero at infinity. The most important property of the Debye correlation function, for the case with a sharp boundary between two regions having different scattering length densities ρ₁ and ρ₂, is that it has linear and cubic terms in the small-r expansion of the form:

\Gamma(r \to 0) = 1 - ar + br^3 + \cdots = 1 - \frac{1}{4\varphi_1\varphi_2}\frac{S}{V}\, r \left( 1 - \frac{b}{a} r^2 + \cdots \right)    (4)
where a = (S/V)/4φ₁φ₂ is a factor proportional to the total interfacial area per unit volume of the sample, and the ratio of the coefficient of the cubic term to that of the linear term has been given by Kirste and Porod [15] in terms of curvatures as:

\frac{b}{a} = \frac{1}{8}\langle H^2 \rangle - \frac{1}{24}\langle K \rangle    (5)
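Equations (3)–(5) reduce the prediction of the SANS intensity to a one-dimensional quadrature over the Debye correlation function. As a hedged illustration (not the analysis code used in the work described here), the following sketch evaluates Eq. (3) numerically for a damped oscillatory Γ(r) of the Teubner–Strey type introduced in Eq. (24) below; the values of d, ξ and ⟨η²⟩ are arbitrary placeholders.

```python
# Numerical sketch of Eq. (3): I(Q) as a Fourier transform of Gamma(r).
# The Teubner-Strey-type Gamma(r) and all parameters are illustrative only.
import numpy as np

d, xi = 200.0, 100.0   # repeat distance and decay length, in Å (assumed)
eta2 = 1.0             # <eta^2>, sets the absolute intensity scale (assumed)

def gamma(r):
    x = 2.0 * np.pi * r / d
    return np.exp(-r / xi) * np.sinc(x / np.pi)  # np.sinc(t) = sin(pi t)/(pi t)

def intensity(Q, r_max=5000.0, n=20000):
    r = np.linspace(1e-6, r_max, n)
    j0 = np.sinc(Q * r / np.pi)                  # spherical Bessel j0(Qr)
    return eta2 * np.trapz(4.0 * np.pi * r**2 * j0 * gamma(r), r)

Q = np.logspace(-3, 0, 60)                       # Å^-1
I = np.array([intensity(q) for q in Q])          # peaks near Q ~ 2*pi/d
```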
3. Clipped Random Wave (CRW) Model for Computation of Interfacial Curvatures
In the clipped random wave model of Berk [12], an order parameter field ψ(r), which, for example, can be a quantity proportional to the volume fraction of the majority phase in the system, is constructed by superposition of a large number N of cosine waves with random phases:
\psi(\mathbf{r}) = \sqrt{\frac{2}{N}} \sum_{i=1}^{N} \cos(\mathbf{k}_i \cdot \mathbf{r} + \varphi_i)    (6)
where the directions of the wave vectors k_i are assumed to be distributed isotropically over the unit sphere and the phases φ_i are distributed randomly over the interval (0, 2π). In constructing the sum, the magnitude of the k_i vector is sampled from a scalar distribution function f(k) called the "spectral density function". This order parameter field is a Gaussian random field by virtue of the central limit theorem in statistics. The statistical properties of a Gaussian random field are completely characterized by giving its spectral density function f(k) or the corresponding two-point correlation function g(r). We define the two-point correlation function by g(|r₁ − r₂|) = ⟨ψ(r₁)ψ(r₂)⟩ and the associated SDF f(k) by the Fourier transform relation:

g(|\mathbf{r}_1 - \mathbf{r}_2|) = \int_0^\infty 4\pi k^2\, j_0(k|\mathbf{r}_1 - \mathbf{r}_2|)\, f(k)\, dk    (7)
This continuous random process ψ(r), varying between +1 and −1 and having a mean square value of unity (by its definition, Eq. (6)), is then clipped and transformed into a discrete, two-state random process. By clipping we mean assigning a constant value +1 to the function whenever the Gaussian random field at that point is above a certain "clipping level" α, and a constant 0 whenever its value is below α. This transformation can be defined as follows:

\zeta(\mathbf{r}) = \Theta_\alpha(\psi(\mathbf{r})) = \begin{cases} 1, & \text{when } \psi(\mathbf{r}) \geq \alpha \\ 0, & \text{otherwise} \end{cases}    (8)

where Θ_α is a step function. Then the Debye correlation function can be written in terms of the discrete random variable ζ(r) in the form:

\Gamma(\mathbf{r}) = \frac{\langle \zeta(0)\zeta(\mathbf{r}) \rangle - \langle \zeta \rangle^2}{\langle \zeta \rangle - \langle \zeta \rangle^2}    (9)

where the quantities on the right-hand side are given by Teubner [16] as:

\langle \zeta \rangle = \frac{1}{2} - \frac{1}{\sqrt{2\pi}} \int_0^\alpha \exp(-x^2/2)\, dx    (10)
\langle \zeta(0)\zeta(\mathbf{r}) \rangle = \langle \zeta \rangle - \frac{1}{2\pi} \int_0^{\cos^{-1}(g(r))} \exp\left( -\frac{\alpha^2}{1 + \cos\theta} \right) d\theta.    (11)
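The construction of Eqs. (6)–(10) is easy to realize numerically. The sketch below uses illustrative parameters only, and a single wavenumber k₀ stands in for sampling the magnitudes from f(k); it superposes random cosine waves, clips the field at level α as in Eq. (8), and checks the Monte Carlo estimate of ⟨ζ⟩ against the closed form of Eq. (10).

```python
# Schematic CRW realization, Eqs. (6)-(10); all parameters illustrative.
import numpy as np
from math import erf

rng = np.random.default_rng(0)
N, k0, alpha = 1000, 0.02, 0.3   # number of waves, wavenumber (Å^-1), clip level

# Isotropic unit wave vectors and uniform random phases, as in Eq. (6).
u = rng.normal(size=(N, 3))
khat = u / np.linalg.norm(u, axis=1, keepdims=True)
phi = rng.uniform(0.0, 2.0 * np.pi, size=N)

def psi(points):
    """Gaussian random field of Eq. (6); unit variance by construction."""
    return np.sqrt(2.0 / N) * np.cos(k0 * points @ khat.T + phi).sum(axis=1)

pts = rng.uniform(0.0, 2000.0, size=(5000, 3))  # sample points in a 2000 Å box
zeta = psi(pts) >= alpha                        # clipping, Eq. (8)

phi1_mc = zeta.mean()                                     # Monte Carlo <zeta>
phi1_analytic = 0.5 * (1.0 - erf(alpha / np.sqrt(2.0)))   # Eq. (10), closed form
print(phi1_mc, phi1_analytic)                   # both close to 0.38 for alpha = 0.3
```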
The average value of the clipped Gaussian random field, ⟨ζ⟩, and 1 − ⟨ζ⟩ can be interpreted as the volume fractions of the majority and minority phases, φ₁ and φ₂, respectively. Using Eq. (10) and ⟨ζ⟩ = φ₁, Eq. (9) can be rewritten as

\Gamma(\mathbf{r}) = 1 - \frac{1}{2\pi \varphi_1 (1 - \varphi_1)} \int_0^{\cos^{-1}(g(r))} \exp\left( -\frac{\alpha^2}{1 + \cos\theta} \right) d\theta    (12)
For a small α, meaning a slight deviation from the isometric case, Eq. (12) can be approximated as:

\Gamma(\mathbf{r}) \cong 1 - \frac{1}{2\pi \varphi_1 (1 - \varphi_1)} \left[ \cos^{-1}(g(r)) - \alpha^2 \tan\left( \frac{\cos^{-1}(g(r))}{2} \right) \right]    (13)
where the volume fraction φ₁ can be approximated as

\varphi_1 \cong \frac{1}{2} - \frac{\alpha}{\sqrt{2\pi}}.    (14)
For an isometric (φ₁ = φ₂ = 1/2) microemulsion α = 0, and Eq. (12) reduces to the simple form

\Gamma(\mathbf{r}) = \frac{2}{\pi} \sin^{-1}(g(r))    (15)

which is a well-known result given by Berk [12]. From the relation in Eq. (7), one has the small-r expansion of g(r) of the form

g(r) = \int_0^\infty 4\pi k^2 \left( 1 - \frac{1}{6} k^2 r^2 + \frac{1}{120} k^4 r^4 + \cdots \right) f(k)\, dk = 1 - \frac{1}{6} \langle k^2 \rangle r^2 + \frac{1}{120} \langle k^4 \rangle r^4 + \cdots    (16)
where we used the normalization condition g(0) = 1, and ⟨k²⟩ and ⟨k⁴⟩ denote the second and fourth moments of the spectral density function. Note that this expansion has a quadratic term followed by a quartic term. Using the result of Eq. (16) in Eq. (12), we obtain a small-r expansion of the Debye correlation function of the form:

\Gamma_B(r \to 0) = 1 - \frac{\langle k^2 \rangle^{1/2} e^{-\alpha^2/2}}{2\sqrt{3}\,\pi\,\varphi_1 \varphi_2}\, r \left[ 1 - \left( \frac{1}{40} \frac{\langle k^4 \rangle}{\langle k^2 \rangle} - \frac{1}{72} \langle k^2 \rangle (\alpha^2 - 1) \right) r^2 \right]    (17)
Comparing this with Eq. (4), we arrive at a useful relation connecting the second moment of the spectral density function to the fundamental quantity of a porous material, the interfacial area per unit volume,

\frac{S}{V} = \frac{2}{\sqrt{3}\,\pi}\, \langle k^2 \rangle^{1/2}\, e^{-\alpha^2/2}.    (18)
This relation also implies that one of the basic requirements for a physically acceptable spectral function is that the second moment be finite. We next consider the random surface generated by the level set

\psi(\mathbf{r}) = \psi(x, y, z) = \alpha    (19)
where the function ψ(x, y, z) is defined by Eq. (6). Teubner [16] has proved a remarkable theorem that for this random surface the average mean, Gaussian and mean square curvatures are given by:

\langle H \rangle = \frac{\alpha}{2} \sqrt{\frac{\pi \langle k^2 \rangle}{6}}    (20)

\langle K \rangle = \frac{1}{6} \langle k^2 \rangle (\alpha^2 - 1)    (21)

\langle H^2 \rangle = \frac{1}{6} \langle k^2 \rangle (\alpha^2 + \nu^2)    (22)

where

\nu^2 = \frac{6}{5} \frac{\langle k^4 \rangle}{\langle k^2 \rangle^2} - 1    (23)
So the second requirement on a physically acceptable spectral function is that it also have a finite fourth moment. Since for a bicontinuous microemulsion the level surface defined by Eq. (19) is approximately the mid-plane passing through the surfactant monolayer in a bulk contrast experiment, the average mean, Gaussian and square mean curvatures of the surfactant monolayer can be computed once a physically acceptable SDF is found. The choice of the SDF can be based on the criterion that, when it is substituted into Eqs. (3), (7) and (12), it gives an intensity distribution which agrees with SANS data on an absolute scale. A suitable form of the SDF has been proposed by Chen and Choi [17]; it is an inverse eighth-order polynomial in k and contains three parameters a, b, and c. The first two parameters a and b have their approximate correspondences in the approximate theory of Teubner and Strey [18]. In the T–S theory, the Debye correlation function is given by:

\Gamma_{TS}(r) = e^{-r/\xi}\, \frac{\sin(2\pi r/d)}{2\pi r/d}    (24)
The correspondences are a ≈ 2π/d and b ≈ 1/ξ, where d is the inter-domain (water–water or oil–oil) repeat distance and ξ the decay length of the local order [19, 20]. The parameter c controls the transition from the main Q peak to the large-Q behavior of the scattering intensity distribution, the existence of which is essential for good agreement between theory and experiment over the entire range of Q. This agreement is important if the correct surface-to-volume ratio of the system is to be incorporated into the theory. We thus choose [17] an SDF for which both the second and fourth moments exist. This function is an inverse eighth-order polynomial containing three parameters, which are the minimal set for the physical situation under study:

f(k) = \frac{bc\,(a^2 + (b + c)^2)^2 / (b + c)}{\pi^2\, (k^2 + c^2)^2\, (k^4 + 2(b^2 - a^2)k^2 + (a^2 + b^2)^2)}    (25)
This spectral density function has a peak approximately at k_max ≈ √(a² − b²). From Eq. (25), the second and fourth moments of the spectral density function are given by:
\langle k^2 \rangle = \frac{c\,(a^2 + b^2 + bc)}{b + c}, \qquad \langle k^4 \rangle = \frac{c\,(a^4 + 2a^2b^2 + b^4 + 4a^2bc + 4b^3c + 4b^2c^2 + bc^3)}{b + c}    (26)
The two-point correlation function g(r) can be given in an analytical form as well, which has only even powers of r in the small r expansion. From this two-point correlation function, the Debye correlation function can be calculated analytically using Eq. (12), once the clipping level (i.e., the volume fraction ϕ1 ) is specified. Explicit expressions for the curvatures are given in terms of the three parameters and the clipping level α (determined by the volume fraction of the majority phase as calculated by Eq. (10)) as:
\langle H \rangle = \frac{\alpha}{2} \sqrt{\frac{\pi}{6}\, \frac{c\,(a^2 + b^2 + bc)}{b + c}}    (27)

\langle K \rangle = -\frac{1}{6}\, (1 - \alpha^2)\, \frac{c\,(a^2 + b^2 + bc)}{b + c}    (28)

\langle H^2 \rangle = \langle K \rangle + \frac{a^4 + 2a^2b^2 + b^4 + 4a^2bc + 4b^3c + 4b^2c^2 + bc^3}{5\,(a^2 + b^2 + bc)}    (29)
This model can be used to fit all the bulk contrast data nicely to obtain parameters a, b and c. One can then use Eqs. (27)–(29) to calculate the respective curvatures.
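As a check on these formulas, the short routine below evaluates Eqs. (26)–(29) for a given parameter set. Run with the (a, b, c) of the last row of Table 1 below and α = 0 (the isometric case), it reproduces the tabulated ⟨K⟩ ≈ −8.94 × 10⁻⁴ Å⁻² and ⟨H²⟩ ≈ 64.9 × 10⁻⁴ Å⁻².

```python
# Evaluate the CRW curvature formulas, Eqs. (26)-(29), for fitted (a, b, c).
import numpy as np

def crw_curvatures(a, b, c, alpha=0.0):
    k2 = c * (a**2 + b**2 + b * c) / (b + c)                       # Eq. (26)
    k4 = c * (a**4 + 2 * a**2 * b**2 + b**4 + 4 * a**2 * b * c
              + 4 * b**3 * c + 4 * b**2 * c**2 + b * c**3) / (b + c)
    H  = 0.5 * alpha * np.sqrt(np.pi * k2 / 6.0)                   # Eq. (27)
    K  = -(1.0 - alpha**2) * k2 / 6.0                              # Eq. (28)
    H2 = K + k4 / (5.0 * k2)                                       # Eq. (29)
    return H, K, H2

# Last row of Table 1 (phi_s = 0.20), isometric so alpha = 0.
H, K, H2 = crw_curvatures(a=0.04968, b=0.01459, c=0.2096, alpha=0.0)
print(K, H2)   # about -8.94e-4 Å^-2 and 64.9e-4 Å^-2
```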
4. Examples of Small-angle Neutron Scattering (SANS) Data Analysis Using CRW Model
Figure 1a shows a series of scattering intensities I(Q) measured from an AOT microemulsion system at points in the one-phase region close to the lamellar phase boundary. The symbols are experimental data and the solid lines are theoretical fits using Eqs. (7), (15) and (25) in Eq. (3), plus an incoherent background. These curves show good agreement between theory and experimental data over the entire range of Q. The fitted parameters a, b, and c are listed in Table 1. As the surfactant volume fraction increases, a and c increase rapidly while b increases slowly. Considering the relations d ≈ 2π/a and ξ ≈ 1/b, the inter-domain distance d and the decay length ξ decrease with the surfactant volume fraction. This makes sense because more surface per unit volume is created as the number of surfactant molecules increases, and thus the inter-domain distance should decrease. Considering the ratio ξ/d, one sees that a bicontinuous microemulsion becomes more disordered at smaller surfactant volume fractions. The average Gaussian and square mean curvatures, calculated by using Eqs. (28) and (29), respectively, are given in Fig. 2a. The solid line is a fit using a phenomenological parabolic equation

\langle K \rangle = c_0 + c_1 (\phi_s - \phi_0)^2    (30)
Figure 1. (a) Analyses of the bulk contrast scattering intensities of isometric AOT-based microemulsions as a function of surfactant volume fraction. The scattering intensities were measured at points close to the lamellar phase boundary, where the average mean curvature is nearly zero; (b) analyses of the bulk contrast scattering intensities of isometric C_iE_j-based microemulsion systems taken at the fish tails of their phase diagrams.
Table 1. Fitted parameters a, b, c and the calculated interfacial curvatures for an isometric (equal oil and water volume fractions, ⟨H⟩ = 0) AOT/decane/water (NaCl) microemulsion system in the one-phase region along the lamellar phase boundary. φ_s is the volume fraction of the surfactant AOT

φ_s     a (Å⁻¹)    b (Å⁻¹)    c (Å⁻¹)    ⟨η²⟩ (10²⁰ cm⁻⁴)   backgrd (cm⁻¹)   ⟨K⟩ (10⁻⁴ Å⁻²)   ⟨H²⟩ (10⁻⁴ Å⁻²)   NaCl (wt%)
0.05    0.00613    0.01104    0.01710    25.92              0.385            −0.353           1.62              0.49
0.08    0.01317    0.01198    0.04682    13.07              0.383            −1.165           6.35              0.46
0.11    0.02139    0.01120    0.09114    10.86              0.393            −2.382           16.78             0.42
0.14    0.03005    0.01236    0.1096     9.33               0.382            −3.610           21.65             0.36
0.17    0.03939    0.01346    0.1530     9.30               0.352            −5.809           37.67             0.32
0.20    0.04968    0.01459    0.2096     9.00               0.321            −8.943           64.85             0.26
and the dashed line is a fit using a similar equation
\langle H^2 \rangle = c_0' + c_1' (\phi_s - \phi_0')^2.    (31)
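A parabolic fit of this kind is straightforward to reproduce. The sketch below applies Eq. (30) to the ⟨K⟩ column of Table 1 with scipy; the recovered parameters should be close to, though not necessarily identical with, the values quoted next, since details of the weighting and fit range matter.

```python
# Sketch of the parabolic fit of Eq. (30), using the <K> column of Table 1
# (in units of 1e-4 Å^-2).
import numpy as np
from scipy.optimize import curve_fit

phi_s = np.array([0.05, 0.08, 0.11, 0.14, 0.17, 0.20])
K     = np.array([-0.353, -1.165, -2.382, -3.610, -5.809, -8.943])

def parabola(phi, c0, c1, phi0):
    return c0 + c1 * (phi - phi0)**2            # Eq. (30)

popt, _ = curve_fit(parabola, phi_s, K, p0=(-0.4, -300.0, 0.04))
print(popt)   # expect roughly (-0.43, -310, 0.036)
```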
Results of the fits are c₀ = −0.434 × 10⁻⁴ Å⁻², c₁ = −310.2 × 10⁻⁴ Å⁻², φ₀ = 0.036 and c₀′ = 2.35 × 10⁻⁴ Å⁻², c₁′ = 2530 × 10⁻⁴ Å⁻², φ₀′ = 0.046. In both cases, φ₀ turns out to be very close to the volume fraction at the fish tail of the phase diagram. We can say that c₀ and c₀′ are the average Gaussian curvature and the average square mean curvature at the fish tail, which is the lowest volume fraction of the surfactant where the one-phase microemulsion first forms. The magnitudes of the average Gaussian curvatures obtained in this experiment are comparable to that obtained earlier for a C₁₀E₄/D₂O/octane microemulsion near the fish tail [10]. The quadratic dependence of the average Gaussian curvature on the volume fraction of surfactant is also understandable. According to Eqs. (18) and (21), the magnitude of the average Gaussian curvature is proportional to the square of the interfacial area per unit volume. The interfacial area is in turn proportional to the amount of surfactant added to the system. Since the average mean curvature is expected to be zero for the microemulsions we studied at these phase points, the average square mean curvature is the variance of the fluctuation of the mean curvature. One also observes that the variance of the fluctuation increases quadratically as the volume fraction increases. This is reasonable because as more surfactant is added to the system, more interfacial area is created; the surfactant film has to bend around to accommodate itself in the liquid. Figure 1b shows scattering intensities and their analyses for a series of isometric microemulsion systems composed of C₈E₃/D₂O/n-alkane at their respective fish tails in the phase diagram. The theory again agrees with experiment uniformly well, and the extracted curvatures are plotted as symbols in Fig. 2b. The fish tail is a very special point in the phase diagram where the system has the lowest specific internal surface area. Figure 2b shows results
Figure 2. (a) Average Gaussian and square mean curvatures as a function of surfactant volume fraction obtained from analyses of a series of isometric AOT-based microemulsions. The corresponding SANS intensities are those shown in Fig. 1a. Solid lines are parabolic fits based on Eqs. (30) and (31). (b) Average Gaussian and square mean curvatures as a function of surfactant volume fraction at the fish tail, obtained from analyses of a series of isometric C_iE_j-based microemulsions. The corresponding SANS intensities are those shown in Fig. 1b. Solid lines are the parabolic fits.
of curvatures we extracted from 18 generic non-ionic microemulsion systems [21]. Again, the parabolic dependence of the curvatures on the surfactant volume fraction is evident. Figure 3 shows the level surface as defined by Eq. (19) for the system C₈E₃/D₂O/tetradecane at its fish tail. The three parameters in the spectral density function needed for the three-dimensional reconstruction come from the analysis of the fourth intensity curve from the top shown in Fig. 1b. It is clear from the figure that the geometry of the level surface is locally hyperbolic, giving rise to a negative Gaussian curvature. In Fig. 4 we show scattering intensity distributions and their analyses for three non-isometric microemulsions. This new microemulsion system is composed of F8H16/H₂O/perfluorooctane [22]. We use the abbreviation F8H16 ≡ CF₃(CF₂)₇−COO(CH₂CH₂O)₇.₂CH₃. The analyses are successful and we are able to extract both the average mean and Gaussian curvatures, given in Table 2. It is interesting to ask the following question at this point: is the CRW method of analysis applicable to bicontinuous structures generated by late-stage spinodal decomposition of an isometric system such as a binary alloy or a polymer blend? The answer is partially yes. In Fig. 5 we show a light scattering intensity distribution [5] of a near-critical 50–50 polymer blend of perdeuterated polybutadiene (dPB) and polyisoprene (PI) at the late stage of spinodal decomposition. The analysis result is shown as the solid line. The plot
Figure 3. The level surface as defined by Eq. (19) for the system C₈E₃/D₂O/tetradecane at its fish tail. It is clear from the picture that the surface is locally hyperbolic, having a negative average Gaussian curvature. The size of the cube is 240 × 240 × 240 Å³.
Figure 4. Analyses of scattering intensity distributions for three non-isometric microemulsions composed of F8H16/H₂O/perfluorooctane [22]. The curvatures obtained are given in Table 2.
Table 2. Average mean and Gaussian curvatures extracted from non-isometric microemulsions

α (clipping level)   φ₁ (volume fraction)   ⟨H⟩ (10⁻³ Å⁻¹)   ⟨K⟩ (10⁻⁴ Å⁻²)
0.0530               0.48                   0.86             −3.35
0.2544               0.40                   4.80             −4.27
0.4438               0.33                   9.90             −5.05
is presented in a scaling form because the characteristic length scale of the microstructure is of the order of 10 µm. Agreement between the experiment and the theory is excellent for the large-Q portion of the curve but is only moderately satisfactory for the smaller-Q region. The following results are obtained:
α = 0, ⟨η²⟩ = 6.836 × 10¹² cm⁻⁴;
a = 5.36 × 10⁻⁵ Å⁻¹; b = 1.04 × 10⁻⁵ Å⁻¹; c = 87.2 × 10⁻⁵ Å⁻¹;
⟨K⟩ = −1.99 × 10⁻⁹ Å⁻² ≈ −0.2 µm⁻²    (32)
Figure 5. Analysis of a light scattering intensity distribution [5] of a 50–50 (isometric) near-critical polymer blend of perdeuterated polybutadiene (dPB) and polyisoprene (PI) at its late stage of spinodal decomposition. The analysis results are given in Eq. (32).
5. Conclusion
We have outlined a method, called the Clipped Random Wave model, by which the interfacial curvatures of a porous material can be measured through a scattering experiment. This method also allows one to reconstruct a three-dimensional interfacial structure of the porous medium, from which the morphology of the two-phase bicontinuous microstructure can be visualized. We illustrated this method using SANS data taken from bicontinuous microemulsion samples having bulk contrast. Specifically, we can extract the average mean, Gaussian and square mean curvatures of the internal interfaces. The method relies on the choice of a physically reasonable SDF of the underlying Gaussian random field, which upon clipping generates a bicontinuous structure that reproduces the given scattering intensity pattern. A physically reasonable SDF of the Gaussian random field should have finite lower-order moments to ensure a finite internal surface area per unit volume and a finite average square mean curvature. The method has been put to the test in three cases: a nearly isometric AOT/D₂O(NaCl)/decane microemulsion system as a function of surfactant volume fraction; several isometric microemulsions made of C_iE_j/D₂O/alkane at their respective fish tails in the phase diagrams; and several non-isometric microemulsions made of F8H16/H₂O/perfluorooctane. Theoretical intensities calculated by the CRW model can be made to agree with the scattering data on an absolute scale by adjusting the three length-scale parameters, 1/a, 1/b, 1/c, in the SDF. The commonly used two-parameter Teubner–Strey model for the Debye correlation function can also be used to fit the scattering data in the small-Q region, but the fit is distinctly worse compared to our three-parameter theory in the Q region beyond the peak. The parameters a and b can be approximately identified with the corresponding parameters in the T–S theory as a ≈ 2π/d and b ≈ 1/ξ. The third parameter c is necessary for ensuring a smooth transition of the scattering intensity from the small-Q behavior near the peak to the large-Q behavior dominated by Porod's law. We can thus say that three length scales are necessary for a complete description of the microstructure of bicontinuous microemulsions. The CRW model with an appropriate choice of the SDF is also shown to be a useful tool for quantitative analyses of small-angle scattering data from microphase-separated systems such as a symmetric binary mixture of polymers at a late stage of spinodal decomposition. It not only gives information on the curvatures of the interface, which cannot be obtained by other means, but can also generate the morphology of the 3-D microstructure, as shown in Fig. 3.
Acknowledgment
This research is supported by a grant from the Materials Science Program of the US Department of Energy. We are grateful to the Intense Pulsed Neutron Source
Division of Argonne National Laboratory for neutron beam time at the SAND Low-Angle Diffractometer.
References
[1] J.D. Gunton, M. San Miguel, and P. Sahni, In: C. Domb and J.L. Lebowitz (eds.), Phase Transitions and Critical Phenomena, Academic Press, New York, 1983, p. 269.
[2] T. Hashimoto, Phase Transitions, 12, 47, 1988; Materials Science and Technology, Vol. 12, Structure and Properties of Polymers, VCH, Weinheim, 1993.
[3] W. Jahn and R. Strey, J. Phys. Chem., 92, 2294, 1988.
[4] S.H. Chen, S.L. Chang, and R. Strey, J. Chem. Phys., 93, 1907, 1990.
[5] H. Jinnai, T. Hashimoto, D.D. Lee, and S.H. Chen, Macromolecules, 30, 130, 1997.
[6] S.H. Chen, D.D. Lee, K. Kimishima, H. Jinnai, and T. Hashimoto, Phys. Rev. E, 54, 6526, 1996.
[7] W.M. Gelbart, A. Ben-Shaul, and D. Roux (eds.), Micelles, Membranes, Microemulsions, and Monolayers, Springer, New York, 1994.
[8] S.A. Safran, Statistical Thermodynamics of Surfaces, Interfaces, and Membranes, Addison-Wesley, 1994.
[9] W. Helfrich, Z. Naturforsch., 28c, 693, 1973.
[10] D.D. Lee and S.H. Chen, Phys. Rev. Lett., 73, 106, 1994.
[11] J.W. Cahn, J. Chem. Phys., 42, 93, 1965.
[12] N.F. Berk, Phys. Rev. Lett., 58, 2718, 1987.
[13] S.H. Chen, S.L. Chang, and R. Strey, J. Appl. Crystallogr., 24, 721, 1991.
[14] P. Debye, H.R. Anderson, Jr., and H. Brumberger, J. Appl. Phys., 28, 679–683, 1957.
[15] R. Kirste and G. Porod, Kolloid-Z. Z. Polym., 184, 1–7, 1962.
[16] M. Teubner, Europhys. Lett., 14(5), 403–408, 1991.
[17] S.H. Chen and S.M. Choi, J. Appl. Cryst., 30, 755–760, 1997.
[18] M. Teubner and R. Strey, J. Chem. Phys., 87, 3195–3200, 1987.
[19] S.H. Chen, S.L. Chang, and R. Strey, Prog. Colloid Polym. Sci., 81, 30–35, 1990.
[20] S.H. Chen, S.L. Chang, and R. Strey, J. Appl. Cryst., 24, 721–731, 1991.
[21] S.-M. Choi, S.H. Chen, T. Sottmann, and R. Strey, "The existence of three length scales and their relation to the interfacial curvatures in bicontinuous microemulsions," Physica A, 304, 85–92, 2002.
[22] P. LoNostro, S.M. Choi, C.Y. Ku, and S.H. Chen, "Fluorinated microemulsions: a study of the phase behavior and structure by SANS," J. Phys. Chem., 103, 5347–5352, 1999.
Perspective 29
PLASTICITY AT THE ATOMIC SCALE: PARAMETRIC, ATOMISTIC, AND ELECTRONIC STRUCTURE METHODS
Christopher Woodward
Northwestern University, Evanston, Illinois, USA
Over the last hundred years our evolving comprehension of deformation has been based on the discovery and understanding of the line defects (dislocations) that control plasticity. While our ability to directly model various defects has improved over this time, a great deal has been learned about deformation processes using parametric approaches. A natural extension of analytic models, parametric studies are sometimes overlooked in our search for the most accurate computational representation of the mechanism that controls a given materials property. However, parametric strategies have been broadly employed in the materials community. For example, we have used parametric approaches to study the influence of microstructural properties on the strengthening mechanisms in model Ni-based superalloys and the influence of chemistry on high-temperature strengthening in Ti–Al alloys [1, 2]. Also, current dislocation dynamics calculations can be viewed as a template for parametric studies of the role of various defect–defect interactions and how they control macroscopic behavior. The deformation behavior of the bcc transition metals provides an excellent historical example of this type of work and of how it influenced our understanding of these materials. In the early 1920s, as materials scientists were exploring the deformation behavior of various simple metals, Schmid proposed the seemingly reasonable premise that plastic flow will occur when the shear stress resolved on a particular slip system reaches a critical value, the critical resolved shear stress (CRSS), which will be independent of slip system and the sense of slip. Further, the CRSS should not be influenced by other components of the applied stress tensor. Since that time many examples of violations of Schmid's law have been documented. In fact, for materials slated for high-temperature structural
applications (refractory metals and intermetallics), violations of Schmid's law seem to be the rule rather than the exception. In these materials edge dislocations tend to be very mobile and deformation is limited by the lattice friction stress (Peierls stress) of screw dislocations. These materials also exhibit large Peierls stresses compared to the simple fcc metals, particularly at low temperatures. This, and the significant deviations from Schmid's law, led Mitchell and co-workers [3] to propose that in the bcc transition metals this behavior was produced by dislocations with non-planar dislocation centers (cores) that made planar glide difficult. Realizing that the differences in geometry for fcc and bcc metals might explain the gross differences in the plasticity of these materials, several groups in the early 1960s developed parametric atomistic potentials for the bcc metals. One set of parametric potentials was based on variations of a generalized stacking fault energy [4, 5]. This is defined as the energy as a function of shear for two blocks of material displaced along the screw direction, where the plane of shear is the assumed glide plane. Structural parameters (lattice and elastic constants) were held fixed for the suite of potentials, but no effort was made to model a particular material. These simple studies showed that a range of non-planar dislocation cores could be obtained for screw dislocations in the bcc metals. Even though these studies were hampered by the simple forms of the interatomic potentials, for many years the qualitative parametric description of the screw dislocations was found to be satisfactory. The large lattice friction stresses, and some of the deviations from Schmid's law, could be understood in terms of the derived non-planar dislocation core structures. In the last 15 years, as confidence grew in new atomistic potential schemes such as the Embedded Atom Method and the Model Generalized Pseudopotential Theory, several groups produced carefully designed atomistic potentials for specific bcc transition metals [6–9]. These methods produced core structures consistent with the previous, parametric studies, lending credence to the idea that the local geometry determines the shape and mobility of dislocations on the atomic scale. Specifically, the studies showed that the predicted dislocation cores for the group V and VI bcc transition metals spread into three {110} planes and that the shapes of the cores fell into two distinct classes (Fig. 1). The group V metals exhibited a core spread symmetrically about a central point (Fig. 1a), while the core for the group VI metals spreads asymmetrically about this central point (Fig. 1b). Differences in the macroscopic plasticity of the group V and VI transition metals, for example the higher sensitivity to non-glide stresses in the group VI metals, were linked to the differences in the equilibrium core structures. Also, atomistic studies showed that details of the generalized stacking fault determined the differences in the equilibrium shape of the dislocations [10]. Only recently has it been possible to actually model these dislocation cores using electronic structure methods based on Density Functional Theory (DFT).
Figure 1. Schematic of the strain field near the dislocation cores found using atomistic potentials (before 2001) for the group V and group VI bcc transition metals.
The geometry of an isolated dislocation is problematic for current large-scale electronic structure methods, so two approaches have been taken to accommodate the boundary conditions. The group at Wright-Patterson Air Force Base employed a flexible boundary condition method that self-consistently couples the local strain field produced by the dislocation core to the long-range elastic field [11]. This technique is an extension of a method originally proposed by Sinclair and allows the dislocation to be contained in a very small simulation cell [12, 13]. Other groups have used simulation cells with dislocation dipoles carefully arranged to minimize the total stress field produced by the dislocation array [14, 15]. Surprisingly, all the electronic structure calculations show the Ta (group V) and Mo (group VI) transition metals to have a dislocation core consistent with Fig. 1a. Frederiksen and Jacobsen also find the same result for the screw dislocation in bcc Fe (group VIII). They also find that the generalized stacking faults calculated using DFT for all the group V and VI bcc metals are consistent with a core spread symmetrically about a central point (Fig. 1a). Our calculations also show that the atomic-scale response of the screw dislocations to applied stress in Mo and Ta is similar to that found using atomistic potentials. Thus the differences in macroscopic plasticity of these two materials are probably not linked to the shape of the equilibrium cores, but are more likely tied to anisotropies in the local bonding of the dislocation core. While the atomistic methods are correctly incorporating a great deal of this information content, predicting differences between elemental metals may require improved interaction models or ab initio methods.
Electronic structure calculations indicate that one extreme of the parametric model (Fig. 1b) is not realized in the bcc transition metals. This is not a failure of the previous methods; it just indicates that this level of detail resides at the electronic level and must be modeled appropriately if we are to understand the behavior of a particular material. The question remains: can we find examples of the core shown in Fig. 1b in nature, and how will this configuration affect plasticity? Thus the original parametric investigations, based on the geometry and reasonable values for the physical parameters, had great value in breaking into a new area of materials science. These configurations may be realized in other bcc metals, or perhaps in studies of interstitial–dislocation interactions. Simple parametric models will continue to play an important role in extending our understanding of materials behavior; don't overlook this strategy when approaching a new area in computational materials science.
References
[1] S.I. Rao, T.A. Parthasarathy, D.M. Dimiduk, and P.M. Hazzledine, "Discrete dislocation simulation of precipitate hardening in superalloys," Phil. Mag., in press, 2004.
[2] C. Woodward and J.M. MacLaren, "Planar fault energies and sessile dislocation configurations in substitutionally disordered Ti–Al with Nb and Cr ternary additions," Phil. Mag., A74, 337, 1996.
[3] T.E. Mitchell, R.A. Foxall, and P.B. Hirsch, "Work hardening in niobium single crystals," Phil. Mag., 8, 1895, 1963.
[4] V. Vitek, R.C. Perrin, and D.K. Bowen, "The core structure of a/2⟨111⟩ screw dislocations in BCC crystals," Phil. Mag., 21, 1049, 1970.
[5] M.S. Duesbery, V. Vitek, and D.K. Bowen, "The effect of shear stress on the screw dislocation core structure in body-centred cubic lattices," Proc. Roy. Soc. Lond., A332, 85–111, 1973.
[6] G.J. Ackland and V. Vitek, "Many-body potentials and atomic-scale relaxations in noble metal alloys," Phys. Rev. B, 41, 10324, 1990.
[7] D. Farkas and P.L. Rodriguez, "Embedded atom study of dislocation core structure in Fe," Scripta Metall. Mater., 30, 921, 1994.
[8] W. Xu and J.A. Moriarty, "Atomistic simulation of ideal shear strength, point defects, and screw dislocations in BCC transition metals: Mo as a prototype," Phys. Rev. B, 54, 6941, 1996.
[9] L.H. Yang, P. Soderlind, and J.A. Moriarty, "Accurate atomistic simulation of (a/2)⟨111⟩ screw dislocations and other defects in BCC tantalum," Phil. Mag., A81, 1355–85, 2001.
[10] M.S. Duesbery and V. Vitek, "Plastic anisotropy in BCC transition metals," Acta Mater., 46, 1481–1492, 1998.
[11] C. Woodward and S.I. Rao, "Flexible ab initio boundary conditions: simulating isolated dislocations in BCC Mo and Ta," Phys. Rev. Lett., 88, 216402, 2002.
[12] J.E. Sinclair, P.C. Gehlen, R.G. Hoagland et al., "Flexible boundary conditions and nonlinear geometric effects in atomic dislocation modeling," J. Appl. Phys., 49, 3890–3897, 1978.
[13] S.I. Rao, C. Hernandez, J.P. Simmons et al., "Green function boundary conditions in 2-d and 3-d atomistic simulations of dislocations," Phil. Mag., A77, 231, 1998.
[14] S. Ismail-Beigi and T.A. Arias, "Ab-initio study of screw dislocations in Mo and Ta: a new picture of plasticity in BCC transition metals," Phys. Rev. Lett., 84, 1499–1503, 2000.
[15] S.L. Frederiksen and K. Jacobsen, "Density functional theory studies of screw dislocation core structures in BCC metals," Phil. Mag., 83, 365–375, 2003.
Perspective 30
A PERSPECTIVE ON DISLOCATION DYNAMICS
Nasr M. Ghoniem
Mechanical and Aerospace Engineering Department, University of California, Los Angeles, CA 90095-1597, USA
A fundamental description of plastic deformation has recently been pursued in many parts of the world as a result of dissatisfaction with the limitations of continuum plasticity theory. Although continuum models of plastic deformation are extensively used in engineering practice, their range of application is limited by the underlying database. The reliability of continuum plasticity descriptions is dependent on the accuracy and range of available experimental data. Under complex loading situations, however, the database is often hard to establish. Moreover, the lack of a characteristic length scale in continuum plasticity makes it difficult to predict the occurrence of critical localized deformation zones. Although homogenization methods have played a significant role in determining the elastic properties of new materials from their constituents (e.g., composite materials), the same methods have failed to describe plasticity. It is widely appreciated that plastic strain is fundamentally heterogeneous, displaying high strains concentrated in small material volumes, with virtually undeformed regions in between. Experimental observations consistently show that plastic deformation is heterogeneous at all length scales. Depending on the deformation mode, heterogeneous dislocation structures appear with definitive wavelengths. A satisfactory description of realistic dislocation patterning and strain localization has been rather elusive. Attempts aimed at this question have been based on statistical mechanics, reaction–diffusion dynamics, or the theory of phase transitions. Much of the effort has aimed at clarifying the fundamental origins of inhomogeneous plastic deformation. On the other hand, engineering descriptions of plasticity have relied on experimentally verified constitutive equations. At the macroscopic level, shear bands are known to localize plastic strain, leading to material failure. At smaller length scales, dislocation distributions are mostly heterogeneous in deformed materials, leading to the formation of
a number of strain patterns. Generally, dislocation patterns are thought to be associated with energy minimization of the deforming material, and manifest themselves as regions of high dislocation density separated by zones of virtually undeformed material. Dislocation-rich regions are zones of facilitated deformation, while dislocation-poor regions are hard spots in the material, where plastic deformation does not occur. Dislocation structures, such as Persistent Slip Bands (PSBs), planar arrays, dislocation cells, and subgrains, are experimentally observed in metals under both cyclic and steady deformation conditions. Persistent slip bands are formed under cyclic deformation conditions, and have been mostly observed in copper and copper alloys. They appear as sets of parallel walls composed of dislocation dipoles, separated by dislocation-free regions. The length dimension of the wall is orthogonal to the direction of dislocation glide. Dislocation planar arrays are formed under monotonic stress deformation conditions, and are composed of parallel sets of dislocation dipoles. While PSBs are found to be aligned in planes with normal parallel to the direction of the critical resolved shear stress, planar arrays are aligned in the perpendicular direction. Dislocation cell structures, on the other hand, are honeycomb configurations in which the walls have high dislocation density, while the cell interiors have low dislocation density. Cells can be formed under both monotonic and cyclic deformation conditions. However, dislocation cells under cyclic deformation tend to appear after many cycles. Direct experimental observations of these structures have been reported for many materials. Two of the most fascinating features of micro-scale plasticity are the spontaneous formation of dislocation patterns, and the highly intermittent and spatially localized nature of plastic flow. Dislocation patterns consist of alternating dislocation-rich and dislocation-poor regions usually in the µm range (e.g., dislocation cells, sub-grains, bundles, veins, walls, and channels). On the other hand, the local values of strain rates associated with intermittent dislocation avalanches are estimated to be on the order of 1–10 million times greater than externally imposed strain rates. Understanding the collective behavior of defects is important because it provides a fundamental understanding of failure phenomena (e.g., fatigue and fracture). It will also shed light on the physics of self-organization and the behavior of critical-state systems (e.g., avalanches, percolation, etc.). Because the internal geometry of deforming crystals is very complex, a physically-based description of plastic deformation can be very challenging. The topological complexity is manifest in the existence of dislocation structures within otherwise perfect atomic arrangements. Dislocation loops delineate regions where large atomic displacements are encountered. As a result, long-range elastic fields are set up in response to such large, localized atomic displacements. As the external load is maintained, the material deforms
plastically by generating more dislocations. Thus, macroscopically observed plastic deformation is a consequence of dislocation generation and motion. A closer examination of atomic positions associated with dislocations shows that large displacements are confined to only a small region around the dislocation line (i.e., the dislocation core). The majority of the displacement field can be conveniently described as elastic deformation. Even though one utilizes the concept of dislocation distributions to account for large displacements close to dislocation lines, a physically-based plasticity theory can paradoxically be based on the theory of elasticity! Studies of the mechanical behavior of materials at a length scale larger than what can be handled by direct atomistic simulations, and smaller than what allows macroscopic continuum averaging, represent particular difficulties. When the mechanical behavior is dominated by microstructure heterogeneity, the mechanics problem can be greatly simplified if all atomic degrees of freedom are adiabatically eliminated, and only those associated with defects are retained. Because the motion of all atoms in the material is not relevant, and only atoms around defects determine the mechanical properties, one can just follow material regions around defects. Since the density of defects is many orders of magnitude smaller than the atomic density, two useful results emerge. First, defect interactions can be accurately described by long-range elastic forces transmitted through the atomic lattice. Second, the number of degrees of freedom required to describe their topological evolution is many orders of magnitude smaller than those associated with atoms. These observations have been instrumental in the emergence of meso-mechanics on the basis of defect interactions, developed by Eshelby, Kröner, Kossevich, Mura and others. Thanks to many computational advances during the past two decades, the field has steadily moved from conceptual theory to practical applications. While early research in defect mechanics focused on the nature of the elastic field arising from defects in materials, recent computational modelling has shifted the emphasis to defect ensemble evolution. Although the theoretical foundations of dislocation theory are well-established, efficient computational methods are still in a state of development. Other than a few cases of perfect symmetry and special conditions, the elastic field of 3-D dislocations of arbitrary geometry is not analytically available. The field of dislocation ensembles is likewise analytically unattainable. A relatively recent approach to investigating the fundamental aspects of plastic deformation is based on direct numerical simulation of the interaction and motion of dislocations. This approach, which is commonly known as dislocation dynamics (DD), was first introduced for 2-D straight, infinitely long dislocation distributions, and later for complex 3-D microstructures. In DD simulations of plastic deformation, the computational effort per time-step is proportional to the square of the number of interacting segments, because
of the long-range stress field associated with dislocation lines. The computational requirements for 3-D simulations of plastic deformation of even single crystals are thus very challenging. The study of dislocation configurations at short range can be quite complex, because of large deformations and reconfigurations of dislocation lines during their interaction. Thus, adaptive grid generation methods and more refined treatments of self-forces have been found to be necessary. In some special cases, however, simpler topological configurations are encountered. For example, long straight dislocation segments are experimentally observed in materials with high Peierls potential barriers (e.g., covalent materials), or when large mobility differences between screw and edge components exist (e.g., some bcc crystals at low temperature). Under conditions conducive to the glide of small prismatic loops on glide cylinders, or to the uniform expansion of nearly circular loops, changes in loop shape are minimal during motion. Also, helical loops of nearly constant radius are sometimes observed in quenched or irradiated materials under the influence of point defect fluxes. It is clear that, depending on the particular application and physical situation, one would be interested in a flexible method which can capture the essential physics at a reasonable computational cost. A consequence of the long-range nature of the dislocation elastic field is that the computational effort per time step is proportional to the square of the number of interacting segments. It is therefore advantageous to reduce the number of interacting segments within a given computer simulation, or to develop more efficient approaches to computation of the long-range field. While continuum approaches to constitutive models are limited by the underlying experimental database, DD methods offer new directions for modeling microstructure evolution from fundamental principles. The limitation of the method presented here is mainly computational, and much effort is needed to overcome several difficulties. First, the length and time scales represented by the present method are still short of many experimental observations, and methods of rigorous extension are still needed. Second, the boundary conditions of real crystals are more complicated, especially when external and internal surfaces are to be accounted for. Thus, the present approach does not take into account large lattice rotations and finite deformation of the underlying crystal, which may be important for the explanation of certain scale effects on plastic deformation. And finally, a much expanded effort is needed to bridge the gap between atomistic calculations of dislocation properties on the one hand, and continuum mechanics formulations on the other. Nevertheless, with all of these limitations, the DD approach is worth pursuing, because it opens up new possibilities for linking the fundamental nature of the microstructure with realistic material deformation conditions. It can thus provide an additional tool for both theoretical and experimental investigations of plasticity and failure of materials.
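To make the O(N²) cost concrete, here is a deliberately oversimplified sketch of one over-damped DD time step for parallel edge dislocations in 2-D. The 1/r pairwise kernel is only a stand-in for the true Peach–Koehler interaction between segments (which requires the full segment stress fields), and all parameters are illustrative; the point is the double loop over segments, which is the bottleneck that fast-summation approaches such as that of Ref. [13] below are designed to remove.

```python
# Schematic over-damped DD time step; the interaction kernel is a toy model.
import numpy as np

rng = np.random.default_rng(1)
N = 400
x = rng.uniform(0.0, 1.0, size=(N, 2))   # positions of parallel edge dislocations
s = rng.choice([-1.0, 1.0], size=N)      # sign of the Burgers vector
A = 1.0                                  # stands in for mu*b^2/(2*pi*(1-nu))
M, dt = 1.0, 1e-5                        # mobility and time step

def forces(x, s):
    """Pairwise glide forces; the double loop is the O(N^2) bottleneck."""
    f = np.zeros(N)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            dx = x[i, 0] - x[j, 0]
            r2 = np.sum((x[i] - x[j])**2) + 1e-6   # regularized near the core
            f[i] += A * s[i] * s[j] * dx / r2      # schematic 1/r glide kernel
    return f

x[:, 0] += M * dt * forces(x, s)         # over-damped mobility law: v = M * F
```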
Two main approaches have been advanced to model the mechanical behavior at this meso length scale. The first is based on statistical mechanics methods. In these developments, evolution equations for statistical averages (and possibly for higher moments) are to be solved for a complete description of the deformation problem. The main challenge in this regard is that, unlike the situation encountered in the development of the kinetic theory of gases, the topology of interacting dislocations within the system must be included. The second approach, commonly known as Dislocation Dynamics (DD), was initially motivated by the need to understand the origins of heterogeneous plasticity and pattern formation. An early variant of this approach (the cellular automata) was first developed by Lepinoux and Kubin [1], and that was followed by the proposal of DD [2–4]. In these early efforts, dislocation ensembles were modelled as infinitely long and straight in an isotropic infinite elastic medium. The method was further expanded by a number of researchers, with applications demonstrating simplified features of deformation microstructure. Since it was first introduced in the mid-eighties independently by Lepinoux and Kubin, and by Ghoniem and Amodeo, Dislocation Dynamics (DD) has become an important computer simulation tool for the description of plastic deformation at the micro- and meso-scales (i.e., the size range of a fraction of a micron to tens of microns). The method is based on a hierarchy of approximations that enable the solution of relevant problems with today's computational resources. In its early versions, the collective behavior of dislocation ensembles was determined by direct numerical simulations of the interactions between infinitely long, straight dislocations [5]. Recently, several research groups extended the DD methodology to the more physical, yet considerably more complex, 3-D simulations. The method can be traced back to the concepts of internal stress fields and configurational forces. The more recent development of 3-D lattice dislocation dynamics by Kubin and co-workers has resulted in greater confidence in the ability of DD to simulate more complex deformation microstructures [6–8]. More rigorous formulations of 3-D DD have contributed to its rapid development and application in many systems [9–15]. We can classify the computational methods of DD into the following categories:
1. The Parametric Method: The dislocation loop can be geometrically represented as a continuous (to second derivative) composite space curve. This has two advantages: (1) there is no abrupt variation or singularity associated with the self-force at the joining nodes in between segments; (2) very drastic variations in dislocation curvature can be easily handled without excessive re-meshing. Other approximation methods have been developed by a number of groups. These approaches differ mainly in the representation of dislocation loop geometry, the manner by which the elastic field and self-energies are calculated, and some additional details
related to how boundary and interface conditions are handled. The suitability of each method is determined by the required level of accuracy and resolution in a given application. In the parametric method, dislocation loops are divided into contiguous segments represented by parametric space curves.
2. The Lattice Method: Straight dislocation segments (either pure screw or edge in the earliest versions, or of mixed character in more recent versions) are allowed to jump on specific lattice sites and orientations. The method is computationally fast, but gives coarse resolution of dislocation interactions.
3. The Force Method: Straight dislocation segments of mixed character are moved in a rigid-body fashion along the normal to their mid-points, but they are not tied to an underlying spatial lattice or grid. The advantage of this method is that explicit information on the elastic field is not necessary, since closed-form solutions for the interaction forces are directly used.
4. The Differential Stress Method: This is based on calculations of the stress field of a differential straight line element on the dislocation. Using numerical integration, Peach–Koehler forces on all other segments are determined. The Brown procedure [16] is then utilized to remove the singularities associated with the self-force calculation.
5. The Phase Field Microelasticity Method: This method is based on the reciprocal space theory of the strain in an arbitrary elastically homogeneous system of misfitting coherent inclusions embedded in the parent phase. Thus, consideration of individual segments of all dislocation lines is not required. Instead, the temporal and spatial evolution of several density function profiles (fields) is obtained by solving continuum equations in Fourier space [17].
References
[1] J. Lepinoux and L.P. Kubin, "The dynamic organization of dislocation structures: a simulation," Scripta Met., 21(6), 833, 1987.
[2] N.M. Ghoniem and R.J. Amodeo, "Computer simulation of dislocation pattern formation," Solid State Phenomena, 3 & 4, 377, 1988.
[3] R.J. Amodeo and N.M. Ghoniem, "Dislocation dynamics I: a proposed methodology for deformation micromechanics," Phys. Rev. B, 41, 6958, 1990.
[4] R.J. Amodeo and N.M. Ghoniem, "Dislocation dynamics II: applications to the formation of persistent slip bands, planar arrays, and dislocation cells," Phys. Rev. B, 41, 6968, 1990.
[5] H.Y. Wang and R. LeSar, "O(N) algorithm for dislocation dynamics," Phil. Mag. A, 71(1), 149, 1995.
[6] L.P. Kubin, G. Canova, M. Condat, B. Devincre, V. Pontikis, and Y. Brechet, "Dislocation microstructures and plastic flow: a 3D simulation," Diffusion and Defect Data – Solid State Data, Part B (Solid State Phenomena), 23–24, 455, 1992.
[7] G. Canova, Y. Brechet, and L.P. Kubin, "3D dislocation simulation of plastic instabilities by work-softening in alloys," In: S.I. Anderson et al. (eds.), Modeling of Plastic Deformation and its Engineering Applications, RISØ National Laboratory, Roskilde, Denmark, 1992.
[8] B. Devincre and L.P. Kubin, "Simulations of forest interactions and strain hardening in fcc crystals," Mod. Sim. Mater. Sci. Eng., 2(3A), 559, 1994.
[9] J.P. Hirth, M. Rhee, and H. Zbib, "Modeling of deformation by a 3D simulation of multipole, curved dislocations," J. Comp.-Aided Mat. Design, 3, 164, 1996.
[10] K.W. Schwarz and J. Tersoff, "Interaction of threading and misfit dislocations in a strained epitaxial layer," App. Phys. Lett., 69(9), 1220, 1996.
[11] H.M. Zbib, M. Rhee, and J.P. Hirth, "On plastic deformation and the dynamics of 3D dislocations," Int. J. Mech. Sci., 40(2–3), 113, 1998.
[12] M. Rhee, H.M. Zbib, J.P. Hirth, H. Huang, and T. de la Rubia, "Models for long/short-range interactions and cross slip in 3D dislocation simulation of bcc single crystals," Mod. Sim. Mater. Sci. Eng., 6(4), 467, 1998.
[13] N.M. Ghoniem and L.Z. Sun, "Fast sum method for the elastic field of 3-D dislocation ensembles," Phys. Rev. B, 60(1), 128–140, 1999.
[14] N.M. Ghoniem, S.-H. Tong, and L.Z. Sun, "Parametric dislocation dynamics: a thermodynamics-based approach to investigations of mesoscopic plastic deformation," Phys. Rev. B, 61(2), 913–927, 2000.
[15] N.M. Ghoniem, J. Huang, and Z. Wang, "Affine covariant-contravariant vector forms for the elastic field of parametric dislocations in isotropic crystals," Phil. Mag. Lett., 82(2), 55–63, 2001.
[16] L.M. Brown, "A proof of Lothe's theorem," Phil. Mag., 15, 363–370, 1967.
[17] Y. Wang, Y. Jin, A. Cuitino, and A.G. Khachaturyan, "Nanoscale phase field microelasticity theory of dislocations: model and 3D simulations," Acta Mat., 49, 1847, 2001.
Perspective 31
DISLOCATION-PRESSURE INTERACTIONS
J.P. Hirth
Ohio State and Washington State Universities, 114 E. Ramsey Canyon Rd., Hereford, AZ 85615, USA
1. Introduction
There are various ways in which a dislocation can interact with isostatic stress, with most effects being important at elevated stress levels. Some of these that have received extensive attention are the effects of isostatic stress on elastic constants [1–3]; the direct influence of pressure, or the indirect effect via the stress normal to the glide plane, on the Peierls stress, or more specifically the kink formation energy, in many metals and alloys [4, 5]; the influence of the degree and symmetry of core splitting of screw dislocations in bcc metals [6–8] as well as in some semiconducting crystals and intermetallic compounds [8, 9]; the implementation of core-splitting effects in terms of the general stress tensor [10, 11]; the indirect effect on operative slip systems that modifies geometric hardening [12]; and weak effects on factors such as lattice parameter, vibrational frequency and diffusivity. However, little or no consideration has been given to the coupling of isostatic stress with the nonlinear, long-range, elastic field of the dislocation. Here, we briefly treat this effect and indicate how it can be incorporated into constitutive models for dislocation motion.
2. Origin of the Coupling
The nonlinear elastic theory of dislocations has been developed, in the perturbation sense of including third- and fourth-order elastic constants, for both the isotropic and anisotropic elastic cases [1, 13, 14]. These treatments predict, for example, that there is a biaxial dilatation associated with a dislocation, and X-ray diffraction verifies the presence of the dilatation [1]. Of course, in low-symmetry crystals such as triclinic, there is a dilatational response to
any applied stress, so local dilatation should exist near a dislocation. However, the perturbation models are not accurate in predicting the magnitude of the dilatation, although they are useful in giving the dominant terms for the near-core region once the core fields are known. The dilatation is associated with highly nonlinear core effects and can be estimated from atomistic simulations of the near-core region [15] or from an atomistic simulation involving the variation of strain energy with stress in a local near-core region [16]. The core field entails sets of line-force dipoles without moment [15], but at long range it converges to a biaxial dilatation, so a consideration of the latter suffices at long range. A list of typical biaxial dilatations determined in such a manner is given by Puls [17]. Typically, the biaxial dilatational area, δa, including image relaxation at free surfaces, is of the order of 1 to 1.5 atomic areas per plane cut by the dislocation. This corresponds to a volume change per unit length δv/L = δa.
3. Influence on Dislocation Motion
We consider local coordinates x_i fixed on the dislocation with x_3 parallel to ξ, the sense vector of the dislocation. The dilatation will then interact with the biaxial stress σ = (σ_11 + σ_22)/2. In most practical situations σ will correspond to the isostatic stress σ_I = −p = (σ_11 + σ_22 + σ_33)/3, where p is the isostatic pressure. We consider this correspondence to apply, although the more specific σ case can easily be introduced into the results. The local coupling will then produce a positive interaction energy per unit length given by W_p = pδa.
4. Systems Deforming by Kink Motion
As a first application, we treat kink-pair formation on screw dislocations in bcc metals. For Fe a typical kink formation energy is W_k = 0.8 eV [7], δa = 0.62 b² [18], and kinks should be inclined at an angle φ ∼ 45° to the Peierls valley. Here b is the length of the Burgers vector of a 1/2[111] dislocation. These values lead to an energy contribution from the pressure interaction of W_p λ = 7.7 (p/μ) eV, normalized to the shear modulus μ, where λ is the kink length. Thus at (p/μ) = 0.01, the contribution of W_p λ is about ten percent of W_k. The value (p/μ) = 0.01 is readily attained in shock loading and it is of the order of pressures achieved in constrained high-pressure tensile testing devices. This value is also typical of the isostatic stress σ_I = −p achieved in uniaxial tensile tests of heavily drawn steel wire [19]. In general, the kink formation energy can be written

W_f = W_k + W_h + W_p λ + W_σ/2    (1)
Entropy terms would also enter the kink free energy [20]. Here, W_k is the kink formation energy in the absence of stress, W_h is the contribution arising from core splitting and the influence of the applied stress tensor [10], and W_σ is the work done by the effective shear stress resolved on the glide plane as a kink pair is created. The numerical example indicates that W_p λ is significant and should be included in the analysis of kink formation at high stress levels. Less obviously, there is an indirect effect of pressure on kink energy. When W_p is appreciable, there is a tendency to increase φ from its zero-stress value in order to minimize the length λ that interacts with p: that is, the minimum-free-energy kink will become sharper. This means that W_k will increase relative to the zero-stress value. The effect is second-order relative to the direct effect just discussed, but it can be important at very high stress levels. For fcc metals, the isostatic stress interaction term is roughly W_p λ ≈ 5 (p/μ) eV. This is similar to the value for bcc metals. Thus for fcc metals, the kink mechanism is only operative at temperatures typically well below room temperature, and there the interaction again can be important under shock loading. The effect is negligible for bulk metals under uniaxial tension because of the relatively low values of the flow stress for fcc metals. However, it could also be important for in-situ composites [21] and thin multilayer structures [22], where stresses also reach values where (p/μ) ∼ 0.01. When the interaction is important, the influence on φ should be marked because of its initially low value, ∼5° [20]. Other systems with large kink formation energies should behave analogously to the bcc screw case. Intermetallic compounds [9] and ceramic crystals such as alumina [23] should exhibit such behavior.
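As a quick numerical check, the estimate quoted above for Fe can be reproduced directly from the values given in the text (a minimal Python sketch; no new data are introduced):

```python
# Pressure contribution to kink-pair formation in bcc Fe (values from the text)
W_k = 0.8                      # kink formation energy at zero stress (eV)
p_over_mu = 0.01               # normalized pressure, attainable in shock loading
W_p_lambda = 7.7 * p_over_mu   # pressure interaction term: 7.7 (p/mu) eV
print(f"W_p*lambda = {W_p_lambda:.3f} eV = {W_p_lambda / W_k:.0%} of W_k")
# -> about ten percent of W_k, as stated above
```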
5. Systems Deforming by Dislocation Bowout and Breakaway
In fcc metals at room temperature, and in most bcc metals above a critical temperature that exceeds room temperature, deformation involves locally bowed-out segments that break away from obstacles by cutting or bypassing them. The barrier to bowout is the increase in line length of the bowing dislocation. The exact line energy is a complicated function of segment length, dislocation character and local surrounding segment configurations [20]. For our purposes, it suffices to use the simple line-tension approximation. The increase in energy is then ΔW = S_L ΔL = (μb²/2) ΔL, where S_L is the constant line tension and ΔL is the increase in line length. With the pressure interaction present, W_p adds directly to S_L. Thus, the total line tension is

S = S_L + W_p = μb²/2 + pδa    (2)
The increase in energy with an increase of line length is then ΔW = S ΔL. The importance of the pressure interaction term can be determined by direct substitution.
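Such a substitution is easily sketched; in the illustrative Python below the modulus and Burgers vector are generic assumed values, and δa is taken as one atomic area (b²), in the range quoted earlier from the Puls compilation:

```python
# Relative pressure correction to the line tension, S = mu*b^2/2 + p*delta_a
mu = 80e9              # shear modulus (Pa), assumed value
b = 2.5e-10            # Burgers vector (m), assumed value
delta_a = 1.0 * b**2   # biaxial dilatational area, ~1 atomic area (text: 1 to 1.5)
for p_over_mu in (0.001, 0.01):
    ratio = (p_over_mu * mu) * delta_a / (mu * b**2 / 2)
    print(f"p/mu = {p_over_mu}: p*delta_a / S_L = {ratio:.1%}")
# The correction scales as 2*(p/mu)*(delta_a/b^2): ~2% at p/mu = 0.01.
```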
6. Summary
Dislocations have a long-range dilatational strain field arising from large nonlinearities in the core. The field interacts with isostatic stresses. The resultant interaction energy can be an important contribution to kink formation energy and to dislocation line tension at high pressures.
References
[1] A. Seeger and P. Haasen, Philos. Mag., 3, 470, 1958.
[2] D.J. Steinberg, S.G. Cochran, and J. Guinan, J. Appl. Phys., 51, 1498, 1980.
[3] L.H. Yang, P. Söderlind, and J.A. Moriarty, Philos. Mag. A, 81, 1355, 2001.
[4] J.A. Moriarty, Phys. Rev. B, 49, 12431, 1994.
[5] J.A. Moriarty, V. Vitek, V.V. Bulatov et al., J. Computer-Aid. Mater. Des., 9, 99, 2002.
[6] V. Vitek, Cryst. Lattice Defects, 5, 1, 1974.
[7] M.S. Duesbery, Acta Metall., 31, 1747, 1983.
[8] W. Cai, V.V. Bulatov, J. Chang et al., In: F.R.N. Nabarro and J.P. Hirth (eds.), Dislocations in Solids, vol. 12, North-Holland, Amsterdam, 2004.
[9] V. Paidar, D.P. Pope, and V. Vitek, Acta Metall., 32, 435, 1984.
[10] Q. Qin and J.L. Bassani, J. Mech. Phys. Solids, 40, 835, 1992.
[11] M.S. Duesbery and V. Vitek, Acta Metall. Mater., 46, 1481, 1998.
[12] R.J. Asaro and J.R. Rice, J. Mech. Phys. Solids, 25, 309, 1977.
[13] J.R. Willis, Int. J. Engin. Sci., 5, 171, 1967.
[14] C. Teodosiu, Elastic Models of Crystal Defects, Springer-Verlag, Berlin, p. 208, 1982.
[15] R.G. Hoagland, J.P. Hirth, and P.C. Gehlen, Philos. Mag., 34, 413, 1976.
[16] R.G. Hoagland, M.S. Daw, and J.P. Hirth, J. Mater. Res., 6, 2565, 1991.
[17] M.P. Puls, In: M.F. Ashby et al. (eds.), Dislocation Modeling of Physical Systems, Pergamon, Oxford, p. 249, 1981.
[18] B.L. Adams, J.P. Hirth, P.C. Gehlen et al., J. Phys. F, 7, 2021, 1977.
[19] J.D. Embury, A.S. Keh, and R.M. Fisher, Metall. Trans., 12, 478, 1967.
[20] J.P. Hirth and J. Lothe, Theory of Dislocations, Krieger, Malabar, FL, 1992.
[21] K. Han, J.D. Embury, J.J. Petrovic et al., Acta Mater., 46, 4691, 1998.
[22] A. Misra and H. Kung, Adv. Engin. Mater., 3, 217, 2001.
[23] T.E. Mitchell, P. Peralta, and J.P. Hirth, Acta Mater., 47, 3687, 1999.
Perspective 32 DISLOCATION CORES AND UNCONVENTIONAL PROPERTIES OF PLASTIC BEHAVIOR V. Vitek Department of Materials Science and Engineering, University of Pennsylvania, Philadelphia, PA 19104
1. Concept of Dislocations
Dislocations are line defects found in all crystalline materials and their motion produces plastic flow. The notion of dislocations has two starting points. First, the dislocation was introduced as an elastic singularity by considering the deformation of a body occupying a multiply connected region of space. Secondly, dislocations were introduced into crystal physics when analyzing the large discrepancy between the theoretical and experimental strength of crystals. These two approaches are intertwined since the crystal dislocations are sources of long-ranged elastic stresses and strains that can be examined in the continuum framework. In fact, the bulk of dislocation theory employs continuum elasticity when analyzing a broad variety of dislocation phenomena encountered in plastically deforming crystals [1–4]. From the continuum point of view dislocations are line singularities invoking long-ranged elastic stress and strain fields that decrease as 1/d, where d is the distance from the dislocation line. Formally, both the stress and strain diverge at the dislocation line and the corresponding strain energy would also diverge. However, physically this means that there is a region, centered at the dislocation line, in which linear elasticity does not apply. This region, the dimensions of which are of the order of the lattice spacing, is called the dislocation core. The properties of the core region, and its impact on dislocation motion and thus on plastic yielding, can only be fully understood when the atomic structure is adequately accounted for. In general, when a dislocation glides its core undergoes changes that are the source of an intrinsic lattice friction. This
friction is periodic with the period of the crystallographic direction in which the dislocation moves. The applied stress needed to overcome this friction at 0 K is called the Peierls stress and the corresponding periodic energy barrier is called the Peierls barrier. In this Commentary we first focus on the dominant aspects of dislocation cores, in particular their symmetries and the form of spreading of atomic displacements in the core region. Specifically, the core displacements can be either confined to a given crystallographic plane or extended three-dimensionally. The latter cores, encountered in a broad variety of materials, are responsible for unconventional aspects of plastic deformation such as the breakdown of the Schmid law, anomalous dependence of the yield stress on temperature, unusually strong orientation and temperature dependence of the yield and flow stress, orientation dependence of ductility and brittleness, strong strain-rate sensitivity, etc. [5–10]. Using body-centered-cubic (bcc) metals as an example, we discuss the core features that have to be captured in any theoretical description of dislocation motion and related plastic properties. Finally, we demonstrate by a brief overview that analogous core phenomena are encountered in many materials other than bcc metals. However, prior to the discussion of atomic structures of dislocations, we reflect on the most important “core phenomenon”, dislocation dissociation into partial dislocations separated by metastable stacking-fault-like planar defects.
2. Dislocation Dissociation and Stacking Faults
A vital characteristic of dislocations in crystalline materials is their possible dissociation into partial dislocations with Burgers vectors smaller than the lattice vector. Such dislocation splitting can occur if the displacements corresponding to the Burgers vectors of the partials lead to the formation of metastable planar faults which then connect the partials. The reason for the splitting is, of course, the decrease of the dislocation energy when it is divided into dislocations with smaller Burgers vectors. A well-known example is splitting of 1/2⟨110⟩ dislocations in fcc materials into two Shockley partials with Burgers vectors of the type 1/6⟨112⟩ on {111} planes. The planar fault formed by the 1/6⟨112⟩ displacement is an intrinsic stacking fault. In general, stacking-fault-like defects, which include not only stacking faults but also other planar defects such as antiphase domain boundaries and complex stacking faults encountered in ordered alloys and compounds, can be very conveniently analyzed using the notion of γ-surfaces, first employed in the investigation of possible stacking faults in bcc metals [11]. To introduce the idea of a γ-surface, we first define a generalized stacking fault: imagine that the crystal is cut along a given crystallographic plane and the upper part displaced with respect to the lower part by a vector u, parallel to the plane of the cut, as
shown in Fig. 1. The fault created in this way is called the generalized stacking fault and it is not, in general, metastable.

Figure 1. Formation of the generalized stacking fault: the upper part of the crystal is displaced by the vector u with respect to the lower part.

The energy of such a fault, γ(u), can be evaluated using atomistic and/or density functional theory (DFT) methods; relaxation perpendicular to the fault has to be carried out in such calculations. Repeating this procedure for various vectors u within the repeat cell of the given crystal plane, an energy-displacement surface can be constructed, commonly called the γ-surface. The local minima on this surface determine the displacement vectors of all possible metastable stacking-fault-like defects, and the values of γ at these minima are the energies of these faults. Many such calculations were performed employing a broad variety of descriptions of interatomic interactions. They became particularly popular with the advent of DFT-based methods since γ-surface calculations are much less computation-intensive than studies of dislocations while providing information that is often sufficient for understanding the atomic-level properties of dislocations. Examples are recent calculations of γ-surfaces in bcc transition metals [12], aluminum [13], silicon [13, 14], graphite [15], TiAl [16, 17] and MoSi2 [18, 19]. Symmetry arguments can be utilized to assess the general shape of γ-surfaces. If a mirror plane of the perfect lattice perpendicular to the plane of a generalized stacking fault passes through the point corresponding to a displacement u, the first derivative of the γ-surface along the normal to this mirror plane vanishes owing to the mirror symmetry. This implies that the
γ-surface will possess extrema (minima, maxima or inflexions) for those displacements for which there are at least two non-parallel mirror planes of the perfect lattice perpendicular to the fault. Whether any of the extrema correspond to minima, and thus to metastable faults, can often be determined by considering the change in the nearest-neighbor configuration produced by the corresponding displacement. Hence, the symmetry-dictated metastable stacking-fault-like defects can be ascertained on a crystal plane by analyzing its symmetry. Such faults are then common to all materials with a given crystal structure. The intrinsic stacking faults in fcc crystals are symmetry-dictated since three mirror planes of the {101} type pass through the points that correspond to the displacements 1/6⟨112⟩. However, other minima than those associated with symmetry-dictated extrema may exist in any particular material. These cannot be anticipated on crystallographic grounds; their existence depends on the details of atomic interactions and they can only be revealed by calculations of the γ-surface. The primary significance of the dislocation dissociation is that it uniquely determines the slip planes, identified with the planes of splitting and corresponding stacking-fault-like defects, and, consequently, the operative slip systems. The fact that {111} planes are the slip planes in fcc materials is a typical example. However, the core of individual partial dislocations may still spread spatially and introduce effects similar to those found in undissociated dislocations with non-planar cores. In summary, when analyzing dislocation core structure, possible splitting into well-defined partials separated by metastable stacking-fault-like defects has to be considered first. If such splitting cannot occur, either because no metastable stacking-fault-like defects exist or because their energy is so high that the splitting is not favored, the core of total dislocations has to be studied. In the opposite case, investigation of the cores of the corresponding partials needs to be performed.
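The construction of a γ-surface described above maps directly onto a simple loop over displacements. In the minimal sketch below, fault_energy is a hypothetical placeholder for the atomistic or DFT evaluation of the generalized stacking-fault energy (with relaxation perpendicular to the fault), as discussed in the text:

```python
import numpy as np

def gamma_surface(fault_energy, a1, a2, n=16):
    """Map gamma(u) over the repeat cell spanned by in-plane vectors a1, a2.

    fault_energy(u) stands in for an atomistic/DFT calculation of the
    generalized stacking-fault energy for displacement u (a 3-vector),
    relaxed perpendicular to the fault plane.
    """
    gamma = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            u = (i / n) * a1 + (j / n) * a2   # displacement within the repeat cell
            gamma[i, j] = fault_energy(u)
    return gamma

# The local minima of the returned array locate the metastable
# stacking-fault-like defects; the gamma values there are their energies.
```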
3. Dislocation Cores
In general, the dislocation core structure can be described in terms of the relative displacements of atoms in the core region. These displacements are usually not distributed isotropically but are confined to certain crystallographic planes. Two distinct types of dislocation cores have been found, depending on the mode of the distribution of significant atomic displacements. When the core displacements are confined to a single crystallographic plane the core is planar∗. In metallic materials, dislocations with such cores usually glide easily in the planes of the core spreading and their Peierls stress is commonly low. However, the Peierls stress can be high even for planar cores in covalently bonded solids, owing to the need to break covalent bonds during dislocation glide. In contrast, if the core displacements spread into several non-parallel planes of the zone of the dislocation line, the core is non-planar. The glide planes of dislocations with such cores are often not defined uniquely; the corresponding Peierls stress is high, and well below the melting temperature the dislocation glide is enabled by thermal activation over the Peierls barriers.

∗ A more complex planar core, called zonal, may be spread into several adjacent parallel crystallographic planes of the same type [20].
3.1. Planar Cores
Only full-scale atomistic calculations can reveal all the details of the core structure for both planar and non-planar cores. However, for planar cores semi-atomistic models of the Peierls–Nabarro type are capable of describing the cores with high precision [3]. In such models the core is regarded as a continuous distribution of dislocations in the plane of the core spreading. If we choose the coordinate system in the plane of the core spreading such that the axes x_1 and x_2 are parallel and perpendicular to the dislocation line, respectively, the corresponding density of the continuously distributed dislocations has two components ρ_α = ∂u_α/∂x_2 (α = 1, 2), where u_α is the α component of the displacement vector u in the (x_1, x_2) plane. The displacement u increases gradually in the direction x_2 from zero to the Burgers vector, so that ∫_{−∞}^{+∞} ρ_α dx_2 = b_α, where b_α is the corresponding component of the Burgers vector. In the continuum approximation the elastic energy of such a dislocation distribution can be expressed as the interaction energy of the dislocations within this distribution:

E_el = Σ_{α,β=1}^{2} ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} K_{αβ} ρ_α(x_2) ρ_β(x_2′) ln|x_2 − x_2′| dx_2 dx_2′    (1)
where K_{αβ} are constants depending on the elastic moduli and the orientation of the dislocation line. On the atomic scale the displacement across the plane of the core spreading causes a disregistry that leads to an energy increase. The local displacement u produces a generalized stacking fault, and in the local approximation the energy associated with the disregistry can be approximated as

E_γ = ∫_{−∞}^{+∞} γ(u) dx_2    (2)

where γ(u) is the energy of the corresponding γ-surface for the displacement u. The continuous distribution of dislocations describing the core structure is
then found by functional minimization of the total energy E_tot = E_el + E_γ with respect to the displacement u [21]∗.

∗ If the displacement vector is at all times parallel to the Burgers vector, so that only the component u in this direction needs to be considered, and ∂γ/∂u ≠ 0 for any finite value of u, the Euler equation corresponding to the condition δE_tot = 0 leads to the well-known Peierls equation [21].

The salient feature of this model is that the energy E_tot does not depend on the position of the core, and thus no energy change would be found when the dislocation moves. Nevertheless, the Peierls stress can be evaluated by re-introducing the discrete structure of the lattice, placing the atoms of this lattice into the positions determined by the displacement field u obtained from the model. The Peierls stress and the Peierls barrier can then be evaluated by gradually moving this displacement distribution along the slip plane by one period along the glide plane. This is the approach adopted originally by Nabarro [22]. The result is a rather low Peierls stress in the case of planar cores. This is seen from the analytical expression obtained for the original model with a sinusoidal restoring force, for which the Peierls stress is τ_P = (2μ/α) exp(−4πζ/b); μ is the shear modulus, α a factor of the order of one, and ζ the width of the core. This stress is very small even when the core is very narrow; for example, if ζ = b, τ_P ≈ 10⁻⁵ μ. Comparisons with full-scale atomistic calculations have shown a good agreement with this approximate approach in a number of cases, in particular when the cores are wide. A variety of improvements have been suggested recently in which the atomic structure is taken into account explicitly in the model rather than a posteriori [13, 23, 24]. However, the model generally performs poorly for cores whose width is comparable with the lattice spacing and when covalent bonds dominate crystal bonding. In the former case the continuum description of the core is poor since it is applied to distances smaller than the lattice spacing, and in the latter case possible breaking and/or formation of covalent bonds is not included in the model.
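The Nabarro procedure just described, placing atoms on the disregistry profile and translating the profile by one lattice period, can be sketched compactly. The Python below is illustrative only (dimensionless, with assumed parameters); it uses the classical arctan profile and a sinusoidal misfit energy, not any calculation from the text:

```python
import numpy as np

# Estimate the Peierls barrier by summing the misfit energy over discrete
# atomic rows at x = m*a as the arctan disregistry profile (half-width zeta)
# is translated by alpha across one lattice period.
b, a, zeta = 1.0, 1.0, 1.0          # Burgers vector, row spacing, core width
gamma_max = 1.0                      # scale of the misfit-energy density

def misfit_energy(alpha, n_rows=2000):
    m = np.arange(-n_rows, n_rows + 1)
    u = (b / np.pi) * np.arctan((m * a - alpha) / zeta) + b / 2.0
    # sinusoidal approximation to gamma(u), as in the original model
    return a * np.sum(gamma_max * np.sin(np.pi * u / b) ** 2)

alphas = np.linspace(0.0, a, 50)
E = np.array([misfit_energy(al) for al in alphas])
print("Peierls barrier (per unit length):", E.max() - E.min())
# The oscillation amplitude is tiny for wide cores, consistent with the
# exponentially small tau_P quoted above.
```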
3.2. Non-planar Cores
The non-planar cores can be divided into two classes: cross-slip and climb cores. In the former case the core displacements lie in the planes of the core spreading, while in the latter case they possess components perpendicular to these planes. Climb cores are less common and are usually formed at high temperatures by a climb process. The best known example of the cross-slip core is the core of 1/2⟨111⟩ screw dislocations in bcc metals. Atomistic studies of dislocations are the principal source of our understanding of the structure of such cores since direct observations are mostly outside the limits of experimental techniques. Two alternate structures of the core of the 1/2[111]
Figure 2. Two alternate core structures of the 1/2[111] screw dislocation depicted using differential displacement maps. (a) Calculated using Finnis–Sinclair type central force potential for Mo [25], (b) Calculated using a bond-order potential for Mo [27]. The atomic arrangement is shown in the projection perpendicular to the direction of the dislocation line ([111]) and circles represent atoms within one period. An arrow between them represents the [111] (screw) component of the relative displacement of the neighboring atoms produced by the dislocation. The length of the arrows is proportional to the magnitude of these components. The arrows, which indicate out-of-plane displacements, are always drawn along the line connecting neighboring atoms and their length is normalized such that it is equal to the separation of these atoms in the projection when the magnitude of their relative displacement is equal to |1/6 [111]|.
screw dislocation found by atomistic modeling are presented in Fig. 2. The core in Fig. 2a is spread asymmetrically into the (1̄01), (01̄1) and (1̄10) planes that belong to the [111] zone and is not invariant with respect to the [1̄01] diad; another energetically equivalent configuration related by this symmetry operation exists, and this core is called degenerate or polarized. The core in Fig. 2b is invariant with respect to the [1̄01] diad and is called non-degenerate or non-polarized. The core structures shown in Figs. 2a and b were found by atomistic calculations employing central-force many-body potentials [25] and tight-binding and/or density functional theory based approaches [26, 27], respectively. This example demonstrates that the structure of the core is not determined solely by the crystal structure but may vary from material to material with the same crystal structure. Non-planar dislocation cores become more common as the crystal structure becomes more complex, and for this reason such cores are more prevalent than planar cores. In this respect, fcc materials (and also hcp materials with basal slip), in which the dislocations possess planar cores, are a special case rather than a prototype for more complex structures [5]∗.

∗ Additional complexities of the dislocation core structures arise in covalent crystals, where the breaking and/or readjustment of the bonds in the core region may be responsible for a high lattice friction stress, and in ionic solids, where the cores can be charged, which then strongly affects the dislocation mobility. Such dislocation cores affect not only the plastic behavior but also electronic and/or optical properties of covalently bonded semiconductors and ionically bonded ceramic materials.
3.3. Glide of Dislocations with Non-planar Cores and Breakdown of the Schmid Law
The Peierls stress of dislocations with non-planar cores is typically at least an order of magnitude higher than that of dislocations with planar cores. Furthermore, the movement of such dislocations is frequently affected not just by the shear stress parallel to the Burgers vector in the slip plane but by other stress components. Thus the deformation behavior of materials with non-planar dislocation cores may be very complex, often displaying unusual orientation dependencies and breakdown of the Schmid law. It has been known ever since the first studies of the plastic deformation of bcc metals that the Schmid law does not apply [7, 28] and, indeed, already the early atomistic calculations showed that the non-planar core has to transform prior to the dislocation motion and that this transformation may be influenced by shear stresses in the slip direction acting in planes other than the slip plane [29]. Furthermore, more recent calculations revealed that such transformation may also be affected by shear stresses in the direction perpendicular to the Burgers vector [25, 30, 31]. This is well demonstrated by calculating the Peierls stress as a function of the shear stresses perpendicular to the Burgers vector while keeping fixed the plane of the maximum resolved shear stress parallel to the Burgers vector (MRSSP). As suggested by Ito and Vitek [25], this can be achieved by applying the stress tensor with the components σ_11 = −τ_2, σ_22 = τ_2, σ_33 = σ_12 = σ_13 = σ_23 = 0 in the right-handed coordinate system with the x_1 axis in the MRSSP, the x_2 axis perpendicular to the MRSSP and the x_3 axis parallel to [111], together with the shear stress parallel to the Burgers vector. Figure 3 shows the calculated dependence of the Peierls stress, τ_M, for the core shown in Fig. 2b, on τ_2 when the MRSSP is the (1̄01) plane. It is important to note that changing the sign of τ_2 corresponds to a rotation of the coordinates by 180° around the [111] axis. However, the core structure is not invariant with respect to this transformation, and thus the effect of τ_2 upon the dislocation behavior will, in general, be different for positive and negative values, as observed. The values of τ_M corresponding to tensile and compressive loadings along the [2̄38] and [012] axes are also shown in Fig. 3; for each uniaxial loading the value of τ_2 is uniquely related to the loading stress at which the dislocation started to move. For these axes the (1̄01) plane is the MRSSP, and if there were no effect of shear stresses perpendicular to the Burgers vector the value of τ_M would be the same for both axes, as well as for tension and compression. Obviously, this is not the case: the magnitude of τ_M depends on the shear stresses perpendicular to the slip direction, described by τ_2, which are different for different uniaxial loadings. Moreover, for large negative τ_2 the slip plane changes from the (1̄01) plane to the (01̄1) plane although the shear
stress in the direction of the Burgers vector is lower in this plane. Apparently, the shear stress perpendicular to the Burgers vector alters the core such that transformation into the (01̄1) plane is preferred.

Figure 3. Dependence of the Peierls stress, τ_M (in units of C_44), on τ_2/C_44, where τ_2 defines the shear stress perpendicular to the Burgers vector, when the MRSSP is the (1̄01) plane.

It should be emphasized at this point that the phenomena discussed above have been observed for both polarized and non-polarized cores [25]; similarly, the polarity has not been found to influence significantly the magnitude of the Peierls stress. This is in contrast with recent suggestions that the polarity plays a considerable role in the glide of dislocations with non-planar cores [32]. The reason is that when a stress is applied the symmetry of the non-polarized core is broken and becomes the same as that of the polarized one. This has recently been discussed in detail by Vitek [33]. The results of atomistic calculations suggest that the motion of the 1/2[111] screw dislocation is governed by four distinct shear stress components when it glides in the (1̄01) plane. The first two are the Schmid stress, i.e., the stress in the direction of the Burgers vector in the slip plane (1̄01), and the shear stress parallel to the Burgers vector in another {110} plane of the [111] zone. The other two are shear stresses perpendicular to the Burgers vector acting in two different {110} planes of the [111] zone. As discussed in detail in [34, 35], the effects of non-glide stresses may enter the flow rules for a single crystal
loaded by a stress tensor σ by introducing, for a slip system α, an effective yield stress

τ*α = τα + Σ_{η=1}^{3} aηα τηα    (3)
Here τα = nα·σ·mα is the Schmid stress; nα is the slip plane normal and mα the slip direction. τηα = nηα·σ·mηα are the non-glide shear stresses; nηα is the normal to the {110} plane η, and mηα is the direction in this plane, either parallel or perpendicular to the Burgers vector, depending on which type of shear stress is considered. At the onset of plastic flow this effective yield stress has to attain a critical value τ*α_cr. The coefficients aηα and the value of τ*α_cr are determined so as to fit the results of atomistic calculations of the dislocation motion. This criterion reduces to the Schmid law if there is no influence of non-glide stresses and all aηα = 0. All the above results of atomistic studies relate to the glide of straight dislocations at 0 K, but at finite temperatures the dislocation motion is aided by thermal activation, and a generally accepted mechanism of the dislocation motion involves the formation and extension of pairs of kinks. Assuming the usual rate theory, the dislocation velocity is then
v ∝ exp[−(U − W)/kB T]    (4)
where kB is the Boltzmann constant, T the temperature, U = U_activated − U_ground the difference in energy between the activated state and the state prior to the activation, and W = τα·b·A the work done by the shear stress parallel to the Burgers vector (the Schmid stress) during the activation process; b is the magnitude of the Burgers vector and A the area swept by the dislocation segment during activation. This mechanism of thermally activated overcoming of the Peierls barriers has been employed in the framework of dislocation theory by a number of authors, starting with the classical paper of Dorn and Rajnak [36] and continuing in later developments [37–39]. In these developments U had either been taken as constant or considered as a function of the Schmid stress τα for a given slip system α. The same applies in the recent atomistic studies of kinks [40–43]. However, to include fully the effects of non-glide stresses at finite temperatures it has to be recognized that both U_activated and U_ground are functions of the applied stress tensor, so that U = f(σij), where σij includes both glide and non-glide stress components affecting the dislocation motion. In principle, this could be achieved by molecular dynamics simulations of the formation of kink pairs at finite temperatures with applied stresses involving both glide and non-glide components. However, such calculations are feasible only for very
high stresses, and thus very large strain rates, even when using the most powerful supercomputers [44]. The reason is that at usual strain rates the frequency of the formation of kink pairs is many orders of magnitude lower than the frequency of atomic vibration. Hence, the as yet unsolved task is to develop a mesoscopic theory of the formation of kink pairs that will utilize the results of atomistic studies of the motion of straight dislocations to establish the dependence of U on all components of the stress tensor that govern the dislocation glide. A possible approach is to consider that U is a function of the effective yield stress τ*α such that U = 0 when τ*α = τ*α_cr.
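For a given applied stress tensor, the effective yield stress of Eq. (3) and the velocity factor of Eq. (4) are simple to evaluate once the non-glide systems and fitted coefficients are supplied. A minimal Python sketch follows; all inputs here are placeholders to be taken from atomistic data, as described above:

```python
import numpy as np

kB = 8.617e-5  # Boltzmann constant (eV/K)

def resolved(sigma, n, m):
    """Shear stress n . sigma . m for unit vectors n (plane normal) and m."""
    return n @ sigma @ m

def effective_yield_stress(sigma, n0, m0, non_glide, a):
    """tau*alpha = tau_alpha + sum_eta a_eta * tau_eta  -- Eq. (3).

    n0, m0    : slip-plane normal and slip direction of the Schmid system
    non_glide : list of (n_eta, m_eta) unit-vector pairs for the non-glide stresses
    a         : coefficients a_eta, fitted to atomistic calculations
    """
    tau = resolved(sigma, n0, m0)
    return tau + sum(ae * resolved(sigma, ne, me)
                     for ae, (ne, me) in zip(a, non_glide))

def velocity_factor(U, W, T):
    """Relative dislocation velocity, v ~ exp(-(U - W)/kB T)  -- Eq. (4)."""
    return np.exp(-(U - W) / (kB * T))

# Yield on system alpha occurs when effective_yield_stress(...) reaches the
# critical value tau*_cr, likewise fitted to atomistic results.
```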
4. Conclusions: Generality of the Core Effects
Non-planar dislocation cores, an example of which is the 1/2⟨111⟩ screw dislocation in bcc metals, are common in many materials. In hexagonal crystals such cores are not found when the slip is confined to basal planes, but they are common in the case of prismatic and/or pyramidal slip [45], where they are responsible for a strong temperature dependence of the yield stress. Similarly, non-planar dislocation cores have been identified in many intermetallic compounds. Examples are screw dislocations in NiAl [46], TiAl [47, 48] and MoSi2 [19, 49], where they are, presumably, responsible for an unusually large orientation dependence of the yield stress [50]. The transformation of ⟨110⟩ superdislocations in Ni3Al from the planar glissile form to the non-planar sessile form is responsible for the anomalous increase of the flow stress with temperature [10, 51]. However, non-planar cores leading to unusual temperature, strain-rate and orientation dependencies have been found not only in metallic materials but also in materials such as olivine [52], sapphire [53] and anthracene [54]. Recently, an unusual ductile-to-brittle transition has been observed in the perovskite SrTiO3, which deforms plastically at room temperature but becomes brittle at about 1000 K [55]. This unusual behavior is most likely associated with the formation at high temperatures of non-planar dislocation cores associated with local changes in stoichiometry [56]. In covalently bonded semiconductors it is both atomic and electronic effects which significantly affect the Peierls barrier [57], and this may lead to very complex mechanisms of kink formation [58]. In all these cases atomic-level modeling is capable of identifying the essential features of dislocation cores and the stress components governing the dislocation motion. The development of mesoscopic models of thermally activated dislocation motion that incorporate the results of atomic-level calculations is then the next essential step. However, at present such studies are either very rudimentary or non-existent, and this is, therefore, an open avenue for further research linking atomic-level, nano-scale and macroscale deformation properties.
References
[1] J. Friedel, Dislocations, Pergamon Press: Oxford, 1964.
[2] F.R.N. Nabarro, Theory of Crystal Dislocations, Clarendon Press: Oxford, 1967.
[3] J.P. Hirth and J. Lothe, Theory of Dislocations, Wiley-Interscience: New York, 1982.
[4] D. Hull and D.J. Bacon, Introduction to Dislocations, Butterworth–Heinemann, Oxford, 2001.
[5] A. Cottrell, “Closing Remarks,” In: M. Lorretto (ed.), Dislocations and Properties of Real Materials, London: The Institute of Metals, 378–381, 1985.
[6] V. Vitek, “Effect of dislocation core structure on the plastic properties of metallic materials,” In: M. Lorretto (ed.), Dislocations and Properties of Real Materials, London: The Institute of Metals, 30–50, 1985.
[7] M.S. Duesbery, “The dislocation core and plasticity,” In: F.R.N. Nabarro (ed.), Dislocations in Solids, Amsterdam: North Holland, vol. 8, p. 67, 1989.
[8] M.S. Duesbery and G.Y. Richardson, “The dislocation core in crystalline materials,” CRC Critical Reviews in Solid State and Materials Science, 17, 1, 1991.
[9] V. Vitek, “Structure of dislocation cores in metallic materials and its impact on their plastic behaviour,” Prog. Mater. Sci., 36, 1–27, 1992.
[10] V. Vitek, D.P. Pope, and J.S. Bassani, “Anomalous yield behaviour of compounds with L1(2) structure,” In: F.R.N. Nabarro (ed.), Dislocations in Solids, Amsterdam: North Holland, vol. 10, 135–185, 1995.
[11] V. Vitek, “Intrinsic stacking faults in bcc crystals,” Philos. Mag. A, 18, 773, 1968.
[12] S.L. Frederiksen and K.W. Jacobsen, “Density functional theory studies of screw dislocation core structures in bcc metals,” Philos. Mag., 83, 365–375, 2003.
[13] G. Lu, N. Kioussis, V.V. Bulatov, and E. Kaxiras, “Generalized-stacking-fault energy surface and dislocation properties of aluminum,” Phys. Rev. B, 62, 3099–3108, 2000; “The Peierls–Nabarro model revisited,” Philos. Mag. Lett., 80, 675–682, 2000.
[14] Y.M. Juan and E. Kaxiras, “Generalized stacking fault energy surfaces and dislocation properties of silicon: a first-principles theoretical study,” Philos. Mag. A, 74, 1367–1384, 1996.
[15] R.H. Telling and M.I. Heggie, “Stacking fault and dislocation glide on the basal plane of graphite,” Philos. Mag. Lett., 83, 411–421, 2003.
[16] J. Ehmann and M. Fähnle, “Generalized stacking-fault energies for TiAl: mechanical instability of the (111) antiphase boundary,” Philos. Mag. A, 77, 701–714, 1998.
[17] S. Znam, D. Nguyen-Manh, D.G. Pettifor et al., “Atomistic modelling of TiAl. I. Bond-order potentials with environmental dependence,” Philos. Mag., 83, 415–438, 2003.
[18] U.V. Waghmare, V. Bulatov, E. Kaxiras et al., “⟨331⟩ slip on {013} planes in molybdenum disilicide,” Philos. Mag. A, 79, 655–663, 1999.
[19] T.E. Mitchell, M.I. Baskes, R.G. Hoagland, and A. Misra, “Dislocation core structures and yield stress anomalies in molybdenum disilicide,” Intermetallics, 9, 849–856, 2001.
[20] M.L. Kronberg, “Zonal dislocations in hcp crystals,” Acta Metall., 9, 970, 1961.
[21] J.W. Christian and V. Vitek, “Dislocations and stacking faults,” Rep. Prog. Phys., 33, 307, 1970.
[22] F.R.N. Nabarro, “Dislocations in a simple cubic lattice,” Proc. Phys. Soc. London B, 59, 256, 1947.
[23] K. Ohsawa, H. Koizumi, H.O.K. Kirchner, and T. Suzuki, “The critical stress in a discrete Peierls–Nabarro model,” Philos. Mag. A, 69, 171–181, 1994.
[24] G. Schoeck, “The Peierls energy revisited,” Philos. Mag. A, 79, 2629–2636, 1999.
[25] K. Ito and V. Vitek, “Atomistic study of non-Schmid effects in the plastic yielding of bcc metals,” Philos. Mag. A, 81, 1387–1407, 2001.
[26] C. Woodward and S.I. Rao, “Ab initio simulation of isolated screw dislocations in bcc Mo and Ta,” Philos. Mag. A, 81, 1305–1316, 2001.
[27] M. Mrovec, D. Nguyen-Manh, D.G. Pettifor, and V. Vitek, “Bond-order potential for molybdenum: application to dislocation behavior,” Phys. Rev. B, 69, 094115, 2004.
[28] J.W. Christian, “Some surprising features of the plastic deformation of body-centered cubic metals and alloys,” Metall. Trans. A, 14, 1237, 1983.
[29] M.S. Duesbery, V. Vitek, and D.K. Bowen, “Screw dislocations in bcc metals under stress,” Proc. Roy. Soc. London A, 332, 85, 1973.
[30] M.S. Duesbery, “On non-glide stresses and their influence on the screw dislocation core in bcc metals I: the Peierls stress,” Proc. Roy. Soc. London A, 392, 145–173, 1984.
[31] M.S. Duesbery and V. Vitek, “Plastic anisotropy in bcc transition metals,” Acta Mater., 46, 1481–1492, 1998.
[32] G.F. Wang, A. Strachan, T. Cagin et al., “Role of core polarization curvature of screw dislocations in determining the Peierls stress in bcc Ta: a criterion for designing high-performance materials,” Phys. Rev. B, 67, 140101, 2003.
[33] V. Vitek, “Core structure of dislocations in body-centred cubic metals: relation to symmetry and interatomic bonding,” Philos. Mag., 84, 415–428, 2004.
[34] J.L. Bassani, K. Ito, and V. Vitek, “Complex macroscopic plastic flow arising from non-planar dislocation core structures,” Mater. Sci. Eng. A, 319, 97–101, 2001.
[35] V. Vitek, M. Mrovec, and J.L. Bassani, “Influence of non-glide stresses on plastic flow: from atomistic to continuum modeling,” Mater. Sci. Eng. A, 365, 31–37, 2004.
[36] J.E. Dorn and S. Rajnak, “Nucleation of kink pairs and the Peierls mechanism of plastic deformation,” Trans. TMS-AIME, 230, 1052, 1964.
[37] M.S. Duesbery, “Dislocation motion in Nb,” Philos. Mag., 19, 501, 1969.
[38] C.H. Woo and M.P. Puls, “The Peierls mechanism in MgO,” Philos. Mag., 35, 1641–1652, 1977.
[39] A. Seeger, “Peierls barriers, kinks, and flow stress: recent progress,” Z. Metallk., 93, 760–777, 2002.
[40] A.H.W. Ngan and M. Wen, “Dislocation kink-pair energetics and pencil glide in body-centered-cubic crystals,” Phys. Rev. Lett., 87, 075505, 2001.
[41] S.I. Rao and C. Woodward, “Atomistic simulations of (a/2)⟨111⟩ screw dislocations in bcc Mo using a modified generalized pseudopotential theory potential,” Philos. Mag. A, 81, 1317–1327, 2001.
[42] L.H. Yang and J.A. Moriarty, “Kink-pair mechanisms for a/2⟨111⟩ screw dislocation motion in bcc tantalum,” Mater. Sci. Eng. A, 319, 124–129, 2001.
[43] L.H. Yang, P. Söderlind, and J.A. Moriarty, “Accurate atomistic simulation of (a/2)⟨111⟩ screw dislocations and other defects in bcc tantalum,” Philos. Mag. A, 81, 1355–1385, 2001.
[44] J. Marian, W. Cai, and V.V. Bulatov, “Dynamic transitions from smooth to rough to twinning in dislocation motion,” Nature Materials, 3, 158–163, 2004.
[45] D.J. Bacon and V. Vitek, “Atomic-scale modeling of dislocations and related properties in the hexagonal-close-packed metals,” Metall. Mater. Trans. A, 33, 721–733, 2002.
[46] R. Schroll, V. Vitek, and P. Gumbsch, “Core properties and motion of dislocations in NiAl,” Acta Mater., 46, 903–918, 1998.
[47] R. Porizek, S. Znam, D. Nguyen-Manh, V. Vitek, and D.G. Pettifor, “Atomistic studies of dislocation glide in γ-TiAl,” In: E.P. George, H. Inui, M.J. Mills, and G. Eggeler (eds.), Defect Properties and Related Phenomena in Intermetallic Alloys, Pittsburgh: Materials Research Society, vol. 753, p. BB4.3.1–BB4.3.6, 2003.
[48] C. Woodward and S.I. Rao, “Ab initio simulation of a/2⟨110⟩ screw dislocations in γ-TiAl,” Philos. Mag., 84, 401–414, 2004.
[49] M.I. Baskes and R.G. Hoagland, “Dislocation core structures and mobilities in MoSi2,” Acta Mater., 49, 2357–2364, 2001.
[50] K. Ito, H. Inui, Y. Shirai, and M. Yamaguchi, “Plastic deformation of MoSi2 single crystals,” Philos. Mag. A, 72, 1075–1097, 1995.
[51] T. Kruml, E. Conforto, B. LoPiccolo, D. Caillard, and J.L. Martin, “From dislocation cores to strength and work-hardening: a study of binary Ni3Al,” Acta Mater., 50, 5091–5101, 2002.
[52] J.-P. Poirier and B. Vergobbi, “Unusual deformation properties of olivine,” Physics of the Earth and Planetary Interiors, 16, 370–382, 1978.
[53] J. Chang, C.T. Bodur, and A.S. Argon, “Pyramidal edge dislocation cores in sapphire,” Philos. Mag. Lett., 83, 659–666, 2003.
[54] N. Ide, I. Okada, and K. Kojima, “Computer simulation of core structure and Peierls stress of dislocations in anthracene crystals,” J. Phys.: Condens. Matter, 5, 3151–3162, 1993.
[55] P. Gumbsch, S. Taeri-Baghbadrani, D. Brunner et al., “Plasticity and an inverse brittle-to-ductile transition in strontium titanate,” Phys. Rev. Lett., 87, 085505, 2001.
[56] Z.L. Zhang, W. Sigle, W. Kurtz et al., “Electronic and atomic structure of a dissociated dislocation in SrTiO3,” Phys. Rev. B, 66, 214112, 2002; “Atomic and electronic characterization of the a[100] dislocation core in SrTiO3,” Phys. Rev. B, 66, 094108, 2002.
[57] J.F. Justo, A. Antonelli, and A. Fazzio, “The energetics of dislocation cores in semiconductors and their role in dislocation mobility,” Physica B, 302, 398–402, 2001.
[58] V.V. Bulatov, J.F. Justo, W. Cai et al., “Parameter-free modelling of dislocation motion: the case of silicon,” Philos. Mag. A, 81, 1257–1281, 2001.
Perspective 33 3-D MESOSCALE PLASTICITY AND ITS CONNECTIONS TO OTHER SCALES Ladislas P. Kubin LEM, CNRS-ONERA, 29 Av. de la Division Leclerc, BP 72, 92322 Chatillon Cedex, France
1. Multiscale Analysis
There is a dislocation theory, but there is no dislocation theory of plasticity. The reasons for this situation are multiple. By definition, a dislocation is a linear defect that ensures compatibility between slipped and unslipped parts of a crystal. The motion of this defect propagates microscopic shears and is responsible for the plastic (i.e., permanent) deformation of crystalline materials. A dislocation line can be viewed in two different manners. From an atomistic viewpoint, it consists of a highly distorted region, the core, which surrounds the geometrical line of the defect and has a diameter of a few lattice spacings. In the continuum, a dislocation consists of a singularity line to which are associated long-range stress and strain fields. The line energy of a dislocation is mainly located outside the core region and can conveniently be calculated by elasticity theory. This allows treating all the elementary dislocation properties that derive from their self- and interaction energies to any desired degree of accuracy. However, core properties also govern important behaviors, like dislocation mobility or the selection of the dislocation slip planes. Although our current knowledge of core properties has significantly improved in the past years, it is still far from being satisfactory. Thus, dislocation theory has not yet been able to bridge the gap between atomic-scale properties and mesoscale properties (the mesoscale is understood here as the scale of the defect microstructure). At the mesoscale, the treatment of dislocation ensembles and the spontaneous emergence of dislocation patterns poses several types of problems. It is now understood that plasticity is a dissipative process far from thermal equilibrium. In such conditions, which are beyond the reach of thermodynamics, dislocation patterning is now modeled as a purely dynamic phenomenon.
Nevertheless, there is no common agreement on the mechanisms that trigger the formation of self-organized dislocation microstructures, or on the way the latter influence the mechanical response. Finally, in many important situations, of fundamental or applied relevance, a connection has to be established with continuum formulations. Indeed, only the latter can compute, through a proper treatment of boundary conditions, the complex field of a dislocated crystal in a specimen with finite dimensions. The objective of what is called multiscale analysis is to reach, through a combination of simulation and modeling, a physically-based picture of the plasticity of crystalline solids. Within this goal, 3-D dislocation dynamics (DD) simulations have been developed in the last decade as a complement to older, well-established simulation methods that exist in the domains of solid-state physics and solid mechanics.
2. Connection to the Continuum
At the mesoscale, dislocation core properties can only be incorporated by imposing “local rules”, which translate atomic-scale properties into a continuum formulation. Indeed, in the absence of these rules, DD simulations make little sense, especially if they cannot treat thermally activated processes. Since atomistic simulations provide everything but analytical models, one has to make use of intermediate models or approaches to implement this connection. The most important core mechanisms are related to changes in core structure and energy under stress. They involve small energy changes, in the eV range, and are, therefore, assisted by thermal fluctuations. Thus, they can be described with the help of rate equations, for instance semi-empirical Arrhenius forms, forms derived from elastic models for the core structure, or Monte Carlo simulations. This information-passing procedure between the atomic and mesoscopic scales presents a major advantage, that of decoupling the time and length scales between the two types of simulations. As thermally activated events are slow events, with a time scale of the order of a fraction of a second in conditions of conventional deformation testing, they cannot be fully treated by molecular dynamics (MD) simulations. The search for saddle-point configurations is then performed with the help of various types of energy minimization techniques. In this domain, progress has been slow but continuous, taking advantage of the now possible comparison between ab initio calculations and quantum-based potentials. The most important results recorded in the past years are concerned with the lattice friction (also called the Peierls stress), especially in bcc metals. In silicon, the absence of consistency between experiment, continuum modeling and atomistic modeling is still a bit confusing. Studies of the cross-slip mechanism in fcc crystals have confirmed the physical picture yielded by early
elastic models, but have stopped short of providing stress-dependent activation energies and scaling laws for fcc metals. The question of solute hardening, notably in the presence of lattice friction, is still an open domain. Studies of fast-moving dislocations by MD simulations have potential in relation to dynamic deformation tests (i.e., shock loading). A critical area of investigation is concerned with the study of athermal or weakly thermally assisted events, of which the most important one is dislocation generation in dislocation-free volumes, through either homogeneous or heterogeneous processes (occurring at surfaces, interfaces, voids, crack tips, etc.). At present, the lack of local rules in this domain constitutes a serious handicap for mesoscale simulations. The investigation of fundamental dislocation mechanisms at the atomic scale and the completion of dislocation theory is, indeed, a challenging task. Although this exercise may seem less appealing than the production of colorful but complex events, it is critical for the progress of multiscale analyses.
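The Arrhenius-form local rules mentioned at the start of this section are typically implemented as stochastic event rates within a DD time step. A minimal sketch follows (Python; the attempt frequency, activation energy and time step are hypothetical values, and the stress dependence of the activation energy would come from atomistic input):

```python
import numpy as np

kB = 8.617e-5          # Boltzmann constant (eV/K)

def activation_probability(dG, T, nu0=1e11, dt=1e-9):
    """Probability that a thermally activated core event (e.g., cross-slip)
    occurs during a DD time step dt.

    dG  : stress-dependent activation energy (eV), from atomistic input
    nu0 : attempt frequency (1/s), assumed value
    """
    rate = nu0 * np.exp(-dG / (kB * T))
    return 1.0 - np.exp(-rate * dt)

rng = np.random.default_rng(0)
if rng.random() < activation_probability(dG=0.7, T=300.0):
    pass  # perform the event, e.g., switch the segment's glide plane
```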
3. DD Simulations: Limitations and Perspectives
As exemplified by the variety of solutions adopted by the groups working in the field, implementing the elastic properties of dislocations in DD simulations is no longer a challenge. The microstructure may include defects other than dislocations, e.g., small clusters, precipitates, or grain boundaries. This entails the definition of additional local rules of a topological nature, for instance telling dislocations in which conditions they can penetrate precipitates or grain boundaries. However, the elastic theory of dislocations does not provide tractable solutions for the treatment of complex stress fields originating from the conditions of compatible deformation at interfaces. As a result, DD simulations are essentially suited to studies of dislocation dynamics and collective behavior under uniform applied stresses, in crystals of infinite dimensions. Several technical factors (time step, number of interacting segments, dimension of the simulated volume, applied strain rate, available computing power) contribute to the maximum strain that can be achieved by DD simulations and, most of all, to the accuracy of the results. In spite of these limitations, the domain accessible to DD simulations is enormous. Globally, these simulations can be divided into “mass” simulations and “model” simulations, dedicated to the study of one or a few elementary mechanisms. Both are complementary and, as always, cannot be carried out without reference to experiment and current theoretical modeling. A non-exhaustive list of the main topics under investigation, or that can be treated in the near future, is as follows.
– Single-crystal studies, particularly in the absence of lattice friction. Such investigations can be very useful to clear up a number of controversies accumulated over the years on dislocation pattern formation and strain hardening in monotonic and cyclic deformation.
– In materials with high lattice friction, especially covalent or iono-covalent compounds, similar investigations are feasible provided that enough information is available to construct local rules.
– Several models, deterministic or stochastic, some of them constructed along the lines of continuum dislocation theory, describe the collective behavior of dislocations in crystals. These models can benefit from a comparison with simulation data.
– Mechanical properties at medium temperatures (T > 0.3−0.4 Tm, where Tm is the melting temperature) have not been studied up to now by DD simulations. Any attempt to go beyond the current phenomenology of creep mechanisms at intermediate temperatures would be highly welcome. Another topic of practical interest, which has so far been treated only in 2-D, is the strength of alloys containing coherent or semi-coherent particles in the presence of cross-slip or climb.
– A major unknown in plasticity theory is concerned with the composition rules for obstacles of different strengths and of the same or different nature. Only semi-empirical estimates exist in this domain, and mesoscale simulations could be of great help to initiate modeling in a few simple cases.
– The interaction of dislocations with small clusters, in practice small prismatic loops, is involved in two important traditional domains: one is the instabilities and strain localizations observed during the plastic deformation of irradiated materials; the other is dislocation patterning in cyclic deformation.
– The investigation of size effects is another very appealing topic, although it has to be approached with care in the absence of a rigorous solution for the local fields. The influence of reduced dimensionality on line tension effects and dislocation mean free paths is now being investigated. It implies defining new local rules, for instance for the interactions between dislocations and interfaces.

4. Connection to Continuum Mechanical Aspects
Simulations that simultaneously treat discrete dislocations and solve the boundary value problem have enormous potential. In fundamental terms, they allow introducing length scales into the continuum framework. They are particularly suited for the study of size effects in nanostructured materials, of plasticity under strain gradients, of materials with complex microstructures, of complex modes of loading, static or dynamic, and of fracture processes.
Two methods are being developed to establish this connection. One is based on the combination of DD simulations, which treat the plastic strains resulting from dislocation motion, with finite element (FE) codes, which solve for the elastic fields. The existing solutions differ only in the distribution of tasks between the two codes. A variant consists in applying the boundary conditions via a Green function method, when the latter can be computed. The phase field method, which was initially developed to treat stress and concentration fields simultaneously, is now being applied to dislocation problems. At present, it is perhaps too early to assess the potential of this method. Its major advantage resides in the fact that it incorporates a "chemical" dimension that was up to now missing in mesoscale simulations. Another type of connection is possible, which is much less computationally demanding. It simply consists in constructing physically based constitutive formulations for the evolution of dislocation densities and implementing them in an FE code that takes care of the boundary value problem. Although this procedure is not really new, it takes advantage of the recent development of mesoscale simulations, for checking or developing existing models, and of continuum crystal plasticity codes, i.e., codes that include the slip geometry. This methodology is also very attractive for two reasons: it allows use of a substantial existing body of knowledge on constitutive formulations, and it gives access to large-strain behavior. How far it can be developed will depend on its ability to account for the spatial organization of the microstructures.
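As an illustration of what such a physically based constitutive formulation can look like, the minimal sketch below integrates a single-variable dislocation-density evolution law of the classical Kocks–Mecking type together with the Taylor relation for the flow stress. The specific equations and all parameter values are illustrative assumptions, not taken from this text; in practice such an update would run at every integration point of a crystal plasticity FE code, per slip system.

import numpy as np

# Kocks-Mecking type evolution of the total dislocation density rho (m^-2):
#   d(rho)/d(gamma) = k1 * sqrt(rho) - k2 * rho   (storage minus dynamic recovery)
# Taylor relation for the flow stress:
#   tau = alpha * mu * b * sqrt(rho)
alpha, mu, b = 0.3, 42e9, 2.5e-10   # Taylor constant, shear modulus (Pa), Burgers vector (m)
k1, k2 = 2e8, 10.0                  # storage and recovery coefficients (illustrative values)

def flow_curve(rho0=1e10, dgamma=1e-4, steps=20000):
    """Integrate the density evolution over shear strain gamma (explicit Euler)."""
    rho = rho0
    gamma = np.arange(1, steps + 1) * dgamma
    tau = np.empty(steps)
    for i in range(steps):
        rho += dgamma * (k1 * np.sqrt(rho) - k2 * rho)
        tau[i] = alpha * mu * b * np.sqrt(rho)
    return gamma, tau

gamma, tau = flow_curve()
# The flow stress saturates at alpha * mu * b * (k1 / k2), here about 63 MPa.
print(f"tau(gamma = {gamma[-1]:.1f}) = {tau[-1] / 1e6:.1f} MPa")

In a coupled calculation the FE code would supply the resolved stresses and handle compatibility, while internal variables such as rho would be checked or calibrated against mesoscale DD simulations, which is precisely the division of labor described above.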
5. Conclusion
At present, the development of DD simulations is governed on the one hand by the input information generated at the atomic scale and, on the other hand, by the strain limit fixed by the available computing power. Within these bounds, DD simulations are still far from having exhausted their potential. The connection to continuum mechanical aspects constitutes the final step toward a physically based modeling of plasticity, and it will probably take on major importance in the coming years. Indeed, it is essential to keep in mind that this has been from the beginning the main objective of dislocation theory. Mesoscale simulations perhaps still have to confirm that they can go beyond providing textbook illustrations and are able to bring major useful information to materials scientists. This objective will, however, not be reached without a strong synergy with theoretical modeling. Current simulations are now able to reproduce known patterns of behavior, but reproducing does not necessarily mean understanding.
Perspective 34 SIMULATING FLUID AND SOLID PARTICLES AND CONTINUA WITH SPH AND SPAM Wm.G. Hoover Department of Applied Science, University of California at Davis/Livermore and Lawrence Livermore National Laboratory, Livermore, California, 94551-7808
1. SPH and SPAM
In my own research career I have been primarily interested in the statistical mechanics of nonequilibrium systems of particles, stressing new techniques for undertaking and understanding computer simulations [1, 2]. The blind alleys toward which molecular dynamics naturally leads provide a compensating appreciation of continuum mechanics, with its length scale more appropriate to everyday experience. For me, it was a pleasure to learn that the two approaches, microscopic and macroscopic, can be usefully combined, using ideas due to Lucy and Monaghan. About 25 years ago these men independently introduced a new numerical method for simulating continuum problems. They used particles to represent extended macroscopic material elements. Though their method is not at all restricted to fluids, and certainly not to water, Monaghan has consistently adopted the name "smooth-particle hydrodynamics" for it. In the work I have carried out during the last decade I have instead used the alternative "smooth-particle applied mechanics" (SPAM for short), thinking it more apt, particularly for solid-phase applications. A Google search of the internet turns up hundreds of references to SPH and SPAM. The interested reader could begin with my website at http://williamhoover.info or could work backward from Carol's and my review [3]. SPAM defines the local average value (at any location r) of any particle quantity Fi as a quotient of weighted sums:
$$
F(\mathbf{r}) \equiv \frac{\sum_i F_i\, m_i\, w(|\mathbf{r}-\mathbf{r}_i|)}{\sum_i m_i\, w(|\mathbf{r}-\mathbf{r}_i|)}.
$$
All particles lying within a distance h of the location r (h is the "range" of w) are to be included in the sums. The "weight function" w(|r| < h) is short-ranged and must be continuously twice differentiable, resembling a foreshortened Gaussian function. Lucy's choice for w is the simplest polynomial satisfying these conditions:

$$
w(x<1) \propto (1-x)^3(1+3x) = 1 - 6x^2 + 8x^3 - 3x^4, \qquad x \equiv |\mathbf{r}|/h.
$$
The implied multiplicative proportionality constant depends upon the dimensionality of the problem (1, 2, or 3) and is to be chosen such that the spatial integral of w is unity. In applying SPAM to fluids and solids a fixed mass $m_i$ is associated with the $i$th particle, so that $\sum_i m_i w(|\mathbf{r}-\mathbf{r}_i|)$ is the mass density $\rho(\mathbf{r})$, while the density at the location of the $i$th particle is a sum over nearby particle pairs (including the term $i=j$):

$$
\rho_i \equiv \sum_j m_j\, w(|\mathbf{r}_i-\mathbf{r}_j|).
$$
The related identities,

$$
F(\mathbf{r})\rho(\mathbf{r}) \equiv \sum_i F_i\, m_i\, w(|\mathbf{r}-\mathbf{r}_i|) \;\longrightarrow\; \nabla_{\mathbf{r}}\,[F(\mathbf{r})\rho(\mathbf{r})] \equiv \sum_i F_i\, m_i\, \nabla_{\mathbf{r}}\, w(|\mathbf{r}-\mathbf{r}_i|),
$$

show off the method's key advantage. First-order or second-order spatial derivatives (of velocity, temperature, stress, . . . ) can all be evaluated from simple sums involving w and w′. Only the polynomial w is affected by the gradient operation. A consistent application of these ideas leads to new particle forms for the continuum equations of motion,

$$
\ddot{\mathbf{r}}_i = \dot{\mathbf{v}}_i = -m \sum_j \left[ (P/\rho^2)_i + (P/\rho^2)_j \right] \cdot \nabla_i w_{ij},
$$

and for the particle representations of the continuum energy equation,

$$
\dot{e}_i \equiv -\sum_j (m/2)\left[ (P/\rho^2)_i + (P/\rho^2)_j \right] : (\mathbf{v}_j - \mathbf{v}_i)\, \nabla_i w_{ij} \;-\; \sum_j m \left[ (Q/\rho^2)_i + (Q/\rho^2)_j \right] \cdot \nabla_i w_{ij}.
$$
Note that the special case $P \propto \rho^2$ gives particle trajectories isomorphic to those found with molecular dynamics (with w playing the rôle of a pair potential).
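The interpolation machinery above is compact enough to state directly in code. The following minimal Python sketch is our own illustration (the 2D normalization constant 5/(πh²) for Lucy's weight is an assumption we supply); it evaluates the weight function, the particle densities, and the smoothed local average F(r):

import numpy as np

def lucy_w(r, h):
    """Lucy's weight function in 2D, normalized so that its spatial integral is unity."""
    x = r / h
    return np.where(x < 1.0, (5.0 / (np.pi * h**2)) * (1.0 - x)**3 * (1.0 + 3.0 * x), 0.0)

def densities(pos, m, h):
    """rho_i = sum_j m_j w(|r_i - r_j|), including the self term j = i."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)  # all pair distances
    return (m[None, :] * lucy_w(d, h)).sum(axis=1)

def smooth_average(r, pos, m, F, h):
    """F(r) as a quotient of weighted sums over particles within range h of r."""
    w = lucy_w(np.linalg.norm(pos - r, axis=-1), h)
    return (F * m * w).sum() / (m * w).sum()

# Usage example: density field of a small random 2D configuration.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=(100, 2))
m = np.full(100, 1.0 / 100.0)
rho = densities(pos, m, h=0.2)
print(smooth_average(np.array([0.5, 0.5]), pos, m, rho, h=0.2))

The same two sums, with w replaced by its gradient, supply the pressure and heat-flux terms of the motion and energy equations above.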
In a SPAM particle simulation the pressure tensor P and the heat-flux vector Q follow from the underlying continuum constitutive relations (such as Newtonian viscosity and Fourier conductivity). A SPAM simulation proceeds by solving the differential equations for all the particles' $\{\dot{\mathbf{r}}, \dot{\mathbf{v}}, \dot{e}\}$ subject to assumed initial and boundary conditions. (It must be emphasized that developing properly formed algorithms for boundary conditions is a high art form.) There is a tremendous literature on particular fluid and solid applications, including the effects of analogs of the artificial viscosity and artificial heat conductivity encountered in the more usual simulations using regular meshes. An interesting and simple problem, with philosophical connections to fundamental statistical mechanics, is Gibbs' (time-reversible!) expansion problem, in which a fluid expands (irreversibly!) to fill a larger container [4, 5]. See Fig. 1. Smooth particles have an advantage, for multiscale physics, in that their length scale h is arbitrary (it can also be a function of time or direction in space), so that "big" zones can be linked to "smaller" ones, even to zones of molecular dimensions. The particles also promote rezoning in a natural way.
Figure 1. Fourfold free expansion of a 2D gas using smooth-particle applied mechanics. The particles themselves are shown, as well as contours of density and kinetic-energy density, computed using the smooth-particle weight functions. The separation between the white and black regions in the density and kinetic-energy plots is the contour characterizing the corresponding mean value. The simulations show rapid equilibration, on a timescale of a few sound-traversal times. In the figure τ is the sound-traversal time for the (spatially periodic) system.
Because their basic attributes include mass, momentum, and energy, particles can easily be combined or subdivided in such a way as to conserve all three variables. Hybrid atomistic-continuum simulations based on these ideas are another useful application area.
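As a concrete illustration of such conservative rezoning, the sketch below (our construction, not a specific published algorithm) merges two smooth particles into one while conserving mass, momentum, and total energy; the kinetic energy lost when the two velocities are averaged reappears as internal energy:

import numpy as np

def merge(m1, v1, e1, m2, v2, e2):
    """Merge two particles conserving mass, momentum, and total energy.
    v1, v2 are velocity vectors; e1, e2 are specific internal energies."""
    m = m1 + m2                                  # mass conservation
    v = (m1 * v1 + m2 * v2) / m                  # momentum conservation
    E = m1 * (e1 + 0.5 * v1 @ v1) + m2 * (e2 + 0.5 * v2 @ v2)  # total energy
    e = E / m - 0.5 * v @ v                      # internal energy absorbs the difference
    return m, v, e

# Two unequal particles moving toward each other merge into one at rest;
# their relative kinetic energy shows up as heat.
m, v, e = merge(1.0, np.array([1.0, 0.0]), 2.0,
                2.0, np.array([-0.5, 0.0]), 1.0)
print(m, v, e)   # 3.0, [0. 0.], ~1.58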
References

[1] W.G. Hoover, "Computational statistical mechanics," Elsevier, New York, 1991.
[2] Wm.G. Hoover, "Time reversibility, computer simulation, and chaos," World Scientific, Singapore, 1999 and 2001.
[3] Wm.G. Hoover and C.G. Hoover, "Links between microscopic and macroscopic fluid mechanics," Mol. Phys., 101, 1559–1573, 2003.
[4] Wm.G. Hoover and H.A. Posch, "Entropy increase in confined free expansions via molecular dynamics and smooth particle applied mechanics," Phys. Rev. E, 59, 1770–1776, 1999.
[5] Wm.G. Hoover, H.A. Posch, V.M. Castillo, and C.G. Hoover, "Computer simulation of irreversible expansions via molecular dynamics, smooth particle applied mechanics, Eulerian, and Lagrangian continuum mechanics," J. Stat. Phys., 100, 313–326, 2000.
Perspective 35 MODELING OF COMPLEX POLYMERS AND PROCESSES Tadeusz Pakula Max Planck Institute for Polymer Research, Mainz, Germany and Department of Molecular Physics, Technical University, Lodz, Poland
1. Can Complex Macromolecular Architectures Lead to New Properties?
Creating new macromolecular architectures of increasing complexity can constitute a challenge for synthetic chemists, but it is additionally justified if it results in new material properties. For example, joining monomeric units into linear polymer chains results in a dramatic change of the properties of the products with respect to those of the substrates. Whereas a monomer in bulk can usually be only liquid-like or solid (e.g., glassy), a polymer can additionally exhibit a rubbery state with properties which make these materials extraordinary for a large number of applications. This new state is due to the very slow relaxation of polymer chains in comparison with the fast motion of the monomers, especially when the chains become so long that they can entangle in a bulk melt. The dynamic mechanical characteristics indicate a single relaxation in the monomer system (Fig. 1a), in contrast to the two characteristic relaxations in the polymer (Fig. 1b). The rubbery state of the polymer extends in the time scale between the segmental (monomer) and the chain relaxation times and is controlled by a number of parameters related to the polymer structure. The most important among these parameters is the chain length, which determines the ratio of the two relaxation rates. In the rubbery state, the material is much softer than in the solid state. Expressed in terms of the real part of the modulus, the typical glassy-state elasticity is of the order of 10⁹ Pa and higher, whereas the rubber-like elasticity in bulk polymers is of the order of 10⁵–10⁶ Pa. It has been demonstrated recently that some highly branched macromolecular structures can lead to considerably different properties than those obtained
Figure 1. A comparison of the mechanical behavior (storage and loss moduli G′ and G″, in Pa, versus frequency ω, in rad/s) of bulk systems with various molecular complexity: (a) a low-molecular liquid (isobutylene, MW = 186, T = 233 K), showing the liquid and glassy states; (b) a linear entangled polymer (linear PnBA, T = 254 K), showing the segmental relaxation and the polymeric rubber plateau; (c) a molecular brush with PnBA side chains (T = 254 K), showing the side-chain relaxation, the brush relaxation, and a super-soft rubber plateau.
by linking monomeric units into linear chains [1, 2]. Examples include the dynamic behavior of multiarm stars in the melt [3], melts of brush-like macromolecules [2], and hairy micelles dispersed in linear polymer matrices. In all these systems, the more complex structures extend the spectrum of relaxations by a third, well-distinguishable process with the longest relaxation time, which is interpreted as related to slow structural rearrangements in the structured system. When the new process is slow enough, it can create a new elastic plateau with a plateau modulus (10²–10³ Pa) orders of magnitude lower than that characteristic of the conventional polymeric rubbery state (Fig. 1c). Appropriate cross-linking of such structured systems based on macromolecules with complex architectures can lead to bulk super-soft elastomers, for which we expect a broad range of application possibilities [2].
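The qualitative picture of Fig. 1 — each additional slow relaxation process adding another elastic plateau — can be reproduced with a toy spectrum of Maxwell modes. In the sketch below, the three mode strengths and relaxation times are invented to mimic the glassy, rubbery, and super-soft plateaus discussed above; they are illustrative assumptions, not fits to data:

import numpy as np

def moduli(omega, modes):
    """Storage and loss moduli G'(w), G''(w) for a sum of Maxwell modes."""
    Gp = sum(G * (omega * tau)**2 / (1.0 + (omega * tau)**2) for G, tau in modes)
    Gpp = sum(G * (omega * tau) / (1.0 + (omega * tau)**2) for G, tau in modes)
    return Gp, Gpp

omega = np.logspace(-6, 14, 600)                 # rad/s
# (G_k in Pa, tau_k in s): segmental, chain, and slow structural relaxations.
modes = [(1e9, 1e-9), (1e6, 1e-2), (3e2, 1e4)]
Gp, Gpp = moduli(omega, modes)
# Between the relaxation rates of adjacent modes, G' sits on a plateau whose
# height is set by the modes that have not yet relaxed.
for w in (1e12, 1e6, 1e-1):
    i = np.argmin(np.abs(omega - w))
    print(f"omega = {w:.0e} rad/s  ->  G' ~ {Gp[i]:.1e} Pa")

The printout reproduces the three plateau levels of roughly 10⁹ Pa (glassy), 10⁶ Pa (rubbery), and a few times 10² Pa (super-soft).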
2. Where are the Main Obstacles? What is Predictable?
In contrast to the situation for linear polymers, the understanding, theoretical description, and consequently the predictability of the behavior of bulk systems of macromolecules with complex architecture are at a stage in which the problems are more or less recognized but, because of the increased complexity, the methods and solutions are missing. Tackling the problems of complexity in macromolecular systems, in order to provide new ways of advancing our ability to predict, control, and analyze complex macromolecular systems, is therefore required. This concerns both natural and synthetic systems, and consideration of the latter along the whole pathway from substrates to products, i.e., from monomers to bulk materials. Macromolecular chemistry has recently been receiving special attention because of the development of new synthetic methods and skills which make possible the creation of macromolecules with a complexity and precision comparable with those met in biological systems. This includes macromolecules having complex architectures, such as multiarm stars, hyperbranched polymers, dendrimers, comb-like or dendronized polymers, rings, and catenanes, as well as variants of these topologies made additionally complex by various distributions of interacting fragments, such as functional groups, incompatible blocks, or intramolecular composition distributions of chemically different units (Fig. 2). Synthesis of such complex macromolecules requires well-controlled processes, for which a detailed understanding and a possibility of description or modeling is of crucial importance. Existing theoretical methods for the description of synthetic processes in macromolecular systems are not sufficiently developed. The problems result from neglecting the fact that, for bond formation, the presence of reacting sites and catalysts within the reaction volume is necessary.

Figure 2. Some examples of complex macromolecular architectures differing by topology and intramolecular composition distributions: linear, cyclic, star, comb, microgel, and network homopolymers, and block copolymers.

This condition can only be reasonably considered in spatial models of reacting systems, in which substrates and products coexist in space, interacting mutually under reaction-progress-dependent conditions. Therefore, one of the important challenges for the future should be the development of effective tools for modeling complex synthetic processes in 3D space under controlled conditions. Many systems have been developed in which a specific molecular architecture leads to self-organization of molecules into various supramolecular structures. In spite of considerable progress in this field, many questions, such as what and how to synthesize in order to achieve desired functions or properties, remain open. Also in natural systems, the mechanisms of adopting certain structural states and fulfilling certain functions are for most systems not understood. In all these systems, complexity and related physical effects play an important role and cannot be considered within the existing theoretical tools. Experimental analysis of such processes and products is difficult and in many cases leads to misleading results because of unknown effects of the complex macromolecules on the measured quantities (e.g., in size exclusion chromatography). On the other hand, it can be expected that appropriate modeling can provide information concerning the kinetics of such processes and the properties of the products. These can be characterized, for example, by molecular weights, molecular weight distributions, compositions in the case of copolymers, as well as composition heterogeneities and distributions, etc. When the complex molecules are obtained, the questions arise: which properties do they have, which functions can they fulfill, and what are they good for? The further aim of our efforts should, therefore, concern an analysis of the structure and dynamics of the complex macromolecules in order to answer the above questions. The analysis should consider single complex macromolecules, spatially confined aggregates, as well as bulk systems in which specific spatial arrangements, self-organization, and formation of heterophases influence the behavior. The latter structures usually constitute hierarchical supramolecular architectures in which the structural and dynamic complexity requires new methodological developments in order to be analyzed. Such structures can extend over many orders of magnitude in the size scale. The broad size range involves a broad variety of related relaxations which contribute to the dynamics extending over an extremely broad time range. The dynamic spectrum of such materials becomes very important for understanding the correlation between parameters of the molecular and supramolecular structures on the one hand, and their specific functions or macroscopic properties on the other [2]. Analysis of these correlations appears to be extremely difficult because of the complexity of the systems. Experimentally, it usually requires the application of many techniques for characterization of both the structure
and the dynamics. It often, however, leaves many questions open because of limitations in the resolution, accuracy, sensitivity, or selectivity of the experimental methods. The systems are usually too complex to be considered by existing theoretical tools. A hope lies in computer modeling, which, owing to rapid software and hardware developments, is becoming a valuable research tool providing many details concerning structure and dynamics in such complex macromolecular systems [1].
3. Some Perspectives for Progress
The other important scientific objective of future efforts should, therefore, be a methodological improvement in solving problems which appear in complex systems of a large number of molecular units interacting strongly or subjected to strong external constraints. Two aspects are considered crucial for understanding such systems: hierarchy and cooperativity. We can take advantage of recent developments in modeling the dynamics of complex molecular systems. A new liquid model – the Dynamic Lattice Liquid (DLL) model [4] – could be used. The model seems to solve the long-standing problem of cooperativity in dense complex molecular or macromolecular systems. It provides a microscopic picture of cooperative molecular rearrangements (cooperative loops) resulting from system continuity under conditions of excluded volume and dense packing of molecules (Fig. 3). The rearrangements are considered to take place in systems with fluctuating density, with rates dependent on thermal activation barriers which depend on and fluctuate with the local density (intermolecular distances).

Figure 3. Illustration of a dynamic simplification of a rearrangement of beads in a dense system with all lattice sites occupied (single-molecule trajectory vs. collective rearrangement). The rearrangement consists in cooperative displacement of system elements along closed trajectories within which each element replaces one nearest neighbor.
Dependencies of this kind may include the chemical specificity of systems. It has been shown that the model is able to reproduce the extreme cases of behavior of so-called fragile and strong liquids, as well as various cases filling the gap between these extremes [4]. It also reproduces the effects of pressure on the molecular dynamics. The model, which considers detailed microscopic behavior, has become a basis for parallel computer simulation algorithms and in this form can be applied to both liquids and polymers of various complexities [1, 4, 5]. It offers new chances for very efficient molecular modeling.

Computer modeling is nowadays one of the most important sources of information about the details of the behavior of complex molecular and macromolecular systems at the molecular or atomic scale. The main obstacles to exploring the possibilities in this field are limitations in computational speed and the related limits on the spatial resolution of the models considered. The possibilities for overcoming these problems lie both in the development of appropriately efficient software and in the development of suitable super-fast hardware. Recent modeling techniques allow simulating systems over time intervals not exceeding six orders of magnitude and with spatial resolution still far from reaching two orders of magnitude in the 1D size scale. In most problems concerning material development on the nano- and micrometer size scales, as well as in biological systems on the size scale of cells, the requirements for the spatial resolution of models are higher than the resolutions currently accessible in modeling of molecular or macromolecular dynamics and organization.

Tests of models based on cooperative rearrangements on sequentially working computing hardware have already indicated a high potential of this type of model for predicting the behavior of complex molecular systems (CMA algorithms) [1]. Based on this experience, we suggest the development and construction of a multipurpose fast logical unit which can serve as the basic element of a parallel computing system realizing the 3D architecture and the DLL dynamics. A building block of such a DLL system will be a single chip that integrates multiple processors, each representing a single site in the model, and connectivity and communication logic corresponding to the DLL architecture. This should allow a full system to be built by replicating such a chip. A prototype of such a system should allow estimates concerning the finite-size limits of the DLL machine and its computational efficiency, as well as the technological problems related to energy supply, thermal stabilization, spatial requirements, and so on. A schematic illustration of the pathway between the DLL model and this very innovative and technologically ambitious computing system is shown in Fig. 4. Realization of such a system is a great challenge and, when successful, would constitute a unique example of a modeling system having an architecture directly corresponding to the structure of the modeled system. This would allow the use of the highest level of parallelism of computation and logic, in a similar way as in real systems. To our knowledge, such computing systems do not exist yet. It is worthwhile to mention that the system will constitute a 3D network with 12-coordinated knots (sites), which will probably make it suitable for modeling complex networks, such as the brain, as well.

Figure 4. Illustration of a pathway between the DLL model and a DLL massively parallel computing system: a processor controlling a single lattice site, the basic cell of the DLL, and a 3D network of logical units corresponding to the structure of the modeled system.

We postulate that future research be related to computational modeling of processes in complex macromolecular systems, embedded in, supported by, and confronted with experimental investigations of related systems. Many increasingly complex macromolecular systems are synthesized nowadays with the argument that they can potentially show interesting properties. This, however, can only be tested after often very difficult and time-consuming synthetic and characterization efforts. Some modeling examples demonstrate, however, another route to this kind of test, by providing molecular modeling tools with predictive potential both in the synthetic and in the characterization areas [1, 2]. The main properties towards which the predictive recognition postulated here is directed are the mechanical properties dependent on the macromolecular dynamics specific to complex macromolecular systems. It has recently been discovered (at MPG and CMU) that such systems can exhibit super-soft elastic states, the presence of which is controlled by the details of the complex macromolecular architecture and the related dynamics [2]. An improvement of modeling methods should provide answers to the questions of which structures under which conditions can lead to such states, which can become a basis for useful applications. On the other hand, the development discussed should be considered as providing a basis for much broader further investigations of other properties, other systems, and other processes.
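To make the cooperative-loop rule concrete, the following minimal sketch (our illustration of the DLL rearrangement rule on a fully occupied periodic square lattice, not the authors' production algorithm) lets every bead attempt a move to a random neighboring site and executes only those attempts that chain into closed loops; for simplicity it accepts all closed loops, including simple pair exchanges, which a full implementation may treat differently:

import numpy as np

L = 8                                    # periodic L x L lattice, every site occupied
N = L * L
rng = np.random.default_rng(1)

def neighbor(site, direction):
    """Index of the neighboring site in one of four directions (periodic boundaries)."""
    y, x = divmod(site, L)
    dy, dx = ((0, 1), (0, -1), (1, 0), (-1, 0))[direction]
    return ((y + dy) % L) * L + (x + dx) % L

def dll_step(occupant):
    """One DLL-style step: each bead picks a random neighboring site as its target;
    only beads whose attempts form closed loops (each replacing the next) move."""
    target = [neighbor(s, rng.integers(4)) for s in range(N)]
    new = occupant.copy()
    for v in range(N):
        s, on_loop = target[v], False       # follow the chain of attempts from v
        for _ in range(N):
            if s == v:
                on_loop = True              # the chain returned to v: a closed loop
                break
            s = target[s]
        if on_loop:
            new[target[v]] = occupant[v]    # bead at v replaces its neighbor
    return new

occupant = dll_step(np.arange(N))
print((occupant != np.arange(N)).sum(), "of", N, "beads moved in one step")

Every executed loop displaces each participating bead by one lattice spacing while keeping all sites occupied; since the acceptance decision is local to each loop, the rule maps naturally onto the site-per-processor parallel hardware discussed above.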
4. Conclusions
The models based on cooperative dynamics allow one to expect that a real qualitative improvement in the field of macromolecular modeling can be achieved. The main advantages of these methods are: physically reasonable simplifications of the dynamics and structures, applicability to dense systems, flexibility in the representation of various molecular topologies, and high computational efficiency [1]. Cooperativity, understood as the concerted action of a number of system elements which realizes effects that remain strongly restricted or impossible in the actions of individual elements, is responsible for the high efficiency of the related computational methods. The simplicity of the cooperative phenomena in the suggested models lies in the recognition of the essential conditions under which cooperative actions can take place in a many-body system [4]. The concepts of the DLL model can directly serve as the basis for a parallel special-purpose computer system realizing super-fast modeling, which can open new possibilities for improving the spatial resolution of models by decoupling it from the limitations imposed by computational speed. The building block of such a system will be a chip that integrates multiple processors, each corresponding to an elementary site, and the communication logic corresponding to the architecture of the DLL system. We consider this a perspective which might be very helpful from the point of view of materials science and technology and which may in the future allow simulations of systems comparable in size and complexity to biological cells, still represented with molecular resolution.
References

[1] T. Pakula, "Simulations on the completely occupied lattice," In: M.J. Kotelyanskii and D.N. Theodorou (eds.), Simulation Methods for Polymers, Marcel Dekker, New York, ch. 5, pp. 147–176, 2004.
[2] T. Pakula, P. Minkin, and K. Matyjaszewski, "Polymers, particles, and surfaces with hairy coatings: synthesis, structure, dynamics, and resulting properties," In: K. Matyjaszewski (ed.), Advances in Controlled/Living Radical Polymerization, ACS Symp. Series 854, pp. 366–382, 2003.
[3] T. Pakula, D. Vlassopoulos, G. Fytas, and J. Roovers, "Structure and dynamics of melts of multiarm polymer stars," Macromolecules, 31, 8931–8940, 1998.
[4] T. Pakula, "Collective dynamics in simple and supercooled polymer liquids," J. Mol. Liq., 86, 109–121, 2000.
[5] P. Polanowski and T. Pakula, "Studies of mobility, interdiffusion, and self-diffusion in two-component mixtures using the dynamic lattice liquid model," J. Chem. Phys., 118, 11139–11146, 2003.
Perspective 36 LIQUID AND GLASSY WATER: TWO MATERIALS OF INTERDISCIPLINARY INTEREST H. Eugene Stanley Center for Polymer Studies and Department of Physics, Boston University, Boston, MA 02215, USA
1. Puzzling Behavior of Liquid Water
We can superheat water above its boiling temperature and supercool it below its freezing temperature, down to approximately −40 °C, below which water inevitably crystallizes. In this deeply supercooled region, strange things happen: response functions and transport functions appear as if they might diverge to infinity at a temperature of about −45 °C. These experiments were pioneered by Angell and co-workers over the past 30 years [1–4]. Down in the glassy region of water, additional strange things happen; e.g., there is not just one glassy phase [1]. Rather, just as there is more than one polymorph of crystalline water, so also there appears to be more than one polyamorph of glassy water. The first clear indication of this was a discovery of Mishima in 1985: at low pressure there is one form, called low-density amorphous (LDA) ice [5], while at high pressure Mishima discovered a new form, called high-density amorphous (HDA) ice [6]. The volume discontinuity separating these two phases is comparable to the volume discontinuity separating low-density and high-density polymorphs of crystalline ice, 25–35 percent [7, 8]. In 1992, Poole and co-workers hypothesized that the first-order transition line separating two glassy states of water does not terminate when it reaches the no-man's land (the region of the phase diagram where only crystalline ice is found experimentally), but extends into it [9]. If experiments could avoid the no-man's land connecting the supercooled liquid with the glass, then the LDA–HDA first-order transition line would continue into the liquid phase. This first-order liquid-liquid (LL) phase transition line separates two phases of liquid—high-density liquid (HDL) and low-density liquid (LDL)—which
are the precise analogs of the two amorphous solids LDA and HDA. Like essentially all first-order transition lines, the LL transition line between noncrystalline phases must terminate in a critical point. Above the critical point is an analytic extension of the LL phase transition line. Called the Widom line, this extension exhibits apparent singularities—i.e., if the system approaches the Widom line, then thermodynamic response functions appear to diverge to infinity until the system is extremely close, when the functions will round off and ultimately remain finite—as seen in adiabatic compressibility data [10].
2. Plausibility Arguments
That a LL phase transition exists is at least plausible. Liquid water is a tetrahedral liquid, and two water tetrahedra can approach each other in many different ways. One way is coplanar, as in ordinary hexagonal ice Ih—creating a "static heterogeneity" with a local density not far from that of ordinary ice, about 0.9 g/cm³. A second way is altogether different: one of the two tetrahedra is rotated 90°, resulting in a closer distance at which the minimum of potential energy occurs, and hence a static heterogeneity with a local density substantially larger (by about 30%) than that of ordinary ice [11]. In fact, this rotated configuration occurs in solid crystalline water ("ice VI"), which occurs at very high pressure. In liquids close to the freezing temperature, there are heterogeneities with local order resembling that of the nearby crystalline phases. Not surprisingly, then, in water at low pressure there are more heterogeneities that have ice-like entropy (local order) and specific volume, while at high pressure there are more heterogeneities that have an ice VI-like entropy and specific volume. The potential that represents the relative orientations of two water tetrahedra has two wells: a deeper, "high-volume, low-entropy" well corresponding to LDL and a shallower, "low-volume, high-entropy" well corresponding to HDL. Note that LDL has a higher specific volume and a lower entropy. Therefore when water cools, each molecule must decide how to partition itself between these two minima. The specific-volume fluctuations increase because of these two possibilities. The entropy fluctuations also increase, and the cross-fluctuations of volume and entropy have a negative contribution—i.e., high volume corresponds to low entropy—so that the coefficient of thermal expansion, proportional to these cross-fluctuations, can become negative. The possibility that these static heterogeneities gradually shift their balance between low density and high density as pressure increases is plausible, but need not correspond to a genuine phase transition. There is no inherent reason why these heterogeneities need to "condense" into a phase, and the first guess might be that they do not condense – what is now called
the singularity-free hypothesis [12, 13]. However, if we reason by analogy with the gas-liquid transition, then there is one reason to believe that they will condense. This is related to the fact that a permanent gas is impossible so long as there is an attraction, no matter how weak. If such a weak attraction has an energy scale ε, at low enough temperature T the ratio ε/T will become large enough to influence the Boltzmann factor sufficiently that the system will condense. For example, if a lattice-gas fluid (e.g., with a single-well potential) with ε = 1 condenses below Tc = 1, then for ε = 0.001 one anticipates condensation below Tc ≈ 0.001. For this reason, one anticipates that at low enough temperature the one-component liquid would condense into a low-density liquid corresponding to the deeper potential well.
3. Simulations
That the static heterogeneities should condense at sufficiently low temperature is found in simulations using a wide range of molecular potentials, ranging from "overstructured" potentials such as ST2 to "understructured" potentials like SPC/E. Recent work has focused on the newest of the water potentials, TIP5P [14]. Regardless of the potential used, all results seem to be consistent with the LL phase transition hypothesis [2].
4. Experiments
Experimental data are consistent with the LL phase transition hypothesis. The volume fluctuations are proportional to the compressibility, and this compressibility is a spectacularly anomalous function. Below 46 °C, the compressibility starts to increase as the temperature is lowered. This phenomenon is no longer counterintuitive if the double-well potential is correct. Similarly, below 35 °C the entropy fluctuations, which correspond to the specific heat, start to increase. Finally, consider the coefficient of thermal expansion, which is proportional to the product of the entropy and volume fluctuations. This is positive in a typical liquid because large entropy and large volume go together, but for water this cross-correlation function has a negative contribution—and as we lower the temperature this contribution gets larger and larger until we reach 4 °C, at which point the coefficient of thermal expansion passes through zero. The experimental work of Angell and collaborators shows apparent singularities when experimental data are extrapolated into the no-man's land. Mishima measured the metastable phase transition lines of ice polymorphs and found that the slopes of these lines exhibit sharp kinks in the vicinity of the hypothesized liquid-liquid phase transition line, as predicted by extrapolation
[15, 16]. The nature of these kinks can be explained if we take into account that an ice polymorph must melt into a metastable liquid before it can recrystallize into a different polymorph. By the Clausius–Clapeyron relation, the slope of that first-order metastable melting line must be equal to the entropy change divided by the volume change of the two phases that coexist. One of the coexisting phases is always the high-pressure polymorph of ice. The other phase is either HDL or LDL. The volumes and entropies of those two liquids are different, and therefore as the first-order solid-liquid phase transition line crosses the hypothesized LL phase transition line, the slope changes. The Gibbs potentials of two phases that coexist along a first-order transition line must be equal. We already know the Gibbs potential of all the polymorphs of ice, so we know, experimentally, the Gibbs potentials of LDL and HDL. From the Gibbs potential of any substance one can obtain, by differentiation, the volume. Thus, if we know the Gibbs potential as a function of temperature and pressure, we know the volume as a function of temperature and pressure—which is called the equation of state. In this way Mishima and Stanley were able to find an experimental equation of state for water deep inside the no-man's land. This is, of course, not quite the same as actually measuring the densities of two liquids coexisting at the LL phase transition line, since the Mishima experiments concerned metastable melting lines in which the Gibbs potentials of the two phases are not necessarily equal to each other. Very recently, Reichert and collaborators [17] discovered experimentally an HDL under conditions outside the no-man's land. They studied the thin, quasi-liquid layer between ice Ih and a solid substrate (amorphous SiO2). Using a clever experimental technique at Grenoble, they were able to measure the density and found 1.17 g/cm³, the density of HDA.
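The two thermodynamic relations invoked in this argument are worth stating explicitly (standard textbook forms, supplied here for reference):

$$
\left.\frac{dP}{dT}\right|_{\mathrm{coex}} = \frac{\Delta S}{\Delta V}, \qquad V = \left(\frac{\partial G}{\partial P}\right)_{T}.
$$

The first explains the kinks: as a metastable melting line crosses from coexistence with HDL to coexistence with LDL, ΔS and ΔV jump and the slope changes. The second converts the experimentally known Gibbs potential of the metastable liquid into its equation of state V(T, P).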
5. Discussion
The LL phase transition hypothesis does not fully answer the question “what matters?”—i.e., it does not tell us which liquids should exhibit LL phase transitions and which should not. It has been conjectured that it is local tetrahedral geometry of water that matters, since a tetrahedral local geometry leads to static heterogeneities which lead to a LL phase transition [18]. But then what about other tetrahedral liquids? Phosphorus [19] and SiO2 [20] have a local tetrahedral geometry, and experimental evidence supports the LL phase transition hypothesis. There is recent evidence for a LL phase transition in silicon, which is also a tetrahedral liquid [21]. It has been argued that LL phase transitions are associated with liquids possessing a line in the T-P phase diagram at which the density achieves a maximum [22–24].
In summary, the presence of a local tetrahedral geometry leads to two distinct forms of static heterogeneities (or “local order”) differing in specific volume and entropy, with the specific volume and entropy anticorrelated. This fact gives rise to anomalous fluctuations in compressibility, specific heat, and the coefficient of thermal expansion. The hypothesis that at low enough temperatures these small regions of local order condense into two separate phases (LDA and HDA) is supported by simulations, but remains an open question experimentally.
References

[1] C.A. Angell, "Amorphous water," Ann. Rev. Chem., 55, 559–583, 2004.
[2] P.G. Debenedetti, "Supercooled and glassy water," J. Phys.: Condens. Matter, 15, R1669–R1726, 2003.
[3] P.G. Debenedetti and H.E. Stanley, "The physics of supercooled and glassy water," Physics Today, 56[6], 40–46, 2003.
[4] O. Mishima and H.E. Stanley, "The relationship between liquid, supercooled, and glassy water," Nature, 396, 329–335, 1998.
[5] P. Brüggeller and E. Mayer, "Complete vitrification in pure liquid water and dilute aqueous solutions," Nature, 288, 569–571, 1980.
[6] O. Mishima, L.D. Calvert, and E. Whalley, "An apparently first-order transition between two amorphous phases of ice induced by pressure," Nature, 314, 76–78, 1985.
[7] O. Mishima, "Reversible first-order transition between two H2O amorphs at −0.2 GPa and 135 K," J. Chem. Phys., 100, 5910–5912, 1994.
[8] O. Mishima, "Relationship between melting and amorphization of ice," Nature, 384, 546–549, 1996.
[9] P.H. Poole, F. Sciortino, U. Essmann, and H.E. Stanley, "Phase behaviour of metastable water," Nature, 360, 324–328, 1992.
[10] E. Trinh and R.E. Apfel, "Sound velocity of supercooled water down to −33 °C using acoustic levitation," J. Chem. Phys., 72, 6731–6735, 1980.
[11] M. Canpolat, F.W. Starr, M.R. Sadr-Lahijany et al., "Local structural heterogeneities in liquid water under pressure," Chem. Phys. Lett., 294, 9–12, 1998.
[12] H.E. Stanley and J. Teixeira, "Interpretation of the unusual behavior of H2O and D2O at low temperatures: tests of a percolation model," J. Chem. Phys., 73, 3404–3422, 1980.
[13] S. Sastry, P. Debenedetti, F. Sciortino, and H.E. Stanley, "Singularity-free interpretation of the thermodynamics of supercooled water," Phys. Rev. E, 53, 6144–6154, 1996.
[14] M. Yamada, S. Mossa, H.E. Stanley, and F. Sciortino, "Interplay between time-temperature-transformation and the liquid-liquid phase transition in water," Phys. Rev. Lett., 88, 195701, 2002.
[15] O. Mishima and H.E. Stanley, "Decompression-induced melting of ice IV and the liquid-liquid transition in water," Nature, 392, 164–168, 1998.
[16] O. Mishima and Y. Suzuki, "Vitrification of emulsified liquid water under pressure," J. Chem. Phys., 115, 4199–4202, 2001.
[17] S. Engemann, H. Reichert, H. Dosch, J. Bilgram, V. Honkimaki, and A. Snigirev, "Interfacial melting of ice in contact with SiO2," Phys. Rev. Lett., 92, 205701, 2004.
[18] H.E. Stanley, S.V. Buldyrev, N. Giovambattista, E. La Nave, A. Scala, F. Sciortino, and F.W. Starr, "Statistical physics and liquid water: 'what matters'," Physica A, 306, 230–242, 2002.
[19] Y. Katayama, T. Mizutani, W. Utsumi, O. Shimomura, M. Yamakata, and K.-i. Funakoshi, "A first-order liquid-liquid phase transition in phosphorus," Nature, 403, 170–173, 2000.
[20] P.H. Poole, M. Hemmati, and C.A. Angell, "Comparison of thermodynamic properties of simulated liquid silica and water," Phys. Rev. Lett., 79, 2281–2284, 1997.
[21] S. Sastry and C.A. Angell, "Liquid-liquid phase transition in supercooled silicon," Nature Materials, 2, 739–743, 2003.
[22] F. Sciortino, E. La Nave, and P. Tartaglia, "Physics of the liquid-liquid critical point," Phys. Rev. Lett., 91, 155701, 2003.
[23] G. Franzese and H.E. Stanley, "A theory for discriminating the mechanism responsible for the water density anomaly," Physica A, 314, 508–513, 2002.
[24] G. Franzese, G. Malescio, A. Skibinsky, S.V. Buldyrev, and H.E. Stanley, "Generic mechanism for generating a liquid-liquid phase transition," Nature, 409, 692–695, 2001.
Perspective 37 MATERIAL SCIENCE OF CARBON Wesley P. Hoffman Air Force Research Laboratory, Edwards, CA, USA
Carbon is a ubiquitous material that is essential for the functioning of modern society. Because carbon can exist in a multitude of forms, it can be tailored to possess practically any property that might be required for a specific application. The list of applications is very extensive and includes: aircraft brakes, electrodes, high-temperature molds, rocket nozzles and exit cones, tires, ink, nuclear reactors and fuel particles, filters, prosthetics, batteries and fuel cells, airplanes, and sporting equipment. The different forms of carbon arise from the fact that carbon exists in three very different crystalline forms (allotropes) with a variety of crystallite sizes, different degrees of purity and density, as well as various degrees of crystalline perfection. These allotropes are possible because carbon has four valence electrons and is able to form different kinds of bonds with other carbon atoms. For example, diamond with a covalently bonded face-centered cubic structure can exist as a naturally formed single crystal as large as 200 g. Diamond, the hardest material known, can also be made synthetically by a variety of processes. For example, high-pressure anvils can be utilized to produce relatively small single crystals, while a vapor-phase process such as chemical vapor deposition (CVD) is employed to deposit crystalline and amorphous coatings having grain sizes on the micron scale and a variety of degrees of crystallite orientation. Synthetic diamonds in all forms are used as hard scratch-resistant coatings and tool coatings for grinding, cutting, drilling, and wire drawing. Other applications include heat sinks and optical windows, among others. The most abundant forms of carbon exist as various forms of the allotrope hexagonal graphite. The perfect crystalline structure of graphite is a hexagonal layered structure in which the atoms in each layer are covalently bonded while the graphene layers are held together by weaker van der Waals forces. This difference in bonding is what is responsible for the great anisotropy in mechanical, thermal, electrical, and electronic properties.
[Structural diagrams: the diamond structure and the hexagonal graphite structure, the latter with a bond length of 0.1415 nm, an in-plane lattice constant of 0.2456 nm, and an interlayer spacing of 0.3354 nm.]
With the relatively recent discovery of the nano-forms of carbon in the last two decades, the range of properties that carbon can possess and the gamut of potential applications have greatly increased. The linear portions of these molecules are simply rolled-up graphene layers, while the curved portions consist of graphite hexagons in contact with pentagons. Although commercial applications for both buckyballs and carbon nanotubes are not well defined at the time of this writing, the high aspect ratio of nanotubes, along with the fact that they are stronger (~63 GPa) and stiffer (~1000 GPa) than any other known material, means that the potential for these materials is great.
[Schematics: a buckyball and a carbon nanotube.]
Adding to the complexity of understanding and modeling carbon is the fact that, in its various forms, it rarely exists in a perfect single-crystalline state. For example, perfect graphitic structure exists only in the various forms of natural graphite flakes and graphitizable carbons, which are carbons formed from a gas- or liquid-phase process. An example of a graphitizable carbon is highly oriented pyrolytic graphite (HOPG), which is formed by depositing one atom at a time on a surface utilizing the pyrolysis of a hydrocarbon, such as methane or propylene. This deposited material is then graphitized employing both thermal and mechanical stress. In the overwhelming number of applications single-crystal graphite is not employed; rather, a carbon with some degree of graphitic structure is utilized. Excluding crystallite imperfections, such as vacancies, interstitials, substitutions, twin planes, etc., the form of carbon that most closely approaches single-crystal graphite is turbostratic graphite. This form of carbon looks very similar to graphite except that, although there may be some degree of perfection within the planes, the adjacent planes are out of registry with one another. That is, in the hexagonal graphite structure, there is an atom in each adjacent plane that sits directly over the center of the hexagonal ring; in turbostratic graphite, the adjacent planes are shifted with respect to one another and are out of registry. This results in an increase in the interlayer spacing, which can grow from 0.3354 nm to more than 0.345 nm. If the value exceeds this, the structure exfoliates. Heating to temperatures in excess of 2800 K provides energy for mobility and can convert turbostratic graphite to single-crystal graphite in a process called graphitization. As stated above, a carbon material that goes through either a gas-phase or liquid-phase process in its conversion to carbon (the carbonization process) can be converted to a graphitic structure employing time and temperature in the range 2000–2800 K. This means that materials formed in the gas phase, like carbon black and pyrolytic carbon, can be converted to a graphitic structure. Cokes and carbon fibers fabricated from petroleum, coal tar, or mesophase pitch are also graphitizable. On the other hand, chars formed directly from organic materials, such as the wood and bone used for activated carbon, PAN fibers formed from polyacrylonitrile, and vitreous carbons formed from polymers such as phenolic or phenol-formaldehyde resins, are amorphous and not graphitizable, because they maintain the same rigid non-aligned structure that they possessed before carbonization. In addition to the degree of crystalline order, the properties of carbons are also determined by the crystallite size and orientation of polycrystalline carbons. The largest carbon parts that are manufactured are electrodes for the steel and aluminum industries. These electrodes can weigh more than 3 tons and are fabricated by extruding a mixture of fine petroleum coke and coal tar pitch. The extrusion process causes some preferential alignment of the crystallites, and baking to 2800 K produces a polycrystalline graphite part that has high strength and conductivity. To make isotropic graphite, fine-grain coke and petroleum pitch are isostatically pressed. By proper selection of precursor material and processing conditions, a carbon with practically any property can be produced. For example, carbons can be hard (chars) or soft (blacks), strong (PAN fibers) or weak (aerogels), stiff (pitch fibers) or flexible (Graphoil®), as well as anisotropic (HOPG) or isotropic (polycrystalline graphite). In addition, porosity, lubricity, hydrophobicity, hydrophilicity, thermal conductivity, and surface area can be varied over a wide range. For example, surface area can vary from 0.5 m²/g for a fiber to >2000 m²/g for an activated carbon, while thermal conductivity can range from 0.001 W/(m·K) for an amorphous carbon foam to 1100 W/(m·K) for a pitch fiber and 3000 W/(m·K) for diamond.
Carbon materials have been studied and produced for thousands of years. (The Chinese used lampblack 5000 years ago.) While much is understood about carbon, there are some very important areas in which there is still a lack of understanding. These areas fall generally into the production of graphitic material and the oxidation protection of carbon and graphite materials. A better understanding of the science of carbon formation will allow increased performance at reduced cost, while effective oxidation protection at ultra-high temperature will enable a whole range of new technology. Currently, the highest-performance and highest-cost form of carbon is the carbon-carbon composite, a truly unique class of materials. These composites, which are stronger and stiffer than steel as well as lighter than aluminum, are currently used principally in high-performance, high-value applications in the aerospace and astronautics industries. The highest volume of carbon-carbon is used as brake rotors and stators for military and commercial aircraft, because it has high thermal conductivity, good frictional properties, and low wear. Astronautic applications include rocket nozzles, exit cones, and nose tips for solid rocket boosters, as well as leading edges and engine inlets for hypersonic vehicles. In these applications carbon-carbon's high strength and stiffness as well as its thermal shock resistance are keys to its success. For reusable hypersonic vehicles, the fact that carbon does not go through phase changes like some ceramics, and that its mechanical properties actually increase with temperature, makes this a very valuable material. For satellite applications, carbon-carbon's high specific strength and stiffness as well as its near-zero thermal expansion make it an ideal material for large structures that require dimensional stability as they circle the earth. Carbon-carbon composites are fabricated through a multi-step process. First, carbon fibers, which carry the mechanical load, are woven, braided, felted, or filament-wound into a preform which has the shape of the desired part. The preform is then densified with a carbon matrix, which fills the space between the fibers and distributes the load among the fibers. It is this densification process that is not well understood. Unlike the manufacture of fiberglass, in the formation of carbon-carbon composites the matrix precursor is not just cured but must be converted into carbon. The conversion of polymers to produce a char, as well as the conversion of a hydrocarbon gas to produce a graphitizable matrix, is fairly well understood. What is not clear is the process of converting a pitch-based material to a high-quality graphitizable matrix. That is, starting with, for example, a petroleum or coal tar pitch, heat soaking converts this isotropic mixture of polyaromatics into anisotropic liquid-crystalline spherical droplets called mesophase spherules. These spherules ultimately coalesce to form a continuous second phase which ultimately pyrolyzes to form a graphitic structure. Although this process has been observed microscopically, little is known about it, or even about exactly what mesophase is or what molecular precursors are needed to form
mesophase. Little is known because it has proved impossible to accurately analyze mesophase precursors, mesophase itself, or the intermediates between mesophase and graphite. What is needed is a model that can predict the growth of single aromatic rings through the mesophase intermediate and into an ordered graphitic structure. Moreover, it is well known that during pyrolysis mesophase converts into a matrix that is very anisotropic. The formation of onion-like "sheaths" takes place on the surface of individual carbon fibers in the carbon composite. As these sheaths grow outward from the fiber surfaces, they ultimately collide, forming point defects called disclinations. This behavior has a pronounced effect on both the chemical and physical properties of the carbon-carbon composite. A model that describes this matrix contribution to the composite's properties would be of great benefit both for understanding experimental data, such as thermal conductivity and mechanical properties, and for prediction of composite behavior. Although carbon-carbon composites possess high strength, stiffness, and thermal shock resistance, making them an excellent high-temperature structural material, their Achilles' heel is oxidation. Above 670 K, carbon oxidizes. This means that if it is unprotected, it cannot be used for long-term high-temperature applications. Today it is used for short-term applications, such as aircraft brakes as well as rocket nozzles, exit cones, and nose tips. To be used in long-term applications such as reusable hypersonic vehicles, it must be protected. Currently, an adequate protection system at ultra-high temperatures (>2700 K) does not exist. There is a tremendous need and payoff for a non-structural barrier that can keep oxygen from reaching a carbon surface at 2700 K. Efforts to fabricate such coatings have so far been unsuccessful. What is probably required for success is some sort of novel functionally graded coating, which will require modeling material properties as well as thermal stresses. Finally, the rather recent discovery of buckyballs and nanotubes, which are 3D analogs of hexagonal graphite, has also rekindled the need for models of the intercalation of graphite. This process occurs when various elements, such as lithium, sodium, or bromine, "sandwich" themselves between graphene planes. This greatly alters the thermal and electrical properties of the graphite. Similar behavior is beginning to be demonstrated in buckyballs and nanotubes. Models based on computational chemistry would be of great help in this area. For additional information on carbon, there are many good reference works. Most focus on specific forms of carbon such as carbon blacks, active carbons, fibers, composites, intercalation compounds, nuclear graphites, etc. For general topics the handbook by Pierson [1] is a good text. A reference work covering scores of subjects in great detail is the 28-volume Chemistry and Physics of Carbon [2], which has had three editors over the last 40 years.
References

[1] H.O. Pierson, "Handbook of carbon, graphite, diamond and fullerenes – properties, processing and applications," Noyes Publications, Park Ridge, 1993.
[2] P.L. Walker Jr., P. Thrower, and L.R. Radovic (eds.), "Chemistry and physics of carbon," vols. 1–28, Marcel Dekker, New York, 1965–2004.
Perspective 38 CONCURRENT LIFETIME-DESIGN OF EMERGING HIGH TEMPERATURE MATERIALS AND COMPONENTS Ronald J. Kerans Air Force Research Laboratory, Materials and Manufacturing Directorate, Wright-Patterson Air Force Base, Ohio, USA
We ask a great deal of structural components used at high temperatures. In addition to performing their primary load-bearing function, most of them are subjected to sizable thermal stresses and aggressive atmospheres. Their microstructures and phase compositions, and hence their properties, evolve throughout their lifetime. Many are used in regimes where significant local creep deformation accumulates, so their shapes and residual stress states change also. Finally, many of them are used in situations where failure is extremely undesirable. Failure of a power-generation turbine or an aircraft engine carries substantial fiscal costs, and the latter has potential for tragic human costs. The customers – all of us – have very low tolerance for anything other than extreme reliability. This is a challenging environment in which to introduce new materials. It is an equally challenging environment in which to substitute computation for time-proven testing and design techniques. The penalties for mistakes are extreme and the development costs are correspondingly large. On the other hand, the high stakes involved provide strong motivation to use computation to improve the ability to predict component end-of-life and thereby avoid both catastrophic failures and expensive premature retirement. Likewise, the long and costly development/certification cycles for introducing even evolutionary changes in materials provide a substantial benefit for substituting computation for experiment. Fortunately, the fundamental understanding and many of the multi-scale modeling tools will be similar for both tasks.
1. Current Efforts
In many ways, the need for computational infrastructure for both design and life management is most pronounced for new materials systems, for which both design and life-limiting behavior differ from those of materials now in service. However, the breadth and depth of understanding of behavior, the substantial existing databases, and the cost benefit associated with fielded systems argue strongly for application to existing and evolutionary systems. The computational tools required for either task are numerous, as are the materials science aspects that will require enhanced understanding; hence a comprehensive, coordinated attack on either is a significant undertaking. In fact, a number of efforts have been coordinated under the umbrellas of three substantial activities that have begun addressing both design [1–3] and life management [4, 5]. The long-range scope of the design-related modeling effort extends from the atomic level to component design via finite element analysis (FEA). The life management effort covers a broad span of degradation and failure behavior, plus goals of comparable complexity in logging and predicting the effects of actual use history and in sensing the actual state of the system in situ.
2. Emerging Materials
While current efforts boldly attempt to devise entirely new approaches to design and life management, the focus is quite logically on current or evolutionary variations of current materials. However, the real impact of the approach may be most profound when applied to the introduction of new materials, such as low-ductility intermetallics, ceramic composites and perhaps hybrids combining two or more materials systems. These systems almost “demand” a more robust computational approach to design. Consider that while developmental ceramic composites are made in sheet form with uniform properties, many actual components will have fiber and void volume fractions that vary strongly with location in the part. The magnitude and anisotropy of the properties of the material will be a local feature of the component. In this sense, it is almost impossible to separate the material from the component; optimal design will require that the material and component be dealt with concurrently. While these issues apply to all fibrous composites, they will apply especially to ceramic composites. Organic-matrix composites are often used for large sheet-like components such as wing skins, whereas ceramic composites will largely be used for the smaller, more complex shapes of high temperature hardware. In almost all aspects, ceramic composites represent a significant departure from conventional practice. Optimizing the distributed damage mechanisms
that provide macroscopic fracture toughness is an exercise in carefully designed fracture processes. In conventional materials, the design work is mostly oriented towards avoiding fracture entirely, or at least designing for minimum growth rates. The task of promoting local fracture and designing crack paths is relatively new territory [6]. A consequent significant difference between ceramic composites and other systems is the relationship between properties and damage. Local deformation in metals, for example, does not radically change properties, whereas comparable deformation in a composite may significantly change the local properties, with the largest changes possibly being in an unloaded direction. The local nature of properties both imposes obligations on and provides opportunities to the designer. The idea is to design the local material/properties to best suit the requirements of the component. In many cases this will be best achieved not by fastening subcomponents together, but by way of designed architectures and blended materials systems within the component [7, 8]. Such hybrids might comprise oxide composites at the hot surface, blending to non-oxide reinforcements in mid-temperature regions, blending to metal matrices in warm regions, and finally to monolithic metals for cool attachments. The idea of hybrids is a natural extension of local design.
3. Processing
Further complicating the local property situation is the certainty that the properties will also be a consequence of local processing. For most materials systems, processes will need to vary with the component. For example, the microstructures of both wrought and cast metal parts vary with location and depend on the geometry of the part. Ceramic composites are by their very nature particularly difficult to process. The basic goal is to consolidate a very refractory matrix material that is constrained by an equally refractory fiber structure, without damaging the fibers. The degree of densification tends to vary with local fiber fraction, and most processes will always yield significant void space that varies with location. If the void space cannot be eliminated, then the objective should be to control its distribution into the most benign arrangement. Uniform fine porosity is typically more desirable than large pores, which are in turn more desirable than large cracks. The challenge is three-fold: to design what can be processed reliably, to process what was designed, and to model the performance and life of what actually exists. This will require the ability to know and model the effects of such things as actual fiber locations in real preforms and components. Irrespective of the material system, modeling that represents real processes will be essential.
4. Lifetime Design
Materials that operate at very high temperatures will evolve over time; no component in use is in a truly stable state. While this fact is not entirely neglected in current practice, there is opportunity to develop a much higher level of sophistication. The long-range goal should be to establish materials design and modeling capabilities that allow the design of materials and components to achieve an optimum blend of properties over their lifetimes, and reliable lifetimes in the specific application. This optimization will require in-depth knowledge of, and the ability to simulate, the service environments, and the ability to predict the evolution of microstructures and damage and the consequent effects on properties and life-limiting behavior. The ability to use this knowledge to predict the correct initial state of the materials for optimum lifetime performance implies a very high level of sophistication. The ability to achieve the goal of concurrent lifetime-designed, life-managed materials/components is completely dependent upon conquering what may be an equally challenging task: acquiring knowledge of the actual use environment to a far greater depth than is currently possible. It requires knowing well the actual conditions, not just the nominal conditions, and the impact those conditions have on each particular component, which may be as challenging as the preceding items.
5. Models
The highest-level models may be similar to the finite element analyses used in design today. Much of the design portion of the work might best be done through inverse analysis; that is, the design analysis would determine what the component "wants" for properties, and the material would be designed to best suit that. In any event, the more basic analysis will need to be integrated such that it is invisible to the designer [3]. The shortest-scale models will depend somewhat on the materials systems and the nature of the component. In at least some cases, there will be a continuing need to deal with quantum-level modeling. An example of an atomic-level issue in a metal alloy might be the introduction of a minor alloying element for the purpose of slowing diffusive microstructural changes. The modification could easily introduce unintended effects on boundary strength, or perhaps subtly affect the stability of undesirable minor phases that could be problematic late in service life. Evaluating all such effects requires very basic modeling or, as in the current system, extensive prior knowledge. For ceramic composites, it has thus far seemed unnecessary to consider scales below the micromechanics level for design purposes; that is, the smallest dimension would be the approximately 200 nm thickness of the fiber coating.
It has been assumed that all constituents remain perfectly elastic (brittle) even at quite high temperatures, and that composite performance is not directly affected by subtleties at scales below that of the smallest constituent. However, recent basic work on deformation of monazite and scheelite fiber-coating materials indicates a large role of plasticity in the function of the coatings [9–11]. Further evidence confirms that the extent of plasticity will be temperature dependent, though considerably less so than for most metals. This will add a significant temperature-dependent component to composite behavior, and a high level of modeling fidelity may require dealing with ductility via dislocations and deformation twins in the 200 nm coating. Moreover, it is clear that there are significant corrosion effects on fibers that remain uncharacterized, much less understood [12, 13]. These will have profound effects on lifetime properties, and surely other such effects lurk in the details. The degree to which we see simplicity in the modeling of ceramic composites probably reflects the limited depth of our understanding of the materials more than any actual simplicity in their long-term service behavior. In any event, very basic modeling will be necessary, but not nearly sufficient. Computational concurrent design will require integrated modeling on a wide variety of time and length scales, integration of materials disciplines, and integration of materials and design disciplines. An elegant and readily usable solution to multi-scale integration will be essential to an infrastructure that has real impact [3].
6. Outlook
The ultimate goal could be: (1) concurrent material/component design for the best balance of properties over the lifetime of the component; (2) a reliable lifetime with graceful, detectable failure processes; (3) the ability to determine the current state of the component and predict its remaining life; and (4) the power to do much of the preceding computationally, with an economical set of physical experiments to confirm the results to high reliability. The achievement of this goal depends as much upon progress in the science of materials as upon progress in the science of modeling. It is probably safe to say that all current models are situation-specific. They do not describe the actual physics; they describe an approximation of the actual physics. Each one describes a facet of the situation wonderfully well in certain circumstances and is completely wrong in others. Integrating them and their successors into an infrastructure that will apply them properly to this nearly infinitely faceted, multi-scale problem is a formidable challenge. There is the attendant danger that each implementation will be so material-system specific that applying the approach to a new system will require rebuilding the entire infrastructure. While such an outcome might still yield significant benefit, much of the
savings would be lost. The resources currently spent building temporarily useful databases would largely be spent building temporarily useful models instead. Nevertheless, it seems self-evident that this is the logical goal of materials and computational science. It is a goal with tremendous benefit in both monetary and human terms, and one that is wonderfully engaging besides. Though when it will be achieved is uncertain, we can look forward to the day when intrinsic failures are a thing of the past.
References [1] D.M. Dimiduk, P.L. Martin et al., "Accelerated insertion of materials: the challenges of gamma alloys are really not unique," In: Y-W. Kim, H. Clemens, and A.H. Rosenberger (eds.), Gamma Titanium Aluminides, TMS, Warrendale, PA, 15–28, 2003. [2] D.M. Dimiduk, T.A. Parthasarathy et al., "Structural alloy performance prediction for accelerated use: evolving computational materials science & multiscale modeling," 2nd International Conference on Multiscale Materials Modeling, University of California Los Angeles, Los Angeles, CA, USA, American Scientific Publishers, 2004. [3] D.M. Dimiduk, T.A. Parthasarathy et al., "Predicting the microstructure-dependent mechanical performance of materials for early-stage design," NUMIFORM 2004, Columbus, OH, USA, 2004. [4] J.M. Larsen, B. Rasmussen et al., "The engine rotor life extension (ERLE) initiative and its contributions to increased life and reduced maintenance cost," 6th National Turbine Engine High Cycle Fatigue (HCF) Conference, Jacksonville, FL, USA, 2001. [5] L. Christodoulou and J.M. Larsen, "Using materials prognosis to maximize the utilization potential of complex mechanical systems," JOM, March, 15–19, 2004. [6] R.J. Kerans, R.S. Hay et al., "Interface design for oxidation resistant ceramic composites," J. Am. Ceram. Soc., 85(11), 2599–2632, 2002. [7] G. Jefferson, T.A. Parthasarathy et al., "Materials design of hybrid ceramic composites for hot structures," Proceedings of the 35th International SAMPE Technical Conference, Dayton, OH, SAMPE, 2003. [8] T.A. Parthasarathy, R.J. Kerans et al., "Reduction of thermal gradient-induced stresses in composites using mixed fibers," J. Am. Ceram. Soc., 87(4), 617–625, 2004. [9] R.S. Hay, "(120) and (122) monazite deformation twins," Acta Mater., 51(18), 5255–5262, 2003. [10] R.S. Hay and D.B. Marshall, "Monazite deformation twins," Acta Mater., 51(18), 5235–5254, 2003. [11] R.S. Hay, "Climb dissociation of dislocations in monazite at low temperature," J. Am. Ceram. Soc., 87(6), 1149–1152, 2004. [12] E.E. Boakye, R.S. Hay et al., "Monazite coatings on fibers: II, coating without strength degradation," J. Am. Ceram. Soc., 84(12), 2793–2801, 2001. [13] R.S. Hay and E.E. Boakye, "Monazite coatings on fibers: I, effect of temperature and alumina-doping on coated fiber tensile strength," J. Am. Ceram. Soc., 84(12), 2783–2792, 2001.
Perspective 39 TOWARDS A COHERENT TREATMENT OF THE SELF-CONSISTENCY AND THE ENVIRONMENT-DEPENDENCY IN A SEMI-EMPIRICAL HAMILTONIAN FOR MATERIALS SIMULATION S.Y. Wu, C.S. Jayanthi, C. Leahy, and M. Yu Department of Physics, University of Louisville, Louisville, KY 40292
The construction of semi-empirical Hamiltonians for materials that have predictive power is an urgent task in materials simulation. This task is necessitated by the bottleneck encountered in using density functional theory (DFT)-based molecular dynamics (MD) schemes for the determination of structural properties of materials. Although DFT/MD schemes are expected to have predictive power, at the moment they can only be applied to systems of a few hundred atoms. MD schemes based on tight-binding (TB) Hamiltonians, on the other hand, are much faster and applicable to larger systems. However, the conventional TB Hamiltonians include only two-center interactions, and they do not provide a framework for the self-consistent determination of the charge redistribution. Therefore, in the strictest sense, they can only be used to provide explanations of system-specific experimental results. Specifically, their transferability is limited and they do not have predictive power. To overcome the size limitation of DFT/MD schemes on the one hand and the lack of transferability of the conventional two-center TB Hamiltonians on the other, there exists an urgent need for the development of semi-empirical Hamiltonians for materials that are transferable and, hence, have predictive power. The key ingredient in the development of such Hamiltonians is a reliable and efficient scheme to mimic the effect of screening by electrons when atoms are brought together to form a stable aggregate. Such an ingredient requires the construction of
the semi-empirical Hamiltonian based on a framework that allows a coherent treatment of the self-consistent (SC) determination of charge redistribution and environment-dependent (ED) multi-center interactions.

Various schemes can be found in the literature in recent years that are designed to improve the transferability of TB Hamiltonians by including the self-consistency and/or the environment-dependency. Among these are methods that can be conveniently implemented in MD schemes because the atomic forces can be readily calculated. They include methods whose emphasis is placed on a phenomenological description of the environment-dependency [1, 2] and two similar methods whose frameworks take into account the self-consistency as well as the environment-dependency [3, 4]. The latter two approaches can be construed as expansions of the DFT total energy in terms of the charge density fluctuations about some reference density. To second order in the density fluctuations, the total energy is approximated as the sum of a band-structure term, a short-range repulsive term akin to that in the conventional two-center TB Hamiltonian, and a term representing the Coulomb interaction between charge fluctuations. The charge fluctuations in this approach are determined self-consistently by solving an eigenvalue equation with the two-center Hamiltonian modified by a term that depends on the charge redistribution. In this framework, the Hamiltonian does contain the features of self-consistency in the charge redistribution and environment-dependency for systems with charge fluctuations. The environment-dependent feature, however, disappears when the systems under consideration do not involve charge fluctuations, e.g., periodic systems with one atomic species per primitive unit cell. But environment-dependent multi-center interactions are a key feature in a realistic modeling of the screening effect of the electrons in an aggregate of atoms, including extended periodic systems. This deficiency in properly mimicking the screening of the electrons can be critical in the development of a truly transferable Hamiltonian. Thus the development of semi-empirical Hamiltonians for materials with predictive power requires the treatment of the self-consistency as well as the environment-dependency on an equal footing.

We have recently developed a scheme for the construction of semi-empirical Hamiltonians for materials within the framework of linear combination of atomic orbitals (LCAO) that allows a coherent treatment of the SC determination of the charge redistribution and the ED multi-center interactions in a transparent manner [5]. In this scheme, we set up the framework of the semi-empirical Hamiltonian in accordance with the Hamiltonian of the many-atom aggregate. The salient feature of the resulting semi-empirical Hamiltonian, referred to as the SCED/LCAO Hamiltonian, is that it has the flexibility to allow the database to provide the necessary ingredients for fitting parameters that capture the effect of electron screening.
The Hamiltonian of a many-atom aggregate may be written as

$$H = -\sum_{l} \frac{\hbar^2}{2m}\,\nabla_l^2 + \sum_{l,i} \nu(\vec r_l - \vec R_i) + \sum_{l,l'} \frac{e^2}{4\pi\varepsilon_0\, r_{ll'}} + \sum_{i,j} \frac{Z_i Z_j e^2}{4\pi\varepsilon_0\, R_{ij}} \qquad (1)$$
where $\nu(\vec r_l - \vec R_i)$ is the potential energy between an electron at $\vec r_l$ and the ion at $\vec R_i$, $r_{ll'} = |\vec r_l - \vec r_{l'}|$, $R_{ij} = |\vec R_i - \vec R_j|$, and $Z_i$ corresponds to the number of valence electrons associated with the ion at site $\vec R_i$. Within the one-particle approximation in the framework of LCAO, the on-site (diagonal) element of the Hamiltonian can be written as

$$H_{i\alpha,i\alpha} = \varepsilon^0_{i\alpha} + u^{\text{intra}}_{i\alpha} + u^{\text{inter}}_{i\alpha} + v_{i\alpha} \qquad (2)$$

where $\varepsilon^0_{i\alpha}$ denotes the sum of the kinetic energy and the energy of interaction with its own ionic core of an electron in the orbital $i\alpha$. The terms $u^{\text{intra}}_{i\alpha}$ and $u^{\text{inter}}_{i\alpha}$ are the energies of interaction of the electron in orbital $i\alpha$ with other electrons associated with the same site $i$ and with electrons in orbitals $j\beta$ ($j \neq i$), respectively. The term $v_{i\alpha}$ represents the interaction energy between the electron in orbital $\alpha$ at site $i$ and the ions at the other sites. In our scheme, the terms in Eq. (2) are represented by

$$\varepsilon^0_{i\alpha} = \varepsilon_{i\alpha} - Z_i U \qquad (3)$$

$$u^{\text{intra}}_{i\alpha} = N_i U \qquad (4)$$

and

$$u^{\text{inter}}_{i\alpha} + v_{i\alpha} = \sum_{k \neq i} \left[ N_k V_N(R_{ik}) - Z_k V_Z(R_{ik}) \right] \qquad (5)$$

where $\varepsilon_{i\alpha}$ may be construed as the energy of the orbital $\alpha$ for the isolated atom at $i$; $Z_i$ the number of positive charges carried by the ion at $i$ (also the number of valence electrons associated with the isolated atom at $i$); $N_i$ the number of valence electrons associated with the atom at $i$ when the atom is in the aggregate; $U$, a Hubbard-like term, the effective energy of electron-electron interaction for electrons associated with the atom at site $i$; $V_N(R_{ik})$ the effective energy of electron-electron interaction for electrons associated with different atoms (atoms $i$ and $k$); and $Z_k V_Z(R_{ik})$ the effective energy of interaction between an electron associated with an atom at $i$ and an ion at site $k$.
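As a quick consistency check on Eqs. (2)–(5) (our remark, not part of the original formulation): for an isolated, neutral atom there are no other sites, so the sums in Eq. (5) are empty, and charge neutrality gives $N_i = Z_i$; Eqs. (2)–(4) then reduce to

$$H_{i\alpha,i\alpha} = (\varepsilon_{i\alpha} - Z_i U) + N_i U = \varepsilon_{i\alpha},$$

recovering the free-atom orbital energy, as it should.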
Following the same reasoning, we can set up the off-diagonal matrix element $H_{i\alpha,j\beta}$ ($j \neq i$) as

$$H_{i\alpha,j\beta} = \frac{1}{2} \Bigg\{ K(R_{ij})\,(\varepsilon_{i\alpha} + \varepsilon_{j\beta}) + \left[ (N_i - Z_i) + (N_j - Z_j) \right] U + \sum_{k \neq i} \left( N_k V_N(R_{ik}) - Z_k V_Z(R_{ik}) \right) + \sum_{k \neq j} \left( N_k V_N(R_{jk}) - Z_k V_Z(R_{jk}) \right) \Bigg\}\, S_{i\alpha,j\beta}(R_{ij}) \qquad (6)$$
Thus, in addition to the conventional two-center hopping-like first term, Eq. (6) also includes both intra- and inter-site electron-electron interaction terms as well as environment-dependent multi-center (three-center explicitly and four-center implicitly) interactions. In its broadest sense, the first term in Eq. (6) corresponds to the Wolfsberg–Helmholtz relation in the extended Hückel theory. In our approach, $K$ is treated as a function of $R_{ij}$ rather than a constant parameter to ensure a reliable description of the dependence of the two-center term on $R_{ij}$ in the off-diagonal Hamiltonian matrix element. The overlap matrix elements $S_{i\alpha,j\beta}(R_{ij})$ are expressed in terms of $S_{ij,\tau}$, with $\tau$ denoting, for example, the molecular orbitals $ss\sigma$, $sp\sigma$, $pp\sigma$, and $pp\pi$ in an $sp^3$ configuration. They are expected to be short-ranged functions of $R_{ij}$. Equations (2) through (6) completely define the recipe for constructing semi-empirical SCED/LCAO Hamiltonians for materials in terms of parameters and parameterized functions. An examination of Eqs. (2)–(6) clearly indicates that the presence of $N_i$, the charge distribution at site $i$, in the Hamiltonian provides the framework for a self-consistent determination of the charge distribution. From Eqs. (5) and (6), it can be seen that the environment-dependent multi-center interactions are critically dependent on $V_N(R_{ik})$ and $V_Z(R_{ik})$, in particular their difference $\Delta V_N(R_{ik}) = V_N(R_{ik}) - V_Z(R_{ik})$. As both $V_N(R_{ik})$ and $V_Z(R_{ik})$ must approach $E_0/R_{ik}$ for $R_{ik}$ beyond a few nearest-neighbor separations, $\Delta V_N(R_{ik})$ is expected to be a short-ranged function of $R_{ik}$. The parameters, including those characterizing the parameterized functions, are to be optimized with respect to a judiciously chosen database for a particular material. In our approach, $\varepsilon_{i\alpha}$ may be chosen according to its estimated value based on the orbital $i\alpha$, or treated as a parameter of optimization. The quantity $U$ will be treated as a parameter of optimization, while $V_Z(R_{ik})$ and $V_N(R_{ik})$ will be treated as parameterized functions to be optimized. The parameterized function $V_Z(R_{ik})$ is modeled as the energy of the effective interaction per ionic charge between an ion at site $k$ and an electron associated with the atom at site $i$. $V_N(R_{ik})$ is then modeled in terms of $V_Z(R_{ik})$ and the short-range function $\Delta V_N$.
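To make the recipe of Eqs. (2)–(6) concrete, the following is a minimal Python sketch of how the SCED/LCAO matrices might be assembled for a toy system with a single s-like orbital per atom. The functional forms chosen here for $K(R)$, $V_Z(R)$, $V_N(R)$, and the overlap, as well as the inputs `pos`, `eps`, `Z`, `N`, and `U`, are illustrative assumptions only; the actual parameterized functions are fitted to a database as described in the text and in Ref. [5].

```python
import numpy as np

# Illustrative (assumed) parameterized functions for a toy single-s-orbital model;
# the real fitted forms come from the database optimization, not from these choices.
def K(R, k0=1.8, lam=0.3):
    """Assumed distance-dependent Wolfsberg-Helmholtz-like factor K(R_ij)."""
    return k0 * np.exp(-lam * R)

def V_Z(R, a=1.0):
    """Assumed electron-ion interaction per ionic charge (eV, R in Angstrom):
    a softened Coulomb that tends to E0/R at large R, with E0 = 14.4 eV*Angstrom."""
    return 14.4 / np.sqrt(R**2 + a**2)

def V_N(R, b=0.5, mu=1.0):
    """Assumed electron-electron interaction: V_Z plus a short-ranged difference dV_N."""
    return V_Z(R) + b * np.exp(-mu * R)

def S_overlap(R, s0=0.4, beta=0.8):
    """Assumed short-ranged overlap S(R_ij)."""
    return s0 * np.exp(-beta * R)

def sced_matrices(pos, eps, Z, N, U):
    """Assemble the Hamiltonian and overlap matrices of Eqs. (2)-(6)
    for the current on-site populations N (one s orbital per atom)."""
    n = len(pos)
    R = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # Environment sums of Eq. (5): sum over k != i of N_k V_N(R_ik) - Z_k V_Z(R_ik)
    env = np.array([sum(N[k] * V_N(R[i, k]) - Z[k] * V_Z(R[i, k])
                        for k in range(n) if k != i) for i in range(n)])
    H, Smat = np.zeros((n, n)), np.eye(n)
    for i in range(n):
        # Eqs. (2)-(4): on-site (diagonal) element
        H[i, i] = (eps[i] - Z[i] * U) + N[i] * U + env[i]
        for j in range(i + 1, n):
            Smat[i, j] = Smat[j, i] = S_overlap(R[i, j])
            # Eq. (6): two-center hopping plus charge-transfer and environment terms
            H[i, j] = H[j, i] = 0.5 * (K(R[i, j]) * (eps[i] + eps[j])
                                       + ((N[i] - Z[i]) + (N[j] - Z[j])) * U
                                       + env[i] + env[j]) * Smat[i, j]
    return H, Smat
```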
The recognition of the difference between $V_N(R_{ik})$ and $V_Z(R_{ik})$ in the SCED/LCAO Hamiltonian assures that the environment-dependent feature will not disappear even for systems with no charge redistribution. The presence of the environment-dependent terms in the SCED/LCAO Hamiltonian for systems with no on-site charge redistribution affects the distribution of the electrons among the orbitals, even though the total charge associated with a given site is not changed. Therefore, the effect of the environment-dependency will be reflected in the band-structure energy, through the solution of the general eigenvalue equation corresponding to the SCED/LCAO Hamiltonian, as well as in the total energy. This feature, together with the self-consistency in the determination of the charge redistribution, provides the flexibility for the SCED/LCAO Hamiltonian to mimic the effect of electron screening.

According to the strategy given above, the framework of the proposed semi-empirical SCED/LCAO Hamiltonian will allow the self-consistent determination of the electron distribution at site $i$. The inclusion of environment-dependent multi-center interactions will provide the proposed Hamiltonian with the flexibility of treating the screening effect associated with electrons, which is important for the structural stability of narrow-band solids such as d-band transition metals, while at the same time handling the effect of charge redistribution for systems with reduced symmetry on an equal footing. Furthermore, as described above, the Hamiltonian is set up in such a way that the physics underlying each term in the Hamiltonian is transparent. Therefore, it will be convenient to trace the underlying physics for properties of a system under consideration when such a Hamiltonian is used to investigate a many-atom aggregate and predict its properties. The salient feature of our strategy is that, with the incorporation of all the relevant terms discussed previously, there is no intrinsic bias towards ionic, covalent, or metallic bonding in the proposed Hamiltonian.

The construction of the SCED/LCAO Hamiltonian depends critically on the database. If one can judiciously compile a systematic and reliable database, the scheme has the flexibility to allow the database to properly model the screening effect of the electrons in an atomic aggregate. Thus the strategy represents an approach that provides the appropriate conceptual framework to allow the chemical trend in a given atomic aggregate to determine the structural as well as electronic properties of condensed matter systems.

The total energy of the system consistent with the Hamiltonian described by Eqs. (2)–(6) is given by

$$E_{\text{tot}} = E_{BS} - E_{\text{corr}} + E_{\text{ion-ion}} \qquad (7)$$

where $E_{BS}$ is the band-structure energy, obtained by solving the general eigenvalue equation corresponding to the SCED/LCAO Hamiltonian, $E_{\text{corr}}$ is the correction for the double counting of the electron-electron interactions between the valence electrons in the band-structure energy calculation, and $E_{\text{ion-ion}}$ is the repulsive interaction between ions. Based on Eqs. (2)–(6), Eq. (7) can be rewritten as

$$E_{\text{tot}} = E_{BS} + \frac{1}{2}\sum_i \left(Z_i^2 - N_i^2\right) U - \frac{1}{2}\sum_{i,k\,(i \neq k)} N_i N_k V_N(R_{ik}) + \frac{1}{2}\sum_{i,k\,(i \neq k)} Z_i Z_k V_C \qquad (8)$$

with

$$V_C = \frac{e^2}{4\pi\varepsilon_0 R_{ik}} = \frac{E_0}{R_{ik}} \qquad (9)$$
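As a concrete illustration of how Eqs. (2)–(8) work together, the following is a minimal sketch of the self-consistency loop, reusing `sced_matrices` and `V_N` from the sketch above. The use of a Mulliken-style population analysis to extract the on-site populations $N_i$, and the simple linear charge mixing, are our assumptions for illustration, not prescriptions from Ref. [5].

```python
import numpy as np
from scipy.linalg import eigh

def scf_total_energy(pos, eps, Z, U, n_elec, mix=0.2, tol=1e-8, max_iter=200):
    """Iterate the populations N_i to self-consistency and evaluate Eq. (8)
    for a closed-shell toy system; reuses sced_matrices() from the sketch above."""
    Z = np.asarray(Z, dtype=float)
    N = Z.copy()                                  # start from neutral atoms
    nocc = n_elec // 2                            # doubly occupied levels
    for _ in range(max_iter):
        H, S = sced_matrices(pos, eps, Z, N, U)   # Eqs. (2)-(6)
        w, C = eigh(H, S)                         # general eigenvalue problem
        P = 2.0 * C[:, :nocc] @ C[:, :nocc].T     # closed-shell density matrix
        N_new = np.diag(P @ S)                    # Mulliken populations (assumed choice)
        if np.max(np.abs(N_new - N)) < tol:
            N = N_new
            break
        N = (1.0 - mix) * N + mix * N_new         # simple linear charge mixing
    E_BS = 2.0 * np.sum(w[:nocc])                 # band-structure energy
    # Eq. (8): remove the double counting and add the ion-ion repulsion
    R = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    off = ~np.eye(len(pos), dtype=bool)           # mask selecting i != k terms
    Rsafe = np.where(off, R, 1.0)                 # dummy diagonal, masked out below
    E0 = 14.4                                     # e^2/(4 pi eps0) in eV*Angstrom (assumed units)
    E_tot = (E_BS
             + 0.5 * np.sum((Z**2 - N**2) * U)
             - 0.5 * np.sum((np.outer(N, N) * V_N(Rsafe))[off])
             + 0.5 * np.sum((np.outer(Z, Z) * E0 / Rsafe)[off]))
    return E_tot, N
```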
For the MD simulation, the forces acting on the atoms in the atomic aggregate must be calculated at each MD step. The calculation of the band-structure contribution to the atomic forces can be carried out via the Hellmann–Feynman theorem. With the presence of terms involving $N_i$ and $N_k$ in the SCED/LCAO Hamiltonian (see Eqs. (5) and (6)), terms such as $\nabla_k N_i$, where $\nabla_k$ refers to the gradient with respect to $\vec R_k$, will appear in the electronic contribution to the atomic forces. However, these terms are canceled exactly by terms arising from the gradients of the second and the third terms in the total energy expression (Eq. (8)). Thus terms involving $\nabla_k N_i$ will not contribute to the calculation of atomic forces. This fact greatly simplifies the calculation of the atomic forces needed in MD simulations. In other words, if one disregards the extra time due to the self-consistency requirement, the calculation of atomic forces based on the SCED/LCAO Hamiltonian is no more difficult than in conventional TB approaches.

We have tested the SCED/LCAO Hamiltonian by investigating a variety of different structures of silicon (Si), including the bulk phase diagrams of Si, the equilibrium structure of an intermediate-size Si71 cluster, the reconstruction of the Si(100) surface, and the energy landscape for a Si monomer adsorbed on the reconstructed Si(111)-7×7 surface [5]. In all the cases studied, the results have demonstrated the robustness of the SCED/LCAO Hamiltonian. For example, results showing the binding energy versus relative atomic volume curves for the diamond, simple cubic (sc), body-centered cubic (bcc), and face-centered cubic (fcc) phases of silicon, obtained by using the SCED/LCAO Hamiltonian constructed for Si with our scheme, are presented in Fig. 1. Also shown in Fig. 1 are the corresponding curves obtained using three existing traditional (two-center and non-self-consistent) non-orthogonal tight-binding (NOTB) Hamiltonians [6–8], and two more recently developed non-self-consistent but environment-dependent Hamiltonians [1, 2]. All the curves (solid) are compared with the results obtained by DFT–LDA calculations [9].
[Figure 1 appears here: six panels of binding energy per atom versus relative atomic volume, labeled SCED, Menon & Subbaswamy, Bernstein & Kaxiras, Frauenheim et al., Wang et al., and NRL, each showing the cdia, sc, bcc, and fcc curves against the DFT–LCAO reference.]

Figure 1. The binding energy versus relative atomic volume curves for the diamond (cdia), the simple cubic (sc), the body-centered cubic (bcc), and the face-centered cubic (fcc) phases of silicon. Top-left panel: SCED/LCAO Hamiltonian. Top-central panel: [7]; top-right panel: [8]; bottom-left panel: [6]; bottom-central panel: [1]; bottom-right panel: [2]. All the curves (solid) are compared with the result obtained by a DFT–LDA calculation [9].
It can be seen that while the results obtained with the traditional two-center Hamiltonians fail for the high-pressure phases, those obtained using Hamiltonians with environment-dependent terms give much better agreement for those phases. This is an indication of the importance of including environment-dependent effects in the Hamiltonian, even for single-element extended crystalline phases where there is no charge redistribution. However, the most striking message conveyed by Fig. 1 is how well our result compares with the DFT–LDA results for all the extended crystalline phases, at both low and high pressures. It indicates that the SCED/LCAO Hamiltonian has the capacity and the flexibility to capture the environment-dependent screening effect under various local configurations. The framework of the SCED/LCAO Hamiltonian outlined in Eqs. (2)–(6) is very flexible. It can be conveniently extended to include spin-polarization effects and to construct SCED/LCAO Hamiltonians for heterogeneous systems in terms of the parameters of the SCED/LCAO Hamiltonians of their constituent elemental systems. Work along these lines is in progress.
Acknowledgment This work was supported by NSF (DMR-0112824 and ECS-0224114) and DOE (DE-FG02-00ER4582).
References [1] C.Z. Wang, B.C. Pan, and K.M. Ho, "An environment-dependent tight-binding potential for Si," J. Phys.: Condens. Matter, 11, 2043–2049, 1999. [2] D.A. Papaconstantopoulos, M.J. Mehl, S.C. Erwin et al., "Tight-binding Hamiltonians for carbon and silicon," In: P.E.A. Turchi, A. Gonis, and L. Colombo (eds.), Tight-Binding Approach to Computational Materials Science, MRS Symposia Proceedings No. 491, Materials Research Society, Pittsburgh, pp. 221–230, 1998. [3] K. Esfarjani and Y. Kawazoe, "Self-consistent tight-binding formalism for charged systems," J. Phys.: Condens. Matter, 10, 8257–8267, 1998. [4] Th. Frauenheim, G. Seifert, M. Elstner et al., "A self-consistent charge density-functional based tight-binding method for predictive materials simulations in physics, chemistry, and biology," Phys. Stat. Sol. (b), 217, 41–62, 2000. [5] C. Leahy, M. Yu, C.S. Jayanthi, and S.Y. Wu, "Self-consistent and environment-dependent Hamiltonians for materials simulations: case studies on Si structures," http://xxx.lanl.gov/list/cond-mat/0402544, 2004. [6] Th. Frauenheim, F. Weich, Th. Köhler et al., "Density-functional-based construction of transferable nonorthogonal tight-binding potentials for Si and SiH," Phys. Rev. B, 52, 11492–11501, 1995. [7] M. Menon and K.R. Subbaswamy, "Nonorthogonal tight-binding molecular-dynamics scheme for Si with improved transferability," Phys. Rev. B, 55, 9231–9234, 1997. [8] N. Bernstein and E. Kaxiras, "Nonorthogonal tight-binding Hamiltonians for defects and interfaces," Phys. Rev. B, 56, 10488–10496, 1997. [9] M.T. Yin and M.L. Cohen, "Theory of static structural properties, crystal stability, and phase transformations: applications to Si and Ge," Phys. Rev. B, 26, 5668–5687, 1982.
INDEX OF CONTRIBUTORS (Article/Perspective numbers are given in bold.) Abraham, F.F. P20, 2793 Alexander, F.J. 8.7, 2513 Aluru, N.R. 8.3, 2447 Angelis, F. de 4, 59 Artacho, E. 1.5, 77 Asta, M. 1.16, 349 Averback, R. 6.2, 1855 Bammann, D.J. 3.2, 1077 Barmak, K. 7.19, 2397 Baroni, S. 1.1, 195 Bartlett, R.J. 1.3, 27 Battaile, C. 7.17, 2363 Bazant, M.Z. 4.1, 1217; 4.1, 1417 Bernstein, N. 2.24, 855 Binder, K. P19, 2787 Blöchl, P.E. 1.6, 93 Boghosian, B.M. 8.1, 2411 Boon, J P. P21, 2805 Boyd, I.D. P22, 2811 Bulatov, V.V. P7, 2695 Caflisch, R. 7.15, 2337 Cai, W. 2.21, 813 Car, R. 4, 59 Carloni, P. 1.13, 259 Carter, E.A. 1.8, 137 Catlow, C.R.A. 6.1, 1851; 2.7, 547 Ceder, G. 1.17, 367; 1.18, 395 Chadwick, A.V. 6.5, 1901 Chan, H S. 5.16, 1823 Chelikowsky, J.R. 1.7, 121 Chen, I-W. P27, 2843 Chen, L-Q. 7.1, 2083 Chen, S-H. P28, 2849 Chipot, C. 2.26, 929 Ciccotti, G. 2.17, 745; 5.4, 1597 Cohen, M.L. 1.2, 13 Corish, J. 6.4, 1889 Coveney, P.V. 8.5, 2487 Crocombette, J-P. 2.28, 987 Crowdy, D. 4.1, 1417
Csányi, G. P16, 2763 Cuong, N.N. 4.15, 1529 de Koning, M. 2.15, 707 De Vita, A. P16, 2763 Dellago, C. 5.3, 1585 Doll, J.D. 5.2, 1573 Doyle, P.S. 9.7, 2619 Eggers, J. 4.9, 1403 Español, P. 8.6, 2503 Evans, D.J. P17, 2773 Evans, J.W. 5.12, 1753 Falk, M.L. 4.3, 1281 Farkas, D. 2.23, 839 Först, C.J. 1.6, 93 Fredrickson, G H. 9.9, 2645 Frenkel, D. 2.14, 683 Gale, J.D. 2.3, 479; 1.5, 77 Galli, G. P8, 2701 Ganesan, V. 9.9, 2645 García, A. 1.5, 77 Gear, C.W. 4.11, 1453 Germann, T.C. 2.11, 629 Geva, E. 5.9, 1691 Ghoniem, N M. 7.11, 2269; P11, 2719; P30, 2871 Giannozzi, P. 1.1, 195; 4, 59 Gillespie, D.T. 5.11, 1735 Gilmer, G. 2.1, 613 Goddard, W.A. III. P9, 2707 Gro, A. 5.1, 1713 Gumbsch, P. P10, 2713 Gygi, F. P8, 2701 Hadjiconstantinou, N.G. 8.1, 2411; 8.8, 2523 Hirth, J.P. P31, 2879 Ho, K.M. 1.15, 307 Hoffman, W.P. P37, 2923 Hoover, W.G. P34, 2903 Horstemeyer, M.F. 3.1, 1071; 3.5, 1133 Hou, T.Y. 4.14, 1507 Huang, H. 2.3, 1039
2944 Hummer, G. 4.11, 1453 Islam, M.S. 6.6, 1915 Jang, S. 5.9, 1691 Jayanthi, C.S. P39, 2935 Jeanloz, R. P25, 2829 Jensen, P. 5.13, 1769 Jin, X. 8.3, 2447 Jin, Y.M. 7.12, 2287 Joannopoulos, J.D. P4, 2671 Junquera, J. 1.5, 77 Justo, J.F. 2.4, 499 Kaburaki, H. 2.18, 763 Kalia, R.K. 2.25, 875 Kapral, R. 2.17, 745; 5.4, 1597 Karma, A. 7.2, 2087 Kästner, J. 1.6, 93 Katsoulakis, M.A. 4.12, 1477 Kaxiras, E. 2.1, 451; 8.4, 2475 Kerans, R.J. P38, 2929 Kevrekidis, I.G. 4.11, 1453 Khachaturyan, A.G. 7.12, 2287 Khraishi, T.A. 3.3, 1097 Kim, S G. 7.3, 2105 Kim, W.T. 7.3, 2105 Klein, M.L. 2.26, 929 Kob, W. P24, 2823 Kofke, D.A. 2.14, 683 Korkin, A. 1.3, 27 Kremer, K. P5, 2675 Krill, C.E., III. 7.6, 2157 Kubin, L.P. P33, 2897 Landau, D.P. P2, 2663 Langer, J.S. 4.3, 1281; P14, 2749 Leahy, C. P39, 2935 LeSar, R. 7.14, 2325 Li, G. 8.3, 2447 Li, J. 2.8, 565; 2.19, 773; 2.31, 1051 Li, X. 4.13, 1491 Lignères, V.L. 1.8, 137 Lookman, T. 7.5, 2143 Louie, S.G. 1.11, 215 Lowengrub, J. 7.8, 2205 Lu, G. 2.2, 793 MacKerell, A.D., Jr. 2.5, 509 Magistrato, A. 1.13, 259 Mahadevan, L. 9.1, 2555 Margetis, D. 4.8, 1389 Marin, E.B. 3.5, 1133 Maroudas, D. 4.1, 1217 Martin, G. 7.9, 2223 Martin, R.M. 1.5, 77 Marzari, N. 1.1, 9; 1.4, 59 Mattice, W.L. 9.3, 2575 Mavrantzas, V.G. 9.4, 2583 McDowell, D.L. 3.6, 1151; 3.9, 1193
Index of contributors Mehl, M.J. 1.14, 275 Metiu, H. 5.1, 1567 Miller, R.E. 2.13, 663 Milstein, F. 4.2, 1223 Mishin, Y. 2.2, 459 Montalenti, F. 2.11, 629 Morgan, D. 1.18, 395 Moriarty, J.A. P13, 2737 Morris, J.W., Jr. P18, 2777 Mountain, R.D. P23, 2819 Müller, M. 9.5, 2599 Nakano, A. 2.25, 875 Needleman, A. 3.4, 1115 Nitzan, A. 5.7, 1635 Nordlund, K. 6.2, 1855 Odette, R.G. 2.29, 999 Ogata, S. 1.2, 439 Olson, G.B. P3, 2667 Ordejón, P. 1.5, 77 Pakula, T. P35, 2907 Pande, V. 5.17, 1837 Pankratov, I.R. 7.1, 2249 Papaconstantopoulos, D.A. 1.14, 275 Pask, J.E. 1.19, 423 Patera, A.T. 4.15, 1529 Payne, M. P16, 2763 Pechenik, L. 4.3, 1281 Peiró, J. 8.2, 2415 Phillpot, S.R. 2.6, 527; 6.11, 2009 Potirniche, G.P. 3.5, 1133 Powers, T.R. 9.8, 2631 Raabe, D. 7.7, 2173 Radhakrishnan, R. 5.5, 1613 Ratsch, C. 7.15, 2337 Ray, J.R. 2.16, 729 Reinhard, W.P. 2.15, 707 Reuter, K. 1.9, 149 Rickman, J.M. 7.14, 2325; 7.19, 2397 Rubio, A. 1.11, 215 Rudd, R.E. 2.12, 649 Rutledge, G.C. 9.1, 2555 Sakai, T. 8.5, 2487 Sánchez-Portal, D. 1.5, 77 Sauer, J. 1.12, 241 Saxena, A. 7.5, 2143 Scheffler, M. 1.9, 149 Schulten, K. 5.15, 1797 Schwartz, S.D. 5.8, 1673 Selinger, R.L.B. 2.23, 839 Sepliarsky, M. 2.6, 527 Sergi, A. 2.17, 745; 5.4, 1597 Sethian, J.A. 4.6, 1359 Shelley, M.J. 4.7, 1371 Shen, C. 7.4, 2117 Sherwin, S. 8.2, 2415
Index of contributors Sierka, M. 1.12, 241 Sierou, A. 9.6, 2607 Smith, G.D. 9.2, 2561 Soisson, F. 7.9, 2223 Soler, J.M. 1.5, 77 Sornette, D. 4.4, 1313 Srolovitz, D.J. 7.1, 2083; 7.13, 2307 Stachiotti, M.G. 2.6, 527 Stampfl, C. 1.9, 149 Stanley, E.H. P36, 2917 Sterne, P.A. 1.19, 423 Stone, H.A. 4.8, 1389 Stoneham, M. P12, 2731 Succi, S. 8.4, 2475 Tadmor, E.B. 2.13, 663 Tajkhorshid, E. 5.15, 1797 Tang, M. 2.22, 827 Tarek, M. 2.26, 929 Taylor, DeCarlos E. 1.3, 27 Theodorou, D.N. P15, 2757 Thompson, C.V. P26, 2837 Tornberg, A.K. 4.7, 1371 Torquato, S. 4.5, 1333; 7.18, 2379 Trout, B.L. 5.5, 1613 Tuckerman, M.E. 2.9, 589 Uberuaga, B.P. 5.6, 1627; 2.11, 629 Underhill, P.T. 9.7, 2619 Vaks, V.G. 7.1, 2249 Van de Walle, A. 1.16, 349 Van de Walle, C.G. 6.3, 1877
2945 Van der Giessen, E. 3.4, 1115 Van der Ven, A. 1.17, 367 Vashishta, P. 2.25, 875 Veroy, K. 4.15, 1529 Vitek, V. P32, 2883 Vlachos, D.G. 4.12, 1477 Voter, A.F. 2.11, 629; 5.6, 1627 Voth, G.A. 5.9, 1691 Voyiadjis, G.Z. 3.8, 1183 Vvedensky, D.D. 7.16, 2351 Wahnström, G. 5.14, 1787 Wallace, D.C. P1, 2659 Wang, C.Z. 1.15, 307 Wang, Y. 7.4, 2117 Wang, Y.U. 7.12, 2287 Weinan, E. 4.13, 1491; 8.4, 2475 Wijesinghe, H S. 8.8, 2523 Wirth, B.D. 2.29, 999 Wolf, D. 6.7, 1925; 6.9, 1953; 6.1, 1985; 6.11, 2009; 6.12, 2025; 6.13, 2055 Woo, C.H. 2.27, 959 Woodward, C. P29, 2865 Wu, S.Y. P39, 2935 Xiang, Y. 7.13, 2307 Yip, S. 2.1, 451, 613; 6.7, 1925; 6.8, 1931; 6.11, 2009 Yu, M. P39, 2935 Zbib, H.M. 3.3, 1097 Zhu, F. 5.15, 1797 Zikry, M. 3.7, 1171
INDEX OF KEYWORDS ab-initio 1877, 2671, 2687, 2865 ab initio calculations 13, 349, 423, 2823 ab initio molecular dynamics 9, 59, 93, 195, 259, 349, 2701 ab initio potentials 1901 ab initio pseudopotential 121 abnormal grain growth 2157 accelerated molecular dynamics 629 acceptors 1877 acoustic emissions 1313 activated processes 1613 activation barrier 1281, 2223 activation energy 773, 1985, 2055 activation free energy 259 activation volume 1281 active sites 241 adaptive simulation 2675 adatom diffusion 613, 2337 adiabatic energy surfaces 2731 adsorption resonances 1713 aggregation 1769 AgI 1901 AIM 2667 Al(Zr,Sc) 2223 alkali metals 1223 all-atom-models 2675 all-electron potential 121 Allen-Cahn equation 2157 alumina 479 aluminum in melting 2009 Alzheimer’s disease 259 AMBER 509, 2561 amorphization 987, 2009 amorphous 1953, 1985 amorphous carbon 307 amorphous cement model 1953, 1985 amorphous GaAs static structure factor 875 amorphous polymers 1281 amorphous solid water 2917 amorphous solids 1901, 2055, 2749 amorphous structure of high-energy grain boundaries 1953, 1985
amphiphilic fluids 2411, 2487 Andrade law 1313 angular-dependent forces 459 angular-force multi-ion potential 2737 anharmonic 349 anharmonic correction 1877 anisotropic diffusion 959 anisotropic grain growth 2157 anisotropic long-range interaction 2143 anisotropic media 729 anisotropy 2173 annealing stages 1855 annealing temperature 613 anticancer drugs 259 antifreeze proteins 1613 antiphase domain boundaries 2787 antisymmetrically coupled environmental modes 1673 AOT microemulsion 2849 a posteriori error estimation 1529 aquaporins (AQPs) 1797 argon 1635 Arrhenius 1635 Arrhenius dynamics 1477 Arrhenius factor 1735 Arrhenius plot 1985 a-SiO2 structure factor 875 asymmetric GB 1953 asymmetric tilt GB (ATGB) 1953 asymmetrical grain boundaries 1931 asymmetry parameter 2223 ATGB cusp 1953 atomic configuration 613 atomic displacement 349 atomic displacements at interfaces 1931, 2055 atomic force field 1837 atomic hypothesis 451 atomic jump 2223 atomic jump frequency 1787 atomic-level geometry of grain boundaries 1953
2948 atomic mechanism [of diffusion] 1787 atomic misfit energy 2117 atomic pseudopotential wave functions 121 atomic simulations 929 atomic structure 1953, 1985 atomic transport 1851 atomistics 649 atomistic/continuum coupling 663 atomistic description 2523 atomistic (modeling) 2757, 2763 atomistic picture 959 atomistic simulation 451, 793, 1837, 1931, 2737 atomistic simulation of interfaces 1925 atomistic simulation of melting 2009 atomistic thermodynamics 149 automatic model adaptation 663 Bain strain 2777 Bain transformation 1223 ballistic jumps 2223 bands 13 bandgap problem 215 barrier crossing 1635 basin constrained molecular dynamics 629 basis function 2447 basis-sets 93 bcc lattice 827, 1953 bcc metals 2777, 2865, 2883 Beeman algorithm 565 Beevers–Ross site 1901 bending elasticity 2631 Bethe–Salpeter equation 215 Bhatnagar–Gross–Krook model 2805 bias potential 629 bias sampling 613, 1613 biased random walk 1635 bicontinuous morphologies 2849 bicrystal 1953, 1985 bicrystal melting 2009 bifurcation 1223 biharmonic functions 1417 bilayers 2631 bimolecular modeling 259 bimolecular reactions 1735 binary collision approximation 987 binary mixture 2787 binding energy 1877, 2659 biomolecules 1613 biosystems 2731 Bloch walls 2787 Bloch-periodic boundary conditions 423 Bloch’s theorem 423 block copolymers 2599 Blue Moon ensemble 1597
Index of keywords boiling point 395 Boltzmann equation 2411, 2513 Boltzmann factor 613 Boltzmann H theorem 2487 bonds 13 bond-angle distribution function 1985 bond breaking 2763 bond fluctuation model 2599 bond-order 499 bond-order potentials 2737 bond orientational order parameter 1613 border conditions 1931 Born effective charges 195, 479 Born stability criteria 2009 Born-Oppenheimer approximation 59, 195 Born–von K´arm´an model 349 Bortz algorithm 1753 boundary 1491 boundary conditions 1931, 2475 boundary integral methods 2205 boundary layer 1389 boundary-tracking 2157 Bravais lattice 1953 Bredig transition 1901 brittle 855 brittle fracture 1417 brittle-to-ductile transition 839, 773, 855 broken-bond model 1953 Brønsted acidic sites 241 Brownian dynamics 649, 2757, 2607, 2619 buckyball 2923 bulk free surface 2025 bulk interfaces 1925, 2025 bulk melting point 1985 bulk modulus 855, 2009 Burgers vector 2307 C60 307, 1627 calorimetric 1823 canonical d-bands 2737 canonical ensemble 613 capillary waves 2787 carbon 855, 2923 carbon nanotubes 215 carbon-carbon composite 2923 carboxylate bridged binuclear motif 259, 1877 Car–Parrinello molecular dynamics 59, 259, 1877 cascade 987 CASCADE 1889, 1901 cascade damage 959 CASTEP 1901 catalysis 241, 395, 1753 Cauchy relation 2025
Index of keywords Cauchy–Born rule 663 cellular automata 2351 cellular automata 2687 central limit theorem 1635 central symmetry parameter 1051 centroid molecular dynamics 1691 ceramic composites 527, 875, 2929 certification 1529 CGMD 649 chain entropy 2675 Chapman Kolmogorov equation 1635 Chapman–Enskog analysis 2475, 2487 charge localisation 2731 charge transfer 2731 charged state 1877 CHARMM 509, 2561 chemical-bonding changes 2829 chemical Fokker–Planck equation 1735 chemical Langevin equation 1735 chemical master equation 1735 chemical potential 349, 407, 707, 1389, 1877, 2645 chemical rates 59, 149, 1573, 1585 chemically reacting flows 2475 chemically synthesized quantum dot structures 875 cisplatin-DNA adduct 259 classical limit 349 classical molecular dynamics 59, 349, 451 classical potentials – DFT comparisons 27 clathrate hydrates 1613 cleavage anisotropy 855 climb 2307 clipped random wave model 2849 closure 1477 closure on demand 1453 cluster 241, 349, 1851, 1877 cluster dynamics 2223 cluster expansion 349, 367 cluster migration 241, 2223 coarse bifurcation (algorithms/computations) 1453 coarse grained models 2675 coarse projective integration 1453 coarse time stepper(s) 1453 coarse-grain models 929 coarse-grained molecular dynamics 649 coarse-grained Monte Carlo 1477 coarse-grained polycrystal 2055 coarse-grained statistical model 349 coarse-grained stochastic models 1477 coarse-graining 649, 1477, 1613, 2083, 2325, 2351 2503, 2599, 2645, 2687, 2757 coarsening 2117, 2205 coherency strain energy 2117
2949 coherent 2025 coherent interfaces 1925 coherent phase diagram 2117 coherent potential approximation 349 coherent precipitation 2117 coherent treatment 2935 coherent-twin boundary 1953, 2055 cohesive zone model 855 coincident-site lattice (CSL) 1953 collective damage 1313 collective diffusion model 1797 colloidal fluids 2503, 2607, 2645 combined quantum mechanics–interatomic potential functions method 241 commensurate GB 1953 commensurate interfaces 1925, 2025 committor(s) 1585 COMPASS 2561 compensation 1877 complex fluids 1371, 2411, 2487, 2503, 2675 complex Langevin sampling 2645 complex systems 1217, 1453, 1475 component simulation 2713 composite 2173, 2379 composition-modulated superlattice 2025 compressibility 745 compressibility in melting 2009 computational efficiency 2907 computational fluid dynamics 2811 computational materials design facility 2707 computational nanoscience 2701 computer-aided analysis 1453 concerted rotation 2583 concerted rotation Monte Carlo 2757 concurrent 2929 concurrent material/component design 2929 concurrent multiscale simulation 649 condensed matter 137, 2659 condensed phase systems 1597 conditional convergence 813 conditional probability 1635 conditional reaction probability 1735 conductivity 1333, 1877, 1901 configuration 349 configuration coordinate diagram 1877 configuration phase space 613 configurational and displacive states 349 configurational arrangement 349 configurational bias 2583 configurational bias Monte Carlo 2757 configurational disorder 367 confined liquid 1985 conformal map 1417 conformational partition function 2575
2950 conjugate gradients 2671 conjugated polymers 215 conservation laws 1491 consistency 2415 constant pressure molecular dynamics 589 constant-coverage algorithms 1753 constitutive equations 1071 constitutive formulations 2897 constitutive law 745 constitutive theories 1281 constrained equations 745 constrained equilibrium 149 constrained phase space flow 745 continuity equation 745 continuous Markov process 1735 continuous-orientation model 2157 continuous-time random-walk (CTRW) model 1797 continuum 2173 continuum approach [to modeling diffusion] 1787 continuum description 2523 continuum limit 2351 continuum mechanics 1529, 2903 convergence 2415 convex hull 349 cooperative rearrangement 2907 cooperativity 1823, 2907 coordinate scaling 707 coordination number 1051 copolymers 2645 copper 629, 1223 copper100 629 core energy 813 core properties 793 core-shell nanoparticles 875 correlated events 629 correlation factor 1855 correlation function 1333, 1635, 1673, correlation length 2787 correlation time 1635 correspondence relation 649 coupling method 2523 covalent bonding 499 covalent solids 451 cracks 839, 2287 crack dynamics 2475 crack propagation 2087, 2763 crack propagation in amorphous SiO2 875 crack tip 773 crack tip plasticity 839 creep 959, 1313, 2719 critical behavior 1613 critical point 2787
Index of keywords criticality 1313 cross-slip 793, 2307, 2897 cross-validation 395 crowdion 1855 crown thioether complexes of rhenium and technetium 259 crystal growth 613, 629 crystal interface 1925 crystal plasticity code 2897 crystal structure 395, 423, 2829 crystal symmetry 1223 crystal-growth simulation 2055 crystallography 2173 crystal-to-amorphous transition 2009 C-S bond cleavage 259 CSL misorientation 1953 curvature 1039, 2837 curvature-driven grain growth 2157 cusped orientation 1953 CVD diamond 2829 cyclic plasticity 1193 DAD effect 959 damage 1071, 1183 Damkohler number 2475 data management and mining 875 dislocation dynamics simulations 2897 deposition 1039 dissipative particle dynamics (DPD) 2503 de novo 2707 deacylation of peptide 259 Debye correlation function 2849 decoherence 2731 defects 241,1855, 1877, 1915, 2269, 2737, 2871 defect annealing 1855 defect calculations 547 defect concentration 1855 defect-dependent properties 1851 defect diffusion 613, 1855 defect entropies 1889 defect-induced melting 2009 defects in amorphous materials 1855 defects in intermetallics 1855 deformation 1281, 2173 deformation gradient tensor 439, 1133 deformed materials 1217 degrees of freedom (DOF) of a grain boundary 1953 deNOx process 241 density field 2645 density functional theory (DFT) 9, 59, 93, 121, 137, 149, 195, 349, 423, 439, 451, 855, 1613, 1877, 1889, 2737
Index of keywords density of CSL sites 1953 density of states 683, 1823 deposition 2363 deprotonation energy 241 design 2667 detailed balance 613, 1477, 1585, 1753 deterministic dynamics 613 diamond 2923 diamond-anvil cell 2829 diamond nanoparticle 307 diamond structure 855 diamond surface 307 diamond-to-graphite transition 307 dielectric breakdown 1417 dielectric tensor 195 differential displacement map 773 diffraction 1713 diffuse interface 2087, 2117 diffusion 629, 1627, 2117 diffusion coefficient 1635, 1691, 1901, 1931, 2055 diffusion controlled reactions 1635 diffusion equation 1635, 1787 diffusion mechanism 2223 diffusion permeability 1797 diffusional phase-transition 2205 diffusional width 1985 diffusion-limited aggregation 1417 diiron proteins 259 dimanganese proteins 259 dimensional analysis 565 dimer method 629, 1627 dimer-TAD 629, 1627 diomimetic compound of methane monooxygenase 259 direct correlation function 1613 direct force method 349 direct simulation Monte Carlo (DSMC) 2411, 2811, 2487, 2523 directed end-bridging 2583 directed internal bridging 2583 disclinations 2923 discrete simulation automata (DSA) 2411, 2487 discrete-orientation model 2157 discretization 2415 dislocation 793, 813, 839, 855, 1077, 1098, 1115, 2083, 2269, 2843, 2865, 2871 dislocation core energy 773 dislocation density tensor 2325 dislocation dipole 2325 dislocation dynamics 813, 827, 2083, 2269, 2307, 2325, 2871 dislocation dynamics simulation, 3D 2695 dislocation GB 1953, 1985
2951 dislocation generation 2897 dislocation microstructure 827 dislocation nucleation 773 dislocation patterns 2897 dislocation theory 2695 dislocation-pressure interaction 2879 disorder at interfaces 1931 disorder temperature 2749 disordered crystalline alloy 349 disordered materials 349, 1217 dissipation 1151, 1281, 1635, 2773 dissipative particle dynamics (DPD) 2411, 2757 dissociation 1713 distributed computing 1837 distributed damage 2929 distribution of grain sizes 2837 distribution of nucleation sites 2397 dividing surface 629 DL POLY code 1901 DNA 77, 2619 DNA binding 259 domain 2083, 2843 domain wall, 90◦ 2843 domain wall, 180◦ 2843 doping 1877 double bridging 2583 double bridging Monte Carlo 2757 DREIDING 2561 drift 1635 drift-diffusion process 1635 driven alloys 2223 DSC lattice 1931 d-state directional bonding 2737 ductile 855 due ferro 1 (DF1) 259 dumbbell 1855 dynamic density functional theories 2757 dynamic fracture in n-Si3 N4 875 dynamic heterogeneities 2917 dynamic lattice liquid 2907 dynamical matrix 195, 349, 459, 649 dynamical scaling hypoethesis 2351 dynamics 1491, 2083 EAM potential 1931, 2055 Earth’s interior 2829 edge 1098, 1115, 2307 edge diffusion 2337 EDTB potential for molybdenum 307 EDTB potential for Si 307 Edwards–Wilkinson equation 2351 effective cluster interaction 349 effective Hamiltonian 349, 2645 effective lifetime 959
2952 effective medium theory 2737 effective potential 499, 1823 effective temperature 1281, 2749 eigenvalue 349 eigenvector following [method] 1573 Einstein crystal 349 elastic anomalies at interfaces 2025 elastic compatibility 2143 elastic constants 275, 459, 479, 729, 773, 1223, 1333, 2025 elastic effect 349 elastic energy 2117 elastic instability 2009, 2777 elastic stability 1223 elastic stiffness 773 elastic wave 649 elastically-mediated interaction 349 elasticity 649, 1371, 1529 elastodiffusion 959 electro-migration 1417 electronegativity equalisation 479 electron-hole interaction 215 electronic 2829 electronic band energy 349 electronic chemical potential 349 electronic density of states 349 electronic entropy 307, 349 electronic excitation 349 electronic excited states 2731 electronic free energy 349 electronic structure 13, 93, 137, 149, 349, 423, 1889 electronic structure, coarse grained 2737 electrostatic effect 349, 423, 479 element modeling 649 elliptic equations 2415 embedded atom method (EAM) 459, 1223, 1953, 1985, 2025, 2737 embrittlement 2719 empirical potential 499, 509, 855 empirical valence bond (EVB) method 241 end bridging Monte Carlo 2757 end-bridging 2583 end-of-life 2929 end-to-end distance 2575 end-to-end vector 2575 energy barrier 1585 energy cusp 1953, 2055 energy density functionals 137 energy function 509 energy localisation 2731 energy minimization 855, 1931 energy release rate 855 energy transfer 1713, 2731 ensemble 729, 2645
Index of keywords ensemble average 349, 707, 763 entropic elasticity 2631 entropy 349, 707, 1931 entropy production 707 environment dependence 2935 environmental effects on fracture 875 environment-dependent tight-binding (EDTB) potential 307 enzyme 1613 epitaxial growth 2337 equation of continuity 763 equation of state 2599 equation-free modeling 1453 equilibrium ensembles 589 equilibrium Monte Carlo 613 equilibrium properties 613 equilibrium volume 349 ergodic hypothesis 613 ergodic systems 1931 error bounds 1529 Eshelby’s virtual procedures 2117 Euler–Lagrange equation 649 evaporation 1769 evaporation–condensation 1389 evolution 1823, 2083 Ewald method 479, 1889 exact enumeration 1823 EXAFS 1901 excess energy 1985 excess energy profile 1985 exchange process 1627 exchange-correlation potential 1877 exciton 2731 excitonic effects 215 explicit solvent model 1837 explicit tau-leaping simulation procedure 1735 external free surface 1953 facets 1389 fast ion conductors 1901 fast marching methods 1359 fast multipole method 875 fast sweeping method 2307 fatigue 1071, 1193 fcc lattice 1953 fcc metals 2777 Fe(Cu) 2223 Fe(Nb,C) 2223 f-electron metals 2737 femtosecond laser pulse 307 FENE 2619 Fermi distribution 349 Fermi level 349 fermi surface 275
Index of keywords ferroelectrics 349, 527, 2843 Feynman propagators 1673 fiber-bundle models 1313 film growth 629 finite difference 2415 finite difference method 423 finite difference time domain 2671 finite element method 121, 423, 649, 663, 1529, 2173, 2415, 2663 finite size effects 2787 finite temperature 349, 1491 finite volume 2415 finite-time singularity 1313 Finnis–Sinclair model 2737 first generation radiopharmaceutical compounds 259 first principles 149, 349, 367, 423, 1877, 2707, 2865 first wall 2719 first-principles density functional theory 275 fitting 479 ‘flip’ instability 2777 Flory–Huggins theory 2599 flow defect 2749 flow in porous media 1507, 2487 fluctuation theorem 2773 fluctuation-dissipation theorem 1635, 2411 fluctuations 1635, 1735 fluid binary mixture 2787 fluid dynamics 2411 fluid flow 1507, 1529 fluid permeability 1333 fluorite structure 1901 flux 1635, 1787 flux autocorrelation function 1673 Fokker–Planck equation 1635, 2503 folding temperature 1823 force constant 349, 1877 force field 2561, 2707 forced oscillator model 1713 formation energies 1877 formation enthalpy 1855 formation entropy 1855 formation volume 1855 Fourier transforms 121, 423 Fourier’s law of thermal conduction 763 fractal 1417 fracture 839, 855, 1071, 1171, 2793 Frank–Read source 827, 2307 free energy 613, 683, 707, 1585, 1613 free volume 1281, 1953, 1985 free-energy perturbation 683 freezing 1613 Frenkel defects 1901 friction coefficient 1635
front-tracking 2837
fuel cells 1915
functional integral 2645
funnel sampling 683
fusion energy 2719
fusion reactors 999
GaAs phonon dispersion 875
gamma-surface 855, 2883
Gaussians 121
Gaussian (“normal”) distribution 1635
Gaussian chains 2599
Gaussian curvature 2849
Gaussian Markovian stochastic process 1635
Gaussian random field 2849
Gaussian random force 1635
Gaussian stochastic process(es) 1635
Gaussian variable 1635
Gaussian white noise 1735
general grain boundary 1953, 2055
generalized configurational bias 2583
generalized gradient approximation 439, 1877
generalized Langevin equation 1673
generalized pseudopotential theory 2737
generalized stacking fault (GSF) energy 793
GENERIC framework 2503
genetic algorithm 307, 547, 1613
geometric glide plane 2695
germanium 855
ghost forces 663
Gibbs–Duhem integration 683
Gibbs ensemble 683
Gibbs–Feynman identity 707
Gibbs free energy 2009
Ginzburg–Landau framework 2143
glass transition 1281, 1985
glass transition temperature 1823
glasses 2823
glide 1098, 1115, 2307
global climatic changes 1613
global minimum 613
global structure optimization 307
go model / go-like 1823
GPT 2737
gradient thermodynamics 2117
grain area distributions 2397
grain boundary (GB) 547, 1039, 1925, 1931, 1953, 1985, 2025, 2055, 2083
grain boundary area 2157
grain boundary diffusion 1953, 1985, 2055
grain boundary diffusion creep 1985, 2055
grain boundary energy 1953, 1985, 2025, 2055, 2157
grain boundary fracture 1953
grain boundary inclination 2157
grain boundary migration 1985
grain boundary misorientation 2157
grain boundary mobility 1985, 2157
grain boundary self-diffusion 1985
grain boundary sliding 2055
grain boundary sliding and migration 1931
grain boundary superlattice (GBSL) 2025
grain boundary width 2157
grain growth 1985, 2837
grain growth exponent 2157
grain junction 1953, 1985, 2055
grain shape 2055
grain size 2055, 2157
Granato model 1855
grand canonical ensemble 1931
graphite 2923
gray scale operations and filters 2397
Green’s function 1877, 2117
Green–Kubo formula 745, 763
grid 2415
grid computing 875
grid methods 121
Griffith criterion 855
GROMOS 509
Grotthuss mechanism 1901
ground state 349
growth 1769, 2117
Grüneisen parameter 349
GULP 1889, 1901
GW approximation 215
GW bandgap 215
GW-BSE approach 215
HADES 1889, 1901
Hamiltonian 2935
Hamiltonian matrix 121, 307
Hamilton’s equations of motion 763
handshake regions 875
hardening 1281
hardness 855
hard-sphere model 1953
harmonic approximation 349
harmonic functions 1417
harmonic transition state theory 629
Hartree potential 121
Hartree–Fock 1877
heat capacity 1823
heat flux 763
heat of formation 275
Hele–Shaw 1417
Helfrich free energy 2849
Hellmann–Feynman theorem 59, 195
Helmholtz 1529
Helmholtz inequality 707
Herring relation 2055
heterogeneous and homogeneous melting 2009
heterogeneous catalysis 149
heterogeneous materials 1217, 1333
heterogeneous microstructure development 959
heterogeneous multiscale method 1491
heteropolymer 1823
hexatic phase 1613
hierarchical modeling 649
hierarchical processes 2731
high friction limit 1635
high pressure 2737, 2829
high temperature limit 349
high-angle GB 1953, 1985, 2055
high-energy GB 1953, 1985, 2055
higher order finite difference expansions 121
high-temperature relaxed structure 1953, 1985
Hillert model 2157
histogram reweighting 683
histograms 1613
holonomic constraints 745
homogeneous 1613
homogeneous system 1635, 2025
homogenization 1333, 2325, 2379
HP model 1823
hybrid algorithms 1753
hybrid Car–Parrinello/MM molecular dynamics 259
hybrid energy 349
hybrid FE/MD scheme 875
hybrid FE/MD/QM scheme 875
hybrid formulation 2523
hybrid MD/QM scheme 875
hybrid multiscale modeling 649, 2763
hybrid quantum mechanics/molecular mechanics methods 241
hybrids 2929
hydrocarbons conversion 241
hydrodynamic interaction 2607, 2619
hydrodynamics 1403, 2513, 2523
hydrogen 1877
hydrogen bonding 1613
hydrogen tunneling 259
hydrophobic 1613
hydrostatic (compression/expansion) 1223, 2009
hyperbolic equations 2415
hyperdynamics 629
hyperfine parameters 1877
hypersonic flows 2811
ideal strength 439, 855, 2777
image analysis 2397
image force effect 2287
immiscible fluids 2487, 2503
implicit solvent model 1837
importance sampling 349, 613, 2583
impurities 1877
In melts 1985
InAs/GaAs square nanomesa 875
incoherent 2025
incubation 1193
incubation time for nucleation 2223
indented surface of Si3N4 875
industrial applications 2819
industrial applications of materials modeling 2713
information theory 1477
infrared spectroscopy 195
infrequent event system 629, 1627
inherent structure(s) 1573
inhomogeneous 1613, 2025
instability 1371
integral materials simulation 2713
interatomic force constants 195
interatomic potential 241, 451, 459, 479, 499, 1901, 1931, 2731, 2763
interconversion of atoms 349
inter-diffusion 367
interface 1953
interface capturing 2205
interface fluctuations 2351
interface-plane method 1953
interface tracking 2205
interfacial 2025
interfacial curvatures 2849
interfacial dynamics 1217
interfacial energy 2157
interfacial fracture and dislocation emission 875
interfacial free energies 2787
interfacial profile 2787
interfacial width 2787
intergranular fracture 1931
interionic potentials 1889
intermediate scattering function 2823
intermetallic compounds and alloys 2737
intermetallics 2883
internal interface 1953
internal state variables 1077, 1183
internal variables 1151
interplanar spacing 1953
interstitial 2223
interstitial diffusion 367
interstitial position 1877
interstitialcy 2223
intrinsic interfacial profile 2787
intrinsic profile 2787
intrinsic width 2787
invariance 1223
invariant volume element 589
inversion symmetry 1953
ion core 121, 613
ion implantation 649
ion polarizability 1901
ion transport mechanism 1901
ionic materials 479
ionization energy 1877
ion-pair potentials 241
IR spectra 259
irradiation 987
irradiation growth 959
irreversibility paradox 2773
irreversible aggregation 2337
irreversible process 707
irreversible work 707
Ising 2687
Ising models 2787
Ising-type models 1477
island dynamics 2337
island number density 2337
isostress molecular dynamics 1223
iterative diagonalization methods 121
IVR 1673
Jahn–Teller effect 2731
jammed state 1281
Jarzynski average 707
jump frequency 2223
jump Markov process 1735
JWKB approximation 1673
kinematics 1183
kinetic 137
kinetic equation 2249
kinetic Ising model 2223
kinetic Monte Carlo 149, 613, 629, 1627, 1753, 1769, 2083, 2223, 2351, 2363, 2757
kinetic pathway 2223
kinetic theory 2411, 2475, 2513
kinetics 959, 2083
kinetics (of reactions) 1585
kinetics of melting 2009
kink-pair mechanism 827
kink-pressure 2879
Kleinman–Bylander form 121
Knudsen number 2475, 2513, 2523
Kohn–Sham eigenvalues 215
Kohn–Sham equation 121
Kolmogorov flow 2805
Kosevich energy functional 2325
k-point sampling 439
Kramers–Moyal–van Kampen expansion 2351
Kröger–Vink notation 1889
Kubo–Green 367
Lagrange multipliers 745
Lagrangian formulations 1281
Landau free energy 1613
Landau, Ginzburg, De Gennes 1613
Landau-type coarse-grained “chemical” free energy 2287
Langevin equation 649, 1635, 2351, 2411
Langevin leaping 1735
Langevin noise 2117
Langevin update formula 1735
LAPW 93
large strain mechanical response 1223
large-scale deformation 2749
laser ablation 307
latent heat and volume 2009
lateral interactions 149
lattice Boltzmann equation 2411, 2475, 2805
lattice dynamics 649, 1931, 2025, 2055
lattice friction 2897
lattice gas 1753, 2351
lattice gas automata 2805
lattice gas Hamiltonians 149
lattice Green’s function 649
lattice model 2599
lattice Monte Carlo 613
lattice protein 1823
lattice site 349
lattice statics 1223, 1931, 2025
lattice strain in melting 2009
lattice symmetry 349
lattice trapping 855
lattice vibration 349
layer-by-layer growth 2337
layered and tunnel structures 1901
leap condition 1735
leap-frog algorithm 565
Lee–Edwards boundary condition 745
legacy codes 1453
length and time scales 2663
length scales 1077
Lennard-Jones potential 565, 1953, 1985, 2025
level set methods 1359, 2083, 2307, 2337
level surface 2849
life management 2929
life-managed 2929
limitations of atomistic simulations 451
Lindemann criterion 1985
linear inequality 349
linear programming 349
linear regression 395
linear response theory 195, 349, 707, 745
linear scaling 77, 649
line–tension–pressure interaction 2879
link atoms 241
Liouville equation 745, 763
Liouville operator 589
lipid bilayers 929
lipid membrane structure 929
liquid 349
liquid simulations 2819
liquid–liquid phase transition 2917
liquid–solid interface 2009
LMTO 93
load balancing 875
local density approximation 121, 439, 1877
local expansion at interfaces 1931
local processing 2929
local property 2929
local pseudopotentials 137
local rules 2897
localization 1713
localized basis sets 423
localized orbitals 77
localized vibrational modes 1877
lognormal 2397
long range elastic fields 2763
long-range interactions 241
long-range interatomic interaction 349
long-range order parameter 2117
long-range structural order 1953, 1985
look-up tables 1753
low viscosity simulation 1837
low-angle GB 1953, 1985
low-dimensional continuum models 2631
low energy 1953, 2055
low-energy GB 1985
lubrication forces 2607
macromolecular architectures 2907
macroscale 1071
macroscopic 1953
macroscopic (modeling) 2757
magnetism 275
manganese catalase 259
many-body expansion 499
many-body perturbation theory 215
many-electron Green’s function 215
mapping 1039
Markov chain 2583
Markov process 1477, 1735
Markovian stochastic processes 1635
martensitic phase transformation 349, 2117
mass matrix 649
mass-action equations 1735
master equation 613, 1281, 1635, 1753, 2351
materials by design 2667
materials failure 2793
materials for fission and fusion reactors 999
materials modeling 1217, 2707
materials processing 1529
materials simulation 2935
mathematical methods 1217
matrix-free (numerical methods) 1453
Maxwell’s equations 2671
mean curvature 2849
mean field theory 2787
mean free path 2513
mean occupation 2249
mean square curvatures 2849
mean-field methods 349
mean-squared displacement 1931, 1985, 2055
measure 745
mechanical deformation 439
mechanical embedding 241
mechanical properties 2173, 2737
mechanism 1613, 1985, 2687
melting 2009
melting at interfaces 1931
melting point 349
membrane dynamics 929
membrane transport 929
membranes 2631, 2675
memory kernel 649
memory time 1635
Mermin order parameter 1613
mesh 2415
meshless method 2447
mesophase 2923
mesoscale 1071
mesoscale phenomena 2411, 2487
mesoscopic (modeling) 2731, 2749, 2757
metallic glasses 1281, 2749
metallic hydrogen 2829
metallization 2829
metals 451, 2173
metastable phase 2009
metastable phase diagram 349
metastable states 1613
methane monooxygenase 259
Metropolis 2363
Metropolis algorithm 349, 613, 1585, 1823, 1931, 2583
Metropolis-type dynamics 1477
MgO 1627
MGPT 2737
microemulsion 2645, 2849
micro–macro hybrid methods 2903
microphotonics 2671
microporous materials 547
microscopic 1953
microscopic electro-mechanical systems (MEMS) 2475
microscopic reversibility 613
microscopic state 349
microstructural evolution 2087
microstructurally small crack 1193
microstructure 999, 1193, 1333, 1371, 2083, 2117, 2269, 2379, 2687, 2871
microstructure evolution 2157, 2249
microstructure, ferroic 2143
migration barrier 1877
migration enthalpy 1855
migration entropy 1855
minimum energy path 773
mirror symmetry 1953
misfit 2307
misfit energy 793
misfit localization 1953
misfit strain 2287
mixed basis 2737
mixed quantum classical [time evolution methodologies] 1673
MMFF 509
mobility 1877, 2307
mobility constant 745
mode-coupling theory 2823
model order reduction 1529
modeling 2173, 2269, 2871
modeling of excited states 259
modeling of interfaces 1925
modified embedded atom method 459, 855
modulation wavelength 2025
molecular assemblies 2675
molecular complexity 2907
molecular dynamics (MD) 275, 451, 527, 565, 629, 649, 707, 729, 813, 839, 855, 1039, 1491, 1613, 1627, 1735, 1787, 1837, 1877, 1889, 1901, 1931, 2411, 2475, 2487, 2503, 2523, 2663, 2707, 2737, 2793
molecular dynamics and Monte Carlo comparison 613
molecular dynamics simulation 349, 509, 929, 1585, 1597, 1797
molecular dynamics simulation of melting 2009
molecular mechanics 2763
molecular mechanics force fields 241
molecular modeling 509
molecular statics 839
molten sub-lattice model 1901
Monte Carlo 451, 707, 729, 1039, 1931, 2663, 2687
Monte Carlo [path] 1613
Monte Carlo [procedure] 1585
Monte Carlo simulation 349, 1901, 2583, 2599
morphological and microstructural evolution 1217
morphological evolution 2351
morphology 2083
Morse model 1223
Mott–Littleton 1889
Mott–Littleton procedure 1901
multicanonical Monte Carlo 2787
multicomponent alloy 349
multifunctionality 2379
multilayer growth 2337
multimillion atom molecular dynamics simulations 875
multi-paradigm 2707
multiphysics 2475
multiplicity 349
multipole expansion 2325
multiscale 2523, 2707
multiscale analysis 1491, 1507
multiscale computation 1507
multiscale gas dynamics 2811
multiscale modeling 149, 793, 999, 1217, 2657, 2737
multiscale modeling of polymers 2757
multiscale visualization 875
Nabarro–Herring creep 2055
nanoclusters 547
nanocrystalline 839
nanocrystalline material 1985, 2055
nanocrystals 1925
nanodiamond 307
nano-forms of carbon 2923
nanoindentation 2777
nanoindentation of silicon nitride 875
nanomechanics 649
nanophase SiC 875
nanophotonics 2671
nanostructured a-SiO2 875
nanostructured ceramics 875
nanostructured materials 875
nanostructured Si3N4 875
nanostructures 215
nanostructures in solution 2701
nanotube 307, 1797, 2923
native point defects 1877
native-centric 1823
natural convection 1529
Navier–Stokes 1529
Navier–Stokes equations 2411, 2475, 2805
NDDO 27
neighbor distribution 2397
neighbor list 565
nematic order parameter 1613
NEMS 649
neural network 395
neutral net 1823
Newtonian equations of motion 1931
Newtonian fluids 2503
n-fold way 2363
Ni(Cr,Al) 2223
nickel 1223
NMR 1901
NMR spectra 259
nonbonded interaction 2561
non-collinear magnetization 275
non-Debye relaxation 2731
non-equilibrium alloy 2249
non-equilibrium free-energy calculation 707
non-equilibrium molecular dynamics 745
non-equilibrium processes 1753
non-equilibrium states 745
non-equilibrium work 683
non-Hamiltonian dynamical systems 589
non-Hamiltonian system 745
non-linear perturbations 745
nonlocal pseudopotentials 423
nonlocality 121
non-Markovian 1635
non-Markovian dynamics 1635
non-Newtonian fluids 2503
nonorthogonal tight-binding 307
nonperiodic systems 121
non-planar core 2883
non-reactive collisions 1735
nonstoichiometry 1851
norm conserving 121
normal modes 195, 349, 1635
normal random variable 1735
Nosé–Hoover chains 589
nucleation 1098, 1115, 1613, 2117, 2223, 2337
nucleation and growth 2397
nucleation barrier 2223
nucleation of voids and dislocation loops 959
nucleation rate 2223
nucleation sites in melting 2009
nudged elastic band 773
numerical analysis 1453
O(N) multiresolution algorithms 875
observable variables 1151
occupation variable 349
offline–online procedures 1529
Omori law 1313
one-dimensional diffusion 959
ONIOM method 241
Onsager’s regression hypothesis 707, 745
on-the-fly algorithms 1753
on-the-fly kinetic Monte Carlo 1627
open systems 2731
operator resummation 1673
OPLS 509, 2561
optical properties 215
optical properties of silicon dots 2701
optimization 613, 2083, 2379
orbital-free density functional theory 137
order parameter 683, 1585, 1613
order-disorder transition 2645
ordered phase 349
ordering 2223, 2249
order-N 77, 565
orientation variants 2117
Orowan looping 2307
orthogonal tight-binding 307
osmotic permeability 1797
Ostwald ripening 2337
output bounds 1529
overdamped limit 1635
overlap matrix 307
oxidation of alkanes 259
oxidation of aluminium nanoparticles 875
oxygen diffusion 1915
oxygen molecule 121
pair correlation function 2325, 2397, 2583
pair potentials 499
parabolic equations 2415
parallel (multiscale modeling approaches) 2757
parallel calculations 423, 2487
parallel molecular dynamics 875
parallel replica dynamics 629, 1627
parallel tempering 1613
parallel-replica 629, 1627
parametrized problems 1529
parent lattice 349
partial charge 2561
partial differential equations 1529, 2415, 2447
partial least squares 395
partial occupation 349
particle and radiation transport 613
particle-based 2513
particle bypass 2307
particle-laden flows 2607
particle tracking 451, 613
particle trajectories 613
partition function 349, 707, 2645
path integral molecular dynamics 259
path-integral propagator 1691
path integral quantum transition state theory 1713
path sampling 1837
pattern formation 2117
pattern recognition 1613
Peach–Koehler force 1098, 1115
Péclet number 2475, 2607
Peierls stress 793, 813, 2865, 2883
Peierls–Nabarro model 793, 2287, 2695
‘pencil glide’ 2777
peptide bundle, 4-α-helix 259
perfect crystal 2025
periodic boundary condition 241, 423, 565, 773, 813, 1051, 2695
perovskite oxides 527, 1915
Pettifor map 395
phase behavior 527
phase coexistence 2009
phase diagram 149, 349
phase equilibria 683
phase field 2083, 2105, 2117, 2157, 2287, 2837
phase-field models 2087
phase-field simulation 2157
phase separation 2249
phase-separated polymer blends 2849
phase space 349, 683, 1635
phase space flow 745
phase space sampling 613
phase stability 349
phase transformation 1223, 2117
phase transitions 149, 1613, 2829
phonons 195, 1713
phonon density of states 349
phonon dispersion 459
phonon frequencies 275, 349
Photobacterium leiognathi 1597
photonic band gap 2671
photonic crystals 2671
physical properties 2379
π-back donation 259
planar core 2883
planar stacking 1953
planar structure factor 1953, 1985
plane waves 59, 121, 195
plane-wave basis 93
plane-wave method 423
plastic deformation 663, 793, 1985, 2055, 2713
plastic spin 1133
plasticity 1071, 1151, 1281, 2173, 2269, 2749, 2793, 2865, 2871, 2929
plateau value 1573
point charge model 1901
point collocation 2447
point defects 1851, 1855
Poisson–Boltzmann 1837
Poisson effect 2025
Poisson process 1797
Poisson random numbers 1735
Poisson random variable 1735
polarization 479, 2561
polycarbonate 2675
polycrystals 1953, 1985
polycrystalline 839, 1039, 2837
polydispersity 2583
polyethylene 2575
polyhedral unit model 1953
polymer blends 2599, 2645, 2787
polymer glass 2599
polymer melts 2599
polymer mixtures 2787
polymer solutions 2599, 2645
polymeric fluids 2503
polymers 2173, 2675
potential cutoff 565
potential energy function 509, 2561
potential energy surface 149, 1713
potential of mean force 2645, 2757
Potts model 2687, 2837
powder metallurgical processing 2713
power laws 1313
precipitation 2117
predictive simulations 27
predictor–corrector algorithm 565
premelting 1985
premelting at surface 2009
pressure distribution in a Si/Si3N4 nanopixel 875
principal component analysis 395
probabilistic clustering 1573
problem posing 2731
process simulation 2713
processing 2173
processing–structure–property relationships 395
production bias 959
projection strategies (in coarse-graining) 2757
projector augmented wave method (PAW) 93
propensity function 1735
protein data bank 1051
protein folding 1837
protein-based nanostructures 875
proton conductors 1901
proton exclusion 1797
proton mobility in zeolites 241
proton transport 1915
prototype sequence 1823
pseudopotential approximation 439
pseudopotential model 1223
pseudopotential perturbation theory 2737
pseudopotentials 13, 59, 93, 121, 195, 423, 1877
Pt chemical shift 259
Pulay forces 59, 121, 195
pyrolysis 2923
pyrosilicic acid 27
QM/MM 241, 259, 2675, 2763
QM-Pot method 241
quadrature domains 1417
quantitative structure–activity relationships 395
quantitative structure–property relationships 395
quantum-based interatomic potentials 2737
quantum dots 121, 649
quantum effects 1713
quantum entanglement 2731
quantum information processing 2731
quantum Kramers [approach] 1673
quantum mechanical rate constants 1691
quantum mechanics 27, 349, 2707, 2763
quantum molecular dynamics 2737
quantum Monte Carlo 2701
quantum reflection 1713
quantum rods and tetrapods 875
quantum simulations 451, 2701
quantum statistics 2731
quantum tunnelling 2731
quasi-atomic minimal-basis-sets orbitals (QUAMBOs) 307
quasicontinuum 649, 663, 1491
quasi-elastic neutron scattering 1787
quasiharmonic approximation 349, 1931
quasiparticle calculations 1877
quasiparticle excitations 215
quasiparticle lifetime 215
radial distribution function 1901, 1953, 1985
radial distribution function at interfaces 1925, 2025
radiation damage 613, 649, 999, 1627
radiation effects 999, 2719
radiopharmaceutical compounds 259
Raman spectra 195, 259
random deposition 2351
random-dissipative forces 649
random force 1635, 1673
random numbers 613, 1635
random sampling 613
random walk 1585, 1635, 1787
rare event 1585
rare reactive events 1597
rare transitions 1613
ratcheting 1193
rate catalog 1627
rate dependent 1133
rate equations 2337
rate independent 1133
rate table 1627
Rayleigh and Gamma probability densities 2397
Rayleigh scattering 649
Rayleigh–Taylor 1417
Rayleigh wave speed 855
reaction coordinates 1585, 1597, 1635
reaction force field 241
reaction mechanism 1585
reaction of H2O molecules at a silicon crack tip 875
reaction rate constants 1585, 1691, 1735
reaction rate equations 1735
reaction time 1585
reactive collisions 1735
reactive flux correlation function 1597
reactive force fields (ReaxFF) 2707
Read–Shockley model 1953
real space 121
real space methods 121, 423
real-time computation 1529
realizations (stochastic trajectories) 1635
reciprocal space 121
reduced basis 1529
reduced coordinates 1051
reduced descriptions 1635
reduced dimensional systems 215
reduced dynamics 1635
reduced unit system 565
re-emission 1359
reference system 349
refinement 2447, 2523
reflection coefficient 649
refractory metals 2865
relative fluctuations 1735
relaxation 1953
relaxation time 1635
relaxation volume 1855
renormalization 2687
renormalization group 1613, 2351
reptation 1901, 2583, 2599, 2675
reversible aggregation 2337
reversible glass transition in high-energy grain boundaries 1985
reversible process 707
reversible work 707, 1585
Reynolds number 2475
rezoning 2903
rheology 2607
Rice criterion 855
rigid-body translation 1953, 2055
ROCKS 259
rolling 2173
rotation 1953
rotational excitation 1713
rotational isomeric state 2575
roughening transition 1389
Rouse dynamics 2599
rupture prediction 1313
saddle points 1585, 1877, 2223
sampling strategies (in coarse-graining) 2757
scalable data-compression scheme 875
scalable simulation 875
scale-bridging 2687
scale-bridging simulation 2675
scaling 2687
scaling properties of fracture surfaces 875
scattering 649, 1713
scheelite 2929
Schmid law 2883
screened Coulomb interaction 215
screw dislocation 1098, 1115, 2307
screw dislocation in bcc molybdenum 307
second law 2773
sedimentation 2607
segregation 1931
self-avoiding walks 2599
self-consistency 2935
self-consistency cycle 121
self-consistent field (SCF) 2645, 2757
self-diffusion 367, 1985
self end-bridging 2583
self-energy operator 215
self-limiting growth 875
self-organization 2117
self-similarity 1403
self-trapping 2731
semiclassical mechanics 1673
semiconductor etching and deposition 1359
semiconductor growth 149
semiconductor nanostructures 2701
semiconductors 855, 1877, 2737
semi-discrete variational Peierls–Nabarro model 793
semi-empirical 2935
semi-empirical potentials 839
semi-empirical quantum chemistry 27
sequential (multiscale modeling approaches) 2757
sequential multiscale modeling 649
shape function 649
shape selectivity 241
shape transition 2117
shear deformation 2009
shear failure 2777
shear instability 2009
shear localization 2749
shear modulus 855, 2009
shear-rate dependence 745
shear softening 1281
shear strength 2777
shear stress 1223
shear-transformation zone 2749
shear viscosity 745
shell model 527, 1889
shock waves 2829
shooting and shifting [algorithms] 1585
short-range interactions 2575
short-range order 349, 1953, 1985
Si/Si3N4 interface 875
Si3N4 phonon density of states 875
SIESTA 77
silica stress and strain curves 27
silicon 499, 613, 855, 2009
silicon (110) 855
silicon (111) 855
silicon cluster 307
silicon nitride 855
silver 629
simple ionic solids 349, 547, 613, 1889
simulated annealing 613
simulations 275, 729, 2173, 2363
simulations of nanostructured materials 875
single crystal plasticity 827
single exponential kinetics 1837
single molecule spectroscopy 1635
single particle distribution function 2513
single-file water transport 1797
singularities 1403
sink competition 959
sintering of n-SiC 875
sintering of silicon nitride nanoclusters 875
size effects 2897
Slater–Koster theory 307
slip 793
Smoluchowski equation 1635
smooth particles 2903
smoothed particle hydrodynamics 2503
soft matter 2675
soft matter properties 2675
soft mode 2787
solid 349
solid electrolytes 1901
solid oxide fuel cells 1901
solid polymer electrolytes 1901
solid with a phase transition 1901
solidification 2087, 2105
solid–liquid interface 2009
solid–liquid phase boundary 349
solid-on-solid model 2337
solid-state amorphization 2055
solubility 1877
solute 349
solvent 349, 1613
solvent model 1837
Sommerfeld model 349
SPAM 2903
spatial decomposition 875
special GB 1953
special grain-boundary plane 1953
specific reaction probability rate constant 1735
spectral density 1673
spectral density function 2849
spectroscopic properties 1851
SPH 2903
spherical chicken 2083
spinodal decomposition 2487
splitting algorithm 2513
stability 2415
stacking fault 1953
stacking period 1953
stacking sequence 1953
standard model 13
state-change vector 1735
static heterogeneities 2917
static lattice 547
static lattice calculations 1889, 1901
static structure factor 1931
stationary 1635
statistical mechanics 149, 729
statistical sampling 613
statistical weight matrix 2575
statistics of point processes 2397
steepest descent sampling 2645
Steinhardt order parameter 1613
steps 1389
STGB cusp 1953
sticking 1713
stiffness 1735
stiffness matrix 649
Stillinger–Weber potential 499, 855
stochastic 1635, 1735, 2513
stochastic differential equations 2503, 2619
stochastic dynamics 613
stochastic effects 959
stochastic equations of motion 1635, 2083
stochastic mesoscopic models 1477
stochastic PDE 1477
stochastic simulation algorithm 1735
stochastic time evolution 1635
stochastic trajectory 1635
stochastic variable 1635
Stokesian dynamics 2607
strain 773
strain faceting 2337
strength 2777
stress 439, 773, 2205
stress-free transformation strain 2117
stress intensity factor 839
stress–strain curve 827
stress tensor 459, 2619
structural 2737
structural components 2929
structural disorder 1953, 1985
structural instability 2009
structural materials 2719
structural phase transitions 195, 1985, 2009
structural properties 2737
structural relaxation 1931
structural transformation in GaAs nanocrystals 875
structural width 1985
structure 13, 349
structure map 395
structure–property correlations 1925, 1931, 1953, 2025, 2055, 2675
submonolayer growth 2337
substitutional diffusion 367
subtraction scheme 241
subtraction technique 745
super soft elastomers 2907
supercell 565, 813, 1051, 1877, 1889
superconductivity 2829
supercooled liquids 2009, 2823
supercooled melt 1985
supercooled water 2917
superfunnel 1823
superhard materials 2829
superheating 2009
superheating limit 1985
superionic conductors 1901
surface 855, 1953
surface chemistry 2337
surface diffusion 629, 1389, 1627
surface exchange mechanism 1627
surface growth 629, 2351
surface hopping 1673
surface roughness 2337, 2351
surface simulations 547
surface steps 1953
surface structure 149
surface tension 1403
surface-state excitons 215
suspensions 1371, 2607
Suzuki–Yoshida factorization scheme 589
swelling 2719
switched Hamiltonian 349
switching parameter 707
symmetric GB 1953
symmetric tilt GB 1953
symmetry breaking 2009
symplectic integration 565, 589
system instability 959
systems theory 1453
tau leaping 1735
temperature accelerated dynamics 629, 1627
temperature-dependent elastic constants and moduli 2025
temperature-dependent potentials 2737
tensile strength 2777
tensor 1183
Tersoff model 499
tertiary creep 1313
tetrahedral order parameter 1613
texture 1039, 1133, 2173, 2837
theoretical strength 1223
thermal activation 1223
thermal conduction 763
thermal conductivity 2819
thermal defect 349
thermal expansion 349, 459, 2659
thermal properties 707, 2659
thermally activated processes 2897
thermodynamics 1151, 2083, 2773
thermodynamic and mechanical melting 2009
thermodynamic driving force 707
thermodynamic factor 367
thermodynamic integration 259, 349, 683, 707, 1855
thermodynamic limit 1735
thermodynamic path 707
thermodynamic phase diagram 2009
thermodynamic stability 349
thermo-elastic behavior of interfaces 2025
thermo-mechanical behavior 2055
thermostatted molecular dynamics 589
theta state 2575
thin films 1039, 2363, 2837
thin films, structure and elastic behavior of 2025
thin film growth 1753, 2337
thin-film interfaces 1925
threshold displacement energy 987
threshold effects 1713
tight-binding 275, 855, 1877, 2737
tight-binding Hamiltonian 307, 451
tight-binding model for carbon 307
tight-binding molecular dynamics 307
tilt 1953
tilt axis 1953
tilt boundary structures in Si 307
tilt GB 1953, 2055
time average 763
time correlation function 763, 1635
time dependent DFT (TDDFT) 259
time evolution of dislocation motion 875
time integration 2415
time-dependent Ginzburg–Landau equation 2287
time-dependent rate coefficient 1597
time–temperature–transformation (TTT) diagrams 395
topology 2157
total energy 423
total-energy calculation 349
total energy functional 93
total energy surface 1877
tracer 1901
tracer (self-diffusion) coefficient 1787
trajectory in phase space 763
transfer Hamiltonian 27
transitions 2143
transition level 1877
transition metals 2737
transition metals containing zeolites 241
transition path ensemble 1585
transition path sampling 1585
transition rates 613, 1635
transition state 1585, 1613, 1635
transition state ensemble 1585
transition state rate 149, 1635
transition-state theory 629, 1573, 1627, 1635, 1673, 2757
translational 1953
translational order parameter 1613
transport coefficient 745, 763
trapping constant 1333
trimolecular reactions 1735
triple junctions 2055, 2157
triple lines 2055
Trotter theorem 589
Tsallis statistics 1613
TVD Runge–Kutta 2307
twin boundary 2843
twinning plane 1953
twist 1953
twist GB 1953, 2055
two-center approximation 307
two-phase flow 1403
two-phase microstructure 2205
two-phase region 1985
two-state model 2025
two-state systems 1281
ultrasoft pseudopotential 59, 195
umbrella sampling 613, 683, 1613, 2787
uniaxial loading 1223
unimolecular reactions 1735
unique properties of molecular dynamics and Monte Carlo 451
unit-cell area 1953
unit-cell volume 1953
UNIVERSAL 2561
unmixing 2223
unperturbed state 2575
unstable mode 349
unstable stacking energy 855
unstructured mesh 649
upscaling 1507
vacancies 1901, 1985, 2223
vacancy formation energy 275
vacancy source, sink 2223
valence interaction 2561
variable charge MD simulation 875
variable connectivity Monte Carlo 2583
variance 1635
variance reduction 2619
variational transition state theory 1635
velocity gap 855
velocity Verlet algorithm 565
velocity Verlet time integration 649
Verlet algorithm 565
vertex tracking 2837
vesicles 2631
vibrational energy relaxation rate constants 1691
vibrational entropy 349
vibrational free energy 349
vicinal GB 1953
vicinal surface 1953
virtual environment 875
viscosity 2675, 2815
viscous fingering 1417, 2805
viscous sintering 1417
visibility 1359
visualization 451, 1051
visualization algorithms 875
void coalescence 1171
void growth 1171
void nucleation 1171
volumetric relaxation 2157
von Mises strain 1051
Voronoi construction 839
Voronoi diagram 2503
waiting-time distribution 2397
water 1613, 2917
water channels 1797
water nucleophilic attack 259
water/silica interactions 27
water/water interactions 27
wave reflection 649
weight function 2415
weighting function 2447
WENO 2307
width 1953, 1985
Wolf–Villain model 2351
wormlike chain 2619
wrapper, computational 1453
yield stress 827, 1281, 2749
yield surface 1077, 2173
Zeldovich factor 2223
zeolite catalysts 241
zero-point vibrations 1931
zero-temperature relaxed structure 1985
Zwanzig Hamiltonian 1673