Reviews in Computational Chemistry Volume 10
Edited by
Kenny B. Lipkowitz and Donald B. Boyd
WILEY-VCH
A NOTE TO THE READER

This book has been electronically reproduced from digital information stored at John Wiley & Sons, Inc. We are pleased that the use of this new technology will enable us to keep works of enduring scholarly value in print as long as there is a reasonable demand for them. The content of this book is identical to previous printings.

Kenny B. Lipkowitz
Department of Chemistry
Indiana University-Purdue University at Indianapolis
402 North Blackford Street
Indianapolis, IN 46202-3274, U.S.A.
[email protected]
This book is printed on acid-free paper.
Donald B. Boyd
Department of Chemistry
Indiana University-Purdue University at Indianapolis
402 North Blackford Street
Indianapolis, IN 46202-3274, U.S.A.
[email protected]
© 1997 VCH Publishers, Inc.
© 2003 Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail:
[email protected]. For general information on our other products and services please contact our Customer Care Department within the US. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002, ISBN 0-471-18648-1
ISSN 1069-3599 Printing History: 10 9 8 7 6 5 4 3 2 1
Preface

With this book we celebrate the tenth volume of Reviews in Computational Chemistry. This is an opportune time to take a brief look back at the prefaces of previous volumes. In Volume 1 (1990),* we defined computational chemistry as consisting of those aspects of chemical research which are expedited or rendered practical by computers. This definition has gained acceptance and still applies, even as the field grows and impacts additional disciplines in molecular science. Many people continue to use the terms computational chemistry and molecular modeling interchangeably, although some scientists make a distinction. Molecular modeling to some people implies the investigation of three-dimensional molecular structures by only computer graphics and empirical force field methods, although it should be kept in mind that even quantum mechanical treatments and quantitative structure-property relationships model molecular behavior. In Volume 2 (1991), we refrained from pontificating, but in Volume 3 (1992), we shared some observations on the multitude, almost overabundance, of scientific meetings on computational chemistry. The frenetic pace of meetings occurring back in May/June 1991 has slowed a bit, but a great deal of activity still exists. In fact, since 1991, two major changes are noteworthy: (1) the Computers in Chemistry Division (COMP) of the American Chemical Society has become much more active, with up to three concurrent sessions running through the whole week of recent national meetings of the ACS, and (2) there is an increasing number of commercially organized scientific conferences. Obviously, for both scholar and entrepreneur, computational chemistry is a bigger business than it was a decade ago. In Volume 4 (1993), the preface dealt with the question of whether the number of extant journals in the field sufficed or whether new ones were needed. Since 1993, we have seen the birth of two electronic online journals, the Journal of Molecular Modeling and, as of 1996, the Electronic Journal of Theoretical Chemistry, Structure & Interactions. Also, many of the older computational chemistry journals have been trying to "reinvent" themselves. In fact, they seem to be converging such that their scopes overlap more and more with one another. We comment further on this later. In Volume 5 (1994), we addressed the widely recognized difficulty that computational chemists have had in convincing their scientific collaborators to accept molecular design predictions conceived by computation.
*D. B. Boyd and K. B. Lipkowitz, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, p. ix.
This problem, which has been around for years, is slowly being resolved. However, not a few computational chemists in industry have found that the best strategy is to convince their synthetic chemistry collaborators that the design ideas belong to the latter, so that the idea will be acted on. This subterfuge indicates that computational chemists need not only good scientific and programming skills, but also good salesmanship and interpersonal skills. In Volumes 6 and 7 (both published in 1995), we highlighted changes that were and are still taking place in the job environment for computational chemists. These changes continue with further downsizing of pharmaceutical and chemical companies. The so-called "restructuring" underway in the United States and Europe is affecting tens of thousands of individuals. Four years ago, chemical employment in the United States turned downward and has declined 50,000 (ca. 5%) so far. Median salaries of Ph.D. chemists in the United States have remained flat in constant dollar terms. Noteworthy is that computational chemists are no more affected than other chemists by these changes. In fact, because the computational chemistry workforce is relatively young compared with organic and other bench chemists in industry, one could say that computational chemists have been less affected. In Volume 8 (1996), the pervasiveness of the use of computational chemistry in the papers of several prominent chemistry journals was quantitated. Clearly, the topics and techniques which our series of books aims to cover are now deeply ingrained in the fabric of research in many subdisciplines of chemistry. Finally, in Volume 9 (1996) (note that, not insignificantly, we are publishing more frequently) we showed how the science of computational chemistry has been globalized. Not surprisingly, it is in use around the world. But what was surprising was to see which countries have been investing heavily in computational chemistry. This brings us to the present volume, number 10. Here we point out an interesting evolution of pertinent journals in the field of computational chemistry that has been underway for two decades. Back in the "old days," the two journals specializing in the field of what we now call computational chemistry were Theoretica Chimica Acta, founded in 1962, and the International Journal of Quantum Chemistry, founded in 1967. Both, as implied by their titles, focused on theoretical (pencil and paper) and computational quantum chemistry. Theoretica Chimica Acta tended to have papers reporting the results of approximate semiempirical molecular orbital calculations. Routine molecular orbital studies can now be found in many journals, with perhaps one of the highest concentrations in the old issues of THEOCHEM, a journal started in 1981. In the meantime, Theoretica Chimica Acta tried to widen and strengthen its editorial stance by adding the subtitle A Journal for Structure, Dynamics, and Radiation in 1985. The International Journal of Quantum Chemistry has sought to broaden its base of readers by including what used to be called "quantum biology" and
"quantum pharmacology." The amusing thing about these monikers is that experimental biologists and pharmacologists, when they occasionally encountered these terms, had absolutely no idea what they meant. However, it was an early attempt by computational chemists (née theoretical chemists) to try to relate what they were doing to other sciences. In any case, nowadays the International Journal of Quantum Chemistry also seeks papers on molecular mechanics and dynamics. THEOCHEM has also started reaching out to embrace the same wider audience as the other journals. The editorial policy was broadened to include what it called "new" fields like molecular modeling. Thus, the scope was changed in 1994 to go beyond its core of electronic structure papers and thus attract papers in modeling, computer graphics, dynamics, and even drug design. And in 1995, THEOCHEM's old subtitle of Applications of Theoretical Chemistry to Organic, Inorganic, and Biological Problems was dropped in favor of Theory and Modelling in Chemistry. Notwithstanding the fact that molecular modeling has been around for about 15 years, it appears that all the journals are converging toward what the Journal of Computational Chemistry has been and is. This widely read and cited journal has, from its beginnings in 1980, been a broad forum for papers on molecular mechanics and molecular simulations, as well as computational quantum chemistry and other facets of computational chemistry. Theoretical chemistry journals have not been the only journals to try to jump on a bigger bandwagon. Quantitative Structure-Activity Relationships, founded in 1982, used to carry the subtitle in Pharmacology, Chemistry, and Biology. Several years ago, it expanded the subtitle to the rather explicit Including Molecular Modelling and Applications of Computer Graphics in Pharmacology, Chemistry, and Biology. Not to be outdone, the Molecular Graphics Society, which has published the Journal of Molecular Graphics since 1983, now calls itself the Molecular Graphics and Modelling Society. In early 1993, the venerable Journal of Chemical Information and Computer Sciences made an effort to embrace molecular modeling and computational chemistry papers. This journal now bears the subtitle Includes Chemical Computation and Molecular Modeling, even though the bulk of its papers still focus on databases and molecular topology. The metamorphosis and coalescence of various computational interests of chemists into modern computational chemistry occurred in many steps. A landmark event in this change was perhaps the first Gordon Research Conference on Computational Chemistry, which was held 10 years ago in August 1986, in the village of New London, New Hampshire. In 1984, the editors of this book series had written a proposal to the Board of Trustees of the Gordon Research Conferences saying: "At present there is no conference that concentrates, as we intend this conference to, on applications of computational chemistry. A number of regular theoretical
conferences, such as the American, Canadian, and Sanibel theoretical meetings, deal mostly with quantum chemistry. The closest Gordon Conference, Quantitative Structure Activity Relationships in Biology, is mainly concerned with drugs and meets in odd-numbered years. The Computational Chemistry Conference will focus on the other domains of computational chemistry and would meet in even-numbered years." Letters seconding our proposal were written by Nobel Laureates Roald Hoffmann (Cornell University) and William N. Lipscomb (Harvard University), as well as Norman L. Allinger (University of Georgia), Richard W. Counts (QCPE), Kendall N. Houk (then at the University of Pittsburgh), Daniel A. Kleier (then at Shell), and Peter A. Kollman (University of California, San Francisco). The proposal was approved by the Gordon Conference Board of Trustees, and the first conference, which the present editors cochaired, was a huge success. More than 250 prominent as well as young computational chemists applied to be among the 150 original conferees. Our vision, one that was fulfilled, was to balance all facets of computational chemistry at the meeting. Thus, molecular mechanics, molecular modeling, molecular simulations, quantum chemistry, molecular graphics, molecular design, and so on, were all represented. The conference was designed in the spirit of inclusiveness and openness. The results helped set the stage for synergies and for the flourishing of the field. So much of science, as it has grown, has become more specialized and fragmented. Thus, it is interesting that at least in the broad area of computational chemistry, the many specialized journals have been converging toward the same vision. In effect, they are trying to keep up with where computational chemists are taking the field. This vision of computational chemistry encompasses all facets of computational molecular science. The computational chemist's toolbox ranges from post-Hartree-Fock calculations to statistics for quantitative structure-property relationships to molecular graphics to simulations, and so on. The right tool or tools must be grabbed and applied effectively to the myriad research problems awaiting resolution. From the broad definition we attach to computational chemistry, it is clear that while many important topics have been covered in our books, we still have a long way to go. As with previous volumes, this tenth volume of Reviews in Computational Chemistry attempts to provide educational chapters on useful and interesting topics. As usual, we have asked our authors to prepare their chapters so they are part tutorial. That way, both a novice molecular modeler and a knowledgeable expert can benefit from them. An overarching theme of several chapters in this volume is chaos and randomness. The term chaos as it applies to chemistry does not have the negative connotation it does for political or cosmological events. Chaos in chemistry can be very interesting, beautiful, and applicable to important research questions. Not uncommonly in research, one is faced with a large number of variables. The researcher wants to find, in this rather chaotic set of possibilities,
the one that will optimize some property or phenomenon. In Chapter 1, Dr. Richard Judson presents a broad overview of genetic algorithms (GAs) and their numerous uses in various fields of computational chemistry. GAs are optimization algorithms based on several metaphors taken from biological evolution, including selection, fitness, mating, and mutation. He presents an extended primer for the basic GA approach. A number of variants on the simple GA are described. After a detailed look at a simple example, further applications of GAs in chemistry are presented. Information is also given on the general literature in the GA field and on public domain GA codes available free over the Internet. Combinatorial chemistry is a new experimental technology in which the chemist actually wants to achieve a great deal of controlled randomness. This nascent technology has caught the imagination of synthetic chemists, especially those in medicinal chemistry. Many biopharmaceutical companies now seek to capitalize on this approach to rapidly synthesize and biologically test small quantities of tens of thousands or even millions of variants of molecular structures. The old one-at-a-time medicinal chemistry approach is rapidly giving way to robotic machines that churn out whole libraries of compounds in a mere fraction of the time it took a chemist to make single compounds. In the old days, the scientist poorly understood the molecular targets for therapeutic intervention, so all sorts of compounds were tried at random in hopes that one would work. With increasing knowledge from molecular and structural biology, the scientist adopted increasingly rational approaches to drug design. However, with combinatorial chemistry, the old random approach has been resurrected; technology has made it fast and cheap. What does this revolution mean for the computational chemist? Is rational drug design dead? Is the romance of the pharmaceutical companies with computer-aided drug design and structure-based drug design over? In Chapter 2, Drs. Eric J. Martin, David C. Spellmeyer, Roger E. Critchlow, Jr., and Jeffrey M. Blaney provide timely information pertinent to these questions. They show how their computational research at Chiron Corporation is proving to be an effective ally of combinatorial chemistry. The approach, techniques, and software of computational chemistry, as applied in this era of combinatorial chemistry, are explained and illustrated. Chapters 3 and 4 introduce the concepts, techniques, and computational tools of nonlinear dynamics. Professor Robert Q. Topper focuses on applications to conservative systems, while Professors Raima Larter and Kenneth Showalter cover the uses of nonlinear dynamics in dissipative systems (they dissipate rather than conserve energy). The major application to chemistry of conservative nonlinear dynamics is in the realm of molecular dynamics, whereas chemical kinetics is the main focus of the chapter on dissipative systems. In both of these chapters, a geometrical approach to understanding dynamics is introduced. As Dr. Topper clearly explains in Chapter 3, much work has gone into developing a better understanding of molecular dynamics by
dipping into the "nonlinear dynamics toolbox." It has been said that chemistry is not only about static substances, but also about how they react. De novo determination of reaction rate constants is a rigorous test of computational chemistry and the theories on which it is based. Chapter 4 describes the concepts of phase space, Poincaré sections, and next-return maps to illustrate the geometrical features of a nonlinear dynamic system. Drs. Larter and Showalter also introduce the concept of universal dynamical behaviors and phenomena that appear in problems across fields, of which chemical kinetics is but one example. The unifying nature of dynamical systems theory highlights the interdisciplinary nature of this field, and several examples are given showing the similarity between kinetics, population biology, and other areas. Chapter 5 is a nice change of pace from the preceding chapters. It might be regarded as the "dessert" for readers who have navigated the mathematics of the prior chapters. Mr. Stephen J. Smith and Professor Brian T. Sutcliffe give a historical account of the advancement of computational chemistry in the United Kingdom. This chapter complements the one in Volume 5 on the history of the field in the United States. Developments in the United Kingdom aided those in the United States, and vice versa. As stated in the Preface of that 1994 volume, it is our intention to have reviews of the historical developments in all the countries that have been major spawning grounds of the field. In many countries, computational chemistry was born of the womb of theoretical chemistry. Chapter 5 poignantly reminds us of the fact that the impetus to the construction of the early computers was to aid the survival of the allied nations during and after World War II. It is perhaps some small consolation that from the severe destruction, suffering, and hardships of that great struggle came technology that has transformed and improved the lives of those now living. The technology to decipher enemy messages and build bigger bombs was malleable enough to yield computing machines that today are being used to design new medicines and new substances to benefit rather than destroy people. An extensive compendium of software for computational chemistry appeared in Volume 7 of Reviews in Computational Chemistry.* That compendium also had information about the Internet and the World Wide Web (WWW). We forgo an appendix in this volume to allow more room for chapters. However, in the future we will again provide an updated compendium. In the meantime, the compendium of Volume 7 should serve as a handy guide. We would like to point out that information about Reviews in Computational Chemistry is available on the World Wide Web. Background information about the scope and style are provided for potential readers and authors. In addition, the tables of contents of all volumes and the various international addresses of the publisher are included. The Reviews in
Computational Chemistry home page is being used as needed to present color graphics, supplementary material, and errata as adjuncts to the chapters. Your Web browser will find Reviews in Computational Chemistry at http://chem.iupui.edu/~boyd/rcc.html. If there are topics that we have not yet covered in this series, but you feel would be of benefit to you, your students, or your collaborators, we urge you to contact us. Our e-mail and postal addresses are given earlier in this book. We express our deep gratitude to the authors who contributed the outstanding chapters to this volume. We hope that you too will find them helpful and enlightening. Mrs. Joanne Hequembourg Boyd is acknowledged for help with the editorial processing of this book. We thank the readers of this series who have found the books useful in their work and have given us encouragement.

Donald B. Boyd and Kenny B. Lipkowitz
Indianapolis
February 1996

*D. B. Boyd, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 7, pp. 303-380. Compendium of Software for Molecular Modeling.
Contents

1. Genetic Algorithms and Their Use in Chemistry
Richard Judson
Introduction
Natural Evolution as an Optimization Process
The Genetic Algorithm as a Metaphor
Overview
Genetic Algorithms Tutorial
The Simple Genetic Algorithm
Analysis of the Simple Genetic Algorithm
The Schema Theorem
Convergence
Known Problems
Estimating Parameter Values
Variations on the Simple Genetic Algorithm
Is It Real or Is It a Genetic Algorithm?
Examples of Chemical Applications (With Emphasis on the Genetic Algorithm Method)
Conformational Searching: Molecular Clusters
Conformational Searching: Small Molecules
Conformational Searching: Proteins
Conformational Searching: Docking
Conformational Searching: DNA/RNA
Protein NMR Data Analysis
Protein X-ray Data Analysis
Molecular Similarity
QSAR
Design of Molecules
DNA and Protein Sequence Applications
Data Clustering
Spectral Curve Fitting
General Model Fitting
Potential Energy Functions
Summary and Comparison with Other Global Optimization Methods
Brief Overview of Other Global Search Methods
Summary of Comparison Between Genetic Algorithm and Other Methods
Appendix 1. Literature Sources
Appendix 2. Public Domain Genetic Algorithm Codes
Acknowledgments
References

2. Does Combinatorial Chemistry Obviate Computer-Aided Drug Design?
Eric J. Martin, David C. Spellmeyer, Roger E. Critchlow Jr., and Jeffrey M. Blaney
Introduction
Fragments vs. Whole Molecules
Similarity and "Property Space"
Properties
Experimental Design
Selecting Substituent Sets
Template Diversity
Second-Generation Libraries
Structure-Based Library Design
Calibration of Diversity Score
Evaluating Efficiency of Experimental Design
Comparison to Clustering
Corporate Archives Diversity Space
Comparing Diversity Among Libraries
Synthesis and Testing of Mixtures
Conclusions
References

3. Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics
Robert Q. Topper
Molecular Dynamics in Phase Space
Introduction
What We Hope to Gain: Semiclassical Insight
Reaction Rates from Dynamics Simulations
Initial Conditions
Rate Constants
Chemical Kinetics, Chaos, and Molecular Motions
A Brief Review of Absolute Rate Theory
Overview of Nonlinear Dynamics and Chaos Theory
Visualizing Uncoupled Isomerization Dynamics in Phase Space
Technical Overview of Nonlinear Dynamics
Some Essential Theorems
Visualizing Phase Space on Poincaré Maps: Practical Aspects
Interpreting Poincaré Maps
Linear Stability Analysis of Periodic Orbits
Numerical Reconstruction of the Separatrix
Visualizing Coupled Isomerization Dynamics in Phase Space
Isomerization in Two Coupled Degrees of Freedom
Reactive Islands Kinetic Theory
Isomerization in Many Coupled Degrees of Freedom
The Poincaré Integral Invariants
A Note on Arnold Diffusion
Summary and Conclusions
Acknowledgments
References

4. Computational Studies in Nonlinear Dynamics
Raima Larter and Kenneth Showalter
Introduction: Nonlinear Dynamics and Universal Behavior
Homogeneous Systems
Multiple Steady States
Autocatalysis as a Source of Bistability
The Iodate-Arsenite Reaction
Bistability as a Universal Phenomenon
Normal Forms
Bifurcations and Stability Analysis
Generalization to Multiple Variable Systems
Oscillations
Numerical Methods for the Solution of Ordinary Differential Equations
Continuation Method for Steady State Computations
Nonhomogeneous Systems
Turing Patterns: Nonhomogeneous, Steady State Patterns from Reaction-Diffusion Processes
Chemical Waves: Propagating Reaction-Diffusion Fronts
Quadratic Autocatalysis Fronts
Cubic Autocatalysis Fronts
Lateral Instabilities: Two- and Three-Dimensional Patterns
Numerical Methods for Solution of Partial Differential Equations
Cellular Automata and Other Coupled Lattice Methods
Geometric Representations of Nonlinear Dynamics
Phase Space, Poincaré Sections, and Poincaré Maps
Chaos
Attractors
Sensitive Dependence on Initial Conditions: The Lyapunov Exponent
Routes to Chaos
Numerical Analysis of Experimental Data
Reconstruction of Phase Portraits
Calculation of the Correlation Dimension
Lyapunov Exponents
Conclusions
Acknowledgments
References

5. The Development of Computational Chemistry in the United Kingdom
Stephen J. Smith and Brian T. Sutcliffe
Introduction
Beginnings
Manchester
Cambridge
Emerging from the 1950s
The 1960s
The Atlas Computer Laboratory
The Flowers Report
Emerging from the 1960s
The 1970s
The Meeting House Developments
The Chemical Database Developments
The Growth of Networking
Daresbury and Collaborative Research Projects
CCP1 and the Advent of Vector Processing
Quantum Chemistry Outside CCP1
Into the 1980s
Computer Developments
Computational Chemistry Developments
Epilogue
Acknowledgments
References

Author Index
Subject Index
Contributors

Jeffrey M. Blaney, Chiron Corporation, 4560 Horton Street, Emeryville, California 94608, U.S.A. (Electronic mail:
[email protected]) Roger E. Critchlow Jr., Chiron Corporation, 4560 Horton Street, Emeryville, California 94608, U.S.A. (Electronic mail:
[email protected]) Richard Judson, Center for Computational Engineering, Sandia National Laboratories, Livermore, California 9455 1-0969, U.S.A. (Electronic mail: rs juds@ca. sandia.gov) Raima Larter, Department of Chemistry, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, Indiana 46202-3274, U.S.A. (Electronic mail:
[email protected])
Eric J. Martin, Chiron Corporation, 4560 Horton Street, Emeryville, California 94608, U.S.A. (Electronic mail:
[email protected]) Kenneth Showalter, Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506, U.S.A. (Electronic mail:
[email protected]) Stephen J. Smith, Department of Chemistry, University of York, York YO1 SDD, England, U.K. David C. Spellmeyer, Chiron Corporation, 4560 Horton Street, Emeryville, California 94608, U.S.A. (Electronic mail:
[email protected]) Brian T. Sutcliffe, Department of Chemistry, University of York, York YO1 SDD, England, U.K. (Electronic mail:
[email protected]) Robert Q. Topper, Department of Chemistry, The Copper Union for the Advancement of Science and Art, Albert Nerken School of Engineering, 51 Astor Place, New York, NY 10003, U.S.A. (Electronic mail:
[email protected])
Contributors to Previous Volumes*

Volume 1
David Feller and Ernest R. Davidson, Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions. James J. P. Stewart,† Semiempirical Molecular Orbital Methods. Clifford E. Dykstra,‡ Joseph D. Augspurger, Bernard Kirtman, and David J. Malik, Properties of Molecules by Direct Calculation. Ernest L. Plummer, The Application of Quantitative Design Strategies in Pesticide Design. Peter C. Jurs, Chemometrics and Multivariate Analysis in Analytical Chemistry. Yvonne C. Martin, Mark G. Bures, and Peter Willett, Searching Databases of Three-Dimensional Structures. Paul G. Mezey, Molecular Surfaces. Terry P. Lybrand,§ Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods. Donald B. Boyd, Aspects of Molecular Modeling.
*For chapters where no author can be reached at the address given in the original volume, the current affiliation of the senior author is given here.
†Current address: 15210 Paddington Circle, Colorado Springs, CO 80921 (Electronic mail: [email protected]).
‡Current address: Indiana University-Purdue University at Indianapolis, Indianapolis, IN 46202 (Electronic mail: [email protected]).
§Current address: University of Washington, Seattle, WA 98195 (Electronic mail: [email protected]).
Donald B. Boyd, Successes of Computer-Assisted Molecular Design. Ernest R. Davidson, Perspectives on Ab Initio Calculations.
Volume 2
Andrew R. Leach,† A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules. John M. Troyer and Fred E. Cohen, Simplified Models for Understanding and Predicting Protein Structure. J. Phillip Bowen and Norman L. Allinger, Molecular Mechanics: The Art and Science of Parameterization. Uri Dinur and Arnold T. Hagler, New Approaches to Empirical Force Fields.
Steve Scheiner, Calculating the Properties of Hydrogen Bonds by Ab Initio Methods. Donald E. Williams, Net Atomic Charge and Multipole Models for the Ab Initio Molecular Electric Potential. Peter Politzer and Jane S. Murray, Molecular Electrostatic Potentials and Chemical Reactivity. Michael C. Zerner, Semiempirical Molecular Orbital Methods. Lowell H. Hall and Lemont B. Kier, The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling.
I. B. Bersuker‡ and A. S. Dimoglo, The Electron-Topological Approach to the QSAR Problem. Donald B. Boyd, The Computational Chemistry Literature.
†Current address: Glaxo Wellcome, Greenford, Middlesex, UB6 0HE, U.K. (Electronic mail: [email protected]).
‡Current address: University of Texas, Austin, TX 78712 (Electronic mail: [email protected]).
Volume 3
Tamar Schlick, Optimization Methods in Computational Chemistry. Harold A. Scheraga, Predicting Three-Dimensional Structures of Oligopeptides. Andrew E. Torda and Wilfred F. van Gunsteren, Molecular Modeling Using NMR Data. David F. V. Lewis, Computer-Assisted Methods in the Evaluation of Chemical Toxicity.
Volume 4
Jerzy Cioslowski, Ab Initio Calculations on Large Molecules: Methodology and Applications. Michael L. McKee and Michael Page, Computing Reaction Pathways on Molecular Potential Energy Surfaces. Robert M. Whitnell and Kent R. Wilson, Computational Molecular Dynamics of Chemical Reactions in Solution. Roger L. DeKock, Jeffry D. Madura, Frank Rioux, and Joseph Casanova, Computational Chemistry in the Undergraduate Curriculum.
Volume 5
John D. Bolcer and Robert B. Hermann, The Development of Computational Chemistry in the United States. Rodney J. Bartlett and John F. Stanton, Applications of Post-Hartree-Fock Methods: A Tutorial. Steven M. Bachrach, Population Analysis and Electron Densities from Quantum Mechanics. Jeffry D. Madura, Malcolm E. Davis, Michael K. Gilson, Rebecca C. Wade, Brock A. Luty, and J. Andrew McCammon, Biological Applications of Electrostatic Calculations and Brownian Dynamics Simulations.
K. V. Damodaran and Kenneth M. Merz Jr., Computer Simulation of Lipid Systems.
Jeffrey M. Blaney and J. Scott Dixon, Distance Geometry in Molecular Modeling. Lisa M. Balbes, S. Wayne Mascarella, and Donald B. Boyd, A Perspective of Modern Methods in Computer-Aided Drug Design.
Volume 6
Christopher J. Cramer and Donald G. Truhlar, Continuum Solvation Models: Classical and Quantum Mechanical Implementations. Clark R. Landis, Daniel M. Root, and Thomas Cleveland, Molecular Mechanics Force Fields for Modeling Inorganic and Organometallic Compounds. Vassilios Galiatsatos, Computational Methods for Modeling Polymers: An Introduction. Rick A. Kendall, Robert J. Harrison, Rik J. Littlefield, and Martyn F. Guest, High Performance Computing in Computational Chemistry: Methods and Machines. Donald B. Boyd, Molecular Modeling Software in Use: Publication Trends. Eiji Osawa and Kenny B. Lipkowitz, Published Force Field Parameters.
Volume 7
Geoffrey M. Downs and Peter Willett, Similarity Searching in Databases of Chemical Structures. Andrew C. Good and Jonathan S. Mason, Three-Dimensional Structure Database Searches.
Jiali Gao, Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials. Libero J. Bartolotti and Ken Flurchick, An Introduction to Density Functional Theory.
Alain St-Amant, Density Functional Methods in Biomolecular Modeling. Danya Yang and Arvi Rauk, The A Priori Calculation of Vibrational Circular Dichroism Intensities. Donald B. Boyd, Compendium of Software for Molecular Modeling.
Volume 8
Zdeněk Slanina, Shyi-Long Lee, and Chin-hui Yu, Computations in Treating Fullerenes and Carbon Aggregates. Gernot Frenking, Iris Antes, Marlis Böhme, Stefan Dapprich, Andreas W. Ehlers, Volker Jonas, Arndt Neuhaus, Michael Otto, Ralf Stegmann, Achim Veldkamp, and Sergei F. Vyboishchikov, Pseudopotential Calculations of Transition Metal Compounds: Scope and Limitations. Thomas R. Cundari, Michael T. Benson, M. Leigh Lutz, and Shaun O. Sommerer, Effective Core Potential Approaches to the Chemistry of the Heavier Elements. Jan Almlöf and Odd Gropen, Relativistic Effects in Chemistry. Donald B. Chesnut, The Ab Initio Computation of Nuclear Magnetic Resonance Chemical Shielding.
Volume 9
James R. Damewood, Jr., Peptide Mimetic Design with the Aid of Computational Chemistry.
T. P. Straatsma, Free Energy by Molecular Simulation. Robert J. Woods, The Application of Molecular Modeling Techniques to the Determination of Oligosaccharide Solution Conformations.
Ingrid Pettersson and Tommy Liljefors, Molecular Mechanics Calculated Conformational Energies of Organic Molecules: A Comparison of Force Fields. Gustavo A. Arteca, Molecular Shape Descriptors.
CHAPTER 1
Genetic Algorithms and Their Use in Chemistry Richard Judson Center for Computational Engineering, Sandia National Laboratories, Livermore, California 94552 -0969
INTRODUCTION This chapter is mainly about optimization using genetic algorithms (GAS). Although GAS were not originally designed as numerical optimizers, this application accounts for much of their popularity in scientific and engineering fields having no connection to the biological roots of GAS. It is useful to begin, therefore, by defining the class of problems for which we will examine GA-based solutions. Numerical optimization is commonly divided into two areas at a high level, namely local and global optimization.’ Figure 1 shows a typical, one-dimensional function whose global minimum is sought. The function f(x) displays three local minima, labeled 1, 11, and Ill, of which 111 is the lowest, or global, minimum. A minimum is simply a point at which the first derivative is zero and the second derivative is positive. The goal of a local minimization method is to start at some point x“ and move “downhill” as efficiently as possible to a nearby local minimum. The local method will almost always use gradient information if it is available. The goal of a global optimization method is, in contrast, to find which of many local minima is the lowest. Usually I11 is what is sought. Local minimizers are not effective in Reviews in Computational Chemistry, Volume 10 Kenny B. Lipkowitz and Donald B. Boyd, Editors VCH Publishers, Inc. New York, 0 1997
1
2 Genetic Algorithms and Their Use in Chemistry
I
111 X
Figure 1 A sample one-dimensional fitness function illustrating local and global minima.
performing a global search, and, vice versa, no global optimization method is very good at doing the local fine tuning needed to get to the bottom of a well. Another distinction between local and global optimization methods is that the latter, the GAS included, almost never use gradient information. This makes them useful in the large class of problems characterized by discontinuous functions. Optimization implies that the method moves in a useful direction, at least eventually. In using a local gradient method, one can check this property by evaluating local function values and gradients, both of which should decrease monotonically, at least near the solution. However, during a search for a global minimum, the clues about the multidimensional landscape are more scattered, so it helps to run a number of searches in parallel. The GA is a prime example of a parallel optimization method that searches many regions of parameter space simultaneously. Another concept in global optimization that needs an early introduction is that of the hypersurface called the fitness landscape. The fitness is the GA term for the value of the function f(x). GAS come with their own vocabulary, starting with fitness, then populations, which are sets of individuals each of whom samples the search space. An individual’s chromosome is comprised of its set of parameters ( x in the example). Individuals are selected for mating and reproduction based on their fitness values-the more fit (lower f ) the higher the chance of producing offspring. An optimization run continues for a series of generations until some stopping criteria are met. A smooth fitness landscape is one with few local minima, is continuous, and which perhaps points downhill toward the global minimum after some smoothing. Many global methods will
lntroduction 3 perform well on such a landscape. A search starting at almost any point on that surface has a good chance of finding its way to the global optimum. The opposite of a smooth landscape is one that is rugged. A rugged landscape has many local optima, with deep and shallow wells randomly mixed together. A pathological example of this class is the needle-in-a-haystack function, which has a huge number of wells of random depth and width, having a single very deep but very narrow global minimum hiding the proverbial needle. There is no method, the GA included, that can solve this type of problem. Somewhere between these two extremes are many classes of important but difficult problems in computational chemistry for which the GA is a good solution tool.
Natural Evolution as an Optimization Process The GA is based loosely on the concepts of natural evolution and the (controversial) idea that organisms adapt in an optimal way to their environment. Charles Darwin’s Origin ofSpecics2 is one long argument for the idea that natural selection is the vehicle for altering species to maximize their chances of surviving in their local environment. Certain traits are advantageous for producing offspring and, because individuals tend to produce offspring much like themselves, those traits will tend to increase in the population. Over time, this selection of the fittest (defined as the individuals that possess these traits) alters the characteristics of the species as a whole to maximize the species’ probability of survival. Natural selection uses tools similar to those of breeders who aim to improve domesticated species by preferentially selecting individuals that display desirable traits. The handle for selection, either artificial or natural, is always the variation among individuals in the species. Darwin states in the final recapitulation of his book2 If then, animals and plants do vary, let it be ever so slightly or slowly, why should not variations or individual differences, which are in any way beneficial, be preserved and accumulated through natural selection, or survival of the fittest? If man can by patience select variations useful to him, why, under changing and complex conditions of life, should not variations useful to nature’s living products often arise, and be preserved or selected? What limit can be put to this power, acting during long ages and rigidly scrutinizing the whole constitution, structure and habits of each creature,-favoring the good and rejecting the bad? I see no limit to the power, in slowly and beautifully adapting each form to the most complex relations of life.
A few simple principles underlie the development of all the complexities of life as we know it. Darwin concludes by saying that “the theory of natural selection, even if we look no further than this, seems to be in the highest degree
4 Genetic Algorithms and Their Use in Chemistry
possible.” Thomas Huxley, another renowned evolutionist, is supposed to have remarked, on reading Darwin, “HOWutterly simple. How extremely stupid not to have thought of that.”3 The notion that natural selection produces optimal species can be taken too far, of course. Darwin did not advocate the idea that the latest-appearing species along a lineage is the “best” (humans, for instance). The theory stops at the point of stating that species evolve to maximize their chahces of being able to reproduce in their particular ecological niches. It is vital to stress that it is the species and not the individual that changes and improves its reproductive chances. Nature does this in large part by selecting fit parents to breed and produce, on the average, fit offspring. Another property possessed by species is variation, which allows the species to explore and, under the right conditions, to exploit and colonize new niches. One major difference between artificial and natural selection is that of intent. The breeder knows where she wants the species to go, fixes the goal over many generations, and steadily moves toward it. In nature, the environment, and hence the measure of fitness, tends to change over time, presenting a moving target for the species. In addition, many traits, some incompatible with others, are important for survival. This requires tradeoffs and hence makes it difficult to say that this trait or that is the one being optimized. Also, species rely on one another, and coevolve, reaching situations where one symbiont may decrease its chances of reproduction to increase the chances of the other symbiont. However, this can simply be looked at as optimizing survival probability of the symbiotic system.4 For instance, many viruses become less virulent over time because they would destroy their host otherwise.
The Genetic Algorithm as a Metaphor Evolutionary algorithms such as the GA act much like domestic breeders in that individuals with good function scores are selected to breed more often than those with poor scores. Further, these individuals tend to produce offspring that look like themselves and are slightly better often enough to continue a steady increase in the goodness of the function scores. Just as in Darwinian evolution, the collection of individuals, that is, the population, is what survives and evolves over time. In the GA the population improves against some fixed goal, whereas in Darwinian evolution the goal may change with time. In the GA, individuals see only other individuals, and their fitness is important only relative to that of the rest of the population. The GA employs a variety of Darwinian concepts in addition to the ones already mentioned, including breeding, selection (sexual and otherwise), and mutation. Although these serve only in a metaphoric sense in the applications that are discussed in this chapter, they mimic real evolutionary processes closely enough to be widely used in modeling biological systems. The evolutionary algorithm idea arose independently at least twice. John
Genetic .Algorithms Tutorial 5
Holland576 began the development of the genetic algorithm in the 1960s, and it is the outgrowth of his ideas with which we are mainly concerned. It is interesting to note that Holland’s earliest motivation was not optimization, but rather machine learning. Almost at the same time, Ingo Rechenberg in Germany was developing what he termed Evolutionary Strategies7 whose main aim was optimization of engineering problems. Both approaches rely on the process of selection to successively increase the fitness of a population, but carry out the evolution in somewhat different ways. In particular, Rechenberg’s early implementations used a single individual rather than a population and did not immediately include the idea of crossover even after populations of individuals began to be used. Over a period of time, however, the two approaches evolved to the point where they differed only in detail. Most of the discussion in this chapter concentrates on Holland’s GA, but a brief description of the standard evolutionary strategy is also included.
Overview The chapter is divided into three major parts. The first is a primer for the GA method, starting with a discussion and example of the simple genetic algorithm, or SGA. We next discuss why the GA works, ultimately leading up to Holland’s Schema Theorem. This discussion is followed by problems and failure modes of the GA, and some advice on choosing parameter values. Many variants on the SGA have been discussed in the literature, and we attempt to put these in context, explain their use and purpose, and evaluate situations in which they might be useful. The section ends by returning to the correspondence between the GA and biological evolution. The second major section is a review of the uses of GAS in chemical applications, stressing variants on the GA rather than the chemically relevant results. Because well over 100 papers have been published in the field in the last few years, only representative and especially illustrative work is reviewed in depth. The final section gives references and sources of more information and software, much of it on-line.
GENETIC ALGORITHMS TUTORIAL This section provides an overview of what a GA is, why it works, and some of the reasons why it can fail. Many research groups are using GAS to solve a variety of problems. In almost all cases, however, the underlying GA is no more than a variant on Holland’s original formulation which has come to be known’ as the simple GA (SGA). We start by describing this in detail and then discuss some of the variations.
6 Genetic Algorithms and Their Use in Chemistry
The Simple Genetic Algorithm The SGA takes several metaphors from Darwinian evolution and creates simple computational analogies. The first of these is the chromosome, which is a one-dimensional genotype that is translated into the complex phenotype. The second is that of a population of individuals, each carrying its individual and perhaps unique chromosome. The third analogy is that of an individual’s fitness, which determines its likelihood of mating and producing offspring. Finally reproduction mechanisms of chromosome exchange (crossover) and mutation are introduced. Each of these are considered in turn. To make the ideas concrete, we illustrate them with a simple parameter estimation problem. For the model problem, assume that inhibition constants have been measured for a series of 10 compounds, and a quantitative structure-activity relationship (QSAR) model has been developed of the form
where log& is the log of the inhibition constant, v is the molecular volume, and nHB is the number of hydrogen bonding groups on the molecule. Our task is to find values of the’parameters (xo,xl, x2, x 3 ) that best fit the data. The “true” values of the parameters will be taken to be to (0.35,0.78,4.07,1.45).As seen later in this section, the simple GA discretizes what are often continuous parameters, employing a user-specified grid spacing. The “true” values were chosen so that they lie on one of the grid points in the discrete four-dimensional space of parameters being searched. Values of the molecular properties (Y, nHB, logK,) for 10 test compounds are given in Table 1. Binary Chronzosornes In the SGA, each individual is represented by a binary chromosome that is simply a string of 1s and 0s. This string can then be translated into parameTable 1 Molecular Properties for the 10 Test Compounds 1 2 3
250
5
265 265 260 260 100 120
4
6 7 8 9 10
250
24.5 245
5 4
5 4 5
4 5 4 3 3
-6.02 5.59 -6.42 5.18 -4.81 6.80 -5.21 6.40 2.69 4.63
In our test problem, the chromosome might look like

101 010 000 101
where each of the xi is represented by three bits. The value of x1 can range from -1 to 1 and can take on one of 2^3 = 8 discrete values spaced 2/(2^3 - 1) apart, yielding a resolution of 2/7 units. The chromosome would be translated into the relevant floating point values by converting each three-bit "word" or "gene" of the binary string into decimal and appropriately shifting and scaling, that is,

$x_i = x_i^{\min} + \frac{d_i}{2^n - 1}\,(x_i^{\max} - x_i^{\min})$   [2]
where $x_i^{\min}$ and $x_i^{\max}$ are the minimum and maximum of the range of the variable, $d_i$ is the decimal value of the binary word, and n is the number of bits for the word. Typically a simple scaling from binary to floating representation is used, but there are cases in which the real value is constrained to lie only in disconnected regions. This requires that a complicated filter be used. For instance, when doing conformational searches on proteins, the backbone {φ, ψ} angles never take on certain values, as seen in Ramachandran plots.8 A conformational search will proceed more efficiently if the low-probability regions in the Ramachandran plot are never sampled. Often, instead of using straight binary representation (e.g., 011 = 3), binary Gray coding will be used.9 Gray-coded binary numbers have the property that successive integers differ by a single bit flip. A corollary to this is that most (but not all) single bit flips will change the decimal number by 1. The three-bit Gray-coded numbers and their standard binary counterparts are given by

Gray Code    Decimal    Standard Binary
000          0          000
001          1          001
011          2          010
010          3          011
110          4          100
111          5          101
101          6          110
100          7          111
The reason for using Gray coding will be described just below in the section on mutation.
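A minimal C sketch of this decoding machinery follows; the function names and the bit-array representation of a gene are illustrative assumptions, not code from the chapter. decode_gene() applies the scaling of Eq. [2] to one gene, and gray_to_binary() converts a Gray-coded gene to standard binary so that it can be decoded the same way.

    /* Decode one n-bit gene (most significant bit first) into an
       integer d, then scale it into [xmin, xmax] as in Eq. [2]. */
    double decode_gene(const int *bits, int n, double xmin, double xmax)
    {
        int i, d = 0;
        for (i = 0; i < n; i++)
            d = (d << 1) | bits[i];
        return xmin + d * (xmax - xmin) / ((1 << n) - 1);
    }

    /* Convert an n-bit Gray-coded gene to standard binary in place:
       b[0] = g[0], b[i] = b[i-1] XOR g[i]. */
    void gray_to_binary(int *bits, int n)
    {
        int i;
        for (i = 1; i < n; i++)
            bits[i] ^= bits[i - 1];
    }

For the first gene of the sample chromosome, 101, straight binary decoding gives d = 5 and hence, for a range of [-1, 1], x = -1 + 5(2/7) ≈ 0.43.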
The Fitness Function

Whereas the binary chromosome represents the individual's genotype, the fitness function represents the phenotype. From a user's standpoint, this is simply the function to be optimized. The GA program will have a subroutine that takes the binary chromosome, translates it into an array of integer or floating point numbers, and then passes this array to the user's function. The function is then evaluated for that set of parameters, and the value of the function is passed back as the fitness. A function for our example from Eq. [1] is given below. It returns the root mean square (RMS) deviation between the experimental and calculated values of log Ki:

$\text{fitness} = \sqrt{\frac{1}{10}\sum_{i=1}^{10}\left(\log K_i^{\text{calc}} - \log K_i^{\text{exp}}\right)^2}$   [3]
If the "true" values of the parameters are found, the fitness function will return a value of zero. In the programming language C, we have for the data in Table 1:

    #include <math.h>

    /* qsar_model() evaluates the QSAR expression of Eq. [1];
       assumed helper, see the note below the code. */
    extern double qsar_model(const double *x, double v, int nhb);

    double eval(int nx, double *x)
    {
        double v[10]     = {250, 250, 245, 245, 265, 265, 260, 260, 100, 120};
        int    nhb[10]   = {5, 4, 5, 4, 5, 4, 5, 4, 3, 3};
        double logki[10] = {-6.02, 5.59, -6.42, 5.18, -4.81, 6.80, -5.21, 6.40, 2.69, 4.63};
        int i, n = 10;
        double logki_calc, fitness = 0.0;
        for (i = 0; i < n; i++) {
            logki_calc = qsar_model(x, v[i], nhb[i]);   /* prediction of Eq. [1] */
            fitness += (logki_calc - logki[i]) * (logki_calc - logki[i]);
        }
        fitness = sqrt(fitness / n);
        return fitness;
    }

Here qsar_model() stands for the expression of the QSAR model, Eq. [1], evaluated for the parameters x and the properties of compound i.
The large majority of GA applications use some variant on the SGA and can get away with using one of the public domain GA codes as a black box. (See Appendix 2 for details on accessing public domain GA codes.) The difficult part typically comes in formulating the fitness function in a way that makes sense for the GA. We set this issue aside for the moment and return to it later in this section and in the discussion of applications.

Populations

The GA simultaneously operates on an entire population of individuals. Populations can range in size from a few individuals to many thousands. The
initial population is typically created by generating npop random binary strings of the appropriate length. During the course of a GA evolution, both the fitness of the best individuals and that of the population as a whole will tend to improve (i.e., the error will decrease in our example), although there are exceptions. Four statistics that are often calculated for the population are (1) the fitness of the best individual, (2) the average fitness for the population, (3) the number of bits converged, and (4) the number of bits "lost." A bit is converged if it takes on the same value (1 or 0) in a fraction of individuals greater than the convergence threshold, which is typically set equal to 0.9. A bit is lost if it takes on the same value in every member of the population. DeJong10 introduced the concepts of on-line and off-line performance to discuss the performance of the GA. On-line performance is defined as the average fitness value of all individuals examined so far:

$f_{\text{online}}(N) = \frac{1}{N}\sum_{i=1}^{N} f(i)$   [4]
The off-line convergence measure is given by

$f_{\text{offline}}(N) = \frac{1}{N}\sum_{i=1}^{N} f^{*}(i)$   [5]
where f*(i) is the best (lowest) function value seen prior to function evaluation i, and N is the total number of function calls so far. On-line performance can continue to oscillate over time, but off-line performance is a monotonically decreasing function that will converge once the search ceases to find lower function values. In the SGA, the size of the population remains constant, but the composition changes from one generation to the next. The GA alters the population by using a series of operators, which are defined next.
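Before turning to the operators, here is a minimal sketch of how the two running measures of Eqs. [4] and [5] can be accumulated during a run; the variable and function names are illustrative assumptions, and minimization is assumed.

    /* Running DeJong performance measures (Eqs. [4] and [5]).
       Call once per fitness evaluation, with f the new fitness. */
    static double sum_f = 0.0, sum_best = 0.0, best_so_far = 1.0e30;
    static long   ncalls = 0;

    void update_performance(double f, double *online, double *offline)
    {
        ncalls++;
        sum_f += f;                          /* for on-line performance  */
        if (f < best_so_far) best_so_far = f;
        sum_best += best_so_far;             /* for off-line performance */
        *online  = sum_f / ncalls;
        *offline = sum_best / ncalls;
    }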
Selection

The selection operator plays the role of natural or artificial selection. After the fitness of each individual is calculated, the population is subjected to a selection process that causes certain individuals to have a higher probability of producing offspring for the next generation. The SGA uses roulette wheel selection. The essential idea behind this is that each individual is given a slice of a circle of unit circumference, proportional to the individual's fitness. The most fit individual gets the largest slice and the least fit individual gets the smallest, as illustrated in Figure 2. The mating pool is produced by calculating npop random numbers ranging from 0 to 1 (corresponding to locations along the circumference of the circle) and including individuals whose slices are chosen. Note that the best individual will likely enter the mating pool multiple times.
Figure 2 An illustration of the roulette wheel used for selecting individuals for mating. The fraction of the circle allotted to an individual is proportional to the individual's fitness.
One of the dangers of roulette wheel selection is that if a single individual is much more fit than any other in the population, then it can completely take over the population. This causes the GA to converge prematurely. Once the mating pool is produced, the next generation is created with replacement and crossover operators. It is worth reiterating that the SGA uses a constant population size that is specified at the beginning of the calculation. Replacement and crossover terminate when this number of individuals have been created for the new population.
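A sketch of roulette wheel selection in C is shown below. It assumes the raw scores have already been transformed so that larger fit[i] means more fit (necessary for a minimization problem such as our example); the helper name and the use of rand() are illustrative choices, not the chapter's code.

    #include <stdlib.h>

    /* Roulette wheel selection: returns the index of one parent.
       Assumes fit[i] >= 0 and that larger values mean more fit. */
    int roulette_select(const double *fit, int npop)
    {
        double total = 0.0, r, running = 0.0;
        int i;
        for (i = 0; i < npop; i++)
            total += fit[i];
        r = total * rand() / (RAND_MAX + 1.0);   /* spin the wheel */
        for (i = 0; i < npop; i++) {
            running += fit[i];
            if (running > r) return i;
        }
        return npop - 1;   /* guard against round-off */
    }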
Replacement
In replacement, some individuals in the mating pool are simply copied directly into the next generation population, in the spirit of asexual reproduction or cloning.
Crossover

The remainder of the new population is filled in via single-point crossover, which is a type of sexual reproduction. A pair of individuals (the parents) is chosen from the mating pool, and their chromosomes are lined up, split at a single point, and the left and right halves are swapped, producing two new individuals (the children) (Figure 3). In some implementations, the crossover point is chosen completely at random, without regard to gene boundaries. Often it makes more sense in the context of the physical problem being optimized to restrict crossover points to boundaries between genes or function parameters.
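A minimal sketch of the single-point crossover operator follows, under the assumption that chromosomes are stored as arrays of ints; the helper name and use of rand() (from <stdlib.h>) are illustrative, not the chapter's code.

    #include <stdlib.h>

    /* Single-point crossover: split both parents at the same random
       point and swap the tails (Figure 3). nbits is the chromosome
       length; cut is chosen in 1..nbits-1 so each child inherits at
       least one bit from each parent. */
    void crossover(const int *p1, const int *p2, int *c1, int *c2, int nbits)
    {
        int i, cut = 1 + rand() % (nbits - 1);
        for (i = 0; i < nbits; i++) {
            c1[i] = (i < cut) ? p1[i] : p2[i];
            c2[i] = (i < cut) ? p2[i] : p1[i];
        }
    }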
Mutation
Once the new population is filled in, a certain fraction of the individuals undergo mutation. The mutation operator simply flips certain bits from 1 to 0 or vice versa. Two definitions of the mutation rate are commonly used.
According to the first definition, the mutation rate is the fraction of bits in the entire population that undergo mutation. Each bit in the population is looked at individually, and a decision is made whether or not to mutate it. Therefore some individuals will have multiple mutations, and many or most individuals will have none. In the second definition, the mutation rate specifies the fraction of individuals that have a mutation. Some individuals are chosen to be mutated, and one bit in each of these flips. No bits in the other individuals are examined.

The mutation operator provides the rationale for using Gray codes. When almost any bit is flipped in a Gray-coded binary number, the value of the number will change by the smallest possible integer amount. Therefore, mutation has the effect of local search in parameter space. If standard binary coding were used, a mutation to the most significant bit of the gene would cause a huge change in the parameter value. These large steps in parameter space are better carried out via the crossover operation which, as shown later, preserves important building blocks in the chromosome.

Elitism

A final operator often used in the SGA is elitism, which is not biologically based. This term simply means that the best individual in generation i is transferred into the population of generation i + 1 and is guaranteed not to undergo mutation. This best individual may also produce other offspring through replacement and crossover. Using elitism ensures that the best individual will not be lost until a superior one is found.

Simple Genetic Algorithm Pseudo Code

At this stage we can outline the main body of a typical SGA code.
Initialize population
for (ngen generations) {
    for (i=1 to npop) {
        Evaluate Fitness of individual i
    }
    Tabulate population statistics
    Select individuals and create mating pool
    if (Elitism) move best individual to new population
    Partially fill in new population with Replacement
    Finish filling in new population with Crossover
    Carry out Mutation on new population
}
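The Mutation section above described two definitions of the mutation rate; here is a sketch of both in Python (illustrative code, with the population stored as a 0/1 NumPy array):

import numpy as np

def mutate_per_bit(population, rate, rng):
    # Definition 1: every bit in the population flips with probability
    # `rate`, so some individuals get several mutations and most get none.
    flip = rng.random(population.shape) < rate
    return np.where(flip, 1 - population, population)

def mutate_per_individual(population, rate, rng):
    # Definition 2: a fraction `rate` of the individuals is chosen,
    # and exactly one bit flips in each of them.
    pop = population.copy()
    npop, nbits = pop.shape
    for i in rng.choice(npop, size=int(rate * npop), replace=False):
        j = int(rng.integers(nbits))
        pop[i, j] = 1 - pop[i, j]
    return pop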
There exist multiple variants on this outline for the SGA, but none has any great theoretical or practical advantages. However, there are other operators that can be used, and variants on the ones described so far, many of which will be discussed after we show how and why the basic GA works in practice.
Optimizing the 10-Compound QSAR Model

We are now ready to use the SGA to calculate parameters that best fit the model of Eq. [1]. The run parameters that we will use are:

10-Compound QSAR Run 1 Parameters
    Population size          100
    Number of generations    200
    Mutation rate            0.001
    Crossover rate           0.6
    Convergence threshold    0.8
    Selection method         roulette wheel
    Bits per parameter       6
    Populations              1

The ranges of the four parameters are:

    -1.0 ≤ x0 ≤ 0.89
     0.0 ≤ x1 ≤ 1.89
     2.0 ≤ x2 ≤ 4.835
     1.0 ≤ x3 ≤ 1.63
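With 6 bits per parameter, each xi lives on a grid of 64 points spanning its range; a small sketch of the decoding (illustrative, assuming simple binary rather than Gray coding):

def decode_parameter(bits, lo, hi):
    # Map a bit string such as '101101' linearly onto [lo, hi];
    # 6 bits give 64 grid points (63 intervals).
    levels = 2 ** len(bits) - 1
    return lo + int(bits, 2) * (hi - lo) / levels

# decode_parameter('000000', -1.0, 0.89) -> -1.0
# decode_parameter('111111', -1.0, 0.89) -> 0.89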
The GA run parameters are typical of what is used in many applications. A rough rule of thumb is to choose a population size 10 times the number of parameters to be optimized, but no less than 100. The number of generations times the population size determines the total computer time needed for the optimization. With a problem such as the QSAR example, the longer it is run,
the closer one comes to the "true" answer. In this case, 200 generations is a balance between the desire to get the best possible model and the need for relatively quick turnaround. The mutation and crossover rates tabulated above are set at the low end of what is typically used. Later in this section, we will examine the effect of increasing these values. The convergence threshold does not affect how the GA performs, but is used as a diagnostic value to show when the GA has switched over from a broad search to one that is more local. The number of bits per parameter determines how finely the GA samples the continuous parameters; that is, the number of bits translates into resolution. Enough bits should be used so that the important features of the fitness landscape are captured. Using too many bits slows down the search. The ranges of the parameters xi are chosen so that the values on the grid points are rational numbers. Ordinarily, one does not worry about this, and it is done here only to ensure that the GA can find the "true" parameters exactly. In Figures 4 and 5 we show the progress of the GA for this run. The top curve in Figure 4 shows the population average RMS error, which drops from an initial high value of 979.8 (off the graph) down to still very high values around 150. Recall that a good model needs to predict values within 1 log unit. The dashed line in Figure 4 shows the number of bits converged, which quickly reaches values of around 17 (out of 24). In Figure 5, we see the RMS error of the best individual; it starts high (1.3) but quickly drops to a reasonable level of 0.42.
Figure 4 The top curve shows the population average RMS error for the 10-compound QSAR problem, as a function of the number of generations run. The dashed curve shows the number of bits converged, out of 24.
Figure 5 The evolution of the RMS error for the best individual during the 10-compound QSAR run.
The large errors in the population average reflect the fact that a number of individuals have very wrong values for the exponent variables x1 and x3. In Figure 5, we also see that the GA has converged well before the end of the run. The final best value shows up in generation 78 out of 200. The next question to ask is how well the "good" solution provided by the GA represents the "true" set of parameters. The following table compares the best estimate of the parameters given by the GA with the true values:

Run 1 Results
    Parameter    True    GA
    Best RMS     0.00    0.42
    x0           0.35    0.14
    x1           0.78    0.90
    x2           4.07    2.59
    x3           1.45    1.63
The striking differences illustrate the danger of using GAs or any other global optimization method for problems such as this. The GA has produced a set of parameters that provide a good fit to the data set used in the evolution process, but that cause errors in prediction whose magnitude will grow as the compounds look less and less like the test set. For instance, for a compound that has v = 50 and nHB = 5, the true value of log Ki is -24.58, whereas the GA-derived model yields -20.89. The GA has prematurely converged to a local minimum that is different from the true solution. Within the simple GA, there are several ways to improve the performance, measured by the time needed to reach the global minimum. One way that will probably not help here is increasing the number of generations, at least not without altering something else. Increasing the mutation rate also typically does not help too much, except that it slows down convergence. Increasing the crossover rate can increase diversity and hence the chance of finding the global minimum. In the next run, we set the mutation rate to 0.05 and the crossover rate to 0.8 and keep all the other parameters as before.

10-Compound QSAR Run 2 Parameters
    Population size          100
    Number of generations    200
    Mutation rate            0.05
    Crossover rate           0.8
    Convergence threshold    0.8
    Selection method         roulette wheel
    Bits per parameter       6
    Populations              1
The best RMS achieved for this run is 0.31, somewhat lower than for run 1, and the degree of convergence is much lower. In fact, no bits were converged at the end of 200 generations. Each of the four parameters is closer to its true value, but has not reached it.

Run 2 Results
    Parameter    True    GA
    Best RMS     0.00    0.31
    x0           0.35    0.59
    x1           0.78    0.69
    x2           4.07    4.57
    x3           1.45    1.39
The next option to try, given that run 2 had not converged, is simply to let the algorithm run longer. This did not produce a lower RMS value even out to 400 generations. Another way to search more broadly is to start several independent GA runs beginning from different initial populations. This is done in run 3, in which 10 independent runs were performed. Each of these used a larger population of 200.
10-Compound QSAR Run 3 Parameters
    Population size          200
    Number of generations    200
    Mutation rate            0.05
    Crossover rate           0.8
    Convergence threshold    0.8
    Selection method         roulette wheel
    Bits per parameter       6
    Populations              10
One of the populations found the true solution in generation 33. By the end of the run, 6 of the 10 populations found RMS values lower than the single population of run 2, but only one ever found the true solution.

Run 3 Results
    Parameter  True     0     1     2     3     4     5     6     7     8     9
    Best RMS   0.00   0.27  0.12  0.11  0.37  0.60  0.14  0.36  0.30  0.27  0.00
    x0         0.35   0.23  0.44  0.29  0.26  0.59  0.35  0.62  0.14  0.20  0.35
    x1         0.78   0.84  0.75  0.81  0.81  0.69  0.78  0.69  0.93  0.87  0.78
    x2         4.07   3.17  4.66  3.71  3.08  4.48  4.03  4.57  2.90  3.67  4.07
    x3         1.45   1.57  1.39  1.50  1.57  1.41  1.46  1.41  1.63  1.49  1.45
The separate runs have found a number of different solutions, although not necessarily local minima. For instance, population 5 has the correct values for the first two parameters and has values that differ from the correct values for the other two parameters by a single bit flip. A pair of correlated mutations would take this solution to the true minimum. In situations such as this, it is often useful to take the best solutions produced by the GA and apply a local minimization method to drive them to the bottom of the nearest local minimum. Population 2 is probably also in the basin of attraction of the true solution and would drop in with the aid of a local minimizer. The point needs to be emphasized that the GA is typically very bad at refining solutions in local regions. It does not use information about local gradients in the way a Newton or conjugate gradient method would. The strength of the GA is its ability to efficiently perform a coarse search of a large-dimensional parameter space to find interesting potential solutions.
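The polishing step suggested here is easy to bolt on; a hedged sketch using SciPy's general-purpose minimizer (the `rms_error` fitness function is hypothetical, standing in for the QSAR objective):

from scipy.optimize import minimize

def polish(candidates, rms_error):
    # Drive each GA solution to the bottom of its local basin,
    # then return the best (fitness, parameters) pair found.
    refined = []
    for x0 in candidates:
        result = minimize(rms_error, x0, method="Nelder-Mead")
        refined.append((result.fun, result.x))
    return min(refined, key=lambda pair: pair[0])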
Analysis of the Simple Genetic Algorithm

At this stage, we have presented what the SGA is and have given an example of the type of problem one might solve with it, but the reason it works is probably not obvious. The formal analysis of the GA is based on the idea of a schema, which is a generalized substring of the chromosome, as discussed below. Schemata (the plural of schema) form building blocks from which the chromosome, and hence the solution to the problem, can be constructed, and these building blocks are what the GA really processes. To see what is meant by a building block, consider the most trivial example of a function that we wish to minimize:
$$f(x_1, x_2, x_3, x_4) = x_1 + x_2 + x_3 + x_4$$

Let us choose to have each of the parameters span the range 0 ≤ xi ≤ 7. If we use three bits per word, the chromosome would look like
010 111 000 101

This uses simple binary encoding, that is, 000 = 0, 001 = 1, 010 = 2, etc. Clearly this function can be minimized one parameter at a time, but it is also true that it can be minimized one bit at a time. The global minimum occurs when each xi is equal to 0, and therefore when each individual bit is equal to 0. Any bit that changes from a 1 to a 0 decreases the function value. Now imagine a run where we had found the following pair of individuals in the population:

000 000 000 001
111 111 111 110
The first string is very good, but the second is very bad. However, if they mate and cross over, with the crossover point between the next-to-last and last bits, one of the children will be all 1s and the other all 0s, the latter being the optimal solution. The last 0 bit in individual 2 is a building block that individual 1 needs. Next examine two other individuals:

000 000 000 111
111 111 111 000

If these cross over between bits 9 and 10, they will again form the all-1s and all-0s strings. Now the important building block consists of the last three bits. The same argument can be carried forward for longer strings and longer building blocks. Notice that it is more efficient to move from the second pair to the minimum than from the first pair, because for a single crossover, a larger change has been made in the function value. A problem with the GA occurs if the initial population is so unevenly distributed that in some positions no individual has a 0. No amount of crossing over would ever produce the minimum.
Although mutation acts to solve this problem, it works only at a bit-by-bit level. An optimization run for this problem will go much faster if the initial population already contains some copies of 000 in each of the four positions. A few crossovers can then put the function at the minimum. Goldberg et al. have identified six aspects11 of the GA that are useful in defining a fitness function and analyzing the subsequent performance:

1. Know what the GA processes: building blocks.
2. Ensure that there is an adequate initial supply of building blocks.
3. Ensure the growth of necessary building blocks.
4. Ensure the mixing of necessary building blocks.
5. Solve problems that are building-block tractable or recode them so they are.
6. Decide well among competing building blocks.
The first three points have already been touched on. Efficient crossover will ensure that point 4 is satisfied. Point 5 is easy to state, but sometimes difficult to implement. It addresses the fact that the GA will work optimally on almost-separable problems, that is, those whose solution can be built up block by block. As the degree of nonlinear coupling between the parameters increases, the problem will become increasingly difficult for the GA to solve. It also becomes increasingly difficult for all other methods. The sixth point, deciding well among building blocks, goes to the heart of the workings of the GA. The SGA shuffles building blocks through the action of the crossover operator, so the decisions about choosing building blocks are out of the user's hands. Goldberg et al.11 developed the messy GA (see below) partly as an attempt to do a better job of making these decisions.
The Schema Theorem

The previous discussion of building blocks can be generalized to the idea of schemata, which form the basis of Holland's Schema Theorem.6 A schema is a template containing the characters 0, 1, and *, the last being the "don't care" or "either" character. An example of a schema is the string 10*1. Two actual strings will match this schema, namely 1001 and 1011. A chromosome can be decomposed into a large number of overlapping schemata, as in the following example:

                                   δ(H)  o(H)
    Chromosome  0011011011001010
    Schema 1    **1*0*1*********     4     3
    Schema 2    *********100****     2     3
    Schema 3    0011011011******     9    10
    Schema 4    0011011011**10**    13    12
The top line gives the original chromosome, and the following lines show a variety of schemata found in the chromosome. Two important ways to categorize schemata are by their length: our first two examples are relatively short, whereas the second two are relatively long. One measure of the size of a schema H is its defining length, which is the number of positions between the outermost fixed (1 or 0) positions. Another measure is the schema order, which is the number of fixed positions. The standard notations for the defining length and order of schema H are δ(H) and o(H), respectively. The schema theorem provides a very powerful statement about the behavior of schemata in a chromosome. Mathematically, it states
$$m(H, t+1) \ge m(H, t)\,\frac{f(H)}{\bar{f}}\left[1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m\right] \qquad [7]$$
where m(H,t) is the number of examples of schema H in the population at time t; f(H) is the average fitness of strings (binary chromosomes) containing H; f̄ is the average fitness of all strings currently in the population; pc is the probability that crossover will occur at a particular mating; pm is the probability that a particular bit will be mutated; and l is the length of the string. The factors outside the brackets indicate that a particular schema will increase its representation in the population at a rate proportional to its fitness relative to the average fitness. Good schemata will increase their representation exponentially, and bad schemata will lose representation exponentially. The rate of increase (or decrease) is tempered by the factor in brackets. Note that for a situation where selection operates alone, in the absence of crossover and mutation, the pure exponential behavior holds. As crossover is added, schemata with large defining lengths δ(H), relative to the length of the string, grow (or decrease) in probability less strongly than before. This arises because crossover disrupts schemata with a probability proportional to their length. Mutation is important only for fixed bits, so the decrease in the growth or decay factor is proportional to the order o(H). An important consequence of this behavior is that problems that are optimal for the GA are those whose solution can be incrementally built up from short schemata with relatively few defined positions.12 Here optimal implies that dominance of the population by highly fit schemata is good.
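The two schema measures are simple to compute; a small illustrative sketch (our own helper names):

def order(schema):
    # o(H): the number of fixed (non-*) positions
    return sum(ch != '*' for ch in schema)

def defining_length(schema):
    # delta(H): the distance between the outermost fixed positions
    fixed = [i for i, ch in enumerate(schema) if ch != '*']
    return fixed[-1] - fixed[0] if fixed else 0

def matches(schema, chromosome):
    # A chromosome contains H if every fixed position agrees
    return all(s in ('*', c) for s, c in zip(schema, chromosome))

# order('10*1') == 3; defining_length('10*1') == 3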
Convergence

The schema theorem provides a law governing the evolution of probabilities of finding different schemata in a string. As more highly fit schemata are created during crossover and mutation, they will crowd out ones that are less fit. However, at some point no new schemata more fit than those already present will be found, and the current best will completely take over the population, except for drift due to the disruption caused by crossover and mutation. Even though a completely fixed state will not have been reached, we still speak of convergence. A typical measure of convergence is the number of bits converged at a set threshold. Bit i is converged to the value 1 (or 0) if 90% (say) of the strings have the value 1 (or 0) in position i. A bit is said to be lost if 100% of the strings in the population have the same value (1 or 0) at that bit. Quite a bit of work has been done to develop proofs about the convergence of the GA, in analogy with the proof that, under certain circumstances, simulated annealing is guaranteed to converge to the global minimum of a function.13,14
Known Problems

GAs suffer from a series of well-known pathologies, most of which are touched on in this section. The potential user of the method should be aware of these, but should not be deterred by their existence.
Premature Convergence

Premature convergence is a straightforward consequence of the schema theorem. Once a few highly fit schemata enter the population, they can quickly dominate and drive the entire population toward convergence, before even better schemata can be found and exploited. A variety of methods have been developed to cope with this problem by enforcing a minimum level of diversity in the population. These are discussed below in the section titled Variations on the Simple Genetic Algorithm.

Sampling of Nonsense Parts of Parameter Space

Although the GA gains some information about the shape of the function being optimized, all it really knows about is the abstract binary space in which its bit manipulations are carried out. Because of this, it will search through all possible regions of parameter space, even though some areas may be physically unreasonable. An example of this is provided by certain spline-fit potential energy functions. Instead of properly going to positive infinity at zero separation of all atoms, they go to negative infinity, as shown in Figure 6. Running molecular dynamics on such surfaces is safe because there is never enough energy to climb over a potential energy wall blocking access to a minimum. The GA, on the other hand, can simply step right over and quickly converge to a nonphysical solution.
Dealing with Constraints

A standard way to avoid the nonphysical regions of space is to apply constraints or penalty functions. Adding constraints to general optimization methods is an active area of research, but today there are no ideal ways to apply them to GAs. The standard trick is to add a penalty term to the fitness function that acts whenever the constraint is not satisfied. For the spline fit potential just discussed, we could add the penalty function shown in Figure 6. Its value is zero for distances greater than the penalty function cutoff, but quickly climbs to positive infinity for shorter distances.
Figure 6 An illustration of the problems that can arise when parts of parameter space are nonphysical. Assume that one is interested in sampling the potential for energies less than the maximum of the spline fit function. Classical trajectories coming from the right cannot surmount the barrier and will behave almost the same on the spline fit function as on the true potential. However, the GA can sample anywhere and has the possibility of accessing the nonphysical region to the left of the maximum in the spline function. One solution is to add a penalty function whose value is zero except for regions of small distance. The sum of the spline fit function and the penalty function will be at least as large as the true potential in this classically forbidden region and will thereby push solutions to larger distances.
For distances less than the point where the spline fit potential turns over, the sum of the spline fit and the penalty should be at least as great as the true potential.
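A sketch of the penalty idea in Python (entirely illustrative: the functional form, cutoff, and strength are our assumptions, chosen only so the penalty vanishes beyond the cutoff and diverges at zero separation):

def penalized_potential(spline_fit, r, r_cut, strength=1000.0):
    # Zero beyond the cutoff; climbs steeply to +infinity as r -> 0,
    # walling off the nonphysical region from the GA.
    if 0.0 < r < r_cut:
        penalty = strength * (r_cut - r) ** 2 / r
    else:
        penalty = 0.0
    return spline_fit(r) + penalty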
Greedy Use of Function Evaluations
Typical GA applications can require many function evaluations to find a good answer. This is true of all global optimization methods, not just the GA. A typical run for a problem with n variables will use a population of 10n, run for 100 generations, and be repeated 10 times, for a total of 10,000n function evaluations. If your function is expensive to evaluate, it may be best to look for some heuristic that allows you to locate the minima of interest more quickly than the GA will.
Crossover Problems

As mentioned previously, crossover preferentially disrupts long schemata.
As a consequence, variables whose values need to be correlated in an optimal solution to a problem should ideally be located nearby on the chromosome.
Unfortunately, one often runs a GA search precisely to find these correlations. Again, the only useful advice is to know as much as possible about the problem before starting and to order the variables on the chromosome so that suspected correlations are not disrupted.
Poor Performance in Local Search

One optimization problem for which GAs are almost never the right tool is finding the bottom of a particular local well. To solve problems in which gradients can be calculated, even when they are expensive, it is usually better to turn to a gradient optimization method.
Estimating Parameter Values

Once a sample problem has been coded up for use in a GA, the user must still choose a small handful of GA-specific parameters, and this section discusses some rules of thumb for doing so. However, the parameters are only estimates based on formal analysis and the examination of some representative test problems. For instance, the population sizing arguments given below require knowledge of the distribution of fitness values for each of the schemata, which are unknown prior to a run. Suggestions have been made to gather this information during the run and then dynamically size the population, but this is not commonly implemented. DeJong10 presented a very complete analysis of parameter ranges vs. performance on a test suite of problems.
Population Sizes vs. Number of Generations

The total number of function evaluations in a run will be n × g, where n is the population size and g is the number of generations. In most real applications, as opposed to the aforementioned test problems used to understand the workings of the GA, the fitness function will be expensive. The number of times one can afford to calculate the fitness function is less (sometimes much less) than the optimal number recommended by GA theory. A typical strategy is to perform a small number of test runs with a fixed number of function evaluations, with increasing n. A good value of n is found when the number of function evaluations is exhausted not too many generations after the improvement in the GA performance has stalled out, as illustrated in Figure 5. Most authors report similar heuristics for sizing the population. There is typically a size above which no improvement in performance is seen, no matter how many generations are run. Goldberg et al.11 give an analysis to estimate population sizes, based on assumptions about the distribution of fitness values for schemata in the population. The goal is to ensure that the GA carries out enough evaluations to distinguish good from bad schemata. First, concentrate on a single pair of schemata with average fitnesses f1 and f2, and fitness distributions σ1 and σ2. Further assume that there are a total of K schemata competing for evaluations with these two. Then the most general result for n, the optimal population size, is

$$n = \frac{2\,c\,K\,\sigma_M^2}{d^2}$$
where d = f1 − f2, σM² = (σ1² + σ2²)/2, and c is a parameter that measures how sure the user wants to be that the schemata are distinguished. More function evaluations are needed if (1) the schema fitness averages are closer; (2) the average breadth of the schemata fitness distributions is larger; or (3) there are more competing schemata to cloud the picture. The optimal value of n will be chosen so that the population can distinguish the least distinguishable pair of schemata in the chromosome. An argument is made that n ultimately scales as O(l), where l is the number of bits in the chromosome.11 This analysis can be used as a guide for sizing populations for new problems in the same class as ones that are already known to work. For instance, in the conformational searching problem, a good rule of thumb was to choose a population size proportional to the number of dihedrals being rotated. In this case, the chromosome coded for dihedral angles, whose contributions to the energy are decoupled to first order.15
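Under the reading of the sizing formula given above (an assumption on our part, following the usual Goldberg-style analysis), the estimate is one line of code:

def population_size_estimate(d, sigma1, sigma2, K, c):
    # n = 2 c K sigma_M^2 / d^2 for the hardest-to-distinguish schema pair
    sigma_M2 = (sigma1**2 + sigma2**2) / 2.0
    return 2.0 * c * K * sigma_M2 / d**2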
Mutation Rate

Mutation rates, defined here as the probability that a given bit will be mutated, are typically set in the range 0.001 to 0.1. A value of 0.5 produces a random walk. If the rate is set too low, premature convergence is more likely. In addition, full exploration of the space can be limited by an initial population that poorly covers parameter space. It is important to avoid setting the rate too high because this will swamp out the directed nature of the GA search that is provided by selection and crossover. It would be useful to have the GA detect when the population is starting to converge and increase the mutation rate accordingly. This is sometimes done indirectly using niching or sharing operators, which are discussed below.

Crossover Rate

The crossover rate gives the fraction of members in the population that were created by crossover, as opposed to simply being copied (with mutation) from the previous population. Typical values range from 0.5 to 1.0. DeJong10 recommended a value of 0.6, although 0.8 is commonly used. The concepts of exploration and exploitation are sometimes used when discussing mutation and crossover. Crossover is exploratory, because it often produces individuals in parts of parameter space not previously explored. Mutation, on the other hand, moves locally and exploits local clues given by the fitness function. Pushing the crossover rate to 1.0 essentially prevents this local exploitation
from occurring, at least until the population is so homogeneous that crossover produces local moves only.

Granularity of Real Number Representation

Many of the applications that will be discussed use the binary GA to optimize real-valued functions. This entails laying a grid over the real-valued space and allowing only discrete parameter values, that is, those on the grid, to be sampled. There is little direct cost associated with using a large number of bits to represent the real-valued variables. A small amount of extra storage and a little bit of extra overhead in decoding the binary strings are required. Both of these factors are typically trivial relative to the storage and central processing unit (CPU) time associated with the fitness function. The real cost is that the GA has a harder time distinguishing the many schemata with similar average fitnesses that now exist in the population. The result is that convergence rates are slower. The effects of mutations are also decreased, because a bit flip represents a smaller move in parameter space, at least if Gray coding is being used. Recall also that the GA is not very good at local search, so the binary representation only needs to be fine enough to distinguish the relative fitness of local minima.
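Gray coding itself is a pair of one-line bit manipulations; a standard sketch on integers:

def binary_to_gray(b):
    # Adjacent integers map to codewords that differ in exactly one bit
    return b ^ (b >> 1)

def gray_to_binary(g):
    # Undo the transform by folding the bits back down
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# binary_to_gray(3) == 0b010 and binary_to_gray(4) == 0b110: one bit apart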
Variations on the Simple Genetic Algorithm

Since the introduction of the GA, many variations have been invoked, typically to solve some of the drawbacks mentioned in the previous section. Those we will discuss in this section are representative of the variants that have been used, but do not exhaust the possibilities. One of the key issues one should consider before using these variants is whether one aims to have a robust "black box" GA or one tailored to a specific application. Some of these extensions will help any GA, whereas others are useful only for a subset of applications.

Alternate Selection Strategies

The roulette wheel selection strategy, based on fitness, has been formally shown to be near optimal for searching parameter space. However, it suffers several problems in practice that can be solved by using alternative selection methods. The first of these is designated rank-based selection.16 This method still uses a roulette wheel, but bases weighting on rank in the population instead of fitness. The best individual in a population is given a "rank fitness" of 1, the next best a "rank fitness" of 2, and so on. As described later in the section on fitness scaling, fitness functions having a wide dynamic range tend to cause the GA to converge prematurely, because the first individual that scores 100 or 1000 times better than any other will immediately capture the entire population. For such cases, the rank is used instead of the true fitness.
A second scheme is step function selection. Here the top p% of the population is put into the breeding pool with equal weight, and the lower (100 − p)% is discarded. Mühlenbein advocates this strategy in his breeding GA17,18 based on the fact that there is little information on where the GA is going, so that heavily biasing its evolution toward the current best individuals may keep the GA from finding what is ultimately the best solution. A third scheme is called tournament selection. Here, pairs of individuals are selected from the population, and the better of the two enters the breeding pool, whereas the poorer does not. For the SGA, this offers little advantage over roulette wheel selection, but it does allow the population to be replaced incrementally in a clean manner, as discussed next.
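A sketch of tournament selection (illustrative; the comparison assumes we are minimizing an error, as in the QSAR example):

def tournament_select(fitness, rng):
    # Draw two distinct individuals; the better (lower error) one breeds.
    i, j = rng.choice(len(fitness), size=2, replace=False)
    return int(i) if fitness[i] < fitness[j] else int(j)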
Steady-State vs. Generational Replacement

In the SGA, we use what is termed generational replacement, meaning that the entire population is regenerated synchronously once a generation. A closer analogy with biological evolution is steady-state replacement, in which individuals are constantly entering and leaving the population in an asynchronous manner.19,20 At each call to the replacement operator, the number of individuals to replace is given. This number can go from 1 or 2 (typical steady state) to the size of the population (typical generational). An extension of this basic idea that is often used is steady state without duplicates. As pairs of children are created from pairs of parents, they are checked against every other individual in the current population. If a child duplicates a current member, it is discarded and the parent is put back into the breeding pool. In this way the population never has any two identical individuals and hence maintains a greater diversity than it would otherwise. This may lead to more work in the GA machinery because of constant checks against the population, but in practice most of the CPU time in a GA run is spent in the function evaluation routine, and extra overhead in the GA is negligible.

Alternative Crossover Schemes

Single-point crossover as used by the SGA has the problem that it is often difficult to quickly recombine some good building blocks into an optimal individual. Returning to the four-word problem from above, imagine the following pair of individuals:

111 000 000 111
000 111 111 000
There is no single point at which these two chromosomes can be cut to cross over and form the all-zero string. However, a two-point crossover scheme would work. Imagine making cuts between bit pairs [3,4] and [9,10]. The first child gets the first and fourth segments of parent 1, and the second and third of
parent 2, whereas the second child gets the opposite (and optimal) set. Two-point crossover is often used where the two points are chosen at random. This is really only a halfway solution. Syswerda20 introduced a generalization of this which he termed uniform crossover. Here a random template of 1s and 0s is generated for each crossover operation. For each 1 bit in the template, child 1 gets the corresponding bit from parent 1 and child 2 gets the corresponding bit from parent 2. For each 0 bit in the template, the opposite happens. The following illustration should make this clear:

    Template    000 000 111 111
    Parent 1    111 000 000 111
    Parent 2    000 111 111 000
    Child 1     000 111 000 111
    Child 2     111 000 111 000
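A sketch of uniform crossover in Python (illustrative; the parents are 0/1 NumPy arrays):

import numpy as np

def uniform_crossover(parent1, parent2, rng):
    # A fresh random template decides, bit by bit,
    # which parent donates to which child.
    template = rng.integers(0, 2, size=len(parent1))
    child1 = np.where(template == 1, parent1, parent2)
    child2 = np.where(template == 1, parent2, parent1)
    return child1, child2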
The uniform crossover method will allow (in principle) any pair of schemata to be combined. In one-point crossover, schemata nearby on the chromosome tend to stay together, so that if two parameters tend to behave in a correlated fashion, crossover is most helpful (or least harmful) if those parameters are coded within nearby regions of the chromosome. However, one typically does not know about these correlations beforehand, and in fact these correlations may be precisely what one wishes to discover. Uniform crossover minimizes this need to carefully order the parameters.

A number of optimization problems can be expressed in terms of ordered sets of numbers. The classic example is the traveling salesman problem (TSP). In the TSP, a salesman needs to visit a series of cities, with the constraint that each city must be visited once and only once. The object is to minimize the total distance traveled. An (integer) chromosome describing a trial route among eight cities could be

7 2 4 5 8 1 3 6
meaning that the salesman starts at city 7, proceeds to city 2, then to city 4, and so on. If a standard GA were used for this problem, both the crossover and mutation operators could create invalid strings, which do not have the property that each value shows up exactly once. A new mutation operator is easy to design. Typically one simply chooses a pair of entries and swaps them. Crossover is more complicated. Goldberg9 describes one solution called partially matched crossover (PMX). First, two crossover points are defined in a pair of parents:

    Parent 1    7 2 4 | 5 8 1 | 3 6
    Parent 2    6 3 2 | 4 7 1 | 8 5
Next, the values in the middle section are swapped between the two parents:

    Parent 1    7 2 4 | 4 7 1 | 3 6
    Parent 2    6 3 2 | 5 8 1 | 8 5
Finally, the other pairs that match those just switched are also switched. For instance, the value 5 from parent 1 swapped with the value 4 from parent 2. To maintain a single copy of each of the values, the other value 4 from parent 1 must swap with the other value 5 from parent 2. Likewise, the alternate 7 and 8 must swap. The final child chromosomes are:

    Child 1     8 2 5 | 4 7 1 | 3 6
    Child 2     6 3 2 | 5 8 1 | 7 4
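A compact sketch of PMX for permutation chromosomes (our own implementation of the procedure just described, using 0-based cut indices):

def pmx(parent1, parent2, cut1, cut2):
    # Swap the middle sections, then repair duplicates outside them
    # by following the value-to-value mapping the swap created.
    def make_child(receiver, donor):
        child = list(receiver)
        child[cut1:cut2] = donor[cut1:cut2]
        mapping = {d: r for d, r in
                   zip(donor[cut1:cut2], receiver[cut1:cut2]) if d != r}
        for i in list(range(cut1)) + list(range(cut2, len(child))):
            while child[i] in mapping:
                child[i] = mapping[child[i]]
        return child
    return make_child(parent1, parent2), make_child(parent2, parent1)

# pmx([7,2,4,5,8,1,3,6], [6,3,2,4,7,1,8,5], 3, 6)
# -> ([8,2,5,4,7,1,3,6], [6,3,2,5,8,1,7,4]), matching the example above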
Inversion and Reordering

Inversion and reordering are two additional mutational operators whose goal is to increase diversity in the population. Inversion simply takes the chromosome, or a segment of the chromosome, and flips it end-for-end, whereas reordering takes a single chromosome, chops two sections out, and replaces one with the other.

    010 001 → 100 010    Full inversion
    001 010 → 010 001    Reordering by three-bit substrings
Greedy Oversampling

Koza12 describes a method of greedy oversampling that actually undoes some of the diversification of the population achieved by methods we have already discussed. Koza's method typically works on problems that do not have gradients and are not even locally smooth, but where one still wishes to do local sampling in the vicinity of the current best individuals. To achieve this, Koza preferentially selects highly fit individuals for breeding. However, this is done only where large populations (1000 or more) are being used, because otherwise it tends to cause premature convergence.

Niching

Another concept borrowed from Darwinian evolution is niching, which aims to keep populations diverse and hence to slow down convergence. In a natural environment, species tend to move into niches that are unoccupied and hence where resources may be easier to come by than in more crowded regions. In the same way, a niching strategy attempts to force individuals to migrate away from crowded regions or niches in parameter space. A common strategy is to calculate the distance between members in the population and bias the selection so as to reduce the likelihood of two nearby individuals being selected, even if they both have high fitness relative to the population as a whole. One typically uses the Hamming distance to measure the closeness of two individuals. The Hamming distance dH is simply the number of bits that differ between two chromosomes ci and cj, divided by the total number of bits. It will therefore range in value from 0 to 1. Goldberg and Richardson21 give a sharing scheme that weights the fitness function based on the amount of crowding around an individual. The weighted fitness is given by
$$f_i' = \frac{f_i}{\sum_{j=1}^{n} sh\big(d_H(c_i, c_j)\big)} \qquad [9]$$

where ci is the chromosome of individual i, the sharing function is

$$sh(d) = \begin{cases} 1 - d/\sigma, & d < \sigma \\ 0, & d \ge \sigma \end{cases}$$

and σ is a cutoff value. Goldberg discusses a variety of other niching strategies in his book.9 Another approach to niche formation is embodied in the parallel breeding GA of Mühlenbein.17,18 Here one runs several simultaneous but independent GA simulations starting from different initial populations. Periodically, for instance every 20 generations, the best individual in each population is given to all of the other populations. Each population will tend to converge to a single region, which will be different for different populations. The individuals passed into a given population will act as new, relatively fit seeds for further exploration by the GA. However, if sharing occurs too often, all of the populations will tend to converge on the identical set of parameters.
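A sketch of the sharing computation (illustrative; it assumes a fitness to be maximized, the triangular sharing function written above, and a population stored as 0/1 NumPy arrays):

import numpy as np

def hamming(ci, cj):
    # Fraction of differing bits, ranging from 0 to 1
    return float(np.mean(ci != cj))

def shared_fitness(fitness, population, sigma):
    # Divide each raw fitness by a crowding count; individuals in
    # densely populated regions of bit space are penalized.
    n = len(population)
    scaled = np.empty(n)
    for i in range(n):
        crowding = sum(
            max(0.0, 1.0 - hamming(population[i], population[j]) / sigma)
            for j in range(n))
        scaled[i] = fitness[i] / crowding
    return scaled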
Haploid vs. Diploid Chromosomes

So far, we have left out an important feature of the chromosomes of most higher animals, namely diploidy. The following discussion follows the treatment of Chapter 5 of Goldberg's book.9 A diploid species has two genes at each locus, only one of which is expressed in a given instance. In contrast, the SGA models a haploid organism, which has only a single gene at each locus that is always expressed. The chromosome of a diploid organism can be represented by a paired series of alleles, for instance, AbCdE and abcDE, where uppercase letters represent the dominant allele and lowercase the recessive allele. The dominant allele will be preferentially expressed, so that the haploid version of this chromosome is AbCDE. The biological advantage of diploidy is that alleles beneficial in one environment but gravely disadvantageous in another can be saved, hidden away but ready to be used when the need arises.
Suppose a particular allele a is advantageous in environment Ea, whereas the alternate allele A is advantageous in environment EA. As the environment alternates between Ea and EA, offspring with the appropriate allele can appear, expressed starting from the first generation. This is because some individuals have likely kept it hidden away unexpressed. A haploid organism, on the other hand, must evolve from one allele to the other through a process that may take many generations. In the context of the GA, diploidy offers another method to maintain genetic diversity. The crossover operator is replaced by the fertilization operator, which takes two diploid chromosomes and trades gametes, or half chromosomes (see Figure 7). A simple binary representation of the diploid chromosome uses a three-letter alphabet where a −1 implies a dominant 1, a 1 implies a recessive 1, and a 0 implies a 0. The dominance matrix is

                  Allele 2
                  1    0   −1
    Allele 1   1  1    0    1
               0  0    0    1
              −1  1    1    1

whose use is illustrated in the example:

    Allele 1     1  1  1  0  0  0 −1 −1 −1
    Allele 2     1  0 −1  1  0 −1  1  0 −1
    Expressed    1  0  1  0  0  1  1  1  1
In practice, the use of diploidy is not very useful for stationary fitness functions, that is, those that do not change in time. Its real value comes into play when the fitness function is nonstationary. A practical example of this would be a GA
being used to control an experimental apparatus whose operating parameters drift from day to day.

Figure 7 An illustration of the diploid-style crossover, or fertilization, operator.

Regulator Genes

A concept related to diploidy is that of a regulator gene, which is a gene that turns other genes off or on in a haploid chromosome. The reason for including this feature in the GA is to allow certain, perhaps deleterious, genes to remain in the chromosome but stay repressed until the rest of the chromosome has changed sufficiently that the repressed gene becomes beneficial. An application of this is in a problem in which the chromosome specifies pairs of atoms that will be brought together in sequence to fold a linear chain polymer.22 The regulator gene specifies how many of the individual folding steps will actually be carried out during a particular fold program.

Messy Genetic Algorithms

Messy GAs (MGAs)23 provide another solution to the problem of building blocks that get separated by crossover. Other solutions already mentioned are multiple-point and uniform crossover operators. Holland6 early on recognized that certain genes, or substrings in the GA, represent variables whose effects on the functions being optimized are correlated. The GA should attempt to enforce linkage between these genes, meaning that once they come together, they should survive further processing as a unit. For instance, the function value may be low only when gene 1 takes on value x1* and gene 20 takes on value x20*. Clearly, once this combination is found in the chromosome, it should be kept, but these two genes, being far from one another on the chromosome, are likely to be separated often by crossover.23 Goldberg and co-workers developed an MGA23 that has variable ordering and length of the chromosome, in an attempt to effect tight linkage. A messy chromosome is a series of genes identified by both their identity and value, but not necessarily placed in order. For instance, the five-gene chromosome

(5, x5), (2, x2), (1, x1), (3, x3), (5, x5′)

assigns value x5 to gene 5, value x2 to gene 2, etc. Notice two oddities here. First, there are two copies of gene 5, and there is no copy of gene 4. Before we explain those, consider the standard GA chromosome in this notation:
(1, x1), (2, x2), (3, x3), (4, x4), (5, x5)

There is one and only one copy of each gene, and the genes are always ordered, so that the gene number is redundant. The reasoning behind the MGA representation is that one would want multiple copies of a particular gene if that gene
strongly affects the function value. On the other hand, genes that do not greatly affect the function value need to have little processing performed on them. Therefore they can be left out of the explicit representation. These duplicated and missing genes still have to be resolved, however, because the fitness function needs a single unique value for each of its parameters to be evaluated. Goldberg's solution to this requirement is for the fitness function to process from left to right, taking the first instance of each gene that occurs. Missing genes are given a value taken from a template. The template values are either chosen randomly or are assigned the value of that gene found in the best individual from the previous generation. The evolution of an MGA population proceeds in three stages: initialization phase, primordial phase, and juxtapositional phase. In the initialization phase, a limited amount of uniform search is performed to find good building blocks (of one or more genes). During the primordial phase, evolution proceeds via tournament selection alone. In the final juxtapositional phase, the cut-and-splice operator, a variant on crossover, is used together with tournament selection. No mutation is used. The cut-and-splice operator can have a variety of outcomes. These include producing a single offspring chromosome by simply splicing two parent chromosomes together (pure splice); making two child chromosomes from a single parent (pure cut); and making the equivalent of standard single-point crossover. Some impressive claims have been made for the potential of MGAs to solve large optimization problems quickly and efficiently. So far, however, the MGA has not been widely used, so it is unclear what advantage it will offer on a range of real problems.
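A sketch of the left-to-right resolution rule for messy chromosomes (illustrative, with 0-based gene indices):

def express_messy(genes, template):
    # First (leftmost) occurrence of each gene wins; genes missing
    # from the chromosome take their values from the template.
    values = list(template)
    seen = set()
    for gene, value in genes:
        if gene not in seen:
            values[gene] = value
            seen.add(gene)
    return values

# express_messy([(4, 'x5'), (1, 'x2'), (0, 'x1'), (2, 'x3'), (4, "x5'")],
#               ['t1', 't2', 't3', 't4', 't5'])
# -> ['x1', 'x2', 'x3', 't4', 'x5']   (the missing gene comes from the template)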
Real-Valued Chromosomes

Most of the formal analysis of the GA was based on the use of a discrete description of the chromosome. The binary alphabet (0 or 1) described so far is the one most commonly used, but other finite alphabets (e.g., the nucleic acid four-letter code, ACGT) have been analyzed and are known to exhibit the same behavior as the binary case. From a practical standpoint, it can make sense to dispense with the finite representation altogether and simply carry around real-valued strings as the chromosomes. Selection and crossover are identical, but mutation needs to be handled differently. Typically, the real-valued mutation operator selects a parameter to be mutated as before, but then adds a random increment to the value. The new value is drawn from a Gaussian distribution centered on the old value and having a user-specified width, thus adding one extra parameter to the method. Some early formal analyses24 suggested that the real-valued GA simply should not work, despite the fact that in practice it seems to work well. However, subsequent analysis demonstrated that all of the important properties of a finite-alphabet GA carry over to the continuous case.
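A sketch of the real-valued mutation operator (illustrative; `width` is the user-specified standard deviation just mentioned):

import numpy as np

def mutate_real(chromosome, rate, width, rng):
    # Each parameter mutates with probability `rate` by adding
    # a zero-mean Gaussian increment of standard deviation `width`.
    x = chromosome.copy()
    for i in range(len(x)):
        if rng.random() < rate:
            x[i] += rng.normal(0.0, width)
    return x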
Scaling of Fitness Functions

The dynamic range of the fitness function is an important consideration when using roulette wheel selection. For instance, if the fitness function contains a series of narrow valleys surrounded by walls that climb exponentially, the population will converge rapidly on the first individual that finds its way into any one of the wells. The simple solution to this problem is to scale the fitness function appropriately. A common approach is to use the log of the original fitness function after it has been shifted to ensure that all values are greater than zero. Various linear scaling methods have been proposed, again to counter the problem of roulette wheel selection over-selecting the best few individuals. These problems can also be solved using one of the alternate selection methods described above.

Lamarckian Genetic Algorithm

The Lamarckian GA25 is modeled on Lamarck's theory of inheritance of acquired characteristics, in much the same way that the basic GA is modeled on Darwinian evolution. The term usually means that the evaluation function takes the initial parameters decoded from the chromosome and carries out a local function minimization. The parameter values at the local minimum are recoded into the chromosome and passed back to the main GA routine along with the value of the fitness function at that point. The environment has caused a direct and (locally) beneficial change in the individual's genes that can be passed on to its offspring. The Lamarckian GA is useful when the fitness function has high, steep ridges surrounding deep, narrow valleys. This is often the situation in conformational searching or docking problems. A slight geometrical alteration in even the lowest energy conformation can send the energy up a steep van der Waals wall. Even a few steps of gradient minimization can put the conformation almost at the minimum. The tradeoff is that each local minimization can cost many function evaluations that might be more profitably used in further coarse exploration of the parameter space using the basic GA. The balance seems to favor the standard (non-Lamarckian) GA except when function evaluations are very cheap and the search space is large.
Noisy Functions

Most of the problems we tend to solve are time independent, in the sense that f(x) will give the same answer tomorrow as it does today, given the same value of x. Standard GAs take advantage of this by checking whether a given chromosome has been seen before and simply returning the previously calculated and stored function value, rather than performing a potentially expensive new function evaluation. In contrast, there are several classes of problems that are noisy, meaning that f(x) can be different each time the same value of x is given to it. The first is the case where the function value is provided by some experimental apparatus that is subject to real noise. An example of this is the
use of a GA to design laser pulses, which are then fed to a controller. The experimentally measured result of the pulse is then fed back to the GA.26 A second situation is when the GA is optimizing hypotheses concerning sets of experimental data in a database that is constantly growing. If the fitness function is the average error in predicted vs. measured activity, for instance, each new data point can shift the fitness value. Using a GA in these situations simply requires having the GA explicitly evaluate the fitness function each time. The advantage of the GA for noisy functions is that the population typically contains solutions clustered around the best found so far. As the "true" solution wanders owing to noise, the GA can quickly lock into the new "true" solution because it often already has a copy available.
Parallel Genetic Algorithm

Parallel computing environments are becoming increasingly common,27 and the GA offers one of the ideal applications on such architectures. Here a parallel computer can mean a network of workstations, a shared memory multiprocessor, a distributed memory computer, or an interconnected combination of these. The ease of parallelization relies on the fact that most of the CPU time used by the GA is spent in the fitness evaluation. Because this occurs in a simple loop over the individuals in a population, a good speedup can be achieved by simply parallelizing this loop. The following pseudocode shows a message passing implementation of the SGA. Message passing can be thought of as a "master-slave" arrangement in which a master sends requests for calculations to a series of slaves. A slave receives the request, carries out the calculation, sends back the answer, and waits to receive another request. Meanwhile the master receives the answer and proceeds. The master and slaves are typically implemented as different UNIX processes, each having a main(). A slave process, once initialized, goes into an infinite loop waiting for sends. This example assumes that the population size is greater than the number of processors.

/* Master Process */
Initialize population
Initialize nproc processors
for (ngen generations) {
    /* Send an initial individual to each processor */
    nsend = 0
    for (iproc = 1 to nproc) {
        send(nsend, individual[nsend]) to proc[iproc]
        nsend = nsend + 1
    }
    /* Do all of the rest of the sends and receives. As soon as a
       processor is finished, give it more work to balance the load */
    nreceive = 0
    while (nreceive < npop) {
        receive(i, fitness[i]) from some proc[iproc]
        nreceive = nreceive + 1
        if (nsend < npop) {
            send(nsend, individual[nsend]) to proc[iproc]
            nsend = nsend + 1
        }
    }
    /* The remaining code is identical to that shown above */
    Tabulate population statistics
    ...
}

/* Slave Process */
Initialize
for (;;) {
    receive(i, individual[i]) from master
    Evaluate Fitness of individual[i]
    send(i, fitness[i]) to master
}
Code also needs to be added to cleanly halt the slave processes once the calculation is finished. Several simple, public domain message passing libraries are available for parallelizing your own GA code. Some of the public domain GA codes listed in Appendix 2 are also parallelized.
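On a single multiprocessor machine, the same master-slave idea can be expressed with Python's standard multiprocessing module instead of explicit message passing (a sketch; `rms_error` again stands in for an expensive fitness function):

from multiprocessing import Pool

def rms_error(params):
    # Stand-in for an expensive fitness evaluation
    return sum(p * p for p in params)

if __name__ == "__main__":
    population = [[0.01 * i, -0.02 * i] for i in range(100)]
    with Pool(processes=4) as pool:
        # The pool workers play the role of the slaves;
        # load balancing across them is handled automatically.
        fitness = pool.map(rms_error, population)
    print(min(fitness))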
The {μ, λ} Evolutionary Strategy

An alternative approach to computational evolution, termed the Evolutionary Strategy (ES),28 was developed independently by Rechenberg and co-workers. The ES employs many of the same ideas as the GA, including mutation and selection. However, the ES always uses real-valued encoding. A particular evolution strategy is usually denoted as a {μ + λ}-ES. The parameter μ refers to the constant population size, whereas the parameter λ refers to the size of the pool out of which a new population is selected. Obviously, λ must be at least as large as μ. The basic algorithm operates as follows. We have a set of N parameters xit (where 1 ≤ i ≤ N), a fitness function f(xit), as well as a set of standard deviations σit. The superscript t refers to time, in generations. The simplest ES is the {1 + 1}-ES, where a single individual evolves as follows. The individual mutates according to

$$x_i^{t+1} = x_i^t + N_0(\sigma_i^t)$$

where N0 is a Gaussian random number of mean zero and standard deviation σit. If the function value decreases, then the new individual is selected; otherwise the old one is. In other words, both xit+1 and xit are placed in the selection pool, and the best one wins out. This implementation is a pure hill-climbing algorithm, with Gaussian weighted mutations. The sizes of the standard deviations are typically adjusted dynamically to yield an acceptance ratio of about 1/5. With larger populations, each individual has its own set of σit, which can change with time. In a {μ + λ}-ES, the μ individuals produce λ offspring, and the selection pool includes the μ current members of the population as well as the λ offspring. In a {μ, λ}-ES, λ offspring are produced, and only those are selected from. In either case, the selection operator chooses the μ best individuals to form the new population. Therefore the number of individuals whose fitness is evaluated remains constant. Current practice produces offspring through mutation and uniform crossover. The latter acts on both the parameters and the standard deviations.
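A sketch of the {1 + 1}-ES with the 1/5 acceptance-ratio adjustment (illustrative; the adaptation constants are conventional choices, not from the text):

import numpy as np

def one_plus_one_es(f, x, sigma, n_steps, rng):
    # Gaussian mutation; keep the better of parent and offspring,
    # and nudge sigma so that roughly 1/5 of mutations are accepted.
    fx, accepted = f(x), 0
    for t in range(1, n_steps + 1):
        y = x + rng.normal(0.0, sigma, size=x.shape)
        fy = f(y)
        if fy < fx:
            x, fx, accepted = y, fy, accepted + 1
        if t % 20 == 0:
            sigma *= 1.22 if accepted > 4 else 0.82
            accepted = 0
    return x, fx

# usage: one_plus_one_es(lambda v: np.sum(v**2), np.ones(4), 0.5, 2000,
#                        np.random.default_rng(0))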
Is It Real or Is It a Genetic Algorithm?

Before going on to look at a series of examples, it is at least intellectually interesting to see how closely the GA approximates the evolution of real organisms. First, selection in the real world is based on phenotype, just as in the GA, and acts much like the GA, at least in captive breeding situations. A breeder has some notion of what is good or bad in his stock and selects breeding individuals accordingly. The real differences appear when we look at the genotype level. Real chromosomes, of course, come in the four-letter alphabet ACGT. A feature that has not found a use in GAs is the accumulation of junk DNA occurring in eukaryotic chromosomes. About 95% of our DNA resides in these noncoding regions, called introns. Presumably the reason that introns have never disappeared is that there is little selection pressure to get rid of them. However, except for some controversial ideas that introns contain signals for transcription, no positive use for them is known. Mutation in real organisms is most often a substitution of one base for another, which can lead to no change (due to the redundancy of the mapping from DNA to amino acids), to a single amino acid substitution, or to a potentially large and fatal change such as would occur if the start codon for an important protein is altered. A significant difference between the GA and current living organisms is the rate of mutation. In species with efficient DNA repair machinery, the mutation rate is on the order of 10⁻¹⁰ mutations per site per generation for an individual,29 whereas in the GA, rates are typically 10⁻². However, it is assumed that mutation rates were much higher earlier in evolutionary history, before the advent of the DNA repair machinery, and that the rate of evolutionary exploration was much greater than it is today. This would at least partially account for the larger diversity of phyla seen early on than is currently observed.
Crossover in the GA differs in many details from biological reproduction.30 Diploid individuals go through the process of meiosis to produce gametes, which then recombine in sexual reproduction. Figure 8 is a drawing of the major steps in meiosis. Each parent starts with a pair of chromosomes that replicate and pair like with like. (During normal cell division, or mitosis, they swap partners so that each daughter cell gets one copy of each.) Then crossover occurs, within the parent, forming hybrid versions of one of each pair. The four chromosomes then split up, one to each of four gametes. The second parent likewise produces four gametes, yielding 16 possible unique children from this single gametogenesis event. The crossover point changes with each crossover and recombination event, further increasing the diversity of potential offspring. Finally, there are multiple chromosome pairs, instead of the single pair shown here, each undergoing replication and recombination. This potential for great diversity from each mating is balanced by the fact that there is a small amount of diversity among the chromosomes within a species, so that parents and
[Figure 8 panels: Replication → Crossover and Recombination → Four Gametes]
Figure 8 A simple representation of actual crossover during the formation of the gametes from a single parent prior to sexual reproduction. One gamete from each parent is then chosen to form the genome for a new diploid individual.
children tend to stay clustered tightly in gene space, much more so than occurs in GA simulations. For instance, the coding regions of chimpanzee and human chromosomes are approximately 95% alike. Depending on the species, crossover and recombination occur in a variety of ways and contexts, making real reproduction (fortunately) more complex and interesting than in the GA. Reproduction in the GA most resembles outcrossing, which is common in prokaryotes. In outcrossing, only nuclear DNA is exchanged. In sexual reproduction, mitochondrial DNA (which is carried by only one sex) is also exchanged.29
EXAMPLES OF CHEMICAL APPLICATIONS (WITH EMPHASIS ON THE GENETIC ALGORITHM METHOD)

About 100 articles have been published in the chemical literature that describe the use of some variant of the GA, most of which have appeared in the last few years. A GA has been used in just about every imaginable subfield of computational chemistry where optimization is called for, so a complete review of the use of GAs in the chemistry literature would entail something of a survey of all of computational chemistry. That is a bigger task than is undertaken here, so only some representative uses of the GA have been chosen, with an emphasis on how the GA is useful or on novel twists on the GA that are required in different contexts. For more information on the chemistry behind the applications, the original papers cited should provide good starting points. Most of the applications described in this chapter involve molecular modeling. Some other areas in which GAs have been used that bear mentioning are the generation of fuzzy logic controllers for chemical reactors31 and process design.32 Hibbert33 and Lucasius and Kateman34 provide earlier reviews of GAs in chemistry. Both articles contain useful tutorials and also discuss applications, primarily to chemometrics.
Conformational Searching: Molecular Clusters

Several authors have used GAs to find minimum energy conformations of clusters. Xiao and Williams35 developed the program GAME36 to energy-minimize clusters of small molecules such as benzene and naphthalene, as well as to dock proteins together. Their approach uses the SGA. The first molecule is fixed, and the chromosome then specifies the six position and orientation variables of each of the remaining molecules in the cluster. They report results for clusters as large as tetramers of benzene.
Hartke37 uses a GA to optimize the geometry of atomic clusters, again using the SGA. One variation that he reports as being important is the use of an appropriate internal coordinate representation. He correctly points out that for clusters larger than a dimer, a Cartesian representation, such as that used by Xiao and Williams, will cause unnecessary coupling between elements of the chromosome, making it difficult to produce high-order schemata. The same issue would arise if one tried to do conformational searching on a molecule using Cartesian coordinates instead of internal coordinates. Hartke also shows that the GA is more effective than simulated annealing (SA) for this problem. Smith38 demonstrates how the GA can be used to find arrangements of atoms in a binary alloy crystal. The problem to be solved involves placing equal numbers of atoms of types A and B in the cells of a crystal and evaluating the energy using a pairwise energy approximation. The chromosome is simply a string such as AABABBAABBBBAB . . . plus a fixed rule for mapping the sequence onto the two- or three-dimensional lattice. The GA method easily outperforms Metropolis Monte Carlo. This is not surprising, given the correlated nature of the final solution, which requires just the sort of large building blocks the GA is good at constructing. Mestres and Scuseria39 have combined semiempirical tight-binding potentials with a GA to find the global minima of small molecular clusters. Their chromosome is an adjacency matrix; the adjacency matrix element a_ij is equal to 1 if atoms i and j are neighbors, and 0 otherwise. A method is described to translate from the adjacency matrix representation to internal coordinates. Obviously, it is also important to restrict the number of 1s in the chromosome to avoid generating impossible structures. They successfully locate global minima for small carbon clusters.
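A hedged sketch of the kind of chromosome and fitness used in the binary alloy problem follows. The lattice size, the pair energies, and the row-major mapping rule are illustrative assumptions, not taken from the original article.

    # Sketch: map a linear A/B chromosome onto a square lattice and score it
    # with a nearest-neighbor pairwise energy. All numbers are assumptions.
    import random

    L = 6                                  # 6 x 6 lattice, 18 A and 18 B atoms
    PAIR_E = {("A", "A"): -1.0, ("B", "B"): -1.0,
              ("A", "B"): -0.4, ("B", "A"): -0.4}

    def lattice_energy(chrom):
        # Place the chromosome on the lattice row by row and sum
        # nearest-neighbor pair energies.
        grid = [chrom[i * L:(i + 1) * L] for i in range(L)]
        e = 0.0
        for i in range(L):
            for j in range(L):
                if i + 1 < L:
                    e += PAIR_E[(grid[i][j], grid[i + 1][j])]
                if j + 1 < L:
                    e += PAIR_E[(grid[i][j], grid[i][j + 1])]
        return e

    chrom = list("A" * (L * L // 2) + "B" * (L * L // 2))
    random.shuffle(chrom)
    print("".join(chrom), lattice_energy(chrom))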
Conformational Searching: Small Molecules

Judson et al.26 compare the GA, simulated annealing, the Nelder-Mead simplex,40 and pure random search for finding low-energy conformations of a scalable two-dimensional polymer whose global energy minimum conformation is known. A possible conformer is shown in Figure 9. For this molecule, the degrees of freedom coded in the chromosomes are the angles at each atom, so for an N-atom chain, there are N − 2 degrees of freedom. This model system can exist in a set of high-energy knotted conformations and a lower energy set without knots. Especially when N is large, it was found necessary to use gradient minimization on each function evaluation, because the SGA could not effectively search in the presence of a large number of high-energy van der Waals contacts. All of the GA runs used the hybrid approach (GA plus gradient minimization). One variant on this was the use of a Lamarckian GA, in which the energy function returned both the minimized energy and its coordinates. For the smallest case (19 atoms; see Figure 9), only the simplex method found the
Figure 9 Sample conformation for the two-dimensional polymer with 19 atoms.
global minimum, but all of the other methods found conformations one energy unit higher. For runs with 37 and 61 atoms (35 and 59 degrees of freedom, respectively), the Lamarckian GA outperformed all of the other methods, but never found the global minimum. Meza and Martinez41 compared these results with those for the parallel direct search (PDS) method and found that the GA also outperformed PDS when using the same local gradient method (conjugate gradient). However, they also found that the use of more sophisticated local methods in a hybrid could significantly improve the performance of PDS, to the point that it outperformed the GA for the largest problem. In this case, local minima were so closely spaced that a local method, by taking relatively large initial steps, could actually search a region having several local minima and choose the best of those. Subsequently, Meza et al.41 compared PDS and GA for a series of molecules and found the two methods to perform in a similar fashion for molecules with up to 39 dihedrals. Judson et al.15 applied the SGA, using Genesis42 plus MacroModel,43 to the conformational search of a standard test set of 76 molecules containing up to 12 dihedrals, and compared this with the CSEARCH44 systematic conformational search method in SYBYL.45 The GA found conformations lower than or within 1 kcal/mol of the energy of the minimized crystal conformation for all of the molecules. The CPU time grew much more slowly than for CSEARCH, with the crossover coming at about eight dihedrals, as shown in Figure 10. Brodmeier and Pretsch46 report a similar GA approach for small molecule conformational searching. McGarrah and Judson47 performed a conformational search on a cyclic peptide using a very nonlinear mapping from the internal coordinates onto Cartesians, in order to guarantee ring closure. Ultimately this proved to be unnecessary. Instead it has been found that simply searching over dihedrals and letting the GA learn to close the ring under the influence of the energy function works much better. Nonetheless, this study showed the value of using a set of intercommunicating subpopulations to promote broader searching.
Figure 10 Log of the CPU ratio between CSEARCH and GA on a set of small organic molecules,15 plotted against the number of dihedral angles.
The conclusion reached was that allowing the subpopulations to communicate more often than every 10-20 generations caused them to resemble one another too much, which promotes rather than discourages convergence. Another point made is that even infrequent local minimization during the evolution, which allows the GA to find out how deep the local minima are, is beneficial to the search as a whole. Dolata and co-workers48 have tested a GA in their suite of programs for learning rules for conformational analysis.49-51 They use the SGA, but what are manipulated are precalculated conformations of fragments. For instance, a cyclohexyl group is represented by three discrete conformations, which correspond to local minima, rather than by a set of continuous dihedral variables. By combining these template fragments with an SGA, they produce a method for very rapidly finding low-energy conformations of small molecules, including those with complex ring systems.
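The hybrid GA-plus-gradient-minimization idea used in several of these studies is easy to sketch. Below is a minimal illustration of the Lamarckian variant, in which each fitness evaluation relaxes the structure locally and writes the relaxed coordinates back into the chromosome. The toy energy function and the use of scipy's BFGS minimizer are assumptions for illustration, not the authors' implementation.

    # Lamarckian hybrid sketch: local minimization inside the fitness call,
    # with the relaxed genes inherited by the individual.
    import numpy as np
    from scipy.optimize import minimize

    def energy(x):
        # Toy rugged energy with many local minima.
        return np.sum(x ** 2) + np.sum(np.sin(3.0 * x))

    def lamarckian_fitness(chromosome):
        result = minimize(energy, chromosome, method="BFGS")
        chromosome[:] = result.x   # Lamarckian step: keep the relaxed coordinates
        return result.fun

    pop = [np.random.uniform(-2, 2, size=8) for _ in range(20)]
    fits = [lamarckian_fitness(ind) for ind in pop]
    print(min(fits))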
Conformational Searching: Proteins

Tuffery and co-workers52,53 have used GAs to treat the problem of determining side chain conformations for a protein when the backbone conformation is fixed. Their approach is to draw side chains from a rotamer library and fit them together in an optimal way. Each amino acid side chain is represented
by a discrete set of conformations in the library, and each has a probability based on the distribution of conformations observed in the crystal database. The chromosome is simply the set of N side chain conformations for an N-residue protein, and the evaluation function is the conformational energy. An interesting measure of the degree of evolution used here is the prediction spectrum, which is calculated by comparing the distribution of rotamers in the population at each residue with the underlying distribution from which they are drawn. If for side chain i there are n rotamers, the conformational entropy is given by

R_i = -(1/ln n) Σ_{k=1}^{n} p_k ln p_k
where p_k is the probability that rotamer k shows up in the GA population. The value of R_i varies from 0 (all individuals have the identical rotamer) to 1 (the distribution is completely random). The population convergence is given by

D = (1/N) Σ_{i=1}^{N} R_i
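A short sketch of this bookkeeping follows: the normalized entropy R for one side chain and the average D over all residues, following the formulas reconstructed above. The example rotamer labels are invented for illustration.

    # Prediction-spectrum sketch: R_i per residue and population convergence D.
    import math
    from collections import Counter

    def rotamer_entropy(rotamers_in_population, n_rotamers):
        counts = Counter(rotamers_in_population)
        total = len(rotamers_in_population)
        s = 0.0
        for c in counts.values():
            p = c / total
            s -= p * math.log(p)
        return s / math.log(n_rotamers)   # 0 = fully converged, 1 = random

    # Population of 8 individuals, 3 residues, one rotamer label per residue:
    population = [[0, 2, 1], [0, 2, 1], [0, 1, 1], [0, 2, 0],
                  [0, 2, 1], [0, 0, 1], [0, 2, 1], [0, 2, 2]]
    n = [3, 3, 3]                          # rotamers available per residue
    R = [rotamer_entropy([ind[i] for ind in population], n[i]) for i in range(3)]
    D = sum(R) / len(R)                    # population convergence
    print(R, D)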
The most novel aspect of this work is the focusing operation used during the evolution. For some number of generations, a standard GA is performed. Periodically, the population is divided into a set of subpopulations, using a simple clustering algorithm. Members of a given subpopulation tend to resemble one another. Next, the value of D for each subpopulation is calculated. If a subpopulation has a sufficiently low value of D, meaning that all of its members are essentially the same, the lowest energy member of the group is saved for later analysis and the entire group is discarded from the GA population. The effect of this operation is to clear out a region of conformation space (a niche) that has been well searched and optimized and to force the search to concentrate elsewhere. After the focusing operation has been performed, the remaining subpopulations are merged, and the population is filled back to its nominal size by building new conformations based on the underlying probability distribution. The GA method is compared against a heuristic sparse matrix driven (SMD) method, in which subsets of the side chain space are separately searched. The boundaries of the subsets are chosen so that there is significant interaction within each subset and little subset-subset interaction. Then an exhaustive search is performed on each subset, with an extra subset potential being included in a mean field fashion. The result of this comparison is that the SMD method tends to find lower energy conformations than the GA and gets there faster, but tends to miss interesting conformations just slightly higher in
energy. The GA does just the opposite and tends to find many conformations almost as low as the best found by the SMD algorithm. In a second article,53 Tuffery et al. continue the comparison by including the SGA, their GA with focusing, SMD, and simulated annealing (SA). The outcome, based on a study of 14 proteins of varying sizes, is that SMD works best for small proteins, SA fails for large proteins, and the GA with focusing works best for large proteins. The SGA gets to the same minima as the GA plus focusing, but takes about twice as long. The version of SA used here alters only one side chain at a time and fails because it cannot deal with the correlated motions needed to escape local minima. Once a multiple-flip SA was implemented, it did about as well as the SGA. Judson22,54 has used the GA to evolve rules to model the time-dependent folding process of a polymer. In both articles, the system being modeled was a two-dimensional unbranched polymer interacting via Lennard-Jones forces. The parameters were adjusted so that the global energy minimum was the hexagonal close-packed structure shown in Figure 9. In the first article, the chromosome consisted of an integer regulator gene, denoted N_fold, followed by a series of integer pairs corresponding to pairs of atoms in the polymer. The fitness function would then read N_fold pairs. For each pair read in, the two corresponding atoms would be connected with a spring of force constant k and equilibrium distance r_0, and a gradient minimization would be performed, bringing those two atoms close to one another. Sometimes the move would be foiled because other atoms were in the way. Once the minimizer had converged, the spring was removed, and the next pair of atoms was processed. At the end of N_fold steps, a final gradient minimization would be performed using just the basic force field. The fitness was the final energy. This is an example of an optimal control problem, in which a succession of actions is carried out to move a machine to a specified final conformation. Over a period of several generations, the GA learned to fold the polymer into one of the degenerate global energy minimum conformations. This approach was obviously nonphysical in its one-at-a-time interaction scheme. A second variant54 uses a state transition matrix approach similar in flavor to methods described by Koza.12 This approach uses the distance matrix D_ij, whose elements are the distances between atoms i and j. At each step, a new distance matrix D′_ij is formed for which D′_ij = D_ij + S_ij, where S is the state transition matrix; S_ij specifies the desired amount by which the distance between atoms i and j should be changed at each step. The magnitude of S_ij is some fraction of an angstrom. Springs are added between each pair of atoms for which D_ij is less than a cutoff distance, with equilibrium length D′_ij. If S_ij < 0, the effect is to bring i and j together. If S_ij > 0, the effect is to push i and j apart. A gradient minimization is performed to try to satisfy the constraints, and then the process is repeated. Note that S_ij is constant throughout the folding process. An SGA was used to evolve S matrices, where the target was the D_ij matrix of
one of the global energy minimum conformations. Once again, the GA was able to find several S matrices that could fold a single initial conformation into the target. Finally, a series of initial conformations was given to each of these S matrices, and the one that correctly folded the largest fraction was judged to be the best. Dandekar and Argos55 examined a pair of applications of the GA for protein engineering work. In the first case, they evolved protein sequences (i.e., the chromosome was a sequence) to find sequences that might fold in a specified way. The fitness function steers the evolution to produce sequences that have a composition similar to that of the target family (zinc finger proteins in this case). It also rewards sequences that have specific amino acids at sites where the original family is highly conserved. In the second application, the GA is used to evolve conformations for which the chromosome specifies internal coordinates for a lattice model. An SGA was used in each of the applications, with one interesting twist. They ran a number of separate evolutions (niches in our vocabulary, epochs in theirs) and saved the best individual in each. Then a single competition run was performed whose population was made up of the best individuals from each of the epochs and filled in with random individuals. In a later article,56 they extended their method and applied it to simple, nonlattice models of proteins. Each amino acid could take on one of a small set of conformers, which would be specified by the chromosome. A variant on this was to fix segments of the protein using a secondary structure determination method. For instance, certain regions were identified as being helical, so each of these segments was replaced by a rigid helix taken from a known structure. The GA was then left to determine the conformations of the loop regions. Unger and Moult57 introduced a hybrid Metropolis Monte Carlo/GA method that partially solves the major problem each method has on its own. The Metropolis method consists of taking random steps away from a starting conformation, evaluating the energy after the move, and then deciding whether or not to accept it. The move is always accepted if the energy goes down and accepted with a Boltzmann probability if it goes up. This method has the ability to climb out of local minima, but cannot do a broad search of parameter space, because the moves are all local. The GA, on the other hand, is not good at local searches, but instead tends to encompass the bigger picture. Unger and Moult's method is in essence a form of cooperative simulated annealing. A population of individuals perform a Metropolis Monte Carlo random walk for n steps, where n is typically 20. Then crossover is performed, where pairs of parents are crossed to form children. The children are evaluated to see if they meet some minimum fitness criteria and are rejected if they do not. Once the population has been refilled, another round of Monte Carlo is performed. The application to which they apply their method is the folding of a two-dimensional lattice representation of a protein,58 where the conformation is defined by a move at each lattice point. A conformation is required to be self-avoiding, and this is the
only criterion used to filter possible children during the crossover step. For a sufficiently small protein, exhaustive searching can be done to find the global optimum, which is the target of the search. Unger and Moult57 ran both pure Monte Carlo (MC) and their hybrid method on a 20-amino-acid protein and found that the MC algorithm took on the order of 10⁶-10⁷ function evaluations to find the global optimum. The hybrid method, on the other hand, found the optimum in roughly 10⁵ evaluations, a saving of a factor of between 10 and 100. They also introduced a variant mutation operator in which the effect of the mutation is controlled.59 At each MC step, a variable number of mutations was allowed; that is, more than one internal angle could be altered simultaneously in a given conformation. A series of runs was then performed examining the lowest energy from the run vs. the mutation rate and the acceptance control parameter (the "temperature"). If all moves are accepted (infinite temperature), the best performance was seen with intermediate mutation rates. In other words, one can afford to have more than, but not much more than, one mutation per conformation. If tight control is used (low temperature), a higher mutation rate is needed to make any progress. Sun60 developed a segmented GA for use on the protein conformational search problem. He took the basic idea of a rotamer library and extended it to two-, three-, four-, and five-amino-acid sequences. The inclusion of the high-order rotamers (i.e., two-, three-, four-, or five-amino-acid pieces with specified side chain conformations) is most important in the mutation operator. When a conformation is to be mutated, the full amino acid sequence of the protein being examined is randomly partitioned into two-, three-, four-, and five-amino-acid pieces. Then one of these pieces is replaced with a member of the corresponding rotamer library that matches the original amino acid sequence. Before adding the rotamer to the protein, the angles may be tweaked by ±10° to increase the region of conformation space sampled. The effect of the segmentation is to speed up the search, because good local structure is automatically built in. In the language of the GA, these segments are highly fit, low-order schemata. The protein model on which the method is tested uses a three-atom-per-residue backbone and a one-atom side chain.60 Optimization was carried out on melittin (26 residues), avian pancreatic polypeptide inhibitor (APPI) (36 residues), and apamin (18 residues). The fitness function used a molecular mechanics penalty as well as a penalty on the radius of gyration, which is a relatively straightforward value to obtain experimentally. In all three proteins, the GA found conformations of lower energy than that of the native structure, which was a problem with the force field rather than with the optimization method. A standard SA method was also applied to this problem. It still found low-energy conformations, but took 100-200 times more function evaluations. Le Grand and Merz61 used a steady-state, real-valued GA to fold a series of small proteins. They also used a rotamer library (in which each amino acid was represented by a small set of high probability conformations), but went
Examples of Chemical Applications 45 through an intermediate step to use the rotamers. Their chromosome gives the values of the dihedrals. For each amino acid, the rotamer that most closely resembles that specified by the chromosome is used. Again, to further search the space, the rotamer dihedrals are randomly tweaked by up to 220". The fitness is the AMBER62 energy. Three different crossover operators are used, with probabilities determined by how successful they have been at generating low-energy offspring. The three crossover operators are two-point, two-point with wraparound, and uniform. (See the prior discussion of alternate crossover strategies for a description of two-point and uniform crossover. Two-point with wraparound is the same as two-point, except that it allows one of the cut points to be at the end of the chromosome.) The largest protein on which they tested their method was crambin (46 residues). Here, they found conformations about 150 kcal/mol lower in energy than the crystal structure, implying but not proving that the GA did a good job of searching the conformation space. Jones63 used a GA to design protein sequences that take on a particular fold. The chromosome is the amino acid sequence and the fitness function is a statistical potential developed from known structures. The GA generates a sequence, and maps it onto the target structure, after which the fitness is calculated. Certain residues are required to be fixed (cysteines, for instance), which is enforced by a penalty term. An SGA is used. One of the interesting points made in this study is that it is possible to over-design a sequence, that is, to optimize it too highly with the GA. The problem with this is that the fitness function is only approximate, and too much optimization may well emphasize one part of the protein to the detriment of the overall design. Gunn et al.64 applied ideas similar to some already discussed, combining SA with GAS to optimize a model protein. They fixed secondary structure (their test case is the four-helix bundle protein myoglobin) and allowed the optimization method to vary only the loop regions. Use was made of a hierarchical potential where a move (in the SA sense) is first tested against a simple potential that only evaluates bad contacts. If the move passes with that score, the full potential is evaluated. As in Unger and Moult's work,59 they carry out a series of steps of SA on a population of structures and then mix them using a crossover operator. In their version of the operator, one individual is chosen, and a cut point in one of the loops is chosen randomly. The short end of that conformation is replaced with the short end of all of the others, and each of the new conformations is tested with the simple potential, and the lowest energy one replaces the original parent. This operation is then carried out for each individual in the population to generate a new population. Three combinations were tested: pure GA (no SA intermediate); pure SA (no crossover), and the mix just described. Not surprisingly, the mixed method was found to be the best. Ring and Cohen6s use a simple GA in their program BLoop to generate loop libraries for proteins by building up overlapping sets of tetrapeptides found in the structure database. They represent each tetrapeptide by one of four letters {U, K, Z , J} which correspond symbolically to the path taken by the four
Cαs. These are in turn represented by the binary alphabet {(00), (01), (11), (10)}, respectively. A four-residue loop is a single letter, a five-residue loop is represented by a pair of letters, and so on, so that a loop of N residues is represented by N − 3 letters. The GA generates a string of letters and evaluates the string by comparing the likelihood that the prescribed tetrapeptide orientations would occur for the particular tetrapeptide sequence in the loop being built. This buildup technique allows loops to be built that closely resemble ones in the database, but also interesting variants not found there. The representation and the fitness function are so simple that the optimization goes very quickly. Merkle and co-workers66 have applied a messy GA (MGA) to the protein conformational search problem and have implemented it on a parallel machine. Preliminary results show that the MGA can be used efficiently on a parallel machine (as can the SGA), but the issue of search and optimization efficiency was left unresolved. Herrmann and Suhai report the combination of an SGA with MOPAC energies to perform conformational searches of peptides.67 These authors have also reviewed earlier work using GAs for protein folding.
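The cooperative Monte Carlo/GA hybrid of Unger and Moult described above has a simple overall structure, sketched below. The energy model, temperature, crossover, and filter threshold are illustrative assumptions; the real method operates on self-avoiding lattice conformations rather than real-valued vectors.

    # Schematic hybrid: each individual takes n Metropolis steps, then pairs
    # are crossed and children kept only if they pass a fitness filter.
    import math, random

    def metropolis_walk(x, energy, n=20, step=0.1, T=1.0):
        e = energy(x)
        for _ in range(n):
            trial = [xi + random.uniform(-step, step) for xi in x]
            et = energy(trial)
            if et < e or random.random() < math.exp(-(et - e) / T):
                x, e = trial, et
        return x

    def crossover(a, b):
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def hybrid_mc_ga(energy, pop, rounds=50, e_filter=1e6):
        for _ in range(rounds):
            pop = [metropolis_walk(x, energy) for x in pop]     # local MC search
            children = []
            while len(children) < len(pop):
                child = crossover(random.choice(pop), random.choice(pop))
                if energy(child) < e_filter:                    # minimal fitness criterion
                    children.append(child)
            pop = children
        return min(pop, key=energy)

    energy = lambda x: sum((xi - 1.0) ** 2 for xi in x)
    print(hybrid_mc_ga(energy, [[random.uniform(-3, 3) for _ in range(6)]
                                for _ in range(10)]))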
Conformational Searching: Docking

The first report of using GAs for the molecular docking problem was by Dixon,68 who published an abstract describing the use of the SGA to drive the torsional and orientational degrees of freedom inside the program DOCK.69 The six degrees of freedom used in most of the docking applications described here are the (x, y, z) offset of the ligand plus three Euler angles describing its rigid-body orientation relative to the protein. In addition, the internal dihedral angles of the ligand are usually specified. Oshiro et al.70 have extended this work to examine a series of ligand-protein systems. Judson et al.71 have extended their small molecule methods to the docking problem. The approach uses a variant on the SGA. One of the most important features used there is the concept of growing. This is an attempt to solve the problem of fitting a large floppy molecule into a tight pocket. Most if not all of the initial ligand conformations will fail to fit, and the GA will learn nothing about the binding modes. The approach used is to initially dock a small but presumably important part of the molecule, which produces a number of plausible binding modes. Then the rest of the molecule is grown in over a period of several generations. Most of the orientational space is discarded during the early growing stages, and the later stages serve to refine the internal conformation of the ligand to conform to the binding site. Another insight from this work is that using a hybrid method (GA plus local gradient minimization) greatly helps guide the search. However, the implementation was too expensive to use gradient minimization on a regular basis. The method was applied to a series of thermolysin inhibitors72 and was able to produce a good
Examples of Chemical Applications 47 correlation between calculated binding energy and experimental binding constants. Clark and Ajay73 independently developed a method similar to that of Judson and co-~orkers,71J2although their numerical implementation is much faster. This is principally due to the use of grid-based potentials for the ligandprotein interaction. Several other modifications are also described to improve the search. These include running several subpopulations, each sampling only a restricted region of the center-of-mass translational space. They report results for both rigid body and flexible ligand docking. Gehlhaar and co-workers74 report the use of an evolutionary programming method to predict the conformation of protease inhibitors. They use a very simple, piecewise linear potential that starts off being very soft, so that cavities are larger than in reality. As the evolution proceeds, the potentials become harder, and the binding site shrinks to its appropriate size. Their evolutionary method uses real-valued coding of the usual translational, rotational, and dihedral degrees of freedom. The selection process takes a single individual in a population and compares it with a set of other individuals (tournament selection). Its probability of surviving is proportional to its number of “wins,” that is, the number of opponents it scored better than. If an individual is selected, it is placed into the new population after being subjected to a Gaussian-weighted random mutation. They find that to maintain diversity in the population, it is important to use a small number of opponents in the selection competition, as well as to carefully control the mutation sizes. Jones et al.75 have developed a GA based docking method that allows both ligand and protein flexibility. Their chromosome is different in important respects from that used in the other docking applications described here. It is divided into four separate pieces as
The first two sections code for the values of the internal dihedral angles for the ligand (φ) and for the protein side chains (χ). The third section has one variable for each lone pair in the ligand. The values that those variables can take are the atom numbers of the polar hydrogens in the protein binding site. The final block has one variable for each polar hydrogen in the ligand. Those variables can take as values the atom numbers of the protein lone pair donors. The ligand is placed in the binding site by first setting its own and the protein's dihedrals, and then making a rigid-body fit that attempts to satisfy as many as possible of the polar hydrogen-lone pair contacts specified by the third and fourth blocks of the chromosome. Crossover and mutation act separately on each of the four blocks. This representation is potentially more powerful than that used by the other groups doing GA-based docking, because the hydrogen
bonding interactions that drive binding are directly specified by the GA. The method is illustrated by docking five ligand-protein systems. These authors also used their representation to perform flexible pharmacophore matching.75
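A sketch of the chromosome layout common to most of the docking applications above follows: three translations, three Euler angles, and the ligand's rotatable dihedrals. The z-y-z Euler convention and the gene ordering are assumptions for illustration; real packages differ in both.

    # Decode a docking chromosome into a rigid-body pose plus dihedrals.
    import math

    def euler_zyz(a, b, c):
        # Rotation matrix R = Rz(a) * Ry(b) * Rz(c).
        ca, sa, cb, sb, cc, sc = (math.cos(a), math.sin(a), math.cos(b),
                                  math.sin(b), math.cos(c), math.sin(c))
        return [
            [ca*cb*cc - sa*sc, -ca*cb*sc - sa*cc, ca*sb],
            [sa*cb*cc + ca*sc, -sa*cb*sc + ca*cc, sa*sb],
            [-sb*cc,            sb*sc,            cb   ],
        ]

    def decode(chromosome, n_dihedrals):
        x, y, z = chromosome[0:3]             # center-of-mass offset
        rot = euler_zyz(*chromosome[3:6])     # rigid-body orientation
        dihedrals = chromosome[6:6 + n_dihedrals]
        return (x, y, z), rot, dihedrals

    genes = [1.2, -0.5, 3.0, 0.1, 1.4, -2.0, 0.3, 1.1]
    print(decode(genes, 2))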
Conformational Searching: DNA/RNA

Lucasius et al.76 published one of the earliest reports on the use of GAs to solve a molecular conformation problem. They used what is essentially an SGA, but added a hierarchical component to it. Their application aimed to find conformations of nucleic acids satisfying a set of NOE-derived distance constraints provided by NMR. The fitness is the number of NOE distance constraints satisfied by a given conformation. What the authors recognized is that much of the information is at least partially separable; that is, certain NOEs correspond to certain dihedrals. They used this insight to build a hierarchical GA that optimizes many subparts of the molecule in parallel and then folds the subparts into successively larger scale segments until the entire molecule is reconstructed. Ogata et al.77 attacked the same nucleic acid conformation problem, but replaced the buildup scheme of Lucasius with a local filter that is equivalent to the use of a rotamer library. In both cases, these methods must deal with the fact that this is an underconstrained problem, because several of the dihedrals have no NOEs associated with them. Schuster78 earlier treated a simple model of RNA to predict three-dimensional (3D) conformations, using a variant on a spin-glass Hamiltonian as his fitness function. The simple model used allowed for an analysis of the complexity of the fitness landscape, couched in terms of the genotype-to-phenotype mapping.
Protein NMR Data Analysis

Blommers et al.79 performed a conformational search on small peptides in an attempt to satisfy a set of NOE distance constraints. They used an SGA, the only variant on which was the use of a sharing operator to slow down convergence. At the end of the initial search phase, several interesting conformations were gradient minimized using an MM energy function.
Protein X-ray Data Analysis

Chang and Lewis demonstrate a method for using a GA to determine heavy atom positions in the isomorphous replacement method of X-ray crystallography.80 Heavy atoms are placed in a crystal to help determine the phasing information when solving for protein structures using X-ray crystallographic methods. Crystals with a water replaced by a mercury compound, for instance, will display characteristic differences in the Patterson map,81 manifested as groups of peaks off the origin. The heights of the peaks are proportional to the
scattering power of the replacement atoms, whereas the noise level is proportional to the number of replacements. To find the locations of the replacement atoms, positions are guessed, a map is calculated and compared with the experimental data, and the process is repeated until a match of sufficiently high quality is obtained. Chang and Lewis code the positions (x, y, z) of the heavy atoms within the unit cell in the chromosome. The solution proceeds in one of two ways. First is the bootstrap, in which the position of one of the atoms is found and then the vicinity of that atom is excluded from further search. The problem with this method is that errors in the trial solutions of early positions can ruin later predictions. Their other approach, which seems preferable, is to code for all of the positions simultaneously. This seems to provide higher quality solutions. An interesting variant on the chromosome coding is to use (x, y, z) for the first atom and then reference the others from it with a rotation matrix and offset vector. This is useful when the relative positions of the heavy atoms are known at least approximately. Otherwise, a standard SGA is used.
Molecular Similarity

Several groups have used GAs to attack problems involved with superposing molecules in both two and three dimensions. Payne and Glen82 look at a variety of uses of 3D conformational searches, similar to those already described, but for the purpose of finding molecular similarities. The main theme here is using the GA to find conformations that satisfy a set of constraints. The first type considered is distance constraints between similar groups on different molecules, as derived from NOEs. For instance, they show how to use the GA to superpose multiple molecules to produce a good pharmacophore model. They also consider more complex situations where the function to be optimized involves volume or charge overlap. The representation is similar to that used in the docking calculations described previously. The chromosome includes molecular center-of-mass displacement, Euler angles, dihedral rotations, and ring flips. The last specifies whether atoms in a nonplanar ring should be inverted about the major ring plane. Otherwise, the GA implementation is close to that of the SGA. Clark et al.83 carry this work further and compare the GA, SA, distance geometry,84,85 CSEARCH,44 directed tweak,86 and random search methods on the problem of finding good pharmacophoric matches in databases of thousands of molecules. Their final conclusion is that the GA and directed tweak outperform all of the other methods, and that directed tweak may in fact be preferable to the GA. May and Johnson87 have taken the molecular superposition problem to the next level of complexity and used it to superpose proteins in order to measure structural similarity. Once molecules are as large as proteins, and pairs of proteins that may or may not be similar are tested, it is sometimes problematic to decide which pairs of atoms in the two structures should be used for the
superposition. Dynamic programming88,89 approaches are typically used to make a sequence alignment, giving a first guess as to which residues should line up. May and Johnson, following Fredman,88 add in the known structural information. Their GA fixes one protein and varies the six degrees of freedom describing rigid rotation and translation of the second one; in addition, their GA uses the gap penalty needed for Fredman's dynamic programming evaluation. At the end of the GA, one is left with an optimal set of pairs of residues to superpose using standard least squares fitting. Again, the standard SGA is used.87 Walters and Hinds90 published a method that uses a GA to build a model of a receptor whose structure is unknown. They start by taking a set of ligands whose bioactivity has been measured for a receptor, then choose a putative bioactive conformation for each ligand, and finally align the ligands manually. Next the superimposed ligands are embedded in a cubic grid, and pseudoreceptor atoms are placed on the open grid points. These atoms are given a fixed set of force field parameters and are allowed to relax around the ligands to maximize the van der Waals interactions. Finally they are pushed slightly away (0.1-1.0 Å) from the ligands. A set of 40-60 atoms closest to the ligands is chosen to define the receptor atom positions. At this point, the GA is used to choose atom types to assign to each of the receptor atoms. The choices are standard CHARMM91 atom types found in proteins, including their associated van der Waals parameters and atom charges. The atoms in the model are numbered consecutively, and the chromosome simply contains the atom types that are assigned to the atoms. The fitness function is the correlation coefficient for the linear fit between the force-field-computed binding energy and log(bioactivity) for each of the compounds in the training set. Good models achieved an r² fit > 0.95 for the training set, but the residual error for compounds not in the training set was as high as 0.77 log units of log(bioactivity). A standard GA approach is used. However, an important innovation is introduced, namely the use of many of the best models to gain statistics and estimate error bars on the activities for new compounds not in the training set. Another important reason for examining many models is to see which receptor atom types are conserved across all good models and which are not. The former are presumably telling us something "real" about the receptor, at least if the alignment of the training set is correct. Another useful feature of this model, which is independent of its GA component, is the ability to compare the fit of structurally diverse ligands into a receptor. Fontain92,93 demonstrates the use of the GA for the problem of calculating the chemical distance between molecules for use in automated chemical synthesis planning methods. The essential idea is that a starting material (SM) is transformed through a series of steps into a product (P). At each step k, the intermediate I_k can in principle be transformed into a series of new intermediates I_{k+1}^1, I_{k+1}^2, . . . . One of the criteria used to rule out certain of these
transformations is that the minimum chemical distance, CD_min, from the intermediate to P must be less than the distance from SM to P. The minimum chemical distance is defined94 as
CD_min = min_P Σ_ij |E_ij − (P B Pᵀ)_ij|
where E and B are be-matrices95 for the two molecules, describing the redistribution of electrons as bonds are altered to transform molecule I_k into each of the molecules I_{k+1}, and P is a permutation matrix chosen to minimize the sum. The GA is used to do this minimization. The interesting point from the GA perspective is the need to always produce proper permutation matrices during crossover and mutation. To solve this problem, the PMX crossover of Goldberg is used.9 Brown et al.96,97 describe the use of GAs for fast substructure searching using chemical graphs. The task is to take a new structure S, compare it with a set of hyperstructures already in the database, and discover which substructures of S match corresponding substructures in a hyperstructure H. Hyperstructures98 provide a method for efficiently storing large numbers of molecules in a database. Here the chromosome is a list of atom numbers in H onto which the atoms of S will be mapped. For instance, if S has five atoms, the chromosome could look like {8, 3, 4, 20, 15}, meaning that atom 1 in S maps onto atom 8 in H, atom 2 in S maps onto atom 3 in H, etc. A proper chromosome has the property that the matched atoms in S and H must be of the same type (atom type and bonding type) and that no two S atoms can map onto the same H atom. Finally, only maps onto connected substructures of H are of interest, so only those H atoms that are specified in the chromosome and that form a connected graph are used in the fitness evaluation. This leaves some fraction of the chromosome unaffected by selection. Because of the requirement that each element of the chromosome be unique, novel PMX-like crossover and mutation operators were developed. Hibbert99 describes a GA-based method for generating isomers satisfying a given molecular formula and for determining a proper 2D representation of the isomers for graphical display. The chromosome is a set of pairs (i, j) indicating a bond between atoms i and j. The fitness function measures how well all of the atoms' valences are satisfied and how representative the bonding is. For instance, single- and double-bonded carbons will be favored over triple-bonded carbon. The operators used here do not guarantee that valid molecules are produced from each chromosome; for example, molecules can have incorrect valences or even be disconnected. An attempt is made to have the operators mostly produce valid structures, but invalid ones are kept at a low percentage of the population by not allowing them to produce offspring.
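Because the partially matched crossover (PMX) comes up repeatedly in these permutation-based applications, a brief sketch is given here. It follows the textbook operator, applied to atom-index permutations of the kind used in the chemical-distance and substructure work; the example permutations are invented.

    # PMX sketch: copy a section from parent2, then repair duplicates outside
    # the section via the mapping defined by the matched positions.
    import random

    def pmx(parent1, parent2):
        n = len(parent1)
        i, j = sorted(random.sample(range(n), 2))
        child = parent1[:]
        child[i:j + 1] = parent2[i:j + 1]
        mapping = {parent2[k]: parent1[k] for k in range(i, j + 1)}
        for k in list(range(0, i)) + list(range(j + 1, n)):
            v = parent1[k]
            while v in parent2[i:j + 1]:   # follow the mapping out of the section
                v = mapping[v]
            child[k] = v
        return child

    p1 = [1, 2, 3, 4, 5, 6, 7, 8]
    p2 = [3, 7, 5, 1, 6, 8, 2, 4]
    print(pmx(p1, p2))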
QSAR

Rogers and Hopfinger100 present a powerful method for developing quantitative structure-activity relationship (QSAR) models using a method they call genetic function approximation (GFA). QSAR attempts to find models that fit features of molecules to the activity or other properties of the molecules. A model takes as input a set of features, which are properties of the molecules that can either be measured or calculated, and a set of so-called basis functions of the features, and combines them in a linear regression model. Examples of features are logP, melting points, and dipole moments. Examples of basis functions are x, log(x), and (1 − x)², where x is a particular feature. The regression model is of the form

A(X) = a_0 + Σ_i a_i φ_i(X)
where A is the activity; X represents the set of individual features x; the a_i are regression coefficients to be determined; and the φ_i are the basis functions. One of the principal problems with developing a QSAR model is that one typically has many features and many possible basis functions, so that it is easy to overfit the data. This yields good regression fits for the training set of data, but often poor predictive power for new compounds. It is difficult, though, to decide a priori which features to suppress and which subset of basis functions to choose when constructing a model. GFA solves these problems by allowing the computer to try many models, using a GA to search for progressively better models. The algorithm proceeds as follows. The user chooses a set of features and basis functions of the features from which the GA can choose. For features x, y, and z, the set could be

{x, log(x), (x − 10)², y, (1 − y)³, z}
The GA chromosome for an individual is composed of a subset of these elements:
Individual 1: {x, z}
Individual 2: {log(x), (x − 10)², y}
Individual 3: {x, (1 − y)³, z}
A regression model is built for each individual in the population, and the fitness is evaluated. Rogers and Hopfinger use the Lack of Fit (LOF) measure of Friedman101
LOF = LSE / [1 − (c + d·p)/M]²
Here LSE is the usual least squares error, c is the number of basis functions, p is the total number of basis functions (which can exceed c), M is the number of compounds in the training set, and d is a smoothing parameter, typically chosen equal to 1. This scaling of the LSE penalizes models that overfit the data by using many features and/or basis functions. This is an example of a parsimonious fitting method. The population then undergoes selection, crossover, and mutation. The crossover operator simply exchanges subparts of the chromosomes of the two parents. The mutation operator adds or subtracts a basis function. There are additional operations when splines are used as basis functions. An important outcome of a GFA run is that it produces many different models, several of which may have low LOF values. The user can then feed new compounds through the set of models and work with a distribution of predicted activities. Presumably a user will have higher confidence in the potential success of a compound whose high activity is predicted by all of the models than in one that yields mixed predictions.
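The GFA fitness is easy to sketch: fit a linear model over the basis functions named in a chromosome and score it with the LOF measure above. numpy's least squares stands in for the regression; the feature data and the basis-function pool are invented for illustration.

    # GFA-style fitness sketch: LOF for one chromosome of basis functions.
    import numpy as np

    BASIS = {
        "x":       lambda f: f[:, 0],
        "log(x)":  lambda f: np.log(f[:, 0]),
        "y":       lambda f: f[:, 1],
        "(1-y)^3": lambda f: (1.0 - f[:, 1]) ** 3,
        "z":       lambda f: f[:, 2],
    }

    def lof(chromosome, features, activity, d=1.0):
        cols = [np.ones(len(activity))] + [BASIS[name](features) for name in chromosome]
        X = np.column_stack(cols)
        coeffs, *_ = np.linalg.lstsq(X, activity, rcond=None)
        lse = float(np.sum((activity - X @ coeffs) ** 2))
        c = p = len(chromosome)      # here each basis function counts once
        M = len(activity)
        return lse / (1.0 - (c + d * p) / M) ** 2

    rng = np.random.default_rng(0)
    features = rng.uniform(0.1, 2.0, size=(30, 3))
    activity = 2.0 * features[:, 0] + np.log(features[:, 0]) + rng.normal(0, 0.05, 30)
    print(lof(["x", "log(x)"], features, activity))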
Design of Molecules

Venkatasubramanian and Caruthers102 have developed a novel approach
to designing polymers by letting a GA search over monomer space to find polymers that possess a specified set of physical or chemical properties. Their work is a good example of the design of problem-specific representations and operators. GAs are expected to do well in a search of molecule space relative to
other optimization methods because the space is large and discontinuous. The representation used here is a LISP-like string that represents backbones and side chain groups; a given string maps directly onto a specific polymer.
This simple representation is ideal for unbranched polymers, but would need modification for more complex cases. However, other string representations, such as SMILES,103 are available and serve the same purpose. The unique operators used with polymers are:

Asymmetric crossover: Because the monomers can be of variable length, there is no point in choosing the same crossover point for the two parents. The only requirement is that the crossover operator must leave realizable offspring, so a crossover point in the backbone is always chosen.

Blending: This operator simply concatenates two parents to produce one offspring whose length is the sum of those of the two parents.

Mutation: The mutation operator can act on either the backbone or the side chains.

Insertion and deletion: Backbone and side chain units can be deleted or inserted.

Hopping: The hop operator switches the positions of two backbone units.

The GA chooses from a pool of backbone and side chain units.
[The original figure shows representative backbone units and side chain units from this pool.]

The fitness function used is also novel in that it requires the properties of the designed polymer to lie within a specified tolerance of the desired value.
In that function, the P_i represent the n properties, p_i is the target value of each property, (P_i,max, P_i,min) are the maximum and minimum allowed values of the properties, and a is a scaling parameter that determines how large the penalty is for stepping outside the tolerances. Venkatasubramanian and Caruthers also give a one-sided fitness function for situations where only a maximum or minimum parameter value is specified. The case studies presented in the article102 show that the GA almost always finds a polymer satisfying a given set of criteria, provided that the appropriate chemical building blocks are in the mix. How well this will
work as a laboratory design tool will, as with many of the other cases discussed here, depend on the accuracy of the model that calculates physical properties for a given polymer.
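A hedged sketch of a tolerance-based fitness of this flavor is given below. The exact functional form of the original article is not reproduced here; the penalty shape, the property names, and the default scaling are all assumptions meant only to illustrate the idea of penalizing excursions outside an allowed window.

    # Illustrative tolerance-based polymer fitness (not the published form).
    def polymer_fitness(props, targets, bounds, a=10.0):
        penalty = 0.0
        for name, value in props.items():
            lo, hi = bounds[name]
            penalty += abs(value - targets[name])   # distance from the target
            if value < lo:
                penalty += a * (lo - value)         # stepping outside the tolerance
            elif value > hi:
                penalty += a * (value - hi)
        return 1.0 / (1.0 + penalty)                # higher is fitter

    props   = {"Tg": 390.0, "density": 1.09}
    targets = {"Tg": 400.0, "density": 1.10}
    bounds  = {"Tg": (380.0, 420.0), "density": (1.05, 1.15)}
    print(polymer_fitness(props, targets, bounds))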
DNA and Protein Sequence Applications

Cinkosky and co-workers104 have developed a set of software tools for producing genome maps from overlapping fragments. Their program, called SIGMA (System for Integrated Genome Map Assembly), attempts to place fragments in the proper relative positions to construct a complete map. The function they are optimizing is a set of statements about the probabilities of fragments having specific relative placements. An example statement is "These two fragments overlap by about 10,000 base pairs." A map may use millions of such statements, all of which need to be tested and optimally satisfied. A GA is used as part of this optimization process. Schneider and Wrede105,106 demonstrate a nice coupled neural network/evolutionary algorithm method for designing specific protein sequences. (See Dandekar and Argos55 for more information on using GAs for protein engineering.) The problem that Schneider and Wrede set out to solve is the design of sequences that could act as cleavage sites in peptidases and perhaps have different specificities from those found in nature. Their first task is to train a neural net (NN) to recognize sequences that would act to cleave proteins. This is done by giving the NN a training set of sequences that are known to act or not act as cleavage sites. Their first use of the EA is to optimally adjust the weights in the NN to distinguish cleaving from noncleaving sequences. Once a trained NN is available, it is used as an EA fitness function to screen for new sequences that have successively higher cleavage scores. The inputs to the NN are a variety of physicochemical properties of the amino acids, rather than just amino acid names. The authors use a standard {1, λ} EA.7 The bottom line for this approach (which is true for many of the applications described here) is that the chief stumbling block is not the optimization method, but rather the fitness function. Real chemistry is far more complex than even our best models. On the other hand, Schneider and Wrede have enough confidence in their predictions that they are synthesizing and testing the predictions in the lab. Füllen and Youvan107 describe the use of a GA to directly drive the synthesis of new protein sequences. The experimental approach first produces DNA, which then codes for protein, and then measures the activity of the resulting proteins. Because there is a many-to-one relationship in the DNA triplet to amino acid coding, the GA chromosome simply codes for the doping ratios of the four DNA bases at each step of synthesis. Vemuri and Cedeño108 investigated a variety of methods for slowing down convergence by promoting niche formation and applied them to the problem of DNA sequence reconstruction from fragments. (This is different from the map assembly described above, in that shorter sequences are of interest,
but the correct base-by-base sequence is the output, rather than simply a correct ordering of many large fragments.) The authors review some of the earlier work on niche formation, or crowding, such as DeJong's10 and Mahfoud's,109 and then propose their own variant, which they term multi-niche crowding (MNC).108 MNC uses uniform selection in a steady-state GA. For each mating event, an individual A and a set of individuals Cf (the crowding factor group) are selected. A then mates with the individual M in Cf that is most similar to A, using the fitness value (phenotype) rather than the chromosome (genotype) to measure closeness. Vemuri and Cedeño investigate a number of other, more recondite niche-inducing schemes that are applied to simple test problems as well as to the DNA sequencing application. Ishikawa et al.110 have combined a GA with a dynamic programming (DP) algorithm to perform multiple sequence alignments. DP is a local hill-climbing algorithm that attempts to align pairs of sequences by shifting sequences relative to one another and possibly inserting or deleting residues in one or another of the sequences. One drawback to using DP to align multiple sequences is that the CPU time grows rapidly with N, the number of sequences. The GA is used in this context to partition the original set of N sequences into several subsets that are aligned separately. The GA is in essence used to reduce the dimensionality of the problem by finding optimal partitions that can be easily optimized locally.
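The MNC mating step described above is simple to sketch. In the fragment below, population entries are (chromosome, fitness) pairs, and the one-point crossover is a placeholder; the group size and chromosome format are assumptions for illustration.

    # Multi-niche crowding mating sketch: A mates with the member of the
    # crowding factor group Cf closest to it in fitness (phenotype).
    import random

    def mnc_mating_step(population, cf_size=4):
        A = random.choice(population)
        Cf = random.sample(population, cf_size)
        M = min(Cf, key=lambda ind: abs(ind[1] - A[1]))   # phenotypic similarity
        cut = random.randrange(1, len(A[0]))
        child = A[0][:cut] + M[0][cut:]
        return child

    population = [([random.randint(0, 1) for _ in range(10)], random.random())
                  for _ in range(20)]
    print(mnc_mating_step(population))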
Data Clustering

Lucasius et al.111 have developed a fast clustering method based on a GA. Using a data set of K elements, they attempt to find the optimal partitioning into k clusters. The fitness function is the sum of the distances from the elements of the clusters to the cluster centers. The approach is to have the GA chromosome be a list of k unique elements and to assign the remaining K − k elements to the closest of the k centers. A given optimization run uses a fixed value of k. The necessity that each of the elements of the integer chromosome be unique requires the use of a novel crossover/mutation operator, which is illustrated below. Assume that a group of nine elements is to be partitioned into three clusters. Let the chromosomes for two individuals be given by [1 3 5] and [8 7 1]. The offspring are produced through the following steps:
Step 1: Concatenate the two chromosomes: [1 3 5 8 7 1]
Step 2: Scramble: [5 8 3 1 7 1]
Step 3: Mutate: [5 8 3 1 8 1]
Step 4: Scramble: [1 1 8 8 5 3]
Step 5: Produce child 1 by choosing the first three unique elements, reading from the left: [1 8 5]
Step 6: Produce child 2 by choosing the first three unique elements, reading from the right: [3 5 8]
The authors point out that this is an application that benefits from increasing the mutation rate in the late search phase in order to increase the degree of local search. The GA clustering method is compared with CLARA, a standard clustering method,112 and is shown to perform better, but at a greater computational cost. A compensating advantage is that the GA seems to continue to work well for data sets so large that CLARA starts to fail.
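A sketch of the concatenate/scramble/mutate operator walked through above follows, for chromosomes that must contain k unique cluster centers drawn from K elements. The per-gene mutation rate is an illustrative assumption, and a production version would need to guarantee that k unique genes always survive.

    # Clustering crossover sketch: two children from concatenation, shuffling,
    # mutation, and unique-element extraction from the left and the right.
    import random

    def cluster_crossover(p1, p2, K, mutation_rate=0.15):
        pool = p1 + p2
        random.shuffle(pool)                                 # scramble
        pool = [random.randrange(1, K + 1) if random.random() < mutation_rate
                else g for g in pool]                        # mutate
        random.shuffle(pool)                                 # scramble again
        k = len(p1)
        def first_unique(seq):
            out = []
            for g in seq:
                if g not in out:
                    out.append(g)
                if len(out) == k:
                    break
            return out
        return first_unique(pool), first_unique(pool[::-1])  # children: left, right

    c1, c2 = cluster_crossover([1, 3, 5], [8, 7, 1], K=9)
    print(c1, c2)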
Spectral Curve Fitting

Lucasius and co-workers113,114 have developed GA-based methods for fitting and manipulating spectral data. Their approach is first to assign spectral peaks. The fit to the spectrum is described by a set of peaks, each of which has a center, a height, a width, and a fractional mixture of Gaussian and Lorentzian character. In the simplest version, a GA is used to fit the data, with all parameters being determined by the GA. A refinement is to couple in an NN to determine the positions of the peaks first. A more difficult extension is to take a spectrum from a multicomponent system and determine the concentrations of the various species. Lucasius et al.115 compared their approach to SA and to stepwise elimination, both of which are standard methods in the field. The GA was shown to perform better than either of these, in both selectivity and accuracy. Another application by this group116 involves using the GA to design materials having a particular set of properties.
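The peak-parameterized model just described can be sketched as follows. Each peak carries a center, a height, a width, and a Gaussian/Lorentzian mixing fraction, and the GA fitness is the misfit to the measured spectrum. The pseudo-Voigt mixture form is a common choice assumed here, not necessarily the exact lineshape used in the cited work.

    # Spectrum-fitting fitness sketch with pseudo-Voigt peaks.
    import numpy as np

    def peak(x, center, height, width, frac_gauss):
        g = np.exp(-0.5 * ((x - center) / width) ** 2)    # Gaussian component
        l = 1.0 / (1.0 + ((x - center) / width) ** 2)     # Lorentzian component
        return height * (frac_gauss * g + (1.0 - frac_gauss) * l)

    def spectrum_misfit(chromosome, x, observed):
        # chromosome = flat list: (center, height, width, frac) per peak
        model = np.zeros_like(x)
        for i in range(0, len(chromosome), 4):
            model += peak(x, *chromosome[i:i + 4])
        return float(np.sum((observed - model) ** 2))

    x = np.linspace(0, 10, 200)
    observed = peak(x, 3.0, 1.0, 0.4, 0.7) + peak(x, 6.5, 0.6, 0.3, 0.2)
    print(spectrum_misfit([3.0, 1.0, 0.4, 0.7, 6.5, 0.6, 0.3, 0.2], x, observed))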
General Model Fitting
Hibbert117 examined several variants of the SGA for the problem of determining a set of kinetic rate constants for coupled reactions where the concentrations of all species are not known. This lack of full knowledge gives rise to multiple local minima. In particular, he compared the SGA with the SGA plus local minimization, and real-valued vs. binary-valued chromosomes. He also examined the use of “incest prevention” methods, which allow only dissimilar parents to mate and forbid duplicate children to be formed. His conclusions for this application are: (1) the SGA does not work well, (2) adding incest prevention makes the algorithm much more robust, and (3) adding local gradient minimization is very beneficial and not too expensive in this case. Leardi and co-workers118,119 use a GA to determine which of a set of features best explains a set of observations and to determine outliers in the data. Their SGA scheme was shown to outperform partial least squares. The tastiest aspect of this method was its application to the correlation of chemical composition with the age of provola cheese.
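The two incest-prevention ingredients are simple to express in code. A hedged sketch (Python; Hibbert's exact dissimilarity criterion may differ, and the threshold is an assumption):

def dissimilar_enough(parent1, parent2, min_distance):
    # Allow a mating only if the parent chromosomes differ in at
    # least min_distance positions (a Hamming-style criterion).
    return sum(g1 != g2 for g1, g2 in zip(parent1, parent2)) >= min_distance

def admit_child(child, population):
    # Forbid duplicate children, the second ingredient described above.
    return child not in population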
Potential Energy Functions
Rossi and Truhlar120 give a novel example of the use of GAs for parameterizing semiempirical potential energy surfaces. The object is to obtain a set of semiempirical parameters that reproduces a surface described by a small number of high-level ab initio data points. An SGA is used, yielding good fits.
SUMMARY AND COMPARISON WITH OTHER GLOBAL OPTIMIZATION METHODS
This section summarizes the comparisons that have been made between the GA and other global optimization methods. First, several points concerning the use of optimization methods are reiterated. Next, a brief description is given of the methods with which the GA has been compared. Finally, a summary of the comparison results is presented.
When approaching a new optimization problem, one should first try to understand as much as possible about the nature of the fitness landscape. Is it smooth? Does it have only one or a few local minima? Or, alternatively, is it very noisy, with many local minima? If it is smooth, with only one or a small number of minima, then some method other than the GA (probably a local rather than a global method) is called for. If there is some rough way to characterize different regions of the fitness landscape, then it may be best to devise a fitness function using that information. This will increase the efficiency of the GA as well as that of any other method that might be used.
Next, what is the size of the problem? If it has no more than about five degrees of freedom, many of the methods described below should do a good job and may do better than the GA. For higher dimensional problems, several of these methods often become too expensive to run at all. The GA is robust in that it will continue to work on high-dimensional problems, although the absolute performance, as measured by success at finding the global minimum, will probably decrease as the dimensionality increases.
Finally, what is the cost of the fitness function? If it is very expensive, then the GA is probably not a good choice as an optimization method, and one is better off finding some physical basis for deciding where in parameter space to use the small number of function evaluations that can be afforded.
Brief Overview of Other Global Search Methods
We give short descriptions of several global optimization methods that have been compared against the GA. For a more complete description of these, one should consult the references. Leach's review121 of conformational searching methods is also a good starting point. In all cases, we will illustrate the methods using a generic function f(x1, x2, . . . , xn) that we wish to minimize. Further assume that each of the parameters can lie only in a restricted range, for example, xi,min ≤ xi ≤ xi,max. The collection of parameters will be denoted by X.
Random Search
Here, many parameter sets X(1), X(2), X(3), . . . are chosen, and the corresponding function values are calculated. The lowest value found is then used as the best estimate of the global minimum. Typically the parameter sets are chosen to cover parameter space more or less uniformly.

Metropolis Monte Carlo122
In Metropolis Monte Carlo (MC), one follows a trajectory through parameter space, X(1) → X(2) → X(3) → . . . , generating a series of function values f(1) → f(2) → f(3) → . . . . A typical step proceeds as follows. One of the parameters is changed by an increment:

(x1, x2, . . . , xi, . . . , xn)(k) → (x1, x2, . . . , xi + δ, . . . , xn)(k+1)    [18]
The new function value f(k+1) is calculated and compared with the previously calculated value f(k). If f(k+1) ≤ f(k), the move is accepted. Otherwise, a random number 0 ≤ r ≤ 1 is generated. The move is still accepted if the following condition is met:

r ≤ exp{−[f(k+1) − f(k)]/T}    [19]

Otherwise, the move is rejected, and another random increment is added to X(k). The parameter T in Eq. [19] plays the role of a temperature. If T is large, then relatively large uphill moves will be accepted. If T is small, then only slight increases in the function value are allowed. Over the course of many steps, the function value will decrease.
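In code, one MC step amounts to little more than Eqs. [18] and [19]. A minimal Python sketch at fixed temperature (the parameter names and the uniform increment are our own choices):

import math
import random

def metropolis_step(x, f_x, f, step_size, T):
    # One Metropolis move: perturb one parameter (Eq. [18]) and accept
    # uphill moves with probability exp(-delta_f/T) (Eq. [19]).
    i = random.randrange(len(x))
    trial = list(x)
    trial[i] += random.uniform(-step_size, step_size)  # the increment delta
    f_trial = f(trial)
    if f_trial <= f_x or random.random() <= math.exp(-(f_trial - f_x) / T):
        return trial, f_trial          # move accepted
    return x, f_x                      # move rejected; try another increment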
Simulated Annealing123
Simulated annealing (SA) takes the basic MC method and adds a cooling schedule that slowly decreases the temperature. Typical cooling schedules keep the temperature constant for some number of steps, then decrease it by, for example, a constant fraction: T → αT, where α < 1. The initial temperature is often adjusted to produce an acceptance ratio (fraction of MC steps that are accepted) near 0.5. The effect of cooling is to allow broad searching initially, including jumping over barriers from one local minimum to another. Once the temperature gets sufficiently low, only downhill moves are accepted, so that the steps converge to the bottom of a local minimum. It has been proven that SA will find the global optimum of a function provided that it is run for an infinite amount of time, using a logarithmic cooling schedule. This is obviously impractical.
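In practice one simply runs a finite schedule. A minimal sketch built on the metropolis_step routine above, using the geometric cooling just described (the value of α and the step counts are illustrative assumptions, not recommendations):

def simulated_annealing(x0, f, step_size, t_initial=1.0, alpha=0.9,
                        steps_per_temperature=100, n_cycles=50):
    # Metropolis MC with a geometric cooling schedule T -> alpha*T.
    x, f_x = x0, f(x0)
    best, f_best = x, f_x
    T = t_initial
    for _ in range(n_cycles):
        for _ in range(steps_per_temperature):
            x, f_x = metropolis_step(x, f_x, f, step_size, T)
            if f_x < f_best:
                best, f_best = x, f_x
        T *= alpha   # cool: eventually only downhill moves are accepted
    return best, f_best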
Simplex124
The Nelder-Mead simplex method is not really a global search technique, but rather a local method that takes large steps and does not use gradients. Because of these properties, it can be used for global searches in certain cases. The basic method is illustrated in Figure 11.40

Figure 11 Examples of moves in the simplex optimization method for searching in a two-dimensional space.

A simplex is a set of n + 1 points in an n-dimensional space. This set is a triangle in two dimensions, a tetrahedron in three dimensions, etc. A search starts by placing a simplex in the
space by choosing n + 1 points. It is the user's responsibility to make the initial choice of points. The function is evaluated at each of the points, and the vertices are labeled in order of function value. The vertex with the lowest value is 1, the second lowest is 2, and so on. The vertex with the highest value is n + 1. The basic move is to reflect the simplex through the best plane (an (n − 1)-dimensional surface), which moves the search away from the worst point. If this move produces a new lowest value, then a so-called expansion step is taken. This expands the simplex in the same direction as was taken by the previous reflection step. If expansion yields a new lowest value, it is repeated. If not, then a contraction step can be taken. This step moves back toward the center of the original simplex from the first reflection point. Depending on the implementation, the search will proceed via a series of reflection, contraction, and expansion steps. It converges once the simplex finds itself in the bottom of a local well and contracts below a minimum size. A variant of the simplex method is parallel direct search (PDS), which runs many simplex searches simultaneously.41 It can also allow uphill steps as in simulated annealing.
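Because library implementations of the Nelder-Mead method are widely available, the "many starting simplexes" strategy can be sketched in a few lines of Python; here we use scipy.optimize.minimize, and the restart count and bounds handling are our own illustrative choices:

import numpy as np
from scipy.optimize import minimize

def multistart_simplex(f, bounds, n_starts=50, seed=0):
    # Crude global search: restart local Nelder-Mead from many random
    # points within the box and keep the best result found.
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)
        res = minimize(f, x0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best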
Systematic Search121
This class of methods attempts to search the entire parameter space in a quasi-uniform manner. If this is done, then within the resolution of the search, the global optimum is guaranteed to be found. These are also termed grid search methods because they superimpose a grid on the search space and sample only points on the grid. This approach suffers from the so-called combinatorial explosion problem because the number of points that have to be sampled grows as aⁿ, where a is a constant that increases with the resolution of the search, and n is the number of parameters. To increase the efficiency of the search, various tree-pruning algorithms are used. These are all based on the recognition that some values of certain parameters always yield bad function values, so entire rows of the high-dimensional search grid can be discarded. CSEARCH44 is a grid search method implemented in SYBYL.45
A newly developed technique resembling the grid search is the α-Branch and Bound (αBB) method of Maranas and Floudas.125 This method has not been compared with the GA so far, but it bears mentioning because it is guaranteed to find the global minimum of a function, given some mild restrictions. This ability is independent of the resolution of the original grid. Furthermore, the typical potentials used in molecular modeling meet these restrictions. αBB has been applied to the conformational search problem with promising results. The scaling behavior with problem size is unclear, however.

Cartesian Random Search
Saunders126 introduced a method that performs random molecular conformational searches in Cartesian rather than internal coordinate space. In this method, one starts with an initial conformation. Every atom in the molecule is given a
“kick” in a random direction, and then the conformational energy is minimized. This new structure is then kicked and minimized, and so on, until a large number of conformations have been generated. The lowest energy conformer found is then the best guess for the global minimum. The aim here is to sample more of conformational space, rather than to simply locate the global minimum.
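A sketch of the kick-and-minimize loop, assuming an energy function of flattened Cartesian coordinates and a generic local minimizer (the kick size and count are illustrative assumptions, not Saunders's values):

import numpy as np
from scipy.optimize import minimize

def kick_search(coords0, energy, kick_size=1.0, n_kicks=1000, seed=0):
    # Cartesian random ("kick") search: displace every coordinate by a
    # random amount, minimize, and continue from the new minimum.
    rng = np.random.default_rng(seed)
    x = np.asarray(coords0, dtype=float)   # flattened Cartesian coordinates
    best_x, best_e = x, energy(x)
    for _ in range(n_kicks):
        kicked = x + rng.uniform(-kick_size, kick_size, size=x.shape)
        res = minimize(energy, kicked)     # local minimization after the kick
        x = res.x                          # next kick starts from this minimum
        if res.fun < best_e:
            best_x, best_e = res.x, res.fun
    return best_x, best_e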
Summary of Comparison Between Genetic Algorithm and Other Methods
Several groups compared the GA with Metropolis MC or with SA on conformational searching problems. The uniform conclusion is that the SGA always outperforms straight SA.25,37,53,57,59,64,82,113,114 However, in several applications, a combination of the two proved to outperform either one individually.53,57,59 These hybrids either used the SA to perform local searching between GA generations or used crossover in a multiple-individual SA.
Wienke et al.127 compared the GA with several standard optimization techniques, including simulated annealing, grid search, simplex, and pattern search, along with local optimization methods, for several test problems. Their conclusion was that the GA consistently outperformed the other methods as measured by the fraction of runs that found the global optimum.
Simplex and the related PDS algorithm were compared against the GA for conformational searching applications.26,41 Both the simplex method (using many starting simplexes) and PDS performed about as well as the GA for small problems, but the GA appears to find lower energies more quickly for large problems. However, the difference in performance is small.
Clark et al.83 and Judson et al.15 compared the GA with the CSEARCH method for conformational searching. CSEARCH worked better than the GA for very small problems (fewer than eight dihedrals). For larger problems, both groups report that the GA outperforms CSEARCH and that the difference in performance grows with problem size. Clark et al.83 also compared the GA with distance geometry,85 directed tweak,86 and random search methods on the problem of finding good pharmacophoric matches in databases of thousands of molecules. Their final conclusion is that the GA and directed tweak outperform all of the other methods, and that directed tweak may in fact be preferable to the GA, as mentioned earlier in this chapter.
A considerable amount of work is being done in the ligand-protein docking area, and it is interesting to compare the functionality of the methods reported in the literature against the GA method reported here.
Table 2 A Comparison of Docking Methods Reported in the Literaturea

Method                         References
GA search                      68, 71, 72, 74, 75, 83
Brownian dynamics              128, 129
Systematic search              130
Monte Carlo43                  131
Annealed dynamics              132
Steric fitting (Dock)          69, 133-136
Molecular dynamics             137, 138
Misc. hybrid methods           139
De novo design                 140-142
Hand fit plus minimizationb    143
Distance geometry              144

aThis list illustrates some different strategies used to dock small molecules into proteins. A question mark indicates the method could have the ability indicated, but that the reference either did not use the ability or did not say it was used.
bThe ligand was placed by hand and then subjected to gradient minimization.

Table 2 lists each of the methods along with whether they include ligand flexibility, protein flexibility, and full orientational motion of the ligand. From the table we see that there are essentially four groups: (1) those that dock rigid ligands into rigid proteins; (2) those that dock flexible ligands into rigid proteins; (3) those
that dock flexible ligands into flexible proteins; and (4) those that perform conformational searching on ligands in a protein pocket but do not vary the overall translation and rotation of the ligand. The GA methods fall into categories (2) and (3).
Enough researchers have reported promising results using GAs on a wide variety of problems that it is safe to predict that the GA will become a standard search and optimization tool for computational chemists. A further validation of this view is the presence of GA-based methods in several commercial modeling packages. Good, simple-to-use, general purpose GA codes are available on the Internet (see Appendix 2), which make it straightforward for code developers to incorporate GA methods. Available parallel implementations offer chemists a convenient path to make use of high-performance parallel machines. However, as the examples discussed in this chapter show, no one variety of GA is sufficient for all applications.
APPENDIX 1. LITERATURE SOURCES
The references to chemical uses of the GA in this chapter are close to complete as of September 1995. However, the general GA-related literature is large and growing rapidly and has only been touched upon here. For further information on the GA field, there are several good books, listed below, as well as information available on the Internet.
Adaptation in Natural and Artificial Systems6 by John Holland. This book, originally published in 1975, marks the beginning of the GA field. However, it is heavy on formal presentation and is not very useful for a beginner.
Genetic Algorithms in Search, Optimization and Machine Learning9 by David Goldberg presents a good mix of theory, explanation, and applications.
Proceedings from the 3rd and 4th International Conferences on Genetic Algorithms14,145 include a wide range of papers dealing with both GA theory and applications.
Handbook of Genetic Algorithms,19 edited by Lawrence Davis, has an excellent tutorial on the basic GA method combined with chapters on a variety of applications. Davis's book Genetic Algorithms and Simulated Annealing146 also has a number of good examples of the use of GAs in a variety of contexts, but little information on simulated annealing, despite the title.
Genetic Programming12 by John Koza gives the theory and many applications of genetic programming. It also contains an insightful introduction to GAs.
Several research groups have set up home pages that can be accessed through the Internet. Most of these cross-reference one another, so we list only a few as sites for starting a search. An excellent place to start is Sandia's GAs in Chemistry home page, which has links to several other sites, including those listed below. In addition, it contains links to the source code directory for the programs used in this chapter. The URL is http://midway.ca.sandia.gov/~judson/.
The Genetic Algorithm Group (George Mason University)
http://www.cs.gmu.edu:80/research/gag/ The ILLiGAL Home page (University of Illinois)
ftp://gal4.ge.uiuc.edu/ and http://gal4.ge.uiuc.edu/
NCARAI Genetic Algorithms Archive (Navy Center for Applied Research in Artificial Intelligence)
http://www.aic.nrl.navy.mil:80/galist/
The usenet group comp.ai.genetic is devoted to discussion of GA research. The FAQ for this group, The Hitch-Hiker's Guide to Evolutionary Computation,147 can be accessed at
http://www.cis.ohio-state.edu/hypertext/faq/usenet/ai-faq/genetic/top.html.
There are two e-mail groups devoted to GAs. The first is galist, which you can join by sending a message to
[email protected]. The second is gamolecule, devoted to molecular applications of GAs. To join this, send a message to
[email protected]. Both of these are low-volume groups.
APPENDIX 2. PUBLIC DOMAIN GENETIC ALGORITHM CODES
Fortunately, there are several good public domain GA codes available on the Internet that allow you to quickly start doing your own calculations. This appendix lists a subset of the available codes and gives pointers to accessing them.
The largest repository is maintained by the Naval Research Laboratory (NRL), whose ftp server is ftp.aic.nrl.navy.mil. Source code is found in the directory /pub/galist/src/ga. Currently, there are about 15 different programs, ranging from the standard “simple” GA (program SGA), through the more robust package genesis (and its various offspring GENEsYs, GAucsd, dgenesis, and paragenesis), to LISP-based GA and genetic programming code. The INDEX file in this directory gives a short description of each of the included programs. The messy GA code (mGA) of Deb and Goldberg is available from the University of Illinois GA site (ILLiGAL) (gal4.ge.uiuc.edu:/pub/src/messyGA/C/). They also have a version of the simple GA. The programs used in the example for this chapter are made available at
midway.ca.sandia.gov:/pub/ga/src/.
ACKNOWLEDGMENTS
The author acknowledges support by the Department of Energy under contract DE-AC04-94AL85000.
REFERENCES
1. T. Schlick, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1992, Vol. 3, pp. 1-71. Optimization Methods in Computational Chemistry.
2. C. Darwin, The Origin of Species, 6th edit., P. F. Collier and Sons, New York, 1909.
3. E. Mayr, Introduction to Darwin's "Origin of Species," Harvard University Press, Cambridge, MA, 1964.
4. N. Eldredge, Reinventing Darwin-The Great Debate at the High Table of Evolutionary Theory, Wiley, New York, 1995.
5. J. H. Holland, Sci. Am., 267, July 1992, p. 66. Genetic Algorithms.
6. J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, MA, 1992.
7. I. Rechenberg, Evolutionsstrategie-Optimierung Technischer Systeme nach Prinzipien der Biologischen Evolution, Frommann-Holzboog, Stuttgart, 1973.
8. C. Branden and J. Tooze, Introduction to Protein Structure, Garland, New York, 1991.
9. D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.
10. K. A. DeJong, Doctoral Thesis, University of Michigan, 1976. An Analysis of the Behavior of a Class of Genetic Adaptive Systems.
11. D. E. Goldberg, K. Deb, and J. H. Clark, Complex Systems, 6, 333 (1992). Genetic Algorithms, Noise, and the Sizing of Populations.
12. J. Koza, Genetic Programming, MIT Press, Cambridge, MA, 1992.
13. G. J. E. Rawlins, Ed., Foundations of Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 1991.
14. R. K. Belew and L. B. Booker, Eds., Proceedings of the Fourth International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 1991.
15. R. S. Judson, E. P. Jaeger, A. M. Treasurywala, and M. L. Peterson, J. Comput. Chem., 14, 1407 (1993). Conformational Searching Methods for Small Molecules. II. Genetic Algorithm Approach.
16. J. E. Baker, in Proceedings of an International Conference on Genetic Algorithms and Their Applications, J. Greffenstette, Ed., Lawrence Erlbaum Associates, Cambridge, MA, 1985, pp. 101-111. Adaptive Selection Methods for Genetic Algorithms.
17. H. Mühlenbein, in Foundations of Genetic Algorithms, G. J. E. Rawlins, Ed., Morgan Kaufmann, San Mateo, CA, 1991, 316 pp. Evolution in Time and Space-The Parallel Genetic Algorithm.
18. H. Mühlenbein, M. Schomisch, and J. Born, Parall. Computing, 17, 619 (1991). A Parallel Genetic Algorithm as Function Optimizer.
19. L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.
20. G. Syswerda, in Proceedings of the Third International Conference on Genetic Algorithms, J. D. Schaffer, Ed., Morgan Kaufmann, San Mateo, CA, 1989, pp. 2-9. Uniform Crossover in Genetic Algorithms.
21. D. E. Goldberg and J. Richardson, in Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms, J. Greffenstette, Ed., Lawrence Erlbaum Associates, Cambridge, MA, 1987, pp. 41-49. Genetic Algorithms and Sharing for Multimodal Function Optimization.
22. R. S. Judson, J. Phys. Chem., 96, 10102 (1992). Teaching Polymers to Fold.
23. D. E. Goldberg, K. Deb, and B. Korb, in Proceedings of the Fourth International Conference on Genetic Algorithms, R. K. Belew and L. B. Booker, Eds., Morgan Kaufmann, San Mateo, CA, 1993, pp. 24-30. Don't Worry, Be Messy. D. E. Goldberg, K. Deb, H. Kargupta, and G. Harik, University of Illinois at Urbana-Champaign, Department of General Engineering, Illinois Genetic Algorithms Laboratory, Report No. 93004, 1993. Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms.
24. D. E. Goldberg, Complex Systems, 5, 139 (1991). Real-Coded Genetic Algorithms, Virtual Alphabets, and Blocking.
25. R. S. Judson, M. E. Colvin, J. C. Meza, A. Huffer, and D. Gutierrez, Int. J. Quantum Chem., 44, 277 (1992). Do Intelligent Configuration Search Techniques Outperform Random Search for Large Molecules?
26. R. S. Judson and H. Rabitz, Phys. Rev. Lett., 68, 1500 (1992). Teaching Lasers to Control Molecules.
27. R. A. Kendall, R. J. Harrison, R. J. Littlefield, and M. F. Guest, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 6, pp. 209-316. High Performance Computing in Computational Chemistry: Methods and Machines.
28. T. Bäck, F. Hoffmeister, and H.-P. Schwefel, in Proceedings of the Fourth International Conference on Genetic Algorithms, R. K. Belew and L. B. Booker, Eds., Morgan Kaufmann, San Mateo, CA, 1991, pp. 2-9. A Survey of Evolutionary Strategies. T. Bäck and H.-P. Schwefel, Evolutionary Computation, 1, 1 (1993). An Overview of Evolutionary Algorithms for Parameter Optimization.
29. K. Sigmund, Games of Life, Explorations in Ecology, Evolution, and Culture, Oxford University Press, Oxford, U.K., 1993.
30. M. Singer and P. Berg, Genes and Genomes, A Changing Perspective, University Science Books, Mill Valley, CA, 1991.
31. C. L. Karr, S.
K. Sharma, W. Hatcher, and T. R. Harper, in Modeling and Simulation of the Control of Hydrometallic Processes, Proceedings of an International Symposium, V. G. Papangelakis and G. P. Demopoulos, Eds., Canadian Institute of Mining, Metallurgy, and Petrology, Montreal, 1993, pp. 227-236. Fuzzy Logic and Genetic Algorithms for the Control of Exothermic Chemical Reactions.
32. I. P. Androulakis and V. Venkatasubramanian, Comput. Chem. Eng., 15, 217 (1991). A Genetic Algorithm Framework for Process Design and Optimization.
33. D. B. Hibbert, Chemometrics Intell. Lab. Syst., 19, 277 (1993). Genetic Algorithms in Chemistry.
34. C. B. Lucasius and G. Kateman, Chemometrics Intell. Lab. Syst., 19, 1 (1993). Understanding and Using Genetic Algorithms. Part 1. Concepts, Properties and Context.
35. Y. Xiao and D. E. Williams, Chem. Phys. Lett., 215, 17 (1993). Genetic Algorithm: A New Approach to the Prediction of the Structure of Molecular Clusters.
36. Y. Xiao and D. E. Williams, Comput. Chem., 18, 199 (1994). GAME: Genetic Algorithm for Minimization of Energy: An Interactive Program for Three Dimensional Intermolecular Interactions.
37. B. Hartke, J. Phys. Chem., 97, 9973 (1993). Global Geometry Optimization of Clusters Using Genetic Algorithms.
38. R. W. Smith, Comput. Phys. Commun., 71, 134 (1992). Energy Minimization in Binary Alloys via Genetic Algorithms.
39. J. Mestres and G. E. Scuseria, J. Comput. Chem., 16, 729 (1995). Genetic Algorithms: A Robust Scheme for Geometry Optimizations and Global Minimum Structure Problems.
40. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, 2nd edit., Cambridge University Press, Cambridge, U.K., 1992, pp. 408-412.
41. J. C. Meza and M. L. Martinez, J. Comput. Chem., 15, 627 (1994). Direct Search Methods for the Molecular Conformation Problem. J. C. Meza, R. S. Judson, T. R. Faulkner, and A. M. Treasurywala, J. Comput. Chem., 17, 1142 (1996). A Comparison of a Direct Search Method and a Genetic Algorithm for Conformational Searching.
42. J. J. Grefenstette and N. N. Schraudolph, Naval Research Laboratory, Washington, D.C., 1990. GENESIS 1.2ucsd.
43. F. Mohamadi, N. J. G. Richards, W. G. Guida, R. Liskamp, M. Lipton, C. Caufield, G. Chang, T. Hendrickson, and W. C. Still, J. Comput. Chem., 11, 440 (1990). MacroModel-An Integrated Software System for Modeling Organic and Bioinorganic Molecules Using Molecular Mechanics.
44. R. A. Dammkoehler, S. F. Karasek, E. F. B. Shands, and G. R. Marshall, J. Comput.-Aided Mol. Design, 3, 3 (1989). Constrained Search of Conformational Hypersurfaces.
45. SYBYL, Tripos, Inc., St. Louis, MO, 1991.
46. T. Brodmeier and E. Pretsch, J. Comput. Chem., 15, 588 (1994). Application of Genetic Algorithms in Molecular Modeling.
47. D. B. McGarrah and R. S. Judson, J. Comput. Chem., 14, 1385 (1993). An Analysis of the Genetic Algorithm Method of Molecular Conformation Determination.
48. W. P. Walters, M. T. Stahl, and D. P. Dolata, University of Arizona, Dept. of Chemistry, Tucson, AZ, 1995. Wizard III, A New 3D Model Building and Conformational Search Program.
49. W. P. Walters and D. P. Dolata, J. Mol. Graphics, 12, 130 (1994). MOUSE-III: Learning Rules of Conformational Analysis from X-Ray Data.
50. D. P. Dolata and W. P. Walters, J. Mol. Graphics, 11, 112 (1993). Short-Term Learning in Conformational Analysis.
51. D. P. Dolata and W. P. Walters, J. Mol. Graphics, 11, 106 (1993). MOUSE: A Teachable Program for Learning in Conformational Analysis.
52. P. Tuffery, C. Etchebest, S. Hazout, and R. Lavery, J. Biomol. Struct. Dynam., 8, 1267 (1991). A New Approach to the Rapid Determination of Protein Side Chain Conformations.
53. P. Tuffery, C. Etchebest, S. Hazout, and R. Lavery, J. Comput. Chem., 14, 790 (1993). A
Critical Comparison of Search Algorithms Applied to the Optimization of Protein Side Chain Conformations.
54. R. Judson, Sandia National Laboratories, Livermore, CA, 1996, Report No. SAND9585785. Computational Evolution of a Model Polymer That Folds to a Specified Target Conformation.
55. T. Dandekar and P. Argos, Protein Eng., 5, 637 (1992). Potential of Genetic Algorithms in Protein Folding and Protein Engineering Simulations.
56. T. Dandekar and P. Argos, J. Mol. Biol., 236, 844 (1994). Folding the Main Chain of Small Proteins with the Genetic Algorithm.
57. R. Unger and J. Moult, J. Mol. Biol., 231, 75 (1993). Genetic Algorithms for Protein Folding Simulations.
58. K. F. Lau and K. A. Dill, Proc. Natl. Acad. Sci. USA, 87, 638 (1990). Theory for Protein Mutability and Biogenesis.
59. R. Unger and J. Moult, in Computer Aided Innovation of New Materials, M. Doyama, J. Kihara, M. Tanaka, and R. Yamamoto, Eds., Elsevier Science Publishers B.V., New York, 1993, pp. 1283-1286. Effects of Mutations on the Performance of Genetic Algorithms Suitable for Protein Folding Simulations.
60. S. Sun, Protein Sci., 2, 762 (1993). Reduced Representation Model of Protein Structure Prediction: Statistical Potential and Genetic Algorithms.
61. S. LeGrand and K. Merz, J. Global Opt., 3, 49 (1993). The Application of the Genetic Algorithm to the Minimization of Potential Energy Functions.
62. S. Weiner, P. Kollman, D. Nguyen, and D. Case, J. Comput. Chem., 7, 230 (1986). An All Atom Force Field for Simulations of Proteins and Nucleic Acids.
63. D. T. Jones, Protein Sci., 3, 567 (1994). De Novo Protein Design Using Pairwise Potentials and a Genetic Algorithm.
64. J. R. Gunn, A. Monge, R. A. Friesner, and C. H. Marshall, J. Phys. Chem., 98, 702 (1994). Hierarchical Algorithm for Computer Modeling of Protein Tertiary Structure: Folding of Myoglobin to 6.2 Å Resolution.
65. C. S. Ring and F. E. Cohen, Isr. J. Chem., 34, 245 (1994). Conformational Sampling of Loop Structures Using Genetic Algorithms.
66. L. D. Merkle, G. H. Gates, G. B. Lamont, and R. R. Pachter, in Proceedings of the Intel Supercomputer Users' Group 1994 Annual North American Conference, J. Wold, Ed., Intel Supercomputer Systems Division, Beaverton, OR, 1994, pp. 189-195. Application of the Parallel Fast Messy Genetic Algorithm to the Protein Folding Problem.
67. F. Herrmann and S. Suhai, J. Comput. Chem., 16, 1434 (1995). Energy Minimization of Peptide Analogues Using Genetic Algorithms. F. Herrmann and S. Suhai, in Computational Methods in Genome Research, S. Suhai, Ed., Plenum Press, New York, 1994, pp. 173-190. Genetic Algorithms in Protein Structure Prediction.
68. J. S. Dixon, in Trends in QSAR for Molecular Modeling '92, Proceedings of the European Symposium on Structure-Activity Relationships, C. G. Wermuth, Ed., ESCOM, Leiden, The Netherlands, 1992, pp. 312-313. Flexible Docking of Ligands to Receptor Sites Using Genetic Algorithms.
69. I. D. Kuntz, J. M. Blaney, S. J. Oatley, R. Langridge, and T. E. Ferrin, J. Mol. Biol., 161, 269 (1982). A Geometric Approach to Macromolecule-Ligand Interactions.
70. C. M. Oshiro, I. D. Kuntz, and J. S. Dixon, J. Comput.-Aided Mol. Design, 9, 113 (1995). Flexible Ligand Docking Using a Genetic Algorithm.
71. R. S. Judson, E. P. Jaeger, and A. M. Treasurywala, J. Mol. Struct. (THEOCHEM), 308, 191 (1994). A Genetic Algorithm-Based Method for Docking Flexible Molecules.
72. R. S. Judson, Y. T. Tan, E. Mori, C. F. Melius, E. P. Jaeger, A. M. Treasurywala, and A. Mathiowetz, J. Comput. Chem., 16, 1405 (1995). Docking Flexible Molecules: A Case Study of 3 Proteins.
73. K. P. Clark and Ajay, J. Comput. Chem., 16, 1210 (1995). Flexible Ligand Docking Without Parameter Adjustment Across Four Ligand-Receptor Complexes.
74. D. K. Gehlhaar, G. M. Verkhivker, P. A. Rejto, C. J. Sherman, D. B. Fogel, L. J. Fogel, and S. T. Freer, Chem. Biol., 2, 317 (1995). Molecular Recognition of the Inhibitor AG-1343 by HIV-1 Protease: Conformationally Flexible Docking by Evolutionary Programming.
75. G. Jones, P. Willett, and R. C. Glen, J. Mol. Biol., 245, 43 (1995). Molecular Recognition of Receptor Sites Using a Genetic Algorithm with a Description of Desolvation.
76. C. B. Lucasius, M. J. J. Blommers, L. M. C. Buydens, and G. Kateman, in Handbook of Genetic Algorithms, L. Davis, Ed., Van Nostrand Reinhold, New York, 1991, pp. 251-281. A Genetic Algorithm for Conformational Analysis of DNA.
77. H. Ogata, Y. Akiyama, and M. Kanehisa, Genome Information Service, 4, 270 (1993). A Computer Modeling Method for the Three-Dimensional Structure of RNA.
78. P. Schuster, Springer Ser. Synerget., 44, 101 (1989). Optimization and Complexity in Molecular Biology and Physics.
79. M. J. J. Blommers, C. B. Lucasius, G. Kateman, and R. Kaptein, Biopolymers, 32, 45 (1992). Conformational Analysis of a Dinucleotide Photodimer with the Aid of the Genetic Algorithm.
80. J. Chang and M. Lewis, Acta Crystallogr., Sect. D, 50, 667 (1994). Using Genetic Algorithms for Solving Heavy-Atom Sites.
81. M. J. Buerger, Contemporary Crystallography, McGraw-Hill, New York, 1970.
82. A. W. R. Payne and R. C. Glen, J. Mol. Graphics, 11, 74 (1993). Molecular Recognition Using a Binary Genetic Search Algorithm.
83. D. E. Clark, G. Jones, P. Willett, P. W. Kenny, and R. C. Glen, J. Chem. Inf. Comput. Sci., 34, 197 (1994). Pharmacophoric Pattern Matching in Files of Three-Dimensional Chemical Structures: Comparison of Conformational-Searching Algorithms for Flexible Molecules.
84. J. M. Blaney and J. S. Dixon, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 299-335. Distance Geometry in Molecular Modeling.
85. J. M. Blaney, G. M. Crippen, A. Dearing, and J. S. Dixon, Quantum Chemistry Program Exchange, Indiana University, Bloomington, IN, 1988. DGEOM: Distance Geometry.
86. P. S. Shenkin, D. L. Yarmush, R. M. Fine, H. Wang, and C. Levinthal, Biopolymers, 26, 2053 (1987). Predicting Antibody Hypervariable Loop Conformation: I. Ensembles of Random Conformations for Ringlike Structures.
87. A. C. W. May and M. S. Johnson, Protein Eng., 7, 475 (1994). Protein Structure Comparisons Using a Combination of a Genetic Algorithm, Dynamic Programming and Least-Squares Minimization.
88. M. L. Fredman, Bull. Math. Biol., 46, 553 (1984). Algorithms for Computing Evolutionary Similarity Measures with Length Independent Gap Penalties.
89. S. B. Needleman and C. Wunsch, J. Mol. Biol., 48, 444 (1970). A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins.
90. D. E. Walters and R. M. Hinds, J. Med. Chem., 37, 2527 (1994). Genetically Evolved Receptor Models: A Computational Approach to Construction of Receptor Models.
91. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comput. Chem., 4, 187 (1983). CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations.
92. E. Fontain, Anal. Chim. Acta, 265, 227 (1992). The Problem of Atom-to-Atom Mapping. An Application of Genetic Algorithms.
93. E. Fontain, J. Chem. Inf. Comput. Sci., 32, 748 (1992). Application of Genetic Algorithms in the Field of Constitutional Similarity.
94. C. Jochum, J. Gasteiger, and I. Ugi, Angew. Chem. Int. Ed. Engl., 19, 495 (1980). The Principle of Minimum Chemical Distance (PMCD).
95. J. Dugundji and I. Ugi, Top. Curr. Chem., 39, 19 (1973). An Algebraic Model of Constitutional Chemistry as a Basis for Chemical Computer Programs.
96. R. D. Brown, G. Jones, P. Willett, and R. C. Glen, J. Chem. Inf. Comput. Sci., 34, 63 (1994). Matching Two-Dimensional Chemical Graphs Using Genetic Algorithms.
97. R. D. Brown, G. M. Downs, G. Jones, and P. Willett, J. Chem. Inf. Comput. Sci., 34, 47 (1994). Hyperstructure Model for Chemical Structure Handling: Techniques for Substructure Searching.
98. G. Vladutz and S. R. Gould, in The International Language of Chemistry, W. A. Warr, Ed., Springer Verlag, Berlin, 1988, pp. 371-384. Joint Compound/Reaction Storage and Retrieval and Possibilities of a Hyperstructure-Based Solution.
99. D. B. Hibbert, Chemometrics Intell. Lab. Syst., 20, 35 (1993). Generation and Display of Chemical Structures by Genetic Algorithms.
100. D. Rogers and A. J. Hopfinger, J. Chem. Inf. Comput. Sci., 34, 854 (1994). Application of Genetic Function Approximation to Quantitative Structure-Activity Relationships and Quantitative Structure-Property Relationships.
101. J. Friedman, Laboratory of Computational Statistics, Department of Statistics, Stanford University, Report No. 102, 1990. Multivariate Adaptive Regression Splines.
102. V. Venkatasubramanian and J. M. Caruthers, Comput. Chem. Eng., 18, 833 (1994). Computer-Aided Molecular Design Using Genetic Algorithms.
103. D. Weininger, A. Weininger, and W. L. Weininger, J. Chem. Inf. Comput. Sci., 29, 97 (1989). SMILES 2: Algorithm for Generation of Unique Smiles Notation.
104. M. J. Cinkosky, J. W. Fickett, W. M. Barber, M. A. Bridgers, and C. D. Troup, Los Alamos Science, 20, 267 (1992). SIGMA: System for Integrated Genome Map Assembly.
105. G. Schneider and P. Wrede, J. Mol. Evol., 36, 586 (1993). Development of Artificial Neural Filters for Pattern Recognition in Protein Sequences.
106. G. Schneider and P. Wrede, Biophys. J., 66, 335 (1994). The Rational Design of Amino Acid Sequences by Artificial Neural Networks and Simulated Molecular Evolution: De Novo Design of an Idealized Leader Peptidase Cleavage Site.
107. G. Füllen and D. C. Youvan, Complexity International (http://www.csu.edu.au/ci/vol1/fullen/REM.html), 1, unpaged (1995). Genetic Algorithms and Recursive Ensemble Mutagenesis in Protein Engineering.
108. V. R. Vemuri and W. Cedeño, in Genetic Algorithms and Applications, L. Chambers, Ed., CRC Press, Boca Raton, Florida, 1995, Vol. 2, pp. 5-29. Multi-Niche Crowding for Multimodal Search.
109. S. W. Mahfoud, in Proceedings of Parallel Problem Solving from Nature 2, R. Männer and B. Manderick, Eds., Elsevier Science Publishers B.V., Amsterdam, 1992, pp. 27-36. Crowding and Preselection Revisited.
110. A. Ishikawa, T. Toya, Y. Totoki, and A. Konagaya, Institute for New Generation Computer Technology, Tokyo, Report No. ICOT TR-0849, 1993. Parallel Iterative Aligner with Genetic Algorithm.
111. C. B. Lucasius, A. D. Dane, and G. Kateman, Anal. Chim. Acta, 282, 647 (1993). On k-Medoid Clustering of Large Data Sets with the Aid of a Genetic Algorithm: Background, Feasibility and Comparison.
112. L. Kaufman and P. J. Rousseeuw, Finding Groups in Data. An Introduction to Cluster Analysis, Wiley, Chichester, U.K., 1990.
113. C. B. Lucasius and G. Kateman, Trends Anal. Chem., 10, 254 (1991). Genetic Algorithms for Large Scale Optimization in Chemometrics: An Application.
114. C. B. Lucasius, A. P. d. Weijer, L. M. C. Buydens, and G. Kateman, Chemometrics Intell. Lab. Syst., 19, 337 (1993). CFIT: A Genetic Algorithm for Survival of the Fitting.
115. C. B. Lucasius, M. L. M. Beckers, and G. Kateman, Anal. Chim. Acta, 286, 135 (1994). Genetic Algorithms in Wavelength Selection: A Comparative Study.
116. A. P. d. Weijer, C. B. Lucasius, L. Buydens, and G. Kateman, Chemometrics Intell. Lab. Syst., 20, 45 (1993). Using Genetic Algorithms for an Artificial Neural Network Model Inversion.
117. D. B. Hibbert, Chemometrics Intell. Lab. Syst., 19, 319 (1993). A Hybrid Genetic Algorithm for the Estimation of Kinetic Parameters.
118. R. Leardi, R. Boggia, and M. Terrile, J. Chemometrics, 6, 267 (1992). Genetic Algorithms as a Strategy for Feature Selection.
119. R. Leardi, J. Chemometrics, 8, 65 (1994). Application of a Genetic Algorithm to Feature Selection Under Full Validation Conditions and to Outlier Detection.
120. I. Rossi and D. G. Truhlar, Chem. Phys. Lett., 233, 231 (1995). Parameterization of NDDO Wavefunctions Using Genetic Algorithms. An Evolutionary Approach to Parameterizing Potential Energy Surfaces and Direct Dynamics for Organic Reactions.
121. A. R. Leach, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 1-55. A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules.
122. N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, J. Chem. Phys., 21, 1087 (1953). Equation of State Calculations by Fast Computing Machines.
123. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Science, 220, 671 (1983). Optimization by Simulated Annealing.
124. J. A. Nelder and R. Mead, Computer J., 7, 308 (1965). A Simplex Method for Function Minimization.
125. C. D. Maranas and C. A. Floudas, J. Chem. Phys., 100, 1247 (1994). A Deterministic Global Optimization Approach for Molecular Structure Determination.
126. M. Saunders, J. Am. Chem. Soc., 109, 3150 (1987). Stochastic Exploration of Molecular Mechanics Energy Surfaces, Hunting for the Global Minimum.
127. D. Wienke, C. B. Lucasius, and G. Kateman, Anal. Chim. Acta, 265, 211 (1992). Multicriteria Target Vector Optimization of Analytical Procedures Using a Genetic Algorithm.
128. S. A. Allison, S. H. Northrup, and J. A. McCammon, J. Chem. Phys., 83, 2894 (1985). Extended Brownian Dynamics of Diffusion Controlled Reactions.
129. S. H. Northrup and H. P. Erickson, Proc. Natl. Acad. Sci. USA, 89, 3338 (1992). Kinetics of Protein-Protein Association Explained by Brownian Dynamics Computer Simulation.
130. D. Meyer, C. B. Naylor, I. Motoc, and G. R. Marshall, J. Comput.-Aided Mol. Design, 1, 3 (1987). Docking by Systematic Search.
131. W. C. Guida, R. S. Bohacek, and M. D. Erion, J. Comput. Chem., 13, 214 (1992). Probing the Conformational Space Available to Inhibitors in the Thermolysin Active Site Using Monte Carlo/Energy Minimization Techniques.
132. D. S. Goodsell and A. J. Olson, Proteins: Struct. Funct. Genet., 8, 195 (1990). Automated Docking of Substrates to Proteins by Simulated Annealing.
133. I. D. Kuntz, Science, 257, 1078 (1992). Structure-Based Strategies for Drug Design and Discovery.
134. E. C. Meng, B. K. Shoichet, and I. D. Kuntz, J. Comput. Chem., 13, 505 (1992). Automated Docking with Grid-Based Energy Evaluation.
135. B. K. Shoichet and I. D. Kuntz, J. Mol. Biol., 221, 327 (1991). Protein Docking and Complementarity.
136. B. K. Shoichet, R. M. Stroud, D. V. Santi, I. D. Kuntz, and K. M. Perry, Science, 259, 1445 (1993). Structure-Based Discovery of Inhibitors of Thymidylate Synthase.
137. L. Banci, S. Schroder, and P. A. Kollman, Proteins: Struct. Funct. Genet., 13, 288 (1992). Molecular Dynamics Characterization of the Active Cavity of Carboxypeptidase-A and Some of Its Inhibitor Adducts.
138. B. L. Stoddard and D. E. Koshland, Proc. Natl. Acad. Sci. USA, 90, 1146 (1993). Molecular Recognition Analyzed by Docking Simulations-The Aspartate Receptor and Isocitrate Dehydrogenase from Escherichia coli.
139. C. A. Freeman, C. R. A. Catlow, J. M. Thomas, and S. Brode, Chem. Phys. Lett., 186, 137 (1991). Computing the Location and Energetics of Organic Molecules in Microporous Adsorbates and Catalysts: A Hybrid Approach Applied to Isomeric Butenes in a Model Zeolite.
140. A. R. Leach and I. D. Kuntz, J. Comput. Chem., 13, 730 (1992). Conformational Analysis of Flexible Ligands in Macromolecular Receptor Sites.
141. J. B. Moon and W. J. Howe, Tetrahedron Comput. Method, 3, 697 (1990). 3D Database Searching and De Novo Construction Methods in Molecular Design.
142. S. H. Rotstein and M. A. Murcko, J. Comput.-Aided Mol. Design, 7, 23 (1993). GENSTAR-A Method for De Novo Drug Design.
143. S. Gallion and D. Ringe, Protein Eng., 5, 291 (1992). Molecular Modeling Studies of the Complex Between Cyclophilin and Cyclosporin-A.
144. J. M. Blaney and J. S. Dixon, Ann. Rep. Med. Chem., 26, 281 (1991). Receptor Modeling by Distance Geometry.
145. J. D. Schaffer, Ed., Proceedings of the Third International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 1989.
146. L. Davis, Ed., Genetic Algorithms and Simulated Annealing, Pitman Publishing, London, 1987.
147. J. Heitkötter and D. Beasley, USENET: comp.ai.genetic, 1994. Available by anonymous FTP from rtfm.mit.edu:/pub/usenet.news.answers/ai-faq/genetic/. The Hitch-Hiker's Guide to Evolutionary Computation: A List of Frequently Asked Questions (FAQ).
CHAPTER 2
Does Combinatorial Chemistry Obviate Computer-Aided Drug Design?
Eric J. Martin, David C. Spellmeyer, Roger E. Critchlow Jr., and Jeffrey M. Blaney
Chiron Corporation, 4560 Horton Street, Emeryville, California 94608
INTRODUCTION
Recent advances in high-throughput screening and solid-phase organic synthesis have led to the exciting new field of combinatorial drug discovery.1 Automated modular reaction schemes yield “libraries” of compounds, incorporating all combinations from a basis set of substituents in each of several “diversity sites” around a central “scaffold.” Some researchers decry this technology as regression from the modern age of rational drug design back to irrational mass screening. Yet, mass screening should not imply a random or irrational approach. Many successful techniques for rational design of individual molecules can be extended to rational design of combinatorial libraries, and the extensive information obtained from screening libraries of carefully designed compounds can benefit rational drug design.
Combinatorial libraries achieve chemical diversity by employing a wide variety of substituents from readily available starting materials such as amines,
halides, aldehydes, or esters.2 Parallel synthesis techniques enable the synthesis and screening of thousands of compounds per week. Yet, a typical small-molecule combinatorial synthetic scheme might permit any of 1000 “candidate” substituents to be introduced at each of, say, four sites, for a total of 10¹² potential compounds. At a proficient rate of 20,000 compounds per week, 1 million years would be required to synthesize this entire library. One might want, therefore, to make a smaller library, perhaps of just 160,000 compounds from 20 substituents at each of the four sites. Experimental design approaches can help select such a small subset from the enormous potential pool that will still uncover as many interesting leads as possible.
Library design strategies depend on how well the target's pharmacophoric requirements are understood. Libraries for a well-characterized receptor aim to eliminate unwanted diversity, so only substituents meeting the known requirements for that particular target are included. Broad-screening libraries, instead, maximize diversity to minimize redundancy. Combined strategies mix target-specific and diverse side chains, to present the pharmacophoric fragments in many geometries and chemical environments.
A preview of the overall library design strategy is illustrated in Figure 1. Suitable reagents for the combinatorial reaction scheme are identified from a database of available chemicals. Computed properties for each structure generate coordinates to locate each candidate substituent in “property space.” Statistical experimental design methods are used, either to identify reagents dispersed in property space for the creation of nonredundant broad-screening libraries, or to find biased sets that heavily sample special pharmacophoric combinations of properties, and less thoroughly sample the remaining space, for target-focused libraries.
Of course, the application of experimental design to pharmaceutical problems is not new. It has been used for the design of structure-activity relationship (SAR) compound sets,3-5 for optimizing synthetic processes,6,7 in analytical chemistry,8 and for selecting screening subsets from corporate chemical archives.9 This chapter focuses only on the experimental design of combinatorial small-molecule libraries, such as those yielding suitable leads for orally bioavailable drugs.
Figure 1 An overview of the experimental design of combinatorial libraries.
FRAGMENTS VS. WHOLE MOLECULES
An immediate practical decision in any combinatorial library design effort is whether to calculate properties of whole molecules, or to divide the molecules into a scaffold and substituents, performing calculations on only the fragments. The obvious disadvantage of working with fragments is that in reality they are not isolated and they might interact. A design based on fragment properties does not explicitly account for these effects. However, as a combinatorial library includes every combination of substituents, it does form a full factorial design to characterize interactions. The disadvantage of the whole-molecule alternative is limited computer resources. In the example given, 10¹² compounds could be made from 1000 candidate substituents at each of the four sites. Allowing 10 days to compute properties for 10¹² whole molecules in the library allows only 1 μs per molecule, insufficient time for any but the crudest computation. In contrast, working with 4000 substituents and allowing 1 day for property calculations still allocates about 20 seconds per fragment, which is enough time for fairly sophisticated calculations. Furthermore, the combinatorial library in the example is not the most diverse set of 160,000 compounds possible from the 10¹² potential molecules, but rather is the most diverse set possible from four sets of 20 substituents. It is easy to compute the most diverse subsets of 20 fragments from sets of 1000, but to compute the most diverse combinatorial subset from 10¹² whole molecules is harder. Thus, both fragment-based and whole-molecule-based approaches require approximations. The fragment approach assumes that a diverse set of substituents will yield diverse libraries. The whole-molecule approach is very slow and also assumes that similarity can be adequately characterized by very crude computations. This chapter focuses on the fragment-based approach, calculating properties on the substituents only, the site of attachment being replaced by a dummy atom.
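The time-budget arithmetic behind these numbers is easy to verify; a Python back-of-the-envelope check (ours, not part of the original analysis):

# Time budgets implied by the fragment vs. whole-molecule arithmetic above.
seconds_per_day = 86_400

whole_molecules = 1000 ** 4             # 10**12 candidate products
budget = 10 * seconds_per_day           # ten days of compute
print(budget / whole_molecules)         # ~0.9 microseconds per molecule

fragments = 4 * 1000                    # 4000 candidate substituents
budget = 1 * seconds_per_day            # one day of compute
print(budget / fragments)               # ~21.6 seconds per fragment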
SIMILARITY AND “PROPERTY SPACE”
The experimental design approach requires use of biologically relevant descriptors to measure similarity between molecules or molecular fragments. The ideal similarity measure would be easy to compute for any molecular
structure and, for any biological target, should yield high similarity scores between compounds with similar biological activity and low scores between compounds with dissimilar activity. Because biological activity is difficult to calculate directly, molecular properties related to biological response are substituted. A biologically relevant similarity measure might at least include information on molecular shape, bulk physical properties such as membrane permeability or pKa, chemical functional groups, metabolic stability, and the geometrical arrangement of potential sites of receptor interaction such as charge and H-bonding sites, as well as the location of aromatic or hydrophobic regions. Compounds that are similar in all of these molecular properties might be anticipated to have similar biological activity. Unfortunately, many of these obviously important properties, such as the “shape” of a flexible molecule, are difficult to define, let alone compute.
The previous discussion subtly shifted between molecular similarity and molecular properties. It is important to elucidate the relationship between the two. If each of the molecular properties can be treated as a separate dimension in a Euclidean “property space,” and dissimilarity can be equated with distance between property vectors, similarity/diversity problems can be solved using analytical geometry. A set of vectors (chemical structures) in property space can be converted to a matrix of pairwise dissimilarities simply by applying the Pythagorean theorem. This operation is like measuring the distances between all pairs of cities from their coordinates on a map. The reverse calculation, going from a similarity matrix to property space, is known as multidimensional scaling (MDS). It is more difficult and corresponds to trying to draw a map (i.e., assign coordinates to each city) given only the distances between each pair of cities. MDS assigns coordinate vectors to each molecule in the minimum number of “dimensions” needed to reproduce the distances (i.e., dissimilarities) within a specified error (see below). These dimensions constitute a “latent” property space, implied by the similarity relationships between all pairs of molecules.
Exact “classical” MDS is possible only if the similarity measure is a “metric,” that is, if it satisfies positivity, symmetry, and triangle inequality requirements (so that the matrix of cosines is positive semidefinite).10 In practice, however, approximate MDS is often possible even with similarity measures, such as Tanimoto's coefficient,11 that fail the triangle inequality. In these cases, one starts with the classical solution (in some number of dimensions), then numerically optimizes the coordinates of the molecules to better match the pairwise distances in the original dissimilarity matrix. This gives the Euclidean property space that best reproduces the given non-Euclidean similarities. The numerical refinement is computationally much more costly than the initial classical embedding. Nevertheless, Tanimoto similarity is frequently preferred for chemical problems because it is normalized for the complexity of the molecules. For example, this way, the dissimilarity between aminomethane and methanol is greater than, say, that between an amino-steroid and the corresponding hydroxy-steroid.
Readers with experience in chemometrics will have noticed that, like principal components analysis (PCA), MDS is a dimensionality reduction method. For each molecule, a large number of attributes (similarity to each other molecule) is reduced to a much smaller number of coordinates in an abstract property space, which reproduce the original data within an established error. The pertinent difference is that PCA uses the matrix of correlations between a set of (redundant) properties, which are usually obtained from a table of those properties for an initial set of molecules. In contrast, MDS uses a matrix of similarities between each pair of molecules (or substituents). Given a table of redundant properties, one could calculate dissimilarities as Euclidean distances and use MDS instead of PCA. Whereas the results would be similar, this is usually wasteful, because the number of molecules is typically much greater than the number of properties.10 However, if the similarities are best calculated from a nonlinear function of the properties, such as Tanimoto coefficients computed from two bit strings, the results would not be similar and nonlinear MDS should be used. One then gets back a set of latent properties (dimensions) for which Euclidean distance approximates the desired similarities. Thus, a simple rule of thumb is: For redundant properties as input use PCA, for metric similarities as input use classical MDS, for nonmetric similarities use numerically refined MDS.
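For readers who want to experiment, classical MDS is short to implement. The sketch below (Python/NumPy; our own minimal version, not the codes used in the work described here) double-centers the squared dissimilarity matrix and embeds with the leading eigenvectors, giving the starting coordinates that a numerical refinement would then polish for nonmetric input.

import numpy as np

def classical_mds(D, n_dims):
    # D is an n x n symmetric matrix of pairwise dissimilarities.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # matrix of inner products
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:n_dims]   # largest eigenvalues first
    pos = np.clip(evals[order], 0.0, None)     # guard small negative values
    return evecs[:, order] * np.sqrt(pos)      # coordinates in latent space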
PROPERTIES
Martin et al. typically use about 18 substituent properties: lipophilicity, about five shape descriptors, about seven chemical functionality descriptors, and about five receptor interaction descriptors.12 These properties were selected because they are readily calculated for virtually any chemical fragment, while still capturing many of the important effects mentioned above.
Lipophilicity is characterized by the estimated octanol/water partition coefficient (log Kow), calculated using the CLOGP,13,14 PROLOGP,15 LOGKOW,16 and HINT17 programs, or it is estimated by comparison to experimental values for analogous compounds in the Pomona95 database.18 Topological indices,19 calculated with the program MOLCONN-X,20 characterize the side chain's shape, flexibility, branching, arrangement of cycles, etc. These indices have frequently been used to measure two-dimensional shape similarity.21 Following a well-established approach of Basak et al.,22 Martin et al.12 calculated a large number of descriptors: 70 connectivity indices; seven shape indices; the phi flexibility descriptor; molecular weight; and the number of elements, non-hydrogen atoms, and bonds. These 82 indices are then reduced with PCA (see below). Five or six principal components (PCs) are typically retained, explaining a total of about 85-90% of the variance.
“Chemical functionality descriptors” were developed to create a set of
properties that together form a low-dimensional space wherein the distance between vectors represents the similarity in chemical functionality between corresponding substituents. The descriptors are computed from chemical database search keys, which enumerate the various structural fragments in a molecule. The Daylight “fingerprint” routines23 are used to search each substituent for all substructures up to seven bonds long and to set a bit in a 2048-bit string corresponding to each fragment found. The Tanimoto coefficient for comparing bit strings, which has often been applied to chemical fingerprints for database similarity searching and clustering,11 is used to calculate a dissimilarity matrix. MDS then finds the set of low-dimensional Cartesian coordinates for every substituent that best reproduces, simultaneously, the entire set of intersubstituent similarities (see above). PCA is applied after MDS to align the axes by variance. For a set of 1133 carboxylic acid derived substituents, MDS reduced the 2048-bit fingerprints to just seven continuous variables that reproduce all 642,000 pairwise dissimilarities with a relative standard deviation of just 10%. The calculations required 7 hours on an IBM RS/6000 580 computer using SAS PROC MDS.24 Spellmeyer has written a specialized MDS code that reproduced this result in 11 minutes on the same machine.25 He was able to reduce a set of 3984 amines to 10 dimensions in 4.4 hours. The example in Figure 2 illustrates these steps for creating a five-dimensional property space from a set of 721 amine-derived “peptoid” side chains.
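A minimal sketch of the fingerprint-to-dissimilarity step (Python; fingerprints are represented here as sets of “on” bits, a simplification of the Daylight bit strings, and the result can be fed to the classical_mds routine sketched earlier):

import numpy as np

def tanimoto_dissimilarity(fps):
    # fps is a list of fingerprints, each a set of set-bit positions.
    n = len(fps)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            inter = len(fps[i] & fps[j])       # bits on in both (AND)
            union = len(fps[i] | fps[j])       # bits on in either (OR)
            sim = inter / union if union else 1.0
            D[i, j] = D[j, i] = 1.0 - sim      # Tanimoto dissimilarity
    return D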
[Figure 2 shows five example amine side chains (A-E) under the heading "Chemical Functionality" Descriptors, with their computed coordinates in the five-dimensional property space, approximately:

      D1     D2     D3     D4     D5
A    0.90  -0.81   0.65   0.57  -0.12
B    0.88   1.12   0.91   0.58  -0.09
C    1.03  -0.11   0.99   0.44  -0.14
D    1.00  -0.64   0.63   0.55  -0.12
E    0.79   1.12   1.09   0.39   0.06

together with a plot in which aromatic amines cluster apart from aliphatic amines.]
Figure 2 Converting substituent structures to vectors in “property space.”
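The fingerprint-to-coordinates pipeline can be sketched as follows (Python; random bit strings stand in for Daylight fingerprints, and scikit-learn's iteratively refined MDS stands in for SAS PROC MDS — both substitutions are assumptions for illustration, not the tools used in the original work).

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(1)
fps = rng.integers(0, 2, size=(50, 2048), dtype=np.uint8)  # stand-in fingerprints

def tanimoto(a, b):
    """Tanimoto coefficient of two binary fingerprints."""
    both = np.sum(a & b)
    either = np.sum(a | b)
    return both / either if either else 1.0

n = len(fps)
dissim = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        d = 1.0 - tanimoto(fps[i], fps[j])
        dissim[i, j] = dissim[j, i] = d

# Numerically refined low-dimensional embedding whose Euclidean
# distances approximate the Tanimoto dissimilarities.
mds = MDS(n_components=7, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)        # 50 x 7 "latent properties"
print(coords.shape, mds.stress_)
```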
The first three descriptors have been plotted for 27 amines that have similar values in the remaining two dimensions. Carboxylic acids form a tight cluster near, but separate from, the esters. Alcohols are farther away from the acids, and amines are even farther. Aliphatics are widely separated from aromatics. The formation of these clusters is typical and is what justifies calling these dimensions "chemical functionality descriptors."

"Receptor interaction descriptors" were developed to describe the distribution of chemical features of the substituents that contribute to specific ligand-receptor binding interactions. Unlike chemical functionality descriptors, these descriptors explicitly account for the position of a substituent; that is, atoms near the backbone systematically contribute to binding differently than those that are more remote. They also incorporate isosterism; e.g., one acidic functionality can often substitute for another. A Daylight toolkit-based program was written to characterize each non-hydrogen atom by six properties: radius, whether it is acidic, basic, an H-bond donor (HBD), an H-bond acceptor (HBA), and/or aromatic (Ar). Within a side chain, all atoms at a given bond count distance from the backbone comprise an "atom layer." As Figure 3 illustrates, each of the six atom properties is summed for all atoms within each layer (for up to 15 layers) to make a table of six columns (properties) and as many rows as there are layers. Other atomic properties, such as positive and negative partial charge, electrophilic and nucleophilic superdelocalizability, or the electrotopological state index20 could also be added as columns to the tables. As with the fingerprints, a low-dimensional space is required that preserves the similarity between the substituents as distances. To calculate the similarity between a pair of substituents, the maximum and minimum values are determined for each corresponding pair of cells in their respective atom layer tables. The sum of the minima divided by the sum of the maxima yields the similarity between the pair of substituents. This can be understood as an extension of Tanimoto's measure of similarity. In "fuzzy logic," finding the minimum of continuous values is analogous to using the logical "AND," and finding the maximum is analogous to using a logical "OR." The Tanimoto coefficient is the number of bits in the AND of two bit strings divided by the number of bits in the OR of the two strings. Thus, this similarity measure, which reduces to Tanimoto similarity for binary data, can be regarded as its fuzzy logic generalization. As in the case of the chemical functionality descriptors, a dissimilarity matrix is computed, and MDS is applied. The atom layer similarities generally reduce to fewer dimensions than the chemical database fingerprints: a set of 2133 amines requiring 11 dimensions to capture chemical functionality similarity required just five receptor interaction dimensions. Figure 3 compares the atom layer tables for two substituents. Although compound II differs from I only by the removal of one atom, there are six changes in the tables (in bold); the neutral amide HBA in I becomes a basic amine in II, and the phenolic oxygen, which had been acidic owing to the electron-withdrawing carbonyl in I, is a neutral HBD without it in II.
Receptor Recognition Similarity Based on Atom Layer Tables

Atom Layer Table for I

Layer  Radius  Acid  Base  HBD  HBA  Ar
  1     1.9     0     0     0    0    0
  2     1.9     0     0     0    0    0
  3     1.8     0     0     1    1    0
  4     1.9     0     0     0    0    0
  5     3.5     0     0     0    1    1
  6     3.8     0     0     0    0    2
  7     3.8     0     0     0    0    2
  8     1.9     0     0     0    0    1
  9     1.7     1     0     0    1    0
 10     0       0     0     0    0    0

Atom Layer Table for II

Layer  Radius  Acid  Base  HBD  HBA  Ar
  1     1.9     0     0     0    0    0
  2     1.9     0     0     0    0    0
  3     1.8     0     1     1    0    0
  4     1.9     0     0     0    0    0
  5     1.9     0     0     0    0    1
  6     3.8     0     0     0    0    2
  7     3.8     0     0     0    0    2
  8     1.9     0     0     0    0    1
  9     1.7     0     0     1    1    0
 10     0       0     0     0    0    0
Σmin / Σmax = 28.6/35.2 = 0.81

Figure 3 An "atom layer" table for a substituent is made by summing each of several properties for all non-hydrogen atoms at a given bond count distance (leftmost column) from the scaffold backbone (BB), including: radii, acids, bases, H-bond donors (HBDs), H-bond acceptors (HBAs), and aromatics. Two tables are compared element by element, and the sum of the minima divided by the sum of the maxima gives the similarity between the substituents.
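A minimal sketch of this fuzzy Tanimoto comparison (Python; the function name and the use of NumPy are illustrative choices; the hard-coded tables are those of Figure 3):

```python
import numpy as np

# Atom layer tables from Figure 3 (layers 1-10; columns:
# Radius, Acid, Base, HBD, HBA, Ar). Layers 11-15 are all zero.
table_I = np.array([
    [1.9, 0, 0, 0, 0, 0],
    [1.9, 0, 0, 0, 0, 0],
    [1.8, 0, 0, 1, 1, 0],
    [1.9, 0, 0, 0, 0, 0],
    [3.5, 0, 0, 0, 1, 1],
    [3.8, 0, 0, 0, 0, 2],
    [3.8, 0, 0, 0, 0, 2],
    [1.9, 0, 0, 0, 0, 1],
    [1.7, 1, 0, 0, 1, 0],
    [0.0, 0, 0, 0, 0, 0],
])
table_II = np.array([
    [1.9, 0, 0, 0, 0, 0],
    [1.9, 0, 0, 0, 0, 0],
    [1.8, 0, 1, 1, 0, 0],
    [1.9, 0, 0, 0, 0, 0],
    [1.9, 0, 0, 0, 0, 1],
    [3.8, 0, 0, 0, 0, 2],
    [3.8, 0, 0, 0, 0, 2],
    [1.9, 0, 0, 0, 0, 1],
    [1.7, 0, 0, 1, 1, 0],
    [0.0, 0, 0, 0, 0, 0],
])

def fuzzy_tanimoto(a, b):
    """Sum of cell-wise minima over sum of cell-wise maxima; reduces to
    the ordinary Tanimoto coefficient when both tables are binary."""
    return np.minimum(a, b).sum() / np.maximum(a, b).sum()

print(round(fuzzy_tanimoto(table_I, table_II), 2))   # 0.81, i.e., 28.6/35.2
```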
This illustrates the unique chemical information captured by these receptor interaction similarities that is missed by similarities based on maximum common substructure or chemical database fingerprints. (Rows 11-15 are not shown because they contain all zeros. Note that elements that are zero in both tables do not affect the calculated similarity.)

Table 1 summarizes the number and kind of properties described above, which address much of the "wish list" for a biologically relevant property space (see above). Log Ko/w is a simple property, but the remainder form three subspaces for topological shape, chemical functionality, and receptor interaction features. The properties must be scaled and combined into a single property space for experimental design. All properties are mean centered, so the origin of property space is the mean of each property independently. Within each subspace, the properties are weighted in proportion to their eigenvalues. If one set of properties, say shape, is believed more important than the others, that set can be given extra weight. Otherwise, log Ko/w and the largest dimension of each subspace are scaled to have equal weight.
Table 1 A Typical Set of Descriptors for Combinatorial Library Design

Descriptor Type                       Number
Topological shape descriptors          5-6
Chemical functionality descriptors     5-11
Receptor interaction descriptors       5-7
Calculated log Ko/w                     1
In practice, very little correlation has been found between these properties.12 Nevertheless, the properties are combined with PCA to include as much information as possible for small designs in truncated spaces with fewer members than the total number of descriptors. The ability to combine several kinds of similarities, without assuming orthogonality, is an advantage of converting similarities to coordinates with MDS. Care must be taken to perform the PCA on the variance/covariance matrix rather than the correlation matrix, to avoid losing the scaling. Calculating properties for a class of reagents need not be extremely fast because it must be performed only once for each class of starting materials. Thereafter, it can be used for the design of many libraries.
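The covariance-versus-correlation point can be seen in a few lines (Python/NumPy; an illustrative sketch):

```python
import numpy as np

def pca_scores(X, use_correlation=False):
    """Principal component scores of the mean-centered data X.
    Using the correlation matrix is equivalent to standardizing each
    column first, which would discard the deliberate inter-subspace
    scaling described in the text."""
    X = X - X.mean(axis=0)
    if use_correlation:
        X = X / X.std(axis=0, ddof=1)   # this is the step to avoid here
    M = np.cov(X, rowvar=False)          # variance/covariance matrix
    w, V = np.linalg.eigh(M)
    return X @ V[:, np.argsort(w)[::-1]]
```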
EXPERIMENTAL DESIGN

Returning to Figure 1, a property space has now been created, and individual substituents must now be selected. Highly focused sets of substituents are chosen simply by rank ordering every member of the candidate set by its Euclidean distance from selected fragments in lead compounds. Finding dissimilar sets is more difficult. In particular, we often want to design a "biased-diversity" set, by including some particular fragments based on pharmacophore hypotheses or other criteria, then completing the rest of the set with a number of additional substituents from the full candidate set that are mutually diverse. This is accomplished with "D-optimal" design.26 D-optimal design chooses subsets from a large candidate set maximizing the determinant of the "information matrix," |X′X|, for a design matrix X (with X′ being its transpose). This minimizes the determinant of the inverse, which is the variance of the parameter estimates for an assumed model. The rows of X are the substituents and the columns are model terms, that is, the property space dimensions, or higher order terms such as their squares, cubes, or cross terms. Roughly speaking, to determine accurate parameter estimates for modeling a response surface, the D-optimal design algorithm chooses a small subset of points from the candidate set that are well spread out and nearly orthogonal in property space; that is, they are diverse. Based on an existing SAR or other information, some substituents can be preselected for inclusion into the set.
The D-optimal algorithm then "augments" the design by selecting additional side chains that best complement those, completing a design of a specified size with maximum overall diversity. Log |X′X| can be used to compare the diversity of designs of the same size, property space, and model, but not between designs of different sizes or for different models. If the model contains only linear terms, D-optimal design picks extreme points in property space to best determine a linear model. If it includes higher order terms, intermediate points will also be used to characterize curvature and interactions in the response surface. For conventional statistical problems, such as QSAR, a model is chosen containing fewer terms than the requisite number of points. In that case, the D-optimal algorithm will choose to duplicate some points, allowing an estimate of uncertainty in a fitted response surface. Obviously, this does not minimize redundancy, so extra terms should be systematically added to "saturate" the model, removing any extra degrees of freedom. A useful approach is to first include an intercept and linear terms, in order, starting with the largest principal component. Squared terms are then added, followed by cross terms starting with PC1*PC2, PC1*PC3, PC2*PC3, and so forth to saturate the model. For an N-dimensional property space, this protocol covers up to (N² + 3N)/2 points; for example, up to 189 substituents for 18 dimensions. Beyond that, cubic or higher terms can be added.

Besides the D-optimal criterion (max |X′X|), other criteria exist for choosing optimal designs. Some, such as A-, G-, and I-optimality,27 are "model based" methods differing only slightly from D-optimality. Others, such as S-optimal and U-optimal, are "distance based" methods designed to fill space or sample clusters.27 D-optimality favors designs that minimize collinearity and sample the full dimensionality of the property space. Because of a convenient update formula, |X′X| can be optimized more rapidly than the other criteria. Speed proves essential for the marriage of medicinal chemical intuition and statistical design described below. S-optimality corresponds to intuitive ideas of minimizing redundancy. U-optimality biases the design toward existing clusters, increasing the chances that any hits can be rapidly improved by additional generations of follow-up libraries. Distance-based designs should always be run with a simple linear model and can meaningfully be compared regardless of the design's size.
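A greedy forward-selection sketch of D-optimal augmentation follows (Python/NumPy; all names are illustrative, and this is not the SAS implementation used in the original work). Starting from any preselected rows, each step adds the candidate that most increases log |X′X|; production codes instead use Fedorov-type exchange moves26 with a rank-one determinant update rather than recomputing the determinant at every step.

```python
import numpy as np

def d_optimal_augment(candidates, include_idx, n_total):
    """Greedily add rows from `candidates` (design matrix, one row per
    substituent) to the preselected rows `include_idx`, at each step
    taking the row that maximizes log|X'X| of the growing design."""
    chosen = list(include_idx)
    ridge = 1e-8 * np.eye(candidates.shape[1])   # keeps X'X invertible early on
    while len(chosen) < n_total:
        best, best_logdet = None, -np.inf
        for i in range(len(candidates)):
            if i in chosen:
                continue
            X = candidates[chosen + [i]]
            sign, logdet = np.linalg.slogdet(X.T @ X + ridge)
            if sign > 0 and logdet > best_logdet:
                best, best_logdet = i, logdet
        chosen.append(best)
    return chosen

rng = np.random.default_rng(2)
C = rng.normal(size=(200, 6))          # 200 candidates in a 6-D property space
design = d_optimal_augment(C, include_idx=[0, 1], n_total=12)
```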
SELECTING SUBSTITUENT SETS

The simplest design just lets the D-optimal algorithm select a given number of substituents from the candidate set. This would be a purely statistical diversity design; it is unbiased, but has little else to recommend it. The next level of complexity is to select a fixed number of substituents for inclusion by hand, based on pharmacological or synthetic ideas, and then let the D-optimal algorithm select the remaining substituents to fill out the design for maximum diversity.
It is also useful to sort the candidate set by distance from the origin, then remove some of the more remote substituents to "pull in" the design from the very extremes, as well as picking one substituent near the origin for the inclusion set. Experimental library design becomes increasingly effective as more medicinal chemistry is coaxed into the design by successive applications of D-optimization. The candidate substituents can be divided into "bins" or categories, such as those substituents containing pharmacophores for a particular target; substituents found effective in another project where this library might also be screened; those for which the chemistry has already been validated; those requiring special protection before synthesis; those that are already available protected; and those that are expensive, or pure hydrocarbons, or suspected toxiphores, or low molecular weight, etc. Muskal has developed lists of "stable" fragments found in known drugs.28 These fragments could be used to form a bin of substituents with historically successful "druglike" features. Forming bins is facilitated by storing the available substituents in a chemical database capable of substructure searching.

The example in Table 2 shows how the design can be controlled by selecting ranges of substituents from each candidate bin, shown in column 1, to make up the total number. Column 2 gives the number of substituents in each bin from which each step of D-optimal design can draw. The number of substituents to be selected is given in column 3. Finding the D-optimal set of substituents is a multiple maximum problem, so column 4 gives the number of tries to find the optimal solution at each stage. One design will be generated for each possible combination from column 3 that is consistent with the total of 35 substituents. D-optimal design is performed successively, building up the design in stages. The first design of 35 substituents begins with the most diverse set of five from the pharmacophore candidate bin. The design is then augmented to 10, drawing from the protected bin. It is next augmented to 14 from the drug fragment bin, augmented further to 27 from the validated bin, augmented again to 30 from the set of low-molecular-weight substituents, and finally increased to 35 from all possible substituents.

Table 2 Example Setup for a Biased D-Optimal Design

Substituent Bin         No. in Bin   Total No. in Design   No. of Tries
Pharmacophores              18             5 to 8                2
Protected groups            23            10 to 12               2
Drug fragments              11            14 to 16               2
Validated                   93            27 to 29               6
Low molecular weight       392            30 to 33               4
All                        721               35                  4
Thus, just the first design requires six D-optimal steps, and this was just one design consistent with the ranges for the bins. The next design starts with six members from the pharmacophore bin, and so on. The example shown has 432 possible designs consistent with the ranges selected, requiring a total of nearly 2600 D-optimization steps. This is one reason for using the readily calculated D-optimality criterion. These calculations typically take several hours, depending on the numbers and sizes of the bins and the number of attempts made to find the global optimum at each D-optimal step. The designs are then sorted by their diversity scores (log |X′X|), as in Table 3, to critique the tradeoff between bias, synthetic ease, and diversity. The object is to find synthetically feasible biased designs with high diversity and pharmacophore focus, and low molecular weight. Unbiased design #1 was made by taking all 35 substituents from the "All" candidate bin.
Table 3 Highest Scoring 25 Designs from the Series of 432 Biased Diversity Substituent Sets from Table 2a

[The table lists, for each of the 25 highest scoring designs, the Score and the number of substituents drawn from each bin (Pharm., Prot., Drug, Valid., Low MW, All). Design 1, the unbiased benchmark, scores 150.9 with all 35 substituents drawn from the "All" bin; the 24 biased designs score from 111.4 down to 108.3.]

aSee Table 2 for explanation of column headings.
Its score of 150.9 is used as a benchmark for maximum possible diversity. The most diverse of the biased designs, #2, is highly drug focused, with five members from the pharmacophore bin and five from the drug feature bin. The equally diverse design #3 trades drug bias for synthetic ease, using 7 preprotected groups and 13 other validated substituents.

After selecting a design, a medicinal chemist examines the structures by eye. Invariably, no matter how carefully the candidate set of substituents was prescreened, problems will be identified when chemists consider synthesizing the library. Substructure searching tools are then used to eliminate all substituents sharing these problems from the candidate bins. The design calculations are repeated. Now that the tradeoff between diversity and bias is understood, the ranges on the bins can be narrowed such that the calculation of new D-optimal sets takes just a few minutes. The best of the new designs is again displayed and criticized. A few changes are made to the bins, and yet another design is generated. This cycle is typically repeated from 5 to 25 times until everyone is satisfied with the design. Because this task requires rapid feedback between medicinal and computational chemists, it is best performed interactively at the computer screen. It is well worth the effort to create fast and convenient software for displaying the designs, adjusting the bins, and calculating the new D-optimal substituent sets.
TEMPLATE DIVERSITY

Originally, combinatorial synthetic schemes were based on biopolymer chemistry or a few other simple backbones. Because the choices were so limited, calculating the diversity of the scaffold was of secondary importance. Combinatorial synthetic technology has improved; chemists are now in a position to pick and choose among many library scaffolds, so maximizing the diversity between library templates is increasingly important. Diverse sets of scaffolds can be chosen based on log Ko/w, molecular connectivity indexes, chemical functionality descriptors from substructure fingerprints, and atom layer properties, as previously described for substituents. However, the geometry of the attachment positions among templates is particularly important. Lauri et al. developed a computer algorithm, CLICHE, to analyze the geometry of substituent attachment sites for a scaffold or a database of scaffolds.29 Each pair of attachment vectors constitutes a point in "CAVEAT" space, that is, the space of a library of relatively rigid organic templates.29 The set of vector pairs for a conformation of a scaffold forms a figure in CAVEAT space. Similarity between scaffolds or sets of scaffolds is determined by the number of figures they have in common. Maximizing or minimizing this similarity allows focused or diverse sets of combinatorial libraries to be designed.
SECOND-GENERATION LIBRARIES

The designed library is now ready to be synthesized and screened. Assuming a few interesting hits of moderate potency are found, a second-generation library might be planned. As in all QSAR problems, there are two approaches: regression or optimization. If the library was synthesized as individual compounds that are fairly rigid and quantitative data are available on many of the compounds, a regression approach might be chosen. Because the libraries were derived from an experimental design, Hansch-type analysis using the property space dimensions as descriptors is a good approach. In this case, the activity data are fit to the substituent properties using cross-validated multiple regression, PLS, or other standard statistical methods. Other QSAR methods not based on the design property space, such as comparative molecular field analysis (CoMFA),30 could equally be used. If a significant model is found, the candidate substituents are searched for those predicted to be active at each position, and they are assigned to a special "pharmacophore" candidate bin for further designs. However, regression implicitly assumes a common mode of binding, or at least a single biological response surface, which might not be true for a diversity library. Moreover, if the compounds were synthesized as complex mixtures, one cannot assume that every compound in an inactive pool was in fact synthesized and proved inactive. In this case, optimization is a useful alternative approach. Here one simply rank orders the substituents by their distance in property space from each substituent in the active compounds and assigns the neighbors to the special pharmacophore bin for the next set of designs (a code sketch of this step appears at the end of this section). Either way, second-generation library design begins by sampling one-half to three-fourths of the design from the new pharmacophore bin and then fills out the rest to optimize diversity subject to constraints of molecular weight, synthetic difficulty, and so forth as above. This approach amounts to optimization by successively more focused experimental designs, sampling property space heavily around the lead compounds, but also pairing the lead substituents with diverse partners to present them in many geometries and chemical environments. The more potent the initial lead, the more tightly the substituents should be clustered around the leads. If one is testing mixtures, optimization pools will contain many analogs, so the pools should be smaller to avoid finding too many hits per pool.

Sheridan and Kearsley used a slightly different approach to designing a lead optimization combinatorial library from noncombinatorial leads.31 They employed a genetic algorithm (GA) to generate N-substituted glycine "tripeptoids" maximizing the value of a trend vector QSAR model based on active small molecules. Although this is a whole-molecule-based approach, the procedure did identify several side chains that occurred frequently in high-scoring molecules.
These fragments could similarly be used in the pharmacophore bin of a strongly biased diversity library. Weber et al. also applied a genetic algorithm to a modular reaction scheme, but optimized directly against an experimental enzyme assay as the fitness function, rather than a theoretical QSAR model.32 In 20 generations of parallel synthesis of 20 Ugi reactions, thrombin inhibition was improved from 1000 μM to 0.22 μM. It is important to keep in mind the cost of synthesis, screening, and identifying the hits from a large combinatorial library. Thus, the number of optimization steps must be fewer than in traditional medicinal chemistry. Nevertheless, the number of compounds tested can be far greater. Careful experimental design at each optimization step helps one to derive the most benefit from all of the available information.
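The rank-ordering step of the optimization approach mentioned above can be sketched as follows (Python/NumPy; all names and the neighbor count are illustrative assumptions):

```python
import numpy as np

def pharmacophore_bin(candidates, active_substituents, n_neighbors=50):
    """Collect, for each substituent found in an active compound, its
    nearest neighbors in property space; the union of these neighbor
    sets forms the 'pharmacophore' bin for the next design."""
    bin_idx = set()
    for a in active_substituents:              # rows of the property matrix
        d = np.linalg.norm(candidates - a, axis=1)
        bin_idx.update(np.argsort(d)[:n_neighbors].tolist())
    return sorted(bin_idx)
```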
STRUCTURE-BASED LIBRARY DESIGN

Combinatorial library design is ideally suited for structure-based drug design. The approach is similar to that of making second-generation libraries. The most difficult part of de novo design programs, which build molecules into a protein receptor, is finding a general way to build arbitrary organic compounds that are both stable and synthetically feasible. In de novo library design, this problem does not occur. Given a combinatorial synthetic scheme, and a list of candidate scaffolds and substituents, one can generate and dock only the molecules in a potential library. These are all synthetically feasible, even as part of a large parallel synthesis scheme. Another obstacle in de novo design is a lack of adequate scoring functions. Although state-of-the-art docking methodologies are able to predict reasonable docking geometries, calculated binding energies often do not correlate well with potency between different ligands, especially for flexible molecules. This is likely due to inadequacies in computing hydrophobic and entropic contributions. Whereas a genetic algorithm33 or some similar method could be used to generate 100 structures with high docking scores, all of which could be made by modular chemistry, the chance that any one of them would be a very potent ligand is low. However, these compounds would have good charge and shape complementarity with the binding site. If they were used as the pharmacophore bin in a biased diversity library of, say, 50,000 compounds, the chances of finding a potent ligand would be enhanced. In an example of the marriage of combinatorial chemistry and structure-based design, we used a genetic algorithm to generate populations of mixed peptoid/peptide oligomers.34 The ligands were docked into a homology model of a serine protease using distance geometry.
The fitness function for the GA was the intermolecular energy from a force field calculation. The best scoring members evolved into obvious families based on their mode of binding. The common side chains from each family formed the pharmacophoric bin for a biased diversity library design. In another example of the combination of the two methods, Van Vliet et al. docked just the substituents from a combinatorial synthetic scheme into a receptor structure,35 then searched a library of potential templates to connect the best scoring fragments in the appropriate geometry. The results were scored based on a combination of the docking quality and the connection quality. The final assembled molecules were again docked to verify the procedure.
CALIBRATION OF DIVERSITY SCORE

Because the properties described above are not all coded to range between +1 and -1, the diversity score (log |X′X|) does not have an absolute scale.27 Therefore, it is important to establish diversity "benchmarks," the most important of which is the completely unbiased D-optimal library, representing maximum diversity. Additional unbiased benchmarks can be made by randomly eliminating fractions of the candidate set and determining both the mean and range of the D-optimal scores. The low-diversity extreme of this approach is the mean score for random sets of substituents of the desired size. The score for the best design using only pure hydrocarbon substituents is a useful biased benchmark. Designs made entirely from members of the 50 nearest analogs of a particular reagent, such as tyramine or glycine, give low-diversity estimates for highly focused designs. Table 4 shows the diversity scores for a set of calibration designs derived from amine reagents. Random subsets were chosen 10 times, and the mean and standard deviation are reported.
Table 4 D-Optimal Scores (log |X′X|) for Several "Benchmark" Designs of 20 Substituents

Benchmark               Score    Std. Dev.
All 721 amines           97.9
Drop 50% at random       86.8      5.2
Drop 80% at random       70.8      5.0
291 hydrocarbons         58.1       -
Drop 90% at random       53.3      5.5
Random sets of 20         0.0      9.3
50 Glycine analogs      -13.9
50 Tyramine analogs     -23.7
"Random sets of 20" refers to selecting 20 compounds at random and just calculating the score. The analog sets are the 50 amines closest to the listed reagent. Note that, although the property space and candidate set are the same as in Table 3, the maximum score is different because the design size is 20 members rather than 35. This scale of benchmarks puts various designs into perspective. For example, the substituent derived from tyramine occurs in potent drug leads found from published combinatorial "peptoid" libraries.36 The scores for biased designs of 20 amines forced to include from 0 to 20 tyramine analogs are shown in Table 5. Comparing these results to those in Table 4 illustrates that forcing the inclusion of 4 of the 50 nearest tyramine analogs in a design of 20 is comparable to randomly eliminating half the candidate set. Including 10 tyramine analogs yields diversity comparable to that of the best pure hydrocarbon design. Picking the most diverse set that includes 18 tyramine analogs is comparable to picking 20 substituents at random. This demonstrates how poor random choice is compared to an optimized choice, even when the optimized choice has severe constraints.
Table 5 Change in Diversity Score for D-Optimal Libraries of 20 Substituents Forced to Include Increasing Numbers from a Bin of Tyramine Analogs

No. analogs    Score
     0          97.9
     1          97.0
     2          94.0
     3          91.8
     4          87.7
     5          83.7
     6          77.4
     7          73.9
     8          71.2
     9          66.3
    10          60.7
    11          53.2
    12          51.9
    13          45.7
    14          36.6
    15          29.1
    16          20.5
    17           6.1
    18           1.0
    19         -11.6
    20         -23.7
As a convenience, one might imagine designing a diverse set of 30 substituents and then drawing smaller subsets from this list whenever a new design of fewer than 30 is needed. When random subsets of 20 were chosen from the D-optimal set of 30, the mean score was only 74.8. Table 4 shows that, on average, this approach is comparable to randomly throwing away three-fourths of the candidate set before doing a design of 20. Alternatively, Table 5 shows this is no better than a design biased with six tyramine analogs. Thus, it seems worth the small additional effort to make a new design of the exact size needed.
EVALUATING EFFICIENCY OF EXPERIMENTAL DESIGN

An important question is, "How do I know that a statistically designed library is better than one designed by intuition?" At first blush, this seems like a very difficult question to answer. The most direct approach might be to make one library based on these methods and another based purely on intuition, screen them both, and compare the hit rates. The required increase in library hit rate depends on a cost/benefit analysis of how much extra time is required to add statistical analysis to the choice of substituents. To develop a modular synthetic scheme, validate the chemistry, synthesize a large combinatorial library, test the library across a battery of screens, and deconvolute the hits requires perhaps a man-year of work. The statistical design effort is less than 5% of this, so, if the number of leads discovered is only 5% higher, the effort is worthwhile. This could be a very expensive experiment, however, requiring many man-years to make enough libraries in designed and undesigned pairs to prove statistically whether designed libraries are at least 5% more efficient.

A more indirect but perhaps more useful approach is simply to test whether activity occurs in clusters in property space. The utility of experimental design itself is not in question; it has been validated many times in many fields over 70 years. If proximal compounds tend to have similar activity, then design is worthwhile. In fact, Brown et al. have shown that biological activity is clustered by database search keys.37 The clustering of activity is also demonstrated by second-generation optimization libraries, which display higher hit rates and more potent ligands than the original broad-screening libraries. This is no surprise. Qualitative structural similarity is well captured by these properties, and the existence of structure-activity relationships has always been the foundation of medicinal chemistry.

Yet even this analysis misses the important point that little is sacrificed by designing libraries. Using bins is an excellent way to organize structure-activity ideas. It allows the synthetic chemist to incorporate intuition into the design as well. Having done that, it is actually easier to let the computer pore over a thousand or so reagents to ensure the library's diversity than it is to do this by hand. Experienced chemists are very good at looking through large collections of structures to pick out the ones most similar to a lead. Likewise, given fairly simple structures that are dominated by a single property, such as charge, polarity, or lipophilicity, they can readily pick out small diverse sets by choosing one representative from each type.
However, when the structures become larger and more complicated, with many containing a variety of these simple features in different arrangements, it is very hard just to look at several sets of 20 structures and determine by intuition which is most diverse, let alone to pick the most diverse set of 20 out of 1000 candidates. With suitable software, allowing the computer to pore over vast lists of structures makes this part of the job easier, not harder.
COMPARISON TO CLUSTERING CORPORATE ARCHIVES

Much precedent for selecting nonredundant subsets of potential libraries comes from the related problem of clustering corporate archives.9 Brown et al. clustered a database using substructural fragments and found significant clustering of the biological activity for two enzyme assays, demonstrating that stratified sampling would increase screening efficiency.37 In another case of using a corporate database, Taylor tested sampling strategies on computer-simulated biological data, showing that sampling from clusters facilitates optimizing an initial lead with a second, focused screen.38 He also showed that sampling for "maximum dissimilarity" reduced the ability to optimize a weak initial lead. Note, however, that Taylor's dissimilarity is very different from D-optimal diversity. It is not based on covering property space statistically. Taylor selected "dissimilar molecules" by the stepwise elimination of compounds from pairs of nearest neighbors, so the final set is expected to be rich in unique compounds having no close analogs. As he pointed out, the original database should, therefore, contain fewer compounds similar to a lead from an initial dissimilar set, so an optimization screen is less likely to be fruitful.

These and most other corporate archive sampling approaches are based on Tanimoto similarities from substructural fragments. However, because corporate databases typically contain hundreds of thousands of compounds, proceeding with MDS to create a Euclidean space with coordinates for each molecule is prohibitive. Although the number of potential compounds in a combinatorial library is many orders of magnitude larger than any corporate database, the number of building blocks is much smaller, so the MDS calculation is tractable. Access to Euclidean coordinates has many advantages. It allows one to combine diverse properties from many sources. If the properties are correlated, PCA can be performed as a final step to reduce the dimensionality. PCA also reveals the total dimensionality of the property space, which in turn helps indicate how many monomers are required to supply a specified degree of coverage.
A grid can be imposed to reveal the number and size of unrepresented regions or to indicate whether a given monomer is near the "edge" of the space. Experimental design, coupled with model building and optimization strategies, can help guide the next optimization library. Compounds selected from clusters (centroids or otherwise) generally do not produce a balanced, orthogonal design. In many clustering methods, the centroid of one cluster can actually lie within another cluster, so the "most representative member" of a cluster would never be chosen. In short, extracting a latent property space allows one to apply all of the tools of analytical geometry to the design and evaluation of the substituent sets.
DIVERSITY SPACE

The property space described earlier, while including many seemingly important interactions, is clearly incomplete. For example, it treats distance in terms of bond counts only and does not account for actual three-dimensional (3D) geometry. It has only the most indirect estimate of conformational entropy, solubility, or hydrophobic forces. Yet, PCA shows that 20 unique dimensions are often required for just a single class of substituent in this very incomplete description of property space. This has several consequences.

The "size" of property space is the first problem. A common ambition in combinatorial drug discovery is to synthesize a "Universal Library" encompassing every possible pharmacophore. However, the feasibility of this goal depends on the size of property space and how fine a "net" is needed to sweep out all possible pharmacophores. If, as it appears, pharmacological property space does require well over 20 dimensions, hopes of accomplishing this seem vain. A hypercube of just 20 dimensions has 10^6 corners. Assuming a coarse grid of just five values per dimension, there are 10^14 points to occupy. Many of these may be inaccessible because they contain incompatible combinations of properties or have properties incompatible with an effective drug. Yet, if even 1% of the space is relevant, a spanning library would require 10^12 structures, perfectly placed in property space. Because flexibility might allow one molecule to adopt many structures, this library might be barely feasible. If, however, property space requires even 30 dimensions, there are 10^9 corners, and a five-level grid contains 10^21 grid points. No amount of flexibility and careful design would make this library approachable with present technology. It would be extremely valuable to get a reasonable estimate of the size of biologically relevant property space.
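The grid-size arithmetic behind these estimates is straightforward:

```latex
\[
2^{20} \approx 1.0 \times 10^{6} \ \text{corners}, \qquad
5^{20} \approx 9.5 \times 10^{13} \approx 10^{14} \ \text{grid points},
\]
\[
2^{30} \approx 1.1 \times 10^{9} \ \text{corners}, \qquad
5^{30} \approx 9.3 \times 10^{20} \approx 10^{21} \ \text{grid points}.
\]
```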
A second, related problem is how to compare the quality of libraries that were designed in different property spaces. Imagine that a worker develops a property space of three dimensions and manages to obtain a set of molecules, one at each corner of the property cube, thereby achieving a full factorial, maximally spread design. Now imagine a colleague who develops another diversity measure, but manages to encompass only one face of the cube. Because the eight points are now projected onto one face, the latter analysis determines that the previous design has four pairs of exactly redundant points, apparently a far from diverse set. The problem, however, is not with the original design, but rather with the second property space. Whereas it is tempting to develop a new similarity measure and then analyze and criticize another's designs, any design inevitably will be suboptimal in any alternative property space. Unless the second property space is shown to include all of the features of the original space, the analysis is meaningless. Only if multiple regression shows that each property in the first space can be predicted from the second, so that the first space is included in the second, is the analysis valid.
COMPARING DIVERSITY AMONG LIBRARIES

Martin and co-workers12 counted the total number of substructural fragments up to seven bonds long found in entire libraries as an estimate of "chemical functional group diversity." They showed that although synthetic oligomeric combinatorial peptoid libraries were far more diverse than combinatorial biopolymer libraries, they still contained only about as many substructures as just the 45 top-selling small molecule drugs. Boyd et al.39 developed the "HookSpace Index" as a measure of the orientational diversity between pairs of common functional groups in a database of compounds. Analysis of the Cambridge Structural Database of crystallographic structures showed that the HookSpace Index for pairs of functional groups was highly correlated with the number of occurrences of those pairs of groups.

Chemical Design Ltd. measures diversity by enumerating the 3D three-point pharmacophores (3PP) that a collection of compounds can present.40 Four types of features are considered: the centers of HBA, HBD, positive charge, and aromaticity. This allows 20 types of 3PP, from the 20 unique ways to select three members from four types.41 Distances between centers are divided into 31 bins, giving a total of almost 500,000 possible 3D 3PP. Each of these is assigned to a bit in a 70-Kbyte fingerprint bit string. A rule-based search enumerates each compound's low-energy conformations, and all of the 3D 3PP are identified and accumulated for an entire library. Libraries are compared by comparing bit strings. A search procedure automatically identifies small subsets of compounds that can present most of the 3PP in the entire library. A 3D graphical display helps visualize the overlap between libraries.
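The combinatorics can be checked directly (Python; the product 20 × 31³ below is my own upper-bound estimate — the published figure of "almost 500,000" presumably excludes redundant or geometrically infeasible distance combinations):

```python
from itertools import combinations_with_replacement
from math import comb

features = ["HBA", "HBD", "positive charge", "aromatic"]
triplets = list(combinations_with_replacement(features, 3))
print(len(triplets))          # 20 = C(4+3-1, 3): 4 XXX + 12 XXY + 4 XYZ
print(comb(4 + 3 - 1, 3))     # 20, the same count in closed form

# With each of the three inter-center distances divided into 31 bins,
# a simple upper bound on the number of distinct 3D 3PP is:
print(20 * 31**3)             # 595,820 (an overcount; symmetry-equivalent
                              # distance orderings are not removed here)
```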
The Chemical Design Ltd. approach40 is very attractive because it uses flexible 3D information. However, it implicitly assumes that the presence of a particular 3D pharmacophore is sufficient to determine biological activity. Also, like other whole molecule approaches, although it is computationally feasible to apply to sets of 100,000 compounds, it cannot be used to analyze a potential combinatorial library of 10^12 compounds, as discussed previously. Furthermore, the search procedure does not choose combinatorial subsets. Thus, its best application in combinatorial chemistry might be to identify missing 3PP in libraries that were designed by another technique.
SYNTHESIS AND TESTING OF MIXTURES

Some combinatorial synthetic approaches rely on the synthesis and testing of mixtures to achieve high throughput. Typically, one wants to maximize the diversity of each pool, so the activity of a pool is indicative of the activity of its most potent compound. However, when the sensitivity of the test or the solubility of the compounds has been a problem, some groups have, instead, pooled similar compounds, amplifying a weak signal by "the epitope effect," that is, the cumulative activity of many related compounds. Regardless of whether compounds are pooled, the methods described previously allow one to maximize either the similarity or the diversity of the pools.

One can make mixtures either by allowing several reagents to compete during the reaction or by "split/mix" solid phase synthesis, in which a single reaction is run in each of several reaction vessels and mixtures are achieved by recombining and reapportioning the resin beads between reactions. The latter approach has the potential to make nearly equimolar mixtures, but only if enough resin beads are present to make the mixing statistics uniform. Burgess et al.42 used both computer simulations and an algebraic expression based on the Poisson distribution to analyze the number of beads required to have confidence that every intended compound is present in the library (a sketch of this Poisson argument appears at the end of this section). For typical library sizes, if the number of beads is an order of magnitude greater than the total number of compounds in the library, every compound should be present on at least one bead. Zhao et al.43 used the Pearson statistic to determine the number of beads needed to be confident that either the smallest individual error or the overall relative error in concentration is less than a given threshold.

There are several ways to "deconvolute," or identify, the individual active compounds in an active mixture. These include iterative resynthesis and the synthesis of overlapping mixtures, such as the "positional scanning" method of Houghten et al.44 In iterative resynthesis, the most active pools are resynthesized as several smaller subpools with a variable position resolved. This process is repeated until every position has been resolved and individual active compounds have been made.
In Houghten's method, the complete library is synthesized several times, but with different positions mixed and resolved, so the activity profile across the several libraries suggests which compound(s) are responsible for the activity. Neither of these methods is guaranteed to identify the most potent individual compound. Freier et al. used an RNA hybridization model to simulate the efficacy of these approaches.45 The simulation included the presence of suboptimal binders, experimental assay errors, and errors in the equimolarity of concentrations. Iterative resynthesis usually found a ligand that bound within 1 kcal/mol of the very best binder, except when the mixtures were far from equimolar, as might be expected from a mixed-reagent synthetic approach. This suggests that split/mix synthesis with deconvolution by iterative resynthesis is an effective strategy for screening mixtures. Positional scanning was often confounded by experimental errors, even assuming split/mix synthesis.
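Returning to the bead-count statistics mentioned above, the Poisson argument can be sketched as follows (Python; the closed form is a standard occupancy approximation and the numbers are illustrative — not necessarily the exact expression used by Burgess et al.42):

```python
import numpy as np

def bead_statistics(n_compounds, n_beads):
    """Poisson model for split/mix synthesis: with beads apportioned to
    compounds uniformly at random, each compound is carried by a
    Poisson(n_beads/n_compounds) number of beads, so it is absent with
    probability exp(-beads per compound)."""
    lam = n_beads / n_compounds
    expected_missing = n_compounds * np.exp(-lam)   # E[# absent compounds]
    p_complete = np.exp(-expected_missing)          # P(no compound absent)
    return expected_missing, p_complete

# A 1000-compound library with 10x as many beads as compounds:
print(bead_statistics(1_000, 10_000))   # (~0.045 missing, P(complete) ~ 0.96)
```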
CONCLUSIONS

Screening combinatorial libraries should not imply a "random" or "irrational" approach to drug discovery. Far from obviating the use of computers, the increased complexity of combinatorial drug discovery makes their use ever more essential. Although the number of compounds that could be made is enormous, a well-planned experimental design selects a reasonable number of compounds that sample the range of accessible properties. This chapter described several easily computed properties that address many important interactions, although other properties could be used as well. Methods were outlined that can maximize the diversity of a library for broad screening or bias a library to emphasize specific properties or chemical functionality suitable for a particular biological target. The "art-based" insights of experienced medicinal chemists are smoothly married with statistical diversity by assigning the substituents to candidate bins and applying successive steps of D-optimal design to build a diverse library constrained to a profile of features based on these bins. The diversity score (log |X′X|) monitors how much diversity is sacrificed by adding intuitive bias. With sufficient ingenuity, many of the computational methods that have been fruitfully employed in conventional drug design ought to be adaptable to the design of combinatorial libraries as well.
REFERENCES

1. W. H. Moos, G. D. Green, and M. R. Pavia, Ann. Rep. Med. Chem., 28, 315 (1993). Recent Advances in the Generation of Molecular Diversity. See also R. Baum, Chem. Eng. News, February 12, 1996, p. 28. Combinatorial Chemistry. S. Borman, Chem. Eng. News, February 12, 1996, p. 29. Combinatorial Chemists Focus on Small Molecules, Molecular Recognition, and Automation. A. M. Thayer, Chem. Eng. News, February 12, 1996, p. 57. Combinatorial Chemistry Becoming Core Technology at Drug Discovery Companies. J. H. Kreiger, Chem. Eng. News, February 12, 1996, p. 67. Combinatorial Chemistry Spawns New Software Systems to Manage Flood of Information.
2. N. K. Terrett, M. Gardner, D. W. Gordon, R. J. Kobylecki, and J. Steele, Tetrahedron, 51, 8135 (1995). Combinatorial Synthesis—The Design of Compound Libraries and Their Application to Drug Discovery.
3. V. Austel, Methods Princ. Med. Chem., 2, 49 (1995). Experimental Design in Synthesis Planning and Structure-Property Correlations.
4. M. Sjoestroem and L. Eriksson, Methods Princ. Med. Chem., 2, 63 (1995). Applications of Statistical Experimental Design and PLS Modeling in QSAR.
5. L. H. Brannigan, M. V. Grieshaber, and D. M. Schnur, CHEMTECH, 25, 225 (1995). Use of Experimental Design in Organic Synthesis.
6. P. J. Scott, A. Penlidis, and G. L. Rempel, J. Polym. Sci., 31, 403 (1993). Ethylene-Vinyl Acetate Semi-Batch Emulsion Copolymerization: Experimental Design and Preliminary Screening Experiments.
7. M. W. Weiser and K. B. Fong, Am. Ceram. Soc. Bull., 72, 87 (1993). Experimental Design for Improved Ceramic Processing, Emphasizing the Taguchi Method.
8. N. Kettaneh-Wold, J. Pharm. Biomed. Anal., 9, 605 (1991). Use of Experimental Design in the Pharmaceutical Industry.
9. N. E. Shemetulskis, J. B. Dunbar, B. W. Dunbar, D. W. Moreland, and C. Humblet, J. Comput.-Aided Mol. Design, 9, 407 (1995). Enhancing the Diversity of a Corporate Database Using Chemical Database Clustering and Analysis.
10. W. R. Dillon and M. Goldstein, Multivariate Analysis: Methods and Applications, Wiley, New York, 1984.
11. P. Willett, Similarity and Clustering in Chemical Information Systems, Wiley, New York, 1987, p. 54. See also G. M. Downs and P. Willett, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 1-66. Similarity Searching in Databases of Chemical Structures.
12. E. J. Martin, J. M. Blaney, M. A. Siani, D. C. Spellmeyer, A. K. Wong, and W. H. Moos, J. Med. Chem., 38, 1431 (1995). Measuring Diversity: Experimental Design of Combinatorial Libraries for Drug Discovery.
13. A. Leo, Chem. Rev., 93, 1281 (1993). Calculating Log Poct from Structures.
14. D. Weininger, CLOGP, Daylight Chemical Information Systems, Inc., Santa Fe, NM, 1994. For details about this and other software mentioned below, see D. B. Boyd, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 303-380. Compendium of Software for Molecular Modeling.
15. PROLOGP 5.1, CompuDrug Chemistry Ltd., Budapest, Hungary, 1995.
16. P. Howard and W. Meylan, LOGKOW, Syracuse Research Corp., Syracuse, NY, 1993.
17. G. E. Kellogg, G. S. Joshi, and D. J. Abraham, Med. Chem. Res., 1, 444 (1992).
18. A. Leo, Pomona Database, Daylight Chemical Information Systems, Inc., formerly in Irvine, CA and now in Mission Viejo, CA.
19. L. H. Hall and L. B. Kier, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 367-422. The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling.
20. L. H. Hall, MOLCONN-X 1.0, Hall Associates, Quincy, MA, 1991.
21. M. Randić, in Concepts and Applications of Molecular Similarity, G. M. Maggiora and M. A. Johnson, Eds., Wiley, New York, 1990, pp. 77-145. Design of Molecules with Desired Properties: A Molecular Similarity Approach to Property Optimization.
22. S. C. Basak, V. R. Magnuson, G. J. Niemi, and R. R. Regal, Discrete Applied Mathematics, 19, 17 (1988). Determining Structural Similarity of Chemicals Using Graph-Theoretic Indices.
23. C. A. James, D. Weininger, and J. Scofield, Daylight Toolkit Programmer's Guide, Daylight Chemical Information Systems, Inc., Irvine, CA, 1994.
24. SAS Technical Report P-229 6.10, SAS Institute Inc., Cary, NC, 1992.
25. D. Spellmeyer, unpublished results, 1995.
26. V. V. Fedorov, Theory of Optimal Experiments, Academic Press, New York, 1972.
27. SAS/QC Software: Usage and Reference, Vol. 1, Version 6, SAS Institute Inc., Cary, NC, 1995.
28. S. M. Muskal, Abstracts of the 209th National Meeting of the American Chemical Society, Anaheim, CA, 1995, COMP 029. Enriching Combinatorial Libraries with Features of Known Drugs.
29. G. Lauri, Abstracts of the 209th National Meeting of the American Chemical Society, Anaheim, CA, 1995, COMP 031. Tools for Library Design. G. Lauri and P. A. Bartlett, J. Comput.-Aided Mol. Design, 8, 51 (1994). CAVEAT: A Program to Facilitate the Design of Organic Molecules.
30. R. D. Cramer III, D. E. Patterson, and J. D. Bunce, J. Am. Chem. Soc., 110, 5959 (1988). Comparative Molecular Field Analysis (CoMFA). 1. Effect of Shape on Binding of Steroids to Carrier Proteins.
31. R. P. Sheridan and S. K. Kearsley, J. Chem. Inf. Comput. Sci., 35, 310 (1995). Using a Genetic Algorithm to Suggest Combinatorial Libraries.
32. L. Weber, S. Wallbaum, C. Broger, and K. Gubernator, Angew. Chem. Int. Ed. Engl., 34, 2280 (1995). Optimization of the Biological Activity of Combinatorial Compound Libraries by a Genetic Algorithm.
33. R. Judson, this volume. Genetic Algorithms and Their Use in Chemistry.
34. E. Martin, J. Blaney, M. Siani, and D. Spellmeyer, Abstracts of the 209th National Meeting of the American Chemical Society, Anaheim, CA, 1995, COMP 032. Measuring Diversity: Experimental Design of Combinatorial Libraries for Drug Discovery.
35. D. Van Vliet, M. H. Lambert, and F. K. Brown, Abstracts of the 209th National Meeting of the American Chemical Society, Anaheim, CA, 1995, COMP 034. Design of Libraries Based on the Binding Site: The Merging of De Novo Design and Combinatorial Chemistry.
36. R. N. Zuckermann, E. J. Martin, D. C. Spellmeyer, G. B. Stauber, K. R. Shoemaker, J. M. Kerr, G. M. Figliozzi, D. A. Goff, M. A. Siani, R. J. Simon, S. C. Banville, E. G. Brown, L. Wang, L. S. Richter, and W. H. Moos, J. Med. Chem., 37, 2678 (1994). Discovery of Nanomolar Ligands for 7-Transmembrane G-Protein-Coupled Receptors from a Diverse N-(Substituted)Glycine Peptoid Library.
37. R. D. Brown, M. G. Bures, and Y. C. Martin, in Proceedings of the First Electronic Computational Chemistry Conference, S. M. Bachrach, W. Hase, D. B. Boyd, H. S. Rzepa, and S. K. Gray, Eds., CD-ROM Version, ARInternet Corp., Landover, MD, 1995. A Comparison of Some Commercially Available Structural Descriptors and Clustering Algorithms.
38. R. Taylor, J. Chem. Inf. Comput. Sci., 35, 59 (1995). Simulation Analysis of Experimental Design Strategies for Screening Random Compounds as Potential New Drugs and Agrochemicals.
39. S. M. Boyd, M. Beverley, L. Norskov, and R. E. Hubbard, J. Comput.-Aided Mol. Design, 9, 417 (1995). Characterising the Geometric Diversity of Functional Groups in Chemical Databases.
40. K. Davies and C. Briant, Network Science (NetSci on the World Wide Web at http://edisto.awod.com/netsci/), July 1995. Combinatorial Chemistry Library Design Using Pharmacophore Diversity.
41. The order of the three pharmacophores does not matter, so there are 4 of the class XXX, 4 × 3 = 12 of the class XXY, and 4 of the class XYZ.
42. K. Burgess, A. I. Liaw, and N. Wang, J. Med. Chem., 37, 2985 (1994). Combinatorial Technologies Involving Reiterative Division/Coupling/Recombination: Statistical Considerations.
43. P.-L. Zhao, R. Zambias, J. A. Bolognese, D. Boulton, and K. Chapman, Proc. Natl. Acad. Sci. USA, 92, 10212 (1995). Sample Size Determination in Combinatorial Chemistry.
44. C. Pinilla, J. Appel, P. Blanc, and R. A. Houghten, BioTechniques, 13, 901 (1992). Rapid Identification of High Affinity Peptide Ligands Using Positional Scanning Synthetic Peptide Combinatorial Libraries.
45. S. M. Freier, D. A. M. Konings, and J. R. Wyatt, J. Med. Chem., 38, 344 (1995). Deconvolution of Combinatorial Libraries for Drug Discovery: A Model System.
CHAPTER 3
Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics

Robert Q. Topper

Department of Chemistry, The Cooper Union for the Advancement of Science and Art, Albert Nerken School of Engineering, 51 Astor Place, New York, New York 10003
This chapter reviews some of the formal and computational methods that have evolved from the theory of nonlinear dynamical systems (sometimes loosely referred to as "chaos theory") for visualizing global trends in reaction dynamics. These methods have provided new and interesting predictions about the details of pre- and postreactive molecular motions. Much of the focus of the work has been to emphasize the roles played by certain "phase-space structures," which can act as bottlenecks to intramolecular energy transfer or as mediators of reaction dynamics. These structures have been given a variety of exotic names, including "reactive islands," "turnstiles," "cantori," "vague tori," and "cylindrical manifolds." Their properties have helped to provide a framework for visualizing nonstatistical effects in reaction dynamics and molecular dynamics simulations. We will make a reasonable (albeit brief) attempt to review the nonlinear dynamics literature, with an emphasis on some aspects that are relevant to microscopic reaction dynamics. However, our main goals are: (1) to convey to the interested nonspecialist some of the interesting ideas that have arisen
from the application of nonlinear dynamics to the motions of molecules and clusters, (2) to provide a short tutorial for those who may be contemplating working in this area, and (3) to describe some of the numerical methods that have been found to be useful in visualizing phase space. This is done in the hope that it may convey some useful insights into chemical reaction dynamics in general. In this spirit, we will also briefly describe the basis for some of the microscopic kinetic theories of unimolecular reaction rates that have arisen from nonlinear dynamics. Unlike the classical versions of Rice-Ramsperger-Kassel-Marcus (RRKM) theory and transition state theory, these theories explicitly take into account nonstatistical dynamical effects such as barrier recrossing, quasiperiodic trapping (both of which generally slow down the reaction rate), and other interesting effects. The implications for quantum dynamics are currently an active area of investigation.
MOLECULAR DYNAMICS IN PHASE SPACE

Introduction

Readers who can remember what their first course in chemistry was like may recall how hard it was to learn how to visualize molecular structures and, from there, to visualize electronic wavefunctions. It is fair to say that an even more difficult challenge awaits those who attempt to visualize the phase space of reactive classical molecular dynamics within the Born-Oppenheimer approximation. The term "phase space" seems to have been originally coined by Gibbs.1,2 But what is phase space? And if it is so hard to visualize, why should we want to visualize it? Obviously, such an endeavor is worthwhile to the extent that it yields new insights about molecular motion and helps us to interpret experiments. The concept of phase space will gradually emerge from the discourse in the remainder of this section.

Consider the dynamics (and kinetics) of a molecule isomerizing from one form (A) to another (B) in the gas phase at a fixed internal energy E (ideally, in a collisionless environment):

A ⇌ B
Because the internal energy is sufficiently large for A and B to interconvert, a molecule that converts from A to B must eventually back-react, reforming the reactant configuration. If the time scale of back-reaction is similar to that of the forward reaction, the dynamics of this process can be quite complex. One approach to characterizing such a reaction is to simulate the nuclear motions in
the classical limit. The use of classical mechanics to represent molecular motions has been referred to as molecular dynamics (MD).3,4 To carry out constant-energy MD calculations, in principle all we need do is specify a set of initial conditions (positions and velocities) that correspond to the energy we are interested in and use a convenient and efficient algorithm3-5 for the solution of the differential equations that arise from Newton's equations of motion. For a system with m atoms, we will need to specify something on the order of 3m positions. Thus, the application of Newton's laws will result in a set of approximately 3m second-order ordinary differential equations to solve simultaneously. A solution of the equations for a particular initial condition is called a trajectory, and a group of trajectories with the same energy but varying initial conditions constitutes a fixed-energy ensemble of trajectories. In the absence of external forces, each of the trajectories in the ensemble will conserve the energy E, although vibrations and rotations can result in energy transfer from one part of a molecule to another during the course of the trajectory.6 Other quantities may be conserved in addition to E, such as the total angular momentum. These conservation principles can be used as a check on the accuracy with which the trajectory is being integrated. However straightforward it may be in principle to carry out an MD simulation of a small-molecule chemical reaction in a collisionless environment, it is quite another matter to properly interpret the results of the simulation. Assuming for the moment that we can numerically predict a reaction rate in this fashion, the question then arises: Why does the rate constant have the specific value that it does? An answer to this question may shed light on the details of the reaction dynamics, which ultimately give rise to the reaction rate. It may even help us in the interpretation of quantum-mechanical calculations of reaction rates. Assume that E is larger than the energy of any energetic barrier(s) between the reactant and product. Because the molecule is isolated, E will remain constant regardless of whether the molecule happens to be a reactant or a product at any moment in time. Also note that in this collision-free limit, the set of all possible internal motions of a single isomerizing molecule at fixed internal energy can be considered to be representative of every member of a microcanonical ensemble (fixed E, volume, and number of molecules)2 as long as we sample all possible initial conditions properly. Such samples are, incidentally, taken within the molecular phase space. Yet we have still not even defined what phase space is. Instead of specifying only the nuclear coordinates q(t) as a function of time during a reaction, we will also monitor the nuclear momenta p(t) to gain a complete picture of the molecular dynamics in what we call phase space. One advantage of working in phase space is that, unlike configuration space, it is unique. Although trajectories may cross in coordinate space, they never cross in phase space; if two trajectories have the same instantaneous values of (q, p), then they necessarily must have arisen from the same initial condition.7
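As a concrete illustration of the integration step discussed above, the short Python sketch below propagates a single constant-energy trajectory with the velocity Verlet algorithm (one common choice among the algorithms cited above) and monitors the energy drift as the accuracy check mentioned in the text. The double-well potential, mass, and step size are illustrative assumptions of ours, not parameters taken from this chapter.

import numpy as np

def velocity_verlet(q, p, mass, force, dt, nsteps):
    """Propagate one trajectory; returns arrays of q(t) and p(t)."""
    qs, ps = [q], [p]
    f = force(q)
    for _ in range(nsteps):
        p_half = p + 0.5 * dt * f        # half-kick
        q = q + dt * p_half / mass       # drift
        f = force(q)                     # force at the new position
        p = p_half + 0.5 * dt * f        # half-kick
        qs.append(q); ps.append(p)
    return np.array(qs), np.array(ps)

# Illustrative double-well potential V(q) = (q^2 - 1)^2 (assumed, not from the text)
V = lambda q: (q**2 - 1.0)**2
force = lambda q: -4.0 * q * (q**2 - 1.0)
mass, dt = 1.0, 0.01

qs, ps = velocity_verlet(q=0.0, p=1.2, mass=mass, force=force, dt=dt, nsteps=20000)
E = ps**2 / (2 * mass) + V(qs)
print("relative energy drift:", (E.max() - E.min()) / E[0])   # conservation check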
The formal basis for this point of view is that there are several ways to write down classical equations of motion. The most familiar way to do this is to write down Newton's equation (force = mass x acceleration). For one-dimensional motion of a particle with mass \mu along a coordinate x and subject to a time-independent force F(x), we have

F(x) = \mu \frac{d^2 x}{dt^2}   [2]

and

F(x) = -\frac{dV(x)}{dx}   [3]

with V(x) the potential energy. This yields an ordinary differential equation to solve that is second order with respect to the time t. This equation does not always admit a closed-form solution, but in principle it can always be solved numerically; all one-dimensional systems are classically integrable.7 Once x(t) is known (which corresponds to generating a trajectory), the momentum p(t) is easily obtained from its derivative:

p = \mu \frac{dx}{dt}   [4]
If the system is described by two or more coordinates, the relationships in Eqs. [2-4] also apply in vector form, but only if Cartesian coordinates are used.7 For systems described in more general coordinate systems (for example, internal coordinates), it is often more convenient to write down the equations of motion in a different form. One of these forms involves describing the dynamics in terms of two first-order differential equations, called Hamilton's equations of motion:

\frac{dq}{dt} = \frac{\partial H}{\partial p}   [5]

and

\frac{dp}{dt} = -\frac{\partial H}{\partial q}   [6]

with q(t) a generalized (not necessarily Cartesian) coordinate along which the one-dimensional motion occurs, and H(p, q) the classical Hamiltonian function. H(p, q) is typically given by an equation of the form

H(p, q) = T(p) + V(q)   [7]
where T(p) is the kinetic energy expressed as a function of the momentum instead of the velocity. H(p, q) (which is not the Hamiltonian operator from quantum mechanics8,9) is equal to the (constant) total energy of the system.7 It can be shown that Hamilton's equations of motion are a valid representation of classical mechanics no matter what coordinate system is used, and that they always generate the same trajectories that Newton's equations generate for a given system.4,7 In the Hamiltonian way of looking at classical dynamics, the momentum plays a more prominent role than it does in the Newtonian formulation. In Eqs. [5-7] the position and momentum variables are more or less on an equal footing; we must know how both the position and the momentum are varying with time to solve the equations. This inspires us to focus on visualizing the dynamics in phase space. What is more, it turns out that in the classical version of statistical mechanics, all thermodynamic functions (such as the free energy) are obtained by computing phase-space averages,2 further inspiring us to make the conceptual leap into phase space. Note that p and q are formally connected to one another in a special way. It turns out that the dynamics generated by Eqs. [5-7] is completely described only if we define the momentum p in relation to the Lagrangian, L(q-dot, q),7,10 which is a function of the velocity q-dot and the position q for a set of particles,
p = \left( \frac{\partial L}{\partial \dot{q}} \right)_q   [8]

and

L(\dot{q}, q) = T(\dot{q}) - V(q)   [9]
where T(q-dot) is the kinetic energy expressed as a function of the velocity. If Cartesian coordinates are used, Eqs. [8, 9] can easily be shown to yield Eqs. [3, 4]. The relationship between p and q implied here is emphasized by stating that p and q are canonically conjugate variables.4,7,10
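As a quick consistency check, one can verify that Hamilton's equations reproduce the Newtonian equations for the one-dimensional Hamiltonian of Eq. [7] with T(p) = p^2/2\mu:

\frac{dq}{dt} = \frac{\partial H}{\partial p} = \frac{p}{\mu} \quad\Rightarrow\quad p = \mu \frac{dq}{dt}

\frac{dp}{dt} = -\frac{\partial H}{\partial q} = -\frac{dV}{dq} = F(q) \quad\Rightarrow\quad F(q) = \mu \frac{d^2 q}{dt^2}

which are just Eqs. [4] and [2]; likewise, with T(\dot{q}) = \mu \dot{q}^2/2, Eq. [8] gives p = \mu \dot{q}, so the Newtonian, Lagrangian, and Hamiltonian descriptions coincide for this case.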
It is clear that there are some advantages to working in phase space, but it is not yet clear that we should want to visualize the dynamics in phase space. For a system described by an N-dimensional coordinate vector q(t), there are also N components to p(t); thus the phase space is 2N-dimensional! We have therefore doubled the dimensionality of the problem in some sense . . . or actually, not quite. Because E is fixed, not all values of the internal coordinates and momenta are accessible (assuming that all motion is energetically bounded, that is, there is no molecular dissociation), and the constraint of energy conservation necessarily reduces the dimensionality of the accessible phase space. Thus, motion does not completely fill the 2N-dimensional phase space [q(t), p(t)]; rather, it is restricted to a (2N - 1)-dimensional surface of (generally) limited extent that is embedded within the full 2N-dimensional phase space.

Let us consider an example of the benefit that can be derived from visualizing phase space. As an exercise, we examine the one-dimensional motion of a classical harmonic oscillator at fixed energy. The Hamiltonian is

H(p, q) = \frac{p^2}{2\mu} + \frac{1}{2} \mu \omega^2 q^2   [10]
where q here represents the displacement of the oscillator from its equilibrium position, \omega is the harmonic frequency of oscillation, and p is the momentum associated with the vibrational motion. This model is often adopted to describe the vibrational motion of a diatomic molecule.9 Note that if we fix E in this equation, we have an equation of only two variables, which are not independent. As it turns out, we can describe the phase space completely for this one-dimensional system (or, for that matter, any one-dimensional system) without calculating a single trajectory.10 Solving this expression for p, we obtain

p = \pm \sqrt{2\mu \left( E - \frac{1}{2}\mu\omega^2 q^2 \right)}   [11]
with both roots representing physically admissible solutions. A plot of Eq. [11] results in a simple ellipse (see Figure 1 and refer back to Eq. [10], recalling that the general equation for an ellipse is of the form x^2/a^2 + y^2/b^2 = 1). Thus, the phase space of the one-dimensional harmonic oscillator is a one-dimensional curve, and all trajectories will follow this curve as they evolve in time. Because the curve is closed, the motion must be periodic, and the period of oscillation is simply the time required by the oscillator to make one complete circuit around the ellipse. In classical mechanics, we know that the momentum may never be imaginary. Thus, for fixed E the only possible values of p are the ones for which [E - \mu\omega^2 q^2/2] >= 0 (the equality sign applies when p = 0, at which point the oscillator is at a turning point and has momentarily stopped moving). The maximum value of |p| will occur when q = 0, where the potential energy is at its minimum. If we next increase the energy of the oscillator, the size of the ellipse increases, but there are no qualitative changes. We therefore can visualize the global properties of the harmonic oscillator's phase space quite easily as a series of ellipses snugly nested within one another (as shown in Figure 1). Incidentally, when phase space consists of low-dimensional structures nested in a higher-dimensional space (here we have one-dimensional elliptical curves nested in a two-dimensional planar space), the phase space is said to be foliated. We have now begun to learn how to visualize phase space.
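The nested-ellipse picture of Figure 1 is easy to generate numerically. The short Python sketch below evaluates Eq. [11] on a grid of q values for three energies and plots the resulting closed curves; the unit mass and frequency are illustrative choices of ours, not values prescribed by the chapter.

import numpy as np
import matplotlib.pyplot as plt

mu, omega = 1.0, 1.0                      # assumed illustrative parameters

def p_of_q(q, E):
    """Eq. [11]: momentum on the energy shell, NaN where classically forbidden."""
    arg = 2.0 * mu * (E - 0.5 * mu * omega**2 * q**2)
    return np.sqrt(np.where(arg >= 0.0, arg, np.nan))

q = np.linspace(-3.0, 3.0, 400)
for E, label in zip((0.5, 2.0, 4.0), ("L", "M", "H")):   # low, medium, high energies
    plt.plot(q, p_of_q(q, E), "k-", label=label)          # upper branch, p > 0
    plt.plot(q, -p_of_q(q, E), "k-")                      # lower branch, p < 0
plt.xlabel("q"); plt.ylabel("p"); plt.legend()
plt.show()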
Figure 1 (Top) Plot of the potential energy U of a one-dimensional harmonic oscillator as a function of its separation from its equilibrium geometry, given as q. (Bottom) Phase space of a one-dimensional harmonic oscillator, showing the conjugate momentum p as a function of q. The horizontal lines in the upper plot and the phase space curves in the lower plot correspond to low (L), medium (M), and high (H) energies.
However, if more than one coordinate is needed to describe a system, we may run into trouble. For a system described by N coordinates, the dynamics will "live" in a (2N - 1)-dimensional subset of phase space. We are able to visualize one-dimensional motion, which exists in a one-dimensional subset of a two-dimensional phase space, and we may even be able to visualize two-dimensional motion, which exists in a three-dimensional subset of a four-dimensional phase space. But how can we visualize motion in three or more coordinates?
In particular, how can we visualize the phase space of molecules that can undergo reactions?
What We Hope to Gain: Semiclassical Insight

Given the difficulties alluded to above, a skeptical reader may (rightly) be concerned about the fundamental question of whether the use of classical mechanics to understand trends in microscopic reaction rates is even well justified. If classical mechanics yields an inaccurate description of molecular motions, why spend a lot of time thinking about the classical phase space of reacting molecules? From our perspective, there are several reasons for this endeavor. First, it is certainly the case that the use of classical (Newtonian) molecular dynamics simulations to interpret and even predict the behavior of large molecules, interfaces, clusters, and materials constitutes a large and growing part of the primary literature of computational chemistry.3,4 Assuming that we have a perfect knowledge of the Born-Oppenheimer potential energy surface for a molecule (or group of molecules), we can in principle integrate Newton's equations of motion and obtain dynamical and spectroscopic information about the system of interest at any energy we choose. In addition, by using the dynamics to sample the potential energy surface, we can discover which structures are lowest in energy, or even those that are lowest in free energy. This approach basically presupposes that such interesting quantum-mechanical phenomena as tunneling and zero-point motion provide at most quantitative corrections to the dynamics of the system of interest, or are irrelevant to the information sought from the study. A great deal of useful structural, dynamical, and thermodynamic information has been obtained using the MD approach. It remains a point of scientific concern and interest to identify those situations and applications where the MD approach is valuable and where it may break down. Then, there is the wealth of information that classical trajectory studies have given us regarding the details of molecular reactive and nonreactive scattering. In some studies, for example, initial conditions for an ensemble of scattering MD trajectories are chosen in such a way as to take into account zero-point motion.9 Such a simulation is said to be quasiclassical.11 In quasiclassical work, initial conditions for the ensemble of trajectories are chosen so that their distribution of positions and momenta somewhat resembles that of a quantum-mechanical system. Quasiclassical simulations have been carried out for a wide variety of unimolecular and bimolecular reactions, and the comparison of these with results from rigorous quantum mechanics has proved enlightening.12 Interpreting the results of quasiclassical simulations can be aided by visualizing the phase space,13 and it seems possible that this might also help in the development of new quasiclassical methods in the future.
In addition, a number of interesting studies have observed signatures of classical resonance phenomena in quantum-mechanical simulations of reactions.14-22 This has stimulated further interest in the classical dynamics of reactive motion. Finally, although we will not consider this line of research in the present tutorial, some really interesting work has been done in investigating the possibilities for signatures of classical chaos in spectroscopy and other sensitive probes of quantum dynamics.23 Molecules that are intermediate in size between "purely classical" systems (such as a bowling ball) and "purely quantal" systems (such as a proton) are in a certain sense wonderful laboratories for probing the detailed nature of the Correspondence Principle.7-9 Understanding classical molecular dynamics may help us to understand some of the underlying principles of quantum molecular dynamics. What is more, we have certain relations (such as the Ehrenfest theorem24) that assure us that the center of a quantum-mechanical wave packet moves like a classical particle. Thus, on some short time scale (how short?), we expect that quantum dynamics may closely agree with classical dynamics. Of course, from this perspective the time scale on which the wave packet spreads, tunnels, and bifurcates must be much longer than the time scale of the motion of the center of the wave packet for the classical approximation to be useful. The interested reader should consult other sources for more information on this fascinating area of research.25
REACTION RATES FROM DYNAMICS SIMULATIONS

In general, it seems reasonable to believe that we should be able to quantitatively account for any large deviations that may occur between the kinetics of MD simulations (i.e., from "numerical experiments") and the kinetics predicted by simple theoretical models of reaction rates (such as transition state theory).26 We usually should be able to obtain numerically the asymptotic reaction rate from MD simulations at a particular E by integrating Hamilton's equations of motion for an ensemble and counting the number of these trajectories that correspond to reactants at any particular time. In a practical sense, there may be some ambiguity in identifying when the system is on the reactant side or on the product side of the reaction. This is usually resolved by establishing a dividing surface between reactants and products. The dividing surface is often chosen by identification of a critical configuration, perhaps (but not necessarily) at the summit of an energetic barrier between reactants and products. It is sometimes possible to identify a reaction coordinate that is single-valued at the dividing surface.27 In such cases, when a trajectory's value of the reaction coordinate is to the left of the dividing surface, we say that the system is a "reactant"; otherwise, the system is a "product."
An excellent, entertaining, and much more careful discussion of the identification of critical configurations and dividing surfaces has been presented elsewhere by Pechukas.28 Let us consider the MD simulation of an elementary gas-phase unimolecular isomerization reaction of the form of Eq. [1]. We first define a suitable dividing surface between reactants and products. If we next initiate an ensemble of trajectories, all with the same energy (we might do this uniformly on the constant-energy surface of possible initial positions and momenta, forming a "molecular microcanonical ensemble") and all initially on the reactant side of the dividing surface, we can monitor the decay of this initially nonequilibrium distribution of trajectories as a function of time.
Initial Conditions

A word or two about the generation of initial conditions that are uniform on the constant-energy surface might be helpful. Let us say we wish to generate initial conditions uniformly on the elliptical phase-space surface of a harmonic oscillator with \mu = \omega = 1 and at E = 1/2 (refer to Figure 1). This can be accomplished by a simple series of steps. First, the turning points of the oscillator in position and momentum space should be determined. Next, points may be randomly generated within a rectangle formed by the turning points (in practice the rectangle might be slightly larger). Next, any points that lie outside the curve defined by E = 1/2 are rejected (actually this is not strictly necessary, but we do it for simplicity). Let the coordinates of such a point within the ellipse be the vector r_init. Such points are uniform on the rectangular grid formed by the two phase-space coordinates, but they are not uniform on the one-dimensional surface of constant energy.

As our next step, we now start at r_init and move along a vector z (to be determined) that "pokes" at the energy surface along the direction of a unit vector that is locally normal to the energy surface, n-hat (Figure 2). This unit vector is given by the general formula

\hat{n} = \frac{\nabla H}{|\nabla H|}   [12]

where \nabla H is the gradient of the Hamiltonian. In this one-dimensional example, we would have

\nabla H = \left( \frac{\partial H}{\partial q} \right) \hat{q} + \left( \frac{\partial H}{\partial p} \right) \hat{p}   [13]

The coordinates of the point so generated will be given by a vector r_E:

\vec{r}_E = \vec{r}_{init} + \vec{z}   [14]
Figure 2 Illustration of construction of an initial condition that is part of a set uniformly distributed on a constant-energy shell of a one-dimensional harmonic oscillator.
We have determined the direction of z, but not its magnitude. To find the point along the direction of n-hat that lies exactly on the energy shell we are interested in, we let z be parameterized by its magnitude s:

\vec{z} = s \hat{n}   [15]

But what is s? We know that

H_{test}(s) = H(\vec{r}_{init} + s \hat{n})   [16]

Therefore, we need only vary the function H_test(s) with respect to s until we find a (positive) root of the equation

H_{test}(s) - E = 0   [17]

Points that are generated by this algorithm will be uniformly distributed on the constant-energy shell. These equations, and this algorithm, generalize readily to multidimensional anharmonic systems. Obviously, once we have established a dividing line between the various possible isomers,
we can further restrict our samples to a single molecular isomer and obtain a distribution that is initially uniform on the constant-energy surface within that isomer.29 The procedure just described may be referred to as a "crude" microcanonical sampling procedure and is likely to be practical only for low-dimensional systems. Much more sophisticated and efficient procedures have been developed and are in use. Rather than attempting to completely survey these methods, we refer the interested reader to the literature for more information on this topic.3,4,11,30
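The following Python sketch implements the crude sampling procedure just described for the harmonic oscillator of Figure 1 (\mu = \omega = 1, E = 1/2). The rejection step, the surface normal of Eq. [12], and the root search of Eq. [17] all appear explicitly; the use of scipy.optimize.brentq as the root finder is our choice for illustration, not something prescribed by the chapter.

import numpy as np
from scipy.optimize import brentq

mu = omega = 1.0
E = 0.5

H = lambda q, p: p**2 / (2 * mu) + 0.5 * mu * omega**2 * q**2
grad_H = lambda q, p: np.array([mu * omega**2 * q, p / mu])   # (dH/dq, dH/dp)

# Turning points of the ellipse: |q| <= qmax, |p| <= pmax
qmax = np.sqrt(2 * E / (mu * omega**2))
pmax = np.sqrt(2 * mu * E)

rng = np.random.default_rng(1)
samples = []
while len(samples) < 1000:
    r = rng.uniform([-qmax, -pmax], [qmax, pmax])    # random point in the rectangle
    if H(*r) > E:                                    # reject points outside the ellipse
        continue
    g = grad_H(*r)
    if not np.any(g):                                # skip the (measure-zero) origin
        continue
    n_hat = g / np.linalg.norm(g)                    # Eq. [12], outward normal
    f = lambda s: H(*(r + s * n_hat)) - E            # Eq. [17], as a function of s
    s_root = brentq(f, 0.0, 2 * (qmax + pmax))       # generous bracket for the root
    samples.append(r + s_root * n_hat)

samples = np.array(samples)
print("max |H - E| over samples:", np.abs(H(samples[:, 0], samples[:, 1]) - E).max())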
Rate Constants

Let n_A(t) be the number of reactant molecules in the ensemble at time t, with n_B(t) similarly defined for the products. In general, one observes a variety of transient behavior in a plot of n_A(t), the nature of which will depend to a certain extent on the initial conditions. On a short time scale, n_A(t) and n_B(t) may, for example, undergo oscillations or decay at a different rate than that at longer time scales, depending on the nature of the system of interest and the energy, as well as on the initial conditions (see Figure 3).29 However, let us look at the long-time behavior of the ensemble,
Figure 3 Fractional population decay curves for a model of three-state molecular isomerization with all trajectories initially distributed uniformly on the constant-energy shell within a single isomer A.29
recalling our assumption that there are no oscillations in n_A(t) and n_B(t) in this time regime. The regression hypothesis of Onsager31 implies that the decay rate from A to B at large t will be independent of the initial conditions chosen, reflecting the rate at which spontaneous fluctuations of the system away from equilibrium reapproach the equilibrium state. The forward (k_AB) and reverse (k_BA) rate constants at long times are related to the microcanonical equilibrium constant K by the well-known relation2,32,33

K = \frac{k_{AB}}{k_{BA}}   [18]
Ultimately we would like to understand the kinetics on all time scales, but if we restrict our attention to long time scales, we know that this relationship will hold. Assuming also that n_A(t) and n_B(t) will eventually settle down to well-defined equilibrium values, we next follow Chandler33 and write down first-order phenomenological rate equations for the asymptotic decay:

\frac{dn_A(t)}{dt} = -k_{AB}(E)\, n_A(t) + k_{BA}(E)\, n_B(t)   [19]
and
\frac{dn_B(t)}{dt} = k_{AB}(E)\, n_A(t) - k_{BA}(E)\, n_B(t)   [20]

where k_AB(E) is a first-order rate constant for the forward reaction at fixed energy E, and so on. Using the detailed balance condition,32,33 we integrate Eqs. [19, 20] to find

n_A(t) - n_A^{eq} = [n_A(0) - n_A^{eq}] \exp[-k(E)\, t]   [21]
with n_A(0) being the number of reactant molecules at t = 0, n_A^eq the number of reactant molecules at equilibrium, and k(E) the phenomenological rate constant, which is related to the forward and reverse reaction rate constants33:

k(E) = k_{AB}(E) + k_{BA}(E)   [22]
We now see that k(E) can in principle be obtained by making a plot of ln[n_A(t) - n_A^eq] as a function of time and fitting it to a straight line in the long-time-scale regime, after any transient (and perhaps complex) kinetic phenomena have occurred.33-35
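As an illustration of this fitting procedure, the Python sketch below generates a synthetic two-state decay curve obeying Eq. [21] (with arbitrary illustrative values of k_AB and k_BA, plus a little noise standing in for finite-ensemble fluctuations) and recovers k(E) from the slope of ln[n_A(t) - n_A^eq]. With real MD data, one would instead bin the trajectory counts n_A(t) and restrict the fit to the post-transient regime.

import numpy as np

k_AB, k_BA = 0.30, 0.10                  # assumed illustrative rate constants
kE_true = k_AB + k_BA                    # Eq. [22]
nA0, n_total = 1.0, 1.0
nA_eq = n_total * k_BA / (k_AB + k_BA)   # equilibrium population, since K = k_AB/k_BA

t = np.linspace(0.0, 10.0, 200)
rng = np.random.default_rng(7)
nA = nA_eq + (nA0 - nA_eq) * np.exp(-kE_true * t)   # Eq. [21]
nA_noisy = nA + rng.normal(0.0, 1e-4, t.size)       # finite-ensemble "noise"

# Fit ln[nA(t) - nA_eq] to a straight line; the slope is -k(E)
mask = nA_noisy - nA_eq > 0                         # keep the log argument positive
slope, intercept = np.polyfit(t[mask], np.log(nA_noisy[mask] - nA_eq), 1)
print("k(E) fitted:", -slope, "  exact:", kE_true)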
Note that k(E) is in principle completely independent of the initial conditions used to compute it. We should point out that other methods, including the computation of autocorrelation functions, may be used to extract the reaction rate.33 Also note that the above result can be generalized to multiple isomers. Sometimes life is not so simple, however. The preceding discussion assumes that the ensemble will in fact always decay to an equilibrium distribution of reactants and products. However, certain models of chemical reactions display periodic oscillations of n_A(t) and n_B(t), and in such cases the above procedure cannot be strictly employed, because the idea of a rate constant no longer strictly applies (if there is no decay, there can be no constant decay rate).35 As we shall see, this situation generally will exist whenever the dynamics along the reaction coordinate is only weakly coupled to the other coordinates. What is more, such fits are often difficult to carry out in practice, requiring many trajectories to adequately represent an ensemble. Assuming that the concept of a rate constant is valid, we might consider using a microscopic theory of unimolecular chemical reactions to predict what the reaction rate should be and then check whether the theory agrees with the rate obtained from the computer simulation. The theory most widely used for this purpose is the RRKM theory developed by Rice and Ramsperger,36 Kassel,37 and Marcus and co-workers.38 As has been discussed in detail elsewhere,32 RRKM theory contains the same essential dynamical assumptions as transition state theory.26 We discuss these assumptions briefly in the next section. A number of MD studies on various unimolecular reactions over the years have shown that there can sometimes be large discrepancies (an order of magnitude or more) between reaction rates obtained from molecular dynamics simulations and those predicted by classical RRKM theory. RRKM theory contains certain assumptions about the nature of prereactive and postreactive molecular dynamics; it assumes that all prereactive motion is statistical, that all trajectories will eventually react, and that no trajectory will ever recross the transition state to re-form reactants.26 These assumptions are apparently not always valid; otherwise, why would there be discrepancies between trajectory studies and RRKM theory? Understanding the reasons for the discrepancies may therefore help us learn something new and interesting about reaction dynamics. To this end, significant effort has been devoted over the last 15 years to the development of theories of chemical reaction kinetics and dynamics based on ideas gleaned from nonlinear dynamics. It has been found that techniques similar to those used to analyze instabilities in weather patterns and the formation of galaxies can be employed to visualize pre- and postreactive phase space. This makes it possible to determine what kinds of motions a molecule must execute in order to react. In turn, this knowledge can be used to make a prediction of the reaction rate that is significantly more accurate than RRKM theory.
Again, the organizing principle is the need to visualize reaction dynamics in phase space. By concentrating on low-dimensional, nonrotating systems, considerable progress toward this goal can be made. We review some of the results of this effort, as well as their practical implementation, and discuss the application of the resulting insights to reaction rate theory.
CHEMICAL KINETICS, CHAOS, AND MOLECULAR MOTIONS
A Brief Review of Absolute Rate Theory

Absolute rate theory, the prediction of rate coefficients from first principles, has been reviewed in several excellent articles and texts, which we wholeheartedly recommend to the interested reader.26,39-44 In the following, we (very) briefly summarize the field. The references cited are meant to be representative of the field, but they are certainly not exhaustive (in fact, this same caveat applies to the remainder of the chapter). Perhaps the best-known expression in the kinetic theory of elementary bimolecular reactions is the empirical expression for the rate constant proposed in 1889 by Svante Arrhenius32:

k(T) = A \exp(-E_a / RT)   [23]
with k(T) the rate constant as a function of the temperature T, R the universal gas constant, A the frequency factor, and E_a the activation energy. In 1935 Eyring,45 as well as Evans and Polanyi,46 used certain assumptions to approximately link this empirical relationship to microscopic parameters. Their work showed that the factor exp(-E_a/RT) represents the fraction of molecules that have at least energy E_a (where E_a is given by the height of the energetic barrier assumed to separate reactants from products), and that the prefactor A can be thought of as the frequency of collisions that occur with the correct orientation to react successfully.32,47,48 Thus, reaction rates are roughly determined by (1) how much energy is available for reaction, (2) how often activated molecular collisions occur, and (3) what fraction of activated collisions have the proper relative orientation of the reactants. It is straightforward to show that an expression for k(T) that is in close correspondence to the Arrhenius equation may be obtained by proper statistical-mechanical averaging of the microcanonical RRKM rate constant.32 The Arrhenius equation, interpreted using the theory of Eyring, Evans, and M. Polanyi, seems at first glance to contain virtually all the information one would need to know about reaction kinetics.
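To make Eq. [23] concrete, the following small Python example evaluates k(T) and the Boltzmann factor exp(-E_a/RT) for a hypothetical reaction; the values of A and E_a are invented for illustration only.

import numpy as np

R = 8.314      # J mol^-1 K^-1
A = 1.0e13     # s^-1, hypothetical frequency factor
Ea = 80.0e3    # J mol^-1, hypothetical activation energy

for T in (250.0, 300.0, 350.0):
    frac = np.exp(-Ea / (R * T))   # fraction of sufficiently energetic collisions
    print(f"T = {T:5.1f} K   exp(-Ea/RT) = {frac:.3e}   k(T) = {A * frac:.3e} s^-1")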
However, as hinted at previously, a number of reactions have been observed to exhibit behavior that does not obey the Arrhenius law (for a variety of postulated reasons),49 and others appear to behave according to the Arrhenius law, but the measured rates are inconsistent with other kinds of experiments.50,51 Moreover, the Arrhenius law tells us nothing about the rate at which a system initially in a nonequilibrium state evolves toward equilibrium. It also tells us nothing about the nature of the population decay. For these reasons, the experimental and theoretical study of microscopic reaction dynamics has been one of the more active research areas in physical chemistry.51,52 A large body of work has been dedicated to the development and testing of microscopic rate theories, which attests to the complexity of the problem. In essence, most theories of microscopic kinetics treat the process of chemical reaction as a statistical phenomenon that occurs by passage through a single bottleneck known as the transition state. Many authors have different definitions for the transition state. It is usually defined as either an energetic or a dynamical barrier, which may be used to establish a dividing surface between reactants and products. In the transition state theory of reaction rates, all molecules that cross through a transition state are assumed to fall into the products region and never return.28,32 Somewhat more correctly, transition state theory assumes that molecules crossing through the transition state do not return on a time scale comparable with that on which they cross into products.33 This means that the motion of barrier crossing is assumed to be very fast compared to the time it takes to recross the barrier, and thus recrossing can be neglected completely. It has been shown that the "assumption of no return" is completely equivalent to Eyring, Evans, and M. Polanyi's assumption that the transition state and the reactants are always in thermal equilibrium with one another.28,32,44 As has been discussed elsewhere, RRKM theory uses the same basic set of assumptions about the dynamics of barrier crossing as does transition state theory; it is, literally, a highly useful and practical version of transition state theory designed specifically for unimolecular reactions at fixed energy.32 Transition state theory in its classical, quantum, semiclassical, and variational forms has proven to be a powerful and useful tool in the prediction of chemical reaction rates.52 However, we can easily imagine a situation in which the no-recrossing assumption must break down, at least on a certain time scale. Consider bound isomerization as an example: when a molecule twists from one form into another and isomerizes, classical molecular dynamics simulations typically reveal that the molecule can twist back into its original form after only a few vibrational periods, in a nonstatistical pattern of back-reactions, crossing and recrossing the transition state.35,53 This situation, which can also occur to a somewhat lesser extent in bimolecular reactions, was referred to by Wigner as the recrossing problem.54 But what does it mean for motion to be nonstatistical? And how do we include nonstatistical recrossing in a microscopic kinetic theory? These questions are the focus of our attention here.
Overview of Nonlinear Dynamics and Chaos Theory

When we study dynamical models of molecules, using computers to predict and analyze their motions, we observe that the motions can (loosely) be classified as being either regular or chaotic. Regular motion tends to be well-behaved and even somewhat predictable, such as the motion of a clock pendulum or children on a see-saw. Chaotic motion, on the other hand, is so erratic that it is extremely difficult to predict (even using a computer) exactly where the moving object will go next. Most molecular models, as we shall see, admit a mixture of coexisting regular and chaotic motions. Motion near an energetic barrier of a potential energy surface is typically (but not always) chaotic at energies above the barrier height. If "chaotic" were synonymous with "statistical," then reaction dynamics could in such cases be completely understood using standard statistical methods and theories. However, modern work in the theory of dynamical systems has shown that chaotic motion has an underlying structure; there is order in chaos. The underlying structure within chaotic motion is the source of nonstatistical behavior, and this structure needs to be understood in order to predict how the system will behave. Let us consider all possible internal geometries and momenta (velocities) that are possible for an isomerization reaction at a particular total energy E > E_b. One assumption made by transition state theory is that no matter how fast a molecule is moving or what configuration it is in, every molecule with properly oriented momentum along the reaction coordinate has the same chance of crossing the barrier as does any other molecule. This is the same as saying that the phase space is structureless, or nondecomposable, which basically means that the probability of a molecule reacting at any moment in time is completely random. It is fair to say that there has been a renaissance in the study of the properties of nonlinear dynamical systems in recent years.55 This work has brought forth the notion that the equations of classical mechanics describing most systems are fundamentally unsolvable, or nonintegrable, and that as a result all nonintegrable systems have universal scaling laws which underlie their complex and apparently chaotic motions.56 We now delve into the topic of what is meant by chaotic motion. We should first point out that there is no generally agreed upon technical definition of what the word "chaotic" means for dynamical systems, but it is nevertheless possible for us to get a sense of what the issues are. It is not a foreign notion that some kinds of systems exhibit regular motion (a pendulum, a one-dimensional oscillator, a thrown baseball) and others behave erratically (a balloon with air escaping from its nozzle). It is more unsettling to consider the three ideas that follow.
(1) If we know all forces affecting the balloon as a function of time and try to predict the balloon's trajectory numerically (given exactly correct initial conditions) using a computer with finite (even infinitesimal) precision, the predicted and actual trajectories will diverge rapidly and observably from one another as a function of time.

(2) Similarly, divergence will also occur if we have an infinitely precise computer to solve the chaotic problem, but the balloon experiences an unaccounted-for, infinitesimal fluctuation between finite time steps of the trajectory calculation. The accumulation of errors resulting from the extreme sensitivity of a chaotic trajectory to its instantaneous environment was called the "butterfly effect" by Lorenz.57 The butterfly effect arises from the fact that the balloon's trajectory is dynamically unstable, which means that it is so sensitive to changes in its instantaneous environment that the perturbations caused by the fluttering of a butterfly's wing thousands of miles away are sufficient to cause the trajectory of the balloon to change from what it would otherwise have been.

(3) There is yet a third consideration, which Percival pointed out in a recent (fascinating) review.58 If we have an infinitely precise calculation of trajectories and know all instantaneous forces perfectly, divergence will still occur between our computation and experiment if we have only a finite approximation to the balloon's initial conditions. In Percival's own words58:

According to one point of view, expressed by Laplace, dynamical systems like the Solar System are completely deterministic, so probability theory can have no relevance. But this point of view requires a God-like omniscience in being able to determine initial conditions exactly. This requires an infinite number of digits and is beyond the capacity of anybody or anything of finite size, including the observable Universe (Ford 1983) [Ref. 59]. In reality measurement is only able to determine the state of a classical system to a finite number of digits, and even this determination is subject to errors, without quantum mechanics, and whether this determination is made by human or machine. Such measurements limit the known or recorded motion to a range of possible orbits.
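A quick numerical demonstration of points (1) and (2) is easy to set up. The Python sketch below integrates two trajectories of the Henon-Heiles Hamiltonian (a standard two-mode nonlinear model, used here purely as an illustrative stand-in for a chaotic molecular system) from initial conditions differing by one part in 10^8 and prints their growing phase-space separation; the integrator, step size, and initial conditions are illustrative assumptions of ours.

import numpy as np

def accel(x, y):
    """Henon-Heiles forces, from V = (x^2 + y^2)/2 + x^2 y - y^3/3."""
    return -x - 2.0 * x * y, -y - x**2 + y**2

def trajectory(state, dt, nsteps):
    """Velocity Verlet for the two-mode system; state = [x, y, px, py]."""
    x, y, px, py = state
    out = np.empty((nsteps + 1, 4)); out[0] = state
    ax, ay = accel(x, y)
    for i in range(1, nsteps + 1):
        px += 0.5 * dt * ax; py += 0.5 * dt * ay
        x += dt * px; y += dt * py
        ax, ay = accel(x, y)
        px += 0.5 * dt * ax; py += 0.5 * dt * ay
        out[i] = (x, y, px, py)
    return out

s0 = np.array([0.0, 0.10, 0.50, 0.0])        # a chaotic regime at this energy (assumed)
s1 = s0 + np.array([1e-8, 0.0, 0.0, 0.0])    # nearly identical initial condition
t0, t1 = trajectory(s0, 0.01, 20000), trajectory(s1, 0.01, 20000)
for i in range(0, 20001, 4000):
    print(f"t = {i * 0.01:6.1f}   separation = {np.linalg.norm(t0[i] - t1[i]):.3e}")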
Taking these three considerations together, it is apparent that we must generally resign ourselves to an imperfect ability to assess and predict the dynamical states of most deterministic systems. Coarse-grained descriptions of system dynamics are therefore the subject of much interest.58-62 The aim of a coarse-grained description of the dynamics, and also of transition state theory in the context of reaction dynamics, is to predict the behavior of families of trajectories instead of individual ones. Henri Poincaré seems to have been the first to recognize the existence of dynamical chaos, its intrinsic connections with the field of topology, and its importance to physics. However, the importance of his three-volume work on the subject and its implications for planetary motion, Les Méthodes Nouvelles de la Mécanique Céleste (1899), was apparently not widely appreciated until recent times.63 In Gleick's interesting book Chaos: Making a New Science,64 the contribution of Poincaré is summarized as follows:

Both subjects, topology and dynamical systems, went back to Henri Poincaré, who saw them as two sides of one coin. Poincaré, at the turn of the century, had been the last great mathematician to bring a geometric imagination to bear on the laws of motion in the physical world. He was the first to understand the possibility of chaos; his writings hinted at a sort of unpredictability . . . But after Poincaré's death, while topology flourished, dynamical systems atrophied.
A great deal of attention has been focused in recent years by workers in classical dynamics on the geometric properties of phase-space structures and their manifestation on Poincaré maps (also referred to as surfaces of section). The result has been the blossoming of a huge literature on the subject of nonlinear dynamics (quasiperiodicity and dynamical chaos), which is discussed in a number of recent textbooks and articles.58,65-75 We focus on the application of chaos theory to constant-energy systems described by at least two coordinates (degrees of freedom), whose properties are generally relevant to models of microscopic chemical reaction dynamics. Fermi and co-workers carried out a pioneering numerical study of the equilibration rate of one-dimensional chains linked by nonlinear potentials and observed that these nonintegrable systems exhibited quasiperiodic behavior.76 Kolmogorov, Arnold, and Moser formulated the famous KAM theorem (discussed in the following section) for the stability of integrable motion in the presence of a nonintegrable perturbation.58,77-81 Chirikov and his co-workers considered the overlap of resonances as a criterion for the global onset of dynamical chaos.82 Poincaré,63 Birkhoff,83 and Smale84 investigated the stability properties and ergodic behavior of motion near unstable orbits. Greene et al.85 pointed out that although conservative systems have universal scaling properties in the chaotic limit, the scaling properties of conservative systems are different from those of nonconservative (time-dependent) systems, and the two belong to different universality classes.56 MacKay, Meiss, and Percival86 and Bensimon and Kadanoff87 developed a theory of transport for Hamiltonian systems and showed that the difference in accumulated action between well-chosen orbits is related to the flux of trajectories between different parts of chaotic phase space.
120 Visualizing Molecular Phase Space
i~omerization.3~>93 Davis and Wyatt examined classical resonance effects in the interaction of a diatomic molecule with an electric field.94 Hutchinson, Reinhardt and Hynes applied the Chirikov analysis to vibrational energy transfer in linear hydrocarbon chains.95 Jaffd, Reinhardt, and Shirts have considered “vague tori” in the chaotic phase space of models in intramolecular energy transfer.96-98 In an important article, Davis showed that the transport properties described by MacKay et a1.86 could be used to understand the rate of intramolecular vibrational energy transfer in a collinear model of the OCS molecule, observing and developing rigorous connections between nonlinear dynamics and reaction rate theory.60 Martens, Davis, and Ezra considered multidimensional resonance phenomena in 0CS.99 A number of workers have discussed the connection between classical resonance phenomena and Fermi resonances.6J00 Davis et al. explored the nonlinear dynamics of bimolecular exchange and unimolecular decomposition reactions.101-103 Gray and Rice considered unimolecular isomerization in these term~.~O~J05 Gillilan and Reinhardt have analyzed diffusion of atoms on surfaces also in terms of nonlinear dynamics.62 Inspired by the experimental work of Harthcock and Laane,l06 Marston and De Leon studied the conformational isomerization of 3-phospholene and presented numerical evidence for a previously unreported transport sequence which they called “reactive islands.”53 Ozorio de Almeida et al. presented formal arguments explaining the new transport sequence in terms of the existence of “cylindrical manifolds” in phase space and proposed a new classification scheme for separatrix manifolds in Hamiltonian systems.107 De Leon, Mehta, and Topper extended these arguments and developed a general reaction rate theory based on the properties of cylindrical manifolds which they called “reactive islands theory.”299108 De Leon later developed a phase-space “temporal” version of reactive islands theory109 and applied an approximate “linearized” version of it to the internal conversion of stilbene.110 Meanwhile, Zhao, Rice, and Jang developed a reaction rate theory based on the earlier work of Gray and Rice and applied it to a number of systems, comparing it to reactive islands theory.111 Most recently, De Leon and Ling have applied linearized reactive islands theory to the isomerization of HCN.112 Concurrently, Collins and Schranz have studied resonance phenomena in models of torsional isomerization.ll3 A number of useful reviews on progress in the use of nonlinear dynamics and chaos theory to describe molecular phenomena are availab]e.13,25,92,114 Many of the studies mentioned above made use of a method conceived of by Poincard in the late 1800s to project the motion of thousands of isomerizing molecules onto a special map of the dynamics.63 We will describe how this method is implemented in a practical sense. However, before beginning that discussion, we will try to guide the reader through a “thought experiment” in which we try to visualize the three-dimensional phase space of a two-dimensional prototypical model for isomerization.
Visualizing Uncoupled lsomerization Dynamics in Phase Space 121
VISUALIZING UNCOUPLED ISOMERIZATION DYNAMICS IN PHASE SPACE When teaching general chemistry, one often begins by teaching the students how to write correct Lewis dot structures, then showing them how to use the Valence Shell Electron Pair Repulsion (VSEPR) model to predict molecular geometries. With the geometries known, the wavefunctions can then be predicted by invoking hybridization and maximizing the overlap between hybridized orbitals. Finally, molecular orbital theory is introduced to explain resonance and other phenomena not predicted by the principle of maximum overlap. Just as this “bootstrap” kind of approach can be used to help a student gain the ability to visualize molecular properties, a bootstrap approach can be valuable in helping us to visualize reactive phase space. We first learn to visualize the phase space of a multimode system that does not admit mode-mode energy transfer, and then we “turn on” the mode-mode energy transfer. The advantages of this approach are that, if there is no energy transfer, each coordinate’s motion can be solved and there is no chaos (the system is completely integrable). This makes the phase-space surfaces upon which the trajectories evolve easier to visualize. Consider a molecular model in which we freeze all internal coordinates except two: q l , which will represent a twofold internal conversion (for example, we might expect (1,1,2)-trichloroethane,or vinyl trichloride, to have a roughly twofold internal rotation about the C-C bond if conversion to the allgauche configuration is not energetically accessible), and q2, which will represent a single internal vibrational coordinate (in the case of the above substituted ethane q2 might represent a symmetric stretch or perhaps the deviation of the C-C distance from the equilibrium distance). We will refer to qr as the reaction coordinate. The variation of the molecule’s internal energy with respect to independent variation of each of these coordinates is shown in Figure 4. This molecular model has two degrees of freedom. Rather than thinking of this as an accurate molecular model for a particular system (it is not), we consider this to be a simple prototype model for molecular isomerization. We next make some further simplifying assumptions that will let us visualize the phase space. If we assume that the two coordinates are completely independent of one another, the total energy of the molecule E can be written as a sum of two contributions:
Under this assumption, energy is never transferred from the torsional motion to the vibrational motion; they are truly independent, unless we introduce a third
122 Visualizing Molecular Phase Space T-
1
Figure 4 Plots of the potential energy V for a system that simultaneously experiences uncoupled twofold torsional oscillation along q1 and anharmonic vibration along q2. The total energy exceeds the barrier height, but if insufficient energy is allocated to q , (shown as case “T”), the molecule will be trapped within one isomer for all time. If sufficient energy is allocated to q1 (shown as case “R”),the molecule will react and back-react repeatedly. If the energy allocated to q1 equals the energy height of the barrier (shown as case “S”) the molecule will approach infinitely close to the transition state at qi in the limit of infinite time.
term in the energy that involves both coordinates or momenta (a coupling term). Thus, E,,,, and Evibare both constants.7J0 Note that E,, and Evibcontain contributions from the kinetic energy of motion along each coordinate as well as from the potential energy. For example, a qualitatively reasonable equation for E,,,, might be obtained by assuming that it is the sum of independent kinetic and potential energy contributions. Furthermore, we could represent the energy of motion along the torsional coordinate q1 via the expression3s
Visualizing Uncoupled lsomerization Dynamics in Phase Space 123
where p1 is the momentum canonically conjugate to q,, I , is the effective moment of inertia of motion along ql, and E and a are parameters that control the height of the barrier and the location of the potential minima. Note that the kinetic and potential contributions to E,,,, will vary with time, although the total vibrational energy will remain constant. Similarly, we might assume an (anharmonic) Morse potential*Is for the stretching coordinate, obtaining
P 2 + D { l - exp[-olq2]}2 Evib = 2 2cL2 with p z the canonically conjugate momentum to q2, k2 an effective mass along the vibrational coordinate, D the dissociation energy for motion along q2, and CI a parameter controlling the stiffness of the vibrational motion. We next presume that E is fixed at a value greater than that of the barrier height Eb, that is, we let E > E,. For a fixed value of E, we then can partition the energy between the torsional and vibrational coordinates as we choose. Now, consider the following three cases of interest, illustrated in Figure 4: (1)If E,, < Eb (which will leave us with a relatively large amount of energy in the vibrational coordinate) and we assume that the molecule is initially on the lefthand side of the barrier, then in the approximation of classical mechanics the molecule will never cross over to the other side of the barrier (i-e., it is trapped in isomer A). ( 2 ) If E,,,, > Eb, then the molecule can now cross over to the right-hand side of the barrier from A to B (i.e., it is reactive), but because there is relatively little vibrational energy, the fluctuations along q2 will be relatively small. Moreover, because E,, is constant, the molecule never loses energy along the reaction coordinate and will continue to cross and recross the barrier, reacting and back-reacting, for all time (again, we assume that there are no intermolecular collisions in this discussion). (3) If E,,, = E b , then the molecule can approach infinitesimally close to the summit of the barrier, but can never reach the top (or, if you prefer, it will take an infinite amount of time for it to get there). We can now try to visualize this situation in phase space. First, we make plots of p , as a function of q1 for each of these three cases, just as we did for the harmonic oscillator, by solving Eq. [25] for p , :
These plots are shown in Figure 5, with trapped motion indicated by a "T" and reactive motion indicated by an "R."
Figure 5 Same as Figure 1, but for the torsional reaction coordinate of the system depicted in Figure 4. See Figure 4 for an explanation of the symbols. Note that motion on the curve labeled "S" will take an infinite amount of time to reach the energetic barrier. "S" signifies the separatrix.
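Curves like those of Figure 5 can be generated directly from Eq. [27]. In the Python sketch below we must commit to an explicit functional form for the torsional potential, so we adopt a quartic double well, V1(q1) = \varepsilon (q1^2/a^2 - 1)^2, purely as an illustrative stand-in with barrier height \varepsilon at q1 = 0 and minima at q1 = ±a; the chapter's own model potential may differ.

import numpy as np
import matplotlib.pyplot as plt

I1, eps, a = 1.0, 1.0, 1.0                     # illustrative parameters
V1 = lambda q: eps * (q**2 / a**2 - 1.0)**2    # assumed quartic double well

def p1_of_q1(q, E_tors):
    """Eq. [27], NaN where the torsional motion is classically forbidden."""
    arg = 2.0 * I1 * (E_tors - V1(q))
    return np.sqrt(np.where(arg >= 0.0, arg, np.nan))

q = np.linspace(-2.2, 2.2, 1000)
for E_tors, style, name in ((0.5, "k-", "T"), (1.0, "k--", "S"), (1.5, "k:", "R")):
    plt.plot(q, p1_of_q1(q, E_tors), style, label=name)   # upper branch
    plt.plot(q, -p1_of_q1(q, E_tors), style)              # lower branch
plt.xlabel("q1"); plt.ylabel("p1"); plt.legend()
plt.show()

With E_tors below the barrier the curves form two disjoint loops (one per isomer), above the barrier a single loop spanning both isomers, and exactly at the barrier the separatrix.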
Note that the curve labeled T corresponds to an ellipse that does not span both isomers (there is another "trapped" curve that corresponds to being trapped in isomer B with the same torsional energy), and the curve labeled R spans both isomers. The curve that corresponds to E_tors = E_b is neither trapped nor reactive; it is neither fish nor fowl. Moreover, motion just inside it is trapped for all time, and motion just outside it is reactive for all time. The curve that separates one kind of motion from the other is referred to as the separatrix (thus we label it with an "S"). This phase space is foliated, as was that of the harmonic oscillator, but now there are two qualitatively different kinds of motion, with a separatrix marking the boundary between them. Proceeding similarly with the vibrational coordinate,
we obtain in Figure 6 a series of (anharmonic) closed curves, qualitatively similar to those of the harmonic oscillator and without a separatrix (at least, assuming that dissociation along q2 is not energetically possible). Because the motion along the reaction coordinate (q1) is totally separable from the motion along q2, we can look at each coordinate's individual
Figure 6 Same as Figure 5, but for the vibrational coordinate of the system depicted in Figure 4. See Figure 4 for an explanation of the symbols. Note that at the classical energies considered here, “fission” along the symmetric stretch coordinate is not possible.
phase-space portrait and learn all there is to know about the system. However, this will not generally be the case. In the presence of mode-mode coupling, which provides a mechanism for energy transfer between torsional and vibrational motions, the problem of visualizing the phase space becomes more difficult. However, we can immediately visualize the full phase space of the above uncoupled system. Since the total energy is fixed, all motion lies on a three-dimensional surface embedded within the four-dimensional phase space. However, because each coordinate's energy is conserved, we have another constraint. The effect of this constraint is to reduce the dimensionality of the motion by 1; the dynamics lies on a series of nested two-dimensional surfaces. First, consider the case of trapped motion within a single isomer. The phase space of q2 is (always) an ellipse, which has the same topology as a one-dimensional sphere (which a mathematician would name S1). However, the phase space of q1 is also elliptical and has the same topology (S1). The topology of the two-dimensional phase-space surface on which the dynamics lies is the Cartesian product of these two, which is a two-dimensional torus, or a "phase-space doughnut" (T2 = S1 x S1).116,117 The toroidal geometry is shown in Figure 7.
Figure 7 (Top left) Torus of constant-energy vibrational motion in two uncoupled degrees of freedom, projected upon the three-dimensional space (q1, q2, p2). (Below right) Three nested tori, sliced to reveal their foliated structure.
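To see such a torus emerge from an actual trajectory, one can write down the motion of two uncoupled harmonic oscillators analytically and project the four-dimensional phase-space curve into the three dimensions (q1, q2, p2), as in Figure 7. The Python sketch below does this; the frequencies and amplitudes are illustrative assumptions, and an irrational frequency ratio is chosen so that the trajectory winds quasiperiodically over the torus.

import numpy as np
import matplotlib.pyplot as plt

# Two uncoupled unit-mass harmonic oscillators, solved analytically:
#   q_i(t) = A_i cos(w_i t),  p_i(t) = -A_i w_i sin(w_i t)
w1, w2 = 1.0, np.sqrt(2.0)    # irrational ratio -> quasiperiodic winding
A1, A2 = 1.0, 0.6             # illustrative amplitudes (these fix the two energies)

t = np.linspace(0.0, 200.0, 20000)
q1 = A1 * np.cos(w1 * t)
q2 = A2 * np.cos(w2 * t)
p2 = -A2 * w2 * np.sin(w2 * t)

ax = plt.figure().add_subplot(projection="3d")
ax.plot(q1, q2, p2, lw=0.3)   # winds on a 2-torus; its projection traces the tube of Figure 7
ax.set_xlabel("q1"); ax.set_ylabel("q2"); ax.set_zlabel("p2")
plt.show()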
As a consequence of each individual oscillator's phase space being foliated, the global phase space for trapped motion consists of foliated tori nested within each other. There are two identical sets of these trapped tori, one on each side of the potential barrier, although only the trapped tori within the phase space of A are shown in Figure 8 [these are labeled R_A(E)]. Also note that in Figure 8, as in Figure 7, the ends of the tori have been "snipped off" to reveal the underlying foliated structure. Next considering the case of reactive motion at the same energy E, we realize that the situation at hand is not very different. The phase space of q2 is still elliptical. The phase space of q1 is not elliptical, but it is a simple closed curve, and it therefore still has the same topology as a one-dimensional sphere (every point on a closed curve can be uniquely mapped onto a sphere). Thus, the phase space of reactive motion consists of foliated tori that span both sides of the potential barrier. These reactive tori will be skinny when sliced along the (q2, p2) plane compared to the trapped tori, because they have less energy in the vibrational coordinate and more in the reaction coordinate. In Figure 8 these are labeled R_AB(E). Next, we consider motion on the separatrix. The phase space of the
Figure 8 Representative phase-space surfaces for uncoupled two-state isomerization in two degrees of freedom. A reactive torus labeled R_AB(E) spans both isomers, while foliated trapped tori R_A(E) are confined to isomer A (the ends of the trapped tori are snipped off to reveal the underlying foliated structure). The coordinate system is the same as that in Figure 6, that is, (q1, q2, p2). The orbit labeled T(E) is the periodic orbit at the transition state, which represents a phase-space dividing surface between reactants and products. The surface labeled W_A(E) represents motion that asymptotically approaches T(E) from A but never enters B. Note that all reactive tori lie within W_A(E), and all trapped tori lie outside W_A(E). Reprinted with permission from Ref. 108.
separatrix is not a torus, because motion on the separatrix can never move from A to B; nor can motion on the separatrix turn around once it heads toward the barrier from either side. The separatrix must consist of two half-tori, which we label WA(E) and WB(E), which are "sewn together" at the barrier by a one-dimensional phase-space loop T(E). This loop is a periodic orbit and has been referred to by Pollak and Pechukas as a repulsive periodic orbit dividing surface (PODS)92 because motion near the PODS tends to rapidly diverge away from it along the reaction coordinate. This orbit lies at the barrier top, periodically oscillating in the (q2, p2) plane but never crossing from A to B (see Figure 8). Note that all trapped tori lie outside the separatrix, whereas all reactive tori lie inside the separatrix; the separatrix separates reactive motion from trapped motion within the full phase space. Finally, consider the case where no energy resides in the reaction coordinate. All of the available energy is trapped in the vibrational coordinate, so this case corresponds to maximum-amplitude vibrational motion within each isomer, with no motion along the reaction coordinate. Thus, there are two one-dimensional periodic orbits, one sitting at each of the potential minima along the reaction coordinate. All motion at this fixed energy must pass within the interior of one or both of these surfaces. Using the Pollak/Pechukas terminology, these orbits are attractive PODS92 (the PODS nomenclature is explained more extensively later). We are almost ready to proceed to the visualization of reaction dynamics of nonlinearly coupled systems, but, before proceeding, we need a few theorems to assure rigor.
TECHNICAL OVERVIEW OF NONLINEAR DYNAMICS

Some Essential Theorems

We present here a brief discussion of some relevant essential theorems in nonlinear dynamics, compiled from the writings of several authors, to whom we refer the reader for a more complete tutorial.55,59,65,66,68-70,116 The classical dynamics of molecular models is generated by Hamilton's (or Newton's) equations of motion. In the absence of external, time-dependent forces, and within the Born-Oppenheimer approximation, the dynamics of molecular vibrations, rotations, and reactions conserves the total energy E. We therefore restrict our attention in the nonlinear dynamics literature to energy-conserving systems, which are technically referred to as Hamiltonian systems. For the purposes of the present discussion, we restrict our attention to Hamiltonian systems with two degrees of freedom:

H(q, p) = H(q1, q2, p1, p2) = E   [28]
As alluded to previously, the reason we discuss dynamics for two-dimensional systems arises from the fact that all one-dimensional conservative Hamiltonians are integrable and therefore do not admit chaotic motion. However, two-dimensional systems are in general anharmonic and nonintegrable, except in certain special cases (a particle in a central field in two dimensions, a two-dimensional normal-mode oscillator, etc.). Thus, Hamiltonians of the type given previously represent the simplest conservative systems that can exhibit dynamical chaos. Note that any function of four variables that is equal to a constant must correspond to a three-dimensional surface embedded in the four-dimensional space. We can further assume for the purposes of our discussion that through various mathematical tricks and transformations7 our Hamiltonian can be transformed into the form

H(q, p) = H1(p1, q1) + H2(p2, q2) + f(q1, q2) = E   [29]
with f(q1, q2) a nonlinear coupling of two modes corresponding to the coordinates (q1, q2). The presence of this coupling term gives rise to nonlinear equations of motion (meaning that the Newtonian forces are not linear forces). The first two terms of this Hamiltonian might be referred to as the "zeroth-order Hamiltonian" in some contexts, such as classical perturbation theory.7 Such a system generally does not have analytically integrable equations of motion. However, we may apply Hamilton's equations of motion, solve them numerically, and thus generate a unique trajectory for each set of initial conditions we choose. The resulting dynamics generally exhibits a variety of interesting phenomena. First, the frequency of motion in each mode is no longer a constant [as would be the case if we had f(q1, q2) = 0] but depends on the instantaneous values of the canonical coordinates (p, q):

ω1 = ω1(p, q),  ω2 = ω2(p, q)   [30]
Of course, if we had f(q1, q2) = 0, the frequencies would be given by

ω1 = ω̄1,  ω2 = ω̄2   [31]
In either case, if the condition

n1ω1 = n2ω2,  (n1, n2) integers   [32]
is satisfied, the dynamics is resonant; alternatively, the trajectory is said to be in resonance with itself. For the case of two uncoupled harmonic oscillators, it is simple to see what the dimensionality of the resulting orbit is; we will use the KAM theorem to understand the dimensionality for the nonintegrable case. As it happens, the uncoupled two-dimensional harmonic oscillator has two constants of motion besides E (though not independent of E) called the actions (J1, J2)7:

J1 = (1/2π) ∮_T1 p1 dq1,  J2 = (1/2π) ∮_T2 p2 dq2   [33]
These integrals are taken over one period of oscillation of each respective coordinate and are proportional to the area enclosed by the phase-space curves of each uncoupled oscillator. This area is constant for any trajectory because there is no mode-mode energy transfer, and therefore the actions must be constants of the motion. The subscript T1, the period of oscillation of q1, represents the fact that when integrating p1 to obtain J1, we ignore the motion that is simultaneously occurring along q2. Actually, the actions (J1, J2) are most generally line integrals and are constants of the motion for any system with a zeroth-order Hamiltonian that is decomposable into the sum of two one-dimensional Hamiltonians in some coordinate system, and this coordinate system need not be explicitly known. A more general way to write down the actions that reflects these facts is

J_i = (1/2π) ∮_γi (p1 dq1 + p2 dq2),  i = 1, 2   [34]
In this expression the domains (γ1, γ2) over which the integrals are evaluated are two topologically independent circuits of a two-dimensional surface, which is formed by the intersection of the two three-dimensional surfaces formed by Eq. [28]. In English: we know that when the modes are uncoupled there is no mode-mode energy transfer. If we fix the total energy and we also fix the energy in one mode, then the energy in the other mode is determined. We cannot choose both of the J_i independently for fixed E because by carrying out an appropriate change of variables it can be shown that E = E(J1, J2); thus the dynamics must lie on a two-dimensional surface.
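As a concrete numerical illustration (ours, not part of the original development), the following short Python sketch evaluates the action of a one-dimensional harmonic oscillator as the enclosed phase-space area, Eq. [33], and compares it with the analytic value J = E/ω; the parameter values and names are arbitrary choices.

import numpy as np

# Harmonic oscillator H = p^2/2 + omega^2 q^2/2 at fixed energy E.
# Its phase-space curve is an ellipse; the action J = (1/2 pi) * (enclosed
# area) should equal E/omega analytically.
omega, E = 1.3, 0.25
A = np.sqrt(2.0 * E) / omega                 # amplitude of q
theta = np.linspace(0.0, 2.0 * np.pi, 20001)
q = A * np.sin(theta)                        # one full circuit of the ellipse
p = A * omega * np.cos(theta)

# J = (1/2 pi) * closed integral of p dq, by the trapezoidal rule
J = np.trapz(p * np.gradient(q, theta), theta) / (2.0 * np.pi)
print(J, E / omega)                          # agree to roughly 1e-7

Because there is no mode-mode energy transfer in the uncoupled case, running the same quadrature around either circuit of the two-dimensional torus returns a constant, which is the numerical content of Eq. [34].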
But what does this two-dimensional surface look like? As stated in the previous section, the two-dimensional surface formed must be the product of the one-dimensional surfaces occupied by each independent oscillator. Because each oscillator independently moves on a one-dimensional elliptic surface [a mathematician might say that the surface has the S1 topology, indicating that it can be uniquely mapped onto a one-dimensional sphere], the composite motion of both oscillators lies on a surface with the product topology [S1 x S1 = T2], which is a two-dimensional torus.116,117 Because each oscillator's phase space is foliated, the global phase space must consist of foliated tori, as shown in Figure 7. Notably, each torus supports a vector field of momentum vectors. Because the J_i are line integrals on a vector field, the values of the J_i (the actions) actually depend only on the topology of the circuit and not on the path of integration chosen. A trajectory with an irrational frequency ratio ω1/ω2 never exactly closes on itself and will eventually visit every part of the surface. The trajectory in such a case is thus said to be ergodic on the torus. (In contrast, the motion is ergodic if, over a very long time, the phase point passes through all configurations on the constant-Hamiltonian surface.) Now, if we impose the resonance condition given in Eq. [32], which is a constraint on the dimensionality of the dynamics, the trajectories of the system must lie on a surface of one dimension lower than the two-dimensional torus, that is, a one-dimensional line. In this case, the motion winds about on the surface of the torus but does not explore its entire two-dimensional surface; the system instead continually winds along on a closed path and is thus a periodic orbit for all initial conditions. Periodic orbits can be quite complex or quite simple, depending on the frequency ratio (see Figure 9 for plots of a few low-order periodic orbits from the Henon-Heiles Hamiltonian).118,119 In the presence of a coupling term, how does the situation change? Before the pioneering computer experiments of Fermi et al.,76 it was widely believed that the introduction of even a small coupling term would destroy the picture above and cause the entire phase space to become chaotic. However, the KAM theorem gives us some essential information on this point. Without attempting to present the KAM theorem in its full glory, this theorem, in essence, tells us that if the zeroth-order Hamiltonian is far from resonance (i.e., if Eq. [32] is not satisfied), then in the presence of an extremely small coupling term most, but not all, of the tori are not destroyed but are instead perturbed and caused to deform. Such deformed tori (which are now called KAM tori) will have constant actions. But what is the nature of motions that do not lie on KAM tori? This question is answered by the Poincare-Birkhoff theorem.55 This theorem tells us that motions that do not lie on KAM tori "pinch off" into periodic orbits which generally alternate between two general types. Some periodic orbits are stable (motion near the orbit tends to stay close to it), whereas others are unstable (motion near the orbit tends to diverge away from it). Unstable peri-
Figure 9 Configuration-space plots of 12 low-order periodic orbits (POs) obtained from the Henon-Heiles Hamiltonian at E = 0.1, surrounded by the equipotential curve V(q1, q2) = E. Note that certain POs touch the equipotential curve (e.g., 1, 3, and 9), whereas others do not (e.g., 2 and 10). Reprinted with permission from Ref. 119.
odic orbits have associated with them certain special sets of trajectories which can be generically referred to as "limit sets." These limit sets are families of trajectories which asymptotically approach, or diverge from, the unstable periodic orbit. By "asymptotically" we mean that motion on the convergent limit set (which is often called the stable manifold) approaches infinitely close to the periodic orbit in the limit t → +∞, and motion on the divergent limit set (called the unstable manifold) approaches infinitely close to the periodic orbit in the infinite past, that is, in the limit t → -∞. Any motions near these limit sets that do not lie on KAM tori are free to move between the interior and exterior of these two limit sets, as well as any other limit sets that might exist elsewhere in physically accessible regions of phase space. These motions are said to be chaotic, and the "dance" of chaotic orbits between chaotic limit sets amounts to mode-mode energy transfer. Together, these two theorems (KAM and Poincare-Birkhoff) tell us that the general situation in coupled Hamiltonian systems will be one of coexistence; at a particular total energy, some motions will be regular (lie on a torus) and others will be irregular, that is, chaotic.120 But which tori will be the first to be destroyed as we "turn on" a coupling term? As it happens, there is as yet no theorem that guarantees which regular structures are the first to go (at least, not so far as we are aware), but there is a theorem that indicates which ones will probably last the longest. We might naively reason that motion that satisfies the resonance condition might be the first to become chaotic because the KAM
theorem does not hold for such systems. However, the Center Manifold theorem69,70 guarantees that if a periodic orbit is subjected to an infinitesimal perturbation (assuming noncatastrophic changes in the potential surface and that the orbit does not bifurcate), a perturbed version of the orbit will continue to exist in the perturbed system. It may be slightly perturbed from its original geometry, but it will still be present, it will still satisfy its original resonance condition, and the phase space surrounding it will have nearly the same properties. The situation becomes considerably more complicated as the importance of the mode-mode coupling increases. More and more periodic orbits, satisfying different resonance conditions, can arise in various regions of the energetically allowed phase space.55 These orbits can be surrounded by a mixture of regular and irregular motion. The "newly arrived" periodic orbits can undergo bifurcations (i.e., split into two or more new periodic orbits), and the topology of nearby phase-space surfaces can undergo dramatic changes.70,88,107,121 We turn to a technique suggested by Poincare for visualizing these effects.
Visualizing Phase Space on Poincare Maps: Practical Aspects

To visualize some of the effects described in the previous section, Poincare showed that the behavior of two degree-of-freedom nonlinear systems can be profitably studied by mapping the dynamics onto a well-chosen plane. This is because the conservation of energy requires all trajectories to wander on a three-dimensional hypersurface. In his honor, these maps are often referred to as Poincare maps. The plane chosen to map the dynamics onto is referred to as a surface of section. Often the plane is chosen to be a simple function, such as a constant value of one of the phase-space coordinates. Knowledge of any combination of three phase-space coordinates determines the fourth (within a sign) because energy is conserved. For example, if the system is of the form of Eq. [29] and (p2, q1, q2) are given, p1 is simply
p1 = ±{2[E - (1/2)p2^2 - V1(q1) - V2(q2) - f(q1, q2)]}^(1/2)   [35]
Therefore, we could define a useful surface of section by choosing a constant value of q1 = q1^s such that each trajectory passes it every time it undergoes an oscillation in the q1 direction (a good practical choice for q1^s would probably be the value of q1 for which the potential energy is a minimum). Each time a trajectory passes q1^s with p1 > 0 (to remove the sign ambiguity), we record (p2, q2).100,122 This means that all points on our Poincare map will lie on the surface

q1 - q1^s = 0   [36]
If we carry out such a calculation for a large number of trajectories at the same fixed energy E, we obtain a unique map of the global dynamics at that energy. The map is unique because each point on it uniquely specifies a single trajectory in phase space, and trajectories in phase space do not intersect at any given instant in time. If they did intersect, they would have to arise from identical initial conditions, and then classical mechanics would no longer be a causal theory!7 Operationally, to numerically construct a Poincare map, a number of initial conditions are chosen (they are often chosen uniformly on the Poincare map surface within the energetically allowed range of positions and momenta), and the trajectories are allowed to intersect the map many times. In practice, the code must monitor whether a trajectory has crossed the surface, and then back-track to the surface with reasonable accuracy. The back-tracking can be carried out by a linear interpolation between the integration points immediately before and after the crossing, but this procedure lacks sufficient numerical accuracy for some applications. To obtain an accurate Poincare map, a clever and practical algorithm presented by Henon (which he describes as a "trick") may be used to accurately back-integrate to the map.122 This algorithm allows one to generate a Poincare map with the same level of accuracy with which the trajectories are being integrated. Essentially, one carries out a simple transformation of variables such that q1 "trades places" with the time variable t; t becomes a dependent variable, and q1 becomes the independent variable. The transformation yields a new set of differential equations which are essentially ratios of the original equations. It then becomes possible to back-integrate to the surface of section in a single "time step," with q1 now playing the role of time. The Henon algorithm is worth describing in some detail. Let the vector x̄ represent the phase-space coordinates (p1, q1, p2, q2); these are the dependent variables, whereas t is the independent variable. Application of Hamilton's equations of motion will yield four first-order differential equations to generate the trajectories:
dp1/dt = f1(x̄)
dq1/dt = f2(x̄)
dp2/dt = f3(x̄)
dq2/dt = f4(x̄)   [37]
We divide the first equation and the last two equations by the second, to obtain

dp1/dq1 = f1(x̄)/f2(x̄)
dp2/dq1 = f3(x̄)/f2(x̄)
dq2/dq1 = f4(x̄)/f2(x̄)   [38]
The second of the original equations is then inverted to obtain

dt/dq1 = 1/f2(x̄)   [39]
We now have a set of four new differential equations, but the dependent variables are (p1, t, p2, q2), and q1 plays the role of the independent variable. Thus, the Poincare map of a trajectory is constructed by integrating Hamilton's equations of motion (Eq. [37]) until the surface-of-section function S(q1) = q1 - q1^s changes sign (which means that the trajectory has just "pierced through" the surface of section). We then integrate the new set of differential equations (Eqs. [38] and [39]) for one integration step using the same integration algorithm. The integration step is given by Δq1 = -S, where S is the value of the surface-of-section function at the time step immediately after the trajectory has pierced the surface of section.122 On the downside, the Henon map algorithm is restricted to Runge-Kutta and extrapolation algorithms (predictor-corrector algorithms are excluded) because it requires that each time step Δt be numerically independent of its preceding and subsequent time steps.3-5,122 Its strength lies in the fact that no additional numerical machinery need be developed; one simply writes a subroutine that is a rearranged version of the subroutine that must be written to generate the trajectory in the first place! Another advantage of the Henon algorithm is that its use assures us that the resulting Poincare map is obtained at a level of numerical accuracy comparable to that used to generate the trajec-
tory. Finally, Henon describes how the method can be applied to a surface of section given by an arbitrary function; thus, we are not restricted to choosing our surface-of-section slices at constant values of q1 or q2.122
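To make the procedure concrete, here is a minimal Python sketch of the Henon trick; it is our own illustration, not the original code of Ref. 122. The Hamiltonian (the Henon-Heiles form mentioned above), the step sizes, the initial condition, and all function names are arbitrary choices made for the example.

import numpy as np

def f(x, t=0.0):
    # Hamilton's equations, Eq. [37], for the Henon-Heiles Hamiltonian
    # H = (p1^2 + p2^2)/2 + (q1^2 + q2^2)/2 + q1^2 q2 - q2^3/3,
    # with the state ordered as x = (p1, q1, p2, q2).
    p1, q1, p2, q2 = x
    return np.array([-q1 - 2.0 * q1 * q2,      # dp1/dt = -dH/dq1
                     p1,                        # dq1/dt =  dH/dp1
                     -q2 - q1**2 + q2**2,       # dp2/dt = -dH/dq2
                     p2])                       # dq2/dt =  dH/dp2

def g(y, q1):
    # Swapped system, Eqs. [38] and [39]: q1 is now the independent
    # variable and y = (p1, t, p2, q2) are the dependent variables.
    p1, t, p2, q2 = y
    fx = f(np.array([p1, q1, p2, q2]))
    return np.array([fx[0], 1.0, fx[2], fx[3]]) / fx[1]

def rk4(deriv, y, s, h):
    # One fourth-order Runge-Kutta step of size h; s is the current value
    # of the independent variable (t for f, q1 for g).
    k1 = deriv(y, s)
    k2 = deriv(y + 0.5 * h * k1, s + 0.5 * h)
    k3 = deriv(y + 0.5 * h * k2, s + 0.5 * h)
    k4 = deriv(y + h * k3, s + h)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def poincare_points(x0, q1s=0.0, h=0.005, n_points=200, max_steps=2000000):
    # Record (q2, p2) each time the trajectory pierces q1 = q1s with p1 > 0,
    # landing on the section exactly via Henon's one-step back-integration.
    x, t, points = np.asarray(x0, float), 0.0, []
    for _ in range(max_steps):
        if len(points) >= n_points:
            break
        x_new = rk4(f, x, t, h)
        if (x[1] - q1s) < 0.0 <= (x_new[1] - q1s) and x_new[0] > 0.0:
            # The trick: one RK4 step of the swapped system, with step size
            # dq1 = -S (S = q1 - q1s), lands exactly on the section, using
            # the very same integrator that generated the trajectory.
            y = np.array([x_new[0], t + h, x_new[2], x_new[3]])
            y = rk4(g, y, x_new[1], -(x_new[1] - q1s))
            points.append((y[3], y[2]))          # (q2, p2) on the section
        x, t = x_new, t + h
    return np.array(points)

pts = poincare_points([0.3, 0.0, 0.15, 0.2])     # bounded trajectory, E ~ 0.074

Note how the subroutine g is just a rearranged version of f, mirroring the remark above that no additional numerical machinery is needed.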
Interpreting Poincare Maps

Figure 10 is an example of a Poincare map for a system of two strongly interacting nonlinear oscillators at relatively low energy.115,122-124 Due to energy conservation, the map has a fixed maximum boundary at this particular energy (labeled Σ). Note the presence in Figure 10 of closed elliptical curves, some of which are surrounded by other curves that surround the ellipses and "pinch off" between them. The large central ellipse has at its focus a single point, corresponding to a resonant trajectory, which we refer to as an elliptic fixed point. The set of curves surrounding a fixed point or family of fixed points (i.e., a set of fixed points that all satisfy the same resonance condition [ω1/ω2 = n1/n2]) is referred to as the resonance zone. Trajectories within a resonance zone tend to have frequency ratios (ω1/ω2) close to that of their enclosed fixed points, and their phase-space surfaces share common topologi-
Figure 10 Poincare map for two nonlinearly coupled oscillators at fixed energy. The energetic periphery is labeled Σ. (A) The elliptic fixed point of an orbit with ω1/ω2 = 1/2. (B, C) Elliptic fixed points of two distinct orbits with ω1/ω2 = 2/3. (D) Separatrix surrounding the ω1/ω2 = 2/3 resonance zone. The separatrix "pinches off" at four hyperbolic fixed points, three of which are more or less clearly visible. (E) One of four elliptic fixed points of an orbit with ω1/ω2 = 4/7 (the other three are not visible on this scale). Most of the motion is clearly quasiperiodic.
cal characteristics as well. The central fixed point in this particular case happens to correspond to a periodic orbit with frequency ratio ω1/ω2 = 1/2, and is the lowest order fixed point at this energy. This can be assessed by constructing a second Poincare map on a transverse surface of section (not shown) and observing that the same orbit intersects the transverse map at two distinct points. We can reason that although the 1/2 orbit produces the fixed point of lowest order at this energy, at lower energies a fixed point may exist that satisfies ω1/ω2 = 1/0, which is the central fixed point of the map in the harmonic limit (assuming that the harmonic system is not resonant). Next, note the set of four distorted ellipses surrounding the central ellipse. The foci of these curves correspond to two distinct periodic orbits, each of which has the frequency ratio ω1/ω2 = 2/3. Two fixed points "belong" to one orbit, the other two to the other orbit, exclusively. Also note the presence of a small "bulge" near the map periphery of Figure 10. This corresponds to a very narrow resonance zone centered about a 4/7 periodic orbit. If we were to launch trajectories about the edges of this zone or about the edges of the bifurcated 2/3 resonance zones, a very small scattering of points would be barely visible (because most tori in this system have not been destroyed). This scattering is associated mostly with the points at which the resonance zones pinch off from one another. These "pinch points" correspond to unstable (but not necessarily repulsive) periodic orbits, and we call them hyperbolic fixed points. Typically, one finds numerically that tori close to hyperbolic fixed points are among the first to be destroyed by a perturbation. This phenomenon was interpreted by Chirikov in terms of the overlap between the edges of adjacent resonance zones.82 When resonance zones overlap (not possible in an integrable system), the trajectories that lie in the overlap region can no longer lie on a single torus. Thus, the scattering of points on a Poincare map reflects the destruction of tori. The trajectories corresponding to these points are said to be chaotic, because they have no constants of the motion other than the total energy. Trajectories initiated close to an elliptic fixed point behave in a qualitatively different manner from those near a hyperbolic fixed point. For one thing, elliptic fixed points are invariably surrounded by invariant tori, with frequency ratios not far from that of the fixed point; all motion on each torus stays on the same torus. Hyperbolic fixed points may or may not be surrounded by tori. However, they are always associated with a single unique set of manifolds composed of motion asymptotic to them in positive and negative time. These manifolds are called stability manifolds or separatrix manifolds, and their continued existence in the presence of a coupling term is guaranteed by the Stable Manifold theorem.68,70 The nature of the asymptotic manifolds will be seen to be of special interest and importance, and we discuss them at length in the following section. For now, we note that the elliptic fixed points are separated from one another by hyperbolic fixed points, which meet along a line generally known as
the separatrix. As in the case we discussed in a previous section, the separatrix is a phase-space surface that separates two qualitatively different types of motion from one another. In this particular case (cf. Figure 10), motion inside the separatrix lies within the 2/3 resonance zone; motion outside it lies outside the 2/3 resonance zone. In either case, if the motion lies on a torus, it will never cross the separatrix from one zone to another. If the motion does not lie on a torus, it is free to pass from one resonance zone to another within any region of phase space not occupied by tori (because tori are invariant; motion on the torus stays on the torus, and so chaotic trajectories cannot pass through a torus in these systems), but it must cross the separatrix in order to do so. In Figure 10 the chaotic region is extremely small. However, in Figures 11 and 12 we show a second system's phase space map as a function of energy.35,119 This system exhibits a mode-mode resonance at low energies, with a hyperbolic fixed point located near the center of the Poincare map. Note in Figure 11 that as the energy increases, the measure of quasiperiodic phase space decreases and approaches a limit in which most of the tori are destroyed, with
Figure 11 Surface of section plots for the De Leon-Berne Hamiltonian as a function of energy for energies below the barrier. Note the destruction of quasiperiodic motions (KAM tori) as a function of energy. Also note the persistence of KAM tori near certain elliptic fixed points. Reprinted with permission from Ref. 119.
Figure 12 (Top) Two trapped quasiperiodic orbits at fixed energy superimposed on potential energy contours for the De Leon-Berne Hamiltonian at E = 0.65 (see Figure 23 for the potential energy surface). (Below) The Poincare map for this system at this energy, with the surface of section defined at fixed q2 with p2 > 0. The trajectories above are connected with their corresponding map locations below. Note that whereas these relatively regular motions lie on tori, most of the Poincare map is "broken up," indicating that most motions at this energy are chaotic. Reprinted with permission from Ref. 119.
the phase space surrounding the hyperbolic fixed point breaking up at relatively low energy, and the phase space surrounding the elliptic fixed points persisting to fairly high energies. In the limit that all tori are destroyed (a limit whose existence is not generally guaranteed), all trajectories will explore the entire phase space, except for a set of orbits that are of measure zero (for example, periodic orbits, which are one-dimensional objects in a three-dimensional space). We can at that point consider the system dynamics to be ergodic on the energy hypersurface, or globally chaotic.71,125 However, by
far the most common situation is for a system's dynamics to admit a mixture of regular and chaotic motions. This is emphasized in Figure 12, where we show two quasiperiodic orbits plotted on the potential energy contours and indicate the regions of the Poincare map to which they correspond.119 On the Poincare map, these quasiperiodic motions appear as isolated regular structures in a "sea of chaos." Even when most of the motion is chaotic, there is a considerable amount of structure in the chaotic regions of phase space, which may be referred to as "order in chaos." In particular, separatrix structures still exist and can be reconstructed numerically. These structures determine the rate at which chaotic trajectories travel from one part of phase space to another. Moreover, the separatrix structures can be present if the dynamics has any measure of chaos whatsoever, regardless of whether or not the system is ergodic. The periodic orbits admitted by the system are intimately connected to these structures, as we will see.
Linear Stability Analysis of Periodic Orbits
We have discussed the typical manifestation of periodic orbits on a Poincare map as fixed points that are either elliptic or hyperbolic. Let us now consider the properties of motion near these fixed points in terms of their stability properties. This is accomplished by a straightforward linear stability analysis about the fixed point. We can carry out such an analysis on any fixed point, whether or not the surrounding phase space is chaotic (as long as we can find the fixed point). In considering dynamics on the Poincare map, it is extremely useful to think of the dynamics as being generated by a mapping U:

(p2', q2') = U(p2, q2)   [40]
U represents a unique transformation from the initial to the final map location generated by Hamilton's equations of motion: (p2, q2) → (p2', q2'), (p2', q2') → (p2'', q2''), and so forth (see Figure 13A). In particular, U is required to be the mapping that generates the same transformation as that generated by a classical trajectory, which makes it a special kind of canonical transformation.7 Because fixed points correspond to periodic orbits, we realize that within one or more mappings (depending on the orbit), the mapping of a fixed point (p2^0, q2^0) must eventually take us exactly back to the initial condition (p2^0, q2^0) (see Figure 13B). If the mapping returns to the fixed point after k mappings, we then say that the fixed point is a fixed point of order k:

U^k(p2^0, q2^0) = (p2^0, q2^0)   [41]
Figure 13 Point dynamics on the Poincare map. (A) A typical mapping sequence of a point initially at (p2, q2) generated by the mapping U. (B) Dynamics of a fixed point of order 3. For such a point, on the third mapping the initial and final points exactly coincide.
The linear stability analysis of such a fixed point is simplified somewhat by the fact that U is an area-preserving mapping. Among other things, this means that if trajectories with initial conditions on a closed curve γ1 on the map with area A1 are propagated by Hamilton's equations until all trajectories undergo a second mapping, the area encompassed by γ2 is equal to the initial area (A2 = A1; Figure 14). The curve γ2 generally may be distorted in geometry from the original distribution (just because its area is preserved does not mean
Figure 14 Area preservation on the Poincare map. The region bounded by γ1 is mapped onto the region bounded by γ2 by the area-preserving mapping U such that A1 = A2. Σ denotes the energetic periphery.
its shape is preserved), but it will always encompass the same area as γ1 and it will still be a closed curve. The reason that the mapping is area preserving can be thought of in several ways. Perhaps the simplest is to consider the consequences of the loss of area preservation. Consider once more a closed curve γ1 on the mapping surface, which encloses an area A1. If γ1 is mapped onto γ2 such that A1 ≠ A2, we have no way to uniquely map γ2 back onto γ1. In other words, there is no well-defined inverse to the mapping, and thus the dynamics generated does not have time-reversal symmetry. This would imply a "loss or gain of information," a situation that cannot exist in Hamiltonian dynamics, which is after all a causal theory. Time-reversal symmetry is a property of all Hamiltonian systems (cf. Arnold66), and thus the map dynamics generated by a mapping that is not area preserving does not represent Hamiltonian dynamics. Another, slightly more technical (!) way to understand area preservation is to recall that the coordinates of all the points on the Poincare map are canonically conjugate. Because both the initial and final coordinates of a family of trajectories propagated for one mapping are so specified, there must exist a generating function that transforms the coordinates of the initial points into those of the final points.7 Such a transformation is necessarily canonical.7 All canonical transformations preserve the norm of the vectors they transform; it can be shown that this property is equivalent to area preservation on the Poincare map.65
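Area preservation is easy to check numerically for any explicit area-preserving map. The sketch below (again our own illustration, not from the original development) uses the Chirikov standard map, the kicked-rotor map associated with the resonance-overlap analysis cited above, as a stand-in for a Poincare return map, and verifies that the determinant of its finite-difference Jacobian is unity.

import numpy as np

# Chirikov standard map, an explicit area-preserving map (our stand-in for
# a Poincare return map): p' = p + K sin(q), q' = q + p'.
def standard_map(x, K=0.9):
    p, q = x
    p_new = p + K * np.sin(q)
    return np.array([p_new, q + p_new])

def jacobian(x, eps=1.0e-6):
    # Finite-difference Jacobian of one mapping about the point x
    cols = [(standard_map(x + eps * e) - standard_map(x - eps * e)) / (2.0 * eps)
            for e in np.eye(2)]
    return np.column_stack(cols)

# det(J) = 1 (to roundoff) at any point: a small parallelogram of initial
# conditions keeps its area after one mapping.
print(np.linalg.det(jacobian(np.array([0.4, 1.7]))))   # ~1.0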
To analyze the stability properties of motion near a fixed point on the map, we will consider a linear approximation to the mapping dynamics near the fixed point. For simplicity we examine a fixed point of order 1. As before, let (p2^0, q2^0) be our fixed point; let (p2, q2) be an initial point on the map displaced from (p2^0, q2^0) by a tiny differential vector (δp2, δq2), and let (p2', q2') be the first iterate of (p2, q2) generated by initiating a trajectory at that point, as in Figure 15.
Figure 15 Numerical linear stability analysis of a fixed point of order 1. A point (p2, q2) is chosen at a very small distance from the fixed point (p2^0, q2^0). A trajectory is integrated from (p2, q2) until it undergoes k mappings (here k = 1) and generates (p2', q2'). This information is used to numerically estimate the monodromy matrix (see text).
We have

p2 = p2^0 + δp2,  q2 = q2^0 + δq2   [42]

If we assume that the final coordinates are a linear function of the displacement of (p2, q2) from the fixed point (p2^0, q2^0), we have

p2' = p2^0 + (∂p2'/∂p2)0 δp2 + (∂p2'/∂q2)0 δq2
q2' = q2^0 + (∂q2'/∂p2)0 δp2 + (∂q2'/∂q2)0 δq2   [43]
In matrix form this can easily be shown to be the same as65

(δp2', δq2')^T = M(δp2, δq2)^T   [44]

where δp2' = p2' - p2^0 and δq2' = q2' - q2^0, with M a matrix of partial derivatives known as the monodromy matrix:

M = [ (∂p2'/∂p2)0   (∂p2'/∂q2)0 ]
    [ (∂q2'/∂p2)0   (∂q2'/∂q2)0 ]   [45]
Once the monodromy matrix is known, we can characterize the fixed point by analyzing its eigenvalue spectrum. The monodromy matrix is real and generally asymmetric; thus, eigenvalues can be found that are either real or complex. The possible cases are thoroughly described in Lichtenberg and Lieberman,65 but we will outline their discussion. The eigenvalues of M are found by solving the secular equation det(M - λI) = 0. They are given in closed form by

λ1,2 = (1/2){(M11 + M22) ± [(M11 + M22)^2 - 4]^(1/2)}   [46]
where Mij is the (i, j)th element of M. We have used det(M) = 1 because M is an area-preserving mapping. It is easy to verify that λ1λ2 = 1 and that λ1 + λ2 (the sum of the eigenvalues) is always real. Note that although this analysis yields two eigenvectors, there are often two more eigenvectors with the same eigenvalues; these would be discovered by carrying out the same stability analysis, but starting with an initial point that is reflected through the fixed point. In the case of an elliptic fixed point (which would correspond to a stable
periodic orbit), the eigenvalues will be complex conjugates of one another and the eigenvector matrix will correspond to rotations about the fixed point (i.e., nearby trajectories will lie on tori). Conversely, if the eigenvalues are real and not equal to 1, (p2^0, q2^0) will be a hyperbolic fixed point (which would correspond to an unstable periodic orbit). The eigenvectors yielded by this analysis will correspond to the stable and unstable branches discussed in the previous section; there will usually be two "stable" branches (which asymptotically converge onto the fixed point) and two unstable branches (which asymptotically diverge from the fixed point). Finally, if λ1 = λ2 = ±1, the fixed point is exactly at a transition between hyperbolic and elliptic behavior; it is neither fish nor fowl. This transition signals the onset of chaos in the surrounding phase space.65 Hyperbolic fixed points can be further classified. If the λi are real and positive (and not unity), the situation is qualitatively different from the case where they are real and negative. It can be shown that in the former case (λi real and positive) points that are asymptotic to the fixed point approach along a single eigenvector. The fixed point is then "repulsive" and the periodic orbit is referred to as a repulsive periodic orbit or a repulsive manifold. A repulsive fixed point can be seen in the lower part of Figure 5. In the latter case (λi real and negative), points that are asymptotic to the fixed point reflect about the fixed point along both stable eigenvectors with each successive approach; the fixed point in such a case is said to be a fixed point of reflection or an "attractive" fixed point. The orbit is variously called an attractive periodic orbit or an attractive manifold. By focusing on the phase-space properties of the periodic orbits that give rise to fixed points on the Poincare map, it was argued by Ozorio de Almeida et al. that this classification of hyperbolic fixed points is intimately connected to the phase-space geometry of the stable and unstable branches.107 The stable and unstable branches, which are lines on the Poincare map, are mappings of an invariant two-dimensional surface embedded within the phase space. Ozorio de Almeida and co-workers observed that, in general, the surfaces must twist an integer number of times (0, 1, 2, . . .) as they branch out from the stability vector into the full phase space before returning again to the Poincare map. When the fixed point is a fixed point of reflection (an attractive orbit), the number of twists is odd, and the surface has the topology of a Mobius strip (it is said to be homeomorphic to a Mobius strip). Otherwise, the number of twists is even and the surface is homeomorphic to a cylinder (it has an "inside" and an "outside"). If the number of twists is exactly zero, the surface has the topology and the geometry of a simple cylinder in phase space. Although this last case was apparently not previously emphasized in the nonlinear dynamics literature, it is one of the most chemically relevant situations, because it arises whenever there is a simple barrier in the system's potential energy surface. We consider the implications of this fact in more detail below.
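Because det(M) = 1, the classification of Eq. [46] reduces to a test on the trace of M. The following fragment (our own illustration; the tolerance, the example matrix, and the function name are arbitrary) classifies a numerically estimated 2 x 2 monodromy matrix.

import numpy as np

def classify_fixed_point(M, tol=1.0e-9):
    # Classify a fixed point from its monodromy matrix (det M assumed = 1).
    trace = M[0, 0] + M[1, 1]        # = lam1 + lam2, always real (Eq. [46])
    if abs(abs(trace) - 2.0) < tol:
        return "transition case (lam1 = lam2 = +/-1), onset of chaos nearby"
    if abs(trace) < 2.0:
        return "elliptic (complex-conjugate eigenvalues on the unit circle)"
    # |trace| > 2: real reciprocal eigenvalues lam and 1/lam
    lam = 0.5 * (trace + np.sign(trace) * np.sqrt(trace**2 - 4.0))
    if lam > 0.0:
        return "hyperbolic, repulsive (lam real and positive)"
    return "hyperbolic with reflection, 'attractive' (lam real and negative)"

# Hypothetical numerically estimated matrix with det = 1:
M = np.array([[2.3, 1.1], [1.9, 3.09 / 2.3]])
print(classify_fixed_point(M))       # hyperbolic, repulsive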
Numerical Reconstruction of the Separatrix

The hyperbolic fixed points are of special interest because, as stated previously, their presence is associated with (1) the onset of chaos and (2) separatrix motion. In particular, we have seen in our uncoupled system that a hyperbolic fixed point was associated with a separatrix in the phase space. This is actually a quite general result, and so it is desirable to reconstruct separatrix curves for nonlinearly coupled systems. This can be accomplished numerically. The separatrix is obtained by generating M for the fixed point of interest, finding its eigenvectors, seeding the eigenvectors with initial conditions, and integrating them for a number of mappings. For example, numerical stability analysis may be carried out on a fixed point of order 1 by choosing two nearby points horizontal and vertical to the fixed point, integrating the equations of motion for these points to obtain their first iterates, and approximating the matrix M numerically (see Figure 15). The eigenvectors for a hyperbolic fixed point can then be generated from M using standard methods. Once extracted, the eigenvectors of M may be used to reconstruct the separatrix by initiating two ensembles of trajectories along the eigenvectors on the surface of section and propagating them in positive or negative time, as appropriate (see below). The line segment formed by these sets of trajectories on the surface of section will map along two well-defined curves, called the stable (W-) and unstable (W+) branches. The stable branch is convergent toward the fixed point as time increases, and the unstable branch diverges away from the fixed point (note: because classical mechanics is time reversible, we can think of the unstable branch as being convergent upon the fixed point as time decreases). As Hamilton's equations of motion generate successive mappings, the trajectories remain on the separatrix (once on a separatrix, you cannot get off; it is an invariant surface in the phase space). However, the linear approximation is no longer necessarily valid, and so the lines so formed will generally acquire some curvature on the Poincare map as they extend away from the fixed point (see Figure 16). Because the energy is fixed, the range of available phase-space coordinates is finite, and so eventually the two branches must intersect. The reconstructed separatrix consists of a closed curve of points formed by the first intersection of W- with W+ (see Figure 16A). The two lines formed by mapping trajectories along the stable and unstable branches will first meet at a single point, h1. This point "belongs" to both branches, both of which arise from the same fixed point; thus h1 is called a homoclinic fixed point. It is also possible for the stable and unstable branches from an orbit to intersect the unstable and stable branches emanating from a different periodic orbit before meeting each other (see Figure 16B). The points of intersection between stability branches of different periodic orbits are called heteroclinic fixed points. The resulting closed curve is called a separatrix in either case.
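A minimal version of this reconstruction, using the same stand-in standard map as above (our illustrative choice; any two-dimensional area-preserving map with a known hyperbolic fixed point would serve), might look as follows. The standard map has a hyperbolic fixed point at (p, q) = (0, 0), so the monodromy matrix can be estimated there by finite differences, exactly as Figure 15 suggests, and the unstable branch traced by seeding a short segment along the unstable eigenvector.

import numpy as np

def standard_map(x, K=1.2):
    p, q = x
    p_new = p + K * np.sin(q)
    return np.array([p_new, q + p_new])

# Finite-difference monodromy matrix at the hyperbolic fixed point (0, 0)
x0, eps = np.zeros(2), 1.0e-7
M = np.column_stack([(standard_map(x0 + eps * e) - standard_map(x0 - eps * e))
                     / (2.0 * eps) for e in np.eye(2)])

# Unstable eigenvector = eigenvector whose eigenvalue exceeds 1 in magnitude
vals, vecs = np.linalg.eig(M)
w_plus = np.real(vecs[:, np.argmax(np.abs(vals))])

# Seed a tiny segment along W+ and map it forward; the successive images
# trace out the unstable branch, which curves once the linear regime is left.
seed = x0 + np.outer(np.linspace(1.0e-6, 1.0e-5, 200), w_plus)
branch = [seed]
for _ in range(12):                       # 12 mappings of the whole segment
    branch.append(np.array([standard_map(x) for x in branch[-1]]))
unstable_branch = np.vstack(branch)       # points along W+ on the "section"
print(unstable_branch.shape)

The stable branch W- would be traced in exactly the same way using the inverse mapping, that is, by propagating the seeded segment in negative time.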
Figure 16 Sketches of numerically generated separatrices on the Poincare map. After extracting the monodromy matrix of a hyperbolic fixed point (p2^0, q2^0), the asymptotic eigenvectors W+ and W- can be obtained. (A) If no other fixed points are nearby in the chaotic sea, the separatrix branch formed by repeated mappings in positive time of points initially on W+ will eventually meet the branch formed by repeated mappings in negative time of points initially on W- at a single point h1 (called a homoclinic point). The closed curve so generated is the separatrix S. (B) If a second fixed point (p2^0, q2^0)' associated with a different periodic orbit is nearby, a separatrix S may be formed by the intersection of branches arising from the two orbits at two points h1 and h1' (called heteroclinic points).
If all the points on the reconstructed separatrix are mapped forward one additional time, curves of the type shown in Figure 17 will be generated. The new lobes inside and outside the separatrix have been referred to as turnstiles.86,87 The area of the turnstile within the separatrix is equal to that of the turnstile outside. As h1 maps toward a fixed point, the area encompassed by the turnstiles is conserved with each mapping, but the length of the line segment connecting the base of the turnstile decreases with each mapping of the ensemble as it approaches the fixed point.65,120 Thus the turnstiles become more filamentous and reach out into the chaotic region exterior to the sep-
Figure 17 Generation of turnstiles. (A) All points on the homoclinic separatrix S are mapped for a single mapping U. However, not all of the points from W+ are destined to map onto W-; only h1, which is a member of both the W+ and W- sets, must map along W-. The points on the line segment between h1 and its map image h2 remain on a line segment, but this line goes outside of the separatrix, forming a turnstile loop (labeled a). a contains trajectories that were previously within S. Similarly, a' contains trajectories previously outside S. Note that area(a') = area(a) because this is an area-preserving mapping. (B) Same as A, except that the separatrix is a heteroclinic separatrix and two turnstiles are generated at the two heteroclinic points.
aratrix and into the region interior to the separatrix (making it possible for more heteroclinic or homoclinic intersections to occur). The result is known as the standard homoclinic/heteroclinic tangle. Any motion that passes between the interior and the exterior of the separatrix must pass through the turnstile, hence its name.
The area of the turnstiles relative to the area of the chaotic mapping region within the separatrix is related to the flux of chaotic trajectories across the separatrix, if it is assumed that motion within the chaotic regions of phase space is statistical.60,86,87 This concept has been applied successfully to flux calculations of transport across dynamical bottlenecks in models of molecular systems.60,103,108-112,126 Certain separatrices can act as dynamical bottlenecks. Although this can be true under a variety of conditions, so far most numerical evidence suggests that this tends to happen when the phase space is still a mixture of ordered and chaotic motions, and not all tori have been destroyed. Often, the separatrix that acts as a dynamical bottleneck is the last torus to be destroyed within a particular range of frequencies. This torus is called the golden torus or cantorus.60,86,87 After the cantorus has been destroyed, high-order periodic orbits with separatrices that closely match its shape can act as bottlenecks to energy transfer. Because it can be difficult to precisely locate the cantorus numerically, usually one settles for finding these nearby periodic orbits. Procedures for doing this have been described elsewhere, the first of which were introduced to the chemistry literature by Davis in a study of the OCS molecule.29,60,119 Roughly speaking, once the separatrix of interest and its turnstile have been reconstructed, one can use their areas to obtain the probability of escape from the separatrix' interior on any given mapping, P, using the equation

P = At/As   [47]
where As is the area of the separatrix and At is the area of one loop of the "turnstile." These areas can be obtained by direct integration once the separatrix and turnstiles have been generated. Alternatively, MacKay et al.86 and Bensimon and Kadanoff87 showed that the accumulated action of the orbits associated with the homoclinic/heteroclinic points directly gives the area of a turnstile.86 Homoclinic/heteroclinic orbits are a denumerable (countable) set of orbits in systems with two degrees of freedom. These orbits can be located numerically by, for example, a search on a Poincare map for orbits that minimize the accumulated action via a simplex algorithm29,119 or by Newton-Raphson or other minimization techniques.60 Their accumulated actions can be integrated numerically and the turnstile area thus obtained. However, a note of caution: in systems with more than two degrees of freedom, the homoclinic/heteroclinic orbits belong to dense (uncountable) sets. To carry out a flux calculation in this manner, an infinite number of homoclinic/heteroclinic orbits would all have to be located (a virtually impossible task).62 Moreover, the generation of Poincare maps and separatrix curves for systems with more than two degrees of freedom is fraught with practical and conceptual difficulties.99,100,114 In this section we have considered some of the basic properties of nonlinear dynamical systems, concentrating on the manifestations of nonlinear
phenomena on Poincare maps and the properties of the resulting map dynamics. In the next section, we see that the common thread connecting all Hamiltonian systems is the geometrical/topological aspect of phase space dynamics. Of particular interest to us is the manifestation of separatrix motion on Poincare maps when the separatrix in question separates reactive motion from nonreactive motion.
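Before moving on, a simple numerical illustration of how Eq. [47] is used may be helpful; the areas below are hypothetical values of our own, not data from any of the cited studies. Under the statistical assumption, the per-mapping escape probability implies a geometric decay of the trapped population.

# Hypothetical areas, in the units of the surface of section:
A_s = 0.80        # area enclosed by the reconstructed separatrix
A_t = 0.04        # area of one turnstile loop
P = A_t / A_s     # per-mapping escape probability, Eq. [47]

# If escape is statistical, the trapped population decays geometrically
# with the number of mappings; dividing P by a mean return time per
# mapping (hypothetical here) gives a rough unimolecular rate constant.
T_map = 2.1
print("P =", P)                                       # 0.05 per mapping
print("trapped after 20 mappings:", (1.0 - P) ** 20)  # ~0.36
print("approximate rate constant:", P / T_map)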
VISUALIZING COUPLED ISOMERIZATION DYNAMICS IN PHASE SPACE

Isomerization in Two Coupled Degrees of Freedom

Armed as we are now with the KAM theorem, the Center Manifold theorem, and the Stable Manifold theorem, we can begin to visualize the phase space of reaction dynamics. Returning to our original system (see "Uncoupled Reaction Dynamics in Two Degrees of Freedom"), we now realize that the periodic orbit that sews together the half-tori to make up the separatrix is a hyperbolic periodic orbit, and it is not a fixed point of reflection. From our previous visualization of uncoupled phase-space dynamics, we know that the separatrix is completely "nontwisted." In the terminology of Pollak and Pechukas, the hyperbolic periodic orbit is a repulsive PODS.92 In the limit of small mode-mode coupling, we expect that the situation will continue to be true; the Center Manifold theorem asserts that the orbit will continue to exist (perhaps slightly perturbed), and the Stable Manifold theorem tells us that the separatrix will continue to exist and continue to separate reactive motion from trapped motion. However, we are also aware that chaos typically will first develop near a hyperbolic orbit; thus we expect the separatrix to be "broken." If we construct a Poincare map of the separatrix such that every time the molecule passes q2 = q2^s with p2 > 0 we record (q1, p1), we expect to obtain the standard homoclinic tangle. However, we will instead construct a new Poincare map, using two surfaces of section (see Figure 18). The first surface of section, which we call ΣA, is defined such that every time the molecule passes q1 = q1A (the location of the potential minimum of isomer A) with p1 > 0 we record (q2, p2); similarly, every time the molecule passes q1 = q1B with p1 > 0 we record (q2, p2) on a second map ΣB. This arrangement of multiple surfaces of section has been termed the n-map.107 Instead of the standard homoclinic tangle, after one mapping of the separatrix on each surface of the n-map, we observe two simple closed curves (see Figure 19) with the same area as the action of the repulsive PODS.53 The appearance of these special curves, which have been called reactive islands, can be explained by visualizing the separatrix branches in phase space.
Figure 18 Schematic rendering of the two surfaces of section ΣA and ΣB used to define the n-map for a double-well system. We associate each map with a potential minimum, likely to be associated with an attractive PODS. As the trajectory winds through phase space, we record its location as it passes each plane with the proper orientation (mapped points are indicated by an asterisk (*)).
In the phase space, the result of introducing a mode-mode coupling term is that all motion on the stable branch of the separatrix is no longer required to return to the transition state along the unstable branch. The two branches, which we now think of as cylinders that happen to meet each other perfectly everywhere in phase space in the absence of coupling, can "miss" each other slightly, and instead of a half-torus, we have two cylinders that can partially overlap within the phase space of the isomer (see Figure 20). The appearance of the reactive islands structure on this map does not mean that the separatrix does not admit the standard homoclinic tangle. If a surface of section is constructed such that it always intersects the separatrix along its "long axis" (longitudinally) instead of along its cross-section (transversely), the standard homoclinic tangle is observed (see Figure 21). However, in the strong-coupling limit this may be a nearly impossible task.53 There is now a division in phase space between trajectories that are instantaneously within the overlap between the two cylinders and those that are not. First of all, since the coupling is weak it is likely that most of the phase
Figure 19 Sketch of reactive islands on one of the n-map surfaces ΣA. The reactive island formed by the unstable branch is labeled Π1, and the first mapping of the stable branch is labeled Π2. Points within Π1 must have just passed directly from B to A; similarly, points within Π2 must be about to pass directly from A to B. Points in the overlap of these two sets (labeled Π0) must therefore be "direct" back-reactors. h1 and h2 are heteroclinic points to the repulsive PODS. The outer curve represents the energetic periphery of the Poincare map. Reprinted with permission from Ref. 108.
space within the overlap region consists of foliated reactive tori. However, a small measure of trajectories near the separatrix will no longer be restricted to lie on tori, and these may pass into and out of the separatrix. Roughly speaking, we expect that the magnitude of the overlap between the two separatrix branches relative to the cross-section of the cylinders should tell us something about the fraction of trajectories that back-react after exactly one oscillation of the reaction coordinate (these were referred to as primary-1 back-reactors). We also expect the extent to which the stable branch's cylindrical maw "overlaps" with the unstable branch to tell us something about the probability of motion passing from a temporarily trapped region of phase space into the stable branch of the separatrix (and on toward products). Moreover, the existence of an overlap "clogs" the stable branch and acts as an impediment to the activation of trajectories initiated outside the stability manifolds. This information was used by De Leon and co-workers to formulate the first of a series of unimolecular reaction rate theories collectively referred to as reactive islands theory.29,107-110,112,126 The extension of these ideas to bimolecular reactions would yield a model quite similar (but not identical) to that developed by
Figure 20 Schematic drawing of two cylindrical manifolds within isomer A in the weak-coupling limit (refer to Figure 8 for an explanation of the symbols). The two-dimensional cylinders will intersect each other along one-dimensional lines. These lines are two homoclinic orbits. The small, thin tube spanning both isomers corresponds to a "reactive" KAM torus. Note that although we have "stopped" drawing the cylinders beyond a certain point for clarity, in reality the cylinders continue to wind about and explore the entire accessible region of chaotic phase space. Reprinted with permission from Ref. 108.
Figure 21 Sketches of Poincare maps for the system described in Figure 20. In (A) a longitudinal cut of the phase space separatrices is used to define the surface of section, whereas in (B) a transverse cut (such as that used in an n-map) is shown. Note the homoclinic tangle in (A). In (B), the reactive islands partially overlap. Note that the invariant tori RAB(E) are located in the overlap region between the two reactive islands in (B), but lie "exterior" to the separatrix in (A). Reprinted with permission from Ref. 108.
Pollak and Pechukas,92 and it can in a sense be thought of as a generalization of their work. These theories basically assume that motion within the chaotic region is statistical (although this assumption can sometimes be modified; see below).29 As the magnitude and importance of the coupling term increases, or as resonance terms are introduced into the zeroth-order Hamiltonian, the cylinders continue to exist.107,108 However, they can exhibit a variety of interesting behaviors. For example, the two branches can actually undergo multiple oscillations within an isomer before meeting one another, which prevents back-reaction from happening on a very short time scale. When this happens, it is literally impossible for any molecule to back-react after only one postreactive oscillation along the reaction coordinate (i.e., there are no primary-1 back-reactors)! In Figure 22 we show a numerical reconstruction of such a situation. The plot shows an unstable branch within the phase space of B and a stable branch within the phase space of A obtained by a linear stability analysis of the repulsive PODS associated with the barrier separating the two isomers of the De Leon-Berne Hamiltonian35:
H(p, q) = (1/2)[p1^2 + p2^2] + εq1^2[q1^2 - a^2]e^(-zλq2) + D{1 - exp[-λq2]}^2   [48]
Figure 22 Numerical reconstruction of cylindrical separatrix manifolds, contained within a "basket" of maximum extent of the variables (q1, q2, p2) (the extent is limited due to conservation of energy). These cylinders are generated from the De Leon-Berne Hamiltonian at low energy (see Figure 23) and describe a dynamical pathway through which all reactive trajectories must pass. Arrows drawn on the cylinders show the pathways followed by prereactive and postreactive motions.
The Hamiltonian above is a prototype of coupled intramolecular forces; note especially that as the vibrational coordinate q2 increases, the effective rotational barrier decreases exponentially. A plot of the potential energy surface is shown in Figure 23. The cylindrical separatrix arising from the transition state repulsive PODS, shown in Figure 22, was generated using the procedure outlined previously (see "Numerical Reconstruction of the Separatrix" above), with the stability analysis carried out on the repulsive PODS using a Poincare map. The Poincare map used for the stability analysis was defined as follows: first, the location of the maximum of the potential energy surface (q1‡, q2‡) was found (as it happens, q1‡ = 0); then, a Poincare surface was defined at q2 = q2‡ with p2 > 0. For this particular Hamiltonian, the fixed point of the repulsive PODS happens to be located on the Poincare surface at (p1, q1) = (0, 0). Once the PODS is located, the linear stability analysis is straightforward. In Figure 22, the (q1, p2, q2) values of the separatrix branches so generated are sorted sequentially to define the "interior" and "exterior," and then "coated" with a surface, which is plotted and surrounded by a "basket" representing the maximum values of these canonical coordinates which are accessible at this particular energy.119 Note that the cylinders diverge from the potential barriers with cylindrical geometry and maintain their cylindrical geometry as they wind
Figure 23 Plot of the De Leon-Berne potential energy surface as a function of the coordinates (ql, q2).Above is a surface plot of the potential function, and bel6w is a contour plot of the same surface. Reprinted with permission from Ref. 119.
156 Visualizing Molecular Phase Space
about within the phase space, because neither branch has yet undergone homoclinic intersection with another branch within either isomer. Although the De Leon-Berne Hamiltonian is apparently the only system for which these fairly complex surface plots have been made, the reactive islands Poincare map structure (which is a unique signature of the cylindrical geometry) has been observed in models of 3-phospholenes3 as well as in a symmetric triple-well prototype.29~124It has also been observed in the bimolecular reactions H + H2 + H2 + H92J24 and in the unimolecular decomposition of the (He * * * 1,) cluster124 and of HNSi.124 These studies have shown that even in a strong-coupling limit, where the repulsive PODS wanders away from the barrier to some extent124 or even bifurcates into multiple PODS,88*927124cylindrical separatrix manifolds mediate the pre- and postreaction dynamics. The reactive islands structure of the De Leon-Berne Hamiltonian is shown at several energies in Figure 24. Observe that at high energy (system 3) the reactive islands take up a large fraction of the map and that they intersect each other on the first mapping; the dynamics is approaching an adiabatic limit. As the energy is lowered (system 2) the cylinders “miss” each other at first; thus, no trajectories can possibly back-react without oscillating at least twice in the reaction coordinate within each isomer. At the lowest energy (system 1) we have a map of the cylinders shown in Figure 22.29 Although we have restricted our discussion to symmetric two-state isomerization mediated by a single transition state, in some cases the dynamics admits multiple repulsive PODS, It then becomes possible for them to heteroclinically tangle together, providing a mechanism for rapid, nonstatistical transport between the parts of phase space separated by the PODS.92J08J24
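The surface-of-section construction just described is straightforward to prototype. The sketch below is a minimal illustration, not the code of Refs. 29 and 119: it integrates Hamilton's equations for a De Leon-Berne-type potential, with placeholder values assumed for the parameters ε, a, z, λ, and D of Eq. [48], and records a (q1, p1) point each time a trajectory pierces the section q2 = q2‡ (here q2‡ = 0) with p2 > 0. Plotting the collected points, with reactive and trapped trajectories distinguished, traces out maps such as those of Figure 24.

```python
# Minimal sketch: Poincare surface of section for a De Leon-Berne-type
# Hamiltonian.  All parameter values below are illustrative placeholders,
# not the values used in the published calculations.
import numpy as np
from scipy.integrate import solve_ivp

EPS, A, Z, LAM, D = 1.0, 1.0, 1.0, 1.5, 10.0   # assumed parameters

def V(q1, q2):
    # double well in q1 whose barrier shrinks exponentially along q2,
    # plus a Morse-like restoring term in q2 (cf. Eq. [48])
    return (EPS * q1**2 * (q1**2 - A**2) * np.exp(-Z * LAM * q2)
            + D * (1.0 - np.exp(-LAM * q2))**2)

def eqm(t, y):
    q1, q2, p1, p2 = y
    e = np.exp(-Z * LAM * q2)
    dVdq1 = EPS * (4.0 * q1**3 - 2.0 * A**2 * q1) * e
    dVdq2 = (-Z * LAM * EPS * q1**2 * (q1**2 - A**2) * e
             + 2.0 * D * LAM * np.exp(-LAM * q2) * (1.0 - np.exp(-LAM * q2)))
    return [p1, p2, -dVdq1, -dVdq2]

def section(t, y):          # the section q2 = q2_dagger = 0 for this potential
    return y[1]
section.direction = 1       # record crossings with p2 > 0 only

E = 0.1                     # total energy slightly above the barrier top (V = 0)
rng = np.random.default_rng(0)
points = []
for _ in range(25):
    q1, p1 = rng.uniform(-0.8, 0.8), rng.uniform(-0.4, 0.4)
    ke2 = E - V(q1, 0.0) - 0.5 * p1**2       # energy left for the bath momentum
    if ke2 <= 0.0:
        continue
    sol = solve_ivp(eqm, (0.0, 300.0), [q1, 0.0, p1, np.sqrt(2.0 * ke2)],
                    events=section, rtol=1e-9, atol=1e-9)
    points += [(s[0], s[2]) for s in sol.y_events[0]]   # (q1, p1) at crossings
print(f"collected {len(points)} surface-of-section points")
```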
Reactive Islands Kinetic Theory

To summarize: if a cylinder approaching the barrier intersects another one that has just come away from the barrier, all trajectories that recross the transition state after vibrating once along the reaction coordinate must have passed through the overlap of the two cylinders. By calculating how "fat" the cylinders are in phase space relative to the "fatness" of the phase space of the reactant, and by knowing how strongly they overlap with one another and the manner in which they wind about, one can directly measure the extent and types of recrossing that can occur as a function of energy. Algorithms for accomplishing this in a practical sense have been described in detail elsewhere.29,62,92,109-112 All of this information yields a set of linear equations that describes the flux of trajectories between the various subregions of phase space implied by the details of the cylinders' overlaps. These equations may be solved and used to obtain reaction rate and population decay information. Numerical examples considered to date indicate that the reaction rates predicted by this procedure, known as reactive islands theory, are generally far more accurate than statistical rate theories in describing the kinetics of molecular dynamics simulations.108-112
Figure 24 Numerical reconstruction of reactive islands of the De Leon-Berne Hamiltonian on one surface of the n-map at (1) low, (2) medium, and (3) high energies. Reprinted with permission from Ref. 29.
The basic principles of reactive islands theory have been extended to a number of isomerization reactions, including those of stilbene110 and hydrogen cyanide.112

Although the "full-blown" reactive islands theoretical prediction of the rate cannot generally be written down in closed form, we briefly describe a highly practical and intuitive approximate version of reactive islands theory developed by De Leon for isomerization kinetics.110,112 In this "linearized" version of reactive islands theory, corrections to a statistical rate expression appear as first-order, linear terms. The microcanonical reaction decay rate k(E) (cf. Eq. [12]) is expressed in terms of the statistical rate τ⁻¹ and two correction terms, α and γ.
The correction α can be thought of as a phase-space correction to the classical statistical rate τ⁻¹ for the presence of nonreactive trapped KAM tori, and γ can be thought of as a phase-space correction for recrossing. The statistical rate depends only on the phase-space volume of the transition state relative to that of the reactants and products:28,33,88-93,108-112 it is determined by ρ_A, the classical density of states32 of A (which is a conformer of the molecule of interest; ρ_A can, for example, be calculated using Monte Carlo techniques29,111,124,127), by the total classical density of states ρ (ρ = ρ_A + ρ_B), and by J_E, the action of the transition-state PODS at energy E. From a semiclassical perspective, the quantity γ was referred to by Hirschfelder and Wigner as the transmission coefficient;49,128 in the context of classical mechanics, this coefficient has been considered in a number of studies, including work by Miller129 and by Straub and Berne.130
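As a brief aside, the Monte Carlo evaluation of ρ_A mentioned above can be sketched very simply: estimate the phase-space volume Ω_A(E) enclosed by the energy shell within conformer A by uniform sampling, and difference it with respect to E. Everything in the sketch below (the bounding box, the model potential, and the dividing surface at q1 = 0) is an assumption made only for illustration.

```python
# Minimal sketch: crude Monte Carlo estimate of the classical density of
# states rho_A(E) as a finite difference of the phase-space volume
# Omega_A(E), the volume of conformer A's phase space with H <= E.
import numpy as np

rng = np.random.default_rng(1)
LAM, D = 1.5, 10.0   # assumed parameters of a De Leon-Berne-type Hamiltonian

def H(q1, q2, p1, p2):
    return (0.5 * (p1**2 + p2**2)
            + q1**2 * (q1**2 - 1.0) * np.exp(-LAM * q2)
            + D * (1.0 - np.exp(-LAM * q2))**2)

# uniform samples in a box assumed to enclose the A well (q1 < 0) at energy E
N = 1_000_000
q1 = rng.uniform(-1.5, 0.0, N)       # conformer A only: q1 < 0
q2 = rng.uniform(-0.5, 2.0, N)
p1 = rng.uniform(-1.5, 1.5, N)
p2 = rng.uniform(-1.5, 1.5, N)
box_volume = 1.5 * 2.5 * 3.0 * 3.0
h = H(q1, q2, p1, p2)

E, dE = 0.1, 0.02
omega_lo = box_volume * np.mean(h <= E)       # Omega_A(E)
omega_hi = box_volume * np.mean(h <= E + dE)  # Omega_A(E + dE)
rho_A = (omega_hi - omega_lo) / dE            # rho_A(E) ~ dOmega_A/dE
print(f"rho_A({E}) ~ {rho_A:.3f}")
```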
However, linearized reactive islands theory provides a new (purely classical) recipe for the calculation of γ,110 expressed in terms of Z̄_A, the fraction of postreactive trajectories that cross back into B after one oscillation in the reaction coordinate within A. This fraction is equal to the fraction of the stable cylinder's cross section that is contained within its overlap with the unstable cylinder. The aspect of this recipe that appears to be unique is that it implicitly takes into account the clogging effect that the overlap of cylinders has on the activation of trajectories initially outside the cylinders. The correction α, which does not seem to have been as well discussed in the literature, is calculated from ρ_T, the density of states of trapped motion, and ρ_T^A, the density of states of trapped motion within A; α takes into account the measure of phase space that is occupied by KAM trapped motions and is therefore not available for reaction.86
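The fraction Z̄_A can also be estimated by brute-force trajectory counting, without constructing the cylinders explicitly. The rough sketch below is not the reactive islands algorithm of Refs. 29 and 108-112; it simply samples transition-state crossings into A at fixed energy, for an assumed model potential, and counts the fraction that recross the barrier after exactly one turning point of the reaction coordinate.

```python
# Rough trajectory-counting estimate of Z_A-bar: the fraction of postreactive
# trajectories that recross the barrier after one oscillation in q1 within A.
# The model potential and all parameters are assumed, for illustration only.
import numpy as np
from scipy.integrate import solve_ivp

LAM, D = 1.5, 10.0

def V(q1, q2):
    return (q1**2 * (q1**2 - 1.0) * np.exp(-LAM * q2)
            + D * (1.0 - np.exp(-LAM * q2))**2)

def eqm(t, y):
    q1, q2, p1, p2 = y
    e = np.exp(-LAM * q2)
    dV1 = (4.0 * q1**3 - 2.0 * q1) * e
    dV2 = (-LAM * q1**2 * (q1**2 - 1.0) * e
           + 2.0 * D * LAM * np.exp(-LAM * q2) * (1.0 - np.exp(-LAM * q2)))
    return [p1, p2, -dV1, -dV2]

def barrier(t, y):  return y[0]   # recrossing of the transition state q1 = 0
def turning(t, y):  return y[2]   # turning point of the reaction coordinate

E, rng = 0.1, np.random.default_rng(2)
n_total = n_primary1 = 0
for _ in range(200):
    q2 = rng.uniform(0.0, 0.05)
    kin = E - V(0.0, q2)              # kinetic energy available at the barrier
    if kin <= 0.0:
        continue
    theta = rng.uniform(0.1, np.pi - 0.1)
    p1 = -np.sqrt(2.0 * kin) * np.sin(theta)   # heading into A (q1 < 0)
    p2 = np.sqrt(2.0 * kin) * np.cos(theta)
    sol = solve_ivp(eqm, (0.0, 100.0), [-1e-6, q2, p1, p2],
                    events=[barrier, turning], rtol=1e-9, atol=1e-9)
    n_total += 1
    if len(sol.t_events[0]):          # the trajectory did recross the barrier
        t_back = sol.t_events[0][0]
        if np.sum(sol.t_events[1] < t_back) == 1:
            n_primary1 += 1           # exactly one q1 turning point first
print(f"Z_A-bar ~ {n_primary1 / n_total:.2f}  ({n_total} trajectories)")
```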
Implicit in the application of reactive islands theory, or in the application of other models of reaction dynamics that make use of the properties of a reaction separatrix,13,114 is the idea that the chaotic phase space may be considered to be completely structureless (within the restrictions imposed by the homoclinic/heteroclinic tangling of the separatrix manifolds). However, the presence of internal bottlenecks within one or more isomers may cause this approximation to break down.29,61,103-105,112 Accurate procedures for developing corrections to account for internal bottlenecks in two-dimensional systems have been developed.29,60,86,119,126 Unfortunately, much progress remains to be made toward developing quantitative procedures for dealing with internal bottlenecks in many-dimensional systems.
Isomerization in Many Coupled Degrees of Freedom

The prospect of extending what is now known about low-dimensional reaction systems to high-dimensional systems is tantalizing and, in fact, may well be within reach. However, what we have learned may have to be modified (at least formally) to take proper account of some of the differences between two-dimensional systems and those of higher complexity. One issue is yet to be discussed: do separatrices even exist in isomerization models with more than two degrees of freedom?70,74,91 Certainly we know that the potential barrier is generically recrossed by some trajectories when the system has more than two degrees of freedom and that motion trapped for a number of vibrational periods can become reactive.28,62,131 Is barrier recrossing in high-dimensional systems mediated by a hyperdimensional analog of cylindrical separatrix branches? And if cylinders do exist, can theories such as reactive islands theory be used to describe these systems? The answer to both of these questions appears to be "yes."70,74,112,119,124,132

In the following, we consider the properties of a molecular model for isomerization with N degrees of freedom. Throughout, q1 will be the reaction coordinate, along which we presume the existence of a single energetic barrier, and q2, q3, . . . , qN will be vibrational "bath" coordinates, which are presumed to be bounded at all energies considered here. The barrier top will be at q1 = q1‡, and the reactant will be labeled A and the product B, as before. We will consider the manifolds that emerge from unstable motion at the barrier in the absence and presence of coupling between the bath and reaction coordinates. First, consider a Hamiltonian of the form

$$ H = H_1(p_1, q_1) + H_{2 \ldots N}(p_2, \ldots, p_N,\; q_2, \ldots, q_N) + H' $$
where H′ represents and contains only those terms that couple the reaction mode with the bath modes. When H′ is set equal to zero, the reaction coordinate q1 is uncoupled from the bath coordinates, which may (or may not) be coupled among themselves. In such a case, the total energy E may be partitioned as E = E1 + E_{2...N}. Consider the set of points in phase space at energy E located at the barrier top q1 = q1‡ with p1 = 0. All such points will, in addition to being located at the barrier top, have H1 = E_b and H_{2...N} = ΔE = E − E_b, where E_b is the barrier height. Because there is no coupling between the reaction and bath coordinates, this set will be invariant. In other words, although points within this set are unstable with respect to a "nudge" along q1, they will not fall off the barrier if not nudged.

The above set of phase-space points forms a surface of dimension 2N − 3. This surface can be looked upon as a constant-energy surface for the bath modes and has the topology of a hypersphere (S^{2N−3}). This hypersphere is embedded in the 2N-dimensional phase space of the full system (and localized about the potential saddle). We will label this hypersurface τ_{N−1}, because it is itself a coupled dynamical system with N − 1 degrees of freedom. For example, for N = 2, τ1 is (topologically) a circle (S¹) and corresponds to an unstable periodic orbit at the barrier; the orbit is a one-dimensional dynamical system. For N = 3, τ2 is (topologically) a three-dimensional sphere (S³). Such a surface admits a dense set of trajectories located at the barrier top; it is a miniature dynamical system all its own. Motion on/within τ2 can be regular, chaotic, resonant, and so on. In Figure 25 (top), we see a quasiperiodic orbit that lies on τ2 in a typical three-degree-of-freedom system of the type just described.119

The surface τ_{N−1} is a particular example of what Wiggins has referred to as a hyperbolic manifold70 and what De Leon and Ling have termed a normally invariant hyperbolic manifold.112 Hyperbolic manifolds are unstable and constitute the formal multidimensional generalization of unstable periodic orbits. Hyperbolic manifolds, like PODS, can be either repulsive or attractive.69,92 If motion near a hyperbolic manifold falls away without recrossing it in configuration space, the hyperbolic manifold is said to be repulsive. On the other hand, it is often the case that motion near a hyperbolic manifold will cross it several times in configuration space as it falls away, and in this case it is said to be attractive.29,107-112

Assume that τ_{N−1} is repulsive. Since τ_{N−1} is an invariant surface, motion precisely on τ_{N−1} can never fall away. However, speaking somewhat loosely for the moment, a slight "push" along q1 will cause motion initially on τ_{N−1} to roll away from the barrier top. The set of motions that will roll away most slowly are motions that are asymptotic to the repulsive manifold τ_{N−1}, and the surface formed by these motions constitutes the multidimensional version of a separatrix. As motion asymptotic to τ_{N−1} "falls away," it will generate a surface embedded in the full phase space whose geometry is the direct product of the sphere and the real line, S^{2N−3} × R¹, that is, a hypercylinder. The dimension of this hypercylinder is thus 2N − 2.70,101 As we will see, this (2N − 2)-dimensional hypercylinder survives in the vicinity of the barrier even in the presence of a finite coupling between the reaction and bath modes.
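The invariance of the barrier-top set in the uncoupled case, and its instability with respect to a "nudge" along q1, is easy to verify numerically. The toy system below, an assumed inverted-harmonic barrier plus two uncoupled harmonic bath modes chosen purely for illustration, integrates one trajectory started exactly on τ2 and a second given a tiny push along p1:

```python
# Numerical illustration that the barrier-top set is invariant when H' = 0.
# Toy model (assumed): V = -q1^2/2 + q2^2/2 + q3^2, an inverted-harmonic
# barrier along q1 with two uncoupled harmonic bath modes.
import numpy as np
from scipy.integrate import solve_ivp

def eqm(t, y):
    q1, q2, q3, p1, p2, p3 = y
    return [p1, p2, p3, q1, -q2, -2.0 * q3]   # note pdot1 = -dV/dq1 = +q1

y_on  = [0.0, 1.0, 0.5, 0.0,  0.3, -0.2]      # on tau_2: q1 = p1 = 0, bath excited
y_off = [0.0, 1.0, 0.5, 1e-8, 0.3, -0.2]      # same, but nudged along p1

for y0, label in [(y_on, "on tau_2"), (y_off, "nudged")]:
    sol = solve_ivp(eqm, (0.0, 20.0), y0, rtol=1e-12, atol=1e-12)
    print(f"{label:>8}: max |q1| = {np.abs(sol.y[0]).max():.2e}")
# the unperturbed trajectory never leaves q1 = 0; the nudged one
# departs from the barrier top exponentially fast
```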
Figure 25 (Top) A quasiperiodic trajectory lying on the transition state manifold for a two-isomer system modeled in three degrees of freedom (q1, q2, q3), with q1 a unique reaction coordinate. This system has no linear terms in q1, and the projection of the manifold on configuration space is therefore a two-dimensional surface (see the side-on view of the trajectory displayed in the inset). (Bottom) Same, except that the system now does have linear terms in q1. The projection of the manifold on configuration space is now space-filling, albeit thin (see the thickness of the sample trajectory as seen in the inset). Reprinted with permission from Ref. 119.
We next focus on the situation where the reaction and bath modes are coupled, that is, H′ ≠ 0. We will assume H′ to be a polynomial, and furthermore that it will have only quadratic and higher order dependence on q1 − q1‡ (this latter condition is for the purpose of simplifying the discussion and will be relaxed later). Locally about the barrier top, the structure of phase space will be the same as in the unperturbed system because H′ will be zero in that region. We are guaranteed this by a theorem of Moser80,81 that proves that a quadratic normal form exists sufficiently close to a hyperbolic equilibrium in many-dimensional Hamiltonian systems, even in the presence of nonlinear coupling terms. Consequently, the hyperbolic manifold τ_{N−1} located at q1 = q1‡, as well as the separatrix hypercylinders, will continue to exist for the coupled system. As in the uncoupled situation, the surface of the cylinders in the coupled system may be looked upon as the set of all trajectories asymptotic in the infinite future and past to τ_{N−1}. Four such hypercylinders will emerge from τ_{N−1}, with two associated with motion within each reactant/product. For example, within isomer A there will be stable (W_A^s) and unstable (W_A^u) hypercylinders. Except for a relatively straightforward generalization to multidimensions, the situation is exactly the same as that for the N = 2 system.

The hypercylinders W_A^s and W_A^u will in general only partially overlap. The overlap of these (2N − 2)-dimensional hypercylinders will occur along a (2N − 3)-dimensional surface. Because this overlap surface is common to both the stable and unstable hypercylinders, motion on this surface must be homoclinic to τ_{N−1}. This (2N − 3)-dimensional surface has been referred to as a homoclinic manifold.62 For example, for N = 2 the homoclinic manifold is one-dimensional, which corresponds to a countable set of homoclinic orbits. For N = 3, the homoclinic manifold is three-dimensional and consists of a dense (uncountable) set of homoclinic orbits (which are still one-dimensional), and so on.

Note that the interior of the hypercylinders has a volume of the same dimensionality as the constant energy surface. Thus, the hypercylinders demarcate the energy hypersurface; that is, they represent a separatrix. Reactant motion within the interior of W_A^s will follow W_A^s to the interior of τ_{N−1}, cross the barrier, and immediately enter the interior of the product hypercylinder leading into B, W_B^u. If the product hypercylinders partially overlap one another, then some of the motion within W_B^u may undergo back-reaction (i.e., go back to reactants). As in the case with two degrees of freedom, the cylinders may in some cases encounter each other directly, so that back-reaction may occur within one or more oscillations of a trajectory along the reaction coordinate after passing the transition state. Or, the cylinders may initially "miss" one another, so that back-reaction only occurs after two or more oscillations. Which scenario actually occurs depends on the nature and energy of the system of interest. Also, just as in the case of two degrees of freedom, the fraction of trajectories that recross the transition state after the kth postreactive oscillation is still proportional to the phase-space volume contained by the surface formed by the overlap of the cylinders at their kth intersection.
We assumed in the previous discussion that H′ had no linear dependence on q1 − q1‡. In this circumstance the repulsive hyperbolic manifold τ_{N−1} will project onto the N-dimensional configuration space as an (N − 1)-dimensional surface, that is, a hyperplane dividing reactants from products, and τ_{N−1} will rigorously be the classic transition state dividing surface. One can extend the previous conclusions to the more general situation where H′ has a linear and higher-order dependence on q1 − q1‡. In particular, the Center Manifold theorem guarantees that a hyperbolic manifold τ_{N−1} continues to exist, independent of the precise nature of the coupling (in the absence of bifurcation).69,70 Sufficiently close to the hyperbolic equilibrium (i.e., ΔE = 0), H can be expanded up to quadratic terms in all variables. The resulting quadratic normal form guarantees, by the Stable Manifold theorem, the existence of a repulsive hyperbolic manifold τ_{N−1} at energies sufficiently close to the barrier. By continuity of phase space, a family of such manifolds spanning a range of energies will exist. However, this does not necessarily suggest that τ_{N−1} will be repulsive at all energies of interest. Indeed, τ_{N−1} may undergo a transition from repulsive to attractive character at some energy, a transition often accompanied by bifurcation of τ_{N−1} into multiple invariant manifolds.88,92 For simplicity, we concern ourselves here only with the situation where τ_{N−1} remains repulsive, although the arguments can be generalized.

The general existence of a repulsive hyperbolic manifold guarantees the existence of a dividing surface between reactants and products (i.e., a transition state). This follows from the Stable Manifold theorem.69,70,132 Hypercylinders will emerge from τ_{N−1}, and they will mediate the dynamics between reactants and products. However, in this more general situation, τ_{N−1} will not project as an (N − 1)-dimensional hyperplane in the N-dimensional configuration space. Instead, the projection of τ_{N−1} onto configuration space will be space filling and thus will not satisfy the classic concept of a transition state dividing surface [see Figure 25 (bottom)]. Note, however, that although the projection of τ_{N−1} is space filling, it is usually observed to be quite thin in configuration space.119 This may be part of the reason for the success of configuration-space, variational, transition state theory calculations for multidimensional reaction dynamics.52 However, the surface τ_{N−1} is the unique variationally correct dividing surface in the full phase space and rigorously minimizes the reactive flux in the manner discussed by Keck.131

The role played by internal bottlenecks, which usually are generated by homoclinic tangling of attractive hyperbolic manifolds, can be generalized using a set of arguments similar to those used for repulsive hyperbolic manifolds. The bottleneck separatrix is a (2N − 2)-dimensional surface, and its branches will have the topological properties of the multidimensional Möbius strip.107 In principle, then, all seems as it was before (assuming that some of the considerable technical problems involved in finding and analyzing such a surface can eventually be solved). However, N = 2 systems have some special properties, and as a result high-dimensional systems exhibit a new dynamical phenomenon not observed for N = 2, known as Arnold diffusion (see later).55
The Poincaré Integral Invariants

When considering the generalization of phase-space concepts to many-dimensional systems, it is also important to consider what the generalized version might be of area preservation on a Poincaré map. We present a brief consideration of an extremely important set of quantities known as the Poincaré integral invariants.66,72,133

Flows in phase space have special geometric properties. They are said to have a symplectic geometry, which means that the flux integrals over a surface are decomposable into oriented projections of the surface onto planes (or sets of planes) formed by the conjugate-coordinate axis system.72 For example, a surface of section is an oriented projection of a Hamiltonian system's dynamics; thus, this amounts to area preservation on the Poincaré map. The conservation of the Poincaré integral invariants is so fundamental that classical mechanics can be formulated using it as a starting point. In fact, numerical integration schemes for Hamiltonian systems have been developed that are based on their preservation134 (as opposed to, say, using a Runge-Kutta algorithm135). In principle, such a scheme could be used for molecular dynamics simulations.

All Hamiltonian systems must preserve the Poincaré integral invariants under a canonical transformation. We will not attempt to present proofs of their preservation, but will simply state the invariants and refer the reader to the texts of Arnold,66 Goldstein,133 and Tabor,72 which may be read with profit. Although these quantities can be stated and formally manipulated most elegantly using the language of differential forms,66,72 we resist the temptation to do so for the sake of clarity. We will see that, for systems with two degrees of freedom, the integral invariants correspond to area preservation on Poincaré maps and to Liouville's theorem (the conservation of phase space volume).2 For systems with more than two degrees of freedom, one of the invariants leads to hypervolume preservation on a hyperdimensional Poincaré map.

First we consider a system with two degrees of freedom (N = 2). Suppose we have two closed curves γ1 and γ2 in phase space, both of which encircle a tube of trajectories generated by Hamilton's equations of motion. These curves can be at two sequential times (t1, t2), or they can be at two sequential mappings on a Poincaré map: γ2 = Uγ1. These curves are associated with domains labeled (D1, D2), which are the projections of the closed curves upon the coordinate planes (p_i, q_i). Because both the mapping and the time propagation are canonical transformations, the integral invariants are preserved (constant) in either case. There are two of them, of the form

$$ I_1 = \sum_{i=1}^{2} \iint_{D_i} dp_i\, dq_i \qquad [54] $$

$$ I_2 = \iiiint_{D} dp_1\, dq_1\, dp_2\, dq_2 \qquad [55] $$
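As an aside on the integration schemes just mentioned,134 the practical difference between a symplectic and a nonsymplectic integrator is easy to demonstrate. The sketch below, which uses a pendulum potential and step sizes chosen arbitrarily for illustration, compares velocity Verlet (a symplectic scheme) with a forward-Euler step:

```python
# Symplectic (velocity Verlet) vs. nonsymplectic (forward Euler) integration
# of H = p^2/2 + (1 - cos q).  The symplectic scheme's energy error stays
# bounded for all time; the Euler energy drifts steadily.  Illustrative only.
import numpy as np

def force(q):                # F = -dV/dq for V(q) = 1 - cos(q)
    return -np.sin(q)

def energy(q, p):
    return 0.5 * p**2 + 1.0 - np.cos(q)

def verlet(q, p, dt, n):     # velocity Verlet: a symplectic map per step
    for _ in range(n):
        p += 0.5 * dt * force(q)
        q += dt * p
        p += 0.5 * dt * force(q)
    return q, p

def euler(q, p, dt, n):      # forward Euler: not symplectic
    for _ in range(n):
        q, p = q + dt * p, p + dt * force(q)
    return q, p

q0, p0, dt, n = 2.0, 0.0, 0.05, 20_000      # 1000 time units in total
E0 = energy(q0, p0)
for scheme in (verlet, euler):
    q, p = scheme(q0, p0, dt, n)
    print(f"{scheme.__name__:>7}: |E - E0| = {abs(energy(q, p) - E0):.2e}")
```

The bounded energy error of the Verlet trajectory reflects the exact step-by-step preservation of the symplectic structure; the Euler map preserves neither the invariants nor the energy, and its error grows without bound.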
The terms constituting I_1 and I_2 are shown in Figure 26.119 We next presume that the system is bound within a single potential well. If we now take the curves (γ1, γ2) to be Poincaré mapping images, such that γ2 = Uγ1 with the surface of section defined by {q1 = q̄1, p1 > 0}, then, since dq1 = 0 on the section, we have

$$ I_1 = \iint_{D_2} dp_2\, dq_2 \qquad [56] $$
Figure 26 A schematic representation of the Poincaré integral invariants for a system with three degrees of freedom. (A) The invariant I_1 is the sum of oriented areas projected onto the three possible (q_i, p_i) planes. (B) The invariant I_2 is the sum of oriented volumes projected onto the three possible (q_i, p_i, q_j, p_j) hyperplanes. Not shown: I_3 (Liouville's theorem). Reprinted with permission from Ref. 119.
By Green's theorem,66 the above equation becomes

$$ I_1 = \oint_{\gamma_1} p_2\, dq_2 = \oint_{\gamma_2} p_2\, dq_2 \qquad [57] $$
which is simply a statement of area preservation on the Poincaré map.66,72 Similarly, it can be shown that Eq. [55] is a statement of the preservation of phase space volume under propagation by Hamilton's equations of motion, that is, Liouville's theorem.2 It is important to note that the Poincaré integral invariants are also preserved under a canonical transformation of any kind, and not just under the propagation of Hamilton's equations.

In N degrees of freedom, a hierarchy of N integral invariants exists. For an arbitrary phase space surface S with symplectic projections consisting of 2ℓ-dimensional volumes S_{2ℓ}, the ℓth member of this hierarchy is of the form

$$ I_\ell = \sum \int_{S_{2\ell}} dp_{i_1}\, dq_{i_1} \cdots dp_{i_\ell}\, dq_{i_\ell} \qquad [58] $$

where the sum runs over the distinct sets of ℓ conjugate coordinate-momentum pairs.
Let us define a surface of section for a bounded three-dimensional system such that {q1 = q̄1, p1 > 0}. For N degrees of freedom, such a surface is of dimension 2N − 2 (all points on it have q1 = q̄1 and H = E); here the surface of section is four-dimensional. The three integral invariants are

$$ I_1 = \sum_{i=1}^{3} \iint dp_i\, dq_i $$

$$ I_2 = \sum_{i<j} \iiiint dp_i\, dq_i\, dp_j\, dq_j $$

$$ I_3 = \int dp_1\, dq_1\, dp_2\, dq_2\, dp_3\, dq_3 $$

Simply by analogy with the N = 2 case, it should now hopefully be clear that I_2 is the (constant) four-dimensional hypervolume of an arbitrary surface under a canonical transformation. I_1 is interpreted as the conservation of the sum of symplectic areas [i.e., the areas of the projections onto (q_i, p_i) planes] of S. I_3 is the total phase space volume, which is constant in agreement with Liouville's theorem. Because the mapping γ2 = Uγ1 is canonical, the mapping of the dynamics onto a Poincaré surface is a hypervolume-preserving map. We see that in general a similar Poincaré map for an N-dimensional system will preserve its (2N − 2)-dimensional map hypervolume, which can be obtained from the invariant I_{N−1}. We also note that I_N is the multidimensional statement of Liouville's theorem. In the current context, conservation of the Poincaré integral invariants assures that the properties of a Poincaré mapping of separatrix manifolds are entirely analogous to the properties of two-dimensional systems. More importantly, we are assured that the cross-sectional hypervolume of a hypercylinder is always the same as the hypervolume of the repulsive manifold τ_{N−1}.66,74,112,119,132 Thus we are assured by a fairly rigorous argument that cylindrical manifolds in many dimensions have the same properties as in two degrees of freedom.
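These preservation properties are easy to check numerically for N = 2. The sketch below uses the Hénon-Heiles potential (Ref. 118) as a convenient example; the energy and the placement of the loop of initial conditions are arbitrary choices. It maps an ordered loop of section points through one return of the Poincaré map and compares the enclosed areas via the shoelace formula:

```python
# Numerical check of area preservation (the invariant I_1) on a Poincare map,
# for the Henon-Heiles Hamiltonian H = (p1^2 + p2^2)/2 + (q1^2 + q2^2)/2
#                                      + q1^2 q2 - q2^3/3.
import numpy as np
from scipy.integrate import solve_ivp

def eqm(t, y):
    q1, q2, p1, p2 = y
    return [p1, p2, -q1 - 2.0 * q1 * q2, -q2 - q1**2 + q2**2]

def section(t, y):          # surface of section: q1 = 0, crossed with p1 > 0
    return y[0]
section.direction = 1

def poincare_map(q2, p2, E):
    """Map one section point (q2, p2) to its next upward crossing."""
    p1 = np.sqrt(2.0 * E - p2**2 - q2**2 + (2.0 / 3.0) * q2**3)
    sol = solve_ivp(eqm, (0.0, 200.0), [1e-9, q2, p1, p2], events=section,
                    rtol=1e-10, atol=1e-10)
    y = sol.y_events[0][0]   # state at the first return to the section
    return y[1], y[3]

def loop_area(pts):          # shoelace formula for an ordered closed loop
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

E, n = 1.0 / 8.0, 60
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
loop = np.column_stack([0.10 + 0.02 * np.cos(theta), 0.02 * np.sin(theta)])
mapped = np.array([poincare_map(q2, p2, E) for q2, p2 in loop])
print(f"area before: {loop_area(loop):.6e}, after: {loop_area(mapped):.6e}")
```

Within the integration tolerance, the two areas agree, which is precisely the statement of Eq. [57].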
A Note on Arnold Diffusion

For N = 2 we have already learned that trapped and reactive regular motion corresponds to motion on KAM surfaces (invariant tori), and that chaotic motion can never cross a KAM surface. However, for coupled systems with N > 2 degrees of freedom, Arnold diffusion can occur between stochastic regions separated by KAM surfaces.55,65,66,82 Arnold proved the existence of this phenomenon in a specific Hamiltonian system and presented solid arguments for its generic presence in multidimensional Hamiltonian systems.65 Basically, KAM surfaces in multidimensional systems are of lower dimension than that available to chaotic trajectories. Therefore chaotic trajectories can move "around" the KAM surfaces, which cannot happen when N = 2. The pattern formed by this "diffusion" is known as the stochastic web. This phenomenon has since been observed in many dynamical systems, although, interestingly, no general proof of its existence has apparently yet been given.65

The existence of Arnold diffusion is irrelevant to the properties of separatrix manifolds, which still mediate the transport of chaotic trajectories within the regions of phase space they control. However, if Arnold diffusion is present in a given multidimensional system, the possibility exists for chaotic motion initially trapped between two nonreactive (trapped) KAM layers to eventually become reactive. This would presumably manifest itself as an apparent bottleneck to the rate of population decay, as chaotic trajectories slowly leak out from the region occupied by regular KAM surfaces into the portion of phase space more directly accessible to the hypercylinders. However, transport via the Arnold diffusion mechanism typically manifests itself on time scales much longer than those observed in numerical simulations (Arnold diffusion usually occurs on the order of thousands of mappings, or vibrational periods), and so it seems improbable that this effect would be observed in a typical reaction dynamics simulation. It would be interesting to characterize the effect of Arnold diffusion in realistic molecular models.
SUMMARY AND CONCLUSIONS

A great deal of effort, especially since the mid-1970s, has gone into developing a better understanding of molecular dynamics by dipping into the "nonlinear dynamics toolbox." We now know that rate constants obtained from statistical models of microscopic chemical kinetics can deviate from the kinetics of trajectory simulations by an order of magnitude or more. These deviations may be caused by a combination of dynamical effects, including quasiperiodicity (trapped KAM tori), intramolecular bottlenecks (cantori), and nonstatistical recrossing effects (caused by the overlap of cylindrical manifolds). Separating out these various effects has been quite a challenging endeavor. This effort has been fueled in part by the belief that the ability to visualize molecular dynamics in phase space may ultimately prove to be generally useful in interpreting trends in microscopic chemical reaction dynamics. Certainly, the various kinetic theories that have arisen from these ideas have thus far been found to describe classical reaction kinetics quite accurately for low-dimensional systems.

There remain a number of interesting challenges in addition to the application of nonlinear dynamics to larger systems. One of these is the development of a better understanding of the significance of these developments to quantum dynamics and spectroscopy,17-25,92,94,96-98,117,119 a fascinating field that has the colorful and intriguing name of "quantum chaos."136-138 Another challenge is to improve our understanding of the properties of dynamical bottlenecks in high-dimensional systems. A third challenge we see is to design ways of modifying the nonlinear dynamics approach to include quantum-mechanical corrections to k(E) [and ultimately k(T)] in a practical way, a highly desirable goal. Practical routes to correcting variational transition-state theory for tunneling and zero-point motion that are currently being developed may point the way to achieving this.52
ACKNOWLEDGMENTS

I would like to acknowledge the invaluable assistance of Francis Torres (Cooper Union), who contributed several figures for this article, and Hiu-Lui Chan (Cooper Union) for her assistance with the literature search. I am grateful for the assistance of Nelson De Leon (Indiana University Northwest) and Manish A. Mehta (University of Washington), both of whom contributed substantially to an early draft of the last section of this chapter. Manish also kindly granted permission for a number of figures from his thesis to be published here, for which I thank him. I have also benefited from many conversations about nonlinear dynamics over the years with C. Clay Marston (Notre Dame), Steven Neshyba (University of Puget Sound), Michael New (University of California San Francisco), and Song Ling. Comments on an early draft by Harold Schranz (Australian National University) were very helpful. Finally, I thank the Cooper Union for the Advancement of Science and Art for its support of this work.
REFERENCES

1. J. W. Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, New Haven, CT, 1902.
2. R. C. Tolman, The Principles of Statistical Mechanics, Dover, New York, 1938. H. Eyring, D. Henderson, B. J. Stover, and E. M. Eyring, Statistical Mechanics and Dynamics, Wiley, New York, 1982.
3. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, 2nd edit., Oxford University Press, Oxford, 1989.
4. J. M. Haile, Molecular Dynamics Simulation: Elementary Methods, Wiley, New York, 1992.
5. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edit., Cambridge University Press, Cambridge, 1992.
6. See, for example, these articles and references therein: S. M. Lederman, V. Lopez, V. Fairen, G. A. Voth, and R. A. Marcus, Chem. Phys., 139, 171 (1989). Vibrational Energy Redistribution Across a Heavy Atom. M. J. Davis, G. A. Bethardy, and K. K. Lehmann, J. Chem. Phys., 101, 2642 (1994). Hierarchical Structure in the 3ν1 Band of Propyne. P. M. Felker and A. H. Zewail, in Jet Spectroscopy and Molecular Dynamics, J. M. Hollas and D. Phillips, Eds., Blackie Academic, London, 1995, pp. 222-308. Ultrafast Dynamics of IVR in Molecules and Reactions.
7. H. Goldstein, Classical Mechanics, 2nd edit., Addison-Wesley, Reading, MA, 1980.
8. B. L. Van der Waerden, in Sources of Quantum Mechanics, B. L. Van der Waerden, Ed., Dover, New York, 1967, pp. 1-57. Introduction.
9. See, for example, D. A. McQuarrie, Quantum Chemistry, University Science Books, Mill Valley, CA, 1983.
10. I. Percival and D. Richards, Introduction to Dynamics, Cambridge University Press, Cambridge, 1982.
11. Some useful reviews of quasiclassical and semiclassical dynamics include: D. G. Truhlar and J. T. Muckerman, in Atom-Molecule Collision Theory, R. B. Bernstein, Ed., Plenum Press, New York, 1979, pp. 505-566. Reactive Scattering Cross Sections. III. Quasiclassical and Semiclassical Methods. L. M. Raff and D. L. Thompson, in Theory of Chemical Reaction Dynamics, M. Baer, Ed., CRC Press, Boca Raton, FL, 1986, Vol. III, pp. 1-121. The Classical Trajectory Approach to Reactive Scattering. M. S. Child, in Theory of Chemical Reaction Dynamics, M. Baer, Ed., CRC Press, Boca Raton, FL, 1986, Vol. III, pp. 247-279. Semiclassical Reactive Scattering.
12. See, for example, D. W. Schwenke, D. G. Truhlar, and M. E. Coltrin, J. Chem. Phys., 87, 983 (1987). Comparison of Close Coupling and Quasiclassical Trajectory Calculations for Rotational Energy Transfer in the Collision of Two HF Molecules on a Realistic Potential Surface. M. Zhao, D. G. Truhlar, N. C. Blais, D. W. Schwenke, and D. J. Kouri, J. Phys. Chem., 94, 6696 (1990). Are Classical Molecular Dynamics Calculations Accurate for State-to-State Transition Probabilities in the H + D2 Reaction?
13. S. A. Rice and P. Gaspard, Isr. J. Chem., 30, 23 (1990). Unimolecular Reactions Revisited.
14. W. H. Miller, Acc. Chem. Res., 4, 161 (1971). The Semiclassical Nature of Atomic and Molecular Collisions.
15. J. R. Stine and R. A. Marcus, Chem. Phys. Lett., 29, 575 (1974). Semiclassical S Matrix Theory for a Compound State Resonance in the Reactive Collinear H + H2 Collision.
16. G. C. Schatz, J. M. Bowman, and A. Kuppermann, J. Chem. Phys., 63, 674 (1975). Exact Quantum, Quasiclassical, and Semiclassical Reaction Probabilities for the Collinear F + H2 → FH + H Reaction.
17. E. Pollak and M. S. Child, Chem. Phys., 60, 23 (1981).
A Simple Prediction of Quantal Resonances in Collinear Reactive Scattering.
18. C. C. Marston, R. C. Brown, and R. E. Wyatt, Chem. Phys. Lett., 122, 205 (1985). Semiclassical Wavepacket Construction of Quantum Resonance States from Classical Resonant Orbits for the F + H2 Reaction. C. C. Marston and R. E. Wyatt, J. Chem. Phys., 83, 3390 (1985). Semiclassical Theory of Resonances in 3D Reactions. II. Resonant Quasiperiodic Orbits for F + H2.
19. E. Pollak, J. Phys. Chem., 90, 3619 (1986). Spectroscopy of Resonances in Three-Dimensional Atom-Diatom Reactive Scattering.
20. L. Zachilas and S. C. Farantos, Chem. Phys., 154, 55 (1991). Periodic Orbits and Quantum Localization in the van der Waals System CO-Ar.
21. D. C. Chatfield, R. S. Friedman, D. G. Truhlar, B. C. Garrett, and D. W. Schwenke, J. Am. Chem. Soc., 113, 486 (1991). Global Control of Suprathreshold Reactivity by Quantized Transition States.
22. Y. Sun, J. M. Bowman, G. C. Schatz, A. R. Sharp, and J. N. L. Connor, J. Chem. Phys., 92, 1677 (1990). Reduced-Dimensionality Quantum Calculations of the Thermal Rate Coefficient for the Cl + HCl → HCl + Cl Reaction: Comparison with Centrifugal-Sudden Distorted Wave, Coupled Channel Hyperspherical, and Experimental Results.
23. H. S. Taylor and J. Zakrzewski, Phys. Rev., A38, 3732 (1988). Dynamic Interpretation of Atomic and Molecular Spectra in the Chaotic Regime. See also references therein.
24. J. J. Sakurai, Modern Quantum Mechanics, S. F. Tuan, Ed., Benjamin Cummings, Menlo Park, CA, 1985.
25. E. J. Heller, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, New York, 1992, Vol. 1, pp. 165-213. Periodic Orbits and Spectra. M. A. Sepulveda and E. J. Heller, J. Chem. Phys., 101, 8004 (1994). Semiclassical Calculation and Analysis of Dynamic Systems with Mixed Phase Space.
26. D. G. Truhlar, W. L. Hase, and J. T. Hynes, J. Phys. Chem., 87, 2664 (1983). Current Status of Transition-State Theory. D. G. Truhlar and B. C. Garrett, Annu. Rev. Phys. Chem., 35, 159 (1984). Variational Transition State Theory. R. A. Marcus, J. Phys. Chem., 90, 3460 (1986). Theory, Experiment, and Reaction Rates. A Personal View. J. D. Doll and A. F. Voter, Annu. Rev. Phys. Chem., 38, 413 (1987). Recent Developments in the Theory of Surface Diffusion. B. J. Berne, in Multiple Time Scales, J. U. Brackbill and B. I. Cohen, Eds., Academic Press, Orlando, FL, 1987, pp. 419-436. Molecular Dynamics and Monte Carlo Simulation of Rare Events. P. Hanggi, P. Talkner, and M. Borkovec, Rev. Mod. Phys., 62, 251 (1990). Reaction-Rate Theory: Fifty Years After Kramers.
27. G. A. Natanson, B. C. Garrett, T. N. Truong, T. Joseph, and D. G. Truhlar, J. Chem. Phys., 94, 7875 (1991). The Definition of Reaction Coordinates for Reaction-Path Dynamics. D.-h. Lu, M. Zhao, and D. G. Truhlar, J. Comput. Chem., 12, 376 (1991). Projection Operator Method for Geometry Optimization with Constraints.
28. P. Pechukas, in Dynamics of Molecular Collisions, Part B, W. H. Miller, Ed., Plenum Press, New York, 1976, pp. 269-321. Statistical Approximations in Collision Theory.
29. N. De Leon, M. A. Mehta, and R. Q. Topper, J. Chem. Phys., 94, 8329 (1991). Cylindrical Manifolds in Phase Space as Essential Mediators of Chemical Reaction Dynamics and Kinetics. II. Numerical Considerations and Applications to Models with Two Degrees of Freedom.
30. G. Nyman, S. Nordholm, and H. W. Schranz, J. Chem. Phys., 93, 6767 (1990). Efficient Microcanonical Sampling for a Selected Total Angular Momentum. Applications to HO and H2O. H. W. Schranz, S. Nordholm, and G. Nyman, J. Chem.
Phys., 94, 1487 (1991). An Efficient Microcanonical Sampling Procedure for Molecular Systems. H. W. Schranz, J. Phys. Chem., 95, 4581 (1991). On the Microcanonical Weight Function.
31. L. Onsager, Phys. Rev., 37, 405 (1931). Reciprocal Relations in Irreversible Processes. I. L. Onsager, Phys. Rev., 38, 2265 (1931). Reciprocal Relations in Irreversible Processes. II. See also Ref. 32.
32. J. I. Steinfeld, J. S. Francisco, and W. L. Hase, Chemical Kinetics and Dynamics, Prentice-Hall, Englewood Cliffs, NJ, 1989.
33. D. Chandler, Introduction to Modern Statistical Mechanics, Oxford University Press, New York, 1987.
34. D. Chandler, J. Chem. Phys., 68, 2959 (1978). Statistical Mechanics of Isomerization Dynamics in Liquids and the Transition State Approximation. J. A. Montgomery, Jr., D. Chandler, and B. J. Berne, J. Chem. Phys., 70, 4056 (1979). Trajectory Analysis of a Kinetic Theory for Isomerization Dynamics in Condensed Phases.
35. N. De Leon and B. J. Berne, J. Chem. Phys., 75, 3495 (1981). Intramolecular Rate Processes: Isomerization Dynamics and the Transition to Chaos.
36. O. K. Rice and H. C. Ramsperger, J. Am. Chem. Soc., 50, 617 (1928). Theories of Unimolecular Gas Reactions at Low Pressures. II.
37. L. S. Kassel, J. Phys. Chem., 32, 1065 (1928). Studies in Homogeneous Gas Reactions. II. Introduction of Quantum Theory.
38. R. A. Marcus and O. K. Rice, J. Phys. Colloid Chem., 55, 894 (1951). The Kinetics of the Recombination of Methyl Radicals and Iodine Atoms. R. A. Marcus, J. Chem. Phys., 20, 352 (1952). Lifetimes of Active Molecules. I. H. M. Rosenstock, M. B. Wallenstein, A. L. Wahrhaftig, and H. Eyring, Proc. Natl. Acad. Sci. USA, 38, 667 (1952). Absolute Rate Theory for Isolated Systems and the Mass Spectra of Polyatomic Molecules.
39. H. S. Johnston, Gas Phase Reaction Rate Theory, Ronald Press, New York, 1966.
40. W. Forst, Theory of Unimolecular Reactions, Academic Press, New York, 1973.
41. W. C. Gardiner, Rates and Mechanisms of Chemical Reactions, Benjamin Press, New York, 1969.
42. S. Glasstone, K. Laidler, and H. Eyring, The Theory of Rate Processes, McGraw-Hill, New York, 1969.
43. E. E. Nikitin, Theory of Elementary Atomic and Molecular Processes, Clarendon Press, Oxford, 1974.
44. J. H. Beynon and J. R. Gilbert, Application of Transition State Theory to Unimolecular Reactions: An Introduction, Wiley, Chichester, 1984.
45. H. Eyring, J. Chem. Phys., 3, 107 (1935). The Activated Complex in Chemical Reactions.
46. M. G. Evans and M. Polanyi, Trans. Faraday Soc., 31, 875 (1935). Some Applications of the Transition State Method to the Calculation of Reaction Velocities, Especially in Solution.
47. D. Eisenberg and D. Crothers, Physical Chemistry with Applications to the Life Sciences, Benjamin Cummings, Menlo Park, CA, 1979, pp. 237-245.
48. R. D. Levine, Physical Chemistry, 2nd edit., McGraw-Hill, New York, 1983.
49. See, for example, B. Bagchi and G. R. Fleming, J. Phys. Chem., 94, 10 (1990). Dynamics of Activationless Reactions in Solution. G. S. Tyndall and A. R. Ravishankara, Int. J. Chem. Kinet., 23, 483 (1991). Atmospheric Oxidation of Reduced Sulfur Species.
50. S. H. Courtney, M. W. Balk, L. A. Philips, S. P. Webb, D. Yang, D. H. Levy, and G. R. Fleming, J. Chem. Phys., 89, 6697 (1988). Unimolecular Reactions in Isolated and Collisional Systems: Deuterium Isotope Effect in the Photoisomerization of Stilbene.
51. See, for example, D. L. Bunker, J. Chem. Phys., 40, 1946 (1963). Monte Carlo Calculations. IV. Further Studies of Unimolecular Dissociation. D. L. Bunker and M. Pattengill, J. Chem. Phys., 48, 772 (1968). Monte Carlo Calculations. VI. A Re-evaluation of the RRKM Theory of Unimolecular Reaction Rates. W. L. Hase and R. J. Wolf, J. Chem. Phys., 75, 3809 (1981). Trajectory Studies of Model HCCH → H + HCC Dissociation. II. Angular Momenta and Energy Partitioning and the Relation to Non-RRKM Dynamics. D. W. Chandler, W. E. Farneth, and R. N. Zare, J. Chem. Phys., 77, 4447 (1982).
A Search for Mode-Selective Chemistry: The Unimolecular Dissociation of t-Butyl Hydroperoxide Induced by Vibrational Overtone Excitation. J. A. Syage, P. M. Felker, and A. H. Zewail, J. Chem. Phys., 81, 2233 (1984). Picosecond Dynamics and Photoisomerization of Stilbene in Supersonic Beams. II. Reaction Rates and Potential Energy Surface. D. B. Borchardt and S. H. Bauer, J. Chem. Phys., 85, 4980 (1986). Intramolecular Conversions Over Low Barriers. VII. The Aziridine Inversion-Intrinsically Non-RRKM. A. H. Zewail and R. B. Bernstein,
Chem. Eng. News, Nov. 7, 1988, p. 24. Real-Time Laser Femtochemistry: Viewing the Transition from Reagents to Products.
52. See, for example, D. G. Truhlar and M. S. Gordon, Science, 249, 491 (1990). From Force Fields to Dynamics: Classical and Quantal Paths. D.-h. Lu, D. Maurice, and D. G. Truhlar, J. Am. Chem. Soc., 112, 6206 (1990). What Is the Effect of Variational Optimization of the Transition State on α-Deuterium Secondary Kinetic Isotope Effects? A Prototype: CD3H + H → CD3 + H2. M. R. Soto and M. Page, J. Chem. Phys., 97, 7287 (1992). Ab Initio Variational Transition-State-Theory Reaction-Rate Calculations for the Gas-Phase Reaction H + HNO → H2 + NO. Y.-P. Liu, G. C. Lynch, T. N. Truong, D.-h. Lu, D. G. Truhlar, and B. C. Garrett, J. Am. Chem. Soc., 115, 2408 (1993). Molecular Modeling of the Kinetic Isotope Effect for the [1,5] Sigmatropic Rearrangement of cis-1,3-Pentadiene. G. R. Haynes, G. A. Voth, and E. Pollak, J. Chem. Phys., 101, 7811 (1994). A Theory for the Activated Barrier Crossing Rate Constant in Systems Influenced by Space and Time Dependent Friction.
53. C. C. Marston and N. De Leon, J. Chem. Phys., 91, 3392 (1989). Reactive Islands as Essential Mediators of Unimolecular Conformational Isomerization: A Dynamical Study of 3-Phospholene.
54. E. Wigner, J. Chem. Phys., 5, 720 (1937). Calculation of the Rate of Elementary Association Reactions.
55. R. C. Hilborn, Chaos and Nonlinear Dynamics: An Introduction for Scientists and Engineers, Oxford University Press, New York, 1994. See also, R. Larter and K. Showalter, this volume. Computational Studies in Nonlinear Dynamics.
56. M. J. Feigenbaum, J. Stat. Phys., 19, 25 (1978). Quantitative Universality for a Class of Nonlinear Transformations. M. J. Feigenbaum, J. Stat. Phys., 21, 669 (1979). The Universal Metric Properties of Nonlinear Transformations.
57. E. N. Lorenz, J. Atmos. Sci., 20, 130 (1963). Deterministic Nonperiodic Flow. E. N. Lorenz, Nature, 352, 241 (1991). Dimension of Weather and Climate Attractors.
58. I. C. Percival, Proc. R. Soc. Lond., A413, 131 (1987). Chaos in Hamiltonian Systems.
59. J. Ford, Physics Today, 36, 40 (1983). How Random Is a Coin Toss?
60. M. J. Davis, J. Chem. Phys., 83, 1016 (1985). Bottlenecks to Intramolecular Energy Transfer and the Calculation of Relaxation Rates.
61. R. S. MacKay and J. D. Meiss, J. Phys., 19A, L225 (1986). Flux and Differences in Action for Continuous Time Hamiltonian Systems.
62. R. E. Gillilan and W. P. Reinhardt, Chem. Phys. Lett., 156, 478 (1989). Barrier Recrossing in Surface Diffusion: A Phase-Space Perspective.
63. H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Tomes I-III, Gauthier-Villars, Paris, 1899. H. Poincaré, New Methods of Celestial Mechanics, Vols. I-III, NASA Technical Translation, NASA TT F-452, Dover Publications, New York, 1957. H. Poincaré, New Methods of Celestial Mechanics, Vols. I-III, English translation, D. L. Goroff, Ed., AIP, Woodbury, NY, 1993. (Poincaré is pronounced pwa-ca-ray.)
64. J. Gleick, Chaos: Making a New Science, Penguin Books, New York, 1987.
65. A. J. Lichtenberg and M. A. Lieberman, Regular and Stochastic Motion, Springer-Verlag, New York, 1983.
66. V. I. Arnold, Mathematical Methods of Classical Mechanics, K. Vogtmann and A. Weinstein, Transl., Springer-Verlag, New York, 1978.
67. J. Ford, in Topics in Nonlinear Dynamics, AIP Conf. Proc. 46, S. Jorna, Ed., AIP, Woodbury, NY, 1978, pp. 121-146. A Picture Book of Stochasticity.
68. P.
Holmes, Physics Rep., 193, 138 (1990). Poincaré, Celestial Mechanics, Dynamical-Systems Theory and "Chaos."
69. J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, 2nd edit., Springer-Verlag, New York, 1985.
70. S. Wiggins, Global Bifurcations and Chaos, Springer-Verlag, New York, 1988.
71. L. Krlin, Fortschr. Phys., 37, 735 (1989). The Intrinsic Stochasticity of Near-Integrable Hamiltonian Systems.
72. M. Tabor, Chaos and Integrability in Nonlinear Dynamics, Wiley, New York, 1989.
73. S. Wiggins, Introduction to Applied Nonlinear Dynamical Systems and Chaos, Springer-Verlag, New York, 1990. S. Wiggins, Chaotic Transport in Dynamical Systems, Springer-Verlag, New York, 1992.
74. S. Wiggins, Physica, D44, 471 (1990). On the Geometry of Transport in Phase Space. I. Transport in k-Degree-of-Freedom Hamiltonian Systems, 2 ≤ k < ∞.
75. N. B. Tufillaro, T. Abbott, and J. Reilly, An Experimental Approach to Nonlinear Dynamics and Chaos, Addison-Wesley, Redwood City, CA, 1992.
76. E. Fermi, J. Pasta, and S. Ulam, Los Alamos National Laboratory Document LA-1940, May 1955. A reprint may be found in E. Fermi, Collected Papers: Note e Memorie, University of Chicago Press, Vol. II, pp. 977-988. Studies of Non Linear Problems.
77. A. N. Kolmogorov, Dokl. Akad. Nauk SSSR, 119, 861 (1958). A New Invariant for Transitive Dynamical Systems.
78. R. Abraham, Foundations of Mechanics, Benjamin Press, New York, 1967, Appendix D.
79. V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics, Benjamin Press, New York, 1968.
80. J. Moser, Nachr. Akad. Wiss. Göttingen, 1, 1 (1962). On Invariant Curves of Area-Preserving Mappings on an Annulus.
81. J. Moser, Stable and Random Motion in Dynamical Systems, Princeton University Press, Princeton, NJ, 1973.
82. B. V. Chirikov, Phys. Rep., 52, 263 (1979). A Universal Instability of Many-Dimensional Oscillator Systems.
83. G. D. Birkhoff, Mem. Pont. Acad. Sci. Novi Lyncaei, 1, 85 (1935) (in French). Nouvelles Recherches sur les Systèmes Dynamiques.
84. Smale's technically challenging contributions are nicely summarized by S. S. Cairns, Differential and Combinatorial Topology, Princeton University Press, Princeton, NJ, 1965.
85. J. M. Greene, R. S. MacKay, F. Vivaldi, and M. J. Feigenbaum, Physica, 3D, 468 (1981). Universal Behavior in Families of Area-Preserving Maps.
86. R. S. MacKay, J. D. Meiss, and I. C. Percival, Physica, 13D, 55 (1984). Transport in Hamiltonian Systems. R. S. MacKay, J. D. Meiss, and I. C. Percival, Physica, 27D, 1 (1987). Resonances in Area-Preserving Maps.
87. D. Bensimon and L. P. Kadanoff, Physica, 13D, 82 (1984). Extended Chaos and Disappearance of KAM Trajectories.
88. R. De Vogelaere and M. Boudart, J. Chem. Phys., 23, 1236 (1955). Contribution to the Theory of Fast Reaction Rates.
89. E. Pollak and P. Pechukas, J. Chem. Phys., 69, 1218 (1978). Transition States, Trapped Trajectories, and Classical Bound States in the Continuum.
90. P. Pechukas and E. Pollak, J. Chem. Phys., 71, 2062 (1979). Classical Transition State Theory Is Exact if the Transition State Is Unique.
91. E. Pollak and M. S. Child, J. Chem. Phys., 73, 4373 (1980). Classical Mechanics of a Collinear Exchange Reaction: A Direct Evaluation of the Reaction Probability and Product Distribution.
92. E. Pollak, in Theory of Chemical Reaction Dynamics, M. Baer, Ed., CRC Press, Boca Raton, FL, 1985, Vol. III, pp. 123-246. Periodic Orbits and the Theory of Reactive Scattering.
93. B. J. Berne, N. De Leon, and R. O. Rosenberg, J. Phys. Chem., 86, 2166 (1982). Isomerization Dynamics and the Transition to Chaos.
94. M. J. Davis and R. E. Wyatt, Chem. Phys. Lett., 86, 235 (1982). Surface-of-Section Analysis in the Classical Theory of Multiphoton Absorption.
95. J. S. Hutchinson, W. P. Reinhardt, and J. T. Hynes, J. Chem. Phys., 79, 4247 (1983). Nonlinear Resonances and Vibrational Energy Flow in Model Hydrocarbon Chains.
96. R. B. Shirts and W. P. Reinhardt, J. Chem. Phys., 77, 5204 (1982). Approximate Constants of Motion for Classically Chaotic Vibrational Dynamics: Vague Tori, Semiclassical Quantization, and Classical Intramolecular Energy Flow.
97. C. Jaffé and W. P. Reinhardt, J. Chem. Phys., 77, 5191 (1982). Uniform Semiclassical Quantization of Regular and Chaotic Classical Dynamics on the Hénon-Heiles Surface.
98. K. Sohlberg and R. B. Shirts, J. Chem. Phys., 101, 7763 (1994). Semiclassical Quantization of a Nonintegrable System: Pushing the Fourier Method into the Chaotic Regime.
99. C. C. Martens, M. J. Davis, and G. S. Ezra, Chem. Phys. Lett., 142, 519 (1987). Local Frequency Analysis of Chaotic Motion in Multidimensional Systems: Energy Transport and Bottlenecks in Planar OCS.
100. See, for example, the following and references contained therein: E. L. Sibert III, W. P. Reinhardt, and J. T. Hynes, J. Chem. Phys., 81, 1115 (1984). Intramolecular Vibrational Relaxation and Spectra of CH and CD Overtones in Benzene and Perdeuterobenzene. S. P. Neshyba and N. De Leon, J. Chem. Phys., 86, 6295 (1987). Classical Resonances, Fermi Resonances, and Canonical Transformations for Three Nonlinearly Coupled Oscillators. S. P. Neshyba and N. De Leon, J. Chem. Phys., 91, 7772 (1989). Projection Operator Formalism for the Characterization of Molecular Eigenstates: Application to a 3:4 Resonant System. G. S. Ezra, J. Chem. Phys., 104, 26 (1996). Periodic Orbit Analysis of Molecular Vibrational Spectra: Spectral Patterns and Dynamical Bifurcations in Fermi Resonant Systems. Also see Ref. 6.
101. M. J. Davis, J. Chem. Phys., 86, 3978 (1987). Phase Space Dynamics of Bimolecular Reactions and the Breakdown of Transition State Theory.
102. R. T. Skodje and M. J. Davis, J. Chem. Phys., 88, 2429 (1988). A Phase-Space Analysis of the Collinear I + HI Reaction. R. T. Skodje and M. J. Davis, Chem. Phys. Lett., 175, 92 (1990). Statistical Rate Theory for Transient Chemical Species: Classical Lifetimes from Periodic Orbits.
103. S. K. Gray, S. A. Rice, and M. J. Davis, J. Phys. Chem., 90, 3470 (1986). Bottlenecks to Unimolecular Reactions and an Alternative Form for Classical RRKM Theory. M. J. Davis and S. K. Gray, J. Chem. Phys., 84, 5389 (1986). Unimolecular Reactions and Phase Space Bottlenecks.
104. S. K. Gray and S. A. Rice, Faraday Discuss. Chem. Soc., 82, 307 (1986). The Photofragmentation of Simple van der Waals Complexes.
105. S. K. Gray and S. A. Rice, J. Chem. Phys., 86, 2020 (1987). Phase Space Bottlenecks and Statistical Theories of Isomerization Reactions.
106. M. A. Harthcock and J. Laane, J. Chem. Phys., 79, 2103 (1983). Two-Dimensional Analysis of the Ring-Puckering and PH Inversion Vibrations of 3-Phospholene.
107. A. M. Ozorio de Almeida, N. De Leon, M. A. Mehta, and C. C. Marston, Physica, 46D, 265 (1990). Geometry and Dynamics of Stable and Unstable Cylinders in Hamiltonian Systems.
108. N. De Leon, M. A. Mehta, and R. Q. Topper, J. Chem. Phys., 94, 8310 (1990). Cylindrical Manifolds as Essential Mediators of Chemical Reaction Dynamics and Kinetics. I. Theory.
109. N. De Leon, J. Chem. Phys., 96, 285 (1991). Cylindrical Manifolds and Reactive Island Kinetic Theory in the Time Domain.
110. N. De Leon, Chem. Phys. Lett., 189, 371 (1992). Dynamical Corrections for Non-RRKM Behavior.
111. M. Zhao and S. A. Rice, J. Chem.
Phys., 96, 3542 (1992). Unimolecular Fragmentation Rate Theory Revisited: An Improved Classical Theory. M. Zhao and S. A. Rice, J. Chem. Phys., 96, 6654 (1992). An Approximate Classical Unimolecular Reaction Rate Theory. M. Zhao and S. A. Rice, J. Chem. Phys., 98, 2837 (1993). Comment on the Rate of Isomerization of 3-Phospholene. M. Zhao and S. A. Rice, J. Chem. Phys., 96, 3542 (1992). Comment on the Classical Theory of Isomerization. S. Jang, M. Zhao, and S. A. Rice, J. Chem. Phys., 97,
8188 (1992). Comment on the Rate of Isomerization in Molecules with a Symmetric Triple Well.
112. N. De Leon and S. Ling, J. Chem. Phys., 101, 4790 (1994). Simplification of the Transition State Concept in Reactive Island Theory: Application to the HCN ⇌ HNC Isomerization.
113. H. W. Schranz and M. A. Collins, J. Chem. Phys., 98, 1132 (1993). Nonlinear Resonance and Torsional Dynamics: Model Simulations of HOOH and CH3OOCH3. H. W. Schranz and M. A. Collins, J. Chem. Phys., 101, 307 (1994). A Model Classical Study of Nonlinear Resonance and Torsional Isomerization.
114. W. L. Hase, in Dynamics of Molecular Collisions, Part B, W. H. Miller, Ed., Plenum Press, New York, 1976, pp. 121-169. Dynamics of Unimolecular Reactions. G. S. Ezra, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, London, 1992, Vol. 1, pp. 1-40. Classical Trajectory Studies of Intramolecular Dynamics: Local Mode Dynamics, Rotation-Vibration Interaction and the Structure of Multidimensional Phase Space. J. S. Hutchinson, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, London, 1992, Vol. 1, pp. 41-75. The Role of Mode-Mode Energy Transfer in Unimolecular Reactions. M. J. Davis and R. T. Skodje, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, London, 1992, Vol. 1, pp. 77-164. Chemical Reactions as Problems in Nonlinear Dynamics: A Review of Statistical and Adiabatic Approximations from a Phase Space Perspective. C. Duneczky and W. Reinhardt, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, London, 1992, Vol. 1, pp. 315-349. The Role of Potential Couplings and Isotopic Substitution on the Ultrafast Classical Relaxation of High Alkyl CH and CD Overtones. S. Chapman and T. Uzer, in Advances in Classical Trajectory Methods, W. L. Hase, Ed., JAI Press, London, 1992, Vol. 1, pp. 351-384. Dynamics of Overtone Induced Energy Flow and Fragmentation in Alkyl Hydroperoxides.
115. P. M. Morse, Phys. Rev., 34, 57 (1929). Diatomic Molecules According to the Wave Mechanics. II. Vibrational Levels.
116. R. H. G. Helleman, in Topics in Nonlinear Dynamics, S. Jorna, Ed., AIP, New York, 1978, pp. 400-403. Dynamics Revisited, A Glossary. M. V. Berry, in Topics in Nonlinear Dynamics, S. Jorna, Ed., AIP, New York, 1978, pp. 16-120. Regular and Irregular Motion.
117. M. V. Berry, in Les Houches-Chaotic Behavior of Deterministic Systems, G. Iooss, R. H. G. Helleman, and R. Stora, Eds., North Holland, Amsterdam, 1983, pp. 171-271. Semiclassical Mechanics of Regular and Irregular Motion.
118. M. Hénon and C. Heiles, Astron. J., 69, 73 (1964). The Applicability of the Third Integral of Motion: Some Numerical Experiments.
119. M. A. Mehta, Ph.D. Dissertation, Yale University, New Haven, CT, 1990. Classical and Quantum Dynamics of Phase Space Cylindrical Manifolds.
120. M. Hénon, in Les Houches-Chaotic Behavior of Deterministic Systems, G. Iooss, R. H. G. Helleman, and R. Stora, Eds., North Holland, Amsterdam, 1983, pp. 53-170. Numerical Exploration of Hamiltonian Systems.
121. D. L. Gonzalez, M. O. Magnasco, G. B. Mindlin, H. A. Larrondo, and L. Romanelli, Physica, 39D, 111 (1989). A Universal Departure from the Classical Period Doubling Spectrum.
122. M. Hénon, Physica, 5D, 412 (1982). On the Numerical Computation of Poincaré Maps.
123. T. L. Hill, An Introduction to Statistical Thermodynamics, Dover, New York, 1960. See Appendix IV.
124. R. Q. Topper, Ph.D. Dissertation, Yale University, New Haven, CT, 1990.
The Dynamics and Kinetics of Reactive Motion Between Multiple Geometric Conformers. 125. L. E. Reichl, in Long-Time Predictions in Dynamics. V. Szebehly, B. D. Tapley, and D. Reidel, Eds., Dordrecht-Holland, 1976, pp. 71-98. Statistical Behavior in Conservative Classical Systerns. 126. N. De Leon and C. C. Marston, J. Chem. Phys., 91, 3405 (1989). Order in Chaos and the Dynamics and Kinetics of Unimolecular Conformational Isomerization.
176 Visualizing Molecular Phase Space 127. D. L. Freeman and J. D. Doll, J. Chem. Phys., 101, 848 (1984). Fourier Path Integral Methods for the Calculation of the Microcanonical Density of States. 128. J. 0. Hirshfelder and E. Wigner, J. Chem. Phys., 7,616 (1939).Some Quantum-Mechanical Considerations in the Theory of Reactions Involving an Activation Energy. 129. W. H. Miller, J. Chem. Phys., 61, 1823 (1974). Quantum Mechanical Transition-State Theory and a New Semiclassical Model for Reaction Rate Constants. 130. J. E. Straub and B. J. Berne, J. Chem. Phys., 83, 1138 (1985). A Statistical Theory for rhe Effect of Nonadiabatic Transitions on Activated Processes. 131. J. C. Keck, J. Chem. Phys., 32,1035 (1960).Variational Theory of Chemical Reaction Rates Applied to Three-Body Recombination. 132. R. S. MacKay, Physics Lett., A145, 425 (1990). Flux Over a Saddle. R. S. MacKay, Nonlinearity, 4, 155 (1991). A Variational Principle for Invariant Odd-Dimensional Submanifolds of an Energy Surface for Hamiltonian Systems. 133. H. Goldstein, Classical Mechanics, 1st edit., Addison-Wesley, Reading, MA, 1950. 134. P. J. Channel1 and C. Scovel, Nonlinearity, 3, 231 (1990).Symplectic Integration of Hamiltonian Systems. 135. L. F. Shampine and M. K. Gordon, Computer Solution of Ordinary Differential Equations: The Initial \blue Problem, W. H. Freeman, San Francisco, 1975. 136. A. M. Ozorio de Almeida, Hamiltonian Systems: Chaos and Quantization, Cambridge University Press, Cambridge, UK, 1988. 137. M. C. Gutzwiller, Chaos in Classical and Quantum Mechanics, Springer-Verlag,New York, 1990. 138. E. B. Stechel and E. J. Heller, Annu. Reu. Phys. Chem., 35,563 (1984).Quantum Ergodicity and Spectral Chaos. T. A. Heppenheimer, MOSAIC, 20, 2 (1989). Classical Mechanics, Quantum Mechanics, and the Arrow of Time. M. C. Gutzwiller, Sci. Am., 266,78 (1992). Quantum Chaos.
CHAPTER 4
Computational Studies in Nonlinear Dynamics
Raima Larter* and Kenneth Showalter†
*Department of Chemistry, Indiana University-Purdue University at Indianapolis (IUPUI), Indianapolis, Indiana 46202, and †Department of Chemistry, West Virginia University, Morgantown, West Virginia 26506
INTRODUCTION: NONLINEAR DYNAMICS AND UNIVERSAL BEHAVIOR
For many centuries science has been largely reductionist in character, and theoretical chemistry has been greatly influenced by this prevailing philosophy. A body of experimental knowledge and insights coalesced in the early twentieth century to form a theory of the structure of matter that emphasized its particulate nature: matter could be reduced to its fundamental components, that is, atoms. These were, in turn, decomposable into subatomic particles. At every level of description, the "particles" interacted with one another to give rise to different phenomena, for example, chemical bonds that appear when electrons form pairs. This view of matter as being composed of fundamental particles, interacting through laws embodied in the formalism of quantum mechanics, inspired a flurry of experimental activity that drove the development of the modern science of chemistry. A reductionist view would, perhaps, define chemistry in the following way:
The properties of atoms and the subatomic particles which comprise them, particularly the electrons, can explain the transformation of matter through chemical reaction.
The addition of details (such as the electron shell structure, various bonding theories, etc.) has rendered this view enormously successful in explaining many of the phenomena of interest to chemists. The twentieth century has seen an unprecedented acceleration in scientific activity. This has resulted in a view that emphasizes the enormous complexity and diversity of nature, a view that is truly overwhelming. The time has long since passed when any one person could grasp the whole of science. Theoretical chemists and physicists, struggling to keep up with these developments, have found that even with the fastest computers it is often impossible to take a strictly reductionist view in modeling anything more complex than small inorganic or organic molecules, not to mention the physical and chemical processes in which these molecules are involved. The problem is twofold: first, the size of a large molecular system of interest often renders the computational problems intractable. Second, and this is the more important part of the problem, the results of a highly reductionist (e.g., ab initio) calculation may be so complex and detailed that the human investigator is unable to fully absorb their meaning. Imagine carrying this reductionist approach to an absurd level, calculating the total wavefunction and all the available energies of a large molecule (e.g., a large protein, one of an estimated 30,000 in the human body). Even if it could be done (and some investigators are trying), what would be the meaning of such a result? The above description of the current state of affairs in theoretical and computational chemistry has been deliberately exaggerated to emphasize the features of the reductionist philosophy that make it different from other philosophies of science. Many investigators currently working in computational chemistry do not take a strictly reductionist approach to their problem of interest. Ab initio methods are not always used, for example, when it is clear that a semiempirical, semiclassical, classical, or even continuum approach better describes the problem. Even when an ab initio approach is used, however, there is an implicit assumption that the level of description does not need to go "below" that of the electrons and nuclei; that is, it is not necessary to include the properties of quarks, the elementary particles in the current "standard model" of the structure of matter that prevails in the study of high energy physics. The recent announcement of the discovery of the top quark, for example, will have no effect whatsoever on most studies in the computational chemistry community. Indeed, even if the discovery had been reversed, and the announcement had been that "there is no top quark," computational chemistry would be largely unaffected. Computational chemists have assumed, and rightly so, that the proper level of description for their problems is usually the molecular level, which is decoupled from the sub-subatomic particle level.
Whereas it seems obvious that the structure of matter below the level of the electrons and nuclei can be neglected when considering questions in theoretical chemistry, it is not so obvious what one should neglect, if anything, when trying to make sense of the overwhelmingly complex world of molecular biology. Indeed, individual molecules, and even small portions of molecules such as the active site of an enzyme, seem to play enormously important roles in the function of living organisms. Knowing this, is it valid to neglect the existence of molecules when trying to describe biological phenomena such as the cell cycle, embryogenesis, metabolism, and so forth? Many investigators have decided that molecular-level information is crucial, and much (indeed most) of the recent work in biology occurs at this level of description. Some unifying principles have emerged out of this vast mountain of information, such as the idea of the gene as the information storage unit that turns on or directs cellular processes such as biosynthesis of proteins. Whereas these unifying ideas have been quite helpful in making sense of the bewildering array of ever-increasing knowledge, there are many questions that remain and that may not be answerable with this approach. Another approach to taming the overwhelming complexity of nature can be found in the field of nonlinear dynamics. Topics such as chaos, nonequilibrium thermodynamics, dissipative structures, and self-organizing systems are addressed in this general area of study. The underlying philosophy in this field is somewhat different from the prevailing reductionist philosophy described above. Rather than searching for the fundamental units of matter that underlie the phenomena of interest at a particular level, such as quarks, atoms, or genes, the search focuses on universal laws of dynamical systems that transcend levels of organization. These universal laws, found to apply at the molecular, cellular, organismic, or population levels, are to be contrasted with the fundamental laws of the reductionist approach, laws that generally apply only at the simplest (i.e., most fundamental) level of description. Universal laws describe dynamical systems in which the interacting "parts" might just as well be foxes and rabbits as oxidizing and reducing molecules or activators and inhibitors in a developing embryo. The similarity of the laws and associated phenomena that emerge in physically dissimilar and unrelated systems constitutes the evidence for the universality of these laws. If the interaction of foxes and rabbits is shown to lead to the same type of dynamical behavior as that observed in an enzyme-catalyzed reaction, for instance, the nonlinear dynamicist would count this as a success in the search for universal laws. In fact, many such universal phenomena have been discovered and described both quantitatively and qualitatively in the last several decades. The approach advocated in nonlinear dynamics has been developed as a response to the failure of the strictly reductionist approach in elucidating many phenomena of interest above the molecular level. Philip Anderson, a physicist, wrote more than 20 years ago1 on the necessity of the paradigm shift that has led to this area of science:
The reductionist hypothesis does not by any means imply a "constructionist" one: the ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe.
A “constructionist” approach might be something like statistical mechanics,
that is, a formalism that provides rigorous rules for starting at a "lower" level of description (like the particle-in-a-box model) and deriving a law known to hold at a "higher" level of description (such as the ideal gas law, which can be derived this way). A general formulation of this type would complement the reductionist approach but has not yet been developed and may not even be generally possible. Chemistry is often called "the central science" because the atomic/molecular level of description is midway between that of the sub-subatomic world of high energy physics and the supramolecular worlds of biology, condensed matter physics, or other disciplines in which "composites" are studied (see Figure 1). The field of nonlinear dynamics is one of the more truly interdisciplinary fields of study, a feature that is understandable in light of the emphasis on laws that transcend levels of organization and disciplinary boundaries. Nevertheless, chemistry has played a central role here, too, as some of the most illuminating experimental systems revealing the existence of those transcendent laws come from chemistry. In fact, many of the chemical examples have been found to exhibit phenomena that, prior to their discovery, were thought to be restricted to living systems. For instance, the observation of spontaneous spatiotemporal ordering in thermodynamically open chemical reactors, a phenomenon previously thought to be relegated to biological organisms, caused a rethinking of this particular definition of life. The boundary between the living and the nonliving has been distinctly blurred as a result of these and other studies. Chemistry has been central in another sense as well. Physicists, for example, uncomfortable with the nonreductionist approach necessary for the study of condensed matter, have looked to the study of chemistry for validation of this unfamiliar method of investigation. A recent review by S. Schweber2 in Physics Today commented on the changes occurring in the discipline of physics:

Traditionally, physics has been highly reductionist, analyzing nature in terms of smaller and smaller building blocks and revealing underlying, unifying fundamental laws. In the past this grand vision has bound the subdisciplines together. Now, however, the reductionist approach that has been the hallmark of theoretical physics in the 20th century is being superseded by the investigation of emergent phenomena, the study of the properties of complexes whose "elementary" constituents and their interactions are known. Physics, it could be said, is becoming like chemistry.
Figure 1 Levels of organization of matter in living systems: ecological systems, societies, organisms, cells, molecules, atoms, and quarks.
And chemistry is becoming like biology, which is becoming like meteorology, which is becoming like ecology. The study and comparison of emergent phenomena from all these fields has revealed the universal laws that are reviewed in the following sections. In this chapter we focus on the computational and theoretical techniques that have been brought to bear on these problems. Theory has often driven experiment in the field of nonlinear dynamics, another distinctive feature of this field of study. Turing patterns, for example, proposed in 1952 by the mathematician Alan Turing from his investigations of reaction-diffusion models,3 were studied extensively in the intervening years by many theoreticians before the first widely accepted experimental observation was reported in 1990.4 Whereas most theoretical or computational predictions have not preceded experimental confirmation by nearly 40 years, they have, in fact, often come before the experimental "discoveries." We now turn to a review of the computational techniques and methods that have been used in the search for universal laws of nonlinear dynamics, highlighting as we go along some of the associated phenomena that have emerged.
HOMOGENEOUS SYSTEMS
Well-stirred chemically reacting systems, hereinafter referred to as homogeneous systems, exhibit a variety of nonlinear phenomena illustrative of the universal behavior found at all levels of organization. Because these systems are homogeneous, the dynamics at one point in the reaction vessel are considered to be the same as the dynamics at any other point. In other words, time is the only independent variable that enters the description; spatial variations do not occur, so dependence on spatial coordinates or phenomena that depend on the existence of gradients (such as diffusion) need not be included here. Because the assumption of homogeneity can be made more rigorously for well-stirred chemical reactors than it can for, say, interacting predator and prey populations in an ecological system, chemical systems have provided important experimental tests of the theoretically predicted features of homogeneous nonlinear systems. One of the universal phenomena to emerge from these studies is that of bistability, a particular case of the phenomenon of multiple steady states.
Multiple Steady States
A system in which the dependent variables are constant in time is said to be in a steady or stationary state. In a chemical system, the dependent variables are typically densities or concentrations of the component species. Two fundamentally different types of stationary states occur, depending on whether the system is open or closed. There is only one stationary state in a closed system, the state of thermodynamic equilibrium.5 Open systems often exhibit only one stationary state as well; however, multistability may occur in systems with appropriate elements of feedback if they are sufficiently far from equilibrium.6 This phenomenon of multistability, that is, the existence of multiple steady states in which more than one such state may be simultaneously stable, is our first example of the universal phenomena that arise in dissipative nonlinear systems. Consider how a chemical system responds to an influx of reactants supplied by pumping fresh reactant solutions into a stirred reactor and simultaneously removing an equal volume of reaction mixture. This type of open system is typical of the classical continuous-flow stirred tank reactor or CSTR (see Figure 2) familiar to chemists and chemical engineers involved in commercial chemical processing.7 As the reactant flow rate is increased, the average time any species spends in the reactor (the residence time) decreases, and, because there is less time for the chemical reaction to occur, the composition of the reaction mixture becomes increasingly rich in reactants. The system relaxes to a particular steady state composition at each new value of the residence time. Thus, we anticipate a smooth transition from the product-rich steady state at
Figure 2 Schematic drawing of a continuous-flow stirred tank reactor (CSTR) in which a cubic autocatalytic reaction is occurring. The flow rate, kf, is the reciprocal of the reactor residence time.
low flow rates to the reactant-rich steady state at high flow rates. There are important exceptions to this typical behavior, however, that occur in chemical systems with elements of feedback such as autocatalytic reaction kinetics. Multiple stationary states, the topic of this section, are just one of several possibilities. Other types of dynamical behavior such as periodic oscillations and chaos are discussed, in turn, throughout this chapter.
Autocatalysis as a Source of Bistability
Many chemical systems are known to exhibit bistability when carried out in an open reactor, where the reaction may exist in either of two different steady states for the same experimental operating conditions (such as reactor residence time).7 This is a striking phenomenon, as the two steady states typically differ significantly in the extent of reaction and, hence, composition. The features of bistability are nicely illustrated by considering a simple reaction composed of two autocatalytic steps8,9
$$\mathrm{A + B \xrightarrow{k_q} 2B}, \qquad \text{rate} = k_q ab \qquad [1]$$

$$\mathrm{A + 2B \xrightarrow{k_c} 3B}, \qquad \text{rate} = k_c ab^2 \qquad [2]$$

where the lowercase letters indicate reaction mixture concentrations of A and B, respectively, and k_q and k_c are the rate constants for the quadratic and cubic rate laws. The overall rate law for this reaction in a CSTR is given by
$$\frac{db}{dt} = k_q ab + k_c ab^2 + k_0(b_0 - b) \qquad [3]$$
where a_0 and b_0 are the reactant stream concentrations. The rate constant k_0 for the inflow of the reactant stream and the outflow of reaction mixture is equal to the reciprocal of the reactor residence time. The rate law can be expressed in terms of a single variable (a or b), given that, by stoichiometry, the sum of the concentrations of A and B must equal the sum of the reactant stream concentrations according to a + b = a_0 + b_0:

$$\frac{db}{dt} = (k_q + k_c b)(a_0 + b_0 - b)b + k_0(b_0 - b) \qquad [4]$$
The steady state for a particular set of parameter values can be found by setting db/dt = 0, which results in a cubic equation for b. The steady state value of b as a function of k_0 for a particular set of input concentrations a_0 and b_0 is shown in Figure 3. A unique steady state is exhibited at low values of k_0, and the locus of steady state points eventually connects to the equilibrium state
Figure 3 Computed steady state iodide concentration, b in Eq. [4] or [I-] in Eq. [6], as a function of reciprocal residence time (given by k_0 + k_0'), which is proportional to flow rate. Steady states shown by the solid line are stable nodes (sn), and steady states shown by the dashed line are saddle points (sp). (Reprinted from Ref. 14 with permission of the American Institute of Physics.)
at k_0 = 0. There is a range of k_0 values, however, over which three steady states exist. The cubic equation admits three positive, real roots over this range, with two representing stable steady states and the third an unstable steady state. As described below, only the stable steady states are observed in experimental studies, although methods are now known for stabilizing and, hence, observing the unstable state.10-12 At high values of k_0 the system again becomes monostable, with a unique steady state.
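The cubic steady state condition is easy to explore numerically. The following Python sketch is an illustration only (it is not code from Ref. 14, and the rate constants and feed concentrations are arbitrary assumptions chosen to make the bistable window obvious): it expands db/dt = 0 from Eq. [4] into polynomial form and reports the physically meaningful roots at several flow rates.

```python
import numpy as np

# Steady states of Eq. [4]: 0 = (kq + kc*b) * (a0 + b0 - b) * b + k0 * (b0 - b).
# Expanding in powers of b gives a cubic whose real, non-negative roots are the
# steady state concentrations of the autocatalyst B.
kq, kc = 0.0, 1.0     # purely cubic autocatalysis for simplicity (assumed values)
a0, b0 = 1.0, 0.01    # feed rich in A with a trace of autocatalyst (assumed values)

def steady_states(k0):
    s = a0 + b0
    coeffs = [-kc,               # b^3
              kc * s - kq,       # b^2
              kq * s - k0,       # b^1
              k0 * b0]           # b^0
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real[real >= 0.0])

for k0 in (0.01, 0.10, 0.30):
    print(f"k0 = {k0:4.2f}   b* = {steady_states(k0)}")
# One steady state appears at low and at high flow rates, but three appear in
# between; the middle root is the saddle branch of Figure 3, the outer two the
# stable nodes.
```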
The Iodate-Arsenite Reaction
The simple autocatalytic model system described in the previous section can be directly correlated with an experimental system known to exhibit bistability, the iodate-arsenite reaction. This system can be described in terms of only one dynamic variable; it is thus a uniquely simple experimental system.13,14 It is, in fact, accurately described by reactions [1] and [2] when carried out in buffered solutions and with iodate the stoichiometrically limiting reagent. This can be seen by considering the net reaction

$$\mathrm{IO_3^- + 3H_3AsO_3 \;\underset{k_2}{\overset{k_1}{\rightleftharpoons}}\; I^- + 3H_3AsO_4} \qquad [5]$$
and the rate law for the production of the autocatalytic species, iodide,
$$\frac{d[\mathrm{I^-}]}{dt} = (k_1 + k_2[\mathrm{I^-}])([\mathrm{IO_3^-}]_0 + [\mathrm{I^-}]_0 - [\mathrm{I^-}])[\mathrm{I^-}][\mathrm{H^+}]^2 - k_0([\mathrm{I^-}] - [\mathrm{I^-}]_0) \qquad [6]$$
where the stoichiometric relationship [IO3-] = [IO3-]_0 + [I-]_0 - [I-] has been used, k_1 and k_2 are the rate constants for reaction [5], k_0 is the reciprocal of the reactor residence time (or the flow rate), and [I-]_0 and [IO3-]_0 are the feed stream concentrations. The equivalence of Eq. [6] and Eq. [4] can be seen by making the following assignments for the variable species and the parameters: b = [I-], b_0 = [I-]_0, a = [IO3-], a_0 = [IO3-]_0, k_q = k_1[H+]^2, and k_c = k_2[H+]^2.
A typical experimental study of bistability requires monitoring the steady state concentration of a particular species as a function of a bifurcation parameter such as reactant flow rate. (Bifurcation parameters are described in more detail in a following section.) A convenient species to monitor in the iodate-arsenite reaction is iodide, the autocatalyst. Figure 4 shows the steady state iodide concentration as a function of the reciprocal residence time, k_0. As the flow rate is increased, displacing the system from equilibrium (where the extent of reaction, and iodide concentration, is high), the iodide concentration gradually drops off. At a particular value of k_0, a discontinuous decrease in the steady state iodide concentration occurs; the new, lower steady state concentration decreases further as the flow rate continues to increase. If the flow rate is now decreased, the system retraces its path to the point where the discontinuous transition occurred. Rather than "jumping" back to the high-iodide state, however, the system remains in the low-iodide state over a range of lower flow rates. The system eventually undergoes a discontinuous transition to the high-iodide state, where it then retraces its earlier path as the flow rate is further lowered. Between the two discontinuous transitions, the system may exist in either a high-iodide steady state or a low-iodide steady state. Of course the concentrations of all the other species in the system reflect the differences in iodide concentration; in particular, the iodate concentration is given by the conservation relation above. Which state the system exhibits depends on its past history, that is, whether it has been approached from high or low flow rates. Thus, for any particular flow rate in the bistability range, the extent of reaction may be very high or very low, even though the reactant concentrations, residence time, and so forth are identical. Many studies of the dynamics of bistable systems have been carried out, with particular attention paid to the relaxation kinetics near the transition points.13,15

Figure 4 Experimental steady state iodide concentrations as a function of reciprocal residence time. Arrows indicate transitions from one steady state to another. (Reprinted from Ref. 14 with permission of the American Institute of Physics.)
One further aspect of bistability that may be of practical importance is the existence of what are referred to as mushrooms and isolas. If a second, independent flow of reaction mixture is added to the system described by reactions [1] and [2], two regions of bistability may be exhibited. Figure 5 shows how there are regions of bistability at high and low flow rates in such a system. This pattern of steady states has a simple explanation: the system is moved away from thermodynamic equilibrium and complete reaction not only as the flow rate is increased but also as it is decreased, giving rise to two regions of bistability. This occurs when the flow rate is decreased because the reaction mixture is "washed out" by the constant flow rate of the second input stream. As the flow rate of the second input stream is increased, an even more intriguing type of bistability is exhibited, the isola. Figure 6 shows that with this change, the base of the mushroom is "squeezed off," leaving only the "cap." Now the high-iodide (or high extent of reaction) branch of steady states is inaccessible by variation of the flow rate alone. As it is increased or decreased, the system remains in the low-iodide state without ever accessing the branch of high-iodide states. In fact, the only direct way the system can reach this branch is by external perturbations, which must be of sufficient magnitude to cause the transition from one branch to the other.

Figure 5 Computed steady state iodide concentration, b in Eq. [4] or [I-] in Eq. [6], as a function of reciprocal residence time showing mushroom behavior. (Reprinted from Ref. 14 with permission of the American Institute of Physics.)

Figure 6 Computed steady state iodide concentration, b in Eq. [4] or [I-] in Eq. [6], as a function of reciprocal residence time showing isola behavior. (Reprinted from Ref. 14 with permission of the American Institute of Physics.)

Isolas have important practical implications for chemical manufacturing: because the extent of reaction is typically higher on the isolated branch of steady states, it may be desirable to operate a chemical reaction on this branch. To reach this branch, it is necessary to perturb the system, and one must know the appropriate perturbation for the transition to occur. Hence, with a good understanding of the actual kinetics, computational studies of bistability could permit a substantial increase in yield. We have restricted our discussion in this section to bistability in well-stirred, homogeneous systems. Multiple steady states may also occur in unstirred systems, where domains of the system in one steady state coexist with domains in the other steady state. In addition to the obvious application to nonchemical systems, chemical systems (in fact the iodate-arsenite system considered here) sometimes exhibit domains that are connected by propagating reaction-diffusion fronts. We will return to this system in our discussion of chemical waves, which will include a description of these fronts.
Bistability as a Universal Phenomenon
As mentioned previously, bistability is an example of a universal phenomenon that arises in dissipative nonlinear systems. Its existence is largely independent of the identity of the interacting parts but strongly dependent on the type of
interaction between these parts, that is, on the dynamics. In the iodate-arsenite example, the iodide species plays a key role because its concentration is affected by two competing influences: one effect leads to the production of iodide, and a second leads to its decrease. These two influences correspond to the positive and negative terms, respectively, on the right-hand side of Eq. [6]. It is the combination of autocatalytic production and a consumption process that leads to bistability. An interesting illustration of the universality of the bistability phenomenon can be seen by comparing the iodate-arsenite chemical system to a totally unrelated system from population biology (unrelated, that is, except in the dynamical sense). The population dynamics of the spruce budworm has been described16 in terms of a model that includes the growth of the population of budworms, N, as a result of reproduction (proportionality factor r), the limits on that growth that occur because of a finite food supply (inversely related to the carrying capacity K), and a further limit on the population of the budworms resulting from predation by birds. The population of spruce budworms, N, thus satisfies the following growth equation
$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right) - p(N) \qquad [7]$$

where the quantity p(N) is a nonlinear function of N that describes the predatory effect

$$p(N) = \frac{BN^2}{A^2 + N^2} \qquad [8]$$
The shape of this predation function is typical of simple predator-prey systems such as the spruce budworm-bird system. The parameters A and B can be found by fitting observational data to the function, a method analogous to the typical techniques used to find rate constants in chemical systems. The steady states of this system are found by setting dN/dt = 0, just as for the iodate-arsenite example, and solving the resulting polynomial in N, which, once the trivial solution is eliminated, is found to be a cubic equation. As for the chemical example described previously, we find that the steady state equation has one or three roots depending on the values of the various parameters and that, under certain conditions, two of these are simultaneously stable. Thus, the spruce budworm dynamics exhibits bistability and is said to be dynamically equivalent to that of the iodate-arsenite reaction. The growth of the spruce budworm population resulting from reproduction is offset by a potential decrease caused by the two effects of a finite carrying capacity and predation by birds; these opposing influences have the same effect on the system dynamics as do the autocatalytic production of iodide in combination with limits on its growth owing to stoichiometry and flow of the solution, which removes iodide from the reactor. Because of equivalencies of this type, cross-fertilization of ideas and
techniques between the fields of population dynamics and chemical kinetics has occurred frequently in the area of nonlinear dynamics.
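This dynamical equivalence is easy to verify numerically. The sketch below is an illustration only: it assumes the predation function in the standard form of Eq. [8], and the parameter values are arbitrary choices that happen to fall inside the bistable regime. Dividing the nontrivial steady state condition by N leaves a cubic whose positive real roots are the steady state populations.

```python
import numpy as np

# Nontrivial steady states of the spruce budworm equation: setting dN/dt = 0 in
# Eq. [7] and dividing out the trivial root N = 0 gives
#   r*(1 - N/K)*(A**2 + N**2) - B*N = 0,
# a cubic in N.  Parameter values below are illustrative assumptions.
r, K, A, B = 0.5, 10.0, 1.0, 1.0

coeffs = [-r / K,                 # N^3
          r,                      # N^2
          -r * A**2 / K - B,      # N^1
          r * A**2]               # N^0
roots = np.roots(coeffs)
real = np.sort(roots[np.abs(roots.imag) < 1e-9].real)
print("steady states N* =", real)
# Three positive roots appear (about 0.68, 2.0, and 7.32): the outer two are
# stable (a low "refuge" population and a high "outbreak" population), and the
# middle one is unstable, exactly as in the bistable chemical system.
```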
Normal Forms
The fact that bistability is found in such disparate systems as autocatalytic chemical reaction kinetics and predator-prey dynamics such as that associated with the spruce budworm has led to the concept of normal forms,17 dynamical models that illustrate the phenomenon in question and are the simplest possible expression of this phenomenon. Physically meaningful equations, such as the reaction rate law for the iodate-arsenite system described above, can, in principle, always be reduced to the associated normal form. Adopting the usual notation of an overhead dot for time differentiation, the normal form for bistability is the following
$$\dot{x} = rx - x^3 \qquad [9]$$
Notice that the normal form is simpler than either of the examples we have considered in that no quadratic term exists. This is because quadratic nonlinearities are not important for the existence of bistability; it is the cubic term that is crucial for this particular phenomenon. Thus, the normal form retains only the features necessary for the universal dynamical phenomenon in question. The normal form for bistability has the following steady state solutions:

$$x^* = +\sqrt{r}, \qquad x^* = 0, \qquad x^* = -\sqrt{r} \qquad [10]$$
If r is negative, the first and last of these will not be physically meaningful (because x is a variable with physical significance such as a chemical species concentration or a population and, hence, cannot be imaginary). However, if r is positive, all three roots will be valid. As will be seen below, only the first and last of these turn out to be stable.
Bifurcations and Stability Analysis
The stability of the steady state solutions is of great importance in the analysis of dynamical systems, and it is often found that the stability changes suddenly at certain critical values of the parameters. These sudden changes have come to be known as bifurcations because in their earliest forms they were evidenced by steady state curves that branched (or bifurcated) at these critical parameter values. A bifurcation generally occurs when the stability of one or more of the solutions changes. The stability properties of interest may be either local or global in nature; that is, the solution may be found to be stable or unstable to small perturbations (this is the local case), but this property may be different for large perturbations (the global case). Bifurcation points are generally found by considering only the local stability properties of a particular solution. Consider, first, only steady state solutions of a dynamical system. As in the examples in the previous section, the dynamical system may be a single ordinary differential equation (ODE) such as a rate equation or population growth equation, in the case where there is a single dynamical variable. If there is more than one dynamic variable, the dynamical system consists of the same number of usually coupled ODEs. In the former case of a single variable, the dynamical system is thus given by a single equation of the form
$$\dot{x} = f(x) \qquad [11]$$
where f(x) is, generally, a nonlinear function. For chemical systems, f usually has a power law form, although it can be more complex, such as when f is a ratio of polynomials for enzyme kinetics derived using the quasi-steady-state assumption, very similar to the predation "rate law" encountered in the spruce budworm example above. We let x* be a steady state solution of the dynamical equation, that is, x* satisfies:

$$f(x^*) = 0 \qquad [12]$$
As we have seen, there may be more than one solution to Eq. [12]. We will consider each steady state solution in turn, that is, we will carry out a local stability analysis for each x* that satisfies Eq. [12]. Because the local analysis will involve a linearization of the full equations, this type of analysis is also called a linear stability analysis. The stability analysis begins by applying a small perturbation, η, to the steady state solution of interest:

$$x = x^* + \eta \qquad [13]$$
The variable x in Eq. [13] is a function of time because the perturbation, η, is a function of time. The time dependence of the variable x is given, of course, by Eq. [11], because the general dynamical equation of motion applies to any x. Substituting Eq. [13] into the general Eq. [11], we can see the time dependence of x explicitly:

$$\dot{x} = \dot{x}^* + \dot{\eta} = f(x^* + \eta) \qquad [14]$$
This equation can be simplified immediately because we know, by definition, that

$$\dot{x}^* = 0 \qquad [15]$$
Therefore, we have

$$\dot{\eta} = f(x^* + \eta) \qquad [16]$$
Equation [16] is exact but cannot, in general, be solved analytically. At this point in the derivation we will use the fact that η is small, so we can approximate Eq. [16] with a simpler, but analytically soluble, equation. We first expand the function f in a Taylor series about the point x = x*:

$$f(x^* + \eta) = f(x^*) + f'(x^*)\eta + \frac{1}{2}f''(x^*)\eta^2 + \cdots \qquad [17]$$
Substituting Eq. [17] into [16], which is, again, the exact equation of motion for the perturbation η, we find the approximate equation of motion

$$\dot{\eta} = f(x^*) + f'(x^*)\eta + \frac{1}{2}f''(x^*)\eta^2 + \cdots \qquad [18]$$
To simplify Eq. [18] further, we first recall Eq. [12], which allows us to set the first term on the right-hand side to zero. Then, given that η << 1, we can safely drop all terms above the linear one, leading us to the linearized equation:

$$\dot{\eta} = f'(x^*)\eta \qquad [19]$$
Because f'(x*) is a constant (i.e., time independent), Eq. [19] can be solved analytically to yield

$$\eta(t) = \eta(0)\, e^{ct}$$

where c = f'(x*) is a constant whose sign determines the stability properties of the reference solution x*. A positive value of the constant c will lead to an exponentially growing perturbation. On the other hand, if c < 0, η(t) decays to zero as t becomes large. Hence, the sign of c determines whether the perturbation added to the steady state x* will persist or will die out. The general conclusion of the local stability analysis for a one-variable system is summarized thus: if f'(x*) > 0, then x* is unstable; if f'(x*) < 0, then x* is stable.
Example
As an example of the linear stability analysis, we consider the normal form for bistability. The function f(x) is here f(x) = rx - x^3. Therefore, f'(x) = r - 3x^2. Substituting each of the three steady states in turn, we find that:
$$x_1^* = -\sqrt{r} \;\Rightarrow\; f'(x_1^*) = -2r \;\Rightarrow\; \text{stable}$$
$$x_2^* = 0 \;\Rightarrow\; f'(x_2^*) = r \;\Rightarrow\; \text{unstable}$$
$$x_3^* = +\sqrt{r} \;\Rightarrow\; f'(x_3^*) = -2r \;\Rightarrow\; \text{stable}$$
Since r must be positive for the first and third roots to be physically meaningful, we see that the stability analysis yields steady states of alternating stability. If r is negative, only one root (the trivial solution) exists, and it can easily be shown to be stable in this situation. The stability information can be summarized in a bifurcation diagram (see Figure 7), which shows the three steady states branching (i.e., bifurcating) out of the trivial solution at the bifurcation point r = 0. A dotted curve is typically used to indicate an unstable solution, whereas a solid curve indicates a stable solution. The particular type of bifurcation illustrated in Figure 7 is, for obvious visual reasons, called a pitchfork bifurcation.

Figure 7 Pitchfork bifurcation in the normal form for bistability, Eq. [9].
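These conclusions can be confirmed in a few lines of code. The sketch below is a simple illustration (the value of r is an arbitrary positive choice): it evaluates f'(x*) = r - 3x*^2 at each steady state of Eq. [9] and applies the stability criterion derived above.

```python
import numpy as np

r = 2.0                             # any positive value gives three steady states

def f_prime(x):
    return r - 3.0 * x**2           # derivative of f(x) = r*x - x**3

for x_star in (-np.sqrt(r), 0.0, np.sqrt(r)):
    c = f_prime(x_star)
    verdict = "stable" if c < 0 else "unstable"
    print(f"x* = {x_star:+.4f}   f'(x*) = {c:+.4f}   -> {verdict}")
# The two outer branches are stable and the middle branch is unstable,
# reproducing the pitchfork structure of Figure 7.
```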
Generalization to Multiple Variable Systems
The generalization of this approach to a system of n ODEs can be illustrated by considering the two-variable dynamical system:

$$\dot{x} = f(x, y), \qquad \dot{y} = g(x, y) \qquad [20]$$
where f and g are, in general, nonlinear functions of the variables x and y. To find the steady states of this system, both equations must be set equal to zero and solved simultaneously. Again, as for the one-variable case, more than one steady state solution is possible because the functions f and g are, in general, nonlinear. Consider one possible steady state solution (x*, y*) to the system [20]. Let u and v be small perturbations added to this steady state solution, that is, the perturbed quantities will be x(t) = x* + u(t), y(t) = y* + v(t). The dynamical equations of the perturbed system are

$$\dot{x}^* + \dot{u} = f(x^* + u,\; y^* + v), \qquad \dot{y}^* + \dot{v} = g(x^* + u,\; y^* + v) \qquad [21]$$
As with the one-variable system, we will carry out a Taylor series expansion of the functions f and g, utilizing the fact that u and v are both very small. For a general function of two variables, h(x, y), the Taylor series expansion about the point (x*, y*) will then be

$$h(x, y) = h(x^*, y^*) + \left.\frac{\partial h}{\partial x}\right|_{x^*, y^*} (x - x^*) + \left.\frac{\partial h}{\partial y}\right|_{x^*, y^*} (y - y^*) + \cdots \qquad [22]$$
In this notation, the subscripts indicate that the partial derivatives are to be evaluated at x = x* and y = y*. Using the general Taylor series expression, Eq. [22], for the functions f(x, y) and g(x, y) in Eq. [21], we first note that f(x*, y*) = g(x*, y*) = 0. After linearizing, we have

$$\dot{u} = \left.\frac{\partial f}{\partial x}\right|_{x^*, y^*} u + \left.\frac{\partial f}{\partial y}\right|_{x^*, y^*} v, \qquad \dot{v} = \left.\frac{\partial g}{\partial x}\right|_{x^*, y^*} u + \left.\frac{\partial g}{\partial y}\right|_{x^*, y^*} v \qquad [23]$$
The resulting system of equations, [23], can be solved using linear algebra techniques. We rewrite this system in matrix form by first defining the column vector w and the matrix J (the Jacobian evaluated at steady state) as

$$\mathbf{w} = \begin{pmatrix} u \\ v \end{pmatrix}, \qquad J = \begin{pmatrix} \partial f/\partial x & \partial f/\partial y \\ \partial g/\partial x & \partial g/\partial y \end{pmatrix}_{x^*, y^*} \qquad [24]$$
Then, Eqs. [23] can be written in the much simpler form

$$\dot{\mathbf{w}} = J\mathbf{w} \qquad [25]$$
The solution of a linear system such as this is a linear combination of exponential functions of the form

$$\mathbf{w}(t) = c_1 \mathbf{v}_1 e^{\lambda_1 t} + c_2 \mathbf{v}_2 e^{\lambda_2 t} \qquad [26]$$

where v_1 and v_2 are eigenvectors of J, and λ_1 and λ_2 are its eigenvalues.
It is generally not necessary to determine the eigenvectors v_1 and v_2 to determine the stability properties of interest. Only the eigenvalues (in particular, only the signs of the eigenvalues) are needed to determine whether the perturbation w will grow or not, as was the case for one-variable problems. For two variables we have an additional twist, however, as we now have two eigenvalues to consider; both signs need to be taken into account when drawing conclusions about stability. In addition, it is possible for the eigenvalues (and eigenvectors) to be complex in systems with more than one variable. The complex eigenvalues will always appear as complex conjugate pairs. Table 1 summarizes the stability categories that can exist in two-variable systems. In general, if the real part of either eigenvalue is positive, the solution in question is unstable. For the case of a saddle, the system is still considered unstable even though perturbations precisely in the direction of the eigenvector associated with the negative eigenvalue (an experimentally difficult situation) will decay away in time. A particularly interesting scenario arises when the eigenvalues are complex and the real part changes from negative (stable) to positive (unstable). This stability transition is known as a Hopf bifurcation and is one way in which the possibility of oscillatory phenomena arises. This is illustrated in the following section.
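The classification in Table 1 is easy to automate. The helper below is an illustrative sketch, not code from the chapter's references: it computes the eigenvalues of a 2 x 2 Jacobian with numpy and reports the stability category from the signs of their real and imaginary parts. The example matrices are assumed values chosen to exercise a few of the categories.

```python
import numpy as np

def classify(J, tol=1e-12):
    """Stability category of a steady state from its 2x2 Jacobian (see Table 1)."""
    lam = np.linalg.eigvals(np.asarray(J, dtype=float))
    re, im = lam.real, lam.imag
    if np.all(np.abs(im) < tol):          # both eigenvalues real
        if np.all(re > 0):
            return "unstable node"
        if np.all(re < 0):
            return "stable node"
        return "saddle"
    # otherwise a complex conjugate pair: a focus
    return "unstable focus" if re[0] > 0 else "stable focus"

examples = {
    "both real, negative":         [[-2.0,  0.0], [0.0, -0.5]],
    "real, opposite signs":        [[ 0.0,  1.0], [1.0,  0.0]],
    "complex, positive real part": [[ 1.0, -2.0], [2.0,  1.0]],
}
for label, J in examples.items():
    print(f"{label:30s} -> {classify(J)}")
```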
Oscillations
We consider first an abstract chemical model invented by a group of investigators in Brussels that has played an important role in the development of nonlinear dynamics. Because of its city of origin, it has become known as the Brusselator,6 or "oscillator from Brussels." The chemical mechanism associated with this abstract model is:
$$\mathrm{A \xrightarrow{k_1} X}$$
$$\mathrm{B + X \xrightarrow{k_2} Y + D}$$
$$\mathrm{2X + Y \xrightarrow{k_3} 3X}$$
$$\mathrm{X \xrightarrow{k_4} E}$$
The chemical "species" in this abstract model that are allowed to vary in time are X and Y; the others are all held constant over time. The laws of mass-action kinetics can be used to write down the dynamical equations for this system:
$$\frac{dX}{dt} = k_1 A - k_2 BX + k_3 X^2 Y - k_4 X$$
$$\frac{dY}{dt} = k_2 BX - k_3 X^2 Y$$
Table 1 Stability Categories for Two-Variable Systems

Re(λ_1)   Re(λ_2)   Im(λ_1)   Im(λ_2)   Stability category
+         +         0         0         Unstable node
-         -         0         0         Stable node
+         -         0         0         Saddle
+         +         ≠0        ≠0        Unstable focus
-         -         ≠0        ≠0        Stable focus
These equations can be made dimensionless by first choosing an appropriate scaling of the time variable, say, t = τ/k_4. Whereas dimensionless equations are not necessary for carrying out a stability analysis, they often simplify the associated algebra, and sometimes useful relationships between parameters that would not otherwise be readily apparent are revealed. It is also important to note that the particular choice of dimensionless variables does not affect any conclusions regarding the number of steady states, stability, or bifurcations; in other words, the dimensionless equations have the same dynamical properties as the original equations. Introducing the definition t = τ/k_4 into the above equations we find:

$$\frac{dX}{d\tau} = \frac{k_1 A}{k_4} - \frac{k_2 B}{k_4} X + \frac{k_3}{k_4} X^2 Y - X, \qquad \frac{dY}{d\tau} = \frac{k_2 B}{k_4} X - \frac{k_3}{k_4} X^2 Y$$
The constant term on the right-hand side of the first equation is (k_1 A)/k_4. This quantity has units of concentration and, so, can be used to scale the concentration variables. This is done by defining dimensionless concentrations u and v as follows:

$$u = \frac{k_4}{k_1 A} X, \qquad v = \frac{k_4}{k_1 A} Y$$
Rearranging these definitions, we have the following expressions for X and Y:

$$X = \frac{k_1 A}{k_4} u, \qquad Y = \frac{k_1 A}{k_4} v$$
which can be substituted into the rate equations, yielding:

$$\frac{du}{d\tau} = 1 - (b + 1)u + a u^2 v, \qquad \frac{dv}{d\tau} = bu - a u^2 v \qquad [32]$$
where we have used the definitions:

$$a = \frac{k_3 k_1^2 A^2}{k_4^3}, \qquad b = \frac{k_2 B}{k_4} \qquad [33]$$
We can now find the steady states of the dimensionless system, Eqs. [32], and determine their stabilities. The steady state equations are:
$$1 - (b + 1)u^* + a u^{*2} v^* = 0, \qquad bu^* - a u^{*2} v^* = 0 \qquad [34]$$
where (u*, v*) is the steady state solution. Solving the second of these for v* in terms of u*, substituting the resulting expression into the first equation, and solving for u* results in the single steady state solution for this system:

$$u^* = 1, \qquad v^* = \frac{b}{a} \qquad [35]$$
To find the stability of this steady state, we first determine the Jacobian matrix

$$J = \begin{pmatrix} -(b + 1) + 2auv & au^2 \\ b - 2auv & -au^2 \end{pmatrix} \qquad [36]$$
Now, evaluating the Jacobian J at the steady state (u*, v*) yields

$$J^* = \begin{pmatrix} b - 1 & a \\ -b & -a \end{pmatrix} \qquad [37]$$
Finally, finding the eigenvalues of this matrix yields the following characteristic equation

$$\lambda^2 + (a - b + 1)\lambda + a = 0 \qquad [38]$$
which has two roots

$$\lambda_\pm = \frac{(b - a - 1) \pm \sqrt{(a - b + 1)^2 - 4a}}{2} \qquad [39]$$
From this result, we see that it is possible for the eigenvalues to be complex if

$$(a - b + 1)^2 < 4a \qquad \text{or} \qquad -2\sqrt{a} < (a - b + 1) < 2\sqrt{a} \qquad [40]$$
Consider, for instance, a situation in which a is specified and b is allowed to vary; λ_± will have an imaginary part, then, if b is in the range

$$(a + 1 - 2\sqrt{a}) < b < (a + 1 + 2\sqrt{a}) \qquad [41]$$
For example, for a = 1, the pair of eigenvalues will have an imaginary part if 0 < b < 4. The real part of λ_± will be positive, Re(λ_±) > 0, if (b - a - 1)/2 > 0, or, in other words, if b > (a + 1). Again, for a = 1, we see that the steady state is unstable if b > 2. If b is also in the range of values for which λ_± has an imaginary part (i.e., 0 < b < 4), the solution to the linearized stability equation will thus be composed of linear combinations of functions of the form

$$e^{\mathrm{Re}(\lambda_\pm)\tau}\, e^{\pm i\, \mathrm{Im}(\lambda_\pm)\tau} \qquad [42]$$
So, if Re(λ_±) > 0, the small perturbation δ will evolve as an oscillatory function superimposed on an exponentially increasing envelope. If Re(λ_±) < 0, it will evolve with damped oscillations [see Figure 8(a,b)]. In the (u, v) plane, then, the perturbation applied to the steady state (u*, v*) will move away from the steady state point in a spiral manner if the steady state is unstable, and will spiral inward toward the steady state point if it is stable [see Figure 8(c,d)]. The type of steady state illustrated by the Brusselator example is called a focus because it is the pivot point for the spiraling trajectories that move toward or away from it. As we will see, the existence of a focus is often the prerequisite for the existence of oscillatory solutions to the full equations of motion. In particular, we look for an unstable focus (one for which the real part of the stability eigenvalues is positive) because the trajectories that spiral away from the focus may eventually reach a stable cyclic path surrounding that focus called a limit cycle. The change of stability that occurs when a focus goes from stable to unstable as indicated in Figure 8 is, as was mentioned previously, a specific type of bifurcation known as a Hopf bifurcation. A Hopf bifurcation occurs when Im(λ_±) ≠ 0 (i.e., we have a focus) and when Re(λ_±) = 0, which, of course, occurs when the stability of the focus changes as Re(λ_±) changes sign. As mentioned previously, a Hopf bifurcation is important because the eventual fate of the outwardly spiraling trajectories might be a stable limit cycle. To find the limit cycle, it is necessary to solve the full system of equations, not just the linearized ones, a task that must be done using numerical or computational methods. Analytical solution of the full equations of motion for these types of systems is not generally possible because of their nonlinearity. For the Brusselator example, it turns out that a solution of the full equations of motion shows that the outwardly spiraling trajectories do, indeed, end up on a stable limit cycle (see Figure 9). The limit cycle is an attractor just as a stable steady state is an attractor; the difference is that a limit cycle attractor is a closed path in the phase space, whereas a steady state attractor is a point. The limit cycle is an attractor because all trajectories in its vicinity will eventually find their way to the closed path associated with the cycle. We return to the concept of attractor in a later section but, for now, consider some of the details regarding numerical methods used in solving nonlinear dynamics problems.

Figure 8 Evolution of a small perturbation, δ, added to (a) an unstable focus and (b) a stable focus. In (a) the perturbation grows, whereas in (b) it dies out via damped oscillations. Phase plane portraits for the two-variable system, with small perturbations δ and ρ, are shown for the corresponding cases of (c) a stable focus and (d) an unstable focus.
Numerical Methods for the Solution of Ordinary Differential Equations
Numerical methods used to solve a system of ODEs are widely available in computational libraries and through texts such as Numerical Recipes.19 Certain considerations arise in the use of these standard techniques for nonlinear systems, particularly in models of chemical systems, which often consist of systems of stiff equations that require special care. Stiff equations are characterized by the presence of widely differing time scales, which leads to eigenvalues of the Jacobian matrix differing by many orders of magnitude.
Figure 9 Limit cycle oscillations calculated by numerically solving the dimensionless Brusselator, Eq. [32], with the parameter values a = 1 and b = 2.2 and a variety of initial conditions. The unstable focus is located at u = 1, v = 2.2. Solved using the Runge-Kutta routine, DiffEq-3D, with step size 0.05, available through Ref. 18.
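A numerical solution like the one behind Figure 9 can be sketched directly. The code below is a minimal re-creation under the caption's parameter values; it uses a hand-written fourth-order Runge-Kutta step rather than the DiffEq-3D routine of Ref. 18, and the initial condition is an assumed small displacement from the unstable focus.

```python
import numpy as np

a, b = 1.0, 2.2                          # parameter values quoted in Figure 9

def brusselator(w):
    u, v = w
    return np.array([1.0 - (b + 1.0) * u + a * u**2 * v,   # du/dtau, Eq. [32]
                     b * u - a * u**2 * v])                 # dv/dtau

def rk4_step(w, h):
    k1 = brusselator(w)
    k2 = brusselator(w + 0.5 * h * k1)
    k3 = brusselator(w + 0.5 * h * k2)
    k4 = brusselator(w + h * k3)
    return w + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

h = 0.05                                 # step size quoted in the figure caption
w = np.array([1.05, 2.2])                # small displacement from the focus (1, b/a)
for step in range(4001):
    if step % 500 == 0:
        print(f"tau = {step * h:6.1f}   u = {w[0]:7.4f}   v = {w[1]:7.4f}")
    w = rk4_step(w, h)
# The trajectory spirals away from (1, 2.2) and settles onto a closed orbit;
# after the transient, u and v repeat periodically (the limit cycle of Figure 9).
```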
Equations that arise in modeling the dynamics of homogeneous systems are initial value problems, generally approached with techniques of the Euler type. Initial value problems involve derivatives with respect to time; these must be discretized, which can be done using the forward Euler method

$$\dot{x}(t_k) \approx \frac{x_{k+1} - x_k}{h} \qquad [43]$$

or the backward Euler method

$$\dot{x}(t_k) \approx \frac{x_k - x_{k-1}}{h} \qquad [44]$$
In these formulas, the subscript refers to the discretized time step, and h is the size of the fixed time step, i.e., h = t_{k+1} - t_k. The difference between the two methods is in the choice of points that are used to estimate the time derivative at the kth point. The discretization used in the forward Euler method will lead to the following approximation of the ODE [where ẋ = f(x)]

$$x_{k+1} = x_k + h f(x_k) \qquad [45]$$
The forward Euler method is thus referred to as an explicit method, because x_{k+1} is taken to depend only on points that have been previously calculated, that is, x_{k+1} depends only on x_k. The discretization used in the backward Euler method leads to the following approximation of the ODE

$$x_{k+1} = x_k + h f(x_{k+1}) \qquad [46]$$
Because x_{k+1} appears on both sides of this equation, additional steps are required to solve for x_{k+1} before the approximation can be used to calculate it. (This can be done via iteration, such as through the Newton-Raphson method.) Hence, the backward Euler method is also referred to as an implicit method. The trapezoidal algorithm averages the information from the forward and backward Euler algorithms such that the iteration equation to be used is

$$x_{k+1} = x_k + \frac{h}{2}\left[f(x_k) + f(x_{k+1})\right] \qquad [47]$$
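The practical difference between the explicit and implicit updates is easiest to see on a stiff test problem. The sketch below is an illustration with an arbitrarily chosen linear test equation (not an example from the text); because the equation is linear, the implicit update of Eq. [46] can be solved in closed form, so no Newton-Raphson iteration is needed.

```python
# Stiff linear test problem: dx/dt = -lam * x, exact solution x(t) = exp(-lam * t).
lam = 50.0
h = 0.05            # step size deliberately too large for the explicit method
x_fwd = x_bwd = 1.0

for k in range(10):
    x_fwd = x_fwd + h * (-lam * x_fwd)       # forward Euler, Eq. [45]
    x_bwd = x_bwd / (1.0 + lam * h)          # backward Euler, Eq. [46], closed form

print(f"forward Euler:  x = {x_fwd:.3e}")    # grows in magnitude: |1 - lam*h| > 1
print(f"backward Euler: x = {x_bwd:.3e}")    # decays smoothly toward zero
```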
Sometimes the ODEs that arise in studies in nonlinear dynamics can be solved using explicit methods (such as the forward Euler), which require fewer computations per step and are thus cheaper and faster to implement. The Runge-Kutta family of algorithms is a popular implementation of the explicit methods. Runge-Kutta methods begin with a Taylor series expansion; the order of the particular Runge-Kutta method used is simply the highest order term retained in the Taylor series. Using explicit methods to solve the ODEs that arise in chemical kinetics often leads to numerically unstable situations in which solutions do not converge. Implicit methods tend to be more stable than explicit ones because the presence of x_{k+1} on the right-hand side of the equation acts as a source of feedback that imparts extra stability to the algorithm used to solve the discretized equations. The Gear algorithm20 is a very popular algorithm devised, in fact, to deal especially with stiff equations such as those that arise in nonlinear chemical kinetics models. In chemistry, such equations can sometimes be traced back to the presence of steps in the mechanism with widely different rate constants. Whatever the source, though, a stiff system of equations requires a special numerical algorithm with a self-adjusting step size, and the Gear algorithm is generally the method of choice. The type of stability imparted by the Gear method leads to its designation as a stiffly stable algorithm. The Gear algorithm is based on the idea of using multiple points along the trajectory to approximate the next point. In the simple forward and backward Euler iteration equations given previously, only a single prior point, x_k, is used to predict the next point, x_{k+1}. An nth order Gear algorithm will use n prior points to carry out the prediction. A table given by Parker and Chua in Practical Numerical Algorithms for Chaotic Systems21 is reproduced here (see Table 2) and shows the iterative equations used for the first- through sixth-order Gear algorithms. Note that here "order" refers to the number of prior points used in the algorithm, whereas in the Runge-Kutta routines "order" refers to the number of terms retained in the Taylor series expansion. In the Gear method, as in all multistep methods, another way to interpret the concept of "order" is that an nth order routine is entirely error free for exact polynomials up to order n. Indeed, the expansion coefficients in Table 2 have been derived by assuming a polynomial form of the appropriate order for the solution.

Table 2 Gear Method Iteration Formulas

Order   kth-Order Formula
1       x_{k+1} = x_k + h f_{k+1}
2       x_{k+1} = (1/3)(4x_k - x_{k-1}) + (2/3) h f_{k+1}
3       x_{k+1} = (1/11)(18x_k - 9x_{k-1} + 2x_{k-2}) + (6/11) h f_{k+1}
4       x_{k+1} = (1/25)(48x_k - 36x_{k-1} + 16x_{k-2} - 3x_{k-3}) + (12/25) h f_{k+1}
5       x_{k+1} = (1/137)(300x_k - 300x_{k-1} + 200x_{k-2} - 75x_{k-3} + 12x_{k-4}) + (60/137) h f_{k+1}
6       x_{k+1} = (1/147)(360x_k - 450x_{k-1} + 400x_{k-2} - 225x_{k-3} + 72x_{k-4} - 10x_{k-5}) + (60/147) h f_{k+1}

where f_{k+1} = f(x_{k+1}).
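As a concrete instance of the formulas in Table 2, the sketch below advances the same linear test problem with the second-order Gear formula; the implicit equation is again solved in closed form, and one backward Euler step supplies the second starting point. This is an illustration only, not production stiff-solver code (a practical Gear integrator also adjusts h and the order automatically).

```python
# Second-order Gear step for dx/dt = f(x) = -lam * x (Table 2, order 2):
#   x_{k+1} = (1/3)*(4*x_k - x_{k-1}) + (2/3)*h*f(x_{k+1})
lam, h = 50.0, 0.05
x_prev = 1.0
x_curr = x_prev / (1.0 + lam * h)      # one backward Euler step to start

for k in range(9):
    # solve the implicit equation in closed form for this linear f
    x_next = ((4.0 * x_curr - x_prev) / 3.0) / (1.0 + (2.0 / 3.0) * lam * h)
    x_prev, x_curr = x_curr, x_next

print(f"second-order Gear: x = {x_curr:.3e}")   # stable decay, more accurate than Eq. [46]
```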
Continuation Method for Steady State Computations
The computation of steady state solutions of the equations of motion may be as simple as using standard polynomial solvers from a library if the solution at only a single point in parameter space is desired. However, it is often desirable to compute the steady state solutions as a function of some parameter to generate, for example, a bifurcation diagram. In addition, more than one steady state solution may exist and, as we have seen, some of these solutions may be stable and others not. One technique that is used to calculate multiple steady state solutions and their stabilities over a range of parameter values is known as the continuation method. An excellent explanation of this method, as well as a FORTRAN listing of a program that utilizes it, is given in Marek and Schreiber.22 The following presentation is a summary of their text, and the reader is referred to it for further details. Often, the system of ODEs depends on a number of parameters but, for simplicity, we will consider the case where all but one of these is fixed. Hence, the steady state solutions will satisfy the system of algebraic equations:

$$f_i(x_1, x_2, \ldots, x_n, \alpha) = 0, \qquad i = 1, 2, \ldots, n \qquad [48]$$
where dx_i/dt = f_i and α is the single parameter of interest. For a particular value of α, a solution to the above system, in the form of x = {x_1, x_2, ..., x_n}, can be generated, given a suitable initial guess x^0, from a sequence of approximations x^1, x^2, ..., x^k, ..., using, for instance, Newton's method:

$$\mathbf{x}^{k+1} = \mathbf{x}^k + \lambda\, \Delta\mathbf{x}^k, \qquad k = 0, 1, \ldots \qquad [49]$$
In Newton's method, the quantity Δx^k satisfies the following system of linear algebraic equations

$$J(\mathbf{x}^k)\, \Delta\mathbf{x}^k = -\mathbf{f}(\mathbf{x}^k) \qquad [50]$$
where J is the Jacobian matrix, and λ (in Eq. [49]) is an iteration parameter. The value of λ is chosen to be unity provided that the magnitude of the right-hand side of the above equation for the (k + 1)th iterate is less than that for the kth iterate, that is, as long as

$$\|\mathbf{f}(\mathbf{x}^{k+1})\| < \|\mathbf{f}(\mathbf{x}^k)\| \qquad [51]$$
If the above condition is not satisfied, λ is halved until the condition holds. The method by which the initial guess x^0 is chosen forms the basis for the continuation method. The desired steady state solution is visualized as a curve in an (n + 1)-dimensional phase space spanned by the variables x_1, x_2, ..., x_n and the parameter α. A new variable, z, is defined to be the arc length of this curve in the augmented phase space (see Figure 10).

Figure 10 Augmented phase space illustrated for a one-dimensional system. The arc length of the curve in the augmented space {x, α} is z. The solution is determined at a reference point, here denoted α_0, and continued along the curve by determining the slope (here shown via the tangent line to the curve).

Given any point on the curve, that is, any steady state solution at a particular value of α, say α_0, the solution may be continued along the curve by determining the slope and direction of the curve as it proceeds out from this known point. Since the information about slope and direction need be known only very close to the given point α_0, a derivative of the steady state equation with respect to the arc length variable, z, may be taken. This procedure results in the following coupled system of equations:

$$\sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j} \frac{dx_j}{dz} + \frac{\partial f_i}{\partial \alpha} \frac{d\alpha}{dz} = 0, \qquad i = 1, 2, \ldots, n \qquad [52]$$
The unknowns in this system of equations are the quantities dx_1/dz, dx_2/dz, . . . , dx_{n+1}/dz. The equations can be rearranged to solve for these in terms of a particular one, dx_j/dz:
dx_i/dz = β_i (dx_j/dz)     i = 1, 2, . . . , j − 1, j + 1, . . . , n + 1     [53]
where the β_i are determined using linear algebra techniques, and x_{n+1} is defined as α. The quantity dx_j/dz on the right-hand side of this equation can be defined by taking the arc length to be unity, that is, by taking

Σ_{i=1}^{n+1} (dx_i/dz)² = 1     [54]

which, when rearranged, yields

(dx_j/dz)² = 1 / (1 + Σ_{i≠j} β_i²)     [55]
The two equations, [53] and [55], form a system of coupled ODEs with the variable z playing the role of the independent variable. Given initial conditions at a point z_0, these equations can be solved by standard numerical routines such as those discussed in the previous section. Because much computational effort is required to evaluate each β_i at each increment of the independent variable z, a method that does not require too many evaluations of the right-hand side of the iterative equation is desirable. Usually, a simple forward Euler routine is quite adequate for these purposes. If a multistep algorithm is used, the Adams-Bashforth method has been recommended by Kubicek and Marek23; the first-order Adams-Bashforth algorithm is, in fact, equivalent to the simple forward Euler algorithm. Computer codes for implementation of the continuation method are available in several places. A self-contained program known as AUTO24 has been developed by Eusebius Doedel and is available from the Applied Mathematics Department, California Institute of Technology, Pasadena. Alternatively, FORTRAN code for a similar routine is given in an appendix of the recent book by Marek and Schreiber.22
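The damped Newton iteration of Eqs. [49]-[51] is straightforward to implement. The sketch below solves for a steady state of a one-variable model at fixed α; the cubic model function and starting guess are illustrative placeholders, not a system from the text.

```python
import numpy as np

# Damped Newton iteration of Eqs. [49]-[51] for f(x, alpha) = 0 at fixed alpha.
def f(x, alpha):
    # illustrative cubic with multiple steady states
    return np.array([alpha * x[0] - x[0]**3 + 0.5])

def jac(x, alpha):
    return np.array([[alpha - 3.0 * x[0]**2]])

def damped_newton(x0, alpha, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        res = f(x, alpha)
        if np.linalg.norm(res) < tol:
            break
        dx = np.linalg.solve(jac(x, alpha), -res)   # J dx = -f   (Eq. [50])
        lam = 1.0                                   # iteration parameter (Eq. [49])
        # halve lambda until the residual decreases (Eq. [51])
        while np.linalg.norm(f(x + lam * dx, alpha)) >= np.linalg.norm(res) and lam > 1e-8:
            lam *= 0.5
        x = x + lam * dx
    return x

print(damped_newton([1.0], alpha=2.0))  # one branch of the steady state curve
```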
NONHOMOGENEOUS SYSTEMS
Turing Patterns: Nonhomogeneous, Steady State Patterns from Reaction-Diffusion Processes

We are used to thinking of diffusion as a process by which concentration gradients are reduced in time, and concentration inhomogeneities are generally "smoothed out." Thus the notion of diffusion playing a vital role in the spontaneous appearance of order and structure in chemical systems seems quite counterintuitive. However, exactly this idea was proposed by the British mathematician Alan Turing in his seminal 1952 paper: "The Chemical Basis of Morphogenesis." How can diffusion give rise to structure? By itself, it cannot. But when coupled with a chemical reaction having appropriate elements of feedback, the interaction between the dispersive forces of diffusion and the growth and inhibition of reaction can, and does, give rise to spontaneous patterns. Such spontaneously appearing structures are now known as Turing patterns.
Theoretical studies of Turing patterns, from Turing's original study onward, have typically used two-variable models based on the autocatalytic production of an "activator" species coupled with the generation of an "inhibitor" species which inhibits autocatalysis. These dynamical elements lie at the heart of the oscillatory systems discussed in the previous section. How does such a chemical system give rise to spontaneous pattern formation when coupled with diffusion? The key to the behavior lies in the activator and inhibitor species having different diffusivities. Specifically, the ratio of the diffusion coefficients for the inhibitor and activator must be sufficiently greater than unity, D_I/D_A > 1. When this condition is met, the spread of the activator species by outward diffusion is contained by the inhibitor species, which is subsequently generated in the same region but diffuses beyond the advancing front of autocatalytic reaction. There is a fine balance between the autocatalytic growth of the activator and the retardation of this growth by the inhibitor, which leads to the appearance of highly ordered patterns, from hexagonal arrays of spots to repeating parallel stripes. Although Turing's paper outlining just how such patterns might form was published more than 40 years ago, experimental evidence of Turing patterns has appeared only in the past few years. The most striking patterns have been found in the oscillatory CIMA (chlorite-iodide-malonic acid) reaction, which exhibits regular spots in quasi-one-dimensional25,26 and arrays of spots or stripes in quasi-two-dimensional gel reactors.27-29 As with all sustained nonequilibrium structures, these patterns appear in open systems, where a gel medium is continuously fed with fresh reactant solutions. Interestingly, Turing patterns can also be observed transiently in a closed system,30 just as transient oscillations are often observed in batch oscillatory reactions. It should be pointed out that, whereas the studies cited above have provided unequivocal experimental evidence for Turing patterns, patterns in precipitation reactions have also been considered to be examples of Turing patterns.31 We also note that spatiotemporal patterns have long been known to occur in combustion systems, and stationary patterns in thermodiffusive flames are perhaps the earliest experimental examples of Turing-like structures.32 There are a number of excellent treatments of the Turing instability. A classic and comprehensive account can be found in the book by Murray,16 which includes fascinating treatments of pattern formation in animal-shaped domains (to address the question of animal coat patterns!). A less advanced but highly recommended presentation can be found in the book by Edelstein-Keshet,33 which contains a wealth of information on mathematical treatments of biological systems. The following description is adapted from Ref. 34, which draws on both of these sources. For more advanced discussions, the reader may wish to consult the thermodynamics-oriented treatment of Nicolis and Prigogine.6 We consider the stability of a general two-variable system, first in the absence of diffusion and then with diffusion terms. Because the Turing bifurcation is a diffusion-induced instability, we will first show that the system is stable
in the absence of diffusion and then demonstrate the destabilizing effects of diffusion. Our system is given by

d[X]/dt = f([X], [Y])
d[Y]/dt = g([X], [Y])     [56]
where f and g are functions describing the chemical kinetics of species X and Y. We first review, from a slightly different perspective, the procedure outlined previously in which the linear stability of the steady state is analyzed. Consider the steady state of the well-stirred system, [X]_s, [Y]_s, and small perturbations that move the system away from the steady state, defined by x = [X] − [X]_s and y = [Y] − [Y]_s. These are substituted into Eqs. [56], and the resulting expressions are linearized by dropping nonlinear terms. As described earlier, this is formally carried out by writing Taylor series expansions for f([X], [Y]) and g([X], [Y]) around the steady state concentrations [X]_s, [Y]_s and retaining only the linear terms. This procedure yields equations for the evolution of the perturbation in the linear regime of the steady state

dx/dt = f_x x + f_y y
dy/dt = g_x x + g_y y     [57]
where the elements of the Jacobian matrix A are the derivatives f_x = ∂f/∂[X], etc., evaluated at the steady state. A particular solution to Eq. [57] is

x = C_1 exp(λ_1 t) + C_2 exp(λ_2 t)     [58]

with an analogous expression for y,
where C_1, C_2 give the initial amplitude of the perturbation, and λ_{1,2} are the eigenvalues of the Jacobian. For this two-variable system there are two eigenvalues, both of which must be real and negative or complex with a negative real part for the state to be stable. The eigenvalues of A are given by

λ_{1,2} = [Tr(A) ± √(Tr²(A) − 4 det(A))] / 2     [59]
and, for a stable steady state, the trace and determinant of A must satisfy

Tr(A) < 0     det(A) > 0     [60]
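In practice, the test of Eqs. [59] and [60] is a few lines of linear algebra. The sketch below classifies a steady state from its Jacobian; the matrix entries are illustrative numbers only.

```python
import numpy as np

# Stability test of Eqs. [59]-[60] for a two-variable Jacobian A
A = np.array([[1.0, -2.0],
              [1.0, -1.5]])          # illustrative activator-inhibitor Jacobian

tr, det = np.trace(A), np.linalg.det(A)
lam = np.linalg.eigvals(A)           # same result as Eq. [59]
stable = (tr < 0.0) and (det > 0.0)  # Eq. [60]
print(lam, "stable" if stable else "unstable")
```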
We now consider the effects of diffusion on an unstirred, spatially distributed system, where the steady state of the homogeneous system satisfies Eq. [60]. This corresponds to a spatial system that is stable in the presence of small, uniform perturbations. The spatially homogeneous steady state remains [X]_s, [Y]_s; however, we now allow for the possibility of this state being destabilized by the effects of diffusion. We consider a one-dimensional system, which, after linearizing around the steady state, is described by

∂x/∂t = f_x x + f_y y + D_X ∂²x/∂r²
∂y/∂t = g_x x + g_y y + D_Y ∂²y/∂r²     [61]
where r is the single spatial coordinate. With the addition of the diffusion terms, we now have a solution that depends on space and time according to

x = C_{1,k} exp(λt) cos(kr)     y = C_{2,k} exp(λt) cos(kr)     [62]
where C_{1,k} and C_{2,k} again represent the amplitude of the initial perturbation, which is now spatially inhomogeneous. The stability of the spatiotemporal system depends on the wavenumber k as well as the eigenvalue λ, which are evaluated from the matrix B

B = A − k² ( D_X   0  )   =   ( f_x − k²D_X        f_y       )
            (  0   D_Y )       (     g_x       g_y − k²D_Y   )     [63]
Again, the stability of the steady state is determined by the sign of λ, and for a diffusion-induced instability of this state, we require λ > 0. From the above analysis of the homogeneous system, we know this requires Tr(B) > 0 or det(B) < 0; however, we also know from Eq. [60] that Tr(B) < 0 for stability of the homogeneous system. We therefore conclude that a necessary requirement for the diffusion-induced instability is det(B) < 0.
H(k2) = -det(B) = -DxDy(k2)2 + (fxDy+ gyDx)k2- det(A) [64] where H ( k 2 ) > 0 defines the condition for instability. Because H ( k 2 ) is a parabola in k 2 opening downward, the criterion for instability then is a positive maximum. Taking the first derivative of Eq. [64] and substituting the result back into this expression gives
A rearrangement of this inequality yields

f_x D_Y + g_y D_X > 2 [D_X D_Y det(A)]^{1/2}     [66]
which, together with the inequalities of Eq. [60], constitute necessary and sufficient conditions for the Turing instability. The Turing instability arises from the combination of short-range activation and long-range inhibition. This can be seen by subtracting the first inequality in Eq. [60] from the first inequality in Eq. [66], after dividing the latter by D_X, to give

f_x (δ − 1) > 0     [67]
where δ = D_Y/D_X is the ratio of the inhibitor and activator diffusion coefficients. This inequality says that the inhibitor must diffuse "faster" and therefore have a larger spatial range than the activator. Thus if a random perturbation causes a local increase in activator concentration, the autocatalytic kinetics will cause the activator concentration to grow. The inhibitor is subsequently generated in this region. If the diffusivities of the activator and inhibitor were the same, that is, if δ = 1, then the subsequent appearance of the inhibitor would inhibit the autocatalysis and the system would return to the steady state. For some δ > 1, however, the outward diffusion of the inhibitor is sufficient for the autocatalytic growth to continue locally. On the other hand, the autocatalysis is spatially contained because it is surrounded by elevated concentrations of inhibitor. There is thus a critical value of the diffusion coefficient ratio, δ_cr > 1, above which H(k²) > 0 over a range of wavenumbers k_min < k < k_max. Perturbations with wavenumbers within this range will grow because the associated temporal eigenvalues are positive. Perturbations with wavenumbers outside this range will decay exponentially to the homogeneous steady state. Figure 11 shows the variation of H(k²) with k² for the cases of δ > δ_cr and δ < δ_cr. For a spatially unbounded system, the onset of instability (i.e., the Turing bifurcation) occurs at δ = δ_cr when H(k²) = 0, because the wavenumbers are dense over the H(k²) curve. For a spatially finite system with no-flux boundary conditions, however, only particular wavenumbers satisfy the boundary conditions because the functional form of the solution must be cos(kr). Thus, in a bounded one-dimensional system, perturbations will grow with wavenumbers given by
k_min < k = nπ/L < k_max     [68]

Figure 11 Plot of H(k²) as a function of the square of the wavenumber k² for δ > δ_cr and δ < δ_cr.
where L is the finite domain length, and the integer n is the number of half-wavelengths of the cosine solution. Similar arguments can be made for finite-length systems with periodic boundary conditions. The implication of Eq. [68] is that the onset of instability for finite-size systems typically occurs after the H(k²) curve has moved above the zero line by some finite extent, because the first wavenumber corresponding to the instability will not necessarily occur at H_max. If there are several wavenumbers spanning the range k_min < k < k_max that satisfy Eq. [68], the one associated with the largest temporal eigenvalue, Re(λ), will dominate in the initial growth. This is because all of the unstable modes present in the initial perturbation grow according to their associated temporal eigenvalues. The possible wavenumbers take on a different form for a finite two-dimensional system. Consider a rectangular domain with dimensions L_x and L_y, where each side of the rectangle is defined by no-flux boundaries. Again, for these boundary conditions, the solution will be of the form of a cosine function, but we must now consider both spatial directions. Only certain wavenumbers are allowed, and these are given by
k_min < k = π (n²/L_x² + m²/L_y²)^{1/2} < k_max     [69]
where, as in the one-dimensional case, the integers n and m give the number of half-wavelengths of the cosine solution, one for the x direction and one for the y direction. Again, all of the unstable modes present in the initial perturbation grow in time according to their temporal eigenvalues, and the fastest growing mode (with the largest eigenvalue) typically gives rise to the dominant pattern. This prediction is not absolute, however, because the linear analysis cannot anticipate how nonlinear terms might affect the evolution of the system to its asymptotic state. The above analysis also reveals certain features about the types of two-variable systems that can exhibit the Turing instability. We first consider the Jacobian elements f_x = ∂f/∂[X] and g_y = ∂g/∂[Y]. By comparing the first inequalities in Eqs. [60] and [66], we can see that these elements must have opposite signs. If our variables have been defined such that f_x > 0, then it follows that g_y < 0. A positive f_x means that X promotes its own formation, that is, X is the activator, whereas a negative g_y means that Y inhibits its own formation. In addition, Y may also inhibit the production of X, which is why it is called the inhibitor. This can be seen by considering the second inequality in Eq. [60]. We know from the above argument that (f_x)(g_y) < 0; it then follows that (f_y)(g_x) < 0 since det(A) > 0. This new requirement, that f_y and g_x be of opposite sign, gives rise to two possibilities for two-variable chemical mechanisms in which the Turing instability can occur. The Jacobian elements can take on either of the following sign patterns
( +  − )        ( +  + )
( +  − )   or   ( −  − )     [70]
The first corresponds to the classical activator-inhibitor system, where the elements f_y < 0 and g_x > 0 represent, respectively, Y (the inhibitor) inhibiting the formation of X (the activator), and X promoting the formation of Y. The second, with the opposite sign pattern for these off-diagonal elements, corresponds to a positive-feedback system such as the Gray-Scott model,35 where X is the autocatalyst and Y is typically a consumable reactant. In this case, both the autocatalyst and the reactant promote the formation of the autocatalyst, and, in turn, both species participate in the consumption of the reactant. In either case, a Turing instability can exist.

An Example System

The general treatment given above will now be illustrated by considering a simple two-variable chemical model. We shall examine the pattern formation that occurs in the Schnackenberg model,36 which is closely related to the Gray-Scott model35 and a member of the family of cubic autocatalysis models for chemical systems (a family that includes the Brusselator6). A detailed study of pattern formation in the Schnackenberg scheme has been carried out by Dufiet and Boissonade.37
The Schnackenberg model is given by the following four reactions:

A → X     (k_1)
B → Y     (k_2)
2X + Y → 3X     (k_3)
X → P     (k_4)     [71]

where the variable species are X and Y, and the reactant species A and B have fixed concentrations. The distributed, unstirred system is described by a set of two partial differential equations:
∂[X]/∂t = D_X ∇²[X] + k_1[A]_0 + k_3[X]²[Y] − k_4[X]
∂[Y]/∂t = D_Y ∇²[Y] + k_2[B]_0 − k_3[X]²[Y]     [72]
where [A]_0 and [B]_0 are the constant reactant concentrations. The conditions for the Turing instability for this system can be determined from Eqs. [60] and [66]. These give a region in parameter space where the Turing bifurcation takes place, shown in Figure 12, for the parameter values listed in Table 3. The left boundary corresponds to the Turing bifurcation locus and the right boundary to the Hopf bifurcation locus. Thus, the Turing region is bounded on the left by a region where the spatially uniform steady state is stable and on the right by a region in which spatially uniform but temporally periodic solutions (limit cycle oscillations) exist. We now examine two different sets of parameter values within the region of diffusion-induced instability for a two-dimensional rectangular domain. The first, point A (see Figure 12), is near the Turing bifurcation locus and should give rise to patterns similar to those predicted by the linear analysis. By substituting the corresponding parameter values (k_1[A]_0 and k_4) into an equation analogous to Eq. [64] for H(k²), we can find the range of wavenumbers between k_min and k_max over which the diffusion-induced instability occurs
This expression is satisfied according to Eq. [69] by two possible pairs of n and m values: (n, m) = (5, 0) and (4, 3). Which pattern is exhibited depends on the initial perturbation. When a particular perturbation is imposed on the system,
Figure 12 Constraint diagram (k_1[A]_0 / M s⁻¹ versus k_4 / s⁻¹) showing parameter values of the Turing bifurcation locus (left line) and Hopf bifurcation locus (right line). The Turing instability occurs for parameter values in the middle region. (Reprinted with permission from Ref. 34.)
the pattern corresponding to the (n, m) = (5, 0) pattern (shown in Figure 13) is displayed as the final asymptotic state. When a different perturbation is imposed, the (n, m) = (4, 3) pattern is exhibited as the asymptotic state (Figure 14). The growth of an initial perturbation to the final asymptotic pattern is complicated by the fact that the form of the perturbation and the growth rate are tightly interconnected. Thus it is difficult to predict the final asymptotic pattern, even where the linear analysis is expected to hold. Point B is much farther from the Turing bifurcation, and we anticipate the pattern formation to be more complex. Following the same procedure as described above, we find the following range of wavenumbers over which the diffusion-induced instability occurs
7.699 < (n² + m²)^{1/2} < 21.93     [74]
Table 3 Parameter Values for the Schnackenberg Model

k_1[A]_0 = 0.000-0.002 M s⁻¹
k_2[B]_0 = 0.001 M s⁻¹
k_3 = 2.57 × 10⁴ M⁻² s⁻¹
k_4 = 0.5-2.0 s⁻¹
δ = D_Y/D_X = 20.0
Figure 13 Pattern for parameters at point A in Figure 12 with (n, m) = (5, 0). Circles show grid points where x > (0.6)(x_max). (Reprinted with permission from Ref. 34.)
There is now a much wider range of possibilities for the values of n and m, corresponding to the appearance of many different patterns. Because many wavenumber pairs will be excited by any particular perturbation, we expect the final pattern to be a mixture of several different unstable modes. An example pattern appearing from a particular perturbation is shown in Figure 15. We now see a complex pattern without the anticipated symmetry seen in the previous examples. Over the past several years there have been many experimental and theoretical studies aimed at developing a better understanding of pattern formation in reaction-diffusion systems. The focus of recent studies has been on more complex behavior away from the onset of instability. For some parameter values, spatiotemporal chaos may occur near the boundary between the Turing region and the region of homogeneous oscillations (Figure 12).
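Calculations of this kind reduce to evaluating H(k²) and checking which allowed wavenumbers fall in the unstable band. The sketch below does this for a one-dimensional no-flux domain using illustrative Jacobian elements and diffusivities chosen to satisfy Eqs. [60] and [66]; the numbers are assumptions for demonstration, not the Ref. 34 parameter set.

```python
import numpy as np

# Linear Turing analysis: evaluate H(k^2) of Eq. [64] and list the unstable
# modes k = n*pi/L allowed by Eq. [68] on a 1D no-flux domain.
fx, fy, gx, gy = 1.0, -2.0, 1.0, -1.5   # illustrative Jacobian, sign pattern (+ -, + -)
Dx, Dy = 1.0, 10.0                      # inhibitor diffuses faster: delta = 10
L = 20.0                                # domain length

trA, detA = fx + gy, fx * gy - fy * gx
assert trA < 0.0 and detA > 0.0         # homogeneous steady state stable (Eq. [60])

def H(k2):                              # Eq. [64]; H > 0 signals instability
    return -Dx * Dy * k2**2 + (fx * Dy + gy * Dx) * k2 - detA

# the unstable band is bounded by the roots of H(k^2) = 0
a, b, c = -Dx * Dy, fx * Dy + gy * Dx, -detA
disc = b * b - 4.0 * a * c
k2_roots = sorted([(-b + np.sqrt(disc)) / (2.0 * a), (-b - np.sqrt(disc)) / (2.0 * a)])
kmin, kmax = np.sqrt(k2_roots[0]), np.sqrt(k2_roots[1])

modes = [n for n in range(1, 100) if kmin < n * np.pi / L < kmax]
print(f"unstable band {kmin:.3f} < k < {kmax:.3f}, growing modes n = {modes}")
```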
Figure 14 Pattern for parameters at point A in Figure 12 with (n, m) = (4, 3). Circles show grid points where x > (0.6)(x_max). (Reprinted with permission from Ref. 34.)
Chemical Waves: Propagating Reaction-Diffusion Fronts

Propagating fronts are ubiquitous in nature, for example, expanding bacterial colonies, advancing regions of corrosion on metals, or infectious diseases spreading through populations. Many different types of fronts can be formulated in terms of reaction-diffusion processes, for which propagating chemical waves serve as ideal model systems. Reaction-diffusion fronts are found in a number of simple autocatalytic reactions, where a narrow reaction zone propagates with a constant velocity and a constant wave form. Fronts convert reactants that are ahead of the wave into products which are left behind. The bulk of the chemical reaction occurs within the narrow reaction zone, similar to a propagating flame, but with no heat. In a flame, heat generated by the reaction ignites the reactants ahead, and the flame advances. In a reaction-diffusion front, diffusion of the autocatalytic species initiates autocatalysis ahead,
Figure 15 Pattern for parameters at point B in Figure 12. Circles show grid points where x > (0.8)(x_max). (Reprinted with permission from Ref. 34.)
and the front advances. Propagating fronts are typically studied in one-dimensional configurations, such as narrow tubes filled with reaction mixture, or in two-dimensional configurations, such as thin films of solution in a petri dish. Even though the heat evolution and density changes accompanying autocatalytic reactions are typically small, these may give rise to convective disturbances, and hence it is best to study reaction-diffusion waves in a gelled medium. The first report of propagating reaction-diffusion fronts appeared almost 90 years ago, when Robert Luther, a student of Wilhelm Ostwald, presented his paper at a meeting of the Deutsche Bunsengesellschaft für Angewandte Physikalische Chemie held in Dresden, Germany.38,39 In addition to a lecture demonstration showing propagating fronts in the permanganate oxidation of oxalate, he proposed an equation for their velocity of propagation
v = a (K D C)^{1/2}     [75]

where K is the rate constant for autocatalysis, D is the diffusion coefficient, C is the reactant concentration, and a is a constant with a value between 2 and 10.
The great physical chemist Walther Nernst happened to be in the audience and voiced his skepticism about Luther's velocity equation, asking, "Who has derived this formula?" When Luther said he did, Nernst asked, "But this is not published yet?" Luther replied, "No, but it is a simple consequence of the corresponding differential equation." Nernst said that he looked forward to seeing the "complete publication." It is easy to imagine that this exchange was not exactly congenial, and Luther's glib reply may not have been completely up front because the result is not a simple consequence of the differential equation. Luther's result was apparently lost in the literature, but the same velocity equation was derived some 30 years later in the seminal papers of the British statistician R. A. Fisher40 and the Russian mathematician A. N. Kolmogorov and his co-workers.41 Their analyses showed that the minimal velocity of a propagating front is given by Eq. [75], with the constant a = 2. Interestingly, Fisher's paper, entitled "The Wave of Advance of Advantageous Genes," was not about chemical reaction-diffusion waves. The history of this particular equation illustrates the influence of one field of study on another in the field of nonlinear dynamics. The eventual realization was that the same equation does, in fact, apply to both chemical waves and gene transport, as well as to many other examples of propagating fronts. In this section, we describe the basic features of propagating fronts in terms of the Fisher-Kolmogorov equation. There have been many treatments of this fundamentally important equation over the years, most following the analysis in Fisher's original paper (which is highly recommended for its lucid presentation). Together with many other reaction-diffusion problems, the Fisher-Kolmogorov equation is treated in the book by Murray,16 and advanced mathematical analyses of such problems can be found in the monograph by Fife.42 We also consider propagating fronts based on cubic autocatalysis as well as the lateral instabilities displayed by these fronts. We follow the treatments in Refs. 34, 43, and 44 in our discussion. For more detailed treatments of simple propagating reaction-diffusion fronts the reader may wish to consult Refs. 43 and 45.
Quadratic Autocatalysis Fronts

We begin by considering the simplest form of autocatalysis, which is characterized by a rate equation with a quadratic nonlinearity:

A + B → 2B,     rate = k_q[A][B]     [76]
In a distributed, unstirred solution, where reaction couples with diffusion, the space and time evolution of the system is described by the partial differential equations

∂[A]/∂t = D_A ∂²[A]/∂x² − k_q[A][B]
∂[B]/∂t = D_B ∂²[B]/∂x² + k_q[A][B]     [77]

where D_A and D_B are the diffusion coefficients of the species. Equation [77] describes behavior in one spatial dimension. In the case where the system initially contains only A, chemical reaction cannot occur unless some of the autocatalyst B is added. In a typical experiment, a small amount of B is introduced locally, perhaps electrochemically at an electrode, which serves to initiate the reaction. It is convenient to express the reaction-diffusion equation in dimensionless terms by scaling the concentration variables according to the initial reactant concentration ahead of the wave, α = [A]/a_0, β = [B]/a_0, and the space and time according to ξ = x(k_q a_0/D_B)^{1/2} and τ = k_q a_0 t:

∂α/∂τ = δ∇²α − αβ
∂β/∂τ = ∇²β + αβ     [78]
Here, δ = D_A/D_B is the ratio of the diffusion coefficients, ∇² = ∂²/∂ξ² is the one-dimensional Laplacian, and a_0 is the initial concentration of the reactant. These equations describe a front propagating in the positive ξ direction, where the boundary conditions are given by
α = 1, β = 0 for ξ → +∞     and     α = 0, β = 1 for ξ → −∞     [79]
Thus, there is only reactant A far ahead of the front and only product B far behind it. The corresponding initial conditions would be α = 1 everywhere at τ = 0 except for a small amount of the autocatalyst, β_0, over some local region. Figure 16 shows the evolution of a front described by Eqs. [78] following such an initiation. The net reaction of the autocatalytic process [76] is A → B; therefore, in the absence of diffusion or in the case of equal diffusivities, the reactant and product concentrations are linked by the conservation relation [A]_0 = [A] + [B]. In terms of the dimensionless variables and the above initial conditions, we have the relation α + β = 1 for the case of equal diffusivities, δ = 1. Hence, the
Figure 16 Evolution of reaction-diffusion fronts. Reaction is initiated locally by addition of a small amount of the autocatalyst (shown as the small rectangle at the origin). The time intervals between the four outermost curves are equal, and the constant propagation distances show that a constant velocity is exhibited. (Reprinted from Ref. 43 with permission of the American Chemical Society.)
concentration of reactant consumed is always reflected in the concentration of autocatalyst produced, at any point sufficiently far from the initial input of B. This allows us to express the evolution of the system in terms of a single variable, and we choose α:
∂α/∂τ = ∇²α − α(1 − α)     [80]
This is the Fisher-Kolmogorov equation, and whereas no analytical solution is known for it, a simple phase-plane analysis allows a determination of the minimal front propagation velocity. We can further simplify Eq. [80] by assuming a constant propagation velocity, a reasonable assumption because fronts observed in experimental systems typically exhibit constant wave speeds following a transient initiation period. Thus we introduce a coordinate system moving at the velocity of the propagating front. The new spatial coordinate, z = ξ − cτ, where c is the dimensionless wave velocity dξ/dτ, allows the front to be described in terms of an ordinary differential equation
d²α/dz² + c(dα/dz) − α(1 − α) = 0     [81]
where the boundary conditions are now

α → 1 as z → +∞     and     α → 0 as z → −∞     [82]
In this frame of reference, the reactant appears to flow into the front, which is stationary, and the product flows away from it. Equation [81] can be described in terms of two first-order ordinary differential equations, one for the concentration α and the other for the concentration gradient u:

dα/dz = u
du/dz = −cu + α(1 − α)     [83]

The first of these is simply the definition of the gradient, whereas the second is the original ODE expressed in terms of the gradient. Figure 17 shows the concentration-gradient phase plane, in which the trajectory corresponds to the concentration profile of the front. The trajectory connects two stationary states, the state (α, u) = (0, 0) at z → −∞ and the state (α, u) = (1, 0) at z →
Figure 17 Phase plane for quadratic autocatalysis front described by Eqs. [83]. The front profile corresponds to a trajectory emanating from the origin (saddle point) along its outset and approaching the singularity at (1, 0) along the (degenerate) eigenvector. (Reprinted from Ref. 43 with permission of the American Chemical Society.)
+∞. Note that the trajectory as a function of the moving coordinate z begins far behind the stationary front and ends far ahead of it. Restrictions on how the system can evolve in the phase plane will lead us to an expression for the minimal propagation velocity. First we note that 0 ≤ α ≤ 1 everywhere in the range −∞ < z < +∞, and that α = 0 and 1 only at the ends of this range. It follows from the boundary conditions that α must remain in this range as the system evolves. It also follows that the gradient u is positive in this range. The trajectory is, therefore, restricted to the region 0 ≤ α ≤ 1, u ≥ 0 of the phase plane, indicated by the unshaded region in Figure 17. The second step is to analyze the linear stability of the stationary states according to the methods presented earlier. Evaluating the Jacobian matrix A for the two-variable, first-order system [83] gives

A = (    0        1  )
    ( 1 − 2α     −c  )     [84]

which yields an expression for the eigenvalues in terms of α and the wave velocity c:

λ± = [−c ± √(c² + 4(1 − 2α))] / 2     [85]

Evaluating Eq. [85] at the stationary state (α, u) = (0, 0) yields

λ± = [−c ± √(c² + 4)] / 2     [86]
which corresponds to a saddle point with real eigenvalues of opposite sign.
Thus the system leaves this state along the eigenvector associated with the
positive eigenvalue, as shown in Figure 17. The subsequent evolution of the system follows the trajectory, corresponding to the concentration profile of the front, into the allowed region of phase space. This trajectory can be determined by forward numerical integration of Eqs. [83] with initial values chosen to be on the outset of the stationary state. Evaluating Eq. [85] at the stationary state (α, u) = (1, 0) yields
λ± = [−c ± √(c² − 4)] / 2     [87]
We see that the discriminant in this expression is negative for front speeds c < c* = 2, which gives a pair of complex conjugate eigenvalues with a negative real part. As summarized in Table 1, this corresponds to a stable focus, where
the system approaches the stationary state with damped oscillations; that is, the trajectory spirals into point (1, 0) as shown in Figure 18. We know from the above arguments concerning the allowed region of phase space that such a trajectory is not permitted, since it would require negative values of the concentration β (when α > 1) as well as negative values of the gradient u. For front speeds c = c* = 2, the two eigenvalues become real (and equal), which corresponds to a degenerate stable node (with a single eigenvector). Thus, the system approaches the point (1, 0) along the attracting eigenvector, as shown in Figure 17, and the connection between the two stationary states is now allowed. This then defines the minimal front speed c* = 2 for the Fisher-Kolmogorov equation, corresponding to reaction-diffusion fronts with quadratic autocatalysis. For front speeds c > c* = 2, Eq. [87] yields two real (negative) eigenvalues that are now different from each other, corresponding to a regular stable node. These wave speeds are also permitted, since the system approaches the (1, 0) state along the eigenvector associated with the eigenvalue of smaller absolute value, and the trajectory remains in the allowed region of phase space. Hence for quadratic autocatalysis, constant velocity, constant wave-form propagating fronts are allowed with any velocity c greater than some minimum velocity:

c ≥ c_min = c* = 2     [88]
Figure 18 Trajectory connecting the two stationary states for c < c* with an oscillatory approach to the (1, 0) state. (Reprinted from Ref. 43 with permission of the American Chemical Society.)
The minimum velocity corresponds to the value c* at which the stationary state (1, 0) ceases to be a focus and becomes a node. Generally, fronts initiated with a small, localized input of B evolve to the minimum velocity c = c_min following the decay of transient behavior. In dimensional form, the minimum velocity is

v_min = 2 (k_q a_0 D)^{1/2}     [89]
with a square root dependence on the diffusion coefficient D (which is the same for A and B), the rate constant for autocatalysis k_q, and the initial concentration of the reactant a_0. The higher velocity solutions, also representing "allowed connections" between the stationary states (0, 0) and (1, 0), correspond to special initial conditions such as specific concentration gradients in B. These initial conditions give rise to fronts with "phase wave" character, where the extent of reaction varies locally with the initial concentration of B.
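The phase-plane argument can be checked numerically. The sketch below integrates Eqs. [83] from a point on the outset of the saddle at the origin for two wave speeds; the step size, number of steps, and starting displacement are illustrative choices. For c ≥ 2 the trajectory approaches (1, 0) without leaving the allowed region, whereas for c < 2 it overshoots α = 1, signaling the forbidden oscillatory approach.

```python
import numpy as np

# Trace the front profile of Eqs. [83] by fixed-step RK4 integration from
# the outset of the saddle at (0, 0).
def rhs(state, c):
    alpha, u = state
    return np.array([u, -c * u + alpha * (1.0 - alpha)])

def trace_front(c, dz=0.01, steps=5000):
    lam_plus = (-c + np.sqrt(c**2 + 4.0)) / 2.0   # unstable eigenvalue at (0,0), Eq. [86]
    state = 1e-6 * np.array([1.0, lam_plus])      # start on the outset eigenvector
    min_gap = 1.0                                  # track min(1 - alpha): < 0 means overshoot
    for _ in range(steps):
        k1 = rhs(state, c)
        k2 = rhs(state + 0.5 * dz * k1, c)
        k3 = rhs(state + 0.5 * dz * k2, c)
        k4 = rhs(state + dz * k3, c)
        state = state + (dz / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        min_gap = min(min_gap, 1.0 - state[0])
    return state, min_gap

for c in (1.0, 2.0):
    (alpha, u), gap = trace_front(c)
    print(f"c = {c}: final (alpha, u) = ({alpha:.4f}, {u:.2e}), min(1 - alpha) = {gap:.2e}")
```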
Cubic Autocatalysis Fronts

The simplest possible autocatalytic reaction is the quadratic autocatalysis of Eq. [76]. We now consider the "next simplest" case, in which a cubic nonlinearity appears in the rate law:

A + 2B → 3B,     rate = k_c[A][B]²     [90]
For the distributed system, we can write reaction-diffusion equations identical to Eq. [77], except that the cubic form of the rate law appears in place of the quadratic form. The reaction-diffusion equations are then rewritten as dimensionless equations to yield

∂α/∂τ = δ∇²α − αβ²
∂β/∂τ = ∇²β + αβ²     [91]

where now τ = k_c a_0² t and ξ = x(k_c a_0²/D_B)^{1/2}. The same procedure for transforming to a moving coordinate system as described previously for the quadratic case can be carried out here for the cubic case. A different approach is taken for the analysis in the cubic case, however, because, for this case, an analytical solution for the velocity and concentration profile can be found.46-48 The reader is referred to Refs. 43 and 45 for the detailed development. The front velocity for cubic autocatalysis may take on any value above some minimum velocity c_min, just as for the quadratic case. In dimensionless terms, we have

c_min = c* = 2^{−1/2}     [92]
that is, we find a minimum velocity that is different from that found for the quadratic case. In dimensional form, this minimum velocity is given by

v_min = 2^{−1/2} a_0 (k_c D)^{1/2}     [93]
with a square root dependence of v on the diffusion coefficient, D (again the same for A and B), and the rate constant for autocatalysis, k_c. The velocity is now directly proportional to the initial reactant concentration a_0, and we find a numerical coefficient of 2^{−1/2} rather than the Fisher-Kolmogorov coefficient of 2. As in the quadratic case, higher velocity solutions are also allowed, and, similarly, these correspond to phase waves arising from special initial conditions.
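The minimum speed can be verified by direct simulation. The following sketch integrates the dimensionless cubic equations (Eqs. [91] with δ = 1) with a simple explicit Euler scheme and measures the front position at two times; the grid, time step, and initiation profile are illustrative assumptions. The measured speed approaches c* = 2^{−1/2} once transients decay.

```python
import numpy as np

# Front-speed measurement for cubic autocatalysis (Eqs. [91], delta = 1)
M, dx, dt = 4000, 0.2, 0.01           # grid, spacing, step (dt < dx**2/2 for stability)
alpha, beta = np.ones(M), np.zeros(M)
alpha[:20], beta[:20] = 0.0, 1.0      # local initiation by a slug of autocatalyst

def lap(a):                            # 1D three-point Laplacian with no-flux edges
    p = np.pad(a, 1, mode="edge")
    return (p[:-2] - 2.0 * a + p[2:]) / dx**2

def front_pos(b):                      # first grid point where beta falls below 1/2
    return dx * np.argmax(b < 0.5)

x1 = None
for n in range(1, 100001):
    r = alpha * beta**2                # cubic rate term
    alpha, beta = alpha + dt * (lap(alpha) - r), beta + dt * (lap(beta) + r)
    if n == 50000:
        x1 = front_pos(beta)

speed = (front_pos(beta) - x1) / (50000 * dt)
print(speed, 2**-0.5)                  # ~0.707 after the transient decays
```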
Lateral Instabilities: Two- and Three-Dimensional Patterns

There are a number of important differences between quadratic and cubic fronts; however, the most striking is found in their behavior in two- and three-dimensional configurations. We shall focus on the two-dimensional case in which the diffusivities of A and B may take on significantly different values. (Similar behavior is found in the three-dimensional case.) We now use a two-dimensional Laplacian defined by ∇² = ∂²/∂ξ² + ∂²/∂η², with ξ = x/(D_B t_ch)^{1/2} and η = y/(D_B t_ch)^{1/2} in Eqs. [78] and [91], where t_ch = (k_q a_0)^{−1} and (k_c a_0²)^{−1}, respectively; t_ch is the characteristic time. The reaction takes place in a strip configuration defined by −∞ < ξ < +∞ and −η_0 ≤ η ≤ +η_0, where no-flux boundary conditions are imposed in the transverse direction, that is, ∂α/∂η = ∂β/∂η = 0 at η = ±η_0. The behavior found for the quadratic system in the one-dimensional case is directly applicable to the two-dimensional configuration: planar fronts are exhibited with velocities given by Eq. [88] for the case of equal diffusivities of A and B. When the diffusivities significantly differ, planar fronts are still observed; however, now the velocity scales with the diffusion coefficient D = D_B according to Eq. [89].43,49 The one-dimensional solution for the cubic system is also valid for the two-dimensional configuration; however, the cubic front may exhibit lateral instabilities that are not observed in the quadratic system.50 We will now consider the stability of cubic autocatalysis fronts. Figure 19 shows a sketch of a front that has been perturbed away from its planar configuration. The overall direction of propagation is from left to right, and there is a high concentration of reactant A ahead of the wave and a high concentration of autocatalyst B behind. In regions of the front that are convex in the direction of propagation, there is an enhanced dispersion of the autocatalyst B due to the curvature. This, in effect, dilutes the amount of B available for initiating autocatalytic reaction ahead. Thus the wave velocity in these convex segments will be decreased relative to the planar wave velocity.
Figure 19 Sketch of a perturbed front propagating from left to right. Small arrows indicate diffusion of reactant A and autocatalyst B. (Reprinted from Ref. 47 with permission of the American Institute of Physics.)

The opposite is true along the concave segments, where the wave front is retarded relative to the planar front. Here, there is a diffusive focusing of B into the region ahead of the front, leading to a local increase in wave velocity. These local increases and decreases in propagation velocity tend to eliminate the local curvature. Thus we may say that the diffusion of the autocatalyst has a stabilizing effect on the planar wave. Applying an analogous argument for the diffusion of the reactant A, we see that segments that are convex in the direction of propagation have a diffusive focusing of reactant into the front, which tends to increase the local velocity of these already advanced sections. In the retarded, concave sections, a diffusive dispersion of reactant occurs, tending to decrease the local velocity of these sections. Thus, we conclude that the diffusion of the reactant has a destabilizing effect on the planar wave. Based on these qualitative arguments, we may predict that for δ < 1, in which the diffusion coefficient of the autocatalyst is greater than that of the reactant, the overall tendency will be a stabilization of the planar front. For δ > 1, however, we anticipate that the planar front will lose stability as the destabilizing influence of the reactant diffusion becomes dominant. To test this prediction, Eqs. [91] were numerically integrated with δ = 1 and δ = 5. The initial conditions correspond to a discontinuity in the concentrations of A and B: α = 0, β = 1 for x ≤ x_0 and α = 1, β = 0 for x > x_0 at some convenient x_0 for all y. To perturb the planar front, the middle third in the y direction was displaced slightly forward. For δ = 1, the perturbation eventually completely decays away to yield a planar front. For δ = 5, the perturbation evolves to produce a distinctly nonplanar front in which four spatial oscillations in the y direction are displayed. The concentration profiles of α and β for the front are
Figure 20 Concentration profiles of A and B in cubic autocatalysis front with δ = 5. (Reprinted from Ref. 43 with permission of the American Chemical Society.)
shown in Figure 20. Interestingly, concentrations of the autocatalyst where β > 1 occur in the patterned front. The front behavior shown in Figure 20 is just a sampling of the spatiotemporal dynamics displayed by the cubic autocatalysis reaction-diffusion system. Even in this example, more complex behavior, in which the front also loses temporal stability, is exhibited for many initial conditions. The spatiotemporal behavior depends on the value of δ and the width of the reaction area. For a sufficiently large δ, an increasing number of wavelengths appears in the pattern as the reaction width is increased. Each pattern loses its temporal stability as the width is further increased, displaying complex oscillatory behavior, including period doubling and chaos, before the next pattern with more wavelengths is established. For a particular width, there is a critical value of δ at which the planar front loses its stability.
Numerical Methods for Solution of Partial Differential Equations

Simulation methods for spatially extended systems are much more numerically challenging than are those for homogeneous systems. The governing equations, that is, the reaction-diffusion equations, are systems of partial differential equations (PDEs) rather than ODEs. Computer solution of PDEs can be time consuming, especially for two- and three-dimensional geometries. In addition, the algorithms are plagued by additional sources of numerical instability, and the computer memory requirements may be prohibitive if anything more complex than a simple Euler approach is used for the time derivatives. We consider a general two-variable model involving reaction and diffusion. The PDE system is given by the following pair of equations:
∂u/∂t = D_u ∇²u + f(u, v)
∂v/∂t = D_v ∇²v + g(u, v)     [94]
where f(u, v) and g(u, v) are the reaction velocities for u and v, and the diffusion terms involve the Laplacian operator and the diffusion coefficients D_u and D_v. We will consider a two-dimensional rectangular system described by Cartesian coordinates; thus, the Laplacian operator is defined as ∇² = ∂²/∂x² + ∂²/∂y². A grid of points is defined as shown in Figure 21, and the Laplacian is discretized. If we take h to be the grid spacing, the simplest approach to the spatial discretization is to use a five-point formula for the Laplacian

∇²u_{i,j} ≈ (u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1} − 4u_{i,j}) / h²     [95]
A similar expression holds for the values of v at each grid point, that is, for v_{i,j}. We need to discretize the derivatives in time, as well, and this can be done in either an implicit or an explicit fashion. Considering first, for simplicity, an explicit Euler algorithm for the discretized time derivatives, the following system of equations will result:

u_{i,j}^{n+1} = u_{i,j}^n + Δt D_u ∇²(u_{i,j}^n) + Δt f(u_{i,j}^n, v_{i,j}^n)
v_{i,j}^{n+1} = v_{i,j}^n + Δt D_v ∇²(v_{i,j}^n) + Δt g(u_{i,j}^n, v_{i,j}^n)     [96]
Recall that the Laplacian terms involve evaluations of the variables u and v at the grid points surrounding the point (i, j) (see Figure 21). The values of u_{i,j}^{n+1}, v_{i,j}^{n+1} at this point are calculated from Eq. [96] using the values from the nth iteration, which includes the neighboring points for the Laplacian term according to Eq. [95]. The n + 1 values are stored, the next point in the grid is calculated, and so on. Once all the grid points have been determined, the values are updated, and the next iteration begins. An alternative approach is to treat the kinetic terms implicitly, while treating the diffusion terms explicitly51-54:
Figure 21 Two-dimensional grid for the simulation of reaction-diffusion problems. The five spatial points used in the discretization of the Laplacian in Eq. [95] are shown.
u_{i,j}^{n+1} = u_{i,j}^n + Δt D_u ∇²(u_{i,j}^n) + Δt f(u_{i,j}^{n+1}, v_{i,j}^{n+1})
v_{i,j}^{n+1} = v_{i,j}^n + Δt D_v ∇²(v_{i,j}^n) + Δt g(u_{i,j}^{n+1}, v_{i,j}^{n+1})     [97]
This is sometimes done to enhance numerical stability because implicit schemes provide a source of feedback, as was described previously in the section on numerical methods for ODEs. If this approach is taken, the resulting equations must be solved for the (n + 1)th values of the variables. An example will illustrate how this is done (the following presentation is taken from Ref. 51). Consider the Tyson and Fife55 reduction of the Oregonator model56 of the Belousov-Zhabotinskii (BZ) reaction:
∂u/∂t = ∇²u + (1/ε)[u(1 − u) − fv(u − q)/(u + q)]
∂v/∂t = ∇²v + u − v     [98]
where u and v are the dimensionless concentrations of bromous acid and oxidized metal ion catalyst, respectively, and f, q, and ε are parameters. Note that the diffusion coefficients (which are taken to be equal in this model) do not appear explicitly in the dimensionless equations. Because ε is typically quite small, the variable u changes very rapidly compared to the variable v. Hence, it is possible to use an explicit formulation for the kinetics term for the variable v, but an implicit formulation is needed to describe the kinetics of u. Taking this hybrid approach, the following formulas will hold at each grid point (note that we drop the i, j subscripts everywhere):

v_{n+1} = v_n + Δt ∇²(v_n) + Δt (u_n − v_n)     [99]

u_{n+1} = u_n + Δt ∇²(u_n) + (Δt/ε)[u_{n+1}(1 − u_{n+1}) − f v_{n+1}(u_{n+1} − q)/(u_{n+1} + q)]     [100]

Multiplying the second equation through by (u_{n+1} + q) and rearranging yields a cubic equation for u_{n+1}:

u_{n+1}³ + a u_{n+1}² + b u_{n+1} + c = 0     [101]

where the coefficients are defined as

a = ε/Δt + q − 1
b = ε(q − u_n)/Δt − ε∇²(u_n) + f v_{n+1} − q
c = −q [f v_{n+1} + ε∇²(u_n) + ε u_n/Δt]
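At each grid point, then, the update consists of one explicit step for v followed by solution of the cubic [101] for u_{n+1}; the sketch below implements this for a single point, solving the cubic iteratively with Newton-Raphson. The parameter values and the precomputed Laplacian terms are illustrative placeholders.

```python
# Semi-implicit point update of Eqs. [99]-[101] for the Tyson-Fife model.
eps, f, q, dt = 0.05, 1.5, 2e-4, 1e-3   # illustrative parameter values

def hybrid_point_update(u_n, v_n, lap_u, lap_v):
    # explicit step for the slow variable v (Eq. [99])
    v_new = v_n + dt * (lap_v + u_n - v_n)
    # coefficients of the cubic for u_{n+1} (Eq. [101])
    a = eps / dt + q - 1.0
    b = eps * (q - u_n) / dt - eps * lap_u + f * v_new - q
    c = -q * (f * v_new + eps * (lap_u + u_n / dt))
    # Newton-Raphson solve of u^3 + a u^2 + b u + c = 0, starting from u_n
    u = u_n
    for _ in range(20):
        p = u**3 + a * u**2 + b * u + c
        dp = 3.0 * u**2 + 2.0 * a * u + b
        step = p / dp
        u -= step
        if abs(step) < 1e-12:
            break
    return u, v_new

print(hybrid_point_update(0.3, 0.1, 0.0, 0.0))
```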
The cubic equation can be solved by an iterative routine such as the Newton-Raphson method.19 For PDEs, stability is affected by the relationship between the spatial grid size and the time step. For a five-point approximation to the Laplacian, it can be shown that the time step must be chosen such that

Δt ≤ h² / 4D     [102]

for the algorithm to be stable. Some improvement in accuracy can result if more grid points are included in the approximation to the Laplacian; for example, a standard nine-point formula is57:

∇²u_{i,j} ≈ [4(u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1}) + u_{i+1,j+1} + u_{i+1,j−1} + u_{i−1,j+1} + u_{i−1,j−1} − 20u_{i,j}] / 6h²     [103]

If this nine-point formula is used, the stability criterion becomes
Δt ≤ 3h² / 8D     [104]

which means that a slightly larger step size can be used and the numerical algorithm will remain stable. Because the numerical solution of PDEs is computationally intensive, a simple Euler algorithm is typically utilized. For one spatial dimension, it is possible to use a predictor-corrector method (such as Gear) because the Jacobian matrix is always banded and only the nonzero terms need to be stored. For two or three spatial dimensions, however, the number of variables that must be stored becomes difficult to handle. Even for a small grid, say 100 × 100 elements in size, a two-variable system of PDEs will result in 20,000 coupled ODEs, which must be solved simultaneously using predictor-corrector methods. Furthermore (and this is the real problem), the Jacobian will contain (2 × 10⁴)² elements. Since at least 4 bytes per element are needed for storage, this would require over a gigabyte of storage for the Jacobian matrix alone. Because many of these elements are zero, a special "sparse-Jacobian" version of the predictor-corrector algorithm can be used to reduce the storage requirements. This is an option, but most approaches to the simulation of reaction-diffusion patterns have taken the simpler route of using an Euler routine. In this approach, calculations are carried out with smaller and smaller space and time steps, maintaining a stability criterion such as Eq. [102] or [104], until no change is seen in the dynamical behavior. This procedure assures that the discretized equations approximate the PDEs with sufficient accuracy.
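A minimal implementation of the explicit scheme of Eqs. [95]-[96] is sketched below for a generic two-variable model on a square no-flux grid; the kinetic functions, diffusivities, and run length are illustrative assumptions to be replaced by the model of interest.

```python
import numpy as np

# Explicit Euler integration of Eqs. [94] with the five-point Laplacian [95]
N, h = 100, 1.0                        # grid size and spacing
Du, Dv = 2.0, 16.0
dt = 0.9 * h**2 / (4.0 * max(Du, Dv))  # satisfy the stability bound of Eq. [102]

f = lambda u, v: u - u**3 - v + 0.05   # illustrative kinetics (not from the text)
g = lambda u, v: 0.1 * (u - v)

rng = np.random.default_rng(0)
u = 0.1 * rng.standard_normal((N, N))  # small random perturbation of the steady state
v = np.zeros((N, N))

def laplacian(a):
    # five-point formula, Eq. [95]; edge padding enforces no-flux boundaries
    p = np.pad(a, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a) / h**2

for step in range(5000):
    u_new = u + dt * (Du * laplacian(u) + f(u, v))
    v_new = v + dt * (Dv * laplacian(v) + g(u, v))
    u, v = u_new, v_new

print(u.min(), u.max())                # spatial structure develops from the noise
```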
Cellular Automata and Other Coupled Lattice Methods
In addition to the direct solution of PDEs corresponding to reaction-diffusion equations, in recent years attention has begun to be focused on the use of coupled lattice methods. In this approach, diffusion is not treated explicitly; rather, a lattice of elements in which the kinetic processes occur is coupled together in a variety of ways. The simulation of excitable media by cellular automata techniques has grown in popularity because they offer much greater computational efficiency for the two- and three-dimensional configurations required to study complex wave activity such as spirals and scroll waves.
Coupled lattices of various types can be created. In the cellular automata approach, a variable that can take on discrete values constitutes the elements which are coupled together. The coupling occurs via rules that simulate physical processes such as biological interactions, diffusion, and so forth. Coupled map lattices take this one step further and assign to each lattice element a difference equation that, when iterated, produces a discrete dynamical system. Coupled ODE lattices represent the next step in complexity, and accuracy, for a coupled lattice; here, an ODE or system of ODEs is coupled together, again by a choice of simple rules chosen to simulate the desired physical interactions. The advantages of coupled lattice techniques, in general, over numerical solution of the corresponding PDEs are, clearly, speed and efficiency. The disadvantage is that the models are not so easily related to the underlying physics and chemistry that they are created to simulate. The behavior of coupled lattices may be completely unrelated to, and not predictable from, the behavior of the corresponding PDE. In general, it is best to think of coupled lattice models as entirely new dynamical systems and not as substitutes for the corresponding PDE system.58 With this caveat, coupled lattices can be used, with care, to study the dynamics of problems such as reaction-diffusion systems, and offer a fast, reliable means to do so.
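As a minimal example of a coupled map lattice, the sketch below couples chaotic logistic maps to their nearest neighbors with a discrete diffusive rule; the map, coupling strength, and lattice size are illustrative choices, not a model from the text.

```python
import numpy as np

# Coupled map lattice: a logistic map at each site, diffusively coupled
N, steps = 128, 1000
r, eps = 3.9, 0.3                      # chaotic map parameter, coupling strength

rng = np.random.default_rng(1)
x = rng.random(N)

def logistic(x):
    return r * x * (1.0 - x)

for _ in range(steps):
    fx = logistic(x)                   # local kinetics applied at every site
    # discrete diffusive coupling to nearest neighbors, periodic boundaries
    x = (1.0 - eps) * fx + (eps / 2.0) * (np.roll(fx, 1) + np.roll(fx, -1))

print(x[:5])                           # spatiotemporal state after transients
```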
GEOMETRIC REPRESENTATIONS OF NONLINEAR DYNAMICS

Phase Space, Poincaré Sections, and Poincaré Maps

The phase space representation of trajectories computed numerically, as described above, has been introduced in another chapter of this volume.59 The systems considered there are Hamiltonian systems, which arise in chemistry in the context of molecular dynamics problems, for example. The difference between Hamiltonian systems and the dissipative ones we are considering in this chapter is that, in the former, a constant of the motion (namely the energy) characterizes the system. A dissipative system, in contrast, is characterized by processes that dissipate rather than conserve energy, pulling the trajectory "in" toward an attractor (where "in" refers to the direction in phase space toward the center of the attractor). We have already seen two examples of attractors, the steady state attractor and the limit cycle attractor. These attractors, as well as the strange attractors that arise in the study of chaotic systems, are most easily defined in the context of the phase space in which they exist. Phase space is defined in essentially the same way for dissipative systems as it is for Hamiltonian systems. In Hamiltonian systems, the axes of the phase space are the dynamical variables (which are the coordinates) and their conjugate momenta. In dissipative systems, the axes are defined in a similar way: the
dynamical variables (such as the species concentrations in a chemical kinetics system) and their velocities (or any quantity related to the time derivative of the dynamical variable) are used. It is also common to use only a subset of the phase space variables (usually just the concentrations and, then, not always all of them), in which case the phase space portraits are actually projections of the full portrait onto a lower dimensional subspace. The main difference between the Hamiltonian and dissipative systems arises from the conservation condition that applies to the former. In Hamiltonian systems, the total energy is fixed. A trajectory with a given initial condition and energy will continue with that same energy for the remainder of the trajectory. In the phase space representation, this will result in a stable trajectory that does not "pull in" toward an attractor. A periodic trajectory in a Hamiltonian system will have an amplitude and position in the phase space that is determined by the initial conditions. In fact, the phase space representation of a Hamiltonian system often includes many choices of initial conditions in the same phase space portrait. The Poincaré section, to be described below, likewise contains many choices of initial conditions in one diagram. In dissipative systems, many trajectories with different choices of initial conditions will be attracted to the same region in phase space and end up, asymptotically, on the same attractor. The phase space portrait, then, will usually consist only of the asymptotic state, that is, a trajectory that represents the final state for many different initial conditions. This asymptotic trajectory traces over the attractor and reveals its shape. Figure 22(a) shows a steady state attractor, whereas Figure 22(b) shows a limit cycle attractor. A Poincaré surface of section is defined for dissipative systems in the same way as for Hamiltonian ones but, again, will look somewhat different because the phase space trajectory consists only of the asymptotic state, a single attractor. To construct the Poincaré section, the phase space portrait is "cut" with a surface to create a cross-sectional view of the attractor. Hence, for a simple limit cycle attractor which is a single loop in phase space, the Poincaré section consists of a single point. For a more complex attractor, the Poincaré section will be more elaborate, as we will see. A trajectory that is approaching a limit cycle attractor will pierce through the Poincaré section at different points each time the trajectory passes through the section, creating a sequence {1, 2, 3, . . .} corresponding to the ordered pairs of points in the phase space, {(x_1, y_1), (x_2, y_2), (x_3, y_3), . . .}, as shown in Figure 23. The (n + 1)th point is determined by the nth point. The flow of the vector field directs the trajectory in the phase space, taking it through the Poincaré surface of section, cycling around, and then through the Poincaré surface again. If the flow eventually carries the system to the limit cycle, subsequent points in the sequence {(x_1, y_1), (x_2, y_2), (x_3, y_3), . . .} will converge
Figure 22 Examples of attractors: (a) shows a steady state attractor (here, a node), whereas (b) shows a limit cycle attractor.
(x_n, y_n) → (x_lc, y_lc) as n → ∞

where (x_lc, y_lc) is the single point in the Poincaré section corresponding to a simple (one-loop) limit cycle. In general, the flow provides a mapping of one point into the next. Thus, a Poincaré map (or, equivalently, a next-return map) can be created from the ordered pairs of points by plotting x_{n+1} versus x_n (or, equivalently, y_{n+1} versus y_n, or some function of the x's and y's). If the transient part of the trajectory is eliminated and only the asymptotic limit cycle state is used to create the Poincaré map, a graph containing a single point at x_lc will result. We will return to the Poincaré map concept in our discussion of chaos in a subsequent section. The Poincaré section is a convenient way to view the results of a numerical simulation, but, more importantly, it shows that the essential dynamics of
Figure 23 Operational definition of the Poincaré section. A trajectory approaching a limit cycle (l.c.) pierces through the Poincaré section (here located at x = x_lc) a number of times before it reaches the limit cycle. Once the transient trajectory has reached the limit cycle, the Poincaré section will consist of a single point at {x_lc, y_lc}.

the system are representable in a space of lower dimensionality. The conclusion that can be drawn from this is that the dynamics of the system are, in fact, low-dimensional, that is, the number of variables required to describe the dynamics is small. This is an essential feature of nonlinear dissipative systems. One way to see the effect of the reduction in dimensionality that occurs as one goes from time series to phase space portrait to Poincaré section is to consider the stability of a system as it is reflected in the stability of points in the Poincaré section. The single point, which corresponds to the Poincaré section of a simple limit cycle, can be treated in the same way an equilibrium point (or steady state point) is treated, even though, here, we are considering the stability of the periodic state, that is, the limit cycle. A small perturbation added to this point in the cross section will be found to decay back toward the point itself, if the limit cycle is stable, or to evolve away from the point, if the cycle is unstable. The stability properties of the point in the Poincaré section are the same as the stability properties of the limit cycle to which it corresponds. Hence, a stable point in the section means that the limit cycle is stable, and an unstable one means that the limit cycle is unstable. And, furthermore, any bifurcations which occur for the point in the cross section also correspond to bifurcations which the limit cycle undergoes as a parameter is varied. With the surface of section technique, it can be observed that a limit cycle can undergo, for example, a Hopf bifurcation. When the stability analysis of the Poincaré section is carried out, the real part of a complex conjugate pair of eigenvalues is seen to pass from negative to positive; a small perturbation added to the limit cycle will evolve away from the cycle in an oscillatory fashion. This type of bifurcation results, then, in the appearance of a second
characteristic frequency of oscillation and the birth of a new attractor for the system. This new attractor, characterized by two characteristic frequencies, takes the shape of a torus and is a very important attractor in one of the scenarios that leads to chaos. Before getting into more details about this scenario, we first turn to a general consideration of the phenomenon of chaos in nonlinear dissipative systems.
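The next-return construction just described is easy to reproduce numerically. The sketch below (not from the chapter) integrates the van der Pol oscillator, a convenient stand-in for any system with a stable limit cycle, and records the successive piercings of the surface of section x = 0; the recorded values converge to the single point that constitutes the Poincaré section of the limit cycle. The model choice and all numerical settings are assumptions for illustration only.

```python
import numpy as np
from scipy.integrate import solve_ivp

def vdp(t, u, mu=1.0):
    # van der Pol oscillator: a planar system with a stable limit cycle
    x, y = u
    return [y, mu * (1.0 - x * x) * y - x]

def section(t, u, mu=1.0):
    # surface of section: the plane x = 0, crossed in the +x direction
    return u[0]
section.direction = 1.0

sol = solve_ivp(vdp, (0.0, 200.0), [0.01, 0.0], events=section,
                rtol=1e-10, atol=1e-12)

# y-coordinates of the successive piercings: the next-return sequence
# {y_1, y_2, y_3, ...}, which converges to the fixed point y_lc of the map
y_n = sol.y_events[0][:, 1]
print(y_n[:5])    # transient points, still approaching the limit cycle
print(y_n[-5:])   # converged: one repeated value, the point y_lc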
Chaos

The phenomenon of chaos has captured the imagination of many people, scientists and nonscientists alike, and it is probably fair to say that its discovery in nonlinear dissipative systems had an invigorating influence on the development of the field. The fact that irregular, unpredictable behavior can arise in a system that is completely determined (i.e., deterministic) is now a well-accepted fact of nature. This realization, however, has changed in a profound way our understanding of what nature is like. No longer do we live in a universe in which the future is predetermined and its prediction (or control) depends only on the discovery of its governing laws. Now we know that the future of the universe, indeed of any system of sufficient complexity, cannot be determined no matter how well we understand its inner workings. This has resulted in a major paradigm shift and a rethinking of the meaning of "understanding" a system.

Chaos can arise in very simple systems, even those characterized by as few as three variables. Indeed, the discovery of chaotic behavior can be traced to a study by Edward Lorenz60 of a system of three nonlinear ODEs extracted from the hydrodynamic equations used in meteorological modeling. The story is now familiar to students of the field, and its essential features have been retold many times61: how Prof. Lorenz noticed that identical runs of a numerical integration routine with slightly differing initial conditions led to widely varying results after a few iterations; how he first attributed this to a "bug" in his program but eventually realized that the widely varying results were real, a result of the equations themselves and not due to any mistake or programming error. Other important events in the development of this field include the publication of an article by Li and Yorke62 in which the word "chaos" was used for the first time to refer to phenomena of the type discovered by Lorenz. Later, subsequent investigators found that chaotic behavior had been observed many years before Lorenz but had not been recognized as such. It is now accepted that Poincaré observed irregular, chaotic behavior in his studies63 but at the time (a century ago, long before the invention of the computer) it could not be investigated thoroughly enough to really understand it. Indeed, it was the development of the digital computer that made the study of nonlinear dynamical systems possible, a study which has revealed the wide variety of rich dynamical behavior that always existed in these systems but whose elucidation was beyond the scope of earlier computational ability.
Attractors

Chaotic behavior in nonlinear dissipative systems is characterized by the existence of a new type of attractor, the strange attractor. The name comes from the unusual dimensionality assigned to it. A steady state attractor is a point in phase space, whereas a limit cycle attractor is a closed curve. The steady state attractor, thus, has a dimension of zero in phase space, whereas the limit cycle has a dimension of one. A torus is an example of a two-dimensional attractor because trajectories attracted to it wind around over its two-dimensional surface. A strange attractor is not easily characterized in terms of an integer dimension but is, perhaps surprisingly, best described in terms of a fractional dimension. The strange attractor is, in fact, a fractal object in phase space. The science of fractal objects is, as we will see, intimately connected to that of nonlinear dynamics and chaos. Although a strange attractor has an unusual dimensionality and corresponds to a chaotic state, it represents a stable configuration of the system. Small perturbations moving the system away from the attractor are found to decay away, with the system relaxing back to the (stable) chaotic state. It might seem paradoxical that chaos could be stable, but this is one of the more remarkable features of nonlinear dissipative systems. Chaotic behavior is robust and stable, and, furthermore, its important features (such as the nature of the attractor that governs it) are independent of initial conditions. Within a certain basin of attraction in phase space, all trajectories will move toward the strange attractor; it thus represents the stable, asymptotic state of the system under these particular conditions.
Sensitive Dependence on Initial Conditions: The Lyapunov Exponent

Whereas a strange attractor represents an asymptotic state that is independent of initial conditions within a basin of attraction in phase space, it is still the case that a chaotic state exhibits the property of sensitive dependence on initial conditions. This latter property has a somewhat different meaning than that discussed previously in relation to the basin of attraction. The property of sensitive dependence on initial conditions refers to the following effect: two trajectories differing only infinitesimally in initial conditions will start off near one another on the strange attractor but will, after a finite period of time, diverge exponentially. The distance between any two initially nearby trajectories grows with time t as d(t) = d_0 e^{λt}, where λ is the characteristic exponent that describes the divergence. λ is known as the Lyapunov exponent and is positive for a chaotic state. In fact, the property of sensitive dependence on initial conditions and/or a positive Lyapunov exponent is taken by some as the definition of chaotic behavior, although the actual situation is much more complicated than this. In fact, nonchaotic systems also display sensitive dependence on initial conditions
and can be shown to have a positive Lyapunov exponent. This criterion alone is, then, not enough to establish the existence of chaos in a system. The identification of chaotic behavior is definitive only when several criteria, described below (including a positive sign of the Lyapunov exponent), are found to hold.

How can it be that a chaotic system is associated with a stable attractor toward which trajectories arising from many different initial conditions are attracted and, yet, exhibits a sensitive dependence on initial conditions? The answer lies in the vector field properties of the phase space containing the attractor.17 The vector field associated with the attractor is characterized by different manifolds. On the stable manifolds, trajectories move toward one another, that is, converge, whereas on the unstable manifolds, trajectories diverge. It is the confluence of these two opposing effects that makes the trajectory strongly dependent on the precise choice of initial condition; two initially nearby trajectories will move away from one another after a finite period of time because of the effect of the unstable manifold. This results in the property of sensitive dependence on initial conditions. To understand the stability of the attractor itself, we consider the region of phase space surrounding the attractor. This region is filled with a vector field that points toward the attractor so that all trajectories in the region will eventually find their way to it. Hence, trajectories that are nearby, that is, in a basin surrounding the attractor, will be attracted to it regardless of their starting points.

A Lyapunov exponent is a generalized measure of the growth or decay of small perturbations away from a particular dynamical state. For perturbations around a fixed point or steady state, the Lyapunov exponents are identical to the stability eigenvalues of the Jacobian matrix discussed in an earlier section. For a limit cycle, the Lyapunov exponents are called Floquet exponents and are determined by carrying out a stability analysis in which perturbations are applied to the asymptotic, periodic state that characterizes the limit cycle. For chaotic states, at least one of the Lyapunov exponents will turn out to be positive. Algorithms for the calculation of Lyapunov exponents are discussed in a later section in conjunction with the analysis of experimental data. These algorithms can be used for simulations that yield possibly chaotic results as well as for the analysis of experimental data.

As mentioned previously, the identification of chaotic behavior in a system can be a tricky business, and there are many pitfalls in doing so. The safest bet is to look for the confluence of several of the accepted criteria for chaotic behavior, which are: sensitive dependence on initial conditions; existence of a strange attractor; and observation of an accepted route to chaos. Again, there is no hard and fast "definition" of chaotic behavior, and even in using this list, problems abound. For example, the identification of chaotic
behavior that arises by a new "route" is not unheard of, although such a claim must be justified more strongly than the claim for chaos that arises by a well-known route. The way in which chaos arises in a system as a bifurcation parameter is varied is known as the "route to chaos," and, as implied above, its identification is crucial in establishing the existence of chaotic behavior. Some of the more well-established routes to chaos and examples of their existence in chemical systems will now be described.
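Before turning to the routes themselves, the sensitive dependence discussed above can be demonstrated numerically. The sketch below integrates the Lorenz equations mentioned earlier from two initial conditions differing by 10^-9 and fits the exponential growth of their separation. The parameter values (sigma = 10, r = 28, b = 8/3) are the standard ones from the meteorological literature and are an assumption here, since the chapter does not list them.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, u, sigma=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = u
    return [sigma * (y - x), x * (r - z) - y, x * y - b * z]

t = np.linspace(0.0, 25.0, 501)
u0 = np.array([1.0, 1.0, 20.0])
a = solve_ivp(lorenz, (0, 25), u0, t_eval=t, rtol=1e-10, atol=1e-12)
b2 = solve_ivp(lorenz, (0, 25), u0 + np.array([1e-9, 0.0, 0.0]),
               t_eval=t, rtol=1e-10, atol=1e-12)

# separation of the two trajectories; it grows roughly as exp(lambda*t)
# until it saturates at the size of the attractor
d = np.linalg.norm(a.y - b2.y, axis=0)
window = (d > 1e-8) & (d < 1.0)          # the exponential-growth regime
lam = np.polyfit(t[window], np.log(d[window]), 1)[0]
print(f"divergence exponent ~ {lam:.2f} per time unit (positive)")
```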
Routes to Chaos

Chaotic behavior is associated with periodic behavior. When we speak of the "route to chaos," we are talking about a scenario in which chaos arises from a system that displays periodic dynamics for certain parameter values. Changing this parameter will cause the system to undergo bifurcations, that is, sudden and discontinuous changes of behavior, which lead it from the regular periodic behavior into chaotic, aperiodic behavior. The precise sequence of events that occur along the way from periodic behavior to chaotic behavior provides insights into the nature of the chaos. The following descriptions of well-understood routes to chaos apply to homogeneous (well-stirred) systems as well as to nonhomogeneous systems, although the characterization of chaos in the latter is still an area of active research.
Period Doubling Route

The route to chaos that was the first to be thoroughly understood, and is probably also the simplest, is known as the period-doubling route to chaos.64 In this scenario, a simple periodic state is found to undergo a series of bifurcations that, one after the other, approximately double its period of oscillation until an accumulation point is reached that corresponds to an infinite number of doublings. Here the period becomes infinite. The limit cycle attractor associated with the original periodic state goes through distinct and recognizable changes as these bifurcations occur; furthermore, the resulting infinite-period state is found to be associated with a strange attractor and to exhibit a positive Lyapunov exponent, that is, sensitive dependence on initial conditions. The confluence of all these observations leads to a conclusion that the final state observed is a chaotic state of the system. The period-doubling route to chaos is often illustrated in textbooks using a simple iterative system known as the logistic map or quadratic map65; indeed, much of the original work64,66 on the period-doubling route to chaos occurred using the logistic map as the specific system of interest. In our discussion of period doubling, however, we will consider a different model, one involving differential equations, which makes it easier to relate to the types of models that arise in computational chemistry. Otto Rössler developed the three-variable model we will consider67; it gives rise to chaotic behavior governed by a strange attractor now known as the Rössler attractor. The three-variable model is:
dx/dt = -y - z
dy/dt = x + ay                                                      [106]
dz/dt = b + z(x - c)
which contains three parameters (a, b, and c) and three dependent variables (x, y, and z). Because the model is a nonlinear coupled system of ODEs, it is necessary to solve it using numerical techniques. Runge-Kutta works quite well for this system, although a stiff ODE solver may also be used. The period-doubling route to chaos is easily seen if the parameters a and b are considered to be fixed and c is allowed to vary. Figure 24 shows a sequence of phase portraits for a = b = 0.2 with c given by c = 2.5, c = 3.5, and c = 4.0, respectively. If the Rössler system is solved for the first value of c, any choice of initial conditions will result in a trajectory that approaches the limit cycle shown in Figure 24(a). This limit cycle is called the period-one state because it corresponds to a single loop in the phase space. Between c = 2.5 and c = 3.5, the Rössler system will undergo a bifurcation, and the period-one state will become unstable. For values of c above the bifurcation point, the Rössler system develops a new stable state shown in Figure 24(b). This state is called the period-two state because the trajectory goes around the attractor twice for every time it went around once in the period-one state. Increasing c, again, to c = 4.0 results in another bifurcation. The period-two limit cycle loses its stability and is replaced by a third limit cycle, the period-four state shown in Figure 24(c). Finally, increasing c to c = 5.0 results in the phase portrait shown in Figure 24(d), which is, as we will see, a chaotic state. This phase portrait thus corresponds to a strange attractor, the Rössler attractor.

Note that the period of oscillation is approximately doubling as each bifurcation occurs in this sequence; at the same time, each loop in the associated attractor is splitting into two loops. The period-one attractor thus has one loop, the period-two attractor has two loops, and the period-four attractor has four. Furthermore, the points along the c axis at which the subsequent bifurcations occur get closer together as c increases. If we could subdivide the interval between c = 4.0 (corresponding to the period-four state) and c = 5.0 (the chaotic state), we would find a period-eight state, a period-sixteen state, and so on, all squeezed into a small interval along the c axis in the range c = 4.0 to c = 5.0. Because the period of oscillation approximately doubles at each bifurcation point, we conclude that the period goes to a limit of infinity at an accumulation point somewhere between c = 4.0 and c = 5.0. The accumulation point corresponds to the value of c at which the intervals between subsequent bifurcations collapse to zero size. This is the point at which all the periodic states lose their stability and the only remaining stable state of the system is the chaotic one. It is clear from the phase portrait of the chaotic attractor that all the periodic orbits that have lost their stability throughout the bifurcation sequence are visited in the course of time as the chaotic trajectory winds its way
Figure 24 Period-doubling bifurcations in the Rössler model, Eqs. [106]. For these simulations (computed using the Runge-Kutta routine DiffEq-3D, with stepsize = 0.1, available through Ref. 18), the following parameters were used: a = 0.1 and b = 0.2 with c variable. In (a) c = 2.5 and the initial values were x_0 = -2.98, y_0 = 2.03, z_0 = 0; in (b) c = 3.5 and initial values were x_0 = -2.91, y_0 = 2.222, z_0 = 0; in (c) c = 4.1 and initial values were x_0 = -3.75, y_0 = 2.50, z_0 = 0; and in (d) c = 5.0 with initial values x_0 = -2.83, y_0 = 1.833, z_0 = 0. In all parts (a)-(d), the axis limits are: -12 ≤ x ≤ 12; -12 ≤ y ≤ 12; 0 ≤ z ≤ 25.
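A minimal sketch of these simulations, assuming the a = b = 0.2 values quoted in the text (the figure caption lists a = 0.1) and using scipy's Runge-Kutta solver in place of the DiffEq-3D routine of Ref. 18. Counting the distinct oscillation amplitudes on the asymptotic attractor roughly distinguishes the period-1, period-2, period-4, and chaotic states.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rossler(t, u, a, b, c):
    x, y, z = u
    return [-y - z, x + a * y, b + z * (x - c)]

for c in (2.5, 3.5, 4.0, 5.0):
    sol = solve_ivp(rossler, (0.0, 1000.0), [-3.0, 2.0, 0.0],
                    args=(0.2, 0.2, c), max_step=0.05, rtol=1e-9)
    x = sol.y[0][sol.t > 500.0]        # discard the transient
    interior = x[1:-1]
    maxima = interior[(interior > x[:-2]) & (interior > x[2:])]
    n = len(np.unique(np.round(maxima, 2)))
    print(f"c = {c}: ~{n} distinct amplitude(s) per cycle")
```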
Figure 25 Poincaré sections for the limit cycle states shown in Figure 24.
over the strange attractor. In fact, we can think of the stable strange attractor as consisting of an infinite number of unstable periodic orbits.

The Rössler system provides a nice illustration of the use of Poincaré sections in understanding nonlinear dynamic behavior. Figure 25 shows the Poincaré sections of the phase portraits in Figure 24. The period-one state yields a single point in cross section, the period-two yields two points, and so on. We may illustrate these points in the cross section using the Poincaré map shown in Figure 25. The Poincaré map is created by plotting the (n + 1)th value of the trajectory as it passes through the section against the nth value. As discussed in a previous section, the period-one limit cycle will thus produce a Poincaré map that is a single point [see Figure 25(a)]. The period-two limit cycle will yield a Poincaré map with two points, and the period-four state will have a map with four points. As seen in the accompanying figure, the chaotic state that arises out of this sequence of periodic states does, indeed, correspond to a Poincaré map with an infinite number of points, but, remarkably, these points all fall along a simple curve [see Figure 25(d)].68 This result is remarkable because it shows that the complex, aperiodic behavior displayed by the Rössler system can be reduced to a simple function of the form

x_{n+1} = f(x_n)                                                    [107]
where f(x) is, judging from the shape of the graph that has resulted from this analysis, apparently a polynomial of low order, very close to a simple quadratic, in fact. All the essential features of the dynamics of the Rössler system are contained in and recoverable from the Poincaré map derived using the analysis
described previously. The specific trajectories are not recoverable, nor are all the detailed features of the associated strange attractor, but none of these is really of fundamental importance in describing the dynamics of the Rössler (or any other) nonlinear system. What is important is the sequence of bifurcations that occur as a parameter is varied and the topological characteristics of the associated attractors; these are, in fact, explained by a simple function of the form shown in Figure 25(d).

The logistic map, referred to previously, is a now famous, but simple, quadratic function that displays the same period-doubling cascade into chaos exhibited by the Rössler system (and by quite a few experimental examples, as well). A "map" is an equation that gives an iterative rule for generating a sequence of points {x_1, x_2, x_3, . . . } given an initial value x_0. The logistic map is given by

x_{n+1} = A x_n (1 - x_n)                                           [108]
and a plot of it is quite similar to the Poincaré map extracted from the trajectories for the Rössler system. For values of A > 3.57, the logistic map displays chaotic behavior that arises through the same period-doubling route observed in the Rössler system. The dynamics of the logistic map have been described in many other places, and several recent texts with detailed information on its dynamics are available17,65; these may be consulted by the reader who desires further details.
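Because the logistic map needs no ODE solver, its period-doubling cascade can be reproduced in a few lines; the parameter values below are illustrative choices on either side of the accumulation point near A = 3.57.

```python
import numpy as np

def asymptotic_orbit(A, x0=0.5, n_transient=2000, n_keep=64):
    x = x0
    for _ in range(n_transient):          # discard the transient
        x = A * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):               # record the asymptotic state
        x = A * x * (1.0 - x)
        orbit.append(x)
    return np.array(orbit)

for A in (2.8, 3.2, 3.5, 3.9):
    orbit = asymptotic_orbit(A)
    n = len(np.unique(np.round(orbit, 8)))
    label = f"period-{n}" if n < 64 else "aperiodic (chaotic)"
    print(f"A = {A}: {label}")
```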
Evolution of the Torus

Another route to chaos that is important in chemical systems involves a torus attractor which arises via bifurcation from a limit cycle attractor. Again chaos is found to be associated with periodic behavior and to arise from it through a sequence of transformations and associated bifurcations of a periodic state of the system. The specific sequence is different in this case, however, and somewhat more complex.

As mentioned in a previous section, a limit cycle can, at times, undergo a Hopf bifurcation. This would be revealed in a stability analysis of the cross-sectional point in the Poincaré section of the limit cycle. A Hopf bifurcation would correspond to an associated pair of eigenvalues whose real part passes from negative to positive while all other eigenvalues remain negative. In physical terms, a Hopf bifurcation means a second frequency becomes available to the system, and this is reflected in the disappearance of the limit cycle attractor and the appearance of a torus attractor. (The limit cycle actually still exists, but the bifurcation renders it unstable so that all trajectories are repelled from it.)

The torus attractor has associated with it two distinct kinds of behavior: quasiperiodic and periodic. These two kinds of behavior are, in turn, associated with two distinct relationships between the two natural periods (or frequencies) in the system. The first frequency corresponds to the now unstable limit cycle
and is associated with the period of time required to traverse the torus in the long direction (see Figure 26). The second frequency is that which arises via the Hopf bifurcation of the limit cycle and is associated with the period of time required to wind around the small part of the torus. Quasiperiodic behavior occurs when these two frequencies are incommensurate, that is, when one is not a simple multiple (integer or otherwise) of the other [see Figure 27(a)]. Periodic behavior, also called phase-locked behavior, corresponds to a commensurate relationship between the two. In the latter case, one frequency is a multiple of the other (although the multiplicative factor is not necessarily an integer; it is, however, always rational). The quasiperiodic states are immediately recognizable in the Poincaré section [see Figure 27(b)] as a closed curve [see Figure 27(c)]. The phase-locked states correspond to limit cycles that wind around the surface of the torus and thus result in a Poincaré section that consists of a number of distinct points where the trajectory repeatedly pierces the section as it passes through it on each pass.

Chaos does not occur as long as the torus attractor is stable. As a parameter of the system is varied, however, this attractor may go through a sequence of transformations that eventually render it unstable and lead to the possibility of chaotic behavior. An early suggestion for how this happens arose in the context of turbulent fluid flow and involved a cascade of Hopf bifurcations, each of which generates additional independent frequencies.69 Each additional frequency corresponds to an additional dimension in phase space; the associated attractors are correspondingly higher dimensional tori so that, for example, two independent frequencies correspond to a two-dimensional torus (T2), whereas three independent frequencies would correspond to a three-dimensional torus (T3). The Landau theory69 suggested that a cascade of Hopf bifurcations eventually accumulates at a particular value of the bifurcation parameter, at which point an infinity of modes becomes available to the system; this would then correspond to chaos (i.e., turbulence).
Stable Fixed Point -> Unstable Fixed Point + Stable Limit Cycle -> Unstable Limit Cycle + Stable Torus
Figure 26 Generation of a torus attractor via two Hopf bifurcations. The first Hopf bifurcation converts a stable fixed point (a focus) into an unstable focus. A stable limit cycle generally originates at this bifurcation point. A second Hopf bifurcation occurs, rendering the limit cycle unstable, and giving rise to a stable torus. Each Hopf bifurcation results in one additional frequency of oscillation in the system.
Figure 27 The torus attractor generated by quasiperiodic trajectories. In (a) a quasiperiodic trajectory is shown making slightly more than one pass around the torus. As can be seen on the front left side of the torus attractor, the quasiperiodic trajectory does not exactly match up with its first pass; the eventual result is that the surface of the torus will be covered completely by the quasiperiodic trajectory. The Poincaré surface of section shown in (b) results in a circular or ellipsoidal cross section (c).
Although the Landau theory is interesting, there does not seem to be any experimental evidence for the existence of higher dimensional torus attractors in the transition to turbulent flow nor, interestingly, in any other experimental system. Brandstater and Swinney70 carried out careful experiments on the transition to turbulence in Taylor-Couette flow, which involves changes in the motion of fluid between two counter-rotating cylinders as the rotation speed is increased. They found that the T2 attractor is followed immediately by turbulence as the rotation speed increases, without any evidence of the intermediate T3, T4, . . . stages that would appear in the cascade of Hopf bifurcations predicted by the Landau theory. Ruelle, Takens, and Newhouse suggested71 a
scenario (the RTN scenario) by which a three-dimensional torus is, in fact, formed after a third Hopf bifurcation but is unstable, so it would never be observed. A fractal attractor that is stable forms instead, although the way in which this happens is not specified by the RTN scenario.

The current view of the transition to chaotic behavior from the torus attractor observed in chemical systems is tied heavily to phase-locked states, that is, those that correspond to commensurately related frequencies on the torus. Both the Landau and RTN scenarios are more closely related to quasiperiodic behavior on a torus (i.e., the case of incommensurate frequencies) than to phase-locked behavior. As described previously, phase-locked states are complex limit cycles confined to the surface of the torus; their Poincaré section consists of a finite number of points corresponding to the places the limit cycle trajectory repetitively crosses the Poincaré surface of section. The evolution of the torus from its regular form to a strange attractor involves a change in the vector field around these points as the bifurcation parameter is moved toward values that support chaotic behavior.

In systems that follow the torus route to chaos, a wrinkling of the surface of the torus is noted as the bifurcation into chaos is approached. The wrinkling occurs near the points corresponding to the phase-locked states and, as chaos is approached, becomes more and more pronounced until, eventually, the torus breaks up and becomes a fractal object, that is, a strange attractor. Closely associated with the wrinkling and fractalization of the torus is the existence of so-called mixed-mode states, generally found to be arranged in an interesting pattern known as a Farey sequence.72 The mixed-mode states are actually phase-locked states on the broken or fractal torus and are, therefore, periodic. They consist of repeating sequences of large- and small-amplitude oscillations. The pattern of large to small peaks follows a Farey sequence, to some extent, as the bifurcation parameter is varied, although the details of the sequence are dependent on the particular system in question. It is possible that the Farey sequence arises from the underlying dynamics of an even-lower dimensional system, one corresponding to a circle map that can be extracted from the Poincaré section. Again, the reduction in dimensionality reveals a simpler system (simpler in the sense of having a smaller number of variables) that can explain very complex behavior.
Circle Maps

A circle map can be easily defined in terms of the Poincaré section of the torus (see Figure 28). The elliptical curve in the Poincaré section is actually a large number of individual points, each corresponding to the place the quasiperiodic trajectory pierces through the surface of section. An arbitrary zero point can be chosen (see Figure 28), and an angle, θ_n, can be defined to label each position, that is, the nth point, in the section. The Poincaré cross section is constructed by allowing an actual trajectory to create a sequence of points that can be numbered in the order in which they appear. Hence, a sequence of
Figure 28 Definition of the angle θ_i from the Poincaré section of a torus attractor derived from experimental data. The index i labels the order in which points appear in this section as the trajectory winds its way over the surface of the torus. This definition can be generalized to a wrinkled or fractal torus.
angles, {θ_1, θ_2, θ_3, . . . }, can be defined. From this sequence, we can create a map, much as the Poincaré map was created before, by plotting θ_{n+1} versus θ_n. This map is known as the circle map of the system. An example taken from a particular chemical system73 is shown in Figure 29. Mathematical functions that approximate the form of the experimentally derived circle map shown in the figure have been extensively studied.74,75 Perhaps the most well known of these is referred to as the sine circle map, which is given by the following iterative relation:
θ_{n+1} = f(θ_n) = θ_n + Ω - (K/2π) sin(2πθ_n)                      [109]
Here, Ω and K are two parameters that determine the dynamics of this map. Under iteration, the variable θ_n may converge to a periodic series for which
θ_{n+Q} = θ_n + P                                                   [110]
In the Poincaré section, we will see Q points; the trajectory will follow a path such that the points that appear in the section go around a circle P times before the trajectory closes on itself, that is, before the points begin to repeat. We thus define the winding number,74 W, which is rational for the periodic case:

W = P/Q for the periodic case                                       [111]
Figure 29 A circle map derived from simulations with the DOP model of the peroxidase-oxidase reaction. (From Ref. 73; used with permission.)
The corresponding circle map for this case will consist of P points below the diagonal and Q total points. The accompanying graph [see Figure 30(a)] of the sine circle map for Ω = 0.2, K = 1.0 shows a periodic orbit with P = 1, Q = 8 for a winding number W = 1/8. The winding number may also be irrational if the trajectory is quasiperiodic:
W = q for the quasiperiodic case                                    [112]
where q is irrational. The graph [see Figure 30(b)] for Ω = 0.2, K = 0.9 shows a quasiperiodic trajectory. A general expression for the winding number which encompasses both the periodic and quasiperiodic cases is
W = lim_{n→∞} [f^n(θ) - θ]/n                                        [113]
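Equation [113] translates directly into code: iterate the map without wrapping θ back into [0, 1) and measure the mean advance per iteration. The sketch below uses the two parameter pairs of Figure 30; per the text, the first case should lock to the rational value 1/8, while the second should return an irrational (quasiperiodic) value.

```python
import numpy as np

def winding_number(Omega, K, theta0=0.1, n=200000):
    # iterate Eq. [109] without taking theta mod 1, then apply Eq. [113]
    theta = theta0
    for _ in range(n):
        theta = theta + Omega - (K / (2.0 * np.pi)) * np.sin(2.0 * np.pi * theta)
    return (theta - theta0) / n

print(winding_number(0.2, 1.0))   # mode-locked: ~0.125 = 1/8 [cf. Figure 30(a)]
print(winding_number(0.2, 0.9))   # quasiperiodic: no rational lock-in [Figure 30(b)]
```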
This definition allows us to consider chaotic behavior, which the sine circle map also displays. In the case of chaos, W will not converge for large n.

For the sine circle map, there is a tendency for the winding number W to lock in to a rational value P/Q over a range of Ω values when K is in the range 0 < K ≤ 1. The density of these mode-locked states in a graph of W versus Ω is low for low values of K and increases as K approaches 1. At the value K = 1 (the critical point) the mode-locked states fill up the entire Ω axis. The graph of
Figure 30 Trajectories generated from the sine circle map, Eq. [109], for (a) Ω = 0.2, K = 1.0, and (b) Ω = 0.2, K = 0.9. (From Ref. 74; used with permission.)
W for K = 1 is shown in Figure 31.75 This figure is known as a devil’s staircase because of its self-similar features. Notice that the blow-up of one small region of the staircase looks exactly like the entire staircase. An even higher resolution blow-up of a region between two small steps in the insert would, as well, resemble the entire staircase. This is the essence of self-similarity and is associated with the fact that the staircase has a fractal dimension. (The name “devil’s staircase” is an indication of the frustration one would feel in trying to climb
Figure 31 The winding number W versus Ω for the sine circle map at K = 1: a complete devil's staircase. (From Ref. 75.)
such a staircase by placing a foot on each step.) The graph shown is, again, for the critical case (K = 1) in which the mode-locked states fill up the entire Ω axis and is, therefore, called a complete devil's staircase. For K < 1, the circle map still generates staircases, but they are incomplete. The spaces between the steps correspond to quasiperiodic states which, in turn, correspond to irrational winding numbers.

Because the sine circle map is so simple, a large number of calculations over ranges of K and Ω values can be quickly carried out. A "phase diagram" indicating the regions over which mode-locked states and quasiperiodic states occur can be constructed. A useful phase diagram is a plot of the values of K and Ω over which the various behaviors occur; a sketch is shown in Figure 32.75 The mode-locked states are found to fall within what are called Arnol'd tongues, which come to a point on the K = 0 axis and widen out as K increases. The irrational winding numbers corresponding to quasiperiodic states fall along one-dimensional curves, not within wedges as the mode-locked states do. These curves extend from the K = 0 axis to a point terminating at K = 1. Above the K = 1 line, the Arnol'd tongues corresponding to the mode-locked states overlap and chaos is observed. Note that no quasiperiodic states are observed above the K = 1 line. For K < 1, quasiperiodic states are found over the entire extent of the Ω axis. By observing typical trajectories in the overlap region (above K = 1), it is found that the chaotic orbit jumps erratically from one mode-locked state to another, never settling for long into any one state because none is any more stable than another. This has been referred to as a "frustrated" response75 of the system and occurs because more than one attracting (stable) periodic orbit exists. Hence, the chaotic behavior that arises by the overlapping of Arnol'd tongues is fundamentally different from that which arises by the period-doubling route.
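A rough sketch of how a diagram like Figure 32 can be assembled numerically: scan the (Ω, K) plane and flag a point as mode-locked when the computed winding number is insensitive to a small change in Ω (i.e., the point sits on a step of the staircase). The grid sizes and thresholds below are arbitrary illustrative choices.

```python
import numpy as np

def winding_number(Omega, K, n=10000):
    theta = 0.1
    for _ in range(n):
        theta += Omega - (K / (2.0 * np.pi)) * np.sin(2.0 * np.pi * theta)
    return (theta - 0.1) / n

for K in (0.25, 0.75, 1.0):
    omegas = np.linspace(0.01, 0.99, 99)
    # on a mode-locked step, W does not respond to a small shift in Omega
    locked = sum(abs(winding_number(Om, K) - winding_number(Om + 1e-4, K)) < 1e-5
                 for Om in omegas)
    print(f"K = {K}: ~{100 * locked / len(omegas):.0f}% of the Omega axis mode-locked")
```

The locked fraction should grow toward the whole axis as K approaches 1, mirroring the widening of the Arnol'd tongues described above.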
Figure 32 Phase plane portrait showing Arnol’d tongues, which come to a point on the K = 0 axis and widen out as K increases. Within the wedge-shaped regions, mode-locked states exist. Above K = 1, these wedges overlap, leading to the possibility of chaos in this system. (Adapted from a similar figure in Ref. 75.)
Mixed-Mode Route to Chaos: Peroxidase-Oxidase Reaction Example

In realistic models, of which those arising in chemistry are good examples, the simple dynamics displayed by the sine circle map becomes more complex. It is possible that the sine circle map describes the dynamics of the system very close to and slightly beyond the transition to chaos, but that once one has gone well "into" the chaotic region in parameter space, this description no longer applies. A study73,76 of a fairly simple model of an enzyme reaction that exhibits chaotic behavior, the peroxidase-oxidase reaction, provides a good illustration of the role of circle map dynamics and mixed-mode oscillations in the transition to chaos. In the peroxidase-oxidase reaction,77 the peroxidase enzyme from horseradish (which, as its name implies, normally utilizes hydrogen peroxide as the electron acceptor) catalyzes an aerobic oxidation
2YH_2 + O_2 + 2H^+ → 2YH^+ + 2H_2O                                  [114]
where YH_2 is a general electron donor, usually nicotinamide adenine dinucleotide (NADH). Decreasing the enzyme concentration leads to a sequence of simple oscillatory dynamics followed by chaotic behavior and, finally, mixed-mode oscillations.78 The latter, again, refer to oscillations in which peaks differing vastly in amplitude are arranged in a repetitive pattern. Later
experiments with this reaction revealed a period-doubling route to chaos79 when a different component (a phenol additive) was varied. A model was proposed in 1979 by Degn, Olsen, and Perram80 to explain both the simple oscillations and the chaotic behavior that had been discovered a few years earlier. It is a four-variable model (now known as the DOP model) and is given by the following system of rate equations:
dA/dt = -k_1ABX - k_3ABY + k_7 - k_{-7}A
dB/dt = -k_1ABX - k_3ABY + k_8
dX/dt = k_1ABX - 2k_2X^2 + 2k_3ABY - k_4X + k_6                     [115]
dY/dt = -k_3ABY + 2k_2X^2 - k_5Y
where A is the concentration of dissolved O_2, B is the concentration of NADH, and X and Y are the concentrations of two intermediates. From many comparisons of simulations and experiment,77 it has been determined that X mimics the likely dynamics of a free radical species, NAD•, whereas Y corresponds to an enzyme-substrate complex known as compound III, which consists of a molecule of oxygen bound to a reduced form of the enzyme known as Per^2+. (The native enzyme is Per^3+.) The DOP model exhibits chaotic behavior in a certain range of parameter values. Typically, all parameters except k_1 are held fixed, and k_1 is treated as a bifurcation parameter; chaos is found in a certain range of parameter values. Variations in k_1 reproduce the experimental behavior observed when the enzyme concentration is changed, so that this rate constant can be thought of as being related to the enzyme catalyst concentration.

The chaotic dynamics in the DOP model are governed by a torus attractor that evolves through four distinct stages as the k_1 parameter is varied.73 These four stages are: (1) the undistorted torus; (2) the wrinkled torus; (3) the fractal torus; and (4) the broken torus. Poincaré sections of each stage of the attractor are shown in Figure 33. Whereas the fractal torus is difficult to distinguish from the wrinkled torus, the broken torus (stage 4) is immediately recognizable from its surface of section. The transition from wrinkled to fractal torus can, however, be clearly seen in the associated circle map. The circle map develops an inflection point (see Figure 34) at the transition from wrinkled to fractal torus. The existence of an inflection point means that the circle map is no longer invertible, that is, the circle map cannot be derived from a true two-dimensional torus. It also means that chaotic dynamics are now possible.81 The transition from stage 2 to stage 3 heralds the death of the two-dimensional torus and the transition to the possibility of chaotic dynamics. For the latter two stages of the torus [Figures 33(c) and 33(d)] the trajectory used to create the section shown is a chaotic one, a statement made on the basis of a positive value of the associated Lyapunov exponent. Nonchaotic, that
Figure 33 Evolution of a torus attractor found in simulations with the DOP model of the peroxidase-oxidase reaction, Eqs. [115]. Parameter values used are: k_2 = 1250, k_3 = 0.046875, k_5 = 1.104, k_6 = 0.001, k_7 = 0.89, k_{-7} = 0.1175, k_8 = 0.5, and (a) k_1 = 0.205, (b) k_1 = 0.17, (c) k_1 = 0.1634, and (d) k_1 = 0.1178. See text re Eqs. [115] for definition of axes. (From Ref. 73; used with permission.)
is, periodic, trajectories also exist over the range of k_1 values corresponding to stages 3 and 4. An example is shown in Figure 35. This is an example of a mixed-mode oscillation which, as can be clearly seen in the Poincaré section, corresponds to phase-locking on the broken torus. The mixed-mode oscillations are, again, periodic modes in which peaks of widely differing amplitudes
Figure 34 Circle maps derived from data of (a) Figure 33(c), and (b) Figure 33(d). The development of an inflection point in the circle map signals the onset of chaotic behavior. (From Ref. 73; used with permission.)
alternate in a given repetitive pattern. Because the amplitudes are well separated in size, it is possible to designate some as "large" amplitude (L) and others as "small" amplitude (S) and to use notation of the form L^S to indicate a mixed-mode state consisting of L large-amplitude oscillations alternating with S small-amplitude ones. A numerical value known as the rotation number, analogous to the winding number, can be assigned to each mixed-mode state.82 A general formula for
Figure 35 (a), (b) Phase-locked and (c) mixed-mode oscillations in the DOP model, Eqs. [115]. The oxygen time series for (a) k_1 = 0.193762205, (b) k_1 = 0.1621, and (c) k_1 = 0.1033 are shown on the left. (All other parameters are given in Figure 33.) The corresponding Poincaré sections in (d), (e), and (f) show the fractal nature of the torus corresponding to (c), the mixed-mode states. See text re Eqs. [115] for definition of axes. (From Ref. 83; used with permission.)

the rotation number, ρ, in terms of the angle θ_i of the ith point in the Poincaré section is

ρ = lim_{n→∞} (θ_n - θ_0)/(2πn)                                     [116]
Plots of these rotation numbers as a function of the bifurcation parameter k_1 take on a staircase form, although the staircases are much more complex than those seen in the sine circle map.73,83 Figure 36 shows a portion of the staircase
for a sequence of mixed-mode states of the form (L^S)^m (L^{S+1})^n, that is, a concatenation of the primary, that is, L^S and L^{S+1}, states. The general formula for the rotation number applies to these states as well as all others, and it is possible to show that it is given by ρ = (m + n)/[m(L + S) + n(L + S + 1)] for concatenated states of this form. Notice that the staircase involves a large number of concatenated states (not all are labeled in the figure) between the 5^{12}5^{11} and 5^{11} steps and, similarly, between the 5^{11}5^{10} and 5^{10} steps, but not between the 5^{11} and 5^{11}5^{10} steps. In fact, a close look at the latter interval (which always occurs at the right-hand end of the primary steps) shows that the 5^{11} and 5^{11}5^{10} steps overlap, resulting in an interval of hysteresis, that is, bistability, here. On the left-hand sides of the primary steps, such as the interval between the 5^{11}5^{10} and 5^{10} steps, a small staircase consisting of highly complex mixed-
Figure 36 Portions of the devil's staircase in the simulations described in Figures 33-35. (From Ref. 83; used with permission.)
mode states of the form 5^{11}(5^{10})^n can be found alternating with chaotic states. Figure 37 shows83 a bifurcation diagram formed by plotting the value of the variable A in this Poincaré section as a function of the bifurcation parameter k_1. The 5^{11}5^{10} state can be seen to go through a series of period-doubling bifurcations before the chaotic state appears. At the other end of the chaotic region, the 5^{11}5^{10}5^{10} state abruptly appears, with no evidence of a reverse
Figure 37 Bifurcation diagram for the simulations described in Figures 33-35, with a blow-up of the chaotic region showing the period-doubling cascade of the mixed-mode states. A_int values show maxima of oscillations in A(t). (From Ref. 83; used with permission.)
period-doubling cascade. This may be an example of a crisis transition, in which the strange attractor collides with its basin boundary and the system moves to a coexisting limit cycle attractor. This type of behavior is also seen in the familiar logistic map. This example shows that mixed-mode oscillations, while arising from a torus attractor that bifurcates to a fractal torus, give rise to chaos via the familiar period-doubling cascade in which the period becomes infinite and the chaotic orbit consists of an infinite number of unstable periodic orbits. Mixed-mode oscillations have been found experimentally in the Belousov-Zhabotinskii (BZ) reaction82,84 and other chemical oscillators85 and in electrochemical systems,86 as well. Studies of a three-variable autocatalator model have also provided insights into the relationship between period-doubling and mixed-mode sequences.87 Whereas experiments on the peroxidase-oxidase reaction have not been carried out to determine whether the route to chaos exemplified by the DOP model occurs experimentally, the DOP simulations exhibit a route to chaos that is probably widespread in the realm of nonlinear systems and is, therefore, quite possible in the peroxidase reaction, as well.
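A minimal sketch of integrating the DOP model as reconstructed in Eqs. [115], using a stiff solver. The rate constants follow the Figure 33 caption; k_4 = 20 and the initial conditions are assumptions (they are not legible in the source), so the sketch illustrates the workflow rather than reproducing a specific published figure.

```python
import numpy as np
from scipy.integrate import solve_ivp

# rate constants per the Figure 33 caption; k4 = 20.0 is an assumption
k2, k3, k4, k5 = 1250.0, 0.046875, 20.0, 1.104
k6, k7, km7, k8 = 0.001, 0.89, 0.1175, 0.5

def dop(t, u, k1):
    A, B, X, Y = u
    v1, v3 = k1 * A * B * X, k3 * A * B * Y
    return [-v1 - v3 + k7 - km7 * A,                        # dA/dt
            -v1 - v3 + k8,                                  # dB/dt
            v1 - 2.0 * k2 * X * X + 2.0 * v3 - k4 * X + k6, # dX/dt
            2.0 * k2 * X * X - v3 - k5 * Y]                 # dY/dt

sol = solve_ivp(dop, (0.0, 3000.0), [6.0, 20.0, 0.0, 0.0],  # guessed u(0)
                args=(0.1178,), method="LSODA", rtol=1e-8, atol=1e-12)

# classify maxima of the O2 trace A(t) as large or small, as a crude way
# of reading off the L/S pattern of a mixed-mode state
A = sol.y[0][sol.t > 1500.0]
peaks = A[1:-1][(A[1:-1] > A[:-2]) & (A[1:-1] > A[2:])]
print(f"{(peaks > peaks.mean()).sum()} large and "
      f"{(peaks <= peaks.mean()).sum()} small maxima in the sampled window")
```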
NUMERICAL ANALYSIS OF EXPERIMENTAL DATA

Reconstruction of Phase Portraits

One practical issue that arises immediately when experimental data are analyzed is the necessity of constructing a phase space to generate a portrait of the attractor. Sometimes it is not feasible or desirable to measure, simultaneously, more than one of the dynamical variables. Recall that simultaneous measurements are necessary for the construction of attractors, and the full phase space for any given system will consist of all the dynamical variables and their velocities or derivatives. At least two such variables are needed to look at projections of the attractor in a lower dimensional phase space, so the measurement of only one dynamical variable does not seem, at first glance, to be enough for this task. However, a very useful technique for reconstructing the attractor in the absence of multiple measured variables exists.88,89 An adequate representation of the attractor can be created by measuring just a single dynamical variable and creating an n-dimensional phase space using the method of time delays. This method rests on the topological equivalence of portraits constructed from time-delayed readings to those constructed with derivatives. An illustration of this method is found in an investigation of the Belousov-Zhabotinskii (BZ) reaction carried out by Hudson and Mankin.90 The BZ reaction involves the bromination of an organic acid and proceeds through
a cycle of oxidations and reductions. The oscillations are revealed, usually, by a transition metal redox indicator couple. The bromide ion is an important intermediate in this system, and its concentration can be monitored directly with a bromide-selective electrode. Many ionic species are involved in the reaction, but early observations indicated that the voltage reading from a platinum electrode inserted into the reacting medium responded in synchrony with the observed color changes and, thus, provided a probe of the state of the reaction mixture. Hudson and Mankin showed90 that phase portraits constructed by plotting either of the two following collections of three variables were topologically equivalent: (1) the Br^- electrode reading, the Pt electrode reading, and the time derivative of the Pt electrode reading; or (2) the Br^- electrode reading, the Pt electrode reading, and the Pt reading at a fixed time delay.
So, the time-delay reconstruction technique yields a phase space portrait that is equivalent to that defined in the usual way (variables and their time derivatives). It is now considered standard to treat the time-delay reconstructed phase portrait as the actual phase space portrait.
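A minimal sketch of the method of time delays, with the Rössler x-variable standing in for the single measured variable (e.g., the Pt electrode reading); the delay of one time unit is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rossler(t, u, a=0.2, b=0.2, c=5.0):
    x, y, z = u
    return [-y - z, x + a * y, b + z * (x - c)]

dt = 0.05
t = np.arange(0.0, 500.0, dt)
x = solve_ivp(rossler, (0.0, 500.0), [-3.0, 2.0, 0.0],
              t_eval=t, rtol=1e-9).y[0]       # the lone "measured" series

def delay_embed(series, dim, lag):
    # stack time-shifted copies of the scalar series into embedding vectors
    n = len(series) - (dim - 1) * lag
    return np.column_stack([series[i * lag: i * lag + n] for i in range(dim)])

u = delay_embed(x, dim=3, lag=int(round(1.0 / dt)))
print(u.shape)   # each row is one point of the reconstructed attractor
```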
Calculation of the Correlation Dimension

Once the reconstructed portrait is found, the same types of analyses as were described for the simulated data (that is, construction of Poincaré sections and Poincaré maps, calculation of the fractal dimension, etc.) can be carried out. Of these, the calculation of the fractal dimension is often of interest, although, as cautioned earlier, knowledge of this number cannot, alone, distinguish chaotic data from nonchaotic data. An understanding of the route by which the suspected chaotic state arises is also necessary before a definitive statement can be made. Nevertheless, the determination of the fractal dimension from a data set thought to be chaotic is often of interest. A number of different dimensions exist in the literature, including the Hausdorff dimension, the information dimension, the correlation dimension, and the Lyapunov dimension. Which of these is the true fractal dimension? Of the ones in this list, the information dimension, D_I, has the most basic and fundamental definition, so we often think of it as the "true" fractal dimension. Because the information dimension is impractical to calculate directly, however, most investigators have taken to finding the correlation dimension, D_C, as an estimate of the fractal dimension. Grassberger and Procaccia published91 a straightforward and widely used algorithm for the calculation of the correlation dimension. On the other hand, the Lyapunov
Figure 38 Definition of the distance |u_i - u_j| used in determining the correlation sum, Eq. [118], for the correlation dimension.
dimension, D_L (related to the Lyapunov exponents), is also straightforward to calculate using the widely available algorithm of Wolf et al.92 It turns out that these two dimensions provide an upper and a lower bound93 for the desired information dimension:

D_C ≤ D_I ≤ D_L                                                     [117]
Reference 93 can be consulted for a full explanation of the differences between the dimensions used to estimate the fractal dimension of an attractor.

Because the Grassberger-Procaccia algorithm is the most widely used for the calculation of the correlation dimension, we briefly review it here. As discussed above, a phase space portrait must first be reconstructed from the measured data. The resulting attractor consists of a set of ordered points, {u_1, u_2, . . . , u_i, . . . }, where u_i ≡ (x(t_i), y(t_i), z(t_i)) is the ith point in the trajectory from which the attractor has been reconstructed. The correlation dimension is calculated by comparing the distance between any two points on the attractor (see Figure 38) with a given small distance, ε, which will be varied. The number of pairs of points, (u_i, u_j), that both fall inside a ball of diameter ε is counted. The correlation sum C(ε) is the average number of point pairs inside balls of diameter ε spread over the attractor:
C(ε) = (1/N^2) Σ_{i=1}^{N} Σ_{j=i+1}^{N} H(ε - |u_i - u_j|)         [118]
Here, the function H is the Heaviside step function, which is zero if ε < |u_j - u_i| and one if ε > |u_j - u_i|. The correlation sum is found to follow a simple power-law form which can be used to determine the correlation dimension, D_C:

C(ε) = ε^{D_C}                                                      [119]
Taking the natural logarithm of both sides of this equation yields

ln C(ε) = D_C ln(ε)                                                 [120]
Therefore, a plot of ln C(ε) versus ln(ε) for various values of ε will have a linear form as long as the above power-law relation holds. In practice, a plot of ln C(ε) versus ln(ε) is usually not linear. However, there is often a range of values at intermediate ε for which the power-law form is valid. The slope of this middle portion is identified as the correlation dimension. It is sometimes found that the slope of the graph varies with parameters such as the embedding dimension used in the phase space reconstruction. In fact, the correlation dimension calculation can be used to determine the appropriate embedding dimension for reconstructing the attractor of the experimental system. A plot of the slopes D_C for different values of the embedding dimension, d, should show a convergence of D_C values at a particular d. The value of d at which convergence is achieved is considered to be the actual dimension of the attractor. The correlation dimension calculation thus gives some insight into the inherent dimensionality of the experimental system.
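A compact sketch of the procedure of Eqs. [118]-[120]. The Hénon map supplies a small test attractor in place of reconstructed experimental data (its correlation dimension is known to be roughly 1.2); the ε range and the fitting window are the usual judgment calls mentioned above.

```python
import numpy as np
from scipy.spatial.distance import pdist

# points on the Henon attractor, standing in for a reconstructed portrait
pts, x, y = [], 0.1, 0.1
for i in range(5500):
    x, y = 1.0 - 1.4 * x * x + y, 0.3 * x
    if i >= 500:                        # discard the transient
        pts.append((x, y))
u = np.array(pts)

d = pdist(u)                            # all pair distances |u_i - u_j|, i < j
eps = np.logspace(-3.0, 0.0, 16)
C = np.array([(d < e).mean() for e in eps])   # correlation sum, Eq. [118]

# fit ln C vs ln eps only over the intermediate, power-law region (Eq. [120])
mid = slice(4, 12)
Dc = np.polyfit(np.log(eps[mid]), np.log(C[mid]), 1)[0]
print(f"correlation dimension D_C ~ {Dc:.2f}")
```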
Lyapunov Exponents

As described in a previous section, the Lyapunov exponents are a generalized measure of the growth or decay of perturbations that might be applied to a given dynamical state; they are identical to the stability eigenvalues for a steady state and the Floquet exponents for a limit cycle. For aperiodic motion at least one of the Lyapunov exponents will be positive, so it is generally sufficient to calculate just the largest Lyapunov exponent.

An algorithm due to Wolf et al.92 is the most widely used for calculating Lyapunov exponents. It can be applied to a reconstructed phase portrait or to one found by measuring more than one dynamical variable. Two nearby points, A and B, that are not part of the same orbit around the attractor are located; the latter criterion can be ensured by not considering the first m points in the time series after the selection of one of the two points. The evolution of the two points can be followed, since the time series is available and it is known which points on the attractor follow in time. In a chaotic system, the initial distance
between the points A and B, L_0, will diverge and may reach a separation L_0′ after a length of time Δt has elapsed; that is, if |B(t_0) - A(t_0)| = L_0, then |B(t_0 + Δt) - A(t_0 + Δt)| = L_0′ > L_0. These two distances will, in general, be related by

L_0′ = L_0 2^{λ_1 Δt}                                               [121]
where λ_1 is the maximum Lyapunov exponent. We can solve this equation for λ_1 by taking the base-2 logarithm of both sides and rearranging:

λ_1 = (1/Δt) log_2(L_0′/L_0)                                        [122]
In practice, it is necessary to make many such measurements and average the results. In the Wolf algorithm, one trajectory is followed for the entire calculation, whereas the second point is picked anew after multiples of Δt have elapsed (see Figure 39). If M distances are measured, {L_0, L_1, L_2, . . . , L_M}, the average maximum Lyapunov exponent can be calculated from

λ_1 = (1/(M Δt)) Σ_{i=1}^{M} log_2(L_i′/L_{i-1})                    [123]
For chaotic behavior, at least one of the Lyapunov exponents would be expected to be positive. In practice, it is difficult to find the entire spectrum of Lyapunov exponents from experimental data, because it is not always clear how many dependent variables are involved, and the number of Lyapunov exponents is equal to the number of dependent variables. For models, however, it is possible to compute the complete spectrum of Lyapunov exponents.
Figure 39 Definition of the distances L_i and L_i′ for i = 0, 1, 2, . . . used in the calculation of the largest Lyapunov exponent from Eq. [123].
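A simplified sketch in the spirit of Eqs. [121]-[123]: it picks a fresh neighbor at each renormalization rather than following the Wolf et al. replacement rules exactly, so it should be read as an illustration, not as that algorithm. Points on the Hénon map again stand in for data; its largest exponent is known to be roughly 0.6 bits per iteration.

```python
import numpy as np

def henon_points(n, x=0.1, y=0.1):
    out = []
    for _ in range(n + 500):
        x, y = 1.0 - 1.4 * x * x + y, 0.3 * x
        out.append((x, y))
    return np.array(out[500:])          # transient removed

u = henon_points(20000)
dt, exclude = 1, 10                     # evolution time; temporal exclusion
logs, i = [], 0
while i < len(u) - dt - 1:
    d = np.linalg.norm(u - u[i], axis=1)
    d[max(0, i - exclude): i + exclude + 1] = np.inf  # not the same orbit segment
    d[len(u) - dt:] = np.inf                          # neighbor must be evolvable
    j = int(np.argmin(d))
    L0 = d[j]                                         # initial separation
    L1 = np.linalg.norm(u[i + dt] - u[j + dt])        # evolved separation
    if L0 > 0.0:
        logs.append(np.log2(L1 / L0))                 # one term of Eq. [123]
    i += 50                                           # move along the fiducial orbit
lam = np.mean(logs) / dt
print(f"largest Lyapunov exponent ~ {lam:.2f} bits per iteration")
```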
If these exponents are arranged in decreasing order such that λ_1 is the largest (most positive), λ_2 is the next largest, and so forth, then the Lyapunov dimension, D_L, is defined as

D_L = j + (λ_1 + λ_2 + · · · + λ_j)/|λ_{j+1}|                       [124]
Here, j is the maximum integer for which λ_1 + λ_2 + · · · + λ_j ≥ 0; that is, j is the number of positive Lyapunov exponents in the spectrum. An example of a calculation of the Lyapunov exponents and dimension, D_L, for a simple four-variable model of the peroxidase-oxidase reaction will help to clarify these general definitions. The following material is adapted from the presentation in Ref. 94. As described earlier, the Lyapunov dimension and the correlation dimension, D_C, serve as upper and lower bounds, respectively, to the fractal dimension of the strange attractor. The simple four-variable model is similar to the Degn-Olsen-Perram (DOP) model discussed in a previous section but was suggested by L. F. Olsen95 a few years after the DOP model was introduced. It remains the simplest model of the peroxidase-oxidase reaction that is consistent with most experimental observations about this reaction. The rate equations for this model are:
dA/dt = k_7(A_0 - A) - k_3ABY
dB/dt = k_8 - k_1BX - k_3ABY
dX/dt = k_1BX - 2k_2X^2 + 3k_3ABY - k_4X + k_6X_0                   [125]
dY/dt = 2k_2X^2 - k_3ABY - k_5Y
where the concentration variables A, B, X, and Y have the same meanings as they do in the DOP model [A = O_2, B = NADH, X = NAD•, Y = compound III]. Simulations with this model show a period-doubling sequence into chaos as the rate constant k_3 is varied.95 Table 4 shows values of the three largest Lyapunov exponents for various values of k_3. Above k_3 = 0.033, chaotic behavior is observed, and this is reflected in a sudden jump in the Lyapunov dimension from 1.00 (indicative of a limit cycle) to a fractional value larger than 2 (indicative of a strange attractor). The correlation dimension, D_C, was also computed for this system and was found to be 2.02 for k_3 = 0.036, as an example. Comparing the values of D_C and D_L for this value of k_3, we can estimate the fractal dimension, D_F, of the strange attractor as 2.02 ≤ D_F ≤ 2.10.
Table 4 Lyapunov Exponents for the Olsen Model^a

k_3       λ_1       λ_2       λ_3       D_L
0.025     0.000     -0.69     -0.72     1.00
0.033     0.000     -0.05     -3.40     1.00
0.034     0.447      0.00     -1.93     2.23
0.035     0.444      0.00     -2.99     2.15
0.036     0.361      0.00     -3.62     2.10

^aRate constant values: k_1 = 0.35, k_2 = 2.5 × 10^2, k_4 = 20.0, k_5 = 3.35, k_6X_0 = 10^-5, k_7 = 0.1, k_8 = 0.825, A_0 = 8.0.
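Equation [124] can be checked directly against the exponent spectra in Table 4; a minimal sketch:

```python
def lyapunov_dimension(spectrum):
    """Eq. [124]: D_L = j + (lambda_1 + ... + lambda_j)/|lambda_(j+1)|,
    with j the largest index for which the partial sum is still >= 0."""
    s = sorted(spectrum, reverse=True)
    j, partial = 0, 0.0
    while j < len(s) and partial + s[j] >= 0.0:
        partial += s[j]
        j += 1
    if j == 0:
        return 0.0
    if j == len(s):
        return float(j)              # every partial sum non-negative
    return j + partial / abs(s[j])

print(lyapunov_dimension([0.447, 0.00, -1.93]))   # ~2.23, as in Table 4
print(lyapunov_dimension([0.361, 0.00, -3.62]))   # ~2.10
print(lyapunov_dimension([0.000, -0.69, -0.72]))  # 1.00 (limit cycle)
```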
CONCLUSIONS

In this chapter we have reviewed the basic concepts and tools used in the study of complex phenomena, emphasizing that certain universal behaviors have been observed in entirely unrelated systems, and that the understanding of such behavior in one type of system (such as population biology) can be fruitfully applied to the understanding of similar behavior in another type of system (such as an inorganic chemical reaction). Bistability, oscillations, chaos, Turing patterns, and propagating fronts are just a few examples of the universal phenomena found in chemical systems and in many other settings.

Computational techniques are centrally important at every stage of investigation of nonlinear dynamical systems. We have reviewed the main theoretical and computational tools used in studying these problems; among these are bifurcation and stability analysis, numerical techniques for the solution of ordinary differential equations and partial differential equations, continuation methods, coupled lattice and cellular automata methods for the simulation of spatiotemporal phenomena, geometric representations of phase space attractors, and the numerical analysis of experimental data through the reconstruction of phase portraits, including the calculation of correlation dimensions and Lyapunov exponents from the data.

Investigation of complex phenomena has been advanced by the development of techniques that emphasize a geometric description of the dynamics and that take a systems approach to the investigation of phenomena. The use of analytical and computational tools has played a central role in the evolution of this field of study, and theoretical developments have often preceded, and even driven, the experimental investigations. Chemistry has provided important examples and model systems for the development of these computational techniques, and many of the insights achieved through the study of nonlinear chemical systems have application to problems beyond chemistry. Conversely, the insights gained by studying nonchemical systems, such as population biology, genetic transport, fluid dynamics, etc., have been successfully applied to
266 Computational Studies in Nonlinear Dynamics the understanding of nonlinear chemical systems as well. In all of these instances of fruitful interdisciplinary communication, it has been found that the use of mathematical descriptions and numerical simulations has greatly facilitated the exchange of insights across disciplinary boundaries.
ACKNOWLEDGMENTS

The authors would like to thank Drs. D. Horváth, Á. Tóth, and C. Steinmetz for providing material from their PhD theses. Financial support came from the National Science Foundation.
REFERENCES

1. P. W. Anderson, Science, 177, 393 (1972). More Is Different.
2. S. S. Schweber, Physics Today, November 1993, pp. 34-40. Physics, Community and the Crisis in Physical Theory.
3. A. M. Turing, Philos. Trans. R. Soc. Lond., B237, 37 (1952). The Chemical Basis of Morphogenesis.
4. V. Castets, E. Dulos, J. Boissonade, and P. De Kepper, Phys. Rev. Lett., 64, 2953 (1990). Experimental Evidence of a Turing Stationary Structure.
5. Recent studies by Ross and co-workers have shown that, in special cases, multiple equilibrium states are possible. See X.-L. Chu and J. Ross, J. Chem. Phys., 93, 1613 (1990). Complex Kinetics of Systems with Multiple Stationary States at Equilibrium.
6. G. Nicolis and I. Prigogine, Self-Organization in Nonequilibrium Systems, Wiley, New York, 1977.
7. R. J. Field and M. Burger, Eds., Oscillations and Traveling Waves in Chemical Systems, Wiley, New York, 1985.
8. P. Gray and S. K. Scott, Chemical Oscillations and Instabilities: Non-linear Chemical Kinetics, Clarendon Press, Oxford, 1990.
9. S. K. Scott, Chemical Chaos, Clarendon Press, Oxford, 1991.
10. E. C. Zimmermann, M. Schell, and J. Ross, J. Chem. Phys., 81, 1327 (1984). Stabilization of Unstable States and Oscillatory Phenomena in an Illuminated Thermochemical System: Theory and Experiment. J. Kramer and J. Ross, J. Chem. Phys., 83, 6234 (1985). Stabilization of Unstable States, Relaxation, and Critical Slowing Down in a Bistable System. R. H. Harding and J. Ross, J. Chem. Phys., 92, 1936 (1990). Experimental Measurement of the Relative Stability of Two Stationary States in Optically Bistable ZnSe Interference Filters.
11. J.-P. Laplante, J. Phys. Chem., 93, 3882 (1989). Stabilization of Unstable States in the Bistable Iodate-Arsenous Acid Reaction in a Continuous Flow Stirred Tank Reactor.
12. V. Petrov, V. Gáspár, J. Masere, and K. Showalter, Nature, 361, 240 (1993). Controlling Chaos in the Belousov-Zhabotinskii Reaction.
13. N. Ganapathisubramanian and K. Showalter, J. Chem. Phys., 84, 5427 (1986). Relaxation Behavior in a Bistable Chemical System Near the Critical Point and Hysteresis Limits.
14. N. Ganapathisubramanian and K. Showalter, J. Chem. Phys., 80, 4177 (1984). Bistability, Mushrooms and Isolas.
15. M. Heinrichs and F. W. Schneider, J. Phys. Chem., 85, 2112 (1981). Relaxation Kinetics of Steady States in the Continuous Flow Stirred Tank Reactor. Response to Small and Large Perturbations: Critical Slowing Down.
16. J. D. Murray, Mathematical Biology, Springer-Verlag, New York, 1990.
17. J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Springer-Verlag, New York, 1983. See also S. H. Strogatz, Nonlinear Dynamics and Chaos, Addison-Wesley, Reading, MA, 1994.
18. J. H. Hubbard and B. H. West, MacMath 9.2: A Dynamical Systems Software Package for the Macintosh, Springer-Verlag, New York, 1993.
19. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 2nd edit., Cambridge University Press, Cambridge, 1988.
20. C. W. Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
21. T. S. Parker and L. O. Chua, Practical Numerical Algorithms for Chaotic Systems, Springer-Verlag, New York, 1989.
22. M. Marek and I. Schreiber, Chaotic Behaviour of Deterministic Dissipative Systems, Cambridge University Press, Cambridge, 1991.
23. M. Kubicek and M. Marek, Computational Methods in Bifurcation Theory and Dissipative Structures, Springer-Verlag, New York, 1983.
24. E. J. Doedel, Congress. Num., 30, 265 (1981). AUTO: Software for Continuation and Bifurcation Problems in Ordinary Differential Equations. E. J. Doedel and J.-P. Kernevez, Applied Mathematics, California Institute of Technology, Pasadena, 1986.
25. V. Castets, E. Dulos, J. Boissonade, and P. De Kepper, Phys. Rev. Lett., 64, 2953 (1990). Experimental Evidence of a Sustained Standing Turing-Type Nonequilibrium Chemical Pattern.
26. K. Agladze, E. Dulos, and P. De Kepper, J. Phys. Chem., 96, 2400 (1992). Turing Patterns in Confined Gel and Gel-Free Media.
27. Q. Ouyang and H. L. Swinney, Nature, 352, 610 (1991). Transition from a Uniform State to Hexagonal and Striped Turing Patterns.
28. Q. Ouyang and H. L. Swinney, Chaos, 1, 411 (1991). Transition to Chemical Turbulence.
29. Q. Ouyang, Z. Noszticzius, and H. L. Swinney, J. Phys. Chem., 96, 6773 (1992). Spatial Bistability of Two-Dimensional Turing Patterns in a Reaction-Diffusion System.
30. I. Lengyel, S. Kádár, and I. R. Epstein, Science, 259, 493 (1993). Transient Turing Structures in a Gradient-Free Closed System.
31. J. Ross, A. P. Arkin, and S. C. Müller, J. Phys. Chem., 99, 10417 (1995). Experimental Evidence for Turing Structures.
32. G. H. Markstein, Ed., Nonsteady Flame Propagation, Pergamon Press, Elmsford, New York, 1967.
33. L. Edelstein-Keshet, Mathematical Models in Biology, Random House, New York, 1988.
34. D. Horváth, PhD Thesis, West Virginia University, 1994. Instabilities in Reaction-Diffusion Fronts.
35. P. Gray and S. K. Scott, Chem. Eng. Sci., 39, 1087 (1984). Autocatalytic Reactions in the Isothermal, Continuous Stirred-Tank Reactor: Oscillations and Instabilities in the System A + 2B → 3B; B → C.
36. J. Schnakenberg, J. Theor. Biol., 81, 389 (1979). Simple Chemical Reaction Systems with Limit Cycle Behavior.
37. V. Dufiet and J. Boissonade, J. Chem. Phys., 96, 664 (1992). Conventional and Unconventional Turing Patterns.
38. R. Luther, Z. Elektrochem., 12, 596 (1906). Propagation of Chemical Reactions in Space.
39. R. Arnold, K. Showalter, and J. J. Tyson, J. Chem. Educ., 64, 740 (1987). Propagation of Chemical Reactions in Space. [An English Translation of the 1906 Article (Ref. 38) by R. Luther]. K. Showalter and J. J. Tyson, J. Chem. Educ., 64, 742 (1987). Luther's 1906 Discovery and Analysis of Chemical Waves.
40. R. A. Fisher, Ann. Eugenics, 7, 355 (1937). The Wave of Advance of Advantageous Genes.
41. A. Kolmogorov, I. Petrovsky, and N. Piscounoff, Bull. Univ. Moscow, Ser. Int., Sect. A, 1, 1 (1937). Étude de l'équation de la diffusion avec croissance de la quantité de matière et son application à un problème biologique.
42. P. Fife, Lecture Notes in Biomathematics, Vol. 28, Springer-Verlag, New York, 1979.
43. S. K. Scott and K. Showalter, J. Phys. Chem., 96, 8702 (1992). Simple and Complex Reaction-Diffusion Fronts.
44. K. Showalter, Nonlinear Science Today, 4, 1 (1995). Quadratic and Cubic Reaction-Diffusion Fronts.
45. P. Gray, S. K. Scott, and K. Showalter, Philos. Trans. R. Soc. Lond., A337, 249 (1991). The Influence of the Form of Autocatalysis on the Speed of Chemical Waves.
46. A. Hanna, A. Saul, and K. Showalter, J. Am. Chem. Soc., 104, 3838 (1982). Detailed Studies of Propagating Fronts in the Iodate Oxidation of Arsenous Acid.
47. A. Saul and K. Showalter, in Oscillations and Traveling Waves in Chemical Systems, R. J. Field and M. Burger, Eds., Wiley, New York, 1985, pp. 419-440. Propagating Reaction-Diffusion Fronts.
48. P. Gray, K. Showalter, and S. K. Scott, J. Chim. Phys., 84, 1329 (1987). Propagating Reaction-Diffusion Fronts with Cubic Autocatalysis: The Effects of Reversibility.
49. J. Billingham and D. J. Needham, Philos. Trans. R. Soc. Lond., A334, 1 (1991). The Development of Travelling Waves in Quadratic and Cubic Autocatalysis with Unequal Diffusion Rates. I. Permanent Form Travelling Waves.
50. D. Horváth, V. Petrov, S. K. Scott, and K. Showalter, J. Chem. Phys., 98, 6332 (1993). Instabilities in Propagating Reaction-Diffusion Fronts.
51. Á. Tóth, PhD Thesis, West Virginia University, 1994. Chemical Waves: The Effects of Geometrical Constraints.
52. D. Barkley, Physica D, 49, 61 (1991). A Model for Fast Computer Simulation of Waves in Excitable Media.
53. D. A. Vasquez, J. Comput. Chem., 13, 570 (1992). Locally Implicit Solution of a Reaction-Diffusion System with Stiff Kinetics.
54. V. Gáspár, J. Maselko, and K. Showalter, Chaos, 1, 435 (1991). Transverse Coupling of Chemical Waves.
55. J. J. Tyson and P. C. Fife, J. Chem. Phys., 73, 2224 (1980). Target Patterns in a Realistic Model of the Belousov-Zhabotinskii Reaction.
56. R. M. Noyes and R. J. Field, Acc. Chem. Res., 10, 273 (1977). Mechanisms of Chemical Oscillators: Experimental Examples.
57. W. H. Beyer, Ed., CRC Handbook of Mathematical Sciences, CRC Press, Boca Raton, FL, 1987.
58. R. Kapral, personal communication, 1994.
59. R. Q. Topper, this volume. Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics.
60. E. N. Lorenz, J. Atmos. Sci., 20, 130 (1963). Deterministic Nonperiodic Flow.
61. E. N. Lorenz, The Essence of Chaos, University of Washington Press, Seattle, WA, 1993.
62. T.-Y. Li and J. A. Yorke, Am. Math. Monthly, 82, 985 (1975). Period 3 Implies Chaos.
63. I. Stewart, Does God Play Dice? The Mathematics of Chaos, Blackwell, Cambridge, MA, 1989.
64. M. J. Feigenbaum, Los Alamos Science, 1, 4 (1980). Universal Behavior in Nonlinear Systems.
65. R. L. Devaney, A First Course in Chaotic Dynamical Systems: Theory and Experiment, Addison-Wesley, Reading, MA, 1992.
66. R. M. May, Nature, 261, 459 (1976). Simple Mathematical Models with Very Complicated Dynamics.
67. O. E. Rössler, Phys. Lett., A57, 397 (1976). An Equation for Continuous Chaos.
68. L. F. Olsen and H. Degn, Q. Rev. Biophys., 18, 165 (1985). Chaos in Biological Systems.
69. L. D. Landau and E. M. Lifshitz, Fluid Mechanics, Oxford University Press, New York, 1959.
70. A. Brandstäter and H. L. Swinney, Phys. Rev. Lett., 65, 1523 (1990). Strange Attractors in Weakly Turbulent Couette-Taylor Flow.
71. S. Newhouse, D. Ruelle, and F. Takens, Commun. Math. Phys., 64, 35 (1978). Occurrence of Strange Axiom A Attractors Near Quasiperiodic Flows on Tᵐ, m ≥ 3.
72. G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 4th edit., Clarendon Press, Oxford, 1954.
73. C. G. Steinmetz and R. Larter, J. Chem. Phys., 94, 1388 (1991). The Quasiperiodic Route to Chaos in a Model of the Peroxidase-Oxidase Reaction.
74. M. Hogh Jensen, P. Bak, and T. Bohr, Phys. Rev. Lett., 50, 1637 (1983). Complete Devil's Staircase, Fractal Dimension, and Universality of Mode-Locking Structure in the Circle Map.
75. M. Hogh Jensen, P. Bak, and T. Bohr, Phys. Rev., A30, 1960 (1984). Transition to Chaos by Interaction of Resonances in Dissipative Systems. I. Circle Maps.
76. C. G. Steinmetz, PhD Thesis, Indiana University Purdue University Indianapolis, 1991. Chaos in the Peroxidase-Oxidase Reaction.
77. R. Larter, L. F. Olsen, C. G. Steinmetz, and T. Geest, in Chaos in Chemical and Biochemical Systems, R. J. Field and L. Györgyi, Eds., World Scientific Press, Singapore, 1993, pp. 175-224. Chaos in Biochemical Systems. The Peroxidase Reaction as a Case Study.
78. L. F. Olsen and H. Degn, Nature, 267, 177 (1977). Chaos in an Enzyme Reaction.
79. T. Geest, C. G. Steinmetz, R. Larter, and L. F. Olsen, J. Phys. Chem., 96, 5678 (1992). Period Doubling Bifurcations and Chaos in an Enzyme Reaction.
80. H. Degn, L. F. Olsen, and J. W. Perram, Ann. NY Acad. Sci., 316, 623 (1979). Bistability, Oscillations, and Chaos in an Enzyme Reaction.
81. D. G. Aronson, M. A. Chory, G. R. Hall, and R. P. McGehee, Commun. Math. Phys., 83, 303 (1982). Bifurcations from an Invariant Circle for Two-Parameter Families of Maps of the Plane: A Computer-Assisted Study.
82. J. Maselko and H. L. Swinney, J. Chem. Phys., 85, 6430 (1986). Complex Periodic Oscillations and Farey Arithmetic in the Belousov-Zhabotinskii Reaction.
83. R. Larter and C. G. Steinmetz, Philos. Trans. R. Soc., A337, 291 (1991). Chaos via Mixed-Mode Oscillations.
84. F. Argoul, A. Arneodo, P. Richetti, and J. C. Roux, J. Chem. Phys., 86, 3325 (1987). From Quasiperiodicity to Chaos in the Belousov-Zhabotinskii Reaction.
85. M. Orbán and I. Epstein, J. Phys. Chem., 86, 3907 (1982). Complex Periodic and Aperiodic Oscillation in the Chlorite-Thiosulfate Reaction.
86. F. N. Albahadily, J. Ringland, and M. Schell, J. Chem. Phys., 90, 813 (1989). Mixed-Mode Oscillations in an Electrochemical System. I. A Farey Sequence Which Does Not Occur on a Torus.
87. V. Petrov, S. K. Scott, and K. Showalter, J. Chem. Phys., 97, 6191 (1992). Mixed-Mode Oscillations in Chemical Systems.
88. F. Takens, Lect. Notes Math., 898, 366 (1981). Detecting Strange Attractors in Turbulence.
89. N. Packard, J. P. Crutchfield, J. D. Farmer, and R. S. Shaw, Phys. Rev. Lett., 45, 712 (1980). Geometry from a Time Series.
90. J. L. Hudson and J. C. Mankin, J. Chem. Phys., 74, 6171 (1981). Chaos in the Belousov-Zhabotinskii Reaction.
91. P. Grassberger and I. Procaccia, Physica D, 9, 189 (1983). Measuring the Strangeness of Strange Attractors.
92. A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, Physica D, 16, 285 (1985). Determining Lyapunov Exponents from a Time Series.
93. J. D. Farmer, E. Ott, and J. A. Yorke, Physica D, 7, 153 (1983). The Dimension of Chaotic Attractors.
94. T. Geest, L. F. Olsen, C. G. Steinmetz, R. Larter, and W. M. Schaffer, J. Phys. Chem., 97, 8431 (1993). Nonlinear Analyses of Periodic and Chaotic Time Series from the Peroxidase-Oxidase Reaction.
95. L. F. Olsen, Phys. Lett., 94A, 454 (1983). An Enzyme Reaction with a Strange Attractor.
CHAPTER 5
The Development of Computational Chemistry in the United Kingdom

Stephen J. Smith and Brian T. Sutcliffe

Department of Chemistry, University of York, York YO1 5DD, England, United Kingdom
INTRODUCTION

The development of computational chemistry in the United Kingdom (UK) is not easily separable from the developments in government funding for
computers in universities and the Research Councils. It is prudent, therefore, in looking at the developments in computational chemistry in the UK since the end of the Second World War, to be quite explicit about the governmental and quasi-governmental bodies that were involved. To put this in context, it is necessary to go back somewhat further. During the nineteenth century the number of universities in the UK began to increase. This was often a consequence of the civic pride of the growing and prosperous manufacturing towns. Many such towns petitioned the Crown for a Royal Charter to found a university, and many were successful. In practice the granting of a charter was a matter for the government of the day, though it was not usually a party political matter. But the finance of these institutions was a matter for the University Court and Council, established under the charter. The government did not get involved until 1889, when the temperance lobby in the UK succeeded in diverting some of the "whiskey money" to educational causes. This resulted in a sum of £15,000 being granted each year to the University
Colleges in England and Scotland. (It is perhaps appropriate to note here that Ireland was then a single country and was separate in educational matters from the rest of the UK. For present purposes, Northern Ireland is the relevant part of the UK, and it remains different in some respects from the rest of the UK, as does Scotland from England and, in more minor respects, Wales from England. But, to a good approximation, what is said about the UK in this chapter applies to the whole of it.) The money was distributed by the Committee on Grants to University Colleges in Great Britain, and by 1900 there were about 20,000 full-time students attending university, about 0.8% of the age group. University grants continued to increase and had reached some £180,000 a year by 1911. The First World War, however, placed such a strain on the UK higher education system that by 1919 the system was essentially bankrupt. Part of the reason for this was the increasing role science was playing in universities and, naturally, the increasing expense that this involved. The government of the day recognised that it was essential to keep the universities going, if only to provide scientists. This view arose, in part, from the scare that happened at the beginning of the war in 1914, when it was discovered that the UK had few facilities for the production of explosives and many other vital chemicals, because the effective chemical industry of the UK was in Germany. The government in 1915 set up the Department of Scientific and Industrial Research to coordinate and, if possible, guide scientific developments in the UK. In 1919 the government similarly set up the University Grants Committee (UGC) to coordinate and to suggest courses of action to the universities. Care was taken to try to preserve the autonomy of the universities in a manner consistent with the proper use of public funds. By 1924 there were about 42,000 full-time students in universities (about 1.5% of the age group). The UGC and the Department of Scientific and Industrial Research continued as important forces for the next 40 years. In 1949 the National Research and Development Corporation (NRDC) came into being, initially as an offshoot of the Department of Scientific and Industrial Research but later as an independent body. Its job was to stimulate commercial development from scientific discovery. In 1960 the Department of Scientific and Industrial Research was terminated and replaced by the Research Councils. In this chapter, these will be regarded as continuations in spirit of the Department of Scientific and Industrial Research. In 1966 the Computer Board was established. Its job was to behave like the UGC, but just for the provision of computing equipment for research and (later) teaching in universities. It also advised the Research Councils on computing needs. The Research Councils have recently (1994) been replaced by new bodies which, it is hoped, will be more sensitive to wealth creation and the like. The UGC and the Computer Board were terminated in the education changes that began in the UK in the 1980s. Following a period in which a transitional body, the University Funding Council, continued some of their functions, there are now regional funding agencies, carrying out government
educational policy, that provide what remains of university finance from central government. This includes computer provision that now tends to be lumped with much else and called Information Systems provision. These bodies deal with about twice the number of institutions that they did in 1980 because the government permitted a doubling of those institutions entitled to the name "university." During the 1970s the NRDC had become the British Technology Group, which was privatised during the 1980s. Neither the new research bodies nor the new funding agencies nor any successors to NRDC are important for this review. They are mentioned here, along with the institutional name changes, simply to warn against extrapolating in too linear a manner from this history to the future. For a view of the role of the UK government in computer development generally up to the early 1970s, a good, if dispiriting, account can be found in the book by Hendry.1 An equivalent study of the situation in the United States was published by Flamm.2
BEGINNINGS

The development of the digital computer in the UK owes much to the exigencies of World War II and so is quite like the United States experience. It is generally agreed3 that it was the need to be able to break the codes of the German forces that provided the most important impetus in the UK. This need led to the development of COLOSSUS by engineers of the Post Office Research Station at Dollis Hill under the leadership of T. H. (Tommy) Flowers. The functional requirements for the machine were drawn up by M. H. A. (Max) Newman, later professor of pure mathematics in the University of Manchester. The work was undertaken at the Government Code and Cipher School at Bletchley Park, Buckinghamshire. The Post Office engineers were also involved in the construction of a machine at the Telecommunications Research Establishment at Malvern in Worcestershire, and among those working there were F. C. (Freddie) Williams and T. (Tom) Kilburn, both of whom were later to work in the Department of Electrical Engineering in the University of Manchester. Among those working in Bletchley Park was Alan Turing, who has since become something of a romantic figure by virtue of a full biography,4 and as a character in a play (Breaking the Code) which had some success both in London and on Broadway. He was later to go to Manchester, where he was both influential in developments in coding that originated there and extremely irritating to those actually involved in the building of the machine. COLOSSUS was not strictly a stored program machine, for the program was set by wiring plug-boards. The machine contained over 1500 thermionic valves (vacuum tubes), and it surprised everyone by its reliability and speed. It did not hold its data internally either, but on a loop of paper tape that was reread as necessary. Although the secrecy, required by the state for security
purposes, prevented the knowledge gained from COLOSSUS being exploited openly, some of those involved went to work after the war ended to try to develop stored program machines, reassured by their wartime experiences that such machines were technically feasible. The obvious need was to develop some economic storage technology on which a program and the data could be retained and modified and from which they could be read, more quickly and effectively than was possible using plug-boards and paper-tape loops. It should perhaps be stressed that the theory of computing was perfectly well understood in terms of the earlier work of Turing, Church, Post, and von Neumann (much of this work is described in an entertaining manner in Hodges' book4) and that it was the practical problems that proved so daunting. The three kinds of storage devices that were important in the UK were the delay line, the Williams tube, and the magnetic drum. The delay line and the Williams tube were what today would be called fast-access (main) store, and the drum was what would now be called backing store. In a delay line the information is held as a train of impulses continuously circulating around a closed path. The time taken for an impulse to circulate was arranged to be longer than the time taken by the electrical impulses through the computer circuitry. In the early British computers DEUCE (whose development was begun at the National Physical Laboratory in the late 1940s) and the University of Cambridge EDSAC (which first ran in 1949), mercury delay lines were used, and the information was realised as a sequence of acoustic pulses travelling round a U-tube filled with mercury. The Elliott 400 series and the Ferranti PEGASUS used nickel delay lines on which the acoustic pulses were achieved by means of magnetostriction. The Williams tube memory was a cathode-ray tube on which the information was stored as a pattern of dots on the phosphor on the screen. While working at Malvern, Williams discovered the anticipation pulse effect, which made it possible to refresh the screen both periodically and after it had been scanned. Hence it was possible to hold the information as a bit-pattern essentially indefinitely. This form of storage was used in the University of Manchester Mark I machine and also in some machines in the United States, as described in Ref. 5. The drum was used to store the program and data after they were read in. The program and data were read back from it as needed. The beginnings of the idea of virtual storage (paging) date from about this time, in a number of attempts to present the programmer with only one level of storage. Both paper tape (five-hole teleprinter) and punched cards were used as input media, depending on the machine. The output was the same, though sometimes there was a teleprinter for small quantities of printed output. All the machines referred to above influenced the development of computational chemistry in the UK. However, the really powerful stimulus to such work came as a consequence of the work at Manchester and subsequent work at Cambridge. It is therefore the developments here that will be concentrated on, but the reader should be aware that such an account is extremely partial. It could not be justified either in terms of a full history of computing in the UK or in terms of justice to other workers whose ideas did not come to full fruition.
Manchester

In the early 1930s D. R. (Douglas) Hartree became Professor of Mathematics at the University of Manchester. He was interested in numerical analysis and in the development of numerical methods to solve what are now known as the Hartree equations and later the Hartree-Fock equations for atoms. He realised the need for automatic methods of computation and had begun to build a differential analyser (a machine that is fundamentally a wheel-and-disc integrator) to aid work on the numerical solution of differential equations generally, not least those arising in Hartree-Fock calculations on atoms. This work was probably influenced by the work of Vannevar Bush in the United States, who had constructed the first working differential analyser in 1930. By 1939 there were working differential analysers not only at Manchester, but also at Cambridge, Queen's University in Belfast, and the Royal Air Force Research Establishment at Farnborough. During the war, Hartree was much involved in consulting on projects of military significance that needed numerical work. He was thus something of a father figure to many involved in such work during the war. He had left in Manchester a mathematics department in which there was a genuine enthusiasm for numerical work and for automatic computation. This enthusiasm was kindled too in Cambridge, where he went as Professor and Head of the Mathematics Laboratory in October 1946. In December 1947 in the Electrical Engineering Department in Manchester, a very small computer was built around a 32 x 32-bit word Williams tube. The builders were T. Kilburn and G. C. Tootill, both of whom came from Malvern following F. C. Williams. The machine had a keyboard for manual input, and output was to a monitoring display screen. On 21 June 1948 this prototype machine ran a program (written by Kilburn) to find the highest proper factor of a number. A copy of an amended version of the program (18 July 1948) is reproduced as Figure 15.1 in Ref. 3. It is interesting not only for being the first program for a stored-program computer, but also because it used relative and absolute indirect jumps consequent on the outcome of repeated subtraction. Encouraged by this, the group at Manchester continuously expanded the machine and in April 1949 had undertaken, at the suggestion of Prof. Max Newman, an investigation into Mersenne primes, and by June 1949 the machine sometimes stayed up and running for as long as 9 hours. This machine became known as the Manchester Mark I machine, and a description of it can be found in Appendix 2 of Ref. 3 and also as Figure 2 in Ref. 6. It was shut down in 1950, but not before it had been visited by the then Chief Scientist to the Cabinet, Sir Ben Lockspeiser. He had come at the suggestion of Prof. P. M. S. Blackett, then Professor of Physics at Manchester, who was at that time influential in government circles. Lockspeiser was so impressed that within a few days of his visit in October 1948 he had arranged via NRDC for a government contract to be placed with the local electrical engineering firm, Ferranti, to make a production version of the machine "to Professor Williams' specification."
The contract was for 5 years at £35,000 a year. The first production model was installed at Manchester in February 1951. This development was important, not only in its own right but also because it established a link between Ferranti (together with its successor companies) and the University of Manchester that lasted for 30 years at least. One production model went to Fort Halstead, a government establishment at which atomic weapons calculations were carried out. There Dr Alec Glennie, working in his spare time, developed in 1952 an Autocode, what would now be called a compiler, but because his work was so secret, no one could know of it. (Glennie was required to lock himself in the computer room when he was computing and even to destroy the paper inking ribbon before he left the room at the end of a session. It is not recorded if he was required to eat any spare output.) But in March 1954, R. A. (Tony) Brooker introduced a Mark I Autocode openly at Manchester. The Mark I had a sort of assembler language that was designed by Turing (who came to Manchester in 1948, shortly after the first program had been run), which was realised in terms of the alphabetical symbols of 5-bit teleprinter code. But the order code was realised in "reverse Polish," so it was a real pain to write in. Naturally the old hands were very sniffy about the use of Autocode, but it made a huge difference to lesser mortals. The Mark I was followed by the Mark II (sometimes called MEG). The first molecular quantum chemistry calculations by computer in the UK were carried out on this machine (which was fully operational in May 1954). These were done by H. O. Pritchard and F. Sumner. (Sumner, then a graduate student, was later to become Professor of Computer Science at the University of Manchester and very powerful in the counsels of the UGC and the Computer Board.) The first calculation7 was something of an exercise in evaluating a secular determinant, rather hung on the peg of a calculation on hyperconjugation. This paper was received on 7 January 1954 and contains an acknowledgment of help from Alan Turing. A more substantial contribution was Ref. 8, a self-consistent Hückel calculation. This paper is recorded as being received in May 1954 and appeared later that year. Another paper on the same theme followed shortly thereafter,9 finally appearing in 1956. The Mark II machine, which had floating point arithmetic, index registers, and much more, was the prototype for the machine eventually called Mercury, the first production model of which was delivered by Ferranti to the Norwegian Defence Research Establishment in August 1957. This machine had a decent assembler (called, curiously enough, PIG) and an Autocode, Mercury Autocode. This was later extended to a version, Extended Mercury Autocode, always called EM(M)A, which was much used on the later Atlas machine. By 1959 there was an effective machine, which could be programmed fairly easily and which could be used to run, for example, routines to diagonalise symmetric matrices and programs for X-ray analysis. It had 1,024 40-bit words of ferrite core store, hardware floating point operations, index registers, and four magnetic storage drums, each holding 4,096 words. It was comparable to,
though much smaller than, the contemporaneous IBM 704. It usually stayed up and running for more than 12 hours at a time. After passing a driving test, users could obtain time slots on the machine and had to operate it themselves. If a user had a time slot while the machine was working, then a run could be attempted. If not, then all was lost until a new time slot could be negotiated. Any attempt to question the engineers about whether the machine would come to life again during a given time slot was to invite scathing derision. Users were generally told, in no uncertain terms, to get lost. They were reminded that those involved were engineers and "not bloody crystal-ball gazers." By the late 1950s, and into the early 1960s, there was a lot of computational chemistry going on at Manchester. There was not only crystallography, under O. S. Mills, but also quantum chemistry, initially by the students of H. O. (Huw) Pritchard (who left Manchester in 1965) and also by the students of W. (Bill) Byers Brown (who left in 1960). The machine was also used by workers from the College of Science and Technology in Manchester (later to become UMIST), and foremost among them in usage were the crystallographers from Henry Lipson's group. Most of the quantum chemists wrote in Mercury Autocode in a conscious effort to keep their code portable, but most wrote enough assembler to be able to write digit packing routines and the like. To write in Autocode was to invite contempt from the old-stagers in the lab, some of whom were former pupils of Turing, who certainly believed in the philosophy, "If it's hard to write, it should be hard to read." (Turing had died in mysterious and rather bizarre circumstances on 7 June 1954. The coroner's verdict was suicide, but some continue to believe that he was murdered by the British secret service; see Ref. 4.) Among the work in quantum chemistry being carried out at this time at Manchester, and not atypical of it, is that in Refs. 10-14. It is interesting to note that although H. C. (Christopher) Longuet-Higgins was Reader in Theoretical Chemistry at Manchester for much of the period considered here, he seems to have played no part in the developments of computational quantum chemistry made at Manchester, although in the acknowledgements he is thanked for his interest. While this work was going on, Ferranti was building, to a Manchester design, a machine later to be called Atlas. Those working in the lab knew this, and it was one of the reasons that they coded in Autocode, because they believed that this gave them the best chance of minimising their pain at transition. Atlas proved extremely important in the development of computational chemistry in the UK from 1965 to 1975, and its story will be taken up later, but now it is necessary to return to the immediately postwar era and to Cambridge.
Cambridge

In 1933, J. E. (John) Lennard-Jones became the John Humphrey Plummer Professor of Theoretical Chemistry at the University of Cambridge. During his tenure of the Chair, the Department showed great strength in calculational
work in chemistry. It is arguable that this was very much in the Cambridge tradition, which was one in which numerical work was encouraged. In the prewar period C. A. (Charles) Coulson worked in the lab, performing one of the earliest molecular orbital calculations ever undertaken. R. A. Buckingham also worked there. He was later to write a book on numerical analysis and to be the first director of the University of London Computing Service. M. V. (Maurice) Wilkes began his career there. S. F. (Frank) Boys actually presented his Ph.D. from the lab, although he had begun his work in physical chemistry under T. M. Lowry, who died before Boys had finished. In October 1946, Professor D. R. Hartree moved from his Chair of Mathematics at the University of Manchester to become Professor of Mathematical Physics at Cambridge and to head up the Mathematical Laboratory there. He knew Lennard-Jones, for he had been a member of the Kapitza club in 1924-25, when Lennard-Jones was also a member. (He was then just plain Jones; the Lennard was added in 1925 upon his marriage to Kathleen Mary Lennard.) Among the other members of the club that year were P. A. M. Dirac, L. H. Thomas, and P. M. S. Blackett. Also about 1946, a team working under the direction of Maurice Wilkes (who had spent part of the war at Malvern) was beginning to develop EDSAC.15 This machine had a mercury delay-line fast store of 512 words, each word being 36 bits. Input and output were done on five-hole paper tape, and the machine ran its first program on 6 May 1949 and was shut down only in July 1958, by which time EDSAC2 was in operation. There is no doubt that the design of EDSAC was influenced by the United States machine EDVAC, perhaps because both Wilkes and Hartree had visited the United States in the postwar period to look at developments in computation there. It was the declared aim of the laboratory to provide an effective service for users and to make life as easy as possible for the inexperienced user. To this end, EDSAC was programmed from the start in a symbolic assembly language, so that a program could be written out in terms of meaningful alphabetic characters, to be punched onto paper tape, read in, and converted automatically to binary machine instructions. They also made a film to show how to use EDSAC, and stills from this film have been published.16 At a technical level the idea of microprogram control in relation to computer design seems to have first been expounded by Wilkes in 1951. By the middle 1950s there was a lively use of EDSAC for computational chemistry of all kinds, but from the point of view of quantum chemistry it is the figure of Boys who probably dominates the scene. Frank Boys came back to Cambridge and Lennard-Jones' lab in 1948 after wartime service in the Ballistics Branch of the Armaments Research Department at the Woolwich Arsenal and a short postwar period at Imperial College in London. An interesting, if perhaps idiosyncratic, view of the developments in computational quantum chemistry in Cambridge in the period that began about this time can be found in a review by Handy.17 But as illuminating as anything in placing Boys in the
Cambridge computing context are the remarks found on page 152 of Wilkes' memoirs15:

One of the more substantial users (of EDSAC) from an early date was S. F. Boys, whom I had known around 1937 when he was a research student working under Lennard-Jones on methods for computing the wave functions of molecules. I ran into him again one day after the war when I had lost myself in the corridors of Imperial College, London, and had knocked on a door to seek assistance. In the course of conversation, he enquired what I was doing and I told him about EDSAC, then still incomplete. He listened carefully and then proceeded to give me the reasons why the machine would not be of much help in computing molecular wave functions. After he had returned to Cambridge and become one of our most prominent users, with a number of students working with him, he was fond of recalling the conversation, and would explain that what he had failed to realise was that digital computers could do Boolean as well as arithmetic operations. At one point Boys' work did not appear to be progressing very rapidly and unfortunately he lacked the art of explaining an intricate subject with clarity. [This point is noted and commented on by Coulson in his obituary notice of Boys for the Royal Society.18] One day, after he had mystified us at Priorities Committee, even Hartree, who had a special interest in the subject, expressed some doubt about supporting the work further. Boys' real trouble was that he was trying to operate on a scale that was beyond the means available at the time. Later, when machines more powerful than EDSAC became available, the full extent of his vision became apparent.
Wilkes also remarked that Boys tried to explain to him a program that he had written to do symbolic algebra, and Wilkes believed that it was perhaps the first time that a computer had been used for algebraic manipulation. But nothing was ever published on it, so there is no way of knowing precisely what was involved. The first papers by Boys using a digital computer are one with Price19 and one with Sahni.20 Both were received in June 1953 and appeared in 1954. The paper with Sahni is on the calculation of what they called vector coupling coefficients and just might be the algebraic manipulation work spoken of above. The paper with Price reports atomic calculations and is the first report of such work by digital computer in the UK. Later, in 1956 with Cook, Reeves, and Shavitt, he published a manifesto for computational quantum chemistry, together with some calculations using a gaussian basis.21 (Although there is no doubt that it was Boys' approach to the use of gaussian functions that was the really influential one, in terms of historical priorities it was Roy McWeeny who was the first to use gaussians, and indeed contracted gaussians, and even-tempered gaussians, in a series of papers culminating in Ref. 22. See also Shavitt.23) A survey of the computational chemistry work of Boys can be found in the paper by Handy, Pople, and Shavitt24 in the special issue of the Journal of Physical Chemistry devoted to the work of Frank Boys and of Isaiah Shavitt.
The computer also played an enormously important part in protein crystallography at Cambridge. The elucidation of the structure of horse myoglobin by Kendrew and his co-workers, announced in 1958, would not have been possible without EDSAC on which to process the 400 reflections measured. Subsequently EDSAC2 was used to produce a more refined structure based on 10,000 reflections and also to produce a determination of the structure of haemoglobin by Perutz and his co-workers. Both of these achievements were reported in Nature in 1960.25,26
Emerging from the 1950s

So, at Cambridge, as at Manchester, there was a great deal of computational quantum chemistry underway by the late 1950s, and indeed there was a great deal of computational everything by 1960. However, the UK university system in general and all computer-based study was on the verge of a great transformation. The change in the UK universities was (at least in part) in response to a report by a committee chaired by Lord Lionel Robbins, the Principal of the London School of Economics.27 The committee reported late in 1963 and recommended that the number of UK universities be increased from the existing 25 or so (depending on how they are counted) dealing with some 118,000 students (about 4% of the age group) to about 50 to deal with about twice as many students. There were to be some consequent changes in other institutions of higher education, and because all of these are considered in future tabulations, the comparable figure for total number of students in 1962/63 is 216,000 (about 8.8% of the age group). By 1971 there were about 457,000 students in higher education (about 11% of the age group). As far as computers were concerned, in the early and mid-1950s, the politically powerful scientific figures in the UK, such as Hartree, Blackett, and Darwin (Superintendent of the National Physical Laboratory), firmly believed that three or four digital computers would be quite enough for the nation's needs. Thus there had been no real thought in the UK about the possibility of the mass production of machines. But by 1960 it was clear that the demand for computing resources just in universities and research establishments was soon going to outstrip the supply. It was clear, too, that it was rapidly going to become impossible for every user to run his own jobs on the machine and that thought had to be given to setting up a proper service to run the machine effectively, without, it was hoped, alienating the user. It was also realised that the development of a system (or monitor), together with assembler and higher level languages, could ease both the task of the programmer and that of those responsible for running the machine. There were also contemporary technological developments, and perhaps the most important of these was the transistor. This United States discovery meant that it was no longer necessary to use valves (tubes) in anything like the
numbers that had been used heretofore, and their replacement by transistors meant a consequent increase in reliability and a diminution in power requirements. Thus, although the design of machines was still discrete-component and so each board had to be individually assembled, machines became smaller and easier to keep cool. Also, more automatic methods of producing ferrite core store were developing, so that such store, though still expensive, became the fast store of choice. Just as wealth in primitive societies was measured in the size of the herds, so in computer-owning society, wealth came to be measured in the size of the core store. The story in the United States is quite similar to that in the UK up to this point, but from this time on it can be said, admittedly in a rough-and-ready way, that the United States won the battle for the expansion of higher education and for computer mass production. So the research groups involved in the UK in the 1960s were to be much smaller than their United States counterparts, and the computers that they had at their disposal, on the whole, were smaller and less reliable than those in comparable United States undertakings.
THE 1960s

It is quite difficult to pick a path through this period, say, 1962-1972, in the UK. Up to 1962, computational chemistry is easy enough to characterise. It consisted simply of X-ray crystallography and of molecular quantum chemistry, which in turn consisted chiefly of atomic and molecular electronic structure calculations. But after 1962 it became much more various. Simulation work of various kinds began to be computerised. This work ranged from the simulation of reaction kinetics through the beginnings of reaction dynamics of various kinds and on to statistical mechanics. There were also developments in the computational simulation of spectra of all kinds. Programs for ESR and NMR and molecular vibration-rotation spectra and the like all began to be developed. The development of direct methods in X-ray crystallography also began to lead to computational simulation approaches to the decoding of X-ray data. In such work, the computer model replaced the ball-and-stick models that had begun to be used in the 1950s to elucidate protein crystal structures. This approach reached its full fruition, however, only in the late 1970s and 1980s, with the development of powerful graphics work stations. There began also the first attempts at computer-aided instrumentation, so that the output from a given experimental set-up could be processed directly on a computer, without the need for recording and separate transcription of the results. This was a factor that contributed to the development of networks toward the end of this period. Not only were these developments being made, but also chemical information was being computerised and many database type developments were begun. There was also a move to develop computer-aided synthesis, particularly
organic synthesis, though this was chiefly pursued in the United States and only to a limited extent in the UK. The first early steps in what would now be recognised as chemometrics were also being taken. It is not possible to do justice to all these developments in a brief review like this, and what follows will undoubtedly be slanted toward the development of computational quantum chemistry. But the experience of that field is not incongruous with that of some of the other fields mentioned above, though it is rather different from the experience of the information retrieval and processing side of things. Given this, it is probably prudent to start from what would be widely agreed to be a really seminal occasion, for all computational quantum chemistry, not just that in the UK, namely the Boulder Conference of 1959. The proceedings of this meeting are recorded in Reviews of Modern Physics, 32, which appeared in 1960. It was at that meeting that Charles Coulson28 offered the view that quantum chemists came in two types, "group I (electronic computors) and group II (nonelectronic computors), though a friend has suggested as alternatives the ab initio-ists and the a posteriori-ists." The present concern is not so much with whether he turned out to be right or not, but that he (with his usual perceptiveness) recognised that a sea change was occurring in the subject and that the computer had become an absolutely essential tool in the development of one kind of quantum chemistry. In fact, many of those in group II used semiempirical methods to perform simple calculations by hand, and since that time the computer has actually come to dominate in calculations of this kind too. What was clear to any reader of that issue of the journal was the enormous extent of computer use in the United States and of the variety of machines available to the chemical community often, though not always, by means of collaboration with military agencies and the defence industry. Thus much of the Chicago work was done on a Univac 1103 at Wright-Patterson Air Force Base near Dayton, Ohio. Especially noteworthy of such work was the systematic LCAO-MO SCF study undertaken by Ransil on all the first-row diatomics. Some work by Nesbet was done on the IBM 704 at the Martin Aircraft Company in Baltimore. The work of Goodfriend, Birss, and Duncan on formaldehyde was performed on an IBM 650 (a great workhorse in many United States efforts about this time) at the US Army Ordnance Ballistic Research Laboratories. The work from MIT used the Whirlwind, which was a machine engineered and built at MIT. The UK computational contribution was made chiefly by Boys, both alone and with his student, Julian Foster. The calculations reported were all performed on EDSAC at Cambridge. Young people in the UK who were interested in developing computational quantum chemistry were therefore, not unreasonably, strongly attracted to a period of work in the United States, where there were many more computing facilities than there were in the UK and, so it seemed, much more interest in such work. It should perhaps be recorded that quantum chemistry in the UK was at this time dominated by the groups at Oxford and at Cambridge, the first
under Charles Coulson and the second under Christopher Longuet-Higgins (who had succeeded John Lennard-Jones in 1954). Both these powerful figures remained cool about computational work in quantum chemistry, and though neither was Luddite or anti-computer, both regarded it as appropriate to make anyone who wished to use a computer explain precisely why, and why it was advantageous. They were both anxious (and not foolishly anxious, either) that calculation not be used as a substitute for thought. They were not alone in the UK in their anxiety, and in the United States a similar anxiety was often expressed, perhaps most forcefully and effectively by Peter Debye. However, it was clear that in the United States Robert S. Mulliken at Chicago and John C. Slater at MIT (among others) presided over groups in which computational electronic structure work of all kinds was encouraged. Thus for a period in the early 1960s, quite a lot of UK quantum chemistry was actually done in the United States as young people came from the UK on postdoctoral research fellowships to work in United States labs. The kind of thing that happened can be exemplified by considering developments at the Solid State and Molecular Theory Group at MIT between 1962 and 1964. (This is, of course, to recount only a small portion of the history of a distinguished group of much longer duration and wider interests than might appear from the present account.) In the SSMTG Quarterly Progress report for January 1963, J. C. Slater notes in his introduction that since October the group has been joined by two postdoctoral workers from England and one from Canada "who have extensive experience in molecular calculations." All three had been attracted to MIT not only for its computational facilities (it had an IBM 709 as a group machine in the Collaborative Computational Laboratory run by a very competent, experienced, and helpful staff) but also by the presence there of M. P. (Michael) Barnett, an expatriate from the UK, who was at that time the guru of many-centre integrals over Slater orbitals. It was generally agreed that, in implementing molecular calculations of any kind, it was the evaluation of the three- and four-centre electron repulsion integrals over Slater orbitals that caused the real bottleneck. The only plausible method of tackling them was by means of the Barnett-Coulson expansion, an expansion that was known, however, to have notoriously erratic convergence properties. Although Barnett and his co-workers (among whom were Russ Pitzer, Don Ellis, and Joe Wright) had made some progress in doing three-centre integrals generally and four-centre ones in special cases, it rapidly became clear to the new arrivals that any progress was going to be very slow indeed. In these circumstances M. C. (Malcolm) Harrison, one of the Englishmen, was able to convince the molecular-structure people that the use of gaussian, rather than Slater, orbitals was the way forward. He also convinced them of the utility of the LCAO-MO-SCF approach and of the subsequent Configuration Interaction (CI) calculation to allow for electron correlation. He claimed no originality for these ideas, attributing them to his teacher at the University of
Leeds, C. M. (Colin) Reeves, who, he said, had begun to develop them as a student of Boys. The group therefore divided up the work of programming and brought in graduate students to help. The whole thing was organised and held together by Malcolm Harrison. He wrote software for file-handling on the tape units, which forced the collaborators to use a common interface, and he was ruthless in castigating bad and sloppy habits of programming (although he had a soft spot for computed "GOTO"s). Thus developed, in just under a year, the initial phase of POLYATOM, the first automatic program system to perform LCAO-MO-SCF calculations quite generally. Although molecules were not uppermost in J. C. Slater's mind at this time (he was developing Xα theory and beginning its computational implementation), he was enormously supportive of this work. This was particularly so when things went wrong in the computing, as they sometimes did, and so much very costly computer time was expended to no avail. He took such setbacks calmly and paid the computing bills without demur and without visiting retribution on the heads of the perpetrators, remarking that this was a price that had to be paid in developing new things. It must be admitted that his calmness and reasonableness were quite unexpected by those involved, for he occasionally had a fearsome temper and quite a short fuse. The first phase of POLYATOM was reported on by Barnett at the Sanibel meeting held in January 1963,29 and though the report was more about what was planned to happen rather than what by that time had happened, it is nevertheless an accurate account of what eventually was to happen. By the end of 1963 the program had become widely distributed, without guarantee, and though it was not quite bug-free initially, it rapidly became so on use, and the first papers using it were published two years after the completion and checking of the system.30,31 The system was described in an MIT Collaborative Computational Laboratory report,32 and it went on to further development in the United States under the direction of Jules Moskowitz (who had been one of the original MIT SSMTG molecular calculations group) at New York University (NYU) and at Bell Telephones. It is in this more developed form ("Version 2") that the program was most widely distributed in the late 1960s and up to the mid-1970s. But by that time the UK participants in the development had moved on to other things, and Slater's Canadian postdoctoral worker mentioned earlier had become a large-scale user of the system and no longer a developer. In 1963 the Chicago group and Enrico Clementi were taking the first steps toward what was to become IBMOL, and for a time there was a friendly rivalry between them and the developers of POLYATOM, but as IBMOL began its IBM-based development it drew steadily ahead of POLYATOM in the level of support and the facilities provided. But even today it is possible to find programs employing POLYATOM ideas and, occasionally, the vestiges of POLYATOM code can be recognised. But POLYATOM and IBMOL began a line of
program systems for molecular electronic structure calculations whose current representatives are GAUSSIAN, GAMESS, and so on.
The CI program continued its development in Jules Moskowitz's group at NYU from late 1963 until late 1964. Although a users' manual was issued then, it did not attract much use or attention for some years to come. This was essentially because of the difficulty of transforming the two-electron integrals from a basis of AOs to a basis of MOs on machines that had only tapes as backing-store and rather small fast-access stores. Only with the development of large disc storage and of chaining algorithms did the approach really become a starter. It was first effective some 10 years later,33 as part of the ATMOL package and also as part of the MUNICH package. These developments will be considered in context.
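To see why this transformation was so punishing on tape-based machines it helps to recall its structure: done naively it is a fourfold summation of cost O(N^8) in the number of basis functions N, but done as four successive one-index ("quarter") transformations it falls to O(N^5), at the price of repeatedly rereading large intermediate arrays, exactly what small fast stores and serial tapes made painful. A minimal sketch of the quarter-transformation scheme in modern NumPy (illustrative only; the period codes were FORTRAN, and the array names here are invented):

    import numpy as np

    n = 8                                  # number of basis functions (toy size)
    rng = np.random.default_rng(0)
    eri_ao = rng.random((n, n, n, n))      # AO two-electron integrals (pq|rs)
    C = rng.random((n, n))                 # MO coefficients, one column per MO

    # Four successive quarter-transformations: each step is O(N^5) work,
    # against O(N^8) for the naive four-index summation done in one go.
    t1 = np.einsum('pi,pqrs->iqrs', C, eri_ao)
    t2 = np.einsum('qj,iqrs->ijrs', C, t1)
    t3 = np.einsum('rk,ijrs->ijks', C, t2)
    eri_mo = np.einsum('sl,ijks->ijkl', C, t3)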
It is appropriate to note here, too, that at about the same time as the developments recorded above were taking place, some senior UK people who made great contributions to computational chemistry also went to work in the United States. In the field of quantum chemistry perhaps the most notable figures were M. J. S. (Michael) Dewar and J. A. (John) Pople. Some, too, went to Canada, among them P. R. (Phil) Bunker, who was to contribute to computational spectroscopy at the National Research Council of Canada. It would be possible to expand this list of persons who, to use a term fashionable at the time in the UK, went down the "brain drain" from the UK. But since they all made their contributions from their adoptive countries, it would not be appropriate to claim them for computational chemistry in the UK.
However, it would be wrong to give the impression that all UK computational chemistry done between 1962 and 1965 was done abroad. There was a continuing effort at Cambridge, as described by Handy.17 There was also continued work at Manchester, especially in X-ray crystallography, but the quantum chemistry side of things was diminished by the departure of two of the senior figures, Huw Pritchard and Bill Byers Brown, both eventually going to North America. But development was difficult, for there was a chronic shortage of computer power in universities and civilian research institutes in the UK, and this shortage was not mitigated by the kind of collaborations with the military and defence sectors that were so usual in the United States. Such collaborations were generally forbidden on security grounds, and although this is understandable in the political context of the times, all involved knew that, in practice, it was a quite daft prohibition. Things would thus have been pretty grim for computational chemistry but for two developments. The first was the opening of the Atlas Computer Laboratory in summer 1964. The second was the publication of "A Report of a Joint Working Group on Computers for Research" in January 1966.34 This last is always called the Flowers Report, after its chairman B. H. (Brian) Flowers, who was then Langworthy Professor of Physics (in succession to Blackett) at Manchester. He was later to become a government science adviser, chairman of both the Science Research Council (SRC), one of the successors to the Department of Scientific and Industrial Research, and the Computer Board, Rector of Imperial College, London, and eventually Lord Flowers.
The Atlas Computer Laboratory
Atlas lab had its origins in the Atomic Energy Research Establishment at Harwell, and indeed the lab was, and is still, sited next door to Harwell. This section owes much to a talk that J. (Jack) Howlett, the first director of the lab, gave at the Warwick meeting of the IEE in July 1993. The Theoretical Physics Division there was heavily involved in computing for reactor design. The head of this division in the immediately postwar period was Dr Klaus Fuchs, who was later found to have passed secrets to the USSR during the war. This was no doubt one of the reasons for the political reluctance to permit academic computational collaboration with the military and defence agencies. By the mid-1950s, the division was using the computers then available, including, by 1958, a Mercury. But, in practice, for serious reactor-design calculations they needed to use the IBM 704, which was situated at the very high-security Atomic Weapons Research Establishment at Aldermaston, and for advanced reactor design it was clear that they were going to have to use a machine of power comparable to IBM's proposed Stretch machine. In these circumstances it was natural for the computational group in the Division to press for a large machine. It was not unreasonable of them to suggest that it should be designed and built in the UK. The ambitions of the group were supported by the Director of Harwell, Sir John Cockcroft, and negotiations led to a proposal that Ferranti should build, to a Manchester University design, a supercomputer, eventually to be called Atlas. The proposed machine would satisfy all the computing requirements of the Atomic Energy Authority, both at Harwell and elsewhere, and still have spare capacity. It was proposed that the spare capacity be made available to UK universities generally for work needing large-scale computational support. This provision was to be without charge and as of right. The proposal was so expensive that it required the specific approval of the Minister of Science, and it was submitted to him. The response was favourable and, in one respect, surprising: it was decreed that the machine should be run not by Harwell but under the auspices of the then newly created National Institute for Research in Nuclear Science (NIRNS). (NIRNS was created in an attempt to encourage civilian research in nuclear and high energy physics, areas that were proving just too expensive for research on an individual university basis.) A special laboratory for the machine should be built outside the wire at Harwell, adjacent to the site of the proposed Rutherford High Energy Laboratory (RHEL), which was also part of the NIRNS remit. It should provide services to the Atomic Energy Authority and to the universities as proposed, but the Atomic Energy Authority should pay for the services it received. It turned out that the Atomic Energy Authority
made no use at all of the services of the Atlas lab, and so the lab quickly became a completely civilian facility, eventually coming under the wing of the SRC after the demise of NIRNS (1965). The decision to go ahead was made late in 1961; the building of the lab was completed by spring 1964, and the Atlas machine was delivered in April 1964. By then, Ferranti had sold its computer business to International Computers and Tabulators, a firm that was later (1968) to become International Computers Limited (ICL) on acquiring the computer interests of all other UK manufacturers. It may be of interest to record that the part of the lab built to hold the machine was on two floors. All the equipment that only the engineers needed to touch was on the ground floor, and the operations section, with the tape drives, the printers, and so on, was on the first floor. The space needed reflected the enormous size of the machine, for it took 14 truckloads to deliver it all, and 3 months were needed to install and test it. The total power consumption of the machine and its attendant cooling facilities was about 150 kVA. None of these figures is out of line with contemporary machines of comparable power. (The machine delivered roughly 350 kflops.) The machines of the time were power-hungry monsters that took a lot of maintenance and needed highly skilled operators. The machine came on-line for users in October 1964 running a single shift, but uptake was so quick and great that by early 1965 the machine was being run around the clock. The machine was run in batch mode. Users submitted jobs, usually on punched cards, to the operators, of whom there were six to eight on a shift; the jobs were run; and finally the output, together with the input cards, was returned. Atlas has a pretty good claim to be the first machine with an operating system, in the Atlas Supervisor. This program controlled the I/O, scheduled the jobs, did automatic transfers between main and backing store (paging), and kept job statistics. Atlas also ran a number of compilers, among them FORTRAN, and the symbolic assembler language was very similar to FAP. Because of the exigencies of a batch mode of operation, it was necessary for a user to be present at the lab at least during the early stages of program development and debugging. Because the lab was in a fairly remote rural setting (Oxford was the nearest large town, about 15 miles away), workers from different computational disciplines were at close quarters with each other and rather isolated from mundane concerns. There was thus much discussion of computing techniques in a cross-disciplinary environment while waiting for a run-slot. Once programs were ready for production running, card-decks could be sent by post to the lab, where they were run and returned with the output. The lab had a small staff of experienced and capable support programmers who would look at any job that failed and assess whether the error involved was trivial and could be corrected on site and the job rerun. One person, Mike Claringbold, seemed to many users of the time to have almost supernatural
powers of error perception and correction, and his skills certainly added immeasurably to the effectiveness of remote operation. The lab also began to develop support groups of staff for particular computational enterprises. The ones relevant to computational chemistry began to develop in the early 1970s and will be considered later, but to understand the relationship between the lab and the computational developments more generally in the UK, it is necessary to consider the Flowers Report and that to which it led.
The Flowers Report
Although the UGC had a subcommittee on computing, it was not fully seized of the urgency of the need for computing capabilities in the universities. However, the Committee for Scientific Policy, a body comprising the great and the good which advised government on national scientific policy, told the government of the day (1964) that there was a real need for computers in universities, a need that should be met as quickly as possible. The government was sympathetic to the Committee for Scientific Policy's message, for it sought to stimulate the UK computer industry on the advice of the NRDC. Thus when Brian Flowers proposed that a joint committee of the UGC and the Committee for Scientific Policy be set up "To assess the probable computer needs during the next 5 years of users in Universities and civil research establishments receiving support from Government funds," the proposal was acted upon by the government by setting up such a committee and appointing Brian Flowers chairman. The committee (which among other distinguished members contained Dr R. F. (Bob) Churchouse, Head of the Programming Group at Atlas lab) set about its work with great dispatch, travelling around the country taking evidence from interested parties. A report was presented in June 1965, and the government gave general approval to the proposals in December 1965. This approval included the setting up of the Computer Board, of which Brian Flowers became the first chairman in 1967. The committee recommended that the UK be treated as a collection of regions with a hierarchical system of computer provision within each region. There were to be regional computer centres, and each university in a region was to have a similar smaller computer, the size being determined essentially by the size of the physical science and engineering departments in the university. Any university user had access as of right to the machine at the regional centre. The government accepted the recommendations and very quickly implemented them, so that, under the direction of the Computer Board, by 1967 almost all universities had a computer and a computer centre, and the regional centres had been established, usually at the largest university in the region. Computational chemistry in the UK was thus in with a chance to achieve state-of-the-art research.
Emerging from the 1960s
Initially the computers provided by the Computer Board were for research work only and not for teaching. However, that distinction did not persist into the late 1960s, and the computers became a general educational resource. It was not permitted that the computers so provided be used for university administration, and, since they were required to provide an effective service, there were very strict limits on the extent to which the then burgeoning tribe of computer scientists were allowed to monkey about with the machines. As a general rule the initial provision was of UK-manufactured machines. The smaller ones were comparable to the IBM 709, and the larger ones were of about IBM 7090 or 7094 power. It would probably be agreed now that most of the machines provided were not wonderfully satisfactory, either in hardware reliability or in software provision, but by the late 1960s most of them had been got to work in a satisfactory manner. This, alas, was just as their manufacturers were going out of business. After 1968, whatever UK computer a university had, it had to deal with ICL, which had taken over all the other UK computer manufacturers. Thus added to the anguish of the users over the machines themselves was the difficulty of dealing with an essentially monopoly supplier. The period from 1967 up to 1972 or so was one in which computational chemists felt themselves to be struggling against machine limitations, but at least they were able to do some computational work in their home institutions. They could also use the regional centres where, in some cases, United States-made machines were available, and, if they had SRC grants, they could use Atlas. So though it was a period of frustration, it was also a time of progress. That progress may be typified by developments in computational quantum chemistry and related enterprises, and no attempt will be made to cover crystallographic computing or the burgeoning interest in databases and bibliographic developments. What follows is a perhaps rather impressionistic attempt to convey the nature of the computational chemistry enterprise in the UK at this time. At Manchester computational quantum chemistry began to develop strongly again with the building of a group under Dr (now Prof.) Ian Hillier. Among much of interest that originated in that group at about this time was probably the first computational realisation of a "direct method" of solving the LCAO-MO-SCF equations, which appeared in the paper by Hillier and Saunders.35 The work was performed on Manchester's own Atlas, which was the prototype for the machine at Atlas lab and somewhat smaller than it. This work is also interesting because the (gaussian) integral evaluations were carried out by using a program written in FORTRAN IV by Vic Saunders, which was based on the integral programs in IBMOL released as QCPE 92 (see below for QCPE). This work can be regarded as the start of the ATMOL system. It was developed further at Atlas lab, which Dr Saunders was to join in 1970.
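For readers who have never seen inside such a code, the closed-shell LCAO-MO-SCF cycle that these programs implemented has the following shape (a schematic sketch on an invented two-basis-function, two-electron problem in an orthonormal basis; this is the generic Roothaan iteration, not the Hillier-Saunders "direct" variant itself):

    import numpy as np

    h = np.array([[-1.0, -0.2],
                  [-0.2, -0.5]])              # core Hamiltonian (made-up numbers)
    eri = np.zeros((2, 2, 2, 2))              # (pq|rs), chemists' notation
    eri[0, 0, 0, 0] = eri[1, 1, 1, 1] = 0.7
    eri[0, 0, 1, 1] = eri[1, 1, 0, 0] = 0.4
    n_occ = 1                                 # one doubly occupied MO

    D = np.zeros((2, 2))                      # initial density-matrix guess
    for cycle in range(50):
        J = np.einsum('pqrs,rs->pq', eri, D)  # Coulomb contribution
        K = np.einsum('prqs,rs->pq', eri, D)  # exchange contribution
        F = h + 2.0 * J - K                   # closed-shell Fock matrix
        eps, Cmo = np.linalg.eigh(F)          # solve the pseudo-eigenproblem
        D_new = Cmo[:, :n_occ] @ Cmo[:, :n_occ].T
        if np.max(np.abs(D_new - D)) < 1e-8:  # stop when the density is stable
            break
        D = D_new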
The paper of the Manchester lab appeared almost simultaneously with one by Roger Fletcher36 on the use of direct minimisation techniques in MO calculations. Fletcher had been a student of Reeves at Leeds at the same time as Harrison. With Reeves he developed a widely used conjugate gradient method of minimisation, which is often called the Fletcher-Reeves method. Fletcher's interest in computational chemistry proved to be only a passing one, and he went on to have a distinguished career in optimisation theory itself. However, the Fletcher-Reeves method was used in LCAO-MO work originating from both the quantum chemistry group in York37 and the group in Leicester.38 It is perhaps not fair to count these as the very first uses of gradient methods, for nowadays that term is more associated with geometry optimisation in electronic structure calculations; priority in the latter category must surely go to the paper of Gerratt and Mills,39 work originating from the group at the University of Reading. The chief computational chemistry interest of that group was molecular spectroscopy: it was while working at Reading that J. Watson developed the modern form of the Eckart Hamiltonian,40 which was to provide the basis for most subsequent computational work on the interpretation of molecular rotation-vibration spectra.
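Since the Fletcher-Reeves recursion went on to such wide use, it is perhaps worth setting down (a generic sketch with invented names, not a reconstruction of any of the period codes): each new search direction mixes the current steepest-descent direction with the previous search direction, weighted by the ratio of successive squared gradient norms.

    import numpy as np

    def fletcher_reeves(grad, x, steps=200, lr=0.1, tol=1e-10):
        """Minimise a smooth function, given its gradient, by the
        Fletcher-Reeves conjugate gradient recursion. A fixed step
        length is used here for brevity; a line search along each
        direction d is what would really be done."""
        g = grad(x)
        d = -g                                   # first step: steepest descent
        for _ in range(steps):
            x = x + lr * d
            g_new = grad(x)
            if g_new @ g_new < tol:
                break
            beta = (g_new @ g_new) / (g @ g)     # the Fletcher-Reeves beta
            d = -g_new + beta * d                # conjugate search direction
            g = g_new
        return x

    # Toy use: minimise the quadratic f(x) = x.A.x/2 - b.x (invented data).
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    x_min = fletcher_reeves(lambda x: A @ x - b, np.zeros(2))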
In Cambridge, EDSAC2 was still in use, but in January 1965 a TITAN computer was installed. This was essentially a kit-built Atlas, and the story of its accession to Cambridge is told in an entertaining manner by Wilkes.15 Boys and Handy had begun work on the transcorrelated method, a method, albeit nonvariational, for incorporating the interelectronic distances into the wave function. A series of papers resulted, culminating in 1969 with the use of the technique on the LiH molecule.41 In 1967 Christopher Longuet-Higgins resigned as Professor of Theoretical Chemistry and took up a Royal Society professorship in Edinburgh to study artificial intelligence. In 1969 A. D. (David) Buckingham became Professor, coming from Bristol where he had also been Professor of Theoretical Chemistry. With his advent the study of intermolecular forces began to develop strongly, its computational aspects owing much to the ideas and energy of Anthony Stone.
Both at Manchester and at Cambridge there were the beginnings of developments in computational molecular dynamics and scattering theory, and these disciplines were developing in a lively way in London University too. Molecular dynamics was developed particularly by Konrad Singer and Ian McDonald at Royal Holloway College.42,43 Computational work had also begun to flourish in the group of Charles Coulson in the Mathematical Institute at Oxford. Not only was molecular electronic structure work done there but also heavy-particle scattering, and quite a lot of that aspect of the work can be discovered by reading the book by Levine,44 who was in the Mathematical Institute at the time. Computational work had also begun in the Physical Chemistry Laboratory, chiefly with Peter Atkins and his students, with their interests in NMR and ESR simulation, and Mark Child and his students, with their interests in semiclassical scattering theory. Oxford was not alone in having a department of mathematics for its centre of computational quantum chemistry: this was the case too at the University of Nottingham, where the Applied Mathematics department under Prof. George Hall made important contributions to many aspects of computational chemistry. A typical computational paper from that group at about that time is the one by David Brailsford and Brian Ford on the ionization potentials of the linear alkanes.45 The paper is interesting not only for its content but also for the future careers of the authors: David Brailsford went on to be professor and head of the department of Computer Science at Nottingham, and Brian Ford became director of the Numerical Algorithms Group (NAG), a group that made enormous contributions to software developments in the 1970s and 1980s and, indeed, still does. There was also scattering theory at Nottingham, but not strictly of chemical relevance. During the late 1960s Roy McWeeny left the University of Keele for the University of Sheffield, which rapidly became a centre of computational quantum chemistry as well as a centre for spectral simulation of all kinds. McWeeny's group was remarkable in having its own computing facilities, an uncommon occurrence at the time; a paper from that time and place is Ref. 46. It was much more usual for researchers to use their local computing facilities, supplemented by the regional centre facility and sometimes Atlas. An example of such use is in a paper originating from the University of Birmingham by Deryk Davies,47 which is additionally of interest because all available levels of computing resource were involved in its execution and because it is one of the earliest UK uses of programs provided by the Quantum Chemistry Program Exchange (QCPE), a United States organisation described in more detail in Ref. 5.
What has been written above perforce ignores the work of many in the field in the UK, for at least 10 other institutions could have been mentioned at which work in computational chemistry of this kind was being carried out by individuals or very small groups. Failure to mention them explicitly is not intended in any way as a slight on their work; it is simply hoped that what has been written fairly typifies the sort of work that was going on in the UK, without traducing anyone. The point of the story has been to show that the number of persons involved in the endeavour in the UK was rather small (perhaps a score or so of quantum chemists, with perhaps a comparable number in dynamics and simulation, although there were probably about five or six times as many crystallographers), and that their efforts were fragmented. It was natural therefore that attempts would be made to encourage collaborations so that the UK might more effectively contribute to computational chemistry at the international level. The development of such collaborations is the theme for the 1970s. It is
also the case that from about 1973 onwards the story of computational chemistry in the UK becomes, to a large extent, the story of computational chemistry on United States-designed and -built machines, and this change and its difficulties are another theme for the 1970s.
The 1970s
In 1970 Vic Saunders was appointed to the staff at Atlas lab with the remit of supporting computational chemistry generally and computational quantum chemistry in particular. He was joined in this enterprise in 1972 by M. F. (Martyn) Guest. In a similar manner, first M. (Mike) Elder and then P. A. (Pella) Machin were appointed for the support of computational crystallography, from which developed the support for chemical databases and database manipulation. There were, of course, many other appointments in support of other computational enterprises, but to follow the development of these two through the 1970s typifies the developments in computational chemistry more generally. In saying this, however, a caveat must be entered for quantum chemistry, because two outstanding and extremely influential figures in the subject died early in the decade. Frank Boys died at the age of 61 in October 1972, and Charles Coulson died, after a long period of debilitating illness, at the age of 63 in January 1974. There can be no doubt that their intellectual presence in the subject was much missed, as was the consummate political skill of Charles Coulson. There were also some institutional changes at Atlas lab and at RHEL that affect the story. Although the computing facilities for the high energy physics community at RHEL had been provided on a UK machine (Orion), by 1966 that machine was proving inadequate for its needs, and RHEL won the argument that any future machine should be from IBM, on the grounds that this was necessary to collaborate effectively with CERN in Geneva. In 1967 Orion was replaced by an IBM 360/75, and this was replaced in 1971 by a 360/195. The Atlas machine was closed down in 1973, and though every effort was made to replace it by a UK machine of suitable power, this proved impossible. So in 1975, on the retirement of its first (and only) director, Jack Howlett, Atlas lab was merged for organisational purposes with RHEL, to form the Atlas computing division of the Rutherford Appleton Laboratory (RAL), and computing work was transferred to the 360/195. There was a large upheaval in 1977 when five of the Atlas lab staff who had been particularly concerned with computational chemistry support were transferred to what had been a nuclear physics laboratory, running an electron synchrotron called NINA, at Daresbury near Warrington in northwest England. The developments at Daresbury are important for the story after 1977, but need not be told now. There were similar changes in computing equipment both at Cambridge and at Manchester, but consideration of these, too, can be delayed.
The Meeting House Developments
During 1971 and 1972 there was much discussion in the Science Board of the SRC on the computational needs of theoretical physics and chemistry. It was the view of some that the way forward was to set up institutes researching these subjects and to concentrate the personnel and computing power at these institutes. Such an idea for theoretical chemistry institutes was floated by Brian Flowers (who was then Chairman of SRC), but it rather soon sank. An idea that seemed more buoyant was that of "Meeting Houses", whose origin lay probably with Prof. R. (Ron) Mason. He was until 1971 Professor at the University of Sheffield, when he went to the University of Sussex. From 1969 he was Chairman of the Chemistry Committee of the Science Board, and from 1972 to 1975 he was Chairman of the Science Board itself. He went on to be Chief Scientist at the Ministry of Defence, as Sir Ronald Mason KCB, and to hold some important positions in public life. The fruition of his idea as far as computational chemistry is concerned begins with a memorandum by Jack Howlett to the Science Board in October 1972, which is worth quoting from rather extensively. Howlett wrote:
It seems to us that the way one would go about actually setting up and conducting a "Meeting House" on any particular topic is likely to be largely independent of the topic. We suggest the following: an underlying assumption is that every project which is undertaken is expected to have some visible product, often no doubt a computer program, but in other cases (or in the same case) possibly a new method of attack on some problem or an understanding of some phenomenon.
i. We invite a small number - say 4 - of very distinguished scientists whose views would command respect to meet in the Laboratory for a day, to discuss the subject amongst themselves and to suggest areas for study, and names of people knowledgeable in those areas.
ii. We get as many as possible of the people suggested in (i) to come to the Laboratory for more detailed discussions amongst themselves and also with members of our own staff and possibly a Science Board nominee. These may go on for a few days, and should lead to the specification, in broad terms, of one or more problems that are to be tackled and to the setting up of a small group to be responsible for each. These groups may or may not include Laboratory staff and are not permanent, but disband when the project is completed. There would be an interval between (i) and (ii) whilst either a member of staff worked full-time on getting familiar with the subject or someone who was already knowledgeable was found who was willing to join the project for its duration. . . .
iii. The groups then settle down with members of the Laboratory to plan out the work in as much detail as seems sensible, and to decide on the resources required - e.g., manpower, machine time, special storage needs, special software support, special hardware.
iv. . . .
v. Having made . . . as good arrangements as we can, we and our collaborators start work. We should hope to be free of too much administrative control and reporting, but progress would be reported to the appropriate bodies - . . . - and to the 4 Wise Men (if I may refer to them thus) if, as one
would hope, they were willing to keep a general eye on the work. The working groups would of course be free - be encouraged in fact - to arrange seminars at the Laboratory and to contribute to such meetings arranged in universities. In all this we have placed the "Meeting House" concept in the domain of the Science Board, but the idea has aroused interest among the other SRC Boards, notably the Engineering Board.
In fact, the idea was taken up by the Engineering Board too, in the formation of what were eventually called Special Interest Groups, but it is the positive response of the Science Board that is of interest here and that led, through the first meeting house project, to the development of the Collaborative Computational Projects (CCPs). Jack Howlett proposed, and Science Board accepted, the proposal for the first meeting house to be one which would study "molecular correlation errors in theories which surpass the Hartree-Fock theory in accuracy" and nominated as the four wise men Professors Bransden (Durham), Burke (Queen's University, Belfast), Coulson (Oxford), and McWeeny (Sheffield). Coulson was too ill to serve, and in his place J. N. (John) Murrell of Sussex was chosen. The wise men met first at the lab on 6 February 1974 with Roy McWeeny in the Chair. They first agreed to recommend that the title for the project overall should be "The Atlas Computer Lab Meeting House in Theoretical and Computational Physics and Chemistry" and further recommended that a specific project should be started, to be known as Project 1: "Correlation in Molecular Wave Functions." It was recognised that if this project took off, then there would be a good case for other projects to start quickly thereafter. The other topics suggested were: continuum wave functions, atomic/molecular collisions, and chemical reactivity. For the moment, it was agreed just to start Project 1 by setting up a small working group to guide its progress and to report back to the four wise men (who, doubtless uncomfortable with their title, named themselves the "Steering Panel"). The first Project 1 working group consisted of two of the wise men (McWeeny and Murrell) and J. (Joe) Gerratt (Bristol), N. C. (Nicholas) Handy (Cambridge), M. A. (Mike) Robb (Queen Elizabeth, London), and B. T. (Brian) Sutcliffe (York). The working group was to meet in March and to prepare a report for the Steering Panel giving a scientific programme and also indicating the scale of expenditure (manpower, computer resources, and travel grant funding) that would be required of SRC. But before considering the scientific consequences in more detail, we give an aside on the cultural history that may perhaps be illuminating.
It is not only the historical content that is interesting in what is recorded above. The manner of the development typifies a mode of thought, reflected in a proposal for the structure of an organisation, that was perhaps peculiarly English and of its time. At least it was certainly characteristic of UK intellectual
and organisational life (including scientific endeavours) before the 1980s. The organisation is top-down and the thinking essentially elitist. The nature of the structures proposed makes it overwhelmingly likely that those lower down (admittedly only those chosen by the top) will rapidly become involved and actually set the intellectual and organisational programme of the enterprise, but the programme will be strongly guided by the elite. Such organisations, like benevolent despotisms, are not without their good features, but like benevolent despotisms, they must be tempered by the "occasional assassination or well-judged suicide" to remain responsive. They are also quite intolerable to those not chosen by the elite, who are thus excluded from participating in the programme setting. Such modes of thought and forms of organisation vanished from the UK in the 1980s. They were replaced by structures justified by populist rhetoric and operating a peculiar version of Lenin's "democratic" centralism, in which the idea of top-down organisation meant that the central committee set the targets for the workers to achieve. It follows, then, that extreme caution should be exercised in attempting to learn any contemporary lessons from the history recorded here.
The first meeting of the Working Group took place at Atlas lab on 28 March 1974. At this stage the Atlas had been decommissioned, and it was pretty clear to the participants that any future computing would be on the IBM 360/195 at RHEL. This is implicit in all that is said from now on. The agenda for that meeting began as follows:
I THE SCIENTIFIC SCOPE OF THE PROJECT
Note: The steering panel felt that its general intentions for Project 1 could best be conveyed to the working group in the form of a list of topics worthy of investigation:
i. Valence Bond Theory and Its Variants
ii. Multiconfigurational Self-consistent Field Theory
iii. Geminal and Group Function Methods
iv. Large-scale Configuration Interaction
v. Transcorrelated Wave Functions
vi. Many Body Perturbation Theory and Green's Function Methods
vii. Time-Dependent Hartree-Fock Theory, Response Functions, and Related Methods
The meeting was a lively one, as can be gauged from the minutes ("Dr Handy expressed shock . . .", "Prof. Murrell was generally sceptical . . .", "Dr Robb pointed out . . ."), and it appears that, at some stage, Prof. Murrell emerged as the Chairman. But what is really interesting is that there was a serious and extended discussion of whether a package of programs for quantum chemical purposes was the way forward and, if it was, how such a package should be implemented so as to be most useful to the community. In the end it was agreed to build on what was already available at Atlas lab and to link the existing SCF
and MC-SCF packages via a stand-alone four-index transformer to a CI package to be developed from the POLYATOM code. It was agreed, too, that a Post-Doctoral Research Associate (PDRA) would be of great benefit to the project. The discussion begun at this first meeting was continued at a second one on 8 May. A little before this second meeting took place, from 8-11 April 1974, Atlas lab sponsored a symposium, held at St. Catherine's College, Oxford, entitled "Quantum Chemistry: The State of the Art."48 In fact, it covered rather more than just quantum chemistry; it included some scattering theory too. The proceedings make interesting reading as an account of many of the contemporary concerns of computational chemistry in the UK. The symposium was also seminal, in that it exemplified a pattern for useful and effective small meetings that was later much used to good effect by the CCPs.
From the minutes of the second meeting can be seen the sort of tensions developing that are perhaps inseparable from any attempt at a collaborative venture aimed at increasing the public good. The meeting was devoted to discussing what work the PDRA, to be appointed, should undertake. It was recognised that any worthwhile PDRA would wish to be independent, but anxiety was expressed about the possibility of appointing someone who turned out to be a maverick. In the end it was agreed to present a selection of topics, each the particular interest of a member or members of the working group, and to appoint an individual with the possibility of any of these in mind. W. R. (William) Rodwell was appointed as the PDRA during the summer of 1974, in fact to develop the variational CI approach, both in its configuration-driven form and in the (then) recently invented integral-driven approach originating with Björn Roos and Per Siegbahn in Sweden. With this appointment, naturally enough, some members of the working group became more involved in the actual mechanics of implementing the code, while others became less so. At one level it could be looked on as a matter of winners and losers in the working group, but the stresses did not break up the group, for all seemed to hope for the production of code that would be of widespread use to the community.
The group did not meet again until October 1975, and by that time Atlas lab had become a division of RAL, with all Project 1 work being done on the IBM 360/195, although a UK machine (an ICL 1906) was in use for other Atlas division projects. At that meeting a progress report of work done by Project 1 was presented. The configuration-driven CI had been developed from the MUNICH CI package developed by G. H. F. Diercksen. Brian Sutcliffe was involved in both MUNICH and the developments at Atlas, as was M. F. (Michael) Chiu. The integral-driven CI had been developed from the MOLECULE program of Roos and Siegbahn. A stand-alone four-index transformer had been developed by Vic Saunders, and the integral and MO-SCF packages which now constituted ATMOL had been interfaced with the CI and transformation packages. In many ways these were extremely satisfactory developments, because anyone who had a suitable IBM machine (and Cambridge had acquired a
360/165 in 1974) could simply port the code and run it. Furthermore, any UK worker could apply to SRC for a grant of time on the RAL machine and, if successful, could run the program suite on the chosen problem, with the support of the Atlas lab group. However, the codes were pretty machine-dependent and certainly could not be used as "black boxes." In these respects the codes were no better (and no worse) than any others available in the United States and elsewhere. Computational chemistry codes became portable in a routine way only in the 1980s, when their operation also became transparent to users, with the widespread use of free-format input, made interactive on an interface with suitable graphics. In fact, free-format input was actually a feature of the ATMOL suite, and so it was rather ahead of its time. Given the era, however, it is not surprising that quite a lot of the discussion at the 1975 meeting was devoted to porting the codes to other machines. Of course, the need for porting was not a new one. Martyn Guest had actually ported the ATMOL code to the CDC 6600 at the Manchester Regional Centre (UMRCC) even before he joined Atlas lab. But porting was thought particularly urgent because the Regional Computer Centres envisaged by Flowers were now up and running with a new generation of machines, and it was agreed that the porting should be done by the permanent staff at Atlas who were involved in the project. There was also much discussion on what should be tackled next. The Chairman put forward the view that the Project should be closed down early in the New Year, but other members of the working group had their own agendas for continuation. No agreement was reached on these agendas, and a whole-day meeting, a month or so hence, was proposed, at which advocates of particular developments could concentrate on persuading the working group. Two other aspects of the meeting bear noting. The first was the realisation that a powerful interest had not been given a voice on the working group and that it was therefore important to incorporate it before too long, so Ian Hillier of Manchester was invited to join. The next was a discussion of relations with, and possible support for, the Centre Européen de Calcul Atomique et Moléculaire (CECAM), an institute in Paris run by Carl Moser. In fact the UK had joined what was then called the Common Market (its successor is the European Union (EU)) on New Year's Day 1973, but there had also been a referendum in 1975 on whether the UK should stay in on the existing terms. Since the referendum had gone in favour of staying in under the existing terms, the idea of Europe-wide science was being seriously considered, even though historically SRC had not favoured such an idea. This was chiefly because joint ventures meant budget commitments related to a basket of foreign currencies, whose exchange rates were beyond SRC's control. (SRC had been very badly burned over the UK contributions to CERN in the early 1970s because of currency fluctuations.) Subsequently Dr Hillier agreed to join the working group (and indeed was to succeed John Murrell as Chairman in 1980), but SRC declined to support CECAM, a position not changed until 1979.
The whole-day meeting took place on 4 December 1975, at which Joe Gerratt spoke for Valence Bond Techniques, Dr B. T. (Barry) Pickup of the University of Sheffield spoke for Green's Function Methods, Mike Robb and Vic Saunders spoke for Coupled Pair Many-Electron Theories (CPMET), and Nicholas Handy spoke for Direct (that is, integral-driven) CI methods. The upshot was that Joe Gerratt was invited to inaugurate a pilot study on the viability of the VB method and that, in collaboration with Dr Robb, a large-scale CPMET project should be undertaken. These proposals were considered by the Steering Panel, who recommended them to Science Board, where they were accepted at the spring 1976 meeting. Thus Project 1 was continued for another 3 years. But large changes were in the offing. Proposals had been made to reorganise the SRC permanent laboratories, and these were about to be acted on. Their consequence for this story is the shifting of the computational chemistry focus from Atlas to Daresbury Laboratory, as mentioned above. But before going on to consider the developments after the change of location, it is useful to back-track a little and consider the chemical database work done at Atlas lab.
The Chemical Database Developments
The efficient and effective inversion and manipulation of chemical databases formed one of the most difficult and intellectually challenging problems that faced computational chemists in the 1970s, not least because of the essentially pictorial nature of chemical structure information and the need to specify the connectivity of significant fragments. The first work in this area in the UK was undertaken at Atlas lab by Mike Elder and Pella Machin, in collaboration with O. S. Mills of the University of Manchester, using the Cambridge Structural Database (CSD) developed at the Cambridge Crystallographic Data Centre (CCDC) under the direction of Olga Kennard. An account of the state of crystallographic databases generally in the mid-1980s can be found in the IUCr monograph.49 The work that began here developed in the 1980s into a Chemical Database Service, provided through the SRC and its successor, the SERC, for the UK chemical community, making available databases of NMR, IR, and mass spectra as well as reaction-oriented databases. In 1974, when this work was begun, the CSD consisted of about 25,000 entries, each entry having a bibliographic element, a connectivity element, and a data element. The data element, giving atom positions and so on, was (and is) by far the largest element in an entry. The UK workers used the CSD files as inverted by a program developed at the National Institutes of Health (NIH) in Maryland and thus began using, as the information retrieval system, the Crystal Structure Search and Retrieval (CSSR) program originating there too. Both these programs owed their origins to the work of Richard Feldman at NIH. Essentially, the NIH file inversion program was incorporated into the UK suite and then updated independently of any NIH developments, as and when the
CCDC conventions for file specification were changed and updated. The CSSR program was also taken over, and it developed into a program to handle not only CSD data but also data from other chemical databases. Thus Feldman's ideas and philosophy much influenced the development of the UK codes. By 1977, when Drs Elder and Machin had also moved to Daresbury, a set of extremely effective programs had been developed, whose utility to the community was increased by the early developments in networking that were taking place, albeit limited by modern standards. It was also clear that such work was of great commercial interest and that there was going to be commercial competition. Thus database developments were perhaps 6 or 7 years in advance of the rest of computational chemistry in having to face up to the dilemma of operating an enterprise set up initially only with the public good in mind but readily developed as a source of revenue, perhaps in competition with a commercially provided alternative. This theme is one to be returned to later.
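The "inversion" spoken of throughout this work is, in modern terms, the construction of an inverted index: a map from each searchable key (a bibliographic term or a connectivity fragment) to the set of entries containing it, so that a query intersects a few short lists instead of scanning every entry. A toy illustration (invented entry numbers and fragment keys, nothing like the actual NIH file format):

    from collections import defaultdict

    # Invented entries: each carries a set of connectivity-fragment keys.
    # A real CSD entry also holds bibliographic and coordinate elements.
    entries = {
        1001: {"C6H5", "COOH"},
        1002: {"C6H5", "NH2"},
        1003: {"COOH", "NH2"},
    }

    # Build the inverted file: fragment key -> set of entry numbers.
    index = defaultdict(set)
    for ref, frags in entries.items():
        for frag in frags:
            index[frag].add(ref)

    # Retrieval by intersection: entries carrying both a phenyl
    # and a carboxyl key.
    hits = index["C6H5"] & index["COOH"]
    print(sorted(hits))                  # -> [1001]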
The Growth of Networking
In the UK the Post Office had a monopoly of telecommunications provision dating from the nineteenth century, until the telecommunications part of it was split off and privatised in the 1980s. Thus for communication between sites at the time of interest here, the Post Office had to be dealt with as a monopoly supplier of electronic communications facilities. Within a site, however, the owners were free to do as they pleased unless they wished to connect to the Post Office lines, which they had to do in a specified and regulated way. By 1973 not only did most computers have an operating system, but they also had a time-sharing operating system, which allowed terminals to be connected to the machine over asynchronous lines and via which editing of files could be undertaken and jobs could be submitted to the batch queues. Initially the terminals were usually noisy teleprinters; later, dot-matrix printers (like the DECwriter) were used. Video display units (VDUs) were available by 1972, though in the UK they were very expensive indeed. (They cost about £1000 in 1972, about the same as the cost of a small motor car then.) Thus within buildings and across campuses connections were made (usually via 20 mA current loop circuits) from the terminals to the machines. The sort of machine that a medium-sized (about 4000 students) university in the UK might have would support between 30 and 40 lines. These lines were also capable of taking digitised output directly from experimental apparatus, provided that the data rates were not too high, and so the networking of apparatus (which had been developing piecemeal since the 1950s) was to some extent systematised. Connections were made between institutions via telephone lines, and these were of varying speeds and reliability, essentially according to the tariff paid for them. The cheap and cheerful connection was basically an ordinary 300-baud telephone line that enabled the user to operate a terminal as if present in the terminal room of the machine but, of course, with a much slower response time than if actually present. Faster lines were available, up to 1900
baud in 1973, and very fast lines could be hired (at great expense) for direct machine-to-machine communications. These developments mark a watershed in the way that computing was undertaken, and a rather more detailed account of them can be found in the article by Paul Bryant50 in the final issue of the SERC publication Engineering Computing Newsletter. (This publication also contains some interesting material on the growth of the interactive use of computers within the general science community in the UK.) They spelled the end of the punched card and the punched tape, and the end of sociable days and nights at the computer lab spent, if not running and debugging one's own programs, then hanging around and chatting to find out what problems others were working on and what puzzles they were encountering. Of course the change did not occur overnight; still, most people preferred to spend time at the lab when setting up a new program. But the developments in the UK were ad hoc and rather disorganised, and this was recognised by the Computer Board, which, in 1973, set up a networking committee under the chairmanship of Prof. M. (Mike) Wells of Leeds University, among whose members were Mervyn Williams (formerly of the Post Office), Roland Rosner (RAL), and Chris Morris (Bristol University). They reported to the Computer Board in 1974, recommending that a national network be set up and that its development be supervised by a six-person Joint Network Team (JNT). Although the report was accepted, these developments did not come about until 1979, and the proposed network, eventually called JANET (Joint Academic Network), did not come into full service until 1 April 1984. There were also changes on the national telecommunications (telecomms) scene. The Post Office had developed an experimental packet-switched system for data transmission, known therefore as EPSS, which became available in 1975. With the development of Packet Assembler-Disassemblers (PADs), the existing ad hoc system of networking became somewhat more organised and started to use standard protocols. Intramural communications also tended to become packet-based. Thus although the more traditional computational chemists stayed with punched cards or tape and preferred to run their programs in as "hands-on" a manner as possible until the end of the decade, the carnival was over. Where the user was geographically, in relation to where the computer was, had less and less relevance, as the decade progressed, to what could actually be accomplished. By the end of the decade, too, electronic mail (e-mail) had begun to make an impact on how people operated and interacted.
Daresbury and Collaborative Research Projects
Further organisational changes were consequent on the move to Daresbury lab, because Prof. Phil Burke, one of the original four wise men, had been appointed, on a part-time basis while still retaining his Belfast appointment, to coordinate theory and computational science developments at the lab. At that
The 1970s 301 time the lab was between experimental projects, for its Nuclear Structure Facility (NSF)was still being built, as was its Synchrotron Radiation Source (SRS). However the experimental work in both these areas was being done at other places, both in the UK and abroad, but much computing and processing of results were still done at the lab. In these circumstances, theory and computation were dominant in the active life of the lab. Phil Burke took the opportunity provided both by this and by his seniority in SRC counsels, to reorganise the Meeting House, beginning at a meeting of the Steering Panel held at Daresbury in October 1977. The upshot was that the Collaborative Computational Projects (CCPs) were invented. The first one, CCP1, was just the old Project 1 renamed, but other ones developed shortly thereafter, much in the way that had at first been envisaged at Science Board. Thus in 1978 CCP2 on Continuum States in Atoms and Molecules began, a project which in practice was closer to physics than to chemistry, Also in 1978 the Surface Science CCP3 began, involving both physical chemists and physicists. Anticipating the completion of the SRS, which would have a number of X-ray ports on it, CCP4 on Protein Crystallography was started in 1979 and thereaker developed rapidly. And by the end of the decade, CCPS on Monte Carlo and Molecular Dynamics of Bulk Systems and CCPG on Heavy Particle Dynamics were up and running. In these last three projects as in the first, chemists were dominant. The transfer to Daresbury of the workers from Atlas involved some difficulties. The Daresbury machine, an IBM 370465, had only a primitive spooling system, and because since its purchase, the first generation of virtual machines, like the 3701168, had come along, it was obsolescent and operated with an obsolescent operating system (MVT). It was also subject to inexplicable breakdowns. (Part of this unreliability was eventually traced to an electrician who had a grudge against the lab and was skilled at injecting spikes into the power supply.) Within a year, however, the computational chemistry work being done centrally had recovered to the level that it was before the move. William Rodwell had left Project 1 to take up a post at the Australian National University, and Stephen Wilson was appointed as the Project PDRA with the special task of developing the CPMET work. The transfer to Daresbury was celebrated in December 1977 with the first Daresbury Study Weekend, inspired by the Atlas symposium mentioned above, which was devoted to the study of correlated wave functions.51 It is interesting to read Phil Burke’s “Introduction and Welcome” address to the assembled company in which he took the opportunity to define the CCPs. The aim of the Collaborative Projects can be summarised as follows: 1
i. to provide for the rapid interchange of information (theory, algorithms, and programs) in selected areas of study;
ii. to collect, maintain, and develop relevant items of software;
iii. to encourage basic research in the given area, by providing facilities for rapid computer implementation of new methods and techniques;
iv. to assess and advise on associated computational needs;
v. finally, . . ., to disseminate information among University and other research groups by organising "symposia" and "workshops."
To assist in this work it was envisaged that the projects would be supported by
i. Daresbury Laboratory staff,
ii. short-term appointments of Senior Visiting Fellows,
iii. longer-term Research Assistantships - the original idea was perhaps two per project for up to 3 years. However, financial difficulties within SRC as a whole have limited this to one,
iv. provision of travel expenses and subsistence for visitors from Universities needing to spend days or weeks at a time working on a project.
Although these statements are clear enough, they still leave open the problem of precisely who was supposed to benefit from the CCPs and how that benefit was supposed to be measured. The precise method of funding is also left unspecified. The benefits difficulty was resolved, or not, entirely within the group working on a CCP, and the tensions arising in trying to resolve it led to the collapse, at least for a time, of one project. The funding problem was to bedevil the projects for the next decade. If they were funded (as was the case at this time) directly by the Science Board, other Science Board projects and committees would have correspondingly diminished grants, for the Science Board had a rigidly fixed budget. But why should, say, the Physics Committee have its budget reduced to pay for CCP1, from which, it was argued, the physics community did not benefit? There was also the problem of where in the SRC budget lines the staff salaries of the lab staff should fall. This last issue has remained problematic to the present, but in the early 1980s it was decided that a CCP should be treated just like a research grant: to be applied for to run for 3 years, to be refereed (by non-UK referees, eventually), and granted or not. The Treasury guidelines issued in 1977 set out in some detail the financial considerations that should be used in testing the merit of a proposal for spending made to a public body. Essentially, a "cost/benefit" analysis was required. But in practice this was not worthwhile for the relatively small sums (some thousands of pounds, without staff salaries and on-costs (indirect or overhead costs) at Daresbury) that were involved in a CCP renewal proposal. At first it was proposed to try to estimate the staff and on-costs at the lab and to attribute them to the project. This was tried once and then abandoned, as the notional attributions had only a metaphysical significance as long as the lab was in being and actually continued to employ a staff member. This funding treatment of a CCP did not, of course, relieve all anxieties. CCP1 was funded through the Chemistry Committee, and it was the (paranoid?) nightmare of the working group that, if a renewal were applied for, they might face a coalition of organic chemists on the committee, hostile to
theory and computation. But, however it happened, CCP1 survived a number of renewal applications, and still exists at the time of writing. The CCPs were now firmly placed in the scheme of things and developed steadily over the next decade, not only in terms of computing done but also in terms of Study Weekends organised, senior visitors invited to advise on projects and to work with them for short times, and workers in a project travelling to visit groups doing relevant work outside the UK. The CCPs also each published newsletters, usually on a quarterly basis, and these were supplemented, where appropriate, by papers in the Daresbury Laboratory preprint series. Of course it is not the case that all the CCPs worked and developed in the same way, but a perfectly typical pattern is provided by the development of CCP1, and so this will provide the basis of the account of developments up to the early 1980s.
CCP1 and the Advent of Vector Processing

The developments reported at the next meeting of the working group in July 1978 were quite encouraging, particularly on the CPMET front. This was perhaps just as well, for the Chairman had just returned from an American theory conference in Boulder, and he informed the meeting that the two most impressive theory contributions there had been by Rodney Bartlett on CPMET and by John Pople on size-consistent CI using Møller-Plesset (MP) theory, and that the UK had better look sharp if it was to remain in competition. The Chairman felt that “particular attention should be given to the MP work, bearing in mind the large influence of Pople in computational chemistry and his policy, to date, of making programs widely accessible to the ‘ordinary’ chemist.” Prof. Murrell said that a high priority ought to be integrating the MP treatment into the ATMOL suite.

Among the other ideas canvassed at the CCP1 meeting was a concerted attack on pseudopotential methods and another on density functional methods. These sections of the discussion benefited from the presence of some who were not on the working group but who had experience of the methods discussed, and it was agreed that applications for extra funding by interested workers should be made to SRC with the blessing of CCP1. There was also a lively discussion as to whether or not it would be sensible for joint work to be undertaken on a particular chemical system, and Joe Gerratt argued strongly, supported by Brian Sutcliffe, for a joint attack on NO2. It was proposed that the working group “sponsor” an application to SRC for extra funds for this work. The working group declined to do this, but the discussion motivated the Chairman to suggest that an application to SRC for a block of computer time for quantum chemistry, to be distributed by the CCP1 working group, might be a good idea. The visitors to the meeting indicated in no uncertain terms that they thought it would be a very bad idea, and so it was dropped.
The working group met again in July 1979, when it was reported that SRC had agreed to support both the density functional and pseudopotential work and that it had finally come round to a modest level of support for CECAM, at an initial level of £25,000 for 3 years. There was also some discussion of what Steve Wilson should undertake in his last year as PDRA: whether to put bells and whistles on the CPMET code or whether to begin developing a Unitary Group CI code. These discussions were given particular urgency because in June Daresbury had taken delivery of a CRAY 1S, and so decisions were going to have to be made, and made quickly, about the transfer of codes and about which codes it would be effective to port to, or develop on, a vector machine.

It is appropriate here to interrupt the linear account of the goings-on in CCP1 to provide a little background to the acquisition of the CRAY supercomputer. As the technology of large-scale integration developed in the 1970s, so the idea of the mini-computer developed, and by 1975 some computational chemistry groups in the United States had equipped themselves with group minis (generally Digital VAXs). They were relatively easy to install and required little special provision in the way of climate control and power supply. They were also immensely reliable, needed little maintenance, and were easy to program and use. Groups that had minis were able to free themselves from the whims and inefficiencies of often all-too-powerful computer centres, and they could run the machines all day and all night, all week and all year. So, although they were often factors of 20 or 30 slower than the best large machines, the throughput of work on them might well be much greater than on a big machine. Computational chemists in the UK were well aware of the arguments in favour of such machines, and aware too that the funding bodies favoured this development, because unit purchase costs of such machines were small and thus upgrade costs occurred in smaller increments. However, the community was largely united in the view that this was not the way forward (in contrast to the engineers and the astronomers, who did favour this route). They lobbied hard for the central provision of massively large machines, essentially because they saw themselves as dealing with massively large problems. It is interesting to contrast the community attitude at this time with that of 10 years later, when the same community proved extremely responsive to the appeal of group minis, when the Science Board Computing Committee (SBCC) attempted to encourage their use by means of special grants.
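The throughput argument can be made concrete with a little arithmetic. The sketch below (in Python, purely for illustration) uses invented figures, a mini assumed 25 times slower than the shared machine and a notional weekly allocation on that machine; none of the numbers are recorded allocations from the period.

```python
# Illustrative only: weekly throughput of a dedicated group mini versus a
# share of a central mainframe. All numbers here are assumptions chosen
# to make the argument concrete, not historical allocations.

MAINFRAME_SPEEDUP = 25.0      # big machine assumed 25x faster than a mini
MINI_HOURS = 24 * 7           # a group mini can run around the clock

for mainframe_hours in (5, 10, 20):          # hours/week a group might get
    mainframe_work = MAINFRAME_SPEEDUP * mainframe_hours
    mini_work = float(MINI_HOURS)            # mini speed defined as 1.0
    winner = "mini" if mini_work > mainframe_work else "mainframe"
    print(f"{mainframe_hours:2d} h/week on the big machine: "
          f"{mainframe_work:5.0f} vs {mini_work:3.0f} work units -> {winner}")

# Break-even: the shared machine wins only above 168/25, about 6.7 h/week.
print(f"break-even allocation: {MINI_HOURS / MAINFRAME_SPEEDUP:.1f} h/week")
```

On these assumptions a group needed roughly 7 hours a week of dedicated mainframe time before the shared machine out-produced a mini running unattended all week, which is why the throughput case for minis was so persuasive in the United States.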
The CRAY was, officially, to be the UK’s first vector processor, although it was a pretty open secret that one had come in to the Atomic Weapons Research Establishment at Aldermaston earlier in the year. Time on it was to be allocated by the usual peer-review process, so that its use should become part of the standard SRC grant allocation process. Although by this time EC rules prevented the operation of a crude buy-British policy, they did not prevent opposition to an import from the United States. The relevant papers have not yet been released, but the folklore is that there was a huge battle between the Chief Scientist at the Department of Trade and Industry, the flamboyant and colourful Duncan Davies, and the Chairman of SRC, Sir Geoffrey Allen. Duncan wished to make sure that the ICL distributed array processor (DAP) was properly considered. This in itself was not problematic, for a working group had been set up under the Chairmanship of the then Chief Scientist at the Ministry of Defence, Sir Hermann Bondi, to consider it. But it became clear to those considering it (among whom was Phil Burke) that one would have to buy an ICL machine and, what is more, a machine not yet built; this option was therefore felt to be a nonstarter as far as active work was concerned. Sir Geoffrey was seized of this view but still needed to persuade others in order to get an import licence. It is believed that he failed in the battle to persuade. It seems not unlikely that political considerations were involved. The ICL main factory was in the West Gorton constituency of the Minister for Technology, and the DAP was actually made in the Stevenage constituency of the Secretary of State for Education and Science. However, Sir Geoffrey’s personal assistant, Eric Sampson, who was a wily fellow, enlisted the aid of Brian Oakley, who had been a senior civil servant at the Department of Trade and Industry but who had just become Secretary of the SRC. They fixed it with the aid of a spot of creative accounting on Cray’s part. They agreed that the machine to be delivered was to be Seymour Cray’s prototype, which had already seen use at seven other sites and actually did not have parity error correction on the memory. It was thus, for the books, a second-hand item of obsolescent design. The machine was to remain owned by Cray Research Inc., but it was to be sited at Daresbury. It was to be used as a demonstration machine by Cray who, in return, would reimburse SRC for use of accommodation, services, and staffing. The SRC was to pay for 40 hours a week of access, and this arrangement was to run until 30 June 1981.

The machine was front-ended by the IBM 360/175 from August 1979, and a service to users was begun in January 1980. There need have been almost no gap between the front-ending and the start of service if security and accident had not intervened. The nuclear lab at Aldermaston had an effective front-ending code that Daresbury sought to use. There was some delay in granting this on security grounds, but when the code arrived, it did not work, because it was written for a front end running MVS and not MVT!

It might be useful here to remark that the CRAY was an integrated solid-state machine, and perhaps the first really big one. The technology was bipolar semiconductor. It had 72-bit words, of which 64 bits were data, and a maximum of 4M words. The cycle time was 50 nanoseconds and the clock period 12.5 nanoseconds. Compact though it was, it was difficult to keep cool and required freon coolant. It had a very substantial disc backing store and could be run with very little operator intervention. Although it was only dimly grasped at the time of its installation, this kind of machine presaged the beginning of the end, at computing centres, of large operating staffs, whose chief tasks had been the mounting and demounting of tapes and discs to ensure that the batch queue ran smoothly.
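For the modern reader it may help to translate the machine parameters just quoted into rough performance figures. The sketch below is a back-of-envelope calculation; the assumption of two floating-point results per clock (concurrent add and multiply pipelines) is the conventional basis on which the CRAY-1 peak rate was quoted, and is ours rather than the chapter’s.

```python
# Back-of-envelope figures implied by the machine parameters above.
# Assumption (not from the text): two floating-point results per clock,
# from the concurrent add and multiply pipelines.

clock_period_s = 12.5e-9                  # 12.5 ns clock period
clock_rate_hz = 1.0 / clock_period_s      # 80 MHz
peak_flops = 2.0 * clock_rate_hz          # about 160 MFLOPS peak

max_words = 4 * 2**20                     # 4M words maximum fast store
data_bits_per_word = 64                   # of the 72 bits, 64 carry data
data_bytes = max_words * data_bits_per_word // 8

print(f"clock rate : {clock_rate_hz / 1e6:.0f} MHz")
print(f"peak rate  : {peak_flops / 1e6:.0f} MFLOPS")
print(f"fast store : {data_bytes / 2**20:.0f} MiB of data (maximum)")
```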
Of course large operations staffs were still required on the conventional machines (from now on called scalar processors, SPs, to distinguish them from vector processors, VPs), and these were still used, if only to front-end the VPs. The philosophy of front-ending was, of course, communications driven, because it was deemed inefficient to run a VP directly. However, by the middle 1980s front-ending had passed, and in any case scalar processors had become as compact, and had acquired as large a backing store, as VPs had had at the beginning of the decade.

To return now to the CCP1 proceedings, it was clear that for the staff at Daresbury the CRAY was a fascinating new toy and that the porting and development of codes for the CRAY was a task that would engage their wholehearted enthusiasm. In fact Vic Saunders said that all the ATMOL codes could be ported to the CRAY in about a month’s work and that they could subsequently be vectorised to increase their efficiency. And Saunders and Guest accomplished precisely this, to the satisfaction of the whole community, as will be seen later. It was agreed to have another Study Weekend devoted to Electron Correlation and to invite speakers from abroad to talk about Unitary Group CI, Symmetric Group CI, Pair Theories, and Valence Bond Theories. This Study Weekend took place 17-18 November 1979, and its proceedings52 still make interesting reading, perhaps especially the paper by Vic Saunders entitled The Use of Vector Processors in Quantum Chemistry, in which he explained how he had vectorised the ATMOL Gaussian integral code and how his code now ran 16.2 times faster than was possible using the best code on the then-fastest scalar processor in the UK, the CDC 7600 at Manchester.

What was possible by vectorisation is most clearly seen in the paper presented by Martyn Guest and Steve Wilson at the American Chemical Society meeting at Las Vegas in August 1980.53 What was shown there is that “straight” porting of FORTRAN code, with no changes except those forced by I/O conventions and job control language (JCL) differences, gave factors of at least 4 and sometimes 10 in improved performance over the IBM 370/195. Thus, given the machine, one got an order of magnitude improvement in code performance for next to no effort. And, with effort, two orders of magnitude were often achieved. It is hard to convey the sense of exhilarated anticipation that the UK workers felt at that Study Weekend. Although not all the computational chemists had actually got on to the CRAY yet, all understood what the benefits were likely to be and what could be done. Thus all felt that they were in with a chance again at the international level. Unsurprisingly, at a brief meeting of the working group during the Study Weekend, all agreed to apply for another renewal of the CCP, to carry it over beyond September 1980 when the project was due to end. Chairman Murrell and Vic Saunders agreed to prepare a case for the Science Board, which was subsequently submitted and was successful.
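The flavour of what vectorisation meant can be conveyed by a small, admittedly anachronistic sketch: the same arithmetic written as an explicit element-by-element loop and as a single array operation. On any modern machine with NumPy the gap between the two is of the same order as the factors reported above, though the hardware, and therefore the numbers, are of course not those of 1980.

```python
# A toy contrast between scalar-style and vectorised code. The speedup
# printed will depend on the machine; the point is the order of magnitude.
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

t0 = time.perf_counter()
c_loop = np.empty(n)
for i in range(n):            # one element per trip, as a scalar CPU saw it
    c_loop[i] = a[i] * b[i] + a[i]
t1 = time.perf_counter()

c_vec = a * b + a             # the whole array in pipelined vector operations
t2 = time.perf_counter()

assert np.allclose(c_loop, c_vec)
print(f"loop      : {t1 - t0:.4f} s")
print(f"vectorised: {t2 - t1:.4f} s")
print(f"speedup   : {(t1 - t0) / (t2 - t1):.0f}x")
```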
Quantum Chemistry Outside CCP1

It would be quite wrong to give the impression that all UK computing in this decade was an aspect of CCP work. Workers were supported on an individual basis if they were successful in SRC grant applications or if they could obtain other funding for students, PDRAs, and computer time. Indeed it was perfectly possible for work that had not convinced the working group of a CCP of the need to support it collectively to be funded at a modest level on an individual basis. Thus Joe Gerratt failed to convince the working group of CCP1 of the value to the UK community of developing a suite of programs to evaluate integrals over Slater orbitals; nevertheless, this work was funded with a grant of IBM 360/195 time for a number of years. Certainly CCP1 was sensitive to its rather elitist nature and, to mitigate this, made some more cooptations to the working group and did all it could to make sure that all quantum chemists in the community knew what was going on as a consequence of its activities. It was, of course, the case that some workers simply preferred to work alone and felt that CCP1 was irrelevant to their concerns. But their funding, on the basis of the case made, did not depend on the goodwill or say-so of CCP1. Neither did the CCPs prevent the growth of large independent groups of workers in Universities. Indeed the decade is noteworthy for the regrowth at both Manchester and Cambridge of powerful centres of electronic structure calculation, each with many students and PDRAs, whose funding and operation were not at all reliant on the CCPs.

It should be noted, too, that almost all the UK developments in computational quantum pharmacology, drug design, and the study of the electronic structure of carcinogenic compounds took place outside a CCP context. Why this happened is difficult to discern, but perhaps it was because commercial sponsorship for work in these areas was more readily available than in some other areas, and so the need for collaborative endeavour was less. Indeed, possibly the nature of commercial sponsorship actually discouraged collaboration. However that may be, there were lively developments in these areas in the UK, most notably perhaps in the group of W. G. (Graham) Richards in Oxford and in that of Colin Thomson at St. Andrews.

Thus while the CCPs cannot be seen as stifling initiative or preventing innovation, they did, of course, consume funds that had other uses. So, naturally enough, there was always argument about whether the money spent on CCPs was the best use that could be made of that money. And it was precisely the idea of the market being the best way to determine what was value for money that, in the late 1980s, provided such a radical change in the way in which UK science in general and computational chemistry in particular were to be dealt with. In May 1979, a government committed to a free-market philosophy had been elected.
Due to other priorities of this government, it was not until 1987 that real attention could be turned to restructuring education and provisioning science on market principles. It is thus prudent to confine what remains of this review to the period up to 1987. This cutoff means, too, that no discussion will be given of computational chemistry in the burgeoning world of powerful workstations, massively parallel machines, and networked PCs. Nor will the impact of the Internet be considered.
INTO THE 1980s

For UK computational chemistry, the 1980s began with very great promise, and certainly the first 2 or 3 years of the decade were years of outstanding achievement. Among the most outstanding on the quantum chemistry front was the work of Saunders and van Lenthe on direct CI. In a remarkable paper,54 they reported a direct CI program in which 10⁶ configurations could be handled in 2 minutes. Provided that the reference space was sufficiently small, the timings indicated only a linear increase in cost with configuration number, provided that the fast store was large enough to hold a trial vector. The Daresbury CRAY, on which this was done, actually had only 1/2 Mword of fast store and not the maximum 4 Mword. There was also the development of a vectorised version of GAMESS, implemented on the CRAY by Martyn Guest, that provided an outstandingly flexible and efficient package for LCAO-MO-SCF work, in which the integral evaluation code went about 20 times as fast as it did on the IBM 360/195.

For a time, then, the UK was ahead of the field, for these kinds of speeds simply could not be matched on the minis (not even on the very popular VAX machines) that had been opted for by so many workers in the United States. Of course this did not last, for there was a rapid development of supercomputer centres in the United States, funded by both Federal and State resources. On the whole, the United States machines had much larger fast store than did the UK machines, and so the United States workers again drew ahead in full CI calculations. The work of Charlie Bauschlicher at NASA Ames on the CRAY-2 exemplifies this trend.55 But it was not only in the large-machine area that the United States again drew ahead. It was also during this period that the Gaussian program system began to develop as a genuine “black-box” system, so that it could be used effectively by any group running a VAX or other such smallish machine. Similar developments occurred with the MINDO, MNDO, MOPAC, and other semiempirical MO systems. For an overall view of the state of play in such developments at about this time, the book by Clark (a UK-educated chemist who was actually working in Germany) is a fair guide.56
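The proviso about holding a trial vector in fast store is easy to quantify. The sketch below assumes one 64-bit word per configuration coefficient and ignores everything else resident in store (both simplifications are ours, for illustration); it shows why the 1/2 Mword Daresbury machine was marginal for calculations of this size, and why the much larger American fast stores mattered for full CI.

```python
# How big must fast store be to hold one trial CI vector? Assumption
# (ours, for illustration): one 64-bit word per configuration coefficient.

def trial_vector_words(n_configs: int) -> int:
    """Words needed for a single trial vector, one word per coefficient."""
    return n_configs

MWORD = 2**20
stores = {
    "Daresbury CRAY (1/2 Mword)": MWORD // 2,
    "CRAY-1 maximum (4 Mword)":   4 * MWORD,
}

need = trial_vector_words(10**6)
for machine, capacity in stores.items():
    verdict = "fits" if need <= capacity else "does NOT fit"
    print(f"{machine}: {need:,}-word vector {verdict} "
          f"({capacity:,} words available)")
```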
Computer Developments

United Kingdom workers were, unfortunately, to face a series of disruptions as a consequence of policy decisions resulting in changes in the organisation of laboratories in what was the SRC, which in 1981 became the SERC, the word Engineering now added explicitly to its title. During the first half of the 1980s there were three successive enquiries into the provision of scientific computing in the UK. The first was a Computing Review Working Party set up by SRC in December 1979 under the Chairmanship of Prof. Roger Elliott (Oxford), which reported in December 1980. In April 1983 the Central Computing Committee (CCC, a body set up as a consequence of the Elliott Report) of the SERC set up a working party under the Chairmanship of Prof. Alistair MacFarlane (Cambridge), who had been a member of the Elliott Committee, to review again SERC’s scientific computing provision; this committee reported in the spring of 1984. Finally, the Advisory Board for the Research Councils (ABRC), which covered all the Research Councils including SERC, the Computer Board, and the UGC, set up a joint working party to consider the likely need for advanced research computing for the UK academic community. This committee was chaired by Prof. John Forty (Warwick), who had been a member of the MacFarlane Committee, and his report was made in June 1985.

The report of the Elliott Committee was, on the whole, encouraging to computational chemists, because it recommended that the CRAY should be kept on at Daresbury until June 1983. It also recommended that the obsolete IBM 360/175 should be replaced by a modern and more reliable machine that would be able not only to front-end the CRAY but also to take a large portion of the growing scalar computing load at the lab as the experimental facilities there came on line. The report expressed agreement with the traditional SRC role of supporting state-of-the-art computing and noted that by 1983 it could well be that the Computer Board regional centres would be in a position to take over responsibility for vector processing. Council agreed to the arrangements proposed for the CRAY and to a new scalar machine at Daresbury, and early in 1981 a NAS 7000 was delivered, which proved to be a very reliable machine. The CRAY was also upgraded from the prototype machine to one with proper memory. However, sometime during 1982 the decision was taken that Council should no longer support the CRAY at Daresbury, because the machine did not, by then, represent the state of the art in computing. The provision of “standard” computers for the research community lay within the remit of the Computer Board, and it was now up to them to fund a CRAY.

Although this was an understandable decision, acting on it actually had pretty dire consequences for the computational chemistry community. At Daresbury by this time there was an infrastructure of knowledgeable support staff involved in the VP-based projects. This expertise was not yet matched at any of the regional centres. The communications from Universities
to the regional centres, though no worse than those to Daresbury, were still not fast or entirely reliable, and to get help from those still learning themselves was often frustrating. To compound the misery, it was decided that the machine to be used at the chosen regional centre (it was to be the London centre, ULCC) was actually to be the machine at Daresbury. Perhaps there is no need to say more. It was extremely difficult to work to any effect for the year following the transfer of the CRAY to ULCC. The consequent anguish on the part of computational chemists finds expression in evidence given to the MacFarlane Committee.

Also in evidence to this committee, the CCPs strongly supported the idea that SERC should provide researchers with the next generation of large-scale state-of-the-art computer. The Committee did not feel that a view on provision of that nature lay within its remit, but it did recommend that a working party be set up to give urgent consideration to the future provision of advanced research computing. This proposed committee was realised as the Forty Committee. The effects of the report of the Forty Committee did not begin to be felt until after 1987, the chosen closing date for this chapter, but its decisions are of interest as indicating the way in which thinking was going in 1985. Appendix I of the Committee Report,57 which consists of the evidence submitted, is a treasure trove of the views and attitudes of those involved in the UK in what was shortly to become widely designated as “Computational Science,” spanning weather forecasting to molecular biology (bioinformatics).

The Committee did, in fact, recommend that the UK academic community should have access to super-computing facilities at the forefront of technological capability. As a first step in the phased provision of such facilities, the purchase of a CRAY X-MP/48 was recommended, to be installed at RAL. This recommendation of the Forty Committee was, in due course, acted on, and so super-computing in the UK returned in 1987 to where it had started in 1964. But by the time this had happened, mass storage had grown so efficient and extensive, and operating systems had become so sophisticated, that, compared with the old Atlas, the new machine effectively ran itself. Moreover, communications had become so capable that, even though the skilled and experienced support staff remained at Daresbury, it really made little or no difference. The Committee also recommended that funds should be allocated to provide a distributed system of other forms of advanced research computing, including special-purpose machines and powerful graphics workstations, to enhance local resources at selected university and research council sites. This proposal led to the Computational Science Initiative of the SERC, which was run by SBCC and which resulted in the 1980s equivalent of group minis being widely distributed throughout the UK computational chemistry community. But this time the community welcomed them, because they were provided in the context of a secure large-scale super-computing environment and not in competition with it.
The Committee recognised the likely future importance of array processors like the Floating Point Systems (FPS) 264, which was then just being marketed, and also of parallel machines, which it saw as being particularly useful in pattern recognition and in database management, a view that was strongly supported by evidence from the protein crystallography community.
Computational Chemistry Developments

Although the CRAY was, by 1983, no longer actually at Daresbury, the lab did remain the centre for CCP developments, as it did for what was to become the Chemical Database Service (CDS). Toward the end of 1980 Ian Hillier replaced John Murrell as Chairman of CCP1, and it was under his Chairmanship that the project made an interesting move by appointing the project PDRA to be resident at a University rather than in the lab. The appointment in question was that of Roger Amos (to succeed Steve Wilson) to work on the development of derivative methods in electronic structure calculations. He was part of Nicholas Handy’s group in Cambridge, and the work begun at that time culminated in the program system CADPAC and also in a cut-down PC version called MICROMOL. This move was emulated by other CCPs, and though it was clearly not without its disadvantages, it did produce, in the case of CCP1, extremely useful code for the theoretical chemistry community. The organisation of the CCPs was also changed somewhat, and the steering panel became formally constituted as consisting of the Chairs of the various CCPs, with Phil Burke presiding. From then on, the history of the CCPs can be followed not only in their quarterly newsletters but also by consulting the Appendix entitled Theory, Computational Science and Computing of the Daresbury annual reports. It is similarly possible to follow the development of computational chemistry work performed on the machines that were to come to RAL by consulting the RAL annual report and the special reports that are, from time to time, issued by the lab (for example, Ref. 58).

The mention made above of PCs should perhaps be amplified a little. These began to make an impact on computational chemistry in the UK from about 1980 onward. Initially they were commonly used as intelligent terminals and also as teaching aids. But by the mid-1980s, experiments with the use of networked PCs as parallel processors were stimulating interest in them as useful aids to serious computation. Their impact, however, did not occur until the late 1980s and so is beyond the period considered here.

At about the same time that the CRAY was moved to ULCC, a CYBER 205 was installed at the Manchester Regional Centre (UMRCC), and so, formally at least, by 1983 very adequate super-computing resources were available to the quantum chemical community on a peer-review basis, analogous to the previous provision via SERC central facilities. There was some delay in the CYBER 205 becoming suitably operational, and, in practice, it did not find the same favour with the computational chemistry community in the UK as did the CRAY.
This was probably due to the fact that the 205 was available with a maximum of only 256 Kwords of fast store, and it had soon been found that the bigger the fast store, the better computational chemistry programs ran. Furthermore, the 205 could not process vector operations whose stride (increment) was greater than unity, and so the existing CRAY codes did not run efficiently on it; a small illustration of the point is given at the end of this section. Of course the machine did have some advantages in its long pipeline, but it took time to recognise these and to code for them effectively. But by 1987 the CRAY at ULCC had been upgraded, and an X-MP/48 had been delivered to RAL, so that the advantages UK computational chemists had enjoyed in the early 1980s were, by then, restored.

During the period up to 1987, UK computational chemistry in general, and the CCPs in particular, developed in an orderly fashion, making full use of what were really good facilities. It would perhaps not be unfair to characterise this as a period of consolidation rather than one of innovative development. The advantages of vectorisation had been pretty fully incorporated into codes for doing what groups had traditionally done. The necessary rethinking that came with new methodology (such as the Car-Parrinello method) and with new machinery (such as the development of massively parallel engines) lay somewhat in the future. The period was thus one of great productivity, gained by applying well-understood techniques, with well-understood and quantifiable limitations, to a large number of interesting chemical problems.

There were, however, some perturbations. As has been remarked earlier, there were obvious commercial aspects to the chemical database developments. Such aspects were now beginning to surface in the work of CCP1; manufacturers of vector processors started to become quite interested in the highly efficient vector codes for doing electronic structure calculations that had been developed in the project, as aids to marketing their machines. Of course, during the 1960s and 1970s IBM had used IBMOL as a marketing feature for its machines, but IBMOL was a project grown entirely by IBM, and so they could do what they wished with it. The CCP1 software had been grown at public expense and by various hands, and thus its position as property, to be disposed of or traded, was deeply ambiguous. The problem seems not to have arisen in the other CCPs, for reasons that are not too clear. It is, however, a problem that is likely to continue to arise. The obvious solution to it is to expropriate the codes and market them, using any profits to further the work. But this would undoubtedly cause tensions in the groups involved and might well call into question the basic idea of a publicly funded collaborative computational enterprise.
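Here, as promised, is a small illustration of the stride point, sketched in modern NumPy rather than in the CYBER 205’s own instruction set (which of course predates any such library). A loop that touches every k-th element of an array is a stride-k vector; on a pipeline that accepted only stride-1 (unit-increment) vectors, such a loop had either to run in scalar mode or to pay for an explicit gather into a contiguous buffer first.

```python
# Unit-stride versus strided vector access, sketched with NumPy. On a
# machine that vectorised only stride-1 operations, the gather below was
# the price of running a stride-k loop in vector mode at all.
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
k = 4                                      # a non-unit stride

strided = a[::k]                           # stride-k view: 32-byte step
gathered = np.ascontiguousarray(strided)   # explicit copy to stride 1

print("strided view strides :", strided.strides)    # (32,)
print("gathered strides     :", gathered.strides)   # (8,)
print("same values          :", bool(np.array_equal(strided, gathered)))
```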
EPILOGUE

In their survey of the development of computational chemistry in the United States, Bolcer and Hermann remark that there, “computational chemistry has remained and prospered as a result of being a cottage industry” rather than because of the creation of centralised research facilities.
And it is certainly the case in the United Kingdom that very much of the computational chemistry going on at the present time is in the cottage industry mould. It comprises the use of standard programs on powerful workstations to solve, as far as is possible with the codes and on the machinery available, the problem with which a user is faced. The code used is very often a commercial one, available at a special price for users of a particular machine or to those engaged in not-for-profit work. However, it is difficult to believe that this approach will suffice for the largest and most complicated computational chemistry problems that need to be tackled. There will, it seems probable, always be a need to employ the biggest and fastest engines that are available at any time, and this means centralised institutions. This is not only because such engines are likely to be hideously expensive, but also because, if the future is like the past, a collection of talents not usually found in a single group will be needed to get the best out of them. It is unclear whether the UK will remain in contention here, for it is difficult to suppose that there will be a commercial return in the short to medium term on computational chemistry work done on such engines. So, as market-oriented criteria come to predominate in judgments of what is worthwhile science, it is hard to see how such use could be justified. In this area perhaps a European incarnation is the most realistic hope for the future of such computational chemistry in the United Kingdom.
ACKNOWLEDGMENTS

We should like to acknowledge the very kind help that we have had, in the way of conversations, reminiscences, and suggestions, from many people, though naturally we take full responsibility for our mistakes and misunderstandings. Brian Davies and Bob Hopgood at RAL provided a wonderful opportunity to hear Jack Howlett reminisce about the early days by inviting him back for a visit, and it is with grateful pleasure that one of us was able to hear Jack reminisce. Brian Davies also allowed free access to the staff at RAL, who shared their experiences of the early days too. Steve Wilson, now also of RAL, was helpful about the early CRAY days. Roland Rosner was endlessly courteous and informative about the development of networking, as were Nicholas Handy and David Buckingham about computing in Cambridge. Huw Pritchard shared his Manchester memories, as did Geoff Hunter, to our great advantage. We are also grateful to Peter Roberts of the York Computing Service, who told us of his experiences in running a University Computing Service from its inception following the Flowers report to the mid-1980s. At Daresbury, Paul Durham was extremely helpful, as were Martyn Guest, Howard Sherman, and Bob McMeeking, in steering us through the developments at Daresbury. Phil Burke too shared memories of his period at Daresbury. Roy McWeeny, John Murrell, and Sir Ron Mason kindly responded to our queries, as did Lord Flowers who, with exquisite consideration, invited one of us to take tea and to talk with him in the House of Lords.
REFERENCES

1. J. Hendry, Innovating for Failure: Government Policy and the Early British Computer Industry, MIT Press, London, 1990.
2. K. Flamm, Creating the Computer: Government, Industry and High Technology, Brookings Institution, Washington, DC, 1988.
3. S. Lavington, Early British Computers, Manchester University Press, Manchester, 1981.
4. A. Hodges, Alan Turing: The Enigma, Burnett, London, 1983.
5. J. D. Bolcer and R. B. Hermann, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 1-63. The Development of Computational Chemistry in the United States.
6. S. Lavington, History of Manchester Computers, NCC Publications, Manchester, 1975.
7. H. O. Pritchard and F. Sumner, Phil. Mag., 45, 466 (1954). A Property of ‘Repeating’ Secular Determinants.
8. H. O. Pritchard and F. Sumner, Proc. R. Soc. Lond., A226, 128 (1954). The Application of Electronic Digital Computers to Molecular Orbital Problems. I. The Calculation of Bond-Lengths in Hydrocarbons.
9. H. Pritchard and F. Sumner, Proc. R. Soc. Lond., A235, 136 (1956). The Application of Electronic Digital Computers to Molecular Orbital Problems. II. A New Approximation for Hetero-atom Systems.
10. B. Gray, H. Pritchard, and F. Sumner, J. Chem. Soc., 2631 (1956). Hybridization in the Ground State of the Hydrogen Molecule-Ion.
11. H. Pritchard and F. Sumner, J. Phys. Chem., 65, 641 (1961). Complete Set Expansions for Molecular Wave Functions.
12. R. McWeeny and B. Sutcliffe, Proc. R. Soc. Lond., A273, 103 (1963). The Density Matrix in Many-Electron Quantum Mechanics. III. Generalised Product Functions for Beryllium and Four-electron Ions.
13. R. McWeeny and B. Sutcliffe, Mol. Phys., 6, 493 (1963). Spin Polarisation Effects in Paramagnetic Molecules: Calculations on the NH2 Radical.
14. B. Sutcliffe, J. Chem. Phys., 39, 3322 (1963). Hyperfine Electron Spin Resonance Spectrum of the NH2 Free Radical.
15. M. Wilkes, Memoirs of a Computer Pioneer, MIT Press, London, 1985.
16. M. Wilkes, The Radio and Electronic Engineer, 45, 332 (1975). Early Computer Developments at Cambridge: The EDSAC.
17. N. Handy, Int. Rev. Phys. Chem., 7, 351 (1988). Quantum Chemistry in the University of Cambridge.
18. C. Coulson, Biographical Memoirs of Fellows of the Royal Society, 19, 95 (1973). Samuel Francis Boys, 1911-1972.
19. S. Boys and V. Price, Philos. Trans. R. Soc. Lond., A246, 451 (1954). Electronic Wave Functions. XI. A Calculation of Eight Variational Wave Functions for Cl, Cl-, S and S-.
20. S. Boys and R. Sahni, Philos. Trans. R. Soc. Lond., A246, 463 (1954). Electronic Wave Functions. XII. The Evaluation of General Vector-Coupling Coefficients by Automatic Computation.
21. S. Boys, G. Cook, C. Reeves, and I. Shavitt, Nature, 178, 1207 (1956). Automatic Fundamental Calculations of Molecular Structure.
22. R. McWeeny, Acta Cryst., 7, 180 (1954). X-ray Scattering by Aggregates of Bonded Atoms. IV. Applications to the Carbon Atom.
23. I. Shavitt, Israel J. Chem., 33, 357 (1993). The History and Evolution of Gaussian Basis Sets.
24. N. C. Handy, J. A. Pople, and I. Shavitt, J. Phys. Chem., 100, 6007 (1996). Samuel Francis Boys.
25. J. Kendrew, R. Dickerson, B. Strandberg, R. Hart, D. Davies, D. Phillips, and V. Shore, Nature, 185, 422 (1960). Structure of Myoglobin: A Three-Dimensional Fourier Synthesis at 2 Å Resolution.
26. M. Perutz, M. Rossmann, A. Cullis, H. Muirhead, G. Will, and A. North, Nature, 185, 416 (1960). Structure of Haemoglobin: A Three-Dimensional Fourier Synthesis at 5.5 Å Resolution Obtained by X-ray Analysis.
27. Report of the Committee on Higher Education, Cmnd. 2154, HMSO, London, 1963.
28. C. Coulson, Rev. Mod. Phys., 32, 170 (1960). The Present State of Molecular Structure Calculations.
29. M. Barnett, Rev. Mod. Phys., 35, 571 (1963). Mechanized Molecular Calculations: The POLYATOM System.
30. I. Csizmadia, M. Harrison, J. Moskowitz, and B. Sutcliffe, Theor. Chim. Acta, 6, 191 (1966). Non-empirical LCAO-MO-SCF Calculations on Organic Molecules.
31. I. Csizmadia and B. Sutcliffe, Theor. Chim. Acta, 6, 217 (1966). Preliminary Non-empirical Calculations on Formyl Fluoride.
32. MIT Technical Notes, Collaborative Computational Laboratory, 36 (1963). The Polyatom System, Part 1: Description of Basic Subroutines.
33. G. Diercksen and B. Sutcliffe, Theor. Chim. Acta, 34, 105 (1974). Configuration Interaction by the Method of Bonded Functions: Some Preliminary Calculations.
34. A Report of a Joint Working Group on Computers for Research, Cmnd. 2883, HMSO (Her Majesty’s Stationery Office), London, 1963.
35. I. Hillier and V. Saunders, Proc. R. Soc. Lond., A320, 161 (1970). Ab-initio Molecular Orbital Calculations of the Ground and Excited States of the Permanganate and Chromate Ions.
36. R. Fletcher, Mol. Phys., 19, 55 (1970). Optimization of SCF LCAO Wave Functions.
37. R. Kari and B. Sutcliffe, Chem. Phys. Lett., 7, 149 (1970). Direct Minimisation of the Energy Functional in Some Open Shell LCAO Calculations on Atoms.
38. T. Claxton and N. Smith, Theor. Chim. Acta, 22, 399 (1971). Comparison of Minimization Procedures for UHF Wave Functions.
39. J. Gerratt and I. Mills, J. Chem. Phys., 49, 1719 (1968). Force Constants and Dipole-Moment Derivatives of Molecules from Perturbed Hartree-Fock Calculations. I.
40. J. Watson, Mol. Phys., 15, 479 (1968). Simplification of the Molecular Vibration-Rotation Hamiltonian.
41. S. Boys and N. Handy, Proc. R. Soc. Lond., A311, 309 (1969). A First Solution, for LiH, of a Molecular Trans-Correlated Wave Equation by Means of Restricted Numerical Integration.
42. I. McDonald, Chem. Phys. Lett., 3, 241 (1969). Monte Carlo Calculations for One- and Two-Component Fluids in the Isothermal-Isobaric Ensemble.
43. J. Singer and K. Singer, Mol. Phys., 19, 279 (1970). The Thermodynamic Properties of Mixtures of Lennard-Jones (12-6) Liquids.
44. R. D. Levine, Quantum Mechanics of Molecular Rate Processes, Clarendon Press, Oxford, 1969.
45. D. Brailsford and B. Ford, Mol. Phys., 18, 621 (1970). Calculated Ionization Potentials of the Linear Alkanes.
46. D. Cook, P. Hollis, and R. McWeeny, Mol. Phys., 13, 553 (1967). Approximate Ab Initio Calculations on Polyatomic Molecules.
47. D. Davies, Mol. Phys., 13, 465 (1967). All Valency Electron Molecular Orbital Calculations. I. Dipole Moments, Spin Densities and 19F Shielding Constants in Some Fluorobenzenes and Fluoronitrobenzenes.
48. V. R. Saunders and J. Brown, Eds., Quantum Chemistry: The State of the Art, Atlas Computer Laboratory, Science Research Council, Chilton, Oxfordshire, 1975.
49. Crystallographic Databases: Information Content, Software Systems, Scientific Applications, Publication of the Data Commission of the International Union of Crystallography, Bonn/Cambridge/Chester, 1987.
50. P. E. Bryant, in Engineering Computing Newsletter, M. R. Jane, Ed., Rutherford Appleton Laboratory, Chilton OX11 0QX, 1996, Vol. 6, p. 6. The Rise and Fall of SRCnet.
51. V. R. Saunders, Ed., Correlated Wavefunctions, DL/SCI/R10, Daresbury Laboratory, Science Research Council, Warrington WA4 4AD, 1978.
52. M. F. Guest and S. Wilson, Eds., Electron Correlation, DL/SCI/R14, Daresbury Laboratory, Science Research Council, Warrington, 1979.
53. M. F. Guest and S. Wilson, The Use of Vector Processors in Quantum Chemistry: Experience in the UK, DL/SCI/P290T, Daresbury Laboratory, Science Research Council, Warrington, 1981. (Also published in Supercomputers in Chemistry, ACS Symposium Series 173, P. Lykos and I. Shavitt, Eds., 1981.)
54. V. R. Saunders and J. H. van Lenthe, Mol. Phys., 48, 923 (1983). The Direct CI Method. A Detailed Analysis.
55. See, for example, C. W. Bauschlicher, Jr., S. R. Langhoff, P. R. Taylor, N. C. Handy, and P. J. Knowles, J. Chem. Phys., 85, 1469 (1986). Benchmark Full Configuration Interaction Calculations on HF and NH2.
56. T. Clark, A Handbook of Computational Chemistry, Wiley, New York, 1985.
57. Future Facilities for Advanced Research Computing, published on behalf of the ABRC, UGC, and Computer Board by SERC, Swindon, 1985.
58. C. R. A. Catlow, Ed., High Performance Computing at the Atlas Centre, Rutherford Appleton Laboratory, Chilton, 1994.
Subject Index Computer programs are denoted in boldface; databases and journals are in italics. Absolute rate theory, 115 Accumulation point, 239 Actions, 130 Activation energy, 115 Activator species, 206 Adams-Bashforth algorithm, 205 Advisory Board for the Research Councils (ABRC), 309 Alkanes, 291 Alleles, 28 Allinger, N. L., viii Alpha-Branch and Bound method, 61 Alternate selection strategies, 24 Alternative crossover schemes, 25 AMBER, 45 American Theory Conference, 303 Amino acid sequence, 45 Annealed dynamics, 63 Apamin, 44 Aperiodic motion, 238, 262 Area-preserving mapping, 141 Arnold diffusion, 167 Arnol’d tongues, 251, 252 Array processors, 310 Arrhenius equation, 115 Artificial selection, 9 Asymmetric crossover, 5 4 Asymptotic state, 232, 236 Atlas, 277, 287, 289, 290,295, 310 Atlas Computer Laboratory, 285, 286, 292, 295 ATMOL, 285,289,296,297,303, 306 Atom layer table, 82 Atomic Energy Research Establishment, 286 Atomic Weapons Research Establishment, 304 Attractive fixed point, 145 Attractive manifold, 145 Attractive periodic orbit, 145 Attractors, 199, 236 Augmented phase space, 204 AUTO, 205 Autocatalysis, 183, 206, 209, 216, 222, 223, 225
Autocode, 276 Avian pancreatic polypeptide inhibitor (APPI), 44 Backbone conformation, 40 Backward Euler method, 200 Ball-and-stick models, 281 Barnett-Coulson expansion, 283 Barrier recrossing, 102 Bartlett, R., 303 Basin of attraction, 236 Basis functions, 52 Belousov-Zhabotinskii (BZ) reaction, 228, 259 Benzene, 37 Biased D-optimal design, 85 Biased designs, 91 Biased diversity, 83 Bifurcation diagram, 193 Bifurcation point, 193, 239 Bifurcations, 133, 185, 190 Bimolecular exchange, 119, 120 Bimolecular reactions, 152, 156 Binary chromosomes, 6, 8, 57 Bioactive conformation, 50 Bioinformatics, 310 Biology, 181 Biosynthesis of proteins, 179 Bistability, 182, 183, 186, 188 Bit strings, 80 Blending, 54 BLoop, 45 Born-Oppenheimer approximation, 102, 108, 128 Bottleneck separatrix, 163 Boulder Conference, 282 Boys, S. F., 278, 292 Breeding, 4 Broken torus, 253 Brownian dynamics, 63 Brusselator, 195, 198, 211 Buckingham, A. D., 290 Building blocks, 17, 25, 30, 38 Butterfly effect, 118
CADPAC, 311 Cambridge, University of, 277 Cambridge Structural Database, 95, 298 Canonically conjugate variables, 105 Cantorus, 149, 168 Carbon clusters, 38 Carcinogenic compounds, 307 Cartesian random search, 61 CAVEAT, 87 CDC computers, 297, 306 Cellular automata, 230 Center Manifold theorem, 133, 150, 163 Central Computing Committee (CCC), 309 Centre Européen de Calcul Atomique et Moléculaire (CECAM), 297, 304 CERN, 292, 297 Chaos, 101, 109, 115, 117, 179, 183, 214, 226, 235 Chaotic behavior, 237, 263 Chaotic orbits, 132 Chaotic phase space, 159 Chaotic state, 258 Chaotic systems, 231 Chaotic trajectories, 117, 137, 138, 149, 167, 239 CHARMM, 50 Chemical Database Service, 311 Chemical databases, 292, 298 Chemical distance, 50 Chemical diversity, 75, 95 Chemical functionality descriptors, 79, 81, 82, 87, 95 Chemical graphs, 51 Chemical kinetics, 115, 190 Chemical oscillators, 259 Chemical waves, 188, 215 Chemistry, 181 Chemometrics, 37, 79, 282 Chlorite-iodide-malonic acid (CIMA) reaction, 206 Chromosomes, 2, 6, 10, 17, 18, 25, 36, 41, 47 Circle maps, 247, 248, 250, 253, 255 CLARA clustering method, 57 Classical dynamics, 128 Classical Hamiltonian function, 104 Classical mechanics, 103, 117 Classical resonance effects, 120 Clementi, E., 284 CLICHE, 87 CLOGP, 79 Closed system, 182, 206 Clustering, 56, 80, 93
Clusters, 37, 38 Collaborative Computational Projects (CCP), 294, 296, 301, 303 Collision-free limit, 103 COLOSSUS, 273 Combinatorial chemistry, 75 Combinatorial explosion problem, 61 Combinatorial library design, 89 Commensurate relationship, 245 Committee for Scientific Policy, 288 Comparative molecular field analysis (CoMFA), 88 Compiler, 276 Computational chemistry, v, 37, 178, 271, 291 Computational crystallography, 289, 292 Computational quantum chemistry, 289, 292 Computational quantum pharmacology, 307 Computational Science Initiative, 310 Computer-aided drug design, 75 Computer-aided instrumentation, 281 Computer-aided synthesis, 281 Computer Board, 272, 288, 300 Computer graphics, v Computing Review Working Party, 309 Concentration gradients, 205 Configuration interaction (CI), 283, 296, 304, 308 Configuration space, 103 Conformational entropy, 41 Conformational isomerization, 120 Conformational searches, 7, 23, 32, 37, 38, 40, 46, 48 Conjugate gradient, 39 Connectivity indices, 79 Conservation condition, 232 Conservative systems, 129 Constraints, 20 Constructionist approach, 180 Continuation method, 203 Continuous-flow stirred tank reactor (CSTR), 182 Continuum states in atoms and molecules, 301 Convergence, 19 Convergence of journals, vi, viii Corporate archives, 93 Correlation dimension, 260, 264 Correlation matrix, 83 Correspondence principle, 109 Cottage industry, 312 Coulson, C., 282, 290, 292, 294 Counts, R. W., viii
Coupled Hamiltonian systems, 132 Coupled lattice methods, 230 Coupled map lattices, 231 Coupled ordinary differential equations, 191 Coupled ordinary differential equations lattices, 231 Coupled pair many electron theories (CPMET), 298, 303 Crambin, 45 CRAY computers, 304, 308, 309, 310, 312 CRAY Research Inc., 305 Cross-validated multiple regression, 88 Crossover, 5, 6, 10, 17, 19, 36, 43, 47 Crossover operator, 18, 26, 29, 45, 53 Crossover problems, 21 Crossover rate, 15, 23 Crystal Structure Search and Retrieval (CSSR), 298 CSEARCH, 39, 49, 61, 62 Cubic autocatalysis, 211, 223, 226 Cut-and-splice operator, 31 CYBER 205 computer, 311 Cylindrical manifolds, 120, 153 D-optimal design, 83 Daresbury Laboratory, 292, 301, 302, 309 Darwinian evolution, 4, 27 Database management, 311 Database similarity searching, 80 De Leon-Berne Hamiltonian, 138, 139, 154, 157 De novo design, 63, 89 Defining length, 19 Delay line, 274 Deletion, 54 Density functional methods, 303 Design matrix, 83 Detailed balance condition, 113 DEUCE computer, 274 Devil’s staircase, 250, 251, 257 Dewar, M. J. S., 285 Diatomic molecule, 106 Differential analyzer, 275 Diffusion, 205 Diffusion-induced instability, 206, 208, 212 Digital computers, 273, 279, 304 Dimensionless concentration, 196, 229 Dimensionless variables, 196, 218 Diploid chromosomes, 28 Dirac, P. A. M., 278 Direct CI methods, 298 Directed tweak, 49, 62 Discontinuous transition, 186
Display screen, 275 Dissimilarity, 78 Dissipative systems, 179, 182, 188, 231 Distance constraints, 48, 49 Distance geometry, 49, 62, 63 Distance matrix, 42 Distributed array processor, 305 Diversity score, 90, 97 Diversity sites, 75 Diversity space, 94 DNA, 35, 37, 48, 55 DOCK, 46 Docking, 46, 63 DOP model, 253, 256, 258, 264 Drug design, vii, 75, 307 Dynamic programming, 56 Dynamical chaos, 118, 119, 129 Dynamical systems, 179 Eckart Hamiltonian, 290 Ecology, 181 EDSAC computer, 274, 278, 280, 282, 290 EDVAC computer, 278 Ehrenfest theorem, 109 Electrochemical systems, 259 Electron correlation, 294, 306 Electronic Journal of Theoretical Chemistry Electronic mail (e-mail), 300 Electronic mail groups, 65 Electronic wavefunctions, 102 Electrotopological state index, 81 Elementary particles, 178 Elitism, 11 Elliott Committee, 309 Elliott 400 series computers, 274 Elliptic fixed point, 136, 137, 139, 144 Energy transfer, 103, 126 Engineering Computer Newsletter, 300 Enzyme-catalyzed reaction, 179 Enzyme kinetics, 191 Epitope effect, 96 Ergodic motion, 131, 139 Evolutionary programming, 47 Evolutionary strategies, 4, 5, 34 Excitable media, 230 Experimental design, 83, 92 Experimental packet-switched system, 300 Explicit method, 201 Extended Mercury Autocode, 276 Factorial design, 77 FAP language, 287 Far from equilibrium, 182
Farey sequence, 247 Features, 52 Feedback, 182, 183, 201, 205, 228 Ferranti computers, 274, 287 Fertilization operator, 29 Fingerprint routines, 80 First-order rate constant, 113 Fisher-Kolmogorov equation, 219 Fitness, 2, 22, 24 Fitness function, 8, 18, 20, 24, 32, 43, 50, 54, 55, 58, 89, 90 Fitness landscape, 2, 13, 48, 58 Fixed point of order, 140 Fixed point of reflection, 145 Fletcher-Reeves method, 290 Floating point operations, 276 Floating Point System 264 (FPS) computer, 310 Floquet exponents, 237, 262 Flowers Report, 285, 288 Flux of trajectories, 156 Focus, 198, 221, 245 Focusing operation, 41 Foliated phase space, 106, 124 Foliated tori, 127, 131, 152 FORTRAN, 287, 289, 306 Forward Euler method, 200 Four-letter alphabet, 35 “Four wise men,” 293, 294, 300 Fractal dimension, 260, 264 Fractal object, 236 Fractal torus, 253, 257 Fractional dimension, 236 Frequency factor, 115 Frustrated response, 251 Function evaluations, 21 Fundamental laws, 179 Fundamental particles, 177 Fuzzy logic, 81
GAMESS, 285, 308 Gametes, 29, 36 Gametogenesis, 36 Gaussian, 285, 308 Gaussian basis, 279 Gaussian integral evaluations, 289 Gaussian weighted mutations, 35 Gear algorithm, 201, 202 Generational replacement, 25 Generations, 22 Genes, 179 Genesis, 39 Genetic algorithm (GA), 1, 4, 49, 63, 88, 89
Genetic algorithm codes, 8, 65 Genetic function approximation (GFA), 52 Genome maps, 55 Genotype, 6, 8, 35 Gerratt, J., 294, 303, 307 Global minimum, 1, 2, 15, 20, 42, 44 Global optimization, 1, 21, 58 Global stability, 190 Globally chaotic dynamics, 139 Golden torus, 149 Gordon Research Conference on Computational Chemistry, vii Government funding, 271 Granularity, 24 Grassberger-Procaccia algorithm, 261 Gray coding, 7, 11, 24 Greedy oversampling, 27 Green’s function methods, 298 Green’s theorem, 166 Grid search methods, 61, 62 Growing, 46 Guest, M. F., 292, 297, 308 Hamiltonian operator, 105, 106 Hamiltonian systems, 128, 142, 231 Hamilton’s equations of motion, 104, 129, 134, 140, 166 Hamming distance, 28 Handy, N. C., 298 Hansch-type analysis, 88 Haploid chromosomes, 28 Harmonic oscillator, 106, 107, 130 Hartree, D. R., 275, 278 Hartree-Fock equations, 275 Hausdorff dimension, 260 HCN, 120, 157 Heaviside step function, 261 HeI2 cluster, 156 Hemoglobin, 280 Hénon algorithm, 134, 135 Hénon-Heiles Hamiltonian, 132 Heteroclinic fixed point, 146, 147, 148 Heteroclinic orbits, 149 Hierarchical genetic algorithm, 48 Hierarchical potential, 45 High-throughput screening, 75 HINT, 79 HNSi, 156 Hoffmann, R., viii Home pages, xi, 65 Homoclinic fixed point, 146, 147 Homoclinic manifold, 162 Homoclinic orbits, 149
Homogeneous systems, 182, 200, 214 HookSpace index, 95 Hop operator, 54 Hopf bifurcation, 195, 198, 212, 234, 244, 245 Hopping, 54 Horse myoglobin, 280 Houk, K. N., viii Hückel calculation, 276 Hyperbolic fixed points, 137, 139, 145 Hyperbolic manifold, 160 Hypercylinders, 162 Hyperdimensional Poincaré map, 164 Hyperstructures, 51 Hypersurface, 2 Hypervolume preservation, 164, 167
IBM, 292, 312 IBM computers, 277, 282, 283, 286, 289, 292, 295, 296, 301, 305, 307, 308, 309 IBMOL, 284, 289, 312 ICL computers, 296, 305 Ideal gas law, 180 Implicit method, 201 Information dimension, 260 Information matrix, 83 Inhibitor species, 206 Initial conditions, 110, 236 Initial value problems, 200 Insertion, 54 Internal bottlenecks, 163 Internal coordinates, 104 International Computers and Tabulators, 287 International Computers Limited, 287 International Journal of Quantum Chemistry, vi
Intramolecular bottlenecks, 168 Intramolecular energy transfers, 120 Introns, 35 Invariant tori, 137, 167 Inversion, 27 Invertible map, 253 Iodate-arsenite reaction, 185, 189 Irrational approach, 75, 97 Isolas, 187 Isomerization, 159 Isomorphous replacement method, 48 Isosterism, 81 Iterative resynthesis, 96 Jacobian matrix, 194, 197, 199, 203, 207, 221, 230 Joint Academic Network, 300
Joint Network Team (JNT), 300 Journal of Chemical Information and Computer Sciences, vii Journal of Computational Chemistry, vii Journal of Molecular Graphics, vii Journal of Molecular Modeling, v
KAM theorem, 119, 130, 131, 150 KAM tori, 131, 132, 138, 153, 158, 168 Kennard, O., 298 Keyboard, 275 Kinetic energy, 105, 122 Kinetic rate constants, 57 Kleier, D. A., viii Knotted conformations, 38 Kollman, P. A., viii
Lack of Fit (LOF), 52 Lagrangian, 105 Lamarckian genetic algorithm, 32, 38 Landau theory, 245 Laplacian operator, 227 Latent property space, 78, 94 Lateral instabilities, 224 LCAO MO SCF, 282, 283, 289, 308 Lennard-Jones, J. E., 277 Libraries of compounds, 75, 76, 95 Life, 180 Ligand flexibility, 47 Ligand-protein docking, 62 Ligand-protein systems, 46 LiH, 290 Limit cycle, 198, 200, 237, 262 Limit cycle attractor, 199, 231, 233, 236, 244 Limit sets, 132 Linear stability analysis, 140, 143, 154, 155, 191, 192, 207 Linearized reactive island theory, 120, 158 Liouville’s theorem, 164, 166 Lipophilicity, 79 Lipscomb, W. N., viii Local filter, 48 Local gradient, 39 Local minima, 1, 14, 39, 43 Local minimization, 16 Local search, 11, 22, 24 Local stability, 190, 191 Local structure, 44 Log Ko/w, 82, 87 Logistic map, 238, 244 LOGKOW, 79 Longuet-Higgins, C., 283, 290 Loop libraries, 45
Loop regions, 43, 45 Lyapunov dimension, 260, 264 Lyapunov exponent, 236, 253, 261, 262 MacroModel, 39 Magnetic drum, 274, 276 Mark computers, 274, 275, 276 Many-center integrals, 283 Map, 244 Mass screening, 75 Massachusetts Institute of Technology, 282 Master-slave arrangement, 33 Mating pool, 9 Maximum dissimilarity, 93 McWeeny, R., 294 Meeting Houses, 293, 301 Meiosis, 36 Melittin, 44 Mercury, 276, 286 Message passing, 33 Messy chromosome, 30 Messy genetic algorithm, 18, 30, 46, 65 Meteorology, 181, 235 Method of time delays, 259 Metropolis Monte Carlo, 38, 43, 59 Microcanonical ensemble, 103, 110 MICROMOL, 311 Microscopic reaction dynamics, 116 MINDO, 308 Minimum chemical distance, 51 Mitosis, 36 Mixed-mode oscillations, 254, 259 Mixed-mode route to chaos, 252 Mixed-mode states, 247, 257 MNDO, 308 Möbius strip, 145, 163 Mode-mode coupling, 126, 150 Mode-mode energy transfer, 130, 132 Mode-mode resonance, 138 MOLCONN-X, 79 Molecular connectivity indices, 87 Molecular design, viii, 53 Molecular docking, 46 Molecular dynamics (MD), 63, 102, 103, 108, 109, 231, 290, 301 Molecular graphics, viii Molecular mechanics, viii Molecular modeling, v, 37 Molecular motions, 115 Molecular phase space, 101 Molecular properties, 78 Molecular quantum chemistry, 276
Molecular similarity, 49, 78 Molecular simulations, viii MOLECULE, 296 Møller-Plesset (MP) theory, 303 Momentum, 104 Monodromy matrix, 143, 144 Monte Carlo, 43, 44, 63, 301 MOPAC, 46, 308 Morse potential, 123 Moskowitz, J., 284 Mulliken, R. S., 283 Multi-niche crowding, 56 Multidimensional scaling (MDS), 78 Multiple sequence alignments, 56 Multiple steady states, 182 Multiple variable systems, 193 Multistability, 182 MUNICH, 285, 296 Murrell, J. N., 294 Mushrooms, 187 Mutation, 4, 6, 10, 19, 34, 47, 54 Mutation operator, 10, 31, 44, 53, 54 Mutation rate, 10, 15, 23, 35 Myoglobin, 45 N-map, 150 Naphthalene, 37 NAS 7000 computer, 309 National Institute for Research in Nuclear Science (NIRNS), 286 National Institutes of Health (NIH), 298 National Research and Development Corporation (NRDC), 272 National Research Council of Canada (NRCC), 285 Native structure, 44 Natural evolution, 3 Natural selection, 3, 9 Nelder-Mead simplex method, 38, 60 Nernst, W., 217 Networking, 299 Neural Net (NN), 55 Neural Network/Evolutionary Algorithm, 55 Newton-Raphson method, 201, 229 Newton’s equations of motion, 103, 104, 108 Newton’s method, 203 Next-return map, 233 Niche, 41, 55 Niching, 27 Nicotinamide adenine dinucleotide (NADH), 252 NMR, 291 NO2, 303 Noisy functions, 32
Nonchaotic systems, 236 Nonequilibrium thermodynamics, 179 Nonhomogeneous systems, 205 Nonlinear chemical kinetics models, 201 Nonlinear coupled system, 239 Nonlinear dissipative systems, 234 Nonlinear dynamical systems, 101, 117, 235 Nonlinear dynamics, 117, 119, 128, 177 Nonphysical solutions, 20 Nonstatistical effects, 101, 102 Normal forms, 190 Normally invariant hyperbolic manifold, 160 Norwegian Defence Research Establishment, 276 Nuclear coordinates, 103 Nuclear momenta, 103 Nuclear Overhauser effect (NOE), 49 Nuclear Structure Facility (NSF), 301 Numerical Algorithm Group (NAG), 291 Numerical Recipes, 199 OCS, 120, 149 Octanol/water partition coefficient (log Ko/w), 79 Off-line performance, 9 Offspring, 4 Olsen model, 265 One-point crossover, 26 Open chemical reactor, 180, 183 Open systems, 182, 206 Optimal control problem, 42 Optimality, 84 Optimization, 1, 37, 290 Order in chaos, 140 Order in schema, 19 Ordinary differential equation (ODE), 104, 191, 199 Oregonator, 228 Organization of matter, 81 Orion computer, 292 Oscillations, 195 Oscillatory systems, 206 Outcrossing, 37 Packet Assembler-Disassemblers (PADs), 300 Paging, 274 Paper tape, 274 Parallel computing, 33, 311 Parallel direct search, 39, 61 Parallel optimization, 2 Parameter values, 22 Partial differential equations (PDE), 226, 227 Partially matched crossover, 26
Particle-in-a-box model, 180 Pattern recognition, 311 Pattern search, 62 Penalty functions, 20, 21 Peptidases, 55 Peptoid libraries, 91, 95 Period doubling, 226, 238, 258 Period-four state, 239, 243 Period-one state, 239, 243 Period-two state, 239, 243 Periodic behavior, 238 Periodic orbit, 131, 132, 140 Periodic orbit dividing surface (PODS), 128, 150 Periodic oscillations, 183 Peroxidase-oxidase reaction, 249, 252 Pharmacophore, 49, 76, 83, 85, 94 Phase-locked behavior, 245, 247 Phase space, 102, 199, 231, 236 Phase space averages, 105 Phase space donut, 126 Phase space portraits, 232, 259 Phase space structures, 101 Phenotype, 6, 8, 35 Phospholene, 120, 156 PIG, 276 Pitchfork bifurcation, 193 Planar wave, 225 Plug boards, 274 Poincaré-Birkhoff theorem, 131 Poincaré integral invariants, 164, 165, 166 Poincaré maps, 119, 133, 134, 136, 137, 138, 141, 142, 147, 149, 153, 155, 164, 231, 233, 243 Poincaré sections, 231, 234, 242, 254 Poincaré surface of section, 232 POLYATOM, 284, 296 Polymer folding, 42 Polymers, 53 PomonaSS, 79 Pople, J. A., 285 Population convergence, 41 Population dynamics, 189, 190 Population size, 12, 22 Populations, 2, 6, 8 Positional scanning, 96 Potential energy, 104, 122 Potential energy functions, 58 Potential energy surface, 155 Predator-prey systems, 189 Prediction spectrum, 41 Predictor-corrector method, 230 Premature convergence, 20
Primary-1 back-reactors, 152, 154 Principal components analysis (PCA), 79 Programs, 65, 205 PROLOG P, 79 Propagating reaction-diffusion fronts, 188, 215, 216 Properties, 79 Property space, 76, 77, 78, 88 Protease inhibitors, 47 Protein conformation, 44 Protein crystallography, 48, 280, 281, 301, 311 Protein engineering, 43, 55 Protein flexibility, 47, 62 Protein NMR data analysis, 48 Protein sequences, 45, 55 Proteins, 7, 40 Pseudo-receptor, 50 Pseudopotential methods, 303 Punched cards, 274, 300 Punched tape, 300 QCPE, 289 Quadratic autocatalysis fronts, 217, 222 Quadratic map, 238 Quantitative Structure-Activity Relationships, vii
Quantitative structure-activity relationships (QSAR), v, 6, 12, 52, 84, 88, 89 Quantum chaos, 168 Quantum chemistry, 277, 285, 307 Quantum Chemistry Program Exchange (QCPE), 291 Quantum dynamics, 109, 168 Quantum mechanics, 105, 177 Quarks, 178 Quarterly News Letters, 311 Quasi-steady-state assumption, 191 Quasiclassical simulations, 108 Quasiperiodic behavior, 119, 245 Quasiperiodic orbit, 160 Quasiperiodic states, 251 Quasiperiodic trapping, 102 Quasiperiodicity, 119, 168 Radius of gyration, 44 Ramachandran plots, 7 Random search, 38, 49, 59, 62 Rank-based selection, 24 Rank fitness, 24 Rate constant, 103, 112, 115, 185 Rational drug design, ix, 75 Reaction coordinate, 109, 156
Reaction-diffusion fronts, 215, 219 Reaction-diffusion models, 181, 214 Reaction-diffusion equations, 218, 223, 226 Reaction dynamics, 101, 281 Reaction rate, 103, 109, 156 Reactive islands, 120, 150, 152, 156, 157 Reactive motion, 123 Real-valued chromosomes, 31 Receptor interaction descriptors, 79, 81, 82 Reciprocal residence time, 186, 187, 188 Recrossing problem, 116 Reductionist philosophy, 177, 178, 179, 180 Redundant properties, 79 Regional Computer Centres, 297 Regression hypothesis, 113 Regulator gene, 30 Reordering, 27 Replacement, 10, 25 Repulsive manifold, 145 Repulsive periodic orbit, 145 Resonance zone, 136, 137 Resonant dynamics, 130 Rice-Ramsperger-Kassel-Marcus (RRKM) theory, 102, 114 RNA, 48 Robb, M. A., 294 Rössler attractor, 238, 240, 244 Rotamer library, 40, 44, 48 Rotation number, 255 Roulette wheel selection, 9, 25, 32 Routes to chaos, 238 RTN scenario, 247 Runge-Kutta methods, 200, 201, 202, 240 Rutherford Appleton Laboratory (RAL), 292, 296 Rutherford High Energy Laboratory (RHEL), 286, 292, 295 Saddle point, 184 SAS, 80 Scaffold, 75, 77, 87 Scalar processors, 306 Scaling, 32 Scattering theory, 290 Schema, 16 Schema theorem, 18, 19 Schnakenberg model, 211 Science Board, 293, 298, 306 Science Board Computing Committee (SBCC), 304, 310 Science and Engineering Research Council (SERC), 298, 309, 310
Subject Index 333 Science Research Council (SRC), 285, 298, 309 Scoring functions, 89 Scroll waves, 230 Search keys, 80 Second-generation libraries, 88, 89 Secondary structure, 43 Segmented genetic algorithm, 44 Selection, 4, 9, 34 Self-organizing systems, 179 Self-similar features, 250 Separatrix, 124, 128, 136, 137, 138, 140, 146, 147, 149, 155, 162 Sequence alignment, 50 Shape descriptors, 79 Sharing operator, 48 Side chain conformations, 40, 44 SIGMA (System for Integrated Genome Map Assembly), 55 Similarity, 77 Similarity matrix, 78 Simple genetic algorithm (SGA), 6, 11, 49 Simplex, 60, 62 Simulated annealing, 20, 38, 43, 49, 59, 62 Single-point crossover, 10, 25 Slater, J. C., 283 Slater orbitals, 283 SMILES, 54 Sparse matrix driven method, 41 Spatiotemporal patterns, 180, 206, 214 Spectral curve fitting, 5 7 Spectral simulations, 281, 291 Spectroscopy, 168 Spin-glass Hamiltonian, 48 Spiral waves, 230 Spiraling trajectories, 198 Split/mix solid phase synthesis, 96 Spontaneous patterns, 205 Spruce budworm, 189 Stability analysis, 190, 234, 237, 244 Stability manifolds, 137 Stable Manifold theorem, 137, 150, 163 Stable manifolds, 132, 237 Stable nodes, 184 Standard homoclinidhetetoclinic tangle, 148, 150 State transition matrix, 42 Stationary fitness functions, 29 Stationary state, 182 Statistical mechanics, 105, 180, 281 Steady state, 184, 186, 187, 188, 207, 209 Steady state attractor, 199, 231, 233, 236 Steady state genetic algorithm, 5 6
Steady state replacement, 25 Step function selection, 25 Steric fitting, 63 Stiff equations, 199, 201, 239 Stiffly stable algorithm, 202 Stilbene, 120, 157 Stochastic web, 167 Stored program machine, 273 Strange attractors, 231, 236 Structure-activity relationship (SAR), 76, 83, 92 Structure-based drug design, ix, 89 Structure-based library design, 89 Substituent properties, 79, 88 Substituents, 77, 84 Substructural fragments, 93, 95 Substructure searching, 51 Superdelocalizability, 81 Surface of section, 119, 133, 135, 137, 150, 151, 164, 165, 232 Sutcliffe, B. T., 294, 296, 303 SYBYL, 39, 61 Symplectic geometry, 164 Synchrotron Radiation Source, 301 Synthesis, 96 Synthetic chemistry, vi Systematic search, 61, 63 Tanimoto coefficients, 78, 79, 80, 81, 93 Target-focused libraries, 76 Taylor-Couette flow, 246 Taylor series expansion, 192, 194, 201, 207 Telecommunications, 299 Template diversity, 87 THEOCHEM, vi Theoretica Chimica Acta, vi Theoretical chemistry, 177 Theoretical physics, 180 Thermionic valves, 273 Thermodynamic equilibrium, 182, 187 Thermodynamic functions, 105 Thermolysin inhibitors, 46 Three-dimensional torus, 245 Three-point pharmacophores, 95 Time-reversal symmetry, 142 TITAN computer, 290 Topological equivalence of portraits, 259 Topological indices, 79 Topological shape, 82 Topology, 126, 131 Torsional isomerization, 120 Torus, 126 Torus attractor, 244, 245, 246, 253
Tournament selection, 25, 31, 47 Trajectories, 103, 104, 236 Transition points, 186 Transition state, 102, 109, 114, 116, 122, 151, 155, 156, 161, 163 Transistor, 280 Transmission coefficient, 158 Trapezoidal algorithm, 201 Traveling salesman problem, 26 Tree-pruning algorithms, 61 Trichloroethane, 121 Turbulence, 245 Turbulent fluid flow, 245 Turing, A., 181, 273, 277 Turing bifurcation, 206, 212, 213 Turing patterns, 181, 205 Turnstiles, 147 Two-dimensional torus, 245 Two-point crossover, 25 Two-state isomerization, 127 Unbiased D-optimal library, 90 Uncoupled isomerization dynamics, 121 Uniform crossover, 26 Unimolecular decomposition, 156 Unimolecular exchange, 120 Unimolecular isomerization, 110, 119, 120
United Kingdom, 271 United States, 274, 312 Univac 1103 computer, 282 Universal laws, 179, 181 University Grants Committee (UGC), 272 Unstable focus, 198, 245 Unstable manifolds, 132, 237 Unstable periodic orbits, 243 Usenet group, 65 Vague tori, 120 Valence bond theories, 298, 306 Variance/covariance matrix, 83 Vector processors, 303, 304, 306 Vibrational energy transfer, 120 Video display units, 299 Virtual storage, 274 Whirlwind, 282 Williams tube, 274 Winding number, 248, 255 Wolf algorithm, 263 Working group, 294 World Wide Web (WWW), x Wrinkled torus, 253, 257 X-ray crystallography, 48, 277, 281