Contributors
Karthik Balakrishnan, Decision Analytics Group, Obongo Inc., Redwood City, USA
Dario Floreano, Department of Computer Science, Swiss Federal Institute of Technology, Lausanne, Switzerland
Egbert Boers, Artificial Intelligence Group, National Aerospace Laboratory, Amsterdam, The Netherlands
Terence Fogarty, Department of Computer Studies, Napier University, Edinburgh, UK
Brian Carse, Faculty of Engineering, University of the West of England, Bristol, UK
Frederic Gruau, Laboratoire d'Informatique, Robotique et Microélectronique de Montpellier, Montpellier, France
Frederick Crabbe IV, Computer Science Department, University of California, Los Angeles, USA
Vasant Honavar, Department of Computer Science, Iowa State University, Ames, USA
Melania Degeratu, Program in Applied Mathematical and Computational Sciences, University of Iowa, Iowa City, USA
Yong Liu, Computer Science Division, Electrotechnical Laboratory, Ibaraki, Japan
Michael Dyer, Computer Science Department, University of California, Los Angeles, USA
Bruce MacLennan, Computer Science Department, University of Tennessee, Knoxville, USA
Thilo Mahnig, RWCP Theoretical Foundation GMD Laboratory, Sankt Augustin, Germany
Filippo Menczer, Department of Management Sciences, University of Iowa, Iowa City, USA
F. Moran, Departamento de Bioquímica y Biología Molecular I, Universidad Complutense de Madrid, Madrid, Spain
David Moriarty, Information Sciences Institute, University of Southern California, Marina del Rey, USA
J. J. Merelo, Departamento de Electrónica y Tecnología de los Computadores, Universidad de Granada, Granada, Spain
Heinz Muehlenbein, RWCP Theoretical Foundation GMD Laboratory, Sankt Augustin, Germany
Olivier Michel, LAMI, Swiss Federal Institute of Technology, Lausanne, Switzerland
Stefano Nolfi, Institute of Psychology, C.N.R., Rome, Italy
Risto Miikkulainen, Department of Computer Sciences, The University of Texas at Austin, Austin, USA
Domenico Parisi, Institute of Psychology, C.N.R., Rome, Italy
Francesco Mondada, K-Team SA, Switzerland
Mukesh Patel, Applitech Solution Limited, Ahmedabad, India
A. Prieto, Departamento de Electrónica y Tecnología de los Computadores, Universidad de Granada, Granada, Spain
John Sullivan, Faculty of Engineering, University of the West of England, Bristol, UK
Ida Sprinkhuizen-Kuyper, Computer Science Department, Universiteit Maastricht, Maastricht, The Netherlands
Xin Yao, School of Computer Science, The University of Birmingham, Birmingham, UK
William Street, Department of Management Sciences, University of Iowa, Iowa City, USA
Preface
The emergence of Artificial Intelligence and Cognitive Sciences, concerned with understanding and modeling intelligent systems, both natural and synthetic, is perhaps one of the most significant intellectual developments of the twentieth century. From the earliest writings of India and Greece, understanding minds and brains has been a fundamental problem in philosophy. The invention of the digital computer in the 1940s made this a central concern of computer scientists. Among the first uses of the digital computer, in the 1950s, was the development of computer programs to model perception, reasoning, learning, and evolution. Ensuing developments in the theory of computation, the design and analysis of algorithms, and the synthesis of programmable information processing systems provided a new set of tools with which to study appropriately embodied minds and brains – both natural as well as artificial – through the analysis, design, and evaluation of computers and programs that exhibit aspects of intelligent behavior.

This systems approach to understanding intelligent behavior has been driven largely by the working hypothesis of Artificial Intelligence, which simply states that thought processes can be modeled by computation or information processing. The philosophical roots of this hypothesis can be traced at least as far back as the attempts of Helmholtz, Leibniz, and Boole to explain thought processes in terms of mechanical (or, in modern terms, algorithmic or computational) processes.

The advent of molecular biology, heralded by the discovery of the genetic code, has ushered in a similar information processing approach to the study of living systems. This has resulted in the emergence of computational molecular biology, synthetic biology (or artificial life), and related disciplines, which are driven by the working hypothesis that life processes can be modeled and/or understood by computation or information processing.

Systems whose information processing structures are fully programmed to their last detail, despite being viewed as the ideal in many areas of computer science and engineering, are extremely difficult to design for all but the simplest and most routine of applications. The designers of such systems have to be able to anticipate in advance all possible contingencies such a system could ever encounter, and program appropriate procedures to deal with each of them. Dynamic, open, real-world environments call for adaptive and learning systems that are equipped with processes that allow them to modify their behavior, as needed, by changing their information processing structures. Long-term collective survival in changing environments calls for systems that can evolve. Biological intelligence has emerged over the millennia and depends crucially on evolution and learning – or rather, it is hard to conceive of a God so intelligent and yet so stupid, so mundane at detail yet so brilliant at the fantastic, as to – like a watchmaker – tediously craft each living creature to its last detail.
Cognitive and information structures and processes, embodied in living systems, offer us impressive existence proofs of many highly effective designs for adaptive, flexible, and creative biological intelligent agents. They are also a great source of ideas for designing artificially intelligent agents. Against this background, this book explores some of the central problems in artificial intelligence, cognitive science, and synthetic biology, namely the modeling of information structures and processes that create and adapt intelligent agents through evolution and learning.

Work on this book began in 1995, when two of us (Vasant Honavar and Mukesh Patel) independently started efforts to put together edited volumes on evolutionary approaches to the synthesis of intelligent systems. When we became aware of each other's work, it seemed logical to pool the efforts and produce a single edited volume. With encouragement from Harry Stanton of MIT Press, we invited a team of authors to contribute chapters covering various aspects of evolutionary synthesis of intelligent systems. The original target date for publication of the book was late 1997. However, the sad demise of Harry Stanton interfered with this plan. Fortunately, Doug Sery, who inherited the project in 1998, was very supportive of the project and determined to see it through. Karthik Balakrishnan joined the team in 1998. Honavar and Patel are grateful to Balakrishnan for taking over a large portion of the effort of putting together this volume between 1998 and 2000. During this period, all of the contributed chapters went through several rounds of reviews and revisions. The final product is the result of the collective work of the authors of the various chapters and the three editors.

The chapters included in this volume represent recent advances in evolutionary approaches to the synthesis of intelligent software agents and intelligent artifacts for a variety of applications. The chapters may be broadly divided into four categories. The first set of chapters demonstrates the raw power of evolution in discovering effective solutions to complex tasks such as walking behaviors for 8-legged robots, the evolution of meaningful communication in a population of agents, the resolution of multi-objective tradeoffs inherent in the design of accurate and compact codebook vectors for vector quantization, and the design of effective box-pushing robot behaviors. The second set of chapters explores mechanisms to make evolutionary design scalable, largely through the use of biologically inspired development mechanisms. These include a recipe-like genetic coding based on L-systems (inspired by DNA coding of amino acids) and models of artificial neurogenesis (cell differentiation, division, migration, axonal growth, guidance, target recognition, etc.). The third set of chapters largely deals with equivalents of ontogenetic learning, which encompass the use of powerful local learning algorithms to quickly improve and explore the promising solutions discovered by the underlying evolutionary search. These include gradient-descent based learning of radial-basis function network architectures designed by evolution, different flavors of Hebbian adaptation of evolved neural architectures, and the evolution of learning parameters and rules for second-order neural network architectures. The final set of chapters extends the principle of evolutionary search along novel and useful directions. These include the use of local selection as a powerful yet computationally inexpensive alternative to the more popular global selection schemes (e.g., proportional, tournament, etc.), symbiotic evolution for evolving and exploiting cooperating blocks (or units) of the neural network solution, treating the evolved population as an ensemble of solutions and exploiting the population diversity to produce enhanced and robust solutions, and developing a theoretically sound alternate interpretation of evolutionary search that operates over search distributions rather than recombination/crossover. A summary of the key contributions of each chapter follows.

Evolutionary Synthesis of Complex Agent Programs

In Chapter 1, Balakrishnan and Honavar introduce the powerful agent synthesis paradigms of evolution and learning. Drawing inspiration from biological processes at work in the design of varied, and in many instances unique, adaptations of life forms, the authors present evolutionary synthesis of artificial neural networks as a computationally efficient approach for the exploration of useful, efficient, and perhaps even intelligent behaviors in artificial automata. Most important, the authors present a formal characterization of properties of genetic representations of neural architectures.

In Chapter 2, Gruau and Quatramaran address the challenging problem of evolving walking gaits for an 8-legged robot. This task is particularly difficult because the leg-controller must not only move the individual legs, but also coordinate them appropriately to ensure that the robot is balanced and stable. Using a genetic representation based on cellular encoding, they evolve programs for transforming a graph and decoding it into a neural network.
Perhaps the greatest benefit of this graph-grammar encoding scheme is its ability to evolve reusable modules or subnetworks, an aspect of critical importance in the control of the 8-legged robot. This encoding is used to evolve the architecture of the neurocontroller, along with the weights, the number of input and output units, as well as the type of units (temporal for duration of a movement sequence and spatial for intensity of movement). The authors propose and use an interactive approach, calling for human intervention in identifying and promoting the emergence and adaptation of specific regularities. They achieve this by visually monitoring the evolutionary process and rewarding potentially promising features by increasing their fitness evaluations. The authors claim that this semi-supervised approach works well, requiring far fewer evolutionary generations to evolve acceptable behaviors than pure evolutionary search. They present a number of interesting walking behaviors that evolved, including the wave walk – moving one leg at a time, and the true quadripod gait – moving four legs at a time.

MacLennan presents an extremely interesting application of evolution in Chapter 3. He considers the evolution of meaningful communication in agent populations. In the first set of experiments, he arms the agents with finite state machines (FSMs), which are computationally equivalent to a restricted class of neural networks, capable of reading single symbols and communicating/writing single symbols in return. The agents are rewarded for recognizing and responding with correct symbols. Using a direct encoding scheme to represent the transition tables of the FSMs, he demonstrates interesting results in the evolution of communication in this agent population. He shows that the evolved agents appear to communicate meaningfully using single symbols, and to some extent, pairs of symbols exhibiting rudimentary syntax. Owing to the limitations of FSMs, primarily their inability to generalize, MacLennan and his students also explore the use of simple feed-forward neural network structures for communication. In these experiments, they evolve the network architecture and use back-propagation to train the network. Again they demonstrate the evolution of meaningful communication, although the results are characteristically different from those using FSMs.

In Chapter 4, Merelo et al. adopt an evolutionary approach to address an important consideration in unsupervised classification problems. Unsupervised classifiers or clustering algorithms must not only discover compact, well-defined groups of patterns in the data, but also deliver the correct number of groups/clusters. This is a difficult multi-objective optimization problem with few practical algorithmic solutions. The common approach is to fix the number of desired clusters a priori, as in the popular k-means algorithm. Vector Quantization (VQ) is a classification technique in which a set of labeled vectors, called the codebook, is given, and classification of an input is performed by identifying the codebook vector that matches it best. However, in order to use VQ, one needs the codebook to be specified. Kohonen's Learning Vector Quantization (LVQ) is an approach that can learn codebooks from a set of initial vectors. However, even with this approach, the size of the codebook needs to be set in advance. But how does one know what the size should be? Merelo et al. use an evolutionary approach to design the codebook, optimizing accuracy, distortion, and size. This multi-objective optimization leads to codebook designs that are smaller and more accurate than those obtained with LVQ, and contain fewer weights than back-propagation trained neural networks of similar accuracy. The authors demonstrate the power and utility of this approach in separating a set of non-overlapping Gaussian clusters and in breast cancer diagnosis.

In Chapter 5, Balakrishnan and Honavar address the evolution of effective box-pushing behaviors. They use a multi-level, variable-length genetic representation that possesses many of the properties described in Chapter 1. Using this mechanism, they evolve both feed-forward and recurrent neuro-controllers for a heavily constrained box-pushing task. They analyze the evolved networks to clearly demonstrate the mechanism by which successful box-pushing behaviors emerge, leading to critical insights into the nature of the task in general, and the ability of evolution to deal effectively with the imposed constraints in particular. Through a variety of experiments the authors highlight the role of memory in effective box-pushing behaviors, and show how evolution tailors the designs to match (and surmount) constraints. The authors also discuss the evolution of robot sensory systems, and present empirical results on the coevolution and co-adaptation of robot sensory and neural systems. They show that evolution often discovers sensory system designs that employ minimal numbers of sensors. Further, they also show that evolution can be trusted to discover robust controllers when the task environment is noisy.

Evolution and Development of Agent Programs

Although evolution is a powerful and adaptive search technique, its application is often limited owing to the immense computational effort involved in searching complex solution spaces. Local or environment-guided search can often help alleviate this problem by quickly exploring promising areas identified by the evolutionary mechanism. Many authors have explored such combinations of evolutionary and local search, with much success.
One such local mechanism is inspired by developmental and molecular biology. This deals with the notion of development, which can be crudely defined as adaptation that results from an interaction between genetically inherited information and environmental influences. Development is usually associated with determining the phenotype from the genotype, possibly influenced by environmental factors, as is evidenced in the following three chapters.

In Chapter 6, Boers and Sprinkhuizen-Kuyper present a novel genetic representation for neural network architectures that is inspired by DNA coding of amino acids. This representation or coding mechanism can be thought of as a recipe for creating a neural network rather than its direct blueprint. This powerful and novel coding is a graph grammar based directly on the parallel string rewriting mechanism of L-systems, where each genotype contains a set of production rules, which when decoded produces the corresponding phenotype (or neural network). The primary benefit of this coding mechanism is its scalability in coding potentially large neural network structures. In addition, such grammar-based representations also easily handle and foster the emergence of modular designs. The authors present results on the evolution of modular architectures for the famous what-where task. They also present an enhancement of the development process that constructively adds modules and units to the decoded network, and address its implications for exploiting the Baldwin effect in the Darwinian evolution framework.

Drawing inspiration from developmental and molecular biology, Michel presents a model of artificial neurogenesis in Chapter 7. He uses this procedure to evolve and develop dynamical recurrent neural networks to control mobile robot behaviors such as maze traversal and obstacle avoidance. His model of neurogenesis, based on protein synthesis regulation, includes artificial specifications of processes such as cell differentiation, division, migration, axonal growth, guidance, and target recognition. In addition, the linear threshold function units that are evolved are also adapted using simple Hebbian learning techniques. The preliminary results obtained using this approach offer much promise for the design of large, modular neural networks. The author presents examples of interesting behaviors evolved using this approach for the robot Khepera. Michel also discusses issues involved in the transfer of robot behaviors evolved in simulation to real robots.

In Chapter 8, Parisi and Nolfi present a comprehensive view of the role of development in the evolutionary synthesis of neural networks. In their view, development is any change in the individual that is genetically triggered/influenced. Using this broad definition, they summarize their considerable research in various aspects of development, including self-teaching networks, a model of neurogenesis (axonal growth process), self-directed learning, a model of temporal development with genetic information expressed at different ages of the individual, etc. Self-directed learning is demonstrated using an architecture wherein the neural network learns to predict the sensory consequences of the movements of the agent. The authors show that this ability endows the agents with more effective behaviors. Self-teaching networks contain two subnetworks; one produces the behavioral responses of the agent, while the other produces a teaching or supervisory input used to adapt the first subnetwork. The authors also experiment with individuals that are cognizant of their age, allowing adaptation of behaviors with age. The authors demonstrate and discuss results of these different developmental techniques in the context of agent behaviors such as foraging (food finding), wall/obstacle avoidance, goal approaching, etc. In most cases they use a priori fixed neural network architectures and a simple direct encoding to evolve the weights of the network.

Enhancing Evolutionary Search Through Local Learning

In addition to the use of recipe-like genetic codings that develop into neural architectures under the influence of environmental factors, other, more powerful local search algorithms can also be used in conjunction with evolutionary search. The following chapters present examples of the use of such learning techniques to further shape the solutions discovered by evolution.

In Chapter 9, Carse et al. address the design of radial basis function (RBF) neural networks using a hybrid approach that involves the use of evolution to discover optimal network architectures (number of hidden units, radial basis function centers and widths, etc.) and a gradient-descent based learning algorithm for tuning the connection weights. The authors introduce two interesting notions in the chapter – a novel crossover operator that identifies similar units no matter where they appear on the genotype, and a scaling of training epochs that assigns fewer training cycles to networks in the early stages of evolution, and progressively more as evolution advances. The rationale behind the former idea is to make evolutionary search more efficient by handling the competing conventions/permutations problem (where the exact same phenotype has multiple genetic representations), while the latter is inspired by the fact that networks in early stages of evolution are perhaps not architecturally close to ideal, and hence little effort need be wasted in training them. The idea thus is to discover the optimal structure first and then fine-tune it.
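As a rough sketch of this epoch-scaling idea, the schedule below allots training cycles as a function of the generation number; the linear form and all names here are illustrative assumptions, not the actual scheme used in Chapter 9.

```python
def epochs_for_generation(gen, max_gen, min_epochs=5, max_epochs=100):
    """Allot few training epochs early in evolution and more later.

    Illustrative linear schedule; the chapter's actual schedule may differ.
    """
    fraction = gen / float(max_gen)
    return int(min_epochs + fraction * (max_epochs - min_epochs))

# Early generations get brief training; later generations, whose
# architectures are presumably closer to ideal, get thorough tuning.
print(epochs_for_generation(0, 50))   # 5
print(epochs_for_generation(50, 50))  # 100
```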
The authors use variable-length genotypes, with a direct encoding of the network architecture. They show that this approach produces relatively small yet accurate RBF networks on a number of problems, including the Mackey-Glass chaotic time series prediction problem.

In Chapter 10, Floreano et al. investigate chase-escape behaviors co-evolving in a framework of two miniature mobile robots: a predator with a vision system and a faster prey with only proximity sensors. Both robots are controlled using recurrent neural networks evolved in a competitive coevolutionary setting. Inspired by the thesis that unpredictable and dynamic environments require the evolution of adaptivity, the authors explore mechanisms that produce solutions capable of coping with changing environments. This becomes particularly critical in co-evolving populations owing to the phenomenon known as the Red Queen Effect. In such situations, adaptive traits evolved by one species are often annulled by the evolution of counteractive traits in a competing species, thereby producing dynamic fitness landscapes. In order to evolve agent designs that can cope with such dynamic and unpredictable changes, the authors explore three broad approaches – (1) maintaining diversity in evolving populations, (2) preferentially selecting genotypes whose mutant neighbors are different but equally viable, and (3) incorporating mechanisms of ontogenetic adaptation. The authors use a direct encoding representation to evolve weights within a priori fixed network architectures, and in later experiments, noise levels and parameters for Hebbian learning. They present results of interesting evolved chase-escape behaviors.

Crabbe and Dyer, in Chapter 11, present a flexible neural network architecture that can learn to approach and/or avoid multiple goals in dynamic environments. Artificial agents endowed with this architecture, called MAXSON (max-based second-order network), learn spatial navigation tasks faster than other reinforcement-based learning approaches such as Q-learning. The learning architecture employs two subnetworks – a second-order policy network that generates agent actions in response to sensory inputs, and a first-order value network that produces reinforcement signals to adapt the policy network. Of particular importance are two thresholds that control the response of the agent to sudden sensory changes or fluctuations. The authors demonstrate the emergence of effective behaviors for agents seeking food and drink and avoiding poisons. The final set of experiments demonstrates the ability of this approach to produce agents driven by internal goals, i.e., agents seek food or water only when they are hungry or thirsty, respectively. The authors use an evolutionary approach to determine optimal settings for the thresholds, which strongly influence the learning rules of the agents. They also present results in evolving portions of the learning rules themselves.
Novel Extensions and Interpretations of Evolutionary Search

Although evolutionary algorithms have proven themselves to be efficient and effective heuristic search techniques, their applicability is often limited owing to a few drawbacks, such as the phenomenon of premature convergence, the lack of concrete theoretical underpinnings, etc. This has prompted researchers to extend or interpret the evolutionary search mechanism in novel and useful ways. The final set of chapters in this compilation extends basic evolutionary search along interesting and useful dimensions.

In Chapter 12, Menczer et al. present an interesting variation of evolutionary algorithms, namely, local selection. Their algorithm ELSA (Evolutionary Local Selection Algorithm) creates and maintains diversity in the population by placing geographic constraints on evolutionary search. In their approach, individuals/agents in the population compete for finite environmental resources, with similar agents obtaining less energy than dissimilar ones. Agents accumulating more than a certain fixed threshold of energy reproduce via mutation, while agents with no energy die. This environment-mediated crowding leads to diverse populations containing heterogeneous individuals. Such heterogeneity is a must for certain kinds of cover or Pareto optimization problems. For example, on problems with multi-criterion or expensive fitness functions, ELSA can quickly find a set of solutions using a simplified fitness function. Some other technique can then be used to compare these solutions and pick the right one. Using a direct encoding of feed-forward neural networks, the authors demonstrate the power of ELSA on a number of application domains, including the evolution of sensors and adaptive behaviors, the design of realistic ecological models, web-information browsing agents, and feature selection for classification tasks.
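Because the selection scheme just described is essentially algorithmic, a minimal sketch may help make it concrete. The bit-string agents, the crude crowding measure, and all parameter names below are illustrative assumptions rather than the actual ELSA implementation.

```python
import random

def mutate(genome):
    """Flip one randomly chosen bit of a bit-list genome."""
    child = list(genome)
    i = random.randrange(len(child))
    child[i] = 1 - child[i]
    return child

def local_selection_step(population, bin_energy=2.0, theta=1.0, cost=0.2):
    """One generation of an ELSA-style loop over [genome, energy] pairs.

    Agents whose genomes fall in the same region of the search space
    (here, crudely, those with the same number of ones) share that
    region's finite energy inflow, so crowded regions yield less energy
    per agent. Agents whose energy exceeds theta reproduce via mutation
    and split their energy with the child; agents with no energy die.
    """
    crowding = {}
    for genome, _ in population:
        key = sum(genome)
        crowding[key] = crowding.get(key, 0) + 1

    survivors = []
    for genome, energy in population:
        energy += bin_energy / crowding[sum(genome)] - cost
        if energy >= theta:                    # reproduce via mutation
            energy /= 2.0
            survivors.append([mutate(genome), energy])
        if energy > 0:                         # agents with no energy die
            survivors.append([genome, energy])
    return survivors

# Example: ten random 8-bit agents, one step of local selection.
population = [[[random.randint(0, 1) for _ in range(8)], 0.5]
              for _ in range(10)]
population = local_selection_step(population)
```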
Moriarty and Miikkulainen present the SANE (Symbiotic, Adaptive Neuro-Evolution) approach in Chapter 13. The SANE approach differs from conventional evolution of neural architectures in that each individual in the population is a neuron rather than a neural network. In the fitness evaluation phase, pools of neurons are randomly drawn from the population to construct neural networks, which are then evaluated on the target task. The fitness of the resulting neural network is then distributed among the component neurons. Evolution thus favors neurons that can cooperate with each other. This approach supports heterogeneity by maintaining a diverse set of neurons, since neurons specialize to search different areas of the solution space. This leads to immediate benefits in the prevention of premature convergence. The authors use a direct encoding scheme that consists of label-weight pairs. They use this encoding to evolve the architecture and weights of feed-forward networks. They demonstrate the utility of this approach on two interesting tasks – as a value ordering method in scheduling an auto assembly line, and as a filter for minimax search in an Othello-playing program. The authors find that SANE easily optimizes sophisticated evaluation functions.
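The credit-assignment scheme at the heart of this approach can be sketched in a few lines. The toy "network" below (just the sum of a pool of numeric neurons, scored by closeness to a target) is a hypothetical stand-in for building and evaluating a real network; only the sampling and fitness-distribution structure reflects the description above.

```python
import random

def sane_fitness(neurons, build_and_evaluate, net_size=3, n_trials=200):
    """SANE-style credit assignment to neurons (illustrative).

    Repeatedly draw random pools of neurons from the population, evaluate
    the network each pool forms, and distribute the network's score back
    to the participating neurons; a neuron's fitness is its average score
    over the networks it took part in.
    """
    totals = [0.0] * len(neurons)
    counts = [0] * len(neurons)
    for _ in range(n_trials):
        pool = random.sample(range(len(neurons)), net_size)
        score = build_and_evaluate([neurons[i] for i in pool])
        for i in pool:                  # distribute fitness to components
            totals[i] += score
            counts[i] += 1
    return [totals[i] / counts[i] if counts[i] else 0.0
            for i in range(len(neurons))]

# Toy stand-in for "construct a network from a pool and evaluate it":
# neurons that cooperate (sum close to the target) earn high fitness.
neurons = [random.uniform(-1.0, 1.0) for _ in range(20)]
fitness = sane_fitness(neurons, lambda pool: -abs(sum(pool) - 1.0))
```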
In Chapter 14, Yao and Liu use an Evolutionary Programming (EP) approach to evolve both single neural networks and ensembles. They introduce two key notions in this chapter – a set of mutation operators that can be sequentially applied to produce parsimonious networks, and an ensemble interpretation of evolved populations. Using a direct encoding representation, the authors evolve the architecture and weights of feed-forward networks for applications such as medical diagnosis, credit card assessment, and the Mackey-Glass chaotic time-series analysis. The evolved networks are trained using the back-propagation algorithm. The authors use a Lamarckian learning framework wherein the learned weights and architectures are inherited by offspring. Of particular interest is the authors' treatment of final evolved populations. They regard the final population of neural networks as an ensemble, and combine the outputs from different individuals using mechanisms such as voting and averaging. Since diversity often exists in evolved populations, this ensemble treatment produces results that are often better than the best single network in the population.
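Treating the final population as an ensemble amounts to combining the individual networks' outputs. A minimal sketch follows, assuming each network emits a vector of class scores; generic averaging and majority voting are shown, not necessarily the exact variants used in Chapter 14.

```python
def ensemble_average(outputs):
    """Average the output vectors of all networks for one input pattern."""
    n = len(outputs)
    return [sum(o[k] for o in outputs) / n for k in range(len(outputs[0]))]

def ensemble_vote(outputs):
    """Majority vote over each network's predicted class."""
    votes = {}
    for o in outputs:
        winner = max(range(len(o)), key=lambda k: o[k])
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

# Example: three evolved networks score a two-class pattern.
outputs = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]]
print(ensemble_average(outputs))  # [0.7, 0.3] -> class 0
print(ensemble_vote(outputs))     # 0
```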
Muehlenbein and Mahnig develop a sound theoretical analysis of genetic algorithms in Chapter 15. Using results from classical population genetics and statistics, they show that genetic search can be closely approximated by a new search algorithm called the Univariate Marginal Distribution Algorithm (UMDA), which operates over probability distributions instead of recombinations of strings. To make UMDA tractable, the authors develop another variation of the algorithm that works with factorized probability distributions, called the Factorized Distribution Algorithm (FDA). The appropriate factorizations are determined from the data by mapping probability distributions to Bayesian networks. The authors demonstrate applications of this approach in the synthesis of programs via probabilistic trees and the synthesis of neural trees. Although the synthesis of complex neural networks is currently outside the scope of this work, options exist for resolving this issue. The authors feel this can be done either by extending Bayesian networks to the application domains of neural networks, or by restricting the structure of the neural networks to forms that can be handled by Bayesian networks. This work is particularly interesting owing to its sound theoretical underpinnings.
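The core of UMDA is easy to state in code: estimate a per-position marginal distribution from the selected individuals and sample the next population from the product of those marginals, with no recombination of strings. The sketch below, on bit strings with truncation selection, is an illustrative reading of that idea, not the authors' implementation.

```python
import random

def umda_generation(population, fitness, n_select, n_sample):
    """One UMDA-style generation over bit-string individuals.

    Estimate the marginal probability of a 1 at each position from the
    best n_select individuals, then sample n_sample offspring from the
    product of those univariate marginals; no crossover is performed.
    """
    selected = sorted(population, key=fitness, reverse=True)[:n_select]
    length = len(selected[0])
    marginals = [sum(ind[i] for ind in selected) / float(n_select)
                 for i in range(length)]
    return [[1 if random.random() < p else 0 for p in marginals]
            for _ in range(n_sample)]

# Example: maximizing the number of ones (OneMax).
population = [[random.randint(0, 1) for _ in range(10)] for _ in range(30)]
for _ in range(20):
    population = umda_generation(population, sum, n_select=10, n_sample=30)
print(max(sum(ind) for ind in population))  # typically 10
```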
Acknowledgments

We would like to thank all the authors whose contributions made this book a reality. We are grateful to Dr. Harry Stanton for his vision and enthusiastic encouragement during the early stages. He is sorely missed by those who had the good fortune to have known him. The project owes a great deal to Doug Sery, who stepped in and saw the project through to its completion after Harry's demise. Without his patience, unrelenting support, and understanding, it is unlikely that this volume would have been completed. We would like to acknowledge the able assistance provided by the editorial and production staff of MIT Press at all stages of production of this book.

We are grateful to the United States National Science Foundation, the Department of Defense, the National Institutes of Health, the European Engineering and Physical Sciences Research Council, and other funding agencies for their support of research and education in Cognitive, Biological, and Computer and Information Sciences in general, and the research presented in this book in particular. Honavar is indebted to the Computer Science department at Iowa State University for its support of this project. He is also grateful to the School of Computer Science, the Robotics Institute, the Center for Automated Learning and Discovery, and the Center for the Neural Basis of Cognition at Carnegie Mellon University, where some of the work on this book was completed during his sabbatical in 1998. Balakrishnan would like to thank the Department of Computer Science at Iowa State University, the Allstate Research and Planning Center, and Obongo Inc. for providing the resources and support for this project. He is also grateful to IBM for providing a research fellowship that made some of this work possible.

We are indebted to our students, teachers, colleagues, family, and friends from whom we have learned a great deal over the years. Honavar is indebted to Leonard Uhr for introducing him to research in artificial intelligence and for being a wonderful mentor and friend. He has benefited and learned from interactions with members of the artificial intelligence, neural computing, evolutionary computing, machine learning, intelligent agents, cognitive science, complex adaptive systems, computational biology, and neuroscience communities (at one time or another) at the University of Wisconsin-Madison (especially Gregg Oden, Jude Shavlik, Charles Dyer, Larry Travis, Lola Lopes, David Finton, Matt Zeidenberg, Jose Cavazos, Peter Sandon, Roland Chin, and Josh Chover), Iowa State University (especially Rajesh Parekh, Chun-Hsien Chen, Karthik Balakrishnan, Jihoon Yang, Armin Mikler, Rushi Bhatt, Adrian Silvescu, Doina Caragea, Drena Dobbs, Jack Lutz, Gavin Naylor, Codrin Nichitiu, Olivier Bousquet, John Mayfield, Philip Haydon, Dan Voytas, Pat Schnable, Giora Slutzki, Leigh Tesfatsion, Di Cook, Don Sakaguchi, Srdija Jeftinija, Les Miller, David Fernandez-Baca, James McCalley, Johnny Wong, Mok-dong Chung, and Vadim Kirillov), and elsewhere (especially Tom Mitchell, Pat Langley, David Touretzky, Lee Giles, Ron Sun, Massimo Negrotti, David Goldberg, David Fogel, Wolfgang Banzhaf, and John Koza). Patel would like to extend particular thanks to Ben du Boulay (at Sussex University), Marco Colombetti (at Politecnico di Milano), and Brendon McGonigle (at Edinburgh University) – each in their own way played a significant role in helping him conceive and realize the idea of such a collection of papers.

Balakrishnan is deeply indebted to his family for their love and support all through his life. In particular, he would like to extend a special note of gratitude to his parents Shantha and R. Balakrishnan, his wife Usha Nandini, and his sisters Usha and Sandhya, for always being there for him. He would also like to thank Vasant Honavar for being his research mentor. Balakrishnan is indebted to numerous teachers, friends, and colleagues from whom he has learned much. He would especially like to thank Dan Ashlock, Albert Baker, Rushi Bhatt, Olivier Bousquet, Chun-Hsien Chen, Veronica Dark, Dario Floreano, David Juedes, Filippo Menczer, Armin Mikler, Rajesh Parekh, William Robinson, Leigh Tesfatsion, Noe Tuason, Johnny Wong, and Jihoon Yang.

Last but not least, we are indebted to the artificial intelligence, neural computing, evolutionary computing, synthetic biology, computational biology, and cognitive science research communities at large, whose work has directly or indirectly helped shape the theme and contents of this book.

Karthik Balakrishnan, Redwood City, USA
Vasant Honavar, Ames, USA
Mukesh Patel, Ahmedabad, India
1 Evolutionary and Neural Synthesis of Intelligent Agents
Karthik Balakrishnan and Vasant Honavar
1.1 Introduction

From its very inception in the late 1950s and early 1960s, the field of Artificial Intelligence has concerned itself with the analysis and synthesis of intelligent agents. Although there is no universally accepted definition of intelligence, for our purposes here, we treat any software or hardware entity as an intelligent agent, provided it demonstrates behaviors that would be characterized as intelligent if performed by normal human beings under similar circumstances. This generic definition of agency applies to robots, software systems for pattern classification and prediction, and recent advances in mobile software systems (softbots) for information gathering, data mining, and electronic commerce. The design and development of such agents is a topic of considerable ongoing research and draws on advances in a variety of disciplines including artificial intelligence, robotics, cognitive systems, design of algorithms, control systems, electrical and mechanical engineering, programming languages, computer networks, distributed computing, databases, statistical analysis and modeling, etc.

In very simplistic terms, an agent may be defined as an entity that perceives its environment through sensors and acts upon it through its effectors [76]. However, for agents to be useful, they must also be capable of interpreting perceptions, reasoning, and choosing actions autonomously and in ways suited to achieving their intended goals. Since agents are often expected to operate reliably in unknown, partially known, and dynamic environments, they must also possess mechanisms to learn and adapt their behaviors through their interactions with their environments. In addition, in some cases, we may also require the agents to be mobile and move or access different places or parts of their operating environments. Finally, we may expect the agents to be persistent, rational, etc., and to work in groups, which requires them to collaborate and communicate [76, 87, 60].

It can be seen that this definition of agents applies to robots that occupy and operate in physical environments, software systems for evaluation and optimization, softbots that inhabit the electronic worlds defined by computers and computer networks, and even animals (or biological agents) that cohabit this planet. Robots have been used in a wide variety of application scenarios like geo-, space-, and underwater exploration, material handling and delivery, security patrolling, control applications in hazardous environments like chemical plants and nuclear reactors, etc. [14, 18, 90, 62]. Intelligent software systems have been extensively used as automated tools for diagnosis, evaluation, analysis, and optimization. Softbots are being increasingly used for information gathering and retrieval, data mining, e-commerce, news, mail and information filtering, etc. [7].

In general, an agent can be characterized by two elements: its program and its architecture. The agent program is a mapping that determines the actions of the agent in response to its sensory inputs, while architecture refers to the computing and physical medium on which this agent program is executed [76]. For example, a mail filtering agent might be programmed in a language such as C++ and executed on a computer, or a robot might be programmed to move about and collect empty soda cans using its gripper arm. Thus, the architecture of the agent, to a large extent, determines the kinds of things the agent is capable of doing, while the program determines what the agent does at any given instance of time. In addition to these two elements, the agent environment governs the kinds of tasks or behaviors that are within the scope of any agent operating in that environment. For instance, if there are no soda cans in the robot's environment, the can-collecting behavior is of no practical use. Given a particular agent environment, the question then is: how can we design agents with the necessary behaviors and abilities to solve a given task (or a set of tasks)?

The field of Artificial Intelligence (AI) has long concerned itself with the design, development, and deployment of intelligent agents for a variety of practical, real-world applications [73, 86, 81, 76]. A number of tools and techniques have been developed for synthesizing such agent programs (e.g., expert systems, logical and probabilistic inference techniques, case-based reasoning systems, etc.) [73, 24, 40, 17, 16, 76]. However, designing programs using most of these approaches requires considerable knowledge of the application domain being addressed by the agent. This process of knowledge extraction (also called knowledge engineering) is often difficult, either owing to the lack of precise knowledge of the domain (e.g., medical diagnosis) or the inability to procure knowledge owing to other limitations (e.g., detailed environmental characteristics of, say, the planet Saturn). We thus require agent design mechanisms that work either with little domain-specific knowledge or allow the agent to acquire the desired information through its own experiences. This has led to considerable development of the field of machine learning (ML), which provides a variety of modes and
methods for automatic synthesis of agent programs [35, 36, 44, 74, 58].
1.2 Design of Biological Agents

Among the processes of natural adaptation responsible for the design of biological agents (animals), probably the most crucial is that of evolution, which excels in producing agents designed to overcome the constraints and limitations imposed by their environments. For example, the flashlight fish (Photoblepharon palpebratus) inhabits deep waters where there is little surface light. However, this is not really a problem, since evolution has equipped these creatures with active vision systems. These fishes produce and emit their own light, and use it to detect obstacles, prey, etc. This vision system is unlike any seen in surface-dwelling organisms [27].

Apart from evolution, learning is another biological process that allows animals to successfully adapt to their environments. For instance, animals learn to respond to specific stimuli (conditioning) and ignore others (habituation), recognize objects and places, communicate, engage in species-specific behaviors, etc. [51, 23, 27, 52]. Intelligent behavior in animals emerges as a result of interaction between the information processing structures possessed by the animal and the environments that they inhabit. For instance, it is well known that marine turtles (e.g., Chelonia mydas) lay their eggs on tropical beaches (e.g., Ascension Island in the Atlantic Ocean). When the eggs hatch, the young automatically crawl to the water without any kind of parental supervision. They appear to be genetically programmed to interpret cues emanating from large bodies of water or patches of sky over them [27]. Animals are also capable of learning features of the specific environments that they inhabit during their lifetime. For instance, rodents have the ability to learn and successfully navigate through complex mazes [82, 61].

These processes of natural adaptation, namely evolution and learning, play a significant role in shaping the behaviors of biological agents. They differ from each other in some significant aspects, including the spatio-temporal scales at which they operate. While learning operates on individuals, evolution works over entire populations (or species). Further, learning operates during the lifetime of the individual and is presumably aided by long lifespans, while evolution works over generations, well beyond an individual's effective lifespan [1]. Despite these apparent differences, evolution and learning work synergistically to produce animals capable of surviving and functioning in diverse environments. While the architectures of biological agents (e.g., digestive, respiratory, nervous, immune, and reproductive systems) are shaped by evolution, the agent programs (e.g., behaviors like foraging, feeding, grooming, sleeping, escape, etc.) are affected and altered by both evolutionary and learning phenomena. In such cases, evolution produces the innate (or genetically programmed) behaviors, which are then modified and contoured to the animal's experiences in the specific environment to which it is exposed. Thus, by equipping the agents with good designs and instincts, evolution allows them to survive sufficiently long to learn the behaviors appropriate for the environment in question.

Although the processes of evolution and learning are reasonably well detailed, there are still many gaps to be filled before a complete understanding of such processes can be claimed. Ongoing research in the neurosciences, cognitive psychology, animal behavior, genetics, etc., is providing new insights into the exact nature of the mechanisms employed by these processes: the structures they require and the functions they compute. This has led to many new hypotheses and theories of such mechanisms. Computational modeling efforts complement such research endeavors by providing valuable tools for testing theories and hypotheses in controlled, simulated environments. Such modeling efforts can potentially identify and suggest avenues of further research that can help fill in the gaps in current human understanding of the modeled processes [10].

As outlined in the previous sections, the enterprise of designing agents has to take into account three key aspects: the nature of the environment, the agent architecture, and the agent program. The primary focus of this book is on evolutionary synthesis of intelligent agents, with a secondary focus on the use of learning mechanisms to enhance agent designs. The chapters that follow consider the design of agent programs using processes that are loosely modeled after biological evolution. They explore several alternative architectures and paradigms for intelligent agents, including artificial neural networks, and they provide several examples of the successful use of evolutionary and neural approaches to the synthesis of agent architectures and agent programs.
1.3 Agent Programs

As we mentioned earlier, the agent program is primarily responsible for the behavior of the agent or robot in a given environment. The agent program determines the sensory inputs available to the agent at any given moment in time, processes the inputs in ways suited to the goals (or functionalities) of the agent, and determines appropriate agent actions to be performed. In order to be used in practice, these programs have to be encoded within the agents using an appropriate language, and the agents must possess mechanisms to interpret and execute them.

Some of the earliest programs were realized using circuits and control systems that directly sensed input events, processed them via appropriate transfer functions, and directly controlled the output behaviors of the robot [46]. Examples of such representations include systems used in the control of industrial robots, robotic arms, etc. [2]. However, these approaches require the transfer function to be known and appropriately implemented, which is often difficult in practice. In addition, these control mechanisms are rather inflexible, and reprogramming the robot for a different behavior or task may entail extensive changes.

The advent of computers and their ability to be effectively reprogrammed opened up entirely new possibilities in the design of agent programs for modern-day robots and software agents. In these cases, the agent programs are written in some computer language (e.g., C++, Java, LISP) and executed by the computer. The program receives the necessary inputs from the agent sensors, processes them appropriately to determine the actions to be performed, and controls the agent actuators to realize the intended behaviors. The sensors and actuators may be physical (as in the case of robots) or virtual (as in the case of most software agents). The behavior of the agent can be changed by changing the agent program associated with it. The question then is: how can we develop agent programs that will enable the agent to exhibit the desired behaviors?

Designing Control Programs for Agent Behaviors

Many contemporary agents make use of programs that are manually developed. This is a daunting task, given the fact that agent-environment interactions exhibit a host of a priori unknown or unpredictable effects. In addition, complex agent behaviors often involve tradeoffs between multiple competing alternatives. For example, suppose a robot has the task of clearing a room by pushing boxes to the walls. Let us also assume that the robot has limited sensing ranges that prevent it from observing the contents of the entire room, and that it does not have any means to remember the positions of boxes it has observed in the past. Suppose this robot currently observes two boxes. Which one should it approach and push? This decision is critical, as it directly affects the subsequent behaviors of the robot. We may program the robot to approach the closer of the two boxes, but can we be sure that such a decision made at the local level will indeed lead to any kind of globally optimal behavior? Manually designing control programs to effectively address such competing alternatives is an equally challenging proposition. We thus need approaches to automate the synthesis of scalable, robust, flexible, and adaptive agent programs for a variety of practical applications.

In recent years two kinds of automatic design approaches have met with much success: discovery and learning. Approaches belonging to the former category typically include some mechanism to search the space of agent programs in the hope of finding or discovering a good one. Each program found during this search is evaluated in environments that are representative of those expected to be encountered by the agent, and the best ones are retained. Some discovery approaches (e.g., evolutionary search) use these evaluations to guide or focus the search procedure, making the process more efficient. The latter category includes approaches that allow the robot behaviors to be modified based on the experiences of the agent, i.e., the robot learns the correct behaviors based on its experience in the environment. In the remainder of this chapter we will introduce two paradigms that aid in the automatic synthesis of agent programs.

Artificial neural networks, finite state automata, rule-based production systems, Lambda Calculus, etc., offer alternative computational models for the design of agent programs. As shown by the pioneering work of Turing, Church, Post, Markov, and others, they are all essentially equivalent in terms of the class of information processing operations that they can realize [83, 56, 47]. However, in practice, each paradigm has its own advantages and disadvantages in the context of specific applications. It is therefore not at all uncommon to use hybrid models that integrate and exploit the advantages of multiple paradigms in synergistic and complementary ways [38, 80, 26]. In what follows, we focus primarily on evolutionary synthesis of neural architectures for intelligent agents. However, many of the same issues arise in the automated synthesis of agents with other information processing architectures as well (e.g., rule-based systems).
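The view of an agent program as a mapping from sensory inputs to actions, executed against an architecture, can be made concrete with a small sketch. The one-dimensional toy environment and its sense/act interface below are hypothetical illustrations, not drawn from the chapter.

```python
class ToyEnvironment:
    """A one-dimensional world: the agent senses its position and moves."""
    def __init__(self):
        self.position = 0

    def sense(self):
        return self.position

    def act(self, action):
        self.position += action

def run_agent(agent_program, environment, steps):
    """Execute an agent program: any callable from percepts to actions."""
    for _ in range(steps):
        percept = environment.sense()     # read the (virtual) sensors
        action = agent_program(percept)   # the agent program proper
        environment.act(action)           # drive the (virtual) actuators

# A trivial hand-written agent program: move right until position 5.
env = ToyEnvironment()
run_agent(lambda position: 1 if position < 5 else 0, env, steps=10)
print(env.position)  # 5
```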
1.4 Artificial Neural Networks as Agent Programs

Artificial neural networks offer an attractive paradigm of computation for the synthesis of agent programs for a variety of reasons, including their potential for massively parallel computation, robustness in the presence of noise, resilience to the failure of components, and amenability to adaptation and learning via the modification of computational structures, among others.

What are Artificial Neural Networks?

Artificial neural networks are models of computation that are inspired by, and loosely based on, the nervous systems of biological organisms. They are conventionally modeled as massively parallel networks of simple computing elements, called units, that are connected together by adaptive links called weights, as shown in Figure 1.1. Each unit in the network computes some simple function of its inputs (called the activation function) and propagates its outputs to the other units to which it happens to be connected. A number of activation functions are used in practice, the most common ones being the threshold, linear, sigmoid, and radial-basis functions. The weights associated with a unit represent the strength of the synapses between the corresponding units, as will be explained shortly. Each unit is also assumed to be associated with a special weight, called the threshold or bias, that is assumed to be connected to a constant source of +1 (or -1). This threshold or bias serves to modulate the firing properties of the corresponding unit and is a critical component in the design of these networks.
Figure 1.1 Artificial neural network (left), with input units, hidden units, output units, weights, and recurrent links, and the bipolar threshold activation function (right).
The input to an n-input (including the bias) unit is typically represented by a pattern vector X 2 Rn or in the case of binary patterns, by a binary vector X 2 [0; 1]n . The weights associated with an n-input unit i are typically represented by an n-dimensional weight vector i 2 Rn . By popular convention, the first element of the weight vector usually represents the threshold (or bias). The input activation of a unit i, represented by Ai , in response to a pattern X on its input links is usually given by the vector dot product: Ai = i : . The output of the unit is a function of Ai and is dictated by the activation function chosen. For example, the bipolar threshold activation function, shown in Figure 1.1, produces: Oi = 1 if Ai = i : 0 and Oi = 1 otherwise. Units in the network that receive input directly from the environment are referred to as input units, while the units that provide the environment with the results of network computations are called output units. In conventional neural network terminology, the set of input and output units are said to reside in input and output layers. In addition to these kinds of units, the networks can also have other units that aid in the network computations but do not have a direct interface to or from the external environment. Such units are referred to as hidden units. Often, the hidden units critically determine the kinds of mappings or computations the networks are capable of performing. In a typical neural network the activations are propagated as follows. At any given instance of time, the input pattern is applied to the input units of the network. These input activations are then propagated to the units that are connected to these input units, and the activations of these second layer units are computed. Now the activations of these units are propagated via their output links to other units, and this process continues until the activations reach the units in the output layer. Once the computations of the output layer units are complete (or in the case of recurrent networks the network activity stabilizes), the resulting firing pattern across the output layer units is said to be the output of the network in response to the corresponding input pattern. Thus, in a typical neural network, activations enter the input units, propagate through links and hidden units, and produce an activation in the units of the output layer. A wide variety of artificial neural networks have been studied in the literature. Apart from differences stemming from the activation functions used, neural networks can also be distinguished based on their topological organization. For instance, networks can be single-layered or multi-layered; sparsely connected or completely connected; strictly layered or arbitrarily connected; composed of homogeneous or heterogeneous computing elements, etc. Perhaps the most important architectural (and hence functional) distinction is be-
Perhaps the most important architectural (and hence functional) distinction is between networks that are feed-forward (whose connectivity graph does not contain any directed cycles) and those that are recurrent (whose connectivity graph contains feedback loops). Feed-forward networks can be trained via a host of simple learning algorithms and have found widespread use in pattern recognition, function interpolation, and system modeling applications. In contrast to feed-forward networks, recurrent networks have the ability to remember and use past network activations through recurrent (or feedback) links. These networks have thus found natural applications in domains involving temporal dependencies, for instance in sequence learning, speech recognition, and motion control in robots. For further details regarding artificial neural networks and their rather chequered history, the reader is referred to any of a number of excellent texts [15, 32, 45, 22, 43, 31, 74].

Design of Artificial Neural Networks

As may be inferred, the input-output mapping realized by an artificial neural network is a function of the number of units, the functions they compute, the topology of their connectivity, the strengths of their connections (weights), the control algorithm for propagating activations through the network, etc. [35]. Thus, to create a neural network with a desired input-output mapping, one has to appropriately design these different components of the network. Not surprisingly, network synthesis of this sort is an extremely difficult task, because the different components of the network and their interactions are often very complex and hard to characterize accurately. Much of the research on neural network synthesis has focused on algorithms that modify the weights within an otherwise fixed network architecture [22]. This essentially entails a search for a setting of the weights that endows the network with the desired input-output behavior. For example, in a network used for classification applications, we desire weights that allow the network to correctly classify all (or most of) the samples in the training set. Since this is fundamentally an optimization problem, a variety of optimization methods (gradient descent, simulated annealing, etc.) can be used to determine the weights. Most of the popular learning algorithms use some form of error-guided search (e.g., changing each modifiable parameter in the direction of the negative gradient of a suitably defined error measure with respect to the parameter of interest). A number of such learning algorithms have been developed, both for supervised learning (where the desired outputs of the network are specified by an external teacher) and unsupervised learning (where the network learns to classify, categorize, or self-organize without external supervision).
For details regarding these learning paradigms, the reader is referred to [22, 43, 30, 74]. Although a number of techniques have been developed to adapt the weights within a given neural network, the design of the network architecture itself still poses problems. Conventional approaches often rely on human experience, intuition, and rules of thumb to determine network architectures. In recent years, a number of constructive and destructive algorithms have been developed that aid in the design of neural network architectures. While constructive algorithms incrementally build network architectures one unit (or one module) at a time [34, 37, 63, 89], destructive algorithms allow arbitrary networks to be pruned one unit (or one module) at a time. Thus, these approaches not only synthesize network architectures but also entertain the possibility of discovering compact (or minimal) networks. A number of such constructive and destructive learning algorithms have been developed, each offering its own characteristic bias. In addition to these approaches, evolutionary algorithms (to be described shortly) have also been used to search the space of neural architectures for near-optimal designs (see [5] for a bibliography). This evolutionary approach to the design of neural network architectures forms the core of this compilation.
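As an illustration of error-guided weight search, the following sketch (ours, not from the chapter) performs gradient descent on a squared-error measure for a single linear unit, i.e., the delta rule:

```python
import numpy as np

def train_linear_unit(X, y, lr=0.1, epochs=100):
    # Delta rule: move each weight along the negative gradient of
    # E = 0.5 * sum((y - o)^2) with respect to that weight.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        o = X @ w              # unit outputs for all training patterns
        grad = -(y - o) @ X    # dE/dw
        w -= lr * grad         # step against the gradient
    return w

# Toy usage: recover o = 2*x1 - x2 from four training patterns.
X = np.array([[0., 1.], [1., 0.], [1., 1.], [2., 1.]])
y = np.array([-1., 2., 1., 3.])
print(train_linear_unit(X, y))  # approaches [2., -1.]
```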
1.5 Evolutionary Algorithms

Evolutionary algorithms, loosely inspired by biological evolutionary processes, have gained considerable popularity as tools for searching vast, complex, deceptive, and multimodal search spaces [33, 25, 57]. Following the metaphor of biological evolution, these algorithms work with populations of individuals, where each individual represents a point in the space being searched. Viewed as a search for a solution to a problem, each individual then represents (or encodes) a solution to the problem at hand. As with biological evolutionary systems, each individual is characterized by a genetic representation or genetic encoding, which typically consists of an arrangement of genes (usually in string form). These genes take on values, called alleles, from a suitably defined domain of values. This genetic representation is referred to as the genotype in biology. The actual individual, in our case a solution, is referred to as the phenotype. As in biological evolutionary processes, phenotypes in artificial evolution are produced from genotypes through a process of decoding and development, as shown in Figure 1.2. Thus, while a human being
corresponds to a phenotype, his/her chromosomes correspond to the genotype. The processes of nurture, growth, learning, etc., then correspond to the decoding/developmental processes that transform the genotype into a phenotype.
Figure 1.2 The functioning of an evolutionary algorithm.
In artificial evolution (also referred to as simulated evolution), the solutions represented by the phenotypes are evaluated on the target problem for which solutions are being sought. This evaluation of the phenotypes assigns differential fitness labels to the corresponding genotypes. Processes akin to natural selection then preferentially choose genotypes of higher fitness to participate in a probabilistically greater number of matings. These matings between chosen individuals lead to offspring that derive their genetic material from their parents via artificial genetic operators that roughly correspond to the biological operators of recombination and mutation. These artificial genetic operators are popularly referred to as crossover and mutation. The offspring genotypes are then decoded into phenotypes and the process repeats itself. Over many generations, the processes of selection, crossover, and mutation gradually lead to populations containing genotypes that correspond to high-fitness phenotypes. This general procedure, perhaps with minor variations, is at the heart of most evolutionary systems. The literature broadly distinguishes between four different classes of evolutionary approaches: genetic algorithms, genetic programming, evolutionary programming, and evolution strategies. While genetic algorithms typically use binary (or bit) strings to represent genotypes [33, 25], genetic programming evolves programs in some given language [42]. Both these paradigms
perform evolutionary search via the genetic operators of crossover and mutation. Evolutionary programming, on the other hand, allows complex structures in the genotypes but uses only a mutation operator [20]. Evolution strategies are typically used for parameter optimization [78, 3]. They employ recombination and mutation, and also permit self-learning (or evolutionary adaptation) of strategy parameters (e.g., the variance of the Gaussian mutations). In recent years, the distinctions between these different paradigms have become rather fuzzy, with researchers borrowing from the strengths of the different paradigms. For instance, we use complex data structures for representing genotypes and employ both recombination and mutation operators to perform the evolutionary search [4]. In this regard our approach may be described as a combination of evolutionary programming and genetic algorithms. For these reasons we prefer to use the generic term evolutionary algorithms to describe our approach to the use of artificial evolution.

As each population member represents a potential solution, evolutionary algorithms effectively perform a population-based search in solution space. Since this is equivalent to exploring multiple regions of the space in parallel, evolutionary algorithms are efficient search tools for vast spaces. In addition, the population-based nature of evolutionary search often helps it overcome problems associated with local maxima, making it very suitable for searching multimodal spaces. Further, the genetic encoding and genetic operators can be chosen to be fairly generic, requiring the user to specify only the decoding function and the fitness (or evaluation) function. In most cases these functions can be specified using little domain-specific knowledge. Thus, one does not necessarily have to understand the intricacies of the problem in order to use an evolutionary approach to solve it.

Evolutionary Synthesis of Agent Programs

As demonstrated by several of the chapters that follow, evolutionary algorithms offer a promising approach to the synthesis of agent programs (in the form of artificial neural networks, LISP programs, etc.) for a wide variety of tasks [68, 19, 71, 29, 70, 64, 72, 69, 88, 13, 53, 85, 41, 12, 48, 54, 59, 8, 50]. In such cases, evolutionary search operates in the space of agent programs, with each member of the population representing an agent behavior. By evaluating these behaviors on the target agent task and performing fitness-proportionate reproduction, evolution discovers agent programs that exhibit the desired behaviors.
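The general procedure described above can be expressed as a short skeleton. This sketch is ours, with illustrative parameter choices; it uses bit-string genotypes and fitness-proportionate selection, and assumes non-negative fitness values:

```python
import random

def evolve(decode, fitness, length=30, pop_size=50, generations=100,
           p_cross=0.9, p_mut=0.01):
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(decode(g)) for g in pop]     # evaluate phenotypes
        def select():                                  # fitness-proportionate
            return random.choices(pop, weights=scores)[0]
        offspring = []
        while len(offspring) < pop_size:
            a, b = select()[:], select()[:]
            if random.random() < p_cross:              # one-point crossover
                cut = random.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):                       # bit-flip mutation
                offspring.append([bit ^ (random.random() < p_mut)
                                  for bit in child])
        pop = offspring[:pop_size]
    return max(pop, key=lambda g: fitness(decode(g)))

# Toy usage ("one-max"): the decoded solution is the bit-string itself.
best = evolve(decode=lambda g: g, fitness=sum)
```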
1.6 Evolutionary Synthesis of Neural Systems

In an earlier section we alluded to the difficulty of synthesizing artificial neural networks that possess specific input-output mappings. Owing to the many properties of evolutionary algorithms, primarily their ability to search vast, complex, and multimodal search spaces using little domain-specific knowledge, they have found natural applications in the automatic synthesis of artificial neural networks. Several researchers have recently begun to investigate evolutionary techniques for designing such neural architectures (see [5] for a bibliography). Probably the distinguishing feature of an evolutionary approach to network synthesis is that, unlike neural network learning algorithms that typically determine weights within a priori fixed architectures, and unlike constructive/destructive algorithms that simply design network architectures without directly adapting the weights, evolutionary algorithms permit the co-evolution, or co-design, of the network architecture and the weights. Evolutionary algorithms may also easily be extended to automatically adapt other parameters, such as the rates of mutation [79] and the learning rate [77], and even the learning algorithm itself [9]. In addition, by appropriately modifying the fitness function, the same evolutionary system can be used to synthesize vastly different neural networks, each satisfying different task-specific performance measures (e.g., accuracy, speed, robustness) or user-specified design constraints (e.g., compactness; the numbers of units, links, and layers; fan-in/fan-out constraints; power consumption, heat dissipation, or area/volume when implemented in hardware). Evolutionary algorithms also allow these networks to be optimized along multiple dimensions, either implicitly [4] or explicitly, via the use of different multi-objective optimization approaches [21, 67, 66, 39].

An Example of Evolutionary Synthesis of Neural Networks

A number of researchers have designed evolutionary systems to synthesize neural networks for a variety of applications. Here we present the approach adopted by Miller et al. (1989). In their system, Miller et al. encode the topology of an $N$-unit neural network by a connectivity constraint matrix $C$ of dimension $N \times (N+1)$, as shown in Figure 1.3. The first $N$ columns specify the constraints on the connections between the $N$ units, and the final column codes for the connection that corresponds to the threshold or bias of each unit. Each entry $C_{ij}$ of the connectivity constraint matrix indicates the nature of the constraint
on the connection from unit $j$ to unit $i$ (or the constraint on the threshold or bias of unit $i$ if $j = N+1$). While $C_{ij} = 0$ indicates the absence of a trainable connection from unit $j$ to unit $i$, a value of 1 signals the presence of such a trainable link. The rows of the matrix are concatenated to yield a bit-string of length $N \times (N+1)$; this is the genotype in their evolutionary system.
From Unit:          1  2  3  4  5  bias
To Unit:    1       0  0  0  0  0   0
            2       0  0  0  0  0   0
            3       1  1  0  0  0   1
            4       1  1  0  0  0   1
            5       0  0  1  1  0   1

Connectivity Constraint Matrix

Bit-string Genotype: 000000000000110001110001001101

Network Architecture: input units 1 and 2 feed hidden units 3 and 4, which feed output unit 5; units 3, 4, and 5 have bias connections.
Figure 1.3 An example of an evolutionary approach to the synthesis of neural networks.
The fitness of a genotype is evaluated as follows. First, the genotype is decoded into the corresponding neural network (the phenotype). This decoded network has connections (weights) between exactly those units that have a 1 in the corresponding position of the connectivity constraint matrix (the genotype), as explained earlier. Even though feedback connections can be specified in the genotype, they are ignored by the decoding mechanism; the system thus evolves purely feed-forward networks. Next, all the connections in the network are set to small random values and trained for a fixed number
of epochs on a given set of training examples, using the back-propagation algorithm [75]. The total sum squared error $E$ of the network at the end of the training phase is used as the fitness measure, with low values of $E$ corresponding to better performance and hence a higher fitness label for the corresponding genotype. The system maintains a population of such genotypes (bit-strings) and uses a fitness-proportionate selection scheme for choosing parents for reproduction. The crossover operator swaps rows between parents, while mutation randomly flips bits in the genotype with some low, pre-specified probability. The researchers used this evolutionary system to design neural networks for the XOR, four-quadrant, and pattern-copying problems [55].
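The following is a schematic sketch of this decode-and-evaluate scheme, not the authors' actual code. It assumes the units are numbered so that a connection from a lower- to a higher-numbered unit is feed-forward, and it stubs out the training step:

```python
import numpy as np

N = 5   # units; the genotype is the row-wise concatenation of C (N x (N+1) bits)

def decode(genotype):
    # Rebuild the connectivity constraint matrix from the bit-string and
    # keep only feed-forward connections (from unit j to unit i with j < i),
    # mimicking the decoder's rejection of feedback links.
    C = np.array(genotype, dtype=int).reshape(N, N + 1)
    feed_forward = np.tril(np.ones((N, N), dtype=int), k=-1)
    return C[:, :N] * feed_forward, C[:, N]   # (connection mask, bias bits)

def fitness(genotype, train):
    # train() stands in for back-propagation over a fixed number of epochs;
    # it should return the total sum squared error E of the trained network.
    connections, biases = decode(genotype)
    return -train(connections, biases)        # low E => high fitness

def mutate(genotype, p=0.01):
    # Bit-flip mutation with a low, pre-specified probability.
    return [b ^ (np.random.random() < p) for b in genotype]
```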
1.7 Genetic Representations of Neural Architectures

Central to any evolutionary design system is the choice of the genetic representation, i.e., the encoding of genotypes and the mechanism for decoding them into phenotypes. The choice of representation is critical since it not only dictates the kinds of phenotypes (and hence solutions) that can be generated by the system, but also determines the amount of resources (e.g., time and space) expended in the effort. Further, the genetic operators of the system are defined based largely on the chosen representation. These factors contribute directly to the efficiency (e.g., time and space requirements) and the efficacy (e.g., quality of the solutions found) of the evolutionary search procedure [66, 65]. Thus, a careful characterization of the properties of genetic representations, as they relate to the performance of evolutionary systems, is a necessary and useful venture. Some authors have attempted to characterize properties of genetic representations of neural architectures [11, 28]. However, most such characterizations have been restricted to a specification of the properties of the encoding scheme, without considering in detail the associated decoding process. This is an important oversight, because the decoding process not only determines the phenotypes that emerge from a given genotype, but also critically influences the resources required in the bargain. For instance, a cryptic encoding scheme might be compact from a storage perspective but might entail a decoding process that is rather involved (both in terms of time and storage space). If we were blind to the effects of the decoding process, we might be enamoured by this encoding scheme and use it as our genetic representation. This would have
severe consequences for the performance of the evolutionary system. For these reasons it is imperative that we consider the encoding and decoding processes as a closely related pair when characterizing different properties of genetic representations. Given this motivation, we have identified and precisely defined a number of key properties of genetic representations of neural architectures, taking both the encoding and decoding processes into account [6]. However, it must be pointed out that this is only a preliminary characterization, and we expect these definitions to become more refined as we examine a larger variety of evolutionary systems more closely. The following section describes these properties.

Properties of Genetic Representations of Neural Architectures

In order to develop formal descriptions of properties of genetic representations, we need the following definitions.
$G_R$ is the space of genotypes representable in the chosen genetic representation scheme $R$. $G_R$ may be explicitly enumerated or implicitly specified using a grammar whose language is $G_R$.

$p = D(g, E_D)$, where $D$ is the decoding function that produces the phenotype $p$ corresponding to the genotype $g$, possibly under the influence of the environment $E_D$ (e.g., the environment may set the parameters of the decoding function). A value of $\emptyset$ for $E_D$ denotes the lack of direct interaction between the decoding process and the environment. It should be borne in mind that $D$ may be stochastic, with an underlying probability distribution over the space of phenotypes.
$p_2 = L(p_1, E_L)$, where the learning procedure $L$ generates phenotype $p_2$ from phenotype $p_1$ under the influence of the environment $E_L$. The environment may provide the training examples, set the free parameters (e.g., the learning rate used by the algorithm), etc. We will use $L = \emptyset$ to denote the absence of any form of learning in the system. In the following discussion we will use the term decoding function to refer to both $D$ and $L$. This slight abuse of notation allows the following properties to apply to genetic representations in general, even though they are presented in the context of the evolutionary design of neural architectures.
$P_R$ is the space of all phenotypes that can be constructed (in principle) given a particular genetic representation scheme $R$. Mathematically,

$P_R = \{ p \mid \exists g \in G_R \, [ (p_1 = D(g, E_D)) \wedge (p = L(p_1, E_L)) ] \}$
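These definitions can be mirrored in code. The following toy instantiation is entirely ours and purely illustrative: genotypes are 8-bit tuples, $D$ maps bits to a weight vector, and $L$ rescales the weights under an environment-supplied parameter:

```python
import random

def D(g, E_D=None):                        # p1 = D(g, E_D)
    return tuple(2.0 * b - 1.0 for b in g)

def L(p1, E_L=None):                       # p2 = L(p1, E_L)
    scale = (E_L or {}).get("scale", 1.0)  # E_L sets a free parameter
    return tuple(scale * w for w in p1)

def sample_phenotypes(n=1000, length=8, E_D=None, E_L=None):
    # Approximate P_R by decoding n genotypes sampled from G_R.
    phenotypes = set()
    for _ in range(n):
        g = tuple(random.randint(0, 1) for _ in range(length))
        phenotypes.add(L(D(g, E_D), E_L))
    return phenotypes
```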
$S$ is the set of solution networks, i.e., neural architectures or phenotypes that satisfy the desired performance criterion (as measured by the fitness function) in a given environment $E$. If an evolutionary system with a particular representation $R$ is to successfully find solutions (even in principle), $S \subseteq P_R$ or, at the very least, $S \cap P_R \neq \emptyset$. In other words, there must be at least one solution network that can be constructed given the chosen representation $R$.
$A$ is the set of acceptable or valid neural architectures. For instance, a network may be deemed invalid or unacceptable if it does not have any paths from the inputs to the outputs. In general, $A$ may be different from $P_R$. However, it must be the case that $A \cap S \neq \emptyset$ if a particular evolutionary system is to be useful in practice.

We now identify some properties of genetic representations of neural architectures. Unless otherwise specified, we will assume the following definitions are with respect to an a priori fixed choice of $E_D$, $L$, and $E_L$.

1. Completeness: A representation $R$ is complete if every neural architecture in the solution set can be constructed (in principle) by the system. Formally, the following two statements are equivalent definitions of completeness.
$(\forall s \in S)(\exists g \in G_R)\,[(p_1 = D(g, E_D)) \wedge (s = L(p_1, E_L))]$

$S \subseteq P_R$
Thus, completeness demands that the representation be capable of producing all possible solutions to the problem. Often this may be hard to satisfy, and one may have to choose among partially complete representations. In such cases another figure of merit, called the solution density and denoted by $|S \cap P_R| / |P_R|$, might be useful. One would then choose representations that correspond to higher solution densities, since this implies a higher likelihood of finding solutions. It should be noted that if the solution density is very high, even a random search procedure will yield good solutions, and one may not have much use for an evolutionary approach.

2. Closure: A representation $R$ is completely closed if every genotype decodes to an acceptable phenotype. The following two assertions are equivalent definitions of closure.
$(\forall g \in G_R)\,[(p_1 = D(g, E_D)) \wedge (L(p_1, E_L) \in A)]$

$P_R \subseteq A$
A representation that is not closed can be transformed into a closed system by constraining the decoding function, thereby preventing it from generating invalid phenotypes. Additionally, if the genetic operators are designed to have
the property of closure, then one can envision constrained closure, wherein not all genotypes correspond to acceptable phenotypes but closure is nonetheless guaranteed because the system never generates the invalid genotypes. Closure has a bearing on the efficiency of the evolutionary procedure, as it determines the amount of effort (space, time, etc.) wasted in generating unacceptable phenotypes.

3. Compactness: Suppose two genotypes $g_1$ and $g_2$ both decode to the same phenotype $p$; then $g_1$ is said to be more compact than $g_2$ if $g_1$ occupies less space than $g_2$:
$(p_1 = D(g_1, E_D)) \wedge (p = L(p_1, E_L)) \wedge (p_2 = D(g_2, E_D)) \wedge (p = L(p_2, E_L)) \wedge |g_1| < |g_2|$

where $|g|$ denotes the size of storage for genotype $g$.
This definition corresponds to topological compactness as defined by Gruau (1994). His definition of functional compactness, which compares the genotype sizes of two phenotypes that exhibit the same behavior, can be expressed in our framework (for solution networks) as
$(p_1 = D(g_1, E_D)) \wedge (L(p_1, E_L) \in S) \wedge (p_2 = D(g_2, E_D)) \wedge (L(p_2, E_L) \in S) \wedge |g_1| < |g_2|$
Compactness is a useful property, as it allows us to choose genetic representations that use space more efficiently. However, compact and cryptic representations often require considerable decoding effort to produce the corresponding phenotype. This is the classic space-time tradeoff inherent in algorithm design. Hence, the benefits offered by a compact representation must be evaluated in light of the increased decoding effort before one representation can be declared preferable to another.

4. Scalability: Several notions of scalability are of interest. For the time being we restrict our attention to changes in the size of the phenotype, measured in terms of the numbers of units, connections, or modules. This change in the size of the phenotype manifests itself as a change in the size of the encoding (the space needed to store the genotype) and a corresponding change in decoding time. We can characterize the relationship in terms of the asymptotic order-of-growth notation commonly used in analyzing computer algorithms, $O(\cdot)$. For instance, let $n_{N,C} \in A$ be a network (phenotype) with $N$ units and $C$ connections (the actual connectivity pattern does not really matter in this example). We say that the representation is $O(K)$-size-scalable with respect to units if the addition of one unit to the phenotype $n_{N,C}$ requires an increase
in the size of the corresponding genotype by $O(K)$, where $K$ is some function of $N$ and $C$. For instance, if a given representation is $O(N^2)$-size-scalable with respect to units, then the addition of one unit to the phenotype increases the size of the genotype by $O(N^2)$. (As a concrete example, the matrix encoding of Section 1.6 stores $N(N+1)$ bits, so adding one unit grows the genotype by $(N+1)(N+2) - N(N+1) = 2(N+1)$ bits, making it $O(N)$-size-scalable with respect to units.) Size-scalability of encodings with respect to connections, modules, etc., can be similarly defined. The representation is said to be $O(K)$-time-scalable with respect to units if the time taken for decoding the genotype for $n_{N+1,C}$ exceeds that used for $n_{N,C}$ by no more than $O(K)$. Similarly, time-scalability with respect to the number of connections, modules, etc., can also be defined. Scalability is central to understanding the space-time consequences of using a particular genetic representation scheme in different contexts. In conjunction with completeness and compactness, scalability can be used effectively to characterize genetic representations.

5. Multiplicity: A representation $R$ is said to exhibit genotypic multiplicity if multiple genotypes decode to an identical phenotype. In other words, the decoding function is a many-to-one mapping from the space of genotypes to the corresponding phenotypic space.
$(\exists n \in P_R)\;(|\{ g \in G_R \mid (p = D(g, E_D)) \wedge (n = L(p, E_L)) \}| > 1)$
Genotypic multiplicity may result from a variety of sources, including the encoding and decoding mechanisms. If a genetic representation has the property of genotypic multiplicity, it is possible that multiple genotypes decode to the same solution phenotype. In such cases, if the density of solutions is also high, then a large fraction of the genotypic space corresponds to potential solutions. This makes the evolutionary search procedure very effective.

A representation $R$ is said to exhibit phenotypic multiplicity if different instances of the same genotype can decode to different phenotypes. In other words, the decoding function is a one-to-many mapping of genotypes into phenotypes.
$(\exists g_1, g_2 \in G_R)\,[(p_1 = D(g_1, E_D)) \wedge (n_1 = L(p_1, E_L)) \wedge (p_2 = D(g_2, E_D)) \wedge (n_2 = L(p_2, E_L)) \wedge (g_1 = g_2) \wedge (n_1 \neq n_2)]$
Phenotypic multiplicity may result from several factors, including the effects of the environment, learning, or stochastic aspects of the decoding process. If the density of solutions is low, then the property of phenotypic multiplicity increases the possibility of decoding to a solution phenotype.

6. Ontogenetic Plasticity: A representation $R$ exhibits ontogenetic plasticity if the determination of the phenotype corresponding to a given
genotype is influenced by the environment. This may happen as a result of either environment-sensitive developmental processes (in which case $E_D \neq \emptyset$) or learning processes (in which case $L \neq \emptyset$). Ontogenetic plasticity is a useful property for constraining or modifying the decoding process based on the dictates of the application domain. For instance, if one is evolving networks for a pattern classification problem, the search for a solution network can be dramatically enhanced by utilizing a supervised learning algorithm to train the individual phenotypes in the population. However, if training examples are not available to permit supervised learning, one will have to be content with a purely evolutionary search.

7. Modularity: Gruau's notion of modularity [28] is as follows: suppose a network $n_1$ includes several instances of a subnetwork $n_2$; then the encoding (genotype) of $n_1$ is modular if it codes for $n_2$ only once, with instructions to copy it that are understood by the decoding process. Modularity is closely tied to the existence of organized structure or regularity in the phenotype that can be concisely expressed in the genotype in a form usable by the decoding process. Other notions of modularity, dealing with functional modules, recursively defined modules, etc., are also worth exploring. It can be observed that the property of modularity automatically results in more compact genetic encodings and a potential lack of redundancy (described below). In modular representations, any change in the genotypic encoding of a module, whether due to genetic influences or errors, affects all instances of the module in the phenotype. Non-modular representations, on the other hand, are resistant to such complete alterations. It is hard to decide a priori which scenario is better, since modular representations benefit from benign changes while non-modular representations are more robust to deleterious ones.

8. Redundancy: Redundancy can manifest itself at various levels and in different forms in an evolutionary system. Redundancy often contributes to the robustness of the system in the face of failure of components or processes. For instance, if the reproduction and/or decoding processes are error-prone, an evolutionary system can benefit from genotypic redundancy (e.g., the genotype contains redundant genes) or decoding redundancy (e.g., the decoding process reads the genotype more than once). If the phenotype is prone to failure of components (e.g., units, connections, sub-networks), the system can benefit from phenotypic redundancy. Phenotypic redundancy can be either topological (e.g., multiple identical units, connections, etc.) or functional (e.g., dissimilar units, connections, etc., that somehow impart the same function).
It is worth noting that genotypic redundancy does not necessarily imply phenotypic redundancy, and vice versa (depending on the nature of the decoding process). This simply reiterates the importance of examining the entire representation (encoding as well as decoding) when defining properties of evolutionary systems. Also note that there are many ways to realize both genotypic and phenotypic redundancy: by replication of identical components (structural redundancy), by replication of functionally identical units, or by building in modules or processes that can dynamically restructure themselves when faced with the failure of components [84].

9. Complexity: Complexity is perhaps one of the most important properties of any evolutionary system. However, it is rather difficult to characterize satisfactorily using any single definition. It is probably best to think of complexity using several different notions, including the structural complexity of genotypes, decoding complexity, the computational (space/time) complexity of each of the components of the system (including the decoding of genotypes, fitness evaluation, and reproduction), and perhaps even other measures inspired by information theory (e.g., entropy or Kolmogorov complexity) [49]. Although it is clear that one would like to use a genetic representation that leads to lower system complexities, the many interacting elements of the evolutionary system, the properties of genetic representations, and the existence of many different kinds of complexity make it hard to arrive at a single scalar measure that would satisfy all. Needless to say, the characterization of complexity remains subject to the user's preferences and the dictates of the application problem.

This list of properties, although by no means complete, is nevertheless relevant to an operationally useful characterization of evolutionary systems in general, and of the design of neural architectures in particular. Table 1.1 illustrates a characterization of the evolutionary system proposed by Miller et al. that was described in Section 1.6.
Table 1.1
Properties of the genetic representation used by Miller et al.

Property                          Satisfied?   Comments
Completeness                      yes          With respect to feed-forward networks.
Closure                           no           Invalid networks can result.
Topological Compactness           yes          Determined by back-propagation.
Functional Compactness            yes          Also possible.
Space Scalability                 yes          O(N) with respect to units.
Time Scalability                  yes          O(N) with respect to units.
Genotypic Multiplicity            no           No genotypic multiplicity.
Phenotypic Multiplicity           yes          Dictated by back-propagation.
Ontogenetic Plasticity            yes          Back-propagation used for training.
Modularity                        yes          Genotype only specifies connections.
Genotypic/Decoding Redundancy     yes          One gene for each connection.
Phenotypic Redundancy             yes          Units and modules, but not connections.
Space Complexity                  -            Dictated by genotype size.
Time Complexity                   -            Dictated by GA and back-propagation.
1.8 Summary

In this chapter we have introduced the evolutionary approach to the synthesis of agent programs in general, and of artificial neural networks in particular. That evolution is a powerful and, more importantly, aptly suited design approach for this undertaking will be amply demonstrated in the chapters to follow. Since the efficiency and efficacy of any evolutionary design system is critically governed by the encoding mechanism chosen for specifying the genotypes and the decoding mechanism for transforming them into phenotypes, extreme care must be taken to ensure that these two mechanisms are designed with the application problem in mind. To aid this process, in this chapter we identified and formalized a number of properties of such genetic representations. To the extent possible, we have tried to characterize each property in precise terms. This characterization of the properties of genetic representations will hopefully help in the rational choice of genetic representations for different applications.

For instance, suppose we need to design neurocontrollers for robots that have to operate in hazardous and a priori unknown environments. Examples of such applications include the exploration of unknown terrains, nuclear waste cleanup, and space exploration. Since robots in such environments are required to plan and execute sequences of actions (where each action in a sequence may depend on previous actions performed as well as on the sensory inputs), a recurrent neural network is probably needed. Further, if the system is to be used to design robots capable of functioning in different, a priori unknown environments, the robot controllers must have ontogenetic plasticity, i.e., the robots must be capable of learning from their experiences in the environment. The hazardous nature (e.g., in nuclear waste cleanup) or remoteness of the environment (e.g., in the case of robots used to explore distant planets) makes it desirable that the controllers operate robustly in the face of component failures, which calls for phenotypic redundancy of some form (e.g., duplication of
units, links, or modules of the neurocontroller). In addition, implementation technology and cost considerations might impose additional constraints on the design of the controller. For instance, hardware realization using current VLSI technology would benefit from locally connected, modular networks built from simple processors. Also, extended periods of autonomous operation might require designs that are efficient in terms of power consumption. In order to design a robot controller satisfying these multiple performance constraints, one might resort to an evolutionary design approach. In such cases, a number of these constraints translate into the properties identified in Section 1.7. Using them, one can choose an appropriate genetic representation with which to evolve the desired robot behaviors. Elsewhere we have demonstrated an evolutionary approach to the synthesis of robot behaviors for a box-pushing task, in which we chose a genetic representation based on the properties identified in this chapter [4].

The remaining chapters in this volume address fundamental concerns and demonstrate successful applications of evolutionary search in the synthesis of intelligent agent designs and behaviors. The chapters are authored by prominent researchers, each an authority in his/her area of expertise. In this sense, this compilation presents a unique snapshot of cutting-edge research from leading researchers around the world.

References

[1] D. H. Ackley and M. L. Littman. Interactions between learning and evolution. In Proceedings of the Second International Conference on Artificial Life, pages 487–509, 1991.
[2] D. Anand and R. Zmood. Introduction to Control Systems. Butterworth-Heinemann, Oxford, 1995.
[3] T. Bäck, G. Rudolph, and H.-P. Schwefel. Evolutionary programming and evolution strategies: Similarities and differences. In Proceedings of the Second Annual Conference on Evolutionary Programming, pages 11–22, 1993.
[4] K. Balakrishnan. Biologically Inspired Computational Structures and Processes for Autonomous Agents and Robots. PhD thesis, Department of Computer Science, Iowa State University, Ames, IA, 1998.
[5] K. Balakrishnan and V. Honavar. Evolutionary design of neural architectures - a preliminary taxonomy and guide to literature. Technical Report CS TR 95-01, Department of Computer Science, Iowa State University, Ames, IA, 1995.
[6] K. Balakrishnan and V. Honavar. Properties of genetic representations of neural architectures. In Proceedings of the World Congress on Neural Networks, pages 807–813, 1995.
[7] J. Bradshaw, editor. Software Agents. MIT Press, Cambridge, MA, 1997.
[8] F. Cecconi, F. Menczer, and R. Belew. Maturation and evolution of imitative learning in artificial organisms. Adaptive Behavior, 4(1):179–198, 1995.
[9] D. J. Chalmers. The evolution of learning: An experiment in genetic connectionism. In
Proceedings of the 1990 Connectionist Models Summer School, pages 81–90, 1990.
[10] P. Churchland and T. Sejnowski. The Computational Brain. MIT Press, Cambridge, MA, 1992.
[11] R. Collins and D. Jefferson. An artificial neural network representation for artificial organisms. In Proceedings of the Conference on Parallel Problem Solving from Nature, pages 259–263, 1990.
[12] R. Collins and D. Jefferson. Antfarm: Towards simulated evolution. In Proceedings of the Second International Conference on Artificial Life, pages 579–601, 1991.
[13] M. Colombetti and M. Dorigo. Learning to control an autonomous robot by distributed genetic algorithms. In From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, 1992.
[14] I. Cox and G. Wilfong, editors. Autonomous Robot Vehicles. Springer-Verlag, New York, NY, 1990.
[15] J. Dayhoff. Neural Network Architectures: An Introduction. Van Nostrand Reinhold, New York, 1990.
[16] T. Dean, J. Allen, and Y. Aloimonos. Artificial Intelligence - Theory and Practice. Benjamin Cummings, Redwood City, CA, 1995.
[17] J. Durkin. Expert Systems - Design and Development. Macmillan, New York, NY, 1994.
[18] H. Everett. Sensors for Mobile Robots: Theory and Application. A. K. Peters Ltd, Wellesley, MA, 1995.
[19] D. Floreano and F. Mondada. Automatic creation of an autonomous agent: Genetic evolution of a neural-network driven robot. In From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, pages 421–430, 1994.
[20] D. Fogel. Asymptotic convergence properties of genetic algorithms and evolutionary programming: Analysis and experiments. Cybernetics and Systems: An International Journal, 25:389–407, 1994.
[21] C. Fonseca and P. Fleming. An overview of evolutionary algorithms in multi-objective optimization. Evolutionary Computation, 3(1):1–16, 1995.
[22] S. Gallant. Neural Network Learning and Expert Systems. MIT Press, Cambridge, MA, 1993.
[23] C. Gallistel. The Organization of Learning. MIT Press, Cambridge, MA, 1990.
[24] M. Ginsberg. Essentials of Artificial Intelligence. Morgan Kaufmann, San Mateo, CA, 1993.
[25] D. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison Wesley, Reading, MA, 1989.
[26] S. Goonatilake and S. Khebbal, editors. Intelligent Hybrid Systems. John Wiley, West Sussex, UK, 1995.
[27] J. Grier and T. Burk. Biology of Animal Behavior. Mosby-Year Book, New York, NY, 2nd edition, 1992.
[28] F. Gruau. Genetic micro programming of neural networks. In K. Kinnear, editor, Advances in Genetic Programming. MIT Press, Cambridge, MA, 1994.
[29] I. Harvey, P. Husbands, and D. Cliff. Seeing the light: Artificial evolution, real vision. In From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, 1994.
[30] M. H. Hassoun. Fundamentals of Artificial Neural Networks. MIT Press, Cambridge, MA, 1995.
[31] S. Haykin. Neural Networks. Macmillan, New York, NY, 1994.
[32] J. Hertz, A. Krogh, and R. Palmer. Introduction to the Theory of Neural Computation. Addison Wesley, Redwood City, CA, 1991.
[33] J. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan Press,
Ann Arbor, 1975.
[34] V. Honavar. Generative Learning Structures and Processes for Generalized Connectionist Networks. PhD thesis, Department of Computer Science, University of Wisconsin, Madison, WI, 1990.
[35] V. Honavar. Toward learning systems that integrate different strategies and representations. In V. Honavar and L. Uhr, editors, Artificial Intelligence and Neural Networks: Steps Toward Principled Integration, pages 561–580. Academic Press, San Diego, CA, 1994.
[36] V. Honavar. Intelligent agents. In J. Williams and K. Sochats, editors, Encyclopedia of Information Technology. Marcel Dekker, New York, NY, 1998.
[37] V. Honavar and L. Uhr. Generative learning structures and processes for generalized connectionist networks. Information Sciences, 70:75–108, 1993.
[38] V. Honavar and L. Uhr, editors. Artificial Intelligence and Neural Networks - Steps toward Principled Integration. Academic Press, San Diego, CA, 1994.
[39] J. Horn and N. Nafpliotis. Multiobjective optimization using the niched Pareto genetic algorithm. IlliGAL Technical Report 93005, University of Illinois, Urbana-Champaign, IL, 1993.
[40] J. Kolodner. Case-Based Reasoning. Morgan Kaufmann, San Mateo, CA, 1993.
[41] J. Koza. Genetic evolution and co-evolution of computer programs. In Proceedings of the Second International Conference on Artificial Life, pages 603–629, 1991.
[42] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, 1992.
[43] S. Y. Kung. Digital Neural Networks. Prentice Hall, New York, NY, 1993.
[44] P. Langley. Elements of Machine Learning. Morgan Kaufmann, San Mateo, CA, 1995.
[45] D. Levine. Introduction to Neural and Cognitive Modeling. Lawrence Erlbaum Associates, Hillsdale, NJ, 1991.
[46] F. Lewis, C. Abdallah, and D. Dawson, editors. Control of Robot Manipulators. Macmillan, New York, NY, 1993.
[47] H. Lewis and C. Papadimitriou. Elements of the Theory of Computation. Prentice Hall, Englewood Cliffs, NJ, 1981.
[48] M. Lewis, A. Fagg, and A. Solidum. Genetic programming approach to the construction of a neural network for control of a walking robot. In Proceedings of the IEEE International Conference on Robotics and Automation, 1992.
[49] M. Li and P. Vitanyi. An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, New York, NY, 1997.
[50] H. Lund, J. Hallam, and W.-P. Lee. Evolving robot morphology. In Proceedings of the IEEE Fourth International Conference on Evolutionary Computation, 1997.
[51] N. Mackintosh. Conditioning and Associative Learning. Clarendon, New York, NY, 1983.
[52] D. McFarland. Animal Behavior. Longman Scientific and Technical, Essex, England, 1993.
[53] F. Menczer and R. Belew. Evolving sensors in environments of controlled complexity. In Proceedings of the Fourth International Conference on Artificial Life, 1994.
[54] O. Miglino, K. Nafasi, and C. Taylor. Selection for wandering behavior in a small robot. Artificial Life, 2(1):101–116, 1994.
[55] G. Miller, P. Todd, and S. Hegde. Designing neural networks using genetic algorithms. In Proceedings of the Third International Conference on Genetic Algorithms, pages 379–384, 1989.
[56] M. Minsky. Computation: Finite and Infinite Machines. Prentice Hall, Englewood Cliffs, NJ, 1967.
[57] M. Mitchell. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, 1996.
[58] T. Mitchell. Machine Learning. McGraw Hill, New York, NY, 1997.
[59] S. Nolfi, J. Elman, and D. Parisi. Learning and evolution in neural networks. Adaptive
Behavior, 3(1):5–28, 1994.
[60] H. Nwana. Software agents: An overview. Knowledge Engineering Review, 11(3), 1996.
[61] J. O'Keefe and L. Nadel. The Hippocampus as a Cognitive Map. Clarendon, Oxford, UK, 1978.
[62] O. Omidvar and P. van der Smagt, editors. Neural Systems for Robotics. Academic Press, San Diego, CA, 1997.
[63] R. Parekh. Constructive Learning: Inducing Grammars and Neural Networks. PhD thesis, Department of Computer Science, Iowa State University, Ames, IA, 1998.
[64] M. Patel. Concept formation: A complex adaptive approach. Theoria, (20):89–108, 1994.
[65] M. Patel. Situation assessment and adaptive learning: Theoretical and experimental issues. In Proceedings of the Second International Round-Table on Abstract Intelligent Agents, 1994.
[66] M. Patel. Constraints on task and search complexity in GA+NN models of learning and adaptive behaviour. In T. Fogarty, editor, Evolutionary Computing 2, pages 200–224. Springer-Verlag, Berlin, 1995.
[67] M. Patel. Heuristic constraints on search complexity for multi-modal non-optimal models. In IEEE International Conference on Evolutionary Computation, 1995.
[68] M. Patel, M. Colombetti, and M. Dorigo. Evolutionary learning for intelligent automation: A case study. Journal of Intelligent Automation and Soft Computing, 1(1):29–42, 1995.
[69] M. Patel and M. Dorigo. Adaptive learning of a robot arm. In Selected Papers from the AISB Workshop on Evolutionary Computation, pages 180–194, 1994.
[70] M. Patel and V. Maniezzo. NNs and GAs: Evolving co-operative behaviour in adaptive learning agents. In IEEE World Congress on Computational Intelligence, 1994.
[71] C. Reynolds. Evolution of corridor following behavior in a noisy world. In From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, 1994.
[72] C. Reynolds. Evolution of obstacle avoidance behavior: Using noise to promote robust solutions. In K. Kinnear, editor, Advances in Genetic Programming. MIT Press, Cambridge, MA, 1994.
[73] E. Rich and K. Knight. Artificial Intelligence. McGraw Hill, New York, NY, 1991.
[74] B. Ripley. Pattern Recognition and Neural Networks. Cambridge University Press, New York, NY, 1996.
[75] D. E. Rumelhart and J. L. McClelland, editors. Parallel Distributed Processing, Vols. I-II. MIT Press, Cambridge, MA, 1986.
[76] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1995.
[77] R. Salomon. Improved convergence rate of back-propagation with dynamic adaptation of the learning rate. In Proceedings of the First International Conference on Parallel Problem Solving from Nature, pages 269–273, 1991.
[78] H.-P. Schwefel, editor. Numerical Optimization of Computer Models. John Wiley, Chichester, UK, 1981.
[79] H.-P. Schwefel. Collective phenomena in evolutionary systems. In Proceedings of the 31st Annual Meeting of the International Society for General System Research, pages 1025–1033, 1987.
[80] R. Sun and L. Bookman, editors. Computational Architectures Integrating Symbolic and Neural Processes. Kluwer Academic, New York, NY, 1994.
[81] S. L. Tanimoto. Elements of Artificial Intelligence Using Common Lisp. Computer Science Press, New York, NY, 1995.
[82] E. Tolman. Cognitive maps in rats and men. Psychological Review, 55:189–208, 1948.
[83] L. Uhr and V. Honavar. Introduction. In V. Honavar and L. Uhr, editors, Artificial
Intelligence and Neural Networks: Steps Toward Principled Integration. Academic Press, San Diego, CA, 1994.
[84] J. von Neumann. Probabilistic logics and the synthesis of reliable organisms from unreliable components. In C. Shannon and J. McCarthy, editors, Automata Studies, pages 43–98. Princeton University Press, Princeton, NJ, 1956.
[85] J. Walker. Evolution of simple virtual robots using genetic algorithms. Master's thesis, Department of Mechanical Engineering, Iowa State University, Ames, IA, 1995.
[86] P. Winston. Artificial Intelligence. Addison Wesley, New York, NY, 1992.
[87] M. Wooldridge and N. Jennings. Intelligent agents: Theory and practice. Knowledge Engineering Review, 10(2):115–152, 1995.
[88] B. Yamauchi and R. Beer. Integrating reactive, sequential, and learning behavior using dynamical neural networks. In From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, pages 382–391, 1994.
[89] J. Yang, R. Parekh, and V. Honavar. DistAl: An inter-pattern distance-based constructive learning algorithm. In Proceedings of the International Joint Conference on Neural Networks, Anchorage, Alaska, 1998. To appear.
[90] A. Zalzala and A. Morris, editors. Neural Networks for Robotic Control: Theory and Applications. Ellis Horwood, New York, NY, 1996.
2 Cellular Encoding for Interactive Evolutionary Robotics
Frédéric Gruau and Kameel Quatramaran
This work reports experiments in interactive evolutionary robotics. The goal is to evolve an Artificial Neural Network (ANN) to control the locomotion of an 8-legged robot. The ANNs are encoded using a cellular developmental process called cellular encoding. In a previous work, similar experiments were carried out successfully on a simulated robot; they took, however, around 1,000,000 different ANN evaluations. In this work the fitness is determined on a real robot, and no more than a few hundred evaluations can be performed. Various ideas were implemented so as to decrease the required number of evaluations from 1,000,000 to 200. First, we used cell cloning and link typing. Second, we did as many things as possible interactively: interactive problem decomposition, interactive syntactic constraints, interactive fitness. More precisely: 1. A modular design was chosen, in which a controller for an individual leg, with a precise neuronal interface, was developed. 2. Syntactic constraints were used to promote useful building blocks and to impose an 8-fold symmetry. 3. We determined the fitness interactively, by hand, which allowed us to reward features that would otherwise be very difficult to locate automatically. Interactive evolutionary robotics turns out to be quite successful: in the first bug-free run, a global locomotion controller was evolved that is faster than a programmed controller.
2.1 Introduction

The Motivation for Interactive Evolutionary Algorithms

In [93] Dave Cliff, Inman Harvey, and Phil Husbands of the University of Sussex lay down a chart for the development of cognitive architectures, or control systems, for situated autonomous agents. They claim that the design by hand of control systems capable of complex sensorimotor processing is likely to become prohibitively difficult as the complexity increases, and they advocate the use of Evolutionary Algorithms (EAs) to evolve recurrent dynamic neural networks as a potentially efficient engineering method. Our goal is to present a concrete proof of this claim by showing an example of a big (> 16 units) control system generated using an EA. The difference between our work and what we call the "Sussex" approach is that we consider EAs as only one element of the ANN design process. An engineering method is
something used to aid problem solving, which may be combined with whatever additional symbolic knowledge one has about a given problem. We would never expect EAs to do everything from scratch. Our view is that EAs should be used interactively in the process of ANN design, not as a magic wand that will solve all problems. In contrast with this point of view, Cliff, Harvey, and Husbands seem to rely more on EAs. In [93] they use a direct coding of the ANN. They find ANNs without particular regularities, although they acknowledge that a coding which could generate repeated structure would be more appropriate. The advantage of the Sussex approach is that it is pure machine learning, without human intervention. In contrast, we use EAs interactively in the ANN design. This is similar to supervised machine learning.

How Do We Supervise the Evolutionary Algorithm?

The key element that enables us to help the EA with symbolic knowledge is the way we encode ANNs. What is coded is a developmental process: how a cell divides and divides again and generates a graph of interconnected cells that finally becomes an ANN. The development is coded on a tree. We help the EA by providing syntactic constraints, a "grammar" which restricts the set of possible trees to those having the right syntax. This is similar to a program in C needing to satisfy the C syntax. Syntactic constraints impose a prior probability distribution on the space of ANNs. One advantage of our approach is that we are able to study the structure of our ANNs, identify some regularities, and help them emerge by choosing the appropriate syntactic constraints. We don't want to use neurons as a sort of densely connected neural soup, a raw computational power which has to be shaped by evolutionary computation. Instead we want a sparsely connected structure, with hierarchy and symmetries, where it is possible to analyze what's going on just by looking at the architecture. We think one needs a lot of faith to believe that EAs can quickly generate complex, highly structured ANNs from scratch. Perhaps nature has proven it is possible, but it took a lot of time and a huge number of individuals. Our approach is to use symbolic knowledge whenever it is easy and simple. By symbolic we mean things which can be expressed by syntactic constraints, which are formally BNF grammars. By easy we mean symmetries that anybody can perceive. We see symbols as a general format that can define the symmetries of the problem, decompose a problem into subproblems, or provide building blocks. The discovery of such things is time-expensive to automate with evolutionary computation, but they are easily perceived by
the human eye. Any non-scientific person can point out the symmetries of the 8-legged robot, and thus build the "symmetry format". We view evolutionary computation of ANNs as a "design amplifier" that can "ground" this symmetry format in the real world. This may be another way to address the well-known symbol grounding problem.

The Challenge of This Work

It is possible to automate problem decomposition and symmetry detection. First of all, we should say that we still do automatic problem decomposition in this work: the EA automatically decomposes the problem of generating 8 coupled oscillators into the problem of generating a singleton and putting together copies of it. However, we do give some information about the decomposition, since we provide the number 8. In [95] we showed that the EA could alone decompose the 6-legged locomotion problem into the sub-problem of generating a sub-ANN for controlling one leg and putting together six copies of that sub-ANN. There we did not give the number 6. We also needed a powerful 32-processor iPSC/860 parallel machine, and over 1,000,000 evaluations. We are now working with a real robot, where each fitness evaluation takes a few minutes and is done by hand. The challenge of this paper was to solve the same problem with only a few hundred evaluations. At the outset, this did not seem promising. Four ideas made it possible:
- We enhanced the developmental process by adding cell cloning and link typing.
- Using syntactic constraints, we forced three cloning divisions at the root of the cellular code, so as to be sure that an 8-fold symmetric network would develop.
- Each leg has two degrees of freedom, and two output units are needed to control them. We first evolved a leg controller that reduces those two outputs to a single output commanding the forward stroke and the return stroke. In this way the problem is simplified to the task of generating 8 coupled oscillators.
- We determined the fitness by hand. We visually monitored the EA so as to reward any potentially interesting feature that would otherwise have been very difficult to detect automatically. We steered the EA starting from generating easy oscillatory behavior, and then evolving the leg coupling.

The paper presents what cellular encoding is, cell cloning and link typing, how we use syntactic constraints, experiments done with only one leg, and then
Automatically generated drawings represent the different ANNs that were found at different stages of the evolution. We discuss the behavior of the robot and try to explain it, based on an analysis of the architecture, whenever possible. The description tries to render how it feels to breed robots.

2.2 Review of Cellular Encoding

Cellular encoding is a language for local graph transformations that controls the division of cells which grow into an Artificial Neural Network (ANN) [94]. Other kinds of developmental processes have been proposed in the literature; a good review can be found in [98]. Many schemes have been proposed partly with the goal of modeling biological reality. Cellular encoding was created with the sole purpose of computer problem solving, and its efficiency has been shown on a range of different problems; a review can be found in [96]. We explain the basic version of cellular encoding in this section, as shown in figure 2.1.

A cell has an input site and an output site, and can be linked to other cells with directed and ordered links. A cell or a link also possesses a list of internal registers that represent local memory. The registers are initialized with a default value, and are duplicated when a cell division occurs. The registers contain neuron attributes such as the weights and the threshold value. The graph transformations can be classified into cell divisions and modifications of cell and link registers.
Figure 2.1 Illustration of the main types of division: SEQ, PAR, FULL, CPO, CPI.
A cell division replaces one cell, called the parent cell, by two cells, called child cells. A cell division must specify how the two child cells will be linked. For practical purposes, we give a name to each graph transformation; these names in turn are manipulated by the genetic algorithm. In the sequential division, denoted SEQ, the first child cell inherits the input links, the second child cell inherits the output links, and the first child cell is connected to the second child cell.
MIT Press Math6X9/2001/02/08:14:23
Page 33
Cellular Encoding for Interactive Evolutionary Robotics
33
In the parallel division, denoted PAR, both child cells inherit both the input and output links from the parent cell; hence each link is duplicated. The child cells are not connected. In general, a particular cell division is specified by indicating, for each child cell, which links are inherited from the mother cell. The FULL division is the sequential and the parallel division combined: all the links are duplicated, and the two child cells are interconnected with two links, one for each direction. This division can generate completely connected sub-ANNs. The CPO division (CoPy Output) is a sequential division, except that the output links are duplicated in both child cells. Similarly, the CPI division (CoPy Input) is a sequential division, except that the input links are duplicated.
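To make the division semantics concrete, the following minimal Python sketch mimics SEQ and PAR on a toy cell graph. It is our own illustration, not the authors' implementation; the class and function names are hypothetical, and the rewiring of the neighbor cells' own link lists is omitted for brevity.

# Toy illustration of SEQ and PAR divisions (hypothetical names, not the
# authors' code). Links are kept as ordered lists of neighbor cells.
class Cell:
    def __init__(self, inputs=(), outputs=(), registers=None):
        self.inputs = list(inputs)      # ordered input links
        self.outputs = list(outputs)    # ordered output links
        # registers (weights, bias, time constant...) are duplicated
        # when the cell divides
        self.registers = dict(registers or {"bias": 0, "tau": 1})

def seq(parent):
    # SEQ: child1 inherits the input links, child2 the output links,
    # and child1 is connected to child2
    child1 = Cell(parent.inputs, [], parent.registers)
    child2 = Cell([], parent.outputs, parent.registers)
    child1.outputs.append(child2)
    child2.inputs.append(child1)
    return child1, child2

def par(parent):
    # PAR: both children inherit all input and output links, so every
    # link is duplicated; the children are not connected
    child1 = Cell(parent.inputs, parent.outputs, parent.registers)
    child2 = Cell(parent.inputs, parent.outputs, parent.registers)
    return child1, child2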
Before describing the instructions used to modify cell registers, it is useful to describe how an ANN unit performs a computation. The default value of the weights is 1, and the bias is 0. The default transfer function is the identity. Each neuron computes the weighted sum of its inputs, applies the transfer function to obtain s, and updates its activity a using the equation a = a + (s − a)/τ, where τ is the time constant of the neuron. See figures 2.5, 2.10, and 2.12 for examples of neural networks. The ANN computation is performed with integers; the activity is coded using 12 bits, so that 4096 corresponds to an activity of 1. The instruction SBIAS x sets the bias to x/4096. The instruction DELTAT sets the time constant of the neuron, and SACT sets the initial activity of the neuron. The instruction STEP (resp. LINEAR) sets the transfer function to the clipped linear function between −1 and +1 (resp. to the identity function). The instruction PI makes the unit multiply all its inputs together. The WEIGHT instruction is used to modify link registers. It has k integer parameters, each one encoding a real number: the real value equals the integer, taken between −255 and 256, divided by 256. The parameters set the weights of the first k input links. If a neuron happens to have more than k input links, the weights of the supernumerary input links are set by default to 256 (i.e., 256/256 = 1).

The cellular code is a grammar-tree with nodes labeled by names of graph transformations. Each cell carries a duplicate copy of the grammar tree and has an internal register called a reading head that points to a particular position of the grammar tree. At each step of development, each cell executes the graph transformation pointed to by its reading head and then advances the reading head to the left or to the right subtree. After cells terminate development, they lose their reading heads and become neurons.
The order in which cells execute graph transformations is determined as follows: once a cell has executed its graph transformation, it enters a First In First Out (FIFO) queue, and the next cell to execute is the head of the queue. If the cell divides, the child which reads the left subtree enters the FIFO queue first. This order of execution tries to model what would happen if cells were active in parallel: it ensures that a cell cannot be active twice while another cell has not been active at all. The WAIT instruction makes a cell wait for a specified number of steps, and makes it possible to encode a particular order of execution. We also used the control program symbol PROGN: it has an arbitrary number of subtrees, and all the subtrees are executed one after the other, starting from subtree number one.
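The execution order lends itself to a small, self-contained sketch. The toy model below is our own simplification (a nested-tuple grammar tree rather than the real cellular code); it shows only the FIFO discipline in which the child reading the left subtree enters the queue first.

from collections import deque

# Toy model of the development scheduler: a grammar tree is a nested
# tuple (division, left_subtree, right_subtree) or the terminal "END".
def develop(code):
    neurons = []
    queue = deque([code])            # one entry per living cell
    while queue:
        node = queue.popleft()       # next cell to execute
        if node == "END":
            neurons.append("neuron")  # the cell loses its reading head
            continue
        division, left, right = node
        queue.append(left)           # child reading the left subtree first
        queue.append(right)
    return neurons

# Three PAR divisions in a row produce four neurons:
tree = ("PAR", ("PAR", "END", "END"), ("PAR", "END", "END"))
assert len(develop(tree)) == 4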
Consider a control problem where the number of control variables is n and the number of sensors is p. We want to solve this control problem using an ANN with p input units and n output units. There are two possibilities to generate those I/O units. The first method is to impose the I/O units using appropriate syntactic constraints. At the beginning of the development, the initial graph of cells consists of p input units connected to a reading cell which is connected to n output units. The input and output units do not read any code; they are fixed during the whole development. In effect, these cells are pointers or place-holders for the inputs and outputs. The initial reading cell reads at the root of the grammar tree; it will divide according to what it reads and generate all the cells that will eventually produce the final decoded ANN. The second method, which we often prefer, is to have the EA itself find the right number of I/O units. The development starts with a single cell connected to the input pointer cell and the output pointer cell. At the end of the development, the input (resp. output) units are those which are connected to the input (resp. output) pointer cell. We let the evolutionary algorithm find the right number of input and output units by putting a term in the fitness that rewards networks with a correct number of I/O units. The problem with the first method is that it can easily generate an ANN where all the output units output the same signal, and all the inputs are just systematically summed in a weighted sum. The second method usually works better, because the EA is forced to generate a specific cellular code for each I/O unit, which specifies how it is to be connected to the rest of the ANN, and with which weights. To implement the second method we use the instruction BLOC, which blocks the development of a cell until all its input cells are neurons, and the instruction TESTIO, which compares the number of inputs to a specified integer value and sets a flag accordingly. The flag is later used to compute the fitness.
Last, the instruction CYC is used to add a recurrent link to a unit, from the output site to the input site. That unit can then perform other divisions, duplicate the recurrent link, and generate recurrent connections everywhere.

2.3 Enhancement of Cellular Encoding

We had to enhance cellular encoding with the cloning division and the use of types; we also implemented another way to obtain recurrent links. All these new elements are reported in this section. The cloning operation is really easy to implement: it is done by encapsulating a division instruction into a PROGN instruction. After the division, the two child cells only modify some registers and cut some links; then they simply go on to execute the next instruction of the PROGN, and since they both execute the same instruction, a highly symmetric ANN is generated. Figure 2.2 presents a simple example of cloning.
Figure 2.2 The cloning operation: the ANN above is developed from the code PROGN(SEQ)(PAR)(PAR)(END), which contains three clone divisions. The development takes three steps, one for each clone.
The instructions that we now present each have a version for the input links, ending with the letter 'I', and one for the output links, ending with the letter 'O'. We are now going to use another link register called the type register, which is initialized when a link is created between two child cells, and which is later used to select links for cutting, reversing, or setting the weight. We also introduce two sets of generic instructions: one to select links, and another one to set a link register. The instructions beginning with 'C' and continuing with the name of a register are used to select links.
Such an instruction selects the links whose register is equal to the value of the argument. For example, CTYPEI(1) selects all the input links whose type register is equal to 1. There is another register called NEW, a flag that is set each time a new link is created between two child cells, or a recurrent link is added. CNEWO(1) selects all the newly created links going out of the output site. The instructions beginning with 'S' and continuing with the name of a register are used to set the value of that register in some previously selected links. For example, the sequence PROGN(CNEWI(1))(STYPEI(2)) sets the type register of the newly created links from the input site to the value 2. In this work, we only use these instructions for the weights and for the type. We also use the instructions RESTRICTI and RESTRICTO, which have two arguments x and dx, and are used to reduce a list of preselected links. Say there are 10 input links whose type is 2: we can select the 5th and the 6th using the sequence PROGN(CTYPEI(2))(RESTRICTI(5)(2)).

The type register, together with the select and set instructions, can be used to encode the connections from the neurons to the input and output pointer cells. Those connections are crucial, since they determine which are the input and the output units. Using type registers, we can let each neuron individually encode whether or not it is an input or an output unit. We assign two different types, say 0 and 1, to the links that connect the ancestor cell to the input pointer cell and to the output pointer cell, respectively. We ensure that each time a cell divides, these links get duplicated, so that near the end of the development all the cells are connected to both the input and the output pointer cell. But then we ensure that each cell can potentially cut the links whose type is 0 or 1. In other words, if a cell wants to be an input unit, it just does nothing; if it does not, it has to cut its input link of type 0. Last, we use another instruction to add recurrent links: the instructions REVERSEI and REVERSEO duplicate a previously selected link from cell a to cell b, and change the direction of the duplicate, making it go from cell b to cell a.
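As a rough illustration of this select-then-set mechanism, here is a Python sketch of the CTYPE/RESTRICT/STYPE semantics as we read them; the dictionary-based link model, the function names, and the 1-indexed convention in RESTRICT are our own assumptions.

# Hypothetical link model: a link is a dict of registers such as "type".
def ctype(links, value):
    # CTYPE-style selection: keep links whose type register equals value
    return [link for link in links if link["type"] == value]

def restrict(selected, x, dx):
    # RESTRICT-style narrowing: keep dx links starting at position x
    # (1-indexed, following the example in the text)
    return selected[x - 1 : x - 1 + dx]

def stype(selected, value):
    # STYPE-style setting: write the type register of the selected links
    for link in selected:
        link["type"] = value

# PROGN(CTYPEI(2))(RESTRICTI(5)(2)) on 10 input links of type 2
# selects the 5th and the 6th:
links = [{"id": i, "type": 2} for i in range(1, 11)]
picked = restrict(ctype(links, 2), 5, 2)
assert [link["id"] for link in picked] == [5, 6]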
2.4 Syntactic Constraints

We used a BNF grammar as a general technique to specify both a subset of syntactically correct grammar-trees and the underlying data structures. The default data structure is a tree; when the data structure is not a tree, it can be a list, a set, or an integer.
By using syntactic constraints on the trees produced by the BNF grammar, a recursive nonterminal of the tree type can be associated with a range that specifies a lower and an upper bound on the number of recursive rewritings. In our experiments, this is used to set a lower bound m and an upper bound M on the number of neurons in the final neural network architecture. For the list and set data structures, we set a range on the number of elements in these structures. For the integer data structure, we set a lower bound and an upper bound on a random integer value. The list and set data structures are described by a set of subtrees called the "elements." The list data structure is used to store a vector of subtrees: each of the subtrees is derived using one of the elements, and two subtrees may be derived using the same element. The set data structure is like the list data structure, except that each of the subtrees must be derived using a different element. So, for example, the rule

<A> ::= (list[2..2] of (0)(1))

generates the trees ((0)(0)), ((0)(1)), ((1)(0)), ((1)(1)). The rule

<A> ::= (set[2..2] of (0)(1))

generates only the trees ((0)(1)) and ((1)(0)).
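Since the expansion rules for list and set are easy to misread, this small Python check (our own illustration, not part of the system) enumerates the derivations of the two example rules above.

from itertools import product, permutations

# list[n..n] of e1..ek: draw n elements, repetition allowed, order matters
def list_derivations(elements, n):
    return list(product(elements, repeat=n))

# set[n..n] of e1..ek: draw n distinct elements, order still matters
def set_derivations(elements, n):
    return list(permutations(elements, n))

assert list_derivations(["0", "1"], 2) == [
    ("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")]
assert set_derivations(["0", "1"], 2) == [("0", "1"), ("1", "0")]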
Figure 2.3 shows a simple example of syntactic constraints used to restrict the space of possible solutions.

<tree> [0..8];
<tree> ::= ( PAR(<tree>)(<tree>) ) | ( CPO(<tree>)(<tree>) ) | ( SEQ(<tree>)(<tree>) ) | ( <unit> )
<unit> ::= (PROGN : set[0..4] of (WEIGHT: list[8..8] of (integer[-255..+255])) (DELTAT(integer[1..+40])) (SBIAS(integer[-4096..+4096])) (STEP) )

Figure 2.3 Tutorial example of syntactic constraints
The nonterminal <tree> is recursive; it can be rewritten recursively between 0 and 8 times. Each time it is rewritten recursively, it generates a division and adds a new ANN unit. Thus the final number of ANN units will be between 1 and 9.
Note that in this particular case the size of the ANN is proportional to the size of the genome; therefore the constraints of the grammar in figure 2.3, which control the size of the genome, translate directly into constraints on the size of the grown ANN. The nonterminal <unit> is used to implement a subset of four possible specializations of the ANN units. The first 8 weights can be set to values between −1 and +1, the time constant can be set to a value that ranges from 1 to 40, the bias can be set to a value between −1 and +1, and the transfer function can be set to the STEP function instead of the default transfer function. Since the lower bound of the set range is 0, zero specializations can be generated, in which case the ANN unit will compute the sum of its inputs and apply the identity function. Because the upper bound of the set is 4, all four specializations can be generated; in this case, the neuron will make a weighted sum, subtract the bias, and apply the clipped linear function. If the lower and the upper bound had both been 1, then exactly one of the features would be operational. This can be used to select an instruction with a given probability: for example, the sequence PROGN: set[1..1] of (WAIT)(WAIT)(WAIT)(CYC) generates a recurrent link with a probability of 0.25.

Crossover. Crossover must be implemented such that two cellular codes that are syntactically correct produce an offspring that is also syntactically correct (i.e., that can be parsed by the BNF grammar). Each terminal of a grammar tree has a primary type: the primary label of a terminal is the name of the nonterminal that generated it. Crossover with another tree may occur only if the two root symbols of the subtrees being exchanged have the same primary label. This simple mechanism ensures the closure of the crossover operator with respect to the syntactic constraints. Crossover between two trees is the classic crossover used in Genetic Programming as advocated by Koza [99], where two subtrees are exchanged. Crossover between two integers is disabled. Crossover between two lists or two sets is implemented like crossover between bit strings, since the underlying arrangement of all these data structures is a string.
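The closure property is simple to state in code. The sketch below is our own minimal rendering of label-respecting subtree exchange, not the authors' implementation; the node layout is a hypothetical dict.

import random

# A node is {"label": generating nonterminal, "op": terminal, "children": [...]}.
def all_nodes(tree):
    result = [tree]
    for child in tree["children"]:
        result.extend(all_nodes(child))
    return result

def crossover(parent_a, parent_b, rng):
    cut_a = rng.choice(all_nodes(parent_a))
    # only subtrees generated by the same nonterminal may be exchanged,
    # so the offspring still parse under the BNF grammar
    candidates = [n for n in all_nodes(parent_b)
                  if n["label"] == cut_a["label"]]
    if not candidates:
        return                      # no legal exchange point: do nothing
    cut_b = rng.choice(candidates)
    backup = dict(cut_a)            # swap the two nodes' contents in place
    cut_a.clear()
    cut_a.update(cut_b)
    cut_b.clear()
    cut_b.update(backup)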
Mutation. To mutate one node of a tree labeled by a terminal t, we replace the subtree beginning at this node by a single node labeled with the nonterminal parent of t, and then rewrite the tree using the BNF grammar. To mutate a list, set, or array data structure, we randomly add or suppress an element. To mutate an integer, we add a random value uniformly distributed in ±max(2, (M − m)/8), where M and m are the upper and lower bounds of the specified integer range. Each time an offspring is created, all the nodes are mutated with a small probability: for tree, list, and set nodes the mutation rate is 0.05, while for integer nodes it is 0.5. Those probabilities may be reset at run time of the EA.
2.5 The Leg Controller

The Challenge

The leg does a power stroke, when it pushes on the ground to pull the body forward, and a return stroke, when it lifts the leg and brings it forward. The challenge in this first experiment was to build a good leg controller: one that does not drag the leg on the return stroke, and that starts to push on the ground right at the beginning of the power stroke. The ANN had one single input: an input of 4096 on the input unit must trigger the power stroke, and an input of 0 must trigger the return stroke. This implies the right scheduling of four different actions: when exactly the neurons responsible for lifting and swinging actually lift the leg up and down, or swing it forward and backward.

General Setting

The command of return stroke or power stroke was generated by hand during the genetic run, so as to be able to reproduce the movement whenever and as many times as desired. The EA used 20 individuals. The fitness was given according to a set of features: the highest and the lowest leg positions had to be correct; the movement of the leg had to start exactly when the signal was received; there had to be no dragging of the leg on the return stroke, so the leg had to be first lifted and then brought forward; and the leg had to rest on the floor at once on the power stroke, so the leg had to be first put on the floor and then moved backward. Each of these features determined a range of fitness; the ranges were chosen in such a way that for all the features that were checked, the intersection of the ranges was not empty. The fitness was then adjusted according to a subjective judgement. The ranges evolved during the run, so as to fit the evolution. The EA was run for around 30 generations. Fitness evaluation is much quicker than with the complete locomotion controller, because we have to watch the leg moving for only a very short period of time to be able to assess the performance. That is how we managed up to six hundred evaluations.
Syntactic Constraints Used for the Leg Controller

We now comment on the syntactic constraints used in this run, which are described in figure 2.4.

<tree> [6..20];
begin ::= (LABEL (SEQ (WAIT) (PAR (<tree>) (PROGN (PAR) (PAR (PAR (PAR (WAIT)(WAIT)) (WAIT)) (PAR (WAIT) (PAR (WAIT) (PAR (WAIT)(WAIT)))))))))
<tree> ::= (SEQ(<tree>)(<tree>)) | (PAR(<tree>)(<tree>)) | (SHARI1(<tree>)(<tree>)) | (CPI(<tree>)(<tree>)) | (CPO(<tree>)(<tree>)) | (FULL(<tree>)(<tree>)) | (<tunit>) | (<sunit>)
<tunit> ::= (PROGN (STEP) (PROGN : set[1..3] of (DELTAT (integer[1..40])) (WEIGHT: list[8..8] of (integer[-256..+256])) (SBIAS (integer[-4096..+4096]))))
<sunit> ::= (PROGN (LINEAR) (PROGN : set[2..2] of (WEIGHT: list[8..8] of (integer[-1024..+1024])) (SBIAS (integer[-4096..+4096]))))

Figure 2.4 Syntactic constraints used for the leg controller
We did not use link typing or clone instructions for the leg controller; the idea to use them came to us when we began to tackle the problem of evolving the whole locomotion controller. The non-terminal <tree> generates one neuron each time it is rewritten; since it can be rewritten between 6 and 20 times, the number of neurons will be between 7 and 21. The division instructions are classic, except for SHARI1, where the input links are shared between the two child cells: the first child gets the first input, and the second child gets the other inputs. The ANN begins by generating 14 fake output units using parallel divisions (one clone and 7 normal).
Those units reproduce the input signal; in this way, we can compare the movement of the leg whose controller is evolved with the raw signal that moves the 7 other legs. The non-terminal <tree> is rewritten between 6 and 20 times, and finally each neuron specializes either as a temporal unit (non-terminal <tunit>) or as a spatial unit (non-terminal <sunit>). The temporal units have a threshold sigmoid and a genetically determined time constant; they are used to introduce a delay in a signal. The spatial units have a linear sigmoid and a genetically determined bias; they are used to translate and multiply a signal by genetically specified constants. These two types of units are the building blocks needed to generate a fixed-length sequence of signals of different intensities and durations: the duration is controlled by the temporal units, and the intensity by the spatial units.
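To see why a temporal unit introduces a delay, consider the activity update a = a + (s − a)/τ with the integer scaling used here (4096 = activity 1). The loop below is a toy trace of that equation, our own illustration rather than the authors' code:

# Toy trace of a temporal unit: the activity relaxes toward the transfer
# function output s at a speed set by the time constant tau.
def step_unit(a, s, tau):
    return a + (s - a) // tau

a = 0
for t in range(8):
    a = step_unit(a, 4096, tau=3)   # the input just jumped to full activity
    print(t, a)                      # a: 1365, 2275, 2882, ... toward 4096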
Explanation of the Solutions

Figure 2.5 presents the leg controller found by the evolutionary algorithm, together with four different settings of the activities inside the ANN and the corresponding leg positions.

Figure 2.5 The leg controller, and four different settings of the activities inside the ANN, together with the corresponding leg positions. The default values are not represented: the default weight is 256, the default bias is 0, and the default time constant is 3. The diagonal line denotes the linear sigmoid, the staircase the step-function sigmoid.
Figure 2.6 shows its genetic code, after some obvious hand-made simplifications such as merging a tree of PROGNs.

SEQ (PROGN(STEP)(DELTAT(1)))
    (CPI (SEQ (PROGN(LINEAR)(WEIGHT(-693)(-1024)(360)(-252)(-300)(-984)(-849)(610))(SBIAS(3497)))
              (CPO (PROGN(STEP)(SACT(3458)))
                   (PROGN(STEP)(DELTAT(39)))))
         (PAR (PROGN(LINEAR)(WEIGHT(-693)(-1024)(360)(-252)(-300)(-984)(-706)(796))(SBIAS(1864)))
              (PROGN(LINEAR)(WEIGHT(-914)(40)(736)(-622)(-1024)(-984)(-706)(610))(SBIAS(-2321)))))

Figure 2.6 The genetic code of the champion leg controller
Neuron e controls the lift, neuron f controls the swing, and neuron a is the input neuron. Although the ANN is completely feed-forward, it can generate a short sequence of different leg commands: because we use neurons with time constants, different events can happen at different time steps. In Step 1, the input to the ANN is 0 and the leg is forward down; the input is then changed to 4096 to initiate the power stroke. In Step 2, we are in the middle of the power stroke. Neuron e receives an even more negative input, but this has no effect, since it was already well below zero: the leg just stays on the ground. On the other hand, neuron f is brought to a negative value, so the leg goes backward relative to the body, and since the leg is on the ground, the body moves forward. In Step 3, the power stroke is finished, and we initiate the return stroke by setting the input to 0. In Step 4, we are in the middle of the return stroke. Neuron a is 0, but neuron c has not yet had time to increase its activity, because of the time constants; therefore neuron e is positive, and the leg is up, in the middle position. When the return stroke terminates, we are back to Step 1: neuron c, being positive, forces neuron e to become negative again, and the leg is back on the ground.

Initially we wanted the return stroke to terminate with the leg up, and to bring it down on the power stroke; the controller evolved another way, and we thought it would be acceptable. We realized later that it is actually much better this way, because the robot always has its legs on the ground, except in the middle of a return stroke, so it does not often lose its balance. There is another unexpected, interesting feature of this controller. We put a safety mechanism in the robot driver, so that at each time step the leg cannot move more than a small amount, to avoid overheating the servo motors. The evolutionary algorithm exploited this feature: it generates extreme, binary leg positions, yet the leg moves continuously, because it cannot move more than the predetermined upper bound. This was totally unexpected.
2.6 The Locomotion Controller

The Challenge

We have previously evolved ANNs for a simulated 6-legged robot; see [95]. We had a powerful parallel machine, an IPSC860 with up to 32 processors, and we needed 1,000,000 ANNs to find a solution to the problem. In this study, one ANN takes a few minutes to assess, because the fitness is given manually, and it takes some time to see how interesting the controller is. Once we even spent an hour trying to figure out whether the robot was turning or going straight, because with the noise it was not clear. The challenge of this study was to help the EA search the genetic space more efficiently, and solve the problem with only a few hundred evaluations instead of one million.

General Setting

Some settings were constant over the successful runs 2 and 4, and we report them here. The way we gave the fitness was highly subjective, and changed during the run depending on how we felt the population had converged or not. We realized that the role of the fitness is not only to reward good individuals, but also to control genetic diversity. Since there is a bit of noise when two individuals are compared, the selective pressure can be controlled through the fitness. The rewarding must be done very cautiously; otherwise newly fit individuals will quickly dominate the population. We are quite happy to see a new well-performing phenotype, and are inclined to give it a good rank, to be sure that it is kept for a while. However, if we see that same individual reappear again and again, we may kill it to avoid dominance. We followed some general guidelines. The fitness was a sum of three terms, each varying between 0.01 and 0.03: the first term rewards oscillations, and discourages too slow or too quick oscillations; the second term rewards the number of legs which oscillate; and the third term rewards the correct phase between the different leg oscillators, and the correct coupling. We were wary of giving big fitnesses, and wanted to keep fitness differences within the range of the noise that exists when a Boltzmann tournament takes place, so our fitness seldom went above 0.1. We started with a population of 32 individuals, so as to be able to sample building blocks, and reduced it to 16 after 5 generations.
At the same time as we reduced the population, we lowered all the mutation rates (set, list, and tree) to 0.01, except the integer mutation rate, which was increased to 0.4. The idea was to assume that the EA had discovered the right architecture, and to concentrate the genetic search on the weights; the weights were then mutated almost one time out of two. The selective pressure was also increased: the number of individuals participating in a Boltzmann tournament was raised from 3 to 5. Those tournaments are used to delete individuals or to select mates. We no longer wanted genetic diversity, but rather genetic convergence of the architecture, similar to tuning an already working solution. We got the idea to use genetic convergence from the SAGA paradigm of Inman Harvey [97]. It is also possible to give a fitness of −1; the effect is that the individual is immediately thrown in the garbage, and not counted. We threw away the motionless individuals, and those that did not have the right number of input/output units. As a result, to generate the first 32 individuals of the initial population, we went through over one hundred evaluations.
Syntactic Constraints Used in All the Runs

The first part of the syntax remains the same in the different runs. It is represented in figure 2.7.

<begin> ::= (LABEL (SEQ (SEQ (PAR()()) (PROGN (WAIT(4)) (CTYPEI(-1)) (RESTRICTI(0)(1)) (STYPEI(0)) (CTYPEI(-1)) (STYPEI(1)) (CTYPEO(-1)) (STYPEO(0)) (<evolved>) ) ) (PROGN (BLOC) (TESTIO8) (SHARI (JMP12) (SHARI (JMP12) (SHARI (JMP12) (SHARI (JMP12) (PROGN (SWITCH) (SHARI (JMP12) (PROGN (SWITCH) (SHARI (JMP12) (PROGN (SWITCH) (SHARI (JMP12) (JMP12) (1))) (1))) (1))) (1)) (1)) (1)) (1)) )))
Figure 2.7 Syntactic constraints specifying the general structure
It just specifies some fixed cellular code that is always part of any individual. This code produces the general structure developed in figure 2.8. We now detail the execution of the code. First it sets the type of the links to the output pointer cell to 0 and develops an ANN with two input units typed 0 and 1.
Figure 2.8 What the syntactic constraints specifying the general structure do. The initial situation is (a), with the two pointer cells represented by squares and the ancestor cell by a circle. (b) The general structure is specified after the first three divisions; the two input units will execute a genetic tree generated by a non-terminal which is not used anyway. The central cell will develop the core of the controller; the bottom cell waits for the core to develop, then unblocks and checks that it has exactly 8 neighbors (it does here, but it might not), and makes 7 SHARI divisions in (d) and (e). The last four divisions are interleaved with SWITCH operators, so as to set the leg numbers as indicated in (g). Finally, the 8 child cells execute a JMP 12, where 12 is just the number of the tree where the cellular code that encodes a leg controller has been stored.
It then generates a cell which will execute the beginning of the evolved code (<evolved>), and another cell blocked by the BLOC instruction. The input cells are not used. The blocked cell waits until all its neighbors are neurons, then unblocks and executes the TESTIO instruction, which checks the number of inputs (here, whether it is equal to 8) and sets a flag accordingly. This flag will be used in the fitness evaluation to throw away those ANNs which do not have exactly 8 output units. The unblocked cell then goes on to execute the cellular code that develops the previously evolved leg controller at the right place. For this it uses an ad hoc division called SHARI, and also the SWITCH operator, which is used to assign
numbers to the output units, matching a logical numbering of the legs. This order is specified in figure 2.8 (g).
2.7 Log of the Experimental Runs

We performed only five runs, and we are proud of it: it is proof that the method works well and does not need weeks of parameter tuning. That is useful, because one run is about two days of work. So we report all five runs, even if only one of them was really successful and the others were mainly used to debug our syntactic constraints.

Analysis of Run 0 and Run 1

The first two runs were done with only seven legs, because a plastic gear on the eighth leg had burned out when the robot stubbornly tried to run into an obstacle for 30 seconds. As a result, ANNs were evolved that could make the robot walk with only seven legs. Run 0 brought an approximate solution to the problem, but after hours of breeding we accidentally input a fitness of 23, which stopped the EA, the success predicate being a fitness greater than 1. Run 1 also gave a not-so-bad quadripod, as far as we can remember, but we realized there was a bug in the way we used the link types, which were in fact not used at all: instead of selecting a link and then setting the type, we were first setting and then selecting, which amounts to a null operation. We had simply exchanged the two alleles. When we found the bug, we were really surprised that we had obtained a solution without types, but we think that is an interesting feature of EAs: even if your program is buggy, the EA will often find workarounds.

Syntactic Constraints Used in the Second Run

The syntactic constraints used for the second run are reported in figure 2.9. The core of the controller is developed by the cell executing the non-terminal <evolved>. This cell makes clones, and then generates a sub-ANN. One non-terminal is rewritten exactly 3 times and generates exactly 3 clone divisions, so an ANN with an 8-fold symmetry will be generated. Another non-terminal is rewritten between 1 and 6 times, and generates a sub-ANN having between one and seven units. Because of the preceding 3 clones, this sub-ANN will be duplicated 8 times; however, one of those 8 sub-ANNs can potentially control more than one leg.
The clone division and the normal cell division are specified in a similar way. The general type of the division is chosen among FULL, PAR, SEQ, CPI and CPO. Right after dividing, the two cells execute a different sequence of cutting operators, generated by a dedicated non-terminal. It is possible to adjust the probability of cutting a link by first specifying how often we want to cut, and then how many links of a given type we want to cut. We tuned this probability so as to obtain ANNs that are not too densely connected. The second child sets the type of the newly created links between the two child cells, if there are some, using the non-terminal <settype>. When a cell stops dividing, it sets all its neuron attributes; the corresponding cellular code is generated by another non-terminal. First we reverse some links and/or add a recurrent link; we choose a particular amount of recurrence by setting the probability with which recurrent links are created. The code then generates a time constant, some weights, a threshold, an initial activity, and finally the sigmoid type. The PI unit computes the product of its inputs; we use it with a small probability, because it is highly non-linear and we do not want to introduce too much non-linearity.

Analysis of the Second Run

The wavewalk gait is when the robot moves one leg at a time; the quadripod gait is when the robot moves the legs four by four. For the second genetic run, we implemented a mechanism to store nice individuals, so as to be able to print them and film them afterwards. We now describe a selection of the different champions that were found during the second genetic run. They are shown in figure 2.10.

At generation 0, the ANNs are a bit messy and densely connected. About 75 percent of the ANNs do not have the correct number of outputs, that is 8 outputs, and they are directly eliminated. For this reason, generation 0 takes quite a bit more time than the other generations; later individuals have the right number of outputs most of the time. Among those that have the correct number of outputs, most of the ANNs do not even make the robot move, and a lot of them make random movements. This happens when a threshold unit is connected directly to the servos and its input is near 0: its output then oscillates between 0 and 4096, due to the 1 percent of random noise that is added to the net input. Some ANNs produce a short sequence on one leg before reaching a motionless state; they get a fitness like 0.001. One of them produced an oscillation on four legs. During the next two generations, we concentrated on evolving oscillations on as many legs as possible, giving fitnesses between 0.01 and 0.02, depending on how many legs were oscillating, and how good the period and duration of the return stroke and power stroke were.
At generation 3, we started to have individuals with a little bit of coordination between the legs. We watched an embryo of wavewalk, which very soon vanished because the legs went out of synchrony; the coupling was too weak. The ANN is represented in the second picture of figure 2.10; you can see that it is very sparsely connected. The next ANN, at generation 4, generated an embryo of quadripod: four legs moved forward, then the four other legs, then the 8 legs moved backward. At generation 6 we got an ANN made of four completely separated sub-ANNs, each one controlling two legs. Due to a different initial setting of the activities, the difference of phase was correct at the beginning, and we obtained a perfect quadripod; but after 2 minutes the movement decreased in amplitude, four legs came to a complete stop, and then the four other legs. Generation 7 gave an almost good quadripod; the phase between the two groups of four legs was not yet correct. The ANN is probably a mutation of the one at generation 3, because the architecture is very similar. Generation 8 gave a slow but safe quadripod walk: the coupling between the four sub-ANNs is strong enough to keep the delay in the phase of the oscillators, and only the frequency needs to be improved, because it is really too slow. Generation 10 produced a correct and fast quadripod, but not the fastest possible. Generation 11 produced one which walked a little bit faster, due to an adjustment of the phase between the two groups of four legs; the frequency did not improve, and furthermore there was no coupling between the legs. Generation 12 did not improve, but we show it because it has a funny feature. The input neuron was not used in all our experiments; it was meant to control the speed of the robot, but we could not yet go that far. However, its activity was set to 2000, and at generation 12 an ANN was produced that used this neuron to differentiate the phase between the two pairs of sub-ANNs that were controlling the two groups of four legs.

Except at generation 0, where we always have a great variety of architectures, you can see that the general organization of the ANN remains similar throughout the whole run. It has 8 sub-ANNs; four of them control the 8 legs, one for two contiguous legs, and can potentially ensure that two contiguous legs on one side are out of phase. The four other sub-ANNs are in between; sometimes they are not used at all, as is the case in the champions of generations 11 and 12. The 8 sub-ANNs taken separately have some recurrent links; however, if each sub-ANN is modeled as one neuron, the architecture becomes feed-forward. The sub-ANNs that control the rear legs send information to all the other sub-ANNs.
other sub-ANNs. They are the two master oscillators which impose the rhythm. The other sub-ANNs, although they could supposedly run independently, have their phase locked by the master oscillators. We realized that in this run, it was not possible to mutate the general architecture without destroying the whole genetic code, because that would imply replacing a division by another. We think it explains the fact that the general architecture did not change. So we decided to change the syntactic constraints in the next run so as to make the mutation of the general architecture possible, and more thoroughly explore the space of general architectures. Analysis of the Third Run The third run lasted only 5 hours, because of a typing mistake in the first generation. We typed a high fitness on the keyboard, for a bad individual who kept reproducing all the time during the next three generation. Being unable to locate it and kill it, we had to kill the whole population, a bit like the story of the mad cow disease which recently happened in England. Syntactic Constraints Used in the Fourth Run Our motivation to do another run was to enable mutation of the general structure. We also realized that there were still bugs in the syntactic constraints and we fixed them. The new syntactic constraints are shown in figure 2.11. To enable mutation of the general structure, instead of implementing as a recursive non-terminal, we put three occurrences of it, and use it in a non-recursive way. In this way, it is possible to mutate any of them, without having to regenerate the others. Second, we used only the FULL division for the general structure, and forced the EA to entirely determine type by type, which links are inherited and which are not. This can be mutated independently, and may result in making soft mutation possible, that can modify a small part of the division. In contrast, if we mutate a division from CPI to PAR, for example, the entire division has to be re-generated from scratch. (We remind the reader that now the division is also genetically determined using link types and cutting links selectively after the division.) The goal of using only FULL was also to augment the probability of coupling between two subANNs. Using only FULL, augments the density of connections, so we augmented the probability of cutting links to keep the density at the same level. Also we made a distinction between cutting links that were created during the development, and cutting the link to the output unit which is now done at the
end of the development, by the non-terminal