Automaton Theory and Modeling of Biological Systems
M. L. TSETLIN Translated by Scitran (Scientific Translation Service) Santa Barbara, California
ACADEMIC PRESS New York and London A Subsidiary of Harcourt Brace Jovanovich, Publishers
1973
COPYRIGHT © 1973, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 72-11341
PRINTED IN THE UNITED STATES OF AMERICA
Automaton Theory and Modeling of Biological Systems. Translated from the original Russian edition entitled Issledovaniya po Teorii Avtomatov i Modelirovaniyu Biologicheskikh Sistem, published by “Nauka” Press, Moscow, 1969.
Mikhail L’vovich Tsetlin (1924-1966)
Contents

Foreword                                                           xi
Preface to the Russian Edition                                   xiii

AUTOMATON THEORY

Finite Automata and Modeling the Simplest Forms of Behavior
Introduction                                                        3

I. BEHAVIOR OF AUTOMATA IN RANDOM MEDIA
1 Behavior of Automata in Stationary Random Media                  12
2 Asymptotically Optimal Sequences of Symmetric Automata.
  The Book Stack Problem                                           16
3 Behavior of Automata in Composite Media                          25
4 Behavior of Automata with an Evolving Structure in Random Media  31
Appendix to Part I. Eigenvalues of Markov Chains Describing the
  Behavior of Asymptotically Optimal Automata in Stationary
  Random Media                                                     34

II. AUTOMATON GAMES. ZERO-SUM GAMES FOR TWO AUTOMATA
1 Automaton Games                                                  41
2 Zero-Sum Games for Two Automata                                  46

III. HOMOGENEOUS AUTOMATON GAMES
1 Homogeneous Games                                                55
2 An Example of Simulating a Symmetric Automaton Game              62
3 Circle Automaton Games                                           71

An Example of Modeling the Behavior of a Group of Automata with a
  Two-Level Organization (The Numerical Method Distribution
  Problem)                                                         84

Behavior of Automata in Periodic Random Media and the Problem of
  Synchronization in the Presence of Noise                         93

Organization of the Queuing Discipline in Queuing Systems Using
  Models of the Collective Behavior of Automata                   102

Mathematical Modeling of the Simplest Forms of Behavior           108
Appendix 1. Addressless (Nonindividualized) Control               124
Appendix 2. Languages That Automata Use to Communicate with One
  Another                                                         125

ARTICLES ON BIOLOGICAL SYSTEMS AND MATHEMATICAL MODELS IN BIOLOGY

Introduction                                                      129

Mathematical Simulation of the Principles of the Functioning of
  the Central Nervous System
1 On Search Tactics                                               132
2 Simulation of Expedient Behavior of a Group of Automata         140
3 A Mathematical Description of Excitable Tissues                 147
4 The Principle of Least Interaction                              149

Continuous Models of Control Systems                              154

Certain Problems in the Investigation of Movements
1 Synergies and Other Mechanisms Simplifying Motor Control        162
2 Functioning of Motor Control on the Spinal Level                165
3 Functioning of a Motoneuron Group and Motor Units               169

Computer Simulation of the Functioning of a Motoneuron Group
1 Pose Regime of Motor Unit Operation                             172
2 Description of the Model                                        174
3 Desynchronization of Motoneurons; the Influence of the Renshaw
  Cell System on Impulse Transmission of Motoneurons              178
4 Control of Muscle Length                                        182
5 Simulation of Pathological States                               184

Restructuring Prior to a Movement                                 187

Bioelectric Control and Diagnostics of States
1 Use of Skeletal Muscle Biopotentials for Control                198
2 Use of Cardiac Biocurrents to Control Diagnostic Devices        200
3 Certain Problems Related to Automatic Diagnosis of Acute
  Pathological States                                             210

SUPPLEMENTARY ARTICLES
Certain Properties of Finite Graphs Related to the Transportation
  Problem                                                         221
Application of Matrix Calculus to the Synthesis of Relay-Contact
  Circuits                                                        226
Bibliography of Papers by M. L. Tsetlin                           232

APPENDICES
On the Goore Game                                                 239
A Simplified Description of Games Played by Asymptotically Optimal
  Automata                                                        247
1 Two-Person Games
2 A Remark about Games with an Arbitrary Number of Players        251
The Problem of Controlling a Communications Network               253
The Operation of the Apartment Commission                         255
The "Hey" Problem                                                 260
Papers on Continuous Excitable Media
1 One-Dimensional Excitable Media                                 262
2 Two-Dimensional Excitable Media                                 264
The Restructuring of the Operation of the Spinal Level            267

References                                                        269
Author Index                                                      281
Subject Index                                                     285
Foreword
For an American practitioner in mathematics, reading the work of Soviet scientists frequently arouses disquieting undercurrents of feeling. Despite the fact that mathematics is the most nearly universal of languages, there remains a difference in style, in motivation, and in the attitude of the mathematician toward his work. This collection of the papers of the Russian mathematician M. L. Tsetlin has evoked these feelings in me and I believe that other readers may well respond in the same way. M. L. Tsetlin was a mathematician in the tradition of Gel’fand, Fomin, and Pyatetsky-Shapiro, that is to say, of mathematicians who combine a very high order of mathematical skill and rigor with a concern for the extramathematical implications of their theorems. “Automaton Theory” exhibits the qualities we have come to expect of good Russian writing in mathematics. It is expository, the reasoning is explicitly and clearly presented, the proofs are careful and complete, and the theoretical results are applied to real world problems, in this case to aspects of neurology and biological control. The mathematics itself would hardly be termed automaton theory by American and West European automaton theorists. Tsetlin’s work makes relatively little contact with this line of research as we have come to regard it. Rather it is closer in spirit to the issues raised in game theory and to learning machines or perceptrons. Tsetlin considers collections of automata as a model for collective behavior of a group with no a priori information other than the rules of the game, and attempts to derive the structure of systems that exhibit self-organizing behavior. It is true that the papers refer to only the simplest modes of behavior. One wonders whether it is a reflection of cultural or social differences that Tsetlin chose to study cooperative phenomena in choosing “expedient” behavior, while American game theory focuses on competition among the players.
In any case, Tsetlin has made a number of digital models, using a rather eclectic mixture of mathematical tools, to begin to understand interacting collections of automata.
Over half of the text is devoted to Tsetlin’s work, with Gel’fand and coworkers, on biological systems and mathematical models in biology. The readers might well begin reading this collection with Gel’fand’s introduction to this second section. Here and elsewhere in bits and pieces the reader will sense the feeling of personal loss that Tsetlin’s co-workers felt at his early death. Gel’fand writes inter alia “Unfortunately, this stage of our joint work has been interrupted without ever having seriously begun.” It was undoubtedly this sense of loss that impelled his colleagues to do the job of careful, extensive, and dedicated editing of Tsetlin’s papers. Tsetlin’s work is unfinished; there are insights, starts, and work in progress in a number of directions that will be interesting to neurophysiologists, psychologists, and sociologists, as well as mathematicians. There is as yet little serious impact by mathematics upon the biological and social sciences other than within the special domain of statistics. “Automaton Theory” tells us about one major Russian attempt to formalize the study of living systems. Tsetlin’s work may only be a beginning but I believe we can learn from it when we try our hand at translating biology’s ineffable problems into mathematics.

MURRAY EDEN
Cambridge, Massachusetts
Preface to the Russian Edition
On May 30, 1966, Mikhail L’vovich Tsetlin died suddenly at the age of 42. He was one of the most outstanding specialists in the area of cybernetics. During the last ten years of his life, Mikhail L’vovich was mainly occupied with finding the general principles underlying the operation of biological systems, and with development of biocontrolled devices. The friends and colleagues of the deceased have collected his principal works, and this is how this book came into being. It contains the published articles and certain materials from Mikhail L’vovich’s archives, prepared for publication by the editors. The basic text of the articles was fully preserved, with the exception of a small number of changes and omissions which were made to avoid repetition. The order of the materials was determined by the desire to produce a self-contained book. For the same reason, the literature quoted in M. L. Tsetlin’s papers and in the appendix has been put together in a single list placed at the end of the book (references to it are in the form of numbers in brackets). M. L. Tsetlin’s works are grouped in three parts. The first part, Automaton Theory, is devoted to a study of the modeling of expedient behavior with the aid of automata. The second part contains the basic papers on the general principles underlying the functioning of biological systems. The basis for the article on bioelectric control was provided by the materials which Mikhail L’vovich prepared together with others for the purpose of writing a separate book. The materials were put in their final form by V. S. Gurfinkel’ after the death of Mikhail L’vovich. The predominant position in the book is occupied by the lecture, “On the mathematical modeling of the simplest forms of behavior,” which was placed at the end of the first part. This lecture, delivered by Mikhail L’vovich to physiologists, is a simple and clear exposition of the basic ideas discussed in the first part.
It is perfectly understandable to persons not having a specialized mathematical training. At the same time, in our opinion, this
lecture is also of great value to mathematicians who are interested in the problems discussed in the first part. We recommend that the readers become acquainted with the lecture before reading the first part. In addition, the lecture is essentially an introduction to the second part of the book. The third part of the book contains two additional articles, which are not directly related to the subject matter of the first two parts, and a bibliography of papers by M. L. Tsetlin. An appendix, placed at the end of the book, contains the results obtained by Mikhail L’vovich’s students, as well as questions and problems which worried M. L. Tsetlin in the latter years of his life, but were not expressed in his publications. When speaking of the first part of the book, which is devoted to the mathematical problems of a theory of expedient behavior of one individual or a group, it must be noted that in his works M. L. Tsetlin did not strive to model any kind of specific behavior, but instead posed himself the task of explaining the general laws of expedient behavior, in particular, the expedient behavior of groups. Expedient behavior is really an adaptation to the external world. An individual is capable of performing one of a finite number of actions. Whether he wins or loses depends on which action he performed. Should the individual know beforehand for which action he is going to receive the maximum payoff, he would then naturally always perform this most advantageous action, and this type of behavior would then be, of course, most expedient. In reality, however, an individual usually does not have a priori information about which action will be most advantageous to him. In Tsetlin’s works an individual finds himself in “an external world” which is constructed in the following way (for a more precise treatment see pp. 13-14 of the present edition): in response to each action of the individual, he wins with a certain probability and loses with a certain probability. 
This model of “an external world” is called a stationary random medium. Mikhail L’vovich proposed the construction of an automaton which minimizes the number of unfavorable reactions of the external world. Such an automaton-and this is most remarkable-does not possess any a priori information about the parameters of the stationary random medium with which it is interacting. In the words of Mikhail L’vovich, this problem is the simplest model of “a small living creature in the big world around him.” Mikhail L’vovich’s works, dealing with the expedient behavior of a collective, permit us to approach the expedient behavior of extremely
diverse groups from a single point of view, and to view them as composed of individual cells or nerve centers, or even individual persons. These works gave rise to the theory of automaton games. Contrary to the classical theory of games developed by J. von Neumann, in which players from the very outset know the consequences of various actions (their own and the opponents’), M. L. Tsetlin proposed to investigate a situation in which the players do not have any a priori information about the numerical values of the parameters of the game. It was found that automata with a linear tactic which successfully dealt with the problem of behavior in a stationary random medium also dealt successfully with a game played under these conditions. This concept made it possible for Mikhail L’vovich to consider not only two-player games, but also games with a larger number of players. Then Mikhail L’vovich applied the principles, developed by him in the theory of automaton games, to various problems. The second part of the book is devoted to the general principles underlying the functioning of biological systems. In contrast with the preceding part which discusses the mathematical theory with sharply delimited boundaries, the articles contained in the second part represent rough sketches of some future theory. They were written by M. L. Tsetlin together with I. M. Gel’fand, V. S. Gurfinkel’, Yu. B. Kotov, M. L. Shik, and others. Basically, these articles deal with the problem of control of movements, and with the principles underlying the operation of the nervous system. The approach toward the modeling of biological systems, developed in them, is connected with the theory of automaton games, which was discussed in the first part of the book. In addition, this part expounds the principles of bioelectric control.
On the basis of the information about electric processes in organisms, it became possible to create a new type of controlled systems in which the role of control signals is played by biopotentials. Isolated attempts of this type were also made earlier, but it was only after the fundamental papers by Soviet authors that the problem of biocontrol became widely investigated. The creation of the first operating models of prostheses and manipulators, based on biocontrol, provided a powerful stimulus, and now this problem is being investigated by dozens of laboratories in various countries. M. L. Tsetlin was one of the creators of the principles of biocontrol, and a direct participant in the construction of the first models. He devoted a great deal of attention to the application of biocontrol principles for the development of new diagnostic instruments. He developed devices providing a synchronization between the switching of diagnostic instruments and the phases of the cardiac cycle, and devices
for automatic control of the monitoring and registering apparatus used in disturbances of the cardiac rhythm. He also developed principles for a system of automatic observation and diagnosis of the state of patients. The preparation of this book for publication after the death of Mikhail L’vovich was undertaken by M. M. Bongard, I. M. Gel’fand, V. S. Gurfinkel’, I. I. Pyatetskiy-Shapiro, N. N. Chentsov, M. L. Shik. The appendix was prepared by M. B. Berkinblit, A. V. Butrimenko, D. I. Kalinin, V. I. Krinskiy, B. G. Pittel’, V. A. Ponomarev, L. I. Rozonoer, V. L. Stefanyuk, A. L. Toom, I. M. Epshteyn. The editing of the book was done by M. B. Berkinblit and A. L. Toom. The work they have performed went far beyond the usual editorial duties. M. BONGARD I. PYATETSKIY-SHAPIRO N. CHENTSOV
AUTOMATON THEORY
Finite Automata and Modeling the Simplest Forms of Behavior¹
Introduction

The simplest behavior problem discussed in the present chapter can be formulated as follows: We consider an automaton, i.e., an object capable of receiving a finite number of signals s ∈ (s_1, s_2, …, s_N) at every instant of time t = 1, 2, …, and changing its internal state in accordance with these signals. The automaton can carry out a finite number of actions f ∈ (f_1, f_2, …, f_k). The choice of the action is determined by the internal states φ ∈ (φ_1, φ_2, …, φ_m); the number m is called the memory capacity of the automaton. It is assumed that the automaton is situated in a certain environment and that the actions of the automaton cause responses s of the medium C. These responses are, in turn, the input signals of the automaton. It uses them, as it were, to decide its further actions. In this chapter, we confine ourselves to the simplest case in which the automaton treats all the possible reactions of the environment s ∈ (s_1, …, s_N) as belonging to one of two classes: that of favorable reactions (corresponding to a payoff, s = 0), and that of unfavorable reactions (corresponding to a loss, s = 1). Within each of these classes, the responses of the environment are indistinguishable to the automaton. The expedient behavior of the automaton in a given environment consists of increasing the number of favorable responses and diminishing the number of unfavorable ones.

¹ The doctoral dissertation of M. L. Tsetlin, defended in 1964 at a division of the V. A. Steklov Mathematical Institute (postscript of the compiler).
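As an illustration only (the code, class, and two-state example here are the editor's, not the book's), the object just described, with its internal states, an output map from state to action, and a transition rule driven by the binary response s, can be sketched as follows:

```python
# Illustrative sketch (not from the book): a finite automaton with
# states 0..m-1, an output map F from state to action, and a
# transition table driven by the binary response s of the medium
# (s = 0: favorable response, payoff; s = 1: unfavorable, loss).

class FiniteAutomaton:
    def __init__(self, output_map, transition):
        # output_map[i]    -> action performed in state i
        # transition[s][i] -> next state on receiving response s in state i
        self.state = 0
        self.output_map = output_map
        self.transition = transition

    def act(self):
        return self.output_map[self.state]

    def receive(self, s):
        self.state = self.transition[s][self.state]

# A deterministic two-state example: each state is tied to one action;
# a loss makes the automaton switch actions, a payoff keeps it in place.
aut = FiniteAutomaton(
    output_map=[0, 1],
    transition={0: [0, 1],   # s = 0 (payoff): stay
                1: [1, 0]},  # s = 1 (loss): switch
)
aut.receive(1)
print(aut.act())  # prints 1: a loss made the automaton switch actions
```

The two-state example is the smallest automaton that reacts to its environment at all; larger memories, as the text goes on to explain, allow more inert and more expedient behavior.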
In what follows, we shall consider deterministic and stochastic automata.² An automaton is specified by an equation f(t) = F(φ(t)), giving the relationship between an action f(t) of the automaton at time t and its state φ(t), and by the stochastic matrices ‖a_ij(s)‖, i, j = 1, 2, …, m. Here a_ij(s) is equal to the probability of a transition from state φ(t) = φ_i into state φ(t+1) = φ_j, due to the input s(t+1). In the case of deterministic automata, the matrices ‖a_ij(s)‖ consist of zeros and ones. Since we consider automata that are capable of receiving only two signals (s = 0) and (s = 1), it is sufficient to specify two such matrices, ‖a_ij(0)‖ and ‖a_ij(1)‖.

Part I deals with behavior of automata in random environments; Parts II and III deal with various problems related to the collective behavior (games) of the automata. The first section of Part I deals with the simplest problem arising here: the behavior of automata in stationary random environments [154, 155]. In such environments, the probabilities of payoff and loss are defined for each possible action of the automaton. It is shown that the behavior of an automaton in a stationary random environment can be described by means of a finite Markov chain. Usually, this chain is ergodic. Then there exist final probabilities of states, and this makes it possible to define the mathematical expectation of the automaton payoff, which does not depend on its initial state. This quantity can serve as a measure of the expediency of behavior in a given stationary random environment.

The symmetric finite automata, i.e., those automata in whose construction any a priori information about random environments is deliberately omitted, are of special interest. A question arising naturally is that of the existence of asymptotically optimum sequences of symmetric automata A_1, …, A_n, …, i.e., sequences such that the expected payoff for automata belonging to such a sequence tends, as n → ∞, to the maximum possible value in a given random environment. Here n can be interpreted as the number of states of the automaton (the capacity of its memory). When the capacity of its memory is sufficiently large, the automaton carries out, with a probability close to 1, the action that maximizes the probability of winning.

Section 2 of Part I presents several constructions of asymptotically optimum sequences of finite symmetric automata advanced by Krinskiy [102], Krylov [112], Ponomarev [128], and the author [155]. The same section contains a formulation of a necessary condition for the asymptotic optimality of a sequence of finite automata.

² One can familiarize oneself in more detail with automaton theory from the books by Kobrinskiy and Trakhtenbrot [97] and by Glushkov [70].
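The Markov-chain description admits a simple numerical check. The sketch below (an editorial illustration with arbitrarily chosen probabilities; the book treats this analytically) simulates a two-action automaton with linear tactics, one classical construction discussed later in this chapter: each action owns a chain of n states, a payoff drives the automaton deeper into the current chain, and a run of losses pushes it out and over to the other action. Its long-run fraction of payoffs approaches the better of the two win probabilities as n grows.

```python
import random

def linear_tactics_payoff(p, n, steps=200_000, seed=1):
    """Estimate the win fraction of a two-action linear-tactics automaton
    with memory depth n in a stationary random environment where action a
    wins with probability p[a].  (Illustrative sketch, not the book's code.)"""
    rng = random.Random(seed)
    action, depth = 0, 1              # current action, counter in 1..n
    wins = 0
    for _ in range(steps):
        if rng.random() < p[action]:  # favorable response, s = 0
            wins += 1
            depth = min(depth + 1, n) # move deeper into the branch
        else:                         # unfavorable response, s = 1
            depth -= 1
            if depth == 0:            # crossed the boundary:
                action = 1 - action   # switch to the other action
                depth = 1
    return wins / steps

# With p = (0.8, 0.4) the estimated payoff rises toward max(p) = 0.8
# as the memory depth n grows.
for n in (1, 2, 5, 20):
    print(n, round(linear_tactics_payoff([0.8, 0.4], n), 3))
```

For n = 1 the chain's stationary distribution gives a payoff of about 0.7; by n = 20 the automaton almost never leaves the better branch, matching the asymptotic-optimality statement above.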
The constructions of stochastic automata maximizing the expected payoff with a fixed capacity of the memory were discussed by Milyutin [123], who had indicated such constructions and some of their principal properties. In particular, Milyutin has shown the optimality of automata with linear tactics in some stationary random environments. In this reference Milyutin used a method, suggested by him and Dubovitskiy, of solving extremum problems in the presence of constraints [88, 89]. The same section contains a discussion of “the problem of a pile of books,” an example of a system whose behavior largely coincides with the behavior of a finite automaton in a stationary random environment, and which is expedient in a certain sense. The constructions of automata described in Section 2 assure expedient behavior in the simplest case, that of stationary random environments. It seemed natural to us to study their behavior in environments which are not stationary. Section 3 describes the behavior of automata in environments termed composite, i.e., in random environments whose probabilistic properties depend on time as determined by a Markov chain. In composite environments an automaton is forced to continually “evolve,” and the time during which its structure “evolves” greatly influences the expediency of its behavior. For this reason the dependence of the expected payoff to the automaton on the capacity of its memory loses its monotonic character in composite environments. In particular, this section contains a computation of the expected payoff for the case when an automaton with linear tactics, described in Section 2, interacts with a composite environment. It turns out that the expected payoff attains a maximum for a fixed capacity of memory, and decreases as this capacity further increases.
Owing to the existence of an optimum memory capacity, it is possible to select those automata having a specific design which have the greatest expectation of winning in a given composite environment. Algorithms of optimum behavior in composite environments were studied by Dobrovidov and Stratonovich [87]. Varshavskiy and Vorontsova investigated the behavior of stochastic automata with evolving structures [43, 44]. At first, the behavior of these automata is not expedient, but as the responses of the environment are fed into them, these automata change their matrices ‖a_ij(s)‖ in such a way that their behavior becomes increasingly expedient during the experiment. The behavior of such automata in stationary and composite random environments was investigated by means of modeling on a digital computer [43]. Methods of structural evolution were found which ensured optimum behavior in stationary random environments. In composite environments, the mean payoffs for automata with evolving structures coincide with those for
6
Automaton Theory
automata with linear tactics and optimum memory capacity. The results obtained by Varshavskiy and Vorontsova are briefly presented in Section 4, Part I. Part I of this chapter deals with the behavior of automata in those random environments whose properties are assumed either to be constant or to change independently of the actions of the automata. Such forms of behavior are sometimes described as “games against nature.” The following two parts of this chapter deal with questions connected with the collective behavior of automata that is produced by their interaction. A convenient way of defining simple forms of such interaction is provided by the terminology of game theory [24, 41, 121, 122]. The use of such terminology narrows down the class of the forms of behavior studied, but at the same time, it leads to the construction of a number of meaningful models. It should be noted that the automaton games are discussed here from a viewpoint that differs from the one accepted in game theory. Indeed, it is normally assumed in the latter that the game is defined by a system of payoff functions previously known to the players. Using this a priori information as well as arbitrarily chosen means of computation, the player selects his strategy. Moreover, each player assumes that his opponents play in the best possible way. The (usually mixed) strategies that are selected in this manner remain constant during the game. The game resembles a game of chess which begins and ends with an analysis of the situation. We thought it interesting to consider games played by finite automata having no a priori information about the game, and being forced to shape their strategies for each successive replay in the course of the game itself [113, 156-158].
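The loss of monotonicity in composite environments, described above, can also be seen numerically. The following rough sketch is editorial (the switching rule and all parameters are assumptions; the book computes this payoff exactly): the environment swaps which action is favorable at random moments, and a linear-tactics automaton with too much memory re-adapts too slowly after each swap.

```python
import random

def payoff_in_composite_env(n, switch_prob=0.01, p=(0.8, 0.2),
                            steps=300_000, seed=7):
    """Win fraction of a two-action linear-tactics automaton with memory
    depth n in a composite environment: with probability switch_prob per
    step the environment swaps which action is the favorable one.
    (Illustrative sketch only, not the book's computation.)"""
    rng = random.Random(seed)
    flipped = False                   # has the environment swapped?
    action, depth, wins = 0, 1, 0
    for _ in range(steps):
        if rng.random() < switch_prob:
            flipped = not flipped
        p_win = p[action ^ flipped]   # win probability of current action
        if rng.random() < p_win:      # favorable response: go deeper
            wins += 1
            depth = min(depth + 1, n)
        else:                         # unfavorable response: back out
            depth -= 1
            if depth == 0:
                action, depth = 1 - action, 1
    return wins / steps

# The payoff is not monotone in n: too little memory forgets the good
# action too easily, too much memory makes the automaton slow to
# re-adapt after the environment switches.
for n in (1, 3, 8, 50):
    print(n, round(payoff_in_composite_env(n), 3))
```

In this run the estimate peaks at a moderate memory depth and falls again for large n, which is exactly the optimum-memory effect the text attributes to composite environments.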
Automaton games are defined in the first section of Part II. It is assumed that such games are repeatedly played numerous times. A play f of a game G is a set f = (f_{j_1}, f_{j_2}, …, f_{j_ν}) of strategies (actions) selected by the automata A^1, A^2, …, A^ν playing the game. An outcome s of a play f is the set s = (s_1, s_2, …, s_ν), s_i = 0, 1, i = 1, 2, …, ν. Here, s_i = 0 if the automaton A^i won the play f, and s_i = 1 if this automaton lost the play f. The information about the constructions of the automata playing the game and the probabilities p(f, s) of the outcomes of the plays defines the game. Thus, the information about the payoff or loss in a play determines the input variable for each of the playing automata, and so determines the choices of actions (strategies) in the plays that follow. The information received by an automaton during the course of the game is made complete by the information about the outcomes of its individual plays. An automaton
game defined in this manner is equivalent to a game defined in the usual sense of game theory. Conversely, an automaton game can be constructed on the basis of a given system of payoff functions. For automaton games described as games with independent outcomes, such a construction is unique. The same section proves that an automaton game can be described by a (finite) Markov chain. As a rule, the chain is ergodic. This makes it possible to single out the important class of ergodic games in which there exist well-defined final probabilities (independent of the initial conditions) of winning for each of the automata participating in the game. The section ends with an example of a nonergodic game. The second section of Part II is devoted to zero-sum games for two automata. The interest of this case lies in the fact that the equivalent zero-sum game for two persons is the subject of von Neumann’s well-known theorem. A game played by an automaton against an opponent who chooses a mixed strategy (in the sense of game theory) is considered first. It turns out that an automaton belonging to an asymptotically optimum sequence maximizes (if the capacity of its memory is sufficient) its payoff. It obtains a payoff equal to the value of the game in von Neumann’s sense when its opponent chooses an optimum strategy. In other words, such an automaton “plays no worse” than its opponent choosing an optimum mixed strategy. Behavior that is expedient in a stationary random environment turns out to be expedient in this case also. When choosing the design of playing automata, it is natural to require that in any case their behavior should be expedient in the simplest game, i.e., in a game against nature. The simple automata whose behavior is studied in the present book have no information available about the actions of their opponents, about the strategies that are available to them, or even about the number of opponents.
To an automaton, the role of the opponent reduces to forming a more or less complicated environment in which it must behave in an expedient manner. Next, a zero-sum game for two identical automata with linear tactics is discussed in detail in Part II, Section 2, and the final payoff probabilities as n → ∞ are computed. It turns out that in the case of an obvious advantage for one of the players (the presence of rows of nonnegative elements in the matrix of the game), that player chooses the strategy yielding the maximum guaranteed payoff, so that its behavior is very similar to that which is prescribed by game theory. However, if there is no such obvious advantage, the expected payoff for each player is zero. Automata with linear tactics play, as it were, in a rough way, being unable to take advantage of the subtler
properties of the matrix of the game. These investigations were carried out by the author jointly with Krylov [113, 158]. Part III deals with homogeneous games of automata, i.e., games in which all participants have equal rights [32, 59]. In the first section of Part III, a definition of the group G of the automorphisms g of a game Γ* is given. Those games are homogeneous for which the automorphism group is transitive over the set of players. In homogeneous games, the sets of strategies of the players are pairwise isomorphic. The invariant sets of game plays are determined with the aid of the automorphism group of a homogeneous game. Here, the arithmetic mean of the payoffs for all players in a given play (the value of the play) coincides with the arithmetic mean of the payoffs for any one player over an invariant set of plays (the value of the invariant set). A Nash play of a game is defined as a play in which it is not convenient for any player to change his strategy, assuming that no other player changes his. In this sense, Nash plays are stable. Games consisting of Nash plays are called Nash games. The invariant sets generated by Nash plays in homogeneous games are called Nash sets. In the study of homogeneous games of automata, it seemed interesting to us to compare, as in the zero-sum games for two automata, the behavior of automata with that of people who know the conditions of the game beforehand. In homogeneous games, such behavior is not difficult to predict: people would agree to play sequentially the plays belonging to an invariant set of maximum value. If the set is a Nash set, then it can be assumed that with a favorable structure of payoff functions the automata will play “no worse.” Games in which the invariant set of maximum value is a Nash set are called Moore games. Section 1 of Part III gives examples of Nash and Moore games.
The final part of the section gives a procedure for constructing from a given homogeneous game Γ* another game Γ_0* which has the same sets of players, the same strategies, and the same values of the plays as in the game Γ*, but is a Moore game. This procedure (we call it the procedure of introducing a “common fund”) is equivalent to an agreement among the players of a homogeneous game about dividing their payoffs in the game equally. The homogeneous games were studied by the author together with Gel’fand and Pyatetskiy-Shapiro. Section 2 of Part III discusses homogeneous games for identical automata, and an example of modeling a symmetric game of automata (called a game with a distribution [64]) on a digital computer is described in detail. The situation modeled in this game is typical, for example, of the problem in which a predator is selecting his hunting area. Here the number of animals
per predator is determined by the supply of game and by the number of predators simultaneously present in the area. The use of a certain strategy in the game corresponds to the selection of a hunting area, and the various values of the payoff function correspond to the numbers of prey. The distribution game is defined by x nonnegative numbers a_1, a_2, …, a_x, a_1 ≥ a_2 ≥ ⋯ ≥ a_x ≥ 0, called the efficiencies of the strategies. The game is played by the automata 𝔄¹, …, 𝔄^y, y ≤ x, each having x strategies f_1, f_2, …, f_x. An automaton that has chosen a strategy f_j in some play of the game has in this play the expectation of a payoff equal to a_j/n_j, where n_j is the number of automata that have chosen strategy f_j in this play. It is not hard to verify that the distribution game is a Nash game. Studying the behavior of automata with a linear tactic in this game, we have found in a number of examples that the payoff to any automaton approaches the value of the Nash set as the memory capacity n of each automaton increases without bound; with a probability approaching 1, the automata play those plays that belong to the Nash set. The procedure of introducing a common fund transforms the distribution game into a Moore game. It turns out that in this case the average payoff of a participating automaton tends to the value of the Moore set (for n → ∞), i.e., to the maximum average payoff possible in the game. As the memory n of each participating automaton goes to infinity, their collective behavior in the distribution game does not differ from the behavior of a group of people who know the conditions of the game in advance and have entered into an agreement. In fact, it is obvious that players who have prior information about the distribution game and agree on joint actions would receive the highest payoff by playing sequentially all the plays in which the strategies f_1, f_2, …, f_y are used with maximum efficiency.
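The payoff rule of the distribution game, and the claim that it is a Nash game, can be checked directly by brute force for small cases. The sketch below is ours, not part of the original text; the efficiencies and the plays are hypothetical examples.

```python
def payoffs(play, a):
    """Expected payoffs in one play of the distribution game.
    play[i] is the strategy chosen by automaton i; a[j] is the
    efficiency of strategy j.  An automaton that chose strategy j
    receives a[j]/n_j, where n_j automata chose strategy j."""
    counts = [play.count(j) for j in range(len(a))]
    return [a[j] / counts[j] for j in play]

def is_nash(play, a):
    """A Nash play: no single player can increase his payoff by
    changing his strategy while all the others keep theirs."""
    base = payoffs(play, a)
    for i in range(len(play)):
        for j in range(len(a)):
            if j != play[i]:
                alt = list(play)
                alt[i] = j
                if payoffs(alt, a)[i] > base[i] + 1e-12:
                    return False
    return True

a = [6.0, 3.0, 1.0]            # hypothetical efficiencies, a1 >= a2 >= a3 >= 0
print(payoffs([0, 0, 1], a))   # two automata share strategy 1: [3.0, 3.0, 3.0]
print(is_nash([0, 0, 1], a))   # True
```

Note that with these particular efficiencies the play (0, 1, 2), in which each strategy is used by exactly one player, has the larger average payoff but is not itself a Nash play; this is exactly the gap between Nash sets and sets of maximum value that the "common fund" procedure is meant to close.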
Each one of these strategies is used by precisely one player in each play (such plays form a Moore set). A discussion of the behavior of automata in the distribution game makes it possible to express certain notions about reliability. The point is that when a portion of the participating automata leaves the game, those remaining continue to play the most advantageous plays (the Moore plays for the reduced number of automata). Therefore, independently of which automata leave the game, the strategies with the highest efficiencies continue to be used, so that the average payoff to the automata continuing the game increases; this compensates to some extent for the decrease in the total payoff to the group of automata. For finite fixed values n of the memory capacity of the participating automata, the increase of the average payoff to the automata continuing the game is related to the fact that the value of the Moore
set is reached more accurately as the number of players decreases. The final part of Section 2 in Part III contains examples illustrating these ideas.

In the course of analyzing games for many automata, it seemed interesting to us to separate out those games whose description does not depend on the number of players. This property is possessed, for example, by games with a limited number of neighbors, where the payoff function of each player depends only on his own strategy and on the strategies of a limited number of other players, his "neighbors in the game." The simplest homogeneous game of this type is a game on a circle with two neighbors, in which the payoff function of a player 𝔄^j depends on his strategy and on the strategies of the players 𝔄^{j+1} and 𝔄^{j−1}, and each of the players has only two strategies, 0 and 1. A description of this game and the results of modeling it on a digital computer form the contents of Section 3. The same section contains a definition of the simplest game on a circle, an analysis of its Nash plays, and a proof of the assertion that such a game with an even number of players is always a Nash game. Furthermore, an example is given of modeling on a digital computer a game on a circle that is a Moore game, in which the payoff of each of the participating automata reaches the value of the Moore play: for a fixed number of players and an infinite increase of the memory capacity of each of them, the Moore play is played with a probability approaching 1. The value of the Moore play is, however, not reached in all Moore games on a circle. The final part of the section states certain assumptions about the necessary conditions for this to take place; these conditions are imposed on the payoff functions of the game. The assumptions are then verified by means of modeling on a digital computer, and the results of this verification are also given in the same section.
The problem of the expedient behavior of a pair of automata was also analyzed in the diploma dissertation of Stefanov [138]. The paper by Borovikov and Bryzgalov [29] is a study of the simplest symmetric automaton game, in which each automaton has only two actions, 0 and 1, and the payoff is determined by the fraction of players who chose action 1. The conditions of this game likewise do not depend on the number of players.

The behavior of automata in random media and the automaton games to whose study this work is devoted may be viewed as the functioning of self-organizing systems. The theory of the latter (in connection with abstract automaton theory) is investigated in the papers by Glushkov [66, 68]. These papers introduce criteria that allow one to judge the capacity of a system for self-organization, as well as quantitative (entropic) characteristics of self-organization. The actions of automata capable of expedient behavior assure not only a payoff, but also the obtaining of the information necessary for the selection of actions. In this sense automata are systems with dual control in the sense of Fel'dbaum [145, 146]. A paper by Letichevskiy and Dorodnitsyna [116] describes an interesting model of natural selection which is substantially close to the behavior of automata in random media. A detailed analysis of learning systems different from the ones described here was made in the book by Bush and Mosteller, whose Russian translation is provided with an interesting appendix by Shreyder [40].

It should be noted that the examples of behavior models forming the contents of this chapter refer only to the simplest modes of behavior: it is assumed that the automata are capable of distinguishing between only two possible input signals. The behavior of automata capable of receiving richer and more complex information is considerably more complicated. Numerous studies on the modeling of pattern recognition were devoted to an investigation of certain aspects of such complex behavior. Here we shall mention only the well-known papers of Rozenblatt on perceptrons [130]; the papers by Glushkov [67, 69] are devoted to important aspects of their theory. Recognition algorithms are described by Bongard [25, 26], Ayzerman [1], Braverman [1], Glushkov et al. [71], Kovalevskiy [98], Kharkevich [148], and others. It seems that it would be interesting to study the game behavior of more complex automata, for example, of automata receiving information not only about the outcome of each play, but also about the strategies used in this play by the other players.

The mathematical models of the simplest forms of behavior described here were used (with Gel'fand) in our joint research in physiology. We have attempted to use them to explain certain peculiarities of the interaction of nerve centers [56].
In their paper, Pyatetskiy-Shapiro and Shik [129] made an attempt to apply these models to the investigation of the spinal mechanisms of motor control. Other attempts at modeling behavior concern the modeling and control of motion in man and the higher animals, and are described by Gel'fand and Tsetlin [62], where the basis for the construction of models is provided by the techniques of the automatic search for the extremum of a function of many variables, in particular by the "ravine method" due to Gel'fand.*

* See p. 137 of the present volume (Editor's note).
I BEHAVIOR OF AUTOMATA IN RANDOM MEDIA 1 Behavior of Automata in Stationary Random Media
Assume that a deterministic automaton 𝔄 is described by its canonical equations

φ(t + 1) = Φ(φ(t), s(t + 1)),  (1)
f(t) = F(φ(t)).  (2)

In these equations the variable t represents time and is assumed to take on the integer values t = 1, 2, …. We shall assume that the input variable s(t) may take on only two values: 0 and 1. The value s = 0 will be called the "nonpenalty," and the value s = 1 the "penalty," of the automaton 𝔄. We shall further assume that the output variable f(t) of the automaton may take on x different values f_1, f_2, …, f_x. The values of the variable f(t) will be called the actions of the automaton, and we shall say that at the instant t the automaton 𝔄 has performed the αth action if f(t) = f_α, α = 1, 2, …, x. It is also assumed that the variable φ(t) may take on m different values φ_1, φ_2, …, φ_m. These values are called the states of the automaton, and the number m is said to be its memory capacity. Of course, m ≥ x. We shall say that the automaton 𝔄 is, at an instant t, in the jth state, j = 1, 2, …, m, if φ(t) = φ_j. The action f_α is said to correspond to a state φ_j if F(φ_j) = f_α. Equation (2) describes the relationship between the actions of the automaton and its states, and Eq. (1) describes changes in its state due to the action of the input variable s(t). The input variable assumes only two values, so that (1) specifies two mappings of the set of states of the automaton into itself; one of the mappings is given for s = 0, and the other for s = 1. The mappings will be conveniently written in the form of a special state matrix¹ ‖a_ij(s)‖, i, j = 1, 2, …, m. The state matrix for a deterministic automaton is simple, i.e., each row in this matrix for any fixed value of s contains exactly one element equal to unity, and the remaining elements are zero. The matrix ‖a_ij(s)‖ determines the transitions of states for a deterministic automaton in the following manner: if at the instant t the automaton is in state φ_i, then at t + 1 it will make a transition to the state φ_j such that a_ij(s(t + 1)) = 1.
For more details, see Tsetlin [150-152], where these matrices are used to construct deterministic automata out of real physical elements. [See also p. 226 (Editor’s note).]
The transitions of states for each value of the input variable s can be represented graphically by means of graphs of states. Each state φ_i of the automaton is juxtaposed with the vertex i of the graph of states, and each nonzero matrix element a_ij(s) is juxtaposed with an arrow directed from the vertex i to the vertex j. The simplicity of the matrix ‖a_ij(s)‖ implies here that exactly one arrow emanates from each vertex of the graph of states. For the automata considered in this study, the transitions of states are described by a pair of such graphs (for s = 0 and for s = 1).

We shall still need the concept of a stochastic automaton. A stochastic automaton also has a finite number of states φ_1, φ_2, …, φ_m and a finite number of actions f_1, f_2, …, f_x. Just as in the case of deterministic automata, described above, we shall assume that the input variable assumes only two values, s = 0 and s = 1. The actions of a stochastic automaton are uniquely determined by the state, f(t) = F(φ(t)), and the state matrices ‖a_ij(s)‖, s = 0, 1, are stochastic. Here a_ij(s) signifies the probability of a transition from the ith state to the jth state for a given value of the input variable s. Evidently, deterministic automata are a special case of stochastic automata. In what follows we shall simply speak of automata, having in mind both deterministic and stochastic automata.

Now we shall begin our discussion of the behavior of automata in stationary random media. We shall say that an automaton 𝔄 is in a stationary random medium C = C(a_1, a_2, …, a_x) if the actions of the automaton and the values of its input variable are related as follows: the action f_α, α = 1, 2, …, x, performed by the automaton at the instant t generates the value s = 1 (a penalty) at the instant t + 1 with the probability p_α = (1 - a_α)/2, and the value s = 0 (a nonpenalty) with the probability q_α = (1 + a_α)/2. We assume here that |a_α| ≤ 1.
Suppose that at the instant t the automaton was in state φ_i, i = 1, 2, …, m, to which there corresponds the action f_α = F(φ_i). Then the probability p_ij of the transition of the automaton from state φ_i to state φ_j is given by the formula

p_ij = p_α a_ij(1) + q_α a_ij(0).  (3)
It is not difficult to see that the matrix P = ‖p_ij‖ is stochastic. Thus, the functioning of the automaton in a stationary random medium is described by a Markov chain. In the cases of interest to us, this chain turns out to be ergodic, so that final probabilities of the automaton's states in a given medium exist and are independent of its initial state.
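This Markov chain is easy to construct numerically. The sketch below is ours, with hypothetical numbers: it builds the transition matrix from the two state matrices according to formula (3), finds the final probabilities by power iteration, and evaluates the expectation of a nonpenalty; the two-state automaton used as an example stays put on a nonpenalty and flips its state on a penalty.

```python
import numpy as np

def transition_matrix(a0, a1, F, a_env):
    """p_ij = p_alpha*a_ij(1) + q_alpha*a_ij(0), where f_alpha = F(phi_i),
    p_alpha = (1 - a_alpha)/2 (penalty), q_alpha = (1 + a_alpha)/2 (nonpenalty)."""
    m = len(F)
    P = np.zeros((m, m))
    for i in range(m):
        a = a_env[F[i]]
        P[i] = (1 - a) / 2 * a1[i] + (1 + a) / 2 * a0[i]
    return P

def expectation_W(P, F, a_env, iters=5000):
    """Final probabilities r_i by power iteration (the chain is assumed
    ergodic), then W = sum over states of a_{F(i)} * r_i."""
    r = np.full(len(F), 1.0 / len(F))
    for _ in range(iters):
        r = r @ P
    return float(sum(a_env[F[i]] * r[i] for i in range(len(F))))

# Two states, two actions: identity mapping for s = 0, a flip for s = 1.
a0 = np.eye(2)
a1 = np.array([[0.0, 1.0], [1.0, 0.0]])
F = [0, 1]
a_env = [0.2, -0.4]               # hypothetical medium C(a_1, a_2)
P = transition_matrix(a0, a1, F, a_env)
print(round(expectation_W(P, F, a_env), 5))   # -0.01818, above the mean -0.1
```

The printed value exceeds the mean (a_1 + a_2)/2 = -0.1, anticipating the notion of expedient behavior introduced below.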
Let r_i denote the final probability of the state φ_i of the automaton in a stationary random medium C. Let σ_α, α = 1, 2, …, x, designate the sum of the final probabilities of those states φ_i to which there corresponds the action f_α. The quantities σ_α signify the probabilities of the action f_α of automaton 𝔄 in the medium C. The mathematical expectation W(𝔄, C) of a nonpenalty for automaton 𝔄 in the medium C is given by

W(𝔄, C) = Σ_{α=1}^{x} a_α σ_α.  (4)

Obviously,

min(a_1, a_2, …, a_x) ≤ W(𝔄, C) ≤ max(a_1, a_2, …, a_x).  (5)

The expediency of the behavior of an automaton consists in increasing W. We shall say that an automaton 𝔄 behaves expediently in the medium C if⁵

W(𝔄, C) > (1/x)(a_1 + a_2 + ⋯ + a_x).  (6)

For an automaton performing its actions equiprobably and independently of the reactions of the medium, W = (1/x)(a_1 + a_2 + ⋯ + a_x).

Let us consider the simplest example of an automaton behaving expediently. Consider an automaton L_{2,2},⁶ possessing two states φ_1 and φ_2 and two actions f_1 = F(φ_1) and f_2 = F(φ_2). The automaton remains in the same state in case of a nonpenalty, and changes its state in case of a penalty. The state matrices have the form

‖a_ij(0)‖ = | 1 0 |,   ‖a_ij(1)‖ = | 0 1 |.
            | 0 1 |                | 1 0 |

Graphs of states are given in Fig. 1. Suppose that automaton L_{2,2} is in a medium C(a_1, a_2). Upon constructing, according to (3), the matrix of the transition probabilities

P = | q_1 p_1 |,
    | p_2 q_2 |
⁵ One of the problems in which M. L. Tsetlin was interested in connection with these considerations was: What portion of all stochastic automata with a given number of states, or what portion of some reasonably delimited class of such automata, consists of automata capable of expedient behavior? (Editor's note.)
⁶ Automaton L_{2,2} has the same logic of behavior as a static trigger with a counting input [35].
Figure 1
we arrive at the following equations for the final probabilities r_1 and r_2:

r_1 = q_1 r_1 + p_2 r_2,    r_2 = p_1 r_1 + q_2 r_2.

Using, in addition, the normalization condition r_1 + r_2 = 1, we have

r_1 = p_2/(p_1 + p_2),    r_2 = p_1/(p_1 + p_2).

We obtain the following expression for the mathematical expectation of a nonpenalty:

W(L_{2,2}, C) = (p_2 a_1 + p_1 a_2)/(p_1 + p_2).
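A quick numerical check of this closed form (our sketch, with arbitrary sample media):

```python
def W_L22(a1, a2):
    """W(L_{2,2}, C) = (p2*a1 + p1*a2)/(p1 + p2), with p_alpha = (1 - a_alpha)/2."""
    p1, p2 = (1 - a1) / 2, (1 - a2) / 2
    return (p2 * a1 + p1 * a2) / (p1 + p2)

for a1, a2 in [(0.6, 0.2), (0.3, -0.5), (-0.2, -0.8)]:
    # expedient behavior: W exceeds the mean (a1 + a2)/2 whenever a1 != a2
    assert W_L22(a1, a2) > (a1 + a2) / 2

print(round(W_L22(0.6, 0.2), 4))   # 0.4667, against a mean of 0.4
```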
It is easy to see that W(L_{2,2}, C) > (a_1 + a_2)/2 for a_1 ≠ a_2, i.e., that automaton L_{2,2} behaves expediently in a stationary random medium.

Those automata are of particular interest whose structure does not involve any information about the type of random medium in which they function. These automata, one might say, do not have the property of "a priori expediency" of behavior. An automaton 𝔄 will be called symmetric if the expected value of a nonpenalty, W(𝔄, C), in any stationary random medium C(a_1, a_2, …, a_x) is a symmetric function of a_1, a_2, …, a_x.

Let 𝔐 be the set of all states of an arbitrary automaton 𝔄, and let 𝔐_α be the set of those states φ in which the automaton performs the action f_α, i.e., F(φ) = f_α. An automorphism of an automaton 𝔄 is defined as a one-to-one mapping g of the set 𝔐 of its states onto itself such that the following conditions are satisfied:

1. The partition of the set 𝔐 into the system of sets 𝔐_α is preserved; in other words, if gφ ∈ 𝔐_β for some φ ∈ 𝔐_α, then for any φ′ ∈ 𝔐_α we have gφ′ ∈ 𝔐_β.
2. ‖a_ij(s)‖ = ‖a_{g(i)g(j)}(s)‖.
The automorphisms g of automaton 𝔄 evidently form a group G. Let G_0 denote the subgroup of G consisting of automorphisms mapping each of the sets
𝔐_α into itself. It is not difficult to see that G_0 is a normal subgroup of the group G. The quotient group G′ = G/G_0 is naturally realized as a group of automorphisms of the set of actions f_1, f_2, …, f_x. If G′ is transitive on this set, then it is natural to call the automaton homogeneous. For homogeneous automata, the sets 𝔐_α are isomorphic to one another. The mathematical expectation of a nonpenalty W(𝔄, C) of a homogeneous automaton 𝔄 in a stationary random medium C(a_1, a_2, …, a_x) is invariant with respect to transformations from the group G′. If the group G′ is symmetric, then the corresponding automaton is obviously symmetric. Section 2 gives examples of symmetric homogeneous automata for which the group G′ is cyclic.

2 Asymptotically Optimal Sequences of Symmetric Automata. The Book Stack Problem
In the previous section we gave a definition of the expedient behavior of automata in stationary random media. Obviously, the expectation W(𝔄, C) of a nonpenalty for an automaton in a medium C(a_1, a_2, …, a_x) does not exceed a_max = max(a_1, a_2, …, a_x). Naturally, there arises the question of the existence of automata for which W would approach a_max. A sequence of automata 𝔄_1, 𝔄_2, …, 𝔄_n, … will be called asymptotically optimal if

lim_{n→∞} W(𝔄_n, C) = a_max.  (7)
An automaton belonging to an asymptotically optimal sequence, if n is sufficiently large, performs almost exclusively the action for which the probability of a nonpenalty is maximum. We shall give several examples of asymptotically optimal sequences of symmetric automata.⁷
1. An automaton L_{2n,2} (an automaton with a linear tactic), which is a natural generalization of automaton L_{2,2}, has 2n states φ_1¹, φ_2¹, …, φ_n¹, φ_1², φ_2², …, φ_n² and two different actions f_1, f_2, where

F(φ_i^α) = f_α,  i = 1, 2, …, n,  α = 1, 2.

For a nonpenalty, i.e., when s = 0, states φ_i^α make a transition to φ_{i+1}^α, i = 1, 2, …, n - 1, and states φ_n^α make a transition into themselves. For

⁷ The description of automaton L_{2n,2} has been included in the text by us; it is taken from Tsetlin [155] (Editor's note).
a penalty, i.e., when s = 1, states φ_i^α pass into φ_{i-1}^α, i = 2, 3, …, n; state φ_1¹ becomes φ_1², and φ_1² becomes φ_1¹. Graphs of states of automaton L_{2n,2} are given in Fig. 2. Let us calculate the expectation W(L_{2n,2}, C), assuming that a_1, a_2 ≠ 0, ±1.
Figure 2
Computing the transition probabilities from (3), we obtain a system of equations for the final probabilities r_i^α of the states φ_i^α:

r_1^1 = p_1 r_2^1 + p_2 r_1^2,
r_i^1 = q_1 r_{i-1}^1 + p_1 r_{i+1}^1,  i = 2, 3, …, n - 1,
r_n^1 = q_1 r_{n-1}^1 + q_1 r_n^1,
r_1^2 = p_2 r_2^2 + p_1 r_1^1,
r_i^2 = q_2 r_{i-1}^2 + p_2 r_{i+1}^2,  i = 2, 3, …, n - 1,
r_n^2 = q_2 r_{n-1}^2 + q_2 r_n^2,

and the normalization condition is r_1^1 + ⋯ + r_n^1 + r_1^2 + ⋯ + r_n^2 = 1. The solution of each of the columns of equations is sought in the form r_i^α = λ_α^{i-1}. We obtain the following characteristic equations for the eigenvalues λ_1 and λ_2:

p_α λ² - λ + q_α = 0,  α = 1, 2.

Calculating from these equations

λ_1^{(1)} = λ_2^{(1)} = 1,  λ_1^{(2)} = q_1/p_1 = λ_1,  λ_2^{(2)} = q_2/p_2 = λ_2,

we write the solutions in the forms

r_i^α = A_α λ_α^{i-1} + B_α,  α = 1, 2.

Using the equations for r_n^1 and r_n^2, we see that B_1 = B_2 = 0. From the equations for r_1^1 and r_1^2, we have

p_1 A_1 = p_2 A_2.  (8)
Now let us evaluate the sums

σ_1 = Σ_{i=1}^{n} r_i^1 = A_1 Σ_{i=1}^{n} λ_1^{i-1} = A_1 (λ_1^n - 1)/(λ_1 - 1),
σ_2 = A_2 (λ_2^n - 1)/(λ_2 - 1).
The normalization condition σ_1 + σ_2 = 1 and Eq. (8) yield the values of the coefficients A_1, A_2, and thus the mathematical expectation W(L_{2n,2}, C) of a nonpenalty can be found, using Eq. (4), to be

W(L_{2n,2}, C) = [a_1 (p_1^n - q_1^n)/(p_1^n (p_1 - q_1)) + a_2 (p_2^n - q_2^n)/(p_2^n (p_2 - q_2))] / [(p_1^n - q_1^n)/(p_1^n (p_1 - q_1)) + (p_2^n - q_2^n)/(p_2^n (p_2 - q_2))].
It is important to note that W(L_{2n,2}, C) is an increasing function of the memory capacity n, and that for max(a_1, a_2) > 0,

W = lim_{n→∞} W(L_{2n,2}, C) = max(a_1, a_2).
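Both the closed form and this limiting behavior can be verified numerically. In the sketch below (ours, with hypothetical a_1, a_2) the closed-form value is compared against the final probabilities computed directly from the transition matrix of L_{2n,2}, and the growth of W with n is checked.

```python
import numpy as np

def W_closed(a1, a2, n):
    """Closed-form W(L_{2n,2}, C), as derived above (a1, a2 != 0, +-1)."""
    p1, q1 = (1 - a1) / 2, (1 + a1) / 2
    p2, q2 = (1 - a2) / 2, (1 + a2) / 2
    u1 = (p1**n - q1**n) / (p1**n * (p1 - q1))
    u2 = (p2**n - q2**n) / (p2**n * (p2 - q2))
    return (a1 * u1 + a2 * u2) / (u1 + u2)

def W_direct(a1, a2, n, iters=20000):
    """W from the stationary distribution of the 2n-state chain.
    States 0..n-1 are phi_1^1..phi_n^1, states n..2n-1 are phi_1^2..phi_n^2."""
    p = [(1 - a1) / 2, (1 - a2) / 2]
    q = [(1 + a1) / 2, (1 + a2) / 2]
    P = np.zeros((2 * n, 2 * n))
    for b in (0, 1):
        off = b * n
        for i in range(n):
            P[off + i, off + min(i + 1, n - 1)] += q[b]   # nonpenalty: go deeper
            if i == 0:
                P[off, (1 - b) * n] += p[b]               # penalty at phi_1: switch action
            else:
                P[off + i, off + i - 1] += p[b]           # penalty: back up one state
    r = np.full(2 * n, 1.0 / (2 * n))
    for _ in range(iters):
        r = r @ P
    s1 = r[:n].sum()
    return a1 * s1 + a2 * (1 - s1)

for n in (1, 2, 5):
    assert abs(W_closed(0.5, 0.1, n) - W_direct(0.5, 0.1, n)) < 1e-9
Ws = [W_closed(0.5, 0.1, n) for n in (1, 2, 3, 4, 10, 50)]
assert all(x < y for x, y in zip(Ws, Ws[1:]))   # W grows with the memory n
assert abs(Ws[-1] - 0.5) < 1e-3                 # and tends to max(a1, a2)
```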
This relation means that automaton L_{2n,2}, for a sufficient memory capacity n, performs almost exclusively the action for which the probability of a penalty is at a minimum.

2. Automata with a linear tactic L_{xn,x}.⁸ These automata have xn states φ_i^α, α = 1, 2, …, x, i = 1, 2, …, n. To a state φ_i^α there corresponds the action f_α. The transitions of states, depending on the value of the input variable s, are effected as follows: for s = 0 (a nonpenalty), state φ_i^α passes into state φ_{i+1}^α for i = 1, 2, …, n - 1; state φ_n^α passes into itself. For s = 1 (a penalty), state φ_i^α passes into state φ_{i-1}^α for i = 2, 3, …, n. State φ_1^α passes into state φ_1^{α+1}, α = 1, 2, …, x - 1. State φ_1^x passes into state φ_1^1. Graphs of states for an automaton with a linear tactic are shown in Fig. 3. Assuming that an automaton L_{xn,x} is in a stationary random medium C(a_1, a_2, …, a_x) and using (3) and (4), we can find W(L_{xn,x}, C). We get
⁸ Electronic models of automata with a linear tactic were constructed and experimentally studied in the diploma dissertation of Buylov [35].
W(L_{xn,x}, C) = [Σ_{α=1}^{x} a_α (p_α^n - q_α^n)/(p_α^n (p_α - q_α))] / [Σ_{α=1}^{x} (p_α^n - q_α^n)/(p_α^n (p_α - q_α))].  (9)

Equation (9) implies that a sequence of automata L_{xn,x} is asymptotically optimal for stationary random media C(a_1, a_2, …, a_x) under the condition that a_max = max(a_1, a_2, …, a_x) ≥ 0. In fact, in those media

lim_{n→∞} W(L_{xn,x}, C) = a_max.
In media in which the condition of nonnegativity of a_max does not hold,

W = [(1/x)(a_1^{-1} + a_2^{-1} + ⋯ + a_x^{-1})]^{-1},

i.e., it coincides with the harmonic mean of a_1, a_2, …, a_x.

3. Automata D_{xn,x}.⁹ These automata also have xn states φ_i^α, α = 1, 2, …, x, i = 1, 2, …, n. To a state φ_i^α there corresponds the action f_α. For s = 0 (a nonpenalty), φ_i^α passes into the state φ_n^α, i = 1, 2, …, n. For s = 1 (a penalty), the transitions of states are effected in the same way as in automata with a linear tactic. Graphs of states for automata D_{xn,x} are given in Fig. 4. Calculating, as in the previous examples, the expectation of a nonpenalty, assuming that the automata are in a stationary random medium C(a_1, a_2, …, a_x), we obtain

W(D_{xn,x}, C) = [Σ_{α=1}^{x} a_α p_α^{-n}] / [Σ_{α=1}^{x} p_α^{-n}].
⁹ The construction of the automata D_{xn,x} was proposed by Krinskiy [102].
Figure 4
It is not difficult to check that a sequence of automata D_{xn,x} is asymptotically optimal in an arbitrary stationary random medium.

4. A construction of the automata K_{xn,x} was proposed by Krylov [112]. Stochastic automata K_{xn,x} have xn states φ_i^α, i = 1, 2, …, n, α = 1, 2, …, x; F(φ_i^α) = f_α. For s = 0, the transitions of states occur in the same way as in the case of automata with a linear tactic, L_{xn,x}. For s = 1, state φ_i^α passes with equal probabilities ½ into the states φ_{i+1}^α and φ_{i-1}^α, i = 2, 3, …, n - 1. State φ_n^α passes with probabilities ½ into the states φ_n^α and φ_{n-1}^α. States φ_1^α pass, also with equal probabilities ½, into the states φ_2^α and φ_1^{α+1}, α = 1, 2, …, x - 1; state φ_1^x passes into the states φ_2^x and φ_1^1. For automaton K_{xn,x}

W(K_{xn,x}, C) = [Σ_{α=1}^{x} a_α ((1 + q_α)^n - p_α^n)/(q_α p_α^n)] / [Σ_{α=1}^{x} ((1 + q_α)^n - p_α^n)/(q_α p_α^n)].
It is easy to see that automata K_{xn,x} also form an asymptotically optimal sequence in all stationary random media.

5. An interesting construction of asymptotically optimal sequences of finite automata was proposed by Ponomarev [128]. These automata were named automata with a comparing tactic, V_{xn,x}. A graph of states for an automaton V_{2n,2} having only two actions is shown in Fig. 5. In this figure the solid circles indicate those states in which the action f_1 is performed, and the open circles correspond to the action f_2. This automaton has 2n = 2m + 2l states φ_i^α (α = 1, 2, i = 1, 2, …, n); F(φ_i^α) = f_α. The states φ_1^1, …, φ_l^1, φ_1^2, …, φ_l^2, belonging to the horizontal "comparing" part of the automaton, are related to one another in such a way that a change of actions of the automaton occurs independently of whether the action was followed by a nonpenalty s = 0 or a penalty s = 1.
However, in case of a nonpenalty meted out after the action f_1 (or, correspondingly, in case of a penalty after the action f_2), the state of the automaton changes to the neighboring left state. In case of a penalty after the action f_1 (or, correspondingly, in case of a nonpenalty after the action f_2), the state of the automaton is replaced by the neighboring right state. In these states automaton V_{2n,2} makes, as it were, the choice of the necessary action. In this it differs from automata with a linear tactic L_{2n,2}, where the change of actions occurs only in the two states φ_1^1 and φ_1^2.
Figure 5
and q$+l ,. . . ,I$,+~ states are related to each other The vi+l,. . . , in the same way as in the automaton with a linear tactic. However, state vI2 (or, respectively, vll) passes, for s = 1 (a penalty), into the extreme state vh+l(or, respectively, I&+~) of the line. This assures a multiple repetition of the corresponding action even in those cases in which the probability of a penalty is considerably greater than t . The above construction of automata can naturally be generalized to the case of an arbitrary number x of actions. Without giving the somewhat cumbersome formula for the mathematical expectation of a nonpenalty, we shall note only that a sequence of symmetric automata V,,,, is asymptotically optimal in all stationary random media. The foregoing examples of asymptotically optimal sequences of automata possess one general property; for each of these sequences, the automaton %, in each of the sets W, of its states has a state vnasuch that it cannot be derived from the W, with a sequence of input signals having length less
than n. (We remember that here, as before, 𝔐_α denotes the set of all automaton states in which the action f_α is performed.) Let φ be some state of a deterministic automaton 𝔄, and let F(φ) = f_α, i.e., φ ∈ 𝔐_α. The depth d(φ) of a state φ will be defined as the minimum length of a sequence of input signals bringing the automaton out of the set 𝔐_α. The depth d(𝔄) of an automaton will be defined as the greatest of the depths of its states. We recall that for homogeneous automata the sets 𝔐_α are pairwise isomorphic. In order for a sequence of deterministic automata 𝔄_1, …, 𝔄_n, … to be asymptotically optimal, it is necessary that

lim_{n→∞} d(𝔄_n) = ∞.
Let us prove this assertion. Consider for simplicity the case x = 2. Let σ_α be the total probability of the states belonging to the set 𝔐_α. Furthermore, let u_k denote the total final probability of the set of states φ ∈ 𝔐_1 having depth k. Suppose that the automaton is in a stationary random medium C(a_1, a_2). A state of depth k can be taken out of 𝔐_1 only by a sequence of at least k input signals, each of which is realized with probability no smaller than μ_1 = min(p_1, q_1), where p_1 = 1 - q_1 = (1 - a_1)/2. It is then obvious that

σ_2 ≥ μ_1^k u_k,  k = 1, 2, …, d_1.  (14)
Similarly we can show that

σ_1 ≥ μ_2^k v_k,  k = 1, 2, …, d_2,  (15)

where μ_2 = min(p_2, q_2) and v_k is the total final probability of the set of states of 𝔐_2 having depth k. If the depths of the automata remain bounded, these inequalities imply that the ratio σ_1/σ_2 is bounded away from zero and from infinity; hence the expectation of a nonpenalty W(𝔄_n, C) = a_1 σ_1 + a_2 σ_2, σ_1 + σ_2 = 1, cannot exceed some value a′ < a_max = max(a_1, a_2). Thus, a_max is not attained by automata with a bounded depth. The possibility is not excluded that this property (perhaps with some additional conditions) is also sufficient for the corresponding automata to form an asymptotically optimal sequence.
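The depth d(φ) is easy to compute for any deterministic automaton by a shortest-path relaxation over its two transition mappings. The sketch below is ours; it confirms that the automaton with a linear tactic L_{2n,2} has depth n.

```python
def depths(t0, t1, F):
    """d(phi_i): minimum length of an input sequence (each symbol 0 or 1)
    that takes the automaton out of the set M_alpha containing phi_i.
    t0[i], t1[i] are the next states on inputs s = 0 and s = 1; F[i] is the action."""
    m = len(F)
    INF = float('inf')
    d = [INF] * m
    changed = True
    while changed:                 # Bellman-Ford style relaxation to a fixpoint
        changed = False
        for i in range(m):
            best = min(1 if F[t[i]] != F[i] else 1 + d[t[i]] for t in (t0, t1))
            if best < d[i]:
                d[i] = best
                changed = True
    return d

def linear_tactic(n):
    """Transition tables of L_{2n,2}: states 0..n-1 perform f_1, states n..2n-1 perform f_2."""
    F = [0] * n + [1] * n
    t0 = [min(i + 1, n - 1) for i in range(n)] + [n + min(i + 1, n - 1) for i in range(n)]
    t1 = [n] + [i - 1 for i in range(1, n)] + [0] + [n + i - 1 for i in range(1, n)]
    return t0, t1, F

t0, t1, F = linear_tactic(4)
d = depths(t0, t1, F)
print(d)         # [1, 2, 3, 4, 1, 2, 3, 4]
print(max(d))    # the depth of the automaton: n = 4
```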
We note that all the automata just described admit a natural numbering of the states belonging to the same set 𝔐_α (and thus they are, in this sense, "linear"). Apparently, this property is characteristic of constructions that are most economical as to the number of states. Milyutin [123] studied the question of optimal constructions of automata, i.e., those constructions of automata that assure the maximum sum of the expectations of a nonpenalty, W, in two stationary random media C_1(a_1, a_2) and C_2(a_2, a_1) for a fixed memory capacity. Using a method (proposed by him and Dubovitskiy [88, 89]) of solving extremum problems in the presence of constraints, Milyutin gave an upper estimate for the probability of a nonpenalty in both media for automata with a fixed memory n. In particular, for automata functioning equally well in the media C_1(a_1, a_2) and C_2(a_2, a_1), we have the following estimate for the medium C_1:

1 - W ≥ (a_2/(2a_1)) ((q_2/q_1)(p_1/p_2))^{-(n-1)},  (17)

where p_1 > p_2, p_i = 1 - q_i = (1 - a_i)/2, i = 1, 2.
Milyutin exhibited examples of automata for which this estimate is achieved asymptotically. In particular, he proved that automaton L_{2n,2} is close to optimal in the media in which at least one of the numbers a_1 and a_2 is close to 1, and described a construction of automata, close to optimal in all media, which largely coincide with the automata with a comparing tactic proposed by Ponomarev.

In the final part of this section we shall give an example of a system whose functioning is largely similar to the behavior of automata in stationary random media and possesses a naturally defined expediency. Suppose there are n books K_1, K_2, …, K_n lying in a stack on a desk. The books in the stack may be used in various ways; for example, one can choose the desired book and put it back in the same location after use. It is, however, often preferred not to search for the original location of the chosen book, but simply to put it on top of the stack. We shall now show that this method possesses a certain expediency: each book is located, on the average, the higher, the more frequently it is used. We shall calculate the probabilities of an arbitrary ordering of books in the stack in terms of the probabilities p_k, 1 ≤ k ≤ n, of their being used. We assume that at each instant of time t = 1, 2, …, the kth book is taken from the stack with the probability p_k (independent of the location of the books at preceding instants of time) and is put on top of the stack. Of course, p_1 + p_2 + ⋯ + p_n = 1. The location of books in the stack is given by one of the permutations
(i_1, i_2, …, i_n) of the indices 1, 2, …, n. The choice of the kth book changes the permutation (i_1, …, i_{l-1}, k, i_{l+1}, …, i_n) into the permutation (k, i_1, …, i_{l-1}, i_{l+1}, …, i_n). The stack of books may be considered as a finite automaton having n! states, each corresponding to a certain ordering (i_1, …, i_n) of the books. This automaton has n values of the input variable, the kth of which, k = 1, 2, …, n, corresponds to the choice of the kth book. The process of redistributing books in the stack is described, obviously, by a finite Markov chain. We shall assume that 0 < p_k < 1, k = 1, 2, …, n; then the chain is ergodic. Let r_{i_1,…,i_n} denote the final probability of the permutation (i_1, …, i_n). To determine r_{i_1,…,i_n} we have the following system of n! equations:

r_{i_1,i_2,…,i_n} = p_{i_1} Σ_{l=1}^{n} r_{i_2,…,i_l,i_1,i_{l+1},…,i_n},  (18)

where the sum is taken over the n permutations obtained by inserting the index i_1 into each possible position of the permutation (i_2, …, i_n); the normalization condition is

Σ r_{i_1,…,i_n} = 1.

Using mathematical induction with respect to n, one can show that the following inequality is true:

r_{i_1,…,i_l,i_{l+1},…,i_n} ≥ r_{i_1,…,i_{l+1},i_l,…,i_n}   if   p_{i_l} ≥ p_{i_{l+1}}.  (19)

Using Eqs. (18), one can calculate the probabilities r_i(l) that the book K_i is located in the lth position in the stack, and thus the mean depth d_i of the location of this book in the stack is

d_i = Σ_{l=1}^{n} l r_i(l).
The inequalities (19) easily imply that d_i ≤ d_k for p_i ≥ p_k, i.e., of each pair of books in the stack, the book which is used more often is located higher (on the average). The foregoing considerations explain, perhaps, the dissatisfaction of a person who finds that the books on his desk have been put in order by someone else.
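The expediency of the "put it on top" rule can be confirmed by computing the final probabilities exactly for a small stack. The sketch below (ours, with hypothetical usage probabilities) builds the n!-state Markov chain described above and checks that the more frequently used books have smaller mean depths.

```python
import numpy as np
from itertools import permutations

def mean_depths(p, iters=5000):
    """Mean position (1 = top of the stack) of each book under the rule
    'the chosen book is put on top', from the exact chain over all n! orderings."""
    n = len(p)
    states = list(permutations(range(n)))
    idx = {s: j for j, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for s in states:
        for k in range(n):
            t = (k,) + tuple(i for i in s if i != k)   # book k moves to the top
            P[idx[s], idx[t]] += p[k]
    r = np.full(len(states), 1.0 / len(states))
    for _ in range(iters):                             # power iteration to the final probabilities
        r = r @ P
    return [sum(r[idx[s]] * (s.index(i) + 1) for s in states) for i in range(n)]

d = mean_depths((0.5, 0.3, 0.2))
print([round(x, 3) for x in d])   # [1.661, 2.025, 2.314]
assert d[0] < d[1] < d[2]         # the more a book is used, the higher it lies
```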
3 Behavior of Automata in Composite Media

In the preceding sections we discussed the behavior of automata in stationary random media whose probabilistic properties were not assumed to be known beforehand. A priori knowledge of the constants a₁, a₂, …, a_x defining a stationary random medium C(a₁, a₂, …, a_x) makes it pointless to construct an automaton capable of performing, depending on its state, a number of different actions: in this case one could construct an automaton having only one state and one action (corresponding to the largest of the a_i) and nevertheless possessing the maximum possible expediency. For stationary random media the length of the transient process plays no important role, since the time of functioning in a stationary medium is assumed to be infinite.

In this section we consider the behavior of an automaton in a medium whose properties change in a random manner. Then, even if the possible probabilistic characteristics of the medium were known, it would be impossible to construct an automaton with a single action capable of expedient behavior. The transient period becomes important here: in such media the automaton must, so to speak, continuously "relearn," and an increase in the "relearning" time lowers the expediency of the automaton's behavior. If in stationary random media the asymptotically optimal sequences of automata discussed above (e.g., automata with a linear tactic) are typically characterized by a monotonic increase of the expectation of a nonpenalty with increasing memory capacity, then in nonstationary media this relationship is no longer monotonic.

The dependence of the probabilistic properties of the medium on time will be specified in a special way: we shall consider that the medium in which the automaton is located consists of stationary random media whose switching is determined by a Markov chain. Thus, consider a Markov chain K(C⁽¹⁾, …, C⁽ⱽ⁾, Δ) having V states C⁽¹⁾, …, C⁽ⱽ⁾ and a matrix of transition probabilities Δ = || δ_{αβ} ||, α, β = 1, 2, …, V. The state C⁽ᵅ⁾ corresponds to the stationary random medium C⁽ᵅ⁾ = C(a₁ᵅ, a₂ᵅ, …, a_xᵅ). We shall say that an automaton 𝔄 is in a composite medium K if at each instant of time it is located in one of the media C⁽ᵅ⁾, α = 1, 2, …, V, i.e., if its actions and the values of the input variable are related in the manner described above for stationary random media, and if, moreover, whenever at the instant t the automaton is in a medium C⁽ᵅ⁾, then at the instant t + 1 it will be in the medium C⁽ᵝ⁾ with probability δ_{αβ}.
Let Ψ_i^(β), β = 1, …, V, i = 1, 2, …, m, denote the state of the system automaton-composite medium in which the automaton is in state φ_i and the composite medium is in state C^(β). Then the probability π_ij^(β)(γ) of a transition of the system from state Ψ_i^(β) into state Ψ_j^(γ) can be expressed by the equation

π_ij^(β)(γ) = [p_i^(β) a_ij(1) + q_i^(β) a_ij(0)] δ_{βγ},   (20)

where || a_ij(s) || is the matrix of states of automaton 𝔄, and

p_i^(β) = (1 − a_{f_i}^(β))/2   and   q_i^(β) = (1 + a_{f_i}^(β))/2

are the probabilities of a penalty and a nonpenalty, respectively, in the medium C^(β) for the action f_i = F(φ_i). The matrix Π = || π_ij^(β)(γ) ||, β, γ = 1, …, V, i, j = 1, …, m, generates a finite Markov chain. If this chain is ergodic, then the final probabilities r_i^(β) of the states Ψ_i^(β) of the system do not depend on its initial state, and the expectation of a nonpenalty, W(𝔄, K), for the automaton in the medium K can be found from the equation

W(𝔄, K) = Σ_β Σ_i a_i^(β) σ_i^(β).   (21)
In this expression σ_i^(β) signifies the total probability of those states of the system in which the automaton performs the action f_i and the composite medium is in state C^(β). We shall limit ourselves to the simplest case in which V = 2 and

Δ = || 1 − δ     δ   ||
    ||   δ     1 − δ ||,   (22)
where the parameter δ represents the mean frequency of the switching of the states of the composite medium.¹⁰ Let us further assume for the sake of simplicity that the number x of actions of the automaton is 2, and that the two media differ only in the sign of a:

C⁽¹⁾ = C(a, −a),   C⁽²⁾ = C(−a, a).   (22′)
Under these assumptions we shall indicate the procedure for computing the expectation W(L_{2n,2}, K) for an automaton with a linear tactic (see Section 2). That δ is indeed the mean switching frequency is easy to see by calculating the mean number of clock intervals during which the state of the medium remains constant:

T = Σ_{m=1}^{∞} m δ(1 − δ)^{m−1} = 1/δ.
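The identity T = Σ m δ(1 − δ)^{m−1} = 1/δ can be verified directly; the following sketch (an illustration, with an arbitrary δ = 0.1) checks both the truncated series and a simulated mean dwell time:

```python
import random

def mean_dwell_time(delta, switches=20_000, seed=5):
    """Mean number of clock intervals during which the medium state stays
    constant, when the state is left with probability delta at each step."""
    rng = random.Random(seed)
    total = 0
    for _ in range(switches):
        m = 1
        while rng.random() >= delta:    # remain with probability 1 - delta
            m += 1
        total += m                      # geometrically distributed dwell time
    return total / switches

# Truncated series  sum_m m * delta * (1 - delta)**(m - 1)  equals 1/delta:
series = sum(m * 0.1 * 0.9 ** (m - 1) for m in range(1, 2000))
assert abs(series - 10.0) < 1e-6
assert abs(mean_dwell_time(0.1) - 10.0) < 0.5
```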
For the case x = 2, we shall enumerate the states of an automaton with a linear tactic in the following way: φ_i¹ = φ_i, φ_i² = φ_{n+i}, i = 1, 2, …, n. Let r_i^(α) denote the final probability of state Ψ_i^(α) of the system automaton-composite medium. We introduce the vector R_i = (r_i^(1), r_i^(2)), i = 1, …, 2n. As before, let p = (1 − a)/2, q = (1 + a)/2, λ = p/q. We shall make use of Eq. (20) for the matrix of the transition probabilities of the system and the definition || a_ij(s) || of an automaton with a linear tactic. Then we obtain a system of equations (23) for the final probabilities, whose first equation is

R₁ = S R₂ + Q R_{n+1}.
In this system S and Q are second order matrices:
The solution of this system will be assumed to have the form R_k = R₀ μ^{k−1}, k = 1, 2, …, n, and R_k = R₀ μ^{k−1}, k = n + 1, …, 2n, where R₀ = (r^(1), r^(2)) is a constant vector. To determine the eigenvalues μ and the eigenvectors, we obtain the characteristic equation

det(μ²Q − μE + S) = 0,   (25)

in which E is the unit second-order matrix. From this equation
we obtain a fourth-degree equation in μ, with coefficients depending on a and δ. Solving this equation, we get μ₁ = μ₂ = 1; μ₃ and μ₄ are the roots of the quadratic equation (27).
Upon finding the eigenvectors from (25), we write the solution of the system (23) in the form

R_k = A R_k^(1) + B R_k^(2) + C R_k^(3) + D R_k^(4),   k = 1, 2, …, n,   (28)

where R_k^(1) = [1; 1] and, as before, λ = p/q.
The coefficients A, B, C, D in (28) are found from the boundary equations for R₁ and R_{n+1} in the system (23). Then we find the expressions for R_k = (r_k^(1), r_k^(2)) for k = 1, 2, …, n. In these expressions d is the normalization constant, and x is either root of the quadratic equation (27). The expressions for R_{n+k} = (r_{n+k}^(1), r_{n+k}^(2)) are found from the equalities (31).
Upon using Eq. (21), we arrive at an expression for the expectation of a nonpenalty for an automaton with a linear tactic, L_{2n,2}, in the composite medium K specified by formulas (22) and (22′); it has the form (32), in which the memory capacity n enters through hyperbolic functions cosh nγ, sinh nγ of a parameter γ, where cosh γ is determined by λ and δ.
Figure 6 presents the plots of M = (1 − W)/2 versus the memory capacity n of automaton L_{2n,2} for various values of δ and fixed a; Figure 7 gives the same plots for fixed δ = 0.01 and various values of p. It is easy to see that the expectation of a nonpenalty is nonnegative and vanishes at n = 0 and as n → ∞; consequently, it attains a maximum m for some finite value n₀ of the memory capacity. The lowering of W for small values of n is related to the fact that in this case the information about the state of the medium in which the automaton is located is not used to its full extent. When the memory capacity is enormously increased there occurs, one might say, an averaging of the statistical properties of both states of the composite medium (the automaton "fails to relearn"). A decrease in the switching frequency is equivalent to an increase in the response speed of the automaton. It is natural, then, that with decreasing δ the value of n₀, as well as the maximum value m of the expectation of a nonpenalty, should both increase. When the mean switching frequency δ tends to zero, n₀ → ∞ and m → max(a, −a). With an increase in δ the reverse process takes place. Thus, for
(1 − 2δ)/[δ(1 − δ)] ≤ (λ + 1)²/λ,
the maximum expectation of a nonpenalty is attained at n₀ = 1. Equation (32) makes it possible to select those automata with a linear tactic that possess the most expedient behavior in a given composite medium. To select the memory capacity of such automata one can use Table I, where the values of n₀ and m are listed for various a and δ. Each entry in the table is a pair of numbers: the first is n₀ and the second is m. We note that the value of W as a function of n and δ can serve as a measure of the distinguishability of the random media C⁽¹⁾ and C⁽²⁾ for automata with a linear tactic.

TABLE I (each entry: n₀; m)

  a \ δ     0.001       0.010      0.032      0.100      0.32       0.45
  0.8      3; 0.792   2; 0.744   2; 0.672   1; 0.512   1; 0.230   1; 0.064
  0.6      5; 0.588   3; 0.532   2; 0.446   2; 0.314   1; 0.130   1; 0.036
  0.5      6; 0.488   4; 0.424   3; 0.344   2; 0.232   1; 0.090   1; 0.024
  0.33     8; 0.306   5; 0.250   3; 0.182   2; 0.110   1; 0.040   1; 0.012
  0.2     11; 0.178   6; 0.112   4; 0.074   2; 0.040   1; 0.014   1; 0.004
  0.1     15; 0.072   7; 0.034   4; 0.020   2; 0.010   1; 0.004   1; 0.002
Interesting experiments, in which the behavior of a human being was compared with the behavior of automata in stationary and composite random media, are described in the paper by Alekseyev et al. [2].
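The nonmonotonic dependence of M = (1 − W)/2 on the memory capacity can be reproduced numerically. The sketch below is an illustration, not part of the original text; it assumes the standard linear-tactic transitions (a nonpenalty drives the automaton one state deeper into its current action, a penalty drives it one state back, with a change of action at the boundary state), builds the exact chain of formula (20) for the composite medium (22), (22′), and finds the stationary distribution by power iteration:

```python
def stationary_M(n, a=0.8, delta=0.01, iters=2500):
    """M = (1 - W)/2 for the linear-tactic automaton L_{2n,2} in the
    composite medium C(a, -a) / C(-a, a) with switching probability delta."""
    # system state: (action, depth 1..n, medium); all transitions are exact
    states = [(act, d, med) for act in (0, 1)
              for d in range(1, n + 1) for med in (0, 1)]
    idx = {s: i for i, s in enumerate(states)}
    size = len(states)
    P = [[0.0] * size for _ in range(size)]
    for (act, d, med) in states:
        i = idx[(act, d, med)]
        a_eff = a if act == med else -a       # the action's "character" here
        p_pen = (1 - a_eff) / 2               # penalty probability
        for med2, pm in ((med, 1 - delta), (1 - med, delta)):
            # nonpenalty: one state deeper (capped at depth n)
            P[i][idx[(act, min(d + 1, n), med2)]] += (1 - p_pen) * pm
            if d > 1:                         # penalty: one state back
                P[i][idx[(act, d - 1, med2)]] += p_pen * pm
            else:                             # penalty at the boundary: switch
                P[i][idx[(1 - act, 1, med2)]] += p_pen * pm
    r = [1.0 / size] * size
    for _ in range(iters):                    # power iteration to stationarity
        r = [sum(r[i] * P[i][j] for i in range(size)) for j in range(size)]
    W = sum(r[idx[s]] * (a if s[0] == s[2] else -a) for s in states)
    return (1 - W) / 2

# The expectation of a nonpenalty peaks at a small finite memory capacity:
M1, M2, M8 = stationary_M(1), stationary_M(2), stationary_M(8)
assert M2 < M1 and M2 < M8
```

Running this for n = 1, …, 8 reproduces qualitatively the curves of Figs. 6 and 7: too little memory wastes information about the medium, too much memory "fails to relearn."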
4 Behavior of Automata with an Evolving Structure in Random Media
In our study of the behavior of finite automata in random media we have thus far assumed that their structure remains constant. We recall that the structure of an automaton is given by the matrices || a_ij(s) ||, s = 0, 1, specifying the transitions among the states φ(t) ∈ {φ₁, φ₂, …, φ_m} of the automaton for the various values of the input variable, and by the equation f(t) = F(φ(t)), specifying the actions f(t) ∈ {f₁, f₂, …, f_x} of the automaton depending on its state. Varshavskiy and Vorontsova [43, 44] studied the behavior of stochastic automata with an evolving structure by modeling them on digital computers. For automata of this type the state matrices themselves also change, depending on the values of the input variable.
The modification of the state matrices occurs in the following fashion. Suppose that at the instant t, under the influence of the input variable s(t), state φ_i passes into state φ_j, and then (at the instant t + 1) the input variable assumes the value s(t + 1) = 0 (a nonpenalty) or s(t + 1) = 1 (a penalty). Then the value of the transition probability a_ij(t, s(t)) increases in the case s(t + 1) = 0 and decreases in the case s(t + 1) = 1, and the remaining elements a_ik(t, s(t)), k ≠ j, of the row change in such a way as to preserve the stochasticity of the matrix, i.e., to preserve the condition

Σ_{k=1}^{m} a_ik(t + 1, s(t)) = 1.

The remaining rows of the matrix remain unchanged. We note that at every instant of time only one of the matrices || a_ij(t, s) ||, s = 0, 1, undergoes a change, namely, the one corresponding to the value of s equal to s(t). We consider the following method of forming the structure of an automaton A_{m,x} having x actions f₁, f₂, …, f_x and m = nx states φ₁, φ₂, …, φ_m:

F(φ_{un+v}) = f_{u+1},   u = 0, …, x − 1;   v = 1, …, n,

a_ij(t + 1, s(t)) = a_ij(t, s(t)) + (−1)^{s(t+1)} g a_ij(t, s(t))(1 − a_ij(t, s(t))),
a_ik(t + 1, s(t)) = a_ik(t, s(t)) − (−1)^{s(t+1)} g a_ik(t, s(t)) a_ij(t, s(t)),   k ≠ j.   (33)

In these formulas 0 ≤ g < 1; it is not hard to check that the matrix remains stochastic. The behavior of automata with an evolving structure in random media can be described by a nonhomogeneous Markov chain. For automata whose structures evolve as described by Eqs. (33) there exist stationary values of the transition probabilities, and one can speak of the final probabilities of the states of the system. In particular, in the discussion of the behavior of this automaton in stationary random media it was shown that
i.e., the simplest automaton with an evolving structure having only two states is equivalent to an automaton with a linear tactic having an infinite number of states.
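Rule (33) is easy to state in code. The sketch below is illustrative (g = 0.2 is an arbitrary admissible choice); it applies the update repeatedly to one row of a state matrix of A_{8,2}, starting from the uniform row 1/8, and checks that the row remains stochastic:

```python
import random

def update_row(row, j, s_next, g=0.2):
    """Update rule (33): after a transition i -> j, reinforce a_ij on a
    nonpenalty (s_next = 0) and weaken it on a penalty (s_next = 1),
    changing the other entries so that the row remains stochastic."""
    sign = 1 if s_next == 0 else -1          # this is (-1)**s(t+1)
    new = row[:]
    new[j] = row[j] + sign * g * row[j] * (1 - row[j])
    for k in range(len(row)):
        if k != j:
            new[k] = row[k] - sign * g * row[k] * row[j]
    return new

rng = random.Random(3)
row = [1 / 8] * 8                            # initial A_{8,2} row: all 1/8
for _ in range(1000):
    j = rng.randrange(8)
    row = update_row(row, j, rng.randrange(2))
    assert abs(sum(row) - 1.0) < 1e-9        # stochasticity is preserved
    assert all(0.0 <= x <= 1.0 for x in row)
```

The preserved row sum is exactly the algebraic identity noted in the text: the increment of a_ij is cancelled by the total decrement of the other entries of the row.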
An analytic study of the behavior of automata with an evolving structure in composite random media is very involved. The work by Varshavskiy and Vorontsova [43] contains such a study (done by modeling on a digital computer) of the behavior of the system: automaton with an evolving structure-composite medium. In one of these experiments a model was made of the behavior of automaton A_{8,2}. This automaton has eight states φ₁, …, φ₈ and two actions f₁, f₂. In states φ₁, …, φ₄ the action f₁ is performed, and in states φ₅, …, φ₈ the action f₂. The structure of the automaton evolves according to (33). At the initial moment t = 0 the state matrices for s = 0 and s = 1 coincide, all of their elements being equal to 1/8: a_ij(0, 0) = a_ij(0, 1) = 1/8, i, j = 1, …, 8. Thus the behavior of the initial automaton is obviously not expedient. The behavior of this automaton was studied in the medium K(C⁽¹⁾, C⁽²⁾, Δ), where C⁽¹⁾ = C(a, −a), C⁽²⁾ = C(−a, a), and Δ has the form (22). A measure of the expediency of behavior is provided by the mean payoff during time T:

W(A_{8,2}, K, T) = (1/T) Σ_{t=1}^{T} (−1)^{s(t)}.

These experiments have shown that, apparently,

lim_{T→∞} W(A_{8,2}, K, T) = m,

i.e., for sufficiently large time intervals T the mean payoff for an automaton with an evolving structure approaches, from below, the value of the expectation of a nonpenalty for automata with a linear tactic and optimal memory capacity. (Here those media were considered for which n₀ ≤ 4.) The value of the constant g has little effect on the evolution of the structure. Thus, Fig. 8 shows the plot of the function M(T) = ½(1 − W(A_{8,2}, K, T)) versus T for a = 0.8, δ = 0.01. For such a medium the optimal automaton with a linear tactic has memory capacity n₀ = 2 and the value M = ½(1 − W) = 0.128. It was noted that during the evolution of the automaton's structure in the course of its functioning, the state matrices in the composite media change, approaching the matrices of finite automata resembling the constructions described in Section 2. It is interesting that the evolution of the matrices || a_ij(t, 0) ||, corresponding to the value s = 0 (a nonpenalty), proceeds much faster than the evolution of the matrices || a_ij(t, 1) || (s = 1, a penalty). This is due to the fact that automata with an evolving structure acquire
expedient behavior: they "succeed" more often in winning than in losing, and recourse to the matrix || a_ij(t, 0) || is more frequent than recourse to the matrix || a_ij(t, 1) ||. We note now that the evolution of the structure of an automaton in a random medium can serve as an example of the automatic synthesis of an automaton according to a given criterion for its operation.
Appendix to Part I. Eigenvalues of Markov Chains Describing the Behavior of Asymptotically Optimal Automata in Stationary Random Media¹¹
In this appendix we discuss the spectrum (the set of eigenvalues) of the Markov chains describing the behavior of some asymptotically optimal automata in stationary random media. To be more precise, we investigate the question of when these Markov chains have a sequence of eigenvalues (other than those equal to 1) approaching 1 in modulus as n → ∞, and we estimate the rate of convergence.¹² In all cases we take x = 2 for simplicity.

1. The easiest to investigate is the spectrum of automaton D_{2n,2}, suggested by V. I. Krinskiy.

¹¹ This appendix was prepared on the basis of an unpublished rough draft by M. L. Tsetlin. The preparation for printing and the filling in of any gaps were done by S. G. Gindikin and A. L. Toom (Editor's note).
¹² As we know, the eigenvalues of a Markov chain (generally speaking, complex) never exceed 1 in modulus.
Let us write out the system of equations for the eigenvalues:
Here r_i, s_i are the probabilities of being in states φ_i¹ and φ_i², respectively (see Section 2). Using all the equations except for the nth and the (n + 1)th (relating the r_i and s_i to each other), we express all r_i in terms of r₁ and all s_i in terms of s₁. Substituting these expressions into the nth and (n + 1)th equations, we obtain the system (35). For the solution to exist, it is necessary that the determinant of the system be zero. The resulting polynomial is of degree 2n + 2, and all the desired eigenvalues are among its roots. It has two extraneous roots λ = p₁, p₂. This is related to the fact that in the system (35) the fraction in the first equation can be reduced by λ − p₁, and that in the second equation by λ − p₂; however, this reduction is not made. We shall rewrite the equation as (36).
Let us subtract 1 from both sides. Now we see that there is a single root λ = 1 (corresponding to the eigenvector of the final probabilities) and an (n − 1)-fold root 0; the remaining roots satisfy Eq. (36).
The case in which p₁ or p₂ or both are equal to 1 is excluded from our considerations. It is easy to show that in this case, in general, there are no eigenvalues tending to 1 for n → ∞. Now let max(p₁, p₂) = m < 1. We intend to find positive ε_n → 0 and δ_n such that every solution of Eq. (36) satisfies one of the two inequalities: |λ| ≤ m + δ_n < 1 or |λ − 1| ≤ ε_n. (This will mean that any sequence {λ_n} either is bounded in modulus by a number less than 1, or approaches 1 sufficiently rapidly, or breaks up into two subsequences of these types.) We shall prove this assertion by contradiction: assuming that |λ − 1| > ε_n and |λ| > m + δ_n, we obtain a relation between ε_n and δ_n that is violated if we set ε_n = 2(1 + δ_n/m)^{−(n−1)}, and the assertion is proved. We can take δ_n = (1 − m)/2. Then we obtain either |λ| ≤ (1 + m)/2 or |λ − 1| ≤ 2(1 + (1 − m)/2m)^{−(n−1)}. This implies that if we choose λ_n in such a way that |λ_n| → 1, then of necessity |1 − λ_n| → 0, and indeed exponentially fast in n.
In this case, setting λ_n = 1 − ω_n, we easily obtain

ω_n = (p₁ⁿ + p₂ⁿ)/(1 + p₁/q₁ + p₂/q₂) + O(p₁ⁿ + p₂ⁿ)².
Thus we have explained the behavior of |λ_n| → 1.¹³

2. Now we shall study the automaton with a linear tactic, L_{2n,2}. The required eigenvalues are solutions of the corresponding system of equations.
We assume beforehand that p₁, q₁, p₂, q₂ ≠ 0, ½, 1. Let us focus our attention on the equations in the first column, except for the first and the last. Their solution r₁, r₂, …, r_n can be considered as a solution of a linear finite-difference equation with constant coefficients; the characteristic equation is quadratic. (The same can be done for the other column.) We have assumed that p₁ ≠ ½ and p₂ ≠ ½; then 4p₁q₁ < 1 and 4p₂q₂ < 1. Since we are interested in λ's whose moduli are close to 1, the case λ² = 4p₁q₁ or λ² = 4p₂q₂, yielding multiple roots, can be neglected. Then the solution of the equations in the first column, except for the first and last entries, has the form
r_k = A μ₁ᵏ + B μ₂ᵏ.
¹³ Neither here nor in the considerations that follow is the existence of such |λ_n| → 1 proved, although in fact this takes place. The problem of the uniqueness of such λ_n is not considered either (Editor's note).
Similarly, we can satisfy all equations of the second column, except for the first and last, if we assume s_k = C ν₁ᵏ + D ν₂ᵏ, where ν₁ and ν₂ are the roots of the analogous characteristic equation for the second column. Substituting these expressions for r_k, s_k in the remaining four equations, we obtain a system of four linear equations with four unknowns A, B, C, D. Equating the determinant of the system to zero, we obtain the equation
This equation has one unknown λ, since μ₁, μ₂, ν₁, ν₂ are given functions of λ. Subtracting 1 from both sides of the equation, we can then factor out λ − 1 in both numerators. By division, we obtain

{p₁(μ₁ⁿ − μ₂ⁿ) − q₁(μ₁ⁿ⁻¹ − μ₂ⁿ⁻¹)}/(μ₁ⁿ − μ₂ⁿ) + {p₂(ν₁ⁿ⁺¹ − ν₂ⁿ⁺¹) − q₂(ν₁ⁿ − ν₂ⁿ)}/(ν₁ⁿ − ν₂ⁿ) = 0.

We see that μ₁μ₂ = q₁/p₁ and ν₁ν₂ = q₂/p₂. Using this result, we shall switch to new variables x, y by setting μ₁,₂ = (q₁/p₁)^{1/2} e^{±x}, ν₁,₂ = (q₂/p₂)^{1/2} e^{±y}.
Substituting these expressions in the equation, we get

{p₁ sinh nx − (p₁q₁)^{1/2} sinh(n − 1)x}/sinh nx = {q₂ sinh ny − (p₂q₂)^{1/2} sinh(n + 1)y}/sinh ny.
This equation has one unknown λ, since x and y can be expressed in terms of λ by the formulas 2(p₁q₁)^{1/2} cosh x = 2(p₂q₂)^{1/2} cosh y = λ. Making use of the formulas for sinh(α ± β), we reduce the equation to the form

(p₁q₁)^{1/2} sinh x coth nx + (p₂q₂)^{1/2} sinh y coth ny = 1 − (p₁ + p₂).   (39)
Let us multiply (39) by 2 and introduce the angles α = arcsin p₁^{1/2}, β = arcsin p₂^{1/2}, where 0 < α, β < π/2 and α, β ≠ π/4. We obtain the system

sin 2α cosh x = sin 2β cosh y = λ,
sin 2α sinh x coth nx + sin 2β sinh y coth ny = cos 2α + cos 2β.   (40)
The transformations are finished. Now we shall investigate the nature of the |λ_n| → 1 given by the solutions of the system. We note that if Re x_n ≥ c > 0, then lim_{n→∞} coth nx_n = 1, and if Re x_n ≤ c < 0, then lim_{n→∞} coth nx_n = −1. Therefore we must consider the systems which can be obtained from (40) by replacing coth nx and coth ny with 1 or −1. But first we note that if (x⁰, y⁰) is a solution of such a system yielding some λ⁰, then (−x⁰, y⁰), (x⁰, −y⁰), and (−x⁰, −y⁰) are also solutions, yielding the same λ⁰. Therefore we may seek only those solutions for which

Re x ≥ 0,   Re y ≥ 0.   (41)

Then coth nx and coth ny should be replaced only by 1. We obtain the system

sin 2α cosh x = sin 2β cosh y = λ,
sin 2α sinh x + sin 2β sinh y = cos 2α + cos 2β.   (42)
The system, considering the condition (41), has only the following solutions:

(1) If 0 < α, β < π/4, then eˣ = cot α, eʸ = cot β. In this case λ = 1.
(2) If π/4 < α, β < π/2, then eˣ = −tan α, eʸ = −tan β. In this case λ = −1.
For α < π/4 < β or β < π/4 < α, there are no solutions satisfying the condition (41). (The cases in which α or β equal 0, π/4, or π/2 are not considered.) These solutions will be useful to us later; now we shall analyze the behavior of Re x and Re y. We set x = x₁ + ix₂, y = y₁ + iy₂, where x₁, y₁ ≥ 0. We intend to find constants c > 0 and 0 < M < 1 such that if x₁ ≤ c or y₁ ≤ c, then |λ| ≤ M.
It will be noted that

|λ| = sin 2α |cosh x| ≤ sin 2α (e^{x₁} + e^{−x₁})/2 = sin 2α cosh x₁.

Similarly, |λ| ≤ sin 2β cosh y₁. Taking this into consideration, we select c > 0 in such a way that

cosh c < min{1/sin 2α, 1/sin 2β}.

We can do this since α, β ≠ π/4. Then we shall have

|λ| ≤ M = max{sin 2α cosh c, sin 2β cosh c} < 1.

Since we are only interested in |λ_n| → 1, the cases in which x₁ ≤ c or y₁ ≤ c do not, as we have shown, have to be considered. Thus, we now consider that

x₁, y₁ ≥ c > 0.   (43)

Then

|coth nx − 1| ≤ 2/(e^{2nc} − 1),

and similarly

|coth ny − 1| ≤ 2/(e^{2nc} − 1).
This, as is easy to show, implies that any x, y, λ [solutions of the system (40)] satisfying the condition (43) differ from the solutions of the system (42) by no more than a constant times e^{−2nc}. In the case (43) it is easy to estimate more exactly the difference between λ and 1 or −1. For this purpose it is sufficient to denote λ by 1 − ω or −1 + ω, substitute in the system (40), and leave out the terms involving ω in powers greater than one. From the proof of the estimate |ω| ≤ constant · e^{−2nc} it is easy to carry out this type of calculation. As a result, we obtain:
(a) for p₁, p₂ < ½, λ = 1 − ω, where

ω = [(q₁ − p₁)(q₂ − p₂)/(q₁ + q₂ − 1)] [(q₁ − p₁)(p₁/q₁)ⁿ + (q₂ − p₂)(p₂/q₂)ⁿ] + o[(p₁/q₁)ⁿ + (p₂/q₂)ⁿ];
(b) for p₁, p₂ > ½, λ = −1 + ω, where

ω = [(p₁ − q₁)(p₂ − q₂)/(p₁ + p₂ − 1)] [(p₁ − q₁)(q₁/p₁)ⁿ + (p₂ − q₂)(q₂/p₂)ⁿ] + o[(q₁/p₁)ⁿ + (q₂/p₂)ⁿ].
3. The spectrum of automata K_{2n,2} can be studied in a similar fashion. We can show that for p₁, q₁, p₂, q₂ ≠ 0 the eigenvalues with |λ_n| → 1 have the form λ = 1 − ω, where

ω = [2q₁q₂/(q₁ + q₂)] [q₁(p₁/q₁)ⁿ + q₂(p₂/q₂)ⁿ].
I1 AUTOMATON GAMES. ZERO-SUM GAMES FOR TWO AUTOMATA
1 Automaton Games

In this section we shall describe mathematical models for the simplest forms of collective behavior of automata. We shall use some of the notions and results of the von Neumann-Morgenstern theory of matrix games [24, 42, 122], as well as the definitions and constructions given in Part I of this chapter. We shall consider the collective behavior (game) of automata A¹, …, A^ν. It is assumed that each of these automata is specified by means of its state matrices and Eqs. (1) and (2). Furthermore, let sʲ(t), fʲ(t), φʲ(t), j = 1, …, ν, be the values of the input variable, the output variable, and the state, respectively, of automaton Aʲ at the instant t. We shall assume, as before, that the input variable sʲ(t) assumes only two values, sʲ(t) = 0 and sʲ(t) = 1, corresponding to the (unit) win and loss of automaton Aʲ at the instant t. The output variable fʲ(t) is supposed to take on values belonging to the set f₁ʲ, …, f_{x_j}ʲ. These values will be called strategies of automaton Aʲ; we shall say that at the instant t automaton Aʲ uses its αth strategy if fʲ(t) = f_αʲ. The values φ₁ʲ, …, φ_{m_j}ʲ of the variable φʲ(t) will be called the states of automaton Aʲ, and the number m_j the capacity of its memory. Obviously m_j ≥ x_j. We shall consider the state matrices

|| a_ikʲ(sʲ(t)) ||,   j = 1, …, ν;   i, k = 1, …, m_j;   sʲ(t) = 0, 1,

to be given for automata A¹, …, A^ν.
Now we proceed to describe games played by automata. A play f(t), taking place at the instant t, will be defined as the set f(t) = (f¹(t), …, f^ν(t)) of the strategies used at the instant t by automata A¹, …, A^ν. The outcome s(t + 1) of the play f(t) will be defined as the set

s(t + 1) = (s¹(t + 1), …, s^ν(t + 1))

of the values of the input variables (unit wins and losses) of these automata at the instant t + 1. We shall say that a game Γ played by automata A¹, …, A^ν is given if for each play f(t) we are given the probability p(f, s) of its outcome s(t + 1); the equality

Σ_s p(f, s) = 1   (44)

is valid for any f. Thus a game Γ for automata A¹, …, A^ν consists of a sequence of plays f(t), t = 1, 2, …, whose outcomes s(t + 1) are determined by the probabilities p(f(t), s(t + 1)). A system of the values of p(f, s), specifying an automaton game Γ, defines a ν-person game Γ* understood in the usual sense of game theory. In fact, the payoff functions vʲ(f), j = 1, …, ν, defining such a game have the meaning of the expected value of a win for the jth player under the set of strategies f, and are obtained uniquely from the probabilities of the outcomes, using the formula

vʲ(f) = Σ_s (−1)^{sʲ} p(f, s).   (45)

The ν-person game Γ* will be called equivalent to the automaton game Γ. Note that specifying the game Γ* does not determine uniquely the automaton game Γ. In fact, the game Γ* is specified by ν functions vʲ(f), while the game Γ is defined by 2^ν − 1 probabilities p(f, s) for each play f. A game Γ played by ν automata will be referred to as a game with independent outcomes if

p(f, s) = p(f, s¹, …, s^ν) = Π_{j=1}^{ν} pʲ(f, sʲ),   (46)

where

pʲ(f, 0), pʲ(f, 1) ≥ 0,   pʲ(f, 0) + pʲ(f, 1) = 1.
An arbitrary game Γ* makes it possible to construct uniquely an automaton game with independent outcomes; here

pʲ(f, 0) = (1 + vʲ(f))/2,   pʲ(f, 1) = (1 − vʲ(f))/2.   (47)

We shall say that a system of automata A¹, …, A^ν participating in a game Γ is in state α(t) = (α₁, …, α_ν) if at the instant t automaton Aʲ is in state φ_{α_j}ʲ, α_j = 1, …, m_j, j = 1, …, ν. We shall show that this system can be described by a finite Markov chain. For this purpose it is sufficient to determine the probabilities p_{α₁,…,α_ν}^{β₁,…,β_ν} of the transition of the system from state α(t) = (α₁, …, α_ν) into state

β(t + 1) = (β₁, …, β_ν).

Suppose that the strategies f_{γ₁}¹, …, f_{γ_ν}^ν correspond to states φ_{α₁}¹, …, φ_{α_ν}^ν of automata A¹, …, A^ν. Then the probabilities p_{α₁,…,α_ν}^{β₁,…,β_ν} of the transition of the system from state α(t) into state β(t + 1) are given by the formulas (48). It is not hard to verify that the matrix || p_{α₁,…,α_ν}^{β₁,…,β_ν} || is stochastic. As a rule, the Markov chain defined in this fashion is ergodic. In this case the final probabilities of the states of the system exist, and with them also the expectations of the wins of the automata, which do not depend on the initial states. Automaton games to which ergodic Markov chains correspond will be called ergodic. Let R_{α₁,…,α_ν} denote the final probability of state α = (α₁, …, α_ν) of the system of automata A¹, …, A^ν participating in the game Γ. States φ_{α₁}¹, …, φ_{α_ν}^ν of the playing automata and the strategies f_{γ₁}¹, …, f_{γ_ν}^ν correspond to this state of the system. Then the expectation Wʲ of a win for automaton Aʲ can be expressed as

Wʲ = Σ_{α₁,…,α_ν} R_{α₁,…,α_ν} vʲ(f_{γ₁}¹, …, f_{γ_ν}^ν).   (49)
In this formula vʲ(f) is the payoff function for the jth player in the equivalent ν-person game Γ*, specified by the expression (45). The quantity Wʲ will be called the value of the game Γ for automaton Aʲ. In the description of the modeling of automaton games on digital computers that follows, it will be useful to employ the final probabilities σ(f) of the plays f of the game Γ. Let f = (f_{γ₁}¹, …, f_{γ_ν}^ν), and let U(f) be the set of all those states of the system of playing automata in which automata A¹, …, A^ν use strategies f_{γ₁}¹, …, f_{γ_ν}^ν, respectively. Then

σ(f) = Σ_{α ∈ U(f)} R_{α₁,…,α_ν}.   (50)
It is easy to see that for ν = 1 automaton games reduce to behavior in stationary random media. The models of the collective behavior of automata thus defined utilize the language of game theory. However, the definitions of automaton games and the modes of behavior that arise in them differ considerably from the point of view accepted in game theory. In fact, in game theory it is assumed that the system of payoff functions determining the game is known beforehand to the players. The player is supposed to use this information in order to determine his strategy (usually mixed), which during the game itself remains unchanged; in choosing the strategy one may employ any computing means. Automaton games, however, are defined by specifying not only the systems of payoff functions, but also the structures of the participating automata. The automata taking part in games do not possess any a priori information about the game. The actions of each automaton are determined only by its wins and losses in the course of the game. The role of the payoff functions defining a game, and that of the opponents of the automaton, thus reduces to the formation of a more or less complex random medium in which the automaton should be capable of expedient behavior.¹⁴

¹⁴ Krinskiy and Ponomarev [106] discuss the problem of how players should behave who do not have a priori information about the matrix of a game. An algorithm is described which assures that a player will have a guaranteed payoff arbitrarily close to the value of the game (when the game is repeated sufficiently many times). For simplicity, we shall describe this algorithm for the case in which the matrix of the game has a saddle point. Suppose that the first player has n actions; the number of actions of his opponent is not important. For every action, the player remembers the payoff obtained the last time this action was executed. At every instant of time, the player performs the action to which the maximum remembered payoff is attached. In this case, throughout the game a payoff less than the value of the game is obtained no more than n − 1 times. For matrix games without saddle points, one can construct an algorithm involving mixed strategies which guarantees an average payoff arbitrarily close to the value of the game if only the game is continued sufficiently long (Editor's note).

In selecting the construction of the playing automata, it is therefore natural to require that their behavior be expedient in every case in the simplest game, the game against nature. The absence of any a priori information about the payoff functions leads naturally to the use of symmetric structures for the automata. In the examples of automaton games described in the following sections, we shall use structures of automata belonging to the asymptotically optimal sequences described in Section 2 of this part.

In concluding this section, we shall give a simple example of an automaton game. Even though this game is not ergodic, it is possible to follow certain characteristic features of the behavior of automata in a game. In this game four automata with a linear tactic participate, having, respectively, 2, 3, 4, and 10 strategies each. In the case of a win, each of the automata continues to use the previous strategy, and in the case of a loss, each of the automata replaces that strategy by the next (the last strategy is replaced by the first). The conditions of the game are the following: for each play f(t) = (f¹(t), f²(t), f³(t), f⁴(t)) we form the sum

σ(t) = f¹(t) + f²(t) + f³(t) + f⁴(t).

Here automaton A¹ wins a play f(t) if σ(t) ≤ 12, and loses in the opposite case. Automata A², A³, A⁴ win a play if σ(t) exceeds the numbers 5, 8, 13, respectively. A typical sequence of plays f(t), values σ(t), and outcomes s(t + 1) is shown in Table II. We see that, beginning with the eighth play, all automata except A¹ begin to win and stop changing their strategies; automaton A¹ begins to lose and changes its strategies cyclically.
TABLE II

  t    f¹   f²   f³   f⁴    σ    s¹   s²   s³   s⁴
  1    1    1    1    1     4    0    1    1    1
  2    1    2    2    2     7    0    0    1    1
  3    1    2    3    3     9    0    0    0    1
  4    1    2    3    4    10    0    0    0    1
  5    1    2    3    5    11    0    0    0    1
  6    1    2    3    6    12    0    0    0    1
  7    1    2    3    7    13    1    0    0    1
  8    2    2    3    8    15    1    0    0    0
  9    1    2    3    8    14    1    0    0    0
 10    2    2    3    8    15    1    0    0    0
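The dynamics of this example can be reproduced in a few lines. The following sketch (our own code, using the win conditions as stated in the text: automaton 1 wins when σ(t) ≤ 12, automata 2, 3, 4 win when σ(t) exceeds 5, 8, 13) regenerates the σ(t) column of Table II:

```python
def play_game(n_plays=10):
    sizes = [2, 3, 4, 10]          # numbers of strategies of the four automata
    f = [1, 1, 1, 1]               # every automaton starts with its first strategy
    history = []
    for _ in range(n_plays):
        sigma = sum(f)
        # s[j] = 0 is a win, s[j] = 1 is a loss (the book's convention)
        s = [0 if sigma <= 12 else 1,
             0 if sigma > 5 else 1,
             0 if sigma > 8 else 1,
             0 if sigma > 13 else 1]
        history.append((tuple(f), sigma, tuple(s)))
        # on a loss the strategy is replaced by the next one (cyclically);
        # on a win it is kept
        f = [fj if sj == 0 else fj % size + 1
             for fj, sj, size in zip(f, s, sizes)]
    return history

rows = play_game()
print([sigma for _, sigma, _ in rows])
# -> [4, 7, 9, 10, 11, 12, 13, 15, 14, 15], as in Table II
```

Because the win conditions and the strategy updates are deterministic here, the whole table is fixed by the initial strategies; from play 8 onward only automaton 𝔄¹ keeps losing and cycling between its two strategies.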
In the following section we shall describe zero-sum games for two automata; these are equivalent to zero-sum games for two persons, and the well-known von Neumann theorem holds in this case. In Part III we shall discuss homogeneous automaton games, i.e., automaton games in which all participants are equivalent.

2 Zero-Sum Games for Two Automata
In the foregoing section we gave a general definition of an automaton game. Now we shall consider in more detail two examples of such games. Both examples involve zero-sum games for two automata. The first is a game of an automaton against an opponent who has chosen a mixed strategy in the sense of game theory; the second is a game between two identical automata capable of expedient behavior in random media. For the first example it will be shown that, for any asymptotically optimal sequence of automata, the win approaches the value of the game if the opponent uses the optimal strategy. In the second example, the value of the automaton game is in a certain sense analogous to the value of a game in the sense of von Neumann's game theory, although it does not coincide with it.

Let us first define a zero-sum game for two automata. Consider a game Γ in which automata 𝔄¹ and 𝔄² have, respectively, M and N strategies (x₁ = M, x₂ = N). We assume that in each play f = (f¹, f²) of this game one of the two automata wins and the other loses. Then the probabilities p(f, s) = p(f¹, f², s¹, s²) of the outcomes (s¹, s²) are zero for s¹ = s².
The automaton game thus defined will be called a zero-sum game for two automata. The quantity

q_{f¹f²} = p(f¹, f², 0, 1)

denotes the probability of a win for the first automaton in a play f = (f¹, f²), and

p_{f¹f²} = p(f¹, f², 1, 0)

represents the probability of its losing in this play. According to Eq. (44),

q_{f¹f²} + p_{f¹f²} = 1.

The expectation m_{f¹f²} of a win by automaton 𝔄¹ in the play f = (f¹, f²) is equal, by virtue of (45), to

m_{f¹f²} = q_{f¹f²} − p_{f¹f²}.

Of course, the expectation of the sum of the wins of automata 𝔄¹ and 𝔄² is zero. The quantities m_{αβ} form a rectangular matrix ‖m_{αβ}‖, α = 1, …, M, β = 1, …, N, coinciding with the matrix of the equivalent zero-sum two-person game. If the game is ergodic, then there exist final probabilities of the states of the system of participating automata, and the value of the expectation W(𝔄¹, 𝔄², Γ) of a win for the first automaton does not depend on their initial states. This value will be called, by convention, the value of the game Γ for automata 𝔄¹ and 𝔄². Let R^{αβ}_{ik} denote, as before, the final probability of the state of the system of the two automata participating in the game Γ in which automaton 𝔄¹ is in state φ_i¹ and automaton 𝔄² is in state φ_k², using strategies f¹ = F¹(φ_i¹) and f² = F²(φ_k²), respectively. According to (49), the value of the expectation W(𝔄¹, 𝔄², Γ) of a win for automaton 𝔄¹ can be calculated by the formula

W(𝔄¹, 𝔄², Γ) = Σ_{α,β} Σ_{i,k} R^{αβ}_{ik} m_{αβ}.   (53)
Now we proceed to the examples. Consider first a game of automaton 𝔄 against an opponent U using some fixed mixed strategy. We shall show that for any asymptotically optimal sequence of automata their win approaches the maximum possible value, and if the opponent uses the optimal strategy in the von Neumann sense, then this maximum coincides with the value of the game.
In fact, suppose that the opponent U of the automaton 𝔄 in the game Γ realizes some mixed strategy x = (x₁, …, x_N), i.e., in each play he uses his βth pure strategy, β = 1, …, N, with probability x_β, where x₁ + ⋯ + x_N = 1. By the definition of a mixed strategy, the x_β are functions of the matrix ‖m_{αβ}‖ of the game Γ and do not depend on the behavior of the opponent. Then for any pure strategy f_α, α = 1, …, M, of automaton 𝔄, the expectation m_α of a win is

m_α = Σ_{β=1}^{N} m_{αβ} x_β.   (54)

Thus, a game with an opponent who has chosen any mixed strategy x determines a stationary random medium C(m₁, …, m_M), and for any x

W(𝔄, U, Γ) = W(𝔄, C).

If 𝔄₁, …, 𝔄_n, … is an asymptotically optimal sequence of automata, then, by definition,

W = lim_{n→∞} W(𝔄_n, U, Γ) = max(m₁, …, m_M) = max_α Σ_{β=1}^{N} m_{αβ} x_β.   (55)
Thus, automaton 𝔄_n, for sufficiently large n, maximizes its win. If the mixed strategy x = (x₁, …, x_N) is optimal, then the expectation of a win for automaton 𝔄_n (for n → ∞) approaches

W = max_α min_x Σ_{β=1}^{N} m_{αβ} x_β,   (56)

i.e., it coincides with the value of the game according to von Neumann. One can say that such an automaton plays no worse than its partner who has chosen an optimal strategy, even though it has no a priori information about the structure of the matrix ‖m_{αβ}‖ of the game Γ and receives all the necessary information during the course of the game itself. Behavior which is expedient in a stationary random medium turns out to be expedient also in this example of a game. The foregoing considerations are easily generalized to the case of a game of many automata, one of which belongs to an asymptotically optimal sequence while the rest use mixed strategies.

Let us now examine the example of a zero-sum game between two identical, symmetric automata. For such automata we shall make use of stochastic automata B_{Mn,M} and B_{Nn,N}, which are defined in the following way. Automaton B_{xn,x}
has xn states φ_i^α, α = 1, …, x, i = 1, …, n, and in state φ_i^α it uses the strategy f_α. For s = 0 (a win), the transitions between states occur in the same way as for an automaton with a linear tactic: state φ_i^α passes into state φ_{i+1}^α, i = 1, …, n − 1, and state φ_n^α passes into itself. For s = 1 (a loss), state φ_i^α passes into state φ_{i−1}^α, i = 2, …, n. However, state φ_1^α, in contrast to the automata with a linear tactic, passes with equal probabilities (= 1/x) into the states φ_1^β, β = 1, …, x. It is not hard to verify that the sequence of automata B_{xn,x} is asymptotically optimal in the same media as the automata with a linear tactic. Assuming that the game is ergodic, we shall show a method of calculating the value of the game W(B_{Mn,M}, B_{Nn,N}, Γ). One can show that the limit

W = lim_{n→∞} W(B_{Mn,M}, B_{Nn,N}, Γ)

exists and possesses the following properties:

1. If the matrix ‖m_{αβ}‖ of the game Γ contains at least one row consisting of nonnegative elements, then W is the harmonic mean of the elements of that row, among all such rows, whose smallest element is largest. Similarly, if ‖m_{αβ}‖ contains at least one column all of whose elements are nonpositive, then W is the harmonic mean of the elements of that column, among all such columns, whose largest element is smallest.

2. If the conditions of Case 1 are not satisfied, then W = 0.
Case 1 signifies a well-known advantage for the first player: the presence of a row containing nonnegative elements means that he has a "loss-free" strategy. In this case the behavior of the first automaton resembles the cautious tactic prescribed by game theory: one chooses a strategy that yields the maximum guaranteed payoff. The fact that the second player does not minimize the payoff to the first one, and is satisfied only with the harmonic mean, is related to the fact that the sequence of automata B_{xn,x} is not asymptotically optimal in random media all of whose parameters are negative. Case 2 shows that, in the absence of such a clear preference for one of the players, the automata end the game in a draw: W = 0. In this case they play, so to speak, "more roughly," since they are not able to make use of the subtler properties of the game matrix. However, even in this case the value of W lies between the lower and the upper values of the game:

max_α min_β m_{αβ} ≤ W ≤ min_β max_α m_{αβ}.   (57)
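The transition rule of the automaton B_{xn,x} and the limiting value W given by properties 1 and 2 can be sketched as follows. The state encoding and the function names are our own, and the harmonic-mean branch assumes the selected row (or column) contains no zero entries, since the harmonic mean is otherwise undefined:

```python
import random

def b_automaton_step(state, s, n, x, rng=random):
    """One transition of the automaton B_{xn,x}: a state is a pair
    (alpha, i) with strategy alpha in 1..x and depth i in 1..n."""
    alpha, i = state
    if s == 0:                        # a win: go deeper, as for a linear tactic
        return (alpha, min(i + 1, n))
    if i > 1:                         # a loss at depth i > 1: retreat one step
        return (alpha, i - 1)
    return (rng.randint(1, x), 1)     # a loss at depth 1: random new strategy

def limiting_value(m):
    """Limiting value W of properties 1 and 2; m is the game matrix."""
    nonneg_rows = [row for row in m if min(row) >= 0]
    nonpos_cols = [list(col) for col in zip(*m) if max(col) <= 0]
    if nonneg_rows:                   # property 1: the row whose minimum is largest
        row = max(nonneg_rows, key=min)
        return len(row) / sum(1.0 / v for v in row)
    if nonpos_cols:                   # the column whose maximum is smallest
        col = min(nonpos_cols, key=max)
        return len(col) / sum(1.0 / v for v in col)
    return 0.0                        # property 2: a draw

print(limiting_value([[0.4, 0.8], [-0.1, 0.5]]))
# -> 0.5333..., the harmonic mean of the row (0.4, 0.8)
```

For the matrix in the example, only the first row is nonnegative, so W is the harmonic mean 2/(1/0.4 + 1/0.8) = 8/15.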
Thus, we see that in a game the automata capable of expedient behavior in stationary random media achieve results that are almost the same as those obtained using the methods of game theory with full a priori information about the game matrix.

We shall outline the proofs of assertions 1 and 2. Let R^{αβ}_{ik}, α = 1, …, M; β = 1, …, N; i, k = 1, …, n, denote the final probability of the state of the system of participating automata in which the first automaton is in state φ_i^α and the second in state φ_k^β. Making use of the definitions of the automata B_{Mn,M}, B_{Nn,N} and of the game Γ, we obtain a system of equations, (58)–(65), for the quantities R^{αβ}_{ik},
and the normalization condition

Σ_{α,β,i,k} R^{αβ}_{ik} = 1.   (66)
We shall show first that

R^{αβ}_{ik} = 0   for   i + k ≠ n + 1.

For this purpose we use induction with respect to s = i + k. It will be noted that, according to (65), R^{αβ}_{nn} = 0, and we assume that R^{αβ}_{ik} = 0 for all i + k = s, s > n + 2. Then for the R^{αβ}_{ik} with i + k = s − 1 we obtain a homogeneous system of linear equations whose determinant is nonzero.
Similarly, by virtue of (65), R^{αβ}_{11} = 0, and from R^{αβ}_{ik} = 0 for i + k = s, s < n, it follows that R^{αβ}_{ik} = 0 for i + k = s + 1. Therefore, the solution of Eqs. (58) can be written as

R^{αβ}_{i,n+1−i} = b_{αβ} + c_{αβ} λ_{αβ}^{i},   where   λ_{αβ} = q_{αβ}/p_{αβ}.   (67)
From Eqs. (63) and (64) we obtain for b_{αβ} and c_{αβ} a system of 2MN equations, (68) and (69), linear in b_{αβ} and c_{αβ}, with coefficients expressed through q_{αβ}, p_{αβ}, and λ_{αβ}; the quantities entering their right-hand sides are defined by (70) and (71). Solving Eqs. (68) and (69) for b_{αβ} and c_{αβ}, we obtain the expressions (72) and (73). Summing Eq. (68) over α and Eq. (69) over β, and using (70) and (71), we obtain M + N equalities, (74) and (75).
Substituting the value of b_{αβ} from (72) into (74) and (75), we obtain a system of M + N equations, (76) and (77), for the unknowns a_β and t_α, in which the coefficients are of the form (λ_{αβ}^{n} − 1)/(λ_{αβ}^{n+1} − 1), α = 1, 2, …, M, β = 1, 2, …, N.
Krylov and Tsetlin [113] give the solution of this system for M = N = 2. Here we shall limit ourselves to considering only the limiting case n → ∞.
Consider first Case 1. Suppose that the matrix ‖m_{αβ}‖, α = 1, 2, …, M, β = 1, …, N, contains rows consisting of nonnegative elements. Without loss of generality, we may assume that the first L rows possess this property. Then λ_{αβ} > 1 for α = 1, …, L, β = 1, …, N. We introduce the notation (78), (79) and note that, for n → ∞,

μ_{αβ} → 1 − λ_{αβ}   when λ_{αβ} > 1,
ν_{αβ} = −λ_{αβ}^{n} → 0   when λ_{αβ} < 1.   (80)
Using this notation, we eliminate a_δ from Eqs. (76) and (77). Interchanging the order of summation, we obtain the system (81). We note now that the sums of ν_{γδ} over the column elements, appearing in the denominators on the right-hand side of (81), remain finite for n → ∞, since by the hypothesis of assertion 1 the matrix ‖m_{αβ}‖ does not contain columns all of whose elements are negative. Among the products ν_{γδ}μ_{αδ} with γ > L, α ≤ L there is at least one that remains finite for n → ∞, since the rows with indices γ > L contain negative elements while the rows with indices α ≤ L consist of positive elements. If, however, γ ≤ L, then all these products vanish as n → ∞. The coefficients of t_α on the left-hand side of Eqs. (81) approach zero for n → ∞ if α ≤ L. Thus, in Eqs. (81) for n → ∞ the coefficients of t_γ tend to zero for γ ≤ L and remain finite for γ > L. Therefore the t_γ with γ > L are of a higher order of smallness than the t_γ with γ ≤ L. Neglecting the former, we rewrite (81) in the form (82).
We set μ_{αδ₀(α)} = max_δ μ_{αδ}. Then, neglecting terms of a higher degree of smallness, we obtain the system (83).
Let μ_{α₀δ₀} = min_α μ_{αδ₀(α)}. It is easy to see that the coefficients of t_{α₀} are of a higher order of smallness than the coefficients of the t_α, α ≠ α₀. Thus we have shown that t_α → 0 for α ≠ α₀. Therefore, in Case 1 the strategy α₀ is singled out for which min_α max_δ μ_{αδ} is attained or, what is the same, max_α min_δ m_{αδ}. Now let us calculate the limiting value W of the game Γ. We return to Eqs. (77). Using the notation of (78) and neglecting terms of higher order of smallness, we obtain the system (84).
The coefficients of a_β on the left-hand sides of Eqs. (84) are finite, since any column of the matrix ‖m_{αβ}‖ contains positive elements. Thus, all the a_β are of a higher order of smallness than t_{α₀}. Let us rewrite Eq. (53) for the value of the game, taking (67) into account:

W(B_{Mn,M}, B_{Nn,N}, Γ) = Σ_{α,β} {n b_{αβ} + [λ_{αβ}(λ_{αβ}^{n} − 1)/(λ_{αβ} − 1)] c_{αβ}} m_{αβ}.   (85)

According to (74) and (75), we have

Σ_{α,β} m_{αβ} b_{αβ} = 0.

Next, using the expression (73) for c_{αβ}, we obtain (86). We note that in this expression the coefficients of a_β and t_α remain finite for n → ∞ both when λ_{αβ} > 1 and when λ_{αβ} ≤ 1. Neglecting the a_β and the t_α that tend to zero (for α ≠ α₀), we obtain

W = lim_{n→∞} W(B_{Mn,M}, B_{Nn,N}, Γ) = N t_{α₀}.   (87)

Taking into account the normalization condition (66), we obtain

W = N (Σ_{β=1}^{N} 1/m_{α₀β})^{−1},   (88)

i.e., the harmonic mean of the elements of the α₀th row.
We have studied the case in which the game matrix contains rows consisting of nonnegative elements. If the matrix contains columns consisting of nonpositive elements, the corresponding analysis is entirely analogous.
Now let us consider Case 2. Here any row of the matrix contains negative elements, and any column contains positive ones. Therefore Eqs. (76) and (77) involve all the t_α and a_β with finite coefficients, and the value of the game W(B_{Mn,M}, B_{Nn,N}, Γ) takes the form of a fraction. The first sum in the numerator of this fraction is zero by virtue of (74) and (75); the remaining sums remain finite as n → ∞. The presence of a term increasing with n in the denominator means that

W = lim_{n→∞} W(B_{Mn,M}, B_{Nn,N}, Γ) = 0.
The foregoing discussion of automaton games permitted us to draw conclusions only about the final stationary states. Attempts to analyze the transient processes were undertaken by simulating the games described here on digital computers.¹⁵
Figure 9 gives plots of the average payoff V(k) of the first automaton against the number of plays k for the game with the matrix

  0.4   0.8   0.7   0.6
  1.0   0.1   0.3   0.9
 −0.1  −0.3  −0.9   0.1
  0.3   0.2   0.9   0.7
¹⁵ These examples of simulation were included in the text by us; they were taken from Krylov and Tsetlin [113, p. 986] (Editor's note).
The first two rows of this matrix contain no negative elements, so that the game comes under Case 1. The plots differ in the memory capacities of the automata: for curve 1, n = 19, and for curve 2, n = 5. The solid horizontal line V = 0.47 coincides with the limiting value W of the game, which in this case is the harmonic mean of the elements of the first row. For n = 19 and k > 1000, the value of V(k) differs very little from W. For n = 5, V(k) converges to a limit which differs considerably from W.
Figure 10 gives similar plots for the matrix

  0.1  −1.0   0.4   0.2
  0.3  −0.5   0.6  −0.3
 −0.3   0.2  −0.5   0.3
  0.3   1.0  −0.3  −0.4

which has no rows consisting of nonnegative elements and no columns consisting of nonpositive elements, i.e., it comes under Case 2. It is clear from the figure that V(k), even at small values of k, differs very little from zero both for n = 19 and for n = 5 (curves 1 and 2, respectively).
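A rough re-creation of these experiments can be set up as follows. This is a sketch, not the original program behind Figures 9 and 10; in particular, the mapping of a matrix entry m_{αβ} to a win probability (1 + m_{αβ})/2 for the first automaton is our assumption, consistent with m = q − p and q + p = 1:

```python
import random

def step(state, won, n, x, rng):
    """One transition of a B-automaton; states are (strategy, depth 0..n-1)."""
    alpha, i = state
    if won:
        return (alpha, min(i + 1, n - 1))   # a win: move deeper
    if i > 0:
        return (alpha, i - 1)               # a loss: retreat one step
    return (rng.randrange(x), 0)            # boundary loss: random strategy

def simulate(m, n, n_plays, seed=0):
    """Average payoff of the first of two B-automata playing the matrix m."""
    rng = random.Random(seed)
    M, N = len(m), len(m[0])
    a, b = (rng.randrange(M), 0), (rng.randrange(N), 0)
    total = 0
    for _ in range(n_plays):
        alpha, beta = a[0], b[0]
        first_wins = rng.random() < (1 + m[alpha][beta]) / 2
        total += 1 if first_wins else -1
        a = step(a, first_wins, n, M, rng)
        b = step(b, not first_wins, n, N, rng)
    return total / n_plays

# the Case 2 matrix of Figure 10
fig10 = [[ 0.1, -1.0,  0.4,  0.2],
         [ 0.3, -0.5,  0.6, -0.3],
         [-0.3,  0.2, -0.5,  0.3],
         [ 0.3,  1.0, -0.3, -0.4]]
print(simulate(fig10, n=5, n_plays=20000))
# the text reports V(k) close to zero here (Case 2)
```

Varying n between 5 and 19 mimics the two curves of the figures; only the qualitative behavior, not the exact values, should be expected to match.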
III HOMOGENEOUS AUTOMATON GAMES

1 Homogeneous Games
In this part we shall describe homogeneous games, i.e., games in which all participants have equal rights. For this purpose it will be useful to introduce the group of automorphisms of a game. In our study of homogeneous games we shall use such concepts as the value of a play and the invariant set of plays. Among homogeneous games, of particular interest
are those games for which at least one invariant set of plays is stable, i.e., Nash games. If the invariant set of maximum value turns out to be stable, then the corresponding game is called a Moore game. We shall propose a procedure making it possible to construct from a given homogeneous game Γ* another game Γ₀*, which has the same set of players, the same strategies, and the same values of the plays as the game Γ*, but which is a Moore game. This procedure is equivalent to an agreement among the players of a homogeneous game to divide their winnings equally. Furthermore, we shall study homogeneous automaton games. Since an analytical investigation is rather cumbersome, we made an attempt to analyze such games by means of computer simulation. Examples of such simulation constitute the contents of the last two sections.

Before stating the definition of a homogeneous game, it is useful to define the mapping of a game and the automorphism of a game. Thus, consider a game Γ* played by players A¹, …, A^ν. Suppose that player A^j has x_j strategies f₁^j, f₂^j, …, f_{x_j}^j, j = 1, …, ν, available. A play f of the game Γ* is defined, as before in the context of an automaton game (see Part II, Section 1), as the set f = (f_{i₁}¹, …, f_{i_ν}^ν), where f_{i_j}^j is the strategy of the jth player. The game Γ* is defined by ν functions V¹(f), …, V^ν(f) given on the set of plays. We shall say that a mapping g of the game Γ* into itself is defined if we are given:

(1) a one-to-one mapping g of the set of players onto itself;
(2) a one-to-one mapping of the set of strategies of each player A^j onto the set of strategies of the player gA^j.

The mapping g defines a mapping of the set {f} of plays of the game Γ* into itself. In fact, a play f = (f_{i₁}¹, …, f_{i_ν}^ν) is mapped by g into the play gf = (f_{k₁}¹, …, f_{k_ν}^ν), where k_l = g_j(i_j) if gA^j = A^l, l = 1, …, ν.
A mapping g will be called an automorphism of a game Γ* if it preserves the payoff functions, i.e., if for any play f of the game Γ* we have the equality

V^j(f) = V^{gj}(gf).   (91)

Evidently, the set of all automorphisms of the game Γ* forms a group G_{Γ*}. A game Γ* is called homogeneous if the group G_{Γ*} of automorphisms of the game is transitive on the set of players, i.e., if for any pair of players A^i and A^j there is an automorphism g such that gA^i = A^j. Obviously, in a homogeneous game the sets of strategies of all the players are pairwise isomorphic. Suppose now that f is an arbitrary play in a homogeneous game Γ*. The value U(f) of the play f is defined as the arithmetic mean of the expectations of the winnings of all the players:

U(f) = (1/ν) Σ_{j=1}^{ν} V^j(f).   (92)

In other words, for a homogeneous game Γ*, the value of a play coincides with the average payoff to any player over the set of plays {gf}, g ∈ G_{Γ*}, whose number is ν_g. The set {gf} will be referred to as the invariant set of plays generated by the play f, and U(f) will be called the value of the invariant set {gf}. It will be noted that in a homogeneous game the payoff function of any player is uniquely determined by the payoff function of one of the players and by the group G_{Γ*} of automorphisms of the game. This circumstance considerably simplifies the description of a homogeneous game.

We shall give two examples of homogeneous games.

Example 1. Consider a game with two players A¹ and A², each of whom has N strategies f₁^a, …, f_N^a, a = 1, 2. Let V¹(f_i¹, f_k²) be the winning of player A¹ using his ith strategy against the kth strategy of his opponent. Suppose also that

V¹(f_i¹, f_k²) = V²(f_k¹, f_i²).   (94)

This game has a mapping g which interchanges the players and carries the ith strategy of each player into the ith strategy of the other: gA¹ = A², gA² = A¹, gf_i¹ = f_i², gf_i² = f_i¹.
By virtue of (94) the mapping g preserves the payoff functions and is thus an automorphism, so that the game under consideration is homogeneous. The plays (f_i¹, f_k²) and (f_k¹, f_i²) form an invariant set.

Example 2. A homogeneous game Σ* for ν players will be called symmetric if the group G_{Σ*} of its automorphisms coincides with the symmetric group of permutations of the indices 1, 2, …, ν. Let f₁^j, …, f_N^j be the set of strategies of player A^j, and let V^j(f) = V^j(f_{i₁}¹, …, f_{i_ν}^ν) be the payoff of this player. Also, let

V^j(f) = m(i_j; a₁, …, a_{N−1}).   (95)

We shall limit ourselves to symmetric games for which (95) is valid.¹⁶ In this formula, a_s is the number of players (other than A^j) who have selected in the play f their sth strategy,¹⁷ s = 1, 2, …, N − 1. The invariant set {gf} generated by a play f consists of all those plays of the game Σ* in which β₁ players select the first strategy, β₂ players the second, …, and β_N players the Nth, where β₁ + β₂ + ⋯ + β_N = ν.
In the study of homogeneous automaton games it seemed natural to us to separate out those games, in particular the Nash games and the Moore games, in which the behavior of automata can be compared with the behavior of intelligent beings who know the conditions of the game beforehand. In the foregoing section such a possibility was assured by the von Neumann theorem for two-person zero-sum games. Now we shall isolate certain special classes of homogeneous games for which such a comparison is possible. Suppose that in an arbitrary game Γ* there is a play f^N in which it is not convenient for any one of the players to change his strategy unless the remaining players change theirs. Such a play f^N will be called a Nash play.

¹⁶ Equation (95) indicates that it is possible to number the strategies of the players in such a way that under an automorphism the kth strategy of any player always changes into the kth strategy of another player.
¹⁷ Just as previously, we say that player A^j selected his sth strategy in the play f = (f_{i₁}¹, …, f_{i_ν}^ν) if i_j = s.
The definition of a Nash play f^N = (f_{k₁}¹, …, f_{k_ν}^ν) reduces to the system of inequalities

V^j(f_{k₁}¹, …, f_{k_j}^j, …, f_{k_ν}^ν) ≥ V^j(f_{k₁}¹, …, f_{i_j}^j, …, f_{k_ν}^ν)   (96)
for all j = 1, 2, …, ν and all strategies f_{i_j}^j. The games that contain Nash plays will be called Nash games. If f^N is a Nash play in a homogeneous game Γ*, and g ∈ G_{Γ*} is any automorphism of this game, then it is obvious that all plays of the form gf^N are also Nash plays. The set of plays {gf^N}, g ∈ G_{Γ*}, will be called the Nash set. As an example, consider a symmetric game for two players A¹ and A², each of whom is capable of only two actions, f₁^j = 0 and f₂^j = 1, j = 1, 2. Let m(ε, δ), ε, δ = 0, 1, be the expectation of winning for the player who has chosen the strategy ε if his opponent engages in the action δ, and let

m(0, 0) = 0.43,   m(0, 1) = 0.2,   m(1, 0) = 0.2,   m(1, 1) = 0.5.   (97)
In this game the plays f = (0, 0) and f = (1, 1) are Nash plays. We shall show that all symmetric games satisfying the condition (95) whose participants have only two strategies each are Nash games. Thus, let Σ* be a symmetric game with ν players A¹, A², …, A^ν, each of whom has only two strategies, 0 and 1. Then by virtue of (95) the payoff functions of the game have the form V^j(f) = m(ε_j, a), where a is the number of players (other than A^j) who have chosen strategy 1. Suppose that in a play f strategy 1 was selected by β, and strategy 0 by ν − β, participants of the game Σ*. This play is a Nash play if

m(0, β) ≥ m(1, β),   m(1, β − 1) ≥ m(0, β − 1).   (98)

The first of the inequalities (98) means that it is not convenient for a player who has chosen strategy 0 to change it; the second means that a change of strategy is not convenient for a player who has chosen strategy 1. Suppose that m(0, 0) ≥ m(1, 0). Then the play f = (0, 0, …, 0) is a Nash play. If, however, this inequality is not satisfied, i.e., if m(0, 0) ≤ m(1, 0), and if in addition m(0, 1) ≥ m(1, 1), then the play in which strategy 1 is used by exactly one player is a Nash play. The play in which two players use strategy 1 will be a Nash play if the last inequality is not satisfied and if m(0, 2) ≥ m(1, 2), and so on. Let us write out the sequence of inequalities thus
generated, which according to (98) are the conditions that a given play be a Nash play:

(1) m(0, 0) ≥ m(1, 0): the play f = (0, 0, …, 0) is a Nash play;
. . .
(β) m(0, β − 2) < m(1, β − 2), m(0, β − 1) ≥ m(1, β − 1): the play f = (1, …, 1, 0, …, 0), in which β − 1 players use strategy 1, is a Nash play;
. . .
(ν) m(0, ν − 2) < m(1, ν − 2), m(0, ν − 1) ≥ m(1, ν − 1): the play f = (1, 1, …, 1, 0) is a Nash play.
If at least one of the pairs of inequalities (1), …, (ν) is satisfied, then a Nash point exists. If, however, none of these pairs is satisfied, then in particular m(0, ν − 1) < m(1, ν − 1), and the play in which all the participants choose strategy 1 is a Nash play. Symmetric games whose participants can use more than two strategies are, generally speaking, not Nash games. Consider, for example, a symmetric game with two players, each of whom can use three strategies 0, 1, 2, defined by the system of payoff functions

m(0, 0) = m(1, 1) = m(2, 2) = 0.5,
m(1, 0) = m(0, 2) = m(2, 1) = 0.5,
m(0, 1) = m(1, 2) = m(2, 0) = 0.7.   (99)
Here m(ε, δ) is the expectation of winning for the player engaging in action ε when his opponent engages in action δ. It is easy to verify that this game belongs to the class (95) and is not a Nash game. A Moore play will be defined as a Nash play whose value is no less than the value of any other play. Games consisting of Moore plays are called Moore games. Thus, for example, the game defined by the payoff functions (97) is a Moore game, and the play f = (1, 1) is a Moore play. The invariant set of plays generated by a Moore play will be called a Moore set.
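For two-player symmetric games such as (97) and (99), Nash and Moore plays can be enumerated directly. The following sketch uses our own function names and the convention that the value of a play is the arithmetic mean of the two payoffs, as in (92):

```python
def nash_plays(m):
    """All plays (e, d) from which neither player gains by deviating;
    m[e][d] is the payoff of the player using e against an opponent using d."""
    n = len(m)
    plays = []
    for e in range(n):
        for d in range(n):
            p1_ok = all(m[e][d] >= m[x][d] for x in range(n))
            p2_ok = all(m[d][e] >= m[x][e] for x in range(n))
            if p1_ok and p2_ok:
                plays.append((e, d))
    return plays

def moore_plays(m):
    """Nash plays whose value (mean of the two payoffs) is maximal."""
    n = len(m)
    value = lambda e, d: (m[e][d] + m[d][e]) / 2
    best = max(value(e, d) for e in range(n) for d in range(n))
    return [p for p in nash_plays(m) if value(*p) == best]

m97 = [[0.43, 0.2], [0.2, 0.5]]                             # the payoffs (97)
m99 = [[0.5, 0.7, 0.5], [0.5, 0.5, 0.7], [0.7, 0.5, 0.5]]   # the payoffs (99)

print(nash_plays(m97))   # -> [(0, 0), (1, 1)]
print(moore_plays(m97))  # -> [(1, 1)]
print(nash_plays(m99))   # -> []
```

The output reproduces the claims of the text: the game (97) has the Nash plays (0, 0) and (1, 1), of which (1, 1) is the Moore play, while the three-strategy game (99) has no Nash play at all.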
We shall attempt to imagine how homogeneous games would be played by people informed beforehand about the conditions of the game. It is clear that the greatest winning would be assured for each player if they agreed to play, in sequence, the plays belonging to the invariant set generated by the play of maximum value. The situation is more complex when the conditions of a game are not given to its players beforehand and there is no such agreement. If, for example, a play of maximum value which is not a Nash play is played, then among the players there will be some who gain by changing their strategy. The plays that result when some of the players change their strategies will no longer belong to the invariant set of maximum value. The Nash plays are stable in this sense. If the invariant set of maximum value consists of Nash plays (a Moore game), then this set is stable in the same sense. These concepts will prove useful later, in describing experiments with automaton games in which the automata have no a priori information about the conditions of the game.

For any given homogeneous game Γ*, we now indicate a procedure making it possible to construct a game Γ₀* with the same players, the same strategies, and the same values of the plays, which is a Moore game. Let V^j(f), j = 1, …, ν, be the system of payoff functions of the game Γ*. The game Γ₀* is defined by the system of payoff functions

V₀^j(f) = U(f),   j = 1, …, ν.   (100)

In each play of the game Γ₀* the winnings of all the players are identical and coincide with the value of the play. Therefore a play of maximum value is a Moore play, and the invariant set generated by it is a Moore set. The procedure of constructing the game Γ₀*, which we shall call a game with a common fund, is in a sense equivalent to an agreement among the players of the homogeneous game to divide their winnings equally. For a homogeneous game, in which all participants are equal and can count on the same payoff, such an agreement seems natural. We shall return to this question in the following section. It will be noted, in addition, that zero-sum homogeneous games with two players have skew-symmetric game matrices. Therefore the procedure of introducing a common fund results in this case in a situation where the winning of each player in every play is zero and coincides with the value of the game in von Neumann's sense.
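For a two-player symmetric game the common-fund construction amounts to replacing each payoff by the value of the play. A minimal sketch for the game (99), with illustrative names:

```python
def common_fund(m):
    """Common-fund payoffs: each player receives the value of the play,
    i.e. the mean of the two payoffs m(e, d) and m(d, e)."""
    n = len(m)
    return [[(m[e][d] + m[d][e]) / 2 for d in range(n)] for e in range(n)]

m99 = [[0.5, 0.7, 0.5], [0.5, 0.5, 0.7], [0.7, 0.5, 0.5]]   # the payoffs (99)
u = common_fund(m99)
print(u[0][1])  # -> 0.6, the maximal play value in the common-fund game
```

In the original game (99) no play is a Nash play, but in the common-fund version every play of maximum value 0.6 becomes a Nash play (no unilateral change raises a player's payoff above 0.6), and hence a Moore play.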
2 An Example of Simulating a Symmetric Automaton Game

In this section we describe homogeneous games with identical automata and give an example of the computer simulation of a symmetric game played by many identical automata, a game whose content is easy to interpret. By means of this example we shall illustrate the role of Nash and Moore plays, as well as the role played by the common-fund procedure. In the final part of the section we consider certain notions related to the reliability of the functioning of a group of automata united by their participation in a common game.

In Part II we discussed the fact that an arbitrary game Γ* admits a unique construction of the corresponding automaton game with independent outcomes. The probabilities p(f, s) of the outcomes s = (s¹, …, s^ν) of the plays f = (f_{i₁}¹, …, f_{i_ν}^ν) in the automaton game Γ are determined by the system of payoff functions of the game Γ* from Eqs. (46) and (47); we shall limit ourselves here to many-automaton games with independent outcomes. In choosing the constructions of the automata playing the game, it is natural (in view of the absence of a priori information about the game) to use symmetric automata capable of expedient behavior in stationary random media, examples of which were described in Section 2 of Part I. In particular, in the examples of digital computer simulation considered below we use automata with a linear tactic L_{xn,x}. The results obtained are apparently valid for any automata forming asymptotically optimal sequences. As a characteristic of the game Γ for the automata 𝔄¹, …, 𝔄^j, …, 𝔄^ν used in the simulation we shall use the average payoff W̄_n^j of automaton 𝔄_n^j belonging to the asymptotically optimal sequence 𝔄₁^j, …, 𝔄_n^j, …:

W̄_n^j = lim_{T→∞} (1/T) Σ_{t=1}^{T} (1 − 2s^j(t)).   (101)
For the automaton L_{xn,x} the number n coincides with the capacity of its memory. Here, as before, the input variable s^j(t) of automaton 𝔄^j assumes the value s^j(t) = 0 in the case of a win for the automaton in the play f(t − 1), and s^j(t) = 1 in the case where it loses. Sometimes we shall also use the average frequency ū_n(f₀) of the play f₀, i.e., the value

ū_n(f₀) = lim_{T→∞} (1/T) Σ_{t=1}^{T} δ(f₀, f(t)),   (101′)

where δ(f₀, f) = 1 if f = f₀ and δ(f₀, f) = 0 otherwise.
For ergodic games, the average quantities Wnj and an(&)correspond to the average payoff Wj of automaton ‘uj and to the final probability o ( f o ) of playf,, introduced earlier in (49) and (50). As the number of plays executed increases, they tend to these quantities with a probability equal to unity. In all examples analyzed, it was possible to discern a definite tendency in the variation of 5 and W with increasing n. This leads to an assumption about the existence of a limit for these quantities when n + 00. This limit is achieved within a certain degree of accuracy for relatively small values of n ; the convergence, apparently, has an exponential character. In computer simulation of automaton games, we strive to study the final probability distributions for Markov chains having a great many states. For the distribution game described below, the number of states of the corresponding Markov chain is on the order of lo8; for the circle game, Meanwhile, even when the total number of plays is on the order of lo5, a computer simulation makes it possible to separate out those few states whose total probability tends to unity. These results can be considered trustworthy by virtue of the fact that the characteristics obtained do not change substantially as the number of plays executed continued to increase, and also because of their essential predictability. Homogeneous games with identical automata (from this point on, we shall call them homogeneous automaton games) possess the obvious property that the values of the game are identical for all automata participating in the game. Consider an automaton ?lo, whose actions consist of various plays f of the homogeneous game r*,and whose expectation of a win in the case of the actionfis the value U ( f )of this play in the game This automaton, as it is easy to see, functions in a stationary random medium C ( { U ( f ) } ) . 
Therefore, if the automaton 𝔄^0 belongs to an asymptotically optimal (in the sense of Section 2 of Part I) sequence of automata, for n → ∞ its payoff tends to max U(f), i.e., to the maximum value of a play. We shall be interested in those cases where a group of automata 𝔄^1, …, 𝔄^v participating in a homogeneous game Γ attains the maximum payoff. We note again that the automata participating in the game do not possess a priori information about it. Their information about the game is limited to a knowledge of their winnings and losses in each of the plays executed. Reaching the maximum payoff, when it occurs, shows that the behavior which is expedient in stationary random media replaces, in the case of automata, an agreement about combining their actions. This type of agreement might be reached among people informed beforehand
about the conditions of the game. An automaton game in these cases does not differ from the game played by the automaton 𝔄^0 defined above. From an example of a game that will be described in this section, we shall see that in homogeneous automaton games the maximum payoff is, generally speaking, not achieved. For Nash games, the average payoff of an automaton apparently coincides with the maximum value of a Nash play. Upon constructing a Moore game with the aid of the common-fund procedure, we shall see that, when automaton games are simulated on digital computers, the automata achieve the maximum payoff (of course, for n → ∞). This apparently occurs for an arbitrary homogeneous game with a common fund and for an arbitrary asymptotically optimal sequence of automata. We shall proceed now to describe the results of simulating, on a digital computer, a symmetric Nash game whose content can be readily interpreted. Consider a game B* with v players A^1, …, A^v. Each of the players has ϰ > v strategies f_1, …, f_ϰ. The game is defined by a set of ϰ quantities a_1, …, a_ϰ, with a_1 ≥ a_2 ≥ ⋯ ≥ a_ϰ ≥ 0. The quantity a_α is called the power of the strategy f_α, α = 1, 2, …, ϰ. Suppose that in the game B* a certain play f = (f_{α_1}, …, f_{α_v}) is executed, and suppose that in this play the strategy f_{α_j} is chosen simultaneously by m_{α_j} players. Then the payoff V^j(f) to a player A^j in this play is defined by the formula
V^j(f) = a_{α_j} / m_{α_j},   j = 1, 2, …, v.   (102)
Thus, the payoff to each of the players in a play f is equal to the ratio of the power of the strategy used by him to the total number of players that have selected the same strategy in this play. Obviously the game B* is symmetric. We shall call this game a distribution game. The distribution game can be interpreted in various ways. This form of game situation is typical, for example, of the use made by animals of various pastures. In this case, the strategy of an animal reduces to the choice of some pasture, and the power of the strategy can be interpreted as the corresponding supply of food. The invariant set of plays in the game B* is determined by a set m = (m_1, …, m_ϰ) of nonnegative integers m_α, α = 1, 2, …, ϰ. In the plays belonging to the invariant set, m_α is the number of players using the strategy f_α; m_1 + ⋯ + m_ϰ = v.
The value U(f) of the invariant set {gf}, according to (92), can be expressed by the formula
U(f) = (1/v) Σ_{α ∈ A} a_α,   (103)
where A is the set of indices α for which m_α ≠ 0. The value of the play is equal to the sum of the powers of the strategies used in the play, divided by the number of playing automata. The distribution game is a Nash game. In fact, for arbitrary powers of strategies a_1, …, a_ϰ, a_1 ≥ a_2 ≥ ⋯ ≥ a_ϰ, one can find nonnegative integers m_1^0, …, m_ϰ^0, with m_1^0 + ⋯ + m_ϰ^0 = v, such that for any pair α, β ∈ {1, …, ϰ} the following inequalities are satisfied:
a_α / m_α^0 ≥ a_β / (m_β^0 + 1).   (104)
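Condition (104) is easy to test mechanically. The following sketch (the function name is ours) checks a candidate occupation vector m^0 against a given set of powers:

```python
def is_nash_set(m, powers):
    """Check inequalities (104): a_alpha / m_alpha >= a_beta / (m_beta + 1)
    for every pair (alpha, beta) with m_alpha > 0, i.e., no player gains
    by moving from his current strategy to any other one."""
    k = len(powers)
    return all(powers[a] / m[a] >= powers[b] / (m[b] + 1)
               for a in range(k) if m[a] > 0
               for b in range(k))

# Example 2 below: a_1 = 0.9, a_2 = ... = a_7 = 0.15; the Nash set is (5, 0, ..., 0).
powers = [0.9] + [0.15] * 6
print(is_nash_set([5, 0, 0, 0, 0, 0, 0], powers))   # True
print(is_nash_set([1, 1, 1, 1, 1, 0, 0], powers))   # False
```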
Consider the invariant set {gf^N} of plays defined by the set m^0 = (m_1^0, …, m_ϰ^0). By virtue of inequalities (104) this set is a Nash set, and the plays it contains are Nash plays. Since the numbers a_1, …, a_ϰ are in nonincreasing order, among the sets m^0 there will be one for which m_1^0 ≥ m_2^0 ≥ ⋯ ≥ m_ϰ^0.
It will be noted that the invariant set of maximum value is not, generally speaking, a Nash set. In fact, this set is generated by the set m = (1, 1, …, 1, 0, …, 0), which contains 1's in the first v positions and 0's in the remaining ones; in the plays belonging to the invariant set of maximum value, only the first v strategies f_1, …, f_v are used, and each one of them is used by one player only. (A unique "Pauli principle" holds here.) At the same time, the set m = (1, …, 1, 0, …, 0) generates a Nash set only when a_v > a_1/2. We shall describe the results of simulating the distribution game on digital computers.¹⁸ In all these examples we used automata with a linear tactic, L_{ϰn,ϰ}. Example 1 The game was played by five identical automata L_{7n,7}. Each of the automata could choose one of seven possible strategies with the following powers: a_1 = 0.9, a_2 = a_3 = ⋯ = a_7 = 0.33. How would people behave in a similar situation? It is clear that for each one it is more convenient to select the first strategy if it is free. Even if somebody had already chosen the first strategy, it would still be more

¹⁸ Excerpts from Ginzburg et al. [64] are given later on in this section (Editor's note).
convenient for another also to choose the first strategy, since the expected win in this case is equal to 0.45, which is higher than the expected win for any other strategy. If, however, the first strategy were chosen by three players, then each one of them would win 0.3, i.e., less than for any of the remaining strategies. Consequently, the most natural distribution is one in which some two players choose the first strategy, and the remaining players each choose one of the remaining strategies; besides, this distribution is not much different from one in which the first strategy is chosen by three players. Now, what is the behavior of automata in this case? This version was checked for automata with various memory capacities. For n = 10, it turned out that in 78% of the plays the first strategy was used by two automata, in 12% by three, and in 9% by one automaton. The strategies in the second-through-seventh group are, practically speaking, chosen by at most one automaton at a time. Since in a majority of plays the first strategy is selected by two automata, the remaining six strategies are subject to selection by three automata, so that we can naturally expect that each of these strategies will be free in approximately one-half of the plays, and in half of the plays it will be taken by one automaton. For a memory capacity n = 10, the simulation showed that the second strategy was used by one automaton in 48% of the plays, and in 51% of the plays it was not used at all; the strategies in the third-through-seventh group are selected in a fashion similar to that of the second. For automata with the memory capacity n = 5, the first strategy is used in the following way: in 34% of the plays it is used by one automaton, in 55% by two, and in 10% by three automata. The second strategy is free in 49% of the plays, in 47% it is used by one automaton, and in 4% by two.
It is clear that the same reasonable tendency is maintained in the behavior of the automata, but the picture is more blurred. Thus, the fraction (34%) of plays in which the first strategy is selected by only one automaton is still large; on the other hand, the fraction of plays in which the second strategy is simultaneously chosen by two automata is quite noticeable (about 4%). If we consider that the first strategy is, in an overwhelming majority of cases, used simultaneously by two automata, and each of the remaining automata uses one strategy from the second-through-seventh group, then the expected payoff to every automaton should be close to ⅕(2 × 0.45 + 3 × 0.33) = 0.378. The average payoffs obtained as a result of the calculation are given in Table III (upper row).
TABLE III

Automaton memory n:                       …      …      …      …
Average payoff without a common fund:   0.353  0.368  0.370  0.373
Average payoff with a common fund:      0.315  0.327  0.335  0.350
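The experiments of Example 1 can be reproduced in outline. The sketch below assumes one concrete realization of the automaton with linear tactic (on a win the state moves deeper into the current branch; on a loss it moves toward the surface and, at depth 1, switches to the next action cyclically) and, as a further modeling assumption, treats an automaton's payoff, which here lies in [0, 1], directly as its probability of a win; exact figures therefore differ from the tables.

```python
import random
from collections import Counter

class LinearTacticAutomaton:
    """One concrete realization of an automaton with linear tactic
    L_{kn,k}: k actions, memory depth n per action.  A win deepens the
    current state; a loss moves it toward the surface, and at depth 1
    the action is switched (here cyclically, an illustrative choice)."""
    def __init__(self, k, n):
        self.k, self.n = k, n
        self.action, self.depth = random.randrange(k), 1

    def update(self, win):
        if win:
            self.depth = min(self.depth + 1, self.n)
        elif self.depth > 1:
            self.depth -= 1
        else:
            self.action = (self.action + 1) % self.k

def average_payoff(powers, v, n, plays=20000):
    """Distribution game among v automata; each automaton wins a play
    with probability equal to its payoff (a modeling assumption)."""
    autos = [LinearTacticAutomaton(len(powers), n) for _ in range(v)]
    total = 0.0
    for _ in range(plays):
        play = [a.action for a in autos]
        m = Counter(play)
        payoffs = [powers[s] / m[s] for s in play]   # formula (102)
        for a, w in zip(autos, payoffs):
            a.update(random.random() < w)
        total += sum(payoffs) / v
    return total / plays

# Conditions of Example 1: the average payoff should settle near 0.378.
print(average_payoff([0.9] + [0.33] * 6, v=5, n=10))
```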
Example 2 The game was also played by five identical automata L_{7n,7}, each of which could choose one of seven strategies with the following powers: a_1 = 0.9, a_2 = a_3 = ⋯ = a_7 = 0.15. The Nash set is defined by the collection (5, 0, …, 0). By analogy with the behavior of the automata in the previous example, it seemed that all the automata should choose the first strategy, since then the payoff to each one of them would be approximately equal to 0.18, which is more than the payoff involved in the choice of any other strategy. The calculations have shown, however, that the first strategy is more often chosen by four, three, and even two automata. This is not hard to understand: if the first strategy is selected by five or four automata simultaneously, then the expected win of each one of them is relatively small, and the probability of a strategy change is quite high. However, the average payoff falls off with an increase in the memory capacity, and apparently for n → ∞ it tends to the value of the Nash set. In this example the relationship between the average payoff and the memory capacity is given in Table IV (upper row).
TABLE IV

Automaton memory n:                       …     …     …     …
Average payoff without a common fund:   0.25  0.23  0.22  0.21
Average payoff with a common fund:      0.23  0.27  0.29   …
Example 3 In this example, as in the two preceding ones, v = 5, ϰ = 7, and all the powers of the strategies are selected to be identical: a_1 = a_2 = ⋯ = a_7 = 0.6. Already for n = 5, in 98% of the plays each strategy was chosen by no more than one automaton, so that the average payoff W̄_5 = 0.6. The game in Example 3 is a Moore game, and the Nash set in this game is the invariant set of maximum value, i.e., a Moore set. With the aid of the "common fund" procedure described in the preceding section, it is not difficult to construct the Moore game B^0* corresponding to the distribution game. We have already said that this procedure is equivalent to an agreement among the players to divide their payoffs equally. For the distribution game with a common fund, the payoff to each player in a play coincides with the value of this play. For this reason all plays of maximum value are Moore plays, in which a change of strategy is equally inconvenient for all players. It is natural, therefore, to expect that in the corresponding automaton games each of the automata achieves (for sufficient memory capacity) the maximum value of the play. We have already mentioned that such an automaton game is no different in its results from a game between people who know the conditions of the game beforehand and reach an agreement about joint actions. This statement was confirmed by simulation. Example 4 A simulation was made of a game having a common fund, the conditions of which were the same as in Example 1 of this section. The relationship between the average payoff W̄_n to an automaton in this game and its memory capacity is shown in Table III (lower row). For this game the value U(f^M) of a Moore play is

⅕(a_1 + a_2 + a_3 + a_4 + a_5) = ⅕(0.9 + 0.33 × 4) = 0.444.
The experiment shows that the value of a Moore play is achieved in this case. Figure 11 contains plots, for this example, of the average payoff to an automaton versus the memory capacity n for the game without a common fund (graph 1) and with a common fund (graph 2). We see that for n > 6 the procedure of introducing a common fund increases the average payoff to an automaton, while for n ≤ 6 it decreases it; the introduction of a common fund turns out to be advantageous only for automata with sufficiently high memory capacity. The same figure makes it clear that the average frequency σ̄_nv(f^N) of Nash plays for the game B* without a common fund tends to unity, as n → ∞, faster than the average frequency σ̄_nv(f^M) of a Moore play for the game B^0* with a common fund. This can be explained by the fact that in a game with a common fund a change of strategy by one automaton has relatively little influence on its payoff, and only for fairly high memory capacities is the value of a Moore play achieved. For small memory capacities, the selection of the best strategies is not made with sufficient accuracy. Therefore, the common-fund procedure turns out to be disadvantageous for automata with small memory capacities (one might say, "wage-leveling damage"). Example 5 In this example a digital computer was used to simulate the distribution game with a common fund, the conditions of which were chosen to be the same as in Example 2. In this case too, for n → ∞, the value of a Moore set is achieved, which for this example is ⅕(0.9 + 4 × 0.15) = 0.3. The results of simulation are given in Table IV (lower row). It is easy to see that the remarks made in Example 4 are also valid for this example. The example of the automaton game described in this section can be related to certain considerations involving the reliability of the functioning of a group of automata. In fact, consider a set of distribution games having identical powers of strategies and differing only in the number of participants. These games have Nash plays in the absence of a common fund, and Moore plays when a common fund is introduced. In both cases, as shown by computer simulation, the frequencies of Nash plays (or Moore plays)
approach unity as n → ∞. This predetermines the choice of strategies by the automata participating in the game, independently of their number. In particular, if one of the participating automata fails (in the sense of Glushkov's classification¹⁹), the remaining automata after some time again choose the strategies with the highest powers of the sources. These strategies are selected by the automata independently of which automaton suffered the failure. In our experiments we periodically failed the automaton that had chosen the strategy f_1, whose power of source was maximum, and followed the changes in the average payoff to a participating automaton. As the number of playing automata decreased, the average payoff increased; the number of strategies used was reduced by eliminating the least advantageous ones.
Example 6 The results of such a simulation for the distribution game with powers of strategies a_1 = 0.9, a_2 = a_3 = ⋯ = a_7 = 0.33 (as in Examples 1 and 4) and memory capacity n = 10 are given in Table V.
TABLE V

[Average payoffs W̄_n and W̄_n^0 to the surviving automata as the playing automata fail one by one.]
It is easy to verify that the values of W̄_n and W̄_n^0 are close to the values of Nash and Moore plays, respectively. Moreover, it should be noted that the increase in the average payoff to an automaton with a decrease in their number leads to a situation where the total average payoff to the playing automata falls off relatively slowly as the number of failed automata increases. The increase in the average payoff to the automata with a decrease in their number is related to still another circumstance. For a fixed memory capacity n of the playing automata, the maximum values of plays are achieved more and more reliably as the number of automata decreases. This is connected with the fact that the game is, as it were, simplified, and the
¹⁹ See Glushkov [70, p. 351].
previous memory capacity makes it possible to bring the frequency of Moore plays closer to unity. The following example may serve as an illustration of this remark. Example 7 The powers of the sources were selected as follows: a_1 = 0.9, a_2 = 0.45, a_3 = 0.225, a_4 = 0.112, a_5 = 0.056, a_6 = 0.028, a_7 = 0.014. This game with a common fund was first played by five, then by four, and then by three automata. Each time, the automaton that chose the strategy f_1, possessing the highest power, was made to fail. In Table VI, U(f^M) denotes the value of a Moore play.

TABLE VI

[The Moore-play value U(f^M) and the observed average payoffs for v = 5, 4, and 3 automata.]
3 Circle Automaton Games In the study of games of many automata, it is natural to separate out those games whose description does not depend on the number of players. This property is possessed, in particular, by games in which the payoff function of each player depends on the choice of strategies by this player and by a limited number of other players, his "neighbors" in the game. In this case the limited number of arguments of the payoff function considerably simplifies the description of the game. It is convenient to associate with games having a limited number of neighbors special game graphs. For this purpose a player A^k is put in correspondence with a vertex k of the graph. If the payoff function of player A^k depends on the choice of a strategy by player A^i, then an arrow is drawn from vertex i to vertex k. In games with a limited number of neighbors, only a limited number of arrows meet at each vertex of the graph. In this section we shall describe the simplest homogeneous automaton game with a limited number of neighbors, as well as the results of its simulation on a digital computer. Consider the following game K*. The game is played by v persons A^1, A^2, …, A^v, each of whom has ϰ strategies f_1, …, f_ϰ. Let f = (f_{α_1}, …, f_{α_v})
be an arbitrary play of the game. Then the payoff V^j(f) of a player A^j is given by the formula

V^j(f) = V(f_{α_{j-1}}, f_{α_j}, f_{α_{j+1}}),   j = 1, 2, …, v,   (105)

where the indices are taken cyclically (A^0 = A^v, A^{v+1} = A^1). Players A^{j-1} and A^{j+1} will be called the left and right neighbors of player A^j, respectively. Thus, the payoff to each player in any play of the game is determined by his strategy and the strategies of his two neighbors, on his left and on his right, and the game is defined by a function of three variables, independently of the number of players. A game thus defined will be called a circle game.²⁰ Any automorphism g_k, k = 0, 1, …, v − 1, of a circle game can be described as follows:
(a) g_kA^j = A^{j+k} if j + k ≤ v, and g_kA^j = A^{j+k−v} if j + k > v; (b) a strategy f_α of a player A^j is mapped into the strategy f_α of the player g_kA^j. It is obvious that the set of automorphisms g_k forms a cyclic group of order v, where g_kg_s = g_{k+s} for k + s ≤ v and g_kg_s = g_{k+s−v} for k + s > v. The identity element of this group is the automorphism g_0, for which g_0A^j = A^j. The group G_{K*} of automorphisms of the circle game is transitive on the set of players, so that the game is homogeneous. In what follows we shall limit ourselves to the simplest case of the circle game, in which each of the players has only two strategies: 0 and 1. Then the choice by each player A^j of a strategy ε_j, ε_j = 0, 1, determines a play f = (ε_1, …, ε_v), in which the payoff V^j(f) to the player A^j is determined by the formula

V^j(f) = V(ε_{j-1}, ε_j, ε_{j+1}),   ε_0 = ε_v,  ε_{v+1} = ε_1.   (106)
Evidently, the invariant set generated by a play (ε_1, ε_2, …, ε_v) consists of the plays (ε_1, ε_2, …, ε_v), (ε_2, ε_3, …, ε_v, ε_1), (ε_3, ε_4, …, ε_v, ε_1, ε_2), …, (ε_v, ε_1, ε_2, …, ε_{v−1}).
²⁰ We have described a game on a circle with two neighbors. It is clear that a circle game with an arbitrary number of neighbors may be determined in a similar way. In the same way one may determine a game on a torus, in which the payoff to each participant is determined by his strategy and the strategies of four of his neighbors, etc.
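For small v, the Nash plays of a binary circle game can be found by direct enumeration. The sketch below (the names are ours) uses the strict form of stability, every unilateral change strictly decreasing the deviator's payoff; under the weak form, plays in which a deviation merely leaves the payoff unchanged would also be admitted.

```python
from itertools import product

def circle_payoffs(eps, V):
    """Payoffs in the binary circle game: player j receives
    V(eps_{j-1}, eps_j, eps_{j+1}), the indices taken cyclically."""
    v = len(eps)
    return [V[eps[j - 1], eps[j], eps[(j + 1) % v]] for j in range(v)]

def strict_nash_plays(v, V):
    """Enumerate the plays in which every unilateral strategy change
    strictly decreases the deviator's payoff (feasible for small v)."""
    found = []
    for play in product((0, 1), repeat=v):
        eps = list(play)
        stable = True
        for j in range(v):
            old = circle_payoffs(eps, V)[j]
            eps[j] ^= 1                      # unilateral deviation
            new = circle_payoffs(eps, V)[j]
            eps[j] ^= 1
            if new >= old:
                stable = False
                break
        if stable:
            found.append(play)
    return found

# Payoff function of Example 2 below: only unanimous neighborhoods pay.
V = {t: 0.0 for t in product((0, 1), repeat=3)}
V[1, 1, 1], V[0, 0, 0] = 0.6, 0.43
print(strict_nash_plays(6, V))   # the play of 0's and the play of 1's
```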
A homogeneous game K* may be a Nash game. In this case the sequence ε_1, ε_2, …, ε_v determined by the Nash play f = (ε_1, ε_2, …, ε_v) should possess the following property: if the sequence contains a triple ε_i, ε_{i+1}, ε_{i+2}, then it does not contain a triple ε_k, ε_{k+1}, ε_{k+2} such that ε_k = ε_i, ε_{k+1} ≠ ε_{i+1}, ε_{k+2} = ε_{i+2}. (The summation over the indices is done modulo v.) In fact, if the sequence ε_1, …, ε_v contains a triple ε_i, ε_{i+1}, ε_{i+2} and determines a Nash play, then V(ε_i, ε_{i+1}, ε_{i+2}) ≥ V(ε_i, ε′, ε_{i+2}) for ε′ ≠ ε_{i+1}. The sequence may not contain the triple ε_i, ε′, ε_{i+2}, since in this case it would be advantageous for the player A^{k+1} standing in the middle of that triple to change his strategy. It may be verified that if, for some sequence ε_1, …, ε_v, this condition is observed, then one can construct a system of payoff functions V(ε′, ε″, ε‴) such that in the game defined by these payoff functions f = (ε_1, …, ε_v) will be a Nash play. We shall show that a sequence ε_1, ε_2, …, ε_v which may define a Nash play belongs to one of the following two categories: (1) all ε_j, j = 1, …, v, are equal; (2) in the sequence ε_1, …, ε_v, 0's occur only one at a time and 1's no more than two in a row, or, conversely, 1's occur only one at a time and 0's no more than two in a row. We shall show first that if the sequence ε_1, …, ε_v contains three identical ε's in a row, then all of the ε's are identical. Suppose for definiteness that the sequence has the form …, 0, 0, 0, 1, ε, …. Then if ε = 0, the sequence contains the triples 0, 0, 0 and 0, 1, 0, and consequently cannot define a Nash play. This sequence also cannot define a Nash play when ε = 1, since in this case it contains the triples 0, 0, 1 and 0, 1, 1. We shall show now that if the sequence contains two 0's in a row, then it cannot contain two 1's in a row. In fact, if our sequence has the form 0, 0, 1, …, 0, 1, 1, …
, then it contains the triples 0, 0, 1 and 0, 1, 1, which is impossible for a sequence defining a Nash play. We shall proceed now to describe Nash points in mixed strategies.²¹ Let us recall [122] that a Nash point is defined by a set t = (t_1, t_2, …, t_v), where t_k is the probability with which the kth player executes the action 0, k = 1, 2, …, v. Let E_k(t) denote the expectation of a win by the kth player. The fact that t is a Nash point implies that ∂E_k(t)/∂t_k = 0, whence

a t_{k−1} t_{k+1} + b t_{k−1} + c t_{k+1} + d = 0,   (107)
²¹ Excerpts from Bryzgalov et al. [32] will be given later on in the text (Editor's note).
where

a = p(0, 0) − p(0, 1) − p(1, 0) + p(1, 1),  b = p(0, 1) − p(1, 1),
c = p(1, 0) − p(1, 1),  d = p(1, 1),

and p(ε, ε′) = V(ε, 0, ε′) − V(ε, 1, ε′).
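A symmetric point t_{k−1} = t_k = t_{k+1} = t of condition (107) satisfies the quadratic a t² + (b + c) t + d = 0 and can be found numerically. The sketch below assumes the form p(ε, ε′) = V(ε, 0, ε′) − V(ε, 1, ε′) for the quantities entering (107); the function name and the dictionary encoding of V are ours.

```python
import math
from itertools import product

def symmetric_nash_points(V):
    """Roots t in (0, 1) of a t^2 + (b + c) t + d = 0, i.e., condition
    (107) evaluated at a symmetric point t_{k-1} = t_k = t_{k+1} = t.
    Assumes p(e, e') = V(e,0,e') - V(e,1,e')."""
    p = {(e, f): V[e, 0, f] - V[e, 1, f] for e in (0, 1) for f in (0, 1)}
    a = p[0, 0] - p[0, 1] - p[1, 0] + p[1, 1]
    B = p[0, 1] + p[1, 0] - 2 * p[1, 1]        # b + c
    d = p[1, 1]
    if a == 0:
        roots = [-d / B] if B != 0 else []
    else:
        disc = B * B - 4 * a * d
        if disc < 0:
            return []
        roots = [(-B + s * math.sqrt(disc)) / (2 * a) for s in (1, -1)]
    return [t for t in roots if 0 < t < 1]     # t must be a probability

# Payoff function of Example 2 below: V(1,1,1) = 0.6, V(0,0,0) = 0.43, 0 otherwise.
V = {t: 0.0 for t in product((0, 1), repeat=3)}
V[1, 1, 1], V[0, 0, 0] = 0.6, 0.43
print(symmetric_nash_points(V))   # a single point, t close to 0.54
```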
Using (107) it is not difficult to express t_{k+1} in terms of t_{k−1}. The relationship is simplified by using the parameter τ_k = (1 − t_k)/t_k:

τ_{k+1} = [p(1, 0)τ_{k−1} + p(0, 0)] / [−p(1, 1)τ_{k−1} − p(0, 1)].   (108)
We shall now find a general form for mixed strategies. For this purpose we note that t_{k+1} = A t_{k−1}, where A is the transformation (108), implies that t_1 = A^v t_1. If A^v is not an identity mapping, the fixed points of the transformation A^v coincide with the fixed points of the transformation A. This is easy to show by direct verification. Hence, if v is odd, then all t_k are equal, and if v is even, all t_k with even subscripts are equal and all t_k with odd subscripts are equal. Thus, we have shown that if v is odd, then any Nash point has the form (t, t, …, t), where t is one of the roots of the equation

a t² + (b + c)t + d = 0,   (109)

and a, b, c, and d are defined in (107). If, however, v is even, then any Nash point has the form (t, t′, t, t′, …, t, t′), where t and t′ are the roots of Eq. (109). Equation (109), after t is replaced by τ = (1 − t)/t, becomes

p(1, 1)τ² + [p(0, 1) + p(1, 0)]τ + p(0, 0) = 0,   (110)
where p(ε, ε′) are defined in (107). It is obvious that if p(0, 0) and p(1, 1) have the same sign, then a play composed of 0's or a play composed of 1's is a Nash play; here Eq. (110) has either two positive roots or none. In this case, in addition to the plays composed of 1's and 0's, there may still be two Nash plays in mixed strategies, or none. If, however, p(0, 0) and p(1, 1) have opposite signs, then Eq. (110) has exactly one positive root. Then either a play of 0's and a play of 1's are both Nash plays, or neither of them is a Nash play. In this case there is an additional Nash play in mixed strategies. Moreover, we should note that for circle games with an even number of players, Nash plays always exist. In fact, if V(1, 0, 1) < V(1, 1, 1), then
the play (1, 1, …, 1) is a Nash play. If V(0, 1, 0) < V(0, 0, 0), then the play (0, 0, …, 0) is a Nash play. If, however, neither of these inequalities is satisfied, then the play (1, 0, 1, 0, …, 1, 0) is a Nash play. We have already stated (see p. 42) that from a given game K* one can construct a game K̃, i.e., a game with independent outcomes, of v automata 𝔄^1, …, 𝔄^v. Here, according to (47), the probability p^j(f) of a win by automaton 𝔄^j in a play f = (ε_1, …, ε_v) is defined by the formula
p^j(f) = ½[1 + V(ε_{j−1}, ε_j, ε_{j+1})],   (111)

and the probability q^j(f) of its losing in this play is defined by the formula

q^j(f) = ½[1 − V(ε_{j−1}, ε_j, ε_{j+1})].   (112)
So far as the construction of the automata participating in the game is concerned, we use, as before, symmetric automata forming an asymptotically optimal sequence. In particular, in the simulation examples described below, we use automata with a linear tactic, L_{2n,2}. One can assume that the results of the simulation would not differ considerably if other sufficiently wide classes of symmetric automata belonging to asymptotically optimal sequences were used. In the simulation we were, of course, interested first of all in Moore games, i.e., games for which the invariant set of plays of maximum value is a Nash set. Here there is an opportunity to compare the behavior of a group of the simplest automata with the behavior of persons who have prior knowledge of the conditions of the game and who can therefore agree to execute only plays belonging to the Moore set. Let σ_nv(f) denote the probability of the occurrence of a play f. (It is equal to σ̄_nv(f), the limit of the ratio of the number of times play f was executed to the total number of plays made.) In all the examples considered there was a definite tendency for the value of σ_nv(f) to change as the memory capacity n of the automata increased. This leads us to assume that the limit of σ_nv(f), as n → ∞, always exists. This limit is reached within a desired degree of accuracy for relatively small values of n, and the convergence apparently has an exponential character (see Examples 1 and 2). The latter circumstance is important also in connection with the following. Consider a circle game with v players that has a Moore play f^M. Then, if the memory of the automata is sufficient, the behavior of the group depends only weakly on v; in other words, σ_nv(f^M) is almost constant for sufficiently large n and a fairly wide range of v (see Examples 2 and 4). In the introduction we have already noted that the "reliability" of collective behavior is related to this fact. Simulation data show that, with a favorable structure of the payoff function of the circle game, the automata L_{2n,2} execute Moore plays with a probability approaching one for n → ∞, i.e., in these cases

lim_{n→∞} σ_nv(f^M) = 1.   (113)
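A sketch of such a simulation follows. It assumes one concrete realization of the two-action automaton with linear tactic (a win deepens the state; a loss moves it toward the surface and, at depth 1, flips the action) and draws each automaton's win with probability (1 + V)/2, as in (111); a single run with large memory may linger for a long time at either unanimous play, so exact figures vary.

```python
import random
from itertools import product

def simulate_circle(V, v, n, plays=20000):
    """v two-action automata with linear tactic on a circle.  In each
    play, automaton j wins with probability (1 + V(eps_{j-1}, eps_j,
    eps_{j+1}))/2, as in (111).  Returns the fraction D of plays
    composed entirely of 1's."""
    act = [random.randrange(2) for _ in range(v)]
    depth = [1] * v
    ones = 0
    for _ in range(plays):
        play = act[:]                      # all automata move simultaneously
        if all(play):
            ones += 1
        for j in range(v):
            payoff = V[play[j - 1], play[j], play[(j + 1) % v]]
            if random.random() < (1 + payoff) / 2:   # a win
                depth[j] = min(depth[j] + 1, n)
            elif depth[j] > 1:                       # a loss, away from the surface
                depth[j] -= 1
            else:                                    # a loss at the surface
                act[j] = 1 - act[j]
    return ones / plays

# Conditions of Example 2 below: D should approach 1 for large enough n.
V = {t: 0.0 for t in product((0, 1), repeat=3)}
V[1, 1, 1], V[0, 0, 0] = 0.6, 0.43
print(simulate_circle(V, v=6, n=5))
```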
The automata, as it were, "agree" to perform the actions for which the payoff to each one of them is maximum. In these cases the expedient behavior of each of the participant automata assures the expedient behavior of the group. In other cases the automata are not capable of "agreeing" to take the actions that assure the maximum payoff. In describing the examples of simulating the circle automaton games, we shall clarify the importance of constructing the payoff functions of the game so that (113) is satisfied. If this relation is satisfied, we shall say that the value of the Moore set is achieved. We shall note, finally, that in games which do not contain Nash plays the collective behavior of the automata is much more complex. A fairly typical example is the occurrence of a cycle, i.e., a sequence of plays f^1, f^2, …, f^s with the following property: the automata participating in play f^1 behave in such a way that play f^1 passes preferentially into play f^2, from it into f^3, and so on to f^s, from which it again passes into f^1 (see Example 6). Now we proceed to consider examples of game simulation. Example 1 We simulated a symmetric game for two players with the payoff function V(ε_1, ε_2), ε_i = 0, 1, defined as follows:
=
0.25,
V(0, I )
= 0.9,
V(1,O)
= -0.1
Here V(ε_1, ε_2) denotes the expectation of a win by a player who performed action ε_1 at the same time as his opponent performed action ε_2. It is easy to verify that the play f^N = (0, 0) is the unique Nash play in this game. The values of σ_n(f) are given in Table VII. The quantity denoted in this table (and in subsequent ones) by σ_n(f) is in reality equal to the ratio of the number of times play f was executed to the total number T of plays executed. In all the examples considered, T was fairly large, so that the value of σ_n(f) found by us was sufficiently close to the probability with which the play f was realized. Table VII makes it clear that σ_n(0, 0) approaches unity as n increases.
TABLE VII

[Values of σ_n(f) for the plays of the game of Example 1.]
Example 2 Consider the circle game in which the payoff function (the expectation of a win) V(ε_{j−1}, ε_j, ε_{j+1}) is defined as follows: V(1, 1, 1) = 0.6, V(0, 0, 0) = 0.43, and V(0, 0, 1) = V(0, 1, 0) = V(1, 0, 0) = V(0, 1, 1) = V(1, 0, 1) = V(1, 1, 0) = 0. This is a game with two Nash plays: f = (0, 0, …, 0), a play of 0's, and f = (1, 1, …, 1), a play of 1's, the latter obviously being also a Moore play. Players who know the payoff function of the game beforehand find it natural to agree always to perform action 1: this guarantees them the maximum possible payoff. It turns out that with sufficient memory capacity almost all the automata perform action 1 throughout the game. The results of simulation are shown in Table VIII, where, depending on the number of automata v and the memory capacity n of each automaton, we list the fraction D of the plays composed of 1's (the ratio of the number of plays composed of 1's to the total number of plays) and the frequency F with which each automaton performs the action 1. Table VIII shows that with sufficient memory the fraction of plays of 1's is close to unity. It will be noted that, for automata with fixed memory, the fraction of plays composed of 1's falls off, although slowly, as v increases. Apparently this decrease is of order v^{−α} for a suitable α. Approximate curves showing this relationship are given in Fig. 12. The relationship between the fraction of plays composed of 1's and the memory capacity for a constant number of players is plotted in Fig. 13. We note that D apparently depends exponentially on the memory of the automata. The simulation made it possible to follow the dynamics of the game. In this process it was discovered that the state of the group of automata,
TABLE VIII

[The fraction D of plays composed of 1's and the frequency F of action 1, for various numbers of automata v and memory capacities n.]
when all of them perform action 1, is stable, i.e., a change by one automaton of its actions did not result in a separation from the others. It is interesting to compare the circle game for automata L,,,, with an automaton game where the automata are not capable of expedient behavior and choose their strategies randomly regardless of the outcomes of the plays executed. For such automata the value of o;(f') is obviously equal to 2-v. In the example described (just as in the ones that follow), the number of states of the Markov chain describing the game is on the order of (2n). lo3,. However, in the case of simulation on a digital computer, when the number of plays executed is on the order of lo5, one can clearly exhibit the states whose final probabilities approach unity when summed. We note that the value of a n v ( f Mfalls ) off as v increases, although relatively slowly. However, a decrease in the memory capacity n has a much
[Figure 12. Fraction of plays composed of 1's versus the number of automata ν.]
[Figure 13. Fraction of plays composed of 1's versus the memory capacity n for a constant number of players.]
Finite Automata and Modeling the Simplest Forms of Behavior
stronger effect on the mean frequency. One can say that in the example considered, the individual expediency of a playing automaton plays a more important role than the number of automata participating in the game. In the simulation of circle automaton games we noted that the value of the Moore play was not always achieved. For relation (113) to be satisfied, it is apparently important that the expectation of winning by the automaton that changes its strategy in the Moore play be not larger than the expectation of winning by its neighbors. In that case, with sufficient memory capacity n, the probability that the automaton that changed its strategy will return to its previous choice is larger than the probability that its neighbors will change their strategies. Consider, for example, the Moore play f^M = (1, 1, ..., 1) of the game in Example 2. If an automaton 𝔄_j changes its strategy from ε_j = 1 to ε_j = 0, then its payoff becomes V(1, 0, 1) = 0, while the payoffs of its neighbors become V(1, 1, 0) = 0 and V(0, 1, 1) = 0; thus the expectation of winning by automaton 𝔄_j when it changes its strategy is not larger than the expectation of winning by its neighbors.²² This argument apparently explains why relation (113) is satisfied for the game in Example 2. One may, however, select the payoff functions for the circle game in such a way that in the Moore play a change of strategy by any of the participant automata results in a situation in which its expectation of winning turns out to be larger than the expectation of winning of its neighbors. For this case one can assume that the Moore play will, so to speak, be "washed out," and the value of the Moore play will not be reached. To verify this assumption we shall study Examples 3-5.
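The dynamics described above are easy to reproduce numerically. The following is a minimal sketch, not the original program; the class and function names are our own. Each player is a two-action automaton with linear tactics and memory depth n, and after each play automaton j wins with probability (1 + V(ε_{j−1}, ε_j, ε_{j+1}))/2.

```python
import random

class LinearTacticsAutomaton:
    """Two-action automaton with linear tactics and memory depth n."""
    def __init__(self, n):
        self.n = n
        self.action = random.randint(0, 1)  # current action epsilon_j
        self.depth = 1                      # state depth, 1..n
    def reward(self):                       # a win deepens the current state
        if self.depth < self.n:
            self.depth += 1
    def penalize(self):                     # a loss moves toward the boundary
        if self.depth > 1:
            self.depth -= 1
        else:                               # at depth 1 the action is switched
            self.action = 1 - self.action

def circle_game(V, nu, n, plays, seed=0):
    """Fraction of plays composed of 1's among the executed plays."""
    rng = random.Random(seed)
    autos = [LinearTacticsAutomaton(n) for _ in range(nu)]
    ones = 0
    for _ in range(plays):
        eps = [a.action for a in autos]
        ones += all(e == 1 for e in eps)
        for j, a in enumerate(autos):
            v = V[eps[j - 1], eps[j], eps[(j + 1) % nu]]
            if rng.random() < (1 + v) / 2:  # win with probability (1+V)/2
                a.reward()
            else:
                a.penalize()
    return ones / plays

# Payoff function of Example 2: only the unanimous plays pay.
V = {t: 0.0 for t in [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]}
V[1, 1, 1] = 0.6
V[0, 0, 0] = 0.43
```

In line with Table VIII, the fraction of plays composed of 1's returned by `circle_game` should grow with the memory capacity n.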
Example 3. Consider a symmetric game between two players with the following payoff function: V(0, 0) = 0.43, V(0, 1) = 0.43, V(1, 0) = −0.1, V(1, 1) = 0.5. In this game there are two Nash plays, f = (0, 0) and f = (1, 1), the latter being the Moore play. The results of simulation are given in Table IX, which shows that the automata are in no position to agree about playing the Moore play. This is apparently due to the following: Suppose that both automata perform action 1. If one of them changes its action, then its partner, which continues to perform action 1, turns out
²² In the case of a game with a common fund, when there is a deviation from the strategy of the Moore set, the decrease in the mathematical expectation of the payoff is identical for all automata participating in the game, just as in the case discussed for the game on a circle. This is apparently related to the fact that for games with a common fund, Eq. (113) is satisfied in all the modeling cases considered.
[TABLE IX. Fraction of Moore plays f = (1, 1) in Example 3; the observed values are small (0.05, 0.03, 0.00).]
to be in a worse position, and therefore it also changes its action relatively rapidly, and both begin to execute action 0. On the other hand, if in play (0, 0) one of the automata changes its action, then it finds itself in a worse situation, and play f = (0, 0) begins again. Thus, the preferential choice of play f = (0, 0) is determined by the values of V(0, 1) and V(1, 0). This is also confirmed by the results of a simulation of the game with the payoff function

V(0, 0) = 0.43,  V(0, 1) = V(1, 0) = 0,  V(1, 1) = 0.5.
Here the results of simulation were the following: for n = 5 and n = 6, the fraction of plays f = (1, 1) is equal to 0.52 and 0.93, respectively, i.e., the automata agree to perform action 1 and obtain the maximum payoff.

Example 4. As in Example 2, we consider the circle game with ν automata having two strategies each (ε_j = 0, 1). The payoff function of the game has the form

V(0, 0, 0) = 0.43,  V(0, 0, 1) = 0.2,  V(0, 1, 0) = 0,  V(0, 1, 1) = −0.2,
V(1, 0, 0) = 0.2,  V(1, 0, 1) = 0,  V(1, 1, 0) = −0.2,  V(1, 1, 1) = 0.6.
As in Example 2, this game has two Nash plays, f = (1, 1, ..., 1) and f = (0, 0, ..., 0), the first of which is also a Moore play. If in the play f^M = (1, 1, ..., 1) player 𝔄_j, j = 1, 2, ..., ν, changes his strategy from 1 to 0, then the expectation of his winning becomes equal to V(1, 0, 1) = 0; the expectations of winning of his two neighbors 𝔄_{j−1} and 𝔄_{j+1} will be equal to V(1, 1, 0) = −0.2 and V(0, 1, 1) = −0.2, respectively. In this case, when one of the players randomly deviates from strategy 1 prescribed by the Moore play, his neighbors in the game will also change their strategies. Relation (67) will then not be satisfied, and the expectation of winning of the automata will of course be less than the value of the Moore play.
Consider now the Nash play f^N = (0, 0, ..., 0). If any one of the playing automata, for example 𝔄_j, changes its strategy from ε_j = 0 to ε_j = 1, then the expectation of its winning will become equal to −0.2, and that of its neighbors equal to +0.2. Therefore, it can be expected that if in play f = (0, 0, ..., 0) one of the players changes his strategy, then this player will return to the previous strategy faster than his neighbors will change their strategies. It is natural, therefore, to assume that for the game in Example 4 we have the relation

lim_{n→∞} σ̄_{nν}(f^N) = 1,   (114)

where f^N = (0, 0, ..., 0).
The results of a simulation of the game in Example 4 on a digital computer are shown in Table X, in which one can follow the dependence of the mean frequency σ̄_{nν}(f^N) on the number ν of playing automata and on the memory capacity n of each of them.
[TABLE X. Mean frequency σ̄_{nν}(f^N) of the Nash play for various numbers of automata ν and memory capacities n; the tabular layout is not recoverable from the scan.]
The table shows that Eq. (114) is satisfied for this game. It is not difficult to see from Table X that the mean frequency of the Nash play depends more on the memory capacity of the playing automata than on their number; we have already noted this circumstance when describing the results of the simulation of the game in Example 2.
Example 5. In this example the payoff function V(ε_{j−1}, ε_j, ε_{j+1}) of the game with ν automata on the circle (ε_j = 0, 1) is given as follows:

V(0, 0, 0) = 0.43,  V(0, 0, 1) = −0.2,  V(0, 1, 0) = 0,  V(0, 1, 1) = 0.2,
V(1, 0, 0) = −0.2,  V(1, 0, 1) = 0,  V(1, 1, 0) = 0.2,  V(1, 1, 1) = 0.6.
Just as in the previous examples, plays f = (0, 0, ..., 0) and f = (1, 1, ..., 1) are Nash and Moore plays, respectively. However, when any one
of the playing automata deviates from strategy 1, which is prescribed by the Moore play, the expectation of winning of this automaton, V(1, 0, 1) = 0, will turn out to be less than the payoff to its neighbors, V(1, 1, 0) = V(0, 1, 1) = 0.2.
In accordance with the assumption stated above, one can consider that Eq. (113) will be satisfied for this game. Moreover, it is natural to assume that the mean frequency σ̄_{nν}(f^M), for the same values of ν and n as in Example 2, will assume values approaching 1. When this game was simulated, it turned out that σ̄_{nν}(f^M) = 0.09 and 0.8 for the two combinations of ν and n for which, in Example 2, these values were 0.00 and 0.45, respectively.²³

Example 6. Finally, we shall consider at some length an example of a game in which there are no Nash plays. Let us specify in a three-person circle game the following payoff function:

V(0, 0, 0) = −0.9,  V(0, 0, 1) = −0.9,  V(0, 1, 0) = 0.7,  V(0, 1, 1) = 0.9,
V(1, 0, 0) = 0.7,  V(1, 0, 1) = 0.9,  V(1, 1, 0) = 0.7,  V(1, 1, 1) = −0.9.
It is easy to verify that this game has no Nash plays. In fact, in the plays (0, 0, 0), (0, 0, 1), and (1, 1, 1) it is advantageous to change the action of the middle player, and in the play (0, 1, 1), the action of the one on the right. The remaining plays of the game can be obtained from the preceding ones by a cyclic rotation. The mixed strategies can be described in the following way: the equation for the Nash point in mixed strategies [see (110)] has the form

1.6t² − 0.4t − 1.8 = 0.
Solving this equation, it is easy to find the final distribution of the frequencies of the plays. These frequencies and the results of a simulation of an automaton game with memory n = 8 are listed in Table XI. A typical situation is that a Nash play in mixed strategies is not realized by the automata. This can be explained by the fact that, in contrast with the point of view taken in the von Neumann-Nash theory of games, automata do not select their actions independently at all. Naturally, then, their behavior will not be described by specifying independent frequencies of the actions chosen by the automata. Let us consider the dynamics of the game in more detail. In play (0, 1, 0) the first player loses most of the time, whereas the second and third win;
²³ In Table VIII these values of σ̄_{nν}(f^M) were designated by an asterisk.
[TABLE XI. Frequency of each play f at the Nash point in mixed strategies (recoverable values: 0.1, 0.116, 0.113, 0.153) compared with the simulated frequencies for automata with memory n = 8; the tabular layout is not recoverable from the scan.]
therefore, this play is most likely to change to play (1, 1, 0). In the latter the second player loses and the remaining ones win; therefore, play (1, 1, 0) will change into (1, 0, 0), etc. In plays (0, 0, 0) and (1, 1, 1) all the players lose, and therefore the probability of their continuation is negligible. In concluding our discussion of circle games with automata, it should be noted once again that in our analysis we have studied sets of games that differed from one another only in the number of players. In such games the expedient behavior of each automaton participating in a game assures the expedient behavior of the group of automata over a relatively wide range of group sizes. We have already noted that a change in the expedient behavior of an individual automaton (which is related to the memory capacity n) has a much stronger effect on the behavior of the group than a change in the number of participant automata. These remarks make it possible to speak of the reliability of a group of automata joined by their common participation in a circle game. Suppose that a set {K} of circle games is characterized by the payoff function V(ε, ε′, ε″) and by the construction of the playing automata, and let, for example, f = (1, 1, ..., 1) be a Moore play in all these games. The games in this set differ only in the number of players. With a proper choice of payoff function, the value of the Moore play is reached for all games in this set. Therefore, a failure of a number of automata does not result in a breakdown of (113), and consequently it will not change the choice of strategies on the part of the remaining automata. Moreover, as the number of playing automata decreases, the mean frequency of the Moore play will increase, so that the total payoff to the playing automata will decrease relatively slowly. This type of increase in σ̄_{nν}(f) is easy to follow in the tables of the examples described in this section.
An Example of Modeling the Behavior of a Group of Automata with a Two-Level Organization (The Numerical Method Distribution Problem)¹
In this chapter we shall describe a game played by several automata that admits various interpretations of its content. Consider the following "numerical method distribution problem."²

¹ Reprinted with minor changes from the article written jointly with Ginzburg [65] (Editor's note).
² Cf. p. 120 concerning the same problem (Editor's note).

Suppose that we are given a set of stationary random media C = {C^i}, i = 1, 2, ..., N (see p. 13). Each medium C^i is characterized by a set of numbers a_1^i, a_2^i, ..., a_ρ^i, representing the expectations of winning for an automaton located in medium C^i and performing actions 1, 2, ..., ρ, respectively. At every instant of time t = 1, 2, ..., there appears a medium C^i with probability p_i; of course, p_1 + p_2 + ... + p_N = 1. Suppose also that there are M automata R_k, k = 1, 2, ..., M, each of which may perform actions 1, 2, ..., ρ. We assume that M ≤ N. Each automaton R_k is at every instant of time located in some medium C^i, where each medium contains at most one automaton. If automaton R_k is located in a medium C^i, we shall say that it is tuned to the index i. Suppose that at time t there appears a medium C^i. If some automaton R_k is connected to it at that instant, it will perform an action, win or lose, and pass into another state. If, however, at that instant no automaton is connected to the medium, then one of the actions 1, 2, ..., ρ is selected with equal probability. The win or loss determined by the medium in both cases (in response
to an action of an automaton or a random action) is considered to be a win or a loss for the entire group of automata. The problem is this: without knowing beforehand the numbers a_β^i and p_i, we are to select the automata R_k and organize their tuning to the indices in such a way that the expectation of the payoff for the entire group is as large as possible for all combinations of the numbers a_β^i and p_i. For the case in which M = N, it would be natural to tune each automaton to its own constant index; then the automata would function in stationary media. If the automata belonged to asymptotically optimal sequences, one could guarantee (in the limit of infinite memory) the payoff

W = Σ_{i=1}^{N} p_i max_β a_β^i.

If, however, the number M of automata is less than the total number of media, then there is the problem of how to tune the automata to their indices so that the maximum possible payoff will be assured. Sometimes we shall assume that one automaton is tuned to the index 0; it is turned on when a medium appears to whose index no other automaton is tuned. Consider the simplest example. Suppose there exist two media, C¹ = (0.6; 0.9) and C² = (0.45; −0.45), appearing with probability 0.5 each, and suppose we deal with a single automaton. If the automaton is tuned to index 1 and chooses action 2, its payoff in the case of medium C¹ is 0.9. In the case of medium C², the random decision maker acts for the system, and the average payoff of the system is zero. Thus, considering that media C¹ and C² appear with probability 0.5, the average payoff of the system will be 0.9 × 0.5 + 0 × 0.5 = 0.45. Similarly, when the automaton is tuned to index 2, the expectation of the payoff for the system is [(0.6 + 0.9)/2] × 0.5 + 0.45 × 0.5 = 0.6. Suppose now that the automaton is tuned to index 0. In this case it is turned on when any medium appears. Since the expectation of winning when the first strategy is chosen is 0.6 × 0.5 + 0.45 × 0.5 = 0.525, and when the second strategy is chosen it is 0.9 × 0.5 − 0.45 × 0.5 = 0.225, our automaton, being set to index 0, will assure a payoff not exceeding 0.525. Thus, in this example it is most advantageous to set the automaton to index 2. "The numerical method distribution problem" may be considered the problem of organizing the collective behavior of solving devices.³

³ That is, automata capable of expedient behavior are understood to be devices which inaccurately solve the problem of choosing the largest of the numbers a_β^i characterizing a medium C^i (Editor's note).

We
shall attempt here to organize their interaction in such a way that the expedient behavior of the individual solving devices results in the optimal behavior of the entire system when solving a problem. For this purpose, the type of tuning that assures the maximum total payoff must be made the most advantageous one; this is achieved by introducing the common fund. On the other hand, the payoff for each of the solving devices depends not only on which problem it has chosen, but also on how accurately it solves that problem. This implies that, for any fixed tuning of the solving devices to certain indices, a certain amount of time must pass in order that the solving devices be able to make a sufficiently accurate choice of strategies. Therefore, only the average total payoff to the solving devices over a time τ sufficient for them to reach steady-state operation can characterize the quality of a given type of tuning. The value of τ should be selected in such a way that during this time it is possible to estimate roughly the expediency of a given type of tuning; of course, there is no need to choose a value of τ so large that the transient regime has no effect whatsoever. This characteristic feature of the numerical method distribution problem makes it necessary to use two levels for its solution. Thus, during each cycle t = 1, 2, ..., a problem is chosen from the set C¹, C², ..., C^N, the corresponding automaton is turned on, a choice of strategy is made, and an evaluation of this choice takes place. At the times τ, 2τ, ..., an evaluation of the quality of the choice of indices is made and the tuning to the indices is changed. At every instant t = 1, 2, ..., the actions of each automaton are characterized by two numbers (β, i): the strategy chosen, β = 1, 2, ..., ρ, and the index number, i = 1, 2, ..., N.
In choosing the number β, the automata act independently of one another; the functioning of each is simply operation in a stationary random medium. As the solving devices, i.e., the automata R_k, we shall use pairs of automata with linear tactics, denoted (A_k^{(1)}, A_k^{(2)}), k = 1, 2, ..., M. The actions of automaton A_k^{(1)} are the various strategies β = 1, 2, ..., ρ; the actions of automaton A_k^{(2)} consist of choosing the index i = 0, 1, ..., N. The states of the automata A_k^{(2)}, and hence their actions, do not change within the interval t = (r − 1)τ + 1, ..., rτ. Automaton A_k^{(1)}, k = 1, 2, ..., M, is turned on only when a medium appears whose index i is the same as that to which automaton A_k^{(2)} is tuned, and it performs the action β that corresponds to its state at that time.
Here the automaton wins with probability q_β^i = ½(1 + a_β^i) and loses with probability 1 − q_β^i. A win or a loss of automaton A_k^{(1)} changes its state (in accordance with its structure). Suppose now that a medium appears to whose index i none of the automata A_k^{(2)}, k = 1, 2, ..., M, is tuned; then either some automaton A_k^{(1)} is turned on (if its A_k^{(2)} has index 0), or the random choice maker is turned on. Consequently, in each cycle either one of the automata A_k^{(1)}, k = 1, 2, ..., M, or the random choice maker is turned on; the remaining automata do not change their states. The wins and losses of the automata and of the random choice maker are summed over the period τ, forming the total payoff of the entire system during that time. This payoff, divided by the number of cycles τ, determines the payoff W for the automata A_k^{(2)}, k = 1, 2, ..., M, at the time rτ. The automata A_k^{(2)}, which have N + 1 actions 0, 1, 2, ..., N each, at the moments rτ, r = 1, 2, ..., win with probability q = ½(1 + W) and lose with probability p = ½(1 − W), which results in a change in their states, and thus in their actions. The choice of index is made in such a way that each index is selected by at most one automaton. We recall that the automata A_k^{(2)} are also automata with linear tactics.

We shall now describe the operation of the program simulating the system S on a digital computer. A medium C^i = C^i(a_1^i, a_2^i, ..., a_ρ^i) (where a_β^i is the expectation of winning with strategy β in the ith medium, i = 1, 2, ..., N, β = 1, 2, ..., ρ) appears with probability p_i at the instants t = 1, 2, .... The game is played by the pairs of automata (A_k^{(1)}, A_k^{(2)}), k = 1, 2, ..., M, and the random choice maker. The states of automaton A_k^{(1)}, k = 1, 2, ..., M, are denoted by φ_j^β, where β = 1, 2, ..., ρ is the number of the strategy chosen by the automaton and j = 1, 2, ..., n is the depth of the state. The states of automaton A_k^{(2)}, k = 1, 2, ..., M, are denoted by ψ_γ^i, where i = 1, 2, ..., N is the number⁴ of the medium (index) to which the automaton is tuned, and γ = 1, 2, ..., m is again the depth of the state.

⁴ Or i = 0, 1, 2, ..., N (Editor's note).

Suppose that a medium C^i appears at time t. Then an attempt is made to find the automaton A_k^{(2)} that is tuned to the index i. If such an automaton exists, automaton A_k^{(1)} is turned on; if the state of automaton A_k^{(1)} at time t was φ_j^β, it chooses strategy β. Here automaton A_k^{(1)} wins with probability q_β^i = ½(1 + a_β^i). In the case of winning [u_k(t) = 1], automaton A_k^{(1)} transfers from state φ_j^β into state φ_{j+1}^β if j ≠ n, and remains in state φ_n^β if j = n. In the case of losing [u_k(t) = −1], automaton A_k^{(1)} transfers from state φ_j^β into
state φ_{j−1}^β if j ≠ 1, and transfers with probability 1/ρ into one of the states φ_1^{β′}, β′ = 1, 2, ..., ρ, if j = 1. The quantities u_k(t) are summed over the time (r − 1)τ + 1, ..., rτ in a counter called the common fund. If at time t none of the automata A_k^{(2)} has the index i, but there is an automaton A_k^{(2)} with the zero index, the computation is made as if this automaton had index i. It may happen, however, that there is no automaton with index i or 0. In this case the random choice maker is turned on: one of the strategies β = 1, 2, ..., ρ is chosen with probability 1/ρ, and the expectation of winning is then the average of the numbers a_β^i. The payoff W for each automaton A_k^{(2)} of the system S (the total payoff accumulated by the system during τ cycles, divided by the number of cycles τ) is computed after every τ cycles, and the contents of the common fund are cleared. This is followed by the automata A_k^{(2)}, k = 1, 2, ..., M, playing for payoffs. Any automaton A_k^{(2)} wins with probability q = ½(1 + W) and loses with probability p = 1 − q.⁵ In the case of a win, automaton A_k^{(2)} transfers from state ψ_γ^i into state ψ_{γ+1}^i if γ ≠ m, and remains in state ψ_m^i if γ = m. In the case of a loss, automaton A_k^{(2)} transfers into state ψ_{γ−1}^i if γ ≠ 1. If, however, γ = 1, then the automaton transfers with equal probability into a state ψ_1^l, where l assumes any value except those already occupied. This procedure excludes the case where the same index is chosen by two or more automata. In addition to accumulating the total payoff to the system S during the τ cycles, the average payoff to each pair of automata and to the random decision maker over the entire computation time T is also calculated. These "personal" payoffs have no effect on subsequent computations, but their sum (the payoff to the system S) is used to judge the expediency of the behavior of the automata.
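The two-level procedure just described can be condensed into a short simulation. The sketch below is ours, not the original program: the names are assumptions, collisions among initially chosen indices are simply tolerated (the retuning step then separates them), and an automaton with ρ actions switches to a uniformly random admissible action on a penalty at depth 1.

```python
import random

class LinearAutomaton:
    """Automaton with linear tactics: `rho` actions, memory depth `n`."""
    def __init__(self, rho, n, rng):
        self.rho, self.n, self.rng = rho, n, rng
        self.action = rng.randrange(rho)
        self.depth = 1
    def update(self, won, forbidden=frozenset()):
        if won:
            self.depth = min(self.depth + 1, self.n)
        elif self.depth > 1:
            self.depth -= 1
        else:  # penalty at the boundary: switch to another admissible action
            choices = [a for a in range(self.rho) if a not in forbidden]
            self.action = self.rng.choice(choices)

def simulate(media, probs, M, n1, n2, tau, epochs, seed=0):
    """media[i][b] = a_b^i; tuner action 0 means 'serve any uncovered medium'."""
    rng = random.Random(seed)
    N, rho = len(media), len(media[0])
    strat = [LinearAutomaton(rho, n1, rng) for _ in range(M)]    # A^(1)
    tuner = [LinearAutomaton(N + 1, n2, rng) for _ in range(M)]  # A^(2)
    total = 0.0
    for _ in range(epochs):
        fund = 0
        for _ in range(tau):
            i = rng.choices(range(1, N + 1), weights=probs)[0]
            tuned = {t.action: k for k, t in enumerate(tuner)}
            k = tuned.get(i, tuned.get(0))   # exact index, else the zero index
            if k is not None:
                beta = strat[k].action
                won = rng.random() < (1 + media[i - 1][beta]) / 2
                strat[k].update(won)
            else:                            # random choice maker
                beta = rng.randrange(rho)
                won = rng.random() < (1 + media[i - 1][beta]) / 2
            fund += 1 if won else -1         # the common fund
        W = fund / tau                       # system payoff per cycle
        for k, t in enumerate(tuner):        # second level: retune the indices
            won = rng.random() < (1 + W) / 2
            taken = {u.action for j, u in enumerate(tuner) if j != k}
            t.update(won, forbidden=taken)
        total += W
    return total / epochs
```

For instance, `simulate([(0.6, 0.9), (0.45, -0.45)], [0.5, 0.5], 1, 5, 7, 20, 500)` models a single pair of automata with the two media of the example above; the returned average payoff should approach the best tuning's theoretical value as the memory capacities grow.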
If the automata (A_k^{(1)}, A_k^{(2)}) belong to an asymptotically optimal sequence, one can assume that in the limit (as their memories n^{(1)}, n^{(2)} → ∞) they choose the indices and strategies in such a way as to obtain the maximum payoff W_max^S. Let us consider several special cases.
1. The system S consists of a single pair of automata (A^{(1)}, A^{(2)}), and the zero index is absent, i.e., i = 1, 2, ..., N. Then

W_max^S = max_{i=1,2,...,N} { p_i [max(a_1^i, a_2^i, ..., a_ρ^i)] + Σ_{r≠i} p_r (a_1^r + ... + a_ρ^r)/ρ }.

⁵ The payoffs for the automata A_k^{(2)} are played for independently (Editor's note).
2. The system S consists of a single pair of automata (A^{(1)}, A^{(2)}), but A^{(2)} may have the indices i = 0, 1, ..., N. In this case W_max^S = max W_i for i = 0, 1, 2, ..., N, where

W_0 = max_β Σ_{i=1}^{N} p_i a_β^i,

W_i = p_i [max(a_1^i, a_2^i, ..., a_ρ^i)] + Σ_{r≠i}^{N} p_r (a_1^r + ... + a_ρ^r)/ρ,   i = 1, 2, ..., N.
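As a numerical check, the formulas of case 2 applied to the two-media example considered earlier (C¹ = (0.6; 0.9), C² = (0.45; −0.45), p₁ = p₂ = 0.5) can be evaluated directly; the helper name `W` below is ours:

```python
media = {1: (0.6, 0.9), 2: (0.45, -0.45)}  # a_b^i for C^1 and C^2
p = {1: 0.5, 2: 0.5}

def W(i):
    """Payoff of the system when the single automaton is tuned to index i."""
    if i == 0:  # the automaton answers every medium with one best action
        return max(sum(p[r] * media[r][b] for r in media) for b in (0, 1))
    # medium i is answered optimally, the rest by the random choice maker
    return p[i] * max(media[i]) + sum(
        p[r] * sum(media[r]) / len(media[r]) for r in media if r != i)

print(round(W(0), 3), round(W(1), 3), round(W(2), 3))  # 0.525 0.45 0.6
```

These agree with the values 0.525, 0.45, and 0.6 obtained for this pair of media in the text.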
3. The system S consists of M pairs of automata (A_k^{(1)}, A_k^{(2)}), k = 1, 2, ..., M. The automata A_k^{(2)} may have only indices different from zero. In this case we have the formula

W_max^S = max_{i_1, i_2, ..., i_M} W(i_1, i_2, ..., i_M),

where

W(i_1, i_2, ..., i_M) = Σ_{δ=1}^{M} p_{i_δ} [max(a_1^{i_δ}, ..., a_ρ^{i_δ})] + Σ_{r∉{i_1,...,i_M}} p_r (a_1^r + ... + a_ρ^r)/ρ.
4. The system S consists of M pairs of automata (A_k^{(1)}, A_k^{(2)}), k = 1, 2, ..., M. The automata A_k^{(2)} may have the indices i = 0, 1, 2, ..., N. Then the payoff W of the system for one or another tuning to the indices is determined by the formulas of case 3 if there is no automaton A_k^{(2)} with the zero index; if one of the automata A_k^{(2)}, say the δ₀th, is tuned to the zero index, then

W(i_{δ₀} = 0, i_1, i_2, ..., i_M) = Σ_{δ=1, δ≠δ₀}^{M} p_{i_δ} [max(a_1^{i_δ}, ..., a_ρ^{i_δ})] + max_β Σ_{r∉{i_δ}} p_r a_β^r.

In the final analysis,

W_max^S = max_{i_1, ..., i_M} [W(i_{δ₀} = 0, i_1, i_2, ..., i_M), W(i_1, i_2, ..., i_M)].
In the following we state the results of a computer simulation for several examples of the numerical method distribution problem.
Example 1. C¹ = (0.9; 0), C² = (−0.9; 0.9); the media appear with probability 0.5 each. The game is played by one pair of automata (A^{(1)}, A^{(2)}) with memories n^{(1)} = n^{(2)} = 5. The tuning is done every 20 cycles. The payoffs calculated from the formulas of case 2 are as follows: W_0 = 0.45, W_1 = 0.45, W_2 = 0.675. The results of simulation for the cases i = 1, 2 and i = 0, 1, 2 are listed in Tables XII and XIII.
[TABLE XII. Payoff W^S and the cumulative numbers of cycles spent on indices 1 and 2, after 200 to 5000 cycles; the automaton settles on index 2, with W^S ≈ 0.67.]
[TABLE XIII. The same quantities when index 0 is also allowed; the automaton again settles on index 2, with W^S ≈ 0.67.]
These show that automaton A^{(2)} in both versions selects index 2 in a sufficient number of cycles. The average payoff W^S to the system in this case is 0.67.

Example 2. The media C¹ = (0.6; 0.9), C² = (0.45; −0.45) appear with probability 0.5 at the input. The game is played by a pair of automata (A^{(1)}, A^{(2)}). The memory of automaton A^{(1)} is 5; the memory of A^{(2)} is 7. The index tuning is checked every 20 cycles (τ = 20). The index is any integer from 0 to 2.
The formulas of case 2 yield the following theoretical payoffs: W_0 = 0.525, W_1 = 0.45, W_2 = 0.6. The results of simulation are as follows: automaton A^{(2)} almost always selects index 2, and the payoff W^S for the system is on the average equal to 0.6.

Example 3. One of the five media C¹ = (0.2; −0.6), C² = (−0.9; 0.9), C³ = (−0.5; 0.5), C⁴ = (−0.3; 0.3), C⁵ = (−0.3; 0.3) appears with equal probability 0.2. It is not hard to see that for the first medium the strategy β = 1, and for the remaining ones the strategy β = 2, is advantageous. The system S consists of two pairs of automata (A_k^{(1)}, A_k^{(2)}), k = 1, 2, and of the random choice maker. In this case it is easy to compute the average payoff to the system for any given tuning (see Table XIV). In this table the pair of numbers i, j denotes a tuning of one decision maker to index i and of the other to index j.
[TABLE XIV. Average payoff to the system for each pair of index tunings (i, j). Recoverable entries: (1, 5): 0.1; (2, 3): 0.24; (2, 4): 0.2; (2, 5): 0.2; (3, 4): 0.12; (3, 5): 0.12; (4, 5): 0.08; the remaining entries are not recoverable from the scan.]
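The recoverable entries of Table XIV can be recomputed from the formulas of cases 3 and 4; the sketch below, with helper names of our own, reproduces them:

```python
media = [(0.2, -0.6), (-0.9, 0.9), (-0.5, 0.5), (-0.3, 0.3), (-0.3, 0.3)]
p = [0.2] * 5  # every medium appears with probability 0.2

def payoff(tuning):
    """Average payoff of the system for two tuners holding these indices
    (1-based; 0 = the automaton that serves every uncovered medium)."""
    covered = [i for i in tuning if i != 0]
    rest = [i for i in range(1, 6) if i not in covered]
    w = sum(p[i - 1] * max(media[i - 1]) for i in covered)
    if 0 in tuning:  # the zero-index automaton settles on one best action
        w += max(sum(p[r - 1] * media[r - 1][b] for r in rest) for b in (0, 1))
    else:            # uncovered media fall to the random choice maker
        w += sum(p[r - 1] * sum(media[r - 1]) / 2 for r in rest)
    return w

print(round(payoff((0, 1)), 2))  # 0.44 (the most advantageous tuning)
print(round(payoff((2, 3)), 2))  # 0.24
```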
Table XIV shows that setting one automaton A_k^{(2)} to index 0 and the other to index 1 is most advantageous; the payoff of the system is then 0.44. The payoff for the system averaged over all pairs of index tunings is 0.18; therefore, if the payoff for the system is 0.42, the combination (0, 1) must occur approximately 12 times more often than all the remaining ones. The memory of each of the automata A_k^{(1)}, k = 1, 2, is 5. Versions were run with various memory capacities n_k^{(2)} of the automata A_k^{(2)}, k = 1, 2, and with various intervals τ. Table XV gives the average payoffs and the distribution of the automata A_k^{(2)} over the media up to the time of 10,000 intervals τ, in terms of percentages of the total number of intervals τ. Table XV shows that for n^{(2)} = 5 we obtain a blurred picture of the distribution of the automata over the media. For n^{(2)} = 10 the combination
[TABLE XV. Average payoffs and the distribution (in percent of the intervals τ) of the automata A_k^{(2)} over the indices 0-5, for various memory capacities n^{(2)} and intervals τ; the tabular layout is not recoverable from the scan.]
(0, 1) stands out clearly. How fast this combination begins to assert itself can be seen from Table XVI (n^{(2)} = 10, τ = 40). Thus, practically speaking, after about 2000 cycles the automata select only the index combination (0, 1).
[TABLE XVI. Payoff W^S and the distribution of A_1^{(2)} and A_2^{(2)} over the indices after 1000, 2000, 4000, 8000, and 12,000 intervals (n^{(2)} = 10, τ = 40); W^S grows from 0.14 to 0.42 as the combination (0, 1) takes over.]
Behavior of Automata in Periodic Random Media and the Problem of Synchronization in the Presence of Noise¹
In Section 3 of Chapter 1, Part 1, we studied the behavior of automata in random media whose probabilistic characteristics were governed by a simple Markov chain. It was discovered that there exists an optimal memory capacity for an automaton with linear tactics, and in Section 4 of Part 1 it was learned that a stochastic automaton with a variable structure forms a memory which is close to optimal. In this chapter we attempt to find out how a priori information concerning the law governing the variation of the probabilistic characteristics of a medium can be used to design automata that are optimal for the given medium. To this end we will treat complex automata as structures made up of simple automata, and attempt to study the possibility of a proper organization of this collective. To construct such complex automata we will use simple automata with optimal behavior in the simplest situations, such as stationary random media. It is natural to begin with some simple law governing the variation of the characteristics of the medium. For simplicity we will consider automata with two outputs, 0 and 1; this constraint does not limit the generality of the results. At each instant, the automaton output and the state of the medium determine whether a "penalty" or "payoff" signal will be applied to the automaton input at the next instant.²

¹ An article written jointly with Varshavskiy and Meleshina [45] (Editor's note).
² Input signal s = 1 in the case of a penalty and s = 0 in the case of a payoff (p. 12) (Editor's note).

It is assumed that if the output signal of an automaton at time t is 0, the probability of a penalty at the next point in time is p_0(t), while if the automaton output at time t is 1, the penalty probability at the next point in time is p_1(t). Since the probabilities lie in
the interval (0, 1), the situation reduces, when p_0(t) and p_1(t) are monotonic functions of time, to the behavior of an automaton in a stationary random medium as t → ∞. Thus, we will be interested only in the case in which the p_i(t), i = 0, 1, are not monotonic functions. We will assume that the p_i(t) are periodic or asymptotically periodic functions. The simplest example of a periodic function p_i(t) is a periodic piecewise constant function. We will attempt to relate a model of such a medium to some technical problem so as to give it a practical interpretation. It so happens that the problem of recognizing periodic signals in a noisy channel can be reduced to the problem of behavior in periodic random media. An example of such a situation is the problem of synchronizing the switches of instruments in a system for transmitting telemetry with time division of channels. In connection with the foregoing discussion, we have divided the present paper into two sections. In the first section we consider the behavior of a particular automaton in periodically changing random media, and, for a medium with a known switching period, we show that this automaton is asymptotically optimal. For a medium with an unknown switching period, we present the results of computer simulation, which show that in this case the automaton solves the problem under investigation. In the second section we describe the problem of synchronizing the commutating and decommutating circuits of a telemetry system with time division of channels in terms of behavior in periodically switching random media. We shall say that an automaton operates in a periodically switching random medium (p_{1,0}, p_{1,1}, p_{2,0}, p_{2,1}, ..., p_{T,0}, p_{T,1}) with period T if the following conditions are satisfied: if the automaton output at time t is 0, the penalty probability at time t + 1 is p_{t,0}, while if the automaton output at time t is 1, the penalty probability at time t + 1 is p_{t,1}.
We assume that p_{t+T,0} = p_{t,0} and p_{t+T,1} = p_{t,1}, t = 1, 2, .... For simplicity we shall henceforth use the phrase "the behavior of an automaton in a periodic random medium." It is not hard to show that the possible mathematical expectation of a penalty for any automaton operating in a periodic random medium lies between

M_min = (1/T) Σ_{t=1}^{T} min(p_{t,0}, p_{t,1})  and  M_max = (1/T) Σ_{t=1}^{T} max(p_{t,0}, p_{t,1}).
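In modern notation these bounds are easy to evaluate; the sketch below (the penalty probabilities are illustrative, not taken from the text) computes M_min and M_max for a given periodic medium:

```python
# Sketch: bounds on the expected penalty in a periodic random medium.
# p is a list of (p_t0, p_t1) pairs, one per phase of the period T.

def penalty_bounds(p):
    T = len(p)
    m_min = sum(min(p0, p1) for p0, p1 in p) / T
    m_max = sum(max(p0, p1) for p0, p1 in p) / T
    return m_min, m_max

# A medium with period T = 3 (illustrative values):
medium = [(0.1, 0.9), (0.8, 0.2), (0.3, 0.6)]
print(penalty_bounds(medium))  # M_min = 0.2, M_max = 23/30
```

No automaton can do better than M_min, and even the worst automaton does no worse than M_max.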
We assume that the period T of the medium is known. Then, if we consider an automaton that operates only at times t, t + T, t + 2T, ..., it will function in a stationary random medium C(p_{t,0}, p_{t,1}); several asymptotically optimal sequences of automata are known for this problem. This last consideration determines the structure of an automaton with asymptotically optimal behavior in a periodic random medium. A diagram of such an automaton is shown in Fig. 14. The automaton consists of T automata (such as automata with linear tactics) that are asymptotically optimal in a stationary random medium, and two cyclic commutators with switching period T. The first commutator cyclically connects the individual automaton outputs to the output of the device; the second cyclically connects the individual automaton inputs to the input of the device. The automata are connected to the first and second commutators in the same order. The commutators are synchronized with the discrete-time rhythm of the medium, and the second commutator lags behind the first by one cycle, so that at time t + 1, the input of the automaton whose output is connected to the output of the device at time t is connected to the input of the device. Thus, if the period of the commutator coincides with, or is a multiple of, the period of the medium, each automaton functions in the same stationary random medium. (Here it is assumed that, when the input of an automaton is disconnected from the input of the device, its states do not change.) Then, if the automata forming the device have a sufficient memory capacity, the expectation of a penalty for each automaton is arbitrarily close to min(p_{t,0}, p_{t,1}), i.e., the expectation of a penalty for the device as a whole is arbitrarily close to M_min and, consequently, in a periodic random medium with a known period, the device as a whole possesses asymptotically optimal behavior. (When automata with linear tactics are used, we must also assume that min(p_{t,0}, p_{t,1}) ≤ 1/2 for all t.) We will now attempt to solve the problem for the case in which the period is unknown. We require, however, that the upper limit of possible periods
Figure 14
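A minimal modern simulation of this design, assuming a known period; the penalty probabilities are illustrative, and the class below is a simplified stand-in for an automaton with linear tactics, not the author's construction:

```python
import random

# Sketch of the device of Fig. 14 for a KNOWN period T: T automata with
# linear tactics behind a cyclic commutator, so that each sub-automaton
# faces a stationary random medium.

class LinearTactics:
    """Two-action automaton with linear tactics and memory depth n."""
    def __init__(self, n):
        self.n, self.action, self.depth = n, 0, 1
    def update(self, penalized):
        if not penalized:                 # payoff: move deeper into the action
            self.depth = min(self.depth + 1, self.n)
        elif self.depth > 1:              # penalty: move toward the boundary
            self.depth -= 1
        else:                             # at the boundary: change the action
            self.action = 1 - self.action

def run_device(p, n=10, steps=60000, seed=1):
    """p is a list of (p_t0, p_t1) penalty probabilities, one per phase."""
    rng = random.Random(seed)
    T = len(p)
    bank = [LinearTactics(n) for _ in range(T)]
    penalties = 0
    for t in range(steps):
        a = bank[t % T]                   # the commutator selects one automaton
        penalized = rng.random() < p[t % T][a.action]
        a.update(penalized)
        penalties += penalized
    return penalties / steps

medium = [(0.1, 0.9), (0.8, 0.2), (0.3, 0.7)]  # hypothetical; M_min = 0.2
print(run_device(medium))  # settles close to M_min = 0.2
```

For these values the simulated average penalty settles near the bound M_min, as the argument in the text predicts.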
T_max be given. Note that for the majority of practical problems, this last constraint is not excessive. In this case, too, the foregoing discussion provides a natural approach to the construction of an automaton that is asymptotically optimal: it is sufficient to take the design shown in Fig. 14 and make the switching period equal to the least common multiple of the numbers from 1 to T_max. Then any possible period of the medium will be a divisor of the switching period. This solution, however, is obviously uneconomical. Of course, there are automaton designs in which the switching period is T_max and provision is made for optimal inspection of all possible switching periods between 1 and T_max. This inspection can be performed, for example, by an automaton with linear tactics and T_max outputs, the penalty probability for each output being determined by the average number of penalties drawn by a device with switching period defined by the output of this automaton. This solution, however, is also unsatisfactory, since the system would have a very large settling time, even though the automaton is asymptotically optimal. The unacceptably long settling time of this automaton necessitates other means of determining the switching period. Note that the problem of determining a period in the interval 1 ≤ T ≤ T_max can be reduced to the simplest symmetric automaton game, called a Goore game (the Goore game is discussed on p. 113). Indeed, consider a collective of T_max automata, each having outputs 0 and 1; a Goore game is said to be defined if the payoff probability for each of the automata playing the game depends only on the number of the automata choosing the output 1. It is shown on p. 235 of this book that the number of automata choosing output 1, with probability tending to unity as the memory of the automata increases, corresponds to the minimum point of the function determining the probability of a penalty as a function of the number of automata choosing output 1.
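The Goore game itself is easy to sketch; the program below is illustrative (the penalty function g and all parameters are chosen arbitrarily, with the minimum of g placed at s = 3), not the author's construction:

```python
import random

# Sketch of a Goore game played by automata with linear tactics. The penalty
# probability g(s) depends only on the number s of automata choosing output 1.

class LinearTactics:
    def __init__(self, n):
        self.n, self.action, self.depth = n, 0, 1
    def update(self, penalized):
        if not penalized:
            self.depth = min(self.depth + 1, self.n)
        elif self.depth > 1:
            self.depth -= 1
        else:
            self.action = 1 - self.action

def goore_game(N=10, n=5, steps=50000, seed=2):
    rng = random.Random(seed)
    players = [LinearTactics(n) for _ in range(N)]
    def g(s):                     # penalty probability, minimal at s = 3
        return 0.2 + 0.6 * abs(s - 3) / (N - 3)
    total = 0.0
    for _ in range(steps):
        s = sum(p.action for p in players)
        q = g(s)
        total += q
        for p in players:         # independent penalties, common probability q
            p.update(rng.random() < q)
    return sum(p.action for p in players), total / steps

count, avg_penalty = goore_game()
print(count, avg_penalty)  # the count of 1s drifts toward the minimum of g
```

As the memory n grows, the number of players choosing output 1 concentrates near the minimum point of g, which is the result invoked in the text.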
We shall use this result to construct a group of automata possessing expedient behavior in a periodic random medium with an unknown period. In this case, we must also allow for the fact that the penalty is determined not only by proper selection of the period, but also by proper choice of outputs at each point in time by the automata determining the output state of the device. This, of course, is the essential difference between the system of automata described here and a Goore game. The above considerations were used as a basis for constructing the automaton shown in Fig. 15; it differs from the design of Fig. 14 in the presence of additional automata B, one for each position of the commutator. The number of commutator positions is T_max. Each automaton B is an automaton with linear tactics and two outputs, 0 and 1. The commutators can stop only at those positions corresponding to automata B with output 1. After each discrete cycle, the commutator moves from a given position to the next position corresponding to an automaton B with output 1. Thus, the operating period of the commutator is equal to the number of automata B with output 1. After each complete rotation of the commutator, the automata B are penalized with a probability equal to the average number of penalties drawn by the automata A operating during this period. Analytic investigation of the behavior of such a device is extremely difficult; below we present the results of an experimental investigation by means of computer simulation. The experimental results are shown in Table XVII, where n is the memory capacity of the automata with linear tactics (the same for automata A and B), T_max is the number of automaton pairs simulated, T is the actual period of the medium, t is the duration of the transient mode in numbers of periods of the medium, and M is the average number of penalties per cycle of automaton operation between termination of the transient mode and the end of the experiment. In order to determine the stability of the device, we ran all the experiments for more than 50,000 cycles. The stability was estimated to three significant figures on the basis of the average number of penalties. The values given for M in the table show that, according to this criterion, the device is stable. It is also clear that this criterion does not detect brief random instabilities. In the experiments, we investigated the behavior in a periodic sequence of
two stationary random media of the form C_1(p_1, 1 - p_1) and C_2(1 - p_2, p_2). The values of p_1 and p_2, together with the code determining the sequence of the media in a given period, are shown in the table. It is clear from the table that the duration of the transient mode is very short. It is possible that this is connected with the method of choosing the initial internal states of the automata with linear tactics: they were chosen at the output interface. On the basis of our experiments, we can assert that, with a sufficient degree of accuracy, the design shown in Fig. 15 behaves optimally in periodic random media with period no greater than T_max.

TABLE XVII
[Table XVII (columns n, T_max, T, t, p, Code, M): the individual entries are not recoverable from the scan.]
We will now attempt to give the problem of the behavior of automata in periodic random media a practical interpretation by discussing the problem of synchronizing switching circuits in a system for transmitting telemetry with time division of channels over a noisy communications channel. The switching device in the transmitter sequentially interrogates the telemetric pickups in accordance with some program and sequentially transmits data on their states over the communications channel. The switching circuit at the receiving end must extract and identify the signals from the various pickups. Thus, the switching circuit in the receiver must, with the same phase, synchronously execute the same sequence of operations as the switching circuit in the transmitter.
The problem of synchronizing switching circuits is of particular value when the received data must be used for on-line control of the object carrying the telemetric pickups. Synchronization of switching circuits includes the problem of synchronizing the logic circuits in a receiver with the corresponding logic circuits in a transmitter for proper decoding of the received information. We will assume that prior to operation (and possibly during transmission of information), the transmitter, in order to enter the synchronous mode, transmits an alignment signal (sync signal) composed of an n-digit binary code K. The time during which one data pickup is connected with the communications channel corresponds to the length of one binary digit in the synchronizing code, and the time at which the pickups are switched corresponds to the time at which the binary digit is changed. We shall assume that the transmitter and receiver operate in some discrete time scale and have identical standards for determining the lengths of the binary digits. Obtaining such a standard is naturally associated with carrier-frequency synchronization. Two more problems can now be stated: (1) synchronization of the times at which the binary digits change, i.e., synchronization of the times at which the channels change, and (2) synchronous and in-phase restoration of sync signals at the receiver. In order to solve the first problem, we take for the unit time one or more periods of the carrier frequency, depending on the accuracy required in determining the point at which the binary digits change. In the second problem, it is natural to take as the unit time the length of a binary transmission, which is either known, for example, in terms of the carrier-frequency period, or is determined by finding the times at which the digits in the binary sync signal change. Since the communications channel is noisy, the initial signal is received with some error.
We will assume that at each point in time the errors are independent and the probability of incorrect reception is p, where p < 0.5. In order to eliminate a number of additional difficulties, we will consider synchronizing codes 2k digits long. We start by considering the problem of synchronizing the times at which the digits in the code change. Assume that the probability of incorrect reception of a sync signal in a time interval equal to the period used to determine the switching point is p_1. We require that the sum of the ones in the even positions in the code K be less than 0.5k, and that the sum of the ones in the odd positions in the code K be larger than 0.5k. Assume, for example, that we are given the synchronizing code 1110101010 (k = 5). The sum of the ones in the even positions of the code is A_even = 1, while the sum of the ones in the odd positions is A_odd = 5. Then, at the receiver, the sync signal can be treated as a periodic random sequence with period equal to the length of two digits, so that during the transmission of even digits, the probability of the appearance of a one in this sequence is (A_even/k)(1 - 2p_1) + p_1 < 0.5³ [in our example, this probability is (1 + 3p_1)/5], while during the transmission of odd digits it is (A_odd/k)(1 - 2p_1) + p_1 > 0.5 (in our example, 1 - p_1). It is clear that in an investigation of the behavior of the automata just described, the correlation that exists here does not distort the results obtained on the assumption that the sequence is of the Bernoulli type. There are two possibilities: (a) the lengths of the digits are known, and (b) the lengths of the digits are not known. Depending on the circumstances, we choose a device of the type shown in Fig. 14 or of the type shown in Fig. 15. We make the switching period equal to the known (or maximum) length of two digits, compare the output of the device with the input signal, and determine the penalty or payoff, depending on whether or not the signals coincide. It is not difficult to see that in this case the behavior of the devices is equivalent to their behavior in the periodic random medium and, as the foregoing discussion shows, the output signal of the device will be one while odd digits are being transmitted, and zero while even digits are being transmitted, correct to the time interval of the commutators. The solution of the problem of sync code repetition is now obvious. Let p_2 be the probability of incorrect determination of a signal while a digit is being transmitted. Then, in the digits of code K with value 1, ones will appear with probability p_2. As in the preceding case, depending on the information we have about 2k, we choose either a device of the type shown in Fig. 14, with switching period 2k, or a device of the type shown in Fig.
15, with switching period (2k)_max. In order to determine the penalty and payoff, we compare the output of the device with the received signal, using the time standard of the device to determine the point at which the code digits change. It is clear from this that here the device must enter the signal (sync code) repetition mode. Technological realization of this device presents no difficulty, especially when economical (from the viewpoint of technical realization) functional analogs of automata with linear tactics are used. Of course, this does raise its own difficulties. We did not set out to design a sync system. However, we have demonstrated the possibility of solving synchronization problems and have shown that a particular interpretation of problems concerning the behavior of automata in random media can provide new principles for the construction of a number of technical devices; nor have we attempted to compare our method of synchronization with known methods such as special servo systems. It should, however, be noted that our method is extremely reliable. Random failures in the automata have practically no effect on the results of the collective device, and when a device with a variable period is used, the system is also stable with respect to failure of individual automata. This last assertion is an obvious consequence of the operating principle of the device.

³ The Russian text uses the subscript "odd" here, but it should correctly be "even" (Translator's note).
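The perceived one-probabilities for even and odd digits can be checked numerically. The sketch below uses the code 1110101010 of the example; the function name is of course our own:

```python
# Sketch: perceived one-probabilities at the receiver for a 2k-digit sync
# code, following the formulas (A/k)(1 - 2*p1) + p1 from the text.

def perceived(code, p1):
    k = len(code) // 2
    a_even = sum(int(code[i]) for i in range(1, len(code), 2))  # positions 2, 4, ...
    a_odd  = sum(int(code[i]) for i in range(0, len(code), 2))  # positions 1, 3, ...
    q_even = a_even / k * (1 - 2 * p1) + p1
    q_odd  = a_odd  / k * (1 - 2 * p1) + p1
    return q_even, q_odd

print(perceived("1110101010", 0.1))  # approximately (0.26, 0.9)
```

For p_1 = 0.1 this gives (1 + 3·0.1)/5 = 0.26 for the even digits and 1 - 0.1 = 0.9 for the odd digits, matching the expressions in the text.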
Organization of the Queuing Discipline in Queuing Systems Using Models of the Collective Behavior of Automata¹
An analysis of queuing systems with infinite queues shows that the quality of the system operation, determined, for example, by the average length of a queue, can be improved by the introduction of priorities [131]. In this case the highest priority number is given to a subscriber with the shortest average service time. However, in order to organize a system of priorities one needs, first, to know the probabilistic characteristics of the input flows, and, second, a special system of queue control. Often, in a number of practically important cases, the probabilistic characteristics of the inputs are not known beforehand or vary with time. In this report we shall describe a queuing system in which the priorities are worked out directly in the channels without any a priori knowledge of the input characteristics. As a result, each channel, in the course of its operation, selects a subscriber with the smallest average service time and, in case the service times are close to one another, a subscriber with the densest flow. The selection of the corresponding subscriber by the channel is done in the following way. At the instant a subscriber is connected to a channel, a control sum N is stored in a special register. The value of N may be chosen to be, for example, equal to the average length of a call, averaged over all subscribers.² During the call the subscriber is charged per unit time, and the charge is subtracted from the control sum. If the real length of the call, τ, is shorter than N, then in the channel register there remains an unspent sum which entitles this subscriber to use the channel next time without waiting in line. If τ > N, the subscriber does not have that privilege. The role of the registers is played by automata with linear tactics (see p. 16). The index of the automaton action i (i = 1, ..., M) is determined by the number of the subscriber who at a given instant is selected by the channel. The graphs of transitions within the set of automaton states corresponding to a single action are shown in Fig. 16, where the input S_1 corresponds to the arrival of a claim on the channel with the automaton turned on, the input S_2 corresponds to the arrival of a claim when the automaton is turned off, and the input S_3 corresponds to the situation in which, at the instant the claim is made, the channel is busy and the automaton is turned on. A change of action may take place only in state number 1.

¹ This chapter was written together with Varshavskiy and Meleshina [46] (Editor's note).
² It is assumed that the approximate value of N is known; otherwise N is selected when the system is put in operation.
Figure 16 [transition graphs of the automaton states for the inputs S = S_1, S = S_2, and S = S_3]
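The register update rule of these graphs (a claim arrival raises the state number by the control sum N, capped at the memory size n; each busy cycle lowers it by one, never below state 1) might be sketched as follows; the function names are our own:

```python
# Sketch of the register (automaton state) update rule described in the text.

def on_claim(state, N, n):
    """A claim arrival raises the state number by N, capped at n."""
    return min(state + N, n)

def on_busy_cycle(state):
    """Each cycle the channel is busy lowers the state by one, floor at 1."""
    return max(state - 1, 1)

# A call shorter than N leaves an "unspent sum":
s = on_claim(1, N=4, n=10)    # state 1 -> 5
for _ in range(2):            # the call lasts 2 cycles, which is less than N = 4
    s = on_busy_cycle(s)
print(s)                      # 3: the subscriber keeps a priority margin
```

A call longer than N drives the state back to 1, so the subscriber loses the privilege, exactly as the charging scheme in the text prescribes.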
Thus, when a claim arrives, the states with numbers k (k = 1, ..., n - N) transfer to the state with number k + N, and the states with numbers n - N + 1, ..., n transfer to the state with number n. Furthermore, when the channel is busy, the state number of the automaton is lowered by one during each time cycle. State 1 is here transferred into itself. Thus, the higher the state number of the automaton after a call, the more favored the subscriber is at a given time. In order for the channels to secure the most favored subscribers, a provision is made for competition among the subscribers. For this reason two automata are used to serve each channel: the basic automaton (A) and the reserve automaton (B). Let i(A_j) and i(B_{j'}) be the numbers of actions of the basic automaton of channel K_j and of the reserve automaton of channel K_{j'}, respectively. The operation of the system is organized in the following way. The call initiated by the subscriber i_1 is processed without waiting if at the instant of call initiation there is a free channel K_j, j = 1, ..., k, with i(A_j) = i_1. In this case automaton A_j is turned on. When the call is over, the number i(A_j) remains equal to i_1, and the state number is recalculated in accordance with the graph in Fig. 16. If, at the instant of call initiation, the channel K_j,
i(A_j) = i_1, is busy, then the claim will be processed as soon as this channel is freed. If i(A_j) ≠ i_1 for all j = 1, ..., k, then the claim will be processed in the order of the queue. In this case, if there is a reserve automaton B_{j'} satisfying the condition i(B_{j'}) = i_1, then this automaton is turned on. If the state number of the reserve automaton turns out to be greater than the state number of the basic automaton, then, when the service is finished, the automata interchange their places. Finally, if for the subscriber i_1 we have i(A_j) ≠ i_1 and i(B_j) ≠ i_1 for all j = 1, ..., k, then the claim is processed by any of the free channels, and the corresponding reserve automaton is turned on. If there are several free channels, then that channel is selected for which the state number of the reserve automaton turns out to be the smallest. In those cases when the reserve automaton B_{j'} is turned on, a state with the number 1 is established, and i(B_{j'}) is taken as equal to i_1. We note that when the channels secure a certain portion of the subscribers, still another characteristic of telephone exchanges is improved: the number of switchings in the system. We shall consider that a switching occurs in a channel if through this channel two claims arrive consecutively with different numbers. Switchings of this type introduce substantial noise and strongly lower the audibility. If the probabilistic characteristics of the various inputs are constant, then the operating conditions approach the situation arising in the distribution game (see p. 64). Therefore, if the memory of these automata is sufficiently large, one may assume that the priorities for the subscribers with calls of short length will be established as if the average lengths of the calls were known beforehand. We shall give an elementary example.
Consider a queuing system consisting of a single channel and two streams of calls with equal frequencies of occurrence λ_1 = λ_2 and average lengths of calls t_1 = 12 sec and t_2 = 2 sec. For the "first-come-first-served" discipline, the average waiting time for the above parameters will be 50 sec. If a priority is given to the second stream of calls, the average waiting time is lowered to 31 sec. This very example was simulated on a digital computer for the case in which the parameters of the streams were not known beforehand. The average waiting time also turned out to be equal to 31 sec. Using computer simulation, we investigated the average waiting time for various ratios of flow densities of "short" and "long" callers, and also the amount of switching. The results of the simulation are shown in Table XVIII, where W_0 is the average waiting time for the system without priorities, W_{n,1} is the average waiting time for the system with priorities obtained directly by the channels, and W_{n,2} is the average waiting time calculated using
formulas and the a priori knowledge of the characteristics of the inputs,³ n is the memory of the automata A and B, and Π_0 and Π_n are the average numbers of switchings in the systems which are not controlled or controlled by the automata, respectively. Table XVIII shows that the construction described makes it possible to assure a quality of service which in practice approaches the quality of service of the systems in whose construction the a priori information about the subscribers was used. In practical problems, we are interested in those cases in which the parameters of the individual subscribers vary with time. We have studied experimentally (Tsetlin [155]) the cases in which this time dependence was random and was given by a Markov chain. In this case it is impossible to assign a constant priority in constructing a system.
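The effect in the elementary example, namely that giving priority to the short-call stream lowers the mean waiting time, can be reproduced qualitatively by a small simulation. The arrival rate below is hypothetical (the original rate is illegible in the scan) and is chosen only to keep the single channel stable; deterministic service lengths are assumed:

```python
import heapq, random

# Sketch: one channel, two call streams with service times 12 sec and 2 sec,
# comparing first-come-first-served with priority to the short-call stream.

def mean_wait(priority, lam=1/30.0, per_stream=10000, seed=7):
    rng = random.Random(seed)
    jobs = []
    for svc in (12.0, 2.0):                  # two independent Poisson streams
        t = 0.0
        for _ in range(per_stream):
            t += rng.expovariate(lam)
            jobs.append((t, svc))
    jobs.sort()
    clock, waited, i, queue = 0.0, 0.0, 0, []
    while i < len(jobs) or queue:
        if not queue and jobs[i][0] > clock:
            clock = jobs[i][0]               # channel idles until next arrival
        while i < len(jobs) and jobs[i][0] <= clock:
            arr, svc = jobs[i]; i += 1
            key = (svc, arr) if priority else (arr,)
            heapq.heappush(queue, (key, arr, svc))
        _, arr, svc = heapq.heappop(queue)   # serve one call to completion
        waited += clock - arr
        clock += svc
    return waited / (2 * per_stream)

w_fcfs, w_short = mean_wait(False), mean_wait(True)
print(w_fcfs, w_short)  # priority to short calls lowers the mean wait
```

The particular numbers of the text (50 sec and 31 sec) depend on the original arrival rate and service distributions, so only the direction of the effect is reproduced here.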
TABLE XIXᵃ
[The entries of Table XIX are not fully recoverable from the scan; W_0 ≈ 50.0 throughout, the controlled waiting times lie near 31-38, and the switching numbers Π lie near 0.82-0.98.]
ᵃ n = 4; N = 4; number of channels = 4. Input characteristics: λ_1 = λ_2 = 1/8, t_1 = t_2 = 2 (in state 1); λ_1 = λ_2 = 1/8, t_1 = t_2 = 12 (in state 2).
In the first group of experiments (their results are listed in Tables XIX and XX), it was assumed that the system had two states, where at each instant of time the state number was preserved with probability 1 - δ and changed with probability δ. In the second group of experiments, the average service time was varied independently for all subscribers: each of the subscribers had two states, and at each instant of time the state number was preserved with probability 1 - δ and changed with probability δ. The results of this group of experiments are shown in Tables XXI and XXII.

³ W_{n,2} was roughly calculated using the formula for the average length of a queue for a single-channel system having p priorities (see Saati [131, p. 286]).
TABLE XXᵃ

n:    1     2     3     4     8     10    12    14
W_0:  50.0  50.0  50.0  50.0  50.0  50.0  50.0  50.0
W_n:  48.4  43.1  39.5  36.2  38.4  39.5  41.3  42.3

ᵃ N = 4; δ = 1/64; number of channels = 4. Input characteristics: λ_1 = λ_2 = 1/8, t_1 = t_2 = 2 (in state 1); λ_1 = λ_2 = 1/8, t_1 = t_2 = 12 (in state 2).

TABLE XXIᵃ
[The entries of Table XXI are not recoverable from the scan.]
ᵃ n = 4; N = 4; number of channels = 4. Input characteristics: λ_1 = λ_2 = λ_3 = λ_4 = 1/8, t_1 = t_2 = t_3 = t_4 = 2 (in the first state); t_1 = t_2 = t_3 = t_4 = 12 (in the second state). At the initial moment t_1 = t_2 = 2, t_3 = t_4 = 12.

TABLE XXIIᵃ
[The entries of Table XXII are not recoverable from the scan.]
ᵃ δ = 1/64; N = 4; number of channels = 4. Input characteristics: λ_1 = λ_2 = λ_3 = λ_4 = 1/8, t_1 = t_2 = t_3 = t_4 = 2 (in the first state); t_1 = t_2 = t_3 = t_4 = 12 (in the second state). At the initial moment t_1 = t_2 = 2, t_3 = t_4 = 12.
Mathematical Modeling of the Simplest Forms of Behavior¹
This chapter will be concerned with certain problems related to the mathematical models of simple forms of behavior. The problems in which I first became interested deal primarily with certain peculiarities of motor control in man and the higher animals. Why has the study of movements led an entire group of mathematicians to pose a number of problems that, one might think, have a purely abstract character and are not particularly related to motor control? Above all, the reason is that by studying movements that can be objectively measured, one can reveal certain characteristic features that govern the mechanism of control of complex systems. Some of these characteristic features, as we shall see, are capable of being modeled. Others still leave a lot to be desired from the point of view of modeling. However, we understand very well that without gaining an understanding of the more complex features of behavior, we shall never be able to obtain any kind of exact and accurate knowledge of control processes. I shall allow myself first to describe a few simple banalities, even anecdotes, that lead us to such problems. This is the first concept I shall discuss later in more detail. Let us imagine the work of a commission that allocates living quarters. The commission will consist of the representatives of various organs of the establishment, but, generally speaking, not all of them. Each member of the commission has a list of those in need of living quarters. In addition, they know how much living space is available. Note that each member represents first of all the interests of his own branch, and, as is well known to anybody who, as for instance myself, has had to deal with this problem, one can rarely persuade people to see another point of view. I cannot seem to convince any member of the commission that someone from my section is more in need than someone from his section, simply because each member thinks that I am prejudiced as far as the needs of my people are concerned. Nevertheless, the members of the commission come to a unanimous decision, and the list can be compiled. This is of course a paradox. It is easy to imagine that one might never come to any decision by voting. This also applies to the work of a parliament and similar institutions.² Another paradox in the area of control is even simpler, but very extensive conclusions can nevertheless be drawn from it. Imagine that you come home to your apartment building and find in the hall a notice saying: "75% of the residents are required to come tomorrow to a meeting." It is not hard to imagine that if such a notice were actually hung, nobody would show up. Nevertheless, the problem of calling exactly 75% of the citizens to a meeting is not in itself uncommon. Exactly the same kind of problem arises in the area of movement generation. Suppose that we have to lift a weight, and it is necessary that 75% of the motor units participate in this undertaking. How a janitor would solve the problem is quite clear. He has a list of the tenants, so he will select the 75% and will send a note to those that have been selected. In an organism one could, generally speaking, follow this "list-type" solution: one can imagine that somewhere there is a list of motoneurons from which one selects the motoneurons needed, i.e., those motor units which are told to "work," while the remaining ones will be told not to "work." However, the procedure involved in this type of selection would be long, complex, and wasteful of nerve cells.

¹ This lecture was delivered by M. L. Tsetlin at a meeting of a section of the Physiological Society on 23 February 1965. A tape recording of the lecture remained for our use. The text was prepared for print by V. V. Ivanov, D. I. Kalinin, I. I. Pyatetskiy-Shapiro, and I. M. Epshteyn (Editor's note).
Should not one organize the process of control in such a way that it will not be necessary to make up such a list, and that nevertheless one will be able to bring 75% of the people to a meeting or to tell 75% of the motor units to work? A similar problem arises in the following situation. A store sells pork and beef, and the housewives, it would seem to me, prefer to buy beef. If it is necessary to increase the consumption of pork, then it is possible, of course, to send out agents to the people, who would explain the advantages of eating pork over beef and would tell people why it is in the interest of society to give preference to pork. One could approach each housewife individually and explain the matter to her, or one could gather all the housewives together and talk to all of them at the same time. One can also, however, do this: change the prices. This method will no longer involve listing. It is problems of this kind that I wish to discuss now. I must apologize for a certain artificiality of the title: "Simplest forms of behavior." I will not, of course, talk about more complex forms of behavior, say, the food-oriented or sexual behavior of animals or man. We shall be primarily interested in the simplest forms of collective behavior. Initially, we wanted to reduce collective behavior to the well-investigated behavior of the individual. Since later we shall consider more or less complex groups, we shall define the behavior of a single man, one member of a group (one machine, one animal; for brevity, these objects will be called automata, without going into further details about their structure), in a very simple way. We shall, however, see later that such a definition makes it possible to construct more complicated models. Thus, in order to be able to speak of behavior, we must ensure the possibility of an object's choosing one out of a set of actions. Suppose we have a set of k different actions; they will be denoted by f_1, f_2, ..., f_k. We can speak of behavior in the sense that our automaton is able at each instant of time to make a definite choice out of the set of possible actions. We shall assume for simplicity that we are interested in the behavior of the automaton only at discrete instants of time 1, 2, .... This does not impose any substantial limitations. Furthermore, we shall assume that our automaton is able to observe or receive some external signals. For simplicity's sake, we shall suppose that all external signals to the automaton can be subdivided into only two groups: favorable signals (we call them payoffs) and unfavorable signals (we call them penalties). In this sense the forms of behavior we shall talk about are simple. We shall also say that an automaton is capable of expedient behavior if it attempts to win more often than it loses. The next assumption is that our automaton (it will be denoted by some letter, say A) has a memory of definite capacity (I do not want to state here its precise definition). This can be imagined to mean that the automaton can remember a certain number of elementary facts. For example, suppose that the automaton can remember one of four facts. This means that the automaton may be in one of four states. Our automaton, I must stress, does not receive anything from the external medium except for signals signifying payoffs or penalties, and therefore its states pass into one another only in the presence of win and lose signals.*

² Concerning the reason for the "problem of the housing commission," see p. 253 (Editor's note).
The next assumption is that our automaton (it will be denoted by some letter, say A) has a memory of definite capacity (I do not want to state its precise definition here). This can be imagined to mean that the automaton can remember a certain number of elementary facts. For example, suppose that the automaton can remember one of four facts. This means that the automaton may be in one of four states. Our automaton, I must stress, does not receive anything from the external medium except for signals signifying payoffs or penalties, and therefore its states pass into one another only in the presence of win and loss signals [for precise definitions, see p. 12 (Editor's note)]. In order to speak of the expedient behavior of such an automaton, we must define some problem in which this expediency will be revealed. As
Mathematical Modeling of the Simplest Forms of Behavior
111
such a simple problem we may take the problem of behavior in a stationary random medium C(p1, p2, . . ., pn). This means that if the automaton performs action f1, it wins with probability p1 and loses with probability 1 - p1. If the automaton performs action f2, it wins with probability p2, etc. If, instead of the automaton, a man were told the values of these constants, he would act in an extremely simple fashion. He would find the largest among these probabilities, say pm, and would perform only the action fm. This type of behavior would guarantee him the highest possible payoff. We, however, do not deal with a person who knows all this beforehand, but with an automaton that, first, does not have that knowledge and, second, understands only whether it won or lost at that particular time. The automaton is not able to receive more complex information, for which it is too simple. We can describe its behavior using these constants; the automaton itself does not know them.

We shall give a simple example of an automaton. Let us imagine that the automaton can perform only two actions, first and second, and since our automaton is simple, let it have only two states, i.e., it can only remember two different facts. All the others merge together for it. We shall prescribe for it this type of behavior: If the automaton performs an action and wins, it will go on performing the same action; it has no reason to change its action. In the case where it loses, the automaton will not behave this way. If it performed some action and lost, it will no longer perform this action and will change to another one. In passing, we note that it is very simple to make such an automaton: this is how an ordinary electronic trigger operates [see Fig. 1 and all the necessary calculations on p. 15 (Editor's note)]. No cunning of any kind is involved here: one signal throws the trigger over, another leaves it alone. It turns out that such an automaton already possesses a noticeable expediency of behavior.
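This win-stay, lose-shift rule is easy to put to a numerical test. The sketch below is ours, not the book's: the medium C(0.8, 0.4) is an invented example, and the function name is arbitrary. It plays the two-state automaton against the medium and compares it with purely random choice:

```python
import random

def simulate(p, steps=100_000, seed=0):
    """Two-state win-stay, lose-shift automaton in a stationary random
    medium C(p[0], p[1]): action i wins with probability p[i].
    Returns the fraction of plays won."""
    rng = random.Random(seed)
    action = 0                   # the current action is also the state
    wins = 0
    for _ in range(steps):
        if rng.random() < p[action]:
            wins += 1            # payoff: keep the same action
        else:
            action = 1 - action  # penalty: switch to the other action
    return wins / steps

p = (0.8, 0.4)
print(simulate(p))               # empirically close to 0.7
print((p[0] + p[1]) / 2)         # roughly 0.6, the payoff of random choice
```

For C(0.8, 0.4) the two-state chain spends 3/4 of its time on the better action, so its average payoff is 0.75 x 0.8 + 0.25 x 0.4 = 0.7, against 0.6 for random choice: a "serious advantage," though still short of the optimal 0.8.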
If we compare the payoff that will be received on the average by such an automaton with the payoff that will be received by an automaton that selects its actions at random, our automaton will have a serious advantage. Now, of course, it is natural to ask: is it possible to build an automaton that would behave no worse than a person who knows the conditions of the problem beforehand, i.e., one that would behave in an optimal way? It turns out that it is possible to construct such an automaton (Fig. 17). I shall give here the simplest design, although, generally speaking, many such designs are known [see pp. 16-21 (Editor's note)]. Suppose that an automaton performs action f1 and always wins; then it
will pass from state 1 into state 2, from 2 into 3, and it will remain in state 3; i.e., the automaton moves upward in its states, continuing to perform the action f1. In case of a loss, our automaton will also behave, it seems, in a natural manner: namely, if it performs action f1, it will go on transferring to lower and lower states until it finally changes its action. The rules of transition in the states where the automaton performs action f2 do not differ at all from the transition rules for the states where action f1 is performed.
Figure 17 [transition diagram of the automaton, showing the "win" and the "loss" transitions].
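The effect of memory depth in a design of this kind can be sketched in the same spirit. In the fragment below (again with the invented medium C(0.8, 0.4); the parameter names are ours), each action owns `depth` internal states, and depth = 1 reduces to the trigger described earlier:

```python
import random

def linear_tactic(p, depth=3, steps=200_000, seed=1):
    """Automaton in the spirit of Fig. 17 with `depth` states per action:
    a win moves it deeper into the current action, a loss moves it back,
    and a loss in the shallowest state makes it change the action."""
    rng = random.Random(seed)
    action, level = 0, 1                      # level runs from 1 to depth
    wins = 0
    for _ in range(steps):
        if rng.random() < p[action]:
            wins += 1
            level = min(level + 1, depth)     # move upward, saturating at the top
        elif level > 1:
            level -= 1                        # retreat within the same action
        else:
            action, level = 1 - action, 1     # change the action
    return wins / steps

p = (0.8, 0.4)
for d in (1, 2, 5):
    print(d, round(linear_tactic(p, depth=d), 3))  # payoff grows with depth toward 0.8
```

As the number of states grows, the fraction of plays won approaches the best probability in the medium, here 0.8, which is the sense in which the automaton with large memory behaves like a person who knows the conditions beforehand.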
If the number of states is sufficiently large, then the automaton, with a sufficient degree of accuracy, will behave in exactly the same way as a person who knows the conditions of the problem beforehand. Naturally, all kinds of mathematical questions arise here: What construction of the automaton is the best? Which automaton is the most economical of all those that possess this optimal behavior? And so on. I shall not dwell on these problems here.

Thus far we have been talking about individual behavior. In the case of individual behavior it turns out, and this is important for a number of things, that the automaton's behavior is better the more states it has. Generally speaking, such an improvement with memory is not possible in all specific acts of behavior. Let us imagine the following problem. As an automaton we take, e.g., a car driver. He travels in a country in which in some cities the traffic keeps to the left side of the streets, and in others to the right side. If an automaton with a large memory capacity changes cities often, it will probably have to pay penalties for major traffic violations, since it will not manage to relearn each time in those cities having a different system. On the other hand, if it changes cities rarely and drives for a long time in any one city, then with a large memory capacity it will, first, pay fines rarely and, second, have the time to learn more exactly and choose
the correct direction of traffic with a higher probability. For such a problem, the presence of an optimal memory capacity is characteristic.

Without giving offense, one could give the following analogy taken from real life. The mentalities of persons living in the city and in the country are, as we know from observation, quite different. People who live in the country tend to think more deeply and slowly, and people who live in the city tend to think more superficially and faster. They live at a different rate; because their environment is rapidly changing, city people can afford to have a smaller memory capacity, and they achieve their desired goals by a sufficiently rapid switching of ideas. A large memory capacity that makes it possible in a stationary setting to achieve optimum results may turn out to be directly harmful. The same situation also occurs in games [see p. 30 (Editor's note)].

In the following we shall discuss how a group of such automata would behave. Above all, we would like to verify to what extent complex forms of collective behavior can be realized this way. First, I would like to give a simple example. Every time we study the behavior of a group, especially that of a group of people, we are confronted with a very important fact: people who interact with one another may agree on certain common actions, or joint tactics. Because of our habit of thinking this way about human groups, a concept has arisen that has not been verified to any known extent: the concept that the most important feature of collective behavior is the possibility of an agreement. Therefore, it is very interesting to ask whether people who cannot see one another and cannot talk to one another can nevertheless reach a point where their collective behavior will be expedient. I am not saying at all that it does not make sense to make agreements.
Agreements may strongly accelerate and improve collective behavior, but at the same time it is important to analyze whether or not an agreement is a necessary and required feature of expedient collective behavior. The reason is that such an agreement presupposes a considerable complexity in our automata, much greater than the complexity of those automata which understand only whether they won or lost in a given play.

Let us imagine the following game [it is called the Goore game; see p. 239 (Editor's note)]. We have a referee and many players, but the players do not see one another. The referee, though, can see them. The rules of the game are the following: a buzzer sounds from time to time, and each of the players is supposed to raise either one or two fingers. Notice that the players cannot see each other. They can only signal the referee with one or two fingers. The referee counts
what percentage of the players raised one finger. If this percentage is zero, then a payoff is paid with the probability shown in Fig. 18. This probability is not large, which means that it is more likely that a penalty will be paid by everybody. If the percentage is 20%, then a large payoff is paid. If everybody raised one finger (100%), then again only a small payoff is paid. How will each player behave in this situation? He hears a buzzer, raises one or two fingers, and either receives a ruble or gives one away. He does not participate in any other interaction with either the referee or the other players.

Note that the rules of the game are formulated in such a way that our automata can participate in it in a way that they understand: first, either they pay a ruble or they are paid a ruble, and second, they can choose one of two actions: raise one finger or raise two fingers. Nothing else is required of the players in such a game. The senses of smell, sight, and touch, and the abilities to understand speech and to speak, are not required. For this reason such a game can be played by automata. We can then prove the following theorem: No matter what the number of players in a Goore game may be, with sufficient memory and for the function shown in Fig. 18, exactly 20% of the players will, with probability 1, i.e., with certainty, raise one finger, and the remaining 80% will raise two.

Of course, if we dealt with people, it would not cost them anything to make the following agreement. They could decide among themselves that first one person shows one finger, then two persons, then three, then four, etc. If the players remember when the highest payoff was received, then the referee will be ruined financially, because from then on exactly 20% will raise one finger every time. For this to happen, however, the players must make an agreement, and this is no simple matter.
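A toy version of this theorem can be checked numerically. In the sketch below, which is ours and not the book's, every player is a linear-tactic automaton of the kind described above; the reward curve `f`, peaked at 20%, is an invented stand-in for Fig. 18 (the text does not give its formula), and the group size, memory depth, and number of rounds are all illustrative:

```python
import math
import random

def goore_game(n_players=20, depth=8, rounds=50_000, seed=3):
    """Toy Goore game: each player shows one finger (1) or two (0); after
    every round all players are independently rewarded with probability
    f(x), where x is the fraction showing one finger and f peaks at 0.2."""
    rng = random.Random(seed)
    def f(x):
        return 0.2 + 0.7 * math.exp(-((x - 0.2) / 0.3) ** 2)
    # each player's state: [action, level], level running from 1 to depth
    players = [[rng.randint(0, 1), 1] for _ in range(n_players)]
    tail = []
    for t in range(rounds):
        x = sum(a for a, _ in players) / n_players
        if t >= rounds - 10_000:
            tail.append(x)                    # record the late-game fractions
        reward_p = f(x)
        for s in players:
            if rng.random() < reward_p:
                s[1] = min(s[1] + 1, depth)   # reward: deepen the current action
            elif s[1] > 1:
                s[1] -= 1                     # penalty: retreat toward the boundary
            else:
                s[0], s[1] = 1 - s[0], 1      # penalty at the boundary: switch
    return sum(tail) / len(tail)

print(goore_game())   # the late-game fraction settles in the vicinity of 0.2
```

Each player only ever sees its own rubles, yet the group drifts toward the maximum of the curve; the exact-20% statement of the theorem holds in the limit of sufficiently large memory, while a finite sketch like this one merely hovers near it.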
It must be noted that our automata do not have any kind of built-in altruism, and to agree on something is, for such automata, just as for people, incidentally, a very dangerous matter. It is dangerous for the following reason: there is no basis for thinking that the person you make an agreement with is not your enemy. Moreover, in game theory one
Figure 18 [the payoff probability as a function of the percentage of players raising one finger, with its maximum at 20%].
emphasizes and finds most interesting those situations in which it is impossible for the players to communicate. For example, suppose I play the game of 21 with a friend. On what can I agree with him? Every ruble won by me is won at his expense. The only agreement we can enter into here is "let us stop playing." This type of conflict situation is very common. Here, however, it turns out that without any agreement, i.e., without risking anything and without making any commitments, such a group of simple automata achieves the percentage that results in the maximum payoff.

The recruitment of the motor units of, say, a working muscle can probably be reduced to this type of problem. In this case the individual automaton would be replaced by a motoneuron, or a motoneuron with a group of interneurons. The problem is to obtain the required number of working motor units. Here it is not important what the payoff consists of. Imagine that on the axis of abscissas in Fig. 18 we measure off not the fractions from 0 to 1 but, for example, a pull of 4 kg, or of 500 gm. Then the problem becomes exactly the same as the previous one, and the motoneurons "agree" to pull with a given force. The only thing that is changed here is the scale. This problem is also directly related to that of gathering the required number of people for a meeting, or to the problem of lifting a weight.

Consider still another problem which has exactly the same game character, and which was discussed simultaneously with the Goore game: the distribution problem [see p. 64 (Editor's note)]. Let us imagine the following game. We take certain numbers, e.g., 0.9, 0.33, 0.33, 0.33, 0.33, 0.33, 0.33. Let us imagine that mice or some other animals of the kind can receive food from various troughs. The foregoing numbers will represent the amount of food in each trough. Thus, there is one well-supplied trough with 0.9 amount of food and six poorly supplied troughs with 0.33 amount of food each.
Suppose we have five animals that cannot communicate [they probably can, but we cannot assume that; for our purposes, they manage without]. The game will proceed as follows. A metronome is struck, and at this signal each of the animals chooses one digit, 1, 2, 3, 4, 5, 6, 7, i.e., creeps to a certain trough. If an animal comes to a given trough by itself, it will eat all the food in the trough. If two animals come to the same trough, they divide the food in half. In the case of three animals, they divide the food into three parts, etc.

Let us imagine what we would do in such a situation. We would decide that we should distribute ourselves in the following way: two of us would go to the best-supplied trough, and each of the remaining ones would go separately to one of the remaining troughs. Then either of the two at the well-supplied trough
receives 0.45, and the others get 0.33 each. It would seem that this is unjust, but what else is there to do? If either one of those at the rich trough goes to an already occupied poor trough, he will receive 0.16 instead of 0.45. If one of those at the poor troughs goes to the rich trough, he will obtain 0.3 instead of 0.33, thus again less. Consequently, no one has an opportunity to improve his lot. In order to arrive at our solution, however, we would have had to inspect these numbers, i.e., we would have had to know them beforehand.

We shall not assume that the automata know anything beforehand; we cannot tell them anything, since they are not able to perceive numbers. Every time the metronome is struck, the automata utter one of the digits 1, 2, . . ., 7, or imitate a movement to some trough, and learn whether they won a ruble or lost it; i.e., the automata are in exactly the same conditions under which they operated when playing the Goore game. Moreover, the automata do not know what game they are playing; it could be any kind of game, and perhaps they are playing the Goore game. It turns out that automata with sufficient memory capacity, not necessarily very extensive, say 4-5, distribute themselves in a way that is just as reasonable as people who know the contents of each trough would. What is also interesting is that they do it more rationally than people would in the same situation. A person would probably behave in the following way and say: "If I came here first, then even though I get 0.45 and the other players only 0.33, I will not allow myself to be dragged out of here, no matter what." With automata, by virtue of their probabilistic structure, it turns out that on the average they all get the same amount; i.e., each automaton finds itself at the poor trough for part of the time, and for a greater part of the time at the rich trough. In turn they pass through all the states and are under the same conditions.
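The "no one has an opportunity to improve his lot" argument is precisely the statement that this configuration is a Nash point, and for these seven numbers it can be verified mechanically. A minimal sketch (the list layout and helper names are ours, not the text's):

```python
# Trough contents from the text: one rich trough and six poor ones.
troughs = [0.9, 0.33, 0.33, 0.33, 0.33, 0.33, 0.33]

# The configuration discussed: players 0 and 1 share trough 0,
# players 2-4 sit alone at troughs 1-3.
choice = [0, 0, 1, 2, 3]

def payoff(player, choice):
    """Food received by `player`: the chosen trough's contents divided by
    the number of animals that chose the same trough."""
    t = choice[player]
    return troughs[t] / choice.count(t)

# At the configuration itself: 0.45 for the two sharers, 0.33 for the rest.
print([round(payoff(i, choice), 3) for i in range(5)])
# prints [0.45, 0.45, 0.33, 0.33, 0.33]

# Nash check: no single player gains by moving to any other trough.
for i in range(5):
    current = payoff(i, choice)
    for t in range(len(troughs)):
        deviated = choice[:i] + [t] + choice[i + 1:]
        assert payoff(i, deviated) <= current + 1e-9
print("no unilateral deviation pays")
```

The two deviations quoted in the text appear among the cases checked: a rich-trough sharer moving to an occupied poor trough drops to 0.33/2, and a poor-trough occupant joining the rich one drops to 0.9/3 = 0.3.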
Of course, in this case it is assumed that the automata have an exactly identical structure. Now let us calculate their payoffs in this case. The total payoff is

0.9 + 3 x 0.33 = 1.89.

The payoff to each one of them is 1.89/5 = 0.378. Thus it turns out that, if the game is played by automata, they will each receive exactly the same payoff. Here the following fact is of interest: we have assumed that the participants did not promise one another anything or make any commitments. At the same time they behave in the optimal way. They distribute themselves in such a way that it will not be advantageous for anybody to change his actions. A stable configuration is formed, which will be referred to as the Nash point. But this is not the optimal pattern of behavior. In our case, each gets a payoff of 0.378.

Now assume that the participants are people who are unflinchingly convinced that their opponents are honest. Then their configuration will be different. They will say: we shall distribute ourselves in such a way that there will be one person to each trough. This may turn out to be very unjust. This is true: one will get 0.9 and the rest 0.33 each. One of them might say: "What good does that do me? I had better go to the rich trough, where I will get 0.45 and not just 0.33." However, they agreed earlier to play honestly; thus both those who receive a lot and those who receive little put their money into a common fund and then divide it equally [see the definition of a game with a common fund on p. 61 (Editor's note)]. Let us calculate what the payoff per person will be in this case:

0.33 x 4 + 0.9 = 2.22;   2.22/5 = 0.444.

As we have seen, this type of agreement leads to a situation where they receive higher payoffs. We point out once again that this kind of agreement is not without its risks. They must count on one another's honesty. Otherwise, if a person at the rich trough stops putting money into the common fund, they will not even receive 0.378 each (the payoff guaranteed in the absence of a common fund), but only 0.33 each.

What are the answers given by simulating the behavior of finite automata on a digital computer, or by analytic calculations? If the memory were large, the automata would receive only 0.44, but in this case they would be required to make the following agreement: that the conditions of the problem be changed so that all their payoffs are deposited in a common fund and then divided equally. However, this kind of joint action, accomplished by means of a common fund, is something that is fully justified in physiology, for example.
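The two per-capita figures being compared, 0.378 without a common fund and 0.444 with one, follow directly from the arithmetic above; as a sketch:

```python
# Nash configuration: two animals share the 0.9 trough, three sit alone at 0.33.
nash_total = 0.9 + 3 * 0.33      # total food eaten per play, about 1.89
nash_each = nash_total / 5       # about 0.378, with no agreement needed

# Honest configuration: one animal per trough, all payoffs pooled and split.
fund_total = 0.9 + 4 * 0.33      # about 2.22
fund_each = fund_total / 5       # about 0.444

print(round(nash_each, 3), round(fund_each, 3))
# prints 0.378 0.444: the fund pays more, but only while everyone stays honest
```

The defection scenario in the text is visible in the same numbers: a rich-trough occupant who keeps his 0.9 leaves the other four with 0.33 each, below even the 0.378 that the Nash point guarantees without any agreement.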
For this reason it is natural to presuppose this kind of homogeneity of the payoff for physiological applications. What is now the situation with the automata? If their memory is large, they receive 0.44; if their memory is small, they receive less. Depending on the memory, the payoffs can be plotted on a graph [see Fig. 11 on p. 69 (Editor's note)]. For sufficiently large memory without a common fund, the automata receive 0.378. With small memory capacities, the use of a common fund lowers the total payoff, and for large memory capacities it increases the payoff. For some intermediate value of the memory, the common fund yields neither a gain nor a loss.

A banal sociological analogy occurs at this point: how is it that the common fund can be disadvantageous? The reason is this: imagine that two automata found themselves at a poor trough. What
they did was obviously silly, and if it were not for the fact that they received payoffs from the common fund, their personal payoff would drop from 0.33 to 0.16, which would be quite noticeable and would make them change their actions. If, however, they receive payoffs through the common fund, the consequences of such unreasonable acts become much less acute for them. They receive compensation at the expense of their more successful partners. This is exactly what is commonly called the damage of equalization. If the memory is insufficient, so that relatively small decreases in payoff are not noticed with such a memory, then the common fund is simply damaging: it obliterates the worth of individual acts. If, however, the memory capacity is sufficiently large, making it possible to notice even very small changes in the total payoff, then the common fund brings a considerable improvement.

This, incidentally, shows why one should not organize overly large collectives functioning with a common fund and receiving identical payoffs, let us say the same amount of food. It is a well-known fact that production quotas are set up within a single brigade, because within one brigade there is enough memory capacity, or awareness if you will, to understand that low productivity lowers not only the wages of others, but also those of the person involved. Production quotas are never set up for a whole factory, since at that level one needs a much higher degree of awareness to feel any possible losses. Thus, as a factory worker, if I smoke a cigarette for half an hour, I will not lose a single kopeck from my wages. On the factory level this is not noticeable, but if I were fulfilling individual production quotas, this act could cost me a fortune.

In connection with this problem, I would like to mention still another characteristic of control systems, in particular, systems related to motor control. I have in mind the idea of compensation.
There are various forms of reliability, which can have different definitions in technology and in physiology. Specifically, an undesirable property of all our technical products is their nonuniform reliability. I refer to unreasonable practices such as the fact that a shirt is thrown away when its collar is worn out, even though its remaining portions are in good condition. Machines, even large and heavy ones, are replaced even when the wear and tear is insignificant, when literally perhaps only a couple of grams of metal are missing. If, however, the foundation is beyond repair, it cannot be replaced. Wise people who were making shirts 30 or 40 years ago sold extra collars with the shirts. The collars were attached to the shirts with clips, and thus they could be replaced. Incidentally, as far as technical products are concerned, the moving parts are made replaceable as much as possible. It would, of course, be much more convenient if shirts wore out uniformly. This is, however, not the case.
If shirts had extra material, and enough of it, then a single shirt could probably be worn ten times longer than it is worn today. This, of course, applies to an equal extent to shoes: shoes are thrown away when they are generally still quite new. It would be more convenient if such a compensation were accepted as a matter of course. Using the example of the distribution game, it is easy to see how this is done [see p. 69 (Editor's note)]. I would like to use these numbers to explain why such a system of playing automata behaves in exactly the same way as a shirt which is being shortened to make a new collar out of the extra material rather than being thrown away. In every case, the automata transfer to the procedure involving a common fund (Fig. 19). Now let us imagine that these automata are mortal; each automaton works for some time, then dies and is excluded from the game. Irrespective of the number of automata remaining, even with just one, the following configuration will develop each time, due to their optimality: the first n places will be occupied, where n is the number of automata still in the game, and the remaining places will remain free.
Figure 19 [abscissa: number of plays executed].
The same type of reliability picture occurs also in the Goore game, which was described before this example. There the payoff depends only on the fraction, and the failures of some of the participants obviously lower the payoff but do not disrupt the automata from their optimal mode. This reliability of behavior seems to be typical of automaton groups. I am not attempting to formulate here any general assertions. The only thing I have in mind is the fact that, since the automata have equal rights, and their properties are universal to the extent that they can play this game, the Goore game, and dozens of other games that I could quote here, these automata are identical and their properties are indistinguishable. Therefore, the death of any of these automata, or a failure to operate, is immediately compensated for, and since they are capable of optimal behavior, the optimal distribution of the automata over the available actions is maintained to the last automaton; as the saying goes, they "fight to the last soldier."

Generally speaking, game theory was proposed by the brilliant American mathematician J. von Neumann to describe the behavior of people, not of automata. It was found that automata, even the simplest ones, behave no worse than humans in a number of cases. It should be noted that the homogeneous games described above were not analyzed in von Neumann's game theory. Homogeneous games seemed interesting to me, especially in application to automata.

Suppose we have a universal computer. Then we are required to set up a timetable listing the tasks to be done by the various shifts. We must see to it that electric power is available, that the cleaning woman will wash the floor, that spare parts will be supplied on time, that maintenance will be done systematically, etc. As computer managers, our job is to take care of the computer, not to solve problems. Furthermore, we sell the computer time, for money of course. A mathematician comes and solves his problem. Another mathematician comes and solves his problem; these problems are completely different. A third mathematician comes and solves his problem, and so on. A universal computer is exceptional in the sense that it can be used to solve a great variety of problems.

When speaking about physiology, I will naturally often be naive, but I hardly feel the need to apologize for that, because many physiologists still perpetuate these naive notions themselves. What is the weight of a human brain? The answer is about 2 kg, so we shall assume that there are approximately 2 kg of brain matter. [Incidentally, to me it is not important whether it is 2 kg or 1 kg.] The question is: there is a certain amount of brain substance; how can it be distributed?
The most absurd thing one could do would be to proceed something like this: here is the right hand; 500 gm of brain substance will be allotted to it (the right hand being very important); here is the left hand: give it 300 gm. For each leg, 150 gm will be enough. The tongue will take something too. How much shall we give to the eyes, the ears, food preparation, emotions, social feelings, remembering that 15 gm must still be left for playing chess? We finally come to a point where there is no more brain substance to give away. It would seem that one might act like this, but it is hardly feasible that this is what has actually been done. The reason could be that when playing chess, I am not at that moment thinking about my professional work or jumping a
rope [which I can only do with some difficulty]. This means, it would seem, that when I am playing chess it would be more advantageous to devote to it not 15 gm but, let us say, 1½ kg out of the 2 kg of brain substance available. Tomorrow it might happen that I become involved in a fight, and during that fight, when I need to move as fast as possible and make fast motor decisions, it would be natural to take the gray matter away from the eyes, the ears, chess playing, cooking, and emotions, and give it over completely to the fight. Instinctively we feel that this is indeed the case: when completely involved in a fight, you forget about everything else. One can ask: couldn't the control centers be redistributed according to the tasks? For all the naiveté of such a viewpoint, and at the risk of leaving myself open to ridicule, I do think that this is actually the way the brain works. There are certain computing means, which I conventionally measure in grams [I think that it is more natural to use grams], and those means can be redistributed. In order for brain substance to be redistributable, it must have certain universal characteristics, so that when it is brought to bear upon one problem after another it will still be able to deal with each new problem. Our automata, by virtue of the universality of their properties as postulated and developed (though not based on physiology), do possess that characteristic. I can make the same automaton play the Goore game, the distribution game, a zero-sum game according to von Neumann, and virtually anything else I please. The universality of the automaton is sufficient for that purpose.

Having this kind of automata available, it would obviously be interesting to see whether it would be possible to construct a game in which there would be some choice of problems, so that the automata would choose the most important problems and solve those.
And if the priority of the problems were to change, then they would drop the problem that had become unimportant, and would begin to solve the problem of highest priority at the given moment. Together with S. L. Ginzburg, in 1964 we carried out certain numerical experiments of this kind on a computer [see p. 84 (Editor's note)]. Such games are easy to invent, easy to interpret, and, what is most important, consume very little machine time. The problem is very simple for automata. However, I am far from being convinced of so absolute a universality of brain matter. I think that a very large fraction of these 2 kg is irreversibly specialized to perform various functions, for example, vegetative functions. But in any case such a universal functioning of the basic brain substance must probably take place, since otherwise it would be impossible to understand
the occurrence of compensation. One can refer to the well-known experiments by Laszlo, which involved removal of the frontal brain lobes, or to other similar experiments, but it seems to me that this is unnecessary. Numerous examples of functional compensation indicate that brain matter does indeed possess such a universal capacity.

Finally, as mentioned at the beginning of the chapter, I would like briefly to clarify the difficulty concerning the distribution of living space, and what relation it has to automata. Suppose we have a number of apartments, the number being rather small, and certainly much smaller than the number of those in need. [If the number of the needy were not very great, then there would be no problem and no need to set up a commission, since the commission would not have anything to do.] All the apartments will be assumed identical, each consisting of two rooms. [Should the apartments be different, there would simply be several different problems: how to distribute the two-room apartments, how to distribute the single-room apartments, etc.] There are N persons needing apartments, and there are m members of the commission, where m is not very large.

Now let us imagine how the commission actually operates. Each member takes a list of the applicants and tries to ascertain which one of them is most in need, who is second most in need, etc.; i.e., each member puts the applicants in a certain order. Thus, for example, the first member will write something like this (persons will be denoted by letters for convenience):

α1, α2, α3, α4, . . ., αN,
and at some place he will make a mark, say after α5, meaning that there are no more apartments. Note that while making up the list, he will very carefully choose those that fall to the left of the mark. The second commission member will do the same thing, and so will the third, etc. These opinions are then announced to everybody. For example, one can suppose that they will be written out on the blackboard. One cannot decide anything by voting, the reason being that the number of apartments is smaller than the number of applicants, and the lists will usually not be the same. But if I made up a list and one of the people I was representing was not on the final list, then I would not approve such a decision; I would approve only my own list. Thus, if I were not convinced that all of my people would appear on the final list, I would not approve it. And if I were so convinced, then there was no need to elect me to the commission in the first place.

Let us look at what happens. The blackboard carries the opinions of all the commission members. If in the majority of cases these opinions are the same, then the matter can be decided by voting. Now, is that likely to happen?
Mathematical Modeling of the Simplest Forms of Behavior
No, this is completely unlikely, since there are N! various opinions, where N is the number of applicants, and the probability of a coincidence of opinions is very small. Therefore, the first thing that the commission members are going to see is that it is impossible to come to a common opinion. Incidentally, in any reasonably constituted commission, decisions are not arrived at by voting: they begin to vote only when there is a conviction that a decision is unanimous. It is clear why this is done that way. If I were to stick to my own opinion, and the other members could not convince me that the decision was right, then the commission’s work would have been in vain. I would then go to the local committee, and the proceedings would start again. As a rule, any decision of the commission should be approved by the local committee, and if one of the commission members demurs at the decision, then the local committee would not be able to approve anything. Instead it would send the commission members back until a unanimous decision was reached. Thus, voting cannot be used to make the decision. Perhaps this is an appropriate time to say when voting can be used to make decisions. If we were to choose a chairman from among three possible candidates, then this is a problem that could be resolved by a vote, since the number of possible views is here much smaller than the number of those who vote. It is only in such cases that decisions can be arrived at by a vote. In our case, the decision cannot be made this way. Thus, the members of the commission must come to some reasonable compromise by making agreements among themselves without voting. How can they resolve their disagreements? First, no one is denied the right to change his mind. Second, and we always think about this seriously, we can try to persuade each other [although I think this is not very likely]. In fact, any reasonable person can be persuaded somewhat, as far as most questions are concerned.
But in considering the housing problem, this is a completely hopeless task. The reason is very simple: each member of the commission represents some department, supports its interests, and will defend its interests no matter what, and this is indeed the case. How can we convince one member except by saying: “Look, you are defending this person while everybody else is against him; you will not be able to succeed, and you can’t help him.” Here the point is not that the member is shouted down, but that he begins to change his mind. It turns out that this problem may be formulated in terms of automaton games, and from many examples we see that automata come to the decision that would also be arrived at by humans. This example was given because I wanted to show that even fairly complex forms of behavior (the behavior of the commission members is a complex form of making collective decisions) can be simulated by means of automata.
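The kind of simulation meant here can be sketched in miniature. In the hypothetical example below (the penalty probabilities, memory depth, and class name are invented for illustration, not taken from the text), a “commission” of five two-action automata with linear tactics is penalized with high probability whenever its members disagree, and with low probability when they are unanimous; left to themselves, the automata settle on a common decision.

```python
import random

class LinearTacticsAutomaton:
    """Two-action automaton with linear tactics and a finite memory depth."""

    def __init__(self, depth, rng):
        self.depth = depth
        # signed state: the sign is the chosen action, the magnitude is
        # the degree of commitment to it
        self.state = rng.choice([-1, 1])

    def action(self):
        return 0 if self.state < 0 else 1

    def update(self, penalty):
        s, d = self.state, self.depth
        if penalty:
            # retreat one state; switch actions when commitment runs out
            self.state = s + 1 if s < 0 else s - 1
            if self.state == 0:
                self.state = 1 if s == -1 else -1
        else:
            # reward: deepen commitment, up to the memory depth
            self.state = max(s - 1, -d) if s < 0 else min(s + 1, d)

rng = random.Random(1)
team = [LinearTacticsAutomaton(depth=6, rng=rng) for _ in range(5)]
unanimous_rounds = 0
for t in range(20000):
    actions = [a.action() for a in team]
    unanimous = len(set(actions)) == 1
    # collective payoff: mild penalty when unanimous, near-certain otherwise
    p_penalty = 0.05 if unanimous else 0.9
    for a in team:
        a.update(penalty=rng.random() < p_penalty)
    if t >= 18000:
        unanimous_rounds += unanimous

agreement = unanimous_rounds / 2000
```

With the seed fixed, the team spends the great majority of the final rounds in unanimous agreement, although which of the two decisions wins out is itself a matter of chance.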
Appendix 1 Addressless (Nonindividualized) Control¹ᵃ
If one assumes that all control proceeds from the top down to a specific address, then the system becomes very complex. For example, motor control involving addresses means that one has to write a schedule of the actions of all muscles, and to inform each receptor where it is supposed to direct its pulse. On the other hand, this can be done differently. We shall show this using an example of an automaton game. If the conditions of a game are given, then the automata find the required actions by themselves. In this case they do not need individual commands. We think that the higher levels of the nervous system (e.g., the supraspinal level relative to the spinal level) need only a general type of information about the states of the lower levels. In other words, only very general instructions are needed. Using only these most general instructions, the automata can then select the optimal level by themselves. On the basis of numerical analysis, the military have often made the following assertion: the army is urgently in need of computerization, since the infantry no longer walks. They ride on personnel carriers, and can cover 100 or 150 km in the time in which 30 km were traveled before. Now, in order to write a marching order for an infantry division, two days are needed. However, this does not mean at all that computers are needed, because the bureaucracy that expressed this point of view had in mind that the orders should specify exactly what must be done by all subunits, all special units, all staffs, all camps, all magazines, all cooking personnel, the medical-sanitary battalions, each person in fact. But the orders are not written that way. A division commander makes only one decision: From point A, move to point B.
Then he says to the staff officer: “Now write the orders.” The staff officer writes the orders for the commanders of infantry regiments: “You have orders to reach that point,” and says to the rear staff officer: “Look, we are leaving for town B, so you have to move your troops.” Thus, there is no one person who could give this kind of all-encompassing order. In my opinion the nervous system is in the same situation. Here it is important that all subordinate nerve centers be able to understand the language that we use. From the point of view of control, it is precisely here that address-free control has an advantage, i.e., it is not necessary to tell each one what to do. All that one has to do is to hang out a notice saying
¹ᵃ The text in the appendix is made up of the answers to the questions posed after the lecture (Editor’s note).
“to whom it may concern.” [I found with astonishment that, for example, the work of prisoners is more expensive than that of free men, even though the former are much worse fed and clad, and they work no less. The point is not only that the efficiency of prisoners is lower, but that a prisoner must be fed, clad, and watched by someone else. With a free person the matter is different: e.g., “I get paid twice a month, I give the money to my wife, and then my manager knows that I am not hungry, that I have my shoes in order, that I get my meals on time, that I will not leave, etc. He doesn’t have to think when to change my shoes or linen or what to do with my children, and so forth.”] Automata are, strictly speaking, good to use for simulation, because their properties are universal. If a payoff system is given (in terms of wins and losses), then the automata can already act on their own, attempting to be punished as little as possible.

Appendix 2 Languages That Automata Use to Communicate with One Another
We have discussed very simple forms of behavior, and for this reason we limited ourselves to the simplest types of automata. The exchange of information among these automata takes place in the language of penalties and rewards. Although this language seems universal enough, it would, however, be interesting to also look at more complicated automata that possess some specialized language to communicate with other automata. Such automata are needed to describe more complex forms of behavior. These more complex behavioral forms necessitate the use of much more diverse information. Thus, for example, a player participating in a sports game, in addition to a general estimate of the situation (whether he acts correctly or not, he probably is not always aware), sees a lot with his eyes and hears with his ears, i.e., he receives many different kinds of information. When we deal with, say, a card game [and card games can be very complex, so complex that it is simply inconceivable to me how people can learn to play them], then we know beforehand exactly what information a card player has at his disposal. He can see his pack, he saw the cards that were put away before a given moment, and one can always say what he knows and what he does not know. In the case of an athlete, however, I think it is impossible to describe exactly what information he is using in his actions.
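The “language of penalties and rewards” mentioned above can be made concrete with a minimal sketch (the penalty probabilities 0.8 and 0.2 and the memory depth are invented for illustration): a single automaton with linear tactics, receiving nothing but a penalty/no-penalty signal, comes to prefer the action that is penalized less often.

```python
import random

rng = random.Random(0)
penalty_prob = [0.8, 0.2]   # stationary random medium: action 1 is penalized less
depth = 5
state = 1                   # signed state: sign = action, magnitude = commitment

chose_better = 0
for t in range(20000):
    action = 0 if state < 0 else 1
    chose_better += (action == 1)
    if rng.random() < penalty_prob[action]:
        # penalty: retreat one state; switch actions when commitment is exhausted
        state += 1 if state < 0 else -1
        if state == 0:
            state = 1 if action == 0 else -1
    else:
        # reward: deepen commitment, up to the memory depth
        state = max(state - 1, -depth) if state < 0 else min(state + 1, depth)

share = chose_better / 20000
```

This is the expedient behavior of the simplest automata with linear tactics described in Part I; the automaton spends almost all of its time on the less-penalized action without ever being told which action that is.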
For this reason, I think that sports games are some of the most complex forms of behavior that can be described at all. I think that the game played by a soccer player is much more complex than the game of the participants in, for example, an international conference. Automata are necessary to describe more complex forms of behavior, and this is the reason why: Imagine that I am playing chess with somebody, let us say with V. B. Malkin [it so happens that we did play chess together]. Anybody who has played chess at least once can understand what kind of moves can be expected from a given opponent. The player has some internal representation of how his opponent thinks. In my case, my opponent also thinks of me. This means that when I think about my opponent I should simultaneously include in my model of the opponent the model of myself that the opponent has. But that model of myself in turn contains a model of the opponent, and thus one obtains matreshkas¹⁵ contained inside one another which are impossible to untangle in any reasonable way. This is a paradox which is natural in mathematical logic. Automata are to a large extent free of that paradox. They cannot have any model of the opponent. In the case of more complex forms of behavior, models with richer languages that can be used for communication are needed. It is, however, risky to introduce a special language for communication, since it will then be necessary for us to consider in this language the languages used by the opponent, and thus matreshkas come into play again. Nevertheless, they do not appear, because the model of a chess opponent is generally replaced by chess theory, which is applied not specifically to a given opponent, but to any person proficient in chess.
¹⁵ Wooden dolls of successively smaller sizes contained in one another (Translator’s note).
ARTICLES ON BIOLOGICAL SYSTEMS AND MATHEMATICAL MODELS IN BIOLOGY
Introduction¹
The research on the problems related to physiology and other “complex” systems that M. L. Tsetlin and I have been pursuing may be divided into three periods:
1. The first period was devoted to an analysis of the properties of continuous media. Our interest in those questions arose in connection with the problems of the heart, and continuous media appeared as a proper method of description. This period saw a development of the axiomatics of continuous media and a paper about Wenckebach cycles. At that time we formulated the principle of least interaction, using the example of spontaneously active elements. The principle was later of great heuristic value to us.
2. The second period was characterized by an enthusiasm for the ideology that resulted from the ravine method. We were interested in ways of overcoming the complexity of the problem, using a hypothesis about the organization of the world. The questions and ideas that originated as a result turned out to be very useful later.
3. Finally, after the appearance of M. L. Tsetlin’s papers on automata, an attempt was made to apply the principle of nonindividualized control to various biological problems. In the application to the problems of spinal motor control, the work resulted in specific predictions. The situation was similar so far as the problem of pretuning was concerned.
The present stage is only in its initial state. This is the stage in which postulates (“precepts”) are formulated that are typical of a living system. Unfortunately, this problem was not covered in the papers written. It seemed to us that this gradual change of our viewpoint from stage to
¹ This introduction was written by I. Gel’fand.
stage should have resulted over a period of years in the construction of some “language” (a set of “postulates”) in which one could speak of living systems. Unfortunately, this stage of our joint work has been interrupted without ever having seriously begun.² It can be stated that the papers that were completed and are published in this collection are in a sense only a prelude to the basic theme, which is to understand the principles underlying living systems and making them so much different from inanimate systems. Perhaps it is precisely the difficulty of this problem that was the cause of so many different attempts. The only common trait in these attempts was the anxiety not to let this “feeling for life” slide through our fingers. We have nevertheless hoped that each model, which has inevitably been more or less formalized,³ would leave at least a tiny part which would make it possible to come a little closer to the understanding of “life.” We always remembered the words of the great physicist N. Bohr, who said that, in his opinion, the leading branch of knowledge will not be physics, but biology. Now a final remark. What should be the degree of formalization in biology in the study of living systems? Considering quantum mechanics, one can distinguish two stages in its formation. The first stage took place when Bohr created the philosophy of quantum mechanics. At that time, the formulas did not yet exist, and even if they did, they were not quite as they should be or were completely wrong. The second stage was a period of rapid growth, and quantum mechanics became an exact branch of physics with a large number of precise formulas. But this stage was possible only after the first stage had taken place. By comparison, in biology the first stage has not yet occurred.
² Rather, it was the undercurrent all that time, and all the preceding papers have dealt with it. Maybe the problem “Ah” is proof that the undercurrent was always there.
³ Unfortunately, often less formalized than we would wish.
Mathematical Simulation of the Principles of the Functioning of the Central Nervous System¹
This chapter is not a survey of the methods of mathematical modeling of the mechanisms underlying the central nervous system; it is devoted to some mathematical models related to the physiology of the central nervous system. We shall limit ourselves to an exposition of the basic ideas. Readers who are interested in the details of the mathematical apparatus and the details of physiological applications will be able to find additional information in the bibliography. We present three models. The first is concerned with the modeling of expedient behavior based on the methods of searching for the extremum of a nonstationary function of many variables. The second model is devoted to simulating the behavior of collectives consisting of automata capable of expedient behavior. Both of these models were generated by studying behavior, which we understand as a single process of studying the environment and making and carrying out a decision. The question of the specific physiological mechanisms performing the behavior will not be discussed here. The third model is an attempt to describe mathematically the functioning of the simplest excitable tissues. These models are quite diverse, but we would like to see in them examples realizing a certain general principle. The last part of the chapter is an attempt to formulate such a principle (the principle of least interaction). The problems that result in the necessity of studying complex control systems are extremely diverse and are generated by various branches of contemporary science and technology. The peculiarity of these systems forces us to rethink the very word “learning.”
¹ Jointly with Gel’fand [62a]. The text is slightly changed: it includes the material from a previous work [62] (Editor’s note).

The problem is that a completely isomorphic description, making it possible to take into consideration all the characteristic features of a phenomenon, is inadequate for complex systems, precisely by virtue of their complexity. Numerous examples are known of the inadequacy of descriptions of that type. Thus, using a system of differential equations to describe the motion of gas particles and their initial coordinates and velocities does not add anything substantial to our knowledge of the macroscopic properties of a gas. For complex systems, it is typical to find that the method of description is dependent on the problem that is to be solved by means of that description. It will be noted that for complex systems used in solving a problem, it makes sense to introduce the notion of the quality of the solution, i.e., the degree to which it is adequate. We shall consider this type of situation, using as an example a problem involving the calculation of a minimum of a multivariable function. This problem will naturally involve such notions as: the complexity of a problem, organization, search, tactics, hypothesis, and others. The most striking examples of complex control systems occur when one studies the behavior of a “little animal in the big world.” Here especially one can see the inadvisability and even the practical impossibility of an isomorphic description. This perhaps explains why physiological statements often bear a distinctly model-like character. Among many aspects of behavior, we shall touch upon only one, and this is the problem of the generation of motion. This problem, important in itself in physiology, attracts us by its “physicality”: many motion parameters can be measured and described quantitatively. This also implies the necessity of organization and the use of tactics that, in view of the continuity of the process, refer both to the analysis of afferentation and to the generation of the motions themselves.
1 On Search Tactics

We shall consider the functioning of a complex control system which is designed to reach a certain definite objective. It is assumed that the system is capable of determining how close it is to reaching its objective. The information necessary for the successful functioning of the system is received by the system in the activity directed toward the objective.² The complexity of the system is defined by the number of parameters necessary to specify its state. Problems of this type are encountered when studying control processes governing numerous physiological mechanisms and complex technological systems. Examples of such systems are the movements of animals and of humans. Numerous problems in the area of complex systems would seem to have been quite completely analyzed by means of classical mathematics and functional analysis. However, it often happens that the algorithms proposed there, which permit a formal solution of a problem, turn out to be practically inapplicable. Thus, for example, to obtain an extremum of a multivariable function, classical mathematics proposes the following technique: differentiate the function in turn with respect to each of its arguments and set the derivatives obtained to zero. Thus, the problem reduces to solving a system of equations whose number is equal to the number of arguments. In practical numerical problems it turns out, however, that a solution of such a system is by no means simpler than to obtain an extremum directly. For a larger number of arguments both problems are in the general case extremely involved, and their solution is beyond the capabilities of today’s numerical analysis. Often the direct use of other algorithmic methods is also practically impossible.³ One can state other problems for which, even if it is possible to construct an algorithm suitable for all cases, its realization turns out to be impossible in view of the limitations inherent in contemporary computer technology and the restrictions on the time during which a problem should be solved. The restrictions on the time spent on solving a problem are particularly important. The difficulty is that practical problems (for example, those occurring in physiology) typically involve situations that vary with time, so that a delayed solution may turn out to be outright erroneous. In this sense even a relatively rough approximate solution that is obtained rapidly may be preferable to a more exact, but delayed, solution.⁴ In such situations, the acceptable solution can be achieved only by means of the organization which is to a larger or lesser degree possessed by problems occurring in practical human activity or perhaps in physiology. It is a very complex task to try to give a more complete definition of the notion of organization. In essence, the term “organization” is understood to mean those characteristic features of a problem or a situation which may facilitate obtaining a solution. These characteristic features are not known beforehand exactly, but are only more or less probable. Therefore, we make use of organization by advancing certain hypotheses and constructing tactics based on these hypotheses. Usually it is impossible to verify these hypotheses directly. They are, one might say, tested in practice: the criterion used to judge the correctness of a hypothesis is provided by the “goodness” of a solution. Let us consider a simple example. The solution, using difference methods, of a fairly complex system of partial differential equations (for instance, of those arising in hydrodynamics) is based on an unverified hypothesis; after all, no one attempts to prove that a solution of a difference system for a step δ chosen by the problem-solver differs from the exact solution by less than a number ε specified by the problem-solver. The use of hypotheses implies a deliberate refusal to consider all possible situations, such as those random cases that are the most probable in the formally mathematical sense. In this section we attempt to construct a tactic for finding a minimum of a multivariable function. The tactic will take advantage of the proposed organization of the problem.
² In this sense the systems studied here are systems with dual control according to Fel’dbaum [145, 146]. Generally speaking, the case is more natural when a somewhat more precise knowledge of the objective is also developed when the problem is solved. The problem of developing the objective (and the corresponding estimator or a system of such estimators) is probably even more important. However, this problem is more difficult, and we shall limit ourselves here to only those problems in which the objective and the estimator are known. An example of such a problem is the game of chess; to solve it, it is very important to develop an estimator.
³ Ulam [194], noting that finding a minimum of a function of more than four or five unknowns on a computer is often practically impossible, suggests that to solve such problems one should construct a cooperative link between the man and the machine. In this case, the machine displays on the screen reliefs of given two-dimensional sections of a function, and the man using these reliefs makes a decision about further action (i.e., what sections should be taken, what area should be examined in magnified scale, etc.). In this very wise proposition, one can clearly see the understanding of the imperfection of exact isomorphic algorithms. However, the proposed division of labor between the man and the machine is based upon existing machines and methods of their use.
The type of organization of functions which we assume here (this will be described somewhat later) is very typical of many cases, and is probably one of the most frequently used types. Extremum problems are also interesting in themselves, comprising an important chapter of modern automatic control theory (automatic optimization [142-144]). Let F(x1, . . . , xn, y1, . . . , ym) be a function whose minimum values are sought. It is assumed that a system is able to measure the values of F. It is also assumed that the system can measure the values of the variables
⁴ “It is possible to predict weather exactly for tomorrow, but one needs a month to do this” (Richardson).
x1, . . . , xn, which will be called its working parameters. The arguments y1, . . . , ym are the hidden parameters of the system; they depend on time, and perhaps also on the variables x1, . . . , xn: the system is not able to measure or change the values of the hidden parameters. The function F(x1, . . . , xn, y1, . . . , ym) will be written as Φ(x1, . . . , xn, t); the function Φ will be referred to as the estimator of the system. It should be noted that the function Φ is not assumed to be given analytically or in any other way, so that the selection of the required values of the working parameters should be done experimentally. The time-dependence of the estimator (this dependence is by no means assumed to be known) implies the necessity of a continuous search for the required values of the arguments.⁵ An important property of this type of search is its speed. Only those search tactics are acceptable for which, roughly speaking, satisfactory values of the estimator are obtained within a time interval during which the estimator does not change appreciably. Thus, the search speed turns out to be related to the rate of change of the estimator. We should note that the time-dependence of Φ loses its meaning if one is concerned with finding the absolute minimum. Therefore, we shall limit ourselves only to the problem of finding the region in which the estimator has relatively small values. (The region is equivalent to a level which must be maintained.) The automatic search for values of the working arguments which assure sufficiently small values of Φ can be conducted in many different ways. These various methods may be divided conveniently into three groups. The first group includes the so-called blind search methods. For them, it is characteristic that all points of the space of working parameters are either scanned in a certain order or selected randomly (the principle of homeostasis [132, 163]).
When sufficiently small values of Φ are reached, the search is stopped until those values go beyond the admissible limits. Blind search methods make very little use of the characteristic features of the estimator (its organization). The results of a separate experiment are not used in the search that follows, so that the information about the estimator which was gained when measuring the values of Φ is lost. Therefore, the values of Φ are not improved from experiment to experiment, and on
⁵ Moreover, if Φ did not depend upon time (and the time devoted to search were infinite), one could be satisfied with a complete scanning of its values. Using only one memory cell containing the minimum of the previous values, one could obtain the value of the absolute minimum.
the average turn out to be relatively high. Blind search methods use only one value of Φ, namely, the one that is given at the current time. In this sense systems that work using blind search do not have a memory. The second group comprises the local search methods. They are quite numerous: they include such methods as the gradient, relaxation, and steepest descent techniques, and others. Their common feature is localness: the working point moves continuously through the space of working parameters. To prepare the next experiment, one uses the values of Φ in a small neighborhood of the preceding experiment. Tactics of this type make it possible to achieve a systematic lowering of the values of Φ during the search process, and thus give a considerable advantage to local methods as compared with blind search techniques. Local search systems (as applied to the problems of automatic optimization) are described in detail in the papers by Fel’dbaum [142-144], which also indicate the possible circuit designs of the electronic systems in question. The use of any specific local method necessitates an experimental determination of certain constants which define the search (e.g., the magnitude of the step in the gradient method of steepest descent). The values of the constants that assure the most rapid search are important characteristics of the function being minimized. However, inasmuch as the local search methods use only the local features, the constants are different in various regions and do not characterize Φ completely. The use of only the local properties of Φ limits the effectiveness of the local search methods, and creates a constant danger that the search will “cycle” in one place, in any “small second-degree dip.” When the values of the gradient are small, the search becomes a blind wandering, and its effectiveness is then insignificant. The third group of automatic search methods comprises methods which we call nonlocal.
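Before turning to the nonlocal methods, the contrast between the first two groups can be seen numerically. In the sketch below (the test estimator and all constants are invented for illustration), a memoryless blind search and a gradient descent are given the same number of evaluations of a smooth Φ; the local method, which reuses the neighborhood of the previous point, reaches far smaller values.

```python
import random

def phi(x, y):
    # a smooth test estimator with its minimum, 0, at the point (1, -2)
    return (x - 1) ** 2 + 4 * (y + 2) ** 2

def blind_search(trials, rng):
    # memoryless: every trial is an independent random point
    return min(phi(rng.uniform(-10, 10), rng.uniform(-10, 10))
               for _ in range(trials))

def gradient_descent(steps, lr=0.05):
    # local: each step uses the gradient at the previous point
    x, y = 9.0, 9.0
    for _ in range(steps):
        gx, gy = 2 * (x - 1), 8 * (y + 2)   # analytic gradient of phi
        x, y = x - lr * gx, y - lr * gy
    return phi(x, y)

blind = blind_search(200, random.Random(0))
local = gradient_descent(200)
```

With 200 evaluations each, the blind search is left with whatever its best random sample happened to be, while the gradient descent drives Φ to essentially zero; this is the “memory” advantage of the local methods.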
These methods are characterized by the fact that the trajectories along which the working point moves in the space of parameters are not continuous. For this reason, the volume scanned per unit time becomes considerably greater, and the search itself is significantly faster, thus comparing favorably with local methods. The simplest of the nonlocal search methods combines the principle of homeostasis with any local method. Methods of this type (which, incidentally, are often used in computing practice) can be described as follows. Upon choosing an arbitrary point, one conducts a search (descent) using a certain local method. When the search “cycles” in one place, i.e., when changes in
the value of the estimator become small during subsequent motion, a new arbitrary point is chosen, and the process is repeated. This is the way, for example, the simplest nonlocal gradient method works. Methods of this type also use only the local properties of the function being minimized; the information obtained during the local descent is not used further and is lost. Therefore, at the beginning of each descent a start is inescapably made in a region of large values of Φ, which means that the local descent is long. We shall now describe a method of nonlocal search which in our work [61] was referred to as “the ravine method.” This method permits us to use properties of function organization that are more extensive than the local behavior of the function. The ravine method is effective in those cases in which the working parameters x1, . . . , xn may be subdivided into two groups. The first group, which includes almost all of these parameters, consists of those arguments whose variation results in a significant change of the value of Φ. Thus, the selection of the values of these parameters (we shall call them nonessential) can be achieved in a relatively simple and rapid fashion. The second group of parameters includes a small number (for instance, one, two, or three) of variables. These variables may themselves be working parameters from among x1, . . . , xn, but more often are functions of them. A variation of the variables in the second group (the essential variables) results in a relatively small change of the estimator. The number of the essential variables will be called the dimensionality of a ravine. Of course, such a subdivision of parameters is impossible for every function that could be defined by a mathematician.
However, for functions occurring in man’s practical activities (here we include reasonable problems in physics, engineering, and physiology) this type of breakdown is apparently possible in a considerable number of cases (probably even in a majority of cases). Keeping in mind the difficulties associated with an exact definition of these concepts, we shall nevertheless allow ourselves to call well-organized those functions which admit of such a parameter decomposition. The hypothesis that an estimator Φ is well-organized lies at the basis of the ravine method. The decomposition of parameters into essential and nonessential should, of course, be done automatically in the course of the search. Here it is important to note that the decomposition of the parameters into groups depends, generally speaking, on the time and on the point X = (x1, . . . , xn) in the space of working parameters.
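A hypothetical two-variable illustration of such a decomposition (the function is the classical Rosenbrock valley, not an example from the text): along the ravine floor y = x², a displacement of the essential variable changes Φ only weakly, while an equal displacement off the floor, in a nonessential direction, changes it strongly.

```python
def phi(x, y):
    # a "well-organized" estimator: a narrow curved ravine along y = x*x
    return (1 - x) ** 2 + 100 * (y - x * x) ** 2

base = phi(0.0, 0.0)           # a point on the ravine floor
off_floor = phi(0.0, 0.3)      # nonessential direction: step off the floor
along_floor = phi(0.3, 0.09)   # essential direction: step along the floor
```

The off-floor displacement raises Φ many times more than the equal displacement along the floor, which is exactly the asymmetry the ravine method is designed to exploit.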
Biological Systems and Mathematical Models in Biology
Now we shall describe the search itself. At first an arbitrary point X₀ is selected. From this point a descent is made along the gradient (or using any other local method). The descent is continued as long as the relative decrease ΔΦ/Φ per step exceeds some preset value A, called the gradient test. When the ravine method is used, the local descent should be made roughly, selecting the value of A to be relatively large (e.g., A = 20%). The point is that as soon as the local search ceases to decrease the value of Φ to any significant degree, we have fallen into a region where the variables of the first and second groups become equivalent, i.e., the function ceases to be well organized. Therefore, if we were to continue the local search, we would wander randomly, moving only along the nonessential variables without moving significantly along the essential ones. Properly speaking, this is indeed the reason for the low effectiveness of local search methods. Thus, suppose that a descent along the gradient has brought us to a point A₀. Then a point X₁ is selected in a neighborhood of the point X₀, at a distance considerably larger than a step of the gradient descent (for example, in some direction perpendicular to the gradient). From the point X₁ a local descent is made to a point A₁. After the points A₀ and A₁ are found, the point X₂ is found by means of the so-called "step in the ravine." The points A₀ and A₁ are connected by a straight line on which the point X₂ is placed at a distance L from A₁, called the length of the ravine step. For well-organized functions, this length is chosen much greater than the length of the gradient step. The ravine step is chosen experimentally. Its value will, to a large extent, determine the efficiency of the automatic search: for a fixed value of L, we "roll over small ridges" and "climb tall mountains." These scales of magnitude are determined by the value of the ravine step.
When the point X₂ is chosen, a gradient descent is made from it to a point A₂; a point X₃ is selected using the points A₁ and A₂ just as the point X₂ was found using the points A₀ and A₁, and then the process is repeated. The points Xᵢ are thus found in places where small values of the estimator may be expected, or close to them. (The value of Φ at the points Xᵢ themselves, on the "slopes of the ravine," does not have to be small, because of the effect of the nonessential variables.) The entire search is therefore essentially conducted in regions where the estimator has small values. Another important thing should be noted. When the length of the ravine step is correctly chosen, as one moves along the ravine an adaptation to its direction occurs, so that the lengths of the gradient descents become much smaller than the length of the ravine step. This adaptation is related to the
Principles of the Functioning of the Central Nervous System
fact that, in the course of moving along the ravine, the direction of the movement is made more accurate. Thus, because of a more accurate separation of the essential variables, the fraction of time spent on descents along gradients is lowered. This results both in an acceleration of the search process itself and in a considerable lowering of the values of Φ thus computed. This adaptation of tactics possesses features which may be related to such terms as learning or expedient behavior. The efficiency of the search depends to a large extent on the choice of the ravine step length L and the value of the gradient test A. The values of these parameters that assure the most effective search are important characteristics of the function Φ. Thus, the use of the ravine method makes it possible to obtain considerably more information about the structure of this function than is possible using local methods. We have described the ravine method in its simplest form, i.e., when the ravine is one dimensional. In the case of a multidimensional ravine this method, particularly at the beginning of the search, is inadequate. In those cases it is advantageous to take several points at the outset (e.g., two for each of the working parameters). Then, having obtained the points resulting from descent, one can find the correct direction of the ravine from them. It is often expedient to start the search from a "pencil of ravines" originating in the same region. We have described here the tactics of ravines of first rank, in the sense that the variables are divided into two groups. It is not difficult to devise tactics of higher ranks. Thus, for example, in a tactic of second rank the variables are divided into nonessential, essential of first rank, and essential of second rank, and the lengths L₁ and L₂ of the ravine steps along these variables are chosen correspondingly.
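The basic tactic can be sketched in a few lines. The function below is a hypothetical "well-organized" estimator (a narrow curved valley); the step sizes, the gradient test, and the ravine step length are illustrative choices, not values taken from our experiments.

```python
import math

def phi(x, y):
    # Hypothetical estimator with a one-dimensional ravine along y = x**2.
    return (1 - x) ** 2 + 100.0 * (y - x * x) ** 2

def grad_phi(x, y):
    return (-2.0 * (1 - x) - 400.0 * x * (y - x * x),
            200.0 * (y - x * x))

def rough_descent(x, y, A=0.2, step=1e-3, max_steps=500):
    """Rough local descent: stop once the relative decrease of phi per
    step falls below the (deliberately large) gradient test A."""
    f = phi(x, y)
    for _ in range(max_steps):
        gx, gy = grad_phi(x, y)
        s = step
        while phi(x - s * gx, y - s * gy) >= f and s > 1e-12:
            s *= 0.5                      # back off if the step overshoots
        x, y = x - s * gx, y - s * gy
        f_new = phi(x, y)
        if f > 0 and (f - f_new) / f < A:
            return x, y, f_new
        f = f_new
    return x, y, f

def ravine_search(x0, y0, L=0.2, n_ravine_steps=60):
    """Alternate rough descents with long 'ravine steps' of length L
    along the line joining the two latest descent bottoms."""
    ax0, ay0, f0 = rough_descent(x0, y0)
    ax1, ay1, f1 = rough_descent(ax0 + 0.1, ay0)   # second starting point
    best = min(f0, f1)
    for _ in range(n_ravine_steps):
        dx, dy = ax1 - ax0, ay1 - ay0
        norm = math.hypot(dx, dy)
        if norm < 1e-12:
            break
        xs, ys = ax1 + L * dx / norm, ay1 + L * dy / norm  # ravine step
        ax0, ay0 = ax1, ay1
        ax1, ay1, f1 = rough_descent(xs, ys)
        best = min(best, f1)
    return best
```

Starting from the point (−1.2, 1), the rough descents drop quickly to the ravine floor, and the ravine steps then carry the search along the valley toward the minimum at (1, 1), which a purely local descent with the same rough gradient test would approach only very slowly.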
The gradient and steepest descent methods require the computation of gradients, i.e., the computation of the values of Φ at n + 1 points, which for large n involves a considerable number of operations. Here it may be useful to use finite automata capable of expedient behavior [154, 155]. One can use the automata L₂ₖ,ₙ (k is the number of working parameters, n is the complexity of an automaton). "Penalties" and "rewards" are determined from a given value of the gradient test, and to each of the rays in the diagram of states of an automaton there corresponds a motion in a definite direction along one of the working parameters. It is important to note that (for functions which vary rapidly with time) in the course of motion along a ravine, the correlation between the values of the function at points lying at a distance of a ravine step becomes low. Therefore, the effectiveness of the ravine method begins to approach the
effectiveness of the simplest nonlocal gradient method. A further enhancement of the time dependence leads also to a decrease of the local correlation (the correlation between the values of the function at points lying at a distance of one gradient step from each other), so that the tactics begin to approach a blind search. A similar situation also arises when the degree of organization of the function is lowered. In essence, the ravine method includes both the nonlocal gradient method and blind search, thus yielding a significant payoff in those cases in which the estimator is well organized and the speed of the search exceeds its rate of variation with time. In the remaining cases, the ravine method is not inferior to other methods. In using various methods of automatic extremum search, the question of their relative worth arises. Here it is natural to use functionals of the type

(1/T) ∫₀ᵀ Ψ(Φ(x₁, . . . , xₙ, t)) dt.    (1)
The function Ψ may be chosen differently depending on the choice of the criterion for judging the efficiency of the search. Thus, for example, for Ψ(Φ) = Φ the value of this functional corresponds to the so-called "cost of the search" defined for simple systems of automatic minimization [159]. In problems where the objective is to reach values of Φ which do not exceed a certain level C ("minimization with respect to the level C"), it is convenient to use the functional obtained from (1) upon setting Ψ(Φ) = 0 for Φ ≤ C and Ψ(Φ) = 1 for Φ > C, i.e., the fraction of the search time spent at values above the level C.
When the ravine method is used, the values of these functionals are lower than for other methods since, due to the process of adaptation, the entire search is conducted in the region of small values of Φ. The application of the ravine method to the problems of the phase-shift analysis of particle scattering and to the analysis of the structure of crystals has been described in the literature [42, 51, 52, 58]. The corresponding physical phenomena are also described there [115, 124].
2 Simulation of Expedient Behavior of a Group of Automata Before proceeding to the next group of mathematical models that arise in our preoccupation with the physiology of the central nervous system, it seems to us that it will be of some use to emphasize certain characteristic
features of the structure of complex control systems, knowledge of which is essential for their simulation: (1) the complexity of the systems: the presence of a large number of relatively autonomous subsystems, and the experimental difficulties involved in studying and describing the interactions among such subsystems; (2) the reliability of functioning, which assures the expedient behavior of the entire system even when some subsystems suffer a failure; and (3) the diversity of problems handled by complex control systems, and the impossibility of setting apart specialized systems for solving each problem.
When simulating the behavior of complex control systems, the necessity naturally arises of separating out the simplest forms of such behavior, of searching for structures that possess expedient behavior in these simplest forms, and of constructing a language useful for describing the mutual interaction among the simplest structures, whose collective behavior would allow us to convey the essential features of the behavior of complex control systems. We would like to mention one attempt at describing complex control systems by separating out elementary problems and structures.⁷ In choosing the structures of the simple systems capable of expedient behavior, we have made use of finite automata. Here an automaton is defined as an object capable at each instant of time t = 1, 2, . . . of receiving one of a finite number of signals S₁, S₂, . . . and of changing its internal state accordingly. An automaton is capable of a finite number of actions f₁, . . . , fₖ. The choice of an action is determined by the internal state of the automaton; it is assumed that an automaton has m internal states φ₁, . . . , φₘ; the number m will be called the capacity of the automaton memory. If an automaton is in some medium, then its actions f cause responsive reactions of the medium S, which are in turn the input signals for the automaton. The automaton, one might say, uses them to make a decision about subsequent actions. In the simplest case we shall assume that all possible reactions of the medium are received by the automaton as belonging to one of two classes: the class of favorable reactions or the class of unfavorable reactions; these classes will be called rewards and penalties. Inside each of these classes, the reactions of the medium are indistinguishable. The expedient behavior consists in increasing the number of favorable reactions and decreasing the number of unfavorable ones. A more detailed exposition is given on pp. 12-83.
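In one conventional notation (a sketch only; the letters here are ours and not necessarily those of Part I), such an automaton is specified by a transition map and an output map,

```latex
\varphi(t+1) = G\bigl(\varphi(t),\, S(t)\bigr), \qquad f(t) = F\bigl(\varphi(t)\bigr),
```

where φ(t) is the internal state, S(t) the input signal (a reward or a penalty), and f(t) the action; in the probabilistic constructions considered below, G may be a random map.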
⁷ In what follows, we give an exposition for the nonmathematical reader (Editor's note).
The role of the medium is to establish the relation between the actions of the automaton and the signals received at its input. This relationship may, generally speaking, be very complex, especially when a given automaton interacts with other automata. On the other hand, the information obtained by the automaton consists merely of whether a reward or a penalty followed the last action, so that the character of the medium is not known to the automaton beforehand. It is therefore natural to select the construction of the automaton in such a way that its behavior will possess the maximum expediency in the simplest cases, and then to study the behavior of the automaton and of groups of automata in more complex media. The simplest of the problems arising here involves the behavior of an automaton in a stationary random medium. In such media, for each of the actions fᵢ (i = 1, . . . , k) of the automaton we are given the expectation aᵢ of its payoff, so that the set a₁, . . . , aₖ specifies a stationary random medium. Here (with a reward counted as +1 and a penalty as −1) the probabilities pᵢ of a reward and qᵢ of a penalty are given by pᵢ = (1 + aᵢ)/2 and qᵢ = (1 − aᵢ)/2.
The functioning of a finite automaton A in a stationary random medium is described by a finite Markov chain, and for ergodic chains one can speak of the final probabilities of the states and of the final (independent of the initial state) value W(A) of the average payoff received by the automaton in a stationary random medium. It is natural to compare the average payoff obtained by such an automaton with the average payoff which could be obtained by a person who (as distinguished from the automaton) knew beforehand the parameters a₁, . . . , aₖ of the medium. This person would obviously perform only the action that yields the maximum payoff, and the average payoff for this person would be equal to the greatest of the numbers a₁, . . . , aₖ. It turns out that the payoff to any finite automaton is less than max(a₁, . . . , aₖ), but one can construct sequences of finite automata A₁, . . . , Aₙ, . . . such that

lim(n→∞) W(Aₙ) = max(a₁, . . . , aₖ).
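One standard member of such a sequence, the automaton with a linear tactic, is easy to simulate. The sketch below assumes two actions with payoff expectations a₁ and a₂ and the reward probabilities (1 + aᵢ)/2 noted above; the specific numbers are hypothetical.

```python
import random

def average_payoff(a, depth, steps, seed=0):
    """Linear-tactic automaton with two actions and memory depth `depth`.
    A reward (+1) deepens the current state; a penalty (-1) pushes it
    toward the boundary, and a penalty in the boundary state switches
    the action. Returns the empirical average payoff W."""
    rng = random.Random(seed)
    action, level = 0, 1      # level runs from 1 (boundary) to depth
    total = 0.0
    for _ in range(steps):
        rewarded = rng.random() < (1.0 + a[action]) / 2.0
        total += 1.0 if rewarded else -1.0
        if rewarded:
            level = min(level + 1, depth)
        elif level > 1:
            level -= 1
        else:
            action = 1 - action
    return total / steps
```

With a = (0.6, −0.6), the automaton of depth 1 earns roughly 0.36 on the average, while the same tactic with depth 5 earns close to the maximum a₁ = 0.6, illustrating the asymptotic optimality of the sequence.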
Such sequences are called asymptotically optimal. A description of a number of such constructions can be found on pp. 18-21 and in the papers [102, 112, 128, 154, 157], which are devoted to a simulation of the simple forms of behavior. In this case, the number n for an automaton in an asymptotically optimal sequence may be interpreted as the capacity of its memory. We shall not
dwell here upon questions related to the behavior of automata in media whose properties are not stationary (see p. 25). We shall first of all be interested in the problems related to the collective behavior of automata. The collective behavior of automata is generated by their interaction. We have agreed to consider only simple automata in whose construction there is no a priori information about the medium or about other automata; the information obtained by a simple automaton is limited to just the information about a reward or a penalty incurred by a given action. For this reason, we shall only consider those forms of interaction which can be realized in the collective behavior of such simple automata. The language of game theory provides a convenient tool that can be used to describe such forms of interaction. However, the models of collective automaton behavior differ considerably from the models accepted in game theory. Thus, in game theory we assume that the system of payoff functions which defines a game is revealed to the players before the game. Using this a priori information and making use of any computing means, the player makes a choice of a strategy. The strategies thus chosen are not changed in the course of the game, so that the game is similar to a play of chess that begins and ends with a home analysis. Models of collective automaton behavior (an automaton game) [157] do not assume that any a priori information is present, so that the strategies are selected in the course of the game itself. It is assumed that an automaton game consists of a sequence of plays. Here a play f(t) of a game Γ, played at time t, is defined as a set f(t) = (f¹(t), . . . , fᴺ(t)) of actions (strategies) selected at that moment by the automata A¹, . . . , Aᴺ participating in the game. The outcome S(t + 1) of a play f(t) is defined as a set S(t + 1) = (S¹(t + 1), . . . , Sᴺ(t + 1)), where Sʲ(t + 1) = 0 if the automaton Aʲ wins in this play, and Sʲ(t + 1) = 1 if it loses. By giving the structures of the participant automata and the probabilities P(f, s) of the play outcomes, we define a game played by automata. Using these probabilities, one can determine the expected value Wʲ(f) of the payoff to an automaton Aʲ in a play f. The system of payoff functions thus constructed defines a game in the sense of game theory, a game which is equivalent to the automaton game. Thus, the information concerning wins and losses as a result of a given play specifies the values of the input variables for the participant automata, determining the choice of strategies in the subsequent plays of the game. In this case, the automata do not receive any information about either the actions of their partners or the strategies which are at the disposal of their
partners, or even about the number of opponents. For a given automaton, the role of the remaining players is reduced to the formation of a more or less complex medium in which the automaton should be capable of expedient behavior. Therefore, when selecting the structures of the participant automata it is natural to require that these constructions assure us of expedient behavior in the simplest game, a game with one player (a "game against nature"), i.e., in a stationary random medium. It turns out that for quite a number of automaton games such simple structures make for expedient behavior. Consider as an example a zero-sum two-person game. Suppose that the game matrix is given to one of the opponents before the game, and he chooses the optimal strategy in the game-theoretical sense, while his opponent is an automaton belonging to an asymptotically optimal sequence. Then the automaton (with sufficient memory capacity) achieves a payoff equal to the value of the game according to von Neumann. (Editor's note: an opponent of an automaton with a linear tactic who does not use von Neumann's strategy may, however, win from the linear automaton more than the value of the game.) If both players in the game are automata, then their payoff is also close in some sense to the value of the game. Of course, the most interesting games are those played by many automata. We are interested here in the simplest such games, namely, those in which all the players are equivalent. The simplest example of such a game is the so-called Goore game. This game is played by N persons, each of whom is capable of only two actions. Here the probability of winning for any of the players is determined only by the fraction of the players using the first strategy. It is obvious how people would behave in this game if they knew its conditions beforehand: they would make an agreement that the first strategy should be used by the number of players which assures the maximum payoff. The simple automata playing the game do not know beforehand the payoff function of the game, the actions of the opponents, or the payoffs to the opponents in its separate plays. Each of the automata receives only the information about its own win or loss in a given play. Nevertheless, with sufficient memory capacity they too maximize their payoff. The expedient behavior of every automaton in this simplest problem results in the expediency of their collective behavior, replacing such difficult-to-formalize phenomena as "agreement on common actions." It is interesting to note that for a fixed memory capacity, if the number of
automata playing the Goore game increases, the expediency of their collective behavior decreases, in the limit not differing from random behavior. Conversely, for any fixed number of players, an increase in the memory of each automaton results in a heightened expediency, and the average payoff in this case tends to the maximum possible value. Let us give another example of an automaton game, which we called the distribution game. The situation simulated in this game is typical of the problem facing predatory animals when they must choose their hunting grounds. Here the number of prey per predator is determined by the supply of prey in the area chosen and by the number of predators hunting there at the same time. In the game, a choice of a certain strategy corresponds to a choice of the hunting grounds, and the value of the payoff function corresponds to the number of prey. The distribution game is specified by k nonnegative numbers a₁ ≥ a₂ ≥ . . . ≥ aₖ ≥ 0 called the powers of the strategies. The game is played by automata A¹, . . . , Aᴺ, N ≤ k, each of whom has k strategies f₁, . . . , fₖ. The expectation of winning for an automaton that chooses strategy fⱼ in some play of the game is equal to aⱼ/mⱼ, where mⱼ is the number of automata using strategy fⱼ in this play. The distribution game as played by simple automata was studied by us using computer simulation. We found that the behavior of automata with sufficiently large memories did not differ from the behavior of people knowing the conditions of the game beforehand and making agreements about their actions: the automata (with probability close to unity) were choosing their actions in the optimal manner. Thus, for example, in the distribution game played by five automata with a₁ = 0.9 and a₂ = a₃ = a₄ = a₅ = 0.33, it turned out that in 99% of the plays the first strategy was chosen by two automata, and each of the remaining automata chose a separate one of the other strategies. In this case the average payoff to each automaton amounted to 0.38.
It is not hard to verify that in this situation it is not advantageous for any automaton to change its strategy. This type of behavior of the automata coincides with the behavior of people knowing the powers of the strategies beforehand. However, by making an agreement about sharing the payoffs, people could achieve higher payoffs. In fact, if the total payoff is divided equally among all the players, it would be advantageous to use the first five strategies, taking one of each: then the payoff to each player would amount to 0.44. If the payoffs to the automata in the distribution game are totaled in each play and shared equally ("the game with a common fund"), then the behavior of the automata also changes: in each play the first N strategies are chosen, each strategy by one automaton. In this case
the average payoff changes, and with an increasing memory capacity of the participant automata it approaches the maximum possible value (in our example, 0.44). It is interesting to note that the growth of the average payoff when the common fund is introduced is achieved only with a sufficiently high memory capacity; if the memory is small, the introduction of a common fund lowers the average payoff. In other words, if the individual expediency of behavior of each player is low, then an equalizing distribution of payoffs is not advantageous. The behavior of automata in the distribution game possesses the characteristic features of reliability. In fact, let us assume that the automata playing the game with a common fund may suffer failures. The remaining automata will, as before, play in the most advantageous way; that is, independently of whether any automaton fails, the strategies with maximum powers will be used as before. The increase in the average payoff to each of the automata continuing the game will partially compensate for the lowering of the total payoff. When new automata are included in the game, they will also be distributed in the most advantageous manner. The behavior of such a group of automata is similar to a perfectly reliable machine in which the wear and tear on the most important parts is automatically compensated for at the expense of the less important parts. For what follows, it is important to note that both the Goore game and the distribution game are examples of automaton groups that are "easy to control" in the sense that to control them it is sufficient merely to specify the payoff functions. To achieve the optimal operating mode, it is not necessary to control the behavior of each individual automaton.
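The arithmetic of this example can be checked directly. The sketch below assumes only the payoff rule aⱼ/mⱼ stated above; it verifies the reported individual payoff, the fact that no automaton gains by switching unilaterally, and the common-fund optimum.

```python
from itertools import product

powers = [0.9, 0.33, 0.33, 0.33, 0.33]   # a1 >= ... >= a5, as in the example

def payoffs(choices):
    """Payoff a_j / m_j to each automaton, given everyone's chosen strategy."""
    counts = [choices.count(j) for j in range(len(powers))]
    return [powers[c] / counts[c] for c in choices]

# The configuration reported in the text: two automata on strategy 1,
# the remaining three on distinct 0.33-strategies.
config = (0, 0, 1, 2, 3)
avg = sum(payoffs(config)) / 5           # rounds to the 0.38 of the text

# No single automaton can gain by switching its strategy unilaterally.
for i in range(5):
    for j in range(len(powers)):
        if j != config[i]:
            alt = config[:i] + (j,) + config[i + 1:]
            assert payoffs(alt)[i] <= payoffs(config)[i] + 1e-12

# With a common fund, one automaton per strategy maximizes the shared payoff.
best_shared = max(sum(payoffs(c)) / 5 for c in product(range(5), repeat=5))
```

The exhaustive maximum over all 5⁵ joint choices confirms that the equal-sharing optimum is one automaton per strategy, with the shared payoff 0.44 quoted in the text.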
In these examples the optimal operating mode is selected by automata that do not have any information about the actions of the other automata and are not capable of directly changing these actions, so that the interaction of the automata is limited to participation in a common game. We note, in addition, that a slightly more complicated distribution game may serve as a model for the naturally arising problem concerning the most convenient distribution of computing means when it is necessary to solve a number of problems simultaneously. Also of interest are those automaton games in which the payoff function of each automaton depends only on the strategy chosen by it and on the strategies of a limited number of other players, its "neighbors" in the game. The simplest example of such a game is provided by the so-called "circle game," in which the payoff to a player Aʲ depends only on his strategy and the strategies of the players Aʲ⁻¹ and Aʲ⁺¹ that are his neighbors.
For games with a limited number of neighbors, it is characteristic that the plays assuring the maximum payoff are selected relatively rapidly. Here the behavior of the automata possesses the features of reliability that were referred to in describing the distribution game.
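The Goore game described above is equally easy to simulate. In the sketch below the payoff curve (the probability of a win as a function of the number m of players choosing the first action) is a hypothetical single-peaked one with its maximum at m = 1; each player is a linear-tactic automaton of the kind described earlier.

```python
import random

def goore_game(n_players=3, depth=8, plays=40000, seed=0):
    """N linear-tactic automata each pick one of two actions; every player
    is then independently rewarded with probability reward_prob[m], where
    m is the number of players that chose action 1."""
    reward_prob = [0.2, 0.9, 0.4, 0.1]          # hypothetical, peak at m = 1
    rng = random.Random(seed)
    state = [(0, 1)] * n_players                # (action, memory level)
    total, counted = 0.0, 0
    for t in range(plays):
        m = sum(a for a, _ in state)
        for i, (a, lev) in enumerate(state):
            rewarded = rng.random() < reward_prob[m]
            if rewarded:
                state[i] = (a, min(lev + 1, depth))
            elif lev > 1:
                state[i] = (a, lev - 1)
            else:
                state[i] = (1 - a, 1)           # switch action
            if t >= plays // 2:                 # average over the second half
                total += 1.0 if rewarded else 0.0
                counted += 1
    return total / counted
```

With three players and an ample memory depth, the automata settle into the configuration m = 1 and hold it, so the frequency of wins over the second half of the game approaches the maximum 0.9; with a small memory depth the lock-in is weaker and the average win frequency is correspondingly lower.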
3 A Mathematical Description of Excitable Tissues

The first two sections of the present work were devoted to the simulation of certain features of the behavior of complex systems. In constructing these models, we have not attempted to consider the specific physiological structures underlying behavior. In this section we shall (of necessity briefly) describe certain attempts at simulating the simplest physiological structures whose properties are close to the simplest properties of excitable tissues, which will perhaps make it possible to explain certain features of their functioning. Here, in contrast to the traditional models of the nerve-network type given by W. S. McCulloch and W. H. Pitts, we shall not consider a system composed of a large number of separate elements with a complex system of connections among them. Instead, we shall consider continuous media, assuming that the neighbors of a given point are those points which lie in its immediate geometric neighborhood. We note that in a physiological experiment, the separation of an individual element is sometimes difficult and is not always meaningful. Let us consider the simplest example of such a continuous medium [60]. (Editor's note: this paper is included in the book; see p. 154.) An active tissue will be defined as a medium that possesses the following properties:

1. Each point of the medium is capable of instantaneous excitation. During a time R after the instant of excitation, the point cannot be excited again. The value of the interval R is called the refractory time. The phase τ(x, t) of a point x at time t is defined as the time that has passed since the last excitation. If τ(x, t) is less than R, we shall say that the point is refractory.
2. The excitation may propagate in the medium. The velocity c(x, t) of the excitation propagation at the point x at time t depends on the phase of this point: c(x, t) = φ(τ(x, t)). The excitation cannot propagate through refractory regions.
3. A point is capable of spontaneous activity. This means that after a time T (the period of spontaneous activity) has passed since the last excitation, the point may again become excited spontaneously. (It should be noted that property 3 is not obligatory, and we shall also consider media without spontaneous activity.)
The propagation of an excitation in active tissues possesses a number of interesting properties. Thus, for example, the process of propagation of excitation impulses in a homogeneous ring of active tissue is self-synchronizing: no matter what the initial phases and the initial distribution of the impulses in the ring may be, a regime is established in which the impulses are spaced at equal distances around the ring and propagate with a constant velocity. In the case of periodic excitation of the end of a segment of active tissue, the impulses likewise propagate, in the limit, with a constant velocity that does not depend on the initial distribution of the phases. (See also the literature [5, 6, 63, 139].) Now we shall give an example of the functioning of a plane excitable tissue. We assume, for simplicity, that the velocity c of the excitation propagation is constant, and that the initial phase is identical for all points and equal to zero. It is clear that all points of such a tissue will be excited simultaneously and with the same period T. Let us now imagine that at a time t₀ ≥ R the point x₀ is excited from the outside. Then, obviously, the excitation will spread from this point with velocity c. At a time t < T, the set of the excited points forms a circle of radius c(t − t₀) with its center at x₀, and at the time T all points will be excited that lie outside the circle of radius c(T − t₀). At the time t₀ + T, the point x₀ is again excited spontaneously, and the process is repeated periodically. In this case, each point of the medium is excited with period T; one can say that there is no interaction in the system. It is not difficult to see that for an arbitrary initial distribution of phases and external excitations, a regime is established in the system in which each point is excited spontaneously earlier than the excitation arriving from its neighbors, i.e., a regime without interaction.
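A discrete caricature of such a medium (the parameters below are hypothetical and not taken from [60]) already exhibits the synchronization effect: in a ring of points with different periods of spontaneous activity, every point ends up firing with the minimum period.

```python
def simulate_ring(periods, R, steps):
    """Ring of excitable points. A point fires spontaneously when its phase
    (time since it last fired) reaches its own period T_i, or when a
    neighbor fired on the previous step and the point is no longer
    refractory (phase >= R). Excitation travels one neighbor per step."""
    n = len(periods)
    phase = [0] * n
    fired_prev = set()
    spikes = [[] for _ in range(n)]
    for t in range(1, steps + 1):
        phase = [p + 1 for p in phase]
        fire_now = set()
        for i in range(n):
            neighbor = (i - 1) % n in fired_prev or (i + 1) % n in fired_prev
            if phase[i] >= periods[i] or (neighbor and phase[i] >= R):
                fire_now.add(i)
        for i in fire_now:
            phase[i] = 0
            spikes[i].append(t)
        fired_prev = fire_now
    return spikes
```

For example, with the periods (5, 9, 10, 11, 12, 13) and R = 3, the fastest point fires every 5 steps and its wave resets all the others, so that every point settles to the interval 5; waves meeting head-on annihilate in each other's refractory tails, as in the homogeneous ring described above.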
For an active tissue whose points have different periods of spontaneous activity, in steady-state conditions the period of excitation of any point of the medium will be equal to the minimum period of spontaneous activity: the medium is synchronized by its most active point. This very synchronization mechanism is realized in certain physiological objects. In the paper by Gel'fand et al. [57] it was, for example, shown that the sinus portion of the heart is automatically synchronized by the active cell working at the highest frequency. The concept of an active tissue was used in constructing a number of models of physiological mechanisms. Thus, for example, in the work by Lukashevich [117], the author describes a computer simulation of the process of excitation propagation in the heart. This model made it possible to study the peculiarities of a number of pathologies in the functioning of the heart. Keder-Stepanova and Rikko [94] used nonhomogeneous active tissues to construct models of the salvo-type activity of the breathing center. Arshavskiy and co-workers [9, 15, 135] have studied "cable nets," i.e., homogeneous lattices whose edges possess the properties of active tissues, and the characteristics of excitation propagation in such structures, using the examples of the myocardium and the dendrites of nerve cells.

4 The Principle of Least Interaction
We have described here various mathematical models connected in one way or another with our preoccupation with physiology. We understand very well the extreme diversity of these models, which are united only by the generality of their approach. Nevertheless, it seems to us that the models also possess a certain internal common feature which is perhaps also present in those physiological mechanisms which led to these models. We have in mind the principle of least interaction. We are still very far from a definitive formulation of this principle, and we actually only associate with these words our hope that it will be possible to construct a mathematical theory of complex control systems in which this principle would play a role analogous to that played by variational principles in analytical mechanics.¹⁰ We would like to think that in this future theory a uniform description will also be given of those preliminary mathematical models which were referred to in the present work. However, if we attempt here to talk about certain notions related to the principle of least interaction, it is only because these notions were useful in our activities.

¹⁰ At the present time, we are still not able to determine for sufficiently diverse systems a function which, on the one hand, could be readily interpreted as the interaction of the parts of the system among themselves or of parts of the system with the surrounding world, and, on the other hand, is such that when the system minimizes this function, the expedient behavior of the entire system follows (Editor's note).
Biological Systems and Mathematical Models in Biology
A system will be said to function expediently in some external medium if the system strives to minimize its interaction with this medium. Here, as a rule, a natural definition of the interaction function results from the properties and the purpose of the system itself. Thus, for example, a measure of the interaction of an organism with a medium is provided by the deviations of the parameters of the organism's internal medium from their optimal values. For the models about which we talked before, these functions are diverse. Thus, for example, for a system used to search for the minimum of a multivariable function (Section 1), a measure of interaction may be provided by the average value (over a certain time interval) of the function minimized. For automata capable of expedient behavior (Section 2), the interaction is measured by the average value of the penalty. For models of active tissues (Section 3), a measure of the interaction for an element of the tissue may be provided by a monotonic function of the deviation of the mean interval between two excitations from the period of spontaneous activity. Thus, all our models are examples of expedient systems. For such systems, it is typical that the most stable states are those with minimum interaction. In this sense, expedient systems are, so to speak, inertial: they tend to enter a state of small interaction so that they will not have to change states again. It should be noted that similar notions lie at the basis of the principle of a homeostat as proposed by Ashby. For complex control systems, a structure is typical which allows a separation into individual, relatively autonomous subsystems. For each of these subsystems, all remaining subsystems belong to the external medium, and the expediency of the subsystems is revealed in the minimization of the interaction among them, so that in stable states these subsystems function, as it were, independently or autonomously.
In the last analysis, the functioning of each subsystem is determined by the external medium and by the functioning of the entire complex system of which the subsystem is a part. However, at each moment the subsystem solves its own "particular," "personal" problem: namely, it minimizes its interaction with the medium; therefore, the complexity of the subsystem does not depend on the complexity of the entire system. The expediency of the entire system is revealed in the minimization of the total interaction of the system with the medium. When the external medium changes, the previous mode of operation of the control system no longer assures the minimum interaction. The interaction both of the entire system with the medium, and of the individual subsystems among themselves, increases. The expediency of the system leads to another stable mode assuring the minimum interaction in the new medium.
Principles of the Functioning of the Central Nervous System
Let us clarify the above with some examples. First consider the simplest homogeneous systems, all of whose subsystems are equivalent. A model of an active tissue is such a system. As its elementary subsystems one can take the points of the medium forming the tissue. We have already seen that in such a tissue, no matter what the initial phase distributions and points of the initial external excitations are, a regime is established in which all points are excited spontaneously, i.e., they do not interact among themselves. The propagation of the impulses in a ring of the active tissue proceeds at a constant velocity, so that the interaction of all points is identical, and with a good choice of the interaction function it is also minimal. Similar terms can also be used to describe such important physiological phenomena as the synchronization of the operation of the individual elements, and to explain the necessity of the appearance of special desynchronizing mechanisms (for example, of the type of the Renshaw cells in the spinal cord). Another example of homogeneous expedient systems is the homogeneous automaton game, for example, the Goore game or the distribution game described in Section 2. Here we also see that the expedient behavior of the individual automata assures us of the expedient behavior of the entire system of the participant automata. When the external medium (in this model, the values of the functions defining the game) changes, the behavior of the automata changes, lowering the interaction. When discussing the distribution game, we have already mentioned the reliability of a group of playing automata. This property is characteristic in general of complex systems in which the expedient behavior consists in the interaction of subsystems which themselves possess expediency. For these systems, another property is also apparently typical.
This is the fact that the expedient group behavior of the individual subsystems may also be achieved without the presence of direct links between them, so to speak. In a number of problems, to achieve this expediency it is sufficient to have the simpler type of interaction which occurs in automaton games. Lack of the necessity for direct links is also important because it enhances the reliability of control systems, and because it makes it possible to construct complex systems from simple subsystems. Otherwise, each subsystem would have a system of links growing with the increasing complexity of the entire system, and its structure, assuring expedient behavior, would also become complex. We must note that the principle of least interaction is also important, because it permits us to consider any system (or its component), if it is capable of expedient behavior, as a single whole, which, of course, considerably simplifies the analysis.
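The Goore game mentioned above is easy to simulate. The sketch below is only an illustration under assumed conditions: the payoff function v, the number of players, the memory depth, and the learning rule (a simple two-action automaton with linear memory, a crude stand-in for the constructions studied earlier in this book) are all choices made for the example, not taken from the text.

```python
import random

random.seed(0)

def v(f):
    # assumed unimodal payoff: probability of reward, peaked at f* = 0.4
    return 1.0 - (f - 0.4) ** 2

N, DEPTH, STEPS = 10, 5, 20000
action = [random.randint(0, 1) for _ in range(N)]   # each player's current choice
state = [1] * N                                     # memory depth in that choice

fs = []
for _ in range(STEPS):
    f = sum(action) / N          # fraction of players choosing action 1
    fs.append(f)
    p = v(f)                     # the referee rewards everyone with probability v(f)
    for i in range(N):
        if random.random() < p:  # reward: go deeper into the current action
            state[i] = min(state[i] + 1, DEPTH)
        else:                    # penalty: back up; switch action at the boundary
            state[i] -= 1
            if state[i] == 0:
                action[i] = 1 - action[i]
                state[i] = 1

mean_f = sum(fs[STEPS // 2:]) / (STEPS // 2)
print(round(mean_f, 2))
```

Each round every automaton is rewarded with the same probability v(f), yet an automaton's own action still matters, because f includes its own choice; with enough memory the population fraction tends to drift toward the maximum of v, without any direct links between the players.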
An example of an inhomogeneous control system may be provided by a system designed to search for the extremum of a multivariable function using the ravine method. In fact, this system consists, roughly speaking, of two subsystems: the level performing the local search and the level performing "the step along the ravine." The first of these levels minimizes the value of the function using, for example, the gradient method. The second one minimizes the time spent on the local search, so that the medium for the second level is provided by the first level. In those regions where the properties of the function (i.e., the subdivision of the variables into essential and nonessential ones) are stationary, the interaction of both subsystems is minimized, i.e., both the values of the function (on the average) and the lengths of the gradient descents become small. Where the function changes (e.g., at the bend of the ravine) the interaction increases. The expediency of the functioning of our subsystems has the effect of imparting expediency to the entire system: it lowers the average values of the function in the course of the search. It is possible that the above notions can be used to explain certain general features of the interaction of nerve centers. The central nervous system contains a large number of separate nerve centers serving to control the effectors, so that each act of behavior (e.g., motion) is a result of their joint activity. Here a change in the functioning of one of the centers should of necessity result in a change of the activity of the remaining ones. If we assume that each nerve center represents an expediently functioning system, then our mathematical models allow us (to a certain degree) to imagine the interaction of the nerve centers without considering the complex system of links and the coordination of their activity.
Here one may consider that, for the nerve centers at some level, the stimuli of the medium are replaced by afferentation “coming from below,” and the stimuli from the higher centers determine their particular problems (one can say, determine the organization of this afferentation into a “system of payoff functions”). Here, of course, great importance is attributed to the question of what “particular, personal” problem is solved by each nerve center, which for it is a measure of interaction. We assume that this role is played for each center by the incoming afferentation, and the expediency of its operation consists of lowering the stream of afferentation arriving at the nerve center, which is the measure of interaction for the nerve center. We take the risk of assuming that this role of afferentation is universal, i.e., it takes place in one way or another for all nerve centers, and the nerve system as a whole is organized according to the principle of least interaction. For each external situation, the problem facing the nervous system consists in finding a mode
of operation in which afferentation is minimal. An important feature of the model described here is the relative simplicity of control: after the payoff functions are given, there is no need to control in detail the centers of a given level. By virtue of the expediency of their behavior, they will themselves choose the best distribution of responsibilities (compare with the distribution game described in Section 2). On the other hand, this method of control is also distinguished by its great flexibility: the corrections to the structure of the "payoff functions" can be made by a large number of nerve centers which have some relation to the problem and are not acting in coordination.
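Returning to the two-level search described earlier: below is a minimal sketch of a ravine-type method, in which an inner gradient descent plays the role of the local level and an outer extrapolation step plays the role of the "step along the ravine." The test function, step sizes, and iteration counts are assumptions made for the illustration, not the method as specified in the text.

```python
def ravine_search(f, grad, x0, x1, outer=30, inner=20, lr=0.008, step=0.5):
    """Two-level search: an inner gradient descent (the local level) and an
    outer extrapolation through successive ravine-bottom points."""
    def descend(x):
        for _ in range(inner):
            g = grad(x)
            x = [xi - lr * gi for xi, gi in zip(x, g)]
        return x

    a, b = descend(x0), descend(x1)
    for _ in range(outer):
        # the "step along the ravine": extrapolate through the two bottoms
        c = [bi + step * (bi - ai) for ai, bi in zip(a, b)]
        a, b = b, descend(c)
    return b

# A quadratic "ravine": steep across the line y = -x, gentle along it.
f = lambda x: 50 * (x[0] + x[1]) ** 2 + 0.5 * (x[0] - x[1]) ** 2
grad = lambda x: [100 * (x[0] + x[1]) + (x[0] - x[1]),
                  100 * (x[0] + x[1]) - (x[0] - x[1])]

best = ravine_search(f, grad, [2.0, 0.0], [1.5, 0.5])
print(f(best) < 1e-6)
```

The inner descent quickly kills the steep (nonessential) component, while the outer step accelerates motion along the gentle (essential) direction; neither level needs to know anything about the other beyond the points it is handed.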
Continuous Models of Control Systems¹
A number of papers have been devoted to the construction of mathematical models of control systems imitating the functions of the nervous system. These papers, and the theory of automata (logical networks) that arose in this connection, turned out to be useful in studying the principles of the construction and functioning of the logical circuits of computers. However, their underlying discreteness with respect to the elements and time makes them of little use when it comes to describing the functioning of somewhat more complicated systems, like biological systems. It seems reasonable to us that, instead of considering a large number of individual elements with a complex structure of links among them, one should investigate continuous models.² Concepts of this kind were used in the important and interesting work of Wiener and Rosenblueth [197] in describing the mechanism of the fibrillation of the heart muscle. Such a continuous medium will be constructed phenomenologically by considering at the same time certain "natural" assumptions about its properties. It will be noted also that in physiological experiments it is difficult, and not always meaningful, to separate the individual elements, so that a description of the medium not in terms of separate elements, but directly, is advantageous. We shall describe here the simplest possible model with the three properties to be given in what follows. In this sense the article is preliminary, and we hope to construct

¹ This chapter was written jointly with I. M. Gel'fand and previously published [60, p. 1242] (Editor's note).
² In this chapter, we shall limit ourselves to continuous models in the simpler sense of the word: namely, in a continuous medium closeness will be understood as geometrical closeness. Generally speaking, however, closeness could also mean closeness in "phase space." (For example, we can understand the distance between the points A and B as the time that it takes for a signal to travel from A to B.)
continuous models later which are better approximations of the physiological prototypes. An active tissue will be defined as a medium possessing the following properties³:

(1) Each point of the medium is capable of instantaneous excitation. During a time R after the time of excitation, the point cannot be excited. The quantity R is called the time of refractivity. The phase τ(x, t) of a point x at time t will be defined as the time that has passed since the last excitation of this point. For example, if a point x was last excited at time t = 0, then τ(x, t) = t. If τ(x, t) < R, then we shall say that the point is in the refractive phase.

(2) The excitation can propagate in the medium. The velocity c(x, t) of the excitation propagation at a point x at a time t depends on the phase of the point: c(x, t) = φ[τ(x, t)]. The function φ(τ) is defined for all τ ≥ R. The propagation of an excitation is impossible in regions in the refractive phase. Thus, the state of each point of the system is determined by whether it is excited at a given time or not, and by its phase τ. The propagation of an excitation is understood as a propagation of a "discontinuity" in the state of the system whose front moves with the velocity c along the normal to the "discontinuity." One could also consider a model in which the front of the excitation would have a finite extension. The propagation of an excitation could then be compared to the spreading of flames in a burning medium.⁴

(3) The point is capable of spontaneous activity. This means that a time T after the last excitation, the point again becomes excited spontaneously (if, of course, the point did not become excited earlier due to the influence of the neighboring points). The quantity T is called the period of spontaneous activity.
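Properties (1)-(3) can be sketched in a minimal discrete simulation: cells on a ring, an excitation that hops one cell per time step (so the velocity c is one cell per step), a refractory time R, and a spontaneous period T. All names and parameter values here are illustrative assumptions, not constructions from the text.

```python
def step(tau, fired, R, T):
    """One time step on a ring of active-tissue cells.
    tau[i]   -- phase of cell i (time since its last excitation)
    fired[i] -- whether cell i was excited on the previous step
    A cell fires spontaneously when its phase reaches T, or is driven by an
    excited neighbor once it has left the refractive phase (tau >= R)."""
    n = len(tau)
    new_fired = [tau[i] + 1 >= T
                 or ((fired[(i - 1) % n] or fired[(i + 1) % n]) and tau[i] + 1 >= R)
                 for i in range(n)]
    new_tau = [0 if f else t + 1 for f, t in zip(new_fired, tau)]
    return new_tau, new_fired

# Ring rhythm: no spontaneous activity, linear initial phase distribution;
# a single pulse circulates one way and returns with period l/c = N steps.
N, R = 20, 5
T = float('inf')
tau = list(range(N))                # cell 0 just excited, refractory to its right
fired = [i == 0 for i in range(N)]

refires = []                        # times at which cell 0 is re-excited
for t in range(1, 3 * N + 1):
    tau, fired = step(tau, fired, R, T)
    if fired[0]:
        refires.append(t)
print(refires)                      # [20, 40, 60]
```

Refractivity blocks back-propagation (a cell that has just fired cannot be re-excited by its neighbor), so the pulse moves in one direction only, which is exactly the role of refractivity discussed in Example 1 below.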
Let us consider some examples of the processes that may take place in such a tissue.

1. We shall investigate the operational mode called the ring rhythm.

³ Properties (1) and (2) were essentially considered by Wiener and Rosenblueth [197]. These properties together with (3) are convenient, not only in the investigation of phenomena in the myocardium, but also in the construction of control systems.
⁴ A model of flame propagation could probably be useful in studying the propagation of an excitation along an axon or a muscular fiber. A model could be provided by the equation [81]

∂²u/∂t² = a ∂²u/∂x² + F(u),

or by systems of equations of this type.
Figure 20
We have a thin thread of length l closed to form a ring. We assume that spontaneous activity is absent, and the velocity c of propagation of an excitation is constant. As the initial conditions, we take the phase distribution τ(x, 0) shown in Fig. 20 with a solid line. Thus, initially the point θ₀ is excited, and its neighbors on the right (the segment θ₀θ′) are in the refractive phase. Therefore, the propagation of the excitation will proceed to the left from the point θ₀, and the points located to the right of θ₀ will gradually leave the refractive phase (their phases will grow). By a time t the phase distribution τ(x, t) will take the form shown in Fig. 20 with a dotted line. The length of the segment θ₀θₜ is obviously equal to ct. Since the beginning and the end of the segment are identified, the excited point during time t = l/c will return to the position θ₀ and the phase distribution will coincide with the initial one. From this example, we can see the role of refractivity, which makes it possible for the excitation to move in one direction. In this sense, the refractivity that makes it possible for a point A to act on a point B, where the point B does not act back on the point A, appears to be the opposite of Huyghens' principle.⁵ In the following, we shall return again to operating modes of this type.

2. Consider now an active tissue, and let T be the period of its spontaneous activity. If all points of the tissue were excited simultaneously at the time t = 0, then later they would simultaneously become excited at times t = T, 2T, .... Thus, on the phase diagram the distribution at the initial instant coincides with the axis of abscissas, and then moves upward with a velocity equal to unity. When the time t = T is reached, this straight line again coincides with the axis.
Since, however, the phases 0, T, 2T are indistinguishable, it will be more convenient for us to imagine the phase distribution as continually rising, and to define the phase as the distance from the point in question on the curve of the phase distribution to the straight line t = nT, where nT ≤ τ(x, t) < (n + 1)T.
⁵ An important property of Huyghens' principle is its mutuality.
We shall say that there is no interaction in the system if every point of the tissue is excited spontaneously, and earlier than it could become excited under the influence of the neighboring excited points. In the absence of interaction, the curve of the phase distribution moves upward parallel to itself. It is not hard to see that interaction will be absent for all initial distributions for which the slope of the tangent is less than c⁻¹. For a system of higher dimension, the condition for the absence of interaction assumes the form grad τ(x, t) ≤ c⁻¹, where x = (x₁, x₂, x₃) are the coordinates of the point of the tissue. We shall see that, when this condition is violated, including the presence of discontinuities, after some time a phase distribution will be reached that satisfies this condition. The medium in question possesses reliability in a certain sense. In fact, if for some set of points of the medium x₁, ..., xₙ the periods of spontaneous activity increase randomly, then this is not going to result in any noticeable change in the phase distribution, since these points become excited under the influence of the neighboring points with an infinitesimal lag. On the other hand, a random single decrease ΔT in the period of spontaneous activity will result in a change of the phase distribution only in a region of radius c ΔT. In this respect, the active tissue discussed here differs from the discrete models of logical networks, in which a change of the properties of an individual element distorts the functioning of the entire network. We shall show now that such an active tissue may serve in the role of a memory. Consider, for the purpose of simplicity, the initial phase distribution to be τ(x, 0) = 0, and let the point x₀ become excited at a time t = R + ε. The phase of the point x₀ at time t = R + ε will, as agreed before, be represented as equal to T (curve 1, Fig. 21). The same figure shows the changes
Figure 21

Figure 22
in the phase distribution at various moments of time (curves 2-4). It is not hard to see that, beginning with the time t = T, the phase distribution curve ceases to change. Thus, the excitation of a point x₀ was followed by a change in the phase distribution within a characteristic cone of base radius c(T − R − ε), the cone being preserved so that both the point of excitation and its phase are "remembered." Figure 22 shows how the phase distribution τ(x, 0) = 0 is established when the point x₀ is excited consecutively at times t = R + ε and t = 2R + 2ε. The base radius of the characteristic cone is here equal to 2c(T − R − ε). This example shows that each point of the medium may serve as a counter of the number of elementary excitations. We shall note in addition that in this case, in a steady state, at any point in time there are excited points, but interaction is absent.⁶

3. Consider now a thin slice of tissue of length l, devoid of any spontaneous activity, in which the propagation velocity c(τ(x, t)) of an excitation is given for the values τ ≥ R and increases monotonically with increasing τ. Let us assume now that the point x = 0 is periodically excited with a period T, and consider the process of propagating excitation impulses. Let yₖ(x), k = 1, 2, ..., denote the time that it takes for the kth impulse to travel from point 0 to point x. Then the passage of impulses will be described by a sequence of differential equations

y′ₖ(x) = c⁻¹(T + yₖ − yₖ₋₁)

with boundary conditions yₖ(0) = 0 and initial condition y₀(x) = φ(x). It can be shown that for an arbitrary function φ(x) we have the formula limₖ→∞ yₖ(x) = x c⁻¹(T), i.e., that the propagation of an excitation in the presence of a periodic excitation occurs in the limit with a constant velocity.
This discussion also indicates a method of experimentally determining the function c(τ), which thus reduces to measuring the established time interval between the excitation of the beginning and the end of the thread.
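The limit limₖ→∞ yₖ(x) = x c⁻¹(T) can be checked numerically by integrating the recursion y′ₖ(x) = c⁻¹(T + yₖ − yₖ₋₁) with Euler's method. The velocity law c(τ), the thread length, the drive period, and the initial profile below are all assumptions made for the illustration.

```python
def c(tau):
    # assumed monotone velocity law, clamped outside [0, 20] for safety
    return 0.5 + 0.05 * min(max(tau, 0.0), 20.0)

T, L, h = 5.0, 10.0, 0.01            # drive period, thread length, Euler step
n = int(L / h) + 1
y_prev = [i * h for i in range(n)]   # arbitrary initial profile y_0(x) = x

for k in range(50):                  # successive impulses k = 1, 2, ...
    y = [0.0]                        # boundary condition y_k(0) = 0
    for i in range(1, n):
        # Euler step of y'_k(x) = 1 / c(T + y_k(x) - y_{k-1}(x))
        y.append(y[i - 1] + h / c(T + y[i - 1] - y_prev[i - 1]))
    y_prev = y

# the travel times settle to a constant velocity: y(x) -> x / c(T)
print(abs(y_prev[-1] - L / c(T)))
```

After a few dozen impulses the profile is indistinguishable from the straight line x/c(T), whatever the initial profile, in agreement with the limit stated above.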
⁶ One could introduce a measure of interaction in the system in the following way. For a point x becoming excited at a time lying between t and t + Δt, such a measure could be provided by the quantity T − τ(x, t). A measure of interaction for the entire system at a time t could be provided by the expression

lim_{Δt→0} (1/Δt) ∫_E (T − τ(x, t)) dσ,

where E is the set of points becoming excited during the time from t to t + Δt. In Example 2 the interaction tends to 0 with increasing t. Conversely, one could construct examples (see, e.g., Wiener and Rosenblueth [197]) where the interaction tends to a constant value as t → ∞. The states in which the interaction tends to zero will be called normal, and the remaining ones will be called particular. (Particular states arising in the heart could naturally be called fibrillational.)
Similarly, one can also investigate the propagation of an excitation in a ring. Let τ(x, 0) = φ(x), and suppose that from x = 0 an impulse started propagating in a certain direction. Furthermore, let zₖ(x) be the time that it takes for the kth impulse to travel from 0 to x. Then the process will be described by a sequence of differential equations
Also, in this case one can show that for an arbitrary function φ(x), limₖ→∞ zₖ(x) = x t₀ l⁻¹, where t₀ can be found from the equation t₀ = l c⁻¹(t₀) [for l ≥ R c(R)]. Analogous results can also be obtained when studying the circulation of a group of impulses along a ring. Discrete electronic models of this type were investigated by Ivanov and Telesnin [91]. We express our gratitude to the participants of the seminar, the physiologists I. S. Balakhovskiy, V. S. Gurfinkel', V. B. Malkin, and M. L. Shik, for their valuable participation in the discussion of the problems presented here.
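Reading c⁻¹ as 1/c, the circulation period t₀ solves the fixed-point equation t₀ = l/c(t₀), which can be found by simple iteration. The velocity law and ring length below are assumptions for the example; the parameters are chosen so that the answer is the round number t₀ = 10.

```python
def c(tau):
    # assumed monotone velocity law, as in the previous sketch
    return 0.5 + 0.05 * min(max(tau, 0.0), 20.0)

l, R = 10.0, 1.0                 # ring length and refractory time

t0 = 1.0                         # fixed-point iteration t <- l / c(t)
for _ in range(100):
    t0 = l / c(t0)

print(round(t0, 6))              # t0 = 10, since c(10) = 1.0
```

The iteration converges because, near the solution, the map t → l/c(t) is a contraction for this velocity law; the condition l ≥ R c(R) (here 10 ≥ 0.55) guarantees that a circulating pulse of this period fits on the ring.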
Certain Problems in the Investigation of Movements¹
This chapter presents certain general concepts that seemed useful to us in the study of the physiology of motor activity. Many of these views are closely connected with the theory of the construction of movements developed in the outstanding papers by Bernshteyn [17-22]. Motor control is one of the most important functions of the nervous system. The structure and function of the nervous system are undoubtedly to a large extent determined by this task. The physiology of movements is basically a study of a goal-oriented activity of the nervous system as a whole. Therefore, motor control seems to us to be one of the most natural objects for studying those integral functions of the nervous system which are related, so to speak, to "operative control." The ultimate result of the functioning of the nervous system, so far as its motor control is concerned, is sending impulses to the muscles, and the basic question of the physiology of movement is a study of the mechanism governing the development of expedient combinations and sequences of such signals. The simplest point of view here is the notion that there exists some higher nerve center (lying, for example, in the cortex), where generation of such commands takes place. The commands would completely determine the motion, so that the role of the remaining nerve mechanisms would consist only of a transmission of those commands. The lack of evidence for this point of view is obvious, and papers on the physiology of movement (beginning with the discovery by Fritsch and Hitzig, and then the classic works of Sherrington, Magnus, Pavlov, Ukhtomskiy, and the publications of contemporary researchers) all deal with the central question of the interaction among various nerve mechanisms in the process of performing a movement (see the literature [17-22]). In natural movement dozens of various muscles,

¹ Jointly with Gel'fand et al. [62a] (Editor's note).
working in coordination, participate. A system of commands necessary for performing motion cannot help but be very complex. In developing this system, it is necessary to take into consideration a rich and diverse afferentation, including that arising in the course of the movement itself. The specific performance of a motion to a large extent depends on the initial conditions: the initial position, and so forth. It is beyond any doubt that in the development of commands for the muscles, and in the processing of afferentation for that purpose, a great number of nerve centers is needed. A study of their interaction brought us to an attempt to describe the peculiar features of complex control systems from a single point of view that will be described below. However, numerous and sometimes essential concepts of the theory proposed in this chapter are still in need of experimental verification, and are considered by us as working hypotheses to help in further search. This point of view (we call it the principle of least interaction; see p. 149 and Gel'fand et al. [56]) is that a complex multilevel control system is considered to be a set of subsystems possessing relative autonomy. Each of these subsystems has its own "personal" task consisting of lowering the interaction with the "external medium": the latter for a given subsystem consists of the medium, external with respect to the entire system, and of the remaining subsystems. Complex control systems may consist of several levels, each of which includes a number of such subsystems. For subsystems of a certain level, the actions of the medium (which is external with respect to that level) include the afferentation coming from below, and the organization of their interaction is determined by the interaction of the higher levels. For the lowest level, the afferentation is exclusively receptive in character; the subsystems of this level have outputs into the effectors. Section 2 will dwell on these concepts in more detail.
In the organization of motor control, an important role is played by utilization of those features of the motor task which can simplify control: lowering the number of independently controlled effector parameters and simplifying the processing of the incoming afferentation. The problems that possess those features are, so to speak, organized. In motor control, organization is revealed primarily in the fact that for each motor act one can separate out a relatively small number of leading effector parameters and determine the basic afferentation necessary for performing a given movement. The striving toward such a simplification is also a case of the principle of least interaction. The lowering of the number of controlled parameters lowers the general level of the impulse traffic needed for control.
This chapter consists of three sections. The first one is devoted to synergies and certain other mechanisms simplifying motor control. In the second section we shall present our general concepts relating to the mechanism of movement generation at the spinal level. In the last section we describe modeling considerations connected with the functioning of a group of motoneurons.

1 Synergies and Other Mechanisms Simplifying Motor Control
In order for the higher levels of the central nervous system to effectively solve the tasks of organizing motor acts within a required time, it is necessary that the number of controlled parameters be not too large, and the afferentation requiring analysis not too complex. An important role in creating such working conditions is played by the so-called synergies. Synergies are those classes of motions which have similar kinematic characteristics, coinciding active muscle groups, and leading types of afferentation. Each synergy is associated with certain peculiar connections imposed on some muscle groups, a subdivision of all the participant muscles into a small number of related groups. Due to this fact, to perform a motion it is sufficient to control a small number of independent parameters, even though the number of muscles participating in the motion may be large. Although synergies are few in number, they make it possible to encompass almost all the diversity of arbitrary motions. One can separate relatively simple synergies of pose control (synergy of stabilization), cyclic locomotor synergies (walking, running, swimming, etc.), synergies of throwing, striking, jumping, and a certain (small) number of others. The synergies enumerated here are for a fully developed adult; the biomechanical side of the majority of them has been completely investigated. A more detailed study of one special synergy, the breathing synergy of standing, can be found in the paper by Gurfinkel' et al. [79]. Here we shall describe it very briefly. In connection with breathing, noticeable displacements occur in various parts of the body. However, these displacements have very little effect on the location of the general center of gravity. The reason, as was explained, is that synchronously with the backward bending of the trunk (during inhalation), the pelvis is bent forward; during exhalation the coupled displacements occur in the opposite directions.
This synergy is specific, and does not occur, for example, during an external disturbance (a light pat on the spine). The synergy breaks down in some neurologic ailments, and then the center of gravity of the body moves in accordance with the phase of breathing. One might think that the above breathing synergy
of the vertical pose is not an exception, but an example of a typical mechanism occurring in the most diverse natural motions. It is natural to assume that the learning of a motion consists of acquiring the corresponding synergy and lowering the number of the parameters requiring independent control. Such a new synergy is produced each time not in a vacuum, of course, but on the foundation of a small number of basic synergies and the inborn neurophysiological mechanisms lowering the number of the independent parameters of the control system. Some of those mechanisms, even though they were not studied by researchers from this point of view, have been known for a long time. We can mention here such well-researched examples of functional organization as the system of interaction of the motoneurons of antagonist muscles acting on the same joint, a system of postural reactions using a fixed system of the interaction of receptors of various types (labyrinthine, otolithic, and proprioceptive receptors in the neck and the limbs), as well as an important mechanism of the development of time relations. The basic synergies and the simplest neurophysiological mechanisms enumerated here form, one might say, "the vocabulary of motions." Developing this analogy, one can say that the letters of the language of motions are the stresses of the individual muscles, and the synergies combine these letters into words whose number is much smaller than simply the number of letter combinations. In this case the richness of the vocabulary makes for the diversity of the allowable motions. A majority of the motor tasks faced by an organism lies within the limits of this vocabulary, and it is only in exceptional cases that there arises a need to enrich it. Thus far, we have been talking primarily about the effector side of synergies. It is clear, however, that each synergy involves afferent streams which contain the leading signals and the addresses typical for them.
The language of synergies is in this sense not only the external language (of motions), but also the internal language of the nervous system used in motor control. Synergies make it possible to simplify the development of afferentation, organizing it in accordance with the motor task. From this point of view, in a problem such as pattern recognition one must first of all decide what synergy the problem involves, which in turn determines the subsequent course of recognition. We can consider that learning a new motion consists of developing a simple method of motor control, and reduces to a search for and correction of a convenient synergy or group of synergies, including the identification of the leading afferentation. Krinskiy and Shik [110] have studied the role played by this last factor under conditions in which the motor task should
Biological Systems and Mathematical Models in Biology
have been performed under the control of a purposely faulty visual afferentation. It turned out that, under certain transformations of the visual field, the system making use of visual afferentation in the problem of pose maintenance is stable. We can consider that such imperfections of the visual field still permit the use of the available synergy. More significant impairments of the visual field made the completion of the task impossible, and a certain training time was required before the task could be performed. It is interesting that under certain transformations of the visual signal involving deviations from a given pose, the relearning, even though it made the completion of the task possible, did not restore the accuracy of solution achieved by a subject using unimpaired visual afferentation.

The use of the synergy mechanism is, of course, only one of the methods simplifying the problem of motor control. Another possible approach is connected with the similarity between the problem of motor control and the mathematical problem of searching for the minimum of a function of many variables. The language natural for this mathematical problem [62] is convenient for describing the construction of motion. The combination of local improvements with extrapolation, which is characteristic of nonlocal methods of extremum search, is apparently also typical of the process of decision-making in motor problems, and certain features of that search can be revealed experimentally. Such an approach, as applied to the problem of maintaining an orthograde posture, was used in a paper by Gel’fand et al. [55] (see also the literature [56, 62, 74-77]) and stimulated further research devoted to the problem of tremor. Krinskiy and Shik [111] made an attempt to study a method of solving a simple motor problem which imposes, however, high requirements on the accuracy with which the task is performed.
The person tested was asked to find a position of two joints of the upper limb such that the galvanometer needle, whose position was a certain function of the angles in the joints, would reach the zero position. The experimenter could simultaneously register the trajectory of the ray on the oscilloscope screen in the coordinate system of the joint angles. A change in one of the joint angles caused the ray to move along the horizontal, and a change in the other caused it to move along the vertical (the person tested did not see the screen). The experimenter could also change the coefficients, so that in successive test runs (each took 10-60 sec) the desired position of the joints would be different. The function used had the form of a “boat” with one lowest point.
Initially, all the subjects used only one method: they made successive changes in the angles of the joints, moving the location of the ray step by step along the horizontal and the vertical, closer to the given point. (A change in the position of a joint was made when the needle, having passed through a minimum, again indicated that the required pose was not being maintained.) Later, however, along with this method of solution, another method came into use that exploited the organization of the function, even though it remained unknown to the subject (independently of his level of theoretical preparation). One could see on the screen how the subject, coming to the “bottom,” would follow it toward the lowest point without climbing the edges (this is possible only with a coordinated change in the angles at both joints). Only in the immediate neighborhood of the deepest point would he make the required pose more accurate by consecutively changing the angles at the joints. The tactics involved in the behavior of the subject were in this case close to the so-called ravine tactics [62].

If one agrees with the concepts presented here (namely, that the higher levels of the nervous system, developing the leading afferentation, control the operation of only a small number of muscle groups determining a given motion or synergy), and if, in addition, one considers that this control is effected not directly but by retuning the interactions at lower levels, then it becomes clear that the basic “chores” involved in performing a motion are done by these lower levels. The higher levels only form the functional synergies and retune the system of interactions among the elements of the lower levels.

Numerous papers on the simulation of pattern recognition state that the purpose of a model is to make possible the formation of a general picture which reflects only the most essential features of a real object.
In motor control we solve, one might say, the opposite problem: from an abstract representation we must construct a real motion with all its necessary details. The realization of a pattern in the form of a real motion requires its translation from a language using spatial and kinematic notions into the language of muscle dynamics: the motor composition, the number of motor units, the order of their spatial and temporal recruitment, and so on.
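The two tactics the subjects exhibited, first coordinate-wise steps and then motion along the ravine bottom, can be sketched numerically. The "boat" function, the step size, and the starting point below are illustrative assumptions only; the actual function used in the experiment is not given in the text.

```python
# Hypothetical "boat" function of two joint angles u, v: steep walls across
# the valley and a gentle slope along its bottom toward one lowest point
# (at u = v = 1 for this choice of constants).
def boat(u, v):
    across = 10.0 * (u - v) ** 2          # steep transverse direction
    along = 0.1 * (u + v - 2.0) ** 2      # shallow slope along the bottom
    return across + along

def coordinate_steps(u, v, step, n):
    """Tactic 1: change one joint angle at a time, keeping improving steps."""
    for _ in range(n):
        for axis in (0, 1):
            for delta in (step, -step):
                cu, cv = (u + delta, v) if axis == 0 else (u, v + delta)
                if boat(cu, cv) < boat(u, v):
                    u, v = cu, cv
    return u, v

def ravine_steps(u, v, step, n):
    """Tactic 2: a crude "ravine" move, changing both angles together so as
    to slide along the valley floor without climbing its walls."""
    for _ in range(n):
        for du, dv in ((step, step), (-step, -step)):
            if boat(u + du, v + dv) < boat(u, v):
                u, v = u + du, v + dv
    return u, v

u, v = 3.0, 1.0
u, v = coordinate_steps(u, v, 0.05, 50)  # drops to the valley floor, then stalls
u, v = ravine_steps(u, v, 0.05, 200)     # slides along the bottom to the minimum
```

With these constants the coordinate-wise stage reaches the valley floor near u = v = 2 and then stalls, much as the subjects' first tactic did; only the coordinated joint moves carry the point along the bottom to the minimum.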
2 Functioning of Motor Control on the Spinal Level

For further discussion, it will be convenient to distinguish the intermediate nerve structures, so that we can dwell in more detail on the functioning of the last effector link: the group of motoneurons and the interneuron structures related to it.
The principles underlying the operation of the intermediate nerve structures will be analyzed using the example of the spinal level of motor organization. The spinal level is characterized by an extremely large volume of developed afferentation and a large number of efferent outputs. The afferentation is received by the spinal level “first hand,” and the efferent commands are rapidly carried out. All the activity of the spinal level is influenced by the supraspinal actions organizing its operation with the objective of performing a given motion. An essential feature of the structure of this level is the presence of relatively autonomous subsystems which are also spatially separated. The presence of a number of inborn (or built-in) interactions is typical for these subsystems. The autonomy (even if relative) of the individual subsystems of the spinal level permits effective control within short time intervals: a decision does not have to pass through a complex and lengthy process of reaching agreement. Obviously, the organization of even a slightly more complex motion by no means reduces to just one such subsystem or several “linked” subsystems. For this reason, in our discussion of the organization of motion on the spinal level, the central role is played by the interaction among the individual subsystems.

When we speak of the possibility of the autonomous operation of individual subsystems, we come across an inescapable question: What particular “personal” goal is realized by any given subsystem in a specific motor problem? It seems probable to us that the goal, to a first approximation, is to lower the total impulsation received by the subsystem both from the periphery and from other subsystems and higher levels of the nervous system (“the principle of least interaction”).
The total impulsation which serves as the estimate of the work done by a given subsystem is apparently developed by special mechanisms whose functioning depends on the given system of interaction. It is natural to assume that the “habitual” or “expected” afferentation makes a relatively small contribution to the total afferentation compared with the “nonhabitual” or “unexpected” afferentation. Perhaps a special role is also played by the “unbalanced” afferentation, which contributes a great deal to the total afferentation. In other words, one could assume that the goal of a subsystem is to minimize the external stimuli which tend to disturb its present state and bring it into a new one. The tendency toward the minimization of interaction leads to a coordinated functioning of the individual subsystems, subjecting the autonomous activity of each of them to the interests of solving the total problem defined by the supraspinal afferentation. In this case the essential role played by the supraspinal influences consists
of a suitable reorganization of the interaction of the individual subsystems of the spinal level [129].* Such a reorganization may be expressed in a motor effect. However, this is not obligatory. The reorganization may produce “readiness for movement,” while an additional stimulus is needed to perform the actual movement. It is possible that certain supraspinal influences (for example, those related to the operation of the labyrinth system) usually use only this method of affecting spinal activity. The activity of each relatively autonomous subsystem is, as we assumed, directed toward minimization of the total afferentation, which consists of the proprioceptive afferentation, the afferentation external relative to a given subsystem (from the neighboring subsystems of the same level), and the supraspinal afferentation. A change in the functioning of a subsystem results first of all in a change of its afferentation and is directed toward minimization of the total afferentation. If the contribution of the subsystem’s own afferentation to the total afferentation is large, then the role played by the remaining stimuli, including the supraspinal stimuli, is decreased. This can be seen under the conditions created in special experiments [47, 133, 175, 176], or in some situations arising during sports activities. Probably the phenomenon of the dominant element, discovered by Ukhtomskiy, can also be understood from this point of view. The transfer of control over habitual movements to the lower levels (the automatization of motor habits, according to N. A. Bernshteyn) is from this point of view a result of the attempt by the higher levels to minimize their interaction with the lower levels. The individual details of a motion are created and developed in the interaction among the subsystems forming the spinal level of motor control.
If we simplify Wells’ classification a little [195], then in typical movements of man and higher animals one can distinguish muscles performing the basic active part of the motor task (they are relatively few in number), and muscles stabilizing the location of the preponderant mass of the kinetic links of the body (they are in the majority). One and the same muscle may in one motion be included in one group, and in another motion in another group.
* A natural mathematical model for research into methods of control by specifying interactions is provided by games played by automata [158]. A computer simulation was made [33] of the behavior of a system consisting of a large number of automata, in which the interactions between them were adjusted by an upper level. The problem of the upper level was to develop an interaction that would result in a given behavior of the automata. The computer simulation showed that control of the interaction can be achieved without obtaining complete correspondence between the behavior of the automata group and the interaction specified beforehand.
The functioning of the corresponding control nerve mechanisms is essentially autonomous, and the leading afferentation is peripheral; the interaction of these mechanisms with the others is relatively small. This implies that such “stabilizing” modes of operation are typical in motion. The relatively minor role played by interaction under those conditions results in the “freedom” of motions, making it possible for the higher levels of the central nervous system not to worry about control over the corresponding spinal subsystems. Apparently, in certain of the simplest cases this type of stabilization is already realized on the segmentary level [80]. If, however, the afferentation is not minimized at this lowest level, the higher levels come into play. The functioning of a large number of stabilizing mechanisms assures the smoothness of motion, without which the latter would become atactic. Of course, the regulation of active motion by no means reduces to the operation of the stabilizing mechanisms. The interaction system determined by the higher levels should provide for the unbalanced functioning of the muscles responsible for the active part of the motion and, almost always, a suitable variation of the angles at the joints.

Although voluntary motions have been little investigated, one can nevertheless state certain views which follow from the above concepts. In particular, if the role of the higher levels of the nervous system is not to send direct commands, but rather to reorganize (retune) the interaction system that governs the interaction among the nerve mechanisms of the spinal level, then such a reorganization should naturally take more time than is needed for the usual transfer of commands. It is well known that the latent time of a simple motor reaction is practically constant, and even systematic training does not lower it by more than 10-15 msec. Here, out of the 120-180 msec of the latent time, the communication time constitutes no more than 50-80 msec.
Since voluntary active motions imply, for the majority of the nerve structures at the spinal level, that the stabilization mechanisms required by a given task must be engaged, such a reorganization should be roughly the same for a majority of the spinal mechanisms. Of course, a reorganization may change, and even substantially, during the course of the motion. However, a study of the retuning is simplest when the motion has not yet started. The phenomena related to the preparation of the spinal nerve mechanisms for the motion will be called the pretuning, and we shall dwell upon them in more detail. One of the first observations which, as is now clear, indicates the presence of the pretuning was made by Hufschmidt [181]. He discovered that even up to 60 msec before the beginning of a voluntary contraction of a muscle, a braking of the activity of the antagonist muscle takes place.
Our experiment went as follows. The subject was asked to perform a certain motion on a signal, e.g., to bend his foot. For some time after the signal (but before the motion) a sinew reflex was produced and the amplitude of the electromyographic response was measured. It was found that the magnitude of the reflex depended substantially on how much time remained before the start of the motion. This dependence bears the same character for sinew reflexes and for the monosynaptic H-reflex [180] on the corresponding muscle. It is interesting to note that for relatively large intervals before the onset of motion (170-50 msec), the variations in the amplitude of the spinal reflexes are approximately the same both for those muscles which must participate in the motion and for those which remain inactive. As the beginning of motion is approached, these variations are expressed much more sharply in exactly those muscles that will participate in the motion. Thus, apparently, the retuning of the spinal mechanisms has initially a diffuse character, and as the start of motion is approached, the restructuring of the interaction system on the spinal level becomes localized. The possibility is not excluded, of course, that the diffuse changes in the state of the spinal control mechanisms are related to an orienting reaction occurring in the experiments in question. The experiments are described in more detail later (see p. 189).

3 Functioning of a Motoneuron Group and Motor Units
In this section we would like to discuss certain concepts related to the operation of the last effector link of the motor control system: the motoneuron group. Here we shall be interested in the mode of operation of the individual motor units, so that attention will be focused on those functioning characteristics of the group that are related to its atomicity, the latter being a general property of all nervous structures. The nerve cells themselves do not possess and cannot possess any kind of complex behavior: the information received by individual neurons is immeasurably smaller than that obtained by the entire organism, and their reactions are very stereotyped. Therefore, the central problem of neurophysiology is to study how the desirable behavior of the organism, interacting with a changing external medium, results from the interaction of various nerve structures, and ultimately from the behavior of individual neurons. In this connection, it is of great importance to search for those principles of neuron interaction which would assure that an entire physiological act is completed. The striving to minimize interaction, which we already mentioned above, creates the possibility of a nonindividualized control over the neurons in
a given nerve structure by affecting their system of interaction. Of course, the influences on the system of neuron interaction may also include direct influences on the neurons. A nonindividualized influence on the neurons of a given center makes it possible to describe their functioning simply. The determining role of autonomous collective operation means, in particular, that an important place in the solution of any physiological problem is occupied by the so-called “horizontal” interaction of neurons, which up to the present has been studied experimentally only in the simplest examples. These concepts are probably applicable to the organization of the functioning of both the motor and the sensory systems.

Suppose that there is a uniform medium made up of neurons connected to one another in such a way that each neuron, after excitation, has a relaxing influence on its neighbors. Then the system will be stable when all working neurons function synchronously. Their interaction will then be at a minimum. The applicability of this assertion to real nerve structures is difficult to verify directly, first, because of methodological difficulties, and second, because there is no basis for thinking that the interaction of neurons will be only positive and symmetric.* Among the neurons of a homogeneous system, such as the motoneurons of a single muscle, there must be some interaction simply because of the proximity of their electric fields, the overlapping of the regions of dendrite branching, and also in connection with the fact that each muscular receptor is projected not onto one motoneuron but onto several, and each motoneuron receives impulses from several muscle receptors. The various receptors exert an influence on the motoneurons either directly or through the interneurons. The impulsation of an individual muscle receptor depends on the activity of many motor units of the muscle.
But once a mutual interaction of motoneurons exists, then if it is relaxing, the motoneurons should become excited synchronously. If, however, they function independently, this means that there is a special mechanism which prevents synchronization. An experimental investigation of the activity of individual motor units in a human muscle developing a moderate stress in the pose mode showed that they function practically independently [54, 78]. This is all the more surprising because the average frequency of the impulsation of all active motor units is approximately the same, about 7-11/sec (operation at another frequency is obviously unstable), and because the impulsation of an individual motor unit over several dozen cycles is very stable. (The ratio of the standard deviation to the average duration of the interval between the impulses is about 0.2-0.3.) The independent impulsation of the various motor units of one muscle creates the possibility, in principle, of individual control over the units. Under special conditions of artificial visual and auditory monitoring of the activity of motor units, a person can in fact “turn on” or “turn off” at will a given motor unit (strictly speaking, a group of motor units which contains the given unit) without changing the impulsation of another, randomly chosen motor unit of the same muscle [54, 165]. The independent activity of motor units makes it possible to understand the genesis of the physiological tremor, and to predict the relationship between its amplitude and the stress produced, as well as its spectral content [53, 147]. The actual mechanism that produces the asynchronous activity and equalizes the frequencies of the active motor units is unknown. However, there is information available about the properties of certain elements of the segmentary apparatus of the spinal cord and about the character of the connections between them [73, 90, 99, 160, 162, 179, 182, 184, 186, 189, 193, 196]. This information makes it possible to propose a model of this mechanism. The principal functioning of the model is discussed by Gel’fand et al. [54]; a more detailed exposition is given by Kotov and Tsetlin [100] and on p. 172 of this book.

* A mathematical model of such a structure was discussed on p. 155. A physiological model may be provided by the sinus portion of the heart, all of whose elements function synchronously. Here, the leading element is the one whose impulse transmission frequency is highest [57].
Here we shall only note that the principal role in creating normal operating conditions in the muscular motor units is played, according to these model representations, by reverse braking, in particular, the dependence of the pulsation of the Renshaw cell on the frequency of its activation, and the hypothesis about the coordinated activation of the corresponding alpha- and gamma-motoneurons, all these in addition to the known properties of the motoneurons themselves. To explain certain pathological modes of operation of muscles (e.g., Parkinsonism and postpoliomyelitic paresis) in which the activity of motor units significantly deviates from the normal, it is sufficient within the framework of this model to assume certain quite specific and minor variations in the properties of the elements of the model. The investigation of the behavior of the model made it possible to distinguish certain essential parameters of the architecture of the segmentary apparatus and the properties of its elements, whose variation has a pronounced effect on the functioning of the system, and other parameters whose variation has little effect on the system. Such conclusions would probably be much harder to obtain directly from experiments.
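The pose-mode firing statistics quoted above (a mean interval on the order of 100 msec and a ratio of standard deviation to mean interval of about 0.2-0.3) can be mimicked by a simple random model. The use of a gamma distribution for the inter-impulse intervals is an illustrative assumption of ours, not the book's model.

```python
import random
import statistics

random.seed(1)

def spike_intervals(n, mean_isi=100.0, cv=0.25):
    """n inter-impulse intervals (msec) with the given mean and coefficient
    of variation; for a gamma distribution, CV = 1/sqrt(shape)."""
    shape = 1.0 / cv ** 2
    scale = mean_isi / shape
    return [random.gammavariate(shape, scale) for _ in range(n)]

isis = spike_intervals(2000)
mean_isi = statistics.mean(isis)
cv = statistics.pstdev(isis) / mean_isi  # comes out near the target 0.25
```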
Computer Simulation of the Functioning of a Motoneuron Group*
The paper by Gel’fand et al. [54] discusses the characteristic features of the active functioning of a muscle when maintaining a pose (the pose-oriented operational mode of the motor units, with respect to both its norm and its pathology) and formulates a number of hypotheses about the nerve mechanisms providing the asynchronous impulsation of the motor units at a relatively constant frequency. The data on the functioning of individual elements of a motoneuron group and of muscles, which are available in the physiological literature [169, 170, 178, 179, 182, 187, 188, 190], allow us to formulate a number of hypotheses about the mutual relations and connections among these elements [54, 80]. One of the natural methods of verifying the completeness and consistency of the accepted system of hypotheses is computer simulation of the group activity. In this chapter we present the results of such a simulation. We also want to discuss the role of the individual parameters in explaining the features of the normal and pathological operational modes of the motor units, and to estimate how critical certain parameters of the muscle and the motoneuron group are, the parameters being taken from among those that are either not measured in experiments or are known only roughly.

* The article was written jointly with Yu. B. Kotov. Its final version was finished [100] after M. L. Tsetlin’s death (Editor’s note).

1 Pose-Regime of Motor Unit Operation

Let us recall the most essential information concerning the properties of a muscle and the motoneuron group that controls it. We know that a
muscle consists of motor units, which are groups of muscle fibers, each innervated by a single motoneuron. Motoneurons are relatively large nerve cells located in the anterior horns of the spinal cord. The impulse produced when a motoneuron is excited stimulates not only the motor unit of the corresponding muscle, but also specific interneurons, the Renshaw cells. When excited, a Renshaw cell generates a series of impulses with gradually increasing intervals between the impulses (from 0.7 msec at the beginning to 30-50 msec at the end of the series). The duration of the series depends on the rest time of the Renshaw cell after the preceding excitation. The impulses from the Renshaw cell have a braking effect on motoneurons. The probability of motoneuron excitation depends on the stimuli received by it from a number of brain sections and from the muscle receptors (spindles and sinews). The spindles (length receptors) have a monosynaptic relaxing effect on a motoneuron. Sinew Golgi² receptors (receptors of muscle tension) disynaptically depress a motoneuron (lower the probability of excitation). In the pose-maintenance mode, each motor unit of a healthy muscle (and, consequently, the motoneuron innervating it) is capable of pulsating for a long time with intervals between the impulses of 70-140 msec, where the possible range of the intervals is 20-200 msec. Any two motor units transmit impulses practically independently of one another. For a moderate stress, only part of the motor units transmit impulses. An increase in the stress is achieved by recruiting new motor units and by a slight increase in the frequency of the individual motor units. As we know [54, 99], in the tremor form of Parkinson’s disease the motor units transmit impulses in groups (2-4 impulses to a group) with the frequency of the groups amounting to 3-5/sec.
The groups of the various motor units are synchronized with one another, and are usually associated with a definite phase of the oscillations of the joint angle. Instead of the irregular tremor, invisible to the eye, which is characteristic of a healthy person, one observes oscillations of the limbs of great amplitude, close to sinusoidal. When a person who has had poliomyelitis tries to contract his muscles, the first motor units are rapidly followed by the remaining ones. Almost all motor units transmit impulses synchronously at an increased frequency (up to 40/sec). Any attempt at longer maintenance of the stress in the damaged muscle fails.
² C. Golgi, Italian histologist (Translator’s note).
2 Description of the Model
We shall describe a formalization of the properties of a muscle and a ganglion that we have chosen for the model.

1. We assume that a muscle possesses elasticity k and viscosity μ, and sets in motion a mass m (the reduced mass of the movable link of the joint). Therefore, the dynamics of a loaded muscle are described by the differential equation

mẍ + μẋ + kx = F(t) - mg.  (1)
Here x(t) is the length of the muscle (minus the minimum length), F(t) is the stress of the entire muscle, the coefficients m, μ, k are chosen in accordance with physiological measurements of the elasticity and viscous properties of muscles, and mg is the load on the muscle. The force F(t) produced by the entire muscle is the sum of the forces of the individual motor units.

2. The behavior of a motor unit is completely described by a single function of time, namely, the force f(t) produced in response to the arrival of an impulse from a motoneuron. Suppose that at time t* a motor unit receives a stimulating impulse from a motoneuron. Then the stress f(t) for t > t* is given by a positive function (Fig. 23a, curve 2) having one maximum and approaching zero as t → ∞. Figure 23a (curve 1) also shows a curve obtained in a physiological experiment [101]. The function f(t) was selected in the following way:
f(t) = f(t*) + a(f₀ - f(t*))(t - t*) for t* ≤ t ≤ t* + t₁,
f(t) = f₁ exp[-(t - t* - t₁)/τ] for t > t* + t₁.  (2)

In this formula f(t*) is the stress of a motor unit at the time of impulse arrival, and t₁ is a constant determining the position of the stress maximum. We also assume that the stress produced by a motor unit is bounded from above by a constant f₀. Therefore, the coefficient a should satisfy the additional condition at₁ < 1. The form of the curve f(t) was studied in a number of physiological experiments [101], and the values of the constants f₀, a, t₁, and the time constant τ in the model were chosen to fit the curve closely (f₀ = 16 gm, a = 1/64, t₁ = 16 msec, τ = 25 msec). Thus, the stress of a motor unit increases linearly during (t*, t* + t₁) and reaches a maximum f₁ = f(t*) + at₁(f₀ - f(t*)).
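A minimal numerical sketch of equation (1) driven by the motor-unit stress of item 2. The twitch shape uses the constants quoted in the text (f₀ = 16 gm, a = 1/64, t₁ = 16 msec, τ = 25 msec); the mechanical constants m, μ, k, the impulse arrival times, the neglect of the load (g = 0), and the simple summation of twitches are illustrative assumptions of ours.

```python
import math

F0, A, T1, TAU = 16.0, 1.0 / 64.0, 16.0, 25.0  # f0 (gm), a, t1 (msec), tau (msec)

def twitch(t_since, f_start=0.0):
    """Stress of one motor unit t_since msec after a motoneuron impulse:
    linear rise during t1, then exponential decay with time constant tau."""
    if t_since < 0.0:
        return 0.0
    if t_since <= T1:
        return f_start + A * (F0 - f_start) * t_since
    f_max = f_start + A * T1 * (F0 - f_start)
    return f_max * math.exp(-(t_since - T1) / TAU)

# Euler integration of m*x'' + mu*x' + k*x = F(t) - m*g.
m, mu, k, g = 1.0, 0.5, 0.2, 0.0    # illustrative mechanical constants
impulses = [0.0, 100.0, 200.0]      # assumed impulse arrival times, msec
x, v, dt = 0.0, 0.0, 0.5            # dt in msec
trace = []
for step in range(1000):            # 500 msec of simulated time
    t = step * dt
    F = sum(twitch(t - ti) for ti in impulses)  # summing twitches is a simplification
    acc = (F - m * g - mu * v - k * x) / m
    v += acc * dt
    x += v * dt
    trace.append(x)
```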
Figure 23 (a: the stress of a motor unit after an impulse; b: the distribution of spindle thresholds; c: the duration of the inhibitory series of a Renshaw cell as a function of its rest time)
3. In our model, a motoneuron is completely described by two functions of time: the threshold PM(t) and the state function φM(t).

External stimuli change the state function φM(t). An external stimulus acting on a motoneuron is called facilitating if it increases φM(t), and inhibitory if it lowers it. In the absence of external stimuli, φM(t) decreases exponentially in absolute magnitude with time constant τφM, i.e., it decreases as exp[-t/τφM]. We say that a motoneuron is excited at a time t₀ if φM(t₀) ≥ PM(t₀). At a time immediately following the time of excitation (it will be denoted by t₀ + ε, with the understanding that ε > 0 and is arbitrarily small), the threshold of the motoneuron receives a positive addition PM+, and the state function becomes zero. Later, the threshold decreases exponentially with time constant τPM, approaching a constant PM0, i.e., for t > t₀,

PM(t) = [PM(t₀) - PM0 + PM+] exp[-(t - t₀)/τPM] + PM0.  (3)
The difference φM(t) - PM(t) between the state function and the threshold of a motoneuron corresponds, within a constant, to the potential of the cell membrane. Thus, our description of the state of a motoneuron corresponds to that accepted by electrophysiologists.
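The bookkeeping described in item 3 can be sketched as follows, in the notation φM, PM as reconstructed above. The quoted values PM+ = 1.5 and PM0 = 2 are used; the assignment of the two time constants (35 and 4 msec) to τPM and τφM, and the constant drive standing in for the tonic and afferent input, are our assumptions.

```python
import math

TAU_P, TAU_PHI = 35.0, 4.0   # msec; which constant belongs to which is assumed
P_PLUS, P0 = 1.5, 2.0
DT = 2.0                     # the program's cycle length, msec

class Motoneuron:
    def __init__(self):
        self.phi = 0.0       # state function, phi_M(t)
        self.P = P0          # threshold, P_M(t)

    def step(self, stimulus):
        """One 2-msec cycle: decay phi toward zero and P toward P0, add the
        stimulus, and fire if phi >= P (then phi is zeroed and the
        threshold jumps by P_PLUS, as in formula (3))."""
        self.phi = self.phi * math.exp(-DT / TAU_PHI) + stimulus
        self.P = P0 + (self.P - P0) * math.exp(-DT / TAU_P)
        if self.phi >= self.P:
            self.phi = 0.0
            self.P += P_PLUS
            return True
        return False

mn = Motoneuron()
spikes = [t for t in range(500) if mn.step(1.0)]   # constant drive per cycle
intervals = [b - a for a, b in zip(spikes, spikes[1:])]
# The cell settles into regular firing with intervals of a few dozen msec.
```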
Upon excitation, the motoneuron generates an impulse which stimulates a motor unit with a lag tₗ. The values of τPM, τφM, and tₗ are taken from physiological experiments, and PM+ and PM0 are selected in such a way as to obtain the best fit to the experimental curves of threshold variation. In our experiment, τPM = 35 msec, τφM = 4 msec, PM+ = 1.5, PM0 = 2, and tₗ = 6 msec. The number of motoneurons was chosen as 60.

4. It is assumed that a constant η₀ is added to the state function of each motoneuron, the same for all motoneurons. In addition, the state functions φM(t) receive “white noise” ξ(t) with zero mean value and variance σξ². The constants η₀ and σξ were chosen in such a way that the probability of negative values of the sum η₀ + ξ(t) would be small (in a majority of the versions, η₀ = 0.5, σξ = 0.125). The values of ξ(t) are independent for the various motoneurons. In the computer program the time was considered discrete (1 cycle = 2 msec). Therefore, the “white noise” ξ(t) was simulated by adding to φM(t) in each cycle a random term with a normal distribution.

5. For each of the receptors of muscle length (spindles) a threshold is specified, i.e., the smallest length at which it begins to exert a facilitating influence on the motoneurons connected to it. With a further increase in the length, the positive addition to the state functions of the motoneurons increases proportionally to the difference between the length of the muscle and the given spindle threshold (the proportionality coefficient is 4 cm⁻¹). The spindle thresholds were chosen to be different. The threshold distribution determines the dependence of the active stress of a muscle on its extension. We have selected a distribution (Fig. 23b) that results in an increase of rigidity when the muscle is extended. In some experiments with the model, the spindle thresholds were given an identical additive term which varied with time (a model of the γ influence).
The model contains 20 spindles. Each spindle stimulates 15 motoneurons, i.e., each motoneuron receives stimulation from five spindles. This system of connections makes it possible to slightly smooth out the effect of the instantaneous differences in the spindle thresholds. A similar smoothing process apparently takes place in the spinal cord with the participation of the afferents of Group II from the spindles.

6. For the tension receptors in the muscle (sinew Golgi receptors) we specify a threshold F₀ > 0, which is the tension in the muscle at which the sinew receptors begin to exert an inhibitory effect on all motoneurons. The inhibitory effects are considered to be proportional to the difference between the tension F(t) and the threshold F₀. The proportionality coefficient a₁ and the threshold tension F₀ in a majority of experiments are equal to a₁ = 1/128 gm⁻¹ and F₀ = 16 gm, respectively.

7. In our model, the state of a Renshaw cell is completely described by three functions of time: the state function φ_R(t), the threshold P_R(t), and the time δ_R(t) that has passed since the last excitation. We say that a Renshaw cell becomes excited at a time t₀₀ if φ_R(t₀₀) ≥ P_R(t₀₀). Immediately after the time of excitation, the threshold increases its value by a positive number P_R+, and δ_R becomes zero. Then the threshold decreases exponentially with time constant τ_PR, approaching a positive constant P_R0; i.e., for t > t₀₀,

P_R(t) = [P_R(t₀₀) − P_R0 + P_R+] exp[−(t − t₀₀)/τ_PR] + P_R0.    (4)
A Renshaw cell receives impulses from 14 motoneurons. Each impulse from a motoneuron increases φ_R(t) by one. In the intervals between the impulses from the motoneurons, φ_R(t) decreases exponentially with time constant τ_φR, approaching zero. At an excitation time, a Renshaw cell begins to generate a series of inhibitory impulses. In the program, the series from a Renshaw cell is simulated by means of a negative (inhibitory) additive term to the state functions φ_M(t) of the corresponding motoneurons. The absolute value of this term at the time of the excitation of a Renshaw cell is equal to a constant g₀, and then decreases linearly to zero during the series. The duration of the series t_T is a function of the rest time δ_R of a Renshaw cell after the preceding excitation. The form of the function t_T(δ_R) is shown in Fig. 23c. On the axis of abscissas we plot the time δ_R that has passed from the time of the previous excitation, and on the ordinate axis, the duration t_T of the inhibitory series of a Renshaw cell. In the program, the function t_T(δ_R) for δ_R ≤ 500 msec was approximated by a cubic polynomial; for δ_R > 500 msec, t_T(δ_R) = t_T(500 msec). The curve (Fig. 23c) does not contradict the available physiological data [169, 170, 179]. In a majority of experiments with the model, τ_PR = 25 msec, τ_φR = 25 msec, P_R0 = 17/16, P_R+ = 3, and g₀ = . The number of Renshaw cells was chosen to be equal to 26. The connections of the Renshaw cells with the motoneurons were random, but such that each Renshaw cell, upon excitation, inhibits all the motoneurons that could excite it. In addition, the inhibitory series from a Renshaw cell is also received by 9 motoneurons which cannot excite it. With such a system of connections, each motoneuron is inhibited by the group of Renshaw cells that are excited by its impulses, and by some Renshaw cells which it cannot stimulate.
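The threshold dynamics of Eq. (4) can be sketched directly (the function name is ours; the constants are the values quoted for the majority of the experiments):

```python
import math

TAU_PR = 25.0     # msec, time constant of the Renshaw threshold
P_R0 = 17.0 / 16  # resting threshold the cell relaxes back to
P_RP = 3.0        # threshold jump P_R+ added at each excitation

def renshaw_threshold(t, t00, p_at_t00):
    """Threshold P_R(t) of a Renshaw cell for t > t00, per Eq. (4).

    At the excitation time t00 the threshold jumps by P_R+ above its
    pre-excitation value p_at_t00, then decays exponentially back
    toward P_R0 with time constant tau_PR.
    """
    return (p_at_t00 - P_R0 + P_RP) * math.exp(-(t - t00) / TAU_PR) + P_R0
```

At t = t₀₀ the expression reduces to p_at_t00 + P_R+, and for t ≫ τ_PR it tends to P_R0, as the text describes.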
3 Desynchronization of Motoneurons; the Influence of the Renshaw Cell System on Impulsion of Motoneurons
The synchronous impulsion of two motoneurons is defined as an impulsion mode in which the probability of excitation of the second motoneuron attains a maximum after some constant time from the excitation of the first. This time is called the time lag. A natural measure of the synchronization of two motoneurons (with indices i, j) would be the deviation of the distribution of the intervals δ_ij between the times of excitation of these motoneurons. Using this characteristic, one could apparently combine all the motoneurons in a ganglion into several independent synchronous groups. The number of such groups and the numbers of their elements are characteristics of the ganglion-muscle system analogous to the number of degrees of freedom in mechanics. However, the use of the deviation of the distribution of δ_ij as a measure of the synchronization requires a measurement of a considerable number of intervals, i.e., it requires very long computer experiments. The most interesting case for our model is that in which the time lags between the synchronized motoneurons are small compared with the intervals between the excitations of the same motoneuron, and in which all motoneurons in a ganglion form one more or less synchronous group. In our model, the synchronizing stimulus for a motoneuron is provided by a signal from a sinew receptor which is identical for all motoneurons and received by them simultaneously. The lag is absent, and all motoneurons enter one synchronous group; thus the impulses from various motoneurons should come in "bunches" with large intervals between them. The intervals between the successive times of excitation of various motoneurons can be divided into two categories: short intervals between the impulses from various motoneurons inside a bunch, and long intervals between the bunches. Therefore, the density of the interval distribution has a large value only in the regions of very short and very long intervals.
This type of distribution possesses [48] a significant excess

e = μ₄/μ₂² − 3,

which was also chosen as a measure of synchronization. In this formula μᵢ is the central moment of ith order of the interval distribution. In describing the results of the computer experiments, we shall also state in each of the examples the largest deviation a of the length of the muscle from its mean in a stationary mode. This characteristic also describes the degree of synchronization of the motoneurons in a ganglion; however, it is less sensitive to low synchronization. For a majority of the simulation experiments, values of e < 3 and 2a < 0.1 cm correspond to the normal asynchronous impulsion. The synchronizing stimuli on the motoneurons in the spinal cord are, apparently, very numerous. Therefore, to obtain normal asynchronous impulsion of motoneurons, a special desynchronizing mechanism is needed. This mechanism, according to our ideas, is provided by the system of Renshaw cells. Suppose that a ganglion contains a group of synchronized motoneurons. The white noise acting on φ_M (see point 4) causes a nonsimultaneous excitation of the motoneurons in this group, other conditions being equal. As we already mentioned, each Renshaw cell, upon excitation, sends an inhibitory series to approximately one third of all motoneurons in the ganglion. The excitation of any motoneuron may cause excitation of several Renshaw cells that are in states sufficiently close to threshold, and excitation of the majority of them immediately after a given motoneuron (and during the entire series from a Renshaw cell) is very likely. The inhibitory impulses from a Renshaw cell, excited by the first impulses from the motoneurons of a synchronous group, make the excitation of the remaining motoneurons more difficult (the remaining ones constitute the major part of a synchronous group). Thus, the number of motoneurons belonging to a synchronous group decreases, i.e., a desynchronization takes place. With the parameters used in our model, a breakdown of the group of synchronously impulsing motoneurons occurs during a time corresponding to 2-3 impulses (Fig. 24a). In this figure on the axis of abscissas we plot the time t, and on the axis of ordinates the number N(t) of the motoneurons excited during 20 msec beginning with the time t. The point R corresponds to the time when the Renshaw cells, which were turned off until that instant, begin to function.
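The excess of a sample of intervals can be computed directly; a minimal sketch using the standard definition e = μ₄/μ₂² − 3, where μᵢ is the ith central moment (the function name is ours):

```python
def excess(intervals):
    """Excess e = mu4/mu2**2 - 3 of a sample of inter-spike intervals.

    e is near 0 for a normal distribution; the bimodal mixture of very
    short (within-bunch) and very long (between-bunch) intervals that
    accompanies synchronized firing gives a large positive excess.
    """
    n = len(intervals)
    mean = sum(intervals) / n
    mu2 = sum((x - mean) ** 2 for x in intervals) / n  # 2nd central moment
    mu4 = sum((x - mean) ** 4 for x in intervals) / n  # 4th central moment
    return mu4 / (mu2 * mu2) - 3.0
```

A sample dominated by short intervals with a few long outliers, like the "bunched" impulsion described above, yields e far above the e < 3 range quoted for normal asynchronous impulsion.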
Figure 24b also illustrates the sharp damping of the oscillations in the length of the muscle x(t) when the Renshaw cells become activated. If, without changing the system of connections and the parameters of the Renshaw cells, one deactivates a certain portion of randomly chosen Renshaw cells, then certain motoneurons will receive fewer inhibitory impulses from Renshaw cells and may become capable of combining into a synchronous group. The number of motoneurons in this group and the degree of their synchronization are greater, the fewer Renshaw cells remain. The results of the corresponding computer experiment are given in Table XXIII. It can be seen that halving the number of Renshaw cells results in an increase of the amplitude a of the muscle length oscillations by approximately three times. The degree of synchronization e also increases.

Figure 24

The lowering of the number of motoneurons sending impulses to one Renshaw cell lowers the effectiveness of desynchronization, since this lowers the probability of excitation of even one Renshaw cell by an impulse from some motoneuron. The reduction of the number of motoneurons inhibited by one Renshaw cell also leads to an impairment of desynchronization, except that the impairment is less pronounced, owing to a significant overlapping of the outputs from the Renshaw cells on the motoneurons. The results of the corresponding computer experiment are shown in Table XXIV.

TABLE XXIII

No. of Renshaw cells    26      13     7     0
e                       0.18    7.9    32    48
2a                      0.045   0.15   0.5   1.9
The data of Table XXIV show that the effectiveness of desynchronization is reduced to a larger extent when the number of inputs to a Renshaw cell is halved than when the number of outputs from a Renshaw cell is halved. A simultaneous lowering of the number of inputs and outputs leads to an even sharper impairment of desynchronization.

TABLE XXIV

Connections*                   e      2a
1 MN → 6 RC, 8 RC → 1 MN       0.18   0.045
1 MN → 3 RC, 4 RC → 1 MN       23     0.5

* MN: motoneurons; RC: Renshaw cells.
Renshaw cells apparently participate in the stabilization of the frequency of motoneuron impulse transmission. This assertion is based on the fact that the portion of time during which a motoneuron experiences the inhibitory influence of a Renshaw cell, τ(δ_R) = t_T(δ_R)/δ_R, increases sharply for intervals between the excitations of Renshaw cells lasting less than 100 msec (Fig. 25a). In addition, Renshaw cells inhibit not only the group of active motoneurons, but also the seldom active, rarely impulsing motoneurons, whose activity thereby becomes even less frequent. Histograms of the intervals for the various active motoneurons* are very similar to one another. For this reason, to lessen the machine time, we used their sum as an indicator of the interval stability. A change in the form of the characteristic t_T(δ_R) of a Renshaw cell results in a change in the position of the maximum and in the deviation of the interval distribution (see Fig. 25). Figure 25 shows two different characteristics t_T1(δ_R) and t_T2(δ_R), and their corresponding histograms of intervals n₁(t) and n₂(t). For comparison, we show a histogram n₃(t) constructed from the data of Gel'fand et al. In the majority of the computer experiments, we used the characteristic t_T1 (see Fig. 25). The deviation of the interval duration in the computer experiments, even for the most satisfactory versions of the characteristics of the Renshaw cells, is greater than the interval deviation measured in a physiological experiment. Apparently, there are still other mechanisms of interval-length stabilization which we have not considered.

Figure 25

* The active motoneurons are those motoneurons which transmit impulses at a high frequency and account together for one-half of all the impulses.
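The quantity τ(δ_R) = t_T(δ_R)/δ_R can be sketched as follows. The model specifies only that t_T(δ_R) is a cubic polynomial up to 500 msec and constant beyond; the coefficients below are illustrative placeholders (the published curve, Fig. 23c, is given only graphically):

```python
def series_duration(delta_r, coeffs=(0.0, 0.4, -8e-4, 5.5e-7)):
    """Duration t_T (msec) of the inhibitory series of a Renshaw cell
    as a function of its rest time delta_r (msec).

    As in the model: a cubic polynomial for delta_r <= 500 msec and
    t_T(500 msec) beyond.  `coeffs` are illustrative, not the
    published values.
    """
    d = min(delta_r, 500.0)
    a0, a1, a2, a3 = coeffs
    return a0 + a1 * d + a2 * d * d + a3 * d ** 3

def inhibited_fraction(delta_r):
    """tau(delta_r) = t_T(delta_r)/delta_r: the share of the interval
    during which a motoneuron feels the Renshaw inhibition; it grows
    sharply as the interval shrinks below ~100 msec."""
    return series_duration(delta_r) / delta_r
```

With any increasing, saturating t_T of this shape, short intervals between Renshaw excitations leave the motoneuron inhibited for a much larger fraction of the time, which is the stabilizing effect described above.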
4 Control of Muscle Length

Using the program just described, we simulated a simple motor problem: that of changing the length of a muscle. The length of a muscle, as already mentioned, depends on its load, its elastic and viscous properties, and the active tension it develops, i.e., ultimately on the facilitating stimuli received by the motoneurons. One could, for example, control the length of a muscle by changing the stimuli received by the motoneurons from other sections of the spinal cord. In this type of control, the length of the muscle will depend strongly on the load, and in a number of cases can be obtained only within a large error. It seems more natural to us to assume another method of control, one related to changing the spindle thresholds. This method makes it possible to achieve a more accurate fit between the actual length of the muscle and the desired length. To control the length of a muscle through the spindles, we introduced an additional link which produces a term δ(t) that is added to the thresholds of all spindles and depends on the difference between the actual length of the muscle x(t) and the desired length x₀(t):

Here the coefficients a₂, γ, the time lag t₄, and the time constant t₅ are constants whose values are selected empirically (a₂ = Q, γ = 1/64, t₄ = 20 msec, t₅ = 256 msec). The required length of the muscle x₀(t) was given in the program. A stepwise change of the required length x₀(t) caused a subsequent change in the length of the muscle x(t). The transient time for the new length x(t) and the character of the transient mode are practically identical whether x₀(t) is increased or decreased. In Fig. 26a, on the abscissa axis we plot the time measured from the time of the stepwise change in x₀(t) from the starting value x₀ = 0.5 cm to +0.25 cm (x₀₁) or to −0.25 cm (x₀₂). On the ordinate axis we plot the length of the muscle x(t) and the corrective term δ(t). The curves x₁ and δ₁ show the variation of the coordinate and the correction term for x₀(t) = x₀₁; the curves x₂, δ₂ show the same quantities for x₀(t) = x₀₂. An instantaneous change in the load on a muscle also causes a deflection of x(t) from x₀(t). Adding a load and removing a load cause different transitional processes; they differ both in character and in duration (Fig. 26b). In this figure, on the abscissa axis we plot the time measured from the time of adding (removing) a load. On the ordinate axis we plot the length of the muscle x(t) and the corrective term δ(t). The curves x₁ and δ₁ correspond to the variation of the muscle length and the correction term after adding a load (an increase of m from 30 to 60 gm); x₂ and δ₂ correspond to the same quantities after removing a load (a decrease in m from 30 to 15 gm). The plot corresponding to the removal of the load exhibits a characteristic feature: a large surge at the origin and a "step" afterward result from the action of the extension (stretch) reflex. The subsequent drop in the curve is caused by an additional feedback mechanism.
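Since the text names only the constants of the corrective link, the sketch below assumes a plausible structure: a first-order filter with time constant t₅ tracking the length error delayed by t₄, with a proportional gain a₂ and an integral-like term with coefficient γ. Both this structure and the default value of a₂ are our assumptions, not the authors' published equation:

```python
def simulate_correction(x, x0, dt=2.0, a2=0.25, gamma=1.0/64,
                        t4=20.0, t5=256.0):
    """Corrective term delta(t) added to all spindle thresholds (sketch).

    `x` and `x0` are lists of actual and desired lengths sampled every
    `dt` msec.  delta relaxes, with time constant t5, toward a target
    built from the error x - x0 delayed by t4 msec.  a2's default is a
    placeholder; gamma, t4, t5 echo the constants quoted in the text.
    """
    lag = int(t4 / dt)                 # delay of t4 msec, in samples
    delta, integral, out = 0.0, 0.0, []
    for k in range(len(x)):
        j = max(0, k - lag)            # index of the delayed sample
        err = x[j] - x0[j]
        integral += gamma * err * dt   # slow integral-like component
        target = a2 * err + integral
        delta += (target - delta) * dt / t5   # first-order relaxation
        out.append(delta)
    return out
```

Under this assumed form, a sustained positive length error drives δ(t) steadily upward, raising all spindle thresholds until the error is removed, which is qualitatively the behavior of the curves δ₁, δ₂ described below.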
It is interesting to note that similar curves of the transient process were obtained in the experiments of G. A. Arutyunyan. With a sharp decrease in the weight of a pistol (unloading) in the hand of a marksman, the aim point follows a trajectory coinciding with the curve of Fig. 26b. Certain angles at the joints vary in a similar fashion. When a load is added, the action of the corrective mechanism is masked by the slowing of the motion because of the larger mass.
Figure 26
5 Simulation of Pathological States

In simulating the impulsion mode of a motoneuron in a person having Parkinson's disease (tremor form), we started with the assumption [54] that the thresholds of the Renshaw cells in such a person are increased. The results of the experiment with the model are shown in Table XXV. The maximum amplitude a of the oscillations of the muscle length and the maximum synchronization correspond to the case P_R0 = 4 and τ_φR = 25 msec. In this case, the oscillations of the muscle length were close to sinusoidal (Fig. 27). In this figure, on the abscissa axis we plot the time, and on the ordinate axis the length of the muscle. Curve 1 shows the variation of the muscle length with time in simulating Parkinson's disease; curve 2 illustrates the normal oscillations of length in a model of a healthy muscle. An increase in the summation time of a Renshaw cell lowers the synchronization caused by an increase in the threshold of the Renshaw cell.

TABLE XXV

Figure 27

An attempt was made to simulate the operation of a motor unit in a muscle of a person who has had poliomyelitis. We assumed that in the affected muscle only a small portion of the motoneurons remains (e.g., 10% of the original number). Because of their limited number, these motoneurons, working with their motor units, produce the necessary (small) tension only if the impulsion has a higher frequency (Fig. 28). In this figure, on the abscissa axis we plot the intervals t between the excitation times of the same motoneuron; on the ordinate axis, the fraction of intervals of length t relative to the total number of intervals between the excitation times of active motoneurons. The active motoneurons are not able to stimulate a sufficient number of Renshaw cells, and therefore, in practice, their Renshaw cells do not participate in desynchronization. Nevertheless, the computer experiments have shown that, other parameters being constant, and even with P_R0 = 8, the impulse transmission of the motoneurons remains asynchronous. Apparently, the synchronizing action of the sinew receptors is insufficient, i.e., the synchronization of motoneurons in poliomyelitis is connected with a change in the character of the random impulsion at the input of a motoneuron, or with some other synchronizing stimuli that go beyond the range of our assumptions.

Figure 28

The authors are very grateful to M. L. Shik for the constant attention he gave to this work and for a number of valuable comments.
Restructuring Prior to a Movement¹
A great deal of morphological and physiological data has accumulated up to the present time indicating that the spinal cord is not a passive slave mechanism carrying out supraspinal commands, but is, rather, a complex system in which the interaction of its elements is no less essential to the motor effect than the supraspinal impulsion received by the motoneurons. This point of view is substantiated by the following data [73, 99, 160, 162, 164, 172, 173, 176, 182, 184, 196]:

1. The presence in the spinal cord of a large number of interneurons, many of which are not afferent neurons of either second or third order, and which are located in the front part and in the intermediate zone.

2. A preponderant portion of the descending fibers from the brain ends not on motoneurons, but on interneurons of the intermediate zone, as well as of the front and rear parts.

3. A large portion of the synapses in the spinal cord is formed by the spinal neurons on one another, and only a minority is formed by the axons coming from the brain and the spinal ganglions.

4. The presence of a powerful system of differentiated intracentral interaction among the motoneurons of various muscular groups at the segmentary level (reverse facilitation and inhibition).

5. The presence of a reflector system of muscular interaction at the segmentary level through muscle receptors (myotatic reflector influences).

6. The effect of segmentary afferentation on the reflexes of distant segments by means of the propriospinal and the spino-bulbo-spinal systems.

¹ This chapter was written jointly with V. S. Gurfinkel', Ya. M. Kots, V. I. Krinskiy, Ye. I. Pal'tsev, and A. G. Fel'dman, and was previously published in the same collection with Gurfinkel' et al. [79] (Editor's note).
One can assume, therefore, that the spinal system of motoneuron interaction (taking place through the corresponding interneurons, intracentrally and reflectorally) is under substantial supraspinal influence. On the other hand, the effect of the supraspinal influence depends on the state of the system of interaction among the motoneurons. Motor control occurs largely through the supraspinal change of this inner system of interaction. A number of experimental data about the inner spinal system of motoneuron interaction make it possible to assume that a stable state of silence of almost all motoneurons (muscles) corresponds to it, and that it has a stabilizing effect on the motoneurons, leading to an elimination of the effect of stimuli (afferentation). The latter process proceeds either aperiodically or periodically (rhythmic reflexes), depending on the conditions. However, during integrated motor acts the segmentary apparatus permits a retuning such that the interaction of the elements, occurring at the segmentary level, is subordinated to the solution of the entire motor problem. A number of experimental data (experiments with a sudden unloading; synchronous registering of the oscillations of the common center of gravity and of the electric activity of the muscles during the maintenance of a vertical posture) show that under the conditions of integrated motor acts, the correction of movements is achieved during a time which is characteristic of spinal reactions. From this one can conclude that during natural motor acts the functioning of the spinal level is subject to, and agrees with, the entire motor task. The investigation of the problems arising in this connection is in its beginning stage, and thus it is important to formulate a number of possible directions of research:

1. An experimental verification of the starting hypothesis stipulating the retuning of the segmentary relations before the movement actually takes place.

2. An attempt to specify the physiological mechanisms facilitating the use of the spinal system of interaction of the nerve elements in motor control.

3. A rethinking (in view of the difficulty and frequently the impossibility of direct experiments) of the mathematical models of the functioning of the segmentary level and of the principles of control over its activity.
The present chapter is devoted to the first of the indicated directions of research. On the basis of methodological considerations, we have chosen as the simplest experimental problem a simple unmeasured movement performed on a command in accordance with the previous instruction.
As an indicator of the state of the segmentary apparatus of the spinal cord before the movement, we have investigated the sinew and H-reflexes [180] under single and double stimulation. The reflexes were tested in the initial state and at various times within the interval between the command and the movement. The reflexes were registered using the electric response of the corresponding muscles of the lower limbs. To produce the sinew reflex, we used a device consisting of an electromagnetic hammer and an electronic delay circuit which permits the sinew shock to be delivered at a given delay after the command (Fig. 29). To produce the H-reflex, we used a two-channel stimulator, "Multistim," which permits one to set up the required interval between the first and second stimuli and to control their amplitudes independently. The stimulating electrodes used to produce the H-reflex were located in the popliteal fossa over the rear shin bone nerve, whose stimulation causes in the leg muscles a direct M-response and a reflector monosynaptic H-reflex [180, 185].
Figure 29. A scheme of the experimental setup for testing the sinew reflex. IG, impulse generator; DB, delay block.
In the first series of investigations, we studied the amplitude of the sinew reflex as a function of the time interval from the moment of the command to the movement. At this command, the subject was supposed to straighten the leg at the knee joint. The registering of the total electromyogram of the quadriceps muscle of the hip (leg extensor) shows that the straightening of the leg usually takes place within 160-180 msec after the command. During the first 100 msec after the command, the amplitude of the knee reflex remains constant. If, however, the sinew shock is delivered 100 msec or more after the command, then the amplitude of the reflex is greater, the closer to the start of the movement the shock is produced. Before the start of motion, it increases 2-3 times (Fig. 30a). It should be noted that the increase in the knee reflex occurs quite steeply in the interval 100-130 msec after the command (70-40 msec before the onset of the motion). During the last 30-40 msec before the onset of motion, the amplitude of the reflex increases insignificantly. Figure 30a also shows that a reflex produced in a certain phase of the latent period (tracks 4-6) postpones the beginning of the motor reaction. Analogous changes take place when testing the Achilles reflex (Fig. 30b), when the subject, on command, bent his foot along the sole. This may indicate that the change in the state of the segmentary apparatus before the beginning of motion does not amount merely to a facilitation of the alpha-motoneurons; if the latter were the case, we would have to expect the increase in the reflex amplitude to continue until the very beginning of motion.
Figure 30. Variations of the sinew reflex amplitude during the latent period before an arbitrary movement. (a) Electric responses of the upright head of the quadriceps muscle in the thigh to a standard impact on the patellar sinew at rest (K), as a function of the time interval before the onset of motion (recordings 1-8). The arrow (sound click) signifies the command to start an arbitrary motion. (b) The same as (a) for the calf muscle during the Achilles reflex. Below are plots of the sinew reflex amplitudes as a function of the time before the beginning of motion; the ordinate is the amplitude of the electric response of the muscle (in millivolts).
The data obtained show that it is only in the first part of the latent period of the simple motor reaction that no noticeable changes are observed in the segmentary apparatus of the spinal cord. However, at 70 msec before the beginning of motion, the state of the segmentary apparatus changes, as is revealed by the increase in the amplitude of the knee reflex. To what can these changes be related? Are they related only to the muscle group which, according to the previous instruction, should participate in the motion, or are there also nonspecific changes which embrace various links of the motor apparatus? To answer this question, we made a study in which the subject was asked, on command, to perform two different movements: in one experiment, to bend his leg at the pelvis-hip joint, and in the other, to straighten it at the knee joint. In both cases, we measured the magnitude of the sinew reflex of the upright (double-jointed) head of the quadriceps thigh muscle and of the outer (single-jointed) head of the same muscle, which, as we know, have a common sinew. Figures 31a and b make it clear that during 70-80 msec before the beginning of bending in the pelvis-hip joint, the increase in the amplitude of the sinew reflex occurs in the upright head of the quadriceps muscle (participating in this movement) and does not occur in the reflector responses of the lateral head. However, just before the motion in which the leg is straightened at the knee joint, accomplished with the help of both portions of the muscle, there occurs an increase in the reflex amplitude of both the upright and the lateral heads. In another form of the experiment, the subject was asked to bend his leg at the knee joint on command. At 70-80 msec before the onset of motion, the knee reflex began to increase, and then again returned to its starting level (Fig. 31c).
A small increase in the amplitude of the knee reflex may also be observed before a movement involving the bending of the ipsilateral hand at the elbow joint. Thus, the changes in the segmentary apparatus beginning at 70-60 msec before the motion contain both a specific and a nonspecific component. This may be related to the well-known data of Granit [73] and others, who discovered in experiments on animals that the supraspinal activation of gamma-motoneurons has, as a rule, a diffuse character. However, the more significant changes in the muscle which participates in the motion allow us to assume that the increase in the knee reflex before the motion is related not only to a diffuse activation of gamma-motoneurons, but also includes some other mechanisms. In this connection it seemed advisable to study the effect of Jendrassik's maneuver² (an arbitrary tension in hand
² A procedure for emphasizing the patellar reflex (Translator's note).
Figure 31. The relationship between the amplitude of the sinew reflex and the time before the beginning of an arbitrary bending at the pelvis-hip joint (a), and before an arbitrary straightening (b) and bending (c) at the knee joint. On the ordinate axis we plot the amplitude of the electric response of the muscle; on the abscissa axis, the time in msec before the beginning of motion. The crosses indicate the response of the upright head, and the circles the response of the outer head of the quadriceps hip muscle.
muscles) on the magnitude of the knee reflex at various times before the motion. One can assume that Jendrassik's maneuver, in addition to the intracentral facilitation [192], also involves an activation of the gamma-system which leads to an increase of the sinew reflexes. Measurements have shown that Jendrassik's maneuver increases the amplitude of the knee reflex by only 30%, and this facilitated reflex then changes before the onset of motion in the usual way. As indicated, facilitation of the alpha-motoneurons alone could hardly explain the sharp increase in the knee reflex before the beginning of motion. Therefore, it is possible that before the onset of a movement there occurs not only an activation of the systems facilitating the monosynaptic reflexes, but also a suppression of the inhibitory systems tonically blocking the monosynaptic reflexes [177]. To verify this thesis, we carried out a fourth series of experiments, in which the state of the inhibitory systems of the segmentary apparatus of the spinal cord was tested with the help of single H-reflexes of various strengths, and a fifth series with double stimulation. We know that as the strength of stimulation increases, the H-reflex at first also increases, and then decreases [185]. In the preceding chapter [81] we have shown that the lowering of the H-reflex amplitude as the strength of the stimulation of the rear shin bone nerve increases is related not only to a blockade of the reflector series through the antidromic excitation of the axons (and bodies) of the motoneurons, but also to a process of central inhibition. In our study of the variation of the H-reflexes before a movement (fourth series of experiments), we have found that the amplitude of the H-reflex caused by a weak stimulation (in the absence of, or with an insignificant, M-response) invariably increases even at 60 msec before the start of a movement, and increases more, the closer we are to the front of the myogram (Figs. 32 and 33, curve 1). As we can see from the recordings, the time dependence of the amplitude of the reflector response is fairly smooth before the onset of the movement. The amplitude of the H-reflex caused by a strong stimulation (with a strong M-response) invariably changes at 40 msec before the onset of motion (Fig. 33, curve 3). The increase as we approach the beginning of the motion has a different character compared with the increase of the amplitude of an H-reflex caused by a weak stimulation. The slow increase in the amplitude of the H-reflex with a large M-response changes in the 30-msec interval before the onset of motion, being replaced by a steep increase. This fact may indicate a blockade, before the beginning of motion, of the inhibiting system which is partially responsible for stopping the H-reflex when a strong stimulus is applied.
Biological Systems and Mathematical Models in Biology
Figure 32. Increase in the amplitude of the H-reflex of the calf muscle during the latent period of a simple motor reaction. (1) Control recording at rest; (2)-(6) recordings at various times before the onset of movement; (7) recording during the movement. The origin of the myogram (monitored with the help of another pair of electrodes at large amplification) is shown with an arrow. Recordings (1)-(5) were made with a smaller delay of the beam-sweep trigger than recordings (6) and (7). The strength of the stimulation was 24.5 V.
In the fifth series of experiments we obtained another indication that before motion a change of state occurs not only in the alpha-motoneurons [181] but also in the interneurons of the spinal segment. The posterior tibial nerve was given pairwise stimuli with an interval of 60 or 25 msec. As we know, in this situation it is easy to choose the stimulus intensities in such a way that the response to the second stimulus is inhibited. If such pairwise stimuli are applied before the onset of motion, then the response to the
Figure 33. Increase in the amplitude of the H-reflex in the soleus muscle as the onset of motion is approached. Ordinate: amplitude of the H-reflex (mV) in the electric response of the muscle; abscissa: interval (msec) before the start of the myogram. I-III: reflexes of zones I-III.
Restructuring Prior to a Movement
second stimulus is disinhibited (Fig. 34, curves 3 and 4). This occurs only if the second reflex precedes the onset of voluntary motion by 20 msec or less. A comparison of the disinhibition curve for the second reflex with the plot of the amplitude of a single H-reflex caused by a stimulation of the same intensity (Fig. 34, curve 1) shows that the effect of disinhibition cannot be explained simply by a facilitation of alpha-motoneurons before the movement, even if only curve 1 in Fig. 34 is treated this way. Similar changes are observed when applying pairwise tendon reflexes. In this case, the second reflex in the pair is initially considerably weakened (as compared with its value in the control record obtained when applying a single stimulus), and is partially restored before the motion (Fig. 35). Consequently, the onset of motion is preceded by extensive reorganization of the spinal segment. The reorganization can be detected before the beginning of a fast physical motion. Probably, changes of this type are also
Figure 34. Increase in the amplitude of a single H-reflex [(1) and (2)] and of the amplitude of the second H-reflex during pairwise stimulation with intervals between the stimuli of 60 msec (3) and 25 msec (4). With pairwise stimulation, the second stimulus was the same as the single stimulus causing the reflex of curve 1.
Figure 35. Increase in the amplitude of the second knee reflex before the movement when two reflexes were obtained with an interval of 220 msec.
of great importance in the problem of establishing and maintaining posture, where they may be related to the basic mechanism of control over motor activity [23, 110, 111, 129]. The diverse changes in the values of the reflexes in various muscle groups, detected in connection with movements (or preparation for a movement), may be used to study the constitutive and functional motor synergies. The authors are grateful to N. A. Bernshteyn, I. M. Gel’fand, and I. I. Pyatetskiy-Shapiro for valuable advice.
Bioelectric Control and Diagnostics of States¹
The bioelectric processes in the heart, brain, and skeletal muscles have long been recorded, not only as a method of physiological research, but also in connection with the great diagnostic possibilities of electrocardiography, electroencephalography, and electromyography. Along these basic directions of clinical electrophysiology, a great deal of factual material has accumulated indicating a connection between bioelectric phenomena and certain changes in the functional state of both individual cell structures and whole organs and systems. The experience gained in the long development of electrophysiology made it possible during the last two decades to enter a new phase in the practical utilization of its achievements. We have in mind the use of biopotentials as control signals in various systems which combine diverse technological devices with an organism and provide continuous automatic control over these devices. These systems were given the name of bioelectric control systems [31, 96]. Probably the first paper in which a biocontrol system was described was written by Ferris et al. [171]. To control the supply of electric current to an animal at definite phases of the heart cycle, they used the R spikes of an ECG. These authors successfully solved their particular problem, but they did not notice the fundamental novelty of their methodological approach and did not foresee its possibilities. A number of interesting advances are due to Bickford [167, 168], Shepherd and Wood [191], Battye et al. [166], and others. Our first research in this direction was begun in 1956 jointly with M. G. Breydo et al. During the past ten years, there have appeared in the literature

¹ This chapter was written jointly with V. S. Gurfinkel’. The final version of the article was written after the death of M. L. Tsetlin (Editor’s note).
a great number of papers indicating that the principles of bioelectric control have found successful application in the development of active prostheses and orthopedic apparatuses, as well as diagnostic and therapeutic devices. At the present time, work in the area of biocontrol is being done in dozens of laboratories in the USA, Canada, England, France, Italy, Poland, and Yugoslavia. In this chapter, our objective is not to give a survey of leading research, but rather to present the basic experiments. We would like to recall that a substantial number of the devices developed for biocontrol can be classified, according to the characteristics of the control signal, into devices controlled by the biopotentials of the skeletal muscles, of the heart, and of the brain. Our research was in the area comprising the first two groups.

1 Use of Skeletal Muscle Biopotentials for Control

The first attempt to construct a working technical system controlled by the biopotentials of the skeletal muscles was made in 1955 by Battye et al. [166]. These workers developed a model of an artificial hand which was closed and opened using the biopotentials of the flexors and extensors. The model was operated by a switch. The first servomechanism controlled by the biopotentials of the skeletal muscles, developed by us in cooperation with M. G. Breydo et al., was designed to close and open the model artificial hand smoothly. Its detailed description is given by Kobrinskiy et al. [95]. Naturally, there may arise the questions: What is the purpose of a servomechanism controlled by muscular biocurrents? What are the advantages of such a system when compared with the usual methods of control? Considering the participation of a person in industrial processes, in particular in the control of machines, tools, or machine tools, we can imagine the following very simple system.
Under the influence of the information which a person constantly receives through his lifetime experience, adaptation, etc., his central nervous system produces decisions and programs of action, which must be coded in such a way as to make the corresponding movements feasible. This is precisely the way “. . . the external manifestations of the brain’s activity . . .” are revealed (Sechenov [IN]). Consequently, the commands produced in the central nervous system are carried out through the “code of movements.” However, such a system of human participation in control is not the only one possible. Apparently, with definite advantages and drawbacks, one can give up the use of muscular motions and use the preceding link in the control loop, namely, the activation of the muscles. In fact, in the last link of the chain
carrying out the commands of the central nervous system, the activation of a muscle precedes its tension. Therefore, one can shorten the control link by one element and use the properties of muscle activation. Here, electromyography is useful. Returning to the original question, we can answer that in all those cases in which, for various reasons, it is more convenient to use muscular activation rather than the movement itself, it is natural to make use of a bioelectric control system. A necessity of this type may arise in connection with a need for a more rapid (as compared with the movement) command from the central nervous system, and also in those cases in which the activation of a muscle proceeds normally but the movement cannot be performed (e.g., in the case of amputation) or is performed unreliably (extensive paresis and paralysis). The advisability of developing bioelectric control systems with applications to such problems is obvious. A voluntary coordinated movement is defined as a movement whose result corresponds to the intention of a person. If, however, for the purposes of control, one uses not the movement but the activation of a muscle, then it is necessary to know which muscular excitation parameter most adequately reflects the command coming from the central nervous system. On the basis of physiological research, it was established that even one such indicator of activation as the instantaneous value of the intensity of the muscle biopotentials characterizes with sufficient accuracy the command received by the muscle. The choice of the form in which this muscular excitation parameter is used for control purposes depends on the construction of the organs performing the movement. In our first model, we selected a mechanical system, a servomechanism controlled by discrete current pulses.
In this connection, there arose the problem of transforming the instantaneous intensity indices of the biopotentials into a set of standard signals whose repetition frequency is proportional to the power of the biocurrents. This role is played by an integrator equipped at the output with a thyratron relaxor. Despite the large size and complexity of the apparatus, the first model confirmed the feasibility of achieving a bioelectric system with gradual control. In 1958, we developed and tested a bioelectric manipulator with a hydromechanism. The device achieved a smooth movement of a fairly powerful slave mechanism without transforming the biopotentials into discrete impulses. The grip strength of the manipulator exceeded many times the grip strength of a human hand. Thus, we might say, a muscular amplifier was created.
An investigation of the first models and development of the initial premises for using the amputated muscles of the stump in bioelectric control made it possible to develop forearm prostheses with bioelectric control, involving small control units and an electromechanical device. At the present time, they are about to be mass-produced. Speaking of further practical utilization of bioelectric control involving the potentials of skeletal muscles, it is necessary to mention the physiological problems that must still be solved. They include an investigation of the mutual relationship of the bioelectric activity of various muscles in complex synergies, the possibility of simultaneous independent control over various muscles, problems relating to recording biopotentials from individual muscles in isolation, etc. We concentrated on the system of biocontrol as applied to prostheses because our effort was centered largely in that area. This, however, does not exhaust the possibilities of using the biopotentials of skeletal muscles for control purposes. Even within the field of prosthetics there is still another fairly wide area of application for this principle. This is the area involved in supplying prostheses to patients with residual symptoms of poliomyelitis and to those paralyzed for other reasons. Functional orthopedic devices used in extensive paresis and paralysis may be equipped with external sources of energy and be controlled by biocurrents from healthy muscles, or even from paralyzed muscles if there is at least some bioelectric activity in them. This area of functional therapy is closely associated with another area that used to be called mechanotherapy. Application of bioelectric control in this area makes it possible to provide all kinds of training devices of exceptional quality.
The training may include a stimulation of the residual activity of functionally weakened muscles by accurately supplying the “help” that the device should give to the patient at a given moment of the movement. Using this principle, it is apparently also possible to considerably improve the properties of the devices (the so-called “iron lungs” and others) used in the therapy of acute or chronic insufficiency of the respiratory musculature. These devices, controlled by the biocurrents of the respiratory or auxiliary muscles, not only provide satisfactorily the much-needed pulmonary ventilation, but also actively stimulate the reactivation of the affected respiratory muscle system.
2 Use of Cardiac Biocurrents to Control Diagnostic Devices

A number of problems arising in the construction of various diagnostic devices require special methods of control for their solution. In a number of cases, such control can be achieved depending on the state of various
physiological systems of the patient or the subject (the simplest example is artificial respiration). To control devices of that kind, it is necessary to obtain the physiological information systematically. This information can be obtained by diverse methods, one of which involves biopotentials. The methods of obtaining these potentials have been thoroughly investigated for a number of cases, and are widely used in physiological research and in clinical practice. It seems useful to us to attempt to use the biopotentials as control signals for diagnostic devices. This, of course, creates technical and logical possibilities that are to a large extent related to the specific devices to be controlled and their purpose. We shall try to describe the specific features of the problems under consideration, leaving out the problems related to the amplification of small electric signals, since these have been thoroughly investigated. The problems in which it is convenient to use bioelectric signals from the heart are very diverse, and the literature already contains descriptions of certain devices of that class. Here we shall only give examples of using cardiac biopotentials which we have studied and used in our own work. Cardiac biopotentials can be used to control devices whose operation is in some way related to cardiac rhythmicity (see the literature [82, 85, 96]). The properties of the ECG, such as rhythmicity, the definite duration of ECG segments, their relationship with the mechanical manifestations of cardiac activity, the relative constancy of the principal features of the ECG in the normal state, and the characteristic changes in pathological states, are all used in systems of bioelectric control. The simplest device in this group is a cardiosynchronizer,² i.e., a device that provides automatic switching of a therapeutic or diagnostic apparatus synchronously with arbitrarily chosen phases of the cardiac cycle.

A. X-RAY ANALYSIS OF THE HEART IN AN ARBITRARILY CHOSEN PHASE OF THE CARDIAC CYCLE
To diagnose the functional state of the heart, in particular to assess the contracting function of the myocardium and the blood capacity of the heart chambers, and to accurately determine the excursion of the heart at various phases of its activity, it is necessary to perform the x-ray analysis at accurately determined phases of the cardiac cycle. With this method one can investigate, under both clinical and experimental conditions, the cardiac activity dynamically (after a surgical intervention, upon administering medicine, during various functional tests), since it makes it possible to accurately compare heart x rays taken at various moments during an investigation.

² This technique and the instrument designed by us were developed in 1959.

There is information indicating that in Moscow the x-ray specialist Ts. Russo used the electrocardiographic impulse in the x-ray analysis of the heart at one constant phase of the cardiac cycle. The same concept underlies the technique of kymography used in x-ray analysis of the heart and developed abroad. However, the electronic devices available in those years made it impossible to carry out an effective x-ray analysis of the heart at any phase of the cardiac cycle arbitrarily chosen by the investigator. To switch on the x-ray apparatus at a given phase of the cardiac cycle, we used the biopotentials of the cardiac muscle. Thus, we used the principle of biocontrol. The R spike was taken as the command signal used to switch on the x-ray apparatus. A diagram of the corresponding device, the cardiosynchronizer, is shown in Fig. 36.
Figure 36. Schematic of the cardiosynchronizer.
The output signal from an ordinary electrocardiograph is received at the input of a driven multivibrator built on a 6N1P tube, whose anode circuit includes one of the coils of a polarized relay P1 of type RP-4. When the R spike appears, the multivibrator is activated and relay P1 begins to operate. The contact K1 of the polarized relay closes the anode circuit of the thyratron T1, and the capacitor C1 is then charged through the variable resistance R1. When the voltage on the capacitor reaches a potential sufficient to trigger the thyratron, a discharge occurs. The discharge current of the thyratron forces the relay P2 into operation, and the group K21 of the relay contacts controls the x-ray apparatus. The contact group K22 of the same relay closes the second coil of the polarized relay. Then the anode circuit of the thyratron breaks, the capacitor C1 discharges, and the circuit returns to its initial state through the contact K23. The moment of exposure at any given phase of the cardiac cycle is determined by the triggering time of the thyratron. In turn, the time when the thyratron is triggered is regulated by the potentiometer R1, which changes the time constant of the RC circuit included in the anode circuit of the thyratron. This scheme was used to control a mass-produced x-ray apparatus of Soviet production, the RUM-5. In one series of experiments, we superposed several images of the heart’s shadow taken at the same phase of the cardiac cycle. In this process, the contours of the shadows remained distinct. Figures 37 and 38 show images of the heart’s shadow registered at various arbitrarily chosen phases of the cardiac cycle, in diastole and in systole. In a number of cases, the x rays of the heart were taken on the same film at different phases of the cardiac cycle. We could thus determine the displacement of the shadow, which looked like a complementary shadow in the form of a sickle (Fig. 39). Since changes in the heart’s volume occur during relatively short time intervals, the x-ray pictures must be taken with very short exposure. Thus, we have taken x-ray pictures using an exposure time of 0.05 sec on the RUM-5 apparatus. Later, when working on the
Figure 37. Heart x ray in the diastole phase.
Figure 38. Heart x ray in the systole phase.
x-ray apparatus made by the Ellem Company, the exposure time was shortened to 0.02-0.03 sec. To determine the variation in the dimensions of the heart’s shadow accurately, it is advisable to make two exposures on one film: one at the end of diastole, i.e., at the maximum blood intake by the heart, and the second at the end of systole, i.e., at the minimum blood intake. To increase the accuracy, one can take x-ray pictures in two or more projections. Experiments have shown that the first model of the cardiosynchronizer makes it possible to take x-ray pictures of the heart at any phase of the
Figure 39
cardiac cycle which must be investigated. This permits a comparative analysis of x rays taken at various phases of the cardiac cycle. It seems useful to us to use the cardiosynchronizer in conducting an angiocardiographic investigation of the heart. The use of contrasting materials when taking x-ray pictures at arbitrarily chosen phases of the cardiac cycle will, in our view, make it possible to determine the thickness of the heart chamber walls, to measure the volume of residual blood in the cavities of the heart, and to determine the presence of any pathological fissures, shunts, and regurgitations. To perform this task, we developed a second model of the device (Fig. 40).
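The timing logic of the cardiosynchronizer described above can be sketched in a few lines: each R spike is detected as an upward threshold crossing, and the x-ray trigger fires after an adjustable delay that selects the desired phase of the cardiac cycle (the role of the RC circuit and thyratron in the hardware). This is an illustrative simulation, not the tube-and-relay device; the toy ECG, threshold, and delay values are assumptions.

```python
def r_spike_times(ecg, dt, threshold):
    """Return times of upward threshold crossings (R-spike detection)."""
    times = []
    for i in range(1, len(ecg)):
        if ecg[i - 1] < threshold <= ecg[i]:
            times.append(i * dt)
    return times

def trigger_times(ecg, dt, threshold, delay):
    """Exposure moments: each R spike plus the RC-delay analogue."""
    return [t + delay for t in r_spike_times(ecg, dt, threshold)]

# Toy ECG: one R spike every 0.8 s (75 beats/min), sampled at 1 kHz.
dt = 0.001
ecg = [1.0 if (i % 800) == 400 else 0.0 for i in range(2400)]
shots = trigger_times(ecg, dt, threshold=0.5, delay=0.25)  # a mid-cycle phase
```

Changing `delay` here corresponds to turning the potentiometer that sets the RC time constant: the same phase of every beat is selected, whatever the moment of the first spike.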
Figure 40. Diagram of the second model of the cardiosynchronizer.
An output signal of negative polarity from any electrocardiograph is supplied to the input of the device, to the tube L1, which is connected in the circuit of a driven multivibrator with cathode feedback; the tube L1 transforms the R spike of the ECG into a rectangular impulse, which is transmitted to the subsequent cascade L2. The cascade L2 is a trigger circuit whose anode link includes a polarized relay RP-4. The R spike of the ECG, upon transformation, activates the trigger, which switches the contacts of the polarized relay. Then the capacitors in the circuits of the keep-alive electrodes of the thyratrons T1, T2 (type MTKh-80) begin to be charged. The rate of charging (and thus the lag time) is regulated by the corresponding potentiometers. When each thyratron is excited, it causes the output multivibrator L3 to be activated, and thus the relay which regulates the time of exposure begins to operate. When the other thyratron is excited, the trigger is deactivated, the polarized relay is released, the capacitors discharge, and the circuit returns to its initial state. Thus, the device makes it possible to take two pictures during one cardiac cycle. Using the second model of the cardiosynchronizer, we have made a number of investigations on the apparatus of the Ellem type. The results confirmed the possibility of using the cardiosynchronizer to control serial angiocardiography. An interesting example of the use of the angiocardiographic setup equipped with the cardiosynchronizer is the experiment we made taking x-ray pictures of the heart at those phases of the cardiac cycle in which the mechanics of the heart’s valves become accessible to investigation. When taking x rays of the heart, to obtain comparable photographs the subject must stop breathing, since the position of the heart in the chest changes during the breathing cycle. In this connection, when working with animals, where it is impossible to obtain x rays at a fixed phase of the breathing cycle, the output of the cardiosynchronizer may be provided with a relay connected with a breathing sensor (potentiometric, angular, etc.). In this way pneumocardiosynchronization can be achieved, i.e., it is possible to take x rays of the heart at strictly determined phases of both the breathing and cardiac cycles.
We must also keep in mind that photography by means of the cardiosynchronizer, gradually increasing the time between the R spike and the moment the x-ray photograph is made, permits an accurate reproduction of the entire cardiac cycle.³

³ The USSR has produced an experimental series of phase-x-ray-cardiographs FRK-60 VNII MIO.

B. ELECTRICAL STIMULATION OF THE HEART USING THE CARDIOSYNCHRONIZER

As we know, during the heart’s activity its functional state changes dynamically, so that at various phases of the cardiac cycle the excitability of the myocardium turns out to be different. In this connection, in an experiment involving a study of the heart’s excitability, it is necessary to send the electrical stimulus at an exactly determined moment of the cardiac cycle, arbitrarily chosen by the experimenter. In investigations involving an isolated heart, or under conditions of surgery which permit a direct manipulation of the heart, we used coarse mechanical relays to stimulate the myocardium electrically at various phases of the cardiac cycle. Thus, for example, the arm of the Engelman lever recording the contractions of the ventricles was provided with an electric contact which closed the electric circuit at a certain moment of the lever’s movement. By changing the position of the contacts, the experimenter could arbitrarily change the time of the electric stimulation relative to the cardiac cycle. This method of research was extremely laborious, relatively crude, and, most important, it permitted investigations only under acute experimental conditions. The use of electronics permits a considerable improvement in the methods of studying the electrical excitability and other biophysical parameters of the heart. To apply the electric stimulus at different arbitrarily chosen phases of the cardiac cycle, it is convenient to use the cardiosynchronizer. The combination of a cardiosynchronizer with a stimulator makes it possible to apply a test stimulus, regulated in amplitude and duration, within fixed time intervals after the passage of the natural excitation wave. Thus, the proposed method of investigation is based on the principle of biocontrol, since the biopotentials of the heart, recorded from the body of an animal, serve as the control signals for the physiological apparatus. A block diagram of the experiment is given in Fig. 41. One must keep in mind that when it is necessary to apply an electrical stimulus at the beginning of the systole, at a time coinciding on the ECG with the rising limb of the R spike (due to a certain inertia of the relay
Figure 41. Block diagram of the experiment: electrocardiograph, cardiosynchronizer, electrostimulator.
and the system which includes the electrostimulator), there may be a need for a delay whose value is slightly greater than the cardiac cycle period. In the experiments, the time of the electrical stimulus may be marked on the electrocardiogram for purposes of control. The mark is made by feeding a voltage to the electrocardiograph at the time the output relay closes. Thus, using the cardiosynchronizer, the experimenter, by arbitrarily shifting the moment when the electrostimulator is switched on with respect to the R spike on the electrocardiogram, is able to supply the electric stimulus at any arbitrarily chosen phase of the cardiac cycle. This technique makes it possible to conduct an experiment with electrodes implanted in the heart, which considerably widens the possibilities of both experimental studies of excitability and of other biophysical characteristics of the heart. In a prolonged experiment designed to investigate the reaction of the heart to an electric stimulus, one can use both the recording of the biocurrents from the heart and the output signals from various instruments.
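The timing rule just mentioned can be stated as a small sketch: if the desired phase of stimulation falls inside the reaction lag of the relay and stimulator after the R spike, the stimulus is scheduled one full cardiac cycle later instead, arriving at the same phase of the next beat. All numerical values here are illustrative assumptions, not measurements from the original apparatus.

```python
def stimulus_delay(phase, cycle, lag):
    """Delay from the detected R spike to the delivered stimulus.

    phase -- desired moment of the stimulus within the cycle (s after R spike)
    cycle -- cardiac cycle period (s)
    lag   -- reaction time of the relay and electrostimulator (s)
    """
    if phase >= lag:
        return phase          # reachable within the current cycle
    return phase + cycle      # wait one beat; hit the same phase of the next

# Stimulating on the rising limb of R (phase ~0.01 s) with a 0.03-s equipment
# lag requires a delay slightly greater than the cycle period (0.8 s here):
d = stimulus_delay(phase=0.01, cycle=0.8, lag=0.03)   # 0.81 s
```

For phases later in the cycle the lag is irrelevant and the delay equals the chosen phase itself.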
C. USE OF A CARDIOSYNCHRONIZER IN THE INTAKE AND ANALYSIS OF BLOOD SAMPLES FROM THE HEART CAVITIES AND BLOOD VESSELS AT ARBITRARY PHASES OF THE CARDIAC CYCLE
When treating patients with congenital heart defects, it is sometimes necessary to determine the gas content of the blood in the heart’s cavities and in the main vessels at different stages of the cardiac cycle. Such a necessity arises, in particular, in the presence of pathological manifestations involving a variable direction of blood discharge, in the case of unilateral but brief discharge, etc. In such cases the analysis of specimens taken in the usual way may not reveal any changes in the gas content of the blood, since the analysis is made of the average gas content of the blood, which changes very little in the case of brief variations. We shall describe a technique and the apparatus designed for a selective intake and analysis of blood samples obtained by using probes. The technique is based on the use of the biopotentials of the heart muscle to control the removal of blood samples. The block diagram is shown in Fig. 42. The probe 1, introduced into the heart cavity or the vessel being investigated, is connected to a metal cell of a flow oximeter 2, designed to determine the percentage of O2Hb in the blood removed with the probe. The flow oximeter is composed of a metal cell 3, whose cavity contains a magnetic body 4 set in rotation by the magnet 5 located outside the cell and turning on the axis of the electric motor 6, a reflecting photocell 7 of a suitable spectral
Figure 42
sensitivity, the power source for the light bulbs 8, and a sensitive galvanometer 9. This flow-reflecting oximeter with a magnetic mixer is designed for a rapid determination of % O2Hb, which is necessary in order to actively search for the spot from which the blood is to be sampled. For blood intake, the cavity of the cell is connected to a block consisting of a vacuum pump 11, an electromagnetic valve 12, and a control block 13 governing the valve. We used the electric pump 01-19, which is used in surgical practice, as the vacuum pump. The electromagnetic valve is a solenoid with a movable core whose displacement closes and opens a tube connecting the vacuum pump with the cell of the flow oximeter. The electromagnetic valve is controlled by the control block 14; the latter consists of an amplifier for the biopotentials and a cardiosynchronizer whose circuit was slightly modified for the present problem. Thus, to “profile” the required segment of the ECG, we used a trigger cell instead of the output multivibrator with an electromechanical relay which was used earlier in the cardiosynchronizer. The intake of blood during probing was done using a probe whose open end was located in the region of the cardiovascular system where the presence of pathological behavior was reported. To sample blood at a given phase of the cardiac cycle (e.g., during the systole or diastole), one tunes the cardiosynchronizer in accordance with a cardioscope. This is followed by switching on the electric pump, producing a reduced pressure in the system. When the electromagnetic valve, controlled by the cardiosynchronizer, is switched on, the cavity of the cell is periodically connected with the low-pressure system and is filled with blood. The determination of % O2Hb is made using the reflecting oximeter. During the measurement, the magnetic mixer is switched on. In case it is necessary to collect blood for subsequent
Biological Systems and Mathematical Models in Biology
analysis on the Van Slyke apparatus, the procedure is as usual. When the oxygen content of the blood has been determined at one of the cardiac cycle phases (e.g., during the systole), the cardiosynchronizer is retuned so as to permit the collection of blood at some other phase (e.g., during the diastole). Then the procedure remains the same as before. The principles of bioelectric control proved useful for continuous observation of the state of certain physiological systems, for automatic diagnosis of deviations of these systems from the norm, and for signaling such disturbances when they occur.
3 Certain Problems Related to Automatic Diagnosis of Acute Pathological States

Full automation of the differential diagnosis of various nosologic forms is as yet far from technical realization. We have in mind those cases in which the symptoms essential for diagnosis take the form of the subjective utterances of the patient, his habitus, and also information received in visual form, e.g., x-ray photographs, skin rashes, etc. In a number of cases in medical practice, however, it is very important to be able to give a specific diagnosis of the state of a certain physiological system or several systems. Thus, at the present time we observe two different types of diagnostic problems: the problem of recognizing various nosologic forms, i.e., of differential diagnosis, and the problem of continuous control and monitoring of changes in the physiological state, aimed at timely detection of and warning about the development of acute pathological states. An acute pathological state may result from illness, and may also occur in healthy persons exposed to extremely harmful influences from the surroundings. Since we shall be discussing the problem of automatic diagnosis of substantial impairments of the physiological state, we must first of all note that the diagnostic devices solving this problem can be either narrowly specialized, i.e., selectively monitoring changes in the state of some specific functional system (breathing, cardiovascular, central nervous system), or aimed at a diagnosis of changes in the physiological state of the entire organism from changes in several functional systems (diagnosis of collapse, shock, fainting, and so forth). Examples of specialized diagnostic devices are instruments for automatic recognition of breakdowns in the rhythm of the heart's activity and
Bioelectric Control and Diagnostics of States
breathing. We shall briefly describe the functioning of the device Ritm-1 [74]. This device is designed for prolonged continuous analysis of the rhythm of the heart's activity, detection and determination of the character of its impairments, registration of the diagnoses determined in this way, and warning of impending states involving a breakdown of the heart's rhythm. The device may be used in postoperative wards of surgical clinics, particularly in clinics where chest surgery is the primary function. The operating principle underlying the device is the measurement of the duration of the time intervals between successive contractions of the heart (between the R spikes of the electrocardiogram), followed by automatic processing of the results of these measurements. Under normal conditions, the durations of the individual periods of the heart's activity differ from one another very little. Taking respiratory arrhythmia into account, we may consider that a deviation of 20-25% is usually not connected with a pathological impairment of rhythmicity. As a reference duration, the instrument uses the average length of the cardiac cycle over a relatively long period of time (5-10 min). The length of each successive cardiac cycle is compared with the reference length, and the result of the comparison is registered in the memory cells of the instrument. The length of a cardiac cycle is considered normal, and the comparison result is denoted with the letter N, if the deviation from the standard does not exceed 25%. When the measured interval is shorter (longer) than the standard interval by more than 25%, the interval is considered short (long), and the comparison result is denoted with the letter K (D, respectively). Thus in the memory cells of the instrument, "words" are formed which consist of the letters K, N, D.
In our model of the instrument, sets of three intervals are consecutively measured, so that each word consists of three letters. The "words" are put together into "diagnoses." It is not hard to see that there may be only 27 different three-letter words. Thus, for example, the words NKD and KDN are combined into the "diagnosis" "extrasystole with a compensatory pause," and the words NKK, KKN, KNK, KKK are combined into a single "diagnosis." All diagnoses except for "normal" (eight in number) are punched on a tape. The code of the diagnoses is shown in Table XXVI. The words are combined into diagnoses in such a way that, when the origin of the word formation is changed, the diagnosis remains unchanged. Thus, for example, Fig. 43 shows two electrocardiograms referring to the same diagnosis: "interpolated extrasystole" (diagnosis No. 4: NKK, KKN).
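The letter coding and the grouping of words into diagnoses can be sketched in a few lines; the function names and the tolerance parameter below are illustrative conveniences, not a description of the instrument's actual circuitry:

```python
# Illustrative sketch of the Ritm-1 letter coding (not the device's logic).
# Each R-R interval is compared with a reference length; deviations within
# 25% code as "N" (normal), shorter intervals as "K", longer ones as "D".

def code_interval(interval, reference, tol=0.25):
    """Classify one cardiac-cycle length against the running reference."""
    if interval < reference * (1 - tol):
        return "K"   # short
    if interval > reference * (1 + tol):
        return "D"   # long
    return "N"       # normal

def code_word(intervals, reference):
    """Three consecutive intervals form one three-letter 'word'."""
    return "".join(code_interval(t, reference) for t in intervals)

def cyclic_class(word):
    """Words that are cyclic shifts of one another give the same diagnosis."""
    shifts = [word[i:] + word[:i] for i in range(len(word))]
    return min(shifts)   # canonical representative of the class

# Example: with reference 0.8 s, the triple below codes as "NKK",
# whose cyclic class also contains "KKN" and "KNK".
word = code_word([0.82, 0.55, 0.50], 0.8)
```

Note that the 27 three-letter words fall into 11 cyclic classes (three constant words plus eight classes of three), so the instrument's grouping into "normal" plus eight diagnoses merges some of these classes.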
212
Biological Systems and Mathematical Models in Biologj
Table XXVI

Diagnosis No.   Words
Normal          NNN
1               NNK, NKN, KNN
2               NND, NDN, DNN
3               KDN, NKD, DNK
4               NKK, KKK, KKN, KNK
5               DND, DDD, NDD, DDN
6               DKK, KKD, KDK
7               DDK, DKD, KDD
8               KND, NDK, DKN
Simultaneously with the issuing of a "diagnosis," the accumulation of the next "word" begins, so that diagnoses are given every three cardiac cycles. When the "normal" diagnosis is given, the registering mechanism is switched off. It will be noted that slow changes in the frequency of the heart's contractions, which are not accompanied by impairments of rhythmicity, are diagnosed as "normal" owing to the fine adjustment of the standard time interval. Figure 44 shows a block diagram of the instrument. The preliminary amplification block amplifies the signal at the output of the electrocardiograph. The following block forms impulses which are then sent to a circular trigger register on four cold-cathode thyratrons. At each moment of time one of the four thyratrons is excited, and each impulse received deactivates the thyratron which was excited and activates the next one. Thus, each thyratron conducts current for the duration of one cardiac cycle. This makes it possible to measure the durations of three consecutive cardiac cycles from the charges accumulated on capacitors (the accumulation block). A special block, designed to develop the standard interval, produces a voltage with which the voltages on the capacitors are compared (in the comparison block).
Figure 43
213
Bioelectric Control and Diagnostics of States
Figure 44. [Block diagram of the instrument; the recoverable labels include the word register, malfunction monitor, decoder, measurement block, diagnosis register, and recording output.]
After the comparison is made, words are formed in the word register and are then transformed into diagnoses by the decoder. The diagnosis register controls the output block. In addition to the blocks indicated, there are time-measurement blocks and a block designed to signal any malfunction. The control block ensures that all the blocks in the instrument work in the correct sequence. The Ritm-1 device was tested under clinical conditions at the Institute of Cardiovascular Surgery, and its operating characteristics were judged to be satisfactory. A device for monitoring changes in the frequency and rhythmicity of breathing was constructed along the same principles. It registers changes in the breathing frequency if they fall within normal limits, and gives a warning signal if the breathing frequency deviates beyond those limits. A warning signal is also given if the breathing depth is lowered or the breathing stops, and also when the instrument suffers a malfunction. The input to the instrument is an electric signal from a carbon-resistance sensor which measures the circumference of the patient's chest. The sensor signal is amplified and sent to a counter. The counter switches on a time-measurement block with every 10th breathing cycle. The instrument compares the length of 10 breathing cycles with a reference length. The time is measured from the charge accumulated on a capacitor. If
the voltage on the capacitor differs from the reference voltage by more than ±25%, a warning signal is issued. This group also includes another instrument, which was developed at the Biophysics Institute of the Academy of Sciences of the USSR by Yu. K. Azarov et al. with our participation.³ The instrument is designed to analyze the ECG continuously and automatically detect changes in the location of the S-T interval on the ECG. When the S-T interval is displaced relative to the isoline beyond the established tolerance, the instrument automatically switches on an electrocardiograph, records for 10 sec, and simultaneously turns on a sound signal which remains on as long as changes in the S-T interval are taking place. The instrument analyzes each cardiac cycle, but gives a warning only after 10-15 consecutive cardiac cycles involving changes in the location of the S-T interval. This instrument is presently being tested. Like Ritm-1, it is protected against interference, so that artifacts are not produced. The instruments also signal any malfunction in their systems. The experience gained in developing these instruments permits us to make certain generalizations with respect to the principles of writing medical programs for automatic diagnostic instruments, and allows us to make statements about bioelectric control systems. Individual variations of physiological indicators under normal conditions make it difficult, and sometimes impossible, to judge deviations using absolute indicators. In this connection, the instrument should measure deviations from the individual norm of the given subject. Thus, in those cases where the instrument is going to be used to monitor the physiological state of healthy persons, it is desirable for the instrument to establish the norm automatically and remember it. In a clinic the "norm" may be established by a doctor. Accordingly, significant deviations should be reported as a percentage of the value of the individual norm.
It is important that the normal region of variation of the individual parameters reflect not only the value of each parameter established under conditions of relative rest, but also deviations under conditions of suitable functional stresses (stress-norm). Automatic diagnostic devices may operate according to two different programs. The simplest (so-called "rigid") program is based on the observation of definite physiological parameters and their groups, and is designed to discover and establish the existence and timing of a deviation of any parameter from a predetermined range. Instruments using this program are relatively economical, but require qualified individual readjustment for

³ Authorship certificate No. 182849.
each patient. (The physician must specify the allowable limits of variation of each parameter, starting from all the available information.) In addition, these instruments are either too crude or react to admissible and completely safe changes in the state of the patient in the same way as to dangerous changes. The parameters used by a "rigid" program are the following: a lowering of the heartbeat frequency below 40/min; an increase of the heartbeat frequency above 160/min; cessation of breathing for more than 30 sec; an increase in the breathing frequency above 35/min. The limits of a dangerous state generally change with time. Such variations are taken into account by instruments which operate on the so-called "flexible" program. At the basis of this program lies a comparison of any given parameter at a given moment of time with the average value of the parameter computed over a sufficiently long period preceding that moment. If the investigated parameters include the rates of change of individual physiological indicators, then dangerous changes of state may be predicted well in advance. However, this improvement in diagnosis is achieved only at the expense of a fairly significant (sometimes even considerable) increase in the complexity of the instrument. An instrument operating according to a flexible program does not need qualified personnel for its adjustment, since the necessary critical values of the parameters are found by the instrument itself. It was found useful to combine blocks operating on both programs in one instrument, and desirable to record each case of a dangerous condition (noting its time). When the instrument registers a dangerous state, a reliable system to warn the medical personnel is needed. In addition, it is then desirable to register the curves of the time variation of the ECG, EEG, breathing, and arterial pressure.
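The two programs can be contrasted in a short sketch. The "rigid" thresholds below are those quoted in the text; the function names and the 25% tolerance of the flexible check are illustrative assumptions of ours:

```python
# Sketch of the two diagnostic programs described in the text.
# The "rigid" program checks fixed limits; the "flexible" program compares
# each value with the running average of the preceding observations.

def rigid_alarm(heart_rate, breath_rate, apnea_seconds):
    """Fixed limits quoted in the text for a 'rigid' program."""
    return (heart_rate < 40 or heart_rate > 160
            or breath_rate > 35 or apnea_seconds > 30)

def flexible_alarm(value, history, tol=0.25):
    """Alarm if the value deviates from the mean of prior readings by > tol."""
    if not history:
        return False
    mean = sum(history) / len(history)
    return abs(value - mean) > tol * mean
```

The flexible check needs no preset limits, at the cost of storing the history of measurements, which mirrors the trade-off described in the text.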
This accumulates information during the period when the patient's state worsens. A physician, alerted by a signal from the instrument, can use the preliminary information of the diagnostic system of the instrument, as well as recordings from the period of relapse, for taking urgent measures. It is necessary to keep in mind that different physiological parameters require different measurement times; at the same time the instrument should continuously show the diagnosis. To reconcile the indices of various parameters, the results of all previous measurements must be stored in the memory. Changes in the diagnosis should be shown by the instrument within a fairly short time. Operative control over the physiological state could be achieved using relatively few parameters, such as the frequency of the pulse, breathing, arterial blood pressure, and changes in the EEG. At the same time, to use the instrument more universally, it must be provided with additional channels for the input of further information about body temperature, the chemical composition of the exhaled air, the venous blood pressure, blood pH, and so forth. The rapid rate of information processing by the logical unit of the instrument makes it advisable to use it in conjunction with a special commutation system for simultaneous monitoring of changes in the physiological states of many patients or subjects. The central problem in writing a medical program is the choice of controlled parameters, the determination of the permissible ranges of variation of the individual parameters, and the preparation of diagnostic complexes of symptoms. To establish the informative significance of various parameters, the diagnostic meaning of their different values and deviations, and significant correlations between the deviations of various parameters, it is necessary to perform very laborious investigations both in the clinic, where acute pathological states occur spontaneously in patients, and in physiology laboratories, where they can be simulated in experiments on animals and, in certain cases, in experiments on healthy persons. It is advisable in the course of such work to use a monitoring apparatus to gather information; this type of apparatus speeds up the collection of the information that is necessary to write a program. In conclusion we should mention the necessity of very high reliability for such apparatus. This requires dedicated work, primarily to improve the construction of sensors capable of transmitting information faultlessly for long times and under complex conditions (movements of patients). To prevent any injurious artificial signals from being generated, it is advisable to have a system for automatic switch-off of individual channels in the case of artifacts.
There are reasons for assuming that at the present time it is already advisable to begin accumulating clinical experience in the use of automatic devices for diagnosing acute pathological states. The work should proceed in two directions: continuous monitoring of the basic physiological parameters of patients with the help of monitoring devices, for the purpose of gathering experience, and testing of programs already written on the basis of previously accumulated experience. Now let us present a few general ideas about bioelectric control. First, it should be pointed out that bioelectric control systems can make use of biopotentials generated by various tissues and organs. However, it makes sense to build special devices only when the physiological significance
of the corresponding bioelectric potentials has been sufficiently well investigated and when their sources are easily accessible. Apparently this is the reason why the potentials of the brain, the heart, and the skeletal muscles are more widely used than any others. When using bioelectric signals for control, the problem naturally arises of extracting the useful information contained in the signals. This problem is sometimes solved by directly measuring one or several parameters of a signal, for instance, its amplitude or power. In a number of other cases, the analysis is more complex and involves difficult and interesting problems related to pattern recognition. The use of bioelectric signals for control demands, in addition, that the analysis be very rapid, which imposes a further set of requirements on the corresponding techniques of analysis and the technical means employed. Results of this analysis can be effectively utilized for control only when the biological system does not change to any great extent during the time spent on the analysis. Therefore, bioelectric control systems are typically characterized by a mode of operation in which useful information obtained from a biological system is continuously extracted and analyzed, and the analysis results are continuously used for control. We note, in addition, that the methods of analysis and the corresponding technical means are determined not only by the character of the bioelectric potentials used, but also by the purpose of the system itself.
SUPPLEMENTARY ARTICLES
Certain Properties of Finite Graphs Related to the Transportation Problem¹
This chapter aims at clarifying certain elementary properties of finite graphs and their relationship to the problem of formulating minimum transportation plans. A number of the terms and definitions we use were taken from the literature [114, 180].

1. Considering only connected graphs, we shall define the distance r(A_i, A_j) between the vertices A_i and A_j of the graph as the minimum number of edges in a path connecting these vertices. The vertices A_i and A_j will be called neighboring if r(A_i, A_j) = 1 (i.e., if these vertices are the ends of a single edge). We set r(A_i, A_i) = 0. The weight P_{A_i} of a vertex A_i is defined as

\[ P_{A_i} = \sum_{j=1}^{n} r(A_i, A_j), \tag{1} \]

where n is the number of vertices in the graph (its power). The center of a graph is the vertex C \in (A_1, A_2, \ldots, A_n) such that

\[ P_C = \min(P_{A_1}, \ldots, P_{A_n}). \tag{2} \]
An edge is called cyclic if its removal does not impair the connectedness of a graph. A graph which does not contain cyclic edges is called a tree. Any connected subgraph of a tree is also called a tree. A graph containing cyclic edges is called a graph with cycles.
THEOREM 1. A tree of even power has at most two centers. A tree of odd power has exactly one center.

¹ Tsetlin [153] (Editor's note).
The proof of the theorem is based on the following remark. Let A_i and A_j be two neighboring vertices of a tree D_n (the subscript in the symbol of a tree or a graph indicates its power). When the edge l_{ij} which connects them is removed, the tree D_n decomposes into subtrees D_{n_1} \ni A_i and D_{n_2} \ni A_j. If P_{A_i} and P_{A_j} are, respectively, the weights of the vertices A_i and A_j in the tree D_n, then

\[ n_1 + n_2 = n, \qquad P_{A_i} - P_{A_j} = n_2 - n_1. \]

Furthermore, the weight of a graph will be defined as

\[ Q = \sum_{i=1}^{n} P_{A_i}. \tag{3} \]
Let F_j be the number of those edges of the tree D_n whose removal leads to separation of a subgraph of power j \le [n/2].²

THEOREM 2.

\[ P_C = \sum_{j=1}^{[n/2]} j F_j; \tag{4} \]

\[ Q = 2 \sum_{j=1}^{[n/2]} j(n-j) F_j. \tag{5} \]

Formula (4) follows from the fact that a point lying at a distance j from the center enters j subtrees of power \le [n/2] which can be obtained from D_n by removal of one edge. Equation (5) can be obtained if we note that, when one edge is removed from D_n, one can form j(n-j) pairs of points, one from each subtree thus formed. Theorem 2 implies the inequalities

\[ n P_C \le Q \le 2(n-1) P_C. \tag{6} \]
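Formulas (4) and (5) are easy to check numerically. The sketch below does so for a path on five vertices; the vertex labels and helper names are ours, introduced purely for illustration:

```python
# Numerical check of formulas (4) and (5) of Theorem 2 on a small tree
# (a path on five vertices). Brute-force distances via BFS.
from collections import deque

n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # a tree D_n of power n = 5
adj = {v: [] for v in range(n)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def dist_from(s):
    """BFS distances r(s, v) to every vertex v."""
    d = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in d:
                d[w] = d[v] + 1
                queue.append(w)
    return d

P = [sum(dist_from(v).values()) for v in range(n)]   # vertex weights, Eq. (1)
Q = sum(P)                                           # graph weight, Eq. (3)
Pc = min(P)                                          # weight of the center

def side_size(edge, start):
    """Size of the component containing `start` after removing `edge`."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if (v, w) != edge and (w, v) != edge and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen)

# F[j] = number of edges whose removal splits off a subtree of power j <= [n/2]
F = {}
for e in edges:
    j = min(side_size(e, e[0]), side_size(e, e[1]))
    F[j] = F.get(j, 0) + 1

assert Pc == sum(j * F[j] for j in F)               # formula (4)
assert Q == 2 * sum(j * (n - j) * F[j] for j in F)  # formula (5)
```

For this path F_1 = 2 and F_2 = 2, giving P_C = 1·2 + 2·2 = 6 and Q = 2(1·4·2 + 2·3·2) = 40, in agreement with the brute-force computation.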
We should also remark that these inequalities also hold for graphs with cycles, for which Theorems 1 and 2 do not hold.

2. Let us distribute integers (q_1, \ldots, q_n) over the vertices of the graph G_n. The number q_i will be called the load on the vertex A_i, and the set q = (q_1, \ldots, q_n) will be called the load on the graph G_n. An elementary transportation will be defined as a passage from the load q = (q_1, \ldots, q_i, \ldots, q_k, \ldots, q_n) to the load q' = (q_1, \ldots, q_i - 1, \ldots, q_k + 1, \ldots, q_n), where A_i and A_k are neighboring vertices. In other words, an elementary transportation is the displacement of a unit load along an edge of the graph.

² The symbol [a] denotes the largest integer contained in a.
It is obvious that if q = (q_1, \ldots, q_n) and p = (p_1, \ldots, p_n) are two loads on a graph, and if

\[ \sum_{i=1}^{n} q_i = \sum_{i=1}^{n} p_i, \]

then the load q can be transformed into the load p by applying a finite number of elementary transportations. A combination of elementary transportations which transforms a load q into a load p will be called a transportation plan for the graph G_n, and the number of elementary transportations in a given plan will be termed its price. A plan of minimum price will be called a minimum plan. Without loss of generality, we can assume that \sum_i q_i = 0 and p_1 = p_2 = \cdots = p_n = 0. Therefore in what follows we shall consider only plans of this type. A transportation plan is defined by specifying the numbers x_{ik} (i \ne k) of the elementary transportations along the edges l_{ik} of a graph in some direction (x_{ik} = -x_{ki}). The numbers x_{ik} should satisfy the equations

\[ \sum_{k} x_{ik} = q_i, \qquad i = 1, \ldots, n, \tag{7} \]
and a minimum plan of transports is one minimizing the expression

\[ \Phi = \sum_{(i,k)} | x_{ik} |. \tag{8} \]

For trees, the problem of formulating a minimum transportation plan is simple. Suppose that an edge l_{ik} subdivides a tree D_n into two subtrees D_{n_1} \ni A_i and D_{n_2} \ni A_k, where the tree D_{n_1} contains the vertices A_{i_1}, \ldots, A_{i_{n_1}}, and the tree D_{n_2} contains the vertices A_{j_1}, \ldots, A_{j_{n_2}}. Then n_1 + n_2 = n, and the minimum transportation plan is defined by the formula

\[ x_{ik} = \sum_{s=1}^{n_1} q_{i_s} = - \sum_{s=1}^{n_2} q_{j_s}. \tag{9} \]
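Formula (9) admits a direct computational reading: on a tree, the transport along each edge equals the total load of the subtree on one side of that edge. A minimal Python sketch (vertex labels and function names are ours, not the author's):

```python
# Sketch of the minimum transportation plan on a tree, following Eq. (9):
# the transport along an edge equals the total load of the subtree on one
# side of that edge (the total load over the whole tree being zero).

def tree_plan(n, edges, q):
    """Return {edge: transport} for loads q summing to zero."""
    assert sum(q) == 0
    adj = {v: [] for v in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    plan = {}
    for (a, b) in edges:
        # collect the subtree containing a after the edge (a, b) is removed
        seen = {a}
        stack = [a]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if (v, w) not in ((a, b), (b, a)) and w not in seen:
                    seen.add(w)
                    stack.append(w)
        plan[(a, b)] = sum(q[v] for v in seen)   # flow from a-side to b-side
    return plan

# Path 0-1-2 with loads (2, 0, -2): two units move along each edge,
# so the price of the plan is 4.
plan = tree_plan(3, [(0, 1), (1, 2)], [2, 0, -2])
```

The price of the plan is then the sum of the absolute transports, as in Eq. (8).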
A load will be called positive if, among the numbers q_1, \ldots, q_n, only one number q_{i_0} is negative. Let \bar{\Phi} denote the average price of the minimum transportation plans over all distributions of given numbers q_1, \ldots, q_n, forming a positive load, over the vertices of the graph G_n. Let \bar{\Phi}_C denote the average price of minimum plans over all such distributions of the numbers q_1, \ldots, q_n over the vertices of the graph in which the load q_{i_0} is located at its center. We have the relations

\[ \bar{\Phi} = \frac{Q}{n(n-1)} \sum_{i \ne i_0} q_i, \tag{10} \]

\[ \bar{\Phi}_C = \frac{P_C}{n-1} \sum_{i \ne i_0} q_i. \tag{11} \]
3. For graphs with cycles, the problem of formulating a minimum transportation plan is more complex. We shall give here an algorithm for setting up such a plan, which differs from the one proposed earlier by Kantorovich and Gavurin [93].

Lemma 1. Let

\[ \gamma_{a_1, a_2, \ldots, a_n}(x) = \sum_{j=1}^{n} | a_j - x |, \]

and let a_{i_1} \ge a_{i_2} \ge \cdots \ge a_{i_n}. The expression \gamma_{a_1, a_2, \ldots, a_n}(x) attains a minimum when the following condition is satisfied:

\[ a_{i_{[(n+1)/2]}} \ge x \ge a_{i_{[(n+2)/2]}}. \tag{12} \]

Lemma 2. Consider a cycle, i.e., a graph having vertices A_1, \ldots, A_n with loads q_1, \ldots, q_n and edges l_{12}, l_{23}, \ldots, l_{n1} connecting these vertices (Fig. 45). In accordance with condition (7) we set

\[ x_{k,k+1} = q_1 + q_2 + \cdots + q_k - x \quad (k = 1, \ldots, n-1), \qquad x_{n1} = -x. \]

A minimum plan of transportation is realized when x is chosen according to Lemma 1, where

\[ a_1 = q_1, \quad a_2 = q_1 + q_2, \quad \ldots, \quad a_{n-1} = q_1 + \cdots + q_{n-1}, \quad a_n = 0. \]
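Lemmas 1 and 2 can be sketched computationally. Condition (12) says that the cost sum |a_j − x| is minimized by taking x at a median of the a_j; the parametrization of the cycle flows by a single scalar x used below is one consistent reading of the lemma, with names of our choosing:

```python
# Sketch of the minimum plan on a cycle (Lemmas 1 and 2). The flows along
# the cycle 1-2-...-n-1 are parametrized by a single number x; the cost
# sum |a_j - x| is minimized by taking x at a median of the partial sums.

def cycle_plan(q):
    """Loads q on a cycle (summing to zero); returns the edge flows."""
    assert sum(q) == 0
    n = len(q)
    a = []
    s = 0
    for k in range(n - 1):
        s += q[k]
        a.append(s)          # a_k = q_1 + ... + q_k
    a.append(0)              # a_n = 0
    x = sorted(a)[(n - 1) // 2]   # a median minimizes sum |a_j - x|
    # flow on edge (k, k+1) is a_k - x; on the closing edge (n, 1) it is -x
    return [a_k - x for a_k in a]

# Cycle of 4 vertices, loads (1, 0, -1, 0): one unit must travel two edges,
# so the minimum price is 2.
flows = cycle_plan([1, 0, -1, 0])
cost = sum(abs(f) for f in flows)
```

Conservation at each vertex, i.e., condition (7), holds by construction: the difference of the flows on the two edges incident to vertex k equals q_k.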
Suppose that the numbers x_{ik} specifying a transportation plan for a cycle coincide with the numbers b_1, \ldots, b_n, where b_1 \ge b_2 \ge \cdots \ge b_n. A transportation plan for a cycle is called correct if we have the inequalities

\[ b_{[(n+1)/2]} \ge 0 \ge b_{[(n+2)/2]}. \tag{13} \]
Remark. Lemmas 1 and 2 imply that a minimum transportation plan for a cycle may always be chosen in such a way that no transportation is made along at least one edge.

THEOREM 3. A minimum transportation plan on a given graph with cycles induces correct transportation plans for each of the cycles contained in the graph. Conversely, if the transportation plans for all of the cycles contained in the graph are correct, then the transportation plan for the graph is a minimum plan.
The algorithm for formulating a minimum transportation plan for a given graph with a load is the following. The transports along the edges that are not cyclic are found from Eq. (9). The problem then reduces to setting up a minimum transportation plan (with loads changed accordingly) for a graph (possibly nonconnected) all of whose edges are cyclic. Transports along the edges of this graph are determined in the same way as in Lemma 2, and the process may be repeated for some cycles. A minimum plan is obtained when the transportation plans for all cycles contained in the graph become correct.

THEOREM 4. For a given load q on a graph G_n, there is a tree D_n \subset G_n such that the minimum transportation plan for the graph G_n coincides with a minimum transportation plan for the tree D_n.
Application of Matrix Calculus to the Synthesis of Relay-Contact Circuits¹
Algebraic methods of analysis and synthesis of series-parallel relay-contact circuits were first developed by Shestakov [161]. Lunts in his papers [119, 120] applied matrix methods to the most general relay-contact networks, but his techniques were too cumbersome when applied to problems of the synthesis of the relay-contact networks which are most important in practice. In the present chapter we shall develop a simple matrix method which is easy to interpret physically and effective in solving a number of problems in the area of relay-contact network synthesis. Let x, y, \ldots denote contacts which are closed only when the values of the independent variables x, y, \ldots are equal to unity, and let \bar{x}, \bar{y}, \ldots denote contacts which are closed only when the independent variables x, y, \ldots have values equal to zero. All independent variables are defined by the corresponding relays, switches, etc., and assume only two values: 0 and 1.
1. Consider an electrical network \tilde{A} having (p+1) input busbars and (p+1) output busbars. Both sets will be numbered 0, 1, \ldots, p. If a voltage (measured with respect to some common conductor not included in the network, for example, relative to the ground) is supplied to the ith input busbar, we shall say that the ith component of the input vector² is equal to unity. Thus, the input vector can be written as \Lambda = (\lambda_0, \ldots, \lambda_p), where \lambda_i = 1 if the voltage is supplied to the ith input busbar, and \lambda_i = 0 otherwise. Similarly, we shall assume that the

¹ Tsetlin [149] (Editor's note).
² We use the word "vector" in view of the external resemblance of the set of numbers defining the input voltages to ordinary vectors in (p+1)-dimensional space.
kth component of the output vector is equal to unity if the voltage is supplied to the kth output busbar of the network \tilde{A}, and equal to zero in the opposite case. The output vector may be written in the form M = (\mu_0, \mu_1, \ldots, \mu_p), where the coordinates \mu_k assume the values 0 or 1. The sum of two vectors \Lambda_1 = (\lambda_0^{(1)}, \lambda_1^{(1)}, \ldots, \lambda_p^{(1)}) and \Lambda_2 = (\lambda_0^{(2)}, \lambda_1^{(2)}, \ldots, \lambda_p^{(2)}) will be defined as the vector \Lambda_3 = (\lambda_0^{(3)}, \lambda_1^{(3)}, \ldots, \lambda_p^{(3)}) such that \lambda_i^{(3)} = \lambda_i^{(1)} + \lambda_i^{(2)}, where the sum is formed in the Boolean sense (see the work of Shestakov [161]).³ Furthermore, a vector will be called simple if no more than one of its components is equal to one. The vectors thus introduced span a space \mathfrak{M} which is closed under the operation of addition and each element of which may be decomposed into a sum of simple vectors.

Let us return to the network \tilde{A}. Its matrix A = \| a_{ik} \| will be formed in the following way. If the ith input busbar and the kth output busbar are joined by an (ideal) conductor, then a_{ik} is set equal to unity. If, however, the conductivity between the ith input and kth output busbars is zero, then a_{ik} is set equal to zero. If the input vector is \Lambda = (\lambda_0, \lambda_1, \ldots, \lambda_p), then the kth component of the output vector can be calculated according to the formula

\[ \mu_k = \sum_{i=0}^{p} \lambda_i a_{ik}. \tag{1} \]

In this formula addition is performed in the Boolean sense. The physical meaning of (1) is that voltage appears on the kth output busbar if it is supplied to at least one input busbar connected with that busbar. Equation (1) can be written briefly as

\[ M = \Lambda A. \tag{2} \]
The matrix A defines on the set \mathfrak{M} a transformation which is in a certain sense analogous to a linear transformation in (p+1)-dimensional linear space. If the network \tilde{C} is obtained by a serial (cascade) connection of the networks \tilde{A} and \tilde{B} (we shall write this as the equality \tilde{C} = \tilde{A}\tilde{B}), then the matrix C = AB may be associated with it, where⁴

\[ c_{ik} = \sum_{j=0}^{p} a_{ij} b_{jk}. \tag{3} \]

³ That is, we consider that 1 + 1 = 1 (Editor's note).
⁴ The addition is understood in the Boolean sense.
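Equations (1)-(3) amount to matrix algebra over the Boolean semiring (where 1 + 1 = 1). A small illustrative sketch, with function names of our choosing:

```python
# Sketch of the Boolean matrix calculus of Eqs. (1)-(3): the output vector
# is M = ΛA, and cascading two networks multiplies their matrices, with
# both addition and multiplication taken in the Boolean sense (1 + 1 = 1).

def apply(lam, A):
    """mu_k = OR_i (lambda_i AND a_ik)  --  Eq. (1)."""
    return [int(any(lam[i] and A[i][k] for i in range(len(lam))))
            for k in range(len(A[0]))]

def cascade(A, B):
    """c_ik = OR_j (a_ij AND b_jk)  --  Eq. (3)."""
    return [[int(any(A[i][j] and B[j][k] for j in range(len(B))))
             for k in range(len(B[0]))]
            for i in range(len(A))]

# A connects busbar 0 -> 0 and 1 -> 1 (short circuit); B swaps the busbars.
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
```

Cascading the identity network with the swapping network routes the voltage on busbar 0 to busbar 1, matching the physical reading of Eq. (3).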
When the numbers of input and output busbars differ by q, we add q columns or rows consisting of zeros to the matrix. One can also use rectangular matrices in which the numbers of rows and columns differ by q. Let us give several examples that illustrate the matrix notation as applied to networks (Fig. 46):

(a) Short-circuit network:

\[ E_{p+1} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \]

(c) Commutator:
A network \tilde{A} will be called simple if each input busbar is connected to no more than one output busbar. A network obtained by a cascade connection of simple networks is also simple. A simple network \tilde{A} is associated with a simple matrix A, which has no more than one element equal to unity in each row. Simple
matrices form a semigroup with multiplication determined by Eq. (3). In the present paper we shall limit ourselves to problems involving a synthesis of simple relay-contact networks. We shall study two such problems.
2. Let us construct a network \tilde{C}_m(x_1, x_2, \ldots, x_n) performing addition relative to an arbitrary modulus m for n variables x_1, x_2, \ldots, x_n, defined as follows. Let r out of the n variables x_1, x_2, \ldots, x_n be equal to unity. Then the voltage is transferred from the kth input busbar to the sth output busbar (s = 0, 1, \ldots, m-1), where s is the sum modulo m of the numbers r and k. Symbolically, addition modulo m may be written as

\[ c_{ij} = \delta(i \oplus r, \; j), \tag{4} \]

where r is the number of variables equal to unity, the sign \oplus signifies addition modulo m, and \Lambda and M are simple vectors. The network \tilde{C}_m(x_1, x_2, \ldots, x_n) is simple and has m input and m output busbars (0, 1, \ldots, m-1). The network \tilde{C}_m(x_1, x_2, \ldots, x_n) may be associated with a matrix of mth order C_m(x_1, x_2, \ldots, x_n). We shall try to express it as a product of n identical matrices

\[ C_m(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} B_m(x_i). \tag{5} \]
Using the definition of the network Cm(l,0, . . .,0) = Cm(O,1, . . .,0) = Cm(O,0, . . ., 1). We let Bm(0)= Em. Then, if r of the n variables xl, . . . ,x, are different from zero, then C, = Bh(1). Using the definition of addition modulo m, we have the following relation:
Bh*
(1)
=
Bh(1).
(6)
This relation is satisfied by the transformation involving a cyclic permutation of simple vectors, which changes a simple vector whose ith component is equal to unity into a simple vector whose (i + 1)th component is equal to unity; a vector whose (m - 1)th component is equal to unity is changed into a vector whose zeroth component is equal to unity. This transformation has the matrix

$$B_m(1) = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
Combining the expressions for B_m(0) and B_m(1), we obtain

$$B_m(x) = \bar{x}E_m + xB_m(1) = \begin{pmatrix} \bar{x} & x & 0 & \cdots & 0 \\ 0 & \bar{x} & x & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & x \\ x & 0 & 0 & \cdots & \bar{x} \end{pmatrix} \qquad (7)$$
The network B_m(x) is shown in Fig. 47. Now let a number X_i be written in binary form, and let x_{i,k} be the digit standing in the kth place of X_i. According to binary notation, a 1 in the kth place corresponds to 2^k ones in the zeroth place. Therefore, if a 1 in the zeroth place is associated with the matrix B_m(1), then a 1 in the qth place will be associated with the matrix B_m^{2^q}(1).
Figure 47
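The construction of this section can be sketched numerically. The following is a minimal illustration (our own, not taken from the paper): B_m(x) is realized as the identity matrix for x = 0 and as a cyclic-shift matrix for x = 1, and cascading the cell matrices over all operands and binary places yields the mod-m adder.

```python
import numpy as np

def cyclic_shift(m, s):
    """Permutation matrix of the cyclic shift by s: busbar k goes to busbar (k + s) mod m."""
    P = np.zeros((m, m), dtype=int)
    for k in range(m):
        P[(k + s) % m, k] = 1
    return P

def B(m, x, s=1):
    """Cell matrix: identity when x = 0, shift by s when x = 1 (B_m^s(x))."""
    return np.eye(m, dtype=int) if x == 0 else cyclic_shift(m, s)

def adder_matrix(m, operands):
    """Cascade of cell matrices over all operands and binary places.

    Each operand is a list of binary digits, least significant first;
    a 1 in place p contributes a shift by 2**p."""
    C = np.eye(m, dtype=int)
    for bits in operands:
        for p, bit in enumerate(bits):
            C = B(m, bit, 2 ** p) @ C
    return C

# three 2-digit numbers 3, 2, 1; their sum 6 is congruent to 1 modulo 5
C = adder_matrix(5, [[1, 1], [0, 1], [1, 0]])
e0 = np.zeros(5, dtype=int); e0[0] = 1
print(np.argmax(C @ e0))  # voltage applied to busbar 0 emerges on busbar 1
```

Since every cell matrix here is a permutation, the cascade is again a permutation, mirroring the fact that Σ_m is a simple network.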
Thus, if we are given n (q + 1)-digit numbers X_1, X_2, ..., X_n, then the matrix corresponding to addition modulo m can be written as

$$C(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} \prod_{p=0}^{q} B_m^{2^p}(x_{i,p}). \qquad (8)$$

By virtue of (8), the matrix C(X_1, X_2, ..., X_n) can be written as a product of n(q + 1) matrices B_m^{2^p}(x_{i,p}), i = 1, 2, ..., n; p = 0, 1, ..., q,
where each one depends only on one argument x_{i,p}, and consequently may consist of elements of the form 1, 0, x_{i,p}, x̄_{i,p}. Note that the number of elements that are not identically equal to zero is the same for all these matrices.

3. Consider now a network which depends not only on the number of nonzero arguments, but also on their order. We shall say that an inversion in the order of the arguments x_1, x_2, ..., x_n occurs if x_i > x_{i+1} (i.e., if x_i = 1, x_{i+1} = 0). We shall construct a scheme D(x_1, x_2, ..., x_n) that disconnects the electrical network in case there is at least one inversion present. Just as in the previous case, we want to express D(x_1, x_2, ..., x_n) as a product of identical matrices:

$$D(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} T(x_i). \qquad (9)$$

Using the definition of the network, we have

$$T^2(1) = T(1), \qquad T^2(0) = T(0), \qquad T(1)\,T(0) = 0. \qquad (10)$$
These relations are satisfied by the following simple matrices of second order :
A diagram for the case n = 5 is shown in Fig. 48. It is obvious that the scheme disconnects the network when there is at least one inversion; if there is no inversion, it supplies voltage to the zeroth busbar if x_n = 0, and to the first busbar if x_n = 1. Using cell matrices, it is easy to construct a scheme for addition of the number of inversions modulo m, and so forth.
Figure 48
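Relations (10) are easy to realize concretely. The pair of second-order matrices below is an illustrative choice of ours (the concrete entries are not quoted from the paper): voltage on busbar 0 means "no 1 seen yet," voltage on busbar 1 means "a 1 has been seen," and a 0 arriving after a 1 opens the circuit.

```python
import numpy as np

# An illustrative pair of second-order 0-1 matrices satisfying relations (10):
# T(1)^2 = T(1), T(0)^2 = T(0), T(1)T(0) = 0, while T(0)T(1) != 0.
T = {0: np.array([[1, 0], [0, 0]]),   # keeps busbar 0, disconnects busbar 1
     1: np.array([[0, 1], [0, 1]])}   # routes either busbar to busbar 1

def D(xs):
    """Product T(x1) T(x2) ... T(xn) of the inversion-detecting scheme."""
    M = np.eye(2, dtype=int)
    for x in xs:
        M = M @ T[x]
    return M

e0 = np.array([1, 0])        # voltage enters on busbar 0 (row vector)
print(e0 @ D([0, 0, 1, 1]))  # no inversion: [0 1], output on busbar 1
print(e0 @ D([0, 1, 0, 1]))  # inversion (a 1 followed by a 0): [0 0]
```

A zero product means the cascade is open: any inversion inserts the factor T(1)T(0) = 0 somewhere in the product.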
The matrix elements may be not only independent variables but also arbitrary functions of them assuming two values (0 and 1), which makes it possible to widen the range of problems that can be treated by matrix techniques.
A Bibliography of Papers by M. L. Tsetlin
1. Finite dimensional representations of the group of unimodular matrices. Jointly with I. M. Gel'fand, Dokl. Akad. Nauk SSSR 71, 825-828 (1950).
2. Finite-dimensional representations of the group of orthogonal matrices. Jointly with I. M. Gel'fand, Dokl. Akad. Nauk SSSR 71, 1017-1020 (1950).
3. A use of matrices in a synthesis of relay-contact networks. Dokl. Akad. Nauk SSSR 86, 525-528 (1952).
4. Quantities with an anomalous parity and a possible explanation of the parity degeneracy of K mesons. Jointly with I. M. Gel'fand, Zh. Eksp. Teor. Fiz. 31, Pt. 6 (12), 1107-1109 (1956).
5. A bioelectric control system. Jointly with A. Ye. Kobrinskiy, M. G. Breydo, V. S. Gurfinkel', A. Ya. Sysinyi, and Ya. S. Yakobson, Dokl. Akad. Nauk SSSR 117, 78-80 (1957).
6. A matrix method of analyzing and synthesizing electron-impulse and relay-contact (nonprimitive) networks. Dokl. Akad. Nauk SSSR 117, 979-982 (1957).
7. A composition and a decomposition of nonprimitive networks. Dokl. Akad. Nauk SSSR 118, 488-491 (1958).
8. O neprimitivnykh skhemakh (Nonprimitive networks). Probl. Kibern. No. 1, 23-45 (1958).
9. A matrix method of synthesizing multitact relay-contact communications and control networks. Jointly with G. S. Eydus, Elektrosvyaz 12, 41-48 (1958).
10. An algebraic method of synthesizing trigger cell networks. Jointly with G. S. Eydus, Izv. Vyssh. Ucheb. Zaved. Radiofiz. 1, 166-176 (1958).
11. A mock-up of a mechanical drive for a prosthesis, controlled by muscular biocurrents. Jointly with A. Ye. Kobrinskiy, V. S. Gurfinkel', M. G. Breydo, A. Ya. Sysinyi, and Ya. S. Yakobson, Sci. Session Central Sci. Res. Inst. for Prosthetics and Orthopedic Appliances, 6th, 1958, pp. 153-157.
12. A bioelectric control system. Jointly with M. G. Breydo, V. S. Gurfinkel', A. Ye. Kobrinskiy, A. Ya. Sysinyi, and Ya. S. Yakobson, Probl. Kibern. No. 2, 203-212 (1959).
13. A method for the bioelectric control of mechanisms and devices. Jointly with V. S. Gurfinkel' et al., Invention, Author's Certificate No. 110657, 1958.
14. A servomechanism controlled by muscular biocurrents. Jointly with A. Ye. Kobrinskiy, M. G. Breydo, V. S. Gurfinkel', Ye. P. Polyanyi, Ya. L. Slavutskiy, A. Ya. Sysinyi, and Ya. S. Yakobson, Invention, Author's Certificate No. 118581, 1958.
15. Work of the Central Scientific Institute of Prosthetics and Orthopedic Appliances in the area of bioelectric control systems. Jointly with A. Ye. Kobrinskiy, V. S. Gurfinkel', Ya. L. Slavutskiy, M. G. Breydo, Ya. S. Yakobson, A. Ya. Sisynyi, and Ye. P. Polyanov, Sci. Session Central Sci. Inst. for Prosthetics and Orthopedic Appliances, 7th, 1959, pp. 125-132.
16. Two-channel ferrotransistor transformer networks and an algebraic method for their synthesis. Jointly with L. M. Shekhtman, Probl. Kibern. No. 2, 139-149 (1959).
17. A driven multivibrator with an electronic control. Jointly with A. F. Ivanov, Izv. Vyssh. Ucheb. Zaved. Radiofiz. 2, 133-134 (1959).
18. Problems of bioelectric control. Jointly with V. S. Gurfinkel', Theses, Meeting Physiol. Soc., 9th, 1959.
19. Certain properties of finite graphs related to the transportation problem. Dokl. Akad. Nauk SSSR 129, 747-750 (1959).
20. A method of cardiography. Jointly with V. S. Gurfinkel' et al., Invention, Author's Certificate No. 123634, 1959.
21. Two-channel ferrotransistor networks with a nonperiodic readout. Jointly with L. M. Shekhtman, Probl. Kibern. No. 3, 89-94 (1960).
22. Certain problems of the physical realization of devices performing logical functions. Jointly with L. M. Shekhtman, in "Primeneniye Logiki v Nauke i Tekhnike" (Use of Logic in Science and Technology), pp. 377-393. Akad. Nauk SSSR, Moscow, 1960.
23. Continuous models of control systems. Jointly with I. M. Gel'fand, Dokl. Akad. Nauk SSSR 131, 1242-1245 (1960).
24. A use of the bioelectric signals from the heart for purposes of control. Jointly with V. S. Gurfinkel', V. B. Malkin, and A. V. Khudyakov, in "Voprosy Patologii i Regeneratsii Organov Krovoobrashcheniya i Dykhaniya" (Problems of the Pathology and Regeneration of Blood Circulation and Breathing Organs), pp. 23-32. SD of AS USSR, Novosibirsk, 1961.
25. A methodology for the electric stimulation of the heart. Jointly with V. S. Gurfinkel' and V. B. Malkin, Biofizika 6, 125-126 (1961).
26. X-ray heart analysis at arbitrarily chosen phases of the cardiac cycle. Jointly with V. S. Gurfinkel', V. B. Malkin, and A. V. Khudyakov, Vestn. Rentgenol. Radiol. 36, 25-28 (1961).
27. The principle of a nonlocal search in automatic optimization systems. Jointly with I. M. Gel'fand, Dokl. Akad. Nauk SSSR 137, 295-298 (1961).
28. Certain ideas about the tactics of the construction of movements. Jointly with I. M. Gel'fand and V. S. Gurfinkel', Dokl. Akad. Nauk SSSR 139, 1250-1253 (1961).
29. A device for registering and diagnosis of disturbances in the rhythmic activity of the heart. Jointly with Yu. S. Gorokhov, A. P. Matusova, V. A. Mel'nikova, G. M. Tarantovich, and V. M. Shabashov, Izv. Vyssh. Ucheb. Zaved. Radiofiz. 4, 165-172 (1961).
30. Certain problems related to the behavior of finite automata. Dokl. Akad. Nauk SSSR 139, 830-833 (1961).
31. Behavior of finite automata in random media. Avtomat. Telemekh. 22, 1345-1354 (1961).
32. A method of blood sampling from the heart's cavities and large vessels. Jointly with V. S. Gurfinkel' et al., Invention, Author's Certificate No. 136011, 1961.
33. Certain methods of controlling complex systems. Jointly with I. M. Gel'fand, Usp. Mat. Nauk 17, 3-26 (1962).
34. Certain problems related to the application of biocurrents for controlling medical instruments. Jointly with V. S. Gurfinkel', V. B. Malkin, and A. V. Khudyakov, Theses of lectures, All-Union Conf. Appl. of Radioelectron. in Biol. and Med., 1962, pp. 73-74.
35. On control tactics applicable to complex systems in connection with physiology. Jointly with I. M. Gel'fand and V. S. Gurfinkel', in "Biologicheskiye Aspekty Kibernetiki" (Biological Aspects of Cybernetics), pp. 66-73. Akad. Nauk SSSR, Moscow, 1962.
36. Learning by stochastic automata. Jointly with V. I. Varshavskiy and I. P. Vorontsova, in "Biologicheskiye Aspekty Kibernetiki" (Biological Aspects of Cybernetics), pp. 192-197. Akad. Nauk SSSR, Moscow, 1962.
37. On the synchronization of motor units and the model concepts related to it. Jointly with I. M. Gel'fand, V. S. Gurfinkel', Ya. M. Kots, and M. L. Shik, Biofizika 8, 475-486 (1963).
38. Research in posture activity. Jointly with I. M. Gel'fand, V. S. Gurfinkel', Ya. M. Kots, V. I. Krinskiy, and M. L. Shik, Theses, Conf. Coordination of Motion, edition of the Inst. of Higher Nervous Activity of the Acad. of Sci. of the USSR, Moscow, 1963.
39. A report on a game of a finite automaton against a partner using a mixed strategy. Dokl. Akad. Nauk SSSR 149, 52-53 (1963).
40. Examples of automaton games. Jointly with V. Yu. Krylov, Dokl. Akad. Nauk SSSR 149, 284-287 (1963).
41. On automaton games. Jointly with V. Yu. Krylov, Avtomat. Telemekh. 24, 975-987 (1963).
42. Finite automata and a simulation of simple forms of behavior. Usp. Mat. Nauk 18, No. 4 (112), 3-28 (1963).
43. On certain classes of games and automaton games. Jointly with I. M. Gel'fand and I. I. Pyatetskiy-Shapiro, Dokl. Akad. Nauk SSSR 152, 845-848 (1963).
44. An example of a game for many identical automata. Jointly with V. Yu. Krylov and S. L. Ginzburg, Avtomat. Telemekh. 25, 668-672 (1964).
45. Homogeneous automaton games and their simulation on a digital computer. Jointly with V. I. Bryzgalov, I. M. Gel'fand, and I. I. Pyatetskiy-Shapiro, Avtomat. Telemekh. 25, 1572-1580 (1964).
46. An investigation of posture activity. Jointly with I. M. Gel'fand, V. S. Gurfinkel', Ya. M. Kots, V. I. Krinskiy, and M. L. Shik, Biofizika 9, 710-717 (1964).
47. The behavior of automata in nonperiodic random media and the problem of synchronization in the presence of noise. Jointly with V. I. Varshavskiy and M. I. Meleshina, Probl. Peredachi Inform. 1, 65-71 (1965).
48. Some examples of a simulation of the collective behavior of automata. Jointly with S. L. Ginzburg, Probl. Peredachi Inform. 1, 54-62 (1965).
49. Certain problems related to the automatic diagnosis of acute pathological states. Jointly with V. S. Gurfinkel' and V. B. Malkin, in "Kibernetika v Klinicheskoy Meditsine" (Cybernetics in Clinical Medicine), Leningrad, 1965.
50. On the mathematical modeling of the mechanisms of the central nervous system. Jointly with I. M. Gel'fand, in "Modeli Strukturno-Funktsional'noy Organizatsii Nekotorykh Biologicheskikh Sistem" (Models of the Structurally-Functional Organization of Certain Biological Systems), pp. 9-26. Nauka, Moscow, 1966.
51. Certain problems related to the investigation of movements. Jointly with I. M. Gel'fand, V. S. Gurfinkel', and M. L. Shik, in "Modeli Strukturno-Funktsional'noy Organizatsii Nekotorykh Biologicheskikh Sistem" (Models of the Structurally-Functional Organization of Certain Biological Systems), pp. 264-276. Nauka, Moscow, 1966.
52. Restructuring before a movement. Jointly with V. S. Gurfinkel', Ya. M. Kots, V. I. Krinskiy, Ye. I. Pal'tsevyi, A. G. Fel'dman, and M. L. Shik, in "Modeli Strukturno-Funktsional'noy Organizatsii Nekotorykh Biologicheskikh Sistem" (Models of the Structurally-Functional Organization of Certain Biological Systems), pp. 292-301. Nauka, Moscow, 1966.
53. A simple model for the generation of impulses by a nerve cell. Jointly with Yu. B. Kotov, Biofizika 11, 547-549 (1966).
54. Computer simulation of the functioning of a motoneuron pool. Jointly with Yu. B. Kotov, Probl. Kibern. No. 20, 9-18 (1968).
55. A construction of stochastic automata. Jointly with S. L. Ginzburg, Probl. Kibern. No. 20, 19-26 (1968).
56. An algorithm for controlling a communications network. Jointly with A. V. Butrimenko and S. L. Ginzburg, Probl. Kibern. No. 20, 27-38 (1968).
APPENDICES
These appendices discuss questions posed by M. L. Tsetlin in oral form, and contain summaries of articles written by other authors relating to the questions he posed.
On the Goore Game¹
M. L. Tsetlin formulated the following simple symmetric game played by many identical automata (the Goore game). Suppose there are N automata, each with two actions and a memory capacity n, which are capable of expedient behavior in stationary random media. The conditions of the game are: at a time T, one calculates the fraction k/N of the automata that perform the first action, and the automata, independently of each other, are rewarded with probability p(k/N) (penalized with probability q(k/N) = 1 - p(k/N)). It is assumed that a certain function p(x), x ∈ [0, 1], is given. The automata change their states in accordance with the penalty or reward received; at a time T + 1 everything is repeated, and so on. The behavior of the group of automata is described by a homogeneous Markov chain with (2n)^N states. The problem consists in studying the final distribution and its asymptotic properties, for an infinitely increasing number of players, as functions of the memory capacity. This problem was considered by several authors [29, 49, 126]. Borovikov and Bryzgalov [29] proved that in the case of a trigger game, i.e., automata L_{2,2} (see page 15) with the memory n = 1, we have for N → +∞, for the sequence of final distributions,

$$\Pr[\,|(k/N) - \tfrac{1}{2}| \le \varepsilon\,] \to 1 \quad \text{for any } \varepsilon > 0. \qquad (1)$$

The behavior of the group of automata approaches that of a group in which each automaton, independently of the rest, chooses one of the two actions with equal likelihood, without reacting to the penalties or rewards it receives. The case n > 1 was simulated on a digital computer. It was shown that with a sufficiently large memory capacity the actions of automata with

¹ This chapter was written by B. G. Pittel. See the formulation of the problem by M. L. Tsetlin on p. 113 (Editor's note).
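The game is easy to put on a computer. The sketch below is our illustration (not from the original): it uses automata with a linear tactic L_{2n,2} and an arbitrary unimodal reward function p(x), here assumed to peak at x = 0.4; with a moderate memory capacity, the fraction of automata performing the first action drifts toward the maximum point of p.

```python
import random

class LinearTactic:
    """Automaton with a linear tactic L_{2n,2}: two actions, memory capacity n."""
    def __init__(self, n):
        self.n, self.act, self.depth = n, random.randrange(2), n - 1  # depth 0 = deepest state

    def update(self, rewarded):
        if rewarded:
            self.depth = max(0, self.depth - 1)     # move deeper into the current ray
        elif self.depth < self.n - 1:
            self.depth += 1                         # move toward the boundary
        else:
            self.act = 1 - self.act                 # cross over to the other action

def goore_game(N=50, n=8, steps=20000, seed=1):
    """Average fraction of automata performing action 1 over the second half of the run."""
    random.seed(seed)
    p = lambda x: max(0.0, 0.9 - 1.5 * abs(x - 0.4))  # illustrative reward curve, peak at 0.4
    team = [LinearTactic(n) for _ in range(N)]
    total = 0.0
    for t in range(steps):
        k = sum(a.act for a in team)
        prob = p(k / N)                  # every automaton is rewarded with the same probability
        for a in team:
            a.update(random.random() < prob)
        if t >= steps // 2:
            total += k / N
    return total / (steps - steps // 2)

print(round(goore_game(), 2))            # should settle in the vicinity of 0.4
```

Setting n = 1 reproduces the trigger game: each penalty immediately flips the action, and the average fraction stays near 1/2, in agreement with (1).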
a linear tactic are distributed in such a way as to maximize the total payoff; i.e., the group of automata possesses the property of asymptotic optimality.

Volkonskiy [49] made use of the average time before a change of action in his approximate investigation of the behavior of a group of automata in a game (in particular, in the Goore game). (A precise basis for this technique as used in the Goore game is lacking.) For the case of a step function p(x) taking two values p_1 and p_2 (and p_2 > 1/2, if one has in mind automata with a linear tactic), it was shown that asymptotic optimality requires lim_{N→+∞}(n/N) > ϰ; if, however, lim_{N→+∞}(n/N) < ϰ, then the automata spend most of their time in disadvantageous plays. Here ϰ is defined by formula (3), in which ν_i = p_i/q_i for automata with a linear tactic L_{2n,2}, and ν_i = 1/q_i for the Krinskiy automata D_{2n,2} (see p. 19).

Pittel [126] investigated a version of the Goore game which is simplified at the price of a deviation from the assumption about the independence of rewards. Here, in order to preserve the uniqueness of the stationary distribution, instead of the automata L_{2n,2} one considers automata B_{2n,2} (see p. 48), and instead of the Krinskiy automata D_{2n,2}, automata D'_{2n,2} are utilized (an analogous modification of D_{2n,2}). In the corresponding Markov chain, the group of states in which the automata have the same depth of learning is a closed set, and all the other states are transient (in the dynamics, a "synchronization" of the participating automata occurs). It must also be noted that, when the automata are in their transitional states, in the case of a penalty they independently decide whether or not to change the action.

Let us summarize the results of Pittel [126]. For a game with the function p(x) defined by Volkonskiy [49], properties are proved which coincide with those proved by Volkonskiy [49]. Here, if automata B_{2n,2} (and p_2 ≤ 1/2) are participating (this case for the automata L_{2n,2}, the "analogs" of B_{2n,2}, is not considered by Volkonskiy [49]), ϰ in (3) must be determined from the formula ϰ = [1 - H(l_0)]/log_2 ν_1. It was also established that for lim_{N→+∞}(n/N) > ϰ (the case of asymptotic optimality),

$$\Pr[\,l_0 - \delta_N \le k/N \le l_0\,] \to 1$$

only if Nδ_N → +∞.
Thus, out of all possible distributions over actions that yield the maximum payoff, the automata select distributions which are asymptotically close from the left to the point of discontinuity of the function p(x). An interesting interpretation of this property is as follows: from the set of their best decisions, the automata choose the one that allows them the most rapid detection of any disadvantages resulting from deviations from that decision. The disadvantages would be caused by the spontaneous tendency toward a random mixing, a tendency which is suppressed to a greater or lesser extent depending on the memory capacity of the automata. If, however, lim_{N→+∞}(n/N) < ϰ [or p_1 = max(p_1, p_2) < 1/2 when the automata B_{2n,2} are playing], then for fixed a < b,

$$\Pr\!\left[\frac{a}{2N^{1/2}} \le \frac{k}{N} - \frac{1}{2} \le \frac{b}{2N^{1/2}}\right] \xrightarrow[N \to +\infty]{} \frac{1}{(2\pi)^{1/2}} \int_a^b e^{-x^2/2}\, dx$$
("chaos" occurs). Suppose now that the function p(x) is continuous, 0 < p(x) < 1 [and p(x) ≠ 1/2 in the case of automata B_{2n,2}]. Let γ(x) = p(x)/q(x) for the automata B_{2n,2}, and γ(x) = 1/q(x) for the automata D'_{2n,2}. We introduce the notation

$$h_{N/n}(x) = h(x) + (N/n)H(x), \qquad H(x) = x \ln\frac{1}{x} + (1-x)\ln\frac{1}{1-x}, \qquad h(x) = \ln \gamma(x). \qquad (4)$$
Let Λ_{N/n} be the maximum point of the function h_{N/n}(x), which for simplicity will be assumed to be unique. It is clear that, if N/n is sufficiently small, then Λ_{N/n} is close to the maximum point of the function h(x), which obviously coincides with the point x = λ_0. In what follows, whenever we speak of automata B_{2n,2}, we shall assume that if lim_{N→+∞}(N/n) < +∞, then lim_{N→+∞}(N/n) < R, where R is so small that for ρ < R the point λ_ρ [the maximum point of the function h_ρ(x)] coincides with the maximum point of the function g_ρ(x) [here g_ρ(x) = h_ρ(x) if p(x) ≥ 1/2, and g_ρ(x) = ρH(x) if p(x) ≤ 1/2], and p(λ_ρ) > 1/2. Such a number must exist, since p(λ_0) > 1/2. Conversely, if N/n → +∞, then Λ_{N/n} → 1/2, since the function H(x), which is the entropy of the distribution of the automata over actions, has a maximum at x = 1/2. The following is the basic result: if N → +∞, then for the sequence of final distributions,

$$\Pr[\,|k/N - \Lambda_{N/n}| \le \varepsilon\,] \to 1, \qquad (5)$$

where ε > 0 is fixed.
Speaking somewhat loosely, one could say that, by virtue of this result, a sufficiently numerous group of automata "maximizes" the function h_{N/n}(x) = h(x) + (N/n)H(x). This fact is essentially a mathematical expression of the principle of "the war between expediency and thermodynamics." In conclusion we shall give a more precise formulation of the asymptotic properties.
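This tug-of-war can be made concrete numerically. The sketch below is our illustration (the reward function and all numbers are assumptions): it maximizes h_{N/n}(x) = ln[p(x)/q(x)] + (N/n)H(x) on a grid for a smooth p(x) peaking at x = 0.3, and shows the maximum point Λ_{N/n} moving from the expedient point 0.3 toward the entropy maximum 1/2 as N/n grows.

```python
import math

def argmax_h(p, ratio, grid=10001):
    """Grid maximizer of h_{N/n}(x) = ln(p(x)/q(x)) + ratio * H(x), with ratio = N/n."""
    best_x, best_v = None, -math.inf
    for i in range(1, grid - 1):
        x = i / (grid - 1)
        H = x * math.log(1 / x) + (1 - x) * math.log(1 / (1 - x))  # entropy term
        v = math.log(p(x) / (1 - p(x))) + ratio * H
        if v > best_v:
            best_x, best_v = x, v
    return best_x

p = lambda x: 0.55 + 0.35 * math.exp(-20 * (x - 0.3) ** 2)   # illustrative reward, peak at 0.3
for ratio in (0.0, 1.0, 10.0, 100.0):
    print(ratio, round(argmax_h(p, ratio), 3))  # maximum point drifts from 0.3 toward 0.5
```

For ratio = 0 (memory growing much faster than the number of players) the expediency term wins; for large ratios the entropy term wins, which is exactly the "war" described above.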
THEOREM 1 (Concerning the conditions under which the final distribution is asymptotically normal) Let

$$\lim_{N\to+\infty}(N/n) = \rho < +\infty, \qquad \lambda_\rho = \lim_{N\to+\infty}\Lambda_{N/n}, \qquad \text{where } 0 < \lambda_\rho < 1.$$

Let m be the even number determined by the condition h_ρ''(λ_ρ) = ... = h_ρ^{(m-1)}(λ_ρ) = 0 and h_ρ^{(m)}(λ_ρ) ≠ 0 (consequently, m = 2 for ρ = 0), and suppose that

$$\lim_{N\to+\infty} n\,[(N/n) - \rho]^{m/(m-1)} = +\infty. \qquad (6)$$

Then the final distribution, suitably normalized, tends to a normal one.

It will be noted that, since in the case of asymptotic optimality ρ = 0, the condition of the theorem then becomes

$$\lim_{N\to+\infty}(N/n) = 0, \qquad \lim_{N\to+\infty}(N^m/n) = +\infty. \qquad (7)$$
The following two theorems show that, when condition (6) is not satisfied, the final distribution may not tend to a normal one.

THEOREM 2 Let 0 < lim_{N→+∞}(N/n) = ρ < +∞, m > 2, and

$$\lim_{N\to+\infty} n\,[(N/n) - \rho]^{m/(m-1)} = 0. \qquad (8)$$

Then the final distribution does not tend to a normal one.
(In the formulation of these two theorems, for brevity, we have omitted the asymptotic estimates of the probabilities of large deviations.)

THEOREM 3 Suppose that the second condition in (7) is not satisfied and that, moreover,

$$\lim_{N\to+\infty}(N^m \ln N / n) = 0. \qquad (9)$$

Then the final distribution concentrates on the numbers a_{n,N} and b_{n,N} of the form k/N (k an integer) that are closest from the left and from the right to Λ_{N/n}.

Thus, when condition (9) is satisfied, the property of asymptotic optimality becomes even more pronounced. The discrete form in which this result is stated sharply distinguishes this theorem from the preceding ones. The necessity of the condition lim_{N→+∞}(N/n) = 0 for asymptotic optimality shows that, for large N and n, the automata make substantial use of "deep" states. Thus, suppose that R_β is the probability that, in the steady state, the depth of learning lies between β and n. Then, when condition (7) is satisfied,

$$1 - R_\beta \sim \{\gamma(\lambda_0)[1 + O((N/n)^{m/(m-1)})]\}^{-(n-\beta+1)} \qquad \text{and} \qquad \lim_{n\to+\infty}(\beta/n) = 1$$

[γ(x) was introduced in Eq. (4)]. We shall supplement this discussion by adding the results from Pittel [227].²
STATEMENT OF THE PROBLEM We are given a town with a population of N people; in the town there are n sections with room for b_1, b_2, ..., b_n persons and m factories whose demands for labor are a_1, a_2, ..., a_m persons. It is natural to assume that N = a_1 + ... + a_m = b_1 + ... + b_n. The population distribution is given in the form of a matrix ‖x_ij‖, 1 ≤ i ≤ m, 1 ≤ j ≤ n, where x_ij is the number of persons living in the jth section and working in the ith factory [persons with index (i, j)]. Obviously, Σ_j x_ij = a_i and Σ_i x_ij = b_j. The distribution of the factories and sections is such that a person with index (i, j) needs a time t_ij to get to work. Suppose that the town residents exchange living quarters, trying to shorten their commuting time. We must describe the dynamics of

² This addendum was also written by B. G. Pittel (Editor's note).
the population redistribution in the town as a result of address changes, and describe the distribution that is reached in the limit. A determination of such a stable population distribution is important in forecasting passenger flows in a city which is in the planning stage. The architect G. V. Sheleykhovskiy at one time proposed an algorithm that would determine the "natural" distribution. The algorithm did not give any description whatsoever of the dynamics of address changes. However, I. V. Romanovskiy, who acquainted the author (B. G. P.) with the problem, remarked that if this algorithm converges, then the limiting distribution maximizes the weighted entropy (the convergence of the algorithm was recently proved [34]). It seemed interesting to us to obtain similar properties of the limiting distribution from a model of address changes that could somewhat realistically reproduce the dynamics of the population redistribution in the town. In the first version of the proposed probabilistic model of address changes, at each discrete instant of time T the town residents register for an exchange, independently of one another, with probability π = π(t), where t is the commuting time to work from the present residence [thus, a person with index (i, j) registers with probability π_ij = π(t_ij)]. Then a randomly chosen pair of registered residents exchange their places of residence. In the second version, a randomly chosen pair of residents with indices (i, j), (k, l) exchange places of residence with probability π_ij π_kl. In both versions, the exchange of places of residence is described by a homogeneous Markov chain whose number of states is equal to the number of all possible distributions. In the paper under consideration, an investigation is made of the asymptotic (for N → +∞) properties of the final distributions. The first version uses m = n = 2, and the second uses arbitrary m, n.
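The second version of the model is easy to put on a computer. The sketch below is our illustration (the commute matrix, the constant C, and all other numbers are assumptions): a random pair of residents swaps sections with probability π_ij π_kl, with π(t) = e^(-C/t), so that residents with long commutes are the most eager to exchange.

```python
import math, random

def simulate(t, a, b, C, steps=50000, seed=3):
    """Second version of the exchange model: a random pair of residents with indices
    (i, j), (k, l) swap sections with probability pi_ij * pi_kl, pi(t) = exp(-C/t)."""
    random.seed(seed)
    factories = [i for i, ai in enumerate(a) for _ in range(ai)]
    sections = [j for j, bj in enumerate(b) for _ in range(bj)]
    random.shuffle(sections)                   # an arbitrary feasible initial distribution
    res = list(zip(factories, sections))       # resident r works at res[r][0], lives in res[r][1]
    pi = lambda i, j: math.exp(-C / t[i][j])   # long commutes register more eagerly
    for _ in range(steps):
        r, s = random.sample(range(len(res)), 2)
        (i, j1), (k, j2) = res[r], res[s]
        if random.random() < pi(i, j1) * pi(k, j2):
            res[r], res[s] = (i, j2), (k, j1)  # the swap preserves both marginals
    return sum(t[i][j] for i, j in res)        # total commuting time in the final state

t = [[1.0, 5.0], [5.0, 1.0]]                   # factory i is close to section i
print(simulate(t, a=[3, 3], b=[3, 3], C=4.0))  # large C penalizes long commutes strongly
```

Every swap preserves the row and column sums of ‖x_ij‖, so the chain moves only over feasible distributions, exactly as in the description above.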
We shall formulate a theorem about the asymptotic properties of the final distribution for the second version, since for m = n = 2 the result coincides precisely with the corresponding result for the first version of the model. We shall consider the weighted entropy of the distribution ‖x_ij‖:

$$H(X) = \sum_{i,j} X_{ij} \ln \frac{1}{\pi_{ij} X_{ij}}, \qquad X_{ij} = x_{ij}/N.$$

It is clear that the X_ij satisfy the conditions Σ_j X_ij = a_i/N, Σ_i X_ij = b_j/N. Let ‖x*_ij(N)‖ be the maximum point of H(X) in the indicated region.
Suppose that

$$\lim_{N\to+\infty}(a_i/N) = c_i > 0, \qquad \lim_{N\to+\infty}(b_j/N) = h_j > 0.$$
THEOREM For the sequence of final distributions,

$$\Pr[\,\max_{i,j}\,|X_{ij}(N) - x^*_{ij}(N)| \le \varepsilon\,] \to 1$$

(ε is an arbitrary fixed positive number). More precisely: let

$$s_{ij} = N^{1/2}\,[X_{ij}(N) - x^*_{ij}(N)].$$

It is clear that S = ‖s_ij‖ belongs to the subspace L defined by the conditions Σ_i s_ij = 0, Σ_j s_ij = 0. Then, for any bounded closed or relatively open region Q ⊂ L, the limiting probability of the event S ∈ Q is given by a Gaussian integral over Q.
Here dv(L) is the volume element of the subspace L, and λ_1, λ_2, ... are the eigenvalues of the quadratic form Σ_{i,j} x*_ij(N) (s_ij)² on L. To interpret the results obtained, we shall assume, for example, that π(t) = e^(-C/t).³ Then H(X) can be written as

$$H(X) = C \sum_{i,j} \frac{X_{ij}}{t_{ij}} + \sum_{i,j} X_{ij} \ln \frac{1}{X_{ij}}.$$

This representation of H(X) and the result of the formulated theorem imply that for small C (small memory capacity) the residents distribute themselves in such a way that the entropy is maximized, i.e., completely randomly. For large C, they do it in such a way that Σ_{i,j}(X_ij/t_ij) is maximized, i.e., the harmonic mean, over all town residents, of the commuting time to work is minimized. A comparison of these results with the results related to the Goore game reveals their internal similarity. (It is true, of course, that in this model the tendency, for large N and large C, to maximize the general criterion Σ(x_ij/t_ij) develops statistically due to an interaction, while

³ Let us make a comparison with automata with a linear tactic and others similar to them: for these automata, the probability of a change in action is likewise of the form [f(p)]^(-n), where p is the probability of a penalty for the given action and n is the memory capacity. Thus, C is analogous to the memory.
in the Goore game this general criterion is given as a condition of the problem.) It seems to us that the incompatibility between the expedient behavior on the part of each member of a group and the tendency toward a random mixing, which is revealed in the Goore game and the model of population redistribution (and which in the Goore game is stronger, the larger the group), must be characteristic of a considerably wider class of models of collective behavior which are related to distributions.
A Simplified Description of Games Played by Asymptotically Optimal Automata¹
1 Two-Person Games
In determining the behavior of the simplest automata in games, it is very useful to avoid a detailed computation of the Markov chain describing the automaton game, and instead to study overall characteristics, e.g., the mean time during which an automaton performs a given action. The pioneering attempts in this direction were made by V. A. Ponomarev, who used the concept of mean time to make calculations for a symmetric game played by two automata with linear tactics. This approach proved very convenient for zero-sum games played by automata. Krinskiy [103, 104] has proposed a method of determining the limiting payoff in such games for the case in which the participants are automata of asymptotically optimal design of types L, D, K, and others similar to them (see p. 16 ff). The asymptotic optimality of these designs is based on the fact that the mean time T_{A_n}(a_i), during which an automaton A_n performs an action i, is greater the larger the payoff a_i received by the automaton for that action (for large n). To be more precise, one must consider the initial state φ(n) of an automaton A_n and speak of the mean time T^{φ(n)}_{A_n}(a). However, for these automata the value of the payoff a is more important than the initial state φ(n). If A_1, A_2, ..., A_n, ... is an asymptotically optimal sequence of such

¹ This chapter was written by V. I. Krinskiy, I. I. Pyatetskiy-Shapiro, and A. L. Toomyi (Editor's note).
automata, and a_1 > a_2, then

$$\lim_{n\to+\infty} \frac{T^{\varphi(n)}_{A_n}(a_1)}{T^{\gamma(n)}_{A_n}(a_2)} = +\infty \qquad (1)$$

for any initial states φ(n) and γ(n) of the automata A_n. It is this basic property that allows us to determine the payoff in a zero-sum game. First, let us consider an example. Suppose that a game is played by identical automata, and the matrix of this zero-sum game (payoffs to the first automaton) is

$$\begin{pmatrix} 0.3 & -0.1 \\ -0.4 & 0.2 \end{pmatrix}$$

In the play (1, 1) the payoff to the first automaton, 0.3, is greater than that to the second automaton, -0.3. With this type of payoff in a stationary random medium, the mean time before a change of action by the second automaton is considerably less than the mean time before a change of action by the first automaton. Therefore, it is natural also to expect in a game with this type of payoff that the second automaton will change its action sooner than the first, and the automata will go from play (1, 1) to play (1, 2). In play (1, 2), the payoff to the first automaton is already less than that to the second. Now the first automaton will change its action sooner, and the automata will change to play (2, 2). Subsequently, the change of actions will take place in the following cyclic manner:

$$(1,1) \to (1,2) \to (2,2) \to (2,1) \to (1,1) \to \cdots$$
The mean time t_ij, during which the automata participate in a play (i, j), is determined by the automaton that receives the smaller payoff in that play. Therefore, t_{1,1} ~ T_A(-0.3), t_{1,2} ~ T_A(-0.1), t_{2,1} ~ T_A(-0.4), t_{2,2} ~ T_A(-0.2). Then (1) implies that the play (1, 2) is played much longer than the remaining ones, and thus this play makes the greatest contribution to the payoff. Naturally, we can expect that the payoff to the first automaton in such a game is close to -0.1. (This is the element of the payoff matrix that is closest to zero.) The above qualitative picture of the behavior of the automata turns out to be not far from the truth. Krinskiy [104] describes a class S of automata for which it is possible to determine the payoff in zero-sum games. These automata satisfy condition (1) and two additional requirements. The limiting behavior of those automata in a game can be determined in terms of the
automaton value d. The automaton value for a pair of asymptotically optimal sequences of automata A , , . . . , A , , . . . and B,, . . . , B,, . . . from this class is determined as follows. The mean time TB,(a) for an automaton B for sufficiently large n is a monotonically increasing function of a, and Ts,(- a ) is a monotonically decreasing function of a. Suppose that for a = d(n), we have the following equality:
The limit d(n) for n-00 (if it exists) is called the automaton value d. The automaton value d is greater, the smaller the “inertia” of the design of A, i.e., the smaller the time TA(a) and the more inertial its opponent B. For identical automata, d = 0. We have the following theorem: Suppose we are given two sequences of automata A , , . . . , A , , . . . , and B,, . . . , B,, . . . , belonging to class S and asymptotically optimal in all stationary random media, and suppose we are given a zero-sum game between two automata A, and B,. Then the limiting payoff W to automaton A always lies between the upper and lower values of the game V and u. Strictly speaking:
(a) If a game Γ is such that the automaton value d ≥ V, then W = V. (b) If u < d < V, then the limiting payoff W is close to d. The closeness is understood in the following sense:
a_{i₁j₁} ≤ W ≤ a_{i₀j₀}, where

a_{i₁j₁} = max{a_{ij} : a_{ij} ≤ d},  a_{i₀j₀} = min{a_{ij} : a_{ij} ≥ d},
and the elements a_{i₁j₁} and a_{i₀j₀} satisfy an additional condition; (c) If d ≤ u, then W = u. However, this theorem is not directly applicable to the automata L, since they are not asymptotically optimal in all stationary random media. If one forms a stationary random medium by choosing payoffs for the actions from the interval [−1, 1], then the automata L may not be asymptotically optimal. To make them asymptotically optimal, the interval must be restricted to [0, 1]. However, in a zero-sum game it is impossible for the payoffs to both players in each play to lie inside the interval [0, 1] (so long as the payoff matrix does not consist of zeros alone). For such automata it is meaningful to define an antagonistic game, which is analogous to a zero-sum game, in such a way as to make the payoffs lie inside the interval [0, 1]. A two-automaton game is called antagonistic with a sum x if the expected values of the payoffs to the first and the second automata, a_{ij}^{(1)} and a_{ij}^{(2)}, in each play (i, j) are related by the following equality:

a_{ij}^{(1)} + a_{ij}^{(2)} = x.
For x = 1 the payoff does not go beyond [0, 1]. For antagonistic games with a sum x, the payoff can also be determined using the above theorem if, instead of the automaton value d, one uses the quantity d_x = lim_{n→∞} d_x(n)/x, where d_x(n) is defined by the equation

T_{A_n}(d_x(n)) = T_{B_n}(x − d_x(n))
[cf. (2)]. Thus, it is also possible to include in the general scheme those sequences of automata which may not be asymptotically optimal in all stationary random media.
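As an illustration (this rescaling is my own example, not part of the original text), a zero-sum payoff matrix with entries in [−1, 1] can be turned into an antagonistic game with sum x = 1 by the affine map a^{(1)} = (1 + a)/2, a^{(2)} = (1 − a)/2:

```python
# Sketch: rescaling a zero-sum payoff matrix (entries in [-1, 1]) into an
# antagonistic game with sum x = 1, so both players' payoffs lie in [0, 1].
# The rescaling (1 + a)/2, (1 - a)/2 is an illustration, not from the text.

zero_sum = [[-0.3, -0.1],
            [-0.4, -0.2]]   # payoffs to the first automaton

first  = [[(1 + a) / 2 for a in row] for row in zero_sum]
second = [[(1 - a) / 2 for a in row] for row in zero_sum]

for r1, r2 in zip(first, second):
    for p1, p2 in zip(r1, r2):
        assert 0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0
        assert abs(p1 + p2 - 1.0) < 1e-12   # payoffs sum to x = 1 in every play
```

The ordering of the plays by payoff is preserved, so the qualitative behavior of the automata is unchanged by such a rescaling.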
Example. Suppose that an automaton D_{n,M} plays, against an automaton L_{mn,N}, an antagonistic game with a sum x = 1 (the automata D and L have M and N actions, respectively, and the number of states in one ray, corresponding to one action, is m times greater for L than for D). The expression for the mean time before a change of action can be used to find the automaton value d_x for x = 1. For the automaton L_{mn,N},

T_{L_{mn,N}}(a) = [2/(1 − a)][(A^{mn} − 1)/(A − 1)],  where  A = (1 + a)/(1 − a).

A similar expression holds for the automaton D_{n,M}.
Here, d_x depends on the ratio of memory capacities m. The value d₁ = ½ corresponds to an identical payoff to both participants of the game with a sum 1. Here this occurs for the ratio of memory capacities m = log 4/log 3. We shall also give the ratios of memory capacities m at which both automata play identically for other pairs of sequences of automata:
m = log 7/log 3  for  K_{n,M} and L_{mn,N},
m = log 7/log 4  for  K_{n,M} and D_{mn,N}.
These examples show how an increase in inertia [the time before a change of action, T(a)] of a sequence of automata leads to a reduction in their
payoff. It is clear that the “game abilities” of the automaton designs L, D, and K considered here differ very little: in order for them to play in exactly the same way, the ratio of the memory capacities (which determines the inertia) never needs to exceed 2.
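The memory-capacity ratios quoted above are easy to check numerically; all of them indeed lie below 2 (a quick arithmetic sketch, not part of the original text):

```python
import math

# Ratios of memory capacities at which the paired automaton designs
# play identically, as quoted in the text.
ratios = {
    "L vs D": math.log(4) / math.log(3),
    "K vs L": math.log(7) / math.log(3),
    "K vs D": math.log(7) / math.log(4),
}

for pair, m in ratios.items():
    assert 1 < m < 2          # every required ratio stays below 2
    print(f"{pair}: m = {m:.3f}")
```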
2 A Remark about Games with an Arbitrary Number of Players An analytic investigation of games played by automata belonging to asymptotically optimal sequences, or, briefly speaking, asymptotically optimal automata, is very difficult, and in almost no case has it been carried out in its entirety. To a large extent, the difficulties are related to the fact that these games are described by Markov chains in which the number of states rapidly increases with an increase in the individual memories of the automata. In this connection, M. L. Tsetlin was interested in the task of simplifying the description of such Markov chains, in particular, in finding a Markov chain whose number of states does not tend to infinity, and which would give us an idea about the actual course of a game. It is possible that these properties would be possessed by a Markov chain constructed in the following way. Suppose we are given a game K with independent outcomes (see p. 42) played by ν participant automata 𝔄¹, …, 𝔄^ν. We shall assume for simplicity that each automaton has only two actions, f₁ and f₂. Let p_i(f) denote the probability of a penalty for the ith player if, at the preceding instant of time, the set of actions performed by the players was f = {f_{j₁}, …, f_{j_ν}}. Such a game is obviously described by a Markov chain whose number of states is equal to the product of the numbers of states of all the participant automata. For example, if each automaton has 2n states, then the number of states in the chain is equal to (2n)^ν. Suppose that we are given automata 𝔄¹, …, 𝔄^ν that participate in the game K. As we have said, we are considering asymptotically optimal automata. All the automaton designs described on pp. 16-20 have one property: for large n, they perform a single action for a very long time. Therefore, each set of actions f is performed many times in a row.
During all that time, each automaton 𝔄^i might be said to be located in a stationary random medium in which the probability of a penalty is equal to p_i(f). For an automaton 𝔄^i located in a stationary medium with probability of a penalty p_i(f), we shall use T(𝔄^i, f_{j_i}, p_i(f)) to denote the average time during which it performs the action f_{j_i} many times in a row. This average time is easy to compute for all the automaton designs proposed on pp. 16-20.
The new Markov chain has 2^ν states denoted by the sets f = {f_{j₁}, …, f_{j_ν}}, where each index j_m is equal to 1 or 2. The probability of a transition from a state f¹ = {f_{j₁}, …, f_{j_ν}} into a state f² = {f_{k₁}, …, f_{k_ν}} is equal to ∏_{m=1}^{ν} t_m, where

t_m = 1/T(𝔄^m, f_{j_m}, p_m(f¹))      if j_m ≠ k_m,
t_m = 1 − 1/T(𝔄^m, f_{j_m}, p_m(f¹))  if j_m = k_m.
According to M. L. Tsetlin, the behavior of this Markov chain as n → ∞ approaches the behavior of the original Markov chain describing the automaton game.
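The rule above assembles the reduced chain's 2^ν × 2^ν transition matrix directly. A minimal sketch (the functions T and penalty below are hypothetical stand-ins for the mean times and penalty probabilities of a concrete game, not taken from the text):

```python
from itertools import product

NU = 2            # number of automata (nu); with 2 actions each -> 2**NU states

def T(m, action, p):
    """Hypothetical mean time automaton m keeps `action` under penalty
    probability p; any function >= 1, decreasing in p, will do here."""
    return 1.0 + 10.0 * (1.0 - p)

def penalty(m, f):
    """Hypothetical penalty probabilities p_m(f) of the game."""
    return 0.2 if f[m] == f[(m + 1) % NU] else 0.7

states = list(product((1, 2), repeat=NU))        # the sets of actions f

# P[f1][f2] = prod_m t_m, with t_m = 1/T on an action change, 1 - 1/T otherwise
P = {}
for f1 in states:
    row = {}
    for f2 in states:
        prob = 1.0
        for m in range(NU):
            t = 1.0 / T(m, f1[m], penalty(m, f1))
            prob *= t if f1[m] != f2[m] else 1.0 - t
        row[f2] = prob
    P[f1] = row

for f1 in states:                                # each row is a distribution
    assert abs(sum(P[f1].values()) - 1.0) < 1e-12
```

The rows sum to one because, for every automaton, the two factors t and 1 − t exhaust its possibilities; the chain has 2^ν states regardless of the memory depth n, which is the point of the simplification.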
The Problem of Controlling a Communications Network¹
In Butrimenko’s dissertation [36] (see also [37-39]), the methods developed by M. L. Tsetlin are applied to construct a decentralized method of controlling communications networks. We shall briefly describe the problem investigated in his paper. We are given a communications network consisting of nodes and the main lines connecting them. It is assumed that the load is produced by the nodes, and the main lines possess a limited number of channels. Both the main lines and the nodes may malfunction, and the network may acquire new main lines and nodes. The problem is to find routes which satisfy the requirements arising in the nodes in such a way as to minimize the number of failures. The failures themselves may be due to overloads or to a malfunctioning of individual links of the network. Had we known the loads, their distribution, and the transmitting capacities of the lines beforehand, we would have a perhaps complex, but in principle exactly soluble, problem. In reality, the loads may change sharply during operation, and the network is subject to unforeseeable parameter changes. Therefore, the control system must be adaptable. It is not hard to see that construction of such a control system using a large specialized machine is undesirable, if only because in case of its malfunction the entire network would break down. An alternative to such a centralized method of control has been proposed by M. L. Tsetlin in his theory of automaton games. In this method the individual control devices do not know the entire problem, and each solves its own problem, using for this purpose only the information about the state of the nearest segments of the network. This chapter was written by A. V. Butrimenko (Editor’s note).
We shall describe how the choice of routes for the requirements is made. Suppose, for example, that one needs to transmit a message to a certain node A. For this, one uses the so-called A-relief, which is constructed in the following way. All the ribs that meet at A are assigned the number 1, all the following ribs are assigned the number 2, etc., so that to find the shortest route it is sufficient, at each node, to choose the rib that has the smallest number among the ribs meeting at that node. If, however, any changes are made in the network, then one must modify the A-relief. This may be done directly in the process of searching for the shortest route for messages transmitted to A. Apparently, it is not advisable to change the relief at once in case of a malfunction or an overload: one must first check whether the malfunctioning direction will in fact remain out of operation for a sufficiently long time. For this purpose, one must introduce inertia in one way or another into the control system. This may, for example, be done by using automata with a linear tactic. One may also introduce inertia in a different way, by specifying some average number and assigning it to a rib. The control system for a communications network was simulated on a computer. For a model of a network having realistic characteristics as far as reliability and loads are concerned, the use of this method made it possible to lower the number of failures from 15.7% to 6%. Furthermore, there arises the following problem: if we have already determined which requirements should at a given moment be placed on a given rib, and their number is greater than the transmitting capacity of this line at that time, then which messages should be transmitted, and how should they be distributed over the different channels belonging to the line if these channels possess different reliabilities? To solve this problem one uses the automaton game described on pp. 62-71 as the “distribution game.” A new method is proposed for dividing penalties and rewards among the automata which, similar to the “common fund,” forces them to play the Moore play in the game, and at the same time does not require that they have a large memory.
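In graph terms, the A-relief described above is a breadth-first distance labeling: ribs incident to A get the number 1, ribs one step further get 2, and so on, and a message is always forwarded along the smallest-numbered rib. A minimal sketch (the network below is an invented example):

```python
from collections import deque

def a_relief(adj, target):
    """Label every node with its hop distance to `target`; the relief number
    of a rib (u, v) used toward the target is the label of its far end + 1."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def next_hop(adj, dist, node):
    """Forward along the neighbor with the smallest relief number."""
    return min(adj[node], key=lambda v: dist.get(v, float("inf")))

# Invented example network
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
       "D": ["B", "C", "E"], "E": ["D"]}
dist = a_relief(adj, "A")
assert dist["E"] == 3
assert next_hop(adj, dist, "E") == "D" and next_hop(adj, dist, "D") in ("B", "C")
```

Because each node needs only the labels of its own ribs, the relief can be maintained locally, which is exactly the decentralized character the text describes.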
The Operation of the Apartment Commission¹
Suppose that there are N applicants, k apartments, and a commission composed of n persons, which distributes the apartments among the applicants. For each member of the commission, a portion of the applicants is “his,” and only their interests are defended by him. Suppose that each member of the commission has his own version of the ordering of applicants:

A^i = (a_1^i, a_2^i, …, a_k^i, a_{k+1}^i, …, a_N^i).
(a_j^i is the name of the applicant who is jth on the list of the ith commission member.) Of course, in that list “his” applicants are at the top. How should the apartments be distributed? Let us consider the following simplest voting method. Each applicant in each list will be assigned a weight (e.g., his number from the end of the list); let us calculate the arithmetic mean of the weights of each candidate using all the lists, then place the applicants in descending order of their mean weights, and give apartments to the first k persons on this list. A drawback of this method is that “serious false information” can occur. For example, let k = 2, N = 6, n = 3, and set
A¹ = (a, b, c, d, e, f),
A² = (a, f, c, e, b, d),
A³ = (d, f, c, e, a, b).
Here the weights are distributed as follows:
W_a = 4⅔,  W_b = 2⅔,  W_c = 4,  W_d = 3⅓,  W_e = 2⅔,  W_f = 3⅔,
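The positional-weight vote just described is easy to reproduce (a minimal sketch, using the lists of the example above):

```python
from fractions import Fraction
from statistics import mean

# Lists A1, A2, A3 of the example; weight = position counted from the end,
# i.e., N for first place and 1 for last place.
lists = [list("abcdef"), list("afcebd"), list("dfceab")]
N, k = 6, 2

def weights(lists):
    """Mean positional weight of each applicant over all lists."""
    return {name: mean(Fraction(N - lst.index(name)) for lst in lists)
            for name in lists[0]}

w = weights(lists)
winners = sorted(w, key=w.get, reverse=True)[:k]

assert w["a"] == Fraction(14, 3) and w["c"] == 4   # 4 2/3 and 4
assert set(winners) == {"a", "c"}                  # a and c win, not a and f
```

This reproduces the drawback described in the text: f, placed high by two of the three members, loses the second apartment to c.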
The text was written by D. I. Kalinin and I. M. Epshteyn. See the formulation of the problem by M. L. Tsetlin on pp. 108 and 121 (Editor’s note).
and as a result the apartments are assigned to a and c, and not to a and f, as would of course be expected. If the commission had a chance to vote again, then the second and the third commission members could interchange c and e in their lists, and would achieve a result satisfying the commission on the whole more than the one obtained before. In fact, the operation of real apartment commissions invariably involves not just one voting procedure, but many discussions, whose goal is to reduce the variety of initial plans to a single common view, and the voting takes place only when this common view has been worked out. Thus, if we want our model to express the actual process of arriving at a solution, it is necessary to allow the commission members to change their views. A game is obtained in which various tactics may be used. M. L. Tsetlin and his colleagues thought that it was precisely this approach that was promising. We shall describe one method of changing views, whose objective is to improve the system of applicant selection. Suppose that a given commission member has the right to change his list (move the applicants on the list), and in this way he arrives at a situation which is most convenient to him.² Of course, he will not interchange his applicants with others. Suppose that only the other applicants are moved on the list. It is obvious that then he can improve the position of his applicants only by lowering the weights of other applicants who find themselves in a better position than his candidates.
Consequently, each member should compute the average weights of the other candidates according to the other members’ lists, and place them in his list in such a way that the most “lucky” one of the others gets the smallest weight 1, the second most lucky gets 2, and so forth.³ We shall show that, if the change of views occurs in such a way that during each cycle a view is changed by only one member, and only in the way described above, then after a finite number of steps a stabilization takes place, i.e., the views stop changing. For the case in which the members change their views simultaneously, the process as a rule never ends, i.e., a cycle sets in. Let W_j^i denote the weight of the jth candidate in the ith list. Writing out

D = (1/N) Σ_{j=1}^{N} (W_j − M)²,  where  W_j = (1/n) Σ_{i=1}^{n} W_j^i,  M = (1/N) Σ_{j=1}^{N} W_j = (N + 1)/2,
² The result of voting (after a change of views) is again determined with respect to the arithmetic mean of the weights.
³ It must be noted that this method is not a priori the best. It is possible that the smallest weight should be assigned, not to the most suitable of the other candidates, but to the most suitable one who has a chance of failing.
we have

D = (1/N) Σ_{j=1}^{N} W_j² − (2M/N) Σ_{j=1}^{N} W_j + (1/N) Σ_{j=1}^{N} M² = (1/N) Σ_{j=1}^{N} W_j² − M².
Without losing generality, we shall assume that it is the nth member who changes his view (we can always renumber them). We have

Σ_{j=1}^{N} W_j² = (1/n²) Σ_{j=1}^{N} [Σ_{i=1}^{n−1} W_j^i + W_j^n]²,

where the terms not containing the products W_j^n Σ_{i=1}^{n−1} W_j^i do not change due to a change of view by the nth member. Thus, we have D = (2/Nn²)D₂ + C, where D₂ = Σ_{j=1}^{N} W_j^n [Σ_{i=1}^{n−1} W_j^i], and C does not change when the nth member changes his view. It is easy to see that during a change of views according to the method described above, D₂ decreases, and so does D. Consequently, after a finite number of steps D (by virtue of the finite number of possible combinations of views) achieves its minimum possible value, and then the views stop changing. Unfortunately, the result of voting depends on the order in which the members change their views. An example:
k = 2, N = 6, n = 3;

A¹ = (a, b, c, d, e, f),
A² = (e, f, c, a, b, d),
A³ = (d, f, c, e, a, b).

Suppose that a change of views occurs in the following order: A¹-A²-A³-A¹-…

(I)   A¹ = (a, b, d, c, e, f),    (II)  A¹ = (a, b, d, c, e, f),
      A² = (e, f, c, a, b, d),          A² = (e, f, b, c, a, d),
      A³ = (d, f, c, e, a, b);          A³ = (d, f, c, e, a, b);

(III) A¹ = (a, b, d, c, e, f),    (IV)  A¹ = (a, b, d, c, e, f),
      A² = (e, f, b, c, a, d),          A² = (e, f, b, c, a, d),
      A³ = (d, f, c, e, a, b);          A³ = (d, f, c, e, a, b).
We see that the views have stabilized, and the weight distribution obtained is the following:
W_a = 3⅓,  W_b = 3⅓,  W_c = 3⅓,  W_d = 3⅔,  W_e = 3⅔,  W_f = 3⅔;
i.e., the apartments should be assigned to d, e, and f, who have obtained the same weights. Suppose now that the members change their views in another order: A¹-A³-A²-A¹-…

(I)   A¹ = (a, b, d, c, e, f),    (II)  A¹ = (a, b, d, c, e, f),
      A² = (e, f, c, a, b, d),          A² = (e, f, c, a, b, d),
      A³ = (d, f, c, e, a, b);          A³ = (d, f, c, b, e, a);

(III) A¹ = (a, b, d, c, e, f),    (IV)  A¹ = (a, b, d, c, e, f),
      A² = (e, f, c, a, b, d),          A² = (e, f, c, a, b, d),
      A³ = (d, f, c, b, e, a);          A³ = (d, f, c, b, e, a).
The views have stabilized, and the distribution of weights thus obtained is different from that preceding:
W_a = 3⅓,  W_b = 3⅓,  W_c = 3⅔,  W_d = 3⅔,  W_e = 3⅓,  W_f = 3⅔;
i.e., identical weights were given to c, d, and f. A satisfactory description of the set of all possible outcomes has not as yet been obtained. It will also be noted that this algorithm leads to an equalizing of weights, and often the number of candidates who obtain identical maximum weights after stabilization is greater than the number of apartments.

COALITION
One can approach the problem from another angle: Suppose that the members can form coalitions, i.e., can unite into groups, developing in the group a common view and come forward with that view as a single member. Obviously coalitions will consist of members whose views are largely similar. One of the possible ways of forming coalitions uses a metric in the space of views. M. L. Tsetlin, in addition to the algorithms given here, investigated
quite a large number of other algorithms. However, there was none that he could consider as a final solution to the problem. Attempts have also been made to develop natural systems of restrictions that must be imposed on the solution. M. L. Tsetlin’s students have found one version of such restrictions, and proved that for this version the solution is unique.
The “Hey” Problem¹
The “hey” problem was posed by M. L. Tsetlin at a seminar on the theory of automaton games in 1965. Usually the problem was stated using the example of people gathering mushrooms in a forest, or that of fishing vessels catching fish. Persons exchange information about their location by shouting “hey!” These shouts can be heard only within a certain distance. It is assumed that all the members of a group are identical in the sense that under identical conditions the probabilities that they will perform certain actions are identical. In this kind of situation, those methods of behavior are advisable for which everybody together will find sufficiently many mushrooms, and the probability of any person getting lost is small, and also for which the amount of shouting necessary is not large. In the case when the distribution of mushrooms is random, the problem reduces to one in which the criterion is to cover as large an area as possible. M. L. Tsetlin suggested, in connection with an investigation of tactics that are advisable in this type of situation, that problems may arise that may be of interest when it comes to solving problems of collective behavior.
This text was written by I. I. Pyatetskiy-Shapiro (Editor’s note).
Papers on Continuous Excitable Media¹
The paper “On continuous models of control systems” [60] arose in connection with a discussion of a number of problems in the physiology of excitation, in particular, that of the propagation of an excitation in heart tissue. As a result of these discussions I. M. Gel’fand and M. L. Tsetlin constructed a formal model in which they succeeded in abstracting from the particular properties of various excitable tissues. In analyzing this model, they obtained a number of results; these results are partially given in the article [60]. Subsequently the ideas of this article were developed by various authors in a number of papers. A portion of these papers discussed various modifications of abstract models, and others took into account the specific properties of specific objects. M. L. Tsetlin was not the author of all these papers. However, all of them are not just conceptually related to the article [60]: M. L. Tsetlin was an indispensable participant in the discussions of the papers at all stages of their preparation, and a majority of these papers were begun at his suggestion. Therefore, it seems natural to present their conclusions briefly in this book. In the following, an analysis is made of the basic results obtained during the investigation of the propagation of an excitation in a one-dimensional excitable medium (a linear segment and a ring) and in a two-dimensional excitable medium.
This text was written by M. B. Berkinblit (Editor’s note).
1 One-Dimensional Excitable Media

A. A FIBER SEGMENT
In the article [60], an analysis was made of the propagation of impulses in a fiber with refractivity. The propagation speed of an impulse in the fiber depends on the time at which the preceding impulse passed through a given point. The article considers the case in which the speed of an impulse is lowered during the refractive phase, and monotonically increases as the fiber leaves the refractive phase. For this case it was shown that, during a periodic stimulation of the fiber, the asymptotic mode is described as follows: all impulses propagate in the fiber in such a way that the time interval between them at any point is equal to the period of their occurrence. This theorem is the first example of an exact quantitative discussion of the characteristic features of the propagation of successive impulses in a fiber with refractivity. Subsequently, a number of generalizations of the above theorem were discussed for various functional relationships between the impulse propagation speed and the refractivity, for various initial distributions of refractivity along the fiber length, and for inhomogeneous fibers in which the refractivity is a function of position [16, 141]. It was shown that the theorem is satisfied for objects with various properties and for various initial conditions. These results were obtained in part analytically, in part by means of computer experiments. The same articles dealt with some features of the transient mode during which the process of the propagation of a sequence of impulses approaches the final stationary state. Finally, an investigation was made of the propagation of nonperiodic sequences of impulses along a fiber with refractivity. It was shown that under certain conditions the refractive fiber transforms a nonperiodic input sequence of impulses into a periodic sequence that has a certain mean frequency.
This property of refractive fibers imposes certain restrictions on the speed of transmission of messages that are coded using a frequency code.
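A toy discretization of this setup (my own illustration, not from the papers cited): the end of a fiber is stimulated with period P, the local speed of an impulse grows monotonically with the time elapsed since the previous impulse passed, and the inter-impulse interval at the far end settles to P, as the theorem asserts.

```python
import math

# Toy model: impulse n enters the fiber at x = 0 at time n*P; its local
# speed v(s) increases monotonically with s, the time elapsed since the
# previous impulse passed the same point (recovery from refractivity).
P = 1.0                       # stimulation period
L, dx = 5.0, 0.01             # fiber length and spatial step
steps = int(L / dx)

def v(s):
    return 1.0 - math.exp(-s)     # monotonically increasing, saturating

arrivals = []                     # arrival time of each impulse at x = L
t_prev = [-1e9] * (steps + 1)     # passage times of the preceding impulse
for n in range(30):
    t = [n * P]
    for i in range(steps):
        gap = t[i] - t_prev[i]            # local recovery time
        t.append(t[i] + dx / v(gap))
    t_prev = t
    arrivals.append(t[-1])

# the inter-impulse interval at the far end settles to the period P
assert abs((arrivals[-1] - arrivals[-2]) - P) < 1e-3
```

The first impulses arrive with longer intervals (the fiber ahead of them is fully recovered), but each successive impulse sees a shorter recovery time and the intervals contract monotonically toward P.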
B. A RING
The propagation of impulses along a circular path was qualitatively studied by physiologists in connection with pathological states of the heart, and in connection with the circulation theory of short-term memory. The paper [60] gives a quantitative estimate of the process of propagation of impulses in a homogeneous ring formed by a refractive fiber. It is
stated in the article that an impulse in this type of ring should in the limit propagate at a constant speed. A proof of this assertion was given by Gel’fand and Kazhdan [63]. A detailed study of the propagation of a single impulse and of a group of impulses in a ring of excitable tissue was made by Telesnin [140]. He showed that in a ring where the speed is a monotonically increasing function of the phase, the impulses begin to move equidistantly and at a constant identical speed. In a ring where the speed attains a maximum for a certain phase of refractivity, along with the equidistant one, another distribution of impulses is possible, in which the impulses bundle together in a group which occupies only a portion of the ring. The character of the mode that takes place depends on the length of the ring and on the initial conditions. V. L. Dunin-Barkovskiy (see Arshavskiy et al. [20]) made an investigation, both analytically and by means of computer experiments, of the transient modes in the propagation of impulses in a ring. Finally, Arshavskiy et al. [20] have demonstrated the applicability of the basic theoretical results to actual nerve fibers. The work was done on single nerve fibers of an earthworm and a frog which were made into a ring by means of an electronic circuit. Thus, the papers referred to represent quite a detailed analysis of the propagation of impulses in a refractive fiber that was begun by the original article [60]. Mainly, attention was given to the analysis of conditions under which the limiting theorems formulated in that paper are satisfied, and also to the transient modes between the initial conditions and the stationary state.
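A toy follow-the-leader discretization of the equidistant mode (my own illustration, not the model of the papers cited): impulses on a ring move with a speed that increases monotonically with the gap to the impulse ahead, i.e., with how far the fiber ahead has recovered; the gaps then equalize, so the impulses become equidistant and move at a constant common speed.

```python
import math

# Three impulses on a ring of circumference C; each moves with speed v(g),
# where g is its gap to the impulse ahead (a proxy for the recovery of the
# fiber ahead of it from refractivity). v is monotonically increasing.
C = 3.0
gaps = [0.5, 1.0, 1.5]              # initial (unequal) gaps, summing to C

def v(g):
    return 1.0 + math.tanh(g)

dt = 0.01
for _ in range(20000):
    speeds = [v(g) for g in gaps]
    # gap i grows at the speed of the impulse ahead minus this impulse's speed
    n = len(gaps)
    gaps = [g + dt * (speeds[(i + 1) % n] - speeds[i])
            for i, g in enumerate(gaps)]

assert abs(sum(gaps) - C) < 1e-9            # total ring length is conserved
assert max(gaps) - min(gaps) < 1e-6         # gaps equalize: equidistant mode
```

Because the impulse with the smallest gap ahead is also the slowest, small gaps grow and large gaps shrink, which is the mechanism behind the equidistant limit.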
C. BLOCKING OF IMPULSES IN A REFRACTIVE FIBER
Upon a suggestion from I. M. Gel’fand and M. L. Tsetlin, attempts were made to study the process of impulse blocking in an excitable medium that is inhomogeneous with respect to refractivity. In the discussion of this question, a great and useful role was also played by Balakhovskiy, who subsequently devoted a separate article to that topic [13]. The processes involving the blocking of impulses in an excitable medium with a discrete variation of refractivity at a certain point were thoroughly studied theoretically and experimentally by Arshavskiy et al. [5, 6]. M. L. Tsetlin actively participated in this work, and edited the articles devoted to these problems. These papers have shown that for a finite length of the impulse in a medium with a discrete change of refractivity, the nth impulse may periodically
disappear. Patterns of that disappearance were predicted which were then verified experimentally on single nerve fibers. The mechanism of the periodic blocking of impulses investigated in those papers permits us to understand one of the forms of cardiac arrhythmia, the so-called Wenckebach periodicity. Subsequently, the blocking of impulses was studied under various conditions in many experimental and theoretical investigations. The disappearance of impulses is interesting primarily because this process apparently plays an important role in a number of pathological cases; in addition, in certain cases the blocking of impulses is apparently used by the organism in processes involving the coding of afferentation and in performing logical operations. Finally, neuron inhibition is also one of the mechanisms responsible for the blocking of impulses. We shall mention several other papers relating to this cycle. Several authors [7, 16, 109] discuss the blocking of impulses in a fiber with a gradual variation of refractivity along its length. It was shown that in this case a periodic disappearance of a group of impulses must take place. Smolyaninov [136] discussed the problem of the stability of the length of the cycles formed. It was shown that periodic blocking of impulses should be observed not only in a region with longer refractivity, but also at any inhomogeneity resulting in a lowered reliability of passage. A survey of the articles dealing with the blocking of impulses was given by Berkinblit [14].

2 Two-Dimensional Excitable Media²

A. THE SYNCHRONIZATION PROCESS
Continuing the work on the model formulated in [60], I. M. Gel’fand and M. L. Tsetlin showed that, with different periods of spontaneous excitation at various points of an excitable medium, the entire medium (after a certain transient state) begins to become excited at the frequency of its fastest element: the entire medium is synchronized with the fastest element. This process of synchronization was studied on a computer by Lukashevich [117].
² As indicated, the very fact that the paper [60] was written was due to a large extent to discussions dealing with the physiology of the heart. M. L. Tsetlin was very interested in these questions and put in a great deal of work to build medical apparatus that could be used in heart ailments. It was no coincidence that the subsequent development of research in the area dealing with the properties of two-dimensional excitable media was closely related to problems associated with the physiology of the heart.
The importance of this process for the functioning of the heart and its applicability to the cardiac sinus were shown in the papers by Gel’fand et al. [57].
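A toy pulse-coupled sketch of this synchronization (my own illustration, not the model of [60]): cells in a chain fire spontaneously with different periods, a firing cell immediately excites any sufficiently recovered neighbor, and after a transient the whole chain fires at the period of its fastest cell.

```python
# Chain of cells with spontaneous periods T[i]; a cell's phase grows at rate
# 1/T[i], it fires at phase 1, and firing excites neighbors whose phase is
# past a refractory threshold THETA, so excitation spreads as a wave.
T = [1.3, 1.7, 1.0, 1.9, 1.2]      # cell 2 is the fastest (period 1.0)
THETA = 0.5                        # fraction of recovery needed to be excitable
dt = 0.001

phase = [0.0] * len(T)
fires_of_last = []                 # firing times of the last cell in the chain
for step in range(6000):
    t = step * dt
    phase = [p + dt / Ti for p, Ti in zip(phase, T)]
    queue = [i for i, p in enumerate(phase) if p >= 1.0]
    fired = set(queue)
    while queue:                   # cascade: excitation spreads to neighbors
        i = queue.pop()
        for j in (i - 1, i + 1):
            if 0 <= j < len(T) and j not in fired and phase[j] >= THETA:
                fired.add(j)
                queue.append(j)
    for i in fired:
        phase[i] = 0.0
        if i == len(T) - 1:
            fires_of_last.append(t)

# after the transient, even the slowest cells fire at the fastest period
intervals = [b - a for a, b in zip(fires_of_last[2:], fires_of_last[3:])]
assert intervals and all(abs(iv - min(T)) < 0.01 for iv in intervals)
```

The fastest cell always reaches threshold first; each of its firings resets the whole chain before any slower cell can fire spontaneously, which is the entrainment mechanism the text describes.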
B. “QUIVERING AND FLICKERING”
A study of the propagation of waves in a two-dimensional excitable medium subject to the axioms presented in [60] led Balakhovskiy [12] to a model describing pathological modes of the propagation of excitation in the heart. He showed that when ruptures occur in the front of a wave traveling in a two-dimensional medium, the region of the rupture becomes a generator of new periodic waves which follow the original wave with a high frequency. In a medium containing inhomogeneities, the waves produced by a generator of this type may themselves suffer disruptions and become new sources of waves. A development of this process results in a state which is similar to the quivering and flickering of the cardiac muscle. A pathological state of this type was simulated on a computer by Lukashevich [118]. He assumed that the rupture of an excitation front occurs as a result of a temporary random “death” of a certain portion of the medium, which later comes back to life. Another front-rupture mechanism was analyzed by Krinskiy [105]. He used for the two-dimensional case the results on the blocking of impulses by a refractivity inhomogeneity which were described previously. Krinskiy and Kholopov [108] have shown that in a medium in which the excitation impulse has a finite length, and which is inhomogeneous as to refractivity, it is possible for an impulse to be “reflected” from a region of greater refractivity, thus producing an “echo.” (This phenomenon is observed under certain conditions in a real heart.) By taking this phenomenon into account, we can build new models of the quivering and flickering that will permit us to explain the occurrence of the states under consideration for much smaller regions of an excitable medium. The restrictions as to the minimum size of a region necessary for the occurrence of quivering entailed considerable difficulty for other models.
Krinskiy investigated the interactions among wave generators (reverberators) produced when the front of an excitation is ruptured, and showed that they have a finite lifetime. The duration of the pathological state depends on the ratio of the numbers of appearing and disappearing reverberators. This type of dynamics was studied by Krinskiy and Kholopov in computer experiments, and for the sources of waves of the “echo” type, numerical estimates of the process are given in a paper by Krinskiy et al. [107].
C. CABLE NETWORKS

The above investigations represent either an analysis of processes occurring in formal excitable media, done analytically or using a computer, or the results of experiments with real objects (the heart, nerve fibers). A different type of paper is represented by the articles by Arshavskiy et al. [8, 9, 15, 135, 137] dealing with the properties of cable networks. These papers, instead of concentrating on a homogeneous continuous medium, deal with networks of various structures, formed by cable segments whose insulation has properties similar to those of the membranes of living cells. These models, while still quite formal, make a more detailed analysis of the structure of real biological objects, for example, that of the cardiac muscle. This creates additional difficulties for such models, but at the same time makes it possible to explain certain effects discovered in real objects. The models were used to explain certain properties of the cardiac muscle, and to advance a hypothesis about the functioning of dendrites in nerve cells. This sequence of papers is, in the opinion of their authors, closely related to the original article [60]. A separate direction is represented by papers proposing to utilize continuous media to perform logical operations. This type of use is suggested by the high reliability of such systems. An example of such papers is an article by Balakhovskiy [ZZ]. At the present time, attempts are being made to use homogeneous integrated circuits in computer technology. Thus, homogeneous technological models are beginning to be used as actual control systems. M. L. Tsetlin, when speaking of continuous excitable media, often expressed the view that they are not only a convenient model of the simplest biological objects (such as the cardiac muscle), but that they may also be a suitable element for computers and a model for the functioning of certain brain structures.
In relation to computers, this view is indeed beginning to be confirmed.
The Restructuring of the Operation of the Spinal Level¹
M. L. Tsetlin’s concepts about nonindividualized control in a system of multilevel regulation of movements were experimentally confirmed in a special series of experiments designed to study the restructuring of the system of interaction among the nerve mechanisms at the spinal level. Here we shall only briefly describe the experimental setup and the basic results (for more details, see the literature [83, 125, 84]).
Gurfinkel’ and Pal’tsev [83] demonstrated the presence of cross influences in the human spinal cord: the knee jerk produced on one side changed in a specific way the state of the segmentary structures on the opposite side. The changes are revealed by monosynaptic testing. These cross effects are not monotonic: 20-40 msec after the reflex is produced, a slight facilitation occurs on the opposite side; it is replaced at 40-80 msec by a pronounced inhibition, which in turn gives way to a considerable facilitation in the interval of 100-180 msec. This is the picture of cross effects in a healthy person at rest.
If the subject is at the same time asked to extend both legs at the knee joints, and the character of the cross effects is studied after a number of such accompanying movements, it turns out that the cross inhibition usually occurring at 40-80 msec is absent after the conditioned reflex. The normal interaction of segmentary structures is gradually reestablished within several minutes. If the subject is asked to perform a different motor task, such as alternating movements at the knee joints as in walking, then the earlier cross facilitation is replaced by an inhibition, which is likewise gradually attenuated [125, 84].
¹ This text was written by V. S. Gurfinkel’ (Editor’s note).
These data indicate that in the multilevel control system of the supporting motor apparatus, movements of a single type restructure the system of interaction of the auxiliary elements of the lower level, and the character of this restructuring is determined by the type of motor task.
References
Abbreviations used in references:
AS USSR - The Publishing House of the Academy of Sciences of the USSR
IL - Foreign Literature State Publishing House
OGIZ - Association of the State Publishing Houses of the USSR
Fizmatgiz - State Publishing House of Physics and Mathematics Literature
AMN - Publishing House of the Academy of the Medical Sciences of the USSR
SD of AS USSR - The Publishing House of the Siberian Division of the Academy of Sciences of the USSR
Nauka - State Publishing House “Nauka”
Medgiz - State Publishing House of Medical Literature
Meditsina - State Publishing House “Meditsina”
Biomedgiz - State Publishing House of Biomedical Literature
Mir - State Publishing House “Mir”
Soviet radio - State Publishing House “Sovetskoye radio”
1. Ayzerman, M. A., Experiments in teaching a machine to recognize visual images. In “Biologicheskiye Aspekty Kibernetiki” (Biological Aspects of Cybernetics). AS USSR, Moscow, 1962.
2. Alekseyev, M. A., Zalkind, M. S., and Kushnarev, V. M., The solution by a human being of the problem of choice with a probabilistic confirmation of motor reactions. In “Biologicheskiye Aspekty Kibernetiki” (Biological Aspects of Cybernetics). AS USSR, Moscow, 1962.
3. Anokhin, P. K., Introductory article. “Problemy Tsentra i Periferii” (Problems of the Center and the Periphery). OGIZ, Gorkiy, 1935.
4. Anokhin, P. K., “Problemy Vysshey Nervnoy Deyatel’nosti” (Problems of Higher Nervous Activity). AMN, Moscow, 1949.
5. Arshavskiy, Yu. I., Berkinblit, M. B., and Kovalev, S. A., A periodic transformation of rhythm in single nerve fibers. Biofizika 7, No. 4 (1962).
6. Arshavskiy, Yu. I., Berkinblit, M. B., and Kovalev, S. A., The place of occurrence of the rhythm transformation in a nerve fiber with an artificially produced inhomogeneity. Biofizika 7, No. 5 (1962).
7. Arshavskiy, Yu. I., Berkinblit, M. B., Kovalev, S. A., and Chaylakhyan, L. M., A periodic transformation of rhythm in a nerve fiber with gradually changing properties. Biofizika 9, No. 3 (1964).
8. Arshavskiy, Yu. I., Berkinblit, M. B., Kovalev, S. A., Smolyaninov, V. V., and Chaylakhyan, L. M., Certain properties of continuous excitable media. In “Modelirovaniye Funktsiy Nervnoy Sistemy” (Modeling of the Functions of the Nervous System). Rostov-on-Don, 1965.
9. Arshavskiy, Yu. I., Berkinblit, M. B., Kovalev, S. A., Smolyaninov, V. V., and Chaylakhyan, L. M., The analysis of the functional properties of dendrites in connection with their structure. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
10. Arshavskiy, Yu. I., Berkinblit, M. B., and Dunin-Barkovskiy, V. L., The propagation of impulses in a ring formed of an excitable tissue. Biofizika 10, No. 6 (1965).
11. Balakhovskiy, I. S., On the possibility of modeling simple behavioral acts by discrete homogeneous media. Probl. Kibern. No. 5 (1961).
12. Balakhovskiy, I. S., Certain modes of movement of an excitation in an ideal excitable tissue. Biofizika 10, No. 6 (1965).
13. Balakhovskiy, I. S., An occurrence of the Wenckebach-Samoylov cycles in continuous and quasicontinuous models. Biofizika 11, No. 1 (1966).
14. Berkinblit, M. B., A periodic blocking of impulses in excitable tissues. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
15. Berkinblit, M. B., Kovalev, S. A., Smolyaninov, V. V., and Chaylakhyan, L. M., The electric behavior of the myocardium as a system and the characteristics of the membranes of the heart cells. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
16. Berkinblit, M. B., Fomin, S. V., and Kholopov, A. V., The propagation of an impulse in a one-dimensional excitable medium. Biofizika 11, No. 2 (1966).
17. Bernshteyn, N. A., “Klinicheskiye Puti Sovremennoy Biomekhaniki” (Clinical Applications of Contemporary Biomechanics). Collection of papers of the State Institute for the Improvement of Physicians, Kazan’, 1929.
18. Bernshteyn, N. A., The problem of the mutual relationship between coordination and localization. Arkh. Biol. Nauk 38, No. 1 (1935).
19. Bernshteyn, N. A., “O Postroyenii Dvizheniy” (On the Generation of Movements). Medgiz, Moscow, 1947.
20. Bernshteyn, N. A., Certain pressing problems of the regulation of motor acts. Vop. Psikhol. No. 6 (1957).
21. Bernshteyn, N. A., The recurrent problems of the physiology of activity. Probl. Kibern. No. 6 (1961).
22. Bernshteyn, N. A., “Ocherki po Fiziologii Dvizheniy i Fiziologii Aktivnosti” (Outlines in the Physiology of Movements and the Physiology of Activity). Meditsina, Moscow, 1966.
23. Bernshteyn, N. A., and Kots, Ya. M., Tonus (Muscular tone). Great Med. Encycl. 32 (1963).
24. Blackwell, D., and Girshick, M., “Theory of Games and Statistical Decisions.” IL, Moscow, 1958.
25. Bongard, M. M., Modeling of the recognition process on a digital computer. Biofizika 6, No. 2 (1961).
26. Bongard, M. M., A modeling of the process of learning how to recognize on a digital computer. In “Biologicheskiye Aspekty Kibernetiki” (Biological Aspects of Cybernetics). AS USSR, Moscow, 1962.
27. Bongard, M. M., “Problema Uznavaniya” (Problem of Recognition). Nauka, Moscow, 1967.
28. Borovikov, V. A., An approximate solution of the Goore game. Probl. Kibern. No. 20 (1968).
29. Borovikov, V. A., and Bryzgalov, V. I., A simple symmetric game between many automata. Avtomat. Telemekh. 26, No. 4 (1965).
30. Braverman, E. M., Experiments on teaching machines to recognize visual images. Avtomat. Telemekh. 23, No. 3 (1962).
31. Breydo, M. G., Gurfinkel’, V. S., Kobrinskiy, A. Ye., Sysin, A. Ya., Tsetlin, M. L., and Yakobson, Ya. S., On the bioelectric control systems. Probl. Kibern. No. 2 (1958).
32. Bryzgalov, V. I., Gel’fand, I. M., Pyatetskiy-Shapiro, I. I., and Tsetlin, M. L., Homogeneous games of automata and their simulation on a digital computer. Avtomat. Telemekh. 25, No. 11 (1964).
33. Bryzgalov, V. I., Pyatetskiy-Shapiro, I. I., and Shik, M. L., On a two-level model of the interaction of automata. Dokl. Akad. Nauk SSSR 160, No. 5 (1965).
34. Bregman, L. M., A proof of convergence of the method of G. V. Sheleykhovskiy for a problem with transport restrictions. Zh. Vychisl. Mat. Mat. Fiz. 7, No. 1 (1967).
35. Buylov, V. L., A model of a learning system built on static triggers. Diploma Dissertation, Phys. Dept. of the Moscow State Univ., 1958.
36. Butrimenko, A. V., Games of automata and their use in the control of a communications network. Author’s abstract of the Candidate Dissertation, Moscow, 1967.
37. Butrimenko, A. V., On the search for optimal routes in a changing graph. Izv. Akad. Nauk SSSR Tekh. Kibern. No. 6 (1964).
38. Butrimenko, A. V., On the mean distance between the vertices of a graph. In “Seti Peredachi Informatsii i Ikh Avtomatizatsiya” (Information Transfer Networks and Their Automatization). Nauka, Moscow, 1965.
39. Butrimenko, A. V., and Lazarev, V. G., A system of search for the optimal routes of message transmission. Probl. Peredachi Inform. 1, No. 1 (1965).
40. Bush, R., and Mosteller, F., “Stochastic Learning Models.” IL, Moscow, 1962.
41. Vayda, S., Theory of games and linear programming. In “Linear Inequalities.” IL, Moscow, 1959.
42. Vaynshteyn, B. K., Gel’fand, I. M., Kayushina, R. A., and Fedorov, Yu. G., Localization of crystal structures by minimizing the R-factor. Dokl. Akad. Nauk SSSR 153, No. 1 (1963).
43. Varshavskiy, V. I., and Vorontsova, I. P., On the behavior of stochastic automata with a variable structure. Avtomat. Telemekh. 24, No. 3 (1963).
44. Varshavskiy, V. I., Vorontsova, I. P., and Tsetlin, M. L., The “learning” of stochastic automata. In “Biologicheskiye Aspekty Kibernetiki” (Biological Aspects of Cybernetics). AS USSR, Moscow, 1962.
45. Varshavskiy, V. I., Meleshina, M. I., and Tsetlin, M. L., The behavior of automata in periodic random media and the problem of synchronization in the presence of noise. Probl. Peredachi Inform. 1, No. 1 (1965).
46. Varshavskiy, V. I., Meleshina, M. I., and Tsetlin, M. L., The organization of waiting discipline in mass service systems with the use of models of the collective behavior of automata. Probl. Peredachi Inform. 4, No. 1 (1968).
47. Veber, N. V., Rodionov, I. M., and Shik, M. L., “Escape” of the spinal cord from supraspinal effects. Biofizika 10, No. 2 (1965).
48. Ventsel’, Ye. S., “Teoriya Veroyatnostey” (Theory of Probability). Nauka, Moscow, 1965.
49. Volkonskiy, V. A., Asymptotic properties of the behavior of simple automata in a game. Probl. Peredachi Inform. 1, No. 2 (1965).
50. Gel’fand, I. M., Certain problems in the theory of quasilinear equations. Usp. Mat. Nauk 14, No. 2 (1959).
51. Gel’fand, I. M., Grashin, A. F., and Ivanova, L. N., The phase analysis of the p-p scattering at energies of 150 MeV. Zh. Eksp. Teor. Fiz. 40, No. 5 (1961).
52. Gel’fand, I. M., Grashin, A. F., and Pomeranchuk, I. Ya., The phase analysis of the p-p scattering at energies of 95 MeV. Zh. Eksp. Teor. Fiz. 40, No. 4 (1961).
53. Gel’fand, I. M., Gurfinkel’, V. S., Kots, Ya. M., Krinskiy, V. I., Tsetlin, M. L., and Shik, M. L., An investigation of posture-related activity. Biofizika 9, No. 5 (1964).
54. Gel’fand, I. M., Gurfinkel’, V. S., Kots, Ya. M., Tsetlin, M. L., and Shik, M. L., On the synchronization of motor units and the related model concepts. Biofizika 8, No. 4 (1963).
55. Gel’fand, I. M., Gurfinkel’, V. S., and Tsetlin, M. L., Certain ideas about the tactics of movement generation. Dokl. Akad. Nauk SSSR 139, No. 5 (1961).
56. Gel’fand, I. M., Gurfinkel’, V. S., and Tsetlin, M. L., On the tactics of controlling complex systems in relation to physiology. In “Biologicheskiye Aspekty Kibernetiki” (Biological Aspects of Cybernetics). AS USSR, Moscow, 1962.
57. Gel’fand, I. M., Kovalev, S. A., and Chaylakhyan, L. M., Intracell stimulation of various sections of a frog’s heart. Dokl. Akad. Nauk SSSR 148, No. 4 (1963).
58. Gel’fand, I. M., Pyatetskiy-Shapiro, I. I., and Fedorov, Yu. G., Localization of crystal structures by means of a nonlocal search. Dokl. Akad. Nauk SSSR 152, No. 5 (1963).
59. Gel’fand, I. M., Pyatetskiy-Shapiro, I. I., and Tsetlin, M. L., On certain classes of games and games of automata. Dokl. Akad. Nauk SSSR 152, No. 4 (1963).
60. Gel’fand, I. M., and Tsetlin, M. L., On continuous models of control systems. Dokl. Akad. Nauk SSSR 131, No. 6 (1960).
61. Gel’fand, I. M., and Tsetlin, M. L., The principle of the nonlocality of a search in problems of automatic optimization. Dokl. Akad. Nauk SSSR 137, No. 2 (1961).
62. Gel’fand, I. M., and Tsetlin, M. L., On certain methods of controlling complex systems. Usp. Mat. Nauk 17, No. 1 (1962).
62a. Gel’fand, I. M., Gurfinkel’, V. S., Shik, M. L., and Tsetlin, M. L., in “Models of the Structurally-Functional Organization of Some Biological Systems.” Nauka, Moscow, 1966.
63. Gel’fand, S. I., and Kazhdan, D. A., On one integral equation connected with the movement of an impulse along a circle. Dokl. Akad. Nauk SSSR 141, No. 3 (1961).
64. Ginzburg, S. L., Krylov, V. Yu., and Tsetlin, M. L., On one example of a game of many identical automata. Avtomat. Telemekh. 25, No. 5 (1964).
65. Ginzburg, S. L., and Tsetlin, M. L., Some examples of a simulation of the collective behavior of automata. Probl. Peredachi Inform. 1, No. 2 (1965).
66. Glushkov, V. M., “Vvedeniye v Teoriyu Samosovershenstvuyushchikhsya Sistem” (Introduction to a Theory of Self-perfecting Systems). KVIRTU, Kiev, 1962.
67. Glushkov, V. M., The theory of learning by one class of discrete perceptrons. Zh. Vychisl. Mat. Mat. Fiz. 2, No. 2 (1962).
68. Glushkov, V. M., Self-organizing systems and the abstract theory of automata. Zh. Vychisl. Mat. Mat. Fiz. 2, No. 3 (1962).
69. Glushkov, V. M., Toward the problem of self-learning in a perceptron. Zh. Vychisl. Mat. Mat. Fiz. 2, No. 6 (1962).
70. Glushkov, V. M., “Sintez Tsifrovykh Avtomatov” (Design of Digital Automata). Fizmatgiz, Moscow, 1962.
71. Glushkov, V. M., Kovalevskiy, V. A., and Rybakov, V. I., An algorithm for teaching a machine to recognize simple geometric figures. In “Printsipy Postroyeniya Samoobuchayushchikhsya Sistem” (The Principles of Constructing Learning Systems). KVIRTU, Kiev, 1962.
72. Gorokhov, Yu. S., Matusova, A. P., Mel’nikova, V. A., Tarantovich, T. M., Shabashov, V. M., and Tsetlin, M. L., A device for a registration and a diagnosis of the disturbances in the rhythmic cardiac activity. Izv. Vyssh. Ucheb. Zaved. Radiofiz. 4, No. 1 (1961).
73. Granit, R., “Electrophysiological Investigations of Reception.” IL, Moscow, 1957.
74. Gurfinkel’, V. S., The standing posture in healthy persons and fitting a prosthesis after an amputation of lower limbs. Author’s Abstract of the Doctoral Dissertation, AMN, 1961.
75. Gurfinkel’, V. S., The problem of the stability in standing and its meaning for prosthesis fitting. Sci. Session Central Sci. Res. Inst. for Prosthetics and Orthopedic Appliances, 2nd, 1952.
76. Gurfinkel’, V. S., Materials on the importance of the visual and vestibular apparatus for the stability of standing in humans. Sci. Session Central Sci. Res. Inst. for Prosthetics and Orthopedic Appliances, 3rd, Moscow, 1953.
77. Gurfinkel’, V. S., Isakov, P. K., and Malkin, V. B., The coordination of the posture and movements under the conditions of lowered and heightened gravitation. Byull. Eksp. Biol. Med. 53, No. 11 (1959).
78. Gurfinkel’, V. S., Ivanova, A. N., Kots, Ya. M., Pyatetskiy-Shapiro, I. I., and Shik, M. L., The numerical characteristics of the functioning of motor units in a stationary mode. Biofizika 9, No. 5 (1964).
79. Gurfinkel’, V. S., Kots, Ya. M., Pal’tsev, Ye. I., and Fel’dman, A. G., The organization of the inter-joint interaction using an example of the compensation of the breathing-related relocations in the erect posture in humans. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
80. Gurfinkel’, V. S., Kots, Ya. M., and Shik, M. L., “Regulyatsiya Pozy Cheloveka” (Regulation of Posture in Humans). Nauka, Moscow, 1965.
81. Gurfinkel’, V. S., Kots, Ya. M., Krinskiy, V. I., and Shik, M. L., A method of estimating the condition of the inhibitory apparatus in a human spinal cord. Byull. Eksp. Biol. Med. 59, No. 5 (1965).
82. Gurfinkel’, V. S., Malkin, V. B., and Tsetlin, M. L., X-ray photography of the heart at arbitrarily chosen phases of the cardiac cycle. Vestn. Rentgenol. Radiol. 36, No. 6 (1961).
83. Gurfinkel’, V. S., and Pal’tsev, Ye. I., The effect of the state of the segmentary apparatus of the spinal cord on the realization of a simple motor reaction. Biofizika 10, No. 5 (1965).
84. Gurfinkel’, V. S., and Pal’tsev, Ye. I., On certain features of the control of complex biomechanical systems. Probl. Kibern. No. 20 (1969).
85. Gurfinkel’, V. S., Tsetlin, M. L., Malkin, V. B., and Khudyakov, A. V., The use of bioelectric cardiac signals for the purposes of control. In “Voprosy Patologii i Regeneratsii Organov Krovoobrashcheniya i Dykhaniya” (Problems of Pathology and Regeneration of the Circulatory and Breathing Organs). SD of AS USSR, Novosibirsk, 1961.
86. Gurfinkel’, V. S., and Tsetlin, M. L., Toward a methodology of the electrical stimulation of the heart. Biofizika 6, No. 1 (1961).
87. Dobrovidov, A. V., and Stratonovich, R. L., On the construction of optimal automata functioning in random media. Avtomat. Telemekh. 25, No. 10 (1964).
88. Dubovitskiy, A. Ya., and Milyutin, A. A., The extremum problem with constraints. Dokl. Akad. Nauk SSSR 149, No. 4 (1963).
89. Dubovitskiy, A. Ya., and Milyutin, A. A., Extremum problems with constraints. Zh. Vychisl. Mat. Mat. Fiz. 5, No. 3 (1965).
90. Zhukova, G. P., Problem of the neuron structure of the spinal cord. Arkh. Anat. Gistol. Embriol. 35, No. 6 (1958).
91. Ivanov, A. F., and Telesnin, V. R., A passage of impulse pairs through a circuit and a ring of driven multivibrators. Izv. Vyssh. Ucheb. Zaved. Radiofiz. 2, No. 1 (1959).
92. Kandel’, E. I., “Parkinsonizm i Yego Khirurgicheskoye Lecheniye” (Parkinsonism and Its Surgical Treatment). Meditsina, Moscow, 1965.
93. Kantorovich, L. V., and Gavurin, M. K., Application of mathematical methods to problems of the analysis of volume traffic. In “Problemy Povysheniya Effektivnosti Raboty Transporta” (Problems of Increasing the Efficiency of Transportation). AS USSR, Moscow, 1949.
94. Keder-Stepanova, I. A., and Rikko, N. N., A model of a neuron system with a periodic salvo activity, and stable to random afferent influences. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
95. Kobrinskiy, A. Ye., Breydo, M. G., Gurfinkel’, V. S., Polyan, Ye. P., Slavutskiy, Ya. L., Sysin, A. Ya., Tsetlin, M. L., and Yakobson, Ya. S., A Servomechanism Controlled by Muscular Biocurrents. Author’s Certificate No. 118581, 1958.
96. Kobrinskiy, A. Ye., Breydo, M. G., Gurfinkel’, V. S., Sysin, A. Ya., Tsetlin, M. L., and Yakobson, Ya. S., A bioelectric control system. Dokl. Akad. Nauk SSSR 117, No. 1 (1957).
97. Kobrinskiy, N. Ye., and Trakhtenbrot, B. A., “Vvedeniye v Teoriyu Konechnykh Avtomatov” (Introduction to a Theory of Finite Automata). Fizmatgiz, Moscow, 1962.
98. Kovalevskiy, V. A., A correlation method of pattern recognition. Zh. Vychisl. Mat. Mat. Fiz. 2, No. 2 (1962).
99. Kostyuk, P. G., Characteristic features of the process of excitation and inhibition in individual intermediate neurons of the spinal cord. Fiziol. Zh. SSSR im. I. M. Sechenova 47, No. 10 (1961).
100. Kotov, Yu. B., and Tsetlin, M. L., The simulation of the functioning of a motoneuron pool on a digital computer. Probl. Kibern. No. 20 (1968).
101. Krid, R., Denni-Broun, D., Ikkls, I., Liddel, Ye., and Sherrington, Ch., “Reflektornaya Deyatel’nost’ Spinnogo Mozga” (Reflex Activity of the Spinal Cord). Biomedgiz, Moscow-Leningrad, 1935.
102. Krinskiy, V. I., An asymptotically-optimal automaton with an exponential rate of convergence. Biofizika 9, No. 2 (1964).
103. Krinskiy, V. I., On one construction of a sequence of automata and its behavior in games. Dokl. Akad. Nauk SSSR 156, No. 6 (1964).
104. Krinskiy, V. I., Zero-sum games of two asymptotically-optimal sequences of automata. Probl. Peredachi Inform. 2, No. 2 (1966).
105. Krinskiy, V. I., The propagation of an impulse in an inhomogeneous medium (States analogous to cardiac fibrillation). Biofizika 11, No. 4 (1966).
106. Krinskiy, V. I., and Ponomarev, V. A., Blindman’s buff games. Biofizika 9, No. 3 (1964).
107. Krinskiy, V. I., Fomin, S. V., and Kholopov, A. V., On the critical mass during fibrillation. Biofizika 12, No. 5 (1967).
108. Krinskiy, V. I., and Kholopov, A. V., An echo in an excitable tissue. Biofizika 12, No. 3 (1967).
109. Krinskiy, V. I., and Kholopov, A. V., Transmission of impulses in an excitable tissue with a continuously distributed refractivity. Biofizika 12, No. 4 (1967).
110. Krinskiy, V. I., and Shik, M. L., A method of investigating posture. Biofizika 8, No. 4 (1963).
111. Krinskiy, V. I., and Shik, M. L., On one model motor problem. Biofizika 9, No. 5 (1964).
112. Krylov, V. Yu., On one stochastic automaton which is asymptotically-optimal in a random medium. Avtomat. Telemekh. 24, No. 9 (1963).
113. Krylov, V. Yu., and Tsetlin, M. L., On automaton games. Avtomat. Telemekh. 24, No. 7 (1963).
114. Kurdyatsev, L. D., Certain mathematical problems of the theory of electric networks. Usp. Mat. Nauk 3, No. 4 (1948).
115. Landau, L. D., and Lifshitz, E. M., Quantum Mech. 1 (1948).
116. Letichevskiy, A. A., and Dorodnitsyna, A. A., A modeling of the natural selection on a computer. In “Printsipy Postroyeniya Samoobuchayushchikhsya Sistem” (Principles of Constructing Learning Systems). KVIRTU, Kiev, 1962.
117. Lukashevich, I. P., A computer study of continuous models of control systems. Biofizika 8, No. 6 (1963).
118. Lukashevich, I. P., The rhythmicity of a homogeneous tissue and a modeling of the behavior of the cardiac muscle. Biofizika 9, No. 6 (1964).
119. Lunts, A. G., An application of the matrix Boolean algebra to the analysis and a construction of relay-contact networks. Dokl. Akad. Nauk SSSR 70, No. 3 (1950).
120. Lunts, A. G., A construction and an analysis of relay-contact networks by means of characteristic functions. Dokl. Akad. Nauk SSSR 75, No. 2 (1950).
121. L’yus, R. D., and Rayfa, Kh., “Games and Decisions.” IL, Moscow, 1961.
122. Mac-Kinsey, J., “Introduction to Game Theory.” Fizmatgiz, Moscow, 1960.
123. Milyutin, A. A., On automata with an optimal behavior in random media. Avtomat. Telemekh. 26, No. 1 (1965).
124. Mott, N., and Messi, G., “Theory of Atomic Collisions.” IL, Moscow, 1951.
125. Pal’tsev, Ye. I., The functional restructuring of the interaction of spinal structures in connection with a performance of arbitrary movements. Biofizika 12, No. 2 (1967).
126. Pittel’, B. G., The asymptotic properties of a version of the Goore game. Probl. Peredachi Inform. 1, No. 3 (1965).
127. Pittel’, B. G., A simple probabilistic model of collective behavior. Probl. Peredachi Inform. 3, No. 3 (1967).
128. Ponomarev, V. A., A construction of an automaton which is asymptotically-optimal in a stationary random medium. Biofizika 9, No. 1 (1964).
129. Pyatetskiy-Shapiro, I. I., and Shik, M. L., A problem of the spinal regulation of movements. Biofizika 9, No. 4 (1964).
130. Rozenblatt, F., “Printsipy Neyrodinamiki. Pertseptrony i Teoriya Mekhanizmov Mozga” (Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms). Mir, Moscow, 1965.
131. Saati, T., “Elementy Teorii Massovogo Obsluzhivaniya i Yeye Prilozheniya” (Elements of a Theory of Mass Servicing and Its Applications). Soviet radio, Moscow, 1965.
132. Savinov, G. V., An electrical modeling of homeostatic systems. Probl. Kibern. No. 4 (1960).
133. Sverdlov, S. M., and Maksimova, Ye. V., On the inhibitory effect of the afferent impulses on the motor effect of pyramidal stimulation. Biofizika 10, No. 1 (1965).
134. Sechenov, I. M., “Izbrannyye Trudy” (Collected Works), Vol. 1. AMN, Moscow, 1952.
135. Smolyaninov, V. V., The problem of the electrical properties of syncytiums. In “Modeli Strukturno-Funktsional’noy Organizatsii Nekotorykh Biologicheskikh Sistem” (Models of the Structurally-Functional Organization of Certain Biological Systems). Nauka, Moscow, 1966.
136. Smolyaninov, V. V., An analysis of a sequence of Wenckebach cycles. Biofizika 11, No. 2 (1966).
137. Smolyaninov, V. V., The reactive properties of complex physiological objects. Biofizika 12, No. 4 (1967).
138. Stefanov, V. L., An example of a problem involving a collective behavior of two automata. Avtomat. Telemekh. 24, No. 6 (1963).
139. Telesnin, V. R., The problem of the propagation of an excitation in a one-dimensional excitable tissue. Izv. Vyssh. Ucheb. Zaved. Radiofiz. 6, No. 3 (1963).
140. Telesnin, V. R., An establishment of a stationary state during a motion of an impulse along a ring made of an excitable tissue. Izv. Vyssh. Ucheb. Zaved. Radiofiz. 8, No. 1 (1965).
141. Telesnin, V. R., The problem of the propagation of an excitation in one-dimensional structures. Izv. Vyssh. Ucheb. Zaved. Radiofiz. 8, No. 2 (1965).
142. Fel’dbaum, A. A., The use of computers in automatic systems. Avtomat. Telemekh. 17, No. 11 (1956).
143. Fel’dbaum, A. A., Avtomaticheskiy optimizator (Automatic optimizer). Avtomat. Telemekh. 19, No. 8 (1958).
144. Fel’dbaum, A. A., “Vychislitel’nyye Ustroystva v Avtomaticheskikh Sistemakh” (Computers in Automatic Systems). Fizmatgiz, Moscow, 1959.
145. Fel’dbaum, A. A., “Teoriya dual’nogo upravleniya, I” (Theory of dual control, I). Avtomat. Telemekh. 21, No. 9 (1960).
146. Fel’dbaum, A. A., Theory of dual control, II. Avtomat. Telemekh. 21, No. 11 (1960).
147. Fel’dman, A. G., A computation of the spectrum of the physiological tremor on the basis of the data about the functioning of motor units. Biofizika 9, No. 6 (1964).
148. Kharkevich, A. A., Pattern recognition. Radiotekhnika (Moscow) 14, No. 5 (1959).
148a. Tsetlin, M. L., Doctoral Dissertation.
149. Tsetlin, M. L., The application of the matrix calculations to a construction of relay-contact networks. Dokl. Akad. Nauk SSSR 86, No. 3 (1953).
150. Tsetlin, M. L., A matrix method of a construction of electronic-impulse and relay-contact (nonprimitive) networks. Dokl. Akad. Nauk SSSR 117, No. 6 (1957).
151. Tsetlin, M. L., On compositions and decompositions of nonprimitive networks. Dokl. Akad. Nauk SSSR 118, No. 3 (1958).
152. Tsetlin, M. L., On nonprimitive networks. Probl. Kibern. No. 1 (1958).
153. Tsetlin, M. L., Certain properties of finite graphs in connection with the problem of transportation. Dokl. Akad. Nauk SSSR 129, No. 4 (1959).
154. Tsetlin, M. L., Certain problems of the behavior of finite automata. Dokl. Akad. Nauk SSSR 139, No. 4 (1961).
155. Tsetlin, M. L., The behavior of finite automata in random media. Avtomat. Telemekh. 22, No. 10 (1961).
156. Tsetlin, M. L., Remark on a game of a finite automaton against an opponent using a mixed strategy. Dokl. Akad. Nauk SSSR 149, No. 1 (1963).
157. Tsetlin, M. L., Finite automata and a modeling of simple forms of behavior. Usp. Mat. Nauk 18, No. 4 (1963).
158. Tsetlin, M. L., and Krylov, V. Yu., Examples of automaton games. Dokl. Akad. Nauk SSSR 149, No. 2 (1963).
159. Tsyan’, S.-S., “Tekhnicheskaya Kibernetika” (Technical Cybernetics). IL, Moscow, 1956.
160. Shapovalov, A. I., Peculiarities in the responses of neurons of the spinal cord to a rhythmic stimulation during intracellular recording. Dokl. Akad. Nauk SSSR 141, No. 5 (1961).
161. Shestakov, V. I., The algebra of two-terminal networks built exclusively of two-terminal circuits. Zh. Tekh. Fiz. 11, No. 6 (1941).
162. Eccles, J., “Physiology of Nerve Cells.” IL, Moscow, 1959.
163. Ashby, U. R., “Introduction to Cybernetics.” IL, Moscow, 1960.
164. Aitken, J. T., and Bridger, J. E., Neuron size and neuron population density in the lumbosacral region of the cat’s spinal cord. J. Anat. 95, No. 1 (1961).
165. Basmajian, J. V., Control and training of individual motor units. Science 141, No. 3579 (1963).
166. Battye, C. K., Nightingale, A., and Whillis, J., The use of myoelectric currents in the operation of prostheses. J. Bone Joint Surg. Brit. Vol. 37, 506 (1955).
167. Bickford, R. G., Electroencephalographically controlled anesthesia in abdominal surgery. J. Amer. Med. Ass. 143, No. 5-8 (1950).
168. Bickford, R. G., Automatic electroencephalographic control of general anesthesia. Electroencephalogr. Clin. Neurophysiol. 2, No. 2 (1950).
169. Eccles, J. C., Eccles, R. M., Iggo, A., and Lundberg, A., Electrophysiological investigations on Renshaw cells. J. Physiol. (London) 159, No. 3 (1961).
170. Eccles, J. C., Fatt, P., and Koketsu, K., Cholinergic and inhibitory synapses in a pathway from motor-axon collaterals to motoneurons. J. Physiol. (London) 126, No. 3 (1954).
171. Ferris, L. P., King, B. C., Spense, P. W., and Williams, H. B., Effect of electric shock on the heart. Elec. Eng. 55, No. 5 (1936).
172. Frank, K., and Fuortes, M. G. F., Unitary activity of spinal interneurons of cats. J. Physiol. (London) 131, No. 2 (1956).
173. Gelfan, S., Neuron and synapse population in the spinal cord: Indication of role in total integration. Nature (London) 198, No. 4876 (1963).
174. Gernand, B. E., Katzuki, Y., and Livingston, R. B., Functional organization of descending vestibular influences. J. Neurophysiol. 20, No. 5 (1957).
175. Gernand, B. E., and Ades, H. W., Spinal motor responses to acoustic stimulation. Exp. Neurol. 10, No. 1 (1964).
176. Gernand, B. E., in “Basic Research in Paraplegia” (J. D. French and R. W. Porter, eds.), 1962.
177. Giaquinto, S., Pompeiano, O., and Somogyi, J., Supraspinal inhibitory control of spinal reflexes during natural sleep. Experientia 19, No. 12 (1963).
178. Granit, R., Neuromuscular interaction in postural tone of the cat’s isometric soleus muscle. J. Physiol. (London) 143, No. 2 (1958).
179. Haase, Die Transformation des Entladungsmusters der Renshaw-Zellen bei tetanischer antidromer Reizung (Transformation of the discharge pattern of Renshaw cells under tetanic antidromic stimulation). Pfluegers Arch. 276, No. 5 (1963).
180. Hoffmann, P., “Untersuchungen über die Eigenreflexe menschlicher Muskeln” (Investigations of the Self-Reflexes of Human Muscles). Berlin, 1922.
181. Hufschmidt, H. J., Über einen supraspinalen Hemmungsmechanismus (On a Supraspinal Inhibition Mechanism). Pfluegers Arch. 275, No. 5 (1962).
182. Hunt, C. C., Temporal fluctuation in excitability of spinal motoneurons and its influence on monosynaptic reflex response. J. Gen. Physiol. 38, No. 6 (1955).
183. König, D., “Theorie der Endlichen und Unendlichen Graphen” (Theory of Finite and Infinite Graphs). Leipzig, 1936.
184. Lloyd, D. P. C., The spinal mechanism of the pyramidal system in cats. J. Neurophysiol. 4, No. 5 (1941).
185. Magladery, J. W., Some observations on spinal reflexes in man. Pfluegers Arch. 261, No. 4 (1955).
186. Matthews, P. B. C., Muscle spindles and their motor control. Physiol. Rev. 44, No. 2 (1964).
187. Matthews, P. B. C., The effect of the activity of the γ-motoneurons on the relation between tension and extension in the stretch reflex. J. Physiol. (London) 140, No. 1 (1958).
188. Matthews, P. B. C., The dependence of tension upon extension in the stretch reflex of the soleus muscle of the decerebrate cat. J. Physiol. (London) 147, No. 3 (1959).
189. Renshaw, B., Central effects of centripetal impulses in axons of spinal ventral roots. J. Neurophysiol. 9, No. 3 (1946).
190. Roberts, Rhythmic excitation of a stretch reflex, revealing (a) hysteresis and (b) a difference between the responses to pulling and to stretching. Quart. J. Exp. Physiol. Cogn. Med. Sci. 48, No. 4 (1963).
191. Shepherd, J. T., and Wood, E. H., Oxygen content of pulmonary artery blood in man during various phases of the respiratory and cardiac cycle. Fed. Proc. Fed. Amer. Soc. Exp. Biol. 13, No. 1 (I) (1954).
192. Sommer, J., Periphere Bahnung von Muskeleigenreflexen als Wesen des Jendrassikschen Phanomens (Peripheral facilitation of muscle self-reflexes as the essence of the Jendrassik phenomenon). Deut. Z. Nervenheilk. 150, No. 3 (1940).
193. Szentagothai, J., in "Basic Research in Paraplegia" (J. D. French and R. W. Porter, eds.), 1962.
194. Ulam, S., "A Collection of Mathematical Problems." New York-London, 1960.
195. Wells, J., Kinesiologie (1955).
196. Wilson, W. J., Functional anatomy of the spinal cord: Pathways starting with recurrent axon collaterals. In "Basic Research in Paraplegia" (J. D. French and R. W. Porter, eds.), 1962.
197. Wiener, N., and Rosenblueth, A., The mathematical formulation of the problem of conduction of impulses in a network of connected excitable elements, specifically in cardiac muscle. Arch. Inst. Cardiol. Mex. 16, No. 3-4 (1946).
198. Stefanyak, V. L., and Tsetlin, M. L., Power regulation in a group of radio stations. Probl. Peredachi Inform. 3, No. 4 (1967).
Author Index
Numbers in parentheses are reference numbers and indicate that an author's work is referred to although his name is not cited in the text. Numbers in italics show the page on which the complete reference is listed.

A

Ades, H. W., 167(175), 278
Aitken, J. T., 187(164), 277
Alekseyev, M. A., 31, 269
Anokhin, P. K., 269, 270
Arshavskiy, Yu. I., 148(5, 6), 149, 263, 264(7), 266, 270
Ashby, U. R., 135(163), 277
Ayzerman, M. A., 11, 269

B

Balakhovskiy, I. S., 263, 265, 266, 270
Basmajian, J. V., 171(165), 277
Battye, C. K., 197, 198, 278
Berkinblit, M. B., 148(5, 6), 149(9, 15), 262(16), 263(5, 6, 10), 264(7, 16), 266(8, 9, 15), 270
Bernshteyn, N. A., 160, 196(23), 270, 271
Bickford, R. G., 197, 278
Blackwell, D., 6(24), 41(24), 271
Bongard, M. M., 11, 271
Borovikov, V. A., 10, 239, 271
Braverman, E. M., 11, 271
Bregman, L. M., 244(34), 271
Breydo, M. G., 197(31, 96), 198(95), 201(96), 271, 274, 275
Bridger, J. E., 187(164)
Bryzgalov, V. I., 8(32), 10, 73, 167(33), 239, 271
Bush, R., 11, 271
Butrimenko, A. V., 253, 271
Buylov, V. L., 14(35), 18, 271

C

Chaylakhyan, L. M., 148(57), 149(9, 15), 170(57), 264(7), 265(57), 266(8, 9, 15), 270, 272
D Denni-Broun, D., 174(101), 275 Dobrovidov, A. Ya., 5, 274 Dorodnitsyna, A. A., 11, 275 Dubovitskiy, A. Ya., 5, 23, 274 Dunin-Barkovskiy, V. L., 263(10), 270
E

Eccles, J. C., 172(169, 170), 177(169, 170), 187(162), 277, 278
Eccles, R. M., 172(169), 177(169), 278
F

Fatt, P., 172(170), 177(170), 278
Fedorov, Yu. G., 140(42, 58), 271, 272
Fel'dbaum, A. A., 11, 132, 134(142-144), 136, 277
Fel'dman, A. G., 162(79), 171(147), 187, 273, 277
Ferris, L. P., 197, 278
Fomin, S. V., 262(16), 264(16), 265(107), 270, 275
Frank, K., 187(172), 278
Fuortes, M. G. F., 187(172), 278
G
Gavurin, M. K., 224, 274
Gelfan, S., 187(173), 278
Gel'fand, I. M., 8(32, 59), 11(56), 73(32), 131, 137(61), 140(42, 51, 52, 58), 147(60), 148, 154, 155(50), 160, 161, 164(61, 62), 165(62), 170(54, 57), 171(53, 54), 172, 173(54), 182, 184(54), 261(60), 262(60), 263(60), 264, 265(60), 266(60), 271, 272, 273
Gernand, B. E., 167(175, 176), 187(176), 278
Giaquinto, S., 193(177), 278
Ginzburg, S. L., 8(64), 65, 84, 273
Girshick, M., 6(24), 41(24), 271
Glushkov, V. M., 4, 10, 11, 70, 273
Gorokhov, Yu. S., 211(72), 273
Granit, R., 171(73), 172(178), 187(73), 191(73), 273, 278
Grashin, A. F., 140(51, 52), 272
Gurfinkel', V. S., 11(56), 131(62a), 160(62a), 161(56), 162, 164(55, 56, 74-77), 168(80), 170(54, 78), 171(53, 54), 172(54, 80), 173(54), 182(54), 184(54), 187, 193(81), 197(31, 96), 198(95), 201(82, 85, 96), 267, 271, 272, 273, 274, 275
H

Haase, 171(179), 172(179), 177(179), 278
Hoffman, P., 169(180), 189(180), 221(180), 278
Hufschmidt, H. J., 181, 194(181), 278
Hunt, C. C., 171(182), 172(182), 187(182), 278
I Iggo, A., 172(169), 177(169), 278 Ikkls, I., 174(101), 275 Isakov, P. K., 164(77), 273 Ivanov, A. F., 159, 274 Ivanova, L. N., 140(51), 170(78), 272, 273
K Kandel’, E. I., 274 Kantorovich, L. V., 224, 274
Katzuki, Y., 278
Kayushina, R. A., 140(42), 271
Kazhdan, D. A., 148(63), 263, 273
Keder-Stepanova, I. A., 149, 274
Kharkevich, A. A., 11, 277
Kholopov, A. V., 262(16), 264(16, 109), 265(107), 270, 275
Khudyakov, A. V., 201(85), 274
King, B. C., 197(171), 278
Kobrinskiy, A. Ye., 197(31, 96), 198, 201(96), 271, 274, 275
Kobrinskiy, N. Ye., 4, 275
Konig, D., 278
Koketsu, K., 172(170), 177(170), 278
Kostyuk, P. G., 171(99), 173(99), 187(99), 275
Kotov, Yu. B., 171, 172, 275
Kots, Ya. M., 162(79), 168(80), 170(54, 78), 171(53, 54), 172(54, 80), 173(54), 182(54), 184(54), 187, 193(81), 196(23), 271, 272, 273, 274
Kovalev, S. A., 148(5, 6, 57), 149(9, 15), 170(57), 263(5, 6), 264(7), 265(57), 266(8, 9, 15), 270, 272
Kovalevskiy, V. A., 11(71), 273, 275
Krid, R., 174(101), 275
Krinskiy, V. I., 4, 19, 44, 142(102), 163, 164, 171(53), 193(81), 196(110, 111), 247, 248, 264(109), 265, 272, 274, 275
Krylov, V. Yu., 4, 6(113, 158), 8(64), 20, 51, 54, 65(64), 142(112), 167(158), 273, 275, 277
Kurdyatsev, L. D., 221(114), 275
Kushnarev, V. M., 31(2), 269
L Landau, L. D., 140(115), 275 Lazarev, V. G., 253(39), 271 Letichevskiy, A. A., 11, 275 Liddel, Ye., 174(101), 275 Lifshitz, E. M., 140(115), 275 Livingston, R. B., 278 Lloyd, D. P. C., 171(184), 187(184), 278 Lukashevich, I. P., 149, 264, 265, 275, 276 Lundberg, A., 172(169), 177(169), 278 Lunts, A. G., 226, 276 L‘yus, R. D., 6(121), 41(121), 276
M

Mac-Kinsey, J., 6(122), 41(122), 73(122), 276
Magladery, J. W., 189(185), 193(185), 278
Maksimova, Ye. V., 167(133), 276
Malkin, V. B., 164(77), 201(82, 85), 273, 274
Matthews, P. B. C., 171(186), 172(187, 188), 278, 279
Matusova, A. P., 211(72), 273
Meleshina, M. I., 45, 272
Mel'nikova, V. A., 211(72), 273
Messi, G., 140(124), 276
Milyutin, A. A., 5, 23, 274, 276
Mosteller, F., 11, 271
Mott, N., 140(124), 276

N

Nightingale, A., 197(166), 198(166), 278

P

Pal'tsev, Ye. I., 162(79), 187, 267, 273, 274, 276
Pittel', B. G., 239(126), 240, 243, 276
Polyan, Ye. P., 198(95), 274
Pomeranchuk, I. Ya., 140(52), 272
Pompeiano, O., 193(177), 278
Ponomarev, V. A., 4, 20, 44, 142(128), 275, 276
Pyatetskiy-Shapiro, I. I., 8(32, 59), 11, 73(32), 140(58), 167(33, 129), 170(78), 196(129), 271, 272, 273, 276

R

Rayfa, Kh., 6(121), 41(121), 276
Renshaw, B., 171(189), 279
Rikko, N. N., 149, 274
Roberts, 172(190), 279
Rodionov, I. M., 167(47), 272
Rosenblueth, A., 154, 155, 158, 279
Rozenblatt, F., 11, 276
Rybakov, V. I., 11(71), 273

S

Saati, T., 102(131), 106, 276
Savinov, G. V., 135(132), 276
Sechenov, I. M., 198, 276
Shabashov, V. M., 211(72), 273
Shapovalov, A. I., 171(160), 187(160), 273, 277
Shepherd, J. T., 197, 279
Sherrington, Ch., 174(101), 275
Shestakov, V. I., 226, 227, 277
Shik, M. L., 11, 131(62a), 160(62a), 163, 164, 167(33, 47, 129), 168(80), 170(54, 78), 171(53, 54), 172(54, 80), 173(54), 182(54), 193(81), 196(110, 111, 129), 271, 272, 273, 274, 275, 276
Slavutskiy, Ya. L., 198(95), 274
Smolyaninov, V. V., 149(9, 15, 135), 264, 266(8, 9, 15), 270, 276
Sommer, J., 193(192), 279
Somogyi, J., 193(177), 278
Spense, P. W., 197(171), 278
Stefanov, V. L., 10, 276
Stefanyak, V. L., 279
Stratonovich, R. L., 5, 274
Sverdlov, S. M., 167(133), 276
Sysin, A. Ya., 197(31, 96), 198(95), 201(96), 271, 274, 275
Szentagothai, J., 171(193), 279
T

Tarantovich, T. M., 211(72), 273
Telesnin, V. R., 148(139), 159, 262(141), 263, 274, 276, 277
Trakhtenbrot, B. A., 4, 275
Tsetlin, M. L., 4(154, 155), 5(44), 6(113, 156-158), 8(32, 59, 113, 158), 11(56), 12, 16, 31(44), 51, 54, 65(64), 73(32), 84(65), 93(45), 106, 131(62, 62a), 137(61), 139(154, 155), 142(154, 157), 143(157), 147(60), 154(60), 160(62a), 161(56), 164(55, 56, 61, 62), 165(62), 167(158), 170(54), 171(53, 54), 172(54), 173(54), 182(54), 184(54), 197(31, 96), 198(95), 201(82, 85, 96), 211(72), 221, 226, 261(60), 262(60), 263(60), 264, 265(60), 266(60), 271, 272, 273, 274, 275, 277, 279
Tsyan', S.-S., 140(159), 277
U Ulam, S., 133, 279
V

Varshavskiy, V. I., 5, 31, 33, 93, 272
Vayda, S., 6(41), 41(41), 271
Vaynshteyn, B. K., 140(42), 271
Veber, N. V., 167(47), 272
Ventsel', Ye. S., 178(48), 272
Volkonskiy, V. A., 239(49), 240, 272
Vorontsova, I. P., 5, 31, 33, 272
W Wells, J., 167, 279 Whillis, J., 197(166), 198(166), 278 Wiener, N., 154, 155, 158, 279 Williams, H. B., 197(171), 278 Wilson, W. J., 171(196), 187(196), 279 Wood, E. H., 197, 279
Y

Yakobson, Ya. S., 197(31, 96), 198(95), 201(96), 271, 274, 275

Z
Zalkind, M. S., 31(2), 269
Zhukova, G. P., 171(90), 274
Subject Index
A

Achilles reflex, 190
Action, 12
Address changes, 244
Addressless control, 124
Asymptotically optimal sequence, 16, 142, 251
Asynchronous activity, 171
Automata
  asymptotically optimal, 34, 247, 251
  sequence of, 249
  stochastic, 48
  symmetric, 16, 48
Automatic diagnosis, 210
  instrument for, 214
Automaton
  deterministic, 12
  homogeneous, 16
  stochastic, 13
  tuned to index i, 84
Automaton game, 41, 42, 143
  ergodic, 43
  homogeneous, 151
Automaton value, 249
Automorphism, 72
  of game, 56

B

Behavior
  collective, 41, 260
  expedient, 31, 110, 141, 151
  mathematical models of, 108
Biocontrol system, 197
Bioelectric control, 216
Bioelectric control system, 197, 199, 216
Bioelectric manipulator, 199
Biopotential, 197, 216
Blood
  analysis of samples, 208
  gas content of, 208
Breathing
  artificial, 201
  synergy, 162

C

Cable network, 266
Cardiac, see Heart
Cardiosynchronizer, 202, 205, 206, 208
Central nervous system, 131
Chirography, 202
Circle game, 71, 77, 80
City planning, 244
Coalitions, 258
Code of movement, 198
Communication network control, 253
Commutator, 228
Comparing tactic, 20
Composite medium, 25
Computer simulation, 62, 69, 71, 145, 172, 254, 264
Computer technology, 266
Control process, 133
Control system, 141
  adaptable, 253
  continuous model of, 154
Cyclic Ede, 221
D

Depth of state, 22
Distribution game, 64, 145, 151, 254
Distribution problem, 115
E

ECG, 201
Eigenvalue, 27, 34
Eigenvector, 27
Electrical network, 226
Electrocardiograph, 202
Ergodic chain, 26
Evolving structure, 31, 32
Excitable medium, 263
  continuous, 261
  one-dimensional, 262
  two-dimensional, 264
Excitation
  instantaneous, 155
  in ring, 159
Excitation model, 265
Excitation propagation, 147
  velocity of, 155
Extrasystole, 211
Extremum problem, 23, 134
F

Facilitating stimulus, 175
Fiber, 262
  inhomogeneous, 262
  refractive, 263
Fiber ring, 262
Fiber segment, 262
Finite graphs, 221
Flickering, 265
G

Game(s), 42, see also specific types
  with common fund, 61, 68
  ergodic, 63
  homogeneous, 55, 57
  mapping of, 56
  simple symmetric, 239
  symmetric, 76, 79
  value of, 44
Game theory, 44
Ganglion-muscle system, 178
Golgi receptors, 173
Goore game, 96, 113, 144, 151, 239

H

H-Reflex, 189, 193
Heart, 201
  biocurrent, 200
  cycle, 201, 208, 214
  echo, 265
  muscle, 265, 266
  rhythm, 211
  x-ray analysis of, 201
Hey problem, 260

I

Impulse(s)
  blocking of, 263
  propagation of nonperiodic sequence of, 262
Impulse propagation speed, 262
Inhibition, 187
Inhibitory stimulus, 175
Input variable, 12
Integral circuits, 266
Interaction, 169
Interaction measure, 150
Interneuron, 187
Iron lungs, 200

J

Jendrassik's maneuver, 191

K

Knee jerk, 267
L

Language, 125
Linear tactic, 16, 18, 37
Living space
  distribution of, 122
M

M-Response, 193
Markov chain, 25, 34, 239, 251
  ergodic, 43
  nonhomogeneous, 32
Matrix game, 41
Matrix methods, 226
Mean payoff, 33
Mechanotherapy, 200
Memory capacity, 12
Mixed strategy, 48, 74
Model-muscle, 174
Moore game, 56, 60
Moore play, 60, 68, 79
Moore set, 60
Motoneuron, 174
  alpha, 193
  desynchronization of, 178
  gamma, 191
  group, 169, 172
Motor activity, 160
Motor control, 160
Motor organization
  spinal level of, 166
Motor unit, 169, 174
  stress of, 174
Motor unit operation, 172
Movement
  restructuring prior to, 187
Multilevel control system, 161, 268
Multivariate function, 134
Muscle
  control of length, 182
  dynamics of loaded, 174
  skeletal, biopotential, 198
Muscular amplifier, 199
Myotatic reflector influences, 187
N

Nash game, 56, 59, 65
Nash play, 58, 81
Nash set, 67
Nerve center, interaction of, 152
Nerve structure, 169
Network, 231
Network synthesis, 226
Neuron inhibition, 264
Neuron interaction, 169
  horizontal, 170
Nonpenalty, 12
Nosologic form, 210
Numerical method
  distribution problem, 84, 89

O

Operative control, 160
Organization, 134
Output variable, 12

P

Parkinsonism, 171
Parkinson's disease, 173, 184
Pathological states
  acute, 210
  simulation of, 184
Pattern recognition, 163, 165
Penalty, 12
Phase, 155
Physiology, 131
Physiological experiments, 174
Physiological mechanism, 149
Play, value of, 57
Pneumocardiosynchronization, 206
Poliomyelitis, 173, 185
Population redistribution, 244
Pose-regime, 172
Postpoliomyelitic paresis, 171
Pretuning, 168
Principle of least interaction, 149
Process control, 198
Prostheses, 198, 200
Pulmonary arrhythmia, 211

Q

Queue control, 102
Queuing system, 102
Quivering, 265
R

R spike, 202
Random media, 12, 31
  periodic, 93
  stationary, 12, 13, 34, 142
Ravine
  dimensionality of, 137
  method, 137, 152
Refractive phase, 155
Refractivity, 262
  time of, 155
Refractivity phase, 263
Relay-contact circuits, 226
Renshaw cell, 171, 173, 177-179
  thresholds of, 184
Retuning, 168
Reverse facilitation, 187
Ring rhythm, 155
Ritm-1, 211, 213
S

Search method
  automatic, 136
  blind, 135
  local, 136
  nonlocal, 136
Search tactic, 135
Servomechanism, 199
Short-circuit network, 228
Simulation, 141, 147
Sinew reflex, 189
Solving device, 86
Spectrum, 34
Spinal cord, 187, 267
Spino-bulbo-spinal system, 187
Spontaneous activity, 155, 156
  period of, 155
State, 12
State matrix, 12
Switching period, 96
Synchronization, 240
  in presence of noise, 93
Synchronization process, 264
Synergies, 162
T

Tibial nerve, 194
Tissues
  active, 155, 156
  excitable, 147
  plane excitable, 148
  slice of, 158
Transportation plan, 223
Transportation problem, 221
Tree, 221
V

Van Slyke apparatus, 210
W White noise, 176
X x ray, 203 Z
Zero-sum game, 41, 248